Commit 22d777eb authored by Benjamin Franzke, committed by Benni Mack

[BUGFIX] Avoid double UTF-8 encoded PDF metadata in file indexer

There are different versions of pdfinfo available and used
by different providers/distributions.

a) Debian/Fedora use pdfinfo (>v20) from the poppler-utils package.
   Also hosters like Hetzner use this version.
   This variant defaults to UTF-8 output for metadata:
   https://linux.die.net/man/1/pdfinfo
   > -enc encoding-name
   > Sets the encoding to use for text output. This defaults to "UTF-8".

   pdfinfo -v
   pdfinfo version 21.08.0
   Copyright 2005-2021 The Poppler Developers -
                       http://poppler.freedesktop.org
   Copyright 1996-2011 Glyph & Cog, LLC

b) Older servers and hosters with legacy software (Mittwald,
   Domainfactory) use pdfinfo v3. This one defaults to Latin1 output:
   https://www.xpdfreader.com/pdfinfo-man.html
   > -enc encoding-name
   > Sets the encoding to use for text output. […]
   > This defaults to "Latin1"

   pdfinfo -v
   pdfinfo version 3.02
   Copyright 1996-2007 Glyph & Cog, LLC

Both versions support the -enc UTF-8 option, which is now used to
circumvent the differences between these tools. Previously, Latin1
output was assumed (as done in #80085), which breaks variant a):
valid UTF-8 is interpreted as ISO-8859-1 and converted again,
resulting in double encoding.
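
The double encoding described above can be reproduced outside of TYPO3.
The following is a minimal sketch in Python, not the actual indexer code:
the sample title and the helper build_pdfinfo_command() are assumptions
made for illustration; only the -enc UTF-8 option is taken from the
man pages quoted above.

```python
# Metadata as a poppler-based pdfinfo (variant a) would emit it: UTF-8 bytes.
# The title is an assumed sample value, not taken from a real PDF.
title = "Geschäftsbericht"
pdfinfo_output = title.encode("utf-8")

# Old assumption: the bytes are Latin1 and must be converted to UTF-8.
# Applied to bytes that are already UTF-8, this encodes them a second time.
double_encoded = pdfinfo_output.decode("latin-1").encode("utf-8")

# With an explicit "-enc UTF-8" both variants emit UTF-8,
# so the output can be used without any conversion.
correct = pdfinfo_output.decode("utf-8")


def build_pdfinfo_command(path):
    """Hypothetical helper: argv that requests UTF-8 output explicitly."""
    return ["pdfinfo", "-enc", "UTF-8", path]


print(double_encoded.decode("utf-8"))  # mojibake: "ä" shows up as "Ã¤"
print(correct)
```

Passing the encoding explicitly removes the need to guess which pdfinfo
variant is installed on the server.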

Resolves: #99352
Related: #80085
Releases: main, 11.5, 10.4
Change-Id: Ib8f7ae742c5edc73036afcb7d2608cd01f4176fd
Reviewed-on: https://review.typo3.org/c/Packages/TYPO3.CMS/+/77081


Reviewed-by: Benni Mack <benni@typo3.org>
Tested-by: Benjamin Franzke <bfr@qbus.de>
Tested-by: Benni Mack <benni@typo3.org>
Reviewed-by: Stefan Bürk <stefan@buerk.tech>
Tested-by: Stefan Bürk <stefan@buerk.tech>
Reviewed-by: Benjamin Franzke <bfr@qbus.de>
Tested-by: core-ci <typo3@b13.com>
parent 42271d9b