Producing searchable and copyable PDF files with accents using LaTeX/pdflatex

by Martin Monperrus

You have accents/diacritics (grave accent, umlaut, etc.) in your LaTeX document, and you produce a PDF file with pdflatex. You might encounter the following problems:

Problem 1:
If you don't use \usepackage[T1]{fontenc}, you can neither search for accented words nor copy/paste them in xpdf/acroread. This also badly hurts search engine indexing.
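
As an illustration, here is a minimal sketch of a document that triggers Problem 1. The sample word is only an assumption chosen for the demonstration. Without the T1 encoding, LaTeX falls back to its default OT1 encoding, where an accented letter is assembled from two glyphs (the accent and the base letter) rather than taken from a single glyph, so the text layer of the PDF does not contain the accented character itself.

```latex
\documentclass{article}
% No \usepackage[T1]{fontenc} here: the default OT1 encoding is used,
% so the accented letter below is built from two glyphs (accent + e).
\begin{document}
caf\'e
\end{document}
```

After compiling this with pdflatex, searching for the accented word in the viewer, or copying it out of the PDF, will typically fail or return a mangled string.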

Problem 2:
If you use \usepackage[T1]{fontenc} alone, with the default fonts and no Type1 version of them installed, the PDF embeds Type3 (bitmap) fonts, which should be avoided; xpdf renders them very poorly.

The following solutions use \usepackage[T1]{fontenc} together with fonts that contain all Latin characters (i.e. accented letters).

Solution 1: With Bitstream Charter fonts:
\usepackage[T1]{fontenc}
\usepackage{charter}

Solution 2: With URW Times / Nimbus Roman No9 L fonts:
\usepackage[T1]{fontenc}
\usepackage{times}

Solution 3: With Latin Modern (lmodern) / Computer Modern fonts:
\usepackage[T1]{fontenc}
\usepackage{lmodern}


Solution 4: With the CM-super version of Computer Modern:
Install the cm-super font package ($ apt-get install cm-super on Debian/Ubuntu), then in your source:
\usepackage[T1]{fontenc}
The rest of the configuration is automatically done by updmap.
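
To compare the four solutions, a small test document along the following lines may help. It is only a sketch: the sample words are assumptions chosen to exercise a few common diacritics, and the accents are written as macros so that the file does not depend on any input encoding.

```latex
\documentclass{article}
\usepackage[T1]{fontenc}   % common to all four solutions

% Keep exactly one font package (Solutions 1 to 3),
% or comment out all three to rely on CM-super (Solution 4).
%\usepackage{charter}      % Solution 1: Bitstream Charter
%\usepackage{times}        % Solution 2: URW Times / Nimbus Roman No9 L
\usepackage{lmodern}       % Solution 3: Latin Modern

\begin{document}
% Sample words with acute, grave, diaeresis and cedilla accents.
d\'ej\`a vu, No\"el, gar\c{c}on, \"Uber
\end{document}
```

After compiling with pdflatex, try searching for and copying the accented words in your viewer. The pdffonts tool (shipped with xpdf and poppler-utils) lists the fonts embedded in a PDF along with their types, which gives a quick way to see whether any Type3 font remains.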


With any of these solutions, you obtain PDF documents that satisfy the following requirements:
* you can search for accented words
* you can copy/paste accented words
* the PDF file is well indexed by robots and search engines
* the PDF file only contains Type1 fonts
* the PDF file is nicely rendered with xpdf

(thanks to http://www-verimag.imag.fr/~monniaux/download/Latex-PDF.HOWTO and http://www.ctan.org/tex-archive/fonts/ps-type1/cm-super/FAQ)

See also: Copy-pastable listings in PDF from LaTeX