Text copied from pdf is missing spaces, or has extra ones

When I create a PDF with pdflatex and copy text from that PDF (using Adobe Reader DC on Windows 10), some of the spaces are missing. Here's an MWE:

\documentclass{article}
\usepackage{newtxtext}
\begin{document}
    Therefore, this work ... \hspace*{\linewidth}
\end{document}

When I copy text from that PDF, this is what I get (the trailing 1 is the page number):

Therefore, thiswork ...1

Removing the \hspace*, or removing newtxtext (or both), fixes the problem, but that's not what I want, of course (the \hspace* stands in for some text following "this work").

I have come across "Problem copying text from pdf - spaces being stripped" and "XeLaTeX and missing spaces in PDF text", which proposed \pdfgeninterwordspace, which is now called \pdfinterwordspaceon (thanks, @egreg). So I tried that:

\documentclass{article}
\usepackage{newtxtext}
\pdfmapline{+dummy-space <dummy-space.pfb}
\pdfinterwordspaceon
\begin{document}
    Therefore, this work ... \hspace*{\linewidth}
\end{document}

(See "Use pdfinterwordspaceon with pdflatex from MiKTeX on Windows" if that does not compile for you.)
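For readers on an older pdfTeX that lacks the primitive altogether, a guarded variant of the same MWE might look like the sketch below. The \ifdefined test is my own addition, not something taken from the linked questions, and it only covers the case where \pdfinterwordspaceon is undefined. It does not help if the dummy-space font file itself cannot be found.

```latex
\documentclass{article}
\usepackage{newtxtext}
% Assumption: enable the feature only when the engine defines the primitive,
% so the file still compiles (without fake spaces) on older pdfTeX versions.
\ifdefined\pdfinterwordspaceon
  \pdfmapline{+dummy-space <dummy-space.pfb}
  \pdfinterwordspaceon
\fi
\begin{document}
    Therefore, this work ... \hspace*{\linewidth}
\end{document}
```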

Now, when I copy text from that PDF, I get this:

Therefore,  this work  ... 1

So basically, additional spaces have been introduced regardless of whether they were needed. Yes, the missing space in "thiswork" has been added, which is good; but so have three extra spaces, after "Therefore,", "work", and "...", which is not good.

Is there a better solution? Am I using \pdfinterwordspaceon correctly?

