Big LaTeX delimiters have a weird encoding in the pdf. How can I avoid this?

I noticed recently that changing the size of parentheses in LaTeX changes how the text is encoded in the pdf (as determined using the copy/paste functionality in my pdf viewer). For instance, in the formula

$\sin(x) + \sin\bigl(x\bigr)$

the first term is encoded as s i n ( x ), whereas the second term is encoded as s i n <CR> <LF> <U+FFFD> <CR> <LF> x <CR> <LF> <U+FFFD>. (Here <CR> is a carriage return, <LF> is a line feed, and <U+FFFD> is the Unicode replacement character, which stands for an unknown, unrecognised, or unrepresentable character.)

This behavior is undesirable because it makes it impossible to find all instances of "sin(x)" by searching the file. As a mathematician who exclusively reads papers and books on the computer, I find it very important that pdf documents be easily searchable. This is also critical from an accessibility perspective.

Question: Is there any easy way to improve the encoding of the pdf file so that (for instance) the two terms above are encoded in the same way?

A related question on this site was solved using the accsupp package, but in my case that method produces the text string \sin (x) + \sin \big (x\big ), which is not desirable either and is a hassle to get working in the LaTeX file. [Side question: do some visually impaired users prefer that the size of delimiters be recorded, as this output suggests?]
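
For concreteness, here is a minimal sketch of a narrower use of accsupp: instead of attaching replacement text to the whole formula, each big delimiter is wrapped on its own, so that only the parenthesis gets an ActualText entry. The macro names \bigLparen and \bigRparen are hypothetical helpers made up for this illustration, not part of any package. I am assuming method=hex so that the parenthesis characters (hex 28 and 29) never appear literally inside a PDF string. Whether copy/paste and search actually pick up ActualText depends on the pdf viewer, so this should be read as something to try rather than a confirmed fix.

```latex
\documentclass{article}
\usepackage{amsmath}
\usepackage{accsupp}

% Hypothetical helper macros (not from the original post).
% Each wraps one sized delimiter in an accsupp span whose ActualText
% is the plain character: hex 28 is "(" and hex 29 is ")".
\newcommand*{\bigLparen}{%
  \BeginAccSupp{method=hex,ActualText=28}\bigl(\EndAccSupp{}}
\newcommand*{\bigRparen}{%
  \BeginAccSupp{method=hex,ActualText=29}\bigr)\EndAccSupp{}}

\begin{document}
% First term: ordinary parentheses. Second term: wrapped big parentheses.
$\sin(x) + \sin\bigLparen x\bigRparen$
\end{document}
```

The drawback remains that every sized delimiter in the source has to be replaced by a wrapper macro, which is the kind of hassle I would like to avoid.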

