Channel: Active questions tagged copy-paste - TeX - LaTeX Stack Exchange

Unable to extract Devanagari from XeLaTeX-generated PDF using ebook-convert

I'm using TeX Live 2020 on Debian Bullseye. I generated a PDF document containing Devanagari characters with XeLaTeX. With the option \XeTeXgenerateactualtext=1 set, I'm able to copy the Devanagari text from the XeLaTeX-generated PDF into a Unicode-aware text editor.

But when I use ebook-convert to convert it into a plain text file using

ebook-convert test.pdf test.txt

I'm unable to get the original Devanagari characters back.

Edit: The modified MWE is as follows:

\documentclass[12pt]{article}
\usepackage{polyglossia} % supports Unicode; compulsory
\setdefaultlanguage{english}
\setmainfont{Gentium Basic} % Unicode English font; any other font can be used as well.
\setotherlanguage{sanskrit}
\newfontfamily{\dev}[Script=Devanagari, Mapping=RomDev]{Shobhika}
\begin{document}
\XeTeXgenerateactualtext=1
    \textit{Plain Unicode Diacritical Text:} dhṛtarāṣṭra uvāca \\
    \textit{Plain Unicode Devanagari text:} {\dev धृतराष्ट्रउवाच} \\
    \textit{Devanagari text generated from RomDev.tec:} {\dev dhṛtarāṣṭra uvāca}
\end{document}

I have many XeLaTeX-generated PDFs containing Devanagari characters, and I want to convert them into plain text documents from the command line (not by copy-paste) for further use, but so far I have been unable to do so. Please help me.
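To make "unable to get the original Devanagari characters back" checkable across many files, here is a minimal Python sketch that counts how many characters of a converted text fall in the Unicode Devanagari block (U+0900 to U+097F). The function names `devanagari_count` and `devanagari_ratio` are my own, purely illustrative; this only diagnoses the output of a conversion and is not a fix for the conversion itself.

```python
# Illustrative helpers (hypothetical names): measure how much Devanagari
# survives in a converted text file.

DEVANAGARI_START = 0x0900  # first code point of the Unicode Devanagari block
DEVANAGARI_END = 0x097F    # last code point of the Unicode Devanagari block


def devanagari_count(text: str) -> int:
    """Return the number of characters in the Devanagari block."""
    return sum(1 for ch in text if DEVANAGARI_START <= ord(ch) <= DEVANAGARI_END)


def devanagari_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters that are Devanagari."""
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        return 0.0
    return devanagari_count("".join(visible)) / len(visible)
```

For example, reading test.txt with `open("test.txt", encoding="utf-8").read()` and passing the result to `devanagari_count` should give a nonzero count if the Devanagari text survived the conversion, and zero if only romanized or garbled text came out.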

Regards.