On 6/27/06, jimw <jpw@sasktel.net> wrote:
[pdftotext]
> a text done in
> the Cyrillic alphabet produces a file containing nothing but the
> punctuation. it's necessary to use the -enc option to turn it to
> cyrillic, but the result I get is this:
>
> /Desktop/Zash$ pdftotext -enc cyrillic ./zash1.pdf
> Error: Couldn't find unicodeMap file for the 'cyrillic' encoding
> Error: Couldn't get text encoding
>
> Anyone know what I'm doing wrong?
Hmm... First, are you still using Breezy? (This is not necessarily
the cause of any problems, I'm just curious :)
Second, if you download
http://www.health.state.ny.us/nysdoh/hospital/healthcareproxy/pdf/1402.pdf
and just run "pdftotext 1402.pdf",
do you get a garbage document? (In Dapper it makes a nice text file;
e.g. it starts "Доверенность на принятие решений о медицинской
помощи".)
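If eyeballing the output is awkward, a rough check along these lines might help tell "nice text file" from "garbage". This is only a sketch: the function names and the 0.5 threshold are my own assumptions, not anything pdftotext provides, and it only looks at the basic Cyrillic Unicode block (U+0400 to U+04FF).

```python
def cyrillic_ratio(text: str) -> float:
    """Return the fraction of alphabetic characters in the basic Cyrillic block."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        # Nothing but punctuation, digits, or whitespace.
        return 0.0
    cyrillic = sum(1 for ch in letters if "\u0400" <= ch <= "\u04FF")
    return cyrillic / len(letters)


def looks_like_cyrillic(text: str, threshold: float = 0.5) -> bool:
    """Heuristic: treat the text as Cyrillic if enough of its letters are Cyrillic.

    The threshold is an arbitrary assumption; adjust it for mixed-language files.
    """
    return cyrillic_ratio(text) >= threshold
```

To try it on real output, read the generated 1402.txt (assuming it was written as UTF-8) and pass its contents to looks_like_cyrillic. A file containing nothing but punctuation, like the one described above, would score 0.0.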
Also, if you open the PDF file with a viewer, can you copy and then
paste sections into a gedit window?
If that file works, can you send me a copy of the one that's giving you grief?
> Anyone care?(:))
LOL! Well, it might not affect any of us, but we don't want you to
have trouble ;)
CK
Received on Wed Jun 28 01:40:38 2006