PDF Readers

In message snipped-for-privacy@gmail.com, pinnerite writes

Bullzip (which I have been using for ages, and can thoroughly recommend) is a 'Print-to-PDF' program - and not a PDF viewer.

For removing/rearranging/adding pages to a PDF document I use PDFTK Builder.

Reply to
Ian Jackson

Bully for you!

Reply to
Chris Hogg

There is the 'Seeing AI' iPhone app from Microsoft, if ye feel comfortable on the scanned content being gobbled and stored on Microsoft servers.

It doesn't have a PDF load facility, but if you point an iPhone at a printout or a computer screen, ye are most likely more than halfway there. Of course the reading order of blocks of text may need fixing, but maybe with a bit more development an AI may even do that.

I like this app. It even OCRs my scrawly handwriting, and recognises me as "a 58 year old man wearing glasses and looking neutral". Hmmm... Ok, I'm not that old.

It's got accessibility features worth a play if ye have an iPhone, sighted or not.

Techmoan - An app that sees for those who can't

Reply to
Adrian Caspersz

Quite right. Senility is ramping up. :(

Reply to
pinnerite

Bully for me, and possibly useful information for anyone who might be misled by your utterances about Foxit.

Reply to
Scott

Indeed. I don't have any problem with it. I don't, however, use any browser plugin; I just let it use Foxit directly.

Reply to
Bob Eager

Slightly OT, but anyone who uses a pdf reader and searches for text in a document should be aware of a problem I've only just noticed (and I've been using pdfs since they first appeared almost 30 years ago).

It does not appear possible to find a phrase, or even hyphenated word, which runs from one line into the next line, probably due to the "soft line return" at the end of the line not being included in the searched phrase characters. This could be very important in a safety instruction or legal matter. If anyone has a pdf reader which does find that phrase over two lines, would they please share which one it is here!

Reply to
Jeff Layman

Words, even characters, that appear next to each other when rendered don't have to be next to each other within the PDF file. Surely you've seen the occasional PDF where pages display in a "random chequerboard" fashion?

Reply to
Andy Burns

Indeed and exactly so

A pdf display text command tends to be 'here's an X,Y co-ordinate, a font description and some text, display it there'.

Page DISPLAY Format

There are no implications as to it being more than a bitmap, if that serves the purpose.

Ability to find text should be regarded as an occasional lucky accident

Reply to
The Natural Philosopher

Thanks to all for the helpful suggestions, except to Scott who merely insisted that he didn't get a problem, which of course was no help at all.

Reply to
Chris Hogg

In the spirit of right of reply, I would suggest that reporting that something is okay is relevant and helpful input. If my car developed a fault I would certainly want to know if this was common to the model or a one-off affecting one vehicle. Indeed, it is the first question I would ask the garage.

Reply to
Scott

I have never found that as a general rule. As others have said, what lies behind a PDF may vary greatly. But I routinely find phrases which extend across more than one line using Acrobat (Pro X or DC Reader). And that's not just with reflow.

Another matter when they extend over pages with hard page numbers of course.

Hyphens are different again. And a nightmare anyway with people using em dash, en dash, minus sign or whatever.
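To illustrate the dash problem, here is a minimal Python sketch, not what any particular reader does, that folds the common dash-like characters to a plain hyphen before comparing. The function names and the choice of which characters to fold are assumptions made for the example.

```python
# Map common dash-like characters to a plain ASCII hyphen-minus.
# Which characters belong in this table is a judgment call.
DASH_TABLE = str.maketrans({
    "\u2010": "-",   # hyphen
    "\u2011": "-",   # non-breaking hyphen
    "\u2012": "-",   # figure dash
    "\u2013": "-",   # en dash
    "\u2014": "-",   # em dash
    "\u2212": "-",   # minus sign
    "\u00ad": None,  # soft hyphen: invisible when rendered, so drop it
})


def normalise_dashes(s: str) -> str:
    """Return s with every dash variant replaced by '-'."""
    return s.translate(DASH_TABLE)


def contains_ignoring_dash_style(text: str, phrase: str) -> bool:
    """True if phrase occurs in text once both use the same dash."""
    return normalise_dashes(phrase) in normalise_dashes(text)


if __name__ == "__main__":
    page = "a well\u2013known phrase"   # typed with an en dash
    print("well-known" in page)                               # plain search
    print(contains_ignoring_dash_style(page, "well-known"))   # folded search
```

The same folding has to be applied to both the document text and the search phrase, otherwise a phrase typed with an en dash would stop matching a document that uses a plain hyphen.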

But then much the same's true of text processors and desk top printing software I've used. E.g. in MS Word stick a hard line break in and a simple search for a phrase won't find it.

Reply to
Robin

I can't say that I have. How common is it?

I can't see how a function offered in many, if not most, pdf readers could be regarded as a lucky accident. There would be little point in offering something which /might/ work /some/ of the time. Have you found many pdf readers where the search function works randomly?

Reply to
Jeff Layman

I've read in another NG that Sumatra will also find text extended across more than one line. So it is beginning to look like it is app dependent. Unfortunately I am using Linux, so unless I install Wine I can't check the function of Windows-only pdf readers.

Sorry, but I don't understand what a hard page number is and Google hasn't helped.

I assume that's because it's a paragraph mark, and you can search for it specifically. But it wouldn't be easy to do as a semi-random mark in the middle of some text. It should be possible, however, to arrange for a phrase search to look for text-only characters (a to z, and 0 to 9), and tell the search to ignore any other characters it finds during the search.
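The idea above of comparing only letters and digits can be sketched in a few lines of Python. This is an illustration under assumptions, not how any named reader works: the function names are made up, and the text is assumed to have already been extracted from the pdf as an ordinary string.

```python
from typing import List, Tuple


def _letters_and_digits(s: str) -> List[Tuple[int, str]]:
    """Keep only alphanumeric characters, lower-cased, each paired
    with the index of the original character it came from."""
    return [(i, c) for i, ch in enumerate(s) if ch.isalnum() for c in ch.lower()]


def find_alnum_only(text: str, phrase: str) -> int:
    """Return the index in text where phrase starts, ignoring case and
    every character that is not a letter or digit; -1 if not found."""
    kept = _letters_and_digits(text)
    haystack = "".join(c for _, c in kept)
    needle = "".join(c for _, c in _letters_and_digits(phrase))
    if not needle:
        return -1
    pos = haystack.find(needle)
    # Map the hit back to a position in the original, unfiltered text.
    return kept[pos][0] if pos != -1 else -1


if __name__ == "__main__":
    page = "See the safety in-\nstruction manual."
    print(page.find("safety instruction"))            # plain search
    print(find_alnum_only(page, "safety instruction"))  # filtered search
```

The price is false positives: with spaces and punctuation ignored, a search for "together" would also match "to get her", so a real search would probably want something less aggressive, such as ignoring only line breaks and end-of-line hyphens.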

Reply to
Jeff Layman

I would expect it.

PDF is a container format. In simple terms, it can contain text or images. Text is easily searchable, but images require OCR and the building of a text version of each page. This can be done, but it is not always done at the time of creation. OCR at the point of use would be unacceptably slow.

Reply to
Bob Eager

You *completely* misunderstand. It's nothing to do with the reader and everything to do with the PDF.

Text in a PDF is not stored as a contiguous block. There is nothing to say it even has to be text - it might be a bitmap. Or it might be random letters in random fonts each with its own x,y co-ordinates.

Sometimes, it is true, a text frame with a lot of text in it is stored as a searchable text block in the PDF - that would be the default for 'pure unformatted text' documents generated from a word processor - but that is never guaranteed.

It might, as I said, specify each letter in its own box to generate micro kerning in a justified block.

The fancier the page, the less likely it is to contain contiguous searchable text

Reply to
The Natural Philosopher

And it can also contain single words or single letters.

Text is easily searchable,

But single words and single letters are not

Reply to
The Natural Philosopher

The function might be offered, but not be useable in a particular pdf. The pdf content might, f'instance, be a jpeg and nothing else. No text at all and therefore not searchable, even if the jpeg is an image of text and looks like text. It just doesn't quack like text.

Reply to
Tim Streater

According to Robin's reply: "But I routinely find phrases which extend across more than one line using Acrobat (Pro X or DC Reader). And that's not just with reflow." I also understand that Sumatra reader will also find text flowed over two lines. Other readers do not find that text. So if it's nothing to do with the reader how do you explain that?

Of course, but let's stay away from "graphical" text such as a photograph of a word. All I'm interested in is true text - such as that created by text editor or word processor, which is saved or printed as a pdf. My OP referred to text in a document - perhaps I should have said "non-graphical text" to be absolutely clear what I was referring to.

A pdf, created from plain text, seems to have /no/ viewable or searchable text if that pdf is opened in a text editor (with warning message ignored). It can only be viewed in a pdf reader. So in a way it is encoded specially by the pdf creator, and that code can be decoded by the pdf reader. Why then, does it decode it so that some invisible line-feed character appears within a phrase and that line-fed phrase is searchable by some pdf readers but not others?
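One common reason for text not being visible in a text editor is that page content in a pdf is often stored compressed (the FlateDecode filter, which is zlib/deflate), so the bytes on disk are not the characters themselves. The Python sketch below only imitates that step on a made-up content stream; it does not parse a real pdf file, and the helper name is an assumption for the example.

```python
import zlib

# A tiny page content stream of the kind a pdf writer might emit:
# BT/ET bracket a text object, Tf picks a font and size,
# Td sets the position and Tj shows the string in parentheses.
content = b"BT /F1 12 Tf 72 720 Td (Safety instruction) Tj ET\n"

# Roughly what ends up in the file when the stream is Flate-compressed.
stored = zlib.compress(content)


def visible_in_text_editor(raw: bytes, phrase: bytes) -> bool:
    """Crude stand-in for 'can I see the phrase in the raw bytes?'."""
    return phrase in raw


if __name__ == "__main__":
    phrase = b"Safety instruction"
    print(visible_in_text_editor(content, phrase))                  # before compression
    print(visible_in_text_editor(stored, phrase))                   # as stored
    print(visible_in_text_editor(zlib.decompress(stored), phrase))  # after decoding
```

Compression is not the only possibility: some fonts are embedded with their own character codes, so even an uncompressed stream may not show readable letters. Either way, this step happens before any question of line ends arises, so it probably does not by itself explain why searches across lines differ between readers.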

Thinking rather OT, this reminds me somewhat of Lotus Wordpro, which created files with no recoverable text in them, and those files appeared to be just random characters when viewed in a text editor, unlike Word and Wordperfect.

Reply to
Jeff Layman

A lot of times the output will be equivalent to

moveto 400,400 print "hello"

so you could search for it, but if it was

moveto 400,400 print "h"
moveto 400,410 print "e"
moveto 400,420 print "l"
moveto 400,430 print "l"
moveto 400,440 print "o"

would you find it? what about

moveto 400,440 print "o"
moveto 400,400 print "h"
moveto 400,420 print "l"
moveto 400,410 print "e"
moveto 400,430 print "l"

all of them would render the same ...
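The three orderings above can be modelled in Python to show why a search that reads the text in file order finds "hello" in some cases and not others. This is a sketch under assumptions: each draw command is reduced to a made-up (coordinate, coordinate, text) tuple, and the second number is assumed to be the one that advances along the line, as in the example.

```python
from typing import List, Tuple

DrawOp = Tuple[int, int, str]  # the two coordinates, then the text drawn there

in_one_go: List[DrawOp] = [(400, 400, "hello")]
letter_by_letter: List[DrawOp] = [
    (400, 400, "h"), (400, 410, "e"), (400, 420, "l"),
    (400, 430, "l"), (400, 440, "o"),
]
shuffled: List[DrawOp] = [
    (400, 440, "o"), (400, 400, "h"), (400, 420, "l"),
    (400, 410, "e"), (400, 430, "l"),
]


def text_in_file_order(ops: List[DrawOp]) -> str:
    """Join the strings in the order the commands appear in the file."""
    return "".join(text for _, _, text in ops)


def text_in_position_order(ops: List[DrawOp]) -> str:
    """Join the strings after sorting the commands by their coordinates."""
    ordered = sorted(ops, key=lambda op: (op[0], op[1]))
    return "".join(text for _, _, text in ordered)


if __name__ == "__main__":
    for name, ops in [("in one go", in_one_go),
                      ("letter by letter", letter_by_letter),
                      ("shuffled", shuffled)]:
        print(name,
              "hello" in text_in_file_order(ops),
              "hello" in text_in_position_order(ops))
```

A reader that rebuilds the text from positions can still find the word in the shuffled case, while one that trusts file order cannot. Real pages are harder, since the reader also has to guess where spaces and line ends fall from the gaps between pieces, which may be one reason results differ between readers.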

Reply to
Andy Burns
