I'm trying to create a pdf file of online forum posts. Using the browser print to pdf option works, but only up to a point in that the saved pdf file includes all the hidden text in the online original rather than just the text and images I see without clicking or hovering. Hope that makes sense.
How can I best save what I see?
Secondly, having hopefully achieved the above, can I combine pdf files? The forum I want to save consists of 30 threads, each with 3 or 4 pages. Saving as above means saving one page at a time which is OK, but would leave me with 100+ pdf files.
Any recommended prog or add on I could use? Ideally open source i.e. cheap or free?
Does the forum have a separate 'print view' mode? Or could you use the Stylus browser extension to rewrite the page style sheet to change the quoted text to 0pt white?
PDFSAM and others will concatenate or merge PDFs into one file.
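If a script is preferred to a GUI tool, the merge can be sketched in Python. This is only a rough sketch under assumptions: it uses the third-party pypdf library rather than PDFSAM, the function names `natural_key` and `merge_pdfs` are made up for illustration, and it assumes the saved pages carry numbers in their file names (for example thread2_page10.pdf) so that they can be put in order.

```python
import re
from pathlib import Path


def natural_key(name: str):
    """Sort key that compares digit runs as numbers, so 'page10' follows 'page9'."""
    return [int(tok) if tok.isdigit() else tok.lower()
            for tok in re.split(r"(\d+)", name)]


def merge_pdfs(folder: str, output: str) -> list[str]:
    """Append every PDF in `folder`, in natural name order, into `output`."""
    # Assumption: pypdf is installed (pip install pypdf). It is imported here
    # so that the sorting helper above can be used without it.
    from pypdf import PdfWriter

    out_path = Path(output).resolve()
    files = sorted(
        (p for p in Path(folder).glob("*.pdf") if p.resolve() != out_path),
        key=lambda p: natural_key(p.name),
    )
    writer = PdfWriter()
    for pdf in files:
        writer.append(str(pdf))  # copies all pages of this file
    with open(output, "wb") as fh:
        writer.write(fh)
    return [p.name for p in files]
```

Calling `merge_pdfs("saved_pages", "forum.pdf")` (both names hypothetical) would write one combined file and return the order in which the pages were added, which is worth checking before deleting the originals.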
Thanks for the suggestions. Yes, lots of large, full colour images. The forum software is phpBB, which seems to be popular and works well.
I think straight copy/paste of the text is the way forward, with images added where appropriate.
HTML is indeed no longer html, at least at my basic level. I note Owain's suggestion to rewrite the style sheet, but that is way beyond my experience. I do, though, like the idea of producing a pdf file for each thread and then concatenating or merging them into one file.
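On the 'print view' question: phpBB boards usually offer a stripped-down print page for a topic, reached by adding view=print to the topic URL. Whether this particular forum has it enabled is an assumption, so it is worth trying one URL by hand first. A small Python sketch for building such URLs in bulk follows; the function name `print_view_url` and the example address are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def print_view_url(topic_url: str) -> str:
    """Return `topic_url` with view=print set, keeping its other parameters."""
    parts = urlsplit(topic_url)
    # Drop any existing 'view' parameter so it is not duplicated.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "view"]
    # Assumption: the board honours phpBB's usual view=print parameter.
    query.append(("view", "print"))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

The resulting address can be opened in the browser and printed to pdf in the usual way.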
I used to copy and paste into Word, but now print direct to pdf, or save the whole page as HTML if appropriate.
For printing and using the web style sheet I use the addon "Print Edit WE". I have not looked back since. You can also de-select areas you don't want to output.
If it's got to be accessible, i.e. not just a graphic of a page, I don't actually think there is any obvious way to do what you want, since Adobe say they want dosh when you try to do a conversion (grin). Brian
You could try pasting the URLs into something like
formatting link
- having just quickly tried it, it seems to do a reasonable job of 'printing' what you see (although it does add a small banner on the bottom). There are browser extensions to make it a bit more '1-click' too.
Funny how we read things. I read 'Is this a forum with binaries?' as referring to the online forum I mentioned, not 'Is this (uk.d-i-y) a forum with binaries?', not least because newshound would know that uk.d-i-y is not a binary group.
Thanks for all the comments. I have transferred the first two forum threads to pdf by copying and pasting the text and images required which gives a clean and satisfactory result, although somewhat laborious. Oh well, perhaps what lockdown was designed for?
You can create PDF files by hand, which would bring the question perilously close to the group charter.
The following file can be copied into Notepad and saved as "helloworld.pdf". The extension should make the file's icon look like an Acrobat Reader icon.
The file is copied off the web, and I messed with it a bit and screwed up the checksums. (I added two sentences, used some matrix operators to step the line beginning for the next line, then corrected the stream length to 112 characters, which includes a line termination character per line.)
If you screw up the file enough, Acrobat tries to repair it internally before displaying it. This might cause a 20 second delay until it opens.
----------------- Do not copy this line ------------------
%PDF-1.7
1 0 obj % entry point
<< /Type /Catalog /Pages 2 0 R >>
0 6
0000000000 65535 f
0000000010 00000 n
0000000079 00000 n
0000000173 00000 n
0000000301 00000 n
0000000380 00000 n
trailer
<< /Size 6 /Root 1 0 R >>
startxref
492
%%EOF
----------------- Do not copy this line ------------------
It's a gnarly language, and barely feasible as a means for humans to package stuff by hand. Real files have a lot more baggage inside.
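For anyone who wants to experiment without hand-counting bytes, here is a minimal sketch in Python (my choice of language; nothing in the thread uses it) that writes a one-page hello-world file of the same general shape and fills in the stream /Length and the xref byte offsets itself. The function name `build_hello_pdf`, the page size and the text position are all made up for illustration, and it steps to the next line with the TL and T* operators rather than the matrix operators mentioned above.

```python
def build_hello_pdf(lines, font="Times-Roman"):
    """Return the bytes of a one-page PDF showing `lines` of text."""
    def esc(s):
        # Backslash and parentheses are special inside a PDF string.
        return s.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")

    # Content stream: select font F1 at 18pt, set 22pt leading, move to the
    # start position, then show each line and step down with T*.
    text_ops = "\nT*\n".join(f"({esc(s)}) Tj" for s in lines)
    stream = f"BT\n/F1 18 Tf\n22 TL\n20 160 Td\n{text_ops}\nET\n".encode("latin-1")

    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 /MediaBox [0 0 300 200] >>",
        b"<< /Type /Page /Parent 2 0 R"
        b" /Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>",
        f"<< /Type /Font /Subtype /Type1 /BaseFont /{font} >>".encode("ascii"),
        # /Length is computed from the stream, not typed in by hand.
        b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"endstream",
    ]

    out = bytearray(b"%PDF-1.7\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset where this object starts
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"

    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # every xref entry is exactly 20 bytes
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)
```

Writing the returned bytes to a file opened in binary mode, for example `open("helloworld.pdf", "wb").write(build_hello_pdf(["Hello World"]))`, should give a file a viewer opens without going through its repair step, since the offsets and length are consistent by construction.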
And if you look inside another PDF and your conclusion is "Paul, a PDF doesn't look like this!": of course not. PDF is available in binary and text format, and this is a human-readable example. What I don't understand about this sample file is that it's missing a short "binary string" that has appeared in some other so-called text ones, and the file still seems to work.
Many modern documents contain "embedded fonts", which would ruin a simple example like this. This sample file relies on the interpreter having a Times-Roman font. If you change the declaration to ComicSans, the document will likely not display, as ComicSans is not part of the base set of fonts.