I have spent some time scanning the rather tatty manuals for my Fanuc Wire Eroder into searchable PDF files, and have got the bulk of the work done, but two aspects have me stumped.
Firstly, the first two pages of one manual are particularly grubby, and were originally printed on a yellow background. I'd like to lift the text from them and drop it onto a clean page. You can already search for individual words on the page, so I'm sure it must be possible to grab the text somehow and drop the background - but how? (NB certain lines on these two pages have 'links' to other pages in the document. These won't work, as you only have the two pages and not the rest of the document, which is 2 x 36 MB!)
Secondly, the scanning process uses an OCR program to make the pdf searchable, which is excellent, BUT it also decides to rotate some pages into 'landscape' to get the text the right way up. I want them all in 'portrait'. The pages as scanned are rather varied in size, but roughly A4. I can use a utility to rotate the page, but if I then use another utility to make all the pages A4 sized, it inverts these pages. This seems to happen whatever order I do the conversion in.
It's frustrating as it's VERY close to what I want, but not quite there.
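To make the second problem concrete, here is a minimal sketch of the arithmetic I think the 'make everything A4' step needs to do. It rests on an assumption I haven't confirmed: that the pages shown in 'landscape' are stored unrotated with a rotation flag (the /Rotate entry of a PDF page, a multiple of 90 degrees), and that a resizing utility which ignores that flag is what turns them over. The function names `displayed_size` and `fit_to_a4` are my own, purely for illustration, and are not part of PDFill or any other tool.

```python
# A4 portrait in PDF points (1 point = 1/72 inch).
A4_PORTRAIT = (595.28, 841.89)


def displayed_size(width, height, rotate):
    """Return the page size as a viewer shows it, given the rotation flag.

    width/height are the stored (unrotated) page dimensions in points;
    rotate is the rotation in degrees and must be a multiple of 90.
    """
    rotate %= 360
    if rotate not in (0, 90, 180, 270):
        raise ValueError("rotation must be a multiple of 90 degrees")
    # A quarter turn swaps width and height; a half turn does not.
    return (height, width) if rotate in (90, 270) else (width, height)


def fit_to_a4(width, height, rotate=0):
    """Return (scale, x_offset, y_offset) to centre the page on portrait A4.

    The scale is uniform, so the page content is never stretched.
    """
    shown_w, shown_h = displayed_size(width, height, rotate)
    target_w, target_h = A4_PORTRAIT
    scale = min(target_w / shown_w, target_h / shown_h)
    x_offset = (target_w - shown_w * scale) / 2
    y_offset = (target_h - shown_h * scale) / 2
    return scale, x_offset, y_offset


if __name__ == "__main__":
    # A page stored landscape but flagged as rotated by 90 degrees
    # already displays as portrait A4, so no scaling is needed.
    print(fit_to_a4(841.89, 595.28, rotate=90))
```

If that assumption is right, a tool that scales using the stored size alone would treat the last example as a landscape page and shrink or turn it, which looks a lot like what I'm seeing.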
Total job is about 500 pages of scanned and OCR'd A4 which I've done, and gone through cleaning up grubby finger marks and edge effects - it's just these two aspects baulking me at the moment if anyone can help.
The scanner is a Plustek Opticbook 3800
The scanning software is 'Book Pavilion' and the OCR software it uses in the background is 'Finereader Sprint 9', both of which are bundled with the scanner.
The PDF utilities I've been using are PDFill Editor ($20) and its free PDF tools: http://www.pdfill.com/
Incidentally, I chose this scanner as it can scan to within 2 mm of an edge, so a book can lie on it with half hanging down the front and still scan into the fold without leaving a blank strip that would lose text in some books. http://plustek.com/usa/products/opticbook-series/opticbook-3800/
Any help would be appreciated.