Process for digitizing books

2015-06-21

azim58 - Process for digitizing books
aka process for scanning a book
aka scan book
aka digitize a book
scanning books
scanned book


Process for digitizing books as of 5-21-15




older instructions pre 5-21-15

Process for digitizing books as of 7-13-13
-put box on table. Put glass on top of the box so that it overhangs the edge, and hold the glass down on the box with a weight.
-move tree lamp nearby for good lighting
-place phone on top of glass and use the CamScanner app to photograph each page (hold pages down at the edge to keep the page as flat as possible; also position each page so that only text from that page makes it into the image). Make sure the phone is charged or plugged in so it doesn't run out of battery partway through (may not be an issue, depending on the battery)
-convert the document to pdf in the app
-transfer the pdf to the computer
-compress the scanned pdf file by printing it with PDFCreator using the jpg option (the file still prints as a pdf). Specifically: File->print->pdf creator (choose auto-rotate sheets)->print->options->format->jpg (150 dpi, 24 bit colors, 75% quality)->save->save
-OCR the document with OmniPage Standard
--To use OmniPage Standard: load a file and choose automatic OCR. When it asks you to start making individual corrections, choose the option that marks the document as complete (exact wording not recorded). Then save to file, choosing text and pdf searchable image.
-transfer the OCR'd document to its storage location on the hard drive (you might just have one document)
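
The compression step above goes through the PDFCreator print dialog. As a rough scripted alternative (not the method used in these notes), the sketch below builds a Ghostscript command that downsamples color images to 150 dpi. It assumes Ghostscript is installed and reachable as "gs" (on Windows the executable usually has a different name, such as gswin64c), and the file names are hypothetical. It does not reproduce the 75% jpg quality setting exactly; the /ebook preset only approximates it.

```python
import shutil
import subprocess
from pathlib import Path


def build_compress_command(src: Path, dst: Path, dpi: int = 150) -> list[str]:
    """Return a Ghostscript command line that rewrites src as a smaller pdf.

    Assumption: the Ghostscript executable is called "gs" on this machine.
    """
    return [
        "gs",
        "-sDEVICE=pdfwrite",             # write a pdf, not an image
        "-dPDFSETTINGS=/ebook",          # medium-quality preset
        "-dDownsampleColorImages=true",  # shrink page photos
        f"-dColorImageResolution={dpi}",
        "-dNOPAUSE",
        "-dBATCH",
        "-dQUIET",
        f"-sOutputFile={dst}",
        str(src),
    ]


def compress_pdf(src: Path, dst: Path, dpi: int = 150) -> bool:
    """Run the command if Ghostscript is installed; return True when it ran."""
    if shutil.which("gs") is None:
        return False
    subprocess.run(build_compress_command(src, dst, dpi), check=True)
    return True
```

For example, compress_pdf(Path("scan.pdf"), Path("scan_small.pdf")) would write scan_small.pdf next to the original, leaving the original untouched so the two file sizes can be compared before OCR.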



older instructions pre 04-05-2015d2259

-put box on table. Put glass on top of the box so that it overhangs the edge, and hold the glass down on the box with a weight.
-move tree lamp nearby for good lighting
-place phone on top of glass and use the CamScanner app to photograph each page (hold pages down at the edge to keep the page as flat as possible; also position each page so that only text from that page makes it into the image). Make sure the phone is charged or plugged in so it doesn't run out of battery partway through (may not be an issue, depending on the battery)
-convert the document to pdf in the app
-transfer the pdf to the computer
-compress the scanned pdf file by printing it with PDFCreator using the jpg option (the file still prints as a pdf). Specifically: File->print->pdf creator (choose auto-rotate sheets)->print->options->format->jpg (150 dpi, 24 bit colors, 75% quality)->save->save
-OCR the document with the free PDF-XChange Viewer (choose high OCR quality and the "Preserve Original Content & Add Text Layer" option to keep file size down)
--note that as of 5-25-14 the OCR in the RepliGo Reader app is not good enough for many situations
-transfer the OCR'd document to its storage location on the hard drive (you might just have one document)
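
The final transfer step can also be scripted. This is a minimal sketch, assuming the storage location is an ordinary folder on the hard drive; the function names and paths are hypothetical and not part of the original process. It copies the file and then compares checksums so that a bad copy is noticed before the working copy is deleted.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large scans do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_copy(src: Path, storage_dir: Path) -> Path:
    """Copy src into storage_dir and verify that the copy matches."""
    storage_dir.mkdir(parents=True, exist_ok=True)
    dst = storage_dir / src.name
    shutil.copy2(src, dst)  # copy2 also keeps the file's timestamps
    if sha256_of(src) != sha256_of(dst):
        raise IOError(f"copy of {src} does not match the original")
    return dst
```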



older instructions pre 08-11-2014d1408

-put box on table. Put glass on top of the box so that it overhangs the edge, and hold the glass down on the box with a weight.
-move tree lamp nearby for good lighting
-place phone on top of glass and use the CamScanner app to photograph each page (hold pages down at the edge to keep the page as flat as possible; also position each page so that only text from that page makes it into the image). Make sure the phone is charged or plugged in so it doesn't run out of battery partway through (may not be an issue, depending on the battery)
-convert the document to pdf in the app
-transfer the pdf to the computer
-compress the scanned pdf file by printing it with PDFCreator using the jpg option (the file still prints as a pdf). Specifically: File->print->pdf creator (choose auto-rotate sheets)->print->options->format->jpg (150 dpi, 24 bit colors, 75% quality)->save->save
--(older instructions) keep a copy of the original document and use Adobe Acrobat Professional to OCR all of the pages
-OCR the document with the free PDF-XChange Viewer (choose high OCR quality and the "Preserve Original Content & Add Text Layer" option to keep file size down)
--note that as of 5-25-14 the OCR in the RepliGo Reader app is not good enough for many situations
-(optional, since it is time consuming and possibly unnecessary) split the document into sections by using pdftk to extract certain pages (concatenate feature); see extract pages from pdf with pdftk
-transfer the whole OCR'd document and the extracted documents to the hard drive for storage (you might just have one document)
--(older instructions) transfer the whole original document, the whole OCR'd document, and the extracted documents to the hard drive for storage. Use an always-connected computer to transfer a copy of the extracted sections to Dropbox for reading and annotation
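
The optional pdftk splitting step above can be sketched as follows. pdftk's cat operation takes a page range and writes it to a new file, for example: pdftk book.pdf cat 1-20 output chapter1.pdf. The wrapper below only builds and runs that command; it assumes pdftk is on the PATH, and the file names and page numbers are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_extract_command(src: Path, first: int, last: int, dst: Path) -> list[str]:
    """Return the pdftk command that copies pages first..last of src into dst."""
    if first < 1 or last < first:
        raise ValueError("page range must satisfy 1 <= first <= last")
    return ["pdftk", str(src), "cat", f"{first}-{last}", "output", str(dst)]


def extract_pages(src: Path, first: int, last: int, dst: Path) -> bool:
    """Run pdftk if it is installed; return True when it ran."""
    if shutil.which("pdftk") is None:
        return False
    subprocess.run(build_extract_command(src, first, last, dst), check=True)
    return True
```

Calling extract_pages once per section, each with its own page range and output name, produces the extracted documents mentioned in the transfer step.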






See also