Document Scanning - Conversion to Electronic Format

Document scanning is a method of converting paper documents to an electronic format that can then be stored digitally. This article outlines some of the key concepts and considerations needed to optimize this process and produce a high-quality electronic document.

Core Steps and Considerations:

  1. Document Preparation

  2. Scanning

  3. Conversion to digital format (e.g. TIFF, JPG) or OCR (PDF, other)

  4. Compression

Document Preparation

Proper preparation is important for preserving the quality of your documents and protecting your scanner.

  • Remove staples, paper clips, and binder clips.

  • Remove Post-It notes and attached notes. If these need to be scanned, photocopy them and scan them as a separate page.

  • If possible, improve document quality by photocopying.

  • Repair torn pages.

  • Straighten folded corners.

Scanning

Determine the appropriate scanner driver settings for the types of documents you are scanning. Scanning software should support the creation of multiple scan profiles that will accommodate various scan configurations such as Black and White, Grayscale, Color, Single-Sided (Simplex), Double-Sided (Duplex), DPI (Dots per Inch – Resolution), compression settings and type, border removal, speckle removal, rotation, drop out color, etc.
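One way to picture scan profiles is as named bundles of the settings listed above. The following is a minimal sketch only: the ScanProfile class, its field names, and the three example profiles are hypothetical and do not correspond to any particular scanner driver or scanning product.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ColorMode(Enum):
    BLACK_AND_WHITE = "bw"
    GRAYSCALE = "gray"
    COLOR = "color"


@dataclass(frozen=True)
class ScanProfile:
    """Hypothetical container for the scan settings named in the text."""
    name: str
    color_mode: ColorMode
    dpi: int = 200                       # resolution in dots per inch
    duplex: bool = False                 # False = simplex (single-sided)
    compression: str = "lossless"        # compression type, as a plain label
    border_removal: bool = False
    speckle_removal: bool = False
    rotation_degrees: int = 0
    dropout_color: Optional[str] = None  # e.g. "red" for pre-printed forms


# Example profiles (illustrative values, not recommendations from a vendor).
PROFILES = {
    p.name: p
    for p in (
        ScanProfile("text-bw", ColorMode.BLACK_AND_WHITE, speckle_removal=True),
        ScanProfile("poor-original-gray", ColorMode.GRAYSCALE),
        ScanProfile("fine-detail-duplex", ColorMode.BLACK_AND_WHITE,
                    dpi=300, duplex=True),
    )
}

if __name__ == "__main__":
    for profile in PROFILES.values():
        print(profile)
```

Keeping each configuration under a name lets an operator pick a profile per batch instead of re-entering individual driver settings.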

A raster image is a sequence of on and off pixels.

The words in a scanned image may look like text, but they are only pixels: they cannot be selected or searched, and the system does not understand them. To create a text-searchable document, such as a searchable PDF, the image must be processed with OCR (optical character recognition).
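As a rough illustration of why a raster image is not text, the sketch below stores the letter "T" as on and off pixels. The bitmap and the helper names are made up for this example; a real scan holds millions of such pixels, and nothing in the data says which character they depict.

```python
# A hand-made 5x7 bitmap of the letter "T": "1" = pixel on, "0" = pixel off.
GLYPH_T = [
    "11111",
    "00100",
    "00100",
    "00100",
    "00100",
    "00100",
    "00100",
]


def to_pixel_sequence(rows):
    """Flatten the rows into one sequence of on/off pixel values."""
    return [int(ch) for row in rows for ch in row]


def render(pixels, width):
    """Draw the pixel sequence back as rows of '#' (on) and '.' (off)."""
    lines = []
    for start in range(0, len(pixels), width):
        row = pixels[start:start + width]
        lines.append("".join("#" if p else "." for p in row))
    return "\n".join(lines)


pixels = to_pixel_sequence(GLYPH_T)

if __name__ == "__main__":
    print(pixels)             # just zeros and ones, no character code for "T"
    print(render(pixels, 5))  # a person sees a "T"; the data is still only pixels
```

Recognizing that this pattern is a "T" is the job OCR performs when it produces searchable text.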

Document quality is important for readability. Black and white paper documents with fair or poor quality can sometimes be improved by scanning to grayscale.

Grayscale images are larger than pure black and white images, so determine whether you can afford the larger file size before scanning to grayscale.

Examples

Black and White Scanned

Gray Scale Scanned

DPI - Resolution Settings

Typically, black and white business documents with few or no graphical elements are scanned at 200 DPI. If small details need to be retained, it may be desirable to scan at 300 DPI. For such documents, resolutions above 300 DPI generally do not improve readability and result in significantly larger file sizes.
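The size trade-offs described above can be estimated with simple arithmetic. The sketch below assumes a US Letter page (8.5 x 11 inches), 1 bit per pixel for black and white, and 8 bits per pixel for grayscale; the function name is hypothetical, and the figures are uncompressed estimates that ignore file headers and row padding.

```python
def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the raw (uncompressed) size of a scanned page in bytes.

    Ignores file headers and per-row padding, so treat it as an approximation.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


if __name__ == "__main__":
    # Assumed page size: US Letter, 8.5 x 11 inches.
    for dpi in (200, 300):
        bw = uncompressed_size_bytes(8.5, 11, dpi, bits_per_pixel=1)
        gray = uncompressed_size_bytes(8.5, 11, dpi, bits_per_pixel=8)
        print(f"{dpi} DPI: black and white {bw:,} bytes, grayscale {gray:,} bytes")
```

Under these assumptions, a 200 DPI page is about 467,500 bytes in black and white and 3,740,000 bytes in grayscale (8 times larger), and moving from 200 to 300 DPI multiplies either figure by 2.25. Compression reduces the actual file sizes, but the relative growth is why resolution and color mode should be chosen deliberately.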

Black and White 200 DPI Image

Gray Scale 200 DPI Image

Notice the increased image quality and detail in the 200 DPI Gray Scale image and further image quality improvements in the 300 DPI Gray Scale image below.

Gray Scale 300 DPI Image

Compression

  • Lossless – less compression but no data loss (e.g. TIFF).

  • Lossy – deeper compression, at the cost of some data loss (e.g. JPG).
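The difference between the two approaches can be sketched with Python's standard zlib module. This is an illustration of the principle only, not how TIFF or JPG work internally: zlib stands in for a lossless codec, dropping the low-order bits of each pixel stands in for the detail a lossy codec discards, and the pixel data is synthetic.

```python
import zlib

# Synthetic 64x64 grayscale "image" (values 0-255); a stand-in for real pixels.
original = bytes((x * 7 + y * 13) % 256 for y in range(64) for x in range(64))

# Lossless: decompressing returns exactly the bytes that were compressed.
restored = zlib.decompress(zlib.compress(original))

# Lossy (simplified): discard the 4 low-order bits of every pixel before
# compressing. The discarded detail cannot be recovered afterwards.
quantized = bytes(p & 0xF0 for p in original)
lossy_restored = zlib.decompress(zlib.compress(quantized))

# Largest per-pixel difference introduced by the lossy step.
max_error = max(abs(a - b) for a, b in zip(original, lossy_restored))

if __name__ == "__main__":
    print("lossless round trip identical:", restored == original)
    print("lossy round trip identical:", lossy_restored == original)
    print("largest pixel error after lossy step:", max_error)
```

The lossless round trip reproduces the input exactly, while the lossy version differs from the original by a small amount per pixel. That trade of fidelity for size is the choice this section describes.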
