It’s easy to print a document from your computer, but going the other way is generally harder. A scanner can save a paper document as an image, but an image isn’t much help if you want to edit the text. To get editable text, you need a technology called Optical Character Recognition, or OCR.
How does optical character recognition work?
OCR uses a range of techniques to read documents accurately. OCR software first straightens the scanned page, and potentially even individual words, so that the text is aligned correctly. The image is then converted into a pure black and white format, as two values are easier to work with than many shades of grey. Analysis is also performed to identify and remove any non-text items.
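As a rough illustration of the black and white conversion step, here is a minimal sketch in Python. It assumes the image is a grid of greyscale values from 0 (black) to 255 (white) and uses a fixed cut-off of 128. The function name, the threshold, and the sample patch are all made up for this example; real OCR software typically picks the threshold adaptively.

```python
def binarize(gray, threshold=128):
    """Map each greyscale pixel (0 = black, 255 = white) to 1 (ink) or 0 (paper).

    The fixed threshold is an assumption for this sketch, not how any
    particular OCR product chooses its cut-off.
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


# A tiny hypothetical 3x4 greyscale patch containing a dark vertical stroke.
patch = [
    [250, 240, 30, 245],
    [235, 20, 25, 230],
    [255, 200, 40, 128],
]

for row in binarize(patch):
    print(row)
# [0, 0, 1, 0]
# [0, 1, 1, 0]
# [0, 0, 1, 0]
```

After this step every pixel is either ink or paper, which is the form the later recognition steps work on.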
Two main types of OCR algorithm are used: matrix matching and feature extraction. Matrix matching takes an image of a single character and compares it, pixel by pixel, against the fonts the algorithm has been configured with. This technique requires the character to be correctly isolated from all other content, and the font to be included in the OCR software. This type of OCR also doesn’t work for recognizing handwriting.
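To make matrix matching concrete, here is a small Python sketch. It assumes each character has already been isolated and reduced to a 3x5 black and white bitmap. The three templates stand in for a "configured font" and are invented for this example, as are the function names.

```python
# Hypothetical 3x5 templates standing in for one configured font.
# '1' is an ink pixel, '0' is a paper pixel.
TEMPLATES = {
    "I": ("111", "010", "010", "010", "111"),
    "L": ("100", "100", "100", "100", "111"),
    "T": ("111", "010", "010", "010", "010"),
}


def match_score(glyph, template):
    """Return the fraction of pixels that agree between two same-sized bitmaps."""
    total = 0
    same = 0
    for glyph_row, template_row in zip(glyph, template):
        for g, t in zip(glyph_row, template_row):
            total += 1
            if g == t:
                same += 1
    return same / total


def recognize(glyph):
    """Pick the template character with the highest pixel-by-pixel agreement."""
    return max(TEMPLATES, key=lambda ch: match_score(glyph, TEMPLATES[ch]))


# A "T" with one stray ink pixel, as might come from a noisy scan.
noisy_t = ("111", "010", "010", "011", "010")
print(recognize(noisy_t))  # T
```

The sketch also shows the weakness described above: a character in a font that has no template, or one that is shifted or not cleanly isolated, would score poorly against every template.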
Feature extraction algorithms break each character down into features, such as lines, curves, and line intersections. This technique significantly reduces the algorithm’s reliance on having been trained with known fonts. Feature extraction is capable of recognizing new fonts and transcribing them, as well as some handwriting, although the accuracy isn’t as good as for known fonts.
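The sketch below illustrates the idea with a single, deliberately simple feature: the number of enclosed loops in a character. This is a stand-in chosen for brevity, not the feature set any particular OCR product uses, and the bitmaps and function name are invented for the example. The point is that two differently drawn versions of "O" produce the same feature value even though their pixels differ.

```python
def count_holes(glyph):
    """Count enclosed paper regions (loops) in a bitmap given as rows of '0'/'1'."""
    rows, cols = len(glyph), len(glyph[0])
    # Pad with a paper border so all outside paper forms one connected region.
    grid = [[0] * (cols + 2)]
    grid += [[0] + [int(c) for c in row] + [0] for row in glyph]
    grid += [[0] * (cols + 2)]
    seen = [[False] * (cols + 2) for _ in range(rows + 2)]

    def flood(start_r, start_c):
        """Mark every paper pixel reachable from the start (4-connected)."""
        stack = [(start_r, start_c)]
        seen[start_r][start_c] = True
        while stack:
            r, c = stack.pop()
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                inside = 0 <= nr < rows + 2 and 0 <= nc < cols + 2
                if inside and not seen[nr][nc] and grid[nr][nc] == 0:
                    seen[nr][nc] = True
                    stack.append((nr, nc))

    flood(0, 0)  # everything reached from the corner is outside the character
    holes = 0
    for r in range(rows + 2):
        for c in range(cols + 2):
            if grid[r][c] == 0 and not seen[r][c]:
                holes += 1  # unreached paper must be enclosed by ink
                flood(r, c)
    return holes


# Two hypothetical "fonts" for O, plus a B and a C for comparison.
square_o = ("111", "101", "101", "111")
round_o = ("0110", "1001", "1001", "0110")
letter_b = ("110", "101", "110", "101", "110")
letter_c = ("111", "100", "100", "111")

print(count_holes(square_o), count_holes(round_o))  # 1 1
print(count_holes(letter_b), count_holes(letter_c))  # 2 0
```

A real system would combine many such features, since loop count alone cannot separate, say, "O" from "D".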
Some more advanced software uses the context of the surrounding letters to help identify letters that are not as clear. For example, if the word “dog” is printed and the OCR algorithm can’t tell for sure if the “o” is an “a” or an “o”, it can use a dictionary to see if any combination of potential characters makes a known word. In this case, the OCR algorithm would discount the possibility of the “a”, as “dag” isn’t a word, while “dog” is.
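The dictionary check in the “dog” example can be sketched in a few lines of Python. The word list and the function name are hypothetical; the sketch assumes an earlier recognition step has already produced a list of candidate characters for each position in the word.

```python
from itertools import product

# A tiny hypothetical dictionary; real software would use a full word list.
WORDS = {"dog", "dig", "cat", "cot"}


def resolve(candidates, dictionary=WORDS):
    """Return every known word that can be built from the per-position candidates.

    `candidates` holds one list of possible characters for each position.
    """
    matches = []
    for combo in product(*candidates):
        word = "".join(combo)
        if word in dictionary:
            matches.append(word)
    return matches


# The middle letter could be "o" or "a"; only "dog" is a known word.
print(resolve([["d"], ["o", "a"], ["g"]]))  # ['dog']
```

When more than one combination is a real word, for example “cat” and “cot”, the dictionary alone cannot decide and the software needs some other signal to choose.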
Where is OCR used?
One of the main uses of OCR is in the postal system. OCR is used to automatically read the address on letters and parcels, a task it can do significantly faster than people can. When the OCR system is unable to read the address on a label, the item is separated out for a human to process manually instead.
When combined with a text-to-speech tool, OCR is also useful as an accessibility tool for people with visual impairments. Google Translate uses OCR as part of the process of translating the text in images.