Intelligent Character Recognition
Data extraction using OCR outsourcing or Intelligent Character Recognition (ICR): it’s the dream, right? Scan your paper, load your electronic images, and stand back while the algorithms do their thing. The result is clean, usable, error-free data.
Unfortunately, expectations and reality are often very different. While moving toward automation may be the right move, organizations must understand ICR technology’s capabilities and shortcomings and judge whether it meets their data capture goals. Doing so keeps them from spending on technology they may not need and helps them choose a solution that meets their document output objectives.
No one should force document automation if it isn’t needed. ICR is an expensive technology that requires a robust IT infrastructure to run properly. As a data entry outsourcing partner and OCR outsourcing provider, iTech can determine whether a project is a candidate for ICR based on how its forms are structured, how they are filled out, and how many there are.
How Forms are Structured
Structured and semi-structured forms can have the majority of their data read and captured with ICR. Their layouts are stable, so ICR can locate most fields with minimal effort. In addition to recognizing fields, ICR can assign data to multiple fields for future use, attach lookup tables, run calculations, and organize, sort, and apply various algorithms to the data based on the specific need.
Unstructured forms are the opposite: the data capture points follow no fixed layout. By their very nature, unstructured forms must be manually reviewed and interpreted. The industry best practice for capturing data from unstructured forms is double-blind manual keying (a topic I will cover in another blog post).
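As a rough illustration of double-blind keying, here is a minimal Python sketch. It assumes the common arrangement in which two operators key the same form independently and any field where they disagree is sent for review. The function name, field names, and sample values are hypothetical and do not describe iTech's actual workflow.

```python
def compare_double_keyed(entry_a, entry_b):
    """Compare two independent keyings of the same form.

    Returns (agreed, conflicts): fields where both operators typed the
    same value, and fields that need a third-party review.
    """
    agreed = {}
    conflicts = {}
    for field in sorted(set(entry_a) | set(entry_b)):
        # A field missing from one keying counts as an empty value.
        value_a = entry_a.get(field, "").strip()
        value_b = entry_b.get(field, "").strip()
        if value_a == value_b:
            agreed[field] = value_a
        else:
            conflicts[field] = (value_a, value_b)
    return agreed, conflicts


# Hypothetical example: the two operators disagree on the ZIP code.
operator_1 = {"name": "Ann Lee", "zip": "30301"}
operator_2 = {"name": "Ann Lee ", "zip": "30801"}
agreed, conflicts = compare_double_keyed(operator_1, operator_2)
```

Only the conflicting fields need a second look, which is where the accuracy gain over single keying comes from.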
How Forms are Filled Out
The best type of data to capture is machine print: documents filled out in standardized typefaces, as produced by a computer or typewriter. When implementing an ICR system that relies on finding and capturing keyword data in either structured or unstructured forms, machine print documents are the best way to achieve a high data capture rate.
When capturing machine print data from structured forms, data capture accuracy approaches 100 percent. If structured documents are handwritten, the ICR software still knows where the data points are, but the accuracy rate drops somewhat. For unstructured handwritten forms, accuracy is extremely low, and these forms are not suited to ICR.
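The combinations described above can be summarized as a small lookup, sketched here in Python. The enum names, the function name, and the wording of the labels are hypothetical. Treating semi-structured forms the same as structured ones is an assumption drawn from the way the two are grouped earlier in this post.

```python
from enum import Enum


class Layout(Enum):
    STRUCTURED = "structured"
    SEMI_STRUCTURED = "semi-structured"
    UNSTRUCTURED = "unstructured"


class Fill(Enum):
    MACHINE_PRINT = "machine print"
    HANDWRITTEN = "handwritten"


def icr_suitability(layout, fill):
    """Return a qualitative label for one layout/fill combination."""
    if layout is Layout.UNSTRUCTURED:
        if fill is Fill.HANDWRITTEN:
            # Accuracy is extremely low, so fall back to manual keying.
            return "not suited: manual keying"
        # Machine print still allows keyword-based capture.
        return "keyword capture only"
    # Assumption: semi-structured forms behave like structured ones.
    if fill is Fill.MACHINE_PRINT:
        return "well suited"
    return "suited, with reduced accuracy"
```

A real assessment would also weigh scan quality and volume; this sketch covers only the two factors discussed in this section.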
Project Volume
Because of ICR’s expense, it only makes sense to employ it if a project’s volume is high enough. Automation is meant to save time and, ultimately, cost. Even with highly structured, machine print forms, a low-volume project will not produce a return on the ICR investment.
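To make the volume argument concrete, here is a minimal break-even sketch in Python. It assumes a simple cost model that is not from this post: a one-time ICR setup cost plus a per-document cost, compared against a flat per-document cost for manual keying. The function name, parameters, and dollar figures are hypothetical placeholders.

```python
import math


def break_even_volume(fixed_cost, manual_cost_per_doc, icr_cost_per_doc):
    """Return the number of documents at which ICR pays for itself.

    Returns None when ICR saves nothing per document, in which case
    no volume ever recovers the setup cost.
    """
    saving_per_doc = manual_cost_per_doc - icr_cost_per_doc
    if saving_per_doc <= 0:
        return None
    # Round up: a partial document cannot be processed.
    return math.ceil(fixed_cost / saving_per_doc)


# Hypothetical figures: $50,000 setup, $0.75 manual vs. $0.25 ICR per document.
volume_needed = break_even_volume(50_000, 0.75, 0.25)
```

A project expected to stay well below the break-even figure is a candidate for manual keying rather than ICR.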
Intelligent Character Recognition has made considerable advances in recent years. With the rise of machine learning and artificial intelligence, it may eventually surpass the human ability to recognize and capture form-level data. Until then, we have to understand what it can and cannot do in order to implement it successfully.
iTech Data Services is a US-based data services and content management company with principal operations in the United States and India. iTech specializes in delivering cost-effective, quality solutions, including document scanning, OCR/ICR data capture, ICR/OCR data entry, ML-based data capture, data integration, forms processing, workflow management, data transformation, and data archiving. Well-trained, skilled employees and state-of-the-art off-shore locations enable iTech to deliver optimal solutions for its clients. For more information, contact Jason Dodge at firstname.lastname@example.org