With the renewed attention to digital transformation and automation powered by AI, Intelligent Capture, including handwriting recognition, is undergoing something of a renaissance. These solutions offer significant potential, but adopting intelligent capture carries some major risks that are relatively easy to avoid if you are armed with good information. The focus here is to provide some actionable advice. Here are four key takeaways:
- Accuracy and What It Means. Vendors tend to focus prospect attention on accuracy and portray that figure as the only input needed to calculate automation and cost savings. This is misleading. Accuracy statements such as 99% or 90% mean nothing by themselves and are red herrings. For example, with claims processing, does a vendor count Box 33 (provider billing), which includes a number of data fields, as a single field? If so, any error in this field will be counted as only one error and not several. Be sure to ask the vendor how they can make an accuracy claim for your specific requirements when no analysis has been conducted.
- Understanding Cost Savings. To understand cost savings, the vendor should provide the client with both the “read rate” and the “accuracy rate.” The read rate is the amount of data the software produces. The accuracy rate is the share of that produced data that is correct. Neither figure is meaningful without the other. Ask the vendor to explain how they calculate automation and/or cost savings.
- Machine Learning vs. Custom Project. Beware of solution providers marketing machine learning offerings that are in fact custom projects with high professional services costs and a long lead time to production. Ask the vendor how quickly a solution can be configured and be production-ready.
- Proof of Concept. Often a vendor will make a project look easy by focusing on configurations that have already been built, or by creating a configuration that looks good in a Proof of Concept (PoC) but will not work in production. Be sure to ask the vendor about their previous PoCs. How long did they take to configure the system? What were the requirements? What were the results?
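The Box 33 question from the first takeaway can be made concrete with a small sketch. It assumes a hypothetical claim form with three simple fields plus Box 33, split here into four illustrative subfields (the names are placeholders, not an official field layout). The results are invented for illustration. The point is only that the same output can produce different accuracy figures depending on the counting rule.

```python
# Hypothetical per-field results for one claim form: True = extracted correctly.
# The subfield names under "box_33" are illustrative placeholders.
results = {
    "patient_name": True,
    "date_of_service": True,
    "total_charge": True,
    "box_33": {
        "provider_name": True,
        "provider_address": False,
        "provider_phone": False,
        "provider_npi": True,
    },
}


def accuracy(results, split_composites):
    """Return the share of correct fields under one of two counting rules."""
    correct = 0
    total = 0
    for value in results.values():
        if isinstance(value, dict):
            if split_composites:
                # Each subfield counts as its own field.
                correct += sum(value.values())
                total += len(value)
            else:
                # The composite counts as one field, correct only if every part is.
                correct += all(value.values())
                total += 1
        else:
            correct += value
            total += 1
    return correct / total


composite_as_one = accuracy(results, split_composites=False)
composite_split = accuracy(results, split_composites=True)
print(f"Box 33 counted as one field:   {composite_as_one:.0%}")
print(f"Box 33 counted as four fields: {composite_split:.0%}")
```

With these made-up results, the two rules report 75% and 71% for identical output, and one error versus two. Agreeing on the counting rule up front is what makes a vendor's figure comparable to your own.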
Handwriting Recognition Requires More Preparation than OCR
If your organization is looking to automate processes that involve handwritten information, there has never been a better time. Deep learning technologies have significantly improved the capabilities and overall quality of handwriting recognizers for everything from handprinted data to complex cursive writing. However, some significant aspects of handwriting recognition can impact project success. Here are a few that you should consider:
- Handwriting Recognition cannot deliver performance like OCR. Any solution provider that promises the same level of performance for handwriting as for text-oriented OCR is selling well beyond its capabilities, even with Deep Learning. While OCR only has to deal with a finite number of type fonts and sizes, handwriting effectively has a different “font” for every person. The result is that, although a significant amount of handwritten information can be processed, handwriting recognition cannot deliver the same 95%-99% character- or word-level accuracy as OCR. This is especially true for larger amounts of handwritten information such as correspondence or answers to open-ended questions. The more words involved in a specific recognition task, the more errors there will be.
- Nothing is magic. The system still needs to be designed to help identify likely words, phrases and other data. While most printed text can be processed with OCR at high levels of reliability, handwriting recognition requires additional information about the data to be recognized. This information, called metadata or context, includes dictionaries of likely answers, identification of data types such as dates and amounts, and expected value ranges. Solution providers suggesting that they can produce good results without this effort “through the magic of machine learning” are using another magic trick: smoke and mirrors.
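The kinds of context listed above (dictionaries of likely answers, data types, expected value ranges) can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not how any particular capture product works. The field vocabulary, date format, and value range are assumptions chosen for the example.

```python
import difflib
from datetime import datetime

# Hypothetical dictionary of likely answers for a "state" field.
LIKELY_STATES = ["California", "Colorado", "Connecticut", "Florida", "Georgia"]


def constrain_to_dictionary(raw_text, vocabulary, cutoff=0.6):
    """Return the closest vocabulary entry, or None if nothing is close enough."""
    matches = difflib.get_close_matches(raw_text, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def validate_date(raw_text, fmt="%m/%d/%Y"):
    """Return a date if the text fits the assumed format, else None."""
    try:
        return datetime.strptime(raw_text, fmt).date()
    except ValueError:
        return None


def validate_amount(raw_text, low, high):
    """Return the amount if it parses and falls in the expected range, else None."""
    try:
        amount = float(raw_text.replace("$", "").replace(",", ""))
    except ValueError:
        return None
    return amount if low <= amount <= high else None


# Raw recognizer output (invented) checked against each kind of context.
print(constrain_to_dictionary("Colorodo", LIKELY_STATES))  # snapped to a known answer
print(validate_date("02/30/2023"))                         # impossible date is rejected
print(validate_amount("$1,250.00", low=0, high=10_000))    # in-range amount is accepted
```

A value that fails a check would typically be routed to manual review rather than passed downstream. That routing is exactly the design effort the bullet above says cannot be skipped.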
What You Measure Matters
When it comes to any type of Intelligent Capture, including handwriting recognition, performance comes down to two things: the amount of data that can be provided and the accuracy of that data. Many solution providers send mixed messages when asked “how accurate is your system?”, preferring to focus only on the accuracy of output data and not bothering with the question of “how much data?”
It is entirely possible for a vendor to factually claim their system is 99% accurate and still not deliver any significant amount of automation. For instance, a vendor can claim 99% accuracy on handwriting recognition when that accuracy is measured only on the data the system actually outputs, which may be a small fraction of the data you need.
The amount of output data is just as important as the accuracy of that data. Some solution providers make even more unsubstantiated claims tying the accuracy of their data to how much automation is achieved. Don’t believe the hype. Your measurements should rely upon two data points:
- Amount of Data, for instance, the total number of fields required to be extracted; and
- Accuracy of the Output Data.
Together, these give you the full picture. With only one of them, you are working blind.
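A short calculation shows why both data points are needed. The sketch below uses invented numbers for two hypothetical systems. The third figure, called `automation_rate` here, is an assumption made for illustration: the share of all required fields that are delivered correctly. It is not a formula any vendor is known to use.

```python
def capture_metrics(total_fields, fields_output, fields_correct):
    """Return (read_rate, accuracy_rate, automation_rate) as fractions.

    read_rate:       share of required fields the system produced output for
    accuracy_rate:   share of that output that is correct
    automation_rate: share of all required fields delivered correctly
                     (illustrative definition, assumed for this sketch)
    """
    read_rate = fields_output / total_fields
    accuracy_rate = fields_correct / fields_output if fields_output else 0.0
    automation_rate = fields_correct / total_fields
    return read_rate, accuracy_rate, automation_rate


# Two hypothetical systems processing the same 1,000 required fields.
system_a = capture_metrics(total_fields=1000, fields_output=200, fields_correct=198)
system_b = capture_metrics(total_fields=1000, fields_output=900, fields_correct=855)

for name, (read, acc, auto) in [("A", system_a), ("B", system_b)]:
    print(f"System {name}: read {read:.1%}, accuracy {acc:.1%}, correct overall {auto:.1%}")
```

In this example System A can truthfully advertise 99% accuracy while delivering correct data for fewer than one field in five. System B's lower 95% accuracy comes with far more usable output.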
Getting Beyond the Machine Learning Hype
Use of Machine Learning is Only Part of the Solution
In the past several years, solution providers have jumped on the “machine learning bandwagon,” making all sorts of claims about how machine learning or deep learning benefits their customers. Machine learning is only useful when it is applied to a specific problem. While it may seem magical, there is often a lot of customized work involved in getting a system up and running.
One common weakness of machine learning is the need for large amounts of training data to produce good results. Deep learning is the most data-hungry. Many solution providers that push machine learning require lots of data and lots of manual configuration to achieve even marginal results. This makes projects more costly, complex, and time-consuming, which significantly raises project risk.
The reality is that successful projects always blend a number of different technologies and techniques to deliver the best results, overcoming weaknesses inherent in any specific approach. Don’t buy into a solution that pushes a single path; that route is the perfect example of putting all of your eggs in one basket.