Machine learning is not the answer. In the past couple of years, it has become increasingly difficult to browse the Web without seeing news about the promise or dangers of machine learning. This is followed closely by software and hardware vendors’ claims about how their products tap into its power (disclosure: Parascript uses machine learning, and we also proudly announce that fact).
Enterprises Grow Skeptical of “Machine Learning”
Unfortunately, there is little information available on where machine learning is beneficial and when to apply it. There is even less useful information on when to use one type of machine learning over another. In any case, the notion that machine learning is the solution to all of our problems is woefully misguided. Put simply, machine learning is not the answer. At least, it isn’t the only answer. While many organizations blindly pursue machine learning projects for their own sake to solve automation needs, the real question any organization faced with an automation problem should be asking itself is: “What is the best, most reliable approach we should take given our specific situation?”
Sometimes the answer is the deep learning variant of machine learning; sometimes it is simpler machine learning algorithms; sometimes it is practical rules-based approaches; and sometimes the best approach is a mix. Take, for instance, the Alexa Prize, which awards $1 million to the team that can create a bot that “can converse coherently and engagingly with humans on popular topics for 20 minutes.” While many teams went down a machine learning-only path (many using the in-vogue deep learning) or a path that required building tons of hand-crafted rules, the winning team discovered that the best approach was to mix a number of techniques based upon the specific topics and information involved.
Exploring the Technology Options
Unfortunately, we are all mired in a machine learning mindset that imposes upon us an unnecessary constraint of using only machine learning algorithms to solve problems. Given all of the hype, there are dozens of “machine learning-first” start-ups across various industries attempting to apply machine learning to solve problems in an almost brute-force manner. In our industry, we’ve seen the entrance of a number of newcomers that are taking an “aggressive” machine learning approach where it may not make sense.
Ultimately, they find that trying to use deep learning to solve a problem isn’t the path that yields the best results. For instance, a fast-growing machine learning platform provider attempted to use deep learning to solve the classic “invoice problem,” where the data is highly variable. The results were mixed (and not very good). There are many reasons that specific machine learning approaches work well on some problems and poorly on others – we won’t go into them here, but knowing that each has strengths and weaknesses is of significant importance.
Deep Learning Algorithms
We’ve also experimented with various deep learning algorithms on invoices (and other tasks) and found that the single-minded pursuit of deep learning to solve the problem just doesn’t work. What does work is a mix of different techniques based upon the individual automation tasks required for locating and extracting data on these highly variable documents. The upshot is that a deep learning hammer approach views every automation problem as a nail. This is the path to failure.
While we do continue to use deep learning to achieve amazing results, particularly with handwriting recognition, we’ve found that the best approach is to test various techniques on specific tasks and then select the ones that work best. This approach has led to the creation of a document automation platform that we call Smart Learning. This platform uses a variety of techniques, each focused on achieving the best results for specific document automation tasks.
The Right Tools
Ultimately, with automation, instead of taking a single-minded machine learning approach, the key is to understand the specific tasks at hand and then to test and select the approach that yields the best results. The reality is that organizations often lack the expertise required to evaluate different approaches, so they go down the path that is easiest to pursue. Or, just as likely, they select a complex path such as machine learning, believing it is easy. Both paths end up yielding poor results at a pretty high cost.
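To make the "test and select" idea concrete, here is a minimal sketch in Python. It is an illustration only and does not describe how any particular product works: the two candidate extractors, the tiny sample sets, and every name in it are hypothetical. It scores each candidate on labeled samples for a given task and keeps whichever one scores highest.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical types: an extractor maps document text to a value (or None),
# and a sample pairs document text with the value we expect to extract.
Extractor = Callable[[str], Optional[str]]
Sample = Tuple[str, str]


def regex_total(text: str) -> Optional[str]:
    """Rules-based candidate: take the amount that follows the word 'total'."""
    match = re.search(r"\btotal\b\D*(\d+\.\d{2})", text, re.IGNORECASE)
    return match.group(1) if match else None


def largest_amount(text: str) -> Optional[str]:
    """Heuristic candidate: assume the largest amount on the page is the total."""
    amounts = re.findall(r"\d+\.\d{2}", text)
    return max(amounts, key=float) if amounts else None


def accuracy(extractor: Extractor, samples: List[Sample]) -> float:
    """Fraction of samples for which the extractor returns the expected value."""
    correct = sum(1 for text, expected in samples if extractor(text) == expected)
    return correct / len(samples)


def select_best(candidates: Dict[str, Extractor],
                samples: List[Sample]) -> Tuple[str, float]:
    """Score every candidate on the samples and return the best (name, score)."""
    scores = {name: accuracy(fn, samples) for name, fn in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Made-up evaluation data for two different tasks (assumptions, not real results).
CANDIDATES: Dict[str, Extractor] = {
    "regex_total": regex_total,
    "largest_amount": largest_amount,
}
LABELED_INVOICES: List[Sample] = [
    ("Subtotal 90.00\nTax 10.00\nTotal 100.00", "100.00"),
    ("Total 45.50\nCredit limit 500.00", "45.50"),
    ("Items 20.00\nGrand total: 20.00", "20.00"),
]
UNLABELED_RECEIPTS: List[Sample] = [
    ("Coffee 3.50\nSandwich 8.25\n11.75", "11.75"),
    ("Parking 12.00\nAmount 12.00", "12.00"),
]

if __name__ == "__main__":
    for task, samples in [("invoices", LABELED_INVOICES),
                          ("receipts", UNLABELED_RECEIPTS)]:
        name, score = select_best(CANDIDATES, samples)
        print(f"{task}: best candidate is {name} (accuracy {score:.2f})")
```

In a real evaluation the candidates could just as easily be a deep learning model, a simpler statistical model, or a rules engine; the point is that the choice is made per task from measured results, not decided in advance.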
Machine learning is not the answer – an intelligent approach to using the right tool for the right task is.