Preparing for AI in Document-centric Processes

Mar 9 · 5 min read

Collecting and Structuring Data

You would have to be hiding in a deep, dark place to have missed the widespread speculation about the impact of AI. Many AI applications rely on mining existing volumes of data for machine learning and insight. If you believe AI will affect back-office work, whether soon or in the more distant future, now is a good time to think about where that data will come from, and to start collecting it in a form that a machine can readily analyze.

Where organizations kept historical data at all, many of the records were documents, and many were stored in an analog format even though they were originally created with digital tools. Consider letters, contracts, reports and forms created with word processors since the early 1990s.

What happened to them? Invariably, they were printed, signed with ink, and filed – possibly in hardcopy files, but perhaps scanned as PDFs. All remnants of their digital origins were lost, save only for what might be extracted by optical character recognition. Even if records were retained in a digital format, the data was typically unstructured – Word documents, for example, were not designed to store data for machine reading.

AI Impact

The first crop of AI solutions had to mine legacy data as best they could, with low confidence levels that weakened the results. It doesn’t need to be like this.

Even without knowing which AI tools will be deployed, or when, it’s possible to collect data now in a way that maximizes its future value. If you operate at scale, that data might even become a material contributor to the value of the enterprise. And if you build a good store of your own data, AI tools are more likely to reveal insights from your data rather than from data aggregated from other sources.

The trick is collecting data even when you might not know how it will be used in future. For example, it’s easy to store employee records by the name of the employee because a name is an obvious identifier. Now imagine a future need to analyze all of the data in those records, combined with performance data, to identify the factors that predicted which employees would become high performers. Could you do that?
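
To make the question concrete, here is a minimal Python sketch of what that analysis could look like if the records had been captured as structured fields with a stable identifier. Everything in it is an assumption for illustration: the field names (`employee_id`, `hire_channel`, `prior_experience_years`), the rating scale, the 4.0 threshold and the sample data are all hypothetical, and a real analysis would need far more data and care.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmployeeRecord:
    # All field names are hypothetical examples of structured inputs.
    employee_id: str  # stable key, unlike a name that can change or collide
    name: str
    hire_channel: str
    prior_experience_years: int


def high_performer_rate_by(records, ratings, field, threshold=4.0):
    """Return the share of high performers for each value of `field`.

    `ratings` maps employee_id to a performance rating; employees with
    no rating are skipped.
    """
    groups = {}
    for rec in records:
        rating = ratings.get(rec.employee_id)
        if rating is None:
            continue
        groups.setdefault(getattr(rec, field), []).append(rating >= threshold)
    return {value: sum(flags) / len(flags) for value, flags in groups.items()}


# Made-up sample data, for illustration only.
records = [
    EmployeeRecord("E1", "Ana", "referral", 5),
    EmployeeRecord("E2", "Ben", "referral", 2),
    EmployeeRecord("E3", "Cy", "job_board", 7),
    EmployeeRecord("E4", "Di", "job_board", 1),
]
ratings = {"E1": 4.5, "E2": 4.2, "E3": 3.1, "E4": 4.0}

rates = high_performer_rate_by(records, ratings, "hire_channel")
```

The join only works because both data sets share `employee_id`. Records filed under a name alone, or scanned as PDFs, would need to be re-keyed or re-extracted before a comparison like this was possible.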

Legito Approach

The Legito approach is to retain all the digital inputs that power back-office automation. In the near term, use the automation to fulfil the functions of your team, including those that involve documents.

If you retain the digital inputs throughout your organization’s processes, you have the best possible data to examine later. Who knows what you will learn, and the value you will add? Data storage is cheap.
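
As a rough illustration of the idea, and not a description of how Legito itself works, the sketch below keeps the structured inputs behind a generated document as a JSON record alongside the rendered text. The function names (`capture_inputs`, `render_letter`), the template ID and the field names are all hypothetical.

```python
import json
from datetime import datetime, timezone


def capture_inputs(template_id, fields, captured_at=None):
    """Serialize the inputs behind a document as a machine-readable record."""
    record = {
        "template_id": template_id,
        "captured_at": (captured_at or datetime.now(timezone.utc)).isoformat(),
        "fields": fields,
    }
    return json.dumps(record, sort_keys=True)


def render_letter(fields):
    """Produce the human-readable document from the same inputs."""
    return (
        f"Dear {fields['recipient']}, "
        f"your contract starts on {fields['start_date']}."
    )


# Hypothetical inputs for a simple letter template.
inputs = {"recipient": "A. Example", "start_date": "2024-01-15"}

letter = render_letter(inputs)                        # goes to the reader
stored = capture_inputs("offer-letter-v1", inputs)    # goes to the data store
```

The letter can still be printed and signed, but the stored record preserves each input as a named field, so nothing has to be recovered later by optical character recognition.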
