Robotic Text Automation (RTA)

Applica RTA is unrivaled at interpreting documents and making necessary decisions with the highest levels of precision

Next-generation process automation

Applica RTA is the newest addition to the automation landscape, developed specifically to bridge the legacy gaps in NLP and OCR solutions. Applica RTA combines the best of two worlds – deep-learning-driven NLP and Computer Vision – by leveraging Applica's proprietary research in layout-aware neural language modeling. As a result, Applica RTA can process all document types – plain text, tables, and forms – without relying on laborious templates.

Experience Applica RTA for yourself

To show how easy it is for people of all skill levels to operate, we've included a five-step graphic that highlights the Applica RTA user experience. Click on each header (Data, DataPoints, Context, Training, Validation) to display what the user interface looks like during that step of the Applica RTA process. To view additional information about the components in each step, click on the blue circles on each screen.

Manage your document library for automated processing

Specify what needs to be done

Apply Applica's proprietary Contextual Awareness technology

Train your AI model and review its performance

Self-learning from interactions with end users

A set of documents created by the user that can be selected for further processing.

In this example, we are using publicly available NDAs, and "Jurisdiction" and "Counterparts" are the metadata fields we are training on.

The pipeline defines how the data will be processed and which engine will be launched (Classification, Extraction, Context); pipelines are selected by the user.

Every pipeline has a different set of possible DataTypes; e.g., a Classification pipeline will have only a "Class" type, while Extraction pipelines will have types such as "organization", "personal data", or "currency".

The user can switch between different DataPoints to find relevant information faster.

The user can easily see the highlighted fragment even if it is split across the document (e.g., the first fragment is on the first page and the second fragment is on the last page).

To train Applica's proprietary Contextual Awareness feature to find the relevant phrasing for the information to be extracted from the documents, the user simply highlights the relevant text associated with the metadata point.

In RTAStudio, all machine learning processes (Extraction, Classification, Context) can be called "training".

Utilizing only a small training set of documents and a couple of examples for Contextual Awareness training, Applica RTA achieves an incredibly high level of precision.

When the model is ready, the user will save the DataPoints and deploy the RTABot.

The RTABot, combined with RPA technology or other workflow platforms, will allow for unprecedented levels of straight-through processing.

F-score is the harmonic mean of precision and recall.

Recall tells how many of the true "yes" cases the model found out of all the "yes" cases it should have predicted (if it predicted 50 correct "yes" results but the set contained 100 "yes" cases, recall is 50/100). It reflects the model's capacity to detect the value.

Precision tells how many of the predicted "yes" results are actually correct (if the model predicted 60 "yes" results but only 50 are correct, precision is 50/60). It reflects how sure we can be that a predicted "yes" is correct.
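
As a minimal illustration of how these three metrics combine (plain Python, independent of RTAStudio), the numbers from the examples above work out as follows:

```python
def precision_recall_f1(true_positives, predicted_positives, actual_positives):
    """Compute precision, recall, and F-score (the harmonic mean of the two)."""
    precision = true_positives / predicted_positives
    recall = true_positives / actual_positives
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Using the examples above: 50 correct "yes" predictions,
# 60 predicted "yes" in total, 100 actual "yes" cases in the set.
p, r, f = precision_recall_f1(50, 60, 100)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
# precision=0.833 recall=0.500 f1=0.625
```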

Here the user can see the number of documents used for training vs. testing, as well as the results. By filtering on TrainSet/TestSet, the quality of the model can easily be checked (the user will see the values produced by the trained model).

A gold value is the value we want to get and are sure is correct. It is used to evaluate the model (F-score, precision, recall) when the model is tested on a test set (documents that were not used for training).

When the user hovers over these text snippets, they can conveniently see the fragment of text that was extracted by the AI model.

Easy manual validation for the corner cases that fall below the confidence score selected by the user.
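
As a sketch of what this routing might look like (the threshold value and field names here are hypothetical, not the RTA API):

```python
CONFIDENCE_THRESHOLD = 0.90  # hypothetical user-selected cutoff

def route(datapoint):
    """Route a prediction: low-confidence results go to manual validation,
    everything else passes straight through."""
    if datapoint["confidence"] < CONFIDENCE_THRESHOLD:
        return "manual_validation"
    return "straight_through"

print(route({"value": "State of New York", "confidence": 0.42}))
# -> manual_validation
```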

The user can group DataPoints for easy validation.

Use the "Phrase" navigator in the top right to easily find words or phrases (especially useful when the model didn't find any value, or the value looks wrong and the user must locate it and fill it in manually).

The user can see extracted as well as classified values (from different pipelines).

The user can click on a DataPoint and scroll to its value for easy access.

The user can scroll automatically to the context (when present) by clicking the arrow next to the DataPoint chips.

The user can modify the values manually.

The user can easily see which DataPoints had low confidence scores (marked "LOW" in red text) and should be verified.

On the left panel, the values are normalized (e.g., you always get the same date format even if the document says "15th of April").
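
A minimal sketch of this kind of date normalization, using only the Python standard library (the actual RTA normalizer is not shown here, and the default year is assumed purely for illustration):

```python
import re
from datetime import datetime

def normalize_date(text, year=2024):
    """Illustrative only: strip ordinal suffixes ("15th" -> "15")
    and render the date in a single ISO format."""
    cleaned = re.sub(r"(\d+)(st|nd|rd|th)", r"\1", text)
    parsed = datetime.strptime(f"{cleaned} {year}", "%d of %B %Y")
    return parsed.strftime("%Y-%m-%d")

print(normalize_date("15th of April"))  # -> 2024-04-15
```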

If too many DataPoints appear in one document, they are divided by page (and in the preview, only the highlighted values for the current page are visible).

Natural Language Processing – Simple technology

- Relies on keywords and rules hand-crafted by AI experts OR requires extremely large volumes of training data
- Handles homogenous plain-text documents
- Extracts information usually without interpretation
- Struggles with tables/forms
- Laborious deployment
- Requires expensive maintenance by AI experts
- Heavy maintenance procedures involving expert engineering resources

Intelligent OCR – Limited technology

- Merely recognizes letters and simple data, doesn’t comprehend or make decisions
- Only copies snippets of text from forms and tables
- Language dependent
- Can’t interpret text
- Relies on templates
- Requires expert knowledge and continuous template updates

Applica RTA – The latest technology

- Helps organizations make faster, more accurate decisions
- Contextually extracts information and makes necessary decisions
- Language and industry agnostic
- Comprehends documents regardless of format (plain text and tables/forms)
- Templateless
- Deployment by business practitioners (no AI knowledge required) with minimal customer data
- System maintenance by self-learning from end users