Workflow of the Week: Visual Text Recognition & Text Aggregation

Clarifai's platform offers predictive modeling for images, videos and text, three of the most common forms of data in the world. One of the most exciting things about a platform that handles all three input types is how they can work together to solve complex problems. Let's take a look at our public "Visual Text Recognition" workflow and how it can help you connect the dots between text in images and encoded text.

Setting up an app to work with "Visual Text Recognition"

To begin, simply create a new application and choose "Visual Text Recognition" as your base workflow.

Workflow ID: Visual-Text-Recognition

Owner: Clarifai

Input: Image

Output: Text

Once we've created our app, let's create a custom Text Aggregation model. To do this, click on the "Model Mode" tab on the left-hand side of the screen. The Text Aggregation Operator offers parameters that let you tune the width and height of the window within which words are considered part of the same line. You can adjust these parameters for optimal performance based on the type of image data that you will be processing (road signs will have different visual characteristics than scanned documents, for example). Adjust as needed, give your model a descriptive name, and click "Create New Model".
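
To build some intuition for what the width and height window controls, here is a minimal sketch of line aggregation in Python. It is not Clarifai's implementation: the `Word` class, the `aggregate_lines` function, and the `height_window` and `width_window` parameter names are all hypothetical, and the sketch assumes each detected word comes with a simple bounding box.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge of the bounding box
    y: float  # vertical center of the bounding box
    w: float  # box width
    h: float  # box height


def aggregate_lines(words, height_window=0.5, width_window=2.0):
    """Group words into lines (hypothetical logic, for illustration only).

    A word joins an existing line when its vertical center is within
    height_window * (previous word height) of the line's last word, and
    the horizontal gap is at most width_window * (previous word height).
    """
    lines = []
    # Process words left to right so each line grows in reading order.
    for word in sorted(words, key=lambda item: item.x):
        for line in lines:
            last = line[-1]
            same_row = abs(word.y - last.y) <= height_window * last.h
            gap = word.x - (last.x + last.w)
            if same_row and gap <= width_window * last.h:
                line.append(word)
                break
        else:
            # No existing line accepted the word, so it starts a new one.
            lines.append([word])
    # Order the finished lines top to bottom.
    lines.sort(key=lambda line: line[0].y)
    return [" ".join(item.text for item in line) for line in lines]


sign = [
    Word("STOP", x=0.10, y=0.20, w=0.20, h=0.10),
    Word("AHEAD", x=0.35, y=0.21, w=0.25, h=0.10),
    Word("SLOW", x=0.10, y=0.50, w=0.20, h=0.10),
]
print(aggregate_lines(sign))
```

With the default windows, "STOP" and "AHEAD" land on one line and "SLOW" on the next. Shrinking `width_window` splits widely spaced words into separate lines, which is the kind of trade-off you would tune differently for road signs than for dense scanned documents.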

Next, visit the "Workflows" tab in Model Mode. Navigate to your "Visual Text Recognition" workflow and click the "Copy to New Workflow" button.

This will grab all of the underlying models in the Visual Text Recognition workflow and take you to the "Create a Workflow" page. From here, let's add our Text Aggregation model to the copied workflow.

Finally, we need to connect our "Input Nodes" so that data flows through the workflow in the right order. Set the input of the "1.0 Cropper" to the "Visual Text Detection" model, the input of the "Visual Text Recognition" model to the "1.0 Cropper", and the input of the "Text Aggregation" model to the "Visual Text Recognition" model. Then click "Create Workflow".
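
The input-node wiring above forms a small dependency graph. As a sanity check on the ordering, the sketch below models it with Python's standard `graphlib` module (Python 3.9+). The dictionary is only an illustration of the connections described in this post, not a Clarifai workflow definition, and the `workflow_inputs` name is hypothetical.

```python
from graphlib import TopologicalSorter

# Each node maps to the list of nodes it takes its input from.
# The detection model has no upstream node: it reads the image directly.
workflow_inputs = {
    "Visual Text Detection": [],
    "1.0 Cropper": ["Visual Text Detection"],
    "Visual Text Recognition": ["1.0 Cropper"],
    "Text Aggregation": ["Visual Text Recognition"],
}

# A topological sort yields an order in which every node runs after its input.
execution_order = list(TopologicalSorter(workflow_inputs).static_order())
print(" -> ".join(execution_order))
```

If a connection were wired backwards and created a cycle, `TopologicalSorter` would raise `graphlib.CycleError`, which is a quick way to spot the mistake on paper before building the workflow.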

Now let's upload some images and test out the results. Upload your images through Data Mode or our API, and view them in the Explorer tab. In the right-hand sidebar, select the App Workflow tab, click the gear icon, and select your new workflow.

Your new workflow now detects, crops, recognizes and aggregates the text in your images, turning it into encoded text.
