
Handwritten Text Recognition Using TensorFlow and CNN

Submitted by OodlesAI on Wed, 06/10/2020 - 23:51

TensorFlow is an open-source deep learning framework for machine learning. We use TensorFlow to build OCR systems for handwritten text, object detection, and number plate recognition, with the aim of improving recognition accuracy. As a well-positioned AI development company, Oodles AI explores how to build and deploy handwritten text recognition using TensorFlow and CNN from scratch.

Handwritten Text Recognition (HTR) systems enable computers to receive and interpret handwritten input from sources such as scanned images. These systems convert handwritten text into digital text so that it can be digitized, stored, and mined for valuable information. At Oodles, we use tools like OpenCV and provide TensorFlow development services to build a Neural Network (NN) which is trained on line images from the off-line HTR dataset.

This Neural Network (NN) model splits the text written in the scanned image into segmented line images. These line images are smaller than the image of the complete page. About 9 out of 10 words of a segmented line from the validation set are correctly recognized, and the character error rate is around 8%.
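
The article does not say how these two figures are computed. A minimal sketch, assuming the common definitions (character error rate as total edit distance divided by the total number of ground-truth characters, and word accuracy as the share of exact matches), could look like this. The function names edit_distance and evaluate are hypothetical and not part of the project.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character into a
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def evaluate(predictions, ground_truths):
    """Return (character_error_rate, word_accuracy) for paired lists.

    Assumed definitions, not taken from the article:
    - character error rate = sum of edit distances / sum of ground-truth lengths
    - word accuracy        = exact matches / number of words
    """
    pairs = list(zip(predictions, ground_truths))
    char_errors = sum(edit_distance(p, g) for p, g in pairs)
    total_chars = sum(len(g) for g in ground_truths)
    words_ok = sum(1 for p, g in pairs if p == g)
    return char_errors / total_chars, words_ok / len(ground_truths)
```

For example, evaluate(["hello", "wrld"], ["hello", "world"]) counts one character error over ten ground-truth characters and one correct word out of two.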

The network is made up of 5 CNN layers and 2 RNN layers, and the workflow can be divided into 3 steps:

1. Create 5 Convolutional Neural Network (CNN) layers

There are 5 CNN layers. Each layer consists of three operations. First, a convolution, with 5×5 filter kernels in the first 2 layers. Second, the non-linear ReLU function. Finally, a pooling layer. The output is a feature map.
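
To see how five convolution and pooling stages can turn an image into a sequence of features, the sketch below traces only the tensor shapes; it does not perform any convolution. Apart from the 5×5 kernels in the first 2 layers, every number here is an assumption for illustration: the 128×32 input size, the 3×3 kernels in the last 3 layers, the channel counts, the pooling windows, and "same" padding.

```python
# Assumed settings for the 5 layers. Only the 5x5 kernels of the first two
# layers come from the article; everything else is an illustrative guess.
ASSUMED_LAYERS = [
    {"kernel": 5, "channels": 32,  "pool": (2, 2)},
    {"kernel": 5, "channels": 64,  "pool": (2, 2)},
    {"kernel": 3, "channels": 128, "pool": (1, 2)},
    {"kernel": 3, "channels": 128, "pool": (1, 2)},
    {"kernel": 3, "channels": 256, "pool": (1, 2)},
]


def trace_feature_map(width, height, layers=ASSUMED_LAYERS):
    """Return the (width, height, channels) shape after each layer.

    Convolutions are assumed to use 'same' padding, so the kernel size does
    not change the spatial size; only the pooling window (width, height)
    shrinks it.
    """
    shapes = []
    for layer in layers:
        pool_w, pool_h = layer["pool"]
        width, height = width // pool_w, height // pool_h
        shapes.append((width, height, layer["channels"]))
    return shapes


if __name__ == "__main__":
    for number, shape in enumerate(trace_feature_map(128, 32), start=1):
        print(f"after layer {number}: {shape}")
```

Under these assumptions the final feature map is 32 positions wide, 1 high, and 256 channels deep, which lines up with the 32×256 sequence size used in the next step.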

2. Create Recurrent Neural Network (RNN) layers and return their output

Create and stack two RNN layers with 256 units each, and build a bidirectional RNN from the stacked layers. This gives 2 output sequences, forward and backward, each of size 32×256. The output is used to calculate the loss value and is also decoded into the final text.
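
The article does not name the loss function or the decoder. Assuming a CTC-style output, where each of the 32 time steps scores every character plus one extra "blank" label, the decoding step can be sketched as best-path decoding: pick the best label per time step, collapse repeats, and drop blanks. The function name best_path_decode and the choice of the last index as the blank are assumptions for illustration.

```python
def best_path_decode(scores, charset, blank_index=None):
    """Decode per-time-step scores into text (greedy, CTC-style).

    scores:      list of rows, one per time step; each row holds one score
                 per character in charset plus one for the blank label
    charset:     string of recognizable characters
    blank_index: index of the blank label (assumed to be the last index)
    """
    if blank_index is None:
        blank_index = len(charset)

    text = []
    previous = None
    for row in scores:
        # Index of the highest-scoring label at this time step.
        best = max(range(len(row)), key=row.__getitem__)
        # Emit a character only when the label changes and is not the blank.
        if best != previous and best != blank_index:
            text.append(charset[best])
        previous = best
    return "".join(text)
```

A blank between two identical labels is what allows doubled letters: the label sequence a, a, blank, a, b, b decodes to "aab".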

Architecture

3. Create IAM-compatible dataset and train model

The data loader expects the IAM dataset [5] in the data/ directory. Below are the steps to get the dataset:

Register for free at fki.inf.unibe.ch.
Download words/words.tgz and extract it.
Download ascii/words.txt.
Put words.txt into the data/ directory.
Create the directory data/words/.
Put the contents (directories a01, a02, ...) of words.tgz into data/words/.
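
Before starting a long training run, it can help to confirm that the files ended up where the steps above put them. The following is a minimal sketch using only the paths named in those steps; the helper name missing_dataset_items is hypothetical and not part of the project.

```python
from pathlib import Path


def missing_dataset_items(data_dir="data"):
    """Return a list of expected dataset paths that are missing."""
    root = Path(data_dir)
    missing = []

    # words.txt must sit directly inside the data/ directory.
    if not (root / "words.txt").is_file():
        missing.append(str(root / "words.txt"))

    # data/words/ must exist and contain the extracted directories
    # (a01, a02, ...) from words.tgz.
    words_dir = root / "words"
    if not words_dir.is_dir():
        missing.append(str(words_dir))
    elif not any(entry.is_dir() for entry in words_dir.iterdir()):
        missing.append(str(words_dir / "<a01, a02, ...>"))

    return missing


if __name__ == "__main__":
    problems = missing_dataset_items("data")
    print("dataset layout looks fine" if not problems else f"missing: {problems}")
```

An empty list means the layout matches the steps; it does not check that the images themselves are complete.
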

Train the model from scratch

To train the model from scratch, we go to the src/ directory of our project and execute this command in the terminal: python main.py --train. After training, validation is done on a validation set (the dataset is split into 95% of the samples for training and 5% for validation, as defined in the class DataLoader). Validation is done by executing the command python main.py --validate. Training on a CPU takes about 30 hours on a typical system.
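
The 95%/5% split can be pictured with a short sketch. This is not the code of the DataLoader class, which the article does not show; split_samples is a hypothetical helper that only illustrates the ratio.

```python
def split_samples(samples, train_fraction=0.95):
    """Split a list of samples into (training, validation) parts.

    The first train_fraction of the list is used for training and the
    remainder for validation. Whether the samples are shuffled beforehand
    is left to the caller.
    """
    split_index = round(train_fraction * len(samples))
    return samples[:split_index], samples[split_index:]


if __name__ == "__main__":
    train, validation = split_samples(list(range(1000)))
    print(len(train), "training samples,", len(validation), "validation samples")
```

Keeping the validation samples out of training is what makes the reported word accuracy and character error rate meaningful for unseen handwriting.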
