Text Recognition From an Image App

Rutvi Pan
4 min read · Mar 19, 2021


Optical Character Recognition (OCR) has been a topic of interest for many years. It is defined as the process of digitizing a document image into its constituent characters. In this post we present an OCR app that recognizes text in an image.

Introduction:

Optical Character Recognition (OCR) is software that converts printed text and images into a digital form that a machine can manipulate. The human brain easily recognizes text, numbers, and characters in an image, but machines are not as capable, so they cannot easily extract the information an image contains. A large body of research has therefore tried to transform images into a format that machines can understand. OCR is a complex problem because of the variety of languages, writing styles, and fonts, and because every language has its own complex rules.

Types of Optical Character Recognition Systems:

We can categorize character recognition systems by character connectivity, image acquisition mode, font restrictions, language, and so on. Based on the type of input image, OCR falls into two groups: 1) recognition of handwritten text and 2) recognition of machine-printed text. Machine-printed text is the simpler problem because characters usually have uniform dimensions and their positions on the page can be predicted. Handwritten character recognition is much harder because writing styles differ from user to user, users write in different languages, and the same character can be produced with different pen movements.

Handwriting recognition systems can be further divided into two sub-categories: online and offline. Online recognition is performed in real time while the user is writing. These systems are less complex because they can capture time-based information such as speed, velocity, the number of strokes, and the direction of each stroke. In addition, there is no need for thinning techniques, since the pen trace is only a few pixels wide. Offline recognition systems operate on static data, meaning the input is a bitmap, so recognition is considerably more difficult.
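To make the time-based information available to online systems more concrete, here is a minimal sketch in plain Java. The class names `PenPoint` and `StrokeFeatures` are hypothetical and are not part of our app; the sketch only assumes that each pen sample carries a position and a timestamp in milliseconds.

```java
import java.util.List;

/** One sampled pen position with a timestamp in milliseconds (hypothetical type). */
class PenPoint {
    final double x;
    final double y;
    final long timeMs;

    PenPoint(double x, double y, long timeMs) {
        this.x = x;
        this.y = y;
        this.timeMs = timeMs;
    }
}

/** Illustrative time-based features for a single pen stroke. */
class StrokeFeatures {
    /** Total path length: sum of distances between consecutive samples. */
    static double pathLength(List<PenPoint> stroke) {
        double length = 0.0;
        for (int i = 1; i < stroke.size(); i++) {
            PenPoint a = stroke.get(i - 1);
            PenPoint b = stroke.get(i);
            length += Math.hypot(b.x - a.x, b.y - a.y);
        }
        return length;
    }

    /** Average speed in pixels per millisecond; 0 if the stroke has no duration. */
    static double averageSpeed(List<PenPoint> stroke) {
        if (stroke.size() < 2) {
            return 0.0;
        }
        long duration = stroke.get(stroke.size() - 1).timeMs - stroke.get(0).timeMs;
        return duration > 0 ? pathLength(stroke) / duration : 0.0;
    }

    /** Overall direction from the first to the last sample, in degrees. */
    static double directionDegrees(List<PenPoint> stroke) {
        if (stroke.size() < 2) {
            return 0.0;
        }
        PenPoint first = stroke.get(0);
        PenPoint last = stroke.get(stroke.size() - 1);
        return Math.toDegrees(Math.atan2(last.y - first.y, last.x - first.x));
    }
}
```

The number of strokes is then simply the number of such point lists recorded for one character. An offline system has none of this information, because it only sees the final bitmap.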

Figure: categorization of character recognition systems

Phases Of OCR:

These phases are as follows:

Image acquisition: First, the image is captured from a scanner, a camera, or a similar source.

Preprocessing: Once the image has been captured, different preprocessing steps can be performed to improve its quality. Common preprocessing techniques include thresholding and extraction of the text baseline.
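As an illustration of thresholding, the sketch below binarizes a grayscale image using Otsu's method, a widely used way to choose a global threshold automatically. This is only one possible technique and not necessarily what our app or its libraries do internally. The class name `Thresholder` is hypothetical, and the image is assumed to be a 2D array of gray values in the range 0 to 255 with dark text on a light background.

```java
/** Illustrative global thresholding (assumes gray values in 0..255). */
class Thresholder {
    /** Otsu's method: choose the threshold that maximizes between-class variance. */
    static int otsuThreshold(int[][] gray) {
        int[] hist = new int[256];
        int total = 0;
        for (int[] row : gray) {
            for (int v : row) {
                hist[v]++;
                total++;
            }
        }

        long sumAll = 0;
        for (int i = 0; i < 256; i++) {
            sumAll += (long) i * hist[i];
        }

        long sumDark = 0;     // sum of gray values at or below the candidate threshold
        int weightDark = 0;   // number of pixels at or below the candidate threshold
        double bestVariance = -1.0;
        int best = 0;
        for (int t = 0; t < 256; t++) {
            weightDark += hist[t];
            if (weightDark == 0) {
                continue;     // no pixels in the dark class yet
            }
            int weightLight = total - weightDark;
            if (weightLight == 0) {
                break;        // all pixels are in the dark class
            }
            sumDark += (long) t * hist[t];
            double meanDark = (double) sumDark / weightDark;
            double meanLight = (double) (sumAll - sumDark) / weightLight;
            double diff = meanDark - meanLight;
            double variance = (double) weightDark * weightLight * diff * diff;
            if (variance > bestVariance) {
                bestVariance = variance;
                best = t;
            }
        }
        return best;
    }

    /** Returns true for ink (value at or below the threshold), false for background. */
    static boolean[][] binarize(int[][] gray, int threshold) {
        boolean[][] ink = new boolean[gray.length][];
        for (int y = 0; y < gray.length; y++) {
            ink[y] = new boolean[gray[y].length];
            for (int x = 0; x < gray[y].length; x++) {
                ink[y][x] = gray[y][x] <= threshold;
            }
        }
        return ink;
    }
}
```

A typical call would be `Thresholder.binarize(gray, Thresholder.otsuThreshold(gray))`. A single global threshold works best on evenly lit images; photos with shadows usually need a locally adaptive variant.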

Character segmentation: In this step, the characters in the image are separated from each other so that they can be passed to the recognition engine one at a time.
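One simple way to separate characters in a single line of printed text is a vertical projection: every column that contains no ink is treated as a gap between characters. The sketch below assumes a rectangular binary image where `true` marks ink; the class name `ColumnSegmenter` is hypothetical. It will not split touching or cursive characters, which need more elaborate methods.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative character segmentation by vertical projection. */
class ColumnSegmenter {
    /**
     * Returns one {startColumn, endColumnExclusive} pair for each
     * maximal run of adjacent columns that contain at least one ink pixel.
     */
    static List<int[]> segment(boolean[][] ink) {
        List<int[]> spans = new ArrayList<>();
        if (ink.length == 0) {
            return spans;
        }
        int width = ink[0].length;  // assumes all rows have the same length
        int start = -1;             // start column of the current run, or -1 if none
        for (int x = 0; x < width; x++) {
            boolean hasInk = false;
            for (boolean[] row : ink) {
                if (row[x]) {
                    hasInk = true;
                    break;
                }
            }
            if (hasInk && start < 0) {
                start = x;                          // a new character begins
            } else if (!hasInk && start >= 0) {
                spans.add(new int[] {start, x});    // the character ended at x
                start = -1;
            }
        }
        if (start >= 0) {
            spans.add(new int[] {start, width});    // character touches the right edge
        }
        return spans;
    }
}
```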

Feature extraction: The segmented characters are then processed to extract different features. The characters are recognized based on these features.
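As one example of a feature, the sketch below computes "zoning" features: the character image is divided into a grid, and the fraction of ink pixels in each cell becomes one number in the feature vector. Zoning is just one of many possible features, chosen here because it is short to write down. The class name `ZoningFeatures` and the `grid` parameter are hypothetical, and the glyph is assumed to be a non-empty rectangular binary image.

```java
/** Illustrative zoning features for one segmented character. */
class ZoningFeatures {
    /**
     * Splits the glyph into grid x grid zones and returns the fraction
     * of ink pixels in each zone, in row-major order.
     */
    static double[] extract(boolean[][] glyph, int grid) {
        int height = glyph.length;
        int width = glyph[0].length;
        double[] features = new double[grid * grid];
        int[] pixelCounts = new int[grid * grid];

        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                // Map the pixel to its zone index.
                int zone = (y * grid / height) * grid + (x * grid / width);
                pixelCounts[zone]++;
                if (glyph[y][x]) {
                    features[zone] += 1.0;
                }
            }
        }
        // Turn ink counts into fractions so glyph size does not matter.
        for (int i = 0; i < features.length; i++) {
            if (pixelCounts[i] > 0) {
                features[i] /= pixelCounts[i];
            }
        }
        return features;
    }
}
```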

Character classification: This step maps the extracted features of each character image to a category, that is, to the character it most likely represents.

There are different types of character classification techniques.

Structural classification techniques are based on features extracted from the structure of the image and use different decision rules to classify characters.

Statistical pattern classification methods use probabilistic models and other statistical methods to classify the characters.
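To show what "mapping features to categories" can look like in code, here is a minimal nearest-neighbor sketch: each known character is stored as a reference feature vector, and an unknown character gets the label of the closest reference. This is a deliberately simple stand-in and not the classifier used by our app or by ML Kit. The class name `NearestNeighborClassifier` is hypothetical, and all feature vectors are assumed to have the same length.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative nearest-neighbor classifier over fixed-length feature vectors. */
class NearestNeighborClassifier {
    private final Map<Character, double[]> templates = new LinkedHashMap<>();

    /** Registers one reference feature vector for a character label. */
    void addTemplate(char label, double[] features) {
        templates.put(label, features.clone());
    }

    /** Returns the label whose template has the smallest squared Euclidean distance. */
    char classify(double[] features) {
        if (templates.isEmpty()) {
            throw new IllegalStateException("no templates registered");
        }
        char best = 0;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (Map.Entry<Character, double[]> entry : templates.entrySet()) {
            double[] template = entry.getValue();
            double distance = 0.0;
            for (int i = 0; i < template.length; i++) {
                double diff = template[i] - features[i];
                distance += diff * diff;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = entry.getKey();
            }
        }
        return best;
    }
}
```

In practice, a classifier would be trained on many samples per character rather than a single template.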

Post processing: After classification, the results are not 100% correct, especially for complex languages or complex fonts. Post-processing techniques are applied to improve the accuracy of OCR systems. These techniques use natural language processing and geometric and linguistic context to correct errors in the OCR results.
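A very basic form of linguistic post-processing is dictionary lookup: if a recognized word is not in the dictionary, replace it with the closest dictionary word, provided it is close enough. The sketch below measures closeness with the Levenshtein edit distance. The class name `DictionaryCorrector` and the `maxDistance` parameter are hypothetical, and a real system would also weigh word frequency and the surrounding context.

```java
import java.util.List;

/** Illustrative dictionary-based correction of OCR output. */
class DictionaryCorrector {
    /** Levenshtein distance: minimum number of insertions, deletions, and substitutions. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Returns the closest dictionary word within maxDistance edits,
     * or the original word if no dictionary entry is close enough.
     */
    static String correct(String word, List<String> dictionary, int maxDistance) {
        String best = word;
        int bestDistance = maxDistance + 1;
        for (String candidate : dictionary) {
            int distance = editDistance(word, candidate);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = candidate;
            }
        }
        return best;
    }
}
```

For example, with a dictionary containing "hello", the misread word "he1lo" (digit 1 instead of the letter l) is one substitution away and would be corrected to "hello" when `maxDistance` is 1.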

Methodology and Working of Our app:

The name of our app is capture2text. With it, we can extract text from an image captured by the camera or chosen from the gallery.

Snapshots of the capture2text app:

Translation Part:

We have also implemented a translation feature, which is helpful for people who face a language barrier. It lets them translate the extracted English text into Hindi. For translation we have used Firebase ML Kit.

We have used 2 dependencies, one to crop the image and one to extract text from it:

To crop the image:

'com.theartofdev.edmodo:android-image-cropper:2.8.+'

To extract text from an image:

'com.google.android.gms:play-services-vision:16.2.0'

Tools and Technology:

We built the whole app in Android Studio using Java and XML. We also used Google ML Kit for text recognition.

Application of OCR:

In its early days, OCR was used for mail sorting, bank cheque reading, and signature verification.

Organizations can use OCR for automated form processing wherever a large amount of data is available in printed form.

Other uses of OCR include processing utility bills, passport validation, pen computing, and automated number plate recognition.

OCR is also useful in helping blind and visually impaired people to read text.
