
Text area recognition

In my spare time I wrote a library in Java that can recognise areas of an input image that are likely to contain text. The algorithm used to find areas of text is one I designed myself.

The table below contains the results of some test runs of the library. The parameters of the program were kept constant for the whole series of images, on the assumption that, to be really useful, the program should not rely on a user being available to adjust parameters for each image. I have included some tests that didn't work very well to illustrate the limitations of the program. If the parameters are adjusted for each image, however, the textual regions in every image can be correctly identified.

Input           Output                        Input           Output
Test image 1    Output from test image 1      Test image 2    Output from test image 2
Test image 3    Output from test image 3      Test image 4    Output from test image 4
Test image 5    Output from test image 5      Test image 6    Output from test image 6
Test image 7    Output from test image 7      Test image 8    Output from test image 8
Test image 9    Output from test image 9      Test image 10   Output from test image 10
Test image 11   Output from test image 11     Test image 12   Output from test image 12

So how does the algorithm work? It starts by ignoring most of the information in the image, and concentrating on the contrast. For example, the two images below show a typical input image, and its contrast as seen by the library.

[Images: example input; example contrast]

This is already looking good. It accentuates the letters in the text while reducing the background noise. The contrast algorithm in the library has a few fairly sophisticated features such as the ability to ignore very long straight lines of contrast, which are more likely to be borders than text.
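
To give a feel for what a contrast map is, here is a minimal sketch in Java. It is not the library's code: the class name ContrastSketch, the greyscale int[][] input, and the contrast measure itself (the larger of the absolute differences to the right-hand and lower neighbours) are all assumptions made for illustration, and it leaves out refinements such as ignoring long straight lines.

```java
public class ContrastSketch {
    /**
     * Returns a contrast map for a greyscale image given as gray[y][x] (0..255).
     * Assumption: contrast at a pixel is the larger of the absolute differences
     * to its right-hand and lower neighbours. The library's own measure may differ.
     */
    public static int[][] contrast(int[][] gray) {
        int height = gray.length;
        int width = (height == 0) ? 0 : gray[0].length;
        int[][] out = new int[height][width];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                // Pixels on the right or bottom border have no neighbour there.
                int dx = (x + 1 < width) ? Math.abs(gray[y][x] - gray[y][x + 1]) : 0;
                int dy = (y + 1 < height) ? Math.abs(gray[y][x] - gray[y + 1][x]) : 0;
                out[y][x] = Math.max(dx, dy);
            }
        }
        return out;
    }
}
```

A flat region produces zero contrast everywhere, while the edge of a letter against its background produces a large value, which is why text stands out in the contrast image.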

Next, imagine drawing two graphs. On the first, we'll plot x against the sum of the contrast in column x. On the second, we'll plot y against the sum of the contrast in row y. We can use these two graphs to draw some boxes around areas that are likely to contain text. Every time the contrast rises sharply with a change in x, we mark that position as the left-hand side of a box. When the contrast falls, we mark the position where it falls as the right-hand side of a box. We do the same for y to get the tops and bottoms of the boxes.
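
The two graphs and the rise/fall rule can be sketched roughly as follows. This is an illustration only, not the library's code: the class ProfileSketch, its method names, and the single jump threshold used to decide what counts as a "sharp" rise or fall are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ProfileSketch {
    /** First graph: profile[x] is the sum of contrast[y][x] over all rows y. */
    public static long[] columnSums(int[][] contrast) {
        int width = (contrast.length == 0) ? 0 : contrast[0].length;
        long[] sums = new long[width];
        for (int[] row : contrast) {
            for (int x = 0; x < width; x++) {
                sums[x] += row[x];
            }
        }
        return sums;
    }

    /** Second graph: profile[y] is the sum of contrast[y][x] over all columns x. */
    public static long[] rowSums(int[][] contrast) {
        long[] sums = new long[contrast.length];
        for (int y = 0; y < contrast.length; y++) {
            for (int value : contrast[y]) {
                sums[y] += value;
            }
        }
        return sums;
    }

    /**
     * Returns {start, end} pairs (end exclusive). A span opens where the profile
     * rises by at least `jump` from one position to the next and closes where it
     * falls by at least `jump`. The threshold `jump` is a hypothetical parameter.
     */
    public static List<int[]> spans(long[] profile, long jump) {
        List<int[]> result = new ArrayList<>();
        int start = -1;  // -1 means no span is currently open
        long prev = 0;   // treat the position before the image as zero contrast
        for (int i = 0; i < profile.length; i++) {
            long delta = profile[i] - prev;
            if (start < 0 && delta >= jump) {
                start = i;                          // sharp rise: left/top edge
            } else if (start >= 0 && -delta >= jump) {
                result.add(new int[] {start, i});   // sharp fall: right/bottom edge
                start = -1;
            }
            prev = profile[i];
        }
        if (start >= 0) {
            result.add(new int[] {start, profile.length}); // close a span left open
        }
        return result;
    }
}
```

Calling spans on the column sums gives candidate left and right edges, and calling it on the row sums gives candidate tops and bottoms. Pairing every horizontal span with every vertical span is one simple way to get the initial set of boxes.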

We'll now have a pretty large set of boxes, and we need to reduce it to give our final result. The algorithm has several rules for merging, shrinking, and deleting boxes, using measures such as aspect ratio, mass (the amount of contrast enclosed by the box), and density (mass/area).
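
As a rough illustration of these measures, the sketch below computes mass, density, and aspect ratio for a box and applies one simple deletion rule. The Box class, the filter method, and both thresholds are hypothetical. The library's actual rules (including merging and shrinking) are more involved and are not reproduced here.

```java
import java.util.ArrayList;
import java.util.List;

public class BoxFilterSketch {
    /** Axis-aligned candidate box; right and bottom are exclusive. */
    public static final class Box {
        public final int left, top, right, bottom;

        public Box(int left, int top, int right, int bottom) {
            this.left = left;
            this.top = top;
            this.right = right;
            this.bottom = bottom;
        }

        public int width() { return right - left; }
        public int height() { return bottom - top; }
        public int area() { return width() * height(); }
        /** Width divided by height; lines of text tend to be wider than tall. */
        public double aspectRatio() { return (double) width() / height(); }
    }

    /** Mass: the total contrast enclosed by the box (clamped to the image). */
    public static long mass(int[][] contrast, Box b) {
        long total = 0;
        for (int y = Math.max(0, b.top); y < Math.min(contrast.length, b.bottom); y++) {
            for (int x = Math.max(0, b.left); x < Math.min(contrast[y].length, b.right); x++) {
                total += contrast[y][x];
            }
        }
        return total;
    }

    /** Density: mass divided by area. */
    public static double density(int[][] contrast, Box b) {
        return (double) mass(contrast, b) / b.area();
    }

    /**
     * Example deletion rule (an assumption, not the library's): keep only boxes
     * with positive size, density >= minDensity and aspect ratio >= minAspect.
     */
    public static List<Box> filter(int[][] contrast, List<Box> boxes,
                                   double minDensity, double minAspect) {
        List<Box> kept = new ArrayList<>();
        for (Box b : boxes) {
            if (b.width() <= 0 || b.height() <= 0) {
                continue; // degenerate box, nothing to measure
            }
            if (density(contrast, b) >= minDensity && b.aspectRatio() >= minAspect) {
                kept.add(b);
            }
        }
        return kept;
    }
}
```

A box drawn tightly around a block of text has a high density, whereas a box that also takes in a lot of empty background has the same mass spread over a larger area, so a density threshold is a natural way to discard loose boxes.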

The library is licensed under the GPL, and you can download it here. You can compile the program by typing:

javac GetImageText.java

To run the program, type:

java GetImageText inputfile outputfile

where inputfile and outputfile are the (.jpg) input and output files.

Thanks very much to Daniel Morrione for encouraging me to release this software.