Handwriting recognition with Mathematica

Mathematica Asked on February 7, 2021

I am trying to teach Mathematica to read my handwriting. Since I don’t want to ruin my reputation by showing my own handwriting, I am going to use a font called blackjack.

Let's say this is a paragraph I have written.

para = StringTake[ExampleData[{"Text", "OriginOfSpecies"}, "FormattedText"], 401]

INTRODUCTION.

When on board H.M.S. ‘Beagle,’ as naturalist, I was much struck with
certain facts in the distribution of the inhabitants of South
America, and in the geological relations of the present to the past
inhabitants of that continent. These facts seemed to me to throw some
light on the origin of species–that mystery of mysteries, as it has
been called by one of our greatest philosophers.

First I am going to recognise a single letter, say $h$. (At some point in the distant future I am thinking about keeping a sample file of actual handwritten letters.)

font = "blackjack";
text = Binarize@Rasterize@Style[para, Bold, 30, FontFamily -> font];
w = Binarize@Rasterize@Style["h", Bold, 30, FontFamily -> font];

x = ImageCorrelate[text, w, NormalizedSquaredEuclideanDistance];
w1 = ColorNegate[Binarize[x, 0.12]];
loc = ComponentMeasurements[w1, {"Centroid", "EquivalentDiskRadius"}];
pos = loc[[All, 2, 1]];
Length[pos]
Show[text, Graphics[{Opacity[0.5], Red, Disk[#, 10] & /@ pos}], ImageSize -> 500]

Then I iterate over all letters, punctuation marks and digits.

alph = Join[ToUpperCase[#] & /@ Alphabet[], Alphabet[],
       {".", ",", ";", ":", "-", "?"}, {"1", "2", "3", "4", "5", "6", "7", "8", "9", "0"}]

wlist = {};
Do[
  w = Binarize@Rasterize@Style[abc, Bold, 30, FontFamily -> font];
  x = ImageCorrelate[text, w, NormalizedSquaredEuclideanDistance];
  w1 = ColorNegate[Binarize[x, 0.11]];
  loc = ComponentMeasurements[w1, {"Centroid", "EquivalentDiskRadius"}];
  pos = loc[[All, 2, 1]];
  If[Length[pos] > 0, AppendTo[wlist, {abc, pos}]],
{abc, alph}]

Then I convert it to a machine-readable font and use TextRecognize.

newtext = Graphics[Block[{w = #[[1]], pos = #[[2]]}, 
 Text[Style[w, 18], #] & /@ pos] & /@ wlist, ImageSize -> 700]

TextRecognize[newtext]

INTRQDUCTIQN

When on board H M S ‘B glef 5 naiurallsl, I W5 mu h stru k
wlth ceftaln fa ts ln the d|str|but|0n of the lnhabltants of South
Amenca, and |n the geologncat r Iahons of the prsem lo the
pal lnhdiltants of thal tnnllnem ThSe fa ts seemed to me
to throw some ||ght on the orlgln of pec|s– that myslety of
mysletls, as || ha been called by ore ofour grealsl ph||os phefs

Now the question – How to improve this?

The major challenge is to identify all the letters. Some letters are missing, and in some cases $c$ looks like $e$, etc. I was thinking about using different font families and creating a list with Classify for better comparison, but I am still not sure how good that would be.

The last part concerning TextRecognize can probably be improved by rearranging the positions of individual letters to avoid any overlap.

2 Answers

I think writing code in Mathematica to read a font that has only one image per character is a very different task than teaching Mathematica to recognize your handwriting. You already have the training data in the script font you used, so you can achieve 100% accuracy without any machine learning. You need to pick one raster size for your TTF font and stick to it; otherwise there are multiple raster sizes, and therefore multiple possible matches, for each test case.

You can also solve the text alignment problem by carefully selecting the origin of each character in the data set, or by using perfectly aligned text to train a machine learning model. The idea is that you create text in a normal font with perfect alignment to teach the computer how to generate perfectly aligned text. You might not even need that if you find a simpler way to snap the detected object coordinates in code. This obviously requires weights and lines that reward properly aligned text. You need to define the vertical spacing between lines of output text, and also some ratio of character height to line spacing. You should record the character height to use as an input to this alignment correction function.

You also have multiple characters being superimposed, and there is no code to detect and suppress those kinds of errors. You need to work out how to always emit exactly one output character per detected character.
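The two points above (one output character per detected spot, and snapping detections onto text lines) could be sketched roughly as follows. This is only an illustration under stated assumptions: `cands`, `minSep`, `lineGap` and `readLine` are hypothetical names, the candidate list is made-up data, and it assumes each detection also carries its distance score (lower is better), which the question's loop does not currently record.

```mathematica
(* Hypothetical candidates: {{x, y}, character, distance}; lower distance = better match *)
cands = {
   {{12., 80.}, "h", 0.02}, {{13., 80.}, "b", 0.09}, (* two glyphs claiming one spot *)
   {{25., 79.}, "i", 0.03},
   {{12., 40.}, "o", 0.04}, {{40., 41.}, "h", 0.01}};

minSep = 8;   (* assumed: centroids closer than this belong to the same glyph *)
lineGap = 15; (* assumed: a larger vertical jump starts a new text line *)

(* Best matches first; DeleteDuplicates keeps the first element of each clashing pair *)
kept = DeleteDuplicates[SortBy[cands, Last],
   EuclideanDistance[First[#1], First[#2]] < minSep &];

(* Image y grows upwards, so sort by descending y and split on vertical jumps *)
lines = Split[SortBy[kept, -#[[1, 2]] &],
   Abs[#1[[1, 2]] - #2[[1, 2]]] < lineGap &];

(* Within each line, read the characters from left to right *)
readLine[line_] := StringJoin[SortBy[line, #[[1, 1]] &][[All, 2]]];

readLine /@ lines
```

With the made-up data this yields `{"hi", "oh"}`: the weaker "b" candidate is dropped in favour of the "h" at the same spot. Word spaces are not handled here; they would need an extra threshold on the horizontal gap relative to the glyph width.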

Eventually you get to doing your own handwriting, but you realize that you could be using TensorFlow instead. You need more layers and better data sets, and more of the errors need to be worked out. That means you need to carefully design the data sets around what you think the computer will need to do the job. It might not be words or sentences; it would probably be individual characters with a plain line leading in and out. Then you would need to tag all that data and put boxes around it. You also need to score the data manually if you are using words and sentences to test the model. You eventually realize you need a generator, because you will never have enough raw tagged data. You need machine learning to recognize statistical patterns over infinite unknown data sets.

ImageCorrelate is a blunt tool that is not tailored to the simple case you posted here. You would be better off using a one-channel XOR on binary data followed by a summation, as a basic pseudo machine learning one-layer discriminator. This simple solution only works in a controlled test with a perfect dictionary: one raster per search term and one search term per raster. That is 2*26 if you want upper and lower case, and about 70 characters if you really cover everything.

You should be able to do some quick tests that look for the worst case, where a training letter contains, as a subset, pixels that exactly match some other character. If t contains l, then every t matches both t and l. But since l does not contain t, you must hard-code the rule: report l if and only if l matches and t does not. This might not be a problem if everything is part of a controlled experiment with exactly matching master rasters, so the subset problem may not even exist. It depends on how the font was constructed and on the quality of the dictionary.
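A minimal sketch of the XOR-and-sum idea and of the subset check, assuming all glyphs are stored as equally sized 0/1 matrices (1 = ink). The tiny bitmaps and the names `glyphs`, `mismatch`, `nearestGlyph`, `subsetQ` and `clashes` are hypothetical; real templates would come from something like `ImageData[ColorNegate[Binarize[...]]]` at one fixed raster size.

```mathematica
(* Hypothetical 5x3 glyph bitmaps, 1 = ink *)
glyphs = <|
   "l" -> {{0, 1, 0}, {0, 1, 0}, {0, 1, 0}, {0, 1, 0}, {0, 1, 0}},
   "t" -> {{0, 1, 0}, {1, 1, 1}, {0, 1, 0}, {0, 1, 0}, {0, 1, 0}}|>;

(* XOR then sum: the number of pixels where two bitmaps disagree *)
mismatch[a_, b_] := Total[BitXor[a, b], 2];

(* Pick the dictionary entry with the fewest disagreeing pixels *)
nearestGlyph[patch_] :=
  First[MinimalBy[Keys[glyphs], mismatch[patch, glyphs[#]] &]];

(* a is contained in b when every ink pixel of a is also ink in b *)
subsetQ[a_, b_] := BitAnd[a, b] === a;

(* Pairs {small, large} where the first glyph's ink lies entirely inside the second's *)
clashes = Select[Tuples[Keys[glyphs], 2],
   #[[1]] =!= #[[2]] && subsetQ[glyphs[#[[1]]], glyphs[#[[2]]]] &];

nearestGlyph[glyphs["t"]]
clashes
```

Here `clashes` flags the pair `{"l", "t"}`. Note that a full-patch XOR already separates the two, since the crossbar of the t counts as two mismatching pixels against l; the hard-coded "l and not t" rule would only be needed for one-sided matching that ignores extra ink.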

Answered by acacia on February 7, 2021


This topic got some updates since 2016 :-) I will give a short review of resources.


Top recent: Wolfram Technology Conference (WTC) 2020

At the Wolfram Technology Conference (WTC) 2020 (currently in progress), Mikayel Egibyan from the Wolfram image processing team just gave a talk on exactly this topic, with approaches based on modern machine learning and neural networks. The video will be available later on the WTC site, but here is the talk presentation on Wolfram Community:

Handwritten Recognition and Analysis

https://community.wolfram.com/groups/-/m/t/2091085

The talk covered the following:

  • History of OCR and Analysis
  • Text Recognition Techniques
  • Existing Architectures for Handwritten Text Recognition
  • Techniques to Analyze Handwritten Text
  • The Importance of the Loss Function
  • Building and Training a Toy Network
  • Applications: Handwritten Recognition
  • Applications: Handwritten Verification
  • Applications: Handwritten Identification

William J Turkel FREE Book: Digital Research Methods with Mathematica

William J Turkel, in his FREE book "Digital Research Methods with Mathematica" (notebooks and screencasts), discusses some starter topics for handwriting recognition very nicely in Lesson 21, Section 3.

(BTW William also gave a cool talk at WTC 2020 "Text and Image Mining for Historical Research")

Other relevant resources and experiments:

Wolfram Neural Net Repository

I recommend checking from time to time with Wolfram Neural Net Repository

https://resources.wolframcloud.com/NeuralNetRepository

First of all, a net of direct relevance may appear there, as new nets are added constantly. It is also a good source of available net architectures that you can explore and modify. For instance, both basic nets for handwritten digits, LeNet and CapsNet, are there, along with many others.
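As a rough illustration of trying one of these nets, the following sketch loads the repository's LeNet model and applies it to a rasterized digit. The test glyph is an assumption made for this example only; the net was trained on MNIST handwriting, so its answer on a rendered font glyph may well be wrong.

```mathematica
(* Pretrained digit classifier; downloaded from the repository on first use *)
net = NetModel["LeNet Trained on MNIST Data"];

(* MNIST digits are light strokes on a dark background, so invert the rasterized glyph *)
sample = ColorNegate[Rasterize[Style["7", Bold, 40]]];

net[sample]                          (* most likely digit class *)
net[sample, {"TopProbabilities", 3}] (* three most likely classes with probabilities *)
```

The net's built-in encoder resizes and converts the input itself, so no manual 28x28 preprocessing is shown; for real handwriting, cropping each character to a roughly square patch first would probably help.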

Answered by Vitaliy Kaurov on February 7, 2021
