Mathematica Asked on July 5, 2021
I have a list of rules that has as the left hand side the font size and a letter and for the right hand side the space in pixels between this letter and the following letter. The data is from a postscript file. Two examples.
{8.5, "S"} -> 5.5165
{8.5, "251"} -> 2.363
The second example uses a character code to represent the actual character.
Here is the complete list of rules. Apologize for the length but I don’t know what parts are and are not important.
letterRules={{8.5, "S"} -> 5.5165, {8.5, "P"} -> 5.6695, {8.5, "E"} ->
5.508, {8.5, "C"} -> 6.2985, {8.5, "I"} -> 2.5075, {8.5, "A"} ->
5.8225, {8.5, "D"} -> 6.2988, {8.5, "O"} -> 6.613, {8.5, "U"} ->
6.298, {8.5, "B"} -> 5.984, {8.5, "L"} -> 5.041, {8.5, "E"} ->
5.508, {8.5, "S"} -> 5.5165, {8.5, "I"} -> 2.5075, {8.5, "M"} ->
7.7095, {8.5, "P"} -> 5.6695, {8.5, "L"} -> 5.0405, {8.5, "O"} ->
6.6131, {8.5, "V"} -> 5.355, {8.5, "E"} -> 5.508, {8.5, "R"} ->
6.137, {8.5, "C"} -> 6.298, {8.5, "A"} -> 5.823, {8.5, "L"} ->
5.04, {8.5, "J"} -> 4.726, {8.5, "U"} -> 6.2985, {8.5, "M"} ->
7.7095, {8.5, "O"} -> 6.613, {8.5, "V"} -> 5.3546, {8.5, "E"} ->
5.508, {8.5, "R"} -> 6.137, {8.5, "C"} -> 6.299, {8.5, "A"} ->
5.822, {8.5, "L"} -> 5.041, {8.5, "O"} -> 6.613, {8.5, "P"} ->
5.6695, {8.5, "E"} -> 5.508, {8.5, "N"} -> 6.2985, {8.5, "I"} ->
2.5075, {8.5, "N"} -> 6.2985, {8.5, "P"} -> 5.6697, {8.5, "R"} ->
6.137, {8.5, "E"} -> 5.508, {8.5, "E"} -> 5.508, {8.5, "M"} ->
7.709, {8.5, "P"} -> 5.67, {8.5, "T"} -> 5.193, {8.5, "N"} ->
6.299, {8.5, "O"} -> 6.613, {8.5, "T"} -> 5.193, {8.5, "R"} ->
6.137, {8.5, "U"} -> 6.299, {8.5, "M"} -> 7.709, {8.5, "O"} ->
6.613, {8.5, "V"} -> 5.355, {8.5, "E"} -> 5.508, {8.5, "R"} ->
6.137, {8.5, "C"} -> 6.298, {8.5, "A"} -> 5.823, {8.5, "L"} ->
5.04, {8.5, "L"} -> 5.041, {8.5, "D"} -> 6.299, {8.5, "E"} ->
5.508, {8.5, "F"} -> 5.04, {8.5, "E"} -> 5.508, {8.5, "N"} ->
6.299, {8.5, "S"} -> 5.516, {8.5, "V"} -> 5.355, {8.5, "N"} ->
6.298, {8.5, "O"} -> 6.613, {8.5, "T"} -> 5.194, {8.5, "R"} ->
6.137, {8.5, "U"} -> 6.298, {8.5, "M"} -> 7.71, {8.5, "O"} ->
6.613, {8.5, "V"} -> 5.355, {8.5, "E"} -> 5.508, {8.5, "O"} ->
6.613, {8.5, "P"} -> 5.669, {8.5, "P"} -> 5.67, {8.5, "251"} ->
2.363, {8.5, "T"} -> 5.194, {8.5, "/"} -> 3.153, {8.5, "D"} ->
6.299, {8.5, "O"} -> 6.613, {8.5, "U"} -> 6.298, {8.5, "B"} ->
5.984, {8.5, "L"} -> 5.041, {8.5, "D"} -> 6.2985, {8.5, "I"} ->
2.5075, {8.5, "R"} -> 6.137, {8.5, "E"} -> 5.508, {8.5, "C"} ->
6.2985, {8.5, "C"} -> 6.2988, {8.5, "U"} -> 6.298, {8.5, "E"} ->
5.508, {8.5, "B"} -> 5.984, {8.5, "I"} -> 2.508, {8.5, "S"} ->
5.4315, {8.5, "L"} -> 4.9555, {8.5, "A"} -> 5.7375, {8.5, "C"} ->
6.2135, {8.5, "O"} -> 6.528, {8.5, "N"} -> 6.2135, {8.5, "V"} ->
5.27, {8.5, "E"} -> 5.423, {8.5, "N"} -> 6.2135, {8.5, "T"} ->
5.1085, {8.5, "I"} -> 2.4224, {8.5, "O"} -> 6.528, {8.5, "N"} ->
6.214, {8.5, "L"} -> 5.542, {8.5, "E"} -> 6.0605, {8.5, "A"} ->
6.409, {8.5, "D"} -> 6.9275, {9, "P"} -> 6.003, {9, "r"} ->
3.501, {9, "i"} -> 2.322, {9, "m"} -> 8.154, {9, "a"} ->
5.166, {9, "r"} -> 3.501, {9, "s"} -> 4.833, {9, "i"} ->
2.322, {9, "g"} -> 5.499, {9, "n"} -> 5.337, {9, "a"} ->
5.166, {9, "t"} -> 3.168, {9, "p"} -> 5.499, {9, "a"} ->
5.166, {9, "r"} -> 3.501, {9, "t"} -> 3.168, {9, "n"} ->
5.337, {9, "e"} -> 5.166, {9, "r"} -> 3.501, {9, "251"} ->
2.502, {9, "l"} -> 2.322, {9, "e"} -> 5.166, {9, "a"} ->
5.166, {9, "d"} -> 5.499, {8.5, "D"} -> 6.801, {8.5, "E"} ->
5.949, {8.5, "F"} -> 5.441, {8.5, "E"} -> 5.949, {8.5, "N"} ->
6.801, {8.5, "S"} -> 5.958, {8.5, "I"} -> 2.711, {8.5, "V"} ->
5.78, {8.5, "C"} -> 6.8, {8.5, "A"} -> 6.291, {8.5, "R"} ->
6.63, {8.5, "D"} -> 6.8, {8.5, "I"} -> 2.711, {8.5, "N"} -> 6.8}
I use both Classify
and Predict
with the training set.
letterClassify = Classify[letterRules, Method -> "NaiveBayes"];
letterPredict = Predict[letterRules, Method -> "NeuralNetwork"];
In general I am happy with the results. I created tables to show the results for a rule, for example rule #139 has a 9 pt font size, the letter d and a spacing of 5.499 pixels
The letter N with a font size of 8.5 occurs in a number of rules from 38 up to 153, various spacings with the majority around 6.3 pixels.
The columns are the rule number, the rule, the Classify
result and the Predict
result. Both Classify
and Predict
give reasonable results for the rule with one occurrence, d, as well as the rule with multiple occurrences, N.
However Predict
gives an awful result for this single occurrence rule with a slash, /.
and in the following example there are two single occurrences (because the font size differs) but Classify
fails to treat them separately whereas Predict
honors the font size.
Finally in this example there are many samples close to 5.8 but Classify
picks the largest one which has only one occurrence.
I hope someone can help me to understand why I get mostly reasonable results but apparent failure in the above four cases.
Get help from others!
Recent Answers
Recent Questions
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP