Using string distance to compare sketches

by Ugo

post-it with the map

There is an article called Trainable Sketch Recognizer for Graphical User Interface Design, by A. Coyette and others, showing an approach to recognizing pen-made sketches based on the Levenshtein distance algorithm for string comparison.

The article talks about recognizing elements of a user interface, such as buttons, combo boxes and windows, when they are sketched by a designer. They do that by using Levenshtein's string distance algorithm. That's right: an algorithm created for string comparison, to check how close two strings are in terms of character edits. You know, when there is a typo in your Google search like "algoritm" and it says "did you mean algorithm?" — that's a string distance algorithm at work. It is really simple, and I was quite amazed at how well it worked for sketches.
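For reference, here is a minimal Python sketch of the distance itself — the standard dynamic-programming formulation, not the authors' code:

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))          # distances for the empty prefix of a
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution (free on match)
        prev = curr
    return prev[-1]

print(levenshtein("algoritm", "algorithm"))  # → 1 (one inserted 'h')
```

The smaller the number, the closer the two strings — that single idea is all the recognizer needs.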

So, what did the authors of the article do to transform drawings into words? First you assign a number to each cardinal point, to compose your words. Let's say 1 for north, 2 for northeast, 3 for east, and so on. Then take the (x, y) points of the sketch and compare each point to the previous one: if the point is north of the previous one, then it's a gesture going up, and the character for north, 1, is added to the word.

How do you know which cardinal point each (x, y) pair is at, relative to the previous one? It's easy — take a look at the post-it at the beginning of this post (it's actually a post-it; it's on my wall). Take two points A and B: if (B.x – A.x) is positive and (B.y – A.y) is zero, then B is east of A. If they are negative and positive respectively, then B is southwest of A.

So a square would be something like: 3333333355555557777777111111111
A triangle would be like: 45454544537777777782232312
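That encoding can be sketched in a few lines of Python. This is my own illustration, not the paper's code: the names are mine, and it assumes screen coordinates (y grows downward, which is why a positive Δy means the pen moved south, matching the post-it's rule that positive Δy puts B toward the south):

```python
def direction_char(a, b):
    """Map the move from point a to point b onto one of eight compass
    characters: 1=N, 2=NE, 3=E, 4=SE, 5=S, 6=SW, 7=W, 8=NW.
    Screen coordinates assumed: y grows downward, so dy > 0 means south."""
    dx = (b[0] > a[0]) - (b[0] < a[0])  # sign of the x displacement
    dy = (b[1] > a[1]) - (b[1] < a[1])  # sign of the y displacement
    return {(0, -1): "1", (1, -1): "2", (1, 0): "3", (1, 1): "4",
            (0, 1): "5", (-1, 1): "6", (-1, 0): "7", (-1, -1): "8"}.get((dx, dy), "")

def sketch_to_word(points):
    """Encode a stroke (a list of (x, y) points) as a direction word."""
    return "".join(direction_char(a, b) for a, b in zip(points, points[1:]))

# A tiny clockwise square: east, south, west, north
print(sketch_to_word([(0, 0), (1, 0), (2, 0), (2, 1), (2, 2),
                      (1, 2), (0, 2), (0, 1), (0, 0)]))  # → 33557711
```

With more sample points, the word simply gets longer, like the square and triangle words above.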

But people draw things differently, you may start a square by moving your pen South, instead of East, for example.
Since the algorithm is so fast, you can compare your sketch against several samples of squares — several words — or, even better, let users tell your application what they meant by what they just drew:
"this is a square, learn the way I do it"
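To make that concrete, here is a hypothetical nearest-template lookup — my own sketch, not code from the paper or the app — using the square and triangle words above as the sample set (a real application would keep several samples per shape, including ones taught by the user):

```python
def classify(word, templates):
    """Return the label of the template word nearest to the sketch word,
    using the standard dynamic-programming Levenshtein distance."""
    def dist(a, b):
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                curr.append(min(prev[j] + 1,                 # deletion
                                curr[j - 1] + 1,             # insertion
                                prev[j - 1] + (ca != cb)))   # substitution
            prev = curr
        return prev[-1]
    return min(templates, key=lambda label: dist(word, templates[label]))

# The sample words from this post
templates = {
    "square":   "3333333355555557777777111111111",
    "triangle": "45454544537777777782232312",
}

# A shorter, sloppier square still lands closer to the square template
print(classify("333355557777111", templates))  # → square
```

Adding a new user-taught shape is just one more entry in the dictionary — that is the whole "learn the way I do it" mechanism.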

It's a powerful tool which, combined with other algorithms such as corner finding, could give a fingerprint of the user's sketch.

This is implemented as the single algorithm responsible for recognition in the Sketch Shapes application. There is more about this on the way: as the project matures, new algorithms will take its place, but I think it couldn't be any simpler than Levenshtein's.

Kudos to Coyette and the team.