Anagram Solver

Posted on July 16, 2009

0


Abstract:
Anagram is a word game in which a bunch of characters from a dictionary word are re-arranged. The task of the player is to arrange these given characters such that, they form a meaningful word. In this work, an automated program to do the work of the player has been developed.

Downloads: 
Online Demo HERE
Source Code (C Programming Language) HERE

Content:
During the time when I was in my 1st year at LNMIIT (late 2008), people (which also includes me) were hooked onto Orkut. It used to be the talk of the day. All this started with an Orkut application which was actually an multi-player game of anagram. In that application, a jumbled set of characters were shown and who so ever user tells the answer first to this anagram gets the point. I had started playing this game for a while and found out, people used to reply within less than 5 sec. And this happened even for words as long as 12-15 characters. By now, I had realized that, these guys must be surely using some kind of computer algorithm to lookup this sequence in a dictionary. This is how I decided to built my own version of an anagram solver.

The first task for this was to find a dictionary file. Lucky, this was quite straight forward. There is a unix command-line utility (also on linux) called  “look”. This program looks up if the given word is in the dictionary or on (as it is). It was pretty straight to get their dictionary file.

The first approach, I could think of was to make all the n! (n factorial) permutations of the input word and lookup them all in the dictionary. Ofcourse this was not practical, since it would take up too too too much long time even for size of 6 or 7. So, this approach was crossed out.

Next, I could think of was to divide the dictionary into multiple files based on the lengths of the words. For example words with 4 letters all go in one file, words with 6 letters go in another and so on. And the lookup was done on the basics of character frequency count. What I mean to say is, two letters were considered to be same, if they contain same characters (irrespective of their order). So, the basic program flow was, to iterate into the file of the length of input word. And compare each word in it as mentioned earlier.

Advertisements
Posted in: Softwares