| leonardo ( @ 2007-03-05 22:45:00 |
| Entry tags: | computer science, python, riddle |
Add-A-Gram
This is the solution of a cute programming riddle:
http://www.itasoftware.com/careers/puzz
An "add-a-gram" is a sequence of words formed by starting with a 3-letter word, adding a letter and rearranging to form a 4-letter word, and so on. For example, here are add-a-grams of the word "CREDENTIALS":
ail + s =
sail + n =
nails + e =
aliens + t =
salient + r =
entrails + c =
clarinets + e =
interlaces + d =
CREDENTIALS (length 11)
Given the dictionary found here (1.6 MB), what is the longest add-a-gram?
If the link of that dictionary doesn't work you can also use another dictionary, like this smaller local one:
http://www.fantascienza.net/leonard
My Python solution:
http://www.fantascienza.net/leonard
The zip file contains:
- add_a_gram.py: the program that solves the problem, with comments, doctests, docstrings, etc. This is the reference implementation, use this to understand the algorithm, etc.
- add_a_gram_short.py: a compact version of the program, without comments, just to show how much short a Python solution can be.
- add_a_gram_SS.py: the solving program adapted and simplified for ShedSkin (http://shedskin.sourceforge.net ), with little comments. This is not the official implementation of this program.
To see how the algorithm works you can see the commented Python code (add_a_gram.py), hopefully it is readable enough. Here is a very approximative explanation. We take all the cleaned words of the file, and we create a dictionary similar to the one used to find anagrams, but for each different signature we keep only the last seen word. Then for every word in the uniquely signed words starting from the longer one (to be sure to find the best solution) we look at all the signatures with a len-1 length, keeping the first one that is present into the dictionary. We stop if we reaach a len=3 word. We backtrack when we can't find a way to reach a len=3 word.
To speed up the program some you can remove the exception management, and use a *fixed len* list instead of a deque, with a pos index that points the current element. I have used a deque because it's a bit simpler (no need of the index, you just do append() and pop()) and this Python programs runs in few seconds on my very old PC, so speed optmizations aren't necessary.
Solution with WORD.LST:
underestimations
underestimation
determinations
antimodernist
terminations
nitrosamine
antinomies
nominates
mentions
tension
nitons
niton
into
ton
Italian solution with my lemmario_italiano.txt (not included):
deresponsabilizzarono
deresponsabilizzaron
deresponsabilizzano
responsabilizzando
responsabilizzano
responsabilizzan
responsabilizza
spersonalizzai
spersonalizza
polarizzasse
piazzassero
piazzasser
spezzarsi
sprezzai
sprizza
spazzi
pizza
aziz
zia