| leonardo ( @ 2008-06-07 23:23:00 |
Google Treasure Hunt p.4
I have tried to solve the "Google Treasure Hunt", part 4, my starting point was this page:
http://www.catonmat.net/blog/solving-go ogle-treasure-hunt-prime-number-problem-f our/
The problem:
Find the smallest number that can be expressed as
the sum of 7 consecutive prime numbers,
the sum of 17 consecutive prime numbers,
the sum of 41 consecutive prime numbers,
the sum of 541 consecutive prime numbers,
and is itself a prime number.
For example, 41 is the smallest prime number that can be expressed as
the sum of 3 consecutive primes (11 + 13 + 17 = 41) and
the sum of 6 consecutive primes (2 + 3 + 5 + 7 + 11 + 13 = 41).
I have tried to solve the problem with Python, in my first try I have tried to keep all data in memory (2 GB RAM, on a Dual Core 2 GHz), failing.
Later I have found a way to reduce memory used, making xsliding_window_sum yield only prime values (using a set to keep first 15 million prime numbers). This Python program (not included here) gives the correct result in about 1 minute.
I have then implemented the same algoritm in D language, it needs about 7.7 seconds to run. I have not used my sets in D because D AAs are much much slower than python sets, so I have had to keep the sorted array of primes, looking for them with a binary search, and using an array intersection based on a modified merge sort.
But a search engine like Google must intersect its indexes in a lazy way, because users generally only want the first search results, so I have created a lazy intersection in Python, the resulting Python code (included) gives the result in 10.35 seconds.
Most of the running time of this Python version comes from the loading of prime numbers from disk (generated with my D code, it needs only 0.78 seconds to generate them). So I have created one more version, that uses Pyd (http://pyd.dsource.org/ ) to bridge Python and D, D is used to generate the array of primes (generation plus conversion to Python list requires about 1.5 seconds), followed by the lazy intersection (that with Psyco needs only 0.4 seconds). This hybrid program needs only 2.18 seconds to find the solution.
All the code:
http://www.fantascienza.net/leonard o/js/google4.zip
I have tried to solve the "Google Treasure Hunt", part 4, my starting point was this page:
http://www.catonmat.net/blog/solving-go
The problem:
Find the smallest number that can be expressed as
the sum of 7 consecutive prime numbers,
the sum of 17 consecutive prime numbers,
the sum of 41 consecutive prime numbers,
the sum of 541 consecutive prime numbers,
and is itself a prime number.
For example, 41 is the smallest prime number that can be expressed as
the sum of 3 consecutive primes (11 + 13 + 17 = 41) and
the sum of 6 consecutive primes (2 + 3 + 5 + 7 + 11 + 13 = 41).
I have tried to solve the problem with Python, in my first try I have tried to keep all data in memory (2 GB RAM, on a Dual Core 2 GHz), failing.
Later I have found a way to reduce memory used, making xsliding_window_sum yield only prime values (using a set to keep first 15 million prime numbers). This Python program (not included here) gives the correct result in about 1 minute.
I have then implemented the same algoritm in D language, it needs about 7.7 seconds to run. I have not used my sets in D because D AAs are much much slower than python sets, so I have had to keep the sorted array of primes, looking for them with a binary search, and using an array intersection based on a modified merge sort.
But a search engine like Google must intersect its indexes in a lazy way, because users generally only want the first search results, so I have created a lazy intersection in Python, the resulting Python code (included) gives the result in 10.35 seconds.
Most of the running time of this Python version comes from the loading of prime numbers from disk (generated with my D code, it needs only 0.78 seconds to generate them). So I have created one more version, that uses Pyd (http://pyd.dsource.org/ ) to bridge Python and D, D is used to generate the array of primes (generation plus conversion to Python list requires about 1.5 seconds), followed by the lazy intersection (that with Psyco needs only 0.4 seconds). This hybrid program needs only 2.18 seconds to find the solution.
The timings: Pure Python (google4_pure.py): 10.35 s D (google4.exe): 7.71 s D + Python (google4_hybrid.py): 2.18 s
All the code:
http://www.fantascienza.net/leonard