leonardo
View:Recent Entries.
View:Archive.
View:Friends.
View:User Info.
View:Website (My Website).
You're looking at the latest 10 entries.
Missed some entries? Then simply jump back 10 entries

Tags:
Subject:Links + D libs updates
Time:11:21 pm
Probably now I'll be able to update the blog more often.

Announce of Spyke python-to-C compiler:
http://mail.python.org/pipermail/python-dev/2008-April/078469.html


Computer Graphics and Geometric Ornamental Design, Craig S. Kaplan. PhD thesis, 2002, related to metamagical themas by Douglas Hofstadter and many works by M.C. Escher:
http://www.cgl.uwaterloo.ca/~csk/phd/


Scientific Computing with Python, video courses/talks:
http://www.nanohub.org/resources/99/
There are quite good PDFs too, "Introduction to Scientific Computing with Python" that explains lot of good stuff:
http://www.nanohub.org/site/archive/2004.10.24-Python_talk1.pdf
http://www.nanohub.org/site/archive/2004.10.24-Python_talk2.pdf


Computational Photography - G22.3033-006 7, documents and articles from the course, really good stuff:
http://www.cs.nyu.edu/~fergus/teaching/comp_photo/index.html


"What To Know Before Debating Type Systems", a simple basic reading on this topic:
http://cdsmith.twu.net/types.html


"Gin Television and Social Surplus", a good article:
http://www.shirky.com/herecomeseverybody/2008/04/looking-for-the-mouse.html


Everyone is talking about this, a small JavaScript program to run Processing code:
http://ejohn.org/blog/processingjs/

"The Mysterious Memristor", by Sally Adee, that explains in a simple way this small revolution in electronics:
http://www.spectrum.ieee.org/print/6207

-------------------

Updates regarding my D libs:
http://www.fantascienza.net/leonardo/so/libs_d.zip

- Added: IsClass, boxArrayCmp, AttrExtract, XattrExtract, minMax, xpairs
- removed StructArr
- splitted module func into templates and func modules.
- Globally cleaned and more tidy
- renamed modules "func2" as "extra", "ppmlib" as "ppm", "geom" as geometry.
- Introduced a quite tidier usage of exceptions.
- xgroupby and xchain simplified and cleaned.
comments: Leave a comment Add to Memories Tell a Friend

Tags:, , , ,
Subject:Pythagorean triples
Time:06:05 pm
A little puzzle:
Given an array of n integers, find all Pythagorean triples in the array, that is, three elements such that a^2 + b^2 = c^2. Do this in O(n^2) time.

A simple solution is to test the equation with a hash of the squares on every unordered (a^2,b^2) pair (see the first Python version below).

First I have written a Python solution that uses a set, then I have translated it to D using my set module, then using D associative arrays, then I have translated the code to C adapting a simple hash code to integer keys. Then I have simplified the C code, removing the values from the hash.

That's the faster general solution, but the code can be improved if the problem is made less general. So I have tried to count the number of distinct Pythagorean triples among the integers [1,10_000].
So I have tried to simplify the hash some more. I have tried a really simple hash, and while counting the collision frequency I have found a very simple hash configuration where every number is present at its place in the array, or it's absent. Then as usual I have backported that code to Psyco and D.

Click to see the source code
Click here )

Using MinGW 4.2.1 I have compiled that C code with:
gcc -O3 -s -march=pentiumpro -fprofile-generate triples.c -o triples
Followed by a run, followed by:
gcc -O3 -s -march=pentiumpro -fprofile-use triples.c -o triples

The general D version works with any group of input numbers (not tested below).

The D1 version is really close to the C version. The D2 version is faster and higher-level, it uses my d libs:
www.fantascienza.net/leonardo/so/libs_d.zip

The timing results (Pentium3 500 MHz):
Timings triples, N = 10_000:
  D general: 13.38 s
  Psyco:      6.89 s   (4.12 X)   (1.87 X)
  D1:         4.02 s   (2.40 X)   (1.09 X)
  D2:         3.68 s   (2.20 X)   (1    X)
  C:          1.67 s   (1    X)   (0.45 X)

Triples in [1, 10000): 12467
Triples in [1, 20000): 27171
Triples in [1, 40000): 58728
Triples in [1, 80000): 394138

The timings are quite good for Psyco but quite bad for D: the C-like D version is 2.4 times slower than the C version, and the Psyco version is just 1.7 times slower than the C-like D version (and the Psyco version is 1.9 times slower than the faster D version). This means that with this tiny program there's not that much purpose in using D instead of Psyco.
comments: 1 comment or Leave a comment Add to Memories Tell a Friend

Tags:,
Subject:Jolly Jumpers
Time:04:59 pm
A little programming challenge, "Jolly Jumpers" in D:
http://mikael.jansson.be/journal/2008/03/jolly-jumpers-challenge

The version jolly() below comes from a C version by the Reddit user manannan (indents reduced to 2 spaces for LiveJournal).
The version jolly2() is for maniacs :-) It uses 1/8 of memory of jolly() using a bitmap with the bts compiler intrinsic, and alloca() to speed up memory allocation.

import std.math: abs, ceil;
import std.intrinsic: bts;
import std.c.stdlib: alloca;
import std.stdio: writefln;

bool jolly(int[] arr) {
  auto flags = new ubyte[arr.length - 1];
  foreach (i, el; arr[0 .. $-1]) {
    auto diff = abs(arr[i+1] - el) - 1;
    if (diff < 0 || diff > arr.length || flags[diff]++)
      return false;
  }
  return true;
}

bool jolly2(int[] arr) {
  int nuints = cast(int)ceil(arr.length / cast(float)(uint.sizeof * 8));
  uint[] flags = (cast(uint*)alloca(nuints * uint.sizeof))[0 .. nuints];
  if (flags.ptr is null)
  	throw new Exception("Stack overflow");
  flags[] = 0;
  foreach (i, el; arr[0 .. $-1]) {
    int diff = abs(arr[i+1] - el) - 1;
    if (diff < 0 || diff > arr.length || bts(flags.ptr, diff))
      return false;
  }
  return true;
}

void main() {
  foreach (a; [[1, 4, 2, 3], [1, 4, 2, -1, 6]])
    writefln(jolly(a), " ", jolly2(a));
}
comments: Leave a comment Add to Memories Tell a Friend

Tags:, , , ,
Subject:Updates and few links
Time:03:42 pm
Slow updates because of limited net access.

The C language is designed for the hardware of the computer.
Python is designed for the programmer.
The D language is designed for both the designer of its compiler and for its compiler.

-----------------------

Few links:

Many free video courses, for science and engineering, there are an couple of them for Python too, with interesting PDF files, that show how to use Python in various scientific situations:
http://www.datawrangling.com/hidden-video-courses-in-math-science-and-engineering.html


Very nice, how to teach some Computer Science without computer to young children, some of the things shown are nice, like the sorting algorithms:
http://csunplugged.com/
But I don't know if it's an advantage for young children to learn sorting nets and other things. I think they have more important things to learn first, like reading complex texts, writing well, some mathematics, statistics, geometry, combinatorics, etc.



A really interesting parser+interepreter, PyMeta, it's a version of OMeta+Python:
http://washort.twistedmatrix.com/
OMeta is a part of a dream to create now foundations of computer programming, this is the third part of an article that discusses related matters:
http://www.moserware.com/2008/04/towards-moores-law-software-part-3-of-3.html

-----------------------

As soon as possible I'll update my D libs, there are many news.
New things:
- editDistance, editDistanceFast: they find the Levenshtein distance between two arrays (strings too).
- ALL: template that's true if the template predicate is true for all the items of the given tuple.
- MAP: template that maps a template on all the items of the given tuple and return the resulting tuple.
- BKtree: fast implementation of BK-trees to find nearby objects, given a distance function.
- polyCentroid: to find area and centroid coordinate of a given polygon.
- delVoidC, delVoidGC, NewVoidCArray, NewVoidCMatrix, NewVoidGCArray, NewVoidGCMatrix: high-performance functions to allocate/deallocate 1D/2D void arrays.
- xchain, Chainable: to chain any number of any kind of iterable, plus now all x-something lazy iterable classes support the ~ operator to chain them.
Changes:
- Removed boxarray and strarray functions because they are of little use.
- ExprType and xsplit simplified and improved.
- 'select' module renamed to 'ranking' to avoid any name clashing with func.select.
- Removed dependencies from 'meta' package, it's not included anymore.

-----------------------

I have developed the editDistanceFast/BKtree code in Python and D, and I have found that there are some advantages in developing in two (quite different) languages at a time.
- If during the development I put a bug in one of the two implementations, they may give different outputs, so I am able to see there's a bug.
- Optimizing the code for D allows me to focus to low level things. While if I program just in Python I seem not much able to "see" some of such details; and in this code I have found that every time I have invent a way to speed up the D code, the same trick was able to speed up the Psyco (the Python JIT) version too.
- Python allows me to quickly write a program that works, that is nearly bug-free. It's great for prototyping, while developing in D is slower. Later I can usually translate the code to D. Python allows me to write very short code too, and short programs are very useful, because they often hide less bugs and you can understand them better and faster. Later I may use my D libs to translate the Python code to a very short D code (using the high-order functions, etc). While if I program in D from the start I may miss spots where I can shorten the code with little performance loss.
- The short original Python code, plus the wonderful doctests allow me to set a starting point, to define what's the correct output of the code. Later I can translate the Python doctests to D unittests, and even later I can use the short and easy Python version to test if the D version is correct.
- So jumping between the two languages allows me to find many things that I may miss when I develop in just one language. The result is code that is readable, debugged, short, and fast :-)

As an example of the results of such language jumping, this is the D version of the editDistance, that's back-translated from Python, showing a quite short and readable code (I have created a pair of editDistance and editDistanceFast in both languages, the fast version has limits in the length of the possible input iterables/arrays, and avoid dynamic allocations):
int editDistance(T)(T[] s1, T[] s2) {
    if (len(s1) > len(s2)) { auto sa = s1; s1 = s2; s2 = sa; }
    auto r1 = range(len(s2) + 1);
    auto r2 = new int[len(r1)];
    foreach (i, c1; s1) {
        r2[0] = i + 1;
        foreach (j, c2; s2)
            r2[j+1] = c1 == c2 ? r1[j] : min(r2[j], r1[j], r1[j+1]) + 1;
        auto ra = r1; r1 = r2; r2 = ra;
    }
    return r1[$ - 1];
}

While the following Psyco code is essentially back-translated from the fast D version:
def editDistanceFast(s1, s2, r1=[0]*35, r2=[0]*35):
    if s1 == s2: return 0
    if len(s1) > len(s2):
        s1, s2 = s2, s1
    len_s2 = len(s2)
    assert len(s2) <= 34, "Error: one input sequence"\
           "is too much long (> 34), use editDistance()."
    for i in xrange(len_s2 + 1):
        r1[i] = i
        r2[i] = 0
    i = 0
    for c1 in s1:
        r2[0] = i + 1
        j = 0
        for c2 in s2:
            if c1 == c2:
                r2[j+1] = r1[j]
            else:
                a1 = r2[j]
                a2 = r1[j]
                a3 = r1[j+1]
                if a1 > a2:
                    if a2 > a3:
                        r2[j+1] = 1 + a3
                    else:
                        r2[j+1] = 1 + a2
                else:
                    if a1 > a3:
                        r2[j+1] = 1 + a3
                    else:
                        r2[j+1] = 1 + a1
            j += 1
        aux = r1; r1 = r2; r2 = aux
        i += 1
    return r1[len_s2]

There you can even see those r1=[0]*35 and r2=[0]*35 that are a way to implement a kind of the static arrays I have used in the fast D version. The end result is that the fast D version is just 4.5 times faster than the fast Psyco version.

You can find the Python version here:
http://www.fantascienza.net/leonardo/so/bktree.py
comments: 2 comments or Leave a comment Add to Memories Tell a Friend

Tags:, , ,
Subject:Links
Time:08:17 pm
How the JVM compiles Java code, it shows some quite interesting optimizazions:
http://weblogs.java.net/blog/kohsuke/archive/2008/03/deep_dive_into.html

Faster object message sending in Objective-C:
http://www.ridiculousfish.com/blog/archives/2005/08/01/objc_msgsend/

Nest functions, and ways to manage them (one of such ways, the delegate-based one is used by D, but there are two other ones):
http://ridiculousfish.com/blog/archives/2006/02/05/nest/
comments: Leave a comment Add to Memories Tell a Friend

Tags:, , , ,
Subject:Slow D
Time:03:25 am
Here I have collected few benchmarks where the D language, or the DMD compiler, or the Phobos std lib don't shine in their performance. Two of the following performance problems (see 'gc1' and 'sort' benchmarks) have already a solution.

http://www.fantascienza.net/leonardo/js/slow_d.zip

--------------------

Compilers/interpreters used:
Digital Mars D Compiler v1.028

gcc version 4.2.1-dw2 (mingw32-2)

Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45) [MSC v.1310 32 bit (Intel)] on win32

Psyco V.1.5.2 (JIT for Python)

javac 1.6.0_03, Java version "1.6.0_03"


All timings are "warm", best of 3, in seconds.
CPU used is a Pentium3 500 MHz, 256 MB RAM, with Win.


Compilation flags used (when not specified otherwise):
gcc: -O3 -s
dmd: -O -release -inline
Python: no flags.

-------------------
BENCHMARKS:

recursive, n = 38:
  C: 3.89 s (with __builtin_expect)
  C: 3.99 s
  D: 4.95 s


hash, n = 500_000:
  D:     7.07 s (7.15 s with GC patched)
  D:     5.89 s (GC disabled)
  Psyco: 4.23 s
(Note that Python+Psyco has much bigger overhead compared to D in any instruction, simple loops too).
wordcount, on a txt file of 13_312_768 bytes:
  C: 0.73 s
  D: 5.77 s


gc1, with 6.3 MB of text:
        loading   splitting   total
         time        time     time
  D:       1.87 s     3.58 s   13.76 s
  Python:  0.28 s     1.98 s    3.52 s
  Python:  0.1 s      2.0 s     3.38 s (load with 'rb', same result)
With a Patch to the D GC, plus disabling the GC:
  D:       0.08 s     0.72 s    1.36 s (GC patched + GC disabled after load)


gc2, n=1_000, m=10_000:
               seconds  MB
  DMD class:   18.95    1.7
  GDC class:   17.91    1.8
  DMD struct:  11.77    1.7
  GDC struct:  12.31    1.8
  Python:      37.10    3.1
  Psyco:       15.68    3.5
  Java:         2.19    7.3


gc3 (binarytrees), n = 15:
  Java:   9.12 s
  D:     35.01 s


sort, const n = 1_000_000:
  Random distribution:
    sort: 2.934
    fastSort: 0.881
  Already sorted arrays:
    sort: 1.723
    fastSort: 0.41
 Inverted order arrays:
    sort: 1.853
    fastSort: 0.651
(For a better and more complete version of fastSort see my D libs:
www.fantascienza.net/leonardo/so/libs_d.zip ).
comments: Leave a comment Add to Memories Tell a Friend

Tags:, , , , ,
Subject:Returning '(()) isn't difficult
Time:04:19 pm
A little amusing post about language syntax:
http://cadrlife.blogspot.com/2008/03/returning-considered-difficult.html

"Suppose you want a function to find all subsequences of a list with a certain length. For instance, the subsequences of (1...5) with length 2 are as follows."
(1 2) (1 3) (1 4) (1 5)
(2 3) (2 4) (2 5)
(3 4) (3 5)
(4 5)
In a lisp dialect:
(def subn (n li)
 (if (is 0 n) '(())
     (no li) '()
       (join (map [cons (car li) _] (subn (- n 1) (cdr li)))
             (subn n (cdr li)))))
Haskell:
subn 0 _      = [[]]
subn _ []     = []
subn n (x:xs) = map (x:) (subn (n-1) xs) ++ subn n xs
Python:
def subn(n, a):
    if not n: return [[]]
    if not a: return []
    return [[a[0]] + sub for sub in subn(n-1, a[1:])] + subn(n, a[1:])
The Scala version too uses pattern matching, showing how much it can be useful (it may be useful for D language too):
def subn[T](n: int, li : List[T]) : List[List[T]] = (n,li) match {
   case (0,_)     => List(Nil)
   case (_,Nil)   => Nil
   case (_,x::xs) => (subn(n-1, xs) map (x :: _)) ::: subn(n, xs)
}
D version using my libs:
import d.func: putr, map, not, range, xrange, select, BaseType, array;

T[][] subn(T)(int n, T[] a) {
    if (!n) return new T[][](1, 0);
    if (not(a)) return new T[][0];
    return map((T[] s){return a[0..1] ~ s;}, subn(n-1, a[1..$])) ~ subn(n, a[1..$]);
}
If you want shorter lines you can use select (reverted from the 'list' name):
T[][] subn2(T)(int n, T[] a) {
    if (!n) return new T[][](1, 0);
    if (not(a)) return new T[][0];
    T[] s;
    return select(a[0..1] ~ s, s, subn2(n-1, a[1..$])) ~ subn2(n, a[1..$]);
}
This version is able to digest any iterable:
BaseType!(TyIter)[][] subn3(TyIter)(int n, TyIter iter) {
    auto a = array(iter);
    alias BaseType!(TyIter) T;
    if (!n) return new T[][](1, 0);
    if (not(a)) return new T[][0];
    return map((T[] s){return a[0..1] ~ s;}, subn3(n-1, a[1..$])) ~ subn3(n, a[1..$]);
}

void main() {
    putr( subn(2, range(5)) );
    // Output: [[0, 1], [0, 2], [0, 3], [0, 4], [1, 2], [1, 3], [1, 4],
    //          [2, 3], [2, 4], [3, 4]]  

    putr( subn2(3, range(5)) );
    // [[0, 1, 2], [0, 1, 3], [0, 1, 4], [0, 2, 3], [0, 2, 4], [0, 3, 4],
    //  [1, 2, 3], [1, 2, 4], [1, 3, 4], [2, 3, 4]]

    putr( subn3(2, xrange(5)) );
    // Output: [[0, 1], [0, 2], [0, 3], [0, 4], [1, 2], [1, 3], [1, 4],
    //          [2, 3], [2, 4], [3, 4]]    
}
comments: Leave a comment Add to Memories Tell a Friend

Tags:, ,
Subject:Self-recursive generators in D
Time:07:45 pm
I have (slightly) adapted the Python-like generators for D written by Witold Baryluk to my libs. They allow a simpler syntax that can be used for single iterations (not parallel scan allowed, you have to use some threads or light threads). So I have used them to translate to D some self-recursive generators from Python code by David Eppstein:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/221457

As an example this yields Fibbinary numbers (no consecutive ones in binary representation):
http://www.research.att.com/projects/OEIS?Anum=A003714
struct A003714 {
  void generator() {
    yield(1);
    foreach(x; A003714()) {
      yield(2 * x);
      if (!(x & 1))
        yield(2 * x + 1);
    }
  }
  mixin Generator!(int);
}

The original Python version:
def A003714():
  yield 1
  for x in A003714():
    yield 2 * x
    if not (x & 1):
      yield 2 * x + 1

The syntax isn't as clean as Python one, but it's simple enough anyway. They are 2-3 times slower than handwritten opApply struct methods, so you may want to use them only where code size/readability matters more.

Click here to see more examples )
comments: Leave a comment Add to Memories Tell a Friend

Tags:, , ,
Subject:Euler problem 14
Time:05:23 pm
The Euler problem n.14:

The following iterative sequence is defined for the set of positive integers:
n -> n/2 (n is even)
n -> 3n + 1 (n is odd)
Using the rule above and starting with 13, we generate the following sequence of 10 terms:
13 -> 40 -> 20 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1
Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1. Which starting number, under one million, produces the longest chain of Collatz numbers?


My Python solution, that uses memoization of results to speed up computation:
import array

def main(m):
  lens = array.array('H', [0]) * m
  max_steps = max_steps_pos = 1
  for i in xrange(1, m):
    n = i
    n_steps = 1
    while n != 1:
      if n < i:
        n_steps += lens[n] - 1
        break
      n = 3 * n + 1 if n & 1 else n // 2
      n_steps += 1
    if n_steps > max_steps:
      max_steps = n_steps
      max_steps_pos = i
    lens[i] = n_steps
  return max_steps, max_steps_pos

import psyco; psyco.bind(main)
print "max_steps, max_steps_pos:", main(1000000)

Psyco is often able to manage array.array very efficiently.
'H' represents unsigned shorts [0 .. 2^16-1], because the chains are never too much long. This allows to reduce memory used and improve CPU cache efficiency (mostly in the D version). In Python you can use 'l' that is safer and doesn't slow down the code much, but uses twice the memory (so you can manage smaller problems).

The D version is essentially the same, the only significant difference is that the array isn't pre-initialized:
import std.stdio, std.c.stdlib;

void main() {
  const uint m = 1_000_000;
  auto lens = (cast(ushort*)malloc(m * ushort.sizeof))[0 .. m];
  uint max_steps = 1;
  uint max_steps_pos = 1;

  for (uint i = 1; i < m; i++) {
    uint n = i;
    uint n_steps = 1;

    while (n != 1) {
      if (n < i) {
        n_steps += lens[n] - 1;
        break;
      }
      n = n & 1 ? 3 * n + 1 : n / 2;
      n_steps++;
    }
    if (n_steps > max_steps) {
      max_steps = n_steps;
      max_steps_pos = i;
    }
    lens[i] = n_steps;
  }
  writefln("max_steps, max_steps_pos: ", max_steps, " ", max_steps_pos);
}


Timings:
m = 1_000_000:
D/C:    0.29 s  ( 1   X)
Psyco:  1.60 s  ( 5.5 X) with array 'H'
Python: 19.20 s (66   X) with list

A C version is very similar to that D version, and the running time (with GCC 4.2.1) is the same, but the executable is almost 20 times smaller.

The D code scans m = 100_000_000 in 25 s finding a chain of length 758 on n= 83_706_505.

An Haskell version:
http://www.haskell.org/haskellwiki/Euler_problems/11_to_20#Problem_14

To improve cache efficiency I have tried to store half of the values, but the code is slower (it allows to scan twice the values with the same max memory).
comments: Leave a comment Add to Memories Tell a Friend

Tags:
Subject:Links
Time:11:36 pm
Counting polyominos in Haskell, slow but very elegant code:
http://alaska-kamtchatka.blogspot.com/2007/11/counting-cattle.html

This wonderful article by Lockharts (I'd like it to be more widely read) teaches you what mathematics is, and why people like to develop/enjoy it. I agree with about 99% of the things he/she says:
http://www.maa.org/devlin/LockhartsLament.pdf

Easy intro to the pdb ("Python DeBugger"):
http://www.ferg.org/papers/debugging_in_python.html

This shows you some of the very nice things (useful too) you can do with Python generators (part of those things are possible with a similar number of lines of code in D with my D libs):
http://www.dabeaz.com/generators/Generators.pdf

Very nice regular expression generator, starting from text, I'd like to have this stand-alone instead of online:
http://www.txt2re.com/index-python.php3

Babbage Difference Engine in LEGO!
http://acarol.woz.org/
 
A nice "Quickhull" demonstration in JavaScript, sometimes nowdays Java applet are a less necessary:
http://infoecho.net/blogs/echo/archives/2007/03/13/230/
comments: Leave a comment Add to Memories Tell a Friend

leonardo
View:Recent Entries.
View:Archive.
View:Friends.
View:User Info.
View:Website (My Website).
You're looking at the latest 10 entries.
Missed some entries? Then simply jump back 10 entries