leonardo
View:Recent Entries.
View:Archive.
View:Friends.
View:User Info.
View:Website (My Website).
You're looking at the latest 1 entries.

Tags:, , , ,
Subject:Updates and few links
Time:03:42 pm
Slow updates because of limited net access.

The C language is designed for the hardware of the computer.
Python is designed for the programmer.
The D language is designed for both the designer of its compiler and for its compiler.

-----------------------

Few links:

Many free video courses, for science and engineering, there are an couple of them for Python too, with interesting PDF files, that show how to use Python in various scientific situations:
http://www.datawrangling.com/hidden-video-courses-in-math-science-and-engineering.html


Very nice, how to teach some Computer Science without computer to young children, some of the things shown are nice, like the sorting algorithms:
http://csunplugged.com/
But I don't know if it's an advantage for young children to learn sorting nets and other things. I think they have more important things to learn first, like reading complex texts, writing well, some mathematics, statistics, geometry, combinatorics, etc.



A really interesting parser+interepreter, PyMeta, it's a version of OMeta+Python:
http://washort.twistedmatrix.com/
OMeta is a part of a dream to create now foundations of computer programming, this is the third part of an article that discusses related matters:
http://www.moserware.com/2008/04/towards-moores-law-software-part-3-of-3.html

-----------------------

As soon as possible I'll update my D libs, there are many news.
New things:
- editDistance, editDistanceFast: they find the Levenshtein distance between two arrays (strings too).
- ALL: template that's true if the template predicate is true for all the items of the given tuple.
- MAP: template that maps a template on all the items of the given tuple and return the resulting tuple.
- BKtree: fast implementation of BK-trees to find nearby objects, given a distance function.
- polyCentroid: to find area and centroid coordinate of a given polygon.
- delVoidC, delVoidGC, NewVoidCArray, NewVoidCMatrix, NewVoidGCArray, NewVoidGCMatrix: high-performance functions to allocate/deallocate 1D/2D void arrays.
- xchain, Chainable: to chain any number of any kind of iterable, plus now all x-something lazy iterable classes support the ~ operator to chain them.
Changes:
- Removed boxarray and strarray functions because they are of little use.
- ExprType and xsplit simplified and improved.
- 'select' module renamed to 'ranking' to avoid any name clashing with func.select.
- Removed dependencies from 'meta' package, it's not included anymore.

-----------------------

I have developed the editDistanceFast/BKtree code in Python and D, and I have found that there are some advantages in developing in two (quite different) languages at a time.
- If during the development I put a bug in one of the two implementations, they may give different outputs, so I am able to see there's a bug.
- Optimizing the code for D allows me to focus to low level things. While if I program just in Python I seem not much able to "see" some of such details; and in this code I have found that every time I have invent a way to speed up the D code, the same trick was able to speed up the Psyco (the Python JIT) version too.
- Python allows me to quickly write a program that works, that is nearly bug-free. It's great for prototyping, while developing in D is slower. Later I can usually translate the code to D. Python allows me to write very short code too, and short programs are very useful, because they often hide less bugs and you can understand them better and faster. Later I may use my D libs to translate the Python code to a very short D code (using the high-order functions, etc). While if I program in D from the start I may miss spots where I can shorten the code with little performance loss.
- The short original Python code, plus the wonderful doctests allow me to set a starting point, to define what's the correct output of the code. Later I can translate the Python doctests to D unittests, and even later I can use the short and easy Python version to test if the D version is correct.
- So jumping between the two languages allows me to find many things that I may miss when I develop in just one language. The result is code that is readable, debugged, short, and fast :-)

As an example of the results of such language jumping, this is the D version of the editDistance, that's back-translated from Python, showing a quite short and readable code (I have created a pair of editDistance and editDistanceFast in both languages, the fast version has limits in the length of the possible input iterables/arrays, and avoid dynamic allocations):
int editDistance(T)(T[] s1, T[] s2) {
    if (len(s1) > len(s2)) { auto sa = s1; s1 = s2; s2 = sa; }
    auto r1 = range(len(s2) + 1);
    auto r2 = new int[len(r1)];
    foreach (i, c1; s1) {
        r2[0] = i + 1;
        foreach (j, c2; s2)
            r2[j+1] = c1 == c2 ? r1[j] : min(r2[j], r1[j], r1[j+1]) + 1;
        auto ra = r1; r1 = r2; r2 = ra;
    }
    return r1[$ - 1];
}

While the following Psyco code is essentially back-translated from the fast D version:
def editDistanceFast(s1, s2, r1=[0]*35, r2=[0]*35):
    if s1 == s2: return 0
    if len(s1) > len(s2):
        s1, s2 = s2, s1
    len_s2 = len(s2)
    assert len(s2) <= 34, "Error: one input sequence"\
           "is too much long (> 34), use editDistance()."
    for i in xrange(len_s2 + 1):
        r1[i] = i
        r2[i] = 0
    i = 0
    for c1 in s1:
        r2[0] = i + 1
        j = 0
        for c2 in s2:
            if c1 == c2:
                r2[j+1] = r1[j]
            else:
                a1 = r2[j]
                a2 = r1[j]
                a3 = r1[j+1]
                if a1 > a2:
                    if a2 > a3:
                        r2[j+1] = 1 + a3
                    else:
                        r2[j+1] = 1 + a2
                else:
                    if a1 > a3:
                        r2[j+1] = 1 + a3
                    else:
                        r2[j+1] = 1 + a1
            j += 1
        aux = r1; r1 = r2; r2 = aux
        i += 1
    return r1[len_s2]

There you can even see those r1=[0]*35 and r2=[0]*35 that are a way to implement a kind of the static arrays I have used in the fast D version. The end result is that the fast D version is just 4.5 times faster than the fast Psyco version.

You can find the Python version here:
http://www.fantascienza.net/leonardo/so/bktree.py
comments: 2 comments or Leave a comment Add to Memories Tell a Friend

leonardo
View:Recent Entries.
View:Archive.
View:Friends.
View:User Info.
View:Website (My Website).
You're looking at the latest 1 entries.