leonardo ([info]leonardo_m) wrote,
@ 2008-04-18 15:42:00
Previous Entry  Add to memories!  Tell a Friend  Next Entry
Entry tags:d language, links, ometa, programming, python

Updates and few links
Slow updates because of limited net access.

The C language is designed for the hardware of the computer.
Python is designed for the programmer.
The D language is designed for both the designer of its compiler and for its compiler.

-----------------------

Few links:

Many free video courses, for science and engineering, there are an couple of them for Python too, with interesting PDF files, that show how to use Python in various scientific situations:
http://www.datawrangling.com/hidden-video-courses-in-math-science-and-engineering.html


Very nice, how to teach some Computer Science without computer to young children, some of the things shown are nice, like the sorting algorithms:
http://csunplugged.com/
But I don't know if it's an advantage for young children to learn sorting nets and other things. I think they have more important things to learn first, like reading complex texts, writing well, some mathematics, statistics, geometry, combinatorics, etc.



A really interesting parser+interepreter, PyMeta, it's a version of OMeta+Python:
http://washort.twistedmatrix.com/
OMeta is a part of a dream to create now foundations of computer programming, this is the third part of an article that discusses related matters:
http://www.moserware.com/2008/04/towards-moores-law-software-part-3-of-3.html

-----------------------

As soon as possible I'll update my D libs, there are many news.
New things:
- editDistance, editDistanceFast: they find the Levenshtein distance between two arrays (strings too).
- ALL: template that's true if the template predicate is true for all the items of the given tuple.
- MAP: template that maps a template on all the items of the given tuple and return the resulting tuple.
- BKtree: fast implementation of BK-trees to find nearby objects, given a distance function.
- polyCentroid: to find area and centroid coordinate of a given polygon.
- delVoidC, delVoidGC, NewVoidCArray, NewVoidCMatrix, NewVoidGCArray, NewVoidGCMatrix: high-performance functions to allocate/deallocate 1D/2D void arrays.
- xchain, Chainable: to chain any number of any kind of iterable, plus now all x-something lazy iterable classes support the ~ operator to chain them.
Changes:
- Removed boxarray and strarray functions because they are of little use.
- ExprType and xsplit simplified and improved.
- 'select' module renamed to 'ranking' to avoid any name clashing with func.select.
- Removed dependencies from 'meta' package, it's not included anymore.

-----------------------

I have developed the editDistanceFast/BKtree code in Python and D, and I have found that there are some advantages in developing in two (quite different) languages at a time.
- If during the development I put a bug in one of the two implementations, they may give different outputs, so I am able to see there's a bug.
- Optimizing the code for D allows me to focus to low level things. While if I program just in Python I seem not much able to "see" some of such details; and in this code I have found that every time I have invent a way to speed up the D code, the same trick was able to speed up the Psyco (the Python JIT) version too.
- Python allows me to quickly write a program that works, that is nearly bug-free. It's great for prototyping, while developing in D is slower. Later I can usually translate the code to D. Python allows me to write very short code too, and short programs are very useful, because they often hide less bugs and you can understand them better and faster. Later I may use my D libs to translate the Python code to a very short D code (using the high-order functions, etc). While if I program in D from the start I may miss spots where I can shorten the code with little performance loss.
- The short original Python code, plus the wonderful doctests allow me to set a starting point, to define what's the correct output of the code. Later I can translate the Python doctests to D unittests, and even later I can use the short and easy Python version to test if the D version is correct.
- So jumping between the two languages allows me to find many things that I may miss when I develop in just one language. The result is code that is readable, debugged, short, and fast :-)

As an example of the results of such language jumping, this is the D version of the editDistance, that's back-translated from Python, showing a quite short and readable code (I have created a pair of editDistance and editDistanceFast in both languages, the fast version has limits in the length of the possible input iterables/arrays, and avoid dynamic allocations):

int editDistance(T)(T[] s1, T[] s2) {
    if (len(s1) > len(s2)) { auto sa = s1; s1 = s2; s2 = sa; }
    auto r1 = range(len(s2) + 1);
    auto r2 = new int[len(r1)];
    foreach (i, c1; s1) {
        r2[0] = i + 1;
        foreach (j, c2; s2)
            r2[j+1] = c1 == c2 ? r1[j] : min(r2[j], r1[j], r1[j+1]) + 1;
        auto ra = r1; r1 = r2; r2 = ra;
    }
    return r1[$ - 1];
}

While the following Psyco code is essentially back-translated from the fast D version:
def editDistanceFast(s1, s2, r1=[0]*35, r2=[0]*35):
    if s1 == s2: return 0
    if len(s1) > len(s2):
        s1, s2 = s2, s1
    len_s2 = len(s2)
    assert len(s2) <= 34, "Error: one input sequence"\
           "is too much long (> 34), use editDistance()."
    for i in xrange(len_s2 + 1):
        r1[i] = i
        r2[i] = 0
    i = 0
    for c1 in s1:
        r2[0] = i + 1
        j = 0
        for c2 in s2:
            if c1 == c2:
                r2[j+1] = r1[j]
            else:
                a1 = r2[j]
                a2 = r1[j]
                a3 = r1[j+1]
                if a1 > a2:
                    if a2 > a3:
                        r2[j+1] = 1 + a3
                    else:
                        r2[j+1] = 1 + a2
                else:
                    if a1 > a3:
                        r2[j+1] = 1 + a3
                    else:
                        r2[j+1] = 1 + a1
            j += 1
        aux = r1; r1 = r2; r2 = aux
        i += 1
    return r1[len_s2]

There you can even see those r1=[0]*35 and r2=[0]*35 that are a way to implement a kind of the static arrays I have used in the fast D version. The end result is that the fast D version is just 4.5 times faster than the fast Psyco version.

You can find the Python version here:
http://www.fantascienza.net/leonardo/so/bktree.py



(2 comments) - (Post a new comment)

CS without a computer
[info]pragma_x
2008-04-18 07:17 pm UTC (link)
Back in High-School (back in the dark ages), I once made it to the county science fair with a CS-based project. The problem that my lab partner and I had was: how do we communicate the virtues of a distributed sorting algorithm to the layperson?

The solution was amazingly simple: use a deck of cards to show the non parallel and parallel versions of the sort in action. That worked well enough to win us second place in our category, since the actual project wasn't flashy at all (two PC's, a null modem cable and console output) and could only be truly validated by reading the source code.

Cards are very useful for explaining CS concepts to newbies since most people are already intimately familiar with their properties. This familiarity also helps by not intimidating folks in the way that most CS stuff can. Throw in a bit a showmanship and suddenly your lecture turns into a kind of magic show; and who wouldn't pay attention to that?

Since then I've taken to using a deck of playing cards, dice and coins to help explain theoretical stuff to people; all without the aid of a computer. Sometimes, I think it works better than involving electronics in the first place.

(Reply to this) (Thread)

Re: CS without a computer
[info]leonardo_m
2008-04-21 09:38 am UTC (link)
Cards are very useful for explaining CS concepts to newbies since most people are already intimately familiar with their properties.

I have a deck of 100 small handmade "cards" that I carry around in my bag all the time, I use them on trains and in similar boring places to teach the basic sorting algorithms (quicksort, insertion sort, radix sort, merge sort, etc) to random people, children, etc :-)

(Reply to this) (Parent)


(2 comments) - (Post a new comment)

Create an Account
Forgot your login or password?
Login w/ OpenID
English • Español • Deutsch • Русский…