?

Log in

No account? Create an account

August 7th, 2008 - leonardo
View:Recent Entries.
View:Archive.
View:Friends.
View:Profile.
View:Website (My Website).
Missed some entries? Then simply jump to the previous day or the next day.

Tags:, , , ,
Security:
Subject:Goto
Time:07:42 pm
Python 'for' and 'while' statements allow an 'else' clause, such part is executed if the cycle isn't exited by a break. So this code prints 2) and 4):
for x in range(10):
    break
else:
    print "1)"

for x in range(10):
    pass
else:
    print "2)"

while True:
    break
else:
    print "3)"

while False:
    break
else:
    print "4)"
I don't know other languages with such clause. I don't use it often, I find it not much intuitive, but once in a while I have found it useful.

While reading an old article by Knuth I think I may have found one of the things that may have suggested Guido Rossum to put such syntax into Python. I think such 'else' removes one (common?) use case for the 'goto' statement, that's missing in Python (http://entrian.com/goto/index.html ).

This is the article: "Structured Programming With Go To Statements", by Knuth, Computing Surveys, Vol 6, No 4, December 1974:
http://pplab.snu.ac.kr/courses/adv_pl05/papers/p261-knuth.pdf

At page 6 of the PDF the author presents a little problem. We want to search into an array A of distinct values, to find the index of a X item inside it. If not present, we want to add it to the sequence (at the end, you can do it increasing a length value if there's enough space). Plus, we want to keep a second array B, where the B[i] contains the number of times we have looked for A[i].

This means we are trying to create an associative array of the frequencies, so in Python you can do that with a default dict of the frequencies, test is some testing data of many items X:
test = [1, 4, 3, 2, 5, 1, 2, 3, 4, 6, 1]

from collections import defaultdict
d = defaultdict(int)
for x in test:
    d[x] += 1
print d
If we want to use the array pair as in the original description of the problem, we can do like this:
a = []
b = []
for x in test:
    for i, el in enumerate(a):
        if el == x:
            b[i] += 1
            break
    else:
        a.append(x)
        b.append(1)
print a, b
In Python dicts are very fast, and the computational complexity of that code is linear. But if you work in a lower level language like D, and you have just few elements (like in that 'test' input array), and you already have two A, B (so you don't need to create them, for example because you're in a loop, etc), then that code with two arrays can be faster (because in such low level languages a scan in a short array can be faster than an hash lookup).

In C a way to implement that code efficiently is the following one, that uses a goto (note that there's no need to clear a and b, they become overwritten), C code:
#define TESTLEN 11

void counter1(int* test) {
    int a[11], b[11];
    int last = 0, i = 0, j;

    while (i < TESTLEN) {
        int x = test[i];
        for (j = 0; j < last; j++)
            if (a[j] == x) {
                b[j]++;
                goto FOUND;
            }
        a[last] = x;
        b[last] = 1;
        last++;

        FOUND:
            i++;
    }
    // here use a[0..last-1] and b[0..last-1]
}
In D you can solve the same problem in many ways, you can use code much similar to that C one, or you can use an extra boolean variable, that slows down the code a little, D code:
int[11] a, b;
int last;
for (int i; i < test.length; i++) {
    auto x = test[i];
    bool found = false;
    for (int j; j < last; j++)
        if (a[j] == x) {
            b[j]++;
            found = true;
            break;
        }
    if (!found) {
        a[last] = x;
        b[last] = 1;
        last++;
    }
}
But in the past every CPU cycle was precious and CPUs weren't good at doing many things done by modern CPUs (branch prediction, ahead of time execution, etc), so the usage of that 'found' plus the 'if' was a waste. In D you can also solve the same problem without gotos, shortening the code, using a named 'continue' statement (but the code with goto is faster), D code:
OUTER:
for (int i; i < test.length; i++) {
    auto x = test[i];
    for (int j; j < last; j++)
        if (x == set[j]) {
            freqs[j]++;
            continue OUTER;
        }
    set[last] = x;
    freqs[last++] = 1;
}
In most practical situations of this problem, in D as in Python/Psyco, I use an associative array/dict.

If there's a way to allocate one item longer a,b arrays then you can use a sentinel, this trick produces my fastest solution (you don't need to remove the sentinel at the end because the 'last' tells you how many good items are present), C code:
void counter2(int* test) {
    #define LENARR 12
    int a[LENARR], b[LENARR];
    int last = 0, i, j;

    for (i = 0; i < TESTLEN; i++) {
        int x = test[i];
        a[last] = x;
        j = 0;
        while (a[j] != x)
            j++;
        if (j < last)
            b[j]++;
        else {
            a[last] = x;
            b[last] = 1;
            last++;
        }
    }
    #undef LENARR
}
Compiling it with MinGW 4.2.1 with:
 gcc -Wall -fomit-frame-pointer -fprofile-generate -O3 -s tempc.c -o tempc
followed by a run, followed by:
 gcc -Wall -fomit-frame-pointer -fprofile-use -O3 -s tempc.c -o tempc

counter2 becomes:
_counter2:
    pushl    %edi
    xorl    %ecx, %ecx
    pushl    %esi
    pushl    %ebx
    xorl    %ebx, %ebx
    subl    $48, %esp
    movl    64(%esp), %edi
    movl    %esp, %esi
    .p2align 4,,7
L37:
    movl    (%edi,%ecx,4), %eax
    xorl    %edx, %edx
    movl    %eax, (%esp,%ebx,4)
    cmpl    %eax, (%esp)
    je    L40
    cmpl    %eax, 4(%esi)
    movl    $1, %edx
    je    L40
    cmpl    %eax, 8(%esi)
    movb    $2, %dl
    je    L40
    cmpl    %eax, 12(%esi)
    movb    $3, %dl
    je    L40
L41:
    addl    $1, %edx
    cmpl    %eax, (%esi,%edx,4)
    jne    L41
    .p2align 4,,7
L40:
    cmpl    %ebx, %edx
    jl    L42
    movl    %eax, (%esp,%ebx,4)
    addl    $1, %ebx
L42:
    addl    $1, %ecx
    cmpl    $11, %ecx
    jne    L37
    addl    $48, %esp
    popl    %ebx
    popl    %esi
    popl    %edi
    ret
To scan the array of 11 items it needs about 76 clock ticks of my CPU, that's just a bit more than the minimum. If you write the code in assembly I think it's not easy to go below 50-60 clock ticks (and I don't know if SSSE3 instructions may help here).
comments: 1 comment or Leave a comment Share

August 7th, 2008 - leonardo
View:Recent Entries.
View:Archive.
View:Friends.
View:Profile.
View:Website (My Website).
Missed some entries? Then simply jump to the previous day or the next day.