leonardo

Subject:Links and updates
Time:12:49 am
Updates:

Given the natural numbers n and m, find the minimum number of integer-sided squares that tile an (n,m) rectangle:
http://www.fantascienza.net/leonardo/js/index.html#squares_problem
A D solution that improves the performance of a C++ solution.

A benchmark from the Computer Language Benchmarks Game site, in D:
http://www.fantascienza.net/leonardo/js/index.html#meteor_bench

A D implementation of a simple and famous simulation:
http://www.fantascienza.net/leonardo/js/segregation_game.zip

Various type-level solutions in D language of the "Instant Insanity" puzzle (inspired by a Haskell solution):
http://www.fantascienza.net/leonardo/js/instant_insanity.zip

-------------

Links:


A testing experience in Haskell:
http://www.serpentine.com/blog/2013/12/30/testing-a-utf-8-decoder-with-vigour/

A simple but sometimes useful optimization for modern CPUs:
http://blog.libtorrent.org/2013/12/memory-cache-optimizations/

A nice talk about the Julia language:
http://vimeo.com/84661077

"Types for Units-of-Measure: Theory and Practice" by Andrew Kennedy, the complexity behind the units of measure feature of F# (PDF file):
http://research.microsoft.com/en-us/um/people/akenn/units/cefp09typesforunitsofmeasure.pdf

A little story that reminds us that teaching debugging at the University is important:
http://danluu.com/teach-debugging/

Subject:Links and updates
Time:10:50 pm
Updated software:

http://www.fantascienza.net/leonardo/js/index.html#fasta
http://www.fantascienza.net/leonardo/js/index.html#audioactive


Added software:

http://www.fantascienza.net/leonardo/js/index.html#puzzle15game
http://www.fantascienza.net/leonardo/js/index.html#naval_simulation
http://www.fantascienza.net/leonardo/js/index.html#water_filling
http://www.fantascienza.net/leonardo/js/mark_sweep.zip
http://www.fantascienza.net/leonardo/js/index.html#challenge139


Links, mostly programming:

Easy use of Wikipedia from Python:
https://github.com/goldsmith/Wikipedia

A little sad Python-Haskell comparison:
http://eyallotem.blogspot.co.il/2013/05/comparing-python-and-haskell.html

Introductions to Parser Combinators:
http://blog.fogcreek.com/fparsec/
http://yannesposito.com/Scratch/en/blog/Parsec-Presentation/

"What I Wish I Knew When Learning Haskell", not easy to understand:
http://dev.stephendiehl.com/hask/

Mathics is a general-purpose computer algebra system:
http://www.mathics.net/

Slides of the StrangeLoop 2013 conference:
https://github.com/strangeloop/StrangeLoop2013/tree/master/slides/sessions

Introduction to Typed Clojure:
http://nathanic.org/posts/2013/typed-clojure-tour/

Simple introduction to Monoids for programmers:
http://fsharpforfunandprofit.com/posts/monoids-without-tears/
http://fsharpforfunandprofit.com/posts/monoids-part2/

Several simple C++ implementations of software renderers:
http://kagamin.net/hole/simple/index.htm

Cute, but with several reservations:
http://www.xfront.com/Haskell/Why-Functional-Programming-Matters-Part-2.pdf

Doctest for Haskell:
https://github.com/sol/doctest-haskell#readme

Units-of-Measure in F#:
http://clear-lines.com/blog/post/Safe-refactoring-with-FSharp-Units-of-Measure.aspx

Nice and simple Pythagoras Tree in F#:
http://fsharpnews.blogspot.it/2009/05/pythagoras-tree.html

Eye-opening, "Dispelling the Myths of Parallel Computing" by Patrick H. Madden:
http://www.cs.binghamton.edu/~pmadden/pubs/dispelling-ieeedt-2013.pdf

Parse, of the Rebol language:
http://www.red-lang.org/2013/11/041-introducing-parse.html

Subject:Graph Walk
Time:01:48 am
In the book "Nine Algorithms That Changed the Future: The Ingenious Ideas That Drive Today's Computers" by John MacCormick there is a simple simulation of a graph of Web pages, used to implement a basic page ranking. This graph is modified from Figure 3-7 (from: http://press.princeton.edu/releases/m9528_illus.pdf ):



I have implemented the algorithm using the D language. The simulation is just a random walk on the graph; at each step there is also a small probability (0.15) of jumping to a random node. There are various ways to implement this walk, and I have chosen a fast one:
// graph_walk1.d
import std.stdio, std.random, std.algorithm, std.math, std.typetuple;

alias NodeIndex = int;

enum nVisits = 20_000_000;
enum nUnroll = 8;
static assert(nVisits % nUnroll == 0);
enum restartProbability = 0.15;
enum nMaxOutArcs = 5;
enum nNodes = 16;

template Iota(int stop) {
    static if (stop <= 0)
        alias TypeTuple!() Iota;
    else
        alias TypeTuple!(Iota!(stop - 1), stop - 1) Iota;
}

struct Node {
    immutable uint nOutArcs;
    immutable uint[nMaxOutArcs] outbound;
    uint hitCount = 0;
}

void main() {
    auto rng = Xorshift(1);
    Node[nNodes] graph = [{/* 0*/ 3, [13, 9, 8]},
                          {/* 1*/ 1, [2]},
                          {/* 2*/ 2, [0, 8]},
                          {/* 3*/ 1, [1]},
                          {/* 4*/ 2, [1, 2]},
                          {/* 5*/ 2, [2, 8]},
                          {/* 6*/ 3, [3, 4, 5]},
                          {/* 7*/ 1, [6]},
                          {/* 8*/ 1, [15]},
                          {/* 9*/ 5, [14, 12, 11, 10, 8]},
                          {/*10*/ 1, [15]},
                          {/*11*/ 1, [15]},
                          {/*12*/ 1, [15]},
                          {/*13*/ 1, [9]},
                          {/*14*/ 1, [15]},
                          {/*15*/ 1, [7]}];

    uint position = uniform(0, graph.length, rng);
    graph[position].hitCount++;

    foreach (immutable _; 1 .. nVisits / nUnroll)
        foreach (immutable __; Iota!nUnroll) {
            if (graph[position].nOutArcs == 0 || uniform(0.0, 1.0, rng) < restartProbability)
                position = uniform(0, graph.length, rng);
            else
                position = graph[position].outbound[uniform(0, graph[position].nOutArcs, rng)];
            graph[position].hitCount++;
        }

    foreach (immutable i, immutable node; graph)
        writefln("%2d %2.0f", i, round(100.0 * node.hitCount / cast(real)nVisits));
}


The program performs twenty million jumps on the graph. The graph is represented with an adjacency array. I have used a fixed-size 2D matrix that contains some holes: it has five columns, but only node 9 has five outbound arcs. Each Node must contain the real number of outbound arcs, plus a counter of the visits.

There is a more compact representation for this graph: a single linear array with no holes that contains the indexes of the outbound arcs, plus a little extra array with the start index and arc count of each node. But this representation is more complex, and I prefer simpler code.
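As an illustration, here is a minimal Python sketch of that compact layout (a CSR-style adjacency representation; the names are mine, not from the D code):

```python
# CSR-style compact adjacency representation of the same 16-node graph:
# all outbound arcs in one flat array, plus a start-offset array.
arcs = []
starts = [0]
graph = [[13, 9, 8], [2], [0, 8], [1], [1, 2], [2, 8], [3, 4, 5], [6],
         [15], [14, 12, 11, 10, 8], [15], [15], [15], [9], [15], [7]]
for outbound in graph:
    arcs.extend(outbound)
    starts.append(len(arcs))

def out_arcs(node):
    # Outbound arcs of a node in O(1), with no holes in the storage.
    return arcs[starts[node]:starts[node + 1]]

print(out_arcs(9))  # the five arcs of node 9: [14, 12, 11, 10, 8]
```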

The Iota!nUnroll is used to unroll the loop 8 times, to speed up the walk a little.

To improve the performance of the code I have created a second version of the D program. It uses a similar data structure, but the counts are kept in a separate array.

This program performs the random walk by jumping between the nodes using goto, so the current state is kept in the program counter. The finite state machine is implemented with code generated at compile time from the graph data structure:

// graph_walk2.d
import std.stdio, std.random, std.algorithm, std.math;

alias NodeIndex = int;

enum nVisits = 20_000_000;
enum restartProbability = 0.15;
enum nMaxOutArcs = 5;
enum nNodes = 16;

struct Node {
    immutable uint nOutArcs;
    immutable int[nMaxOutArcs] outbound;
}

string genSwitchCode(in int nMax) /*pure*/ {
    string switchCode = "switch (position) {\n";
    foreach (immutable i; 0 .. nMax)
        switchCode ~= format("    case %d: goto NODE%d;\n", i, i);
    switchCode ~= "    default: assert(0);\n";
    switchCode ~= "}\n";
    return switchCode;
}

string genNodeCode(in int nMax, in string switchCode) /*pure*/ {
    string nodeCode;
    foreach (immutable i; 0 .. nMax)
        nodeCode ~= format("
        NODE%d:
            hits[%d]++;
            steps++;
            if (steps >= nVisits) goto END;
            if (uniform(0.0, 1.0, rng) < restartProbability)
                position = uniform(0, graph.length, rng);
            else
                position = graph[%d].outbound[uniform(0, graph[%d].nOutArcs, rng)];
            %s
            ", i, i, i, i, switchCode);
    return nodeCode;
}

void main() {
    auto rng = Xorshift(1);

    static enum Node[nNodes] graph = [
        {/* 0*/ 3, [13, 9, 8]},
        {/* 1*/ 1, [2]},
        {/* 2*/ 2, [0, 8]},
        {/* 3*/ 1, [1]},
        {/* 4*/ 2, [1, 2]},
        {/* 5*/ 2, [2, 8]},
        {/* 6*/ 3, [3, 4, 5]},
        {/* 7*/ 1, [6]},
        {/* 8*/ 1, [15]},
        {/* 9*/ 5, [14, 12, 11, 10, 8]},
        {/*10*/ 1, [15]},
        {/*11*/ 1, [15]},
        {/*12*/ 1, [15]},
        {/*13*/ 1, [9]},
        {/*14*/ 1, [15]},
        {/*15*/ 1, [7]}];
    uint[graph.length] hits;
    uint steps = 0;

    uint position = uniform(0, graph.length, rng);
    hits[position]++;
    steps++;

    enum switchCode = genSwitchCode(graph.length);
    enum nodeCode = genNodeCode(graph.length, switchCode);
    mixin(switchCode);
    mixin(nodeCode);

    END:
    foreach (immutable i, immutable h; hits)
        writefln("%2d %2.0f", i, round(100.0 * h / cast(real)nVisits));
}

genSwitchCode() generates a simple jump table like this:

switch (position) {
    case 0: goto NODE0;
    case 1: goto NODE1;
    case 2: goto NODE2;
    case 3: goto NODE3;
    case 4: goto NODE4;
    case 5: goto NODE5;
    case 6: goto NODE6;
    case 7: goto NODE7;
    case 8: goto NODE8;
    case 9: goto NODE9;
    case 10: goto NODE10;
    case 11: goto NODE11;
    case 12: goto NODE12;
    case 13: goto NODE13;
    case 14: goto NODE14;
    case 15: goto NODE15;
    default: assert(0);
}


genNodeCode() generates the whole finite state machine; each generated node looks like this:

NODE0:
hits[0]++;
steps++;
if (steps >= nVisits) goto END;
if (uniform(0.0, 1.0, rng) < restartProbability)
    position = uniform(0, graph.length, rng);
else
    position = graph[0].outbound[uniform(0, graph[0].nOutArcs, rng)];
switch (position) {
    case 0: goto NODE0;
    case 1: goto NODE1;
    case 2: goto NODE2;
    case 3: goto NODE3;
    case 4: goto NODE4;
    case 5: goto NODE5;
    case 6: goto NODE6;
    case 7: goto NODE7;
    case 8: goto NODE8;
    case 9: goto NODE9;
    case 10: goto NODE10;
    case 11: goto NODE11;
    case 12: goto NODE12;
    case 13: goto NODE13;
    case 14: goto NODE14;
    case 15: goto NODE15;
    default: assert(0);
}


The LLVM back-end of the LDC2 compiler seems able to recognize this kind of jump table, and replaces it with an indirect jump (like a jmp *%eax instruction), so this machine is quick. Most of the time is spent generating random numbers.

To see if the computed gotos of GNU C (http://gcc.gnu.org/onlinedocs/gcc-3.2/gcc/Labels-as-Values.html ) lead to better performance, I have translated the D code to GNU C99:

// graph_walk3.c
#include "stdio.h"
#include "stdlib.h"
#include "time.h"

#define nVisits 20000000
#define restartProbability 0.15
#define nMaxOutArcs 5
#define nNodes 16

inline static double uniform01() {
    return rand() / (double)RAND_MAX;
}

inline static int uniform(const int max) {
    int k = (int)(max * (rand() / (double)RAND_MAX));
    return (k == max) ? k - 1 : k;
}

typedef struct {
    unsigned int nOutArcs;
    void *outbound[nMaxOutArcs];
} Node;


#define NODE_BLOCK(N)                                                  \
    do {                                                               \
        NODE##N:                                                       \
            hits[N]++;                                                 \
            steps++;                                                   \
            if (__builtin_expect(steps >= nVisits, 0)) goto END;       \
            if (__builtin_expect(uniform01() < restartProbability, 0)) \
                goto *NodeAddresses[uniform(nNodes)];                  \
            else                                                       \
                goto *graph[N].outbound[uniform(graph[N].nOutArcs)];   \
    } while (0)


int main() {
    srand(time(NULL));
    static const void *NodeAddresses[] = {
        &&NODE0,  &&NODE1,  &&NODE2,  &&NODE3,
        &&NODE4,  &&NODE5,  &&NODE6,  &&NODE7,
        &&NODE8,  &&NODE9,  &&NODE10, &&NODE11,
        &&NODE12, &&NODE13, &&NODE14, &&NODE15};

    static const Node graph[nNodes] = {
        /* 0*/ {3, {&&NODE13, &&NODE9, &&NODE8}},
        /* 1*/ {1, {&&NODE2}},
        /* 2*/ {2, {&&NODE0, &&NODE8}},
        /* 3*/ {1, {&&NODE1}},
        /* 4*/ {2, {&&NODE1, &&NODE2}},
        /* 5*/ {2, {&&NODE2, &&NODE8}},
        /* 6*/ {3, {&&NODE3, &&NODE4, &&NODE5}},
        /* 7*/ {1, {&&NODE6}},
        /* 8*/ {1, {&&NODE15}},
        /* 9*/ {5, {&&NODE14, &&NODE12, &&NODE11, &&NODE10, &&NODE8}},
        /*10*/ {1, {&&NODE15}},
        /*11*/ {1, {&&NODE15}},
        /*12*/ {1, {&&NODE15}},
        /*13*/ {1, {&&NODE9}},
        /*14*/ {1, {&&NODE15}},
        /*15*/ {1, {&&NODE7}}};

    unsigned int hits[nNodes] = {0};
    unsigned int steps = 0;
    goto *NodeAddresses[uniform(nNodes)];

    NODE_BLOCK(0);
    NODE_BLOCK(1);
    NODE_BLOCK(2);
    NODE_BLOCK(3);
    NODE_BLOCK(4);
    NODE_BLOCK(5);
    NODE_BLOCK(6);
    NODE_BLOCK(7);
    NODE_BLOCK(8);
    NODE_BLOCK(9);
    NODE_BLOCK(10);
    NODE_BLOCK(11);
    NODE_BLOCK(12);
    NODE_BLOCK(13);
    NODE_BLOCK(14);
    NODE_BLOCK(15);

    END:
    for (int i = 0; i < nNodes; i++)
        printf("%2d %2.1f\n", i, 100.0 * hits[i] / (double)nVisits);

    return 0;
}

I have used C macros to keep the C code about as short as the D code.

The "do { } while (0)" is a useful trick that makes each macro call require a trailing semicolon.

The "NODE##N:" token-pasting syntax generates the numbered jump labels from the NODE_BLOCK() macro argument N.

Now there is no need for jump tables; there are just gotos to the stored label addresses. I think this is one of the fastest ways to implement a hard-coded finite state machine.

GCC 4.8.0 generates code like this for each node:

L14:
    addl $1, %ebx
    addl $1, 20(%esp)
    cmpl $19999999, %ebx
    ja L15
    call _rand
    movsd LC1, %xmm1
    cvtsi2sd %eax, %xmm0
    mulsd LC0, %xmm0
    comisd %xmm0, %xmm1
    ja L125
    call _rand
    cvtsi2sd %eax, %xmm0
    xorl %eax, %eax
    mulsd LC0, %xmm0
    cvttsd2si %xmm0, %edx
    cmpl $1, %edx
    cmovne %edx, %eax
    movl _graph.4128+28(,%eax,4), %eax
    jmp *%eax


The Intel compiler ICC 13.0.1 instead splits the code into a fast zone and a slow zone. I have not timed this, but I think it could be a little more efficient:
    incl      %ebx
    incl      40(%rsp)
    cmpl      $20000000, %ebx
    jae       ..B1.108      # Prob 4%
    call      uniform01()
    comisd    .L_2il0floatpacket.20(%rip), %xmm0
    jbe       ..B1.139      # Prob 18%
    movl      $1, %edi
    call      uniform(int)
    movslq    %eax, %rax
    movq      488+graph.351.0.3(,%rax,8), %rax
    jmp       *%rax
...


..B1.139:
    movl      $16, %edi
    call      uniform(int)
    movslq    %eax, %rax
    movq      NodeAddresses.351.0.3(,%rax,8), %rax
    jmp       *%rax


Run-time in seconds for nVisits = 20_000_000:
  graph_walk1: 1.35 (nUnroll = 8)
  graph_walk2: 1.14
  graph_walk3: 1.08

Compilation arguments used:
GCC:
-Wall -Wextra -std=c99 -mfpmath=sse -march=native -Ofast -flto -s graph_walk3.c -o graph_walk3
LDC2 (ldmd2):
-O -release -inline -noboundscheck

Compilers:
GCC V.4.8.0
LDC2 V.0.12.1 based on DMD v2.063.2 and LLVM 3.3.1.

The code shown in this post:
http://www.fantascienza.net/leonardo/js/graph_walk.zip

Subject:Coin game probabilities
Time:01:53 pm
A simple programming problem, here solved in Matlab:
http://blogs.mathworks.com/pick/2008/03/31/puzzler-coin-game/

There is a board composed of four squares arranged in a circle: when you are on the last square and you move one step forward, you go back to the first square.

Starting at the first square, flip two coins at each move. If the coins are:
- Head Head: move two squares forward;
- Head Tail or Tail Head: move one square forward;
- Tail Tail: go back to the first square.

With such rules, what is the probability of being on each of the four squares after twenty moves?

A first simple way to find the probabilities numerically is by simulation, in D language:

import std.stdio, std.random, std.algorithm, std.traits,
       std.bigint, std.exception;

enum uint nSquares = 4;

uint move(uint pos, ref Xorshift rng) {
    final switch (rng.front % nSquares) {
        case 0, 1: // Move one square forward.
            pos = (pos + 1) % nSquares;
            break;
        case 2: // Move two squares forward.
            pos = (pos + 2) % nSquares;
            break;
        case 3: // Go to the first square.
            pos = 0;
            break;
    }
    rng.popFront;
    return pos;
}

void main() {
    enum nTurns = 20,
         nTries = 100_000_000;

    auto rng = Xorshift(unpredictableSeed);
    enforce(isUnsigned!(typeof(rng.front)) &&
            (rng.front.max.BigInt + 1) % nSquares == 0,
            "move() needs to use std.random.uniform().");

    uint[nSquares] finalPosCounts;
    foreach (immutable _; 0 .. nTries) {
        uint pos = 0; // Starting position.
        foreach (immutable __; 0 .. nTurns)
            pos = move(pos, rng);

        finalPosCounts[pos]++;
    }

    finalPosCounts[].map!(x => x / cast(real)nTries).writeln;
}

I have used rng.front % nSquares instead of std.random.uniform() because, compiled with the LDC2 compiler, it's almost twice as fast.
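The enforce() above guards against modulo bias: value % n is only uniform when the number of possible generator values is divisible by n. A tiny Python illustration with a hypothetical 6-value generator and n = 4 (my own example):

```python
from collections import Counter

# A toy generator that emits the values 0..5 uniformly; since 6 % 4 != 0,
# taking (value % 4) over-represents the residues 0 and 1.
counts = Counter(v % 4 for v in range(6))
print(counts)  # residues 0 and 1 appear twice, 2 and 3 only once
```

With a 32-bit Xorshift generator, max + 1 is a power of two, so the bias disappears for nSquares = 4, which is exactly what the enforce() verifies.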

If I compile that first program with:
ldmd2 -O -release -inline -noboundscheck

LDC2 produces the following good 32 bit asm from the two loops of the main():

LBB1_1:
    xorl    %esi, %esi
    xorl    %ecx, %ecx
    .align  16, 0x90
LBB1_2:
    movl    %edi, %edx
    movl    %esi, %edi
    movl    %edx, %ebx
    andl    $3, %ebx
    xorl    %esi, %esi
    cmpl    $3, %ebx
    je  LBB1_7
    cmpl    $2, %ebx
    je  LBB1_13
    cmpl    $1, %ebx
    ja  LBB1_14
    incl    %edi
    jmp LBB1_6
    .align  16, 0x90
LBB1_13:
    addl    $2, %edi
LBB1_6:
    andl    $3, %edi
    movl    %edi, %esi
LBB1_7:
    movl    48(%esp), %ebx
    movl    52(%esp), %edi
    movl    %edi, 48(%esp)
    movl    56(%esp), %edi
    movl    %edi, 52(%esp)
    movl    %ebx, %edi
    shll    $11, %edi
    xorl    %ebx, %edi
    movl    %edx, 56(%esp)
    movl    %edx, %ebx
    shrl    $19, %ebx
    xorl    %edx, %ebx
    xorl    %edi, %ebx
    shrl    $8, %edi
    xorl    %ebx, %edi
    movl    %edi, 60(%esp)
    incl    %ecx
    cmpl    $20, %ecx
    jl  LBB1_2
    incl    32(%esp,%esi,4)
    incl    %eax
    cmpl    $100000000, %eax
    jl  LBB1_1


A different algorithm, as explained in the linked article, is to raise the transition probability matrix to a power (here the matrix multiplication is implemented in a slow, high-level way):

import std.stdio, std.range, std.array, std.numeric, std.algorithm;

T[][] mul(T)(in T[][] A, in T[][] B) {
    const Bt = B[0].length.iota.map!(i=> B.transversal(i).array).array;
    return A.map!(a => Bt.map!(b => a.dotProduct(b)).array).array;
}

void main() {
    real[][] M = [[1, 2, 1, 0], [1, 0, 2, 1], [2, 0, 0, 2], [3, 1, 0, 0]];
    foreach (row; M)
        row[] *= 0.25;
    auto P = M.mul(M);
    foreach (immutable _; 2 .. 20)
        P = P.mul(M);

    P[0].writeln;
}

One run of the simulation (nTries = 100_000_000, runtime less than 21 seconds) gives:
[0.386108, 0.234545, 0.213841, 0.165506]

The second program gives a similar output:
[0.386203, 0.234484, 0.213797, 0.165516]

Running the first program with (ulong nTries = 10_000_000_000) in less than an hour gives:
[0.386202, 0.234477, 0.213796, 0.165525]
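As a cross-check of these numbers, the same matrix powering can be sketched in a few lines of Python (standard library only; the function names are mine):

```python
# Raise the 4x4 transition matrix to the 20th power and read row 0:
# the distribution after 20 moves starting from the first square.
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

M = [[x * 0.25 for x in row]
     for row in [[1, 2, 1, 0], [1, 0, 2, 1], [2, 0, 0, 2], [3, 1, 0, 0]]]

P = M
for _ in range(19):  # P = M ^ 20
    P = mat_mul(P, M)

print([round(x, 6) for x in P[0]])  # close to the D outputs above
```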

The code:
http://www.fantascienza.net/leonardo/js/index.html#table_probs

Subject:Better C++/D code
Time:03:50 pm
When I read "The Little Schemer" I did countless small coding exercises using recursion:
https://github.com/pkrumins/the-little-schemer

But later, when I started learning Haskell, I saw that explicit recursion is uncommon in good Haskell code. Most of the time in Haskell you use maps, folds, and other higher-order functions instead of recursion. (Sometimes explicit recursion is used in Haskell as an optimization.)

C and C++ code contains lots of 'for' and 'while' loops, which are sometimes hard to read and understand.


Recently the videos and slides of the talks of the GoingNative 2013 conference have been released. Among the interesting talks, the one I have appreciated most is "C++ Seasoning" by Sean Parent (and from the comments I see that others agree with me, and even ask the speaker to write a book on the topic):
http://channel9.msdn.com/Events/GoingNative/2013/Cpp-Seasoning

I suggest starting with the updated slides:
https://github.com/sean-parent/sean-parent.github.com/wiki/presentations/2013-09-11-cpp-seasoning/cpp-seasoning.pdf

The talk gives some suggestions for C++ code:
- No Raw Loops
- No Raw Synchronization Primitives
- No Raw Pointers

The first part suggests reducing the number of loops in C++ code and replacing them with algorithms (often STL algorithms), a little like it's done in Haskell.

C++ is an abstraction over C, and the heavy usage of STL-like algorithms is kind of a new language that abstracts over raw loop-heavy C++ code. Writing code like that in C++ takes time and some brain power, but the resulting code is good, short, simpler to keep bug-free, and often efficient too.
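To make the "No Raw Loops" idea concrete outside C++, here is a tiny illustrative sketch in Python (my own example, not from the talk):

```python
words = ["alpha", "be", "gamma", "de"]

# Raw-loop version: the reader must decode the intent from the mechanics.
longest = words[0]
for w in words[1:]:
    if len(w) > len(longest):
        longest = w

# Algorithm version: the intent is in the name of the algorithm.
longest2 = max(words, key=len)

print(longest, longest2)  # both find "alpha", the first longest word
```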

In D I suggest using std.algorithm for similar purposes. I also suggest making all variables const/immutable, unless they need to mutate, or unless something in D or in Phobos makes that impossible (like ranges, which often can't be const).

Walter Bright calls a variant of this coding style, one that relies heavily on ranges, "component programming". I have shown several examples of such linear-flow programming in past blog posts (a style that is becoming very common today: LINQ in C#, streams in Java 8, Python itertools, F# pipelines, and so on).

Here is another simple example, Euler Problem number 50 (http://projecteuler.net/problem=50 ). The problem statement:
The prime 41, can be written as the sum of six consecutive primes: 41 = 2 + 3 + 5 + 7 + 11 + 13
This is the longest sum of consecutive primes that adds to a prime below one-hundred.
The longest sum of consecutive primes below one-thousand that adds to a prime, contains 21 terms, and is equal to 953.
Which prime, below one-million, can be written as the sum of the most consecutive primes?
This is a simple imperative solution in D (modified and ported from Python code found here: http://blog.dreamshire.com/2009/04/12/project-euler-problem-50-solution/ ):
import std.stdio, std.algorithm, std.range, std.functional;
...

int euler50(in int top) {
    /*const*/ auto primes = top.sieve.assumeSorted;

    int[] prime_sum = [0];
    int tot = 0, count = 0;
    while (tot < top) {
        tot += primes[count];
        prime_sum ~= tot;
        count++;
    }

    int terms = 1;
    int max_prime = -1;
    foreach (immutable i; 0 .. count)
        foreach (immutable j; i + terms .. count) {
            immutable n = prime_sum[j] - prime_sum[i];
            if (j - i > terms && primes.contains(n)) {
                terms = j - i;
                max_prime = n;
            }
        }
    return max_prime;
}

void main() {
    1_000_000.euler50.writeln; // 997651
}
The idea is to compute a cumulative sum of the prime numbers, and then scan the subintervals: subtracting the cumulative sum at the start of an interval from the cumulative sum at its end gives the sum of its items in O(1). The contains() function is a linear search, so this code is roughly O(n^3), but for the given problem it's fast enough, finding the correct result of 997651 in less than half a second.
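The cumulative-sum trick in isolation can be sketched in Python, using the problem's own six primes:

```python
from itertools import accumulate

primes = [2, 3, 5, 7, 11, 13]
# prime_sum[i] holds the sum of the first i primes.
prime_sum = [0] + list(accumulate(primes))

# Sum of primes[i .. j-1] in O(1): prime_sum[j] - prime_sum[i].
print(prime_sum[6] - prime_sum[0])  # 41, the six-term sum from the problem
print(prime_sum[4] - prime_sum[1])  # 3 + 5 + 7 = 15
```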

A very simple sieve implementation used in that euler50() function could be:

uint[] sieve(in uint limit) {
    if (limit < 2)
        return [];
    auto composite = new bool[limit];

    foreach (immutable uint n; 2 .. cast(uint)(limit ^^ 0.5) + 1)
        if (!composite[n])
            for (uint k = n * n; k < limit; k += n)
                composite[k] = true;

    return iota(2, limit).filter!(i => !composite[i]).array;
}

If you take a look at the euler50() function you see some general patterns. The cumulative sum of an array can be written as a general function (even better is to write it as a lazy range that works on any input range):
T[] accumulate(T)(in T[] items) pure nothrow {
    auto result = new T[items.length];
    if (!items.empty) {
        result[0] = items[0];
        foreach (immutable i; 1 .. items.length)
            result[i] = result[i - 1] + items[i];
    }
    return result;
}

The iteration on all the index pairs could be done with a handy "Pairwise" range (imports not included):
struct Pairwise(Range) {
    alias R = Unqual!Range;
    alias Pair = Tuple!(ForeachType!R, "a", ForeachType!R, "b");
    R _input;
    size_t i, j;

    this(Range r_) {
        this._input = r_;
        j = 1;
    }

    @property bool empty() {
        return j >= _input.length;
    }

    @property Pair front() {
        return Pair(_input[i], _input[j]);
    }

    void popFront() {
        if (j >= _input.length - 1) {
            i++;
            j = i + 1;
        } else {
            j++;
        }
    }
}

Pairwise!Range pairwise(Range)(Range r)
if (isRandomAccessRange!Range && hasLength!Range && !isNarrowString!Range) {
    return typeof(return)(r);
}
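For comparison, the same index pairs can be produced in Python with itertools.combinations, in the same order as the Pairwise range above:

```python
from itertools import combinations

# All index pairs (i, j) with i < j over four elements,
# in the order the Pairwise range yields them.
pairs = list(combinations(range(4), 2))
print(pairs)  # [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
```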

Then what's left is a search among all the subsequences, throwing away the ones that are too long and the ones that don't sum to a prime number. Also, the pre-computed array of primes is sorted, so it's better to use a binary search. Now we can organize the whole computation in a few simple chains:
int euler50b(in int top) {
    auto primes = top.sieve.assumeSorted;
    auto primeSum = primes.release.accumulate;

    return primeSum
           .until!(p => p > top)
           .walkLength
           .iota
           .pairwise
           .map!(ij => tuple(ij[1] - ij[0], primeSum[ij[1]] - primeSum[ij[0]]))
           .filter!(dn => primes.contains(dn[1]))
           .reduce!max[1];
}
This code works and runs in about 0.3-0.4 seconds, so it's still fast enough.

This euler50b() function has several problems:
- A range like pairwise() is not yet present in std.range;
- Something like accumulate() (with a given reducing function) is missing in std.algorithm;
- In D there is no syntax to destructure (unpack) tuples, so you have to use things like: ij => tuple(ij[1] - ij[0] ...
- std.algorithm.max is not yet accepting a key function.
- Currently "primes" can't be immutable or even const because you can't call the "release" method on a const SortedRange;
- Currently "primeSum" can't be immutable otherwise reduce() doesn't work.
- Currently "iota" is not pure nor nothrow.

This is how the main function could look once D and Phobos become more refined:
int euler50c(in int top) pure nothrow {
    immutable primes = top.sieve.assumeSorted;
    immutable primeSum = primes.accumulate!q{a + b};

    return primeSum
           .until!(p => p > top)
           .walkLength
           .iota
           .pairwise
           .map!({i, j} => tuple(j - i, primeSum[j] - primeSum[i]))
           .filter!({d, n} => primes.contains(n))
           .reduce!(max!({d, n} => d))[1];
}

The full code of the two versions:
http://www.fantascienza.net/leonardo/js/euler50.zip

Subject:A little Morse Code decoder in D
Time:11:19 pm
A little Morse Code decoder in D language, adapted from here:
http://apfelmus.nfshost.com/articles/fun-with-morse-code.html
void main() {
    import std.stdio, std.string, std.array, std.algorithm, std.conv;

    "-- --- .-. ... .    -.-. --- -.. ."
    .split("    ")
    .map!(w => w
               .split
               .map!(s => " ETIANMSURWDKGOHVF L PJBXCYZQ  "
                          [reduce!q{2 * a + 1 + (b & 1)}(0, s)])
               .text)
    .join(" ")
    .writeln;
}
It prints: MORSE CODE
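To show the trick more explicitly, here is a direct Python port (my own transcription): the table string is a breadth-first flattening of the Morse binary tree, and each dot or dash walks one level down.

```python
# Breadth-first flattening of the Morse binary tree: from node i,
# a dot '.' moves to child 2*i + 1 and a dash '-' to child 2*i + 2.
TABLE = " ETIANMSURWDKGOHVF L PJBXCYZQ  "

def decode(message):
    words = message.split("    ")  # four spaces separate words
    result = []
    for word in words:
        letters = []
        for symbol in word.split():
            i = 0
            for c in symbol:  # '.' has an even ASCII code, '-' an odd one
                i = 2 * i + 1 + (ord(c) & 1)
            letters.append(TABLE[i])
        result.append("".join(letters))
    return " ".join(result)

print(decode("-- --- .-. ... .    -.-. --- -.. ."))  # MORSE CODE
```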

Subject:Updates and links
Time:10:45 pm
A lot of time has passed since the last update.

News:

Solutions of the first three problems of the qualifications of the Google Code Jam 2013:
http://www.fantascienza.net/leonardo/js/gcj13.rar

A D solution for the Euler problem n.395, translated from Python:
http://www.fantascienza.net/leonardo/js/index.html#euler395

Updated code in the zip file:
http://www.fantascienza.net/leonardo/ar/d_notes2.html

Updated and improved (now "foo-bar" is digested as "foo bar" instead of "foobar"):
http://www.fantascienza.net/leonardo/js/index.html#markov

Added the "Spectral-norm" benchmark using SIMD:
http://www.fantascienza.net/leonardo/js/index.html#spectralnorm_simd

Updated path tracing demo:
http://www.fantascienza.net/leonardo/js/index.html#path_tracing

Added a De Bruijn sequence generator in D:
http://www.fantascienza.net/leonardo/js/index.html#de_bruijn_sequence

Added two photos of a plush toy of the Pegasus pony Bluespark:
http://www.fantascienza.net/leonardo/im/index.html#bluespark_toy


Links:

A nice example of not-so-good functional programming, and a better solution:
http://pragprog.com/magazines/2013-03/uncle-bob-and-functional-programming

A talk video, Simon Peyton Jones on Adventures with types, nice and interesting:
http://skillsmatter.com/podcast/java-jee/keynote-3842
The slides:
http://laser.inf.ethz.ch/2012/slides/PeytonJones/Adventures%20with%20types.pdf
All the slides of the interesting LASER 2013 conference, including more slides from Simon Peyton Jones:
http://laser.inf.ethz.ch/2012/slides/

Many interesting slides from the CPPNOW 2013 conference:
https://github.com/boostcon/cppnow_presentations_2013

A simple optimization story:
http://users.softlab.ntua.gr/~ttsiod/straylight.html

Some things to know about tin:
http://www.youtube.com/watch?v=Rh8yWyuoL2A

http://sealedabstract.com/rants/why-mobile-web-apps-are-slow/

A simple intro to finite state machines for games:
http://gameprogrammingpatterns.com/state.html

Less Wrong's / Eliezer Yudkowsky's Harry Potter fanfic-in-progress, "Harry Potter and the Methods of Rationality", it's a very long and very good fanfiction novel that introduces ways to think rationally:
http://hpmor.com/

"Flutterwonder", a 3D musical video about the little pony Fluttershy:
http://www.youtube.com/watch?&v=iQYrFecvCGo

Subject:A word counting problem
Time:04:31 pm
A little programming problem from this blog post:
http://www.leancrew.com/all-this/2011/12/more-shell-less-egg/

The problem is:
Read a file of text, determine the n most frequently used words, and print out a sorted list of those words along with their frequencies.
A solution with shell scripting:
Here's the script that does that, with each command given its own line:
bash:
1:  tr -cs A-Za-z '\n' |
2:  tr A-Z a-z |
3:  sort |
4:  uniq -c |
5:  sort -rn |
6:  head -${1}
Explanation:
1) Make one-word lines by transliterating the complement (-c) of the alphabet into newlines (note the quoted newline), and squeezing out (-s) multiple newlines.
2) Transliterate upper case to lower case.
3) Sort to bring identical words together.
4) Replace each run of duplicate words with a single representative and include a count (-c).
5) Sort in reverse (-r) numeric (-n) order.
6) Print only the first ${1} lines (the number designated by the script's first parameter).

Even if data-filtering programs for these exact purposes were not at hand, these operations would be well worth implementing separately: for separation of concerns, for easier development, for piecewise debugging, and for potential reuse. The simple pipeline given above will suffice to get answers right now, not next week or next month. It could well be enough to finish the job.
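For convenience, the six numbered stages can be saved as one runnable script (the script and file names here are hypothetical):

```shell
#!/bin/sh
# wordfreq.sh: print the N most frequent words of stdin, N given as $1.
# The same six stages as above, joined into a single pipeline.
tr -cs A-Za-z '\n' |   # 1) one word per line
tr A-Z a-z |           # 2) lowercase everything
sort |                 # 3) bring identical words together
uniq -c |              # 4) collapse runs, prefixing a count
sort -rn |             # 5) most frequent first
head -"${1}"           # 6) keep the top N lines
```

Usage: sh wordfreq.sh 10 < test_data.txt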
Some comments to that blog post:
Andre is quite right that this is an example of the elegance of FP vs. imperative -- but the flip side is that FP often fails when it runs up against real-world edge cases. Knuth's version may not handle contractions -- but it could, easily. Could McIlroy's? What if I want British/US spellings (such as "color" and "colour") to be counted as one word? What about hyphenated words?
20+ years later, we still don't have sed or awk available as a library, afaik. We also don't have a tr-like function which works on streams, or a sort for text files in standard libraries. Which is why you simply can't use it when you write something that isn't completely solvable via shell scripting - it doesn't get compiled in, and your team leader/senior programmer/whatever will definitely not allow you to call shell commands from C, C++ or Pascal, even when it pays off.
This is a modern Haskell solution (from the blog post http://franklinchen.com/blog/2011/12/08/revisiting-knuth-and-mcilroys-word-count-programs/ with several small changes; it can probably be improved further):
import System.Environment (getArgs)
import Data.List (sortBy)
import Data.Char (toLower, isLetter)
import qualified Data.HashMap.Strict as HM

createReport :: Int -> String -> String
createReport n text =
    unlines $
    map (\(count, w) -> w ++ " " ++ show count) $
    take n $
    sortBy (flip compare) $
    map (\(w, count) -> (count, w)) $
    HM.toList $
    HM.fromListWith (+) $
    map (\w -> (w, 1)) $
    words $
    map toLower $
    map (\c -> if isLetter c then c else '\n') $
    text

main = do
    [fileName, nstr] <- getArgs
    let n = read nstr
    text <- readFile fileName
    putStr $ createReport n text
Usage: runhaskell wfreqh.hs test_data.txt 10

A comment from that blog post:
The shell script operates on raw text and everything is just strings being parsed and reparsed by the respective Unix utility programs.

The Haskell program is statically typed. It is type-checked by the compiler, which generates native code. The program uses standard libraries and data types, such as lists and hash maps.

Also, the Haskell program could be refined, extended, and optimized in various ways. The most important optimizations I can think of off the top of my head:
- Using a better representation of strings than the default built-in "string as list of characters". Easily accessible advice can be found on Stack Overflow and by browsing Haskell library documentation, such as that of the text package.
- Loop fusion and deforestation can be applied to remove the apparent allocation of many intermediate lists in the pipeline. One of the selling points of a language like Haskell is the opportunity for the compiler to perform radical optimizations that are impossible for languages with side effects.

I don't write many bash scripts these days. General-purpose programming languages can do a decent job of munging data without difficulty. The situation was different decades ago, when there was C and few standard high-level libraries in the C world.
This is a D language version (it requires no installation of extra libraries, unlike the unordered-containers package from Cabal needed by the Haskell program). It works with Unicode text; if you only have ASCII text the code becomes simpler. The Haskell version above seems to have problems with Unicode text (on the other hand, don't expect the D code to perform a fully correct handling of Unicode sorting; that is a complex problem).
import std.stdio, std.conv, std.file, std.string,
       std.algorithm, std.range;

void main(string[] args) {
    args[1]
    .readText
    .toLower
    .tr("A-Za-z", "\n", "cs")
    .split
    .sort()
    .group
    .array
    .schwartzSort!q{ -a[1] }
    .take(args[2].to!uint)
    .map!q{ text(a[0], " ", a[1]) }
    .join("\n")
    .writeln;
}
D version: 17 nonblank lines, Haskell: 23 lines.
(The sort function is called with () to avoid invoking the built-in array sort property, which will be deprecated.)
File sizes:
9.005.142 test_data.txt
      365 wfreqd.d
      701 wfreqh.hs
1.565.710 wfreqh.exe (stripped)
  510.492 wfreqd.exe
Haskell code compiled with:
ghc -O3 wfreqh.hs
Glasgow Haskell Compiler, Version 7.6.1, stage 2 booted by GHC version 7.4.2

D code compiled with:
dmd -O -release -inline -noboundscheck wfreqd.d
DMD Version 2.064alpha.
Runtime, best of 3 (top 100 words):
  wfreqh: 7.45 seconds
  wfreqd: 6.24 seconds
Tests performed on 32-bit Windows, using an English text file of about 9 MB.

In theory the shell program is fully lazy (even the sort can be done with an external sorting algorithm), while the D (and probably the Haskell) versions load the whole file in memory.
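On that point: GNU sort does implement an external merge sort, spilling sorted runs to temporary files when its memory buffer fills, so the shell pipeline can in principle handle inputs larger than RAM. A sketch (the -S buffer-size and -T temp-directory options are GNU extensions; big_input.txt is a placeholder name):

```shell
# Cap sort's in-memory buffer at 1 MB; it writes sorted runs to /tmp and
# merges them, so the input never needs to fit in memory all at once.
sort -S 1M -T /tmp big_input.txt | uniq -c | sort -rn | head -5
```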

The D and Haskell programs are strongly statically typed so they are able to catch several programming errors at compile-time.

The shell program uses text everywhere, so it has to convert numbers to text and back again at every stage, while the D and Haskell programs use lazy sequences of arbitrary types, which can be more powerful and save the time spent on textual serialization and deserialization. And unless floating-point values are serialized as hex floats, such textual conversions can also introduce some error.

If you have to perform some operation not present among the Unix tools, you can write a small program (in C or another language) that reads and writes text pipes, and call it like the other shell commands. In Haskell or D you just write a function (in D often a lazy range, usually a struct) and call it; this is simpler and more integrated, but less modular.
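To illustrate the shell side of that trade-off: a custom operation can be dropped into the pipeline as one more filter, here a hypothetical awk stage that discards words shorter than four letters before counting:

```shell
# Same pipeline as before, with an extra awk filter stage in the middle.
tr -cs A-Za-z '\n' |
tr A-Z a-z |
awk 'length($0) >= 4' |   # custom stage: keep only words of 4+ letters
sort |
uniq -c |
sort -rn |
head -"${1}"
```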

Update Jun 21 2013: wiz offers this Haskell version (gist.github.com/wiz/5826131 ):
{-# LANGUAGE TupleSections #-}

import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.IO as TL
import           Data.Char (isLetter)
import           Data.List (sortBy)
import           Data.Function (on)
import qualified Data.HashMap.Strict as HM
import System.Environment (getArgs)

collect = sortBy (flip compare `on` snd)
        . HM.toList
        . HM.fromListWith (+)
        . map (,1)
        . filter (not . TL.null)
        . TL.split (not . isLetter)
        . TL.toLower

display (word, count) = do
    TL.putStr word
    putStr ": "
    print count

main = do
    [fname, count] <- getArgs
    text <- TL.readFile fname
    mapM_ display $ take (read count) (collect text)

This Haskell version is slower than the preceding one; it takes about 10.8 seconds to run. The D code compiled with the ldc2 compiler (which optimizes better than dmd) takes 4.5 seconds (with ldc2 2.063 use .sort!q{ -a[1] < -b[1] } instead of schwartzSort, which was improved later in Phobos).

There are many ways to improve the performance of the D code, such as using a hash counter, which is missing in Phobos (runtime with ldc2 is about 3.17 seconds):
import std.stdio, std.conv, std.file, std.string,
       std.algorithm, std.range, std.traits;

auto hashCounter(R)(R items) if (isForwardRange!R) {
    size_t[ForeachType!R] result;
    foreach (x; items)
        result[x]++;
    return result.byKey.zip(result.byValue);
}

void main(string[] args) {
    args[1]
    .readText
    .toLower
    .tr("A-Za-z", "\n", "cs")
    .split
    .hashCounter
    .array
    .sort!q{ -a[1] < -b[1] }
    .take(args[2].to!uint)
    .map!q{ text(a[0], " ", a[1]) }
    .join("\n")
    .writeln;
}

Security:
Subject:Links and updates
Time:07:15 pm
A simple, nice idea for organizing a functional program in Clojure:
http://blog.getprismatic.com/blog/2013/2/1/graph-abstractions-for-structured-computation


A quite nice sequence of programming puzzles:
http://cplus.about.com/od/programmingchallenges/a/List-Of-Every-Programming-Contest.htm
And I have added some D code about Challenge 41 to my site:
http://www.fantascienza.net/leonardo/js/index.html#challenge41


Intro to atomic operations in C++:
http://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-1-of-2
http://channel9.msdn.com/Shows/Going+Deep/Cpp-and-Beyond-2012-Herb-Sutter-atomic-Weapons-2-of-2


About C++ axioms:
http://akrzemi1.wordpress.com/2012/01/11/concept-axioms-what-for/
http://akrzemi1.wordpress.com/2013/02/11/preconditions-part-ii/


A usage example of Liquid Haskell:
http://goto.ucsd.edu/~rjhala/liquid/haskell/blog/blog/2013/02/16/kmeans-clustering-I.lhs/
http://goto.ucsd.edu/~rjhala/liquid/haskell/blog/blog/2013/02/17/kmeans-clustering-II.lhs/


Basic info about iteration (part 3 is coming):
http://journal.stuffwithstuff.com/2013/01/13/iteration-inside-and-out/ 
http://journal.stuffwithstuff.com/2013/02/24/iteration-inside-and-out-part-2/

Security:
Subject:Links
Time:08:11 pm
This time mostly some recent good coding & computer science links.

Catalan numbers pop up in many different situations, so it's useful for a programmer to know them. This explanation is nice and very simple:
http://www.geometer.org/mathcircles/catalan.pdf


Rustan Leino, Microsoft Research - "Program Verification: Yesterday, Today, Tomorrow":
http://www.youtube.com/watch?&v=5t4WntcsZZo


More on verification:
http://whiley.org/2012/12/04/generating-verification-conditions-for-whiley/


"The Checker Framework: Custom pluggable types for Java", seems nice, but also annotation-heavy:
http://types.cs.washington.edu/checker-framework/current/checkers-manual.html


"Go at Google: Language Design in the Service of Software Engineering", about some of the design principles of the Go language:
http://talks.golang.org/2012/splash.article


Facebook NYC Tech Talk - Jordan DeLong "using cxx::types": using stronger typing is useful in C++:
http://www.reddit.com/r/cpp/comments/1571m9/facebook_nyc_tech_talk_jordan_delong_using/


"The algebra of algebraic data types" by Chris Taylor, talk at London HUG, a little mind-bending but good:
Video: https://www.youtube.com/watch?&v=YScIPA8RbVE
Slides: https://github.com/chris-taylor/LondonHUG/archive/master.zip


About the Rust language, but it's too short and doesn't explain a lot of things:
http://www.infoq.com/presentations/Rust


Algebra is useful again and again:
http://izbicki.me/blog/the-categorical-distributions-algebraic-structure


Simple article about external and internal iteration:
http://journal.stuffwithstuff.com/2013/01/13/iteration-inside-and-out/
