| ||||||||

By leonardo maffi, V.1.0, 1 Aug 2014. A nice programming exercise in Haskell by Harry Garrood, "Permutations: an exercise" (21 Jul 2014): http://harry.garrood.me/blog/permutatio http://www.reddit.com/r/haskell/comment Harry Garrood writes: We're going to use Haskell, because this is all about functions (in the mathematical sense), and so Haskell, being a functional programming language, is especially well suited to the job.Most of the code you below is in D language (with a C version at the end) because it's a good language to solve similar problems. The problem statement defines a card shuffling technique: 1) Take one card from the top of the deck and discard it into a second pile. Below I explain the varios D implementations, that you can find in the archive: http://www.fantascienza.net/leonard The fantascienza site is down, so you can temporarily find the programs here: perms1.hs: http://lpaste.net/24122596844896256 perms2.hs: http://lpaste.net/39030856752300032 perms3.d: http://dpaste.dzfl.pl/2eb987664dfd perms3b.d: http://dpaste.dzfl.pl/0aa48d395cb2 perms4.d: http://dpaste.dzfl.pl/0b0cb344ccce perms5.d: http://dpaste.dzfl.pl/9f929c2d064d perms6.d: http://dpaste.dzfl.pl/dbf1d3f19321 perms7.d: http://dpaste.dzfl.pl/a8eddf54067d perms7b.d: http://dpaste.dzfl.pl/8a9a1afc10f8 perms7c.d: http://dpaste.dzfl.pl/45ff250cafa7 perms8.d: http://dpaste.dzfl.pl/32614ae6430c perms9.c: http://dpaste.dzfl.pl/608dd06e44a5 I compile all the code on a 32 bit Windows using: dmd -O -release -inline -noboundscheck ghc -O3 gcc -Ofast -flto -std=c99 -Wall -Wextra dmd 2.066beta4 Glasgow Haskell Compiler, Version 7.6.1, stage 2 booted by GHC version 7.4.2 gcc 4.0.0- - - - - - - - - - - - - - - - - - - perms1.hs:perms2.hs:The perms1.hs and perms2.hs programs are adapted from the two solutions by Harry Garrood, with small changes. This is the original 'permutations' function: permutation :: (Deck -> Deck) -> Int -> PermutationGraph permutation f n = go 1 xs where xs = unCardAll . f . makeDeck $ n go m (y:ys) = (y, m) : go (m+1) ys go _ [] = [] This is my version: permutation :: (Deck -> Deck) -> Int -> PermutationGraph permutation shuf n = zip perm [1 ..] where perm = unCardAll $ shuf $ makeDeck nperms2.hs runs on the six test cases (n = 4, 5, 52, 53, 100, 200) in about 0.08 seconds (such small timings are not precise). - - - - - - - - - - - - - - - - - - - perms3.d:perms3b.d:This program is equivalent to the perms1.hs program. Harry Garrood defines a very nice and clean Haskell function that performs the two first steps of the shuffling: step :: (Deck, Deck) -> (Deck, Deck) step (deck, pile) = case deck of a:b:cs -> (cs ++ [b], a : pile) [a] -> ([], a : pile) [] -> ([], pile) I have implemented this 'step' function in various ways in D, and at the end this is the version I have used (here I have removed the pre/post-conditions, see the files for the full code), this performs the whole shuffling (including the third step): Deck deckShuffle(in Deck Ain) pure nothrow @safe @nogc { typeof(return) B; const(Card)[] A = Ain; while (!A.empty) { // Step 1: B = [Card(A.front)] ~ B; A.popFront; // Step 2: if (!A.empty) { A ~= A.front; A.popFront; } } return B; } This function is (strongly) pure, it accepts a constant array of Cards and performs the shuffling using slices and concatenations, building a new output array B. Here a Deck is a dynamic array, but perhaps for this algorithm a more efficient data structure for a Deck is a (singly linked) list (about as in the Haskell code). In perms3 I have a commented-out the range-based version of 'makeDeck' because currently the 'array' function is not @safe for that use case. So I've written a more internally imperative version of 'makeDeck': enum makeDeck = (in uint n) pure nothrow => iota(1, n + 1).map!Card.array; I have used a Typedef to implement the 'newtype' of the Haskell code (despite currently Typedef has some bugs): alias Card = Typedef!int;The function 'f1' uses a do-while because Phobos still lacks an 'iterate' range-based function, as used in the 'shuffle' Haskell function. I have run both perms1.hs and perms3.d on the few testing values, both print: 4 2 5 5 52 510 53 53 100 120 200 8460 perms1.hs produces this output in about 2.72 seconds: [2,5,510,53,120,8460] perms3.d produces this output in (best of 3) in about 1.39 seconds: 4 2 5 5 52 510 53 53 100 120 200 8460I have then created a perms3b.d program that is very similar to perms3.d, but represents a Deck with a SList (a singly linked list) hoping to improve performance avoiding the O(n) copies of the slices of the array-based version. But perms3b.d runs in about 30.4 seconds. In most cases linked lists don't give any performance advantage. - - - - - - - - - - - - - - - - - - - perms4.d:This version is similar to perms3.d, but here I have implemented 'deckShuffle' in a lower level way. This deckShuffle is now @nogc (it doesn't allocate on the heap), and works in-place on two given arrays, using indexes (it could also be implemented using pointer arithmetic, but there's no need for that, this function is fast and @safe). Now deckShuffle is still annotated with 'pure' but now it's weakly pure: void deckShuffle(Deck A, Deck B) pure nothrow @safe @nogc { size_t aTop = 0, aLength = A.length, bTop = B.length - 1; // B is empty at start. while (aLength) { // Step 1: B[bTop] = A[aTop]; aTop = (aTop + 1) % A.length; bTop--; aLength--; // Step 2: if (aLength) { A[(aTop + aLength) % A.length] = A[aTop]; aTop = (aTop + 1) % A.length; } } }This kind of code is more fiddly and requires more care and testing compared to more a functional style, but it's also much faster and produces no heap garbage. The run-time of this function is also more predictable. The run time of 'perms4.d' to produce the six output pairs is about 0.06 seconds, so this is about 23 times faster than 'perms3.d'. Unlike many other languages, in D it's easy to mix low level code with higher level code, and produce code that still looks natural and free of most accidental complexity. - - - - - - - - - - - - - - - - - - - perms5.d:This implements the algorithm from perms2.hs. For simplicity I have used the slower but simpler 'deckShuffle' from perms3.d. Instead of writing the various functions of the Haskell version, building the various intermediate data stutures (like the list of pairs), I have written a single 'f2' function that computes the lcm of the cycleLen without storing the actual cycles. D has less 'glue' than Haskell, so often in D it's more convenient to write a little larger functions (with the side effect that they are often faster than the Haskell versions, but also less easy to combine in different ways): ulong f2(in uint n) pure nothrow { uint[uint] graph; foreach (immutable uint i, immutable x; n.makeDeck.deckShuffle) graph[i + 1] = cast(uint)x; typeof(return) result = 1; while (graph.length) { auto x = graph.byKey.front; uint cycleLen = 0; while (x in graph) { cycleLen++; immutable y = graph[x]; graph.remove(x); x = y; } result = lcm(result, cycleLen); } return result; }This code is less general and more specialized than the Haskell functions. It's less easy to see what the code is doing. Here 'graph' of the pairs is an associative array to avoid the linear searches in the lists (but a linear scan is still present despite being hidden, the one performed by byKey.front). The run-time of perms5.d is very small, so beside calling 'f2' on the six test cases, this function also computes 'f2' on the numbers n in [1, 1_000[. The total run-time is about 1.29 seconds. If I modify perms2.hs to compute f2 in the same [1, 1_000[ interval, the run-time is very large (about half an hour, and it uses 32 bit integers, so some results for the [1, 1_000[ interval are wrong). So the 'f2' function in D has some advantages (and it's still strongly pure). The D 'f2' function returns a ulong because when n grows large, the output grows a lot. - - - - - - - - - - - - - - - - - - - perms6.d:This is similar to perms5.d, but uses the more efficient 'deckShuffle' function from perms4.d. The run-time of perms6.d is only 0.20 seconds (this includes the computations for n in [1, 1_000[), so it's more than six times faster than perms5.d. - - - - - - - - - - - - - - - - - - - perms7.d:perms7b.d:perms7c.d:This is a lower-level version (that is easy to port to C language), it's similar to perms6.d, but here in 'f2' instead of the associative array 'graph' I have re-used the array D2, filling it with 'emptyItem' when I remove a value. 'f2' is now also @nogc because it accepts fixed-size arrays as arguments and then slices them. So the program allocates the arrays only inside the 'main' (and here are stack-allocated, so even the main is @nogc): ulong f2(size_t m)(const uint n, ref Card[m] D1, ref Card[m] D2) pure nothrow @safe @nogc { for (uint i = 0; i < n; i++) D1[i] = cast(Card)i; deck_shuffle(D1, D2, n); typeof(return) result = 1; uint graph_len = n; enum emptyItem = uint.max; while (graph_len) { uint x = uint.max; for (uint i = 0; i < n; i++) if (D2[i] != emptyItem) x = i; uint cycle_len = 0; while (D2[x] != emptyItem) { cycle_len++; const uint y = D2[x]; D2[x] = emptyItem; graph_len--; x = y; } result = lcm(result, cycle_len); } return result; } void main() nothrow @nogc { import core.stdc.stdio: printf; const uint nMax = 1_000; Card[nMax] d1 = void, d2 = void; for (uint n = 1; n <= nMax; n++) printf("%llu\n", f2(n, d1, d2)); }perms7.d runs in about 0.05 seconds. So it's about 4 times faster than perms6.d. To avoid overflow bugs I have created the perms7b.d and perms7c.d versions, that use core.checkedint (despite core.checkedint is not meant for user code). For both perms7b.d and perms7c.d the run-time is similar to the unchecked versions, and perms7b.d spots the overflow bugs: ... 1389: 4247163320400 1390: 465585120 1391: 1391 1392: 6570 1393: 4294945235192621194 (overflow) 1394: 3655769040 1395: 1194 1396: 930 1397: 1748154975840 1398: 1398 ...- - - - - - - - - - - - - - - - - - - perms8.d:This version is very similar to perms7.d, but I've used BigInt instead of ulong to compute all correct results (if there are no other bugs or design mistakes). perms8.d is still quite fast, it runs in about 0.08 seconds (this includes the computations for n in [1, 1_000[). If I increase nMax to 10_000 it runs in about 6.6 seconds. D BigInts are not very fast, but I think this is not a significant problem for this program. - - - - - - - - - - - - - - - - - - - perms9.c:This is C code derived from perms7.d. perms9.c runs in about 0.03 seconds or less. perms9.c is a little faster than perms7.d, but the speed difference is really visible when nMax is much larger (but then the program gives wrong outputs becuse of overflows). The difference between 0.03 and 0.05 seconds is mostly caused by the time taken to initialize the D-runtime. - - - - - - - - - - - - - - - - - - - | ||||||||

comments: Leave a comment |