leonardo
View:Recent Entries.
View:Archive.
View:Friends.
View:User Info.
View:Website (My Website).
You're looking at the latest 10 entries.
Missed some entries? Then simply jump back 10 entries

Tags:, , ,
Subject:amb chain - update
Time:06:11 pm
Updated the 'amb chain' article with better implementations:
http://www.fantascienza.net/leonardo/ar/amb_chain.html
comments: Leave a comment Add to Memories Tell a Friend

Tags:, , ,
Time:03:31 pm
A new informatics article of mine, about word chains, in Python and D:
http://www.fantascienza.net/leonardo/ar/amb_chain.html
comments: Leave a comment Add to Memories Tell a Friend

Tags:, ,
Subject:Built-in hashing in D and Python
Time:12:01 am
Both the D1/D2 languages and Python have built-in hash, in Python there are dicts, sets and frozensets, while in D there are associative arrays.

Having them built-in is very handy, and it allows a nice syntax. In future D2 language may keep such syntax while moving their implementation into the standard library, allowing more flexibility and keeping them about as handy as now (probably the compilation time may grow some and error messages may worsen, but the efficiency and flexibility will increase).

Any Python object can be used inside a dict/set, but by default the hash value of the object is the ID and the equality too is done amond IDs. So to avoid duplications you must define __hash__() and __eq__ (or __cmp__()).

In D you have to define the opEquals() and opEquals() and opCmp() too, because the hash is currently implemented in a different way, in case of hash collisions (when the hash value is equal when opEquals is true) they are resolved with a sorted search tree (that requires a <).

This means that while Python dicts are currently usually faster or much faster than D associative arrays when used on normal data with normally good hash functions, D AAs switch from being about O(1) to being about O(log n) in insert and access time, while Python dicts in such situations may become O(n). So D AAs are safer when the hash function is bad (or even in situation of hash attacks).

I have written small programs to benchmark the situation. (In the tests I have not used Psyco or the new LDC D compiler because their performance is the same or worse.)

This is the Python test (only insertions are performed, so this benchameks is also strongly determined by the efficiency of the memory allocator):
from sys import argv

class K(object):
    def __init__(self, x):
        self.x = x
    def __eq__(self, other):
        return self.x == other.x
    def __hash__(self):
        return 1
        #return self.x

def test(n):
    d = {}
    for i in xrange(n):
        d[K(i)] = i

n = int(argv[1]) if len(argv) == 2 else 10
test(n)

Equivalent D1 code:
import std.conv: toInt;

class K {
    int x;
    this(int xx) {
        this.x = xx;
    }
    int opEquals(Object other) {
        return this.x == (cast(K)other).x;
    }
    int opCmp(Object other) {
        int other_x = (cast(K)other).x;
        return (this.x == other_x) ? 0 : (this.x - other_x);
    }
    hash_t toHash() {
        return 1;
        //return cast(hash_t)this.x;
    }
}

void test(int n) {
    int[K] d;
    for (int i; i < n; i++)
        d[new K(i)] = i;
}

void main(string[] args) {
    int n = args.length == 2 ? toInt(args[1]) : 10;
    test(n);
}

I have also tried the same D1 code t, using struct used by value (not allocated on the heap, well, allocated inside the associative array itself) instead of objects:
import std.conv: toInt;

struct K {
    int x;
    int opEquals(K other) {
        return this.x == other.x;
    }
    int opCmp(K other) {
        return (this.x == other.x) ? 0 : (this.x - other.x);
    }
    hash_t toHash() {
        return 1;
        //return cast(hash_t)this.x;
    }
}

void test(int n) {
    int[K] d;
    for (int i; i < n; i++)
        d[K(i)] = i;
}

void main(string[] args) {
    int n = args.length == 2 ? toInt(args[1]) : 10;
    test(n);
}
As you can see the hash methods alaways return 1, so every item is an hash collision.

The timings:
Timings with all collisions (hash=1):
  N =       1000  2000  3000  4000  10_000  100_000
  Python:   0.53  1.27  2.45  4.30  30.27
  D class:  0.06  0.09  0.12  0.17   1.07
  D struct: 0.06  0.09  0.09  0.11   0.50    44.26

With no collisions (hash=this.x):
  N =       1000  2000  3000  4000  10_000  100_000  1_000_000
  Python:   0.28  0.28  0.28  0.28   0.28     0.53    6.97
  D class:  0.05  0.05  0.05  0.05   0.06     0.12    1.01
  D struct: 0.05  0.05  0.05  0.05   0.06     0.07    0.39
The timings don't show the clean grow factors I was talking about (probably also because of memory allocations), but you can clearly see how much faster timings grow in Python compared to D when there are many collisions.
comments: 3 comments or Leave a comment Add to Memories Tell a Friend

Tags:,
Subject:The Case for D - the other side of the coin
Time:12:27 am
Andrei Alexandrescu has written a nice article, "The Case for D" (click on 'Print' to read it on a single page):
http://www.ddj.com/hpc-high-performance-computing/217801225

D1 is a very nice language, and I use it often, but this article shows too much the good sides of the D2 language and its compilers, focusing on what it may do in future, ignoring their numerous current downsides and problems. Giving false expectations in possible new D users is dangerous. I think that giving a more balanced account of the current situation is better, even if in future most of current D problems may be fixed.

A good article must show the current troubles of the language too, and not just talk about good implementations that may be found years from now. At the moment Java is a very fast language, the compiler helps the programmer avoid many bug-prone situations, and the toolchain is very good. But at the beginning Java was really slow and of limited usefulness, it was little more than a toy.

This post isn't a list of all faults I see in the D language, it's a list of comments about the article by Andrei Alexandrescu.

From the article:

>In the process, the language's complexity has increased, which is in fact a good indicator because no language in actual use has ever gotten smaller.<

D2 language is more complex than D1, and even if each thing added to D may have its justifications, C++ language clearly shows that too much complexity is bad. So higher complexity is not a good indicator.


>Other implementations are underway, notably including an a .NET port and one using the LLVM infrastructure as backend.<

The LDC compiler (with LLVM backend) is already usable on Linux to compile D1 code with the Tango standard lib (but it lacks the built-in profiler). On windows LLVM lacks exception support, so it can't be used yet.


>D could be best described as a high-level systems programming language.<

It may be quite hard to think about using D to write something like the Linux kernel, or to write code for little embedded systems. D compiled programs are too much big for embedded systems with few kilobytes of RAM, an the D language relies too much on the GC (even if it can be switched off, etc) to be a good tool to write real-world kernel.

So D is currently more like a systems programming-like language. A multi-level language that can be used to write code quite close to the 'metal' or to write high-level generic code too.


>It encompasses features that are normally found in higher-level and even scripting languages -- such as a rapid edit-run cycle,<

Being made of compiled modules, the edit-run cycle in a D program can be as fast as in other languages like C# and Java.


>In fact, D can link and call C functions directly with no intervening translation layer.<

On Windows you usually have to compile the C code with DMC to do this.


>However, you'd very rarely feel compelled to go that low because D's own facilities are often more powerful, safer, and just as efficient.<

In practice currently there are situiations where using C-style code can lead to higher performance in D1 (especially if you use the DMD compiler instead of the LDC one).


>support for documentation and unit testing is built-in.<

Such things are very handy and nice. But the current built-in support for documentation has many bugs, and the built-in unit testing is very primitive and limited: for example tests have no name, they just contain normal code and assert(), and their running stops as soon as the first assert fails.


return printf("hello, world\n") < 0;

This may be more correct C:

if (printf("hello, world\n") >= 0)
return EXIT_SUCCESS;
else
return EXIT_FAILURE;


>(and T!(X) or simply T!X for T)<

In D1 the T!X syntax isn't supported. In D2 there's another rule, you can't write:
T!(U!(X))
As:
T!U!X
This is an example where things are more complex in D2 just to save two chars.


>D's unit of compilation, protection, and modularity is the file. The unit of packaging is a directory.<

D module system is nice and handy, but it currently has several bugs, and it has some semantic holes.

The sensation it leaves in the programmer is that its design was started well, but then the development of such design has stopped mid-course, leaving some of its functionalities half-unfinished.

For example if you import the module 'foo', in the current namespace it imports not just 'foo', but all the names contained into 'foo', and the 'foo' name itself. This is silly.

There are also troubles with circular import semantics, package semantics, safety (it lacks a syntax to import all names from a module. That's the default berhavour, and this is bad).

Another downside is that all current D compilers aren't able to follow the module tree by themselves to compile code, so you need to tell the compiler all the modules you need to compile, even if such information is already fully present in the code itself. There are several tools that try to patch this basic functionality hole (very big programs need more complex building strategies, but experience shows me that most small D programs can be fine with that automatic compilation model).


>* One, the language's grammar allows separate and highly optimized lexing, parsing, and analysis steps.<

This also has the downside that it limits the possible syntax that can be used in the language, for example it makes this code impossible:
foreach (i, item in items)
Forcing the language to use this, that is a bit less readable and a little more bug-prone:
foreach (i, item; items)


>* Three, Walter Bright, the creator and original implementor of D, is an inveterate expert in optimization. <

This is probably true, despite this the backend of DMD produces not much efficient code. LDC (LLVM-backend) is generally much better in this.
Update1, Jun 17 2009: DMD (especially DMD D1) is faster than LDC in compiling code.


>Other procedural and object-oriented languages made only little improvements,<

Untrue, see Clojure and Scala. Hopefully D will do as well or better.
Update1, Jun 17 2009: both Clojure and Scala run on the JVM, so the situation is different.


>a state of affairs that marked a recrudescence of functional languages<

Some other people may talk about a reinassance, instead :-)


>SafeD is focussed only on eliminating memory corruption possibilities.<

It may be better to add other safeties to such SafeD modules.


>That makes Java and C# code remarkably easy to port into a working D implementation.<

It's indeed quite easy to port C/Java code to D. But translating C headers to D may require some work. And currently the D garbage collector is much less efficient than the common Java ones, so D requires code that allocates less often.
Update1, Jun 17 2009: there are tools that help convert C headers to D.


>such as an explicit override keyword to avoid accidental overriding,<

It's optional.


>and a technique I can't mention because it's trademarked, so let's call it contract programming.<

It's built-in in the language. It's not implemented in a very complete way, but it may be enough if you aren't used to Eiffel.


>The implementation now takes O(n) time, and tail call optimization (which D implements) takes care of the space complexity.<

At the moment only the LDC compiler (a D1 compiler) is able to perform tail-call elimination (and probably only in simple situations. But probably as LLVM improves, LDC will improve).
Update1, Jun 17 2009: I was wrong, DMD is able to tail-call optimize if the situation is simple.


>iron-clad functional purity guarantees, and comfortable implementation when iteration is the preferred method. If that's not cool, I don't know what is.<

At the moment calls to pure functions aren't moved out of loops. There can be problems if the pure function generates an out of memory exception, or if it's involved a change in the floating point rounding mode.

Functional programming juggles lot of immutable data, and this puts the garbage collector under a high pressure. Currently the D GC isn't efficient enough for such quick cycles of memory allocation, so it's not much fit yet for functional-style programming (or Java-style Object Oriented style of programming that allocates very frequently).

All this isn't meant to discourage you from using the D1/D2 languages.

-------------------------------

Update1, Jun 17 2009:
See also the discussion on Reddit:
http://www.reddit.com/r/programming/comments/8t7s1/the_case_for_d_the_other_side_of_the_coin/

Answers to the received comments:

Thank you Anonymous for your large amount of comments. I'll fix the blog post where I see it's necessary. Your comments will help me a lot in improving my blog post.


>For exception support, it's more C++'s LLVM and Windows SEH issue, to get it right.<

Eventually LLVM/Clang developers will support exceptions on Windows. Several things tell me that LDC will be a good compiler.


>As for profiler, I believe you can compile to LLVM bytecode and profile that by LLVM tools, but well, it's ugly.<

Some things are already possible (I am trying KCachegrind now), but DMD is quite more handy, you can just add a "-profile" and it just works. (Code coverage of DMD too is handy, but it doesn't work on some bigger programs of mine). Walter has said more than one time that having easy to use tools helps people use them more often.


>but what we actually want are just more tools and more mature tools.<

Command-line features like DMD profiler are enough for me in many situations.


>Well, there is actually microkernel OS in D around:<

I know, but I have read an half-serious proposal to create another compiler to compile the Linux kernel because GCC isn't too much fit for this purpose. So I guess D compilers too may be even less fit for that purpose.
On the other hand Microsoft is trying to use a modified C# to write a OS (and they say the extra safety offered by C# allows to avoid some controls in the code, and this ends up creating globally efficient enough code), so it may be doable in D too.


>D programs are somewhat bigger minimal C apps (and esp., compiled by LLVM LDC) because of 3 things:<

A GC can't be avoided, but maybe it's possible to keep it outside, dynamically linked.
The runtime contains unicode management, associative arrays, dynamic arrays and more, but it may be possible to strip away some of such things when not used.


>(as a example of such multi-level language, but I'd like to see OMeta-like stuff for D better).<

OMeta is the future :-)
See also Pymeta, Meta for Python:
http://washort.twistedmatrix.com/


>Exactly, but you always can reimpement your wheels (read: modules/packages via classes, and some design pattern around that), and feed them thru CTFE/mixins.<


I'd like the built-in unittest systems to be a bit more powerful, or you can of course re-implement them outside the language, but then it's better to remove the built-in unittest features. Keeping both is not good.


>That's actually matter not compiler itself, but your build system.<

The DMD compiler already has built-in things that are beyond the purposes of a normal compiler. Adding this automatic build feature isn't essential but it's handy and positive.


>Hey, that 'item in items' stuff is not D semantic, and has nothing to with compiler itself.<

D compiler is designed in several separated layers. So it seems that to change the syntax adding an "in" inside the foreach you have to add some feedback between layers, and this is seen as bad for the compiler (and probably Walter is right here).


>public/private import?<

Imports are already private by default now in D. The problems are quite more big here.


>new instaneous dee0xd<

Never seen that before.


>Arguable: dmd still compiles faster, and binary sizes are smaller. LLVM optimizations are much more promising, though.<

In most of my benchmarks LDC produces programs that are faster or much faster. DMD indeed compiles faster (DMD of D2 is a bit less fast). Binary sizes produced by LDC are sometimes bigger but they are working on this, and most times the size is similar.


>Somewhat different playgrounds here: JVM-based or self-hosted.<

You are right, the situation is different. But I think you can implement Clojure multiprocessing ideas even without a VM.


>Just stub your own GC in. There are different GC strategies after all, why to hope 'one size fitts all' on every cases?<

Indeed, JavaVM ships with more than one GC to fulfill different purposes.
My own GC is probably going to be worse than the current built-in one. I am not able to write a GC as good JavaVM ones. So what you write here is not good.


>Java GC's was much worse than Oberon's btw, when it just appeared.<

Java at the beginning was WAY worse, I know, I have stated this at the beginning of my blog post.


>And if you have many of 'quick cycles of memory allocation', something is wrong with your memory allocator. It's not better when you have lotso manual malloc/free, its better when you have memory pools, arenas, zones, and right allocation (or GC) strategy, which fits better for you app.<

If you look at most Java programs you can often see many small objects allocated in loops.
At the same way, in functional-style languages/programs you can see lot of immutable data structures that are created and collected away all the time. From my benchmarks I think the current D GC isn't fit for such kinds of code.


>So I believe we can't rely on one single GC for all use cases, but we need lotso strategies and pluggable GC's for different uses cases and different strategies.<

I agree, but probably 2-3 GCs (built-in and switchable at compile time) can be enough for most D purposes. I am sure there are many ways to improve the current D GC (for example having a type system able to tell apart GC-manages pointers, and a hybrid moving\conservative GC that pins down memory manually managed, and moves and compacts all the other memory), my purpose was just to show and talk about the current situation.


>That shouldn't stop you in any way from using D<

Of course. I don't waste hours of my time commenting about a language I don't like to program with :-)
D is my second preferred language (after Python), I like it and I have written lot of D code :-)

Thank you again for all your comments, as you see I agree with most of the things you have written here.
comments: 18 comments or Leave a comment Add to Memories Tell a Friend

Tags:, , , , ,
Subject:fpaq0 - update
Time:06:09 pm
The ldc D compiler (http://www.dsource.org/projects/ldc ) is based on the LLVM backend and it's very good. LLVM is currently a bit young still compared to GCC, and it's often possible to find code where GCC produces faster code, but such difference is usually within 5-20%, so in practice it's acceptable in all situations where performance isn't critical (and in such situations you may want to use something like the Intel compiler or inline asm anyway).

So I have used ldc to recompile fpaq0, a simple order-0 arithmetic file compressor for stationary sources by by Matt Mahoney:
http://cs.fit.edu/%7Emmahoney/compression/fpaq0.cpp
That I have translated to D. I have also added a version of the imports for the D Tango standard library too. As input file I use a large purely ASCII text of 6_500_314 bytes, mixed English texts (originally by Peter Norvig, cleaned up a bit).

The D code and more details about the timings:
http://www.fantascienza.net/leonardo/so/fpaq0.zip

The D timing results are very good, better or equal than the C++ code compiled with G++. In both the D and C++ version I have also used putc_unlocked/getc_unlocked to increase I/O performance (I'd like to see such functions in Tango).
Timings Windows, seconds:
  DMD:        3.01  (100%)
  LLVM-G++ C: 1.78  ( 59%) (100%)
  LLVM-G++ A: 1.73  ( 57%) ( 97%) (100%)
  G++ C:      1.41  ( 47%) ( 79%) ( 81%)
  G++ A:      1.35  ( 45%) ( 76%) ( 78%)

Timings Pubuntu, seconds:
  G++ A:      3.53
  LDC:        1.92 (100%)
  G++ B:      1.55 ( 81%)
  G++ B:      1.48 ( 78%) putc_unlocked/getc_unlocked
  LDC:        1.41        putc_unlocked/getc_unlocked 
Compilers used:

On WindowsXp:
DMD Digital Mars D Compiler v1.042
gcc version 4.3.3-dw2-tdm-1 (GCC)
gcc version 4.2.1 (Based on Apple Inc. build 5636) (LLVM build)

On Pubuntu:
gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu4)
ldc based on DMD v1.045 and llvm 2.6svn (Sun Jun 7 14:18:55 2009)


dmd used with:
dmd -O -release -inline

ldc used with:
ldc -O5 -release -inline

g++ A used with:
g++ -O3 -s

g++ B used with:
g++ -Wall -O3 -s -fomit-frame-pointer -msse3 -march=native

g++/llvm-gcc C used with:
g++ -Wall -O3 -s -fomit-frame-pointer -msse3 -march=core2
llvm-g++ -Wall -O3 -s -fomit-frame-pointer -msse3 -march=core2
comments: 2 comments or Leave a comment Add to Memories Tell a Friend

Tags:, , , , , ,
Subject:SciMark 2.0 in C and D
Time:11:08 pm
SciMark 2.0 is a Java benchmark for scientific and numerical computing, developed by Roldan Pozo and Bruce R Miller:
http://math.nist.gov/scimark2/index.html

On my PC the applet (run locally) scores 388, I have a Core2 at 2 Ghz. (The site shows that some people have scored 7300, I don't know how).

The site also offers C code, that I have run and then ported to D, for benchmarks (it can also be possible to port the Java code to D, but probably the C code leads to higher performance). (Update1: I have converted the Java code too to D).

Here you can find the C and D code (for Phobos and Tango standard libraries), plus all the detailed results of the benchmarks:
http://www.fantascienza.net/leonardo/js/scimark2.zip

Thanks to the latest LLVM backend the results of LDC are quite good, better than C compiled with GCC 4.2.1.

I have tested LDC on the same PC using Pubuntu, that is a "Portable Ubuntu", that leads to a performance a bit decreased compared to a native Ubuntu running on the same CPU.
Timings on WinXp 32 bit, 2.00 seconds min time:
               small   large
  D DMD:         299    207
  D DMD:         401    268 (OOP version)
  C GCC:         578    314
  C LLVM-GCC:    609    351

Timings on Pubuntu 32 bit, 2.00 seconds min time:
               small  large
  D LDC:        462     272 (OOP version)
  C GCC:        568     315
  D LDC:        575     327

CPU: Core2 at 2 Ghz, 2 GB RAM.

Compilers used on WindowsXp:
  gcc version 4.3.3-dw2-tdm-1 (GCC)
  gcc version 4.2.1 (Based on Apple Inc. build 5636) (LLVM build)

Compilation argumenents on WindowsXp:
  dmd -O -release -inline scimark2_d_phobos.d
  gcc -O3 -s -fomit-frame-pointer -msse3 -march=core2 scimark2_c.c -o scimark2_c_gcc
  llvm-gcc -O3 -s -fomit-frame-pointer -msse3 -march=core2 scimark2_c.c -o scimark2_c_llvm


Compilers uses on Pubuntu:
  gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu4)
  LLVM D Compiler rev. based on DMD v1.045 and llvm 2.6svn (Mon Jun  1 22:54:33 2009)

Compilation argumenents on Pubuntu:
  ldc -O5 -release -inline scimark2_d.d
  gcc -O3 -s -fomit-frame-pointer -msse3 -march=native -lm scimark2_c.c -o scimark2_c

Update 1, Jun 5 2009: added OOP D version translated from the Java code (the code is a bit raw, it's not an example of good code), plus its timings. The class-based version is slower with LDC but faster with DMD, I don't know why.
comments: 4 comments or Leave a comment Add to Memories Tell a Friend

Tags:,
Subject:Links
Time:01:30 am
A big free book of low-level algorithms (updated, I have already shown this in the past):
http://www.jjj.de/fxt/


A small tutorial about usage of floating point values in D language. FP management in D is among the best you can find:
http://www.digitalmars.com/d/2.0/d-floating-point.html


Video "Debugging Backwards in Time", about 50 minutes, very nice, I'd like to have something like this in Python:
http://video.google.com/videoplay?docid=3897010229726822034


Thrust is a CUDA library of parallel algorithms with an interface resembling the C++ Standard Template Library (STL). Thrust provides a flexible high-level interface for GPU programming:
http://code.google.com/p/thrust/


Free book, it looks interesting, "A Functional Pattern System for Object-Oriented Design" by Thomas Kühne, 1999:
http://homepages.ecs.vuw.ac.nz/~tk/fps/fps-sans-escher.pdf


A free book about image processing in C (It doesn't look very updated, surely there are better books about this topic):
http://homepages.inf.ed.ac.uk/rbf/BOOKS/PHILLIPS/cips2ed.pdf
comments: Leave a comment Add to Memories Tell a Friend

Tags:, , , , ,
Current Location:programming
Subject:List deletions problem
Time:06:15 pm
Once in a while in my programs I need to keep an ordered sequence of items, process them all iteratively, and during each iteration remove few of them that aren't good anymore. So in each iteration only a small number of items is removed, and such removals happen in random positions.

I have written a small paper about how to solve this common programming problem in an efficient way, in Python+Psyco, D language and C:
http://www.fantascienza.net/leonardo/ar/list_deletions.html

The code of the paper:
http://www.fantascienza.net/leonardo/js/list_deletions.zip
comments: 2 comments or Leave a comment Add to Memories Tell a Friend

Tags:, ,
Subject:Updates and links
Time:07:21 pm
Updates:

The "two towers" problem:
http://www.cs.williams.edu/%7Ejeannie/cs136/labs/towers.pdf
Code and timings of various solutions of mine:
http://www.fantascienza.net/leonardo/js/two_towers.zip
You can also see my small paper about it:
http://www.fantascienza.net/leonardo/ar/two_towers/two_towers.html


I have implement various standard binary tree scans in Python done in a lazy way (later I may do the same in D1/D2 languages):
http://www.fantascienza.net/leonardo/js/bin_trees.py

--------------------

Links:

"Demystifying the 'restrict' C99 keyword", a nice and very clear blog post that shows why such small thing improves the C language a lot:
http://www.cellperformance.com/mike_acton/2006/05/demystifying_the_restrict_keyw.html


A talk plus some slides, marvelous 2D/3D mathematical art. I need hours to fully appreciate it:
http://mirror.csclub.uwaterloo.ca/csclub/kaplan-mathematical-art-slides.pdf
The related video:
http://csclub.uwaterloo.ca/media/Rapid%20Prototyping%20and%20Mathematical%20Art


Very useful and quite important, ideas to not damage our own body:
http://hubpages.com/hub/Health-benefits-of-squat-toilets
comments: Leave a comment Add to Memories Tell a Friend

Tags:, , , ,
Subject:Lazy iterable performance in C#, D1, D2
Time:01:04 pm
I've done a basic benchmark of the performance of lazy iterables in C# and D languages. (My experience of C# is limited, so the C# code may be improved.)
This test just creates an iterable and counts the numbers it yields:
using System;
using System.Collections;

public sealed class YieldBench {
    public static IEnumerable Xrange(int stop) {
        for (int i = 0; i < stop; i++) {
            yield return i;
        }
    }

    static void Main(string[] args) {
        int stop = args.Length > 0 ? Int32.Parse(args[0]) : 10;
        int count = 0;
        foreach (int i in Xrange(stop))
            count++;
        Console.Write("count: {0}\n", count);
    }
}

C#, version 2, with generics, version offered by Scale:
using System;
using System.Collections;
using System.Collections.Generic;

public sealed class YieldBench {
    public static IEnumerable<int> Xrange(int stop) {
        for (int i = 0; i < stop; i++)
            yield return i;
    }

    static void Main(string[] args) {
        int stop = args.Length > 0 ? Int32.Parse(args[0]) : 10;
        int count = 0;
        foreach (int i in Xrange(stop))
            count++;
        Console.Write("count: {0}\n", count);
    }
}

D1 code:
import std.stdio: writefln;
import std.conv: toInt;

struct Xrange1 {
    int stop;
    int opApply(int delegate(ref int) dg) {
        int result;
        for (int i = 0; i < stop; i++) {
            result = dg(i);
            if (result)
                break;
        }
        return result;
    }
}

void main(string[] args) {
    int n = args.length == 2 ? toInt(args[1]) : 10;
    int count;
    foreach (i; Xrange1(n))
        count++;
    writefln("count: %d", count);
}

D2 code, it also contains a second way to create a lazy iterable:
import std.stdio: writefln;
import std.conv: to;

struct Xrange1 {
    int stop;
    int opApply(int delegate(ref int) dg) {
        int result;
        for (int i = 0; i < stop; i++) {
            result = dg(i);
            if (result)
                break;
        }
        return result;
    }
}

struct Xrange2 {
    int stop, _current;

    // true if the xrange if finished
    bool empty() { return this._current == this.stop; }

    // returns current iterated item
    int front() { return this._current; }

    // Advances the iteration
    void popFront() { this._current++; }
}

void main(string[] args) {
    int n = args.length == 2 ? to!int(args[1]) : 10;
    int count;
    foreach (i; Xrange1(n))
    //foreach (i; Xrange2(n))
        count++;
    writefln("count: %d", count);
}

Python code 1:
from sys import argv

def Xrange(stop):
    for i in xrange(stop):
        yield i

def main():
    stop = int(argv[1]) if len(argv) == 2 else 10
    count = 0
    for i in Xrange(stop):
        count += 1
    print "count:", count

import psyco; psyco.bind(main)
main()

Python code 2:

from sys import argv

Xrange = lambda stop: (i for i in xrange(stop))

def main():
    stop = int(argv[1]) if len(argv) == 2 else 10
    count = 0
    for i in Xrange(stop):
        count += 1
    print "count:", count

import psyco; psyco.bind(main)
main()


Timings, best of 6, seconds:
            N  D1/L  D2/X2   D2/X1  D1/X1    C#1    C#G     Py1    Psy1    Py2    Psy2
      100_000     -      -       -      -   0.14      -    0.35   0.30    0.34
    1_000_000     -      -       -      -   0.16   0.12    0.46   0.41    0.42    0.44
   10_000_000  0.07   0.10    0.11   0.10   0.31   0.23    2.15   1.21    1.74    1.44
  100_000_000  0.12   0.32    0.47   0.46   2.13   1.19   19.06   9.14   14.90   11.55
1_000_000_000  0.77   2.56    4.09   4.09  20.36  10.73  189.    78.4   146.    112.

Key:
D1/L = compiled with DMD 1, normal for loop.
D2/X2 = with DMD 2, Xrange2
D2/X1 = with DMD 2, Xrange1
D1/X1 = with DMD 1, Xrange1
Py1 = Python 1 program
Psy1 = Python 1 program JIT-compiled with Psyco
C#G = second C# version, with generics

My CPU is a core2 2 GHz, so it means that D2/X2 is doing about 5 clock tips / loop cycle, while D1/L is doing about 1.54.

D2/X2 is 2.56 / 0.77 = 3.32 times slower than a for loop.

Asm produced by DMD V.1 and V.2, just the core of the loops:
D2/X1, inside opApply:
L1B: lea ECX,010h[ESP]
     mov EAX,018h[ESP]
     push ECX
     call ESI
     mov EBX,EAX
     test EAX,EAX
     jne L38
     inc dword ptr 010h[ESP]
     mov EDX,[EDI]
     cmp EDX,010h[ESP]
     jg L1B


D2/X2:
L38: inc dword ptr 010h[ESP]
     inc EBX
     mov ECX,010h[ESP]
     cmp ECX,0Ch[ESP]
     jne L38


D1/L:
L30: inc ESI
     inc EBX
     cmp EBX,EDI
     jl L30
A person says the updated LDC compiler is able to compile the opApply better, and in this situation produces asm similar to a loop.

CPU: Intel Core 2, 2 GHz (2 cores), WinXP SP2.

D code compiled with:
DMD v1.042 and D 2.029
dmd -O -release -inline

Python 2.6.2 (r262:71605, Apr 14 2009, 22:40:02)
[MSC v.1500 32 bit (Intel)] on win32

Psyco for Python 2.6, V.1.6.0 final 0

DotNet 3.5 sp1, C# 2008

Update: Scale (Alessio Scalerandi) has given me a good suggestion to improve the timings of the C# version, using generics. It's about twice faster than the first version, and now it's just about four times slower than the D ranged lazy iteration.
comments: 3 comments or Leave a comment Add to Memories Tell a Friend

Advertisement

leonardo
View:Recent Entries.
View:Archive.
View:Friends.
View:User Info.
View:Website (My Website).
You're looking at the latest 10 entries.
Missed some entries? Then simply jump back 10 entries