leonardo ([info]leonardo_m) wrote,
@ 2008-12-25 11:05:00
Current location:c, gcc, llvm, benchmark, d language

Himeno test
I have compiled the small Himeno benchmark with GCC, LLVM-gcc, and DMD (for the D code) to compare their performance. (In the near future I hope to add timings for the LDC D compiler too.)

This benchmark measures the floating-point performance of a CPU using a Poisson equation solver.
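To give an idea of what such a solver spends its time on, here is a minimal sketch of one Jacobi sweep on a small 3D grid. It is not the benchmark's code: the grid size `N`, the function name `jacobi_sweep` and the plain 6-neighbour average are assumptions made for illustration, and the kernel in the real source is more elaborate.

```c
#define N 16 /* hypothetical grid size, chosen only for this sketch */

/* One Jacobi sweep for the Laplace equation (Poisson with a zero
   right-hand side) on the interior of an N x N x N grid.
   New values are written to tmp first, so every update reads only
   values from the previous sweep. Returns the sum of squared changes,
   which can serve as a convergence measure. */
float jacobi_sweep(float p[N][N][N], float tmp[N][N][N])
{
    float residual = 0.0f;

    for (int i = 1; i < N - 1; i++)
        for (int j = 1; j < N - 1; j++)
            for (int k = 1; k < N - 1; k++) {
                /* Average of the six direct neighbours. */
                float avg = (p[i + 1][j][k] + p[i - 1][j][k] +
                             p[i][j + 1][k] + p[i][j - 1][k] +
                             p[i][j][k + 1] + p[i][j][k - 1]) / 6.0f;
                float diff = avg - p[i][j][k];
                residual += diff * diff;
                tmp[i][j][k] = avg;
            }

    /* Copy the interior back; the boundary values stay fixed. */
    for (int i = 1; i < N - 1; i++)
        for (int j = 1; j < N - 1; j++)
            for (int k = 1; k < N - 1; k++)
                p[i][j][k] = tmp[i][j][k];

    return residual;
}
```

A solver would call `jacobi_sweep` repeatedly until the returned value is small enough. Almost all the work is floating-point additions and array loads in the innermost loop, which is why this kind of code is sensitive to how well the compiler handles array indexing.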
The code is available in two versions, a static memory one, and a dynamic one.
The static memory version is much faster, which shows how much more the compiler can optimize this kind of code when the array sizes are fixed at compile time.
The same performance advantage of static memory can be seen in my wirun2 program too:
http://www.fantascienza.net/leonardo/js/wirun2.c
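The difference between the two styles can be sketched as follows. This is only an illustration, not code taken from the benchmark: the sizes and the names `p_static`, `Grid` and `AT` are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical sizes, chosen only for this sketch. */
#define NI 8
#define NJ 8
#define NK 8

/* Static version: the dimensions are compile-time constants, so the
   compiler knows every stride when it generates the index arithmetic. */
static float p_static[NI][NJ][NK];

void fill_static(float v)
{
    for (int i = 0; i < NI; i++)
        for (int j = 0; j < NJ; j++)
            for (int k = 0; k < NK; k++)
                p_static[i][j][k] = v;
}

float sum_static(void)
{
    float s = 0.0f;
    for (int i = 0; i < NI; i++)
        for (int j = 0; j < NJ; j++)
            for (int k = 0; k < NK; k++)
                s += p_static[i][j][k];
    return s;
}

/* Dynamic version: the sizes are known only at run time, so the data
   lives in one flat heap buffer and each index is computed from
   strides loaded from the struct. */
typedef struct {
    float *m;
    int ni, nj, nk;
} Grid;

#define AT(g, i, j, k) \
    ((g)->m[((size_t)(i) * (g)->nj + (j)) * (g)->nk + (k)])

/* Returns 0 on success, -1 if the allocation fails. */
int grid_alloc(Grid *g, int ni, int nj, int nk)
{
    g->m = malloc((size_t)ni * nj * nk * sizeof *g->m);
    if (g->m == NULL)
        return -1;
    g->ni = ni;
    g->nj = nj;
    g->nk = nk;
    return 0;
}

void grid_fill(Grid *g, float v)
{
    for (int i = 0; i < g->ni; i++)
        for (int j = 0; j < g->nj; j++)
            for (int k = 0; k < g->nk; k++)
                AT(g, i, j, k) = v;
}

float grid_sum(const Grid *g)
{
    float s = 0.0f;
    for (int i = 0; i < g->ni; i++)
        for (int j = 0; j < g->nj; j++)
            for (int k = 0; k < g->nk; k++)
                s += AT(g, i, j, k);
    return s;
}

void grid_free(Grid *g)
{
    free(g->m);
    g->m = NULL;
}
```

Both halves do the same work, but in the dynamic one the loop bounds and strides are ordinary variables read through a pointer. That probably makes it harder for the compiler to simplify the index computations or vectorize the loops, though I have not inspected the generated assembly to confirm it.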

The Himeno benchmarks by Dr. Ryutaro Himeno:
http://accc.riken.jp/HPC/HimenoBMT/index_e.html

You can find the C code here:
http://accc.riken.jp/HPC/HimenoBMT/program1_e.html


You can also find the code I have used on my site:
http://www.fantascienza.net/leonardo/js/himeno.zip

Contents of the "himeno.zip" file:
- himenoBMTxps.c: original C benchmark, with static memory.
- himenoBMTxps.d: direct translation of the C code with static memory to D.
- himenoBMTxpa.c: original C benchmark, with dynamic memory.
- himenoBMTxpa.d: direct translation of the C code with dynamic memory to D.
- himenoBMTxpa2.d: D code, dynamic memory, a little closer to D style.
- himenoBMTxpa3.d: D code, dynamic memory, with true dynamic arrays, D style.

The results, in MFLOPS (higher is better):
  Static version:
    himenoBMTxps_dmd:   700
    himenoBMTxps_gcc:   935
    himenoBMTxps_llvm: 1248 

  Dynamic version:
    himenoBMTxpa_dmd:   174
    himenoBMTxpa_gcc:   146
    himenoBMTxpa_llvm:   96

  Modified dynamic version:
    himenoBMTxpa2_dmd:  173
    himenoBMTxpa3_dmd:  241

LLVM-gcc used with:
-O3 -s -fomit-frame-pointer -msse3 -march=core2

GCC used with:
-O3 -s -fomit-frame-pointer -msse3

DMD used with:
-O -release -inline

CPU used: Intel Core2 at 2 GHz, 2 GB RAM.

The results are quite curious. In the dynamic memory benchmark DMD seems to beat both GCC and LLVM-gcc, and it handles true D dynamic arrays (himenoBMTxpa3.d) better than the C-style translation (himenoBMTxpa.d). Maybe with the vector of the C++ STL the performance (for the dynamic memory benchmark) could be even higher, but I haven't tested this yet.
Now I'd like to see how LDC does with the "himenoBMTxps.d" code.
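To put a number on the gap between the static and dynamic versions, here is a small sketch that computes the static/dynamic ratio for each compiler from the MFLOPS figures above. The struct and function names are just illustrative.

```c
#include <stdio.h>

/* MFLOPS figures copied from the results above
   (himenoBMTxps for static, himenoBMTxpa for dynamic). */
typedef struct {
    const char *compiler;
    double static_mflops;
    double dynamic_mflops;
} Result;

static const Result results[] = {
    {"dmd",       700.0, 174.0},
    {"gcc",       935.0, 146.0},
    {"llvm-gcc", 1248.0,  96.0},
};

enum { NRESULTS = sizeof results / sizeof results[0] };

/* How many times faster the static version is than the dynamic one. */
double static_over_dynamic(const Result *r)
{
    return r->static_mflops / r->dynamic_mflops;
}

void print_ratios(void)
{
    for (int i = 0; i < NRESULTS; i++)
        printf("%-8s %5.1fx\n", results[i].compiler,
               static_over_dynamic(&results[i]));
}
```

Calling `print_ratios()` gives roughly 4.0x for DMD, 6.4x for GCC and 13.0x for LLVM-gcc, so the compiler that is fastest on the static code loses the most when the memory becomes dynamic.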


