| leonardo ( @ 2009-02-07 21:45:00 |
Multiprecision integers - part 2
This time I benchmark small sums among multiprecision integers (MPI). This kind of operation is interesting because it shows how well such multiprecision integers can replace normal integer operations, to avoid integer overflows but keep running efficiency as much as possible. Among the ones I have tested the Python integral numbers seem the most fit for this purpose. Probably the libs I have tested aren't much fit for this. Probably the results for Python3 are worse (even when Psyco will be available for it).
This time I benchmark small sums among multiprecision integers (MPI). This kind of operation is interesting because it shows how well such multiprecision integers can replace normal integer operations, to avoid integer overflows but keep running efficiency as much as possible. Among the ones I have tested the Python integral numbers seem the most fit for this purpose. Probably the libs I have tested aren't much fit for this. Probably the results for Python3 are worse (even when Psyco will be available for it).
TIMINGS, N = 1_000_000 * 10, best of 3:
#program, seconds run:
#2 D: 0.027 s (LDC)
#1 Python: 0.34 s (+ Psyco)
#5 C: 1.04 s (+ GMP)
#3 Python: 11.7 s (+ Psyco + GMPY)
#4 D: 21.2 s (LDC + Tango)
#6 C++: (+ GMP)
#7 Java:
PROGRAM OUTPUT: 45000000
-----------------
UPDATE Feb 11 2009:
Indeed, the LLVM backend is able to optimize away the loops of
the D program #2, this is a version with printing:
import tango.stdc.stdio: printf;
void main() {
int n = 0;
for (int j; j < 1_000_000; j++)
for (int i; i < 10; i++)
n += i;
printf("%d\n", n);
}
Compiling with -O2 -inline -release, such extreme optimization
isn't performed, this is the asm generated:
subl $12, %esp
xorl %eax, %eax
movl %eax, %ecx
.align 16
.LBB1_1: # forcond
cmpl $999999, %eax
jg .LBB1_6 # endfor
.LBB1_2:
xorl %edx, %edx
.align 16
.LBB1_3: # forcond2
cmpl $9, %edx
jg .LBB1_5 # forinc
.LBB1_4: # forbody3
addl %edx, %ecx
incl %edx
jmp .LBB1_3 # forcond2
.LBB1_5: # forinc
incl %eax
jmp .LBB1_1 # forcond
.LBB1_6: # endfor
movl %ecx, 4(%esp)
movl $.str1, (%esp)
call printf
But the total running time is minimal anway, 0.027 s or less, such
timign is lost in noise.
In all the other programs the compilers/interpreters can't optimize
away the loops, mostly because they use multiprecision integers.
-----------------
PROGRAMS:
# program #1, Python with Psyco
import psyco
def main():
n = 0
for j in xrange(1000000):
for i in xrange(10):
n += i
return n
psyco.bind(main)
main()
// program #2, D
void main() {
int n = 0;
for (int j; j < 1_000_000; j++)
for (int i; i < 10; i++)
n += i;
}
# program #3, Python with Psyco, gmpy
import psyco
from gmpy import mpz
def main():
n = mpz(0)
for j in xrange(1000000):
for i in xrange(10):
n += i
return n
psyco.bind(main)
main()
// program #4, D with BigInt
import tango.math.BigInt: BigInt;
void main() {
BigInt n = "0";
for (int j; j < 1_000_000; j++)
for (int i; i < 10; i++)
n += i;
}
// program #5, C with GMP
#include "stdio.h"
#include "gmp.h"
int main(void) {
mpz_t n;
int i, j;
mpz_init_set_si(n, 0);
for (j = 0; j < 1000000; j++)
for (i = 0; i < 10; i++)
mpz_add_ui(n, n, i);
mpz_out_str(0, 10, n);
putchar('\n');
return 0;
}
// program #6, C++ with GMP (not tested)
#include <iostream>
#include <gmpxx.h>
int main() {
mpz_class n;
int i, j;
n = 0;
for (j = 0; j < 1000000; j++)
for (i = 0; i < 10; i++)
n += i;
std::cout << n << "\n";
return 0;
}
// program #7, Java (not tested)
import java.math.BigInteger;
public class bigint_small_sum_7 {
public static void main(String[] args) {
BigInteger n = BigInteger.ZERO;
for (int j = 0; j < 1000000; j++)
for (int i = 0; i < 10; i++)
n = n.add(BigInteger.valueOf(i));
}
}
-----------------
CPU: Pentium3 @ 500 MHz, 256 MB RAM
OS: Ubuntu 8.10.
Timings are "real" seconds found with the 'time' command
(the best from 3+ consecutive runs).
VM/compiler/libs used:
- Python V.2.5.2
- Psyco V.1.6.0
- GMPY V.1.02
- GMP V.4.2.4
- gcc V.4.3.2 (Ubuntu 4.3.2-1ubuntu12)
- ldc rev.939, on dmd v1.039 + llvm 2.4 (Feb 4 2009)
- java 1.6.0_10
Code compiled with:
ldc -O3 -release -inline bigint_small_sum_2.d
ldc -O3 -release -inline bigint_small_sum_4.d
gcc -O3 -s bigint_small_sum_5.c -o bigint_small_sum_5 -lgmp
g++ bigint_small_sum_6.cpp -o bigint_small_sum_6 -lgmpxx -lgmp
javac bigint_small_sum_7.java