Friday, 18th June, 2010

x86_64 Assembler, Part 3

In this part, I look at my asm code versus other languages, with a specific emphasis on C. The main idea is to see if a novice asm coder can hold a candle to a decent compiler (gcc 4.3.3, in this particular case).

First, I looked at my Problem 5 asm implementation.

Obviously, it wouldn’t be fair if I had started messing around with which algorithm I used, so I stuck with Euler’s original description of the GCD algorithm (based on subtraction, rather than the more modern division variation). Basically, I tried to mimic the algorithm in C as much as possible.

#include <stdio.h>
	
int lcm(int, int);
int gcd(int, int);
	
int main() {
   int acc = 1;
   int i;
   for(i = 2; i < 20; i++) {
      acc = lcm(i, acc);
   }
   printf("%d\n", acc); // I did use a system call in the asm version.
   return 0;
}
	
int lcm(int a, int b) {
    // An implementation of lcm, using the identity a*b = lcm(a,b)*gcd(a,b)
    return (a*b)/(gcd(a,b));
}
	
int gcd(int a, int b) {
    // An implementation of gcd based on Euler's algorithm.
    if (a == 0) {
        return b;
    }
    while (b != 0) {
        if (a > b) {
            a = a - b;
        }
        else {
            b = b - a;
        }
    }
    return a;
}

And the results, euler5c is the C version, whereas euler5 is my asm version.

	
dst502@milan:~/Code/asm$ time ./euler5 
XXXXXXXXX
	
real	0m0.026s
user	0m0.024s
sys	0m0.001s
dst502@milan:~/Code/asm$ time ./euler5c
XXXXXXXXX
	
real	0m0.007s
user	0m0.006s
sys	0m0.000s

Out of somewhere, there’s almost a 4x speedup! But that’s not all.

dst502@milan:~/Code/asm$ ls -lah euler5c euler5
-rwxr-xr-x 1 dst502 students 2.2K 2010-06-18 10:17 euler5*
-rwxr-xr-x 1 dst502 students 4.5K 2010-06-18 10:17 euler5c*

My binary is smaller (Both binaries have been stripped), but slower. What’s going on here. For brevity, I won’t put gcc’s asm output here, but for the C version, there’s a lot of setup, what appear to be type definitions, etc. Whereas with my code I simply jump in at the deep end.

Now to compare it with some another compiled language, Haskell.

First, Haskell:

myLcm :: Integer -> Integer -> Integer
myLcm a b = (a*b) `div` (myGcd a b)
	
myGcd :: Integer -> Integer -> Integer
myGcd a 0 = a
myGcd a b
  | (b > a) = gcd (a - b) b
  | (b < a) = gcd a (b - a)
  | (b == a) = a
	
main = print $ foldr (myLcm) 1 [2..20]

And it’s time after compilation (and size):

dst502@milan:~/Code/asm$ ls -lah euler5hs
-rwxr-xr-x 1 dst502 students 467K 2010-07-04 13:28 euler5hs*
dst502@milan:~/Code/asm$ time ./euler5hs
XXXXXXXXX
	
real	0m0.003s
user	0m0.000s
sys	0m0.003s

It’s HUGE! Incidentally, it’s consistently faster than the C implementation on this box. I just find it a little odd, and I have no real explanation for it at hand, other than that GHC is awesome.

This blog is powered by FlatPress.