Saturday, 8th May, 2010

x86_64 Assembler, Part 1

Oh god. Oh god. WHY IS rcx BEING SET TO 0?!

This was effectively an exercise in getting to grips with x86_64 and gdb. Both powerful, both requiring years of use to really “know”.

Either way, I had a dabble, tackling both problem 5 and problem 6 in assembler. This blog post will cover problem 5.

Original Problem 5 Here

A real challenge was that I’d never had used an assembler, or a linker, so I had to get those working along side getting my code to work. I chose to use the Linux system calls for getting output from my program, since I don’t actually know how those work deep down, but they’re there.

For my assembler, I settled for nasm, and for linking, I left that to GCC. I used nasm because it was what I first came across (with regards to tutorials) and it seemed to be the most likely to “just work” without additional complexity.

So in total, I had to work out:

  • x86_64 Assembler
  • nasm
  • GCC linking stuff
  • GDB
  • Use of Linux system calls at a low level

To get both problem 5 and 6 working correctly took well over 3 days of on/off hacking.

I first set about writing a Greatest Common Divisor algorithm, using subtraction as opposed to division, because I’ve always been told that division is slow. Also, it seemed easier to implement a subtraction based algorithm in asm than a division one.

gcd:
   ; calculate the GCD of RAX and RBX. Leave the result in RAX
   push rbx
	
gcd_st:
   ; Start of the GCD loop
   cmp rax, rbx
   je gcd_ret   ; If a = b, return (a)
   jl gcd_lt    ; elif a < b, return gcd(a-b, b),
   jg gcd_gt    ; elif a > b, return gcd(b-a, a).
	
gcd_ret:
   pop rbx ; Put rbx back where it was :-p
   ret
	
gcd_lt:
   sub rbx, rax ; b = b - a
   jmp gcd_st
	
gcd_gt:
   sub rax, rbx ; a = a - b
   jmp gcd_st

Oddly enough, to actually get that written down on paper was surprisingly easy! After that, I just needed to use it for a Lowest Common Divisor function:

lcm:
  ; Calculate the lcm of rax and rbx, leaving the result in rax.
  push rbx
  push rcx
  push rdx
  push rax
  mul rbx   ; a = a * b
  mov rcx, rax ; c = a * b
  pop rax ; a = a
  call gcd  ; a = gcd ( a, b )
  xchg rax, rcx ; swap a*b and gcd(a, b)
  div rcx   ; a = ( a * b ) / gcd ( a, b ) 
  pop rdx
  pop rcx
  pop rbx
  ret

Quite simple really… note the pushes an pop’s at the beginning and end respectively. I wanted to save the state of the registers before entering the procedure. After that, it is a simple implementation of (a*b) / gcd(a,b).

That’s the bulk of the program right there, the rest of it is simply looping trough the numbers 1 … 20 and finding their gcd with an accumulator.

; Entry point
section .text
    global _start
	
; The printf format string, "%lld\n".
section .data
   msg db '%lld', 10, 0
	
; Get access to the two system calls we need.
extern printf
extern _exit
	
_start:
  mov rax, 1
  mov rbx, 2
  call loop_start
	
loop_start:
  ; Begin the calculation, note that rax is an accumulator.
  call lcm   ; rax = lcm ( rax, rbx )
  inc rbx    ; rbx++
  cmp rbx, 21  ; if rbx >= 21...
  je write_result     ; then exit
  jmp loop_start ; else continue looping
	
write_result:
   ; write the result to the screen
   mov rsi, rax
   mov rdi, msg
   mov rax, 0
   call printf
   jmp exit
      
exit:
  ; Call the exit system call.
  push  42
  call _exit

Ow! That took me the better part of a day and a half. Calling printf especially. Most of the tutorials out there are aimed at x86, not x86_64. This causes quite a lot of confusion. Especially when you use GCC & C to write some assembler that uses printf and it just uses 32-bit methods.

And for good measure, I put the lines to assemble and link it in a shell script:

nasm -f elf64 euler5.s
gcc -nostartfiles -o euler5 euler5.o -lc

And of course, let’s see how long it takes to run (With the python version I wrote much earlier for comparison):

dst502@milan:~/Code/asm$ time ./euler5
XXXXXXXXX
	
real	0m0.022s
user	0m0.020s
sys	0m0.002s
dst502@milan:~/Code/asm$ time python euler5.py
XXXXXXXXX
	
real	0m0.047s
user	0m0.039s
sys	0m0.007s

A massive thanks to those on ##proggit who helped me suss out nasm, gcc and x86_64 in general, GeDaMo, ryan[WIN], I’m looking at you :-p

Done! :-D