Tuesday, July 05, 2011

Fun with Assembly Language

This is a quick page about my adventures with Linux assembly language, with links to resources, and some source code.

Why?

I won't fool you into thinking that assembly language is a useful tool: it isn't. By and large, if you want to get stuff done, you're much better off learning Python, C or Java. Assembly language is great if you're a tinkerer and want to know what is going on inside your computer. Many people have written interesting tiny programs in assembly, which can be handy in embedded environments. Finally, there are some things you can only get through assembly: access to low-level architecture features, ways around compiler limitations, and a better understanding of how languages themselves work. For me, assembly is just for fun!

Resources

I do all my programming on Linux, and it is one of the best environments to learn assembly. The tools are quite good, and there is some lovely documentation. If you don't already have Linux, get a copy of Knoppix, which is a live CD that lets you try out Linux without modifying your computer setup. Here is a list of books that I highly recommend.
  1. [PGU] - Programming From the Ground Up, by Jonathan Bartlett: an excellent introduction to x86 assembly language on Linux. The entire book is free for download. This one book is all you need, initially.
  2. [PAL] - Professional Assembly Language: another excellent introduction.
  3. [IPM] - Intel Programming Manuals: complete, in-depth information on architecture and assembly language. You're probably interested in Vol 1, 2A, 2B, 3A and 3B. They are meant as a reference, not a cover-to-cover read.
  4. [APM] - AMD x86-64 Programming Manuals: complete, in-depth information from AMD. Get all the volumes.

Fun with Assembly

So you got yourself a copy of [PGU], and want a challenge?
  1. We try to rewrite the first program from [PGU] as 01_exitValue.s. The expected return value is 999, but you get something else.
    1. Can you guess what the return value is without running the program?
    2. Can you explain why the return value is not 999?
    3. Can you write the program so that this problem can be caught by the assembler?
    4. 01_exitValue_Solution.s.
  2. We try to rewrite the second program from [PGU], using our understanding from the earlier problem, as 02_maximumValue.s. The expected return value is 214, but you get something else.
    1. Can you guess what the return value is without running the program?
    2. Can you explain why the return value is not 214?
    3. Can you write the program so that this problem can be caught by the assembler?
    4. 02_maximumValue_Solution.s.
  3. Here's an example for why you might need assembly language. Your mission is breaking into a program to steal a secret key. Some experienced hackers have isolated where the secret key is being passed to a secret function.
    1. secretFunction(char *useless, char *secret_key) calls validate() immediately upon starting. Your job is to print secret_key from inside validate(). As an example, the file 03_keyIsHere.o contains a secret function that takes the key as its second argument. You are only allowed to write a validate() function in a separate .s or .c file. Do not modify the original object file. You are assured that the key is exactly 13 characters long. Try writing an assembly solution, then compile and run it with gcc 03_keyIsHere_Solution.s 03_keyIsHere.o -o 03_keyIsHere; ./03_keyIsHere
    2. If validate() were called at the end of secretFunction instead, how would that change your solution?
    3. Can you use this trick to guess the local variables of secretFunction?
    4. In case the object file doesn't work for you, 03_keyIsHere.c contains the C source. The solution must not modify it.
    5. 03_keyIsHere_Solution.s. Compile and run with gcc 03_keyIsHere_Solution.s 03_keyIsHere.c -o 03_keyIsHere; ./03_keyIsHere
  4. This is a more sophisticated example than the previous one. Having moved on in your career, you are faced with a new program that calls printf, and you have to inject code without recompiling it. The program 04_crackMe executes happily on its own. You know that printf is called in it, and that on the first invocation, the calling function has a secret key as its second parameter.
    1. secretFunction(char *useless, char *secret_key) calls printf() immediately upon starting. Your job is to print secret_key. Most probably, you'll want to inject code into printf. The solution does not require modifying the original binary. You are assured that the key is exactly 13 characters long.
    2. Can you do this while continuing to print the original messages?
    3. In case the binary doesn't work for you, 04_crackMe.c contains the C source. Compile and run with gcc 04_crackMe.c -o 04_crackMe; ./04_crackMe. Remember that the solution does not require modifying the 04_crackMe binary or the C source code.
    4. 04_crackMe_Solution.s. Instructions on how to run it are inside the solution.
    5. Sufficiently pleased? Good, now take it to the next level by visiting this awesome page on making 13 equal to 17.