Help File:Basic assembler

Originally posted by Dark Byte + addition by Smidge204

Most people think assembler is very difficult, but in fact it's very easy. In this tutorial I'll try to explain how some basic assembler works.

The processor works with memory and registers. Registers are like memory, but a lot faster. The registers are EAX, EBX, ECX, EDX, ESP, EBP, ESI, EDI, and the segment registers. (There's also EIP, which is the Instruction Pointer. It points to the instruction that is about to be executed.)

Some examples

 sub ebx,eax  // (ebx=00000005,eax=00000002)

Let's take it apart into its most basic elements:

opcode param1,param2

The opcode is the instruction telling the processor what to do, in this case decrease the value stored in register ebx by the value stored in register eax.

In this case ebx=5 and eax=2, so ebx would be 3 after this instruction (5-2).
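If it helps, here is a rough Python model of that subtraction. It is only a sketch: the name `sub32` is made up for this example, and the mask just mimics the fact that a 32-bit register wraps around.

```python
MASK32 = 0xFFFFFFFF  # a 32-bit register can only hold 32 bits

def sub32(dest, src):
    """Model of 'sub dest,src' for 32-bit registers: dest = dest - src."""
    return (dest - src) & MASK32

ebx = 0x00000005
eax = 0x00000002
ebx = sub32(ebx, eax)   # sub ebx,eax
print(hex(ebx))         # 0x3
```

Note that eax is left untouched; only the first parameter receives the result.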

Also note that whenever you see an opcode with 2 parameters, the first parameter is the target of the instruction and the 2nd is the source.

 sub [esi+13],ebx  // (ebx=00000003,esi=008AB100)

In this case you see that the first parameter is between brackets. This indicates that a memory location is being used instead of a register. The memory location is pointed at by what's in between the brackets, in this case esi+13. (Note that the 13 is in hexadecimal.)

ESI=008AB100 so the address pointed at is 008AB113.

This instruction would decrease the value stored at location 008AB113 by the value stored in ebx (which is 3).

If the value at location 008AB113 was 100 (decimal), then the value stored at 008AB113 after this instruction would be 97.
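The address calculation and the memory update can be sketched in Python like this. The dictionary standing in for memory is an assumption made for the example; a real process has a flat address space, not a dict.

```python
# Hypothetical memory contents: the value 100 (decimal) lives at 008AB113.
memory = {0x008AB113: 100}

esi = 0x008AB100
ebx = 0x00000003

address = esi + 0x13    # the 13 in [esi+13] is hexadecimal
memory[address] = (memory[address] - ebx) & 0xFFFFFFFF   # sub [esi+13],ebx

print(hex(address))     # 0x8ab113
print(memory[address])  # 97
```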

 sub [esi+13],63  // (esi=008AB100)

This is almost the same as above, but instead of using a register it uses a direct (immediate) value.

Note that 63 is actually 99 in decimal, because the values in an instruction are always written in hexadecimal.

Let's say the value at 008AB113 is 100 (which is 64 in hexadecimal); then the value at 008AB113 after execution would be 1 (100-99).
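If you want to double check the hexadecimal conversions yourself, a couple of lines of Python will do. This is just a convenience sketch; the variable names are made up.

```python
immediate = int("63", 16)   # the 63 from 'sub [esi+13],63' is hexadecimal
print(immediate)            # 99

value = 100                 # decimal value stored at 008AB113
print(hex(value))           # 0x64
print(value - immediate)    # 1
```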

 sub ebx,[esi+13]  // (ebx=00000064 esi=008ab100)

This instruction decreases the value stored in ebx by the value stored at location 008ab113. (esi+13=008ab100+13=008ab113, in case you forgot.)

Up until now I've only used SUB as the instruction, but there are lots and lots of other instructions the processor knows.

Let's take a look at MOV, one of the most often used instructions. Although its name suggests that it moves data, it just COPIES data from one spot to another.

MOV works exactly the same as SUB. The first parameter is the destination, and the second parameter is the source.

Examples:

 MOV eax,ebx  // eax=5,ebx=12

Copies the value stored in ebx into eax.

So, after this instruction has been executed, eax would be 12 (and ebx would stay 12).

 MOV [edi+16],eax  // (eax=00000064, edi=008cd200)

This instruction will place the value of eax (64 hex = 100 decimal) at the location edi+16 (008cd200+16=008cd216).

So after the instruction the value stored at 008cd216 will be 100 (64 hex).

As you see, it works just like the SUB instruction.
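Here is a small Python sketch of both MOV examples, to show that the source keeps its value. The `regs` and `memory` dictionaries are stand-ins invented for this example.

```python
regs = {"eax": 5, "ebx": 12, "edi": 0x008CD200}
memory = {}  # address -> 32-bit value

# mov eax,ebx : the source is copied, not moved
regs["eax"] = regs["ebx"]
print(regs["eax"], regs["ebx"])        # 12 12

# mov [edi+16],eax  with eax=00000064
regs["eax"] = 0x64
address = regs["edi"] + 0x16           # the 16 is hexadecimal
memory[address] = regs["eax"]
print(hex(address), memory[address])   # 0x8cd216 100
```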

Then there are also those instructions that only have 1 parameter like inc and dec.

Example:

 inc eax  // Increases the value of eax by 1.
 dec ecx  // Decreases the value of ecx by 1.
 dec [ebp]  // Decreases the value stored at the address pointed to by ebp by 1.

So far I've only shown the 32-bit registers (eax, ebx, ecx, ...), but there are also 16-bit registers and 8-bit registers that can be used.

The 16-bit registers are:

AX
BX
CX
DX
SP
BP
SI
DI

The 8-bit registers are:

AH
AL
BH
BL
CH
CL
DH
DL

Note that when you change AH or AL you also change AX (AH is the high byte of AX and AL is the low byte), and when you change AX you also change EAX (AX is the low 16 bits of EAX). The same goes for BL+BH+BX+EBX, CH+CL+CX+ECX, and DH+DL+DX+EDX.
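The way the small registers overlap the big one can be sketched with a few bit masks in Python. The helper names `set_al`, `set_ah` and `set_ax` are made up for this example; they only model which bits of EAX each write touches.

```python
def set_al(eax, value):
    """Write the low byte (bits 0-7) of EAX."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)

def set_ah(eax, value):
    """Write the second byte (bits 8-15) of EAX."""
    return (eax & 0xFFFF00FF) | ((value & 0xFF) << 8)

def set_ax(eax, value):
    """Write the low 16 bits (bits 0-15) of EAX."""
    return (eax & 0xFFFF0000) | (value & 0xFFFF)

eax = 0x12345678
eax = set_al(eax, 0xFF)
print(hex(eax))   # 0x123456ff
eax = set_ah(eax, 0x00)
print(hex(eax))   # 0x123400ff
eax = set_ax(eax, 0xABCD)
print(hex(eax))   # 0x1234abcd
```

In every step the upper 16 bits (1234) stay as they were, because no 8-bit or 16-bit name covers them.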

You can use them with almost the same instructions as the 32-bit registers, but they will only change 1 byte (8-bit) or 2 bytes (16-bit) instead of 4 bytes (32-bit).

Example:

 dec al  // Decreases the 8-bit register al by 1.
 sub [esi+12],al  // Decreases the 1-byte value stored at the location esi+12 points at by the value of al.
 mov al,[esi+13]  // Places the 1-byte value stored at the location esi+13 points at in the al register.

Note that you cannot use an 8-bit register to build the address between the brackets, and in 32-bit code the same goes in practice for 16-bit registers.

e.g.: mov [al+12],0 will NOT work.

There are also 64-bit and 128-bit registers, but I won't discuss them since they are hardly ever used and can't be used with the instructions that work with the 32-bit registers.

Then there are the JUMPS, LOOPS, and CALLS

JMP:

The JMP instruction is the easiest: it changes the Instruction Pointer (EIP) to the location the JMP instruction points at and continues from there.

There are also conditional jumps that will only change the instruction pointer if a special condition has been met (for example, a condition set by the compare instruction, CMP).

JA: Jump if Above.
JNA: Jump if not above.
JB: Jump if below.
JE: Jump if equal.
JC: Jump if carry.

And LOTS of other conditional jumps.
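To see how CMP and the conditional jumps fit together, here is a rough Python model. It only tracks the zero flag (ZF) and the carry flag (CF), which is all the five jumps listed above look at; the function names `cmp32` and `taken` are invented for this sketch.

```python
def cmp32(a, b):
    """Model of 'cmp a,b': computes a-b and keeps only the flags."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    return {"ZF": a == b,   # result of a-b is zero
            "CF": a < b}    # unsigned borrow

def taken(jump, flags):
    """Return True if the given conditional jump would be taken."""
    conditions = {
        "ja":  not flags["CF"] and not flags["ZF"],  # above (unsigned)
        "jna": flags["CF"] or flags["ZF"],           # not above
        "jb":  flags["CF"],                          # below (unsigned)
        "je":  flags["ZF"],                          # equal
        "jc":  flags["CF"],                          # carry
    }
    return conditions[jump]

flags = cmp32(5, 2)
print(taken("ja", flags), taken("jb", flags), taken("je", flags))  # True False False
```

"Above" and "below" compare the values as unsigned numbers, which is why they depend on the carry flag.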

LOOP:

The LOOP instruction points to a memory location just like JMP does, but it first decreases ECX by 1 and only jumps to that location if ECX is not 0 afterwards.

And of course, there are also special conditional loops:

LOOPE: Loop while ECX is not 0 AND the zero flag is set.
LOOPZ: Same as LOOPE.
LOOPNE: Loop while ECX is not 0 AND the zero flag is not set.
LOOPNZ: Same as LOOPNE.
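Here is a rough Python model of the LOOP family, following the documented x86 behaviour: ECX is decreased by 1 first, then the jump is taken if ECX is not 0 (and, for LOOPE/LOOPZ, the zero flag is set; for LOOPNE/LOOPNZ, the zero flag is clear). The function name `loop_taken` is made up for this sketch.

```python
def loop_taken(ecx, kind="loop", zf=False):
    """Return (new_ecx, jump_taken) for loop/loope/loopz/loopne/loopnz."""
    ecx = (ecx - 1) & 0xFFFFFFFF          # ECX is decreased first
    if kind == "loop":
        jump = ecx != 0
    elif kind in ("loope", "loopz"):
        jump = ecx != 0 and zf            # needs the zero flag set
    elif kind in ("loopne", "loopnz"):
        jump = ecx != 0 and not zf        # needs the zero flag clear
    else:
        raise ValueError(kind)
    return ecx, jump

# Count how many times a loop body runs when ECX starts at 3.
ecx, iterations = 3, 0
while True:
    iterations += 1                       # the loop body
    ecx, jump = loop_taken(ecx)           # the LOOP at the end of the body
    if not jump:
        break
print(iterations)   # 3
```

One thing the model makes visible: starting with ECX=0 does not skip the loop, because the decrease wraps around to FFFFFFFF.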

I guess I should also explain what flags are. They are bits in the processor that can be used to check the outcome of a previous instruction. For example, after 'cmp al,12': if al=12 then the zero flag (ZF) will be set to true, else the zero flag (ZF) will be set to false.

CALL:

CALL is the same as JMP, except that it first pushes the location of the next instruction onto the stack so that execution can come back later.

Explanation of the stack:

The stack is a region of memory pointed at by the ESP register. You can put values on it using the PUSH instruction, and take them off using the POP instruction. PUSH decreases the ESP register (by 4 for a 32-bit value) and places the value at the location ESP now points at. POP copies the value pointed at by ESP into the location given by the parameter of POP and then increases ESP. In short: the last thing you push onto the stack will be the first thing you pop from it, and the second-to-last item in will be the second item out.
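The stack behaviour can be sketched in Python as follows. The class `Stack32` and the starting ESP value are assumptions made for this example; the point is only that pushing moves ESP down by 4 and popping moves it back up.

```python
class Stack32:
    """Tiny model of a 32-bit stack that grows towards lower addresses."""

    def __init__(self, esp):
        self.esp = esp
        self.memory = {}   # address -> 32-bit value

    def push(self, value):
        self.esp -= 4                          # ESP goes down first
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.memory[self.esp]          # read what ESP points at
        self.esp += 4                          # then ESP goes back up
        return value

stack = Stack32(esp=0x0012FF00)   # hypothetical starting value of ESP
stack.push(1)
stack.push(2)
stack.push(3)
print(hex(stack.esp))                          # 0x12fef4
print(stack.pop(), stack.pop(), stack.pop())   # 3 2 1
print(hex(stack.esp))                          # 0x12ff00
```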

RET:

After CALL has pushed the location of the next instruction onto the stack, it jumps to the location it was told to call (it sets the instruction pointer to that location).

After a while the processor will encounter a RET instruction, and will then jump to the location that is stored on the stack. (CALL pushed the location onto the stack; RET pops it off again and jumps to it.)
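Put together, CALL and RET can be modelled roughly like this in Python. The addresses are invented for the example, and the instruction length of 5 bytes is an assumption that matches the common near CALL with a 32-bit offset; other forms of CALL have other lengths.

```python
def call(state, target, instruction_length=5):
    """Model of 'call target': push the return address, then jump."""
    return_address = state["eip"] + instruction_length
    state["esp"] -= 4
    state["memory"][state["esp"]] = return_address
    state["eip"] = target

def ret(state):
    """Model of 'ret': pop the return address into EIP."""
    state["eip"] = state["memory"][state["esp"]]
    state["esp"] += 4

# Hypothetical starting state: a CALL instruction located at 00401000.
state = {"eip": 0x00401000, "esp": 0x0012FF00, "memory": {}}

call(state, 0x00402000)
print(hex(state["eip"]))   # 0x402000

ret(state)
print(hex(state["eip"]))   # 0x401005
print(hex(state["esp"]))   # 0x12ff00
```

After the RET, execution continues at the instruction right behind the CALL, and ESP is back where it started.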

And that's the tutorial on the basics of assembler. If you have questions about assembler and stuff, just ask and I'll try to answer.

Nice file to check out if you want more info:

(Note: the link below has been reported as bad.)

podgoretsky.com/ftp/Docs/Hardware/Processors/Intel/24547111.pdf

Added

Note: It's really useful to understand how those values between brackets work, because then you can make the most use of the pointer stuff in CE 4.1. (It will remove the Dynamic Memory Allocation problem for most games, if you know how to look at the assembler code that accesses the values you found.)

The "flags" are a set of bits stored in a special register. If the bit is "1" the flag is said to be "set", and if it's "0" then the flag is said to be "clear". Collectively, the flags tell you all about the processor's internal status and give more information about the results of previous instructions.

There are three types of flags: Status flags that tell you about the results of the last instruction, Control flags that tell you how the processor will behave, and System flags that tell you about the environment your program is executing in.

The flag register is 32 bits: (S=Status flag, C=Control flag, X=System flag)

Code:

 Bit    Type  Flag
 0      S     Carry
 1            (Reserved)
 2      S     Parity
 3            (Reserved)
 4      S     Auxiliary Carry
 5            (Reserved)
 6      S     Zero
 7      S     Sign
 8      X     Trap
 9      X     Interrupt Enable
 10     C     Direction
 11     S     Overflow
 12-13  X     I/O Privilege
 14     X     Nested Task
 15           (Reserved)
 16     X     Resume
 17     X     Virtual 8086
 18     X     Alignment Check
 19     X     Virtual Interrupt
 20     X     Virtual Interrupt Pending
 21     X     Identification
 22-31        (Reserved)
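If you have the value of the flag register (for example from a debugger), the individual flags can be picked out by bit number. Here is a small Python sketch of that; the short names (CF, PF, AF, ...) are the usual abbreviations for the flags listed above, and 0x246 is just a sample value chosen for the example.

```python
# Bit numbers of some of the flags in the flag register.
FLAG_BITS = {
    "CF": 0,   # Carry
    "PF": 2,   # Parity
    "AF": 4,   # Auxiliary Carry
    "ZF": 6,   # Zero
    "SF": 7,   # Sign
    "TF": 8,   # Trap
    "IF": 9,   # Interrupt Enable
    "DF": 10,  # Direction
    "OF": 11,  # Overflow
}

def decode_flags(eflags):
    """Return a dict telling which of the flags above are set."""
    return {name: bool((eflags >> bit) & 1) for name, bit in FLAG_BITS.items()}

flags = decode_flags(0x246)   # sample value
print([name for name, is_set in flags.items() if is_set])   # ['PF', 'ZF', 'IF']
```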

Let's go over the status flags, since those are used most often.

Overflow:

When an operation (addition, subtraction, multiplication, etc.) on signed values produces a result that is too big or too small to fit in the register (or memory location) used, the Overflow flag is set. (If not, it's cleared automatically.) For example, if you're using a 16-bit register and your operation produces a signed value that won't fit in 16 bits, the overflow flag is set.

Sign:

Set if the result is negative, cleared if positive. This is a copy of the MSB (most significant bit) of the result.

Zero:

Set if the result is 0, cleared otherwise.

Auxiliary Carry:

Similar to Carry, but it treats the register/memory location as 4 bits wide instead of 8, 16 or 32 (it is set when there is a carry out of bit 3). This is used for BCD (binary coded decimal) stuff and is generally pretty useless otherwise.

Carry:

The carry flag is set if the bit one past the limit of the register/memory location would have been set. For example, mov al, 0xFF then add al, 1 will cause a carry because the 9th bit would have been set. Also note that the zero and auxiliary carry flags would be set too, while the sign and overflow flags would be cleared (as signed numbers this is -1 + 1 = 0, which fits fine).
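A rough Python model of an 8-bit ADD makes it easy to check which status flags a given addition produces. This is only a sketch: the function name `add8` is invented, and the parity flag is left out to keep it short.

```python
def add8(a, b):
    """Model of an 8-bit 'add a,b'. Returns (result, flags)."""
    a &= 0xFF
    b &= 0xFF
    full = a + b                 # result before it is cut down to 8 bits
    result = full & 0xFF
    flags = {
        "CF": full > 0xFF,                          # the 9th bit would be set
        "ZF": result == 0,
        "SF": bool(result & 0x80),                  # copy of the top bit
        "AF": (a & 0x0F) + (b & 0x0F) > 0x0F,       # carry out of bit 3
        # Signed overflow: both inputs have the same sign bit,
        # but the result has a different one.
        "OF": bool(~(a ^ b) & (a ^ result) & 0x80),
    }
    return result, flags

result, flags = add8(0xFF, 1)    # mov al,0xFF then add al,1
print(result)                                # 0
print(flags["CF"], flags["ZF"])              # True True
print(flags["SF"], flags["OF"])              # False False
```

For comparison, `add8(0x7F, 1)` gives 0x80 with the overflow and sign flags set and the carry flag clear, since 127 + 1 does not fit in a signed byte.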
