Help File:ASM Basics 1


Originally posted by DABhand

The Basics

Opcodes

OK, what are opcodes? An opcode is an instruction the processor can understand. For example:

SUB and ADD and DIV

The SUB instruction subtracts one number from another. Most opcodes have operands:

SUB destination,source like the following

SUB eax, ecx

SUB has 2 operands. In the case of a subtraction, a source and a destination. It subtracts the source value from the destination value and then stores the result in the destination. Operands can be of different types: registers, memory locations, immediate values.

So basically that instruction is this, say for example eax contained 20 and ecx contained 10

 eax = eax - ecx
 eax = 20 - 10
 eax = 10

Easy that bit huh



Registers

Ahhh, here is the main force of ASM. Registers contain values and information which a program uses to keep track of things. When you are new to ASM it does look messy, but the system is practically efficient. It honestly is.

Let's take a look at the main register used: eax. Say it contains the value FFEEDDCCh (the h means hexadecimal). When you work with SoftICE later you will see hex values a lot, so get used to them now.

OK, I'll show how the registers are constructed:

 EAX     FFEEDDCC
 AX      DDCC
 AH      DD
 AL      CC

ax, ah, al are part of eax. EAX is a 32-bit register (available only on 386+), ax contains the lower 16 bits (2 bytes) of eax, ah contains the high byte of ax, and al contains the low byte of ax. So ax is 16 bit, al and ah are 8 bit. So, in the example above, these are the values of the registers:

 eax = FFEEDDCC (32-bit)
 ax  = DDCC (16-bit)
 ah  = DD (8-bit)
 al  = CC (8-bit)
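If it helps to see the split as arithmetic, here is a minimal sketch in Python (not assembly, just a model of the register layout). The variable names simply mirror the register names, and the masks and shifts are an illustration of how ax, ah and al overlay eax.

```python
# Model of how AX, AH and AL overlay the low bits of EAX.
eax = 0xFFEEDDCC

ax = eax & 0xFFFF        # low 16 bits of eax
ah = (ax >> 8) & 0xFF    # high byte of ax
al = ax & 0xFF           # low byte of ax

print(f"eax={eax:08X} ax={ax:04X} ah={ah:02X} al={al:02X}")
# prints: eax=FFEEDDCC ax=DDCC ah=DD al=CC
```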

Understand? I know it's a lot to take in, but that's how registers work. Here are some more examples of opcodes and the registers used...

 mov eax, 002130DFh        //  mov loads a value into a register
 mov cl, ah                //  move the high byte of ax (30h) into cl
 sub cl, 10                //  subtract 10 (dec.) from the value in cl
 mov al, cl                //  and store it in the lowest byte of eax.

So at start..

eax = 002130DF

at end

eax = 00213026
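To follow the four instructions step by step, here is a small Python model of them. It is only a sketch: the masks stand in for the fact that cl and al are 8 bits wide, and the variable names are borrowed from the registers.

```python
eax = 0x002130DF                 # mov eax, 002130DFh
ah = (eax >> 8) & 0xFF           # ah is bits 8..15 of eax, here 30h
cl = ah                          # mov cl, ah
cl = (cl - 10) & 0xFF            # sub cl, 10  (30h - 0Ah = 26h, kept to 8 bits)
eax = (eax & 0xFFFFFF00) | cl    # mov al, cl  (only the low byte of eax changes)

print(f"{eax:08X}")              # prints: 00213026
```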

Did you follow what happened? I hope so, because I'm trying to make this as easy as I can.

OK, let's discuss the types of registers. There are 4 types mainly used (there are others, but I will tell you about them later).


General Purpose Registers

These 32-bit registers (and their 16-bit and 8-bit sub-registers) can be used for anything, but their main purpose is shown after them.

 eax (ax/ah/al) Accumulator
 ebx (bx/bh/bl) Base
 ecx (cx/ch/cl) Counter
 edx (dx/dh/dl) Data

As said, these are hardly used for their main purpose nowadays and are used to ferry information around within programs and games (such as scores, health values, etc.).


Segment Registers

Segment registers define the segment of memory that is used. You probably won't need them with win32asm, because Windows has a flat memory system. In DOS, memory is divided into segments of 64kb, so if you want to define a memory address, you specify a segment and an offset (like 0172:0500 (segment:offset)). In Windows, segments have sizes of 4 GB, so you won't need segments in Windows. Segment registers are always 16-bit registers.

 CS code segment
 DS data segment
 SS stack segment
 ES extra segment
 FS (only 386+) general purpose segment
 GS (only 386+) general purpose segment


Pointer Registers

Actually, you can use pointer registers as general purpose registers (except for eip), as long as you preserve their original values. Pointer registers are called pointer registers because they're often used for storing memory addresses. Some opcodes (such as movsb, scasb, etc.) use them.

 esi (si) Source index
 edi (di) Destination index
 eip (ip) Instruction pointer

EIP (or IP in 16-bit programs) contains a pointer to the instruction the processor is about to execute. So you can't use eip as a general purpose register.


Stack Registers

There are 2 stack registers: esp & ebp. ESP holds the current stack position in memory (more about this in one of the next tutorials). EBP is used in functions as pointer to the local variables.

 esp (sp) Stack pointer
 ebp (bp) Base pointer



MEMORY

How is memory used within ASM, and how is it laid out? Well, hopefully this will answer some questions. Bear in mind there are more advanced things than what is explained here, but hell, you lot aren't advanced, so start from the basics.

Lets look at the different types..


DOS

In 16-bit programs like for DOS (and Win 3.1), memory was divided in segments. These segments have sizes of 64kb. To access memory, a segment pointer and an offset pointer are needed. The segment pointer indicates which segment (section of 64kb) to use, the offset pointer indicates the place in the segment itself.

Take a look at this


 ----------------------------MEMORY--------------------------------
 |SEGMENT 1 (64kb)|SEGMENT 2 (64kb)|SEGMENT 3 (64kb)|etc...........|


Hope that shows well

Note that the following explanation is for 16-bit programs, more on 32-bit later (but don't skip this part, it is important to understand 32-bits).

The table above is the total memory, divided in segments of 64kb. There's a maximum of 65536 segments. Now take one of the segments:


 -------------------SEGMENT 1(64kb)----------------------
 |Offset 1|Offset 2|Offset 3|Offset 4|Offset 5|etc.......|


To point to a location in a segment, offsets are used. An offset is a location inside the segment. There's a maximum of 65536 offsets per segment. The notation of an address in memory is:

SEGMENT:OFFSET

For example:

0145:42A2 (all hex numbers, remember)

This means: segment 145, offset 42A2. To see what is at that address, you first go to segment 145, and then to offset 42A2 in that segment.
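The picture of separate 64kb blocks is a simplification. On a real-mode DOS machine the processor combines the two parts as segment * 16 + offset, which means neighbouring segments overlap. Here is a minimal sketch of that calculation in Python; the function name `real_mode_address` is made up for this example, and the sketch does not apply to protected mode, where segment registers hold selectors instead.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Linear address for a real-mode segment:offset pair (segment * 16 + offset)."""
    return (segment << 4) + offset

print(f"{real_mode_address(0x0145, 0x42A2):05X}")   # prints: 056F2
```

So 0145:42A2 and, for example, 0146:4292 name the same byte of memory.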

Hopefully you remembered to read about those Segment Registers a while ago on this thread.

 CS - Code segment
 DS - Data Segment
 SS - Stack Segment
 ES - Extra Segment
 FS - General Purpose
 GS - General Purpose <<< Them remember

The names explain their function: the code segment (CS) contains the number of the segment where the code that is currently being executed is. The data segment (DS) is the current segment to get data from. The stack segment (SS) indicates the segment of the stack (more on the stack later). ES, FS and GS are general purpose segment registers and can be used for any segment (not in win32 though).

Pointer registers most of the time hold an offset, but general purpose registers (ax, bx, cx, dx etc.) can also be used for this. IP (Pointer register) indicates the offset (in the CS (code segment)) of the instruction that is currently executed. SP (Stack register) holds the offset (in the SS (stack segment)) of the current stack position.

Phew, and you thought 16-bit memory was hard, huh?

Sorry if that's all confusing, but it's the easiest way to explain it. Reread it a few times and it will eventually sink into your brain how memory works and how it is accessed to be read and written to.

Now we move to


32-bit Windows

You have probably noticed that all this about segments really isn't fun. In 16-bit programming, segments are essential. Fortunately, this problem is solved in 32-bit Windows (9x and NT).

You still have segments, but you don't have to care about them because they aren't 64kb, but 4 GB. Windows will probably even crash if you try to change one of the segment registers.

This is called the flat memory model. There are only offsets, and they now are 32-bit, so in a range from 0 to 4,294,967,295. Every location in memory is indicated only by an offset.

This is really one of the best advantages of 32-bit over 16-bit. So you can forget the segment registers now and focus on the other registers.

Oh the madness of it all, wow, 4 gigabytes to work with



The Fun Part

The Fun Part begins!!!

Its

THE OPCODES

Here is a list of a few opcodes you will see a lot of when making trainers or cracking etc.


MOV

This instruction is used to move (or actually copy) a value from one place to another. This 'place' can be a register, a memory location or an immediate value (only as source value of course). The syntax of the mov instruction is:

 mov destination, source

You can move a value from one register to another (note that the instruction copies the value, in spite of its name 'move', to the destination).

 mov edx, ecx

The instruction above copies the contents of ecx to edx. The size of source and destination should be the same, this instruction for example is NOT valid:

 mov al, ecx ; NOT VALID

This opcode tries to put a DWORD (32-bit) value into a byte (8-bit). This can't be done by the mov instruction (there are other instructions to do this). But these instructions are allowed because source and destination don't differ in size, like for example...

 mov al, bl
 mov cl, dl
 mov cx, dx
 mov ecx, ebx

Memory locations are indicated with an offset (in win32, for more info see the previous page). You can also get a value from a certain memory location and put it in a register. Take the following table as example:

 offset   34 35 36 37 38 39 3A 3B 3C 3D 3E 3F 40 41 42
 data     0D 0A 50 32 44 57 25 7A 5E 72 EF 7D FF AD C7

(each block represents a byte)

The offset value is indicated as a byte here, but it is a 32-bit value. Take for example 3A (which isn't a common value for an offset, but otherwise the table won't fit...), this also is a 32-bit value: 0000003Ah. Just to save space, some unusual and low offsets are used. All values are hexcodes.

Look at offset 3A in the table above. The data at that offset is 25, 7A, 5E, 72, EF, etc. To put the value at offset 3A in, for example, a register you use the mov instruction, too:

 mov eax, dword ptr [0000003Ah]

...but you will see this more commonly in programs as

 mov eax, dword ptr [ecx+45h]

This means ecx+45h points to the memory location to take the 32-bit data from. We know it's 32-bit because of the dword in the instruction. To take, say, 16 bits of data we use WORD PTR, or for 8 bits BYTE PTR, like the following examples..

 mov cl, byte ptr [34h]    ; cl will get the value 0Dh (see table above)
 mov dx, word ptr [3Eh]    ; dx will get the value 7DEFh (see table above, remember that the bytes are reversed)
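The "bytes are reversed" remark is the little-endian byte order x86 uses: the byte at the lowest offset is the least significant one. As a rough sketch, the table can be modelled in Python like this. The names `BASE`, `data` and `read` are made up for this illustration, and `read` is only a stand-in for what a byte ptr, word ptr or dword ptr load returns.

```python
# The bytes from the table, starting at offset 34h.
BASE = 0x34
data = bytes.fromhex("0D 0A 50 32 44 57 25 7A 5E 72 EF 7D FF AD C7")

def read(offset: int, size: int) -> int:
    """Read `size` bytes at `offset`, lowest byte first (little-endian)."""
    start = offset - BASE
    return int.from_bytes(data[start:start + size], "little")

print(f"{read(0x34, 1):02X}")   # byte ptr [34h]  -> prints: 0D
print(f"{read(0x3E, 2):04X}")   # word ptr [3Eh]  -> prints: 7DEF
print(f"{read(0x3A, 4):08X}")   # dword ptr [3Ah] -> prints: 725E7A25
```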

The size sometimes isn't necessary:

 mov eax, [00403045h]

because eax is a 32-bit register, the assembler assumes (and this is the only way to do it, too) it should take a 32-bit value from memory location 403045.

Immediate numbers are also allowed:

 mov edx, 5006

This will just make the register edx contain the value 5006. The brackets, [ and ], are used to get a value from the memory location between the brackets; without brackets it is just a value. A register as memory location is allowed too (it should be a 32-bit register in 32-bit programs):

 mov eax, 403045h ; make eax have the value 403045 hex.
 mov cx, [eax] ; put the word size value at the memory location EAX (403045) into register CX.

In mov cx, [eax], the processor first looks at what value (= memory location) eax holds, then at what value is at that location in memory, and puts this word (16 bits because the destination, cx, is a 16-bit register) into CX.

Phew


ADD,SUB,MUL and DIV

These are easy to understand. Good old maths, I'm sure everyone can add and subtract and multiply and divide.

Anyways on with the info

The add-opcode has the following syntax:

add destination, source

The calculation performed is destination = destination + source. The following forms are allowed:

 Destination   Source            Example
 Register      Register          add ecx, edx
 Register      Memory            add ecx, dword ptr [104h] / add ecx, [edx]
 Register      Immediate value   add eax, 102
 Memory        Immediate value   add dword ptr [401231h], 80
 Memory        Register          add dword ptr [401231h], edx

This instruction is very simple. It just takes the source value, adds it to the destination value and then puts the result in the destination. Other mathematical instructions are:

 SUB destination, source (destination = destination - source)
 MUL source (edx:eax = eax * source)
 DIV source (eax = edx:eax / source, edx = remainder)

It's easy peasy, ain't it? Or is it?

Subtraction works the same as add. Multiplication is a little different: MUL takes only a source, multiplies it by eax (or by ax or al for smaller sources) and stores the result in edx:eax (dx:ax or ax). The signed version, IMUL, also has a two-operand form that does work like dest = dest * source. Division is a little different too. Because registers hold integer values (i.e. whole numbers, not floating point numbers), the result of a division is split in a quotient and a remainder. For example:

 28 / 6 --> quotient = 4, remainder = 4
 30 / 9 --> quotient = 3, remainder = 3
 97 / 10 --> quotient = 9, remainder = 7
 18 / 6 --> quotient = 3, remainder = 0

Now, depending on the size of the source, the quotient is stored in (a part of) eax, the remainder in (a part of) edx:

 Source size        Division             Quotient stored in   Remainder stored in
 BYTE (8-bits)      ax / source          AL                   AH
 WORD (16-bits)     dx:ax* / source      AX                   DX
 DWORD (32-bits)    edx:eax* / source    EAX                  EDX

 * = For example: if dx = 2030h and ax = 0040h, dx:ax = 20300040h. dx:ax is a dword value where dx represents the higher word and ax the lower. Edx:eax is a quadword value (64-bits) where the higher dword is edx and the lower eax.
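To make the dx:ax idea concrete, here is a rough Python model of an unsigned divide. The function name `div` and its parameters are invented for this sketch. `bits` is the size of the source (8, 16 or 32), `high` and `low` are the two halves of the dividend (ah and al, dx and ax, or edx and eax), and the overflow check mirrors the fact that the real instruction raises a divide error when the quotient does not fit.

```python
def div(high: int, low: int, source: int, bits: int) -> tuple[int, int]:
    """Model of DIV: returns (quotient, remainder) for the dividend high:low."""
    dividend = (high << bits) | low          # e.g. dx:ax glued into one value
    quotient, remainder = divmod(dividend, source)
    if quotient >= (1 << bits):              # quotient must fit in al / ax / eax
        raise OverflowError("quotient too large for the destination register")
    return quotient, remainder

# 16-bit source: dx = 2030h, ax = 0040h, so dx:ax = 20300040h
q, r = div(0x2030, 0x0040, 0x4000, 16)
print(f"ax={q:04X} dx={r:04X}")              # prints: ax=80C0 dx=0040

# 8-bit source: ax = 97, source = 10
print(div(0, 97, 10, 8))                     # prints: (9, 7)  -> al = 9, ah = 7
```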

The source of the div-opcode can be:

 an 8-bit register (al, ah, cl,...)
 a 16-bit register (ax, dx, ...)
 a 32-bit register (eax, edx, ecx...)
 an 8-bit memory value (byte ptr [xxxx])
 a 16-bit memory value (word ptr [xxxx])
 a 32-bit memory value (dword ptr [xxxx])

The source cannot be an immediate value: DIV has no form that takes one, so load the value into a register or memory location first.


BITWISE OPS

These instructions all take a destination and a source, except the 'NOT' instruction, which takes a single operand. Each bit in the destination is compared to the same bit in the source, and depending on the instruction, a 0 or a 1 is placed in the destination bit:

 Instruction      |  AND  |  OR   |  XOR  |NOT|
 Source Bit       |0 0 1 1|0 0 1 1|0 0 1 1|0 1|
 Destination Bit  |0 1 0 1|0 1 0 1|0 1 0 1|X X|
 Output Bit       |0 0 0 1|0 1 1 1|0 1 1 0|1 0|

AND sets the output bit to 1 if both the source and destination bits are 1. OR sets the output bit if either the source or destination bit is 1. XOR sets the output bit if the source bit is different from the destination bit. NOT inverts every bit of its operand.

An example:

 mov ax, 3406
 mov dx, 13EAh
 xor ax, dx

ax = 3406 (decimal), which is 0000110101001110 in binary.

dx = 13EA (hex), which is 0001001111101010 in binary.

Perform the XOR operation on these bits:

 Source         0001001111101010 (dx)
 Destination    0000110101001110 (ax)
 Output         0001111010100100 (new ax)

The new ax is 0001111010100100 (7844 decimal, 1EA4 in hex) after the instruction.
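A quick way to check an XOR like this is to let Python do it. This is just a sanity check of the example; the variable names mirror the registers, and the result lands in ax because ax is the destination operand.

```python
ax = 3406        # mov ax, 3406   (decimal)
dx = 0x13EA      # mov dx, 13EAh
ax ^= dx         # xor ax, dx     (result goes into the destination, ax)

print(f"{ax:016b} {ax:04X} {ax}")
# prints: 0001111010100100 1EA4 7844
```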

Another example:

 mov ecx, FFFF0000h
 not ecx

FFFF0000 is in binary 11111111111111110000000000000000 (16 1's, 16 0's)

If you take the inverse of every bit, you get:

 00000000000000001111111111111111 (16 0's, 16 1's), which is 0000FFFF in hex.

So after the NOT operation ecx is 0000FFFFh.
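The same check for NOT, as a small Python sketch. Python integers have no fixed width, so the mask 0xFFFFFFFF is added here to keep the result to 32 bits the way a real register would.

```python
ecx = 0xFFFF0000            # mov ecx, FFFF0000h
ecx = ~ecx & 0xFFFFFFFF     # not ecx  (mask keeps the result 32 bits wide)

print(f"{ecx:08X}")         # prints: 0000FFFF
```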

The last one is handy for serial generating, as is XOR. In fact XOR is used more for serials than any other instruction, and is widely used for serial checking in WinZip, WinRAR, EA Games, Vivendi Universal.

I WONT TELL YOU HOW TO MAKE KEYGENS SO DONT ASK :)

INC/DEC(REMENTS)

There are 2 very simple instructions, DEC and INC. These instructions increase or decrease a memory location or register by one. Simply put:

 inc reg -> reg = reg + 1
 dec reg -> reg = reg - 1
 inc dword ptr [103405] -> value at [103405] will increase by one.
 dec dword ptr [103405] -> value at [103405] will decrease by one.

Ahh easy one to understand So is the next one


NOP

This instruction does absolutely nothing. It just occupies space and time. It is used for filling purposes and for patching code.


BIT rotation and shifting

Note: Most of the examples below use 8-bit numbers, but this is just to make the picture clear.

Shifting functions

 SHL destination, count
 SHR destination, count

SHL and SHR shift a count number of bits in a register/memlocation left or right.

Example:

 ; al = 01011011 (binary) here
 shr al, 3

This means: shift all the bits of the al register 3 places to the right. So al will become 00001011. The bits on the left are filled up with zeroes and the bits on the right are shifted out. The last bit that is shifted out is saved in the carry flag. The carry bit is a bit in the processor's Flags register. This is not a register like eax or ecx that you can directly access (although there are opcodes to do this), but its contents depend on the result of the instruction. This will be explained later; the only thing you'll have to remember now is that the carry is a bit in the flag register and that it can be on or off. This bit equals the last bit shifted out.

shl is the same as shr, but shifts to the left.

 ; bl = 11100101 (binary) here
 shl bl, 2

bl is 10010100 (binary) after the instruction. The last two bits are filled up with zeroes, the carry bit is 1, because the bit that was last shifted out is a 1.
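Both shift examples can be modelled in a few lines of Python. This is a sketch for 8-bit values only; the function names `shr8` and `shl8` are made up here, and both assume a count from 1 to 8. Each returns the new value together with the carry bit.

```python
def shr8(value: int, count: int) -> tuple[int, int]:
    """Shift an 8-bit value right; carry is the last bit shifted out."""
    carry = (value >> (count - 1)) & 1
    return (value >> count) & 0xFF, carry

def shl8(value: int, count: int) -> tuple[int, int]:
    """Shift an 8-bit value left; carry is the last bit shifted out."""
    carry = (value >> (8 - count)) & 1
    return (value << count) & 0xFF, carry

al, carry = shr8(0b01011011, 3)
print(f"{al:08b} carry={carry}")   # prints: 00001011 carry=0

bl, carry = shl8(0b11100101, 2)
print(f"{bl:08b} carry={carry}")   # prints: 10010100 carry=1
```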


Then there are two other opcodes:

 SAL destination, count ; Shift Arithmetic Left
 SAR destination, count ; Shift Arithmetic Right

SAL is the same as SHL, but SAR is not quite the same as SHR. SAR does not shift in zeroes but copies the MSB (most significant bit, i.e. the leftmost bit): if the MSB is 1, 1's are moved in from the left; if it is 0, 0's are moved in. Example:

 al = 10100110
 sar al, 3
 al = 11110100
 sar al, 2
 al = 11111101
 
 bl = 00100110
 sar bl, 3
 bl = 00000100
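If SAR is hard to picture, this Python sketch does it one bit at a time for an 8-bit value. The name `sar8` is invented for the example. Each step shifts right by one and puts the original top bit back in place, which is exactly the "copy the MSB" behaviour.

```python
def sar8(value: int, count: int) -> int:
    """Arithmetic shift right of an 8-bit value: the top bit is copied in."""
    sign = value & 0x80                 # the MSB, kept for every step
    for _ in range(count):
        value = (value >> 1) | sign     # shift right, refill the top bit
    return value

print(f"{sar8(0b10100110, 3):08b}")     # prints: 11110100
print(f"{sar8(0b11110100, 2):08b}")     # prints: 11111101
print(f"{sar8(0b00100110, 3):08b}")     # prints: 00000100
```

The top bit is the sign bit of a signed number, so SAR divides a signed value by a power of two and keeps its sign.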

This one you may have problems getting to grips with.

Rotation functions

 rol destination, count ; rotate left
 ror destination, count ; rotate right
 rcl destination, count ; rotate through carry left
 rcr destination, count ; rotate through carry right

Rotation looks like shifting, with the difference that the bits that are shifted out are shifted in again on the other side:

Example: ror (rotate right)


                  Bit 7 Bit 6 Bit 5 Bit 4 Bit 3 Bit 2 Bit 1 Bit 0
 Before             1     0     0     1     1     0     1     1
 Rotate count 3     (bits 2, 1 and 0 are shifted out on the right and come back in on the left)
 Result             0     1     1     1     0     0     1     1

As you can see in the figure above, the bits are rotated, i.e. every bit that is pushed out is shifted in again on the other side. Like shifting, the carry bit holds the last bit that's shifted out. RCL and RCR differ from ROL and ROR: they rotate through the carry bit, treating it as one extra bit of the value. The bit that is pushed out goes into the carry, and the old carry value is shifted in on the other side, so an 8-bit value is effectively rotated as 9 bits.
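On x86, ROR keeps the rotation inside the value itself, while RCR rotates through the carry flag, so the two give different results. Here is a minimal Python sketch of both for 8-bit values. The function names `ror8` and `rcr8` are made up for the example, and `ror8` assumes a count from 1 to 7.

```python
def ror8(value: int, count: int) -> tuple[int, int]:
    """Rotate right: bits leaving on the right re-enter on the left."""
    result = ((value >> count) | (value << (8 - count))) & 0xFF
    carry = (result >> 7) & 1           # last bit rotated out now sits in bit 7
    return result, carry

def rcr8(value: int, carry: int, count: int) -> tuple[int, int]:
    """Rotate right through carry: the carry acts as a ninth bit."""
    for _ in range(count):
        new_carry = value & 1                   # bit leaving on the right
        value = (value >> 1) | (carry << 7)     # old carry enters on the left
        carry = new_carry
    return value, carry

print(f"{ror8(0b10011011, 3)[0]:08b}")      # prints: 01110011
print(f"{rcr8(0b10011011, 0, 3)[0]:08b}")   # prints: 11010011
```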


Exchange

Quite straightforward this one, so I won't go into major detail. It just swaps the contents of two registers (values, addresses). For example..

 eax = 237h
 ecx = 978h
 xchg eax, ecx
 eax = 978h
 ecx = 237h

Anyway, that's the end of day 1. If you get this into your head, the following days will get easier rather than harder. These are the basics I've taught you. Learn them well.
