Difference between revisions of "Assembler"
Revision as of 02:33, 26 May 2021
Registers
16 bit
There are fourteen 16-bit registers. Four of them (AX, BX, CX, DX) are general-purpose registers (GPRs), although each may have an additional purpose; for example, only CX can be used as a counter with the loop instruction. Each can be accessed as two separate bytes (thus BX's high byte can be accessed as BH and its low byte as BL). Two pointer registers have special roles: SP (stack pointer) points to the "top" of the stack, and BP (base pointer) is often used to point at some other place in the stack, typically above the local variables (see frame pointer). The registers SI, DI, BX and BP are address registers, and may also be used for array indexing.[1]
Four segment registers (CS, DS, SS and ES) are used to form a memory address. The FLAGS register contains flags such as carry flag, overflow flag and zero flag. Finally, the instruction pointer (IP) points to the next instruction that will be fetched from memory and then executed; this register cannot be directly accessed (read or written) by a program.[1]
32 bit
With the advent of the 32-bit processor, the 16-bit general-purpose registers, base registers, index registers, instruction pointer, and FLAGS register, but not the segment registers, were expanded to 32 bits. The nomenclature represented this by prefixing an "E" (for "extended") to the register names in x86 assembly language. Thus, the AX register corresponds to the lowest 16 bits of the new 32-bit EAX register, SI corresponds to the lowest 16 bits of ESI, and so on. The general-purpose registers, base registers, and index registers can all be used as the base in addressing modes, and all of those registers except for the stack pointer can be used as the index in addressing modes.[1]
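The aliasing described above (AX is the low 16 bits of EAX, and AH/AL are the two bytes of AX) can be illustrated with a small Python model. This is only a sketch of the bit layout; the function name `sub_registers` is made up for this example.

```python
def sub_registers(eax):
    """Split a 32-bit EAX value into its aliased sub-registers."""
    eax &= 0xFFFFFFFF        # keep only the low 32 bits
    ax = eax & 0xFFFF        # AX: low 16 bits of EAX
    ah = (ax >> 8) & 0xFF    # AH: high byte of AX
    al = ax & 0xFF           # AL: low byte of AX
    return {"EAX": eax, "AX": ax, "AH": ah, "AL": al}


regs = sub_registers(0x12345678)
print({name: hex(value) for name, value in regs.items()})
```

Writing to AL or AH on a real processor changes only that byte of EAX; the model above only shows how the values overlap when reading.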
Two new segment registers (FS and GS) were added. With a greater number of registers, instructions and operands, the machine code format was expanded. To provide backward compatibility, segments with executable code can be marked as containing either 16-bit or 32-bit instructions. Special prefixes allow inclusion of 32-bit instructions in a 16-bit segment or vice versa.[1]
The 80387 floating-point coprocessor (x87 FPU) provides eight 80-bit wide registers: st(0) to st(7).[1]
With the Pentium III, Intel added eight 128-bit SSE floating point registers (XMM0 to XMM7).[1]
There are more 32-bit registers, but they are either rarely used or specific to a processor family.[1]
64 bit
Starting with the AMD Opteron processor, the x86 architecture extended the 32-bit registers into 64-bit registers in a way similar to how the 16 to 32-bit extension took place. An R-prefix identifies the 64-bit registers (RAX, RBX, RCX, RDX, RSI, RDI, RBP, RSP, RFLAGS, RIP), and eight additional 64-bit general registers (R8-R15) were also introduced in the creation of x86-64. However, these extensions are only usable in 64-bit mode, which is one of the two modes only available in long mode.[1]
128 bit
SIMD registers XMM0–XMM15.[1]
Purpose
Although the main registers (with the exception of the instruction pointer) are "general-purpose" in the 32-bit and 64-bit versions of the instruction set and can be used for anything, it was originally envisioned that they be used for the following purposes:
- AL/AH/AX/EAX/RAX: Accumulator
- BL/BH/BX/EBX/RBX: Base index (for use with arrays)
- CL/CH/CX/ECX/RCX: Counter (for use with loops and strings)
- DL/DH/DX/EDX/RDX: Extend the precision of the accumulator (e.g. combine 32-bit EAX and EDX for 64-bit integer operations in 32-bit code)
- SI/ESI/RSI: Source index for string operations.
- DI/EDI/RDI: Destination index for string operations.
- SP/ESP/RSP: Stack pointer for top address of the stack.
- BP/EBP/RBP: Stack base pointer for holding the address of the current stack frame.
- IP/EIP/RIP: Instruction pointer. Holds the program counter, the address of the next instruction to be executed.
Segment registers:
- CS: Code
- DS: Data
- SS: Stack
- ES: Extra data
- FS: Extra data #2
- GS: Extra data #3
No particular purposes were envisioned for the other 8 registers available only in 64-bit mode.[1]
Structure
| Register (? = A, B, C or D) | Bits | Width |
|---|---|---|
| R?X | 63-0 | 64 bits |
| E?X | 31-0 | 32 bits |
| ?X | 15-0 | 16 bits |
| ?H | 15-8 | 8 bits |
| ?L | 7-0 | 8 bits |
| Register (? = R8 to R15) | Bits | Width |
|---|---|---|
| ? | 63-0 | 64 bits |
| ?D | 31-0 | 32 bits |
| ?W | 15-0 | 16 bits |
| ?B | 7-0 | 8 bits |
| Register (? = C, D, S, E, F or G) | Bits | Width |
|---|---|---|
| ?S | 15-0 | 16 bits |
| Register (? = S or B) | Bits | Width |
|---|---|---|
| R?P | 63-0 | 64 bits |
| E?P | 31-0 | 32 bits |
| ?P | 15-0 | 16 bits |
| ?PL | 7-0 | 8 bits |
Note: The ?PL registers are only available in 64-bit mode.[1]
| Register (? = S or D) | Bits | Width |
|---|---|---|
| R?I | 63-0 | 64 bits |
| E?I | 31-0 | 32 bits |
| ?I | 15-0 | 16 bits |
| ?IL | 7-0 | 8 bits |
Note: The ?IL registers are only available in 64-bit mode.[1]
| Register | Bits | Width |
|---|---|---|
| RIP | 63-0 | 64 bits |
| EIP | 31-0 | 32 bits |
| IP | 15-0 | 16 bits |
Segments
Segment registers: CS, ES, DS, SS, FS, GS
- Bits 0 and 1 hold the RPL (requested privilege level).
- Bit 2 is the table indicator: 0 means the GDT is used, 1 means the LDT is used.
- Bits 3 to 15 hold the index of the descriptor in the GDT or LDT. Shifting the index left by 3 gives the byte offset into the table.
- Example:
- CS of 8 = 1000b = 1 0 00 : RPL=0, LDT=0 (so the GDT is used), index is 1, offset in the GDT is (1 << 3) = 8
- CS of 0x23 = 100011b = 100 0 11 : RPL=3, LDT=0 (GDT), index is 100b = 4, offset in the GDT is (4 << 3) = 32

Note that even in 64-bit mode, bits 3 to 15 still only need to be shifted left by 3 to get the proper offset.
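The selector layout above can be checked with a short Python sketch. The function name `decode_selector` and the dictionary keys are chosen for this example only; the two sample values are the ones used in the list above.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into its fields."""
    selector &= 0xFFFF
    rpl = selector & 0b11                 # bits 0-1: requested privilege level
    uses_ldt = bool((selector >> 2) & 1)  # bit 2: 0 = GDT, 1 = LDT
    index = selector >> 3                 # bits 3-15: descriptor index
    return {
        "rpl": rpl,
        "table": "LDT" if uses_ldt else "GDT",
        "index": index,
        "offset": index << 3,             # byte offset into the table
    }


for value in (0x08, 0x23):
    print(hex(value), decode_selector(value))
```

Because the three low bits are flag bits, the byte offset is simply the selector with those bits cleared (`selector & ~7`).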
GDT
The GDT is a table of descriptors that describe each segment and its rights (the access rights, the limits, whether it is data or code, etc.).
IDT
The IDT is a table of descriptors that describe what should happen when an interrupt occurs. Each entry contains the code segment to use and the EIP/RIP address to call, as well as information such as the DPL of the interrupt and whether it is an interrupt gate, trap gate or task gate.
Useful interrupts with regard to game hacking: interrupt 1 (single step), 3 (breakpoint), 13 (general protection fault) and 14 (page fault).
Flags
The FLAGS register is the status register that contains the current state of the processor. This register is 16 bits wide. Its successors, the EFLAGS and RFLAGS registers, are 32 bits and 64 bits wide. The wider registers retain compatibility with their smaller predecessors.[6]
Bits missing from this list are not a mistake; they are reserved.[5]
| Bit | Flag | Name | Description |
|---|---|---|---|
| 00 | CF | Carry Flag | Set if an arithmetic operation generates a carry out of, or a borrow into, the most significant bit of the result (unsigned overflow). Logical instructions such as AND and OR clear it. |
| 02 | PF | Parity Flag | Set if the low 8 bits of the result contain an even number of 1 bits. |
| 04 | AF | Auxiliary Flag | Set on a carry out of, or a borrow into, the low-order 4 bits of the result. |
| 06 | ZF | Zero Flag | Set if the result of an operation is 0. |
| 07 | SF | Sign Flag | Copy of the most significant bit of the result: 1 if the result is negative, 0 if it is positive or zero. |
| 08 | TF | Trap Flag | When set, the processor raises a debug exception (interrupt 1) after each instruction, which allows single stepping/debugging. |
| 09 | IF | Interrupt Flag | When this flag is set, the processor responds to maskable external interrupts. |
| 10 | DF | Direction Flag | Determines the direction in which string instructions (often used with repeat prefixes) move through memory: 0 increments the index registers, 1 decrements them. |
| 11 | OF | Overflow Flag | Set if the signed result of an operation does not fit in the destination (signed overflow). |
| 12-13 | IOPL | I/O Privilege Level | 2-bit field specifying which privilege level is required to access the I/O ports. |
| 14 | NT | Nested Task | Set when the current task is linked to a previously executed task (a nested task). |
| 16 | RF | Resume Flag | When set, instruction-breakpoint debug exceptions are temporarily suppressed so that execution can resume past a breakpoint. |
| 17 | VM | Virtual 8086 Mode | Set if the processor is running in virtual-8086 mode (16-bit). |
| 18 | AC | Alignment Check | Together with the AM bit in CR0, enables alignment checking of memory references made at privilege level 3. |
| 19 | VIF | Virtual Interrupt Flag | Virtual image of the IF flag, used together with VIP. |
| 20 | VIP | Virtual Interrupt Pending | Set if a virtual interrupt is pending. |
| 21 | ID | ID Flag | If a program is able to set and clear this flag, the processor supports the CPUID instruction. |
[1][5]
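As a rough illustration of the table above, the following Python sketch turns a flags value into the names of the flags that are set. The names `FLAG_BITS` and `decode_flags` and the sample value 0x246 are chosen for this example; the bit positions are the ones listed in the table.

```python
# Bit position -> flag name, taken from the table above (IOPL handled separately).
FLAG_BITS = {
    0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF", 8: "TF", 9: "IF",
    10: "DF", 11: "OF", 14: "NT", 16: "RF", 17: "VM", 18: "AC",
    19: "VIF", 20: "VIP", 21: "ID",
}


def decode_flags(value):
    """Return (list of set flag names, IOPL) for an (E)FLAGS value."""
    names = [name for bit, name in sorted(FLAG_BITS.items())
             if (value >> bit) & 1]
    iopl = (value >> 12) & 0b11   # bits 12-13 form a 2-bit field
    return names, iopl


print(decode_flags(0x246))   # (['PF', 'ZF', 'IF'], 0)
```

Bit 1 is also set in 0x246; it is one of the reserved bits and is therefore not reported.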
The POPF, POPFD, and POPFQ instructions pop 16, 32, and 64 bits, respectively, from the stack into the flags register. POPFD was introduced with the i386 architecture and POPFQ with the x64 architecture. In 64-bit mode, PUSHF/POPF and PUSHFQ/POPFQ are available but not PUSHFD/POPFD.[1]
Opcodes
In computing, an opcode (abbreviated from operation code) is the portion of a machine language instruction that specifies the operation to be performed. Beside the opcode itself, most instructions also specify the data they will process, in the form of operands.[8][9][10]
Most commonly used opcodes:
- PUSH operand
- PUSHes (saves) data onto the stack.
- POP operand
- POPs (loads) the value on top of the stack into the operand and removes it from the stack.
- PUSHF
- PUSHes (saves) the FLAGS register (16 bit) onto the stack.
- POPF
- POPs (restores) the FLAGS register (16 bit) from the stack.
- PUSHFD
- PUSHes (saves) the EFLAGS register (32 bit) onto the stack.
- Not available in 64 bit mode.
- POPFD
- POPs (restores) the EFLAGS register (32 bit) from the stack.
- Not available in 64 bit mode.
- PUSHFQ
- PUSHes (saves) the RFLAGS register (64 bit) onto the stack.
- POPFQ
- POPs (restores) the RFLAGS register (64 bit) from the stack.
- JMP operand
- Jumps to the given operand (address).
- CALL operand
- Calls the given operand (address or function).
- A CALL pushes the return address onto the stack, so the called code must execute a RET to return properly; it is best to use JMPs if unsure.
- RET operand
- Returns from a CALL, optionally removing the given number of bytes (the operand) from the stack.
- This is used to remove the values that were passed to the CALL from the stack.
- MOV destination, source
- Sets the destination to the source.
- LEA destination, source
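- Load Effective Address: calculates the address described by the source and stores that address (not the memory contents) in the destination.
- EBX = 0x00403A40
- lea eax, [ebx+8]
- EAX = 0x00403A48 (EBX is unchanged)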
- INC operand
- Increases the operand by one.
- operand = operand + 1
- DEC operand
- Decreases the operand by one.
- operand = operand - 1
- ADD destination, source
- Adds the source to the destination.
- destination = destination + source
- SUB destination, source
- Subtracts the source from the destination.
- destination = destination - source
- MUL operand
- Performs an unsigned multiplication of the accumulator register by the operand.
- The high part of the result is placed in the data register (AH for byte operands) and the low part in the accumulator register.
- AH:AL = AL * operand : byte
- DX:AX = AX * operand : WORD
- EDX:EAX = EAX * operand : DWORD
- RDX:RAX = RAX * operand : QWORD
- DIV operand
- Performs an unsigned division.
- Divides the value formed by the data register (high part) and the accumulator register (low part) by the operand.
- The quotient is placed in the accumulator register and the remainder in the data register.
- AL = quotient, AH = remainder of AH:AL / operand : byte
- AX = quotient, DX = remainder of DX:AX / operand : WORD
- EAX = quotient, EDX = remainder of EDX:EAX / operand : DWORD
- RAX = quotient, RDX = remainder of RDX:RAX / operand : QWORD
- NOP
- No Operation.
- Usually used when removing original code.
- OR destination, source
- The OR instruction performs a bitwise OR of the two operands and stores the result in the destination.
- Each resulting bit is 1 if the matching bit of either or both operands is 1.
- It is 0 if both bits are 0.
- destination = destination | source
- Example:
| Operand | Value |
|---|---|
| destination | 0101 |
| source | 0011 |
| destination after OR | 0111 |
- XOR destination, source
- The XOR instruction implements the bitwise XOR operation.
- The XOR operation sets the resultant bit to 1, if and only if the bits from the operands are different.
- If the bits from the operands are same (both 0 or both 1), the resultant bit is cleared to 0.
- destination = destination ^ source
- Example:
| Operand | Value |
|---|---|
| destination | 0101 |
| source | 0011 |
| destination after XOR | 0110 |
- AND destination, source
- The AND instruction performs a bitwise AND of the two operands and stores the result in the destination.
- Each resulting bit is 1 if the matching bits of both operands are 1; otherwise it is 0.
- destination = destination & source
- Example:
| Operand | Value |
|---|---|
| destination | 0101 |
| source | 0011 |
| destination after AND | 0001 |
- TEST destination, source
- The TEST instruction works the same as the AND instruction, but unlike AND it does not change the first operand; it only sets the flags.
- It is useful for quick tests of a register to test it for being negative or zero. For instance `test eax,eax` will set the Z flag if eax is zero so you can use `jz` to jump if zero or `jnz` to jump if not zero, and the sign bit so you can use `js` to jump if negative or `jns` to jump if positive or zero.
- NOT operand
- The NOT instruction implements the bitwise NOT operation.
- The NOT operation inverts every bit in the operand.
- The operand can be either a register or a memory location.
- operand = ~operand
- Example:
| Operand | Value |
|---|---|
| operand | 0101 0011 |
| operand after NOT | 1010 1100 |
- LOOP operand
- The LOOP instruction assumes that the (E)CX register contains the loop count.
- Each time the LOOP instruction is executed, the (E)CX register is decremented; if it is not zero, control jumps to the target label,
- otherwise (the counter has reached zero) execution continues with the next instruction.
- Used for loop control.
label(loop_start)
MOV ECX,10
loop_start:
// loop body
LOOP loop_start
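The behaviour of the example above can be modelled in Python to see how many times the loop body runs. This is only a sketch: `run_loop` is a hypothetical helper, and the body is represented by a Python callable.

```python
def run_loop(ecx, body):
    """Model `loop_start: <body>; LOOP loop_start` with a 32-bit ECX."""
    iterations = 0
    while True:
        body(iterations)                # the loop body always runs first
        iterations += 1
        ecx = (ecx - 1) & 0xFFFFFFFF    # LOOP decrements ECX...
        if ecx == 0:                    # ...and falls through once it is zero
            break
    return iterations


visited = []
count = run_loop(10, visited.append)
print(count, visited)
```

With ECX = 10 the body runs ten times. Because the decrement happens before the check, starting with ECX = 0 would wrap around and run the body 2^32 times, so the counter should be non-zero when the loop is entered.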
- CMP operand1, operand2
- Compares the first operand with the second operand by subtracting the second from the first; the result is discarded and only the status flags in the (E)FLAGS register are set according to it.
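The LEA, MUL, DIV and CMP entries in the list above can be sketched as a small Python model for 32-bit operands. The function names (`lea`, `mul32`, `div32`, `cmp32`) are made up for this example, and only the zero, carry and sign flags are modelled for CMP.

```python
MASK32 = 0xFFFFFFFF


def lea(base=0, index=0, scale=1, disp=0):
    """Effective address base + index*scale + disp (scale is 1, 2, 4 or 8 on x86)."""
    return (base + index * scale + disp) & MASK32


def mul32(eax, operand):
    """MUL operand: returns (EDX, EAX), the high and low halves of the product."""
    product = (eax & MASK32) * (operand & MASK32)
    return product >> 32, product & MASK32


def div32(edx, eax, operand):
    """DIV operand: returns (EAX, EDX), the quotient and the remainder."""
    operand &= MASK32
    if operand == 0:
        raise ZeroDivisionError("division by zero raises a divide error")
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)
    quotient, remainder = divmod(dividend, operand)
    if quotient > MASK32:
        raise OverflowError("quotient does not fit in EAX (divide error)")
    return quotient, remainder


def cmp32(a, b):
    """CMP a, b: flags of a - b; the difference itself is thrown away."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {"ZF": int(result == 0), "CF": int(a < b), "SF": result >> 31}


print(hex(lea(base=0x00403A40, disp=8)))   # lea eax, [ebx+8]
print(mul32(0xFFFFFFFF, 2))
print(div32(1, 0xFFFFFFFE, 2))
print(cmp32(3, 5))
```

The DIV model shows why EDX is normally cleared before dividing a 32-bit value: whatever is left in EDX becomes the high half of the dividend.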
See also
External links
- Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 2A: Instruction Set Reference, A-M
- Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 2B: Instruction Set Reference, N-Z
- wikibooks.org/wiki/X86_Assembly
- wikibooks.org/wiki/X86_Assembly/X86_Instructions
- wikipedia.org/wiki/X86_instruction_listings
- wikibooks.org/wiki/X86_Assembly/Other_Instructions
- c9x.me/x86/
- ref.x86asm.net
Sources
1. wikipedia.org/wiki/X86
2. www.cs.virginia.edu/~evans/cs216/guides/x86.html
3. 64-ia-32-architectures-software-developer-vol-1-manual.pdf
4. wikipedia.org/wiki/File:Table_of_x86_Registers_svg.svg
5. www.tech-recipes.com/rx/1239/assembly-flags/
6. wikipedia.org/wiki/FLAGS_register
7. wikipedia.org/wiki/Status_register
8. wikipedia.org/wiki/Opcode
9. wikipedia.org/wiki/X86_instruction_listings
10. wikibooks.org/wiki/X86_Assembly/Other_Instructions


