I decided to brush up on some basics before I get on with my life... I picked up random stuff and started reading from the middle.
I. First stop
1. AT & T syntax and Intel
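a. AT&T prefixes registers with % and immediates with $
movl $1, %eax
movl $0xff, %ebx
int $0x80
b. Intel uses no prefix for registers or immediates.. hex immediates get an h suffix and binary ones a b suffix (a hex literal that starts with a letter needs a leading 0, hence 0ffh)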
mov eax, 1
mov ebx, 0ffh
int 80h
Note: directions are opposite .. AT&T src dst .. Intel dst src
2. Memory operands
a. Intel
mov eax, [ebx]
mov eax, [ebx + 3]
b. AT&T
movl (%ebx), %eax
movl 3(%ebx), %eax
c. Intel, general form
instr foo, segreg:[base + index * scale + disp]
sub eax, [ebx + ecx * 4h - 20h]
d. AT&T, general form
instr %segreg:disp(base, index, scale), foo
subl -0x20(%ebx, %ecx, 4), %eax
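To convince myself that both notations name the same memory location, here is a minimal Python sketch of the base + index * scale + disp arithmetic. The function name and the register values are made up for illustration; only the formula comes from the notes above.

```python
def effective_address(base, index=0, scale=1, disp=0, bits=32):
    """Return base + index * scale + disp, wrapped to the register width."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 only encodes scale factors 1, 2, 4 and 8")
    return (base + index * scale + disp) & ((1 << bits) - 1)


# sub eax, [ebx + ecx * 4h - 20h] with ebx = 0x1000 and ecx = 3 (made-up values)
addr = effective_address(base=0x1000, index=3, scale=4, disp=-0x20)
print(hex(addr))  # 0x1000 + 12 - 32 = 0xfec
```

The wrap-around mask is there because the address is computed in a 32 bit register, so a negative displacement from a small base wraps instead of going negative.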
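3. In AT&T the mnemonic suffix indicates the length of the operands; Intel infers it from the register names or from a size keyword like dword ptr
l - long (32 bits), w - word (16 bits), b - byte (8 bits)
a. Intel
mov al, bl
mov ax, bx
mov eax, dword ptr [ebx]
b. AT&T
movb %bl, %al
movw %bx, %ax
movl (%ebx), %eax
I didn't understand something here... free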
II. Next stop
1. 6897 - in binary, grouped into nibbles (4 bits each) - 1 1010 1111 0001
6897 - in hex - 1 a f 1
(Group from the right.. This is why hex is important... each hex digit stands for exactly one nibble, so hex compresses bits and converts directly to and from binary)
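The grouping-from-the-right trick is easy to check in a few lines of Python. This is only a sketch; to_nibbles is a helper name I made up.

```python
def to_nibbles(n):
    """Split a non-negative integer into 4-bit groups, most significant first."""
    bits = bin(n)[2:]
    # Pad on the left so the length is a multiple of 4 (group from the right).
    bits = bits.zfill((len(bits) + 3) // 4 * 4)
    return [bits[i:i + 4] for i in range(0, len(bits), 4)]


nibbles = to_nibbles(6897)
print(nibbles)                                            # ['0001', '1010', '1111', '0001']
print("".join(format(int(n, 2), "x") for n in nibbles))   # 1af1
print(hex(6897))                                          # 0x1af1
```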
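2. ASCII - Maps simple characters to numbers, like A - 65, B - 66
Uses 1 byte per character (only 7 bits are actually needed)
To accommodate more characters (many other languages) Unicode is used
Early Unicode (UCS-2) used 2 bytes per character; today the size depends on the encoding: UTF-8 takes 1 to 4 bytes, UTF-16 takes 2 or 4
(That's about all there is to Unicode.. and the word sounded like jargon to me all these days.. which rock was I living under)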
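A quick way to see the character-to-number mapping and how many bytes each encoding really takes. This is a small Python sketch; encoded_sizes is a made-up helper, and the sample characters are my own picks.

```python
def encoded_sizes(ch):
    """Return (code point, bytes in UTF-8, bytes in UTF-16) for one character."""
    return ord(ch), len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


# A, B, e-acute, the euro sign and an emoji outside the 16-bit range
for ch in ("A", "B", "\u00e9", "\u20ac", "\U0001F600"):
    print(repr(ch), encoded_sizes(ch))
```

Plain ASCII characters keep their ASCII numbers and fit in a single UTF-8 byte, while characters beyond U+FFFF need 4 bytes in both UTF-8 and UTF-16.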
3. Compilers convert high-level language programs into machine instructions for a particular architecture (duh, 4 years of education.. you should know it by now...)
4. I vaguely remember the joy of typing instructions in hex on an 8085 / 8086.. and getting the result. The world was a lot simpler, at least in college
8088 / 8086
Must read - http://www.cpu-world.com/Arch/8088.html
16 bit registers - Hence segments of size 64 KB possible
Hence, to access more than 64 KB (up to 1 MB), there are special segment registers: CS - Code, DS - Data, SS - Stack, ES - Extra
1. 16 bit values are stored low byte first (little-endian)
(address) - Lower order byte
(address + 1) - Higher order byte
2. 32 bit segment:offset (far) pointers are stored offset first, then segment
(address) - offset lower order
(address + 1) - offset higher order
(address + 2) - segment lower order
(address + 3) - segment higher order
Physical address = (segment * 16) + offset
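Here is a rough Python sketch of the byte layout and the segment * 16 + offset arithmetic. The helper names and sample values are made up. It assumes the offset-first layout the 8086 uses for far pointers (interrupt vectors, for example), and a 20 bit address bus, so the result wraps at 1 MB.

```python
import struct


def physical_address(segment, offset):
    """8086 real-mode address: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


def store_word(value):
    """Bytes of a 16-bit value as they sit in memory (low byte first)."""
    return struct.pack("<H", value)


def store_far_pointer(segment, offset):
    """Bytes of a segment:offset pointer: offset word first, then segment word."""
    return struct.pack("<HH", offset, segment)


print(store_word(0x1234).hex())                  # 3412
print(store_far_pointer(0xF000, 0xFFF0).hex())   # f0ff00f0
print(hex(physical_address(0x1234, 0x0010)))     # 0x12350
```

Note that many different segment:offset pairs land on the same physical address, for example 1234:0010 and 1235:0000.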
Code segment -
64 KB segment. A jump can be a near jump (within the segment) or a far jump (to another segment)
Reserved locations
00000h - 003ffh (Interrupt vectors - 256 entries in 32 bit segment:offset format)
ffff0h - fffffh (reset starting point)
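To make the reserved locations concrete, here is a hedged Python sketch of looking up an interrupt vector. Each of the 256 vectors takes 4 bytes, so vector n sits at physical address n * 4, stored offset word first and then segment word. The handler address C000:1234 is invented for the example, and the helper names are mine.

```python
def vector_address(n):
    """Physical address of interrupt vector n in the table at 00000h."""
    if not 0 <= n <= 0xFF:
        raise ValueError("interrupt numbers are 0..255")
    return n * 4


def read_vector(memory, n):
    """Return (segment, offset) stored in the table for interrupt n."""
    base = vector_address(n)
    offset = int.from_bytes(memory[base:base + 2], "little")
    segment = int.from_bytes(memory[base + 2:base + 4], "little")
    return segment, offset


memory = bytearray(0x400)                      # the 1 KB vector table
base = vector_address(0x21)
memory[base:base + 4] = bytes([0x34, 0x12, 0x00, 0xC0])   # made-up handler C000:1234

print(hex(base))                               # 0x84
segment, offset = read_vector(memory, 0x21)
print(f"{segment:04x}:{offset:04x}")           # c000:1234

# After reset the 8086 starts at CS = FFFFh, IP = 0000h
print(hex((0xFFFF << 4) + 0x0000))             # 0xffff0
```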
Code segment: holds the code
Every instruction fetch uses the IP (instruction pointer) and reads from CS * 16 + IP
Stack segment:
All accesses through the base pointer (BP) and stack pointer (SP) are offsets into the segment pointed to by the SS register.
Data segment:
Memory accesses through BX and the index registers (SI, DI) are, by default, offsets into the segment pointed to by the DS register (AX, CX and DX cannot address memory in 16 bit mode)
Extra segment:
DI by default is used with the data segment. String instructions (MOVS, STOS, CMPS, SCAS) are the exception: they address their DI operand through ES (ES:DI), while the SI operand uses DS (DS:SI)
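The extra segment made more sense to me once I simulated a string copy. On the 8086, REP MOVSB copies CX bytes from DS:SI to ES:DI and steps SI and DI up or down depending on the direction flag. The Python below is only a toy model of that behaviour; the function names, segment values and the flat bytearray standing in for memory are all my own assumptions.

```python
def physical(segment, offset):
    """Real-mode address: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


def rep_movsb(memory, ds, si, es, di, cx, df=0):
    """Toy model of REP MOVSB: copy cx bytes from DS:SI to ES:DI."""
    step = -1 if df else 1            # DF = 1 walks downwards through memory
    while cx:
        memory[physical(es, di)] = memory[physical(ds, si)]
        si = (si + step) & 0xFFFF     # SI and DI are 16-bit registers
        di = (di + step) & 0xFFFF
        cx -= 1
    return si, di, cx


memory = bytearray(1 << 20)           # 1 MB of address space
src = physical(0x1000, 0x0000)
memory[src:src + 5] = b"hello"

si, di, cx = rep_movsb(memory, ds=0x1000, si=0, es=0x2000, di=0x10, cx=5)
print(bytes(memory[0x20010:0x20015]))  # b'hello'
print(si, di, cx)                      # 5 21 0
```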
AX (Accumulator)
BX (Base register)
CX (Counter)
DX (Data)
SP (Stack pointer)
BP (Base pointer)
SI (Source Index)
DI (Destination index)
IP (Instruction pointer)
OF - Overflow
DF - Direction flag (For string manipulation)
IF - Interrupt enable flag
TF - Trap (single-step) flag
SF - Sign flag
ZF - Zero flag
AF - Auxiliary carry flag
PF - Parity flag
CF - Carry flag
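To see what the arithmetic flags actually record, here is a small Python sketch of an 8 bit ADD that sets CF, PF, AF, ZF, SF and OF. The bit positions are the standard ones in the 8086 FLAGS register; the function names are made up, and this models only ADD, not every instruction.

```python
FLAG_BITS = {"CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7,
             "TF": 8, "IF": 9, "DF": 10, "OF": 11}


def add8(a, b):
    """Add two 8-bit values; return (result, FLAGS word with arithmetic flags)."""
    total = a + b
    result = total & 0xFF
    flags = 0
    if total > 0xFF:                                  # carry out of bit 7
        flags |= 1 << FLAG_BITS["CF"]
    if bin(result).count("1") % 2 == 0:               # even number of 1 bits
        flags |= 1 << FLAG_BITS["PF"]
    if (a & 0xF) + (b & 0xF) > 0xF:                   # carry out of bit 3
        flags |= 1 << FLAG_BITS["AF"]
    if result == 0:
        flags |= 1 << FLAG_BITS["ZF"]
    if result & 0x80:                                 # top bit = sign
        flags |= 1 << FLAG_BITS["SF"]
    if (a ^ result) & (b ^ result) & 0x80:            # signed overflow
        flags |= 1 << FLAG_BITS["OF"]
    return result, flags


def flag_names(flags):
    """List the names of the flags that are set, lowest bit first."""
    return [name for name, bit in FLAG_BITS.items() if (flags >> bit) & 1]


print(flag_names(add8(0x7F, 0x01)[1]))   # ['AF', 'SF', 'OF']
print(flag_names(add8(0xFF, 0x01)[1]))   # ['CF', 'PF', 'AF', 'ZF']
```

The first case overflows as a signed add (127 + 1) without a carry; the second carries out as an unsigned add (255 + 1) without a signed overflow.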
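8086 / 8088 - 16 bit + segments
80286 - Protected mode
80386 - Extends registers to 32 bits and adds 32 bit protected mode - Hence 4GB segments possible.. Bulb.. hence the 4GB process address space on 32 bit Linux..
Pentium MMX - MMX (the 80486 and the first Pentiums did not have it)
AX - 16 bits, EAX - 32 bits (AX is the lower half of EAX)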
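The way AX, AH and AL overlap with EAX is easiest to see with masks and shifts. A minimal Python sketch, with a made-up class name, that models only the one register:

```python
class Eax:
    """Toy model of EAX with its AX, AH and AL views."""

    def __init__(self, value=0):
        self.eax = value & 0xFFFFFFFF

    @property
    def ax(self):                      # low 16 bits of EAX
        return self.eax & 0xFFFF

    @ax.setter
    def ax(self, value):
        self.eax = (self.eax & 0xFFFF0000) | (value & 0xFFFF)

    @property
    def al(self):                      # low 8 bits of AX
        return self.eax & 0xFF

    @al.setter
    def al(self, value):
        self.eax = (self.eax & 0xFFFFFF00) | (value & 0xFF)

    @property
    def ah(self):                      # high 8 bits of AX
        return (self.eax >> 8) & 0xFF

    @ah.setter
    def ah(self, value):
        self.eax = (self.eax & 0xFFFF00FF) | ((value & 0xFF) << 8)


reg = Eax(0x12345678)
print(hex(reg.ax), hex(reg.ah), hex(reg.al))   # 0x5678 0x56 0x78
reg.al = 0xFF                                  # like movb $0xff, %al
print(hex(reg.eax))                            # 0x123456ff
reg.ax = 0xBEEF                                # like movw $0xbeef, %ax
print(hex(reg.eax))                            # 0x1234beef
```

Writing the 8 or 16 bit view leaves the upper bits of EAX untouched, which is why the operand size matters.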