Machine-Level Programming I: Basics
15-213/18-213: Introduction to Computer Systems
5th Lecture, Jan 28, 2014
Instructors:
Seth Copen Goldstein, Anthony Rowe, Greg Kesden
Carnegie Mellon
Today: Machine Programming I: Basics
Evolutionary design
Backwards compatible all the way back to the 8086, introduced in 1978
Added more features as time goes on
Name         Date   Transistors   MHz
8086         1978                 5-10
386          1985   275K          16-33
Pentium 4F   2004   125M          2800-3800
Core 2       2006   291M          1060-3500
Core i7      2008   731M          1700-
[Figure: Moore's Law, plotting "goodness" against time]
Compare to 1983
Machine Evolution
Name          Date   Transistors
386           1985   0.3M
Pentium       1993   3.1M
Pentium/MMX   1997   4.5M
PentiumPro    1995   6.5M
Pentium III   1999   8.2M
Pentium 4     2001   42M
Core 2 Duo    2006   291M
Core i7       2008   731M
Added Features
Instructions to support multimedia operations
Instructions to enable more efficient conditional
operations
Transition from 32 bits to 64 bits
More cores
Historically
AMD has followed just behind Intel
A little bit slower, a lot cheaper
Then
Recruited top circuit designers from Digital Equipment
Corp. and other downward trending companies
Built Opteron: tough competitor to Pentium 4
Developed x86-64, their own extension to 64 bits
Intel's 64-Bit
Our Coverage
IA32
The traditional x86
shark> gcc -m32 hello.c
x86-64
The emerging standard
shark> gcc hello.c
shark> gcc -m64 hello.c
Presentation
Definitions
Assembly Programmer's View
[Figure: CPU (PC, Registers, Condition Codes) connected to Memory (Code, Data, Stack) by Addresses, Data, and Instructions]
Programmer-Visible State
PC: Program counter
Address of next instruction
Called EIP (IA32) or RIP (x86-64)
Register file
Condition codes
Memory
Byte addressable array
Code and user data
Stack to support procedures
text: C program (p1.c p2.c)
Compiler (gcc -S)
text: Asm program (p1.s p2.s)
Assembler (gcc or as)
binary: Object program (p1.o p2.o), Static libraries (.a)
Linker (gcc or ld)
binary: Executable program (p)
Generated IA32
Assembly
sum:
pushl %ebp
movl %esp,%ebp
movl 12(%ebp),%eax
addl 8(%ebp),%eax
popl %ebp
ret
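The assembly above loads the word at 12(%ebp), adds the word at 8(%ebp), and returns with the result in %eax. As a minimal sketch, C source of roughly the following shape could compile to such code. The signature and parameter names are assumptions, chosen to agree with the `int t = x+y;` statement shown later in the lecture.

```c
/* Hypothetical C source for sum; the signature is an assumption. */
int sum(int x, int y)
{
    int t = x + y;   /* on IA32: x sits at 8(%ebp), y at 12(%ebp) */
    return t;        /* the return value is left in %eax */
}
```

A call such as `sum(1, 2)` would then return 3.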
Assembly Characteristics:
Operations
Transfer control
Unconditional jumps to/from procedures
Conditional branches
Object Code
Code for sum
0x401040 <sum>:
0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x5d 0xc3
Total of 11 bytes
Each instruction 1, 2, or 3 bytes
Starts at address 0x401040
Assembler
Translates .s into .o
Binary encoding of each instruction
Nearly-complete image of executable code
Missing linkages between code in different files
Linker
Resolves references between files
Combines with static run-time libraries
E.g., code for malloc, printf
Some libraries are dynamically linked
Linking occurs when program begins execution
Machine Instruction Example
C Code
int t = x+y;
Assembly
addl 8(%ebp),%eax
Similar to expression: x += y
More precisely:
int eax;
int *ebp;
eax += ebp[2]
Object Code
0x80483ca: 03 45 08
3-byte instruction
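The "more precisely" reading of addl 8(%ebp),%eax can be tried out directly in C. The sketch below uses a small array to stand in for the stack frame. The helper names and the placeholder values in the first two slots are illustrative assumptions, and it assumes 4-byte ints so that byte offset 8 corresponds to index 2.

```c
/* Models "addl 8(%ebp),%eax" with a C array standing in for the stack frame.
   The helper names and frame contents are illustrative assumptions. */
int add_from_frame(int eax, const int *ebp)
{
    /* 8(%ebp) is byte offset 8 from ebp; with 4-byte ints that is ebp[2]. */
    eax += ebp[2];
    return eax;
}

int demo_sum(int x, int y)
{
    /* frame[0] = saved %ebp and frame[1] = return address (placeholders),
       frame[2] = x at 8(%ebp), frame[3] = y at 12(%ebp). */
    int frame[4] = { 0, 0, x, y };
    int eax = frame[3];                  /* movl 12(%ebp),%eax */
    return add_from_frame(eax, frame);   /* addl 8(%ebp),%eax  */
}
```

With these definitions, `demo_sum(3, 4)` yields 7, the same value `x + y` would give.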
Disassembling Object Code
Disassembled
080483c4 <sum>:
80483c4: 55         push %ebp
80483c5: 89 e5      mov %esp,%ebp
80483c7: 8b 45 0c   mov 0xc(%ebp),%eax
80483ca: 03 45 08   add 0x8(%ebp),%eax
80483cd: 5d         pop %ebp
80483ce: c3         ret
Disassembler
objdump -d p
Useful tool for examining object code
Analyzes bit pattern of series of instructions
Produces approximate rendition of assembly code
Can be run on either a.out (complete executable) or .o file
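The addresses in the disassembly follow from the instruction lengths: each instruction starts where the previous one ends. As a rough sketch, the helper below recomputes those start addresses from the byte counts in the listing. The function and array names are hypothetical, not part of any tool.

```c
#include <stddef.h>

/* Byte lengths of the six instructions in sum, read off the listing. */
const unsigned char sum_lengths[6] = { 1, 2, 3, 3, 1, 1 };

/* Writes the start address of each instruction into out[0..n-1]
   and returns the address just past the last instruction. */
unsigned long instruction_starts(unsigned long base,
                                 const unsigned char *lengths,
                                 size_t n, unsigned long *out)
{
    unsigned long addr = base;
    for (size_t i = 0; i < n; i++) {
        out[i] = addr;         /* this instruction begins here */
        addr += lengths[i];    /* the next one begins right after it */
    }
    return addr;
}
```

Called with base 0x80483c4, this reproduces the six addresses in the listing, and the returned end address minus the base gives the 11-byte total.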
Alternate Disassembly
Disassembled
Object
0x401040:
0x55 0x89 0xe5 0x8b 0x45 0x0c 0x03 0x45 0x08 0x5d 0xc3
What Can be Disassembled?
% objdump -d WINWORD.EXE
WINWORD.EXE:
No symbols in "WINWORD.EXE".
Disassembly of section .text:
30001000 <.text>:
30001000: 55               push %ebp
30001001: 8b ec            mov %esp,%ebp
30001003: 6a ff            push $0xffffffff
30001005: 68 90 10 00 30   push $0x30001090
3000100a: 68 91 dc 4c 30   push $0x304cdc91
general purpose
32-bit   16-bit   8-bit high   8-bit low   Origin (mostly obsolete)
%eax     %ax      %ah          %al         accumulate
%ecx     %cx      %ch          %cl         counter
%edx     %dx      %dh          %dl         data
%ebx     %bx      %bh          %bl         base
%esi     %si                               source index
%edi     %di                               destination index
%esp     %sp                               stack pointer
%ebp     %bp                               base pointer
16-bit virtual registers (backwards compatibility)
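The 16-bit and 8-bit names refer to parts of the same 32-bit register: %ax is the low 16 bits of %eax, %al the low byte, and %ah the second-lowest byte. A minimal sketch of that relationship using masks and shifts (the function names are illustrative only):

```c
#include <stdint.h>

/* Views of a 32-bit register value; function names are illustrative. */
uint16_t view_ax(uint32_t eax) { return (uint16_t)(eax & 0xFFFFu); }      /* %ax: bits 0-15 */
uint8_t  view_ah(uint32_t eax) { return (uint8_t)((eax >> 8) & 0xFFu); }  /* %ah: bits 8-15 */
uint8_t  view_al(uint32_t eax) { return (uint8_t)(eax & 0xFFu); }         /* %al: bits 0-7  */
```

For a register holding 0x12345678, these views give 0x5678, 0x56, and 0x78 respectively.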
Moving Data
movl Source, Dest:
[Figure: the 8 integer registers %eax, %ecx, %edx, %ebx, %esi, %edi, %esp, %ebp]
Operand Types
Immediate: Constant integer data
Example: $0x400, $-533
Like C constant, but prefixed with $
Encoded with 1, 2, or 4 bytes
Register: One of 8 integer registers
Example: %eax, %edx
But %esp and %ebp reserved for special use
Others have special uses for particular instructions
Memory: 4 consecutive bytes of memory at address given by register
Simplest example: (%eax)
Various other address modes
movl Operand Combinations
Source   Dest   Src,Dest               C Analog
Imm      Reg    movl $0x4,%eax         temp = 0x4;
Imm      Mem    movl $-147,(%eax)      *p = -147;
Reg      Reg    movl %eax,%edx         temp2 = temp1;
Reg      Mem    movl %eax,(%edx)       *p = temp;
Mem      Reg    movl (%eax),%edx       temp = *p;
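The C analogs in the table can be collected into one small function. This is only a sketch: whether a given variable actually lives in a register is the compiler's decision, so the comments describe the intended correspondence rather than guaranteed output. The function name is hypothetical.

```c
/* Each statement is the C analog of one movl operand combination.
   The register/memory placement in the comments is the intended reading,
   not something the C language guarantees. */
int movl_combinations(int *p)
{
    int temp, temp1, temp2;

    temp = 0x4;        /* Imm -> Reg */
    *p = -147;         /* Imm -> Mem */
    temp1 = temp;      /* Reg -> Reg */
    temp2 = temp1;     /* Reg -> Reg */
    *p = temp2;        /* Reg -> Mem */
    temp = *p;         /* Mem -> Reg */
    return temp;
}
```

Note that movl cannot take two memory operands, so a statement like `*q = *p` takes a load into a register followed by a store.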
Simple Memory Addressing Modes
Normal (R) Mem[Reg[R]]
Register R specifies memory address
Aha! Pointer dereferencing in C
movl (%ecx),%eax
Displacement D(R) Mem[Reg[R]+D]
Register R specifies start of memory region
Constant displacement D specifies offset
D is an arbitrary integer constrained to fit in 1-4 bytes
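Both modes amount to "compute an address, then read 4 bytes there". The following sketch models memory as a byte array and treats the register value as an index into it. The function name and the array-as-memory model are assumptions made for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Reads the 4-byte value at mem[reg + disp], i.e. Mem[Reg[R]+D].
   mem is a byte array standing in for memory; the caller must keep
   reg + disp through reg + disp + 3 inside the array (an assumption
   of this sketch). Passing disp = 0 gives the Normal mode (R). */
int32_t load_disp(const unsigned char *mem, uint32_t reg, int32_t disp)
{
    int32_t value;
    memcpy(&value, mem + reg + disp, sizeof value);
    return value;
}
```

With 123 stored at offset 4 and 456 at offset 12, `load_disp(mem, 4, 0)` reads 123 and `load_disp(mem, 4, 8)` reads 456.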
swap:
pushl %ebp             # Set Up
movl %esp,%ebp
pushl %ebx
movl 8(%ebp), %edx     # Body
movl 12(%ebp), %ecx
movl (%edx), %ebx
movl (%ecx), %eax
movl %eax, (%edx)
movl %ebx, (%ecx)
popl %ebx              # Finish
popl %ebp
ret
Understanding Swap
void swap(int *xp, int *yp)
{
  int t0 = *xp;
  int t1 = *yp;
  *xp = t1;
  *yp = t0;
}
Register   Value
%edx       xp
%ecx       yp
%ebx       t0
%eax       t1
Stack (in memory)
Offset   Contents
12       yp
8        xp
4        Rtn adr
0        Old %ebp   <- %ebp
-4       Old %ebx   <- %esp
movl 8(%ebp), %edx     # edx = xp
movl 12(%ebp), %ecx    # ecx = yp
movl (%edx), %ebx      # ebx = *xp (t0)
movl (%ecx), %eax      # eax = *yp (t1)
movl %eax, (%edx)      # *xp = t1
movl %ebx, (%ecx)      # *yp = t0
Understanding Swap
State before the body executes
Memory: 0x124: 123, 0x120: 456
Stack: 0x110 (offset 12): yp = 0x120; 0x10c (offset 8): xp = 0x124
Registers: %ebp = 0x104
Understanding Swap
After movl 8(%ebp), %edx     # edx = xp
Memory: 0x124: 123, 0x120: 456
Stack: 0x110 (offset 12): yp = 0x120; 0x10c (offset 8): xp = 0x124
Registers: %ebp = 0x104, %edx = 0x124
Understanding Swap
After movl 12(%ebp), %ecx    # ecx = yp
Memory: 0x124: 123, 0x120: 456
Stack: 0x110 (offset 12): yp = 0x120; 0x10c (offset 8): xp = 0x124
Registers: %ebp = 0x104, %edx = 0x124, %ecx = 0x120
Understanding Swap
After movl (%edx), %ebx      # ebx = *xp (t0)
Memory: 0x124: 123, 0x120: 456
Stack: 0x110 (offset 12): yp = 0x120; 0x10c (offset 8): xp = 0x124
Registers: %ebp = 0x104, %edx = 0x124, %ecx = 0x120, %ebx = 123
Understanding Swap
After movl (%ecx), %eax      # eax = *yp (t1)
Memory: 0x124: 123, 0x120: 456
Stack: 0x110 (offset 12): yp = 0x120; 0x10c (offset 8): xp = 0x124
Registers: %ebp = 0x104, %edx = 0x124, %ecx = 0x120, %ebx = 123, %eax = 456
Understanding Swap
After movl %eax, (%edx)      # *xp = t1
Memory: 0x124: 456, 0x120: 456
Stack: 0x110 (offset 12): yp = 0x120; 0x10c (offset 8): xp = 0x124
Registers: %ebp = 0x104, %edx = 0x124, %ecx = 0x120, %ebx = 123, %eax = 456
Understanding Swap
After movl %ebx, (%ecx)      # *yp = t0
Memory: 0x124: 456, 0x120: 123
Stack: 0x110 (offset 12): yp = 0x120; 0x10c (offset 8): xp = 0x124
Registers: %ebp = 0x104, %edx = 0x124, %ecx = 0x120, %ebx = 123, %eax = 456
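The step-by-step walk through swap can be replayed in C as a rough simulation. The sketch below uses the addresses and values from the example (123 at 0x124, 456 at 0x120, %ebp = 0x104) and models memory as a small array of 4-byte words. The helper names `load`, `store`, and `replay_swap` are hypothetical.

```c
#include <stdint.h>

#define BASE  0x100u
#define WORDS 10            /* covers addresses 0x100 through 0x124 */

static uint32_t mem[WORDS];

static uint32_t load(uint32_t addr)              { return mem[(addr - BASE) / 4]; }
static void     store(uint32_t addr, uint32_t v) { mem[(addr - BASE) / 4] = v; }

/* Replays the six body instructions of swap on the example state.
   Returns 1 if the two memory words ended up exchanged, else 0. */
int replay_swap(void)
{
    uint32_t ebp = 0x104, eax, ebx, ecx, edx;

    store(0x124, 123);      /* *xp */
    store(0x120, 456);      /* *yp */
    store(0x110, 0x120);    /* yp argument at 12(%ebp) */
    store(0x10c, 0x124);    /* xp argument at 8(%ebp)  */

    edx = load(ebp + 8);    /* movl 8(%ebp), %edx    edx = xp       */
    ecx = load(ebp + 12);   /* movl 12(%ebp), %ecx   ecx = yp       */
    ebx = load(edx);        /* movl (%edx), %ebx     ebx = *xp (t0) */
    eax = load(ecx);        /* movl (%ecx), %eax     eax = *yp (t1) */
    store(edx, eax);        /* movl %eax, (%edx)     *xp = t1       */
    store(ecx, ebx);        /* movl %ebx, (%ecx)     *yp = t0       */

    return load(0x124) == 456 && load(0x120) == 123;
}
```

After `replay_swap()` runs, address 0x124 holds 456 and address 0x120 holds 123, matching the final state of the walk-through.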
Special Cases
(Rb,Ri) Mem[Reg[Rb]+Reg[Ri]]
D(Rb,Ri) Mem[Reg[Rb]+Reg[Ri]+D]
(Rb,Ri,S) Mem[Reg[Rb]+S*Reg[Ri]]
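All three special cases compute an address from a base register, an index register, an optional scale, and an optional displacement. As a sketch, assuming the general form D(Rb,Ri,S) = Reg[Rb] + S*Reg[Ri] + D, the address arithmetic looks like this. The function name and the register values in the usage note are illustrative.

```c
#include <stdint.h>

/* Effective address for the assumed general form D(Rb,Ri,S):
   Reg[Rb] + S*Reg[Ri] + D.
   Use d = 0 and/or s = 1 to get the special cases listed above.
   On x86 the scale S is limited to 1, 2, 4, or 8; this sketch does not check it. */
uint32_t effective_address(uint32_t rb, uint32_t ri, uint32_t s, int32_t d)
{
    return rb + s * ri + (uint32_t)d;
}
```

For example, with Reg[Rb] = 0xf000 and Reg[Ri] = 0x100, (Rb,Ri) gives 0xf100, 8(Rb,Ri) gives 0xf108, and (Rb,Ri,4) gives 0xf400.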
C Data Type   Size (bytes)
unsigned      4
int           4
long int      8
char          1
short         2
float         4
double        8
8
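The sizes in the table can be checked on any machine with `sizeof`. The sketch below collects them into a struct. The struct and function names are hypothetical, and the values it reports depend on the compiler and target (for instance, long int is 8 bytes on typical 64-bit Linux builds but 4 bytes when compiling with -m32).

```c
#include <stddef.h>

/* Sizes in bytes as seen by the compiler building this file. */
struct type_sizes {
    size_t s_unsigned, s_int, s_long, s_char, s_short, s_float, s_double;
};

struct type_sizes measure_sizes(void)
{
    struct type_sizes s;
    s.s_unsigned = sizeof(unsigned);
    s.s_int      = sizeof(int);
    s.s_long     = sizeof(long int);
    s.s_char     = sizeof(char);
    s.s_short    = sizeof(short);
    s.s_float    = sizeof(float);
    s.s_double   = sizeof(double);
    return s;
}
```

Comparing `measure_sizes().s_long` between a -m32 build and a default 64-bit build is one way to see the difference in long int.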
64-bit   Low 32 bits   64-bit   Low 32 bits
%rax     %eax          %r8      %r8d
%rbx     %ebx          %r9      %r9d
%rcx     %ecx          %r10     %r10d
%rdx     %edx          %r11     %r11d
%rsi     %esi          %r12     %r12d
%rdi     %edi          %r13     %r13d
%rsp     %esp          %r14     %r14d
%rbp     %ebp          %r15     %r15d
Instructions
New instructions:
movl -> movq
addl -> addq
sall -> salq
etc.
swap:
pushl %ebp             # Set Up
movl %esp,%ebp
pushl %ebx
movl 8(%ebp), %edx     # Body
movl 12(%ebp), %ecx
movl (%edx), %ebx
movl (%ecx), %eax
movl %eax, (%edx)
movl %ebx, (%ecx)
popl %ebx              # Finish
popl %ebp
ret
swap:
# Set Up (empty)
movl (%rdi), %edx      # Body
movl (%rsi), %eax
movl %eax, (%rdi)
movl %edx, (%rsi)
ret                    # Finish
# Set Up (empty)
movq (%rdi), %rdx      # Body
movq (%rsi), %rax
movq %rax, (%rdi)
movq %rdx, (%rsi)
ret                    # Finish
64-bit data
Data held in registers %rax and %rdx
movq operation
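The two x86-64 listings differ only in operand size, which corresponds to the element type in the C source. A minimal sketch of the two C functions follows. `swap` matches the definition shown earlier in the lecture. `swap_l` is a hypothetical name for the 64-bit-data variant, and it assumes long is 8 bytes on the target.

```c
/* 32-bit data: on x86-64 the body uses movl with %eax and %edx. */
void swap(int *xp, int *yp)
{
    int t0 = *xp;
    int t1 = *yp;
    *xp = t1;
    *yp = t0;
}

/* 64-bit data: the body uses movq with %rax and %rdx.
   The name swap_l is hypothetical; assumes long is 8 bytes. */
void swap_l(long *xp, long *yp)
{
    long t0 = *xp;
    long t1 = *yp;
    *xp = t1;
    *yp = t0;
}
```

In both cases the pointers themselves are 64-bit and are passed in %rdi and %rsi, as the listings show. Only the width of the values moved changes.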
Machine Programming I: Summary
Intro to x86-64
A major departure from the style of code seen in IA32