Motivation
Start from a simple processor core
Find new macro instructions to enhance performance and reduce code size
[Figure: processor core block diagram showing the application, ALU, control, I/D memory, registers and bus unit]
RISC8 Architecture
Why RISC8?
Simple
Small
Methodology
[Figure: design flow with the stages application (*.c), front end, IR (expression tree), instruction synthesis, machine code, simulation, instruction profiling and performance]
[Figure: example SUIF IR expression tree built from ASSIGN, ADD, AND, VAR and CON nodes, with operands typed as byte, reg, con08 and addr16]
Expression trees
SUIF IR: data type carried; inaccurate cost; no profiling; simple, fewer tree nodes; machine independent
Register level: data type carried; one-to-one with macro instructions; profiling data can be back-annotated; machine dependent
Machine code: data type lost; one-to-one with machine instructions; profiling data accurate; large expression trees; machine dependent
Instruction Enumeration
Traverse tree structure in post-order
Normalize sub-tree orders
Combine patterns from sub-trees
Hash new instruction patterns
Collect register usage and memory access for evaluation
Annotate profiling information
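The enumeration steps above can be sketched roughly as follows. This is a minimal illustration, not the C++ implementation the slides refer to: the `Node` class, its `kind` and `count` fields, the `COMMUTATIVE` set and the `max_depth` limit are all assumptions made for the example, and the register-usage and memory-access bookkeeping is left out.

```python
from collections import Counter
from itertools import product

# Assumption: operators whose operands may be reordered during normalization.
COMMUTATIVE = {"ADD", "AND", "IOR"}


class Node:
    """Hypothetical expression-tree node."""

    def __init__(self, op, kind, kids=(), count=1):
        self.op = op            # operator name, or None for a leaf operand
        self.kind = kind        # operand class produced, e.g. "reg", "acc", "con08"
        self.kids = list(kids)
        self.count = count      # profiled execution count (hypothetical field)


def enumerate_patterns(root, max_depth=2):
    """Return a table mapping each pattern string to its profile-weighted frequency."""
    table = Counter()           # dict-based hash table of patterns

    def visit(node):
        # Returns the (pattern, depth) pairs rooted at this node.
        if not node.kids:
            return []
        kid_choices = []
        for kid in node.kids:   # post-order: children are handled first
            sub = visit(kid)
            # A child either stays a plain operand or is merged as a sub-pattern.
            kid_choices.append([(kid.kind, 0)] + sub)
        rooted = []
        for combo in product(*kid_choices):
            depth = 1 + max(d for _, d in combo)
            if depth > max_depth:
                continue
            args = [p for p, _ in combo]
            if node.op in COMMUTATIVE:
                args.sort()     # normalize sub-tree order
            pattern = f"{node.op}({','.join(args)})"
            rooted.append((pattern, depth))
            table[pattern] += node.count   # annotate with profiling information
        return rooted

    visit(root)
    return table


# ADD(AND(byte, con08), reg), executed 10 times according to a made-up profile.
tree = Node("ADD", "acc",
            [Node("AND", "acc", [Node(None, "byte"), Node(None, "con08")], count=10),
             Node(None, "reg")],
            count=10)
for pattern, weight in sorted(enumerate_patterns(tree).items()):
    print(pattern, weight)
```

Sorting the operands of commutative operators means that ADD(reg, AND(con08, byte)) and ADD(AND(byte, con08), reg) hash to the same pattern, so their profile counts accumulate in one table entry.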
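[Figure: patterns enumerated from an expression tree of ADD and AND nodes over acc, reg, byte and con08 operands]
[Figure: flow from instruction synthesis and instruction profile through the assembler to machine code]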
Op-Code Reuse
Op codes may not be fully used in a specific application
Remove unused instruction op-codes
Typical applications use far fewer than 256 op-codes
Application  Opcodes
FIR          28
ADPCM        49
GSM          32
max7219      39
LCD4x20      40
PRN-IO       30
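One way to picture op-code reuse is the sketch below. It assumes an 8-bit op-code field (256 encodings, matching the figure quoted above) and a program represented as a plain list of op-code numbers; the function name `reassign_opcodes` and this representation are hypothetical and are not taken from the RISC8 tool chain.

```python
def reassign_opcodes(program_opcodes, new_instructions, opcode_space=256):
    """Map each new macro instruction to an op-code the application never uses."""
    used = set(program_opcodes)
    # Encodings that the application leaves free, in ascending order.
    free = [op for op in range(opcode_space) if op not in used]
    if len(new_instructions) > len(free):
        raise ValueError("not enough unused op-codes for the new instructions")
    return dict(zip(new_instructions, free))


# A toy program that only uses op-codes 0, 1 and 3.
mapping = reassign_opcodes(
    [0, 1, 1, 3],
    ["acc:nAND(acc,const08)", "acc:nASR(acc,const08)"],
)
print(mapping)
```

With an application such as FIR using 28 op-codes, this leaves 228 of the 256 encodings available for synthesized instructions.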
Implementation
Compiler front-end: SUIF
Code generator: SPAM-olive
Retargeted to RISC8
RTL pattern enumeration: C++
RISC8 assembler: Perl
RISC8 simulator: Perl
Machine level pattern enumeration: Perl
Macro-driven instruction implementation automation: Perl
Benchmarks
Benchmarks: adpcm, GSM-encoder, PRN-IO, LCD_4X20, max7219
Instruction                                     #
null:nASSIGN(word,nAND(areg,const16))           40
null:nASSIGN(word,nADD(areg,word))              40
bool:nBOOL(areg,const16)                        86
bool:nBOOL(nAND(areg,const16),areg)             36
areg:nIOR(nAND(areg,const16),word)              24
acc:nAND(acc,const08)                           796
acc:nAND(nASR(acc,const08),reg)                 492
acc:nIOR(nAND(acc,const08),reg)                 414
acc:nASR(acc,const08)                           330
acc:nIOR(nAND(nASR(acc,const08),const08),reg)   621
acc:nIOR(acc,const08)                           240
null:nASSIGN(byte,nIOR(acc,reg))                96
null:nASSIGN(byte,nIOR(acc,const08))            96
bool:nBOOL(nAND(areg,const16),const16)          60
bool:nBOOL(acc,const08)                         99
null:nASSIGN(byte,nADD(reg,one))                30
acc:nIOR(acc,const08)                           140
bool:nBOOL(nAND(acc,reg),zero)                  48
Hardware/software tradeoff
Software gain: execution speed, code size
Hardware cost: functional unit, decoding logic, data path configuration
[Figure: GSM encoder chart of code-size, cycle and hardware for base, instr#1, instr#2, instr#3 and instr#4; vertical axis from 0 to 4000]
Conclusions
RTL-level pattern enumeration
Key to automating instruction identification, code-generation, assembly and simulation
No need to change algorithm source code
Hardware/software trade-off
Op-code reuse