Lecture 1: Introduction

Lecture outline
- Course information
  - Examination: project
- Modern embedded systems
  - Embedded systems design
  - Performance vs. predictability
- Examples
  - Automotive electronics
  - PlayStation 3, mobile phones
  - Mars Pathfinder

Course information
- Team (see contact info on CampusNet)
  - Jan Madsen, course leader
  - Paul Pop and Sven Karlsson, lecturers
  - Michael Reibel Boesen and Prabhat Kumar Saraswat, teaching assistants
- Webpage
  - CampusNet
  - http:,,ese1ab.mm.dtu.dk,cg-bn,wk.cg,IMLSCourse,uome

Course information, cont.
- Literature
  - Textbook: Scheduling in Real-Time Systems (full text available online)
  - Selected chapters (full text available on CampusNet)
- Lectures
  - Language: English
  - 13 lectures
  - Lecture notes: available on CampusNet as a PDF file the day before
- Examination: project, see CampusNet/Project
  - Report evaluation and oral exam
- 7.5 ECTS points

What is an embedded system?
- Definition
  - A special-purpose computer system, part of a larger system which it controls.
- Notes
  - A computer is used in such devices primarily as a means to simplify the system design and to provide flexibility.
  - Often the user of the device is not even aware that a computer is present.

Embedded systems examples
- Product: Sonicare Plus toothbrush. Microprocessor: 8-bit Zilog Z8.
- Product: NASA's Mars Sojourner Rover. Microprocessor: 8-bit Intel 80C85.
- Product: Garmin nuvi. Microprocessor: 333 MHz.
- Product: iPod Touch. Microprocessor: 532 MHz Samsung ARM.
- Product: RCA RC5400 DVD player. Microprocessor: 32-bit RISC.
- Product: Sony Aibo ERS-110 robotic dog. Microprocessor: 64-bit MIPS RISC.
- Smart pills, 2nd generation
- Flying micro-insects!
- Do not have to be small.

Embedded systems are everywhere

Characteristics of embedded systems
- Single-functioned
  - Dedicated to perform a single function
- Complex functionality
  - Often have to run sophisticated algorithms or multiple algorithms.
  - Cell phone, laser printer.
- Tightly-constrained
  - Low cost, low power, small, fast, etc.
- Reactive and real-time
  - Continually reacts to changes in the system's environment
  - Must compute certain results in real-time without delay
- Safety-critical
  - Must not endanger human life and the environment

Some statistics
- More embedded devices than humans on the planet, already
  - 40 billion such devices by 2020
- 99% of the processors are used in embedded systems
  - 4 billion embedded processors were sold last year alone
- €71 billion global market in 2009, growth rates of 14%
  - Market size is about 100 times the desktop market
- Share of costs:
  - automotive (36%), industrial automation (22%), telecommunications (37%), consumer electronics (41%) and health/medical equipment (33%)
- Half a million more engineers needed, worldwide
  - expected to double over the next 6 years

Example area: automotive
- Embedded systems: 90% of future innovations, 40% of price

[Figure: level of dependency on electronics over time, with axis marks at 1970, 1980, 1990 and 2000. Source: BMW. Feature groups shown in the figure:]
- ACC Stop&Go, BFD, ALC, KSG, 42 voltage, Internet Portal, GPRS, UMTS, Telematics, Online Services, BlueTooth, Car Office, Local Hazard Warning, Integrated Safety System, Steer/Brake-By-Wire, I-Drive, Lane Keeping Assist., Personalization, Software update, Force Feedback Pedal, ...
- Electronic Injections, Check Control, Speed Control, Central Locking, ...
- Navigation System, CD-Changer, ACC Adaptive Cruise Control, Airbags, DSC Dynamic Stability Control, Adaptive Gear Control, Xenon Light, BMW Assist, RDS/TMC, Speech Recognition, Emergency Call, ...
- Electronic Gear Control, Electronic Air Condition, ASC Anti Slip Control, ABS, Telephone, Seat Heating Control, Autom. Mirror Dimming, ...

Distributed architecture

Evolution of handsets and technology

[Figure: block diagram of a state-of-the-art smartphone. Labelled blocks: LCDs, application processor, baseband ASIC, mixed-signal ASIC, energy management ASIC, position sensors, 512 MB DDR DRAM, 512 MB NAND FLASH, 2 Mpix camera module, 64 MB NOR FLASH, 64 MB SDRAM, RF, battery, white LED driver, frame buffer ASIC, MMC, two ARM9 DMA cores, keyboard, LED flash, BT module, SIM, IHF, back-light LEDs, charger.]

Traditional embedded software development
- Design and build the target hardware
- Develop the software independently
- Integrate them and hope it works
Does not work for complex projects

System-level design (Y-chart)

[Figure: Y-chart. Labels: application model, architecture model, system platform model, system-level design tasks, analysis, software synthesis, hardware synthesis, model of system implementation.]

Graphical illustration of Moore's law

[Figure: timeline from 1981 to 2002 in three-year steps.]
- Leading edge chip in 1981: 10,000 transistors
- Leading edge chip in 2002: 150,000,000 transistors
- Something that doubles frequently grows more quickly than most people realize!
  - A 2002 chip could hold about 15,000 1981 chips inside itself

Design crisis

[Figure: log-scale plot of gates/cm² against technology (micron), with marks at 0.35, 0.25, 0.18, 0.15, 0.12 and 0.1. Curves: Moore's Law (59% CAGR), design productivity (20-25% CAGR), software productivity (8-10% CAGR). Annotation: widening gaps will trigger a paradigm shift.]

Design challenge - optimizing design metrics
- Common metrics
  - Performance: the execution time or throughput of the system
  - Unit cost: the monetary cost of manufacturing each copy of the system, excluding NRE cost
  - NRE cost (Non-Recurring Engineering cost): the one-time monetary cost of designing the system
  - Size: the physical space required by the system
  - Power: the amount of power consumed by the system
  - Flexibility: the ability to change the functionality of the system without incurring heavy NRE cost
  - Time-to-prototype: the time needed to build a working version of the system
  - Time-to-market: the time required to develop a system to the point that it can be released and sold to customers
  - Maintainability: the ability to modify the system after its initial release
  - Correctness, safety, many more

The performance design metric
- Widely-used measure of system, widely-abused
  - Clock frequency, instructions per second: not good measures
- Latency (response time)
  - Time between task start and end
  - e.g., digital cameras A and B process images in 0.25 seconds
- Throughput
  - Tasks per second, e.g. Camera A processes 4 images per second
  - Throughput can be more than latency seems to imply due to concurrency, e.g. Camera B may process 8 images per second (by capturing a new image while the previous image is being stored).
- Speedup of B over A = B's performance / A's performance
  - Throughput speedup = 8/4 = 2

Time-to-market: a demanding design metric
- Time required to develop a product to the point it can be sold to customers
- Market window
  - Period during which the product would have highest sales
- Average time-to-market constraint is about 8 months
- Delays can be costly

[Figure: revenues as a function of time.]
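
The camera numbers from the performance design metric slide can be checked with a few lines of Python. This is only a sketch: the `Camera` class, its `overlap` field and the `speedup` helper are hypothetical names introduced here, and modelling concurrency as "number of images in progress at once divided by latency" is a simplifying assumption, not something the slides state.

```python
from dataclasses import dataclass


@dataclass
class Camera:
    name: str
    latency_s: float  # time between task start and end for one image
    overlap: int      # assumption: number of images in progress at once

    def throughput(self) -> float:
        """Images per second under the simple overlap model assumed here."""
        return self.overlap / self.latency_s


def speedup(perf_b: float, perf_a: float) -> float:
    """Speedup of B over A = B's performance / A's performance."""
    return perf_b / perf_a


# Both cameras need 0.25 s per image; camera B overlaps capturing a new
# image with storing the previous one, modelled here as overlap=2.
camera_a = Camera("A", latency_s=0.25, overlap=1)
camera_b = Camera("B", latency_s=0.25, overlap=2)

print(camera_a.throughput())  # 4.0 images per second
print(camera_b.throughput())  # 8.0 images per second
print(speedup(camera_b.throughput(), camera_a.throughput()))  # 2.0
```

Both cameras have the same latency, so a comparison on latency alone would show no difference; only the throughput comparison reveals the factor of 2.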

Power design metric: trends

[Figure: power trend across processor generations. Labels: 386, 486, Pentium, Pentium Pro, Pentium 2, Pentium 3, Pentium 4 (Prescott), Pentium 4. Reference levels: hot plate, nuclear reactor.]

Project: Why?
- The course will focus on the analysis, simulation and design of embedded systems.
- Goal
  - Understand and apply simulation/analysis/design techniques for embedded applications.
  - How: implement a software tool that can simulate/analyze/design an embedded system.
- Details
  - See these slides and "project.pdf" on CampusNet/Project

Project: What?
- Software tool:
  - Implementation of simulation, analysis and/or design techniques for embedded systems
- A1: Very Simple Simulator
  - Simulate the running of an embedded application on a single-processor system, using preemptive fixed-priority scheduling
- A2: Response-Time Analysis
  - Determine the worst-case response times for the embedded system simulated in A1
- A3: Advanced technique
  - Free choice of embedded system and technique (simulation, analysis and design)

Project: How?
- Software tool
  - Input
    - Models for the application and hardware architecture (Lecture 2)
    - Worst-case execution time of each process (Lecture 3)
    - Timing constraints (deadlines)
  - Scheduling policy
    - Fixed-priority preemptive scheduling (Lectures 4-6)
    - Another scheduling policy (Lectures 7-12 and bibliography)
  - Output
    - Is the application meeting all the deadlines?
    - Performance numbers: e.g., worst-case response times

Project: Deliverables
- Source code with comments
  - Programming language: any language is fine
  - Suggestion: Java, using the JUNG graph library and GraphML for capturing the models: http://jung.sourceforge.net/
  - See the examples on CampusNet/Project
- Report
  - Document the design and implementation
  - Describe the results obtained
  - See the documents on CampusNet/Project
    - "software_development_projects.pdf" and "Systematic Software Test.pdf"

Project, cont.
- Milestones
  - Sept. 9: Group registration
    - Add your group on the course home page (see "Groups" link)
  - Oct. 23: Advanced topic selection
    - Decide on the "advanced technique" to be implemented
    - Present your advanced topic and get feedback
  - Oct. 30: Project report draft
    - Upload draft to CampusNet
    - It should contain a description of the advanced topic
  - Dec 11: Final report submission
    - Upload final report to CampusNet

Preliminary lecture plan
- L1 Introduction (Paul Pop, Sven Karlsson, Michael Reibel Boesen)
- L2 Performance analysis (Jan Madsen). Lab: SimpleScalar
- L3 Worst-case execution time analysis (Jan Madsen). Lab: aiT
- L4 Introduction to scheduling (Paul Pop). Lab: Project
- L5 Schedulability analysis (I) (Paul Pop). Lab: TIMES
- L6 Schedulability analysis (II) (Paul Pop). Lab: TIMES
- L7 Handling shared resources (Paul Pop). Realistic exercise
- L8 Handling dependencies (Paul Pop). Lab: MAST
- L9 Parallel programming (Sven Karlsson). Lab: OpenMP
- L10 Aspects of parallel programming (Sven Karlsson). Lab: CELL simul.
- L11 Hybrid scheduling (Paul Pop). Exercise
- L12 Multi-core systems (Paul Pop/Jan Madsen). Lab: MAST
- L13 Outlook
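
As an illustration of project task A2 (response-time analysis for preemptive fixed-priority scheduling on a single processor), the following is a minimal Python sketch, not the course's reference solution. It assumes the classic fixed-point recurrence R = C + sum over higher-priority tasks j of ceil(R / T_j) * C_j, independent periodic tasks, and deadlines no larger than periods; none of these details are given on the slides. The `Task` class, its field names, the convention that a smaller number means higher priority, and the example task set are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    name: str
    wcet: int      # worst-case execution time C
    period: int    # period T
    deadline: int  # relative deadline D, assumed <= period
    priority: int  # assumption: smaller number = higher priority


def response_time(task: Task, tasks: List[Task]) -> Optional[int]:
    """Worst-case response time of `task`, or None if it misses its deadline."""
    higher = [t for t in tasks if t.priority < task.priority]
    r = task.wcet
    while True:
        # Interference: each higher-priority task preempts once per release
        # within the window [0, r). -(-a // b) is integer ceil(a / b).
        interference = sum(-(-r // t.period) * t.wcet for t in higher)
        r_next = task.wcet + interference
        if r_next > task.deadline:
            return None          # deadline exceeded, stop iterating
        if r_next == r:
            return r             # fixed point reached
        r = r_next


# Hypothetical task set, priorities assigned by increasing period.
tasks = [
    Task("t1", wcet=3, period=7, deadline=7, priority=1),
    Task("t2", wcet=3, period=12, deadline=12, priority=2),
    Task("t3", wcet=5, period=20, deadline=20, priority=3),
]

for t in tasks:
    print(t.name, response_time(t, tasks))
```

For this task set the iteration converges to response times of 3, 6 and 20, so every task meets its deadline, the last one with no slack. The same numbers can be cross-checked against a simulator such as the one asked for in A1.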