
White Paper

Lori M. Matassa
Software Technical Marketing Engineer
Intel Corporation

PowerPC* to Intel Architecture Migration


January 2009

321079-001


Executive Summary
Completing a successful PowerPC* to Intel architecture software migration requires an awareness of the architecture differences and their impact on the software. This white paper outlines the information that should be considered when planning a software product port from PowerPC to Intel architecture-based platforms. A thorough review of the architecture differences, operating system considerations, system initialization, migration tools, and software development products must be completed. Every situation is different, so the scope of work and effort required for the port will vary. This paper identifies the items that need to be considered when investigating the migration and the resources that can assist during the implementation.


Contents
Executive Summary
Introduction
    Intel Embedded Design Center
    Reasons to Migrate to Intel Architecture
Migration Considerations
    Hardware Architecture Differences
    Operating Systems
    System Initialization Firmware
    Architecture Migration Tools
    Software Development Tools
    Multi-core Solutions
Training and Design Information
    Intel Software College
    Intel Software Network
Migration Design Guide (Putting It All Together)
    Step 1 Port PowerPC* Code to Target Operating System
    Step 2 Execute Code Correctly on One Intel Architecture Core
    Step 3 Optimize the Code for Performance on One Intel Architecture Core
    Step 4 Apply Multi-core Software Design Updates
    Step 5 Optimize the Software Design for Multi-core Intel Architecture Performance
Conclusion


Introduction
Porting software to platforms of different processor architecture can be simple or require additional effort depending on the design (portability) of the original software. Software that is specifically written to run on one hardware architecture will need to be updated to support the architecture differences. For software implementations that abstract away the hardware and operating system specific information, the port could be as simple as a recompile. One of the main migration hurdles is the Endianness difference between PowerPC* (PPC) and Intel architecture. Other considerations include variations between the current and target operating systems and development tools. Completing a successful port involves assessing and understanding the current situation and requirements before the migration begins.

Intel Embedded Design Center


Intel provides the Intel Embedded Design Center (EDC) to help designers get started with designing on Intel architecture. The EDC provides embedded hardware and software design information, as well as step-by-step guidance to design information and decisions. The EDC simplifies information searches by providing everything you need to know about designing for embedded Intel architecture systems all in one location. Whether the information needed is hardware schematics or firmware and device drivers, the EDC is the one-stop shop for embedded Intel architecture hardware and software design information such as downloads, white papers, and case studies at your fingertips. Visit the EDC at http://www.intel.com/embedded/edc.

Reasons to Migrate to Intel Architecture


Intel architecture has established itself as a proven leader in performance and innovation over its entire history. Intel continues to offer a strong roadmap with products that are optimized for performance, power and value. Also, Intel's involvement in developing and driving standards based platforms has dramatically altered the computer industry. Today most customers demand standards based open architecture and compatibility with legacy applications.
Software development is a key factor in meeting time-to-market constraints during an architecture migration, and the availability of good software development tools is a major influence on decisions for any software system. Intel provides a full line of development products (see the Software Development Tools section) that help implement, debug, and tune software for performance and correctness, and that reduce time-to-market. Open source initiatives such as Linux*, together with strong development tool support from major independent software vendors (ISVs) and operating system vendors (OSVs), are also important considerations. In addition, Intel's initiatives in driving a strong ecosystem of independent hardware vendors (IHVs) and standards that address various form factors help customers focus on their unique solution and enhance their intellectual property.

Migration Considerations
Architecture migration requires consideration of multiple software design areas, including several hardware architectural differences, the operating system, system initialization, and migration and development tools. Another aspect to consider when migrating from PPC to Intel architecture is the move from uniprocessor serial code to a multi-core software system. This paper discusses each of these areas along with various design choices. Because every migration situation is different, the migration design guide steps system designers through situational decisions and solutions that will guide their overall migration plan.

Hardware Architecture Differences


The hardware architecture differences between PPC and Intel architecture span instruction-set, register, and memory categories. The benefit of developing code in a high-level programming language, such as C, is that the source code is, for the most part, portable between hardware architectures. This is because compilers for high-level languages handle instruction-set and register differences by generating the machine code for the target processor architecture. The architectural differences to be aware of are listed in Table 1.

Table 1. Differences Between PPC and Intel Hardware Architecture (note 1)

Category: Instruction set
- Instructions: PPC and Intel architecture instructions are very different. For some instructions there is no one-to-one (PPC to Intel architecture) equivalent. Refer to the Intel Software Developer Manuals and the instruction set information and tools that may assist the assembly code migration.
- Alignment: PPC instructions are all 4 bytes in size and must be aligned on 4-byte boundaries. Intel architecture instructions vary in size and therefore do not require alignment. On PPC a bool is 4 bytes. On Intel architecture, a bool is 1 byte. Make the code portable by changing the PPC boolean data to an unsigned 32-bit integer.
- Vector oriented instructions: PPC uses AltiVec* instructions. Intel architecture uses Streaming SIMD Extensions (SSE). Refer to the Vector Oriented Code section for details about migrating AltiVec to SSE instructions.

Category: Operations
- Divide-by-zero: For integer divide-by-zero, PPC simply returns zero. On Intel architecture, executing this operation is fatal. Code should always check the denominator for zero before executing the divide operation. There is no difference in operation between PPC and Intel architecture for floating point divide-by-zero.

Category: Hardware Devices
- Drivers and Libraries: If a PPC driver or library comes from a third party vendor, check with the vendor for equivalent Intel architecture products. If any device drivers or libraries are developed in-house, they will need to be rewritten for Intel architecture. Refer to the Device Drivers section of this paper for chipset and graphics driver information.

Category: Registers
- Calling conventions: Specified by the application binary interface (ABI). Arguments are passed in registers for PPC. For Intel architecture, arguments are passed on the stack. Intel architecture has fewer registers than PPC and therefore local variables may be stored on the stack as well.

Category: Memory
- Byte order (Endianness): Endianness describes how multi-byte data is represented by a computer system and is dictated by the CPU architecture of the system. Intel architecture uses little endian and PPC uses big endian format to store multi-byte data. The difference in Endian-architecture is an issue when software or data is shared between computer systems. Refer to the Endianness section of this paper for more information.
- Bit fields: The order of bit fields in memory can be reversed between architectures. Refer to the Bit Fields and Bit Masks section of the Endianness white paper for more details.

1. "Architectural Differences." Universal Binary Programming Guidelines. 26 Feb 2007. Apple.com. 18 Dec 2008. http://developer.apple.com/documentation/MacOSX/Conceptual/universal_binary/universal_binary_intro/chapter_1_section_1.html
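The divide-by-zero and bool size items in Table 1 can both be handled in portable C. The following is a minimal sketch under stated assumptions, not code from any Intel or PPC library: the names flag32_t, shared_record, and safe_div_i32 are hypothetical. The divide helper also rejects the most negative value divided by -1, which overflows and likewise faults on Intel architecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fixed-width flag type: 4 bytes on every target, unlike bool. */
typedef uint32_t flag32_t;

/* Hypothetical record that is written to a file or shared between systems.
   Using flag32_t keeps its layout independent of sizeof(bool). */
struct shared_record {
    uint32_t id;
    flag32_t enabled;
};

/* Divide only when the operation is defined.
   Returns true and stores the quotient on success.
   Returns false and leaves *quotient untouched otherwise. */
static bool safe_div_i32(int32_t numerator, int32_t denominator,
                         int32_t *quotient)
{
    if (denominator == 0) {
        return false;   /* integer divide-by-zero is fatal on Intel architecture */
    }
    if (numerator == INT32_MIN && denominator == -1) {
        return false;   /* quotient does not fit in 32 bits; also faults */
    }
    *quotient = numerator / denominator;
    return true;
}
```

A caller checks the return value instead of relying on the PPC behavior, for example: if (!safe_div_i32(total, count, &average)) { average = 0; }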

Endianness
Endianness describes how multi-byte data is represented by a computer system and is dictated by the CPU architecture of the system. Unfortunately, not all computer systems are designed with the same Endian-architecture. Big endian is an order in which the big end (most significant value in the sequence) is stored first, at the lowest storage address. The most significant byte is stored in the leftmost position. PPC systems use the big endian model, where the most significant byte is at the lowest address in memory. Little endian is an order in which the little end (least significant value in the sequence) is stored first. The most significant byte is stored in the rightmost position. Intel architecture systems use the little endian model, where the least significant byte is at the lowest address in memory. Table 2 shows how the 32-bit hex value 0x12345678 is stored in memory for each Endian-architecture. The lowest memory address is represented in the leftmost position.


Table 2. Hex Values for Endian-architecture

Endian Order     Byte 00     Byte 01     Byte 02     Byte 03
Big Endian       12          34          56          78
Little Endian    78 (LSB)    56          34          12
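The layout in Table 2 can be observed directly on the system under test. The sketch below copies the in-memory bytes of 0x12345678 into an array, lowest address first; the function names bytes_in_memory and host_is_little_endian are hypothetical and used only for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Copy the in-memory bytes of a 32-bit value, lowest address first. */
static void bytes_in_memory(uint32_t value, uint8_t out[4])
{
    memcpy(out, &value, sizeof value);
}

/* Returns 1 on a little endian host (for example, Intel architecture)
   and 0 on a big endian host (for example, PPC). */
static int host_is_little_endian(void)
{
    uint8_t bytes[4];

    bytes_in_memory(0x12345678u, bytes);
    return bytes[0] == 0x78;   /* the LSB sits at the lowest address */
}
```

On an Intel architecture system the array holds 78 56 34 12, which matches the Little Endian row of Table 2; on a PPC system it holds 12 34 56 78.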

The difference in Endian-architecture is an issue when software or data is shared between computer systems, whether through files or over a network connection. If the code is not endian-neutral, it must be updated to account for little endian architecture, because the difference in byte ordering can produce incorrect results. For complete details on software considerations related to microprocessor Endian-architecture, and for guidelines on developing endian-neutral code, see the Endian White Paper at: http://www.intel.com/design/intarch/papers/endian.htm.
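One common way to make such code endian-neutral is to assemble multi-byte values from individual bytes with shifts, rather than casting a byte buffer to an integer pointer. The following sketch assumes data that was written in big endian (PPC) byte order; the helper names load_be32 and store_be32 are hypothetical. Because the code never depends on the byte order of the host, it produces the same result on PPC and on Intel architecture.

```c
#include <stdint.h>

/* Read a 32-bit value that is stored in big endian byte order.
   p[0] holds the most significant byte. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* Write a 32-bit value in big endian byte order. */
static void store_be32(uint8_t *p, uint32_t value)
{
    p[0] = (uint8_t)(value >> 24);
    p[1] = (uint8_t)(value >> 16);
    p[2] = (uint8_t)(value >> 8);
    p[3] = (uint8_t)value;
}
```

A file or network record produced by the original PPC software can then be read with load_be32 on the new platform without changing the data format.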

Operating Systems
If the architecture migration includes a port to a new OS, check with the target OS distributor to see if there is an OS migration guide available that supports the current and target OS pair used in the migration. Porting source code to a new OS includes not only updating the OS calls, but also locating the correct version of all third-party utilities and libraries needed to build the application. Common examples are:

- Source control system
- Developer tools
- Build utilities
- Licensing, graphics, or other third-party libraries

If the situation allows, port to the OS version that will be used for the target multi-core solution. For example, if SMP will be used as the target OS solution, port to the SMP version of the target OS.

Real-time Requirements and Power Management


Some embedded applications depend on predictable response times and therefore run on operating systems (OSs) offering real-time support. In a real-time environment, it is critical for an OS to be able to guarantee certain time slices. Additionally, it is important that time slice measurements remain consistent. This should be taken into consideration whenever power management is enabled. For example, if a high-end platform must respond within a certain number of microseconds and power management is enabled, the processor may need some of that time to wake up, which reduces the time left for the response. Furthermore, speculative pre-fetch memory accesses can also cause issues for a real-time operating system (RTOS) in tight loops. For systems that require RTOS guaranteed response times, Intel architecture power management and speculative pre-fetch features should be disabled.


Device Drivers
If the PPC driver is developed in-house, the low level initialization will need to be updated for Intel architecture. Open source versions of the driver may help guide the changes that are required.

Intel Embedded and Communications Chipset Drivers


The Intel architecture chipset data sheets contain information about registers that need to be programmed. Technical information about the Intel embedded and communications chipsets can be found at http://www.intel.com/products/embedded/chipsets.htm?iid=embed_body+chip. Depending on the OS, Intel architecture device drivers are available from various providers. The RTOS board support packages (BSPs) for Intel embedded chipset drivers are available from the RTOS vendors. The standard desktop, mobile, and server drivers for Microsoft Windows* (XP or Vista*) and Linux* can be downloaded from http://downloadcenter.intel.com/. BSPs for Microsoft Windows CE* can be downloaded from these third party vendor sites:

- Adeneo Corporation*
- BSQUARE*
- Wipro Technologies*

Intel Embedded Graphics Drivers


If a graphics driver is required, the Intel Embedded Graphics Drivers (IEGD) are implemented specifically to address embedded usage models. IEGD is supported on Microsoft Windows XP*, Microsoft Windows XP Embedded*, Microsoft Windows CE*, and various Linux* distributions. The IEGD driver is not included with packaged chipset drivers or board support packages, and must be downloaded separately at http://www.intel.com/go/iegd.

System Initialization Firmware


Every embedded Intel architecture design must include a firmware stack that initializes the processor, memory, IO, and peripherals, and that may also include diagnostic routines. Firmware initializes the system to a point where the operating system can load and take control. PPC systems use home-grown boot loaders, but achieving system initialization on Intel architecture is just as easy for closed box and open box designs.

Boot Loaders for Closed Box Designs


Some embedded systems use minimal, specialized firmware stacks created for speed, small size, and specific system requirements. These boot loaders perform static hardware configuration and initialize only critical hardware features prior to handing off to an operating system. They are tuned to a targeted OS, specific application, or function set, and support minimal upgrade and expansion capabilities.


QNX* Fastboot Technology for Intel Atom Processors


QNX* Fastboot technology integrates system initialization into the QNX Neutrino RTOS, eliminating the need for a BIOS or other boot loader. It was developed specifically for use in the QNX Neutrino RTOS on Intel Atom processor Z5xx series platforms. Customers using QNX Fastboot can achieve boot times of milliseconds while eliminating the BIOS royalty from their bill of materials.

Intel Architecture System BIOS for Open Box Designs


A common requirement for open, expandable system designs is to provide the broadest possible system initialization solution, allowing the flexibility to load a wide range of off-the-shelf operating systems and to support methodical, dynamic hardware configurations. These designs support multiple standard interfaces and expansion slots, host mainstream operating systems with a broad set of pre-OS features, and are ready to run multiple applications. On Intel architecture designs that require this flexibility, developers can choose from vendor-provided firmware.

Legacy Basic Input/Output System (BIOS)


The BIOS initializes the hardware and boots it to a point where the operating system can load. It also abstracts the hardware from the operating system through various industry standard tables (ACPI, SMBIOS, IRQ routing, memory maps, etc.). Access to the hardware is made directly through silicon-specific BIOS commands or industry standard interfaces. Intel architecture designs have used a BIOS for more than 20 years to support systems that have multiple use cases, customizable services, multiple boot paths, or native OSs, or that are feature rich. Major BIOS vendors include:

- American Megatrends Inc.*
- Insyde Software Corp.*
- Phoenix Technologies, Ltd.*
- Nanjing Byosoft Co., Ltd.*

Unified Extensible Firmware Interface (UEFI)


The Unified EFI Forum, Inc., formed in 2005, is a Washington non-profit corporation whose goal is to forward the technical advancement of the IT industry through the development and promotion of a set of UEFI standard specifications. Intel developed the original Extensible Firmware Interface (EFI) and donated it to the UEFI forum as a starting point for the creation of the industry specifications, including UEFI and Platform Interface (PI). The forum is governed by a board of directors from eleven promoter companies including AMD*, AMI*, Apple*, Dell*, HP*, IBM*, Insyde*, Intel, Lenovo*, Microsoft* and Phoenix*. In addition there are over 120 contributor and adopter member companies.


The UEFI Forum is responsible for two specifications:

1. The Unified Extensible Firmware Interface (UEFI) specification
   - Defines interfaces between the OS, add-in firmware drivers, and system firmware, where the OS and other high-level software should ONLY interact with exposed interfaces and services defined by the UEFI specification.
   - Includes the EFI Byte Code (EBC) specification, which defines an interpretive layer for portable component drivers.
2. Platform Initialization Interface (PI) specifications
   - The core code and services that are required for an implementation of the Platform Initialization (PI) specifications (hereafter referred to as the PI Architecture).
   - Interoperability standards between firmware phases and pre-OS components from different providers.

Figure 1. UEFI Block Diagram
[Figure not reproduced. Diagram labels: OS, Pre-boot Tools, UEFI Specification, Platform Drivers, PI Specification, Silicon Component Modules, Hardware]

The UEFI specifications define a model for the interface between operating systems and platform firmware. The interface consists of data tables that contain platform-related information, plus boot and runtime service calls that are available to the operating system and its loader. Together, these provide a standard environment for booting an operating system and running pre-boot applications. For more details about the UEFI specifications, writing UEFI drivers, and how to use the UEFI Sample Implementation and UEFI Application Toolkit, see the UEFI web site at http://www.uefi.org/.


Intel Platform Innovation Framework for EFI


The Intel Platform Innovation Framework for EFI (referred to as "the Framework" and previously called Tiano) is an implementation of the UEFI and PI specifications. The Framework is a set of architectural interfaces that has been designed to enable the BIOS industry and Intel's customers to accelerate the evolution of innovative, differentiated platform designs. The Framework is Intel's recommended implementation of the PI and UEFI specifications for platforms based on all members of the Intel architecture family. BIOS vendors provide a compatibility support module (CSM), which is used to connect operating systems that require legacy BIOS interfaces to the Framework. The Framework is a reference code base developed by Intel. The Framework firmware implementation includes support for UEFI without the CSM, but does provide interfaces that support adding a CSM supplied by a BIOS vendor. The EFI Developer Kit (EDKII) is the open source portion of the Framework code base, referred to as the Foundation, and is available from the TianoCore project at http://www.tianocore.org/. A complete Framework implementation is not generally available directly from Intel, but is offered by participating vendors as products and services based on the Framework for both Intel and non-Intel silicon. Framework vendors:

- Aptio* by American Megatrends Inc.*
- InsydeH2O* by Insyde Software Corp.*
- SecureCore Tiano* by Phoenix Technologies, Ltd.*
- Nanjing Byosoft Co., Ltd.*

For information on the Framework and specification, as well as participating vendor information, see: http://www.intel.com/technology/framework/. More information about the genesis of the Intel Platform Innovation Framework and its implementation and adoption can be found on Wikipedia at http://en.wikipedia.org/wiki/Tiano. For more information on implementing embedded Intel architecture firmware, see the white paper at the EDC site titled Implementing Firmware on Embedded IA Designs.

Architecture Migration Tools


For the most part, migration will need to be done manually. However, there are a few migration tools that can provide some help. These tools are described below.

Intel Architecture and Instruction Set


We must consider that there will always be a portion of assembly code either contained in assembly source files or inline assembly used within C source files. Assembly code is not portable and will need to be updated to target Intel architecture processor instructions. If the code was originally written in assembly for performance reasons, hardware and compiler improvements may now permit it to be rewritten in C or C++.
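As an illustration of rewriting assembly in C, consider a routine that counts the leading zero bits of a 32-bit word, which PPC code often implements with the cntlzw instruction. The sketch below is a portable C equivalent under that assumption; the function name count_leading_zeros32 is hypothetical. It returns 32 for an input of zero, which matches the cntlzw result for a zero operand, so callers that relied on that case keep working.

```c
#include <stdint.h>

/* Portable count of leading zero bits in a 32-bit word.
   Returns 32 when x is zero. */
static unsigned count_leading_zeros32(uint32_t x)
{
    unsigned count = 0;

    if (x == 0) {
        return 32;
    }
    /* Shift left until the most significant bit is set. */
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        count++;
    }
    return count;
}
```

Once the logic is expressed in C, the compiler for each target is free to select an efficient instruction sequence, and the same source builds for both PPC and Intel architecture.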


Intel 64 and IA-32 Architectures Software Developer's Manuals


The Intel 64 and IA-32 Architectures Software Developer's Manuals contain the details for each Intel architecture instruction, including the Intel Streaming SIMD Extensions 4 (Intel SSE4) instructions. Use this set of manuals as a reference when rewriting PPC assembly code as Intel architecture instructions with equivalent functionality.

Vector Oriented Code


SIMD (single instruction, multiple data) is a technology used for vector oriented code. AltiVec* and SSE (Streaming SIMD Extensions) are extensions to the fundamental processor architecture instruction set. PPC uses AltiVec and Intel architecture uses SSE. If the PPC software uses vector oriented code the AltiVec instructions must be ported to SSE instructions and optimized for Intel architecture.

Manual Vector Oriented Code Migration


For information on translating AltiVec to SSE instructions see the AltiVec/SSE Migration Guide.
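For illustration, the sketch below shows what a ported vector loop can look like when written with SSE intrinsics from xmmintrin.h. It assumes an original AltiVec loop that added two float arrays with vec_add; the function name add_floats and the macro EXAMPLE_HAVE_SSE are hypothetical. Unaligned loads and stores (_mm_loadu_ps, _mm_storeu_ps) are used so that the sketch makes no assumption about buffer alignment, and a scalar loop handles any remaining elements and lets the code build on targets without SSE.

```c
#include <stddef.h>

/* Use SSE intrinsics only when the compiler targets a processor with SSE. */
#if defined(__SSE__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define EXAMPLE_HAVE_SSE 1
#else
#define EXAMPLE_HAVE_SSE 0
#endif

/* Element-wise sum: out[i] = a[i] + b[i] for n floats. */
static void add_floats(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;

#if EXAMPLE_HAVE_SSE
    /* Process four floats per iteration, as a 128-bit AltiVec vector would. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar loop for the remaining elements (or all of them without SSE). */
    for (; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

When the buffers are known to be 16-byte aligned, the aligned forms _mm_load_ps and _mm_store_ps can be used instead.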

N.A. Software* Vector Oriented Code Conversion Tools


Converting existing highly optimized AltiVec* software to Intel architecture SSE can be a daunting task. Intel is working with N.A. Software* to bring three vector oriented code conversion tools to market for Linux* and Wind River* VxWorks* operating systems, which will reduce the Digital Signal Processing (DSP) software conversion effort.

1. Vector Signal Image Processing Library (VSIPL): Highly efficient computational middleware for signal and image processing applications. VSIPL is an open standard for embedded signal and image processing software and hardware vendors. It abstracts hardware implementation details, so applications are portable across processor types and generations without rewriting the software. This tool will be available as the VSIPL library, or as C-VSIPL, the plain C equivalent for in-house libraries that need to be converted. N.A. Software will also port custom in-house DSP libraries to Intel architecture.
2. altivec.h include file for Intel architecture: Same as the PPC altivec.h, but targets the Intel SSE instruction set instead of AltiVec. The application's DSP code remains unchanged.
3. AltiVec Assembler to Intel Assembler-Compiler: Converts small(ish) blocks of PPC AltiVec assembler into C code, which can then be compiled into Intel architecture SSE assembler code. This tool is currently under a feasibility design study by Intel and the prototype should be available by the end of 2008. If successful, the product could be released in mid 2009.


Software Development Tools


Software tools are important for any platform migration. Understanding the needs and availability of tools for the new platform is important when investigating the requirements of the port. Keep in mind that software development tools, like all software applications, have system requirements: a tool must support the target processor and operating system. Thus, a tool that is used on the current pre-port system may not be available for the target system. This may affect the OS and tools choice for your migration plan. Check with the OS and tools providers to determine the availability of their tools for the target Intel processor.

Intel Software Development Products


Intel supports a wide variety of software development products, which help developers unleash the performance of their software on Intel platforms. The Intel software tools product line includes compilers and debuggers, performance analyzers, performance libraries, and threading tools. These tools are extremely helpful when implementing threads and even more important when tuning the performance of the application and optimizing the software for multi-core platforms. For more details about each product and OS support, visit the products web sites.

Intel Compilers
Intel Compilers are compatible with other tools you might use, integrate into popular development environments, and are source and binary compatible with other widely used compilers. The Intel compilers support creating multi-threaded applications and include features for advanced optimization, automatic processor dispatch, vectorization, auto-parallelization, multithreading, OpenMP*, data prefetching, and loop unrolling, along with highly optimized libraries. Visit the product web site at: http://www.intel.com/cd/software/products/asmo-na/eng/compilers/284132.htm. OpenMP* is a standard for compiler based multiprocessing features. To learn more about OpenMP and the specification visit the web site: http://www.openmp.org.
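As a brief illustration of the OpenMP support mentioned above, the following sketch parallelizes a loop with a reduction clause. The function name dot_product is hypothetical. OpenMP must be enabled with the compiler's OpenMP option (for example, -fopenmp with GCC); without it the pragma is ignored and the loop simply runs serially.

```c
/* Dot product of two arrays of n doubles.
   With OpenMP enabled, iterations are divided among threads and each
   thread's partial sum is combined by the reduction clause. */
static double dot_product(const double *x, const double *y, int n)
{
    double sum = 0.0;
    int i;

    #pragma omp parallel for reduction(+:sum)
    for (i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

The reduction clause gives each thread a private copy of sum, which avoids the storage conflict that would occur if all threads updated one shared variable.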

Intel VTune Performance Analyzer


The Intel VTune Performance Analyzer provides application performance tuning with a graphical user interface and no recompiles required. It is compiler and language independent so it works with C, C++, Fortran, C#, Java, .NET and more. Unlike some products that offer only call graph analysis or only a limited set of sampling events, Intel VTune analyzer offers both with an extensive set of tuning events for all the latest Intel processors. This performance analyzer can locate hotspots in the code, identifying the lines of code where the hotspot exists. Visit the product web site at: http://www.intel.com/cd/software/products/asmo-na/eng/vtune/239144.htm.

Intel Performance Libraries


Intel Performance Libraries are foundation level building blocks for high-performance threading, math, and multimedia applications, and provide consistent performance across all Intel microprocessors. Use the performance libraries to get the most out of today's new multi-core and multi-processor systems.


Intel Integrated Performance Primitives


This highly optimized Intel software library contains audio, video, imaging, cryptography, speech recognition, and signal processing functions and codec component functions for digital media and data-processing applications.

The Intel Math Kernel Library


This library contains highly optimized, extensively threaded, mathematical functions for engineering, scientific, and financial applications that require maximum performance. Visit the Intel performance library products web site at: http://www.intel.com/cd/software/products/asmo-na/eng/perflib/219780.htm.

Intel Threading Analysis Tools


There is a better way to develop threaded software. Intel Thread Checker, Intel Thread Profiler and Intel Threading Building Blocks (Intel TBB) help to thread your application correctly and unleash its true performance on Intel multi-core processor systems.

Intel Thread Checker


The Intel Thread Checker facilitates debugging of multithreaded programs by automatically finding common errors such as storage conflicts, deadlock, API violations, inconsistent variable scope, thread stack overflows, etc. The non-deterministic nature of concurrency errors makes them particularly difficult to find with traditional debuggers. The Intel Thread Checker pinpoints error locations down to the source lines involved and provides stack traces showing the paths taken by the threads to reach the error. It also identifies the variables involved.

Intel Thread Profiler


The Intel Thread Profiler facilitates analysis of applications written using the Windows* threading API, the POSIX threading API, or OpenMP pragmas. The OpenMP Thread Profiler provides details on the time spent in serial regions, parallel regions, and critical sections, and graphically displays performance bottlenecks due to load imbalance, lock contention, and parallel overhead in OpenMP applications. Performance data can be displayed for the whole program, by region, and even down to individual threads. The Windows API or POSIX Threads API Thread Profiler facilitates understanding the threading patterns in multithreaded software through visual depiction of thread hierarchies and their interactions. It also helps identify and compare the performance impact of different synchronization methods, different numbers of threads, or different algorithms. Since the Intel Thread Profiler plugs in to the Intel VTune Performance Analyzer, multiple runs across different numbers of processors can be compared to determine the scalability profile. It also helps locate synchronization constructs that directly impact execution time and correlates them to the corresponding source lines in the application.


Intel Threading Building Blocks (TBB)


Intel TBB is a C++ runtime library that abstracts the low-level threading details necessary for optimal multi-core performance. It uses common C++ templates and coding style to eliminate tedious threading implementation work. Intel TBB requires fewer lines of code to achieve parallelism than other threading models. Applications written with Intel TBB are portable across platforms. Since the library is also inherently scalable, no code maintenance is required as more processor cores become available. An open source version of Intel Threading Building Blocks is also available. There is also a book available for Intel Threading Building Blocks titled: Intel Threading Building Blocks: Outfitting C++ for Multi-core Processor Parallelism. See the Intel Threading Building Blocks product web site for directions on ordering the book. Visit the threading tools products web site at: http://www.intel.com/cd/software/products/asmo-na/eng/threading/219785.htm.

On Chip Debugging Tools


Joint Test Action Group (JTAG) is a transport system used for in-circuit emulators, which are useful for debugging embedded systems. JTAG vendors that support Intel architecture include American Arium*, and as of February 2009 Macraigor Systems LLC* will also provide support for Intel architecture.

Multi-core Solutions
There are several factors that will guide the plan for the multi-core migration. Factors include the starting point (design) of the original source code, as well as migration goals and constraints. Each multi-core method (SMP, AMP, and virtualization) has its own strengths. More operating systems are now providing Symmetric Multiprocessing (SMP), including embedded RTOSs, but SMP requires code to be architected (parallelized) to take advantage of multiple CPUs. For situations where the application is not well suited for parallelization, Asymmetric Multiprocessing (AMP) and virtualization could be a more viable solution for leveraging the extra processing capabilities of multi-core hardware. Employing virtualization and partitioning in the embedded system will enable some benefit to be derived from multi-core processors independent of explicit OS support. However, the ideal situation is to have symmetric multiprocessing and asymmetric multiprocessing, including virtualization, at your disposal.

Asymmetric Multiprocessing
AMP has started to show up in product descriptions for embedded processors. The term refers to the case where multiple OS images are supported on a single CPU consisting of multiple cores, as distinguished from the SMP case where there is a single OS image on the CPU.


AMP requires no application changes to leverage the benefits of multiple cores. It can leverage multiple cores by running multiple instances of the OS and application in separate partitions that are dedicated to specific cores, PCI devices, and system memory areas. AMP requires a boot loader that supports AMP, that is, one that can partition the hardware resources and make OS/application assignments to the partitions. The OS must also meet requirements to support AMP: it must be relocatable, it must be able to restrict its memory region, and it must operate only on its assigned PCI devices.
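The partitioning requirement above can be illustrated with a minimal sketch. It assumes a hypothetical `Partition` record (core mask plus one memory region) and a hypothetical `partitions_are_disjoint` check; neither comes from any particular boot loader, and PCI device assignment is left out for brevity. The point is only that an AMP configuration is valid when no core and no memory range is shared between partitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one AMP partition. The field names are
// illustrative only and are not taken from any specific boot loader.
struct Partition {
    std::uint32_t core_mask;  // bit i set => core i is dedicated to this partition
    std::uint64_t mem_base;   // first byte of the partition's memory region
    std::uint64_t mem_size;   // size of the region in bytes
};

// Returns true when no two partitions share a core or overlap in memory.
bool partitions_are_disjoint(const std::vector<Partition>& parts) {
    for (std::size_t i = 0; i < parts.size(); ++i) {
        // Precondition: a region must not wrap around the address space.
        assert(parts[i].mem_base + parts[i].mem_size >= parts[i].mem_base);
        for (std::size_t j = i + 1; j < parts.size(); ++j) {
            const Partition& a = parts[i];
            const Partition& b = parts[j];
            if ((a.core_mask & b.core_mask) != 0) return false;  // shared core
            const bool mem_overlap = a.mem_base < b.mem_base + b.mem_size &&
                                     b.mem_base < a.mem_base + a.mem_size;
            if (mem_overlap) return false;                       // shared memory
        }
    }
    return true;
}
```

For example, one partition on core 0 with the first 256 MB and a second on core 1 with the next 256 MB would pass this check, while two partitions that both claim core 1 would not.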

Wind River* VxWorks* AMP Support for Intel Architecture


Wind River* will release a VxWorks* AMP solution for Intel architecture in January 2009. The first product will allow versions of VxWorks 5.5 (uniprocessor) and VxWorks 6.7 (SMP) to reside side-by-side on the same system. This solution is the first embedded AMP product for Intel architecture. It allows legacy serial PPC/VxWorks designs to take advantage of Intel architecture multi-core hardware, keeping the legacy software unmodified while running new or future parallelized code in a VxWorks SMP partition.

Symmetric Multiprocessing
SMP operating systems treat all cores as equals and distribute the workload to the available cores. An SMP design is probably the more efficient way to take advantage of multi-core hardware, because it can be written to scale performance automatically as the number of processing cores increases. The tradeoff for the SMP performance and scalability benefit is that writing software for parallel processing can be tricky: the software design must decompose the problem into subproblems that can safely execute simultaneously (threads are used to execute the concurrent processing). For guidelines on multithreading applications see Developing Multithreaded Applications: A Platform Consistent Approach. For symmetric multiprocessing using Wind River* VxWorks*, see Best Practices: Adoption of Symmetric Multiprocessing Using VxWorks and Intel Multi-core Processors White Paper.

OS Based (SMP Affinity): Use processor affinity mechanisms to assign specific cores to specific tasks or threads. This method can improve performance on multiprocessor systems by pinning threads that share data to cores that share cache, which improves data locality and therefore the cache hit rate. Refer to the Intel Software Network article Improved Linux* SMP Scaling: User-directed Processor Affinity for details about user-directed affinity. An example where SMP affinity improved performance on a dual processor multi-core system is the case study Intel performed on an open source intrusion detection application known as SNORT*. Read the case study Supra-linear Packet Processing Performance with Intel Multi-core Processors for more details.

Most of the popular commercial OSs have SMP products, such as Microsoft* Windows* server and client and Linux distributions. However, this isn't the norm for embedded and real-time OSs. Understand the level of SMP support provided by your OS.


RTOS vendors that provide real-time SMP support for Intel architecture are: Green Hills Integrity*, LynuxWorks LynxOS*, QNX* Neutrino*, and Wind River* VxWorks*.
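The OS-based processor affinity approach described in this section can be sketched as follows. This is a minimal, Linux-specific illustration that assumes the glibc `sched_setaffinity` and `sched_getaffinity` calls are available; the helper names (`pin_current_thread_to_core` and so on) are hypothetical, and an RTOS will expose its own affinity API instead.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // exposes the CPU_* macros in <sched.h> on glibc
#endif
#include <cassert>
#include <sched.h>   // Linux-specific: cpu_set_t, sched_setaffinity, sched_getaffinity

// True if the calling thread is currently allowed to run on core_id.
bool current_thread_allowed_on(int core_id) {
    if (core_id < 0 || core_id >= CPU_SETSIZE) return false;
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0) return false;  // 0 = calling thread
    return CPU_ISSET(core_id, &set) != 0;
}

// Lowest-numbered core the calling thread may run on, or -1 on failure.
int first_allowed_core() {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0) return -1;
    for (int c = 0; c < CPU_SETSIZE; ++c) {
        if (CPU_ISSET(c, &set)) return c;
    }
    return -1;
}

// Pin the calling thread to a single core. Returns true on success.
bool pin_current_thread_to_core(int core_id) {
    if (core_id < 0 || core_id >= CPU_SETSIZE) return false;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core_id, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0) return false;
    // Debug-build postcondition: the kernel now reports the core as allowed.
    assert(current_thread_allowed_on(core_id));
    return true;
}
```

In a real design, threads that share data would each call the pinning helper with core numbers chosen so that those cores share a cache; which cores share cache is platform-specific and is not determined by this sketch.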

Virtualization
The beauty of virtualization is that it can bring together the benefits of all multi-core solutions on a single system and extend those benefits with additional features such as security, quality of service (QoS), high availability (HA), and load distribution. Virtualization provides a software management layer that increases software protection between the partitions and provides core management to optimize power efficiency. Basically, the CPU is run as multiple independent partitions, each running its own OS and application. This is a very effective strategy for applications that are constructed from multiple application components that are independent and CPU bound (i.e., not bound by contention for shared resources).

There is no need to make legacy software stack changes when using virtualization to partition multiple OSs to run within virtual machines (VMs). Instead, let the Virtual Machine Manager (VMM) manage the assignment and access between the VMs and platform resources. There are several use cases for partitioning, including: system consolidation, running an RTOS side-by-side with a GPOS (also referred to as OS co-location), and leveraging the additional processing power of multi-core hardware by replicating the applications and OSs across multiple cores.

Intel Virtualization Technology (Intel VT) provides features that make VMM development easier and enhance performance of virtualized systems enabled with the technology. Visit the Intel Product Technologies for Intel Embedded and Communications Applications web site for more information about Intel VT and other technologies: http://www.intel.com/technology/advanced_comm/index.htm.

VMM vendors that support real-time and Intel VT features are: Green Hills Integrity Padded Cell*, LynuxWorks LynxSecure*, TenAsys*, VirtualLogix VLX*, and Wind River*.


Training and Design Information


Intel Software College
The Intel Software College provides training for Intel processors, software development products, multi-core, and software technologies such as parallel programming. There are several classroom, online, and on-demand web-based courses available. Refer to the following web site for more information about the Intel Software College: https://shale.intel.com/softwarecollege/.

Intel Software Network


Intel has a global network of software tools and resources that bring together the proven depth and breadth of Intel's engineering expertise, technology leadership, strategic insight, and global reach, delivering an offering that works better for you. Connect with community developers and engineers within the network forums and blogs, and access software development products, training, and knowledge bases for your software development requirements. Find the Intel Software Network home page at: http://software.intel.com.

Migration Design Guide (Putting It All Together)


Up to this point, this paper discussed the software design areas that need to be considered when migrating from PowerPC* to Intel architecture. The migration methodology provides guidelines for the procedures of the migration. The guide steps through situational decisions, resulting in an outline for the overall migration solution and plan. For each step the Embedded Design Center can be consulted for further design information. The migration has five ordered steps, which cover OS requirements, hardware differences, and software optimization on the target Intel architecture platform:

1. Port the PowerPC* code to the target operating system.
2. Execute code correctly on one Intel architecture core.
3. Optimize the code for performance on one Intel architecture core.
4. Apply multi-core software design updates.
5. Optimize the software design for multi-core Intel architecture performance.


Step 1 Port PowerPC* Code to Target Operating System


If the architecture migration includes a port to a different operating system, complete the port to the new operating system before starting the migration of the PPC software to the Intel architecture hardware. The goal in this step is to ensure that the software behaves correctly on the new OS. Since this step requires stability of the same code running on the target OS, do not make other software design changes in this step. Refer to the Operating Systems section in this paper for more details.

Step 2 Execute Code Correctly on One Intel Architecture Core


1. Update the operating system related code for Intel architecture: Whether the current and target OS are the same or different, device drivers, libraries, and software development tools need to be surveyed and their availability for Intel architecture determined.
   a. Intel Embedded Graphics Drivers (IEGD) can be downloaded at: www.intel.com/go/iegd.
   b. Intel embedded chipset drivers:
      i. Standard desktop, mobile, and server chipset drivers for Microsoft* Windows* (XP or Vista*) can be downloaded at: http://downloadcenter.intel.com/.
      ii. Board support packages (BSPs) for RTOSs are provided by the RTOS vendor.
      iii. Board support packages (BSPs) for Microsoft* Windows CE* can be downloaded from these third party vendors' sites: Adeneo Corporation*, BSQUARE*, Wipro Technologies*.
   c. If any device drivers or libraries are developed in-house, they will need to be rewritten for Intel architecture.
   d. If any third party drivers or libraries are required, check with the third party vendor (TPV) for equivalent Intel architecture products.
   e. Development tools for Intel architecture: See the Intel Software Development Products section for information about Intel tools and visit the products web sites for information on OS support. On Chip Debugging Tools for Intel architecture are supported by American Arium or Macraigor Systems LLC (February 2009).

2. Choose the Method for System Initialization:
   a. BIOS: Choose BIOS and/or UEFI firmware if the design will support multiple standard interfaces and expansion slots, or will host mainstream OSs with a broad set of pre-OS features that are ready to run multiple applications.
   b. Boot Loader: Choose a boot loader for minimal or specialized firmware stacks where requirements might include optimization for speed, size, or specific system requirements, and where only minimal upgrade or expansion capabilities need to be supported. QNX Fastboot Technology is available for Intel Atom™ Processors.


3. If any part of the code is written in assembly, it will need to be updated for IA instructions. Solutions:
   a. Basic assembly instructions: Manually update the basic assembly instructions using the Intel 64 and IA-32 Architectures Software Developer's Manuals.
   b. Vector-oriented code. Solutions:
      i. Manually update vector-oriented code using the AltiVec/SSE Migration Guide.
      ii. Translate the vector-oriented code using the NASoftware* PowerPC*/AltiVec* to Intel/SSE conversion tools.

4. Does the software abstract the memory architecture of the processor?
   a. Yes: The code is endian-neutral. No changes are required.
   b. No: The code will need to be updated for little-endian memory architecture. Manually update the endianness differences in the code. Use the Endianness White Paper as a guide to the required changes.

5. Refer to Table 1 for any other architecture differences that may need software updates.
6. Build, test, and debug the code using one Intel architecture core.
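The endian-neutral case in item 4 can be illustrated with a short sketch. The helper names `load_be32`, `store_be32`, and `host_is_little_endian` are hypothetical and are not taken from the Endianness White Paper. The idea is that data laid out in big-endian order (as on PowerPC) is assembled byte by byte, so the same source produces the same value on a little-endian Intel architecture host.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Read a 32-bit value stored in big-endian byte order. The result is the same
// on big- and little-endian hosts because the bytes are combined arithmetically
// and are never reinterpreted as a native integer.
std::uint32_t load_be32(const unsigned char* p) {
    assert(p != nullptr);
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
           static_cast<std::uint32_t>(p[3]);
}

// Write a 32-bit value in big-endian byte order, most significant byte first.
void store_be32(unsigned char* p, std::uint32_t v) {
    assert(p != nullptr);
    p[0] = static_cast<unsigned char>((v >> 24) & 0xFFu);
    p[1] = static_cast<unsigned char>((v >> 16) & 0xFFu);
    p[2] = static_cast<unsigned char>((v >> 8) & 0xFFu);
    p[3] = static_cast<unsigned char>(v & 0xFFu);
}

// Run-time probe of the host byte order: true on little-endian hosts,
// where the least significant byte is stored at the lowest address.
bool host_is_little_endian() {
    const std::uint32_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}
```

Code that instead casts a byte buffer to a 32-bit pointer and dereferences it is the kind of endian-dependent construct that would need to be found and updated during this step.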

Step 3 Optimize the Code for Performance on One Intel Architecture Core
Although the end product will run on multi-core architecture, performance tuning methodology first requires that serial code be optimized for serial performance.

1. Use the top-down, closed-loop performance methodology, and when applicable use the Intel Software Development Products.
   a. Analyze the performance
      i. Use the Intel VTune Performance Analyzer to pinpoint hotspots in the code where the processing could be distributed between the available cores.
      ii. Use the Intel Thread Profiler to identify any thread imbalances.
   b. Generate alternatives and implement code changes
      i. Use the Intel C++ Compiler and select features that implement advanced optimizations, such as Profile Guided Optimization (PGO), and that tune executable size and power consumption.
      ii. Use the Intel Performance Libraries to increase performance with a variety of APIs that are highly tuned for Intel architecture. Functions include video, imaging, compression, cryptography, audio, speech recognition, and signal processing functions, and codec component functions for digital media and data-processing applications.
   c. Debug the code: Use the Intel Thread Checker to identify threading bugs, such as data race and deadlock conditions.

2. The operating system vendor (OSV) should also provide a set of software development tools. Check with the OSV to understand which tools are available.
3. Use an on-chip debugging tool (JTAG) for low-level debugging at the hardware level and where a high-level debugger would otherwise interfere with timing-critical code.
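The data race mentioned in the debugging step above is easiest to see in a small example. The following is a minimal sketch using only the C++ standard library; the function name `count_with_atomic` is hypothetical. Several threads increment one shared counter; with `std::atomic` each increment is indivisible, whereas a plain `long` in its place would be exactly the kind of unsynchronized shared write that a tool such as Intel Thread Checker is designed to report.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Increment a shared counter from several threads and return the final value.
// std::atomic makes each increment indivisible; replacing it with a plain long
// would introduce a data race and an unpredictable result.
long count_with_atomic(int num_threads, int increments_per_thread) {
    assert(num_threads > 0 && increments_per_thread >= 0);
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) {
        w.join();  // wait for every worker before reading the result
    }
    return counter.load();
}
```

An atomic counter is only one possible fix; a mutex around the update, or per-thread counters combined at the end, are alternatives with different performance tradeoffs.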


Step 4 Apply Multi-core Software Design Updates


Intel multi-core processors are based on Intel Core microarchitecture. There are several ways to benefit from multi-core. PPC migrations will most likely start from serial code bases. Therefore, the target software design needs to identify the solution that meets the migration requirements. SMP can improve application performance and can be designed to scale as the number of processors increases. However, SMP requires analysis to identify opportunities for parallelism in the code and rewriting the source code to introduce the parallelism using multithreading. For CPU-intensive code that is difficult to redesign for parallel processing using SMP and multithreading, AMP could be a good alternative solution.

1. Choose the Multi-core Design
   a. AMP: Choose AMP if the migration requirements specify that no changes can be made to the application or operating system.
   b. SMP: Choose SMP if one operating system will be run, using all of the cores as equal processing resources, and the applications can be parallelized to benefit from SMP systems. SMP affinity can sometimes improve cache hit rates on multiprocessor systems by pinning certain tasks to certain cores to improve data locality.
   c. Virtualization: Choose virtualization for system consolidation, OS co-location, and the additional benefits of features such as security, quality of service (QoS), high availability (HA), and load distribution.
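The rewriting effort that SMP implies, decomposing a serial loop into subproblems that can safely run at the same time, can be sketched as follows. This is a minimal illustration using standard C++ threads rather than any Intel tool; the function name `parallel_sum` and the chunking scheme are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sum a vector by splitting it into contiguous chunks, one per worker thread.
// Each worker writes only its own slot in `partial`, so no locking is needed.
long long parallel_sum(const std::vector<int>& data, unsigned num_threads) {
    assert(num_threads > 0);
    std::vector<long long> partial(num_threads, 0);
    std::vector<std::thread> workers;
    // Round up so that every element falls into some chunk.
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        workers.emplace_back([&data, &partial, t, begin, end] {
            partial[t] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0LL);
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    // Combine the per-thread results serially.
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Passing the core count (for example from `std::thread::hardware_concurrency()`) as `num_threads` lets the same code use more cores as they become available, which is the scaling property the SMP option refers to.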

Step 5 Optimize the Software Design for Multi-core Intel Architecture Performance
Whether the design is SMP or AMP, multi-core software designs require specialized software development tools. For SMP, the tools help identify and implement parallelism in the code and pinpoint threading issues such as race conditions, deadlocks, and thread load imbalances. The tuning methodology is the same as for a uniprocessor, except that the goal is to correctly and efficiently execute multiple processes or threads simultaneously across multiple cores. Multi-core tools help implement parallelism and help tune and debug the parallelized code.

1. Use the top-down, closed-loop performance methodology, and when applicable use the Intel Software Development Products.
   a. Intel VTune Performance Analyzer: Pinpoints hotspots in the code where the processing could be distributed between the available cores.
   b. Intel C++ Compiler: Multi-core features include OpenMP and auto-parallelization.
   c. Intel Performance Libraries: Increase parallelism with threaded APIs that are highly tuned for Intel architecture multi-core performance.
   d. Intel Threading Tools: Implement threads with Intel Threading Building Blocks. Debug threads with Intel Thread Checker. Identify workload imbalances and lock contention of the threads with Intel Thread Profiler.

2. The operating system vendor (OSV) should also provide a set of multi-core development tools. Check with the OSV to understand which tools are available.
3. Use an on-chip debugging tool (JTAG) for low-level debugging at the hardware level and where a high-level debugger would otherwise interfere with timing-critical code.
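As an illustration of the OpenMP support listed for the compiler above, the following minimal sketch parallelizes a dot product with a standard OpenMP `parallel for` directive and a `reduction` clause. The function name `dot` is hypothetical. OpenMP must be enabled with a compiler-specific switch; when it is not, the pragma is ignored and the loop simply runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dot product of two equally sized vectors. With OpenMP enabled, iterations
// are divided among threads and each thread's private partial sum is combined
// by the reduction clause. Without OpenMP, the pragma is ignored and the loop
// runs serially.
double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    const long n = static_cast<long>(a.size());
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; ++i) {
        const std::size_t k = static_cast<std::size_t>(i);
        sum += a[k] * b[k];
    }
    return sum;
}
```

Because the serial loop body is unchanged, this style lets parallelism be introduced incrementally, one hotspot loop at a time. Note that floating-point results can differ slightly between serial and parallel runs, since the reduction changes the order of the additions.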


Conclusion
This paper provided an overview of the software considerations and guidelines for completing a successful PowerPC* to Intel architecture software migration, as well as resources that can assist during the migration software design and implementation. The paper included information about architecture differences, migration tools, system initialization, operating system considerations, Intel software development products, and available training for Intel architecture. Remember, each situation is different and the effort required for the migration depends on the amount of abstraction that is already programmed into the code. Therefore, the migration could be as simple as recompiling the software, or more involved, requiring extra programming for areas of software that are hardware or OS dependent. Completing a successful port involves assessing and understanding the current situation and requirements, and planning each step before the migration begins. Don't forget to visit the Embedded Design Center at http://www.intel.com/embedded/edc for one-stop access to embedded Intel architecture design information.


Authors
Lori M. Matassa is a Software Technical Engineer with Intel.

Acronyms
AMP     Asymmetric Multiprocessing
API     Application Programming Interface
BIOS    Basic Input Output System
CSM     Compatibility Support Module
DSP     Digital Signal Processing
EDC     Intel Embedded Design Center
EDK     EFI Developer Kit
EFI     Extensible Firmware Interface
GPOS    General Purpose Operating System
HA      High Availability
IA      Intel Architecture
IEGD    Intel Embedded Graphics Driver
ISN     Intel Software Network
JTAG    Joint Test Action Group
LSB     Least Significant Bit
OS      Operating System
PCI     Peripheral Component Interconnect
POSIX   Portable Operating System Interface
PPC     PowerPC
QoS     Quality of Service
RTOS    Real-time Operating System
SIMD    Single Instruction, Multiple Data
SMP     Symmetric Multiprocessing
SSE     Streaming SIMD Extensions
UEFI    Unified Extensible Firmware Interface
VM      Virtual Machine
VMM     Virtual Machine Manager


INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTELS TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Intel products are not intended for use in medical, life saving, or life sustaining applications. Intel may make changes to specifications and product descriptions at any time, without notice. This paper is for informational purposes only. THIS DOCUMENT IS PROVIDED "AS IS" WITH NO WARRANTIES WHATSOEVER, INCLUDING ANY WARRANTY OF MERCHANTABILITY, NONINFRINGEMENT, FITNESS FOR ANY PARTICULAR PURPOSE, OR ANY WARRANTY OTHERWISE ARISING OUT OF ANY PROPOSAL, SPECIFICATION OR SAMPLE. Intel disclaims all liability, including liability for infringement of any proprietary rights, relating to use of information in this specification. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted herein. Unless otherwise agreed in writing by Intel, the Intel products are not designed for nor intended for any application in which the failure of the Intel product could create a situation where personal injury or death may occur. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked reserved or undefined. Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. 
Do not finalize a design with this information. The products described in this document may contain design defects or errors known as errata that may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents that have an order number and are referenced in this document or other Intel literature may be obtained by calling 800-548-4725 or by visiting Intel's website. Intel, the Intel logo, Intel Atom, Intel Core, Intel VTune, Intel Threading Tools, Intel C++ Compiler, Intel Thread Profiler, Xeon, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Copyright © 2009, Intel Corporation. All rights reserved.
