You are on page 1of 110

Module 1 Oracle Architecture

Primary Architecture Components

The figure shown above details the Oracle architecture.

Oracle server: An Oracle server includes an Oracle Instance and an Oracle database. An Oracle database includes several different types of files: datafiles, control files, redo log files and archive redo log files. The Oracle server also accesses parameter files and password files. This set of files has several purposes. o One is to enable system users to process SQL statements. o Another is to improve system performance. o Still another is to ensure the database can be recovered if there is a software/hardware failure. The database server must manage large amounts of data in a multi-user environment. The server must manage concurrent access to the same data. The server must deliver high performance. This generally means fast response times.

Oracle instance: An Oracle Instance consists of two different sets of components: The first component set is the set of background processes (PMON, SMON, RECO, DBW0, LGWR, CKPT, D000 and others). o These will be covered later in detail each background process is a computer program. o These processes perform input/output and monitor other Oracle processes to provide good performance and database reliability. The second component set includes the memory structures that comprise the Oracle instance. o When an instance starts up, a memory structure called the System Global Area (SGA) is allocated.

o At this point the background processes also start. An Oracle Instance provides access to one and only one Oracle database.

Oracle database: An Oracle database consists of files. Sometimes these are referred to as operating system files, but they are actually database files that store the database information that a firm or organization needs in order to operate. The redo log files are used to recover the database in the event of application program failures, instance failures and other minor failures. The archived redo log files are used to recover the database if a disk fails. Other files not shown in the figure include: o The required parameter file that is used to specify parameters for configuring an Oracle instance when it starts up. o The optional password file authenticates special users of the database these are termed privileged users and include database administrators. o Alert and Trace Log Files these files store information about errors and actions taken that affect the configuration of the database.

User and server processes: The processes shown in the figure are called user and server processes. These processes are used to manage the execution of SQL statements. A Shared Server Process can share memory and variable processing for multiple user processes. A Dedicated Server Process manages memory and variables for a single user process.

This figure from the Oracle Database Administration Guide provides another way of viewing the SGA.

Connecting to an Oracle Instance Creating a Session

System users can connect to an Oracle database through SQLPlus or through an application program like the Internet Developer Suite (the program becomes the system user). This connection enables users to execute SQL statements.

The act of connecting creates a communication pathway between a user process and an Oracle Server. As is shown in the figure above, the User Process communicates with the Oracle Server through a Server Process. The User Process executes on the client computer. The Server Process executes on the server computer, and actually executes SQL statements submitted by the system user.

The figure shows a one-to-one correspondence between the User and Server Processes. This is called a Dedicated Server connection. An alternative configuration is to use a Shared Server where more than one User Process shares a Server Process.

Sessions: When a user connects to an Oracle server, this is termed a session. The User Global Area is session memory and these memory structures are described later in this document. The session starts when the Oracle server validates the user for connection. The session ends when the user logs out (disconnects) or if the connection terminates abnormally (network failure or client computer failure). A user can typically have more than one concurrent session, e.g., the user may connect using SQLPlus and also connect using Internet Developer Suite tools at the same time. The limit of concurrent session connections is controlled by the DBA. If a system users attempts to connect and the Oracle Server is not running, the system user receives the Oracle Not Available error message.

Physical Structure Database Files


As was noted above, an Oracle database consists of physical files. The database itself has: Datafiles these contain the organization's actual data. Redo log files these contain a chronological record of changes made to the database, and enable recovery when failures occur. Control files these are used to synchronize all database activities and are covered in more detail in a later module.

Other key files as noted above include: Parameter file there are two types of parameter files. o The init.ora file (also called the PFILE) is a static parameter file. It contains parameters that specify how the database instance is to start up. For example, some parameters will specify how to allocate memory to the various parts of the system global area. o The spfile.ora is a dynamic parameter file. It also stores parameters to specify how to startup a database; however, its parameters can be modified while the database is running. Password file specifies which *special* users are authenticated to startup/shut down an Oracle Instance. Archived redo log files these are copies of the redo log files and are necessary for recovery in an online, transaction-processing environment in the event of a disk failure.

Memory Structure
The memory structures include three areas of memory: System Global Area (SGA) this is allocated when an Oracle Instance starts up. Program Global Area (PGA) this is allocated when a Server Process starts up. User Global Area (UGA) this is allocated when a user connects to create a session.

System Global Area


The SGA is a read/write memory area that stores information shared by all database processes and by all users of the database (sometimes it is called the Shared Global Area). o This information includes both organizational data and control information used by the Oracle Server. o The SGA is allocated in memory and virtual memory. o The size of the SGA can be established by a DBA by assigning a value to the parameter SGA_MAX_SIZE in the parameter filethis is an optional parameter. The SGA is allocated when an Oracle instance (database) is started up based on values specified in the initialization parameter file (either PFILE or SPFILE). The SGA has the following mandatory memory structures: Database Buffer Cache Redo Log Buffer Java Pool Streams Pool Shared Pool includes two components: o Library Cache

o Data Dictionary Cache Other structures (for example, lock and latch management, statistical data)

Additional optional memory structures in the SGA include: Large Pool

The SHOW SGA SQL command will show you the SGA memory allocations. This is a recent clip of the SGA for the DBORCL database at SIUE. In order to execute SHOW SGA you must be connected with the special privilege SYSDBA (which is only available to user accounts that are members of the DBA Linux group).

SQL> connect / as sysdba Connected. SQL> show sga

Total System Global Area 1610612736 bytes Fixed Size Variable Size Database Buffers Redo Buffers 2084296 bytes 1006633528 bytes 587202560 bytes 14692352 bytes

Early versions of Oracle used a Static SGA. This meant that if modifications to memory management were required, the database had to be shutdown, modifications were made to the init.ora parameter file, and then the database had to be restarted. Oracle 9i, 10g, and 11g use a Dynamic SGA. Memory configurations for the system global area can be made without shutting down the database instance. The advantage is obvious. This allows the DBA to resize the Database Buffer Cache and Shared Pool dynamically.

Several initialization parameters are set that affect the amount of random access memory dedicated to the SGA of an Oracle Instance. These are:

SGA_MAX_SIZE: This optional parameter is used to set a limit on the amount of virtual memory allocated to the SGA a typical setting might be 1 GB; however, if the value for SGA_MAX_SIZE in the initialization parameter file or server parameter file is less than the sum the memory allocated for all components, either explicitly in the parameter file or by default, at the time the instance is initialized, then the database ignores the setting for SGA_MAX_SIZE. For optimal performance, the entire SGA should fit in real memory to eliminate paging to/from disk by the operating system. DB_CACHE_SIZE: This optional parameter is used to tune the amount memory allocated to the Database Buffer Cache in standard database blocks. Block sizes vary among operating systems. The DBORCL database uses 8 KB blocks. The total blocks in the cache defaults to 48 MB on LINUX/UNIX and 52 MB on Windows operating systems. LOG_BUFFER: This optional parameter specifies the number of bytes allocated for the Redo Log Buffer. SHARED_POOL_SIZE: This optional parameter specifies the number of bytes of memory allocated to shared SQL and PL/SQL. The default is 16 MB. If the operating system is based on a 64 bit configuration, then the default size is 64 MB. LARGE_POOL_SIZE: This is an optional memory object the size of the Large Pool defaults to zero. If the init.ora parameter PARALLEL_AUTOMATIC_TUNING is set to TRUE, then the default size is automatically calculated. JAVA_POOL_SIZE: This is another optional memory object. The default is 24 MB of memory.

The size of the SGA cannot exceed the parameter SGA_MAX_SIZE minus the combination of the size of the additional parameters, DB_CACHE_SIZE, LOG_BUFFER, SHARED_POOL_SIZE, LARGE_POOL_SIZE, and JAVA_POOL_SIZE.

Memory is allocated to the SGA as contiguous virtual memory in units termed granules. Granule size depends on the estimated total size of the SGA, which as was noted above, depends on the SGA_MAX_SIZE parameter. Granules are sized as follows: If the SGA is less than 1 GB in total, each granule is 4 MB.

If the SGA is greater than 1 GB in total, each granule is 16 MB.

Granules are assigned to the Database Buffer Cache, Shared Pool, Java Pool, and other memory structures, and these memory components can dynamically grow and shrink. Using contiguous memory improves system performance. The actual number of granules assigned to one of these memory components can be determined by querying the database view named V$BUFFER_POOL. Granules are allocated when the Oracle server starts a database instance in order to provide memory addressing space to meet the SGA_MAX_SIZE parameter. The minimum is 3 granules: one each for the fixed SGA, Database Buffer Cache, and Shared Pool. In practice, you'll find the SGA is allocated much more memory than this. The SELECT statement shown below shows a current_size of 1,152 granules.

SELECT name, block_size, current_size, prev_size, prev_buffers FROM v$buffer_pool;

NAME PREV_BUFFERS

BLOCK_SIZE CURRENT_SIZE

PREV_SIZE

-------------------- ---------- ------------ ---------- ----------DEFAULT 71244 8192 560 576

Program Global Area

The Program Global Area is also termed the Process Global Area (PGA) and is a part of memory allocated that is outside of the Oracle Instance.

It stores data and control information for a single Server Process or a single Background Process. It is allocated when a process is created and the memory is scavenge by the operating system when the process terminates. This is NOT a shared part of memory one PGA to each process only.

The content of the PGA varies, but as shown in the figure above, generally includes the following:

Private SQL Area: Stores information for a parsed SQL statement stores bind variable values and runtime memory allocations. A user session issuing SQL statements has a Private SQL Area that may be associated with a Shared SQL Area if the same SQL statement is being executed by more than one system user. This often happens in OLTP environments where many users are executing and using the same application program. o Dedicated Server environment the Private SQL Area is located in the Program Global Area. o Shared Server environment the Private SQL Area is located in the System Global Area.

Session Memory: Memory that holds session variables and other session information.

SQL Work Areas: Memory allocated for sort, hash-join, bitmap merge, and bitmap create types of operations.

o Oracle 9i and later versions enable automatic sizing of the SQL Work Areas by setting the WORKAREA_SIZE_POLICY = AUTO parameter (this is the default!) and PGA_AGGREGATE_TARGET = n (where n is some amount of memory established by the DBA). However, the DBA can let the Oracle DBMS determine the appropriate amount of memory.

User Global Area


The User Global Area is session memory.

A session that loads a PL/SQL package into memory has the package state stored to the UGA. The package state is the set of values stored in all the package variables at a specific time. The state changes as program code the variables. By default, package variables are unique to and persist for the life of the session. The OLAP page pool is also stored in the UGA. This pool manages OLAP data pages, which are equivalent to data blocks. The page pool is allocated at the start of an OLAP session and released at the end of the session. An OLAP session opens automatically whenever a user queries a dimensional object such as a cube. Note: Oracle OLAP is a multidimensional analytic engine embedded in Oracle Database 11g. Oracle OLAP cubes deliver sophisticated

calculations using simple SQL queries - producing results with speed of thought response times. The UGA must be available to a database session for the life of the session. For this reason, the UGA cannot be stored in the PGA when using a shared server connection because PGA is specific to a single process. Therefore, the UGA is stored in the SGA when using shared server connections, enabling any shared server process access to it. When using a dedicated server connection, the UGA is stored in the PGA.

Automatic Shared Memory Management


Prior to Oracle 10G, a DBA had to manually specify SGA Component sizes through the initialization parameters, such as SHARED_POOL_SIZE, DB_CACHE_SIZE, JAVA_POOL_SIZE, and LARGE_POOL_SIZE parameters.

Automatic Shared Memory Management enables a DBA to specify the total SGA memory available through the SGA_TARGET initialization parameter. The Oracle Database automatically distributes this memory among various subcomponents to ensure most effective memory utilization. The DBORCL database SGA_TARGET is set in the initDBORCL.ora file: sga_target=1610612736 With automatic SGA memory management, the different SGA components are flexibly sized to adapt to the SGA available. Setting a single parameter simplifies the administration task the DBA only specifies the amount of SGA memory available to an instance the DBA can forget about the sizes of individual components. No out of memory errors are generated unless the system has actually run out of memory. No manual tuning effort is needed. The SGA_TARGET initialization parameter reflects the total size of the SGA and includes memory for the following components:

Fixed SGA and other internal allocations needed by the Oracle Database instance The log buffer

The shared pool The Java pool The buffer cache The keep and recycle buffer caches (if specified) Nonstandard block size buffer caches (if specified) The Streams Pool

If SGA_TARGET is set to a value greater than SGA_MAX_SIZE at startup, then the SGA_MAX_SIZE value is bumped up to accommodate SGA_TARGET. When you set a value for SGA_TARGET, Oracle Database 10g automatically sizes the most commonly configured components, including:

The shared pool (for SQL and PL/SQL execution) The Java pool (for Java execution state) The large pool (for large allocations such as RMAN backup buffers) The buffer cache

There are a few SGA components whose sizes are not automatically adjusted. The DBA must specify the sizes of these components explicitly, if they are needed by an application. Such components are:

Keep/Recycle buffer caches (controlled by DB_KEEP_CACHE_SIZE and DB_RECYCLE_CACHE_SIZE) Additional buffer caches for non-standard block sizes (controlled by DB_nK_CACHE_SIZE, n = {2, 4, 8, 16, 32}) Streams Pool (controlled by the new parameter STREAMS_POOL_SIZE)

The granule size that is currently being used for the SGA for each component can be viewed in the view V$SGAINFO. The size of each component and the time and type of the last resize operation performed on each component can be viewed in the view V$SGA_DYNAMIC_COMPONENTS.

SQL> select * from v$sgainfo; More...

NAME

BYTES RES

-------------------------------- ---------- --Fixed SGA Size 2084296 No

Redo Buffers Buffer Cache Size Shared Pool Size Large Pool Size Java Pool Size Streams Pool Size Granule Size Maximum SGA Size Startup overhead in Shared Pool Free SGA Memory Available 11 rows selected.

14692352 No 587202560 Yes 956301312 Yes 16777216 Yes 33554432 Yes93 0 Yes 16777216 No 1610612736 No 67108864 No 0

Shared Pool

The Shared Pool is a memory structure that is shared by all system users. It caches various types of program data. For example, the shared pool stores parsed SQL, PL/SQL code, system parameters, and data dictionary information. The shared pool is involved in almost every operation that occurs in the database. For example, if a user executes a SQL statement, then Oracle Database accesses the shared pool. It consists of both fixed and variable structures. The variable component grows and shrinks depending on the demands placed on memory size by system users and application programs.

Memory can be allocated to the Shared Pool by the parameter SHARED_POOL_SIZE in the parameter file. The default value of this parameter is 8MB on 32-bit platforms and 64MB on 64bit platforms. Increasing the value of this parameter increases the amount of memory reserved for the shared pool.

You can alter the size of the shared pool dynamically with the ALTER SYSTEM SET command. An example command is shown in the figure below. You must keep in mind that the total memory allocated to the SGA is set by the SGA_TARGET parameter (and may also be limited by the SGA_MAX_SIZE if it is set), and since the Shared Pool is part of the SGA, you cannot exceed the maximum size of the SGA. It is recommended to let Oracle optimize the Shared Pool size. The Shared Pool stores the most recently executed SQL statements and used data definitions. This is because some system users and application programs will tend to execute the same SQL statements often. Saving this information in memory can improve system performance. The Shared Pool includes several cache areas described below.

Library Cache
Memory is allocated to the Library Cache whenever an SQL statement is parsed or a program unit is called. This enables storage of the most recently used SQL and PL/SQL statements. If the Library Cache is too small, the Library Cache must purge statement definitions in order to have space to load new SQL and PL/SQL statements. Actual management of this memory structure is through a Least-Recently-Used (LRU) algorithm. This means that the SQL and PL/SQL statements that are oldest and least recently used are purged when more storage space is needed. The Library Cache is composed of two memory subcomponents: Shared SQL: This stores/shares the execution plan and parse tree for SQL statements, as well as PL/SQL statements such as functions, packages, and triggers. If a system user executes an identical statement, then the statement does not have to be parsed again in order to execute the statement. Private SQL Area: With a shared server, each session issuing a SQL statement has a private SQL area in its PGA. o Each user that submits the same statement has a private SQL area pointing to the same shared SQL area. o Many private SQL areas in separate PGAs can be associated with the same shared SQL area. o This figure depicts two different client processes issuing the same SQL statement the parsed solution is already in the Shared SQL Area.

Data Dictionary Cache

The Data Dictionary Cache is a memory structure that caches data dictionary information that has been recently used. This cache is necessary because the data dictionary is accessed so often. Information accessed includes user account information, datafile names, table descriptions, user privileges, and other information.

The database server manages the size of the Data Dictionary Cache internally and the size depends on the size of the Shared Pool in which the Data Dictionary Cache resides. If the size is too small, then the data dictionary tables that reside on disk must be queried often for information and this will slow down performance.

Server Result Cache


The Server Result Cache holds result sets and not data blocks. The server result cache contains the SQL query result cache and PL/SQL function result cache, which share the same infrastructure.

SQL Query Result Cache


This cache stores the results of queries and query fragments. Using the cache results for future queries tends to improve performance. For example, suppose an application runs the same SELECT statement repeatedly. If the results are cached, then the database returns them immediately. In this way, the database avoids the expensive operation of rereading blocks and recomputing results.

PL/SQL Function Result Cache


The PL/SQL Function Result Cache stores function result sets. Without caching, 1000 calls of a function at 1 second per call would take 1000 seconds. With caching, 1000 function calls with the same inputs could take 1 second total. Good candidates for result caching are frequently invoked functions that depend on relatively static data. PL/SQL function code can specify that results be cached.

Buffer Caches
A number of buffer caches are maintained in memory in order to improve system response time.

Database Buffer Cache


The Database Buffer Cache is a fairly large memory object that stores the actual data blocks that are retrieved from datafiles by system queries and other data manipulation language commands. A query causes a Server Process to first look in the Database Buffer Cache to determine if the requested information happens to already be located in memory thus the information would not need to be retrieved from disk and this would speed up performance. If the information is not in the Database Buffer Cache, the Server Process retrieves the information from disk and stores it to the cache. Keep in mind that information read from disk is read a block at a time, not a row at a time, because a database block is the smallest addressable storage space on disk. Database blocks are kept in the Database Buffer Cache according to a Least Recently Used (LRU) algorithm and are aged out of memory if a buffer cache block is not used in order to provide space for the insertion of newly needed database blocks. The buffers in the cache are organized in two lists: the write list and, the least recently used (LRU) list.

The write list holds dirty buffers these are buffers that hold that data that has been modified, but the blocks have not been written back to disk. The LRU list holds free buffers, pinned buffers, and dirty buffers that have not yet been moved to the write list. Free buffers do not contain any useful data and are available for use. Pinned buffers are currently being accessed. When an Oracle process accesses a buffer, the process moves the buffer to the most recently used (MRU) end of the LRU list this causes dirty buffers to age toward the LRU end of the LRU list. When an Oracle user process needs a data row, it searches for the data in the database buffer cache because memory can be searched more quickly than hard disk can be accessed. If the data row is already in the cache (a cache hit), the process reads the data from memory; otherwise a cache miss occurs and data must be read from hard disk into the database buffer cache. Before reading a data block into the cache, the process must first find a free buffer. The process searches the LRU list, starting at the LRU end of the list. The search continues until a free buffer is found or until the search reaches the threshold limit of buffers.

Each time a user process finds a dirty buffer as it searches the LRU, that buffer is moved to the write list and the search for a free buffer continues. When a user process finds a free buffer, it reads the data block from disk into the buffer and moves the buffer to the MRU end of the LRU list. If an Oracle user process searches the threshold limit of buffers without finding a free buffer, the process stops searching the LRU list and signals the DBWn background process to write some of the dirty buffers to disk. This frees up some buffers. The block size for a database is set when a database is created and is determined by the init.ora parameter file parameter named DB_BLOCK_SIZE. Typical block sizes are 2KB, 4KB, 8KB, 16KB, and 32KB. The size of blocks in the Database Buffer Cache matches the block size for the database. The DBORCL database uses an 8KB block size. This figure shows that the use of non-standard block sizes results in multiple database buffer cache memory allocations.

Because tablespaces that store oracle tables can use different (non-standard) block sizes, there can be more than one Database Buffer Cache allocated to match block sizes in the cache with the block sizes in the non-standard tablespaces. The size of the Database Buffer Caches can be controlled by the parameters DB_CACHE_SIZE and DB_nK_CACHE_SIZE to dynamically change the memory allocated to the caches without restarting the Oracle instance. You can dynamically change the size of the Database Buffer Cache with the ALTER SYSTEM command like the one shown here: ALTER SYSTEM SET DB_CACHE_SIZE = 96M; You can have the Oracle Server gather statistics about the Database Buffer Cache to help you size it to achieve an optimal workload for the memory allocation. This information is displayed from the V$DB_CACHE_ADVICE view. In order for statistics to be gathered, you can dynamically alter the system by using the ALTER SYSTEM SET DB_CACHE_ADVICE (OFF, ON, READY) command. However, gathering statistics on system performance always incurs some overhead that will slow down system performance. SQL> ALTER SYSTEM SET db_cache_advice = ON; System altered. SQL> DESC V$DB_cache_advice; Name ID NAME BLOCK_SIZE ADVICE_STATUS SIZE_FOR_ESTIMATE SIZE_FACTOR BUFFERS_FOR_ESTIMATE ESTD_PHYSICAL_READ_FACTOR NUMBER VARCHAR2(20) NUMBER VARCHAR2(3) NUMBER NUMBER NUMBER NUMBER Null? Type

----------------------------------------- -------- ------------

ESTD_PHYSICAL_READS ESTD_PHYSICAL_READ_TIME ESTD_PCT_OF_DB_TIME_FOR_READS ESTD_CLUSTER_READS ESTD_CLUSTER_READ_TIME

NUMBER NUMBER NUMBER NUMBER NUMBER

SQL> SELECT name, block_size, advice_status FROM v$db_cache_advice; NAME BLOCK_SIZE ADV

-------------------- ---------- --DEFAULT <more rows will display> 21 rows selected. SQL> ALTER SYSTEM SET db_cache_advice = OFF; System altered. 8192 ON

KEEP Buffer Pool


This pool retains blocks in memory (data from tables) that are likely to be reused throughout daily processing. An example might be a table containing user names and passwords or a validation table of some type. The DB_KEEP_CACHE_SIZE parameter sizes the KEEP Buffer Pool.

RECYCLE Buffer Pool


This pool is used to store table data that is unlikely to be reused throughout daily processing thus the data blocks are quickly removed from memory when not needed.

The DB_RECYCLE_CACHE_SIZE parameter sizes the Recycle Buffer Pool.

Redo Log Buffer

The Redo Log Buffer memory object stores images of all changes made to database blocks. Database blocks typically store several table rows of organizational data. This means that if a single column value from one row in a block is changed, the block image is stored. Changes include INSERT, UPDATE, DELETE, CREATE, ALTER, or DROP.

LGWR writes redo sequentially to disk while DBWn performs scattered writes of data blocks to disk. o Scattered writes tend to be much slower than sequential writes. o Because LGWR enable users to avoid waiting for DBWn to complete its slow writes, the database delivers better performance.

The Redo Log Buffer as a circular buffer that is reused over and over. As the buffer fills up, copies of the images are stored to the Redo Log Files that are covered in more detail in a later module.

Large Pool
The Large Pool is an optional memory structure that primarily relieves the memory burden placed on the Shared Pool. The Large Pool is used for the following tasks if it is allocated: Allocating space for session memory requirements from the User Global Area where a Shared Server is in use. Transactions that interact with more than one database, e.g., a distributed database scenario. Backup and restore operations by the Recovery Manager (RMAN) process. o RMAN uses this only if the BACKUP_DISK_IO = n and BACKUP_TAPE_IO_SLAVE = TRUE parameters are set. o If the Large Pool is too small, memory allocation for backup will fail and memory will be allocated from the Shared Pool. Parallel execution message buffers for parallel server operations. The PARALLEL_AUTOMATIC_TUNING = TRUE parameter must be set.

The Large Pool size is set with the LARGE_POOL_SIZE parameter this is not a dynamic parameter. It does not use an LRU list to manage memory.

Java Pool

The Java Pool is an optional memory object, but is required if the database has Oracle Java installed and in use for Oracle JVM (Java Virtual Machine). The size is set with the JAVA_POOL_SIZE parameter that defaults to 24MB. The Java Pool is used for memory allocation to parse Java commands and to store data associated with Java commands. Storing Java code and data in the Java Pool is analogous to SQL and PL/SQL code cached in the Shared Pool.

Streams Pool
This pool stores data and control structures to support the Oracle Streams feature of Oracle Enterprise Edition. Oracle Steams manages sharing of data and events in a distributed environment. It is sized with the parameter STREAMS_POOL_SIZE. If STEAMS_POOL_SIZE is not set or is zero, the size of the pool grows dynamically.

Processes
You need to understand three different types of Processes: User Process: Starts when a database user requests to connect to an Oracle Server. Server Process: Establishes the Connection to an Oracle Instance when a User Process requests connection makes the connection for the User Process. Background Processes: These start when an Oracle Instance is started up.

Client Process

In order to use Oracle, you must obviously connect to the database. This must occur whether you're using SQLPlus, an Oracle tool such as Designer or Forms, or an application program. The client process is also termed the user process in some Oracle documentation.

This generates a User Process (a memory object) that generates programmatic calls through your user interface (SQLPlus, Integrated Developer Suite, or application program) that creates a session and causes the generation of a Server Process that is either dedicated or shared.

Server Process

A Server Process is the go-between for a Client Process and the Oracle Instance. Dedicated Server environment there is a single Server Process to serve each Client Process. Shared Server environment a Server Process can serve several User Processes, although with some performance reduction. Allocation of server process in a dedicated environment versus a shared environment is covered in further detail in the Oracle11g Database Performance Tuning course offered by Oracle Education.

Background Processes

As is shown here, there are both mandatory and optional background processes that are started whenever an Oracle Instance starts up. These background processes serve all system users. We will cover mandatory process in detail.

Mandatory Background Processes Process Monitor Process (PMON) System Monitor Process (SMON) Database Writer Process (DBWn) Log Writer Process (LGWR) Checkpoint Process (CKPT) Manageability Monitor Processes (MMON and MMNL) Recover Process (RECO)

Optional Processes Archiver Process (ARCn) Coordinator Job Queue (CJQ0) Dispatcher (number nnn) (Dnnn) Others

Optional Background Process Definition: ARCn: Archiver One or more archiver processes copy the online redo log files to archival storage when they are full or a log switch occurs. CJQ0: Coordinator Job Queue This is the coordinator of job queue processes for an instance. It monitors the JOB$ table (table of jobs in the job queue) and starts job queue processes (Jnnn) as needed to execute jobs The Jnnn processes execute job requests created by the DBMS_JOBS package. Dnnn: Dispatcher number "nnn", for example, D000 would be the first dispatcher process Dispatchers are optional background processes, present only when the shared server configuration is used. Shared server is discussed in your readings on the topic "Configuring Oracle for the Shared Server".

Of these, you will most often use ARCn (archiver) when you automatically archive redo log file information (covered in a later module).

PMON
The Process Monitor (PMON) is a cleanup type of process that cleans up after failed processes such as the dropping of a user connection due to a network failure or the abnormal termination (ABEND) of a user application program. It does the tasks shown in the figure below.

SMON
The System Monitor (SMON) is responsible for instance recovery by applying entries in the online redo log files to the datafiles. It also performs other activities as outlined in the figure shown below.

If an Oracle Instance fails, all information in memory not written to disk is lost. SMON is responsible for recovering the instance when the database is started up again. It does the following: Rolls forward to recover data that was recorded in a Redo Log File, but that had not yet been recorded to a datafile by DBWn. SMON reads the Redo Log Files and applies the changes to the data blocks. This recovers all transactions that were committed because these were written to the Redo Log Files prior to system failure. Opens the database to allow system users to logon. Rolls back uncommitted transactions.

SMON also does limited space management. It combines (coalesces) adjacent areas of free space in the database's datafiles for tablespaces that are dictionary managed.

It also deallocates temporary segments to create free space in the datafiles.

DBWn (also called DBWR in earlier Oracle Versions)


The Database Writer writes modified blocks from the database buffer cache to the datafiles. Although one database writer process (DBW0) is sufficient for most systems, you can configure up to 20 DBWn processes (DBW0 through DBW9 and DBWa through DBWj) in order to improve write performance for a system that modifies data heavily.

The initialization parameter DB_WRITER_PROCESSES specifies the number of DBWn processes.

The purpose of DBWn is to improve system performance by caching writes of database blocks from the Database Buffer Cache back to datafiles. Blocks that have been modified and that need to be written back to disk are termed "dirty blocks." The DBWn also ensures that there are enough free buffers in the Database Buffer Cache to service Server Processes that may be reading data from datafiles into the Database Buffer Cache. Performance improves because by delaying writing changed database blocks back to disk, a Server Process may find the data that is needed to meet a User Process request already residing in memory!

DBWn writes to datafiles when one of these events occurs that is illustrated in the figure below.

LGWR
The Log Writer (LGWR) writes contents from the Redo Log Buffer to the Redo Log File that is in use. These are sequential writes since the Redo Log Files record database modifications based on the actual time that the modification takes place. LGWR actually writes before the DBWn writes and only confirms that a COMMIT operation has succeeded when the Redo Log Buffer contents are successfully written to disk. LGWR can also call the DBWn to write contents of the Database Buffer Cache to disk. The LGWR writes according to the events illustrated in the figure shown below.

CKPT
The Checkpoint (CPT) process writes information to update the database control files and headers of datafiles to identify the point in time with regard to the Redo Log Files where instance recovery is to begin should it be necessary. This is done at a minimum, once every three seconds.

Think of a checkpoint record as a starting point for recovery. DBWn will have completed writing all buffers from the Database Buffer Cache to disk prior to the checkpoint, thus those records will not require recovery. This does the following: Ensures modified data blocks in memory are regularly written to disk CKPT can call the DBWn process in order to ensure this and does so when writing a checkpoint record. Reduces Instance Recovery time by minimizing the amount of work needed for recovery since only Redo Log File entries processed since the last checkpoint require recovery. Causes all committed data to be written to datafiles during database shutdown.

If a Redo Log File fills up and a switch is made to a new Redo Log File (this is covered in more detail in a later module), the CKPT process also writes checkpoint information into the headers of the datafiles.

Checkpoint information written to control files includes the system change number (the SCN is a number stored in the control file and in the headers of the database files that are used to ensure that all files in the system are synchronized), location of which Redo Log File is to be used for recovery, and other information. CKPT does not write data blocks or redo blocks to disk it calls DBWn and LGWR as necessary.

MMON and MMNL


The Manageability Monitor Process (MMNO) performs tasks related to the Automatic Workload Repository (AWR) a repository of statistical data in the SYSAUX tablespace (see figure below) for example, MMON writes when a metric violates its threshold value, taking snapshots, and capturing statistics value for recently modified SQL objects.

The Manageability Monitor Lite Process (MMNL) writes statistics from the Active Session History (ASH) buffer in the SGA to disk. MMNL writes to disk when the ASH buffer is full.

The information stored by these processes is used for performance tuning we survey performance tuning in a later module.

RECO
The Recoverer Process (RECO) is used to resolve failures of distributed transactions in a distributed database. Consider a database that is distributed on two servers one in St. Louis and one in Chicago. Further, the database may be distributed on servers of two different operating systems, e.g. LINUX and Windows. The RECO process of a node automatically connects to other databases involved in an in-doubt distributed transaction. When RECO reestablishes a connection between the databases, it automatically resolves all indoubt transactions, removing from each database's pending transaction table any rows that correspond to the resolved transactions.

ARCn
While the Archiver (ARCn) is an optional background process, we cover it in more detail because it is almost always used for production systems storing mission critical information. The ARCn process must be used to recover from loss of a physical disk drive for systems that are "busy" with lots of transactions being completed.

When a Redo Log File fills up, Oracle switches to the next Redo Log File. The DBA creates several of these and the details of creating them are covered in a later module. If all Redo Log Files fill up, then Oracle switches back to the first one and uses them in a round-robin fashion by overwriting ones that have already been used it should be obvious that the information stored on the files, once overwritten, is lost forever.

If ARCn is in what is termed ARCHIVELOG mode, then as the Redo Log Files fill up, they are individually written to Archived Redo Log Files and LGWR does not overwrite a Redo Log File until archiving has completed. Thus, committed data is not lost forever and can be recovered in the event of a disk failure. Only the contents of the SGA will be lost if an Instance fails.

In NOARCHIVELOG mode, the Redo Log Files are overwritten and not archived. Recovery can only be made to the last full backup of the database files. All committed transactions after the last full backup are lost, and you can see that this could cost the firm a lot of $$$.

When running in ARCHIVELOG mode, the DBA is responsible to ensure that the Archived Redo Log Files do not consume all available disk space! Usually after two complete backups are made, any Archived Redo Log Files for prior backups are deleted.

Logical Structure
It is helpful to understand how an Oracle database is organized in terms of a logical structure that is used to organize physical objects.

Tablespace: An Oracle database must always consist of at least two tablespaces (SYSTEM and SYSAUX), although a typical Oracle database will multiple tablespaces. A tablespace is a logical storage facility (a logical container) for storing objects such as tables, indexes, sequences, clusters, and other database objects. Each tablespace has at least one physical datafile that actually stores the tablespace at the operating system level. A large tablespace may have more than one datafile allocated for storing objects assigned to that tablespace.

A tablespace belongs to only one database. Tablespaces can be brought online and taken offline for purposes of backup and management, except for the SYSTEM tablespace that must always be online. Tablespaces can be in either read-only or read-write status.

Datafile: Tablespaces are stored in datafiles which are physical disk objects. A datafile can only store objects for a single tablespace, but a tablespace may have more than one datafile this happens when a disk drive device fills up and a tablespace needs to be expanded, then it is expanded to a new disk drive. The DBA can change the size of a datafile to make it smaller or later. The file can also grow in size dynamically as the tablespace grows.

Segment: When logical storage objects are created within a tablespace, for example, an employee table, a segment is allocated to the object. Obviously a tablespace typically has many segments. A segment cannot span tablespaces but can span datafiles that belong to a single tablespace.

Extent: Each object has one segment which is a physical collection of extents. Extents are simply collections of contiguous disk storage blocks. A logical storage object such as a table or index always consists of at least one extent ideally the initial extent allocated to an object will be large enough to store all data that is initially loaded. As a table or index grows, additional extents are added to the segment. A DBA can add extents to segments in order to tune performance of the system. An extent cannot span a datafile.

Block: The Oracle Server manages data at the smallest unit in what is termed a block or data block. Data are actually stored in blocks.

A physical block is the smallest addressable location on a disk drive for read/write operations. An Oracle data block consists of one or more physical blocks (operating system blocks) so the data block, if larger than an operating system block, should be an even multiple of the operating system block size, e.g., if the Linux operating system block size is 2K or 4K, then the Oracle data block should be 2K, 4K, 8K, 16K, etc in size. This optimizes I/O. The data block size is set at the time the database is created and cannot be changed. It is set with the DB_BLOCK_SIZE parameter. The maximum data block size depends on the operating system. Thus, the Oracle database architecture includes both logical and physical structures as follows: Physical: Control files; Redo Log Files; Datafiles; Operating System Blocks. Logical: Tablespaces; Segments; Extents; Data Blocks.

SQL Statement Processing


SQL Statements are processed differently depending on whether the statement is a query, data manipulation language (DML) to update, insert, or delete a row, or data definition language (DDL) to write information to the data dictionary.

Processing a query: Parse: o Search for identical statement in the Shared SQL Area. o Check syntax, object names, and privileges. o Lock objects used during parse. o Create and store execution plan. Bind: Obtains values for variables.

Execute: Process statement. Fetch: Return rows to user process.

Processing a DML statement: Parse: Same as the parse phase used for processing a query. Bind: Same as the bind phase used for processing a query. Execute: o If the data and undo blocks are not already in the Database Buffer Cache, the server process reads them from the datafiles into the Database Buffer Cache. o The server process places locks on the rows that are to be modified. The undo block is used to store the before image of the data, so that the DML statements can be rolled back if necessary. o The data blocks record the new values of the data. o The server process records the before image to the undo block and updates the data block. Both of these changes are made in the Database Buffer Cache. Any changed blocks in the Database Buffer Cache are marked as dirty buffers. That is, buffers that are not the same as the corresponding blocks on the disk. o The processing of a DELETE or INSERT command uses similar steps. The before image for a DELETE contains the column values in the deleted row, and the before image of an INSERT contains the row location information.

Processing a DDL statement: The execution of DDL (Data Definition Language) statements differs from the execution of DML (Data Manipulation Language) statements and queries, because the success of a DDL statement requires write access to the data dictionary. For these statements, parsing actually includes parsing, data dictionary lookup, and execution. Transaction management, session management, and system management SQL statements are processed using the parse and execute stages. To re-execute them, simply perform another execute.

Starting ORACLE and Setting Environment Variables


This section will provide a basic understanding of how to start ORACLE and some environment variables that may be of use. This section will be kept as general as possible and specific information, such as the name of the database instance, will be provided in the lecture, lab, or assignment writeup. ORACLE is a relational database management system and as such has many development and production tools (i.e. SQL*PLUS, SQL*MENU, SQL*FORMS ..). You must evaluate the available tools and select those that are most appropriate for your application. Each tool that is available for your use will be explained in the following chapters. ORACLE can be used from many different platforms and on each platform it can be used in many different environments. For example, you can use ORACLE from openwindows, olwm, twm, gwm or sunview on UNIX platforms. Although ORACLE can be used from many environments, it has been tested from the openwindows shelltool environment and seems to work better from this environment. If you are running ORACLE from an xterm, you need to bring up xterm with the ``sf'' option. Before you can use any of the tools, you MUST set a few environment variables allowing you to identify yourself to ORACLE, the database instance you want to use and the location where ORACLE binaries can be found. You should source the script ``/usr/local/bin/coraenv'' (``source /usr/local/bin/coraenv'') for csh shell users and ``/usr/local/bin/oraenv'' for sh users, from your lab machines to help you set the necessary variable parameters. The variables below may or may not indicate the correct settings and may change from time to time. See your TA, lab coordinator or assignment writeup for the correct settings to the variables below:
setenv ORACLE_HOME /usr/gwynne3/oracle (or ``setenv ORACLE\_HOME ~oracle'') setenv ORACLE_SID crs setenv T2KDEV sun (or xsun) setenv TWO_TASK crs setenv TERM sun (or xsun)

ORACLE_HOME is set to the home directory of the ``oracle'' user. ORACLE_SID is the name of the database instance that you are using. T2KDEV is set to the type of terminal you are using; sun if you are using openwindows; xsun if you are using X windows.

TERM is set to the type of terminal you are using. Note: If you are using openwindows, you may have to redefine the variable TERM to sun (instead of cmd-sun) and you should execute ORACLE from a shelltool. TWO_TASK is set to the location where ORACLE server can be found and will normally be the name of the database instance. This variable helps simplify the command line for starting ORACLE tools.
call to ORACLE without TWO_TASK and SQL*NET sqlplus scott/tiger@T:gwynne:cr call to ORACLE without TWO_TASK but with SQL*NET sqlplus scott/tiger@crs call to ORACLE with TWO_TASK and SQL*NET sqlplus scott/tiger

Note: T above stands for the protocol name (tcp/ip in this case). ``gwynne'' is the name of the database instance server and crs is the name of the database instance.

Starting ORACLE
If you are not given an ORACLE user ID or PASSWORD, then your user ID will be the same as your system login ID and your password will be verified by the system login. So to start an ORACLE tool SQL*PLUS, with TWO_TASK variable set with password identified by the system and SQL*NET installed, type:
sqlplus /

Setting Up Your Preferred Editor


You may prefer to use your own editor when using the ORACLE tools instead of the default system editor (ed - type ``q'' to quit). This can be done by setting the environment variable EDITOR to the editor of your choice.
setenv EDITOR /usr/ucb/vi

Online Error Messages


Use the online error message facility ``oerr'' to find out the descriptive error messages that are associated with the error codes that ORACLE produces.
% oerr facility error

Facility is identified by the three-letter prefix in the error string. For example, if you get ORA7300, ``ora'' is the facility and ``7300'' is the error, so you should type ``oerr ora 7300''. If you get LCD-111, type ``oerr lcd 111'' and so on.

An Introduction to the Data Dictionary

One of the most important parts of an ORACLE database is its data dictionary. The data dictionary is a set of tables to be used as a read-only reference, which provides information about its associated database. For example, a data dictionary can provide the following information:

names of ORACLE users privileges and roles each user has been granted names of schema objects (tables, views, indexes, synonyms) information about integrity constraints default values for columns how much space has been allocated for, and is currently used by, the objects in a database

The data dictionary is structured in tables and views, just like other database data. To access the data dictionary, you use SQL (see SQL*Plus). Because the data dictionary is read-only, users can only issue queries (SELECT statement) against the tables and views of the data dictionary. The views of the data dictionary serve as a reference for all database users. Certain views are accessible to all ORACLE users, while others are intended for administrators only. The data dictionary consists of sets of views. In many cases, a set consists of three views containing similar information and distinguished from each other by their prefixes:
USER ALL DBA user's view (what is in the user's schema) expanded user's view (what the user can access) database administrator's view (what all users can access)

The views most likely to be of interest to typical users are those with the prefix USER. These views:

refer to user's own private environment in the database, including information about objects created by the user, grants made by the user, and so on display only rows pertinent to the user have identical columns to the other views, except that the column OWNER is implied (the current user) return a subset of the information in the ALL_views

For example, the following query returns all the objects contained in your schema:
SELECT * FROM user_objects;

You can obtain a list of all possible views by querying the data dictionary itself:

SELECT * FROM dict;

Reporting Problems
If you have any valid suggestions on improving this manual, you should send a message to the user ``oracle''. If you encounter any problems using ORACLE, you should see your TA, lab coordinator or send a message to the user ``oracle''. For system problems or other ORACLE problems which require immediate attention, see the file ``/etc/motd'' for the method of reporting problems.

ORACLE Architecture and Terminology


This section will provide a basic understanding of ORACLE including the concepts and terminology of the ORACLE Server. It is important that you read through this section to familiarize yourself with the concepts and terminology to be used throughout this manual. Most of the information contained in this section is DIRECTLY extracted from ``ORACLE7 Server Concepts Manual'' and all credit should be attributed to ORACLE. Before you can begin to use ORACLE, you must have a basic understanding of the architecture of ORACLE to help you start thinking about an ORACLE database in the correct conceptual manner. Figure 1 illustrates a typical variation of ORACLE's memory and process structures; some of the memory structures and processes in this diagram are discussed in the following section. For more information on these memory structures and processes, see page 1-15 of ``ORACLE7 Server Concepts Manual.''

Figure 1. ORACLE Architecture

Memory Structures and Processes


The mechanisms of ORACLE execute by using memory structures and processes. All memory structures exist in the main memory of the computers that constitute the database system. Processes are jobs or tasks that work in the memory of these computers.

Memory Structures
ORACLE creates and uses memory structures to complete several jobs. For example, memory is used to store program code being executed and data that is shared among users. Several basic memory structures are associated with ORACLE: the system global area (which includes the database and redo log buffers, and the shared pool) and the program global area.

System Global Area (SGA) is a shared memory region allocated by ORACLE that contains data and control information for one ORACLE instance. An ORACLE instance contains the SGA and the background processes. The SGA is allocated when an instance starts and deallocated when the instance shuts down. Each instance that is started has its own SGA. The Program Global Area (PGA) is a memory buffer that contains data and control information for a server process. A PGA is created by ORACLE when a server process is started. The information in a PGA depends on the configuration of ORACLE.

Processes
A process is a ``thread of control'' or a mechanism in an operating system that can execute a series of steps. Some operating systems use the terms job or task. An ORACLE database system has two general types of processes: user processes and ORACLE processes. A user process is created and maintained to execute the software code of an application program (such as a PRO*C program) or an ORACLE tool (such as SQL*PLUS). The user process also manages the communication with the server processes. User processes communicate with the server processes through the program interface. ORACLE processes are called by other processes to perform functions on behalf of the invoking process. ORACLE creates a server process to handle requests from connected user processes. ORACLE also creates a set of background processes for each instance (see ``ORACLE7 Server Concepts Manual'' pages 1-18, 1-19).

Database Structures
The relational model has three major aspects: Structures Structures are well-defined objects that store the data of a database. Structures and the data contained within them can be manipulated by operations. Operations Operations are clearly defined actions that allow users to manipulate the data and structures of a database. The operations on a database must adhere to a pre-defined set of integrity rules. Integrity Rule Integrity rules are the laws that govern which operations are allowed on the data and structures of a database. Integrity rules protect the data and the structures of a database.

An ORACLE database has both a physical and a logical structure. By separating physical and logical database structure, the physical storage of data can be managed without affecting the access to logical storage structures.

Logical Database Structure


An ORACLE database's logical structure is determined by:

one or more tablespaces. the database's schema objects (e.g., tables, views, indexes, clusters, sequences, stored procedures).

The logical storage structures, including tablespaces, segments, and extents, dictate how the physical space of a database is used. The schema objects and the relationships among them form the relational design of a database.

Tablespaces and Data Files Tablespaces are the primary logical storage structures of any ORACLE database. The usable data of an ORACLE database is logically stored in the tablespaces and physically stored in the data files associated with the corresponding tablespace. Figure 2 illustrates this relationship. Although databases, tablespaces, data files, and segments are closely related, they have important differences:

databases and tablespaces An ORACLE database is comprised of one or more logical storage units called tablespaces. The database's data is collectively stored in the database's tablespaces. tablespaces and data files Each tablespace in an ORACLE database is comprised of one or more operating system files called data files. A tablespace's data files physically store the associated database data on disk. databases and data files A database's data is collectively stored in the data files that constitute each tablespace of the database. For example, the simplest ORACLE database would have one tablespace, with one data file. A more complicated database might have three tablespaces, each comprised of two data files (for a total of six data files). schema objects, segments, and tablespaces When a schema object such as a table or index is created, its segment is created within a designated tablespace in the database. For example, suppose a table is created in a specific tablespace using the CREATE TABLE command with the TABLESPACE

option. The space for this table's data segment is allocated in one or more of the data files that constitute the specified tablespace. An object's segment allocates space in only one tablespace of a database.

Figure 2. Data Files and Tablespaces

A database is divided into one or more logical storage units called tablespaces. A database administrator can use tablespaces to do the following:

Control disk space allocation for database data. Assign specific space quotas for database users. Control availability of data by taking individual tablespaces online or offline. Perform partial database backup or recovery operations. Allocate data storage across devices to improve performance.

Every ORACLE database contains a tablespace named SYSTEM, which is automatically created when the database is created. The SYSTEM tablespace always contains the data dictionary tables for the entire database. You can query these data dictionary tables to obtain pertinent information

about the database; for example, the names of the tables that are owned by you or ones to which you have access. See Chapter 3 for more information on how to access data dictionary tables. Data files associated with a tablespace store all the database data in that tablespace. One or more datafiles form a logical unit of database storage called a tablespace. A data file can be associated with only one tablespace, and only one database. After a data file is initially created, the allocated disk space does not contain any data; however, the space is reserved to hold only the data for future segments of the associated tablespace - it cannot store any other program's data. As a segment (such as the data segment for a table) is created and grows in a tablespace, ORACLE uses the free space in the associated data files to allocate extents for the segment. The data in the segments of objects (data segments, index segments, rollback segments, and so on) in a tablespace are physically stored in one or more of the data files that constitute the tablespace. Note that a schema object does not correspond to a specific data file; rather, a data file is a repository for the data of any object within a specific tablespace. The extents of a single segment can be allocated in one or more data files of a tablespace (see Figure 3); therefore, an object can ``span'' one or more data files. The database administrator and end-users cannot control which data file stores an object.

Data Blocks, Extents, and Segments ORACLE allocates database space for all data in a database. The units of logical database allocations are data blocks, extents, and segments. Figure 3 illustrates the relationships between these data structures. Data Blocks At the finest level of granularity, an ORACLE database's data is stored in data blocks (also called logical blocks, ORACLE blocks, or pages). An ORACLE database uses and allocates free database space in ORACLE data blocks. Figure 4 illustrates a typical ORACLE data block. Extents The next level of logical database space is called an extent. An extent is a specific number of contiguous data blocks that are allocated for storing a specific type of information. Segments The level of logical database storage above an extent is called a segment. A segment is a set of extents which have been allocated for a specific type of data structure, and all are stored in the same tablespace. For example, each table's data is stored in its own data segment, while each index's data is stored in its own index segment. ORACLE allocates space for segments in extents. Therefore, when the existing extents of a segment are full, ORACLE allocates another extent for that segment. Because extents are allocated as

needed, the extents of a segment may or may not be contiguous on disk, and may or may not span files. An extent cannot span files, though.

Figure 3. The Relationship Among Segments, Extents and Data Blocks

ORACLE manages the storage space in the data files of a database in units called data blocks. A data block is the smallest unit of I/O used by a database. A data block corresponds to a block of physical bytes on disk, equal to the ORACLE data block size (specifically set when the database is created - 2048). This block size can differ from the standard I/O block size of the operating system that executes ORACLE. The ORACLE block format is similar regardless of whether the data block contains table, index, or clustered data. Figure 4 shows the format of a data block.

Figure 4. Data Block Format

Header (Common and Variable) The header contains general block information, such as block address, segment type, such as data, index, or rollback. While some block overhead is fixed in size (about 107 bytes), the total block overhead size is variable. Table Directory The table directory portion of the block contains information about the tables having rows in this block. Row Directory This portion of the block contains row information about the actual rows in the block (including addresses for each row piece in the row data area). Once the space has been allocated in the row directory of a block's header, this space is not reclaimed when the row is deleted. Row Data This portion of the block contains table or index data. Rows can span blocks.

Free Space Free space is used to insert new rows and for updates to rows that require additional space (e.g., when a trailing null is updated to a non-null value). Whether issued insertions actually occur in a given data block is a function of the value for the space management parameter PCTFREE and the amount of current free space in that data block. Space Used for Transaction Entries Data blocks allocated for the data segment of a table, cluster, or the index segment of an index can also use free space for transaction entries. Two space management parameters, PCTFREE and PCTUSED, allow a developer to control the use of free space for inserts of and updates to the rows in data blocks. Both of these parameters can only be specified when creating or altering tables and clusters (data segments). In addition, the storage parameter PCTFREE can also be specified when creating or altering indicies (index segments). The PCTFREE parameter is used to set the percentage of a block to be reserved (kept free) for possible updates to rows that already are contained in that block. For example, assume that you specify the following parameter within a CREATE TABLE statement: pctfree 20 This states that 20\% of each data block used for this table's data segment will be kept free and available for possible updates to the existing rows already within each block. After a data block becomes full, as determined by PCTFREE, the block is not considered for the insertion of new rows until the percentage of the block being used falls below the parameter PCTUSED. Before this value is achieved, the free space of the data block can only be used for updates to rows already contained in the data block. For example, assume that you specify the following parameter within a CREATE TABLE statement: pctused 40 In this case, a data block used for this table's data segment is not considered for the insertion of any new rows until the amount of used space in the blocks falls to 39\% or less (assuming that the block's used space has previously reached PCTFREE). No matter what type, each segment in a database is created with at least one extent to hold its data. This extent is called the segment's initial extent. If the data blocks of a segment's initial extent become full and more space is required to hold new data, ORACLE automatically allocates an incremental extent for that segment. An incremental extent is a subsequent extent of the same or incremented size of the previous extent in that segment.

Every non-clustered table in an ORACLE database has a single data segment to hold all of its data. The data segment for a table is indirectly created via the CREATE TABLE/SNAPSHOT command. Storage parameters for a table, snapshot, or cluster control the way that a data segment's extents are allocated. Setting these storage parameters directly via the CREATE TABLE/SNAPSHOT/CLUSTER or ALTER TABLE/SNAPSHOT/CLUSTER commands affects the efficiency of data retrieval and storage for that data segment. For more information on Data Blocks, Segments and Extents, see ``ORACLE7 Server Concepts Manual.''

Physical Database Structure


An ORACLE database's physical structure is determined by the operating system files that constitute the database. Each ORACLE database is comprised of these types of files: one or more data files, two or more redo log files, and one or more control files. The files of a database provide the actual physical storage for database information. For more information on these physical storage files, see ``ORACLE7 Server Concepts Manual.''

Figure 5. Maintaining the Free Space of Data Blocks with PCTFREE and PCTUSED

Example 1. Loading Data into Multiple Tables

CONTROL FILE - The control file for this example.


-- Loads EMP records from first 23 characters -- Creates and loads PROJ records for each PROJNO listed -- for each employee

LOAD DATA INFILE 'ulcase5.dat' BADFILE 'ulcase5.bad' DISCARDFILE 'ulcase5.dsc' a. b. REPLACE INTO TABLE emp (empno POSITION(1:4) INTEGER EXTERNAL, ename POSITION(6:15) CHAR, deptno POSITION(17:18) CHAR, mgr POSITION(20:23) INTEGER EXTERNAL) INTO TABLE proj -- PROJ has two columns, both not null: EMPNO and PROJNO WHEN projno != ' ' (empno POSITION(1:4) INTEGER EXTERNAL, projno POSITION(25:27) INTEGER EXTERNAL) INTO TABLE proj WHEN projno != ' ' (empno POSITION(1:4) INTEGER EXTERNAL, projno POSITION(29:31) INTEGER EXTERNAL) INTO TABLE proj WHEN projno != ' ' (empno POSITION(1:4) INTEGER EXTERNAL, projno POSITION(33:35) INTEGER EXTERNAL)

b. c. c. b.

-- 1st proj

-- 2nd proj

b.

-- 3rd proj

--------------------------------------------------------------

NOTES: (a) REPLACE indicates that if there is data in the tables to be loaded (EMP and PROJ), that data should be deleted before new rows are loaded.

Multiple INTO clauses are used to load two tables, EMP and PROJ. The same set of records (b) is processed three times using different combinations of columns each time, to load table PROJ. (c) WHEN is used to load only rows with non-blank project numbers. When PROJNO is defined as columns 25..27, rows are inserted into PROJ only if there is a value in those columns.

DATA FILE - Part of the data file follows.


1234 BAKER 1234 JOKER 2664 YOUNG 10 9999 101 102 103 10 9999 777 888 999 20 2893 425 abc 102

INVOKING SQL*LOADER - The command line for this example.


SQLLOAD / CONTROL=ULCASE5.CTL LOG=ULCASE5.LOG

Example 2 Loading a Delimited, Free-Format File

CONTROL FILE - The control file for this example.


-- Variable-length, delimited and enclosed data format a. b. c. d. e. f. LOAD DATA INFILE * APPEND INTO TABLE emp FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' (empno, ename, job, mgr, hiredate DATE "DD-Month-YYYY", sal, comm, deptno CHAR TERMINATED BY ':', projno, loadseq SEQUENCE(MAX,1)) BEGINDATA 7782, "CLARK", "Manager", 7839, 09-June-1981, 2572.50, 10:101 7839, "King", "President", , 17-January-1982, 920.00, 10:102 --------------------------------------------------------------

NOTES: (a) INFILE * signifies the data is found at the end of the control file. (b) (c) APPEND indicates that data may be loaded even if the table already contains rows; the table need not be empty. The default terminator for the data fields is a comma, and some fields may be enclosed by a double quote.

(d) The data to be loaded into column HIREDATE appears in the format DD-Month-YYYY. The SEQUENCE function is used to generate a unique value in the column LOADSEQ. This (e) function finds the current maximum value in column LOADSEQ and adds the increment (1) to it to obtain the value for LOADSEQ for each row inserted. (f) BEGINDATA signifies the end of the control information and the beginning of the data. INVOKING SQL*LOADER - The command line for this example.
SQLLOAD / CONTROL=ULCASE3.CTL LOG=ULCASE3.LOG

Control File Syntax

The control file usually begins with the phase LOAD DATA, followed by several phrases that describe the data to be loaded. Only comments or the OPTIONS phrase can precede the LOAD DATA phase.

For a complete control file syntax diagram see Appendix C in this manual.

Only a subset of the syntax will be explained below. For a complete explanation of the above syntax, see chapter 6 of ``ORACLE7 Server Utilities Users Guide''.

Comments
Comments may appear anywhere in the command section of the file, but they should not appear in the data. Comments are preceded with a double dash, which may appear anywhere on a line. All text to the right of the double dash is ignored, until the end of line.

The OPTIONS Clause


The OPTIONS clause is useful when you usually invoke a control file with the same set of options, or when the command line and all its arguments becomes very long. This clause allows you to specify runtime arguments in the control file rather than on the command line. SKIP = n LOAD = n ERRORS = n ROWS = n -- Number of logical records to skip (DEFAULT 0) -- Number of logical records to load (DEFAULT all) -- Number of errors to allow (DEFAULT 50) -- Number of rows in conventional path bind array (DEFAULT 64)

BINDSIZE = n -- Size of conventional path bind array in bytes SILENT = {HEADER | FEEDBACK | ERROR | DISCARDS | ALL } -- Suppress messages during run For example:
OPTIONS (BINDSIZE=10000, SILENT=(ERRORS, FEEDBACK) )

Values specified on the command line override values specified in the control file. With this precedence, the OPTIONS keyword in the control file established default values that are easily changed from the command line.

Continuing Interrupted Loads


If SQL*Loader runs out of space for data rows or index entries, the load is discontinued. (For example, the table might reach its maximum number of extents.) Discontinued loads can be continued after more space is made available. When a load is discontinued, any data already loaded remains in the tables, and the tables are left in a valid state. SQL*Loader's log file tells you the state of the tables and indexes and the number of logical records already read from the input data file. Use this information to resume the load where it left off. For example:
SQLLOAD / CONTROL=FAST1.CTL SKIP=345

CONTINUE\_LOAD DATA statement is used to continue a discontinued direct path load involving multiple tables with a varying number of records to skip. For more information on this command, see chapter 6 of ``ORACLE7 Server Utilities Users Guide''.

Identifying Data Files


To specify the file containing the data to be loaded, use the INFILE or INDDN keyword, followed by the filename. A filename specified on the command line overrides the first INFILE or INDDN statement in the control file. If no filename is specified, the filename defaults to the control filename with an extension or filetype of DAT.

Loading into Non-Empty Database Tables


SQL*Loader does not update existing records, even if they have null columns. If the tables you are loading already contain data, you have three choices for how SQL*Loader should proceed:

INSERT - This is the default option. It requires the table to be empty before loading. SQL*Loader terminates with an error if the table contains rows. APPEND - If data already exists in the table, SQL*Loader appends the new rows to it; if data doesn't already exist, the new rows are simply loaded. REPLACE - All rows in the table are deleted and the new data is loaded. This option requires DELETE privileges on the table. You can create one logical record from multiple physical records using CONCATENATE and CONTINUEIF. See chapter 6 of ``ORACLE7 Server Utilities Users Guide''.

Loading Logical Records into Tables


The INTO TABLE clause allows you to tell which table you want to load data into. To load multiple tables, you would include one INTO TABLE clause for each table you wish to load. The INTO TABLE clause may continue with some options for loading that table. For example, you may specify different options (INSERT, APPEND, REPLACE) for each table in order to tell SQL*Loader what to do if data already exists in the table. The WHEN clause appears after the table name and is followed by one or more field conditions. For example, the following clause indicates that any record with the value ``q'' in the fifth column position should be loaded:
WHEN (5) = 'q'

A WHEN clause can contain several comparisons as long as each is preceded by AND. Parentheses are optional but should be used for clarity with multiple comparisons joined by AND. For example:
WHEN (DEPTNO = '10') AND (JOB = 'SALES')

To evaluate the WHEN clause, SQL*Loader first determines the values of all the fields in the record. Then the WHEN clause is evaluated. A row is inserted into the table only if the WHEN clause is true. When the control file specifies more fields for a record than are present in the record, SQL*Loader must determine whether the remaining (specified) columns should be considered null, or whether an error should be generated. TRAILING NULLCOLS clause tells SQL*Loader to treat any relatively positioned columns that are not present in the record as null columns. For example, if the following data
10 Accounting

is read with the following control file

INTO TABLE dept TRAILING NULLCOLS ( deptno CHAR TERMINATED BY " ", dname CHAR TERMINATED BY WHITESPACE, loc CHAR TERMINATED BY WHITESPACE )

and the record ends after DNAME, then the remaining LOC field is set to null. Without the TRAILING NULLCOLS clause, an error would be generated, due to missing data.

Specifying Datatypes
The datatype specification in the control file tells SQL*Loader how to interpret the information in the data file. The server defines the datatypes for the columns in the database. SQL*Loader extracts data from a field in the input file, guided by the datatype specification in the control file. SQL*Loader then sends the field to the server to be stored in the appropriate column. The server does any data conversion necessary to store the data in the proper internal format. The datatype of the data in the file does not necessarily have to be the same as the datatype of the column in the ORACLE table. ORACLE automatically performs conversions - but you need to ensure that the conversion makes sense and does not generate errors. SQL*Loader does not contain datatype specifications for ORACLE internal datatypes like NUMBER or VARCHAR2. SQL*Loader's datatypes describe data that can be produced with text editors (character datatypes) and with standard programming languages (native datatypes).

Native Datatypes
Some datatypes consist entirely of binary data, or contain binary data in their implementation. These non-character datatypes are the native datatypes: INTEGER ZONED SMALLINT VARCHAR FLOAT GRAPHIC DOUBLE GRAPHIC EXTERNAL BYTEINT VARGRAPHIC (packed) DECIMAL RAW These datatypes will not be discussed as most of the datatypes that you will be using will be character datatypes. For more information on SQL*Loader datatypes, see page 6-52 of ``ORACLE7 SERVER Utilities User's Guide''.

Character Datatypes
The character datatypes are CHAR, DATE, and the numeric EXTERNAL datatypes (INTEGER and DECIMAL). These fields can be delimited, and can have lengths (or maximum lengths) specified in the control file.

CHAR - This data field contains character data. The length is optional, and is taken from the POSITION specification if it is not present here. If present, this length overrides the length in the POSITION specification. If no length is given, CHAR data is assumed to have a length of 1. A field of datatype CHAR may also be variable-length delimited or enclosed.

To Load LONG Data: If the column in the database table is defined as LONG, you must explicitly specify a maximum length either with a length-specifier on the CHAR keyword, or with the POSITION keyword. This guarantees that a large enough buffer is allocated for the value, and is necessary even if the data is delimited or enclosed.

DATE - This data is character data that should be converted to an ORACLE date using the specified date mask. The length specification is optional, unless a varying-length data mask is specified. With a specification like:
DATE "Month dd, YYYY"

the date mask is 14 characters, while the length of a field like


September 31, 1991

is 18 characters. In this case, a length must be specified. Similarly, a length is required for any Julian dates (date mask ``J'') - a field length is required any time the length of the date string could exceed the length of the mask. An explicit length specification, if present, overrides the length in the POSITION clause. Either of these overrides the length derived from the mask. The mask may be any valid ORACLE date mask. If you omit the mask, the default ORACLE date mask of ``dd-mon-yy'' is used. See Chapter 6 for the Oracle date masks.

Numeric EXTERNAL - The numeric external datatypes are the numeric datatypes (INTEGER, FLOAT, DECIMAL, and ZONED) specified with the EXTERNAL keyword along with optional length and delimiter specifications. These datatypes are the human-readable, character form of numeric data.

The data is a number in character form (not binary representation). As such, these datatypes are identical to CHAR and are treated identically, with one exception: the use of DEFAULTIF. If you want the default to be null, use CHAR; if you want it to be zero, use EXTERNAL.
>>----INTEGER ---EXTERNAL------------------------------------|___FLOAT___| _| |___DECIMAL_| |___ZONED___| |_ ( length ) _| |_ delimiter_spec

delimiter_spec - The boundaries of CHAR, DATE, or numeric EXTERNAL fields may also be marked by specific delimiter characters contained in the input data record. You indicate how the field is delimited by using a delimiter specification after specifying the datatype. Delimited data can be TERMINATED or ENCLOSED.

11 Physical Storage Structures


This chapter describes the primary physical database structures of an Oracle database. Physical structures are viewable at the operating system level.

Introduction to Physical Storage Structures


One characteristic of an RDBMS is the independence of logical data structures such as tables, views, and indexes from physical storage structures. Because physical and logical structures are separate, you can manage physical storage of data without affecting access to logical structures. For example, renaming a database file does not rename the tables stored in it.

An Oracle database is a set of files that store Oracle data in persistent disk storage. This section discusses the database files generated when you issue a CREATE DATABASE statement:

Data files and temp files A data file is a physical file on disk that was created by Oracle Database and contains data structures such as tables and indexes. A temp file is a data file that belongs to a temporary tablespace. The data is written to these files in an Oracle proprietary format that cannot be read by other programs.

Control files A control file is a root file that tracks the physical components of the database.

Online redo log files The online redo log is a set of files containing records of changes made to data.

A database instance is a set of memory structures that manage database files. Figure 11-1 shows the relationship between the instance and the files that it manages. Figure 11-1 Database Instance and Database Files

Description of "Figure 11-1 Database Instance and Database Files"

Mechanisms for Storing Database Files


Several mechanisms are available for allocating and managing the storage of these files. The most common mechanisms include:

Oracle Automatic Storage Management (Oracle ASM) Oracle ASM includes a file system designed exclusively for use by Oracle Database. "Oracle Automatic Storage Management (Oracle ASM)" describes Oracle ASM.

Operating system file system

Most Oracle databases store files in a file system, which is a data structure built inside a contiguous disk address space. All operating systems have file managers that allocate and deallocate disk space into files within a file system. A file system enables disk space to be allocated to many files. Each file has a name and is made to appear as a contiguous address space to applications such as Oracle Database. The database can create, read, write, resize, and delete files. A file system is commonly built on top of a logical volume constructed by a software package called a logical volume manager (LVM). The LVM enables pieces of multiple physical disks to be combined into a single contiguous address space that appears as one disk to higher layers of software.

Raw device Raw devices are disk partitions or logical volumes not formatted with a file system. The primary benefit of raw devices is the ability to perform direct I/O and to write larger buffers. In direct I/O, applications write to and read from the storage device directly, bypassing the operating system buffer cache. Note:
Many file systems now support direct I/O for databases and other applications that manage their own caches. Historically, raw devices were the only means of implementing direct I/O.

Cluster file system A cluster file system is software that enables multiple computers to share file storage while maintaining consistent space allocation and file content. In an Oracle RAC environment, a cluster file system makes shared storage appears as a file system shared by many computers in a clustered environment. With a cluster file system, the failure of a computer in the cluster does not make the file system unavailable. In an operating system file system, however, if a computer sharing files through NFS or other means fails, then the file system is unavailable.

A database employs a combination of the preceding storage mechanisms. For example, a database could store the control files and online redo log files in a traditional file system, some user data files on raw partitions, the remaining data files in Oracle ASM, and archived the redo log files to a cluster file system.

Oracle Automatic Storage Management (Oracle ASM)


Oracle ASM is a high-performance, ease-of-management storage solution for Oracle Database files. Oracle ASM is a volume manager and provides a file system designed exclusively for use by the database.

Oracle ASM provides several advantages over conventional file systems and storage managers, including the following:

Simplifies storage-related tasks such as creating and laying out databases and managing disk space Distributes data across physical disks to eliminate hot spots and to provide uniform performance across the disks Rebalances data automatically after storage configuration changes

To use Oracle ASM, you allocate partitioned disks for Oracle Database with preferences for striping and mirroring. Oracle ASM manages the disk space, distributing the I/O load across all available resources to optimize performance while removing the need for manual I/O tuning. For example, you can increase the size of the disk for the database or move parts of the database to new devices without having to shut down the database.
Oracle ASM Storage Components

Oracle Database can store a data file as an Oracle ASM file in an Oracle ASM disk group, which is a collection of disks that Oracle ASM manages as a unit. Within a disk group, Oracle ASM exposes a file system interface for database files. Figure 11-2 shows the relationships between storage components in a database that uses Oracle ASM. The diagram depicts the relationship between an Oracle ASM file and a data file, although Oracle ASM can store other types of files. The crow's foot notation represents a one-to-many relationship. Figure 11-2 Oracle ASM Components

Description of "Figure 11-2 Oracle ASM Components"

Figure 11-2 illustrates the following Oracle ASM concepts:

Oracle ASM Disks

An Oracle ASM disk is a storage device that is provisioned to an Oracle ASM disk group. An Oracle ASM disk can be a physical disk or partition, a Logical Unit Number (LUN) from a storage array, a logical volume, or a network-attached file. Oracle ASM disks can be added or dropped from a disk group while the database is running. When you add a disk to a disk group, you either assign a disk name or the disk is given an Oracle ASM disk name automatically.

Oracle ASM Disk Groups An Oracle ASM disk group is a collection of Oracle ASM disks managed as a logical unit. The data structures in a disk group are self-contained and consume some disk space in a disk group. Within a disk group, Oracle ASM exposes a file system interface for Oracle database files. The content of files that are stored in a disk group are evenly distributed, or striped, to eliminate hot spots and to provide uniform performance across the disks. The performance is comparable to the performance of raw devices.

Oracle ASM Files An Oracle ASM file is a file stored in an Oracle ASM disk group. Oracle Database communicates with Oracle ASM in terms of files. The database can store data files, control files, online redo log files, and other types of files as Oracle ASM files. When requested by the database, Oracle ASM creates an Oracle ASM file and assigns it a fully qualified name beginning with a plus sign (+) followed by a disk group name, as in +DISK1. Note:
Oracle ASM files can coexist with other storage management options such as raw disks and third-party file systems. This capability simplifies the integration of Oracle ASM into pre-existing environments.

Oracle ASM Extents An Oracle ASM extent is the raw storage used to hold the contents of an Oracle ASM file. An Oracle ASM file consists of one or more file extents. Each Oracle ASM extent consists of one or more allocation units on a specific disk. Note:
An Oracle ASM extent is different from the extent used to store data in a segment.

Oracle ASM Allocation Units

An allocation unit is the fundamental unit of allocation within a disk group. An allocation unit is the smallest contiguous disk space that Oracle ASM allocates. One or more allocation units form an Oracle ASM extent. See Also:

Oracle Database 2 Day DBA to learn how to administer Oracle ASM disks with Oracle Enterprise Manager (Enterprise Manager) Oracle Automatic Storage Management Administrator's Guide to learn more about Oracle ASM

Oracle ASM Instances

An Oracle ASM instance is a special Oracle instance that manages Oracle ASM disks. Both the ASM and the database instances require shared access to the disks in an ASM disk group. ASM instances manage the metadata of the disk group and provide file layout information to the database instances. Database instances direct I/O to ASM disks without going through an ASM instance. An ASM instance is built on the same technology as a database instance. For example, an ASM instance has a system global area (SGA) and background processes that are similar to those of a database instance. However, an ASM instance cannot mount a database and performs fewer tasks than a database instance. Figure 11-3 shows a single-node configuration with one Oracle ASM instance and two database instances, each associated with a different single-instance database. The ASM instance manages the metadata and provides space allocation for the ASM files storing the data for the two databases. One ASM disk group has four ASM disks and the other has two disks. Both database instances can access the disk groups. Figure 11-3 Oracle ASM Instance and Database Instances

Description of "Figure 11-3 Oracle ASM Instance and Database Instances "

See Also:

Oracle Database 2 Day DBA to learn how to administer Oracle ASM disks with Oracle Enterprise Manager (Enterprise Manager) Oracle Automatic Storage Management Administrator's Guide to learn more about Oracle ASM

Oracle Managed Files and User-Managed Files


Oracle Managed Files is a file naming strategy that enables you to specify operations in terms of database objects rather than file names. For example, you can create a tablespace without specifying the names of its data files. In this way, Oracle Managed Files eliminates the need for administrators to directly manage the operating system files in a database. Oracle ASM requires Oracle Managed Files. Note:

This feature does not affect the creation or naming of administrative files such as trace files, audit files, and alert logs (see "Overview of Diagnostic Files").

With user-managed files, you directly manage the operating system files in the database. You make the decisions regarding file structure and naming. For example, when you create a tablespace you set the name and path of the tablespace data files. Through initialization parameters, you specify the file system directory for a specific type of file. The Oracle Managed Files feature ensures that the database creates a unique file and deletes it when no longer needed. The database internally uses standard file system interfaces to create and delete files for data files and temp files, control files, and recovery-related files stored in the fast recovery area. Oracle Managed Files does not eliminate existing functionality. You can create new files while manually administering old files. Thus, a database can have a mixture of Oracle Managed Files and user-managed files. See Also:
Oracle Database Administrator's Guide to learn how to use Oracle Managed Files

Overview of Data Files


At the operating system level, Oracle Database stores database data in data files. Every database must have at least one data file.

Use of Data Files


Part I, "Oracle Relational Data Structures" explains the logical structures in which users store data, the most important of which are tables. Each nonpartitioned schema object and each partition of an object is stored in its own segment. For ease of administration, Oracle Database allocates space for user data in tablespaces, which like segments are logical storage structures. Each segment belongs to only one tablespace. For example, the data for a nonpartitioned table is stored in a single segment, which is turn is stored in one tablespace. Oracle Database physically stores tablespace data in data files. Tablespaces and data files are closely related, but have important differences:

Each tablespace consists of one or more data files, which conform to the operating system in which Oracle Database is running. The data for a database is collectively stored in the data files located in each tablespace of the database. A segment can span one or more data files, but it cannot span multiple tablespaces.

A database must have the SYSTEM and SYSAUX tablespaces. Oracle Database automatically allocates the first data files of any database for the SYSTEM tablespace during database creation. The SYSTEM tablespace contains the data dictionary, a set of tables that contains database metadata. Typically, a database also has an undo tablespace and a temporary tablespace (usually named TEMP).

Figure 11-4 shows the relationship between tablespaces, data files, and segments. Figure 11-4 Data Files and Tablespaces

Description of "Figure 11-4 Data Files and Tablespaces"

See Also:

"Overview of Tablespaces" Oracle Database Administrator's Guide and Oracle Database 2 Day DBA to learn how to manage data files

Permanent and Temporary Data Files


A permanent tablespace contains persistent schema objects. Objects in permanent tablespaces are stored in data files.

A temporary tablespace contains schema objects only for the duration of a session. Locally managed temporary tablespaces have temporary files (temp files), which are special files designed to store data in hash, sort, and other operations. Temp files also store result set data when insufficient space exists in memory. Temp files are similar to permanent data files, with the following exceptions:

Permanent database objects such as tables are never stored in temp files. Temp files are always set to NOLOGGING mode, which means that they never have redo generated for them. Media recovery does not recognize temp files. You cannot make a temp file read-only. You cannot create a temp file with the ALTER DATABASE statement. When you create or resize temp files, they are not always guaranteed allocation of disk space for the file size specified. On file systems such as Linux and UNIX, temp files are created as sparse files. In this case, disk blocks are allocated not at file creation or resizing, but as the blocks are accessed for the first time. Caution:
Sparse files enable fast temp file creation and resizing; however, the disk could run out of space later when the temp files are accessed.

Temp file information is shown in the data dictionary view DBA_TEMP_FILES and the dynamic performance view V$TEMPFILE, but not in DBA_DATA_FILES or the V$DATAFILE view.

Online and Offline Data Files


Every data file is either online (available) or offline (unavailable). You can alter the availability of individual data files or temp files by taking them offline or bringing them online. Offline data files cannot be accessed until they are brought back online. Administrators may take data files offline for many reasons, including performing offline backups, renaming a data file, or block corruption. The database takes a data file offline automatically if the database cannot write to it. Like a data file, a tablespace itself is offline or online. When you take a data file offline in an online tablespace, the tablespace itself remains online. You can make all data files of a tablespace temporarily unavailable by taking the tablespace itself offline See Also:

"Online and Offline Tablespaces" Oracle Database Administrator's Guide to learn how to alter data file availability

Data File Structure

Oracle Database creates a data file for a tablespace by allocating the specified amount of disk space plus the overhead for the data file header. The operating system under which Oracle Database runs is responsible for clearing old information and authorizations from a file before allocating it to the database. The data file header contains metadata about the data file such as its size and checkpoint SCN. Each header contains an absolute file number and a relative file number. The absolute file number uniquely identifies the data file within the database. The relative file number uniquely identifies a data file within a tablespace. When Oracle Database first creates a data file, the allocated disk space is formatted but contains no user data. However, the database reserves the space to hold the data for future segments of the associated tablespace. As the data grows in a tablespace, Oracle Database uses the free space in the data files to allocate extents for the segment. Figure 11-5 illustrates the different types of space in a data file. Extents are either used, which means they contain segment data, or free, which means they are available for reuse. Over time, updates and deletions of objects within a tablespace can create pockets of empty space that individually are not large enough to be reused for new data. This type of empty space is referred to as fragmented free space. Figure 11-5 Space in a Data File

Description of "Figure 11-5 Space in a Data File"

Overview of Control Files


The database control file is a small binary file associated with only one database. Each database has one unique control file, although it may maintain identical copies of it.

Use of Control Files


The control file is the root file that Oracle Database uses to find database files and to manage the state of the database generally. A control file contains information such as the following:

The database name and database unique identifier (DBID) The time stamp of database creation Information about data files, online redo log files, and archived redo log files Tablespace information RMAN backups

The control file serves the following purposes:

It contains information about data files, online redo log files, and so on that are required to open the database. The control file tracks structural changes to the database. For example, when an administrator adds, renames, or drops a data file or online redo log file, the database updates the control file to reflect this change.

It contains metadata that must be accessible when the database is not open. For example, the control file contains information required to recover the database, including checkpoints. A checkpoint indicates the SCN in the redo stream where instance recovery would be required to begin (see "Overview of Instance Recovery"). Every committed change before a checkpoint SCN is guaranteed to be saved on disk in the data files. At least every three seconds the checkpoint process records information in the control file about the checkpoint position in the online redo log.

Oracle Database reads and writes to the control file continuously during database use and must be available for writing whenever the database is open. For example, recovering a database involves reading from the control file the names of all the data files contained in the database. Other operations, such as adding a data file, update the information stored in the control file.

Multiple Control Files


Oracle Database enables multiple, identical control files to be open concurrently and written for the same database. By multiplexing a control file on different disks, the database can achieve redundancy and thereby avoid a single point of failure. Note:
Oracle recommends that you maintain multiple control file copies, each on a different disk.

If a control file becomes unusable, then the database instance fails when it attempts to access the damaged control file. When other current control file copies exist, the database can be remounted and opened without media recovery. If all control files of a database are lost, however, then the instance fails and media recovery is required. Media recovery is not straightforward if an older backup of a control file must be used because a current copy is not available. See Also:

Oracle Database Administrator's Guide to learn how to maintain multiple control files Oracle Database Backup and Recovery User's Guide to learn how to back up and restore control files

Control File Structure


Information about the database is stored in different sections of the control file. Each section is a set of records about an aspect of the database. For example, one section in the control file tracks data files and contains a set of records, one for each data file. Each section is stored in multiple logical control file blocks. Records can span blocks within a section. The control file contains the following types of records:

Circular reuse records These records contain noncritical information that is eligible to be overwritten if needed. When all available record slots are full, the database either expands the control file to make room for a new record or overwrites the oldest record. Examples include records about archived redo log files and RMAN backups.

Noncircular reuse records These records contain critical information that does not change often and cannot be overwritten. Examples of information include tablespaces, data files, online redo log files, and redo threads. Oracle Database never reuses these records unless the corresponding object is dropped from the tablespace.

As explained in "Overview of the Dynamic Performance Views", you can query the dynamic performance views, also known as V$ views, to view the information stored in the control file. For example, you can query V$DATABASE to obtain the database name and DBID. However, only the database can modify the information in the control file. Reading and writing the control file blocks is different from reading and writing data blocks. For the control file, Oracle Database reads and writes directly from the disk to the program global area (PGA). Each process allocates a certain amount of its PGA memory for control file blocks. See Also:

Oracle Database Reference to learn about the V$CONTROLFILE_RECORD_SECTION view Oracle Database Reference to learn about the CONTROL_FILE_RECORD_KEEP_TIME initialization parameter

Overview of the Online Redo Log

The most crucial structure for recovery is the online redo log, which consists of two or more preallocated files that store changes to the database as they occur. The online redo log records changes to the data files.

Use of the Online Redo Log


The database maintains online redo log files to protect against data loss. Specifically, after an instance failure the online redo log files enable Oracle Database to recover committed data not yet written to the data files. Oracle Database writes every transaction synchronously to the redo log buffer, which is then written to the online redo logs. The contents of the log include uncommitted transactions, undo data, and schema and object management statements. Oracle Database uses the online redo log only for recovery. However, administrators can query online redo log files through a SQL interface in the Oracle LogMiner utility (see "Oracle LogMiner"). Redo log files are a useful source of historical information about database activity.

How Oracle Database Writes to the Online Redo Log


The online redo log for a database instance is called a redo thread. In single-instance configurations, only one instance accesses a database, so only one redo thread is present. In an Oracle Real Application Clusters (Oracle RAC) configuration, however, two or more instances concurrently access a database, with each instance having its own redo thread. A separate redo thread for each instance avoids contention for a single set of online redo log files. An online redo log consists of two or more online redo log files. Oracle Database requires a minimum of two files to guarantee that one is always available for writing while the other is being archived (if the database is in ARCHIVELOG mode). See Also:
Oracle Database 2 Day + Real Application Clusters Guide and Oracle Real Application Clusters Administration and Deployment Guide to learn about online redo log groups in Oracle RAC Online Redo Log Switches

Oracle Database uses only one online redo log file at a time to store records written from the redo log buffer. The online redo log file to which the log writer (LGWR) process is actively writing is called the current online redo log file. A log switch occurs when the database stops writing to one online redo log file and begins writing to another. Normally, a switch occurs when the current online redo log file is full and writing must continue. However, you can configure log switches to occur at regular intervals, regardless of whether the current online redo log file is filled, and force log switches manually.

Log writer writes to online redo log files circularly. When log writer fills the last available online redo log file, the process writes to the first log file, restarting the cycle. Figure 11-6 illustrates the circular writing of the redo log. Figure 11-6 Reuse of Online Redo Log Files

Description of "Figure 11-6 Reuse of Online Redo Log Files"

The numbers in Figure 11-6 shows the sequence in which LGWR writes to each online redo log file. The database assigns each file a new log sequence number when a log switches and log writers begins writing to it. When the database reuses an online redo log file, this file receives the next available log sequence number. Filled online redo log files are available for reuse depending on the archiving mode:

If archiving is disabled, which means that the database is in NOARCHIVELOG mode, then a filled online redo log file is available after the changes recorded in it have been checkpointed (written) to disk by database writer (DBWn). If archiving is enabled, which means that the database is in ARCHIVELOG mode, then a filled online redo log file is available to log writer after the changes have been written to the data files and the file has been archived.

In some circumstances, log writer may be prevented from reusing an existing online redo log file. For example, an online redo log file may be active (required for instance recovery) rather

than inactive (not required for instance recovery). Also, an online redo log file may be in the process of being cleared. See Also:

"Overview of Background Processes" Oracle Database 2 Day DBA and Oracle Database Administrator's Guide to learn how to manage the online redo log

Multiple Copies of Online Redo Log Files

Oracle Database can automatically maintain two or more identical copies of the online redo log in separate locations. An online redo log group consists of an online redo log file and its redundant copies. Each identical copy is a member of the online redo log group. Each group is defined by a number, such as group 1, group 2, and so on. Maintaining multiple members of an online redo log group protects against the loss of the redo log. Ideally, the locations of the members should be on separate disks so that the failure of one disk does not cause the loss of the entire online redo log. In Figure 11-7, A_LOG1 and B_LOG1 are identical members of group 1, while A_LOG2 and B_LOG2 are identical members of group 2. Each member in a group must be the same size. LGWR writes concurrently to group 1 (members A_LOG1 and B_LOG1), then writes concurrently to group 2 (members A_LOG2 and B_LOG2), then writes to group 1, and so on. LGWR never writes concurrently to members of different groups. Figure 11-7 Multiple Copies of Online Redo Log Files

Description of "Figure 11-7 Multiple Copies of Online Redo Log Files"

Note:
Oracle recommends that you multiplex the online redo log. The loss of log files can be catastrophic if recovery is required. When you multiplex the online redo log, the database must increase the amount of I/O it performs. Depending on your system, this additional I/O may impact overall database performance.

See Also:
Oracle Database Administrator's Guide to learn how to maintain multiple copies of the online redo log files Archived Redo Log Files

An archived redo log file is a copy of a filled member of an online redo log group. This file is not considered part of the database, but is an offline copy of an online redo log file created by the database and written to a user-specified location. Archived redo log files are a crucial part of a backup and recovery strategy. You can use archived redo log files to:

Recover a database backup Update a standby database (see "Computer Failures")

Obtain information about the history of a database using the LogMiner utility (see "Oracle LogMiner")

Archiving is the operation of generating an archived redo log file. Archiving is either automatic or manual and is only possible when the database is in ARCHIVELOG mode. An archived redo log file includes the redo entries and the log sequence number of the identical member of the online redo log group. In Figure 11-7, files A_LOG1 and B_LOG1 are identical members of Group 1. If the database is in ARCHIVELOG mode, and if automatic archiving is enabled, then the archiver process (ARCn) will archive one of these files. If A_LOG1 is corrupted, then the process can archive B_LOG1. The archived redo log contains a copy of every group created since you enabled archiving. See Also:

"Data File Recovery" Oracle Database Administrator's Guide to learn how to manage the archived redo log

Structure of the Online Redo Log


Online redo log files contain redo records. A redo record is made up of a group of change vectors, each of which describes a change to a data block. For example, an update to a salary in the employees table generates a redo record that describes changes to the data segment block for the table, the undo segment data block, and the transaction table of the undo segments. The redo records have all relevant metadata for the change, including the following:

SCN and time stamp of the change Transaction ID of the transaction that generated the change SCN and time stamp when the transaction committed (if it committed) Type of operation that made the change Name and type of the modified data segment

12 Logical Storage Structures


This chapter describes the nature of and relationships among logical storage structures. These structures are created and recognized by Oracle Database and are not known to the operating system.

Introduction to Logical Storage Structures


Oracle Database allocates logical space for all data in the database. The logical units of database space allocation are data blocks, extents, segments, and tablespaces. At a physical level, the data

is stored in data files on disk (see Chapter 11, "Physical Storage Structures"). The data in the data files is stored in operating system blocks. Figure 12-1 is an entity-relationship diagram for physical and logical storage. The crow's foot notation represents a one-to-many relationship. Figure 12-1 Logical and Physical Storage

Description of "Figure 12-1 Logical and Physical Storage"

Logical Storage Hierarchy


Figure 12-2 shows the relationships among data blocks, extents, and segments within a tablespace. In this example, a segment has two extents stored in different data files. Figure 12-2 Segments, Extents, and Data Blocks Within a Tablespace

Description of "Figure 12-2 Segments, Extents, and Data Blocks Within a Tablespace"

At the finest level of granularity, Oracle Database stores data in data blocks. One logical data block corresponds to a specific number of bytes of physical disk space, for example, 2 KB. Data blocks are the smallest units of storage that Oracle Database can use or allocate. An extent is a set of logically contiguous data blocks allocated for storing a specific type of information. In Figure 12-2, the 24 KB extent has 12 data blocks, while the 72 KB extent has 36 data blocks. A segment is a set of extents allocated for a specific database object, such as a table. For example, the data for the employees table is stored in its own data segment, whereas each index for employees is stored in its own index segment. Every database object that consumes storage consists of a single segment. Each segment belongs to one and only one tablespace. Thus, all extents for a segment are stored in the same tablespace. Within a tablespace, a segment can include extents from multiple data files, as shown in Figure 12-2. For example, one extent for a segment may be stored in users01.dbf, while another is stored in users02.dbf. A single extent can never span data files. See Also:
"Overview of Data Files"

Logical Space Management


Oracle Database must use logical space management to track and allocate the extents in a tablespace. When a database object requires an extent, the database must have a method of

finding and providing it. Similarly, when an object no longer requires an extent, the database must have a method of making the free extent available. Oracle Database manages space within a tablespace based on the type that you create. You can create either of the following types of tablespaces:

Locally managed tablespaces (default) The database uses bitmaps in the tablespaces themselves to manage extents. Thus, locally managed tablespaces have a part of the tablespace set aside for a bitmap. Within a tablespace, the database can manage segments with automatic segment space management (ASSM) or manual segment space management (MSSM).

Dictionary-managed tablespaces The database uses the data dictionary to manage extents (see "Overview of the Data Dictionary").

Figure 12-3 shows the alternatives for logical space management in a tablespace. Figure 12-3 Logical Space Management

Description of "Figure 12-3 Logical Space Management" Locally Managed Tablespaces

A locally managed tablespace maintains a bitmap in the data file header to track free and used space in the data file body. Each bit corresponds to a group of blocks. When space is allocated or freed, Oracle Database changes the bitmap values to reflect the new status of the blocks. The following graphic is a conceptual representation of bitmap-managed storage. A 1 in the header refers to used space, whereas a 0 refers to free space.

Description of the illustration cncpt332.gif

A locally managed tablespace has the following advantages:

Avoids using the data dictionary to manage extents Recursive operations can occur in dictionary-managed tablespaces if consuming or releasing space in an extent results in another operation that consumes or releases space in a data dictionary table or undo segment.

Tracks adjacent free space automatically In this way, the database eliminates the need to coalesce free extents.

Determines the size of locally managed extents automatically Alternatively, all extents can have the same size in a locally managed tablespace and override object storage options.

Note:
Oracle strongly recommends the use of locally managed tablespaces with Automatic Segment Space Management.

Segment space management is an attribute inherited from the tablespace that contains the segment. Within a locally managed tablespace, the database can manage segments automatically

or manually. For example, segments in tablespace users can be managed automatically while segments in tablespace tools are managed manually.
Automatic Segment Space Management

The ASSM method uses bitmaps to manage space. Bitmaps provide the following advantages:

Simplified administration ASSM avoids the need to manually determine correct settings for many storage parameters. Only one crucial SQL parameter controls space allocation: PCTFREE. This parameter specifies the percentage of space to be reserved in a block for future updates (see "Percentage of Free Space in Data Blocks").

Increased concurrency Multiple transactions can search separate lists of free data blocks, thereby reducing contention and waits. For many standard workloads, application performance with ASSM is better than the performance of a well-tuned application that uses MSSM.

Dynamic affinity of space to instances in an Oracle Real Application Clusters (Oracle RAC) environment

ASSM is more efficient and is the default for permanent, locally managed tablespaces. Note:
This chapter assumes the use of ASSM in all of its discussions of logical storage space.
Manual Segment Space Management

The legacy MSSM method uses a linked list called a free list to manage free space in the segment. For a database object that has free space, a free list keeps track of blocks under the high water mark (HWM), which is the dividing line between segment space that is used and not yet used. As blocks are used, the database puts blocks on or removes blocks from the free list as needed. In addition to PCTFREE, MSSM requires you to control space allocation with SQL parameters such as PCTUSED, FREELISTS, and FREELIST GROUPS. PCTUSED sets the percentage of free space that must exist in a currently used block for the database to put it on the free list. For example, if you set PCTUSED to 40 in a CREATE TABLE statement, then you cannot insert rows into a block in the segment until less than 40% of the block space is used. As an illustration, suppose you insert a row into a table. The database checks a free list of the table for the first available block. If the row cannot fit in the block, and if the used space in the block is greater than or equal to PCTUSED, then the database takes the block off the list and

searches for another block. If you delete rows from the block, then the database checks whether used space in the block is now less than PCTUSED. If so, then the database places the block at the beginning of the free list. An object may have multiple free lists. In this way, multiple sessions performing DML on a table can use different lists, which can reduce contention. Each database session uses only one free list for the duration of its session. As shown in Figure 12-4, you can also create an object with one or more free list groups, which are collections of free lists. Each group has a master free list that manages the individual process free lists in the group. Space overhead for free lists, especially for free list groups, can be significant. Figure 12-4 Free List Groups

Description of "Figure 12-4 Free List Groups"

Managing segment space manually can be complex. You must adjust PCTFREE and PCTUSED to reduce row migration (see "Chained and Migrated Rows") and avoid wasting space. For example, if every used block in a segment is half full, and if PCTUSED is 40, then the database does not permit inserts into any of these blocks. Because of the difficulty of fine-tuning space allocation parameters, Oracle strongly recommends ASSM. In ASSM, PCTFREE determines whether a new row can be inserted into a block, but it does not use free lists and ignores PCTUSED. See Also:

Oracle Database Administrator's Guide to learn about locally managed tablespaces Oracle Database 2 Day DBA and Oracle Database Administrator's Guide to learn more about automatic segment space management

Oracle Database SQL Language Reference to learn about storage parameters such as PCTFREE and PCTUSED

Dictionary-Managed Tablespaces

A dictionary-managed tablespace uses the data dictionary to manage its extents. Oracle Database updates tables in the data dictionary whenever an extent is allocated or freed for reuse. For example, when a table needs an extent, the database queries the data dictionary tables, and searches for free extents. If the database finds space, then it modifies one data dictionary table and inserts a row into another. In this way, the database manages space by modifying and moving data. The SQL that the database executes in the background to obtain space for database objects is recursive SQL. Frequent use of recursive SQL can have a negative impact on performance because updates to the data dictionary must be serialized. Locally managed tablespaces, which are the default, avoid this performance problem. See Also:
Oracle Database Administrator's Guide to learn how to migrate tablespaces from dictionary-managed to locally managed

Overview of Data Blocks


Oracle Database manages the logical storage space in the data files of a database in units called data blocks, also called Oracle blocks or pages. A data block is the minimum unit of database I/O.

Data Blocks and Operating System Blocks


At the physical level, database data is stored in disk files made up of operating system blocks. An operating system block is the minimum unit of data that the operating system can read or write. In contrast, an Oracle block is a logical storage structure whose size and structure are not known to the operating system. Figure 12-5 shows that operating system blocks may differ in size from data blocks. The database requests data in multiples of data blocks, not operating system blocks. Figure 12-5 Data Blocks and Operating System Blocks

Description of "Figure 12-5 Data Blocks and Operating System Blocks"

When the database requests a data block, the operating system translates this operation into a requests for data in permanent storage. The logical separation of data blocks from operating system blocks has the following implications:

Applications do not need to determine the physical addresses of data on disk. Database data can be striped or mirrored on multiple physical disks.

Database Block Size

Every database has a database block size. The DB_BLOCK_SIZE initialization parameter sets the data block size for a database when it is created. The size is set for the SYSTEM and SYSAUX tablespaces and is the default for all other tablespaces. The database block size cannot be changed except by re-creating the database. If DB_BLOCK_SIZE is not set, then the default data block size is operating system-specific. The standard data block size for a database is 4 KB or 8 KB. If the size differs for data blocks and operating system blocks, then the data block size must be a multiple of the operating system block size. See Also:

Oracle Database Reference to learn about the DB_BLOCK_SIZE initialization parameter Oracle Database Administrator's Guide and Oracle Database Performance Tuning Guide to learn how to choose block sizes

Tablespace Block Size

You can create individual tablespaces whose block size differs from the DB_BLOCK_SIZE setting. A nonstandard block size can be useful when moving a transportable tablespace to a different platform.

See Also:
Oracle Database Administrator's Guide to learn how to specify a nonstandard block size for a tablespace

Data Block Format


Every data block has a format or internal structure that enables the database to track the data and free space in the block. This format is similar whether the data block contains table, index, or table cluster data. Figure 12-6 shows the format of an uncompressed data block (see "Data Block Compression" to learn about compressed blocks). Figure 12-6 Data Block Format

Description of "Figure 12-6 Data Block Format" Data Block Overhead

Oracle Database uses the block overhead to manage the block itself. The block overhead is not available to store user data. As shown in Figure 12-6, the block overhead includes the following parts:

Block header This part contains general information about the block, including disk address and segment type. For blocks that are transaction-managed, the block header contains active and historical transaction information.

A transaction entry is required for every transaction that updates the block. Oracle Database initially reserves space in the block header for transaction entries. In data blocks allocated to segments that support transactional changes, free space can also hold transaction entries when the header space is depleted. The space required for transaction entries is operating system dependent. However, transaction entries in most operating systems require approximately 23 bytes.

Table directory For a heap-organized table, this directory contains metadata about tables whose rows are stored in this block. Multiple tables can store rows in the same block.

Row directory For a heap-organized table, this directory describes the location of rows in the data portion of the block. After space has been allocated in the row directory, the database does not reclaim this space after row deletion. Thus, a block that is currently empty but formerly had up to 50 rows continues to have 100 bytes allocated for the row directory. The database reuses this space only when new rows are inserted in the block.

Some parts of the block overhead are fixed in size, but the total size is variable. On average, the block overhead totals 84 to 107 bytes.
Row Format

The row data part of the block contains the actual data, such as table rows or index key entries. Just as every data block has an internal format, every row has a row format that enables the database to track the data in the row. Oracle Database stores rows as variable-length records. A row is contained in one or more row pieces. Each row piece has a row header and column data. Figure 12-7 shows the format of a row. Figure 12-7 The Format of a Row Piece

Description of "Figure 12-7 The Format of a Row Piece"


Row Header

Oracle Database uses the row header to manage the row piece stored in the block. The row header contains information such as the following:

Columns in the row piece Pieces of the row located in other data blocks If an entire row can be inserted into a single data block, then Oracle Database stores the row as one row piece. However, if all of the row data cannot be inserted into a single block or an update causes an existing row to outgrow its block, then the database stores the row in multiple row pieces (see "Chained and Migrated Rows"). A data block usually contains only one row piece per row.

Cluster keys for table clusters (see "Overview of Table Clusters")

A row fully contained in one block has at least 3 bytes of row header.
Column Data

After the row header, the column data section stores the actual data in the row. The row piece usually stores columns in the order listed in the CREATE TABLE statement, but this order is not guaranteed. For example, columns of type LONG are created last.

As shown in Figure 12-7, for each column in a row piece, Oracle Database stores the column length and data separately. The space required depends on the data type. If the data type of a column is variable length, then the space required to hold a value can grow and shrink with updates to the data. Each row has a slot in the row directory of the data block header. The slot points to the beginning of the row. See Also:
"Table Storage" and "Index Storage"
Rowid Format

Oracle Database uses a rowid to uniquely identify a row. Internally, the rowid is a structure that holds information that the database needs to access a row. A rowid is not physically stored in the database, but is inferred from the file and block on which the data is stored. An extended rowid includes a data object number. This rowid type uses a base 64 encoding of the physical address for each row. The encoding characters are A-Z, a-z, 0-9, +, and /. Example 12-1 queries the ROWID pseudocolumn to show the extended rowid of the row in the employees table for employee 100. Example 12-1 ROWID Pseudocolumn
SQL> SELECT ROWID FROM employees WHERE employee_id = 100; ROWID -----------------AAAPecAAFAAAABSAAA

Figure 12-8 illustrates the format of an extended rowid. Figure 12-8 ROWID Format

Description of "Figure 12-8 ROWID Format"

An extended rowid is displayed in a four-piece format, OOOOOOFFFBBBBBBRRR, with the format divided into the following components:
OOOOOO

The data object number identifies the segment (data object AAAPec in Example 12-1). A data object number is assigned to every database segment. Schema objects in the same segment, such as a table cluster, have the same data object number.
FFF

The tablespace-relative data file number identifies the data file that contains the row (file AAF in Example 12-1).
BBBBBB

The data block number identifies the block that contains the row (block AAAABS in Example 12-1). Block numbers are relative to their data file, not their tablespace. Thus, two rows with identical block numbers could reside in different data files of the same tablespace.
RRR

The row number identifies the row in the block (row AAA in Example 12-1). After a rowid is assigned to a row piece, the rowid can change in special circumstances. For example, if row movement is enabled, then the rowid can change because of partition key updates, Flashback Table operations, shrink table operations, and so on. If row movement is disabled, then a rowid can change if the row is exported and imported using Oracle Database utilities. Note:
Internally, the database performs row movement as if the row were physically deleted and reinserted. However, row movement is considered an update, which has implications for triggers.

Data Block Compression


The database can use table compression to eliminate duplicate values in a data block (see "Table Compression"). This section describes the format of data blocks that use compression. The format of a data block that uses basic and OLTP table compression is essentially the same as an uncompressed block. The difference is that a symbol table at the beginning of the block stores duplicate values for the rows and columns. The database replaces occurrences of these values with a short reference to the symbol table. Assume that the rows in Example 12-2 are stored in a data block for the seven-column sales table. Example 12-2 Rows in sales Table

2190,13770,25-NOV-00,S,9999,23,161 2225,15720,28-NOV-00,S,9999,25,1450 34005,120760,29-NOV-00,P,9999,44,2376 9425,4750,29-NOV-00,I,9999,11,979 1675,46750,29-NOV-00,S,9999,19,1121

When basic or OLTP table compression is applied to this table, the database replaces duplicate values with a symbol reference. Example 12-3 is a conceptual representation of the compression in which the symbol * replaces 29-NOV-00 and % replaces 9999. Example 12-3 OLTP Compressed Rows in sales Table
2190,13770,25-NOV-00,S,%,23,161 2225,15720,28-NOV-00,S,%,25,1450 34005,120760,*,P,%,44,2376 9425,4750,*,I,%,11,979 1675,46750,*,S,%,19,1121

Table 12-1 conceptually represents the symbol table that maps symbols to values. Table 12-1 Symbol Table
Symbol Value Column Rows

* %

29-NOV-00 9999

3 5

958-960 956-960

Space Management in Data Blocks


As the database fills a data block from the bottom up, the amount of free space between the row data and the block header decreases. This free space can also shrink during updates, as when changing a trailing null to a nonnull value. The database manages free space in the data block to optimize performance and avoid wasted space. Note:
This section assumes the use of automatic segment space management. Percentage of Free Space in Data Blocks

The PCTFREE storage parameter is essential to how the database manages free space. This SQL parameter sets the minimum percentage of a data block reserved as free space for updates to existing rows. Thus, PCTFREE is important for preventing row migration and avoiding wasted space.

For example, assume that you create a table that will require only occasional updates, most of which will not increase the size of the existing data. You specify the PCTFREE parameter within a CREATE TABLE statement as follows:
CREATE TABLE test_table (n NUMBER) PCTFREE 20;

Figure 12-9 shows how a PCTFREE setting of 20 affects space management. The database adds rows to the block over time, causing the row data to grow upwards toward the block header, which is itself expanding downward toward the row data. The PCTFREE setting ensures that at least 20% of the data block is free. For example, the database prevents an INSERT statement from filling the block so that the row data and header occupy a combined 90% of the total block space, leaving only 10% free. Figure 12-9 PCTFREE

Description of "Figure 12-9 PCTFREE"

Note:
This discussion does not apply to LOB data types, which do not use the PCTFREE storage parameter or free lists. See "Overview of LOBs".

See Also:
Oracle Database SQL Language Reference for the syntax and semantics of the PCTFREE parameter

Optimization of Free Space in Data Blocks

While the percentage of free space cannot be less than PCTFREE, the amount of free space can be greater. For example, a PCTFREE setting of 20% prevents the total amount of free space from dropping to 5% of the block, but permits 50% of the block to be free space. The following SQL statements can increase free space:
DELETE UPDATE

statements statements that either update existing values to smaller values or increase existing values and force a row to migrate INSERT statements on a table that uses OLTP compression If inserts fill a block with data, then the database invokes block compression, which may result in the block having more free space.

The space released is available for INSERT statements under the following conditions:

If the INSERT statement is in the same transaction and after the statement that frees space, then the statement can use the space. If the INSERT statement is in a separate transaction from the statement that frees space (perhaps run by another user), then the statement can use the space made available only after the other transaction commits and only if the space is needed.

See Also:
Oracle Database Administrator's Guide to learn about OLTP compression
Coalescing Fragmented Space

Released space may or may not be contiguous with the main area of free space in a data block, as shown in Figure 12-10. Noncontiguous free space is called fragmented space. Figure 12-10 Data Block with Fragmented Space

Description of "Figure 12-10 Data Block with Fragmented Space"

Oracle Database automatically and transparently coalesces the free space of a data block only when the following conditions are true:

An INSERT or UPDATE statement attempts to use a block that contains sufficient free space to contain a new row piece. The free space is fragmented so that the row piece cannot be inserted in a contiguous section of the block.

After coalescing, the amount of free space is identical to the amount before the operation, but the space is now contiguous. Figure 12-11 shows a data block after space has been coalesced. Figure 12-11 Data Block After Coalescing Free Space

Description of "Figure 12-11 Data Block After Coalescing Free Space"

Oracle Database performs coalescing only in the preceding situations because otherwise performance would decrease because of the continuous coalescing of the free space in data blocks.
Reuse of Index Space

The database can reuse space within an index block. For example, if you insert a value into a column and delete it, and if an index exists on this column, then the database can reuse the index slot when a row requires it. The database can reuse an index block itself. Unlike a table block, an index block only becomes free when it is empty. The database places the empty block on the free list of the index structure and makes it eligible for reuse. However, Oracle Database does not automatically compact the index: an ALTER INDEX REBUILD or COALESCE statement is required. Figure 12-12 represents an index of the employees.department_id column before the index is coalesced. The first three leaf blocks are only partially full, as indicated by the gray fill lines. Figure 12-12 Index Before Coalescing

Description of "Figure 12-12 Index Before Coalescing"

Figure 12-13 shows the index in Figure 12-12 after the index has been coalesced. The first two leaf blocks are now full, as indicated by the gray fill lines, and the third leaf block has been freed. Figure 12-13 Index After Coalescing

Description of "Figure 12-13 Index After Coalescing"

See Also:

Oracle Database Administrator's Guide to learn how to coalesce and rebuild indexes Oracle Database SQL Language Reference to learn about the COALESCE statement

Chained and Migrated Rows

Oracle Database must manage rows that are too large to fit into a single block. The following situations are possible:

The row is too large to fit into one data block when it is first inserted. In row chaining, Oracle Database stores the data for the row in a chain of one or more data blocks reserved for the segment. Row chaining most often occurs with large rows. Examples include rows that contain a column of data type LONG or LONG RAW, a VARCHAR2(4000) column in a 2 KB block, or a row with a huge number of columns. Row chaining in these cases is unavoidable.

A row that originally fit into one data block is updated so that the overall row length increases, but insufficient free space exists to hold the updated row. In row migration, Oracle Database moves the entire row to a new data block, assuming the row can fit in a new block. The original row piece of a migrated row contains a pointer or "forwarding address" to the new block containing the migrated row. The rowid of a migrated row does not change.

A row has more than 255 columns. Oracle Database can only store 255 columns in a row piece. Thus, if you insert a row into a table that has 1000 columns, then the database creates 4 row pieces, typically chained over multiple blocks.

Figure 12-14 depicts shows the insertion of a large row in a data block. The row is too large for the left block, so the database chains the row by placing the first row piece in the left block and the second row piece in the right block. Figure 12-14 Row Chaining

Description of "Figure 12-14 Row Chaining"

Figure 12-15, the left block contains a row that is updated so that the row is now too large for the block. The database moves the entire row to the right block and leaves a pointer to the migrated row in the left block. Figure 12-15 Row Migration

Description of "Figure 12-15 Row Migration"

When a row is chained or migrated, the I/O needed to retrieve the data increases. This situation results because Oracle Database must scan multiple blocks to retrieve the information for the row. For example, if the database performs one I/O to read an index and one I/O to read a nonmigrated table row, then an additional I/O is required to obtain the data for a migrated row. The Segment Advisor, which can be run both manually and automatically, is an Oracle Database component that identifies segments that have space available for reclamation. The advisor can offer advice about objects that have significant free space or too many chained rows. See Also:

"Row Storage" and "Rowids of Row Pieces" Oracle Database 2 Day DBA and Oracle Database Administrator's Guide to learn how to reclaim wasted space Oracle Database Performance Tuning Guide to learn about reducing chained and migrated rows

Overview of Extents
An extent is a logical unit of database storage space allocation made up of contiguous data blocks. Data blocks in an extent are logically contiguous but can be physically spread out on disk because of RAID striping and file system implementations.

Allocation of Extents

By default, the database allocates an initial extent for a data segment when the segment is created. An extent is always contained in one data file. Although no data has been added to the segment, the data blocks in the initial extent are reserved for this segment exclusively. The first data block of every segment contains a directory of the extents in the segment. Figure 12-16 shows the initial extent in a segment in a data file that previously contained no data. Figure 12-16 Initial Extent of a Segment

Description of "Figure 12-16 Initial Extent of a Segment"

If the initial extent become full, and if more space is required, then the database automatically allocates an incremental extent for this segment. An incremental extent is a subsequent extent created for the segment. The allocation algorithm depends on whether the tablespace is locally managed or dictionarymanaged. In the locally managed case, the database searches the bitmap of a data file for adjacent free blocks. If the data file has insufficient space, then the database looks in another data file. Extents for a segment are always in the same tablespace but may be in different data files. Figure 12-17 shows that the database can allocate extents for a segment in any data file in the tablespace. For example, the segment can allocate the initial extent in users01.dbf, allocate the first incremental extent in users02.dbf, and allocate the next extent in users01.dbf. Figure 12-17 Incremental Extent of a Segment

Description of "Figure 12-17 Incremental Extent of a Segment"

The blocks of a newly allocated extent, although they were free, may not be empty of old data. In ASSM, Oracle Database formats the blocks of a newly allocated extent when it starts using the extent, but only as needed (see "Segment Space and the High Water Mark"). Note:
This section applies to serial operations, in which one server process parses and runs a statement. Extents are allocated differently in parallel SQL statements, which entail multiple server processes.

Deallocation of Extents
In general, the extents of a user segment do not return to the tablespace unless you drop the object using a DROP command. In Oracle Database 11g Release 2 (11.2.0.2), you can also drop the segment using the DBMS_SPACE_ADMIN package. For example, if you delete all rows in a table, then the database does not reclaim the data blocks for use by other objects in the tablespace. Note:
In an undo segment, Oracle Database periodically deallocates one or more extents if it has the OPTIMAL size specified or if the database is in automatic undo management mode (see "Undo Tablespaces").

In some circumstances, you can manually deallocate space. The Oracle Segment Advisor helps determine whether an object has space available for reclamation based on the level of fragmentation in the object. The following techniques can free extents:

You can use an online segment shrink to reclaim fragmented space in a segment. Segment shrink is an online, in-place operation. In general, data compaction leads to better cache utilization and requires fewer blocks to be read in a full table scan. You can move the data of a nonpartitioned table or table partition into a new segment, and optionally into a different tablespace for which you have quota. You can rebuild or coalesce the index (see "Reuse of Index Space"). You can truncate a table or table cluster, which removes all rows. By default, Oracle Database deallocates all space used by the removed rows except that specified by the MINEXTENTS storage parameter. In Oracle Database 11g Release 2 (11.2.0.2), you can also use TRUNCATE with the DROP ALL STORAGE option to drop entire segments. You can deallocate unused space, which frees the unused space at the high water mark end of the database segment and makes the space available for other segments in the tablespace (see "Segment Space and the High Water Mark").

When extents are freed, Oracle Database modifies the bitmap in the data file for locally managed tablespaces to reflect the regained extents as available space. Any data in the blocks of freed extents becomes inaccessible.

Storage Parameters for Extents


Every segment is defined by storage parameters expressed in terms of extents. These parameters control how Oracle Database allocates free space for a segment. The storage settings are determined in the following order of precedence, with setting higher on the list overriding settings lower on the list: 1. Segment storage clause 2. Tablespace storage clause 3. Oracle Database default A locally managed tablespace can have either uniform extent sizes or variable extent sizes determined automatically by the system:

For uniform extents, you can specify an extent size or use the default size of 1 MB. All extents in the tablespace are of this size. Locally managed temporary tablespaces can only use this type of allocation. For automatically allocated extents, Oracle Database determines the optimal size of additional extents.

For locally managed tablespaces, some storage parameters cannot be specified at the tablespace level. However, you can specify these parameters at the segment level. In this case, the databases uses all parameters together to compute the initial size of the segment. Internal algorithms determine the subsequent size of each extent.

You might also like