
default user- dbc

dbc owns all objects, users and databases that are created on the system


default database name- dbc

teradata architecture and features:


1. unlimited parallelism- based on massively parallel processing (MPP): a task is
broken down into smaller pieces that are worked on in parallel by multiple processes
2. shared nothing architecture- teradata nodes, access module processors (amps) and
their associated memory are independent of each other
3. high scalability: performance can be linearly scaled by adding up to 2048 nodes
4. standard sql is supported in addition to its own extensions
5. robust utilities for import/export- FastLoad, MultiLoad, FastExport and TPT
(teradata parallel transporter)

teradata table types-


1. permanent table- contains data stored by user permanently
2. volatile table- data is held only for the duration of a user login session. used
for storing intermediate data during transformation
3. global temporary table- the table definition is persistent but the data is deleted
at the end of the user session
4. derived table- used for holding data while a query is being computed

classification of tables- set and multiset. a set table does not store duplicate rows,
a multiset table can store them
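
a minimal sketch of the set/multiset difference. the database demo_db and the table
names are made up for illustration; the behaviour noted in the comments is what these
notes describe, not output captured from a real system.

```sql
-- hypothetical names: demo_db, emp_set, emp_multi
CREATE SET TABLE demo_db.emp_set (
    emp_id   INTEGER,
    emp_name VARCHAR(30)
) PRIMARY INDEX (emp_id);

CREATE MULTISET TABLE demo_db.emp_multi (
    emp_id   INTEGER,
    emp_name VARCHAR(30)
) PRIMARY INDEX (emp_id);

INSERT INTO demo_db.emp_set   VALUES (1, 'asha');
INSERT INTO demo_db.emp_set   VALUES (1, 'asha');  -- exact duplicate row: expected to be rejected

INSERT INTO demo_db.emp_multi VALUES (1, 'asha');
INSERT INTO demo_db.emp_multi VALUES (1, 'asha');  -- exact duplicate row: kept as a second row
```

the primary index is deliberately non-unique here, so the only thing stopping the
second insert on emp_set is the set table's duplicate-row check.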

temporary tables in teradata:


1. derived table: the table is used only for computation within a query and it is
dropped once the execution is complete.
2. volatile table: the table is created within a user session and dropped
automatically when that session ends. syntax:
   create set|multiset volatile table tablename
   table definition
   column definition
   index definition
   on commit delete|preserve rows
3. global temporary table: the table definition is shared by many users/sessions, but
the data loaded into the table is private to a session and lasts only for that
session. a max of 2000 global temporary tables can be materialized per session.
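
the three kinds of temporary table above could look roughly like this. all table and
column names (vt_orders, demo_db.gtt_orders, big) are assumptions made up for the
example, and demo_db is a hypothetical database.

```sql
-- volatile table: exists only in the current session
CREATE MULTISET VOLATILE TABLE vt_orders (
    order_id INTEGER,
    amount   DECIMAL(10,2)
) PRIMARY INDEX (order_id)
ON COMMIT PRESERVE ROWS;   -- keep the rows after each transaction commits

-- global temporary table: the definition persists, the rows are private to each session
CREATE GLOBAL TEMPORARY TABLE demo_db.gtt_orders (
    order_id INTEGER,
    amount   DECIMAL(10,2)
) PRIMARY INDEX (order_id)
ON COMMIT PRESERVE ROWS;

-- derived table: "big" exists only while this one query runs
SELECT big.order_id, big.amount
FROM (
    SELECT order_id, amount
    FROM vt_orders
    WHERE amount > 100
) AS big;
```

with on commit delete rows instead of preserve rows, the rows would be cleared at the
end of every transaction rather than at the end of the session.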

spaces in teradata:
1. permanent space: space allocated for permanent tables, journals, fallback
tables, secondary index sub-tables.
this space is not pre-allocated for a db; it is an upper limit. the total space is
divided by the number of AMPs, so each AMP gets an equal share.

2. spool space: the unused part of the permanent space, which is used to store the
intermediate results of a query. if no spool space is available, the user cannot
execute a query.
the spool limit is also divided by the number of AMPs.

3. temp space: unused permanent space which is used by global temporary tables.
the temp limit is also divided by the number of AMPs.
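
a sketch of how the three limits are set and how the per-AMP split can be inspected.
demo_db and the byte sizes are made-up values; DBC.DiskSpaceV is the data dictionary
view that reports space per AMP.

```sql
-- hypothetical database carved out of dbc's space; sizes are in bytes
CREATE DATABASE demo_db FROM dbc AS
    PERM      = 4000000000,   -- permanent space limit
    SPOOL     = 2000000000,   -- spool space limit
    TEMPORARY = 1000000000;   -- temp space limit

-- one row per AMP (Vproc): each Max* value is that AMP's share of the limit
SELECT Vproc, MaxPerm, CurrentPerm, MaxSpool, CurrentSpool, MaxTemp, CurrentTemp
FROM DBC.DiskSpaceV
WHERE DatabaseName = 'demo_db'
ORDER BY Vproc;
```

on a 4-AMP system, for example, the 4000000000 bytes of permanent space would show up
as 1000000000 bytes of MaxPerm on each AMP.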

indexing concept:
a table can contain only one primary index.

keys and indexes in teradata, concept and difference:


1. a primary key cannot be null; a primary index can be null
2. a primary key is not mandatory; a primary index is a must
3. a primary key doesn't help in data distribution, while the primary index does.
4. a primary key must be unique; a primary index can be unique or non-unique
depending on its definition.
5. a primary key is a logical implementation, while the primary index is a physical
implementation.

scenarios:
1. neither a primary key nor a primary index has been defined: teradata checks
whether any column is declared unique and makes it the unique primary index;
otherwise the first column becomes the (non-unique) primary index.
2. the primary index has not been defined but the primary key has been defined:
teradata makes the primary key the unique primary index.
3. primary key and primary index have both been defined, but on different columns:
teradata implements the primary key as a unique secondary index.
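
the three scenarios can be tried out with definitions like these. the table and column
names are hypothetical, and the outcome in each comment is the one described in these
notes (scenario 1 in particular may depend on how the system is configured).

```sql
-- scenario 1: no primary key, no primary index
-- expected: first column (cust_id) becomes the primary index
CREATE TABLE demo_db.t_none (
    cust_id INTEGER,
    city    VARCHAR(30)
);

-- scenario 2: primary key only
-- expected: cust_id becomes the unique primary index
CREATE TABLE demo_db.t_pk (
    cust_id INTEGER NOT NULL PRIMARY KEY,
    city    VARCHAR(30)
);

-- scenario 3: primary key and primary index on different columns
-- expected: cust_id is the primary index, email becomes a unique secondary index
CREATE TABLE demo_db.t_both (
    cust_id INTEGER NOT NULL,
    email   VARCHAR(60) NOT NULL,
    PRIMARY KEY (email)
) PRIMARY INDEX (cust_id);

-- show the definition teradata actually stored, to check which index was chosen
SHOW TABLE demo_db.t_both;
```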

in short, the primary index plays a role quite similar to the primary key in other dbs.

data distribution concept in teradata:


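teradata spreads rows as evenly as possible among the amps. as a simplified picture,
if there are 4 amps, row 1 gets stored in amp #1, row 2 in amp #2, etc.; in practice
the amp is chosen by hashing, not by the row's position.
identification of where the data is stored: teradata hashes the primary index value
into a hash value (the row hash), a 32-bit number. the first 16 bits are the hash
bucket number and the remaining 16 bits are the hash remainder. the hash map then
assigns each bucket to an amp. for eg. for primary index value 24, the row hash gives
the bucket number for 24, and the hash map says which amp owns that bucket; here it
is amp #4.
teradata always computes the hash from the primary index. therefore, the primary
index value decides which amp a row is stored on and how it is found again.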
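
the hashing chain (primary index value, row hash, bucket, amp) can be looked at with
teradata's hash functions HASHROW, HASHBUCKET and HASHAMP. the table demo_db.emp_hash
and its rows are made up for this sketch; the actual hash and amp numbers depend on
the system, so none are shown.

```sql
-- hypothetical table with emp_id as the primary index
CREATE TABLE demo_db.emp_hash (
    emp_id   INTEGER,
    emp_name VARCHAR(30)
) PRIMARY INDEX (emp_id);

INSERT INTO demo_db.emp_hash VALUES (24, 'asha');
INSERT INTO demo_db.emp_hash VALUES (25, 'ravi');

-- per row: the row hash, the bucket it falls in, and the amp that owns the bucket
SELECT emp_id,
       HASHROW(emp_id)                      AS row_hash,
       HASHBUCKET(HASHROW(emp_id))          AS bucket_no,
       HASHAMP(HASHBUCKET(HASHROW(emp_id))) AS amp_no
FROM demo_db.emp_hash;

-- rows per amp: a quick check of how evenly the primary index distributes the data
SELECT HASHAMP(HASHBUCKET(HASHROW(emp_id))) AS amp_no,
       COUNT(*)                             AS row_count
FROM demo_db.emp_hash
GROUP BY 1
ORDER BY 1;
```

very uneven row counts in the second query would suggest that the chosen primary
index is a poor distribution key.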

secondary index:
an alternate path to the data.
primary index access is always faster.
secondary index access is faster than a full table scan.
there can be a max of 32 secondary indexes on a table.
every secondary index creates a sub-table on every amp, whose rows point to the row
ids of the base table rows (the row id is built from the primary index hash).
there are two types of secondary indexes: unique SI (USI) and non-unique SI (NUSI).
a USI lookup is a two-amp operation, while a NUSI lookup is an all-amp operation.

the SI sub-table:


takes up permanent space.
therefore, an SI should be created only for known queries that are going to be run
over and over again.
for a unique SI, whenever teradata finds a query having the USI column in the where
clause, it devises a plan to retrieve the row using only 2 AMPs: the USI value is
hashed to find the AMP holding the sub-table row, and that row points to the AMP
holding the base row.
for a NUSI, the sub-table contains the value of the column on which the NUSI was
created along with the row ids of the matching base rows. the NUSI sub-table rows
are AMP-local, i.e. each AMP indexes only the base rows it stores itself.
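
a sketch of creating both kinds of secondary index and checking the access path. the
table demo_db.employee, its columns and the literal values are assumptions for the
example; the explain text itself varies by system, so it is not reproduced here.

```sql
-- hypothetical base table; emp_id is the unique primary index
CREATE TABLE demo_db.employee (
    emp_id  INTEGER NOT NULL,
    email   VARCHAR(60) NOT NULL,
    dept_id INTEGER
) UNIQUE PRIMARY INDEX (emp_id);

CREATE UNIQUE INDEX (email) ON demo_db.employee;   -- USI: one base row per value
CREATE INDEX (dept_id) ON demo_db.employee;        -- NUSI: many base rows per value

-- the plan for the USI lookup is expected to be a two-AMP retrieve
EXPLAIN SELECT * FROM demo_db.employee WHERE email = 'asha@example.com';

-- the plan for the NUSI lookup is expected to be an all-AMP retrieve
EXPLAIN SELECT * FROM demo_db.employee WHERE dept_id = 10;

-- dropping an index that is no longer used gives back its sub-table's permanent space
DROP INDEX (dept_id) ON demo_db.employee;
```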
