Sydney Brenner
Molecular Sciences Institute Inc, 2168 Shattuck Avenue, Berkeley, CA 94704, USA
Jeffrey H. Miller
University of California, Los Angeles, Department of Microbiology and Molecular Genetics, 405 Hilgard
Avenue, Los Angeles, CA 90024, USA
Associate Editors
William J. Broughton
LBMPS, Université de Genève, Plant Molecular Biology, 1 Chemin de l'Impératrice, CH-1292 Chambésy,
Genève, Switzerland
Malcolm Ferguson-Smith
University of Cambridge, Department of Clinical Veterinary Medicine, Madingley Road, Cambridge, CB3
0ES, UK
Walter Fitch
University of California, Irvine, Department of Ecology and Evolutionary Biology, 321 Steinhaus Hall,
Irvine, CA 92697, USA
Nigel D. F. Grindley
Yale University, Department of Molecular Biophysics and Biochemistry, PO Box 208144, New Haven, CT
06520-8114, USA
Daniel L. Hartl
Harvard University, Department of Organismic and Evolutionary Biology, 16 Divinity Avenue,
Cambridge, MA 02138, USA
Jonathan Hodgkin
University of Oxford, Genetics Unit, Department of Biochemistry, South Parks Road, Oxford, OX1 3QU,
UK
Charles Kurland
Uppsala University, Munkarpsv. 21, SE 243 32, Höör, Sweden
Elizabeth Kutter
The Evergreen State College, Lab I, Olympia, WA 98505, USA
Terry H. Rabbitts
MRC Laboratory of Molecular Biology, Division of Protein and Nucleic Acid Chemistry, Hills Road,
Cambridge, CB2 2QH, UK
Ira Schildkraut
New England Biolabs Inc., 32 Tozer Road, Beverly, MA 01915, USA
Lee Silver
Princeton University, Department of Molecular Biology, Princeton, NJ 08544-1014, USA
Gerald R. Smith
Fred Hutchinson Cancer Research Centre, 1100 Fairview Avenue North, A1-162, Seattle, WA 98104-1024,
USA
Ronald L. Somerville
Purdue University, Department of Biochemistry, West Lafayette, IN 47907-1153, USA
Preface
In the spring of 1953, shortly after Watson and Crick's discovery of the double-helical structure of
DNA, I found myself at dinner next to the famous geneticist R.A. Fisher. When I asked him what he
thought would be that structure's implication for genetics, he replied firmly ``None!'' He may have
taken this myopic view, because to him genetics was an abstract mathematical subject with laws that
were independent of the physical nature of genes. At the time, it was an esoteric subject taught only to
a few biologists and regarded as largely irrelevant to medicine. Archibald Garrod's seminal book
Inborn Errors of Metabolism had made little impact, perhaps because no-one knew what genes were
made of or how mutations acted.
The first recognition that mutations act on proteins came in 1949 when Linus Pauling and his
collaborators published their paper on ``Sickle cell anemia, a molecular disease.'' They found that
patients suffering from this recessively inherited disease had an abnormal hemoglobin that differed
from the normal form by the elimination of two negative charges. Eight years later, Vernon Ingram
showed that this was due to the replacement of a single glutamic acid residue in each of the identical
half-molecules of hemoglobin by a valine. Ingram's discovery was the first specific evidence of the
chemical effect of a mutation. It marked the birth of molecular genetics and molecular medicine and it
started the transformation of genetics to its central position in molecular biology, biochemistry, and
medicine today. Even so, people who have studied genetics as part of their curriculum may not have
become familiar with all the hundreds of specialized terms that geneticists have coined. To them and
many others at the periphery of genetics this Encyclopedia will prove most useful.
Like the Encyclopedia Britannica it is a cross between a dictionary and a text book. Hybrids are
defined in three lines, while Jonathan Hodgkin's brilliant exposition of past and present research on
Drosophila's rival, the minute nematode worm Caenorhabditis elegans, occupies nearly six pages.
This entry is intelligible to non-specialists, but that is not true of some others that were ``Greek'' to
me.
The Encyclopedia includes biographies of many of the pioneers, from Gregor Mendel to Ernst
Mayr, the great evolutionist who actually contributed several entries. Many of the other contributors
are also pioneers, even though they are not yet old enough to figure among the biographies. Sydney
Brenner has written the entry on the genetic code which he himself helped to discover 40 years ago;
David Weatherall has written on thalassemia to whose exploration he has devoted a lifetime; Malcolm
Ferguson-Smith has written on human chromosomal anomalies on which he is the world authority;
he and some others made such varied contributions to their fields that they figure among many of the
1650-odd entries. Sydney Brenner, one of the two editors-in-chief, may be the only contributor who
does not need the Encyclopedia.
What of the future of genetics? Its applications are likely to multiply, and many of the applicants
coming from other fields will welcome the Encyclopedia. An increasing number of applications will
be in medicine. I have heard predictions that in future every newborn child will have its genes
screened and the results imprinted on a computer chip that he or she will carry for life. Recorded
on it would be all genetic anomalies, susceptibilities to diseases, and intolerance of drugs. In case of
illness or accident, that chip would activate an algorithm that automatically prescribes the correct
treatment. Unfortunately for such utopias, medicine is more complex. Weatherall has shown elsewhere that the single recessive disease thalassemia is a multiplicity of different diseases and that the
same genotype may give rise to widely different phenotypes, depending on environmental and other
factors. This complexity arises even when single point mutations do not necessarily lead to disease,
but merely to susceptibilities to disease, as in α1-antitrypsin deficiency or with certain abnormal
hemoglobins. The complexity is much greater still in multifactorial diseases like schizophrenia or
diabetes. For many reasons good medicine will continue to require wide knowledge, mature judgement, empathy, and wisdom.
There have been glib predictions that the mapping of the human genome will allow most genetic
diseases to be cured by either germline or somatic gene therapy, but the former is too risky and the
latter is proving extremely difficult and costly. The risks of germline therapy are illustrated by the
attempt to create a genetically modified monkey. Scientists injected the gene for the green fluorescent
protein from a jellyfish into 222 monkey eggs. After fertilizing them with monkey sperm, they
incubated them and implanted a pair each into the wombs of 20 surrogate monkey mothers. Only five
of the implants resulted in pregnancies. Only three monkey babies were born and only one of them
carried the jellyfish gene, but the monkey does not fluoresce, because the gene, though incorporated
into its chromosomes, fails to be expressed. It will be argued that technical improvements may lead to
a 100% success rate, but this kind of gene transfer has now been practised for some years in mice and
other animals, and it has remained a haphazard affair that would be criminal to apply to humans.
Human cloning carries similar risks.
After many failures, somatic gene therapy of a potentially fatal human genetic disease recently
succeeded for the first time. A French team cured two baby boys of severe combined immunodeficiency (SCID)-X1 disease. They infected cells extracted from the boys' bone marrow with cDNA
containing the required gene coupled to a retrovirus-derived vector, and then re-injected them
into the boys' bone marrow. The therapy restored normal immune function that was still intact 10
months later. Another interesting development was the restoration of normal function to the muscle
of a dystrophic mouse after injection of fragments of the giant muscular dystrophy gene. On the other
hand, it has so far been impossible to cure some of the most common genetic diseases: thalassemia,
sickle cell anemia, or cystic fibrosis, because it has proved extremely difficult to express the genes in
the correct place in the patients' chromosomes and get them to express the required protein in
sufficient quantities in the right tissues. If gene therapy has a bright future, it does not look as though
it is just round the corner.
Before the completion of the Human Genome Project, identification of the genes for some human
inherited diseases required truly heroic efforts. The search for the Huntington's disease gene occupied
up to a hundred people for about 10 years. The same work could now be accomplished by a few people
in a fraction of the time. This is one of the Human Genome Project's important medical benefits.
Others may be the rapid identification of promising new drug targets against diseases ranging from
high blood pressure to a variety of cancers, the epidemiology of alleles linked to susceptibility to
various diseases and improved basic understanding of human physiology and pathology.
Agriculture offers the greatest scope for applied genetics, but distrust of genetically modified foods
has blinded the public to its potential benefits and its vital importance for the avoidance of widespread
famines later in this century. Since the early 1960s, food available per head in the developing world has
increased by 20% despite a doubling of the population. This outstanding success has been achieved by
the introduction of crops improved by crossing and by intensive application of fertilizers, pesticides,
and weed-killers. Even so, there are 800 million hungry people and 185 million seriously malnourished pre-school children in the developing world. It is unlikely that the methods that have raised
cereal yields hitherto will allow them to be raised again sufficiently to reduce these distressing
numbers. Since most fertile land is already intensively cultivated, scientists are trying to introduce
genes into crops that would allow them to be grown on poorer soils and in harsher climates, and to
make existing crops more nutritious. In the tropics, fungi, bacteria, and viruses still cause huge harvest
losses. Scientists are trying to introduce genes that will confer resistance to some of these pests,
enabling farmers to use fewer pesticides. Genetically modified plants offer our best hope of feeding a
world population that is expected to double in the next 50 years. It will be tragic if the present outcry
over genetically modified foods discourages further research and development in this field. If this
Encyclopedia helps to promote better public understanding of genetics, this might be the best remedy
against irrational fears.
Max F. Perutz
MRC Laboratory of Molecular Biology
Hills Road, Cambridge
CB2 2QH
UK
Introduction
Genetics, the study of inheritance, is fundamental to all of biology. Living organisms are unique
among all natural complex systems in that they contain within their genes an internal description
encoded in the chemical text of DNA. It is this description and not the organism itself which is
handed down from generation to generation and understanding how the genes work to specify the
organism constitutes the science of genetics. Furthermore, this constancy is embedded in a vast range
of diversity, from bacteria to ourselves, all having arisen by changes in the genes. Understanding
evolution is also part of genetics, and an area which will benefit from our increasing ability to
determine the complete DNA sequences of genomes. In some sense, these sequences contain a record
of genome history and we have now learnt that many genes in our genomes can be found in other
organisms, quite unlike us. Indeed some are much the same as those found in bacteria, and can be
viewed as molecular fossils, preserved in our genomes.
Although ``like begets like'' must be one of the oldest observations of mankind, it was only in the
19th century that major scientific advances began. Charles Darwin put forward his theory of the
origin of the species by natural selection but he lacked a credible theory of the mechanism of
inheritance. He believed in blending inheritance which meant that variation would be continually
removed and he was therefore compelled to introduce variation in each generation as an inherited
acquired character. Gregor Mendel, working at the same time, discovered the laws of inheritance and
showed how the characteristics of the organism could be accounted for by factors which specified
them. Mendel's work was rediscovered in 1900 independently by Correns, de Vries and Tschermak
and soon after this, Bateson coined the term ``gene'' for the Mendelian factors and called the science,
``genetics.''
During the first 50 years of the twentieth century, there was a stream of important discoveries in
genetics. We came to understand the relation between genes and chromosomes, and the connection
between recombination maps and the physical structure of chromosomes. However, what the genes
were made of and what they did remained a mystery until 1953 when Watson and Crick proposed the
double helical structure of DNA, which at one blow unified genetics and biochemistry and ushered in
the modern era of molecular biology.
Genetics and especially the molecular approach to it is now a pervasive field covering all of biology.
In the Encyclopedia of Genetics, we have tried to draw together the many strands of what is still a
rapidly expanding field, to present a view of all of genetics. This has been a five year effort by over 700
expert authors from all around the world. We have tried to ensure that the breadth of the work has not
compromised the depth of the articles, and we hope that readers will be able to find accurate and
up-to-date information on all major topics of genetics. We have also included articles on the history of
the field as well as the impact of the applications of genetics to medicine and agriculture.
When we began this work, the sequencing of complete genomes was still in its infancy and the
sequencing of the human genome was thought to be far in the future. Technological advances and
the concentration of resources have brought this to fruition this year, and genetics is a subject very
much in the public eye. We hope that at least some of the articles will also be of value to those who are
not professional biological or medical scientists, but want to discover more about this field. Many of
the articles contain lists for further reading, and the online version of the Encyclopedia also includes
hypertext links to original articles, abstracts, source items, databases and useful websites, so that
readers can seamlessly search other appropriate literature.
We would like to acknowledge the efforts of the Associate Editors, who worked hard in
commissioning individual contributors to prepare cutting-edge articles, and who reviewed and edited
the manuscripts in a timely manner. Our thanks also go to the Publishers, Academic Press, and the
outstanding staff for their commitment, resourcefulness and creative input; and in particular Tessa
Picknett, Kate Handyside, Peronel Craddock and the production team, who helped to make this a
reality.
Sydney Brenner, Jeffrey H. Miller
Editors-in-Chief
Abbreviations
A           adenine
aa          amino acid
aa-tRNA     amino acyl-tRNA
ABC         ATP binding cassette
AD          Alzheimer's disease
ADA         adenosine deaminase
ADC         adenocarcinoma
ADH         alcohol dehydrogenase
ADP         adenosine 5'-diphosphate
AFA         acromegaloid facial appearance (syndrome)
AFB         aflatoxin B
AFP         alpha-fetoprotein
AHA         acute hemolytic anemia
AIDS        acquired immunodeficiency syndrome
AIL         advanced intercross line
Ala         alanine
ALL         acute lymphoblastic leukemia
AMCA        aminomethyl coumarin acetic acid
AMH         anti-Müllerian hormone
AML         acute myeloid leukemia
Amp         ampicillin
AMP         adenosine 5'-monophosphate
An          polyadenylation
AP          apyrimidinic (site)
APC         adenomatous polyposis coli
APO-        apolipoprotein
APOBEC      apo-B mRNA editing cytidine deaminase
APP         amyloid precursor protein
ARF         ADP-ribosylation factor
Arg         arginine
ARS         autonomous replication sequence
Asn         asparagine
ASO         allele specific oligonucleotide
Asp         aspartic acid
AT          ataxia telangiectasia
ATP         adenosine 5'-triphosphate
BAC         bacterial artificial chromosome
BER         base excision repair
BIC         Breast Cancer Information Corp
BIME        bacterial interspersed mosaic element
BMD         Becker muscular dystrophy
bp          base pair
BR          Balbiani rings
BS          Bloom's syndrome
BSE         bovine spongiform encephalopathy
BWS         Beckwith-Wiedemann syndrome
C           cytosine
C-          carboxyl-
CAF         chromatin assembly factor
cAMP        cyclic AMP
CAP         catabolite activator protein
CD          campomelic dysplasia
CD          circular dichroism
CDC
CDK
cDNA
CDS
CEN
CF
CFTR
CGH
CHO
CJD
CL
cM
CMD
CML
CMS
CoA
Col
COP
CPE
cR
CR
CRC
Cre
CREB
CRP
CT
ctDNA
ctf
CTP
CV
CVS
Cys
Da
DI
DIC
DMD
DMD
DMEM
DMI
DMSO
dN
DNA
DNP
dNTP
ds
DSBR
EBN
EBV
ECM
EDS
EF
EGF
ELISA
EMBL
EN          early nodule
EN          endonuclease
ENU         N-ethyl-N-nitrosourea
EP          early promoter
ER          endoplasmic reticulum
eRF         eukaryal release factor
ES          embryonic stem (cells)
ESI         electrospray ionization
ESS         evolutionarily stable strategy
EST         expressed sequence tagged
F-factor    fertility-factor
FA          fluctuating asymmetry
FACS        fluorescence-activated cell sorter
FAD         flavin-adenine dinucleotide
FADH2       reduced FAD
FAK         focal adhesion kinase
FAP         familial adenoma polyposis
FDS         first-division segregation
FFI         familial fatal insomnia
FGF         fibroblast growth factor
FH          familial hypercholesterolemia
FIGE        field inversion gel electrophoresis
FISH        fluorescent in situ hybridization
FITC        fluorescein isothiocyanate
FRDA        Friedreich's ataxia
FRET        fluorescence energy resonance transfer
FSH         follicle stimulating hormone
FSHMD       facioscapulohumeral muscular dystrophy
G           guanine
G-6-P       glucose-6-phosphate
G-banding   Giemsa-banding
G-proteins  GTP-binding proteins
GAP         GTPase activating protein
GDB         genome database
GDP         guanosine diphosphate
GEF         guanine nucleotide exchange factor
GF          growth factor
GFP         green fluorescent protein
GIST        gastrointestinal stromal tumors
Glu         glutamic acid
Gly         glycine
gp          glycoprotein
GPCR        G-protein-coupled receptor
gRNA        guide RNA
GSD         Gerstmann-Sträussler disease
GSS         Gerstmann-Sträussler-Scheinker syndrome
GTP         guanosine triphosphate
HA          hemagglutinin
HAT         histone acetyl transferase
Hb          hemoglobin
hCG         human chorionic gonadotrophin
HCL         hairy cell leukemia
HDAC        histone deacetylase
HDGS        homology-dependent gene silencing
HDL         high-density lipoprotein
Hfr         high frequency recombination
HGMD        Human Gene Mutation Database
HGMP
His
HIV
HLA
HLH
HMC
HMG
HMW
HNPCC
hnRNP
HPLC
HPV
hsp
HTH
HTLV
HV-I, HV-II
HW equilibrium
IBD
IBS
ICM
ICSI
IES
IF
Ig
IGF
IHF
IL
ILAR
Ile
IMAC
IN
INR
IPTG
IS
ISH
ISR
IVF
IVS
kb
KL
KO
KSS
Lac
LBC
LCR
LD
LDL
Leu
LH
LHSI/II
LINE
LMC
LMW
LOD
LOH
Lox         locus of X-over
LPS         lipopolysaccharide
LRC         local resource competition
LRE         local resource enhancement
LRR         leucine-rich repeat
LTR         long terminal repeat
Lys         lysine
m-BCR       minor breakpoint cluster region
M-BCR       major breakpoint cluster region
M-phase     meiosis/mitosis phase
MAP         microtubule associated protein
MAP         mitogen-activated protein
MAPK        mitogen-activated protein kinase
MAR         matrix-attached region
Mb          megabase
MBP         myelin-based protein
MC'F        micro-complement fixation
MCR         mutation cluster region
MCS         multiple cloning site
MDS         myelodysplastic syndrome
Mel         maternal-effect embryonic lethal
MELAS       mitochondrial encephalomyopathy, lactic acidosis, and stroke-like symptoms
MEN         multiple endocrine neoplasia
MERRF       myoclonus epilepsy with ragged red fibers
Met         methionine
MFH         malignant fibrous histiocytoma
MFS         Marfan syndrome
MGD         Mouse Genome Database
MGF         mast cell growth factor
MGI         Mouse Genome Informatics
MHC         major histocompatibility complex
MIC         minimum inhibitory concentration
Mis         Müllerian-inhibiting substance
MLP         major late promoter
MLS         myxoid liposarcoma
mM          millimolar
MMC         maternally inherited myopathy and cardiomyopathy
MMR         mismatch repair
MOI         multiplicity of infection
MPS         mucopolysaccharidosis
MRCA        most recent common ancestor
MRD         minimal residual disease
mRNA        messenger RNA
MS          mass spectroscopy
MSI         microsatellite instability
mtDNA       mitochondrial DNA
Mu element  mutator element
N-          amino-
NAD         nicotinamide adenine dinucleotide
NADH        reduced NAD
NADP        nicotinamide adenine dinucleotide phosphate
NAP         nucleosome assembly protein
NBU         nonreplication Bacteroides units
ncRNA       noncoding RNA
NCS         nonchromosomal striped
NER         nucleotide excision repair
NHL         non-Hodgkin lymphoma
NK          natural killer (cells)
NMD         nonsense-mediated mRNA decay
NMR         nuclear magnetic resonance
NOR         nucleolus organizing region
NOS         nitric oxide synthetase
NPC         nasopharyngeal carcinoma
NPC         nuclear pore complex
NR          nuclear reorganization
nt          nucleotide
OI          osteogenesis imperfecta
OMIM        Online Mendelian Inheritance in Man
OPMD        oculopharyngeal muscular dystrophy
ORC         origin of replication complex
ORF         open reading frame
Ori         origin (of replication)
Ori T       origin of transfer
OTU         operational taxonomic unit
PAA         propionic acidemia
PAC         P1 artificial chromosome (vector)
PAC         prostate adenocarcinoma
PAGE        polyacrylamide gel electrophoresis
PAI         pathogenicity islands
PAPP        pregnancy-associated plasma protein
PAR         pseudoautosomal region
Pax         paired box-containing genes
PBP         penicillin binding protein
Pc          polycomb
PCO         polycystic ovarian (disease)
PCR         polymerase chain reaction
PCT         plasmacytoma
PDGF        platelet derived growth factor
PE          phosphatidylethanolamine
PEP         phosphoenolpyruvate
PEV         position effect variegation
PFGE        pulsed-field gel electrophoresis
PGPR        plant growth-promoting Rhizobacteria
PH          pleckstrin homology
PH          polyhedron
Phe         phenylalanine
Pi          inorganic phosphate
PI          phosphatidylinositol
PIP2        phosphatidylinositol-4,5-bisphosphate
PKA         protein kinase A
PKU         phenylketonuria
PMF         proton motive force
PMS         postmeiotic segregation
Pol         polymerase
PR          protease
Pro         proline
PS          phosphatidylserine
PS I/II     photosystem I/II
PTC         premature termination codon
PTGS        posttranscriptional gene silencing
Q-(banding) quinacrine
QTL
R
R-plasmids
RA
Ram
RB
RCC
RCL
RCS
rDNA
RDR
RED
RER
REV
RF
RF
RFLP
RI
Rif
RIM
RIP
RM
RN
RNA
RNAi
RNP
ROS
RRF
rRNA
RSS
RSV
RT
RTK
S-phase
SAR
SAR
SBT
SC
SCE
SCF
scRNA
SDP
SDS
SDS-PAGE
SEM
sen DNA
Ser
SER
SF
SH (domains)
SI
SINE
SIV
SL
SLF
snoRNA
SNP
snRNA
snRNP
SOD
SRE
SRP
SRY
ss
SSB
SSLP
SSR
STR
STS
su
SU
SV40
T
Taq
TBP
TCA
TCR
TDF
TE
TEL
TEM
Ter
Tet
TF
TGF
TGS
Thr
TIM
TIMPS
TIR
TK
Tm
TM
TMV
TN
TNF
TOM
topo
Topt
TRD
TRiC
TRITC
tRNA
Trp
TSD
TSE
TSG
TSS
Tyr
U
Ub
UPD
URF
USS
UTI
UTP
UTR
UV
V gene
Val
VEGF
VHL
VNTR
VWF
WHO
WS
WT
WT1
XIC
Xist
XP
YAC
ZP
A
(A)n tail
See: Poly(A) Tail
A DNA
See: DNA, History of; DNA Structure
Abortive Transduction
E A Raleigh
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0001
Acentric Fragment
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1750
Achondroplasia
R Savarirayan, V Cormier-Daire, and
D L Rimoin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0002
Achondroplasia is the most common form of disproportionate short stature (dwarfism) with an estimated
incidence of 1 per 20 000 to 30 000 live births. This type
of dwarfism has been recognized for more than 4000
years, and can be seen depicted in many ancient statues
and drawings. Achondroplasia is inherited as an
autosomal dominant trait with approximately 75%
of cases representing new dominant mutations. The
molecular defects underlying achondroplasia have
recently been elucidated, and comprise heterozygous
mutations in the fibroblast growth factor receptor 3
(FGFR3) gene located on the short arm of chromosome 4. This gene encodes a tyrosine kinase cell
surface receptor, and one specific gain-of-function
mutation (G1138A), resulting in a glycine to arginine
substitution in the transmembrane domain of FGFR3,
is responsible for the vast majority (approximately
98%) of cases, and is the most common known mutation in humans.
Diagnosis of achondroplasia is usually made at or
around birth, based on the typical appearance of these
infants comprising: disproportionate short stature
with short limbs, especially the most proximal (rhizomelic) segments, redundant folds of skin overlying the
high cervical myelopathy, central apnea, or profound
hypotonia and motor delay and may, in some instances, require decompressive neurosurgery. Other potential complications in infancy include significant nasal
obstruction that may lead to sleep apnea in a minority
(5%) of cases, development of a thoracolumbar kyphosis, which usually resolves upon weight-bearing,
and hydrocephalus in a small proportion of cases
(1%) during the first 2 years of life, which may require
shunting. From early childhood, and as the child
begins to walk, several orthopedic manifestations
may evolve including progressive bowing of the legs
due to fibular overgrowth, development of lumbar
lordosis, and hip flexion contractures. Recurrent ear
infections with ensuing chronic serous otitis media are
common complications at this time and may lead to
conductive hearing loss with consequent delayed
speech and language development. The older child
with achondroplasia commonly develops dental malocclusion secondary to a disproportionate cranial base
with subsequent crowding of teeth and crossbite.
The main potential medical complication of the adult
with achondroplasia is lumbar spinal canal stenosis,
with impingement on the spinal cord roots. This
complication may be manifested by lower limb pain
and paresthesiae, bladder or bowel dysfunction, and
neurological signs and may require decompressive
surgery.
Throughout their lives, some people with achondroplasia may experience a variety of psychosocial
challenges. These can be addressed by specialized
medical and social support of the individual and
family, appropriate anticipatory guidance and by
interaction with patient support groups such as the
``Little People of America.''
See also: Genetic Diseases
Acquired Resistance
See: Systemic Acquired Resistance (SAR)
Acridines
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0003
Acrocentric Chromosome
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1751
Acrosome
G S Kopf
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0004
The mechanisms by which these acrosomal components are targeted to this organelle during biogenesis
are also not known. Although spermatogenic cells
possess functional mannose-6-phosphate/insulin-like
growth factor II receptors, it is not clear whether these
receptors play a role in the transport of glycoproteins
to the acrosome or whether targeting occurs primarily
through the `default' pathway seen in the transport of
proteins in other secretory systems. Finally, once these
components are packaged into the acrosome, the functional significance of additional processing of these
components (i.e., posttranslational modifications;
movement within the organelle) during sperm residence in the testis and/or during residence in the extratesticular male reproductive organs (i.e., epididymis;
vas deferens) is not clear.
In some species (e.g., guinea pig, mouse), the formation of specific protein domains within the acrosome
has been clearly demonstrated, but the mechanism by
which this compartmentalization is established is
poorly understood and an understanding of the biological role of this compartmentalization is only starting
to be realized. Answers to all of these questions will no
doubt become apparent when a systematic evaluation
of the proteins comprising the acrosome is undertaken
with respect to transcription, translation, and posttranslational modifications. An understanding of these
processes may greatly further our knowledge of the
role of the acrosome in fertilization since it is becoming
apparent that this secretory vesicle may have multiple
functions (see below). It should also be noted that
individuals whose sperm have poorly formed acrosomes or lack acrosomes altogether display infertility;
this speaks to the importance of this organelle in
the normal fertilization process. In any event, studies
focused on the synthesis and processing of acrosomal
components should be considered in the context of the
acrosome functioning as a secretory granule and not a
modified lysosome, as has been historically suggested.
Although the fusion of the plasma membrane overlying the acrosome and the outer acrosomal membrane constitutes the acrosome reaction, it must be
emphasized that this process is very complex and
likely involves many of the steps constituting regulated exocytotic processes in other cell types. Such
steps might include membrane priming, docking, and
fusion. Therefore, this process can also be referred to
as acrosomal exocytosis. Recent data support the idea
that sperm capacitation, an extratesticular maturational process that normally occurs in the female
reproductive tract and confers fertilization competence to the sperm, may comprise signal transduction
events that ready the plasma and outer acrosomal
membranes for subsequent fusion during the process
of acrosomal exocytosis. Acrosomal exocytosis is
Active Site
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1752
Adaptive Landscapes
M B Cruzan
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0006
Overview
The genetic determination of fitness is complex, involving a large number of loci with numerous interactions.
In 1932 Sewall Wright depicted this myriad of effects
as a two-dimensional view of peaks and valleys that
represented fitness levels of multilocus genotypes
(Figure 1A). In this version of an adaptive landscape (a
gene combination landscape), the horizontal and vertical axes represent genetic dimensions, and fitness
(selective value) is indicated by contours (lines representing elevation differences as found on a topographic
map). As envisioned by Wright, a gene combination
landscape could consist of many thousands of peaks
of various elevations separated by valleys and saddles.
Individual genotypes are represented by single points,
and populations as clouds of points that are typically
found on or near an adaptive peak. Adaptive evolution
translates into local hill climbing, and shifts to higher
peaks can only occur through fitness reductions as
populations traverse valleys or saddles. The rugged
genetic topography is due to the prevalence of genetic
interactions such that many different gene combinations can produce high-fitness phenotypes. The paradigm of an adaptive landscape is a key element of
Wright's Shifting Balance Theory of evolution, whereby species undergo shifts among fitness peaks.
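The hill-climbing picture above can be made concrete with a toy model. The following is only an illustrative sketch, not Wright's formulation: the two-peak fitness function, the peak positions and heights, and the names fitness and hill_climb are all assumptions introduced here. It shows a population that starts near the lower peak climbing to that peak and stopping there, because every neighbouring step would reduce fitness.

```python
import math


def fitness(x, y):
    """Hypothetical surface: a low peak near (1, 1) and a higher peak near (4, 4)."""
    low = 1.0 * math.exp(-((x - 1) ** 2 + (y - 1) ** 2))
    high = 2.0 * math.exp(-((x - 4) ** 2 + (y - 4) ** 2))
    return low + high


def hill_climb(x, y, step=0.1, max_steps=1000):
    """Move to the fittest neighbouring point until no neighbour is fitter."""
    for _ in range(max_steps):
        neighbours = [(x + dx, y + dy)
                      for dx in (-step, 0.0, step)
                      for dy in (-step, 0.0, step)]
        best = max(neighbours, key=lambda p: fitness(*p))
        if fitness(*best) <= fitness(x, y):
            break  # local peak: any move would lower (or not raise) fitness
        x, y = best
    return x, y


# A start near the low peak ends on the low peak; a start near the
# high peak ends on the high peak. Neither crosses the valley between them.
low_end = hill_climb(0.5, 0.5)
high_end = hill_climb(3.5, 3.5)
print(low_end, fitness(*low_end))
print(high_end, fitness(*high_end))
```

In this sketch a shift from the lower to the higher peak would require accepting steps of reduced fitness, which the purely uphill rule never does.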
Phenotypic Landscapes
Holey Landscapes
When Wright developed the fitness surface metaphor,
his ability to characterize a genotypic space with a
large number of dimensions was hampered by the
availability of appropriate analytical tools. In recent
[Figure 1 image not reproduced. Panels (A) and (D) are drawn over genotype space; panel (B) has axes Locus 1 and Locus 2; panel (C) has axes Character 1 and Character 2.]
Figure 1 Adaptive landscapes. In each case increasing elevation on the three-dimensional surface is equivalent to
higher fitness. The fitness contours on the floor of each graph represent two-dimensional projections of the adaptive
surface. Four different versions of the adaptive landscape are depicted: (A) A portion of a gene combination landscape
features a rugged topography and axes that correspond to a multidimensional genotype space. (B) In the gene
frequency landscape only two loci are considered. In this case there are two fitness peaks separated by a saddle. (C)
An example of a phenotypic landscape displays a ridge of equal fitness produced by different combinations of values
for two quantitative traits. (D) Holey adaptive landscapes are characterized by networks of equal fitness perforated
by regions of low adaptive value (holes). The actual genotype space consists of a large number of dimensions, so the
graphical representation shown here is a rough approximation.
and divergence of populations by small steps without
the necessity of crossing valleys or saddles.
Future Prospects
As both metaphors and analytical constructs, adaptive
landscapes will continue to be useful tools for understanding evolutionary processes from both theoretical
and empirical perspectives. The theory associated with
fitness surfaces has made substantial advances in recent
years, but empirical evidence supporting the topographies proposed in these models is sparse. The recent
development of multidimensional models of adaptive landscapes, with their more concise predictions
concerning the genetic determination of mating
barriers between divergent populations and taxa, provides new foci for empirical investigations and an
opportunity to refine our understanding of evolutionary processes.
Further Reading
References
Adaptor Hypothesis
S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0007
The additive genetic variance is, in effect, the component of the total genetic variance that can be
explained by genes within genotypes. If some character is determined by the genes at a single locus, and has
measurement 3 for A1A1 individuals, 4 for A1A2 individuals, and 5 for A2A2 individuals, then all the variation in the value of this character can be explained by
genes, with an A2 gene contributing an additive component of 1 compared to an A1 gene. Here the additive
genetic variance comprises all the genetic variance. If
the character has measurement 3 for A1A1 individuals,
4 for A1A2 individuals, and 3 for A2A2 individuals,
and A1 and A2 are equally frequent, then none of the
variation can be explained by genes, and the additive
genetic variance is zero. Usually a situation intermediate between these extremes holds.
A better expression for the additive genetic variance is the `genic variance,' that is the component of
the variance attributable to genes, but the expression
`additive genetic' is entrenched in the literature.
The adjective `additive' is explained by the definition
of this variance through a least-squares procedure.
Suppose that some character is determined by the
genes at a single locus, with individuals of genotypes A1A1, A1A2, and A2A2 having respective
measurement values $m_{11}$, $m_{12}$, and $m_{22}$ for this character. If the population frequencies of these three
genotypes are $P_{11}$, $2P_{12}$, and $P_{22}$, then the population mean for this measurement is
\[ \bar{m} = P_{11} m_{11} + 2P_{12} m_{12} + P_{22} m_{22} \]
and the population variance in the character is
\[ \sigma^2 = P_{11}(m_{11} - \bar{m})^2 + 2P_{12}(m_{12} - \bar{m})^2 + P_{22}(m_{22} - \bar{m})^2 . \]
The additive genetic variance is
found by finding two parameters, $\alpha_1$ (for the allele
A1) and $\alpha_2$ (for the allele A2), which minimize the
quadratic function $Q$, defined by
\[ Q = P_{11}(m_{11} - \bar{m} - 2\alpha_1)^2 + 2P_{12}(m_{12} - \bar{m} - \alpha_1 - \alpha_2)^2 + P_{22}(m_{22} - \bar{m} - 2\alpha_2)^2 , \]
the minimization being subject to the constraint
\[ (P_{11} + P_{12})\alpha_1 + (P_{12} + P_{22})\alpha_2 = 0 . \]
This explains the
adjective `additive,' since the additive genetic variance
is that portion of the total variance explained by fitting
additive parameters associated with the genes at the
locus in question. If Q can be reduced to zero by this
additive fitting of the values, then all the genetic
variance is additive genetic.
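The least-squares fit can be checked numerically on the two small examples given earlier in this entry (measurements 3, 4, 5 and 3, 4, 3). The following is a minimal sketch, not a definitive implementation: the function name and the returned keys are hypothetical, and the genotype frequencies 1/4, 1/2, 1/4 used for both examples are an assumption (the text only states that the alleles are equally frequent in the second case). It relies on the fact that the constrained minimization is equivalent to a frequency-weighted regression of the measurement on the number of A2 alleles carried.

```python
def additive_genetic_variance(P11, P12, P22, m11, m12, m22):
    """Split the genetic variance at one two-allele locus into additive and
    dominance parts. Genotype frequencies are P11, 2*P12, P22 (as in the text)."""
    freqs = (P11, 2.0 * P12, P22)
    values = (m11, m12, m22)
    a2_counts = (0, 1, 2)  # number of A2 alleles in A1A1, A1A2, A2A2

    mean = sum(f * v for f, v in zip(freqs, values))
    total_var = sum(f * (v - mean) ** 2 for f, v in zip(freqs, values))

    # Frequency-weighted regression of measurement on A2 allele count.
    mean_count = sum(f * x for f, x in zip(freqs, a2_counts))
    var_count = sum(f * (x - mean_count) ** 2 for f, x in zip(freqs, a2_counts))
    cov = sum(f * (x - mean_count) * (v - mean)
              for f, x, v in zip(freqs, a2_counts, values))
    slope = cov / var_count if var_count > 0 else 0.0

    # Average effects satisfying (P11 + P12)*a1 + (P12 + P22)*a2 = 0.
    p1, p2 = P11 + P12, P12 + P22
    alpha1, alpha2 = -p2 * slope, p1 * slope

    additive = slope * cov  # variance explained by the additive fit
    return {"mean": mean, "total": total_var, "additive": additive,
            "dominance": total_var - additive,
            "alpha1": alpha1, "alpha2": alpha2}


# Measurements 3, 4, 5: the fit is exact, so all variance is additive.
example_additive = additive_genetic_variance(0.25, 0.25, 0.25, 3, 4, 5)
# Measurements 3, 4, 3 with equally frequent alleles: nothing is additive.
example_dominant = additive_genetic_variance(0.25, 0.25, 0.25, 3, 4, 3)
```

In the first example the fitted effects differ by 1 (an A2 gene adds 1 relative to an A1 gene) and the residual is zero; in the second the additive component vanishes and the whole variance is left over as the unexplained part.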
When many alleles are possible at the locus in
question, the additive genetic variance is found by a
direct extension of the above procedure. In both the
two- and many-allele cases, any variance not explained
by fitting additive parameters, that is the difference
between the total genetic variance and the additive
component, is called the dominance variance. This is
usually denoted $\sigma_D^2$, while the additive genetic variance is denoted by $\sigma_A^2$ (or by $V_A$). In both the two- and
multiallele cases, the fitted additive parameters ($\alpha$ quantities) are called the average
effects of the respective alleles. Because of their use
variance to zero at any equilibrium point of the evolutionary process.
See also: Adaptive Landscapes; Fitness;
Fundamental Theorem of Natural Selection;
Linkage Disequilibrium
Adenine
E J Murgola
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0008
Figure 1 Adenine.
Adenocarcinomas
A F Gazdar and A Maitra
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1540
Adenocarcinomas (ADCs) are defined as malignant epithelial tumors with glandular differentiation or mucin production by the tumor cells. Their
benign counterparts are known as adenomas. ADCs
constitute a subgroup of a broader category of epithelial tumors known as carcinomas. Carcinomas are the
most prevalent form of human tumors (approximately
80% of all noncutaneous malignancies) and ADCs
are the commonest form of carcinomas. Carcinomas
may arise in virtually any organ that contains glandular or secretory epithelium, and the most frequent sites
include lung, kidney, gastrointestinal tract, breast, and
prostate. In some organs, such as colorectum, breast,
and kidney, almost all of the carcinomas are ADCs,
while in other organs such as lung, only a portion
of the carcinomas are ADCs. As with other epithelial malignancies, ADCs are usually preceded by a
series of histopathologically identifiable preneoplastic
lesions. Molecular changes can usually be detected
during the lengthy preneoplastic process, and may be
present in histologically normal appearing epithelium.
Because ADCs may arise from multiple diverse structures and organs, their molecular pathogenesis varies
considerably. In this section we will briefly discuss
some aspects of the molecular genetics of ADCs arising in the common organ sites.
Colorectal Adenocarcinomas
There are two major genetic pathways by which colorectal adenocarcinomas (CRCs) arise: the ``suppressor'' and the ``mutator'' pathway. The ``suppressor''
phenotype is epitomized by the adenoma-carcinoma
sequence, where progressive accumulation of mutations in dominant tumor-promoting oncogenes and
recessive tumor suppressor genes (TSGs) results in
transformation of a benign adenoma into an ADC.
One of the earliest changes in this sequence involves
inactivation of the Adenomatous Polyposis Coli (APC)
gene on chromosome 5q21. Germline APC mutations
are responsible for the inherited polyposis syndrome
familial adenomatous polyposis or FAP (see Table 1).
While FAP is a relatively uncommon cause of CRC,
inactivation of the APC gene has been found in the
vast majority of sporadic tumors as well. APC gene
mutations are seen in aberrant crypt foci, considered
the first histologic manifestation of disordered epithelial proliferation in the colon, and up to 80% of sporadic adenomas and CRCs. Although the traditionally
proposed ``two-hit'' pathway of loss of APC function
in CRCs has been deletion of one parental allele followed by mutational inactivation of the second, recent
data suggest that hypermethylation of the APC promoter site is an equally important epigenetic mechanism of inactivation, especially in sporadic tumors. The
second consistent genetic aberration in the ``suppressor'' phenotype involves activating point mutations of
the K-ras oncogene, located on chromosome 12p12.
Table 1 [columns: Clinical features, Affected gene; rows: Peutz-Jeghers syndrome, Juvenile polyposis, Cowden syndrome]
K-ras encodes a 21-kDa protein involved in GTP
signal transduction, which controls cellular proliferation and differentiation. K-ras mutations are seen in
half of adenomas >1 cm in size and an equal number of
CRCs, but much less frequently in smaller adenomas,
suggesting that these changes are preceded by APC
gene mutations in most cases. Additional genetic
alterations in the adenoma-carcinoma sequence are
usually seen within the larger late-stage adenomas or
only at the carcinoma stage. For example, loss of
heterozygosity (LOH) at the Deleted in Colon Carcinoma (DCC) gene locus on chromosome 18q is found
in approximately 70% of CRCs and 50% of late-stage
adenomas, but only in 10% of smaller adenomas. Similarly, LOH at 17p13, the p53 TSG locus, is seen in 75%
of CRCs, but infrequently in any adenomas, including
the late-stage ones. The majority of CRCs with LOH
of one p53 allele demonstrate missense mutation of
the remaining allele. To summarize, the ``suppressor''
phenotype of CRCs is characterized by a defined
sequence of genetic alterations, which begins in histologically benign epithelium with inactivation of the
APC gene, and proceeds through stepwise accumulation of mutations involving K-ras, DCC, p53, and
putative TSGs on other chromosomes.
In contrast, the ``mutator'' phenotype is characterized by microsatellite instability (MSI). Microsatellites are short, simple repetitive DNA sequences of
mono-, di-, tri-, or tetranucleotides dispersed throughout the human genome and are by nature highly polymorphic. CRCs developing along this pathway are
initiated by an inherited or somatic mutation within
one or more DNA mismatch repair (MMR) genes.
Unlike classic TSGs (or ``gatekeepers''), MMR genes
are ``caretakers'' of the genome, responsible for correcting spontaneous slippage-induced errors during
DNA replication. Inactivating mutations in the
MMR genes facilitate further mutations in cancer
causing genes or additional MMR genes, resulting in
tremendous genetic instability and a fertile soil for
Breast Adenocarcinomas
There are two principal subtypes of breast ADCs:
infiltrating ductal and infiltrating lobular carcinomas.
In general, the clinical features and underlying genetic
abnormalities of these two subtypes are similar. Most
breast carcinomas arise on a backdrop of premalignant
breast disease, seen as a histologic continuum that begins with usual ductal or lobular hyperplasia, proceeds
through atypical hyperplasias and carcinoma in situ, and culminates in invasive tumors. Approximately 10% of breast cancers show a strong familial
tendency, appearing in a younger subset of women
than the general population. Linkage studies in these
families have led to the isolation of two breast cancer
susceptibility genes BRCA1 (for BReast CAncer 1)
on chromosome 17q21 and BRCA2 on chromosome
13q12. Women with germline BRCA1 mutations have
an 85-90% lifetime risk of developing breast cancers
and about 33% risk of developing ovarian cancers.
Until recently, the role of BRCA1/2 genes in sporadic
breast cancers was unclear since mutations of these
genes were not detected outside the context of familial
cases. New data suggest that a subset of sporadic
breast cancers with LOH at 17q21 undergo complete
abrogation of BRCA1 function via promoter hypermethylation of the second allele. It is likely that
methylation will emerge as a common mechanism of
inactivation in most sporadic cancers, just as intragenic deletions and mutations play the predominant
role in their familial counterparts. Other genetic loci
important in breast cancer development include p53
and the ataxia telangiectasia gene, ATM, on chromosome 11q23. Patients with germline p53 mutations
(Li-Fraumeni syndrome) have an increased risk of
developing breast cancers, and 50% of sporadic cases
contain a mutated p53 gene. Similarly, patients with
germline ATM mutations are also predisposed to
breast cancers and 40% of sporadic tumors demonstrate LOH at the ATM locus. As with BRCA1, ATM
mutations have not been detected in sporadic tumors
and it remains to be determined what frequency of
these cases will demonstrate promoter hypermethylation. Besides TSGs, one growth-promoting oncogene
that has received considerable attention in recent
clinical trials has been c-erbB2 or Her-2/neu. Her-2/neu shares considerable homology with the epidermal
growth factor receptor, and encodes a transmembrane
glycoprotein with intracellular tyrosine kinase activity. Approximately 20-30% of breast cancers demonstrate amplification of Her-2/neu, and there is strong
correlation between amplification and proliferation
indices. The use of Herceptin, a monoclonal antibody
to Her-2/neu, in patients with advanced stage breast
cancers that overexpress the oncogene, has been
shown to prolong survival and delay disease progression, and represents one of the best-known
examples of translational cancer research in modern
times.
Prostatic Adenocarcinomas
The incidence of prostatic adenocarcinomas (PAC)
has been increasing in recent times, primarily due to
the better use of screening and detection techniques,
such as the prostate specific antigen assay. More than
70% of men above the age of 80 years will harbor foci
of PAC, such that, conceptually, malignancy can
almost be called a ``physiologic inevitability'' at this
age. Androgens have at least a permissive role in the
genesis of these tumors, because neoplastic epithelial
cells possess androgen receptors, and androgen deprivation can lead to regression of PAC in many cases. In
approximately 10% of patients, the development of
PAC has been linked to inheritance of putative susceptibility genes. Linkage analysis from these cohorts
has narrowed the locus for one-third of familial
cases to the hereditary prostate cancer 1 or HPC1
region on chromosome 1q24-25. The identity of the
implicated gene within the HPC1 region remains to be
determined although several candidate genes have
been isolated. Allelic deletions of the short arm of
chromosome 8 are found in 80-100% of sporadic
PACs, and 30-65% of their precursor lesion, prostatic
intraepithelial neoplasia. No single gene has been
identified as the target of intragenic deletions in these
cases and the search for a causal TSG continues.
Approximately one-fourth of PACs demonstrate
inactivation of the PTEN/MMAC gene on chromosome 10q23, and preliminary studies have suggested
that loss of PTEN/MMAC function may be associated
with disease progression and metastasis.
Summary
In Table 2, we have summarized the most frequently
implicated genetic alterations in ADCs at various
other anatomic sites.
Table 2 [columns: Organ, Genetic abnormality; rows: Pancreas, Kidney, Stomach, Gallbladder; each genetic abnormality entry begins "Inactivation of" and is truncated]
It is intriguing that despite the
defining morphologic pattern common to all ADCs
(i.e., the formation of neoplastic glands), each of these
tumors has a subset of distinct molecular abnormalities that is organ-specific, and probably represents the
endpoint of a complex interplay between genetic and
environmental factors. With the increasing integration
of molecular approaches into the diagnosis and treatment of cancers, it is possible that the future will see
ADCs being classified not by the organ they arise in
but rather by their unique and clinically relevant
``molecular profiles.''
Further Reading
Baylin SB, Herman JG, Graff JR, Vertino PM and Issa JP (1998)
Alterations in DNA methylation: a fundamental aspect of
neoplasia. Advances in Cancer Research 72: 141-196.
Blackwood MA and Weber BL (1998) BRCA1 and BRCA2: from
molecular genetics to clinical medicine. Journal of Clinical
Oncology 16: 1969-1977.
Dong JT, Isaacs WB and Isaacs JT (1997) Molecular advances in
prostate cancer. Current Opinion in Oncology 9: 101-107.
Fearon ER and Vogelstein B (1990) A genetic model of colorectal tumorigenesis. Cell 61: 759-767.
Marra G and Boland CR (1995) Hereditary nonpolyposis colorectal cancer: the syndrome, the genes, and historical perspectives. Journal of the National Cancer Institute 87: 1114-1125.
Sekido Y, Fong KM and Minna JD (1998) Progress in understanding the molecular pathogenesis of human lung cancer.
Biochimica et Biophysica Acta 1378: F21-59.
Adenoma
D Harrison
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1541
Further Reading
Klatt EC and Kumar V (2000) Robbins Review of Pathology. Philadelphia, PA: W.B. Saunders.
Adenomatous Polyposis Coli
A H Wyllie
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1542
Molecular Organization
The human gene was originally described as encoding
a 2843-amino-acid protein, but several isoforms are
now known to exist, arising from alternative splicing,
an additional exon (10A), and further exons 5′ to exon
1 (four have been described, three including translation start codons). There is some evidence for tissue
specificity in the expression of these isoforms. An
unusual feature in the genomic organization is the exceptionally long 15th (terminal) exon, which encodes
more than two-thirds of the entire protein. The
N-terminal portion of the protein includes an oligomerization domain (corresponding to codons 1-171)
and seven tandem armadillo repeats (453-767). The
central region encodes three incomplete 15-aa repeats
and seven incomplete 20-aa repeats (1014-2130). In
the C-terminal third is a basic domain and a terminal
T/SXV motif. The functional significance of the oligomerization and armadillo repeat domains is not
yet completely clear, but in the central region the 15- and 20-amino-acid repeats bind β-catenin. Three of
the 20-amino-acid incomplete repeats contain GSK3β
phosphorylation sites and axin (or its homolog
conductin) is bound by three closely adjacent
SAMP-motif sequences. There are also both nuclear
localizing and nuclear export signals in this central
region. The basic region, and possibly other regions
N-terminal to it, binds microtubules, whilst the
C-terminal 170 amino acids include a binding site
for the microtubule-associated protein EB-1. The
C-terminal T/SXV motif binds the human homolog
of the Drosophila discs large protein (hDLG), a member of the membrane-associated guanylate kinase
(MAGUK) family. There is at least one caspase-3
site, at Asp777, which is cleaved in apoptosis.
Homologs
Closely similar homologs of human APC are present
in many species, and there is a second family member
(APC-2) located on human chromosome 19p13.3. All
of these include versions of the oligomerization and
armadillo repeat domains and the β-catenin- and axin-binding sites, but the terminal domains, including the
Dlg binding site, are absent in both APC-2 and the
Drosophila homologs. The expression of APC-2 appears more restricted than that of APC, being largely
in the central nervous system.
Function
The best-understood functions of APC relate to its
interaction with β-catenin and axin. Binding to
β-catenin through the 15-amino-acid repeats is constitutive, but binding through the third, fourth, and
seventh 20-amino-acid repeats is dependent on their
phosphorylation by the serine-threonine kinase
GSK-3β. GSK-3β also phosphorylates β-catenin, targeting it for ubiquitination and proteasomal destruction.
APC catalyzes this reaction, through forming a multimeric complex with axin/conductin, β-catenin, and
GSK-3β. Hence APC plays a major role in destabilizing β-catenin. This has profound effects on the cell, as it
prevents entry of β-catenin to the nucleus, where, with
tcf as a partner, it acts as a heterodimeric transcription
factor. Amongst the proteins known to be transactivated by β-catenin/tcf are the immediate-early DNA
replicative proteins c-myc and c-jun, the cell cycle protein cyclin D1, the epithelial growth factor gastrin, and
the extracellular protease matrilysin. APC appears to
play an additional role in reducing the effective intranuclear concentration of β-catenin by exporting it
from the nucleus. By both these mechanisms, therefore, APC can interrupt pathways in which β-catenin
activates transcription and hence cell division.
One such pathway is associated with cell stimulation by Wnt-1. This paracrine growth factor binds to
the seven-span membrane receptor Frizzled (Fz),
and so activates the cytoplasmic protein Disheveled
(Dvl), an inhibitor of the kinase activity of GSK-3β.
Thus Wnt-1 stimulation stabilizes β-catenin, opposing APC. Free cytoplasmic β-catenin is also generated
through its release from the cytosolic moiety of the
cell-to-cell adhesion molecule E-cadherin, a major
protein in adherens junctions. This release is triggered
by tyrosine kinase activity, and occurs in the cellular
response to stimulation by many growth factors,
whose receptors also cluster at these junctions. In
both these circumstances, through destabilizing
β-catenin, APC downregulates the effect of external
growth signals on cell proliferation.
Components of the Wnt-1 signaling pathway,
including APC, axin, GSK-3β, and β-catenin appear
widely in biology, and in development are often concerned with issues of cell polarity and cell-to-cell
orientation as well as with the initiation of cell replication. The binding of APC to microtubules, EB-1, and
Pathology
These observations go some way towards explaining
the remarkably high frequency with which APC deficiency leads to the formation of colorectal adenomas,
lesions in which there is a sustained disorder in cell
orientation, migration, and proliferation. In FAP
almost all the germline mutations generate truncated,
N-terminal fragments of the protein, and within the
tumors the function of the residual allele is suppressed
through second mutation, partial or complete chromosome loss, or suppression of expression. Over 70% of
sporadic tumors also show a similar pattern of biallelic
silencing, but here more than 60% of the truncation
mutations fall within a mutation cluster region (MCR)
delineated by codons 1286-1513. This generates
N-terminal peptides including the oligomerization and
armadillo repeat domains but without the axin- and
GSK-3β-dependent β-catenin binding sites or the
nuclear export sites. Correspondingly, most adenomas, both in humans and in the murine APC-deleted
models, show intense nuclear β-catenin accumulation.
In the FAP syndrome associated with germline
mutation close to the MCR, the large numbers of
colorectal adenomas are usually accompanied by
extracolonic abnormalities, most commonly congenital hyperplasia of the retinal pigment epithelium
(CHRPE), duodenal adenomas, and (often) desmoid
tumors of the soft tissues. More rarely, there are
osteomas of the mandibular bone and in a very
small percentage, carcinomas of the thyroid gland,
medulloblastoma (Turcot syndrome) or hepatoblastoma. Germline deletions distant from the MCR also
tend to produce a less severe phenotype, with many
fewer adenomas and CHRPE is not found associated
with mutations 5′ to the ninth exon. Somewhat mysteriously, germline mutations in the centre of the MCR
(close to codon 1300) tend to be associated in tumors
with suppression of the residual APC allele by partial
or complete deletion (producing loss of heterozygosity at this locus), whilst mutations at more proximal
or distal sites are associated with second point mutations. Rare missense mutations in the MCR are not
usually associated with FAP, but can confer cancer
susceptibility. Finally, studies with animal models of
germline APC mutation show the existence of other
genes that substantially modify the multiple adenoma
phenotype.
Further Reading
McCartney BM and Peifer M (2000) Teaching tumour suppressors new tricks. Nature Cell Biology 2: April.
McKusick VA Adenomatous polyposis of the colon: APC.
http://www.ncbi.nlm.nih.gov/htbin-post/Omim.dispmim/175100.
Peifer M and Polakis P (2000) Wnt signaling in oncogenesis and
embryogenesis - a look outside the nucleus. Science 287.
Adenosine Phosphates
J Parker
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0009
Adenoviruses
A J Berk
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0010
Replication Cycle
The globular domain at the termini of the fibers
(Figure 1) adsorbs to a cell-surface glycoprotein
receptor (CAR for Ad2 and 5, also the receptor for
coxsackie B virus), followed by the binding of an
RGD sequence in a flexible loop of the penton protein
at each vertex to cellular fibronectin-binding integrins
(integrins αvβ5 and αvβ3). This stimulates receptor-mediated endocytosis. As the pH of the resulting
endosome drops, virion structural proteins change
conformation, releasing the fiber proteins and lysing
the endosomal membrane, thus delivering the virion
into the cytosol. This process has been exploited
experimentally to introduce proteins, which have
been added to the medium and endocytosed with the
virion or DNA adsorbed to the virion surface, into the
cytosol of cultured cells. On entry into the cytosol, a
virion-associated protease is activated by the reducing
environment, resulting in the release of several virion
[Figure 2 map labels: E1A, E1B, pIX, MLP, L1, L2, L3, L4, L5, VAI, VAII, E3, E4, IVa2, E2 early, E2 late, E2A, E2B]
Figure 2 Ad2 transcription map. Transcription units are designated by horizontal arrows above and below the
double line representing the 36 kb viral genome, with arrowheads indicating the direction of transcription. Vertical
arrows indicate sites of polyadenylation for each of the five families of late mRNAs processed from the major late
promoter (MLP) transcription unit (L1-L5) and the E2 early and late transcription units (E2A, E2B).
late viral mRNAs and viral DNA which revealed
introns as looped out regions of single-stranded
DNA. Adenovirus introns and splice sites have been
important substrates in in vitro experiments that have
characterized the mechanism of RNA splicing.
Late in infection (after the onset of DNA replication), host cell and early viral protein synthesis is
inhibited due to the dephosphorylation and consequential inactivation of the cap-binding translation
initiation factor eIF4E. Translation of mRNAs transcribed from the MLP is resistant to this inhibition
because of their common ~200-base 5′ untranslated region, the `tripartite leader' produced from the
splicing of three short exons. The tripartite leader
allows translation initiation by hypophosphorylated
eIF4E, and stimulates a high rate of translation by
a `ribosome shunting' mechanism that transfers the
40S initiation complex from the 5′ end of the leader
to an AUG within ~30 bases of the 3′ end of the
leader, without scanning through the intervening
RNA. Host cell mRNA nuclear-cytoplasmic transport is also inhibited at very late times in infection
by a poorly understood mechanism that requires
the E1B-55K-E4-ORF6 complex and viral DNA replication. Progeny virions assemble in the nucleus
forming crystalline arrays of ~10^5 virions per HeLa
cell and are released by disintegration of the killed
host cell.
Oncogenic Transformation
At low frequency, Ad2, 5, and 12 transform cultured
rodent fibroblasts to a noncontact inhibited phenotype that forms foci of transformed clones readily
recognized on a monolayer of nontransformed cells.
Ad12 causes sarcomas in rats and hamsters at the site
of injection. Human adenovirus DNA replication and
late gene expression are attenuated in rodent cells that
Further Reading
Adjacent/Alternate Disjunction
M Hulten and C Tease
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0011
[Figure labels: pachytene cross with chromosomes A, A', B, and B']
A Darvasi
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0013
[Figure labels: alternate segregation (A B'; A' B) and adjacent I segregation (A B; A' B'), followed through metaphase II to gametes; a second panel, labeled "Interstitial chiasma", repeats the pachytene cross and both segregation types]
Further Reading
Armstrong SJ and Hulten MA (1998) Meiotic segregation analysis by FISH investigations in sperm and spermatocytes of
translocation heterozygotes. European Journal of Human
Genetics 6: 430-431.
Armstrong SJ, Goldman SH and Hulten MA (2000) Meiotic
studies of a human male carrier of the common translocation, t(11;22), suggests postzygotic selection rather than preferential 3:1 MI segregation as the cause of liveborn offspring
with an unbalanced translocation. American Journal of Human
Genetics 67: 601-609.
Rickards GK (1983) Orientation behavior of chromosome multiples of interchange (reciprocal translocation) heterozygotes.
Annual Review of Genetics 17: 443-498.
confidence interval length in proportion of recombination units at the tth generation, one obtains:
\[ C_t = C/(t/2) \]
[Figure 1 labels: Parental lines, F1, F2, semirandom intercrossing, F3, semirandom intercrossing]
Advanced/Derived
See: Cladograms
Fn
Figure 1
rt 2 1
2r=2
Aflatoxins
P L Foster and W A Rosche
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0014
Figure 1 Structures of aflatoxins. (A) Aflatoxin B1 (AFB1); (B) AFB1 8,9-oxide; (C) AFB1 N7-guanine.
intermediate AFB1 8,9-oxide (Figure 1B). This metabolite rapidly reacts with DNA and forms adducts to
the N-7 position of guanine (Figure 1C). Adducted
guanines can undergo two further reactions: (1) the
imidazole ring of the guanine can open, yielding the
formamidopyrimidine derivative (AFB1-FAPY); or
(2) the glycosidic bond between the guanine and the
sugar can break, resulting in loss of the adducted base
from the DNA. Of these three DNA lesions it is
probably the N-7 adduct that is responsible for
AFB1's mutagenicity and carcinogenicity.
AFB1 exposure produces point mutations, chromosomal aberrations, chromosomal breaks, and other
types of genetic damage. AFB1 was one of the first
human carcinogens whose mutagenicity was demonstrated with the `Ames strains' (McCann et al.,
1975), a set of bacterial strains that have become
part of the battery of short-term tests for genotoxic
agents. AFB1 was also one of the first human carcinogens whose mutational spectrum was determined in
vivo, again using a simple bacterium (Foster et al.,
1983). AFB1 preferentially induces G to T and, secondarily, G to A mutations in all organisms tested, indicating that the mechanism by which it produces
mutations does not differ among species. The adduction pattern of AFB1 is also nonrandom: it preferentially binds to guanines that are surrounded by other
guanines.
Mutations in the tumor suppressor gene p53 occur
in many cancers and in most hepatocarcinomas. More
than 50% of the hepatocarcinomas from patients
living in areas with high dietary aflatoxin intake have
one specific mutation in p53: a GC to TA mutation at
the third position of codon 249 (which changes AGG
to AGT, so a serine replaces the normal arginine in
the protein) (Bressac et al., 1991; Hsu et al., 1991).
While this mutation reflects the mutagenic action of
aflatoxin, it is also possible that the mutation conveys
some particular advantage to liver cells. This remarkable mutational specificity suggests that the presence
of a G to T mutation at codon 249 could be used as a
biomarker for human aflatoxin exposure.
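The codon change described above can be verified with a few lines of code. This is a minimal sketch: the helper `substitute_base` and the two-entry `CODON_TABLE` are hypothetical names introduced for illustration, and the table holds only the two codons needed here (both assignments follow the standard genetic code).

```python
# Subset of the standard genetic code (coding-strand DNA codons).
CODON_TABLE = {"AGG": "Arg", "AGT": "Ser"}


def substitute_base(codon, position, new_base):
    """Return the codon with the base at a 1-based position replaced."""
    index = position - 1
    return codon[:index] + new_base + codon[index + 1:]


wild_type = "AGG"
mutant = substitute_base(wild_type, position=3, new_base="T")  # G to T

wild_type_residue = CODON_TABLE[wild_type]
mutant_residue = CODON_TABLE[mutant]
```

A G to T change at the third position therefore converts an arginine codon into a serine codon, as stated in the text.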
Further Reading
Wang J-S and Groopman JD (1999) DNA damage by mycotoxins. Mutation Research 424: 167-181.
References
Aging, Genetics of
G J Lithgow
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1692
Human Aging
A genetic underpinning of aging in humans is revealed
by rare genetic premature aging conditions and heritability estimates of longevity in normal populations.
Hutchinson-Gilford syndrome is a rare autosomal
dominant condition of childhood characterized by
balding, skin wrinkling, subcutaneous fat loss, and
intense atherosclerosis resulting in cardiovascular-associated death. There is no acceleration in brain
aging, illustrating that premature aging is segmental
with many tissue types remaining unaffected. Werner
syndrome (WS) is an adult progeria and is characterized by skeletal muscle atrophy, premature greying of
hair, heart valve calcification, intense atherosclerosis,
and hypogonadism. WS is caused by mutation of the
WRN gene that encodes a member of the RecQ family
of DNA helicases.
Genetic factors account for 25% of variance in lifespan in human populations but the contributing loci
are unknown. Some alleles associated with longevity
have been identified from gene association studies.
These include the ε2 variant of the APOE gene, which
is overrepresented in centenarian populations but is
also associated with type III and IV hyperlipidemia.
Cell culture studies reveal the characteristics of
human aging. Human cells cultured in vitro undergo
a limited number of cell divisions (replicative senescence). The nondividing cells display altered characteristics including the secretion of matrix-degrading
extracellular metalloproteinases. A limited number of
such senescent cells also occur in the tissues of aged
humans and may contribute to age-related changes and
functional deficits. Senescent cells result from a shortening of chromosomal telomeres during replication.
Replicative senescence can be prevented in vitro by
restoration of the telomere-synthesizing enzyme,
telomerase.
Agouti
G S Barsh
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0017
Agrobacterium
M Van Montagu
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0018
plasmids, but leaving the border sequences which
enabled the transfer (origin and terminator of transfer
sites). This T-DNA was then replaced by one or more
plant or plant-like genes. When an Agrobacterium
harboring such an engineered Ti plasmid was used to
interact with plant tissue, it became possible to regenerate from the transformed cells a healthy and
fertile plant expressing the newly introduced genes.
Such transgenic or genetically modified (GM) plants
are at the base of a real revolution in the plant sciences.
They made it possible to study plant growth and development, plant physiology, and plant ecology in molecular terms. The importance of this approach was so
overwhelming that attempts were made to extend the
host range of the Agrobacterium–plant interaction, so
that many more plant species could be transformed.
This work is ongoing but already representatives of
most plant families, including monocots from the
Gramineae family (e.g., grasses such as rice), can be efficiently transformed by Agrobacterium-based
methods.
Such appropriate modifications of the Ti plasmids
into gene vectors also made it possible to engineer
crop plants with beneficial traits. The first laboratory
achievements were around the mid-1980s and concerned the engineering of plants producing their own
insecticide. This was an insecticidal protein encoded
by a bacterial gene, cloned from a Bacillus thuringiensis strain, and modified so that it could be expressed
and accumulated in plant cells. The second success
story was the engineering of plants tolerant to novel,
ecologically more acceptable herbicides. These constructions were then introduced into the elite lines
of major crops such as corn, soybean, canola (rapeseed), and cotton. These GM-plants, as they have been
called, were field-tested for many years and were
approved by the US controlling agencies such as
FDA, EPA, and APHIS (USDA) for large-scale trials in
1996. In 2000 some 45 million hectares of these transgenic plants were grown, mostly in North and South
America and China. After 15 years of testing and 5
years of large-scale production no science-based argument of danger to the health of humans, animals, or
the environment has been advanced. Present studies
involve the engineering of crop plants better adapted
to biotic and abiotic stresses and the engineering of
new compounds in plants.
Hence Agrobacterium will be increasingly used in
fundamental and applied studies for unraveling and
improving its interactions with plant cells.
See also: Rhizobium; Ti Plasmids; Transfer of
Genetic Information from Agrobacterium
tumefaciens to Plants
Alanine
E J Murgola
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0019
Alanine (Figure 1) is one of the 20 amino acids commonly found in proteins. Its abbreviation is Ala and its
single letter designation is A. As one of the nonessential
amino acids in humans, it is synthesized by the body
and so need not be provided in the individual's diet.
H2N–CH(CH3)–COOH
Figure 1 Alanine.
Albinism
R A Spritz
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0020
in African and some African-American patients,
accounting for a frequency of OCA2 up to 1 per
1400 in parts of Africa. The P polypeptide is a
melanosomal membrane protein that has considerable
homology to known small-molecule transporters, but
its specific function is not yet known.
OCA3 is a clinically somewhat milder form of
albinism associated with moderately reduced, often
reddish, pigmentation of the skin and hair and variable
visual defects. OCA3 results from mutations in the
TYRP1 gene located at 9p23, which corresponds to the
brown (b) locus of mice. TYRP1 encodes the melanogenic enzyme DHICA oxidase, which enhances production of black eumelanin versus brown pigments.
OCA3 has thus far only been studied in African and
African-American patients, in whom a specific frameshift mutation may account for the majority of cases.
The frequency of OCA3 is unknown.
OA1, also called X-linked recessive ocular albinism, is a form of OCA in which skin and hair hypopigmentation is very mild, whereas visual deficits may
be relatively severe. In general, only males manifest
clinical symptoms, although carrier females may exhibit variegated pigmentation of the retina. The OA1
gene, located at Xp22.3–p22.2, encodes a melanosomal
protein whose function is unknown. There is no known
mouse disorder homologous to human OA1.
Chediak–Higashi syndrome (CHS) is a rare, autosomal recessive disorder characterized by variable
manifestations of OCA, mild bleeding tendency, severe
immunologic deficiency, slowly progressive neurologic dysfunction, and frequently early death from an
unusual lymphoproliferative syndrome. CHS results
from mutations of the CHS1 gene, located at 1q42–43,
which corresponds to the mouse beige (bg)
locus. Homozygosity for protein-null mutant alleles
of CHS1 results in the typical severe childhood form
of CHS, whereas amino acid substitutions can be associated with a clinically milder form of the disorder.
The function of the CHS protein is not yet known, but
it is thought to be involved in sorting of proteins to
vesicles, lysosomes, and cytoplasmic granules.
Hermansky–Pudlak syndrome (HPS) is an autosomal recessive disorder characterized by OCA, a
moderate bleeding disorder, apparent lysosomal storage, colitis, and progressive restrictive lung disease
that frequently results in death in mid-adulthood.
Although rare in most populations, HPS is prevalent
in Puerto Rico, where it occurs with a frequency of
about 1 per 1800. HPS results from mutations of the
HPS gene, located at 10q23, which corresponds to
the mouse pale-ear (ep) locus. The great majority of
Puerto Rican patients have a specific frameshift, indicative of a founder effect in this island population.
However, only about half of non-Puerto Rican HPS
Further Reading
King RA (1998) Albinism. In: Nordlund JJ et al. (eds) The Pigmentary System, p. 553. New York: Oxford University Press.
King RA, Hearing VJ, Creel DJ and Oetting WS (2001) Albinism.
In: Scriver CR et al. (eds) The Metabolic and Molecular Bases of
Inherited Disease, p. 5587. New York: McGraw-Hill.
Alcoholism
K E Browman and J C Crabbe
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0023
Alcoholism can be defined as excessive and/or compulsive use of alcohol. Alcoholism results in persistent
social, psychological, and medical problems, and is a
leading cause of morbidity and premature death. The
motivation underlying excessive alcohol consumption, however, is obscure.
Determinants of alcohol abuse include an interaction of environmental and biological factors: among
the latter, genetic factors deserve special attention.
One-third of alcoholics have at least one alcoholic
parent and children of alcoholics are more likely to
become alcoholic than children of nonalcoholics, even
when raised in nonalcoholic families. Despite a large
effort designed to understand the underlying biological basis of alcoholism, a definitive biological marker
for alcoholism has not been found.
Animal Models
Since genetic animal models were first employed, studies have investigated how different strains of mice
differ in their response to alcohol, what gene products
(proteins) are influencing behavior, and through
which mechanism(s). Genetic animal models offer
several advantages over studies using human subjects.
For one, the experimenter can control the subject's
genotype, whereas in humans only monozygotic twins
have identical genotypes. One widely used method is
the study of inbred strains. Each inbred strain consists
of animals that are essentially identical twins. By studying a number of different inbred strains, one can investigate differences among the strains in response to
alcohol. If individual differences exist within a strain
they are assumed to be nongenetic (environmental) in
Clinical Findings
Certain characteristics present in those likely to
become alcoholics may be useful markers indicating
potential risk for the development of alcoholism.
Electrophysiological evidence from the brain suggests
that an abnormality exists in alcoholics and their nondrinking offspring. When exposed to a novel stimulus,
alcohol-naive sons have a pattern of brain waves
(called P3 or P300 evoked potentials) resembling
those measured in alcoholics (Begleiter et al., 1998).
These differences in brain activity are hypothesized
to reflect a genetic vulnerability to alcoholism.
Predisposing factors have also been studied in
human populations that have very little genetic or
social/environmental variability (e.g., a southwestern
American Indian tribe), and researchers have found
several genetic markers to be linked with alcoholism
in this tribe (Long et al., 1998). One marker was located near a gene coding for a GABA receptor. While
the results from this study require verification, they
support other evidence implicating this neurotransmitter system in alcoholism. A second group of researchers used the general population in the United
States, selecting families affected by alcoholism (Reich
et al., 1998). One of the markers discriminating alcoholics from nonalcoholics was provisionally mapped to a
location near the gene coding for the alcohol-metabolizing enzyme alcohol dehydrogenase (ADH). Other
evidence suggests that possession of a variant of the
ADH gene tends to protect against the development
of alcoholism in Asian populations. Although still
preliminary, these studies are beginning to identify
genes mediating susceptibility to alcoholism.
Conclusions
Genetic animal models have provided many cases
where clear genetic influences on alcohol sensitivity
exist. There are also readily available populations of
human alcoholics with well-defined pedigrees that
are making it possible to relate the animal models to
human genetic findings. This offers the hope that
besides improving our understanding of how alcohol
works, genetic studies can provide new methods for
identifying individuals that might be at risk. Ideally,
this information will also help to generate new pharmacotherapies to address this disease.
Alignment Problem
D Higgins
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0024
Sequences
During the course of evolution, nucleotide and amino
acid sequences change. These changes are of two basic
β-globin  VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNP
α-globin  VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLS HGSA

β-globin  KVKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHF
α-globin  QVKGHGKKVADALTNAVAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHL

β-globin  GKEFTPPVQAAYQKVVAGVANALAHKYH
α-globin  PAEFTPAVHASLDKFLASVSTVLTSKYR

Figure 1 A sequence alignment between human α-globin (bottom) and β-globin (top). These sequences have been
aligned by the insertion of gaps ('-' characters). Residues that are identical between the two sequences are marked by
stars. (The gap characters and star markers are not reproduced here.)
Alignment is Important
Sequence alignment is an essential prerequisite for a
wide range of analyses that can be carried out on sequences. Any analysis that involves the simultaneous
treatment of a number of homologous proteins will
usually require that the proteins have been lined up
with the homologous residues in columns. In Figure 2
we can see a multiple alignment of some globins where
this has been done. Only when we have such an alignment can we attempt to ask questions about the way in
which these sequences evolve. These will include
questions relating to the phylogeny of the sequences
and the rate at which they change (e.g., numbers of
estimated substitutions per site). Such phylogenetic
and evolutionary analyses are interesting in their
own right or can be used in a more practical manner.
For example, they can be used to tell us about the dates
of important events in the evolution of a gene family
or to derive amino acid weight matrices (see below).
Sequence alignments are very widely used in the biological literature to demonstrate conserved regions in
a set of proteins, which we assume to have great
functional importance. They may also be used to
demonstrate homology between a protein family and
a distantly related member. There may be little overall
similarity between the proteins but we can feel more
confident in the possible homology if we observe that
the residues that are most conserved in the family as a
whole are also present in the new member. Alignments
may also be used to investigate conservation of protein
structure or to predict the structures of new members
when we know the tertiary structures of one or more
members of a sequence data set.
Most protein sequences belong to multigene families
or contain protein domains which are related, evolutionarily, to domains in other proteins (from the same
and from different species). The largest families contain hundreds of members in many species, especially in
multicellular eukaryotes. The most familiar examples
include protein kinases, zinc-finger transcription
factors, and 7-transmembrane receptor proteins. It is
difficult to see the subgroupings within these families
or to follow all of the functional diversity or to relate
function in different species, without an evolutionary overview of the proteins. This overview can be
provided by a phylogenetic analysis. This requires prior
alignment. This situation is more important than ever
Human β          VHLTPEEKSAVTALWGKVN VDEVGGEALGRLLVVYPWTQRFFESFGDLST
Horse β          VQLSGEEKAAVLALWDKVN EEEVGGEALGRLLVVYPWTQRFFDSFGDLSN
Human α          VLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLS
Horse α          VLSAADKTNVKAAWSKVGGHAGEYGAEALERMFLGFPTTKTYFPHFDLS
Whale myoglobin  VLSEGEWQLVLHVWAKVEADVAGHGQDILIRLFKSHPETLEKFDRFKHLKT
Cyanohemoglobin  PIVDTGSVAPLSAAEKTKIRSAWAPVYSTYETSGVDILVKFFTSTPAAQEFFPKFKGLTT
Leghemoglobin    GALTESQAALVKSSWEEFNANIPKHTHRFFILVLEIAPAAKDLFSFLKGTSE

Human β          PDAVMGNPKVKAHGKKVLGAFSDGLAHLDN LKGTFATLSELHCDKLHVDPENFRL
Horse β          PGAVMGNPKVKAHGKKVLHSFGEGVHHLDN LKGTFAALSELHCDKLHVDPENFRL
Human α          HGSAQVKGHGKKVADALTNAVAHVDD MPNALSALSDLHAHKLRVDPVNFKL
Horse α          HGSAQVKAHGKKVGDALTLAVGHLDDLPGALSNLSDLHAHKLRVDPVNFKL
Whale myoglobin  EAEMKASEDLKKHGVTVLTALGAILKKKGHHEAELKPLAQSHATKHKIPIKYLEF
Cyanohemoglobin  ADQLKKSADVRWHAERIINAVNDAVASMDDT EKMSMKLRDLSGKHAKSFQVDPQYFKV
Leghemoglobin    VPQNNPELQAHAGKVFKLVYEAAIQLQVTGVVVTDATLKNLGSVHVSKGVADAHFPV

Human β          LGNVLVCVLAHHFGKEFTPPVQAAYQKVVAGVANALAHKYH
Horse β          LGNVLVVVLARHFGKDFTPELQASYQKVVAGVANALAHKYH
Human α          LSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSKYR
Horse α          LSHCLLSTLAVHLPNDFTPAVHASLDKFLSSVSTVLTSKYR
Whale myoglobin  SEAIIHVLHSRHPGDFGADAQGAMNKALELFRKDIAAKYKELGYQG
Cyanohemoglobin  LAAVIADTVAAG D AGFEKLMSMICILLRSAY
Leghemoglobin    VKEAILKTIKEVVGAKWSEELNSAWTIAYDELAIVIKKEMNDAA

Figure 2 A multiple alignment of seven globin sequences from human (α- and β-chains of hemoglobin), horse
(α- and β-chains), whale (myoglobin), lamprey (cyanohemoglobin), and lupin (leghemoglobin).
with the elucidation of the entire genomic sequences
of so many model organisms, including humans.
Alignment Parameters
From Figure 1 we can clearly see that these sequences
are reasonably similar to each other in many respects.
They share a similar arrangement of α-helices and the
heme-binding histidines are at matching positions.
These sequences are now 46% identical to each
other, assuming that this alignment is correct (there
are 63 identical residues from 137 alignment positions,
ignoring gaps). Therefore, over the past few hundred
million or so years, these sequences have diverged by
the accumulation of numerous amino acid substitutions, which we now observe as sequence differences
along the alignment. Again, it can be hard to tell exactly which substitutions have occurred and in which
sequences, although, again, we can make some guesses
by including other globin sequences in our study. For
example, at the fourth position of the alignment we
have a serine (S) in the α-globin and a threonine (T) in
the β-globin. This is a difference and we can assume
that at least one of the sequences has changed since
their most recent common ancestor. If we do not
have access to any other sequences we cannot tell if
the serine changed to a threonine or vice versa or if both
sequences changed or even if several substitutions
occurred in one or both sequences, at this position.
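The percent identity quoted above (63 identical residues out of 137 alignment positions, ignoring gaps) can be reproduced with a few lines of code. The following is a minimal sketch in Python; the function name `percent_identity` and the short test fragment are illustrative assumptions, and the gap placement in the fragment is inferred only from the statement that the fourth alignment position holds S in the α-globin and T in the β-globin.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-"):
    """Return (identical, compared, percent) for two aligned sequences.

    Columns in which either sequence has a gap are ignored, as in the text.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    identical = 0
    compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return identical, compared, 100.0 * identical / compared


# The figure quoted in the text: 63 identities over 137 ungapped positions.
print(round(100 * 63 / 137))  # 46

# Hypothetical short fragment (start of the alpha- and beta-globin alignment).
print(percent_identity("V-LSPADK", "VHLTPEEK"))
```

Note that the result depends on the alignment being correct; a different placement of gaps changes both the numerator and the denominator.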
The patterns of insertions/deletions and substitutions that we observe in real sequences are highly
nonrandom. There are two obvious constraints on
indels, from the point of view of natural selection.
Firstly, at the nucleotide level, indels in protein-coding regions must maintain the phase of the coding
sequence, i.e., there are normally only deletions or
insertions of multiples of three nucleotides. Indels
that do not maintain phase are almost guaranteed
to result in truncated or highly altered amino acid
sequences. One may occasionally see insertions or
deletions of entire exons in eukaryotes, but again, the
phase of the coding sequence must be maintained. It is
harder to generalize about indels in noncoding regions
or in non-protein-coding RNA genes. Once an indel
has occurred in a protein-coding gene, this will result
in an altered amino acid sequence. Clearly, if the
change greatly impairs the functionality of the protein, this will not be tolerated by natural selection.
As a general rule, indels which alter the folding of a
protein or which affect the active site (in the case of an
enzyme) or a binding site for a ligand or cofactor will
not be tolerated. More specifically, indels are rare in
conserved α-helices and β-strands but are relatively
common in the loops of irregular structure that connect them. Helices and strands must pack together to
[Matrix values not reproduced; rows and columns are in the order A R N D C Q E G H I L K M F P S T W Y V.]
Figure 3 A BLOSUM 62 amino acid weight matrix. The 20 amino acids are given using the one-letter code.
Alignment Methods
We could align two sequences very accurately if we knew exactly
which changes (insertions/deletions and substitutions) occurred and where. We could simply match up
residues which we know to be related to each other,
either because they have not changed during evolution
or because we know they have diverged through one
or more substitutions. In practice, we observe the present-day sequences and align these with each other
according to some measure of alignment quality.
Commonly, we attempt to find an alignment that displays a maximum number (and quality) of matches
between residues and a minimal number (and length)
of gaps. This can be done manually using a word
processing program on a computer but this is tedious
and error-prone; more commonly, automatic computer programs are used.
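As an illustration of how an automatic program can search for the alignment with the most matches and the fewest gaps, the sketch below uses dynamic programming in the style of Needleman and Wunsch (1970). It is a simplified example, not the method of any particular program: the function name `global_align` and the scores (match +1, mismatch -1, gap -2 per position) are assumptions chosen for clarity, and a realistic program would score residue pairs with an amino acid weight matrix instead.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Global pairwise alignment by dynamic programming.

    Returns (score, aligned_a, aligned_b). Scores are illustrative only.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align two residues
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    s, x, y = global_align("VLSPADKTNVKAAWGKV", "VHLTPEEKSAVTALWGKV")
    print(s)
    print(x)
    print(y)
```

Several alignments can share the same best score; the traceback above simply returns one of them, preferring residue pairs over gaps.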
A useful device to help visualize alternative alignments is to make use of a dot matrix plot as shown in
Figure 4. This is a graphical device that places the two
sequences along two sides of a rectangle and inserts
dots at all positions between the two sequences where
there is a match of some kind (say a residue in one
sequence that is identical to one in the second sequence or some number of residues out of so many,
e.g., three amino acids out of five identical or some
number of residues with a high score using an amino
acid weight matrix). If a plot is made between homologous sequences, then the best alignment will usually
show up on the plot as lines of dots, parallel to the
main diagonal. These lines will be interrupted by
blanks corresponding to mismatches between the
sequences and there will be jumps to different diagonals corresponding to gaps. These plots are useful
because they are very simple. Repeated sequences
show up as sets of parallel lines and small regions of
local similarity, perhaps corresponding to isolated
matches of single domains in otherwise unrelated
proteins, can be detected very easily. Plots can also
be used to reveal regions of self-complementarity in
Dynamic Programming
References
Krogh A, Brown M, Mian IS, Sjolander K and Haussler D
(1994) Hidden Markov models in computational biology:
applications to protein modeling. Journal of Molecular Biology
235: 1501–1531.
Lipman DJ, Altschul SF and Kececioglu JD (1989) A tool for
multiple sequence alignment. Proceedings of the National Academy of Sciences, USA 86: 4412–4415.
Needleman SB and Wunsch CD (1970) A general method applicable to the search for similarities in the amino acid sequences
of two proteins. Journal of Molecular Biology 48: 443–453.
Smith TF and Waterman MS (1981) Identification of common
molecular subsequences. Journal of Molecular Biology 147:
195–197.
Thompson JD, Higgins DG and Gibson TJ (1994) CLUSTAL W:
improving the sensitivity of progressive multiple sequence
alignment through sequence weighting, position-specific gap
penalties and weight matrix choice. Nucleic Acids Research 22:
4673–4680.
Alkaptonuria
B N La Du
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0025
individuals are deficient in the critical enzyme, homogentisic acid oxidase (HGO). This enzymatic deficiency causes an accumulation of HGA in their tissues
and the excretion of several grams of HGA per day in
the urine. Alkaptonuric patients show three characteristic features: the excretion of HGA in the urine; a
yellow (ocher) pigmentation of their connective tissues
(joints and tendons); and, in later years, arthritis of the
larger, weight-bearing joints (hips, knees, shoulders,
and the lower spine). The exact relationships between
the accumulation of HGA, tissue ochronosis, and
arthritis are still not completely understood.
The essentially complete deficiency of HGO activity is inherited as an autosomal recessive disorder.
Over 30 distinct genetic mutations have been discovered over the past few years as being responsible for
structural abnormalities in the enzyme. This means
that although specific DNA tests can now be developed to determine whether or not a person is carrying
one of the specific HGO gene mutations, we still do
not have a general test for detecting heterozygous
carriers of alkaptonuria.
The urine containing HGA turns dark slowly on
exposure to oxygen and alkaline conditions, but may
not be an abnormal color when first excreted. Thus,
the diagnosis may be missed until adulthood, when
operations are performed on the knee, for example, and
black, pigmented fragments of the knee cartilage are
noticed. Unusual X-rays of the lower spine also may
suggest alkaptonuria because of the characteristic
degeneration of the intervertebral disks with a narrowing of the space between the disks and calcification of the intervertebral material.
Alkaptonuria is a lifetime condition that slowly
leads to arthritis when the person reaches middle-age,
Phenylalanine
  ↓ Phenylalanine hydroxylase
Tyrosine
  ↓ Tyrosine transaminase
p-Hydroxyphenylpyruvic acid
  ↓ p-Hydroxyphenylpyruvate oxidase
Homogentisic acid
  ↓ Homogentisic acid oxidase (enzyme block in alkaptonuria)
Maleylacetoacetic acid
  ↓ Maleylacetoacetic acid isomerase
Fumarylacetoacetic acid
  ↓ Fumarylacetoacetic acid hydroxylase
Fumaric acid + acetoacetic acid
Figure 1 Enzymatic steps in phenylalanine and tyrosine metabolism showing the location of the metabolic block in
alkaptonuria.
Alkyltransferases
M Ambrose and L D Samson
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0026
expressed and only accepts alkyl groups from alkylated DNA bases (i.e., at the Cys139 residue in the
C-terminal half of the protein).
Alkyltransferases have been found in a variety of
different organisms, including yeast, insect, rodent,
fish, and human; this fact alone underscores the
importance of alkylating agents as a source of DNA
damage. There is considerable amino acid sequence
similarity among the known alkyltransferases. For
example, the cysteine residue that accepts an alkyl
group from the O6-MeG/O4-MeT residues is contained in the conserved amino acid sequence
PCHRI/V (i.e., Pro, Cys, His, Arg, Ile/Val), while
the active cysteine of those alkyltransferases that
accept alkyl groups from alkylphosphotriesters (i.e.,
E. coli Ada, Bacillus subtilis AdaB, and Salmonella
typhimurium Adast) is contained in the conserved
FRPCKR (Phe, Arg, Pro, Cys, Lys, Arg) sequence.
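The two conserved motifs just described can be located in a protein sequence with a simple pattern search. The sketch below is illustrative only: the helper name `find_active_cysteines`, the motif labels, and the test sequence are hypothetical, and a motif match by itself does not establish that a protein is an alkyltransferase.

```python
import re

# Conserved motifs named in the text, written as regular expressions.
MOTIFS = {
    "O6-MeG/O4-MeT acceptor": re.compile(r"PCHR[IV]"),      # PCHRI/V
    "alkylphosphotriester acceptor": re.compile(r"FRPCKR"),  # FRPCKR
}


def find_active_cysteines(protein: str):
    """Return (motif label, matched text, 1-based position of the Cys) tuples."""
    hits = []
    for label, pattern in MOTIFS.items():
        for m in pattern.finditer(protein.upper()):
            cys_position = m.start() + m.group().index("C") + 1
            hits.append((label, m.group(), cys_position))
    return hits


# Hypothetical sequence, made up for this example (not a real protein).
example = "MKAAPCHRVLLGFRPCKRAA"
for hit in find_active_cysteines(example):
    print(hit)
```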
Crystal structures of alkyltransferases from E. coli
(the 19-kDa C-terminal Ada fragment, Ada-C), Pyrococcus kodakaraensis (Pkat), and human (hAGT) have
recently been determined. Studying the crystal structures of Ada-C and hAGT, Madeleine Moore and
colleagues and Robin Vora and colleagues have proposed two somewhat different models to describe
how alkyltransferases might recognize and transfer
the alkyl group from alkylated DNA lesions. Moore's
group initially determined the crystal structure of
Ada-C and observed that the active alkyl-acceptor
cysteine is buried within the protein, and thus is not
properly positioned to make favorable contacts with
the alkylated base in duplex DNA. Therefore, Moore's
group suggested that a conformational change in alkyltransferase proteins is required to expose the active
cysteine to the target alkylated DNA base. Given that
Ada-C contains a helix–turn–helix motif (HTH) which
is characteristic of DNA-binding proteins, they suggested further that Ada-C most likely binds the major
groove of the DNA molecule via the second helix (i.e.,
the so-called recognition helix) and that the C-terminal
helix is then swiveled about the DNA molecule such
that the active cysteine site of the protein is exposed to
the alkylated DNA base substrate whose alkyl group
protrudes into the DNA major groove. Thus, this
model requires a gross change in the conformation of
the Ada-C protein and not the DNA molecule.
By contrast, Vora's group suggested that alkyltransferases bind to the DNA major groove, but instead
induce the `flipping out' of the target alkylated base
into the binding pocket containing the active site
cysteine, a process which might be initiated via the
insertion of an amino acid residue(s) into the DNA
helix. They suggested that this could be initiated by
the arginine residue that is contained in the conserved
sequence, RAV[A/G] (present in the recognition
helix of the HTH motif of all the known alkyltransferases), and might also be important for stabilizing
the extrahelical (displaced) nucleotide or even for
forming hydrogen bonds with the unpaired (orphan)
base. In support of the Vora model, results of biochemical studies with the hAGT show that changing Arg128 to lysine, which apparently can still enter
the major groove of DNA but has reduced hydrogen-bonding capacity, greatly diminishes the repair activity of the hAGT protein.
It is still unclear which of the two models mentioned above most accurately describes the repair
activities of DNA alkyltransferases. Further crystal
structures of the bacterial and human alkyltransferase
proteins will no doubt help us to better understand
precisely how DNA alkyltransferases recognize, bind,
and repair alkylation DNA damage.
See also: DNA Repair
Allele Frequency
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0027
Alleles
P Anderson
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0028
example, nonsense alleles alter a gene sequence such
that a translation-terminating stop codon is introduced into its mRNA, while missense alleles alter a
gene sequence such that one amino acid is substituted
with another in the protein. Mutations that completely eliminate an encoded protein are termed null
alleles and are particularly useful for understanding
the function of a gene.
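The distinction between nonsense and missense alleles can be made concrete by translating a codon before and after a single change. The following Python sketch assumes the standard genetic code; the function name `classify_substitution` and the extra labels "synonymous" and "other" are illustrative additions that do not come from the text.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon change as nonsense, missense, synonymous, or other."""
    ref = CODON_TABLE[ref_codon.upper().replace("U", "T")]
    alt = CODON_TABLE[alt_codon.upper().replace("U", "T")]
    if ref == alt:
        return "synonymous"  # same amino acid (or stop) encoded
    if alt == "*":
        return "nonsense"    # a stop codon is introduced
    if ref == "*":
        return "other"       # a stop codon is lost
    return "missense"        # one amino acid replaces another


print(classify_substitution("TGG", "TGA"))  # Trp -> stop
print(classify_substitution("GAG", "GTG"))  # Glu -> Val
```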
Allelic Exclusion
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0029
Allopatric
E Mayr
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0031
Allostery
P Lu
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0032
Many proteins have two or more different, nonoverlapping ligand-binding sites. In the case of enzymes,
one of these, the active site, binds the substrate and is
responsible for the biological activity of the protein.
The other site, or allosteric site, is specific for the
structure of some other metabolite, the allosteric
effector. The effector molecule can have a positive or
negative effect on the rate of reaction or the binding
affinity at the substrate site. When the protein binds
the allosteric effector, it changes its geometry and
undergoes an allosteric transition. Today allostery
refers to almost any action transmitted through a
macromolecule.
Origins
J. Monod, J.-P. Changeux, and F. Jacob introduced
the idea of allosteric proteins for control of cellular
metabolism in 1963. Their original examples involved
the biosynthetic pathways of several amino acids
where end-product inhibition was observed. They
distinguished between steric inhibition, where a substrate analog may inhibit enzymes by interacting with
the active site, and end-product inhibition of an
enzyme at the start of a pathway. In the latter, inhibitors have little or no steric resemblance to the substrate for the first step in a multistep pathway. This
difference in geometry of a substrate and inhibitor and
two sites of action leads to allosteric proteins (Greek
allos, other; stereos, solid or space). Not surprisingly,
the final example of a use for allostery in that 1963
publication by Monod, Changeux, and Jacob, was for
gene regulation; Jacob and Monod introduced their
operon theory in 1961.
Two years later, in 1965, Monod, Wyman, and
Changeux expanded on the idea of the allosteric proteins and introduced a model dependent on multimeric proteins and the constraints of maintaining
interactions among subunits as individual subunits
undergo allosteric transitions. There is a vast literature
on the nature of the allosteric transition and the
propagation of the allosteric signal from the effector
binding site to the substrate binding site. All cases fall
between the symmetry model and the sequential
model. In part, this literature was due to the absence
of known protein structures. Lysozyme, the first
enzyme structure to be determined, a monomer,
was determined in 1965. Hemoglobin was the only
multimeric protein with known structure. In the intervening time, structures of numerous proteins have
been investigated in great detail in connection with
allostery. These are structures both in the absence and
presence of substrate and effectors.
Specific Examples
Historically, hemoglobin, being one of the first two
proteins whose structure was determined, served as a
source of information about the nature of allostery. It
has both homotropic, i.e., interaction among identical
ligands, and heterotropic allosteric effects. `Homotropic' because as each subunit binds an oxygen molecule
the affinity for the next oxygen on the next subunit
increases (also called `cooperativity'). `Heterotropic'
because pH and 2,3-bisphosphoglycerate affect O2
binding. The other classic example is aspartate transcarbamoylase. This protein is interesting, since it is a
protein with 12 subunits, 6 of which are catalytic and 6
of which are regulatory.
Among the gene regulatory proteins, the structures of the repressors
for the lactose operon, purine biosynthesis, and the
tryptophan operon of Escherichia coli have been determined with and without small-molecule effectors and
DNA. In these, instead of substrate conversion to
product, the binding affinity of the protein for operator DNA is modified by the small-molecule effector.
These molecules illustrate the propagation of long-range structural changes through the protein.
The term `allosteric' or `allostery' is used today to
refer to almost any consequent action at a distance
that involves macromolecules and interaction with
ligands. For example, cell-surface receptor activation
of events inside the cell is often referred to by the
term `allostery.'
Further Reading
Ackers GK (1998) Deciphering the molecular code of hemoglobin allostery. Advances in Protein Chemistry 51: 185–253.
Koshland Jr DE, Nemethy G and Filmer D (1966) Comparison
of experimental binding data and theoretical models in proteins containing subunits. Biochemistry 5: 365–385.
Lipscomb WN (1994) Aspartate transcarbamylase from Escherichia coli: activity and regulation. Advances in Enzymology and
Related Areas of Molecular Biology 68: 67–151.
Luisi BF and Sigler PB (1990) The stereochemistry and biochemistry of the trp repressor–operator complex. Biochimica et
Biophysica Acta 1048: 113–126.
Monod J, Changeux J-P and Jacob F (1963) Allosteric proteins
and cellular control systems. Journal of Molecular Biology 6:
306–329.
Monod J, Wyman J and Changeux J-P (1965) On the nature of
allosteric transitions: A plausible model. Journal of Molecular
Biology 12: 88–118.
Allotypes
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0033
Alpha (α1)-Antitrypsin
Deficiency
D W Cox
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1512
Genetic Variation
α1AT shows considerable genetic variability, with
more than 70 different genetic variants, called PI
types. These different genetic types are identified
using isoelectric focusing, which separates the variants
according to their charge. Many of the genetic variants
have also been sequenced.
The deficiency is inherited as an autosomal recessive
trait. The most common deficiency is PI type Z. PI ZZ
homozygotes have about 15 to 20% of the normal
plasma concentration of α1AT. The Z protein is visible
by isoelectric focusing and shows both a reduced concentration and a more acidic isoelectric point. About
95% of α1AT deficiency is due to the presence of the Z
allele. This allele is particularly common in northern
European populations, with a frequency of about 1 in
2000 in Scandinavian countries, and 1 in 7000 in North
Americans of European ancestry, but is not found in
African or Asian populations. The plasma deficiency
is due to lack of secretion of the Z type protein from
the liver cell. The Z α1AT has a tendency to self-aggregate, and forms insoluble inclusions within the
liver cell. Several other rare variants, including Mmalton, also show this tendency to self-aggregation.
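Because the deficiency is recessive, the population figures quoted above can be related to allele and carrier frequencies. The sketch below is illustrative only: it assumes that the figures of 1 in 2000 and 1 in 7000 describe PI ZZ homozygotes and that Hardy-Weinberg proportions hold, neither of which is stated explicitly in the text. The function name is hypothetical.

```python
import math


def hardy_weinberg_from_homozygotes(homozygote_freq: float) -> dict:
    """Frequencies implied by q^2 = homozygote_freq under Hardy-Weinberg proportions."""
    if not 0.0 <= homozygote_freq <= 1.0:
        raise ValueError("homozygote_freq must lie between 0 and 1")
    q = math.sqrt(homozygote_freq)   # frequency of the Z allele
    p = 1.0 - q                      # frequency of all other alleles combined
    return {"allele_freq": q, "carrier_freq": 2.0 * p * q, "homozygote_freq": q * q}


# Assumed homozygote frequencies taken from the figures quoted in the text.
populations = {
    "Scandinavia": 1 / 2000,
    "North Americans of European ancestry": 1 / 7000,
}
for name, freq in populations.items():
    result = hardy_weinberg_from_homozygotes(freq)
    print(f"{name}: Z allele {result['allele_freq']:.4f}, carriers {result['carrier_freq']:.4f}")
```

Under these assumptions roughly 1 person in 23 in the first population would carry one Z allele, which illustrates why heterozygotes are far more numerous than affected homozygotes.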
About 5% of deficiency variants are made up of
more than 10 rare deficiency alleles, which are found
in individuals of all racial origins. Other deficiency
variants are due to a variety of causes including early
truncation of the protein and instability of message.
The Gene
The gene for α1AT is located on human chromosome
14 (14q32) within a cluster of genes of similar
sequence which include corticosteroid-binding globulin, α1-antichymotrypsin, protein C inhibitor, and kallistatin. The gene is 12.2 kb in length, including a
1.4 kb coding region and six introns. Somewhat different forms of the gene are expressed in different tissues:
there are three different forms produced in monocytes, and one in the cornea that are different from
those produced in the liver. Each of the transcription
start sites for the gene has its own promoter regulating
tissue-specific expression.
Disease Predisposition
Lung Disease
Liver Disease
Other Diseases
Therapy
The most effective approach to prevent tissue destruction is to avoid smoking. Infusion of purified α1AT is
being used, but the extent of the benefit is not clear.
Aerosol administration of purified α1AT may become
feasible.
Further Reading
Alpha (α)-Fetoprotein
B T Spear
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0453
Alternation of Gene
Expression
A C Glasgow
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0034
Introduction
The discovery by Barbara McClintock of transposable elements controlling gene expression in maize
(see McClintock, Barbara; Transposable Elements)
led researchers to look for DNA rearrangements as
mechanisms for regulation of gene expression in other
eukaryotic and prokaryotic systems. Many examples
of reversible on/off gene expression and switching
between expression of two alleles of a gene were
found to involve DNA rearrangements, including
DNA inversion, insertion/excision of DNA elements,
and directed gene conversion events (see Gene
Rearrangement in Eukaryotic Organisms; Gene Rearrangements, Prokaryotic). The switch-regulated gene
products are commonly surface antigens required for
motility, adhesion, and cell-type determination, such as
flagellin, pilin, extracellular polysaccharide, and a/α
mating-type proteins. Specialized DNA recombination systems, involving site-specific recombination
(see Site-Specific Recombination) or transposition
(see Insertion Sequence; Transposable Elements),
mobilize DNA elements that control alternation of
gene expression. Directed gene conversion utilizes
endonucleases that nick or cleave at specific DNA
sequences flanking gene alleles to be switched. General recombination/replication enzymes are directed
to the cleaved sites where they mediate gene conversion, as in the case of mating-type switching in fission
and budding yeast (see Mating-Type Genes and Their
Switching in Yeasts).
Table 1 lists representative DNA rearrangement
systems that control alternation of gene expression in
prokaryotes and eukaryotes. This article focuses on
specialized recombination systems involved in alternation of gene expression. Directed gene conversion
controlling mating type in yeast is described in the
article (see Mating-Type Genes and Their Switching
in Yeasts).
stimulation) and HU (histone-like protein). The characterization of the molecular details of this DNA inversion system, and related systems in bacteriophages
Mu and P1 (Table 1), contributed significantly to
defining the roles of DNA enhancers in stimulating
DNA recombination, replication, and transcription.
FimB/FimE-Mediated Inversion
and Type 1 Pilin Phase Variation in
Escherichia coli
The site-specific DNA inversion strategy controlling ON/OFF phase variation of type 1 fimbriae in
Escherichia coli is quite distinct from the Hin-related
systems. The exceptional feature of this DNA rearrangement is that two different site-specific recombinases, FimB and FimE, mediate inversion of the
chromosomal segment containing the promoter for
the type 1 fimbriae gene (fim). FimE, which has been
shown to be a lambda-integrase-related recombinase
(see Phage λ Integration and Excision), mediates
inversion of the promoter segment in only one direction, the ON-to-OFF direction; whereas FimB can
mediate inversion to either orientation. Interestingly,
the regulation of these two recombinases is linked to
the expression of pyelonephritis-associated pili (pap).
FimB-promoted bidirectional switching is inhibited by
PapB, a positive transcriptional regulator of pap, while
the FimE-mediated unidirectional inversion (ON-to-OFF) is stimulated by PapB. Thus, expression of
Pap is linked to repression of type 1 fimbriae production in the same cell, effectively switching which tissue-specific adhesin is present on the surface of the E. coli
cell (Xia et al., 2000).
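The logic of this switch can be sketched as a two-state (ON/OFF) model in which FimB inverts the promoter in either direction and FimE only in the ON-to-OFF direction. The example below is a hypothetical illustration: the rate values, the tenfold PapB effect, and all names are assumptions chosen for the sketch, not measured parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwitchRates:
    fimb: float  # FimB inversion rate, applies to ON-to-OFF and OFF-to-ON alike
    fime: float  # FimE inversion rate, applies to ON-to-OFF only


def apply_papb(rates: SwitchRates, factor: float) -> SwitchRates:
    """PapB inhibits FimB switching and stimulates FimE switching (assumed fold change)."""
    return SwitchRates(fimb=rates.fimb / factor, fime=rates.fime * factor)


def steady_state_on_fraction(rates: SwitchRates) -> float:
    """Fraction of cells with the fim promoter ON at steady state of the two-state model."""
    k_on = rates.fimb                 # OFF-to-ON: FimB only
    k_off = rates.fimb + rates.fime   # ON-to-OFF: FimB plus FimE
    return k_on / (k_on + k_off)


without_papb = SwitchRates(fimb=0.001, fime=0.01)    # hypothetical rates per generation
with_papb = apply_papb(without_papb, factor=10.0)    # hypothetical tenfold effect

print(f"ON fraction without PapB: {steady_state_on_fraction(without_papb):.4f}")
print(f"ON fraction with PapB:    {steady_state_on_fraction(with_papb):.4f}")
```

In this sketch PapB lowers the ON fraction because it both removes the only route back to ON (FimB) and accelerates the route to OFF (FimE), which is the qualitative behavior described in the text.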
A unique site-specific DNA inversion system regulates type 4 pilin (tfpQ/I) expression in both Moraxella bovis, a cow eye pathogen, and Moraxella
lacunata, a human eye pathogen. Type 4 pilin, an
important virulence factor for these pathogens, is
required for adherence to corneal and conjunctival epithelial tissues. The invertible chromosomal segment
of M. bovis contains the coding sequence for the
C-terminal regions of TfpQ and TfpI; the constant
N-terminal region of these pilin proteins and the
tfpQ/I promoter (tfpQ/Ip2) are encoded immediately
upstream of the invertible segment (Figure 1).
Inversion, mediated by the recombinase (Piv),
switches the type 4 pilin gene segment that is translationally fused to the constant region of tfpQ/I. The
DNA sequence of the inversion region of M. lacunata
is nearly identical to M. bovis with the notable
exception of a 19 bp duplication early in the tfpI
segment. Consequently, M. lacunata exhibits ON/OFF
phase variation of the TfpQ pili associated with
inversion of the pilin segment.
Table 1 Specialized recombination systems controlling alternation of gene expression in prokaryotes and eukaryotes
Recombination system | References
Piv inversion | 1
Hin inversion | 2
Gin/Cin inversion | 2
FimBE inversion | 1
Site-specific inversion | 4
Site-specific inversion or reversible insertion (?) | 5
IS492 reversible insertion | 1
IS256 reversible insertion | 1
IS1301 reversible insertion | 1
IS1301 insertional inactivation | 6
Site- and strand-specific nick, gene conversion | 7
HO endonuclease, directed gene conversion | 3
Not determined | 8
1. This article.
2. Articles on Hin-Gin Mediated Site-Specific DNA Inversion; and Insertion Sequences.
3. Article on Mating-Type Genes and their Switching in Yeasts.
4. Dybvig K, Sitaraman R and French CT (1998) A family of phase-variable restriction enzymes with differing specificities generated by high-frequency gene rearrangements. Proceedings of the National Academy of Sciences, USA 95: 13923–13928.
5. Lysnyansky I, Rosengarten R and Yogev D (1996) Phenotypic switching of variable surface lipoproteins in Mycoplasma bovis involves high-frequency chromosomal rearrangements. Journal of Bacteriology 178: 5395–5401.
6. Newcombe J, Cartwright K, Dyer S and McFadden J (1998) Naturally occurring insertional inactivation of the porA gene of Neisseria meningitidis by integration of IS1301. Molecular Microbiology 30: 453–454.
7. Arcangioli B and de Lahondes R (2000) Fission yeast switches mating type by a replication–recombination coupled process. EMBO Journal 19: 1389–1396.
8. Norris TL, Kingsley RA and Bäumler AJ (1998) Expression and transcriptional control of the Salmonella typhimurium lpf fimbrial operon by phase variation. Molecular Microbiology 29(1): 311–320.
While the organization
of these inversion regions is quite similar to many of
the Hin-related systems, the novel feature of this
inversion system is the recombinase, Piv. Based on its
primary amino acid sequence, Piv is unlike any other
known site-specific DNA recombinase. In fact, it has
recently been demonstrated that Piv is structurally
and functionally related to the transposases of the
IS110/IS492 family of insertion elements (Tobiason
et al., in press). Not surprisingly, insertion elements
from this and other IS families have been found to
control alternation of gene expression in bacteria.
Reversible transposition of insertion elements controls ON/OFF phase variation of gene expression in
Figure 1 Alternation of type 4 pilin in M. bovis. The invertible chromosomal segment, containing the alternate pilin gene
segments, tfpQ (solid box) and tfpI (cross-hatched box), and the tfpB gene (of unknown function, open box), is shown in the
orientations for TfpQ expression (solid pili) and for TfpI expression (cross-hatched pili). The reversible inversion of the
chromosomal segment is mediated by the recombinase, Piv, at the invL and invR recombination sites. piv (shaded box) is
encoded immediately downstream of the invertible segment, and is transcribed from pivp in the opposite direction of tfpQ/I.
Figure 2 ON/OFF phase variation of EPS in P. atlantica. IS492 (solid box) is inserted in eps (hatched box), an
essential gene for synthesis of EPS in P. atlantica. Excision of IS492 from a specific site within eps is mediated by the
transposase, MooV, which is encoded within the IS element (open box). Precise excision produces a circular form of
the IS element containing one copy of the duplicated 5 bp target site at the circle junction. Although the process is
reversible, the source for IS492 that reinserts into eps is not known; it may be the circular form or one of at least four
copies of the element found at different sites on the chromosome.
precise excision (Perkins-Balding et al., 1999). The frequency of precise excision can range from 10⁻⁵ to 10⁻¹
per cell per generation, depending on cell growth conditions. The regulation of excision by environmental
conditions and the high rate of precise excision are
unique to the IS492 transposition system at this
time.
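To give a sense of what such frequencies mean for a growing population, the sketch below computes the probability that a lineage has undergone at least one precise excision after a number of generations. It is an illustration under stated assumptions: the quoted range is taken to be 10^-5 to 10^-1 per cell per generation, the frequency is treated as constant, and reinsertion is ignored. The function name is hypothetical.

```python
def fraction_excised(frequency: float, generations: int) -> float:
    """Probability of at least one excision event in a lineage after the given generations."""
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must lie between 0 and 1")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return 1.0 - (1.0 - frequency) ** generations


# Compare the two ends of the assumed frequency range over ten generations.
for frequency in (1e-5, 1e-1):
    print(f"frequency {frequency:g}: {fraction_excised(frequency, 10):.6f} after 10 generations")
```

At the low end of the range the population remains almost entirely in the original state after ten generations, whereas at the high end more than half of the lineages would have switched, which is why growth conditions can change the appearance of a culture so markedly.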
Further Reading
Craig N (1996) Transposition. In: Neidhardt F et al. (eds) Escherichia coli and Salmonella Cellular and Molecular Biology, pp.
2339–2362. Washington, DC: ASM Press.
Mahillon J and Chandler M (1998) Insertion sequences. Microbiology and Molecular Biology Reviews 62(3): 725–774.
Nash H (1996) Site-specific recombination: integration, excision, resolution, and inversion of defined DNA segments. In:
Neidhardt F et al. (eds) Escherichia coli and Salmonella
Cellular and Molecular Biology, pp. 2363–2376. Washington,
DC: ASM Press.
References
Alternative Splicing
D L Black
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0035
proteins, as well as other factors. Some of these systems appear quite complex, exhibiting both positive
and negative splicing regulation. Work in this area
is currently focused on identifying splicing regulatory proteins and characterizing their interactions
with the pre-mRNA and the general splicing apparatus. However, there are likely to be multiple mechanisms contributing to the regulation of alternative
splicing.
See also: Gene Regulation; Pre-mRNA Splicing;
Sex Determination, Human; Sex Determination,
Mouse
Altruism
See: Kin Selection
Alu Family
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1753
The Alu family is a set of related distributed sequences, each approximately 300 bp long, in the human
genome. Individual members possess Alu cleavage
sites at each end.
See also: Repetitive (DNA) Sequence
Alzheimer's Disease
D C Rubinsztein
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0037
40–43 amino acids, depending on the C-terminal
cleavage site. The β-amyloid peptide is derived from
a larger protein coded for by the amyloid precursor
protein (APP) gene. The possibility that APP mutations caused AD was supported by the observations
that pathological changes indistinguishable from those
seen in AD are almost universal in Down syndrome
(trisomy 21) cases aged older than 40 years. This suggested that there was a locus (loci) on chromosome
21 that was sufficient to cause AD if present in three
copies rather than two copies. Subsequently, the APP
gene was mapped to chromosome 21 and its protein
was shown to be overexpressed in Down syndrome.
The discovery of dominant mutations in the APP gene
in early-onset AD families and characterization of
the consequences of these mutations suggested that
abnormal overproduction of β-amyloid was responsible for some cases of AD. APP mutations account
for less than 5% of familial early-onset AD.
Dominant mutations in the presenilin-1 gene on
chromosome 14 may account for up to 50% of familial
early-onset cases, while mutations in its homolog,
presenilin-2, are very rare. Most presenilin mutations
described to date in AD cases are missense, suggesting
that these may create abnormal proteins which interfere with normal metabolism by gain-of-function or
dominant-negative effects. Fibroblasts from patients
with presenilin-1 and presenilin-2 mutations oversecrete the 42-amino-acid form of β-amyloid, and
transgenic mice expressing mutant presenilin-1 overproduce β-amyloid compared to transgenic mice
expressing wild-type human presenilin-1. This provides further support for the gain-of-function nature
of these mutations and the model that β-amyloid overproduction may be a central theme in AD pathology.
The apoE gene is located on chromosome 19q13.2
and codes for a mature protein of 299 amino acids. The
three common alleles in humans are distinguished by
amino acid changes at positions 112 and 158. Apo ε3,
the commonest allele in most populations, has
cysteine at amino acid 112 and arginine at position
158. Apo ε4 has arginines at positions 112 and 158,
while apo ε2 has cysteines at these positions. ApoE
plays important roles in lipoprotein transport and the
different alleles are associated with variations in
plasma cholesterol and triglyceride concentrations.
Individuals with one copy of apo ε4 have a threefold
risk of AD, while apo ε4 homozygotes have an 11.5-fold risk, relative to apo ε3 homozygotes. In addition,
increasing apo ε4 dose is associated with earlier onset
of AD. Apo ε4 heterozygosity and homozygosity are
both associated with higher relative risks for early-onset AD compared to late-onset AD. The apo ε2 allele
is associated with a decreased relative risk for AD in
individuals aged over 65 years, compared to apo ε3.
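The allele definitions and relative risks given above can be collected in a small lookup. The following is an illustrative sketch only: the function names are hypothetical, the residues are those stated in the text for positions 112 and 158, and the risk values are the two figures quoted relative to apo ε3 homozygotes; genotypes for which the text gives no figure return None.

```python
from typing import Optional

# Residues at positions 112 and 158 that define each common apoE allele (from the text).
APOE_ALLELES = {
    ("Cys", "Arg"): "ε3",
    ("Arg", "Arg"): "ε4",
    ("Cys", "Cys"): "ε2",
}


def classify_apoe_allele(residue_112: str, residue_158: str) -> str:
    """Return the allele name for a pair of residues, or raise if it is not a common allele."""
    try:
        return APOE_ALLELES[(residue_112, residue_158)]
    except KeyError:
        raise ValueError(f"no common allele with {residue_112}112/{residue_158}158") from None


def relative_risk(allele_1: str, allele_2: str) -> Optional[float]:
    """Relative risk of AD versus ε3 homozygotes, using only the figures quoted in the text."""
    e4_copies = [allele_1, allele_2].count("ε4")
    if e4_copies == 2:
        return 11.5
    if e4_copies == 1:
        return 3.0
    if allele_1 == allele_2 == "ε3":
        return 1.0   # the reference genotype
    return None      # e.g. ε2 carriers: the text gives no numerical value


print(classify_apoe_allele("Arg", "Arg"), relative_risk("ε4", "ε4"))
print(classify_apoe_allele("Cys", "Arg"), relative_risk("ε3", "ε4"))
```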
Amber Codon
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1754
Amber Mutation
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1755
Amber Suppressors
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1756
Ames, Bruce
G F Ames
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0039
bacterial cells to low doses of oxidants allow these
cells to adapt to subsequent challenges by higher
doses of oxidants. These pioneering studies have provided the insights and foundation for understanding
how higher organisms such as mammals adapt to oxidant exposure.
Ames (with Lois Swirsky Gold) has been the leader
in painting a broad picture of the wide variety of mutagens and carcinogens to which humans are exposed.
Their carcinogenic potency data base is the definitive
reference source for all animal cancer tests. Their analyses are having an unusual impact on the prevailing
paradigm in the field. They have characterized the
large background of natural mutagens and carcinogens
and thus have put into perspective, in humans, low
exposures to synthetic chemicals, both qualitatively
and in terms of quantitative carcinogenic potency.
Ames and Gold have shown that half of the chemicals
tested in high-dose animal cancer bioassays, whether
synthetic or natural, are classified as carcinogens.
They have critically addressed the reasons for this
high positivity rate and have supported the interpretation that it is a high-dose effect: induced cell division
and cell replacement converting DNA lesions to
mutations. Thus they have made a rigorous and persuasive case that the current practice of linear extrapolation from high-dose animal cancer tests to predict
human risk for low doses of synthetic chemicals has
distorted the perception of hazard and allocation of
resources, a matter of great societal import. Ames has
also provided an intellectual bridge that connects cancer mechanisms to epidemiological results on the role
of diet in the causation and prevention of cancer.
Ames's recent research, showing that deficiencies of
micronutrients such as folic acid are a major cause of
DNA damage in humans, is likely to have a major
impact on health and prevention of cancer. He has
shown that acetyl carnitine and lipoic acid, fed to rats
at high levels, reverse some of the age-related decay of
mitochondria. These compounds may be conditional
micronutrients and thus have a major impact on delaying aging.
Bruce Ames is a Professor of Biochemistry and
Molecular Biology, and Director of the National Institute of Environmental Health Sciences Center, University of California, Berkeley. He is also a Senior
Research Scientist at the Children's Hospital Oakland
Research Institute. He is a member of the National
Academy of Sciences and was on their Commission on
Life Sciences. He was a member of the board of directors of the National Cancer Institute, the National
Cancer Advisory Board, from 1976 to 1982. His
awards include the General Motors Cancer Research
Foundation Prize (1983), the Tyler Prize for environmental achievement (1985), the Gold Medal Award
Ames Test
J G Hengstler and F Oesch
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1543
Table 1
Strain | Mutation | Lipopolysaccharide barrier | Ampicillin resistance | Tetracycline resistance
TA 98 | Frameshift in hisD3052 | rfa | R | S
TA 100 | Base substitution in hisG46 | rfa | R | S
TA 102 | Base substitution in hisG428 | rfa | R | R
TA 1535 | Base substitution in hisG46 | rfa | S | S
TA 1537 | Frameshift in hisC3076 | rfa | S | S
All strains were originally derived from Salmonella typhimurium LT2. hisD3052, mutation in the hisD gene coding for
histidinol dehydrogenase; hisG46, mutation in the hisG gene coding for the first step in histidine biosynthesis; hisG428, TA
102 contains A-T base pairs at the site of the mutation in hisG in contrast to TA 100 and TA 1535 that contain G-C base
pairs at the site of mutation; rfa, mutation that causes a strong reduction in the lipopolysaccharide layer; uvrB, a gene
involved in DNA excision repair; pKM101, plasmid that increases chemically induced and spontaneous mutagenesis by
enhancement of error-prone DNA repair; R, resistant; S, sensitive.
Figure 1 labels: chemical mutagens; endogenous processes; wild-type bacterium, 5'-CTC-3'/3'-GAG-5' (leucine), formation of colonies on minimal agar; Salmonella typhimurium TA 1535 (hisG46), 5'-CCC-3'/3'-GGG-5' (proline), no colony formation on minimal agar.
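The base substitution detected by hisG46 strains can be made concrete with the codons shown in the figure labels: CTC (leucine) in the wild type and CCC (proline) in the mutant. The sketch below uses the standard genetic code to list every single-base substitution of the mutant codon and flags the one that restores the wild-type codon. It is illustrative only and makes no claim about which of the other substitutions would also restore growth on minimal agar.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering of the three codon positions.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def single_base_substitutions(codon: str):
    """Yield the nine codons that differ from the given codon at exactly one position."""
    for position, base in product(range(3), BASES):
        if codon[position] != base:
            yield codon[:position] + base + codon[position + 1:]


WILD_TYPE = "CTC"  # leucine, as labeled in the figure
MUTANT = "CCC"     # proline, the hisG46 codon as labeled in the figure

for codon in single_base_substitutions(MUTANT):
    note = "  <- wild-type codon restored" if codon == WILD_TYPE else ""
    print(f"{MUTANT} -> {codon} ({CODON_TABLE[codon]}){note}")
```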
Specific Techniques
Two versions of the Ames test are usually applied
(Figure 2): the plate incorporation assay, where bacteria and test substance are mixed and immediately
poured onto the agar, and the preincubation assay, where
bacteria and test substance are incubated for 1 h at
37 °C before plating them on agar. The preincubation
version is more sensitive for some compounds, but
also more laborious. In routine toxicological testing, a negative
result in the plate incorporation assay has to be confirmed in a second series; the first may be a plate incorporation assay, the second a preincubation assay. Recently, a
Figure 2 labels: preincubation assay; S9 mix, bacteria, test substance; incubation 1 h, shaking water bath; top agar; incubation 48–72 h; growth of colonies; negative control; bacteria incubated with mutagen.
Figure 3 labels: administration of substances that induce liver enzymes (5–7 days); excision of liver; homogenization; centrifugation at 9000 g; sediment; S9.
Figure 3 Preparation of rat liver S9 mix. After centrifugation of liver homogenate at 9000 g, the supernatant (S9) is
used as a metabolizing system in the Ames test. S9 contains microsomes and cytosol and therefore all microsomal and
cytosolic xenobiotic metabolizing enzymes. In contrast, the sediment containing cell membranes and lysosomes is
discarded. An NADPH (cofactor for cytochrome P450-dependent monooxygenase activity)-generating system is
added to S9 to form the ``S9 mix.''
the cell membrane and loss of phase II metabolizing
enzymes owing to dilution of cofactors. Thus, bear in
mind that the Ames test is an artificial system and does
not necessarily reflect the in vivo situation. This is
illustrated by the observation that the endogenous
tripeptide glutathione and the amino acid cysteine
are both positive in the Ames test under specific conditions (Glatt et al., 1983).
Performance and interpretation of Ames test results
have been standardized by international guidelines,
such as those of the Organization for Economic Cooperation and Development (OECD Guideline 471)
and the International Conference on Harmonization
(ICH).
Future Prospects
The Ames test is a sensitive tool in screening for
potential genotoxic carcinogens. However, despite the
high correlation, a positive result is difficult to interpret
for the individual case in question, because a mutagen
in the Ames test is not necessarily harmful to humans.
These problems can be alleviated in future. It has been
shown clearly that the use of intact cells instead of S9
mix improves the correlation between carcinogenicity
and mutagenicity data (Utesch et al., 1987). Since
cryopreserved human hepatocytes are now available,
they can be used as a metabolizing system instead of
rat S9 mix (Hengstler et al., 2000). This makes it possible to test whether a positive result in the standard
rat S9 Ames test is also relevant to humans.
Further Reading
References
Amino Acids
L B Willis, P A Lessard and A J Sinskey
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0041
in nature, such as the β-amino acids, in which the
amino and carboxyl groups are attached to different
carbons in the backbone (Figure 1B).
All α-amino acids (with the exception of glycine)
have four different substituents attached to the
α-carbon and are therefore chiral molecules. Chirality,
also called handedness, is a special subset of asymmetry describing objects that have no internal plane of
symmetry and are not superimposable upon their own
mirror image. Almost all of the amino acids found
in nature have the same chirality with respect to the
α-carbon and are referred to as L-amino acids based on
chemical nomenclature guidelines. Their rare mirror-image stereoisomers are called D-amino
acids. Just as your right hand will not fit properly into
a left-hand glove, a D-amino acid will not fit properly in
a space that fits an L-amino acid.
Although hundreds of amino acids have been
identified or synthesized, 20 of these are often designated the common amino acids. These are shown in
Figure 2. In biological systems these amino acids
are the building blocks of proteins. Under the direction of messenger RNA during protein synthesis, the
amino and carboxyl side chains of two amino acids
Figure 1 Structural features of amino acids. (A) Common structure of α-amino acids; the position of the α-carbon is
labeled. `R' represents any number of possible chemical structures, which may be as simple as a single hydrogen atom
(-H, as in the amino acid glycine) or more complex chemical groups, as in (B). (B) Examples of naturally occurring
amino acids with different `R' groups. From left to right: alanine, β-alanine, and phosphinothricine. Note that in
β-alanine the amino group is attached to the β-carbon. Phosphinothricine, a herbicide, is an unusual amino acid in that it
contains phosphorus. (C) Representation of peptide bond formation: H2N-CHR1-COOH + H2N-CHR2-COOH gives H2N-CHR1-CO-NH-CHR2-COOH + H2O.
Figure 2 The 20 common amino acids. Three- and one-letter abbreviations and the composition of the `R' side
chain group are shown. Asterisks indicate which amino acids are essential for humans. Proline is an imino acid; the
entire structure and not just the side group is shown. Tyrosine can be synthesized from phenylalanine and therefore is
not essential if phenylalanine is provided.
Amino acids shown: Alanine (Ala), Arginine (Arg), Asparagine (Asn), Aspartate (Asp), Cysteine (Cys), Glutamate (Glu), Glutamine (Gln), Glycine (Gly), *Histidine (His), *Isoleucine (Ile), *Leucine (Leu), *Lysine (Lys), *Methionine (Met), *Phenylalanine (Phe), Proline (Pro), Serine (Ser), *Threonine (Thr), *Tryptophan (Trp), *Tyrosine (Tyr), *Valine (Val).
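The information in Figure 2 lends itself to a simple lookup table. The sketch below records the three-letter abbreviations and the asterisk marking as they appear in the figure; the data structure and names are illustrative choices, and tyrosine is flagged as essential only because the figure marks it with an asterisk, subject to the caveat in the caption.

```python
# Name and essentiality for humans, keyed by three-letter abbreviation (as marked in Figure 2).
COMMON_AMINO_ACIDS = {
    "Ala": ("Alanine", False),
    "Arg": ("Arginine", False),
    "Asn": ("Asparagine", False),
    "Asp": ("Aspartate", False),
    "Cys": ("Cysteine", False),
    "Glu": ("Glutamate", False),
    "Gln": ("Glutamine", False),
    "Gly": ("Glycine", False),
    "His": ("Histidine", True),
    "Ile": ("Isoleucine", True),
    "Leu": ("Leucine", True),
    "Lys": ("Lysine", True),
    "Met": ("Methionine", True),
    "Phe": ("Phenylalanine", True),
    "Pro": ("Proline", False),
    "Ser": ("Serine", False),
    "Thr": ("Threonine", True),
    "Trp": ("Tryptophan", True),
    "Tyr": ("Tyrosine", True),   # not essential if phenylalanine is provided
    "Val": ("Valine", True),
}

essential = sorted(code for code, (_, is_essential) in COMMON_AMINO_ACIDS.items() if is_essential)
print(f"{len(essential)} of {len(COMMON_AMINO_ACIDS)} are marked essential: {', '.join(essential)}")
```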
living cells. However, vertebrates, including humans,
are only able to manufacture a subset of these amino
acids. Hence they must obtain the remainder of their
amino acids from their diet. Amino acids that must be
obtained in this manner are known as essential amino
acids (Figure 2). Since proteins are composed of
amino acids, diets that are rich in protein are more
likely to contain sufficient amounts of each of the
essential amino acids to preclude any deficiencies.
In animal feeds, however, where the bulk of the
protein present may come from a single source such
as grain, imbalances in the individual essential amino
acids can occur. For example, corn (maize) provides
the bulk of protein in feed for livestock. Yet the protein found in normal field corn is disproportionately
low in lysine. To compensate, farmers routinely add
lysine to animal feed to improve its nutritional value.
Amino acids are produced commercially from a
variety of sources and for a variety of uses. For example, lysine, tryptophan, and threonine to be used as
feed supplements are produced by fermentation. In
these processes, genetically altered bacteria that
produce more of an amino acid than they need for
their own growth excrete the excess amino acid into
their growth medium. Once the desired amino acid
Amino Terminus
J Parker
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0045
In bacteria the formylation of the initiating methionine is accomplished while the methionine is attached
to the initiator tRNA (and as we have seen is commonly removed during translation of the complete
protein). However, in all organisms, modification to
the amino-terminal amino acid residue of a protein
can occur as the result of a posttranslational process.
Acetylation is quite common; for example, in Escherichia coli the amino-terminal serine of elongation
factor Tu is acetylated, as are the amino-terminal residues of several ribosomal proteins. The E. coli ribosomal proteins L7 and L12 are identical except that the
amino-terminal serine of L12 is acetylated to give L7.
In the case of L7/L12, at least, this acetylation is under
physiological control.
The identity of the amino acid residue that occurs at
the amino terminus of the mature protein also is an
important determinant of the rate of degradation of the
protein, at least in bacteria, eukaryotic microorganisms, and animal cells. This is sometimes referred to
as the N-end rule. Interestingly, amino acid residues
that are uncovered by the action of methionine aminopeptidase (see above) are stabilizing, whereas the
presence of other amino acid residues at the amino
terminus leads to more rapid degradation of the protein. Therefore, the requirement organisms have for
this enzyme could relate to stabilization of proteins
rather than the necessity of removing methionine to
achieve full activity.
The stability of proteins can be affected by adding residues to the amino terminus in a nontemplated
fashion. In eukaryotes this can be accomplished by
arginyl-tRNA protein transferase which adds an arginine residue to amino-terminal glutamyl or aspartyl
residues of certain proteins as part of a pathway by
which these proteins are degraded. Prokaryotes contain a leucyl/phenylalanyl-tRNA protein transferase
which serves a similar function.
See also: Leader Peptide; Translation
Aminoacyl-tRNA
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1757
Aminoacyl-tRNA
Synthetases
E J Murgola
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0043
where AA is an amino acid and AARS is the corresponding AA-tRNA synthetase. In the first step, the
amino acid activation step, some synthetases (such as
GlnRS, ArgRS, and GluRS from Escherichia coli) also
require the presence of tRNA for catalysis of the
activation. When an amino acid is attached to a
tRNA (step 2), the tRNA is said to be aminoacylated
or charged. The terms 'uncharged tRNA,' 'unacylated tRNA,' or 'deacylated tRNA' refer to a tRNA
molecule lacking an amino acid. If the tRNA specific
for tryptophan (Trp) is acylated with tryptophan, the
product would be termed tryptophanyl-tRNA (Trp-tRNA) or, more precisely, Trp-tRNA^Trp. If a glutamine (Gln)-specific tRNA were misacylated with Trp,
the product would be termed Trp-tRNA^Gln.
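The two steps referred to in this paragraph (amino acid activation, then transfer to the tRNA) are conventionally written as below. This is the standard textbook form of the reaction, given here as a sketch; AA-AMP (the aminoacyl-adenylate intermediate) and PP_i (inorganic pyrophosphate) are symbols assumed for illustration and do not appear in the surrounding text.

```latex
\begin{align*}
\text{Step 1 (activation):}\quad
  & \mathrm{AA} + \mathrm{ATP}
    \xrightarrow{\ \mathrm{AARS}\ }
    \mathrm{AA\text{-}AMP} + \mathrm{PP_i} \\
\text{Step 2 (transfer):}\quad
  & \mathrm{AA\text{-}AMP} + \mathrm{tRNA}
    \xrightarrow{\ \mathrm{AARS}\ }
    \mathrm{AA\text{-}tRNA} + \mathrm{AMP}
\end{align*}
```

Summing the two steps gives AA + ATP + tRNA yielding AA-tRNA + AMP + PP_i, which is why a charged tRNA such as Trp-tRNA^Trp costs one ATP per amino acid attached.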
The specific synthetases are denoted by their three-letter amino acid designation and 'RS'; for example,
GlyRS for glycyl-tRNA synthetase. Despite their
conserved mechanisms of catalysis, the AARSs differ
in the nature of the active site for the amino acid and
ATP as well as with respect to tRNA identity elements
recognized and the modes of binding to the tRNAs.
Accuracy in the formation of a particular AA-tRNA
involves factors such as tRNA identity determinants
for a particular AARS, existence of antideterminants
in certain tRNAs that prevent interaction with
Aminopterin
J Parker
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0044
Amniocentesis
M A Ferguson-Smith
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0046
more rarely, other fetal lesions associated with the
leakage of fetal blood plasma into the amniotic fluid,
for example congenital nephrosis and placental
hemangioma. However, by far the most common use
of amniocentesis is for the diagnosis of chromosomal
trisomy, especially trisomy 21 (Down syndrome). In
these cases the indication for amniocentesis may be a
maternal age greater than 35 years, as the risk of
Down syndrome increases with maternal age. In
pregnancies affected by Down syndrome certain
maternal serum constituents can act as markers for
the syndrome; thus free β-chorionic gonadotrophin
is elevated and serum AFP is decreased in affected
pregnancies. These and other markers are used in
prenatal screening programs to estimate the risk of
an affected pregnancy at all maternal ages. The serum
screening programs, taken together with the results
of prenatal ultrasound examination for fetal nuchal
translucency, are widely used by mothers who wish
to know their risk of giving birth to an affected child.
The screening test is not diagnostic and amniocentesis
is required to determine if the pregnancy is affected.
A certain diagnosis of a severe abnormality gives the
parents the option of termination of pregnancy.
See also: Down Syndrome; Prenatal Diagnosis
Amplicons
R Palacios
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1634
return to the basal nonamplified state without disrupting the structure of the genome.
The dynamic nature of a DNA amplified region
continuously generates closed circular structures that
consist of monomers or multimers of the whole amplicon. Due to the lack of an origin of replication these
structures will be lost as the cell divides. However,
some genetic manipulations may allow the recovery
of such structures. This will facilitate the molecular
characterization of amplicon structures.
The main factors influencing the rate at which an
amplicon can duplicate or delete are the length and
sequence conservation of the recombining repeats.
The specific nature of the repeated sequences is unimportant. Among the reiterated sequences that
participate as amplicon borders, there are common
inhabitants of many genomes, such as ribosomal operons and insertion sequences (IS) of different kinds. On
the other hand, examples of species-specific repeated
sequences are the recombination hot spot sequences
(rhs) found in Escherichia coli, or the repeated nitrogenase operons found in several Rhizobium species.
Amplicons define a structural characteristic of the
genome. Any replicon, chromosome, or plasmid can
be viewed as a structure formed by overlapping amplicons. The only exception would be a replicon devoid
of repeated sequences. The length of specific amplicons may vary from a few kilobases to more than
one megabase. The number and size distribution
of amplicons in a genome depend on the number,
location, and relative orientation of long repeated
sequences. These factors, coupled to the genome
architecture (chromosomal vs. plasmidic, linear vs.
circular), contribute to shaping the "amplicon structure" of a particular genome.
The biological role of DNA amplification in prokaryotes might be related to adaptation to extreme
conditions or situations that impose severe demands
on the ability of regulatory systems. Under some conditions, overexpression of gene products through gene
amplification may confer the phenotypic advantages
needed for survival. The amplified state would remain
as long as the selective condition exists; when the
selective condition disappears, the population will
return to the basal nonamplified state.
Different examples of natural DNA amplification
have been correlated with adaptive situations in different organisms. Among these are the following:
increased resistance to antibiotics; increased resistance
to heavy metals; growth under conditions of nutrient
scarcity; growth on different exotic (nonnatural) carbon sources; enhancement of pathogenicity; and enhancement of the capacity to fix nitrogen.
In addition to its role in adaptation to transient
conditions, DNA amplification provides new subjects
Anagenesis
E Mayr
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0048
Analogy
E O Wiley
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0049
Ancestral Inheritance Theory
M E Magnello
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0050
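\[ \frac{S_{xy}}{S_x^{2}} = \frac{\text{covariance}}{\text{variance of } x} \]

\[ \frac{S_{xy}}{S_x S_y} = \frac{\text{covariance}}{\text{standard deviation of } x \times \text{standard deviation of } y} \]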
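The two ratios given here, covariance over the variance of x and covariance over the product of the two standard deviations, can be computed directly from paired measurements. The sketch below is illustrative only: the function names are chosen for this example, the sample (n - 1) denominator is an assumption, and the parent and offspring values are invented rather than taken from any data set.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)


def covariance(x, y):
    """Sample covariance of paired values (n - 1 denominator, an assumption)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def variance(x):
    """Sample variance, i.e. the covariance of x with itself."""
    return covariance(x, x)


def regression_coefficient(x, y):
    """Covariance divided by the variance of x."""
    return covariance(x, y) / variance(x)


def correlation_coefficient(x, y):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(x, y) / (sqrt(variance(x)) * sqrt(variance(y)))


if __name__ == "__main__":
    # Hypothetical paired measurements (e.g. parent and offspring values).
    parents = [64.0, 66.0, 68.0, 70.0, 72.0]
    offspring = [66.0, 66.5, 68.0, 69.5, 70.0]
    print("regression coefficient:", regression_coefficient(parents, offspring))
    print("correlation coefficient:", correlation_coefficient(parents, offspring))
```

The two quantities share a numerator and differ only in how they are scaled: the first keeps the units of y per unit of x, whereas the second is dimensionless and bounded between -1 and 1.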
Anchor Locus
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0051
mapping studies with loci that have not been previously mapped. With the use of anchor loci distributed at 20 cM intervals throughout a genome, it is
typically possible to establish map locations for
newly characterized loci in the context of new breeding studies.
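The effect of the 20 cM spacing can be illustrated with a small calculation: if anchors are placed every 20 cM along a chromosome, no position between the first and last anchor is more than 10 cM from an anchor. The sketch below assumes evenly spaced anchors starting at position 0; the function names and the 100 cM chromosome length are hypothetical and serve only as an example.

```python
def anchor_positions(chromosome_length_cm, spacing_cm=20.0):
    """Positions (in cM) of evenly spaced anchor loci, starting at 0.

    Even spacing from position 0 is an assumption made for illustration.
    """
    positions = []
    pos = 0.0
    while pos <= chromosome_length_cm:
        positions.append(pos)
        pos += spacing_cm
    return positions


def distance_to_nearest_anchor(locus_cm, anchors):
    """Map distance (in cM) from a locus to the closest anchor locus."""
    return min(abs(locus_cm - a) for a in anchors)


if __name__ == "__main__":
    anchors = anchor_positions(100.0)  # hypothetical 100 cM chromosome
    print("anchors:", anchors)
    print("new locus at 47 cM is",
          distance_to_nearest_anchor(47.0, anchors), "cM from an anchor")
```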
See also: Gene Mapping; Marker
Anchorage-Independent Growth
A C Lloyd
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1544
Androgenone
W Reik
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0053
Androgenetic embryos are useful for genetic mapping purposes or for the rapid recovery of mutations.
In mice androgenetic development only progresses
halfway through gestation, to early postimplantation
stages. These embryos have reasonably well developed
extraembryonic tissues such as the trophoblast, but
the embryo proper is retarded and often abnormal.
This developmental failure is explained by the phenomenon of genomic imprinting whereby certain
genes in eutherian mammals are expressed from
only one of the parental chromosomes. Androgenetic
embryos thus lack gene products that are
made only by the maternal genome, and overexpress gene products made by the paternal genome.
In humans androgenetically developing embryos can
lead to hydatidiform moles, in which there is proliferation of trophoblast tissue but fetal tissues are absent
or abnormal. The existence in nature of androgenetically reproducing species, or normal development following experimental production of androgenones,
indicates that genomic imprinting is largely absent in
these species.
See also: Hydatidiform Moles
Aneuploid
R K Herman
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0054
An aneuploid cell or organism has a number of chromosomes that differs from the normal chromosome
number for the species by one or a few chromosomes.
If the normal haploid chromosome number is n, then
a somatic diploid cell has 2n chromosomes. An aneuploid somatic cell with one extra chromosome has
2n + 1 chromosomes and is said to be trisomic, since
three versions or homologs of one chromosome are
present. A double trisomic would be denoted 2n + 1 + 1,
whereas a tetrasomic would be referred to as 2n + 2. A
monosomic would have 2n − 1 chromosomes.
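The notation can be made concrete with a short sketch that builds the formula from per-chromosome deviations in copy number. The function names and the representation (a list of gains and losses relative to the normal two copies) are assumptions made for this example; the human haploid number n = 23 is used only as a familiar case.

```python
from typing import Sequence

# Names for the patterns mentioned in the text, keyed by sorted deviations.
_NAMES = {
    (1,): "trisomic",
    (1, 1): "double trisomic",
    (2,): "tetrasomic",
    (-1,): "monosomic",
}


def aneuploid_notation(deviations: Sequence[int]) -> str:
    """Build the 2n-based formula, one term per affected chromosome."""
    parts = ["2n"]
    for d in deviations:
        if d == 0:
            continue  # a chromosome present in two copies adds no term
        sign = "+" if d > 0 else "-"
        parts.append(f"{sign} {abs(d)}")
    return " ".join(parts)


def chromosome_count(n: int, deviations: Sequence[int]) -> int:
    """Total somatic chromosome number for haploid number n."""
    return 2 * n + sum(deviations)


def aneuploid_name(deviations: Sequence[int]) -> str:
    """Name the pattern if it is one of those described in the text."""
    key = tuple(sorted((d for d in deviations if d != 0), reverse=True))
    if not key:
        return "euploid"
    return _NAMES.get(key, "other aneuploid")


if __name__ == "__main__":
    for devs in ([1], [1, 1], [2], [-1]):
        print(aneuploid_notation(devs), "=", chromosome_count(23, devs),
              "chromosomes,", aneuploid_name(devs))
```

Keeping one term per affected chromosome is what distinguishes a double trisomic (2n + 1 + 1) from a tetrasomic (2n + 2), even though both have the same total chromosome count.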
Aneuploid organisms are generally less healthy
than euploids, which have each chromosome represented the same number of times (apart from the two
different sex chromosomes in the heterogametic sex).
The deleterious effect of aneuploidy is thought to be
caused not by the altered dosage of a single gene, but
by the cumulative effect of an imbalance in the gene
activities of many genes on the extra or missing chromosome. This explains why the severity of a trisomic
phenotype generally increases with the size of the
trisomic chromosome (excluding sex chromosomes,
the expression of which may be dosage compensated).
Angiogenesis
J Folkman
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1545
[Figure: Common configurations of microvessel-tissue cell units. Panels (A)-(D); labels: islet cells, liver cells, tumor cells, epithelial cells exposed to environment (epidermis, gut, genitourinary), oxygen diffusion limit 100-200 µm.]
[Table 1. Columns: Protein; Year reported.]
Tumor Angiogenesis
How Tumors Become Angiogenic
Most human tumors arise as a microscopic-sized colony of cells. The colony usually stops expanding
when it reaches a population of approximately 1 million cells and a diameter of 0.2 to 2 millimeters. This is
growth factor (PDGF), interleukin-8 (IL-8), hepatocyte growth factor, angiogenin, tumor necrosis factor
alpha (TNF-alpha), and other angiogenic proteins.
In some tumors an oncogene is responsible for
overexpression of a positive regulator of angiogenesis.
For example, the ras oncogene upregulates expression of
VEGF. In other tumor types, concomitant downregulation of an endogenous negative regulator of angiogenesis is also required for the angiogenic switch.
When the tumor suppressor gene p53 is mutated or
deleted, for example, thrombospondin, an endogenous
angiogenesis inhibitor, is downregulated. At the time
of writing, it is being discovered that other tumor-suppressor genes control negative regulators of angiogenesis. The tumor cell is not the sole regulator of the
angiogenic switch. Tumor angiogenesis can be potentiated by hypoxia. Tumor cells which are located more
than 100-150 µm away from a blood vessel become
hypoxic. Hypoxia activates a hypoxia-inducible
factor (HIF-1)-binding sequence in the VEGF promoter. This leads to transcription of VEGF mRNA,
increased stability of VEGF message, and increased
production of VEGF protein beyond what may have
initially been triggered by an oncogene. HIF-1 also
increases the transcription of genes for PDGF-BB and
nitric oxide synthase (NOS). Mast cells attracted to
the tumor bed can potentiate angiogenesis by releasing
enzymes (metalloproteinases) which mobilize VEGF
(and/or bFGF) from their heparan sulfate proteoglycan storage sites in extracellular matrix. Tumor angiogenesis can also be potentiated by angiogenic proteins
released by fibroblasts in the tumor bed or by macrophages attracted to the tumor bed. Tumor angiogenesis may also be modified by endogenous angiogenesis
inhibitors which either circulate (e.g., interferon-beta, platelet factor 4, angiostatin, and others) or are
releasable from extracellular matrix (e.g., endostatin,
thrombospondin-1, and tissue inhibitors of metalloproteinases (TIMPs)). The intensity of tumor angiogenesis is also governed by the angiogenic response of
the host, which is genetically regulated. It is known
that hemangiomas predominate in white infants and
that ocular neovascularization in macular degeneration is common in whites, but rare in black patients.
These clinical observations led to a recent experimental finding that a constant dose of angiogenic stimulation by FGF in mouse corneas yielded a 10-fold
difference in angiogenic response in mice of different
genetic backgrounds.
Other molecules which help to regulate neovascularization include ephrins, which specify arterial or
venous development of capillary vessels, and cyclooxygenase-2, the production of which is stimulated
by bFGF and which converts lipid precursors to prostaglandin E2 (PGE2), an angiogenic stimulator.
Neovascularization
Angiogenesis Inhibitors
Endogenous Angiogenesis Inhibitors
more recently discovered endogenous inhibitors, a
function is less clear (Table 2). The first endogenous
angiogenesis inhibitors were discovered in the 1980s:
interferon alpha/beta, platelet factor 4, and the class
of angiostatic steroids typified by tetrahydrocortisol.
However, thrombospondin-1 was the first endogenous inhibitor for which there was compelling evidence
of participation in the angiogenic switch, because it
was downregulated in tumors before angiogenesis
could be triggered. By 1989 the switch itself was
understood as the result of a shift in the "net balance"
of angiogenesis stimulators and inhibitors. This led to
the discovery of angiostatin and endostatin, two very
potent and specific endogenous angiogenesis inhibitors. Angiostatin is a 38-kDa internal fragment of
plasminogen. Endostatin is a 20-kDa internal fragment
of collagen XVIII. The discovery of these two proteins came from a clinical clue. Surgical removal of
some tumors sometimes results in rapid growth of
remote metastases. This phenomenon had been
observed in both animals and humans for more than
50 years, but had always been difficult to explain.
Table 2
Name
Molecular
weight (kDa)
Year
reported
Reference
Nature 297: 307
Science 208: 516
Endocrinology 133: 1292
Cell 88: 277
Cell 79: 315
Science 285: 1926
Journal of the National Cancer Institute 87: 581
Journal of Experimental Medicine 182: 155
Journal of Experimental Medicine 188: 2349
Journal of Biological Chemistry 275: 1209
Biochemical and Biophysical Research
Communications 255: 735
Proceedings of the National Academy of Sciences,
USA 96: 2645
Science 285: 1926
Platelet factor 4
Interferon alpha
Prolactin fragment
Angiostatin
Endostatin
Antithrombin III
Interleukin-12
Inducible protein 10
Vasostatin
Canstatin
Restin
21
24
22
1982
1980
1993
1994
1997
1999
1995
1995
1998
2000
1999
Troponin I
22
1999
50
1999
PEX
Id1 and Id3
VEG1
Proliferin-related protein (PRP)
Meth-1 and Meth-2
Osteopontin cleaved product
Maspin
26
16
38
20
53
1994
110, 98
1998
1999
1999
1994
1999
1999
2000
Data from Folkman (2000) and in part from Carmeliet and Jain (2000).
Therapeutic administration of an angiogenesis inhibitor can tip the balance of the angiogenic switch so
that angiogenic output of a tumor is opposed or abrogated completely. At the time of writing 20 angiogenesis inhibitors are in clinical trials for patients with
cancer in multiple medical centers in the USA. Seven
of these have reached Phase III. Each phase of a clinical trial usually takes at least 2 years. At least five
angiogenesis inhibitors are in clinical trial in the USA
for the treatment of eye diseases such as diabetic
retinopathy and macular degeneration, which are
dominated by pathological neovascularization. Two
inhibitors are in Phase III. Antiangiogenic gene
therapy using genes which code for endogenous antiangiogenic proteins has been reported in a variety of
tumor-bearing animal systems. While no clinical
translation of this technology has been initiated, antiangiogenic gene therapy holds great promise for the
future, especially because certain endogenous angiogenesis inhibitors, such as angiostatin and endostatin,
are not toxic and do not induce drug resistance.
monthly basis. At least one angiogenic protein, VEGF,
is known to mediate the neovascularization in these
lesions. VEGF is upregulated by increased estrogen
and downregulated by withdrawal of estrogen.
Approximately 780 000 women in the USA suffer
from endometriosis. No clinical trials of angiogenesis
inhibitors for endometriosis have been initiated at the
time of writing, but these inhibitors may be beneficial
in this disease.
In cardiology, atherosclerotic plaques in the coronary arteries have been shown to be intensively neovascularized. In experimental studies, mice deficient in
apolipoprotein E and fed a Western diet develop large
atherosclerotic plaques in the aorta. These plaques are
also highly neovascularized and plaque growth is
significantly inhibited by antiangiogenic therapy of
the mice. Thus, growth of atherosclerotic plaques
may be angiogenesis-dependent. In certain cases of
ischemic heart disease attempts are being made to
increase angiogenesis in the myocardium by implanting genes for angiogenic proteins. Local therapeutic
angiogenesis has been tested in animals and is being
tested in a very few early clinical trials in humans. It
remains to be seen if local stimulation of angiogenesis
in specific cases of ischemia which is refractory to
conventional therapy will be beneficial.
available which should permit improvements in therapy for many of these diseases. Thus, angiogenesis is a
unifying process that has heuristic value across many
medical specialities.
Summary
Annealing
D M J Lilley
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0055
Anonymous Locus
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0056
Antibiotic Resistance
P M Hawkey
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0057
Enzymatic Modification
Decreased Uptake
The target site for antibiotic action can also be protected by preventing the antibiotic from entering the
cell or pumping it out via an efflux pump faster than it
can flow in. β-Lactam antibiotics gain intracellular
access to gram-negative bacteria via a water-filled hollow membrane protein known as a porin. Imipenem is
a carbapenem β-lactam antibiotic, and some imipenem-resistant Pseudomonas aeruginosa strains lack the specific D2 porin by which imipenem is taken up into the
cell. A similar mechanism involving other porins is
seen in low-level fluoroquinolone- and aminoglycoside-resistant gram-negative bacteria. Increased efflux
via an energy-dependent membrane transport pump is
a common mechanism for resistance to tetracyclines in
gram-negative bacteria and is encoded by a range of
related genes such as tet(A) that are widely distributed
in Enterobacteriaceae. The marRAB operon associated with the MAR phenotype (multiple antibiotic
resistance) probably works in part by influencing the
expression of distant genes such as micF, which encodes
an antisense RNA that inhibits ompF translation, leading to a reduction in the OmpF porin and reduced
entry of antibiotics such as tetracycline, β-lactams,
fluoroquinolones, and nalidixic acid. In addition,
active efflux of tetracycline and chloramphenicol has
been seen to be associated with the MAR phenotype.
seen in vancomycin-resistant Enterococcus spp. encoded by the vanA gene complex is an interesting
variant of the bypass mechanism. Vancomycin-sensitive enterococci have a target for vancomycin which is
a cell wall precursor that contains a pentapeptide that
has a D-alanine-D-alanine terminus, to which the vancomycin binds, preventing peptidoglycan synthesis.
Enterococci carrying vanA can make an alternative
cell wall precursor ending in D-alanine-D-lactate, to
which vancomycin does not bind. Other genes in the
vanA cluster contribute to resistance, e.g., VanX
cleaves the normal D-alanine-D-alanine should any
be made, thus enhancing the role of the alternative
synthetic pathway.
Antibiotic-Resistance Mutants
J Parker
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0058
Antibiotic-resistant organisms can arise from antibiotic-sensitive organisms in a number of ways. Genes,
or sets of genes, conferring antibiotic resistance may
be obtained via resistance plasmids (R plasmids), integrons, or transposable elements. These genes often
encode proteins responsible for modifying or destroying the antibiotic, affecting the uptake or efflux of the
antibiotic, or modifying the cellular target of the antibiotic. This mechanism is of great practical importance, as horizontal transfer of antibiotic resistance to
pathogens is putting severe constraints on antibiotic
therapy for treatment of infectious disease. This topic
is covered in the entries on Antibiotic Resistance,
Drug Resistance, and Resistance Plasmids.
Many of the resistance genes present on transposable elements, viruses, or specially constructed cassettes also have important uses as mutagenic agents
in the laboratory, either in transposon mutagenesis or
in various in vitro methods (for example see Transposons as Tools). In these cases the antibiotic-resistance
phenotype is of importance to the geneticist because it
can be positively selected.
Antibiotic resistance can also be the result of a
mutation in one or more chromosomal genes in a
sensitive strain. Sometimes these genes are also involved in uptake, destruction, or modification of the
antibiotic. However, in some cases the resistance arises
because the mutation is in the gene encoding the
cellular target of the antibiotic. These latter antibiotic-resistance mutations have been invaluable for dissection of the complex reactions in macromolecular
synthesis. In this entry we will discuss a few examples
of antibiotic resistance in the bacterium Escherichia
coli which result from such mutations.
Coumarins
The coumarins, such as coumermycin A and novobiocin, are antibiotics that inhibit certain DNA topoisomerases, enzymes that catalyze interconversions of
different topological isomers of DNA. One of the
topoisomerases that the coumarins inhibit is bacterial
DNA gyrase. This enzyme introduces negative supercoils into DNA and is an essential enzyme in DNA
replication. Specifically, the coumarins inhibit the
activity of the B subunit which catalyzes the ATP
hydrolysis involved in the enzyme reaction. Some
mutations of the gyrB gene, which encodes the B subunit
of this enzyme, lead to resistance to these antibiotics.
Erythromycin
Erythromycin is one of the macrolides, a large group
of structurally related antibiotics that inhibit protein
synthesis. Erythromycin has been shown to bind to
the large ribosomal subunit in the peptidyltransferase
region of the 23S ribosomal RNA (rRNA). Resistance
can arise from mutations in at least three different
genes encoding large subunit ribosomal proteins. Curiously, in E. coli, genetic elimination of ribosomal
protein L11 makes the cells hypersensitive to erythromycin. Resistance can also arise from specific mutations in the gene encoding 23S rRNA, mutations
which must be constructed in organisms like E. coli
that have multiple copies of this gene. These mutations
are in a region of the 23S rRNA which is protected by
specific methylation in the organism that produces
erythromycin. Methylation at this site in E. coli leads
to erythromycin resistance, but the gene that encodes
the specific methylase must be acquired by horizontal
gene transfer.
Fusidic Acid
The antibiotic fusidic acid inhibits the translational
elongation factor, EF-G, which promotes translocation of the ribosome from one codon on the messenger
RNA to the next. Mutants of EF-G are known that are
resistant to fusidic acid, and they are responsible for
the gene encoding this factor being termed fus. Many
such mutations inhibit the growth rate of the cell as
well as the rate of translation elongation.
Kasugamycin
The aminoglycoside kasugamycin acts as an inhibitor
of translation initiation. Sensitivity to kasugamycin is
dependent on the presence of two dimethyladenosine
residues found near the 3′ end of 16S rRNA. Ribosomes whose 16S rRNA is missing these methylated
adenosine residues are resistant to the antibiotic
and, therefore, mutants defective in the methylase
(encoded by the gene ksgA) are kasugamycin resistant.
Cells containing this undermethylated rRNA also
have a reduced growth rate.
Kirromycin
The polyenic antibiotic kirromycin inhibits the translational elongation factor, EF-Tu, blocking its exit
from the ribosome. Kirromycin-resistant alleles of
the tuf genes, which encode EF-Tu, have been isolated. Individually, these mutations lead to amino acid
substitutions at one of a small number of sites. Many
of these alleles lead to an increase in various errors in
translation. Many bacteria, including E. coli, have
duplicate tuf genes (typically tufA and tufB) and the
alleles conferring resistance are recessive to the wild-type alleles. Kirromycin-resistant cells grow more
slowly than wild-type cells. Resistance to kirromycin-like antibiotics in some of the actinomycetes that
produce them is also the result of these organisms
having a resistant EF-Tu.
Quinolones
The quinolones, such as nalidixic acid, also inhibit
bacterial DNA gyrase. However, unlike coumarins
(see above) these antibiotics specifically inhibit the
A subunit, the nicking-ligating component of the
enzyme. Certain mutants of the gene encoding this
subunit, gyrA (formerly nalA), confer resistance to
nalidixic acid and other quinolones.
Rifampicin
The rifamycins are a group of antibiotics synthesized
by certain Streptomyces species. One of these antibiotics, rifampicin, specifically inhibits the bacterial
RNA polymerase by binding to the β subunit of this
enzyme. Rifampicin blocks transcription at, or shortly
after, the initiation of an RNA chain by the polymerase, but it does not block the elongation of chains
already initiated. Rifampicin-resistant mutants are
readily isolated and are found to have mutations in
rpoB (formerly rif), the gene encoding the β subunit.
The mutations are point mutations or small, in-frame
insertions and deletions at a limited number of sites
which result in amino acid substitutions leading to the
loss of the ability of the enzyme to bind rifampicin.
Rifamycin-resistant mutations can have pleiotropic
phenotypes, such as temperature sensitivity or a
change in the regulation of transcription of some genes.
Streptomycin
The aminoglycoside antibiotic streptomycin inhibits
protein synthesis in bacteria by binding to a specific
site on 16S rRNA. Streptomycin-resistant mutants of
E. coli were first reported in 1950. These mutants have
amino acid substitutions in ribosomal protein S12,
encoded by the rpsL gene (formerly str). Streptomycin itself can increase many types of translational
errors that occur on the ribosome. Many, but not all,
streptomycin-resistant mutants are said to be "restrictive" in that they reduce the error frequency below
that of wild-type ribosomes. All such restrictive mutations also result in a decreased growth rate and a
decreased peptide chain elongation rate. In at least
some cases, these effects can be compensated for by
second-site mutations outside rpsL without diminishing the level of streptomycin resistance. Streptomycin-resistant mutants are recessive in cells that also contain
a wild-type allele. Streptomycin resistance can also
arise from mutations in the genes encoding 16S
rRNA. These mutations are also recessive to the
wild-type allele and in organisms containing multiple
rRNA genes these mutations are typically constructed
by in vitro techniques.
See also: Antibiotic Resistance; DNA Replication;
Drug Resistance; Elongation Factors; Escherichia
coli; Integrons; Resistance Plasmids; Resistance to
Antibiotics, Genetics of; Ribosomal RNA (rRNA);
Ribosomes; RNA Polymerase; Streptomyces;
Streptomycin; Topoisomerases; Transcription;
Translation; Transposable Elements; Transposons
as Tools
Antibody
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1759
Anticodons
A Liljas
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0059
tRNA
Crick, in his adaptor hypothesis, proposed that small
RNA molecules would be the adaptors that could be
charged with amino acids by specific enzymes and
that could identify the codons (triplets of nucleotides)
of the mRNA by base-pairing. These adaptors could
thus participate in incorporating the amino acids into a
growing polypeptide. Subsequently these adaptors
were identified and are now known as the tRNA
molecules. From the nucleotide sequences of numerous tRNA molecules, the secondary structure of the
tRNA, the classic cloverleaf, has been identified. Of
the three loops, the middle one contains the anticodon
of the tRNA. The three-dimensional structure of
tRNA has the shape of an "L." Here the anticodon is
located at one end and the 3′ acceptor for amino acids
is at the opposite end, approximately 80 Å away. This
means that the anticodon has no possibility of interacting with the amino acid. This also means that, when
the tRNA assists in the incorporation of the amino
acid into the growing polypeptide on the ribosome,
the interaction of the anticodon with the mRNA is far
from the site of peptidyl transfer. The anticodons are
frequently posttranscriptionally modified. This concerns the bases as well as the riboses.
Anticodons
The anticodon is composed of three nucleotides, normally positions 34-36 of the tRNA, that read the
codons of the mRNA, primarily by Watson-Crick
base-pairing. However, the same tRNA can base-pair
with different nucleotides in the third position of the
Antigen
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1761
Antigenic Variation
K L Hill and J E Donelson
Fidelity
The fidelity of the decoding depends on the interactions between the anticodon of the tRNA and the
codon of the mRNA. The fidelity is in the order of
1 error per 10 000 incorporated amino acid residues.
The fidelity is a result of two main processes on the
ribosome: the initial recognition of the tRNA by
the ribosome-bound mRNA and the proofreading
by the ribosome. The initial recognition of a tRNA
is done while the tRNA is bound to elongation factor
Tu (EF-Tu) in complex with GTP. The fidelity of this
step is in the order of 1:100. The proofreading occurs
once the GTP molecule bound to EF-Tu has been
hydrolyzed and EF-Tu has dissociated from the
tRNA and the ribosome. Before the tRNA can participate in peptidyl transfer, it has to reorient itself to
place the amino acid into its correct position of the
peptidyl transfer site while maintaining the codon-anticodon
interaction. During and after this reorientation the aminoacyl-tRNA can fall off from the
ribosome or proceed to participate in peptidyl transfer.
This increases the fidelity to what has been observed
in vivo. Certain antibiotics or ribosomal mutations
can affect the fidelity significantly.
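The figures quoted above imply a rough size for the proofreading contribution. If the two steps are assumed to act independently, so that their error frequencies multiply (a simplifying assumption, not a claim made in the text), an initial selection error of 1 in 100 and an overall error of 1 in 10 000 leave about a further factor of 100 to proofreading. The function names below are chosen for this sketch.

```python
from fractions import Fraction


def combined_error(initial_error, proofreading_error):
    """Overall error frequency if the two steps discriminate independently."""
    return initial_error * proofreading_error


def required_proofreading_error(overall_error, initial_error):
    """Error frequency the proofreading step must contribute, under the
    same independence assumption."""
    return overall_error / initial_error


if __name__ == "__main__":
    initial = Fraction(1, 100)      # initial recognition, from the text
    overall = Fraction(1, 10_000)   # overall fidelity, from the text
    proofreading = required_proofreading_error(overall, initial)
    print("implied proofreading error frequency:", proofreading)
    print("check, combined error:", combined_error(initial, proofreading))
```

Exact fractions are used rather than floating-point numbers so that the ratios stay in the "1 in N" form used in the text.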
Antigenic variation is a process by which many infectious agents, including some pathogenic viruses, bacteria, fungi, and parasites, evade the defense responses
of the vertebrate immune system. A major component
of the immune system is the generation of a specific
group of proteins, called antibodies, that attack invading pathogens by recognizing and binding to molecules on the pathogen's surface. Surface molecules that
elicit this immune response are called 'antigens' ('antibody generators') and can be proteins, carbohydrates,
or lipids. Thus, pathogens that can periodically change
or switch the molecular composition of their surface
antigens are said to undergo antigenic variation. This
periodic variation provides a means for individual
organisms within a population to temporarily camouflage themselves and thereby prevent elimination of
the entire population by the host's immune system. In
some pathogens, antigenic variation is accomplished
through random mutations in the genes encoding
either surface molecules themselves or the enzymes
that synthesize them. In other pathogens, antigenic
variation is mediated by mechanisms that function
Further Reading
Swanson J, Belland RJ and Hill SA (1992) Neisserial surface
variation: how and why? Current Opinion in Genetics and Development 2: 805-811.
Two-Hit Model
Anti-Oncogenes
E Maher and F Latif
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0060
Gene          Chromosomal locus   Neoplasm
RB1           13q14               Retinoblastoma
APC           5q21                Colorectal cancer
p53           17p13               Sarcomas, gliomas, carcinomas
NF1           17q11.2             Neurofibromatosis type 1
NF2           22q12               Neurofibromatosis type 2
WT1           11p12               Wilms's tumor
BRCA1         17q21               Familial breast cancer
BRCA2         13q12               Familial breast and ovarian cancer
VHL           3p25                von Hippel-Lindau disease
p16           9p21                Familial melanoma
DPC4/SMAD4    18q21.1             Pancreatic carcinoma
PTC           9q22.3              Nevoid basal cell carcinoma syndrome
TSC1          9q34                Tuberous sclerosis
TSC2          16p13.3             Tuberous sclerosis
Protein Products
The first TSG cloned (RB1) was the gene that causes
retinoblastoma in children. Since then more than a dozen
TSGs have been isolated, including p53, APC,
and VHL. These genes are also known as ``gatekeepers'' because they prevent cancer through direct control
of cell growth. The protein products of TSGs are
known to play various roles in cell cycle control
(RB1 prevents cells in G0/G1 phase from entering
S phase of the cell cycle, whilst p53 acts in late
G1 phase, preventing cells from progressing to S
phase), apoptosis (in response to DNA damage
there is a rapid increase in the level of p53, which
arrests the cell cycle in G1 and allows the cell to
repair its DNA; if repair is not possible, p53 induces
programmed cell death, or apoptosis), and transcriptional regulation (the WT1 protein is a transcription factor
that can bind to specific DNA sequences, causing transcriptional activation or repression).
Clinical Implications
As well as providing insights into the mechanisms
that regulate normal cell proliferation, the study of
TSGs should eventually lead to novel therapies and
better clinical management of cancer patients. Genetic
alterations in TSGs and oncogenes mark cancer cells
as distinct from their normal counterparts. The study
of TSGs will provide molecular markers that can be
used for early detection of specific cancers, will
provide surrogate markers for chemoprevention trials,
and may help in the design of tumor-specific therapies.
Further Reading
Antiparallel
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1762
Antisense DNA
R Oberbauer
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0062
Synthetic antisense DNA oligomers are used to temporarily block gene expression in vitro and in vivo.
Antisense oligonucleotides are short, traditionally
15-25 bases long, single-stranded DNA fragments
that are designed to hybridize by Watson-Crick base
pairing with mRNA or genomic DNA. The oligonucleotides are synthesized to be complementary (antisense) to
the sense-strand target nucleic acid sequence.
In oligonucleotide design, the nucleotides must be
modified owing to the poor stability of DNA in vivo.
The most frequently used DNA backbone modification is the substitution of a sulfur for the nonbridging oxygen at the
phosphorus of the internucleotide linkage (phosphorothioate oligonucleotides). Other
substitutions at that internucleotide linkage site led to
the development of phosphodiester and methyl phosphonate oligonucleotides. Important considerations
for the design of oligonucleotides are to use sequences
that will not form duplexes with themselves and that are
free of secondary structures such as hairpin loops. The
designed oligonucleotide must hybridize strongly to a
target sequence that is unique to the gene of interest and that
is itself free of secondary structure. Such a site is usually found around
the translation initiation site of the mRNA. Another
important point is the half-life of the target protein,
which needs to be considerably shorter than the half-life of the antisense oligonucleotide. Suppression of gene expression by DNA oligonucleotides is transient:
degradation of the applied oligonucleotide by nucleases, either
intra- or extracellularly, occurs within days and
results in a rapid return to baseline expression.
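The sequence-level design rules above (complementarity to the target, and avoidance of self-pairing) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a design tool: the function names, the 20-base default length, the minimum loop size, and the example mRNA are all hypothetical, and the stem search is only a crude screen for hairpins that ignores thermodynamics and backbone chemistry.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def antisense_oligo(mrna, start, length=20):
    """Antisense DNA oligo (5'->3') against mrna[start:start + length]."""
    target = mrna[start:start + length].upper().replace("U", "T")
    return reverse_complement(target)


def longest_hairpin_stem(oligo, min_loop=3):
    """Longest stretch whose reverse complement occurs downstream in the
    same oligo, separated by at least min_loop bases (crude hairpin screen)."""
    n = len(oligo)
    best = 0
    for i in range(n):
        for k in range(best + 1, n - i + 1):
            partner = reverse_complement(oligo[i:i + k])
            if oligo.find(partner, i + k + min_loop) == -1:
                break  # a longer stem starting at i cannot pair either
            best = k
    return best


# Hypothetical mRNA fragment; the AUG start codon is at index 9.
mrna = "GGAGCCACCAUGGCUUCAGGUCCAAAGUUCGAC"
oligo = antisense_oligo(mrna, start=mrna.find("AUG") - 5, length=20)
print(oligo, "longest self-complementary stem:", longest_hairpin_stem(oligo))
```

A candidate with a long self-complementary stem would be set aside in favor of a neighboring window; uniqueness of the target within the transcriptome would need a separate database search, which is not attempted here.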
It was originally suggested that antisense oligonucleotides prevent synthesis of gene products by
blocking the translation of mRNA, but it has since been
appreciated that binding to mRNA can also precipitate
message degradation and that binding of oligonucleotides to nuclear DNA, resulting in triplex formation,
can prevent transcription. The oligonucleotides can
also bind in a nonsequence-specific way to proteins
and thus block their action. This process has been
named aptamer binding. Most studies reporting successful application of antisense DNA have not identified the mechanism of action responsible for gene
suppression. In fact, gene suppression by oligonucleotides designed as `antisense' molecules is very often
not due to sequence-specific blockage of the target
DNA or mRNA; phosphorothioate oligonucleotides
in particular exert high nonspecific toxicity.
For a convincing demonstration of a gene-specific
effect, adequate control oligonucleotide sequences
(scrambled, sense-stranded, and one- or two-base
mismatched) must be used along with appropriate
biological endpoints. A sensitive technique to determine specific antisense downregulation of the target
gene's expression at either the RNA or the protein level is
mandatory. Major obstacles in the use of antisense
oligonucleotides are their poor cellular uptake and
their limited subcellular distribution. These problems
can be overcome in vitro by microinjection, by transient disruption of the cell membrane by electroporation, or, in some cases, by using high oligonucleotide
concentrations. The delivery of antisense DNA into
cells in vitro and in vivo can be accomplished either by
packing the DNA into inactivated virus envelopes,
which are designed to penetrate the target cell membrane, or by coating the oligonucleotides with lipids.
A goal even harder to achieve is tissue-specific delivery of these constructs. In theory, tissue-specific
ligands could be used to direct the oligonucleotides
to the target cells. A more promising method is the
local delivery of the antisense oligonucleotides. However, some organs have a high uptake of intravenously
administered oligonucleotides. After systemic administration of phosphorothioate oligonucleotides into
laboratory animals, the kidneys and the liver have
been shown to be the organs with the highest uptake.
This was one of the reasons investigators targeted
these organs with antisense oligonucleotides against
transporters, transcription factors, and cell cycle regulatory genes, and demonstrated a sequence-specific
reduction in mRNA as well as protein. Other potential in vivo applications of antisense oligonucleotides
might be the temporary knockdown of genes responsible for the neointimal proliferation in restenotic
lesions after balloon angioplasty, or temporary inhibition of genes coding for MHC peptides or adhesion
molecules in organ transplant recipients. A further
possible clinical use of antisense oligonucleotides is
in the treatment of malignancies, as it has been
shown in a mouse model of melanoma that blockage
Antisense RNA
R W Simons
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0061
Antisense RNAs are small (~100 nucleotides), diffusible, regulatory RNAs that bind to complementary
regions on specific target RNAs to control their
expression or biological function at a posttranscriptional level. Most antisense RNAs have been identified in prokaryotic organisms. However, a few
apparent cases have been described in eukaryotic
cells. It is likely that antisense control exists in all
cells.
The genetic systems in which antisense RNAs
function are varied and interesting in their own right.
These include the homeostatic replication systems of
diverse bacterial plasmids, copy-number-dependent
inhibition of mobile genetic element transposition,
temporal development of viral gene expression, control of cell division, and postsegregational killing following cell division. Antisense inhibition can occur
at many molecular levels, including transcription
Antitermination Factors
E Kutter
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0063
nontranscribed strand in the transcribed region. In the
absence of Q, this segment induces a prolonged pause
in transcription at positions +16 and +17. Sigma is
required for the pause and is present in the complex,
even though the polymerase has already moved from
the initiation to elongation mode (as described in
detail in Transcription) and has released the sigma
factor from the elongating complex. However, a residual nonspecific binding of sigma to the polymerase
may be involved. The pause gives Q time to bind to the
elongation complex; Q, in turn, seems to chase the
polymerase from the pause site, reducing the pause
half-life about fivefold. Q antitermination requires
host proteins N and NusA, but not any of the other
Nus proteins involved in N antitermination. Q suppresses pausing as well as termination at sites far
downstream from the promoter.
The examination of antitermination in the lambdoid phage HK022 gives further insight into the
range of possibilities and mechanisms involved in
antitermination. The HK022 antitermination system
does not involve N or, in fact, any other phage-directed
proteins, but it does involve specific sites on the DNA;
its action can be blocked either by mutations in the
DNA site or by mutations at specific sites in the zinc-binding region of the β′ subunit of the RNA polymerase. Interestingly, HK022 does encode a protein,
Nun, that is homologous to the lambda N protein in
both sequence and location and that is also targeted
against nut sites. However, rather than recognizing
sites in its own genome, Nun targets heterologous nut
sites in the DNA of bacteriophage lambda and some
related phages, where it induces termination rather
than prevents it, thus allowing HK022 to remove the
competition of some of its relatives.
With all three modes of inducing antitermination,
the effect is clearly specific to the elongating complex,
not to the termination site; overlapping transcripts
started from other promoters are not affected. It also
appears that the effect relates to some sort of specialized pausing state, not to the general elongation
process, and that this pausing state is involved in rho-dependent termination, as has been suggested from
other lines of evidence.
`Attenuation', a site-specific form of antitermination that controls expression of a number of genes
involved in amino acid metabolism, is discussed in a
separate entry in the encyclopedia (see Attenuation).
Further Reading
References
Antitermination Proteins
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1763
AP Endonucleases
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1764
Apert Syndrome
See: Craniosynostosis, Genetics of; Syndactyly
Apomorphy
E O Wiley
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0064
Apoptosis
T M Picknett and S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0065
Aporepressors
C Yanofsky
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0066
Arabidopsis thaliana:
Molecular Systematics and
Evolution
M A Koch
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1723
Arabidopsis thaliana (thale cress) is a short-lived, self-compatible, and predominantly inbreeding annual
plant, distributed in the temperate regions of Europe,
the Far East, and East Africa. During previous centuries, A. thaliana was introduced in North America,
South Africa, and Australasia and continues to expand
its range. A. thaliana belongs to the mustard family
synthase, alcohol dehydrogenase, the plastidial maturase K, and the mitochondrial nad4 have been used to
predict that the divergence time of A. thaliana from its
closest relatives, the former genus Cardaminopsis, was
approximately 5.8 million years ago. A. thaliana
diverged from the lineage containing B. oleracea
(cabbage) approximately 16-20 million years ago.
The age of the whole family is approximately 50 million years.
The total genome sequence information is open
to everybody and accessible via the World Wide Web
in combination with numerous tools for the search
and prediction of genes, proteins, and promoter
sequences. The combination of the rapidly expanding
molecular knowledge about A. thaliana with increasing knowledge about the evolutionary history of the
Brassicaceae family on different scales (from genes to
genomes) makes A. thaliana an ideal study object not
only for functional analysis but also with which to
answer evolutionary questions.
See also: Arabidopsis thaliana: The Premier Model
Plant; Brassicaceae, Molecular Systematics and
Evolution of
Early genetic screens identified loci that were important for plant growth or metabolism, but did not
reveal the identity of the gene that controls the phenotype. The development of a genetic map of Arabidopsis was the first breakthrough in bridging the gap
between mutant phenotype and gene identity. Using a
large collection of Arabidopsis mutants with visually
scorable phenotypes, Martin Koorneef and William
Feenstra published a comprehensive linkage map
that divided the genome into 500 map units on five
chromosomes (Koorneef et al., 1983). The classical
map was used by geneticists to locate the chromosomal positions of new mutations. In some cases these
mapping experiments defined a small interval within a
chromosome that contained the mutation of interest.
While the classical genetic map was useful, it did not
allow researchers to identify the individual genes that
were affected in mutant plants. To identify the affected
gene it was necessary to construct a map based on the
DNA sequence of the genome.
pairs, quite a small package compared to the genome
of wheat (17 billion base pairs). Equally important,
DNA reassociation experiments indicated that only a
small fraction (about 10%) of the Arabidopsis genome
consists of repetitive DNA, unlike the maize
genome, 80% of which consists of highly repetitive sequence
elements. Large segments of repetitive DNA make it
difficult to link cloned segments of DNA into an
arrangement that reflects their order in the genome.
The problem with repetitive DNA in creating a large
contiguous segment is analogous to assembling a jigsaw puzzle that contains many pieces of the same
color and shape. The scarcity of repetitive DNA elements in the Arabidopsis genome made it feasible to
assemble a collection of large-insert clones that span
almost the entire genome. These clones provided the
materials to determine the nucleotide sequence of the
entire Arabidopsis genome.
proteins of known function, usually genes that are involved in conserved functions such as DNA replication
or translation. Sequence similarities between predicted Arabidopsis genes and homologs in other species
provide researchers with important hints as to where to
look to uncover gene function. For 45% of the predicted genes, the amino acid sequence does not provide
any hint of function. In total it is estimated that the
function of only 5% of all Arabidopsis genes is known.
References
Arabinose
I Schildkraut
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0068
Arachnodactyly
R E Pyeritz
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0069
Arber, Werner
T N K Raju
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0070
School in Zurich and biophysics at the University of
Geneva. In Geneva, Arber developed strong foundations in genetics, electron microscopy, and the
physiology of bacteriophage. After training as a postdoctoral fellow in the United States for 2 years, he
returned to Geneva in 1960 and pursued genetic
research. After a series of preliminary studies related
to bacteriophage physiology, he began studying the
mechanisms of gene transfer in bacteria with phage as
vector. Specifically, he and his associate Daisy Dussoix
began assessing changes in the genetic materials of
the common bacterium Escherichia coli, induced from
radiation and from bacteriophage.
By studying the phenomenon of ``host controlled
restriction of bacteriophages'' these investigators discovered that over time, the ``infected'' viral DNA is
changed in the host cell, although the host DNA itself
remains unchanged. This suggested the existence of a
natural ``barrier'' against foreign genetic material in the
host cells. They then found that a variety of enzymes
were involved in such barrier functions.
Arber and colleagues then showed that there were
two main processes intimately involved in the development of barrier functions: restriction and modification. `Restriction' meant breakdown of DNA, while
`modification' meant preventing such breakdown.
Arber postulated that both processes are catalyzed
by a specific set of enzymes. He proposed that the
DNA molecule contained specific sites with capacities
to bind both types of enzymes. He also demonstrated
the chemical nature of those sites, as recurring, specific
base pair sequences. The enzymes act at these sites
either by cleaving the molecule, causing restriction
(breakdown), or by adding a methyl group (methylation), causing modification; the latter prevents the
breakdown.
Thus, Arber and associates had discovered the principles that were involved in the functioning of a family
of ``chemical scissors'' which were responsible
both for the cutting of specific segments of the DNA
molecule and for preventing such cutting.
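The interplay of restriction and modification can be mimicked with a toy string model. The sketch below is an illustration under stated assumptions only: the recognition sequence GAATTC is used because it is a familiar example, not because the text ties Arber's work to it; recording methylation by the start index of a site is a simplification; and the function name and example DNA are hypothetical.

```python
def restriction_fragments(dna, site, cut_offset, methylated_starts=frozenset()):
    """Cut dna at every occurrence of site that is not methylated.

    cut_offset is the position within the site after which the strand is cut;
    methylated_starts holds start indices of sites protected by modification.
    """
    fragments = []
    last_cut = 0
    pos = dna.find(site)
    while pos != -1:
        if pos not in methylated_starts:       # unmodified site: restriction
            fragments.append(dna[last_cut:pos + cut_offset])
            last_cut = pos + cut_offset
        pos = dna.find(site, pos + 1)          # modified site: left intact
    fragments.append(dna[last_cut:])
    return fragments


phage_dna = "AAGAATTCTTTTGAATTCGG"   # hypothetical sequence, sites at 2 and 12

# Unmodified (foreign) DNA is cut at both sites.
print(restriction_fragments(phage_dna, "GAATTC", cut_offset=1))
# DNA modified at both sites, as the host's own DNA would be, survives intact.
print(restriction_fragments(phage_dna, "GAATTC", 1, methylated_starts={2, 12}))
```

The same enzyme activity therefore destroys incoming DNA while sparing any molecule that carries the host's methylation pattern, which is the barrier function described above.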
Arber's discovery opened new avenues in molecular genetics. Using purified bacterial restriction
enzymes, Hamilton Smith confirmed Arber's results
and hypotheses, and isolated the first restriction
enzyme that cut the DNA in the middle of specific
symmetrical sequences. Daniel Nathans showed that
restriction enzymes could be used for constructing
genetic maps and developed methods involving restriction enzymes to explain how genes were organized
and expressed in the living cells.
These pioneering works led to such major discoveries as determining the order of genes on human and
animal chromosomes, analyzing the chemical structures of genes, and most importantly, the discovery of
Archaea, Genetics of
F T Robb
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0071
Since their discovery in 1978 by Carl Woese and colleagues, the Archaea have become widely recognized
as a third lineage of life, distinct from bacteria and
eukaryotes, but sharing many molecular characteristics of both of these taxa (Woese et al., 1978). Originally identified by distinctive sequence patterns in their
16S ribosomal RNA, the Archaea are also set apart by
unique phenotypic characteristics such as ether-linked
membrane lipids. Another widespread feature of the
Archaea is their colonization of environments with
the most extreme physical and chemical conditions
that support life as we know it. An extensive body of
microbial life that relied on efficient gene transfer as a
means of adaptation to extreme and rapidly changing
environments. The stage is set for the application of
the power of genetic analysis to dissect adaptive functions of the group as a whole.
The antibiotics carbomycin, celesticetin, chloramphenicol, and thiostrepton as well as butanol and
butylic alcohol have been used as growth inhibitory
agents in hyperthermophiles. Novobiocin, an inhibitor of DNA gyrase, has also been used to good effect
in the design of shuttle vectors for the halophiles, but
it is unfortunately thermolabile. The effective use of
puromycin as a selective agent in mesophilic methanogens and the availability of a puromycin resistance
gene marker has led to the construction of a set of
seven vectors for Methanococcus and Methanosarcina
spp.
The hyperthermophiles, such as Sulfolobus and the
Thermococcus/Pyrococcus group, grow at temperatures in excess of 80 °C, and colony formation may
require incubation for up to one or two weeks.
DNA Transfer
Transformation
Conjugation
Plasmid vectors using puromycin resistance as a selective marker have been developed for many methanogenic Archaea. The current vectors are constructed
either with autonomous replication from rolling-circle origins, or else with the ability to integrate into
Further Reading
References
Arginine
E J Murgola
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0073
Figure 1 Arginine.
Artificial Chromosomes,
Yeast
P Hieter
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0074
Figure 1 Basic structure of a yeast artificial chromosome: two vector arms of about 5 kb each flank a 100-1000 kb exogenous DNA insert. TEL, Tetrahymena telomere-derived sequences;
(plasmid), sequences derived from a bacterial cloning
vector such as pBR322; YSM-1 and YSM-2, yeast genes
for selecting yeast host transformants, generally prototrophic markers; ARS1, yeast autonomously replicating
sequence; CEN, yeast centromere DNA.
Practical Considerations
While certainly feasible, the construction of comprehensive genomic YAC libraries with average insert
sizes of greater than 300 kb is technically challenging,
requiring considerable effort and technical skill. A
number of representative genomic libraries have
been constructed from human DNA and from the
DNA of numerous other organisms (Green et al.,
1998). Efficient methods have been developed for
screening YAC libraries for individual clones using a
PCR-based strategy.
YACs have played a major role in the construction
of clone-based physical maps of whole genomes, most
notably of the human genome. The basic method used,
called `STS-content mapping,' involves the use of
sequence-tagged sites (STSs) as markers and YACs as
the source of cloned DNA. An STS is a short segment
of genomic DNA that can be uniquely detected using
a PCR assay. An STS map represents the genome as a
series of STS landmarks at known physical distances
from one another. By determining the STS content of a
sufficiently large number of individual overlapping
YACs using a sufficiently high number of unique
STSs, both the order of STSs and the extent of overlap
of adjacent YAC clones can be deduced simultaneously.
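The logic of STS-content mapping can be sketched as a small search problem: find an order of STSs in which the STSs carried by each YAC sit next to one another. The code below is a minimal brute-force illustration that is practical only for a handful of markers; the clone names, STS names, and function names are hypothetical, and real mapping projects must also cope with chimeric clones and false PCR results, which are ignored here.

```python
from itertools import permutations


def is_contiguous(order, sts_set):
    """True if all STSs of one YAC occupy adjacent positions in order."""
    idx = sorted(order.index(sts) for sts in sts_set)
    return idx[-1] - idx[0] + 1 == len(idx)


def consistent_orders(yac_content):
    """All STS orders (up to mirror image) compatible with every YAC."""
    markers = sorted(set().union(*yac_content.values()))
    orders = []
    for perm in permutations(markers):
        if perm[0] > perm[-1]:
            continue  # a map and its mirror image are equivalent
        if all(is_contiguous(perm, sts) for sts in yac_content.values()):
            orders.append(perm)
    return orders


def overlapping_pairs(yac_content):
    """Pairs of YACs that share at least one STS and hence overlap."""
    names = sorted(yac_content)
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if yac_content[a] & yac_content[b]]


# Hypothetical PCR results: which STSs were detected in which YAC.
yacs = {
    "yA": {"sts1", "sts2"},
    "yB": {"sts2", "sts3"},
    "yC": {"sts3", "sts4"},
}
print(consistent_orders(yacs))
print(overlapping_pairs(yacs))
```

With too few YACs or STSs, several orders remain consistent with the data, which is why the text stresses a sufficiently large number of both.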
The development of methods to transfer YACs
back into mammalian cells has provided a means for
analyzing gene expression or regulation within very
large regions of DNA. On introduction into mammalian cells, the cloned insert DNAs within YACs tend to
integrate into the mammalian genome as intact
segments. This feature permits functional analysis
of large stretches of DNA spanning thousands of
kilobases, involving large genes, gene clusters, or regulatory elements that can be dispersed over large
regions.
There are technical difficulties associated with
YAC cloning methodology. In particular, individual
YAC clones exhibit a high rate of chimerism, that is,
the presence of two unrelated segments of DNA
within a cloned YAC insert. Such chimeric YACs
constitute about 50% of the YAC clones in most
libraries. Another problem is the difficulty in purification of YAC clone DNA away from endogenous
References
Artificial Selection
W G Hill
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0075
Introduction
Artificial selection is distinct from natural selection in
that it describes selection applied by humans in order
to produce genetic change. When artificial selection is
imposed, the trait or traits being selected are known,
whereas with natural selection they have to be
inferred. In most circumstances and unless otherwise
qualified, directional selection is applied, i.e., only
high-scoring individuals are favored for a quantitative
trait. Artificial selection is the basic method of genetic
improvement programs for crop plants or livestock
(see Selective Breeding). It is also used as a tool in
the laboratory to investigate the genetic properties of
a trait in a species or population, for example, the
magnitude of genetic variance or heritability, the possible duration of and limits to selection, and the
correlations among traits, including with fitness.
of the trait and S is the selection differential applied to
it, i.e., the mean superiority of selected parents. The
ratio or regression of selection response on selection
differential (the realized heritability) is therefore an
estimate of the heritability of the trait. The rate can
also be expressed as $R = i h^2 \sigma_P = i h \sigma_A$, where $\sigma_P^2$ is the
phenotypic variance, $\sigma_A^2$ is the additive genetic variance, and $i = S/\sigma_P$ is the selection intensity (see Heritability; Selection Intensity). If selection is practiced on
some index or other criterion of selection, family
mean performance, for example, the expected response
per generation is $i r \sigma_A$, where $r$ is the accuracy of
selection ($r = h$ for individual selection) (see Selection
Index). In this form it is seen that the response depends
on: (1) spare reproductive capacity to enable selection
to be applied, (2) the accuracy of predicting genotype
(specifically breeding value) from phenotype, and (3)
the magnitude of genetic variation (specifically additive genetic variance) in the trait. The expected rate of
response per year also depends on (4) the generation
interval (L, the mean age of parents when their progeny are born), and with continued selection of the
same intensity the annual rate equals R/L.
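The equivalent forms of the predicted response can be checked numerically. The sketch below uses made-up parameter values (a heritability of 0.25, a phenotypic standard deviation of 2 units, a selection intensity of 1.4, and a generation interval of 2 years); the function names are hypothetical and the numbers illustrate the algebra rather than any real breeding program.

```python
from math import isclose, sqrt


def response_per_generation(i, h2, sigma_p):
    """Expected response R = i * h^2 * sigma_P for individual selection."""
    return i * h2 * sigma_p


def response_from_accuracy(i, r, sigma_a):
    """Expected response R = i * r * sigma_A for any selection criterion."""
    return i * r * sigma_a


# Illustrative (assumed) values.
h2 = 0.25          # heritability
sigma_p = 2.0      # phenotypic standard deviation
i = 1.4            # selection intensity, i = S / sigma_P
L = 2.0            # generation interval in years

sigma_a = sqrt(h2) * sigma_p      # additive genetic standard deviation
S = i * sigma_p                   # selection differential

R = response_per_generation(i, h2, sigma_p)
print("R = h^2 S           :", h2 * S)
print("R = i h^2 sigma_P   :", R)
print("R = i r sigma_A, r=h:", response_from_accuracy(i, sqrt(h2), sigma_a))
print("annual rate R / L   :", R / L)
```

The three expressions agree because $\sigma_A = h \sigma_P$ and, for individual selection, $r = h$; halving the generation interval doubles the annual rate for the same response per generation.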
These predicted responses are functions solely of
variances and covariances, and formally hold only for
a single generation. Selection itself changes gene and
haplotype frequencies and hence the genetic variation,
and is necessarily practiced in a population of finite
size, such that heterozygosity falls due to genetic drift.
For the infinitesimal model of additive unlinked genes
each with infinitesimally small effect on the trait,
selection yields negligible gene frequency changes
and consequent variance changes. It induces, however,
a negative correlation of frequencies among loci, i.e.,
linkage (gametic) disequilibrium, so genetic variance is
reduced by an amount which can be predicted using
methods of Bulmer. Most of this reduction occurs in
two generations, and asymptotes at about one-quarter
loss of response, more if there is tight linkage between
the relevant genes. With long-term selection, changes
in gene frequency from both selection and drift have
to be taken into account as, eventually, do mutations.
Therefore although it is possible to make qualitative
predictions, for example that rates of response will
reduce as genes become fixed, quantitative predictions
require information on gene effects and frequencies.
As this is not available, it is not possible to make
confident predictions about the magnitude of longterm response from data that can readily be obtained
(see Selection Limit). Most of our information therefore comes from the results of selection experiments,
which typically show that initial rates of responses are
maintained for five or more generations without substantial attenuation, and may continue for many tens
of generations.
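The reduction in genetic variance caused by selection-induced disequilibrium can be illustrated with a short recursion. The sketch below uses a common textbook form of Bulmer's result for unlinked loci under truncation selection, $\sigma_A^2(t+1) = \tfrac{1}{2}(1 - k h_t^2)\,\sigma_A^2(t) + \tfrac{1}{2}\sigma_a^2$ with $k = i(i - x)$; this formula is an assumption brought in for illustration and is not spelled out in the text. The proportion selected, the variances, and the function names are likewise illustrative.

```python
from statistics import NormalDist


def truncation_constants(p):
    """Selection intensity i and variance-reduction factor k = i(i - x)
    for truncation selection of the top proportion p of a normal trait."""
    x = NormalDist().inv_cdf(1 - p)      # truncation point in SD units
    i = NormalDist().pdf(x) / p          # mean of the selected tail
    return i, i * (i - x)


def bulmer_trajectory(genic_var, env_var, p, generations):
    """Additive variance over generations of selection (unlinked loci,
    infinitesimal model, assumed textbook recursion)."""
    _, k = truncation_constants(p)
    va = genic_var
    trajectory = [va]
    for _ in range(generations):
        h2 = va / (va + env_var)
        va = 0.5 * (1 - k * h2) * va + 0.5 * genic_var
        trajectory.append(va)
    return trajectory


# Illustrative values: initial heritability 0.5, top 20% selected.
traj = bulmer_trajectory(genic_var=1.0, env_var=1.0, p=0.2, generations=10)
for generation, va in enumerate(traj):
    print(generation, round(va, 3))
```

In this example most of the decline in additive variance takes place in the first two generations, after which the variance settles close to a plateau, in line with the qualitative statement above.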
Selection Experiments
Numerous selection experiments using artificial
selection have been conducted over the last century,
for a wide range of traits in many species. With very
few exceptions, selection response has been achieved, indicating that there is genetic variance present
for any trait in any population. In those experiments
continued for many generations, responses have
continued so that the mean performance of the
selected line is well outside the range of that found
in the base population. Examples of important or
typical experiments are given in the accompanying
figures to illustrate the roles of selection experiments.
Experiments differ in aspects of design. In some an
unselected control population is maintained so that
genetic and environmental change can be distinguished. In others divergent selection is practiced, in which a
high and a low line are maintained, either to
check whether response is symmetric in the two
directions (which may require a control as a check) or
simply to eliminate environmental change by comparing the contemporaneous high and low lines. The outcome of a selection experiment in a finite population is
essentially a random walk: because the selected line
is finite in size, genetic sampling (drift) produces
Examples
The Illinois corn oil experiment (Figure 1) was established before 1900, and continues to this day (Dudley,
personal communication). Seed is selected from cobs
with a high or low oil content in the high and low
lines, respectively. While response in the low lines has
attenuated, it is doing so at levels of oil content so low
that there is little variation left in the population.
Response in the high line continues after almost
100 generations of selection. In 1934 `Student' used
results of this experiment to estimate numbers of loci
contributing to response, by comparing response with
variance; he obtained a value of over 30 genes, but
estimates rise as response continues. Essentially, this
experiment shows the power of selection to effect
change and also the duration over which responses
continue, these presumably being fed by new mutations coming into the lines over the many generations
since selection started.
Figure 2 shows the lines of Drosophila melanogaster
selected for abdominal bristle number from a founder
population recently caught from the wild (Clayton and
Robertson, 1957). This experiment is important
because it was used to test directly quantitative genetics
theory: estimates of heritability and variance were
Figure 1 Selection for oil content in maize, shown as % oil against generation (unpublished graph courtesy of J W Dudley, University of Illinois).
IHO, continuous selection for high oil content; RHO, reverse high oil selection; SHO, switchback to high oil selection;
ILO, continuous selection for low oil content; RLO, reverse low oil selection.
Figure 2 Selection for abdominal bristle number from an outbred population of Drosophila melanogaster.
(A) Responses in five replicate selected high and low lines (broken) and in relaxed or unselected lines (solid).
(B) Frequency distribution of bristle number in the base population and in the highest and lowest replicate lines after
34-35 generations. (Reproduced with permission from Clayton and Robertson, 1957.)
Figure 3 Selection for abdominal (top graph) and sternopleural bristle number (lower graph) from an inbred
population of Drosophila melanogaster. (Reproduced with permission from Mackay et al., 1994.)
Other Examples
Further Reading
References
Ascobolus
B C Lamb
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0076
Introduction
The fungal genus Ascobolus was established in 1791 by
Persoon. Ascobolus has small apothecia (fruiting bodies)
with large asci protruding beyond the hymenium at
maturity (see Figure 1). Ascobolus is in the Ascobolaceae, Pezizales, Ascomycota, and there are about 50
species in all. These fungi usually live on dung or
rotting plant remains and have a world-wide distribution. Some are homothallic and self-fertile; others are
heterothallic, usually with two mating types.
Although many species have been studied mycologically (Van Brummelen, 1967), nearly all genetic
work has used Ascobolus immersus, which is common
and widely distributed, especially on the dung of herbivorous mammals. A. immersus has been extremely
important in elucidating recombination mechanisms,
mainly through studies of segregation ratios in unordered octads from crosses using ascospore color
markers, so that aberrant segregation ratios can be
identified visually, as in Figure 2.
Biology
Sexual Reproduction
In some species, there are fruit bodies that are cleistothecial (closed) and in others they are perithecial
(with a neck), but in most species they are apothecial
(open disks). Some species can develop fruit bodies
parthenogenically, but they are usually sexual. Some
species have ascogonia and antheridia, while some
only have ascogonia. In heterothallic species, fusion
is normally between vegetative hyphae, though A.
carbonaris has trichogynes and antheridial conidia.
Figure 1 Apothecia of A. immersus with intact asci from a wild-type (+, red-brown ascospores) × white 178 (white
ascospores) cross, normally giving 4+:4w178 segregations. A spontaneous mutation occurred early in the cross,
from + to a new white mutation, giving 0 red-brown:8 white ascospore segregations. The diameter of the
apothecium in the top left corner is 650 µm. This figure originally appeared in Lamb BC and Wickramaratne MRT
(1973) Corresponding-site interference, synaptonemal complex structure, and 8+:0m and 7+:1m octads from
wild-type × mutant crosses of Ascobolus immersus. Genetical Research 22: 113-124, Cambridge University Press
(reproduced with permission).
In A. immersus the apothecia are up to 1 mm across,
hemispherical, and usually immersed in the substrate
up to the disk, with only a few asci protruding
(Figure 1), and with up to 40 asci being produced in
partly synchronized waves. The asci are 500-700 by
100-130 µm and are phototropic, turning to face open
spaces. The eight large uninucleate haploid ascospores
are about 50-70 by 28-36 µm each and are arranged
in roughly two rows (Figure 2). In the Pasadena
(California) strains, the ascospores are dark red-brown
and oval, whereas the ascospores of European
strains are brown and more straight-sided. Many species have violet ascospores. In A. immersus, the ascospores are dehisced as a group, and therefore too
heavy to be much influenced by air currents. They
travel up to 30 cm horizontally and 35 cm vertically.
In contrast, species such as A. furfuraceus and A.
scatigenus largely dehisce their ascospores individually, and so air currents are important for dispersal.
Vegetative State
Figure 2 An intact, undehisced ascus of A. immersus which has been removed from its apothecium. It is from a wild-type (+) × white 110 cross and has a 6+:2w110 gene conversion ratio, with one white ascospore partly hidden. The
ascus is 550 µm long. This figure originally appeared in Lamb BC and Wickramaratne MRT (1973) Corresponding-site interference, synaptonemal complex structure, and 8+:0m and 7+:1m octads from wild-type × mutant crosses
of Ascobolus immersus. Genetical Research 22: 113-124, Cambridge University Press (reproduced with permission).
vegetative fusions between A. immersus strains of
opposite mating type can occur, as well as fusions of
like mating type. The haploid chromosome number has
been reported as 8, 9, 11, 12, 14, and 16; it is 11 in the
European strains, with nine identified linkage groups.
Ascospore Germination
References
Ascobolus immersus
D Stadler
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0077
The absence of these spores has impeded the Ascobolus
work in various ways, one being the failure to detect
nutritional mutants. Such mutants would be useful as
genetic markers and to construct selective systems for
the detection of rare events like mutation, intragenic
recombination, and genetic transformation.
Further Reading
Rossignol J-L and Picard M (1991) Ascobolus immersus and Podospora anserina: sex, recombination, silencing and death. In:
Bennett JW and Lasure LL (eds) More Gene Manipulations in
Fungi, pp. 266-290. San Diego, CA: Academic Press.
Ascus
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1767
Asilomar Conference
I Schildkraut
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0078
Further Reading
ASO (Allele-Specific
Oligonucleotide)
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0079
ASO is the acronym for an allele-specific oligonucleotide which is typically 17 to 25 bases in length and has
been designed to hybridize only to one of two or
more alternative alleles at a locus. An ASO is usually
designed around a variant nucleotide located at or near
its center. It is used typically in combination with the
polymerase chain reaction (PCR) protocol as a means
for determining the presence or absence of a particular
allele in a genomic DNA sample.
See also: Polymerase Chain Reaction (PCR)
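The design rule described above (a short oligonucleotide with the variant nucleotide at or near its center) can be illustrated with a minimal sketch. The function name, the example sequence, and the variant position are hypothetical, and hybridization is modeled naively as an exact string match on one strand; real probe design also considers strand, melting temperature, and secondary structure.

```python
def design_aso_pair(reference, variant_index, alt_base, length=19):
    """Return (reference_probe, variant_probe), each centred on the variant base.

    Illustrative only: 'reference' is a DNA string, 'variant_index' is the
    0-based position of the variant nucleotide, 'alt_base' is the alternative allele.
    """
    if not 17 <= length <= 25:
        raise ValueError("an ASO is typically 17 to 25 bases long")
    if length % 2 == 0:
        raise ValueError("use an odd length so the variant sits exactly at the centre")
    if reference[variant_index] == alt_base:
        raise ValueError("alternative base must differ from the reference base")
    flank = length // 2
    start, end = variant_index - flank, variant_index + flank + 1
    if start < 0 or end > len(reference):
        raise ValueError("not enough flanking sequence around the variant")
    ref_probe = reference[start:end]
    # Swap only the central base to obtain the probe for the alternative allele.
    alt_probe = ref_probe[:flank] + alt_base + ref_probe[flank + 1:]
    return ref_probe, alt_probe


# Hypothetical sequence and variant (G to A at index 19), for illustration only.
EXAMPLE_REFERENCE = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
ref_probe, alt_probe = design_aso_pair(EXAMPLE_REFERENCE, 19, "A")
print(ref_probe)
print(alt_probe)
```

Under this simplification, only the reference probe matches the reference sequence exactly, which mirrors the idea that each ASO detects one allele and not the other.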
Asparagine
E J Murgola
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0080
Figure 1 Asparagine. (Structure labels recoverable from the figure: H2N, COOH, CH2, C, O, NH2.)
Aspartic Acid
E J Murgola
History
Figure 1 Aspartic acid. (Structure labels recoverable from the figure: CH2, C, O, OH.)
Aspergillus nidulans
V Gavrias, W E Timberlake, and T H Adams
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0082
The genetic system of Aspergillus nidulans, an ascomycetous fungus (Kingdom: Fungi; Phylum: Ascomycota; Class: Plectomycetes; Order: Eurotiales;
Family: Trichocomaceae) has been used extensively for
studying genetic mechanisms, metabolic regulation,
development, differentiation, and cell cycle controls.
A. nidulans is also known as Emericella nidulans, particularly in DNA sequence databases, owing to peculiarities of fungal taxonomic nomenclature. A. nidulans
has biological properties that have permitted development of experimental techniques for manipulating its
[Figure diagram labels: (A) Asexual: conidiophore, conidia, germling, hyphae. (B) Sexual: ascospore, cleistothecium, ascus, Hülle cell.]
Heterokaryosis
Genetically distinct strains can fuse to produce hyphae containing variable mixtures of nuclei with different genotypes, a heterokaryon. Because nuclei do
not move freely through pores in septa, heterokaryons
are intrinsically unstable. Heterokaryons can be maintained by selecting for complementing nutritional
deficiencies. In the laboratory, heterokaryons are
usually formed and allowed to produce cleistothecia
to make genetic crosses.
Genetic System
[Figure diagram labels: meiosis, mitosis, wild-type strain, unknown mutation. Two series of values are given for linkage groups I to VIII: 11, 15, 12, 10, 13, 16, 24, 14 and 13, 9, 12, 14, 11, 8, 0, 10.]
Mutations are typically induced in conidia by treatment with chemical mutagens, ultraviolet light, or
ionizing radiation. Mutations resulting in growth or
developmental abnormalities can be detected by direct
observation. Nutritional deficiencies can be detected
by replica-plating onto media containing or lacking
nutritional supplements. Conditional mutations, such
as temperature-sensitive mutations, can also be found
by replica-plating. Unstable diploids, containing mutations in microtubule components, can be used to
detect heterozygous lethal mutations on selected
chromosomes because they result in complete segregation distortion in haploid derivatives.
Transformation
Figure 3 (Diagram labels: diploid heterozygous for m and +, four-strand stage, after crossover, division, heterozygotes only, homozygotes and heterozygotes.)
Maps
1. Genetic map. An extensive recombinational map
for Aspergillus has been assembled and is regularly
updated (Clutterbuck, 1993).
2. Physical map. A physical map of the genome has
been assembled, consisting of overlapping cosmid
clones (Prade et al., 1997; http://fungus.genetics.uga.edu:5080/).
3. DNA sequence. The DNA sequence can be
accessed by the public at GenBank under Emericella nidulans, including the results of expressed
sequence tag (EST) sequencing projects. A draft
Genetic Resources
Aspergillus strains, genetic map, transformation vectors, genes, and the physical mapping resources are
maintained by and available from the Fungal Genetics
Stock Center (http://www.fgsc.net).
Developmental Regulation
Conidiophore development has been studied extensively because it is easy to control in the laboratory
and is amenable to investigation by using molecular
and genetic approaches. Figure 4 shows developing
conidiophores, which consist of a basal foot cell, a
stalk (S) terminating in a swollen vesicle (V), two
layers of cells called metulae (M) and phialides (P),
and long chains of conidia (C).
Hundreds of genes are selectively activated during
development. Many of these encode structural
Figure 4 Scanning electron micrographs of conidiophore development. (A) A conidiophore stalk (S) with a
developing vesicle (V). (B) Developing metulae (M) bud from the surface of the vesicle. (C) Developing uninucleate
phialides (P). (D) Mature conidiophore bearing conidia. (Figure was kindly provided by Reinhard Fischer,
Laboratorium für Mikrobiologie, Philipps-Universität Marburg and Max-Planck-Institut für terrestrische Mikrobiologie, Marburg, Germany.)
[Figure 5 diagram labels: conidiation signals, growth signals, flu, brlA, abaA, wetA, early morphogenetic genes, sporulation-specific genes, spore-specific genes, proliferative growth.]
Figure 5 Pathways controlling conidiophore development in Aspergillus. Major regulatory events required for
conidiation are shown. Arrows indicate positive interactions, and blunt lines indicate negative interactions.
proteins or enzymes that are responsible for the differentiated functions of the various cell types making
up the conidiophore. Three major regulatory genes encoding transcription factors that control development
are brlA, abaA, and wetA (Figure 5). Expression of
brlA initiates a self-sustaining cascade of events in
which regulatory (abaA and wetA) and structural
genes are sequentially activated at appropriate times
in proper cell types. Initiation of conidiophore development is coordinated through the output of distinct
signal transduction pathways involving fluffy (flu)
genes that are required to sense environmental and
cellular factors to inhibit or promote conidiation
and growth. Under conditions that favor conidiation,
these signal pathway outputs serve to activate brlA,
thereby controlling the onset of development (Adams
et al., 1998).
Metabolic Regulation
Aspergillus can use a wide range of compounds as sole
carbon and/or nitrogen source. The assimilation of
Figure 6 (Diagram labels: nitrate, NO3, nitrate reductase, NO2, nitrite reductase, NH4+, niaD, niiA, glutamine, AreA, NirA.)
Cell-Cycle Controls
Studies of cell-cycle controls in Aspergillus have
complemented those carried out in S. cerevisiae and
Schizosaccharomyces pombe (Morris and Enos, 1992).
Identification of mutants that never enter (nim) or are
blocked in (bim) mitosis led to the discovery of genes
encoding protein kinases (e.g., NimA) and phosphoprotein phosphatases (e.g., BimG) that participate in
controlling progression through the cell cycle in all
eukaryotes (see Cell Cycle).
References
Assortative Mating
D L Hartl
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0083
Mating is said to be 'assortative' when there is a correlation in phenotype between mating pairs. When the
correlation is greater than zero, the mating system is
positive assortative mating; when it is less than zero,
the mating system is negative assortative mating (also
called 'disassortative mating').
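As an illustration of this definition, the following minimal sketch computes the Pearson correlation between the phenotypes of mating pairs and labels the mating system by its sign. The function names and the example data are hypothetical, and a real analysis would also test whether the correlation differs significantly from zero.

```python
from math import sqrt


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one phenotype does not vary")
    return cov / sqrt(var_x * var_y)


def classify_mating(pairs):
    """Classify a list of (phenotype of mate 1, phenotype of mate 2) pairs."""
    r = pearson_correlation([a for a, _ in pairs], [b for _, b in pairs])
    if r > 0:
        return "positive assortative mating", r
    if r < 0:
        return "negative assortative (disassortative) mating", r
    return "no phenotypic correlation between mates", r


# Hypothetical phenotype scores for three mating pairs in each scenario.
like_with_like = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
like_with_unlike = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
print(classify_mating(like_with_like))
print(classify_mating(like_with_unlike))
```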
Positive assortative mating differs from inbreeding. Whereas inbreeding affects all the genes in the
References
Assortment
J R S Fincham
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0084
Ataxia Telangiectasia
A M R Taylor
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1546
Ataxia telangiectasia (A-T) is a neurodegenerative disorder inherited in a recessive manner that is apparent
in children as soon as they begin to walk. The disorder
is progressive, so that by their early teens, patients
have to rely on a wheelchair for mobility. Not only
ATM Protein
The ATM gene spans 150 kb of genomic DNA
and encodes a ubiquitously expressed transcript of
approximately 13 kb consisting of 66 exons. The
main promoter of ATM is bidirectional and the single
Further Reading
ATP (Adenosine
Triphosphate)
J Parker
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0085
att Sites
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1768
A compound chromosome is one that has a nonstandard set of arms attached to one centromere. For
example, in Drosophila melanogaster, the second
chromosome is a metacentric, with left and right
Viability Considerations
Most organisms are intolerant of aneuploidy; heterozygous deficiencies for small regions are compatible
with viability, as are slightly larger duplications, but
in general deficiencies or duplications for whole arms
cause lethality. The above C(2L) chromosome is
therefore viable only in a fly that simultaneously lacks
a normal second chromosome and has two copies of
the right arm of chromosome 2, for example as C(2R).
A C(2L);C(2R) fly is euploid (it has two and only two
copies of each gene) but is sterile in crosses to flies
with normal chromosomes (Figure 1A) because all
progeny are grossly aneuploid. C(2L);C(2R) flies are,
however, fertile when crossed to C(2L);C(2R), since
complementary gametes are produced (Figure 1B).
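The reasoning in this paragraph can be checked by counting chromosome arms. The sketch below represents each gamete as a pair (copies of 2L, copies of 2R) and calls a zygote euploid only if it has exactly two copies of each arm. The list of gamete types from the compound parent (C(2L) alone, C(2R) alone, both, or neither) is an assumption made for illustration; no frequencies are implied.

```python
from itertools import product

# Each gamete is (copies of arm 2L, copies of arm 2R).
NORMAL_GAMETES = [(1, 1)]  # one normal chromosome 2
# Assumed gamete types from a C(2L);C(2R) parent: C(2L), C(2R), both, neither.
COMPOUND_GAMETES = [(2, 0), (0, 2), (2, 2), (0, 0)]


def zygotes(gametes_a, gametes_b):
    """All arm-count combinations obtained by uniting one gamete from each parent."""
    return [(a[0] + b[0], a[1] + b[1]) for a, b in product(gametes_a, gametes_b)]


def is_euploid(zygote):
    """Euploid here means exactly two copies of 2L and two copies of 2R."""
    return zygote == (2, 2)


normal_cross = zygotes(COMPOUND_GAMETES, NORMAL_GAMETES)
compound_cross = zygotes(COMPOUND_GAMETES, COMPOUND_GAMETES)
print([z for z in normal_cross if is_euploid(z)])      # no euploid classes
print(sum(is_euploid(z) for z in compound_cross))       # complementary combinations
```

In the cross to a normal parent every zygote has an odd number of at least one arm, so none is euploid; in the cross between two compound-bearing parents, only complementary gamete combinations restore two copies of each arm.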
The situation for attached-Xs in flies is slightly
different, because aneuploidy for the Y chromosome
has little effect on viability and no effect on sex. Thus,
crossing a C(1)/0 female with a normal X/Y male gives
viable C(1)/Y females and X/0 males (Figure 1C) as
50% of the total zygotes (aneuploidy for the X is
lethal), although the X/0 males are sterile for lack of
a Y. Crossing that C(1)/Y daughter back to a normal
X/Y male now gives C(1)/Y daughters and X/Y (fertile) sons, which form a viable stock.
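The two attached-X crosses described above can be enumerated in a few lines. This is a simplified sketch: the function names are hypothetical, a zygote is scored only by its number of X chromosomes and the presence of a Y, and the rules encoded are those stated in the text (aneuploidy for the X is lethal, the Y does not affect sex or viability, and males lacking a Y are sterile).

```python
from itertools import product


def describe(egg, sperm):
    """Classify a zygote from the chromosomes carried by the egg and the sperm."""
    parts = [egg, sperm]
    x_count = 2 * parts.count("C(1)") + parts.count("X")  # C(1) carries two Xs
    has_y = "Y" in parts
    if x_count == 2:
        return "viable female"
    if x_count == 1:
        return "fertile male" if has_y else "sterile male"
    return "lethal"  # zero or three X chromosomes


def cross(eggs, sperm_types):
    """Map each egg/sperm combination to its outcome."""
    return {f"{e}/{s}": describe(e, s) for e, s in product(eggs, sperm_types)}


first_cross = cross(["C(1)", "0"], ["X", "Y"])    # C(1)/0 female x X/Y male
second_cross = cross(["C(1)", "Y"], ["X", "Y"])   # C(1)/Y female x X/Y male
print(first_cross)
print(second_cross)
```

The first cross yields C(1)/Y females and sterile X/0 males as the two viable classes out of four; the second yields C(1)/Y females and fertile X/Y males, which is why it can be kept as a stock.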
Should one wish to keep C(1) females without a
free Y, males carrying an attached XY (C(1;Y)) would
be used.
Construction of Compounds
Getting the first autosomal compound chromosome
as a viable fertile fly was decidedly tricky, but once
an autosomal compound exists, getting more (of
defined genotypes) is easy: treat a normal female with
X-rays and cross to a male carrying compounds. Only
progeny carrying a new compound will be viable.
These new compounds are found to have been caused
by a translocation-type event between the left arm
of one homolog and the right arm of the other
(Figure 2), and most of the recovered new compounds
have both breaks in the proximal heterochromatin
(Figure 2A); this is because breaks in the euchromatin
give compounds that are aneuploid, and here, too,
very much aneuploidy (of euchromatic genes) is lethal
(aneuploidy for heterochromatin has very little effect).
Compounds that have a little bit of one arm hyperploid are, however, viable, and interestingly these do
show preferential C(L)↔C(R) segregation in male
meiosis; this strongly suggests that whatever it is that
achiasmate Drosophila males use to direct autosomal
homologs to opposite poles at meiosis I, it involves
euchromatic homology and ignores heterochromatin.
Attached-Xs are easier to construct in Drosophila,
since a new C(1)-bearing egg is viable if fertilized with
a Y-bearing sperm; however, just X-raying a normal
X/X female yields very few C(1)s, because the right
arm of the very nearly telocentric X is a very small
target indeed. Consequently, building an attached-X is
a two-step process; first, an X-centromere-Y arm
chromosome is generated by a translocation between
the extensive proximal X heterochromatin and either
arm of the Y (Figure 2B), then a second translocation
is induced between the other Y arm and the proximal
heterochromatin of a second normal X. Attached-Xs
therefore usually bear a Y-chromosome centromere.
Meiotic Segregation in
Compound-Bearing Flies
Types of Compounds
[Figure 1 diagram labels: (A) gametes of a normal parent and of a compound parent; (B) gametes of two compound parents, including the 0 class; (C) gametes of a normal XY male and of a C(1)/0 female.]
Figure 1 Progeny of various kinds of crosses with compound-bearing parents: (A) normal parent; (B) compound
parent; and (C) normal XY male. Solid line, euchromatin; wavy line, heterochromatin; circle, centromere. Lethal
classes are crossed out with dashed lines.
the two arms may be in the same left-right orientation
(tandem, T) or in opposite left-right orientation (reversed, R); the centromere may be in the middle (metacentric, M) or near one end (acrocentric, A); and the
ends may be free (no special symbol) or joined to make
a ring chromosome (another 'R'; unfortunately, R(1)
is a 'single' X ring). There are therefore six basic compound chromosomes possible (TM, TA, RM, RA,
TR, and RR), and all six have been made for the X in
Drosophila and the consequences of crossing-over
within them studied (see Lindsley and Zimm, 1992).
Basically, reversed compounds synapse as a rod
(Figure 3) and crossovers occur freely and have
no consequences other than changing genotype
(chiasmata resulting from such crossovers have no
role in directing meiosis I segregation, since they join
[Figure 2 diagram labels: (A) arms L and R; (B) X and Y chromosomes.]
Figure 2 Construction of compound chromosomes. Solid line, euchromatin; wavy line, heterochromatin; circle,
centromere; short arrows, sites of breakage; cross lines, sites of rejoining.
Figure 3 Synapsis of reversed and tandem compounds. Solid line, euchromatin; wavy line, heterochromatin; circle, centromere. (Diagram labels: C(1)RM, C(1)TM, markers a, b, c, sister chromatids.)
Isochromosomes
Reference
Attachment Sites
A M Segall
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0087
[Figure diagram labels: attP, attB, attL, and attR, each drawn with phage and core regions and a P arm; Int core sites and Int arm sites; binding sites for IHF (H), Xis (X), and Fis (F); similar arrangements.]
and attP partners. This stringent requirement for identity between the overlap acts as a check on whether the
recombination is taking place between the correct
partner sequences rather than between sequences
that merely resemble the genuine att sites.
The overlap and inversely repeated recognition
sites for Int constitute the core of the att site; in fact
the simplest att sites, such as the E. coli attB for phage
lambda, consist only of this core region. In contrast,
the attP sites have extra DNA flanking the core
region. These flanking sequences, or arms, contain
additional binding sites for Int as well as binding
sites for accessory proteins required to help Int perform the recombination reaction (Figure 2). The Int
binding sites in the core of the att site are known as
core sites, while those in the flanking regions are
known as arm sites. The arm and core sites have different sequences, and Int binds them with different
domains. Moreover, the arm sites have a relatively
higher affinity for Int than the core sites do, and are
the place where Int first contacts the att site DNA.
While the arm-binding domain of Int touches the
flanks of the att sites, the catalytic domain of Int
must be delivered to the core sites. This is accomplished with the help of helper proteins that bind to
the arm region of the att sites between the Int arm and
core binding sites. One of these helper proteins is the
host-encoded integration host factor (IHF), a protein
which binds DNA at specific sequences and causes
sharp, almost hairpin turns in the DNA. As seen in
Figure 2, IHF binding sites separate the arm and core
binding sites in every att site which contains flanking
sequences in addition to the core region. The IHF
protein is, as its name implies, encoded by the host,
and was discovered in the late 1970s because of its role
in site-specific recombination. However, it is found in
most gram-negative bacteria and profoundly influences the structure of the bacterial genome and the
expression of as many as 20% of bacterial genes, either
by helping to repress or to activate them.
In the case of lambda and related phages, two more
helper proteins bind to two of the four att sites, attP
and attR, and play an important role in ensuring that
recombination proceeds in a directional fashion.
These are the phage-encoded excisionase (Xis) protein
and the factor for inversion stimulation (FIS) protein.
When the host is suffering DNA damage, it is important that the phage, once excised, does not reinsert into
the chromosome since this would expose it to more
damage. By the same token, once a phage has managed
to infect a rather lonely bacterium barely eking out a
living in a nutrient-poor environment, it would be
best if it remained integrated rather than mistakenly
excising and setting off a lytic infection which would
kill one of the only hosts around. The Xis protein is
Attenuation
E J Murgola
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0088
The term `attenuation' refers to mechanisms of regulation of gene or operon expression that result in discontinuation or termination of RNA synthesis by
RNA polymerase, soon after the initiation of transcription. Transcription attenuation makes use of
RNA sequences and structures and allows cells to
sense availability of the precursors needed for RNA
and protein synthesis. RNA signals can direct a transcribing RNA polymerase molecule to pause during transcript elongation, to terminate transcription
prematurely, or to transcribe through a potential
Attenuation,
Transcriptional
T M Henkin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0089
Gene expression can be efficiently regulated by aborting the synthesis of messenger RNA through the
premature termination of transcription. Transcription
termination signals that are affected by this type of
control mechanism are called attenuators, because
they have the capacity to reduce the levels of downstream transcription. Attenuators generally resemble
the typical termination signals found at the ends of
transcriptional units, but lie upstream of the coding
sequence of the genes they control. A variety of molecular mechanisms have evolved for directing RNA
polymerase to either terminate or read through a particular attenuation site (Figure 1). These mechanisms
include controlling the formation of an RNA structure
that precludes terminator formation, or converting
the transcription machinery to a termination-resistant
form. A few of these mechanisms are described below.
In each case, the efficiency of termination can be
modulated in response to some physiological signal
such as nutrient availability.
[Figure 1 diagram: readthrough and terminate conformations, with leader regions labeled B, C, and D, for E. coli trp, B. subtilis trp, E. coli bgl, and B. subtilis tyrS; panel E shows a termination signal (t).]
Figure 1 Mechanisms for control of transcription termination. (A) Translation of leader mRNA (e.g., E. coli trp
operon). Amino acid limitation causes stalling of the ribosome (stippled circles) in region A of the leader RNA,
resulting in formation of the antiterminator (B:C) instead of the terminator (C:D). (B) Termination directed by an
RNA binding protein (e.g., B. subtilis trp). Binding of a protein (shaded circle) to the mRNA prevents antiterminator
(B:C) formation, promoting formation of the terminator (C:D). (C) Readthrough directed by an RNA binding protein
(e.g., E. coli bgl). Binding of a protein (shaded circle) to the leader region stabilizes the antiterminator (B:C),
preventing formation of the terminator (C:D). (D) Antitermination directed by tRNA (e.g., B. subtilis tyrS). Interaction
of the cognate uncharged tRNA with the leader mRNA promotes formation of the antiterminator (B:C), preventing
formation of the terminator (C:D). (E) Modification of the transcription elongation complex (e.g., λ N). Binding of
proteins (small circles) to the nascent transcript (diagonal line) and RNA polymerase (hatched ellipse) converts the
transcription complex to a form resistant to termination signals (t). (Reproduced with permission from Annual Review
of Genetics 30, by Annual Reviews (www.AnnualReviews.org).)
Terminator Proteins
Expression of the Bacillus subtilis trp operon is controlled by TRAP, an unusual RNA binding protein. In
the presence of tryptophan, TRAP binds to the trp
leader RNA and prevents formation of an antiterminator structure, thereby permitting formation of the
competing intrinsic terminator. TRAP assembles into
an 11-subunit symmetrical ring, with 11 molecules of
tryptophan spaced between the TRAP monomers.
The RNA appears to wrap around the outside of the
TRAP ring, with contacts between each monomer
and GAG/UAG repeats in the RNA binding site.
TRAP oligomerization is tryptophan-independent,
but RNA binding requires tryptophan, suggesting
that tryptophan controls TRAP activity by causing a
conformational change that is required for binding to
its RNA target site. The B. subtilis pyr system is also
regulated by binding of a regulatory protein to the
RNA leader region to mediate transcription termination. In this case, the regulator, PyrR, binds in the
presence of UMP, an end product of pyr operon
expression. The target site for PyrR is a complex
structure. Binding to this element precludes formation
of an antiterminator structure, which competes with
the attenuator. Thus, PyrR causes termination by
stabilization of an anti-antiterminator. The trp and
pyr systems are similar in that the default state is
readthrough of the attenuator in the absence of the
end-product of expression of the operon, so that transcription will be prevented only if the required metabolite is present.
Antiterminator Proteins
tRNA-Directed Antitermination
In B. subtilis and other gram-positive bacteria, many
genes involved in amino acid biosynthesis and activation are regulated by a unique transcription termination
control mechanism. Each gene responds specifically
to the charging ratio of the cognate tRNA (e.g.,
the tyrosyl-tRNA synthetase gene, tyrS, responds to
tyrosyl-tRNA, the threonyl-tRNA synthetase gene,
thrS, responds to threonyl-tRNA, etc.). Unlike the E.
coli trp-type system, where tRNA charging is monitored via the translation of a leader RNA-encoded
peptide, in gram-positive systems tRNA charging
is measured by a direct interaction between the
uncharged tRNA and the leader RNA. The specificity
of the interaction is mediated by a single codon in the
leader RNA which matches the anticodon of the regulatory tRNA, and a single-stranded region of an antiterminator structure which pairs with the acceptor
end of uncharged tRNA. Binding of the uncharged
tRNA to the leader is postulated to stabilize the antiterminator, which competes with terminator formation; charged tRNA is predicted to be unable to
interact with the antiterminator, so that readthrough
of the terminator occurs only when uncharged tRNA
accumulates. This event signals a requirement for
increased expression of the appropriate amino acid
biosynthesis or aminoacyl-tRNA synthetase genes.
The leader regions of all of the genes in this family
exhibit high conservation of a number of sequence
and structural elements, all of which are required for
readthrough of the terminator, but the roles of these
elements remain to be determined. It is likely that
factors in addition to the tRNA play a role in antitermination.
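The decision logic described in this section can be summarized in a toy model: readthrough is predicted only when the tRNA anticodon pairs with the single specifier codon in the leader and the tRNA is uncharged. The function names are hypothetical, pairing is restricted to strict Watson-Crick complementarity (wobble and the acceptor-end interaction are ignored), and the tyrosine codon UAC with anticodon GUA is used purely as an example.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Reverse complement of an RNA string written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def predicts_readthrough(specifier_codon, trna_anticodon, trna_charged):
    """Toy rule: readthrough needs a matching anticodon on an uncharged tRNA."""
    anticodon_pairs = reverse_complement(specifier_codon) == trna_anticodon
    return anticodon_pairs and not trna_charged


# Example: tyrosine codon UAC, tRNA anticodon GUA (both written 5' to 3').
print(predicts_readthrough("UAC", "GUA", trna_charged=False))  # uncharged, cognate
print(predicts_readthrough("UAC", "GUA", trna_charged=True))   # charged, cognate
print(predicts_readthrough("ACU", "GUA", trna_charged=False))  # uncharged, noncognate
```

The model captures only the specificity and charging requirements stated above; it says nothing about the conserved leader elements whose roles remain to be determined.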
Further Reading
Henkin TM (2000) Transcription termination control in bacteria. Current Opinion in Microbiology 3: 149-153.
Weisberg RA and Gottesman ME (1999) Processive antitermination. Journal of Bacteriology 181: 359-367.
AUG Codons
A Liljas
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0090
Autocrine / Paracrine
S A Aaronson and A Bafico
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1547
Autogenous Control
Autonomous Controlling
Element
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1770
An autonomous controlling element is an active transposon (in maize) that is able to transpose
(cf. nonautonomous controlling element).
Autoimmune Diseases
A Cooke
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0092
Autoradiography
Autoradiography is a technique for detecting radioactively labeled molecules by virtue of their ability to
create an image on photographic film.
Autoregulation
J Hodgkin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0093
Autosomal Inheritance
D E Wilcox
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0094
1. Phenotype expressed in heterozygote. Each affected person has inherited the mutation from only one
parent and so is heterozygous.
2. Vertical pattern of inheritance. The result of the
mutation being present in heterozygotes in several
generations of a family is a vertical pattern of affected individuals. Typically, this is grandparent to
child to grandchild. In this family, the transmission
from the male, II:3 to III:2 then to IV:1, is a vertical
pattern.
3. Affected person has a 1 in 2 chance of transmitting
phenotype to each offspring. Each affected person
has two chromosomes and the chance of transmitting the chromosome with the mutated allele is 1 in 2.
Figure 1 Example pedigree with autosomal dominant inheritance showing the arrangement of the alleles on each
individual's chromosome pair.
Figure 2 Example pedigree with autosomal recessive inheritance showing the arrangement of the alleles on each
individual's chromosome pair.
the carrier risk for the healthy sib of an affected
person is 2/3. It is not 2/4 as one of the four possible genotypes (a/a) is excluded because it has an
affected phenotype.
4. Males and females are affected in equal proportions. Autosomes, and their linked genes, are
transmitted to offspring independently of the sex
chromosomes, thus males and females are equally
likely to be affected.
5. Constant expressivity within an affected sibship.
Although each recessive mutation affects the function of a gene to a differing degree, each affected sib
carries the same pair of mutations and so has a
similar phenotype. The severity of the phenotype
varies between sibships with different combinations of mutations. An example is spinal muscular
atrophy, a cause of progressive neuromuscular
weakness, that can vary from severely affected
infants to affected children and mildly affected
adults, depending on the combination of mutations.
6. Consanguinity. Consanguinity should always be
asked about when taking a family history of an
autosomal recessive disorder. Since relatives share
genes, a relative is much more likely to carry an
individual's recessive mutation than a nonrelative.
The consanguinity between III:1 and III:2 risks a
recurrence in this family. II:2 has a 2/3 carrier risk
so III:1's carrier risk is 1/3. III:2 is an obligate
carrier and has a carrier risk of 1/1. The chance
that III:1 and III:2 are both carriers and might
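The carrier-risk figures quoted in points 3 and 6 can be checked by simple enumeration. The following is a minimal sketch, not part of the original entry. It assumes that the partner of II:2 is not a carrier (which is what the halving from 2/3 to 1/3 implies), and the final step uses the standard Mendelian 1 in 4 chance of an affected child when both parents are carriers. The variable names are illustrative only.

```python
from fractions import Fraction

# Offspring genotypes of two carrier parents (A/a x A/a), each equally likely.
OFFSPRING = [("A", "A"), ("A", "a"), ("a", "A"), ("a", "a")]


def healthy_sib_carrier_risk():
    """Carrier risk for an unaffected sib of an affected (a/a) person."""
    # The a/a genotype is excluded because it has an affected phenotype.
    unaffected = [g for g in OFFSPRING if g != ("a", "a")]
    carriers = [g for g in unaffected if "a" in g]
    return Fraction(len(carriers), len(unaffected))


risk_II2 = healthy_sib_carrier_risk()        # healthy sib of an affected person
risk_III1 = risk_II2 * Fraction(1, 2)        # assumes the other parent is not a carrier
risk_III2 = Fraction(1, 1)                   # obligate carrier
both_carriers = risk_III1 * risk_III2
affected_child = both_carriers * Fraction(1, 4)  # standard risk for two carriers

print(risk_II2, risk_III1, both_carriers, affected_child)
```

Under these assumptions the healthy sib's carrier risk comes out at 2/3 rather than 2/4, as stated in point 3.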
Further Reading
Online Mendelian Inheritance in Man. http://www.ncbi.nlm.nih.gov/omim/
University of Glasgow, Department of Medical Genetics, Encyclopaedia of Genetics pages contain a number of relevant
illustrations and animated diagrams. http://www.gla.ac.uk/medicalgenetics/encyclopedia.htm
Autosomes
M A Cleary
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0095
Auxotroph
B K Low
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0096
An auxotroph is a mutant organism that has an additional nutritional growth requirement, when compared with the parental organism from which it was
derived. This concept has been especially important in
the genetic analysis of microorganisms such as bacteria and fungi, some of which normally have very
few growth requirements and therefore can grow on
simple defined synthetic media in the laboratory.
Mutations in any of hundreds of different genes
can cause auxotrophy, resulting in a requirement for
an amino acid, nucleotide, vitamin, etc., or some
combination of requirements. These kinds of mutations are easy to select against in genetic crosses and
have been crucial for the analysis of both the large-scale and fine genetic structure and biosynthetic pathways of numerous microorganisms.
In order to obtain new auxotrophs in the laboratory it is possible in many cases to induce mutations
in a culture of the strain under study, and plate out
the surviving cells to obtain colonies on agar plates
containing rich medium with many nutrients to allow
auxotrophs as well as parental cells to grow. This is
followed by `replica plating' (to copy a pattern of
colonies using a round velvet-covered block for imprinting some of the cells from the colonies, for transfer onto other plates) onto both rich medium and
similarly onto minimal medium, in order to identify
rare clones which grow only on the rich medium. The
particular growth requirements of various new unknown auxotrophs can be determined by testing them
on various mixtures (pools) of known ingredients.
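The logic of replica plating and pool testing can be sketched as set operations. This is an illustrative sketch only: the colony identifiers, pool names, and pool compositions are hypothetical, and the pool inference assumes that each auxotroph has a single nutritional requirement.

```python
def find_auxotrophs(grows_on_rich, grows_on_minimal):
    """Colonies that grow on rich medium but not on the minimal-medium replica."""
    return sorted(set(grows_on_rich) - set(grows_on_minimal))


def infer_requirement(pools, growth_by_pool):
    """Narrow down a single required nutrient from growth on supplement pools.

    pools: dict mapping pool name -> set of supplements in that pool
    growth_by_pool: dict mapping pool name -> True if the mutant grew
    """
    candidates = None
    # The requirement must be present in every pool that supports growth.
    for name, supplements in pools.items():
        if growth_by_pool[name]:
            candidates = set(supplements) if candidates is None else candidates & supplements
    if candidates is None:
        return set()
    # It must be absent from every pool that fails to support growth.
    for name, supplements in pools.items():
        if not growth_by_pool[name]:
            candidates = candidates - supplements
    return candidates


# Hypothetical plate readings and pools.
auxotrophs = find_auxotrophs(["c1", "c2", "c3", "c4"], ["c1", "c3", "c4"])
pools = {
    "P1": {"arginine", "histidine"},
    "P2": {"arginine", "leucine"},
    "P3": {"histidine", "leucine"},
}
requirement = infer_requirement(pools, {"P1": True, "P2": True, "P3": False})
print(auxotrophs, requirement)
```

Overlapping pools allow a requirement to be identified with fewer plates than testing each ingredient separately.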
An alternative procedure for auxotroph isolation,
when mutants of a particular type are desired (e.g.,
requiring arginine for growth), is to plate out mutagenized cultures onto minimal medium plates that
contain a very small amount of the particular ingredient (e.g., arginine), so that the desired auxotrophs (e.g.,
arginine-requiring) will form rare tiny colonies as
compared with the majority large colony type. Some
of the tiny colonies are found to carry mutations in a
gene (e.g., arg) for the desired phenotype.
The efficiency of isolation of rare auxotrophs in a
mutagenized culture can be improved using a method
that kills most of the nonmutated cells but does
not kill the auxotrophs. Such a method is the use of an
antibiotic such as penicillin, which kills, for example,
Escherichia coli or Salmonella if they are growing (by
interfering with cell wall synthesis) but not if they are
starved and in stationary phase. This can be accomplished by growing the mutagenized culture in minimal medium for a period, to allow the auxotrophs to
come to a halt in growth, and then adding penicillin to
kill (most of) the growing cells. The survivors are then
screened to identify auxotrophic mutants.
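The benefit of such an enrichment can be estimated with a short calculation. The numbers below are hypothetical and chosen only to illustrate the arithmetic: a starting auxotroph fraction of 1 in 10 000, 99.9% killing of growing parental cells, and full survival of the starved auxotrophs.

```python
def enriched_fraction(mutant_fraction, parental_kill, mutant_survival=1.0):
    """Fraction of auxotrophs among survivors after one enrichment cycle.

    mutant_fraction: auxotroph fraction before treatment (0 to 1)
    parental_kill: fraction of growing parental cells killed (0 to 1)
    mutant_survival: fraction of nongrowing auxotrophs that survive (0 to 1)
    """
    mutants = mutant_fraction * mutant_survival
    parentals = (1.0 - mutant_fraction) * (1.0 - parental_kill)
    return mutants / (mutants + parentals)


before = 1e-4  # hypothetical starting frequency
after = enriched_fraction(before, parental_kill=0.999)
print(f"auxotroph fraction before: {before:.4%}, after: {after:.2%}")
```

With these assumed values about 9% of the survivors are auxotrophs, so far fewer colonies need to be screened by replica plating.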
Auxotrophic mutations differ widely in how defective or `tight' the block in function is. A partially
blocked, or `leaky' auxotroph is known as a `bradytroph,' and grows slowly in the absence of the
required nutrient. Some of these auxotrophic requirements are much tighter at either high or low temperature, thus providing a conditional phenotype that is
useful in some studies. Most auxotrophic mutants are
mutated in a biosynthetic gene or a gene for tRNA
synthesis, or an aminoacyl-tRNA synthetase.
See also: Screening
B
B6
See: C57BL/6
Bacillus subtilis
A Danchin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0099
recently Francis Crick, relies on the notion that bacterial spores such as those of B. subtilis could travel
through space and survive for millions of years. Despite its appeal to a wild imagination, this hypothesis
essentially puts the investigation of the origin of life
out of our reach, because exploring the whole Universe
is not possible.
Sporulation
Upon starvation, B. subtilis stops growing and initiates sporulation. This developmental process involves
differentiation into two cell types (Figure 1). The
process begins with a reorganization of the cell cycle
that leads to the production of cells whose size and
chromosome content are appropriate for the developmental process. The formation of the two cell types,
a forespore and a mother cell twice as large as the
forespore, with differing developmental fates is the
first morphological indication of the early stages of
sporulation in B. subtilis. Endospore formation is a
multistep process that is common among bacilli. This
seemingly simple structure is the product of a very
complex network of interconnected regulatory
pathways that become activated during late growth
in response to unbalanced nutritional shifts and cell
cycle-related signals. Sporulation starts with stage 0
(vegetative growth). Symmetrical cell division, characteristic of vegetative growth, is blocked. Instead, the
cell divides asymmetrically to produce a small polar
prespore cell and a much larger mother cell. During
stage I, asymmetrical preseptation starts. The cellular
DNA takes the shape of an axial filament. At stage II,
septation proceeds and the daughter chromosomes
are separated. Spore development follows at stage III
(engulfment of the forespore and complete separation
of the spore membrane from that of the mother cell).
Stage IV involves formation of the spore cortex. In stage
V spore coat proteins are synthesized and assembled.
At stage VI the spore becomes highly refractile under
the microscope, and it acquires heat and stress resistance. Finally, the programmed death of the mother cell
occurs, leading to lysis and release of the mature spore
(stage VII). Pigments are produced that stain the spores
from reddish brown to blackish brown (black in the
presence of tyrosine).
[Figure 2 labels: Stage 0; Stage I; septation; Stage III, engulfment; Stage IV, cortex; Stage V, coat; Stages VI–VII, maturation and lysis; free spore; germination; outgrowth. Sigma factor labels: A, pro-E, E, F, G, pro-K, K.]
Figure 2 The stages of sporulation. The various sigma transcription factors that control gene expression processes
during sporulation are indicated within the compartments where they operate.
Protein Secretion
Metabolism
In addition to the need for compartmentalization, living cells must chemically transform some molecules
into others. Metabolism is the hallmark of life. Cells
can be in a dormant state, as is the case with spores, for
example, but one cannot be sure that they are living
organisms unless, at some point, they initiate metabolism. In general, one distinguishes between primary
metabolism (the transformation of molecules that support cell growth and energy production) and secondary metabolism (transactions involving molecules that
are not necessary for survival and multiplication, but
assist in the exploration and occupation of biotopes,
e.g., antibiotic synthesis).
[Figure 3 labels: outside, cell membrane, inside; oligopeptide permease; Phr; PEP19; PEP6; processing; modification (?); RapA phosphatase; KinA, KinA~P; Spo0F, Spo0F~P; Spo0B, Spo0B~P; Spo0A, Spo0A~P; Spo0E; ATP; ADP; Pi.]
Figure 3 Competence is triggered when Bacillus subtilis encounters a signal from the environment and when an
appropriate quorum is reached, monitored by pheromones synthesized by the bacteria. The chain of events is
depicted. A sensor controls a regulator which, through a phosphorylation cascade, controls transcription. The onset
of sporulation negatively controls competence under appropriate conditions through the action of the protein
Spo0K.
Intermediary Metabolism
that permit this versatility. Specific transcription control processes allow the cell to adapt to changes of
temperature by transiently synthesizing heat shock or
cold shock proteins, according to the environmental
conditions. In addition, gene duplication may permit
adaptation to high temperature, with isozymes having
low and high temperature optima. As a case in point,
B. subtilis has two thymidylate synthases. The one
coded by the thyA gene is thermostable (and more
related to the archaebacterial type) and the other, ThyB,
is thermosensitive. Bacillus subtilis is also able to adapt
to strong osmotic stresses, such as the one that occurs
during dehydration, and can adapt to high oxygen
concentrations and changes in pH. Not much is yet
known about the corresponding genes and regulation.
Because the ecological niche of B. subtilis is linked
to the plant kingdom, it is subjected to rapid alternating drying and wetting. Accordingly, this organism is
very resistant to osmotic stress, and can grow well in
media containing 1 M NaCl, and indeed B. subtilis has
been recovered from sea water.
Secondary Metabolism
SubtiList, updates the genome sequence and annotation as work on B. subtilis progresses throughout the
world. Of the more than 4100 protein-coding genes,
53% are represented once. A quarter of the genome
corresponds to several gene families that have been
greatly expanded by gene duplication, the largest of
which is a family containing 77 putative ATP-binding
cassette permeases.
Information Transfer
some B. subtilis species such as the Marburg strain
168) and transduction with the appropriate carrier
phages is well understood.
Recombination
Restriction–Modification Systems
Phylogeny
Bacillus subtilis is a typical gram-positive eubacterium. As such it is significantly more similar to Archaea
than is E. coli. Many metabolic genes have a distinct
archaeal flavour, in particular genes involved in the
synthesis of polyamines, but it is rare to find genes in
B. subtilis that are similar to eukaryotic genes. This led
Gupta to propose that ancestral bacteria comprised
a monoderm organism that diverged into gram-positive bacteria and Archaea, and that gram-positive
bacteria further led to gram-negative bacteria with
their typical double membrane (diderms). This
hypothesis stirred a very heated, but interesting, debate
about the origin of the first cell(s). As such, bacilli
form a heterogeneous family of bacteria that can be
split into at least five distinct groups. Bacillus subtilis
is part of group 1 and is strongly linked to B. licheniformis (which is often found on the cuticle of insects),
and to the group of animal pathogens formed by B.
thuringiensis, B. cereus, and B. anthracis. In this classification B. sphaericus is typical of group 2, B. polymyxa of group 3, and B. stearothermophilus of group
5. The pathogen Listeria monocytogenes (in between
groups 2 and 5) is related to B. subtilis, and, indeed, its
genome has many features in common with that of
B. subtilis. Accordingly, B. subtilis is an
excellent model for these groups of bacteria.
Industrial Processes
As a model organism, B. subtilis possesses most of the
functions that one would expect to find in bacteria.
Further Reading
Reference
Backcross
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0100
during the generation of each individual. Genotypic
and phenotypic analysis of animals in this second-generation population can provide data that can be
used to determine linkage relationships and map positions of genes.
See also: F1 Hybrid; Independent Assortment;
Independent Segregation; Test Cross
Background Selection
W F Eanes
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1417
expectation; however, under the adaptive-sweep scenario, the genomic region is proposed, depending on
the age of the sweep, to show an excess of rare alleles,
or `singletons.' This issue remains unresolved.
Further Reading
Bacteria
J Parker
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0102
Characteristics of Bacteria
Prokaryotes (Bacteria and Archaea) have some things
in common besides the lack of a membrane-bound
nucleus. Prokaryotes are also missing the other membranous organelles, such as mitochondria and chloroplasts, found in eukaryotic cells. (Interestingly, it
seems very clear that mitochondria and chloroplasts
have evolved from Bacteria.) While many prokaryotes
are motile by means of a flagellum or flagella, the flagellum of the prokaryotes is unrelated to that found in
eukaryotes.
Prokaryotes also reproduce exclusively asexually, by
a process called binary fission. Unlike the Eukarya,
over 200 μm in length. One of the largest Bacteria is
the rod-shaped Epulopiscium fishelsoni, which has a
diameter of 50 μm and is over 500 μm in length. The
cells of the bacterium Thiomargarita namibiensis can
be as large as 750 μm in diameter, about the size of a
printed period (full stop), and are thus visible to the
naked eye. This enormous size results from the presence of a very large vacuole which contains nitrates.
Many Bacteria contain vacuoles or storage granules,
but most are much smaller in size.
While most Bacteria are unicellular, some clump
together in regular patterns (see Figure 1) and others
can form complex multicellular groups during their life
cycles. These latter include organisms like Myxococcus
xanthus, where specialized cell types form during a
complex life cycle.
Diversity of Bacteria
Bacteria occupy almost every niche where cellular life
is possible and are the most numerous forms of cellular life on the planet. Indeed, many Bacteria live in
habitats which are far removed from the environmental conditions which can be tolerated by eukaryotic
organisms (although Archaea can also thrive in these
extreme environments). Bacteria also exist in very large
numbers in most habitats, and consequently their metabolism has a profound impact on environmental and
Figure 1 A scanning electron micrograph of Micrococcus luteus. Each coccoid cell is approximately 1 μm in
diameter. Note there is a tendency of the cells to exist in
small clusters. This organism is an obligate aerobe (that
is, oxygen is required for metabolism) and is a member
of the gram-positive division. (Electron micrograph
courtesy of John Bozzola.)
Metabolic Diversity
The metabolic diversity found among the various species of Bacteria is enormous, encompassing all known
major modes of nutrition and most known modes of
metabolism. Many Bacteria (like most Eukarya) are
chemoheterotrophs, and must consume organic molecules for both a source of carbon and of energy. Many
other Bacteria (like most plants) are photoautotrophs,
and can derive energy from light and synthesize
organic compounds from carbon dioxide. Some Bacteria are chemolithoautotrophs, and also synthesize
organic compounds from carbon dioxide but derive
energy from oxidizing inorganic substances. Still
other Bacteria are photoheterotrophs, and use light
to generate energy but require organic carbon as a
carbon source.
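The four nutritional modes described above amount to a lookup on two properties: the energy source and the carbon source. The sketch below encodes that lookup; the category labels for the inputs ("light", "inorganic", "organic", "CO2") are illustrative choices, not standard nomenclature.

```python
# (energy source, carbon source) -> nutritional mode, as described in the text.
NUTRITIONAL_MODES = {
    ("light", "CO2"): "photoautotroph",
    ("light", "organic"): "photoheterotroph",
    ("inorganic", "CO2"): "chemolithoautotroph",
    ("organic", "organic"): "chemoheterotroph",
}


def nutritional_mode(energy_source, carbon_source):
    """Return the nutritional mode for a given energy and carbon source."""
    try:
        return NUTRITIONAL_MODES[(energy_source, carbon_source)]
    except KeyError:
        raise ValueError(
            f"combination not covered here: {energy_source!r}, {carbon_source!r}"
        )


# A bacterium that oxidizes inorganic substances and fixes carbon dioxide.
print(nutritional_mode("inorganic", "CO2"))
```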
To say merely that the heterotrophic Bacteria
require an organic carbon source fails to convey the
enormous variety of carbon sources that different
Figure 3 A scanning electron micrograph of Borrelia burgdorferi. This bacterium is a member of the Spirochete division
and is the cause of Lyme disease. It is one of the relatively few Bacteria known to have a linear chromosome (see Table 1).
The cell shown is 15 μm in length. (Electron micrograph courtesy of Pawel Krasucki and Cathy Santanello.)
Bacteria might use. Indeed, most organic compounds
can be metabolized by some species of Bacteria. Some,
such as Bacillus subtilis and Escherichia coli, need only
a simple organic carbon source, such as glucose, from
which they derive their energy and form all the other
carbon compounds found in the cell. Other free-living
Bacteria require a complex mixture of organic compounds for growth, while others are parasites and
obtain complex substances from their hosts.
However, the metabolic diversity of the Bacteria is
not confined to how they derive energy or what carbon
source they use. Nitrogen is the second most abundant
element in living material and several of the reactions
that nitrogen undergoes in the environment are carried
out almost exclusively by Bacteria. About 85% of the
nitrogen fixation, the process by which nitrogen gas
(N2) in the atmosphere is reduced to ammonia (NH3),
which occurs on the planet is biological and almost all
of this is carried out by Bacteria (although a few
Archaea can also fix nitrogen). Many of the Bacteria
which fix nitrogen, e.g., species of Rhizobium, do so
while participating in a symbiotic relationship with
higher plants. Bacteria are also responsible for the
denitrification which returns nitrogen to the atmosphere and also play key roles in the global sulfur and
iron cycles as well as those of several trace metals.
Some Bacteria are obligate aerobes, that is they require oxygen for respiration and cannot grow without
it. Other Bacteria use oxygen if it is present to respire
Phylogenetic Diversity
characteristics such as morphology, gram reaction and
cell wall chemistry, nutritional classification (photoautotroph, chemoheterotroph, and so forth), ability
to use various carbon, nitrogen, and sulfur sources,
nutritional requirements, lipid chemistry, temperature
and pH requirements or tolerances, pathogenicity, and
habitat. However, molecular taxonomy is based on the
relatedness of the sequences of macromolecules, particularly that of the 16S ribosomal RNA (rRNA)
found in the small subunit of the prokaryotic ribosome. Two Bacteria whose 16S rRNA sequences differ by more
than 3% are considered to be separate species and
those which differ by more than 5–7% are considered
to be in separate genera. (The conservation of these
sequences can be noted by the fact that a 3% difference in 16S rRNA sequence can be indicative of an
overall genome similarity of about 70%.)
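The sequence-difference rule of thumb can be expressed as a small calculation. The sketch below is illustrative rather than a taxonomic tool: it assumes the two 16S rRNA sequences are already aligned to equal length (with "-" for gaps), reads the genus figure as a 5 to 7% range and uses its lower bound as an adjustable default, and uses made-up toy sequences.

```python
def percent_difference(seq_a, seq_b):
    """Percent of mismatched positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


def taxonomic_call(diff_percent, species_cutoff=3.0, genus_cutoff=5.0):
    """Apply the rule-of-thumb cutoffs; the cutoff values are adjustable assumptions."""
    if diff_percent > genus_cutoff:
        return "separate genera"
    if diff_percent > species_cutoff:
        return "separate species"
    return "not separated by 16S rRNA alone"


# Toy 10-nucleotide example; real 16S rRNA genes are roughly 1500 nucleotides.
diff = percent_difference("ACGTACGTAC", "ACGTACGTAA")
print(diff, taxonomic_call(diff))
```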
Higher level taxonomy depends both on these
sequence differences and the phenotypic characterization mentioned above. Among the culturable Bacteria
there are at least 14 different major divisions. These
divisions are also sometimes referred to as `kingdoms'
or `phyla' (there is not yet a consistent usage of taxonomic nomenclature of the Bacteria for the higher
level of groupings). The typical difference between 16S
rRNA sequences from different divisions is 20–25%.
The major divisions include Aquificales, Chlamydia,
Cyanobacteria (the chloroplasts of eukaryotes are
related to this division), Cytophagales, Deinococcus/
Thermus, Fusobacteria, Gram-positive Bacteria
(often divided into two divisions), Green nonsulfur
Bacteria, Green sulfur Bacteria, Nitrospira, Planctomyces, Proteobacteria (also called Purple Bacteria),
Spirochetes, and Thermotogales. (There is not yet agreement on the names of the divisions.) However, the
ability to identify and characterize organisms using in
situ hybridization to nucleic acid probes has indicated
there may be as many as 50 such higher-order groupings. Many of these are currently known only as a
sequence found in a natural population; these are
referred to as `environmental sequences.'
It is the power of this methodology that has given
rise to the estimate that so few of the existing species of
Bacteria have been cultured. At the same time it has
become clear that the currently culturable Bacteria do
not make up a preponderance of natural populations.
This finding emphasizes that these new divisions do
not simply contain a few minor exotic Bacteria.
Bacterial Genomes
As mentioned above, the majority of Bacteria whose
genomes have been characterized are found to contain
a single circular double-stranded DNA molecule
as a chromosome. Certain Bacteria do have linear
Table 1
Organism                      Genome size (bp)a   ORFsb   Descriptionc
Mycoplasma genitalium         580 070             470     Gram-positive Bacteria, lacks cell wall, parasitic, smallest known cellular genome
Mycoplasma pneumoniae         816 394             677     Gram-positive Bacteria, lacks cell wall, causes pneumonia
Borrelia burgdorferi          910 725             853     Spirochete, has linear chromosome,d causes Lyme disease
Chlamydia trachomatis         1 042 519           894     Chlamydia, obligate intracellular parasite, common human pathogen
Rickettsia prowazekii         1 111 523           834     Proteobacteria, obligate intracellular parasite, causes epidemic typhus
Treponema pallidum            1 138 006           1041    Spirochete, human parasite which cannot be cultured continuously in vitro, causes syphilis
Aquifex aeolicus              1 551 335           1512    Aquificales, a hyperthermophilic chemolithoautotroph, growth maximum near 95 °C
Helicobacter pylori           1 667 867           1590    Proteobacteria, causes peptic ulcers, the most common chronic infection of humans
Haemophilus influenzae        1 830 137           1743    Proteobacteria, causes infectious meningitis, naturally transformable, first cellular genome sequenced
Synechocystis sp.             3 573 470           3168    Cyanobacteria, a photoautotroph
Bacillus subtilis             4 214 810           4100    Gram-positive Bacteria, genetic model
Mycobacterium tuberculosis    4 411 529           3924    Gram-positive Bacteria, causes tuberculosis, claims 3 000 000 lives per year, more than any other pathogen
Escherichia coli              4 639 221           4288    Proteobacteria, gram-negative genetic model
a All Bacteria listed have a single chromosome. However, many contain one or more plasmids. For example, the strain of Borrelia burgdorferi whose chromosome was sequenced contains 17 different plasmids which themselves total 533 kilobase pairs.
b The number of open reading frames, ORFs, is an approximation of the total number of different proteins that an organism encodes.
c The first term in each of these descriptions is the name of the bacterial `division' to which the organism belongs.
d All other chromosomes in this list are circular.
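One way to read the figures in Table 1 is as a gene density. The sketch below divides each ORF count by the genome size in kilobase pairs, taking both values directly from the table. The "ORFs per kilobase" measure is an illustrative summary added here, not a quantity reported in the table.

```python
# (organism, genome size in bp, number of ORFs), copied from Table 1.
GENOMES = [
    ("Mycoplasma genitalium", 580_070, 470),
    ("Mycoplasma pneumoniae", 816_394, 677),
    ("Borrelia burgdorferi", 910_725, 853),
    ("Chlamydia trachomatis", 1_042_519, 894),
    ("Rickettsia prowazekii", 1_111_523, 834),
    ("Treponema pallidum", 1_138_006, 1041),
    ("Aquifex aeolicus", 1_551_335, 1512),
    ("Helicobacter pylori", 1_667_867, 1590),
    ("Haemophilus influenzae", 1_830_137, 1743),
    ("Synechocystis sp.", 3_573_470, 3168),
    ("Bacillus subtilis", 4_214_810, 4100),
    ("Mycobacterium tuberculosis", 4_411_529, 3924),
    ("Escherichia coli", 4_639_221, 4288),
]

# ORFs per kilobase pair of genome sequence.
density = {name: orfs / (size_bp / 1000.0) for name, size_bp, orfs in GENOMES}

for name, value in density.items():
    print(f"{name:28s} {value:.2f} ORFs per kb")
```

For every genome listed the value falls a little below one ORF per kilobase pair, despite an eightfold range in genome size.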
Transformation
Conjugation
Bacterial conjugation is a plasmid-encoded mechanism that allows certain plasmids to transfer themselves from cell-to-cell, sometimes across wide
phylogenetic distances. Conjugation is one of the
prominent mechanisms for the spread of antibiotic
resistance in pathogenic Bacteria and is undoubtedly
involved in other types of horizontal gene transfer.
Conjugation was discovered (in Escherichia coli),
however, because under certain conditions the
process can also mobilize the transfer of part of the
host chromosome.
Transduction
Further Reading
Hugenholtz P, Goebel B and Pace NR (1998) Impact of culture-independent studies on the emerging phylogenetic view of
bacterial diversity. Journal of Bacteriology 180: 4765–4774.
Madigan MT, Martinko JM and Parker J (2000) Brock Biology of
Microorganisms, 9th edn. Upper Saddle River, NJ: Prentice Hall.
Woese CR, Kandler O and Wheelis ML (1990) Towards a natural
system of organisms: proposal for the domains Archaea,
Bacteria, and Eucarya. Proceedings of the National Academy of
Sciences, USA 87: 4576–4579.
Bacterial Genes
E A Raleigh
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0103
Gene Names
Mutations
Gene Structure
Protein-Coding Genes
[Figure labels: DNA with promoter (start), lacZ, lacY, and lacA; transcription to a single RNA with 5' UTR/RBS, three coding regions (AUG...UAA, AUG...UAA, UUG...UAA), and 3' UTR; translation to the proteins LacZ, LacY, and LacA.]
RNA Genes
Some genes code for RNA transcripts that are functional in themselves, without translation. Pre-eminent
among these are those involved in the translation
machinery: ribosomal RNA and transfer RNA. The
gene is then defined as the transcribed segment in
toto. Some processing may occur to yield the final
product so that the active species may be smaller
than the initial transcript. Some other RNA gene
products play enzymatic roles: RNase P, for example,
is composed of both a protein moiety and an RNA
moiety. Yet others act by altering the metabolism of
particular transcripts: for example, RNA I of pBR322
regulates plasmid copy number via a pairing interaction with the RNA (RNA II) that serves as a replication primer.
or mutant state, nor even any knowledge of the biochemical function in question. However, it does
require a method for introducing a second, wild-type
copy of the candidate genes. In bacteria, this is not
straightforward, since the genome is haploid. Such
experiments are usually conducted by establishing a
plasmid carrying the gene region in question.
Gene Organization
Operons
In some cases, adjacent genes overlap and are translated in different frames from the same sequence.
Usually the overlap is small. A significant minority
of genes in operons overlap by one or four nucleotides;
for example: TAATG, where TAA is the stop for the
upstream gene and ATG is the start of the downstream gene; or
GTGA, where GTG is the start of the downstream
gene and TGA the stop for the upstream gene.
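The one- and four-nucleotide overlaps can be verified by locating the stop and start codons in each junction. The following is a minimal sketch: the function name is hypothetical, the start codon set (ATG, GTG, TTG) is an assumption based on common bacterial usage, and the input is taken to be a short junction sequence on the coding strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed set of bacterial start codons


def junction_overlaps(junction):
    """Overlap lengths (nt) between an upstream stop and a downstream start.

    The upstream gene ends with the last base of its stop codon and the
    downstream gene begins with the first base of its start codon, so the
    overlap is stop_end - start_begin + 1 when that value is positive.
    """
    seq = junction.upper()
    overlaps = set()
    for stop_pos in range(len(seq) - 2):
        if seq[stop_pos:stop_pos + 3] not in STOP_CODONS:
            continue
        stop_end = stop_pos + 2
        for start_pos in range(len(seq) - 2):
            if seq[start_pos:start_pos + 3] not in START_CODONS:
                continue
            overlap = stop_end - start_pos + 1
            if overlap > 0:
                overlaps.add(overlap)
    return sorted(overlaps)


print(junction_overlaps("TAATG"))  # TAA stop, ATG start share one base
print(junction_overlaps("GTGA"))   # GTG start, TGA stop span four bases
```

In both cases the two genes are necessarily read in different frames, because the start and stop codons are offset by one or two positions.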
Numerous examples of ribosomal frameshifting
have been described in viruses and insertion sequences
as well as at least two conventional bacterial genes.
Translating ribosomes `slip' on the message at a defined location (called a `slippery sequence') and continue translation in a frame different from the original
one. This occurs with dnaX of E. coli. Ribosomes that frameshift terminate translation at a stop codon not far away, leading to
expression of replication factor gamma. A subset of
ribosomes fails to frameshift; these continue in the original frame, resulting in translation of the longer replication factor tau, so that there are two
gene products.
In rare instances, two genes may overlap extensively: the IS5 insertion sequence expresses one protein from one strand and two others from the other
strand. In this instance, the same sequence segment
codes for two genes. This sort of overlap is more
frequent in mobile elements and bacteriophage,
Intervening Sequences
Genes in Databases
Large-scale sequencing projects have produced massive quantities of sequence information on large
numbers of bacterial species. This section sketches
very briefly the basis of gene definition for these raw
sequences, and some problems of interpretation that
arise in the absence of biochemical or mutational data.
Genetic analysis of model systems (especially E. coli
and B. subtilis) over the last 50 years has provided a
large reservoir of knowledge of genes and functions
essential for life. This knowledge can be applied to
new sequences, by computerized analysis based on
known gene structure and by comparison with known
sequences. Accordingly, annotations to database entries can provide a guide to what genes are present,
where they are, and what they might do.
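As a rough illustration of gene definition from raw sequence, the sketch below scans the three forward reading frames for open reading frames that run from a start codon to the first in-frame stop codon. It is a deliberately simplified example, not the method used by any particular annotation pipeline: it ignores the reverse strand, ribosome-binding sites, and codon usage, and the start codon set and minimum length are assumptions.

```python
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed bacterial start codons
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for forward-strand ORFs.

    start is the 0-based index of the start codon and end is the index just
    past the stop codon. min_codons counts codons before the stop.
    """
    seq = dna.upper()
    orfs = []
    for frame in range(3):
        pos = frame
        while pos + 3 <= len(seq):
            if seq[pos:pos + 3] in START_CODONS:
                end = pos + 3
                # Extend codon by codon until an in-frame stop or the sequence end.
                while end + 3 <= len(seq) and seq[end:end + 3] not in STOP_CODONS:
                    end += 3
                if end + 3 <= len(seq):  # an in-frame stop codon was found
                    if (end - pos) // 3 >= min_codons:
                        orfs.append((pos, end + 3, seq[pos:end + 3]))
                    pos = end + 3
                    continue
            pos += 3
    return orfs


# Toy sequence containing a single short ORF in the third frame.
orfs = find_orfs("CCATGAAATTTTAGGG")
print(orfs)
```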
ORFs
Codon Usage
Modules
Further Reading
References
Bacterial Genetics
S Maloy
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0104
Bacterial genetics is the study of how genetic information is transferred, either from a particular bacterium
to its offspring or between interbreeding lines of bacteria, and how that genetic information is expressed.
Given the short generation times of most bacteria, the
inheritance of genetic information must be extremely
faithful. Occasionally, mutation or the transfer
of genetic information between bacteria gives rise to
genetic variation. The large sizes of bacterial populations
ensure that even extremely rare genetic events are
likely to occur. This genetic variation makes it possible
for individual members of huge populations of bacteria to evolve new traits rapidly. For example, a single
mutation may allow a bacterium to survive environmental conditions that would kill its nonmutant siblings (e.g., exposure to an antibiotic), or a group of
genes transferred from another bacterial species
may enable such an altered bacterial species to invade
a new environmental niche (e.g., the ability to infect a
Effects of Mutations
Bacterial mutants are typically described by comparison to a standard, well-characterized, reference strain
called the `wild-type' strain. Bacterial mutants have
often lost some growth property (e.g., the ability to utilize
a particular carbon or nitrogen source or to
grow without a particular nutrient), or have acquired
some new growth property (e.g., ability to grow in the
presence of some toxic substance). Genes can be
divided into two categories based on the phenotypes
of the corresponding mutants: nonessential gene products are only required under specific growth conditions, while essential gene products are required under
all conditions. The genes of lactose catabolism are
Table 1 Conditional mutation
Permissive conditions            Nonpermissive conditions
30 °C                            42 °C
42 °C                            30 °C
High osmotic strength            Low osmotic strength
Host with suppressor mutation    Host lacking suppressor mutation
Approach
Features
Selection
Screen
Enrichment
Sensitivity
10
10
10
to 10
10
per cycle
Examples
Resistance to antibiotics or toxic substrate analogs; resistance to phage
Indicator medium to identify mutants
unable to ferment a carbon source;
replica plating to identify auxotrophs
Penicillin enrichment; D-cycloserine enrichment; radioisotope suicide
Genetic Exchange
Exchange of DNA between bacteria plays an important role in evolution. Gene transfer in nature can result
in the acquisition of antibiotic resistance and new
virulence traits. Gene transfer is also a useful tool in
the laboratory, allowing genetic mapping and complementation tests, and the construction of bacterial
strains with multiple mutations. The three most
common methods of gene transfer between bacteria
Table 3
Transformation
Transfer mechanism
Vehicle
Properties
Transformation
Conjugation
Generalized transduction
Phage
Specialized transduction
Lysogenic phage
Conjugation
Transduction
Mutant organisms can regain their wild-type characteristics by means of a mutation that restores function,
by a process called reversion. Reversion can be
owing either to `true reversion,' a back mutation that restores
the original genotype, or to `suppression,' a second mutation that produces a change that partially or fully
compensates for the first mutation. The ability to
analyze very large populations of bacteria facilitates
the isolation of rare classes of such revertants.
The reversion frequency is a useful criterion for
classifying mutants. The reversion frequency can distinguish a single point mutant from a double mutant
or a deletion mutant: single point mutants (which can
be repaired by a single back mutation) revert at a much
higher frequency than double mutants (which require
two changes and thus revert very rarely) or deletion
mutants (which cannot directly revert). The effect
of mutagens on the reversion frequency can also be
used to distinguish different types of mutations:
reversion of frameshift mutations is stimulated by
frameshift mutagens, and so on. For most purposes
this approach has been superseded by direct DNA
sequence analysis; however, the reversion of known
bacterial mutations remains an important assay to
detect potential human carcinogens (for example, by
the Ames test).
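The effect of the number of required changes on reversion frequency can be illustrated with a rough calculation. The sketch below assumes that each required change arises independently at the same rate; the function name, the population size, and the rate of 1e-8 per cell are hypothetical values chosen only for illustration.

```python
def expected_revertants(cells: float, back_mutation_rate: float, changes_needed: int) -> float:
    """Expected number of true revertants in a population.

    Assumes each required change occurs independently at the same rate,
    so a mutant needing k changes reverts at rate**k per cell.
    """
    if changes_needed < 1:
        raise ValueError("changes_needed must be at least 1")
    return cells * back_mutation_rate ** changes_needed


# Hypothetical numbers: 1e10 plated cells, back-mutation rate 1e-8 per cell.
CELLS = 1e10
RATE = 1e-8

single = expected_revertants(CELLS, RATE, 1)  # single point mutant
double = expected_revertants(CELLS, RATE, 2)  # double mutant (two changes)

print(f"single point mutant: ~{single:g} revertants expected")
print(f"double mutant:       ~{double:g} revertants expected")
```

Under these assumed numbers a single point mutant yields revertants on every plate, while a double mutant is expected to yield essentially none, which is the basis for using reversion frequency to classify mutants.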
Second site mutations that result in suppression
can occur within the same gene as the original mutation (intragenic suppression), or within a different
gene (intergenic suppression). Intragenic suppressors
can provide information about the role of particular
Homologous Recombination
Complementation Analysis
A particular phenotype frequently reflects the activity
of many genes. To understand any genetic system it is
essential to know the number of genes and regulatory
elements involved. Since multiple genes that affect the
same function may map very close to each other, it is
not possible to determine if two mutations are in the
same or different genes simply from the recombination frequency. For a variety of reasons, it is also often
difficult to prove that two mutations affect the same
gene product by DNA sequence analysis. However,
this question can be addressed by genetic complementation.
Bacteria are normally haploid, but complementation requires the maintenance of two copies of a particular gene in the same cell. In bacteria this can be
done by constructing a partial diploid or `merodiploid' that carries two copies of the relevant genes.
Partial diploids may provide one copy of the gene on
the chromosome and a second copy of the gene on a
plasmid. For example, the partial diploid lacZ+/lacZ−
has both a copy of the lacZ+ gene and a second copy with
the lacZ− gene. If the functional copy of these genes
encodes a protein that can diffuse through the cell to
perform its function, and the second copy has a loss-of-function mutation, the functional copy of the gene
will be dominant over the mutant copy. Thus, even
though the lacZ− gene on the chromosome cannot produce β-galactosidase, the plasmid-borne lacZ+ gene
makes β-galactosidase so the cell is phenotypically
Lac+. The mutant is complemented by the wild-type
gene, indicating that the mutation is recessive. In contrast, the partial diploid lacZ−/lacZ− cannot make
β-galactosidase so the cell is phenotypically Lac−. If
two recessive mutants fail to complement, they affect
the same gene.
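The rule that recessive mutations which fail to complement affect the same gene can be turned into a simple grouping procedure. The following is a minimal sketch, not a standard tool: the mutant names and test results are hypothetical, and it assumes that every mutation is recessive and that each pairwise partial diploid gave an unambiguous result.

```python
def complementation_groups(mutants, complemented):
    """Group recessive mutants into complementation groups.

    `complemented` maps a pair of mutant names to True if the partial
    diploid carrying both mutations had the wild-type phenotype.
    Pairs that FAIL to complement are placed in the same group.
    """
    parent = {m: m for m in mutants}

    def find(m):
        # Follow parent links to the group representative (with path halving).
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for (a, b), wild_type in complemented.items():
        if not wild_type:
            parent[find(a)] = find(b)

    groups = {}
    for m in mutants:
        groups.setdefault(find(m), []).append(m)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical results of pairwise complementation tests.
mutants = ["m1", "m2", "m3", "m4"]
results = {
    ("m1", "m2"): False,  # no complementation: same gene
    ("m1", "m3"): True,
    ("m1", "m4"): True,
    ("m2", "m3"): True,
    ("m2", "m4"): True,
    ("m3", "m4"): False,  # no complementation: same gene
}

groups = complementation_groups(mutants, results)
print(groups)
```

With these assumed results the four mutants fall into two groups, suggesting that at least two genes contribute to the phenotype.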
Complementation analysis in bacteria usually follows the grouping of mutations by genetic mapping
experiments. This reduces the number of partial
diploids that must be constructed because genes that
map far from each other are clearly different. In
addition, complementation analysis should involve
partial diploids with an equal number of copies
of each gene. If one of the genes is present in excess
(for example, with genes cloned on multicopy plasmids) artifacts can occur which may be very misleading.
Transposable Elements
Summary
A hallmark of bacterial genetics is the ability to analyze very large populations of cells to identify rare
genetic events. Although a wide variety of genetic
tricks have been developed for specific purposes
in particular bacteria, bacterial genetics relies on a
relatively small core of tools for dissection of the
structure and function of genes. The essential tools
include the isolation of mutations, the ability to transfer
genes between bacterial strains, the ability to isolate
recombinants, and the ability to do complementation
Further Reading
Bacterial Transcription Factors
T R Hoover
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0544
Transcriptional Activators
Additional transcriptional factors are required for
maximal expression from many bacterial promoters.
These transcriptional activators may be required for
recruiting RNA polymerase to the promoter, for isomerization of the closed promoter complex to an open
complex, or for promoter clearance. Activators, therefore, work by facilitating specific steps in the normal
pathway of transcription initiation rather than by
creating new pathways.
Repressors of Transcription
The Lac repressor protein, LacI, prevents the transcription of genes involved in lactose utilization (lac
genes) in E. coli. Like many other repressors, LacI
utilizes multiple operators to increase the efficiency
of repression. The main operator, O1, is centered at
+11 relative to the transcriptional start site of the lac
operon. Auxiliary operators, O2 and O3, are centered
at positions +412 and −82, respectively. LacI, which is
Further Reading
Bacterial Transformation
R J Redfield
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0105
Natural Competence
Distribution
Importance
Transformation has been demonstrated under seminatural conditions, in soil, water, and the mammalian
hosts of bacterial pathogens. However, we have only
indirect evidence about the frequency of DNA uptake
in nature. DNA is relatively abundant in many environments; concentrations range from 0.2–50 µg l⁻¹ in
fresh water and sea water to 300 µg ml⁻¹ in some bodily
fluids; in each case this is a significant fraction of
the total available nutrients. Furthermore, in some
environments DNA can be more abundant than free
nucleotides and bases. Most of this DNA is not closely
related to the genomes of the competent cells that
might take it up.
In the laboratory
The mechanisms and regulation of natural competence have been well characterized in only a few
groups, primarily the gram-negative genera Haemophilus and Neisseria and the gram-positive Bacillus
and Streptococcus. In both gram-negative and gram-positive bacteria, specific proteins on the competent
cell surface bind double-stranded DNA and pass it to
the DNA-translocation machinery in the inner (cytoplasmic) membrane, which then threads one or both
strands of the DNA across this membrane into the
cell. One strand of the DNA is usually degraded; this
may occur before, during, or after entry of the other
strand into the cytoplasm. The proteins involved in
the initial steps are different in different bacteria, but
at least some components of the membrane-threading
machinery are homologous.
The topological problems associated with DNA
uptake have received little attention. DNA molecules
are very big (a 10-kb molecule is as long as a bacterial
cell), and double-stranded DNA is not very flexible
(persistence length ~50 nm). DNA also carries a
strong negative charge that may repel the cell surface
and that will resist passage through hydrophobic
Regulation of Competence
Competence develops most commonly under conditions of growth downshift or nutritional stress, but the
genes, signals, and mechanisms of regulation differ
widely. Many regulatory genes have been identified,
but few appear to be specific to competence (i.e.,
most also control other cellular functions). Laboratory
cultures of Neisseria gonorrhoeae are competent at all
stages of growth. Some cells in Haemophilus influenzae cultures become competent at the end of exponential growth, and the entire culture becomes
competent after an abrupt transfer to a starvation
medium. B. subtilis uses secreted factors and a complex network of nutritional signals to coregulate
competence induction with other `postexponential'
cellular processes at the onset of stationary phase.
Streptococcus pneumoniae becomes competent in
response to a secreted factor as culture density increases during exponential growth. There is little evidence that different bacteria use common factors to
regulate competence, suggesting that regulation has
evolved independently in response to different selective pressures or environments.
Artificial Competence
Transformation of laboratory cultures with plasmids
is an essential tool for molecular biology, and the
availability of a reliable laboratory procedure for
transformation may determine whether a particular
bacterial species or strain is suitable for genetic analysis. Cells made competent by these procedures are
usually transformed with self-replicating plasmids as
the DNA taken up by such cells does not usually
recombine with the chromosome even if sequences
are homologous.
Chemical Competence
Electroporation
Further Reading
Ausubel FM, Brent R and Kingston RE, et al. (eds) (1988) Current Protocols in Molecular Biology, Sect. 1.8. New York: John Wiley & Sons.
Lorenz MG and Wackernagel W (1994) Bacterial gene transfer by natural genetic transformation in the environment. Microbiological Reviews 58: 563–602.
Solomon JM and Grossman AD (1996) Who's competent when: regulation of natural genetic competence in bacteria. Trends in Genetics 12: 150–155.
Bacteriophage Recombination
K N Kreuzer
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1431
Genetic recombination has two major forms. Homologous recombination involves exchange between two
DNA molecules that have extensive homology, and is
catalyzed by proteins that can function anywhere
along the homologous regions. Site-specific recombination, on the other hand, involves exchange between
two DNA molecules that have little or no homology,
and is carried out by specialized proteins that act only
at those particular sites. Both forms of recombination are central to the life cycles of bacteriophages,
although their importance varies greatly between
different phage species.
This article begins with a brief historical perspective
that highlights a few of the most important early developments in bacteriophage recombination, followed by
a more in-depth overview of some of the key roles of
homologous recombination in bacteriophage, and
closes with a brief summary of site-specific recombination. The emphasis will be on a broad overview of the
biology of phage recombination, and the reader is
encouraged to explore the fascinating mechanisms
underlying this biology in numerous other review
articles and in the primary literature.
History
The recombination of bacteriophage genetic markers
was first observed in the 1940s, a few years before the
Homologous Recombination
Production of Concatemeric Phage DNA
chromosomes avoid the end problem by being circular, while linear eukaryotic chromosomes have
special end sequences, telomeres, which allow a
novel mode of replication to solve the end problem.
Bacteriophages have adopted a variety of strategies
to deal with the end problem, and these strategies
are reflected in the diverse genome structures of different phages. Phages such as phi X174 and PM2 have
adopted the same simple strategy as bacterial cells,
namely the use of a circular genome. Others, like
phi 29, have adopted another simple strategy, using a
terminal protein that serves as a primer for DNA
polymerase at the very ends of the genome. For most
phages, however, the solution to the end problem
involves homologous recombination in one way or
another.
As mentioned above, the T-even phages use homologous recombination to generate the concatemeric
DNA that is a precursor to headful packaging. The
pathways for generating these long DNA concatemers
are quite complex, and indeed are still under intensive
study. To understand the process, we need to consider
first the structure of DNA within the phage head.
Each T4 particle contains one linear duplex of phage
DNA. Even though the nonredundant T4 genetic
material consists of 169 003 bp, each packaged DNA
is several kilobases longer. The explanation is that the
DNA sequence at one end of the packaged DNA is
repeated at the other end, so-called `terminal redundancy.' Using simple letters instead of more complex
gene names, a given phage DNA molecule might have
the structure ABCD . . . YZAB, with segment AB as
the terminal redundancy. A second complexity of the
genome structure is called `circular permutation,'
which refers to the fact that different ends are found
in different phage DNA molecules. Thus, a series of
five phage DNA molecules (with terminal redundancy) could be represented as: ABCD . . . YZAB;
BCDE . . . ZABC; CDEF . . . ABCD; DEFG . . . BCDE;
EFGH . . . CDEF. This is an oversimplification: the
precise ends of the genome can be within genes.
Indeed, every one of the 169 003 bp of the T4 genome
may be an end within a subset of the packaged DNA
molecules.
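Terminal redundancy and circular permutation both follow naturally if each phage head is filled from a concatemer with slightly more than one genome length of DNA. The sketch below illustrates this with the letter notation used above; it is a toy model, assuming a 26-letter genome, a redundancy of two letters, and packaging that proceeds strictly in series along one concatemer. The function name is hypothetical.

```python
import string


def headful_packages(genome: str, redundancy: int, count: int) -> list[str]:
    """Cut successive headfuls from a concatemer of `genome`.

    Each headful holds one genome length plus `redundancy` extra units,
    and the next headful starts where the previous one ended.
    """
    headful = len(genome) + redundancy
    # Build a concatemer long enough to supply `count` headfuls.
    copies = (headful * count) // len(genome) + 1
    concatemer = genome * copies
    return [concatemer[i * headful:(i + 1) * headful] for i in range(count)]


genome = string.ascii_uppercase  # toy genome: A to Z
packages = headful_packages(genome, redundancy=2, count=3)

for dna in packages:
    # Show only the ends, as in the notation ABCD . . . YZAB.
    print(f"{dna[:4]} . . . {dna[-4:]}")
```

Each packaged molecule begins and ends with the same two letters (terminal redundancy), and successive molecules start at different positions in the genome (circular permutation).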
During the course of an infection, the infecting
linear DNA of phage T4 is replicated into very long,
branched concatemers containing hundreds of phage
genome equivalents (i.e., tens of millions of base pairs).
Immediately after infection, T4 replicates from internally located replication origins to duplicate the genome once or a few times (Figure 1). As described
above, the 3′ ends of the parental DNA cannot be
completely replicated, resulting in ss ends that are
recombinogenic. During infection by a single phage
particle, the two daughter molecules from this early
[Figure 1, panels (A) and (B): origin-dependent replication of a terminally redundant genome (ABCD . . . YZAB), followed by recombination at an end.]
Recombination-Dependent Replication
[Figure panels: origin-dependent replication; annealing at ends (3′ and 5′ ends labeled).]
In addition, another phage protein binds to the D-loop and then loads the replicative helicase and primase onto the displaced DNA strand, allowing Okazaki fragment synthesis on the lagging strand.
Recombinational Repair
[Figure panels: strand invasion (D-loop); replication initiation (lagging strand); elongation.]
Intron Mobility
Site-Specific Recombination in Bacteriophages
Studies of phage genome structures led to the first
detailed understanding of a site-specific recombination
[Figure, panels (A) and (B); labels: attP, attB, S, U, S′, U′.]
new phage particles. In this lytic infection, all of the
DNA replication of the phage occurs by means of
replicative transposition events. Thus, the phage DNA
is madly transposing around the bacterial chromosome
in ever-larger numbers as the lytic infection proceeds,
until the cell lyses and releases a crop of new phage
particles. Phage Mu has provided a wonderful system
to study transposition, since the process occurs at an
extremely high rate during a lytic infection (hundreds
of events per cell per hour), as opposed to the very low
rate of other bacterial transposons (on the order of
10⁻⁴ events per cell per hour).
Closing Comments
Homologous and site-specific recombination play
many important roles in the life cycles of various
bacteriophages. Some have been presented in this
chapter, but numerous others have been either studied
less well or remain to be discovered by the next generation of researchers. Undoubtedly, both homologous and site-specific recombination also play
major roles in the evolution of the huge diversity of
bacteriophages found in nature, although the details
are less clear. The study of bacteriophage recombination holds great interest beyond an appreciation of the
life cycles and evolution of bacteriophage. For example, over the second half of the twentieth century,
the elucidation of particular pathways of recombination in bacteriophage has provided some of the most
important advances in the field of molecular biology.
Furthermore, bacteriophage enzymes involved in
recombination (e.g., ligases, polymerases, site-specific
recombinases) have provided key tools used throughout molecular biology. Finally, in recent years,
different bacteriophages have been shown to play
key roles in bacterial pathogenesis, and thus bacteriophage recombination pathways are also relevant to
human disease processes.
Further Reading
References
Luria S (1947) Reactivation of irradiated bacteriophage by transfer of self-reproducing units. Proceedings of the National Academy of Sciences, USA 33: 253–264.
Streisinger G, Emrich J and Stahl MM (1967) Chromosome structure in phage T4. III. Terminal redundancy and length determination. Proceedings of the National Academy of Sciences, USA 57: 292–295.
Thaler DS and Stahl FW (1988) DNA double-chain breaks in recombination of phage and of yeast. Annual Review of Genetics 22: 169–197.
Bacteriophage Therapy
Z Alavidze and M Kutateladze
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0981
Tbilisi, Georgia
Work on therapeutic uses of phages continued in the
USSR, where they had much less access to the new
antibiotics and more experience and trust in the application of phages. This work was led by scientists at
what is now called the Eliava Institute of Bacteriophages, Microbiology, and Virology in Tbilisi. During
visits to the Pasteur Institute in 1921–26, the Institute's founder,
George Eliava, worked extensively with d'Herelle;
their first joint paper was published in April 1921.
For over half a century, the main focus of
the Institute has remained the investigation of bacteriophages. Many basic and practical studies were aimed at
understanding the isolation and selection, morphology and biology, serology and taxonomy of virulent and temperate bacteriophages. Bacteriophage
ecology, phage–host cell interaction mechanisms, the appearance and development of lysogeny,
methods of isolating active phage clones against aerobic
and anaerobic bacteria, and phage purification and concentration have been studied at the Institute. At its
height, the Institute had about 150 research scientists,
with an additional 650 people associated with the facility for mass producing phage for commercial applications. At times they produced over 2 tonnes of phage
preparations a day, which were distributed through
hospitals and pharmacies all over the USSR.
The Institute had focused its interest on applications of phages in medicine for treatment and prophylaxis of different infectious diseases. Polyvalent phage
preparations against purulent microorganisms, particularly intestinal and wound anaerobic infections,
have been created. `Piophage' (containing phages
active against Staphylococcus, Streptococcus, Proteus,
E. coli, and Pseudomonas bacterial strains) and `Intestiphage' (21 different phage components) were successfully used as remedies against infectious diseases in the
entire former USSR, and additional production centers were established in three Russian cities. In addition to the combined polyvalent phage preparations,
mono-phage preparations were produced against
Salmonella, Shigella, Staphylococcus, Streptococcus,
E. coli, and Pseudomonas. A Staphylococcus phage preparation should be specially mentioned. This preparation has been very highly purified and concentrated,
giving it higher efficiency and rapid action; in contrast
with the other phage preparations, it can be applied
intravenously. This anti-Staphylococcus phage has
been successfully used against septicemia, female
infertility, osteomyelitis and open fractures, and burn
and wound infections. It appeared especially efficient
for treating septicemia in the newborn.
Along with the phage research aimed at creation
of therapeutic and prophylactic phage preparations,
original phage-typing patterns (or lysotyping schemes)
designed for interspecific differentiation of pathogens
such as Salmonella paratyphi, S. typhimurium,
Shigella flexneri, Sh. sonnei, Clostridium perfringens,
and others have been elaborated. Application of the
highly specific lysotyping phage sets is of great significance for epidemic studies, determining the sources
and ways of transmission of the infections, as well as for
rapidly diagnosing bacterial pathogens for selection of
treatment.
In recent years, a number of new applications of
phages have been developed. In particular, phages have been used
to solve a severe problem of hospital-acquired infections.
Targeted phage application is regularly practiced in many
clinics in the Republic of Georgia and in Russia. A new
phage preparation, `Phagobioderm,' a highly effective
wound-covering material, has been developed in
collaboration between scientists from several institutes in Tbilisi. This preparation, which accelerates wound healing, is a bioresorbable polymeric film
impregnated with dried bacteriophage (sometimes
also with antibiotics) and painkillers, with
surface-immobilized α-chymotrypsin that
causes slow release of the various agents. It protects
against entry of new microbes and treats those already
there while allowing efficient air circulation and
stimulating healing, even of large burns.
Investigation of new bacteriophages and the
mechanisms of their interaction with host cells is
also performed at the Tbilisi Institute. Bacteriophages
remain excellent model systems for elucidating such
major problems of biology as DNA–protein and
protein–protein recognition. Investigation of phage
genomes and gene products may be useful with regard
to practical applications in biotechnology as well as in
some fields of medicine. The basic characteristics and
molecular properties of the phages included in the
multicomponent preparations are being determined, as
are the mechanisms by which the
bacterial viruses degrade host cellular
structures to ensure their own successful replication. Comparative analyses of the interaction mechanisms of
phages with different host cells provide useful information as to what is common in these processes, thus
illuminating some peculiarities of viral evolution.
Eastern Europe
The Soviet Union and Eastern Europe maintained a
strong emphasis on both basic and applied research on
phages in many other institutes and universities, as
well. The other major facility that has been particularly instrumental in work with phage therapy since
1957 is the Hirszfeld Institute of Immunology and
Conclusions
While it may be premature to generally introduce
injectable phage preparations in the West without
further extensive research, the carefully implemented
use of phages for a variety of agricultural purposes and
in external applications could potentially soon help
reduce the emergence of antibiotic-resistant strains.
Phages are also especially useful in dealing with challenging nosocomial infections, where large numbers
of particularly vulnerable people are being exposed to
the same strains of bacteria in a closed hospital setting.
In this case, the environment as well as, eventually, the
patients can be effectively treated using phages.
New techniques for the detailed genetic and
physiological characterization of phages isolated
from nature, for the rapid characterization of the
pathogens involved in a specific disease process and
for the eventual intentional modification of potential
therapeutic phages offer further promise in the
development of powerful phage therapy approaches
for restoring microbial balance and improving our
health and that of our ecosystem.
Further Reading
References
Bacteriophages
E Kutter
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0106
Bacteriophages are viruses that specifically infect bacteria. Like all viruses, they are obligate parasites.
While they carry all the information to direct their
own reproduction in an appropriate host, they have no
machinery for generating energy and they have only
one kind of nucleic acid, either DNA or RNA. Each
phage consists of a piece of genetic information, determining all of the properties of the virus, which is
packaged in a protein coat. Phages are like `space
ships' that carry genetic material from one susceptible
bacterial cell to another and then reproduce in the cell
where they land.
Phages are found in large quantities wherever their
hosts are found: in sewage and feces, in the soil, in
deep thermal vents, in drinking water. Wommack and
Colwell (2000) provide an excellent review of phages
in aquatic ecosystems, including their key roles in
maintaining the food web in the oceans, where their
numbers surpass those of bacteria by an order of
magnitude. Their high level of specificity, long-term
survival, and ability to reproduce rapidly given appropriate hosts contribute to bacteriophages maintaining
a dynamic balance among the wide variety of bacterial
species in any natural ecosystem. When no hosts are
present, they can maintain their infectivity for years
unless damaged by something in the environment.
Phages are susceptible to agents such as UV that
damage nucleic acid and most are killed by drying out
or freezing, but the majority are not susceptible to
organic solvents and many other forms of sterilization.
The target for each bacteriophage is a specific group
of bacteria. Bacteriophages cannot infect the cells of
more complex organisms because of major differences
in key intracellular machinery as well as in the specific
cell-surface proteins to which they must bind to infect
a cell. Most phages have tails, the tips of which have
Symbol  Family            Genus           Features                    Nucleic acid    Example
M       Myoviridae        -               Contractile tail            DNA, ds, L      T4
S       Siphoviridae      -               Long, noncontractile tail   DNA, ds, L      lambda
P       Podoviridae       -               Short tail                  DNA, ds, L      T7
Ii      Inoviridae        Inovirus        Long filament               DNA, ss, C      fd
Ip      Inoviridae        Plectovirus     Short rod                   DNA, ss, C      MVL1
Mi      Microviridae      Microvirus                                  DNA, ss, C      φX174
C       Corticoviridae    Corticovirus    Lipid-containing capsid     DNA, ds, C, S   PM2
T       Tectiviridae      Tectivirus      Double capsid               DNA, ds, L      PRD1
L       Leviviridae       Levivirus                                   RNA, ss, L      MS2
Pl      Plasmaviridae     Plasmavirus     Envelope, no capsid         DNA, ds, C, S   MVL2
Cy      Cystoviridae      Cystovirus      Envelope                    RNA, ds, L, M   φ6
SSV1    SSV1 group        SSV1 group      Lemon-shaped                DNA, ds, C, S   SSV1
Li      Lipothrixviridae  Lipothrixvirus  Lipid-containing envelope   DNA, ds, L      TTV1
-, None established; ds, double-stranded; ss, single-stranded; L, linear; C, circular; S, supercoiled; M, multipartite.
showed that large amounts of labeled DNA are passed
on to the next generation, while virtually no parental
protein is contained in the new phage.
The Hershey–Chase experiment was the classical
demonstration that DNA is the stuff of heredity, so
for this reason it is important to all of biology. But it
also clearly established the general pattern of phage
growth, and it explained the eclipse period. The first
event following adsorption of the phage particle must
be injection of its DNA. The DNA takes over the
cellular apparatus and initiates the synthesis of new
phage proteins; but whole phage particles are not
made until after about 11–12 min, and then their
numbers increase rapidly.
[Figure 1; panel labels: P, Ip, M, Mi, Pl, Cy, Li.]
Specific Bacteriophages
Several specific phage groups are discussed in some
detail elsewhere (see T Phages; Phage Mu; Filamentous
Bacteriophages; Archaea, Genetics of). There are also
several articles on topics important to genetics of a
variety of phages (Phage Recombination; Lambda Integration; Rolling-Circle Replication; Transcription). To
give a better idea of the breadth of properties of phages,
we also include here a survey of some of the other phages
that have played important roles in genetic analysis and
understanding of gene function. They are explored in
much greater detail in Webster and Granoff (1994) and
Calendar (1988). The complete sequences of many of
them have now been determined and are available in
the bacteriophage section of the genome site at NCBI
(http://www.ncbi.nlm.nih.gov:80/). This repository
also contains the sequences of many prophages determined in the course of microbial genome projects, as
well as lambdoid and mycobacterial phages analyzed
as part of the comparative genomics project at the
Pittsburgh Bacteriophage Institute, under the direction of Graham Hatfull and Roger Hendrix.
P1
P1 is a temperate phage that is unusual in that its prophage most commonly exists in plasmid rather than
integrated form. It is particularly useful because it can
carry out generalized transduction of markers between
strains of E. coli and Shigella. P1 also encodes its own
restriction-modification system, the study of which by
Werner Arber played a major role in determining the
biology of such systems and laying the foundation for
genetic engineering. P1 belongs to the Myoviridae. Its
93 601 bp genome, encoding 110 genes, is terminally
redundant and circularly permuted in the phage particle, being packed by a headful mechanism from a concatemeric genome that is produced by rolling-circle
replication. Packaging into the first prohead is initiated
from a distinct pac site that must be dam methylated (by
either the host or the phage system) to function and
interacts with specific packaging machinery, which
then continues to insert headfuls of DNA (with about
10% redundancy) into subsequent proheads. P1 has
an 85 nm icosahedral head, but variable numbers of 65
and 47 nm heads are also produced, depending on phage
and host strain and on conditions; these heads package
partial genomes.
Genes for related functions are scattered throughout the genome, as in T4, rather than being tightly
clustered as in lambdoid phages. Two different origins
of replication are used: oriR in the prophage state; oriL
in the lytic phase. The two modes of replication also
have different requirements for host proteins. Though
present in only one to two copies per cell, the P1 prophage plasmid is only lost once per 100 000 cell divisions. This very high efficiency of maintenance is due
to the fact that it encodes its own effective partition
function, par, to ensure that the daughter chromosomes are properly divided between the two daughter
cells. P1 has a particularly complex set of immunity
functions involved in maintaining the prophage state
and excluding other phages. It also encodes an antirepressor protein that is capable of blocking the action
of the repressor and thus causing the prophages to be
activated to the lytic mode. However, activity of the
antirepressor is tightly controlled in the prophage
state by a 77 bp antisense mRNA.
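The value of an active partition function can be appreciated by comparing the observed loss rate with what random segregation would give. The sketch below is a back-of-the-envelope model only: it assumes that, without partitioning, each plasmid copy present at division goes to either daughter independently with probability 1/2. The function name and the copy numbers at division are assumptions for illustration.

```python
def loss_probability_random(copies_at_division: int) -> float:
    """Probability that one of the two daughter cells receives no plasmid
    when every copy is assigned to a daughter independently (p = 1/2)."""
    if copies_at_division < 1:
        raise ValueError("at least one plasmid copy is required")
    # All copies go to daughter A, or all copies go to daughter B.
    return 2 * 0.5 ** copies_at_division


# Loss rate reported in the text for the P1 prophage plasmid.
OBSERVED_LOSS = 1 / 100_000

for copies in (2, 4):  # assumed copy numbers at the time of division
    random_loss = loss_probability_random(copies)
    print(f"{copies} copies: random loss {random_loss:.3f} per division, "
          f"about {random_loss / OBSERVED_LOSS:,.0f}-fold above the observed rate")
```

Under these assumptions random segregation would lose the plasmid thousands of times more often than observed, consistent with the need for the par function.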
P1 makes generalized transducing phages that
include all parts of the host genome equivalently
(and little or no phage DNA). This implies that the
occasional packaging of sections of the host genome
continues until all of the DNA molecule is used and/
or the packaging apparatus recognizes a number of
host sites as if they were pseudo-pac sites from
which DNA packaging can be initiated. Another unusual feature is that P1 extends its host range by
encoding two different versions of its tail-fiber genes
on a 4.2 kb invertible `C-segment' that is largely
homologous to the smaller G-segment of phage Mu
(see Phage Mu); both seem to recognize lipopolysaccharide moieties.
P2 and P4
of P4 requires the cox gene of P2, which activates
transcription from the P4 lytic promoter.
Replication of P2 occurs via a rolling-circle
mechanism that includes its own site-specific initiation
functions but otherwise relies on host genes. The
related phage 186 also encodes a protein that depresses
host replication, which enhances the phage burst size
but is not essential. In contrast, P4 replicates bidirectionally from a unique ori (origin) that requires a
second site 4.5 kb away and several phage proteins,
but only two host proteins: PolIII and Ssb, the single-stranded DNA-binding protein. P2 DNA is packaged
from monomeric circles rather than from linear DNA
and requires a 125 bp region including a site that is
cleaved to give 19 bp cohesive ends.
P22
Cyanophages
Lipid-Containing Phages
the 28S ribosomal RNA; both the structure and the
mechanism are related to those of ricin and related
plant toxins.
Clostridium botulinum
Diphtheria toxin
Phage Evolution
There has long been interest in where viruses come
from, how they acquire their special properties and
genes, and how they relate to each other. In 1980,
David Botstein suggested that lambdoid phages, at
least, are put together in a sort of mix-and-match
fashion from an ordered set of modules, each of
which may have come from a particular host, plasmid,
or other phage. It is now generally agreed that bacteriophages are very ancient, as ancient as the bacteria
that they infect. Within each large family of phages, a
common general gene order is preserved, facilitating
large-scale recombination among them; this has been
particularly well studied in the lambdoid phages,
many of which have been sequenced at the Pittsburgh
Phage Institute. Harald Bruessow has provided similar data for temperate phages of the gram-positive
lactic acid bacteria. In addition, there is strong evidence
for considerable intervirus recombination through
simultaneous infection or recombination with prophages. This eventually can lead to unrelated temperate
bacteriophages from distant bacterial groups possessing homologous genes. This has been particularly
well demonstrated with Andrew Kropinski's recent
completion of the sequence of bacteriophage P22.
Significant homologies are seen between P22 and not
only other members of Podoviridae, but also those of
Myoviridae and Siphoviridae, all of them temperate
phages, but infecting a variety of different gram-negative bacteria. In addition, the pair of genes
involved in O-antigen conversion are related to those
of phage Sfx, a member of the Inoviridae. All of this
supports the long-held concept that the temperate
tailed phages, at least, are mosaics built of a series of
modules or cassettes. The extent of the relationships
and the degree of apparent randomness are particularly interesting. For example, while the holin gene is
related to that of phage lambda, the lysozyme is not,
even though the main purpose of the holin is to give
the lysozyme access to the peptidoglycan layer at the
time that lysis is to occur.
The generalizations that the Pittsburgh group have
made suggesting that all double-stranded DNA phages
are ``mosaics with access, by horizontal exchange, to
a large common gene pool'' seem likely to apply to a
substantial degree to the temperate phages, but the
case is much less clear for the large lytic phages. T4,
with 168 903 bp, is the only such phage whose
sequence has been completed. Only about 12% of the
Further Reading
References
Baculovirus System
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.2082
Balanced Polymorphism
R S Singh
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0107
The traditional notion of genetic polymorphism assumed that genes were either essentially monomorphic
or highly polymorphic. Now we know that there is a
continuum of genetic variation and that most, if not
all, genes harbor some genetic variation; the amount
and type of variation depending on the nature and the
strength of evolutionary forces: mutation, selection,
migration, and genetic drift. `Balanced polymorphism'
implies that the polymorphism is being maintained by
the interplay of two or more evolutionary forces acting
in opposite directions. Not all genetic polymorphisms need be,
and probably not all can be, balanced polymorphisms.
For example, as beneficial mutations increase in
Three of the most interesting forms of selectively maintained balanced polymorphisms are heterozygous
advantage or overdominance, frequency-dependent
selection, and multiple-niche polymorphism.
Heterotic-balance polymorphisms develop when
the fitness of the heterozygotes is higher than
that of either homozygote. A classic case of balanced
polymorphism in human populations is that of sickle-cell anemia. A mutation in the hemoglobin gene (βS)
leads to an alteration in the hemoglobin protein such
that the homozygous (βSβS) genotype is effectively
lethal because individuals die of anemia. This would
lead to elimination of the βS allele from populations,
except that in regions where there is malaria the normal
homozygous (βAβA) individuals suffer relatively more
mortality from malaria (caused by Plasmodium falciparum) than heterozygous individuals (βAβS). The
latter enjoy the highest fitness, as they are protected from both anemia and malaria. The loss of βS
alleles due to anemia is compensated (at equilibrium) by
the loss of βA alleles from malaria and thus both alleles
rare-male mating advantage in Drosophila, self-incompatibility alleles in plants, and bird predation
in colored moths and snails. Self-incompatibility
genes in plants control germination of pollen on the
female stigma and discriminate between self and non-self. Successful pollination only occurs when pollen
and stigma are of different types. In this situation
a population would theoretically maintain multiple
self-incompatibility alleles. Several types of self-incompatibility genes are known and they provide
an example of some of the most highly polymorphic,
multiallelic genes known, second only to the major histocompatibility complex (MHC) genes in humans.
An ecologically important form of balanced polymorphism develops when a population's environment is heterogeneous in a way that favors different
genotypes in different environments (multiple-niche
polymorphism). A combination of random mating,
niche-specific genotypic fitness, and niche-specific
contribution to total population numbers can lead to
a balanced polymorphism.
There are many other forms of selection, such as
seasonal selection within a year, cyclic selection over
generations, and selection between different life
stages, that can lead to balanced polymorphisms.
However, it is important to realize that balanced polymorphism, especially in the case of heterotic balance, is
maintained through loss of fitness (lower viability or
fertility) of disadvantaged individuals, which leads to
a greater loss in population number than would be the
case without balanced polymorphism. Therefore, it has
been argued that there must be a limit to the number of
gene loci that can be maintained by balancing selection. In contrast to the polarized view of 50 years ago,
the problem of the amount of genetic variation has now
been separated from the problem of its maintenance.
There appears to be more genetic variation in most
populations than can be maintained by balancing
selection alone. Mutation rate and population size, in
addition to selection, are important variables affecting
genetic variation.
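The heterotic balance described above for sickle-cell anemia can be illustrated with the standard one-locus, two-allele viability selection model. The following is a minimal sketch, not taken from the article: the function names and the selection coefficients s (against the normal homozygote) and t (against the sickle-cell homozygote) are illustrative assumptions, and random mating is assumed.

```python
def next_allele_freq(q, s, t):
    """One generation of viability selection at a two-allele locus.

    q is the frequency of the S allele and p = 1 - q that of the A allele.
    Assumed relative fitnesses: AA = 1 - s, AS = 1, SS = 1 - t.
    """
    p = 1.0 - q
    mean_fitness = p * p * (1.0 - s) + 2.0 * p * q + q * q * (1.0 - t)
    # S alleles after selection: all alleles of SS survivors, half of AS.
    return (q * q * (1.0 - t) + p * q) / mean_fitness


def equilibrium_freq(s, t):
    """Analytical equilibrium frequency of S under heterozygote advantage."""
    return s / (s + t)


def simulate(q0, s, t, generations=2000):
    """Iterate the recursion from a starting frequency q0."""
    q = q0
    for _ in range(generations):
        q = next_allele_freq(q, s, t)
    return q
```

With the hypothetical values s = 0.1 and t = 1.0 (a lethal homozygote), the predicted equilibrium frequency of the S allele is 1/11, about 0.09, and the iteration approaches that value whether it starts from a rare or a common allele. This is the sense in which the loss of one allele through anemia is balanced by the loss of the other through malaria.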
See also: Frequency-Dependent Selection; Nearly
Neutral Theory; Neutral Theory; Sickle Cell
Anemia
Balanced Translocation
M L Budarf
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0108
[Figure: meiotic segregation in a balanced translocation carrier. Alternate segregation gives normal and balanced-carrier gametes; adjacent-1 segregation gives unbalanced gametes.]
References
BALB/c Mouse
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0109
Balbiani Rings
J C J Eeken
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0110
Further Reading
Barr Body
J Read and S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.2083
Further Reading
Base Composition
J Parker
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0115
Figure 1 Structures of base pairs and base mispairs. (A) T.A Watson-Crick base pair; (B) C.G Watson-Crick
base pair; equilibrium between (C) favored keto tautomer of T and (D) rare enol tautomer of T; (E) T(enol).G base
mispair; pH-dependent equilibrium between (F) T (keto) and (G) T (ionized); pH-dependent equilibrium between (H)
A (amino) and (I) A (protonated); (J) T.G wobble base mispair; equilibrium between (K) C.A (ionized) base mispair
and (L) C.A reverse wobble mispair; (M) N4-methoxycytosine (imino).A base mispair; (N) N4-methoxycytosine
(amino).G base pair; pH-dependent equilibrium between (O) BrU.G wobble base mispair and (P) BrU (ionized).G
base mispair.
which are observed as a single, predominant configuration, base mispairs may exist as a family of
structures in equilibrium with one another.
Numerous experimental studies have been conducted over the past few years on a variety of mispairs
in DNA involving mutagenic base analogs such as
5-bromouracil and 2-aminopurine, and bases chemically modified by carcinogens including those damaged by oxidation and alkylation. The picture which
is emerging from these studies is that essentially all
mispairs examined to date are best represented as a
family of configurations which are in equilibrium with
one another. Such equilibria may involve ionization,
rotation of purine residues around the glycosidic bond
from the normal anti to a syn conformation, and even
tautomerization.
To date, the only confirmed tautomeric equilibrium
within a base mispair in DNA, observed by either
NMR spectroscopy or X-ray crystallography, involves N4-methoxycytosine. This modified base,
formed by reaction of methoxyamine with cytosine,
is preferentially in the unusual imino configuration.
However, the energy difference between tautomeric
forms is sufficiently small that the tautomeric forms
observed in DNA may be altered by changing the base
paired opposite the modified base (Figure 1M, N).
As most modified base pairs represent a family of
structures in equilibrium with one another, which of
these forms represents the configuration which results
in incorporation of the mispaired deoxynucleoside
triphosphate by DNA polymerase? This question
comprises the thrust of current and future experimental research efforts.
With normal base pairs, the energy of both hydrogen bonding and base-stacking is optimized when the
base pair assumes Watson-Crick geometry. With mispairs, however, the geometric configuration in which
base stacking is optimum might correspond to a configuration with a substantial steric clash between
hydrogen bonding protons or the positioning of two
highly electronegative heteroatoms directly in front
of one another. The most stable among the possible
base pair configurations formed between mismatched
bases is usually not Watson-Crick. As a consequence,
mispair formation generally results in a substantial
decrease in the thermal and thermodynamic stability
of a DNA duplex.
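To give a feel for what a decrease in thermodynamic stability means quantitatively, the sketch below converts a free energy difference between two states into the ratio of their equilibrium populations using the Boltzmann relation, ratio = exp(ΔΔG/RT). The function name, the default temperature, and the example value of ΔΔG are illustrative assumptions and are not taken from the article.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def population_ratio(ddg_kcal_per_mol, temperature_k=310.15):
    """Equilibrium ratio of the more stable to the less stable state.

    ddg_kcal_per_mol is the free energy difference between the two states
    (a hypothetical input here, in kcal/mol); the ratio is exp(ddG / RT).
    """
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temperature_k))


if __name__ == "__main__":
    # Hypothetical 1 kcal/mol difference, evaluated at 25 degrees C.
    print(population_ratio(1.0, 298.15))
```

Under this relation, each additional 1.4 kcal/mol or so of free energy difference near room temperature corresponds to roughly a tenfold change in the equilibrium ratio, which indicates how much discrimination stability differences alone could provide.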
One view of the mechanism of correct base selection by a polymerase is that DNA polymerase can
discriminate between correct and incorrect base pairs
by free energy differences. As mispairs are generally
less stable, an incorrect deoxynucleoside triphosphate
would dissociate more readily from the replication
complex, and thus be less likely to be incorporated
covalently by DNA polymerase. Although mispairs
Further Reading
Base Substitution Mutations
W A Rosche and P L Foster
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0118
[Figure: base-pair substitutions among G:C, A:T, C:G, and T:A, classified as transitions and transversions.]
Further Reading
Reference
Bases
E J Murgola
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0113
Baur, Erwin
W-E Lönnig and H Saedler
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1716
Two scientists are usually credited with discovering non-Mendelian, or cytoplasmic, inheritance in
plants at about the same time (1909): Erwin Baur and
Carl Correns. However, it was definitely Baur who
clearly drew attention to separate plastid inheritance
whilst Correns developed a hypothesis that only the
Baur began working with Antirrhinum in 1904, publishing the first description of a (semi-dominant) lethal
gene mutant in 1907 (the Aurea gene). Baur's book on
his Antirrhinum studies (1924) lists and discusses 29
genes including three cases of multiple alleles. The first
gene linkages in Antirrhinum were published by Baur
for the genes Eluta-rosea-pallida in the years 1911 and
1912. In 1927, after hearing Muller, Baur and his student Hans Stubbe worked on induction of mutations
by X-rays and other mutagenic agents. Following
Baur's death, Stubbe continued this work, and published his Antirrhinum monograph in 1966, with a
description of all the mutants obtained so far.
For the first homeotic plant gene to be cloned,
Baur's deficiens mutant of 1917 was used (Sommer
et al., 1990), followed by investigations on several
other homeotic Antirrhinum mutants from his collection (Theissen and Saedler, 1999).
Seed Collections
Baur began work on seed collections in 1911, gathering original wheat and oat lines. After World War I he
intensified these activities, participating in several
expeditions to Turkey, Spain, Portugal, and South
America. Also, from about 1927 onwards, Baur
was in contact with the famous Russian geneticist
Nicolai I. Vavilov, who, independently of Baur,
had recognized the key function of large seed banks
for future recombinant plant breeding. Baur's students
continued his work in Germany, resulting in the
Gatersleben collection of more than 50 000
original and cultivated plant lines. The enterprise of
Baur and Vavilov can be viewed as the first scientific
undertaking of the modern era for the conservation of
biodiversity.
Publications
Concluding Remarks
Erwin Baur was an ambivalent character. He was an
internationalist in science and nationalist at home. He
is described as having been generous and liberal in his
treatment of his coworkers, and also as having displayed
an all-embracing claim to leadership so that his
coworkers had difficulties in developing their own
scientific profile; he is said to have been entirely free
from vanity and self-interest and also to have been an
inconsiderate egocentric who sometimes even proclaimed research results of his coworkers as his own.
In general, his behaviour at the institute as well as to
his wife and four children has been described as open-minded and communicative (displaying a good sense
of humor), yet to his coworkers also as sometimes
terribly quarrelsome.
The last year of his life was overshadowed by severe
financial problems at his institute, the Kaiser-Wilhelm-Institut für Züchtungsforschung in Müncheberg (later the Max-Planck-Institut für Züchtungsforschung). The institute was founded on condition
that Baur himself raise the money necessary for
its maintenance from industry, administration, and international societies, which proved to
be too heavy a burden after some four years. Personal
and political problems added to his difficulties, and Baur died on 2 December 1933 after an
acute heart attack.
Hans Stubbe made a commemorative speech on
Erwin Baur in 1958. After mentioning his strengths
and weaknesses he closed his statement as follows:
He was a man, take him for all in all,
I shall not look upon his like again.
Further Reading
References
Bayesian Analysis
R D Blank
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0119
N
R
0:5R 0:5N
35
35
N
1
R
DR DN
$$P(\text{linkage} \mid \text{data}) = \frac{P(\text{data} \mid \text{linkage})\,P(\text{linkage})}{P(\text{data})}$$
The linkage confidence interval can be calculated simply
by choosing limits of integration for P(linkage|data)
to include (1 − significance level), as the contributions
of each location on the genome to P(linkage|data) sum
to unity. It is worth noting that P(linkage|data) is in
fact the quantity that geneticists wish to determine.
This Bayesian interpretation also reminds us that the
judgement of linkage and map distance is contingent
on the available data.
It appears from the above discussion that, in practice, Bayesian analysis has little effect other than to
demand more stringent criteria for a given level of
significance. This observation may explain why a
Bayesian significance level of 0.05 is approximately
equivalent to a LOD score of 3.0, reflecting the
approximate factor of 20 by which P(no linkage)
exceeds P(linkage) in the absence of prior data. This
is less important, however, than appreciation that the
statistical questions addressed by maximum likelihood
and Bayesian interpretations differ as outlined above.
The crux of this difference is that the Bayesian interpretation leads naturally to evaluation of P(linkage|data)
at every location in the genome and that over all
locations, these must sum to 1. The Bayesian interpretation partitions a finite, normalized P(linkage|data)
while the maximum likelihood interpretation compares the likelihood of the best location to no linkage,
without considering the excess of unlinked locations
in the genome.
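The relationship between a LOD score and the Bayesian posterior discussed above can be sketched numerically. This is a minimal illustration under stated assumptions: the function names are hypothetical, and the default prior odds of 20 to 1 against linkage simply reuse the approximate factor quoted in the text.

```python
def posterior_linkage(lod, prior_odds_against=20.0):
    """Posterior P(linkage | data) from a LOD score and prior odds.

    lod is log10 of the likelihood ratio (linkage versus no linkage).
    prior_odds_against is P(no linkage) / P(linkage) before seeing data.
    """
    likelihood_ratio = 10.0 ** lod
    posterior_odds = likelihood_ratio / prior_odds_against
    return posterior_odds / (1.0 + posterior_odds)


def normalize(weights):
    """Scale per-location weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    print(posterior_linkage(3.0))       # LOD of 3.0 with 20:1 prior odds
    print(normalize([1.0, 1.0, 2.0]))   # toy weights for three locations
```

With prior odds of 20 to 1 against linkage, a LOD score of 3.0 (a likelihood ratio of 1000) gives posterior odds of 50 to 1 in favor of linkage. The `normalize` helper mirrors the requirement that P(linkage|data) over all locations sums to 1, whereas a maximum likelihood comparison considers only the best location against no linkage.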
Further Reading
Bcl-2 Gene Family
group, however, is much more distant and heterogeneous. Its members, which include the mammalian
Bik, Bad, Bid, Bim, Hrk, and Noxa as well as the
nematode EGL-1, have only the short (9-16 residue)
BH3 domain in common with the family and with
each other. This amphipathic α-helix is necessary and
perhaps sufficient for the apoptogenic action of the
``BH3-only'' proteins and is also very important in
that of the Bax subfamily.
The opposing action of various Bcl-2 family members is in part due to their ability to form heterodimers, and structural studies on Bcl-xL revealed the
basis. A hydrophobic groove on the surface of Bcl-xL,
formed by the convergence of α-helices in its BH3,
BH1, and BH2 regions, can bind tightly to the BH3
α-helix of an apoptogenic family member. This interaction is thought to inactivate pro-survival function,
or perhaps even render the molecule apoptogenic.
In addition to the BH domains, most members of
the Bcl-2 and Bax subfamilies and certain of the BH3-only proteins also possess a hydrophobic C-terminal
domain. It facilitates their targeting to the surface of
the mitochondria, endoplasmic reticulum, and nuclear
envelope, the sites where the majority of pro-survival
molecules typically reside and where the apoptogenic
ones gather during apoptosis.
To preclude unwarranted apoptosis, healthy cells
prevent heterodimerization between family members
in various ways. Some of the apoptogenic genes, such
as those for EGL-1, Hrk, or Noxa, are transcribed
predominantly only after particular apoptotic stimuli.
Most apoptogenic members, however, are constitutively made but restrained by their conformation or
subcellular localization. In Bid, for example, the BH3
domain is buried and becomes exposed only when Bid
has been cleaved by, for example, caspase-8, which is
usually activated by ``death receptors'' of the TNF
family. With Bad, phosphorylation at different sites
allows its sequestration by 14-3-3 proteins or directly
masks its BH3 domain. Bim, on the other hand, is
rendered inactive by its association with the dynein
motor complex on microtubules. Finally, the conformation of Bax keeps it in the cytosol until an unknown
signal triggers its oligomerization and translocation to
organelles such as mitochondria.
cytochrome c, an essential cofactor for Apaf-1, and
evidence that certain cells, albeit not others, lacking
Apaf-1 or caspase-9 are refractory to cytotoxic stimuli
that Bcl-2 can regulate.
Further Reading
Adams JM and Cory S (2000) Life-or-death decisions by the Bcl-2 protein family. Trends in Biochemical Sciences 26: 61-66.
Gross A, McDonnell JM and Korsmeyer SJ (1999) Bcl-2 family
members and the mitochondria in apoptosis. Genes and
Development 13: 1899-1911.
Horvitz HR (1999) Genetic control of programmed cell death in
the nematode Caenorhabditis elegans. Cancer Research 59:
1701s-1706s.
Sattler M, Liang H, Nettesheim D et al. (1997) Structure of
Bcl-XL-Bak peptide complex: recognition between regulators of apoptosis. Science 275: 983-986.
Vaux DL, Cory S and Adams JM (1988) Bcl-2 gene promotes
haemopoietic cell survival and cooperates with c-myc to
immortalize pre-B cells. Nature 335: 440-442.
BCR/ABL Oncogene
R A Van Etten
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1550
The BCR/ABL oncogene is the product of the Philadelphia chromosome (denoted Ph1 or simply Ph). The
Ph chromosome was first identified over 40 years
ago by Nowell and Hungerford in Philadelphia as an
abnormally short G-group chromosome in blood cells
from patients with the myeloproliferative disease
chronic myelogenous or myeloid leukemia (CML),
and has the distinction of being the first genetic
abnormality identified in a human cancer. The development of chromosome-banding techniques allowed
the identification of the Ph chromosome as the der22
product of a balanced translocation between chromosomes 9 and 22, t(9;22) (q34.1;q11.21). The structure
of the breakpoint on the Ph chromosome was elucidated in 1984, with the demonstration that the c-ABL
gene on chromosome 9 was translocated to a restricted
region of about 6 kb on chromosome 22 called the
breakpoint cluster region, or bcr. Subsequent characterization of this region demonstrated that bcr was in
the middle of a protein-coding gene composed of
25 exons, now called BCR. Conversely, the breakpoint in the ABL gene on chromosome 9 was variable,
and located in a large first intron of over 250 kb. The
c-ABL gene encodes a nonreceptor protein-tyrosine
kinase, c-Abl (see c-ABL Gene and Gene Product),
while the BCR gene product is a 160 kDa cytoplasmic
phosphoprotein, Bcr. As a consequence of the Ph
translocation in CML, the first 13 or 14 exons of
BCR are fused 5′ to the second exon of c-ABL, with
maintenance of the translational reading frame. Transcription of the chimeric gene and RNA splicing generates an 8.5 kb mRNA encoding a fusion protein of
210 kDa, designated p210 Bcr/Abl. There are actually
two alternative p210 fusion genes (usually designated
b2a2 and b3a2) found in different CML patients,
depending on whether BCR exon 14 is included in
the fusion. The b3a2 p210 protein is 25 amino acids
longer than the b2a2 isoform. The reciprocal translocation product from the der9 chromosome is a
chimeric ABL/BCR gene whose reading frame is also
intact, but the variable expression of this gene in CML
patients argues against a significant role for this fusion
gene in leukemogenesis.
The Ph chromosome is found in over 90% of
patients with clinical features of CML (which include
leukocytosis with increased maturing myeloid cells,
hepatosplenomegaly, and basophilia). Among the
infrequent patients with CML-like disease that lack
the Ph chromosome, about half demonstrate molecular evidence of fusion of BCR and ABL when Southern blotting, reverse transcriptase polymerase chain
reaction, or fluorescence in situ hybridization are
employed. These patients are likely to have multiple
chromosomal rearrangements in their leukemic cells
that obscure the Ph translocation. Of the remaining
Ph-negative patients, many have atypical clinical features and may represent another disease, such as a
myelodysplastic syndrome. A small number may
have a variant 9q34 translocation that fuses c-ABL to
another gene, such as TEL (ETV6) on chromosome
12p13. Besides CML, the Ph chromosome is also
found in several other human hematologic malignancies, most commonly in some cases of B-lymphoid
acute lymphoblastic leukemia (B-ALL), and infrequently in acute myeloid leukemia, non-Hodgkin's
lymphoma, and multiple myeloma. The majority of
adult and pediatric patients with Ph-positive B-ALL
have a distinct type of chimeric BCR/ABL gene that
results from a chromosome 22 breakpoint within the
first intron of BCR, rather than in the classic bcr
region. The product of this chimeric gene (denoted
e1a2) is a fusion protein of 190 kDa, p190 Bcr/Abl
(also referred to as p185 in some references). A third
and less common variant fuses BCR exon 19 to ABL
exon 2 (e19a2), generating a p230 form of Bcr/Abl.
Rare patients with fusion of BCR exon 1 or 13 to ABL
exon 3 (b1a3 and b2a3) and BCR exon 6 to ABL exon
2 (b6a2) are also observed; the translational
reading frame is preserved in all cases.
While neither Bcr nor c-Abl proteins will transform cells, the Bcr/Abl fusion protein transforms
fibroblasts, hematopoietic cell lines, and primary
bone marrow cells in vitro. Furthermore, expression
of the BCR/ABL gene in hematopoietic cells in mice
Bead Theory
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0121
the gene cannot be separated into smaller parts that
are themselves mutable or able to be recombined.
Benzer's uncovering of the fine structure of the gene
rendered the bead theory obsolete.
See also: Alleles
Beckwith-Wiedemann Syndrome
S Malcolm
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1551
Behavioral Genetics
C P Kyriacou
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0122
Overview
Behavioral genetics is one of the oldest areas within
the discipline, and, until relatively recently, one of the
least well-developed, for two reasons. The first is
This crude eugenics message has found many followers in more recent times. Francis Galton, as a
young man, took distinguished Victorians (men, of
course) whom he considered to be geniuses in their
own fields, and examined the eminence of their male
relatives. He observed that the closer the genetic relationship, the more likely were their relatives to be
successful, suggesting to him a genetic predisposition
for mental ability. As an older man, Galton founded
the Eugenics Society, and every decade there was
an international meeting to discuss genetics and the
``directed'' evolution of human behavior. The first two
meetings were attended by the ``who's who'' of the
genetics world. The third meeting, held in New York
in August 1932, had been abandoned by reputable
geneticists, and had descended into an unpleasant
farce, with paper after paper advocating mass sterilization of moral or mental ``inferiors.'' Unfortunately, this
message was taken literally in the USA and marked a
dark period in its history with tens of thousands of
forced sterilizations. This was to be taken much further
in Germany, culminating in the Holocaust.
In this climate of genetic determinism, mostly advocated by the medical profession, psychologists interested in the genetic analysis of behavior either kept
their heads down or, like JB Watson, put forward the
opposite view of extreme environmentalism. Meanwhile, during the 1920s and 1930s, Edward Tolman
and his student Robert Tryon were performing their
classic experiments on the genetic basis of maze
learning in rats. This work was not fully published
until the late 1940s, when it stimulated the field we
now call behavioral genetics. There were two schools
in those early days of the 1950s, the American
In the late 1960s, Seymour Benzer, a geneticist working in California, advocated a new approach to studying behavior. Using Drosophila as the model system
he suggested that instead of investigating polygenic
inheritance through strain differences, chemical mutagenesis should be used to induce single gene mutations
in the behavior of choice. Setting up ingenious mass-screening techniques, and clever genetic tricks, he
succeeded in generating many mutations in simple
behaviors such as phototaxis, geotaxis, flight, locomotor activity, and coordination. The mutations
could then be mapped, and by use of genetic mosaics
and fate mapping (see Neurogenetics in Drosophila),
the region of the nervous system responsible for the
mutant behavior could be roughly identified. In
essence, he was using the mutated gene and behavior
as scalpels to dissect the function of the nervous
system. This gave rise to the term `neurogenetics,'
and represented a dramatic departure from the quantitative genetics methodologies used previously to study
behavior.
Initially, this new work caused enormous problems
for the more traditionally minded behavioral geneticists. They were unimpressed with the anatomical,
clock genes have been isolated, initially by mutagenesis,
and their molecular roles in generating the circadian
phenotype have been elucidated (see Clock Mutants).
Furthermore, intraspecific natural variation in the
coding regions of the per gene has been shown to
have important implications for the fly's Darwinian
fitness, with natural selection distributing this variation geographically along a latitudinal cline in Europe
and Australia.
In addition, the per gene carries with it species-specific behavioral information, so that circadian
locomotor activity patterns of different fly species
can be transferred between species by interspecific
transformation. This also applies to another species-specific phenotype controlled by the per gene, the
60-sec lovesong cycle generated by the male's wing
display during courtship. These results show how the
identification of per by mutagenesis has ultimately
resulted not only in dissecting out the molecular basis
of the circadian clock, but also in contributing to our
understanding of the biological clock in an evolutionary, ecological and population genetics context.
This example provides the most powerful response to
the critics of Benzer's approach, namely that it could
never tell us anything about normal behavior, nor its
evolution.
Other equally compelling stories have been developed in the study of learning and memory, sexual
behaviour, olfaction, etc. (see Neurogenetics in Drosophila). The fly remains a wonderful model system for
behavior genetic research because of its rich repertoire
of behavior and its tractable genetics. However, other
higher eukaryote models have now been developed,
primarily the nematode and the mouse. Sydney
Brenner, like Benzer, also saw the value of the single-gene approach, and extensive mutagenesis of
Caenorhabditis elegans has provided many behavioral
variants in mechanoreception, locomotion, etc. Behavior in the worm is rather less sophisticated than in the
fly, but the developmental fate of all 302 nerve cells is
known, as is the worm's complete DNA sequence, so
the potential for molecular neurogenetics is enormous
(see Neurogenetics in Caenorhabditis elegans).
Forward genetics by mutagenesis followed by behavioral screening is not very practical in the mouse,
although recently the approach has had some stunning
success in the case of the Clock circadian rhythm
mutant (see Clock Mutants). The development of
gene knockouts in the mouse has permitted the targeted
mutagenesis of known DNA sequences, and the assessment of function. This has been very informative for a
number of genes that are involved in long-term memory formation, particularly the gene that encodes
CREB, a protein that mediates cAMP-responsive
transcription, and that encoding the protein kinase
Future Prospects
There is little doubt that the new century will see
remarkable developments in animal behavioral
genetics, driven mainly by the molecular revolution.
The various mammalian genome projects will be
concluded in the next few years and the conservation
of gene function in behavior that has been seen, for
example in learning and circadian rhythms across
taxa, will inevitably provide candidate genes and
behaviors that can be analyzed in mammals, including
humans. Developments in breeding techniques, and
increased sensitivity of the mathematical tools for
identifying the gene loci that make small contributions
to behavioral phenotypes, will begin to dissect the
polygenic bases of behavior, particularly in tractable
organisms such as mice. Linkage studies will continue
to be used in attempts to find loci that make contributions, however small, to both normal and abnormal
phenotypes in human behavior. As the field expands
and becomes an area within molecular neuroscience,
important political and ethical questions will have
to be addressed about the use of the knowledge that
will be obtained. Will it be a Pandora's box, or will it
finally herald in the Age of Reason?
Further Reading
Benzer, Seymour
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0125
Beta (β)-Galactosidase
R E Huber
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0489
β-Galactosidase is an enzyme found in many organisms. The β-galactosidase that has been studied in the
most detail is encoded by the lacZ gene of the lac operon
of Escherichia coli. It catalyzes β-D-galactoside breakdown and galactosyl transfer reactions. This enzyme is
of both scientific and historical significance. Jacob and
Monod used it to study the lac operon and induction,
and later won the Nobel Prize for their work.
Structure
Metal Requirements
Lactose
Synthetic Substrates
Reaction Mechanism
Glu461 is thought to be a general acid catalyst for
cleavage of the glycosidic bond. His540, His357,
His391, Asp201, Glu461, Trp568, and Phe604 are
required for transition state stabilization, but other
residues are also undoubtedly involved. Glu537 forms
a covalent bond with galactose during the reaction and
Tyr503 is a general acid catalyst for breakage of this
covalent bond.
Further Reading
Bidirectional Replication
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1773
Bidirectional replication is the replication accomplished when two different replication forks move
away from the origin in different directions.
See also: Replication; Replication Fork
In 1968, studies on the kinetics of DNA renaturation showed that large numbers of repeated DNA
sequences form significant portions of eukaryotic
genomes (Britten and Kohne, 1968). Although Britten
and Kohne did not identify repetitive DNA in Escherichia coli, a number of repeated elements are now
known. In fact other prokaryotic genomes also contain DNA repeats, some of which are several kilobases
long and may have identical nucleotide sequences.
Such duplications include operons coding for ribosomal RNA subunits, genes for other essential cellular
functions, insertion sequences, transposons, symbiotic genes for nitrogen fixation, etc.
Many bacterial genomes also include shorter interspersed repetitive elements (<200 bp) generally found
at the 3′ end of transcription units. First described in E.
coli and Salmonella typhimurium, these repeats
include the repetitive extragenic palindrome (REP)
(Higgins et al., 1982) or PU (Palindromic Unit)
(Gilson et al., 1984) elements, as well as the 126 bp
long enterobacterial repetitive intergenic consensus
(ERIC) (Hulton et al., 1991) sequences also found
in other enterobacteria. Analysis of the E. coli genome
identified 581 REP-like sequences that are grouped
into 314 elements of one to twelve tandem copies
(Blattner, 1997). Together these sequences represent
the most abundant class of repetitive DNA in E. coli
and account for 0.54% of the whole chromosome. In
contrast, only 19 ERIC elements were found.
These short repeats often combine into complex
mosaics of motifs of up to 300 nucleotides, first characterized as bacterial interspersed mosaic elements
(BIMEs) (Gilson et al., 1991). Similarly, genomes of
various members of the Rhizobiaceae were shown to
contain modular repeats larger than 100 bp, that were
called Rhizobium-specific intergenic mosaic elements
(RIME) (Østerås et al., 1995, 1998). As in BIME, ERIC,
or REP elements, RIME1 and RIME2 sequences also
include inverted repeats that can form stem-loop structures when transcribed into RNA. Although REP elements were
shown to stabilize upstream mRNA and influence gene
expression, these functions appear to be secondary
consequences of stem-loop formation in appropriate
locations rather than reflecting a primary function of
the REP sequences (Higgins et al., 1988). Although the
exact role of short interspersed sequences remains
unknown, their scattered distribution in many different bacterial genomes helped produce genomic fingerprints using PCR-primers complementary to REP,
ERIC, and other repeats.
In contrast to the intergenic BIME, ERIC, REP or
RIME sequences, 19 of the 44 Rickettsia palindromic
elements (RPEs) identified in R. conorii are inserted
in open reading frames (Ogata et al., 2000). These
19 repeats of 106 to 150 nucleotides never interrupt
the translation frame, and are predicted to code for
a central α-helix domain flanked by two extended
or coil regions. As RPEs are found in transcripts and
lack the common features of the self-splicing inteins,
they probably form a class of DNA elements encoding a peptide insert tolerated by many arbitrary host
proteins.
References
Binomial Distribution
N Saitou
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0128
Because this probability is the kth term of the binomial expansion of $(p + q)^n$, this probability distribution is called the `binomial distribution.' Because $p + q = 1$
by definition, the sum of all terms of $(p + q)^n$ is 1,
which satisfies the requirement of a probability
distribution.
When n is not large, the binomial distribution is
skewed. For example, when $n = 12$ and $p = 1/3$,
Prob[0], Prob[1], . . . , and Prob[12] are 0.008, 0.046,
0.127, 0.212, 0.238, 0.191, 0.111, 0.048, 0.015, 0.003,
0.000, 0.000, and 0.000, respectively. The highest
probability is for Prob[4]. This is because the expectation,
$pn$, is 4.0. As n becomes larger, the probability distribution approaches the normal distribution with
the expected value as its mean. When n is large and p is small,
the binomial distribution can be approximated by the
Poisson distribution, $e^{-\lambda} \lambda^k / k!$, where $\lambda$ is the
mean ($= np$).
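The figures quoted above can be checked with a short calculation. The following is a minimal sketch in Python; the function names `binomial_pmf` and `poisson_pmf` are illustrative choices, not part of the original article.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Poisson probability of k events when the mean is lam."""
    return exp(-lam) * lam**k / factorial(k)

# The worked example from the text: n = 12, p = 1/3.
n, p = 12, 1 / 3
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
for k, pr in enumerate(probs):
    print(f"Prob[{k}] = {pr:.3f}")

# The most probable outcome and the mean of the distribution.
mode = max(range(n + 1), key=lambda k: probs[k])
mean = sum(k * pr for k, pr in enumerate(probs))

# Poisson approximation for large n and small p (values chosen for illustration).
big_n, small_p = 1000, 0.002
approx_gap = abs(binomial_pmf(2, big_n, small_p) - poisson_pmf(2, big_n * small_p))
```

The printed values round to the thirteen probabilities listed in the text, and `approx_gap` shows how closely the Poisson formula tracks the binomial one for the illustrative large-n case.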
When Mendel studied pea phenotypes in search of
the fundamental laws of genetics, he used a simple
binomial distribution, with $p = q = 1/2$ and $n = 2$. In
this case, there are only three terms ($k = 0$, 1, and 2),
and these correspond to the probability of obtaining homozygotes of one allele, heterozygotes, and
homozygotes of the other allele. Under this condition,
Prob[0], Prob[1], and Prob[2] are 1/4, 2/4, and
1/4, respectively. When the phenotype of the heterozygote is indistinguishable from one of the homozygotes, we obtain the famous 3:1 proportion under
dominance.
The so-called Hardy-Weinberg ratio is also a simple application of the binomial distribution. In this
case, the probabilities p and q correspond to allele
frequencies of two alleles at one locus in one population, and $n = 2$ (two gametes transmitted from paternal and maternal parents). Because two gametes are
united by chance under random mating, the expected
frequency of genotypes in the offspring generation
is given by the expansion $(p + q)^2 = p^2 + 2pq + q^2$.
When there are more than two alleles, we can use
the multinomial formula instead of the binomial
one. In any case, this Hardy-Weinberg ratio is
known to be a good approximation for estimating
the observed numbers of genotypes in a population, unless
the effects of random genetic drift, inbreeding, assortative mating, mutation, gene flow, and other factors are
significant.
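The Hardy-Weinberg expansion, including the multinomial case for more than two alleles, can be sketched as follows. This is an illustrative Python sketch; the function name `hardy_weinberg` and the allele labels are assumptions made for the example.

```python
from itertools import combinations_with_replacement

def hardy_weinberg(freqs):
    """Expected genotype frequencies under random mating.

    freqs maps each allele name to its frequency; the frequencies must sum to 1.
    Homozygotes get p_i**2 and heterozygotes get 2 * p_i * p_j.
    """
    if abs(sum(freqs.values()) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    genotypes = {}
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        if a == b:
            genotypes[(a, b)] = freqs[a] ** 2
        else:
            genotypes[(a, b)] = 2 * freqs[a] * freqs[b]
    return genotypes

# Mendel's case: two alleles at equal frequency.
mendel = hardy_weinberg({"A": 0.5, "a": 0.5})
# With A dominant, AA and Aa share a phenotype, giving the 3:1 proportion.
dominant_fraction = mendel[("A", "A")] + mendel[("A", "a")]

# Three alleles (hypothetical frequencies): the multinomial expansion.
three = hardy_weinberg({"A": 0.5, "B": 0.3, "O": 0.2})
```

With two alleles the result reproduces the 1/4, 2/4, 1/4 terms; with three alleles it returns the six genotype classes of the multinomial expansion.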
When we compare two homologous amino acid or
nucleotide sequences, we obtain two classes of sites
after aligning them: identical or different. The
numbers of both identical and different sites follow a binomial
distribution, ${}_nC_k\, p^k q^{n-k}$, where n is the number of
compared sites (excluding gaps), p and q are the probabilities of different and identical sites, respectively,
and k is the observed number of different sites. The
Biochemical Genetics
D M Downs
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0129
[Figure: tryptophan biosynthetic pathway. Chorismate -> Anthranilate (trpE, trpD; GLN converted to GLU, pyruvate released) -> Phosphoribosyl anthranilate (trpD; PRPP consumed, PPi released) -> 1-(o-Carboxyphenylamino)-1-deoxyribulose-5-phosphate (trpC) -> Indoleglycerol phosphate (trpC; CO2 released) -> Tryptophan (trpA + trpB; SER consumed, glyceraldehyde 3-phosphate released).]
assay is monitoring an activity that is not physiologically relevant or, (2) there is a redundant function in the
cell that masks the requirement for this enzyme
in vivo. Further experimentation can distinguish between these two possibilities. Importantly, such results have the potential to broaden our understanding
of metabolism in a way that might not be achievable if
a biochemical or genetic approach had been used in
isolation.
Summary
Biochemical genetics provides a means to rigorously
gather insight into the physiology of a living cell. The
premise behind this approach is that to understand
physiology one must know the gene, the function of
the product, and the role of this product in metabolism.
To obtain this information requires the integrative use
of both biochemical and genetic approaches. Either
approach alone, or in combination with sequence
data, can go only so far in defining the role of a gene
in cellular physiology. In each system the feasibility of
the current and future technologies must be evaluated
to ensure that the highest level of rigor possible is
being used to make functional assignments.
Further Reading
Biosynthesis of Small Molecules
M E Winkler
Simple Screening
A powerful classical approach to elucidating biosynthetic pathways has been to isolate mutants that
require the addition of a specific nutrient for growth
(`auxotrophs'). These mutants usually contain defective
variants of one or more of the enzymes necessary to
synthesize a compound that is needed for growth. For
example, suppose we want to study the biosynthesis of
a small molecule, arbitrarily called F (Figure 1), which
could be an amino acid, vitamin, pyrimidine or purine
base, or some other essential compound normally
synthesized by bacterial cells. For the sake of illustration, suppose F is synthesized in cells in five steps
Figure 1 [Diagram: genes a to e encode enzymes a to e, which convert a starting substrate through four successive intermediates into F, the end product that can be supplied as a nutrient.]
Mutagenesis
the chromosomes of bacteria, yeast, and other organisms from carrier pieces of DNA referred to as
vehicles. Transposon insertion into a gene disrupts
the protein reading frame and thereby usually inactivates the gene. In addition, transposons themselves
carry genes that impart resistance to antibiotics or
other readily selectable genetic markers. For example,
when a transposon carrying a gene that imparts resistance to an antibiotic, such as kanamycin, inserts into a
bacterial chromosome, the resulting bacterial strain
becomes resistant to kanamycin. Thus, the bacteria
in each kanamycin-resistant colony contain a transposon inserted at a specific place in their chromosomes,
and bacteria from individual kanamycin-resistant
colonies will usually have the transposon inserted at
different loci in their chromosomes. Following a
transposon jump, antibiotic-resistant colonies can be
screened for an auxotrophic requirement, such as
the need for nutrient F (Figure 1). About 1 in 2000
transposon-containing, antibiotic-resistant colonies
will typically turn out to be nutrient F auxotrophs.
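The 1 in 2000 figure gives a rough sense of how many colonies must be screened. The sketch below assumes that each colony is an independent trial with the same chance of being an F auxotroph, which is a simplification; the function names are illustrative.

```python
from math import ceil, log

AUXOTROPH_FREQUENCY = 1 / 2000  # figure quoted in the text

def prob_at_least_one(n_colonies, freq=AUXOTROPH_FREQUENCY):
    """Chance that a screen of n_colonies contains at least one auxotroph."""
    return 1 - (1 - freq) ** n_colonies

def colonies_needed(target, freq=AUXOTROPH_FREQUENCY):
    """Smallest number of colonies giving at least the target chance of success."""
    return ceil(log(1 - target) / log(1 - freq))

# Screening exactly 2000 colonies does not guarantee a hit.
p_2000 = prob_at_least_one(2000)

# Number of colonies for a 95% chance of finding at least one auxotroph.
n_95 = colonies_needed(0.95)
print(f"P(at least one in 2000) = {p_2000:.3f}; colonies for 95% = {n_95}")
```

Under these assumptions, screening 2000 colonies yields at least one auxotroph only about 63% of the time, so a screen aiming for near certainty needs roughly three times as many colonies.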
Mutant Enrichment
Chemical Analogs
The chemical structure of the end product of a biosynthetic pathway (e.g., small molecule F in Figure 1)
can be used to design analogs that can be exploited to
identify the genes involved in the steps or regulation of
that pathway. For example, a halogen atom could
be substituted for a hydrogen atom in the structure.
Numerous potential analogs of biologically active
compounds have been synthesized and many are commercially available. Chemical analogs frequently inhibit enzymes or alter the regulation of biosynthetic
pathways and thereby inhibit the growth of treated
bacterial cells. Analogs can also be used with yeast
and other kinds of cells. Depending on the enzyme
and pathway, bacterial mutants that are hypersensitive or resistant to chemical analogs can be isolated.
Hypersensitive mutants must be screened or enriched
for in populations of mutagenized cells. In contrast,
analog-resistant mutants, unlike the wild-type parent
bacteria, can grow on plates
containing the analog. Thus, analog-resistant mutants
are among the easiest classes of mutants to isolate by
direct selection.
Hypersensitivity and resistance to chemical analogs
can arise by several different mechanisms. For example, hypersensitivity can result when a mutation
increases the affinity of a biosynthetic enzyme for an
inhibitory analog. Alternatively, a regulatory mutant
that decreases the cellular amount of an enzyme may
cause hypersensitivity. Conversely, resistant mutants
can result when a mutation increases the cellular
amount of an inhibited enzyme or decreases the affinity
of the enzyme for the analog. Resistance can also arise
when a mutation disables an enzyme that converts a
chemical analog into a toxic substance or that decreases
uptake or retention of the chemical analog by cells.
Mutants defective in the biosynthesis of small molecules can be used in numerous ways to gain information about the genes that mediate a biosynthetic
pathway. The DNA sequences that flank the point of
insertion of a transposon in the chromosomes of bacteria, yeast, and other organisms can be
determined rapidly by cloning and polymerase chain
reaction (PCR) methods. If the DNA sequence of the
genome is known, then the gene or open reading frame
disrupted by the transposon can be identified from the
database.
Complementation
Mapping
Crossfeeding
Epistasis
Potential Problems
Reverse Genetics
Reverse genetics depends on the isolation of an
enzyme of sufficient purity to allow determination
of a segment of its amino acid sequence. Significant
technological advances have recently been made in
peptide sequencing by chromatography and mass
spectrometry so that the amount of purified protein
needed for amino acid analysis is small. After an amino
acid sequence has been obtained, two strategies can
be used to find the gene that encodes the purified
enzyme. If the enzyme is from an organism whose
entire genome is known, then the amino acid sequence
can rapidly be found among all the proteins encoded
by that genome. For organisms whose genomes have
not been fully sequenced, searches of the available
DNA sequences may fail to identify the gene encoding the purified enzyme. Nevertheless, the entire gene
can still be identified by molecular methods. In this
approach, a set of mixed oligonucleotide probes is
synthesized, based on the genetic code, to correspond
to the sequence of amino acids in the peptide from
the purified enzyme.
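The design of a mixed oligonucleotide probe can be sketched by back-translating a peptide through the standard genetic code. The example below is illustrative only: the peptide `LSRMWDK` is hypothetical, and the function names are assumptions rather than an established tool.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (one-letter code) to the list of codons that encode it.
CODONS = {}
for (b1, b2, b3), aa in zip(product(BASES, repeat=3), AMINO):
    CODONS.setdefault(aa, []).append(b1 + b2 + b3)

def degeneracy(peptide):
    """Number of distinct DNA sequences that could encode the peptide."""
    count = 1
    for aa in peptide:
        count *= len(CODONS[aa])
    return count

def mixed_probe(peptide):
    """Yield every coding sequence in the mixed probe for the peptide."""
    for combo in product(*(CODONS[aa] for aa in peptide)):
        yield "".join(combo)

def least_degenerate_window(peptide, length):
    """Stretch of the given length that needs the fewest oligonucleotides."""
    windows = [peptide[i:i + length] for i in range(len(peptide) - length + 1)]
    return min(windows, key=degeneracy)

peptide = "LSRMWDK"  # hypothetical peptide sequence
best = least_degenerate_window(peptide, 3)
print(best, degeneracy(best), sorted(mixed_probe(best)))
```

Stretches rich in methionine and tryptophan, which have a single codon each, keep the mixture small, whereas leucine, serine, and arginine (six codons each) inflate it quickly.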
A genomic library prepared from the organism
under study (above) is then screened for the gene
that hybridizes strongly to the mixed oligonucleotide
probe. As noted above, genomic libraries can be
prepared in plasmid vectors. They can also be prepared in bacteriophage vectors. To screen a library,
individual bacterial or yeast colonies or bacteriophage plaques, each containing a vector with a small
segment of the chromosome under study, are separated on petri plates, attached to a synthetic support
medium, such as nylon, lysed, and hybridized to the
mixed oligonucleotide probe, which is labeled. A colony or plaque that hybridizes strongly to the probe is
further analyzed to determine whether it contains a
complete copy of the gene that encodes the purified
enzyme.
The application of reverse genetic approaches
depends on the availability of methods for purifying
enzymes from crude extracts of an organism. In addition, the success of this approach depends on the
stability and cellular abundance of the enzymes.
Enzyme assays and purification schemes often require
considerable ingenuity to design and optimize. Unfortunately, the activities of some enzymes cannot be
assayed, because the substrates are not known or are
not available. Certain enzymes are not functional or
stable during purification despite the addition of cocktails containing protease inhibitors, reducing agents,
and stabilizing agents. Finally, some enzymes are
Genomics
The complete DNA sequences of the chromosomes of
many prokaryotes and eukaryotes have been determined, and many more genomes will be sequenced in
the near future. One surprise about the genome of
the bacterium E. coli is that the functions of approximately 1700 genes out of a total of 4473 genes are
unknown. Thus, nearly 38% of the genes in E. coli,
which is one of the most studied of all organisms,
have not been encountered before. Complete genome
sequences allow comparisons of the predicted amino
acid sequences of all the proteins encoded within an
organism and among different organisms. Within
an organism, polypeptides with conserved amino
acid sequences, shared structural motifs, and similar
functions using different substrates are referred to as
`paralogs' and can be classified into paralogous gene
families. Such comparisons can sometimes suggest
whether related proteins likely evolved by duplication
from a common ancestral protein followed by divergence to acquire new functions. In other cases, proteins with related functions seem to have arisen by the
convergence of two unrelated ancestral proteins or
to have been acquired from another organism by horizontal gene
transfer.
Comparisons of the predicted proteins synthesized
in different organisms reveal conserved proteins,
called `orthologs,' that likely carry out the same functions. One of the great values of genomics is that once
the function of a protein or enzyme has been determined biochemically and genetically in one organism,
its orthologs will likely have the same or a very similar
function in other organisms. Thus, homology searches
can suggest some readily testable hypotheses about
the functions of previously unidentified genes in
many different organisms.
Genomics has been invaluable in helping to elucidate the biosynthetic pathways of certain small
molecules. For example, suppose a hypothetical biosynthetic pathway has been proposed, but mutants
defective in one step of the pathway have never been
isolated. Or suppose that knockout mutations in a
gene always result in a partial growth defect in which
the mutant grows normally when a certain nutrient is
added but slowly when the nutrient is omitted. Both
of these cases suggest redundancy of enzyme function
Regulation of Biosynthesis
A detailed discussion of the regulation of biosynthetic
pathways is beyond the scope of this article; nevertheless, a couple of generalizations are warranted.
Biosynthesis is regulated at several levels. Pathway
regulation involves inhibition of enzymatic activity
by an intermediate or product of a biosynthetic pathway. Often the end product, such as nutrient F in
Figure 1, inhibits the activity of the first enzyme in
the pathway, in this case enzyme a. Mutations in genes
encoding enzymes, such as enzyme a, that are no
longer feedback-inhibited by end products, such as
nutrient F, can often be selected using chemical analogs (see above). Genetic regulation involves changes
in the amounts of the enzymes themselves (e.g.,
enzymes a to e in Figure 1). Genetic regulation is
often studied by constructing fusions between biosynthetic enzymes and reporter proteins, whose activities
are easy to assay. A cutting-edge approach to study
genetic regulation involves the simultaneous quantification of all mRNA transcripts in cells using DNA
microarrays attached to chips. Finally, it should be
noted that the older generalization that biosynthetic
pathways contain a single rate-limiting step is no
longer accepted. Instead, several enzymatic steps in
each pathway may influence the rate at which intermediates are converted and product formed by the
pathway. The analysis of the flux of intermediates
through pathways is called metabolic control analysis.
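The idea that control over flux is shared among steps can be illustrated with a toy two-step pathway. The kinetics below (a reversible first step and an irreversible second step, both linear) are assumptions chosen for simplicity and do not come from the article; the function names are likewise illustrative.

```python
def steady_state_flux(e1, e2, s=1.0, k_eq=2.0):
    """Steady-state flux through S -> X -> P for a toy linear model.

    Assumed rate laws: v1 = e1 * (s - x / k_eq) and v2 = e2 * x,
    where e1 and e2 are enzyme amounts and x is the intermediate level.
    Setting v1 equal to v2 gives the steady-state value of x.
    """
    x = e1 * s / (e1 / k_eq + e2)
    return e2 * x

def control_coefficient(step, e1, e2, rel_change=1e-6):
    """Fractional change in flux per fractional change in one enzyme amount."""
    base = steady_state_flux(e1, e2)
    if step == 1:
        changed = steady_state_flux(e1 * (1 + rel_change), e2)
    else:
        changed = steady_state_flux(e1, e2 * (1 + rel_change))
    return (changed - base) / base / rel_change

c1 = control_coefficient(1, e1=1.0, e2=1.0)
c2 = control_coefficient(2, e1=1.0, e2=1.0)
print(f"C1 = {c1:.3f}, C2 = {c2:.3f}, sum = {c1 + c2:.3f}")
```

In this toy model neither enzyme has a coefficient of 1, so neither is the sole rate-limiting step, and the two coefficients sum to 1.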
Further Reading
Biotechnology
H I Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0132
The term `biotechnology,' which seems to have originated in the 1970s, means different things to different
people. A useful, broad definition (the application of
biological systems and organisms to technical and
industrial processes), coined by a White House working group in the mid-1980s, clearly encompasses a
variety of old and new processes and products. These
include endeavors as different as fish farming, the
production of enzymes for laundry detergents, and
the genetic manipulation of bacteria to enable them
to clean up oil spills or synthesize human insulin. But
to many, biotechnology connotes genetic engineering
specifically with the newest molecular, gene-splicing
techniques.
Neither biotechnology nor its subset, genetic engineering, is new. A primitive form of biotechnology
dates back at least to 6000 BC when the Babylonians
used microorganisms in fermentation to brew alcoholic beverages. And genetic engineering can be
dated from humans' recognition that animals and
crop plants can be selected and bred to enhance
desired characteristics. In these applications, early
biologists or agriculturists selected for desired
phenotypes, with the poorly understood evolution of
genotypes occurring concomitantly.
During the past half century, better understanding
of genetics at the molecular level has added to the
sophistication of genetic manipulation. An excellent
example is the genetic improvement of Penicillium
chrysogenum, the mold that produces penicillin: Via
the application of a variety of techniques during the
past half century, penicillin yields have been increased
more than a hundred-fold. Similarly, agricultural
crops have been genetically improved with astonishing success. These applications of ``conventional'' biotechnology, or genetic engineering, represent scientific,
technological, commercial, and humanitarian successes of monumental proportions. However, the techniques used for these earlier successes were relatively
crude; recently, they have been supplemented, and
even supplanted, by ``the new biotechnology,'' a set
of enabling techniques that make possible genetic
manipulation at the molecular level. These new,
widely applicable techniques are of two general kinds.
The prototype, variously called recombinant DNA or
gene-splicing, shuttles genes readily between organisms. Recombinant DNA technology provides more
precise, better understood, and more predictable
methods for manipulating genetic material than was
possible with conventional biotechnology. The desired ``product'' of recombinant DNA manipulations
may be the engineered organism itself (for example,
bacteria altered to clean up oil spills, a weakened virus
used as a vaccine, or a pest-resistant crop plant) or a
biosynthetic product of the cells, such as human insulin
produced in bacteria, a hepatitis vaccine antigen
synthesized in yeast, or oil expressed from seeds. The
other major enabling technology is the production of
hybridomas, immortal cell lines that produce monoclonal antibodies of high specificity that are useful as
drugs and in clinical diagnostics.
The seminal recombinant DNA experiment was
the 1973 paper by Stanley Cohen, Herbert Boyer
and their collaborators (Cohen and Boyer, 1973), in
which they mixed two plasmid DNAs digested with a
restriction enzyme and, after ligation, introduced the
resulting recombinant, or chimeric, DNA into Escherichia coli. When the bacteria were propagated, the
plasmids containing heterologous DNA were likewise
propagated and produced amplified amounts of this
recombinant DNA.
The idea for this experiment came not as a sudden
epiphany but was the logical extension of earlier work
in several discrete scientific areas. Recombinant DNA
technology developed from the synergy of several
more or less independent lines of biological and chemical research extending over several decades. Prodigious
research in enzymology had led to the use of restriction
enzymes to cut DNA molecules at defined sequences,
and to the use of DNA ligases to rejoin DNA fragments to form covalently linked chimeric molecules.
Another essential contribution was the panoply of
advances in fractionation procedures that permitted the
Biotechnology's Contributions to
Science and Society
Thus, what has changed since the demonstration of
recombinant DNA technology in the early 1970s is
the technology of biotechnology. The new technology
is at the same time more precise and predictable
than its predecessors and yields better characterized
and more predictable products. And what a cornucopia of products! There are already more than two
dozen distinct gene-spliced or hybridoma-derived
drugs on the market (including one adjunct to cancer
chemotherapy whose revenues exceed $1 billion
annually) and upwards of 500 in clinical development. Marketed products include human insulin
synthesized in recombinant E. coli (Figure 1), used
daily by millions of American diabetics; tPA, tissue
plasminogen activator, a protein that dissolves the
blood clots that cause heart attacks and strokes;
human growth hormone, used to treat children with
hormonal deficiency; erythropoietin, which stimulates the growth of red blood cells in certain
patients suffering from anemia; and several interferons, proteins used to treat a variety of maladies, from
multiple sclerosis to viral infections and cancer.
Dozens of recombinant crop and garden plants on
the market have been genetically improved with a
variety of introduced genes, to impart pest and
disease resistance; these include tomato resistant to
bacterial speck disease (modified by the introduction
of a gene from the bacterium Pseudomonas syringae;
Figure 2), and herbicide-resistant soybeans (modified by the addition of an enzyme that degrades the
herbicide glyphosate) that permit the use of a more environment-friendly herbicide, and in smaller amounts.
Figure 1 Escherichia coli containing genes for human proinsulin that were introduced with recombinant DNA
techniques. The large, homogeneous-appearing inclusion bodies in these elongated bacteria are crystallized proinsulin
that has precipitated out because of its high intracellular concentrations. (Courtesy of Eli Lilly & Co.)
Another promising application of the new biotechnology is gene therapy, the insertion of normal or
modified genes into an animal or human, which can
be done for different purposes. A common application
is the creation of genetic lines of animals with characteristics useful in research or medicine: animals that
are, for example, models of important human diseases
such as breast cancer or multiple sclerosis, or that
secrete into their bloodstream large amounts of a substance that can be used as a human therapeutic, a
process known as `biopharming.' In humans, gene
therapy is being widely tested to correct genetic or
acquired disorders via the synthesis in the body of
Figure 2 Tomato with bacterial gene that confers resistance to bacterial speck disease. On the right is a wild-type
tomato plant; on the left is a plant that differs from wild-type functionally by the addition of a bacterial gene (Prf, from
Pseudomonas syringae) that modulates resistance to bacterial speck disease. Both plants have been challenged by the
application of the Pseudomonas pathogen. (Courtesy of Dr. Brian Staskawicz, University of California, Berkeley.)
manipulation that preceded it, perhaps we should
think of the technological era that is approaching as a
Brave Old World.
Birth Defects
Further Reading
Blastocyst
Reference
Blood groups are antigenic determinants on the surface of blood cells, but the use of the term is generally
restricted to antigens on red blood cells. A blood
group system is one or more blood group antigens
encoded either by a single gene or by a cluster of two
or more closely linked, homologous genes.
Table 1 Human blood group systems, genes that encode them, and their chromosomal location

Number  System name         System symbol  Gene name(s)      Chromosome  Number of antigens
001     ABO                 ABO            ABO               9           4
002     MNS                 MNS            GYPA, GYPB, GYPE  4           43
003     P                   P1             P1                22          1
004     Rh                  RH             RHD, RHCE         1           45
005     Lutheran            LU             LU                19          18
006     Kell                KEL            KEL               7           23
007     Lewis               LE             FUT3              19          6
008     Duffy               FY             FY                1           6
009     Kidd                JK             SLC14A1           18          3
010     Diego               DI             SLC4A1            17          18
011     Yt                  YT             ACHE              7           2
012     Xg                  XG             XG                X           1
013     Scianna             SC             SC                1           3
014     Dombrock            DO             DO                12          5
015     Colton              CO             AQP1              7           3
016     Landsteiner-Wiener  LW             LW                19          3
017     Chido/Rodgers       CH/RG          C4A, C4B          6           9
018     Hh                  H              FUT1              19          1
019     Kx                  XK             XK                X           1
020     Gerbich             GE             GYPC              2           7
021     Cromer              CROM           DAF               1           10
022     Knops               KN             CR1               1           5
023     Indian              IN             CD44              11          2
024     Ok                  OK             CD147             19          1
025     Raph                RAPH           MER2              11          1
Further Reading
Bloom's Syndrome
J German
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0137
The three major complications are chronic lung disease, diabetes mellitus, and cancer.
BS is a genetically determined trait transmitted in
straightforward autosomal recessive fashion, mutation at the locus BLM being responsible. Homozygosity or compound heterozygosity for any of the more
than 60 mutations at BLM identified so far results in a
similar phenotype. The mutations are predominantly
null alleles, but missense mutations also have been
detected. BS is rare in all populations, but in the
Ashkenazi Jewish population one particular mutant
allele, a 6-bp deletion and 7-bp insertion that results in
premature termination of translation, has through
founder effect reached a relatively high carrier
frequency of approximately 1%; in 31% of all persons
with BS one or both parents are Ashkenazi.
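The link between a carrier frequency of about 1% and the rarity of affected homozygotes can be sketched with the Hardy-Weinberg relation. The calculation below assumes random mating and considers only the single founder allele; it is an illustration, not an epidemiological estimate, and the function name is an assumption.

```python
from math import sqrt

def allele_freq_from_carrier(carrier_freq):
    """Solve 2 * q * (1 - q) = carrier_freq for the rarer allele frequency q."""
    return (1 - sqrt(1 - 2 * carrier_freq)) / 2

carrier_freq = 0.01                    # approximate figure quoted in the text
q = allele_freq_from_carrier(carrier_freq)
homozygote_freq = q ** 2               # expected under random mating
one_in = 1 / homozygote_freq
print(f"q = {q:.5f}; expected homozygotes about 1 in {one_in:,.0f}")
```

Under these assumptions a 1% carrier frequency corresponds to an allele frequency of about 0.5% and to roughly one homozygote per 40 000 births.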
The genome is abnormally unstable in the somatic
cells of persons with BS so that mutations arise spontaneously and accumulate in numbers many times
greater than normal. These include both microscopically visible chromatid gaps, breaks, and rearrangements and mutations at specific loci. Exchanges
between chromatids take place excessively, at what
appear to be homologous sites. One consequence of
this hyperrecombinability is reduction to homozygosity of constitutionally heterozygous loci distal
to points of exchange.
Some of the clinical characteristics of BS may be
viewed as direct or indirect consequences of the
hypermutability, so that clinical BS has been considered the prototype of a class of disease referred to as
the somatic mutational disorders. Nevertheless, the
small size, the diabetes, and the immunodeficiency
remain to be explained. A major consequence of the
hyperrecombinability and hypermutability is proneness to neoplasia; BS more than any other known
human state predisposes to the development of cancer
of the types and sites that affect the general population, and at unusually early ages: carcinoma commonest, leukemia and lymphoma next in frequency,
the rare childhood neoplasms last.
Diagnosis of BS is based on clinical observation.
Laboratory confirmation ordinarily is by cytogenetic
demonstration of the characteristically increased tendency of chromatid exchange to take place. BS is the
only condition known that features a greatly increased
rate of sister chromatid exchange (SCE), and blood
lymphocytes in short-term culture are suitable for
confirming or disaffirming the diagnosis. Under certain circumstances, the diagnosis can be confirmed by
demonstrating mutation(s) at BLM by molecular
techniques.
The mapping of BLM to chromosome band
15q26.1 and its subsequent molecular isolation identified a nuclear protein which contains a 350 amino acid
Further Reading
Blunt-End Ligation
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1775
Blunt-end ligation is a reaction that joins two double-stranded DNA molecules (without `staggered cohesive ends') directly at their ends.
See also: DNA Ligases
Bombyx mori
M Goldsmith
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1700
processes as well as for potential production of value-added products introduced by genetic engineering.
Cytogenetics
B. mori, as a typical lepidopteran, has numerous small
`holokinetic' chromosomes (n = 28) with dispersed
centromeres and, unfortunately, few visible landmarks
apart from occasional constrictions and bead-like
chromomeres. The failure to show regular banding
patterns in meiotic or mitotic tissue and the lack of
polyteny in terminally differentiated tissue despite
its polyploidization (chromosomes replicate without
cytokinesis but fail to align as in dipteran insects) have
meant that cytogenetics is of limited utility for investigating chromosome fine structure. Chromosomal
fluorescent in situ hybridization (FISH) has been used
successfully to localize repeated sequences, including
telomeric repeats, ribosomal DNA, families of dispersed retrotransposable elements (LINES), and silkworm homologs of centromere-associated sequences.
Few single-copy genes have been localized by this
method, perhaps because of the relatively large
genome size (530 Mb) and features of chromosome
organization not well understood.
Classic Genetics
The community of scientists exploiting the silkworm
for genetic and other studies has largely been confined
to Southeast and South Asian countries traditionally
engaged in sericulture such as Japan, China, Korea, and
India, which maintain the largest number of well-defined genetic stocks, but there has also been a tradition of basic research in the silkworm in Europe,
notably in Russia, Italy, and France, which also have
historic and modern ties to the silk industry. Most
Mendelian mutations have been found as spontaneous
mutations during the course of mass-rearing for stock
maintenance or sericulture. Standard methods of mutagenesis such as irradiation and chemical mutagenesis
followed by selective screens to target specific processes or developmental stages have also been used,
but have been limited by the expense of mass-rearing
and long-term stock maintenance, which must be carried out annually to renew vitality despite the ability
to put most strains into an early embryonic diapause
(dormancy) for 6–10 months per year. Diapause, which
depends in part on genetic constitution and partly on
rearing conditions (temperature and light cycle), is
broken by a period of chilling, or by artificial activation at critical stages after egg-laying. In addition to
point mutations and deletions affecting specific traits,
modest collections of chromosome aberrations have
been produced, notably autosomal and sex-limited
Molecular Biology
In the past 30 years research groups in Japan, France,
the United States, and Canada have developed a number of model systems in the silkworm primarily to
study control of gene regulation; these early model
systems include the genes encoding silkgland-specific
proteins (the two major silk fiber proteins, fibroin
heavy and light chains, the soluble cocoon 'gum,'
sericin, and p25, a putative chaperone protein) and
transfer RNAs for amino acids that are highly
enriched in silk, and the chorion multigene families
that encode the eggshell proteins, which are synthesized and secreted by the follicular cells that nurture
the growing oocyte. Recent advances in cloning technology have been widely used to isolate and study
many silkworm genes based on knowledge of protein
products or expected homology with conserved genes
from other species, extending the study of fundamental mechanisms into such areas as early development,
the immune response, sex determination, and neurobiology. Although to date no group has reported successful isolation of a mutation by direct positional or
map-based cloning, as discussed below, the tools to do
this are becoming available.
Molecular Genetics
In the past several years molecular linkage maps
have been constructed using a variety of physical
markers, including restriction fragment length polymorphisms (RFLPs) based primarily on anonymous
and partially sequenced cloned cDNAs (expressed
sequence tags, ESTs) and isolated genes, random amplified polymorphic DNAs (RAPDs), microsatellites,
and inter-simple sequence repeats (ISSRs). Marker density in these maps ranges up to 1000 markers for
RAPDs, which are estimated to be spaced at an average interval of 500 kb; these maps are being integrated with other
molecular markers as well as with the conventional
genetic maps. The construction of BAC libraries in the
same genetic stocks as the primary molecular linkage
maps will make positional cloning feasible in the near
future. This work, together with broad-scale gene
isolation and identification, is being aided by the
development of a large-scale EST sequencing project
which will provide a source of anchor loci for contig assembly aimed at a whole genome sequencing
project. The EST database ("SilkBase") contains more
than 20 000 clones to date representing at least 8000
independent sequences obtained from more than 30
different cDNA libraries from many tissues and
developmental stages. Data from SilkBase is also
being used for tissue transcription profiling and
transcription mapping of isolated genomic DNA
fragments.
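The marker spacing quoted above can be checked with simple arithmetic. The following is a minimal sketch, assuming that the markers are spread evenly over the genome and that 1 Mb equals 1000 kb; the function name is hypothetical, and the genome size (530 Mb) and marker count (1000) are the figures given in this article.

```python
def mean_marker_spacing_kb(genome_size_mb: float, n_markers: int) -> float:
    """Average distance between markers in kb, assuming even spacing."""
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    # Convert megabases to kilobases, then divide by the number of markers.
    return genome_size_mb * 1000 / n_markers


# Figures taken from the text: a 530 Mb genome and up to 1000 RAPD markers.
spacing = mean_marker_spacing_kb(530, 1000)
print(f"Mean spacing: {spacing:.0f} kb")
```

Under these assumptions the mean spacing comes out at 530 kb, which is consistent with the estimate of roughly 500 kb. Real markers are not evenly distributed, so local spacing will vary.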
Gene Introduction
BmNPV, nucleopolyhedrosis virus, a double-stranded
DNA baculovirus that infects silkworms and is a serious problem in sericulture, has been engineered as an
expression vector to produce exogenous products
such as interferon for commercial use and has some
applications in the pharmaceutical industry. Strains of
B. mori have been bred specifically for mass-rearing
under sterile conditions using artificial diet, and can be
used to harvest expressed proteins directly from the
hemolymph before the virus kills the host. Current
efforts are aimed at disabling the virus, and engineering it to become a suitable vector for germline transformation.
Current methods for obtaining transgenic silkworms rely on a vector derived from piggyBac, a
transposable element first found in the cabbage looper,
Trichoplusia ni, using green fluorescent protein as a
reporter and a silkworm actin promoter, and injected
directly into the embryo at the syncytial preblastoderm stage shortly after egg-laying. Although not yet
routinely used by silkworm geneticists, improvements
in the efficiency of obtaining stable germline transformants by the development of new vectors and easier
methods of gene delivery are under investigation,
and promise to usher in a new period of basic research
using this model to elucidate basic genetic mechanisms
in an order of insects that includes the worst agricultural pests worldwide.
Further Reading
Bootstrapping
See: Trees
Bottleneck Effect
R Chakraborty and M Kimmel
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0140
Boveri, Theodor
R Chaganti
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0141
Further Reading
B-Prolymphocytic Leukemia (B-PLL)
D Catovsky
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1609
Brachydactyly
R Savarirayan, V Cormier-Daire, and D L Rimoin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0142
patterns; type B, by shortened middle phalanges and
distal phalanges and nail aplasia (gene localized to 9q);
type C, by shortened middle phalanges and ulnar
deviation of the index and middle fingers (due to
heterozygous mutations in the cartilage-derived morphogenic protein 1 gene); type D, by short, broad
terminal phalanges of the thumbs and great toes; and
type E, by shortness of all metacarpals and phalanges,
especially the fourth and fifth digits (GNAS1 gene mutations found in subgroup with Albright hereditary
osteodystrophy).
P J Hastings
Brachyury Locus
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0143
Branch Migration
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0144
Brassicaceae, Molecular Systematics and Evolution of
S L O'Kane Jr.
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1717
The Brassicaceae, also called the Cruciferae in reference to its four "crossed" petals, is commonly known
as the mustard family. The family is of systematic
interest, in part, because it includes the various culinary mustards such as Chinese mustard (Brassica juncea), black mustard (B. nigra), white mustard (Sinapis
alba), horseradish (Armoracia rusticana), radish
(Raphanus sativus), and the highly human-modified
B. oleracea which provides broccoli, Brussels sprouts,
cabbage, cauliflower, kale, and kohlrabi. The family
To date, our overall view of relationships within Brassicaceae is primarily a product of the later studies.
Published molecular analyses focusing on the placement and circumscription of Arabidopsis have relied on DNA sequences
of the chloroplast rbcL gene, the nuclear genes
adc (arginine decarboxylase), adh
(alcohol dehydrogenase), and chs (chalcone synthase),
the nuclear internal transcribed spacers (ITS) of the
large subunit of rDNA, and on restriction site analysis
of the chloroplast genome. These studies clearly indicate that the intrafamilial systematics of the family are
in need of revision. For example, the genus Lesquerella is paraphyletic if Physaria is not included in it;
Arabis is currently polyphyletic, consisting of several
unrelated lineages; and long-standing tribal relationships in the family either need to be thoroughly
revised or abandoned due to the rampant polyphyly
and paraphyly of the tribes. Conversely, molecular
studies have strongly supported, for example, the relationships among the mustard species of the genus
Brassica previously worked out on morphological
and cytological grounds. Because of the attention
given to Arabidopsis, much more is now known of
its inter- and intrageneric relationships. The genus
now consists of A. thaliana, all of the species formerly
included in Cardaminopsis, and some taxa previously
placed in Arabis. Except for Arabidopsis thaliana, all
species previously placed in Arabidopsis have now
been reassigned to other, mostly new, genera.
Molecular tools have not only forced a reconsideration of the systematics of Brassicaceae but have also
refocused the attention of systematists on a suite of
characters previously underutilized, such as leaf insertion, growth form, hair types, cytology, and biogeography. Reliance on silique morphology to the
exclusion of other characters is untenable. The genus
Arabis is an example where relying on fruit morphology
has produced a highly unnatural, polyphyletic genus.
See also: Arabidopsis thaliana: Molecular
Systematics and Evolution; Arabidopsis thaliana:
The Premier Model Plant; Plant Development,
Genetics of
Brassinosteroids
J Chory
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1684
Figure 1 Brassinolide.
division and differentiation, and modulate reproductive development. Brassinosteroids also mediate
growth responses unique to plants, including promotion of cell elongation in the presence of a complex cell
wall, xylem differentiation, senescence, stress production, and coordinating multiple developmental
responses to darkness and light. The chemical
structure of brassinolide is shown in Figure 1.
Further Reading
BRCA1/BRCA2
A Ashworth
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1552
Genetics
Breast cancer exhibits familial association in that the
disease is about twice as common in the mothers,
sisters, and daughters of carriers as it is in the general
population. This familial risk rises to about fivefold
where the cancer occurs before 40 years of age. Mutations in BRCA1 and BRCA2 account for most of
the inherited susceptibility to breast cancer in families with several (more than six) affected individuals.
However, it has been estimated that, overall, BRCA1
and BRCA2 mutations might account for only 20–25%
of familial risk. None of these other putative
BRCA genes (BRCA3, 4, 5, etc.) has yet been
mapped or cloned.
Carriers of mutations in BRCA1 or BRCA2 have
an up to 85% chance of developing breast cancer by
age 70 years, but this might differ between different
populations. Hundreds of different mutations in
BRCA1 and BRCA2 have been described (see the
Breast Cancer Information Core (BIC) database
on the World Wide Web at http://www.nhgri.gov/Intramural_research/Lab_transfer/Bic/). Some mutations are found more commonly than others, usually
due to founder effects in certain populations. For
example, in the Ashkenazi Jewish population, two
BRCA1 mutations (185delAG and 5382insC) and
one BRCA2 mutation (6174delT) are common and
Further Reading
BreakCopy/BreakJoin
P J Hastings and S M Rosenberg
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0146
Breast Cancer
N Haites
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0147
BRCA2
`BRCA3' . . .?
It is generally agreed that none of the currently available cancer susceptibility tests are appropriate for the
screening of asymptomatic persons in the general
population, although the population-specific mutations described among Ashkenazi Jews and Icelanders
may achieve that status in the future. The testing of unaffected members of a family known to carry a BRCA1
or BRCA2 mutation or other cancer-predisposing gene
(known as a predictive genetic test) is probably best
done at specialty clinics.
Li–Fraumeni Syndrome
Cowden Disease
Cowden disease is considered an autosomal dominant disorder in which an estimated 30% of affected
women have breast cancer, often bilateral and typically
at a younger than average age. A genomic search localized the Cowden gene to chromosome 10q22–23, and
mutations in the gene, PTEN, have been found in
Cowden disease patients.
Breeding of Animals
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0148
of an inbred strain are homozygous across their entire genome and genetically identical to each other.
Thus, an outcross between two inbred strains can be
symbolized as A/A × a/a, and the offspring resulting
from such a cross are called the first filial generation,
symbolized by F1. All F1 animals that derive from an
outcross between the same pair of inbred strains are
identical to each other with a heterozygous genome
symbolized as A/a. However, when either or both
parents are not inbred, F1 siblings will not be identical
to each other.
An outcross between two inbred strains or between
one inbred strain and a non-inbred animal that
contains a genetic variant of interest is almost always
the first breeding step performed in a linkage analysis.
The F1 animals obtained from this outcross can
be used in two types of crosses commonly performed
by experimental geneticists: backcrosses and intercrosses. A mating between a heterozygous F1 animal
(with an A/a genotype) and one that is homozygous
for either the A or a allele is called a backcross.
This term is derived from the vision of an F1 animal
being mated ``back'' to one of its parents. In actuality, a
backcross is usually accomplished by mating F1 animals with other members of a parental strain rather
than a parent itself. The two-generation outcross–backcross
combination is one of the major breeding
protocols used in linkage analysis. From Mendel's first
law of segregation, we know that the offspring from
a backcross to the a/a parent will be distributed
in roughly equal proportions between two genotypes
at any single locus: approximately 50% will be
heterozygous A/a, and approximately 50% will
be homozygous a/a.
A mating set up between brothers and sisters from
the F1 generation, or between any other two animals
that are identically heterozygous at a particular locus
under investigation, is called an intercross. An intercross can be represented by the notation: A/a × A/a.
The two-generation outcross–intercross series was
the classic breeding scheme used by Mendel in the
formulation of his laws of heredity, and it is the second
major breeding protocol used today for linkage analysis in animals. Again, according to Mendel's first
law, the offspring from an intercross will be distributed among three genotypes at any single locus: 50%
will be heterozygous A/a, 25% will be homozygous
A/A, and 25% will be homozygous a/a.
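The expected genotype proportions for the outcross, backcross, and intercross follow directly from Mendel's first law and can be enumerated mechanically. The following is a minimal sketch, assuming a single locus with two alleles written in the A/a notation used above and equal transmission of each parental allele; the function name cross is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1: str, parent2: str) -> dict:
    """Expected offspring genotype fractions at one locus.

    Each parent is written as two alleles separated by '/', e.g. 'A/a'.
    Each allele of a parent is transmitted with probability 1/2.
    """
    gametes1 = parent1.split("/")
    gametes2 = parent2.split("/")
    counts = Counter()
    for g1, g2 in product(gametes1, gametes2):
        # Sort the allele pair so that 'a/A' and 'A/a' count as one genotype.
        counts["/".join(sorted((g1, g2)))] += 1
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


for label, p1, p2 in [
    ("outcross", "A/A", "a/a"),
    ("backcross", "A/a", "a/a"),
    ("intercross", "A/a", "A/a"),
]:
    print(label, {g: str(f) for g, f in cross(p1, p2).items()})
```

The enumeration treats the four gamete combinations as equally likely, which is exactly the assumption behind the 50:50 backcross and 25:50:25 intercross proportions quoted in the text. Observed litters will deviate from these expectations by chance.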
A mating between two members of the same inbred
strain, or between any two animals having the same
homozygous genotype is called an incross. The
incross (A/A × A/A or a/a × a/a) serves primarily as
a means for maintaining strains of animals that are
inbred or carry particular alleles of interest to the
investigator. All offspring from an incross will have
Brenner, Sydney
B Guttman and E Kutter
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0149
in a living animal; his rational voice in the debate on recombinant DNA; and his trenchant wit.
Buoyant Density
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1776
Burkitt's Lymphoma
R Chaganti
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0151
tumor in children as well as adults may also present in
the marrow as an acute lymphoblastic leukemia. Facial
bone tumors are less common in the nonendemic
form, the majority of cases presenting in the abdomen.
These tumors are highly aggressive, but are potentially
curable. The prognosis in children correlates with
disease bulk at the time of diagnosis.
As a lymphoma, it comprises a small but well-defined histological subset of the histologically and
clinically complex B cell non-Hodgkin's lymphomas.
Histologically, the tumor cells are monomorphic
comprising small B cells. They display round nuclei,
multiple nucleoli, and relatively abundant basophilic
cytoplasm. The tumor cells exhibit a high rate of proliferation as well as a high rate of spontaneous cell
death. A histological hallmark of this disease is the
so-called 'starry sky' pattern resulting from ingestion
of the apoptotic cells by numerous benign macrophages. As a lymphoid neoplasm, the tumor cells
present surface immunoglobulin as well as a number
of B-cell-associated antigens, notably CD19, CD20,
CD22, CD79a, and CD10.
One of the first recognized intriguing features of
this neoplasm was that the serum of endemic patients
exhibits high titers of antibodies against the DNA
virus Epstein–Barr virus (EBV). The viral genomes
are also detected in the lymphoma cells of almost all
endemic patients, but rarely in nonendemic patients.
The precise role played by the virus in the generation
of the tumor is unknown. It has been suggested that
the virus-infected B lymphocytes may be prone to
excessive proliferation, which may predispose them
to additional genetic errors, some of which may
indeed be specific for cell transformation to a malignant state. While this scenario is possible, EBV is a
common virus with most individuals in the population
being seropositive. Acute infection by the virus causes
infectious mononucleosis and there is no increased
predisposition of these patients to Burkitt's lymphoma. Nonendemic patients with EBV-negative
Burkitt's lymphomas (no virus in tumor cells) are
often seropositive for the virus. Further complicating
the issue of the role of this virus in the etiology of
lymphoma is the fact that EBV-positive Burkitt's or
Burkitt's-like lymphoma is one of the malignancies
frequently associated with immunodeficiency states
associated with human immunodeficiency virus
(HIV) infection or iatrogenic immunosuppression.
Burkitt's lymphoma was the first human tumor
in which the key molecular aberration underlying
tumorigenesis was defined. Soon after the disease
was identified, cell lines were derived in Sweden
from tumors collected in Africa. Cytogenetic analysis
of these tumors during the 1970s identified a set of unusual chromosomal translocations that characterized
through the analysis of these immunoglobulin-gene-associated translocations and the study of their normal
function made significant contributions to our understanding of these biological phenomena. Thus, the
discovery of MYC-associated translocations and
the characterization of their molecular biological
C
C Genes
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1778
C Value
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1779
C-Value Paradox
H C Macgregor
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0301
CA Repeats
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.2084
References
CAAT Box
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1780
C. elegans
See: Caenorhabditis elegans
C57BL/6
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0180
the different 5′ exons spliced to a common second
exon, followed by the remainder of the coding
sequence. A smaller transcript is detected in testes,
but there is no alteration in coding sequence of this
mRNA. The c-ABL gene is relatively ubiquitously
expressed during embryonic development and in
adult tissues, with higher levels of expression in
lymphoid tissue and testes. There is a single known
homolog of c-ABL in the genome known as ARG
(ABL-related gene), or formally as ABL2.
The c-ABL gene encodes a nontransmembrane
protein-tyrosine kinase, c-Abl, of about 145 kDa.
There are two protein isoforms of Abl, denoted type
Ia and Ib, that are the products of the two distinct
mRNAs, and differ only in their N-terminal
sequences. The type Ib form of c-Abl contains a glycine residue at the second position and is covalently
attached to a myristoyl fatty acid moiety. The remainder of the polypeptide is identical between the two
isoforms. The N-terminal 60 kDa is very similar in
structure to members of the Src family, with Src
homology 3 (SH3) and Src homology 2 (SH2)
domains, followed by the catalytic domain. However,
c-Abl (and Arg) differ from Src proteins by the existence of a large (90 kDa) C-terminal domain. This
domain contains many functional motifs, including
phosphorylation sites, nuclear localization and export
signals, and DNA- and actin-binding domains.
The normal cellular functions of c-Abl are unknown. The protein is localized predominantly to
the cell nucleus in adherent cell types, with a fraction
also found in the cytoplasm, predominantly associated
with the filamentous actin cytoskeleton. The tyrosine
kinase activity of the protein is tightly regulated
in vivo. Abl catalytic activity is directly inhibited by
its own SH3 domain in an intramolecular fashion that
is similar to the regulation of Src kinases, but there is
also evidence for regulation by a cellular inhibitor and
via phosphorylation by other tyrosine and serine
kinases. Abl kinase activity is stimulated by several
physiologic stimuli, including DNA damage, oxidative
stress, and integrin and growth factor stimulation.
Overexpression studies have suggested roles for
nuclear Abl in growth arrest and apoptosis responses
to genotoxic stress and in transcription, and for cytoplasmic Abl in cytoskeletal responses to cell adherence
and growth factor stimulation. In Drosophila, genetic
evidence implicates Abl in axonogenesis in the central
and peripheral nervous system. The murine c-abl gene
was one of the first to be inactivated by homologous
recombination. Mice with homozygous null mutations in c-abl have reduced postnatal survival, variable
deficiency of mature B- and T-lymphoid cells, and
impaired spermatogenesis. However, mice lacking
Abl do not have profound defects in DNA damage
Caenorhabditis elegans
J Hodgkin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0152
Introduction
Caenorhabditis elegans is a small, free-living, nematode worm, which has become established as a standard model organism for a great variety of genetic
investigations, being especially useful for studying
developmental biology, cell biology and neurobiology. As an invertebrate experimental system, it is
now second only to Drosophila melanogaster in terms
of convenience and popularity. By the year 2000, the
community of C. elegans researchers had expanded to
approximately 300 laboratories, distributed over 20
countries. In 1998, sequencing of the 97 million base
pairs of DNA that make up the entire genome was
essentially completed. This was the first complete
genome sequence to be determined for any multicellular organism.
The potential usefulness of nematodes as tools for
genetic research was recognized early on by Ellsworth
Dougherty (USA) and Victor Nigon (France), but the
special popularity of C. elegans stems directly from
its adoption as an experimental organism by Sydney
Brenner, working in Cambridge, UK, during the 1960s.
The reasons for choosing this species as a subject for
intensive study are described later in this article.
C. elegans is a member of the phylum Nematoda,
which are commonly known as roundworms. The
number of species in this phylum is unknown, but
believed to be very large: at least one hundred
thousand and probably several million. In terms of
individual numbers, nematodes are also extraordinarily numerous; according to some estimates, four out of
every five animals on this planet are nematodes. Both
free-living and parasitic species occur. Plant parasitic
nematodes are economically important, being responsible for the loss of 10–20% of primary agricultural
production. The animal parasites are also very important; most vertebrates have at least one nematode
parasite, which can have debilitating or lethal effects
on its host. Entomopathogenic nematodes, which kill
Figure 1 Anatomy of adult hermaphrodite (XX) and adult male (XO) C. elegans. The hermaphrodite is about 1 mm
long when it reaches adulthood, but continues to grow in length thereafter, to a maximum of 1.5 mm. Males are
smaller, reaching 1 mm in total length. Larval stages have simpler anatomy, because they lack mature gonads and
genitalia. Structures labeled in the hermaphrodite: pharynx, intestine, ovary, oocytes, spermatheca (containing sperm), uterus (containing eggs), vulva, and anus. Structures labeled in the male: testis, sperm, seminal vesicle, vas deferens, intestine, cloaca, spicule, fan, and rays.
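Figure (life cycle of C. elegans): egg, L1 larva, L2 larva, L3 larva, L4 larva, and adult, with the dauer larva as an alternative stage. Self-fertilization (SELF) of an XX hermaphrodite yields XX hermaphrodites (99.8%) and XO males (0.2%); a cross (CROSS) with a male yields XX hermaphrodites (50%) and XO males (50%).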
and can survive harsh conditions and prolonged starvation, remaining viable for several months. When
food becomes available, dauer larvae will molt to
form L4 larvae, and resume maturation.
Technical Advantages
The major advantages of C. elegans as an experimental
system are as follows:
. Anatomical simplicity: the adult hermaphrodite
has only 959 somatic nuclei when fully grown, yet
it contains well-differentiated tissues, corresponding to those found in more complicated animals.
These include intestine, musculature, epidermis
(also known as hypodermis), sense organs, neurons,
germ cells, and so on.
. Genomic simplicity: the DNA content, at 97 megabases, is lower than that of most animal species.
. Ease of culture: the worms are easily grown on
bacterial culture plates. All wild-type and mutant
strains can be frozen for permanent storage in
liquid nitrogen, retaining viability indefinitely.
. Rapid growth: the 3-day generation time permits
many genetic crosses in a short space of time.
neurons, as reconstructed from serial section electron micrographs). These descriptions were carried
out by John Sulston, John White, Robert Horvitz,
Judith Kimble, Nichol Thomson and their collaborators.
. Resources: Informational resources include documentation of the complete cell lineage, complete
wiring diagram and parts list; the physical and
genetic maps of the six chromosomes; and the complete sequence of the genome. Practical resources
include the 18 000 cosmid clones and 5000 YAC
(yeast artificial chromosome) clones which were
used to assemble the physical map, and over 3000
strains held at the Caenorhabditis Genetics
Center (CGC). The CGC also maintains a bibliography of over 4000 references. There are comprehensive and publicly accessible databases for
C. elegans such as the Caenorhabditis elegans
World Wide Web server (http://elegans.swmed.edu),
the CGC (http://biosci.umn.edu), WormBase
(http://www.wormbase.org), and ACeDB
(http://www.sanger.ac.uk/Software/Acedb/).
The genome also encodes the usual sets of eukaryotic RNAs: ribosomal RNAs, tRNAs, snRNAs,
scRNAs and other small RNAs, as well as gene-specific regulatory RNAs. Most of the RNA genes
occur in families, amounting to at least another one
thousand genes. A variety of repeated sequences contribute about 6% of the genome. Six families of active
transposons, called Tc1–Tc6, have been defined, present in 5–30 copies and polymorphic between different races of C. elegans. Two of these transposons, Tc1
and Tc3, belong to the Tc1/Mariner family and have
been extensively studied with respect to transposition
mechanism, as well as being used for transposon-tagging and other manipulations.
The major gene families in the protein-coding part
of the genome conform to the general animal pattern,
encoding numerous kinases, DNA-binding proteins,
RNA-binding proteins, extracellular matrix components, and so on. Conspicuously abundant are collagen
genes, which encode the various components of the
complex cuticle which acts as the animal's exoskeleton. The genome also contains large families of G-coupled transmembrane receptors, which are believed
to act as chemoreceptors: C. elegans has a sophisticated olfactory sense, by which it probably gets
most information about its environment. Another
major gene family encodes nuclear hormone receptors, which seem to be more frequent than in other
animals, for unknown reasons.
A set of post-genomic tools is being used to assign
and analyze the function of the 20 000 genes of
C. elegans, which supplement and greatly extend the
classical genetic tools already available. These include:
. Systematic isolation and characterization of cDNA
clones, in order to isolate the coding parts of each
gene and to verify the genes and intron–exon organization predicted from sequence information.
. Sequencing of the related nematode Caenorhabditis
briggsae, which is at least 20 million years diverged
from C. elegans, so that intronic sequences exhibit
no similarity, but exons and control regions are
conserved and can therefore be recognized.
. Expression analyses, by in situ hybridization and by
making transgenic lines with reporter gene constructs, using either β-galactosidase or green fluorescent protein (GFP) tags. Transformation and the
construction of transgenic lines of C. elegans are
simple, because the absence of specific centromeric
sequences means that any piece of exogenous DNA
can be propagated as an extrachromosomal element, once it has been injected into the germline.
The GFP-tagged lines allow labeling and examination of specific tissues, cells, or subcellular structures in the living animal.
Research Areas
Conventional mutational techniques, mostly using
the chemical mutagen ethyl methane sulfonate, have
defined more than 1500 genes, distributed over 300
phenotypic categories. The largest gene classes named
for specific phenotypes, and therefore implicated in
particular processes, provide a reflection of some of
the main research areas. Between 12 and 130 genes
have been defined by mutation in each of the following classes: emb (embryogenesis), lin (cell lineage and
differentiation), ced (programmed cell death), mig
(cell and axon migration), gon (gonad development),
mab (male-specific development), him (meiosis), fer
and spe (spermatogenesis), che and dyf (chemotaxis),
cod (male copulation behavior), eat (feeding behavior), egl (egg-laying), mec (mechanosensation),
osm (osmotaxis), unc (locomotion), daf (dauer-larva
formation), dpy (size and body shape), sup (genetic
suppression). The efficiency of mutagenesis in C. elegans also means that genes are often defined by multiple independent alleles, sometimes including unusual
mutations such as temperature-sensitive alleles or
gain-of-function alleles.
In addition to genes implicated in specific processes, many hundreds of essential let (lethal) genes
have been defined by mutations that lead to embryonic death, larval death, or adult sterility, but as yet
these have been little characterized with respect to
phenotype. The total number of essential genes in C.
elegans is not certain, but is unlikely to be more than
Further Reading
Cairns, John
P L Foster
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0153
References
Figure 1 (diagram not recovered; surviving labels: stimulus, GPCR, G-protein subunits, adenylate cyclase, membrane, GDP, GTP, ATP, cAMP, Pi, protein kinase A with regulatory (R) and catalytic domains, protein, response)
cAMP as a Chemoattractant
Another role for cAMP is as a chemoattractant in cell
chemotaxis, in which amoeboid cells, such as those of
the amoeba Dictyostelium discoideum, move toward
increasing concentrations of extracellular cAMP.
Under growth conditions, these amoeba cells track
down and phagocytose bacteria; but when starved
they move toward secreted cAMP, form aggregates,
and differentiate into stalk and spore cells. The attractant is detected by its binding to the serpentine GPCR
cAR1, with the signal propagated through the βγ subunits rather than the α subunit of the associated
G-protein, eliciting rapid and transient increases in the
second messengers cAMP, cGMP (guanosine 3′,5′-monophosphate), IP3 (inositol 1,4,5-trisphosphate),
and Ca2+. However, IP3/Ca2+ signaling does not
appear to be required for chemotaxis. cGMP stimulates myosin fiber assembly, probably via a cGMP-dependent protein kinase that activates myosin II
kinase, and reorganization of the actin cytoskeleton.
These events allow the cell to throw out a pseudopod
containing F-actin toward the cAMP source. Analogous systems are operative in humans, such as the movement of leukocytes toward chemokine attractants.
Further Reading
Campbell Model
A Campbell
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0155
Integration by Site-Specific
Recombination
The validity of the model for lambda integration has
been rigorously tested. On entering an infected cell, the
lambda genome circularizes by annealing and ligation
of complementary `sticky ends' (projecting 12 bp
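Figure 1 (diagram not recovered; surviving labels show the shared segment 1 2 3 flanked by the markers ab, cd, wx, and yz in the combinations wx 123 cd, ab 123 cd, wx 123 yz, and ab 123 yz)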
the inserted F. Such abnormal elements (called specialized transducing phages and F′ plasmids, respectively) can integrate into the bacterial chromosome by
general recombination using the homology provided
by the DNA they have picked up from the host.
Besides such natural processes, integration by
homology has been used extensively with genetically
engineered constructs where a host gene is cloned into
a phage or plasmid vector. As implied by Figure 1, the
resulting integrant has two copies of the cloned segment, in direct orientation, flanking one copy of vector DNA. Where these two copies differ by mutation,
excision by general recombination can generate a
cell with alleles originally present in the cloned insert
or a vector carrying alleles originally present in the
host. Such swaps will occur for alleles at position 2
(Figure 1) if the integrating crossover occurs between
1 and 2 and excision between 2 and 3.
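The allele swap described above can be traced with a small sketch. It assumes a deliberately simplified picture: the homologous segment is a list of three marker positions, the suffixes "h" (host) and "v" (vector-borne clone) are hypothetical labels, and each crossover is a single reciprocal exchange at an interval index (1 means between positions 1 and 2, 2 means between positions 2 and 3). The function names are illustrative only.

```python
def integrate(host, cloned, k):
    """Single crossover at interval k between the host segment and the cloned copy.

    Returns the two direct repeats that flank the vector DNA in the integrant:
    the left copy starts in host DNA and switches to cloned DNA at k,
    the right copy starts in cloned DNA and switches back to host DNA at k.
    """
    left_copy = host[:k] + cloned[k:]
    right_copy = cloned[:k] + host[k:]
    return left_copy, right_copy


def excise(left_copy, right_copy, m):
    """Single crossover at interval m between the two direct repeats.

    Returns the segment left in the chromosome and the segment carried
    away on the excised vector.
    """
    chromosome = left_copy[:m] + right_copy[m:]
    excised_insert = right_copy[:m] + left_copy[m:]
    return chromosome, excised_insert


host = ["1h", "2h", "3h"]      # alleles originally in the host
cloned = ["1v", "2v", "3v"]    # alleles originally in the cloned insert

# Integrate between positions 1 and 2, then excise between positions 2 and 3.
left, right = integrate(host, cloned, 1)
chromosome, insert = excise(left, right, 2)
print(chromosome)  # ['1h', '2v', '3h']
print(insert)      # ['1v', '2h', '3v']
```

When integration and excision use the same interval, the sketch returns the original host and cloned segments unchanged, which matches the expectation that only crossovers on opposite sides of a marker exchange its alleles.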
References
Cancer Susceptibility
L Mulligan
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1554
Low-Penetrance Cancer-Susceptibility
Alleles
Although Mendelian inheritance of mutations in
cancer-susceptibility genes is associated with a subset
of human cancer, there are also mutations and variants
in the genome that do not confer this type of strong
predisposition yet still increase cancer risk, either
generally or specifically for one tumor type. These
variants may identify additional risk factors associated
with known cancer-related genes or may identify
genes which have less obvious or direct roles in tumorigenesis and yet may also contribute to the cancer
phenotype. Recent studies show that low-penetrance
mutant alleles of some tumor suppressor genes, as well
as the better known high-penetrance mutations, can
contribute to cancer incidence. These low-penetrance
alleles confer a significantly increased risk of sporadic
tumors. For example, the I1307K allele of the adenomatous polyposis coli (APC) gene confers a twofold
risk for sporadic colon carcinoma as well as its hereditary role in familial adenomatous polyposis. These
low-penetrance alleles are inherited in the same way
as the higher-penetrance mutations; however, only a
proportion of those who inherit these mutations will
develop tumors. Thus, the typical autosomal dominant cancer predisposition inheritance pattern that
we see with highly penetrant alleles is not obvious,
and the few tumor cases observed in a family can be
mistakenly interpreted as sporadic cases. Mutations
of these genes may also contribute to tumors in a
dosage-dependent fashion, such that the degree of cancer susceptibility is dependent on the number of functional copies of the gene. For example, carriers of
mutations of the ataxia-telangiectasia mutated (ATM)
gene have a 3.5- to 5-fold relative risk of developing
breast cancer as compared to the total population.
A number of cancer-susceptibility genes confer
increased relative risk of specific tumor types when
coupled with environmental factors. For example,
women with variant alleles of CYP1A1 or CYP2E1
who are also cigarette smokers have a significantly
increased risk of breast cancer, probably due to
decreased ability of these specific alleles to metabolize
carcinogens. More significantly, individuals with
mutations of proteins required for DNA repair, such
as those involved in xeroderma pigmentosum, are
unable to repair DNA damage caused by environmental factors such as UV light or other mutagens and
have an accumulation of DNA damage which contributes to a high incidence of skin tumors.
Summary
It seems likely that the risk of developing cancer may
be dependent on a few major genetic effects and multiple low-penetrance alleles, potentially in combination
with other environmental risks in a given individual.
Each of these effects does not act in isolation but
forms part of the individual's cumulative risk. Thus,
inherited cancer susceptibility may be much higher
than the 5–10% estimate associated with familial
forms of cancer and may reflect a significant subset
of what we currently think of as sporadic tumor cases.
See also: Breast Cancer; Carcinogens; Oncogenes;
Proto-Oncogene; Tumor Suppressor Genes
Candidate Gene
A Long
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1416
Canine Genetics
E A Ostrander and R K Wayne
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0374
Selective breeding over many generations has produced more than 300 distinct breeds of domestic dog
worldwide, each defined by specific physical and
behavioral characteristics. As a result, modern breeds
of dogs display a range of morphological and behavioral variation that is unique among mammals and
results almost exclusively from genetic causes.
Despite extraordinary phenotypic variation, however,
all breeds of dog belong to a single species (Canis
familiaris) and crosses between most breeds produce
fertile offspring.
Purebred dogs, by definition, are segregated into small inbreeding subpopulations which are subject
to intense selection to meet specific and rigid criteria
for performance and physique, as defined by breed
standards. The combination of founder effects, genetic drift in small populations, inbreeding, and positive
selection means that dogs of any one breed, even
those from apparently unrelated families, will share
homogeneity at some places in the genome. The
combination of interbreed variation and intrabreed
homogeneity offers geneticists a rare opportunity to
delve into the genetics of mammalian morphology,
behavior, and disease.
Evolution of Dogs
The domestic dog is a single species in the family
Canidae, order Carnivora, and superfamily Canoidea
that includes seals, bears, weasel, and raccoon-like
carnivores. Although all dogs appear to have been
derived from the gray wolf, origination or interbreeding events may have occurred several times over
human history. Theories of dog origins range from
those maintaining that dogs originated once from a
limited founding pool to those suggesting multiple
origins, from possibly more than one species, over
the course of human history.
The number of domestication events that led to the modern
dog, and their timing and location, are somewhat controversial. The archeological
record suggests that the first domestic dogs were
found in the Middle East about 14 000 years ago.
However, very old fossil remains found in North
America and Europe suggest that dogs are closest
phenotypically to Chinese wolves. The phenotypic
plasticity of dogs is a problem when attempting reconstructions of their origin. Some dogs resemble closely
Wolf–Dog Hybridization
Even today, wolves continue to influence the genetic
diversity of dogs. In the US alone, there are currently
over 100 000 wolf–dog hybrids. These dogs are popular among some individuals because of their appearance and because of their attributes as protectors.
Wolf–dog hybrids are frequently interbred with purebred dogs, particularly Nordic breeds and German
shepherd. As such interbreeding occurs, it is generally
assumed that the resulting progeny will have a lower
proportion of wolf genes, and thus be more docile.
Unfortunately, so little is known about the genetics of
behavior in mammals, that while this is likely true at
the population level, i.e., dilution of wolf–dog hybrid
genes by mating with domestic dogs will likely produce more docile dogs, it is difficult to predict what
the progeny of any single wolf–dog hybrid cross will
be like.
As a result of such crosses, wolf genes continue to
diffuse into the general dog population. It is interesting to note, however, that dog genes may also be
influencing the genetic composition of wild wolves.
In Italy and Spain, for instance, gray wolves occasionally will interact and even interbreed with semiferal populations of domestic dogs. Such matings
can threaten the genetic integrity of wild wolf populations and are a major concern of conservation
geneticists.
Figure 1 Neighbor-joining relationship tree of wolf (W) and dog (D) control region sequences (16). Dog
haplotypes are grouped in four clades, I to IV. Boxes indicate haplotypes found in the 19 Xolos (25). Haplotypes found
in two Chinese crested dogs are indicated with a black circle. Bold characters indicate haplotypes found in New
World wolves (W20 to W25). (Reproduced with permission from ``Origin, genetic diversity and genome structure
of the domestic dog,'' Wayne RK and Ostrander EA. BioEssays. Copyright 1999, Wiley-Liss, Inc., a subsidiary of
John Wiley & Sons, Inc.)
Further Reading
References
Cap
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1781
Concluding Remarks
Genetic analysis of the domestic dog offers a unique
opportunity for genetic dissection of a wide variety
of mammalian traits. The high incidence of genetic
disease within specific dog breeds as well as the
availability of multigeneration genealogies, coupled
with the recent availability of a canine genetic map,
now make the dog a tangible and attractive genetic
model. Together with population level and evolutionary studies, the dog is likely to become one of the
genetically best-defined domestic species in the coming years.
CAP (CRP)
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1782
Capsid
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1783
Carcinogens
R C Moschel
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0158
Ionizing Radiation
Examples of ionizing radiation include α particles,
X-rays, and γ-rays. Most human exposure to ionizing
radiation is the result of exposure to cosmic rays,
environmental radioactivity in the form of radioisotopes, medical radiography, and radon gas. Radon
exposure results from the radioactive decay of uranium in soil to radium which, in turn, decomposes to a
gas that can collect in habitable structures. Radon gas
is now regarded as one of the principal sources of
radiation affecting the US population. There are considerable epidemiological data linking radiation exposure to human cancers. For example, lung cancer was
common among the first uranium miners, who regularly breathed in large amounts of radon gas, and skin
cancer was common among early X-ray workers.
More recently, a high incidence of leukemia and a
variety of solid tumors have been observed among
survivors of the atomic bombing of Hiroshima and
Nagasaki during World War II. Although the detailed
mechanism for carcinogenesis by radiation exposure
has yet to be established, it is well known that radiation is mutagenic and readily produces chromosomal
translocations and deletions via the formation of
DNA double-strand breaks. These chromosomal
anomalies may lead to oncogene activation or suppressor gene deletion, thereby initiating carcinogenesis.
Ultraviolet Radiation
Skin cancers are the most common form of human
cancer. Early carcinogenesis experiments involving
mice and rats demonstrated that ultraviolet (UV)
radiation from sunlight was responsible for skin cancer in these animals. The shorter wavelength UV-B
region of ultraviolet light (280–320 nm) was shown
to be a more effective inducer of carcinogenesis than
the UV-A region (320–400 nm), although the latter
region is also carcinogenic at higher doses over
extended times. It is now widely accepted that
repeated exposure to UV radiation from sunlight is
responsible for most nonmelanoma human skin cancers, and it contributes to the onset of melanoma as
well. UV radiation produces cyclobutane-type pyrimidine dimers and other pyrimidine–pyrimidine and
pyrimidine–purine photoproducts in DNA. A failure
to repair these lesions may result in base pair substitution mutations that can inactivate suppressor genes
(e.g., p53), resulting in carcinogenesis.
Viruses
Some of the earliest experiments in viral carcinogenesis during the first half of the twentieth century
demonstrated that avian leukemia and an avian sarcoma were transmissible diseases. Later investigations
identified viruses as the agents responsible for this
transmissibility. Viruses were also identified as the
agents responsible for causing fibrous tumors and
benign papillomas in rabbits, and later studies led to
the discovery of murine leukemia viruses and the
feline leukemia virus. The involvement of viruses in
causing human cancer has only recently been established and has thus far been limited to human T-cell
leukemia, Burkitt's lymphoma, and nasopharyngeal
cancer. Adult T-cell leukemia, which is endemic to
Japan, the Caribbean, and parts of Africa, is caused
by the human T-cell leukemia virus type I (HTLV-1), a
human retrovirus. The Epstein–Barr virus (EBV), a
double-stranded DNA virus and a member of the
herpesvirus family, was shown to be responsible for
Burkitt's lymphoma, particularly among equatorial-belt East Africans. EBV is also linked to the occurrence of nasopharyngeal carcinoma in China as well as
in areas of Africa. Genital tract carcinomas and some
upper airway and oral cancers are more loosely associated with some human papillomaviruses, which are
also DNA viruses. Additionally, there is epidemiological evidence suggesting a strong role for the hepatitis B virus (another DNA virus) in the etiology of
human hepatocellular carcinoma; however, it is not
yet regarded as the sole causative agent.
The mechanism of carcinogenesis by human retroviruses involves the transcription of viral RNA (by
reverse transcriptase) into a complementary DNA.
This DNA is then converted to a double-stranded
DNA provirus that integrates into the host cell's
genome. In the case of DNA viruses, the viral DNA
Chemicals
Chemicals constitute the most diverse group of carcinogens. Hundreds of chemicals are known to be
carcinogenic/tumorigenic in animals. A carcinogen is
termed genotoxic if it covalently binds to cellular
DNA. If unrepaired, the damaged DNA may cause
mutations by inducing the misincorporation of bases
during DNA replication. Genotoxic carcinogens may
be either direct-acting (ultimately reactive toward
DNA from the outset) or they may require metabolic
activation to become reactive toward DNA (indirect-acting carcinogens). Examples of direct-acting carcinogens include alkyl or aryl epoxides, nitrosoureas,
nitrosamides, and certain sulfonate and sulfate esters.
Examples of indirect-acting carcinogens include polycyclic aromatic hydrocarbons, aromatic amines, alkyl
nitrosamines, and aflatoxin B1. Most chemical carcinogens require metabolic activation to elicit a tumorigenic response.
Epigenetic carcinogens are carcinogens that do not
damage DNA directly; however, they may enhance
tumorigenesis by a variety of mechanisms. Epigenetic
carcinogens may induce the generation of activating
enzymes that metabolize carcinogens to DNA reactive forms or they may inhibit beneficial detoxifying
reactions that convert procarcinogens to excretable
forms that are not DNA reactive. Epigenetic carcinogens may also inhibit the repair of damaged DNA or
serve as promoters. Promoters are agents that are not
directly reactive toward DNA or mutagenic but
instead stimulate the growth and division of cells
that may have already sustained the genetic damage
that predisposes them to become tumorigenic.
Cigarette smoking poses the greatest chemical risk
for causing cancer in humans. Cancers linked to cigarette smoking include those of the lung, larynx, mouth,
pharynx, esophagus, bladder, and pancreas. This stems
from the fact that a large number of carcinogens have
been identified in cigarette smoke. However, examples
of carcinogenic chemicals are also found among agricultural chemicals (e.g., pesticides, herbicides, and fungicides), industrial chemicals (e.g., aromatic amines,
vinyl chloride, benzene, and chromium compounds),
atmospheric pollutants (e.g., polycyclic aromatic
Further Reading
Carcinoid Tumors
P S Hasleton and N A Elgerdy
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1556
Most organ systems in the body contain neuroendocrine cells as part of a diffuse endocrine system
(Gosney, 1992). These cells often respond to injuries
or disease. For example, diffuse idiopathic pulmonary
neuroendocrine hyperplasia (DIPNECH) is a rare
clinicopathologic syndrome of the lung. It is seen in
a setting of obstructive pulmonary disease, usually
obliterative bronchiolitis, with no interstitial lung disease. The neuroendocrine hyperplasia is confined to
airways, usually bronchioles and alveolar walls. In
some conditions, such as bronchiectasis, there may
be focal proliferation of neuroendocrine cells, known
as tumorlets. The distinction between a tumorlet and a
carcinoid tumor is entirely arbitrary: a tumorlet is
less than 0.5 cm in diameter, and a carcinoid tumor is
0.5 cm or greater. Similarly, the stomach can show
multiple neuroendocrine lesions, ranging from the
size of tumorlets through to carcinoids.
Functional Significance of
Neuroendocrine Tumors
Carcinoid tumors are known to medical students for
the carcinoid syndrome. This consists of flushing,
p53
Inactivation of tumor suppressor genes through inhibition of their protein products (p53, retinoblastoma
gene, CdK-I-P16) removes important regulatory constraints in the cell cycle at the G1 restriction point. The p53
transcription factor lies on the same pathway as
the retinoblastoma gene (Rb) in regulating G1 arrest and,
on a pathway independent of Rb, regulates apoptosis. Thus p53 inactivation could contribute to accelerated growth of tumor tissue by increasing the rate of
cell division as well as allowing escape from apoptosis.
p53 mutation or stabilization is absent in typical
carcinoids. Atypical carcinoids, which showed focal
(less than 10%) or patchy p53 positivity, were more
aggressive and had significantly shorter survival times
than those without p53 staining (Brambilla and
Brambilla, 1999).
References
Cyclin
Retinoblastoma Gene
Telomeres
Carcinoma
See: Basal Cell Carcinoma
Carrier
E H Simon
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0160
Frequency
An outbred population contains an enormous amount
of heterozygosity. Just consider the amount of diversity evident in a single family. Many genes are polymorphic. If the alleles are of nearly equal selective
Detection
Several means to determine whether a person is carrying a mutant gene are available. An early approach was
to examine the level of a gene product. For example,
Tay–Sachs disease is characterized by a hexosaminidase deficiency. Carriers of the defective gene have a
significantly lower level of this enzyme in their blood
than individuals homozygous for the normal allele and can be identified accurately
and inexpensively. This method, however, has limited
applicability. Newer methods test directly for mutant
DNA sequences. As an example, consider the following scenarios. (1) A young man has a living relative
with cystic fibrosis (CF); he is contemplating marriage
to a woman who `may' have a similar relative in her
family. Therefore he is anxious to know if he is a carrier
of a mutant CF gene. Since the relative is available for
testing, and the sequence of the CF gene is known, several techniques can be used to determine if the man has
inherited the same mutant gene. If he has, the woman
can be tested to see if she is carrying the same one. (2)
Cassette Model
I Herskowitz
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0162
which acts on a receptor (the a-factor receptor) present on the surface of α cells. Both a and α cells contain
the genes for production of a-factor and α-factor, and
they also contain the genes for the a-factor and α-factor receptors. Why is it that α cells produce only
the α-specific products, α-factor and the a-factor
receptor, and why is it that a cells produce only the
a-specific products, a-factor and the α-factor receptor? The answer lies in the mating type locus.
The mating type locus (MAT) is located at a particular position on chromosome 3 (of yeast's 16 chromosomes) and has two different forms (or ``alleles''):
MATa, which programs the a cell type, and MATα,
which programs the α cell type (Figure 1). MATα
codes for two transcriptional regulatory proteins: α1
is an activator protein, which turns on transcription of
α-specific genes such as the genes for α-factor and the
a-factor receptor; α2 is a repressor protein, which
turns off transcription of a-specific genes such as the
genes for a-factor and the α-factor receptor. Thus, in
an α cell, the appropriate genes are turned on and the
inappropriate genes are turned off, and the cell mates
as an α.
In an a cell, the α-specific genes are not expressed
because α1 is absent; the a-specific genes are expressed because α2 is absent. Thus, in an a cell, the appropriate genes are on and the inappropriate genes are off,
and the cell mates as an a.
In an a/α cell, which does not mate, neither the
a-specific nor the α-specific genes are expressed.
This results because α2 turns off the a-specific genes
(as it does in the α cell). α-specific genes are not
expressed because a/α cells possess a novel repressor
protein formed by association between α2 and the
a1 protein encoded by the MATa allele. This novel
repressor, a1–α2, turns off synthesis of α1 and other
genes involved in mating. Thus, an a/α cell does
not mate because the genes required for mating are
turned off. a/α cells are able to sporulate when they are
nutritionally starved because the repressor a1–α2
turns off synthesis of an inhibitor of sporulation,
which is produced in a cells and α cells but not in a/α
cells.
α2 and a1–α2 are both homeodomain proteins, a
large group of proteins that is involved in cell specialization in fruit flies, nematodes, mice, and humans.
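The regulatory logic of the mating type locus described above can be written as a few Boolean rules. The following is only a minimal sketch under stated assumptions: the allele labels "a" and "alpha", the function name, and the dictionary keys are hypothetical, and "mating_genes" stands in loosely for the other genes involved in mating that the a1–α2 repressor turns off.

```python
def cell_type_program(mat_alleles):
    """Return the gene sets expressed for the MAT alleles present in a cell.

    mat_alleles: iterable holding "a", "alpha", or both (the a/alpha diploid).
    """
    alleles = set(mat_alleles)
    has_a = "a" in alleles
    has_alpha = "alpha" in alleles

    alpha2 = has_alpha                     # repressor of a-specific genes
    a1_alpha2 = has_a and has_alpha        # repressor formed only in a/alpha cells
    alpha1 = has_alpha and not a1_alpha2   # activator; its synthesis is shut off by a1-alpha2

    a_specific = not alpha2                # on unless alpha2 is present
    alpha_specific = alpha1                # on only when alpha1 is made
    mating_genes = not a1_alpha2           # off in a/alpha cells

    if not mating_genes:
        mates_as = None                    # a/alpha cells do not mate
    elif alpha_specific:
        mates_as = "alpha"
    else:
        mates_as = "a"

    return {
        "a_specific": a_specific,
        "alpha_specific": alpha_specific,
        "mating_genes": mating_genes,
        "mates_as": mates_as,
        # a1-alpha2 removes the sporulation inhibitor, so starved a/alpha cells can sporulate
        "can_sporulate_when_starved": a1_alpha2,
    }


for genotype in (["a"], ["alpha"], ["a", "alpha"]):
    print(genotype, cell_type_program(genotype))
```

The sketch treats each regulator as simply present or absent, which is enough to reproduce the three cell types described in the text but ignores expression levels and timing.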
(Figure residue; diagrams not recovered. Surviving labels: α-specific, a-specific, and haploid-specific genes in α, a, and a/α cells under control of MATα and MATa products α1, α2, and a1; the mating pheromone response pathway with receptor STE2, G protein subunits GPA1, STE4, and STE18, MAP kinase module, transcription factor STE12, cyclin-dependent kinase inhibitor FAR1, CDC28-CLN1,2, and cell cycle arrest in G1; silent and MAT cassettes.)
Further Reading
Chu S, DeRisi J, Eisen M et al. (1998) The transcriptional program of sporulation in budding yeast. Science 282: 699–705.
[erratum: Science 282: 1421.]
Heiman MG and Walter P (2000) Prm1p, a pheromone-regulated multispanning membrane protein, facilitates plasma
membrane fusion during yeast mating. Journal of Cell Biology
151: 719–730.
Cassette Mutagenesis
See: In vitro Mutagenesis
Castle, William E.
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0164
The catabolite gene activator protein (CAP) is a pleiotropic effector for the expression of hundreds of
References
Properties of CAP
Shortly after its isolation it was found that CAP was
a dimer composed of identical subunits, each with a
molecular weight of 22 000. CAP binds to DNA and
this binding is greatly stimulated in the presence of
Catabolite Repression
J W Lengeler
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0167
Figure 1 Carbon catabolite repression in enteric bacteria. The phosphoenolpyruvate (PEP)-dependent glucose
(Glc): phosphotransferase system (PTS) is shown with its general proteins enzyme I (EI), a PEP-dependent histidine
protein kinase, and histidine protein (HPr), the specific proteins IIA(Crr) (also called IIA(Glc)), IIB(Glc), and the transporter
IIC(Glc), as well as various reversible phosphate transfer reactions (solid lines). Under famine conditions, P~IIA(Crr)
activates the adenylate cyclase (CyaA) which converts ATP to 3′,5′-cyclic adenosine monophosphate (cAMP). This
alarmone (second messenger) in a complex with the global activator CrpA enhances transcription of the crpA-modulon and synthesis of all catabolic enzymes (broken arrows, positive signs). During growth on glucose and under
feast conditions which cause a high pyruvate:PEP ratio and dephosphorylation of the PTS proteins, IIA(Crr) inhibits
transport and metabolism of non-PTS carbohydrates, e.g., lactose and glycerol (broken arrows, negative signs) causing
inducer exclusion, and elicits catabolite repression through the failure to activate the crpA-modulon.
Figure 2 Carbon catabolite repression in Bacillus subtilis. The glucose-PTS is shown as in Figure 1. Under feast
conditions, e.g., during growth on glucose, glycolytic intermediates accumulate that activate an ATP-dependent serine
protein kinase (HprK). This kinase phosphorylates at a serine residue free HPr, i.e., molecules that are not
phosphorylated at a histidine residue. Thus activated, HPr Ser-P complexes to the global repressor CcpA which
under famine conditions is inactive; the complex represses the ccpA modulon and the synthesis of (most) catabolic
enzymes, thus causing carbon catabolite repression. A putative HPr-phosphatase (HprP) ends the process and
generates free HPr.
Further Reading
Gancedo JM (1998) Yeast carbon catabolite repression. Microbiology and Molecular Biology Reviews 62: 334–361.
Özkan S and Johnson M (1999) Function and regulation of yeast
hexose transporters. Microbiology and Molecular Biology
Reviews 63: 554–569.
Postma PW, Lengeler JW and Jacobson GR (1996) Phosphoenolpyruvate:carbohydrate phosphotransferase systems. In:
Neidhardt FC, Curtiss R III, Ingraham JL, Lin ECC,
Low KB and Magasanik B (eds) Escherichia coli and Salmonella:
Cellular and Molecular Biology, pp. 1149–1174. Washington,
DC: American Society for Microbiology Press.
Reizer J, Hoischen C, Titgemeyer F et al. (1998) A novel protein
kinase that controls carbon catabolite repression in bacteria.
Molecular Microbiology 27: 1157–1169.
Cats
See: Feline Genetics
Cattanach's Translocation
B M Cattanach
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0168
Chromosomal Breakpoints
On the physical map, the central chromosome 7 region
is inserted into the X chromosome at the junction of
Giemsa bands XF1 and 2. On the linkage map, it lies
closely distal to the mottled (Mo) locus. No loss from
the X chromosome has been detected. As the insertion
comprises about one-third of chromosome 7, the
rearranged X is the longest in the chromosome complement (14% longer than the longest normal chromosome, chromosome 1). It therefore provides a good
cytogenetic marker. The proximal and distal chromosome 7 breakpoints lie at the junction of G bands
7B-C and 7E2 on the physical map, and between the
ruby-2 (ru2) and quivering (qv) loci, and the shaker-1 (sh1)
and hemoglobin β-chain (Hbb) loci, respectively, on the linkage map.
Although at pachytene of male meiosis the central
region of the normal chromosome 7 is regularly seen to
assume a loop formation and pair homologously with
the insertion within the X, at diakinesis multivalent
associations are seen with a frequency of less than 10%,
which is in accord with the rarity of anaphase bridges.
Inheritance
Two forms of the translocation exist, the original
balanced form (Type I) and its unbalanced derivative
(Type II). The latter has three copies of the central
chromosome 7 region (two in normal chromosome 7s
and one in the X). Type I females are fertile, but the
equivalent male is commonly sterile on genetic backgrounds other than that of its origin. Type II females and
males are almost invariably fertile, although the females
are prone to have imperforate vaginas. The other unbalanced derivative, which is deficient for the central
region of chromosome 7, dies early in development.
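The dosage reasoning behind these karyotype classes can be tallied in a few lines. The sketch below is illustrative only. It assumes, as the text implies but does not state outright, that a balanced carrier has one intact chromosome 7 plus one chromosome 7 lacking the central region; the function name and class labels are hypothetical and are not standard nomenclature.

```python
# Illustrative tally of copies of the central chromosome 7 region.
# Assumption: a karyotype is summarized by the number of intact
# chromosome 7s and by whether the rearranged X (carrying the
# insertion) is present. All names here are hypothetical labels.

def central_region_copies(intact_chr7: int, has_rearranged_x: bool) -> int:
    """Count copies of the central chromosome 7 region in one animal."""
    return intact_chr7 + (1 if has_rearranged_x else 0)

karyotypes = {
    "Type I (balanced)": central_region_copies(1, True),
    "Type II (unbalanced, duplication)": central_region_copies(2, True),
    "Unbalanced, deficiency": central_region_copies(1, False),
    "Chromosomally normal": central_region_copies(2, False),
}

for name, copies in karyotypes.items():
    print(f"{name}: {copies} copies")
```

Under these assumptions the Type II class carries three copies, matching the text, while the deficiency class carries only one.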
Phenotypes
The wild-type alleles of the coat and eye color genes,
ru2, p and c lie within the insertion. These are liable to
be inactivated when the rearranged X is inactivated in
the normal process of X-inactivation. When recessive
alleles at one or more of the three loci are present on
the normal chromosome 7(s), variegated coat and eye
colors may be seen. In Type I females the variegating
coat color is that of the hemizygote for alleles/genes on
the single chromosome 7, and with the Type II females
it is that of the compound of alleles/genes on the two
normal chromosome 7s. The difference provides one
means of distinguishing Type I and Type II females. In
males, both the Type I and Type II classes, having only a
single active X, are phenotypically wild-type. They can
be distinguished from each other with some reliability
by the fact that the latter become growth retarded after
birth and have a reduced viability which is genetic background dependent. Chromosome 7 markers can allow
the distinction of chromosomally normal segregants.
Uses of Translocation
The translocation has been extensively used in diverse
genetic and cytogenetic studies. In addition to providing the first recorded examples of the XXY condition
in the mouse, selection studies upon the levels of variegation ultimately led to the recognition of the Xce
locus and its control of the randomness of X inactivation. Xce effects have been investigated in both fetal
and placental tissues using the translocation. The translocation has also been used in (1) diverse X-inactivation
studies, (2) comparisons of X-inactivation and chimera-based variegation, (3) biochemical studies of gene
dosage at the c locus, (4) creation of flow-sorted X chromosome libraries, (5) duplication mapping, (6) eye
pigmentation studies, and (7) investigations of the influence of pigmentation on the retinofugal pathways. As a
long marker chromosome, it has also been used in (8) cytogenetic studies of X inactivation and (9) investigations
of the single-cell origin of induced tumors.
More recently, the translocation has been used to generate maternal duplication/paternal deficiency for the
central region of chromosome 7, which creates a mouse
model of the human imprinting condition Prader–Willi
syndrome. Currently, unbalanced (Type II)
animals are being used to investigate the basis of autism
found in humans with additional copies of the homologous region.
cDNA
Y Kohara
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1412
Complementary DNA (cDNA) is the DNA produced on an RNA template by the action of reverse transcriptase.
Cell Culture
See: Tissue Culture
Cell Cycle
D Lew
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1557
In the first half of the nineteenth century, accumulating evidence from microscopic observations led to the
recognition of cells as the fundamental building blocks
of plant and animal tissues, and raised the question
of how new cells are made. Pioneering microscopists
witnessed the birth of new cells from the division
of pre-existing cells, and repeated observation of cell
division in many tissues firmly established that all
cells arise by the division of parental cells. Indeed,
over 150 years later, our appreciation of the complexity of cell organization makes it inconceivable that
cells could arise in any other way. The means whereby
self-replicating cells first evolved in the primeval soup
remains one of the deepest mysteries in the origin of
life, and even with our rapidly increasing technology
and knowledge about what cells are made of, the goal
Interphase
The polymerases are loaded onto DNA at specialized sites on the chromosomes called origins of replication. Once loaded, polymerases start copying DNA
moving away from the origins in both directions, and
keep going until they meet a polymerase coming the
other way or until they reach the end of the chromosome. Some origins initiate replication early in S-phase,
while others initiate replication later in S-phase, but a
single origin never initiates replication more than once
during a particular S-phase. This behavior helps to
guarantee that polymerases don't start rereplicating
DNA, and raises the question of how an origin
`knows' that it should not start another round of
replication. In addition, how is it that the same origin
is once again allowed to initiate replication in the next
S-phase? Furthermore, why is it that origins do not
initiate replication during other parts of the cell cycle
than S-phase? One of the most satisfying advances in
cell cycle research during the past decade has been the
discovery of how cells ensure that all parts of every
chromosome are copied once and only once in each
cell cycle (see below).
The microtubule organizing center must also be
precisely duplicated once per cell cycle. Microtubules
are cytoskeletal polymers (long hollow fibers of
stacked protein subunits) that help to determine cell
shape during interphase and that reorganize during
mitosis to make a remarkable apparatus that segregates the duplicated chromosomes to opposite ends
of the cell (see below). The microtubule organizing
center (either a centrosome or a spindle pole body
depending on the species) nucleates polymerization
of microtubules, and hence determines the spatial distribution of these fibers within the cell. Newborn cells
have a single microtubule organizing center that is
duplicated at a characteristic time (frequently at the
beginning of S-phase) during interphase. It appears
that common signals trigger both DNA replication
and duplication of the microtubule organizing center,
although the detailed mechanism underlying this
duplication remains mysterious.
Duplication of other components of the cell need
not be as exact as it must be for DNA, perhaps
suggesting that duplication of these components is
more or less automatic, requiring little in the way of
sophisticated coordination. Nevertheless, it is astonishing that each of the cell's many components is
(approximately) doubled during each interphase. Studies on yeast cells suggest that some of these duplication events may be linked: if delivery of new
membrane to the cell surface (needed to double the
amount of plasma membrane) is blocked, then a signal
is sent to halt production of new ribosomes (the cytoplasmic factories that synthesize proteins). This may
represent just the tip of the iceberg of self-monitoring
Mitosis
Cytokinesis
Cytokinesis, the physical division of one parent cell into two daughters, begins during or after chromosome
Figure 1 Mitosis. Photographs of rat kangaroo kidney cells at different stages of mitosis, showing the positions
of the chromosomes (seen by phase contrast microscopy; left panels) and microtubules (made visible by using
fluorescent antibodies to decorate the fibers; right panels). In the top cell, the microtubule organizing centers
have moved apart and the chromosomes are starting to condense within the nucleus at the beginning of mitosis.
In the middle cell, the nuclear envelope has disassembled and microtubules from the two poles have contacted
the kinetochores of most of the sister chromatids, whereas in the bottom cell the cohesion between the sisters
has been dissolved and the chromosomes are moving toward opposite poles of the spindle. Note how the
sister chromatid pairs in the middle cell are thicker than the segregating individual chromatids in the bottom cell. Also,
the distance between the poles is increasing (compare the middle and lower cells) as the poles move to opposite ends
of the cell. (Pictures kindly provided by Julie Canman and Ted Salmon, University of North Carolina, Chapel Hill, NC,
USA.)
[Figure 2 panel labels: (A) Mitotic chromosome; (B) Microtubule capture; (D) Chromosome segregation. Other labels: cohesion, microtubules, capture, Pole 1, Pole 2.]
Figure 2 Chromosome segregation. (A) Schematic of a fully condensed pair of sister chromatids during mitosis.
Because microtubule fibers are rigid they tend to assemble in straight lines or shallow curves, making it very unlikely
that microtubules from one pole will loop around to contact both sister kinetochores. (B) Chance encounters
between microtubules and kinetochores promote capture of the microtubule and formation of a stable linkage
between the chromosome and the spindle pole. (C) Eventually, chance encounters will lead to `bipolar' attachments in
which sister kinetochores grasp microtubules from opposite poles. This leads to pulling forces attempting to separate
the sister chromatids, generating tension at the kinetochores. (D) Once all of the chromosomes attain a bipolar
attachment, the cohesion that keeps sister chromatids together is dissolved and the chromosomes are pulled to
opposite poles as the poles also move apart to opposite ends of the cell. (Adapted with permission from a figure by
Bruce Nicklas, Duke University, NC, USA.)
in essence, these cells build a daughter cell (the bud)
adjacent to the parent, segregate components into the
bud, and form a septum at the neck once segregation is
complete. In this case, it is the spindle that must orient
along the predetermined mother–bud axis so that it is
perpendicular to the cleavage plane. Spindle orientation is achieved through pulling of the spindle pole
bodies by attached cytoplasmic microtubules that
interact with cortical cues established by the actin
cytoskeleton (which is polarized along the mother–bud
axis). Thus, in animal cells actin responds to spatial information from microtubules in the spindle,
while in yeast cells microtubules respond to spatial
information from the actin cytoskeleton.
Figure 3 Cytokinesis. (A) Cytokinesis in animal cells is illustrated with pictures of a sand dollar egg undergoing its
second cleavage. The spindle poles are separating in the left panel, and the cell surface is starting to invaginate in the
second panel, forming a cleavage furrow that ingresses until the cells are pinched in two. (Reproduced with
permission from Rappaport R (1996) Cytokinesis in Animal Cells. Cambridge: Cambridge University Press.)
(B) Cytokinesis in plant cells is illustrated with pictures of Tradescantia stamen hair cells. Chromosome segregation
is occurring in the left panel, and the beginning of a cell plate or septum can be seen forming in the middle of the cell
in the second panel. The cell plate then grows outward until it completely bisects the cell. (Phase contrast microscopy
pictures kindly provided by Aline H. Valster and Peter K. Hepler, University of Massachusetts, Amherst, MA, USA.)
[Figure labels without a surviving caption: panels (A) and (B); G1, G2.]
[Figure 5 diagram labels: (A) time axis; cyclin accumulation and cyclin destruction; cyclin/CDK active, CDK inactive; ubiquitin ligase active, ubiquitin ligase inactive. (B) kinase; phosphatase; cyclin/CDK (ON); phosphorylated cyclin/CDK (OFF); CDK phosphorylates kinase and phosphatase.]
Figure 5 The cyclin/CDK oscillator. (A) The abundance of mitotic cyclin in extracts from frog eggs rises
gradually and then collapses rapidly in a `sawtooth'
pattern. This is driven by the periodic activation of a
ubiquitin ligase (a protein complex whose components
were discovered through genetic screens in yeast) that
flags cyclin for destruction (see text). (B) Gradual cyclin
accumulation is converted into a delayed but sudden
activation of cyclin/CDK complexes as a result of a
regulatory `positive feedback loop' involving an inhibitory kinase (called Wee1) and an activating phosphatase
(called Cdc25), both of which were discovered through
genetic screens in yeast. As cyclin begins to accumulate,
cyclin/CDK complexes are phosphorylated (denoted by
the circled `P') and thereby inhibited. Although the balance
between inhibition by the kinase (arrow to the right) and
reactivation by the phosphatase (arrow to the left) is
biased toward inhibition, the gradual buildup of cyclin/
CDK complexes allows a few of those complexes to
become active. These begin to phosphorylate both the
kinase and the phosphatase, tilting the balance toward
more cyclin/CDK activation, which leads to more
phosphorylation of the kinase and phosphatase, tilting
the balance even further, and so on until all of the
cyclin/CDK complexes are active, all of the kinase is
inhibited, and all of the phosphatase is activated.
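The switch-like behavior described in panel (B) can be illustrated with a toy calculation. The sketch below is not taken from the source: the rate functions, their Hill-type form, and every constant are arbitrary assumptions chosen only to show how mutual regulation between active cyclin/CDK, an inhibitory kinase, and an activating phosphatase can turn a gradual rise in total cyclin into an abrupt rise in the active fraction.

```python
# Toy model of the Wee1/Cdc25 positive feedback loop (arbitrary units).
# All rate constants and functional forms are illustrative assumptions,
# not measured values and not the model used by the original author.

def cdc25_rate(active):
    """Activating phosphatase: stimulated by active cyclin/CDK."""
    return 0.1 + active**2 / (1.0 + active**2)

def wee1_rate(active):
    """Inhibitory kinase: suppressed by active cyclin/CDK."""
    return 0.1 + 1.0 / (1.0 + active**2)

def steady_state_active(total_cyclin, dt=0.05, steps=4000):
    """Relax active cyclin/CDK toward steady state by forward Euler,
    starting with every complex in the inhibited (phosphorylated) form."""
    active = 0.0
    for _ in range(steps):
        inactive = total_cyclin - active
        d_active = cdc25_rate(active) * inactive - wee1_rate(active) * active
        active += dt * d_active
    return active

cyclin_levels = [0.5 * i for i in range(1, 11)]  # 0.5, 1.0, ..., 5.0
fractions = [steady_state_active(c) / c for c in cyclin_levels]

for c, f in zip(cyclin_levels, fractions):
    print(f"total cyclin {c:.1f}: active fraction {f:.2f}")
```

With these assumed parameters the active fraction should stay low at small cyclin levels and become high at large ones, with most of the change concentrated in a narrow range. The calculation treats each cyclin level as a separate steady state and does not model cyclin destruction.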
Checkpoint Controls
Cell Determination
A Chisholm
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0170
In an early embryo a cell has the potential to generate many different cell types. During development
cells generally lose this potential (or `potency'), and
become restricted to making one or a few cell
types. This process by which cells become progressively restricted in their potency is referred to as determination.
Determination of a cell or tissue is an operational
concept, and is analyzed by experiments in which the
cell or tissue is isolated or placed in an abnormal environment. If the cell's fate does not change as a result
of the experiment, then the cell can be said to be
determined with respect to that manipulation. However, it is possible that other experiments could cause
alterations in the cell's fate. Thus, a cell cannot be said
to be absolutely `determined' but only determined
relative to experimental tests. Evidence that a cell is
not determined can also come from cell marking
(clonal analysis) experiments: if a marked cell gives
rise to multiple cell types in its progeny, then the
marked precursor cannot have been determined to
make any one cell type.
From analyses of cell fate determination in many
organisms, the following general rules have emerged.
First, determination is a gradual process, in which a
cell's potency is progressively restricted during development. Second, the `determined state' is heritable
through somatic cell divisions, an example of `cellular
memory.' Third, determination is usually but not
always irreversible; in some situations a cell can revert
to an apparently undetermined state, or can `transdetermine' to a different stable state.
Although determination is a multistep process, two
basic phases can be distinguished: an initial phase in
which a cell is specified to a particular developmental
pathway (`cell fate specification'), and a more extended process of commitment, in which the specification
is fixed and made largely irreversible. It is now well
established that cell fate specification in embryos can
involve both cell-autonomous mechanisms and inductive signals from a cell's surroundings. Combinations
of these influences result in progressive alterations in
the gene expression patterns of embryonic cells. The
later process of commitment is less well understood:
for example, it is unclear why the determined state is stable and
heritable, and why it is unstable in some situations.
Cells can become undetermined in special circumstances. In amphibian limb regeneration, cells lose
their differentiated characteristics and form a `regeneration blastema,' which can generate all the tissues
of a mature limb. Certain cultured cell lines behave as
if undetermined, such as the embryonic stem cells
(ES cells) used in generating transgenic mice. Germline cells are also exceptional in that they retain the
potency to generate an entire organism when they
combine to form a zygote.
The distinction between cell fate specification
and determination is exemplified by Drosophila genes
known as selector genes. Genetic analysis in Drosophila identified homeotic mutants, in which the fates
of certain body regions were altered. These homeotic
mutants defined the homeobox-containing selector
genes, which function to specify region-specific cell
fates. For example, cell fates in the third thoracic (T3)
segment of Drosophila are specified by the homeobox
gene Ultrabithorax (Ubx).
The specification of cells to the T3 identity occurs
during embryogenesis and involves the localized activation of Ubx by transcription factors that are transiently expressed in the embryo. Ubx expression is
activated in the future T3 segment and then persists
in these cells throughout development. If Ubx function is removed from cells later in development, they
lose their T3 identity and become transformed in fate,
indicating that Ubx activity is required continuously to
maintain the cells in their determined state. The stable
activation of Ubx in T3 cells and its stable repression
in other cells involves chromatin-associated proteins
required for the maintenance of active and inactive
states of gene expression. Thus, the stability of the
determined state may in part reflect stable patterns of
chromatin. In vertebrates, DNA methylation could
provide an additional heritable mechanism for stable
patterns of gene expression.
See also: Embryonic Stem Cells
Cell Division in Caenorhabditis elegans
K O'Connell
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0171
Cellular reproduction is a complex endeavor, encompassing many independent processes that are coordinately controlled. Classical and molecular genetic
studies, particularly those carried out in budding and
fission yeast, have identified many factors essential for
karyokinesis (nuclear division) and cytokinesis (division of the cytoplasm). Genetic, cytological, and ultrastructural analysis of these mutants has furthered our
knowledge of cell division by uncovering key regulatory mechanisms and intermediate steps in important
processes such as assembly of the mitotic spindle and
of the cytokinetic ring. While much of the work in
lower eukaryotes is universally relevant, the mechanisms of cell division in animal cells are likely to differ
in many important aspects, and thus, animal model
systems are necessary. This review discusses the use of
the nematode worm Caenorhabditis elegans for the
study of animal cell division.
Figure 1 Early embryonic development of Caenorhabditis elegans. (A) Pronuclear migration. The oocyte pronucleus
(o) travels toward the sperm pronucleus (s) at the posterior of the embryo and passes through a transient furrow at
mid egg length. (B, C) Alignment of the first spindle. After meeting at the posterior, the two pronuclei move to the
center where they undergo a 90° rotation to position the centrosomes, or future spindle poles (arrowheads), on the
anterior–posterior (AP) axis. (D, E) First cleavage. The spindle, visible as a clearing of cytoplasmic granules, initially
forms at the center of the cell but becomes eccentrically placed towards the posterior during anaphase. The ensuing
furrow divides the cell into a larger anterior cell (AB) and a smaller posterior cell P1. (F) Second division. AB divides
first and symmetrically with its spindle (poles denoted by arrowheads) perpendicular to the AP axis. Soon after, P1
divides asymmetrically along the AP axis; the positions of P1 centrosomes (indicated by arrows) predict the axis on
which the spindle will form. Anterior is to the left in all panels. Bar = 10 μm. (Reproduced with permission from
O'Connell et al., 1998.)
into the gonad of an adult hermaphrodite. Expression
of this gene in the offspring of the injected animal is
often silenced. Although the mechanism of RNAi is
unknown, the method is known to be specific and highly effective, in many cases producing a phenotype equal
to that of the strongest loss-of-function mutations.
The following sections illustrate how these approaches have been applied to study various cell division
processes in C. elegans.
Cytokinesis
Cell division ends with the process of cytokinesis in
which the mother cell is cleaved in half yielding two
cells, each containing one of the daughter nuclei.
Mutant analysis in C. elegans indicates that cytokinesis
may proceed in two distinct steps. The first step involves
furrow formation and requires cytoskeletal elements
that act to constrict the cell at the equator. This step is
defined genetically by mutations that block furrowing
activity, such as those in the cyk-3 gene. The second
step involves separation of daughter cells and is illustrated by mutations in the cyk-1 gene. In cyk-1
mutants, furrows form and ingress normally but
fail to complete. cyk-1 encodes an evolutionarily conserved protein that localizes to cytokinetic furrows where it may act to carry out this final step by
stabilizing the constriction until cleavage is complete.
Future Prospects
While cell division research in C. elegans is only in
its infancy, there is great potential for growth. With
the entire genomic DNA sequence nearly in hand,
the advent of powerful reverse genetic approaches,
and strong methodology for cytological studies, this
model organism offers an enormous opportunity to
learn more about basic cell division processes.
Cell Lineage
A D Chisholm
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0172
[Figure labels, caption not preserved: (A) Clonal analysis: X-rays or FLP/FRT; m and + alleles; mutant clone, wild-type clone. (B) Asymmetric, proliferative, stem-cell, diversifying.]
In animals displaying invariant cell lineages, the ancestry, environment, and fate of a cell are correlated. It
was often assumed that invariant cell lineages reflected
intrinsic (cell-autonomous) mechanisms of cell fate
determination (also known as the `mosaic' mode of
development), in which the fate of a cell is determined
only by its inheritance of factors segregated in ancestral cell divisions. However, lineage invariance is not
sufficient evidence for a lineage-intrinsic mechanism.
It is important to note that in an invariant cell lineage
both a cell's environment and its ancestry are correlated with its fate. Thus, cell fates could be specified by
reproducible cell–cell interactions rather than reproducible inheritance of intrinsic factors. To prove that
fates are specified autonomously, experiments in which
a cell is isolated or transplanted must be performed.
Although nematodes and ascidians both display invariant lineages, modern experiments have shown that
many aspects of development in these animals are
not cell-autonomously programmed, but instead rely
on invariant cell–cell interactions.
living specimens (reviewed by Sulston, 1988). In conjunction with maps of cell nuclei, the cell lineage
provides a complete fate map, and makes it possible
to analyze the results of experimental manipulations
and mutants with single-cell resolution.
The C. elegans zygote undergoes a series of asymmetric cell divisions to generate six blastomeres (AB,
MS, E, C, D, and P4), known as embryonic founder
cells (Figure 3A). Each founder cell is distinctive in
terms of its cell lineage pattern and the cell fates it
generates. For example, the zygote divides asymmetrically to form a larger anterior daughter denoted the
AB founder cell, which undergoes a set of initially
symmetrical divisions to generate neurons, muscle
cells, and some epidermal cells. Most cell proliferation
occurs during the first half of embryogenesis; a small
number of postembryonic blast cells divide in larval
development to generate neuronal and epidermal cells,
the gonad, and sexually dimorphic structures. During
the development of a C. elegans hermaphrodite 1090
somatic cells are generated, of which 131 undergo
programmed cell death, to yield an adult containing
959 somatic cell nuclei (the number of cells is lower
because some cells fuse to form multinucleate syncytia).
In C. elegans lineage studies each cell is given a
unique name reflecting its lineage history. Certain key
embryonic and postembryonic precursors are given arbitrary names (e.g., AB, Z1). Their progeny are named
by adding letters denoting the axis of the cell division
relative to the body axes (a/p for anterior/posterior,
etc.). Thus, Z1.ppp is the posterior daughter of the
posterior daughter of the posterior daughter of Z1.
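The naming rule above is mechanical enough to express as a short function. The sketch below is a minimal illustration, not an established tool: the function name is hypothetical, and the letters l/r and d/v are included as assumed expansions of the "etc." in the text, which spells out only a/p.

```python
# Expand a C. elegans lineage name such as "Z1.ppp" into plain English.
# Only a/p are spelled out in the text; l/r and d/v are assumed here.

AXIS_LETTERS = {
    "a": "anterior", "p": "posterior",
    "l": "left", "r": "right",
    "d": "dorsal", "v": "ventral",
}

def describe_lineage(name: str) -> str:
    """Read the division letters left to right, wrapping the description
    one generation at a time so the last division ends up outermost."""
    precursor, _, divisions = name.partition(".")
    description = precursor
    for letter in divisions:
        description = f"the {AXIS_LETTERS[letter]} daughter of {description}"
    return description

print(describe_lineage("Z1.ppp"))
print(describe_lineage("Z4.aaa"))
```

Because each letter records one division, the length of the suffix also gives the number of divisions separating a cell from its named precursor.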
[Figure 2 labels: vertical time axis in hours (0, 10, 20, 30, 40, 50); zygote; germline; gut; pharynx + neurons; epidermis; vulva; somatic gonad.]
Figure 2 The Caenorhabditis elegans cell lineage. Time axis is vertical; each cell division is a horizontal line. The
origin of some cell types is indicated.
[Figure 3 labels: (A) P1, EMS, MS, P2, P3, D, P4; epidermis, neurons, mesoderm, gut, muscle, germline. (B) Pn, Pn.p, Pn.aaaa, Pn.aaap, Pn.aap, Pn.apa, Pn.app; epidermal, sn or mn (ACh), mn (ACh), mn (GABA), cell death.]
Figure 3 (A) Abbreviated embryonic lineage of Caenorhabditis elegans, showing the relationships of the major embryonic founder cells. (B) P cell sublineage, showing the classes of cell generated. mn, motor neuron, with neurotransmitter indicated (ACh, cholinergic; GABA, GABAergic); sn, sensory neuron.
The somatic cell lineage of C. elegans is largely
invariant, with limited exceptions. Within some pairs
of cells there is variation in terms of which member of
the pair adopts one fate and which adopts the other
fate. For example, two adjacent gonadal precursor
cells, Z1.ppp and Z4.aaa, generate two cells known as
an anchor cell (ac) and a ventral uterine precursor
(VU) cell. In an individual animal, either Z1.ppp or
Z4.aaa becomes an anchor cell, and the other cell
becomes a VU cell. Since in normal development the
Z1.ppp/Z4.aaa pair never generates two anchor cells
or two VU cells, the two cells must communicate to
ensure the normal pattern of fates. The Z1.ppp/Z4.aaa
pair of cells is an example of an `equivalence group': a
group of cells equivalent in developmental potential.
In the case of the Z1.ppp/Z4.aaa pair, the choice of
fates appears to be entirely stochastic; in other equivalence groups, the choice of fates is biased.
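The constraint described for the Z1.ppp/Z4.aaa pair (either cell may become the anchor cell, but the pair always yields exactly one ac and one VU) can be sketched as a random assignment. This is a deliberately simplified illustration: the unbiased coin flip is an assumption standing in for "entirely stochastic," the function name is hypothetical, and the cell-to-cell communication that enforces the outcome is not modeled.

```python
import random

def assign_ac_vu_fates(rng: random.Random) -> dict:
    """Pick at random which cell of the pair becomes the anchor cell (ac);
    the other cell becomes the ventral uterine precursor (VU)."""
    cells = ["Z1.ppp", "Z4.aaa"]
    anchor = rng.choice(cells)
    return {cell: ("ac" if cell == anchor else "VU") for cell in cells}

# Simulate many animals with a fixed seed for reproducibility.
rng = random.Random(0)
outcomes = [assign_ac_vu_fates(rng) for _ in range(1000)]

z1_as_anchor = sum(1 for o in outcomes if o["Z1.ppp"] == "ac")
print(f"Z1.ppp became the anchor cell in {z1_as_anchor} of {len(outcomes)} animals")
```

A biased equivalence group could be represented in the same sketch by replacing the uniform choice with a weighted one.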
Many mutations result in `homeotic' cell fate transformations, that is, a particular cell is not simply
abnormal but takes on the fate (as evidenced by a cell
lineage transformation or other markers) of another
cell normally found in a different body region, in a
different developmental stage, or in the other sex.
Many cell lineage mutants display defects in the normal asymmetry of cell divisions, and have provided
insights into the mechanisms by which determinants
of cell fates are segregated in such asymmetric divisions. The first division of the zygote is asymmetric
[Figure 4 labels: (A) wild-type, lin-12(gf), lin-12(lf); Z1.ppp, Z4.aaa; ac, VU. (B) wild-type, lin-14(lf), lin-14(gf); T, T.ap; L1, L2. (C) wild-type, unc-86.]
Figure 4 Effects of cell lineage mutations. (A) Effects of lin-12 mutations on cell fates in the anchor cell/ventral
uterine (ac/VU) equivalence group. Filled circle = anchor cell; open circle = ventral uterine precursor. lin-12(gf) =
gain-of-function lin-12 mutation causing overactivity of the LIN-12 protein; lin-12(lf) = loss or reduction of LIN-12
function. (B) Effects of lin-14 mutations on the temporal control of the lineage of the postembryonic blast cell T. In
normal development T generates the lineage shown in the L1 and L2 larval stages. In lin-14(gf) mutants the LIN-14
protein is overactive and early (L1-specific) lineages are reiterated (retarded phenotype); in mutants that cause loss of
LIN-14 function (lin-14(lf)), L1-specific patterns are bypassed and T undergoes an L2-specific lineage. (C) Effect of
unc-86 mutations on diversifying lineages in the nervous system. In unc-86 loss-of-function mutants, diversifying
lineages are transformed to a reiterating stem-cell-like pattern. UNC-86 protein is expressed within the daughter
affected in the mutants (C in the figure).
[Figure labels, caption not preserved: (A) Wild-type; socket, hair, neuron, glial; "a", "b".]
Conclusions
Studies of cell lineages have been critical in our understanding of how cell fates are specified in development
and how fates are correlated with cell division patterns. Invariant lineages or sublineages, although
initially considered to imply `lineage-intrinsic' mechanisms of fate determination, are now thought to
reflect both intrinsic and extrinsic mechanisms. Thus,
animals with invariant cell lineages may not develop in
fundamentally different ways from larger animals in
which cell lineages are variable. In insects and vertebrates, cells mostly function in groups, within which
cell communication specifies fate. In such animals
development may be described as a lineage of cell
groups. Selection for rapid development and small
size might have led to the reduction of such cell groups
to individual cells, and thus the appearance of animals
with defined cell lineages.
Cell Lines
A B Ulrich and P M Pour
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0173
Cell Line
A cell line is a permanently established cell culture that
will proliferate indefinitely given appropriate fresh
medium and space. Lines differ from cell strains in
that they become immortalized.
Problem Areas
Cell-to-cell interaction is one of the most important
cellular functions in an organism, and its disruption
certainly has both known and still unknown consequences. It is questionable whether the few cells
from which immortalized cell lines originate are
representative of their tissue or disease of origin.
Genetic manipulation of the cells adds further
problems and can ultimately alter some or many
native functions and responses of the cells. The major
problem with human cells is their tendency to
become senescent, which at present makes them
unsuitable for long-term experiments.
Transplantation of cultured cells into a suitable
host, as often performed to test the malignancy of
cells, can also be problematic. The growth and differentiation of tumor cells and their response to therapeutic agents can be different in different species, and
even between different strains of the same animal.
Conclusion
Cultured cells have provided some information on
physiological and pathophysiological processes of
various cell types. So far, most of the findings are
based on the cultured cells of rodents. Advances in tissue culture techniques and molecular biology offer steady progress in this important line of
research. The collaboration of researchers from different medical disciplines is necessary for successful
isolation, purification, and maintenance of normal
human epithelial cells.
See also: Tissue Culture
Cell Markers: Green Fluorescent Protein (GFP)
Uses of GFP
GFP and its variants have been used in organisms from
bacteria and yeast to mice and human cells. One of the
most common uses of GFP is in promoter and protein fusion constructs. Promoter fusions with GFP
can document patterns of gene expression. Given the
dynamics of GFP production (the fluorophore takes
some time to form) and stability (the protein appears
to be long-lived), detailed studies of the onset and
cessation of gene expression (with a resolution of
minutes) are not possible. Protein fusions are useful
in determining the subcellular localization of a protein
of interest and whether that localization changes during development, with different growth conditions,
or in different genetic backgrounds. The most useful
fusions are those that also rescue the mutant phenotype, because the rescue indicates that the fusion protein functions appropriately.
Sometimes these fusion constructs are used to analyze a protein or promoter of interest. At other times
these fusions mark cells or cellular compartments so
that biological phenomena can be examined or manipulated. Nuclei, endoplasmic reticulum, Golgi, mitochondria, peroxisomes, and synaptic endings have all
been labeled using GFP. Once organisms have been
labeled, they can be subjected to various conditions or
they can be mutated to obtain mutants with altered or
absent expression. For example, we have used GFP-labeled neurons in the nematode Caenorhabditis
elegans as the basis of a screen for mutations that alter
cell fate, cell migration, or neuronal outgrowth.
GFP can also indicate the presence of viruses and
microorganisms. In molecular biology research, the
labeling of viral proteins makes GFP a useful transfection marker. Since GFP labels living cells, the labeling
of microorganisms may be particularly important in
studying interactions between and within populations, e.g., symbiosis and host-parasite interactions.
GFP can also be used to monitor infectious processes
in plants and animals.
Recently several groups have produced GFP fusion
proteins that couple the fluorescence of GFP to particular biological conditions. Such hybrid molecules
respond with altered fluorescence to differences in
membrane potential, calcium concentration, and pH.
These molecules and others like them promise to
greatly expand the usefulness of GFP into the realm
of biological sensors.
Further Reading
Cell/Neuron Degeneration
N Tavernarakis and M Driscoll
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0175
touch receptor ultrastructure or viability). Similarly,
dominant mutations in deg-1 (degenerin; deg-1(d))
induce death of a group of neurons that includes the
PVC interneurons of the posterior touch sensory
circuit. (Loss-of-function mutations in deg-1 appear
wild-type in behavior.)
mec-4 and deg-1 encode proteins that are 51% identical. These genes were the first identified members of
the C. elegans `degenerin' family, so named because
several members can mutate to forms that induce cell degeneration. Included in this family are mec-10, which
can be engineered to encode toxic degeneration-inducing substitutions; unc-8, which can mutate to a
semidominant form that induces swelling and dysfunction of ventral nerve cord; and unc-105, which
appears to be expressed in muscle and can mutate to
a semidominant form that induces muscle hypercontraction. Thus, a general feature of the degenerin gene
family is that specific gain-of-function mutations have
deleterious consequences for the cells in which they
are expressed.
C. elegans degenerins share sequence similarity
with subunits of the vertebrate amiloride-sensitive
epithelial Na+ channel. α-, β-, and γ-ENaC (for epithelial Na+ channel) are homologous subunits of the
multimeric Na+ channel that mediates Na+ absorption in epithelia of the distal part of the kidney tubule,
the urinary bladder, the distal colon and the lung. The
degenerin family of C. elegans currently includes 23
Although mec-4(d) and deg-1(d) mutations kill different groups of neurons, the morphological features of
cell deaths they induce are the same. The time course
of degeneration depends upon the dosage of the toxic
allele, but on average can take approximately 8 hours.
When viewed using the light microscope, the nucleus
and cell body of the affected cell first appear distorted
and then the cell swells to several times its normal cell
diameter (Figure 1). Eventually the swollen cell disappears, often after shrinking but sometimes as a
consequence of cell lysis. Interestingly, the swollen
character of mec-4(d)- and deg-1(d)-induced deaths
resembles the morphologies of mammalian cells
undergoing necrotic cell death.
Figure 1 Apoptotic and degenerative cell death in Caenorhabditis elegans. White arrows indicate normal cells while
black ones point to dying ones. A cell undergoing apoptotic, or programmed, cell death is shown in (A). In (B), a
degenerating cell has swollen to several times its diameter and adopts a vacuole-like appearance that is different from
the compacted, button-like structure of the apoptotic cell.
At the ultrastructural level, cells dying as a consequence of mec-4(d) and deg-1(d) expression exhibit
some remarkable features. The first detectable
abnormality apparent in an ill-fated cell is the formation of small tightly wrapped membrane whorls that
seem to originate at the plasma membrane. These
whorls are internalized and appear to coalesce into
large electron-dense membranous structures. Large
internal vacuoles form and distortion of the nucleus
by these vacuoles is associated with chromatin clumping. Finally, organelles and cytoplasmic contents are
degraded, usually leaving a membrane-enclosed shell.
The striking membranous inclusions suggest that
intracellular trafficking may contribute to degeneration. Interestingly, in some mammalian degenerative
conditions such as neuronal ceroid lipofuscinosis
(Batten disease; the mnd mouse) and that occurring
in the wobbler mouse, cells develop vacuoles and
whorls (fingerprint bodies) that look similar to internalized structures in dying C. elegans neurons. This
suggests that some degenerative processes may be
similar in nematodes and mammals.
The touch receptor neurons in mec-4(d) mutants
express terminally differentiated properties before
they die and the PVC neurons in deg-1(d) mutants
differentiate and function before they degenerate.
mec-4(d)- and deg-1(d)-induced cell deaths have
therefore sometimes been referred to as the nematode
version of `late onset' neurodegeneration. Careful studies of the timing of mec-4 expression relative to the
onset of degeneration indicate that the onset of neurodegeneration is correlated with the initial expression
of the toxic gene product.
Increased cation influx results, initiating neurodegeneration. That ion influx is critical for degeneration is
supported by the fact that amino acid substitutions
that disrupt the channel conducting pore can prevent
neurodegeneration when present in cis to the A713
substitution. In addition, large sidechain substitutions
at the analogous position in some neuronally expressed mammalian superfamily members do markedly increase channel conductance.
nicotinic antagonists partially suppress deg-3(d)-induced defects.
Activated Gαs
Future Prospects
The identification of C. elegans mutations that cause
necrotic-like cell death enables us to exploit the
strengths of this model system to gain novel insight
into a nonapoptotic death mechanism. The intriguing
observation that distinct cellular insults can induce a
similar necrotic-like response suggests that C. elegans
cells may respond to various injuries by a common
process, which can lead to cell death.
The initiation of degenerative cell death in C. elegans and its general neuropathology are reminiscent of
elements of excitotoxic cell death and other necrotic-like cell death in higher organisms. Excitotoxic neuronal death mediated via glutamate receptors (channel
proteins) in cell culture or in vivo in response to
ischemia is an example of this type of cell death. It is
also interesting that there are many reported instances,
in animals as diverse as flies, mice, and humans, in
which neurons degenerating due to genetic lesions
exhibit morphological changes similar to those
induced by mec-4(d) and other hyperactivated degenerins. Given that apoptotic death mechanisms are
conserved between nematodes and humans, it can be
hypothesized that various cell injuries, environmentally or genetically introduced, converge to activate a
degenerative death process that involves common biochemical steps. At present, whether such common
mechanisms exist remains an intriguing but open question.
If specific genes enact different steps of the degenerative process, then such genes should be identifiable
by mutation in C. elegans. Indeed, suppressor mutations in several genes that block mec-4(d)-induced
degeneration have been isolated. Although some suppressor mutations affect channel function (for example, mutations in mec-6), others are expected to be
more generally involved in the death process. Analysis
of such genes should result in the description of a
genetic pathway for degenerative cell death. Perhaps,
as has proven to be the case for the analysis of
C. elegans programmed cell death mechanisms,
elaboration of an injury-induced death pathway in
Further Reading
Aguzzi A and Raeber AJ (1998) Transgenic models of neurodegeneration. Neurodegeneration: of (transgenic) mice and
men. Brain Pathology 8: 695–697.
Canessa CM, Horisberger J-D and Rossier BC (1993) Epithelial
sodium channels related to proteins involved in neurodegeneration. Nature 361: 467–470.
Dragunow M, MacGibbon GA, Lawlor P, et al. (1997) Apoptosis,
neurotrophic factors and neurodegeneration. Reviews in
Neuroscience 8: 223–265.
Driscoll M (1996) Cell death in C. elegans: molecular insights
into mechanisms conserved between nematodes and mammals. Brain Pathology 6: 411–425.
Heintz N and Zoghbi HY (2000) Insights from mouse models
into the molecular basis of neurodegeneration. Annual Review
of Physiology 62: 779–802.
Lints R and Driscoll M (1996) Programmed and pathological cell
death in C. elegans. In: Martin GR, Holbrook N and Lockshin
RA (eds) Cell Aging and Cell Death. New York: Wiley-Liss.
Min KT and Benzer S (1997) Spongecake and eggroll: two
hereditary diseases in Drosophila resemble patterns of
human brain degeneration. Current Biology 7: 885–888.
Min KT and Benzer S (1999) Preventing neurodegeneration in
the Drosophila mutant bubblegum. Science 284: 1985–1988.
Nakao N and Brundin P (1998) Neurodegeneration and glutamate induced oxidative stress. Progress in Brain Research 116:
245–263.
Paulson HL (2000) Toward an understanding of polyglutamine
neurodegeneration. Brain Pathology 10: 293–299.
Warrick JM, Paulson HL, Gray-Board GL, et al. (1998) Expanded
polyglutamine protein forms nuclear inclusions and causes
neural degeneration in Drosophila. Cell 93: 939–949.
Cenancestor
W Fitch
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0176
Centimorgan (cM)
Centric Fusion
F W Stahl
J R S Fincham
Further Reading
White MJD (1971) Animal Cytology and Evolution, 3rd edn. Cambridge: Cambridge University Press.
Centrioles
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1784
Centromere
J R S Fincham
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0179
the daughter chromosomes to move apart towards
the two poles of the spindle. In meiosis, in contrast, the
chromatids remain joined at the centromere at the
first anaphase. The bivalent chromosomes, resulting
from pairwise synapsis and chiasma formation, each
separate into two dyads, each consisting of two chromatids joined at the centromere (Figure 1B), which
is not split until the end of metaphase of the second
division. Thus the centromere can be defined genetically as that point in a linkage map that always segregates at the first division of meiosis (reductionally) and
never at the second division (equationally).
Kinetochores
Centromeres attach to spindle fibers through protein
structures called kinetochores. Except in organisms
without localized centromeres (see below), there is
one kinetochore per chromatid, and mitotic metaphase
chromosomes, divided into two chromatids, have one
kinetochore directed towards each pole of the spindle.
Kinetochores can sometimes be seen under the microscope to be stretched towards the spindle poles. At the
first division of meiosis, there are again two kinetochores on each of the undivided centromeres, but here
they are pointed in the same direction; this may be at
least part of the explanation for the centromeres
remaining undivided (Figure 1B).
Centromere Structures
The structures of centromeres are extremely diverse
between different organisms. They have been investigated most completely in the budding yeast Saccharomyces cerevisiae. With the completion of the yeast
genome project, the DNA sequences underlying all
16 centromeres are known, and the minimum sequence
necessary for centromere function has been determined
by the testing of chromosome fragments in yeast artificial chromosomes. This minimum sequence, only
125 bp long, is, with minor variations, conserved
between chromosomes, and consists of three elements, CDEI, II, and III, of which CDEIII has the
best-defined function. It is an imperfect palindrome
of 25 bp, and it binds to the innermost proteins of the
kinetochore, which is a highly complex multiprotein
structure with multiple functions in the regulation of
chromosome division, the most obvious of which is
binding to a spindle fiber (microtubule). CDEI is an
imperfect palindrome of 8 bp which binds one known
protein and facilitates centromere function without
being absolutely essential. The remaining part of the
centromere, CDEII, is a spacer between CDEI and
CDEIII and consists of a less specific A/T-rich
sequence (Figure 2A).
Neocentromeres
Centromeric DNA sequence is extremely variable,
both between and within species. The ability of centromeres to become established over different DNA
sequences is most strikingly shown in the formation
of neocentromeres (more or less functional new centromeres with associated kinetochores) that sometimes
appear on chromosomes that have had their regular
centromeres deleted or otherwise inactivated. These
have been particularly studied in cultured human
cells. A well-investigated example is a neocentromere in a partially deleted human chromosome 10.
Here the DNA spanning the neocentromere has
been sequenced and found to contain no alphoid satellite nor any other sequence similarity to regular centromeres.
A hypothesis that is attracting increasing attention
is that kinetochores are propagated epigenetically
(Karpen and Allshire, 1997). On this view, new kinetochores are built on newly replicated chromosomes
at the same sites as the old ones, not primarily because
of the DNA sequence but because some trace of kinetochore structure is already there. The formation of
neocentromeres suggests that, in centromere-deficient
chromosomes, kinetochores can be established de
novo, without strict DNA sequence requirements,
Further Reading
Reference
Initiation
In the initiation phase of protein synthesis a messenger
RNA (mRNA) is bound to the ribosome. In this
process the correct initiation (methionine) codon
from which the translation begins is selected. Eubacterial initiation is stimulated by three initiation
factors, IF-1, IF-2, and IF-3. In eukaryotes a much
larger number of initiation factors participate. The
fundamental steps in initiation are the binding of the
mRNA to the small subunit, the subsequent binding
of the initiator tRNA, and the attachment of the large
subunit to this initiation complex. IF-2 in complex
with GTP binds with the initiator tRNA to the initiation codon of the mRNA on the small subunit,
which in turn associates with the large subunit. IF-1
may assist IF-2 in binding the initiator tRNA to
the P-site, whereas IF-3 prevents the large subunit
from associating before a proper initiation has been
completed.
Elongation
During each cycle of elongation one amino acid is
incorporated into the nascent peptide. The elongation
factors (three in eubacteria) catalyze two of the basic
Termination
The termination of protein synthesis depends on the
exposure of one of the three stop codons, UAG,
UAA, and UGA, in the decoding part of the A-site.
In eubacteria two release factors, RF1 and RF2, decode the stop codons and hydrolyze the
completed peptide from the P-site tRNA. In eukaryotes they correspond to a single decoding factor,
eRF1. The crystal structure of eRF1 indicates that
these factors may perform their function by mimicking tRNA. The termination factor RF3 in all cases
catalyzes the dissociation of the decoding factors
from the ribosome.
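The condition that triggers termination, a stop codon arriving in frame at the A-site, can be illustrated with a short Python sketch. It only scans an mRNA string codon by codon and reports the first of the three stop codons named above; the function name and the example sequence are hypothetical, and the sketch makes no attempt to model the release factors themselves.

```python
# The three stop codons named in the text.
STOP_CODONS = {"UAG", "UAA", "UGA"}


def first_stop_codon(mrna, start=0):
    """Return (index, codon) of the first in-frame stop codon, or None.

    `start` is the 0-based position of the first base of the reading frame
    (an assumption of this sketch; a real frame is set by the initiation codon).
    """
    mrna = mrna.upper()
    # Step through the sequence three bases at a time, staying in frame.
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            return i, codon
    return None


# Hypothetical message: AUG GCU UGG UAA GCU
print(first_stop_codon("AUGGCUUGGUAAGCU"))
```

Only codons read in frame are tested, so a stop triplet that straddles two codons is ignored, as it would be during decoding.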
The ribosome recycling factor (RRF) has the role
of removing the mRNA from the ribosome so that
the ribosome is available to synthesize new protein
from new mRNAs. It performs this role together
with EF-G. An amazing observation is that RRF also
closely mimics tRNA. This may suggest that RRF
binds to a tRNA binding site, possibly the A-site,
and is translocated from this site by EF-G. This would
lead to the dissociation of the mRNA from the ribosome and the ribosomal subunits from each other.
See also: Translation
Chaperonins
A Brinker and F U Hartl
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1510
Function
Chaperonins are essential for cell viability in all
growth conditions, because they are required for the
efficient folding of numerous proteins that mediate
Character
W Fitch
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0183
A character is any biological feature that occurs
across a range of organisms and might thus be used
to determine evolutionary relationships. Characters
may be structural (bones, organs), molecular (genes,
proteins), functional (flying, enzymatic activity), or
behavioral (mating, food gathering).
See also: Character State; Cladograms;
Quantitative Trait
Chargaff's Rules
B S Guttman
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0185
Further Reading
Charon Phages
See: Vectors
Character State
W Fitch
Chi Sequences
G R Smith
[Figure 1: (A) graph of relative recombinant frequency (0 to 1.0) for genetic intervals around Chi; (B) diagram of RecBCD enzyme acting at Chi (5' GCTGGTGG 3') under (ATP) > (Mg2+) or (Mg2+) > (ATP) conditions.]
Figure 1 (A) Localized stimulation of recombination by Chi in phage lambda crosses. I, Ia, etc. are genetic intervals
bounded by markers located at the indicated distance from a Chi site in lambda. Solid circles indicate the midpoints
of each interval and the frequency of recombinants per physical length of that interval, normalized to interval II = 1.
(B) Action of purified RecBCD enzyme at Chi. With (ATP) > (Mg2+) RecBCD enzyme unwinds the DNA
substrate, nicks the upper strand about five nucleotides to the right of Chi, and continues unwinding. With (Mg2+)
> (ATP) RecBCD enzyme degrades the upper strand up to Chi, nicks the lower strand, and degrades or unwinds it to
the left of Chi. Both conditions produce single-stranded DNA with a 3' end near Chi and extending to its left.
RecBCD enzyme loads RecA protein onto this `Chi tail.' (Reprinted with modification from Smith et al. (1995) with
permission.)
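Because Chi is a defined octamer (5' GCTGGTGG 3', as drawn in the figure), candidate sites can be located in a DNA sequence by simple string matching. The Python sketch below is illustrative only: the function names and the test sequence are hypothetical, and a match says nothing about whether a site is active in a given cross. Both strands are searched and the strand is reported, since the octamer is not its own reverse complement.

```python
CHI = "GCTGGTGG"  # Chi octamer, written 5' to 3'
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_chi_sites(seq):
    """Return sorted (position, strand) pairs for every Chi octamer.

    Positions are 0-based offsets of the leftmost base of the match on the
    strand as given; '+' means Chi reads 5'->3' on that strand, '-' means it
    reads 5'->3' on the complementary strand.
    """
    seq = seq.upper()
    hits = []
    for motif, strand in ((CHI, "+"), (reverse_complement(CHI), "-")):
        pos = seq.find(motif)
        while pos != -1:
            hits.append((pos, strand))
            pos = seq.find(motif, pos + 1)  # allow overlapping matches
    return sorted(hits)


# Hypothetical sequence with one Chi on each strand.
print(find_chi_sites("AAGCTGGTGGTTCCACCAGCAA"))
```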
Further Reading
Anderson DG and Kowalczykowski SC (1997) The translocating RecBCD enzyme stimulates recombination by directing
RecA protein onto ssDNA in a χ-regulated manner. Cell 90:
77–86.
Chedin F, Ehrlich SD and Kowalczykowski SC (2000) The
Bacillus subtilis AddAB helicase/nuclease is regulated by its
cognate Chi sequence in vitro. Journal of Molecular Biology 298:
7–20.
Colbert T, Taylor AF and Smith GR (1998) Genomics, Chi sites
and codons: `Islands of preferred DNA pairing' are oceans of
ORFs. Trends in Genetics 14: 485–488.
Myers RS and Stahl FW (1994) χ and the RecBCD enzyme of
Escherichia coli. Annual Review of Genetics 28: 49–70.
Chiasma
P B Moens
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0189
Multiple Chiasmata
The number of chiasmata per chromosome is a hereditary characteristic and is therefore genetically
determined. Depending on the species, there can be
one or several chiasmata in a set of paired chromosomes. If the two adjacent chiasmata involve the same
two nonsister chromatids, it is a two-strand double
crossover. Depending on which chromatids are
involved, there can also be a three-strand or a four-strand double crossover. As a rule, each chromosome
pair, no matter how short, should have at least one
chiasma to assure proper chromosome segregation at
the first meiotic division.
Chiasma Interference
It is an observed but unexplained fact that, in most
organisms, the presence of a given chiasma interferes
with the occurrence of a chiasma nearby. As a consequence, chiasmata are rarely distributed in a random
fashion. The presence of chiasma interference is
reflected at the genetic analysis level by the fact that
for short genetic distances there are fewer double crossovers than expected.
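The shortfall in double crossovers is conventionally expressed as a coefficient of coincidence (observed doubles divided by the doubles expected if the two intervals recombined independently), with interference taken as one minus that coefficient. The Python sketch below works through that arithmetic; the function names and the numbers in the example are hypothetical and are not data from any particular cross.

```python
def coefficient_of_coincidence(rf_ab, rf_bc, observed_doubles, total_progeny):
    """Observed double crossovers divided by the number expected under independence.

    rf_ab and rf_bc are the recombination fractions (0 to 1) of two adjacent
    intervals; the expected doubles are rf_ab * rf_bc * total_progeny.
    """
    expected_doubles = rf_ab * rf_bc * total_progeny
    if expected_doubles == 0:
        raise ValueError("expected number of double crossovers is zero")
    return observed_doubles / expected_doubles


def interference(rf_ab, rf_bc, observed_doubles, total_progeny):
    """Interference = 1 - coefficient of coincidence."""
    return 1.0 - coefficient_of_coincidence(
        rf_ab, rf_bc, observed_doubles, total_progeny
    )


# Hypothetical cross: intervals of 10% and 20% recombination, 1000 progeny,
# 8 observed double crossovers where 20 would be expected.
print(coefficient_of_coincidence(0.10, 0.20, 8, 1000))
print(interference(0.10, 0.20, 8, 1000))
```

In this made-up example the coefficient is about 0.4 and the interference about 0.6; a value of 0 would mean no interference and a value of 1 complete interference.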
Localized Chiasmata
The positions of chiasmata along the length of the
paired chromosomes are genetically determined and
can be highly localized or may appear to be more
evenly distributed. In some cases, there can be a predetermined single chiasma next to the centromere,
whereas a closely related species can have several
chiasmata along the length of each chromosome pair.
In other cases the single chiasma can be located strictly
at the end of the chromosomes, and a closely related
species can have a more even distribution. Presumably,
Figure 1 (See Plate 3) Diagrammatic representation of chiasma formation and resolution. Panels show diplotene, metaphase I, and the products after meiosis II. The light gray colored
chromosome and its dark homolog have a reciprocal exchange event between the positions of genes A and B involving
nonsister chromatids. The chromosome core holds the sister chromatids together when the centromeres are pulled
to opposite poles (arrows). Under tension, the cross formation (chiasma) becomes evident. When the proteins of the
chromosome cores degenerate, the chromosomes separate and travel to the opposite poles but the sister
chromatids stay together. These sister chromatids separate at the second meiotic division so that four products
result. Two of the products are identical to the original chromosomes and two are recombinant, genetically Ab and aB
and cytologically long-satellite/no knob and short-satellite/knob. This type of experiment shows that genetic
exchange is accompanied by a physical exchange of chromosome parts.
Chiasma Resolution
In order for the paired chromosomes to separate at
the first meiotic division, it is necessary for the
chiasmata to be resolved. The older models postulated
that the chiasmata slide to the end of the paired
chromosomes, a process referred to as `chiasma terminalization.' The evidence, however, was never
satisfactory. On the contrary, experimentally differentiated chromatids gave clear evidence that the
chiasma position does not change. It now appears
that the axial cores of each pair of sister chromatids
(see Figure 1) prevent the resolution and movement
of the chiasma. (The axial core is the protein structure
around which the chromatin organizes in meiotic prophase.) When the core proteins are degraded at the
first meiotic division, the sister chromatids are free to
separate except at the centromere and, as a consequence, the chiasmata are resolved. Apparently the core
proteins that reside between the sister centromeres are
exempt from the degradation so that the centromeres
stay together until the second meiotic division.
Further Reading
Chimera
C L Stewart
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0190
Recombination Nodules
At the high resolution of an electron microscope it
is possible to detect a few small, 100-nanometer,
dense bodies along the length of the paired chromosome cores. The positions of these nodules correlate
with the positions of chiasmata at the later stages of
meiotic prophase. It is likely that these nodules are
conglomerates of proteins that are involved in the
DNA metabolism at the site of a crossover. In contrast to these few late nodules, there are numerous
early nodules that do not necessarily correlate with
chiasmata in number or position. These early nodules,
however, are associated with proteins involved in
the initiation of early recombinant events. The function of these structures in the initiation, maturation,
and resolution of recombination is under investigation.
Molecular Mechanisms
The chiasma/crossover is based on the induction
of chromosome breaks by the cell. While these
breaks are double-stranded in Saccharomyces cerevisiae and Schizosaccharomyces pombe, no similar information has been reported to date for other species.
The induction of breaks is a remarkable process
because cells in general guard strongly against breaks
in DNA. Several mechanisms in somatic cells detect
damage, arrest cell proliferation, and either repair the
damage or cause cell degeneration. Meiotic prophase
cells, on the contrary, activate enzymes that induce
DNA breaks. This is followed by detection and repair
processes with the employment of somatic and
meiosis-specific enzymes. Whereas in somatic cells,
recombinational repair can make use of the undamaged sister chromatid as a template, in meiotic
cells, meiosis-specific enzymes direct the repair
towards the non-sister chromatid thereby promoting
the formation of a chiasma.
use and are more informative. In particular, mice have
been genetically engineered to express in all their cells
marker proteins such as β-galactosidase, alkaline phosphatase, or green fluorescent protein, so that the
cells become colored when they are treated with
particular substrates or fluoresce when exposed to
UV light. These are particularly useful as they make
it relatively easy to determine the distribution of cells
within a whole chimeric embryo or histological section (Figure 1).
Figure 1 A chimeric mouse embryo in which unmarked mouse ES cells were injected into a blastocyst
expressing β-galactosidase, a marker that colors
the cells expressing it dark blue. The ES cells have almost
entirely formed the embryo proper (2) except for part
of the gut that is darkly labeled. The membranes (yolk
sac) surrounding the embryo have been entirely derived
from the blastocyst as they are all darkly labeled (1).
In mammals, the mouse is the principal species used
in genetics and embryology. Early studies on mice had
largely been limited to creating the chimeras using
preimplantation stages, as access to the later stages is
complicated by their development in the uterus and
their dependence on a placenta. Mouse chimeras made
from preimplantation stages established that the
trophoblast of the preimplantation embryo forms the
placenta, whereas the inner cell mass (ICM) forms
the embryo proper. Significant progress has, however,
recently been made in analyzing later stages of development, owing to the advent of gene targeting, the use
of mosaics (see below), and the fact that embryos
between 7 and 10 days of age can now be cultured
and manipulated in vitro. The latter technique has
provided limited but significant information on cell
lineages at this critical time of mouse gastrulation,
particularly with regard to the formation of the
three germ layers and the way they organize to form
the embryo. As in chickens, bone marrow transfers
between adults or between embryo and adult mice
have been central to understanding how the hemopoietic and immune systems are formed and the roles
of the different cell types that comprise both systems.
Chimeras have been instrumental in determining the
origin of germ cells, the cells that will ultimately form
the eggs or sperm. Also, they revealed that in mammals phenotypic sex is determined by the genetic sex
of the somatic cells in the gonads and not the genetic
sex of the germ cells.
Mosaics
Another form of chimera is the mosaic, which is a
composite individual derived from a single fertilized
egg. In mammals all females can be described as
`mosaics' since they are a mixture of cells, differing
from each other by which X chromosome has been
inactivated during embryogenesis. As an experimental
tool mosaics have been of greater use in the study of
the development of worms and flies, as well as plants.
Mosaics are generated by the individual marking of
cells by a dye or by the introduction of specific genes.
They can also be derived by inducing a specific genetic
alteration (e.g., chromosomal translocation) in a cell,
with all the descendants of the cell subsequently inheriting the chromosomal change. Mosaics have been
used to study cell lineages as well as to determine
Further Reading
LeDouarin N and McLaren A (eds) (1984) Chimeras in Developmental Biology. London: Academic Press.
Rossant J and Spence A (1998) Chimeras and mosaics in mouse
mutant analysis. Trends in Genetics 14(9): 358.
Gene Fusion
Chimeric proteins are prepared by fusing the structural genes of the proteins in question in a suitable
expression vector. The translational 3' terminus of the
first gene is deleted, as is the promoter at the 5' terminus of the second structural gene. The two genes are
then ligated in-frame and expressed in an appropriate
host. The most frequently used hosts are bacteria such
as Escherichia coli, but plant, mammalian, and insect
cells have also been used. After transcription and
translation, the cell will produce one single polypeptide chain with the properties of both the original gene
products. The fusion can be made at either or both
termini of a protein. Whether one side is more favorable for the biological activity of the protein or not has
Linker Regions
It is crucial to the design of a chimeric protein that
upon fusion the individual units retain their ability to
fold independently of the remainder of the polypeptide chain. Otherwise folding may result in nonactive,
scrambled structures. There are many aspects to consider when designing and analyzing a linker region.
The first requirement is to avoid proteolytic cleavage
within the linker, which limits the amino acid composition of the linker. Pairs of dibasic amino acids
are often sites for proteolytic activity, as well as, for
instance, repeats of glycine-glycine-X, where X is an
amino acid with a hydrophobic side chain. The linker
region should not depend on the rest of the protein for
its stabilization and conformation. Since the preferred
secondary structure within the linker is a coil or a bent
structure, the amino acid composition of the linker
is further restricted. In naturally occurring chimeric
proteins, bulky amino acids as well as hydrophobic
residues are avoided in the linkers, with the exception
of the smaller hydrophobic residues alanine and proline. Glycine, serine, and threonine are strongly preferred. Other common linker residues are asparagine,
glutamine, and lysine.
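The composition rules above lend themselves to a simple screening step when a candidate linker is being designed. The Python sketch below is a minimal illustration, not an established tool: the function name is hypothetical, and the residue sets (which residues count as basic, hydrophobic, or disfavored) are assumptions made for the example, since the text gives only general classes.

```python
# One-letter amino acid codes. These groupings are assumptions for illustration.
BASIC = set("KR")               # residues treated as basic
HYDROPHOBIC = set("AVLIMFWP")   # residues treated as hydrophobic in Gly-Gly-X
DISFAVORED = set("VLIMFWY")     # bulky or hydrophobic, excluding Ala and Pro


def linker_warnings(linker):
    """Return a list of warnings for a candidate linker sequence."""
    linker = linker.upper()
    warnings = []
    # Adjacent basic residues: possible proteolytic site.
    for i in range(len(linker) - 1):
        if linker[i] in BASIC and linker[i + 1] in BASIC:
            warnings.append(f"dibasic pair {linker[i:i + 2]} at {i}")
    # Gly-Gly-X with a hydrophobic X: possible proteolytic site.
    for i in range(len(linker) - 2):
        if linker[i:i + 2] == "GG" and linker[i + 2] in HYDROPHOBIC:
            warnings.append(f"Gly-Gly-X motif {linker[i:i + 3]} at {i}")
    # Bulky or hydrophobic residues that natural linkers tend to avoid.
    for i, residue in enumerate(linker):
        if residue in DISFAVORED:
            warnings.append(f"disfavored residue {residue} at {i}")
    return warnings


print(linker_warnings("GSGSTGSNQ"))  # built from preferred residues
print(linker_warnings("GGLKRS"))     # hypothetical problematic linker
```

A linker that passes such a screen is only a candidate; whether the fused units fold independently still has to be established experimentally.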
Areas of Application
The design of synthetic chimeric enzymes has proven
to be a valuable tool in protein engineering and in
enzymology in the study of proximity effects between
enzymes. Particularly, if an enzymatic process is based
on consecutive steps, it can be convenient to put the
catalytic centers of separate enzymes in close proximity to improve the overall kinetics and yield of the
process. In nature, this has been accomplished by
the evolution of multifunctional proteins and multienzyme complexes. Several metabolic pathways seem
to appear as free enzyme systems or multienzyme
systems in those cellular forms that probably occurred
early in evolution and appear as multifunctional
enzymes or multienzymes in cells that arose later in
evolution. This phenomenon has been explained by
fusion of genes coding for the separate enzymes into
Further Reading
Chirality
See: Handedness, Left/Right
Chlamydomonas reinhardtii
J-D Rochaix
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1663
The Organism
Cells of C. reinhardtii are oval-shaped, typically 10 μm
in length and 3 μm in width, with two flagella at
their anterior end (Figure 1). This alga contains several
mitochondria and a unique cup-shaped chloroplast
containing the internal thylakoid membranes where
the primary reactions of photosynthesis occur. The
chloroplast occupies 40% of the cell volume and
partly surrounds the nucleus. C. reinhardtii possesses
a primitive visual system with the eyespot consisting
of stacks of carotenoid-containing lipid granules that
focus the incoming light on an overlying patch of
plasma membrane where the photoreceptor is believed
to be localized. This photoreceptor is a rhodopsin-like
protein with an all-trans-retinal chromophore, and it
is closely related to photoreceptors of multicellular
organisms.
Haploid vegetative cells of C. reinhardtii can be
propagated through mitotic divisions. These cells exist
as mating-type (+) or mating-type (−), determined by
two structurally distinct alleles of the mating-type
locus. Vegetative cells differentiate into gametes
when they are starved of nitrogen. Gametes undergo
several characteristic changes such as loss of ribosomes, alteration of chloroplast morphology, starch
accumulation, and reduced photosynthetic activity.
Mixing of gametes of opposite mating type leads to a
rapid agglutination of their flagella, a response which
is mediated through gamete-specific glycoproteins,
called agglutinins, which are associated with the flagellar membrane. The flagellar agglutination triggers a
series of complex reactions which ultimately lead to
the fusion of the gametic cells as well as their nuclei
and chloroplasts, and to the maturation of the zygote
into a thick-walled zygospore. After a maturation
period, the latter undergoes meiosis and produces a
tetrad consisting of four haploid daughter cells. Stable
vegetative diploid cells can also be obtained after mating. They divide mitotically and are useful for determining whether a mutation is dominant or recessive.
Photosynthetic function is dispensable in C. reinhardtii provided a carbon source such as acetate is
included in the growth medium. This property has
been used extensively for isolating numerous mutants
deficient in photosynthetic function. An important
feature of this photosynthetic alga is that its chlorophyll fluorescence patterns are highly sensitive to
deficiencies in the photosynthetic electron transport
chain. Fluorescence can thus be used as a noninvasive
method to identify photosynthetic lesions in the
mutants. Cells of C. reinhardtii can be grown under
three distinct regimes: phototrophic growth with CO2
assimilated through photosynthesis as the sole carbon
source, heterotrophic growth in the dark with acetate,
and mixotrophic growth in the light with acetate.
In addition, cell division can be synchronized by
alternate 12 h light and 12 h dark cycles. Another
important feature of C. reinhardtii is its ability to
synthesize chlorophyll both in a light-dependent and
light-independent manner. As many mutants deficient
in photosynthesis are sensitive to light, they need to
be grown in the dark. Under these conditions and in
contrast to higher plants, C. reinhardtii is still able to
assemble its photosynthetic apparatus. It is thus possible to isolate photosynthetic complexes from light-sensitive mutant cells and to study their properties.
Chloroplast Biogenesis
As in higher plants, the biosynthesis of the photosynthetic apparatus of C. reinhardtii occurs through the
concerted action of the nuclear and chloroplast genetic
systems. Subunits of the photosynthetic complexes are
either encoded by the chloroplast genome and translated on chloroplast 70S ribosomes or encoded by
nuclear genes, translated on cytosolic ribosomes as precursors, in most cases with an N-terminal extension that
targets the proteins to the chloroplast. An extensive
genetic analysis of mutants deficient in photosynthetic
activity has revealed two major classes. The first
includes mutations within the genes of the subunits
of the photosynthetic complexes. The second includes
mainly nuclear mutations that interfere with photosynthesis indirectly by affecting chloroplast gene
expression. These mutations affect mostly chloroplast
post-transcriptional steps such as RNA stability and
processing, splicing, translation and the assembly
process of photosynthetic complexes. The number of
these genes is surprisingly high and most of their
products appear to act in a gene-specific manner.
Although it is clear that the nucleus plays a crucial
role for chloroplast gene expression, the state of the
chloroplast can also influence nuclear gene expression.
Earlier studies with higher plants revealed that certain
nuclear genes involved in photosynthesis are not
expressed in plants containing defective plastids as a
Further Reading
Chlamydomonas, Historical
Model
See: Photosynthesis, Genetics of
Chloroplasts, Genetics of
B D Dyer
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1493
Chloroplast Genomes
Compared to their cyanobacterial ancestors, chloroplasts have lost most of their genes. Algae and plant
chloroplasts have only a few hundred kilobases of
DNA in circular genomes present in multiple copies
with about 100 genes. Parasitic plants that have
secondarily lost the ability to photosynthesize have
even smaller chloroplast genomes as in Epifagus with
Shared Coding
Complete loss of redundant and extraneous genes
seems to occur easily in close symbioses, perhaps
because a streamlining of the genome confers a replicative advantage. In addition intracellular horizontal
transfer of genes may occur, facilitated by the proximity of the chloroplast, mitochondrial, and nuclear
genes. The direction of transfer is strongly biased
toward the nucleus, although chloroplast to mitochondria transfers are also noted. A result of horizontal transfer is a shared coding for some essential
chloroplast structures including the ribosomes. This
makes the relationships even more obligate among the
various genomes of eukaryotic cells. For additional
details see Mitochondria and Symbionts, Genetics of.
Maternal Inheritance
There is considerable variation among plants and algae
with respect to gamete size. In some cases maternal
gametes (ova) are much larger than the paternal ones
(pollen) and contribute entirely or almost entirely to
the chloroplasts of the zygote. This means that maternal inheritance of chloroplast mutations can occur
in some plants. Often such inheritance is manifested
Further Reading
Christmas Disease
F Gianelli
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0196
Chromatid
J Y Lee and T L Orr-Weaver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0198
A chromatid is one of the replicated copies of a chromosome. Identical sister chromatids are produced as a
result of DNA replication. In contrast, homologous
chromosomes derive from either the mother or the
father of the organism, and although they contain the
same set of genes, they usually have genetic differences.
Sister chromatids are physically attached all along their
lengths and particularly at the centromeres. This cohesion between the chromatids is established while the
DNA is being replicated and is mediated by several
proteins, some of which constitute the cohesin complex. Sister chromatid cohesion is essential for the
movement of the chromosomes to the metaphase
Chromatid Interference
F W Stahl
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0197
Chromatin
J Y Lee and T L Orr-Weaver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0199
Function
Chromatin structure plays an important role in controlling gene expression and replication. The packaging
of DNA into nucleosomes forms a `closed' structure
that is not very accessible to enzymes that perform
replication, transcription, and DNA repair. This structure is generally transcriptionally repressive, allowing
only a basal level of gene expression. In a disrupted,
`open' nucleosome structure, the DNA is more accessible to replication and transcription factors.
In transcription, some activators and repressors
interact with RNA polymerases to change the chromatin structure and modulate gene activity. Activators
can help to disrupt nucleosome structure and thereby
stimulate the assembly of RNA polymerase and transcription factors at the promoter. For replication,
a similar modulation of chromatin structure must
occur to allow the replication machinery to be positioned at the origins of replication.
The structure of chromatin can also have long-range effects on gene expression. In a phenomenon
termed `position effect variegation,' genes located
near silent heterochromatic regions can also be made
transcriptionally inactive. Genes as far as 1000 kb
away can be silenced. Because the exact areas that are
repressed vary from cell to cell, this is an epigenetic
phenomenon that produces variegation in phenotype.
It is generally thought that the highly condensed
nature of heterochromatin prevents access by transcription factors, but how this can affect neighboring,
nonheterochromatic regions is not fully understood.
While it is accepted that proteins found in the heterochromatin can `spread' to adjoining regions and impart a similar repressive effect, another possibility
is that the heterochromatin may be grouped into compartments of the nucleus that are inaccessible to transcription factors.
Figure 1 Model of chromatin packing into higher-order structures. DNA is assembled into the `beads-on-a-string' conformation by wrapping around histones to
form nucleosomes, which can then be packed into a 30-nm fiber. The fibers coil to form a chromosome. In
metaphase, the chromosomes condense even more
extensively. Levels shown, with the dimensions labeled in the figure: short region of DNA double helix (2 nm); beads-on-a-string form of chromatin (11 nm); 30-nm chromatin fiber of packed nucleosomes (30 nm); section of chromosome in an extended form (300 nm); condensed section of metaphase chromosome (700 nm); entire metaphase chromosome (1400 nm). Reproduced from Alberts, Bruce et al.
(1994) Molecular Biology of the Cell, 3rd edn. New York:
Garland Publishing. (Permission from Elsevier Science).
Chromatin structure can also affect DNA replication on a global level. For example, heterochromatin
and other silent areas of the genome replicate late in
S-phase, but the reason that these late-replicating
regions are silent is unknown. One possibility is a
specific repressive chromatin structure that can be
overcome to allow origin firing late in S-phase.
Modification
Both protein complexes and small organic molecules
modulate the state of chromatin in the cell. Large
chromatin-remodeling complexes use the energy
from ATP hydrolysis to destabilize and reposition
nucleosomes. These complexes are highly conserved
in many eukaryotic organisms (see Table 1). They
all include a helicase-like subunit that has a DNA-dependent ATPase activity. The SWI/SNF family of
chromatin-remodeling complexes are very large protein complexes with many subunits. They are thought
to help activate transcription by disrupting the
nucleosome structure, by displacing the histone octamer to neighboring DNA. The ISWI family of remodeling complexes are smaller in mass and generally
have fewer subunits. Unlike the SWI/SNF complexes,
they are thought to promote gene expression by sliding the nucleosomes along the DNA, opening up a
local area.
Another example of a chromatin-remodeling complex is the polycomb (Pc) group in Drosophila. This
complex is a negative regulator of transcription, acting
to repress homeotic genes during development. The
Pc complex is believed to cause a tightening of chromatin structure, inducing a heterochromatin-like
state, or to coat the chromatin, thus preventing access
by transcription factors. The Pc group is also needed
for position effect variegation.
Posttranslational modifications also exert significant effects on chromatin activity (see Figure 2).
Acetylation of the N-terminal tails of core histones
is the best-studied histone modification. The addition
and removal of acetyl groups has considerable influence on gene regulation. This connection was made
by several observations. First, some transcriptional
activators are histone acetyl transferases (HAT), while some transcriptional repressors are or recruit histone
deacetylases (HDAC). Second, in Saccharomyces
cerevisiae, regions with hyperacetylated histones (in
particular, H3 and H4) tend to be transcriptionally
active, while areas with hypoacetylated histones are
mostly silent. In mammals, the inactive X chromosome has little acetylation of histone H4.
Table 1 Chromatin-remodeling complexes.
Complex          Organism          ATPase      Mass (MDa)   No. of subunits
SWI/SNF family
SWI/SNF          S. cerevisiae     Swi2/Snf2   2            11
RSC              S. cerevisiae     Sth1        1            15
Brahma           D. melanogaster   brahma      2            nd
hSWI/SNF         H. sapiens        hBRM        2            ~10
hSWI/SNF         H. sapiens        BRG1        2            ~10
NRD              H. sapiens        CHD4        1.5          18
ISWI family
ISWI1            S. cerevisiae     ISWI1       0.4          4
ISWI2            S. cerevisiae     ISWI2       0.3          2
NURF             D. melanogaster   ISWI        0.5          4
CHRAC            D. melanogaster   ISWI        0.7          5
ACF              D. melanogaster   ISWI        0.2          4
RSF              H. sapiens        hISWI       0.5          2
(Reprinted from Kornberg and Lorch, 1999. Twenty-five years of the nucleosome, fundamental particle of the eukaryote
chromosome. Cell 98: 285-294; with permission from Elsevier Science.)
Figure 2 Schematic of modification of chromatin histones (H2A, H2B, H3, H4). The sites of acetylation (Ac),
phosphorylation (P), methylation (Me), and ubiquitination (Ub) of the core histones are diagrammed. Acetylation,
methylation, and phosphorylation occur on the N-terminal tails of the histones. With the exception of
phosphorylation of serine residues (S), modifications are of lysine residues (K). (Adapted from Spencer VA and
Davie JR (1999) Role of covalent modifications of histones in regulating gene expression. Gene 240: 1-12; with
permission from Elsevier Science.)
The addition of an acetyl group on a lysine residue
will reduce the positive charge of the histone, thereby
causing a weaker interaction between the histone and
the DNA. This loss of stability in the nucleosome
probably facilitates access to the DNA by transcription factors. From the crystal structure of the nucleosome, the N-terminal tails of histones were shown
to mediate nucleosome-nucleosome interaction, and
acetylation of these tails is predicted to disrupt these
interactions and thereby open the chromatin structure. Since multiple lysine residues are often modified,
it is suspected that these hyperacetylated histones
could affect global chromatin structure, e.g., by destabilizing the 30-nm fiber structure.
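The charge argument for lysine acetylation can be illustrated with a rough count of formal side-chain charges. This is only a sketch: the 20-residue histone H4 tail sequence is an assumption taken from general knowledge rather than from this article, the function name `net_charge` is hypothetical, and histidine and the chain termini are ignored for simplicity.

```python
# Assumption: first 20 residues of the histone H4 N-terminal tail
# (not given in the text); lysines fall at positions 5, 8, 12, 16, and 20.
H4_TAIL = "SGRGKGGKGLGKGGAKRHRK"


def net_charge(seq, acetylated=()):
    """Count formal side-chain charges; acetylated lysines are treated as neutral."""
    charge = 0
    for pos, aa in enumerate(seq, start=1):
        if aa == "K":
            charge += 0 if pos in acetylated else 1
        elif aa == "R":
            charge += 1
        elif aa in "DE":
            charge -= 1
    return charge


unmodified = net_charge(H4_TAIL)
hyperacetylated = net_charge(H4_TAIL, acetylated={5, 8, 12, 16})
print(unmodified, hyperacetylated)
```

Under these simplifications, acetylating four lysines halves the formal positive charge of this tail segment, which is the kind of change expected to weaken its interaction with the negatively charged DNA.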
Phosphorylation is another modification seen on
histone tails. In particular, the phosphorylation level
of N-terminal serines of H1 and H3 changes with
mitotic and meiotic stages, appearing late in G2,
reaching its peak in metaphase, and disappearing at
anaphase. This correlates with the timing of chromosome condensation, and phosphorylation of histone
H3 is required for proper chromosome condensation
and segregation.
Methylation occurs on histone lysines, beginning
after nucleosome assembly and peaking in mitosis.
Recent work has shown that methylation at a particular lysine in H3 is required for proper cell division.
Also, ADP ribosylation and ubiquitination on histones have been observed, but their effects are not as
yet well understood.
Modification of the DNA itself has profound
effects on chromatin structure and gene expression.
In most eukaryotes, methyl groups are often added to
the cytosine residue in a CG doublet. Silent genes are
often methylated, while active genes are usually not
methylated. Since methylation is found frequently
at the 5′ ends of genes, this modification probably
induces some sort of silent chromatin state that prevents access by RNA polymerase. One example of
genes silenced by methylation is the globin gene
cluster in adult chicken erythroid cells. In mammals,
the inactive X chromosome in females is silenced
by methylation. Studies on methylated genes have
shown that the methylation patterns are heritable,
and models have been proposed as to how such a
state would be propagated.
Replication
The exact mechanism of how chromatin is replicated
is not yet clear, but important observations have been
made on certain aspects of this process. Most of chromatin assembly occurs during S-phase, so the nucleosome is assembled soon after DNA replication. Only
a very small area of DNA is perturbed at a time as
replication occurs: only two nucleosomes in front of
the replication fork are disturbed, and less than 300 bp
behind the fork are without nucleosomes. The first
nucleosome behind the fork is almost complete (lacking only H1), but 450-650 bp behind the fork, fully
assembled nucleosomes are found. As the replication
fork approaches, the histone octamers disassemble
into H2A/H2B dimers and (H3, H4)2 tetramers. The
formation of the new nucleosome occurs in several
stages. The tetramer of histones H3 and H4 is deposited
onto the newly replicated DNA first with the help
of chromatin assembly factor-1 (CAF-1). This is
dependent on replication. Interestingly, H3 and H4
are acetylated on specific lysines when they are first
deposited, and are deacetylated to another form after
they are incorporated. The H2A/H2B dimers are
then deposited in a replication-independent process
that may involve NAP-1 (nucleosome assembly
protein-1). H1 is the last protein added, and the new
nucleosome is made up of both old and new histone
proteins.
The overall state of the chromatin is preserved after
replication. For instance, regions that are silent because
of position effect variegation are maintained. It has
been hypothesized that the modification state of the
chromatin can be the epigenetic mark that is passed
onto new chromatin. During cell division in females,
Further Reading
References
Chromomeres
M Hulten and C Tease
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0200
Further Reading
Cook PR (1995) A chromomeric model for nuclear and chromosome structure. Journal of Cell Science 108: 2927-2935.
Judd BH (1999) Genes and chromomeres: a puzzle in three
dimensions. Genetics 150: 1-9.
Chromosome
J Y Lee and T L Orr-Weaver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1712
The chromosomes of eukaryotic cells may be differentiated by size, the position of the centromere, and
banding patterns made by stains that have various
affinities for certain kinds of DNA sequences. Organisms vary widely in the size of their genomes and the
number and size of their chromosomes that contain
their genetic information. Even two species in the
same genus can show extensive diversity. The Chinese
muntjac deer (Muntiacus muntjac reevesi) has 46
chromosomes, while the Indian muntjac (M. muntjac
vaginalis) has only six chromosomes in the female and
seven in the male. Salamanders of the genus Plethodon
can have 19.5 × 10^9 bp (P. cinereus) or 67.6 × 10^9 bp
(P. vandykei), but in spite of having a difference of
over three times in the number of base pairs, both
species have 28 chromosomes.
Chromosomes can only be visualized when they
are condensed during the cell cycle. For most of the
cell cycle (interphase), one can only see a tangle of
chromatin in the nucleus. In order for the chromosomes to be accurately partitioned during cell division, the chromatin must condense into a more
tightly compacted form. As cells enter prophase, the
chromatin begins to condense into rod-like structures
that become fully formed in prometaphase. At metaphase, the chromosomes are aligned on the metaphase
plate, and the two sister chromatids are segregated at
the metaphase-anaphase transition. The chromosomes decondense back to the diffuse form in telophase. In many organisms, prophase chromosomes
have been observed to take on an arrangement in the
nucleus called the Rabl orientation, in which the centromeres of the chromosomes are at one end of the
nucleus while the telomeres are oriented toward the
other end. This develops as a result of the chromosome arrangement during the previous anaphase,
where the centromeres are the leading part of the
chromosome in moving to the spindle pole. In some
organisms, early meiotic chromosomes can also
arrange themselves into a `bouquet formation,' where
the chromosome ends cluster together at the nuclear
membrane.
There are several types of special chromosomes.
Polytene chromosomes, found in some insect tissues
and giant trophoblast cells of mammals, are formed as
a result of endoreplication. Here, chromosomes are
replicated two or more times without intervening
mitoses, producing many copies of tightly paired, replicated chromosomes. Transcription of these chromosomes can be a mechanism for a cell to produce
proteins in large quantities. Double Minutes are
unstable chromosome-like structures composed of
amplified genes. They do not have formal telomeres
or centromeres and segregate randomly at mitosis.
Double Minutes have only been observed in cancers
Chromosome Aberrations
M A Ferguson-Smith
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0203
diagnosis in subsequent pregnancies. Maternal screening by ultrasound and biochemical tests can help
to determine the risk of an affected pregnancy (see
Prenatal Diagnosis).
Trisomy 18 (see Trisomy 18) and trisomy 13 (see
Patau Syndrome) are less common forms of aneuploidy.
Both cause severe physical and mental handicap and
most affected infants do not survive the neonatal
period. Each has characteristic dysmorphic features
which permit a clinical diagnosis, and the incidence
at birth increases with maternal age.
Most other human autosomes can be involved in
trisomic conceptions, but many of these are anembryonic and all are inviable. The rare exceptions
are chromosomal mosaics, where the survival of the
trisomic cell line is supported by the presence of a
normal cell line. In other types of trisomic conception, return to the disomic state is achieved by
`trisomic rescue,' in which one of the trisomic
chromosomes is lost. If this results in the two remaining chromosomes having the same parental origin, the
term `uniparental disomy' (UPD) is used (see Uniparental Inheritance). The importance of this relates
to the phenomenon of parental genomic imprinting in
which specific genetic loci are inactivated during
either male or female gametogenesis. UPD may therefore result in both alleles being inactivated at one or
more loci, with consequent abnormal effects on the
phenotype. Chromosomes 7, 11, 14, and 15 are known
to be affected by imprinting, whereas others such as
chromosomes 13 and 21 are not. The best-known
example is chromosome 15, in which maternal UPD
occurs in 30% of patients with the Prader-Willi
syndrome, and paternal UPD accounts for 5% of
cases of Angelman syndrome.
Aneuploidy for the sex chromosomes is not
associated with such severe disability as is found in
autosomal aneuploidy. 47,XXY Klinefelter syndrome
(see Klinefelter Syndrome) occurs in approximately 1
in 1000 male births and the main feature is infertility
due to primary hypogonadism. 47,XXX is the equivalent condition in women, who are not infertile but may
have learning difficulties. Similarly, males with a 47,XYY
complement are usually fertile and asymptomatic; they tend to be 4-5 cm taller than average.
The only viable human monosomy is 45,X Turner
syndrome (see Turner Syndrome). However, it is estimated that 98% of such conceptions are inviable and
that placental mosaicism may explain the occurrence
of most of the survivors. The incidence at birth is
approximately 1 in 5000 female births. Short stature
and sexual infantilism are the main findings, together
with complex dysmorphic features, including webbed
neck and congenital heart disease.
Further Reading
Chromosome Banding
A T Sumner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0204
Further Reading
Chromosome Break
J Y Lee and T L Orr-Weaver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0205
Chromosome Bridge
J S Heslop-Harrison
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0206
Chromosome and chromatid bridges occur at anaphase of mitosis and meiosis when chromatids are
not free to separate and form a bridge between the
two sets of segregating chromosomes. The chromatid
or chromosome forming the bridge usually breaks,
leading to duplication of a segment in one daughter
nucleus and deletion in the other. Bridges occur for
Chromosome Dimer
Resolution by Site-Specific
Recombination
D J Sherratt and G D Recchia
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1454
Figure 1 (A) Consequences of an odd number of homologous recombinational exchanges. In linear chromosomes,
recombination between homologs (as shown), or between sister chromatids, always yields linear chromosomes. In
the case of circular chromosomes, any odd number of homologous recombination events will generate a dimeric
chromosome, which requires Xer-mediated resolution for segregation. (B) Schematic representation of the site-specific recombination reaction catalyzed by tyrosine recombinases. Two recombination sites (shown double-stranded) align in antiparallel, each bound by two recombinase monomers. In the case of Xer recombination, XerC
(white ovals) catalyzes the first pair of strand exchanges to form a Holliday junction intermediate. This can then
undergo a conformational change to provide a substrate suitable for its resolution by XerD (shaded ovals). (C) dif
recombination and the Escherichia coli cell cycle. Xer recombination at dif (m) only becomes necessary in the event of
homologous recombinational exchanges between replicating molecules to generate a dimeric chromosome. At the
onset of septation, dimeric chromosomes are resolved by Xer recombination in an FtsK-dependent manner.
Chromosomes are specifically oriented within the cell, with the origin (.) close to one pole and the terminus region
(including dif ) close to the other pole. After replication, both dif sites lie close to the invaginating septum, in the
appropriate position for cell-division-dependent recombination to occur.
rejoins the DNA 3′ phosphate to the 5′ OH. A
complete recombination reaction proceeds by two
sequential pairs of strand exchanges separated by
6-8 bp, the first generating a Holliday junction intermediate, and the second resolving this intermediate
to generate recombinant product (Figure 1B).
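The parity rule stated in the caption of Figure 1A, that any odd number of homologous exchanges between circular chromosomes leaves a dimer, can be sketched as a simple toggle. This is an illustrative model only; the function name `chromosome_state` is hypothetical, and each exchange is assumed to switch the pair of circles between the monomeric and dimeric state.

```python
def chromosome_state(n_exchanges):
    """Return the state of two circular sister chromosomes after n exchanges."""
    dimer = False
    for _ in range(n_exchanges):
        # Assumption: each homologous exchange between circles
        # converts monomers to a dimer, or a dimer back to monomers.
        dimer = not dimer
    return "dimer" if dimer else "monomers"


for n in range(4):
    print(n, chromosome_state(n))
```

In this picture, Xer recombination at dif acts as one additional, site-specific exchange that returns a dimer to the monomeric state.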
Whereas most site-specific recombination systems
utilize only one recombinase, the Xer system is unusual in that two recombinases are required. The roles
of XerC and XerD are temporally and spatially separated, with XerC catalyzing the first pair of strand
exchanges to generate the Holliday junction intermediate, and XerD resolving this intermediate. In
order for XerD to recognize the Holliday junction
and complete the recombination reaction, a conforma-
References
Chromosome Mapping
P L Pearson
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0208
Historical Perspectives
After the independent rediscovery of the principles of
Mendelian genetics at the start of the twentieth century by three botanists (de Vries, Correns, and von
Tschermak) it was quickly realized by Bateson and
Sutton studying the transmission of characters in the
vetch Lathyrus odoratus that, although the majority of
genes appeared to segregate independently from each
other when transmitted from one generation to the
next, as predicted by Mendel's laws, there were exceptions; particular combinations of gene variants (alleles)
appeared to either attract or repel one another, which
Bateson referred to as `coupling' and `repulsion'
respectively. This phenomenon was subsequently correctly interpreted by T. H. Morgan working on gene
inheritance patterns in Drosophila melanogaster to be
caused by the genes concerned being physically
located on the same chromosome and hence the alleles
could not be transmitted independently of each other.
Morgan first observed this for eye-color genes carried
on the X chromosome and later for genes on other
chromosomes, the autosomes. This association was
termed `linkage'.
Morgan and his pupils Sturtevant, Bridges, and
Muller went on to establish that the level of linkage
between genes on the same chromosome was determined by the frequency of meiotic recombination
(meiotic crossing-over, chiasmata) which had occurred
during gamete formation in the parents, such that the
greater the distance between genes the higher the frequency of recombination in the offspring and vice
N R =0:5NR
1=2ln1
2
2
Table 1
Year   Institute/author   Type of marker   Number of random markers   Average density
1987                      RFLPs            403                        ~10 cM
1992                      STRs             813                        ~4.5 cM
1994                      STRs             2066                       ~1.5 cM
1994                      RFLPs, STRs      5840                       ~0.7 cM
directly adjacent loci showing complete linkage disequilibrium and a reduction in linkage disequilibrium
with increasing distance. In the case of the CF gene,
changes in the level of linkage disequilibrium between
the original two flanking markers suggested that the
gene must lie midway between the two markers. This
fact helped in locating the CF gene. Estimates of the average distance over which
complete linkage disequilibrium extends conflict, varying from
3 to 50 kb. In practice, it appears that the relationship
between linkage disequilibrium and distance varies
widely from one chromosome region to another and
each region has to be tested on its own merits. The
lower estimate of 3 kb leads to the conclusion that
over 1 million equally spaced SNPs will be required
to cover the entire human genome in linkage disequilibrium analysis in the hunt for disease alleles of genes
in complex diseases. Most workers in the field are
hoping fervently that the average disequilibrium distance will turn out to be much larger than 3 kb and
consequently that it will require a much smaller number of markers for disease allele detection by linkage
disequilibrium.
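The marker estimate above is simple arithmetic and can be sketched as follows. This is only an illustration: the genome size of roughly 3 × 10^9 bp is an assumption that is not stated in the text, `markers_needed` is a hypothetical helper name, and the model assumes one marker per interval over which complete linkage disequilibrium extends.

```python
# Assumption: haploid human genome of roughly 3e9 bp (not given in the text).
HUMAN_GENOME_BP = 3_000_000_000


def markers_needed(genome_bp, ld_distance_bp):
    """Equally spaced markers needed if each one covers ld_distance_bp."""
    # Ceiling division, so a partial final interval still gets a marker.
    return -(-genome_bp // ld_distance_bp)


low_estimate = markers_needed(HUMAN_GENOME_BP, 3_000)    # 3 kb disequilibrium distance
high_estimate = markers_needed(HUMAN_GENOME_BP, 50_000)  # 50 kb disequilibrium distance
print(low_estimate, high_estimate)
```

With the 3 kb estimate this gives about one million markers, in line with the figure quoted above, whereas the 50 kb estimate would reduce the requirement to tens of thousands.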
chromosome 17 was consistently retained and concluded that the gene for human TK must be on the same
chromosome. Shortly thereafter, a similar selection
system was developed in mouse and Chinese hamster
cell lines based on deficiencies of the HGPRT locus on
the X chromosome. These developments led to the
observation that hybrids between mouse and human
cells tended to lose most human chromosomes except
for the chromosome providing the selectable marker
and that the loss of the other chromosomes was more
or less random. By comparing the pattern of retention
and loss of human chromosomes in independently
derived somatic cell hybrid lines with the pattern
of specific human gene products, such as isoenzymes
and cell membrane markers, it was possible to assign human genes to individual chromosomes. The
approach was quickly implemented by many laboratories and the majority of genes mapped during the
1970s and early 1980s were mapped using somatic cell
hybrids. The use of human cell lines containing translocations which divided particular chromosomes into
two pieces that could be separated from each other in the
somatic cell hybrid system permitted mapping human
genes to subregions of chromosomes. This was termed
regional assignment. Various groups developed extensive regional mapping panels of somatic cell hybrids
to permit rapid construction of maps for individual
chromosomes. For example, such panels for the X
chromosome used nearly 50 different translocation
breakpoints to divide the chromosome into small segments or bins into which genes were mapped. The
resolution provided by such mapping panels exceeded
the resolution of the chromosome banding patterns,
which is approximately 5 Mb.
In 1975 Henry Harris and Stephen Goss used radiation to induce breakage in the human chromosomes
present in a hybrid cell to divide a given chromosome
into small fragments, most of which were subsequently
lost. This then permitted localization of genes to small
chromosome regions. David Cox adapted the same
principle to develop radiation hybrid mapping (RH
mapping) in which human cells are strongly irradiated
and then fused to a rodent cell line using standard
somatic cell hybridization techniques. Approximately
100 hybrid cell lines would contain the whole human
genome arranged as random fragments which overlapped each other. Cox reasoned that if two genes
were adjacent to each other, they would very likely be
carried on the same fragment and be simultaneously
present in a given cell line; the greater the distance
between the two genes then the lower the likelihood
they would be present simultaneously. In many ways
the procedure depends on the same principles as meiotic
recombination mapping in that the frequency of radiation-induced breaks occurring between two genes is
a direct function of the distance between them. The
segregation frequency varies from 0 (the two
markers are never separated) to 1.0 (the two markers
are always broken apart and are therefore unlinked).
A mapping function, almost identical to the Haldane mapping
function used in meiotic recombination studies, was used to
account for the underestimate of the segregation frequency when
the markers are far apart. The units of distance were known as
centirays (cR). Further, by varying the radiation dose
and generating fragments of different average lengths,
it was possible to create maps at different resolutions.
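The relationship between breakage frequency and map distance can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure used by any particular mapping package: the two-point estimator below (discordant lines divided by the number expected if the markers always broke apart) is one commonly cited form, and the Haldane-type function D = -ln(1 - theta) is taken as the mapping function, with the result scaled to centirays.

```python
import math


def breakage_frequency(scores_a, scores_b):
    """Estimate theta, the frequency of breakage between markers A and B.

    scores_a, scores_b: 0/1 retention calls, one per radiation hybrid line.
    Assumed estimator: discordant lines / (T * (R_A + R_B - 2 * R_A * R_B)),
    where T is the number of lines and R_A, R_B are retention fractions.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score vectors must be non-empty and of equal length")
    total = len(scores_a)
    discordant = sum(a != b for a, b in zip(scores_a, scores_b))
    r_a = sum(scores_a) / total
    r_b = sum(scores_b) / total
    expected = total * (r_a + r_b - 2 * r_a * r_b)
    if expected == 0:
        raise ValueError("both markers retained in all lines or in none")
    return min(discordant / expected, 1.0)


def distance_centirays(theta):
    """Haldane-type mapping function, D = -ln(1 - theta), scaled to cR."""
    if not 0 <= theta <= 1:
        raise ValueError("theta must lie between 0 and 1")
    if theta == 1:
        return math.inf  # markers always separated: unlinked
    return -100.0 * math.log(1.0 - theta)


# Hypothetical retention patterns for two markers in ten hybrid lines.
marker_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
marker_b = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
theta = breakage_frequency(marker_a, marker_b)
print(f"theta = {theta:.2f}, distance = {distance_centirays(theta):.1f} cR")
```

The mapping function matters most for distant markers: as theta approaches 1 the corrected distance grows without bound, which is why the raw segregation frequency underestimates large distances.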
It was quickly discovered that, unlike chiasmata,
which are clearly not randomly distributed and lead to increases in the length of the genetic map
towards the ends of the chromosomes, radiation breaks
occurred more or less randomly, resulting in the
construction of RH maps which are proportional to
physical length along the whole chromosome. The only requirement was that the presence of
each human marker could be individually detected
by a PCR reaction in a DNA sample of a radiation
hybrid. This development opened the way to rapidly
establishing the map position of DNA markers, including sequence tagged sites (STSs), expressed sequence
tags (ESTs), and STRs, which were being generated
and used in large numbers in the genome project. In
principle the whole gene mapping community could
use the same RH cell lines to construct a map based on
data from many centers. A particularly powerful feature of the system is that the DNA from standard sets
of RH cell lines is commercially available, permitting
everyone to map individual markers in their own
laboratory. This is done by determining which lines
from the standard set are positive or negative for a
given marker by PCR and sending the results to a
server such as those at the Sanger Centre, the Stanford
Human Genome Center, or the Whitehead Institute.
The map location with supporting evidence for the
reliability of the result and information on the surrounding markers is returned within minutes: gene
mapping made easy.
Sanger Centre: http://www.sanger.ac.uk/Software/Rhserver/
Stanford Human Genome Center: http://www-shgc.stanford.edu/RH/index.html
Whitehead Institute: http://carbon.wi.mit.edu:8000/cgi-bin/contig/rhmapper.pl
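The scoring step described above produces one positive/negative call per hybrid line for the new marker. As a rough two-point illustration only (the servers listed use their own, more elaborate placement methods, which are not reproduced here), a score vector can be compared with those of already mapped markers. The vector encoding ('1' positive, '0' negative, '2' not determined) and the framework marker names are assumptions made for this sketch.

```python
def discordance(vec_a: str, vec_b: str) -> float:
    """Fraction of lines, typed for both markers, in which the PCR calls differ."""
    if len(vec_a) != len(vec_b):
        raise ValueError("vectors must cover the same panel of hybrid lines")
    typed = [(a, b) for a, b in zip(vec_a, vec_b)
             if a in ("0", "1") and b in ("0", "1")]
    if not typed:
        raise ValueError("no lines typed for both markers")
    return sum(a != b for a, b in typed) / len(typed)


def closest_framework_marker(query: str, framework: dict) -> str:
    """Name of the framework marker whose retention pattern best matches `query`."""
    return min(framework, key=lambda name: discordance(query, framework[name]))


# Hypothetical framework markers scored on a ten-line panel.
framework = {
    "FW1": "1100101001",
    "FW2": "1100100001",
    "FW3": "0011010110",
}
query = "1100120001"  # line 6 could not be scored

print(closest_framework_marker(query, framework))
```

Markers with nearly identical retention patterns are rarely separated by a radiation break, so low discordance suggests close physical proximity.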
RH mapping has become a standard mapping method
used in many organisms besides humans. These
include mouse, dog, pig, cow, chicken, and zebrafish.
Type of map    Method    Resolution
are under way to use 3300 bacterial artificial chromosome (BAC) clones which are premapped and equally
spaced along the entire human genome at an average
distance of 1 Mb. Early results by Albertson indicate
that this technique will be a powerful method for
determining the map position and size of chromosome
abnormalities that result in locus copy number
changes such as duplications and deletions.
Physical Mapping
Various types of physical map were originally envisaged to be constructed within the Human Genome
Project with the emphasis on creating clone contig
maps. However, technical developments led to more
types of map being created than originally expected.
Table 2 lists those large-scale physical mapping strategies that have finally been used within the project.
Chromosome breakpoint, in situ hybridization,
and induced chromosome fragmentation maps have
already been considered.
~3 Mb
~3 Mb
~100 kb
~3 kb
500 kb-2 Mb, depending on radiation dose used
Several hundred kb
150 kb with PACs; 40 kb with cosmids
500 kb-2 Mb
1 bp
Transcript Maps
Figure 5 Scheme for positional cloning: family studies, chromosome interval, large-insert clones,
candidate genes, and disease mutation, linked in turn by genetic mapping, physical mapping,
transcript mapping, and gene sequencing. (From Schuler et al. (1996) Science 274: 540-546.)
Met  Val  Ser  Leu  Gln  Pro  Cys
ATG  GTC  TCA  CTG  CAA  CCG  TGT

ATG  GTC  TCA  CTG  TAA  CCG  TGT
Met  Val  Ser  Leu  STOP
Model organisms
Organism                                  Genome size    Chromosomes
Escherichia coli                          4.6 Mb         1
Saccharomyces cerevisiae (yeast)          12 Mb          16
Caenorhabditis elegans (nematode worm)    97 Mb          6
Drosophila melanogaster (fruit fly)       165 Mb         5
Danio rerio (zebrafish)                   1700 Mb        25
Fugu rubripes rubripes (puffer fish)      400 Mb         ?
Mus musculus (house mouse)                3000 Mb        20
(Estimated number of genes: values not recovered.)
Further Reading
Chromosome Movement
R S Hawley
Chromosome Number
M A Ferguson-Smith
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0210
Each species has a characteristic number of chromosomes. The total number in each somatic cell nucleus
is referred to as the diploid number as it consists of a
series of pairs of chromosomes, one member of each
pair being contributed by each parent at fertilization.
In normal human somatic cells there are 23 pairs of
chromosomes and the diploid number is thus 46,
usually indicated as 2n = 46. Gametes (sperm and ova)
contain but one member of each chromosome pair as a
result of the reduction division of meiosis (see Meiosis),
and this number is referred to as the haploid number.
The diploid number varies greatly between species.
The mouse has 40 chromosomes, the cat has 38, and
the dog has 78. The mammal with the smallest number
of chromosomes is the Indian muntjac with 6 chromosomes. In contrast, the black rhinoceros has 84.
See also: C-Value Paradox; Diploidy; Meiosis
Chromosome Painting
J C Strefford
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1558
Figure 1 (See Plate 5) Examples of (A) two-color chromosome painting and (B) multicolor M-FISH. Part (A)
demonstrates the use of two-color chromosome painting to characterize further the prostate cancer cell line, PC3,
helping to elucidate the origin of a complex marker chromosome containing material from chromosome 8 (green)
and 12 (red). Part (B) shows the M-FISH karyotype of the bladder cancer cell line, EJ28. Several chromosome
rearrangements can be seen in this transitional cell carcinoma.
Chromosome Pairing, Synapsis
P B Moens
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0211
Chromosome pairing refers to the lengthwise alignment of homologous chromosomes at the prophase
stage of meiosis. Most sexually reproducing organisms
have two sets of chromosomes, one set inherited from
each parent. For these organisms to produce cells with
a single set of chromosomes, the sets have to be separated such that the daughter cells have one copy of
each chromosome. The responsible cell division is
meiosis and the mechanism is pairing/synapsis and
subsequent separation of homologous chromosomes.
`Pairing' refers to the juxtaposition of a pair of homologs at meiotic prophase, and `synapsis' refers to the
even closer alignment of the homologs, usually via
the parallel alignment of the meiotic chromosome
cores that form the synaptonemal complex. Close
alignment of homologs can occasionally also be
observed in somatic cells, particularly in dipteran
insects, but that phenomenon is not included in this
presentation.
Historical Background
Since the early 1900s, a large number of microscope
observations has been reported on the progression
of meiotic chromosome pairing/synapsis. Originally,
observations were made in the oocytes of newborn
rabbits and later in male and female reproductive cells
of numerous species of mammals, insects, plants, and
fungi. To visualize the chromosomes, the cells/nuclei
were fixed in a mixture of alcohol and acetic acid and
squashed between a glass microscope slide and a thin
coverslip. The chromosomes (from Greek, meaning
colored bodies) were then colored with chromosome-specific natural dyes such as carmine or orcein or with
aniline dyes (the Feulgen procedure), which were particularly useful to quantify the amount of DNA per
nucleus. For example, in the meiotic nucleus of Figure
1A, the 22 chromosomes and the X chromosome of
the common locust are Feulgen-stained. In Figure 1B,
the homologous pairs have synapsed. These pairs,
referred to as bivalents, are more readily visible once
they have shortened as in Figure 1C, where there
are now 11 bivalents and the X chromosome. This
synapsis of homologous chromosomes is one of the
outstanding characteristics of meiosis. At a later
stage, the partners of each bivalent begin to separate
but remain bound at the sites of chiasmata (ch in
Figure 1D).
After 1960, much more detailed images were
obtained through the use of electron microscopy. At
first, limited views of the meiotic nuclei were obtained
by observation of thin sections of the nuclei. With
improved sectioning techniques, complete serial sections of single nuclei were used to give a more complete view of the nucleus by means of computerized
reconstruction of chromosomes and chromosome
cores (Figure 2A, B). From 1973 onwards, the use of
surface spreading in combination with silver staining
gave quick and easy visualization of complete nuclei.
After 1985, antibodies against proteins of the meiotic
chromosomes were widely used for fluorescent microscopy and electron microscopy studies of chromosome cores and synaptonemal complex formation
during meiosis.
Figure 1 Chromosome pairing during meiosis in the male locust. (A) This nucleus undergoing mitotic metaphase
shows that the male locust has 22 chromosomes and an X (arrow) chromosome. (Female locusts are XX; males are
XO.) That number is derived from two sets of 11 chromosomes and the sex chromosome X. (B) During the
pachytene stage of meiotic prophase, the pairs of homologous chromosomes associate with each other and then
form a tight synapsis. (C) As the pairs of chromosomes (now called bivalents) shorten, it is evident that there are now
11 bivalents plus the X chromosome (arrow). (D) At a later stage of meiotic prophase, the partners of each bivalent
are pulled apart while they remain bound at the sites of a reciprocal crossover/chiasma, ch. The arrow marks the
single X chromosome. Scale bars = 10 μm.
Initiation of Pairing
Optical microscope and electron microscope observations on well-differentiated chromosomes or on
chromosome cores indicate that in most organisms,
chromosome pairing can be initiated simultaneously
at several locations along the length of the chromosomes. The existence of internal initiation sites is supported by the observation that if one homolog has
an internal inverted region relative to its partner, one
can see what is called an `inversion loop' at meiosis
(Figure 3). This complex pairing configuration can
only be the result of a pairing initiation site within
the inverted region.
The case has been made that, at least in maize, the
pairing initiation site is also the site of a chiasma
(reciprocal genetic exchange). In the heterogametic
sex of some insects (male flies and female butterflies/
moths), the lack of crossing-over is correlated with
modified chromosome synapsis. In some fly species,
the males have no close synapsis and the limited
association of homologous chromosomes is of a
specialized kind that involves the DNA spacer regions
of the ribosomal genes and other, weaker pairing
sites.
With chromosome painting, it has been reported
that throughout interphase in some fungi, pairs of
Genetic Aspects
It has been observed for over half a century that occasionally some plants or animals may have a defect in
chromosome synapsis which leads to sterility. The
trait is inherited: it is passed on through heterozygous
parents that carry the recessive mutation. These traits
are of medical interest in humans and are of value
for agricultural plant breeding programs where male
and female sterility are manipulated to mass-produce
plants with hybrid vigour.
The genetic defects that lead to synaptic abnormalities and subsequent partial or full sterility have
recently been identified. In yeast and mouse model
systems, it has been reported that pairing/synaptic
aberrations result from defects in the proteins of the
chromosome cores, the synaptic proteins, recombination proteins, and DNA damage/detection/repair
proteins. Most of the defects lead to degeneration
of the meiotic cells and thereby failure to generate
haploid cells.
Further Reading
Chromosome Scaffold
C Heyting
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0212
AT-rich DNA sequences remain bound to the chromosome scaffold. If these scaffold-attached regions
(SARs) are added as cloned fragments to metaphase
chromosome scaffolds, they bind to the scaffolds with
high affinity. The association of AT-rich sequences to
the chromosome scaffold has also been demonstrated
within intact mitotic metaphase chromosomes by
differential fluorescent staining of AT- and GC-rich
sequences. The signal for the AT-rich sequences
(named `the AT queue') colocalizes with the chromosome scaffold as detected by immunolabeling of topo II
(Saitoh and Laemmli, 1994).
The role of the major scaffold proteins in chromosome organization is far from elucidated. Topo II can
catalyze the passage of one double-stranded DNA
molecule through a transient double-strand break in
another DNA molecule and thereby catenate and
decatenate DNA. SARs are highly enriched for the
consensus DNA sequence for cleavage by Topo II.
In vitro, Topo II binds selectively and cooperatively
to SARs. In vivo, binding sites of Topo II to DNA
have been mapped by means of drugs that inhibit
Topo II at the cleavage step. Many double-strand
DNA breaks that were thus generated were localized
within SARs, and, therefore, Topo II binds to SARs
in vivo. Topo II is required for late steps in chromosome condensation and for decatenation of sister
chromatids at anaphase. These two functions could
be linked. Residual catenation of sister chromatids
possibly sterically hampers chromosome condensation, whereas condensation could push the equilibrium between catenation and decatenation by Topo II
toward decatenation of sister chromatids.
SMC2 (ScII) can interact by its C-terminal domain
with DNA, in particular with AT-rich sequences with
a tendency to form secondary structures. Because
many SARs have these features, it is possible that
SMC2 binds to SARs in vivo. SMC2 can form a heterodimer with another SMC protein, SMC4. In
metaphase chromosomes of Xenopus spp., proteins
homologous to SMC2 and SMC4 colocalize in the
chromosome scaffold. SMC2 and SMC4 occur also
as a heterodimer in a 13S protein complex from Xenopus egg extracts that fulfils an essential role in chromosome condensation in vitro. This 13S condensin
contains, besides the SMC2/SMC4 heterodimer,
three other protein components, which are all evolutionarily highly conserved. It has been proposed that,
during chromosome condensation, 13S condensin
forms large, positive supercoil loops in DNA, which it
fastens and organizes into a regular solenoidal structure. It is conceivable that, during such a process, 13S
condensin or components end up in a chromosome
scaffold. However, it remains to be established
whether all 13S condensin components contribute
to the structural organization of the chromosome
scaffold, and which components specifically recognize
SARs during chromosome condensation.
The chromosome scaffold has some features in
common with the nuclear matrix or nuclear scaffold,
which represents the insoluble fraction from interphase nuclei after protein depletion and nuclease
digestion. All analyzed SARs also remain attached to
nuclear matrices and are then named MARs (matrix
attached regions), whereas many but not all MARs are
also SARs. There are also pronounced differences
between the two structures. For instance, SMC2
(ScII) is not part of nuclear scaffolds.
References
Chromosome Structure
A T Sumner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0213
Uninemy
The undivided chromosome is unineme, that is,
only a single DNA molecule runs throughout the
length of a chromatid. The evidence for this is: (1)
DNA replicates semiconservatively, and chromosomes themselves replicate semiconservatively; uninemy is the simplest explanation for this; (2) DNA
molecules from species with small chromosomes
have the sizes that would be expected if there were
only a single molecule per chromosome; (3) the
kinetics of chromosome breakage by low-energy
X-rays or by DNase are consistent with the presence
of only a single DNA fiber; and (4) the axial fiber of
lampbrush chromosomes is only wide enough to contain two DNA molecules, one for each chromatid.
Chromosomal Fibers
The chromosomal DNA fiber is packed into a 10 nm
nucleosomal fiber, producing a packing ratio of about
7, and this, in turn, is packed into a 30 nm fiber, the
solenoid, which again produces a packing ratio of 7,
giving a total of about 50. The 30-nm fiber seems to be
the basic unit of chromosome organization, and is also
found in interphase nuclei. The further 200-fold compaction to form the metaphase chromosome appears
to involve further folding of the 30-nm fiber, and not
its reorganization into thicker fibers.
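The packing ratios quoted above multiply at each level of folding. A small sketch of that arithmetic, using only the figures given in the text (the labels are descriptive and not a formal nomenclature):

```python
# Successive levels of folding and the packing ratio contributed by each,
# as quoted in the text.
PACKING_STEPS = [
    ("DNA to 10 nm nucleosomal fiber", 7),
    ("10 nm fiber to 30 nm fiber (solenoid)", 7),
    ("30 nm fiber to metaphase chromosome", 200),
]


def cumulative_packing(steps):
    """Return (label, overall packing ratio so far) for each folding level."""
    overall = 1
    result = []
    for label, ratio in steps:
        overall *= ratio
        result.append((label, overall))
    return result


for label, overall in cumulative_packing(PACKING_STEPS):
    print(f"{label}: about {overall}-fold overall")
```

The two sevenfold steps give 49, which the text rounds to about 50, and the further 200-fold compaction brings the overall figure to roughly 10 000.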
Chromosome Scaffolds
Metaphase chromosomes examined by electron
microscopy only show an apparently random tangle
of chromatin fibers, but a variety of evidence, in particular the reproducibility of detailed chromosomal
banding patterns, indicates that chromatin fibers are
organized in a reproducible pattern. It is now accepted
that chromosomes consist of a proteinaceous `scaffold' from which radiate chromatin loops. The details
of the scaffold, as seen by electron microscopy, are
quite variable, it being sometimes a compact structure,
and sometimes much more diffuse.
Two main proteinaceous components of the scaffold have been identified. One of these is the important nuclear enzyme topoisomerase II (Topo II),
which is involved in many processes that require topological alterations of the DNA molecule, including
replication, transcription, DNA repair, and the
decatenation (separation) of newly replicated DNA
molecules. Topo II also seems to be necessary for
chromosome condensation. The function of Topo II
in the chromosome scaffold may be primarily structural; there is evidence that specific sequences in the
chromosomal DNA may attach themselves to Topo
Chromatin Loops
Although there is a single DNA molecule running
throughout each chromatid, the DNA behaves as if it
consists of much smaller units. The DNA is attached
to the scaffold at frequent intervals, and forms loops
with characteristic properties. The loops of DNA
extend 6-30 μm from the scaffold, or up to 100 kb.
The loops probably radiate in all directions from
the scaffold, to form a rosette. It has been claimed
that the structure formed by loops attached to a
scaffold is no more than 0.2-0.3 μm in diameter. It
has been estimated that the degree of compaction
produced by the loops would be in the region of 40-fold, to produce a total packing ratio for the DNA of
about 2000.
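Two of the figures in this paragraph can be checked with a short calculation. The sketch assumes a rise of 0.34 nm per base pair for extended B-form DNA (a standard value that the text does not state) and takes the packing ratio of about 50 for the 30 nm fiber from the section on chromosomal fibers.

```python
NM_PER_BP = 0.34  # assumed rise per base pair of B-form DNA


def contour_length_um(kilobases: float) -> float:
    """Length in micrometers of fully extended DNA of the given size in kb."""
    base_pairs = kilobases * 1000
    return base_pairs * NM_PER_BP / 1000  # nm to micrometers


def total_packing(fiber_ratio: float, loop_ratio: float) -> float:
    """Overall packing ratio when loop compaction acts on the folded fiber."""
    return fiber_ratio * loop_ratio


print(f"100 kb of extended DNA: {contour_length_um(100):.0f} micrometers")
print(f"Total packing ratio: about {total_packing(50, 40):.0f}")
```

The second result reproduces the total of about 2000 quoted in the text: a 40-fold compaction by looping applied to a fiber that is already packed about 50-fold.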
The DNA loops are attached to the scaffold by the
scaffold attachment regions (SARs), DNA sequences
that remain attached to scaffolds after exhaustive
digestion. SARs are AT-rich, and generally appear to
contain a consensus sequence for topoisomerase cleavage, consistent with the observation that Topo II
is a major scaffold protein, although not all SARs
appear to contain a sequence for Topo II cleavage,
and SARs appear to be able to bind to scaffolds lacking
Topo II. SARs are never found in coding sequences,
and are between about 3 and 140 kb apart. It has
been found that they flank genes, and coincide with
the boundaries of the nuclease-sensitive domains
associated with active genes. They have also been
associated with origins of replication, and the sizes
of loops are similar to the sizes of replicons. SARs
may therefore be functional units of chromatin and
chromosome organization.
There is considerable variation in the size of loops,
which may correspond to variation in sizes of replicons or transcribed domains. However, there may
also be more systematic differences in loop sizes. The
more frequent attachment of ribosomal DNA to the
scaffold, with correspondingly shorter loops, might
explain the existence of a secondary constriction at
the sites of nucleolar organizer regions (NOR); similarly, the tendency of alphoid satellite (in humans) to
associate with the scaffold rather than with the loops
might account for the centromeric constriction. However, other explanations of chromosomal constrictions
are possible.
Chromosome Periphery
Chromosomes do not consist only of chromatin loops
radiating from a core; they also have a characteristic surface layer. This has been called the chromosome periphery, or perichromosomal material, and
consists of closely packed fibrils and granules, consisting of ribonucleoproteins (RNP), while more recently,
several proteins have been identified at the surface of
Chromosome Walking
L Stubbs
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0201
A chromosome walk begins with isolation of a foundational set of clones derived from the genomic region
of interest. This foundational set generally consists of
a clone or clones isolated by screening a specific type
of genomic library; for the purposes of this discussion
I will use the example of walking in a bacterial artificial chromosome (BAC) clone library. These first
clones are typically identified by hybridization of
recombinant bacterial colonies stamped onto nylon
filters with a probe designed from a gene, microsatellite sequence, or other marker (marker A in Figure 1).
Alternatively, the clones may be identified through
polymerase chain reaction (PCR) screening of DNA
prepared from pooled recombinants using specific sets
of oligonucleotide primers. Depending on the depth
of the library that is screened (that is, the average number
of times a particular sequence is represented in different clones comprising the library), this first screening
generally will permit the isolation of a foundation set
of overlapping clones, each containing the marker of
interest and extending to different lengths in either
direction. Such a collection of overlapping clones is
termed a `contig' (Figure 1; see below). The degree of
overlap between specific clones in the contig is generally determined by comparing restriction enzyme
fragment patterns produced by each clone, a process
called `restriction fingerprinting.' Assembly of clones
based on the presence of shared and unique restriction
fragments can produce a restriction map of the contig
if a large enough number of clones is examined with
informative enzymes. These maps permit the order
and placement of clones within the contig to be established.
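The idea behind restriction fingerprinting can be sketched as a comparison of fragment-size lists. This is a simplified illustration, not a production contig assembler: the clone names, fragment sizes, and the 2% size tolerance are all hypothetical, and fragments are matched greedily.

```python
def shared_fragments(frags_a, frags_b, tol=0.02):
    """Count fragments of clone A matching a distinct fragment of clone B.

    Two fragments match when their sizes differ by no more than `tol`
    (relative to the larger one); each fragment of B is used at most once.
    """
    remaining = sorted(frags_b)
    shared = 0
    for size in sorted(frags_a):
        for i, other in enumerate(remaining):
            if abs(size - other) <= tol * max(size, other):
                shared += 1
                del remaining[i]  # this fragment of B is now used up
                break
    return shared


# Hypothetical fingerprints (fragment sizes in bp) for three BAC clones.
clones = {
    "BAC1": [1200, 3400, 5600, 8000],
    "BAC2": [700, 3400, 5600, 8000, 9500],
    "BAC3": [700, 2100, 4300, 9500],
}

names = sorted(clones)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        count = shared_fragments(clones[first], clones[second])
        print(f"{first} and {second} share {count} fragment(s)")
```

In this toy data set BAC1 and BAC3 share nothing directly but both overlap BAC2, which places BAC2 between them in the contig.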
References
Occurrence
Many viruses and plasmids have circular chromosomes, typically DNA duplexes, throughout their
life cycles. These replicons manifest circular linkage
maps, implying that there is no single spot at which an
exchange always occurs.
In T-even bacteriophages of Escherichia coli (e.g., T2
or T4), the chromosomes in the virion are linear
but the gene orders among the particles of the same
clone are circular permutations of each other. Thus, in
some particles, members of a given marker pair are
farther apart, physically, than they are in the other
particles. Crosses typically involve infection of
numerous host cells by several phage particles of
each of two genotypes. The resulting linkage maps,
based on recombinant frequencies among the phage
particles produced, are circular (Stahl and Steinberg,
1964).
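Circular permutation can be made concrete with a small sketch. The gene labels A to F are hypothetical; the code checks whether two linear gene orders are rotations of one another and shows how the separation of a given marker pair differs between particles.

```python
def is_circular_permutation(order_a, order_b) -> bool:
    """True if `order_b` is a rotation of `order_a` (same circle, different ends)."""
    a, b = list(order_a), list(order_b)
    if len(a) != len(b):
        return False
    if not a:
        return True
    doubled = a + a  # every rotation of `a` appears as a window of a + a
    n = len(a)
    return any(doubled[i:i + n] == b for i in range(n))


def linear_separation(order, gene_x, gene_y) -> int:
    """Number of positions between two genes on one linear chromosome."""
    order = list(order)
    return abs(order.index(gene_x) - order.index(gene_y))


particle_1 = "ABCDEF"
particle_2 = "DEFABC"  # same circular order, different chromosome ends

print(is_circular_permutation(particle_1, particle_2))
print(linear_separation(particle_1, "A", "F"))
print(linear_separation(particle_2, "A", "F"))
```

Genes A and F lie at opposite ends of the first particle but are adjacent in the second, so averaged over a population of permuted particles no pair of neighbors on the circle is consistently far apart, and the resulting linkage map has no ends.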
Bacteria that carry the sex factor can conjugate
with other bacteria and transfer the sex factor to
them. Clones of donor E. coli cells in which the sex
factor F is integrated into the bacterial chromosome
(Hfr cells) efficiently transfer the segment of chromosome to one side of F. Artificial interruption of
synchronous conjugation can determine the time of
transfer of a given marker, providing a basis for determining the order of markers. The gene orders obtained
with different Hfr strains constitute a set of overlapping linear map segments which can be assembled
uniquely into one circular linkage map for E. coli. By
convention, the distances shown on the map are in
References
Cis-Acting Locus
T Werner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0217
Detection
Locus control regions (LCRs) are absent from constitutively expressed chromatin domains, which is consistent with their function of activating silent genes.
They are also not yet structurally well defined but a
few known examples suggest an organization that
includes features present in S/MARs and enhancers.
Functions
LCRs can open condensed chromatin domains, thereby activating enhancers. Like S/MARs they control
gene expression independently of their chromosomal
position. LCRs also act on a long-range time scale, in
contrast to enhancers and promoters. When a silent
domain has been experimentally derepressed, establishment of the repressed chromatin structure requires
that cells pass through S-phase. Thus, LCRs act as
long-term on/off switches in the chromosome. LCRs
are also usually detected experimentally on the basis
of effects on transcription control. They contain
various binding sites for activating proteins (TFs).
Structure
Function
Structure
Detection
Promoters
In general, the promoter is an integral part of the gene.
The behavior of a given promoter often makes sense
only in the context of its own gene, especially if the
frequency of transcription is determined outside of
the promoter (e.g., by an enhancer). The promoter
by definition marks the beginning of the first exon of
a gene and always contains the transcriptional start
site (TSS). There are three different promoter types
in higher eukaryotic sequences, named after their
respective RNA polymerases, I, II, and III. Since
most of the regulated cellular genes are transcribed
by polymerase II, only these promoters will be described in detail.
Function
Structure
Detection
Further Reading
Arnone MI and Davidson EH (1997) The hardwiring of development: organization and function of genomic regulatory
systems. Development 124: 1851-1864.
Boulikas T (1995) Chromatin domains and prediction of MAR
sequences. International Review of Cytology 162(A): 279-388.
Kamakaka RT (1997) Silencers and locus control regions: opposite sides of the same coin. Trends in Biochemical Sciences 22:
124-128.
Roeder RG (1996) The role of general initiation factors in
transcription by RNA polymerase II. Trends in Biochemical
Sciences 21: 327-335.
Cis-Acting Proteins
K M Derbyshire
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0216
Cis-acting proteins are an unusual class of DNA-binding proteins that act preferentially on DNA sites
located close to the gene from which they are expressed.
This is in sharp contrast to the majority of proteins,
which are freely diffusible (trans-acting) and can act at
many different locations in the genome with equal
efficiency. In fact, a protein's ability to freely diffuse
in a bacterium is a basic requirement of the classical
complementation test used to determine if two mutations affect the same gene function. Cis-acting proteins were originally identified using such an assay:
they exhibited weak complementation of a defective
allele when a wild-type copy of the gene was supplied
in trans.
Protein instability
Sequestration of protein
Further Reading
Cis-Dominance
M Goldman
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0218
such as the lac operator locus, must be located in cis in
order to influence transcription. `Cis-dominance'
refers to the action of a mutation in such a regulatory
sequence. When the mutation is in cis with the structural gene (coding sequence), its effect is observed
phenotypically, as if it were a dominant mutation.
When it is located in trans to the structural gene, however, there is no effect of the mutation. (Operationally,
the term is probably not any different from the concept of a cis-acting mutation.)
See also: Cis-Acting Locus; Operon; Regulatory
Genes
Cis-Trans Configurations
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0219
Cistron
B S Guttman
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0221
Genes have traditionally been defined, both theoretically and operationally, by three criteria: biological
function, mutability, and recombination with other
genes. In his analysis of fine genomic structure,
Seymour Benzer (1957) defined three elementary
units corresponding to these criteria: the cistron as a
unit of function, the muton as a unit of mutation, and
the recon as a unit of recombination. The term cistron is hardly used any longer except in combined
forms (polycistronic: containing more than one cistron), but the concept is still important. It is based
on the theoretical conception of a gene as a structural unit, specifically a portion of the genome that
encodes a single protein (polypeptide), and ideally
a cistron ought to be identical with a gene so conceived.
In Benzer's system, a cistron is defined on the basis
of a complementation test, performed by putting two
copies of a gene in the same cytoplasm to observe their
interactions; with a phage such as T4, the system
Benzer used, this is done by simultaneously infecting
bacteria with two mutants, and the idea is most easily
explained with this system. The bacterial host is
chosen for restricting the growth of all the mutants
involved, so no mutant by itself can grow on this host.
However, sometimes a cell infected simultaneously
with two distinct mutants produces phage because
the mutations complement each other. That is, each
mutant phage is still capable of supplying a function
that the other is missing. In each experiment, either the
two mutations affect the same cistron (i.e., the same
gene as defined above) or two different cistrons (call
them A and B); also, the mutations can either be in the
same genome, cis to each other, or in different genomes, trans to each other. Thus there are four possible
experimental situations:
1. One mutant has a defective A gene, the other a
defective B gene; the mutations are trans to each
other. Since the A mutant has a wild-type B gene
and the B mutant a wild-type A gene, the phage
should complement each other and multiply normally.
2. One mutation affects the A gene, the other the B
gene, but they have been recombined so both are in
the same genome, cis to each other; the other genome has only wild-type genes. This is a control to
ensure that a mutant gene does not somehow dominate a normal wild-type gene, perhaps by producing a toxic product; the phage should multiply
normally.
3. Both mutations affect the A gene and they are trans
to each other. Since neither genome has a wild-type
A gene, the phage should not multiply.
4. Both mutations affect the A gene and they are cis to
each other; the other genome has only wild-type
genes. Since one genome has both wild-type genes,
the phage should multiply normally.
Thus, a complementation test is sometimes called a
cis-trans test, and a cistron is then a region of a genome
defined by a set of mutations that are located together,
as determined by mapping experiments, and do not
complement one another. In practice, it may be difficult to carry out unambiguous complementation tests,
but in a well-controlled system, this is a classical and
still useful way to determine the limits of a gene.
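The logic of the four experimental situations can be sketched in a few lines of Python. This is only an illustrative model, not part of Benzer's procedure: the function name `phage_multiplies`, the two-cistron universe `{"A", "B"}`, and the representation of a genome as the set of cistrons it carries in mutant form are all assumptions made for the example.

```python
def phage_multiplies(mutations, arrangement, cistrons=("A", "B")):
    """Predict growth on the restrictive host after a mixed infection.

    mutations   -- pair naming the cistron hit by each mutation,
                   e.g. ("A", "B") or ("A", "A")
    arrangement -- "trans": one mutation in each genome
                   "cis":   both mutations in one genome, the other wild type
    """
    if arrangement == "trans":
        genomes = [{mutations[0]}, {mutations[1]}]
    elif arrangement == "cis":
        genomes = [set(mutations), set()]
    else:
        raise ValueError("arrangement must be 'cis' or 'trans'")
    # Phage are produced only if, for every cistron, at least one of the
    # two infecting genomes still carries an intact (wild-type) copy.
    return all(any(c not in genome for genome in genomes) for c in cistrons)


# The four situations listed in the text, in order.
for muts, arr in [(("A", "B"), "trans"), (("A", "B"), "cis"),
                  (("A", "A"), "trans"), (("A", "A"), "cis")]:
    print(muts, arr, phage_multiplies(muts, arr))
```

Only the third situation (two mutations in the same cistron, in trans) is predicted to fail, which is why failure to complement in trans, together with growth in the cis control, assigns two mutations to the same cistron.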
Reference
Clade
E Mayr
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0222
Clade is the term that denotes a phyletic lineage, consisting of a stem species and all the species derived
from it. A branch in a cladogram, in a formal cladification, is termed a cladon.
See also: Cladistics; Cladogenesis; Cladograms
Cladistics
Cladogenesis
E Mayr
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0223
Further Reading
E O Wiley
Cladograms
E O Wiley
Cladistics, or phylogenetic systematics, is a systematic and taxonomic discipline. Hennig (1960, 1966)
founded the discipline, although he was certainly not
the first to use many of its principles. It provides a
method for reconstructing the phylogenetic relationships between species and higher taxa. Species are
grouped into natural, or monophyletic, groups based
on sharing of synapomorphic homologs while plesiomorphic homologs are rejected as valid evidence for
relationship. Phylogenetic or cladistic classifications
are characterized by containing only strictly monophyletic groups (containing an ancestor and all descendants of the ancestor) while specifically rejecting
paraphyletic groups (one or more descendants removed from the group) or polyphyletic groups (ancestor not logically included in the group).
References
which contains both the group studied and one or
more outgroups.
See also: Clade; Phylogeny; Taxonomy,
Evolutionary; Trees
Class Switching
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1786
Cleavage
See: Nuclease, Restriction Endonuclease
Recurrence Risk
Because the etiologies of CL±P and CP are so largely
undefined, the counseling of affected families relies
almost entirely on empirical studies of recurrence
risk. For Caucasians, it has been found that if the
proband has other affected first- and/or second-degree
relatives, the risk to subsequent siblings or offspring is
about 15%. If the proband has no other affected first-
and/or second-degree relatives, the risk is about 3–5%.
Unfortunately, similar empirical risk determinations
for other racial groups have not been made, but it is
generally agreed that the above estimates are reasonable for non-Caucasians as well.
Figure 1 Fraser-Juriloff model of CP susceptibility. The roof with holes in it represents the maternal
barrier between teratogen (arrows) and embryo. The x-axis represents the phenotypic distribution, normal to the left
of the vertical threshold and abnormal to the right; the threshold separates palate closure from palate nonclosure.
(A) Palate closure is normally late (slow growth), so the phenotypic distribution for this genotype (dashed curve) is near
the threshold, and the delaying effect of the teratogen causes all embryos (solid curve) of this genotype to fall beyond
the threshold and be affected (hatched area). (B) In an early closing (faster growth) genotype, the same delay causes a
minority of embryos to be affected. Of course, these two cases are the outer boundaries of the model, and there will be
many genotypes (dashed curves) at varying distances to the left of the threshold. (Reproduced with permission from
Fraser FC (1980) Animal models for craniofacial disorders. In: Melnick M, Bixler D and Shields ED (eds) Etiology of Cleft
Lip and Cleft Palate, pp 1–23. New York: Alan R. Liss. Reprinted with permission of John Wiley & Sons, Inc.)
See also: Dysmorphology
Clinical Genetics
J M Connor
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0229
Clock Mutants
C P Kyriacou
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0230
Figure 1 (A) The Drosophila interlocked circadian feedback loops. The rectangles filled in italics represent the
coding regions of genes; their promoters are shown as a straight line, and transcription is shown as a horizontal arrow
rising from the gene, from which transcripts (single-stranded squiggles) arise. The shapes filled with roman letters are
the proteins. `+' and `-' represent whether the transcription factors activate or inhibit transcription,
respectively. The CLK/CYC transcription factors bind to the per/tim promoters and activate transcription during the
day and early night. The PER protein is initially degraded by the actions of DBT kinase, but TIM protein blocks this
phosphorylation of PER, allowing PER levels to rise. PER-TIM dimerization and translocation of the dimer into the
nucleus late at night blocks per/tim transcription by sequestering the positive transcription factors CLK/CYC. CLK/
CYC also repress transcription of Clk (either directly or indirectly, hence the `?'), so when the PER-TIM dimer moves
into the nucleus, it derepresses Clk, allowing CLK levels to build up during the day, after which they dimerize with
CYC and reactivate per/tim but repress Clk transcription. (B) The mouse interlocked loop. The functionally equivalent
clock genes and molecules are shown for the mouse circadian system. Protein shapes that are similar to those in
Drosophila represent homologous proteins. Thus fly DBT is equivalent to mouse CK1ε (casein kinase 1ε), which is able
to phosphorylate mPER, although whether the delay between mPER translation and its function is mediated by CK1ε
is not proven, hence the `?'. The key genes are mPer2, mCry, and bmal1. Dimerization between mCRY and mPER2
allows nuclear translocation, but the repression of mPer2 and mCry transcription is by mCRY alone. Similarly,
activation of bmal1 is by mPER2 acting with an unknown (`?') transcription factor.
to a level that corresponds to that several hours in the
future, thereby generating a phase advance. This simple molecular model provides a compelling explanation for the apparently complex PRC. Entrainment
of circadian cycles to different light-dark regimes and
the PRC can be understood in terms of TIM light sensitivity, which is itself mediated by the blue-light
receptor CRYPTOCHROME (CRY). Under certain
conditions, mutation in the cry gene gives abnormalities in entrainment, although in constant darkness
this mutation has little effect on the normal circadian rhythm. CRY can therefore be thought of as a
circadian photoreceptor and part of the clock input
pathway, because it interfaces between the environmental light-dark cycle and TIM, but not as a bona fide
clock component.
Bread Mold
Cyanobacteria
Finally, the cyanobacterium Synechococcus is a unicellular organism that shows circadian cycles in a
number of physiological characteristics including
photosynthesis and cell division. With a reporter
gene strategy in which circadian cycles of bioluminescence were targeted for mutagenesis, a large number
of clock mutants have been isolated. These reveal an
essential clock gene cluster, kaiA, kaiB, and kaiC,
whose products do not share sequence similarity
with any of the eukaryotic clock proteins described
above. KAIC negatively regulates its own gene, providing the basic feedback loop and KAIA enhances
Further Reading
Clone
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1787
A clone is a large number of identical cells or molecules arising from a single progenitor (ancestral) cell
or molecule.
See also: Cloned Organisms; DNA Cloning;
Whole Organism Cloning
Cloned Organisms
D Solter
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0231
References
Cloning Vectors
I Schildkraut
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0233
selectable characteristic such as an antibiotic resistance gene to enable host cells which harbor the vector
to be distinguished from the cells that do not harbor
the vector.
See also: Vectors
Coalescent
C Neuhauser and S Tavaré
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1418
$k = 0, 1, \ldots, N$

[Figure 1, with labels $T_2$, $T_3$, $T_4$, $T_5$]

$N \to \infty$
then mutations occur along the branches of the coalescent tree according to a Poisson process with rate
$\theta/2$, independently in each branch of the tree.
The distribution of the total number of mutations
in the sample since their most recent common ancestor
follows readily. Given the total length $T$ of the
branches in the tree, which is
$$T = \sum_{j=2}^{n} j\,T_j,$$
j
1 1
N j
1
with
1
p1
uN 1 p1 sN uN
p 1 p1 sN
$N \to \infty$
and
$\lim_{N \to \infty} 2N s_N$
[Figure: genealogy with genes of allelic types A1 and A2, marking the ultimate ancestor and the MRCA of the sample]
Recombination
To describe the genealogy of two linked loci, L1 and
L2, we assume that the population is of fixed size N
and evolves according to the neutral Wright-Fisher
model. Recombination occurs independently in each
offspring. In each generation, with probability $1 - r$
each offspring independently inherits the genes at loci
L1 and L2 from the same chromosome; with probability $r$ the genes are inherited from different chromosomes (i.e., a recombination event occurred).
When the population is large and
$$\lim_{N \to \infty} 2Nr = \rho,$$
the genealogy of a sample of size n can be approximated by a continuous time Markov chain R(t), $t \ge 0$,
where time t is measured in units of N generations.
This Markov chain, known as the `ancestral recombination graph,' can be described as a graph that contains
the lineages of each individual of the sample. Following a lineage backwards in time on this graph, recombination events occur at rate $\rho/2$. At such times, the
lineage of the two loci L1 and L2 splits, which results in
a branching event. One branch follows the ancestry of
one locus, the other branch follows the ancestry of the
other locus. Common ancestry is again represented by
the coalescing of branches. An example is given in
Figure 4.
The dynamics of this recombination graph are
given as follows. If there are k branches in the graph,
then a coalescing event occurs at rate $\binom{k}{2}$, that is, each
pair of branches coalesces at rate 1; a branching event,
in which a branch splits into two, occurs at rate $k\rho/2$,
that is, each branch splits into two at rate $\rho/2$.
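The dynamics just described can be illustrated with a small simulation. The sketch below follows only the number of branches k, not which locus each branch is ancestral to, so it is a simplification of the full graph. The function name `simulate_branch_counts`, the parameter name `rho` for the scaled per-branch recombination parameter, and the `max_events` safety cap are assumptions introduced for the example.

```python
import random


def simulate_branch_counts(n, rho, rng=None, max_events=100_000):
    """Follow the number of branches k backwards in time.

    n   -- sample size (k starts at n)
    rho -- scaled recombination parameter; each branch splits at rate rho / 2
    Returns a list of (time, event, k_after_event) tuples.
    """
    rng = rng or random.Random()
    k, t, events = n, 0.0, []
    while k > 1 and len(events) < max_events:
        coalescence_rate = k * (k - 1) / 2   # rate 1 per pair of branches
        branching_rate = k * rho / 2         # rate rho / 2 per branch
        total_rate = coalescence_rate + branching_rate
        t += rng.expovariate(total_rate)     # waiting time to the next event
        if rng.random() < coalescence_rate / total_rate:
            k -= 1
            events.append((t, "coalescence", k))
        else:
            k += 1
            events.append((t, "branching", k))
    return events


if __name__ == "__main__":
    for time, event, k in simulate_branch_counts(5, 1.0, random.Random(2)):
        print(f"t = {time:.3f}  {event:<12} k = {k}")
```

Because coalescence happens at a rate quadratic in k while branching is only linear in k, the number of branches is eventually driven down to one, the ultimate ancestor of the graph. With `rho = 0` the sketch reduces to the ordinary coalescent, with exactly n - 1 coalescence events.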
If one adopts the convention that branches that
correspond to the L1 locus are drawn to the left and
branches that correspond to the L2 locus are drawn to
the right at branching points, then the ancestry of each
locus can be traced separately by following the paths
to the left for the L1 locus and to the right for the L2
locus at each branching point. It follows that the
ancestry of each locus is given by the neutral coalescent process and each subtree has its own most recent
[Figure 4: example of an ancestral recombination graph, marking the MRCA of locus L1 and the MRCA of locus L2]
N!1
Ni
mij ij
Nj
Strong Selection
Under selection, demography and mutation become
inseparable which results in a more complicated
Other Coalescents
The structure of the coalescent has been identified for
a wide variety of other phenomena, such as nonrandom mating (e.g., selfing), different sexes, age structure, and so on. We have assumed in our exposition
that mutation, recombination, and selection rates are
of the order of the reciprocal of the population size. In
cases where this is not true, other behavior for the
genealogy is possible; discrete time branching processes arise in this context.
Inference
An important use of coalescents arises when using
random population samples to estimate population
parameters such as $\rho$, $\theta$, and $\sigma$. A number of
approaches have been proposed for this purpose,
including those based on the behavior of summary
statistics (for example, the number of segregating
sites observed in a sample of DNA sequences is often
used to estimate $\theta$). Full likelihood methods and
Bayesian approaches are currently of great interest,
particularly as they provide an inferential framework
for mapping disease genes by linkage disequilibrium
mapping, and by haplotype sharing. Importance
sampling and Markov chain Monte Carlo approaches
have proved useful in this context.
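As a concrete instance of a summary-statistic approach, the sketch below counts segregating sites in a set of aligned sequences and converts the count into an estimate of the scaled mutation parameter (called `theta` here). The estimator used is the one usually attributed to Watterson, which the text above does not name; it assumes the infinitely-many-sites model, under which the expected number of segregating sites in a sample of size n is theta times the sum of 1/i for i from 1 to n - 1. The function names and the toy sequences are hypothetical.

```python
def count_segregating_sites(sequences):
    """Number of alignment columns in which more than one state is observed."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned and of equal length")
    return sum(1 for column in zip(*sequences) if len(set(column)) > 1)


def watterson_theta(sequences):
    """Estimate theta from the number of segregating sites S.

    Uses theta_hat = S / a_n with a_n = sum_{i=1}^{n-1} 1/i,
    assuming the infinitely-many-sites model.
    """
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    a_n = sum(1.0 / i for i in range(1, n))
    return count_segregating_sites(sequences) / a_n


if __name__ == "__main__":
    sample = ["ACGT", "ACGA", "ACTT"]   # toy data, not from any real study
    print(count_segregating_sites(sample), watterson_theta(sample))
```

The estimate is per aligned region, not per site; dividing by the sequence length would give a per-site value.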
Further Reading
History
Discrete variation in external morphology or appearance that segregates in pedigrees provides the cornerstone of genetics in all organisms, and therefore it is no
surprise that coat color variation has played a crucial
role in animal genetics.
Pioneering studies in this field were carried out at
the turn of the century by William Castle and two of
Further Reading
Coding Sequences
J Parker
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0236
Coding Strand
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1790
Codominance
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0237
Mutation Biases
Even in the absence of natural selection, DNA
sequences rarely consist of equal proportions of the
four nucleotides A, C, G, and T. Mutational biases are
pervasive and lead to biased patterns of codon usage,
with different patterns in different species, and sometimes different patterns in different genes from the
same genome, dependent on their location.
Genome GC Content
Table 1

                                    Phe         Leu
Species                    Genes    UUU   UUC   UUA   UUG   CUU   CUC   CUA   CUG
Escherichia coli           High     0.3   1.7   0.0   0.1   0.1   0.2   0.0   5.6
                           Low      1.0   1.0   0.9   1.0   0.9   0.5   0.2   2.6
Bacillus subtilis          High     0.4   1.6   0.9   0.3   3.9   0.3   0.4   0.2
                           Low      1.4   0.6   1.1   0.9   1.6   0.7   0.3   1.3
Mycoplasma capricolum               1.8   0.2   4.5   0.3   0.5   0.0   0.6   0.0
Streptomyces coelicolor             0.0   2.0   0.0   0.1   0.1   2.3   0.1   3.4
Saccharomyces cerevisiae   High     0.2   1.8   0.6   5.1   0.0   0.0   0.2   0.0
                           Low      1.1   0.9   1.2   1.2   0.8   0.7   0.8   1.2
Drosophila melanogaster    High     0.2   1.8   0.0   0.6   0.2   0.9   0.1   4.2
                           Low      1.1   0.9   0.9   1.4   1.0   0.5   0.8   1.3
Homo sapiens               AT       1.5   0.5   1.4   1.1   1.5   0.5   0.7   0.8
                           GC       0.2   1.8   0.0   0.2   0.1   1.6   0.1   4.0
Values indicate the relative synonymous codon usage calculated as the observed values divided by those expected if synonyms are used equally.
High and Low indicate genes expressed at high and low levels, for species with codon bias selected for translation; the optimal codons are in bold. AT and GC indicate
AT-rich and GC-rich genes.
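The relative synonymous codon usage values reported in Table 1 follow directly from the definition in the table note: the observed count of a codon divided by the count expected if all synonyms for that amino acid were used equally. A minimal sketch, in which the function name `rscu` and the codon counts are hypothetical and chosen only for illustration:

```python
def rscu(counts):
    """Relative synonymous codon usage for one amino acid.

    counts -- dict mapping each synonymous codon to its observed count
    Returns a dict of observed / expected, where expected is the total
    count divided equally among the synonyms.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no codons observed for this amino acid")
    expected = total / len(counts)
    return {codon: n / expected for codon, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical counts for the two Phe codons in a set of genes.
    print(rscu({"UUU": 15, "UUC": 85}))
```

By construction the values for an amino acid sum to its number of synonyms (2 for Phe, 6 for Leu), and a value of 1.0 for every codon indicates no bias.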
Intragenomic Regions
Natural Selection
The effect of natural selection may be superimposed on
the patterns of codon usage bias caused by mutational
biases. Alternative synonymous codons are not equivalent in their translational properties, because of the
interaction between a codon and its cognate tRNA.
There are two main reasons. First, for amino acids
encoded by more than two synonyms there is usually
more than one tRNA species. These isoaccepting
tRNAs, with different anticodon sequences for the
same amino acid, have different abundances in the cell.
Second, any particular tRNA species often decodes
more than one codon (typically two, but sometimes
three or even four), due to `wobble.' The potential
bonds between one tRNA anticodon and the multiple
codons it can recognize are not equivalent. Combining these two effects, the translationally optimal
codon for any amino acid is the one best recognized
by the most abundant tRNA.
Escherichia coli
Fine-Scale Variations
Other Bacteria
For example, in the gram-negative bacterium Escherichia coli there are four different species of Leu tRNA
to decode the six Leu synonyms. The tRNA for CUG
is very abundant, while that for CUA is very rare.
The two Phe codons UUU and UUC are decoded by
a single tRNA species, with anticodon 3'-AAG-5':
while this anticodon can bind both UUU and UUC,
it binds more strongly to the latter. Codon usage in
E. coli reflects these factors: both CUG and UUC are
strongly preferred over their synonyms (and CUA
is very rare) in genes expressed at very high levels
(Table 1), such as those encoding proteins involved
in translation (ribosomal proteins and translation
elongation factors), and abundant outer membrane
proteins. However, genes expressed at low levels exhibit much less bias, and have codon usage patterns
largely consistent with the effects of mutational biases.
The same phenomenon is seen for most of the 18
amino acids with alternative synonymous codons.
Similar observations have been made for the gram-positive bacterium Bacillus subtilis. For some amino
acids, such as Phe, it is the same codon (UUC) that is
translationally optimal, but for others, such as Leu, the
identity of the optimal codon is different (Table 1),
correlated with a change in the abundance of the respective tRNAs. Although the abundance of tRNA
species has not been quantified in other bacteria,
Eukaryotes
Applications
Gene Identification
Codons
I Schildkraut
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0238
Codons, Invariable
W Fitch
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.2169
[Figure 1: sequence positions divided into regions A, B, and C. By observation: varied (has) versus unvaried (hasn't). By inference: variable (can) versus invariable (can't).]
However, some amino acids in particular positions
in a protein sequence may have a function so specific
and so vital that any change in that amino acid is
so deleterious to that protein's function, and hence
to the organism in which it resides, that selective
forces will surely remove that mutation from the
gene pool. Thus, when one examines many sequences
from different organisms, one does not see any variation in that particular position. Such a position is
said to be invariable and the observation of its being
unvaried is a clue to its potential functional importance. But a position may have no variants, not
because that amino acid is so functionally important,
but because that position has, by chance, not received
one of the allowable variants in that position, or the
organism that has the variant present was not sampled.
It is variable but unvaried. Generalized to all the positions in the sequence, the unvaried positions (A + B
in Figure 1) comprise two mutually exclusive kinds of
positions, the invariable (A) and the variable-but-unvaried (B) positions. Similarly, the positions that
are variable (B + C) also comprise two mutually
exclusive kinds of positions, the varied (C) and the
variable-but-unvaried positions (B).
We can count the varied and the unvaried positions
by inspection; varied and unvaried are observations.
We cannot know the number of variable and invariable positions except by making additional assumptions; variable and invariable are inferences. Each of
these pairs, varied plus unvaried and variable plus
invariable add up to the total. The distinction is vital
in genetic analyses.
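The observational half of this distinction is easy to make concrete. The sketch below splits the columns of a protein alignment into varied and unvaried positions by inspection; it deliberately does not attempt to say which unvaried positions are invariable, since that requires additional assumptions. The function name `classify_positions` and the toy alignment are hypothetical.

```python
def classify_positions(alignment):
    """Split alignment columns into varied and unvaried positions.

    alignment -- list of equal-length aligned sequences
    Returns (varied, unvaried) as lists of 0-based column indices.
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("sequences must be aligned and of equal length")
    varied, unvaried = [], []
    for index, column in enumerate(zip(*alignment)):
        # A column is varied if more than one residue is observed in it.
        (varied if len(set(column)) > 1 else unvaried).append(index)
    return varied, unvaried


if __name__ == "__main__":
    toy = ["MKTA", "MKSA", "MRTA"]   # toy sequences for illustration only
    varied, unvaried = classify_positions(toy)
    print("varied:", varied, "unvaried:", unvaried)
```

The two lists always add up to the alignment length. The unvaried list is only an upper bound on the invariable positions: sampling more organisms can move a position from unvaried to varied, but never the reverse.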
The word invariant is avoided as it does not distinguish between the two forms, that which is impossible
(invariable) and that which has not varied (unvaried),
and thus introduces ambiguity as to the author's
intent.
These distinctions are important because it is now
clear that the positions that are variable in a protein
from mammals may differ from those that are variable
in the homologous protein in, say, plants. This
result led to the concept of covarions (concomitantly variable codons).
See also: Codon Usage Bias; Codons; Covarion
Model of Molecular Evolution
Coevolution
D J Futuyma
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0239
Concepts of Coevolution
Although the term `coevolution' occasionally refers to
the joint evolution of different features of a single
species (e.g., enzymes in a pathway, sex-specific features used in sexual interactions), it usually (and in
this article) refers to the joint evolution of two or more
species or genomes, owing to interactions between
them. (In some instances, as in the joint evolution of
nuclear and cytoplasmic genomes, this distinction
may fail.) Most study of coevolution has focused on
interspecific competition, mutualism, and interactions
between predators and prey, herbivores and plants,
and parasites and hosts (together referred to here as
`consumers' and `victims'), as well as a few interactions, such as mimicry, that do not conveniently fit
into these major categories.
In classifications of coevolutionary processes
(Thompson, 1994), one important distinction describes evolution within lineages versus branching of
lineages. One possible form of coevolution is cospeciation, the coordinated branching (speciation) of
interacting species (such as a host and parasite).
This may have occurred, for example, in figs (Ficus)
and the small wasps (Agaonidae) that pollinate and
develop in fig flowers. Each fig species is pollinated
by a single host-specific wasp, and related wasps
appear to be associated with related figs. A series of
cospeciation events may produce concordant phylogenies of two groups of interacting species. Such
concordance is the exception rather than the rule,
but has been described for the lice of pocket gophers,
the bacterial endosymbionts of aphids, and a few other
instances. In most such cases, there is little opportunity
for transmission of the symbiont between different
species of hosts. Concordance of host and symbiont
phylogenies implies a longer history of association,
and of opportunity for reciprocal adaptation, than
when symbionts frequently have switched from one
host to another. Phylogenetic studies have revealed
that considerable switching among hosts has occurred
in most groups of herbivorous insects, some symbiotic
bacteria, and various parasite groups.
Coevolution within two or more interacting lineages consists of genetic changes in the characteristics
of each, due to natural selection imposed by each on
the other. That is, it consists of adaptation of lineages
to each other. Such changes are referred to as specific
or pairwise coevolution if the evolutionary responses
the coevolutionary dynamics may be much more
complex than this, due to factors such as (1) costs of
adaptation, (2) diffuse versus pairwise coevolution,
and (3) selection at multiple levels, as well as the
biological details of particular interactions.
Costs of Adaptation
Levels of Selection
may become prevalent in hospitals because intensive
antibiotic treatment and/or heightened opportunity
for transmission favor the evolution of more rapid
reproduction.
Evolution of Mutualism
In mutualistic interactions between species, each
species uses the other as a resource. That is, each
exploits the other, and the balance between exploitation and overexploitation (i.e., parasitism or
predation) may be a delicate one. Mutualisms include
interactions both between free-living organisms, such
as plants and pollinating animals, and between symbionts, one of which spends most of the life cycle on or
in the other. Microbes are partners in many symbiotic
mutualisms. Mutualists often have adaptations for
encouraging the interaction or even nurturing the
associate, such as foliar nectaries in plants, which
attract ants that defend the plants against herbivores,
or the root nodules of legumes, which house and
nourish nitrogen-fixing rhizobial bacteria. Some symbioses are so intimate that the symbiont functions as
an organ or organelle, as in the case of host-specific
bacteria that reside within special cells in aphids
and supply essential amino acids to their host. Many,
though by no means all, mutualisms evolved from
parasitic interactions.
For each mutualist, the interaction has both a benefit and a cost. Legumes, for example, obtain nitrogen
from rhizobia, but expend energy and materials on the
symbionts. Excessive growth of the rhizobia would
reduce the plant's growth to the point of diminishing
its fitness. Likewise, excessive proliferation of mitochondria or plastids, which originated as symbiotic
bacteria, would reduce the fitness of the eukaryotic
cell or organism that carries them. Thus, selection will
always favor protective mechanisms to prevent overexploitation by an organism's mutualist. Whether or
not selection on a mutualist favors restraint depends
on how much an individual's fitness depends on the
fitness of its individual host. When a mutualist can
readily move from one host to another, as pollinating
insects can from plant to plant, it does not suffer from
the reproductive failure of any one host, and selfishness or overexploitation may be favored. Indeed,
many pollinating insects `cheat.' The larvae of yucca
moths (Tegeticula) feed on developing yucca seeds in
flowers that their mothers actively pollinated. However, several species of Tegeticula have independently
lost the pollinating behavior, having evolved the habit
of ovipositing in flowers that other species have
already pollinated. Moreover, the pollinating species
lay only a few eggs in each flower, so that the few
Consequences of Coevolution
We are only beginning to understand the effects that
coevolution has had on the history and diversity of
life. Clearly many of the adaptive differences among
organisms (the many thousands of toxic defensive
compounds in different plants, insects, and fungi, the
many forms of flowers, the diverse growth forms of
plants, the sometimes astonishingly specialized diets of
animals) have issued from interactions among species.
The numbers of species, too, may have been augmented
by coevolution. According to a hypothesis advanced
by P.R. Ehrlich and P.H. Raven, plant lineages that
evolved novel chemical defenses against herbivores
became free to diversify, and gave rise to diverse taxa
to which insect lineages subsequently adapted and
diversified in turn. As this hypothesis predicts, plant
lineages with resin- or latex-bearing canals, which are
known to deter insect attack, consistently have more
species than their equally old sister lineages that lack
these novel defenses. Coevolution among competitors
can also augment the species diversity in communities,
producing suites of specialized species that finely partition resources among them. In theory, such coevolution may result in ecosystem-level effects such as
higher productivity and resource consumption, but
the evidence on this subject is very sparse. Coevolution
may cause rapid changes in the properties, and therefore the population dynamics, of species. It can
have far-ranging evolutionary consequences, such as
Further Reading
References
Cognate tRNAs
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1791
Cohesive Ends
D J Wieczorek and M Feiss
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0240
[Figure 1: double-stranded DNA sequence diagram with the 5' and 3' strand ends labeled]
Coincidence, Coefficient of
F W Stahl
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0241
$$R_3 = R_1 + R_2 - 2\,\mathit{CJ}\,R_1 R_2 .$$
With this expression, CJ can be determined (less sensitively) by combining the data from the three two-factor crosses:
$$\mathit{CJ} = \frac{R_1 + R_2 - R_3}{2 R_1 R_2} .$$
[Figure: panels (A) and (B), each labeled with R1, R2, and R3]
Further Reading
References
Coincidental Evolution
See: Concerted Evolution
Coisogenic Strain
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0242
Col Factors
D M Gordon
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0244
Colicinogeny
The colicin phenotype is encoded by three tightly
linked genes: the colicin, immunity, and lysis genes.
The genes are found on accessory genetic elements
(plasmids) called Col factors. Under conditions of
stress, such as nutrient depletion, some colicinogenic
cells are induced to produce colicin proteins. Simultaneously, lysis proteins are produced and some time
after synthesis, the colicin protein is released from the
cell. Colicin synthesis results in the death of the cell.
Cells harboring Col factors are protected from their
own colicin by a specific, constitutively expressed,
immunity protein.
Colicins gain entry into susceptible cells by recognizing specific receptors on the surface of the target
cell. Once a colicin is translocated into a target cell it
will, depending on the colicin, kill the cell in one of four
ways: by altering the permeability of the cytoplasmic
membrane, cleaving 16S ribosomal RNA, nonspecifically degrading DNA, or inhibiting peptidoglycan
synthesis. The available evidence indicates that a single
colicin molecule is sufficient to kill the target cell.
Col Factors
Col factors may be large self-transmissible plasmids with
molecular weights of about $10^7$, or nonconjugative but
mobilizable plasmids with molecular weights of about
$10^6$. The low-molecular-weight Col factor, Col E1, has
been entirely sequenced. Other than the genes
required for plasmid replication and mobilization and
the three colicin-related genes, no other genes have
been identified. The large Col factors harbor genes
Resistance to Colicins
Cells can become resistant to the action of a colicin
through alterations in the surface receptor that
binds the colicin molecule, or through changes in cell
Further Reading
Riley MA and Gordon DM (1999) The ecological role of
bacteriocins in bacterial competition. Trends in Microbiology
7: 129–133.
Cold-Sensitive Mutant
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0243
Colicins
J P Gratia
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0245
Colicins belong to a class of antibiotics called bacteriocins. Bacteriocins are produced by various bacterial species (colicins by Escherichia coli, pyocins by
pseudomonads, etc.). They are peculiar, notably, in
their relationship to bacteriophages (phages), i.e.,
viruses that infect bacteria.
In 1925 the Belgian microbiologist André Gratia
reported that an E. coli strain (named E. coli V) displaying high virulence toward rabbits and guinea pigs
produced a substance that was toxic to another E. coli
strain (φ) and, to a lesser extent, to Shigella dysenteriae. This substance was called the `V principle' and later
colicin V. It was active at high dilution ($10^{-3}$) and could
diffuse in agar from the point of inoculation of the
producing strain. It was thermostable (withstanding
heating for up to 30 min at 120 °C) and could cross a
cellophane membrane. It precipitated in an active form
in the presence of acetone but was destroyed by absolute alcohol. Gratia concluded that it was a low-molecular-weight proteinic substance and confirmed
this later by demonstrating its sensitivity to trypsin.
This was the first characterization of such an antibiotic. Colicin V has no therapeutic value but is
extremely interesting to cell physiologists and molecular biologists. Colicins differ from other antibiotics
Colicinotyping: Applications in
Epidemiology
Sensitivity to a given colicin reflects both the presence
of specific receptors for that colicin and a positive
response to its lethal action. Sensitivity to bacteriocins
in general and colicins in particular is used in both
genetics and epidemiology as a marker for typing
bacterial strains, including strains of clinical interest.
Colicinotyping is a complement to lysotyping (typing
according to sensitivity to bacteriophages).
See also: Antibiotic Resistance; Bacteriophages;
Col Factors; Holliday's Model; Plasmids
Colinearity
C Yanofsky
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0246
described as gene-protein colinearity, a hypothesis
that was verified experimentally in the early 1960s.
Demonstration of Gene-Protein
Colinearity
The colinearity of gene structure and protein structure was established by two research groups working
with different material and employing somewhat
different approaches. One group was led by Charles
Yanofsky and the other by Sydney Brenner. The
Yanofsky group performed their studies with the
trpA gene of the bacterium Escherichia coli, and its
corresponding protein, TrpA (Yanofsky et al., 1964,
1967). This protein is essential for tryptophan biosynthesis in this organism. TrpA is one subunit of a
two-component enzyme complex; the second subunit
is the TrpB protein. To characterize the trpA gene
genetically, a large number of mutants were isolated and
crossed with one another, and the pairwise recombination frequencies were recorded. From these recombination
frequencies a linear fine-structure genetic map was
constructed representing the relative locations of all
the mutationally altered sites in the trpA gene.
This fine-structure genetic map was constructed
using the logic employed by Seymour Benzer in his
demonstration that a genetic map was a valid representation of the nucleotide sequence of a gene (Benzer,
1957). In the studies with trpA, two classes of mutants
were recovered, so-called missense mutants and
nonsense mutants. Missense mutants produce a full
length, inactive protein, whereas nonsense mutants
produce only a fragment of the protein. The inactive
protein encoded by each missense mutant was purified
and the amino acid change responsible for inactivity
determined by a procedure called `peptide fingerprinting.' The positions of the amino acid changes in a set of
missense mutants were then compared to the positions
of the genetic alterations in these mutants on the map
of the trpA gene. As shown in Figure 1, the two were
colinear, i.e., the order and spacing of mutational sites
within the trpA gene correlated with the order of
amino acid replacements in the TrpA protein.
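The comparison described above amounts to asking whether every pair of mutants appears in the same order on the genetic map and in the protein. The following is a minimal sketch of that check, not the procedure used in the original studies. The function name `is_colinear` and all map and residue values are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def is_colinear(sites):
    """Return True if map order and protein order agree for every pair of mutants.

    sites: list of (map_position, residue_position) tuples, one per mutant.
    A pair is discordant when one mutant lies earlier on the map but later
    in the protein; ties in either coordinate are not counted as discordant.
    """
    for (m1, r1), (m2, r2) in combinations(sites, 2):
        if (m1 - m2) * (r1 - r2) < 0:
            return False
    return True


# Hypothetical values: map positions in map units, residue numbers in the protein.
mutants = [(0.0, 15), (0.4, 22), (1.1, 49), (3.0, 175), (3.1, 183), (4.2, 234)]
scrambled = [(0.0, 175), (0.4, 22), (1.1, 49)]

print(is_colinear(mutants))
print(is_colinear(scrambled))
```

The check uses order only. Comparing the spacing of sites as well would need a distance-based measure, which this sketch does not attempt.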
Gene-protein colinearity was also established by
Sydney Brenner and coworkers in studies of the gene
encoding the head protein of bacteriophage T4
(Sarabhai et al., 1964). They analyzed a set of nonsense
mutants that produced only a fragment of the head
protein because a translation termination event caused
by the nonsense mutation interrupted head protein
synthesis. Their analyses were facilitated by the finding that approximately 50% of the protein synthesized in the late stages of infection of a population of
E. coli cells by phage T4 is the phage head protein.
Figure 1 Representation of the genetic map of the trpA gene and the sequence of the corresponding TrpA protein. The
positions of the genetic alterations in a set of trpA mutants are indicated in the upper double strand, representing the gene.
The positions of the corresponding amino acid changes are indicated on the bottom line, representing the TrpA protein.
They radiolabeled the protein fragments that were produced by a set of head protein nonsense mutants and
sized the head protein fragments by digestion with
different proteolytic enzymes and electrophoretic
separation of the resulting peptides on a gel. They
also constructed a fine-structure map of the T4 head
protein gene by recombination analyses with their set
of nonsense mutants. The map
of the head protein gene correlated exactly with the
lengths of the corresponding head protein fragments,
demonstrating that the two are colinear.
Beyond Colinearity
With the advancement of technology for analyzing
DNA, it has become routine to isolate specific genes
and determine their complete nucleotide sequences.
Knowledge of the genetic code, which relates each trinucleotide in DNA to a specific amino acid, permits
the complete amino acid sequence of a protein to be
predicted from the nucleotide sequence of its gene.
In higher organisms the coding regions within genes,
called exons, are generally interrupted by noncoding
nucleotide sequences, called introns, which are removed when the primary transcript is processed to
yield the messenger RNA that is actually translated.
Despite the additional complexity of having noncoding blocks of nucleotide sequence within genes, the
colinearity relationship proven in studies of simpler
organisms remains valid. The protein product invariably reflects the linear order of the nucleotides in the
specifying gene.
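The prediction of a protein sequence from a gene, with introns removed first, can be sketched as follows. This is an illustration under stated assumptions: the toy gene, its exon coordinates, and the helper names `splice` and `translate` are hypothetical, and the codon table holds only the few standard-code entries the example needs.

```python
# Partial codon table (DNA coding strand); these six entries follow the standard genetic code.
CODON_TABLE = {
    "ATG": "Met", "AAA": "Lys", "TTT": "Phe",
    "GGC": "Gly", "GAA": "Glu", "TAA": "Stop",
}


def splice(gene, exons):
    """Join the exon segments of a gene; exons are (start, end) half-open intervals."""
    return "".join(gene[start:end] for start, end in exons)


def translate(coding):
    """Read the coding sequence three nucleotides at a time until a stop codon."""
    protein = []
    for i in range(0, len(coding) - 2, 3):
        residue = CODON_TABLE[coding[i:i + 3]]
        if residue == "Stop":
            break
        protein.append(residue)
    return protein


# Hypothetical gene: exon 1, a 7-nucleotide intron, then exon 2.
gene = "ATGAAA" + "GTCCCAG" + "TTTGGCGAATAA"
exons = [(0, 6), (13, 25)]

print(translate(splice(gene, exons)))
```

Because the exons are joined in their original order, the order of residues in the product still follows the order of coding nucleotides in the gene, which is the colinearity relationship described above.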
Colony Hybridization
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1794
Colony hybridization is a technique for in situ hybridization of bacterial colonies to identify those containing DNA homologous with a particular sequence
(probe).
See also: Bacterial Genetics
Color Blindness
D M Hunt
References
Beadle GW and Tatum EL (1941) Genetic control of biochemical reactions in Neurospora. Proceedings of the National Academy of Sciences, USA 27: 499-506.
Benzer S (1957) The elementary units of heredity. In: The
Chemical Basis of Heredity, pp. 70-93. Baltimore, MD: Johns
Hopkins University Press.
Tritanopia
The loss of functional S cones, a condition called tritanopia, arises from mutation in the S opsin gene. It
occurs at a much lower frequency than red-green color
blindness. The absence of the blue-sensitive pigment
limits blue-yellow color discrimination. Tritanopia is
inherited as an autosomal dominant disorder; the presence of a mutant pigment even in heterozygous individuals is sufficient to result in S cone degeneration.
Achromatopsia
Total loss of color vision arises in two ways, either by
the absence of both M and L pigments, a rare condition called blue cone monochromacy where only S
cones are present, or by the absence of all cone types.
In the latter case, only rod photoreceptors are retained
so the condition is called rod monochromacy. Since
blue cone monochromacy arises from mutations in the
M/L gene array on the X chromosome, it shows an
X-linked pattern of inheritance, whereas rod monochromacy arises from mutations in a number of genes
that affect other components of signal processing in
the cone photoreceptors. It is generally inherited as an
autosomal recessive condition.
Table 1
Condition                 Cone pigments    Occurrence in males (%)
Dichromacy
  Protanopia              M only           0.81
  Deuteranopia            L only           0.48
Anomalous trichromacy     L or M hybrid    6.61
Nonprimate mammals
Most mammals, other than primates, have only two
cone types (the minimal situation for color vision),
one maximally sensitive in the green region of the
spectrum (500-530 nm) and the other in the blue or
ultraviolet region (435-365 nm). Such a system provides a basic dichromatic color vision system with
only limited discrimination particularly at longer
wavelengths.
See also: Sex Linkage
Colorectal Cancer
A Castells, H Harada, and A K Rustgi
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1559
Epidemiology
There are approximately 135 000 new cases of colon
cancer annually in the United States with approximately 60 000 people who die from the cancer or its
complications every year. Over a lifetime, average-risk
women have a 5% risk of developing colorectal cancer
and average-risk men carry a 6% risk. This risk is
substantially increased if an individual has a family
history of colorectal cancer. The incidence of sporadic
colon cancers rises with advancing age, especially for
men and women in their 60s and 70s.
Risk Factors
The vast majority of sporadic colon cancers arise from
the progression from normal colonic mucosa to adenomatous polyp (a precancerous growth) to cancer.
This process can take 5-10 years to occur and only
about 10% of adenomatous polyps eventually become
cancer. The factors which favor progression to cancer
within a polyp include size of the polyp and histology
of the polyp (villous features). Other risk factors
for sporadic colon cancer include prior history of
colon cancer; inflammatory bowel disease, especially
ulcerative colitis, where the relative risk increases
with duration and extent of colitis; prior radiation
therapy; acromegaly; and, possibly, prior history of
breast cancer.
orderly accumulation of changes in key oncogenes,
tumor suppressor genes, and DNA mismatch repair
genes.
At the same time, a great deal of information has
been gained through the elucidation of molecular
mechanisms underlying FAP and HNPCC. The gene
responsible for FAP is called the adenomatous polyposis coli (APC) gene. This gene, located on chromosome 5q21, comprises 15 exons and encodes a protein
of 310 kDa. Germline mutations in the APC gene are
responsible for the colonic polyposis in FAP as well as
extraintestinal manifestations, such as congenital hypertrophy of the retinal pigment epithelium, gastroduodenal polyps, and desmoid tumors. The APC
protein interacts with several intracellular proteins,
notably β-catenin. This leads to the sequestration and
degradation of β-catenin. However, if APC is mutated,
then β-catenin is stabilized, and is translocated into the
nucleus where it interacts with transcriptional factors
(Tcf, Lef) to transactivate key genes (c-myc, cyclins).
The genetic basis for HNPCC is quite complex. It
involves DNA mismatch repair genes, originally identified and characterized in bacteria and yeast. These
genes, which include hMLH1, hMSH2, hPMS1,
hPMS2, and hMSH6, maintain the fidelity of DNA
replication. However, when these genes are mutated, DNA replication is disrupted, leading to microsatellite instability. It has been demonstrated that about
50-60% of HNPCC kindreds have germline mutations in either hMLH1 or hMSH2, and genotypic-phenotypic
correlations are emerging for HNPCC in
the context of colon cancer and also extracolonic
cancers (e.g., endometrial). Overlap exists between
the pathogenesis of FAP and HNPCC and that of
sporadic colon cancer. For instance, about 70-90%
of sporadic adenomatous polyps harbor APC mutations. About 15-20% of sporadic colon cancers have
evidence of microsatellite instability as a consequence
of mutations in the mismatch repair genes. Target
genes of microsatellite instability include TGFβIIR,
BAX, and APC, among others.
In addition, among the activated oncogenes,
mutations in the k-ras oncogene have been found in 40-50% of polyps and cancers.
Sporadic colon cancers also involve inactivation of
tumor suppressor genes, notably p53 and genes on
chromosome 18q, particularly SMAD4 which is
downstream of TGFβIIR. Thus, it is important to
view colon cancer as the accumulation of multiple
genetic alterations. Further insight has been gained
through the generation and characterization of transgenic and knockout mouse models that recapitulate
colonic polyps and cancer.
As a separate consideration, it is also clear that
overexpression of cyclooxygenase-2 (COX-2), a key
Commaless Code
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0249
Commensal
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0250
Comparative Genomic
Hybridization (CGH)
A Kallioniemi
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1560
Cancer is thought to arise through a stepwise accumulation of genetic aberrations that gradually change
a normal cell to a malignant cell. Identification and
detailed characterization of these genetic changes is
crucial for the understanding of cancer development
and progression and more importantly provides an
opportunity for improved diagnostic and therapeutic
approaches. Conventional cytogenetic analysis has
been very successful in identifying genetic changes
involved in hematological malignancies but the analysis of solid tumors has been less successful due to
technical difficulties and the complexity of the genetic
changes. Comparative genomic hybridization (CGH)
was developed in the early 1990s for comprehensive
screening of DNA sequence copy number changes in
cancer, especially in solid tumors. CGH allows identification and mapping of gains and losses of DNA
sequences throughout the entire genome. This review
will provide a brief overview of the methodological
aspects of CGH and the different applications of
CGH in cancer research.
CGH Methodology
CGH is based on the simultaneous hybridization of
differentially labeled test and normal reference DNAs
to normal metaphase chromosomes. The hybridized
DNAs are detected with two different fluorescent
dyes. Differences in binding of the test and reference
and that consistent patterns of nonrandom genetic
aberrations exist in various tumor types. Some of the
most frequent genetic aberrations identified by CGH
are common to a number of different tumor types,
such as the gains of 1q and 8q as well as losses of 8p,
9p, and 13q. Other chromosomal changes seem to be
more tumor-specific, such as the 6p loss in gastric
cancer and the 12p gain in testicular cancer, indicating
that these chromosomal regions might contain genes
that play an essential role in the development of these
specific tumor types.
A large number of chromosomal gains and losses
have been discovered by CGH but the genes involved
in these aberrations are still largely unknown. However, CGH has sometimes played an essential role in
the identification of putative cancer genes, especially
genes involved in DNA amplifications. In most cases,
the target gene was a previously cloned gene that was
located in the region of involvement indicated by
CGH. For example, CGH analysis of prostate cancers
that had recurred during androgen deprivation therapy showed frequent gain or amplification at chromosomal region Xq11-q12. The androgen receptor gene
is located in this region and was shown to be amplified
in these tumors. Similarly, CGH analyses of alveolar
rhabdomyosarcomas showed amplification at 1p36
and 13q14 and subsequent studies revealed the existence of an amplified PAX7-FKHR fusion gene (fusing
the PAX7 locus at 1p36 and the FKHR locus at 13q14).
More recently, traditional positional cloning efforts
have been successfully utilized to identify target
genes from chromosomal regions that show frequent
amplification by CGH. For example, the AIB1,
ZNF217, and BTAK genes were identified as putative
targets for the frequent 20q amplification in breast
cancer, and it is likely that such examples will be
more common in the future. CGH can also be useful
in identification of genes involved in DNA losses,
although at present such examples are rarer.
CGH analysis of intestinal polyps from patients with
Peutz-Jeghers syndrome, a hereditary cancer syndrome, showed consistent loss at 19p and the gene
that causes this syndrome was subsequently identified
from this region.
Tumor development and progression, the gradual
advancement of a local slow-growing tumor to an
invasive, metastatic, and eventually treatment refractory cancer, is caused by a step-wise accumulation of
genetic changes. CGH is an excellent tool for the
genome-wide analysis of genetic changes involved in
tumor progression and several CGH studies have
shown that the number of genetic changes increases
during tumor progression as expected. Analysis of
a large number of tumors at different stages, such as
premalignant lesions, localized tumors, invasive
Further Reading
Kallioniemi A, Kallioniemi O-P, Sudar D et al. (1992) Comparative genomic hybridization for molecular cytogenetic analysis
of solid tumors. Science 258: 818-821.
Kallioniemi O-P, Kallioniemi A, Piper J et al. (1994) Optimizing
comparative genomic hybridization for analysis of DNA
sequence copy number changes in solid tumors. Genes
Chromosomes Cancer 10: 231-243.
Knuutila S, Björkqvist A-M, Autio K et al. (1998) DNA copy
number amplifications in human neoplasms: review of comparative genomic hybridization studies. American Journal of
Pathology 152: 1107-1123.
Pinkel D, Segraves R, Sudar D et al. (1998) High resolution
analysis of DNA copy number variation using comparative
genomic hybridization to microarrays. Nature Genetics 20:
207-211.
Pollack JR, Perou CM, Alizadeh AA et al. (1999) Genome-wide
analysis of DNA copy number changes using cDNA microarrays. Nature Genetics 23: 41-46.
Ried T, Heselmeyer-Haddad K, Blegen H, Schrock E and Auer G
(1999) Genomic changes defining the genesis, progression,
and malignancy potential in solid human tumors: a phenotype/genotype correlation. Genes Chromosomes Cancer 25:
195-204.
Rooney PH, Murray GI, Stevenson DAJ et al. (1999) Comparative genomic hybridization and chromosomal instability in
solid tumors. British Journal of Cancer 80: 862-873.
Compartmentalization
G Morata
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0251
Further Reading
Garcia-Bellido A, Lawrence PA and Morata G (1979) Compartments in animal development. Scientific American 241:
102-110.
Lawrence PA (1992) The Making of a Fly. Oxford: Blackwell
Scientific Publications.
Lawrence PA and Struhl G (1996) Morphogens, compartments
and pattern: lessons from Drosophila. Cell 85: 951-961.
Compatibility Group
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1795
A compatibility group is a group of plasmids containing members unable to coexist in the same bacterial
cell.
See also: Plasmids
Complement Loci
M J Hobart
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0252
[Figure 1: diagram of the complement cascade. Labeled components: C1q, MBL, C1r, C1s, MASP1, MASP2, C2, C3, C4, C5, C6, C7, C8, C9, Factors B, D, H, and I; labeled products and complexes: C4b, C3b, C3binact, C4b2a, C3bBb, C4b2a3b, C5b, C5b6, C5b67, C5b678, C5b678poly9, TCC. Key: proteases (pre-proteases); "thioester" family; pro-collagen-like; CCP.]
Serine Proteases
The trypsin-like serine proteases of the complement
system fall into a number of structurally distinct
groups. Some are secreted as functionally active
enzymes which wait for their substrate (Factors I
and D), others are fairly conventional pre-proteases
which are activated as required (C1r and C1s,
Table 1
Component
Gene Name
GDB:#
Chromosome
Gene size
Exons
C1q
119042
119043
128132
120167
SP-D
C1QA
C1QB
C1QC
MBL2
CGN1
SFTPA1
SFTPA2
SFTPD
119593
6045454
132674
1p36.31p34.1
1p36.31p34.1
1p36.31p34.1
10q22.2
28q18 (bovine)
10q2124
10q2124
10q2124
2.5
2.6
3.2
3.5
11
4.5
4.5
8
2
2
2
4
9
7
7
8
C1r
C1s
MASP-1
MASP-2
C2
Factor B
Factor D
Factor I
C1R
C1S
MASP1
MASP2
C2
BF
DF
IF
119729
119730
361104
6071500
119731
119726
132645
120077
12p13
12p13
3q2728
1
6p21.3
6p21.3
? (autosome)
4q244q25
10.5
>50
12
>16
18
6
18
18
63
13
C3
C4A
C4B
C5
C3
C4A
C4B
C5
119044
119732
119733
119734
19p13.3
6p21.3
6p21.3
9q33 (?9q34.1)
41
21
14.6 or 21
79
41
41
41
41
C6
C7
C8
C6
C7
C8A
C8B
C8G
C9
119045
119046
119735
119736
119737
119738
5p125p14
5p125p14
1p32
1p32
9q34
5p125p14
80
80
70
40
1.8
100
18
18
11
12
7
11
Factor H
CR1
CR2
DAF
MCP
C4-bp
HF1
CR1
CR2
DAF
MCP
C4BPA
C4BPB
120041
119800
119802
119088
120169
120568
125208
1q32
1q32
1q32
1q32
1q32
1q32
1q32
120 (mouse)
133
30
40
22 (mouse)
39
19
11
CR3
ITGAM
ITGB2
ITGAX
ITGB2
C1QR
C3AR1
C5R1
CLU
C1NH
PFC
CD59
120599
120574
119758
120574
9957729
5982182
128856
125226
119041
120275
119769
16p11.2 ()
21q22
16p11.2
21q22
55
40
25
40
Small
7
9
17
17.5
6
26
30
16
>30
16
2
2
2
9
8
10
5
MBL
Conglutinin (bovine)
SP-A
C9
CR4
C1qRp
C3aR
C5aR
Clusterin
C1-inhibitor
Properdin
CD59
12p13
19q13
8p21
11q12.111q13.1
Xp11.3Xp11.23
11p13
Deficiency
Disease?
C1q
Yes
Yes
Yes
Yes
MBL
Conglutinin
(bovine)
SP-A
C1QA
C1QB
C1QC
MBL2
CGN1
Polymorphism
(protein)
SP-D
SFTPA1
SFTPA2
SFTPD
C1r
C1s
MASP-1
MASP-2
C2
Factor B
Factor D
Factor I
C1R
C1S
MASP1
MASP2
C2
BF
DF
IF
C3
C4A
C4B
C5
C3
C4A
C4B
C5
C6
C7
C8
C6
C7
C8A
C8B
C8G
C9
TSP1,
TSP1,
TSP1,
TSP1,
Susceptible ?
Factor H
CR1
CR2
DAF
MCP
C4-bp
HF1
CR1
CR2
DAF
MCP
C4BPA
C4BPB
CCP
CCP
CCP
CCP, STP
Yes
CR3
ITGAM
ITGB2
ITGAX
ITGB2
C1QR
C3AR1
C5R1
CLU
CD59
C1NH
PFC
C9
CR4
C1qRp
C3aR
C5aR
Clusterin
CD59
C1 Inhibitor
Properdin
LDLRA,
LDLRA,
LDLRA,
LDLRA,
STP
TSP1
Yes
Yes
()
Susceptible
() (Africans)
Yes
()
Usually
Susceptible
Susceptible
()
() (Melanesians)
Susceptible
Susceptible
Susceptible
Susceptible
EGF
Polymorphism
(DNA){
(Acquired) Yes
Yes
Yes
Yes
()
Serpin
Yes
Yes
Dominant
*Domain abbreviations: CUB, C1r/s-uEGF-bone morphogenic protein; EGF, epidermal growth factor; Ca, calcium-binding;
CCP, complement control protein; Protease, serine protease; VWFA, von Willebrand factor type a; FIMAC, Factor I and
membrane attack complex; TSP1, thrombospondin type 1; LDLRA, low density lipoprotein receptor type a; STP, serine,
threonine, proline-rich mucin-like domain; CRD, carbohydrate recognition domain. †DNA polymorphism refers to additional
polymorphisms not reflected in phenotype.
Further Reading
Morley BJ and Walport MJ (eds) (2000) The Complement Factsbook. London: Academic Press.
Sellar GC, Blake DJ and Reid KB (1991) Characterization and
organization of the genes encoding the A-, B- and C-chains
of human complement subcomponent C1q. The complete
derived amino acid sequence of human C1q. Biochemical
Journal 274: 481-490.
Complementary DNA
(cDNA)
P J Hastings
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0253
Complementation
Complex Locus
The complex locus (of Drosophila melanogaster) possesses genetic characteristics inconsistent with the
function of a gene for a single protein. Complex loci
are generally very large (over 100 kb).
Complementation Map
See: Complementation Test
Complementation Test
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0255
Mutations can often be assigned to the same or different genes based on functional tests with diploids (or in
the case of bacteria and bacteriophages partial or
temporary diploids) that carry each mutation on a different chromosome or DNA molecule. In this case,
the trans configuration, the mutations do not destroy
function if they are in different genes, since each
chromosome contributes one wild-type gene that can
direct the synthesis of a functional product. However,
if the mutations prevent function, then they are
assigned to the same gene, since now there is no
wild-type copy of the gene to direct the synthesis of
one of the required gene products. As a control for this
case, the mutations are tested when they are on the
same chromosome, the cis configuration, in the presence of a chromosome with both wild-type genes.
Now the wild-type character should be restored.
Mutations that destroy function in the trans but not
the cis configuration are assigned to the same gene.
Mutations can be assigned to genes by pairwise complementation tests, resulting in a complementation
map of the genes and mutations.
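The assignment of mutations to genes from pairwise tests can be sketched as a grouping problem: mutations that fail to complement in trans are placed in the same group. The function below is a minimal sketch, not a standard tool. The mutation names and test results are hypothetical, and the sketch assumes that failure to complement is transitive, which real data can violate (for example, through intragenic complementation).

```python
def complementation_groups(mutations, fails_to_complement):
    """Group mutations that fail to complement one another in trans.

    mutations: list of mutation names.
    fails_to_complement: iterable of (m1, m2) pairs whose trans diploid is mutant.
    Returns a sorted list of sorted groups (one group per inferred gene).
    """
    parent = {m: m for m in mutations}

    def find(m):
        # Follow parent links to the group representative, shortening the path.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in fails_to_complement:
        parent[find(a)] = find(b)

    groups = {}
    for m in mutations:
        groups.setdefault(find(m), []).append(m)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical results: m1/m2 and m2/m3 fail to complement, as do m4/m5.
mutations = ["m1", "m2", "m3", "m4", "m5"]
non_complementing = [("m1", "m2"), ("m2", "m3"), ("m4", "m5")]

groups = complementation_groups(mutations, non_complementing)
print(groups)
```

Each resulting group corresponds to one inferred gene. Pairs absent from `non_complementing` are treated as complementing, so untested pairs are not distinguished from tested ones in this sketch.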
See also: Cis-Trans Configurations; Cistron
Complete Penetrance
See: Penetrance
Complex Traits
W N Frankel
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0257
concordance amongst relatives, but which cannot be
explained by a conventional mode of dominant, recessive or additive inheritance. Thus, what makes complex traits stand out is the comparative difficulty of
their analysis: identifying underlying genetic loci and
determining how each contributes to phenotype. The
ultimate analysis of complex traits would entail very
sophisticated statistical designs, accounting for all the
possible variables and their interactions. This may
be feasible in some experimental populations such as
plants and fruit flies, but tends to be problematic for
humans and laboratory mammals where it is more
difficult to obtain the very large populations required
to rival the multiple variables.
Even in less tractable systems, however, complex
trait analysis can be approached using conventional
designs by (1) reducing the number of phenotypes to
only the most robust measures, (2) testing models of
inheritance and gene action that are more likely to
occur in nature than others, and (3) exploiting the
high density of genetic markers that exist for humans
and some model organisms to use as `surrogates' for
true trait loci. Thus, traits which have continuous
distributions in a population (e.g., height, blood pressure) can be analyzed as parametric quantitative trait
loci (QTL), for example, by examining whether a particular marker genotype correlates with phenotypic
values, and determining the fraction of the variation
for which it accounts. Marker-phenotype associations
for traits which are discrete in nature (e.g., diabetic vs
nondiabetic, drug-resistant vs drug-susceptible), can
be analyzed non-parametrically, such as by χ² tests,
and a risk for phenotypic outcome can be assigned to
each allele. However, the statistical
power gained by the use of simplified designs usually
comes at a high price: relative to simple traits there is
poor resolution for genetic mapping and for assignment of a specific phenotypic role for each locus.
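The two kinds of marker-phenotype analysis mentioned above can be sketched with plain arithmetic. This is an illustration under stated assumptions rather than a complete statistical workflow: the function names, genotype labels, and all counts and phenotypic values are hypothetical. The first function computes the fraction of phenotypic variance accounted for by marker genotype classes (a one-way analysis of variance ratio); the second computes a Pearson χ² statistic for a contingency table of marker allele against a discrete phenotype.

```python
def variance_explained(genotypes, values):
    """Fraction of total variance in values accounted for by genotype class means."""
    grand_mean = sum(values) / len(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    by_class = {}
    for g, v in zip(genotypes, values):
        by_class.setdefault(g, []).append(v)
    between_ss = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2 for vs in by_class.values()
    )
    return between_ss / total_ss


def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical quantitative trait scored in six individuals of three marker genotypes.
genotypes = ["AA", "AA", "Aa", "Aa", "aa", "aa"]
trait = [10, 12, 14, 16, 18, 20]
r2 = variance_explained(genotypes, trait)

# Hypothetical discrete trait: rows are marker alleles, columns affected / unaffected.
counts = [[30, 10], [10, 30]]
stat = chi_square(counts)

print(round(r2, 3), stat)
```

Converting the χ² statistic to a significance level requires the χ² distribution with the appropriate degrees of freedom, which is left out of this sketch.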
Concatemer
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1799
Concatemer (Genomes)
E Kutter
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0258
Concatenated Circles
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1800
Concatenated circles are DNA circles that are interlocked like the rings of a chain.
See also: DNA Structure
Concerted Evolution
L D Strausbaugh
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0259
While many repetitive sequences duplicate and diverge to evolve new functions, other gene families
undergo a remarkable type of genetic process by
which family members become highly homogeneous.
Individual family members within a species do not
grow increasingly dissimilar over time due to the
independent accumulation of mutations, but rather
undergo sequence homogenization. However, during
speciation each lineage follows an independent trajectory with the result that families are divergent
between species (Figure 1). Concerted evolution
(coined by Elizabeth Zimmer and colleagues in 1980)
is the name currently used for this process, although a
variety of terms (coevolution, horizontal evolution,
coincidental evolution) have been used in the past.
similarities of 95-100% between family members are
common.
A seemingly contradictory aspect of the concerted
evolution of tandem repeats, more apparent than
real, is the widespread occurrence of length polymorphisms. Repeats of different lengths coexist in
many eukaryote genomes (Figure 2C), even within a
single array. Length polymorphisms map to the
spacers of the basic repeat. Repeat length differences
may be caused by insertions and excisions of mobile
genetic elements, as well as by variation in copy
numbers of embedded microsatellites, minisatellites,
or other internally repetitive elements. Concerted
evolution still occurs between homologous sequences,
even between family members with dramatic length
differences.
Concerted evolution is a dynamic process that
reflects a balance between the forces that introduce
new mutations and the genetic mechanisms of amplification, homogenization, and fixation. As a result,
identity between family members is not absolute.
Instead, subtle variation is superimposed on a general
framework of high identity. The partitioning of variation in an array, while only a snapshot in evolutionary
time, is still a useful characteristic. Two generalizations emerge from studies of several gene families in
a number of organisms: (1) There is an inverse relationship between linear distance and the strength of
sequence homogenization. In other words, repeats
that are neighbors in a tandem array tend to be more
similar to each other than are repeats that are more
distant from one another. (2) Variation in one locus is
greatest at the ends of arrays and lowest in internally
located regions.
Figure 1 Concerted evolution. Duplicated genes (boxes) are shown with the ancestral sequence (a) at four
positions. Concerted evolution occurs when independent changes in Species A (A) and Species B (B) spread
throughout their respective families and become fixed in the population (intraspecific repeat homogenization and population fixation). For simplicity of presentation, the ancestral
repeated genes are depicted as identical. As indicated in the text, any set of genes is likely to harbor variants that may
be selectively represented as new lineages evolve.
Figure 2 (See Plate 4) Examples of concerted evolution in tandem repeats. (A) Histone repeats are quintets of
coding regions (boxes; one for each of the histone proteins H1, H2A, H2B, H3, H4; arrows show direction of
transcription) and associated intergenic spacers (horizontal lines between boxes). This entire unit is repeated head-to-tail to form a tandem array. (B) Genomic blot analysis of Drosophila melanogaster histone repeats. Four different
restriction endonucleases (lanes 1-4), each with one recognition site per repeat, generate unit lengths of 4.8 and
5.0 kb (length differences map to the H1-H3 spacer) in the vast majority of the 100 copies per haploid genome.
(C) Genomic blot analysis of D. virilis histone repeats. Four different restriction endonucleases, each with one site
per repeat (lanes 1, 2, 6, 7), reveal extensive length heterogeneity in the 50 copies per haploid genome: the doublet at
3.6/3.4 kb represents H1-less quartets, and the ladder of larger fragments represents quintets with variably sized H4-H2A
spacers. Predominant fragment patterns with restriction enzymes with multiple sites (lanes 3, 5) or no sites (lane 8) in
the repeat provide further evidence for concerted evolution.
genomic architecture and are proposed to undergo
concerted evolution. Members of gene families can
also occur at dispersed, nonallelic locations as essentially solitary copies. Examples of such families that
may be evolving in concert are color vision genes in
vertebrates and heat shock genes in many organisms.
While concerted evolution can occur in all of these
diverse types of gene families, its tempo and mode
may be very different. For example, a process of
birth and death that couples amplification (phase 1)
and lineage-specific loss/retention (phase 3) appears
to be the major force in the evolution of major histocompatibility complex, immunoglobulin, and vertebrate histone genes.
Figure 3 Mechanisms of concerted evolution. Four paralogs (boxes) with one variant repeat (shaded) are depicted
with ancestral sequences at two nucleotide positions (a) and a change (a to b) at one of the sites. For simplicity,
only the two chromatids involved in the genetic exchange are shown. The brackets on the left mark repeats taking
part in the second cycle of each process; copy numbers are shown on the right. Shaded regions approximate
the extent of sequence exchange. (A) Consequences of unequal crossing-over. Point of cross over is indicated by
an X. (B) Consequences of out-of-register gene conversion. Double-sided arrows indicate site and direction of
conversion.
mechanism that operates over relatively small distances (Figure 3B). Estimates of the sizes of typical
conversion tracts are several hundred base pairs for
yeast, fruit flies, and humans. Gene conversion generates small tracts of sequence identity between repeats
and does not change copy numbers of family members. Gene conversion has an additional advantage
since it is commonly biased (favoring one allele in
the nonreciprocal information transfer) which would
facilitate concerted evolution considerably.
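The effect of biased gene conversion on a repeat family can be pictured with a toy simulation. It is only a sketch under strong simplifying assumptions: each repeat is treated as a single unit rather than as a tract of a few hundred base pairs, donor and recipient are drawn at random, and the bias is absolute (the favored variant always acts as donor when paired with a different variant). The function names, the `bias` parameter, and the starting array are hypothetical.

```python
import random


def gene_conversion(array, donor, recipient):
    """Copy the donor repeat onto the recipient; copy number is unchanged."""
    converted = list(array)
    converted[recipient] = array[donor]
    return converted


def homogenize(array, events, bias=None, seed=0):
    """Apply a series of random conversion events, optionally favoring one variant."""
    rng = random.Random(seed)
    current = list(array)
    for _ in range(events):
        donor, recipient = rng.sample(range(len(current)), 2)
        # Biased conversion: if the favored variant was drawn as recipient
        # against a different variant, swap roles so that it donates instead.
        if bias is not None and current[recipient] == bias and current[donor] != bias:
            donor, recipient = recipient, donor
        current = gene_conversion(current, donor, recipient)
    return current


# Hypothetical array of four repeats, one of which carries the variant "b".
start = ["a", "a", "b", "a"]
end = homogenize(start, events=50, bias="b")

print(start, end)
```

Two properties hold for any run of this sketch: the number of repeats never changes, and no variant appears that was not already present. With the bias applied, the count of the favored variant can only stay the same or increase, which is the sense in which biased conversion would speed homogenization.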
3. There has also been speculation that three-dimensional structural features of repeats (such as
the presence of stem-loops and scaffold attachment
sites) may promote concerted evolution. It is probable that many different factors contribute to the
tempo and mode of concerted evolution, and that
each family has its own unique combination of
factors.
Future Prospects
Most of the empirical approaches to concerted evolution focus, by necessity, on its second phase, the
mechanisms of sequence homogenization. Even for
this relatively well-studied aspect of concerted evolution, there are still gaps in our knowledge, particularly
of the rates of unequal exchange and gene conversion.
Better understandings of both are imminent, due to
recent advances in analysis of repeated genes and
detection of polymorphisms. Whether dinucleotide
repeats or ribosomal DNA, we have relatively little
understanding of the precise features and genetic
forces that trigger amplification and subsequent
spread throughout a species (the first and third
phases).
The growing availability of appropriate reporter
genes that can be assembled into artificial arrays and
introduced into genomes by transgenic technology
promises new experimental systems for the study of
concerted evolution. Such powerful approaches have
been largely confined to yeast and cultured mammalian cells. Given the wonderful and growing collections of genes (and corresponding mutations) known
to affect chromatin structure, repair, and recombination in several model organisms, there will be almost
endless possibilities for examination of synergistic
effects, and dissection of mechanisms.
Further Reading
Nei M, Gu X and Sitnikova T (1997) Evolution by the birth-and-death process in multigene families of the vertebrate immune
system. Proceedings of the National Academy of Sciences, USA
94: 7799-7806.
Spradling A (1994) Transposable elements and the evolution
of heterochromatin. Society of General Physiologists Series 49:
69-83.
Concordance
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0260
`Concordance' is used in two different ways by geneticists. In formal genetic studies, it describes the situation
where two expressed traits or alleles that are found
together in one parent are also found together in
the offspring of that parent. The level or percent of
concordance refers to the fraction of total offspring
characterized from an experimental cross that shows
concordance. The remaining fraction is considered
to be discordant. A high rate of concordance is consistent with pleiotropy, genetic linkage, or association
between the two loci or traits under analysis. Concordance is also used in twin studies to describe pairs
in which both individuals express the same trait.
By comparing levels of concordance for a particular
trait in populations of monozygotic and dizygotic
twins, it is possible to estimate the heritability of the
trait. The greater the heritability, the higher the level
of concordance expected among monozygotic twin
pairs.
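The percent concordance described above for an experimental cross can be illustrated with a minimal Python sketch. The function name, the data layout, and the scoring rule (an offspring counts as concordant when it shows both traits or neither, as in the parents) are assumptions made for this illustration only.

```python
def concordance_fraction(offspring):
    """Return the fraction of offspring that are concordant.

    `offspring` is a list of (trait_1, trait_2) booleans, one pair per
    characterized offspring. An offspring is scored as concordant when it
    shows both traits or neither (an assumed scoring rule).
    """
    if not offspring:
        raise ValueError("at least one offspring is required")
    concordant = sum(1 for t1, t2 in offspring if t1 == t2)
    return concordant / len(offspring)


# Hypothetical cross: three concordant offspring and one discordant offspring.
cross = [(True, True), (True, True), (False, False), (True, False)]
print(concordance_fraction(cross))      # concordant fraction
print(1 - concordance_fraction(cross))  # discordant fraction
```

The remaining fraction, one minus the concordance, is the discordant class mentioned in the text.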
See also: Heritability; Linkage; Pleiotropy
Conditional Lethality
M Sussman
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0261
conditions, the mutant gene is at least partially functional, and the affected individual is viable. Under
`restrictive' conditions, the mutant gene is not functional, and the affected individual dies. Nonmutant, or
wild-type, individuals (genotype AA or A) survive
under either set of conditions.
Dominance
Geneticists can make little use of a dominant mutation
that is lethal under all environmental conditions
because an organism carrying such a mutation cannot
reproduce. Conditionally lethal mutations, on the
other hand, are valuable precisely because they can be
propagated under permissive conditions. Conditional
lethals can be, and sometimes are, dominant. Thus, the
Aa genotype may or may not be viable under restrictive conditions. In practice, the vast majority of conditionally lethal mutations are recessive.
Collecting lethal mutations is a relatively simple
matter for researchers working with diploid organisms, which can carry recessive lethal mutations in
the heterozygous state (Aa). However, for those who
study haploid organisms such as bacteria, bacterial
viruses, or fungi, the only way to collect lethal mutations is to use conditional lethals, which can be propagated under the permissive conditions and then
studied genetically and physiologically under restrictive conditions.
Temperature-Sensitive Mutations
Mutants whose permissive temperature is lower
than the restrictive temperature are usually called
temperature-sensitive, although some investigators
prefer the term heat-sensitive. Mutants whose permissive temperature is higher than the restrictive
temperature are said to be cold-sensitive. `Hot' and
`cold' are not used here in the sense that they are used
on water faucet labels. For example, temperature-sensitive mutants of bacteriophage T4, first isolated
by Robert S. Edgar in the 1960s, were screened for the
ability to grow at 23 °C, roughly room temperature,
but not at 42 °C, roughly the high temperature on a
sunny midsummer day in Phoenix, Arizona. The
Escherichia coli host bacteria will grow and wild-type
phage T4 will make plaques at either temperature. The
temperature-sensitive (ts) mutants of T4, however,
make plaques only at 23 °C.
At the DNA level, almost all temperature-sensitive
mutations are single-base substitutions, which lead to
amino acid switches in the protein product of the gene.
The substitution of one amino acid for another can
affect the folding of the protein into its active three-dimensional structure, the stability of that folded
Nonsense Mutations
A second major class of conditionally lethal mutations is made up of base-substitution mutations that
introduce premature stop codons into genes. When a
stop codon, such as UAG, is substituted for a codon
corresponding to an amino acid found in the wild-type protein product of the gene, the translation
apparatus synthesizes the protein from its N-terminus
up to the stop codon and then stops. Thus, the product
of the mutant gene is a truncated, nonfunctional polypeptide. Mutations that introduce premature stop
codons into genes are called nonsense mutations.
They were given whimsical names that derive from a
laboratory joke: Mutants that introduce the stop
codon UAG are called amber mutants; those that
introduce UAA are called ochre mutants; and those
that introduce UGA are called opal mutants.
One might expect that the introduction of a premature stop codon into a gene would invariably
incapacitate the gene. How can such mutations be
conditionally expressed? The answer is that a nonsense mutation can be suppressed by the presence of
an abnormal transfer RNA (tRNA) whose anticodon
pairs with the stop codon. For example, a tRNA with
the anticodon 3′ AUC 5′ will pair with the amber codon (5′ UAG 3′).
The tRNA carries an amino acid that will be incorporated into the protein product of the gene at the position corresponding to the nonsense codon. In the
presence of the suppressor tRNA, the translation
of the mutant message results in a mixture of two
products: some truncated polypeptides and some
full-length protein molecules with an amino acid substitution at the position of the nonsense mutation. If
the substituted amino acid does not greatly alter the
structure of the protein, the gene product may be sufficiently active to allow survival of the nonsense
mutant.
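The effect of a premature stop codon, and of its suppression, can be sketched in a few lines of Python. This is a toy illustration, not a model of any real gene: the message, the reduced codon table, and the choice of tyrosine as the amino acid carried by the suppressor tRNA are all assumptions. The sketch shows only the full-length read-through product; in a real suppressor-carrying cell a mixture with truncated chains is expected, as described above.

```python
# Reduced codon table covering only the codons used in this toy message.
CODON_TABLE = {
    "AUG": "M", "CAG": "Q", "AAA": "K", "GGU": "G",
    "UAG": "*", "UAA": "*", "UGA": "*",  # amber, ochre, opal stop codons
}


def translate(mrna, suppressor=None):
    """Translate `mrna` codon by codon from its first base.

    `suppressor` optionally maps a stop codon to the amino acid inserted
    by a suppressor tRNA (hypothetical values supplied by the caller).
    Translation ends at the first stop codon that is not suppressed.
    """
    suppressor = suppressor or {}
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in suppressor:
            protein.append(suppressor[codon])  # read-through with substitution
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break  # unsuppressed stop codon: chain terminates here
        protein.append(residue)
    return "".join(protein)


wild_type = "AUGCAGAAAGGUUAA"
amber_mutant = "AUGUAGAAAGGUUAA"  # CAG changed to the amber codon UAG

print(translate(wild_type))                    # full-length product
print(translate(amber_mutant))                 # truncated product
print(translate(amber_mutant, {"UAG": "Y"}))   # suppressed: full length, one substitution
```

The natural ochre stop (UAA) at the end of the message still terminates translation in the suppressed case, because only the amber codon is listed in the suppressor mapping.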
`Stop'-binding tRNAs are not normal components
of the translation apparatus. A cell harboring one of
these supernumerary suppressor tRNAs is called a
suppressor-plus (su+) cell. Viruses carrying nonsense
[Figure 1 image not reproduced; caption follows.]
Figure 1 Genetic control of the biosynthesis of bacteriophage T4, revealed through the use of conditionally lethal
mutations. The figure shows steps in the synthesis of head, tail, and tail fibers. The `early' steps in phage development
(DNA synthesis and related processes) are not shown in this figure. The numbers appearing above the arrows refer
to genes of the phage. (The only gene in the figure that is not identified by a number is wac, the gene that encodes
`whisker' proteins.) Under restrictive conditions, a temperature-sensitive or amber mutation in any gene whose
number appears over a particular arrow will block phage morphogenesis at the step symbolized by that arrow. Thus,
for example, a mutation in gene 23 blocks the formation of empty phage heads, and a mutation in gene 49 prevents
the `stuffing' of the heads with phage DNA. (Adapted from Wood WB (1979) Bacteriophage T4 assembly and the
morphogenesis of subcellular structure. The Harvey Lectures 73: 203-223.)
of this sort can be done to determine whether the
biological function affected by a particular conditionally lethal mutation is required early in the growth
cycle or late in the growth cycle or continuously
throughout the cycle. By varying the time of the shift,
one can determine exactly when the temperature-sensitive process ends or begins.
Special Cases
There are many other examples of mutants that survive under one set of conditions but not under
another. For example, auxotrophic mutants of bacteria
or other microorganisms differ from the prototrophic
wild-type in their nutritional needs. They cannot
grow unless their environment contains some specific
nutrient (an amino acid, a vitamin, a nucleic acid
precursor) that is not required by the prototroph.
Some streptomycin-resistant mutants of E. coli are
actually dependent on the presence of streptomycin
in the growth medium. Human cells deficient in
the enzyme hypoxanthine guanine phosphoribosyl
transferase (HGPRT) are unable to grow on a medium
containing hypoxanthine, aminopterin, and thymidine (HAT medium), whereas wild-type cells grow
perfectly well. In all these cases, the `conditions' that
determine the viability of the mutant are highly specific to the gene function affected by the mutation.
Nevertheless, the term `conditional lethal' is generally
reserved for the likes of nonsense or temperature-sensitive mutations, in which the permissive condition
Congenic Strain
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0263
[Figure image not reproduced. Surviving labels: DONOR, RECIPIENT, F1, N2, N3, N4, N10 (full congenic); selection for donor allele at differential locus; differential chromosomal segment; residual heterozygosity.]
Congenital Adrenal
Hyperplasia
(Adrenogenital Syndrome)
C J Migeon
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0052
Table 1 (body not recoverable; surviving headers and row labels only)
Columns: Name; Location/action; Chromosomal location
Rows: CYP11A (P-450scc); 3β-HSD; CYP17 (P-450c17)
Forms of CAH
Virilizing Adrenal Hyperplasia
Deficiency of 3β-Hydroxysteroid
Dehydrogenase
Aldosterone Deficiency
Further Reading
Congenital Disorders
D Donnai
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0264
The term `congenital' signifies a condition that is present at birth. Structural birth defects are the most
frequent congenital disorders. Other disorders presenting at birth include genetic, metabolic, and neurological disorders, and disorders due to the effects of
environmental factors such as infections and other
teratogens. Between 2% and 3% of newborns have a
congenital disorder that will require medical attention. Most birth defects are single, such as a congenital
cardiac defect, spina bifida, cleft lip, or talipes, but
some, such as chromosomal disorders like Down
syndrome, affect multiple body systems. The incidence of anomalies is higher in stillborn babies and
in spontaneously aborted fetuses.
Terminology
Terms such as malformation, deformity, and anomaly
are often used interchangeably. However, there are
precise definitions for various terms which aid
description and scientific analysis.
Malformation
Deformation
Disruption
Single Malformations
Factors that suggested a partly genetic basis for common malformations such as neural tube defects and
cleft lip and palate were the observed increase in risk
for first degree relatives and a lesser, although still
greater than background, risk for second and third
degree relatives. Also, in malformations in which the
frequency between the sexes was unequal, such as
congenital dislocation of the hip and pyloric stenosis,
the recurrence risk was found to be greater if the
affected child was of the less frequently affected sex. Whilst
the genetic predisposition to a malformation cannot
be altered, environmental factors can be modified. An
example of this preventative treatment is the role of
folic acid in reducing recurrence risk of neural tube
defects in susceptible mothers after the birth of an
affected child.
Multiple Malformations
disorder, affects secretions of the lungs and gastrointestinal tract and may present at birth with meconium
ileus, a bowel obstruction. There are many inborn
errors of metabolism which are inherited as autosomal
or X-linked recessive traits which present soon after
birth with progressive metabolic disturbance leading
to neurological abnormalities and sometimes death.
One such condition is X-linked ornithine transcarbamylase deficiency, a defect in urea metabolism usually
leading to death of affected males. Abnormalities of
the components of connective tissue such as collagens,
fibrillins, and proteins involved in assembly of these
components may present at birth with unusual tissue
laxity and often deformations due to intrauterine forces.
Severe neonatal Marfan syndrome, usually due to a
new dominant mutation in fibrillin 1, is an example.
Conjugation
K B Low
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0267
One involves various species of bacteria that are normally haploid and which can transfer plasmids and
sometimes portions of their chromosomes to other
bacterial cells or in some cases to plants. The second
involves diploid protozoan ciliates such as Paramecium, which can exchange entire haploid nuclei
between cells, thus giving rise to a new pair of complete chromosomes in each exconjugant cell, derived
from both of the parental cells.
Bacterial conjugation is one of the three major
known modes of genetic exchange between bacteria,
the other two being transduction and bacterial transformation. Of these three modes, conjugation is the
only one that involves cell-to-cell contact. The phenomenon was first reported in 1946 by J. Lederberg and
E. L. Tatum using Escherichia coli, as the result of a
conscious effort to find sexual recombination in bacteria. Bacterial conjugation is a sexual mode of genetic
transfer in the sense that chromosomal material from
two sexually distinct types of cells is brought together in a defined and programmed process. However, in bacterial conjugation the process involves only
a portion (usually small) of the genome of one of the
cells (the donor) and the complete genome of its sexual
partner (the recipient), as opposed to sexual union in
most higher organisms which involves an interaction
between the entire set of chromosomes from both of
the parental cells. Thus, genetic transfer in bacterial
conjugation is partial, and it is in most cases polar,
wherein genetic material moves unidirectionally
from the donor cell into the recipient cell followed
by separation of the cells and further changes in the
organization or recombination of the combined
genetic material within the recipient cell. With a few
conjugational transfer systems, some transfer can also
occur from the recipient strain into the donor strain.
This is known as retrotransfer. The transfer of genetic
material can take several minutes or more (up to
several hours).
In contrast to bacterial conjugation, protozoan conjugation involving Paramecium is more complicated
and prolonged, taking about 20 h to complete. As in
bacteria, conjugation in Paramecium involves cell-to-cell contact and transfer of genetic material. However,
conjugation in Paramecium involves first meiotic and
mitotic divisions of its diploid germinal set of chromosomes, followed by an exchange of an entire haploid
nucleus from each of the conjugating cells to the other.
Occasionally cytoplasmic material, including small
autonomous genetic components, is also exchanged.
Thus in Paramecium, conjugation results in a substantially symmetric and complete exchange of a haploid
equivalent of genetic material into both of the participants, which can then propagate vegetatively, each
carrying a new combination of chromosomes.
Of these two varieties of conjugation, the mechanisms of bacterial conjugation are understood in greater
molecular detail, and are considered below.
Some plasmids carry only a portion of the genetic information necessary for conjugation. The one essential
element that a plasmid requires for it to be transferable
is a site where one strand of the DNA is cut, prior
to transfer. This site, termed the origin of transfer, is
usually known as oriT, or bom (basis of mobility).
Also required in the cell is a gene for a site-specific
nuclease that cuts at this oriT; this gene is usually given
a mob or tra designation, as are other necessary genes.
If the oriT (bom) site and necessary mob genes are
present, the remainder of the conjugational apparatus
can sometimes be supplied by a coresident conjugative
plasmid. This is called mobilization in trans of a nonconjugative plasmid by a conjugative plasmid.
Another configuration in which a set of conjugation genes can cause transfer of unrelated DNA is
when the transfer system (e.g., a conjugative plasmid)
recombines with another plasmid or with the main
bacterial chromosome. In this configuration the transfer of one DNA strand beginning with oriT results in
transfer of all the DNA that is linked to it on the same
DNA strand, which includes the other plasmid or
chromosomal DNA. This is mobilization in cis. In
the case of a conjugative plasmid stably recombined
with the chromosome, the transfer system is called
Hfr (high frequency of recombination for chromosomal genetic markers which are transferred to other
cells). There is sometimes breakage of DNA during
extended transfer in Hfr crosses, so that markers
located far from oriT (distal markers) are transferred
less frequently than earlier markers closer to oriT
(proximal markers). This effect is known as the transfer gradient.
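The transfer gradient can be pictured with a simple toy model in Python. The model is an assumption introduced here for illustration, not a result stated in the text: it supposes that DNA breakage interrupts transfer at a constant rate, so the chance that a marker is transferred falls off exponentially with its distance from oriT. The rate value and the marker distances are hypothetical.

```python
import math


def transfer_probability(distance_from_oriT, break_rate=0.05):
    """Probability that a marker is transferred before transfer is interrupted.

    `distance_from_oriT` is measured in arbitrary map units along the
    transferred strand; `break_rate` is a hypothetical constant chance of
    breakage per map unit.
    """
    if distance_from_oriT < 0:
        raise ValueError("distance must be non-negative")
    return math.exp(-break_rate * distance_from_oriT)


# Hypothetical proximal and distal markers.
for name, distance in [("proximal", 5), ("middle", 30), ("distal", 80)]:
    print(name, round(transfer_probability(distance), 3))
```

Under this assumption proximal markers are always recovered more often than distal ones, which is the qualitative pattern the text describes.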
In contrast to stable integration of a conjugative
plasmid, the plasmid can in some cases interact transiently (but rarely) with the chromosome or another
plasmid and cause mobilization perhaps by initial
steps in recombination in regions of limited DNA
homology, or owing to the activity of a transposable
element. If there is substantial homology (more than
several kilobases) between the conjugative plasmid
and the replicon being mobilized, repeated recombination events result in an equilibrium between integrated and autonomous states of the plasmid, and
thus in a population of cells a large percentage of each
mode of transfer (mobilization (in cis) versus simple
plasmid transfer) will occur. This happens, for instance,
with a plasmid that carries a sizable piece of chromosomal DNA (derived by an abnormal excision event
from a once-integrated plasmid such as in an Hfr)
which is known as an F-prime (or R-prime, Col-prime, etc.) factor. These `primes' confer merodiploidy for the regions of the chromosome that they
carry (e.g., F′-lac, which includes the E. coli lac operon)
and have been used extensively for dominance studies.
In some bacteria, for example, gram-positive species such as E. faecalis, Clostridium difficile, Streptococcus pneumoniae, or Lactococcus lactis, and in the
gram-negative genus Bacteroides, genetic determinants for conjugation are located on the main chromosome, and in some cases associated with a transposon
[Figure image not reproduced. Surviving labels: donor cell, recipient cell, sex pilus, pilus attachment and pilus retraction, relaxosome, plasmid cleavage at the oriT nick site, ssDNA transfer, recircularization, complementary strand synthesis, DNA supercoiling.]
Historical Aspects
The discovery of bacterial conjugation in 1946 was
hailed by Salvador Luria in 1947 as ``probably among
the most fundamental advances in the whole history
of bacteriological science,'' even before the most basic
Evolutionary Relationships
Transfer functions of diverse origin show similarities
with each other and (more distantly) with functions
required for export of protein toxins, suggesting
that macromolecular export in general makes use of
homologous machinery. The tra genes of the F factor
have been best studied and are closely related to genes
of other enteric low-copy plasmids. RP4 and relatives typify a second well-studied family, the broad-host-range conjugal plasmids of Pseudomonas. These
express a smaller set of tra functions quite similar to
each other. The A. tumefaciens Ti plasmids capable of
transferring DNA into plants, which express no pili,
also express some genes with similarity to those of
RP4. At least 10 of the Tra functions of these families
display a more distant similarity between families.
Eight of these exhibit similarity to Ptl proteins, which
are responsible for the export of Bordetella pertussis
Host Range
Conjugative Transposition
G Churchward
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0265
Conjugative Transposons
Conjugative transposons are genetic elements that
excise from a donor DNA molecule, transfer from
the donor bacterium to a recipient bacterium, and
integrate into a recipient DNA molecule. This process
is shown in Figure 1. They are typically much larger
than many bacterial transposable elements because, in
addition to encoding proteins that are responsible for
DNA cleavage and strand transfer during excision and
integration, they also encode the proteins required for
conjugal DNA transfer. They are different from conjugal plasmids in that the circular form of the transposon that results from excision is not capable of
autonomous replication.
Mechanism of Transposition
Most conjugative transposons encode an integrase
enzyme that is a member of the integrase family of
[Figure 1 image not reproduced; caption follows.]
Figure 1 (A) A donor bacterium with a conjugative transposon (bold line, `Tn') inserted at a unique site in the
donor chromosome (`C'). In reality, the donor chromosome is approximately 200 times longer than the transposon.
(B) The transposon has excised from the donor chromosome, assumed a circular form, and is ready to conjugate into
a recipient bacterium. (C) The transposon is present in its circular form in the recipient bacterium. (D) The
transposon has integrated into the recipient chromosome at a new site, different from the position it occupied in the
donor bacterium. These donor and recipient bacteria belong to different species.
Mechanism of Conjugation
Genetic data suggest that a single strand of the transposon is transferred from donor to recipient during
conjugation. Conjugation also requires a segment of
transposon DNA that is distinct from the transposon
ends. Recombinant plasmids carrying this transposon
segment are mobilized by a transposon resident in
the donor chromosome, and so this segment of
DNA constitutes an origin of conjugal transfer. The
segment shows similarities to the origins of conjugal
transfer of bacterial plasmids, and so it is assumed
that DNA transfer proceeds in a similar manner. A
transposon-encoded protein (Mob) recognizes the
DNA sequence of the origin of transfer and nicks the
DNA, exposing a DNA end that can serve as a primer
for rolling-circle DNA replication. If so, then following transfer of a single transposon DNA strand to the
recipient, the complementary transposon strand must
be synthesized in the recipient and the circular form
of the transposon reconstituted prior to integration of
the transposon into its target site.
Further Reading
Conplastic
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0268
Consanguinity
Related Elements
`Consanguinity,' derived from the Latin consanguineus (``of common blood''), is defined as the kinship of
two individuals characterized by a shared common
ancestor(s). It implies the inheritance of genes which
are identical by descent, i.e., inherited from the common ancestor(s). Consequently, consanguinity affects
Consensus Sequence
A Liljas
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0270
that is generally used for inter- or intramolecular interactions. Similar molecules in a cell frequently use the
same or highly similar consensus sequences and are
usually well conserved between species. There are
numerous examples, some of which are mentioned
here.
In the case of DNA, the sequence at the origin of
replication in Escherichia coli and other bacteria has
two short, repeated sequences with consensus features
as well as individual variations. DNA polymerase and
other proteins of the replication machinery identify
these sequences. Regulation is essential for the transcription of DNA to
RNA, and consensus DNA
sequences are at the core of this regulatory activity.
Thus eubacterial promoters have consensus
sequences preceding the start site for transcription. One
such, called the `Pribnow box' or `−10' region, is
situated about 10 nucleotides before the start of transcription. Another site is the −35 region. They have the
consensus sequences TATAAT and TTGACA,
respectively, which are recognized by RNA polymerase. In eukaryotic transcription, again consensus
sequences guide the RNA polymerase to the proper
site. Here a large number of transcription factors participate to aid in the regulation of transcription. A
classical consensus sequence is the so-called TATA
box, recognized by one of the key proteins, the
TATA box-binding protein (TBP).
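Because real promoters match a consensus only approximately, a consensus sequence is often used as a scoring template. The following Python sketch slides each of the two bacterial consensus hexamers quoted above (TTGACA and TATAAT) along a DNA sequence and reports the best ungapped match. The function names and the test sequence are invented for this illustration, and simple identity counting is an assumed scoring scheme, not a description of how RNA polymerase recognizes a promoter.

```python
def count_matches(window, consensus):
    """Number of positions at which `window` agrees with `consensus`."""
    return sum(1 for a, b in zip(window, consensus) if a == b)


def best_match(sequence, consensus):
    """Return (start_index, score) of the best ungapped match.

    The first window with the highest score is reported; indices are
    0-based positions in `sequence`.
    """
    k = len(consensus)
    best_start, best_score = -1, -1
    for start in range(len(sequence) - k + 1):
        score = count_matches(sequence[start:start + k], consensus)
        if score > best_score:
            best_start, best_score = start, score
    return best_start, best_score


# Invented sequence containing both hexamers.
dna = "GCTTGACATTTACGCTATCGGCTCGTATAATGTGTGGA"
print(best_match(dna, "TTGACA"))  # candidate -35 element
print(best_match(dna, "TATAAT"))  # candidate -10 element
```

A score below the full length of the hexamer would indicate a partial match, which is the usual situation for natural promoters.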
In the case of RNA, a number of consensus elements are of great interest. One case is the consensus
sequences of RNA transcripts in eukaryotes and
archaea that lead to the splicing to mature mRNAs.
Prokaryotic mRNAs have a consensus sequence preceding the initiation codon. This is called the `Shine-Dalgarno
sequence' and is used by a complementary
region of the ribosomal RNA to distinguish the initiator
codon from other methionine codons. Another case is
the 3′ end of tRNA, which has the consensus sequence
CCA. This consensus sequence is recognized by the
tRNA synthetases, which charge the terminal ribose
with the appropriate amino acid. The individual
tRNAs are recognized by the appropriate synthetase
owing to additional consensus features. The ribosomal
peptidyl transfer site also recognizes the CCA sequence of the tRNAs.
Proteins also have consensus sequences of amino
acids. This is, e.g., the case for sites of phosphorylation
in proteins. The pattern recognized by the kinases can
often be identified as consensus elements in the amino
acid sequence. One other classical consensus sequence
in proteins is the GXGXXG motif found in the
Rossmann fold or nucleotide binding fold at the binding site of the nucleotide. Another consensus sequence
is the leucine zipper, where every seventh amino acid
residue is a leucine. Such structures are found in
Conservation Genetics
R Frankham
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0271
460
exists in a single population of only about 70 individuals in Queensland, Australia has less genetic diversity
than its nearest relative, the southern hairy-nosed
wombat.
Island Populations
A recurring theme in conservation genetics is the parallels between populations of conservation concern
and island populations. Endangered species in captivity
are akin to island populations. Fragmented populations
often have the characteristics of island populations
where they previously existed as a continuous population with immigration across the range. The disturbing implication of the island analogy is that island
populations are much more prone to extinction than
mainland populations.
Island populations typically experience different
environments to their mainland counterparts, especially in the absence of many predators, parasites,
and disease. The evolution of endangered species in
captivity has features that are akin to this. Island
populations are often founded from a small number
of individuals so they have often experienced population size bottlenecks. Further, the populations are
typically smaller than their mainland counterparts.
Both these features are shared with endangered species
in captivity. As a consequence of bottlenecks at foundation and small size, island populations typically
have less genetic diversity than mainland populations
and are often inbred and may have lowered reproductive fitness.
attempts to breed. Genetic markers on the sex chromosomes can be used to distinguish the sexes in bird
species.
Population structure and dispersal rates are important parameters required for conservation purposes.
These can be worked out indirectly by using genetic
markers. For example, dispersal patterns of female
loggerhead turtles have been determined from studies
of mitochondrial DNA (mtDNA) which is maternally
inherited.
Nocturnal and shy species are difficult to census.
Scat (feces) counts can be used to census such species.
However, scats of the species of interest must be
distinguished from other species with similar feces.
Analyses of amplified segments of mtDNA have
been used to distinguish scats of the endangered San
Joaquin kit fox from those of red foxes, gray wolves,
coyotes, and domestic dogs.
Endangered species are protected from hunting and
trade by laws within countries and by CITES (the
Convention on International Trade in Endangered Species of Wild Fauna and Flora). However, illegal hunting may be difficult to detect. A substantial proportion of the whale meat in Japan and
Korea has been shown to come from protected
whale and dolphin species, rather than from minke
whales which are subject to legal scientific whaling.
Scott Baker and colleagues amplified a portion of
mtDNA from samples of whale meat purchased in
those countries and took the amplified DNA to their
home laboratories and sequenced it. They could not
take the whale meat samples out of Japan and Korea
without risking violation of CITES rules.
Further Reading
References
Conservative
Recombination
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1801
Conserved Synteny
See: Synteny (Syntenic Genes)
Consomic
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0273
Consomic strains of animals are a variation on congenic strains in which a whole chromosome rather
than one local chromosomal region is backcrossed
from a donor strain onto a recipient background.
Consomic production with the Y chromosome is
readily carried out because of the lack of recombination along this chromosome in male animals. For
consomic production with other chromosomes, it is
necessary to select animals at each generation with the
use of DNA markers that can demonstrate the transmission of the whole chromosome intact. Like congenics, consomics are produced after a minimum of 10
backcross generations. Backcrossing to obtain consomics for the Y chromosome must be carried out in a
single direction: males that contain the donor chromosome are always crossed to inbred females of the
recipient strain.
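The requirement for a minimum of 10 backcross generations can be motivated with a short calculation, sketched here in Python. The sketch rests on assumptions that are not stated in the entry: the F1 is counted as generation N1, loci are unlinked to the selected chromosome, and each backcross to the recipient strain halves the expected fraction of such loci that still carry a donor allele.

```python
def expected_donor_heterozygosity(generation):
    """Expected fraction of unlinked loci still heterozygous for a donor allele.

    `generation` is the backcross generation number, with the F1 counted
    as N1 (an assumed convention). Each backcross halves the expectation.
    """
    if generation < 1:
        raise ValueError("generation must be 1 (the F1) or higher")
    return 0.5 ** (generation - 1)


for n in (1, 2, 5, 10):
    print(f"N{n}: {expected_donor_heterozygosity(n):.4%}")
```

Under these assumptions the expected residual donor contribution at unlinked loci falls below 0.2% by N10.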
See also: Congenic Strain
Constant Regions
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1802
Constant regions are the nonvariable regions of immunoglobulin molecules encoded by C genes. Each of
the heavy and light chain classes of immunoglobulin
has a different C gene.
See also: C Genes; Immunoglobulin Gene
Superfamily
Constitutive Expression
E Thomas
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0274
Constitutive Heterochromatin
See: Heterochromatin
Constitutive Mutations
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1804
Contig
L Stubbs
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0275
Continuous Variation
P Sham
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0276
Contractile Ring
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1807
Controlling Elements
P Lu
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0277
'Controlling elements,' when used in the genetics literature, refers to maize transposons. More recently,
'X-controlling element' (X-ce) refers to a locus found
on the mouse X chromosome that influences the
choice of X chromosome inactivation; this is not likely
to be a transposon.
Other Considerations
It should be noted that these complex genetic phenomena were first discovered in maize for the same
reason that our understanding of genetics started with
peas. This is a restatement of the fact that horticulture preceded animal husbandry. Plants are sessile, are
visible to the naked eye, and are easy to store and
catalog. Genetics depends on the analysis of offspring
from breeding parents of known genetic traits. Every
step of the process in this analysis is simpler with
plants.
The existence of transposons and the notion that
they are parasitic or selfish DNA remain an evolutionary riddle.
Further Reading
Convergent Evolution
J Read and S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.2085
Conversion Gradient
F W Stahl
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1005
Gradient
For some genes, the meiotic conversion frequency for
different genetic markers (mutations) within the same
gene varies depending on the position of the marker
within the gene. Commonly, conversion frequencies
are relatively high near one end of the gene and fall
monotonically to a lower value at the other end (this
is known as the conversion gradient, often called a
'polarity gradient,' defining a segment of a chromosome as a polaron).
Conversion gradients can be detected indirectly, as
well, by examining random haploid products from
heteroallelic meioses. Wild-type recombinants from
such crosses are approximately half the time parental
type with respect to markers that flank the heteroalleles. If the mutant site in one heteroallele is more
frequently converted to wild-type than is the other,
the two parental types will be unequally represented.
The gradient is revealed by comparing the results
of several such crosses between various heteroalleles
and noting that the more frequently converted site is
always the one located toward the same end of the
gene.
[Figure 1 diagram: panels (A)-(K). Step labels: cutting; resection; strand invasion; strand transfer blocked; strand transfer extended; transfer reversed; junction sliding; DNA synthesis; mismatch removal. Segregation outcomes shown: 5:3, ab4:4, 6:2.]
Figure 1 Model for a meiotic conversion gradient based on heteroduplex rejection. Of the four chromatids in a
meiotic bivalent, only the pair of chromatids involved in the illustrated recombination event are shown. Arrowheads
signify 3′ ends of polynucleotide strands. Broken lines indicate newly synthesized DNA. A chromatid (A) cut on both
polynucleotide strands (B) is repaired with the aid of a homolog acting as jig and template. The repair steps involve
resection of the 5′-ended strand (C; shown here only on one side of the break), followed by invasion of the homolog
(D). Transfer of the invading strand into the homology may be extended (H) or may be interrupted when the
mismatch-repair enzyme system recognizes a heteroduplex, stops the strand transfer (E), and resolves the
heteroduplex (F). Examples of meiotic segregation ratios are indicated (G, K). On the right, repair of a mismatch (J)
has resulted in a full conversion (6:2). An unrepaired mismatch to its left results in half conversion (5:3 segregation).
Outward sliding (branch migration) of the Holliday junction (I) can result in aberrant 4:4 segregation (ab4:4, in which
the ratio of alleles is normal but two of the chromatids are heteroduplex) when the mismatch repair system fails both
to reverse formation of a resulting mismatch and to repair it.
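The segregation ratios named in the caption (6:2, 5:3, aberrant 4:4) can be made concrete by counting alleles on the eight DNA strands of a tetrad. The sketch below is illustrative only: the function name, the '+'/'m' allele symbols, and the representation of each chromatid as a pair of strands are assumptions introduced here, not notation from the entry.

```python
from collections import Counter


def classify_tetrad(chromatids):
    """Classify a tetrad from the alleles carried on both strands of each chromatid.

    chromatids: four (strand_1, strand_2) tuples, each allele '+' (wild type) or 'm' (mutant).
    A chromatid whose two strands differ is a heteroduplex (an unrepaired mismatch).
    """
    if len(chromatids) != 4:
        raise ValueError("a tetrad has exactly four chromatids")
    strand_counts = Counter(allele for pair in chromatids for allele in pair)
    heteroduplexes = sum(1 for first, second in chromatids if first != second)
    # Report the ratio with the larger class first, e.g. (6, 2) for both 6+:2m and 2+:6m.
    ratio = tuple(sorted((strand_counts["+"], strand_counts["m"]), reverse=True))
    if ratio == (4, 4):
        return "normal 4:4" if heteroduplexes == 0 else "aberrant 4:4"
    if ratio == (5, 3):
        return "5:3 (half conversion)"
    if ratio == (6, 2):
        return "6:2 (full conversion)"
    return "other"


if __name__ == "__main__":
    print(classify_tetrad([("+", "+"), ("+", "+"), ("+", "m"), ("m", "m")]))
    print(classify_tetrad([("+", "+"), ("+", "m"), ("m", "+"), ("m", "m")]))
```

Counting strands rather than chromatids is what distinguishes a half conversion (one heteroduplex chromatid left unrepaired) from a full conversion (the mismatch repaired).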
[Figure diagram: panels (A)-(G). Step labels: cutting; resection; strand invasion; DNA synthesis; junction cutting; mismatch repair. Segregation outcome shown: 4:4. Caption not preserved.]
Further Reading
Nicolas A and Petes TD (1994) Polarity of meiotic gene conversion in fungi: contrasting views. Experientia 50: 242–252.
Reference
Kirkpatrick DT, Dominska M and Petes TD (1998) Conversion-type and restoration-type repair of mismatches formed
during meiotic recombination in Saccharomyces cerevisiae.
Genetics 149: 1693–1705.
Cordycepin
Coordinate Regulation
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1808
Copy-Choice Hypothesis
P J Hastings
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0279
Copy-choice is a proposed mechanism of recombination by which conservative DNA replication following one chromosome could switch templates and
copy the other. Originally proposed to explain bacterial recombination, it was developed into a general
recombination model on the basis of recombination
data from Neurospora. The proposal was that, if the
point at which template switching occurred was not
precisely reciprocal when two chromosomes were
being copied, there would be more copies of the genotype of one chromosome than of the other over a
length, thus accounting for gene conversion and for
its association with crossing-over. Demonstration of
semiconservative replication of DNA caused the
model to fall into disfavor. It also had the problem that
the mechanism predicted that, contrary to observation,
all recombination would be confined to the two new
Core Particle
A Liljas
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0280
The ribosomal core particle has been under investigation for decades and is related to the assembly and
disassembly of ribosomes. The question is how the
ribosomal RNAs (rRNA) bind ribosomal proteins to
form the functional particle. Ribosomes from most
species have more rRNA than protein. Thus the
majority of ribosomal proteins are likely to primarily
bind to the rRNA with less direct interactions
between the ribosomal proteins. Another relevant
observation is that the ribosomal RNA from which
essentially all proteins have been removed is able to
perform the so-called puromycin reaction, which is an
assay for the central ribosome function of peptide
transfer. This clearly indicates that at least part of the
rRNA has the correct functional conformation, essentially independent of ribosomal proteins. This is in
agreement with observations using electron microscopy that the ribosomal RNAs without the protein
complement have shapes related to the complete subunits. However, the rRNAs cannot possibly have
adopted their final structure, since only a limited set
of ribosomal proteins are able to bind specifically to
the rRNA alone. Thus there seems to be an ordered
pathway for ribosome assembly.
From assembly experiments of the Escherichia coli
small subunit, M. Nomura has established that the
proteins that bind to the 16S RNA without the presence
of other proteins and that form the core particle are
S4, S7, S8, S15, S17, and S20. In the large subunit of
E. coli ribosomes, the corresponding proteins are L4,
L13, L20, L22, and L24 according to the findings of
K. Nierhaus. Since most of these proteins are conserved in all types of ribosomes, they may be important for the assembly of ribosomes in all species.
The term 'core particles' also relates to the particles
formed and the proteins that remain on the ribosome
when treated with increasing concentrations of salt. In
this type of disassembly of the small subunits from
E. coli, proteins S4, S7, S8, S15, S16, and S17 remain
until there is a very high salt concentration. The discrepancy between such disassembly and the assembly
of the small subunit is that S16 is strongly bound to a
site that is dependent on proteins S4 and S17, whereas
protein S20 is more weakly bound to a site that does
not depend on the presence of other proteins. Similarly if large subunits from E. coli are incubated with
increasing concentrations of salt, proteins L2, L3, L4,
L13, L17, L20, L21, L22, and L23 together with the
23S RNA still form a compact particle. Also here there
is a good correspondence with the proteins that have
binding sites on the rRNA independent of other ribosomal proteins.
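The overlap between the two definitions of the core (proteins that bind rRNA on their own during assembly, and proteins that survive high-salt disassembly) can be checked with simple set operations. The protein lists below are taken directly from the text; the variable names are chosen here for illustration.

```python
# Small subunit (30S) of E. coli
assembly_core_30s = {"S4", "S7", "S8", "S15", "S17", "S20"}   # bind 16S RNA without other proteins
salt_core_30s = {"S4", "S7", "S8", "S15", "S16", "S17"}       # remain bound at high salt

# Large subunit (50S) of E. coli
assembly_core_50s = {"L4", "L13", "L20", "L22", "L24"}
salt_core_50s = {"L2", "L3", "L4", "L13", "L17", "L20", "L21", "L22", "L23"}

shared_30s = assembly_core_30s & salt_core_30s
assembly_only_30s = assembly_core_30s - salt_core_30s
salt_only_30s = salt_core_30s - assembly_core_30s

shared_50s = assembly_core_50s & salt_core_50s
assembly_only_50s = assembly_core_50s - salt_core_50s
salt_only_50s = salt_core_50s - assembly_core_50s

if __name__ == "__main__":
    print("30S shared:", sorted(shared_30s))
    print("30S assembly only:", sorted(assembly_only_30s), "salt only:", sorted(salt_only_30s))
    print("50S shared:", sorted(shared_50s))
    print("50S assembly only:", sorted(assembly_only_50s), "salt only:", sorted(salt_only_50s))
```

For the small subunit the two lists differ only in S20 versus S16, which is the discrepancy the text explains by the dependence of S16 on S4 and S17.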
The core particles are thus formed by proteins that
bind to the rRNA in the absence of other ribosomal
proteins. These proteins affect the folding of the
rRNA in such a way that binding sites for additional
ribosomal proteins are generated. The binding sites for
yet other proteins are generated by these proteins. It is
possible that the binding of these proteins not only
depends on the correct conformation of the rRNA
but also on direct interactions with previously bound
ribosomal proteins.
See also: Ribosomal RNA (rRNA); Ribosome
Binding Site; Ribosomes
Corepressor
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1811
A corepressor is a molecule that helps to elicit repression of transcription by binding to a regulator protein.
See also: Repressor
Correlated Response
G P Wagner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0281
$$CR_Y = \frac{\mathrm{Cov}_{PO}(XY)}{V_X}\,S_X \qquad (1)$$
where $CR_Y = \bar{Y}' - \bar{Y}$ is the difference in the character mean between the offspring generation, $\bar{Y}'$, and the parental
generation, $\bar{Y}$. As is the case for the direct selection
response, the correlated response depends on additive
genetic effects. Noting that the parent-offspring covariance $\mathrm{Cov}_{PO}(XY)$ is caused by the additive genetic
covariance between X and Y, equation (1) can be
rewritten as
$$CR_Y = r_A\,h_X\,h_Y\,i_X\,\sqrt{V_Y} \qquad (2)$$
where $r_A$ is the additive genetic correlation between
[Figure residue: labels 'Optimum', 'rA = 0', and 'rA = 0.75'; graphic and caption not preserved.]
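A short numerical sketch may help with the rewritten form of the correlated response, CR_Y = r_A h_X h_Y i_X sqrt(V_Y). The function name and all input values below are hypothetical and chosen only for illustration; h_X and h_Y are taken to be the square roots of the heritabilities, i_X the selection intensity on X, and V_Y the phenotypic variance of Y, following standard quantitative-genetic usage.

```python
import math


def correlated_response(r_a, h_x, h_y, i_x, v_y):
    """Correlated response of character Y to selection on character X.

    r_a: additive genetic correlation between X and Y
    h_x, h_y: square roots of the heritabilities of X and Y
    i_x: selection intensity applied to X
    v_y: phenotypic variance of Y
    """
    return r_a * h_x * h_y * i_x * math.sqrt(v_y)


if __name__ == "__main__":
    # Hypothetical values; only r_a is varied to show its effect.
    for r_a in (0.0, 0.75):
        cr = correlated_response(r_a, h_x=0.6, h_y=0.5, i_x=1.4, v_y=4.0)
        print(f"r_A = {r_a}: correlated response = {cr:.3f}")
```

With no additive genetic correlation the unselected character does not respond at all, whatever the heritabilities and selection intensity.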
References
Bohren BB, Hill WG and Robertson A (1966) Some observations on asymmetrical correlated responses to selection.
Genetical Research 7: 44–57.
Eisen EJ (1972) Long term selection response for 12 day litter
weight in mice. Genetics 72: 129–142.
Falconer DS and Mackay TFC (1996) Introduction to Quantitative
Genetics, 4th edn. Harlow, UK: Longman.
Futuyma DJ (1998) Evolutionary Biology. Sunderland, MA: Sinauer
Associates.
Lande R (1979) Quantitative genetic analysis of multivariate
evolution, applied to brain:body size allometry. Evolution 33:
402–416.
Reeve EC and Robertson FW (1953) Studies on quantitative
genetic inheritance. II. Analysis of a strain of Drosophila
melanogaster selected for long wings. Journal of Genetics 51:
276–316.
Roff DA (1997) Evolutionary Quantitative Genetics. New York:
Chapman & Hall.
Via S and Lande R (1985) Genotype–environment interaction
and the evolution of phenotypic plasticity. Evolution 39:
505–522.
Cosmids
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1812
Cotransformation
S A Lacks
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0283
Dispersive Mechanisms
DNA Disruption
Processing on Uptake
Integrative Recombination
Mismatch Repair
system. Depending on the mismatch, marker integration efficiencies can be reduced as much as 20-fold.
This will reduce the frequency of cotransformation
relative to high efficiency markers, on which the mismatch repair system does not act.
Genetic Mapping
The effect of dispersive processes and the limited
amount of DNA taken up per cell (usually less than
1% of the genome) limits the frequency of cotransformation. Because dispersive processes depend on
distance between markers, cotransformation frequency (ctf) can serve as a measure of linkage. Conversely, the frequency at which markers separate, that
is (1 − ctf), is a measure of distance between markers.
For example, if two markers show 20% linkage, they
might each transform 1% of the cells in a population,
but 20% of the cells transformed for one marker
(0.2% of total cells) will also carry the other marker.
If the markers are not closely linked, they will be
separated by the dispersive processes and enter cells
separately and randomly. For an individual marker
transformation frequency of 1%, the cotransformation frequency would be 0.01% of total cells (or 1%
of cells transformed for one marker would be transformed for both markers).
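The arithmetic in the two cases above (20% linkage versus independent entry) can be reproduced in a few lines. This is a minimal sketch; the function and parameter names are assumptions made for the example, while the frequencies are those quoted in the text.

```python
def doubly_transformed_fraction(single_freq, ctf=None):
    """Fraction of all cells transformed for both markers.

    single_freq: fraction of cells transformed for each marker alone.
    ctf: cotransformation frequency (fraction of cells transformed for one marker
         that also carry the other); None means the markers enter independently.
    """
    if ctf is None:
        return single_freq * single_freq
    return single_freq * ctf


if __name__ == "__main__":
    single = 0.01  # each marker transforms 1% of the cells

    linked = doubly_transformed_fraction(single, ctf=0.20)
    unlinked = doubly_transformed_fraction(single)

    print(f"linked (ctf = 20%): {linked:.4%} of total cells carry both markers")
    print(f"unlinked:           {unlinked:.4%} of total cells carry both markers")
    print(f"distance measure for the linked pair, 1 - ctf: {1 - 0.20:.2f}")
```

For unlinked markers the apparent cotransformation among singly transformed cells is just the single-marker frequency itself, which is why it falls to 1% in the example.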
Competent Population
In the above discussion of genetic mapping, it was
assumed that the entire population was competent;
that is, all cells were equally able to take up DNA
and be transformed. This is not always the case. The
proportion of a population that is competent can be
calculated by reversing the above approach and
assuming that a pair of markers enters randomly,
which will be true for most arbitrarily chosen pairs.
Measurement of the cotransformation frequency,
when divided into the square of the single marker
frequency, gives the proportion of the population
that is competent. In this way it was found that all of
the cells in populations of S. pneumoniae and Haemophilus influenzae are competent, whereas in B. subtilis
only approximately 10% are competent. The molecular
mechanism by which the competent population of
B. subtilis becomes differentiated from the noncompetent majority is not known.
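The estimate of the competent fraction described above (the square of the single-marker frequency divided by the cotransformation frequency for an unlinked pair) can be illustrated with a small forward-and-back calculation. The population values below are hypothetical, and the sketch assumes both markers transform at the same frequency and enter competent cells independently.

```python
def competent_fraction(single_freq, cotransformation_freq):
    """Estimate the competent fraction of a population from an unlinked marker pair.

    single_freq: fraction of all cells transformed for one marker
                 (assumed equal for both markers).
    cotransformation_freq: fraction of all cells transformed for both markers.
    """
    return single_freq ** 2 / cotransformation_freq


if __name__ == "__main__":
    # Hypothetical population: 10% of cells competent, and each marker
    # transforms 10% of the competent cells.
    true_competent = 0.10
    per_competent_cell = 0.10

    single = true_competent * per_competent_cell           # measured over all cells
    double = true_competent * per_competent_cell ** 2      # independent entry within competent cells

    print(f"single-marker frequency:      {single:.4f}")
    print(f"cotransformation frequency:   {double:.4f}")
    print(f"estimated competent fraction: {competent_fraction(single, double):.3f}")
```

The estimate works because the single-marker frequency scales with the competent fraction once, whereas its square scales with it twice; dividing by the observed double-transformant frequency leaves the competent fraction.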
Congression
Congression refers to the tendency of cells to be
transformed by multiple markers, even when they
are unlinked. There are two molecular bases for this
phenomenon. Both have been used in practice to
Bacillus subtilis
Counterselection
K B Low
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0284
Covarion Model of
Molecular Evolution
D Penny and M Hasegawa
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0285
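[Figure 1 panels (A)-(C): tree diagrams with edge labels t1-t4 and weights, and the substitution scheme among A, G, C, and T; graphics not preserved.]
(D)      A      G      C      T
A     0.85   0.10   0.02   0.03
G     0.10   0.85   0.03   0.02
C     0.02   0.03   0.85   0.10
T     0.03   0.02   0.10   0.85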
Figure 1 (A) The structure of the model (an unrooted evolutionary tree) plus initial conditions (weights). (B) The
model as a rooted tree; this does not affect the calculations. (C) The mechanism of Kimura's three-parameter model.
(D) A Markov transition matrix for a specific edge for a defined time.
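Reading the sixteen values of panel (D) as a 4 × 4 matrix over A, G, C, and T, one can check the properties that make it a Markov transition matrix and apply it along an edge. The sketch below is illustrative: the helper names are assumptions, and the reading of 0.10 as the transition class (A↔G, C↔T) and 0.02 and 0.03 as two transversion classes is an interpretation in terms of Kimura's three-parameter model rather than something the caption states.

```python
BASES = ("A", "G", "C", "T")

# Rows: state at the start of the edge; columns: state at the end (order A, G, C, T).
P = [
    [0.85, 0.10, 0.02, 0.03],
    [0.10, 0.85, 0.03, 0.02],
    [0.02, 0.03, 0.85, 0.10],
    [0.03, 0.02, 0.10, 0.85],
]


def is_stochastic(matrix, tol=1e-9):
    """True if every entry is non-negative and every row sums to 1."""
    return all(
        all(value >= 0 for value in row) and abs(sum(row) - 1.0) < tol
        for row in matrix
    )


def is_symmetric(matrix, tol=1e-12):
    """True if the probability of i -> j equals that of j -> i for all pairs."""
    size = len(matrix)
    return all(abs(matrix[i][j] - matrix[j][i]) < tol for i in range(size) for j in range(size))


def step(distribution, matrix):
    """Propagate a probability distribution over bases along one edge."""
    return [
        sum(distribution[i] * matrix[i][j] for i in range(len(distribution)))
        for j in range(len(matrix[0]))
    ]


if __name__ == "__main__":
    print("stochastic:", is_stochastic(P), "symmetric:", is_symmetric(P))
    start = [1.0, 0.0, 0.0, 0.0]  # the site is known to be A at the start
    after_one = step(start, P)
    after_two = step(after_one, P)
    for base, p1, p2 in zip(BASES, after_one, after_two):
        print(f"{base}: after one edge {p1:.4f}, after two identical edges {p2:.4f}")
```

Applying the same matrix twice corresponds to doubling the time along the edge, which is how such matrices are composed along a path in the tree.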
[Figure residue: panels (A) and (B) showing a matrix K with row and column labels A+, G+, C+, T+, A, G, C, T and entries a, b, g, kd, and 0 as extracted; layout and caption not preserved.]
Conclusion
On the biochemical side the covarion model is well
established as a realistic description of protein evolution through time. In addition, it appears important in
explaining how sequences allow the recovery of older
divergences during evolution. Despite its biochemical
realism, and potential importance for evolutionary
studies, it is still difficult to use the covarion model
in practice. The hidden Markov approach has potential
but still requires more evaluation. One maximum likelihood program for the hidden Markov approach has
been implemented for nucleotides (A. Rambaugh, personal communication), but additional experience with
such an approach is urgently required. The covarion
model is a good idea that still requires more research
References
CpG Islands
M Goldman
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0286
Craniosynostosis, Genetics of
A O M Wilkie
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1688
[Figure 1 labels: Front; Metopic; Coronal; Sagittal]
Classification of Craniosynostosis
Craniosynostosis affects about 1 in 2500 individuals
and is a significant medical problem. Without surgical
treatment, the consequent distortion of skull growth
may lead to altered blood flow in the brain, raised intracranial pressure, and cosmetic deformity; in more
complex cases, involvement of the facial skeleton may
cause additional problems with vision, hearing, nasal
breathing, and dental development.
Two methods of classification of craniosynostosis
are used: anatomical and etiological (i.e., by cause).
The anatomical classification identifies the fused cranial suture. There are six major sutures, comprising
single metopic and sagittal sutures and paired coronal
and lambdoid sutures (Figure 1). Single suture synostosis most commonly involves the sagittal suture
(50% of cases), followed by coronal (20%, one third
of which are bilateral), metopic (10%), and lambdoid
(3%). Multiple suture synostosis accounts for the
remainder. Alternatively an etiological classification
emphasizes the primary cause of the craniosynostosis.
The two most common causes of craniosynostosis are
restriction of fetal head movement during the pregnancy, and single gene disorders (syndromes) that
[Figure 1 labels: Lambdoid; Back; panels (A) and (B)]
Craniosynostosis Syndromes
Apert Syndrome
First described in 1906, Apert syndrome has a prevalence of 1 in 65 000. The clinical features are a distinctive facial appearance, with high forehead, prominent
eyes (caused by shallow orbits), prominent beaked
nose and underdeveloped midface, and characteristic
complex fusions of the digits (syndactyly) of the hands
and feet (Figure 2).
Crouzon Syndrome
First described in 1912, Crouzon syndrome has a prevalence of 1 in 60 000. The facial appearance is similar
to Apert syndrome but the hands and feet appear
normal. Crouzon syndrome is sometimes accompanied by the specific skin disorder, acanthosis nigricans,
characterized by pigmented, thickened, felty skin.
Pfeiffer Syndrome
First described in 1964, Pfeiffer syndrome has a prevalence of approximately 1 in 100 000. It is similar to
Crouzon syndrome, but the big toes and sometimes
Muenke Syndrome
Saethre–Chotzen Syndrome
Craniofrontonasal Syndrome
Mutations in Craniosynostosis
Mutations of four genes, FGFR1, FGFR2, FGFR3, and
TWIST, are common causes of craniosynostosis. Key
features of these genes, their corresponding proteins
and the syndromes with which they are associated are
summarized in Table 1; for completeness this includes
a rare disorder termed Beare–Stevenson syndrome.
An additional gene, MSX2, is of historic interest
because the first molecularly defined cause of craniosynostosis, described in 1993, was an MSX2 mutation
in a single family with Boston syndrome. However,
MSX2 mutations usually give rise to a different
phenotype with symmetric holes in the skull bones
(parietal foramina).
Figure 2 Clinical features of Apert syndrome. The facial appearance (left) combined with the syndactyly of hands
(right) is characteristic.
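Table 1 (only the heading "Gene" survives; bracketed headings are inferred)
Gene     [Chromosomal location]   [Protein length, amino acids]   [Year]
FGFR1    8p11.2–p11.1             822                             1994
FGFR2    10q26                    821                             1994
FGFR3    4p16.3                   806                             1994
TWIST    7p21.1                   202                             1997
MSX2     5q34–q35                 267                             1993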
Figure 3 Important mutations of FGFR1, FGFR2, and FGFR3 proteins in craniosynostosis. The FGFR
transmembrane proteins are shown traversing the cell membrane (pair of dashed lines), with the extracellular side
to the left. Open rectangles delineate the principal domains. Symbols denote specific, recurrent amino acid
substitutions or splicing mutations and the shaded rectangle shows a broader region of mutation, as indicated in the
key. The arrowhead shows the position of the hypermutable cysteine in FGFR2 and the arrows indicate the position
of equivalent proline→arginine mutations in all three receptors. Additional mutations of FGFR3 (not shown) are
important causes of short-limbed bone dysplasia.
and Pfeiffer syndromes arise exclusively from the
father (this has also been demonstrated for the
FGFR3 mutation in achondroplasia). These fathers
tend to be older than average, but the mechanism of
the excessive paternal mutations is not exactly known.
Mutations in TWIST
Further Reading
Cre/lox Transgenics
B Sauer
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0287
Cre/lox Recombination
The Cre (cyclization recombination) protein of phage
P1 catalyzes recombination at a specific site on the P1
genome called loxP (locus of X-over of phage P1), and
plays at least two roles in the biology of the phage.
First, Cre enhances the stability of the single-copy P1
plasmid replicon in E. coli by efficiently resolving
dimeric circles generated by occasional homologous
recombination between daughter molecules after plasmid replication. In the absence of dimer resolution by
Cre, partition of the single-copy dimeric DNA to
only one of the two daughter cells at cell division
results in cells that no longer carry the P1 replicon,
i.e., plasmid loss. In addition, Cre ensures the prompt
cyclization of the linear, terminally redundant virion
DNA of phage P1 after infection should the host
recombination system fail.
Cre is a 343 amino acid protein related to the Int
recombinase of phage lambda. Unlike many other Int
family members, Cre requires neither accessory proteins for activity nor any special topology of the DNA
substrate. The loxP recombination site is 34 bp in size,
consisting of two 13 bp inverted repeats flanking an
asymmetric 8 bp spacer region that imparts an overall
directionality to the site. Conservative site-specific
DNA recombination occurs within the spacer region.
Recombination between two directly repeated loxP
sites on the same DNA molecule results in precise
excision of the intervening DNA as a covalently
closed circular molecule (Figure 1). DNA inversion
occurs between oppositely oriented loxP sites.
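The architecture of the site and the two outcomes of recombination can be sketched on plain strings. The 34 bp sequence used below is the commonly quoted wild-type loxP sequence; it is not given in this entry and is included here as an assumption, as are all function names. The model is deliberately simplified: it treats each site as an indivisible unit instead of modeling strand exchange within the spacer, which gives the same product sequences for these two cases.

```python
# Assumed wild-type loxP: 13 bp arm + 8 bp asymmetric spacer + 13 bp arm (inverted repeat of the first).
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"
SITE_LEN = len(LOXP)

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def revcomp(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


LOXP_RC = revcomp(LOXP)  # the same site read in the opposite orientation


def find_sites(dna):
    """Return (position, orientation) for every loxP site, '+' or '-'."""
    sites = []
    for pos in range(len(dna) - SITE_LEN + 1):
        window = dna[pos:pos + SITE_LEN]
        if window == LOXP:
            sites.append((pos, "+"))
        elif window == LOXP_RC:
            sites.append((pos, "-"))
    return sites


def cre_recombine(dna):
    """Recombine a linear molecule carrying exactly two loxP sites.

    Directly repeated sites: excision; returns the linear product (one loxP left)
    and the excised circle written as a linear string (one loxP plus the insert).
    Oppositely oriented sites: inversion of the intervening DNA.
    """
    sites = find_sites(dna)
    if len(sites) != 2:
        raise ValueError("this sketch expects exactly two loxP sites")
    (first, orient1), (second, orient2) = sites
    if orient1 == orient2:
        linear = dna[:first] + dna[second:]
        circle = dna[first:second]
        return "excision", linear, circle
    inner_start = first + SITE_LEN
    inverted = dna[:inner_start] + revcomp(dna[inner_start:second]) + dna[second:]
    return "inversion", inverted, None


if __name__ == "__main__":
    print("site length:", SITE_LEN)
    print("arms are inverted repeats:", revcomp(LOXP[:13]) == LOXP[-13:])
    print("spacer is asymmetric:", revcomp(LOXP[13:21]) != LOXP[13:21])
    print(cre_recombine("GGGG" + LOXP + "ACGTAC" + LOXP + "CCCC")[0])
    print(cre_recombine("GGGG" + LOXP + "AAACCC" + LOXP_RC + "TTTT")[0])
```

Because the spacer is not its own reverse complement, the two orientations of the site are distinguishable, and that is what decides between excision and inversion.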
Marker Recycling
[Figure 1 diagram labels: target gene; targeting vector (loxP, neo, loxP); homologous recombination; gene KO (loxP, neo, loxP); Cre-mediated recombination; KO (neo) with a single loxP, plus the excised loxP/neo circle (not stably maintained).]
Figure 1 Cre-mediated excisive recombination of a selectable marker. Disruption/deletion of a target gene (striped
box) is achieved by homologous recombination using a targeting vector carrying the loxP2 neo selectable marker
embedded within flanking target gene homology (striped box). The neo marker in the resulting gene knockout (KO) is
removed as a covalently closed circle by Cre-mediated site-specific recombination at the loxP sites (open arrows)
flanking the neo gene. The excised circular DNA is not maintained in mammalian cells because it lacks an origin of
DNA replication and other DNA sequences required for stability.
By designing the Cre transgenic to express Cre in the
zygote or germline lineage, progeny mice are produced that have 'automatically' deleted the marker
gene from the genome.
Conditional Mutations
The binary nature of the Cre/lox system naturally
gives rise to a conditional system for regulating gene
expression. Simply put, DNA excision at loxP sites is
dependent on Cre expression. Hence, recombination
will have occurred only in those cells that express Cre
recombinase, or had expressed Cre in a progenitor
cell. By placing the cre gene under the control of a
promoter with the type of regulation desired, recombination is directed to a particular cell or time. Evaluation of the effects of gene expression in a particular
cell type or tissue among the many different ones
present in metazoans can thus be achieved by designing recombination-based genetic switches to either
Gain of Function
the lox-modified transgene, and that of Cre recombinase, and is temporally restricted to occur only after
prior Cre expression. Such recombinational strategies
render potentially embryonic lethal transgenes quiescent until activated at a desired later time by a suitably
regulated cre gene. Propagation of transgenic models
that might otherwise be impossible to maintain can
thus be achieved. Because activation of a reporter gene
by Cre-mediated recombination indelibly marks Cre-expressing cells and their descendants, Cre-based strategies are becoming increasingly important for cell
lineage analysis.
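The logic of lineage marking, in which a cell is recombined if it expresses Cre or descends from a cell that did, can be sketched as a walk over a lineage tree. The tree, the cell names, and the function name below are hypothetical and serve only to illustrate that the mark is inherited and irreversible.

```python
def mark_lineage(tree, cre_expressing, cell, inherited=False, marked=None):
    """Return {cell: True/False} saying whether the reporter has been recombined.

    tree: dict mapping each cell to a list of its daughter cells.
    cre_expressing: set of cells in which the cre gene is expressed.
    A cell is recombined if it expresses Cre itself or if any ancestor did.
    """
    if marked is None:
        marked = {}
    recombined = inherited or cell in cre_expressing
    marked[cell] = recombined
    for daughter in tree.get(cell, []):
        mark_lineage(tree, cre_expressing, daughter, recombined, marked)
    return marked


if __name__ == "__main__":
    # Hypothetical lineage: Cre is expressed only in cell "a".
    lineage = {
        "zygote": ["a", "b"],
        "a": ["a1", "a2"],
        "b": ["b1"],
    }
    result = mark_lineage(lineage, cre_expressing={"a"}, cell="zygote")
    for name, is_marked in result.items():
        print(f"{name}: {'marked' if is_marked else 'unmarked'}")
```

Descendants of the expressing cell stay marked even though they never express Cre themselves, which is what makes the recombined reporter usable as a lineage tracer.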
Loss of Function
Chromosome Rearrangements
Cre-mediated site-specific recombination is both conservative and reciprocal, proceeds either intra- or
intermolecularly, is remarkably efficient in eukaryotic
cells, and is undeterred in recombining loxP sites that
are quite far from each other (90 kb on Cre's natural
substrate, the P1 genome). It has thus become clear
that Cre might allow actual genome engineering by
being able to effect large-scale deletions, inversions,
and even chromosome translocations. Unlike spontaneous or mutagen/radiation-induced chromosome
rearrangements, Cre-mediated rearrangements can be
designed with nucleotide precision by using homologous recombination to place loxP sites exactly as
desired in the genome, and then having Cre catalyze
recombination between the loxP sites.
Translocations
Regulation by Steroids
Further Reading
Creutzfeldt-Jacob Disease (CJD)
See: GSD (Gerstmann–Straussler Disease)
Cri-du-Chat Syndrome
C Turleau
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0290
with hypertelorism and epicanthal folds, a broad base
of nose, and micrognathia. Ears are lowset or poorly
formed. Severe respiratory or feeding difficulties soon
after birth are frequent. The phenotype in infancy and
young children includes psychomotor retardation, a
high-pitched cry, microcephaly, growth retardation,
poor weight gain, a round face, hypertelorism, a
broad nasal bridge, downslanting palpebral fissures,
and micrognathia. Coordination problems are always
present. The gait is unsteady and broad-based, stooping with bent knees. With advancing age the phenotype becomes less striking and the clinical picture is
difficult to establish. The face lengthens with a poorly
angulated mandible, the nasal bridge normalizes, and
the hypertelorism and epicanthal folds attenuate.
Teeth are decayed and abnormally erupted with frequent malocclusion. Marked growth retardation
results in short stature, poor weight gain, and significant microcephaly. Hypertonia of the limbs with
strong reflexes and spastic gait may appear. Scoliosis
and premature graying of hair are observed.
Chronic medical problems in childhood include
upper respiratory tract infections, otitis media, severe
constipation, and hyperactivity. Minor anomalies such
as strabismus, deficient tears, dental malocclusion,
gastroesophageal reflux, inguinal hernia, hip dislocation, or clubfoot may be present and are amenable
to various medical and surgical interventions. Scoliosis
is relatively frequent after 8 years of age. Major malformations are rare and include mostly cardiac and
gastrointestinal tract anomalies. They are more frequent in patients with unbalanced translocations
and therefore associated chromosomal imbalances.
Mortality rates, except for those with major anomalies, are low and many of these patients survive into
adulthood.
Most patients are severely to profoundly mentally
retarded. The sitting posture is usually acquired only
after the age of 2 years and independent walking after
the age of 4. Some patients never learn to walk. Lack of
speech was considered to be a characteristic of the
syndrome; however, the cri-du-chat children who are
raised at home and who benefit from early, intensive
programs of special education are ambulatory and
can communicate either verbally or through gestural
language. Stimulation programs including forms of
communication training could prevent potential
behavioral problems, which mostly relate to the
patients' inability to express themselves.
About 85% of patients have a de novo deletion
either terminal or, less frequently, interstitial. Ring,
de novo unbalanced translocations, or mosaics with a
normal clone are sometimes observed. The size of the
deletion is variable. In 10–15% of patients the cause of
the deletion is a parental rearrangement, which is a
Cross
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0291
Crossing-Over
J R S Fincham
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0292
Crossing-over is the reciprocal exchange of corresponding segments between homologous chromosomes. It occurs as a regular event in the prophase of
the first division of meiosis, and occasionally during
mitosis. Its physical basis is seen in the form of chiasmata between homologous chromosomes at the diplotene stage of meiotic prophase. Crossovers, if they
occur between chromosomal loci marked by allelic
difference, result in genetic recombination.
The reciprocal nature of crossing-over was first
inferred from the general statistically equal frequencies
of reciprocally constituted recombinant classes among
randomized meiotic products, but it is demonstrated
more rigorously when the reciprocal recombinants can
be recovered together in the same meiotic tetrad. Tetrad analysis also provides the most direct evidence that
each crossover involves only one chromatid from each
divided chromosome, and therefore generates two
recombinants and two nonrecombinants (a 'tetratype'
tetrad). Full tetrad analysis is possible only in numerous fungi and a few algae, but similar conclusions can
be drawn from experiments with attached-X chromosomes in the fruit fly Drosophila, which amount to
half-tetrad analysis since each viable egg receives two
of the four X chromosome copies.
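The statement that a single crossover involves one chromatid from each homolog and so yields two recombinants and two nonrecombinants can be shown with a four-chromatid model. The marker names (A/a and B/b), the string representation of chromatids, and the function names are assumptions made for this sketch.

```python
def crossover(chromatids, i, j):
    """Exchange the segments distal to a crossover lying between two marked loci.

    chromatids: list of four two-letter strings (allele at locus 1, allele at locus 2).
    i, j: indices of the two nonsister chromatids involved in the exchange.
    """
    result = list(chromatids)
    result[i] = chromatids[i][0] + chromatids[j][1]
    result[j] = chromatids[j][0] + chromatids[i][1]
    return result


def count_recombinants(tetrad, parental=("AB", "ab")):
    """Number of chromatids whose marker combination is not a parental one."""
    return sum(1 for chromatid in tetrad if chromatid not in parental)


if __name__ == "__main__":
    # After replication each homolog consists of two sister chromatids.
    bivalent = ["AB", "AB", "ab", "ab"]
    # One crossover between the loci, involving chromatids 1 and 2 (nonsisters).
    tetrad = crossover(bivalent, 1, 2)
    print("tetrad:", tetrad)
    print("recombinants:", count_recombinants(tetrad), "of", len(tetrad))
```

The resulting tetrad contains all four genotypes, the two reciprocal recombinants together with one copy of each parental type, which is the tetratype pattern described above.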
Meiotic recombination can also occur by gene conversion, which is essentially the replacement of a patch
of chromosomal DNA by a corresponding sequence
from the homologous chromosome, with the donor
chromosome undergoing repair back to its original
constitution. Up to about 50% of conversion events
(but sometimes a much smaller proportion) tend to be
associated with nearby crossovers, and there is evidence, still not conclusive, that conversions and crossovers stem from the same kind of interchromatid
interaction, which always involves some local nonreciprocal transfer of DNA but only sometimes results
in a crossover.
The relative importance of crossing-over and conversion in recombination depends on the spacing of
the markers being recombined. In classical linkage
studies, with genes usually separated by some hundreds or thousands of kilobases, crossing-over is all-important, and the effects of conversion negligible.
Whereas any single crossover falling between two
marked genes will generate recombinants whatever
the distance between them, a local nonreciprocal event
will have an observable effect only if the transferred
Crossover Suppressor
J R S Fincham
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0293
Figure 1 Why segmental inversions are crossover suppressors when heterozygous. (A) Pericentric inversion: the
products of crossing-over within the inversion have duplications and deficiencies of chromosome segments and will all
be inviable, but they are free to enter the meiotic products. (B) Paracentric inversion: crossing-over within the
inversion creates crossover products that are trapped in an anaphase I bridge and fragment, and in Drosophila are
usually excluded from the egg (there is no crossing-over in the male). Symbols: V, viable; IV, inviable; L, lost.
Further Reading
Crouzon Syndrome
See: Craniosynostosis, Genetics of
Crow, James F
W F Dove
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1701
Jim Crow's early experimental work with Drosophila explored isolating mechanisms between species
and the polygenic determination of insecticide resistance. His student Yuichiro Hiraizumi discovered the
segregation distortion (SD) system of meiotic drive in
Drosophila. Further work by Crow's colleagues Larry
Sandler, Dan Hartl, Terry Lyttle, Barry Ganetzky, and
others has subsequently brought the understanding of
the SD system forward to molecular analysis.
Jim Crow's studies of mutation rate, genetic load,
and the structure of human populations began in 1956
when he worked with Newton Morton and Hermann
Muller to measure the impact of inbreeding. These
studies continued on several fronts: the analysis of
the Hutterite population with Arthur Mange (1965);
the impact of assortative mating with Joe Felsenstein
(1968); the role of recombination (1988); the theory of
genetic loads (1976); the nature of effective population
size (1984); isonymy (1980); mutation component
(1998); and, most recently, studies on the pronounced
elevation of human mutation rates with age of the
male parent (1997).
Jim Crow and his colleagues have contributed
seminal experimental investigations in Drosophila
that inform some of the issues for human populations.
A monumental experiment by Terumi Mukai indicated that the frequency of new mutations with
minor deleterious effect may be as high as 50% per
zygote. With Rayla Greenberg Temin and then with
Michael Simmons, Crow showed that, typically, the
detrimental effects of spontaneous mutations and of
EMS-induced mutations are partially dominant. Mukai
and Crow's Drosophila work and more recent estimates for mammalian species predict an enormous
mutation burden on these populations. Extinction is
avoided by eliminating deleterious alleles in groups,
as elaborated by mathematical analyses that Crow
carried out with Motoo Kimura on the efficiency of
truncation selection (1979).
These formal population genetic studies are complemented in Jim Crow's work by more molecular
insights. With his student Kimura, Crow developed
Sewall Wright's formalism of random sampling of
alleles in small populations to propose, in 1964, the
`infinite allele model.' This foreshadowed Kimura's
influential neutral theory of molecular evolution.
Crow and Kimura also collaborated on their now
classic textbook An Introduction to Population Genetics Theory (Crow and Kimura, 1970). In the same
spirit of ``gladly learn and gladly teach,'' Jim Crow
published his Genetics Notes in eight editions (Crow,
1950–1983), his Basic Concepts in Population, Evolutionary, and Quantitative Genetics (Crow, 1986), and
has edited, with William Dove, Perspectives on Genetics (Crow and Dove, 2000).
C Kado
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0295
number of investigators sought to identify the oncogenic material produced by A. tumefaciens. There
were lengthy debates on whether the bacterium itself
or a `tumor-inducing-principle' causes the crown gall
disease. Plant tissue culture studies provided evidence
that the tumor tissue remained in a transformed state
in the absence of bacteria. The transforming agent was
subsequently sought, with a number of studies directed toward the physiological and biochemical differences between the crown tumor and its surrounding
healthy tissues, and between A. tumefaciens and other
tumor-causing bacteria such as Pseudomonas savastanoi (now called Pseudomonas syringae pv. savastanoi).
Avirulent strains were found when A. tumefaciens was
cultured at 37 °C or when treated with ethidium
bromide, suggesting that an extrachromosomal element is required for virulence. In support of this
notion, A. radiobacter, a naturally occurring avirulent
relative of A. tumefaciens, was shown to be converted
to the virulent form when mixed with the virulent
strain and inoculated on plants. The direct analysis
of A. tumefaciens and A. radiobacter revealed the
presence of a large virulence-conferring plasmid,
called the Ti (for tumor-inducing) plasmid (see Ti
Plasmids). Though A. radiobacter also contained
large plasmids, it is remarkable that the early work
concluded correctly that the plasmid in A. tumefaciens
conferred virulence. Subsequent DNA hybridization
studies in the late 1970s and early 1980s confirmed the
original hypothesis that genetic elements were transferred from A. tumefaciens into the plant chromosomes. The transmission of genetic material across
kingdom boundaries by A. tumefaciens is the first
bona fide case in evolutionary biology of active horizontal gene transfer between living organisms of different kingdoms (Prokarya to Eukarya). The research
on A. tumefaciens gave rise to the modern technology
of plant genetic engineering whereby any piece of
DNA placed in the T-DNA can be transferred into
and expressed in plants.
A. tumefaciens strains that are sensitive to the antagonist. Other prophylactic strategies include maintaining clean propagation nurseries free of crown gall
diseased plants, and sanitary cultural practices. The
recent rise of genetically engineered crop technology
has opened the way for developing crown gall resistant lines of fruit and nut trees, including grapevines
and canes.
Cruciform DNA
D M J Lilley
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0296
[Figure labels: Extrusion; Cruciform structure; Extended, square structure]
Branch Migration
When a junction is formed by strand exchange between
two homologous duplexes, it can undergo a sequential
exchange of base-pairing in which the branchpoint
becomes effectively displaced along the DNA sequence. This is termed `branch migration.' When the
junction is folded into the stacked X-structure in
the presence of divalent metal ions, this process is
relatively slow, with a rate of a few steps per second.
Thus the process requires protein-mediated acceleration inside the cell.
Figure 2 Ion-dependent folding of the four-way DNA junction into the stacked X-structure. Labels: M2+; stacked X-structure.
Further Reading
Murchie AIH and Lilley DMJ (1992) Supercoiled DNA and cruciform structures. Methods in Enzymology 211: 158–180.
Lilley DMJ (2000) Structures of helical junctions in nucleic acids.
Quarterly Reviews of Biophysics 33: 109–159.
White MF, Giraud-Panis M-JE, Pohler JRG and Lilley DMJ (1997)
Recognition and manipulation of branched DNA structure
by junction-resolving enzymes. Journal of Molecular Biology
269: 647–664.
Cryptic Satellite
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1814
The question of cryptic splice site activation can
be distilled to one of context. The outcome of a mutation
within a normal splice site may depend on the proximity and strength of a nearby cryptic site. Since most
disease genes are well characterized, potential cryptic
splice sites could be identified within their nucleotide
sequence. This approach could flag the risk of cryptic
splicing in advance. In summary, the clinical
interpretation of genetic variants that could impact
splicing in disease genes is problematic. Definitive
genetic and biochemical approaches are beyond the
scope of clinical laboratories, while attempting to
model the effects of mutations for all but the most
conserved bases within the normal splice sites is
uncertain.
See also: Alternative Splicing; Eukaryotic Genes;
Pre-mRNA Splicing
ct DNA
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1815
CTP (Cytidine Triphosphate)
Cutis laxa
F M Pope
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1747
Cutis laxa (CL) is a clinical term for overstretched, inelastic skin that forms loose folds,
especially of the neck, face, and flexures. Although
common in old age, it is distinctly abnormal earlier in life,
especially when generalized or widespread rather
than localized. For a
long time, CL was confused with cutis hyperelastica,
which characterizes most Ehlers–Danlos syndrome
(EDS) variants (Pope, 1993). Early cases of EDS
were often described as showing `cutis laxa,' as was
the original description of EDS by Danlos himself.
Here, in contrast to the bloodhound-like jowly
melted-wax appearance and loss of elasticity of true
cutis laxa, the skin is overextensible. After being stretched or otherwise deformed, it immediately snaps back
to normal. Rather confusingly, some EDS subtypes
also show genuine CL, either very early, as in EDS
types VIIa–c, or very much later as a late complication
after middle age in some EDS I/II variants.
CL is classified into three subsets: primary CL, of
which there are several variants; secondary CL, in
which the lax skin complicates other inherited defects
of connective tissue; and acquired CL in which disorders of systems other than primary connective tissue
components induce obvious cutaneous laxity and
redundancy (Pope, 1993, 1995).
E J Murgola
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0298
depletion or less frequently proliferation. Mutations
of the elastin gene have now been demonstrated in
both variants (Zhang et al., 1997).
Here the end result is the same, i.e., there is true laxity
of the skin, but the cause is an incidental systemic
disease that happens to infiltrate dermal connective
tissue. A good example is amyloid disease, either as a
primary disorder or occurring secondary to multiple
myeloma. In hereditary neuropathic amyloidosis of
the Finnish type, the CL is predominantly facial.
Similar generalized elastolysis can also occasionally
complicate urticaria, generalized eczema or Sweet
syndrome (Pope, 1993, 1995). None of these have
anything in common pathogenetically except for a
general predisposition to affect the skin. Idiopathic
generalized elastolysis occurs in the absence of any
of the listed secondary causes. Unlike primary cutis
laxa, it is of adult onset, from the third to the sixth
decades. Characteristic changes include esophageal
diverticula, esophageal or inguinal hernias, severe
generalized emphysema, colonic diverticula, and
progressive joint laxity. Pulmonary hypertension or
aortic dilatation and rupture have been documented in
various patients. Whether this is sometimes a late-onset allelic variant of primary CL is unknown.
Blepharochalasis
Although strictly speaking confined to the orbits, eyelids, and eyebrows, blepharochalasis often accompanies generalized CL. It can also segregate as specific
autosomal dominant traits, with or without lip involvement (OMIM 109900 and 11000).
Further Reading
Busby S and Kolb A (1996) The CAP modulon. In: Lin ECC and
Lynch S (eds) Regulation of Gene Expression in Escherichia coli,
pp. 255–279. Austin, TX: RG Landes.
Cyclin-Dependent Kinases
M E M Noble and J A Endicott
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1561
Discovery
The events of mitosis that generate two daughter cells
are readily visible with the light microscope. Apart
from growth, there is little apparent cellular activity
in the gap phases between successive rounds of cell
division. DNA is replicated in a discrete period
between successive rounds of mitosis. This phase is
called S (for synthesis). The preceding phase is termed
G1 and the gap before the next round of mitosis is
termed G2. Within G1, growth and the synthesis of
material necessary for DNA replication occur, while
in G2 the cell prepares for mitosis (M-phase), which
may involve further growth. Cell fusion experiments,
initially carried out with unicellular protozoa and
subsequently refined by Rao and Johnson with studies
on cultured human HeLa cell lines, suggested the
Function
In the early synchronous divisions of frog embryos,
MPF activity oscillates without reference to other cell
cycle events such as the successful completion of
DNA replication. This observation suggests the existence of a `cell cycle engine,' which drives the cell
through consecutive rounds of division by means of
periodic activation and inactivation of MPF.
Somatic cell fusion experiments, together with
genetic experiments on yeast, provide a rather different view of the role of CDK activity in regulating the
cell cycle: one in which CDK-driven progression
depends on the successful completion of earlier cell
cycle events. Certain mutations in the S. pombe cdc2
and cdc13 genes allow inappropriate cell cycle progression, a phenotype that Hartwell and Weinert
called `relief of dependence.' From this observation
they argued that control mechanisms, termed `checkpoints,' must exist. These checkpoints are composed
of a surveillance system that detects when a particular
cell cycle event has not been correctly executed, and a
signal transduction pathway whose ultimate target can
be a CDK. The cell cycle of dividing cells has two major
points of commitment at which CDK/cyclin pairs are
active in determining cell fate. These are the G1/S and
G2/M transitions, passage through which results in
DNA replication and mitosis, respectively. If essential
molecular events have not been successfully completed, cells will arrest at these transitions as a result
of inhibition of CDK activity. For example, cells will
arrest at the G2/M boundary if damaged DNA or
incompletely replicated chromosomes are present.
The multiplicity of roles played by CDKs in timing
and coordinating cell division led David Morgan
to describe them as ``engines, clocks and microprocessors.''
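The checkpoint logic described above can be pictured as a small state machine. The following is only a toy sketch, not a biochemical model: the class name, the flag names, and in particular the `g1_events_complete` flag are hypothetical placeholders, since the text specifies concrete arrest conditions only for the G2/M boundary (damaged DNA or incompletely replicated chromosomes).

```python
from dataclasses import dataclass


@dataclass
class Surveillance:
    """Hypothetical flags reported by a checkpoint surveillance system."""
    g1_events_complete: bool = True    # placeholder for whatever G1 must finish
    dna_damaged: bool = False
    replication_complete: bool = True


def cdk_active(transition: str, s: Surveillance) -> bool:
    """True if the CDK driving this transition escapes checkpoint inhibition."""
    if transition == "G1/S":
        return s.g1_events_complete
    if transition == "G2/M":
        return (not s.dna_damaged) and s.replication_complete
    raise ValueError(f"unknown transition: {transition}")


def advance(phase: str, s: Surveillance) -> str:
    """Move one step through G1 -> S -> G2 -> M -> G1, arresting at a failed checkpoint."""
    if phase == "G1":
        return "S" if cdk_active("G1/S", s) else "G1"
    if phase == "S":
        return "G2"
    if phase == "G2":
        return "M" if cdk_active("G2/M", s) else "G2"
    if phase == "M":
        return "G1"
    raise ValueError(f"unknown phase: {phase}")


print(advance("G2", Surveillance()))                  # healthy cell enters mitosis
print(advance("G2", Surveillance(dna_damaged=True)))  # damaged cell stays in G2
```

In this sketch the two commitment points are the only places where progression can stall, which mirrors the statement that cells arrest at the G1/S and G2/M transitions when CDK activity is inhibited.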
Figure 1 The structure of CDK2 in complex with cyclin A. The structures of CDK2 (white, left) and cyclin A (grey,
right) are shown in ribbon representation. The CDK-specific insert that mediates interactions with CKS proteins and
KAP follows the G-helix. The sites where protein–protein interactions direct substrate- and inhibitor-binding have
been determined by peptide-binding studies. Labeled features: glycine loop, recruitment motif, substrate, G-helix, Thr160.
CDKs are only fully active following phosphorylation of a conserved threonine residue within the
Pho85
CDKs in Metazoans
CDKs 1 and 2
CDKs 4 and 6
Figure 2 Key regulatory events in the higher eukaryotic cell cycle. CDKs are labeled by their number (1, 2, 4/6, 7)
and cyclins are identified with a single letter (A, B, D, E, H). Light grey arrows denote phosphorylation. Dark bars
denote inhibition of CDK activity. Other labels: growth factors; p15, p16, p18, p19; p21, p27, p57; pRB; DP1/E2F; phases G1, S, G2.
CDK7
CDK5
CDK Regulation
The paramount role of CDKs in determining the
phases of the cell cycle means that they themselves
have to be tightly regulated and responsive to a variety
of inputs. Monomeric CDK molecules possess no
detectable protein kinase activity. A structure of
monomeric CDK2 determined by Sung-Ho Kim and
his coworkers provides an explanation for this inactivity, since key elements of the substrate recognition
and phosphotransfer apparatus are inappropriately
arranged. The molecular mechanisms that are used to
regulate CDK activity are diverse, but can be characterized as depending on either reversible phosphorylation, or reversible association with a regulatory
partner. These two phenomena affect in turn the ability of the CDK to select an appropriate substrate, or to
adopt an active conformation. The surface of a CDK
presents multiple sites through which it interacts with
substrates and regulators (Figure 3).
Cyclin binding
Assembly factors
Figure 3 CDK–protein interaction sites. The determination of the structures of a number of complexes
containing either CDK2 or CDK6 has identified the sites
of interaction of members of the CDK family with their
key regulators. The CDK subunit is shown in ribbon
representation looking onto the N-terminal domain.
Each regulatory protein-binding surface is highlighted by
a curved line. p27Kip1 binding extends round on to the
cyclin subunit. Labeled sites: p27, cyclin, KAP, Suc1, substrate.
Phosphorylation
Localization
In eukaryotic cells, the activity of CDKs can be inhibited by the binding of proteinaceous CDK inhibitors
(CKIs). In Sc. pombe, rum1 was identified as a gene
without which mitosis becomes uncoupled from
DNA replication. This gene was found to encode a
CKI of 25 kDa, which is an important regulator of
G1 progression. The slightly larger protein Sic1 is
the functional homolog of Rum1 in Sa. cerevisiae.
Sa. cerevisiae uses the CDK inhibitor Far1 to induce
G1 cell cycle arrest in response to mating pheromones.
When phosphorylated by Fus3, a MAP kinase, Far1
inhibits three Cdc28/Cln complexes. Two major
classes of proteins inhibit CDK activity in higher
eukaryotes (Figure 2). These are the INK4 (inhibitors
of CDK4) family, which specifically inhibits CDK4
and CDK6, and the more promiscuous Cip/Kip
family, which also inhibits cyclin A and cyclin
Phosphorylation
Protein degradation
Cysteine
E J Murgola
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0304
Figure 1
mutations, one from each of their parents who themselves are free of any CF symptoms. Carrier parents
have a 1 in 4 chance of having a CF child at each
pregnancy.
CF is most prevalent in the Caucasian populations
of Northern European ancestry, at a frequency of about
1 in 2500 live births (and a carrier frequency of 1 in 25),
but is relatively infrequent among people of Asian or
African descent. If untreated, affected children usually
die at an early age because of severe lung infection
and malnutrition but, as a result of advances in clinical
management, the lifespan of patients has increased
markedly and many of them now live to adulthood.
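The incidence, carrier frequency, and recurrence risk quoted above fit together arithmetically. As a rough check, the sketch below assumes Hardy-Weinberg proportions (random mating, all disease alleles pooled into one class), an assumption the text does not state explicitly; the function names are illustrative only.

```python
from math import sqrt


def carrier_frequency(incidence: float) -> float:
    """Heterozygote frequency 2pq implied by a recessive disease incidence q^2,
    assuming Hardy-Weinberg proportions."""
    q = sqrt(incidence)   # pooled frequency of all disease alleles
    p = 1.0 - q           # frequency of the normal allele
    return 2.0 * p * q


def affected_child_risk(parent1_carrier: bool, parent2_carrier: bool) -> float:
    """Risk per pregnancy of an affected child for two unaffected parents."""
    transmit1 = 0.5 if parent1_carrier else 0.0  # chance parent 1 passes a mutant allele
    transmit2 = 0.5 if parent2_carrier else 0.0  # chance parent 2 passes a mutant allele
    return transmit1 * transmit2


incidence = 1 / 2500
carriers = carrier_frequency(incidence)
print(f"carrier frequency: about 1 in {1 / carriers:.1f}")
print(f"risk for two carrier parents: {affected_child_risk(True, True)}")
```

An incidence of 1 in 2500 gives an allele frequency of 0.02 and a carrier frequency of 0.0392, close to the quoted 1 in 25, and two carrier parents have a risk of 0.25 per pregnancy.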
The basic defect in CF resides primarily in the
secretory epithelia; the transport of water, electrolytes, and other solutes across the cellular membranes
is defective, due to the absence or deficiency of a
chloride ion channel. The underlying mechanisms
have only been uncovered in detail since the isolation
of the cystic fibrosis gene in 1989.
Cysteine. (Structure labels: CH2, SH.)
CF Gene
Cystic Fibrosis
L-C Tsui
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0305
CF Gene Mutations
A single mutation named ΔF508 accounts for 70% of
the mutant CFTR genes in the world; it corresponds
Genotype–Phenotype Correlation
The genotype of a CF patient refers to the description
of the CFTR mutation(s) at the DNA level. About half
of the CF patients worldwide carry two ΔF508 mutations (one from each parent) and 40% have one ΔF508
and another mutation affecting a different part of the
CFTR gene. The remaining patients have other CFTR
gene mutation combinations. The ability to associate
disease severity with the CFTR genotype can improve
patient management and treatment. Indeed, a strong
association between CF pancreatic enzyme status and
genotype can be established; patients with sufficient
(residual) pancreatic function are found to have one or
two mutations of class IV or V, which appear to confer
residual CFTR activity. Therefore, CF patients who
have two copies of ΔF508 (homozygotes) are expected
to be pancreatic insufficient and require dietary supplements of pancreatic enzymes. Proper nutrition is
important for management of CF patients. There is a
general correlation between pancreatic sufficiency and
overall mild disease.
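The genotype proportions quoted in this section (about half of patients with two ΔF508 alleles, about 40% with one) are roughly what one would expect if ΔF508 makes up 70% of mutant CFTR alleles and mutant alleles pair at random in patients. That random-pairing assumption is ours, not the text's, and the function below is a hypothetical illustration of the arithmetic.

```python
def genotype_fractions(df508_share: float) -> dict:
    """Expected patient genotype fractions if mutant alleles combine at random.

    df508_share is the fraction of all mutant CFTR alleles that are deltaF508.
    """
    x = df508_share
    return {
        "two_dF508": x * x,              # deltaF508 homozygotes
        "one_dF508": 2 * x * (1 - x),    # deltaF508 plus another mutation
        "no_dF508": (1 - x) ** 2,        # two other mutations
    }


fractions = genotype_fractions(0.70)
for genotype, fraction in fractions.items():
    print(f"{genotype}: {fraction:.2f}")
```

With a 70% allele share the sketch gives 0.49, 0.42, and 0.09, in line with the "about half" and "40%" figures in the text.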
Unfortunately, there is no direct correlation between CFTR genotype and the other CF symptoms.
Atypical Diseases
Mutations in the CFTR gene have also been found in
congenital bilateral absence of vas deferens (CBAVD),
obstructive azoospermia, idiopathic pancreatitis, chronic obstructive pulmonary disease, diffuse bronchiectasis, allergic bronchopulmonary aspergillosis, chronic
pseudomonas bronchitis, neonatal hypertrypsinemia,
and asthma, all of which constitute a subset of the CF
phenotypes. Although not all of the patients with the
above CF-related diseases have CFTR mutations, the
frequencies found are much higher than those
expected for CF carriers in the population. For example, 80% of patients with CBAVD have one or two
typical mutations found in CF and 10% of patients
with asthma have a CFTR mutation.
A number of atypical mutations are also found in
the CF-related diseases. Many of them represent
DNA sequence alterations that are relatively frequent
in the population and are found in normal control
individuals. These atypical mutations are presumably
more susceptible to modifier gene effects, resulting in
varied outcome in different patients. Therefore, these
mutations present major challenges for genetic counseling as well as for ethical and legal guidelines.
Cytogenetics
M A Ferguson-Smith
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0306
Cytokinesis
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1816
Cytoplasm
J H Miller
Cytoplasmic Genes
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.2106
Cytoplasmic Inheritance
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1817
Cytoplasmic inheritance refers to a property of extranuclear genes, e.g., those located in mitochondria or
chloroplasts.
See also: Chloroplasts, Genetics of; Mitochondrial
Inheritance; Mitochondria, Genetics of
Cytosine
J Parker
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0309
Cytoskeleton
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1819
The cytoskeleton comprises the internal components of animal cells that confer structural strength
and motility; these components are predominantly
microfilaments (of actin), microtubules (of tubulin),
and intermediate filaments.
See also: Cell Cycle
Cytosol
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1820
D
D'Herelle, Felix
W C Summers
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1713
Cyril Dean Darlington (1903–1981) made major discoveries about the chromosome theory of heredity,
unifying biology through the fundamental principles
of evolution, cytology, genetics, and biochemistry,
and was one of the influential figures of biology of
the twentieth century. Building on diverse work from
animals and plants, he was the first to describe clearly
the nature of the partitioning and segregation of chromosomes at mitosis, and the recombination events between chromatids that occur at meiosis. He advanced
the dictum that looking at chromosomes was another
way of looking at genes, a view that profoundly influenced fundamental biological thought. His work was
marked by the ability to synthesize technical excellence and intellectual insight into the behavior of his
experimental materials and the meaning of the structures he visualized by microscopy.
Brought up in London, he graduated with a BSc
in agriculture from the college that became Wye
Further Reading
References
Darlington CD (1932) Recent Advances in Cytology. Edinburgh:
Churchill Livingstone.
Darlington CD (1939) The Evolution of Genetic Systems. Cambridge: Cambridge University Press.
Darlington CD and LaCour L (1942) The Handling of Chromosomes. London: Allen & Unwin.
Morgan TH, Sturtevant AH and Bridges CB (1919) The Physical
Basis of Heredity. New York: Holt Rinehart & Winston.
Darwin, Charles
S M Case
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0311
Darwin brought the first volume of Lyell's book Principles of Geology with him on the Beagle and had later
volumes sent to him during the voyage. Lyell proposed that natural processes working now are also
responsible for past geological change (uniformitarianism). This philosophical position was in opposition
to catastrophism which held that history was punctuated by unpredictable, sudden events that changed its
course. As Browne (1995, p. 186) states, ``Lyell's book
taught Darwin how to think about nature.'' Darwin
applied Lyell's principles in his geological studies and
found they helped him analyze his observations and
test his conclusions. Darwin adopted Lyell's concept
of slow, gradual change of the earth and extended it
to include the organisms living on it.
Databases, Sequenced Genomes
D Ussery
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0313
Table 1 (columns: Type; Name of database; URL)
Sanger Centre
    http://www.sanger.ac.uk/Projects/
TIGR
    http://www.tigr.org/tdb/mdb/mdbcomplete.html
TIGR Latest Update for Unfinished Microbial Genome Data
    http://www.tigr.org/cgi-bin/BlastSearch/ReleaseDate.cgi
TIGR's ``ongoing projects''
    http://www.tigr.org/tdb/mdb/mdbinprogress.html
University of Oklahoma's Advanced Center for Genome Technology
    http://www.genome.ou.edu/
Washington University in St Louis Genome Sequencing Center
    http://genome.wustl.edu/gsc/C_elegans/navcelegans.pl
3. Lists of links about sequenced genomes
NCBI list of bacterial genomes that are complete but not published
    http://www.ncbi.nlm.nih.gov/Entrez/Genome/org.html
NCBI list of completed and ongoing projects
    http://www.ncbi.nlm.nih.gov/PMGifs/Genomes/bact.html
Blast NCBI genomes
    http://www.ncbi.nlm.nih.gov/Microb_blast/unfinishedgenome.html#GENOMES
Complete genomes in KEGG (Kyoto Encyclopedia of Genes and Genomes)
    http://www.genome.ad.jp/kegg/catalog/org_list.html
GOLD Genomes OnLine Database
    http://216.190.101.28/GOLD/completegenomes.html
GOLD ``Ongoing'' Genomes OnLine Database
    http://216.190.101.28/GOLD/prokaryagenomes.html
Infobiogen list of complete genomes
    http://www.infobiogen.fr/doc/data/complete_genome.html
Infobiogen list of incomplete genomes
    http://www.infobiogen.fr/doc/data/uncomplete_genome.html
The Enhanced Microbial Genomes Library
    http://pbil.univ-lyon1.fr/emglib/emglib.html
NIH (National Institute of Allergy and Infectious Diseases) supported projects
    http://www.niaid.nih.gov/dmid/genomes/genome.htm
Department of Energy (DOE) funded microbial genomes, completed and ongoing projects
    http://www.er.doe.gov/production/ober/EPR/mig_cont.html
4. Lists of genome analysis web pages
Summary
Hundreds of genome databases are available; key
web pages are shown in Table 1. Many of these
allow BLAST searches against both the published genomes and the ``current ongoing'' genome
projects. DNA Structural Atlases are a way of viewing
whole genomes in terms of DNA structure, and are
useful for finding regions of unusual DNA structure.
The number of sequenced genomes will soon reach
more than a hundred. Genome databases are necessary
to track and better utilize this information.
Further Reading
Baxevanis AD (2000) The Molecular Biology Database Collection: an online compilation of relevant database resources.
Nucleic Acids Research 28: 1–7. (Note. Nucleic Acids Research
traditionally devotes the first issue in January to sequence
databases.)
Figure 1 (See Plate 9) DNA ``Genome Atlas'' for Plasmodium falciparum chromosome 3 (1,060,106 bp). Panels: (A) intrinsic curvature; (B) stacking energy; (C) position preference; (D) annotations (CDS, rRNA, tRNA); (E) Watson repeats; (F) Crick repeats; (G) GC skew; (H) percent AT. The different colored lines are as described in the text and at our website (http://www.cbs.dtu.dk/services/GenomeAtlas/). (Note: this figure can also be seen at the following URL: http://www.cbs.dtu.dk/services/GenomeAtlas/Pfalciparum/pfal_3.genomeatlas.lin.htm1.)
De Vries, Hugo
G S Stent
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0328
Although the laws governing the segregation of hereditary traits in hybrid crosses of cultivated plants were
first published by Gregor Mendel in the 1860s, they
remained virtually unknown until they were rediscovered 40 years later by three men working independently: Carl Correns, of Germany, Erich von
Tschermak, of Austria, and Hugo de Vries (1848–1935),
of the Netherlands. Of this trio, de Vries is the
one considered to have made the greatest contribution
to laying the foundations for the discipline that (following a proposal made by William Bateson in 1906)
came to be known as `genetics.'
Hugo de Vries was born in Haarlem in 1848. After
graduating from the University of Leiden in 1870, he
taught at a school in Amsterdam for four years before
studying plant physiology at the German University
of Halle, where he was awarded a doctoral degree in
1877. On his return to Holland, de Vries was appointed lecturer in plant physiology at the University
of Amsterdam, where, rising rapidly through the
academic ranks, he became a full professor in 1881.
His early research was concerned with plant respiration and osmosis, and he did not begin his studies of
hereditary variation in plants until 1880. De Vries
remained on the Amsterdam faculty until 1918, when
he retired to Lunteren, a small town 30 miles to the
south-east of Amsterdam, where he died in 1935.
De Vries had been troubled that Darwin's theory of
``descent with modification'' lacked a plausible explanation for the source of the variations in hereditary
traits on which natural selection was supposed to act.
He was not satisfied with Darwin's `pangenesis' theory of heredity, according to which the number and
relative proportion of diverse `gemmules' present in a
creature's body determine its characteristic traits.
Thus, under pangenesis the phenomenon of heredity
would be attributable to the transmission from parent
Deficiency
See: Deletion Mutation
Degenerate Code
J Parker
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0315
Delbruck, Max
W C Summers
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0316
Max Delbruck (1906–1981), a German-born, American molecular biologist, was born on 4 September
1906 in Berlin. His family, originally from Halle,
had been active in German political and intellectual
affairs for several generations. He was a son of Hans
Delbruck, professor of history in Berlin, and a great-grandson of the famous German chemist Justus
Liebig. After starting to study astronomy in Tubingen
in 1924, young Delbruck moved to Gottingen where
he eventually abandoned astrophysics for quantum
mechanics and received a PhD degree in theoretical
physics in 1930 under Max Born. He carried out
advanced work in physics in Bristol, England, Switzerland, Berlin, and Copenhagen. While in the laboratory
of Niels Bohr in Copenhagen, he developed his interests in biological problems with the hope that new
principles of physics and chemistry might be revealed
by the study of life processes. In Berlin, Delbruck collaborated with Nicolai Timofeeff-Ressovsky, a Russian
geneticist, and Karl Zimmer, a German radiobiologist
on a celebrated paper examining the physical nature of
the gene (mainly with reference to Drosophila) by
radiobiological approaches. Delbruck became convinced that genetics, and the stable nature of the gene
and its incredibly accurate replication, were aspects of
biology most likely to provide the paradoxes that
might reveal the new principles he was seeking.
In late 1937 Delbruck moved to the California
Institute of Technology to work in collaboration with
Thomas Hunt Morgan on Drosophila genes. However, soon after he arrived at Caltech, he met Emory
Ellis, a research associate who was already developing
bacteriophage as a model system to study the basic
biology of viruses. Ellis had characterized the basic
process of bacteriophage growth, and had confirmed
the one-step growth patterns described a decade earlier by Felix d'Herelle. Delbruck was immediately
struck by the usefulness of bacteriophage as a tool to
study heredity, and he abandoned his plans for Drosophila and joined Ellis's laboratory to work on phage.
With the onset of World War II, Delbruck left Caltech
to take up a position in the physics department at
Vanderbilt University where he taught physics and
continued his research on phage multiplication. Ellis
went into military-related research and had a distinguished career in rocketry.
In 1941 Delbruck met Salvador Luria and they began
a life-long collaboration and friendship. In an ongoing
research program on bacteriophage, they, along with
Alfred Hershey, initiated the research school now
known as the ``American Phage Group.'' In 1947
Delbruck moved back to Caltech, where, except for
a brief hiatus in Cologne from 1961 to 1963, he was on
the faculty until his death in 1981. His research
focused on the multiplication and genetics of bacteriophages, and later on the phototropism of the mold
Phycomyces blakesleeanus. Among his many honors,
Delbruck received a Nobel Prize in Physiology or
Medicine in 1969, which he shared with Luria and
Hershey.
Delbruck's published work does not represent his
full impact on the field of genetics and molecular
biology. His own research papers are highly focused
on specific problems, and while they are models of
clarity and care, they do not deal with ground-breaking
issues of the time. In his collaborations with others
and in his scientific leadership and intellectual guidance, he was much more influential. Several examples
suffice to indicate this influence.
An old problem in phage biology, that of the appearance of phage-resistant bacteria, interested Luria.
He and Delbruck devised a way to test whether the phage-resistant bacteria were produced spontaneously and
subsequently grew out under selective conditions, or
conversely, if the phage somehow induced the phage
resistance to appear. Their approach was both sound
and elegant, but indirect, relying, as it did, on probabilistic arguments similar to those they had often
used in their radiobiological target theory work. This
experimental approach, which came to be known as
the Luria-Delbruck experiment, has been widely
Deletion
N Grindley
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0317
Deletion Mapping
I Schildkraut
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0318
markers fall within the same region of DNA. Typically a genetic cross is set up between a recipient
and a donor where one of the genetic elements is
known to harbor a deletion. A point mutation that
occurs within the bounds of a deletion cannot lead to a
wild-type allele, and therefore function will not be
restored through a recombinational event. Deletion
mapping will not determine the relative order of
mutations but rather answers the question of whether
recombination between the two elements leads to
restored function. If the genetic cross does restore
function, then the point mutation does not lie within the
genetic element defined by the deletion mutation. If
the cross does not restore function, then the point
mutation does lie within the deletion.
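The inference rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the integer map coordinates, the deletion names `delA` and `delB`, and the function `candidate_positions` are hypothetical and do not come from the text. Each cross either restores function (the mutation lies outside that deletion) or fails to (the mutation lies inside it), and combining several crosses narrows the candidate interval.

```python
def candidate_positions(map_length, deletions, restored):
    """Return map positions compatible with every cross result.

    deletions: hypothetical name -> (start, end) half-open interval removed
    restored:  name -> True if the cross with that deletion restored function
    """
    positions = set(range(map_length))
    for name, (start, end) in deletions.items():
        deleted = set(range(start, end))
        if restored[name]:
            # Wild-type recombinants recovered: mutation lies outside the deletion.
            positions -= deleted
        else:
            # No restoration: mutation lies within the deletion.
            positions &= deleted
    return sorted(positions)


# Hypothetical example: two overlapping deletions on a 15-unit map.
deletions = {"delA": (2, 8), "delB": (5, 12)}
restored = {"delA": False, "delB": True}
print(candidate_positions(15, deletions, restored))
```

With these made-up inputs the mutation is placed in the part of `delA` that `delB` does not remove, which is the kind of interval assignment deletion mapping provides; the order of point mutations inside that interval stays undetermined.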
See also: Deletion Mapping, Mouse;
Gene Mapping
Basic Information
Deletions (also known as deficiencies) are aberrations
in which intervals of varying length are missing from a
chromosome, resulting in segmental monosomy for
the affected region. Deletions are typically employed
for fine-structure mapping of genetic loci within small
chromosomal regions, and the principles of deletion
mapping in the mouse are really no different from
those applied to other experimental organisms, such
as Drosophila melanogaster. Deletions are useful for
mapping recessive mutations defined by phenotype, as
well as codominant genetic markers defined by biochemical or molecular methods, but they are not typically employed for mapping of mutations specifying
dominant phenotypes. Chromosomal deletions occur
spontaneously at a low frequency, or are induced by
treatment of germ cells (most efficiently, mature or
maturing oocytes in the female, and postmeiotic
spermatogenic cells in the male) with chromosome-breaking agents, such as acute radiation or certain
chemicals. The first suggestion that intrachromosomal
deletions occur in mice came in 1958 from W. L. Russell
and coworkers: in the first-generation progeny of
irradiated males, they found simultaneous induction
of mutations at the tightly linked (0.16 cM) dilute
(d; now called Myo5a) and short-ear (se; now called
Bmp5) loci. That is, they recovered animals that were
both dilute and short-eared among the progeny of
Cross                 Progeny: wild-type    Progeny: abnormal (r)
Del(m)a/+ x r/r       ~50%                  ~50%
Del(m)b/+ x r/r       100%
[Deletion map: open boxes showing the extents of Del(m)b and Del(m)a]
Figure 1 A pseudodominance test mapping a hypothetical r locus between two deletion breakpoints.
Del(m)a and Del(m)b represent two independent deletions of the m locus. Mice homozygous for mutant
alleles at the r locus (r/r) exhibit an abnormal phenotype.
The open boxes below the simple chromosome map
represent the extent of each deletion. The arrowheads
designate the interval on the deletion map containing
the r locus, based on the progeny data given in the Table.
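The progeny proportions used in this pseudodominance test follow from simple gamete enumeration. The sketch below assumes a cross of a deletion heterozygote (Del/+) to an r/r homozygote, equal transmission of both chromosomes, and full viability of all progeny; the function name `expected_progeny` is illustrative only.

```python
from fractions import Fraction


def expected_progeny(deletion_removes_r: bool) -> dict:
    """Expected phenotype classes from Del/+ x r/r (assumed cross)."""
    deletion_parent_gametes = ["Del", "+"]  # each assumed transmitted half the time
    share = Fraction(1, len(deletion_parent_gametes))
    progeny = {"wild-type": Fraction(0), "abnormal": Fraction(0)}
    for gamete in deletion_parent_gametes:
        # Del/r progeny are hemizygous for r only if the deletion removes the r locus,
        # in which case the recessive phenotype is uncovered (pseudodominance).
        hemizygous_r = gamete == "Del" and deletion_removes_r
        progeny["abnormal" if hemizygous_r else "wild-type"] += share
    return progeny


for name, removes_r in [("Del(m)a", True), ("Del(m)b", False)]:
    print(name, expected_progeny(removes_r))
```

A deletion that yields about half abnormal progeny includes the r locus, and one that yields only wild-type progeny does not, so r maps to the segment removed by the first deletion but retained by the second.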
[Figure 2 image: an electrophoretic gel (origin, slow and fast Enz bands; lanes 1-9 for each of two crosses) above a deletion map showing Del(m)b, Del(m)a, l1, and l2]
Figure 2 Deletion mapping of codominant genetic markers. Enz^s and Enz^f represent alleles at a locus, Enz,
specifying slow and fast electrophoretic forms, respectively, of a hypothetical enzyme. The large box represents an
electrophoretic gel; the lines represent the loaded wells (the origin); and the dark boxes represent the protein (or
DNA) bands. Two crosses of deletion heterozygotes to Enz^f homozygotes are indicated. For each cross, lanes 1 and 2
represent the deletion parent and the Enz^f/Enz^f parent, respectively; the other seven lanes represent their progeny.
Note that individuals 4, 5, 7, and 9 in the right part of the gel do not receive an Enz^s allele from the deletion parent,
indicating that the Enz locus is included in the Del(m)b deletion. In the deletion map at the bottom of the figure, the
open boxes represent the extents of each deletion with respect to the r locus (from Figure 1) and the Enz locus,
mapped here. This same strategy can be used to map loci defined by DNA polymorphisms (see text). The stippled
boxes represent the map positions for essential genes (l1 and l2) that are removed by each deletion (see text for
definition and mapping of l1 and l2).
Deletion Mutation
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.2087
Demes
K E Holsinger
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0320
Most widespread species are composed of many populations that are somewhat isolated from one another.
A deme is one of these geographically localized populations. It is a group of individuals belonging to a
single species occurring in one place at one time.
Table 1
parryae
Distance      Correlation
25 feet       0.899
75 feet       0.875
250 feet      0.817
750 feet      0.723
0.5 mile      0.599
1.0 mile      0.505
1.5 miles     0.115
2.0 miles     0.096
Denaturation (Proteins)
A R Fersht
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0321
temperature. At high enough concentrations, all proteins denature irreversibly because their denatured
state aggregates and precipitates. Some proteins will
not spontaneously renature because they have been
subjected to modification after biosynthesis, such as
the removal of stretches of amino acids or just simple
cleavage. Proteins that have multiple disulfide bridges
are particularly intractable to renaturation if those
bridges are cleaved by reduction as part of the denaturation process. Incorrect bridges may be formed
on reoxidation. This problem is overcome during
protein biosynthesis by disulfide-shuffling enzymes.
Some proteins do not renature because a partly denatured state is kinetically stable and the structure
becomes trapped. Denaturation can also be caused by
chemical changes in a protein, such as oxidation of
cysteine or methionine residues or by deamidation
of glutamine or asparagine.
There are several biological consequences of protein denaturation. The first is that many proteins are
on the verge of being denatured at physiological temperature and so heat shock will cause them to denature. The denatured protein may be rescued by heat
shock proteins that specifically recognize the distinctive feature of denatured states of proteins: exposed
hydrophobic surfaces. The heat shock proteins bind to
the exposed hydrophobic surfaces and prevent aggregation. Heat shock proteins also function as molecular
chaperones in protein biosynthesis. They bind partly
denatured states of proteins during biosynthesis and
prevent their aggregation and precipitation in the same
way as they do in heat shock. The hydrolysis of ATP is
usually required to release the bound protein and
allow it to fold successfully. The molecular chaperones
can also have another role by causing proteins to unfold
temporarily in order to be transported across membranes or reassemble as parts of larger protein complexes. A probable mechanism for the formation of
amyloid deposits invokes the partial denaturation of
proteins: sequences of the protein that have a tendency
to form strands of β-sheet can be exposed on partial or
full denaturation so that they are able to associate and
form long fibrils of β-sheet. There is also a hypothesis
that the interconversion of the normal prion protein to
its scrapie form requires denaturation of the protein.
Derepression
C Yanofsky
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0322
Detoxification (SOD)
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0324
Deuteranopia
See: Color Blindness
Developmental Genetics
J Hodgkin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0326
reason for this is that the organisms favored for experimentation by developmental biologists were those
with large eggs and embryos, such as amphibia and
mollusks. These are ideal for the traditional methods
of embryological manipulation such as ablation and
transplantation, but none of them is easily subjected to
genetic analysis. Conversely, organisms favored by
geneticists tend to be small and rapidly developing,
and therefore hard to manipulate physically. Only
since 1970 has developmental genetics come into its
own, aided in particular by the advent of efficient
cloning methods.
Understanding the development of an organism
depends first on description of the events involved and
then on experimental intervention. Genetic methods
can contribute to both phases, though they are more
important in the second. Description of development
is greatly aided by techniques for marking a single cell
and all of its descendants, and the best markers for
such purposes are genetic, since these can be minimally invasive and transmitted indefinitely. Fate mapping by genetic techniques has produced fundamental
observations such as the existence of compartments in
insect development. The primary experimental intervention for the geneticist is the creation of mutants
and the study of their altered development. However,
the genetic toolkit permits other kinds of manipulation, some of which are equivalent to conventional
embryological manipulation. The creation of genetic
mosaics, in which cells of different genotype coexist in
the same individual, provides similar information to a
transplantation experiment. The use of conditional
mutations allows gene activity to be switched on or
off at different points in development. Temperature-sensitive mutants illustrate this approach: shifting
organisms under study from a permissive temperature
to a restrictive temperature or vice versa can effectively allow genes to be switched on and off, thereby
providing information about when gene activities are
required and in what order.
The general strategy adopted by developmental
geneticists has been to concentrate on a well-described
developmental process and carry out systematic
screens for mutants with abnormalities in that process.
Characterization of the phenotypic defects in these
mutants and study of the genetic properties and interactions of the genes thereby implicated can then be
used to deduce the approximate nature and function
of the underlying genetic program. Molecular cloning
methods permit the isolation of these genes and identification of their products, leading in turn to a biochemical description and explanation of the process.
The molecular data provide an additional bonus,
because gene sequences can be used to hunt for related
genes in other organisms. These can then be studied
biochemically, without the need for developing a
genetic system for that organism, in order to see if
they perform corresponding functions. The rapid progress in developmental biology over the past three
decades has been heavily dependent on this general
approach, particularly in vertebrates.
Different genetic systems have been used to concentrate on different developmental problems. The
question of cell-type specification has been addressed
in the greatest detail in various bacterial and fungal
systems, most notably in the analysis of mating-type
specification in the budding yeast Saccharomyces cerevisiae. Simple cases of pattern formation have been
examined in bacteria, fungi, protozoa, and slime
molds. For more complex examples of multicellular
development, genetic studies of animal development
have been especially powerful, exploiting the advantages of Drosophila, Caenorhabditis elegans, and the
mouse. In recent years, the zebrafish Danio rerio has
been developed as a promising additional system,
which combines the power of the genetic approach
with the advantages of a relatively large and transparent embryo, amenable to physical manipulation.
A major general finding from studies on these
organisms is that the developmental mechanisms are
remarkably conserved across the animal kingdom,
despite the great differences in anatomy and ontogeny
between different groups. Perhaps the most striking
example is provided by homeobox clusters, which
control patterning along the anterior-posterior axis
in all animals. The homeobox cluster was first discovered in Drosophila, but has proved to be conserved to a
remarkable degree in all metazoan animals, both in
function and in genetic and molecular organization.
Many other examples of conservation in the molecular
biology of animal development can be cited.
In contrast, development in plants
involves different processes and different molecules.
These differences may arise from distinctive properties of plants such as the lack of cell movement and the
presence of a cell wall. Such factors must greatly affect
the available strategies for generating a complex three-dimensional structure. However, the genetic approach
has been equally successful in unraveling some components of plant development, again by concentrating
on one or two favorable species. This can be illustrated
by the investigation of flower development, which has
been analyzed in parallel studies using Arabidopsis and
Antirrhinum (snapdragon).
The forward genetic approach to developmental analysis has been enormously successful, but is now likely
to be supplanted or at least greatly supplemented by
genome-based approaches. Microarray methods, in
which all of the many thousand genes in an organism
can be assayed for expression at once, are enormously
increasing the amount of information available on cell-specific and tissue-specific gene expression. Systematic studies of protein-gene and protein-protein interactions will have similar impact. These projects will,
however, build on the frameworks laid down by
previous genetic investigation. Also, the follow-up
studies to test the involvement and importance of
new genes and predicted interactions will continue
to rely on the general strategies and techniques of
developmental genetics.
See also: Developmental Genetics of
Caenorhabditis elegans; Homeotic Mutation;
Neurogenetics in Caenorhabditis elegans;
Neurogenetics in Drosophila; Pattern Formation
Developmental Genetics of
Caenorhabditis elegans
J Hodgkin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1411
Developmental Genetics,
Mouse
DiGeorge Syndrome
Dicentric Chromosome
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1821
Dictyostelium
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.2098
Dideoxy Sequencing
See: DNA Sequencing
Dideoxynucleotide
In 1965 DiGeorge recognized the association of hypocalcemia secondary to parathyroid hypoplasia and
absence of the thymus. As additional cases were
reported it became clear that cardiac malformations
and facial dysmorphism were often present as well. It
was reported in 1991 that a proportion of children
with DiGeorge syndrome had chromosomal deletions
within band 22q11. As techniques were developed to
detect submicroscopic deletions it became apparent
that 95% of affected children had such deletions. It
has also become clear that there is a very wide phenotypic spectrum associated with this deletion, classical
DiGeorge syndrome being at the most severe end of
the spectrum and the majority of cases having the
clinical features of velocardiofacial syndrome (discussed below). What is the cause of DiGeorge syndrome in the children who do not have deletions
within chromosome band 22q11? Although many
genes have been identified within the commonly
deleted region it is not yet clear which of these contribute to the phenotype. It is possible that smaller
deletions or mutations in a gene in this region could
produce the phenotype. A second genetic locus has
been identified, submicroscopic deletions in chromosome band 10p13 being associated with the syndrome.
DiGeorge syndrome has also been reported in the
offspring of diabetic mothers but there remain cases
where the cause is unknown.
Differential Segment
Although the children DiGeorge described had abnormalities of thymus and parathyroids, it is unusual
for these to cause the presenting features in children
with the deletion. Severe immunodeficiency is very
unusual, occurring in less than 1% of individuals
with the deletion, but T lymphocyte numbers are
often low, this largely being due to low CD4 counts.
However, most patients generate good antibody responses following immunization. It is important to
check the calcium levels to prevent hypocalcemic seizures but hypocalcemia responds well to oral supplements. Congenital heart defects are present in 75% of
patients with the deletion. The heart defects most
commonly seen are tetralogy of Fallot, pulmonary
atresia with ventricular septal defect, ventricular septal
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0332
Prevalence of Deletion
The standard method of detecting this deletion is
fluorescent in situ hybridization (FISH). A minimum
estimate of prevalence is 13 per 100 000 (95% confidence interval 4.5 to 21.5). This figure was based on
cases presenting in infancy and therefore misses the
milder end of the spectrum.
Diploidy
J R S Fincham
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0335
In the so-called lower plants haploidy is more prominent. Meiosis in ferns produces haploid spores
which germinate to give rise to miniature but free-living green plants (prothalli) which in turn produce
eggs and sperm for restoration of diploidy; in mosses
the leafy shoots are all haploid and only the green
spore-bearing capsules borne on the shoots are diploid.
The fungi, with the exception of some sections of
the Phycomycetes which are diploid, are nearly all
haploid, with meiosis in the sexual forms following
immediately after the fusion of haploid nuclei. The
budding yeasts, Saccharomyces and allied genera, can
propagate vegetatively either as diploids or as haploids,
but unless the haploid cells are artificially prevented
from mating, the diploid phase predominates.
In the algae all possibilities are found: entirely
haploid except for the zygotes (e.g., green flagellates
such as Chlamydomonas and Euglena), predominantly diploid, with a very brief haploid phase (e.g.,
the Fucales, brown seaweeds), and alternation of
morphologically similar haploid and diploid phases
(e.g., green algae such as Ulva spp., the sea lettuce).
True diploidy hardly occurs in bacteria, although
there are several ways in which bacterial cells can carry
a part of their genome in duplicate.
Figure 1 The generation of homozygous cells from heterozygotes by mitotic crossing-over. One pair of chromosomes is represented, with heterozygosity at two loci, a and b, in one arm, with a furthest from the centromere (shown as a
small circle). (A) Before chromosome replication (G1 stage of the cell cycle); (B) chromosomes replicated (G2
stage), with a rare crossover between chromatids in the a-b interval; (C) anaphase of mitosis with crossover
products passing to opposite poles of the spindle (50% probability); (D) daughter diploid nuclei, now homozygous
with respect to a but still heterozygous for b. A crossover between b and the centromere would lead (again with 50%
probability) to homozygosity for a and b together. This kind of segregation of homozygotes from heterozygous
diploids has been demonstrated particularly in the fly Drosophila melanogaster and the fungus Aspergillus nidulans.
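The 50% figures in this caption can be reproduced by enumerating the two ways the centromeres can segregate after a single mitotic crossover. The sketch below is a minimal model, assuming one homolog carries the a and b alleles, the other carries + at both loci, and exactly one crossover occurs between two non-sister chromatids; the function names and the interval labels `"a-b"` and `"b-centromere"` are illustrative choices.

```python
from fractions import Fraction

LOCI = ("b", "a")  # order from the centromere outward; a is distal


def segregate(crossover_interval):
    """List the daughter genotypes for both possible centromere segregations."""
    # Two sister chromatids per homolog after replication (G2).
    hom1 = [{"b": "b", "a": "a"}, {"b": "b", "a": "a"}]
    hom2 = [{"b": "+", "a": "+"}, {"b": "+", "a": "+"}]
    # Loci distal to the crossover point are exchanged between non-sister chromatids.
    distal = {"a-b": ["a"], "b-centromere": ["b", "a"]}[crossover_interval]
    x, y = hom1[1], hom2[0]
    for locus in distal:
        x[locus], y[locus] = y[locus], x[locus]
    results = []
    for i in (0, 1):  # which hom1 chromatid travels with hom2[0]
        pole1 = (hom1[i], hom2[0])
        pole2 = (hom1[1 - i], hom2[1])
        results.append([
            {locus: "/".join(sorted((c1[locus], c2[locus]))) for locus in LOCI}
            for c1, c2 in (pole1, pole2)
        ])
    return results


def homozygous_fraction(crossover_interval, locus):
    """Fraction of segregations in which both daughters are homozygous at locus."""
    results = segregate(crossover_interval)
    hits = sum(
        all(len(set(d[locus].split("/"))) == 1 for d in pair) for pair in results
    )
    return Fraction(hits, len(results))


for interval in ("a-b", "b-centromere"):
    for locus in LOCI:
        print(interval, locus, homozygous_fraction(interval, locus))
```

In this model a crossover in the a-b interval can make the daughters homozygous at a only, while a crossover between b and the centromere can make them homozygous at both loci, in each case for one of the two equally likely segregations.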
functions in meiosis as a diploid. Meiotic pairing will
usually occur far more readily between the fully homologous chromosomes from the same species than between equivalent chromosomes from different species,
and bivalents will then be formed virtually exclusively.
The tetraploid will then behave in meiosis like a diploid
and will obey ordinary Mendelian rules, though it may
well have functional duplication of many genes. Tetraploids of this kind are called allotetraploids, or sometimes amphidiploids, since they have two different
diploid chromosome sets. Wheat (Triticum aestivum)
is an allohexaploid, with three diploid genomes originating from different species. Functionally, and with
minor variations, it has all its genes in triplicate, but
it behaves in meiosis like a regular diploid.
Direct Repeats
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1822
Directed Deletion In
Developmental Processes
Disassortative Mating
See: Assortative Mating
Discontinuous Replication
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1823
Discordance
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0342
Directed Mutagenesis
See: Complement Loci
Directed Mutation
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0339
Disequilibrium
See: Gametic Disequilibrium; Linkage
Disequilibrium
Disjunction
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0343
Disruptive Selection
W G Hill
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0344
Distal
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0345
Divergent Evolution
Divergent Transcription
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1824
Divergent transcription is the initiation of transcription at two promoters facing in opposite directions,
such that transcription proceeds away in both directions from a central region.
See also: Transcription
D-Loop
Y Yamamoto
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0346
DNA
J J Perona
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0347
DNAase
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1826
DNA-Binding Proteins
A Travers
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0348
Motif
Recognition
Specificity
Examples
Histone fold
HMG domain
AT-hook domain
TBP domain
HU class
Helix-turn-helix class
Homeodomain
Winged helix
Pou domain
Zn-containing motifs
Zinc finger
Receptor DBD
Gal4 DBD
GATA
bZip
Rel
DNA backbone
Minor groove
Minor groove
Minor groove
Minor groove
Major groove
None
Variable
A/T rich sequences
High (TATAAA)
Variable
Highly variable
Moderate
Highly variable
Major groove
Major groove
Major groove
Major and minor grooves
Major groove
Major groove
TFIIIA
Estrogen receptor
Gal4
GATA-1
GCN4, c-Fos
NF-κB
High
High
DBD, DNA-binding domain; HMG, high mobility group. The HMG proteins are a disparate group of abundant nuclear
DNA-binding proteins; the HMGA group contain an AT-hook; the HMGB group contain a canonical HMG domain; and the
HMGN group (not listed above) bind to core nucleosome particles.
sequence-specific interactions are mediated by a helix-turn-helix
domain binding in the major groove and
further interactions are made by a short extended
peptide in the floor of an adjacent minor groove.
Typically a DNA sequence recognition motif
recognizes a sequence of 3-5 bp. This is insufficient
to allow highly selective discrimination between all
sequences in a genome. In practice the effective site
size for recognition can be increased by use of a larger
protein assembly, or as in the POU domain, the
conjunction of two sequence-specific DNA-binding
motifs in the same polypeptide. Larger assemblies
often comprise a stable homo- or heterodimer (as in
helix-turn-helix and bZip proteins). However, in
other cases cooperative interactions between proteins
bound at contiguous or distant DNA sites are
required for the formation of an assembly. One example of such interactions is the cooperative binding
of lambda CI repressor dimers to the leftward and
rightward operators in lambda DNA, each of which
contains three CI binding sites. Because stable binding
to any two of these sites is dependent on interaction
between separate dimers, occupation is sensitive to
small changes in concentration of repressor molecules
over a certain range. Interactions can also occur
between distant binding sites on the DNA such that
simultaneous occupation generates a loop of intervening DNA. Loop formation of this type need only
require a single stable protein assembly (as in the case
of the tetramer of the lac repressor containing four
helix-turn-helix motifs) or may require cooperative
interactions between proteins bound at the separate
sites (as have been postulated for interactions between
eukaryotic enhancer and promoter elements). When
bound at distant sites there is frequently a requirement
that the two proteins be bound on the same face of the
double helix. One example of this phenomenon is
the binding of the E. coli AraC protein to two sites
separated by about 230 bp on the araBAD promoter.
Altering the separation of these sites by integral
double-helical turns has little effect on loop formation. By contrast alteration of the separation by 0.5,
1.5, 2.5 turns, etc., severely impairs loop formation
and regulatory function. The requirement for binding
on the same face of the double helix arises because the
torsional rigidity of DNA prevents the untwisting or
overtwisting necessary to bring the two proteins into
appropriate spatial register when they are initially
bound on opposite faces of the duplex. The presence
of more than one DNA-binding motif in a protein can
also permit interactions with more than one DNA
duplex. A good example of this is provided by the
globular domain of histones H5 and H1 (Figure 1).
These domains comprise a winged helix variant of the
helix-turn-helix motif where the recognition helix is
Protein-DNA Recognition
The binding of a protein to a specific DNA sequence is
largely dependent on two types of interaction. The
principal basis for sequence selectivity is direct contact
between the polypeptide chain and the exposed edges
of the base pairs, primarily in the major groove of
B-DNA. These contacts may involve either hydrogen
bonds or van der Waals interactions, the latter particularly with the methyl group of thymine. Small
molecules, such as water molecules, which are tightly
and rigidly bound to a protein and are thus integral
components of the macromolecular structure, may
also participate in these interactions and so provide
binding specificity to the protein by proxy.
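How much selectivity base-pair contacts can supply depends strongly on the number of positions read. As a rough sketch, the expected number of chance occurrences of a fully specified site can be estimated under simplifying assumptions that are not from the text: equal frequencies of the four bases, independent positions, and a genome of about 4.6 million bp (roughly the size of the E. coli genome). The function name `expected_random_sites` is hypothetical.

```python
def expected_random_sites(site_length_bp, genome_size_bp, both_strands=True):
    """Expected chance matches to one fully specified site in random sequence.

    Assumes equal base frequencies and independent positions (a simplification).
    """
    windows = max(genome_size_bp - site_length_bp + 1, 0)
    per_window = 0.25 ** site_length_bp
    strands = 2 if both_strands else 1
    return strands * windows * per_window


GENOME_BP = 4_600_000  # assumed genome size, for illustration only

for n in (3, 5, 6, 10, 15):
    print(f"{n:>2} bp site: {expected_random_sites(n, GENOME_BP):.3g} expected chance matches")
```

Under these assumptions the expected number of chance matches falls by a factor of four for every additional specified base pair, so short recognition sequences recur many times in a genome while longer composite sites become effectively unique.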
The binding energy available from direct interactions with the base pairs, although significant, is not in
general sufficient by itself to allow the formation of a
stably bound complex for binding sites of average
length (6-15 bp). The required additional binding
energy may be provided by direct electrostatic interactions between basic amino acid residues and the
negatively charged sugar-phosphate backbone. The
spatial constraints imposed by this type of interaction
may also serve to restrict the configuration of the DNA
when bound to protein. It is the difference between
the binding energies for the sequence-dependent and
the DNA by substantial amounts: for example, the
heterodimeric IHF can induce a bend of 180° over one
double-helical turn while the HMG domain induces a
bend of ~90-100° over six base pairs (Figure 2).
In addition to proteins that induce distortions of
DNA structure certain DNA-binding proteins recognize both lesions in DNA such as UV-induced pyrimidine dimers, cisplatin adducts and single-stranded
nicks, and also enzymatically generated structures
including four-way junctions and fork junctions.
One particular example is the A domain of HMG1,
which specifically recognizes a cisplatin adduct by
binding in the minor groove and inserting a phenylalanine residue into the cisplatin-induced kink
between two adjacent guanine bases.
See also: DNA Structure
DNA Cloning
V Sgaramella and A Bernardi
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0350
Definition
The word `clone' comes from the ancient Greek,
meaning `bud' or `twig.' It was introduced in biomedical science to describe a group of genetically
identical organisms. The members of a clone (often
named clones themselves) are expected to be identical
because of their descent from a single organism. By
`cloning,' we mean the ability to generate several identical copies of (1) a DNA molecule (`molecular cloning,' to be obtained either in vivo or in vitro), or (2) a
cell (which could be eukaryotic or prokaryotic: in this
case we speak of `cellular cloning'). Recent attempts to
clone fully developed mammals have stirred interest
among scientists and emotions among laymen: in this
case the term `somatic or reproductive cloning' is
often used since the starting materials are the nuclei
of somatic, possibly differentiated cells, and the aim is
the generation of copies.
History
All the information and techniques necessary to identify and retrieve a DNA fragment or gene of interest,
present in any genome, have essentially been known
since the 1970s. In the early 1970s several techniques
were developed which launched molecular cloning,
also known as `genetic engineering' or `recombinant
DNA technology.' Briefly, they allowed the covalent
joining (or ligation) of any given fragment of DNA
Cloning Objectives
Three major objectives can be considered:
1. Production of large amounts of a DNA sequence.
2. Study of a single gene product (RNA and/or protein).
3. Construction of transgenic organisms carrying and
expressing a transgene.
libraries are stocks of host cells harboring a recombinant structure containing a unique genomic fragment, different in each transformed cell. There are
specialized libraries for DNA fragments corresponding to single chromosomes, or for DNA complementary (cDNA) to messenger RNA (mRNA). These
libraries are the starting material for gene or genome
sequencing projects.
The host of choice for sequencing projects is E. coli.
Amplification of a large amount of DNA requires
specialized vectors present in a high copy number
per cell (multicopy plasmids or phages). For some
goals, such as genomic mapping, the vectors have to
be able to accommodate very large DNA fragments; see
below: `Preparation of the vector.'
Genomic libraries have facilitated the complete
resolution of the DNA sequences of tens of bacterial
genomes (many of them pathogens), of the entire yeast
genome and, with the ever-improving sequencing
technologies, the genomes of a worm (Caenorhabditis
elegans), of an insect (Drosophila melanogaster), of a
plant (Arabidopsis thaliana), and ultimately of the
human genome.
Transgenic Organisms
These steps are depicted in Figure 1; very comprehensive presentations can be found in Winnacker (1987),
Old and Primrose (1989), and Micklos and Freyer (1990).
The most commonly used vectors have been the plasmids. These are extrachromosomal elements, able
to replicate themselves within their hosts, abundantly
Figure 1 The steps of a cloning experiment. Shaded areas indicate the ends of the linearized vector or of the insert;
ori indicates the origin of replication of the vector; marker indicates the resistance to an antibiotic. In (C),
chromosomal DNA inside the host cell is not shown and only one copy of recombinant DNA is indicated.
and independently from the replication of the host
chromosome. A large number of vectors are derived
from bacterial plasmids, particularly from E. coli. Many
of them have been modified in vitro to accomplish
their task more efficiently. They generally (1) are small,
3-5 kb, (2) are present in a large number (up to 500
copies) per cell, (3) carry the gene for resistance to one
or more antibiotics, so that the cells which harbour
them can be positively selected, and (4) are endowed
with several unique restriction sites in order to facilitate the insertion of the desired gene.
Plasmids can easily accommodate DNA fragments
up to 2-3 times their size. Another frequently used vector for cloning in E. coli is the phage lambda, which
has been modified to allow the insertion of relatively
large (10-15 kb) fragments. Other phages, mostly the
phage M13, also considerably modified in vitro, have
been extensively used, especially when the main purpose of the cloning was sequencing the fragment.
Some eukaryotic vectors, such as the 2 μm plasmid of
yeast origin or systems based on the simian virus
Ligation Step
(A)
5' G T G A T*C T A G 3'
3' C A C T A G A T C 5'
(B)
5' G T C T G*        A A T T C G T C C G 3'
3' C A G A C T T A A        *G C A G G C 5'
(C)
5' G T T C G A T*  A T C G T G T 3'
3' C A A G C T A  *T A G C A C A 5'
Figure 2 (A) Polynucleotide chain interrupted on a single strand at the phosphodiester bond between T and C. The
reaction catalyzed by the ligase is indicated by an asterisk. (B) Two different fragments obtained by digestion with
the restriction enzyme EcoRI (which leaves 5' protruding ends). The EcoRI recognition site is shown in bold; the
other nucleotides are examples. The reactions catalyzed by the ligase are indicated by two asterisks. (C) Two
different fragments obtained by digestion with the restriction enzyme EcoRV (which leaves flush ends). The EcoRV
recognition site is shown in bold; the other nucleotides are examples. The reactions catalyzed by the ligase
are indicated by two asterisks.
demonstrated that two DNA duplexes having no protruding ends (called flush or blunt ends) could be
ligated to each other, albeit with slightly lower efficiency (Figure 2C).
Irrespective of the origins of the DNA fragment
and vector, the ends of both have to be compatible
in order for ligation to occur. As indicated above,
restriction enzymes cut DNA at specific sequences,
frequently in a staggered way: that means that either
the 5' or 3' ends may be protruding, but always in a
complementary way. For cloning, the most favorable case is when both the vector and the fragment
have been cut by the same restriction enzyme (for
instance EcoRI, PstI, HindIII, etc.) or at least by two
enzymes which create complementary protruding
ends (for instance BamHI and BglII). Under these
circumstances the efficiency of the ligation step is
very high. However, if the vector and the DNA to be
cloned have been digested by different enzymes, generating non-compatible ends, other steps are taken to
make them compatible:
• The simplest procedure is to make both duplexes
have blunt ends. This can be achieved with several
enzymes which can either fill in or remove the
protruding strand. This procedure to `polish' the
ends of the DNA fragments prior to the cloning
step is almost mandatory when the fragments have
been obtained by physical or enzymatic methods
(such as sonication, shearing, or PCR amplification)
which leave the fragments frayed at their ends and
thus not suitable for joining. After this step all the
fragments are suitable for cloning in a vector linearized by an enzyme (such as HindII or EcoRV)
which leaves flush ends on its DNA substrate.
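The compatibility rule described above can be sketched in a few lines of Python. This is an illustrative sketch, not a procedure from the article: the `Enzyme` class and the `overhang` and `compatible` helpers are hypothetical names, and the sketch assumes palindromic recognition sites, so that two ends can be ligated whenever they are both blunt or carry the same protruding sequence with the same polarity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Enzyme:
    name: str
    site: str  # recognition sequence, written 5'->3' (assumed palindromic)
    cut: int   # number of bases lying 5' of the cut on the top strand

def overhang(enzyme):
    """Return (kind, sequence) of the single-stranded end the enzyme leaves."""
    mirror = len(enzyme.site) - enzyme.cut  # bottom-strand cut in top-strand coordinates
    if enzyme.cut == mirror:
        return ("blunt", "")
    if enzyme.cut < mirror:
        return ("5'", enzyme.site[enzyme.cut:mirror])
    return ("3'", enzyme.site[mirror:enzyme.cut])

def compatible(a, b):
    """Ends are compatible when both are blunt or both carry the same overhang."""
    return overhang(a) == overhang(b)

ECORI = Enzyme("EcoRI", "GAATTC", 1)
BAMHI = Enzyme("BamHI", "GGATCC", 1)
BGLII = Enzyme("BglII", "AGATCT", 1)
PSTI = Enzyme("PstI", "CTGCAG", 5)
ECORV = Enzyme("EcoRV", "GATATC", 3)

for pair in [(BAMHI, BGLII), (ECORI, BAMHI), (ECORI, ECORV)]:
    print(pair[0].name, pair[1].name, compatible(*pair))
```

BamHI and BglII recognize different sites but both leave a 5' GATC overhang, which is why the text lists them as a compatible pair; EcoRI and EcoRV ends would first have to be polished to blunt ends.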
Transformation
Transformation is the step by which the hybrid molecule comprising the vector and the ligated DNA
duplex (as described above) is introduced into the host.
Introduction of large DNA molecules into host cells
does not occur naturally except by viruses. Therefore
these cells have to be subjected to chemical or physical
treatments that will make them `competent' to accept
foreign DNA. The rationale of the chemical method
is the observation made in the 1970s that the treatment
of E. coli cells with calcium chloride (which permeabilizes the cell wall) under cold-shock conditions
considerably enhanced the uptake of plasmid DNA
by these bacteria. All the improved methods subsequently developed for bacteria are based on this
observation. The frequency of this event is relatively low: even under the most favorable conditions
only one out of 500-1000 cells takes up the plasmid.
Conclusions
The ability of cloning to yield an exponential multiplication of DNA molecules in vivo through vector-mediated transformation, as well as in vitro via PCR,
References
DNA Code
See: Genetic Code
DNA Damage
See: DNA Repair
DNA Denaturation
D W Ussery
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0353
DNA denaturation refers to the melting of double-stranded DNA to generate two single strands. This
involves the breaking of hydrogen bonds between the
bases in the duplex. From a thermodynamic point of
view, the most important contribution to DNA helix
stability is the stacking of the bases on top of one
another. Thus, in order to denature DNA, the main
obstacle to overcome is the stacking energies that
provide cohesion between adjacent base pairs. In general, stacking energies are less for pyrimidine/purine
(YR) steps, and for AT-rich regions. Thus, the
sequence TATATA would be expected to melt quite
readily, and this is indeed what happens, both in a test
tube and inside cells.
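The qualitative rule above (AT-rich regions and pyrimidine/purine steps melt first) can be turned into a rough ranking of windows along a sequence. The sketch below is only a heuristic under stated assumptions: the function names are hypothetical, and the score is an unweighted sum of AT fraction and YR-step fraction rather than a real table of stacking energies.

```python
PYRIMIDINES = set("CT")
PURINES = set("AG")

def at_fraction(seq):
    """Fraction of bases that are A or T."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)

def yr_step_fraction(seq):
    """Fraction of dinucleotide steps that go pyrimidine -> purine (YR)."""
    seq = seq.upper()
    steps = list(zip(seq, seq[1:]))
    yr = sum(a in PYRIMIDINES and b in PURINES for a, b in steps)
    return yr / len(steps)

def likely_first_to_melt(seq, width):
    """Return (start, window) for the window with the highest heuristic score."""
    def score(i):
        window = seq[i:i + width]
        return at_fraction(window) + yr_step_fraction(window)
    best = max(range(len(seq) - width + 1), key=score)
    return best, seq[best:best + width]

print(likely_first_to_melt("GCGCGCGGTATATAGCGCGC", 6))
```

On this toy sequence the TATATA window ranks first, in line with the example in the text; a quantitative prediction would need measured stacking energies for each of the dinucleotide steps.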
There are a variety of ways in which to denature
DNA. Perhaps one of the most common (and oldest)
methods used in the laboratory is simply to heat the
DNA to a temperature above its Tm or melting point.
The unstacking of the DNA base pairs can be readily
monitored spectrophotometrically. DNA absorbs
strongly at 260 nm, and as the DNA melts, the absorbance increases until all of the DNA is melted,
and then remains constant on further heating. (This
is called the `hyperchromic effect,' and the absorbance
of single-stranded DNA is usually around 50%
greater than that of the corresponding duplex DNA.)
The process is reversible, and the renaturation time of
DNA can be used to estimate its base composition as
well as the presence of repetitive fractions within the
sequence. This method was used in the 1960s to
monitor differences in the base composition of DNA
from different organisms, and also to demonstrate that
eukaryotic DNA contained a large fraction of repeated sequences. Figure 1 shows the melting temperatures of genomic DNA from several different
microorganisms as a function of the AT content of
the genome.
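A melting curve of the kind described above can be reduced to a single Tm by normalizing the absorbance readings and finding the half-melted point. The following is a minimal sketch, assuming the curve rises monotonically between a fully duplex plateau and a fully melted plateau; the readings are invented for illustration and the function names are hypothetical.

```python
def fraction_melted(absorbances):
    """Scale A260 readings to 0 (all duplex) .. 1 (all single-stranded)."""
    low, high = min(absorbances), max(absorbances)
    return [(a - low) / (high - low) for a in absorbances]

def melting_temperature(temps, absorbances):
    """Temperature at which half the DNA is melted, by linear interpolation."""
    theta = fraction_melted(absorbances)
    points = list(zip(temps, theta))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 <= 0.5 <= f1 and f1 > f0:
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("curve never crosses the half-melted point")

# Hypothetical readings: temperature in degrees C and absorbance at 260 nm.
temps = [60, 70, 80, 85, 90, 95, 100]
a260 = [1.00, 1.00, 1.05, 1.25, 1.45, 1.50, 1.50]

tm = melting_temperature(temps, a260)
increase = (max(a260) - min(a260)) / min(a260)
print(round(tm, 2), round(increase, 2))
```

The invented data were chosen so that the absorbance rises by about 50% on melting, matching the figure quoted in the text.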
Figure 1 Melting temperatures of the genomic DNA from various organisms, as a function of AT content, ranging
from 33% AT (Deinococcus radiodurans) to 75% AT (Ureaplasma urealyticum). A straight line representing the best fit
through the points is also shown. (Recovered from the plot: axis ticks 70 to 95 and 30 to 80; organisms labeled, in order, Deinococcus radiodurans, Mycobacterium leprae, Escherichia coli, Bacillus subtilis, Saccharomyces cerevisiae, Ureaplasma urealyticum.)
The actual Tm of a given piece of DNA will depend
on several factors, such as the length of the DNA
sequence (shorter pieces of DNA will tend to melt
more easily than longer pieces), the base composition
of the DNA (in general, regions with alternating
pyrimidine/purine steps and AT-rich regions will
melt more readily), the topological condition of the
DNA (e.g., whether it is a closed circle that is relaxed
or supercoiled, or a linear piece, or is heavily nicked),
and the composition of the buffer (in terms of the
amount of salt and which ions are present). Given the
roles of all of these parameters, it is difficult to predict
accurately the exact melting temperature of a given
sequence, although it is generally easy to say which
region within a long piece of DNA will melt first.
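Although no formula captures all of these parameters, the dependence on base composition alone is often approximated by a linear rule. The sketch below uses the widely quoted Marmur-Doty relation, Tm = 69.3 + 0.41 x (%G+C), which is not taken from this article and holds only for long genomic DNA in standard saline citrate buffer; treat it as a rough estimate under those assumptions. The function names are hypothetical.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)

def tm_marmur_doty(gc_pct):
    """Empirical Tm estimate in degrees C for long DNA in standard saline citrate.

    Assumption: the constants 69.3 and 0.41 apply to that buffer only.
    """
    return 69.3 + 0.41 * gc_pct

# AT content 75% and 33% correspond to GC content 25% and 67%.
for at_pct in (75.0, 33.0):
    print(at_pct, round(tm_marmur_doty(100.0 - at_pct), 2))
```

Because the intercept and slope depend on the salt conditions, these estimates should not be expected to reproduce the fitted line of any particular experiment; they only show the direction and rough size of the effect of AT content.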
Denaturation of small regions of DNA within a
much longer sequence can be estimated by using
enzymes or chemicals that modify or cut single-stranded DNA more readily than duplex DNA.
Some enzymes, such as methylation enzymes and
certain single-strand-specific nucleases, can be used
to monitor the denaturation status of a particular
region of DNA, either in a test tube or in a living
cell. Some chemicals will react preferentially with
single-stranded DNA, such as haloacetaldehydes
(e.g., chloroacetaldehyde), permanganate, diethyl
pyrocarbonate (DEP), or osmium tetroxide. These
chemicals can be used in a similar way to the enzymes,
and the location of the modified bases can be detected
using polymerase chain reactions (PCRs). As an alternative, fluorescently labeled oligomers specifically
designed to hybridize to a suspected region of single-stranded DNA can be used both in vivo and in
vitro. Another method is to use a cross-linking agent,
such as psoralen, to cross-link the single strands
Range: 360000..370000, Escherichia coli strain K-12, isolate MG1655 (lac operon region), 4 639 221 bp. Genes shown: mhpA (forward); lacA, lacY, lacZ, lacI, mhpR (reverse). Scale: 360 k to 370 k.
Figure 2 (See Plate 12) A `DNA atlas' for the lac operon in Escherichia coli. Genes oriented in the `forward direction' are shown in blue, whilst genes in the
`reverse direction' are indicated in red (B). (C) Color-coded bar representing the calculated stacking energy in kcal mol⁻¹ of the DNA sequence: green indicates
regions that will require more energy to melt (e.g., more negative numbers), and red indicates regions that will melt more readily (that is, the stacking energy values
are smaller or closer to zero). Color-coded bar indicating the AT content of the region: blue represents lower AT content, and red indicates more AT-rich regions.
Note that near the beginning of the operon there are more AT-rich regions, which also correspond to regions that will melt more readily. For more information on
DNA atlases and melting profiles of regions upstream of genes for whole genomes, see Pedersen et al. (2000) and DNA Genome Atlas (http://www.cbs.dtu.dk/services/GenomeAtlas/).
facilitates DNA denaturation. Glyoxal agarose gels
can also be used to stabilize single-stranded DNA or
RNA. This is particularly important for Southern and
Northern blotting methods.
Further Reading
References
DNA, History of
M R Sanderson
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0120
crystallographic methods. Rosalind Franklin succeeded in precisely controlling the humidity of her
DNA fiber sample using hydrogen bubbled through
ammonium sulfate solutions, which resulted in her obtaining the beautiful and now famous X-ray diffraction
pattern No. 51 of the high-humidity biologically relevant B-form. She also obtained excellent pictures of
the lower humidity A-form and showed that the much lower quality diffraction patterns recorded earlier, in 1938, by Astbury
were a mixture of the A- and
B-forms. Personality clashes, an erroneous value for
the DNA density of the B-form, and time spent on
trying to fully index the diffraction pattern and solve
the Patterson function for the A-form slowed down
the elucidation of the more biologically relevant
B-form at King's.
The approach taken by James Watson and Francis
Crick in Cambridge was that developed by Linus
Pauling to solve the structure of the α-helix, namely
to build a model to fit the experimentally relevant data
incorporating the relevant chemical constraints. That
the B-form is a helix was appreciated by both the
London and Cambridge groups. Stokes at King's and
Crick, Vand, and Cochran had shown independently
that the X (cross) pattern in the diffraction pattern of
the B-form arose from diffraction by a helical object.
An early model of a triple DNA helix was built by
Bruce Fraser in the London group on the basis of the
erroneous density value and the knowledge that the
structure must be a helix. The key point observed by
James Watson and Francis Crick was that the symmetry of the B-form diffraction pattern dictated that the
helix had to have a twofold axis normal to the helix axis,
and hence antiparallel strands and an even number of
strands. In addition to this Watson and Crick pulled
together a wide range of disparate information: Chargaff's rules, dictating that the A:T and G:C ratios in a variety
of organisms are one to one, and the correct tautomers
for the bases, which allowed correct hydrogen-bonded base-pairing, the latter gleaned from Jerry Donohue; the
correct chemical linkages of base, sugar, and phosphate from Lord Todd's organic chemical group at
Cambridge; a pitch of ten base pairs per helical
turn over 34 Å with a base-pair separation of 3.4 Å
(1 Å = 10⁻¹⁰ m) (from the reflections on the helical
cross and the strong meridional reflection in the B-form diffraction pattern); potential charge neutralization of the phosphate groups on the outside of the
helix; and Furberg's result that base and sugar are
noncoplanar. The result was the construction of
their double-helical model for the B-form of DNA
(Watson, 1969; Olby, 1974; Portugal and Cohen,
1977; Chargaff, 1980; Gribbin, 1985; Crick, 1989;
Chomet, 1995; Judson, 1995), which was published
in 1953 (Watson and Crick, 1953) together with two
papers from the King's College London group describing the experimental data (Franklin and Gosling,
1953; Wilkins et al., 1953). Watson and Crick's model
immediately indicated how DNA could be replicated
and had an immense impact on genetics and biochemical experiments. The DNA structure was a huge catalyst for further experiments to elucidate transcription
and also the biochemistry of replication. Severo Ochoa
and Arthur Kornberg and other researchers went
on to work out how nucleic acids are synthesized
(Kornberg, 1991; Kornberg and Baker, 1991).
Further Reading
References
DNA Hybridization
W Fitch
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0355
DNA Invertases
See: Hin/Gin-Mediated Site-Specific DNA
Inversion
DNA Lesions
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0357
Lesions in DNA result from damage that occurs spontaneously or as a result of external agents. Often these
lesions result in blocks to DNA replication and/or
mutagenesis. Some spontaneous lesions result in altered bases such as the products of deamination. Cytosine deaminates to uracil, adenine to hypoxanthine,
guanine to xanthine, and 5-methylcytosine to thymine.
Each of these altered bases has different pairing properties from the original base and will lead to mutations
if left unrepaired. Other altered bases result from
oxidative damage caused by reactive oxygen species
generated during metabolism and cellular respiration. A wide variety of altered bases can be formed,
of which ring-opened purines, thymine glycol, and
8-oxoguanine are the most widely studied, the latter
two being implicated in mutagenesis if left unrepaired.
Spontaneous lesions are also produced by depurination
and depyrimidination, in which the N-glycosidic bond
is cleaved and the base is lost. If the cell is forced to replicate
past these lesions, mutagenesis frequently results,
since there is no way to determine the correct pairing
partner for the lost base.
Radiation and chemicals are among the treatments
that can damage DNA. Ionizing radiation generates a
myriad of DNA lesions, including thymine glycol,
8-oxoguanine, thymine dimers, 5-hydroxymethyluracil, and many others. Damage to sugars in the DNA
and strand breaks have been detected; UV radiation
also results in numerous photoproducts. Two different pyrimidine-pyrimidine dimers have been implicated in mutagenesis. These are the cyclobutane dimer
and the 6-4 photoproduct (also called the pyrimidine-pyrimidone (6-4) photoproduct). Thymine glycols are
also generated. Many different kinds of chemicals
damage DNA in a manner that can lead to mutagenesis. Some agents such as methylmethanesulfonate,
ethylmethanesulfonate, N-ethyl-N-nitrosourea, and
N-methyl-N'-nitro-N-nitrosoguanidine alkylate different positions on the DNA. Alkylations at the O-6
position of guanine and the O-4 position of thymine are
best correlated with mutagenesis. Other agents cause
interstrand cross-links, blocking DNA replication.
Cis-platin (cis-platinum (II) diaminodichloride) is an
example. Many chemical agents make large adducts to
DNA bases, blocking replication and often resulting
in mutagenesis; the carcinogens aflatoxin B1, 4-nitroquinoline 1-oxide, benzo(a)pyrene diolepoxide, and N-2-acetyl-2-aminofluorene are examples. The adducts
are not always in the same place; thus, aflatoxin B1
forms its principal adduct at the N-7 position of guanine, benzo(a)pyrene diolepoxide at the exocyclic
amino group of guanine, and N-2-acetyl-2-aminofluorene at the C-8 position of guanine.
See also: Mutagens; Mutation
DNA Ligases
S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0359
DNA Mapping
See: Chromosome Mapping
DNA Marker
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0361
In addition to the four bases, guanine, cytosine, adenine, and thymine, many genomes contain chemically
modified derivatives such as 5-methylcytosine and
N-methyladenine. Enzymes catalyzing such modifications at a wide range of different sites are found
in bacteria, where they are usually associated with
restriction enzymes, which cut the DNA at the same
sites protected by the cognate modification enzymes.
A large range of specificities exists and more are constantly being discovered. The DNA of eukaryotic cells, especially
those of vertebrates and plants, is heavily modified,
but in these cases the function of the modification is
not well understood.
See also: DNA
DNA Polymerase η (Eta)
DNA polymerase η (pol η) is a translesion DNA polymerase, encoded by the gene RAD30A (chromosome
6). The enzyme replicates nondamaged DNA with low
fidelity, due to a lack of proofreading activity. Unlike
polymerases δ and ε, it inserts adenines across thymine-thymine
dimers, the most frequent pyrimidine dimer
lesion in DNA induced by UV. Thereby it protects the
cell against UV-induced mutagenesis. Inherited deficiency of pol η results in a variant form of the skin-cancer-prone disorder xeroderma pigmentosum.
See also: DNA Polymerases; Pyrimidine Dimers;
Xeroderma Pigmentosum
DNA Polymerases
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0364
DNA Recombination
D Carroll
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0365
The term DNA recombination is often used interchangeably with genetic recombination, but there are
two contexts in which it may have a distinct meaning.
First, DNA recombination is distinguished from RNA
recombination, which may occur during the replication of RNA viruses or in some in vitro reactions.
Second, the term is sometimes used to describe recombination of DNA that has been introduced into cells,
as distinct from natural recombination processes that
involve endogenous chromosomes. As with genetic recombination, a distinction is made between legitimate
recombination, which involves extensive sequence
homology, and illegitimate recombination, which is
supported by little or no sequence homology between
the interacting DNAs.
See also: Genetic Recombination; Illegitimate
Recombination
DNA Repair
E C Friedberg
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0366
pH is no greater than that of the collective chemical
bonds of which it is comprised. Hence, the nitrogenous bases adenine, cytosine, guanine and thymine are
subject to alterations due to spontaneous tautomeric
shifts as well as the spontaneous loss of exocyclic
amino and methyl groups. Additionally, spontaneous
hydrolysis of the glycosylic bonds linking the bases to
the sugar-phosphate backbone of DNA results in the
loss of free purines and pyrimidines, leaving sites of
base loss in the DNA, which are themselves subject to
further spontaneous chemical alterations. Second, the
fidelity of the process of DNA synthesis is limited by
the accuracy of the replication machinery with respect
to correct base pairing. Hence, DNA replication is
intrinsically error-prone. The magnitude of errors during stable base pairing depends on the DNA polymerase in question and its associated accessory proteins.
Probably the most prevalent exogenous source of
DNA damage derives from various reactive oxygen
species (ROS) which are products of both normal and
abnormal oxidative metabolism in animal cells. These
ROS can interact with and chemically modify the
nitrogenous bases, the sugars and the sugar-phosphate
linkages. Oxidative alterations to DNA represent an
extensive and under-appreciated source of DNA
damage which is likely the source of many mutations
in cells, and hence of diseases that derive from mutations in somatic cells, such as cancer. Sunlight constitutes another prevalent source of exogenous DNA
damage. Some of the UV radiation that filters through
the earth's atmosphere is readily absorbed by the
nitrogenous bases in DNA resulting in the formation
of multiple distinct photoproducts which interfere
with normal DNA replication and transcription.
Finally, DNA is interactive with multiple diverse
exogenous chemical agents. Some of these agents
derive from normal (or abnormal) cellular metabolism. Others derive from natural organic sources, typically other life forms, and in recent decades yet others
derive in increasing quantities from synthetic industrial pursuits.
The last-mentioned category aside, all the other
sources of DNA damage mentioned above have provided powerful environmental influences for the
selection of multiple and diverse mechanisms for the
repair of damaged DNA. Not all DNA damage is
intrinsically deleterious; rather, DNA damage, especially spontaneous base damage, also provides a source
of genetic variability in germline cells which serves as
the essential basis for Darwinian evolution. Hence,
the elaboration of DNA repair mechanisms which are
perfect in the sense that they achieve complete restoration of all forms of DNA damage in cells is antithetical to the notion of genetic diversity and of evolution
by natural selection.
Enzymatic photoreactivation
One of the quantitatively major photoproducts produced in DNA exposed to UV radiation is the cyclobutane pyrimidine dimer. The covalent joining of
adjacent stacked pyrimidines in the DNA duplex
derives from saturation of their respective 5,6 double
bonds following the absorption of UV radiation at
~260 nm, resulting in the formation of a cyclobutane
ring structure. The first DNA repair mode to be discovered is one in which such dimerized pyrimidines
are restored to their normal monomeric state in
the presence of visible light. Such repair, called enzymatic photoreactivation, is catalyzed by a class of
enzymes called DNA photolyases. All DNA photolyases contain chemical chromophores which absorb
visible light of specific wavelengths. This light absorption facilitates a series of photochemical reactions
which destroy the cyclobutane ring and restore the
normal 5,6 double-bonded structure of the monomeric pyrimidines (Figure 1).
Microbial photolyases have strong amino acid
sequence homology with a blue-light photoreceptor
from plants that is devoid of photolyase activity. It is
likely that such photoreceptor proteins evolved into
photoreactivating enzymes under the selective pressure of the lethal effects of UV radiation.
Cells have also evolved multiple biochemical pathways by which damaged or inappropriate (such as
[Figure 2 diagram, labels recovered: O6 meG and a cysteine residue in the methyltransferase (before transfer); guanine and S-methylcysteine in the methyltransferase (after transfer); 5' and 3' ends of the DNA strand marked.]
Figure 2 An enzyme activity called O6-methylguanine-DNA methyltransferase (O6-MGT) transfers a methyl group
from the O6 position of guanine in DNA to a cysteine residue in the protein, thereby restoring the native chemistry
of guanine and repairing the base damage by direct reversal.
mispaired bases, once again allowing for the generation of free ends from which a DNA strand can be
degraded, providing for the release of the mispaired
base. Since in principle any base in a mismatched pair
can be either the `correct' or the `incorrect' base, the
special problem that cells had to solve in the evolution
of the repair of mismatched bases was how to distinguish the DNA strand containing the mismatched
base from that with the normal base. Since all mispaired bases that arise during DNA replication have
the `incorrect' base in the newly replicated strand, the
way this problem was solved was to evolve a means of
[Figure residue, labels only: MutS, MutL, MutH, ATP, ADP + Pi, MutH incision at d(GATC) site, nick, CH3 marks on the DNA strands; DNA glycosylase, AP site, 5' AP endonuclease, dRpase, free base excised.]
separated by about 30 nucleotides (Figure 5). It is
not known how these endonucleases discriminate
between such junctions on the damaged and undamaged DNA strands. A reasonable speculation is
that strand specificity derives from the precise architecture of the repairosome when properly bound to
DNA.
The bimodal incision of DNA during NER is common to both prokaryotes and eukaryotes. However,
the biochemical mechanism just described is unique to
eukaryotes. Prokaryotes utilize a somewhat different
mechanism for bimodal incision that involves a group
of proteins which are not conserved in eukaryotes. It
would appear that the elaboration of a more complex
mechanism for RNA polymerase II transcription
initiation in eukaryotes led to the co-option of TFIIH
for NER in such organisms. An interesting implication of the observation that TFIIH proteins are
required for both NER and RNA polymerase II
transcription is the potential for competition for
these proteins during these processes. Recent in vitro
studies in yeast have indeed demonstrated inhibition
of transcription initiation in the presence of active
NER. Assuming that such inhibition also operates in
vivo, it provides another means for limiting the elaboration of mutant transcripts as a consequence of
DNA damage.
The bimodal incision of DNA generates oligonucleotide fragments which incorporate sites of base
damage (Figure 5). Our understanding of the precise
biochemical events that accompany oligonucleotide
excision, and the repair synthesis needed to repair
the gaps created by the excision events, is sketchy.
[Figure 5 diagram, labels recovered: pyrimidine dimer (T=T); damage-specific DNA incising activity (2 nicks); oligonucleotide excision.]
(HNPCC) (Friedberg et al., 1995). Other relationships between repair-defective phenotypes and cancer
predisposition are less clear cut. Individuals defective
in the CSA or CSB genes which are required for
strand-specific NER in human cells suffer from a
debilitating developmental and neurological disorder
called Cockayne syndrome (CS). There is no evidence
that human CS patients are cancer-prone. However,
these individuals are frequently incapacitated from an
early age and often die very young. Thus, it has been
argued that they might not suffer the same level of UV
radiation exposure as XP individuals. Consistent with
this notion, knockout mice defective in the CSB
gene are skin-cancer-prone when exposed to UVB
radiation.
No human patients with hereditary defects in base
excision repair or in the reversal of base damage have
been reported. In the case of base excision repair one
must seriously consider the possibility that an inability to repair the many types of spontaneous base
damage, especially that produced by ROS, for which
this repair mode is required, is incompatible with
normal embryogenesis. Mice which are defective in
O6-alkylguanine-DNA alkyltransferase activity are
viable, but at the time of writing little is known
about their cancer predisposition.
Reference
Excision Repair
DNA Repair Defects and Human Disease
of microsatellite instability, that is, the frequent alteration in the tract lengths of certain short repetitive
nucleotide sequences in some hereditary colorectal
cancers, provided the first indication that the etiology
of these cancers might involve a problem in correcting
errors introduced during DNA replication. Since the
repetitive sequences have a tendency to form strand-slipped structures, with mispairings during replication,
small deletions giving rise to frameshift mutations
are generated. The mismatch repair system normally
corrects these errors. Thus, a defect in mismatch repair
might be expected to result in microsatellite instability. The finding of mismatch repair gene defects in
patients with HNPCC established that these defects
could be the cause of the enhanced cancer incidence.
HNPCC accounts for 5% of all colorectal cancers and
the patients are also at some risk for cancers of the
endometrium, ovary, stomach, and intestine. The correspondence between a mismatch repair gene defect
and susceptibility to cancer provides support for the
original hypothesis of Lawrence Loeb, at University
of Washington, that tumorigenesis is usually promoted by a ``mutator'' phenotype.
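The link between strand slippage in a repeat tract and frameshift mutation can be illustrated with a toy example. This is a simplified sketch, not a model from the article: it assumes a mononucleotide repeat inside a coding sequence, uses an invented sequence, and the function names are hypothetical. A change in tract length shifts the reading frame whenever the number of bases gained or lost is not a multiple of three.

```python
from itertools import groupby

def longest_run(seq):
    """Return (start, length, base) of the longest mononucleotide run."""
    best = (0, 0, "")
    pos = 0
    for base, group in groupby(seq):
        n = len(list(group))
        if n > best[1]:
            best = (pos, n, base)
        pos += n
    return best

def slip_deletion(seq, units=1):
    """Delete `units` bases from the longest run, mimicking a slippage error."""
    start, length, _ = longest_run(seq)
    if units >= length:
        raise ValueError("cannot delete the whole tract")
    return seq[:start] + seq[start + units:]

def is_frameshift(original, mutated):
    """A length change that is not a multiple of three shifts the reading frame."""
    return (len(original) - len(mutated)) % 3 != 0

coding = "ATGGC" + "A" * 8 + "GGTTCTAA"  # hypothetical sequence with an A8 tract
print(longest_run(coding))
print(is_frameshift(coding, slip_deletion(coding, 1)))
print(is_frameshift(coding, slip_deletion(coding, 3)))
```

Losing one base from the tract shifts the frame, whereas losing three does not; mismatch repair normally removes such slippage intermediates before they become fixed.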
Cockayne Syndrome
Trichothiodystrophy
Although the photosensitive form of this rare autosomal recessive disease has some features in common
with CS (e.g., lack of skin cancer predisposition), trichothiodystrophy (TTD) presents several characteristics that are quite unique. These include ichthyosis
(i.e., dry, scaly skin), but most notably brittle hair
and nails, due to reduced sulfur content in the component proteins. It is normally the presence of the amino
acid, cysteine, in certain proteins and the crosslinking
of these proteins through disulfide linkage, that gives
hair its flexibility. As with CS, several of the responsible
Ataxia Telangiectasia
Table 1 Major human genetic diseases involving defects in DNA damage response pathways
Columns: Syndrome; Gene(s); Biological function; Clinical features; Hypersensitivities
Sunlight hypersensitivity; XPV
Growth retardation, mental retardation, premature aging, sunlight hypersensitivity, no increase in skin cancers
Trichothiodystrophy (TTD); UV
ATM; cerebellar ataxia, telangiectasia, neurological deterioration, immunodeficiency, lymphomas
BLM; DNA helicase; sunlight hypersensitivity, growth retardation, leukemias, breast and intestinal cancer; UV
WRN; DNA helicase; premature aging, atherosclerosis, soft tissue sarcomas, melanoma, thyroid cancer; 4-NQO, camptothecin
Rothmund-Thomson syndrome (RTS); RecQ4; DNA helicase; growth deficiency, sunlight sensitivity, osteogenic sarcomas, squamous cell carcinomas; UV
p53; controls apoptosis, cell cycle checkpoints, and nucleotide excision repair; early-onset cancers, including breast, brain, leukemia, sarcomas
Lynch syndrome; MSH2, MLH1; mismatch repair; colon cancer, endometrial cancer
Double-strand-break repair; oxidative damage repair; transcription coupled repair
NBS1; double-strand-break repair; microcephaly, immunodeficiency, lymphomas, neuroblastoma, rhabdomyosarcoma; X-rays
FAA, FAC; growth retardation, bone marrow deficiency, leukemia predisposition, anatomical defects
The gene defect responsible for Li-Fraumeni syndrome (LFS) is p53, a gene that has been shown to be
mutated in over 50% of human cancers. The heterozygote `carriers' of a defective p53 allele do not
appear to have clinical problems or DNA repair
defects. However, when the second allele has been
mutated, or lost, the absence of functional p53 results
in severe problems for the cell. First of all, the p53-controlled
pathway of apoptosis is disengaged, so
severely damaged cells will survive and be at risk for
carcinogenic transformation because of their genomic
instability. (In fact, LFS p53 homozygote defective
cells are often less sensitive to UV than wild-type
cells.) That genomic instability derives from the fact
that p53 is also an important regulator of cell cycle
checkpoints. Thus, as with the situation in AT, the cells
continue to progress through their growth cycle,
rather than pausing to allow time for DNA lesions to
be repaired. Finally, p53 serves an important regulatory function in nucleotide excision repair, and in its
absence some important mutagenic lesions are simply
not repaired. That, of course, is a major contributor to
the genomic instability and the consequent development of tumors. It is of some interest and importance
that most rodent cells do not express the p53-dependent excision repair pathway. Therefore, rodents
are imperfect surrogates for use in carcinogenicity
testing protocols for environmental risk assessment.
DNA Replication
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0367
The faithful replication of DNA is one of the cornerstones of heredity. The process involves a DNA
duplex separating into two strands, with each strand
serving as a template for the synthesis of a new complementary strand. A set of enzymes and associated
DNA Sequencing
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0368
The first methods for sequencing DNA were developed in the mid-1970s by Fred Sanger, and by
Walter Gilbert and Allan Maxam. Subsequently,
Sanger developed a new method that forms the basis
of most DNA sequencing today. This technique uses
dideoxy nucleotides in DNA synthesis reactions.
Because they lack a 3′-hydroxyl group, the dideoxy
nucleotide triphosphates terminate synthesis after
being incorporated into a growing chain. Labeled fragments are visualized after electrophoresis on an acrylamide gel by autoradiography. Four parallel reactions
are run in four different tubes, each tube containing a
small amount of one of the dideoxy nucleotide triphosphates, and all four of the normal deoxy nucleotide
triphosphates. A single-stranded DNA labeled at one
end is used as the template. In each tube all possible
fragment sizes are generated that result from the
DNA Structure
J J Perona
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0370
Figure 1 Schematic drawing of a 3-bp segment of duplex DNA. Note that the 5′ to 3′ direction of the chains is
opposite; the chains are antiparallel to each other. The commonly found A-T and G-C pairs are shown with dotted
lines indicating hydrogen bonds.
the helical ladder are formed by the sugar-phosphate
backbones, which run antiparallel to each other. The
antiparallel orientation of the two strands is most
clearly seen in the opposite orientations of the sugar
rings (Figure 1). The floors of the grooves are formed
by the edges of the stacked base pairs. It is the stacking
of the base pairs upon each other, like coins in a roll,
that provides the strongest driving force for formation
of the duplex from the individual single strands.
The DNA double-helix can form somewhat different three-dimensional conformations depending on
environmental conditions and, to some extent, on the
base sequence. Inside the cell where the DNA is fully
hydrated, a highly regular structure known as the
B-form is adopted (Figure 2). In B-form DNA, the
major groove is approximately twice the width of
the minor groove, and the helical axis runs through
the center of the base pairs. The depth of the two
grooves is approximately the same. The DNA makes
a complete 360° turn in 10 base pair steps, or 36° per
[Figure 2: B-form DNA, with labels for base pairs, major groove, minor groove, and phosphates]
DNA Supercoiling
D M J Lilley
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1255
Lk < 0
It is often convenient to express the level of supercoiling in the form of a density that is effectively
independent of the size of the molecule considered.
Superhelix density (σ) is given by:
These changes in the shape and geometry of the supercoiled molecule lead to different physical properties,
such as sedimentation and frictional properties
(Figure 1).
where R is the gas constant, T is the absolute temperature, and N is the size of the DNA molecule in base
pairs. Thermal fluctuation about a mean linking difference results in a Boltzmann distribution that is
Gaussian; this is readily observed by separating the
topoisomers of a bacterial plasmid by gel electrophoresis in the presence of an intercalator like chloroquine. The energy held in a supercoiled circle can be
substantial. For example, a 4 kb plasmid with a superhelix density of 0.05 has a free energy of supercoiling
of 60 kcal mol⁻¹ at 37 °C.
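The figure quoted above can be checked numerically. The equation itself has not survived in the text here, so the following sketch assumes the widely used empirical form ΔG = (K·R·T/N)·ΔLk² with K of about 1050, and a helical repeat of 10.5 bp per turn for converting superhelix density into a linking difference. Both constants and all function names are assumptions made for illustration, not values taken from this article.

```python
# Rough check of the free energy of supercoiling for a small plasmid.
# Assumed form (not given in the surviving text): dG = K * R * T / N * dLk**2

R_KCAL = 1.987e-3      # gas constant, kcal mol^-1 K^-1
K_EMPIRICAL = 1050.0   # assumed empirical coefficient
HELICAL_REPEAT = 10.5  # assumed bp per helical turn of relaxed DNA


def linking_difference(n_bp, sigma):
    """Linking difference implied by a superhelix density sigma."""
    lk_relaxed = n_bp / HELICAL_REPEAT
    return sigma * lk_relaxed


def supercoiling_free_energy(n_bp, sigma, temp_c):
    """Free energy of supercoiling in kcal/mol under the assumed quadratic form."""
    t_kelvin = temp_c + 273.15
    d_lk = linking_difference(n_bp, sigma)
    return K_EMPIRICAL * R_KCAL * t_kelvin / n_bp * d_lk ** 2


if __name__ == "__main__":
    dg = supercoiling_free_energy(4000, -0.05, 37.0)
    print(f"4 kb plasmid, |sigma| = 0.05, 37 C: {dg:.1f} kcal/mol")
```

Under these assumptions the result falls close to the 60 kcal/mol quoted in the text. Because the dependence on the linking difference is quadratic, the sign of the superhelix density does not affect the energy, and doubling the density quadruples it.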
Any local perturbation that is underwound relative
to B-DNA (such as an unwinding of the helix, or
the formation of a cruciform structure or a section of
left-handed DNA) contributes a negative twist change
that brings about a partial relaxation of the superhelical
stress. This reduction in the free energy of supercoiling
offsets the free energy of formation of the new DNA
structure, and helps stabilize the altered conformation. Since the free energy of supercoiling increases
quadratically with linking difference (equation (6)),
terminus and the protein, usually as a phosphotyrosine linkage. Thus, the topoisomerases are closely
related to the site-specific recombinases.
Type I topoisomerases relax DNA supercoiling by
means of a single-strand break. The linking number is
then changed either by allowing a swivel to occur
about the nick (eukaryotic enzymes), or by passing
the unbroken strand through the break before the
phosphodiester linkage is restored (most eubacterial
enzymes), thereby leaving a permanent change in the
linking number of the DNA circle. No energy transduction is involved in the function of topoisomerase I
(excluding reverse gyrase). It is not required for remaking the phosphodiester linkage as the free energy
for this is preserved by a temporary covalent DNA-protein
linkage, and the position of equilibrium is
simply allowed to run 'downhill'; i.e., these enzymes
just relax supercoiling toward the equilibrium state
under the prevailing conditions. The eubacterial
topoisomerases I will specifically relax negative supercoiling, while eukaryotic enzymes can relax either
negative or positive supercoiling.
Type II topoisomerases function by passing duplex
DNA through a double-stranded break, thereby altering Lk by steps of 2. DNA gyrase is a type II
topoisomerase of E. coli. This enzyme has the special
property of coupling the hydrolysis of ATP to the
introduction of negative supercoiling into DNA.
Thus, DNA gyrase is an A2B2 tetramer, consisting of
specialized subunits for topoisomerization and energy
transduction. Type II topoisomerases are found in all
cells and even some viruses, but DNA gyrase is the
only such enzyme that is known to induce negative
supercoiling.
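The step sizes described above have a simple consequence that a short sketch can make concrete. The code below treats the linking difference as a whole number for simplicity and models relaxation as repeated steps of 1 (type I) or 2 (type II) toward zero, and gyrase as repeated steps of −2. The function names and the integer simplification are assumptions for illustration only.

```python
# Toy model of how topoisomerases change the linking difference (dLk).
# Type I enzymes change Lk in steps of 1, type II enzymes in steps of 2.


def relax(delta_lk, step):
    """Step delta_lk toward zero by `step` per cycle.

    Returns (final_delta_lk, cycles_used). Relaxation stops when another
    step would overshoot zero.
    """
    cycles = 0
    while abs(delta_lk) >= step:
        delta_lk += step if delta_lk < 0 else -step
        cycles += 1
    return delta_lk, cycles


def gyrase(delta_lk, cycles):
    """Each ATP-driven gyrase cycle lowers the linking number by 2."""
    return delta_lk - 2 * cycles


if __name__ == "__main__":
    print("type I  on dLk = -7:", relax(-7, 1))
    print("type II on dLk = -7:", relax(-7, 2))
    print("gyrase, 5 cycles from relaxed:", gyrase(0, 5))
```

In this simplified picture a type II enzyme acting alone preserves the parity of the linking number, so an odd linking difference cannot be brought exactly to zero, whereas a type I enzyme can reach any value.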
The balance between the opposing activities of
supercoiling (by gyrase) and relaxation (by topoisomerase I) creates a steady-state level of supercoiling,
demonstrated in Salmonella by studying mutants in
the relevant genes.
DNA Synthesis
J Read and S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1861
Dobzhansky, Theodosius
R C Lewontin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0373
Dogs
See: Canine Genetics
Dominance
M A Cleary
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0375
Dominance is the manifestation of the phenotype associated with a particular gene allele even when only one
copy of that allele is present in an organism's genome. Because there is a 1 in 2 chance that a dominant allele carried by a heterozygous parent
will be inherited, and only one copy of the allele is
necessary for the phenotype, on average 50% of that parent's offspring will
display the phenotype and any associated disease.
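The 50% figure can be reproduced by enumerating gamete combinations. The sketch below assumes a single locus with a dominant allele written "A" and a recessive allele written "a"; the notation and function name are illustrative choices, not part of the original entry.

```python
from fractions import Fraction
from itertools import product


def fraction_showing_dominant(parent1, parent2, dominant="A"):
    """Fraction of offspring carrying at least one copy of the dominant allele.

    Each parent is a two-character genotype string such as "Aa"; every
    allele is assumed to be transmitted with equal probability.
    """
    combinations = list(product(parent1, parent2))
    showing = sum(1 for pair in combinations if dominant in pair)
    return Fraction(showing, len(combinations))


if __name__ == "__main__":
    print("Aa x aa:", fraction_showing_dominant("Aa", "aa"))
    print("Aa x Aa:", fraction_showing_dominant("Aa", "Aa"))
    print("AA x aa:", fraction_showing_dominant("AA", "aa"))
```

The first cross, a heterozygous parent with a parent lacking the allele, is the case the 1 in 2 figure refers to; the other crosses show that the expected fraction changes with the parental genotypes.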
See also: Codominance; Incomplete Dominance
Dosage Compensation
M F Lyon
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0377
Double-Minute Chromosomes
G Levan
Double-Strand Break Repair Model
P J Hastings
Figure 2 Three modes of resolution of the double Holliday junction depicted in Figure 1(vi). DNA strand
conventions are as in Figure 1. Solid arrows indicate progression. Open arrows indicate endonuclease action. Half
arrows show the direction of branch migration. Topoisomerase molecules are shown as circular arrows around DNA
molecules. The diagram is explained in the text.
general scheme for recombination called the double-strand break repair model.
Figure 1 shows a basic form of the model.
The double-strand break shown in part (i) may be
Holliday junctions with random cleavage. This is illustrated in Figure 2A. If the two junctions were
cleaved in the same plane, there would be no crossover. Cleavage of the two junctions in different planes
would result in a crossover. This random process
would be expected to yield an equal number of crossovers and non-crossovers. This is unsatisfactory
because these are observed to be unequal. Notably,
when double-strand break repair occurs in mitosis,
less than 10% of events give rise to crossovers.
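The expectation of equal numbers of crossovers and non-crossovers follows from counting the possible cleavage combinations. The sketch below assumes each junction is cut independently in one of two planes, with a crossover arising only when the two planes differ; the function and parameter names are hypothetical.

```python
from fractions import Fraction


def crossover_probability(p_first, p_second):
    """Probability of a crossover from two independently cleaved junctions.

    p_first and p_second are the probabilities that the first and second
    junction, respectively, are cut in plane 1 (otherwise plane 2).
    A crossover results only when the two junctions are cut in different planes.
    """
    return p_first * (1 - p_second) + (1 - p_first) * p_second


if __name__ == "__main__":
    half = Fraction(1, 2)
    print("random cleavage:", crossover_probability(half, half))
    print("both junctions always cut in plane 1:", crossover_probability(1, 1))
```

With random cleavage the result is one half, which is the prediction the text describes as unsatisfactory given the much lower crossover frequency observed in mitosis.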
Two other modes of resolution have been suggested. Figure 2B illustrates the removal of the double
Holliday junction by the concerted action of a topoisomerase rotating the lengths between the junctions.
This can cause the distance between the junctions to
grow shorter, and the junctions to approach each other
until they disappear, resulting in a non-crossover outcome. A third method of resolution that would lead to
a deficiency of crossovers is to resolve one junction
always in the plane that gives nonrecombinant molecules, and then have the remaining junction migrate
across the nicks left where the first junction was cut
(Figure 2C). This results in a non-crossover outcome.
There may be other mechanisms of resolution yet to
be described, and it is possible that more than one
mode of resolution occurs.
The resolution mechanisms shown in Figure 2B
and 2C both give rise to an interesting configuration
of heteroduplex DNA called trans-heteroduplex. A
length of heteroduplex DNA spans the original position of the double-strand break, but the linkage relationships of the parents are not maintained, with
recombination within the heteroduplex occurring
at the site of the original break. The distribution of
markers predicted by this structure has been observed
genetically.
Down Syndrome
J L Tolmie
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0381
Frequency
The incidence of Down syndrome at birth is approximately 1 in 750. However, since the majority of
trisomy 21 pregnancies spontaneously miscarry, the
incidence at conception must be higher, perhaps as
high as 1 in 150. The chance of a trisomy 21 conception
rises with advancing maternal age. Thus, in affluent
countries where there is a population-based, antenatal
screening program for trisomy 21 and a trend for
women to postpone having children until the fourth
decade, the number of Down syndrome diagnoses has
increased.
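The two incidence figures above imply a rough survival fraction for trisomy 21 conceptions. The calculation below is a minimal sketch that assumes the difference between the two figures reflects only the loss of trisomy 21 pregnancies and ignores the loss of other conceptions; the function name is illustrative.

```python
from fractions import Fraction


def implied_survival(incidence_at_birth, incidence_at_conception):
    """Fraction of affected conceptions that reach birth, under the stated assumption."""
    return incidence_at_birth / incidence_at_conception


if __name__ == "__main__":
    survival = implied_survival(Fraction(1, 750), Fraction(1, 150))
    print("implied survival to birth:", survival)
    print("implied loss before birth:", 1 - survival)
```

On these figures about one in five trisomy 21 conceptions would reach birth, consistent with the statement that the majority miscarry.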
Cytogenetics
In 95% of cases, Down syndrome is due to the presence
of an extra, free chromosome 21 and the karyotype is
given as 47,XX,+21 or 47,XY,+21. In 4% of cases
the phenotype is identical but the extra chromosome
21 is not free; instead, it is attached to another chromosome, commonly chromosome 13 or chromosome 14, in a Robertsonian translocation. In the
remaining 1% of cases, Down syndrome mosaicism
is present with the affected individual's cells comprising two populations, one with a normal karyotype
Clinical Aspects
Infants who are affected by Down syndrome are
usually diagnosed very soon after birth because they
have reduced body tone in combination with minor
features including flat occiput, upslanting palpebral
fissures, epicanthic folds, large or slightly protruding
tongue, single palmar crease, small fifth finger, and
wide gap between first and second toes. More importantly, these infants also have an increased chance of
being affected by one or several different serious congenital malformations or illnesses. Thus, about one in
five affected children die before age 5 years and two
in five are affected by conditions such as congenital
heart defect, bowel atresia, or leukemia. For most but
not all families of an affected child, cognitive impairment is the most important complication of the syndrome. This is always present, although of variable
severity. In general, the type of cognitive impairment
is not specific to trisomy 21. Delay in development
is often evident from early infancy and when IQ is
measured, scores indicate moderate to severe retardation (IQ range 10–70). Thus, Down syndrome individuals achieve variable levels of independence in
adult life but only a minority are fully independent
in all daily living skills. In mid to late adulthood there
is increased prevalence of dementia. Neuropathological studies on brains of deceased trisomy 21 individuals over age 40 years always show microscopic
changes characteristic of Alzheimer disease, but only
about half of those individuals have clinical evidence
of dementia.
Despite increased mortality and morbidity in trisomy 21, the life expectancy of affected children has
greatly lengthened and nowadays nearly 50% of
adults with Down syndrome survive to age 60 years.
The quality of life of Down syndrome individuals has
also greatly improved, with the most important factors
being care of affected individuals at home and their
participation in the wider community. Educational
opportunities have improved and the general public
have better understanding of the condition due to the
Down, Up Mutations
See: Promoters
Downstream
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1829
Drosophila melanogaster
C J O'Kane
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1698
Genetic Tools
Classical Genetics
Transposable Elements
Genetic Screens
formation. These provided enough material for a generation of drosophilists to address the major processes
of embryonic pattern formation, and the cloned genes
allowed identification and comparative study of important homologs in other species, including humans.
Reverse Genetics
Drug Resistance
J Parker
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0382
Duchenne Muscular Dystrophy (or Meryon's Disease)
A E H Emery
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0383
Clinical Features
Duchenne muscular dystrophy (DMD) (or Meryon
disease) is inherited as an X-linked recessive trait and
is the commonest form of muscular dystrophy with a
birth incidence of around 1 in 3500. Its clinical features, familial incidence, and muscle pathology were
first described in detail by the English physician
Edward Meryon in 1852 and some years later by
Duchenne de Boulogne.
It is characterized by enlarged calves (hence the old
name pseudohypertrophic muscular dystrophy) and
progressive muscle wasting and weakness, beginning in
early childhood and mainly affecting the proximal
limb girdle musculature. Affected boys become chairbound by around age 12 and often succumb by the age
of 20, usually from cardiac involvement or respiratory
failure.
The responsible gene is located at Xp21 and its product is termed 'dystrophin'. This remains the largest
gene associated with a disease (2.4 Mb) and takes over
24 h to be transcribed. It consists of 85 exons with
introns making up 98% of the gene. There are three
full-length transcripts plus five shorter transcripts
generated by internal promoters. In skeletal muscle,
dystrophin is associated with other membrane proteins (Figure 1), has a molecular mass of 427 kDa,
and consists of 3685 amino acids.
Figure 1 Muscle membrane proteins (labeled: extracellular matrix, laminin, dystroglycans, sarcolemma, dystrobrevin, syntrophins, dystrophin, F-actin, sarcoplasm). Duchenne muscular dystrophy results from an absence of dystrophin; other
dystrophies are caused, for example, by deficiencies of a particular sarcoglycan or merosin.
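The numbers quoted for the dystrophin gene and protein can be related to one another with some simple arithmetic. The sketch below uses only the figures given in the text (2.4 Mb, 24 h, 3685 amino acids, 427 kDa); treating 24 h as the transcription time gives an upper bound on the average elongation rate, since the text says transcription takes longer than that. Variable and function names are illustrative.

```python
# Back-of-envelope figures derived from the values quoted in the text.

GENE_LENGTH_BP = 2_400_000     # 2.4 Mb
TRANSCRIPTION_HOURS = 24       # text: "over 24 h", so this is a lower bound on time
PROTEIN_RESIDUES = 3685
PROTEIN_MASS_DA = 427_000      # 427 kDa


def average_elongation_rate(length_bp, hours):
    """Average nucleotides transcribed per second over the whole gene."""
    return length_bp / (hours * 3600)


def coding_fraction(residues, length_bp):
    """Share of the gene needed to encode the protein (3 nt per residue)."""
    return residues * 3 / length_bp


def mean_residue_mass(mass_da, residues):
    """Average mass per amino acid residue in daltons."""
    return mass_da / residues


if __name__ == "__main__":
    rate = average_elongation_rate(GENE_LENGTH_BP, TRANSCRIPTION_HOURS)
    print(f"average elongation rate: at most about {rate:.1f} nt/s")
    print(f"coding fraction of gene: {coding_fraction(PROTEIN_RESIDUES, GENE_LENGTH_BP):.2%}")
    print(f"mean residue mass: {mean_residue_mass(PROTEIN_MASS_DA, PROTEIN_RESIDUES):.1f} Da")
```

The protein-coding sequence accounts for well under 1% of the gene, which is in line with the statement that introns make up 98% of it.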
Mutations that disrupt the reading frame of the gene
(out of frame) result in a complete absence of dystrophin in DMD, whereas mutations that are in-frame
result in a partial deficiency of the protein and the
clinically similar but milder condition of Becker
muscular dystrophy. Roughly two-thirds of cases are
caused by large deletions, the remainder being due to a
variety of point mutations, small deletions, or duplications. Rarely, two different mutations occur in the
same family and may result from the insertion of a
transposon into the dystrophin locus.
Treatment
Steroids may slow the progression of the disease for a
time but no drug has yet been found that affects the
long-term course of the disease. Some form of gene
therapy offers hope for the future, either by using a
viral vector carrying the normal dystrophin gene to
transform muscle cells in vivo, or by upregulating a protein (such as utrophin) to compensate for the deficiency of dystrophin. Stem cell therapy now also
seems likely to be another approach to treatment
in the future. However, it is always possible that a drug
may be found that interrupts the pathogenic pathways
in some way and thereby ameliorates the disease
process.
Duffy Glycoprotein
The Duffy glycoprotein, also known as the Duffy
antigen receptor for chemokines (DARC), is a receptor for a variety of chemokines including interleukin-8
(IL-8) and melanoma growth stimulatory activity
(MGSA). It has a molecular mass of 36–46 kDa and
consists of a 336-amino-acid polypeptide that traverses
the membrane seven times, with a 63-amino-acid extracellular N-terminal domain containing two potential
N-glycosylation sites, and a cytoplasmic C-terminal
domain (Figure 1). This structure is characteristic
of one of the largest gene families in the mammalian
genome, the seven-transmembrane-segment class
of the G-protein-coupled superfamily of receptors,
which bind many different ligands. DARC is present
on endothelial cells lining postcapillary venules
throughout the body. It has also been detected on
some other vascular endothelial cells, on epithelial
cells of renal collecting ducts and pulmonary alveoli,
and on Purkinje neurons of the cerebellum.
The function of DARC is not known. It has been
suggested that it may act as a clearance receptor for
inflammatory mediators and that Duffy-positive red
cells function as a 'sink' or as scavengers for the
removal of unwanted chemokines. If so, this function
must be of limited importance as DARC is not present
on the red cells of most Africans.
Phenotype    Genotype             Frequencies (%)
                                  Europeans    Africans
Fy(a+b−)     Fya/Fya or Fya/Fy    20           10
Fy(a+b+)     Fya/Fyb              48           3
Fy(a−b+)     Fyb/Fyb or Fyb/Fy    32           20
Fy(a−b−)     Fy/Fy                0            67
Figure 1 Likely conformation of the Duffy glycoprotein (DARC) in the red cell membrane, showing the
seven transmembrane domains, the N-glycosylated
N-terminal extracellular domain, and the C-terminal
cytoplasmic domain. The arrow shows the position of
the Fya/Fyb polymorphism.
Dulbecco, Renato
W Eckhart
Both interactions could be studied in vitro, using cultured animal cells. These discoveries set the stage for
molecular characterization of DNA tumor virus genes
and the role of viral genes in cell transformation.
In 1963 Dulbecco moved to the newly established
Salk Institute for Biological Studies in San Diego.
Over the next 10 years his laboratory used the small
DNA tumor viruses, polyoma and SV40, to explore
mechanisms of neoplastic cell transformation, producing a series of fundamental insights into the process.
He and his collaborators demonstrated that viral
DNA persisted in transformed cells, and was integrated into cellular DNA. They showed that there
were two classes of viral genes, 'early' and 'late',
which were transcribed from opposite strands of the
viral DNA. Both classes of genes were expressed during productive infection, but only the early genes were
expressed in transformed cells. They provided evidence that the activity of viral genes led to transcription of cellular genes. Using viral mutants, they
showed that viral genes could influence the growth
properties of transformed cells. In recognition of this
work, Dulbecco was awarded the Nobel Prize for
Physiology or Medicine in 1975, together with his
former associates, David Baltimore and Howard
Temin, who were honored for their discovery of
reverse transcriptase.
From 1972 to 1977 Dulbecco served as Deputy
Director of Research at the Imperial Cancer Research
Fund Laboratories in London, where he and his colleagues used antisera directed against polyoma tumor
antigens to characterize virus-specific proteins in the
plasma membrane of infected and transformed cells.
These experiments led to the identification of the polyoma middle T antigen, the viral protein that is primarily responsible for cell transformation by polyoma.
Dulbecco returned to the Salk Institute in 1977 to
pursue a new interest in mammary cell biology and
breast cancer. At the same time, he continued to reflect
on the implications of genetic discoveries for understanding cancer. These thoughts led him to propose an
international undertaking to sequence the human genome, described in an article published in Science, in
1986. He interrupted his laboratory research to
assume the presidency of the Salk Institute in 1988,
serving with distinction in that post until 1993. Thereafter he joined the Istituto di Tecnologie Biomediche
Avanzate in Milan, where he directed the Italian Genome Project while continuing to study genes involved
in mammary cell differentiation. At present he divides
his time between the institute in Milan and the Salk
Institute, where he is Distinguished Research Professor and President Emeritus.
See also: Delbrück, Max; Luria, Salvador
Dunn, L.C.
K Artzt
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0386
Dunn thought and wrote broadly about scientific history, philosophy, and the human condition. He was
indeed the renaissance geneticist.
Dwarfism
K M Beckingham
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0389
Dwarfism, in Mice
K Douglas and S A Camper
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0388
Mouse Mutants
Spontaneous mutations have been identified in genes
acting at various levels of growth regulation. The
little (lit) mouse mutation is an autosomal recessive
mutation resulting in proportionate dwarfism visible
at 2 weeks of age. Adult lit mice have a body weight
two-thirds that of control littermates. Female mice
are fertile but often fail to nurse their first litters,
whereas male mice have reduced fertility. Dwarfism
in little mice is due to a missense mutation in
the GHRH receptor (Ghrhr) gene (Godfrey et al.,
1993). The mutation substitutes a glycine residue for
a conserved aspartic acid residue within the ligand-binding domain of the receptor. This amino acid
substitution greatly reduces the sensitivity of the
receptor to GHRH. Therefore, little mice do not
receive the hypothalamic GHRH signal to secrete
GH. Serum levels of GH and IGF-I are low and
the mice are dwarfed. In addition, lit/lit pituitaries
have fewer somatotropes (GH-producing cells), because GHRH normally stimulates proliferation of
these cells.
The Ames dwarf (df ) mouse mutation is an autosomal recessive mutation resulting in more severe
References
Adkison LR, Taylor S and Beamer WG (1990) Mutant gene-induced disorders of structure, function and thyroglobulin
synthesis in congenital goitre (cog/cog) in mice. Journal of
Endocrinology 126: 51–58.
Camper SA, Saunders TL, Katz RW and Reeves RH (1990) The
Pit-1 transcription factor gene is a candidate for the Snell
dwarf mutation. Genomics 8: 586–590.
Godfrey P, Rahal JO, Beamer WG et al. (1993) GHRH receptor
of little mice contains a missense mutation in the extracellular
domain that disrupts receptor function. Nature Genetics 4:
227–231.
Gu W-X, Du G-G, Kopp P et al. (1995) The thyrotropin (TSH)
receptor transmembrane domain mutation (Pro556-Leu) in
the hypothyroid hyt/hyt mouse results in plasma membrane
targeting but defective TSH binding. Endocrinology 136:
3146–3153.
Kim PS, Hossain SA, Park YN et al. (1998) A single amino acid
change in the acetylcholinesterase-like domain of thyroglobulin causes congenital goiter with hypothyroidism in
the cog/cog mouse: a model of human ER storage diseases.
Proceedings of the National Academy of Sciences, USA 95:
9909–9913.
Li S, Crenshaw EB, Rawson EJ et al. (1990) Dwarf locus mutants
lacking three pituitary cell types result from mutations in the
POU-domain gene Pit-1. Nature 347: 528–533.
Sornson MW, Wu W, Dasen JS et al. (1996) Pituitary lineage
determination by the Prophet of Pit-1 homeodomain factor
defective in Ames dwarfism. Nature 384: 327–333.
Stein SA, Oates EL, Hall CR et al. (1994) Identification of a
point mutation in the thyrotropin receptor of the hyt/hyt
hypothyroid mouse. Journal of Molecular Endocrinology 8:
129–138.
Watkins-Chow DE and Camper SA (1998) How many homeobox genes does it take to make a pituitary gland? Trends in
Genetics 14: 284–290.
Dynamic Mutations
R I Richards and G R Sutherland
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1355
Table 2
Repeat motif   Disease/fragile site         Gene         Normal   Affected
AGC                                         AR           11–31    40–62
                                            DMPK         5–35     >80–2000
                                            Huntingtin   9–34     37–120
                                            Ataxin       19–38    40–81
                                            SCA2         22–28    37–50
                                            MJD1         13–36    68–79
                                            CACNL1A4     4–20     21–30
                                            SCA7         7–17     38–130
                                            Atrophin     7–23     49–75
                                            SCA8         16–37    107–?
CCG            FRAXA                        FMR1         6–55     >230
               FRAXE                        FMR2         6–25     ? (>230)†
               FRAXF                                     6–29     ? (>230)†
               FRA11B (Jacobsen)            CBL2         7–32     ? (>230)†
               FRA16A                                    16–50    ? (>230)†
AAG            Friedreich ataxia            Frataxin     7–34     >200–1200
12mer          Myoclonic epilepsy 1         Cystatin B   2–3      35–70
24mer          Creutzfeldt-Jakob disease    PrP          5        6–14
33mer          FRA16B                                    7–30     ? (>150)†
42mer          FRA10B                                    4–75     >75
*Copy numbers between the normal and affected ranges can act as premutations, expanding in subsequent generations to
give full mutations.
†All observed alleles are above this copy number but the threshold is unknown.
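The relationship between copy number and allele class described in the table footnotes can be expressed as a small classifier. The sketch below is illustrative only: the function name is hypothetical, and the example ranges are assumed readings of two table rows (11–31 normal and 40–62 affected for the first AGC row; 6–55 normal and more than 230 affected for FRAXA), in which the range separators had to be inferred.

```python
def classify_allele(copies, normal, affected):
    """Classify a repeat copy number against tabulated ranges.

    normal   -- (low, high) inclusive range of normal copy numbers
    affected -- (low, high) inclusive range; high may be None for an
                open-ended range such as ">230"
    """
    normal_low, normal_high = normal
    affected_low, affected_high = affected
    if normal_low <= copies <= normal_high:
        return "normal"
    if copies >= affected_low and (affected_high is None or copies <= affected_high):
        return "affected"
    if normal_high < copies < affected_low:
        # Footnote: intermediate copy numbers can act as premutations.
        return "intermediate (possible premutation)"
    return "outside tabulated ranges"


if __name__ == "__main__":
    for n in (20, 35, 45):
        print(n, classify_allele(n, normal=(11, 31), affected=(40, 62)))
    print(300, classify_allele(300, normal=(6, 55), affected=(231, None)))
```

Passing the ranges as arguments keeps the function independent of any particular locus, so thresholds can be revised without changing the logic.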
Gain of Function
Where the expanded repeat is located within the coding region the repeat usually encodes polyglutamine.
Instances where this is not the case (rare cases of
Creutzfeldt-Jakob disease, CJD, and some polyalanine
associated disorders) may not constitute the same form
of dynamic mutation. The finding that most of the
polyglutamine expansion diseases have a common
pathological copy number threshold suggests that a
common pathway is involved, even though the proteins
involved appear to be quite unrelated (Table 2). The
polyglutamine tracts are aggregated by crosslinking
into higher molecular weight forms. This aggregation
involves transglutamination of the polyglutamine.
There is clear experimental evidence for polyglutamine
tracts inducing apoptosis either in vivo or in vitro.
In addition, several studies have shown that the
polyglutamine aggregates in the form of nuclear inclusions in the affected cells, suggesting a pathological role
in the neurodegeneration process. Experiments using
transgenic Drosophila have indeed shown that the
expanded polyglutamine tract is able to initiate the
neurodegenerative disease and while nuclear inclusions
do form they do not appear to be sufficient to account
for the phenotype. An inhibitor of apoptosis does
restrict the pathology, providing evidence that the polyglutamine does exert its effects by inducing inappropriate apoptosis in the affected neurons. Since these
disorders represent a gain of function, the resultant
inheritance pattern is dominant.
[Figure 1 panels: (A) Loss of function (loss of transcript): EPM1, FRAXA, FRAXE, FRDA. (B) Gain of function (polyglutamine disorders): SBMA, DRPLA, HD, SCA1-3, 6, 7.]
Figure 1 Effect of position of repeat expansion on pathway from genotype to phenotype. The location of the
expanded repeats (indicated by shaded triangles) with respect to the affected gene transcript. Each of these genes has
more than one intron, although only one is shown for the sake of clarity. (A) Each of the loss-of-function pathways
involves a loss or reduction in the amount of mRNA for the affected gene. For EPM1 the repeat is located in the
promoter region and expansion appears to affect transcription. The FRAXA and FRAXE repeats are located in the 5′
untranslated region, with expanded alleles causing methylation of both the repeat and the promoter region, resulting
in abolition of transcription. The FRDA repeat is located within an intron and seems to exert its effect by reducing
the amount of mRNA, presumably by affecting RNA splicing and/or stability. (B) The diseases due to an expanded
repeat coding for polyglutamine are likely to have a common pathway from genotype to phenotype. They exhibit
dominant inheritance characteristics consistent with a gain of function.
adjacent genes, perhaps by inducing inappropriate
expression. Overall the pattern of inheritance is
dominant and therefore the molecular pathways
would appear to require gain-of-function properties
although it is possible that dosage effects may contribute to the phenotype.
Conclusions
Dynamic mutation involves a novel molecular
mechanism which can account for the non-Mendelian
phenomena, such as anticipation, exhibited by certain
human genetic diseases. Dynamic mutation also represents the mechanism whereby one class of fragile site
is generated from loci which normally exhibit copy
number polymorphism of repeat sequences. The
molecular pathway from genotype to phenotype is
largely dependent upon the location of the expanded
repeat with respect to any gene that it affects and can
therefore result in either recessive, X-linked, or
dominant inheritance characteristics.
Further Reading
Koob MD, Moseley ML, Schut LJ, Benzow KA, Bird TD, Day JW and Ranum LPW (1999) An untranslated CTG expansion causes a novel form of spinocerebellar ataxia (SCA8). Nature Genetics 21: 379–384.
Richards RI and Sutherland GR (1992) Dynamic mutations: A new class of mutations causing human disease. Cell 70: 709–712.
Richards RI and Sutherland GR (1996) Repeat offenders: simple repeat sequences and complex genetic problems. Human Mutation 8: 1–7.
Richards RI and Sutherland GR (1997) Dynamic mutation: possible mechanism and significance in human disease. Trends in Biochemical Sciences 22: 432–436.
Sutherland GR and Richards RI (1995) Molecular basis of fragile sites in human chromosomes. Current Opinion in Genetics and Development 5: 323–327.
Sutherland GR, Baker E and Richards RI (1998) Fragile sites still breaking. Trends in Genetics 14: 501–506.
Warrick JM, Paulson HL, Gray-Board GL et al. (1998) Expanded polyglutamine protein forms nuclear inclusions and causes neural degeneration in Drosophila. Cell 93: 939–949.
Dysmorphology
D Donnai
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0390
Dysmorphology is the medical and scientific discipline concerned with the study of birth defects and
syndromes, and with diagnosis, investigation, and
counseling of affected individuals and their families.
Dysmorphologists have training in pediatrics and
clinical genetics and work closely with their colleagues in diagnostic and research laboratories.
Making a Diagnosis
When a baby is born with birth defects the parents
have many questions: What is the problem?; What
does it mean for our baby?; Why did it happen?; Will
it happen again? In order to answer these questions
properly and to manage and treat the baby appropriately a precise diagnosis is needed.
Approach to Diagnosis
There are over 1000 described dysmorphic syndromes and many more if syndromes due to small
Utility of a Diagnosis
Establishing a precise diagnosis is important for the
family, for clinical, social, and educational management, and for research.
Once a diagnosis has been established the underlying cause of the problem (where known) can be
discussed with the family and information given
about prognosis and risks of recurrence including
options for prenatal diagnosis. For many conditions
there are family support groups. The provision of
social and educational care for the special needs of a
child with a syndrome is helped by knowledge of the
precise condition a child has.
A precise diagnosis aids early clinical management
and anticipatory care. For example some syndromes
may be associated with visual or hearing deficits
which, if identified early, can be treated. Some malformation syndromes are lethal, and early diagnosis
may lead to a decision to encourage the parents and
family to maximize their time with the child rather
than to pursue surgical correction of structural malformations,
which will not affect the eventual outcome.
For many conditions the underlying mechanism is
not fully understood and research is continuing. The
dysmorphologist, by making precise diagnoses and
ensuring a group to be investigated is as homogeneous
as possible, maximizes the chance for successful
Dystrophin
R L Somerville
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0391
E
E.coli
See: Escherichia coli
EBNA
G Klein
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1563
Ectoderm
See: Developmental Genetics
Ectodermal Dysplasias
F M Pope
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0394
Pathogenesis
Unlike many other genodermatoses, such as the ichthyoses, Ehlers–Danlos syndrome (EDS), pseudoxanthoma elasticum, epidermolysis bullosa, and cutis
laxa in which particular structural components are
Autosomal recessive
Other Variants
References
Suzuki K, Hu D, Bustos T et al. (2000) Mutations of PVRL1 encoding a cell–cell adhesion molecule/herpes virus receptor in cleft lip-palate ectodermal dysplasia. Nature Genetics 25: 427–430.
Zonana et al. (2000) American Journal of Human Genetics 67: 1555–1662.
Editing and Proofreading in Translation
Translation, like the other steps in the flow of information from DNA to protein, is very accurate; the
overall frequency of all types of translational errors is
clearly less than one mistake per 1000 codons read.
Part of this accuracy is achieved through various
energy-dependent editing or proofreading functions
which operate at the different steps involved in translation. These functions prevent the formation of a
defective protein by preventing the formation of an
error-containing intermediate or rejecting such an
intermediate at some step in the pathway.
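As a rough illustration of what an error frequency of this size implies, the following Python sketch estimates the fraction of protein chains that would be made without any translational error. It assumes, as a simplification not stated in this entry, that errors occur independently at each codon with the same probability; the function name and the 300-codon chain length are illustrative choices only.

```python
def error_free_fraction(per_codon_error: float, length: int) -> float:
    """Probability that `length` codons are all read correctly.

    Assumes (for illustration only) independent, identical error
    probabilities at every codon.
    """
    if not 0.0 <= per_codon_error <= 1.0:
        raise ValueError("per_codon_error must lie between 0 and 1")
    if length < 0:
        raise ValueError("length must be non-negative")
    return (1.0 - per_codon_error) ** length


if __name__ == "__main__":
    # Hypothetical chain length; 1e-3 is the upper bound quoted in the text.
    for error_rate in (1e-3, 1e-4):
        fraction = error_free_fraction(error_rate, 300)
        print(f"error rate {error_rate:g}: {fraction:.3f} of chains error-free")
```

Under these assumptions, even an error frequency of one per 1000 codons would leave roughly a quarter of 300-residue chains with at least one misincorporated amino acid, which suggests why additional editing and proofreading steps are worth their energy cost.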
The terms 'editing' and 'proofreading' are often used
as synonyms to describe mechanisms for preventing
errors in translation. Such mechanisms can operate
during the formation of aminoacyl-tRNA or during
the selection of an aminoacyl-tRNA by the ribosome.
In this entry, the term editing refers to such events
during aminoacylation and proofreading refers to
those that occur on the ribosome. Although these
mechanisms work to increase accuracy, it is important
to note that translation in normal cells is not maximized for accuracy, since mutations exist that increase
accuracy above that seen in the wild-type. However,
such mutations lead to a decrease in the growth rate,
indicating that translation has been optimized for a balance between accuracy and growth rate rather than for accuracy alone.
Editing in Aminoacylation
Aminoacylation, the attachment of an amino acid to a
tRNA, is typically a two-step process catalyzed by the
aminoacyl-tRNA synthetases. The first step, termed
'activation,' is the formation of an aminoacyl-AMP
(aminoacyl-adenylate) on the enzyme through the
hydrolysis of ATP. The second step is the transfer of
the activated amino acid residue from the adenylate to
Editing at Ribosome
During elongation, aminoacyl-tRNAs are brought to
a site on the ribosome containing the next codon to be
translated (the A-site) as a ternary complex containing
the aminoacyl-tRNA, guanosine triphosphate (GTP),
and an elongation factor. As in the case of aminoacylation, the initial selection of the aminoacyl-tRNA
on the ribosome would give approximately a 100-fold preference for the correct versus a nearly correct
codon–anticodon interaction. This is, of course, far
lower than the observed accuracy of protein synthesis.
Accuracy is apparently enhanced by proofreading, a
rechecking of the codon–anticodon interaction. As in the
case with editing by aminoacyl-tRNA synthetases,
proofreading is energy driven, in this case by the
hydrolysis of GTP. Every time EF-Tu delivers an
aminoacyl-tRNA to the A-site, there will be GTP
Effective Population Number
J F Crow
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1421
sampling process is $p(1 - p)/(2N)$, where p is the frequency of the allele of interest in the parent generation. This expression formed a part of the general
mathematical expression for allele frequency change
formulated by Wright. Real populations do not conform to this idealization. The number of males and
females may differ, the population size may fluctuate
over time, and parents are not equally viable and fertile. To accommodate these problems, rather than
write more elaborate equations, Wright introduced
the concept of effective population number, Ne. Ne is
a number calculated from an actual population of size
N, that when substituted into the Wright equations
leads to the same amount of random genetic drift as is
occurring in the actual population. The population
size is most appropriately assessed at the beginning
of the reproductive period.
When the number of females, Nf, differs from the
number of males, Nm, the effective population number
is given by $1/N_e = 1/(4N_f) + 1/(4N_m)$. Notice that when
Nf and Nm are each N/2, $N_e = N$ as expected. If the
numbers of the two sexes differ, then the sex with the
smaller number dominates the value of Ne. To consider an extreme example, the effective size of a polygamous population with one male and 100 females is
about 4, much closer to the number of the rarer sex.
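The unequal-sex relation, 1/Ne = 1/(4Nf) + 1/(4Nm), can be checked numerically with a minimal Python sketch. The function name is a hypothetical choice for illustration; the two cases evaluated are the ones mentioned in the text (equal sexes, and one male with 100 females).

```python
def ne_unequal_sexes(n_females: float, n_males: float) -> float:
    """Effective population number from 1/Ne = 1/(4*Nf) + 1/(4*Nm)."""
    if n_females <= 0 or n_males <= 0:
        raise ValueError("both sexes must be present")
    return 1.0 / (1.0 / (4.0 * n_females) + 1.0 / (4.0 * n_males))


if __name__ == "__main__":
    # Equal sexes: Ne equals the census number N = Nf + Nm.
    print(ne_unequal_sexes(50, 50))
    # One male and 100 females: Ne is close to 4, not 101.
    print(round(ne_unequal_sexes(100, 1), 2))
```

Because the two terms are reciprocals, the rarer sex contributes the larger term and therefore sets the scale of Ne.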
When the population varies from time to time, the
effective population number is the harmonic mean of
the values at different times. Thus, if Ni is the population number at generation i, the effective population
size averaged over t generations is given by
$1/N_e = (1/t)\sum_i (1/N_i)$. Again, since Ni appears in the denominator, the smaller values dominate. Therefore, size
bottlenecks are important factors in assessing the
importance of random genetic drift in the history of
a population. A situation that is often more important
and more difficult to calculate occurs when the viability and fertility differ among the members of the
parent generation. Consider the simplest case, in which the sexes
are equally frequent, mating is at random (including
a random amount of self-fertilization), and the population is neither increasing nor decreasing. In this case
the effective population number is $N_e = (4N - 2)/(2 + \sigma^2)$,
where N is the population number and $\sigma$ is
the standard deviation of the number of progeny per
parent, counted as adults. Formulae adjusting for
counting at the wrong stage are available (Crow and
Morton, 1955).
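The two relations in this paragraph can be sketched in Python as follows. The harmonic-mean relation is 1/Ne = (1/t) times the sum over generations of 1/Ni, and the progeny-variance relation is read here as Ne = (4N − 2)/(2 + σ²), with σ the standard deviation of progeny number per parent. Function names and the example population sizes are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def ne_fluctuating(sizes: Sequence[float]) -> float:
    """Harmonic mean of per-generation population numbers."""
    if not sizes or any(n <= 0 for n in sizes):
        raise ValueError("sizes must be a non-empty sequence of positive numbers")
    return len(sizes) / sum(1.0 / n for n in sizes)


def ne_progeny_variance(n: float, progeny_sd: float) -> float:
    """Ne = (4N - 2) / (2 + sd**2) for a stationary, randomly mating population."""
    if n <= 0 or progeny_sd < 0:
        raise ValueError("n must be positive and progeny_sd non-negative")
    return (4.0 * n - 2.0) / (2.0 + progeny_sd ** 2)


if __name__ == "__main__":
    # A single-generation bottleneck of 10 among generations of 1000.
    print(round(ne_fluctuating([1000, 1000, 10, 1000]), 1))
    # Progeny-number variance of 2 versus a much larger variance of 8.
    print(ne_progeny_variance(100, 2 ** 0.5))
    print(ne_progeny_variance(100, 8 ** 0.5))
```

In the first example one generation at size 10 pulls the effective number down to about 39, far below the arithmetic mean of the four sizes. In the second, a progeny variance of 2 gives an effective number close to N, while a larger variance among parents lowers it.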
This aspect of effective population number has received
a great deal of research attention in recent years, and increasingly
complicated formulae have been developed to take
into account such factors as separate sexes, unequal
numbers of the sexes, increasing or decreasing
population number, inbreeding, and population structure. The small population effect is manifested both in
a decrease of heterozygosity and fluctuations in allele
frequency. In a stationary population, the effective
population numbers are the same for both effects,
but in a growing or diminishing population they are
different. For a review, see Caballero, 1994.
Usually the effective size of a population is less than
the actual size. It clearly is if the population is censused at an early stage at which the death rate is high
(as in most fish), but it is also true if the population is
censused at the adult stage. Measured values range
widely, but typically for most animals the effective
number is between 1/4 and 3/4 of the census number.
Although this is the case for a population without
structure, it is not necessarily true if the population is
subdivided. When the total population is divided into
subgroups between which migration is limited, the
effective population number can be, and often is,
greater than the census number of the whole population. This is found, for example, in prairie dog colonies.
Random genetic drift plays an important role in
Sewall Wright's 'shifting balance' theory of evolution.
It is also a key quantity in the neutral theory of molecular evolution of Motoo Kimura.
Further Reading
Provine WB (1986) Sewall Wright and Evolutionary Biology. Chicago, IL: University of Chicago Press.
Wright S (1968, 1969, 1977, 1978) Evolution and the Genetics of
Populations, 4 vols. Chicago, IL: University of Chicago Press.
Ehlers–Danlos Syndrome
F M Pope
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0397
Ehlers–Danlos syndrome (EDS) is essentially a disorder of collagen connective tissue, in which skin is
over-fragile and hyperelastic. First described by
Tschernogobov and then separately by both Ehlers
and Danlos (Beighton, 1993) at the end of the nineteenth century, the syndrome has at least seven separate types,
most of which are nonallelic. Recently a consensus
committee has proposed a modified classification,
Figure 1 EDS I/II.
Figure 2 (See Plate 13) (A) Face and (B) hands of acrogeric EDS IV patients. The large eyes and lobeless ears are typical,
whilst the hands show pulp atrophy from terminal phalangeal erosions.
electron microscopy of skin at 60 000× magnification
shows patchy irregularity of fibril size. Collagen III
protein analysis nearly always shows depletion of
procollagen and collagen in the medium, and overmodified mutant protein intracellularly. Most mutations are private and unclustered, although exon 24 is a
hot spot for skipping errors, whilst first position glycine substitutions are also very common (Pope et al.,
1996). Helical 3′ mutations have the most abnormal
external appearance and also the highest frequency of
arterial rupture. The latter is commonest in the fourth
decade of life, but patients are also at risk during
adolescence and pregnancy. Blood pressure and aortic
ultrasound monitoring are of unproven value.
EDS type VI is typified by extreme general laxity
and hypotonia, and motor delay owing to ligamentous
laxity is very common. The typical hypotonic child
has good power and normal muscle biopsy. Early
scoliosis indicates surgical correction in adolescence.
Blue sclerae and osteopenia overlap with osteogenesis
imperfecta (OI), and severe joint laxity overlaps with EDS
VII (see below). EDS can be distinguished by abnormal type I collagen chemistry (type VII and OI) or
electron microscopy of skin (normal in EDS VI).
Type VIb lysyl hydroxylase deficiency can be monitored by gel electrophoresis of collagen type I proteins
which migrate faster in affected patients, or by measuring urinary cross-links. The gene has 19 exons, and
most affected patients are double heterozygotes.
EDS type VII is caused by persistent N-propeptides, resulting either from faulty N-proteinase or from individual structural mutations of exon 6 of the COL1A1
or COL1A2 genes. Clinically, congenital hip dislocation combines with generalized joint laxity. Fragile
skin and the external general phenotype overlap
with EDS I and II, but EDS VII is distinguished by abnormal
collagen chemistry (types a and b), in which the
uncleaved propeptide is retained either as
an extra α-1 or α-2 chain. In type VIIc, both α-1 and
α-2 extensions remain (Pope and Burrows, 1997). Types
a, b and c retain 1, 2 or 3 chains, with a gradation of
clinical severity, greatest in type VIIc in which there
Electron Microscopy
M E Dresser
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0398
Figure 1 (above) Thin sections of ultrarapidly frozen, freeze-substituted Dictyostelium amoebae. (A) Wild-type cell.
(B) CluA− cell, a mutant in which mitochondria associate in a single large cluster. The scale bar represents 1 µm;
nucleus (n), lysosome (l), and mitochondria (m). (Images were kindly provided by S. Fields and M. Clarke, Oklahoma
Medical Research Foundation.)
Intracellular topology is visualized using freeze-fracture and deep-etch methods. Following freezing,
cells are broken open and held under vacuum to allow
sublimation of the ice to proceed until intracellular
structures are exposed. The surface is then coated
with heavy metal to produce a film imprinted with
the surface topology. This film is examined using TEM
after being separated from the underlying substrate.
Figure 5 (See Plate 15) The surface of a three-dimensional reconstruction of a helical filament of human Rad51
protein on DNA is shown in gold in the foreground. In the background is an electron micrograph of the actual
filaments that Rad51 protein forms on single-stranded DNA in the presence of ATP. (The Rad51 protein is from the
laboratory of Dr Steve West, ICRF, UK.) The inset (right), a portion of such a Rad51–DNA filament (scale bar
represents 400 Å), shows the very poor signal-to-noise ratio present in such images. To surmount this problem, the
reconstruction has been generated using an algorithm for processing such images (Egelman (2000) Ultramicroscopy 85:
225–234) and involved averaging images of 7620 segments. The reconstruction shows that the filaments contain
~6.4 subunits per turn of a 99 Å pitch helix.
Electrophoresis
See: Gel Electrophoresis, Pulsed Field Gel
Electrophoresis (PFGE)
Electroporation
I Schildkraut
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0400
Electroporation is the introduction of DNA molecules into cells by use of an electric current to temporarily make the cells permeable.
Elongation
A Liljas
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0401
Elongation in Transcription
Genomic DNA cannot be translated but has to be
copied or transcribed into RNA by different RNA
polymerases. Here the classic mechanism discovered by Watson and Crick applies. One strand of the
double-stranded DNA (the negative one) is copied
with Watson–Crick base-pairing into a positive strand
of RNA. This occurs in the 5′ to 3′ direction. The
double-stranded DNA is opened up in a 'bubble'
that travels along the duplex during transcription.
Here, a DNA–RNA hybrid is formed transiently.
The process of transcription is in all cases strongly
regulated. Some genes are transcribed frequently,
whereas others are transcribed only rarely. Likewise,
some genes are transcribed only during a brief period in
the life of the cell, whereas others are copied more or
less continuously.
Elongation Factors; Translation
J Parker
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0402
Embryo Transfer
F Constantini
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0406
In laboratory mice, embryo transfer is usually performed by surgical methods. Mouse embryos at the
one-cell stage or cleavage stages are transferred to the
oviduct, while blastocysts, which are ready to implant,
are transferred to the uterus. The recipient female
mouse, or 'foster mother,' is first made 'pseudopregnant' by mating with a male that has been sterilized by
vasectomy, so that her own eggs will not be fertilized
and cannot compete with the transferred embryos.
The gestational age of the donor embryos and the
recipient must be synchronized, with optimal results
occurring when the embryos are one day more
advanced than the gestational age of the recipient.
The recipient female is anesthetized and the oviduct
or uterus is exposed through a small incision. Under a
low-power stereomicroscope, the embryos to be
transferred are loaded in a small volume of liquid
into a fine glass pipette, which is inserted through a
small hole in the bursa (the membrane covering the
ovary and oviduct) into the infundibulum (the open
end of the oviduct). In the case of a uterine transfer, a
small hole is made in the side of the uterus with a sharp
needle, and the transfer pipette is inserted through the
hole into the uterine lumen. The embryos are then
expelled and the incision is closed. Under optimal
conditions, the rate of successful implantation and
development of the embryos to term can exceed 90%.
In large animals and in humans, embryo transfer is
usually performed using a transvaginal approach, in
which the embryos are inserted through the cervix and
into the uterus with a catheter.
The applications of embryo transfer are numerous and of great importance for basic research in
genetics and experimental embryology, for animal
husbandry and genetic manipulation of livestock,
and for reproductive medicine in humans. Experiments in which the preimplantation embryo is physically manipulated (for example, by microsurgery,
injection of cells, or cell lineage tracers) depend on
embryo transfer to determine the effects of the treatment on the resulting fetus or animal. All of the
powerful and widely used techniques for genetic
manipulation (transgenesis) of the mouse and other
animals involve the introduction of foreign genetic
material into the preimplantation embryo, either in
the form of purified DNA injected at the one-cell
stage, embryonic stem cells introduced at the cleavage
or blastocyst stages, or nuclei transplanted at the onecell stage. Therefore, embryo transfer is required for
these genetically manipulated embryos to develop
into live animals. Rare or valuable strains of mice and
other mammals are often preserved by embryo freezing (cryopreservation), and embryo transfer is used to
revive these strains. In agriculturally important mammals, in addition to the aforementioned applications,
Embryonic Development of the Nematode Caenorhabditis elegans
M Labouesse
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0404
[Figure 1 labels for panels (B) and (C): pseudocoelomic cavity, epidermis, pharynx, muscles, intestine, gonad, nervous system]
Figure 1 Anatomy of the nematode embryo. (A) Nomarski picture of the embryo at mid-embryogenesis; muscles,
neurons, and the gonad cannot be distinguished in this picture. (B) Schematic drawing of a section through the
embryo or the larva showing the positions of the main tissues. (C) Schematic drawing of the young hatchling. White
scale bar in (A) is 10 µm. In the embryo (A) and the larva (C) anterior is to the left and dorsal is up. The color code
for tissues and organs is the same in (B) and (C); note that only a subset of muscles are drawn and that only the main
nerve is shown (but the cell bodies of neurons are not represented).
possible, for example, can tell the anterior from the
posterior or generates functional muscles.
[Figure 2 labels: fertilization; zygote; two-cell stage (AB, P1); four-cell stage (ABa, ABp, EMS, P2); beginning of gastrulation; gastrulation; differentiation; beginning of morphogenesis; morphogenesis; end of morphogenesis; hatching; central scales for time and number of cells]
Figure 2 The main stages of Caenorhabditis elegans embryogenesis. The Nomarski pictures on the right show from
top to bottom: a zygote after positioning of the two pronuclei at its centre; a two-cell embryo (note that the anterior
AB blastomere is larger than the P1 blastomere); a four-cell embryo (the blastomere names are indicated); a 28-cell
embryo which is initiating gastrulation; an embryo at the beginning of elongation; an embryo at the end of elongation.
Major embryonic events on the left and pictures on the right should be related to the time-scale and cell-number
scale shown in the center. The scales are not linear.
at the 28-cell stage. The second stage corresponds to
the time of gastrulation, organ formation, and initial
differentiation; it takes 4 h and is accompanied by six
further cycles of cell division. During the final stage,
terminal differentiation and morphogenesis of the
embryo from a ball of cells to a worm-shaped embryo
occur with only very few additional cell divisions.
Light microscopy using differential interference contrast optics (Nomarski optics) and the use of an autofluorescent protein (green fluorescent protein) fused
Embryological Methods
[Figure 3 labels: fertilized egg; blastomeres AB, ABa, ABp, ABal, ABar, ABpl, ABpr, P1, P2, P3, P4, EMS, MS, D; tissues: neurones, pharynx, epidermis, intestine, muscles, germline]
Figure 3 Early embryonic lineage and founder cells. This figure shows the beginning of the embryonic lineage. The
vertical axis shows time, the horizontal axis shows divisions. The organ/tissue that is contributed to by each major
branch of the lineage is symbolized by sectored circles which are roughly proportional to the number of cells
produced. The color code is given on the right. The letters that follow blastomere names normally refer to the axis of
divisions. For instance, ABal is the left daughter of the anterior daughter of AB (note however that AB divides along
the dorso-ventral axis and not the anterior/posterior axis but that physical constraints push it to adopt an anterior
position, hence its name).
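The naming rule described in this caption can be illustrated with a short Python sketch that expands a blastomere name into the chain of divisions it encodes. The letters a, p, l and r appear in the names shown in the figure; d and v (dorsal, ventral) are included here on the assumption that they follow the same convention. The function name and the pattern used to split founder names from division letters are illustrative choices, not part of the entry.

```python
import re

# Single-letter codes appended to a founder name, one per division.
# "d" and "v" are assumed by analogy; the entry's examples use a, p, l, r.
DIRECTIONS = {
    "a": "anterior",
    "p": "posterior",
    "l": "left",
    "r": "right",
    "d": "dorsal",
    "v": "ventral",
}


def describe_blastomere(name: str) -> str:
    """Expand a name such as 'ABal' into a description of its ancestry."""
    match = re.fullmatch(r"([A-Z][A-Z0-9]*)([aplrdv]*)", name)
    if match is None:
        raise ValueError(f"unrecognized blastomere name: {name!r}")
    founder, path = match.groups()
    description = founder
    # Each successive letter names a daughter of the cell described so far.
    for letter in path:
        description = f"the {DIRECTIONS[letter]} daughter of {description}"
    return description


if __name__ == "__main__":
    for cell in ("AB", "ABa", "ABal", "ABpr", "P1"):
        print(cell, "->", describe_blastomere(cell))
```

As the caption itself notes, the letters record the conventional name rather than a guarantee of the physical division axis, so the expansion should be read as nomenclature only.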
Molecular Methods
[Figure 4 labels: AB and P1 blastomeres in wild-type, par-3, and par-2 embryos; key: P granules, mislocalized P granules]
Figure 4 The A/P axis is defined by a polarizing mechanism. (A–C) Schematic representation of the major
differences observed between wild-type (A), par-3 (B) or par-2 (C) embryos, at the one-cell (top drawings) and two-cell (bottom drawings) stages. In a wild-type embryo, (1) the zygote and then the posterior blastomere are polarized
(asymmetric black shading), (2) PAR-3 and PAR-6 proteins are at the anterior cortex (red line) while PAR-1 and
PAR-2 proteins are at the posterior cortex (brown line), (3) P granules (red and white dots) are located posteriorly
and segregate to the P1 blastomere, (4) the AB blastomere is larger than the P1 blastomere and divides along the D/V
axis. In a par-3 mutant embryo (it would be the same in a par-6 mutant embryo), (1) polarity is abolished, (2) PAR-1
and PAR-2 proteins are found all around the cortex, (3) P granules are uniformly distributed, (4) the first cleavage is
symmetric and generates two blastomeres that divide along the A/P axis. In a par-2 mutant embryo, the situation is
similar, except that now (1) PAR-3 and PAR-6 proteins are found all around the cortex and (2) both AB and P1 divide
along the D/V axis. Anterior is to the left.
absent from the AB blastomere after the first division.
Thus, four criteria can be used to identify the initial
polarity of the zygote: the asymmetric localization
of P granules, the unequal size of AB and P1, the fact
that AB divides along the future dorso/ventral axis
(D/V axis) while P1 divides along the A/P axis, and the
different fates of AB and P1 progenies (Figure 4A).
A breakthrough in the understanding of how the
zygote acquires its A/P polarity came with the isolation of mutations that affect the distribution of P
granules. These mutations define six maternal genes,
called par-1 through par-6 (par stands for partitioning
defective). Although par genes act in a common pathway, they can be subdivided into at least two groups.
One group includes proteins localized at the anterior
cortex (PAR-3 and PAR-6; see Figure 4B), the other
includes proteins localized at the posterior cortex
(PAR-1 and PAR-2; see Figure 4C). Genetic analysis
has shown that PAR-1 is the final effector of this
pathway. The nature of PAR proteins suggests that
they act in a signaling process. In particular, PAR-1
and PAR-4 have protein kinase domains, whereas
PAR-3, PAR-6, and PAR-2 have protein–protein
interaction modules suggesting that they could position or tether other proteins in specific places.
It is thought that PAR proteins act to interpret the
polarity cue provided by the sperm, which brings two
centrosomes. One model that is supported by the
involvement of actin states that PAR proteins act by
mediating local changes in the cytoskeleton. How
they do so and what their immediate targets are is
unknown. Whatever the mechanism, their activity is
required to localize several cell fate determinants
[Figure 5 labels: blastomeres ABa, ABp, ABal, ABar, ABpl, ABpr, EMS, MS, E, P2, P3 in panels (A) to (D); key: POP-1 protein, APX-1 ligand, MOM-2 ligand]
Figure 5 Defining the D/V axis and generating cell diversity in the early embryo. (A) At the four-cell stage, due to
the activity of par genes, the EMS and P2 blastomeres are different from each other and both are different from the
ABa and ABp blastomeres; however, ABa and ABp are initially equivalent. (B, upper) In wild-type embryos, the APX-1
ligand in P2 instructs ABp to become different from ABa, while the MOM-2 ligand in P2 polarizes EMS (asymmetric red
shading). (B, lower) In turn, polarization of EMS is interpreted when it divides in such a way that the POP-1 protein
becomes nuclearly localized mainly in the MS blastomere; this allows the E blastomere to generate the intestine. (C)
In apx-1 mutant embryos, ABa and ABp remain identical and express the 'ABa fate.' (D) In mom-2 mutant embryos,
EMS is not polarized at the four-cell stage, hence MS and E both inherit nuclear POP-1, which causes them to be
identical and to express the 'MS fate' (the intestine is not made).
anterior daughter of EMS (Figure 5A). The ligand is
expressed in the P2 blastomere, while its receptor
is expressed in the EMS blastomere. In embryos lacking the ligand or its receptor, the posterior daughter E
adopts the fate of its anterior sister MS (Figure 5C).
How POP-1 activity in the anterior daughter can
ultimately contribute to generate cell fate diversity is
beginning to be understood for the EMS lineage.
Genetic and molecular analysis has shown that the
maternal gene skn-1 encodes a transcription factor
that is present and essential in EMS, MS, and E
blastomeres. Hence, at least two transcription factors,
POP-1 and SKN-1, will be active in MS and will
contribute to specifying the 'MS fate,' while only one,
SKN-1, will be active in E, where it will activate the endoderm specification program. Presumably POP-1
together with other transcription factors can similarly
specify unique fates among the first 28 cells of the
pregastrulation embryo. Genetic analysis in C. elegans
has thus uncovered an entirely new mechanism to
generate cell diversity, which has now also been
shown to exist in vertebrates.
Knowledge of muscle function and assembly is particularly detailed in C. elegans. C. elegans muscle
sarcomeres, which assemble during the second half
of embryogenesis, are in most respects very similar
to those observed in other species, except that muscle
cells do not fuse. They include alternating thick filaments, which contain myosin, and thin filaments,
which contain actin, tropomyosin, and troponin.
Conclusion
For a long time, nematodes were thought to be unique
among animal species owing to their invariant lineage.
Indeed, early blastomeres have a fixed fate in C. elegans, implying that if a blastomere is ablated the tissues
that should normally be generated by this blastomere
will be missing. In contrast, in many other species early
[Figure 6 labels: cuticle (exoskeleton), epidermis, extracellular matrix, intermediate filaments (?), myotactin, integrin, muscle sarcomere, dense body, M line]
Figure 6 How the muscle is attached. Sarcomeres are formed by alternating thin (gray lines) and thick filaments
(red lines). Within muscle cells the anchoring structure is called the dense body and consists of a complex formed by
several proteins (vinculin, talin, and α-actinin), which interact with actin within thin filaments, and an integrin dimer
(dark pink/pale pink pair; genes pat-2 and pat-3) at the muscle membrane (some additional attachment is provided
through so-called `M lines' at the center of thick filaments). The integrin itself recognizes a protein from the
extracellular matrix called perlecan (intertwined gray lines; gene unc-52). Within epidermal cells the anchoring
structures are called fibrous organelles: they are made in part by a long transmembrane protein called myotactin
(gray; gene let-805) that extends toward muscles and contacts in turn one or more proteins that probably run across
the epidermal cytoplasm from the basal membrane to the cuticle. It is not yet known what provides attachment to the
cuticle.
Embryonic Development, Mouse
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0403
Figure 1 Preimplantation development. [Labels: fertilization (maternal pronucleus, paternal pronucleus; zygote, fertilized egg, 1-cell embryo); first cleavage (2-cell embryo); second cleavage (4-cell embryo); third cleavage; fourth cleavage (16-cell embryo; outside cells differentiate into trophectoderm); blastocyst formation (blastocyst, ICM); further divisions, hatching, and implantation]
individual cells that were dissected out of the four-cell-stage mouse embryo and placed back individually
into the female reproductive tract. This experimental
feat demonstrates the theoretical possibility of obtaining four identical clones from a single embryo of any
mammalian species.
It is important to contrast the early developmental
program of all placental mammals with that of other
animals including the two model organisms Caenorhabditis elegans and Drosophila melanogaster. Identical twins can never be obtained from a single
nematode or fly embryo. During nematode development, individual embryonic cells from the two-cell
stage onward are highly restricted in their developmental potential or 'fate.' The fly egg is polarized
even before it is fertilized and different cytoplasmic
regions are devoted to supporting different developmental programs within the nuclei that end up in these
End Labeling
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1830
Endoderm
See: Developmental Genetics
Endonucleases
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1832
End-Product Inhibition
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1831
End-product inhibition is the process whereby a product of a metabolic pathway inhibits the activity of an
enzyme that catalyzes an early step in the pathway.
See also: Enzymes
Enhancers
R Grosschedl
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0410
Enhancers are operationally defined as cis-acting elements that augment the activity of a promoter in an
orientation- and position-independent manner. Initially identified in a sea urchin histone gene and in
the simian virus 40 (SV40) viral genome as regulatory
elements that increase transcription from a promoter
located at a distant position on the same DNA molecule, transcriptional enhancers are found in most
eukaryotic genes transcribed by RNA polymerase II.
In lower eukaryotes such as yeast, upstream activator
sequences (UAS) can also function at variable distances from the promoter and in either orientation.
UAS are thus analogous to enhancers of higher eukaryotes, although they differ from enhancers in their
inability to activate a promoter from downstream
positions. In bacteria, a simple enhancer has been
identified upstream of a promoter that is recognized
by a specific form of RNA polymerase containing
sigma factor 54.
Transcriptional enhancers of higher eukaryotes are
typically composed of multiple modules that cooperate to augment gene expression. The modules consist either of binding sites for individual transcription
factors or of composite binding sites for different
transcription factors. The multiplicity and modularity
of transcription factor binding sites in enhancers allow
for combinatorial control and functional diversity.
In addition, interactions between multiple enhancer-binding proteins can help to increase the accuracy of
DNA sequence recognition in a large and complex
genome.
The modularity of enhancers has been demonstrated by experiments in which individual modules
of an enhancer have been multimerized to generate
synthetic enhancers. Synthetic enhancers typically
augment gene expression. However, they do not
reproduce all regulatory properties of natural enhancers, such as cell type specificity or inducibility. Insight into the modularity and functionality of natural
enhancers is provided by experiments examining
interactions between transcription factors bound at
different modules. At some natural enhancers, multiple transcription factors have been shown to interact
with each other, resulting in the assembly of a higher-order nucleoprotein complex, termed the `enhanceosome.' The assembly of such complexes can be
facilitated by architectural proteins that have no
activation potential by themselves but augment
chromosomal domains. An insulator that is placed
between an enhancer and a promoter blocks the interactions between these elements. Thereby, insulators
help to impart promoter specificity in complex gene
loci in which multiple promoters are located in the
vicinity of an enhancer.
See also: Chromatin; Cis-Acting Proteins;
Promoters
Enzymes
D C DeLuca and J Lyndal York
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0414
Koshland. These models represent extreme cases. Different enzymes show features of both models. Amino
acid side chain residues at the enzyme's active site
interact chemically or physically with substrate to
lower the energy required for the reaction to occur at
physiological temperatures. Substrate specificity is
determined by the chemical properties and spatial
arrangement of the amino acid residues forming
the active site of an enzyme. The restriction endonucleases illustrate enzyme substrate specificity. These
enzymes are responsible for very specific cutting of
DNA into unique fragments. Separation of these
fragments provides a `fingerprint' of an individual
organism's DNA for unambiguous identification.
Restriction enzymes play a critical role in the development of the field of biotechnology.
Some enzymes require the presence of small nonprotein units (cofactors), either inorganic ions, organic
molecules, or both. The precursors for some organic
molecules (coenzymes) are the vitamins. Coenzymes
covalently attached to the enzyme are called prosthetic groups; coenzymes that undergo chemical
modification during the reaction are called cosubstrates. An enzyme with
its cofactor is called the holoenzyme; without the
cofactor, the species is called an apoenzyme.
Isoenzymes (isozymes) are distinct forms of an
enzyme that catalyze the same reaction but differ in
physical or kinetic properties. Different isoenzymes
are usually encoded by different genes and may occur
in different tissues of an organism. For example,
human creatine kinase exists as three isozymes that
predominate in skeletal muscle, heart muscle, and
brain tissue, respectively.
Molecules that act directly on an enzyme to reduce
its catalytic activity are known as inhibitors. Many
therapeutically useful drugs, pesticides, and herbicides
are inhibitors of specific enzymes. Inhibitors are classified as either reversible or irreversible. The effect
of reversible inhibitors may be overcome in various
ways, whereas irreversible inhibitors lead to a state of
permanent inactivity as they often form stable covalent
bonds with reactive amino acid residues of the protein.
Reversible inhibition is further subdivided into competitive and noncompetitive types. A competitive
inhibitor is one whose effect is overcome by the addition of substrate. Noncompetitive inhibitors engage a
site other than the catalytic site causing a conformational change altering catalytic activity. This state cannot be overcome by substrate addition.
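The difference between the two reversible types can be illustrated with the standard Michaelis-Menten rate laws. These are textbook kinetic models rather than anything stated in this entry, and the parameter names (`vmax`, `km`, `ki`) and numerical values are illustrative assumptions. A minimal sketch:

```python
def rate_uninhibited(s, vmax, km):
    """Michaelis-Menten rate without inhibitor (assumed textbook model)."""
    return vmax * s / (km + s)


def rate_competitive(s, i, vmax, km, ki):
    """Competitive inhibition: the inhibitor raises the apparent Km only."""
    return vmax * s / (km * (1.0 + i / ki) + s)


def rate_noncompetitive(s, i, vmax, km, ki):
    """Pure noncompetitive inhibition: the inhibitor lowers the apparent Vmax."""
    return vmax * s / ((1.0 + i / ki) * (km + s))


if __name__ == "__main__":
    # Hypothetical values chosen only to show the trend with rising substrate.
    vmax, km, ki, inhibitor = 1.0, 1.0, 1.0, 1.0
    for substrate in (1.0, 10.0, 100.0, 1e6):
        print(
            substrate,
            rate_uninhibited(substrate, vmax, km),
            rate_competitive(substrate, inhibitor, vmax, km, ki),
            rate_noncompetitive(substrate, inhibitor, vmax, km, ki),
        )
```

Under these assumptions, raising the substrate concentration brings the competitively inhibited rate back toward `vmax`, whereas the noncompetitively inhibited rate levels off at `vmax / (1 + i / ki)`.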
Enzyme-catalyzed reactions are subject to a variety
of exquisite control mechanisms. These are feedback
inhibition of allosteric enzymes, covalent modification, proteolytic activation, and regulation of protein
synthesis and breakdown. Feedback inhibition occurs
in a metabolic pathway when an early enzyme in
EGF ligand dimerization partner and may indeed originate from differential receptor transphosphorylation. As a result, the different cellular responses to
the EGF family of peptide growth factors are due to
the array of erbB receptors activated and the repertoire of signaling pathways that are engaged at the
effector level.
See also: erbA and erbB in Human Cancer; Neu
Oncogene; SH2 Domain; Signal Transduction
Epigenetics
chromosome structure that collude with DNA
methylation in effecting epigenetic control are discussed. Possible mechanisms for the tagging of loci
in taxa that do not appear to have DNA methylation in their genomes, e.g., arthropods, are also reviewed.
Then several representative examples of epigenetic
regulation are presented, focusing in each case on
the presumed evolutionary benefit reaped by the cell
and the organism from effecting such a mode of gene
control, and on recent molecular data that offer
mechanistic explanations (reviews: Russo et al., 1996;
Chadwick and Cardew, 1998). In conclusion, there is a
short perspective on the general applicability of epigenetic principles to genome control in eukarya.
proliferating hepatocytes stop expressing many
liver-specific markers). In general, proliferating cells
in multicellular organisms only very rarely express
genes associated with the differentiated state; such a
luxury is only allowed to cells that are replicatively
quiescent and can assemble regulatory complexes on
DNA with no fear of being swept out of the way by a
passing megadalton assembly of DNA polymerase.
Thus, the first major challenge facing the cell in
enabling stable domains of gene expression is a need
to maintain them through repeated rounds of genomic
replication.
There is an additional problem, however: even in
cells that are in a state of proliferative arrest, a protein
complex bound to DNA is no guarantee of stability. In
large part this is because the eukaryotic nucleus contains significant quantities of ``philandering'' transcriptional regulators; not tethered to any particular DNA
segment, they can spuriously affect a random gene
through a ``hit-and-run'' mechanism. One possible
kind of insurance against such accidents is the establishment of a particular kind of ``fortified'' regulatory
structure at a given locus that would be impervious
to such sporadic attacks. The second major challenge
in stably controlling the genome is to minimize levels
of regulatory noise due to spurious interactions.
As discussed in the following two sections, epigenetically regulated genes and loci offer a wonderful
example of how parsimonious natural selection can be
in molding regulatory pathways. The solution to both
challenges is implemented via a remarkably elegant
integration of simple biochemical mechanisms.
Figure 1 Conversion of cytosine to 5-methylcytosine by DNA methyltransferase, with S-adenosylmethionine as the methyl donor.
Figure 2 (A) The passage of the replication fork (DNA polymerase) and reestablishment of methylation (DNA methyltransferase); (B) the replication of
chromatin (DNA polymerase, chromatin assembly).
This is very significant for controlling epigenetic
phenomena, because, in the double helix of DNA, the
stretch around 5′-CpG-3′ is symmetric (i.e., the other
strand reads 3′-GpC-5′); in fact, when human DNA is
examined, all of the methylation is also symmetric:
5′ . . . m5C G . . . 3′
3′ . . . G m5C . . . 5′
DNA methylation, e.g., arthropods such as the fruit
fly Drosophila melanogaster and nematodes such as
the round worm Caenorhabditis elegans, is a fascinating issue). Thus, if a particular gene has its promoter
methylated, it becomes transcriptionally inert (`invisible' to RNA polymerase). DNA methylation
is one of the most potent mechanisms for transcriptional repression known in biology today. Most significantly, we currently do not know of any way to
reactivate a chromosomal locus that has been silenced
by methylation except to remove the methyl residues.
This is not the academic issue that it might seem;
advanced forms of human cancer are well known to
have aberrant DNA methylation; for example, genes
required for cell-cycle arrest, such as cyclin-dependent
kinase inhibitors, are erroneously silenced in tumor
cell lines because their promoters are methylated
(which they never are in normal, noncancerous cells).
In addition, as elaborated in the section ``Repetition,
mother of genetic silencing,'' up to one-third of our
genomes consists of parasites (self-propagating DNA
elements such as transposons). They are kept in check
(i.e., silenced) by hypermethylation, and woe to the
genome that unleashes the parasites within it.
How can DNA methylation by the tiny 26 Da of
methyl groups involved be so powerful in antagonizing the transcriptional machinery? The answer to
this question also takes care of the other challenge
to enabling stable domains of gene expression: how to
prevent noise. It turns out that methylation of a
particular DNA stretch is read by the cell as a command to envelop it in a protective cocoon, a specialized protein structure that makes the DNA
physically inaccessible to regulators. This shielding
occurs in two steps: first, when a DNA stretch is
methylated, it is immediately bound by specialized
proteins, discovered by Adrian Bird's research group,
called methylated DNA binding domains (MBDs);
these have the interesting and useful property of being
highly selective for methylated DNA (for example, the
best-studied MBD, a protein called MeCP2, will bind
m5CpG, but not CpG). Once bound, these MBDs
attract other proteins, all of which can remodel chromatin, i.e., alter the structure of the chromosome
around their binding site.
To appreciate how chromatin remodeling can lead to
transcriptional repression, we must recall that there is
no naked DNA in our cells: all of it is complexed with
highly positively charged proteins called histones.
Every 146 bp of DNA in the genome is wound around
eight molecules of histones to form a nucleosome, the
elementary building block of our chromosomes; the
familiar ``bead-on-a-string'' array of nucleosomes
winds around itself to form chromosomes. As one
might expect, the cell uses chromatin to regulate its
As mentioned in ``A brief history of research in epigenetics,'' the maternal and paternal genome make
unequal contributions to the genomic output of their
joint product, the progeny: for a certain number of
genes, one copy in our genome is inactivated for the
duration of our lifespan. Appropriately borrowing the
term from Helen Crouse's study of Sciara, loci regulated in this way are referred to as `imprinted.' It would
be helpful to explain the terminology used in this field:
a gene is called `maternally expressed' if organisms
always use (i.e., transcribe) the allele they inherited
from their mother; conversely, a `paternally expressed'
gene is one in which the allele inherited from the father
is the one that is active. Imprinting, while of immense
academic and general intellectual interest, also has medical
relevance: several human disorders, including Prader-Willi
and Angelman syndromes, are caused by a misregulation of imprinted loci.
Of the many questions that spring to mind regarding imprinting, we will briefly address three: (1) On a
mechanistic level, how is the difference in expression
between the two alleles effected? (2) How do male
and female gonads ensure correct imprinting in future
generations and distinguish sets of paternally and
maternally expressed genes when producing gametic
precursors? (3) What selection pressure could have
led to the evolution of such a peculiar mode of gene
regulation?
1. The difference in expression is maintained by keeping the imprinted loci in a state of differential methylation, such that the expressed allele is demethylated,
and the repressed allele is hypermethylated. We
emphasize that the primary DNA sequence of the
two alleles can be identical, and yet profound differences in expression levels are observed. A genetic
ablation of pathways leading to DNA methylation
abrogates correct regulation on many imprinted loci
in the mouse genome. One very interesting recent
development has been the discovery, by the research
groups of Shirley Tilghman at Princeton and Gary
Felsenfeld at the NIH, that for the H19/Igf2
imprinted locus, the effect of such differential
methylation is to control the ability of a protein
called CTCF to associate with a regulatory element
found in this area (CTCF binding is prevented by
methylation). Interestingly, the role of CTCF binding is to stop the spread of regulatory information
along the chromosome, i.e., CTCF enables boundary, or insulator, function in this locus. In other cases,
the effect of methylation is to drive the creation of a
repressive chromatin structure over the genes being
regulated.
2. Few things in epigenetics are the cause of greater
wonder and mystery than the establishment of
methylation patterns relevant to imprinting during
gametogenesis. By way of example, consider a
paternally expressed gene in a human female: of
the two alleles she has in her genome, the allele
she inherited from her father (let us designate it
as ♂) is demethylated and active; the allele she
inherited from her mother (♀) is methylated and
inactive. During oogenesis, the following happens
(`m' stands for `methylated'):
Paternally expressed gene in an ovary: ♀m ♂ → ♀m ♂m
Thus, in her germ cells, the maternal allele is kept
methylated, and the paternal allele is methylated
de novo (by definition, her children must inherit
this allele from her in inactive, methylated form).
Remarkably, a maternally expressed locus in that
same woman undergoes the exact opposite process:
Maternally expressed gene in an ovary: ♀ ♂m → ♀ ♂
(i.e., the cell takes the allele this woman inherited
from her father, and demethylates it; again, by definition, this woman's children have to receive this gene
from her in active, demethylated form).
3. During spermatogenesis in the male, all maternally
expressed genes are methylated and all paternally
expressed genes are demethylated. This process is
called `resetting of gametic marks,' and we have
little beyond very weak conjecture on how it is
enabled. How can the cell possibly scan the vast
narrative of its entire genome for all loci that are
differentially methylated on two homologous
nonsister chromatids? Once these loci have all
been found, whatever quasi-miraculous machine
lies behind this search, what can be the mechanism whereby the cell determines whether this
particular allele pair is maternally or paternally
expressed (this cannot be based on the difference in
methylation between the two alleles, of course), and
decides whether to demethylate or hypermethylate
both alleles? As if this was not mysterious enough,
how can such programs be perfectly reversed in a
gender-specific way (i.e., the same locus is treated in a
diametrically opposite way in males versus females)?
However this works, it clearly does, and quite well;
but how does the organism benefit? A very attractive
Whatever the mechanism whereby epigenetic activation or repression is enabled, one of its most salient
features is that traits controlled epigenetically frequently exhibit non-Mendelian inheritance patterns.
A classic example is the paramutation phenomenon
that affects maize kernel color: the R locus controls
pigment formation, with the Rr allele producing dark
kernels and the Rst allele, stippled kernels. This would
be yet another case of the exceptional utility of color
inheritance in providing textbook illustrations of
Mendelian inheritance, if it were not for the following:
a testcross of an Rr/Rr plant yields all dark kernels and
of an Rst/Rst, all stippled, in full agreement with expectation. In overwhelming contradiction to common
sense, a testcross of an Rr/Rst plant yields all stippled
kernels, even though 50% of them are genotypically
Rr. We now know that the r allele is somehow epigenetically modified (weakened) by the act of its passage
through a heterozygotic environment that contains an
st allele, but the mechanistic details are not well understood (Russo et al., 1996).
As mentioned earlier, studies of epigenetic phenomena were initiated by Barbara McClintock's experiments on the Spm transposon in maize (which,
incidentally, also affects pigment formation in the
kernel), in which she showed that activity of this
transposon can fluctuate through generations, and
that it can become epigenetically inactivated and reactivated (not surprisingly, the trait affected by the transposon insertion fails to comply with Mendelian
segregation rules). Subsequent work from Nina
Fedoroff's laboratory at Penn State University has
shown this regulation to be due to alterations in the
methylation status of a stretch of the Spm promoter
that contains a very high percentage of G/C residues:
a hypermethylated transposon is heritably inactivated
until passage through a nucleus containing an active,
demethylated copy of the Spm transposon leads to
demethylation and activation (Russo et al., 1996).
A final, wonderful example of the extraordinary
power of epigenetic regulation in effecting non-Mendelian inheritance comes from the fission yeast
Schizosaccharomyces pombe. Work from Amar Klar
and his colleagues has unraveled the very elegant
mechanism whereby this organism switches mating
type: after meiosis, haploid spores reacquire the capacity to mate by assuming one of two mating types
(`plus' and `minus'; Figure 3A). The mating type is
defined through a DNA recombination event in the
spore; only spores of opposite type can mate. Remarkably, when a single spore of the plus mating type
divides twice, of its four granddaughters, three remain
plus and one switches to minus. We now know this
[Figure 3, panels (A) and (B); labels: +, leading strand, lagging strand, DNA replication, imprint, switch.]
Episome
K B Low
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0416
References
Jacob F and Wollman EL (1958) Les épisomes, éléments génétiques ajoutés. Comptes Rendus de l'Académie des Sciences
247: 154–156.
Epistasis
G A Churchill
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0417
Biochemical Epistasis
Gene products often act together in pathways and
networks. Examples include biosynthetic pathways,
signal transduction pathways, and transcriptional
Statistical Epistasis
Component  Additive  Additive  Dominance  Dominance  AA  AD  DA  DD
AABB       2         2         1          1          1   1   1   1
AABb       2         0         1          2          0   2   0   2
AAbb       2         2         1          1          1   1   1   1
AaBB       0         2         2          1          0   0   2   2
AaBb       0         0         2          2          0   0   0   4
Aabb       0         2         2          1          0   0   2   2
aaBB       2         2         1          1          1   1   1   1
aaBb       2         0         1          2          0   2   0   2
aabb       2         2         1          1          1   1   1   1
Table 2 clf1: two-locus genotype grid (BB, Bb, bb by AA, Aa, aa)
Future of Epistasis
What sorts of phenotypes will tend to show epistatic
effects? Transcriptional regulation of gene expression
is complex, involving both positive and negative regulation of multiple factors with varying degrees of
specificity. Gibson (1996) demonstrated that inherent
properties of such systems lead to epistatic and pleiotropic effects. Traits that are closely related to direct
regulation of one or a few genes are more likely to
reveal epistasis than are morphological traits that depend on the cumulative effects of many genes for their
expression. The availability of molecular markers and
technology for monitoring gene expression opens
up new possibilities for unraveling the network of
biochemical mechanisms underlying the relationship
between phenotype and genotype. As our ability to
study the effects of genes at the biochemical level
improves, so will our understanding of the mechanisms underlying epistasis. Modern molecular techniques are helpful in reuniting the biochemical and
statistical descriptions of this ubiquitous phenomenon.
References
Bateson W (1909) The progress of genetics since the rediscovery of Mendel's papers. Progressus rei botanicae 1: 368–418.
Cockerham CC (1954) An extension of the concept of partitioning hereditary variance for analysis of covariances among
relatives when epistasis is present. Genetics 39: 859–882.
Fisher RA (1918) The correlation between relatives on the
supposition of Mendelian inheritance. Transactions of the
Royal Society of Edinburgh 2: 399–433.
Gibson G (1996) Epistasis and pleiotropy as natural properties
of transcriptional regulation. Theoretical Population Biology 49:
58–89.
Juriloff DM (1995) Genetic analysis of the construction of the
AEJ.A congenic strain indicates that nonsyndromic CL(P) in
the mouse is caused by two loci with epistatic interaction.
Journal of Craniofacial Genetics and Developmental Biology 15: 1–12.
Wright S (1931) Evolution in Mendelian populations. Genetics
10: 97–159.
Wright S (1980) Genic and organismic selection. Evolution 34:
825–843.
size, packaged with a number of minor virion proteins. The most abundant EBV envelope and tegument
proteins are 350/220 and 152 kDa in size, respectively.
The EBV genome is carried by the virion as a linear,
double-stranded 172-kbp DNA.
The interactions of EBV with the human host are
seemingly paradoxical. It is the most highly transforming virus known. It regularly turns resting B lymphocytes
into immunoblasts that can give rise to
immortal lymphoblastoid cell lines (LCLs). The
EBV-transformed immunoblasts closely mimic IL-4
and anti-CD40 activated blasts morphologically and
with regard to their repertoire of activation markers.
In spite of its high transforming ability, EBV
induces or contributes to malignant disease only
exceptionally. These exceptions can be seen as biological accidents at the level of the host (like immunosuppression as already mentioned) or of the cell
(oncogene activation). The second, related paradox
concerns the relationship between the virally infected
lymphocyte and the host organism. EBV-transformed
immunoblasts are highly immunogenic. Mononucleosis can be seen as a somewhat chaotic but nevertheless
efficient rejection reaction. In vitro exposure of T cells
to autologous EBV-transformed immunoblasts generates CD8+ killer T cells that lyse their specific targets
as efficiently as allogeneic T cells
kill MHC class I-incompatible targets. In spite of the
highly efficient elimination of the proliferating immunoblasts, the virus regularly succeeds in establishing its permanent latency in the B cell compartment
itself, without causing either proliferation or rejection
of its carrier cell. Both paradoxes have been resolved
by the analysis of the viral strategy.
LMP1 associates with LAP1 in B lymphoblastoid
lines and with an EBV-induced cell protein, EB16,
which is the human homolog of the murine TRAF1,
implicated in cell growth and NF-κB activation.
Immune Responses
EBV-transformed immunoblasts are highly immunogenic for autologous T cells. Several immune effectors
and CD8+ T cells react to them with as intense a
proliferative and cytotoxic response as to allogeneic MHC class I-incompatible cells. In the autologous T anti EBV-B mixed lymphocyte culture, one
or two of the growth transformation-associated EBV-encoded proteins (with the exception of EBNA1) are
chosen as the main targets, depending on the MHC
class I allotypes of the responder that serve as the
preferential restriction specificities. Other effectors,
such as NK cells, LAK-type cells, and macrophages
are also mobilized and a variety of lymphokines are
released in the course of mononucleosis, but the efficient rejection is probably largely due to the CD8+
CTL. EBNA3, 4, and 6 and LMP2 are the most frequent rejection targets.
The exemption of EBNA1 from being targeted by
the CD8+ T cells is due to the long glycine-alanine
repeat that inhibits the proteasome-ubiquitin-dependent processing of EBNA1, as long as it is in the normal
cis position. This exceptional handling of EBNA1 can
be seen in relation to the fact that it is the only EBV-encoded protein that can be expressed irrespective
of the cellular phenotype. This is also one of the main
reasons why the memory B cells that carry latent virus
escape the ``attention'' of the immune system.
Disease Association
EBV is the causative agent of infectious mononucleosis and of the immunoblastomas that arise in
immunosuppressed patients, such as transplant recipients, patients with congenital immunodeficiencies, particularly the
X-linked lymphoproliferative syndrome (XLP, an
inherited immunodeficiency syndrome that preferentially affects the EBV-specific immune surveillance
mechanism), and HIV-infected persons. The virus
is associated with 98% of endemic Burkitt lymphomas, but is only present in about 20% of the sporadic
cases. All Burkitt lymphomas carry the chromosomal
Ig/myc translocation, however, that is believed to
provide the proliferative drive of the tumor. Multiple
viral genomes are present in 100% of low differentiated or anaplastic nasopharyngeal carcinomas (see
Nasopharyngeal Carcinoma (NPC)). They are also
present in 50% of Hodgkin's lymphomas, a variable
but usually low percentage of T-cell lymphomas
(except midline granulomas where the association is
100%), NK-cell leukemias, a small fraction of gastric
adenocarcinomas, and leiomyosarcomas that arise in
immunosuppressed (e.g., HIV-infected) patients. The
role of EBV in these malignant diseases is not clear.
Equilibrium
M A Asmussen
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1237
Mathematical Criteria
To formalize the concepts of equilibrium and stable
and unstable equilibrium for a system determined by a
single variable, suppose the value of the variable $x$ (say
the gene frequency at a locus) changes through time
such that after one generation its new value $x'$ is some
function, $f(x)$, of its value $x$ in the previous generation.
The variable $x$ then changes from one generation to
the next according to the recursion equation:
$$x' = f(x)$$
Equilibrium
$$|x' - \hat{x}| < |x - \hat{x}| \quad \text{whenever } x \approx \hat{x} \tag{2}$$
$$\left|\frac{df(\hat{x})}{dx}\right| < 1 \tag{3}$$
Under this condition, the equilibrium $\hat{x}$ is locally
stable because the value of the variable $x$ will always
return to the equilibrium value $\hat{x}$ if it is perturbed
slightly from that equilibrium.
An equilibrium $\hat{x}$ is called `globally stable' if the value
of the variable $x$ always converges through time to $\hat{x}$
from all possible (nonequilibrium) starting values, no
matter how far away the system initially is from this
equilibrium.
Unstable Equilibrium
Under this condition, the equilibrium $\hat{x}$ is unstable
because the value of the variable $x$ moves away from
the equilibrium value $\hat{x}$ if it is perturbed slightly from
that equilibrium.
Example
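                        A1        A2
Number of newborns      xN        (1 - x)N
Survival rate           f1        f2
Number of new adults    f1xN      f2(1 - x)N

$$x' = \frac{f_1 x N}{f_1 x N + f_2 (1 - x) N} = \frac{f_1 x}{f_1 x + f_2 (1 - x)}$$
$$f(x) - x = \frac{f_1 x}{f_1 x + f_2 (1 - x)} - x = \frac{(f_1 - f_2)\, x (1 - x)}{f_1 x + f_2 (1 - x)}$$
$$\frac{df}{dx} = \frac{f_1 f_2}{\left[f_1 x + f_2 (1 - x)\right]^2}$$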
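$$f(x) \approx \hat{x} + \frac{df(\hat{x})}{dx}\,(x - \hat{x}) \quad \text{for } x \approx \hat{x}$$
$$|f(x) - \hat{x}| \approx \left|\frac{df(\hat{x})}{dx}\right| |x - \hat{x}| \quad \text{for } x \approx \hat{x}$$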
and thus the condition in equation (2) for local stability reduces to the criterion given in equation (3).
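The local stability criterion can be checked numerically on the two-type survival example. The sketch below assumes the recursion x' = f1 x / (f1 x + f2 (1 - x)), whose derivative is f1 f2 / (f1 x + f2 (1 - x))^2, and uses the survival rates 0.8 and 0.9 and the starting values 0.2, 0.5, and 0.7 that appear in the figure labels. The function names are illustrative, not taken from the entry.

```python
def next_freq(x, f1, f2):
    """One generation of the recursion x' = f1*x / (f1*x + f2*(1 - x))."""
    return f1 * x / (f1 * x + f2 * (1.0 - x))


def slope(x_hat, f1, f2):
    """Derivative df/dx of the recursion, evaluated at x_hat."""
    return f1 * f2 / (f1 * x_hat + f2 * (1.0 - x_hat)) ** 2


def trajectory(x0, f1, f2, generations):
    """Frequencies x_0, x_1, ..., x_generations under repeated iteration."""
    xs = [x0]
    for _ in range(generations):
        xs.append(next_freq(xs[-1], f1, f2))
    return xs


if __name__ == "__main__":
    f1, f2 = 0.8, 0.9
    # x = 0 and x = 1 are both equilibria; test each with |df/dx| < 1.
    for x_hat in (0.0, 1.0):
        stable = abs(slope(x_hat, f1, f2)) < 1.0
        print(f"x_hat = {x_hat}: |df/dx| = {slope(x_hat, f1, f2):.4f}, locally stable: {stable}")
    for x0 in (0.2, 0.5, 0.7):
        print(x0, trajectory(x0, f1, f2, 30)[-1])
```

With these rates the derivative is f1/f2 at x = 0 and f2/f1 at x = 1, so only the equilibrium at 0 satisfies the criterion, and every trajectory started between 0 and 1 declines toward it.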
[Figure: x plotted against generation (0 to 30). (A) f1 = 0.8, f2 = 0.9, with x0 = 0.2, 0.5, and 0.7; (B) f1 = 0.5, f2 = 0.4, with x0 = 0.1, 0.2, and 0.4.]
Equilibrium Population
K E Holsinger
an allele, A1, at a particular locus is $p$ and the frequency of the alternative allele at this locus, A2, is
$q = 1 - p$. If there are no differences among individuals in the probability that they survive or in
the numbers of offspring that they produce and if
there is no mutation, then clearly we will have the
same number of A1 and A2 alleles in the next generation as we have in this generation. Putting it another
way:
$$p_{t+1} = p_t$$
where $p_t$ refers to the allele frequency in the current
generation and $p_{t+1}$ refers to the allele frequency in the
next generation. A population is in equilibrium whenever $p_{t+1} = p_t$.
Suppose we now allow for the possibility that
mutation can occur. Then we would normally expect
the allele frequency in the next generation to be different from the allele frequency in the present generation. Specifically, imagine that A1 mutates to A2
with a frequency $\mu$ and that A2 mutates to A1 with a
frequency $\nu$. Then:
$$p_{t+1} = (1 - \mu)\,p_t + \nu\,(1 - p_t)$$
Clearly, $p_{t+1}$ will not normally equal $p_t$, so the population is not at equilibrium. But what if $p_t = \nu/(\mu + \nu)$?
It is not hard to verify that $p_{t+1}$ will also equal
$\nu/(\mu + \nu)$. Thus, $p_{t+1} = p_t$ and the population is at
equilibrium. At this equilibrium the rate at which A1
alleles give rise to A2 alleles ($\mu p_t$) is equal to the rate
at which A2 alleles give rise to A1 alleles ($\nu(1 - p_t)$) so
there is no net change in the frequency of either allele.
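The mutational equilibrium can be verified by iterating the recursion directly. The sketch below assumes the two-way mutation model, with A1 mutating to A2 at rate `mu` and A2 mutating to A1 at rate `nu`; the rates 0.01 and 0.02 are hypothetical values chosen only so that convergence is quick.

```python
def next_p(p, mu, nu):
    """Frequency of A1 after one generation of two-way mutation."""
    return (1.0 - mu) * p + nu * (1.0 - p)


def iterate_p(p0, mu, nu, generations):
    """Apply the mutation recursion repeatedly, starting from p0."""
    p = p0
    for _ in range(generations):
        p = next_p(p, mu, nu)
    return p


if __name__ == "__main__":
    mu, nu = 0.01, 0.02  # hypothetical mutation rates
    p_hat = nu / (mu + nu)
    for p0 in (0.1, 0.5, 0.9):
        print(p0, iterate_p(p0, mu, nu, 5000), p_hat)
```

Each generation shrinks the distance from the equilibrium by the factor 1 - mu - nu, so under this model all starting frequencies converge to nu / (mu + nu).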
$$p = \frac{1 - w_{22}}{2 - w_{11} - w_{22}}$$
where $w_{11}$ is the probability that the genotype homozygous for A1 survives relative to the probability that
the heterozygous genotype survives, and $w_{22}$ is the
probability that the genotype homozygous for A2
survives relative to the probability that the heterozygous genotype survives.
Often mutations cause deleterious effects on the
individuals that carry them. If it were not for the fact
that mutation is introducing new copies of these deleterious alleles, natural selection would tend to eliminate them from populations. If the mutations recur
repeatedly, however, the population will approach an
equilibrium where the rate at which natural selection
eliminates deleterious alleles is exactly balanced by the
rate at which mutation reintroduces them, a phenomenon often called mutation-selection balance. Let the
relative survival probabilities of the favorable homozygote, the heterozygote, and the deleterious homozygote be denoted as $1$, $1 - hs$, and $1 - s$, respectively, and let $\mu$ be the rate of mutation to the deleterious allele.
When the deleterious allele is completely recessive its
frequency is:
$$q = (\mu/s)^{1/2}$$
When the deleterious allele is expressed in heterozygotes its frequency is:
$$q = \mu/(hs)$$
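Both balance formulas can be compared with a direct iteration of selection followed by mutation. The sketch below is one possible model, not the entry's own derivation: it assumes random mating, viability selection with fitnesses 1, 1 - hs, and 1 - s, one-way mutation to the deleterious allele at rate `mu`, and no back mutation. The parameter values are hypothetical.

```python
import math


def next_q(q, mu, s, h):
    """One generation: viability selection, then mutation to the deleterious allele."""
    p = 1.0 - q
    w_bar = p * p + 2.0 * p * q * (1.0 - h * s) + q * q * (1.0 - s)
    q_sel = (p * q * (1.0 - h * s) + q * q * (1.0 - s)) / w_bar
    return q_sel + mu * (1.0 - q_sel)


def equilibrium_q(mu, s, h, q0=0.01, generations=50_000):
    """Iterate until the frequency has effectively stopped changing."""
    q = q0
    for _ in range(generations):
        q = next_q(q, mu, s, h)
    return q


if __name__ == "__main__":
    mu, s = 1e-4, 0.1  # hypothetical mutation rate and selection coefficient
    # Completely recessive case (h = 0), compared with (mu/s)^(1/2).
    print(equilibrium_q(mu, s, h=0.0), math.sqrt(mu / s))
    # Partially dominant case (h = 0.5), compared with mu/(hs).
    print(equilibrium_q(mu, s, h=0.5), mu / (0.5 * s))
```

In this model the recessive formula is the exact fixed point of the recursion, while mu/(hs) is an approximation that holds when the deleterious allele is rare.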
equilibrium are decreased in every generation. Similarly, small populations will tend to change in ways
that cause them to have allele frequencies that are
associated with high probabilities at stationarity.
Error Catastrophe
C Kurland and J Gallant
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0419
Erythroblastosis Fetalis
R F Ogle and C H Rodeck
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0420
Historical Background
The first recorded case of hemolytic disease of the
newborn was described in 1609, but it was not until
1932 that hydrops fetalis, jaundice, and kernicterus
were shown to be part of the same disease associated
with hemolytic anemia, extramedullary erythropoiesis,
hepatomegaly, and erythroblastosis. These features
were collectively known as erythroblastosis fetalis.
By 1939, Levine and Stetson had demonstrated the
involvement of the Rh antigen, but it was not until
1954 that Chown proved that the fetal hemolysis was
caused by the production of maternal anti-RhD alloimmune antibodies. Since 1970 prevention of the disease has been possible by routinely giving anti-RhD
γ-globulin to nonsensitized RhD-negative mothers
immediately following the birth of an RhD-positive
child. The antibody removes fetal cells from the maternal circulation before they can cause sensitization.
Rh Blood-Group System
Depending on the presence or absence of D antigen
on the red blood cell surface, individuals are classified
as Rh-positive or Rh-negative. In addition to the D
antigen there are two other major antigens, the C/c and
E/e antigens, which have important clinical implications, the only difference being that there is no apparent d antigen, where `d' refers to an absence of `D'.
In the case of Cc and Ee, both the upper and lower
case letters indicate the presence of serologically
definable antigen.
The genes encoding these three sets of antigens are
inherited together rather than randomly. Earlier investigators therefore proposed a single gene where
recombination would only rarely occur. When individuals totally lack these antigens, they usually have
membrane instability, suggesting that the Rh antigens
have major physiological importance. The ethnic incidence of the RhD-negative phenotype varies considerably, being about 15% in Caucasians, 35% in
Basques, and virtually zero in Asiatic Chinese and
Japanese.
The nomenclature of the Rh system is confusing,
but the most common system used is the Fisher-Race
system which was based on the theory that the Rh
system locus consists of three genes with antithetical alleles C/c, D/d, and E/e. The haplotypes are
described in triplets, CDe, cde, and cDE being the
most frequent.
Approximately 56% of Rh-positive subjects are heterozygous for the D antigen. When the mother is
RhD-negative and the father is heterozygous RhD-positive, there is a 50% chance that the fetus will be
RhD-negative and so not at risk for erythroblastosis
fetalis. Previously in clinical practice, prenatal determination of the RhD phenotype where the father
was heterozygous for the trait involved fetal blood
sampling with serological Rh typing, resulting in a 1-2%
fetal loss rate and a 40% risk of fetomaternal
hemorrhage, which may also have increased the risk
of sensitization. An alternative method is serial
amniocentesis for quantitation of bilirubin in amniotic
fluid. This technique is unable to distinguish an
RhD-positive fetus that is mildly affected from an
RhD-negative one. It also potentially exposes the
fetus to multiple invasive procedures.
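The heterozygote figure quoted above can be checked with a short calculation. This is a sketch under two assumptions that the text does not state: genotypes are in Hardy-Weinberg proportions, and the RhD-negative phenotype corresponds to the recessive homozygote at a single locus. The function name is hypothetical; the 15% input is the RhD-negative incidence given earlier for Caucasians.

```python
import math


def heterozygous_fraction_of_positives(freq_negative: float) -> float:
    """Fraction of RhD-positive individuals expected to be heterozygous.

    Assumes Hardy-Weinberg proportions and that RhD-negative individuals
    are homozygous for the allele lacking D.
    """
    if not 0 < freq_negative < 1:
        raise ValueError("freq_negative must lie strictly between 0 and 1")
    q = math.sqrt(freq_negative)   # frequency of the allele lacking D
    p = 1 - q                      # frequency of the D allele
    heterozygotes = 2 * p * q
    return heterozygotes / (1 - freq_negative)


fraction = heterozygous_fraction_of_positives(0.15)
```

Under these assumptions the result is close to 0.56, in line with the approximately 56% stated in the text.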
The ideal strategy for prenatal determination of fetal
RhD phenotype would be to generate a pair of PCR
primers that only amplified a specific region of the
RhD gene without cross hybridization to the RhCcEe
gene or any other gene. A second amplification, ideally
of the RhCcEe gene, should be performed in duplex.
This strategy is difficult to design in practice as there is
a high degree of homology between the RhCcEe
and D genes. Exon 10 of the RHD gene demonstrates
a region of divergence with a copy of the Alu repeat
motif which is found in a large number of other genes
and noncoding sequences. Bennett et al. (1993)
reported the first use of PCR for prenatal determination of fetal RhD using primers at the 5′ extreme end
of exon 10 in RHD. The 5′ primer lay within a region
of 100% homology between RhCcEe and RHD, but it
was the 3′ primer, designed to an RhD-specific region,
that gave the specificity to the reaction. A control
primer from sequences in exon 7, which amplified a
134-bp product from both RhCcEe and RHD, acted
as a control in the duplex reaction.
The original report was from 15 samples of amniotic fluid cells with the fetal RhD type being also
Preimplantation Determination of RhD Type
For sensitized women with heterozygous partners
who have experienced recurrent miscarriages or serial
transfusions and are unable to cope with future affected
pregnancies and also for those who have had severely
affected pregnancies prior to conventional therapy,
preimplantation determination of embryonic RhD
type after in vitro fertilization and before embryonic
transfer is a possible option. To perform molecular
diagnosis of RhD from DNA present in a single
human diploid cell requires amplification by `nested
PCR.' A low number of cycles is used with an outer
set of oligonucleotide primers. The second round of
PCR is then performed on a small aliquot from the
first reaction, using a higher number of amplification cycles
than that in the first reaction with primers internally
nested. A common primer rather than two sets of
primers for duplex PCR in the outer reaction of nested
PCR reduces the incidence of locus-specific amplification when used in a single-cell diploid genome.
To differentiate the two genes, one inner primer was
designed to anneal with RhD sequences and the
second to RhCcEe sequences, but at different yet
overlapping sites, so resulting in an amplification product of different size from that of the RhD gene. This
method would eliminate the risk of an RhD-positive
embryo being missed.
References
ES Cells
See: Embryonic Stem Cells
Escherichia coli
F C Neidhardt
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0422
The science of genetics has benefited from concentrated studies on a relatively small number of living
systems, the so-called paradigm or model organisms.
Examples include the laboratory (or house) mouse
(Mus musculus), the fruit fly (Drosophila melanogaster), the nematode worm (Caenorhabditis elegans),
the protozoan Paramecium (Paramecium aurelia), and
the bread mold (Neurospora crassa). The enteric bacterium Escherichia coli has been among the model
organisms of genetics ever since the middle of the
twentieth century. Study of E. coli and its viruses has
contributed much information to fundamental genetics, including the nature of the genetic material, the
molecular definition of genes, and the mechanisms of
their function and regulation. The biotechnology
Figure 1 Thin section of Escherichia coli. The DNA was immunostained, revealing the nucleoid as the convoluted
central area. (Reproduced with permission from Kellenberger E (1996) Structure and function at the subcellular level.
In: Neidhardt FC et al. (eds) Escherichia coli and Salmonella: Cellular and Molecular Biology, 2nd edn, ch. 4. Washington,
DC: ASM Press.)
questioned whether bacterial inheritance follows the
same rules that govern plants and animals.
into the host E. coli cells which proceed to produce a
new crop of phage.
The very characteristics that provided the early puzzles about phenotypic plasticity and mutability made
E. coli enormously valuable once it was realized that
its genetics would model that of plants and animals.
The small size of these cells, their rapid growth rate,
and the extensive phenotypic influence by the environment provided geneticists with powerful tools.
Small size meant that many millions, even billions of
individuals could be studied in a single experiment.
Rapid growth meant that many generations could be
produced within a single day. The ability to grow
these cells in chemically diverse media and at different
temperatures made it possible for biochemical genetics to flourish. The latter characteristic opened the
door to the biochemistry of how genes function and
how inheritance works, and also brought genetic analysis to bear on discovering the biochemical nature
and workings of the cell. As the structure and function
of DNA and the nature of the genetic code became
known, the biochemical genetics of E. coli evolved
into the field of molecular genetics. Soon thereafter
recombinant DNA techniques were developed, aided
greatly by studies of E. coli and its restriction enzymes.
Of particular importance was the early realization
of the rich opportunities for genetic studies provided
by the many kinds of bacteriophage (bacterial viruses,
or `phage') to which E. coli is susceptible.
Mutational Studies
Genetic Exchange
one-way transfer of DNA from a donor cell to a
recipient cell.
Conjugation
In 1946, Joshua Lederberg and Edward Tatum demonstrated that two particular strains that had multiple
nutritional requirements (auxotrophic mutants) different from each other would, when mixed together,
give rise to cells able to grow without any nutritional
supplement. These so-called prototrophic recombinants had a complete set of wild-type genes, so it was
logical to think that a mating had occurred by cell
fusion. But this turned out not to be the case. Cell
contact between the two `parental' strains was necessary, but, as shown by William Hayes in 1952, the two
wild-type recombinants did not arise by fusion of
a pair of the two different auxotrophs. Rather, one
of the strains, the donor, transferred its DNA into
the other, or recipient strain, by a process called conjugation.
How conjugation brings about the transfer of bacterial genes is an odd story having to do with plasmid
biology. Plasmids are autonomously replicating, circular, double-stranded DNA molecules, much smaller
than the chromosome, found in great variety within
E. coli and probably all bacteria. They confer a great
range of properties on the cell. In E. coli, `maleness,'
the ability to transfer DNA to a recipient, is related to
the presence of the F (for fertility) plasmid within the
donor cell. This plasmid carries genes that can bring
about the plasmid's transfer from one cell to another.
Among many related functions of these genes is the
ability to produce a hair-like protein structure called a
sex pilus, which helps the male cell (called donor or
F+) capture the female cell and maintain a conjugation
bridge through which the F plasmid DNA passes. The
transfer is initiated by a single-strand break that
occurs at a site called oriT (origin of transfer). The
linearized strand is driven into the female cell (called
recipient or F−) by a special mode of replication called
transfer replication of the plasmid. The strand entering
the recipient cell directs the synthesis of its complementary strand, the completed plasmid DNA
circularizes, and its genes become functional. The formerly F− cell grows a sex pilus and is now functionally
a donor, male cell. The donor cell remains F+ because
the strand not transferred directs synthesis of its
complement. Mixing a population of F+ cells with
one of F− cells results in the massive conversion of
the latter to F+ (Figure 2).
Transfer of F does not transfer chromosomal genes
from donor to recipient. A variation of this process is
responsible. Every so often, in a population of F+
cells, the plasmid DNA and the chromosome fuse
and form a cointegrated DNA molecule. A cell with
Transduction
Transformation
Gene Transpositions
Extrachromosomal Genomes
References
ESS
See: Evolutionarily Stable Strategies
to the possible benefits to the environment, to nonhuman animals, and to humans, for example by reducing
the use of insecticides and herbicides and from nutritional improvements.
Further Reading
Ets Family
C Brenner, J-L Baert, and Y de Launoit
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1567
Evolutionary Relatedness
Ets genes are conserved throughout the metazoan species ranging from diploblastic organisms to Drosophila and vertebrates, but they are absent from the
genome of plants and yeast. Phylogenetic analyses
indicate that the ets genes in contemporary species
are derived from an ancestral gene early in metazoan
evolution. The amplification of such families of transcription factors is viewed as a critical step in the
evolution of multicellular animals, including higher
vertebrates.
The ETS domain identifies all Ets proteins as sequence-specific DNA-binding proteins. This motif, composed
of 85 amino acids, forms a winged helix-turn-helix
tertiary structure, which allows Ets proteins to interact
with an approximately 10 bp long DNA element containing a GGAA/T central core. This recognition
motif is present in a vast majority of promoters and
enhancers.
Regulating Domains
Biological Importance
Biological Role
Many Ets transcription factors are subject to autoregulatory mechanisms, which inhibit their DNA-binding activity by domains outside the ETS domain.
This may function to prevent promiscuous DNA
binding by these transcription factors because of
their relatively nonstringent DNA-binding specificity.
Moreover, posttranslational modifications, such as
phosphorylation, represent other potential mechanisms for regulating DNA binding.
Ets proteins interact not only with the basal transcriptional complex, but also with other gene-specific
transcription factors. For example, direct physical
interaction between Ets factors and b-ZIP proteins
represents a conserved mechanism for regulating gene
expression in a variety of lymphoid and nonlymphoid
cell types.
Ets proteins also functionally cooperate with various transcriptional coactivators, such as the histone
acetylase CBP/p300 that modulates the chromatin
structure.
Further Reading
Euchromatin
A T Sumner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0425
Euchromatin is the term for those parts of chromosomes, generally the greater proportion, which show a
normal cycle of decondensation at the end of mitosis.
Most genes are in euchromatin, which, however, also
contains a very high proportion of nongenic DNA.
See also: Chromatin; Heterochromatin
Eugenics
See: Ethics and Genetics
Eukaryotes
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0428
A eukaryote is an organism whose cells have chromosomes with nucleosomal structure, separated from the
cytoplasm by a nuclear envelope, and which exhibit functional compartmentalization in distinct cytoplasmic
organelles.
See also: Prokaryotes
Euploid
J R S Fincham
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0429
Evolution
B Guttman
Eukaryotic Genes
T M Picknett and S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0427
although, of course, phylogenies are subject to continuing revision and modification like any other pieces
of scientific information. At least in eukaryotes, similar phylogenetic trees are generated whether one bases
them on the relationships between morphological
structures or the sequences of ribosomal RNA or of
widely distributed proteins like cytochrome c.
Theory of Evolution
The theory of evolution is based on the fundamental
nature of organisms. Organisms are genetic systems
(Guttman, 1999). That is, they are self-reproducing
systems that operate on the basis of instructions
encoded in their genomes, and so during the course
of reproduction, parental genomes must be replicated
to produce new copies for the offspring. But replication is inherently an error-prone process, and mutations continuously introduce genetic novelties.
Furthermore, sexual reproduction entails the reshuffling of chromosomes into different combinations,
so individuals acquire variant genomes, thus giving
them different traits. Organisms inhabit ecosystems
that afford various opportunities for obtaining the
resources they need (energy, raw materials, living
spaces, etc.). Each organism, with its particular combination of traits, has a particular ability to exploit
those resources (in other words, to adapt to a particular ecological niche) and thus experiences a certain
level of reproductive success, which is generally measured by an organism's fitness. Those with the highest reproductive success have, by definition, the
highest fitness, and thus are most successful in passing
on their particular genotypes. Thus, organisms are
subject to a process of natural selection, as those that
are most fit for a particular way of life (those that are
best adapted to a particular niche) are most successful
in reproducing and their genotypes become most
common. This, as Darwin recognized, is the most
central and most critical process governing evolution,
even though other factors also intervene. (The description of the process clearly has a certain tautological
character.)
division, augmented by occasional lateral genetic
transfer, often mediated by viruses. The features of
organisms on the many branches of this tree may
diverge from one another without limit or may be
kept somewhat confined by continuing selection.
The biological species concept has been applied most
consistently and successfully to certain groups of animals; it may be applied with some difficulty in plants,
which often are able to reproduce in more plastic ways
and to hybridize with one another quite freely. This
conception of a species has been challenged by the
phylogenetic species concept, which is much more
difficult to define but says, essentially, that a species
shall be considered a distinct branch of a phylogenetic
tree that can be distinguished morphologically or
genetically. The issue may be more anthropological
(that is, reflective of the human need to categorize
objects neatly) than biological; it is clear in any case
that `species' by any conception have diverged from
one another in the past and continue to do so.
Speciation
Extinction
The fossil record reveals three major patterns in evolution: speciation, extinction, and phyletic evolution.
Speciation has already been described. Extinction is
clearly a major feature of evolution. Although a few
species have apparently persisted for very long times
(in relatively stable environments, such as the depths
of the ocean), most species have appeared in the fossil
record, have persisted for periods on the order of
100 000 years to a few million years, and then have
become extinct. The paleontologist G.G. Simpson
estimated that 99.9% of all species have become
extinct. Phyletic evolution refers to a gradual change
in morphology in a certain direction; for instance,
hominid (human) evolution, while involving apparent
instances of speciation, has also entailed a gradual
increase in height and cranial capacity, and certain
trends in anatomical details. However, some paleontologists have proposed that phyletic evolution is illusory and that evolution is more properly described as
punctuated equilibrium; that is, a species generally
endures with little or no change until it becomes
extinct, but occasional instances of speciation occur
rapidly, so it may appear that a single species has
gradually changed. This may be a non-issue. Instances
of both phyletic evolution and punctuated equilibrium can apparently be documented in the fossil
record.
The synthetic theory pictured evolution as being
driven largely by gradual selection of alleles with small
References
Microbial Evolution
Mice, humans, the lowly intestinal bacterium Escherichia coli, and all other forms of life evolved from the
same common ancestor that was alive on this planet a
few billion years ago. We know this is the case from
the universal use of the same molecule, DNA, for
the storage of genetic information, and from the
nearly universal genetic code. But E. coli has a genome
size of 4.2 megabases (Mb), while the mammalian
genome is nearly 1000-fold larger at ~3000 Mb. If one
assumes that our common ancestor had a genome size
that was no larger than that of the modern-day E. coli,
the obvious question one can ask is where did all of
our extra DNA come from?
The answer is that our genome grew in size and
evolved through a repeated process of duplication and
divergence. Duplication events can occur essentially at
random throughout the genome and the size of the
duplication unit can vary from as little as a few nucleotides to large subchromosomal sections that are tens,
or even hundreds, of megabases in length. When the
duplicated segment contains one or more genes, either
the original or duplicated copy of each is set free to
accumulate mutations without harm to the organism
since the other good copy with an original function
will still be present.
Duplicated regions, like all other genetic novelties,
must originate in the genome of a single individual and
their initial survival in at least some animals in each
subsequent generation of a population is, most often, a
simple matter of chance. This is because the addition
of one extra copy of most genes to the two already
present in a diploid genome is usually tolerated
without significant harm to the individual animal. In
the terminology of population genetics, most duplicated units are essentially neutral (in terms of genetic
selection) and thus, they are subject to genetic drift,
inherited by some offspring but not others derived
from parents that carry the duplication unit. By
chance, most neutral genetic elements will succumb
to extinction within a matter of generations. But
even when a duplicated region survives for a significant period of time, random mutations in what
were once-functional genes will almost always lead
to nonfunctionality. At this point, the gene becomes a
pseudogene. Pseudogenes will be subject to continuous genetic drift with the accumulation of new mutations at a pace that is so predictable (~0.5% divergence
per million years) as to be likened to a `molecular
clock.' Eventually, nearly all pseudogene sequences
will tend to drift past a boundary where it is no longer
possible to identify the functional genes from which
they derived. Continued drift will act to turn a once-functional sequence into a sequence of essentially random DNA.
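The `molecular clock' idea can be illustrated with a short calculation. This is a minimal sketch that assumes divergence accumulates linearly at the quoted rate of about 0.5% per million years, with no correction for multiple substitutions at the same site; the function names are hypothetical.

```python
def divergence_time_myr(percent_divergence: float,
                        rate_percent_per_myr: float = 0.5) -> float:
    """Estimate elapsed time (million years) from observed sequence divergence.

    Assumes a constant, linear clock; real estimates would correct for
    saturation as divergence becomes large.
    """
    if percent_divergence < 0 or rate_percent_per_myr <= 0:
        raise ValueError("divergence must be >= 0 and rate must be > 0")
    return percent_divergence / rate_percent_per_myr


def expected_divergence_percent(time_myr: float,
                                rate_percent_per_myr: float = 0.5) -> float:
    """Expected divergence (%) after a given time under the same linear clock."""
    if time_myr < 0 or rate_percent_per_myr <= 0:
        raise ValueError("time must be >= 0 and rate must be > 0")
    return time_myr * rate_percent_per_myr


# A pseudogene 5% diverged from its functional source would be dated to
# roughly 10 million years under these assumptions.
age = divergence_time_myr(5.0)
```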
Miraculously, every so often, the accumulation of a
set of random mutations in a spare copy of a gene can
lead to the emergence of a new functional unit, or
gene, that provides benefit and, as a consequence,
selective advantage to the organism in which it resides.
Usually, the new gene has a function that is related to
the original gene function. However, it is often the
case that the new gene will have a novel expression
pattern (spatially, temporally, or both), which must
result from alterations in cis-regulatory sequences that
occur along with codon changes. A new function can
emerge directly from a previously functional gene or
even from a pseudogene. In the latter case, a gene can
go through a period of nonfunctionality during which
there may be multiple alterations before the gene
comes back to life. Molecular events of this class can
play a role in `punctuated evolution' where, according
to the fossil or phylogenetic record, an organism or
evolutionary line appears to have taken a `quantum
leap' forward to a new phenotypic state.
Duplication by Transposition
With duplication acting as such an important force in
evolution, it is critical to understand the mechanisms
by which it occurs. These fall into two broad categories: (1) transposition is responsible for the dispersion of related sequences; (2) unequal crossing-over is
responsible for the generation of gene clusters. Transposition refers to a process in which one region of the
genome relocates to a new chromosomal location.
Transposition can occur either through the direct
movement of original sequences from one site to
another or through an RNA intermediate that leaves
the original site intact. When the genomic region itself
(rather than its proxy) has moved, the `duplication' of
genetic material actually occurs in a subsequent generation after the transposed region has segregated into
the same genome as the originally positioned region
from a nondeleted homolog. In theory, there is no
upper limit to the size of a genomic region that can
be duplicated in this way.
A much more common mode of transposition
occurs by means of an intermediate RNA transcript
that is reverse-transcribed into DNA and then
inserted randomly into the genome. This process is
referred to as retrotransposition. The size of the retrotransposition unit, called a retroposon, cannot be
larger than the size of the intermediate RNA transcript. Retrotransposition has been exploited by various families of selfish genetic elements, some of
which have been copied into 100 000 or more locations
dispersed throughout the genome with a self-encoded
reverse transcriptase. But, examples of functional,
intronless retroposons, such as Pgk2 and Pdha2,
have also been identified. In such cases, functionality
is absolutely dependent upon novel regulatory elements either present at the site of insertion or created
by subsequent mutations in these sequences.
Figure 1 Unequal crossing-over generates gene families. The left side illustrates an unequal crossing-over event and
the two products that are generated. One product is deleted and the other is duplicated for the same region. In this
example, the duplicated region contains a second complete copy of a single gene (B). The right side illustrates a
second round of unequal crossing-over that can occur in a genome that is homozygous for the original duplicated
chromosome. In this case, the crossover event has occurred between the two copies of the original gene. Only the
duplicated product generated by this event is shown. Over time, the three copies of the B gene can diverge into three
distinct functional units of a gene family cluster.
In all cases, unequal crossing-over between homologs results in two reciprocal chromosomal products:
one will have a duplication of the region located between the two sites and the other will have a deletion
that covers the same exact region (Figure 1). It is important to remember that, unlike retrotransposition,
unequal crossing-over operates on genomic regions
without regard to functional boundaries. The size of
the duplicated region can vary from a few base pairs to
tens or even hundreds of kilobases and it can contain
no genes, a portion of a gene, a few genes, or many.
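The reciprocal products of unequal crossing-over can be sketched with a toy model. The representation below is an assumption made for illustration only: each chromosome is a list of labeled segments, and a crossover joins the left part of one homolog to the right part of the other at breakpoints that need not correspond. The function name and segment labels are hypothetical.

```python
from typing import List, Tuple


def unequal_crossover(homolog_a: List[str], homolog_b: List[str],
                      break_a: int, break_b: int) -> Tuple[List[str], List[str]]:
    """Return the two reciprocal products of a crossover.

    break_a and break_b are the numbers of segments to the left of the
    breakpoint on each homolog. Equal values model a normal crossover;
    unequal values model misaligned pairing.
    """
    product_1 = homolog_a[:break_a] + homolog_b[break_b:]
    product_2 = homolog_b[:break_b] + homolog_a[break_a:]
    return product_1, product_2


# First event: misaligned pairing around gene B.
duplicated, deleted = unequal_crossover(
    ["A", "B", "C"], ["A'", "B'", "C'"], break_a=2, break_b=1
)
# duplicated carries two copies of B; deleted carries none.

# Second event in a genome homozygous for the duplication.
two_copy = ["A", "B1", "B2", "C"]
three_copy, one_copy = unequal_crossover(two_copy, two_copy, break_a=3, break_b=2)
```

In this toy model the second event yields a chromosome with three copies of the B gene, the starting point for a three-member gene family cluster, alongside a reciprocal product that has returned to a single copy.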
also occur across related DNA sequences that are even
distributed to different chromosomes. There have
been numerous modifications of the Holliday model
including those proposed by Meselson and Radding
that allow a better fit to the actual data, and there is
still lack of consensus on some of the details
involved. However, the central feature of the Holliday
model (single-strand invasion, branch migration, and
duplex resolution) is still considered to provide the
molecular basis for gene conversion.
See also: Gene Conversion; Holliday's Model;
Major Histocompatibility Complex (MHC);
Molecular Clock; Unequal Crossing Over
Evolutionarily Stable Strategies
E Pianka
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0431
When natural selection acts on several different alternative behaviors, the optimal one should be favored.
If costs and benefits of alternatives depend on choices
made by other individuals, optimal solutions are not
always as obvious as they are in simpler situations. An
evolutionarily stable strategy, or ESS, is a mathematical definition for an optimal choice of strategy under
such conditions.
Interactions between two individuals can be depicted as a mathematical game between two players.
A branch of mathematics, called game theory, seeks to
find the best strategy to play in any given carefully
defined game. The central problem of game theory is
to find the best strategy to take in a game that depends
on what other players are expected to do.
Originally used in studies of economics and human
conflicts of interest, game theoretical thinking was
first used in biology by Hamilton (1967) to study evolution of sex ratios. Later, game theory was explicitly
applied to behavioral biology by Maynard Smith (1972)
and Maynard Smith and Price (1973). Maynard Smith
coined the term ESS for a refinement of the Nash
equilibrium used by economists to define a solution
to a game.
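The relationship between a Nash equilibrium and an ESS can be made concrete with a small check. The sketch below uses the commonly stated Maynard Smith conditions for pure strategies in a symmetric two-player game: strategy i is an ESS if, against every alternative j, either i does strictly better against i than j does, or the two tie against i and i does strictly better against j than j does against itself. The payoff matrices and function names are hypothetical and chosen only for illustration.

```python
from typing import Sequence

Payoffs = Sequence[Sequence[float]]  # payoffs[i][j]: payoff to i when playing against j


def is_nash(payoffs: Payoffs, i: int) -> bool:
    """True if no alternative pure strategy does strictly better against i."""
    return all(payoffs[i][i] >= payoffs[j][i]
               for j in range(len(payoffs)) if j != i)


def is_ess(payoffs: Payoffs, i: int) -> bool:
    """True if pure strategy i satisfies both Maynard Smith conditions."""
    for j in range(len(payoffs)):
        if j == i:
            continue
        if payoffs[i][i] > payoffs[j][i]:
            continue  # i strictly beats the mutant j against residents
        if payoffs[i][i] == payoffs[j][i] and payoffs[i][j] > payoffs[j][j]:
            continue  # tie against residents, but i does better against mutants
        return False
    return True


# Hypothetical payoffs in which strategy 0 is both Nash and an ESS.
contest = [[1, 4],
           [0, 2]]
# All payoffs tie: strategy 0 is a Nash equilibrium but not an ESS.
all_ties = [[1, 1],
            [1, 1]]
```

The `all_ties` matrix shows where the two definitions part ways: a tie is enough for a Nash equilibrium, whereas an ESS must also win the tie-break against the alternative strategy.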
The notion of a Nash equilibrium makes some tacit
assumptions about rational foresight on the part of the
player. An ESS must meet a stricter set of requirements than Nash equilibria; the mathematical difference boils down to whether a tie between strategies
leads to a new strategy being considered better. An
References
Hamilton WD (1967) Extraordinary sex ratios. Science 156:
477-488.
Hammerstein P (1981) The role of asymmetries in animal contests. Animal Behaviour 29: 193-205.
Johnstone RA (1997) The evolution of animal signals. In: Behavioural Ecology: An Evolutionary Approach, 4th edn, pp. 155-178.
Oxford: Blackwell Scientific Publications.
Maynard Smith J (1972) On Evolution. Edinburgh: Edinburgh
University Press.
Maynard Smith J and Price GR (1973) The logic of animal conflict. Nature 246: 15-18.
Maynard Smith J (1982) Evolution and the Theory of Games.
Cambridge: Cambridge University Press.
Maynard Smith J and Parker GA (1976) The logic of asymmetric
contests. Animal Behaviour 24: 159-175.
Evolutionary Rate
R E Lenski
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0432
Functional Constraints
A mutation may be deleterious, neutral, or beneficial
in terms of its effect on an organism's reproductive
success. Deleterious and neutral mutations are both
very common, whereas beneficial mutations are much
rarer and thus have less effect on variation in evolutionary rates at the genetic level. Because of the redundancy of the genetic code, some point mutations in
protein-encoding genes (especially those at the third
position in a codon) will not actually alter the amino
acid sequence of the protein. Such synonymous mutations are therefore likely to be neutral. By contrast,
nonsynonymous mutations cause a change from one
amino acid to another, and such mutations often have
Ewing's Tumor
N Coleman
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1568
Exchange
J R S Fincham
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0434
Exchange Pairing
See: Exchange, Segmental Interchange
Excision Repair
J T Reardon and A Sancar
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0437
DNA Damage
At some point in time virtually all cells are exposed to endogenous and environmental agents that
damage the genome. Genetic damage is a rare event
as cells possess multiple mechanisms for eliminating
or neutralizing genotoxic substances before they
damage DNA. But damage does occur and sources
of modification include rare misincorporation events
during DNA replication, normal cellular metabolism
involving oxygen and water which generate DNA-damaging free radicals, and extracellular sources
such as environmental chemicals and sunlight (UV
radiation). Damage may include base modification
or cleavage of the phosphodiester backbone such
that RNA and DNA polymerases are blocked at
the lesion (modified base or strand break) and
unable to translocate along the helix, thus interrupting normal DNA replication and RNA transcription.
[Figure 1: excision repair diagram. Recoverable labels: steps (i) to (iv); factors TFIIH, RPA, PCNA, RFC, and ligase; 5′ and 3′ nicks; panel (A) 30-mer patch; panel (B) 12-mer patch.]
sequential assembly of subunits leading to the dual incision event. These six factors are XPA, RPA, TFIIH
(XPB and XPD plus four additional polypeptides),
XPC.HR23B, XPG, and XPF.ERCC1. Following
damage recognition by XPA and RPA, XPC and
TFIIH are recruited to the damage site (see Figure 1)
to form the first stable preincision complex. The
initial, localized helical denaturation resulting from
DNA damage is extended both 5′ and 3′ by the helicase activities of two TFIIH subunits, XPB and XPD.
XPC helps to stabilize this open complex and, furthermore, XPC is a molecular matchmaker that dissociates
after recruiting and positioning XPG 3′ to the DNA
damage. The last factor to assemble is the
XPF.ERCC1 heterodimer. Dual incisions follow
rapidly with XPG nicking the DNA at the 6th ± 3
phosphodiester bond 3′ to the damage and XPF.
ERCC1 hydrolyzing at the 20th ± 5 bond 5′ to the
lesion. The 24–32 nucleotide-long oligomer containing the damaged base is released from the DNA
(excision) and repair factors rapidly dissociate following the dual incision event, leaving a gapped substrate. In subsequent steps, DNA polymerases δ and ε
and their accessory factors, PCNA and RFC, assemble at the gapped molecule and the undamaged strand
is used as a template for precise resynthesis of the
DNA. The repair patch size matches the size of the
excision gap and, when the gap is filled to the 3′ end,
the repair patch is ligated to the parental DNA by a
ligase.
Exon
See: Introns and Exons
Exonucleases
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1836
Expression Vector
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1837
Expressivity
J A Fossella
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0440
Expressivity refers to the variation seen among individuals expressing a particular trait or mutant phenotype. `Variable expressivity' is the term used to describe
a trait or mutant phenotype that fluctuates in degree or
severity from individual to individual in a population.
For example, all individuals of a population expressing
a trait or mutant phenotype such as `spotted' may
show an identical number of spots. This would be an
example of low or nonvariable expressivity. Alternatively, some individuals may have many spots while
others only a few and many with an intermediate
number of spots. This would be an example of variable
expressivity, since all the individuals express the trait
or mutant phenotype of `spotted' but vary in the
degree of spotting.
Expressivity is similar in meaning to `penetrance'
and the two terms are often used together when
describing mutations. For example, certain weak
alleles of the W locus seen in mice result in white
coat color spots. These mutant alleles are said to
show reduced penetrance and variable expressivity.
The distinction between penetrance and expressivity
is that penetrance refers to the genotype while expressivity refers to the phenotype. In this example, only
some of the mice that carry the W/+ genotype show
any spots at all. This is an example of reduced penetrance. Of the animals that show the spotted `phenotype' however, some tend to show much spotting
Extranuclear Genes
See: Cytoplasmic Inheritance
F
F Factor
S M Rosenberg and P J Hastings
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0454
[Figure diagrams not reproduced. Labels from the F factor map: 100 kb, IS3, tra operon, IS2, oriT, oriV, leading region. Labels from the integration diagram: bacterial chromosome, recombination, Hfr.]
Conjugative Transfer
Cells carrying the F factor are called male or donor
cells. They express long, rod-like pili on their surfaces
and use these to attach to female cells for transfer of
the F factor. Once attached, the pili retract, bringing
the mating pair into close contact. The TraI endonuclease makes a single-strand nick at oriT, and, with its
helicase activity, peels back the 5′ end to which it
remains covalently bound. The 3′ end primes leading
strand synthesis that displaces the 5′-ending strand.
The displaced strand is transferred into the recipient
cell. Whether the DNA is transferred through a pilus
or via some other close contact is not yet clear.
The synthesis and strand displacement end when the
whole single-strand length of the circle has been displaced and the 3′ growing end again reaches oriT. TraI
is hypothesized to nick again, releasing the end of the
displaced strand, and to assist recircularization of the
ends in the recipient cell. Meanwhile, the complement
of the transferred single-strand is synthesized in the
recipient cell, such that a duplex circle is reestablished.
The recipient thus becomes an F-carrying male, and
the donor remains male. The Tra proteins can act on
other bacterial plasmids with similar origins of transfer, including the ColE1 plasmids (from which
pBR322, pUC, and many other cloning vectors are
derived). The process of recruiting and transferring
other plasmids is called mobilization. The sites on
those plasmids that allow mobilization are called
mob and the nick site itself (oriT, which is necessary
but not sufficient for transfer) has also been called bom
and nic. pBR322 lacks mob but carries bom, and
cannot be mobilized by the F factor unless a third
plasmid, apparently supplying mob function (ColK),
is present. pUC plasmids contain neither mob nor
bom sites and so cannot be mobilized.
Figure 3 Conjugative transfer of the F plasmid. Each line represents a DNA strand; dashed lines represent newly
synthesized DNA, and arrowheads 3′ ends. During transfer, the F-encoded TraI endonuclease cleaves one strand of
DNA at the transfer origin, oriT, and remains covalently bound to the 5′ end. Leading-strand synthesis primed from
the 3′ end displaces the cleaved strand, which is transferred into a recipient cell. Lagging-strand synthesis and
recircularization occur in the recipient, regenerating an F plasmid there.
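The mobilization rules described in this section can be summarized as a small decision function. This is only an illustrative sketch: the `Plasmid` class, its field names, and the `can_be_mobilized` helper are hypothetical, and treating ColE1 as carrying both mob and bom is an assumption inferred from the statement that ColE1 plasmids can be mobilized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plasmid:
    """Hypothetical record of the two features discussed in the text."""
    name: str
    has_mob: bool  # carries its own mob functions
    has_bom: bool  # carries the nick site (bom/nic, i.e., oriT)


def can_be_mobilized(plasmid: Plasmid, helper_supplies_mob: bool = False) -> bool:
    """Return True if F could mobilize the plasmid under the rules in the text.

    The nick site must be on the plasmid itself; mob function may come from
    the plasmid or from a third (helper) plasmid such as ColK.
    """
    if not plasmid.has_bom:
        return False
    return plasmid.has_mob or helper_supplies_mob


# Assumed feature sets, following the three cases named in the text.
COLE1 = Plasmid("ColE1", has_mob=True, has_bom=True)
PBR322 = Plasmid("pBR322", has_mob=False, has_bom=True)
PUC = Plasmid("pUC", has_mob=False, has_bom=False)
```

Under these assumptions, pBR322 becomes mobilizable only when `helper_supplies_mob=True`, while a pUC plasmid is never mobilizable because it lacks the nick site.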
F1 Generation
F1 Hybrid
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0442
FAB Classification of Leukemia
B Bain
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1569
research and also, by providing widely accepted terminology and definitions, facilitated clinical trials and
international collaboration.
See also: Leukemia; WHO Classification of
Leukemia
Fabry Disease
(α-Galactosidase A
Deficiency)
R J Desnick
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0443
1-Factor Crosses
When only one marker distinguishes two, sexually
reproducing eukaryotic parents, three Mendelian
principles can be illustrated:
1. The F1, diploid offspring from a cross between two
pure-breeding diploid parents (P, the parental generation), may resemble either one parent or the
other, illustrating dominance/recessiveness.
2. The population of haploid cells (gametes) produced
by meiosis in the F1 is composed equally of cells
containing one or the other of the markers that
distinguished the parents, illustrating Mendel's law
of segregation.
3. The diploid generation resulting from the union of
F1 gametes (F2) will have a phenotypic ratio of 3:1,
favoring the type determined by the dominant
gene. This ratio is expected if gametes unite with
each other at random, without regard to their
genotype.
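The three principles above can be checked by brute-force enumeration. The following is a minimal sketch, assuming a single gene with a fully dominant allele written in uppercase and a recessive allele in lowercase; the function name and the genotype encoding are illustrative choices, not part of the original text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def f2_phenotype_ratio(f1_genotype: str = "Aa") -> dict:
    """Cross F1 x F1 and return the phenotype frequencies in the F2."""
    # Segregation: each F1 allele ends up in half of the gametes.
    gametes = list(f1_genotype)
    counts = Counter()
    # Random union: every egg/sperm combination is equally likely.
    for egg, sperm in product(gametes, repeat=2):
        # Dominance: one uppercase allele is enough for the dominant phenotype.
        dominant = egg.isupper() or sperm.isupper()
        counts["dominant" if dominant else "recessive"] += 1
    total = sum(counts.values())
    return {phenotype: Fraction(n, total) for phenotype, n in counts.items()}


ratio = f2_phenotype_ratio("Aa")
print(ratio["dominant"], ratio["recessive"])  # 3/4 1/4
```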
2-Factor Crosses
When the P generation differs by two markers located
in different genes, additional Mendelian principles are
illustrated:
1. The frequency of recombinants among haploid
cells produced by meiosis in the F1 illustrates
Mendel's principle of independent assortment
when, as is likely to be true, the two markers
involved are on separate chromosomes. If the two
markers are on the same chromosome, they may
illustrate linkage, by the production of fewer than
50% recombinant haploid products of meiosis.
2. When the factors involved influence different
phenotypes, the F2 usually manifests independent
expression of those phenotypes. If the two genes
are on separate chromosomes, this results in relative
frequencies of the four phenotypes of 9:3:3:1, illustrating the mosaic nature of phenotype determination.
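A short sketch can make the link between recombination frequency and the F2 ratio explicit. It assumes an F1 of genotype AB/ab (both dominant alleles inherited from the same parent), complete dominance at both genes, and a recombination fraction `r` between them; `r = 1/2` corresponds to independent assortment. The function name and phenotype labels are illustrative only.

```python
from fractions import Fraction
from itertools import product


def f2_phenotypes(r: Fraction) -> dict:
    """F2 phenotype frequencies from an AB/ab F1 with recombination fraction r."""
    half = Fraction(1, 2)
    gametes = {
        "AB": (1 - r) * half,  # parental
        "ab": (1 - r) * half,  # parental
        "Ab": r * half,        # recombinant
        "aB": r * half,        # recombinant
    }
    freqs = {key: Fraction(0) for key in ("A_B_", "A_bb", "aaB_", "aabb")}
    for (g1, p1), (g2, p2) in product(gametes.items(), repeat=2):
        a_dominant = "A" in (g1[0], g2[0])
        b_dominant = "B" in (g1[1], g2[1])
        key = ("A_" if a_dominant else "aa") + ("B_" if b_dominant else "bb")
        freqs[key] += p1 * p2
    return freqs


# Independent assortment (r = 1/2) gives 9:3:3:1.
print(f2_phenotypes(Fraction(1, 2)))
```

With `r` below 1/2 the two parental phenotype classes are over-represented, which is how linkage shows up in the F2.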
3-Factor Crosses
When the P generation differs by three markers, new
principles emerge if the three markers are linked.
1. The recombination frequencies for the factors
taken two at a time determine the order of the
markers on the linkage map. In the absence of complications, this corresponds to the order of the markers on the chromosome and genetically defines the
concept of locus.
2. The frequency of double crossovers may be different from that expected if simultaneous crossing-over in the two joint intervals were the result of
statistically independent exchange events (interference).
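Both points can be illustrated numerically. The sketch below infers marker order from the three pairwise recombination frequencies (the pair with the largest frequency flanks the third marker) and quantifies interference with the coefficient of coincidence, a standard measure that the text does not name. All function names and the example frequencies are hypothetical.

```python
def marker_order(rf: dict) -> tuple:
    """Infer linear order from pairwise recombination frequencies.

    rf maps frozenset({m1, m2}) to the recombination frequency of that pair.
    """
    outer = max(rf, key=rf.get)   # the most distant pair flanks the map
    markers = set().union(*rf)    # all three marker names
    (middle,) = markers - outer   # the remaining marker lies between them
    left, right = sorted(outer)
    return (left, middle, right)


def coefficient_of_coincidence(rf_left: float, rf_right: float,
                               observed_double: float) -> float:
    """Observed double-crossover frequency divided by the frequency
    expected if exchanges in the two intervals were independent."""
    expected_double = rf_left * rf_right
    return observed_double / expected_double


# Hypothetical data for three linked markers x, y, and z.
rf = {
    frozenset({"x", "y"}): 0.10,
    frozenset({"x", "z"}): 0.20,
    frozenset({"y", "z"}): 0.276,
}
print(marker_order(rf))  # ('y', 'x', 'z')
coincidence = coefficient_of_coincidence(0.10, 0.20, observed_double=0.012)
interference = 1 - coincidence
```

A coincidence below 1 (positive interference) means that fewer double crossovers were observed than independence would predict.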
Tetrad Analysis
Some fungi produce meiotic spore tetrads in which the
order of the spores in the ascus reflects their origin via
the two divisions of meiosis. The first two spores in the
ascus are sister spores from the same second meiotic
division, as are the last two spores. In a 1-factor cross,
the frequency with which sister spores carry the same
marker (first division segregation) is a measure of
the distance of that marker from the centromere of the
chromosome on which it is carried. If the frequency is
close to 100%, the marker is close to its centromere.
(For those species, like Neurospora crassa, that have
eight ascospores by virtue of postmeiotic mitosis,
substitute `spore pairs' for `spores' in the above.)
In organisms with unordered spore tetrads (e.g.,
Saccharomyces), linkage of a newly found marker to
its centromere can be established when another marker tightly linked to a different centromere is available. In a 2-factor cross involving the new and the old
markers, the frequency of tetratype tetrads is indicative of the degree of linkage under question. If the
frequency of tetratype tetrads is close to zero, the new
marker also is close to its centromere.
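The relationship between segregation pattern and centromere distance can be turned into a rough map estimate. The sketch below uses the common textbook approximation, not stated in the text above, that the marker-to-centromere distance in map units is half the percentage of second division segregation asci; the approximation is only reasonable for markers fairly close to the centromere. The function name and the counts are illustrative.

```python
def centromere_distance_cM(n_first_division: int, n_second_division: int) -> float:
    """Approximate marker-to-centromere distance from ordered tetrad counts.

    Only half of the spores in a second division segregation ascus are
    recombinant for the marker-centromere interval, hence the factor 1/2
    (written here as 50 rather than 100 percent).
    """
    total = n_first_division + n_second_division
    if total == 0:
        raise ValueError("no asci scored")
    return 50 * n_second_division / total


# Hypothetical counts: 90 first division and 10 second division asci.
print(centromere_distance_cM(90, 10))  # 5.0
```

For unordered tetrads, the same calculation could be applied to the tetratype frequency, assuming the reference marker is tightly linked to its own centromere.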
Facultative
Heterochromatin
See: Heterochromatin
Familial
Hypercholesterolemia
M S Brown and J L Goldstein
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0444
manifest three- to eightfold elevations of plasma LDL,
and they typically have heart attacks in childhood.
The disorder has been observed in nearly every population of the world, placing FH among the most prevalent single-gene disorders in humans. More than 500
mutations in the LDL receptor gene have been defined
by genomic analysis. The LDL receptor was the first
cell-surface receptor that was recognized to carry a
protein into cells by receptor-mediated endocytosis.
Comparison of normal and mutant FH cells helped to
establish the properties of this fundamental process,
which is now known to be used for many purposes in
all animal cells.
See also: Genetic Diseases
Fanconi's Anemia
C Mathew
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0445
Favism
L Luzzatto
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0446
F-Duction
F-duction is the same as sexduction, i.e., the high-frequency transfer of a segment of bacterial (for example, Escherichia coli) DNA incorporated into an
F′ plasmid.
Fate Map
Feline Genetics
S J O'Brien
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0165
[Figure 1 tree diagram not reproduced; the scale axis runs from 12 to 0.0. Group labels: Panthera genus, Lynx genus, Asian leopard cat group, Caracal cat group, Bay cat group, Puma group, Domestic cat lineage, Ocelot lineage. Species shown, in figure order (scientific name, common name):
Panthera uncia, snow leopard; Panthera onca, jaguar; Panthera pardus, leopard; Panthera leo, lion; Panthera tigris, tiger; Neofelis nebulosa, clouded leopard;
Pardofelis marmorata, marbled cat; Lynx rufus, bobcat; Lynx pardinus, Iberian/pardel lynx; Lynx lynx, Eurasian lynx; Lynx canadensis, Canadian lynx; Prionailurus rubiginosus, rusty-spotted cat;
Mayailurus iriomotensis, Iriomote cat; Prionailurus bengalensis, leopard cat; Prionailurus viverrinus, fishing cat; Prionailurus planiceps, flat-headed cat; Leptailurus serval, serval; Caracal caracal, caracal; Profelis aurata, African golden cat; Catopuma temmincki, Temminck's golden cat; Catopuma badia, Bornean bay cat; Herpailurus yagouaroundi, jaguarundi; Puma concolor, puma, mountain cat, cougar; Acinonyx jubatus, cheetah;
Felis catus, domestic cat; Felis silvestris, African/European wild cat; Felis margarita, sand cat; Felis chaus, jungle cat; Felis nigripes, black-footed cat; Felis bieti, Chinese mountain cat; Otocolobus manul, Pallas cat;
Oncifelis geoffroyi, Geoffroy's cat; Oncifelis guigna, kodkod, guina, huina; Lynchailurus colocolo, pampas cat; Leopardus tigrinus, tiger cat, oncilla; Leopardus pardalis, ocelot; Leopardus wiedii, margay; Oreailurus jacobitus, Andean mountain cat.]
Figure 1 Molecular phylogeny of the 37 species of cats based on nuclear and mitochondrial DNA divergence patterns.
Female Carriers
D E Wilcox
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0448
[Pedigree diagram not reproduced: three generations (I:1 and I:2; II:1 to II:5; III:1 to III:4), with each individual's X and Y chromosomes labeled + or md.]
Figure 1 Example pedigree of Becker muscular dystrophy showing the arrangement of the alleles on each
individual's sex chromosomes.
Most will have a healthy phenotype and so carrier
detection is a major role of the genetic clinic. A female
carrier of an X-linked recessive disorder has a 1 in 2
chance of passing the mutant allele to each child. Since
the child has a 1 in 2 chance of being male, her chance
of having an affected male is 1 in 4.
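The 1 in 2 and 1 in 4 figures follow from enumerating the four equally likely combinations of parental sex chromosomes. The sketch below assumes, as the passage implicitly does, that the father is unaffected and contributes either a normal X or a Y; the function name and the allele labels `X+` and `Xmd` are illustrative.

```python
from fractions import Fraction
from itertools import product


def offspring_of_carrier_mother() -> dict:
    """Risks per pregnancy for a carrier mother and an unaffected father."""
    maternal = ["X+", "Xmd"]  # carrier mother: one normal, one mutant X
    paternal = ["X+", "Y"]    # assumed unaffected father
    outcomes = list(product(maternal, paternal))
    p = Fraction(1, len(outcomes))  # each combination is equally likely
    risks = {
        "inherits_mutant_allele": Fraction(0),
        "affected_male": Fraction(0),
        "carrier_female": Fraction(0),
    }
    for from_mother, from_father in outcomes:
        if from_mother == "Xmd":
            risks["inherits_mutant_allele"] += p
            if from_father == "Y":
                risks["affected_male"] += p
            else:
                risks["carrier_female"] += p
    return risks


print(offspring_of_carrier_mother())
```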
Although carrier females are heterozygous for any
X-linked recessive trait they carry, only one allele is
active in each cell. In early embryogenesis in females,
one of each cell's X chromosomes is randomly and
permanently inactivated. The mix in the tissues
usually prevents the development of the full mutant
phenotype but female carriers of an X-linked recessive
trait are at risk of developing variable expression of the
disorder. This contrasts with autosomal recessive
traits where carrier heterozygotes have the normal
phenotype.
Pedigree Analysis
Features of X-Linked Recessive Pedigrees
New mutation
Gonadal mosaicism
Mother of a sporadic affected male If the mother of
Carrier Tests
Direct Tests, which Confirm Carrier Status
Clinical geneticists can combine the results of several independent conditional carrier tests together
with conditional pedigree information using a Bayes
calculation to produce a final carrier risk. Where the
mutation is not detectable, genetic linkage studies can
help. These label the chromosome around the gene
and track the segment of chromosome through individuals in several generations of the family. Linkage
studies are not 100% accurate because of genetic
recombination, which can cause the markers to switch
chromosomes. Biochemical tests, not directly related
to the trait's gene product, can also provide conditional risk information. Examples include serum creatine kinase in DMD and fibroblast very long chain
fatty acids in adrenoleukodystrophy. Carrier/noncarrier risk ratios are available for various values of these
substances.
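A Bayes calculation of this kind can be sketched in a few lines. The example assumes that each piece of conditional information is independent of the others given carrier status, and every number used (the prior, the two unaffected sons, and the 1:3 carrier/noncarrier test ratio) is hypothetical and chosen only to show the arithmetic.

```python
from fractions import Fraction


def carrier_posterior(prior, likelihoods_if_carrier, likelihoods_if_noncarrier):
    """Combine a prior carrier risk with independent conditional evidence.

    Each pair of likelihoods gives the probability of one observation
    if the woman is a carrier and if she is not.
    """
    joint_carrier = prior
    joint_noncarrier = 1 - prior
    for if_carrier, if_noncarrier in zip(likelihoods_if_carrier,
                                         likelihoods_if_noncarrier):
        joint_carrier *= if_carrier
        joint_noncarrier *= if_noncarrier
    return joint_carrier / (joint_carrier + joint_noncarrier)


prior = Fraction(1, 2)  # e.g., the daughter of an obligate carrier

# Pedigree information: two unaffected sons
# (probability 1/2 each if she is a carrier, 1 each if she is not).
sons_only = carrier_posterior(
    prior, [Fraction(1, 2)] * 2, [Fraction(1)] * 2)

# Adding a biochemical result with an assumed carrier:noncarrier ratio of 1:3.
with_test = carrier_posterior(
    prior,
    [Fraction(1, 2), Fraction(1, 2), Fraction(1, 4)],
    [Fraction(1), Fraction(1), Fraction(3, 4)])

print(sons_only, with_test)  # 1/5 1/13
```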
Nonrandom X-Inactivation
University of Glasgow, Department of Medical Genetics, Encyclopaedia of Genetics pages contain a number of illustrations
and animated diagrams to accompany this article: http://www.gla.ac.uk/medicalgenetics/encyclopedia.htm
Feral
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0449
Homozygosity
X-Autosome Translocation
Fertility, Mutations
See: Infertility
Fertilization
J C Harper
chromosomes are halved by meiosis. Oogenesis results
in a large, complex oocyte that contains the proteins,
enzymes, and other factors necessary for the first
days of development. Spermatogenesis has to ensure
that the sperm is able to travel through the female
reproductive tract to meet and fertilize the oocyte.
Oogenesis is a complex process that starts during
fetal life. At birth, females are born with primary
oocytes arrested in meiosis I (diplotene stage) and no
further oocytes will be produced. Each month only
one oocyte will fully mature and just before ovulation,
meiosis is resumed, the first polar body is extruded
(containing one set of oocyte chromosomes), and the
oocyte arrests in metaphase II. Meiosis is only complete upon fertilization. Maturation (M-phase) promoting factor (MPF) causes resumption of meiosis
and is regulated by the c-mos gene. MPF is high during
meiosis I and II.
Sperm are produced by the testis and become
mature and motile as they travel through the epididymis. Sperm released on ejaculation have to travel
through the cervical os, the uterus, and the fallopian
tubes where hopefully an oocyte is waiting for fertilization. The mature sperm consists of a head piece,
neck, and flagellum. The flagellum is responsible for
initiation and maintenance of motility through the
female reproductive tract. The head of the sperm contains the sperm DNA and is the area involved in recognition of the zona pellucida (the glycoprotein coat that
surrounds the oocyte) and sperm-oocyte fusion. The
head is covered by a membrane-bound vesicle called
the acrosome. Before a sperm can fertilize an oocyte it
needs to go through capacitation and the acrosome
reaction. Capacitation occurs in the female reproductive tract and involves a range of poorly understood
processes that do not alter the ultrastructure of the
sperm, but enable the sperm to fertilize the oocyte.
At ovulation the oocyte is surrounded by a dense
array of cumulus cells that play a vital role in oogenesis. The sperm swim through the cumulus cells until
they reach the zona pellucida. It is thought that there
are two stages to sperm binding to the human oocyte.
The first involves the binding of the acrosome intact
sperm to the primary binding site on the oocyte,
termed ZP3. The acrosome matrix consists of a number of enzymes, the most important of which is acrosin, a serine protease that is packaged as the inactive
proacrosin. The acrosome reaction exposes the
second binding site on the sperm head that binds to
the secondary binding site, ZP2, on the oocyte. The
sperm is then able to penetrate the oocyte and two
reactions are stimulated: the cortical reaction and
oocyte activation.
The cortical reaction may be involved in blocking additional sperm penetrating the oocyte. In vitro
IVF
In vitro fertilization (IVF) is a technique developed
for the treatment of some forms of infertility. The first
successful birth was that of Louise Brown in 1978.
IVF can be used for the treatment of several types of
male and female infertility. In the female, one of the
most common causes of infertility is tubal blockage,
but IVF can also be used in the treatment of ovulatory disorders, including polycystic ovarian disease
(PCO), immunological disorders such as antisperm
antibodies, endometriosis, coital problems, and ``unexplained'' infertility where no etiological factor has been
identified.
Fertilization, Mammalian
P M Wassarman
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0452
as the process of union of two germ cells, egg and
sperm, whereby the somatic chromosome number
is restored and the development of a new individual
exhibiting characteristics of the species is initiated.
Both mammalian eggs and sperm are designed to
ensure that fertilization takes place reliably. Accordingly, mechanisms are in place to support species-specific interactions between gametes and to prevent
fusion of eggs with more than one sperm (polyspermy).
Egg Development
Oogenesis begins during fetal development when
primordial germ cells are transformed first to oogonia
(mitotic) and then to oocytes (meiotic). The pool of
small, nongrowing oocytes, present at birth, is the
sole source of unfertilized eggs in the sexually mature
female mouse. These oocytes are arrested at the diplotene (dictyate) stage of the first meiotic prophase.
Each oocyte (~15 μm in diameter) is contained within
a cellular follicle that grows concomitantly with the
oocyte for about 2 weeks, from a single layer of a few
epithelial-like cells to three layers of cuboidal granulosa cells by the time the oocyte has completed its
growth (~80 μm in diameter). Over several days,
while the oocyte remains the same size, follicular
cells undergo rapid division, increasing to more than
5 × 10^4 cells in the Graafian follicle. The follicle exhibits a fluid-filled cavity, or antrum, when it consists
of ~6 × 10^3 cells and, as the antrum expands, the oocyte
takes up an acentric position surrounded by two or
more layers of granulosa cells (cumulus cells).
Fully grown oocytes in Graafian follicles complete
the first meiotic reductive division, called meiotic
maturation, just prior to ovulation in response to
a surge in the level of luteinizing hormone (LH).
Oocytes progress to metaphase II of the second meiotic
division, with separation of homologous chromosomes and emission of a first polar body, and become
unfertilized eggs. Oocytes must complete meiotic
maturation in order to be capable of being fertilized
by sperm. The ovulated egg completes meiosis, with
separation of chromatids and emission of a second
polar body (i.e., becomes haploid, 1n), only upon
fertilization by sperm (sperm chromosomes restore a
diploid, 2n, state to the zygote).
Sperm Development
In mice, it takes ~35 days for each spermatogonial
stem cell, already present in the fetus, to progress
through meiosis as a spermatocyte, become four
haploid spermatids (1n), and to be transformed into
spermatozoa. Spermatogenesis takes place within the
seminiferous epithelium lining the tubules of the testes
[Figure 1 diagram not reproduced. Labeled steps: binding, initiation and continuation of the acrosome reaction, penetration, fusion, cortical reaction, zona reaction. Labeled structures: zona pellucida, perivitelline space, plasma membrane, cortical granule, excluded sperm, acrosome-reacted sperm; sperm head (acrosome, nucleus, plasma membrane, acrosomal contents, inner and outer acrosomal membranes), neck, midpiece, tail.]
Figure 1 The mammalian fertilization pathway includes a series of steps taken in a compulsory order. Acrosome-intact sperm bind to sperm receptors in the zona pellucida (ZP) by using egg-binding proteins on the sperm head
plasma membrane. Sperm then undergo the acrosome reaction (cellular exocytosis), penetrate through the ZP, and
reach the perivitelline space between the ZP and plasma membrane. A single sperm then binds to and fuses with egg
plasma membrane. Fusion of sperm and egg triggers the cortical reaction that, in turn, triggers the zona reaction. The
zona reaction alters the properties of the ZP making it a barrier to other sperm and preventing polyspermic
fertilization. (Adapted with permission from Wassarman, 1988.)
Some of the egg and sperm molecules that support
each step in this pathway to fertilization have been
identified and characterized. A description of these
molecules and the manner in which they participate
in mammalian fertilization follows below.
All mammalian eggs are surrounded by a thick extracellular coat, called the zona pellucida (ZP). Consequently, sperm must first bind to and then penetrate
Figure 2 Light photomicrograph of mouse sperm bound to the ZP of an unfertilized mouse egg in vitro.
Acrosome Reaction
Final Considerations
Reproduction of the species is a fundamental property
of all living things. Fertilization activates the mammalian egg to initiate a complex program of development, transforming a single cell into a multicellular
organism. Accordingly, development of eggs and
sperm and the interactions between gametes that culminate in fertilization are highly regulated. Some of
the egg and sperm molecules that participate in the
fertilization pathway have been identified and their
mechanisms investigated. This relatively new information has already contributed to our ability to control reproduction and will continue to have an impact
on medical aspects of human reproduction for years
to come.
Filamentous
Bacteriophages
E Kutter
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0455
DNA pol I and ligase are needed to close the complementary strand. The DNA is then supercoiled by
gyrase to become replicative form I (RFI), the template for transcription. Further replication is rather
complex. The phage pII has to nick a specific site, the
viral-strand origin, and the 3′-OH end thus formed
acts as the primer for rolling-circle replication carried
out by pol III and the host SSB and rep helicase. After
one round of replication, pII cleaves and circularizes
the single-stranded viral-strand `tail,' which can initiate formation of a new RFI, while the RFI containing
the new viral DNA strand is resealed and supercoiled
to again act as a substrate for pII, as well as for
transcription.
Once sufficient gpV accumulates in the cell, it binds
cooperatively to some of the ssDNA molecules;
pX somehow helps regulate the balance between progeny DNA and RF formation. Three additional,
phage-encoded proteins aid in viral assembly. Outer-membrane protein pIV (405 AA), `secretin,' is synthesized with a 21-AA signal sequence. Gene I encodes a
348-AA inner-membrane protein with its N-terminal
253 residues in the cytoplasm. An internal translational start produces pI* (108 AA), still containing
the membrane and periplasmic domains of pI. As
shown genetically, the cytoplasmic part of pI interacts
with thioredoxin and the packaging signal during
assembly, while the outer portion appears to interact
with pIV in the outer membrane to form the extrusion
passage. The pV helps form the DNA into a linear
antiparallel structure facilitating assembly; about 1500
molecules of pV are required per phage for this process.
Assembly of the virus particle can then be initiated by
an interaction of the packaging signal with the cytoplasmic domain of pI, the membrane-associated pVII
and pIX. During elongation and extrusion, pVIII from
Figure 1 The f1 filamentous bacteriophage. Top: Electron micrograph of a negatively stained particle, with the pIII-VI end located on the left side. Bottom: Schematic representation of the positions of the structural proteins and DNA
in the bacteriophage. (Adapted from Webster RE and Lopez J (1985) In: Casjens S (ed.) Virus Structure and Assembly,
p. 235. Boston, MA: Jones & Bartlett.)
Filter Hybridization
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1842
Filter hybridization is a technique for in situ solid-phase hybridization whereby denatured DNA is
immobilized on a nitrocellulose filter and incubated
with a solution of radioactively labeled RNA or
DNA.
See also: In situ Hybridization
Fingerprinting
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0458
Filial Generations
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0456
[Figure 1 diagram not reproduced: panels (A) and (B), each showing the positions of alleles A and a at diplotene/metaphase I, anaphase I, and anaphase II.]
Figure 1 First division (A) or second division (B) segregation of alleles A and a depending on whether or not a
crossover occurs between the A/a locus and the centromere (vertical bar in the left panel), the point at which sister
chromatids remain connected until the second division of meiosis.
When two homologous chromosomes are distinguished by a genetic marker, say with allele A on one
chromosome and allele a on the other, the Aa difference will segregate at the first division provided that
the alleles remain attached to their original centromeres. However, when a single crossover, which
always involves just one chromatid of each chromosome, occurs between the A/a locus and the centromere, different alleles become joined to the same
centromere, and the anaphase separation at first anaphase will not be between A-A and a-a but rather
between A-a and A-a (Figure 1). Then segregation of
A from a will be delayed to the second division.
The effect of two crossovers in the locus-centromere
interval depends on whether the same chromatids are
involved in the second crossover as in the first. If the
same two cross over twice (a two-strand double), or if
the second crossover involves the two chromatids not
involved in the first crossover (a four-strand double),
the effect in either case is to restore first division
segregation. If one chromatid crosses over twice, two
cross over once each, and the other not at all (three-strand double), the effect is second division segregation. So double crossovers give, on average, 50%
second division segregation. An indefinitely large
number of crossovers will give, on average, two-thirds
second division segregation, a result most easily
understood by imagining the four alleles as totally
uncoupled from their centromeres and distributed
two-and-two to the first division spindle poles at
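The figures quoted above for one crossover, for double crossovers, and for the limiting value of two-thirds can be reproduced with a short recursion. This is a sketch under two stated assumptions: every crossover lies between the locus and the centromere, and each crossover picks one chromatid of each homolog at random (no chromatid interference). The function name is illustrative.

```python
from fractions import Fraction


def second_division_fraction(n_crossovers: int) -> Fraction:
    """Expected fraction of second division segregation after n crossovers."""
    p = Fraction(0)  # with no crossover, segregation is always first division
    for _ in range(n_crossovers):
        # A first-division configuration always becomes second-division.
        # A second-division configuration stays that way half of the time
        # (when the exchanging chromatids carry the same allele) and reverts
        # to first-division otherwise.
        p = (1 - p) + p * Fraction(1, 2)
    return p


for n in (1, 2, 3, 4):
    print(n, second_division_fraction(n))
```

The values are 1, 1/2, 3/4, and 5/8 for one to four crossovers, and they oscillate toward 2/3 as the number of crossovers grows.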
[Figure not reproduced: bivalents at first metaphase (actual appearance and interpretation) for a chromosome with one arm shortened by deletion, showing segregation of the length difference at the 1st division (anaphase I) or at the 2nd division (anaphase II).]
FISH (Fluorescent in situ Hybridization)
Fluorescent in situ hybridization (FISH) is a technique used to identify the chromosomal location of a
particular DNA sequence. A DNA probe is fluorescently labeled and hybridized to denatured metaphase
chromosomes spread out on glass slides.
See also: Physical Mapping
Fisher, R.A.
A W F Edwards
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0459
Further Reading
Box JF (1978) R.A. Fisher: The Life of a Scientist. New York: Wiley.
Edwards AWF (1990) R.A. Fisher: Twice professor of genetics:
London and Cambridge or ``A fairly well-known geneticist''.
Biometrics 46: 897–904.
Fisher RA (1971–74) Collected Papers of R.A. Fisher, vols. 1–5, ed.
J.H. Bennett. Adelaide, SA: University of Adelaide Press.
Fitness
C B Krimbas
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0460
Definition
Fitness is a concept that is often considered to be
central to population genetics, demography, and the
synthetic theory of evolution. In population genetics,
it is technically a relative or absolute measure of reproductive efficiency or reproductive success. The
absolute or Darwinian fitness of a certain genetic constitution living in a defined homogeneous environment would be equated with the mean number (or
the expected number) of zygotes sufficiently similar
to that produced during its entire lifetime, whereas
relative fitness would be the measure of the reproductive efficiency of a certain genotype, as defined above,
compared with that of another from the same population.
The term `sufficiently similar' needs explanation: it
does not mean that the offspring of a certain genotype
would necessarily share the same genotype with their
parent (actually, in many instances, Mendelian segregation would prohibit this); rather, it indicates that
genetic effects seriously affecting fitness with one generation delay should be taken into consideration and
should affect the value of fitness of the parent genotype exclusively responsible for these delayed effects.
Thus, the grandchildless mutants in Drosophila subobscura and D. melanogaster have such an effect:
female homozygotes for the mutant allele produce
sterile offspring (regardless of the genotype of the
male parent or that of the offspring). The reason for
this is that in their fertilized eggs, the posterior polar
cells are not formed. The mean number of offspring
may not be sufficient to define the fitness of a genotype: the distribution of the number of its offspring
may also be of importance. Thus, Gillespie has shown
that genotypes having the same mean number of progeny, but differing in variances, have a different evolutionary fate. Everything else being equal, an increase
in variance of the number of progeny (from generation
to generation, or spatially, or developmentally) is, in
the long run, disadvantageous. From the actual mean
expected number of progeny we should subtract a
quantity equal to $\tfrac{1}{2}\sigma^2$ for the case of temporal variation
(where $\sigma^2$ is the variance in offspring number) and $\tfrac{1}{N}\sigma^2$
for the case of developmental variation (where $N$ is the effective population size) to arrive at an
estimate of fitness. However, with the exception explained above, concerning the one-generation delayed
effects of a certain genotype affecting reproduction,
we will restrict the definition of fitness to only one generation, thus
avoiding the temporal variation. (The long-term evolutionary fate addressed by Thoday and Cooper will
not be considered here.) Furthermore, we will restrict
the definition of fitness to a certain homogeneous
selective environment, thus avoiding complications
such as those described by Brandon (1990), where a
genotype having two different fitnesses in two environments, both lower than the respective two fitnesses of
a different genotype, may end up having a higher mean
fitness due to an unequal distribution of individuals of
the two respective genotypes in these two environments. The reason for these restrictions is that fitness
values serve to put some flesh onto the models describing allelic frequency changes from generation to
generation, thus allowing short-term genetic predictions. Fitness is a useful device for quantifying the
kinetics of a genetic change; it is otherwise devoid of
any other independent meaning and cannot serve as a
substitute for the nebulous concept of adaptation. Of
course in some models, one may consider complex
fitness functions, e.g., the weighted mean fitness in
two environments.
Medawar (in Krimbas, 1984) expresses this view in
the following statement:
The genetical usage of `fitness' is an extreme attenuation of
the ordinary usage: it is, in effect, a system of pricing the
endowments of organisms in the currency of offspring; i.e. in
terms of net reproductive performance. It is a genetic valuation of goods, not a statement about their nature or quality.
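As a rough numerical illustration of the variance correction attributed to Gillespie in the Definition above, the following sketch subtracts half the variance (temporal variation) or the variance divided by N (developmental variation) from the mean offspring number. The function name, the argument names, and the example values are hypothetical and serve only to show the arithmetic; this is not a reproduction of Gillespie's own derivation.

```python
def variance_adjusted_fitness(mean_offspring, variance, mode, population_size=None):
    """Discount the mean offspring number by a variance term.

    mode="temporal":      mean - variance / 2
    mode="developmental": mean - variance / N  (N = population_size)
    """
    if mode == "temporal":
        return mean_offspring - variance / 2.0
    if mode == "developmental":
        if population_size is None or population_size <= 0:
            raise ValueError("population_size must be a positive number")
        return mean_offspring - variance / population_size
    raise ValueError("mode must be 'temporal' or 'developmental'")


# Two hypothetical genotypes with the same mean but different variances.
low_variance = variance_adjusted_fitness(2.0, 0.2, "temporal")
high_variance = variance_adjusted_fitness(2.0, 1.0, "temporal")

# The same high variance matters far less when it is developmental
# and the population is large (illustrative N = 100).
developmental = variance_adjusted_fitness(2.0, 1.0, "developmental", population_size=100)
```

With equal means, the genotype with the larger temporal variance receives the lower adjusted value, which is the qualitative point made in the text.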
Historical Overview
The first use of `fitness' with a loosely similar meaning is found in Darwin's On the Origin of Species.
From the first to the sixth edition, Darwin employed
the verb `fit' and the adjective `fitted' as synonyms
for `adapt' and `adapted,' respectively. The noun first
appears in 1859:
Nor ought we to marvel if all the contrivances in nature be
not, as far as we can judge, absolutely perfect; and if some of
them be abhorrent to our idea of fitness.
(Paul, 1992)
to `adaptation.' Even today, Brandon (1990) equates
fitness with adaptedness.
In 1798, Malthus compared the rates of increase of
population size with the amount of food produced.
According to Tort (1996), the ratio $\lambda$ of the number of
individuals of one generation ($N_{t+1}$) to that of its
parental generation ($N_t$) is the Darwinian fitness or
the Malthusian fitness. No differences among individuals are considered in this formulation, which
describes a geometric or rather an exponential increase
of population size, if $\lambda$ is constant. In 1838, Verhulst
gave another formulation, taking into consideration
the change in ratio as the population reaches its
carrying capacity, $K$. Thus Verhulst distinguishes $w$,
the biological fitness (the Verhulstian fitness is the
number of offspring produced by an individual at its
sexual maturity) and population fitness, which varies
also according to $K$ and to the present population
size, $N_t$:
$$N_{t+1} = wN_t\left(1 - \frac{N_t}{K}\right)$$
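The contrast between the Malthusian and Verhulstian descriptions can be sketched numerically. The example below assumes the common discrete logistic form $N_{t+1} = wN_t(1 - N_t/K)$, which may differ in detail from Verhulst's original formulation; the function name and the parameter values are hypothetical.

```python
def project_population(n0, w, generations, carrying_capacity=None):
    """Project population size over discrete generations.

    Without a carrying capacity the growth is geometric: N_{t+1} = w * N_t.
    With a carrying capacity K the assumed discrete logistic form is used:
    N_{t+1} = w * N_t * (1 - N_t / K).
    """
    sizes = [float(n0)]
    for _ in range(generations):
        n_t = sizes[-1]
        if carrying_capacity is None:
            n_next = w * n_t
        else:
            n_next = w * n_t * (1.0 - n_t / carrying_capacity)
        sizes.append(max(n_next, 0.0))  # population size cannot be negative
    return sizes


# Geometric increase with a constant ratio of 2 per generation.
geometric = project_population(10, 2.0, 3)

# Density-dependent growth with an illustrative K of 1000.
logistic = project_population(10, 2.0, 50, carrying_capacity=1000)
```

In this particular discrete form the population settles at $K(1 - 1/w)$ rather than at $K$ itself, so the realized ratio $N_{t+1}/N_t$ falls from nearly $w$ toward 1 as the population grows.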
Hamilton's concept of `inclusive fitness' was formulated to provide a Darwinian explanation for altruistic actions that may endanger the life of the
individual performing such acts. An individual may
multiply its genes in two different ways: directly by its
progeny, and indirectly by protecting the life of other
individuals of a similar genetic constitution. If
the danger encountered is outweighed by the gain
(all calculated in genes) then the performance of such
acts may be fixed by natural selection. Estimations of
inclusive fitness do not take into account only the
individual's fitness but also that of its relatives (of
similar genetic constitution): it is the sum total of
two selective processes, individual selection and kin
selection. In this case, in fact, the counting tends to
change from the number of individuals in the progeny
to the number of genes preserved by altruistic acts
in addition to those transmitted directly through its
progeny.
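The weighing of danger against gain described above is usually formalized as Hamilton's rule, rb > c, where r is the genetic relatedness between actor and recipient, b the benefit to the recipient, and c the cost to the actor. The rule is not spelled out in the text, so the sketch below is only an illustration; the function names and the numerical values are hypothetical.

```python
def inclusive_fitness_effect(relatedness, benefit, cost):
    """Net effect of an altruistic act, counted in genes: r * b - c."""
    return relatedness * benefit - cost


def altruism_favored(relatedness, benefit, cost):
    """Hamilton's rule: the act can be favored by selection when r * b > c."""
    return inclusive_fitness_effect(relatedness, benefit, cost) > 0


# Helping a full sibling (r = 1/2): 0.5 * 3 - 1 is positive.
sibling_case = altruism_favored(0.5, 3.0, 1.0)

# The same act directed at a first cousin (r = 1/8): 0.125 * 3 - 1 is negative.
cousin_case = altruism_favored(0.125, 3.0, 1.0)
```

The same act is therefore favored toward a close relative and disfavored toward a distant one, which is the sense in which kin selection is added to individual selection.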
adaptation, it is argued, is in every case the optimal
solution to an environmental problem. The difficulties
with such an approach are twofold. First, we are often
unable to define precisely the problem that the organism faces (it might be a composite problem) in order to
determine in advance the optimal solution and, as a
result, we tend to adapt the `solution' encountered to
the nature of the problem the organism faces. Second,
it is evident that several selection products are not
necessarily the optimal solutions, the evolutionary
change resembling more a process of tinkering rather
than an application of an engineering design.
Recently, several authors (Brandon, Mills and
Beatty, Burian, and Sober; see Brandon, 1990) have
supported the propensity interpretation of fitness (or
adaptedness). In so doing they try first to disentangle
`individual fitness' (something we are not considering
here; as mentioned earlier we have taken into consideration only the fitnesses of a certain category or
group of individuals) from the fitness that is expected
from its genetic constitution. Indeed, all kinds of accidents may drastically modify the number of progeny
one individual leaves behind. A sudden death may
zero an individual's contribution to the next generation. But selection is a systematic process in the sense
that in similar situations similar outcomes are expected. Thus, in order to pass from the individual or
actual fitness to the expected one, these authors are
obliged to consider two different interpretations of
`probability.' The first interpretation considers probability as the limit of a relative frequency of an event in
an infinite series of trials, but since this series is never
achieved, the observed frequency in a finite series of
trials might be used instead. The second interpretation
is that of propensity, where the very constitution, i.e.,
the physical properties, of the individual underlies the
propensity for performing in a given way. This may be
a dispositional property, i.e., it might be displayed in a
certain way in some situations and in another way in
others. The propensity interpretation of fitness attributes to physical causes, linked to the very structure of
the individual, the tendency to produce a specific
number of offspring in a particular selective environment. This is another way of reifying fitness, and via
fitness relative adaptedness, and finally adaptation. It
is reminiscent of the Aristotelian potentia et actu,
where the propensity is `potentia' and the actual mean
number of offspring corresponds to the `actu.' In some
situations of viability selection this interpretation
seems quite satisfactory (e.g., in mice resistant to warfarin). No one would deny that the selection process
depends most of the time on the properties of a genotype performing in a certain environment. But this
may not be as general as one may think. There are
situations in which the contribution to fitness from
On Population Fitness
It is much more difficult to define population fitness:
population geneticists usually calculate the mean adaptive value or the mean individual fitness in a population. But this exercise is quite futile when comparing
two different populations. A group of adapted organisms is not necessarily an adapted group of organisms.
Demographers earlier equated size (or increase in size)
with population fitness. However, as Lewontin once
remarked, it is not certain that a greater or denser
population is better adapted, since it may suffer from
parasites and epidemics; on the other hand, a population depleted of individuals may suffer collapse and
extinction. I have argued (Krimbas, 1984) that according to the `Red Queen hypothesis' of Van Valen, all
populations (at least of the same species) seem to have,
a priori, the same probability of extinction, and thus
possess, a priori, the same long-term population fitness.
In addition, it is not clear enough how we should
consider a group: a group is not an organism that
survives and reproduces. Although individuals of the
group interact in complex ways and thus provide some
image of cohesion, the `individuality' of the groups
seems most of the time to be quite a loose subject.
Should we consider group extinction per unit of time
to determine group fitness? What about group multiplication? In order to achieve a model in group selection cases, one may resort to different population
selective coefficients, or population adaptive coefficients (something related to the population fitness).
In these cases, the search for the nature of population
fitness becomes even more elusive. As a result, population fitness is a parameter useful exclusively for its
expediency; no search for its hidden nature is justified.
References
Fitness Landscape
J F Crow
A fitness landscape, or adaptive surface, is a geometrical construct in which the fitness or adaptive value
of a genotype is the ordinate and the two abscissae
Figure 1 [Fitness surface over the axes %A and %B, with corners labeled aa bb, aa BB, AA bb, and AA BB.]
metaphor than as a mathematical model. As a result his
papers present different, often confusing concepts.
Sometimes the abscissae are allele frequencies, sometimes they are genotype frequencies, and sometimes
phenotypes. How rugged the fitness surface is has
been a matter of continual discussion since Wright
first introduced his ideas in the 1930s. Wright thought
of the multidimensional surface as quite rugged, with
numerous peaks and valleys. Others, R. A. Fisher in
particular, have suggested that the surface is more like
an ocean with undulating wave patterns. Furthermore,
as the number of dimensions increases, only a small
fraction of the stationary points are maxima. A population is much more likely to be on a ridge than on a
peak. The debate was not settled while Wright was
alive and still continues. Wright summarized his lifetime view of the subject in a paper entitled ``Surfaces of
selective value revisited,'' published in 1988, shortly
before his death (Wright, 1988).
Although Wright, more than anyone else, was
responsible for introducing random processes into
population genetic theory, he never attempted to
model the whole shifting-balance process stochastically. Recently there has been considerable mathematical work in this area, partly as a way of developing
and testing Wright's theory. The entire process has
been treated stochastically, something that was missing in Wright's formulations.
The landscape idea has been extended to concepts
other than fitness, such as developmental morphology
and protein structure. The ruggedness of the landscape
determines whether orderly change is possible or
whether alternatives, such as stasis or chaos, emerge.
In evolution, the lower the peaks and the higher the
valleys, the more likely it is that selection can carry a
population, if not to the highest peak, at least to one
that has a respectable fitness. Similar considerations
apply to the study of morphological development in
the presence of various constraints. The ruggedness of
the landscape can be deduced from parameters, such
as the number of factors involved, and especially the
degree to which they are coupled, or in genetic terms,
the degree of epistasis.
Further Reading
Reference
Fix Genes
J Vanderleyden, A Van Dommelen, and
J Michiels
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1637
The process of biological nitrogen fixation is an extremely energy-demanding process requiring, under
ideal conditions, approximately 16 moles of ATP per
mole of N2 fixed. The property of reducing atmospheric dinitrogen to ammonia is found among a wide
variety of free-living, associative, and strictly symbiotic bacteria. The genetics of nitrogen fixation were
initiated in the free-living diazotroph Klebsiella pneumoniae. This analysis led to the identification of 20 nif
(for nitrogen fixation) genes. The identification of
these nif genes has substantially facilitated the study
of nitrogen fixation in other prokaryotes such as
Sinorhizobium meliloti (formerly Rhizobium meliloti)
and Bradyrhizobium japonicum in identifying genes
that are both structurally and functionally equivalent
to K. pneumoniae nif genes, including nifHDK, nifA,
nifB, nifE, nifN, nifS, nifW, and nifX. In addition, in
these organisms, genes essential for nitrogen fixation
were identified for which no homologs are present in
K. pneumoniae. These were named fix genes and are
often clustered with nif genes or regulated coordinately. Table 1 summarizes the properties and functions of fix genes.
Table 1
Gene or operon: fixABCX, fixD, fixF, fixGHIS, fixLJ, fixK, fixNOQP, fixR, fixT, fixU, fixW, fixY, fixZ
Renamed: nifA, nifN, cytNOQP
Figure 1 Model of the FixL/FixJ regulatory cascade in Sinorhizobium meliloti. Low oxygen conditions stimulate the
autophosphorylation activity of FixL and repress the phosphatase activity of FixL-P. In contrast, the phosphatase
activity of the unphosphorylated FixL as well as phosphoryl transfer from FixL-P to FixJ protein are independent of
the oxygen concentration. Activity of the transcriptional regulator NifA is repressed by oxygen (see text for details).
Genes regulated by NifA and FixK are marked in grey and black respectively. Proteins marked in a black oval are
transcriptional activators. Not all known open reading frames (ORFs) associated with nif and fix genes are shown.
two-component system FixLJ. The target of FixT
is the C-terminal domain of the FixL protein and
the interaction of both proteins leads to inhibition
of FixL-phosphate synthesis and consequently a decrease in nifA and fixK transcription.
Conclusion
Fixation of Alleles
Fixation Probability
T Ohta
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0463
The fixation probability of a mutant allele is the probability that it becomes fixed and substitutes for the
original allele in a population. The probability depends
$$u(p) = \frac{1 - e^{-4N_e s p}}{1 - e^{-4N_e s}}$$

$$u = \frac{1}{2N}$$

$$u = \frac{1 - e^{-2N_e s/N}}{1 - e^{-4N_e s}}$$
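The fixation probability of a mutant allele can be evaluated numerically. The sketch below assumes the standard diffusion-approximation result (Kimura's formula) for a semidominant mutant with selection coefficient s, initial frequency p, effective population size N_e, and actual population size N; the function names and the example values are hypothetical.

```python
import math


def fixation_probability(s, ne, p):
    """u(p) = (1 - exp(-4*Ne*s*p)) / (1 - exp(-4*Ne*s)).

    In the neutral limit (s -> 0) the expression reduces to u(p) = p.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    x = 4.0 * ne * s
    if abs(x) < 1e-12:
        return p  # neutral allele: fixation probability equals its frequency
    # expm1 keeps precision when the exponents are small
    return math.expm1(-x * p) / math.expm1(-x)


def new_mutant_fixation_probability(s, ne, n):
    """Fixation probability of a single new mutant, initial frequency 1/(2N)."""
    return fixation_probability(s, ne, 1.0 / (2.0 * n))


# Illustrative population with Ne = N = 1000.
neutral = new_mutant_fixation_probability(0.0, 1000, 1000)
advantageous = new_mutant_fixation_probability(0.01, 1000, 1000)
deleterious = new_mutant_fixation_probability(-0.01, 1000, 1000)
```

For the advantageous case the value is close to 2s, while the deleterious mutant has a fixation probability far below the neutral value of 1/(2N).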
Further Reading
Flagella
S-I Aizawa
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0464
Figure 1 labels: Filament (FliC); HAP2 (FliD); HAP3 (FlgL); HAP1 (FlgK); Hook (FlgE); Rod2 (FlgG); L ring (FlgH); P ring (FlgI); Rod1 (FlgB, FlgC, FlgF, FliE); MS ring (FliF); Mot complex (MotA, MotB); C ring (FliG, FliM, FliN); C rod (FliO, FliP, FliQ, FliR); signal transduction components MCP, CheA, CheB, CheR, CheW, and CheY.
Figure 1 Schematic drawing of the flagellar system. Stimuli pass through the receptor, MCP, which triggers signal
transduction that sends the signal protein, CheY, to the flagellar base. The name of each substructure of the flagellum is
indicated by arrows. Their component proteins are shown in parentheses.
housekeeping sigma factor $\sigma^{70}$. The sequences
upstream of the flhDC operon are fairly diverse
among bacterial species. In E. coli, several genes and
physiological factors affecting flhDC expression
are known: the heat-shock proteins (DnaK,
DnaJ, and GrpE), the pleiotropic response regulator
(OmpR) activated by acetyl phosphate, and the DNA-binding protein H-NS. More such genes and factors
have been discovered in other species, and their
mechanisms are under investigation.
FLI1 Oncogene
C S Cooper
Further Reading
Flower Development,
Genetics of
G Theissen
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1674
Floral Induction
When flowering plants have reached a critical age,
environmental signals may trigger a switch to floral
development. The shoot apical meristem, a small
group of progenitor cells, ceases production of leaf
primordia and switches to the production of floral
meristems which develop into flowers. Since flowering at the wrong time may seriously hamper reproductive success, the angiosperms have evolved
multiple genetic pathways to regulate the timing of
the floral transition in response to environmental stimuli and developmental cues. Since plants live under
very different environmental conditions and follow
diverse life strategies, the mechanisms controlling the
transition to flowering vary a lot, often even within
single species.
The analysis of natural variants (ecotypes) and of
mutants that flower later or earlier than wild-type has
revealed more than 80 gene loci that affect flowering
time in Arabidopsis. These flowering time genes may
contribute to two different components of the floral
transition: the production of flowering signals and the
competence of the shoot apical meristem to respond
to these signals. The flowering time mutants can be
grouped into different classes defining different pathways of floral induction. Arabidopsis is a facultative
long-day plant which responds to long days (indicating spring and summer) by flowering earlier than when
grown in short days. One class of mutants displays
a reduced response to changes in photoperiod (day
length) when compared to wild-type. The corresponding genes, therefore, may participate in a photoperiod
promotion pathway. A second class of late-flowering
mutants are unaffected in their response to photoperiod. The corresponding genes thus may be
involved in an autonomous promotion pathway.
This pathway monitors the signals of an internal
developmental clock that measures plant age. A third
pathway, termed vernalization promotion pathway,
confers susceptibility to vernalization, i.e., an extensive
Figure 1 A simplified and preliminary depiction of the genetic hierarchy that controls flower development in
Arabidopsis thaliana. Examples for the different types of genes within each hierarchy level are shown. `Gibberellic acid,'
`vernalization,' `autonomous,' and `photoperiod' refer to the different promotion pathways of floral induction.
`Intermediate genes' summarizes a functionally diverse class of genes including `cadastral genes.' MADS-box genes are
shown as squares, non-MADS-box genes as circles, and genes whose sequence has not been reported up to now as
octagons. Some regulatory interactions between the genes are symbolized by arrows (activation), double arrows
(synergistic interaction), or barred lines (inhibition, antagonistic interaction). For clarity, far from all of
the known genes and interactions involved in flower development are shown. In the case of the downstream genes, just
one symbol is shown for every type of floral organ, though whole cascades of many direct target genes and further
downstream genes are probably activated in each organ of the flower. A flower structure is shown in the lower region
of the figure. At the bottom of the figure, the classical `ABC model' of flower organ identity is depicted. According to
this model, flower organ identity is specified by three classes of `floral organ identity genes' providing `homeotic
functions' A, B, and C, which are each active in two adjacent whorls. A alone specifies sepals in whorl 1; the combined
activities of A + B specify petals in whorl 2; B + C specify stamens in whorl 3; and C alone specifies carpels in whorl
4. The activities A and C are mutually antagonistic, as indicated by barred lines: A prevents the activity of C in whorls
1 and 2, and C prevents the activity of A in whorls 3 and 4.
Abbreviations of gene names used: AG, AGAMOUS; AGL, AGAMOUS-LIKE GENE; AP, APETALA; ASK1, ARABIDOPSIS SKP1-LIKE1; CAL, CAULIFLOWER; CO, CONSTANS; FLC, FLOWERING LOCUS C; FRI, FRIGIDA; FUL, FRUITFULL; LD,
LUMINIDEPENDENS; LFY, LEAFY; LUG, LEUNIG; NAP, NAC-LIKE, ACTIVATED BY AP3/PI; PI, PISTILLATA; SEP, SEPALLATA;
SHP, SHATTERPROOF; SOC1, SUPPRESSOR OF OVEREXPRESSION OF CO1; SVP, SHORT VEGETATIVE PHASE; UFO,
UNUSUAL FLORAL ORGANS; TFL1, TERMINAL FLOWER1.
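The combinatorial logic of the classical ABC model described in the caption can be written as a small lookup. The sketch below encodes the wild-type domains of the A, B, and C functions and the mutual antagonism of A and C; the treatment of loss-of-function mutants (the surviving antagonist spreading into all four whorls) is the usual textbook reading of the model and is an assumption here, as are all names in the code.

```python
# Organ identity specified by each combination of homeotic functions.
ORGAN_BY_FUNCTIONS = {
    frozenset("A"): "sepal",
    frozenset("AB"): "petal",
    frozenset("BC"): "stamen",
    frozenset("C"): "carpel",
}

# Whorls in which each function is active in the wild type.
WILD_TYPE_DOMAINS = {"A": {1, 2}, "B": {2, 3}, "C": {3, 4}}


def predict_organs(lost=()):
    """Return the predicted organ type in whorls 1 to 4.

    `lost` names the homeotic functions that are inactive (e.g. "B").
    """
    lost = set(lost)
    domains = {f: set(w) for f, w in WILD_TYPE_DOMAINS.items() if f not in lost}
    # A and C are mutually antagonistic: when one is missing,
    # the other is assumed to be active in all four whorls.
    if "A" in lost and "C" in domains:
        domains["C"] = {1, 2, 3, 4}
    if "C" in lost and "A" in domains:
        domains["A"] = {1, 2, 3, 4}
    organs = []
    for whorl in (1, 2, 3, 4):
        active = frozenset(f for f, whorls in domains.items() if whorl in whorls)
        organs.append(ORGAN_BY_FUNCTIONS.get(active, "undetermined"))
    return organs


wild_type = predict_organs()
b_mutant = predict_organs("B")
a_mutant = predict_organs("A")
c_mutant = predict_organs("C")
```

Under these assumptions a loss of B gives sepals in whorls 1 and 2 and carpels in whorls 3 and 4, whereas a loss of A or C lets the remaining antagonist determine the outer or inner whorls, respectively.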
Future Prospects
In the future, the genes involved in flower development will be studied less and less individually, but
rather more and more as components of complex gene
networks. Since most human food is derived from
flower parts or products, such as fruits and grains,
there will be intensive attempts to apply the knowledge obtained with the model plants (which are higher
eudicots) to commercially important crop plants
(which are predominantly monocots). The goal will
be to design these plants according to our desires with
respect to traits such as time to flowering, and inflorescence, flower, and fruit structure. Comparative studies on genes controlling reproductive development in
a diverse range of phylogenetically informative taxa,
including monocotyledonous and basal angiosperms,
but also nonflowering plants, will provide a better
understanding of flower evolution and the origin of
biodiversity.
Further Reading
Flp Recombinase-Mediated
DNA Inversion
M Jayaram
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0466
cell; nor is it a burden on the cellular metabolic machinery. The plasmid may be regarded as a typical `benign
parasite genome' that has optimized functions for its
stable inheritance and its copy number maintenance.
The 2-micron circle molecules exist in the yeast
nucleus as minichromosomes, and are replicated during the S phase of the cell cycle by the same replication
apparatus that duplicates the chromosomes. Replication is initiated at the origin (ORI; Figure 1), and the
replication forks proceed bidirectionally along the
circular contour of the plasmid genome. Normally,
each plasmid molecule is restricted to one round of
replication per cell cycle. Equal partitioning of the
duplicated circles is achieved by the Rep1 and Rep2
proteins (coded for by the REP1 and REP2 plasmid
genes) acting in concert with the partitioning locus
STB (Figure 1).
The 2-micron circle contains a duplicated sequence,
599 bp long, and arranged in a head-to-head orientation (indicated by the parallel lines in Figure 1). These
inverted repeats divide the plasmid into two unique
regions, represented by the circular arcs in Figure 1.
The Flp site-specific recombinase is the product of the
FLP locus, and acts on the FRT (Flp Recombination
Target) sites located within the inverted repeats. The
result of the recombination reaction is an inversion of
the left unique region with respect to the right unique
region. As a consequence, the plasmid population
within the yeast cell consists of an equilibrium mixture of the two forms A and B, present in roughly
equimolar amounts (Figure 1). The relative flipping of
the DNA by recombination is what gives the recombinase its name Flp (pronounced either as `flip' or as
the letters F-L-P).
Figure 1 In a schematic representation of the 2-micron plasmid, the 599 bp inverted repeat (shown by the parallel
lines) divides the circular genome into two unique regions (circular arcs at the left and at the right). Flp-mediated
recombination at the FRT sites is responsible for interconversion between forms (A) and (B) by DNA inversion. The
products of the REP genes, Rep1 and Rep2 proteins, together with the STB locus are responsible for plasmid
partitioning at cell division. ORI is the plasmid replication origin.
Figure 2 (A) Recombination between two DNA substrates, L1R1 and L2R2, is initiated by strand cleavage and
exchange at one end of the spacer (at the left end in the scheme shown here). The interactions between recombinase
monomers bound to each substrate (the two shaded monomers in substrate 1, and the two unshaded monomers in
substrate 2) are responsible for the first strand cleavage and exchange reaction. The resulting Holliday intermediate is
resolved into the recombinants, L1R2 and L2R1, by cleavage and exchange at the right end of the spacer. During the
resolution step, the catalytic dimers are formed between Flp monomers bound on the left and right arms of partner
substrates. Each `active dimer' is constituted by a darkly shaded and a lightly shaded monomer. Note that, throughout
the reaction pathway, a cyclic peptide connectivity is maintained among the four Flp monomers. The catalytically
active and inactive associations between pairs of Flp monomers are indicated by the solid and dashed arcs,
respectively. The switch in the configuration of the active Flp dimers involves the isomerization of the Holliday
junction from H1 to H2. These junctions have an approximate fourfold symmetry, but are strictly only twofold
symmetric. The circles and split arrowheads indicate the 5′ and 3′ ends, respectively, of DNA strands. (B) The strand
exchange reaction at the initiation and termination steps of recombination involves the formation of a covalent protein-DNA
intermediate in which the 3′-phosphate end of a cleaved strand is linked to the active site tyrosine of Flp.
Figure 4 The 2-micron circle replication is initiated at the replication origin (ORI; located close to one FRT site and
away from the other) and proceeds bidirectionally. A plasmid molecule is restricted to one round of replication during
a normal cell cycle. During the amplification mode by the Futcher model, a Flp-mediated recombination event
(indicated by the DNA crossover) inverts one fork with respect to the other. As the two forks chase each other
around the circular template, multiple tandem copies of the plasmid are made from the single replication event
initiated at ORI. Amplification can be terminated by a second recombination event that now redirects the forks
toward each other. In the example shown, n + 1 copies of the plasmid are made before replication is terminated. A
single plasmid unit in the `amplicon' is indicated by the square brackets, with the arrows representing the 2-micron
circle inverted repeats. After resolution to individual copies, there would be a total of n + 2 plasmid molecules.
negative regulators of the FLP gene expression in
a concentration-dependent manner. However, the
details of this regulatory circuit remain to be resolved.
Site-Specific Recombination in
Evolution: the Means to Many Ends
The circular geometry of the 2-micron plasmid, its
structural organization, and its genetic potential are
all part of the elegant biological design of a successful
selfish DNA element. One central outcome from this
molecular architecture is that a carefully controlled
site-specific recombination event can be exploited to
promote replicative amplification of the genome. It is
not surprising therefore that circular plasmids found
in yeasts that are rather distantly related to Saccharomyces are structurally similar to the 2-micron plasmid
(despite their large diversity in nucleotide sequences),
and harbor their own individual site-specific recombination systems. Furthermore, the observed kinship
among site-specific recombination systems found in
phage, bacteria, and yeasts attests to the axiom that
evolution is adept at reutilizing or retooling the same
basic biochemical strategy to bring about widely varied
end results under distinct biological contexts.
Further Reading
FMS Oncogene
R A Padua
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1571
References
Foldback DNA
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0467
Follicular Lymphoma
P G Isaacson
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1572
Footprinting
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1843
Footprinting is a technique used to identify the binding site of, for example, a protein in a nucleic acid
sequence by virtue of the protection given by the
binding site against nuclease attack.
See also: Nuclease
Ford, Charles
E P Evans
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0471
Ford realized its value in tracking donor cell contributions in the ongoing experiments by the immunologists to `rescue' lethally irradiated mice by bone
marrow injection. The small chromosome, named
T6, was and is still used worldwide as a convenient
cell marker and the early experiments laid the foundations of the basic principles of immunosuppression
and tissue transplantation such as for human bone
marrow replacement.
In over 20 years of involvement with animal cytogenetics, first at Harwell and then at the University of
Oxford, Ford worked with the chromosomes of innumerable species in a variety of situations. Only a few
examples can be cited. The chromosome marker studies
were continued in analyzing cellular contributions in
mouse chimeras `created' by morula fusion or blastocyst injection. The chimeras also produced insights
into the masculinizing effect of the mammalian
Y chromosome in XX:XY combinations, an interest
that was extended into studies of the natural secondary
chimeras found in cattle (freemartins) and marmoset
monkeys. An earlier interest in the Robertsonian
translocation systems discovered in the common
shrew broadened with the discovery of similar systems
in feral mice and their property to induce high levels of
nondisjunction and zygotic imbalance when crossed
to laboratory mice. At the same time, human cytogenetics was not neglected with such studies as meiosis
in XYY males and the chromosomal screening of
cultured blood from athletes competing in the Mexico
City Olympic Games.
Ford was renowned for his inspirational enthusiasm in all branches of cytogenetics and his many
contributions were acknowledged by his election
to a Fellowship of the Royal Society of London in
1965 and in the compilation in 1978 of a special issue
of an international journal in honor of his 65th birthday. The contents, by friends and associates, reflect
many of his interests and the esteem in which he was
held.
References
Forward Mutations
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1844
Fosmid
J Hodgkin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0473
Founder Effect
L Peltonen
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0474
Table 1 Examples of disease mutations demonstrating the founder effect in the Finnish population
Column heading: Defective protein
Disease entries: APECED (240300); Aspartylglucosaminuria (AGU, 208400); Congenital chloride diarrhea (CCH, 214700)
the result of an internal migration after initial settlement in the country. Some 2000 years or 100 generations ago, small immigrant groups inhabited Finland.
Later, small subgroups of this initial population
moved to still more remote regions of Finland and
established small population subisolates. Perhaps
only 20-40 families moved to remote areas 200-300
years ago, and the founder effect and chance (genetic
drift) resulted in the enrichment of some disease genes
in these subisolates.
The founder effect in one ancestral mutation makes
the mapping and identification of disease genes a
straightforward task. Genome-wide searches for disease genes are based on the identification of a chromosomal region containing genetic markers which
co-segregate with the disease, due to the close vicinity
of the marker and the mutated gene. Families with
multiple affected children are needed to reveal this
co-segregation. In the presence of the founder effect,
mapping strategies based on the analyses of only diseased individuals can be applied. Monitoring of shared
marker alleles among the affected individuals has been
highly successful in the identification of genes and
alleles causing inherited diseases in genetic isolates.
The shared chromosomal regions indicate that the
alleles are identical by descent (IBD), since they
share a common ancestor. In the case of recessive
diseases, this strategy has been called homozygosity
mapping. Typically for disease alleles showing a founder effect, linkage disequilibrium or the nonrandom
association of alleles is seen over a long genetic interval
Major mutation occurrence in Finland (%): 82, 98, 100, 78, 90, 100, 85, 100, 98, 100, 70, 96, 70, 94, 94
Founder Principle
See: Bottleneck Effect
Reference
Fragile X Syndrome
G R Sutherland and R I Richards
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0477
Clinical Features
There are many physical and behavioral features of
fragile X syndrome. Those which occur in more than
50% of males are listed in Table 1.
Table 1 Clinical signs present in more than 50% of fragile X males
Physical signs: long face; prominent ears; high arched palate; hyperextensible fingers; double-jointed thumbs; flat feet; macroorchidism; strabismus; soft smooth skin; mitral valve prolapse; tall as children, short as adults; large heads as children, small as adults
Behavioral signs: hand flapping; hand biting; hyperactivity; perseveration; aggression; shyness; anxiety; poor eye contact; tactile defensiveness
These features are shown by those with a full mutation. Individuals with a premutation are intellectually
and physically normal. The only significant exception
to this is that females with premutations appear to be
prone to premature ovarian failure, which can occur at
the age of 30 years onwards, although most premutation carriers do not have premature menopause.
Treatment
There is no cure for fragile X syndrome. A number
of the behavioral difficulties exhibited by individuals with fragile X
syndrome are amenable to both pharmaceutical and
behavioral treatments. Integrated approaches to treatment will maximize the potential of affected individuals
and minimize the disruption to family life that this condition can produce (Hagerman and Cronister, 1996).
Cytogenetics
The appearance of the fragile X chromosome is shown
in Figure 1. This is most easily seen in chromosomes
prepared from lymphocyte cultures. The lymphocytes
need to be cultured in media which have a relative
deficiency of thymidine or deoxycytidine. This can
be achieved by using special commercially available
media, using medium TC199, or by adding a variety of
inducing agents such as the antifolate aminopterin, the
thymidylate synthetase inhibitor fluorodeoxyuridine,
or high concentrations of thymidine which inhibit
the availability of deoxycytidine (Sutherland, 1991).
Cytogenetic testing for fragile X syndrome has largely
been replaced by DNA testing.
Molecular Genetics
The molecular basis of fragile X syndrome is lack of
FMRP, the protein encoded by the FMR1 gene. Within
the 5′ untranslated region of the FMR1 gene there
is a polymorphic CCG repeat, which on normal
Figure 1 Sex chromosome complements from individuals expressing FRAXA. A female (left) showing the fragile X
and normal X chromosome, and a male (right) showing the fragile X and a normal Y chromosome.
Genetics
The paradoxical nature of the fragile X chromosome
was documented by Sherman et al. (1985). They
showed that normal males could `carry' the condition,
an anomalous situation for an X-linked disease but
now known to be because of the premutations being
clinically harmless. They showed that the mothers and
daughters of normal fragile X carrier males had different risks of having children with fragile X syndrome
(`the Sherman paradox').
It is now recognized that when women transmit the
fragile X mutation it usually increases in size and the
risk of going from a premutation to a full mutation
depends upon the size of the premutation (Fisch et al.,
1995). When a male with a premutation transmits it,
the size of the premutation changes little. When a male
with a full mutation transmits his fragile X chromosome (to a daughter) she always receives it as a
premutation.
It is worth noting that whenever a child is identified
with fragile X syndrome, the mother is always a carrier (either pre- or full mutation) as is one of the
maternal grandparents.
Conclusion
Fragile X syndrome is a common disorder. Its genetics
are reasonably well understood but much remains to
be learned about the molecular pathway from genotype to phenotype. Diagnosis by DNA analysis is very
reliable, and prenatal diagnosis is appropriate and
available to women who are carriers of this disorder.
References
Frameshift Mutation
B S Guttman
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0478
Reference
Freedom, Degrees of
T P Speed
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0479
Frequency-Dependent Fitness
See: Frequency-Dependent Selection
Frequency-Dependent Selection
T Prout
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0482
Frequency-Dependent Selection as Expressed in Rare Male Mating Advantages
L Ehrman
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1062
[Figure: vertical axis, proportion of or males successful in mating; horizontal axis, proportion of or males in competition with pr males.]
Fruit Fly
See: Drosophila melanogaster
Functional Genomics
D Seemungal and D Carr
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1862
Functional genomics is the development and implementation of techniques to examine both in time and
space the global patterns by which genes and their
protein products act in concert to effect function.
The human genome consists of a DNA complement of approximately 3 billion nucleotide pairs in
length, divided into 24 distinct chromosomal segments (autosomes 1–22 plus X and Y). Contained
within this linear array of nucleotides is anything
between 30 000 and 40 000 genes. The protein products of genes interact in complex pathways to effect
cellular function. The combination of genes which is
expressed in a cell at any particular point in time
determines the protein complement of the cell and
hence its functional mechanics.
During the process of development, subsets of cells
in the body contain different subpopulations of activated genes and, as a result of the different combinations of proteins that result, differentiate to form
distinct tissues with their own specialized physiology.
Proteins, as mobile functional elements, allow communication between cells, both locally and remotely,
and are hence responsible for the integration of various cell types into the coordinated, highly complex
physiological systems that comprise a living organism.
Differences in the genetic complement between
individuals can cause differential expression of genes
or the production of proteins which function in
slightly different ways. In extreme cases, some individuals possess harmful variants which cause disease
directly, but this variation also underpins disease processes in more subtle ways, influencing an individual's
susceptibility to disease and varying responses to
therapeutic interventions. In addition, intracellular
protein systems allow cells to respond to changes in
their environment. Certain environmental stimuli will
perturb the normal cellular functions of proteins and
cause changes in gene expression. These kinds of
environmental factors can also lead to the pathology
of disease. Often the development of a disease will be
the result of a complex mix of factors including inherent genetic susceptibility and a series of environmental
changes or challenges.
The determination of the full sequence of the
human genome provides a tremendous opportunity
to the international research community to begin to
get the ``full picture'' of the ways in which cells work
and the mechanisms underlying disease. Hidden
Fundamental Theorem of
Natural Selection
W J Ewens
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0483
$$\text{partial increase in mean fitness} = V_A^{(m)}/\bar{w},$$
where $V_A^{(m)}$ is the full multiple-locus additive genetic
variance. This equation, again exact and embodying
no approximations, is the statement of one of the
modern interpretations of the FTNS.
Equation (7) is not achieved by simply summing
both sides of equation (5) over all loci. Despite this, a
summation result of a different form does hold. An
expression parallel to that in (6) is that the multiple-locus additive genetic variance may be written as:
$$V_A^{(m)} = 2\bar{w} \sum_i \sum_j \alpha_{ij}\,\Delta p_{ij}, \qquad (9)$$
the double sum being over all alleles at all loci. The
expression on the right-hand side of (9) derives from
$$\sum_g (\Delta P_g)\, w_g, \qquad (10)$$
where $\Delta P_g$ is the between-generation change in frequency of the multiple-locus genotype $g$ and $w_g$
is the sum of the average effects of all genes at all loci
in the genotype $g$, any average effect being counted
twice in the sum if the corresponding gene
occurs twice in genotype $g$. The expression (10) may
be shown to be identical to the expression
$2\sum_i \sum_j \alpha_{ij}\,\Delta p_{ij}$
arising on the right-hand side of
(9), so that the above interpretation of the FTNS can
be written as
$$\sum_g (\Delta P_g)\, w_g = V_A^{(m)}/\bar{w}. \qquad (11)$$
Fungal Genetics
D Stadler
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0484
Fungi
D Stadler
Copyright 2001 Academic Press
doi: 006/rwgn.2001.0485
FUS-CHOP Fusion
See: Myxoid Liposarcoma and FUS/TLS-CHOP Fusion Genes
Fusion Gene
P Riggs
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0486
History
The first gene fusions created by design were between
the rIIA and rIIB genes of phage T4, studied by
Champe and Benzer. They used the effects of missense, nonsense and frameshift mutations in the rIIA
gene on RIIB activity to elucidate the properties of
the genetic code. Subsequently, fusions were created
in Escherichia coli using in vivo genetic techniques
to join various genes to the lacZ gene, which codes
for the easily assayed enzyme β-galactosidase. These
fusions were used as a way to examine the expression
level and regulation of the gene fused to lacZ. Fusions
were originally limited to genes that were located near
the β-galactosidase gene, but later Casadaban and
coworkers pioneered in vivo and in vitro techniques
that allowed fusion to virtually any gene.
Current Uses
The major current use of gene fusions is still the study
of gene expression, including levels of expression and
location of gene products. Both gene fusions and
reporter constructs (where the gene of interest is
replaced by a `reporter' gene instead of being fused
to it) are used for this purpose. Fusions to lacZ are
common, but any gene whose product is active as a
fusion and can be assayed is suitable for this purpose.
In this method, an extract of a cell or tissue containing
a gene fusion is prepared and the level of gene expression is measured by assaying the fusion. Gene fusions
can also be used to study the differential expression
of a gene in different tissues of an organism, by
histochemical staining for the product of the fused gene in sections,
tissues, or the whole organism. Two genes commonly
used for this technique are the lacZ and gfp genes. The
lacZ gene has been used primarily because of the
vast experience researchers have with β-galactosidase
fusions, and the many substrates available for this
enzyme. One of these substrates, X-gal, produces
a dark-blue insoluble product when cleaved by β-galactosidase. Thus, the blue color does not diffuse
away from the site of cleavage, and one can infer the
location and level of expression from the intensity of
the blue color. The gfp gene codes for green fluorescent protein, which fluoresces green when excited by
blue or UV light. This allows visualization, and in
many cases can be used on intact, live organisms.
Further Reading
Fusion Proteins
P Riggs
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0487
History
Some of the first fusion proteins were created in
Escherichia coli using in vivo genetic techniques to
join various proteins to the β-galactosidase enzyme.
These fusions were used initially as a way to assay the
expression level of the protein of interest. Fusions
were originally limited to proteins whose genes
were located near the β-galactosidase gene, but later,
Casadaban and coworkers pioneered in vivo and in
vitro techniques that allowed fusion to virtually any
protein. Researchers were originally surprised that
some of the fusions were bifunctional, i.e., when the
C-terminus of a protein was fused to the amino terminus of β-galactosidase, both the proteins retained
activity. As more and more fusions to β-galactosidase
were obtained and found to have activity, researchers
began to make fusions to other proteins besides
β-galactosidase and found that they could be bifunctional as well.
G
G1
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1846
G2
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1847
Galactosemia
S Segal
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0488
Galton, Francis
R Olby
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0490
Human Heredity
Conceptual
Observational
7 Grandfathers
Great-uncles
4 Uncles
26 FATHERS
36 SONS
4 Nephews
9 Grandsons
2 Great-nephews
1 Great-Grandsons
Twin Studies
Regression
Figure 1 Graph of the distribution of chest circumferences of 5732 Scottish militia men (horizontal axis: chest circumference in inches; vertical axis: number of soldiers). The figure
approximates to the bell-shaped curve of a `normal
distribution.' (Redrawn from data in Quetelet (1817)
Edinburgh Medical and Surgical Journal: 260–264.)
species. Now he could understand why variability
does not change the median or `center of gravity' of
the population.
Having established in a rough manner evidence for
this regression in plants, Galton cast around for data
on human characteristics, but in vain. However, when
he advertised prizes of £500 for those who best filled
in the elaborate set of questions that he prepared
concerning them, their grandparents, parents, sisters,
brothers, children, and other relatives, he was rewarded with a good response. These family records
included stature of family members, so he was able to
plot the statures of parents against offspring from
which he calculated the regression (Figure 2). He
expressed what he called the `coefficient of regression'
as the ratio between the deviation of the offspring and
that of the mid-parents from the population mean.
This is measured on the graph by the distances AB
and AC or EF and EG. Since the data fall approximately on a straight line, the ratio is constant throughout its length, giving a coefficient of regression of
two-thirds. Now he had measured a statistical relation
between two generations.
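Galton's ratio can be illustrated with a short calculation. This is a minimal sketch, not Galton's own procedure: the population mean of 68.25 inches and the two mid-parent heights are assumed values chosen for illustration, and the function names are hypothetical. Only the coefficient of two-thirds comes from the text.

```python
def regression_coefficient(offspring_deviation, midparent_deviation):
    """Galton's ratio: deviation of the offspring from the population mean
    divided by the deviation of the mid-parents from the same mean."""
    return offspring_deviation / midparent_deviation


def expected_offspring_height(midparent_height, population_mean,
                              coefficient=2.0 / 3.0):
    """Predicted mean offspring height for a given mid-parent height,
    assuming the straight-line relation with the stated coefficient."""
    return population_mean + coefficient * (midparent_height - population_mean)


POPULATION_MEAN = 68.25  # inches; assumed value for illustration

# Mid-parents 3 inches above and 3 inches below the assumed mean.
tall_children = expected_offspring_height(71.25, POPULATION_MEAN)
short_children = expected_offspring_height(65.25, POPULATION_MEAN)

print(f"children of tall mid-parents:  {tall_children:.2f} in")
print(f"children of short mid-parents: {short_children:.2f} in")
```

In both cases the children are predicted to lie closer to the population mean than their mid-parents, which is the regression that the coefficient measures.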
Correlation
Figure 2 Graph of the relation between the midparental heights of parents and the mean height of their
children. The diagonal line represents all points on the
graph corresponding to hypothetical parents and their
children where the means of the children's heights are
identical with their parents'. The steeper line is plotted
from Galton's data. The ratios AB:AC or EF:EG give the
coefficient of regression. (Redrawn from Galton (1869)
Hereditary Genius: An Inquiry into its Laws and Consequences, p. 83. London: Macmillan.)
Natural Selection
Stabilizing Selection
Darwin had focused upon the slight individual differences that constitute the variation to be found among
members of a family and of a species. ``They afford
material,'' he said, ``for natural selection to accumulate, in the same manner as man can accumulate in any
given direction individual differences in his domesticated productions.'' But Galton was convinced that
natural selection cannot work effectively against heredity, and he considered that his work was showing
heredity maintaining the racial mean. According to
Galton, the mean is the adapted form and deviations
from it will be less adapted to the conditions of life.
Therefore, natural selection will be aiding heredity in
preserving it. In other words, he granted natural selection its `stabilizing' role, but not its `creative role.' As
the reason for this state of affairs he turned to the
physiology of reproduction. The fertilized egg is composed of hereditary material from two parents, so that
each time a new generation is produced there is a
bringing together of two such materials. Inevitably
the contributions of each parent and each ancestor
will be diluted. So he accepted the long-held tradition
of fractional inheritance, according to which the parents collectively contribute one-half, the grandparents
one-quarter, and the great-grandparents one-eighth to
the hereditary constitution of the offspring. These
ancestral contributions exert their influence and tend
to bring back the progeny of deviants toward the mean
of the ancestral population as a whole.
Discontinuous Variation
Finger Prints
Eugenics
Controlling our Evolution
The driving force behind Galton's extensive and long-continued research was not just his curiosity, great as
that was, but his vision of a future in which mankind
would attain to greater energy and coadaptation. But
he realized how easy it was to follow the wrong
course. To accept the evolutionary process passively
would be to surrender to ``blind and wasteful processes'' in which raw material is produced extravagantly and all that is superfluous is rejected ``through
the blundering steps of trial and error.'' He favored the
alternative that we should take control of our evolution, for it may be that we are the ``only executives on
earth.'' Hence the importance of eugenics in providing
the proper scientific basis for action. To support such
work he settled an endowment on University College
London so that in 1905 a Research Fellow in eugenics
could be appointed. Further expansion led to the creation of the Eugenics Laboratory. In 1911, again through
Galton's munificence, a chair of eugenics was established at the College, the first occupant being Pearson.
Victorian Attitudes
Publicizing Eugenics
Conclusion
Galton was the confident English gentleman, well
aware of the superiority of his nation and his class,
condescending to the former colonies, and dedicated
to turning back the degeneration of his countrymen.
But he disparaged the institution of the aristocracy,
rejected the Christian religion, and considered many
of our behavioral characteristics as outworn relics
from a primitive stage in our social evolution. Although his mathematical skills were limited, his imagination, insight, and inventiveness were remarkable.
Allied to his incessant curiosity, these talents made
him one of the founders of the statistical revolution
that occurred in his lifetime. His book Natural Inheritance (1889) proved an inspiration and a turning point in
the lives of several of those who became important
contributors to the development of biometry, statistics, and evolutionary biology. The imaginative psychological studies that he published in Inquiries into
Human Faculty and its Development (1883) proved
an important influence among psychologists.
Gametes
J R S Fincham
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0492
Gametes are the haploid cells that fuse in the sexual life
cycle to form the diploid zygote. Not all sexual organisms have gametes in the sense of specialized uninucleate cells, but they nevertheless contrive to
bring haploid nuclei together for fusion (karyogamy),
and we can refer to these as gamete nuclei. Not all
gametes and gamete nuclei are differentiated into male
and female, and not all are the immediate products of
meiosis. In the majority of sexually reproducing
organisms other than animals there is usually a haploid
phase of mitotic division interposed between meiosis
and sexual nuclear fusion, and in ferns, mosses, and
liverworts, and most fungi and algae, the haploid
phase is free-living.
This article briefly reviews the variations found in
different groups of sexually reproducing organisms.
Animals
All groups of animals, other than the unicellular forms
(Protozoa), have differentiated female eggs and male
spermatozoa, the former contributing both nucleus
and cytoplasm to the zygote and the latter little more
than the nucleus. Both are the immediate products of
meiosis, but whereas all four spermatozoa formed in a
sperm mother cell (spermatocyte) are potentially
viable, only one nucleus from meiosis in the oocyte
survives in the egg, which is generally released from
the ovary for fertilization as a free cell. The spermatozoa are each propelled by a single flagellum.
In algae we see all the stages in the hypothetical evolution of male and female gametes from the supposed
In the budding yeasts such as Saccharomyces cerevisiae, the haploid products of meiosis are ready to
function as gametes immediately, provided that different mating types come together (as they always do in
strains with mating-type switching), but if restricted
to one mating type they can bud indefinitely as vegetative haploid cells.
within their respective female reproductive structures,
but, strictly speaking do not have male gametes, in the
sense of separate cells, but only gamete nuclei.
In the angiosperms the product of meiosis on the
female side is a megaspore, which undergoes haploid
mitosis to produce eight nuclei, one of which becomes
the nucleus of the egg. On the male side, the pollen
grain (microspore) germinates to give a pollen tube
which, after two mitotic divisions, contains three haploid nuclei. The pollen tube grows down the style of
the flower to the embryo sac, where one of its nuclei
fuses with the egg nucleus while another fuses with
two other embryo sac nuclei to found the (usually)
triploid tissue of the seed endosperm, which has a
nutritive function. The pollen tube nucleus that fertilizes the egg can certainly be called a gamete nucleus,
and the one that contributes to the endosperm, which
has no genetic future, is a gamete nucleus in a more
special sense.
The gymnosperms are different in that the haploid
tissue derived from the megaspore is much more
extensive than in the angiosperms. It includes the
endosperm of the seed, which here is purely maternal,
and, embedded within it, the archegonia that contain
the eggs. The pollen grains make no genetic contribution to the endosperm and, of the few haploid nuclei
(usually four) in the pollen tube, only the one that
fuses with the egg can be called a gamete nucleus.
only so long as it takes them to find another monokaryon of compatible mating type with which they
can fuse to form a dikaryon. The dikaryon produces
the mushrooms, which bear the basidia, specialized
cells within which fusion of the mutually compatible
nuclei finally takes place, with meiosis following
immediately. There are no gametes in the life cycle,
but the nuclei fusing within the basidium might be
called gamete nuclei, though they are not usually so
termed.
In the filamentous Ascomycetes (which include
such genetically important genera as Aspergillus,
Neurospora, Sordaria, Podospora, and Ascobolus)
dikaryon formation follows the fertilization of female
structures (ascogonia), which have receptive filamentous outgrowths called trichogynes. The male fertilizing elements come in various forms: as specialized
fertilizing spores (microconidia), as conidia of the
same kind as propagate the fungus vegetatively, or as
vegetative hyphal tips. Following fusion with a trichogyne, a male nucleus migrates into the ascogonium to
establish a dikaryon in partnership with the ascogonial
nucleus. The dikaryon proliferates briefly within the
developing fruit body (ascogenous hyphae) but soon
forms ascus initials within each of which a pair of
nuclei, the descendants of the original pair, undergo
fusion. Meiosis in the ascus follows immediately, with
the formation of haploid ascospores. In this system the
term gamete, if used at all, should be reserved for the
nuclei within the dikaryon which finally fuse, rather
than the cells which initiate the dikaryon.
Gametes, Mammalian
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0491
Gamete is the general term used to describe the reproductive cells of animals or plants. Thus, in animals,
sperm and eggs are both considered gametes.
See also: Meiosis
Gametic Disequilibrium
G Thomson
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0493
$$f(AB) = p_A p_B + D, \qquad f(Ab) = p_A p_b - D,$$
$$f(aB) = p_a p_B - D, \qquad f(ab) = p_a p_b + D,$$
where $D = f(AB) - p_A p_B = f(AB)\,f(ab) - f(Ab)\,f(aB)$. If $D = 0$ (a state referred to as gametic or linkage
equilibrium) then the alleles at the two loci are randomly associated. If $D > 0$ the allele A occurs more
often with allele B than expected by chance (and hence
a with b), while if $D < 0$ alleles A and b (and hence a
and B) are preferentially associated. The possible value
of $D$ in a two-locus system is constrained by the fact
that the haplotype frequencies must be $\geq 0$. The
normalized gametic disequilibrium $D' = D/D_{\max}$ is
often considered, where $D_{\max}$ is equal to either the
lesser of $p_A p_b$ or $p_a p_B$ if $D$ is positive, and the lesser of
$p_A p_B$ or $p_a p_b$ if $D$ is negative. The advantage of this
measure over $D$ is that it has a range from $-1$ to $1$,
regardless of the allele frequencies. The correlation
coefficient $r = D/(p_A p_a p_B p_b)^{1/2}$, with a range from
$-1$ to $1$, is also often used, as well as $r^2$ with a range
from 0 to 1.
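These measures can all be computed directly from the four haplotype frequencies. The sketch below is a minimal illustration using the standard two-locus definitions, with D taken as f(AB)f(ab) minus f(Ab)f(aB), the normalized value as D divided by its maximum possible magnitude, and r as D divided by the square root of the product of the four allele frequencies. The function name and the example frequencies are hypothetical.

```python
from math import sqrt


def disequilibrium_measures(f_AB, f_Ab, f_aB, f_ab):
    """Return (D, D_prime, r, r_squared) for a two-locus, two-allele system.

    Arguments are the four haplotype frequencies, which should sum to 1.
    """
    p_A = f_AB + f_Ab          # frequency of allele A
    p_B = f_AB + f_aB          # frequency of allele B
    p_a, p_b = 1.0 - p_A, 1.0 - p_B
    if min(p_A, p_a, p_B, p_b) <= 0.0:
        raise ValueError("both loci must be polymorphic")

    D = f_AB * f_ab - f_Ab * f_aB

    # D_max depends on the sign of D because haplotype frequencies
    # cannot fall below zero.
    if D > 0:
        d_max = min(p_A * p_b, p_a * p_B)
    else:
        d_max = min(p_A * p_B, p_a * p_b)

    d_prime = D / d_max
    r = D / sqrt(p_A * p_a * p_B * p_b)
    return D, d_prime, r, r * r


# Hypothetical haplotype frequencies with an excess of AB and ab.
D, d_prime, r, r2 = disequilibrium_measures(0.4, 0.1, 0.1, 0.4)
print(f"D = {D:.3f}, D' = {d_prime:.3f}, r = {r:.3f}, r^2 = {r2:.3f}")
```

With equal allele frequencies at both loci, as in this example, the normalized measure and r coincide. With unequal allele frequencies they generally differ, which is one reason both are reported.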
It is possible, although relatively rare, that unlinked
loci can be in significant gametic disequilibrium, and
that very closely linked loci may be in gametic equilibrium. The value of gametic disequilibrium is often
expected to change each generation. Further, a population at genetic equilibrium can have significant
gametic disequilibrium (e.g., under various selection
schemes). The term linkage disequilibrium is thus an
unfortunate choice and the term gametic disequilibrium is preferable, although also not perfect; however,
the term linkage disequilibrium is commonly used.
There is a relationship between D and the recombination fraction c between two loci. Changes in
gamete frequencies over generations occur only by
recombination in individuals heterozygous at both
loci under consideration. In fact, the value of D
is reduced by the factor (1 − c) each generation under
random mating and a neutral model. Thus the gametic
disequilibrium in this case converges to zero (random
association of alleles) with time as $(1 - c)^n$, where n is
the number of generations. The more loosely linked
are two loci, the faster the decay of gametic disequilibrium. On the other hand, for very tightly linked loci
gametic disequilibrium may exist for a very long time.
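The effect of linkage on the rate of decay is easy to see numerically. The following sketch assumes random mating and neutrality, as in the text, so that the disequilibrium after n generations is the starting value multiplied by (1 − c) raised to the power n. The function names and the chosen recombination fractions are illustrative assumptions.

```python
from math import ceil, log


def disequilibrium_after(d0, c, n):
    """Gametic disequilibrium after n generations, starting from d0,
    with recombination fraction c (random mating, neutral loci)."""
    return d0 * (1.0 - c) ** n


def generations_to_fraction(c, fraction):
    """Smallest whole number of generations after which the disequilibrium
    has fallen to at most the given fraction of its starting value."""
    return ceil(log(fraction) / log(1.0 - c))


# Compare loosely and tightly linked pairs of loci (values are illustrative).
for c in (0.1, 0.01, 0.001):
    half_life = generations_to_fraction(c, 0.5)
    print(f"c = {c}: disequilibrium halves within {half_life} generations")
```

Each tenfold reduction in the recombination fraction lengthens the time needed to halve the disequilibrium by roughly tenfold, which is why associations between very tightly linked loci can persist for a very long time.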
The definition of gametic disequilibrium is easily
extended to accommodate more than two alleles at a
locus. When considering three or more loci, higher
order disequilibrium terms are also needed. For example, in a three-locus system a gametic frequency can
be expressed in terms of the three-allele frequencies,
the three pairwise gametic disequilibria, and a single
measure of third-order gametic disequilibrium.
Selection for different combinations of alleles can produce D ≠ 0. If the selection is acting directly on the
two loci being considered, gametic disequilibrium can
be maintained in an equilibrium state. While this can
apply to unlinked loci, it is expected more often for
closely linked loci. Transient gametic disequilibrium
can also be created with neutral loci via a hitchhiking
event. If an allele, say b, at a neutral locus is in gametic
disequilibrium with an allele favored by selection at
another locus, say a, then changes in the frequency of
the allele b will occur due to this nonrandom association. Such hitchhiking can noticeably increase the
absolute value of the gametic disequilibrium, if selection in favor of the new mutant is greater than the
recombination rate between the neutral and selected
loci. Further, gametic disequilibrium can be generated
between two neutral loci via hitchhiking at a third
closely linked locus. Gametic associations built up via
hitchhiking events are expected to decline in strength
as recombination breaks up haplotypes bearing the
selected allele.
Migration or Admixture
Nonrandom Mating
752
G a m et o g en e s i s
Gametogenesis
J Hodgkin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0494
Gametogenesis is the process leading to the production of specialized reproductive cell types, either eggs
or sperm, collectively known as gametes. It entails
meiotic division, to generate haploid cells, together
with maturation into the appropriate functional
gamete.
See also: Oogenesis in Caenorhabditis elegans;
Oogenesis, Mouse; Spermatogenesis in
Caenorhabditis elegans; Spermatogenesis, Mouse
Gamma Distribution
N Saitou
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0495
Ras-GAP is a 120-kDa, ubiquitously expressed cytosolic protein. It was initially identified as an activity in
cell extracts able to stimulate the intrinsic GTPase
activity of p21Ras (Trahey and McCormick, 1987).
References
Gastrulation
See: Developmental Genetics
Gaucher's Disease
T M Cox
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0498
occur in an overall frequency of about 1 per 855 individuals in the population at large.
Definition
Gaucher's disease is a multisystem disorder principally affecting macrophages and classified in the
Online Mendelian Inheritance in Man (OMIM) website as OMIM 23080, 23091, and 23100. It is a prototype of the glycosphingolipidoses, an important group
of lysosomal disorders characterized by deficiency of
specific acid hydrolases responsible for the degradation of complex membrane glycolipids.
Gaucher's disease is caused by a recessively inherited deficiency of an acid β-glucosidase, glucocerebrosidase (EC 3.2.1.45). As a consequence of this
deficiency, N-acyl-sphingosyl-1-O-β-D-glucoside and
other minor glycolipid metabolites such as glucosylsphingosine accumulate. All the accumulated glycolipids represent metabolic intermediates derived
from the cellular turnover of membrane lipid macromolecules of the ganglioside and globoside classes.
Genetics
The human glucocerebrosidase locus has been
mapped to chromosome 1q21 where it is found in
close proximity to a nonprocessed cognate pseudogene which is absent in several other vertebrates.
The human acid-β-glucosidase gene is also found in
proximity to two other genes, metaxin and thrombospondin 3. Expression of mRNA encoding human
acid-β-glucosidase is constitutive in nearly all cells
but varies in abundance. In the 5′ untranslated region
of the functional acid-β-glucosidase gene in humans
there are two CAAT boxes and two sequences encoding putative CAAT boxes. Promoter/expression studies have revealed binding sites for several transcription factors:
octamer-binding transcription factor 1 (OCT binding
protein), oncogene Jun activator protein-1 (AP-1),
the ets-related transcription factor Polyomavirus
Enhancer Activator-3 (PEA3), and the CAAT binding protein. These and other transcription factors
yet to be characterized indicate that the expression
of acid-β-glucosidase may be regulated by transcriptional activation associated with proliferative cell responses.
accumulation of glycolipids in macrophages that belong
to the mononuclear phagocyte system located principally in the liver, bone marrow, and spleen. Pathological
macrophages containing excess lysosomal stored lipid
may also be found within the lung and, on rare occasions, pericardium and kidney. In the neuronopathic
forms of Gaucher's disease (type 2 and type 3) severe
deficiency of glucocerebrosidase caused by disabling
or inactivating mutations is additionally associated
with disease of the nervous system. The pathogenesis
of neuronopathic Gaucher's disease is complex, but in
many instances failure to degrade endogenous glycosphingolipids present in brain tissue is a contributing
factor. The accumulation of Gaucher's cells
around adventitial spaces in cerebral blood vessels,
as a result of uptake of circulating glucosylceramide
present in plasma, may also contribute.
Genotype-Phenotype Correlations
In a particular variant of neuronopathic Gaucher's
disease described in Arabic, Japanese, and Spanish
populations, an intermediate phenotype associated
with corneal opacities and endocardial thickening
with mitral and aortic valve disease of the heart has
been identified and related to homozygosity for a
particular missense mutation D409H in the glucocerebrosidase gene. Other genotype/phenotype correlations are less close but, of the common widespread
mutations, L444P is associated with disease severity
and has been described in all three phenotypes of
Gaucher's disease. Homozygosity for L444P is a significant cause of neuronopathic Gaucher's disease in
the so-called Swedish or Norrbottnian variant with
slowly progressive neurological symptoms associated
with survival to adult life. In type 2 Gaucher's disease
the rapid onset of bulbar paresis, spastic paraparesis,
and opisthotonus with swallowing difficulties is noted
within the first few months of life and survival beyond
the first few years of life is very unusual.
A recently recognized and rare variant of Gaucher's
disease is associated with premature fetal loss and
stillbirth as well as infants with a desquamating and
dehydrating skin lesion that die shortly after birth of
dehydration. This condition, frequently accompanied
by an abnormal `collodion' appearance of the dermis, is associated with severely inactivating lesions
in the human glucocerebrosidase gene and has parallels with the short-lived lethal murine glucocerebrosidase deficiency state generated by targeted disruption
of the glucocerebrosidase gene in embryonic stem
cells. Homozygous animals die within 24 h of birth
and although scant Gaucher's cells are present
within systemic organs and excess glucosyl ceramide
accumulates, the principal cause of death appears to be
Clinical Presentation
Symptoms of type 1 Gaucher's disease usually result
from the presence of splenic enlargement and either
the enlarged viscera are noted by the patient or the
consequences of hypersplenism (anemia, thrombocytopenia, or leukopenia) declare themselves by the
occurrence of spontaneous bruising or unexplained
sepsis. Abnormal blood counts combined with enlargement of the liver and spleen (hepatosplenomegaly)
Marrow Transplantation
Early studies using preparations of human glucocerebrosidase prepared from placental tissue were conducted at the National Institutes of Health by
Roscoe Brady and colleagues. Infusion of the native
protein was associated with a reduction of erythrocyte
and plasma glucocerebroside over a few days. No
convincing clinical improvement was demonstrated.
However, since the pioneering discovery of the lysosome and its access to the aqueous phase by Christian
de Duve and contemporaneous studies on the uptake
of glycoproteins by parenchymal and nonparenchymal hepatic cells, it was considered that native human
glucocerebrosidase may lack the critical recognition
signals for uptake and delivery to the disease macrophage of Gaucher's tissue. With the identification of a
mannose receptor on the surface of macrophages and
the preferential uptake of mannosylated proteins by
human alveolar macrophages, experiments were
undertaken to modify the terminal carbohydrate residues of placental glucocerebrosidase by sequential
enzymatic deglycosylation. Mannose-terminated preparations of human glucocerebrosidase were then
shown to be taken up preferentially by nonparenchymal (Kupffer-cell-rich) rather than parenchymal
hepatic cells in rats and prompted further studies
of enzyme replacement therapy in patients with
Gaucher's disease.
Early clinical trials of mannosylated human placental glucocerebrosidase (alglucerase) showed rapid
regression of symptoms and visceromegaly with improvement in blood counts and other parameters of
Gaucher's disease activity. The preparation secured
approval as an Orphan Drug under the Food and
Drug Administration of the USA in 1990.
Gene Therapy
Since Gaucher's disease can be corrected by transplantation of allogenic bone marrow providing a source of
granulocyte-monocyte progenitor cells, the possibility of gene therapy directed toward hematopoietic
stem cells is raised. Several trials have been approved
in the USA for the genetic transduction of CD34+
hemopoietic stem cells that have been therapeutically
corrected by transfer of the human glucocerebrosidase
gene in retroviral vectors. This approach has already
been successful in normal mice where prolonged
expression of human glucocerebrosidase at a high
level has been achieved in the macrophages of mice
that have received primary and secondary marrow
transplants. At present it is not clear how, in humans,
transfected cells would gain a selective advantage for
survival and come to populate the entire bone marrow that is
diseased in Gaucher's disease, thereby providing long-term remission of glycolipid storage by the metabolism
of endogenous glucocerebroside. However, high-efficiency vectors for long-term expression in grafted
autologous cells are currently being studied to secure
corrective expression with the wild-type glucocerebrosidase gene. The means to secure a selective advantage within the marrow population continues to be
explored actively.
Further Reading
G-Banding
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1845
G-banding (Giemsa banding) is a technique that generates a banded pattern in metaphase chromosomes,
thus allowing identification of the separate chromosomes. It involves brief treatment with protease and
staining with Giemsa.
See also: Giemsa Banding, Mouse Chromosomes
Gel Electrophoresis
B A Roe
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0499
Eq = fv
The capillary electrophoresis instrument also is
coupled with automated detection equipment and
an associated computer, on which the results can be
stored for further analysis. Various media have been
described which can provide single-base resolution of
either single- or double-stranded nucleic acids as large
as 500-1000 bases. These media include linear polyacrylamide, methyl cellulose, hydroxyethyl cellulose
either alone or mixed with polyethylene oxide, and
hydroxypropyl cellulose either alone or mixed with
polyethylene oxide. The capillaries are either coated
with a siliconizing reagent to reduce the charges on the
glass capillary or used directly without coating. Capillary electrophoretic-based instrumentation now is
quite robust and has gained wide acceptance in both
the gene-mapping and DNA-sequencing communities. In the case of protein separation by capillary
gel electrophoresis, media similar to those used in
nucleic acid separations have been described. However, coated capillaries almost always are required
because of the greater tendency for proteins to bind
to the capillaries and thereby cause altered observed
electrophoretic mobility and irreproducible quantitation of any resolved samples.
There are numerous manufacturers and resellers
of electrophoretic equipment that range from simple
but effective Plexiglas acrylamide apparatus and
power supplies to self-contained PAGE gel or capillary electrophoresis instruments. These suppliers
also include detailed protocols for optimal use of
their apparatus or instrumentation that are easy to
follow and typically yield reproducible results.
Finally, the electrophoresis literature is extensive and
provides many detailed procedures. However, the
reader is encouraged to investigate the following two
books: Molecular Cloning: A Laboratory Manual
(Sambrook and Russell, 2000); and Proteins (Walker,
1984), as they provide an almost complete review of
the existing literature.
References
Gene
J Merriam
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0500
Gene Action
C Yanofsky
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0501
Gene Amplification
T D Tlsty
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0502
Development
In Germline Cells
The earliest suggestion of developmental DNA amplification was made by King in 1908 when she
described the extra `chromatin' (associated with forming nucleoli) which arose in the oocytes of the toad,
Bufo sp., during pachytene (King, 1908). Later studies
show that these `masses' contain DNA, and hybridization studies demonstrate a great excess of sequences
coding for rRNA (Gall, 1968). This differential synthesis of rRNA genes is correlated with the appearance of
hundreds of nucleolar organizers during the pachytene
stage of meiosis. Similar examples of this phenomenon
are seen in other amphibians: Xenopus, Rana,
Eleutherodactylus, Triturus (Gall, 1968), as well as in
an echiuroid worm, the surf clam (Brown and David,
1968), and many insects (Gall et al., 1969). Documentation of rDNA amplification is particularly amenable, because the rDNA forms nucleoli which are
distinctive in appearance and because most rDNA
sequences have a relatively high guanosine-cytosine
content, which allows their separation from bulk DNA
on CsCl gradients. These two aspects of rDNA have
been important in the recognition of this and other
phenomena (magnification and compensation) in
In Somatic Cells
also been observed in different tissues of Chironomidae, Drosophila, Sciara, Hybosciara, and other Diptera
(Breuer and Pavan, 1955), in some cases thought to be
under hormonal control (Pavan and da Cunha, 1969;
Bostock and Sumner, 1978).
In many cases of differentiation, somatic cells
acquire adequate amounts of mRNA for production
of abundant proteins by accumulation of stable
mRNA molecules over a period of days. Examples of
such are the silk fibroin genes, the ovalbumin genes,
and the β-globin-chain genes. This is thought to be
controlled at the transcriptional or posttranscriptional
level. However, in other cases, such as during the
synthesis of the insect eggshell by the ovarian follicle
cells of Drosophila, little time is allotted for the production of the specific mRNAs which are needed
in large quantities. In these latter cases, the rates of
transcription and translation do not seem to be high
enough for the production of adequate amounts of
protein. Spradling and Mahowald (1980) found that
this need is met by differential amplification of the
chorion gene sequences in the ovarian follicle cells.
Spradling and Mahowald (1981) found that the chorion
genes that are located on the X chromosome are in two
clusters, s36 and s38, and are amplified 15-fold. The
s15 and s18 loci are on the third chromosome and
are amplified 60-fold. The genes in both clusters are
amplified at the same time and both homologs
are amplified equally. Sequences which flank these
genes are also disproportionately replicated but not
to as great an extent. This results in a gradient of
amplification which spans 90 kb of DNA, is maximal
in the center, and does not evidence any discrete termination sites.
Changes in the DNA content of the nucleus of
germline cells and somatic cells during development
are common. Amplification of rRNA sequences, as well
as others coding for proteins which are needed in large
amounts during that particular phase of development,
has been extensively documented. In still other cases,
the nature of the amplified DNA cannot be totally
accounted for by known sequences and probably contains amplified sequences of unknown function. Such
sequences have been implicated in the differentiation
of the orchid Cymbidium sp. (Nagl et al., 1972), the
differentiation of peas (Van't Hof and Bjerknes,
1982), and in the flowering of the tobacco plant,
Nicotiana sp. (Wardell, 1977).
events that are specifically regulated during development, amplification as a means of survival is a more
sporadic event, whose frequency is often detected at
much lower levels than that seen in developmental
amplification.
Bacteria
The first well-documented instance of gene duplications in bacteria which provided an adaptive advantage was seen by Novick and coworkers (Novick and
Horiuchi, 1961; Horiuchi et al., 1962, 1963). Bacterial
strains were grown for long periods of time in
limiting concentrations of lactose in the chemostat.
The bacterial strains which emerged were able to
synthesize four times the maximal normal amount of
β-galactosidase. The ability to produce large amounts
of enzyme was unstable and could be transferred by
conjugation at a time when the lactose operon genes
were expected to be transferred. It was concluded that
the ability to overproduce β-galactosidase was due to
extra copies of the lactose genes present in these
strains. The spontaneous rate of duplication was
estimated to be 10⁻³ (Horiuchi et al., 1963) and, in a
similar system, 10⁻⁴ (Langridge, 1969). Overproducers with similar characteristics were subsequently
reported for other enzymes such as ribitol dehydrogenase, β-lactamase (Normark et al., 1977), and several others (for an excellent review, see Anderson and
Roth, 1977). Roth and coworkers have demonstrated
that duplications of up to a quarter of the Salmonella
typhimurium chromosome occur at large homologous
segments such as the rRNA genes (Anderson and
Roth, 1977, 1981). Such large duplications are dependent on the recA system. On the other hand, duplications in the range of 10-30 kb appear to be
independent of recA and do not involve very large
homologous segments of DNA (Emmons et al., 1975;
Anderson and Roth, 1977; Emmons and Thomas,
1981). Edlund and Normark (1981) have detected
tandem duplications of 10-20 kb at the E. coli chromosomal ampC locus, which codes for β-lactamase and
confers resistance to ampicillin. After a step-wise
selection in increasing concentrations of ampicillin,
30-50 copies of the duplication events were analyzed
at the restriction fragment level and were found to
have different endpoints. The junction point of the
amplified unit in one case was sequenced and it was
found that the original duplication event occurred at a
sequence of 12 bp, which was repeated on each side of
the ampC locus, 10 kb apart. Tlsty et al. (1984a) have
detected amplification events as revertants of certain
leaky Lac mutants. The amplified DNA sequences
contained the lactose operon and anywhere from 7 to 32 kb of flanking sequences. These
regions were amplified 100-fold. Virtually all of the
duplications that have been detected and subsequently studied in bacteria have been tandem in nature.
Several bacteriophages are also known to
amplify genetic markers. The phage lambda (Edlund
and Normark, 1981), T4 (Kozinski et al., 1980), and
P1 phage (Meyer and Iida, 1979) may use mechanisms
that parallel those used by viral sequences in mammalian cells. At the present time, the mechanism is unknown.
Yeast
Resistance to the toxic effects of copper in Saccharomyces cerevisiae is mediated by tandem gene amplification of the CUP1 locus (Fogel et al., 1983) but
is different from sporadic amplification events characterized in other organisms. The CUP1 locus of
yeast codes for a low-molecular-weight copper-binding protein. Copper-sensitive strains contain one
copy of this locus and, when grown in elevated
concentrations of copper, failed to produce resistant
derivatives with a higher gene copy number of CUP1
genes on one chromosome. Infrequently, copper-resistant strains were isolated in the laboratory and
were found to carry up to 10 tandem duplications
of the region, which is 2 kb in size. Further amplification could be achieved by growing the copper-resistant strains in elevated copper concentrations.
The authors postulate that the mechanism of amplification in this instance proceeds through the formation of a disomy for chromosome VIII (which carries
the CUP1 locus). Copper-resistant mutants of the
sensitive strain were found to be disomics for chromosome VIII. The amplification or tandem iteration
could then result primarily from subsequent unequal
chromosome or sister chromatid exchanges. The formation of a disomy may constitute an initial event in
the process.
A second example of gene amplification in yeast
exhibits molecular structures that are more similar to
those described in other organisms. A yeast strain
resistant to antimycin A, an alcohol dehydrogenase
inhibitor, has been found to contain multiple copies
of a nuclear gene, ADH4, an isoenzyme of alcohol
dehydrogenase. The amplified copies are 42 kb in
length, display a linear, extrachromosomal, palindromic structure and contain telomeric sequences. Their
structure resembles that of the amplified rDNA
genes in the macronucleus of Tetrahymena and related
ciliated protozoa, except that the nuclear copy
remains within the chromosome in this situation. In
contrast to what is often observed in mammalian
amplification, the extrachromosomal copies of this
gene were stable during mitotic growth. Amplification of the ADH4 gene is a relatively rare
event (approximately 10⁻¹⁰ mutations/cell/generation); alternative
Protozoans
Drug resistance in protozoan parasites is a common occurrence and presents a serious problem for
the chemotherapy of diseases caused by such pathogens as Trypanosoma, Leishmania, and Plasmodium
(Browning, 1954; Peters, 1974; Rollo, 1980). Recently,
Leishmania strains resistant to the well-known chemotherapeutic agent methotrexate (MTX), have been isolated and analyzed as to their mechanism of resistance.
Organisms which were resistant to high concentrations of MTX (1 mM) had a 40-fold increase
in dihydrofolate reductase (DHFR), which in this
organism is associated with thymidylate synthetase
(Beverley et al., 1984).
Recent studies have shown that Plasmodium falciparum contains genes that are analogous to the multidrug resistance genes in mammalian cells. Parasites
that become resistant to chloroquine have also proven
to be resistant to other antimalarial drugs, similar
to the phenomenon of multidrug resistance seen
in human tumors. Wilson et al. (1989) found that
sequences similar to the mammalian P-glycoprotein gene existed in P. falciparum. Their studies
showed that drug-resistant parasites contained amplified copies of these specific DNA sequences when
compared with their drug-sensitive siblings.
Invertebrates
the magnified rDNA become evident in the F2 progeny if the rDNA has been transmitted by males and if
the genotype of the F2 generation is again characterized by rDNA deficiencies. In other words, the extra
copies of rDNA are eliminated if the magnified fly is
crossed with a normal (bb) female. The phenotypic
inheritance of magnified rDNA requires its integration and inheritance through the male germline. Two
hypotheses have been proposed to explain rDNA
magnification: disproportionate replication of rDNA
(Ritossa and Scala, 1964; Ritossa et al., 1971) or unequal sister chromatid exchange (Tartof, 1974). Recent
experiments demonstrating the decrease in magnification frequency in organisms which carry the rDNA
on a ring chromosome strongly suggest unequal crossover as the mechanism.
Another type of amplification occurs in Drosophila
melanogaster, which differs from rDNA magnification in several characteristics. This amplification,
called `compensation,' occurs when one nucleolus
organizer of the two homologs is completely deleted
(X/O or X/X-no females). In such mutants, the
remaining organizer `compensates' for the deletion
of the rDNA sequences by a disproportionate replication of the remaining sequences on the intact
homolog. Compensation may only occur on the X
chromosomal nucleolus organizer, and the extra
rDNA is not inherited in subsequent generations.
Utilizing various deficiencies for the X-chromosomal
heterochromatin, evidence has been presented for the
existence of a genetic locus that regulates rDNA compensation (Procunier and Tartof, 1978). This locus,
called the `compensatory response' (cr), is located outside the ribosomal cluster and in the X-chromosomal
heterochromatin. The locus acts in trans to sense the
presence or absence of its partner locus on the opposite homolog. If only one cr locus is present, it acts in
cis by driving compensation (disproportionate replication) of adjacent rRNA genes. Not all embryos with
the proper genotype undergo compensatory amplification to emerge with an increased number of rRNA
genes. Only a small fraction undergoes the putative
compensatory amplification. In this respect the amplification behaves like a mutagenic reversion event to
restore the functional phenotype.
Resistance to environmental agents such as pesticides and toxic chemical waste has been documented in
laboratory stocks and natural populations of invertebrates. Selection of Drosophila larvae in increasing
concentrations of cadmium yields strains that contain
duplications of the metallothionein gene (Otto et al.,
1986). The duplication is stably inherited in the
absence of selection pressure and produces a corresponding increase in metallothionein messenger RNA.
A survey of natural populations found that this event
Plants
DNA changes in plants during response to environmental stress have been reported for flax (see Cullis,
1977, 1979, 1983). The suggestion that these changes in
DNA content are induced by the environment awaits
further studies for verification.
Vertebrates
et al., 1976). Wahl et al. (1979) have shown that overproduction of these enzymes is the direct result of
amplification of DNA coding for these proteins.
Numerous other instances of DNA amplification
have now been described (for a comprehensive list and
references, see Stark and Wahl, 1984). In all cases the
growth of cells is inhibited either by metabolic inhibitors, toxic agents, or altered enzymes with reduced
efficiency. Of clinical importance has been the discovery that multidrug resistance in cancer chemotherapy
is, in some cases, mediated by amplification of the mdr
locus (Roninson et al., 1984a,b). Selection pressure
can also lead to the amplification of sequences with
initially unknown functions.
The acquisition of a selected phenotype may often
result from selection pressures that are unknown at
the time. In these cases a certain phenotype may be
accompanied by the manifestations of gene amplification for sequences of unknown identity. Such an instance has
been described in studies of the sequences that are
carried on the double-minute chromosomes (DMs)
and found in the homogeneously staining regions
(HSRs) of neuroblastoma cells, where these structures
were first described. The sequences amplified in these lines include a cellular onc gene, the N-myc
gene (Schwab et al., 1983). Amplification of oncogenes has now been found in several tumor types
containing DMs and HSRs.
Summary
The literature suggests that when gene amplification
does occur in normal tissues it is developmentally
regulated. This evidence is mostly compiled from
studies on Xenopus and Drosophila (see the section
``Development''). In higher organisms, the documentation of gene amplification as a developmental event
is lacking. At the present time we do not know if gene
amplification can be developmentally programmed in
mammalian cells.
Sporadic amplification can occur in unicellular
organisms such as bacteria and yeast, but seems to be
lacking in the normal somatic tissues of higher eukaryotes. Several instances of sporadic amplification in the
germline cells of several organisms have been reported
and have been shown to be heritable. In all of these
cases, the phenotype demonstrated an increased resistance to an environmental toxin (Mouches et al., 1986;
Maroni et al., 1987; Prody et al., 1989). The extensive
documentation of sporadic amplification in neoplastic
tissues raises questions of when the neoplastic cell
acquires the ability to amplify and if the manipulation
of this event can aid in the treatment of cancer.
References
Gene Cassettes
R M Hall
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1720
Cassette-Associated Genes
It is presumed that any gene can become packaged in
cassette form, though how and where this happens
remains a matter for speculation. Many (over 60) of
the known cassettes contain an antibiotic resistance
gene, and these genes determine resistance to various antimicrobial agents (b-lactams, aminoglycosides, trimethoprim, chloramphenicol, erythromycin,
rifampicin, and antiseptic quaternary ammonium
compounds) using a variety of mechanisms. Over
150 further cassettes have been found in the Vibrio
cholerae small chromosome. Only a few of the ORFs
(potential genes) contained in these cassettes have
been identified. They include genes for a toxin, a
virulence determinant, and a lipoprotein as well as a
few potential antibiotic resistance genes. Restriction
and modification enzymes have also been found to be
encoded in cassettes.
Cassette-Associated Recombination
Sites
The 59-be recombination sites found in cassettes provide the signal that permits cassettes to be mobilized.
They were originally recognized as a consensus of
59 bp found downstream of several different genes
and were subsequently shown to be recombination
sites recognized by the integron-encoded IntI integrases. Each cassette includes a unique 59-be, and
members of the 59-be family were later found to
vary considerably in sequence and length; the shortest
[Figure 1 diagram: a gene cassette in (A) circular and (B) linear integrated form, labeled with the coding region (ATG start), the 59-be, the core sites 1L, 2L, 2R, and 1R, and the LH and RH simple sites.]
Figure 1 Structure of a circular gene cassette. A generalized typical cassette in (A) its free, circular form and (B) its
linear integrated form showing the coding region of the gene and the 59-be recombination site. The extent of the
coding region (not to scale) is delineated by start (ATG) and stop(*) codons. 7-bp core site sequences, related to the
consensus GTTRRRY, that lie within putative IntI binding sites are boxed, and arrows indicate their relative
orientations. Only bases found in all 59-be are shown, while other consensus bases are represented by dots. An extra
base in 2L is marked by an . In any individual 59-be, sites numbered 1 are closely related, as are sites numbered 2,
but the bases between them are not. The left hand (LH) and right hand (RH) simple sites consist of pairs of core sites
(1L and 2L; 2R and 1R, respectively) together with flanking sequences. The region located between the LH and RH
simple sites is an inverted repeat (represented by a pair of arrows) which has a variable sequence and length. A vertical
arrow indicates the recombination crossover point. On integration of the cassette into the attI site of an integron, the
1R core site is split at the recombination crossover point so that the last six bases of 1R in the circular cassette
become the first six bases of the integrated cassette.
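The core site consensus GTTRRRY is written in the IUPAC nucleotide code, in which R stands for a purine (A or G) and Y for a pyrimidine (C or T). As a minimal sketch of how candidate core sites might be located in a sequence, the consensus can be expanded into a regular expression. The function names and the test sequences below are hypothetical, only the given strand is scanned, and a match to the 7-bp consensus alone does not establish that a sequence is a functional 59-be.

```python
import re

# IUPAC codes needed for the consensus; R = purine, Y = pyrimidine.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}


def consensus_to_pattern(consensus: str) -> str:
    """Expand an IUPAC consensus string into a regular-expression pattern."""
    return "".join(IUPAC[base] for base in consensus.upper())


# A lookahead is used so that overlapping matches are also reported.
CORE_SITE = re.compile("(?=(" + consensus_to_pattern("GTTRRRY") + "))")


def find_core_sites(sequence: str) -> list:
    """Return 0-based start positions of GTTRRRY matches on the given strand."""
    return [match.start() for match in CORE_SITE.finditer(sequence.upper())]


if __name__ == "__main__":
    # Made-up sequence used purely to exercise the function.
    print(find_core_sites("ccGTTAGGCaa"))
```

Scanning the reverse complement as well would be needed to find sites in the opposite orientation, such as the inversely oriented core sites within a 59-be.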
Further Reading
Hall RM and Collis CM (1995) Mobile gene cassettes and integrons: capture and spread of genes by site-specific recombination. Molecular Microbiology 15: 593-600.
Recchia GD and Hall RM (1995) Gene cassettes: a new class of
mobile element. Microbiology 141: 3015-3027.
Gene Conversion
F W Stahl
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0503
[Figure: meiosis in a diploid cell m/+, contrasting normal segregation with gene conversion.]
Conversion Gradient
The frequency of conversion (half and/or full) for
markers within a given gene varies with the position
of the marker. These rates may vary monotonically
from one end of the gene to the other (conversion
gradient).
[Figure 2 diagram: panels (a) to (i) showing cutting, resection, invasion, Holliday junction sliding, mismatch removal and DNA synthesis, and junction cutting in the same or opposite sense, leading to crossover, noncrossover, and alternative-resolution products with 5:3 and 6:2 segregations.]
Figure 2 Double-strand-break-repair model. A duplex (a) is cut on both strands at a hot spot (b). Resection of 5′
ends creates 3′ overhangs (c). These single strands of DNA bind proteins like RecA of Escherichia coli and invade the
homolog, creating regions of hybrid DNA (d). (Alternatively, invasion by one end, followed by DNA synthesis primed
by that end, displaces a strand from the intact chromatid, which can then anneal with the resected end on the other
side of the initiating break. The resulting double Holliday junction intermediate is the same for either scenario.) The
Holliday junctions (where the strands swap partners) may be pushed outward (branch migration) (e). Mismatch repair
of heteroduplexes (shown only on the right side of the initial break) removes invading DNA from the break site to a
mismatch. DNA lost by resection is resynthesized (broken lines) using the intact homolog as template, creating a joint
molecule (f). Resolution of the joint molecule results in crossing-over if one junction is cut vertically and the other
horizontally (g). Noncrossovers result if both junctions are resolved in the same way (h) or if the two participating
duplexes are separated from each other by an alternative route that could involve a topoisomerase or could result
from the cutting of one junction followed by sliding of the other (i). In the ARG4 gene of Saccharomyces cerevisiae, the
rarity of 5:3 segregations of the type shown on the bottom-right (h) and the occurrence of tetrads like those shown
on the bottom-middle (i) imply that cutting of the two Holliday junctions is rarely the route to the noncrossover
resolution of joint molecules. (Modified from Szostak et al., 1983.)
[Figure 3 diagram not reproduced. Labels recovered from panels (a) to (j): single-strand gapping; invasion; Holliday junction; DNA synthesis; nicking and joining; junction sliding; alternative resolution; junction cutting (vertical or horizontal); mismatch repair; crossovers; noncrossovers. Segregation ratios shown: 5:3, 6:2, and 4:4.]
Figure 3 Single-strand-gap-repair model. Premeiotic replication of chromosomes (a) results in occasional single-strand
gaps in one daughter or the other (b). RecA-like protein binds to this single-stranded DNA and catalyzes
interaction with a chromatid of the homolog (c). The invading 3′ end primes DNA synthesis using the intact homolog
as template (d). The resulting joint molecule may contain two Holliday junctions (i), or only one (e). Resolution of the
joint molecule by cutting the single Holliday junction (f) may yield either crossover (g) or noncrossover products (h).
Examples of segregation ratios prior to and consequent to mismatch repair are shown. If two Holliday junctions are
formed, alternative resolution to form noncrossovers, effected by topoisomerase or by cutting of only one junction,
would preserve the homolog genetically intact (j), in keeping with most observations of noncrossovers in
Saccharomyces cerevisiae. (Modified from Kuzminov, 1996.)
any marker that was in the gap. The failure to replace
all 6:2 tetrads with 5:3 tetrads consequent to the
removal of mismatch-repair systems suggests such
gap repair. When the marker examined is a deletion
of the initiation site, the only conversions seen are full
conversions, which favor the deletion and which occur
independently of mismatch repair.
accounts for a major fraction of meiotic recombination in S. cerevisiae. The possibility of other routes
to conversion, perhaps with attendant crossing-over,
remains open. A prime candidate for another route is
single-strand gap repair. When DNA is damaged on
one strand, replication can skip across the impediment, producing two duplexes, one of which is gapped
in its daughter strand. In mitotic cells, such a gap can
be repaired, and the impediment removed, with help
from the sister duplex (West et al., 1981). In meiotic
cells, a chromatid left gapped on one strand following
premeiotic DNA replication may be repaired with the
aid of the homolog, rather than the sister chromatid
Further Reading
References
Gene Dosage
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1848
Gene Duplication
D Carroll
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0505
Gene duplication is a process that occurs periodically (usually rarely) within genomes of all types of
[Figure: α-globin gene cluster, chromosome 16; β-globin gene cluster, chromosome 11.]
Further Reading
Gene Expression
J Parker
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0506
The genetic material contains the information necessary for an organism to develop, function, and
reproduce, but it is necessary that this information
be expressed in order for any activity, even maintenance, to be carried out. Therefore, one can consider
gene expression as encompassing all the processes
which are necessary to produce a gene product from
a gene. One can also include all regulatory steps, those
necessary to synthesize the gene product in appropriate amounts at an appropriate time and those involved
in regulating the activity of the gene product. Therefore, the following description is not meant to be an
exhaustive account of gene expression, but an overview of some of the processes involved. Many of these
processes are explored in more detail in articles elsewhere in this volume.
Transcription
The genes of all cellular organisms are composed
of double-stranded DNA (some viruses have single-stranded DNA genomes and others even RNA
genomes) and the first step in their expression is transcription (see Transcription). Transcription involves
using one of the two strands of DNA as a template
to make an RNA copy by an enzyme called RNA
polymerase (see RNA Polymerase). All RNA polymerases synthesize an RNA chain from the 5′ end to
the 3′ end while reading the template strand of the
DNA in the 3′ to 5′ direction. The RNA molecules are
synthesized from specific starting sites on the DNA
and also terminate at specific sites. The sites where
RNA polymerase (using accessory factors) recognizes
the beginning of a transcriptional unit are termed
promoters (see Promoters). In higher organisms, the
unit of transcription is almost always a single gene.
However, in prokaryotes the transcriptional unit may
contain several contiguous genes. These genes are often
related in function and/or belong to one pathway.
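The directionality described above can be illustrated with a minimal Python sketch. It assumes the template strand is supplied as a string written 3′ to 5′, so that the RNA comes out written 5′ to 3′; the function name and the example sequence are hypothetical, and promoter recognition and termination are not modeled.

```python
# Watson-Crick pairing from a DNA template base to the RNA base it specifies.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') copied from a template given 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


# Hypothetical template strand, written 3'->5'.
template = "TACGGCATT"
rna = transcribe(template)
print(rna)  # AUGCCGUAA
```

Because the template is read 3′ to 5′ while the chain grows 5′ to 3′, the first template base in the string specifies the 5′-most base of the transcript.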
Transcription is a target of several regulatory
mechanisms. These can serve to repress or activate
transcription, or lead to premature termination. One
common mechanism in bacteria is the binding of
a repressor protein to a specific region of the DNA
near the promoter which then blocks transcription
(see Repressor). The sequence to which the repressor
protein binds is termed an `operator' (see Operators),
a term which has given its name to the transcriptional
unit called an `operon' (see Operon). In bacteria an
operon may contain one or more genes, all under the
control of the single operator. Another mechanism for
regulating gene expression is the binding of a regulatory protein to the DNA which activates transcription. Such positive control is widespread in eukaryotic
genes. It is not uncommon for genes to be under more
than one form of regulation, nor is it uncommon, in
bacteria, for some regulatory proteins to be both repressors and activators for different genes. Attenuation
is another form of transcriptional regulation, but in
this case the transcript is terminated early in elongation (see Attenuation). The mechanism by which
attenuation takes place can vary quite dramatically
between different organisms. Also not all regulatory
molecules are proteins; regulatory RNA can also play
a role (see Regulatory RNA).
The majority of genes encode proteins, and the
RNA transcript must then be used as (or processed to
become) a messenger RNA (mRNA). As mentioned
above, eukaryotic transcriptional units are almost
always single genes, but some transcripts from
protein-encoding genes (particularly from animals)
can be very long (more than one million bases). The
great length of these transcripts results from the fact
that the protein-encoding genes of eukaryotes often
have several introns (noncoding sequences) interspersed within the coding sequences (exons), and these
are transcribed as a unit. Such genes are sometimes
referred to as `split genes' (see Introns and Exons; Split
Genes). In genes containing introns, then, one part
of gene expression is the processing of the transcript
to remove these introns. Indeed, in eukaryotes
most transcripts from protein-encoding genes need
three distinct processing steps to be converted into
mRNA: capping, splicing, and tailing. Capping
involves adding a modified guanosine to the 5′ end of
the pre-mRNA. It is this cap that allows the RNA to
be recognized by the translational machinery of the
cell as an mRNA. The RNA splicing process removes
introns and joins the exons together. Tailing involves
cutting the transcript at a specific site downstream of
the region encoding the protein and polyadenylating
the newly created 3′ end.
These processing events are coupled to transcription. Capping takes place very soon after transcription
has started. At least in the higher eukaryotes, where
genes may have, in the extreme, many large introns,
splicing is also coupled to transcription. The splicing
process in eukaryotic pre-mRNA is complex and
involves ribonucleoprotein particles called `spliceosomes' that contain various protein factors and small
nuclear RNA molecules (snRNPs or `snurps'; see Pre-mRNA Splicing). Splicing involves recognition of specific sites on the RNA and very precise cleavage and
ligation of the RNA (since an error of a single nucleotide will result in a frameshifted message). Splicing is
also regulated, and some genes have transcripts that
can be spliced in more than one way (alternative splicing) to yield more than one protein from a single
gene. Alternative splicing pathways are particularly
prevalent in the transcripts from genomes of small
animal viruses but occur in other genomes also.
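Why splicing must be precise to the nucleotide can be shown with a short Python sketch. It is an illustration only: the pre-mRNA sequence and intron coordinates are hypothetical, introns are given as half-open index pairs, and the spliceosome's recognition of splice sites is not modeled.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns, given as (start, end) half-open indices, and join the exons."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)


def codons(mrna: str) -> list[str]:
    """Split a message into complete triplets, reading from the first base."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical transcript: exon 1 + intron + exon 2.
pre = "AUGGCC" + "GUAAGUUUUCAG" + "AAAUGA"

correct = splice(pre, [(6, 18)])   # intron removed exactly
shifted = splice(pre, [(6, 19)])   # cut one nucleotide too far

print(codons(correct))  # ['AUG', 'GCC', 'AAA', 'UGA']
print(codons(shifted))  # ['AUG', 'GCC', 'AAU']
```

In the mis-spliced message every codon downstream of the junction is read in a different frame, and in this example the stop codon is lost.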
The transcripts of protein-encoding genes from prokaryotes do not require processing to be functional;
Translation
In prokaryotes, transcription and translation are
coupled, that is, the translation of a mRNA begins
before its synthesis is complete. There are even some
regulatory mechanisms which take advantage of this
coupling (see Attenuation). In eukaryotes, however,
transcription (and processing) occurs in the nucleus
and translation occurs in the cytoplasm. Therefore, in
eukaryotes the mature mRNA must be transported
to the cytoplasm. In all organisms, most mRNA is
Gene Family
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0507
vertebrates and, as such, all three are a common feature of all mammals. The products encoded by genes
within two of these branches, α-globin and β-globin,
come together (with heme cofactors) to form a tetramer which is the functional hemoglobin protein that
acts to transport oxygen through the bloodstream.
The product encoded by the third branch of this
superfamily, myoglobin, acts to transport oxygen
in muscle tissue.
The β-like branch of this gene superfamily has
duplicated by multiple unequal crossing-over events
and diverged into five functional genes and two β-like
pseudogenes that are all present in a single cluster on
mouse chromosome 7 as shown in Figure 1. Each of
the β-like genes codes for a similar polypeptide which
has been selected for optimal functionality at a specific
stage of mouse development: one functions during
early embryogenesis, one during a later stage of
embryogenesis, and two in the adult. The α-like
branch has also expanded by unequal crossing-over
into a cluster of three genes (one functional during
embryogenesis and two functional in the adult) on
mouse chromosome 11. The two adult α genes are
virtually identical at the DNA sequence level, which
is indicative of a very recent duplication event (on the
evolutionary time scale).
In addition to the primary α-like cluster are two
isolated α-like genes (now nonfunctional) that have
transposed to dispersed locations on chromosomes 15
and 17. When pseudogenes are found as single copies
in isolation from their parental families, they are
called `orphons.' Interestingly, one of the α-globin
orphons (Hba-ps3 on Chr 15) is intronless and
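Figure 1 [Diagram of the mouse globin gene family not reproduced. Labels recovered: β-globin gene cluster on Chr 7 (early and late embryonic genes, pseudogenes, adult genes); α-globin gene cluster on Chr 11 (embryonic and adult genes); isolated α-globin pseudogenes Hba-ps3 (Chr 15) and Hba-ps4 (Chr 17); myoglobin (Chr 15); branch points at 500 and 800 my BP.]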
would appear to have been derived through a retrotransposition event, whereas the other orphon (Hba-ps4 on chromosome 17) contains introns and may have
been derived by a direct DNA-mediated transposition. Finally, the single myoglobin gene on chromosome 15 does not have any close relatives either
nearby or far away. Thus, the globin gene superfamily
provides a view of the many different mechanisms that
can be employed by the genome to evolve structural
and functional complexity.
The Hox gene superfamily provides an alternative
prototype for the expansion of gene number. In this
case, the earliest duplication events (which predate the
divergence of vertebrates and insects) led to a cluster
of related genes that encoded DNA-binding proteins
used to encode spatial information in the developing
embryo. The original gene cluster has been duplicated
en masse and dispersed to a total of four chromosomal
locations (on chromosomes 2, 6, 11, and 15) each of
which contains nine to twelve genes. Interestingly,
because of the order in which the duplication events
occurred (unequal crossing-over to expand the
cluster size first, transposition en masse second), an
evolutionary tree would show that a single `gene
family' within this superfamily is actually splayed
out physically across all of the different gene clusters.
Some gene additions and subtractions within individual clusters have occurred by unequal crossing-over
since the en masse duplication so that differences in
gene number and type can be seen within a basic
framework of homology among the different whole
clusters.
A final example of a gene super-superfamily is the
very large set of genes that contain immunoglobulin-like (Ig) domains and function as cell surface or soluble receptors involved in immune function or other
aspects of cell-cell interaction. This set includes the
immunoglobulin gene families themselves, the major
histocompatibility genes (called H2 in mice), the T cell
receptor genes, and many more. There are dispersed
genes and gene families, small clusters, large clusters,
and clusters within clusters, tandem and interspersed.
Dispersion has occurred with the transposition of
single genes that later formed clusters and with the
dispersion of whole clusters en masse. Furthermore,
the original Ig domain can occur as a single unit in
some genes, but it has also been duplicated intragenically to produce gene products that contain two, three,
or four domains linked together in a single polypeptide. The Ig superfamily, which contains hundreds
(perhaps thousands) of genes, illustrates the manner
in which the initial emergence of a versatile genetic
element can be exploited by the forces of genomic
evolution with a consequential enormous growth in
genomic and organismal complexity.
Gene Flow
J B Mitton
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0508
individuals. But gene flow is not equivalent to dispersal, for gametes or individuals that move among populations but fail to incorporate genes into the gene pool
have not mediated gene flow.
Population structure, the pattern of genetic variation among populations, is produced by the joint
action of gene flow, genetic drift, and natural selection.
Genetic drift is change in allelic frequencies produced
by accidents of sampling and chance variation in survival, mating success, and family size. Natural selection
is defined as the differential reproduction of genotypes.
Elaboration
Genetic Drift Differentiates Populations
If populations are not connected by gene flow, stochastic changes will cause them to diverge in time.
Imagine a large, genetically diverse, randomly mating
population that is suddenly broken apart into two
perfectly isolated populations, each considerably
smaller than the initial population. Initially, these
populations might share the same alleles at similar
frequencies. The Hardy-Weinberg Law demonstrates
that, in the absence of selection, mutation, and migration, allelic frequencies will not change in an infinitely
large population with random mating. But in finite
populations, allelic frequencies drift over time, with
stochastic variation in survival and reproduction. The
stochastic loss of alleles may differ between populations, increasing the genetic distance between them. In
addition, mutations may introduce new alleles into
populations, further distinguishing them. With sufficient time, perfectly isolated populations will become
completely differentiated, so that they do not share
any alleles.
The rate of genetic drift in a population is dependent on the number of breeding adults in the population. Consider a gene segregating two alleles, A and a,
at frequencies p and q, respectively, so that:
$$p + q = 1.0$$
The standard error (SE) of the allelic frequency is a
measure of the magnitude of drift of the frequency of
an allele in a single generation. The standard error
of the frequency of an allele is:
$$\mathrm{SE} = \sqrt{\frac{pq}{2N}}$$
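The dependence of drift on population size can be made concrete with a small Python sketch of the standard error given above, SE = sqrt(pq / 2N), taking N to be the number of breeding adults. The function name and the population sizes are illustrative assumptions.

```python
import math


def drift_standard_error(p: float, n_breeding: int) -> float:
    """Standard error of an allele frequency after one generation: sqrt(pq / 2N)."""
    q = 1.0 - p
    return math.sqrt(p * q / (2 * n_breeding))


# Hypothetical population sizes, all starting from p = 0.5.
for n in (10, 100, 1000):
    print(n, round(drift_standard_error(0.5, n), 4))
```

Each hundredfold increase in N reduces the expected single-generation change tenfold, which is why small populations diverge fastest.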
Gene flow among populations makes them more similar. This point can be made intuitively by considering
an exercise with two glasses of wine, one red and one
white. Imagine pouring a small amount from the glass
of white wine into the other glass, then swirling it. The
red wine will still be red, but careful inspection would
reveal that the intensity of the color has diminished.
Now pour some of the red wine into the other glass. A
few drops of red wine bring a tinge of red to the white
wine. Now imagine repeating the exchanges many
times. Ultimately, the colors in the two glasses will
be identical. Similarly, some gene flow between populations will make them more similar, and high gene
flow will make them indistinguishable.
The impact of gene flow on population structure
can be illustrated quantitatively by modeling genetic
variation at a single gene. Now consider gene flow into
a population from populations that have different
allelic frequencies. If the proportion of migrants into a
population is m, and the frequency of A in the migrants is p_m, then p′, the new frequency of A in the
population, will be:
$$p' = p_m m + p(1 - m)$$
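The homogenizing effect of migration can be sketched in Python using the standard one-way migration recursion, in which the new frequency is the migrant frequency weighted by m plus the resident frequency weighted by (1 - m). The symbol p_m for the migrant frequency, the function names, and the numerical values are assumptions made for illustration.

```python
def after_migration(p: float, p_m: float, m: float) -> float:
    """Allele frequency after one generation of migration: p_m*m + p*(1 - m)."""
    return p_m * m + p * (1.0 - m)


def generations_to_converge(p: float, p_m: float, m: float,
                            tolerance: float = 0.01) -> int:
    """Count generations until the recipient is within `tolerance` of the migrants."""
    if not 0.0 < m <= 1.0:
        raise ValueError("m must lie in (0, 1] for the frequencies to converge")
    generations = 0
    while abs(p - p_m) > tolerance:
        p = after_migration(p, p_m, m)
        generations += 1
    return generations


# Hypothetical values: residents at 0.2, migrants at 0.8, 10% migrants per generation.
print(after_migration(0.2, 0.8, 0.1))
print(generations_to_converge(0.2, 0.8, 0.1))
```

The difference between the two frequencies shrinks by a factor of (1 - m) each generation, which is the quantitative counterpart of the wine-glass exercise.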
levels of gene flow, the mussels in Long Island Sound remain distinctly differentiated from other populations.
Long Island Sound receives water from several
major rivers (Housatonic, Quinnipiac, Connecticut,
Thames), which dilutes the salinity of the Sound to
about one half of the salinity of the open ocean. Thus,
the Sound is a distinct environment for mussels, which
must make physiological adjustment to retain osmotic
cell pressure. Variation at the gene coding for leucine
aminopeptidase (Lap) plays an important role in the
maintenance of cell pressure; some genotypes are most
efficient at high salinity, while other genotypes are
most efficient at low salinity. Each spring, millions of
larvae are carried into Long Island Sound by currents
sweeping west along the coast of Rhode Island and
Connecticut. But each fall, mortality in the young
mussels creates a sharp genetic cline in Lap frequencies near Guilford, Connecticut, where salinity
changes abruptly. Although the veliger are capable of
dispersing more than 100 km, the genetic cline is only
20 km wide.
In addition, studies of both ribbed mussels, Geukensia demissa, and acorn barnacles, Semibalanus
balanoides, have reported significant differentiation
between the samples taken from the upper and lower
portions of the intertidal zone, separated by distances of one or
two meters. These species, like the blue mussel, also
have pelagic larvae, and consequently gene flow
would homogenize the frequencies of neutral or unselected genes within the intertidal zone. Both cases of
differentiation were produced by selection differing
among habitats in a heterogeneous environment.
Although gene flow is not synonymous with dispersal, it is certainly true that long-distance dispersal
Figure 1
Mating System
Fst is a quantitative estimate of the degree of differentiation of populations. Consider a gene segregating
two alleles, A and a, at frequencies p and q, respectively. Fst, a standardized variance of allelic frequencies,
is defined as:
$$F_{st} = \frac{S_p^2}{\bar{p}\,\bar{q}}$$
where the numerator is the variance of p among populations, and the denominator is the product of the
means of the allelic frequencies. The variance of allelic
frequencies among populations is calculated as:
$$S_p^2 = \frac{1}{d} \sum_i (p_i - \bar{p})^2$$
where d is the number of populations, p_i are the frequencies of the A allele in the populations, and p̄ is the
mean of the frequencies. Fst is zero if all populations
have the same alleles at the same frequencies, and 1.0
for two populations fixed for different alleles. Fst will
increase over time between isolated populations, and
because genetic drift increases with decreasing population size, the rate of divergence increases with
decreasing population size.
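The definition of Fst as a standardized variance can be checked with a brief Python sketch: the variance of p among populations is divided by the product of the mean frequencies of the two alleles. The function name and the example frequencies are hypothetical.

```python
def fst(frequencies: list[float]) -> float:
    """Fst for a two-allele gene: variance of p among populations / (mean p * mean q)."""
    d = len(frequencies)
    p_bar = sum(frequencies) / d
    q_bar = 1.0 - p_bar
    if p_bar in (0.0, 1.0):
        raise ValueError("Fst is undefined when one allele is absent everywhere")
    variance = sum((p - p_bar) ** 2 for p in frequencies) / d
    return variance / (p_bar * q_bar)


print(fst([0.5, 0.5, 0.5]))  # identical populations
print(fst([1.0, 0.0]))       # two populations fixed for different alleles
print(fst([0.4, 0.6]))       # modest differentiation
```

The first two calls reproduce the limiting values stated in the text, 0 and 1.0.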
The degree of differentiation among populations
will come to an equilibrium that reflects a balance
between genetic drift and gene flow. The relationship
between differentiation and gene flow is:
$$F_{st} = \frac{1}{4Nm + 1}$$ or, equivalently,
$$Nm = \frac{1}{4}\left(\frac{1}{F_{st}} - 1\right)$$
where N is the number of breeding individuals in a
population. Thus, if populations are completely isolated for a long time (Nm = 0), Fst will rise to 1.0, but if just
one member of a population is a new immigrant (e.g.,
Nm = 1), then Fst = 1/(4 + 1) = 0.2 and the rate of gene flow is:
$$m = \frac{1}{N}$$
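The equilibrium relationship between differentiation and gene flow, Fst = 1/(4Nm + 1), can be explored with a short Python sketch. The function names are hypothetical, and the relationship assumes that the populations have reached the drift-migration equilibrium described above.

```python
def fst_at_equilibrium(nm: float) -> float:
    """Expected Fst for a given number of migrants per generation (Nm)."""
    return 1.0 / (4.0 * nm + 1.0)


def nm_from_fst(fst: float) -> float:
    """Invert the relationship: Nm = (1/Fst - 1) / 4."""
    return (1.0 / fst - 1.0) / 4.0


for nm in (0.0, 0.1, 1.0, 10.0):
    print(nm, round(fst_at_equilibrium(nm), 3))

print(nm_from_fst(0.2))
```

With no migrants Fst is 1.0, and a single migrant per generation already holds Fst at 0.2, which shows how little gene flow is needed to limit differentiation by drift.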
The organellar genomes of pines are ideal for measuring gene flow, as mitochondrial DNA (mtDNA) has
maternal inheritance and chloroplast DNA (cpDNA)
has paternal inheritance in pines. These different
modes of inheritance allow us to explicitly identify
gene flow mediated by pollen and by seeds. In addition, pollen and seeds have disparate potentials for
dispersal. The wind-borne pollen have the potential to
travel great distances, but in contrast, the seeds of
pines usually fall within a circle that has a radius
equal to the height of the tree.
Limber pine, Pinus flexilis (Figure 2), is native to
western North America, where it is primarily
restricted to windy ridges and scree slopes from the
Figure 2
Private alleles, or alleles found only in a single population, can also be used to infer rates of gene flow among
populations. The private alleles can be from markers
from mtDNA, cpDNA, nuclear DNA, or allozyme
markers, and they are usually taken from surveys of
geographical variation within a species. For example, a
survey of allozyme variation throughout the range
might reveal several or many private alleles. The average frequency of the private alleles, p, is plotted on a
regression line on a plot of ln(p) on the ordinate versus
ln(Nm) on the abscissa. The regression line was estimated from a computer simulation study examining
the relationship between genetic drift and gene flow in
the determination of the geographical distribution
of new mutations. Consider a species that has very
low gene flow among populations. When mutation
produces a novel allele in a single population, it
could drift to moderate or even high frequencies
before an individual bearing that allele migrated to
another population and reproduced. However, if
gene flow in the species was very high, then it is likely
that the new mutation would still be at a low frequency
when it was successfully introduced to another population. Thus, low gene flow allows private alleles to
drift to higher frequencies while high gene flow holds
private alleles to low frequencies.
Table 1 Estimates of the number of migrants moving among populations (Nm) from the average frequency of private alleles (p(1))

Common name                   Formal name               p(1)     Nm
Blue mussel                   Mytilus edulis            0.008    42.0
Fruit fly                     Drosophila willistoni     0.014    9.9
Milkfish                      Chanos chanos             0.030    4.2
Desert lizard                 Lacerta melisellensis     0.066    1.9
[Annual plant]                Stephanomeria exigua      0.054    1.4
Pacific treefrog              Hyla regilla              0.081    1.4
Valley pocket gopher          Thomomys bottae           0.087    0.86
Pacific slender salamander    Batrachoseps pacifica     0.117    0.64
Red back salamander           Plethodon cinereus        0.200    0.22
Oldfield mouse                Peromyscus polionotus     0.158    0.31
Camp's slender salamander     Batrachoseps campi        0.338    0.16
Zigzag salamander             Plethodon dorsalis        0.294    0.10

Note: values of Nm have been adjusted for the sample sizes, so there is not a perfect rank-order correlation between p(1)
and Nm.
(Adapted from Slatkin, 1985.)
underlying their methods. If there are egregious violations of the assumptions, estimates of gene flow may
be unreliable.
Further Reading
Reference
Gene Frequency
C F Aquadro
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0509
Gene frequency refers to the proportion of a population that carries one type of variant, or allele, at a locus.
More appropriately referred to as `allele frequency,'
gene frequency ranges from 0 (where the particular
variant is absent from the population) to 1 (where the
variant type is the only allele present). In the latter case,
the population is said to be `fixed' for this particular
allele. While often defined in terms of a locus or gene
and, in the early days of genetics, assessed by phenotype
of the corresponding genotype, the gene frequency is
now applied to the frequency of any alternative form
found segregating in a population, e.g., alternative
nucleotides at a single site in a sequence, whether it
be in coding, intron, or intergenic regions, as well as
insertion/deletion variants and even alternative gene
rearrangements such as inversion types.
Gene frequency is estimated by taking a random
sample of individuals from what might be considered
a population of the species of interest (e.g., from a
frequency of both allele types. Even harmful (deleterious) mutants will exist in an equilibrium frequency in
populations, due to the balance of the introduction of
the allele type into the population by mutation and its
elimination by natural selection. Alleles that are phenotypically recessive, often due to a loss of function,
can reach moderate mutation-selection balance frequencies, since they are `hidden' as heterozygotes
and only selected out as homozygotes. Knowledge of
the selective disadvantage of such alleles allows the
estimation of mutation rates for these types of alleles
and has been widely used to do so, particularly in
human genetics.
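The estimation mentioned above can be sketched in Python using the standard textbook approximation for a fully recessive deleterious allele, in which the equilibrium frequency is roughly sqrt(μ/s), with μ the mutation rate and s the selective disadvantage of homozygotes. The article does not state this formula; it, the function names, and the numerical values are assumptions introduced for illustration.

```python
import math


def equilibrium_frequency(mu: float, s: float) -> float:
    """Approximate mutation-selection balance for a recessive allele: sqrt(mu / s)."""
    return math.sqrt(mu / s)


def mutation_rate(q_hat: float, s: float) -> float:
    """Rearranged form used for estimation: mu = s * q_hat**2."""
    return s * q_hat ** 2


# Hypothetical recessive lethal (s = 1) observed at an allele frequency of 0.001.
print(mutation_rate(0.001, 1.0))
print(equilibrium_frequency(1e-6, 1.0))
```

Because affected homozygotes occur at a frequency of about the square of the allele frequency, an allele present in one chromosome per thousand is exposed to selection in only about one individual per million.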
See also: Balanced Polymorphism;
Gene Flow; Genetic Drift; Genetic Equilibrium;
Hardy-Weinberg Law
Gene Insertion
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0510
Gene Interaction
J Hodgkin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1530
Gene Library
I Schildkraut
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0512
Gene Mapping
J R S Fincham
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0513
[Figure 1 diagram not reproduced: panels (A) and (B) show the crosses described in the caption.]
Figure 1 The use of flanking markers to determine the order of mutant sites within a gene. (A) The use of flanking
markers in the study of recombination within a gene. The two mutant strains being crossed (with mutational sites 1
and 2, with corresponding wild-type sites shown as +) are also distinguished by allelic differences A/a and B/b at
closely placed flanking loci. Wild-type (+ +) recombinants from the A 1 + B × a + 2 b cross can arise in the following ways:
(1) Reciprocal crossing-over between the sites: all recombinants will be a B if the 1-2 order is as shown; (2) and
(3) conversion of either 1 or 2 to wild-type without crossing-over, conveying no information about the order of the
sites; (4) and (5) conversion of 1 or 2 to wild-type, with crossing-over in the interval adjacent to the conversion
event: all recombinants will be a B if the 1-2 order is as shown. If the conversion-associated crossover is on the
other side of the gene (shown as a dotted cross), the outcome will be A b recombinants; this is usually a less
common event. Overall, therefore, the order A-1-2-B will be indicated by a predominance of a B over A b
products. (B) The use of one flanking marker in a transduction cross in bacteria. Mutant sites 1 and 2 are present in
the donor and recipient, respectively; A is a flanking marker, not subject to selection, present in the donor, as opposed
to a in the recipient. If the sites are arranged 1-2-A, integration of a donor phage-borne fragment to give
a wild-type recombinant requires one exchange between 1 and 2 and another either (i) between 2 and A or (ii) to the right of A. If the
sites are the other way round, the exchange between 1 and 2 will always exclude A unless there are more than
two exchanges.
flanking marker combination (Figure 1). If the intragenic recombination is due to conversion at one site
without crossing-over, the flanking markers will
retain their parental combinations and give no information about order within the gene.
However, tetrad analysis in fungi, particularly
in the budding yeast S. cerevisiae, shows that even
though much, or sometimes nearly all, recombination
within genes is due to conversion and not to reciprocal
crossing-over, it is still, with a frequency which is often
about 40% or 45%, associated with crossing-over
between flanking markers. Most of the conversion-associated crossovers were found to be on the side of
the gene where the conversion event had occurred and
could have been immediately adjacent to the conversion
Deletion Mapping
Among any large collection of mutations within a
particular gene some are likely to be due to deletions
of gene sequence rather than to changes of single base
pairs (point mutations). Deletion mutations can in
general be distinguished from point mutations through
their inability to back-mutate to wild-type. More
definitively, they fail to give wild-type recombinants
in crosses to sets of point mutations that are able to
recombine with each other. Deletions provide the
most unambiguous method of mapping within genes.
The analysis proceeds in two steps.
First, the deletions are arranged in a linear order
defined by their overlaps. Nonoverlapping deletions
can give wild-type recombinants when crossed,
whereas overlapping deletions cannot. Then, when
the map of deletions has been established, the point
mutations can all be placed in one or other of the
segments defined by the deletion overlaps and nonoverlaps. The ability to recombine with a deletion
shows that the point mutation falls outside the deleted
segment; conversely, the failure to recombine with a
deletion shows that the point mutation falls within, or
at least very close to, the deleted segment. The principle is explained in Figure 2.
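The two-step logic of deletion mapping can be sketched in Python. The deletion names and their coordinates are hypothetical stand-ins for a deletion map already ordered by overlap tests. Each point mutation is then assigned to the segment covered by exactly those deletions with which it fails to give wild-type recombinants.

```python
def segments(deletions: dict[str, tuple[int, int]]) -> dict[frozenset, list[tuple[int, int]]]:
    """Split the gene into intervals keyed by the set of deletions covering each one."""
    cuts = sorted({edge for span in deletions.values() for edge in span})
    result: dict[frozenset, list[tuple[int, int]]] = {}
    for left, right in zip(cuts, cuts[1:]):
        covering = frozenset(
            name for name, (start, end) in deletions.items()
            if start <= left and right <= end
        )
        result.setdefault(covering, []).append((left, right))
    return result


def place_point_mutation(gives_wild_type: dict[str, bool],
                         deletions: dict[str, tuple[int, int]]) -> list[tuple[int, int]]:
    """Return the interval(s) consistent with the pattern of failed crosses."""
    failing = frozenset(name for name, ok in gives_wild_type.items() if not ok)
    return segments(deletions).get(failing, [])


# Hypothetical deletion map (arbitrary coordinates along the gene).
deletion_map = {"del1": (0, 40), "del2": (30, 70), "del3": (60, 100)}

# A point mutation that recombines with del3 but not with del1 or del2.
crosses = {"del1": False, "del2": False, "del3": True}
print(place_point_mutation(crosses, deletion_map))  # [(30, 40)]
```

The mutation is placed in the region where del1 and del2 overlap, the only segment missing from both. As the text notes, a real assignment can only be made to within, or very close to, the deleted segment.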
[Figure 2 diagram not reproduced; panels (A) and (B) carry site labels 1 to 8.]
References
Gene Number
J Hodgkin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0515
Gene Pool
K E Holsinger
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0516
Gene Product
J Parker
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0517
Gene Rearrangement in
Eukaryotic Organisms
K L Hill and B C Coughlin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0518
[Figure 1 diagram not reproduced. Labels recovered: V1, V2, V3; D1, D2, D3; J1, J2; final DNA arrangement: V2, D1, J2.]
Figure 1 Simplified diagram of antibody gene rearrangements. The boxes labeled `V,' `D,' `J,' and `C' represent the
four DNA segments of the gene for an antibody heavy-chain subunit. Rearrangement of these gene segments is
necessary to produce a functional gene, as shown in the final DNA arrangement.
chromosomes. Similar gene rearrangements are responsible for antigenic variation in other microbial
pathogens.
Further Reading
Weill JC and Reynaud CA (1996) Rearrangement/hypermutation/gene conversion: when, where and why? Immunology
Today 17: 92-97.
Gene Rearrangements,
Prokaryotic
R Haselkorn
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1451
Stochastic Rearrangements
The general idea of a stochastic rearrangement is to
provide a new promoter for a gene encoding a structural protein or an enzyme. A classical example is
the phenomenon of phase variation in Salmonella,
observed originally in the 1920s and studied further
in the 1950s. Certain strains of Salmonella can switch
from a form expressing one flagellar antigen (H1) to a
form expressing a different flagellar antigen (H2) and
then back. This switch, or phase variation, corresponds at the molecular level to the inversion of a DNA
segment by a site-specific recombination enzyme. The
inverted segment carries a promoter such that, in one
orientation, it drives the transcription of a gene encoding the H2 flagellar protein and a repressor of
transcription of the distant H1 antigen gene. In the
opposite orientation, the promoter points the `wrong'
way, preventing transcription of both the H2 gene and
the repressor of H1. Thus, transcription of H1 occurs
and the phase is switched. Flipping of the promoter
segment is accomplished by a site-specific recombinase operating on short inverted repeat sequences at the
ends of the segment, which also encodes the recombinase. This antigenic variation occurs in about one cell per
thousand per generation, allowing the population as a
whole to survive antibody directed against one or the
other flagellar antigen.
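The consequence of a switching rate of about one cell per thousand per generation can be sketched numerically. The following minimal Python example assumes, for illustration only, that switching is equally likely in both directions and that the two phases grow equally well; the function name and the symmetric rate are assumptions, not part of the account above.

```python
def fraction_in_h1(switch_rate: float, generations: int, start_h1: float = 1.0) -> float:
    """Fraction of cells expressing H1 after a number of generations.

    Assumes a symmetric switch: in each generation the same fraction
    (switch_rate) of H1 cells becomes H2 and of H2 cells becomes H1.
    """
    h1 = start_h1
    for _ in range(generations):
        # Cells that stay H1 plus cells that switch from H2 to H1.
        h1 = h1 * (1.0 - switch_rate) + (1.0 - h1) * switch_rate
    return h1


if __name__ == "__main__":
    for n in (1, 100, 1000, 10000):
        print(n, round(fraction_in_h1(0.001, n), 4))
```

Under these assumptions a culture started from a single phase drifts toward an even mixture of the two phases, so that some cells of either antigenic type are always present.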
A variation of this theme in Escherichia coli has the
promoter for the fimA gene, encoding the structural
protein for type 1 fimbriae, alone on the invertible
segment. Two recombinases (FimB and FimE) are
each encoded nearby. Inversion of the promoter-containing segment by FimB results in transcription
of the fimA gene, while the FimE recombinase flips
the promoter, shutting off transcription of fimA.
Other proteins, such as IHF, play a role in this inversion. Fimbriae are important in virulence, mediating the attachment of E. coli to epithelial and other
human cells.
Other rearrangements occur as a consequence of recombination between repeated sequences in the genome, catalyzed by the general recombination system,
or by the movement of elements that encode site-specific recombinases that catalyze their own transposition. Any pair of repeated sequences can be found
in two different relative orientations: direct or inverted.
Recombination between two identical sequences in inverted orientation results in inversion of the entire DNA
segment between the recombining repeated elements.
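The outcome of recombination between inverted repeats can be illustrated on a toy sequence. This is a sketch only: the sequences and function names are hypothetical, and the inversion is modeled simply as replacing the segment between the repeats with its reverse complement.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def invert_between_repeats(genome: str, repeat: str) -> str:
    """Invert the segment lying between a repeat and its inverted copy."""
    left = genome.find(repeat)
    if left < 0:
        raise ValueError("repeat not found")
    start = left + len(repeat)
    # The second copy is in inverted orientation, so it reads as the reverse complement.
    right = genome.find(reverse_complement(repeat), start)
    if right < 0:
        raise ValueError("inverted copy of repeat not found")
    return genome[:start] + reverse_complement(genome[start:right]) + genome[right:]


if __name__ == "__main__":
    toy = "AA" + "GATTC" + "TTGACA" + "GAATC" + "CC"
    flipped = invert_between_repeats(toy, "GATTC")
    print(flipped)
    print(invert_between_repeats(flipped, "GATTC") == toy)
```

Because the repeats themselves are unchanged by the event, a second recombination restores the original arrangement, which is why such inversions are reversible.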
One such event involves the genes encoding ribosomal
RNA in E. coli, two of which flank the origin of
chromosome replication (ORI), oriented away from
the ORI. Transcription seems to be more efficient,
for any gene, if it is oriented in the same direction
as DNA replication. Recombination between the inverted rRNA operons flanking ORI does not, of course,
change the direction of transcription relative to ORI,
since DNA replication is bidirectional, but nevertheless growth of cells having one arrangement is slightly
faster than growth of cells with the other arrangement.
General recombination between the rRNA operons
flips the ORI in a few cells per thousand in each generation. Under normal circumstances, the rearranged
chromosomes are lost because the cells containing
Figure 1 (See Plate 19) Filaments of the cyanobacterium Anabaena 77 h after transfer to nitrogen-free
medium. Nitrogen-fixing heterocysts have differentiated
at regular intervals along each filament. The image
shown is a composite of a fluorescence image showing
the location of green fluorescent protein expressed
from the promoter of the hetR gene and a DIC image
that outlines the cells. The HetR protein is required for
heterocyst differentiation. It is expressed early in the
differentiation of only those cells destined to develop.
and NifK). Most strains of Anabaena contain an 11-kb
DNA element interrupting the nifD gene in vegetative
cell DNA. During heterocyst differentiation, and
only during differentiation, the 11-kb element is
excised by a site-specific recombinase, acting on
directly repeated sequences at the ends of the element.
The resulting circular element does not replicate or
reinsert during the limited life of the heterocyst. These
nitrogen-fixing cells do not divide. Eventually they
die or are diluted out by growth of the vegetative
cells if a new supply of reduced nitrogen is found.
Under nitrogen-fixing conditions, the vegetative cells
can grow by virtue of amino acids supplied directly to
them by the heterocysts. Continued differentiation
of heterocysts, halfway between existing heterocysts
once each vegetative cell generation, maintains the
spacing pattern.
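The excision described above, in which recombination between directly repeated sequences removes the intervening element as a circle, can be sketched on a toy sequence. The sequences and function name below are hypothetical; the circular product is represented as a linear string that starts at its single copy of the repeat.

```python
from typing import Tuple


def excise_between_direct_repeats(chromosome: str, repeat: str) -> Tuple[str, str]:
    """Return (chromosome after excision, excised circle written linearly)."""
    first = chromosome.find(repeat)
    if first < 0:
        raise ValueError("repeat not found")
    second = chromosome.find(repeat, first + len(repeat))
    if second < 0:
        raise ValueError("second copy of repeat not found")
    # One copy of the repeat leaves with the circle; the other stays behind.
    circle = chromosome[first:second]
    remaining = chromosome[:first] + chromosome[second:]
    return remaining, circle


if __name__ == "__main__":
    interrupted_gene = "ATGGCA" + "CCGT" + "TTTTAAAA" + "CCGT" + "GGCTAA"
    restored, element = excise_between_direct_repeats(interrupted_gene, "CCGT")
    print(restored)
    print(element)
```

In this sketch the two halves of the interrupted gene are rejoined across a single remaining copy of the repeat, in contrast to inverted repeats, where recombination flips the segment rather than removing it.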
Excision of the 11-kb element in differentiating
heterocysts is catalyzed by a site-specific recombinase
encoded by the xisA gene, located within the 11-kb
element. The repeated sequences at the ends of the
11-kb element at which recombination occurs have
the same feature as those of the bacteriophage lambda
attachment site: a fully conserved core flanked on both
sides by regions of partial sequence identity. In these
respects, the 11-kb element looks like a remnant of
a bacterial virus chromosome, but lacking genes for
Further Reading
Gene Regulation
P Laybourn
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0520
These three RNA polymerases were originally identified by Robert Roeder while in William Rutter's
laboratory in the early 1970s by fractionation of
nuclear extracts from cells on a DEAE-Sephadex column and are found in all eukaryotes (yeast, plants,
insects, mammals). These enzymes were named RNA
polymerase I, II, and III (Pol I, Pol II, and Pol III,
respectively). Subsequently, the polymerases have
been shown to transcribe pre-rRNA genes (class I),
hnRNA genes (class II), and pre-tRNA and 5S RNA
genes (class III), respectively. The subunit compositions of the three eukaryotic polymerases are similar
(Figure 1). In addition, the two largest subunits of each
eukaryotic polymerase have structural and functional
conservation with Escherichia coli subunits β and β′.
None of the subunits seem to correspond to sigma
factor (in E. coli). The corresponding activities associated with sigma factor are found in TFIIF and TBP,
thus these functions are divided between proteins.
The RNA polymerase II largest subunit contains an
unusual domain on its C-terminus referred to as the
C-terminal domain (CTD). The CTD is not found in
the largest subunits of Pol I and Pol III or in the β′ subunit of E. coli RNA
polymerase. The CTD consists of 26–52 repeats of a
7-amino acid sequence: Tyr-Ser-Pro-Thr-Ser-Pro-Ser.
This domain is highly phosphorylated on the Ser, Thr,
and Tyr residues. The less phosphorylated form is
called IIa and highly phosphorylated form is called
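Figure 1 [Diagram not reproduced.] Comparison of the subunits of E. coli RNA polymerase with those of the eukaryotic RNA polymerases, showing the largest subunits (β′- and β-like; DNA-binding and rNTP-binding), the α-like assembly subunits, the common subunits, the specific subunits, and the CTD of Pol II.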
The three types of cell extracts that have been used are
whole cell, nuclear, and cytoplasmic. These extracts
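Figure 2 [Diagram not reproduced.] Core promoter showing the TATA box (TATAAAA) and the Inr.
Figure 3 [Diagram not reproduced.] TBP bound to DNA, with labels for the N-terminus, the stirrups, the kinks in the DNA, and a bend of about 80°.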
[Figure not reproduced: stepwise assembly of the RNA polymerase II initiation complex on a promoter containing TATA and Inr elements. First step: recognition and binding of the TATA box by TBP (or TFIID). Later steps show TFIIA, TFIIB, Pol II with TFIIF, TFIIE, and TFIIH joining the complex, followed by phosphorylation (ATP to ADP + Pi). Seventh step: initiation and elongation of RNA synthesis.]
It could be said that there are five steps in gene expression (protein coding genes): (1) activation of gene
chromatin structure; (2) initiation of transcription;
(3) RNA processing; (4) transport to the cytoplasm;
and (5) translation. Transcription initiation is an early
step, and is therefore an important control point.
Basic RNA polymerase II promoter structure involves cis-acting sequences that are bound by trans-acting protein factors (Figure 6). The cis-acting
sequences are often identified by deletion and `linker-scanning' mutagenesis.
The proximal, minimal, or core promoter region
consists of TATA and Inr/Start site (Figure 2).
Upstream promoter elements often act constitutively
or in an unregulated manner (Figure 6). These elements are often found on the promoters of `housekeeping' genes. Some examples of constitutive promoter
elements and the factors that bind them are the GC
box and the CCAAT box. GC boxes (GGGCGG) are
bound by Sp1, which is expressed in all cell types in
humans. There are often multiple GC boxes (G/C
islands) found in gene promoters. These may function
in concert with regulated factors to increase their
effect on transcription. In addition, GC boxes are
often seen upstream of TATA-less and Inr-less promoters. CCAAT boxes are bound by several factors
including CTF/NF1. CTF/NF1 is present in all
tissues, as well.
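As a rough illustration of how the promoter elements named above might be located in a sequence, the following Python sketch scans one strand for exact matches to the GC box (GGGCGG), the CCAAT box, and the TATA consensus TATAAAA. Exact matching on a single strand is a simplifying assumption; real elements are degenerate and can occur in either orientation, and the example sequence is invented.

```python
import re
from typing import List, Tuple

# Motifs taken from the text; exact-match scanning is an assumption of this sketch.
MOTIFS = {
    "GC box": "GGGCGG",
    "CCAAT box": "CCAAT",
    "TATA box": "TATAAAA",
}


def find_promoter_elements(sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based position, element name) for every exact motif match."""
    seq = sequence.upper()
    hits = []
    for name, motif in MOTIFS.items():
        # A lookahead lets overlapping occurrences be reported as well.
        for match in re.finditer("(?=" + motif + ")", seq):
            hits.append((match.start(), name))
    return sorted(hits)


if __name__ == "__main__":
    example = "ccGGGCGGttCCAATgggTATAAAAgg"
    for position, name in find_promoter_elements(example):
        print(position, name)
```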
Regulatory promoter elements and the factors that
bind them respond to environmental stimuli or are
cell-type specific. Inducible element or response elements and transcription factors include those induced
by stress (for example heat shock). The HSE (heat
shock element) is bound by the heat shock transcription factor (HSTF). These elements can also be hormone inducible, for example, the hormone response
Figure 6 [Diagram not reproduced.] Layout of an RNA polymerase II promoter: enhancers and regulatory elements (for example, GC box and GRE) located more than 1000 bp from the start site; proximal elements (HSE, GRE, GC box, CCAAT box), in which constitutive and regulatory promoter elements are often intermixed; and the minimal or core promoter elements (TATA and Inr) at the +1 start site.
[Figures not reproduced: domain organization of transcription factors (a sequence-specific DNA-binding domain and a protein-protein interaction domain that contacts the GTFs); zinc-finger DNA-binding domains, including class I (cys2/his2), which bind DNA in the major groove; the homeodomain or helix-turn-helix motif; activation of the glucocorticoid receptor (GR), which is released from its inhibitor in the cytoplasm after steroid binding, enters the nucleus, binds the GRE, and activates transcription; and the basic helix-loop-helix (bHLH) and basic leucine zipper (bZIP) dimers, with their basic DNA-binding regions and leucine zipper coiled coil.]
[Figure not reproduced: release of the inhibitor from NF-κB in the cytoplasm, followed by movement of NF-κB across the nuclear membrane into the nucleus, where it activates transcription.]
TAFII40 is required for activation by GAL4-VP16 (an acidic activator), and TAFII110 is
required for activation by Sp1 (a glutamine-rich activator). TBP
alone is not sufficient for activated transcription by
these transcription factors.
Highly purified TFIID (which includes all the TAFIIs) is
also not sufficient for activated transcription by certain
transcription factors. These transcription factors
require adapter proteins or coactivators. These adapter proteins may also be titrated out in `squelching.'
Another type of coactivator protein is the architectural transcription factor. These proteins mediate
protein-protein interactions and bend the DNA to
promote interactions between transcription factors
bound to an enhancer.
Promoter-Proximal Attenuation
[Figures not reproduced: Pol II held near the promoter after synthesizing about 25 nt of RNA; Figure 13, a positioned nucleosome array on an inactive promoter containing GREs and an NF-1 site, which glucocorticoid-bound GR and the SWI/SNF complex convert into an array primed for activation and then into an activated promoter complex with NF1, Oct1, and TFIID; and a comparison of gene activation on free DNA and on chromatin (10-fold versus 10^2- to 10^3-fold), with levels labeled repressed, basal, antirepression, and true activation.]
Most biochemical studies on the mechanism of transcription and its regulatory factors have used naked
DNA templates for purposes of simplicity. The level
of activation observed with these templates typically
Further Reading
Wolffe AP (1998) Chromatin: Structure and Function, 3rd edn. San
Diego, CA: Academic Press.
Gene Replacement
See: Gene Targeting
Gene Sequencing
See: DNA Sequencing
Gene Silencing
W Filipowicz and J Paszkowski
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1685
Gene Splicing
See: Recombinant DNA
Gene Substitution
T Ohta
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0524
The process of gene frequency change in a population has been studied by population geneticists. Two
approaches are deterministic and stochastic. When the
population size is large and random drift is negligible,
the deterministic model is applicable, i.e., the change
of gene frequency by natural selection can be predicted by simple formulas. However, when the population size is not large, the chance effect becomes
significant, and stochastic treatments are needed.
Behavior of molecular mutants is often influenced
by random genetic drift even in a large population
because of their minute effect. In other words, many
molecular mutants are selectively neutral or nearly
neutral, and their behavior depends on random drift.
The dynamics of a completely neutral mutant has been
theoretically described, i.e., the average course of the
substitution process is known. For nearly neutral
mutants, interaction of selection and random drift is
important.
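The stochastic behavior of a neutral mutant can be illustrated with a small simulation. The sketch below uses a Wright-Fisher sampling scheme, which is an assumed model chosen for illustration rather than one specified above: each generation, every gene copy is drawn at random from the previous generation. For a neutral mutant the fraction of runs ending in fixation should be close to its starting frequency.

```python
import random


def fixation_fraction(n_copies: int, start_copies: int, replicates: int, seed: int = 0) -> float:
    """Fraction of replicate populations in which a neutral mutant becomes fixed."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    fixed = 0
    for _ in range(replicates):
        count = start_copies
        # Continue until the mutant is lost (0 copies) or fixed (all copies).
        while 0 < count < n_copies:
            frequency = count / n_copies
            count = sum(1 for _ in range(n_copies) if rng.random() < frequency)
        if count == n_copies:
            fixed += 1
    return fixed / replicates


if __name__ == "__main__":
    # 2 mutant copies among 20 gene copies: starting frequency 0.1.
    print(fixation_fraction(n_copies=20, start_copies=2, replicates=2000))
```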
The number of gene substitutions at a locus is
estimated by comparing gene sequences at this locus
between species. From such comparative studies of
gene sequences, the rates of gene substitutions at various protein loci and noncoding regions have been
obtained. The rate is defined as the number of substitutions per unit time. As an example, the rate of
substitution of the α hemoglobin gene is about
0.5 per site for amino acid replacement sites, and
about 4 per site for synonymous sites (rates are given
per 10^9 years). In general, unimportant sites are evolving rapidly and important sites are evolving slowly.
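The relation between the number of substitutions separating two species and the rate per unit time can be written as a short calculation. The sketch below assumes that substitutions accumulate independently on both lineages since their divergence, so the total branch length is twice the divergence time; it ignores any correction for multiple substitutions at the same site, and the 80-million-year divergence time is a hypothetical value used only for the example.

```python
def substitution_rate(substitutions_per_site: float, divergence_time_years: float) -> float:
    """Rate per site per year from the divergence between two species."""
    # Two lineages have each evolved for divergence_time_years.
    return substitutions_per_site / (2.0 * divergence_time_years)


def expected_divergence(rate_per_site_per_year: float, divergence_time_years: float) -> float:
    """Expected substitutions per site between two species."""
    return 2.0 * rate_per_site_per_year * divergence_time_years


if __name__ == "__main__":
    replacement_rate = 0.5e-9  # per site per year (0.5 per site per 10^9 years)
    synonymous_rate = 4.0e-9   # per site per year (4 per site per 10^9 years)
    t = 80e6                   # hypothetical divergence time in years
    print(expected_divergence(replacement_rate, t))
    print(expected_divergence(synonymous_rate, t))
```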
See also: Population Genetics
Gene Targeting
See: Genetic Recombination
• The ex vivo approach, in which the genetic correction would be accomplished by removing target
cells from a patient, introducing a therapeutic viral
vector into them in vitro, and then returning the
genetically corrected cells to the patient.
• The in vivo approach, in which the gene transfer
vector is introduced directly into the target defective cells in the patient.
While very attractive in principle, these concepts could
not be put into practice at the time because no efficient
viral gene transfer vectors existed and the recombinant
DNA techniques needed to produce them had not
yet been developed. Fortunately, over the next few
years, methods of recombinant DNA manipulation
were developed and refined and allowed, in the early
1980s, the design and production of the first truly
efficient viral vectors for gene transfer into mammalian cells. These vectors were derived from mouse
viruses that used RNA as their genetic material and
that were associated with several kinds of cancer in
laboratory mice. These original vectors were derived
from viruses that were called retroviruses because they
have the property of converting their RNA into DNA
after infection. They are able to integrate the DNA
copies of their genomes into the genome of the host
cell, thereby allowing them to express some of the
viral genes in a stable and heritable way in the cell for
the lifetime of the cell. Recombinant DNA methods
allowed investigators to remove the potentially deleterious genes from the viruses and replace them with
other genes that could also be expressed permanently in the infected cells. These methods provided a
proof of principle that such retrovirus vectors could
carry out the functions required of a gene therapy vector for human disease. Very quickly thereafter, our
laboratory showed for the first time that a retrovirus
vector carrying a normal copy of a human disease-related gene could correct the abnormal properties of
cells derived from patients. We transferred a cDNA
corresponding to the normal allele of the hypoxanthine guanine phosphoribosyl transferase (HPRT)
via a retrovirus vector into cultured cells from patients
with the rare but devastating Lesch-Nyhan disease,
and found that we could identify modified cells that
not only demonstrated restored expression of the normal gene but also correction of some of the secondary
metabolic defects resulting from their HPRT deficiency (Willis et al., 1984). It was the development
of the retrovirus vectors that represented the single
most important early technical advance that opened
the door to the subsequent explosion of gene transfer
with many additional disease-related genes.
One of those other disease-related genes that was
applied early to gene transfer studies with retrovirus
Table 1 [Table body not recoverable.] Advantages and disadvantages of gene transfer vectors. Vectors listed: retrovirus; lentivirus (HIV, FIV, etc.); adenovirus; herpes simplex; liposomes; naked DNA.
References
Friedmann T (1996) Gene therapy: an immature genie but
certainly out of the bottle. Nature Medicine 2: 144–147.
Friedmann T (1997) Overcoming the obstacles to gene therapy.
Scientific American 276: 95–101.
Friedmann T and Roblin R (1972) Gene therapy for human
genetic disease? Science 175: 949–955.
Willis RC, Jolly DJ, Miller AD et al. (1984) Partial phenotypic
correction of human Lesch-Nyhan (HPRT-deficient) lymphoblasts with a transmissible retroviral vector. Journal of Biological Chemistry 259: 7842–7849.
Gene Transfer
Gene Trees
N Saitou
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0554
Gene Trapping
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0553
Figure 1 Two possible relationships of four homologous genes sampled from two species (genes 1 and 2 from species A, genes 3 and 4 from species B). (A) When
gene duplication preceded separation. (B) When
two independent gene duplications occurred after
speciation. Nodes are marked as gene duplication or speciation events.
[Figure not reproduced: gene tree of two gene copies (1 and 2) from human, chimpanzee, gorilla, and gibbon, together with orangutan and crab-eating macaque sequences.]
Genetic Code
S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0528
Table 1
UUU phenylalanine
UUC phenylalanine
UUA leucine
UUG leucine
CUU leucine
CUC leucine
CUA leucine
CUG leucine
AUU isoleucine
AUC isoleucine
AUA isoleucine
AUG methionine
GUU valine
GUC valine
GUA valine
GUG valine
UCU serine
UCC serine
UCA serine
UCG serine
CCU proline
CCC proline
CCA proline
CCG proline
ACU threonine
ACC threonine
ACA threonine
ACG threonine
GCU alanine
GCC alanine
GCA alanine
GCG alanine
UAU tyrosine
UAC tyrosine
UAA stop (ochre)
UAG stop (amber)
CAU histidine
CAC histidine
CAA glutamine
CAG glutamine
AAU asparagine
AAC asparagine
AAA lysine
AAG lysine
GAU aspartic acid
GAC aspartic acid
GAA glutamic acid
GAG glutamic acid
UGU cysteine
UGC cysteine
UGA stop
UGG tryptophan
CGU arginine
CGC arginine
CGA arginine
CGG arginine
AGU serine
AGC serine
AGA arginine
AGG arginine
GGU glycine
GGC glycine
GGA glycine
GGG glycine
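The table above can be used directly as a lookup from codon to amino acid. The following Python sketch builds the same 64 assignments in compact form (one-letter amino acid codes, `*` for stop) and translates an mRNA read in frame from its first base; the function name and the example sequence are illustrative only.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes in the order UUU, UUC, UUA, UUG, UCU, ... GGG; '*' marks stop.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
STANDARD_CODE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA in frame from its first base, stopping at a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = STANDARD_CODE[mrna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("AUGUUUGGCUAA"))  # Met-Phe-Gly, then the ochre stop codon
```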
Table 2
Organism^1               Genes                    Codon^2     Universal meaning   Actual meaning
Prokaryotes
Various                  Selenoproteins^3         UGA         Stop                SeCys
Mycoplasma sp.           All genes                UGA         Stop                Trp
Organellar genomes
Mammals                  All mitochondrial        UGA         Stop                Trp
                                                  AGA, AGG    Arg                 Stop
                                                  AUA         Ile                 Met
Drosophila               All mitochondrial        UGA         Stop                Trp
                                                  AGA         Arg                 Ser
                                                  AUA         Ile                 Met
Saccharomyces cerevisiae All mitochondrial        UGA         Stop                Trp
                                                  CUN         Leu                 Thr
                                                  AUA         Ile                 Met
Fungi                    All mitochondrial (?)    UGA         Stop                Trp
Maize^4                  All mitochondrial (?)    CGG         Arg                 Trp
Eukaryotic nuclear genomes
Protozoa                 All nuclear              (not recovered)  (not recovered)  Gln
Candida cylindracea      All nuclear              CUG         Leu                 Ser
Various mammals          Selenoproteins           UGA         Stop                SeCys
1 Where a single species is given it is possible that related organisms also display the same code modifications.
2 N = any nucleotide.
3 The following are known to be selenoproteins: formate dehydrogenase (Escherichia coli, Enterobacter aerogenes, Clostridium
thermoaceticum, C. thermoautotrophicum, Methanococcus vannielii), NiFeSe hydrogenase (Desulphomicrobium baculatum, M. voltae),
glycine reductase (C. sticklandii, C. purinolyticum), cellular glutathione peroxidase (human, cow, rat, mouse), plasma glutathione
peroxidase (human), phospholipid hydroperoxide glutathione peroxidase (pig, rat), selenoprotein P (human, cow, rat),
selenoprotein W (rat), type 1 deiodinase (human, rat, mouse, dog), type 2 deiodinase (Rana catesbiana), type 3 deiodinase
(human, rat, R. catesbiana). See http://www.tigr.org/tdb/at/at.html
4 In maize and other plants the CGG codon is probably converted into UGG (the correct codon for tryptophan) by RNA editing.
(Reproduced with permission from Molecular Biology Labfax 1: Recombinant DNA. London: Academic Press.)
For some time it was thought that the code was universal, that is, identical for all living organisms from
viruses to humans. However, in certain protozoa and in
the mitochondrial organelles of higher organisms there
are differences. For example, a codon that normally signifies chain termination can encode an amino acid, or
a codon that codes for a particular amino acid in one
organism can code for a different amino acid in another.
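The effect of such differences can be seen by translating one message under two codes. The sketch below builds the universal code and then applies, as overrides, the mammalian mitochondrial reassignments (UGA read as Trp, AGA and AGG read as stop, AUA read as Met). The function and variable names are illustrative, and the example message is invented.

```python
from itertools import product
from typing import Dict

BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
UNIVERSAL_CODE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Mammalian mitochondrial code expressed as changes to the universal code.
MAMMALIAN_MITO_CODE = dict(UNIVERSAL_CODE)
MAMMALIAN_MITO_CODE.update({"UGA": "W", "AGA": "*", "AGG": "*", "AUA": "M"})


def translate(mrna: str, code: Dict[str, str]) -> str:
    """Translate in frame from the first base until a stop codon under the given code."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = code[mrna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    message = "AUGAUAUGAAGA"
    print(translate(message, UNIVERSAL_CODE))       # stops at UGA
    print(translate(message, MAMMALIAN_MITO_CODE))  # reads through UGA, stops at AGA
```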
See also: Adaptor Hypothesis; Codon Usage Bias;
Codons; Universal Genetic Code; Variable Codons
Genetic Colonization
J B Mitton
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0529
pump sea water into their tanks to adjust their buoyancy. The next time they take on cargo, they pump
water out of the tanks, often introducing marine pelagic larvae into new environments. There are more than
60 marine species that have successfully colonized San
Francisco Bay in this way.
Founder Effect
The waxing and waning of glaciers modifies the distributions of species in both temperate terrestrial and
marine environments. As the glaciers grow, they displace species from high elevations and high latitudes
into glacial refugia at lower elevations and latitudes.
The accumulation of glacial ice lowers sea levels,
exposing land bridges that connect continents and
islands. For example, during the last glaciation, Native
Americans colonized North America by crossing the
land bridge between Siberia and Alaska.
At the height of the most recent glaciation, 18 000
years ago, Scandinavia and parts of the British Isles
were covered with ice, and tundra and permafrost
covered central Europe. So it comes as no surprise
that the modern ranges of European plants and animals were colonized from glacial refugia further
south. Plants and animals occupied at least four glacial
refugia, in the Iberian Peninsula, and areas in Italy,
Greece, and Turkey. The plants and animals migrated
to the north and into the mountains as the glaciers
receded.
Genetic Correlation
W G Hill
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1423
The phenotypic correlation (rP) is a measure of association between the observed performance (phenotypic value) of individuals for a pair of quantitative
traits, for example, stature and body weight of man.
The genetic correlation is the corresponding measure
of association between the genotypes of individuals,
formally their genotypic or breeding values. It is important in describing how traits are associated at the
genetic level and in predicting the effect of selection on
one trait on changes in other traits.
For a single trait, the phenotypic variance can be
partitioned into genetic and environmental components, and the genetic variance into further components.
In the same way the phenotypic covariance (but not
correlation) $\mathrm{cov}_P$ can be expressed as a sum of covariance components, $\mathrm{cov}_P = \mathrm{cov}_A + \mathrm{cov}_D + \mathrm{cov}_I + \mathrm{cov}_E$,
of which only the phenotypic, (additive) genetic
($\mathrm{cov}_A$), and environmental ($\mathrm{cov}_E$) are much used, the
dominance ($\mathrm{cov}_D$) and epistatic covariances ($\mathrm{cov}_I$)
typically being subsumed into $\mathrm{cov}_E$. Unless otherwise
qualified, the genetic correlation is usually defined as
the correlation, $r_A$, of breeding values (or sums of
average effects), $A_X$ and $A_Y$ for traits X and Y, rather
than as the correlation of genotypic values, because $r_A$
can be estimated from the correlation between relatives and is useful in predicting
selection response.
Thus $r_A = \mathrm{cov}(A_X, A_Y)/\sqrt{V_{A_X} V_{A_Y}}$. The genetic correlation is visualized most simply for individuals with
large numbers of progeny, such as dairy sires used in
artificial insemination, where it becomes approximately equal to the correlation of progeny group
means for the two traits.
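The definition of the genetic correlation, and its approximation by the correlation of progeny group means, can be written as a short calculation. The sketch below is illustrative only: the function names are hypothetical, the numerical values are invented, and the correlation of progeny means is computed as an ordinary product-moment correlation.

```python
import math
from typing import Sequence


def genetic_correlation(cov_a: float, var_ax: float, var_ay: float) -> float:
    """r_A = cov(A_X, A_Y) / sqrt(V_AX * V_AY)."""
    if var_ax <= 0 or var_ay <= 0:
        raise ValueError("additive genetic variances must be positive")
    return cov_a / math.sqrt(var_ax * var_ay)


def correlation_of_means(x: Sequence[float], y: Sequence[float]) -> float:
    """Product-moment correlation between progeny group means for two traits."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    print(genetic_correlation(cov_a=6.0, var_ax=9.0, var_ay=16.0))
    # Invented progeny-group means for four sires, trait X and trait Y.
    print(correlation_of_means([10.0, 12.0, 11.0, 14.0], [3.0, 3.5, 3.1, 4.0]))
```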
The correlation may be caused by the pleiotropic
effects of individual genes on the two traits or by
linkage disequilibrium between genes each affecting
only one of the traits. Although pleiotropy is likely
to lead to essentially stable correlations, for example,
genes influencing appetite may affect size and obesity,
correlations due to disequilibrium are likely to be transient, perhaps following the crossing or introgression
between populations. For example, the cross between
a line with large body size and high prolificacy and a
line with small size and low prolificacy will induce a
positive genetic correlation between the traits, which
may be sustained by disequilibrium.
As pointed out by Falconer in 1952, the genetic
correlation can also be defined where the two traits
specify performance in two different environments, or
indeed in two sexes. As an individual is reared in only
one environment, an equivalent phenotypic correlation can not be defined, however. A high genetic correlation between environments then specifies a lack of
genotype × environment interaction, and shows, for
example, that selection in one environment will lead to
genetic change in another.
The genetic correlation can be estimated from
resemblance among relatives using the same designs
and similar methods as used to estimate heritability,
including offspring-parent and sib correlations, and
maximum-likelihood methods using all relationships
in the data. The covariances (directly, or scaled as
correlations or regressions) are now computed between the performance of individuals for one trait
with that of their relatives for another. For example,
if $c(X,Y)$ is the sample covariance between trait X on
the parent and trait Y on the offspring, $2c(X,Y)$ is an
estimate of the (additive) genetic covariance and
$\sqrt{\{c(X,Y)\,c(Y,X)\}/\{c(X,X)\,c(Y,Y)\}}$ an estimate of
the genetic correlation. Unless the data set is large, the
Further Reading
Genetic Counseling
R Harris
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0530
Ethical Issues
The aim is to be nondirective and to supply clients
with the facts, understanding, and confidence to make
reproductive or other decisions that are best for them.
Trained medical geneticists and counselors, especially
when dealing with reproductive decisions, will always
attempt to adhere to the gold standard of nondirectiveness. This means that counselors provide the five
components of genetic counseling (see Table 1) but
rarely advise a `correct' or `incorrect' course of action
for the client to follow. This is not always true of
other specialists, because they are accustomed to advising patients to accept various forms of treatment
for physical illness. There are shades of opinion
amongst health professionals (and their patients)
about giving advice rather than only information,
but the rule is to reject coercion aimed at the subjection of an individual patient's wishes to the public
good (Ethics and Genetics).
Table 1 [Table body not recoverable.] Components of genetic counseling (five numbered components).
References
Genetic Covariance
E Pollak
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1424
where $u_{XZ} = f_{PR} f_{QS} + f_{PS} f_{QR}$. The variance components $\sigma^2_A$, $\sigma^2_D$, and $\sigma^2_{A^r D^s}$ are, respectively, the additive
genetic variance, the dominance variance, and the variance associated with all interactions of single alleles at
r loci and genotypes at s other loci. This general
expression for the genetic covariance was independently derived by Cockerham and Kempthorne.
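The coefficients in this expression can be computed for familiar relationships. The sketch below assumes the standard setting in which X has parents P and Q, Z has parents R and S, the $f$ values are coancestry coefficients between the parents, the additive variance enters with coefficient $2\theta_{XZ}$ where $\theta_{XZ} = (f_{PR} + f_{PS} + f_{QR} + f_{QS})/4$, and the dominance variance enters with coefficient $u_{XZ}$. Epistatic terms are omitted, and the function names and variance values are hypothetical.

```python
from typing import Tuple


def covariance_coefficients(f_pr: float, f_ps: float, f_qr: float, f_qs: float) -> Tuple[float, float]:
    """Return (2 * theta_XZ, u_XZ) from the coancestries between the parents of X and Z."""
    theta = (f_pr + f_ps + f_qr + f_qs) / 4.0
    u = f_pr * f_qs + f_ps * f_qr
    return 2.0 * theta, u


def genetic_covariance(f_pr: float, f_ps: float, f_qr: float, f_qs: float,
                       var_a: float, var_d: float) -> float:
    """Genetic covariance of relatives, ignoring all epistatic components."""
    two_theta, u = covariance_coefficients(f_pr, f_ps, f_qr, f_qs)
    return two_theta * var_a + u * var_d


if __name__ == "__main__":
    # Full sibs: P = R and Q = S; unrelated, noninbred parents have coancestry 1/2 with themselves.
    print(covariance_coefficients(0.5, 0.0, 0.0, 0.5))
    # Half sibs: one shared parent (P = R), the other parents unrelated.
    print(covariance_coefficients(0.5, 0.0, 0.0, 0.0))
    print(genetic_covariance(0.5, 0.0, 0.0, 0.5, var_a=8.0, var_d=4.0))
```

With these assumptions the sketch reproduces the familiar coefficients of 1/2 and 1/4 on the additive and dominance variances for full sibs, and 1/4 and 0 for half sibs.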
If there is no epistasis but loci are not in
gametic phase equilibrium, a general expression for
$\mathrm{cov}(G_X, G_Z)$ is obtainable, but it has a very complicated form, as shown by Weir, Cockerham, and
Reynolds. Genetic covariances can also be calculated if
there is inbreeding, but the resulting expressions contain covariances that are not present when there is
random mating. A thorough analysis of this problem
if two loci are involved was presented by Weir and
Cockerham.
Other models in the literature allow for sex-linked
loci, maternal effects, effects of cytoplasmic genes that
are maternally inherited, polyploidy, and covariances
between relatives when one relative of a pair is measured in trait Y1 and the other in trait Y2. There is also
some theory that applies when there is assortative
mating. Discussions of these topics and references to
the original papers mentioned above can be found in
the books listed below.
Further Reading
Cockerham CC (1954) An extension of the concept of partitioning heredity variance for analysis of covariances among
relatives when epistasis is present. Genetics 114: 859–882.
Falconer DS and Mackay TFC (1996) Introduction to Quantitative
Genetics, 4th edn. Harlow, UK: Longman.
Kempthorne O (1954) The correlation between relatives in a
random mating population. Proceedings of the Royal Society of
London B 143: 103–113.
Kempthorne O (1957) An Introduction to Genetic Statistics. New
York: John Wiley.
Lynch M and Walsh B (1998) Genetics and Analysis of Quantitative
Traits. Sunderland, MA: Sinauer Associates.
Weir BS, Cockerham CC and Reynolds J (1980) The effects of
linkage and linkage disequilibrium on the covariances of
noninbred relatives. Heredity 45: 351–359.
Genetic Diseases
K M Beckingham
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0531
For a genetic disease caused by a recessive mutation, both parents must be heterozygous for the
mutation and there will be a one in four chance of
producing a homozygous affected child. However,
given that heterozygous individuals (carriers) are
asymptomatic, in the case of relatively rare genetic
diseases, carriers can be completely unaware of their
status and thus unprepared for the birth of a genetically compromised child. Some recessive genetic diseases, such as Tay-Sachs disease, show dramatically
increased prevalence in certain ethnic groups. Considerable effort has been devoted to identifying and
counseling potential carriers within these groups.
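As a rough illustration of the one-in-four figure for two carrier parents, the following Python sketch enumerates the equally likely allele combinations at a single locus. The allele labels "A" (normal) and "a" (recessive mutation) and the function name offspring_genotypes are illustrative assumptions, not notation used in this entry.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1, parent2):
    """Return genotype probabilities for a single-locus cross.

    Each parent is a two-character string of alleles, e.g. "Aa".
    Every parent passes on one of its two alleles with equal probability.
    """
    # Sort the allele pair so that "aA" and "Aa" count as the same genotype.
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Two heterozygous (carrier) parents.
carrier_cross = offspring_genotypes("Aa", "Aa")
for genotype, prob in sorted(carrier_cross.items()):
    print(genotype, prob)
```

Under these assumptions the homozygous "aa" class has probability 1/4, and a cross between a carrier and a noncarrier ("Aa" x "AA") produces no "aa" offspring.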
For X-linked recessive diseases such as hemophilia,
where half the sons of a heterozygous mother are
affected, and for dominant mutation diseases such
as achondroplasia (see Achondroplasia), individuals
carrying a single mutant allele will, in general, be
aware of their problem and thus able to make
informed choices about parenthood. Unfortunately
some diseases caused by dominant mutations, such as
Huntington disease, do not usually manifest until after
the reproductive years.
Two complementary approaches to the eradication
of genetic disorders are ongoing, both of which rely
on preliminary identification of the affected gene. One
approach is preventative and involves genetic testing
of asymptomatic potential carriers or potentially
affected individuals so that those carrying a mutant
allele may know their status and, if necessary, avoid
passing on the mutation in question. In combination
with genetic testing of embryos produced by in vitro
fertilization and selection of embryos carrying nonmutant genes as potential progeny, this approach can
permit carriers to produce their own children while
simultaneously eradicating the disease mutation from
their family lineage. Some in the medical community
advocate neonatal genetic testing of all individuals for
all possible genetic diseases.
The second approach is a therapeutic one for
affected individuals and involves supplying a correctly
functioning version of the mutated gene to the malfunctioning tissue(s). The first experiments using this
approach, termed gene therapy, were initiated in 1990.
Introducing a gene into the cells of certain body
(somatic) tissues within an individual still leaves the
germline sperm or egg progenitor cells mutant and
thus does not eliminate the potential for disease in
any offspring. However, current methods for permanent gene integration into germline cells carry a
potential for causing further genetic damage and thus
are ethically unacceptable. The option of genetically
screening embryos and selecting unaffected embryos
for implantation (as discussed above) is also possible
for genetic disease patients. Currently, harmless,
Genetic Distance
M Nei
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0532
Rogers' Distance
F*ST Distance
The allele frequencies of different populations may differentiate by genetic drift alone without any selection.
When a population splits into many populations of effective size N in a generation, the extent of differentiation of allele frequencies in subsequent generations
can be measured by Wright's $F_{ST}$ (Nei and Kumar, 2000). When there are only two populations but allele
frequency data are available from many different loci,
it is possible to develop a statistic whose expectation is
equal to $F_{ST}$. One such statistic is given by
\[ F^{*}_{ST} = \frac{(\hat{J}_X + \hat{J}_Y)/2 - \hat{J}_{XY}}{1 - \hat{J}_{XY}} \]
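where $\hat{J}_X$, $\hat{J}_Y$, and $\hat{J}_{XY}$ are unbiased estimators of the
means ($J_X$, $J_Y$, and $J_{XY}$) of $\sum x_i^2$, $\sum y_i^2$, and $\sum x_i y_i$
over all loci, respectively. For a single locus, unbiased
estimates of $\sum x_i^2$, $\sum y_i^2$, and $\sum x_i y_i$ are given by
\[ \hat{j}_X = \left(2m_X \sum \hat{x}_i^2 - 1\right) / (2m_X - 1) \tag{7} \]
\[ \hat{j}_Y = \left(2m_Y \sum \hat{y}_i^2 - 1\right) / (2m_Y - 1) \]
\[ \hat{j}_{XY} = \sum \hat{x}_i \hat{y}_i \]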
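where $m_X$ and $m_Y$ are the numbers of diploid individuals sampled from populations X and Y, respectively,
and $\hat{x}_i$ and $\hat{y}_i$ are the sample frequencies of allele $A_i$ in
populations X and Y. Therefore, $\hat{J}_X$, $\hat{J}_Y$, and $\hat{J}_{XY}$ are the
means of $\hat{j}_X$, $\hat{j}_Y$, and $\hat{j}_{XY}$ over all loci, respectively. The
expectation of $F^{*}_{ST}$ is given by
\[ E(F^{*}_{ST}) = 1 - e^{-t/(2N)} \tag{10} \]
so that the quantity
\[ -\ln(1 - F^{*}_{ST}) \tag{11} \]
is expected to be approximately $t/(2N)$, that is, roughly proportional to the number of generations $t$ since the populations diverged.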
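The estimation steps described above can be sketched in Python. This is a minimal illustration under stated assumptions: the single-locus estimates are taken to be (2m * sum(x_i^2) - 1)/(2m - 1) within a population and sum(x_i * y_i) between populations, and the statistic is their locus averages combined as [(J_X + J_Y)/2 - J_XY]/(1 - J_XY). The function names and the example allele frequencies are hypothetical.

```python
import math


def j_within(freqs, m):
    """Unbiased single-locus estimate of sum(x_i^2) from the sample
    allele frequencies of m diploid individuals."""
    s = sum(f * f for f in freqs)
    return (2 * m * s - 1) / (2 * m - 1)


def j_between(freqs_x, freqs_y):
    """Single-locus estimate of sum(x_i * y_i) between two populations."""
    return sum(x * y for x, y in zip(freqs_x, freqs_y))


def fst_star(loci_x, loci_y, m_x, m_y):
    """Two-population statistic built from locus averages of the estimates.

    loci_x and loci_y are lists (one entry per locus) of allele-frequency
    lists, with alleles given in the same order in both populations.
    """
    n_loci = len(loci_x)
    jx = sum(j_within(f, m_x) for f in loci_x) / n_loci
    jy = sum(j_within(f, m_y) for f in loci_y) / n_loci
    jxy = sum(j_between(fx, fy) for fx, fy in zip(loci_x, loci_y)) / n_loci
    return ((jx + jy) / 2 - jxy) / (1 - jxy)


# Hypothetical sample frequencies at two loci, 50 diploids per population.
pop_x = [[0.7, 0.3], [0.2, 0.5, 0.3]]
pop_y = [[0.4, 0.6], [0.3, 0.3, 0.4]]
f = fst_star(pop_x, pop_y, 50, 50)
print(f, -math.log(1 - f))
```

Because the within-population estimates are bias corrected, two samples with identical frequencies can give a slightly negative value, which is expected behavior for an unbiased estimator of a quantity near zero.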
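\[ D = -\ln I \tag{12} \]
\[ I = \hat{J}_{XY} / \sqrt{\hat{J}_X \hat{J}_Y} \tag{13} \]
\[ I_A = \left[ \frac{a}{a + 2\bar{\alpha} t} \right]^{a} \tag{16} \]
where $a$ is the shape parameter of the gamma distribution and $\bar{\alpha}$ is the mean of $\alpha$ over loci. Therefore, the
number of gene substitutions per locus is given by
\[ D = 2\bar{\alpha} t = a \left(1 - I_A^{1/a}\right) / I_A^{1/a} \tag{17} \]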
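A minimal Python sketch of two of the distance calculations in this section, assuming the forms D = -ln(I) with I = J_XY / sqrt(J_X * J_Y), and the gamma-corrected distance a(1 - I_A^(1/a)) / I_A^(1/a) obtained by inverting I_A = [a / (a + 2*alpha_bar*t)]^a. The function names and numerical inputs are hypothetical and serve only to check the algebra.

```python
import math


def standard_distance(jx, jy, jxy):
    """D = -ln(I), where I = J_XY / sqrt(J_X * J_Y) is the normalized identity."""
    identity = jxy / math.sqrt(jx * jy)
    return -math.log(identity)


def gamma_distance(i_a, a):
    """Invert I_A = [a / (a + 2*alpha_bar*t)]**a to recover 2*alpha_bar*t."""
    root = i_a ** (1.0 / a)
    return a * (1.0 - root) / root


# Round-trip check: choose a and 2*alpha_bar*t, compute I_A, then invert it.
a = 2.0
two_alpha_t = 1.0
i_a = (a / (a + two_alpha_t)) ** a
print(gamma_distance(i_a, a))
print(standard_distance(0.6, 0.5, 0.4))
```

The round trip shows that the inversion returns the value of 2*alpha_bar*t that was used to generate I_A, which is the sense in which the corrected distance is linear in time.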
[Figure 2 diagram labels: allelic states A2, A1, A0, A1, A2, with transitions marked v/2.]
(δμ)² Distance
\[ (\delta\mu)^2 = \sum_{k=1}^{L} \left(\mu_{X_k} - \mu_{Y_k}\right)^2 / L \tag{19} \]
Figure 3 Neighbor-joining trees of human populations obtained by using Bowcock et al.'s (1994) data of 25 microsatellite
loci. (A) Tree obtained by $D_A$ distance. (B) Tree obtained by (δμ)² distance. The number for each interior branch is
the bootstrap value from 1000 replications. M, Maya; K, Karitiana; S, Surui; CAR, Central African Republic.
reason a distance measure that is linear with time is not
necessarily better than a nonlinear distance in obtaining true trees (topologies).
A number of authors have studied this problem by
using computer simulation. The general conclusions
obtained from these studies are as follows:
1. For all distance measures, the probability of obtaining the true topology (PT) is very low when the
number of loci used is less than ten but gradually
increases with increasing number of loci. In general,
PT is lower for the stepwise mutation model than
for the infinite-allele model. This indicates that a
larger number of loci should be used for microsatellite DNA data than for electrophoretic data
when the level of average heterozygosity is the
same.
2. Distance measures DA and dC are generally more
efficient in obtaining the true topology than other
distance measures under many different conditions.
3. When the total number of individuals to be studied
is fixed, it is generally better to examine more loci
with a smaller number of individuals per locus
rather than fewer loci with a large number of individuals in order to have a high PT value, as long as
the number of individuals per locus is greater than
about 25. When average heterozygosity is as high as
0.8, however, a larger number of individuals per
locus need to be studied.
polymorphic DNA (RAPD) data. The allele frequency data obtained by these markers can be analyzed by the same methods as mentioned earlier. They
can also be used to estimate the average number of
nucleotide differences per site between two populations. In the latter analysis somewhat sophisticated
statistical methods are required, and they are presented in Nei and Kumar (2000).
During the last two decades, RFLP data for mitochondrial and chloroplast DNA have been used extensively to study the extent of genetic differentiation of
closely related species or populations. RFLP data can
be obtained inexpensively and give sufficiently accurate
results for studying closely related populations. In
recent years, however, many authors sequence polymorphic alleles to obtain more accurate results. Statistical methods for analyzing these polymorphic DNA
sequences are described in Nei and Kumar (2000).
Genetic Drift
J Arnold
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0533
Genetic drift is the random variation in gene frequencies due to sampling. Since populations are finite in
size, those individuals contributing genes to the next
generation constitute a sample from the population.
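The sampling idea can be illustrated with a short simulation. This is a sketch under an assumed Wright-Fisher model (each generation, 2N gene copies are drawn at random from the previous generation's allele frequency); the model choice, the function name wright_fisher, and the parameter values (population size, number of generations, number of replicate populations, seed) are illustrative assumptions rather than details given in this entry.

```python
import random


def wright_fisher(n_diploid, p0, generations, rng):
    """Simulate allele frequency change by random sampling alone.

    Returns the list of allele frequencies, starting with p0.
    """
    two_n = 2 * n_diploid
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Draw 2N gene copies; each copy is the focal allele with probability p.
        copies = sum(1 for _ in range(two_n) if rng.random() < p)
        p = copies / two_n
        trajectory.append(p)
    return trajectory


rng = random.Random(0)  # fixed seed so the run is repeatable
runs = [wright_fisher(16, 0.5, 19, rng) for _ in range(107)]
final = [run[-1] for run in runs]
n_fixed_or_lost = sum(1 for p in final if p in (0.0, 1.0))
print("replicates fixed or lost:", n_fixed_or_lost, "of", len(runs))
```

Replicates started at the same frequency spread out over time, and once a replicate reaches frequency 0 or 1 it stays there, because no mutation or migration is included in this sketch.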
\[ H_t = \left(1 - \frac{1}{2N_e}\right)^{t} H_0 \]
Figure 1 (A) Empirical histogram of allele frequency in 107 experimental populations of Drosophila melanogaster, all
started with an allele frequency of p = 0.5. (B) Theoretical histogram of allele frequency f(p; t) after multiples of N
generations, all started with an allele frequency of p = 0.50. (Redrawn from Buri, 1956 and Wright, 1969.)
of a subdivided population. In the first phase genetic
drift causes each subdivision to undergo a random
walk in allele frequencies to explore new combinations of genes. In the second phase a new favorable
combination of alleles is fixed in the subpopulation by
natural selection and is exported to other demes by
factors like migration between demes. Much of the
basic theory of genetic drift was developed in the
context of understanding the shifting balance theory
of evolution.
Genetic drift has also played a fundamental role in
the neutral theory of molecular evolution. In this theory most of the genetic variation in DNA and protein
sequences is explained by a balance between mutation
and genetic drift. Mutation slowly creates new allelic
variation in DNA and proteins, and genetic drift
slowly eliminates this variability, thereby achieving a
steady state.
Consider for example a new mutation arising in a
gamete's DNA with probability u each generation.
If there are 2Ne alleles in the gene pool, then the
number of new mutations per gamete per generation
Genetic Engineering
I Schildkraut
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0534
Genetic Equilibrium
M Tracey
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0535
Hardy-Weinberg Equilibrium
In a sexually reproducing, diploid population of infinite size in which there is no mutation or migration, no
natural selection, and where mating takes place at
random, the frequencies of alleles and genotypes will
remain unchanged in Hardy-Weinberg equilibrium.
If this hypothetical population is not at equilibrium
at the outset, it will take only a single generation to
establish equilibrium under the conditions defined
above. Imagine a population like the one defined
above in which we focus on a gene with two alleles,
h1 and h2. Let us assume that the frequency of h1, f(h1), is
0.1 and the frequency of h2, f(h2), is 0.9, since
f(h2) is always 1 - f(h1) in a two-allele system, because
the sum of allele frequencies must always equal one.
Since h1 cannot change into h2 by mutation, nor can
h2 change into h1 and there is no natural selection,
we may represent random mating by multiplying the
frequencies of male gametes by the frequencies of
female gametes:
\[ [f(h_1) + f(h_2)] \times [f(h_1) + f(h_2)] = f(h_1)^2 + 2 f(h_1) f(h_2) + f(h_2)^2 \]
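The multiplication of gamete frequencies can be checked numerically. The Python sketch below uses the allele frequencies from the example, 0.1 for h1 and 0.9 for h2; the function names and the starting genotype frequencies for the out-of-equilibrium population are illustrative assumptions.

```python
def hardy_weinberg(p):
    """Genotype frequencies (h1h1, h1h2, h2h2) after one round of random mating,
    where p is the frequency of allele h1."""
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


def allele_frequency(f11, f12, f22):
    """Frequency of h1 given genotype frequencies (heterozygotes carry one copy)."""
    return f11 + 0.5 * f12


# Equilibrium genotype frequencies for f(h1) = 0.1 and f(h2) = 0.9.
f11, f12, f22 = hardy_weinberg(0.1)
print(f11, f12, f22)

# A population that starts out of equilibrium (no heterozygotes at all)
# reaches the same proportions after a single generation of random mating.
start = (0.1, 0.0, 0.9)
after_one = hardy_weinberg(allele_frequency(*start))
after_two = hardy_weinberg(allele_frequency(*after_one))
print(after_one, after_two)
```

The genotype frequencies sum to one, the allele frequency is unchanged from one generation to the next, and the second round of random mating reproduces the result of the first, which is the single-generation approach to equilibrium described above.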
Mutation-Selection, a Balanced Equilibrium
Dominant alleles which confer prereproductive death
or sterility on their carrier are eliminated from the
population in a single generation. All the alleles seen
in a population must be new mutations and the
equilibrium is simply a balance between elimination
by death or sterility and new mutations.
Genetic Homeostasis
J Phelan
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0536
In environments that fluctuate and change unpredictably, the adaptability of a population is critically
dependent upon the maintenance of reserves of genetic
variability. Genetic homeostasis describes this property of populations that emerges from stabilizing
selection which operates on individuals. In such environments, the adaptedness of individuals is increased
by the enhanced buffering against developmental
instability that comes from heterozygosity. Such buffering enables individuals to produce the proper adaptive phenotype despite the inevitable environmental
fluctuations that occur during development.
What is Homeostasis?
From an organism's perspective, the ability to maintain normal physiological functioning despite inconstant environmental variables both internally and
externally is a central aspect of its fitness. Consequently, homeostasis is one of the fundamental adaptations. The ability of an animal embryo to develop
normally even in the face of large insulin fluctuations,
for example, or of a mammal to maintain a constant
body temperature despite changing weather, confers a
survival advantage.
Overdominance
Perhaps the most well-known case of stable equilibrium in human genetics is sickle-cell anemia, where
Just as individuals exhibit physiological or developmental homeostasis, populations, too, may exhibit
Further Reading
Palmer AR and Strobeck C (1986) Fluctuating asymmetry: measurement, analysis, patterns. Annual Review of Ecology and
Systematics 17: 391-421.
Genetic Load
J F Crow
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0537
measurements of allele frequencies rather than indirect assessments from load theory.
Genetic Mapping
See: Chromosome Mapping, Gene Mapping
Genetic Marker
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1849
Genetic Material
J Merriam
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0539
The term genetic material describes the physical substance that is inherited from parents by offspring.
Generally this refers to DNA. This physical substance
is a constant connection for all cells in a body or
colony, all individuals in a species, or ultimately, all
organisms. The physical substance carries the information specifying the enzymes, structural proteins,
and other gene products that are characteristic of life.
More abstractly, genetic material refers to that information, or code, that directs life processes.
See also: DNA; Universal Genetic Code
Genetic Migration
J B Mitton
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1422
Genetic Polarity
See: Polaron
Genetic Ratios
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0542
Genetic Recombination
D Carroll
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0543
Definitions
Genetic recombination refers to the rearrangement
of DNA sequences by some combination of the
breakage, rejoining, and copying of chromosomes
or chromosome segments. It also describes the consequences of such rearrangements, i.e., the inheritance
of novel combinations of alleles in the offspring that
carry recombinant chromosomes. Genetic recombination is a programmed feature of meiosis in most
sexual organisms, where it ensures the proper segregation of chromosomes. Because the frequency of
recombination is approximately proportional to the
physical distance between markers, it provides the
basis for genetic mapping. Recombination also serves
as a mechanism to repair some types of potentially
lethal damage to chromosomes.
Genetic recombination is often used as a general
term that includes many types of DNA rearrangements and underlying molecular processes. Meiotic
recombination is an example of a reaction that
involves DNA sequences that are paired and homologous over very extended lengths. This type of process, which is illustrated in Figure 1, is termed general,
legitimate, or homologous recombination. Recombination of this type is reciprocal, because each participating chromosome receives information comparable
to what it donates to the other partner. The event
shown in Figure 1 is also designated as a crossover,
since all the information on both sides of the effective
break has been exchanged.
Gene conversion is a form of homologous recombination that is nonreciprocal. This is recognized by
the recovery of unequal numbers of the parental markers at a particular locus, and a simple example is
shown in Figure 2. Conversion events can be accompanied by a crossover, or not (as shown in Figure 2).
In the latter case, conversion looks like a very localized double crossover, but it is nonreciprocal and is
likely the result of a single event.
Homologous recombination can occur between
homologous chromosomes or sister chromatids in
mitotic cells as well. In addition, essentially analogous
events may take place between homologous sequences
that are present at different locations on nonhomologous chromosomes; this is often called ectopic recombination. Recombination that involves very limited or
no homology between the interacting DNA sequences
[Figure 1 diagram labels. No recombination: 2 AB, 2 ab. Recombination: 1 AB, 1 aB, 1 Ab, 1 ab.]
Figure 1 Simplified diagram of a meiotic recombination event. Vertical bars indicate individual chromatids
(i.e., double-stranded DNA molecules); shaded ovals are
centromeres. We imagine two pairs of sister chromatids
after premeiotic DNA synthesis that are distinguished
by color and by genetic markers at locations A/a and B/b.
If meiosis were to proceed without recombination, the
markers would segregate 2:2 in linked pairs in the
resulting gametes or spores, as indicated below the left
diagram. If one reciprocal recombination event takes
place between the two markers, the linkage relationships are changed, yielding two new chromatids as
shown and ultimately four distinct haploid products.
is termed illegitimate or nonhomologous recombination. Sometimes a few matched base pairs are seen
precisely at illegitimate recombination junctions, and
these are called microhomologies. An event supported
by homologies of 100 bp or more would typically be
classified as homologous, a match of 10 bp or fewer
would be nonhomologous, and there is evidently a
gray area in between.
In conservative recombination events, the number
of copies of the interacting chromosomes or DNA
sequences is maintained throughout the process,
while in nonconservative events, two original copies
are reduced to one in the product. This distinction can
be made for both homologous and nonhomologous
recombination.
Site-specific recombination events are mediated
by sequence-specific recombination enzymes often
encoded by viruses or transposable elements. The
molecular processes they catalyze may rely on very
short stretches of homology between the interacting
DNAs, or they may be entirely nonhomologous.
[Figure 2 diagram labels. Gene conversion: 1 AB, 1 aB, 2 ab.]
Genetic Mapping
To a first approximation, the probability that a genetic
recombination event will occur during meiosis is distributed equally along the length of each chromosome,
and in most organisms the number of crossovers in
each chromosome arm is limited to one or a few. This
means that it is quite unlikely that an event will occur
between two genes that are very close to each other on
a chromosome, but much more likely between distant
genes. The closer genes A and B are along the DNA,
the less likely an exchange that rearranges the alleles of
these genes, as shown in Figure 1. This forms the basis
of genetic mapping.
The frequency of recombination is defined as the
fraction of all cases in which two genetic markers that
came from the same parent are found separated in the
offspring. If two markers are on different chromosomes, they will not be linked as they pass through
meiosis, and their recombination frequency will be
0.5, i.e., they will segregate into the same gamete by
chance half the time and into different gametes half the
time. Markers that are very close to each other on the
same chromosome arm will be separated very rarely
and will have a recombination frequency close to zero.
Markers more distant from each other on the same
chromosome will show recombination frequencies
between zero and 0.5.
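The definition of recombination frequency amounts to a simple ratio, sketched here in Python. The function name and the offspring counts are hypothetical and chosen only to illustrate the three situations described above.

```python
def recombination_frequency(parental, recombinant):
    """Fraction of offspring in which two markers from the same parent
    are found separated (recombinant) rather than together (parental)."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("at least one offspring must be scored")
    return recombinant / total


# Hypothetical counts of scored offspring.
unlinked = recombination_frequency(parental=250, recombinant=250)
tightly_linked = recombination_frequency(parental=490, recombinant=10)
loosely_linked = recombination_frequency(parental=350, recombinant=150)
print(unlinked, tightly_linked, loosely_linked)
```

With these made-up counts, markers on different chromosomes give 0.5, closely spaced markers give a value near zero, and more distant markers on the same chromosome fall in between.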
Now imagine a situation in which a third marker,
c, is added to the same chromosome. When the three
[Figure 3 diagram labels. Steps: Initiation, Exchange, Resolution, Repair. Listed outcomes: 1 ABC, 1 aB/bC; 1 AB/bc, 1 abc; 2 ABC, 1 aBc, 1 abC; 1 ABc, 1 Abc, 2 abc. Tabulated B : b ratios: 2:6, 3:5, and 4:4 (aberrant).]
Figure 3 Illustration of some events that occur following the initial exchange of one strand of DNA between
homologous chromosomes. Each DNA single strand is shown as a vertical bar, and the parental chromosomes are
imagined to differ in allelic markers in three genes: A/a, B/b, and C/c. In diagram 1, all white strands carry ABC
information, while all shaded strands are abc. In subsequent diagrams, the information in the strands of the interacting
chromosomes is noted explicitly. After the initiation event (2), in which a segment of one strand invades
corresponding sequences in a homologous chromosome, several outcomes are possible. The initial patch, carrying
marker b, can be incorporated into the recipient chromosome, and the gap it left behind can be filled by DNA
synthesis (3). Mismatches in the resulting heteroduplex at B/b can be repaired (4). An alternative fate of intermediate
2 is the reciprocal exchange of a single strand from the invaded chromosome (5). One possible way this intermediate can be resolved is by cleavage and religation of the nonexchanged strands; this leads to a crossover, with
heteroduplexes remaining at site B/b (6). If these heteroduplexes are both repaired to the B allele, the result will be
that shown in 7. Below the diagrams of products 3, 4, 6, and 7, the genetic outcomes are indicated. The line labeled
`Gametes' shows the status of the double-stranded DNAs, with B/b indicating the persistence of heteroduplexes. The
results of postmeiotic segregation (PMS) are indicated, as are the recoveries of the parental alleles at B/b for each of
the outcomes. Product 4 would be scored as a gene conversion, while product 7 is a gene conversion associated with
a crossover.
will be produced. This phenomenon is referred to as
postmeiotic segregation, or PMS.
As shown in Figure 1, the usual outcome of meiosis would be the recovery of equal numbers of parental
alleles, in either parental or recombinant configuration. The processes illustrated in Figure 3 can alter
this distribution for markers that are close to the site of
the recombination event itself. The number of alleles
of each parental type that are present in each of the
recombination products is tabulated in the figure. The
[Figure 4 diagram labels: Nonhomologous end joining; Conservative homologous recombination; Nonconservative homologous recombination.]
Figure 4 Modes of double-strand break repair by recombination. Each horizontal bar represents a double-stranded
DNA molecule. Thin bars indicate no homology, while thick bars denote homologous sequences. In nonhomologous
end ligation, broken ends are rejoined precisely without gain or loss of information. Nonhomologous end joining is
typically accompanied by deletion (as shown) or insertion of DNA. In conservative homologous recombination, the
break is repaired by copying information from homologous sequences elsewhere in the genome, often from a homologous
chromosome or sister chromatid, and the repair may or may not be accompanied by crossing over, as shown in
the two alternative outcomes. Nonconservative homologous recombination relies on repeated sequences near the
break, and repair is accompanied by deletion of one copy of the repeat and all sequences between the two copies.
in DNA. By definition, this type of damage must be
repaired by recombination: what came apart must
be put back together. DSBs can be generated by external agents, like ionizing radiation and some types
of chemicals, or during normal cellular processes, like
generation of reactive oxygen species and problems in
DNA replication. Essentially all types of recombination play a role in DSB repair in some organisms or
cell types: homologous and nonhomologous, conservative and nonconservative, reciprocal and nonreciprocal. Some examples are illustrated in Figure 4.
Some types of breaks in DNA can be rejoined by
the simple action of DNA ligase without a need for
extensive sequence homology. Examples would be the
short, complementary single-stranded tails generated
by many restriction endonucleases. More frequently,
however, the broken DNA ends are not ligatable and
end joining occurs between novel sequences, often
with concomitant deletions or insertions. Examination of the junctions produced in these illegitimate
events frequently reveals microhomologies (~1-5
nucleotides) between the parental sequences.
Two forms of homologous recombination are
involved in DSB repair. If an unbroken copy of the
same sequence is available on a homologous chromosome or sister chromatid, the most reliable way to restore the integrity of the broken DNA is to copy that
information in repairing the break. Conservative
DNA cloning and sequencing techniques. If it is then
introduced into living yeast, some fraction of the cells
will incorporate it at the homologous chromosomal
site, using the normal recombination apparatus.
This type of experiment, called gene targeting,
works rather well in fungi, but is less efficient in multicellular organisms. The frequency of homologous
recombination events between an introduced DNA
molecule and the corresponding chromosomal target
can be improved if both the target and the introduced
DNA are broken. Apparently, the cell sees the DSBs
as damage that needs to be repaired. Picture a variant
of the conservative homologous DSB repair illustrated
in Figure 4, in which the broken chromosome is
shaded and the white DNA is a linear fragment introduced into the cells. The non-crossover product
carries the targeted insertion. As described in the
preceding section, cells have multiple pathways of
DSB repair, and in practice both homologous and
nonhomologous events occur at the broken ends.
In addition to adding to the arsenal of the experimental geneticist, gene targeting holds promise for
human gene therapy. In principle, the disease-causing
version of a gene could be replaced by the normal
allele using this same procedure. At present, however,
the efficiency of gene targeting in human cells is too
low to make this approach practical.
Genetic Redundancy
D C Krakauer and M A Nowak
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0519
Pleiotropic Theories
Some forms of pleiotropy can also ensure the conservation of redundant function. Recall that pleiotropy refers to cases where a single gene experiences
selection in more than one context. Consider again
two genes with two independent functions. Furthermore, assume that one gene can, in addition to its
own function, perform the function of the other
gene, but less efficiently. Redundancy is partial and
measured in terms of correlated function. When mutations to the pleiotropic gene can either eliminate its
unique function or both functions, then redundancy is
preserved whenever the rate of elimination of the
pleiotropic function is lower than the rate of elimination of its
unique function. In other words, when one gene's
pleiotropic function is more robust than the other
gene's unique function, the correlated function can
be preserved.
Summary
In summary the problem of redundancy is the problem of its preservation. Evolutionary stability of nontrivial redundancy requires asymmetries in mutation
and functional efficiency. However, trivial redundancy such as dosage effects could provide the most
parsimonious explanation for the observed data.
Assuming that weak selection acting in large populations over long time scales has played an important
role in genome evolution, cumulative benefit appears
to be refuted simply because experimental assays are
not sufficiently sensitive.
See also: Gene Regulation; Mutation Load;
Pleiotropy
Genetic Screening
See: Gene Mapping; Gene Therapy, Human;
Genetic Counseling; Pedigree Analysis; Prenatal
Diagnosis
Drosophila
T. H. Morgan and his students at Columbia, as part of
their studies of Drosophila melanogaster mutants that
began in 1913, preserved their stocks in a collection
maintained by C. Bridges. These stocks were provided
to anyone requesting them. In 1928, Morgan, Bridges,
Maize
At the 1928 Winter Science meetings in New York, a
group of maize geneticists discussed organized
work on the maize linkage maps, and the idea of an
Table 1
Type | Organism | Web site
Microbial | Bacillus subtilis | http://bacillus.biosci.ohio-state.edu
 | Chlamydomonas | http://www.biology.duke.edu/chlamy
 | Escherichia coli | http://cgsc.biology.yale.edu; http://shigen.lab.nig.ac.jp/ecoli/strain; http://www.shigen.nig.ac.jp/cvector/cvector.html
 | Filamentous fungi | http://www.fgsc.net; http://www.hgmp.mrc.ac.uk/research/fgsc/intro.html
 | Pseudomonas | http://www.pseudomonas.med.ecu.edu
 | Salmonella | http://www.acs.ucalgary.ca/*kesander
 | Yeast | http://phage.atcc.org/searchengine/ygsc.html or http://www.atcc.org (Berkeley); http://panizzi.shef.ac.uk/msdn/peter (Peterhof); http://www.ifrn.bbsrc.ac.uk/NCYC (UK); see also http://genome-www.stanford.edu/Saccharomyces
 | Miscellaneous (having subcollections of Agrobacterium, Escherichia coli, yeast, etc.) | http://www.cabi.org/; http://www.belspo.be/-bccm; http://www.dsmz.de; http://www.pasteur.fr/applications/CIP; http://www.ukncc.co.uk/; http://www.jcm.riken.go.jp/JCM/aboutJCM.html; http://wdcm.nig.ac.jp; http://www.atcc.org; http://mgd.nacse.org/ocid/prospect3.html
Plant | Arabidopsis |
 | Barley |
 | Maize |
 | Pea |
 | Rice |
 | Tomato |
 | Wheat |
Animal | Axolotl |
 | Chicken |
 | Drosophila |
 | Mouse |
 | Caenorhabditis elegans |
 | Zebrafish |
Mouse
The Jackson Laboratory is a nonprofit, independent
research institution which was founded in 1929 by
C. C. Little to conduct basic genetic and biomedical
research and to provide training and genetic resources
to the scientific community. It has since that time supplied inbred and mutant strains of mice to the research
community. Its current resource includes over 2500
strains of genetically defined mice, both live stocks
and frozen embryos, and a transgenic mouse resource
and DNA resources (http://www.jax.org and
http://www.jaxmice.jax.org). Mouse Genome Informatics is
served from http://www.informatics.jax.org.
The MRC Mammalian Genetics Unit, Harwell,
UK (http://www.mgu.har.mrc.ac.uk) maintains a
Frozen Embryo and Sperm Archive of almost 1000
stocks and live mouse stocks of 200 mutant, chromosomal anomaly, and inbred lines available on request.
The Archive provides free cryopreservation and
Escherichia coli
Sexuality and the ability to make genetic crosses
between mutant strains of bacteria were discovered
Phabagen, the Phage and Bacterial Genetics Collection, includes 3500 mutant bacterial strains, 450
cloning vectors, 800 other plasmids, 2 plasmid-containing gene banks of E. coli, and over 100 phages.
It was established in the early 1960s with deposits of
bacterial mutants from researchers of the Working
Community Phabagen. Since 1990 it has been part of
the Centraal Bureau voor Schimmelcultures (CBS),
and has merged with the Laboratory for Microbiology
at Delft (LMD) Collections to become the National
Culture Collection of Bacteria of the Netherlands,
which includes mutant derivatives of E. coli K-12
and B and also mutants of Agrobacterium tumefaciens,
wild-type and reference strains of other bacteria,
plasmids, and phages (http://www.cbs.knaw.nl/nb).
Bacillus subtilis
Pseudomonas aeruginosa
Yeast
The Yeast Genetic Stock Center originated at the University of California at Berkeley, in 1960, founded and
administered by R.K. Mortimer. It included 1200
strains of Saccharomyces cerevisiae, primarily derivatives of the stocks of C.C. Lindegren at Southern
Illinois University. Professor Mortimer and the stock
center annually published updated linkage summaries
and linkage maps. After his retirement, the collection
moved, in 1998, to the ATCC, where it is maintained
as a separate collection (http://phage.atcc.org/searchengine/ygsc.html). The ATCC also has a number of
the mutant lines of Schizosaccharomyces pombe. In
addition it will serve as a repository for a complete
set of deletion strains (http://www.atcc.org). Maps
and sequence are now presented on-line as part of
the Saccharomyces Genome Database
(http://genome-www.stanford.edu/Saccharomyces).
The Peterhof Genetic Collection of Yeasts (PGC)
is part of the Biotechnology Center at St. Petersburg
State University in Russia. It has over 1000 genetically
marked yeast strains, with an origin distinct from the
Carbondale/Berkeley collection. The Peterhof lines
originated from a diploid cell of an inbred strain of
Saccharomyces cerevisiae. The Collection includes
mutants derived from this line, other genetically
marked yeast strains, and segregants of crosses between
the Peterhof-derived and other strains (panizzi.shef.ac.uk/msdn/peter).
The National Collection of Yeast Cultures
(NCYC), Institute of Food Research, in Norwich,
UK, includes brewing yeast strains, genetically
defined strains of Saccharomyces cerevisiae and Schizosaccharomyces pombe, and general yeast strains,
totalling over 2700 nonpathogenic yeasts. In addition
to supplying cultures, it is a patent and safe repository
and it performs yeast identification services. A searchable database for the NCYC is found at http://www.ifrn.bbsvc.ac.uk/ncyc.
Chlamydomonas
strains, and genomic and cDNA clones of Chlamydomonas reinhardtii. The web and gopher sites provide,
in addition, information on genetic and molecular
maps of Chlamydomonas, plasmids, sequences, and
bibliographic citations (http://www.biology.duke.edu/chlamy).
A number of the large national collections of microorganisms include genetic stocks. These are described
in more detail in the Stock Centers entry in the Encyclopedia of Microbiology (Berlyn, 2000).
For example:
1. The International Mycological Institute (IMI) for
Culture Collections, which was founded in 1920 as
an organization supported by 32 governments and
has over 16 500 strains of filamentous fungi, yeasts,
and bacteria. It is part of Commonwealth Agricultural Bureaux (CAB) International, a nonprofit
intergovernmental organization (http://www.cabi.
bioscience.grc.htm).
2. The Belgian Coordinated Collections of Microorganisms (BCCM), a consortium of four research-based collections, include 50 000 documented
strains of bacteria, filamentous fungi, and yeasts
and over 1500 plasmids, supported by the Belgian
Federal Office for Scientific, Technical, and Cultural Affairs. They provide patent and safe-deposit
services, as well as fingerprinting/biotyping and
identification services, contract research, and training (http://www.belspo.be/bccm).
3. The Deutsche Sammlung von Mikroorganismen
und Zellkulturen (DSMZ) is the national culture
collection in Germany, founded in 1969 and supported by the Federal Ministry of Research and
Technology and the State Ministries. It includes
genetic stocks of bacteria, filamentous fungi, and
yeast. It also has plant and animal cell cultures. In
addition to supplying scientists and institutions
with its cultures, it acts as a patent and safe repository (http://www.dsmz.de).
4. The Collection of the Institut Pasteur (CIP) traces
its origin to Dr. Binot's collection of microbial
strains in 1891. It now includes genetic stocks of
E. coli (http://www.pasteur.fr/applications/CIP).
5. The NCYC has been cited for its yeast collections.
The National Collection of Type Cultures in
London (NCTC) is another of the UK National
Culture Collections (http://www.ukncc.co.uk) and
it has genetic derivatives as well as natural isolates
and pathogenic strains of E. coli in its large collection, which emphasizes pathogenic bacteria and
mycoplasmas. It is a patent and safe depository
Research using Arabidopsis thaliana as a model organism for flowering plants increased dramatically from
the mid-1980s through the 1990s. Resource centers
that included genetic stocks, genomic libraries, and
cloned DNA were established in response to recommendations from a series of workshops sponsored by
NSF and culminating in a long-range plan for a multinational-coordinated Arabidopsis genome project presented in 1990. The Arabidopsis Biological Resource
Center (ABRC) at Ohio State University includes
seeds, restriction fragment length polymorphism
(RFLP) markers, and yeast artificial chromosome
(YAC) libraries. The seed collection and distribution
activities in Europe are performed by the Arabidopsis
Centre at Nottingham, in England, and there is a clone
center in Germany. The centers manage the same
(mirrored) collection of seed stocks and collaborate
and coordinate their efforts to meet the needs of the
world Arabidopsis research community. The US center, directed by R. Scholl, receives funding from the
Tomato
Wheat
Rice
Barley
Pea
G. Marx, at Cornell University, collected pea germplasm and mutants, and upon his retirement the G.A.
Marx Collection became part of the NPGS, and the
collection was moved to Washington State University,
with accessions numbering c. 3000. It includes mutations affecting foliage, flowers, seeds, pods, productivity, and photoperiodism, and a special subset
tagged 'Mendel's Genes' (http://www.ars-grin.gov/ars/PacWest/Pullman/GenStock/pea/MyHome.html).
Use of this model organism for the genetics of development, behavior, and neurobiology began in S. Brenner's laboratory in the early 1970s and encompasses
mutant isolation and analysis, documentation of cell
lineage and development, and the complete genome
sequence (see, for example, Cell Division in Caenorhabditis elegans). The Caenorhabditis Genetic Center
(CGC) at the University of Minnesota keeps genetic
stocks of C. elegans, approximately 3500, and a database linked to the C. elegans genomic database, http://biosci.umn.edu/CGC.
A more recently developed model system, the zebrafish, for study of vertebrate development and genetics,
has a repository for strains at the University of Oregon,
supported by funds from the NIH and the state of
Oregon.
The International Resource Center for Zebrafish
preserves sperm samples, embryos, and live stocks
of zebrafish wild-type and mutant stocks submitted
by researchers and available for distribution to the
research community, maintains the genetic map and
information on genetic markers, publishes information on methods for maintenance and use use of zebrafish in research, and studies disease and health of
zebrafish strains. The Center maintains the ZFIN,
the Zebrafish Information Network database for
disseminating information on genetics, genomics,
and development of the organism and community
information. The database project was founded in
1994, with initial support from the NSF and the
Keck Foundation and current support from the NIH
(http://zfin.org/zfinfo/stckctr/stckctr.html and http://zfin.org/index.html).
Genetic Transformation
See: Bacterial Transformation
Genetic Translation
B E Schoner
Further Reading
References
Berlyn MKB (2000) Stock culture collections and their databases. In: Lederberg J (ed.) Encyclopedia of Microbiology, vol. 4,
pp. 404-427. London: Academic Press.
Demerec M, Adelberg EA, Clark AJ et al. (1966) A proposal for
a uniform nomenclature in bacterial genetics. Genetics 54:
61-76.
Further Reading
References
Genetic Variation
W J Ewens
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0552
Genetics
A Campbell
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0545
Transmission Genetics
Transmission genetics concerns the germinal substance (deoxyribonucleic acid or DNA) and its mode
of transmission from parent to progeny. DNAs are
distinguished from one another by the sequence of
nucleotides along their length. Linear molecules of
DNA constitute the core of microscopically visible
structures (the chromosomes) that divide and segregate at cell division so that each cell of a multicellular
organism generally has the same chromosome complement. Sexually reproducing eukaryotes are typically diploid: an individual has one chromosome set
from each of his or her parents. At reproduction, a
meiotic division produces gametes (sperm or egg),
each of which has a single chromosome set.
The discipline of genetics follows the work of
Gregor Mendel, who discovered in 1865 the regular
pattern of transmission of units (later called genes)
that affect visible properties (later called phenotypes)
of organisms. Mendel's units are segments of linear
DNA molecules. Most of the basic rules of genetics
were deduced before 1951 (when the germinal substance was shown to be DNA) and long before
DNA sequencing. Among the processes fundamental
to deducing the rules are mutation, recombination, and
the meiotic behavior of structurally aberrant chromosomes. A mutation is a heritable change, almost
always a change in DNA sequence. Mutations can be
used to mark the chromosome entering a cross from
the parents and recovered in the progeny. Genetic
recombination of such marked chromosomes allows
the construction of linkage maps. All these processes
aid in equating genetic determinants with specific
chromosomal segments, a goal that is superseded by
complete genome sequencing, when that is available.
Transmission genetics includes the study of DNA
transfer between individuals by means other than
sexual reproduction, and its incorporation into the
recipient genome. This process is conspicuous in prokaryotes, which lack a meiotic cycle and frequently
have circular rather than linear chromosomes. Transmission genetics also includes the study of organelles
such as mitochondria or plastids that contain DNA
but are not distributed in a regular manner at cell
division, and of viruses and related elements that contain RNA rather than DNA (some of which are transmitted vertically from parent to progeny cell).
Physiological Genetics
Physiological genetics concerns the mechanisms
whereby genes affect organismal properties through
transcription of DNA to RNA, translation of RNA to
protein, and their regulation. Some subdivisions are
biochemical genetics, developmental genetics, and cell
genetics. A major tool of physiological genetics has
been the characterization of mutant organisms. Such
studies have identified various regulatory elements
including activators and repressors of transcription
and translation and nucleases and proteases that affect
protein concentrations. Mutant studies frequently
reveal complex pathways leading from primary gene
functions to visible traits, sometimes allowing genes to
be classified into regulatory hierarchies that define
temporal patterns of gene expression during such
processes as metazoan development and cell cycle
progression.
Population Genetics
The subject matter of population genetics is the distribution of heritable variation among the members of
an interbreeding population. It includes both the genesis of such variation (through mutation and selection)
and its maintenance from one generation to the next.
Much of the variation in natural populations has no
Genome
F Ruddle
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0560
Further Reading
Alberts B, Bray D, Lewis J et al. (1994) The Cell, 3rd edn. New
York: Garland.
Brown TA (1999) Genomes. New York: Wiley-Liss.
Ridley M (1999) Genome. London: Harper Collins.
Science (2000) 287: 2105-2364.
Genome Organization
G L Gabor Miklos
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0556
Genomes of Bacteria
The variation in bacterial genome size and gene
number is large. Mycoplasma genitalium (0.58 Mb)
has approximately 470 protein coding genes, whereas
Myxococcus xanthus (9.5 Mb) probably has in excess
of 8000 genes. Genome sizes also vary within a taxonomic group, e.g., from 2.7 to 6.5 Mb in cyanobacteria
and from 6.5 to 8 Mb in different strains of Streptomyces ambofaciens. Furthermore, the genome organizations of Mycoplasma genitalium (470 genes),
Haemophilus influenzae (1709 genes), Synechocystis
sp. (3200 genes), Bacillus subtilis (4000 genes), and
Escherichia coli (4300 genes) reveal that gene order is
not conserved, and that there is no absolute functional
requirement for specific gene juxtapositions. Bacterial
genomes are organized as linear and circular structures. Linear chromosomes occur in Borrelia burgdorferi, various species of Streptomyces, Agrobacterium
tumefaciens, and Rhodococcus fascians. Circular
chromosomes are found or inferred in other species:
Mycoplasma genitalium, Haemophilus influenzae,
Escherichia coli, Deinococcus radiodurans, Leptospira
interrogans, and Rhizobium meliloti.
Gene Families
All genomes contain different-sized gene families, the
members of which have arisen by duplicative processes. Thus in Haemophilus influenzae, while there
are 1709 genes in total, 284 of these are duplicated
products, or paralogs. Thus, there are only 1425
distinct families, some of which have more than one
family member. In E. coli, nearly 50 % of the genes
are duplicated, a figure not very different from the
percentage of paralogs in the genomes of the worm
(49 %) and the fly (41 %). Thus, independently of
their grade of evolutionary organization, genomes
have undergone a significant degree of duplication of
their genes. These duplications can be local and form
a cluster, such as a tandem array of 10 glutathione
S-transferase genes in the fly and the cluster of 10
kallikrein serine proteases in the rat. Alternatively,
the duplicated family members can be dispersed, such
as the G-protein-coupled receptor genes (GPCRs),
which are distributed throughout the entire fly genome. Furthermore, the extent of these duplication
events is different in each evolutionary lineage. In
the case of the trypsin-like (S1) proteases, yeast has
one gene, the worm has seven, and the fly has 199.
In the case of the GPCRs, there are 160 in the fly, 1100 in
the worm, and an estimated 700 in the human genome.
In the case of neurotransmitter-gated ion channels,
there are 27 genes in the fly, 81 in the worm, and
none in yeast. In summary, hundreds of gene families,
with very different membership sizes, characterize the
various evolutionary lineages.
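The arithmetic behind the Haemophilus influenzae figures can be checked with a minimal sketch. The function names below are illustrative only and do not come from the original text; the gene counts are the ones quoted above.

```python
def distinct_families(total_genes, paralogs):
    """Number of distinct gene families when `paralogs` genes are duplicated copies."""
    return total_genes - paralogs


def percent(part, whole):
    """Express `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Figures quoted in the text for Haemophilus influenzae.
families = distinct_families(1709, 284)
paralog_share = percent(284, 1709)
print(families, round(paralog_share, 1))
```

On these numbers, duplicated genes make up roughly one-sixth of the H. influenzae gene set, well below the figure of nearly 50% quoted for E. coli.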
Pseudogenes
In addition to duplicated gene products that are functional, many metazoan genomes are littered with
pseudogenes, duplicated copies that have become
inactivated. For example, while there is only a single
functional copy of the glyceraldehyde 3-phosphate
dehydrogenase (GAPDH) gene in humans, mice,
and rats, there are 10 to 30 nonfunctional GAPDH
pseudogenes in humans and more than 200 in mice and
rats. In the completely sequenced human chromosome 22, there are estimated to be 545 genes and 134
pseudogenes. Furthermore, at least half of the human
olfactory GPCRs are pseudogenes. In the worm, at
least 300 of the 1100 GPCRs are pseudogenes. In yeast
and bacteria, on the other hand, pseudogenes are rare
(usually less than 1%).
Orphan Genes
The most surprising result that has emerged from all
completely sequenced genomes, be they from bacteria, eukaryotes, or metazoans, is that irrespective of
their genic content, at least 20 % of the genes (and
sometimes much more) are ORFans. ORFans are
genes whose protein products have no clear sequence
similarities to proteins encoded from their own genome or to any other protein in existing public databases.
For example, even in Mycoplasma genitalium with
only 470 genes, 120 are ORFans of totally unknown
origin or function. In yeast and the fly, the figures are in
excess of 25 %. Whether ORFans constitute an irreducible core of genes that evolve rapidly and whose protein products can maintain old functions, or acquire
new ones, is not yet clear. They remain a mystery.
Gene Sizes
A comparison of the partially characterized 400 Mb
genome of the puffer fish with that of the 3300 Mb
human genome is intriguing, particularly since these
two organisms are believed to contain the same number of genes. When the human dystrophin, utrophin,
and Huntington genes are compared with their puffer
fish homologs, it is found that the human genes are
2500, 1000, and 170 kb in length, respectively, whereas
their puffer fish homologs are only 200, 100, and
Summary
Eukaryote genome organization has been
dominated by a mixture of whole-genome as well as
piecemeal genome duplications. Thus the genes presently constituting the mammalian lineage likely stem
from a combination of whole genomic amplifications
and subsequent reductions of a much smaller genome.
Layered on top of this are the local and dispersed
duplicative processes that have resulted in the expansion and contraction of individual protein coding
families, as well as the expansion and contraction of
noncoding tandemly repetitious and transposable
families. Layered on top of this again are the molecular
processes that generate larger proteins with a greater
combinatorial complexity of protein domains. Finally,
it is clear that much of the variation that is seen in
present-day genomes, particularly in the localized heterochromatic and transposable element compartments
of the genome, is essentially the flotsam and jetsam of
genomic turnover events. Most of these processes have
little effect on phenotype in the short term.
Further Reading
Genome Relationships:
Maize and the Grass Model
K M Devos and J L Bennetzen
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1668
Gene sequences have remained highly conserved during evolution. Hence, a single set of complementary
DNAs (cDNAs), i.e., sequences derived from transcribed genes, can be used as hybridization probes
across a range of related species to construct comparative genetic maps, outlining their genome relationships. Comparative mapping within the grass family
(Poaceae), including the major cereals rice, maize, and
wheat, has demonstrated that both gene content and
orders have remained highly conserved during 60 million years of evolution. Thus, it is possible to describe
each grass genome, irrespective of its genome size or
chromosome number, by its relationship to a single
reference genome, rice. These relationships can be
depicted by a series of concentric circles with the
inner and outer circles representing the smallest
and largest genomes in the comparison, respectively
(Figure 1). Within a genome, chromosomes are
ordered so that a minimum number of rearrangements
are needed in the overall comparison. Corresponding
genes across the species can be found on the radii.
Maize (2n = 20; C = 2.5 pg), a species belonging to
the subfamily Panicoideae, originated about 16 to 11
million years (My) ago through the hybridization of
two diploid ancestors and subsequent diploidization
(Gaut and Doebley, 1997). The ancient tetraploid origin of the maize genome is revealed in the comparative
maps. Each of two sets of five maize chromosomes
(1, 2, 3, 4, 6 and 5, 7, 8, 9, 10) corresponds to a complete
rice genome, albeit with a different order of the rice
linkage blocks (Figure 1). Some rearrangements relative to the rice genome are common to both genomes
(indicated by red arrows in Figure 1), and also extend
to other Panicoideae species. These chromosomal
mutations provide information on species' relationships and evolution. However, the rate at which
rearrangements occur and are fixed may be species-specific, and thus dependent on the genome structure
Figure 1 (See Plate 20) Consensus map showing the relationship at the map level between the rice, foxtail millet, sorghum, and maize genomes. Arrows indicate
rearrangements; red arrows indicate rearrangements that are common to species within a taxonomic group; C, centromere positions; pink circles, location of
orthologous Adh genes in sorghum and maize. In the detailed comparison of the orthologous Adh regions of sorghum (green) and maize (yellow) at the DNA
sequence level, arrows indicate the location and predicted transcriptional orientation of the identified genes; conserved sequences in sorghum and maize are
connected by gray shading.
[Figure 1 key: retrotransposon blocks.]
was unexpected and suggests a high rate of genome
rearrangements in the lineage leading to Arabidopsis.
Although this needs to be confirmed by further comparative data, one can speculate that the extensive
duplication of the Arabidopsis genome (Bancroft,
2000; Blanc et al., 2000) may have contributed to its
faster evolution. Gene duplication, and subsequent
divergence of the two copies may also be an important
mechanism through which species acquire new gene
functions.
In conclusion, comparative genome analyses have
demonstrated that gene orders have remained conserved during 60 My of evolution, both at the map and
at the DNA sequence level. The wealth of information
provided by the integrated grass maps can now be
exploited to enhance our knowledge of both well-studied major cereals and under-resourced orphan
crops.
References
Genome Size
J Hodgkin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0557
References
Genomic Library
W C Nierman and T V Feldblyum
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0559
History
Genomic libraries were constructed in the early days
of the development of recombinant DNA technology
in the mid to late 1970s. The libraries were the source
of clones for the analysis of genes of interest. The first
libraries were constructed using partial restriction
digests as the means for fragmenting the genomic
DNA in a way that generated overlapping fragments
of length suitable for cloning. Such fragments were
cloned into plasmid vectors by Clark and Carbon
(genomic libraries of Escherichia coli and Saccharomyces cerevisiae DNA). Maniatis constructed genomic libraries of Drosophila, rabbit, and human using a
bacteriophage lambda vector.
Procedures were developed for the rapid screening
of these libraries for sequences of interest based on
sequence similarity to a labeled nucleic acid probe.
Colony hybridization for screening plasmid libraries
and plaque hybridization for screening bacteriophage
lambda libraries revealed which clones contained
DNA sequences with identity or very high similarity
to the sequence of the probe. In these procedures DNA
from the high-density plates of colonies or plaques
was transferred to solid hybridization membranes,
initially nitrocellulose and subsequently various
formulations of modified nylon. The initial pattern
of colonies or plaques on the plates was preserved on
the membrane. A DNA fragment serving as a probe
was typically labeled with the 32P isotope of phosphorus. The membrane was hybridized to the probe
in solution and after washing away the unhybridized
probe, it was exposed to X-ray film to reveal those
colonies or plaques that contained DNA identical or
similar to the probe. Based on the location of the hybridizing signal on the membrane, the corresponding
colonies or plaques were recovered for further analysis.
Recombinant DNA technology led to the explosive development of molecular genetics as numerous
genes of biological and medical importance were isolated from genomic libraries, characterized, and their
products expressed in E. coli, other bacterial species,
S. cerevisiae, and insect and mammalian cultured cells.
The plan to map and sequence the human genome
emerged as the Human Genome Project in the late
1980s, bringing genomic libraries into this new application. Up to this point the libraries were the source of
clones for studying individual genes or sequences. For
whole genome scale analysis, the properties required of
genomic libraries were more rigorous. Three characteristics emerged as requirements for libraries used
for genome projects. These characteristics (large cloning capacity, stable propagation of insert DNA, and
curtailing of chimeric inserts) are critical features of
vectors and library construction protocols for genome
mapping and sequencing. For these applications cosmids, yeast artificial chromosome vectors (YACs),
bacteriophage P1 vectors, P1 artificial chromosome
vectors (PACs), and bacterial artificial chromosome
vectors (BACs) have been developed and used.
The decade of the 1990s has brought a focus on high
throughput genomic DNA sequencing of many species including the human, the laboratory mouse, the
roundworm Caenorhabditis elegans, plants, including
Arabidopsis thaliana, rice and potato, and numerous
species of bacteria. A critical component of all of these
projects is the construction of genomic libraries from
either the entire genome or from a large insert genomic clone such as a BAC. These libraries are constructed by shearing the genomic DNA to randomly
generate overlapping fragments of the appropriate
size. A fraction of the sheared DNA is then selected
by size for construction of the library. The resultant
library has a narrow range of insert sizes and clones
are randomly selected from the library for sequencing
from both ends (shotgun sequencing). The insert size
of the clones being sequenced is typically 2 kb and the
average sequence read length is about 650 bases. The
randomly collected sequence reads are then assembled
into the original molecule using assembly software,
which will find overlaps in the sequence reads to
accomplish the assembly process.
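The idea of assembling reads by finding overlaps can be sketched with a toy greedy merger. This is only an illustration under simplifying assumptions (error-free reads, a single strand, no repeats longer than the minimum overlap); it is not the algorithm used by any particular assembly program, and the function names and the `min_len` parameter are hypothetical.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no such overlap of at least `min_len` bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=5):
    """Repeatedly merge the pair of reads with the longest overlap.

    Returns the list of contigs left when no overlaps of at least
    `min_len` bases remain.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads wholly contained in another read.
    contigs = [r for r in contigs if not any(r != s and r in s for s in contigs)]
    while len(contigs) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # remaining contigs are separated by gaps
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Toy "genome" and four overlapping 10-base reads given in scrambled order.
genome = "ATGGCGTACCTTAGACGGATCAAGT"
reads = [genome[10:20], genome[0:10], genome[15:25], genome[5:15]]
print(greedy_assemble(reads, min_len=5))
```

Real reads contain sequencing errors and come from both strands, and real genomes contain repeats, which is why production assemblers are far more elaborate than this sketch.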
The features of the vectors and library construction
protocols for shotgun DNA sequencing are very rigorous due to the high cost of sequencing, the technical
challenge of the assembly of shotgun sequence reads,
and the cost of closing the remaining gaps after the
assembly. The libraries need to have a very low incidence of clones that contain no inserts, or the
no-insert-containing clones need to be readily identifiable and excluded from the sequencing pipeline. The
libraries should have a narrow insert size range. This
allows the assembly software to use the distances
between the sequences obtained from each end of the
clone. The libraries cannot contain chimeric DNA
inserts as these will confound the assembly process.
The libraries need to be as truly random as is technically achievable to minimize the number of gaps in the
sequence after the assembly of the sequence reads.
This requires that the insert DNA be sheared instead
of the more traditional technique of partial restriction
digestion to reduce the size of the DNA fragments to
that required for cloning.
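How a narrow insert size range helps assembly can be illustrated with a minimal consistency check on the two end reads of one clone. The function name, the coordinate convention, and the 20% tolerance are assumptions made for this sketch; the 2 kb default follows the typical insert size mentioned earlier.

```python
def mate_pair_consistent(left_pos, right_pos, insert_size=2000, tolerance=0.2):
    """Return True if two end reads from one clone lie about one insert apart.

    `left_pos` and `right_pos` are the assembled coordinates (in bases) of the
    outer ends of the two reads. `tolerance` is a hypothetical allowed
    fractional deviation from the expected insert size.
    """
    span = abs(right_pos - left_pos)
    return abs(span - insert_size) <= tolerance * insert_size


print(mate_pair_consistent(1000, 3100))  # ends 2100 bases apart
print(mate_pair_consistent(1000, 9000))  # ends 8000 bases apart
```

A pair whose ends are placed far from the expected distance suggests a misassembly or a chimeric insert, which is one reason chimeric clones are so damaging to the process.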
Plasmid Vectors
Plasmids are small extrachromosomal circular double-stranded DNA molecules that replicate independently
of the chromosome or chromosomes of a microorganism. Their copy number in the cell is maintained by
control systems built into the plasmid's gene content
but varies depending on the replication system of the
plasmid. For example, pUC-based plasmid vectors are
maintained at a copy number of 500-700 per cell.
Plasmid vectors derived from the E. coli F plasmid
are rigorously maintained at a copy number of one.
Naturally occurring bacterial plasmids have been
engineered to serve as vectors for the propagation of
exogenous DNA fragments. To serve as a vector the
plasmid must have in addition to a replication system,
a selectable marker (typically an antibiotic resistance
gene) and a cloning site, a unique restriction site in a
nonessential region of the plasmid for the insertion of
the exogenous insert DNA.
One of the historical limitations on the use of plasmids for genomic libraries was the inefficient procedure of chemical transformation for transferring the
recombinant plasmid DNA constructs into E. coli. For
library construction, that technique has been replaced
by the use of very high efficiency electroporation. A
high-voltage electric field is applied briefly to cells, producing transient holes in the cell membranes through
which plasmid DNA enters. Electroporation allows
for the efficient transfer of plasmid DNA as large as
200 kb into cells.
[Figure: construction of a plasmid library. Genomic DNA undergoes DNA fragmentation, the plasmid undergoes a restriction reaction, and the two are joined by ligation and introduced into cells by transformation.]

Table 1

Vector                  Cloning capacity (kb)
Plasmids                0.1–12
Bacteriophage lambda    10–20
Cosmids                 35–45
Bacteriophage P1        30–90
BACs                    30–300
YACs                    100–1000
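As a rough illustration of how the capacities in Table 1 guide the choice of vector, the following Python sketch lists the vector classes whose capacity range covers a desired insert size. The ranges are transcribed from Table 1 on the assumption that each pair of numbers is a lower and upper bound in kilobases (for example, 0.1 to 12 kb for plasmids); the function name `vectors_for_insert` is hypothetical and is used only for this example.

```python
# Cloning capacities in kb, as read from Table 1. The bounds are assumed to be
# (minimum, maximum) insert sizes and should be treated as approximate.
CLONING_CAPACITY_KB = {
    "Plasmids": (0.1, 12),
    "Bacteriophage lambda": (10, 20),
    "Cosmids": (35, 45),
    "Bacteriophage P1": (30, 90),
    "BACs": (30, 300),
    "YACs": (100, 1000),
}


def vectors_for_insert(insert_kb):
    """Return the vector classes whose capacity range includes insert_kb."""
    return [
        name
        for name, (low, high) in CLONING_CAPACITY_KB.items()
        if low <= insert_kb <= high
    ]


for size in (5, 40, 150):
    print(f"{size} kb insert:", vectors_for_insert(size))
```

Several vector classes can accommodate the same insert size, so the choice among them depends on the other considerations discussed in the text, such as copy number and the incidence of chimeric clones.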
[Figure: construction of a bacteriophage lambda library. Vector DNA is cut with a restriction endonuclease into a left arm, internal fragments, and a right arm; the internal fragments not essential for phage replication are removed; fragmented genomic DNA is ligated between the left and right arms; the inserts are packaged into bacteriophage, which are used to infect E. coli cells.]
to prevent vector-to-vector ligation, which would reduce the
efficiency of the in vitro packaging reaction. After the
removal of the stuffer fragment, the remaining arms
are sufficiently small that, if ligated together without
an insert, they will not be packaged. This prevents the
propagation of vectors without inserts.
Cosmid Vectors
Cosmid vectors are prepared in much the same manner as plasmids. The cloning site sticky ends are generated by digestion with a restriction enzyme and a
phosphatase is used to remove the 5′ phosphates from
the vector to prevent vector-to-vector ligation. The
insert fragments must be size selected so that the
in vitro packaging of the DNA into bacteriophage
lambda particles occurs efficiently.
The P1 cloning system was developed by Nat Sternberg for use in large genome mapping and sequencing
projects. P1 vectors are much like cosmids in that they
are plasmids that can be packaged in a phage particle
for efficient injection into E. coli. The bacteriophage
P1 phage head can hold about 110 kb of DNA. The
vectors designed for use with the P1 packaging system
are up to about 30 kb in size so that the cloning capacity
expended in preventing the formation of BACs without inserts in the library construction process.
The vector molecules are digested with the appropriate restriction enzyme and the 5′ phosphates
removed by treatment with a phosphatase. Linear
vector molecules of the correct size are obtained by
recovery from an agarose gel after sizing by electrophoresis. The recovered linear vector molecules are
then self-ligated in bulk to form multimers from
those molecules still retaining the 5′ phosphates.
These ligation products are removed by another
round of agarose gel electrophoresis and size selection
for the linear BAC monomer.
At this point aliquots of the vector are self-ligated
and ligated with a test insert in separate reactions. The
reaction products are electroporated into E. coli and
the colony counts compared. The vector-only electroporation should yield few to no colonies, and a large
number of colonies should be obtained with the test
insert. If the vector-only electroporation does give a
substantial number of colonies, then another round
of self-ligation and size purification is required. Once
the desired result is obtained, the vector is ready to
receive inserts.
Yeast artificial chromosomes (YACs) provide the largest insert capacity of any cloning system. This system,
developed by Burke and Olson in 1987, supports the
propagation of exogenous DNA segments hundreds
of kilobases in length. YACs representing contiguous
stretches of genomic DNA (YAC contigs) have
provided a physical map framework for the human,
mouse, and even Arabidopsis genomes.
The YAC vector itself provides the essential elements for propagation of DNA as a chromosome in
the yeast Saccharomyces cerevisiae. These elements
include a yeast centromere, two functional telomeres,
and auxotrophic markers for selection of the YAC in
an appropriate yeast host. A problem encountered in
constructing and using YAC libraries is that they
typically contain clones that are chimeric, i.e., contain
DNA in a single clone from different locations in the
genome.
This results in some specific fragments that are too big
or too small to be cloned in a particular vector. There are
two approaches to the fragmentation of genomic DNA
for the construction of libraries. One is by partial
restriction digestion and the other is by physical shearing. Both methods are reviewed below.
Further Reading
Genomics
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.2131
Genotype
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0561
Genotypic Frequency
A Clark
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0562
Numerical Example
If a sample of genotypes from a population consists of
16 AA, 48 Aa, and 36 aa, then the frequency of genotype AA is 16/100 = 0.16. Similarly the frequencies of
Aa and aa are 0.48 and 0.36, respectively. Note that the
sum of the frequencies of all genotypes is 1. Notice
also that these are estimates of the genotypic frequencies. Out of the entire population, the true frequency
of genotype AA may be slightly different from 0.16.
We can estimate our statistical confidence in the genotypic frequency estimate by assuming that the sampling was done by randomly drawing individuals from
the population. Under this kind of sampling, the variance of the estimated frequency of a genotype with true frequency x is approximately
x(1 − x)/n, where n is the sample size. It should be
clear that the larger our sample is, the smaller will be
this variance, and the better will be our estimate of
genotypic frequency.
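The calculation in the numerical example can be sketched in a few lines of Python. The counts (16 AA, 48 Aa, 36 aa) and the variance approximation x(1 − x)/n come from the text; the function names and the reporting of a standard error (the square root of the variance) are additions made here for illustration.

```python
import math


def genotype_frequencies(counts):
    """Convert genotype counts into frequencies (count divided by sample size)."""
    n = sum(counts.values())
    return {genotype: count / n for genotype, count in counts.items()}


def sampling_variance(x, n):
    """Approximate variance of an estimated frequency x from a random sample of size n."""
    return x * (1 - x) / n


counts = {"AA": 16, "Aa": 48, "aa": 36}
n = sum(counts.values())
freqs = genotype_frequencies(counts)

for genotype, x in freqs.items():
    var = sampling_variance(x, n)
    # The standard error is the square root of the sampling variance.
    print(f"{genotype}: frequency={x:.2f} variance={var:.6f} SE={math.sqrt(var):.4f}")
```

Because n appears in the denominator, quadrupling the sample size halves the standard error of each estimate.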
Hidden Variation
Whenever we determine what genotypes individuals
have, we nearly always restrict attention to one or a
few genes. Calculation of the genotypic frequencies of
the genes we observe is done in the same way, whether
we score only the one gene or many other genes. There
will always be hidden or unobserved variation lying
within each of the genotypic classes. Another kind of
hidden variation that makes calculation of genotypic
frequencies difficult is dominance. The simplest kind
of dominance occurs when genotypes AA and Aa both
have the same phenotype. In this case we cannot
directly count up the genotypes, so only indirect estimation of genotype frequencies is possible. In this
case, it would be necessary to make some additional
assumptions about the population before it would be
possible to estimate genotypic frequencies. In this
example, if we were willing to assume that the population is in Hardy–Weinberg equilibrium, then we could
estimate the frequencies of AA and Aa from the frequencies of the A and a alleles.
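A minimal sketch of this indirect estimate, assuming complete dominance of A, random mating, and Hardy–Weinberg proportions (p², 2pq, q²): the frequency q of allele a is taken as the square root of the observed frequency of the recessive aa class, and the frequencies of AA and Aa follow from p = 1 − q. The observed value 0.36 is borrowed from the numerical example above purely for illustration, and the function name is hypothetical.

```python
import math


def hardy_weinberg_from_recessive(freq_aa):
    """Estimate allele and genotype frequencies from the recessive class alone.

    Assumes Hardy-Weinberg proportions, so that freq_aa = q**2.
    """
    if not 0.0 <= freq_aa <= 1.0:
        raise ValueError("freq_aa must lie between 0 and 1")
    q = math.sqrt(freq_aa)  # frequency of allele a
    p = 1.0 - q             # frequency of allele A
    return {"p": p, "q": q, "AA": p * p, "Aa": 2 * p * q, "aa": q * q}


estimate = hardy_weinberg_from_recessive(0.36)
for name, value in estimate.items():
    print(f"{name}: {value:.2f}")
```

The estimate is only as good as the equilibrium assumption; if the population departs from Hardy–Weinberg proportions, the split of the dominant class into AA and Aa will be biased.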
See also: Allele Frequency; Hardy–Weinberg Law
Germ Cell
T Schedl
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0563
Germ cells are a central component of sexual reproduction in animals. They are the route by which the
genome and cytoplasmic components are transferred
to the next generation. This route utilizes meiosis and
gametogenesis, processes that are unique to germ cells.
Germ cells differentiate to produce male and female
gametes, sperm and unfertilized eggs (oocytes or ova),
and undergo meiosis to produce a haploid set of
chromosomes. Haploid gametes then unite to form a
diploid zygote that develops into a new individual.
Germ-cell-mediated sexual reproduction thus creates
genetic diversity, which is essential for evolution:
meiosis and gamete fusion generate offspring that
are genetically dissimilar from each other and distinct
from either parent. In many animals, there is a germline lineage, composed of germ cells that will form
gametes, and a somatic lineage, containing the majority
of cells, which form the rest of the organism (tissues
Figure 1 Cycle of the germline. Soon after formation of the diploid zygote, germ cells become specified as distinct
from somatic cells that will give rise to the rest of the organism. These primordial germ cells migrate and then
interact with specific somatic cells to form the gonad. Germ cells proliferate and then initiate meiotic development
(enter meiotic prophase). The timing of proliferation and entry into meiotic prophase depends on the species and
the sex. For example, in female mammals all germ cells have entered meiotic prophase prior to birth, while in male
mammals, proliferation and entry into meiotic prophase is continuous in sexually mature animals. Reciprocal
recombination between homologs occurs during meiotic prophase I. Many of the activities of gametogenesis occur
contemporaneously with late stages of meiotic prophase I. For male germ cells, progression through meiotic
prophase I and the divisions of meiosis I and meiosis II occur without pause. Female germ cells of most species arrest
late in meiotic prophase. Following external signals, the oocyte matures and progresses through meiosis I. A number
of species have a second arrest point (e.g., vertebrate oocytes arrest in meiosis II). Fertilization relieves the arrest,
resulting in the completion of meiosis and the initiation of a new round of zygotic development.
such as gut, limbs, etc.). One can take the view that the
raison d'être for an organism's somatic cells is to facilitate the function of the germline so that its genetic
material is passed to the next generation. Features
of germ cells and their development, which give them
their unique character, are described below in general
terms and with specific organismal examples.
Germline Development
The development of germ cells is similar among animals (Figure 1), although the details often differ
between species and between sexes of the same species. Germ cells are usually separated from somatic
cells in early development. In a number of nonmammalian species, cytoplasmic 'germ plasm' in the unfertilized egg (also called pole plasm, germinal granules,
or P-granules, depending on the organism) may specify the germ cell identity. In the fruit fly Drosophila,
the nematode Caenorhabditis elegans, and various
amphibians, embryonic cells that contain 'germ plasm'
usually develop as germ cells. The 'germ plasm' is
prelocalized in the Drosophila oocyte or becomes
asymmetrically segregated to certain blastomeres
during cleavage divisions in C. elegans. By contrast,
Meiosis
The process by which diploid germ cells produce
haploid gametes that contain only one of each homologous chromosome is called meiosis. Following the
last mitotic division, germ cells initiate meiosis/gametogenesis, undergoing a period of DNA synthesis
such that both the maternal and paternal homologous
chromosomes are duplicated, resulting in each containing two sister chromatids. Prior to the first meiotic
division, the 4n germ cells are in prophase (prophase I)
Gametogenesis
The sperm and egg are highly specialized for their
different tasks. Sperm are small, highly motile and
efficient in the process of fertilization. In many species, sperm also provide the centrioles necessary for
zygotic development. The egg is very large, ranging
from a thousand- to more than a millionfold the mass
of a typical somatic cell, depending on the species. The
egg supplies organelles (e.g., mitochondria), nutrients,
precursors, RNAs, proteins, and a protective covering
or shell. The stored RNAs and proteins provide
the materials necessary to direct embryogenesis until
expression from the zygotic genome is initiated. For
many nonmammalian species, the unfertilized egg
contains molecular determinants that are either prelocalized or become localized following fertilization,
which provide polarity information for the developing embryo. In addition, for nonmammalian species,
the stored materials allow embryogenesis to occur
externally without further support from the mother.
In the production of gametes, the nuclear events of
meiosis and the cellular differentiation of oogenesis
and spermatogenesis are intimately intertwined. Much
of the RNA and protein synthetic activity necessary
for gametogenesis occurs in pachytene and diplotene.
For oocytes, the massive growth usually occurs in
diplotene. The large accumulation of material in the
oocyte is also often assisted by somatic gonad cells
(follicle cells) and, for many invertebrates, can also be
aided by other germ cells called nurse cells. For many
nonmammalian species, yolk is synthesized outside
the ovary and is transported to growing oocytes. Spermatogenesis usually occurs continuously, without
arrest, in reproductively mature males. The meiotic
divisions produce four haploid spermatids, of equal
size, which then undergo extensive postmeiotic differentiation to produce mature spermatozoa. Oogenesis
has a number of features that are distinct from spermatogenesis. To generate the large size of the egg, the
meiotic divisions are unequal; MI generates a large
diploid oocyte (often called the secondary oocyte)
and a small first polar body and MII produces a large
haploid egg and a small second polar body. Oogenesis
is often arrested in prophase to allow oocyte growth
and to provide a means of regulating egg release. In
most vertebrates, the prophase arrest is in diplotene.
The release from prophase arrest (called meiotic
maturation) is regulated by external cues (e.g., hormonal signals from the menstrual or estrus cycle). In
many vertebrate species, there is a second arrest at
metaphase of MII. Following ovulation, where the
egg is discharged from the ovary, fertilization releases
the arrest resulting in the completion of meiosis and
the initiation of zygotic development. The point in
Further Reading
(phytohemagglutinin; concentration determined by a
dose–response curve for each batch of PHA). The
cultures are incubated at an approximately 45° angle
for 43 h at 37 °C in a shaking water bath. Colchicine
(0.15 ml of a 50 mg ml⁻¹ solution) is added to each culture for the last 15–20 min. Cells are harvested by centrifugation, resuspended in hypotonic 0.56% (0.075 M)
potassium chloride for 15 min, centrifuged, and fixed in methanol:glacial acetic acid (3:1). After 30 min
the cells are centrifuged and resuspended in three sequential washes of the methanol:glacial acetic acid fixative.
The method of slide preparation is important
because well-spread metaphases are critical for high
quality G-banded preparations. Precleaned slides are
soaked in fixative at least 15 min prior to use. Air-dried
metaphases are prepared by dropping a few small
drops of cell suspension onto a precleaned slide,
allowing it to spread, and then rapidly blowing dry
when the drop begins to contract and rainbow colors
appear at the edges. Some cytogeneticists believe
spreading is improved by dropping a very small drop
of clean fixative onto the preparation just as it starts to
dry and allowing the slide to dry in a horizontal position. G bands appear sharper if slides are aged at room
temperature for 7–10 days.
To prepare G-banded chromosomes, slides are incubated in Coplin jars (no more than five to six per jar) in 2×
SSC at 60–65 °C for 1.5 h and transferred to 0.9% NaCl at
room temperature; each slide is then rinsed individually in fresh 0.9% NaCl and drained. Thorough
rinsing is critical. Slides are stained for 5–7 min in
a trypsin–Giemsa solution (1.0 ml Gurr improved
Giemsa R66, 45 ml Gurr pH 6.8 phosphate buffer,
4 drops 0.0125% trypsin), then transferred to Gurr
phosphate buffer diluted 1:1 with distilled water. Finally,
slides are rinsed individually in two changes of the buffer–distilled
water solution and blown dry. Factors that
influence chromosomal response to trypsin treatment
and, therefore, G-band quality, include chromosome
length (contracted chromosomes are more sensitive
than elongated ones), chromosome dryness (recently
made preparations are more sensitive than aged ones),
and chromosome fixation time (sensitivity is inversely
proportional to fixation time or chromosome hardness).
Further Reading
Akeson EC and Davisson MT (2000) Analyzing mouse chromosomal rearrangements with G-banded chromosomes. In:
Jackson I and Abbott C (eds) Mouse Genetics and Transgenics:
A Practical Approach Series, 2nd edn, pp. 144–153. Oxford:
Oxford University Press.
Committee on Standardized Genetic Nomenclature for Mice
(1972) Standard karyotype of the mouse, Mus musculus.
Journal of Heredity 63: 69–71.
Lyon MF, Rastan S and Brown SDM (eds) (1996) Genetic Variants
and Strains of the Laboratory Mouse, 3rd edn. Oxford: Oxford
University Press.
References
Cowell JK (1984) A photographic representation of the variability in the G-banded structure of the chromosomes in the
mouse karyotype. Chromosoma 89: 294–320.
Davisson MT and Akeson EC (1987) An improved method for
preparing G-banded chromosomes from mouse peripheral
blood. Cytogenetics and Cell Genetics 45: 70–74.
Gilbert, Walter
W C Summers
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0565
Walter Gilbert (1932– ), an American molecular biologist, was born 21 March 1932 in Boston, Massachusetts. He was educated at Harvard University and the
University of Cambridge, receiving the PhD with a
thesis on particle physics in 1957. He did postdoctoral
work in physics at Harvard, and in 1959 joined the
Physics faculty at Harvard. In the summer of 1960 he
joined James Watson and François Gros in Watson's
laboratory in research on messenger RNA. This initial
exposure to molecular biological research redirected
his career from theoretical physics to molecular biology, where he has made his major scientific contributions, and he subsequently transferred to the faculty in
Biochemistry and Molecular Biology at Harvard. In
1982 he left Harvard to head the Swiss biotechnology
company, Biogen, but returned to Harvard in 1984.
Among his many honors, Gilbert received the Nobel
Prize in Chemistry in 1980, sharing it with Frederick
Sanger and Paul Berg.
His early research focused on the utilization of
mRNA and the mechanisms of protein synthesis,
especially the relationships between the messenger
RNA, the ribosome, and the transfer RNA. In the
mid-1960s, Gilbert and Benno Müller-Hill isolated
the protein that functions as the repressor of the lactose operon in Escherichia coli, the first example of a
genetic control element. This work led to his investigation of the physical basis of gene regulation by study
of the interaction of the lac repressor with RNA polymerase and fragments of DNA. In 1968 Gilbert and
David Dressler proposed the 'rolling-circle model'
for DNA replication which gave the first clear indication as to how certain small phages might replicate
Glioma
V P Collins
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1575
The globin genes determine the structure and synthesis of the globin chains that constitute the different
hemoglobins that are produced in the human embryo,
fetus, and adult. Human beings make different hemoglobins as they develop as an adaptive response to the
variation in oxygen requirements between embryonic,
fetal, and adult life.
All the normal human hemoglobins have the same
basic structure. They consist of two different pairs of
globin chains, that is, long strings of amino acids which
fold into a complex three-dimensional structure. Each
of the four globin subunits that make up a hemoglobin molecule has a heme group, the oxygen-carrying moiety, embedded in its surface. The different
globin chains are named after letters of the Greek
alphabet. Adult and fetal hemoglobins have α chains
associated with β (hemoglobin A, α2β2), δ (hemoglobin A2, α2δ2) or γ chains (hemoglobin F, α2γ2),
whereas in the embryo, embryonic α-like chains called
ζ chains combine with γ (hemoglobin Portland, ζ2γ2)
or ε chains (hemoglobin Gower 1, ζ2ε2), and α and ε
chains combine to form hemoglobin Gower 2 (α2ε2).
The embryonic hemoglobins are so-called because
they were first characterized at University College
Hospital in Gower Street, London, and in Portland,
Oregon.
Since each globin peptide chain is the product of a
gene locus, it follows that there must be α, β, γ, δ, ε,
and ζ globin genes.
[Figure 1: the globin gene clusters on chromosome 16 and chromosome 11, and the hemoglobins they produce in the embryo (Hb Gower 1, Hb Portland, Hb Gower 2), the fetus (HbF), and the adult (HbA, HbA2).]
Both clusters contain two genes which are duplicated; there are two α chain genes, α2 and α1, and two
γ chain genes, Gγ and Aγ. The G and A refer to the amino
acids glycine and alanine; the products of the two γ
genes are identical except at amino acid residue 136, at
which one contains glycine and the other alanine. The
products of the pair of α genes are identical. The other
feature of these clusters is the presence of pseudogenes,
which are given the prefix ψ. They are thought to be
evolutionary remnants of once-active globin genes.
Further Reading
Glucose 6-Phosphate Dehydrogenase (G6PD) Deficiency
L Luzzatto
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1520
Evolutionary Genetics
G6PD is very ancient in an evolutionary context: it is
found in all organisms except in some of the Archaea
that live in anaerobic environments and some intracellular microorganisms that seem to be able to exploit
the G6PD activity of their respective host cells. The
G6PD sequence shows evidence of conservation
throughout all living phyla, with some regions being
identical in disparate organisms, for example, the
active center (which includes the G6P binding site)
and the NADP binding site.
G6PD Deficiency
Investigations of patients who developed acute hemolytic anemia upon exposure to certain antimalarial
drugs revealed, in 1956, that their red blood cells had
a markedly reduced G6PD activity, and that this trait
was inherited. Thus, G6PD deficiency emerged as the
first example of a blood cell disease caused by a specific enzyme abnormality. It quickly became apparent
that this inherited abnormality predisposes to hemolysis in response to several other factors. The wide
range of factors that can trigger hemolysis in G6PD-deficient subjects is related to the fact that all of them
impose an oxidative stress on red cells. The response
to this type of stress involves, in particular, glutathione
(GSH). Since G6PD activity is rate-limiting for regeneration of GSH, normal red cells can withstand such
stress, but G6PD-deficient red cells succumb. G6PD
deficiency is due to mutations in the G6PD gene.
There are some 130 mutations known to date: all of
them are in the coding region, and almost all of them
are point mutations causing single amino acid replacements. In most cases of G6PD deficiency the activity
of the enzyme in red cells is reduced to about 10–20%
of normal activity; in some cases it may be as low as
1–2%. However, there is always some residual activity. In a few instances these amino acid replacements
may affect the catalytic function of the enzyme, but in
the majority of cases they cause G6PD deficiency
because they cause the protein to become unstable.
The absence of large deletions, frameshifts, or nonsense mutations supports the notion that complete
G6PD deficiency would be lethal. This notion has
been confirmed recently by targeted homologous recombination in mouse embryonic stem (ES) cells: when
'G6PD knock-out' ES cells are injected into blastocysts, heterozygous female mice can be obtained, but
hemizygous male mutants die in utero at about 10 days
of gestation.
Population Genetics
In many human genes pathogenic mutations are often
regarded as being in a different category from `polymorphisms.' In the case of G6PD it is quite remarkable
that many mutations, which are potentially pathogenic
because they cause G6PD deficiency, are also polymorphic. Indeed, these mutant genes have frequencies
of up to 10–20% and even greater in many human populations. Since the G6PD gene is X-linked, in any
population in which G6PD deficiency is common,
the frequency of G6PD-deficient hemizygous males
will be higher than that of G6PD-deficient homozygous females but lower than that of females heterozygous for G6PD deficiency. Interestingly, different
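The ordering of frequencies stated above for an X-linked gene can be checked with a short Python sketch. It assumes random mating and equal allele frequencies in the two sexes, so that a deficient allele of frequency q gives hemizygous males at frequency q, homozygous females at q², and heterozygous females at 2q(1 − q). The function name is hypothetical, and the allele frequencies 0.10 and 0.20 are taken from the range quoted in the text.

```python
def x_linked_genotype_frequencies(q):
    """Expected frequencies of deficient genotypes for an X-linked allele.

    q is the frequency of the deficient allele. Random mating and equal
    allele frequencies in males and females are assumed.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie between 0 and 1")
    p = 1.0 - q
    return {
        "hemizygous_males": q,               # males carry a single X
        "homozygous_females": q * q,         # both X chromosomes deficient
        "heterozygous_females": 2 * p * q,   # one deficient, one normal X
    }


for q in (0.10, 0.20):
    freqs = x_linked_genotype_frequencies(q)
    print(q, {name: round(value, 4) for name, value in freqs.items()})
```

Under these assumptions heterozygous females outnumber hemizygous males only while q is below 0.5, which holds for the allele frequencies quoted in the text.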
Clinical Genetics
As stated above, it was the clinical manifestation of
acute hemolytic anemia (AHA) that led to the discovery of G6PD deficiency; AHA can be triggered by a
variety of drugs, including antimalarials, aspirin, some
sulfa drugs, and some antibiotics such as nalidixic
acid. G6PD-deficient subjects can also develop AHA
in concomitance with a variety of infections, or after
ingestion of fava beans (see Favism, a well-characterized syndrome which in children is life-threatening).
The most important approach to these clinical problems is prevention, by helping people at risk to avoid
the offending agents. In cases of severe AHA blood
transfusion may be imperative. In addition, G6PD
deficiency can cause a predisposition to severe neonatal jaundice, which can result in long-term neurological damage. Phototherapy is sufficient in preventing
such damage in most cases, but exchange transfusion
may be required in severe cases.
A small proportion of patients with G6PD
deficiency present with a more severe disease, namely
chronic nonspherocytic hemolytic anemia (CNSHA),
even in the absence of any triggering agent. These
patients have anemia and jaundice, and may require
regular blood transfusion, which can bring about iron
overload and the need for iron chelation. The association of G6PD deficiency with CNSHA and with
AHA is an excellent example of genotype–phenotype
correlation. Indeed, not surprisingly, the mutations
Figure 1 Worldwide distribution of polymorphic variants of G6PD. The variants in each country are shown in order of prevalence according to these symbols:
U, Union; C, Canton; M, Mediterranean; A, A− (202A); k, Kaiping; t, Taipei; v, Viangchan; m, Mahidol; h, Chatham; l, Coimbra; p, Local variant;
S, Seattle; s, Santamaria; a, Aures; z, Cosenza; A, A− (968C). See Color Plate 9.
Glutamic Acid
Glycine
J Read and S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.2074
Figure 1 Glycine.
Glutamate.
Glutamine
J Read and S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.2080
Figure 1 Glutamine.
P Gresshoff
Further Reading
http://www.unitedsoybean.org/soystats
http://www.ag.uiuc.edu/*stratsoy/new/
Glycolysis
F K Zimmermann
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0570
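Figure 1 Glycolysis. Steps of the pathway, with the enzyme for each step in parentheses:
Glucose + ATP → glucose-6-phosphate + ADP (hexokinases, glucokinase)
Fructose + ATP → fructose-6-phosphate + ADP (hexokinases)
Glucose-6-phosphate ⇌ fructose-6-phosphate (phosphoglucose isomerase)
Fructose-6-phosphate + ATP → fructose-1,6-bisphosphate + ADP (phosphofructokinase)
Fructose-1,6-bisphosphate ⇌ glyceraldehyde-3-phosphate + DHAP (aldolase)
DHAP ⇌ glyceraldehyde-3-phosphate (triosephosphate isomerase)
Glyceraldehyde-3-phosphate + NAD+ + Pi ⇌ 1,3-bisphosphoglycerate + NADH + H+ (glyceraldehyde-3-phosphate dehydrogenase)
1,3-Bisphosphoglycerate + ADP ⇌ 3-phosphoglycerate + ATP (phosphoglycerate kinase)
3-Phosphoglycerate ⇌ 2-phosphoglycerate (phosphoglycerate mutase)
2-Phosphoglycerate ⇌ phosphoenolpyruvate + H2O (enolases)
Phosphoenolpyruvate + ADP → pyruvate + ATP (pyruvate kinase)
ATP, adenosine triphosphate; ADP, adenosine diphosphate; NADH, reduced nicotinamide adenine dinucleotide;
Pi, inorganic phosphate; DHAP, dihydroxyacetone phosphate.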
Heterooctameric phosphofructokinase consists of
two different subunits, with about 50% amino acid
identity (genes PFK1 and PFK2). Its activity is subject
to numerous effectors, most prominently the activator fructose-2,6-bisphosphate, which is generated by two 6-phosphofructo-2-kinases (genes PFK26 and PFK27).
A block in glycolysis requires deletion of PFK1 and
PFK2. A single deletion of PFK2, but not of PFK1,
slightly reduces growth on glucose. Double mutants
without PFK26 and PFK27 cannot form fructose-2,6-bisphosphate but grow normally on hexoses.
However, all these mutants have altered levels of
glycolytic metabolites.
Yeast aldolase (gene FBA1) belongs to the prokaryotic type II aldolases. Mutants deleted for FBA1 are
inhibited by glucose and grow very poorly on a mixture
of acetate and low amounts of galactose. The deduced
amino acid sequence of triosephosphate isomerase
Glycolytic enzyme
Mutation-associated
demonstrated or possible defects
Hexokinase I
Hexokinase II
Glucokinase
Phosphoglucose isomerase
Phosphofructokinase
Aldolase B
Triosephosphate isomerase
Glyceraldehyde-3-phosphate dehydrogenase
Phosphoglycerokinase
Phosphoglycerate mutase
Enolase I
Pyruvate kinase
Glycolysis in Humans
The genetics of glycolysis in humans is complicated
(1) by the presence of tissue and cell type-specific
References
Glycosylase Repair
J Laval
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0571
Grasses, Synteny,
Evolution, and Molecular
Systematics
E A Kellogg
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1728
Gravitropism in Arabidopsis
thaliana
P H Masson
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1681
Future Prospects
Further Reading
Group Selection
M J Wade
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0573
Basic Concepts
Natural selection occurs in any system whose members have the properties of replication, variation, and
heredity (Lewontin, 1970; Maynard Smith, 1976).
When the system consists of cells, the variation
among cell lineages in replication and death rates,
[Figure 1: recipients bear no cost of altruism but reap benefits from others.]
benefactors. Indeed, there are two components of variation in the frequency of benefactors: (1) among birds
within groups, and (2) among groups.
Each component of variation has a selective effect
or consequence for both individual and group replication. Within each group, benefactors experience reduced fitness (Figure 3). In group 1, the frequency
of benefactors declines from 0.80 to 0.78. Similarly, in
group 2, the frequency of benefactors declines from
0.20 to 0.17. Individual selection operating within
groups selects against the benefactors and their frequency declines as a consequence. The magnitudes
of the decline in benefactor frequency are 0.02 and
0.03, in groups 1 and 2, respectively. The total
decline in the frequency of benefactors by individual
selection is 0.027. This is the weighted average decline, where the weights are determined by the size of
the group relative to the total after individual selection.
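The weighted average described above can be sketched as follows. The within-group declines (0.02 and 0.03) are from the text; the relative group sizes after individual selection are not given in this passage, so the 3:7 ratio used below is purely a hypothetical choice, noted here only because any pair of relative sizes in that ratio yields the quoted total of 0.027. The function name is also hypothetical.

```python
def weighted_mean_decline(declines, relative_sizes):
    """Average the within-group declines, weighting each by relative group size."""
    if len(declines) != len(relative_sizes):
        raise ValueError("one relative size is needed per group")
    total = sum(relative_sizes)
    return sum(d * s for d, s in zip(declines, relative_sizes)) / total


declines = [0.02, 0.03]        # groups 1 and 2, from the text
relative_sizes = [0.3, 0.7]    # HYPOTHETICAL weights, not taken from the article

print(round(weighted_mean_decline(declines, relative_sizes), 3))
```

With equal weights the same function gives 0.025, which shows how strongly the total depends on the relative sizes of the groups.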
The among-group component of variation also has
a selective effect (Figure 4). A group with a high
frequency of benefactors has a higher growth rate
than a group with a lower frequency of benefactors.
This positive effect of benefactor frequency is the
opposite of the negative fitness effect of being a benefactor within a group. Group 1, with an initial frequency of 0.80 benefactors, increased in size from five
Group genetic structure is often characterized in hierarchical terms associated with the components of
genetic variation among individuals within groups
[Figures 3 and 4 (not reproduced): frequency of benefactors before and after selection within groups (Figure 3) and among groups (Figure 4).]
[Figures 5 and 6 (not reproduced): phenotypic values and allelic effects in a two-locus example. The recoverable values are given below.]
Phenotypic value
        aa      Aa      AA
bb      0.50    0.75    1.00
Bb      0.75    0.75    0.75
BB      1.00    0.75    0.50
Allelic effect, by genotype at the other locus
bb      Bb      BB
+0.25   0.00    -0.25
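The allelic effects in the figure labels can be recomputed from the table of phenotypic values. The sketch below assumes that the allelic effect is half the difference between the two homozygotes at one locus, evaluated separately in each genotype at the other locus. The function and variable names are illustrative only.

```python
# Phenotypic values from the two-locus table: keys are B-locus genotypes,
# list entries are A-locus genotypes in the order aa, Aa, AA.
PHENOTYPE = {
    "bb": [0.50, 0.75, 1.00],
    "Bb": [0.75, 0.75, 0.75],
    "BB": [1.00, 0.75, 0.50],
}

def allelic_effect(values):
    """Average change in phenotype per A allele added (half the AA - aa difference)."""
    aa, _, AA = values
    return (AA - aa) / 2

effects = {background: allelic_effect(v) for background, v in PHENOTYPE.items()}
print(effects)
```

The sign of the effect reverses between the bb and BB backgrounds, which is the kind of epistasis the table illustrates.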
References
Growth Factors
J K Heath
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0574
GSD
(Gerstmann-Straussler
Disease)
J Hodgkin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0575
GT Repeats
See: Microsatellite, CA Repeats
GT-AG Rule
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1852
GTP (Guanosine
Triphosphate)
E J Murgola
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0576
GDP + ATP ⇌ GTP + ADP
For the synthesis of deoxyguanosine triphosphate
(dGTP), a precursor of DNA, the 2'-hydroxyl group
of the ribose moiety of GTP is replaced by a hydrogen
atom. The final step in this conversion is catalyzed by
ribonucleotide reductase.
GTP is an energy-rich, activated precursor for
RNA synthesis that also plays important roles in several other cellular processes such as protein synthesis,
protein localization, signal transduction, visual excitation, and hormone action. The free energy of hydrolysis of GTP can be used to drive reactions that
otherwise are energetically unfavorable. For example,
for translocation of a protein through a membrane of
the endoplasmic reticulum, GTP hydrolysis is probably needed to insert the signal sequence into the
channel and is required to release the signal recognition particle from its receptor. GTP may act as an
allosteric effector, causing a protein to change shape
slightly. Its hydrolysis can then lead to a cyclic variation in macromolecular shape and functioning of the
protein. This is seen in the GTP-dependent release of
photoexcited rhodopsin from transducin.
In mRNA-programmed, ribosome-dependent protein synthesis, GTP plays a role at all three stages:
initiation, elongation, and termination. For initiation,
the binding of GTP to a protein initiation factor leads
to formation of the small subunit initiation complex.
Subsequent hydrolysis of GTP results in the association of the large subunit with the complex. In elongation, GTP has more than one role. It binds to
elongation factor (EF) Tu (EF-1 in eucaryal cells) to
facilitate the delivery to the ribosome of each successive aminoacyl-tRNA as dictated by the mRNA
sequence. The aminoacyl-tRNA is delivered as part
of a ternary complex composed of itself, EF-Tu, and
GTP. After GTP hydrolysis, EF-Tu is released, GTP
is regenerated, and the cycle continues for the next
designated aminoacyl-tRNA. To ensure the accuracy
of the match between the incoming aminoacyl-tRNA
and the mRNA codon in the A-site of the ribosome,
the binding of GTP and its EF-Tu-dependent hydrolysis play a role in the process of proofreading. After
peptide bond formation, translocation of the peptidyl-tRNA from the A-site to the P-site requires the binding of GTP to EF-G (EF-2 in eucaryal cells) and its
subsequent EF-G-dependent hydrolysis.
Finally, after the presence of a termination codon
(UGA, UAA, or UAG) in the A-site is recognized by
the combined action of a protein release factor (RF)
and specific regions of ribosomal RNA, a signal is
transmitted to the hydrolytic center of the ribosome
(in the large subunit) to hydrolyze the peptidyl-tRNA
in the P-site. The binding of GTP to another RF and
Guanine
R L Somerville
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1724
Guide RNA
See: RNA Editing in Trypanosomes
Gynogenone
W Reik
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0578
(occurring, for example, in some fish species), the paternal, sperm-derived genome is inactivated or lost. Experimentally, gynogenetic embryos can be made using
irradiated sperm. The haploid gynogenetic embryos
can then be diploidized. In the mouse, gynogenetic
embryos are made by pronuclear transplantation.
Following fertilization and formation of pronuclei,
the male pronucleus is removed by microsurgery and
replaced by a second female pronucleus.
Gynogenetic embryos are useful for genetic mapping or for the rapid recovery of mutations. In mice
gynogenetic development only progresses midway
through gestation, to early postimplantation stages.
Such embryos have particularly deficient development
of extraembryonic membranes such as the trophoblast
and yolk sac, which may explain the failure of further
development. This developmental failure is explained
by the phenomenon of genomic imprinting, whereby
certain genes in eutherian mammals are expressed
from only one of the parental chromosomes. Gynogenetic embryos thus lack gene products that are only
made by the paternal genome, and have overexpression of gene products made by the maternal genome.
Gyrase
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1853
H
H19
K N Gracy and S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0620
H2 Locus
See: Major Histocompatibility Complex (MHC)
Hadulins
W Broughton
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1643
Reference
Hairpin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1854
A hairpin is a double helical region formed by basepairing between adjacent (antiparallel) complementary
sequences in a single strand of RNA or DNA. It
comprises a stem and loop structure.
See also: Antiparallel
50% in the absence of t(11;14) or BCL1 rearrangement. Prolonged remission can be obtained with the
nucleoside analogs pentostatin and cladribine. Median
survival is greater than 10 years. A rare variant form
of the disease has the same histological features as
typical HCL but the leukocyte count is high (50 ×
10^9 l^-1), the hairy cells have a prominent nucleolus and
the response to therapy and overall prognosis are
poor.
See also: Leukemia
Haldane, J.B.S.
K R Dronamraju
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0579
Scientific Work
Haldane's contributions to genetics were largely theoretical and mathematical. Yet few scientists have had
more influence on the steady growth of genetics than
Haldane during his long career. Haldane's first contribution to genetics, which dealt with the measurement of linkage in mice, was published in 1914. His
research was interrupted by World War I, but in 1919
Haldane worked out a more accurate method of
detecting linkage and a way of relating the map distance to the frequency of recombination (`mapping
function'). He suggested the use of `centimorgan'
(cM) as a unit of chromosome length. In collaboration
with others, he undertook a series of linkage studies in
the following years, extending the linkage theory to
polyploids, demonstrating the effect of age on the
frequency of recombination in the fowl, and demonstrating partial sex-linkage in the mosquito Culex
molestus. In his book New Paths in Genetics (1941)
Haldane introduced `cis' and `trans' to replace the
terms `coupling' and `repulsion' that were in vogue
at that time.
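The mapping function mentioned above can be illustrated numerically. The passage does not give the formula, so the sketch below assumes the standard form of Haldane's mapping function, r = (1 - e^(-2d))/2, with the map distance d measured in Morgans (1 Morgan = 100 cM). The function names are illustrative.

```python
import math

def haldane_recombination(d_morgans):
    """Recombination frequency expected for a map distance d (in Morgans)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))

def haldane_map_distance(r):
    """Map distance in Morgans implied by a recombination frequency r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination frequency must lie in [0, 0.5)")
    return -0.5 * math.log(1.0 - 2.0 * r)

# Recombination frequency for loci 10 cM, 50 cM, and 200 cM apart.
for cm in (10, 50, 200):
    print(cm, round(haldane_recombination(cm / 100), 4))
```

For short distances the recombination frequency is close to the map distance, while for long distances it approaches, but never exceeds, 0.5.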
Perhaps the most famous aspect of Haldane's
genetical work is his generalization concerning the
offspring in interspecific crosses, which he formulated
in 1922, called `Haldane's rule':
When in the first generation between hybrids between two
species one sex is absent, rare or sterile, that sex is always the
heterogametic sex.
Population Genetics
Haldane is best remembered as a founder of population genetics, an honor he shared with R.A. Fisher and
S. Wright. Population genetics is best described as the
offspring of the union between Mendelian genetics
and the Darwinian theory of evolution. In a series of
papers, entitled ``A mathematical theory of natural
selection,'' which were published between 1924 and
1934, Haldane investigated the conditions required to
maintain a balance between selection intensity and
mutation pressure, under varying intensities of selection, inbreeding, size of the population, frequency of a
character, reproductive isolation, type of inheritance,
and environmental interaction. Haldane later commented that adequate quantitative data are rarely
available to test the mathematical models that he and
others had developed.
Haldane's contributions to population genetics
were quite extensive. He showed that the probability
that a single mutation will ultimately become established in a population of finite size is proportional to
its selective advantage, but for dominant mutations is
independent of the population size. He showed
further that mutant genes which are harmful singly,
but become advantageous in combination, could accumulate in small, isolated populations, leading eventually to speciation.
Haldane further showed, mathematically, that the
impact of a mutation on a population depends merely
on the rate of recurrence of the mutation and not on
the degree of severity of selection against it. This
principle was later applied to measure the impact of
genetic damage resulting from high-energy radiation
by the US National Academy of Sciences. From an
evolutionary point of view, Haldane's paper on ``The
cost of natural selection'' broke new ground in its
approach to measuring one of the major factors determining the rate of evolution. His calculations showed
that, during the course of evolution, the substitution
Human Genetics
Haldane's contributions to human genetics were
of particular importance. He was a pioneer who laid
its foundations, and shaped and nursed its growth
from its infancy to a mature discipline. Furthermore,
through numerous popular writings, Haldane prepared the ground for the acceptance of human genetics
and an appreciation of its importance in the public
domain. He developed statistical methods for the
study of genetic traits in families and populations
and the analysis of gene-environment interaction.
He estimated the first mutation rate of a human gene
(hemophilia) and prepared the first human gene map,
involving the traits on the X chromosome, hemophilia
and color blindness.
Of special importance was Haldane's suggestion
that resistance to malaria and other infectious diseases
played a significant role in recent human evolution,
resulting in greater genetic diversity and greater prevalence of certain diseases such as sickle cell anemia.
This has stimulated a great deal of epidemiological
research of considerable importance in recent years.
Haldane's books include: Daedalus or Science and the
Future (1923), Possible Worlds and Other Essays
(1928), The Causes of Evolution (1932), Science and
the Supernatural (1935), Heredity and Politics (1938),
New Paths in Genetics (1941), and The Biochemistry
of Genetics (1954). He was also the author of a popular
children's storybook, My Friend, Mr. Leaky (1937).
For several years during the 1940s, Haldane embraced
Marxism, but there is no evidence to indicate that it
had a significant influence on his scientific work.
Further Reading
References
Haldane-Muller Principle
J F Crow
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1425
fitness is the total haploid mutation rate per generation, multiplied by a factor of 1 or 2 depending on
whether the mutation is recessive or dominant. If n
mutations are eliminated with each genetic death, as
might be true with extreme epistasis, then the load is
1/n as large as if they were eliminated independently.
J.L. King made this more precise by saying that the
mutation load is twice the mutation rate divided by
the difference between the frequency of mutations in
individuals eliminated by selection and that before
selection. This principle can be written in more general form. The mutation load is
$$L = \frac{2U}{z - x + 2U}$$
in which U 1 q, q is the mutant allele frequency, x is the mean number of mutations per individual before selection and z is the mean number of
individuals eliminated by selection per generation
(Kondrashov and Crow, 1988).
With epistasis the mutation load can be considerably decreased by permitting several mutations to be
eliminated with each genetic death. Such epistasis is
generated by truncation selection, which may be the
way in which many organisms survive a high mutation
rate (see Mutation Load).
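The effect of eliminating several mutations with each genetic death can be illustrated numerically. The sketch below assumes the general form L = 2U/(z - x + 2U), with U the total haploid mutation rate, x the mean number of mutations per individual before selection, and z read as the mean number of mutations carried by the individuals that selection eliminates. The numerical values are hypothetical.

```python
def mutation_load(U, x, z):
    """Mutation load L = 2U / (z - x + 2U) under the assumptions stated above."""
    denominator = z - x + 2 * U
    if denominator <= 0:
        raise ValueError("z - x + 2U must be positive")
    return 2 * U / denominator

U = 0.5   # hypothetical haploid mutation rate per generation
x = 10.0  # hypothetical mean number of mutations before selection

# Eliminated individuals carry only slightly more mutations than average.
weak_clustering = mutation_load(U, x, z=11.0)
# Eliminated individuals carry many more mutations than average, as under truncation selection.
strong_clustering = mutation_load(U, x, z=15.0)
print(weak_clustering, strong_clustering)
```

The larger the excess of mutations in the eliminated individuals, the smaller the load needed to balance the same mutation rate.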
References
Haldane's Mapping
Function
See: Mapping Function
Hamilton's Theory
B Brembs
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0581
Group Selection
When describing the above problem, Darwin was
referring to the social insects in particular. At that
time, it was already common knowledge that hymenopteran colonies (honeybees, wasps, bumble bees,
and ants) usually consist of one reproducing queen
and a multitude of sterile workers. This particular
case of sociality is termed `eusociality.' In addition to
sterile individuals cooperatively helping the fertile
animals to raise their offspring, eusociality is characterized by another trait: At least two generations
overlap in life stages in which they are capable of
contributing to colony labor, so that the offspring
can assist their parents during part of their life cycle.
The abandonment of reproduction by the worker
caste was the huge dilemma to which Darwin devoted
an entire chapter in his book On the Origin of Species.
While the omnipresence of exploitation is well in
accord with the rule ``reproduce at the cost of your
competitors,'' the equally obvious existence of all
degrees of altruism up to the complete sacrifice of reproductive success in favor of another organism seemed
an insurmountable obstacle. How can individuals
without their own offspring exist if reproduction and
inheritance are the foundation of the whole theory?
Darwin's own solution was to assume that the
colonies formed some sort of superorganism that
competes against other colonies in a very similar way
as individuals do. To perceive animal colonies as
superorganisms with their members as rough analogs of cells has long been known and is a very
useful concept, even today for certain studies. The idea
of family or `group selection' placated Darwin's contemporaries and was still widely accepted well into the
twentieth century. According to this idea, the unit of
selection for altruistic alleles of an originally selfish
gene would be the colony or deme, not the individual.
The altruistic, cooperative allele spreads in the species,
as colonies without a high occurrence (gene frequency) of this allele become extinct. However, in
order for interdemic selection to be effective, one has
to assume that there is no migration between the groups
and that there is sufficient selection pressure, i.e., the
rate of colony extinction is very high. Furthermore,
individual selection will always be faster than group
selection, as the number of individual organisms is
much larger than that of populations and the turnover
rate of individuals is much higher. Thus, group selection can never counteract individual selection. Because
of these considerations, group selection was eventually abandoned as the prime explanation for the
evolution of cooperation. Then, in 1964, William
Donald Hamilton's principle of `kin selection' was
published in the Journal of Theoretical Biology. At
the time, it was so innovative that it almost failed to
be published and was largely ignored for a decade.
When finally noticed, its influence spread exponentially until it became one of the most cited papers in
the field of biology. It is the key to understanding
the evolution of altruistic cooperation among related
organisms, such as the social insects. Cooperation
among unrelated individuals is beyond the scope
of this article and is treated elsewhere (see Further
Reading).
Kin Selection
Why should there be a distinction between cooperation among unrelated individuals and that among
related individuals? We have learnt that genetics is
the basis upon which evolutionary change is taking
place. Fitness was defined above in terms of successful
reproduction, i.e., the number of offspring carrying
the selected allele. The more offspring, the `fitter' the
parent. Darwin's ``struggle for existence'' is a struggle
for reproduction. With sexual reproduction, however,
only one half of an organism's genome is transferred to
one of its offspring at a time. Therefore, any particular
trait, depending on its mode of inheritance, is often
transmitted from a parent to its offspring with a probability of less than one. Thus, in order to transmit as
many of one's genes into the next generation as possible (and hence be evolutionarily successful), an organism has to produce as many surviving offspring as
possible in order to maximize the probability of transmitting all its genetic information. This might constitute a difficult task, however, since all its competitors
try to do the same. But there are other sources of one's
own genes available: relatives.
An ordinary diploid, sexually produced organism
shares 50% of its genes with either of its parents.
Accordingly, it shares about 50% of its genes with its
siblings, 25% with its uncles, aunts, grandparents,
grandchildren, etc. (coefficient of relatedness, see
Figure 1). Hamilton's stroke of genius was to reformulate the definition of fitness as the number of an
individual's alleles in the next generation. Or, more
precisely, inclusive fitness is defined as an individual's
relative genetic representation in the gene pool in the
next generation:
$$\text{inclusive fitness} = \frac{\text{own contribution} + \text{contribution of relatives}}{\text{average contribution of the population}} \qquad (1)$$
$$w_i = \sum_j r_{ij} b_{ij} \qquad (2)$$
where r is the coefficient of relatedness between individual i and its relative j, and b is the fitness of j. Note
that r is always ≤ 1 and therefore j's contribution to w_i
depends critically on its relatedness to i. We can thus
reformulate equation (2) to:
$$w_i = a_i + \sum_{j \neq i} r_{ij} b_{ij} \qquad (3)$$
$$\frac{rB}{C} > 1, \qquad rB - C > 0 \qquad \text{or} \qquad \frac{C}{B} < r$$
Figure 1 The coefficient of relatedness. In diploid organisms, every parent (top row) transmits 50% of its genetic
information to each offspring (middle row). On average, therefore, siblings share half of each parent's contribution to
their genome, adding to a coefficient of relatedness r = 0.5. Consequently, cousins share an r = 0.125 or r = 1/8
(bottom row). Likewise, these cousins are related to their common grandparents by 1/4 or r = 0.25. It might also be
said that r is a measure for the probability that any given allele is shared by two individuals.
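The coefficients in Figure 1 can be reproduced by counting pedigree paths. The sketch below assumes diploid inheritance without inbreeding, so that each parent-offspring step halves the shared fraction and independent paths add. The function name and the path lists are illustrative.

```python
def relatedness(path_lengths):
    """Sum (1/2)**steps over every independent pedigree path linking two individuals."""
    return sum(0.5 ** steps for steps in path_lengths)

parent_offspring = relatedness([1])     # one path, one step
full_siblings = relatedness([2, 2])     # via the mother and via the father
grandparent = relatedness([2])          # one path, two steps
first_cousins = relatedness([4, 4])     # via each of the two shared grandparents

print(parent_offspring, full_siblings, grandparent, first_cousins)
```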
Figure 2 The coefficient of relatedness with haplo-diploid sex determination. Note how the coefficients are
skewed with respect to the diploid system depicted in Figure 1. For example, sisters (middle row) are more related
to each other (r = 0.75) than they are to their mother (top row; r = 0.5).
very high coefficient of relatedness is needed to overcome high fitness costs due to sterility or a decrease in
life expectancy, or both. The benefit of altruism decreases rapidly with declining relatedness. It becomes
clear that the distinction between cooperation among
related and among unrelated individuals is vital for
understanding the evolution of cooperation.
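How quickly the benefit of altruism falls away with declining relatedness can be checked with a short calculation. The sketch below assumes the usual reading of Hamilton's rule, rB - C > 0, in which B is the fitness benefit to the recipient and C is the fitness cost to the altruist. The benefit and cost values are hypothetical.

```python
def altruism_favoured(r, benefit, cost):
    """True when Hamilton's rule r*B - C > 0 is satisfied."""
    return r * benefit - cost > 0

def minimum_relatedness(benefit, cost):
    """Smallest coefficient of relatedness that altruism must exceed: C / B."""
    return cost / benefit

benefit, cost = 3.0, 1.0  # hypothetical fitness units
for label, r in [("haplo-diploid sisters", 0.75), ("diploid siblings", 0.5),
                 ("grandchildren", 0.25), ("cousins", 0.125)]:
    print(label, altruism_favoured(r, benefit, cost))
```

With these values the threshold is r > 1/3, so the same act is favoured toward siblings but not toward grandchildren or cousins.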
different light: the otherwise weak interdemic selection can act together with genic selection, and against
individual selection, to spread cooperative genes in a
population.
Further Reading
Handedness, Left/Right
J Hodgkin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1128
Molecular Handedness
Molecular handedness refers to the chirality of molecules resulting from asymmetrically substituted carbon atoms, or to differently handed arrangements of
larger assemblages of atoms, for example in right-handed and left-handed DNA structures.
Developmental Handedness
Developmental handedness or laterality usually
refers to differences between the left and right sides
of bilaterally organized animals. Some animals, such as
fruit flies, are bilaterally symmetric in their anatomy,
but most animals exhibit anatomical differences
between left and right sides, to a greater or lesser
extent. How these differences are created is a special
case of pattern formation in development, and some
progress has been made in understanding the genetic
and molecular basis of laterality in vertebrates and
nematodes, although the mechanisms do not seem to
be conserved. Some animals, or organs within them,
also exhibit helical anatomy which can take one of two
hands. The two testes of male fruit flies are bilaterally
symmetric in placement, but both develop into
left-handed helical tubes, spiraling counterclockwise.
Snail shells can occur in either left-handed or right-handed spirals. In this case the direction of the spiral is
genetically determined by the maternal genotype,
which affects the handedness of the first spiral cleavage in developing eggs. Many plants exhibit helical
growth patterns, with one hand or the other preferred.
Behavioral Handedness
Behavioral handedness in animals refers to the preferential use of one limb or organ as compared to its
contralateral homolog. The fact that most, but not
all, humans are right-handed suggests that this behavioral asymmetry is adopted partly at random, but is
biased towards right-handedness by developmental
asymmetry. There is no convincing evidence for a
separate genetic influence on behavioral handedness
in humans.
See also: Maternal Effect; Pattern Formation;
Right/Left Handed DNA
Haploid Number
M A Ferguson-Smith
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0583
Haploinsufficiency
M A Cleary
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0582
Haploinsufficiency is the requirement for two wild-type copies of a gene for a normal phenotype. For
haploinsufficient genes, when one copy of a gene is
Haplotype
D E Bergstrom
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0584
Hardy-Weinberg Law
K E Holsinger
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0585
When Mendel's laws were rediscovered, many biologists had difficulty understanding why recessive
traits were not lost from populations. ``If recessive
traits are not expressed in the presence of a dominant
allele,'' they reasoned, ``they should eventually disappear from populations.'' Observations showed that
this does not happen. W. E. Castle explained why this
was so in 1903, although his explanation was ignored
until G.H. Hardy, a British mathematician, and
W. Weinberg, a German biologist, published papers
independently in 1908 that provided mathematical
justification for Castle's intuitive argument.
Hardy's and Weinberg's papers pointed out another
important consequence of Mendel's rules as applied to
populations: if individuals choose mates at random
and several other important assumptions apply,
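there is a simple relationship between allele frequencies and genotype frequencies. If xij is the frequency of
genotype AiAj and pi is the frequency of allele Ai, then
$$x_{ij} = \begin{cases} p_i^2 & \text{if } i = j \\ 2 p_i p_j & \text{if } i \neq j \end{cases}$$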
A population in which this relationship between genotype and allele frequencies holds is said to have its
genotypes in Hardy-Weinberg proportions.
Meiosis is Fair
Table 1
Mating            Frequency    Offspring genotype
                               A1A1    A1A2    A2A2
A1A1 x A1A1       x11^2        1       0       0
A1A1 x A1A2       x11x12       1/2     1/2     0
A1A1 x A2A2       x11x22       0       1       0
A1A2 x A1A1       x12x11       1/2     1/2     0
A1A2 x A1A2       x12^2        1/4     1/2     1/4
A1A2 x A2A2       x12x22       0       1/2     1/2
A2A2 x A1A1       x22x11       0       1       0
A2A2 x A1A2       x22x12       0       1/2     1/2
A2A2 x A2A2       x22^2        0       0       1
In a small population the `actual' frequency of offspring genotypes observed in matings involving a
heterozygote may be different from the `expected'
frequency listed in Table 1 for the same reason
that a fair coin tossed four times will not always give
two heads and two tails. If meiosis is fair, the gametes
that participate in fertilization are a random sample of
all gametes produced, and in a small sample the
observed and expected frequencies may be different
from one another. Similarly, the `actual' frequency of
a mating in a small population may differ from the
`expected' frequency. As with sampling of gametes to
form zygotes, the matings that actually occur are a
sample of all those that could have occurred.
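The expected frequencies discussed above can be worked through exactly for an infinitely large population. The sketch below assumes random mating, fair meiosis, and no selection, and uses exact fractions to avoid rounding. The starting frequencies are hypothetical and deliberately not in Hardy-Weinberg proportions.

```python
from fractions import Fraction as F

GENOTYPES = ("A1A1", "A1A2", "A2A2")
# Probability that a parent of each genotype transmits allele A1 (fair meiosis).
P_A1 = {"A1A1": F(1), "A1A2": F(1, 2), "A2A2": F(0)}

def next_generation(freqs):
    """Offspring genotype frequencies after one round of random mating."""
    out = dict.fromkeys(GENOTYPES, F(0))
    for mother, f_mother in freqs.items():
        for father, f_father in freqs.items():
            mating = f_mother * f_father          # frequency of this mating
            a, b = P_A1[mother], P_A1[father]
            out["A1A1"] += mating * a * b
            out["A1A2"] += mating * (a * (1 - b) + (1 - a) * b)
            out["A2A2"] += mating * (1 - a) * (1 - b)
    return out

# Hypothetical starting population with no heterozygotes at all.
start = {"A1A1": F(1, 2), "A1A2": F(0), "A2A2": F(1, 2)}
gen1 = next_generation(start)
gen2 = next_generation(gen1)
print(gen1, gen2 == gen1)
```

A single generation of random mating produces the proportions 1/4, 1/2, 1/4, and a further generation leaves them unchanged. In a small population the observed frequencies would scatter around these expectations for the sampling reasons given above.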
Importance of Hardy-Weinberg
Proportions
Given the many assumptions needed to derive the
Hardy-Weinberg Law, it may come as a surprise to
learn that it plays a central role in the theory of population genetics. It does so for two reasons. First, it
provides a way to estimate allele frequencies for a
trait in which heterozygotes are indistinguishable
from one of the homozygotes, provided we are willing
to assume that all of the assumptions apply to the
population in which we are interested. Second, it
tells us what will happen in a population in the absence
of any evolutionary forces. As the philosopher Elliott
Sober has pointed out, it plays a role in population
genetic theory similar to the role that the first and
second laws of motion play in Newtonian mechanics.
The first and second laws of motion tell us that an
object at rest will tend to remain at rest and an object
in motion will tend to remain in motion (in a straight
line at a constant speed) unless acted on by outside
forces. They are `zero-force laws' that tell us what to
expect when no forces are operating on an object.
Moreover, they allow us to judge the magnitude and
direction of any forces operating on an object by the
acceleration to which it is subject.
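The first use mentioned above, estimating allele frequencies when heterozygotes cannot be distinguished from one homozygote, can be sketched as follows. The calculation assumes that the population is in Hardy-Weinberg proportions, so that the frequency of the recessive phenotype equals q squared. The counts are hypothetical.

```python
import math

def estimate_from_recessive(n_recessive, n_total):
    """Estimate allele and carrier frequencies from the count of recessive homozygotes."""
    q = math.sqrt(n_recessive / n_total)   # recessive allele frequency
    p = 1.0 - q                            # dominant allele frequency
    return {"q": q, "p": p, "carriers": 2.0 * p * q}

# Hypothetical survey: 1 affected individual per 10 000 sampled.
estimate = estimate_from_recessive(1, 10_000)
print(estimate)
```

Under these assumptions a trait seen in 1 in 10 000 individuals implies that roughly 2% of the population are heterozygous carriers.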
The Hardy-Weinberg law is population genetics'
zero-force law. It tells us what a population will look
like if neither genetic drift nor any evolutionary forces
affect it. If all of the assumptions of Hardy-Weinberg
apply, then the population must have genotypes in
Hardy-Weinberg proportions. Moreover, a single generation in which those assumptions apply is sufficient
to put genotypes into those proportions, and neither
the allele frequency nor the genotype frequencies will
change so long as they continue to apply. If genotypes
are not in Hardy-Weinberg proportions, then one or
more of the assumptions must have been violated in
this population, and the direction in which genotypes
depart from Hardy-Weinberg proportions is often a
clue to the cause of the departure. If, for example,
fewer heterozygotes are observed than expected,
some form of inbreeding is a likely cause.
It is important to remember, however, which inferences can be made with the Hardy-Weinberg law and
which cannot:
1. If the assumptions apply, genotypes will be in
Hardy-Weinberg proportions.
2. If genotypes are not in Hardy-Weinberg proportions, one or more of the assumptions has been
violated.
It is tempting to conclude that if genotypes are in
Hardy-Weinberg proportions, all the assumptions
apply. But this conclusion is not justified. Suppose,
for example, genotypes differ in their ability to survive, but all the other assumptions apply. Then genotypes will be found in Hardy-Weinberg proportions
among newly formed zygotes, but they will not be
found in Hardy-Weinberg proportions in adults.
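The survival example can be made concrete with a small calculation. The sketch below starts from zygotes in Hardy-Weinberg proportions, applies survival probabilities to each genotype, and compares the adult heterozygote frequency with the value expected from the adult allele frequency. The survival values are hypothetical.

```python
def hardy_weinberg(p):
    """Genotype frequencies (A1A1, A1A2, A2A2) expected for allele frequency p."""
    q = 1.0 - p
    return (p * p, 2.0 * p * q, q * q)

def after_viability_selection(zygotes, survival):
    """Adult genotype frequencies once each genotype survives at its own rate."""
    weighted = [f * w for f, w in zip(zygotes, survival)]
    total = sum(weighted)
    return tuple(v / total for v in weighted)

zygotes = hardy_weinberg(0.5)
# Hypothetical survival: A2A2 homozygotes survive half as often as the others.
adults = after_viability_selection(zygotes, (1.0, 1.0, 0.5))

p_adults = adults[0] + adults[1] / 2.0
expected_het = hardy_weinberg(p_adults)[1]
excess = adults[1] - expected_het
print(adults, expected_het, excess)
```

The adults show more heterozygotes than their own allele frequency predicts, even though the zygotes that gave rise to them were exactly in Hardy-Weinberg proportions.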
Further Reading
Harlequin Chromosomes
O J Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0586
temperatures above their normal growth temperature.
Because Hsps may also be produced by cells exposed
to harmful chemicals or to other conditions that cause
cellular stress, they are sometimes called stress proteins. The synthesis of Hsps results from a turning on
or induction of the genes encoding these proteins,
following the temperature increase.
It was first observed in the fruit fly Drosophila
melanogaster that when either isolated tissues or
whole flies were subjected to a heat shock, new proteins, not detectable in unshocked cells, were made.
Furthermore, other specific proteins that were
present in unshocked cells were made in much
greater amounts following a heat shock. Both of
these categories of proteins were defined as heat
shock proteins (Hsps). The synthesis of Hsps is a
universal phenomenon, occurring in all plant and
animal species studied, including humans. Hsps are
also made by prokaryotic cells, namely the bacteria
and archaea. The temperature at which heat shock
proteins are induced varies depending upon the
normal growth temperature of the species. For
instance, fruit flies, normally grown at 25 °C in the
laboratory, are heat shocked at 35-37 °C, whereas
human or mouse cells are induced to make Hsps
when the temperature is raised to several degrees
above their normal body temperature of 37 °C, for
instance 41-42 °C. Chemicals that can induce Hsps
in many cell types include heavy metal ions and
arsenite.
Heat shock proteins can be distinguished on the
basis of their molecular masses, and are thus conveniently named according to their sizes. Major Hsps in
animal cells have molecular masses of approximately
90 000 daltons (Hsp90), 70 000 daltons (Hsp70),
60 000 daltons (Hsp60), and 25 000-30 000 daltons
(Hsp25 or Hsp30). These four groups also make up
distinct families of Hsps which have characteristic
amino acid sequences, three-dimensional structures,
and mechanisms of action.
One of the properties of Hsps is the ability to
prevent partially unfolded proteins from aggregating
to form insoluble complexes. Since Hsps are able to
prevent such undesirable interactions, they are also
referred to as molecular chaperones. In cells, unfolded
or partially unfolded proteins may include those in the
process of being made on ribosomes, and therefore not
yet folded to their mature state, pre-existing proteins
that have become unfolded due to physical or chemical stresses, and proteins that are partially unfolded in
the process of their transport across a cell membrane.
Thus most Hsps have roles in interacting with
unfolded proteins in normal, unstressed cells, and are
also of particular importance during exposure to heat
or other stressors.
Heavy/Light Chains
See: Globin Genes, Human
Helicases
D M J Lilley
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0590
Helicobacter pylori
F Carneiro and C Caldas
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1578
Helicobacter pylori organisms are spiral, microaerophilic, gram-negative bacteria that colonize the human
stomach. H. pylori bacteria were identified for the
first time in 1982 by Warren and Marshall in Perth,
Australia. However, there is evidence to suggest that
these organisms colonized the stomach well before
we became humans. H. pylori infection is one of
the most common chronic infections in man. It is
believed that until the twentieth century nearly all
humans carried H. pylori or closely related bacteria
gastric or duodenal ulcer or gastric cancer. In contrast,
people infected with cagA-negative, vacA s2, and iceA2 strains
will most probably remain asymptomatic despite
developing gastritis. There is a wide variation in the
H. pylori genotypes colonizing different parts of the
world. The similarities between several populations
(for instance in the Iberian Peninsula and South America) with respect to the prevalence of specific H. pylori
genotypes suggest comigration and coevolution of H.
pylori and humans. These similarities may reflect historical, cultural, and socioeconomic relationships
between different areas of the world.
See also: Adenocarcinomas; Bacterial Genetics
Helix-Loop-Helix Proteins
See: DNA-Binding Proteins
Helix-Turn-Helix Motif
J Read and S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.2170
Hemizygote
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1855
Helper Phage
E Kutter
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0594
Hemoglobin
See: Globin Genes, Human
Hemophilia
F Giannelli
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0597
Hemophilia is the name shared by two X-linked recessively inherited defects of blood coagulation. These
manifest as spontaneous or excessive bleeding following minor surgery or trauma. The bleeding episodes
may occasionally threaten life and in the long run may
cause serious disability, especially by damaging joints.
Hemophilia is due to defects in either the gene for
coagulation factor VIII or that for factor IX. Mutations of the factor VIII gene cause hemophilia A, or
classic hemophilia, while those of the factor IX gene
cause hemophilia B, or Christmas disease.
Population Genetics
Both hemophilias have been maintained in the population by an equilibrium between mutation and selection against the affected males. The latter causes a
loss of hemophilia genes at each generation equal to
[(1 − f)I/3], where f is the chance that a patient will
produce offspring relative to that of a normal male,
and I is the incidence of the disease. The value of f was
about 0.5 prior to the introduction of modern treatment for both hemophilias, so that existing hemophilia genes were lost from the population and were
replaced by new mutations at a rate of 1/6 per generation. As a result both diseases show a high degree of
mutational heterogeneity. In the second half of the last
century the value of f is thought to have increased in
developed countries because of treatment and better
patient health. In this situation, since the mutation
rates are expected to remain unchanged, the incidence
of the disease rises, eventually to reach a new equilibrium between mutation and selection. Currently,
in the UK, the incidence of hemophilia A and B is
respectively 1 per 5000 and 1 per 30 000 males.
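The balance between mutation and selection described above can be sketched numerically. The following is a minimal illustration, not a method from the text: it takes the stated loss per generation, (1 − f)I/3, and assumes that at equilibrium this loss is matched by an input of new mutations, here called mu (a hypothetical per-generation mutation rate; the value used is arbitrary).

```python
def fraction_lost_per_generation(f: float) -> float:
    """Fraction of existing hemophilia genes lost each generation: (1 - f) / 3."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie between 0 and 1")
    return (1.0 - f) / 3.0


def equilibrium_incidence(mu: float, f: float) -> float:
    """Incidence I at which the loss (1 - f) * I / 3 equals the mutation input mu.

    mu is a hypothetical mutation rate chosen for illustration only.
    """
    if not 0.0 <= f < 1.0:
        raise ValueError("f must satisfy 0 <= f < 1 for an equilibrium to exist")
    return 3.0 * mu / (1.0 - f)


if __name__ == "__main__":
    # With f = 0.5, one sixth of the existing genes are replaced per generation.
    print(fraction_lost_per_generation(0.5))
    # Raising f with mu unchanged raises the equilibrium incidence.
    mu = 1e-5  # arbitrary illustrative value
    print(equilibrium_incidence(mu, 0.5), equilibrium_incidence(mu, 0.75))
```

Under these assumptions, raising f from 0.5 to 0.75 doubles the equilibrium incidence, which is the direction of change the text describes for treated populations.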
Factor IX Gene
The factor IX gene is near the boundary between
Xq26 and Xq27. It spans 33.5 kb, contains 8 exons,
and produces a message of 2802 nt. This gene appears
to derive from an ancestral gene that gave rise to three
more genes encoding proteins of blood coagulation:
factors VII and X, and protein C.
B domain. The heterodimer (A1a1A2B a2A3C1C2) is
the inactive circulating form of factor VIII and is
carried by a large multimeric protein: von Willebrand
factor. This protects factor VIII and slows down its
clearance. Mutations of von Willebrand factor that
only affect its factor VIII binding property may therefore mimic hemophilia A.
Factor VIII is activated by cleavage at the A1a1/A2
and a2/A3 boundary while any residual B domain is
eliminated by cleavage at the A2/B boundary. The
heterotrimeric (A1a1 A2 A3C1C2) active form of
factor VIII is unstable and may become inactive either
by spontaneous dissociation of the A2 chain or by
enzymatic cleavage at the A1/a1 boundary or within
the A2 chain. Activated protein C operates these cleavages as part of a negative feedback control on blood
coagulation.
Factor VIII is homologous to coagulation factor V.
This, however, has a clearly distinct B domain and
lacks the a1 and a2 acidic peptides.
Genotype-Phenotype Correlations
In both hemophilia A and B frameshifts and nonsense
mutations tend to cause severe disease with absence of
coagulant protein in circulation; in hemophilia B this
seems to be true irrespective of the position of the
premature translation stop signal. Splicing and missense mutations may cause mild, moderate, or severe
disease. Missense mutations may simply impair the
function of the coagulant factor; cause gross reduction
or virtual absence of the coagulation factor in the
blood; or decrease the amount of protein in circulation
as well as reducing its specific activity.
Mutations expected to prevent the synthesis of
'near-normal' coagulant proteins such as gross or
complete gene deletions, frameshifts, nonsense mutations, and inversions breaking intron 22 of the factor
VIII gene, predispose to the inhibitor complication.
This entails the development of antibodies against the
coagulant factor used in replacement therapy, so that
the patient becomes refractory to standard treatment.
Predisposition to manufacture such antibodies is
probably due to failure to develop tolerance to the
relevant coagulation factor because of inadequate
exposure to the factor during maturation of the
immune system.
Hereditary Diseases
D E Wilcox
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0599
Hereditary Neoplasia
L M Mulligan
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1579
falls into this category. Characteristic features of hereditary neoplasia include an early age of disease onset
and the occurrence of multiple primary tumors. It
should be stressed that hereditary neoplasia refers to
inheritance of a susceptibility allele or alleles but not
to inheritance of a cancer phenotype per se.
See also: Cancer Susceptibility; Tumor
Suppressor Genes
Heritability
W G Hill
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0600
Definition
The observed performance or phenotypic value, P, of
an individual for a quantitative trait can be partitioned
in a simple additive model into two components,
genotypic value (G) and environmental deviation (E) as:
P = G + E
A genotype × environment interaction term, G × E,
can also be included in the model, but cannot usually
be distinguished from the environmental deviation, E,
as each environmental deviation is unique. Because
individuals transmit only one gene at each locus to
their offspring, the other copy coming from the second parent, in describing correlations among most
relatives and in predicting responses to selection it is
necessary to consider the average performance of individuals who receive one copy of a specified gene and
Magnitude
Because the amount of genetic variance depends on the
frequencies and effects of genes at many loci, and the
environmental variation depends on the environment
in which individuals are kept, heritability differs
among traits, species, populations within species, and
over time. In practice, however, it turns out that each
[Table 1: heritability estimates, h² (%), for a range of traits; table body not recoverable.]
Estimation
Relatives resemble each other because they have
genes in common, and the closer the relationship the
more likely they are to share genes and the more
highly correlated are their phenotypes for quantitative traits. Similarly, the higher the heritability, the
more highly correlated are the phenotypes of relatives.
Heritability is therefore estimated from the resemblance between relatives, scaled to take account of
the relationship.
There are two major problems in estimating heritability. The first is to avoid confounding of the correlations among relatives by nongenetic causes such
as shared environment. It is therefore much easier to
get good estimates in well-designed experiments in
laboratory animals than in man. The second problem
is to get sufficient data to provide accurate estimates;
and while estimates from distant relatives may be less
confounded by common environment, they estimate
only a fraction of h² and so have to be scaled up and
have a high sampling error.
half sibs are the norm, and in animals where half sibs
are raised together). Data from experiments or field
trials are typically subjected to analysis of variance,
and heritability is estimated from the intraclass correlation. Assuming there is no confounding, this correlation is an estimate of h²/2 for full sibs and h²/4 for
half sibs. In mammals in which each male has several
mates, both the full and half-sib correlations can be
estimated for the same experiment. The half-sib estimate is usually taken because it is less likely to be
confounded by common environment and dominance, although it has a higher sampling error. As
for the offspring-parent correlation, positive assortative mating can increase the correlation among sibs.
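The sib analysis described above can be sketched as follows. This is a minimal illustration assuming a balanced one-way design (equal family sizes) and no confounding by common environment or dominance; the function names are hypothetical, and real analyses would normally use dedicated mixed-model software.

```python
from statistics import mean


def intraclass_correlation(families):
    """Intraclass correlation t from a balanced one-way analysis of variance.

    families: list of equal-sized lists of phenotypic values, one list per sib family.
    """
    k = len(families)
    n = len(families[0]) if families else 0
    if k < 2 or n < 2 or any(len(fam) != n for fam in families):
        raise ValueError("need at least 2 families, all of the same size (>= 2)")
    grand_mean = mean(x for fam in families for x in fam)
    family_means = [mean(fam) for fam in families]
    # Mean squares between and within families.
    ms_between = n * sum((m - grand_mean) ** 2 for m in family_means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2 for fam, m in zip(families, family_means) for x in fam
    ) / (k * (n - 1))
    # Between-family variance component and the intraclass correlation.
    var_between = (ms_between - ms_within) / n
    return var_between / (var_between + ms_within)


def heritability_from_sibs(t, relationship):
    """Scale t to h^2: multiply by 2 for full sibs, by 4 for half sibs."""
    scale = {"full": 2.0, "half": 4.0}[relationship]
    return scale * t
```

Because the half-sib correlation is multiplied by 4, any sampling error in t is multiplied by 4 as well, which is why the half-sib estimate has the higher sampling error noted in the text.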
Twins
There are considerable problems in eliminating common environment effects for heritability estimation in
man. The use of twins provides a route, specifically by
comparing the correlations of identical (monozygous,
MZ) and nonidentical (dizygous, DZ) twins. If the
MZ correlation is assumed to equal h² + c² and the
DZ correlation h²/2 + c², then an estimate of heritability is 2[corr(MZ) − corr(DZ)]. This is, however, biased
upward by all nonadditive genetic effects (for dominance the MZ covariance includes VD and the DZ
includes VD/4) and epistasis, and by any extra similarity of environment that MZ share over DZ through
their treatment or behavior.
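The twin comparison can be written out as a short calculation. The sketch below assumes exactly the model stated above (MZ correlation = h² + c², DZ correlation = h²/2 + c²) and solves the two equations for h² and c²; the function name and the example correlations in the note are hypothetical.

```python
def twin_estimates(corr_mz, corr_dz):
    """Solve corr_mz = h2 + c2 and corr_dz = h2 / 2 + c2 for (h2, c2)."""
    h2 = 2.0 * (corr_mz - corr_dz)  # heritability estimate
    c2 = 2.0 * corr_dz - corr_mz    # common-environment component
    return h2, c2
```

For illustrative correlations of 0.8 (MZ) and 0.5 (DZ), the model gives h² of about 0.6 and c² of about 0.2. As the text notes, nonadditive genetic effects and extra environmental similarity of MZ twins bias the h² value upward.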
Combination of Information
Selection Response
As the regression of offspring on mid-parent phenotype equals heritability and is linear or close to linear
Uses
Heritability tells us no more than the additive genetic
variance and phenotypic variance do separately, but
it is a useful summary and descriptive parameter. Just
as the correlation among relatives can be used to estimate heritability, so the heritability can be used to
predict the correlation of relatives. Prediction of the
expected phenotype of offspring of selected individuals (equal to the breeding values of these individuals)
and thus of selection response is probably the most
important practical use of the heritability estimate. A
comparison between the heritability predicted from
collateral relatives such as half sibs and the realized
heritability or selection response provides a check on
quantitative genetics theory (whereas comparison
of realized heritability and regression of offspring on
parent does not, for they are based on the same
principles).
Discrete Traits
Some Misinterpretations
The magnitude of the heritability does not tell us a lot
of things. For example, as it applies to individuals
within populations, it cannot be used to predict
genetic differences between races or other populations
from phenotypic differences, whether or not they
share the same environment.
The prediction formula R = h²S applies only (other
than in very special circumstances) if selection is practiced on the trait on which response is measured. If
selection is practiced on some trait or combination of
traits other than the one of interest, the regression of
response on selection differential is not therefore an
unbiased estimate of heritability, but depends inter
alia on the genetic and phenotypic correlations among
the traits. This is a serious problem in inferences
about selection in nature, where the actual selection
applied is not known. Methods exist to overcome this
problem, but require that records be available on all
traits on which selection is practiced or to which fitness is related.
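As a small worked illustration of the prediction formula R = h²S, the sketch below computes a predicted response and, in reverse, a realized heritability from an observed response. It assumes selection is applied directly to the trait on which response is measured, the condition stated above; the function names and any numbers used with them are hypothetical.

```python
def predicted_response(h2, selection_differential):
    """Predicted response to selection, R = h2 * S."""
    return h2 * selection_differential


def realized_heritability(response, selection_differential):
    """Realized heritability, R / S, from an observed response."""
    if selection_differential == 0:
        raise ValueError("selection differential must be nonzero")
    return response / selection_differential
```

For example, with h² = 0.3 and parents selected to average 10 units above the population mean, the predicted response is 3 units. If selection actually acted on a correlated trait, R/S computed this way would not be an unbiased estimate of h².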
As the heritability is a summary parameter over
loci, it does not tell us about either the numbers of
genes that affect a quantitative trait or the magnitude
of their effects. It is not therefore a constant as a
population changes. But heritability is nevertheless a
useful concept when properly used.
Hermaphrodite
M A Ferguson-Smith
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0601
Hershey, Alfred
W C Summers
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0603
Alfred Day Hershey (1908-97), an American geneticist, was born 4 December 1908 in Owosso, Michigan,
Heteroallele
F W Stahl
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0604
Heteroallelic Complementation
A heteroallelic diploid is characteristically mutant in
phenotype. A heteroallelic diploid that has wild-type
or quasi wild-type phenotype is said to manifest interallelic (or intragenic) complementation. Such complementation often reflects either a multimeric state of
the functional protein product of that gene or two or
more domains within the protein manifesting more
or less independent functions.
Heteroallelic Recombination
When the altered nucleotide sequences defining the
two heteroalleles are not overlapping, interallelic
History
In the 1940s and 1950s, demonstrations of interallelic
complementation and recombination strained the
classical definition of a gene. Complementation between mutations is a classical demonstration that two
mutations are in separate genes, defined as units of
function. However, understanding of quaternary
protein structure soon rationalized the exceptional
cases of heteroallelic complementation. Recombination between mutants is a classical demonstration
that the two mutations are in separate genes, defined
as units of recombination. However, analysis of the rII
gene of bacteriophage T4 combined with the Watson-Crick
hypothesis for DNA structure established the
modern view that a gene is a segment of a continuous
DNA duplex with recombination possible between
any pair of adjacent nucleotides (Benzer, 1955).
Reference
Benzer S (1955) Fine structure of a genetic region in bacteriophage. Proceedings of the National Academy of Sciences, USA
41: 344-354.
Heterochromatin
A T Sumner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0605
Constitutive Heterochromatin
Constitutive heterochromatin is most easily demonstrated using C-banding; a variety of other chromosome banding methods produce specific staining of
certain heterochromatic regions of chromosomes in
certain species. Characteristically, constitutive heterochromatin consists largely of highly repetitive ('satellite') DNA, although blocks of heterochromatin
may not necessarily consist exclusively of such
DNA, and in some species moderately repetitive
rather than highly repetitive DNA seems to be present. The DNA of constitutive heterochromatin is
late-replicating, and in mammals, its cytosines are
often methylated. A number of proteins have been
described that are either specific to, or concentrated
in, constitutive heterochromatin; such proteins may
well be involved in the condensed state of heterochromatin. Heterochromatin has generally been regarded
as genetically inert. The quantity in the genome can
vary extensively without any apparent phenotypic
effects. In Drosophila it is not replicated during polytenization of chromosomes, and in certain other
organisms heterochromatin is eliminated in somatic
cells, and retained only in the germline. The highly
repetitive DNA sequences found in most heterochromatin could not be translated into proteins. Nevertheless, constitutive heterochromatin is not without
effects. It can have profound effects on the position
and number of chiasmata at meiosis; induce the inactivation of genes close to it (position-effect variegation); and in Drosophila can contain Y-chromosome
fertility factors, factors involved in pairing and disjunction of achiasmate chromosomes, and certain
other unconventional genetic factors such as Responder and ABO. The genetics of few organisms have
been studied as intensively as that of Drosophila, and
it may yet turn out that constitutive heterochromatin
in many species contains nonconventional factors.
Facultative Heterochromatin
The best-known example of facultative heterochromatin is the inactive X chromosome of female
mammals, in which one of the X chromosomes is
permanently inactivated early in development, apparently as a means of dosage compensation, so that the
amount of X-chromosome gene products produced is
similar in males (with only one X) and in females (with
two X chromosomes). (It should be noted that in
birds, with an independently evolved ZW/ZZ sex
chromosome system, there appears to be no dosage
compensation, and no facultative heterochromatin,
Heterochronic Mutation
A E Rougvie
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0606
[Figure 1, panel (C): stage-specific division patterns executed at successive stages; wild-type stages are L1, L2, L3, L4, adult]
Wild-type: S1, S2, S3, S4, A
lin-4: S1, S1, S1, S1, S1
lin-14(0): S2, S3, S4, A
lin-14(gf): S1, S1, S1, S1, S1
lin-28: S1, S3, S4, A
lin-29: S1, S2, S3, S4, S4
Figure 1 Illustration of phenotypes resulting from heterochronic gene mutations in C. elegans. (A) A schematic L1
stage larva is shown indicating the positions of the left lateral hypodermal blast cells. This pattern is repeated on the
right lateral side of the animal. (B) The cell lineages of the V6 and T cells are shown for wild-type and lin-14 null (0) and
gain-of-function (gf) mutants. The vertical axis indicates developmental time, showing the four larval stages and the
adult stage. The marks on the vertical axis indicate the molts. In the lineage diagrams, vertical lines indicate
cells and horizontal lines indicate cell divisions. The triple horizontal bars indicate terminal differentiation and
synthesis of the adult cuticular ridges termed alae. V1-V4 lineage patterns resemble the V6 lineage and the remaining
hypodermal blast cell lineage patterns contain slight variations. Arrows indicate that the division pattern is repeated
through additional molting cycles not observed in wild-type animals. Cells that undergo programmed cell death are
indicated with an 'X'. S1-S4 and A are used to denote the stage-specific division patterns in wild-type animals. (C) The
cell division patterns defined in (B) are used to summarize the phenotypes of heterochronic mutants lin-4, lin-14, lin-28, and lin-29.
LIN-14 protein in young L2 larvae requires wild-type
lin-4 activity. In lin-4 mutants, LIN-14 remains inappropriately high, again resulting in reiteration of S1
patterns. The functional lin-4 product is not a protein,
but rather a small RNA molecule with antisense complementarity to sequences present in the 3' untranslated region (UTR) of the lin-14 mRNA. These
complementary sequences are deleted in lin-14(gf )
mutants, rendering the mutant lin-14 mRNAs insensitive to lin-4 activity and preventing down-regulation
of LIN-14 levels.
lin-28 encodes a cytoplasmic protein with RNA
binding motifs and is also downregulated through
a lin-4-complementary site within its 3' UTR. The
Relationship to Heterochrony
The term heterochrony is usually applied in an evolutionary context, referring to a change in the timing of a
developmental event in an organism relative to when
that event occurred in its ancestors. Naturally occurring heterochronic mutations analogous to those
described here could, if stably incorporated into
a population, result in heterochrony and provide
a mechanism for evolutionary variation between
species.
Heterochrony
See: Neoteny
Heteroduplexes
P J Hastings
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0607
Figure 1 Resolution of mismatched base pairs in heteroduplex DNA. (A) Resolution by replication. When the
replication fork passes a mismatch, the two chains are separated and replicated faithfully, so that each daughter
molecule is now of one genotype or the other. (B and C) Resolution by mismatch repair. The mismatch is recognized
and excised on one strand or the other. Copying the remaining strand restores homoduplex.
Mismatch Repair
The best known mismatch repair system is the Mut
system of E. coli. Homologous systems are found
in eukaryotes. It is called Mut because mutations in
this system cause cells to have a mutator phenotype.
This is because the mismatch repair system acts
on mismatches generated by replication errors, as
well as those occurring in heteroduplex DNA.
The mismatch repair system acts on mismatches in
heteroduplex at two different levels. It prevents the
formation of heteroduplex between molecules that
have substantial divergence in their sequence, as
would be encountered in interspecific crosses. In
intraspecific heteroduplex, where mismatches are
few, the mismatch repair system recognizes the mismatch and excises one strand over a distance of a few
hundred base pairs. The resulting gap is then filled
by DNA synthesis that copies the remaining strand.
This results in homoduplex of one genotype or the
other (see Figure 1B and C). Mismatch repair of
heteroduplex is the major mechanism of gene conversion.
Different mismatches are recognized by the
mismatch repair system with different efficiency. The
frequency of DNA-mediated transformation in
pneumococcus varies with the efficiency of mismatch
repair. Mismatches that are readily recognized are
excised from the donor strand so that incorporation
into the genome is rare, while those that escape recognition are incorporated very frequently. This observation was interpreted as showing the effects of
mismatch correction as early as 1966. The CC base
pair is poorly recognized in several organisms. These
differences underlie many marker effects, that is,
situations in which the nature of the heterozygosity
present in a cross has an effect on the outcome of the
experiment.
Heterogenote
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0608
`Heterogenote' is a term meaning the same as heterozygote, viz., a diploid organism having different alleles
for one or more genes that therefore produces different gametes.
See also: Heterozygote and Heterozygosis
Heterokaryon
F Ruddle
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0609
Heteropyknosis
A T Sumner
Further Reading
Ephrussi-Taylor H and Gray TC (1966) Genetic studies of recombining DNA in pneumococcal transformation. Journal of General Physiology 49 (suppl.): 211-231.
Hershey AD and Chase M (1952) Genetic recombination and
heterozygosis in bacteriophage. Cold Spring Harbor Symposia
on Quantitative Biology 16: 471-479.
Heterosis
J F Crow
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0611
Heterotrimeric G Proteins
H C Korswagen and R H A Plasterk
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0572
Heterozygote and Heterozygosis
D E Wilcox
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0612
Heterozygote Advantage
In some circumstances, a recessive
mutation can affect the phenotype, and thus the reproductive fitness, of heterozygotes. The effect is not always negative, as can be seen in human
sickle-cell anemia. In some environments, sickle-cell carriers have a reproductive advantage over
normal homozygotes (heterozygote advantage). In most
populations the sickle-cell mutation is rare, but
in malarial regions of Africa as many as one in three of
the population are carriers of the mutation in the
hemoglobin gene. The presence of the mutant hemoglobin in heterozygotes interferes with the malarial
F Factor). This enables them to transfer their chromosomal DNA to other bacteria into which the DNA
can recombine. The existence of Hfr strains of E. coli
was observed by their high frequency of recombination with other bacteria. This was possible because
some of the strains mixed by Joshua Lederberg
and Edward Tatum in early mating experiments
(Lederberg and Tatum, 1946b) contained the F
conjugative plasmid and others did not. In cultures
of cells carrying an F, some of the cells are Hfrs
(have an integrated F). The non-F-carrying strains
are called female or recipient bacteria and the Hfr or
F-carrying strains are male or donors. Transfer of
conjugative plasmid DNA, or chromosomal DNA in
Hfr cells, is unidirectional, i.e., male to female (Hayes,
1952). Males can be recipients only at much lower
efficiency, or under special environmental conditions.
Hfr strains form because the F carries transposable
elements that are also carried by the E. coli chromosome: two copies of the insertion sequence IS3, one
IS2, and one copy of transposon Tn1000 (also called
Hfr
S M Rosenberg and P J Hastings
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0613
Figure 1 Formation of an Hfr by recombination of the F plasmid with the Escherichia coli chromosome. Single lines
represent duplex DNA. Triangles represent transposable genetic elements IS3, IS2, and Tn1000 that are
present in the F (double lines) and also in the E. coli chromosome (single lines). These elements provide regions of
DNA sequence identity between which homologous recombination can occur (represented by an X), incorporating
the F into the chromosome and producing an Hfr.
Figure 2 Conjugative transfer of chromosomal DNA from an Hfr donor bacterium to an F− recipient bacterium.
Single lines represent single strands of DNA; dashed lines represent newly synthesized DNA and arrowheads
represent 3' ends. Transfer begins with single-strand cleavage of F DNA at the transfer origin, oriT, followed by
DNA synthesis, which displaces a single strand that is transferred into the recipient. Synthesis of the complementary
strand occurs in the recipient, and the duplex fragment can be incorporated into the recipient chromosome
by recombination.
γδ) (Figure 1). These elements are regions of DNA
sequence homology between the chromosome and
the F that allow the F to recombine with the chromosome and become incorporated into the chromosome
(Figure 1). The F can integrate at many different sites in
the chromosome, making many different Hfr strains.
Each different Hfr is capable of high frequency transfer
of the chromosomal DNA next to itself, or if given
enough time to mate without interruption, the whole
4.7 megabase E. coli chromosome.
Transfer of chromosomal DNA from an Hfr to a
female cell is depicted in Figure 2. Transfer begins by
action of an F-encoded single-strand endonuclease
and helicase, TraI, on the F origin of transfer, oriT.
Leading strand synthesis is primed from the 3' end at
the nick and displaces the 5' DNA strand. Continued
synthesis displaces that strand extending into the contiguous bacterial DNA, and the single DNA strand
displaced is transferred into a female bacterium that
has become attached to the male in a mating pair.
Transfer stops at random locations when the synthesis
tract encounters a DNA break in the donor template,
or due to breakage of the transferred strand. The
occurrence of such random disruptions produces a
gradient of transfer, with DNA near oriT being
transferred most efficiently, and decreasing transfer efficiency with increasing distance from oriT.
Once inside the female, lagging strand synthesis of
the complementary strand takes place, creating a
double-strand linear DNA fragment. The transferred
DNA will be lost unless it recombines into the recipient chromosome, which it can do (Xs in Figure 2)
using the cell's RecBCD system of homologous recombination of linear DNA and double-strand break
repair. This results in homologous replacement of a
segment of recipient DNA with sequences derived
from the donor chromosome. If that segment contains
different genetic information (prototrophic pro+ information is depicted in the transferred piece entering an
auxotrophic pro− recipient in Figure 2), the recipient
can become genetically recombinant. Recombinant
strains made by Hfr conjugation do not usually become
male (Hfr) upon acquisition of donor DNA, because
the F transfer genes are the last to be transferred and are
not homologous with the recipient DNA.
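The gradient of transfer described above can be illustrated with a simple model. The sketch below is an assumption made for illustration, not a model given in the text: it supposes that interruptions of transfer occur at a constant rate per unit distance from oriT, so the probability that a marker is transferred falls off exponentially with its distance. The function name, units, and rate values are hypothetical.

```python
import math


def transfer_probability(distance, break_rate):
    """Probability that transfer reaches `distance` from oriT without interruption.

    Assumes interruptions occur independently at a constant rate `break_rate`
    per unit distance (an illustrative assumption, not a measured value).
    """
    if distance < 0 or break_rate < 0:
        raise ValueError("distance and break_rate must be non-negative")
    return math.exp(-break_rate * distance)
```

Under this assumption a marker adjacent to oriT is always transferred, and each additional unit of distance reduces the chance of transfer by the same factor, giving the monotonic gradient the text describes.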
Hfr crosses provided the first demonstration of
genetic recombination in bacteria and in so doing
encouraged the idea that bacteria, like other organisms,
possess genes. Hfr crosses were also the first tools used
for exploration of the proteins and enzymes that catalyze DNA recombination, leading to the discovery of,
for example, RecA (Clark and Margulies, 1965), a universal recombination and DNA repair protein of
which there are orthologs in all eubacterial, eukaryotic, and archaeal species examined to date. For descriptions of the E. coli rec genes discovered using Hfr
crosses, the recombination systems and pathways,
and double-strand break repair machinery of E. coli.
Hin/Gin-Mediated Site-Specific DNA Inversion
Inversion Systems
The DNA invertases catalyze a recombination reaction that inverts a segment of DNA between two
specific recombination sites. The best-characterized
invertases, Hin from Salmonella typhimurium and
Gin from bacteriophage Mu, catalyze site-specific
inversion reactions that result in alternate gene
expression. The Hin invertase regulates flagellar
phase variation in Salmonella, allowing the bacterium
to evade a host immune response (Figure 1A). In one
orientation, a promoter located within the invertible
segment of DNA directs the expression of the
H2 flagellin gene (fljB), as well as a repressor of the H1
flagellin gene (fliC). After Hin catalyzes a site-specific inversion event, the promoter becomes
inverted and can no longer drive the expression of
these genes. Consequently, the H1 flagellin gene is
expressed from its unlinked site. The Gin invertase
of bacteriophage Mu controls the alternate expression
of tail fiber genes (Figure 1B). Each orientation of the
invertible segment in bacteriophage Mu encodes a
different C-terminal portion of the tail fiber protein
S. Site-specific inversion catalyzed by Gin switches
the expression of the C-terminal part of the protein,
which determines the host specificity range for the
phage. The Cin-mediated reaction of phage P1 performs a similar function. Due to the homology of
these proteins and the similarity of their recombination substrates, the characterized invertases are functionally interchangeable. The invertases belong to the
resolvase/invertase (also known as the serine recombinase) family
of recombinases which currently has over 50 members. Site-specific DNA inversions can also be catalyzed by recombinases belonging to the phage
integrase (also known as tyrosine recombinase) family.
TTTC AAACCAAGGTTT GAAA
(C)
hixL
TTCTTGAAAACCAAGGTTTTTGATAA
AAGAACTTTTGGTTCCAAAAACTATT
gixL
TTCCTGTAAACCGAGGTTTTGGATAA
AAGGACATTTGGCTCCAAAACCTATT
Figure 1 Regulation of gene expression by site-specific DNA inversion. (A) Salmonella invertible DNA segment.
The hixL and hixR recombination sites are shown as dark rectangles. The recombinational enhancer is depicted as a
striped rectangle. The 1 kb invertible segment, located between the two recombination sites, contains the hin gene
and a promoter (P) that directs the expression of the flagellar genes, fljB and fljA. Hin-catalyzed inversion switches the
orientation of the invertible segment such that the promoter can no longer direct the expression of fljB and fljA.
(B) Phage Mu invertible DNA segment. The gixL and gixR recombination sites are depicted as dark rectangles. The
recombinational enhancer, illustrated as a striped rectangle, and the gin gene are located outside of the ~3 kb
invertible segment. A promoter (P) located outside of the invertible segment controls the expression of the S and U
tail fiber genes. The constant N-terminal portion of the S tail fiber gene (Sc) is also located outside of the invertible
segment, while the variable C-terminal portion of the S tail fiber gene (Sv) and the U gene are located within the
invertible segment. Gin-catalyzed inversion of the invertible segment alternates the expression of the Sv and U genes
with the Sv′ and U′ genes. (C) Sequence of the invertase recombination sites. The recombination site consensus
sequence for the invertase family of recombinases is shown at the top. The hixL and gixL recombination site sequences
are shown below the consensus sequence. The arrows mark the sites of 2 bp staggered double-strand DNA cleavage.
The relative orientation of the recombination sites is determined by these two core nucleotides.
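The hixL and gixL sequences in Figure 1C can be checked for internal consistency with a short script. The following is a minimal sketch, not part of the original entry: it confirms that each printed bottom strand is the base-by-base complement of the top strand and lists where the two top strands differ. The helper names (`complement`, `mismatches`) are illustrative, and positions are counted 1 to 26 from the left end of the printed sequence, which is an assumed numbering rather than the -13 to +13 scheme of the figure.

```python
# Recombination-site sequences copied from Figure 1C:
# (top strand, bottom strand as printed directly beneath it).
SITES = {
    "hixL": ("TTCTTGAAAACCAAGGTTTTTGATAA", "AAGAACTTTTGGTTCCAAAAACTATT"),
    "gixL": ("TTCCTGTAAACCGAGGTTTTGGATAA", "AAGGACATTTGGCTCCAAAACCTATT"),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Return the base-by-base complement (not reversed)."""
    return seq.translate(COMPLEMENT)


def mismatches(a, b):
    """Return 1-based positions (from the left end) where a and b differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


for name, (top, bottom) in SITES.items():
    paired = complement(top) == bottom
    print(f"{name}: {len(top)} bp, strands complementary: {paired}")

diff = mismatches(SITES["hixL"][0], SITES["gixL"][0])
print("hixL vs gixL top-strand differences at positions:", diff)
```

A comparison of this kind illustrates the similarity of recombination substrates that the text gives as one reason the invertases are functionally interchangeable.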
another cis-acting DNA element called the recombinational enhancer. Each recombination site is bound
by a dimer of the Hin or Gin recombinase, and the
enhancer contains two binding sites for the dimeric
protein Fis. Once bound to their respective DNA
sites, Hin/Gin and Fis dimers are able to assemble
into a higher order nucleoprotein complex called
an invertasome (Figure 2iii). The DNA bending protein HU also aids in the formation of the invertasome complex in the Hin system by facilitating the
bending of a small loop of DNA between one recombination site and the enhancer. Once assembled in
the invertasome structure, Fis stimulates Hin/Gin to
catalyze recombination. The inversion reaction can
Invertasome
The three DNA sites must synapse in a highly specific
fashion to form an invertasome complex. The Fis-bound enhancer interacts with the invertase-bound
recombination sites at a branch in plectonemically
supercoiled DNA (Figure 2iii). The recombination
sites pass on either side of the enhancer such that
two negative DNA nodes are trapped within the complex. Immunoelectron microscopy of crosslinked
invertasome complexes has provided direct evidence
for the three-looped DNA structures containing Hin
[Figure 2 diagram: panels (i) to (v); labels hixL, hixR, Enhancer, Hin, HU, Fis]
Figure 2 The site-specific inversion reaction. Pathway of invertasome assembly using the Hin system as a model. (i)
Supercoiled DNA substrates contain two inversely oriented recombination sites, hixL and hixR, and a recombinational
enhancer. (ii) A Hin recombinase dimer binds to each recombination site and two Fis dimers bind to the
recombinational enhancer. (iii) Hin and Fis assemble into an invertasome complex with the aid of the DNA-bending
protein HU. In the invertasome complex the recombination sites associate with the enhancer at a branch in the
supercoiled DNA. (iv) Once assembled in the invertasome complex, Fis activates Hin to catalyze DNA cleavage and
strand exchange. (v) Recombination results in an inversion of the segment of DNA located between
the recombination sites.
and Fis. The sizes and positions of the DNA loops in
the invertasome complex were consistent with the
enhancer associating with both recombination sites
at a branch in supercoiled DNA. The formation of
these structures was absolutely dependent on DNA
supercoiling.
The specific topology of the DNA strands in the
invertasome complex has been determined through
several experimental approaches. The change in linking number observed in the DNA molecules after
inversion by Gin suggested that two negative DNA
nodes were trapped in the invertasome complex. In
addition, the stereostructure of knotted DNA products generated from iterative rounds of Hin/Gin
recombination provided strong evidence for this
specific configuration of DNA strands at synapsis.
The knotted DNA products also indicated that each
round of recombination results in a 180° right-handed
rotation of the DNA ends. Since the invertase is covalently associated with the DNA ends during strand
exchange, this observation implies that the recombinase subunits must also undergo a rotation. Direct
experimental evidence for exchange of subunits
between dimers accompanying strand exchange, however, is lacking thus far.
Figure 3 Model of Fis and Hin dimers. (A) Structure of a Fis dimer. The N-terminal β-hairpin arms (protruding
from the top of the structure in the figure) are responsible for stimulating invertase activity. The helix-turn-helix
DNA-binding domains are located in the C-terminal end of the protein. (B) Model of a Hin dimer bound to a
recombination site. The structure of the Hin C-terminal DNA-binding domain bound to a recombination half site was
determined by X-ray crystallography. The N-terminal catalytic domain is modeled after the structure of the
homologous recombinase γδ resolvase. The location of the active-site nucleophile serine 10 is marked with a black
ball. In this figure, the catalytic domains are located above the DNA and the DNA-binding domains are located below
the DNA.
Fis using the Fis-independent mutant DNA invertases, efficiently catalyze deletions as well as inversions
since random collision of recombination sites yields
catalytically active synaptic complexes.
Hirschsprung's Disease
S Malcolm
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1580
Further Reading
Histidine
J Read and S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.2075
Figure 1 Histidine.
Histidine Operon
P Alifano
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0615
The biosynthesis of histidine has been studied extensively in Salmonella typhimurium and Escherichia coli.
In these microorganisms, a single operon composed of
eight adjacent genes encodes the complete set of
enzymes required for the biosynthesis of histidine.
Three (hisD, hisB, and hisI) of the eight genes of the
operon encode bifunctional enzymes, while two (hisH
and hisF) encode polypeptides that together form a single enzyme catalyzing one step,
for a total of 10 enzymatic reactions.
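The count of 10 reactions from eight genes can be reproduced with a simple tally. The sketch below is illustrative only: the dictionary is assembled from the gene assignments given in this entry (bifunctional hisD, hisB, and hisI; hisH and hisF acting together as one enzyme), and the variable names are assumptions introduced for the example.

```python
# Reactions attributed to each enzyme of the his operon, following the text:
# bifunctional enzymes contribute two reactions each, and the hisH and hisF
# products are counted together because they act as one enzyme.
REACTIONS_PER_ENZYME = {
    ("hisG",): 1,
    ("hisI",): 2,          # bifunctional
    ("hisA",): 1,
    ("hisH", "hisF"): 1,   # two polypeptides, one enzyme, one step
    ("hisB",): 2,          # bifunctional
    ("hisC",): 1,
    ("hisD",): 2,          # bifunctional
}

genes = {gene for enzyme in REACTIONS_PER_ENZYME for gene in enzyme}
total_reactions = sum(REACTIONS_PER_ENZYME.values())

print("genes:", len(genes))
print("enzymes:", len(REACTIONS_PER_ENZYME))
print("reactions:", total_reactions)
```

Counting hisH and hisF as separate single-step enzymes would give 11 reactions, which is why the two are grouped here.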
The first step in histidine biosynthesis (Figure 1)
is the condensation of ATP and 5-phosphoribosyl
1-pyrophosphate (PRPP) to form N′-5′-phosphoribosyl-ATP (PRATP). This reaction is catalyzed by
N′-5′-phosphoribosyl-ATP transferase, the product
of the hisG gene. This reaction is the one involved in
feedback inhibition by the end product of the pathway, histidine. The inhibitory effect of histidine
requires the presence of the product of the reaction,
PRATP, and is further increased by AMP. Synergistic
inhibition by the product of the first reaction and the
end product of the pathway is a sophisticated variation
of the general principle of feedback control, which has
also been found to regulate the activity of glutamine
synthetase. The inhibitory effect of AMP supports the
energy charge theory of D.E. Atkinson and is logical
in view of the high energy input required for histidine
biosynthesis.
The product of the transferase reaction, PRATP, is
hydrolyzed to N′-5′-phosphoribosyl-AMP (PRAMP).
This irreversible hydrolysis is catalyzed by an activity
associated with the C-terminal domain of the enzyme
encoded by the hisI gene. The other activity, localized
within the N-terminal domain of the bifunctional
enzyme, is a cyclohydrolase that opens the purine
ring of PRAMP. This leads to the production of an
imidazole intermediate, N′-[(5′-phosphoribosyl)formimino]-5-aminoimidazole-4-carboxamide ribonucleotide (abbreviated to 5′-ProFAR). The fourth
step of the pathway of histidine biosynthesis is an
internal redox reaction, also known as an Amadori
rearrangement, involving the isomerization of the
aminoaldose 5′-ProFAR to the aminoketose N′-[(5′-phosphoribulosyl)formimino]-5-aminoimidazole-4-carboxamide ribonucleotide (abbreviated to 5′-PRFAR).
Although the pathway of histidine biosynthesis
was almost completely characterized by 1965, the
biochemical event leading to the synthesis of imidazole-glycerol phosphate (IGP) and 5-aminoimidazole-4-carboxamide ribonucleotide (abbreviated to
AICAR or ZMP) from 5′-PRFAR remained unsolved
for a long time. The protein products of the hisH and
hisF genes were known to be involved in the overall
[Figure 1 diagram]
Top (genetic map): P1, L, hisG, P2, hisD, hisC, P3, hisB, hisH, hisA, hisF, hisI
Bottom (pathway, with the gene and enzyme for each step):
ATP + PRPP -> PRATP: hisG, N′-5′-phosphoribosyl-ATP transferase
PRATP -> PRAMP: hisI, N′-5′-phosphoribosyl-ATP pyrophosphohydrolase
PRAMP -> 5′-ProFAR: hisI, N′-5′-phosphoribosyl-AMP cyclohydrolase
5′-ProFAR -> PRFAR: hisA, N′-[(5′-phosphoribosyl)-formimino]-5-aminoimidazole-4-carboxamide isomerase
PRFAR -> IGP + AICAR: hisH:hisF, imidazole-glycerol-phosphate synthase
IGP -> IAP: hisB
IAP -> HOL-P: hisC, imidazole-acetol-phosphate aminotransferase
HOL-P -> HOL: hisB, L-histidinol-phosphate phosphatase
HOL -> HAL: hisD, L-histidinol dehydrogenase
HAL -> Histidine: hisD, L-histidinol dehydrogenase
Figure 1 Structure of the his operon of Salmonella typhimurium and metabolic pathway of histidine biosynthesis. Top:
the relative positions of P1 (primary promoter), P2 and P3 (internal promoters) and T (rho-independent bifunctional
transcription terminator) are indicated below the genetic map. L represents the leader regions preceding the
structural genes. Bottom: biosynthetic steps from ATP and PRPP to histidine. Abbreviations are specified in the text.
process in eubacteria, but the catalytic events were
elusive. The last blind spot of histidine biosynthesis
has recently been clarified. The protein encoded by
the hisF gene has an ammonia-dependent activity that
converts PRFAR to AICAR and IGP, while the product of the hisH gene has no detectable catalytic properties. However, in combination, the two proteins are
able to carry out the above reaction with glutamine as
a nitrogen donor, without releasing any free metabolic
intermediate. The hisH and hisF gene products form
a stable 1:1 complex that constitutes the IGP
synthase holoenzyme. AICAR, which is produced in
the reaction catalyzed by IGP synthase, is recycled
into the de novo purine biosynthetic pathway. The
other product, IGP, is dehydrated by an activity of a
bifunctional enzyme encoded by hisB. The resulting
enol is ketonized nonenzymatically to imidazole-acetol phosphate (IAP). The seventh step of the pathway consists of a reversible transamination between
IAP and glutamate. The reaction, catalyzed by a
the histidine pathway was presumed to lack any
branch point leading to other metabolites required
for growth. Nevertheless, the two initial substrates
of histidine biosynthesis, PRPP and ATP, play key
roles in intermediary and energy metabolism and
link this pathway to the biosynthesis of purines,
pyrimidines, pyridine nucleotides, folates, and tryptophan. Moreover, the purine and histidine biosynthetic
pathways are connected through the AICAR cycle.
AICAR, a by-product of histidine biosynthesis, is
also a purine precursor. The conversion of AICAR
to purines involves a folic acid-mediated transfer of
a one-carbon unit. Following treatment thought to
lower the folic acid pool, the unusual nucleotide 5-aminoimidazole-4-carboxamide riboside-5′-triphosphate (ZTP) accumulates in S. typhimurium. On the
basis of this and additional evidence, the rare nucleotide ZTP was proposed to be an alarmone signaling
C-1 folate deficiency and to mediate a physiologically
beneficial response to folate stress.
[Figure 2 diagram. Maps of the his genes (hisG, hisD, hisC, hisB or hisBd and hisBpx, hisH, hisA, hisF, hisI, hisE, hisZ, and interspersed ORFs) are shown for Bacteria (Escherichia coli, Salmonella typhimurium, Haemophilus influenzae, Azospirillum brasilense, Streptomyces coelicolor, Lactococcus lactis, Mycobacterium tuberculosis, Bacillus subtilis), Archaea (Methanococcus jannaschii), and Eukarya (Saccharomyces cerevisiae: HIS7 (chr. 2), HIS4 (chr. 3), HIS1 (chr. 5), HIS2 (chr. 6), HIS5 (chr. 9), HIS6 (chr. 9), HIS3 (chr. 15)).]
Figure 2 Organization of the histidine genes in different organisms. The single gene encoding a bifunctional
enzyme, formerly known as hisIE, has been renamed hisI in E. coli. I have therefore used hisI for organisms with a single
gene and hisI and hisE for organisms with two independent genes. Another gene encoding a bifunctional enzyme, hisB,
is often split into two separate genes in different organisms. They are referred to as hisB proximal (hisBpx) encoding
the HOL-P phosphatase, and hisB distal (hisBd) encoding the IGP dehydratase. In Mycobacterium tuberculosis, a gene
encoding inositol monophosphatase, impA, is located between hisA and hisF. In Bacillus subtilis the structural gene
encoding the histidyl tRNA synthetase is located proximally to the biosynthetic his cluster.
A
ATCAAATGAATAAGCATTCATCGAATTTTTATGACACGCGTTCAATTTAAACACCACCATCATCACCATCATCCTGACTAG
+1
MetThrArgValGlnPheLysHisHisHisHisHisHisHisProAspEnd
B
TCTTTCAGGCGATGTGTGCTGGAAGACATTCAGATCTTCCAGCGGCGCATGAACGCATGAGAAAGCCCCCGGAAGATCATCT
F
TCCGGGGGCTTTTTTTTTGGCGCGCGATACAGACCGGTTCAGACAGGATAAAGAGGAACGCAGAATGTTAGACAACACC
MetLeuAspAsnThr
Figure 3 Features of the leader region of the his operon of Salmonella typhimurium. The nucleotide sequence of the
leader region from the transcription initiation site (+1) to the first structural gene (hisG) is reported. The amino acid
sequence of the leader peptide and of the amino-proximal region of the hisG gene product are indicated below the
nucleotide sequence. Solid lines above the nucleotide sequence correspond to regions (A to F) capable of forming
mutually exclusive secondary structures.
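The leader peptide shown beneath the nucleotide sequence in Figure 3 can be reproduced by translating the open reading frame with the standard genetic code. The following is a minimal sketch: the sequences are copied from the figure, while the codon table construction, the function name `translate`, and the use of the substring `ATGACACGC` to locate the leader start codon are choices made for this example. An earlier ATG in the sequence lies in a different reading frame, so the start is located explicitly rather than by taking the first ATG.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


# First sequence line of Figure 3 (region A and the leader peptide ORF).
LEADER = ("ATCAAATGAATAAGCATTCATCGAATTTTTATGACACGCGTTCAATTTAAACACCACC"
          "ATCATCACCATCATCCTGACTAG")
# Start of hisG, taken from the end of the last sequence line of Figure 3.
HISG_START = "ATGTTAGACAACACC"

start = LEADER.index("ATGACACGC")  # leader peptide start codon
leader_peptide = translate(LEADER[start:])
hisg_n_terminus = translate(HISG_START)

print("leader peptide:", leader_peptide)
print("histidine residues:", leader_peptide.count("H"))
print("hisG N-terminus:", hisg_n_terminus)
```

The run of consecutive histidine codons in the leader peptide is the feature that makes translation of this short reading frame sensitive to the supply of charged histidyl-tRNA.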
unimportant and their presence merely fortuitous,
their presence in homologous genomic regions of
related microorganisms supports their physiological
relevance. They could reinforce the expression of distal cistrons of large operons, thereby alleviating the
effects of natural polarity. Alternatively, they could
allow regulation of an operon in a noncoordinate fashion and cause differential expression of certain genes
under specific growth conditions. Based on several
features of the nucleotide sequence, the internal promoters, as well as the primary hisP1 promoter, belong
to the Eσ70 class of promoters.
Transcription of the his operon is also modulated at
the level of intracistronic rho-dependent terminators
by a nonspecific mechanism operating during the
elongation step. Terminators account for the polarity
exhibited by several nonsense and frameshift mutations. Polarity is a phenomenon observed in polycistronic operons, by which certain mutations that
prematurely arrest translation not only affect the
gene in which they occur, but also reduce the expression of downstream genes. Although polarity was
first described in the lactose system, the coordinate
effect of polar mutations on downstream gene expression and the existence of polarity gradients were
defined with precision in the his system by using a
large collection of polar mutations. The phenomenon
of polarity has been explained by postulating the
existence of cryptic intracistronic rho-dependent terminators. According to a general model of transcriptional polarity, premature arrest of translation would
favor the binding of rho to the nascent transcript via
cytosine-rich and guanosine-poor regions. Using the
energy of ATP hydrolysis, rho moves along the nascent transcript, overtakes elongating RNA polymerase, and precipitates release of the transcript. The
physiological significance of rho-dependent intracistronic termination should be to prevent further
elongation of nontranslated or infrequently translated
transcripts.
Finally, it has recently been documented that posttranscriptional events contribute substantially to his
operon expression. In S. typhimurium and E. coli, the
unstable native 7300 nucleotide-long polycistronic his
message is degraded with a net 5′ to 3′ directionality,
generating products that decay at different rates. The
decay process generates three major processed species,
6300, 5000, and 3900 nucleotides in length (Pr1, Pr2,
and Pr3), that encompass the last seven, six, and five
cistrons, respectively, and have increasing half-lives (5,
6, and 15 min, respectively). RNase E controls the
decay of the native transcript. Active translation of
the 5′-end-proximal cistrons of the processed Pr1
and Pr2 species is required to temporarily stabilize
these species. The overall process of decay may have
Further Reading
Alifano P, Fani R, Lio P et al. (1996) Histidine biosynthetic pathway and genes: structure, regulation, and evolution. Microbiological Reviews 60: 44–69.
Winkler ME (1996) Biosynthesis of histidine. In: Neidhardt FC
et al. (eds) Escherichia coli and Salmonella: Cellular and Molecular Biology, 2nd edn, vol. 1, pp. 485–505. Washington, DC:
American Society for Microbiology Press.
Histocompatibility
J Read and B J Smith
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0616
antigens on the surface of foreign cells and, in conjunction with T cells, will destroy the foreign cells.
See also: Antigen; Major Histocompatibility
Complex (MHC)
Histocompatibility
Complex Genes
See: Major Histocompatibility Complex (MHC)
Histone Genes
A P Wolffe†
Background
Histone genes were among the first eukaryotic genes
to be characterized. Their cloning and isolation in the
1970s was facilitated by their repetition in metazoans,
their small size, the abundance of their mRNAs, and
the early sequence characterization of the histone proteins. Interest in histone genes derives from their regulated transcription, the control of histone mRNA
stability, and the regulation of histone mRNA 3′ processing. The histone genes provide a paradigm for the
study of DNA replication (S-phase)-dependent transcription. They have also been exceptionally useful in
the investigation of the determinants of tissue-specific
and embryo stage-specific transcriptional control.
Pioneering studies on the assembly of specialized
architectures within chromatin made effective use of
histone gene sequences. Research in the 1990s has led
to recognition of the importance of histone protein
sequence in the packaging of DNA for transcription,
replication, recombination, and repair, together
with the maintenance of chromosome stability and
chromosome segregation. The histone genes have
been subjected to an extensive mutational analysis,
with the consequences for DNA metabolism of histone
gene ablation, deletion, and point mutation investigated by many research scientists.
†Deceased
Further Reading
Histones
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1856
Hitchhiking Effect
M Kreitman
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0619
DNA, the genetic material, is packaged into chromosomes ranging in length from thousands to many tens
of millions of base pairs. Now consider the fate of two
independent mutations that have occurred by chance
on one specific copy of a chromosome in a population.
Imagine one of these mutations is a selectively favorable mutation that natural selection will increase in
frequency in the population each generation (see Selective Sweep). The other mutation on this same chromosome is a selectively neutral mutation (see Neutral
Mutation), one whose fate will be governed under
normal circumstances by genetic drift (see Genetic
Drift). Its association with the favorable mutation on
the same chromosome guarantees that as the adaptive
mutation increases in frequency by natural selection,
so the 'linked' neutral mutation will also deterministically increase in frequency. The 'hitchhiking effect'
Further Reading
HIV
See: Virus
Hodgkin's Disease
M J S Dyer
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1581
Holliday Junction
P J Hastings
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1429
Further Reading
Hogness Box
A J Berk
Reference
[Figure 1 diagram: panels (ii), (iii), (iv), (vi); marker labels A, a, b, +]
homologous DNA molecules. The structure is named
after Robin Holliday who first proposed the structure
in 1964 (Holliday, 1964). Figure 1 shows how the two
molecules are held together by the presence of hybrid
DNA, that is, DNA formed with one strand from one
parental molecule and the other strand from the other
parent. Physical models of DNA show that it can
adopt this structure without strain and with all bases
remaining paired.
The Holliday junction is central to recombination
theory because it has three interesting properties.
First, it can isomerize, i.e., take on an alternative
structure (see Isomerization (of Holliday Junctions)).
Second, it can migrate, leading to extension or shortening of the lengths of hybrid DNA (see Branch
Migration). Third, it can be resolved by a special class
of enzymes (resolvases) that cut the structure symmetrically to give two separate molecules (see Resolvase).
Isomerization occurs spontaneously. Resolving the
structure while in one isomer is expected to lead to
crossing-over. In the other isomer, resolution restores
the parental combination of flanking regions, but
lengths of single strands have been exchanged. Migration of a Holliday junction may be able to occur by
random drift, but it is an enzyme-mediated process in
Escherichia coli, where the RuvABC proteins acting
together are able to catalyze both branch migration
and resolution.
Reference
Holliday's Model
P J Hastings
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0622
Further Reading
Holocentric Chromosomes
D G Albertson
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0623
Mitotic Behavior
The nonlocalized kinetochore becomes visible at the
ultrastructural level in prophase. By metaphase, it is
typically a well-differentiated trilaminar structure
resembling the kinetochore of monocentric chromosomes and probably extends the entire length of the
chromosome. Holocentric chromosomes appear as
stiff rods under the light microscope and lack the
primary constriction that demarcates the centromere
of monocentric chromosomes. At metaphase, the
chromosomes align parallel to the equator of the metaphase spindle and lie entirely within the spindle.
Microtubule attachments are distributed along the
kinetochore, so that at anaphase the chromosomes
move broadside on to the spindle poles. Studies in C.
elegans have also demonstrated that these holocentric
chromosomes terminate in telomere sequences similar
to those of mammalian telomeres.
Chromosome Rearrangements
Holocentric chromosome organization allows the
stable propagation of chromosome rearrangements
that are not mitotically and meiotically stable in
organisms with monocentric chromosomes. Translocation chromosomes involving two entire holocentric
chromosomes align and segregate to a single spindle
pole, whereas in organisms with monocentric chromosomes, the linkage of two chromosomes results in
the formation of dicentric chromosomes that fail to
segregate properly. Fragments of holocentric chromosomes may also be propagated, because they retain the
capability to attach to the spindle apparatus. In contrast, fragmentation of monocentric chromosomes
results in the generation of mostly acentric fragments
that are lost. Indeed, before visualization of the holocentric kinetochore by electron microscopy was possible, this differential behavior of holocentric and
monocentric chromosome fragments formed the
basis of a test for holocentric organization.
Meiotic Behavior
Holocentric chromosomes typically behave differently in meiosis and mitosis. In meiosis, in most
organisms that have been examined at the ultrastructural level, no kinetochore structure is seen. Instead,
[Figure 1 diagram: panels (A) Diakinesis, (B) Metaphase I, (C) Anaphase I, (D) Metaphase II, (E) Anaphase II; chromosome ends are labeled a and b]
Figure 1 Orientation and segregation of axially oriented holocentric chromosomes in meiosis. (A) Two holocentric
meiotic bivalents are shown in diakinesis. The homologs are associated at the ends labeled (b). (B–E) Segregation of
meiotic chromosomes on meiotic spindles drawn with the long axis vertical. The spindle microtubules are indicated
by lines and are shown converging toward the poles. (B) Alignment of the bivalents on the metaphase I spindle with
the ends labeled a proximal to the spindle poles and the ends labeled b on the spindle equator. (C) At anaphase I,
homologs separate to opposite spindle poles with the ends labeled a leading the way to the spindle poles. (D) Axial
orientation of sister chromatids with ends labeled b proximal to the spindle poles and the ends labeled a on the
equator of the spindle. (E) Anaphase II segregation with ends b leading the way to the spindle poles.
microtubules appear to project directly into the chromatin. At diakinesis of meiotic prophase, the bivalents
of holocentric chromosomes are composed of homologous chromosomes, which appear to be held
together in an end-to-end association. In earlier literature, this association was attributed to terminalization
of chiasmata, but whether there is terminalization in
organisms for which meiosis I is reductional is now
being questioned. It seems more likely that the
extreme condensation of the chromatin obscures cytological manifestations of distributed crossovers and
gives rise to the apparent end-to-end association of
the homologs. Furthermore, proper disjunction of the
homologs requires a crossover event, and it appears
that the location of the crossover determines which of
the two ends of the homologs are associated in the
bivalent.
The orientation of the bivalents on the metaphase I
spindle varies from species to species. The bivalents may
adopt the equatorial orientation and align parallel to the
equator of the spindle, or they may align parallel to
the spindle pole axis, adopting the axial orientation. If
the bivalent aligns axially, then the sister chromatids
segregate to the same pole at anaphase I, so that the
first meiotic division is reductional, as occurs in meiosis
in species with monocentric chromosomes. For equatorially oriented bivalents, the order is reversed and
the first meiotic division is equational. In C. elegans
and in some heteropteran species, it has been possible
to use cytological markers to study the segregation of
axially oriented homologs. As shown in Figure 1, the
chromosomes align axially at metaphase I and move
end on toward the spindle pole at anaphase I. On
completion of meiosis I, the sister chromatids
remain in association at the ends that were poleward
References
Albertson DG, Rose AM and Villeneuve AM (1997) Chromosome organization, mitosis and meiosis. In: Riddle DL, Blumenthal T, Meyer BJ and Priess JR (eds) C. elegans II, p. 47.
Plainview, NY: Cold Spring Harbor Laboratory Press.
White MJD (1973) Animal Cytology and Evolution. Cambridge:
Cambridge University Press.
Holophyly
E Mayr
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1514
Further Reading
References
Homeobox
T Burglin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0625
Structure of Homeodomain
The typical homeodomain is 60 amino acids long. The
structure of several highly divergent homeodomains
from yeast, flies, and vertebrates has been determined
using X-ray and nuclear magnetic resonance (NMR)
analysis. The different homeodomains are essentially
very similar, even though the primary sequence similarity can be very small. The core of the homeodomain
consists of three α-helices (Figure 1). Helix 2 and helix
3 are linked via a short turn and form a structural motif
called a 'helix-turn-helix motif' that is shared with
many bacterial DNA-binding transcription factors
and repressors. Helix 1 crosses over helix 3 so that the
three helices form a hydrophobic core that stabilizes
the structure (Figure 1). A substantial part of the
DNA-binding activity is located in helix 3, which lies
in the major groove of the DNA and provides most of
the sequence-specific contacts. In particular, residue 9
of helix 3 is a key residue that provides DNA-specific
contacts; most homeodomains have a glutamine at that
position. The flexible N-terminal arm of the homeodomain can reach into the minor groove of the DNA
and provide additional contacts, although the mode of
contact of this arm is subject to more variation between
different types of homeodomains.
In several different classes of homeobox genes,
insertion events have expanded the size
of the homeodomain. The insertion points for extra
residues are either in the loop between helix 1 and
helix 2, such as in the TALE class of homeobox
genes, or in the turn between helix 2 and helix 3.
Figure 1 Structure of the homeodomain. Two schematic views of the Antennapedia homeodomain bound to DNA
as determined by M. Billeter, G. Otting, and colleagues in the laboratory of K. Wüthrich. The NMR data were
modeled in RasMol V2.6; the DNA is shown as a stick model in light gray, while the protein backbone (side chains
not shown) is displayed as a dark ribbon. The numbers indicate the three α-helices. Helix 3 sits in the major groove of
the DNA.
other, often sharing less than 50% identity in the
homeodomain. The vertebrate Evx genes, though not
Hox genes, are part of the vertebrate Hox cluster; in
other species such as flies, this gene family has separated from the cluster.
A number of homeobox gene families share similarities with the Hox cluster genes. For example, the
empty spiracles (ems) and caudal (cad) genes play a
role in anterior-posterior patterning and have a hexapeptide upstream of the homeodomain. A second
group of genes of the NK-2, NK-1, Tlx (which also
has a hexapeptide), and ladybird (lbx) families reside
in another gene cluster, the NK cluster, in Drosophila.
Several NK and NK-related genes are also linked in
vertebrates: analysis of human genome data suggests
that an NK cluster was duplicated and subsequently
broken in vertebrate evolution. A further small cluster, termed the 'ParaHox cluster', has been found in
amphioxus, with the gene families cad, Xlox, and Gsx.
Some other gene families that are dispersed through
the genome, but are more similar to the Hox, NK, and
POU Class
prd Class
[Figure 2 diagram. Drosophila melanogaster: Antennapedia complex (lab, pb, zen, z2, bcd, Dfd, Scr, ftz, Antp) and Bithorax complex. Mouse: HoxA, HoxB, HoxC, and HoxD clusters, paralog groups 1 to 13, with Evx1 and Evx2.]
Figure 2 Hox clusters in Drosophila melanogaster and mouse. The Hox cluster of D. melanogaster contains
12 homeobox genes: labial (lab), proboscipedia (pb), zerknüllt-related (zen2), zerknüllt (zen), bicoid (bcd), Deformed
(Dfd), Sex combs reduced (Scr), fushi tarazu (ftz), Antennapedia (Antp), Ultrabithorax (Ubx), abdominal A (abd-A), Abdominal
B (Abd-B). The cluster is in fact split into two parts: one part is the Antennapedia complex and the other, the Bithorax
complex. Although located in homeotic complexes, several of the homeobox genes in the cluster are not homeotic
genes: zen, zen2, bcd, and ftz. In mouse and other vertebrates, there are four paralogous Hox clusters (apart from
fish, which have more). The duplications of the cluster from a single ancestral cluster probably happened at the
beginning of vertebrate evolution. In the course of evolution, some of the paralogous Hox genes were lost, so that
the present-day mammalian cluster contains 39 Hox genes, as well as two genes of the even-skipped family (Evx1 and
Evx2), which probably formed part of the ancestral cluster, too. The genes can be grouped into 13 paralog groups.
The lines between the mouse and the fly cluster show the evolutionary relationships between the homeobox genes.
Thus, the Hox genes of paralog group 1 are orthologous to lab in flies, paralog groups 913 are homologous to Adb-B.
Not all genes have 1:1 paralogs: for example, in the central part of the cluster the fly genes Antp, Ubx, and adb-A) and
the mouse paralogs Hox6, Hox7, and Hox8 may have arisen through independent duplication events from a single
ancestral gene. Likewise, zen2 is a relatively recent duplication from zen, and bcd also is derived from an ancestral
zen/Hox3 gene.
actually comprised of two similar domains, each containing a helix-turn-helix motif. The homeodomain
distinguishes itself from other homeodomains by
having a serine residue at position 9 of helix 3.
prd-Like Class
LIM Class
ZF Class
cut Class
SO/SIX Class
HD-ZIP Class
TALE Class
pros Class
The prospero class is a highly divergent class of homeodomain proteins. The homeodomain has three extra
residues between helix 2 and helix 3, and a pros
domain of about 100 amino acids follows immediately
after the homeodomain.
Evolution
Homeobox genes are found in plants, fungi, and
animals, and even in slime molds (Dictyostelium).
Although several prokaryotic genomes have now
been sequenced, no true homeobox gene has been
found in these organisms. Thus, it appears likely that
the first homeobox appeared sometime in eukaryote
evolution, probably derived from a helix-turn-helix
factor. In the ancestral organism from which eventually plants, fungi, and animals were derived, at
least two different homeobox genes must have existed
already: one a typical 60-amino acid homeobox gene,
and one TALE homeobox gene. This ancestral TALE
homeobox gene had a conserved upstream domain
from which the KNOX, MEIS, and PBC domains
are derived. While in plants and fungi some proliferation of different types of homeobox genes has
taken place, by far the largest expansion has happened
Function
Given the widespread nature of homeobox genes in
higher eukaryotes, it is not surprising that their function is very diverse. Nevertheless, most of them play
important roles in the development of their respective
organisms. Homeobox genes have been found to function at the earliest points in development as well as in
the very latest cell differentiation events. Some examples follow.
The Hox cluster genes of animals are involved in
patterning and specification of identity in regions
along the anterior-posterior body axis. A striking
aspect is that the order of the genes in the cluster is collinear with their function along the anterior-posterior
axis. Thus, the Drosophila gene labial (lab) functions
in the very anterior of the animal, while the gene
Abdominal-B (Abd-B) functions in the 5th to 8th
abdominal segments. In vertebrates, the Hox cluster
genes are likewise involved in patterning along the
body axis. Since the binding site of Hox cluster genes
is rather short, additional cofactors are necessary to
provide DNA-binding specificity. Two TALE class
homeobox genes have been identified as cofactors for
Hox cluster genes: in flies the two genes extradenticle
(exd), a PBC class gene, and homothorax (hth), a MEIS
class gene, have been shown to form complexes with
Hox proteins such as Ubx or lab.
One of the earliest developmental homeobox genes
is found in Drosophila. The gene bicoid plays a key
role in setting up the anterior-posterior axis in the
embryo. Mutant embryos lack head and thorax and
develop posterior structures at the head. The bcd
RNA is provided maternally, and the RNA as well as
the protein are localized at the anterior pole of the
embryo. Despite the crucial role in early development
for Drosophila, the bcd gene is a relatively new gene in
evolutionary terms; it is most likely derived from an
ancestral gene of the Hox3 group. Furthermore, while
most homeobox genes bind DNA and function as
transcription factors, bcd also plays a regulatory role
at the level of messenger RNA. It can bind RNA and
regulate expression of other genes at the translational
level.
The C. elegans prd-like gene unc-4 is involved in
the specification of motor neurons. Mutations in this
Further Reading
Homeotic Genes
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1857
Homeotic Mutation
T Burglin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0626
fly counterparts. Functional analysis of these genes
using knock-out techniques revealed that in vertebrates, too, they function in patterning along the
anterior-posterior body axis and cause homeotic
transformations. The HOM-C and HOX clusters harbor perhaps the best-known developmental control genes; Ed Lewis was awarded the Nobel Prize for
his ground-breaking studies of BX-C.
While the term `homeotic mutation' is mainly
known from mutations in segmentation genes in Drosophila, the original definition of homeosis is very
broad. Thus, other mutations that cause transformations have also been termed homeotic. For example,
in the nematode Caenorhabditis elegans the gene lin-12, which encodes a transmembrane receptor, has been
termed a homeotic gene, because many cell lineages
(patterns of cell divisions) are transformed into other
cell lineages. In plants, many homeotic mutations
are known that cause transformations of leaves and
flowers.
See also: Homeobox; Lewis, Edward
Homogeneously Staining
Regions
G Levan
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1582
Reference
Schwab M (1999) Oncogene amplification in solid tumors. Seminars in Cancer Biology 9: 319-325.
Homologous
Chromosomes
J R S Fincham
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0627
Homologous means having a common origin by descent. Chromosomes that are homologous in this broad
sense may have diverged to a very considerable extent.
In the long evolutionary term, chromosomes may
undergo structural rearrangements, so that homology
between different species and genera is often a property of chromosome segments rather than whole
chromosomes. Between the chromosomes of mice
and humans, for example, there is quite a high degree
of patchwork homology, with blocks of similar genes
in locally similar sequences (syntenic genes) in very
different larger-scale arrangements.
However, in the context of experimental genetics,
homology most usually refers to the close similarity of
the pairs of chromosomes in the diploid organism.
The classical criterion for assessing homology in this
stricter sense is ability to pair. A fully homologous
chromosome pair will be closely associated all along
their lengths at the pachytene stage of meiosis. The
pairing can be seen under the light microscope in
organisms with reasonably large chromosomes (e.g.,
very clearly in Zea mays and not at all in yeasts, except
by fluorescent in situ hybridization, also known as
FISH), but the synaptonemal complex, which is
formed between the paired homologues, can usually
be clearly visualized with the electron microscope,
even in yeasts, by virtue of its staining with silver
ions. In Drosophila species (as well as in other flies),
homologous pairing can be seen in unrivalled detail in
the giant nuclei of salivary gland cells, where maximally
extended chromosomes are amplified over 100-fold in
thickness by repeated replication without separation
(polytene chromosomes), and homologs are closely
paired. The close pairing of chromosomes, either at
pachytene of meiosis or in Drosophila giant nuclei
(where much more detail can be seen), reveals structural
Homology
Homologs
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1858
E O Wiley
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0628
structure and position of the characters of two different
organisms seems necessary to be able to do so. The
practice of distinguishing characters and character
states grew from the use of data matrices where columns of data were given a general name and the
characters of organisms a specific name. But, columns
really represent initial hypotheses of homology. Characters placed in a single column of data are initially
thought to be good candidates for having a homologous relationship. Whether this is true in the end is
another matter.
Nature of Homologs
Exactly what constitutes homology from the ontological perspective is also debated. Although we
understand that homologs gain their `comparability'
through descent, and we understand that homologies
appear as apomorphies on phylogenetic trees, we also
understand that the homologies being compared do
not actually have descent relationships. That is, right
hands do not actually give rise to other right hands,
nor does guanine at position 158 in a cytochrome b
gene sequence give rise to a descendant guanine at that
same position in a descendant mitochondrion. Rather,
the relationship between homologs is always indirect,
being mediated by ontogeny at the morphological
level and semiconservative replication at the DNA
level (and other processes at intermediate levels).
This has led authors such as Van Valen (1982),
Hausperger (1991), and Roth (1994) to characterize
homology as a manifestation of the flow of biological
information between generations and over phylogeny.
This concept of information should not be confused
with sequence information; it includes epigenetic
information as well. Under this concept, homologous
structures are the observable manifestation of information flow over time and through descent. If so, then
this general concept of homology can be easily
extended to behavioral and functional characters (see
Greene, 1994 and Lauder, 1994, for examples).
Further, it solves certain conundrums such as how to
homologize Meckel's cartilage in vertebrates where
the structure is induced by different tissues (see review
in Wagner, 1994). In such cases, epigenetic constraints
on the developing phenotype may allow for considerable variation in the actual way that a particular structure is built during ontogeny.
synapomorphies, then what appears to be the most
parsimonious tree (four dependent synapomorphies)
may actually have less support than the alternative tree.
[Figure 1: labels recovered from the diagram: sequence synapomorphies of orthologous hemoglobin genes; origin of a hemoglobin gene via tandem duplication; sequence synapomorphies of a hemoglobin gene; myoglobin; sequence synapomorphies among paralogous genes.]
Similarity Test
Conjunction Test
``If two supposed homologues are found together
in one organism, they cannot be homologous''
(Patterson, 1988, p. 605). In morphological characters,
similarities that are found in two to many `copies' are
termed homonomies. Homonomies (iterative homologies at lower levels) may take the form of `serial
homologs' in the case of metameristic repeats of
body segments or `mass' or `general' homologies in
the case of hair in mammals. Homology statements
mixing parts of homonomous body segments would
result in spurious taxic homology statements (as in
paralogous genes). However, the evolutionary novelty
that produced the serial homology may act as a synapomorphy at a higher level in the phylogeny. In the
case of such characters as mammalian hair, general
presence of the mass homology may act as a synapomorphy in spite of the fact that it may be difficult
to impossible to provide a one-to-one homology
statement about individual hairs.
In genetic systems, similar characters that fail
the conjunction test may take the form of paralogous
genes or xenologous genes (paralogy is discussed
above). Some paralogous gene families show concerted evolution and in some circumstances their copies
(plerologs) may be treated as a single gene for the
purposes of phylogenetic analysis (but see Hillis
(1994), for further discussion). Xenology obtains
when genes of the same gene family are spread by
lateral gene transfer rather than common descent.
On a phylogeny of organisms, paralogous genes are
expected to form nested sets of characters that reflect
descent of the gene family (Figure 1). There is no
expectation of such a pattern in xenologs whose
spread is not historically constrained.
Congruence Test
Similarities that pass both the similarity and conjunction tests, and whose distribution on a phylogenetic
tree is congruent with many other similarities, are
deduced to be candidates for the status of uncontested homologies. Under the assumption that homology is apomorphy, the congruence test serves as
the final arbiter for accepting or rejecting parts
that pass the first two tests. Similarities that fail the
congruence test are frequently termed parallelisms if
they are very similar, or convergences if they are dissimilar upon reinspection. (The distinction between
parallel and convergent characters is debatable, see
Homoplasy.)
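The three tests can be read as a sequential filter applied to each proposed similarity. The following Python sketch is illustrative only: the class name, field names, and returned labels are assumptions made for this example, and the verdict for failing the similarity test follows the statement in the Homoplasy entry that analogies fail that test.

```python
from dataclasses import dataclass


@dataclass
class ProposedSimilarity:
    """Hypothetical record of how one proposed similarity fares on each test."""
    passes_similarity: bool   # comparable in structure and position
    passes_conjunction: bool  # not found together with its counterpart in one organism
    passes_congruence: bool   # distribution on the tree agrees with other characters


def classify(s: ProposedSimilarity) -> str:
    """Apply the similarity, conjunction, and congruence tests in order."""
    if not s.passes_similarity:
        # Assumption: labelled as analogy, following the Homoplasy entry.
        return "rejected at similarity test (e.g., analogy)"
    if not s.passes_conjunction:
        return "homonomy, paralogy, or xenology"
    if not s.passes_congruence:
        return "homoplasy (parallelism or convergence)"
    return "candidate uncontested homology"


if __name__ == "__main__":
    # Example: similar and not co-occurring, but incongruent with other characters.
    wings = ProposedSimilarity(True, True, False)
    print(classify(wings))
```

The order of the checks matters: a similarity is only evaluated for congruence once it has passed the first two tests, which mirrors the role of the congruence test as the final step.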
References
Homoplasy
E O Wiley
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0629
Analogous Similarities
There is considerable confusion concerning the concept of analogy (see Analogy). Analogous similarities,
as the term is usually applied in systematics, are
similarities in function and frequently do not appear
as similarities in underlying structure. As such, they
do not usually appear as homoplasies because they are
screened before analysis and would appear in different
data columns in a matrix of characters. That is, the
investigator would not enter the analysis with an
underlying hypothesis that the wings of bats and the
wings of insects were homologous. However, analogous similarities can appear as homoplasies if the
underlying structures are homologous but modified
to perform a similar function. For example, one could
imagine that a matrix of all vertebrates would contain
a column containing the character `wings present'
versus `forelimbs present,' and that the resulting analysis would show `wings present' as homoplastic,
appearing as a synapomorphy of birds and another
synapomorphy of bats independently. However,
even a cursory examination of the character `having
wings' would reveal that the wings of
bats and birds differ in the details of
wing architecture. Analogies fail the similarity test
(see Analogy for further discussion).
Homonomy
At the level of taxa, homonoms are similarities that fail
the conjunction test because two or more similar
structures are found in the same taxon. Some homonomies are termed `general homologies' or `mass
homologies' and their presence versus absence is
treated as a synapomorphy. For example, although it
is difficult to homologize any two mammalian hair
follicles, the presence of hair is treated as a synapomorphy of Mammalia. In other cases, body parts are
duplicated through metamerism in development
(legs and antennae of insects). The classic example of
homoplasy in this case is the mouth parts of
arthropods, where the function of feeding is allocated to
appendages that belong to different segments in different groups. Homologous structures can be found
among organisms within a homologous segment,
but comparison of similar appendages that belong to
different segments would lead to a mistake in taxic
homology determination. Paralogous genes are examples of homonomous parts at the organism level
of organization. Just as with `mass homology,' the
presence of a particular gene family may be treated
as a synapomorphy if it passes the congruence test
(see Homology).
References
Homozygosity
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0630
Hordeum Species
J W Snape and W Powell
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1673
Uses of Barley
The grain of cultivated barley has two major uses, first
for malting to produce beer and spirits, and second for
animal feed. Plant breeding has produced varieties that
are specialized for malting, and all others not suitable
for this are used for animal feed. These latter varieties
tend to be the highest yielding. Some grain is used
directly for human food products, for example, in
certain countries such as Ethiopia and Nepal, but
overall, this is a minor use. Malting varieties are bred
for a particular grain composition which includes low
protein and β-glucan content, and high enzyme activity, although the final product is also affected by the
environmental conditions under which the variety is
grown. To produce malt for the brewing industry,
grains of barley are germinated so that enzymes are
released for digestion of the cell walls and endosperm.
The digested grains are then heat treated and dried.
This produces malt which is a mixture of enzymes and
substrates, mainly starch, proteins, and β-glucans. The
malting process thus involves degradation of the cell
wall material by β-glucanases, digestion of starch by
α-amylases, and hydrolysis of the protein matrix. The
malt is then used as a substrate for fermentation by
Cytogenetics of Barley
Barleys have a basic chromosome number of seven.
The chromosomes are large enough to be identified
individually by light microscopy, particularly if they
are differentially stained using C-banding or N-banding where blocks of heterochromatin reveal distinctive
patterns for each chromosome. This also reveals species relationships, such as the close relationship
between the genomes of North and South American
species. The H symbol (with or without a species superscript) is conventionally used to designate chromosomes of the genus so as to indicate homoeology
with chromosomes of other species of Triticeae. Cultivated barley chromosomes are thus designated 1H to
7H, and H. bulbosum chromosomes or H. chilense
chromosomes H^b and H^ch, respectively. Barley geneticists originally designated the chromosomes of cultivated barley from 1 to 7, and the relationship between
the old and new (H) nomenclature is 1 = 7H, 2 = 2H,
3 = 3H, 4 = 4H, 5 = 1H, 6 = 6H, and 7 = 5H.
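The correspondence between the old numbering and the H nomenclature is a simple one-to-one lookup (old 1 = 7H, 2 = 2H, 3 = 3H, 4 = 4H, 5 = 1H, 6 = 6H, 7 = 5H). A minimal Python sketch of that lookup follows; the dictionary and function names are hypothetical and chosen only for illustration.

```python
# Old barley chromosome numbers mapped to the Triticeae (H) designations.
OLD_TO_H = {1: "7H", 2: "2H", 3: "3H", 4: "4H", 5: "1H", 6: "6H", 7: "5H"}

# Reverse lookup, built from the same table so the two cannot disagree.
H_TO_OLD = {h_name: old for old, h_name in OLD_TO_H.items()}


def to_h_nomenclature(old_number: int) -> str:
    """Return the H designation for an old chromosome number (1 to 7)."""
    if old_number not in OLD_TO_H:
        raise ValueError(f"old chromosome number must be 1-7, got {old_number}")
    return OLD_TO_H[old_number]


def to_old_nomenclature(h_name: str) -> int:
    """Return the old chromosome number for an H designation such as '5H'."""
    if h_name not in H_TO_OLD:
        raise ValueError(f"unknown H designation: {h_name}")
    return H_TO_OLD[h_name]


if __name__ == "__main__":
    for old in sorted(OLD_TO_H):
        print(old, "=", to_h_nomenclature(old))
```

Only chromosomes 1, 5, and 7 change number between the two systems; 2, 3, 4, and 6 keep theirs.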
Cytogenetical and genetical analysis in barley has
been greatly assisted by the availability of a range of
aneuploid stocks including a complete barley trisomic
series and 11 telotrisomic lines. Amongst the most
useful aneuploid stocks for genetical analysis in barley
are those obtained by interspecific hybridization. In
particular, the chromosomes of cultivated barley and
H. chilense have been added to bread wheat, and substitution
lines have been derived from these. In addition, cultivated barley
has a whole range of other cytogenetically defined
stocks including over 1000 reciprocal translocations,
inversions, deletions, and duplications. Deletion
breakpoints have been used to make comparisons
between physical and genetic maps of barley.
Breeding Barley
Barley is a natural inbreeder and most breeding
schemes follow a pedigree selection scheme with
minor variations in detail. All schemes are based on
the principle of identifying the desirable recombinant
whilst progressing to homozygosity. Conventional
breeding schemes tend to be lengthy (up to 10 years)
but have been successful in contributing an average
1% annual increase in grain yield. Both single seed
descent and doubled haploid methods are being used
to augment conventional breeding methods. These
approaches reduce the time scale and improve the
efficiency of selection by creating homozygous material for evaluation. Molecular marker technology is
being used to enhance the effectiveness of barley
Horizontal Transfer
M G Kidwell
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0632
Significance
The existence of horizontal gene transfer in nature has
important implications for both basic and applied
science. However, because of the infancy of studies
in this area, the full significance of this process is not
yet known. The strictly bifurcating tree of life as
envisaged by Darwin assumes no exceptions to the
vertical transmission inherent in normal parent-to-offspring inheritance of genetic material. In contrast, horizontal transfer introduces crosslinks into the
phylogenetic trees of those genes that are transferred
and incongruities between the phylogenies of different gene sequences. Thus if horizontal transfer is frequent, our picture of the tree of life is changed
significantly and serious practical difficulties can
arise when attempts are made to infer phylogenies
from horizontally transferred sequences. The existence of natural horizontal transfer also has important
implications for artificial gene transfer in medicine
and agriculture.
See also: Conjugation; Symbiosis Islands; Transfer
of Genetic Information from Agrobacterium
tumefaciens to Plants
Host-Lethal Gene
E Kutter
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0633
Host-Range Mutant
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0634
Hot Spots
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0635
Housekeeping Gene
Hox Genes
M Goldman
work together and can compensate for each other.
Hence a defect in one gene is corrected by the similar
activities of other Hox proteins. In humans, specific
Hox genes have been implicated in genetic disorders
affecting development of the limbs and the genitourinary tract. Several studies have suggested that
Hox genes are also required for proper function of
adult tissues. Specific Hox genes function together to
control development of the mammary gland in
response to pregnancy, whereas others may be
involved in human endometrial development and
implantation. Recent studies have also shown direct
involvement of deregulated Hox genes in the development of human leukemias.
Since the description of the first homeotic mutations by Bateson in 1894 and the discovery of the
homeodomain in 1984 there has been tremendous
progress in understanding the function of these
important genes. These genes represent important
control points in the processes that regulate morphogenesis, or how tissues are formed and patterned. To build a picture of how this entire process
occurs we still need to determine the immediate
gene and cellular targets of their action in order to
understand how they regulate cell growth and differentiation.
See also: Homeotic Genes
Hsp
See: Heat Shock Proteins
HTLV-1
M J S Dyer
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1583
protein), and Pol (RNA-dependent DNA polymerase) genes, but also Tax and Rex, genes involved in
the regulation and splicing of viral RNA. HTLV-1
lacks an obvious transforming oncogene. HTLV-1
may infect several different cell types in vitro but
only replicates efficiently in CD4+ T cells. The virus
is endemic in the tropics and the prevalence may reach
over 20% in some areas. Transmission may be vertical,
from mother to infant by breastfeeding, or horizontal, via intravenous drug abuse, sexual contact, or
transfusion of contaminated blood. The percentage
of infected individuals developing either disease is
very low and the factors necessary for the development remain unknown.
ATLL
In 1977, a rapidly progressive and uniformly fatal
T-cell lymphoproliferative disorder in patients from
the south west of Kyushu, Japan was described by
Takatsuki and Uchiyama. An identical disease in
patients from the Caribbean was described by
Catovsky and colleagues in 1982 in London. Subsequent investigations showed the presence of antibodies to HTLV-1 and monoclonal proviral integration
in tumor cells. Other cases have now been reported
from a number of other geographical sources including southeastern USA, South America (Chile and
Brazil), and West Africa. In southwest Japan, ATLL
constitutes a major health problem. Various clinical
types of ATLL have been described, but ultimately, all
forms progress and are fatal; treatment is often associated with opportunistic infections. Patients present
with enlarged lymph nodes and skin rash often accompanied by hypercalcemia. Diagnosis is usually made
on the presence of cells with a characteristic convoluted nuclear morphology (`flower cells') in the
peripheral blood. These cells are characteristically
CD4+ and CD25+, the latter being a component of
the IL-2 receptor. There are no consistent cytogenetic
abnormalities in ATLL patients and the mechanisms
that promote transformation of infected CD4+ T cells
are not known. Proviral integration appears to be
random. Comparative genomic hybridization studies
have shown amplification of 14q32 and 2p13 in some
patients, although the nature of the target genes is not
known. Recent work indicates that viral Tax may
result in constitutive NF-κB activation and therefore
prolonged cell survival through its interaction with
the IKKβ/IKKγ complex of controlling kinases.
Expression of antiapoptotic proteins such as BCL-XL
may also be upregulated.
See also: Retroviruses
Human Chromosomes
M A Ferguson-Smith
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0643
Human Genetics
M A Ferguson-Smith
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0644
The Human Genome Project (HGP) is an international 13-year effort to sequence and discover all
human genes (the human genome) and make them
accessible for further biological study. The collaborative project began formally in October 1990, and
involves 20 groups from the USA, UK, Japan, France,
Germany, and China. Originally, the project was
expected to last 15 years, but technological advances
have brought forward the completion date to 2003.
The total size of the human genome is estimated to
be about 3 billion base pairs, arrayed in 24 distinct
chromosomes (autosomes 1-22 plus X and Y). The
chromosomes range in size from 50 to 250 million
bases (megabases), too large to be sequenced
directly, so each chromosome is first broken into relatively large fragments about 150 000 bp long. The
large fragments are inserted into bacterial artificial
chromosomes (BACs), and genome mapping techniques are used to determine the position of each
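The figures quoted above (a genome of about 3 billion base pairs, chromosomes of 50 to 250 megabases, and fragments of about 150 000 bp) give a rough sense of how many BAC-sized fragments are involved. The Python sketch below is a back-of-the-envelope calculation only: it assumes a single, non-overlapping tiling of fragments, which is a simplification introduced here, and the function and constant names are hypothetical.

```python
GENOME_BP = 3_000_000_000   # approximate genome size quoted in the text
FRAGMENT_BP = 150_000       # approximate BAC insert size quoted in the text


def fragments_needed(length_bp: int, fragment_bp: int = FRAGMENT_BP) -> int:
    """Minimum number of non-overlapping fragments needed to span length_bp.

    Assumption: one-fold, gap-free tiling with no overlap between fragments.
    """
    if length_bp <= 0 or fragment_bp <= 0:
        raise ValueError("lengths must be positive")
    return -(-length_bp // fragment_bp)  # ceiling division on integers


if __name__ == "__main__":
    print(fragments_needed(GENOME_BP))    # whole genome: 20000
    print(fragments_needed(50_000_000))   # smallest chromosomes: 334
    print(fragments_needed(250_000_000))  # largest chromosomes: 1667
```

Under these assumptions the whole genome corresponds to about 20 000 fragments, and a single chromosome to a few hundred up to well over a thousand.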
References:
Dunham I et al. (1999) The DNA sequence of human chromosome 22. Nature 402: 489-499.
International Human Genome Sequencing Consortium Announces ``Working Draft'' of Human Genome (Press Release), The Sanger Centre, Monday 26 June 2000.
http://www.ncbi.nlm.nih.gov/genome/seq/page.cgi?F=HsHome.shtml&ORG=Hs
International Human Genome Sequencing Consortium (2001) Initial Sequencing of the Human Genome. Nature 409: 860-921.
Hunter Syndrome
K M Beckingham
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0645
Huntington's Disease
D C Rubinsztein
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0646
Genetics
HD is associated with abnormal expansions of a
(CAG)n repeat in the 5' end of the coding region of
a large gene called IT15. Normal chromosomes are
Pathology
Neuronal loss in HD is particularly severe in the
caudate nucleus, putamen, and cerebral cortex. However, in advanced cases, there is overall atrophy and
brain weight can be reduced by up to 25%. The cell
loss in the caudate nucleus and the putamen (which
together comprise the corpus striatum) is selective.
The earliest loss is in the dorsal and medial regions
and this progresses laterally and ventrally as the disease takes its course. Within the striatum, the medium
spiny neurons, particularly those synthesizing enkephalin and γ-aminobutyric acid (GABA), show particular sensitivity to the HD mutation. In the cortex,
the large neurons appear to be most severely affected,
with greatest loss in layers VI, V, and III.
Pathological Mechanisms
The HD mutation confers a deleterious gain-of-function on the mutant protein. This model was suggested before the gene was cloned by observations
in patients with Wolf-Hirschhorn syndrome. These
Further Reading
Hurler Syndrome
T M Picknett and S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0647
primate and human paleontology and ethology, linking evolution to Homo sapiens.
Having had to fight his way into and to the top of
the scientific profession he also helped set in place
procedures for scientists to be awarded salaries. This
gave all people, rich and poor, a chance to enter the
scientific ranks.
See also: Darwin, Charles
Hybrid
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0649
Hybrid is the term for the offspring from two genetically distinct parents. When the two parents have no
recent common ancestry, the offspring are referred to
as F1 hybrids.
See also: F1 Hybrid
Hybrid-Arrested
Translation
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1859
Hybrid Dysgenesis
M G Kidwell
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0651
Hybrid Vigor
J A Fossella
L Silver
Hybridization
T M Picknett and S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1866
Hydatidiform Moles
D K Kalousek
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0658
genetic contribution is more important in the development of the early embryo. This differential expression of genetic messages, depending on their maternal
or paternal origin, is known as genomic imprinting
(Hall, 1990).
[Figure 1: labels recovered from the diagram: anuclear; 23; CHM, 46 chromosomes; PHM, 69 chromosomes.]
Table 1
Feature
Complete mole
Partial mole
Clinical presentation
Gestational age
Uterine size
Serum hCG
Cytogenetics
Spontaneous abortion
16-18 weeks
Often large for dates
Absent
Present
Round
Marked
Circumferential
Often present
Scalloped
Less pronounced
Focal, minimal
Absent
hCG, human chorionic gonadotropin; PLAP, placental alkaline phosphatase; PL, placental lactogen. (Modified from
Silverberg SG and Kurman RJ (1992) Atlas of Tumor Pathology: Tumors of the Uterine Corpus and Gestational Trophoblastic
Disease. Washington DC: Armed Forces Institute of Pathology.)
paternal chromosome sets, is detected. A few trisomic conceptuses with partial mole-like morphology have been described.
The gross specimen in PHM shows hydropic villi like
those seen in CHM mixed with nonmolar placental
tissue. Evidence of an embryo or an amnion is usually
present; stromal vasculature and vessels may contain
fetal nucleated erythrocytes. Microscopic and
differential features between CHM and PHM are
fetal phenotypes have been delineated: type I fetus
with paternal sets dominance, associated with a large
cystic placenta, has relatively normal fetal growth and
microcephaly; type II fetus with maternal sets dominance, associated with a small noncystic placenta, is
markedly growth retarded, and has a disproportionately large head (McFadden and Kalousek, 1991).
References
Hyperchromicity
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.2105
Hypervariable Region
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1868
I
Ichthyosis
F M Pope
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0661
of the extensor surfaces, sparing the trunk and flexures. There is criss-crossing of the palms and soles
and histologically the granular cell layer is deficient,
with epidermal hyperkeratosis. There is frequently
associated atopy. Profilaggrin deficiency has been
identified (Sybert et al., 1985).
X-Linked Ichthyosis
This also has a very early onset, at birth or within
the first 3 months of age. The distribution of scaling
differs substantially from ichthyosis vulgaris. Thus it
involves the scalp, ears, neck, and flexures and affects
the abdomen and the anterior trunk. Unlike ichthyosis
vulgaris, the epidermis is hypertrophic, with a normal
granular layer.
Both 3β-steroid sulfatase and aryl sulfatase are
deficient causing estriol deficiency, delayed labor,
and increased fetal loss. Postnatally, affected boys
develop ichthyosis. In some families there is hypogonadism. The STS gene has been cloned and in most
cases is completely deleted, but if not has 5' misfunctional deletions (Basler et al., 1992). In other cases
there are point mutations. The steroid sulfatase
enzyme assay is also very reliable, and a simple staining assay for hexanol dehydrogenase provides rapid
confirmation of diagnosis (Lake et al., 1991).
Figure 1 (See Plate 21) (A) Generalized hyperkeratosis with background erythema of the lower limbs;
(B) palmoplantar hyperkeratosis extending proximally, typical of epidermolytic hyperkeratosis.
delicate than EH skin. However, like EH, there are
keratin mutations, in this case in the rod domain of the
keratin 2e gene on chromosome 12.
Lamellar Ichthyosis
Usually affected infants are collodion babies at birth.
There are ectodermal dysplastic features, with poor
sweating, dystrophic nails, alopecia, and ectropion.
However, the very large branny scales are diagnostic
and very typical and, furthermore, the teeth are normal. Inheritance is usually autosomal recessive.
Another variant has much more severe erythroderma and more severe collodion changes. Like the
former type, there are very large adherent scales. The
two differ histologically, with severe orthokeratosis
and hyperkeratosis in the milder phenotype. The outcome is variable, some affected infants dying of dehydration, sepsis, or hypoproteinemia, whilst others heal
and survive. There are two nonallelic gene loci, one of
which at 14q11 is close to the transglutaminase gene
(Russell et al., 1995), and mutations have been
detected (Parmentier et al., 1995). There is a second
Harlequin Fetus
It is unclear whether this is allelic to the lamellar
ichthyoses or a separate entity (Figure 2). In any
event, there is spectacular hyperkeratosis, with very
severe facial edema and distortion. It is unclear
whether or not this is allelic to any of the other lamellar ichthyoses.
Other Ichthyoses
These include Netherton syndrome (linear ichthyosis,
with pili torti, or trichorrhexis), Refsum disease (ichthyosis vulgaris-like scaling with retinitis pigmentosa,
peripheral neuropathy, and cerebellar ataxia), trichothiodystrophy, ichthyosiform erythroderma with
Rothnagel JA, Dominey AM, Dempsey LD et al. (1992) Mutations in the rod domains of keratins 1 and 10 in epidermolytic
hyperkeratosis. Science 257: 1128–1130.
Russell LJ, Di Giovanna LJJ, Rogers GR et al. (1995) Mutations
for the gene for transglutaminase 1 in autosomal recessive
lamellar ichthyosis. Nature Genetics 9: 279–283.
Sybert VP, Dale BA and Holbrook KA (1985) Ichthyosis vulgaris:
identification of a defect in synthesis of filaggrin correlated
with an absence of keratohyaline granules. Journal of Investigative Dermatology 84: 191–194.
Identity by Descent
D L Hartl
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0662
References
References
Idiogram
M A Ferguson-Smith
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0663
Igf2 Locus
K N Gracy and B J Smith
Further Reading
Igf 2r Locus
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0665
Illegitimate
Recombination
D Carroll
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0666
The definition of illegitimate recombination is
imprecise, since there is no strict threshold in the
amount of homology that defines legitimate events.
When the junctions resulting from illegitimate recombination are examined (for example, in experiments
with cultured mammalian cells), they often show
matches of a few base pairs between the parental
sequences. These microhomologies are not absolutely
required, since some joints show no such matches.
Typically the number of matched base pairs at the junction is 1–5, but occasionally longer matches are seen.
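The microhomology at a junction can be counted directly from the two parental sequences. The following is a minimal sketch rather than a published method: the function name, the breakpoint convention, and the toy sequences are illustrative assumptions. It counts the bases at the joint that are identical in both parents and therefore cannot be assigned to either one.

```python
def junction_microhomology(left, left_break, right, right_break):
    """Count bases at a junction that match in both parental sequences.

    The recombinant is assumed to be left[:left_break] + right[right_break:].
    Identical bases immediately upstream and immediately downstream of the
    two breakpoints could have come from either parent; together they make
    up the microhomology at the joint.
    """
    upstream = 0
    while (upstream < left_break and upstream < right_break
           and left[left_break - 1 - upstream] == right[right_break - 1 - upstream]):
        upstream += 1

    downstream = 0
    while (left_break + downstream < len(left)
           and right_break + downstream < len(right)
           and left[left_break + downstream] == right[right_break + downstream]):
        downstream += 1

    return upstream + downstream


# Hypothetical parental sequences that share the 4-bp stretch CAGG.
left_parent = "GATTACAGGTCC"
right_parent = "TTCGCAGGAATT"
print(junction_microhomology(left_parent, 5, right_parent, 4))
```

Because the shared bases are ambiguous, placing the breakpoint anywhere inside the matched stretch gives the same recombinant and the same count, which is why the function scans in both directions.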
Illegitimate recombination is observed in most
organisms. The origin of spontaneous illegitimate
events cannot be traced, but such joints are clearly
formed in response to double-strand breaks in chromosomal DNA. In cultured cells from multicellular
eukaryotes, illegitimate end joining is the most common fate of linear DNAs introduced artificially into
the cells. The yeast Saccharomyces cerevisiae and some
other fungi have very efficient mechanisms of homologous recombination that predominate in the processing of chromosomal breaks and of DNA introduced
during transformation; but illegitimate events can be
detected if no homology is present, or if the capability
of performing homologous recombination has been
disabled by mutation.
The mechanism by which illegitimate recombination occurs in cells is not known. Two simple and
attractive hypotheses describe mechanisms that very
likely both contribute to the observed junctions. The
first hypothesis is that DNA ends are simply joined by
a DNA ligase (Figure 1). When there are complementary nucleotides appropriately situated in single-stranded regions, they stabilize an association between
the ends and help set the register for the ligase. Joints
of this type have been produced in crude extracts from
eukaryotic cells and with some purified DNA ligases.
There are also ligases (e.g., that encoded by bacteriophage T4) that can join blunt DNA ends that have no
single-stranded overlaps.
The second hypothesis is that rather long single-stranded tails are formed (presumably by the action
of exonucleases) at broken ends (Figure 2). If these
tails have free 3′ ends, microhomologies can support
transient associations that can be stabilized by DNA
synthesis through use of the transient joint as a
primer-template complex by a cellular DNA polymerase.
While the details of the illegitimate recombination
mechanism remain obscure, some information is available on proteins that participate in the process. In
mammalian cells, a protein complex called Ku and its
associated DNA-dependent protein kinase (DNA-PK) are required for efficient end joining. Yeast share
the requirement for Ku and for a DNA ligase that is
Figure 1 Illegitimate recombination by microhomology-directed ligation. The broken ends of two DNA
molecules are indicated in the top diagram. Horizontal
lines show the phosphodiester backbones and short
vertical lines the Watson–Crick base pairs. In step 1, each
end is partly degraded by a strand-specific exonuclease. In
step 2, the single-stranded tails of the two DNAs come
together, directed by the formation of two base pairs
surrounding a mismatch. In step 3, the strands of the two
DNAs are joined by a cellular DNA ligase.
Figure 2 Illegitimate recombination by microhomology-directed DNA synthesis. The starting point is the same
as in Figure 1. In step 1, each DNA is degraded more
extensively by a 5′→3′ exonuclease. In step 2, the 3′ single-stranded tails come together through the formation of two
base pairs. In step 3, the 3′ end of the thinner strand is
extended by DNA polymerase, forming an extended
base-paired region. In step 4, the thicker single strand
is degraded, its 3′ end is used to prime synthesis in the
remaining gap, and nicks at both ends of the new joint are
sealed by DNA ligase.
Further Reading
Roth D and Wilson J (1988) Illegitimate recombination in mammalian cells. In: Kucherlapati R and Smith GR (eds) Genetic
Recombination, pp. 621–653. Washington, DC: American
Society for Microbiology Press.
Immunity
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1870
Immunoglobulin Gene
Superfamily
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0669
Imprinting, Genomic
A C Ferguson-Smith
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0672
(Figure panels A–D; labels: imprinted genes, non-imprinted genes, maternal, paternal.)
Developmental Consequences of
Imprinting
Genomic imprinting ensures the requirement for both
a mother and a father to produce normal mammalian
offspring as shown by the failure of bimaternal and
bipaternal conceptuses to complete embryogenesis.
Parthenogenesis, the development of an egg without
Imprinting in Disease
It became evident that imprinting had clinical implications through the study of patients with disorders
that exhibit parental-origin effects in their patterns of
inheritance. There are now several syndromes which
are recognized as imprinting disorders. Imprinting
mutations have also been implicated in the genesis of
some tumors, notably Wilms' tumor and familial
glomus tumors. These imprinting disorders show a
normal autosomal dominant pattern of inheritance,
but only when transmitted by a parent of one sex; offspring of an affected
individual of the opposite sex are completely unaffected. The disorder remanifests itself in a subsequent
generation after inheritance through a phenotypically
normal carrier individual of the appropriate sex. Males
and females are equally affected, which clearly distinguishes an imprinting pedigree from that of a sex-linked disorder. For example, benign familial glomus
tumors show autosomal dominant inheritance but are
only manifest in individuals inheriting the mutant
gene from their fathers. Inheritance of the mutation
from the mother results in normal offspring; however,
her sons (if carriers) will have affected offspring at a
frequency of 50%. Other imprinting disorders have
been associated with a significant level of UPD. To
date, all human chromosomes involved in these syndromes show evolutionary conservation with those
in the mouse identified as imprinted chromosomes
(see above). In addition to hereditary glomus tumors,
imprinted disorders identified to date include
Beckwith–Wiedemann syndrome and Silver–Russell
syndrome, which are growth defects; two neurological
disorders, Angelman syndrome and Prader–Willi
syndrome; transient neonatal diabetes; and maternal
UPD14 syndrome. The latter is a rare disorder associated with growth defects and premature puberty.
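The parental-origin rule described for familial glomus tumors can be written out as a small sketch. The function names and the string labels for sex and parent of origin are hypothetical, and the model assumes a heterozygous carrier parent and a mutation that is expressed only after paternal transmission, as stated in the text.

```python
def is_affected(carries_mutation, inherited_from):
    """A carrier shows the phenotype only if the mutant allele came from
    the father (the paternal-transmission case described for glomus tumors)."""
    return carries_mutation and inherited_from == "father"


def offspring_risk(parent_sex, parent_carries_mutation):
    """Probability that a child of a heterozygous carrier parent is affected."""
    if not parent_carries_mutation:
        return 0.0
    # Each child receives the mutant allele with probability 1/2, but the
    # phenotype appears only when the transmitting parent is male.
    return 0.5 if parent_sex == "male" else 0.0


print(offspring_risk("male", True))    # carrier father
print(offspring_risk("female", True))  # carrier mother
print(is_affected(True, "mother"))     # unaffected carrier of a maternal allele
```

A carrier mother has no affected children, yet half of them are expected to be silent carriers, and her carrier sons again face a risk of one half in their own children.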
Mechanism of Imprinting
The mechanism causing parental-origin specific gene
expression must allow the transcriptional machinery
of the cell to distinguish between two chromosome
homologs and differentially act on one or the other.
The imprint is believed to be initiated late in the
development of the egg and sperm and then acted
upon in the zygote and developing conceptus to affect
developmental gene activity. It is therefore likely
that the imprint is a modification to the DNA and/
or chromatin which must have the following properties:
1. It must be able to affect the transcription of the
gene.
2. It must be heritable in somatic cells over many cell
divisions and not lost during chromosome replication. This renders the imprint stable and allows
it to have parental-origin specific memory. This
step is known as maintenance.
3. Importantly, the imprints must be erased in the
male and female germlines during gametogenesis
to allow new imprints to be set down which are
specific to the parental origin of the newly formed
gametes.
There are only a few recognized mammalian genome
modifications that might fulfil the above criteria. By
far the best studied is DNA methylation. DNA
methylation of CpG dinucleotides is known to affect
gene activity. Indeed, methylation of CpG-rich regulatory portions of genes, for example on the inactive
X chromosome in females, has long been associated
with gene inactivity. More recently, it has been shown
that imprinted genes contain regions that are differentially methylated on the two parental chromosomes;
however, sometimes methylation is associated with
the inactive allele and sometimes with the active allele.
In the absence of the DNA methyltransferase gene,
which encodes the methylating enzyme, the methylation imprint is lost from somatic cells and imprinted
gene activity is perturbed. Thus, methylation is
involved, at least, in the maintenance of imprinting.
Whether methylation is the germline imprinting
initiator remains to be proven; however, several differences in methylation have been found in the DNA of
eggs and sperm in imprinted regions, which suggest
that CpG methylation may have a role to play in the
earliest imprinting events.
Other modifications may also be involved in the
imprinting process. It is apparent that many imprinted
genes show differences in their chromatin structure
between the active and inactive alleles. However, in
the case of imprinting, the relationship, if any, between
a region's chromatin conformation and its methylation status is not understood. It is now well documented that modifications to chromatin-associated
proteins, notably acetylation of core histones, have
key roles to play in the regulation of gene expression
and it is possible that these may be involved in the
imprinting mechanism. Nonetheless, it seems that the
imprints are acting both at short and long range,
Further Reading
In situ Hybridization
M A Ferguson-Smith
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0697
When exposed to a UV light source, each fluorochrome is excited by a different wavelength and
each emits a distinctive fluorescence. In order to distinguish the various emissions produced by each
fluorochrome, a series of excitation and emission filters
are used that are specific for each fluorochrome. Combinations of filters allow the simultaneous observation of several fluorochromes excited by different
wavelengths, and this, together with the development
of digital fluorescence microscopy and image analysis,
has led to the introduction of multicolor fluorescence
ISH (M-FISH). M-FISH systems depend on the use of
combinations of up to five different fluorochromes to
label individual DNA probes so that a large number of
probes can be distinguished in each preparation. This
requires a sensitive, monochromatic, cooled charge-coupled device (CCD) camera and computerized
image analysis. A gray-scale image of the fluorescence
of each fluorochrome is acquired sequentially and
merged to provide a false color on the computer
screen, which is chosen on the basis of the relative
intensities of the constituent fluorochromes.
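The reason five fluorochromes suffice can be checked by counting label combinations. This is a minimal sketch under a simplifying assumption: each probe is scored only by which fluorochromes are present, ignoring intensity ratios. The dye names are placeholders, not actual reagents.

```python
from itertools import combinations


def label_combinations(fluorochromes):
    """List every non-empty combination of fluorochromes usable as a label."""
    combos = []
    for size in range(1, len(fluorochromes) + 1):
        combos.extend(combinations(fluorochromes, size))
    return combos


dyes = ["F1", "F2", "F3", "F4", "F5"]  # placeholder names
combos = label_combinations(dyes)
print(len(combos))  # 2**5 - 1 = 31
```

Thirty-one distinct presence/absence labels exceed the 24 different human chromosomes (22 autosomes plus X and Y), so every chromosome can carry its own combination.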
analysis of complex chromosome aberrations and are
commercially available from several distributors
either as single chromosome-specific paint probes or
as complete probe sets in which each chromosome is
labeled differently for M-FISH analysis. This allows
the analysis of a complete cell in one hybridization.
The main disadvantage of chromosome-specific
paint probes is that they are unable to identify
intrachromosomal aberrations such as inversions,
duplications, and insertions, and that areas containing
repetitive sequences, especially telomeres and centromeres, are not painted. In these cases, region-specific
paint probes prepared from amplified chromosome
segments obtained by chromosome microdissection
have found some application.
Chromosome-specific centromeric probes are prepared from cloned alphoid repeat sequences which are
located adjacent to centromeres. Almost all human
chromosomes have chromosome-specific sequences
of this type. The exceptions are chromosomes 13, 14,
21, and 22. Chromosomes 13 and 21 have the same
centromeric sequences, different from 14 and 22,
which also share the same sequences. These probes
are used to determine chromosome copy number in
interphase nuclei. More than 80% of normal diploid
nuclei will show two distinct signals when hybridized
with a chromosome-specific centromeric probe. Centromeric probes are therefore used for aneuploidy
detection in uncultured amniotic fluid cells, for preimplantation diagnosis in cells from the blastocyst, for
the detection of residual disease in the management
of certain hematological malignancies, and for the
analysis of nondisjunctional abnormalities in sperm.
Chromosome-specific sequences cloned in yeast artificial chromosome (YAC), bacterial artificial chromosome (BAC), or cosmid vectors compensate for the lack of
specific centromeric probes for aneuploidy detection
involving chromosomes 13, 14, 21, and 22.
The project to map and sequence the human genome has, as one of its by-products, a complete series of
overlapping DNA clones from which reference
probes can be produced which can be used as FISH
markers to delineate any point on any chromosome.
Cloned in a variety of cosmid and other vectors, they
can be used to characterize specific breakpoints and to
detect specific microdeletions (such as the DiGeorge
syndrome on chromosome 22). These single-copy
DNA sequence probes have wide application in
clinical cytogenetics and in the mapping and cloning
of disease genes.
Telomere-specific probes are now available for the
ends of all human chromosomes. They have proved to
be particularly valuable in the detection of reciprocal
translocations which are beyond the resolution of
conventional diagnostic cytogenetics.
will be revealed by an increased green-to-red fluorescence ratio in the complementary region; similarly,
chromosomal deletion in the tumor sample is revealed
by a decreased green-to-red ratio. The method requires
digital fluorescence microscopy in which the relative
amounts of green and red fluorescence are measured
along the length of the chromosome.
Mention should be made of the use of chromosome-specific paint probes in the study of comparative genomic and karyotype evolution. The conservation of genes
between mammalian species is widely appreciated,
and even widely divergent species such as the human,
the fruit-fly, the nematode worm Caenorhabditis
elegans, and yeast share a number of genes. Comparative mapping studies reveal that the X chromosome
carries the same transcribed genes in all mammals, and
also that large blocks of linked autosomal genes show
similar conservation between species. In closely
related species, these genetic linkage groups tend to
be more extensive than in more distantly related species, sometimes representing whole chromosomes that
are shared between the species. Cross-species chromosome painting has been used to demonstrate the extent
of chromosome homology between species. Chromosome-specific paints from one species are hybridized
to the chromosomes of a second species. The precise
origin of a particular block of homology revealed by
a paint probe from the first species can be determined
by hybridizing chromosome-specific paint from the
second species back to the chromosomes of the first
species. In this way simple comparative maps can be
constructed between species. If one of the species is
a well-mapped species, such as human or mouse, a
preliminary genetic map can be constructed for the
unmapped species. This homology map can assist in
more detailed mapping using genetic linkage and radiation hybrid techniques. Phylogenetic relationships
between species can be studied, based on chromosome
rearrangements revealed by chromosome painting and
shared by species diverged from a common ancestor.
See also: Chromosome; Chromosome Painting;
Chromosome Scaffold; FISH (Fluorescent in situ
Hybridization); Gene Mapping; Genome
Organization
In vitro Evolution
R L Dorit
(Figure 1 plot; axis label: Experimental generations.)
Figure 1 A diagram of the results obtained by Sol Spiegelman in his early in vitro evolution experiments on Qβ
replicase. As can be seen, the particular templates best replicated by the Qβ replicase rise in frequency; as new
variation is constantly introduced, new, even better templates emerge and come to dominate the system. The
composition of the template population changes and evolves at every generation.
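The rise of the best-replicated templates can be illustrated with a toy calculation. This is a deliberately simplified sketch, not a model of the original experiments: the three templates, their starting frequencies, and their relative replication rates are invented, and no new variation is introduced between rounds.

```python
def next_generation(frequencies, replication_rates):
    """One round of amplification: each template's share of the next
    generation is proportional to its frequency times its replication rate."""
    weighted = [f * r for f, r in zip(frequencies, replication_rates)]
    total = sum(weighted)
    return [w / total for w in weighted]


def evolve(frequencies, replication_rates, generations):
    """Return the template frequencies at every generation, starting at 0."""
    history = [frequencies]
    for _ in range(generations):
        frequencies = next_generation(frequencies, replication_rates)
        history.append(frequencies)
    return history


# Three hypothetical templates; the third starts rare but replicates fastest.
history = evolve([0.90, 0.09, 0.01], [1.0, 1.5, 3.0], generations=10)
for generation, freqs in enumerate(history):
    print(generation, [round(f, 3) for f in freqs])
```

Even from a starting frequency of 1%, the fastest-replicating template dominates the pool within ten rounds. Adding mutation at each round would let still better templates appear and take over in turn.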
(Diagram labels: Mutation, Amplification, Selection.)
(Figure 3 panel labels: Cleavage; relative activity (WT = 1) plotted against generation for populations S1, S2, and S3; mutation reintroduced.)
Figure 3 An example of in vitro evolution, where RNase P RNA is evolved to cleave a DNA substrate. The top panel
shows a diagram of the selection scheme, where variant RNA molecules anneal to a DNA substrate, which in turn attaches
to a column (via a biotin `B' molecule). Those variants that can cleave the DNA substrate are eluted from the
column, and are amplified for the next generation of selection. The bottom panel shows the response of
three parallel RNase P RNA populations under in vitro evolution. Note the increase in the overall activity of the
population, as well as the dramatic, but transient drop in activity that accompanies the reintroduction of mutations
into the evolving population.
deoxyribozymes) capable of enhancing the rate of biochemically important reactions. The success of these
DNA selections has expanded the perspective of workers
in the field, and has led to the realization that any
biopolymer that can be copied with some degree of
fidelity can, in principle, serve as the raw material
for in vitro evolution.
Taken together, in vitro results underscore the tremendous functional versatility of nucleic acid polymers. The fact that in vitro evolution experiments
frequently succeed attests to the density of functional solutions scattered throughout sequence space.
Phrased differently, a fully randomized pool of RNA
100-mers could theoretically contain 4¹⁰⁰ or ~10⁶⁰
variants. A typical experiment will thus sample only
10¹³/10⁶⁰ or 1 of 10⁴⁷ possible sequences. Even with
this extremely sparse sampling, functional sequences
are almost always retrieved. While there may be certain functions for which a viable solution is so rare that
it cannot be captured by in vitro selection, theoretical
and empirical results paint a different picture of functional space. Indeed, multiple solutions appear to exist
for any given catalytic challenge, and these solutions
seem both to lie in close proximity and to be accessible
from practically any given starting point. The presence of so many peaks on the RNA functional landscape may well account for the rapid emergence of
organization early in the history of life.
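The sampling arithmetic in this passage (a pool of 100-mers over a four-letter alphabet, with about ten trillion molecules sampled per experiment) can be verified in a few lines. The variable names are illustrative only.

```python
import math

pool_length = 100    # nucleotides per randomized molecule
alphabet_size = 4    # A, C, G, U
sampled = 10 ** 13   # molecules in a typical experiment

total_variants = alphabet_size ** pool_length   # exact integer, 4**100
log10_total = math.log10(total_variants)
fraction_exponent = math.log10(sampled) - log10_total

print(f"possible sequences ~ 10^{log10_total:.1f}")
print(f"fraction sampled   ~ 10^{fraction_exponent:.1f}")
```

The exponents come out near 60.2 and -47.2, matching the rounded figures of roughly 10 to the power 60 possible sequences and about one sequence sampled in 10 to the power 47.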
Future Directions
The uses of in vitro evolution continue to expand,
limited only by the ability to identify targets of interest and to design effective selection strategies.
The raw materials for in vitro evolution are now
more diverse. For example, synthetic nucleotides and
nucleotide analogs have been incorporated into the
sequences constituting the initial pool for in vitro
evolution. This increase in the complexity of the
nucleotide sequences increases the number of potential three-dimensional interactions, and, by extension,
the number of shapes that can be assumed by the sampled pool. Similarly, recent methods have succeeded in
coupling peptides to their coding sequences. This significant advance allows for the in vitro isolation of
proteins with desirable properties, followed by the
replication of their cognate coding sequences. The
substantially wider repertoire of side groups provided
by a 20 amino-acid alphabet may well allow for the
in vitro evolution of catalysts capable of a broader
range of chemical reactions (Figure 3).
The power of the in vitro approach is now being
directed toward the more subtle issues of emergence
and complexity. Studies now underway seek not just
to evolve novel molecules or to expand the catalytic
repertoire of single molecular species, but instead to
evolve metabolic networks. Such enclosed networks,
composed of multiple interacting molecular species,
serve as a model for the earliest protocells and their
rapidly evolving metabolic potential. The field of
in vitro evolution is still in its early phase, and its
potential is still enormous. In effect, in vitro evolution
allows us to explore sequence, structure, and function
space beyond the solutions already present in living
systems. This ability to compare existing functional
solutions with possible (but unrealized) solutions, to
probe not just the actual but the possible, adds a radically new tool to the arsenal of comparative biology.
See also: Bacterial Genetics; Biochemical Genetics;
Evolution; RNA World; Selection Techniques
In vitro Fertilization
R Edwards
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0713
great care in searching for mutants among the children, although at present more attention is paid to
disordered chromosome constitutions. Treating other
extreme forms of male infertility has also uncovered
genetic defects rarely found in a normal-conceiving
human population, such as cystic fibrosis variants that
distort the formation of the vas deferens. Applying
ICSI in these cases can risk the health of the child
when wives are carriers for cystic fibrosis. Separating
human X and Y spermatozoa is now possible and
reliable and also more easily achieved by applying
ICSI for the limited spermatozoa available until techniques improve to produce sufficient spermatozoa for
artificial insemination.
Preimplantation genetic diagnosis for inherited disease is also becoming more widespread in IVF programs. A single cell excised from an 8-cell embryo or
half a dozen cells from the trophectoderm of a blastocyst can be used to type genetic disorders in human
embryos. Many single-gene disorders can be identified in the embryos, and also highly complex translocations, chromosome errors, and complex variants,
such as those involved in Duchenne muscular dystrophy. Improvements in array technology promise to
permit hundreds or even thousands of genes to be
identified in preimplantation embryos. Such knowledge might provide a genetic blueprint of the growth
of the embryo, with considerable social and ethical
implications. Other genetic-related advances stemming from IVF include the potential cloning of
human embryos for spare-parts surgery. It is notable
that cloned embryos and offspring have enormous
anomalies and very high death rates, and the effects
of such epigenetic changes will presumably be present
in embryo stem cells. Cloning was not attempted in
hundreds of IVF laboratories practicing ICSI, which
could enable cloning to be introduced. The UK
government's decision to permit cloning of human
embryos to make tolerant embryo stem cells for
organ repair has just been announced.
Knowledge about the genetic regulation of the
human oocyte and embryo is now accumulating
rapidly. Polarities have been identified in oocytes,
cleaving embryos, and blastocysts, and genes affecting early growth have been identified. This information, together with that gained from the mouse and
human genome projects, indicates that hundreds or
thousands of genes are expressed in preimplantation
mammalian embryos, with blocks of closely linked
genes acting in concert to regulate successive cleavage
stages.
See also: Ethics and Genetics; Fertilization
In vitro Mutagenesis
M Arkin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0714
Nonselective Mutagenesis
Deletions
Chemical mutagenesis and enzymatic misincorporation techniques cause a small number of mutations
Site-Directed Mutagenesis
Site-directed mutagenesis involves the specific substitution of one DNA base for another. Unlike the nonspecific mutations described above, site-directed
mutagenesis allows precise control of the number,
placement, and base substitution of mutants. The two
classes of site-directed mutagenesis include methods
that use double-stranded DNA cassettes and those
that use single-stranded oligonucleotide primers.
All of the techniques described here can give high
yields of the desired mutations; the choice of mutagenesis method is largely a matter of convenience and
personal preference.
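For the primer-based class of methods, the mutagenic oligonucleotide is simply the wild-type sequence around the target site with the chosen base changed. The sketch below shows that idea only; the function names, the default flank length, and the example template are hypothetical, and a real design would also weigh melting temperature and secondary structure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def mutagenic_primer(template, position, new_base, flank=10):
    """Return an oligonucleotide carrying a single base substitution.

    template: wild-type sequence (5'->3'), uppercase A/C/G/T
    position: 0-based index of the base to change
    new_base: replacement base
    flank:    number of wild-type bases kept on each side of the change
    """
    if template[position] == new_base:
        raise ValueError("new base is identical to the wild-type base")
    start = max(0, position - flank)
    end = min(len(template), position + flank + 1)
    return template[start:position] + new_base + template[position + 1:end]


template = "ATGGCTAGCAAAGGAGAAGAACTTTTCACT"  # hypothetical wild-type sequence
primer = mutagenic_primer(template, position=15, new_base="C", flank=5)
print(primer)                      # substitution flanked by wild-type bases
print(reverse_complement(primer))  # version that anneals to the given strand
```

The mismatch sits in the middle of the oligonucleotide so that the wild-type flanks on both sides can anneal to the template and hold the primer in register.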
Site-directed mutagenesis is possible because of the
invention of automated chemical synthesis of DNA
and the overexpression of DNA-processing enzymes.
Through chemical DNA synthesis, defined oligonucleotides up to ~100 bases can be prepared reproducibly and inexpensively. Synthetic oligonucleotides
are used extensively for site-directed mutagenesis, as
primers for DNA polymerase and as oligonucleotide
cassettes. Equally important has been the identification and overexpression of DNA-modifying enzymes,
including restriction endonucleases for cleaving DNA
at specific recognition sites and DNA polymerases for
generating double-stranded DNA from a single-stranded template. Furthermore, the discovery of
Cassette Mutagenesis
(Figure 1 diagram labels: a. anneal mutagenic oligo to dU-DNA; b. DNA polymerase; c. DNA ligase; cleave with BcgI; transform into dut⁺ ung⁺ strain of E. coli.)
Figure 1 Methods of oligonucleotide-directed mutagenesis. (A) Cassette mutagenesis with BcgI-containing plasmid.
The BcgI-containing plasmid is constructed by removing the region to be mutagenized and replacing it with a BcgI
recognition sequence. Short arrows show sites of BcgI cleavage. Following cleavage by the restriction enzyme, the
mutagenic cassette is ligated into the gene. Note that the restriction sites are removed during mutagenesis. (B) dU
method for primer-based mutagenesis. A dU-containing single-stranded plasmid is prepared from an M13 vector in a
dut⁻ ung⁻ strain of Escherichia coli. The mutagenic oligonucleotide is hybridized to the dU template (the mutagenic
primer shown will create an insertion in the gene of interest). The rest of the second strand is filled in by DNA
polymerase and ligated by DNA ligase. Transformation into a dut⁺ ung⁺ strain of E. coli results in degradation of the dU
strand and propagation of the mutant.
Primer-Directed Mutagenesis
General methods
dU Method
PCR-mediated mutagenesis is similar to the oligonucleotide methods described above, in that a mutagenic
oligonucleotide is used as a primer for DNA synthesis.
An advantage of the PCR method lies in the inherent
amplification of the mutagenic DNA, which requires
only a small amount of the wild-type DNA as template.
PCR mutagenesis can be performed on linear pieces
of DNA, such as restriction fragments, as well as
on circular plasmids. Figure 2 pictures some of the
methods discussed below.
(Figure 2 diagram labels: PCR; phosphorylate and ligate; mix and anneal; PCR with external primers.)
Figure 2 Methods of PCR mutagenesis. (A) Extension-overlap PCR generates mutations between two restriction
enzyme sites. Four primers are prepared; two containing the restriction sites (primers 1 and 4) and two containing
the mutagenic sequence (primers 2 and 3). After two PCRs, fragments 1–2 and 3–4 are combined and stitched
together by PCR using primers 1 and 4. The long product 1–4 is restricted and ligated into the vector. (B) Inverted
PCR uses two back-to-back primers, one containing the mutations of interest. PCR yields the full-length, linear
plasmid which is made into closed circular DNA by DNA ligase. (C) Recombinant circle PCR uses two sets of
primers. Primers 2 and 3 contain the mutagenic sites and prime opposite strands; primers 1 and 4 prime from
different positions on the plasmid. The PCR products 1–2 and 3–4 are truncated, linear versions of the plasmid;
recombination in vitro gives gapped plasmids which are repaired in E. coli.
techniques, two inverse PCRs are performed with
gapped primers at different sites. These two mutant
plasmids are then recombined in vitro by mixing
and annealing (recombinant circle PCR) or in vivo
(recombination PCR). The gaps are then repaired by
the bacterial DNA repair machinery.
Libraries of Mutations
Combinatorial and random mutagenesis methods create libraries of DNA which are subsequently screened
or selected for function. By analyzing large numbers
of clones simultaneously, a small number of active
mutants can be separated from a pool of millions
of variants. The preparation of libraries, selection of
`winners' and amplification of these selectants is often
called `in vitro evolution.'
Prospects
In vitro mutagenesis has become an integral part of
genetic analysis. Controlled mutagenesis has identified the function of new genes, a process termed
`reverse genetics,' and allowed dissection of the
mechanism of known proteins. Additionally, site-directed mutagenesis has become an important tool
in biotechnology. For example, the design of nonimmunogenic antibodies for human therapeutics
underscores the practical benefits of mutagenesis
and protein engineering. The availability of DNA-modifying enzymes, cloning vectors, and synthetic
DNA makes site-directed mutagenesis straightforward
in most laboratories; its applications are limited only
by the imagination.
Further Reading
In vitro Packaging
I Schildkraut
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0715
Inborn Errors of
Metabolism
T M Picknett and S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1805
Inbred Strain
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0674
Inbreeding
Incompatibility
Inbreeding Depression
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0677
The major hurdle that must be overcome in the development of new inbred strains from wild populations is
inbreeding depression, which occurs most strongly
between the F2 and F8 generations (second through
eighth generation of sequential brother-sister mating).
The cause of this depression is the load of deleterious
recessive alleles that are present in the genomes of wild
animals as well as all other animal species. These deleterious alleles are constantly generated at a low rate by
spontaneous mutation but their number is normally
held in check by the force of negative selection acting
Incomplete Dominance
J A Fossella
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0678
Incomplete Penetrance
See: Penetrance
Incross
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0681
Indel
W Fitch
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0682
Independent Assortment
J Merriam
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0683
Independent Segregation
J R S Fincham
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0684
Inducer
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1872
An inducer is a small molecule that triggers gene transcription on binding to a regulator protein.
See also: Induction of Transcription
Inducible Enzyme,
Inducible System
See: Induction of Transcription
Induction of Prophage
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1875
Induction of Transcription
B Muller-Hill
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0689
dimers in the DNA. The presence of thymine dimers switches on the SOS pathway. This in turn
leads to the proteolytic destruction of lambda repressor by RecA and so to the induction of phage lambda,
i.e., the liberation of phage lambda from repression.
Reference
Infertility
P J Turek and R R Pera
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0450
Introduction
Infertility is a common human health problem; in fact, it is
almost as common as diabetes mellitus. Approximately 10-15% of couples of reproductive age are
infertile. In women, common causes of infertility
include tubal or pelvic disorders such as endometriosis, ovulatory dysfunction, or anatomical problems. In
men, infertility can be caused by the presence of
dilated blood vessels around the testes (varicoceles),
blockage or absence of the spermatogenic tubules
from infection or congenital absence of the vas deferens, and low or no sperm counts (oligospermia and
azoospermia, respectively) from testicular failure.
Genetic causes of infertility can lead to either defects
in sperm or egg production or result in defects in
anatomical development within the reproductive
tract. This article will review our present understanding of these two kinds of genetic causes of infertility.
Common conditions that directly affect the development of oocytes in the ovary are Turner syndrome,
premature ovarian failure, and mutations in the follicle stimulating hormone (FSH) receptor. At present,
little therapy exists to stimulate the production of
oocytes in women with these conditions.
Turner syndrome
Klinefelter syndrome
Figure 1 The intracytoplasmic sperm injection (ICSI) procedure. (A) A mature oocyte (left) is readied for injection
with a sperm (arrow) in a micropipette under high-power microscopy. (B) The micropipette is placed directly into the
oocyte and the sperm deposited in the cytoplasm.
XXY/XY chromosomes. This syndrome may present
with increased height, decreased intelligence, varicosities, obesity, diabetes, leukemia, an increased likelihood of extragonadal germ cell tumors, and breast
cancer (20 times higher than normal males). Paternity
with this syndrome is rare, and more likely in the
mosaic or milder form of the disease. Recently, paternity has been reported in several cases of pure XXY
men with the use of ICSI.
XYY syndrome
XX male syndrome
Azoospermia gene(s)
[Figure: map of the Y chromosome long arm (Yq) showing the AZFa, AZFb, and AZFc regions. Labels: centromere, heterochromatin, pseudoautosomal region; sY markers; genes DFFRY, DBY, UTY, TB4Y, EIF1AY, CDY, PRY, BPY2, and DAZ.]
Noonan syndrome
Infertility can often be traced to abnormal development of the female reproductive tract, including the
ovaries, oviducts, uterus, and vagina. Although it is
clear that genetic causes for abnormal development
exist, they are likely to be polygenic or multifactorial
in nature. Major female reproductive tract abnormalities include endometriosis, polycystic ovarian syndrome, and anomalies of uterine structure.
Endometriosis
Polycystic ovarian syndrome is characterized by anovulation associated with the persistence of numerous
cysts and a continuous secretion of gonadotropins
and sex steroids. Similar to endometriosis, polycystic
ovarian syndrome is common and may occur in 10-15%
of women with normal reproductive function.
Among infertile women with anovulation, polycystic
ovaries are detected in 75% of cases. The genetics of
polycystic ovarian syndrome are complex, yet studies
suggest that, like endometriosis, there may be a 5- to
Uterine abnormalities
Cystic fibrosis
Figure 3 Illustration of scrotal anatomy. In congenital absence of the vas deferens (CAVD), there is a normal
testis (T), but the epididymis and vas deferens (vas) are abnormal. The caput epididymis (caput) is present and
attached to the testis, but the corpus and cauda epididymis and the vas deferens are absent (stippled areas).
have no palpable vas deferens (one or both sides) on
physical examination (Figure 3). Similar to CF, the
rest of the Wolffian duct system may also be abnormal
and is largely unreconstructable. Recently, this disease
has been shown to be a genetic forme fruste of CF, even
though the vast majority of these men fail to demonstrate any symptoms of CF. In men with bilateral vasal
absence, 65% will harbor a detectable CF mutation. In
addition, 15% of these men will have renal malformations, most commonly unilateral renal agenesis. In
patients with unilateral vasal absence, the incidence of
detectable CF mutations is lower, and the incidence
of renal agenesis approaches 40%. Pituitary-gonadal
hormones are usually normal, as is spermatogenesis.
Young syndrome
Summary
Although our understanding of genetic causes of
female and male infertility is still quite naive, it is
already obvious that research in this field has the
Further Reading
Influenza Virus
K N Gracy and W Fitch
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0690
Further Reading
Inheritance
J Merriam
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0691
individuals share traits and picking out distinct, rare
traits to follow, details of the inheritance mechanism
that explain, for instance, how certain traits reappear
after seeming to skip generations, have been worked
out. This has progressed so well that inheritance can
refer to the mechanism for either the transmission of
specific traits or the transmission of all the biological
information required for life without specifying
genotypes.
See also: Mendelian Inheritance; Quantitative
Inheritance
Inherited Rickets
J L H O'Riordan
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1375
Hypercalciuric Rickets
Dent's disease (OMIM 300009) was originally described as a combination of rickets and hypercalciuria.
It later became apparent that in the same families a
about three-fourths of patients have mutations of this
gene which can be detected. These mutations include
deletions that may be large or small, or there may be
point mutations leading to single amino acid changes
or splice-site alterations. There are two mouse models. One is the Hyp mouse, which arose from
a spontaneous mutation; the second
is the Gyr mouse, in which there is hypophosphatemic rickets plus a gyratory movement. In both,
mutations of the mouse homolog, Pex, have been found;
in the Gyr mouse this is a deletion which also includes
an adjacent gene. In the Hyp mouse, there is evidence
that a hormonal mechanism is involved, which provides
some support for the possibility that the PEX gene
product is acting upon the yet unidentified hormone.
Further Reading
Initiation Factors
A J Berk
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0692
Insertion Sequence
M Chandler
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0696
Discovery
Insertion sequences (ISs) are small pieces of DNA
which move within or between genomes using their
own specialized recombination systems. They were
discovered in the mid-1960s in studies of gene expression in Escherichia coli and its bacteriophages. Initially
recognized by their ability to generate highly polar
but unstable mutations in the gal and lac operons
and in the `early' genes of bacteriophage lambda,
they were later identified by electron microscopy as
short insertions of DNA. The repeated isolation of a
limited number of identical DNA sequences associated with these unstable mutations led to their
being named insertion sequences. The similarity of
ISs and the mobile genetic elements described by
Barbara McClintock in Zea mays in the 1940s became
clear when it was realized that ISs formed an integral
part of the E. coli genome and that their mutagenic
activity was a result of their movement to new genetic
locations. At about this time, transmissible resistance
to antibiotics was also observed. Genetic studies of
this phenomenon implicated an analogous mechanism
of gene mobility in the distribution of these drug
resistance genes among the conjugal plasmids and
phage involved in this transmission. Subsequently,
insertion sequences were shown in many cases to
play a key role in mobilizing these genes.
General Structure
ISs are genetically compact (Figure 1), typically less
than 2.5 kb in length, and carry only the genes necessary for their transposition. They comprise a single, or
sometimes two, open reading frames covering almost
the entire length of the element. The products are
specialized recombinases called transposases (Tpases).
ISs characteristically terminate in small flanking (10-40 bp)
inverted repeat sequences (IRs) with imperfect
homology. By convention, the terminal inverted
repeat proximal to the Tpase promoter is defined as
the left repeat, IRL, while the distal IR is defined as the
right repeat, IRR. In the majority of cases, ISs are
flanked by small directly repeated duplications in the
target DNA, which they generate on insertion. The
length of this duplication is specific for each element
and ranges from 2 to 13 bp.
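The way an insertion leaves short direct repeats of target DNA on either side of the element can be sketched with a toy string model. The sequences, the 5-bp duplication length, and the function names below are hypothetical, and the "staggered cut" is simply modeled by copying the target bases to both flanks.

```python
# Toy model of IS insertion. All sequences and names are hypothetical.

COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def insert_element(target: str, element: str, pos: int, dup_len: int) -> str:
    """Insert `element` at a cut site starting at `pos` in `target`.

    The dup_len bases target[pos:pos + dup_len] end up on both sides of the
    element, mimicking the direct target repeats generated on insertion.
    """
    if dup_len < 0 or pos < 0 or pos + dup_len > len(target):
        raise ValueError("cut site lies outside the target sequence")
    return target[:pos + dup_len] + element + target[pos:]


# Hypothetical 18-bp "element" with perfect 6-bp terminal inverted repeats
# (real IRs are 10-40 bp and usually imperfect).
element = "cagttt" + "atgccc" + "aaactg"
target = "TTTACGTGAAA"

product = insert_element(target, element, pos=3, dup_len=5)
print(product)
```

The element is written in lower case so that the duplicated target bases (upper case) are easy to spot on both flanks of the printed product.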
[Figure 1: schematic of an IS. Labels: XXX, IRL, p, TPase, IRR, XXX; A B, B A.]
Table 1 The IS families

Family        Group     Size range (bp)   Direct repeat (bp)   IR   ORF     TPase
IS1                     770               9 (8-11)             Y    2       lambda integrase ?
IS3           IS2       1200-1550         5                    Y    2       DDE
              IS3                         3 (4)                     2
              IS51                        3 (4)                     2
              IS150                       3-5                       2
              IS407                       4                         2
IS4                     1300-1950         9-12                 Y    1       DDE
IS5           IS5       800-1350          4                    Y    1       DDE
              IS427                       2-3                       (2)
              IS903                       9                         1
              IS1031                      3                         1
              ISH1                        8                         1
              ISL2                        2-3                       1
IS6                     750-900           8                    Y    1       DDE
IS21                    1950-2500         4 (5, 8)             Y    2       DDE
IS30                    1000-1250         2-3                  Y    1       DDE
IS66                    2500-2700         8                    Y    >3
IS91                    1500-1850         N                    N    1       ssDNA Rep
IS110                   1200-1550         N                    N    1       Site-specific recombinase
IS200/IS605             700-2000          N                    N    1 (2)   Complex organization
IS256                   1300-1500         8-9                  Y    1       DDE, eukaryote relatives
IS630                   1100-1200         2                    Y    1       DDE, eukaryote relatives
IS982                   1000              ?                    Y    1       DDE
IS1380                  1650              4                    Y    1
ISAS1                   1200-1350         8                    Y    1
ISL3                    1300-1550         8                    Y    1

Ends: IS1, GGT; IS3 family, TGA. The remaining Ends entries are listed in family order, but their row alignment is not recoverable: C(A), GG, Ga/g, GGC, GAG, GG, TG, GTA, Gg/a, AC, Cc/g, C, GG.
Size range in base pairs (bp) represents the typical range of each group. N, no; less frequently observed lengths are included
in parentheses; Ends, typical nucleotide sequences at the very ends of the element. Presence (Y) or absence (N) of terminal
inverted repeats is indicated. DDE represents the common acidic triad presumed to be part of the active site of the
transposase. ssDNA Rep indicates that the enzyme is a polymerase of the rolling circle type.
multimers and carry domains involved in multimerization. Indeed, in several cases, multimerization
appears to be essential for DNA binding (see below).
Sequence alignment of most bacterial Tpases and
the functionally related retroviral integrases, IN
(which catalyze integration of the double-stranded
viral cDNA into the host genome), revealed a common triad of acidic amino acids with a characteristic
spacing, the DDE motif (Table 2A). This similarity
was subsequently shown to include additional conserved amino acids and has also been detected in
many other major transposons (Table 2B) and IS
Table 2 The DDE motif showing representative transposases from various insertion sequence families (A) and
transposases from other bacterial transposons (B)
N2
64
HIV-1 (IN)
IS911 (IS3)
IS10 (IS4 )
IS50 (IS4 )
IS903 (IS5 )
IS26 (IS6 )
IS30
IS21 (lstA)
237
122
181
IS630
IS982
IS256
Tc1
86
B
Mu (MuA)
269
Tn7 (TnsA)
(hgk D yip)(85)
273
Tn7 (TnsB)
Tn552
Tn3
689
N3
C1
116
152
E smNKel K
ynpqsQgvi
287
323
E rffRsl K
gncwNspm
292
161
E etfRdl K
niyskRmq
188
326
E efHKawK
diythRwri
193
259
niyskRmqi E eftRdlK
138
173
qikylNNvi E cdHgklK
293
ltw D rgmel
327
(33)
qspwqRgtn E ntNgli R
(46)
rrartKgkv
230
184
vlv D nqkaa
liv D nyiih
(35)
vyspwvNhv
(45)
nfskrRKvi E rvsfl
341
233
vis D ahkgl
E rlwQal
237
192
vlg D mgylg
E rmvKyl K
297
261
177
spspdlNpi E hmweeleR
336
392
kgwgqaKpv E rafgvg
114
149
erleKlel
E rrywqqK
396
rrfdaKgiv E stfRel
276
gvprgRgki E rffQtv
895
riltqlNrg E srHavaR
Large bold letters indicate highly conserved residues, smaller bold letters indicate partially conserved residues. Bold figures
above each line indicate the coordinates in amino acid residues and figures in parentheses indicate the number of residues
between the conserved DDE. Part A includes an example of the HIV integrase protein to show its similarity to Tpases and
Tc1, a member of the eukaryote mariner/Tc insertion elements.
Not all ISs exhibit a well-defined DDE triad
(Table 1). For example, the Tpases of members of
the IS91 family show strong similarities with replicases involved in rolling circle plasmid and bacteriophage replication. Members of the IS110 family
appear to encode a novel type of site-specific recombinase, while the IS1 transposase shows limited
similarity to phage lambda integrase.
Transposition Strategies
Endonucleolytic cleavage of the phosphodiester bonds
at the ends of the transposable element and their
transfer into a target DNA molecule generally requires
the assembly of a synaptic complex including the
Tpase, the transposon ends, and target DNA. There
are two principal modes of transposition, conservative
and replicative, based on whether or not the element is
copied in the course of its displacement. This is dictated by the nature and order of the cleavages at the
ends (Figure 2): whether the transposon is liberated
from its donor backbone by double strand cleavages
or whether it remains attached following cleavage of
only a single strand.
The DNA cleavage and strand joining reactions
necessary for transposition of many transposable
elements with Tpases of the DDE type are remarkably similar. These Tpases catalyze endonucleolytic
cleavage at each 3' transposon end to liberate 3' OH
groups, which are then used in a concerted nucleophilic attack on the target molecule. An important feature
of the transposition reaction is therefore the way in
which the 5' end (second strand) is processed.
Replicative transposition entails cleavage of only
one strand at each transposon end and transfer into a
target site in such a way as to create a replication fork
(Figure 2). Some IS elements do not appear to process
the second strand and simply undergo replicative
transposition, or more precisely, `replicative integration.' These include members of the Tn3 and IS6
families and perhaps IS1. If transposition is intermolecular, replication from the nascent fork(s) generates
cointegrates (replicon fusions), where donor and target replicons are separated by a directly repeated copy
of the element at each junction. Resolution of these
structures to regenerate the donor and target molecules, each carrying a single copy of the element, is
accomplished by recombination between the two
elements. This proceeds for some transposons by site-specific recombination promoted by a specialized
transposon-specific enzyme distinct from the Tpase,
the `resolvase' (e.g., Tn3 family), or is carried out
by the host homologous recombination system.
In conservative or `cut-and-paste' transposition, the
element is excised from the donor site and reinserted
[Figure 2 panels: IS10/IS50, IS3/IS911, IS91, and Tn3/IS6. Steps labeled in the panels: cleavage, excision, strand transfer, replication, integration, host processing, and termination; GTTC target sequences, 3' OH ends, and the IRL and IRR ends are marked.]
Figure 2 Transposition strategies. Transposon DNA is indicated by open boxes or shaded boxes for
newly replicated transposon DNA. Donor DNA is indicated as stippled lines and target DNA as bold lines. Strand
cleavage is shown as small vertical arrows. Nucleophilic attack on the phosphodiester bond P by the active 3'
hydroxyl (3' OH) resulting in strand transfer is also indicated by arrows. The toothed region shown in the target DNA
represents target duplications associated with insertion. DNA polarity is shown at the top of each panel. Note that in
the case of IS91, the polarity of the target DNA has been inverted to facilitate drawing of the figure.
specific tetranucleotide target sequence which abuts
IRR (Figure 2). `One-ended' transposition products
occur at high frequency in the absence of IRL. They
carry a constant end defined by IRR and a variable end
defined by a copy of the target consensus located in
the donor plasmid. It is thought that donor strand
cleavage results in a covalent complex between the 5'
IRR end and Tpase and is followed by single-strand
transfer into the target DNA at a site containing a
consensus tetranucleotide. The attached single strand
of the IS is displaced by replication in the donor
molecule. Termination is triggered when the complex
reaches either the 3' IRL end or a tetranucleotide consensus sequence in the donor (Figure 2). This scheme
does not, however, address how the element is replicated into the target molecule.
Target Specificity
ISs show differing degrees of selectivity in their choice
of target DNA sites. Sequence-specific insertion is
exhibited to some degree by several elements and
varies considerably in its stringency. It is strict in the
case of IS91, which requires a GTTC/CTTG target
sequence, but less strict for members of the IS630 (and
the related eukaryotic mariner/Tc) family, which
require a TA dinucleotide in the target, for IS10,
which prefers (but is not restricted to) the symmetric
5'-NGCTNAGCN-3' heptanucleotide, and for IS231,
which shows a preference for 5'-GGG(N)5CCC-3'.
In the case of IS10, sequences immediately adjacent
to the consensus have also been shown to influence
target choice. A demonstration that IS10 Tpase
directly influences target choice has been obtained
by isolation of Tpase mutants which exhibit altered
target preference. Other elements show regional
preferences such as DNA segments rich in GC
(IS186) or AT (IS1), which could reflect more global
parameters such as local DNA structure. Indeed, the
degree of supercoiling (IS50), bent DNA (IS231),
replication (IS102), transcription (IS102, Tn5/Tn10),
and possibly protein-mediated targeting to (or exclusion from) transcriptional control regions have all
been invoked as parameters which influence target
choice.
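The sequence preferences listed above can be turned into a simple site scan. The sketch below is only an illustration: the regular expressions transcribe the target sequences quoted in the text, while the dictionary, the function name, and the test sequence are hypothetical. Only one strand is scanned, and for IS10 and IS231 the pattern describes a preference, not an absolute requirement.

```python
import re

# Target sequences as quoted in the text, written as regular expressions.
# N is expanded to [ACGT]; the dictionary itself is an illustrative construct.
TARGET_PATTERNS = {
    "IS91": r"GTTC|CTTG",
    "IS630": r"TA",
    "IS10": r"[ACGT]GCT[ACGT]AGC[ACGT]",
    "IS231": r"GGG[ACGT]{5}CCC",
}


def find_target_sites(seq: str, element: str) -> list:
    """Return (position, matched sequence) for every candidate site.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    pattern = re.compile(f"(?=({TARGET_PATTERNS[element]}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


# Hypothetical test sequence containing one candidate site of each kind.
seq = "AAGCTTAGCAGTTCTAGGGACGTACCC"
for name in TARGET_PATTERNS:
    print(name, find_target_sites(seq, name))
```

A scan of this kind only identifies candidate sites; the regional and structural influences on target choice described above (base composition, supercoiling, bending, replication, transcription) are not captured.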
Another phenomenon which may reflect insertion
site specificity is the interdigitation of various intact or
[Figure labels: (A) XXX, PIRL, RBS; (B) PIRL, A(0), B(1), Y YYX XXZ, orfA, orfAB, H-T-H, DD(35)E.]
other IS elements (e.g., one subgroup of the IS5
family).
Other control mechanisms may occur at translation
termination. In some cases, the translation termination
codon of Tpase genes is located within their IRR
sequences, while in others the transposase gene simply
does not possess a termination codon. Among the latter
cases, the IS is known to insert into a specific target
sequence in which the target direct repeat produced on
insertion itself generates the Tpase termination codon.
This has been observed for certain members of the
IS630 family. The significance of these arrangements
may be to couple translation termination, transposase
binding, and transposition activity.
Early studies of IS1 and IS50 demonstrated that
impinging transcription from outside reduces transposition activity. Transcription may disrupt the
formation of the transposition complexes known as
transpososomes in which transposase and the transposon ends are intimately bound.
Tpase stability can also contribute to control of
transposition since it can limit activity both temporally and spatially. This may explain the observation
that several Tpases function preferentially in cis (see
below). Derivatives of the IS903 Tpase that are more
resistant to the E. coli Lon protease than the wild type
protein are more active and exhibit an increased capacity to function in trans (see below).
Early studies indicated that transposition of some elements was more efficient if the transposase
was provided by the element itself or by a transposase
gene located close by on the same DNA molecule.
This preferential activity in cis reduces the probability
that transposase expression from a given element
will activate transposition of related copies elsewhere
in the genome. The effect can amount to several orders of
magnitude. It presumably reflects a tendency of the
cognate transposases to bind to transposon ends
close to their point of synthesis and is likely to be
the product of several phenomena such as expression
levels and protein stability. Another contributing factor may derive from the domain structure of known
transposases (see above) in which the DNA binding
domain is located in the N-terminal end of the protein.
This arrangement would permit preferential binding
of nascent transposase polypeptides to neighboring
binding sites. Indeed, the N-terminal portion of several Tpases exhibits a higher affinity for the ends than
does the entire transposase molecule, suggesting that
the C-terminal end may mask the DNA binding activity of the N-terminal portion.
Further Reading
Chaconas G, Lavoie BD and Watson MA (1996) DNA transposition: jumping gene machine, some assembly required. Current Biology 6: 817-820.
Haren L, Ton-Hoang B and Chandler M (1999) Integrating DNA: transposases and retroviral integrases. Annual Review of Microbiology 53: 245-281.
Mahillon J and Chandler M (1998) Insertion sequences. Microbiology and Molecular Biology Reviews 62: 725-774.
Mizuuchi K (1992) Transpositional recombination: mechanistic insights from studies of Mu and other elements. Annual Review of Biochemistry 61: 1011-1051.
Mizuuchi K (1997) Polynucleotidyl transfer reactions in site-specific DNA recombination. Genes to Cells 2: 1-12.
Rice P, Craigie R and Davies DR (1996) Retroviral integrases and their cousins. Current Opinion in Structural Biology 6: 76-83.
Saedler H and Gierl A (eds) (1996) Transposable Elements, Current Topics in Microbiology and Immunology, Vol. 204. Berlin: Springer-Verlag.
Insertion, Insertional
Mutagenesis
See: Chromosome Aberrations; DNA Cloning; In
vitro Mutagenesis; Mutation
Insulinoma
C S Grant
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1584
Further Reading
Integrase
N Grindley
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0698
The Reaction
The minimal Int family target on DNA consists of a
single binding site for a topoisomerase monomer.
DNA strand cleavage involves activation of the scissile
phosphate by the highly conserved pentad of active site
residues (Arg, Lys, His, Arg, His) and formation of a
nick with a 5' OH and 3'-phosphotyrosine linkage to
the recombinase. This transient covalent intermediate
releases one superhelical turn, via mechanics that are
not completely understood, and the nick is resealed by
a simple reversal of the cleavage step.
In the case of Int family-mediated recombination,
the minimal DNA target consists of two recombinase
binding sites that are positioned as inverted repeats
separated by 6-8 bp (called the `overlap region').
Synapsis and proper alignment of two such recombination partners generates a tetrameric complex in
which each recombinase protomer carries out the
cleavage and ligation of one DNA strand, executed
as two sequential pairs of cleavage/ligation reactions.
In the first pair of reactions one strand in each partner
DNA helix is cleaved and the first three or four bases
of the free 5' hydroxyl-terminated strands of the overlap region are swapped and then ligated. This forms a
Holliday junction with four continuous DNA strands
(see Figure 1). After some rearrangements within the
Holliday junction, the intermediate is `resolved' by a
reciprocal strand swapping of the second pair of
strands so that all four DNA strands have new junctions and two recombinant DNA helices have been
generated. The formation of Holliday junction
intermediates distinguishes the Int family from the
resolvase/invertase family of site-specific recombinases, which use a serine nucleophile to carry out a
pair of concerted (rather than sequential) strand
exchanges, and from the transposase family of
Target Specificity
Accessory Proteins
Sub-Families
Monomeric Targets
Dimeric Targets
After the topoisomerases, the most basic recombination pathway requires four identical protomers and is
described for the most part by the reaction scheme
outlined above. The two best-studied examples of this
group are the Cre recombinase of the Escherichia coli
bacteriophage P1 and the Flp recombinase of the 2 µm
plasmid of the yeast Saccharomyces cerevisiae (see
Table 1). Cre recombinase acts on two DNA target
sites (lox sites) to reduce multimers of P1 plasmid to
monomeric circles (each containing a single lox site),
thereby numerically favoring the passively dispersive inheritance of P1 to both daughter cells. Flp
recombinase also has the biological function of enhancing plasmid inheritance but uses a different strategy.
The 2 µm plasmid contains two Flp target sites (frt
sites) oriented with respect to each other such that
recombination between them results in inversion
(rather than deletion) of the intervening DNA. The
effect of this inversion is to convert two divergent
DNA replication forks into two tandem forks with a
rolling circle mode of replication that generates multiple copies of plasmid DNA. Neither Cre nor Flp
exhibits topological or orientation selectivity. That is,
they are capable of recombining sites on different
molecules or on the same molecule. When the sites
are on the same molecule they can be direct repeats,
leading to excision of the intervening DNA (called
resolution), or they can be inverted repeats, leading
Table 1 Levels of complexity in different Int family reactions

Recombinase    Accessory factors    Heterobivalent Int
Cre            None
Flp            None
Xer            ArgR, PepA
λ Int          IHF, Xis, Fis
Heterobivalent Recombinases
the E. coli host where they play important roles in the
regulation of DNA transcription and replication, and
the third accessory bending protein, Xis (excision factor), is encoded by the viral genome. This additional
complexity affords mechanisms by which the viruses
can both control the direction of recombination (the
presence or absence of Xis is required for excisive
versus integrative recombination, respectively) and
modulate its efficiency (the levels of IHF and Fis
have opposite effects on the efficiency of excisive
recombination).
Structures
Crystal structures have been determined for four
recombinases and two topoisomerases. Each of the
structures captures a different view of the protein: a
monomeric catalytic domain of lambda Int, a dimeric
catalytic domain of HP1 Int, the full length protein of
XerD, a Cre tetramer covalently bound to a Holliday
junction recombination intermediate, the catalytic
core of vaccinia virus topoisomerase, and fragments
of human topoisomerase I complexed with DNA. The
most dramatic and informative of the structures are
those of the cocrystals with their respective target
DNAs, where many of the biochemical insights into
Int Family reaction mechanisms have been visualized
and extended.
A number of informative generalizations also
emerge from a comparison of the structures and especially from the structures involving protein-DNA
cocrystals. Despite the great divergence in primary
amino acid sequences there are extensive regions
where the six structures possess a similar tertiary fold,
but because of the differences in primary sequence
these structural similarities are punctuated by insertions and deletions. A critical region, involving several
of the conserved active site residues and the tyrosine
nucleophile is surprisingly not part of the highly conserved tertiary fold. However, it is thought that these
differences are likely due to the different multimerization states and the presence or absence of DNA in the
crystals. It remains to be determined whether these
differences might reflect structures that are relevant at
different steps in the reaction or whether they are idiosyncrasies of the particular crystallization states. As
expected, the crystal structures of the Int family site-specific recombinases comprise the foundation and
impetus for further sharpening our understanding of
this fascinating class of DNA transactions.
Further Reading
Grainge I and Jayaram M (1999) The integrase family of recombinase: organization and function of the active site. Molecular
Microbiology 33: 449-456.
Integration
N Grindley
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0699
Integrons
R M Hall
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0700
[Figure 1 panels: (A) Empty integron; (B) Free gene cassette; (C) Integron containing one cassette, with the integrated cassette marked. Labels: intI, Pint, Pc, attI, ORF, 59-be, and the core sites gttrrry and GTTRRRY.]
Figure 1 Integrons capture gene cassettes. (A) An empty integron, showing the key features of an integron: an intI
gene that encodes the IntI integrase, an adjacent recombination site, attI (hatched box), and promoters Pc and Pint. (B)
A circular gene cassette consisting of a gene or open reading frame (ORF) and a 59-be recombination site (filled box).
(C) An integron containing one gene cassette, showing the boundaries of the integrated cassette below. Gene
cassettes are inserted into the integron by IntI-catalyzed recombination between attI in the integron and the 59-be in
the circular cassette. The ORF in the inserted gene cassette can now be transcribed from Pc. The 7-bp core sites
surrounding the recombination crossover point in the attI site of the integron and in the 59-be of the circular cassette
are represented by gttrrry and GTTRRRY, respectively, and the configuration of these bases after incorporation of
the cassette is shown in (C). Further cassettes may be inserted at attI in like manner, leading to arrays of integrated
cassettes.
both as agents of gene capture and as expression vectors for the captured genes.
Several different intI/attI units that are associated
with gene cassettes have been found and it is likely that
many more remain to be discovered. To distinguish
them, integrons are classified using the sequence of the
intI gene and IntI recombinase. Members of the same
class have the same (>98% identity) integrase. The
known IntI proteins all share significant levels of identity and form a distinct family within the integrase (or
tyrosine recombinase) superfamily. Overall, integrons
fall into two groups: those that are mobile and those
that are an integral part of a bacterial chromosome. The
three classes of integrons found in antibiotic-resistant
clinical isolates are all mobile. The best-characterized
example of the chromosomal integrons is situated in
the small chromosome of Vibrio cholerae. Chromosomal integrons are also found in other Vibrionaceae
and it appears that an integron found its way into the
small chromosome of the common ancestor before
speciation occurred. Other bacterial species also include an integron as part of their genome.
Expression of Cassette-Associated
Genes
Gene cassettes are compactly organized and the vast
majority do not include a promoter. Expression of the
cassette-associated genes is thus dependent on the presence of an upstream promoter (see Gene Cassettes).
Cassettes are integrated in only one orientation and,
in general, the relationship of the gene and 59-be is
such that in this orientation the Pc promoter supplied
by the integron lies upstream as shown in Figure 1.
For class 1 integrons containing more than one cassette,
it has been shown that all of the cassette genes are
transcribed from Pc. Integrons thus create new operons containing a wide variety of genes and gene
orders. The level of expression is highest for the gene
in the Pc proximal cassette and falls progressively for
genes in downstream cassettes. Consequently, a cassette needs to be located relatively close to Pc if its
gene is to be expressed. The Pc promoters of other
classes of integrons have not been located, but in some
cases their presence is implied because antibiotic resistance genes in associated cassettes are expressed.
Whether the genes and ORFs found in cassettes that
are part of the very long cassette array in the V.
cholerae chromosome are expressed remains to be
established. However, it is possible that only the genes
in cassettes located closest to the attI site or in a
cassette that contains a promoter can be expressed,
while downstream genes remain silent.
[Figure 2 labels: Pc; intI1 (ATG); Pint; attI1 with weak binding site, strong binding site, and simple site; core-site sequences GTTATGG and GTTACGC; crossover region CCCTAAAACAAAGttrrry; 5'-conserved segment; integrated cassette.]
Figure 2 Structure of attI1 and the promoter region of class 1 integrons. attI1 contains four IntI1 binding domains
that include a central 7-bp sequence related to the core site consensus sequence GTTRRRY. These core sites are
boxed, and arrows indicate their relative orientations. The simple site of attI1, within which the recombination
crossover (vertical arrow) occurs, is at the right-hand end. The binding sites found to the left of the simple site
enhance recombination efficiency. Bases to the left of the crossover belong to the 5'-conserved segment that is found
in all class 1 integrons. Bases to the right of the crossover point (lower case letters) are part of the first integrated
cassette, if a cassette is present. The genes in cassettes are transcribed from the promoter Pc, located within the intI1
gene, and intI1 is transcribed leftward from Pint.
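The core-site consensus GTTRRRY is written in the standard IUPAC nucleotide code, in which R denotes a purine (A or G) and Y a pyrimidine (C or T). As a minimal sketch only (the function names and the idea of counting mismatches are assumptions introduced here for illustration, not part of the published analysis), a 7-bp sequence can be compared with the consensus as follows:

```python
# IUPAC codes needed for the consensus GTTRRRY (R = purine, Y = pyrimidine).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}

CORE_CONSENSUS = "GTTRRRY"


def core_site_mismatches(seq, consensus=CORE_CONSENSUS):
    """Count positions at which seq departs from the degenerate consensus."""
    seq = seq.upper()
    if len(seq) != len(consensus):
        raise ValueError("sequence and consensus must have the same length")
    return sum(base not in IUPAC[code] for base, code in zip(seq, consensus))


def matches_core_site(seq, consensus=CORE_CONSENSUS):
    """True when every position of seq is allowed by the consensus."""
    return core_site_mismatches(seq, consensus) == 0
```

Applied to the two 7-bp sequences labeled in Figure 2, GTTATGG and GTTACGC, the mismatch count is nonzero in both cases, which fits the caption's wording that these sites are related to, rather than identical with, the consensus.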
Intelligence and the `Intelligence Quotient'
Further Reading
Hall RM and Collis CM (1995) Mobile gene cassettes and integrons: capture and spread of genes by site-specific recombination. Molecular Microbiology 15: 593–600.
Hall RM and Collis CM (1998) Antibiotic resistance in gram-negative bacteria: the role of gene cassettes and integrons.
Drug Resistance Updates 1: 109–119.
Partridge SR, Recchia GD, Scaramuzzi C et al. (2000) Definition
of the attI1 site of class 1 integrons. Microbiology 146: 2855–2864.
Recchia GD and Hall RM (1995) Gene cassettes: a new class of
mobile element. Microbiology 141: 3015–3027.
symbols. These abilities develop in the course of childhood. It was discovered early on through multivariate
analysis that a general factor of ability can be extracted
from the performances of individuals on batteries of
tests constructed to assess the development of cognitive ability. From these analyses emerged the concept
of the `intelligence quotient' which attempts to assess
the extent to which an individual differs from the
mean of the population of his/her age group, a calculation that is based on a population mean of 100.
Standard batteries of tests (e.g., the Stanford–Binet
and the Wechsler adult intelligence scale, WAIS) have
been constructed and widely used both for assessing
learning disability and for the purposes of educational
and occupational selection.
Once generated, the abstract concept of `intelligence' acquired an autonomous life that left unanswered questions concerning its reality and origin.
Controversy centered on whether intelligence is unitary or a composite of component abilities and, if the
latter, which of these are fundamental. Equally
importantly, the origins of the variation within populations and the extent to which it can be regarded as
genetically determined have been widely and sometimes acrimoniously disputed, with claims being made
for differences between populations that cannot be
accounted for by environmental factors such as educational opportunity.
The possibility that some part of the variation is
genetic raises the further interesting questions of what
sort of genes might be responsible and what selective
pressures these genes might be under. One can also ask
whether the variation is specific to Homo sapiens or
whether similar variation might be detected in other
primates and other mammals.
These questions suggest a quite different approach
to human cognitive abilities, one in which the whole concept of intelligence is placed in an alternative
context: that what is characteristic of Homo sapiens is not intelligence (or a
particular degree of intelligence) but the capacity for
language, and that this arose as a result of discrete
genetic changes in the course of hominid evolution.
Language, according to the linguists N. Chomsky and
D. Bickerton, for example, is a capacity that has no
obvious precedents in the communicative abilities of
other primates. It is the defining feature of modern
Homo sapiens.
The salient candidate for the genetic change is that
the brain lateralized, i.e., that the functions of the
two hemispheres became differentiated, and that this
occurred on the basis that development of the
hemispheres became subtly asynchronous across the
anteroposterior axis, i.e., from right frontal to left
occipital lobes. One component of language, probably
the phonological sequence, is localized in the `dominant,' usually the left, hemisphere. Dominance for this
component of language is reflected in directional
handedness (85–90% of most populations is right-handed) and this also appears to be a characteristic
that distinguishes humans from the chimpanzee.
Handedness, reflecting cerebral dominance, is a trait
that is associated with quantitative variation. Whether
this variation is a correlate of human cognitive ability,
as would seem plausible if it underlies the specific
characteristic of language, has been much debated,
but it now appears that lesser degrees of lateralization
(`hemispheric indecision') are associated with delay in
the development of verbal, and also nonverbal, ability
(Crow et al., 1998). Thus it appears that lateralization
is associated with significant variation in the rate at
which words acquire meaning, and that this variation
reflects a dimension that is specific to Homo sapiens.
The genetics of lateralization reflects the mechanism
of transition from a precursor hominid to modern
Homo sapiens. Of particular note is the fact that
there are sex differences both in handedness (girls on
average are more right-handed and less likely to be
left-handed than boys) and verbal ability (girls acquire
words faster). There is an obvious possibility that the
relevant gene(s) is sex-linked and an X-Y homologous
locus has been suggested.
These considerations cast the question of human
`intelligence' in a new and perhaps more biological
perspective. In particular, they emphasize the species-bound nature of the variation and the survival value of
the core characteristic of language. There remains the
problem of the genetic nature of the variation and its
persistence. Such questions touch on the evolutionary
significance of species transitions and the maintenance
of species boundaries.
Reference
Crow TJ, Crow LR, Done DJ and Leask SJ (1998) Relative hand
skill predicts cognitive ability; global deficits at the point of
hemispheric indecision. Neuropsychologia 36: 1275–1280.
Intercross
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0703
hybrid organisms that were both derived from an outcross between two inbred strains.
See also: Backcross; Incross; Outcross
Interference, Genetic
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0704
Multiple events of recombination on the same chromosome are not independent of each other. Instead, a
recombination event at one position on a chromosome
will act to interfere with the initiation of other recombination events in its vicinity. This phenomenon is
known, appropriately, as `interference.' Interference
was first observed within the context of significantly
lower numbers of double crossovers than expected in
the data obtained from some of the earliest linkage
studies conducted on Drosophila. Since that time,
interference has been demonstrated in every higher
eukaryotic organism for which sufficient genetic data
have been generated.
Significant interference has been found to extend
over very long distances in mammals. The most
extensive quantitative analysis of interference has
been conducted on human chromosome 9 markers
that were typed in the products of 17 316 meiotic
events. Within 10 cM intervals, only two double crossover events were found; this observed frequency of
0.0001 is 100-fold lower than expected in the absence
of interference. Within 20 cM intervals, there were 10
double crossover events (including the two above);
this observed frequency of 0.0005 is still 80-fold
lower than predicted without interference. As map
distances increase beyond 20 cM, the strength of interference declines, but even at distances of up to 50 cM,
its effects can still be observed.
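The fold reductions quoted above can be reproduced with a short calculation. The sketch below is illustrative only: it assumes that the expected double-crossover frequency without interference is the product of the recombination fractions of two intervals of the stated size (0.10 × 0.10 and 0.20 × 0.20), an interpretation that matches the quoted 100-fold and 80-fold figures but is not spelled out in the text. The function names are hypothetical.

```python
def expected_double_crossover(r1, r2):
    """Double-crossover frequency expected if the two crossovers were independent."""
    return r1 * r2


def coefficient_of_coincidence(observed, r1, r2):
    """Observed double-crossover frequency divided by the independent expectation."""
    return observed / expected_double_crossover(r1, r2)


def fold_reduction(observed, r1, r2):
    """How many times lower the observed frequency is than the expectation."""
    return expected_double_crossover(r1, r2) / observed


# Observed frequencies as quoted (rounded) in the text. Interval sizes are
# converted to recombination fractions by dividing centimorgans by 100,
# which is an assumption of this sketch.
QUOTED = [
    (10, 0.0001),
    (20, 0.0005),
]

for interval_cm, observed in QUOTED:
    r = interval_cm / 100.0
    c = coefficient_of_coincidence(observed, r, r)
    print(f"{interval_cm} cM: coincidence={c:.4f}, "
          f"interference={1 - c:.4f}, "
          f"fold reduction={fold_reduction(observed, r, r):.0f}")
```

Interference is expressed here as one minus the coefficient of coincidence. If the unrounded counts (2 and 10 double crossovers in 17 316 meiotic events) are used in place of the quoted frequencies, the same calculation gives somewhat smaller fold reductions, roughly 87 and 69.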
If one assumes that human chromosome 9 is not
unique in its recombinational properties, the implication of this analysis is that for experiments in which
fewer than 1000 human meiotic events are typed, multiple crossovers within 10 cM intervals will be extremely unlikely, and within 25 cM intervals, they will
still be quite rare. Data evaluating double crossovers
in the mouse are not as extensive, but they suggest a
similar degree of interference. Thus, for all practical
purposes, it is appropriate to convert recombination
fractions of 0.25, or less, directly into centimorgan
distances through a simple multiplication by 100.
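The direct conversion described above amounts to a single multiplication. For contrast, the sketch below also includes Haldane's mapping function, which is not discussed in this entry and is added here only as an assumption-laden comparison: it presumes no interference at all and therefore allows for undetected double crossovers. The function names are hypothetical.

```python
import math


def centimorgans_linear(r):
    """Direct conversion described in the text: cM = 100 * r, for r <= 0.25."""
    if not 0.0 <= r <= 0.25:
        raise ValueError("direct conversion is intended for 0 <= r <= 0.25")
    return 100.0 * r


def centimorgans_haldane(r):
    """Haldane's mapping function (assumes no interference), shown for contrast."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")
    return -50.0 * math.log(1.0 - 2.0 * r)


for r in (0.05, 0.10, 0.25):
    print(f"r={r:.2f}: linear={centimorgans_linear(r):.1f} cM, "
          f"Haldane={centimorgans_haldane(r):.1f} cM")
```

Because strong interference makes double crossovers within short intervals very rare, the linear conversion is the more appropriate of the two over the range stated in the text.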
See also: Linkage Map
Interphase
Intron Homing
Intersex
M A Ferguson-Smith
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0706
Interspecific, Intraspecific
Cross
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0707
Intervening Sequence
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1880
History
Mobile introns are widespread. They have been identified in bacteria and bacteriophage, archaebacteria,
and eukaryotes. The RNA of most of these introns
folds into a series of stems and loops. There are two
different basic folding patterns, corresponding to the
group I and group II introns. In addition to different
RNA structures, introns in the two groups also have
distinct autocatalytic splicing mechanisms. Mobility
has been demonstrated for group I and group II
introns and for a noncatalytic archaebacterial intron,
but not for nuclear spliceosomal introns.
The first intron shown to be mobile, in the early
1970s, was the group I ribosomal large subunit (LSU)
intron, formerly called the ω intron, of the yeast Saccharomyces cerevisiae. The DNA-based homing process was elucidated by experiments showing polarity
of recombination in crosses between intron-plus and
intron-minus alleles. The intron was mobilized so that
more than 90% of the progeny were found to carry the
intron-containing allele.
The first group II intron shown to exhibit homing
was the aI1 intron, also of S. cerevisiae. The original
papers refer to this as transposition, but it is in fact
homing as defined above. Group II intron homing is
distinguished from homing of group I introns by the
involvement of the intron RNA in both templating
and mediating the mobility event.
Table 1 Mobile introns. Based on the presence of endonuclease-encoding open reading frames and homology to
known mobile introns, many more introns are likely to be mobile.

Intron name    Organism
Group I
LSU (ω)        Saccharomyces cerevisiae
coxI-3a        Saccharomyces cerevisiae
coxI-14a       Saccharomyces cerevisiae
coxI-15a       Saccharomyces cerevisiae
bi-2           Saccharomyces capensis
coxI1I         Schizosaccharomyces pombe
LSU-3          Physarum polycephalum
LSU-5          Chlamydomonas eugametos
LSU            Chlamydomonas reinhardtii
CobI-1         Chlamydomonas smithii
td             T4 bacteriophage
sunY           T4 bacteriophage
LSU            Desulfurococcus mobilis
DiSSuI         Didymium iridis
cox1           Peperomia polybotrya
Group II
aI1            Saccharomyces cerevisiae
aI2            Saccharomyces cerevisiae
Ll.LtrB        Lactococcus lactis
RmInt1         Sinorhizobium meliloti
Xln6           Pseudomonas alcaligenes
P1DNA          Podospora anserina
Cox 1.1        Kluyveromyces lactis
Homing Mechanism
Group I Mechanism
intron, genetic markers both upstream and downstream of the intron insertion site may be converted
to those of the intron donor.
Group II Mechanism
Retrohoming
Group I introns with mutations that block RNA splicing remain capable of homing, but for group II
introns splicing is a requirement for intron mobility,
because the spliced intron RNA is active in the homing process. The prefix `retro-' acknowledges the role
of RNA in the group II homing mechanism. The
intron-encoded proteins of group II introns are more
complex than those of group I introns. Each is
a single multifunctional protein that generally has
endonuclease, RNA maturase (for splicing enhancement), and reverse transcriptase activities. In all
group II mobile introns, the open reading frame
(ORF) is located in a large loop in the RNA secondary
structure. If the ORF is deleted from the intron, mobility is lost, but, when the intron-encoded protein is
provided in trans, mobility is restored.
Figure 1 Group I intron homing pathway, from donor and recipient alleles to products. Labeled steps: (1) transcription, translation; (2) cleavage; (3) resection; (4) synapsis; (5) recombination and repair. Outlined strands represent sequence of the donor allele. Gray
lines represent sequence of the recipient allele. Black lines symbolize the intron sequence, and the pac-man
symbols represent the intron-encoded endonuclease. Arrowheads at the ends of the lines represent the 3' end of the
DNA.
Group II intron homing is catalyzed by a ribonucleoprotein consisting of the intron-encoded protein
and the spliced intron RNA. This is shown in
Figure 2. The first step in homing is cleavage of the
homing site of the intron-minus allele. The top strand
is cleaved by the intron RNA, in a reverse-splicing
reaction, while the bottom strand is cleaved by the
Figure 2 Group II intron retrohoming pathway, from donor and recipient to donor and product. Labeled steps: (1) transcription, splicing, translation; (2) reverse splicing, cleavage; (3) cDNA synthesis; (5) repair. Outlined strands represent sequence of the donor allele. Gray lines
represent sequence of the recipient allele. Black lines symbolize the intron sequence, with solid and dashed lines
representing DNA and RNA, respectively. The staggered arrowheads mark the sites of intron insertion and
endonuclease cleavage. The pathway shown is for a bacterial group II intron. A similar pathway, with some variations,
occurs for the yeast introns. (dsDNA, double-stranded DNA; RNP, ribonucleoprotein; RT, reverse transcriptase;
M, maturase; E, endonuclease).
Retrotransposition
site. Restriction endonucleases generally recognize
small (4–8 bases), often palindromic sites with strong
sequence specificity at the cleavage site. In contrast, all
the intron-encoded endonucleases characterized thus
far have large (in the 12- to 40-bp range) recognition
sites. The homing endonucleases exhibit a relaxed
sequence specificity over these lengthy recognition
sites and can tolerate many base changes. Yet the
main requirement of homing endonucleases is simply
to initiate cleavage of the DNA at the target site. An
intron has been engineered to express the EcoRI
restriction endonuclease, and this intron can home to
an engineered EcoRI restriction site.
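In this context a palindromic site is one that reads the same on both strands, that is, a sequence equal to its own reverse complement. The following sketch illustrates the property; the helper names are hypothetical, and the EcoRI recognition sequence GAATTC used in the usage note is standard knowledge supplied here rather than taken from this entry.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    """True when the site is identical to its own reverse complement."""
    site = site.upper()
    return site == reverse_complement(site)
```

For example, `is_palindromic("GAATTC")` holds for the 6-bp EcoRI site, whereas a much longer, asymmetric homing endonuclease recognition site would generally fail this test.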
Homing endonucleases constitute a diverse group of
proteins. While found in both mobile introns and
inteins (mobile elements that splice at the protein
level), they also exist in freestanding form, such as the
HO endonuclease involved in mating-type switching
in S. cerevisiae. Most homing endonucleases fall into
four major classes, based on their conserved structural motifs: LAGLIDADG, GIY-YIG, H-N-H,
and His-Cys. The names of the first three classes
are the amino acid sequences of their motifs in single-letter code. Enzymes in both the H-N-H and
His-Cys classes share a common protein fold and a
metal ion involved in catalysis, so it has been proposed that they are diverse members of a single structural class.
Significance
Evolutionary Implications
at the homing site. The RNA world hypothesis postulates that the first enzymes were RNA-based. The
group II ribonucleoproteins may represent ancient
biochemistry and a transitional state between an
RNA world and the DNA–protein world as we
know it.
The splicing reactions of group II introns are
mechanistically similar to those of the nuclear spliceosomal introns, which comprise about 15% of the
human genome. It is widely hypothesized that group
II introns evolved into spliceosomal introns, although
direct evidence in terms of conservation of sequence
and structure is lacking. It is also noteworthy that
group II introns resemble retrotransposons that lack
a long terminal repeat, both in mechanism of integration and in sequence of the reverse transcriptase
moiety of the intron-encoded protein. These retrotransposons make up more than 17% of mammalian
genomes. Group II introns and their close relatives
have therefore played a major evolutionary role in
shaping the human genome.
Potential Applications
Further Reading
of eukaryotes. Group II introns are known in eukaryotic organellar (but not nuclear) genomes, as well as
in eubacterial chromosomes and plasmids. Though
common in some organellar genomes, self-splicing
introns are extremely rare elsewhere, and seem to be
entirely absent from most prokaryotic genomes as
well as many eukaryotic nuclear genomes.
Invariants, Phylogenetic
W Fitch
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0710
Figure 1 (panels A–E)
Reference
Inversion
N Grindley
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0711
Inverted Repeats
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1881
Inverted repeats are two copies of the same DNA sequence repeated in opposite orientation in the same
molecule.
See also: Repetitive (DNA) Sequence
Isochromosome
M A Ferguson-Smith
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0717
Isolation by Distance
N E Morton
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1426
work was based on a hierarchical model of local populations (demes) in successively larger regions, each
with its own gene frequencies. This model is difficult
to apply to real populations, and so has been superseded by the theory of Gustave Malécot for pairs of
individuals born at a known distance d in a given
region. Genetic similarity is measured by kinship $\varphi_d$,
the probability that a gene drawn randomly from one
individual be identical by descent with a random allele
in the other individual. If the pair are spouses, kinship
is the inbreeding $F_d$ of their children. The Malécot
equation is usually written as $\varphi_d = (1 - L)\,a\,e^{-bd} + L$,
where a (0 < a < 1) is kinship within a local population (d = 0) and
L (−1 < L ≤ 0) is kinship at large
distance. If current gene frequencies are used in
kinship bioassay on genotypes, phenotypes, or surnames, $L = -\varphi_R/(1 - \varphi_R)$, where $\varphi_R$ is random kinship in the sampled region. If kinship in relation to
founder gene frequencies is predicted from migration
or genealogy, L = 0. The parameters a, b are functions
of effective population size N and systematic pressure m largely due to migration. Validity of this equation depends on discreteness of local populations.
More complicated expressions derived for continuous
distributions and two or three dimensions are less
accurate for real populations. Oceanic islanders
and nomadic populations have small values of b
compared with coastal islanders and agriculturists.
Kinship increases rapidly in populations with preferential consanguineous marriage but then reaches
a plateau that is not much greater than for isolates
that avoid consanguineous marriage. The effect of
migration is everywhere apparent. This evidence
helped to resolve misunderstanding about the role of
population structure in assessing forensic DNA identification.
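The behavior of the Malécot equation can be explored numerically. The sketch below assumes the form kinship(d) = (1 − L)·a·exp(−b·d) + L, with L = −φR/(1 − φR) when current gene frequencies are used, which is a reading of the equation given above; the function names and any parameter values passed to them are hypothetical and chosen only for illustration.

```python
import math


def malecot_kinship(d, a, b, L=0.0):
    """Kinship at distance d under the assumed form (1 - L) * a * exp(-b * d) + L."""
    return (1.0 - L) * a * math.exp(-b * d) + L


def L_from_random_kinship(phi_R):
    """Large-distance kinship L when current gene frequencies are used."""
    return -phi_R / (1.0 - phi_R)


def half_distance(b):
    """Distance over which the distance-dependent term falls by half."""
    return math.log(2.0) / b
```

With L = 0 the function returns a at d = 0 and decays toward zero with increasing distance; a small value of b, as reported for oceanic islanders and nomadic populations, corresponds to a slow decline of kinship with distance.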
In recent years the Malécot model has been useful
for study of linkage disequilibrium. Distance between
loci or nucleotide polymorphisms is measured along
the physical or genetic map, usually in kilobases (kb)
or centimorgans (cM), taking advantage of the fact
that recombination acts on allelic association in the
same way as migration acts on kinship. Isolation by
distance has become a cornerstone of genetic epidemiology, as it has long been for population genetics and
anthropology.
Further Reading
Isoleucine
J Read and S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.2076
Figure 1 Isoleucine.
Isomerization (of Holliday Junctions)
Figure 1 (A) With some base pairs unstacked, the Holliday junction takes on an X-form. The two DNA strands
that cross in the middle of this X (labeled I) are those that exchanged places in the formation of the Holliday junction.
Rotation of the upper arms, as shown by the circular arrow, reveals that the structure has a hole in its center. (B) A
rotation of two side arms of this structure relative to the other two gives a configuration in which the other two
strands cross in the center (shown at II). (C) If the crossing strands are cut at I in (A), the structure is resolved as a
noncrossover. If the crossing strands are cut at II, a crossover results.
molecule are called isomers and the process by which
a molecule adopts an alternative structure is called
isomerization. These terms are applied to Holliday
junctions.
Isomerization of a Holliday junction is conceived
as beginning with some bases becoming unstacked.
That is, the eight bases at the junction no longer interact with their neighbors in the same DNA strand. This
allows the arms of the junction to open out so that the
structure takes on the form of an X as shown in the
first structure in Figure 1A. The process of isomerization of a Holliday junction is described as two
Further Reading
Isotype
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1884
Isotype Switching
See: Recombination in the Immune System
J
J Gene
See: Recombination in the Immune System
Jackknifing
See: Trees
Jacob, Francois
K Handyside, E Keeling and S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0721
Further Reading
http://www.nobel.se/medicine/laureathes/1965/jacob.bio.html.
http://www.rockefeller.edu/pubinfo/jacob.nt.html.
Jukes–Cantor Correction
P Pamilo
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0723
$-\tfrac{3}{4}\log\left(1 - \tfrac{4}{3}p\right)$
p=L1
4p=32
Further Reading
Jumping Genes
See: Horizontal Transfer; Transposon Excision;
Transposons as Tools
L
lac Mutants
J Parker
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0737
lac Operon
J Parker
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0738
Lactose
J H Miller
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0741
K
Karyotype
M A Ferguson-Smith
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0725
kb (Kilobase)
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1885
Har Gobind Khorana (1922– ) is one of the most outstanding geneticists in the world. Khorana may be best
Kimura Correction
N Saitou
$-\log\left(1 - p - \tfrac{1}{5}p^{2}\right)$
References
Kimura, Motoo
J F Crow
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0728
Further Reading
Kimura M (1983) The Neutral Theory of Molecular Evolution. Cambridge: Cambridge University Press.
Kimura M (1994) In: Takahata A (ed.) Population Genetics,
Molecular Evolution, and the Neutral Theory: Selected Papers.
Chicago, IL: University of Chicago Press.
Kin Selection
See: Hamilton's Theory
Reference
Kinetochore
M A Hulten and C Tease
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0731
Structure
The somatic vertebrate kinetochore, when viewed by
standard transmission electron microscopy, is a trilaminar structure on the surface of centromeric
heterochromatin of each chromatid of a chromosome
(Figure 1). A fourth, fibrillar layer can also be discerned adjacent to the trilaminar structure.
Centromere-Associated Proteins
Many kinetochore proteins have been identified
although their functions have not been fully
characterized. The locations of some of these
proteins within the somatic vertebrate kinetochore
are illustrated in Figure 1. In vertebrates, some centromere-associated proteins are present as constitutive
elements of the kinetochore throughout the cell cycle,
e.g., CENP-A, -B, -C. Others show a transient pattern
of association, and are termed passenger proteins,
being present usually from late G2 to mitotic
anaphase, e.g., CENP-E, -F, INCENP.
Figure 1 Schematic representation of somatic kinetochores as viewed by conventional electron microscopy. The
various components of the kinetochore are identified and the locations of some of the centromere-associated
proteins are also indicated. Labeled components, from the microtubules inward, with associated proteins in parentheses: fibrous corona (CENP-E; dynein), outer plate (CENP-E; CENP-F; BUBR1), central zone (3F3/2 antigens), inner plate (CENP-A; CENP-C; CENP-G), heterochromatin (CENP-B; INCENP).
CENP-A and -C are essential for kinetochore
function. In mice lacking these proteins, cell division
is irregular and embryos die early in development.
CENP-A is a histone H3-like protein; it may be
involved in the epigenetic marking of kinetochore-associated chromatin and possibly also in the recruitment of CENP-C to the kinetochore. CENP-C is
present at active centromeres, including neocentromeres (de novo sites of kinetochore activity outwith
the centromere), but absent from inactive centromeres
(e.g., in dicentric chromosomes). CENP-B binds to a
specific 17-bp DNA sequence that shows wide conservation in vertebrates. However, the functional role of
this protein in kinetochore formation and activity is
unclear. It is present on both active and inactive centromeres but is not present in neocentromeres.
M A Ferguson-Smith
Further Reading
Klinefelter Syndrome
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0732
Knockout
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0733
Knockout alleles are generated by an in vitro process of
homologous recombination in embryonic stem cells.
See also: Embryonic Stem Cells
Kornberg, Arthur
T N K Raju
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0734
References
Kornberg Enzyme
See: Kornberg, Arthur
Kuru
M A Ferguson-Smith
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0736
Lagging Strand
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1886
Lamarckism
G S Stent
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0743
`Lamarckism' refers to the first comprehensive theory of evolution developed by the French natural
historian Jean Baptiste Lamarck and set forth by him
in his treatises Recherches sur l'organisation des corps
vivants (1802) and Philosophie zoologique (1809).
Lamarck's theory was based on his lifelong direct
observation of plants and animals, which provided
him with a sense of the dynamic quality of life, as
well as of the close interdependence of physical and
vital processes in which life is grounded. As originally
formulated, Lamarckism was part of an elaborate surmise about processes for whose operation Lamarck
had no direct evidence.
Lamarckism asserted that all living things have arisen
via a continuous process of a gradual modification
science: the Russian agronomist Trofim Lysenko, who
dominated (not to say destroyed) genetics in the
Soviet Union and its satellite popular democracies
from the mid-1930s until the mid-1960s. Lysenko
was not openly opposed to classical Darwinism
(Karl Marx having been a great admirer of Darwin),
but he declared neo-Darwinism, with its reliance on
Mendelian genetics and gene mutation, to be idealist,
racist, metaphysical speculation propagated by the
Catholic Church and the Fascists to keep the proletariat
intellectually enchained.
At first, in the 1930s, Lysenko denied that he
was a Lamarckist and declared that ``starting from
Lamarckian positions, the work of remaking the
nature of plants by `education' cannot lead to positive
results.'' Then, when he became director of the
Institute of Genetics of the Soviet Academy of
Sciences in 1940 and had Stalin's ear, Lysenko declared
Mendelian genetics erroneous. By 1948, when he had
ruthlessly silenced any Soviet geneticists who opposed
him, he no longer concealed his adherence to
Lamarckism, declaring that:
the well-known Lamarckian propositions, which recognize
the active role of the conditions of the external environment
in the living body and the inheritance of acquired characters,
in contrast to the metaphysics of neo-Darwinism, are indeed
scientific.
Lampbrush Chromosomes
H C Macgregor
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0745
[Figure labels: nuclear membrane; nucleus containing lampbrush chromosomes; cytoplasm and yolk.]
Figure 2 A region of a lampbrush chromosome showing the interchromomeric axial fiber (cf ) connecting small
compact chromomeres (c), chromomeres bearing pairs (L) or multiple pairs (LL) of loops, loops of different
morphologies, polarization of thickness along individual loops, loops consisting of a single unit of polarization (P), and
loops with several tandem units of polarization having the same or different directions of polarity (PPP).
Some particularly distinctive loops can be used for
chromosome identification and the construction of
LBC maps. Loops arising from the same chromomere
have the same appearance and are usually, though not
always, of the same length (Figure 2).
The general pattern of events during the lampbrush
phase of oogenesis is one of extension followed by
retraction of the lampbrush loops and there is a clear
inverse relationship between loop length and chromomere size. The longer the loop, the smaller the
chromomere, and vice versa.
Most lateral loops have an asymmetrical form.
They are thin at one end of insertion into their
chromomere and become progressively thicker
towards the other end (Figure 2). If an LBC is
stretched, breaks first happen transversely across the
chromomeres so that the resulting gaps are spanned by
the loops that are associated with the chromomeres
(Figure 3). This demonstrates the structural continuity between the main axis of the chromosome, the
interchromomeric fiber, and the axes of the loops.
Lampbrush loops are sites of active RNA synthesis
and RNA is being transcribed simultaneously all
along the length of the loop. In newts, there are more
than 20 000 RNA-synthesizing loops per oocyte. Particular loops may be present or absent in homozygous
or heterozygous combinations, and the frequency of
these combinations within and between bivalents
signifies that the loops assort and recombine like pairs of
Mendelian alleles. So there appears to be an element
of genetic unity in a loop-chromomere complex.
By 1960, it was known that an LBC has two DNA
duplexes running alongside one another in the interchromomeric fiber, compacted into chromomeres at
intervals and extending laterally from a point within
each chromomere to form loops where RNA transcription takes place. Each duplex represents one
chromatid (Figure 4). New technologies of the late
1970s confirmed this model and extended it.
Figure 3 Breakage of a stretched lampbrush chromosome across a chromomere, such that loops associated
with the chromomeres come to span the gap between
the two halves of the chromomere.
A technique that removed most of the protein from
chromosomes, leaving only the DNA and attached
newly transcribed RNA, and then visualized what
was left by electron microscopy, showed a lampbrush
loop as a thin DNA axis with RNA polymerase molecules lined up and closely packed along its entire
length. Each polymerase carried a strand of RNP. At
one end of the DNA axis, the RNP strands are short.
that `gene.' In effect, the loop is a large object, consisting of hundreds of RNA copies of the gene, all clustered at one position on the chromosome set. Isolate
and purify the DNA of that gene, and label it in some
way, and it will be easy to make it single-stranded
and bind it specifically to the complementary single-stranded RNA attached to the lampbrush loop. The
technique is known as DNA/RNA transcript in situ
hybridization (DR/ISH).
The end product of an experiment involving DR/
ISH is a preparation showing one or more pairs of
loops with label distributed along their lengths. It is
not uncommon in DR/ISH experiments to find loops
that are labeled over only part of their lengths. This is
evidence that the DNA sequence of a loop axis can and
does change from place to place along the length of the
loop. Wherever there are partially labeled loops, it is
usual to find the same partially labeled loops, with
precisely the same pattern of labeling, in every oocyte
over quite a wide range of size and stage. So loops are
permanent structures that transcribe from the same
stretch of DNA axis throughout the entire lampbrush
phase. DR/ISH experiments prove that highly repeated short DNA sequences, commonly referred to
as `satellite' DNA, which could not possibly serve as a
basis for transcription and translation into functional
polypeptides, are abundantly transcribed on lampbrush loops along with more complex sequences that
are definitely translated into functional proteins.
The current hypothesis for LBC function is as follows. At the thin base of each loop or the start of each
transcription unit there is a promoter site for a functional gene sequence. RNA polymerase attaches to
this site and moves along the DNA, transcribing the
sense strand of the gene and generating messenger
RNA molecules that remain attached to the polymerase (Figure 5). In the lampbrush environment there
are no stop signals for transcription, so the polymerases continue to transcribe past the end of the
functional gene and into whatever DNA sequences
lie `downstream' of the gene. This results in very
long transcription units, very long transcripts, mixing
of gene transcripts with nonsense transcripts in high
molecular weight nuclear RNA, and lampbrush loops.
This `read-through' hypothesis predicts that the number of functional genes that are expressed to form
translatable RNAs may be expected to equal the number of transcription units that are active in a lampbrush
set. The hypothesis says, in effect, that the only unusual feature of an LBC, and the very reason for the
lampbrush form, is that once transcription starts it
cannot stop until the polymerase meets another promoter that is already initiated or some condensed
chromomeric chromatin that is physically impenetrable and untranscribable.
Some LBCs have long loops and others have very
short ones. We have seen that the transcription units of
LBCs are unusually long because they include interspersed repetitive elements of the genome. Structural
genes in large genomes are more widely spaced than
in small genomes as they are interspersed with noncoding DNA. One might therefore expect LBCs from
large genomes to have longer loops (transcription
units) than those of smaller genomes, and this is what
has been observed.
Many of the very long loops that we see in LBCs
from animals with large genomes show multiple, tandemly arranged thin-thick segments (transcription
units). The individual transcription units within one
loop can have the same or opposite polarities and can
be of the same or different lengths (Figure 6). This
observation suggests that it is really the transcription
unit that is the ultimate genetic unit in an LBC and
not the loop/chromomere complex, as was once
thought.
Why do LBCs exist at all? They are characteristic
of eggs that develop quickly into complex multicellular organisms independently of the parent. A frog's
egg is fertilized and develops into a complex tadpole
within a few days. Much of the information and raw
materials for this process are laid down during oogenesis through activity of LBCs and amplified ribosomal
genes and the accumulation of yolk proteins imported
from the liver. LBCs may therefore be regarded as an
adaptive feature that has evolved to preprogramme the
egg for rapid early development. The fact that they are
not present in mammalian eggs could be regarded as
an advanced feature that is consistent with the slow
pace of mammalian development. A frog's egg, for
example, will have completed gastrulation and the
differentiation of its central nervous system and
Figure 6 The various arrangements of transcription
units that actually occur on lampbrush chromosomes.
The loop on the left comprises a single transcription
unit. In the middle loop there are two transcription units
of the same size and polarity. The right-hand loop has
four transcription units of different sizes and different
directions of polarity.
Late Genes
E Kutter
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0746
Leader Peptide
J Parker
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0749
Leader Sequence
P S Lovett
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0750
Transcription in Bacteria
Transcription attenuation comprises one level of regulation for most amino acid biosynthetic operons in
enteric bacteria. Nucleotide sequences within the
leader cause the formation of a domain of secondary
structure which acts as a transcription termination
signal for bacterial RNA polymerase. Transcription
initiated in the upstream promoter terminates within
the leader so as to prevent RNA polymerase from
entering the structural genes of an operon. Transcription termination is relieved when the intracellular
concentration of the end-product amino acid of the
operon-specified enzymes falls below some minimal
level. The level of the end-product amino acid is
sensed by the translation of a short leader-encoded
open reading frame (ORF) immediately upstream of
the transcription termination signal; the open reading
frame contains one or more codons for the operon
end-product amino acid. Low intracellular levels of
the end-product amino acid prevent high level charging of the cognate tRNA, resulting in ribosomal pausing at leader codons for the end-product amino acid.
The paused ribosome interferes with the secondary
structure of the transcription terminator causing the
formation of a second configuration in the mRNA,
the attenuator, which allows transcription to enter the
downstream operon coding sequence.
Transcription antitermination involves the formation of a transcription termination structure in leader
mRNA, which is either inhibited or facilitated by the
interaction of a protein (or a tRNA molecule) with
leader mRNA sequences. For example, the TRAP
protein plus tryptophan binds to the leader sequence
of the Bacillus subtilis trp operon causing the formation
of a transcription terminator. In the absence of tryptophan, TRAP fails to bind to the leader sequence and
an antiterminator structure forms allowing transcription to enter the operon. Other operons that follow
Translation in Bacteria
Translation attenuation regulates several antibiotic
inducible, antibiotic resistance genes (e.g., cat, erm).
A domain of secondary structure in leader mRNA
sequesters the ribosome binding site for the downstream resistance determinant, preventing translation
initiation. Antibiotic-induced ribosome stalling in a
short open reading frame within the leader causes
destabilization of the secondary structure, which
frees the ribosome binding site allowing translation
of a coding sequence whose protein product can
neutralize the antibiotic.
Translational repression is well exemplified by certain operons encoding bacterial ribosomal proteins.
The translational repressor is a single ribosomal protein encoded by the operon; the nonregulatory function of this protein is to act as a structural component
of the ribosome. In several examples, the binding
target for the repressor protein in the operon leader
sequence mimics the structure or sequence of the
rRNA target for the same protein. Binding of
the regulatory protein to leader mRNA is presumably
of lower affinity than that for rRNA binding in vivo.
Leader binding by the repressor interferes with translation of operon mRNA by occluding the ribosome
binding site or by changing the secondary structure of
the leader.
Translation in Eukaryotes
Translation in eukaryotes is typically initiated by the
scanning of a 40S ribosomal preinitiation complex.
Scanning begins at the 5′ capped end of the mRNA
and halts at the first initiator codon, usually AUG,
where translation begins. Translation initiation efficiency at any particular AUG is affected by the context of the leader sequence flanking the AUG codon;
a preinitiation complex may ignore an AUG codon
located in a region of poor context. Several features of
the leader sequence can dramatically decrease translation of the main (downstream) coding sequence:
Leading Strand
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1887
Least Squares
W-H Li and K Makova
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1497
The least squares method is a well-established statistical method of parameter estimation. This method
chooses predicted values $e_i$ that minimize the sum of
squared errors of prediction $\sum_i (d_i - e_i)^2$ over all sample
points $d_i$ (observed values). The least squares method
has been utilized in molecular evolution to estimate
the branch lengths in a phylogenetic (evolutionary)
tree and to estimate the topology of a tree. The least
squares estimates of the branch lengths $b_i$ are the
estimates $e_i$ that minimize the following sum of
squares: $\sum_{i,j} (d_{ij} - e_{ij})^2$, where $d_{ij}$ is the observed
evolutionary distance between taxa $i$ and $j$, and $e_{ij}$ is
the sum of the length estimates ($e_i$'s) of the branches connecting taxa $i$ and $j$. To choose the best topology
according to the least squares criterion, the above
sum is computed for each possible topology and the
topology with the smallest sum is taken as the best
tree.
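The procedure just described can be illustrated with a minimal Python sketch for four taxa. Everything concrete here is an assumption made for illustration and does not come from the entry: the distance matrix `D` is hypothetical (it was generated from a known tree so that the answer can be checked), the function names `path_matrix` and `fit_topology` are invented, and ordinary unconstrained least squares via NumPy is used, which can return negative branch lengths for a poorly fitting topology.

```python
import itertools
import numpy as np

def path_matrix(n_taxa, left):
    """Build the 0/1 matrix whose rows are taxon pairs and columns are branches.

    Columns 0..n_taxa-1 are the terminal branches; the last column is the
    single internal branch separating the taxa in `left` from the others.
    """
    pairs = list(itertools.combinations(range(n_taxa), 2))
    X = np.zeros((len(pairs), n_taxa + 1))
    for row, (i, j) in enumerate(pairs):
        X[row, i] = X[row, j] = 1.0        # both terminal branches lie on the path
        if (i in left) != (j in left):      # the path crosses the internal branch
            X[row, -1] = 1.0
    return X, pairs

def fit_topology(D, left):
    """Return (branch length estimates, residual sum of squares) for one topology."""
    X, pairs = path_matrix(len(D), left)
    d = np.array([D[i][j] for i, j in pairs], dtype=float)
    b, *_ = np.linalg.lstsq(X, d, rcond=None)   # ordinary least squares
    rss = float(np.sum((d - X @ b) ** 2))        # sum over pairs of (d_ij - e_ij)^2
    return b, rss

# Hypothetical distances among taxa 0..3, generated from the tree ((0,1),(2,3))
# with terminal branches 1, 2, 3, 4 and an internal branch of length 5.
D = [
    [0, 3, 9, 10],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [10, 11, 7, 0],
]

# The three possible unrooted topologies for four taxa.
topologies = {"(0,1)|(2,3)": {0, 1}, "(0,2)|(1,3)": {0, 2}, "(0,3)|(1,2)": {0, 3}}
fits = {name: fit_topology(D, left) for name, left in topologies.items()}
best = min(fits, key=lambda name: fits[name][1])   # smallest sum of squares wins
```

With real data the distances are not perfectly tree-like, so every topology leaves a nonzero sum of squares and the comparison is between residuals of similar size. For more than a handful of taxa the number of topologies grows too quickly for this exhaustive comparison, and heuristic searches are used instead.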
See also: Phylogeny; Trees
Lederberg, Joshua
E Kutter
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0752
Lederberg's interests extend well beyond basic
science. He has played a number of important roles
in the international health community, including
spending years on the World Health Organization's
Advisory Health Research Council and serving as
chairman of the President's Cancer Panel and of
the congressional Technology Assessment Advisory
Council. He also chaired a UNESCO committee on
improving global internet communications for science
and helping third-world people get onto the internet
so they can be more involved in the process. Family
has played an important role in his life. His father, a
Rabbi who emigrated from Israel shortly before his
birth, had a strong impact. His French-born wife is a
Clinical Professor of Psychiatry at Memorial Sloan
Kettering Cancer Center, and he has two children,
David and Annie. His life has exemplified the basic
advice he gave to young people in a recent interview (www.almaz.com/nobel/medicine/lederberginterview.htm):
Try hard to find out what you're good at, and what your
passions are, and where the two converge, and build your life
around that . . . and make deliberate choices.
Leguminosae
J J Doyle
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1642
The legume or bean family (Leguminosae or Fabaceae), with over 650 genera and 18 000 species, is the
third largest family of flowering plants (angiosperms),
behind only orchids (Orchidaceae) and the composite
or sunflower family (Asteraceae or Compositae).
Morphologically and ecologically, it is a very diverse
family, ranging from tiny alpine ephemerals to huge
tropical rainforest canopy trees. As much as one-third
of the family's species are concentrated in a handful
of large genera, such as Acacia, Astragalus, and
Mimosa, that have radiated abundantly in disturbed
habitats.
The family is characterized by its distinctive (and
eponymous) fruit, a two-valved pod whose halves
separate to disperse the seeds; however, this form is
modified into a wide variety of shapes and sizes,
including indehiscent dry or fleshy forms. Symbioses
with nitrogen-fixing soil bacteria (collectively called
Molecular phylogenetic studies support the naturalness (monophyly, descent from a single common
ancestor) of the Leguminosae (Figure 1). The legumes
are related to taxa of the broad `rosid'
alliance, which includes a major portion of angiosperm
diversity, among them other families that participate in nitrogen-fixing symbioses. Within this large
clade (the descendants of a single common ancestor),
the relationships of the family are more controversial.
Morphological and chemical data suggest affinities
with families such as Connaraceae or Sapindaceae,
but molecular results ally the legumes with families
previously not suggested as close relatives: Polygalaceae (milkworts), Surianaceae, and Quillaja, an
anomalous member of the Rosaceae (rose family).
[Figure 1 tree labels. Outgroups: Quillaja, Polygalaceae, Surianaceae. Caesalpinioid groups: Cercideae; Cassieae (various); Detarieae; Cassia, Senna, and Chamaecrista (Cassieae: Cassiinae); Caesalpinieae (various); Sclerolobium group; Dimorphandra group. Mimosoideae. Papilionoid tribes: Swartzieae, Sophoreae, Dalbergieae, Mirbelieae/Bossiaeeae, Amorpheae, Dipterygeae, Aeschynomeneae, Adesmieae, Thermopsideae, Podalyrieae (and Liparieae), Crotalarieae, Genisteae, Loteae, Robinieae, Millettieae, Galegeae, Hedysareae, Carmichaelieae, Trifolieae, Vicieae, Cicereae, Indigofereae, Phaseoleae, Desmodieae, Psoraleeae. Informal lineage names: aeschynomenoids, genistoids, Hologalegina, galegoids, phaseoloids. Marked chloroplast mutations: 50 kb inversion; cp IR loss. Nodulation origins marked: N(1) (Chamaecrista), N(2) (Dimorphandra group), N(3). Representative genera: peanut (Arachis), lupin (Lupinus), clover (Trifolium), Sesbania, alfalfa (Medicago), Lotus, broad bean (Vicia), pea (Pisum), Wisteria, lentil (Lens), grass pea (Lathyrus), chickpea (Cicer), common bean (Phaseolus), cowpea/mung bean (Vigna), hyacinth bean (Lablab), yam bean (Pachyrhizus), winged bean (Psophocarpus), soybean (Glycine), kudzu (Pueraria), pigeonpea (Cajanus), sword/jack bean (Canavalia).]
Figure 1 Phylogenetic relationships of legumes summarized from several phylogenetic studies, mostly using gene
sequences from the chloroplast genome. Vertical heavy lines next to names indicate groups whose relationships are
unresolved, or which themselves represent several lineages that are not closely related. The ancestor of all
Leguminosae is indicated by an arrow. Two of the three subfamilies are also indicated: Mimosoideae as a terminal unit
of the tree and Papilionoideae by an arrow pointing to its ancestor; all other legume taxa (genera or tribes) are
Caesalpinioideae. Major lineages of papilionoid legumes referred to in the text are boxed, with the informal name of
the group in bold type (e.g., `phaseoloids') within the box. Dashed lines connect particular tribes to boxes containing
economically or scientifically important representatives. `N' indicates groups known to be capable of nodulation,
followed by a number in parentheses that refers to one of three potentially independent origins of the syndrome. The
vast majority of Papilionoideae derived from the ancestor indicated as `N(3)' are known to nodulate, but some do
not. Two chloroplast DNA structural mutations are indicated by arrows; the taxonomic distribution of the 50 kb
inversion is not precisely known due to lack of sampling.
numerous in many species, providing the main floral
display. Unlike other legumes, many mimosoids shed
their pollen in polyads of 16 or 32 grains. Mimosoideae has some very close allies that are classified as
Caesalpinioideae; this has been suspected for some
time based on morphology, and the hypothesis has
been supported by molecular data (Figure 1).
Papilionoideae is the subfamily most people visualize when legumes are mentioned. It is by far the
largest and ecologically most diverse of the three subfamilies, with some 450 genera and 12 000 species. Its
members typically have bilaterally symmetrical flowers like those of pea (Pisum), with two wing petals,
two keel petals, and a large standard petal. It is this
butterfly-like (`papilionoid') floral morphology that
gives its name to the subfamily. This morphological
condition is derived, and some early-diverging members of the subfamily retain the radially symmetrical
floral morphology of Caesalpinioideae and other
rosid families. The large papilionoid radiation appears
to have no particularly close allies among caesalpinioid taxa (Figure 1).
Fossil Record
Legumes have a well-developed macrofossil record in
the Eocene (c. 50 million years ago) that includes
flowers of each of the three subfamilies, suggesting
that the major radiation of legumes occurred prior to
that time. The ages of particular genera within each
subfamily are more problematic, and thus it is difficult
to say when, for example, pea diverged from soybean,
or the various species of Phaseolus diverged from their
common ancestor.
The divergence of the family from other rosid taxa
is also difficult to determine with any precision. This is
at least in part due to the fact that ancient legumes, like
many modern Caesalpinioideae, were most probably
fairly stereotypical rosids in much of their morphology. The rich Cretaceous floral fossil record from
around 92 million years ago indicates that by that
time most major lineages of flowering plants had
diverged from one another. For example, fossil representatives of the lineage that includes Arabidopsis have
been described from these Cretaceous deposits, indicating that legumes and Arabidopsis diverged at least
this long ago. A possible ceiling for this divergence is
around 110 million years ago, given the paucity of
angiosperm fossils prior to this period, though this is
in conflict with some estimates of angiosperm divergences based on molecular clock assumptions.
Evolution of Nodulation
Molecular phylogenies have identified a subgroup
within the large rosid radiation that includes families
that participate in nitrogen-fixing symbioses. However, within this `nitrogen-fixing clade' the various
nodulating families do not share a single common
ancestor, suggesting that the ability to participate in
these symbioses arose independently in different plant
groups, but that an ancestor of the entire group may
have evolved some key (but unknown) innovation
that facilitates the formation of symbioses with
diverse nitrogen-fixing microsymbionts. These are
various `rhizobia,' in legumes and Parasponia (Ulmaceae), or actinorhizal bacteria in other families. Some
molecular similarities between mycorrhizal and
nitrogen-fixing symbioses have been noted, and it
may be that machinery of the pre-existing and more
widespread mycorrhizal relationship was co-opted
and modified in the evolution of nodulation. It now
appears that genes encoding `nodulins' (proteins that
function in the nodule) are not, strictly speaking,
novel or uniquely nodular, but have been recruited
from other functions. Even the quintessential nodulin,
leghemoglobin, whose presence in legumes was once
considered so unexpected that it was thought to be a
case of horizontal gene transfer from the animal kingdom, is now known to be part of a plant gene family
whose membership includes paralogous copies that
are not associated with symbiosis.
The ability to nodulate is characteristic of most
mimosoid and papilionoid legumes, but it is not universal in the family. Most Caesalpinioideae do not
nodulate, nor do some early-diverging lineages of
Papilionoideae. The phylogenetic distribution of
nodulation in Leguminosae suggests that the ability
to nodulate is not primitive in the family, and that
nodulation may have arisen several times (Figure 1).
There are also cases of loss of the ability to nodulate,
which complicates the picture. Nodulation involves
the production of a special organ, the nodule, and also
what has been called a novel organelle, the symbiosome, consisting of nitrogen-fixing bacteroids enclosed
in a primarily host-derived peribacteroid membrane.
Independent losses of these structures seem more
likely than does their independent origin, but the fact
that nitrogen-fixing symbioses have almost certainly
arisen multiple times elsewhere in the flowering plants
is a mitigating consideration.
The details of nodulation vary even within Papilionoideae, where nodulation almost certainly had a single origin. The ancestral type of nodule appears to be
an indeterminate, unbranched type that is also found
in mimosoids and caesalpinioids. This `caesalpinioid'
type has been modified in several ways among papilionoids, including highly branched indeterminate
types such as are common on Trifolieae, the large
girdling indeterminate type of Lupinus, the clustered
determinate types of aeschynomenoid taxa such as
Arachis, and the single globular determinate `desmodioid' nodules of Loteae and many phaseoloids. The
desmodioid nodule appears to have originated independently in these two groups, and the determinate
condition itself apparently arose yet another time in
aeschynomenoids. Nitrogen is transported as amide
compounds in many legumes, but as ureides in at least
some phaseoloids.
Genomes
In some legumes the chloroplast genome departs from
the common pattern of highly conserved gene content
and order typical of photosynthetic angiosperms.
Most notably, the large inverted repeat typical of land
plant chloroplast chromosomes has been lost from all
of the members of the major lineage of Hologalegina
that contains Vicieae and allied temperate herbaceous
tribes, plus Wisteria and allies. In some but not all
species that have lost the inverted repeat there has
been considerable subsequent rearrangement. Other
major rearrangements include a 50 kb inversion found
in most papilionoid species, and a 78 kb inversion
found in Phaseolus and close allies (subtribe Phaseolinae). A number of chloroplast gene and intron losses
have been reported from the family. Several related
legume genera (e.g., Medicago, Lens, Cicer) depart
from the typically maternal pattern of chloroplast inheritance found in angiosperms, and exhibit biparental
or even predominantly paternal transmission.
What is known about mitochondrial genomes of
legumes suggests that they are typical of angiosperms
in being large relative to their counterparts in most
animals, and exhibit a master/subgenomic structure
due to recombination among direct repeats. Recent
transfer of cytochrome oxidase subunit 2 (cox2) from
the mitochondrial to the nuclear genome has occurred
in Phaseoleae, with complex patterns of expression
and, in some cases, subsequent loss from the mitochondrial genome.
Nuclear genomes of legumes vary greatly in size.
The smallest legume genomes are only around twice
the size of that of Arabidopsis thaliana, and are found
in species of Lablab, Scorpiurus, Trifolium, and Vigna;
genomes of the model legumes Lotus japonicus and
Medicago truncatula are only slightly larger than
these. At the other end of the spectrum, the genomes
of some diploid (based on chromosome number) Vicia
and Lathyrus species are nearly 100 times as large as
that of Arabidopsis. Variation can be extreme within
genera: diploid species of Vicia vary in their genome
sizes from around 4 pg/2C (a haploid genome size of
around 2000 Mb) to over 50 pg/2C (about 25 000 Mb). As is
true for flowering plants in general, information on
genome sizes is limited for legumes. A handful of
papilionoid genera have been surveyed in detail, but
otherwise there are few published values, with particularly sparse sampling in Caesalpinioideae and
Mimosoideae. The sparse data from these subfamilies
suggest that relatively small genome sizes are ancestral
in the family as a whole. Cercis and Bauhinia of the
caesalpinioid tribe Cercideae, which molecular data
suggest is one of the earliest diverging legume groups,
have genome sizes of 1.3 and 1.2 pg/2C, respectively.
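As a rough guide to what these 2C values mean in base pairs, the following Python sketch converts a 2C DNA content in picograms to an approximate haploid genome size. The conversion factor of about 978 Mb per picogram is a commonly used approximation and is an assumption here (the entry does not state one), as is the function name `haploid_mb_from_2c`. The calculation also assumes a diploid species, so that the haploid content is half of the 2C value.

```python
MB_PER_PG = 978.0  # assumed conversion: 1 pg of DNA is roughly 978 Mb

def haploid_mb_from_2c(pg_2c, mb_per_pg=MB_PER_PG):
    """Convert a 2C DNA content (pg) to an approximate haploid (1C) size in Mb."""
    if pg_2c < 0:
        raise ValueError("DNA content cannot be negative")
    # 2C is the DNA content of an unreplicated diploid nucleus, so halve it first.
    return (pg_2c / 2.0) * mb_per_pg

cercis_mb = haploid_mb_from_2c(1.3)    # Cercis, 1.3 pg/2C
bauhinia_mb = haploid_mb_from_2c(1.2)  # Bauhinia, 1.2 pg/2C
```

Under these assumptions the two Cercideae genomes come out at roughly 600 Mb per haploid genome, which is small by legume standards.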
The legumes as a whole are considered to have a
base chromosome number of x = 7, but many groups
of the family are thought to have experienced early
polyploidization followed in many cases by aneuploid
reduction. This is true, for example, of the entire subfamily Mimosoideae, and its closest allies in Caesalpinioideae, which as a group is thought to be tetraploid at
x = 14. Similarly, the entire Detarieae-Amherstieae
lineage of caesalpinioid legumes is considered to
be based on x = 12. Within individual tribes there are
relatively few genera that are wholly polyploid,
among them Glycine, which is 2n = 40, as compared
with most Phaseoleae at 2n = 22; however,
Further Reading
Crisp MD and Doyle JJ (eds) (1997) Advances in Legume Systematics, vol. 7, Phylogeny. London: Royal Botanic Gardens, Kew.
Herendeen PS and Bruneau A (eds) (2000) Advances in Legume
Systematics, vol. 9. London: Royal Botanic Gardens, Kew.
Polhill RM (1994) Classification of the Leguminosae. In: Bisby FA,
Buckingham J and Harborne JB (eds) Phytochemical Dictionary
of the Leguminosae, vol. 1, Plants and their Constituents, pp.
xxxv-lvii. London: Chapman & Hall.
Smartt J (1990) Grain Legumes. Cambridge: Cambridge University Press.
Leiomyoma
See: Lipoma and Uterine Leiomyoma
Lejeune, Jerome
P L Pearson
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0753
French to abortion, a statute that had been in place
since 1975.
See also: Down Syndrome; Ethics and Genetics
Lesch-Nyhan Syndrome
L De Gregorio and W L Nyhan
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0754
present in almost all patients. The clinical consequences of the accumulation of large amounts of
uric acid in body fluids are the classical manifestations of gout.
2. Neurological variant (1.5-8% of residual enzyme
activity). The `neurological' picture has been observed in a small but important group of patients and is
characterized by a neurological examination that is
identical to that of the classic Lesch-Nyhan patient
(i.e., cerebral palsy or athetoid cerebral palsy).
Patients are confined to wheelchairs and unable to
walk. However, behavior is normal and intelligence
is normal or nearly normal.
3. Hyperuricemic variant (more than 8% of residual
enzyme activity). The phenotype of the patients
with this partial variant enzyme consists of manifestations that can be directly related to the accumulation of uric acid in body fluids (acute attacks
of gouty arthritis and tophi). Indeed, the central
nervous system and behavior are normal.
The HPRT gene is located on the long arm of chromosome X (Xq26-q27). The gene has been cloned and its
sequence determined: the entire locus spans more than
44 kb, the coding region consisting of 654 nucleotides
in nine exons. The protein contains 218 amino acids.
HPRT is expressed in all tissues, although at different
levels, and the enzyme is particularly active in basal
ganglia and testis. The incidence of LNS has been
estimated to range from 1 in 100 000 to 1 in 380 000.
Characterization of the molecular defect in the HPRT
gene of a number of HPRT-deficient patients has
revealed a heterogeneous pattern of mutations, with
the same alteration rarely being found in unrelated
pedigrees. About 63% of all the described molecular
alterations represent point mutations, giving rise to
either amino acid substitution in the protein sequence
or stop codons, leading to truncated protein molecules. In some instances, the point mutation alters a
splice site consensus sequence, activating an alternative, cryptic splice site, creating aberrant mRNA and
protein products. It has not been possible to clearly
correlate different types of mutations (genotype) with
the various aspects of the clinical manifestations
(phenotype). However, a rough guide predicts that
mutations producing complete disruption of HPRT
enzyme function (stop codons, deletions) are associated with classical LNS, while mutations allowing
some residual HPRT enzyme activity (conservative
amino acid substitutions) are associated with a less
severe phenotype.
The excessive uric acid production in HPRT-deficient patients is effectively treated with daily
administration of allopurinol. This is the unique
and specific treatment available for all the patients
Lethal Locus
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1890
Lethal Mutation
M A Cleary
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0755
Leucine
E J Murgola
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0757
Figure 1 Leucine (structure labels: COOH, H2N, CH2, CH, H3C, CH3).
Leukemia
J D Rowley
Phenotype         Chromosome abnormality    Involved genes
                  t(1;19)(q23;p13)          PBX1-TCF3 (E2A)
                  t(12;21)(p13;q22)         TEL-AML1
B(SIg)            t(8;14)(q24;q32)          MYC-IGH
                  t(2;8)(p12;q24)           IGK-MYC
                  t(8;22)(q24;q11)          MYC-IGL
B or B-myeloid    t(9;22)(q34;q11)          ABL-BCR
                  t(4;11)(q21;q23)          AF4-MLL
                  t(11;19)(q23;p13.3)       MLL-ENL
Other             50-60 chromosomes
                  t(5;14)(q31;q32)          IL3-IGH
                  del(9p), t(9p)            ?CDKN2 (p16)
                  t(9;12)(q34;p13)          TEL-ABL
                  del(12p)                  TEL; ?p27KIP1
Reproduced with permission from Rowley JD (1999) The role of chromosome translocations in leukemogenesis. Seminars in
Hematology 36 (supp. 7): 59-72.
Table 2
Phenotype | Chromosome abnormality | Involved genes
Chromosome abnormality: t(1;14)(p34;q11) | t(1;14)(p32;q11) | t(7;9)(q35;q32) | t(7;9)(q35;q34) | t(7;7)(p15;q11) | t(7;14)(q35;q11) | t(7;14)(p15;q11)
Involved genes: LCK-TCRD | TAL1-TCRD | TAL1Del | TCRB-TAL2 | TCRB-TAN1 | TCRG-? | TCRB-TCRD
Chromosome abnormality: t(8;14)(q24;q11) | inv(14)(q11;q32) | t(14;14)(q11;q32) | t(10;14)(q24;q11) | t(11;14)(p15;q11) | t(11;14)(p13;q11)
Involved genes: MYC-TCRA | TCRA-IGH | TCRA-IGH | HOX11-TCRA | LMO1-TCRD | LMO2-TCRD
Table 3
Disease
Chromosome abnormality
Involved genes
t(9;22)(q34;q11)
t(9;22), 8, Ph, i(17q)
ABL-BCR
ABL-BCR
t(5;12)(q33;p13)
PDGFRB-TEL
t(8;21)(q22;q22)
t(15;17)(q22;q12)
t(11;17)(q23;q12)
inv(16)(p13q22) or
t(16;16)(p13;q22)
t(6;11)(q27;q23)
t(9;11)(p22;q23)
t(1;22)(p13;q13)
t(3;3)(q21;q26)
or inv(3)(q21q26)
t(3;5)(q21;q31)
t(3;5)(q25;q34)
t(6;9)(p23;q34)
t(7;11)(p15;p15)
t(8;16)(p11;p13)
t(9;12)(q34;p13)
t(12;22)(p13;q13)
t(16;21)(p11;q22)
7 or del(7q)
5 or del(5q)
del(20q)
del(12p)
ETO-AML1
PML-RARA
PLZF-RARAMYH11-CBFB
AML-M2
APL-M3, M3V
atypical APL
AMMoL-M4Eo
AMMoL-M4/AMoL-M5
AMegL-M7
AML
Therapy-related AML
7 or del(7q) and/or
5 or del(5q)
t(11q23)
t(3;21)(q26;q22)
AF6-MLL
AF9-MLL
RPN1-EVI1
MLF1-NPM1
DEK-CAN
HOXA9-NUP98
MOZ-CBP
TEL-ABL
TEL-NM1
TLS(FUS)-ERG
TEL, ?p27KIP1
IRF1?
MLL
EAP/MDS1/EVI1-AML1
Leukemia, Acute
M J S Dyer
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1538
The acute leukemias represent the malignant transformation of myeloid and lymphoid precursors within
the bone marrow or thymus. All hematopoietic precursor cells can be transformed. The commonest leukemia in children from developed countries is B-cell
precursor acute lymphoblastic leukemia (BCP-ALL),
which expresses the neutral endopeptidase CD10; these cases
are often referred to as `common' ALL (cALL).
BCP-ALL is the commonest malignancy of childhood. However, as with other malignancies, the
incidence of other forms of acute leukemia and particularly acute myeloid leukemia (AML) increases
with age. The etiologies of the acute leukemias remain
unknown. Although many hypotheses have been
advanced, particularly for childhood BCP-ALL,
none has been proven, owing in large part to the rarity
of the disease. Childhood leukemias may arise in utero
(see below). Familial acute leukemia is rare, although
when it occurs, it frequently exhibits genetic anticipation.
Diagnosis is based on a combination of morphology (particularly for the myeloid/monocytic
leukemias), immunophenotype, and molecular cytogenetics. Most acute leukemias exhibit the immunophenotype of a single hematopoietic lineage, although
some may coexpress molecules associated with two
different lineages; these are known as biphenotypic
acute leukemias. Cytogenetics, increasingly supplemented by fluorescence in situ hybridization (FISH)
and molecular techniques such as reverse transcriptase
polymerase chain reaction (RT-PCR), plays a
Table 1
Diagnosis    Cytogenetic abnormality         Involved genes
             t(12;21)(p13;q21)               ETV6/AML1
             t(9;22)(q34;q11)                ABL/BCR
             t(1;19)(q23;p13)                PBX/E2A
             t(11;14)(p13p15;q11)            LMO1/2-TCRD/A
             t(1;14)(p32;q11)                TAL1-TCRD/A
             Various 11q23 translocations    MLL fusions
             t(8;21)(q22;q22)                CBFα/ETO
             inv(16)(p13q22)                 MYH11/CBFβ
             t(15;17)(q21;q22)               PML/RARα
differentiation. Further dissection of the transcriptional pathways involved may allow the rational introduction of new therapeutic strategies.
A fascinating observation is that many of the childhood leukemias may originate in utero. This conclusion was drawn initially on the basis of data from
identical twins with concordant leukemia, the leukemic stem cell passing from one twin to another due to
the shared placental blood supply. Such twins showed
identical translocation breakpoints and antigen receptor gene rearrangements. These data have now been
confirmed in other patients through the use of Guthrie
blood spots, collected at birth for the screening for
phenylketonuria. These spots contain sufficient DNA
to allow retrospective analysis of the leukemic clone at
birth by using long-range PCR methods to detect the
t(12;21)(p13;q21) breakpoint. This clone may only
present several years later, implying the necessity for
other genetic/environmental events for its eventual
appearance. However, at least some chromosomal
translocations may occur in normal stem cells with
normal capacity to differentiate without giving rise
to overt leukemia.
Further Reading
Rowley JD (1998) The critical role of chromosome translocations in human leukemias. Annual Review of Genetics 32:
495–519.
Wiemels JL, Cazzaniga G, Daniotti M et al. (1999) Prenatal origin
of acute lymphoblastic leukaemia in children. Lancet 354:
1499–1503.
Leukemia, Chronic
M J S Dyer
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1539
Disease Immunophenotype
CLL
SLVL
Deletion/translocations of 13q14
Deletion of 11q23
Deletion of 17p13.1
Trisomy 12 (secondary)
t(14;19)(q32.3;q13)
t(11;14)(q13;q32.3)
t(2;7)(p12;q22)
deletion of 17p13.1
B-PLL
HCL*
T-PLL
T-LGL
ATLL
Sezary's
abnormalities
abnormalities
abnormalities
abnormalities
Unknown
ATM mutation
p53 mutation
Unknown
BCL3 (rare)
CCND1 overexpression
CDK6 overexpression
p53 mutation
TCL1 overexpression
MTCP1 overexpression
ATM mutation
Levan, Albert
M Hulten and K Fredga
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0759
Albert Levan (1905–98) is most famous for his discovery that the diploid chromosome number in humans is
46 and not 48 as had been the dogma since 1912. This
discovery was made with Joe-Hin Tjio at the Institute
of Genetics, University of Lund, Sweden. It resulted
from the application to human cells of a methodology
for chromosome preparation that Levan had pioneered
in plants and animals.
Levan was born and grew up in the Swedish town
of Gothenburg, where his father, who was Director of
Post Services, passed to Albert an interest in classical
languages and botany. After high school, Levan
moved to the University of Lund where he graduated
in botany in 1927. From 1926 to 1931, he held a post as
Assistant at the Institute of Zoology. He was awarded
a PhD in 1935. At the age of 30, he became Assistant
Professor in Genetics, and subsequently, in 1947,
also in Cytology. In 1961, he was awarded a personal
chair in Cytology.
Levan published his first paper in 1929, on the
chromosomes of onions. The large size and clear
morphology of these species' chromosomes made
them especially well suited for both descriptive and
experimental studies.
In 1938, Levan published a very important paper,
entitled ``The effect of colchicine on root mitoses in
In 1953, Levan set up the Cancer Chromosome
Laboratory at the Institute of Genetics of the University of Lund. Initially, the laboratory had only a few
scientists, but soon attracted researchers from all over
the world. When Levan retired in 1976 at the age of 71,
the laboratory had grown to 40 scientists, technicians
and students. He and coworkers made many important contributions to our understanding of cancer cytogenetics, for example:
1. Chromosomal changes in tumor cell lines are not
arbitrary but follow particular developmental
patterns.
2. Measles virus and Rous sarcoma virus lead to an
increased amount of chromosome breakage in
human blood cells.
3. Patterns of chromosome rearrangements are related
to tumor etiology. Thus, histologically identical
tumors can show totally different patterns of
chromosome aberrations dependent on whether
they have been induced by virus or chemicals.
4. Burkitt's lymphoma is characterized by a particular
chromosome aberration of chromosome 14; this
was the second example of a specific chromosome
abnormality in a human tumor cell.
Levan's interest and involvement in chromosome
research continued after his retirement from the
Directorship of the Cancer Chromosome Laboratory.
In particular, he studied the phenomenon of `double
minutes' seen in some tumor types. These chromosomal elements result from the massive amplification
of genes involved in malignancy.
The inspiring atmosphere that Albert Levan created
in his laboratory made every member of staff do their
best. He allowed much freedom of research. Although
the key subject of the laboratory was cancer research,
Levan accepted with much enthusiasm some of his
younger colleagues devoting their time to the study of
shrews, lemmings, hedgehogs, seals, and even whales.
His curiosity for scientific matters never waned, and at
the age of 85 he learnt to use the computer for writing
and correspondence. As a researcher, Levan stands out
as an intuitive and creative talent. He had an unusual
attitude to work. Unlike many others in his laboratory, he adhered stringently to a 9-to-5 working day.
This allowed him time for his many other interests,
including playing the cello and writing music.
Lewis, Edward
K Handyside, E Keeling and S Brenner
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1702
Library
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1891
Ligation
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1893
Lhx2 subfamily
Lhx3 subfamily
Lhx6 subfamily
The members of this group are lim-4 in worm, arrowhead in fly, and Lhx6 and Lhx8 in vertebrates. lim-4 is
necessary for specification of an olfactory neuron
in worms, while in flies, arrowhead is expressed in
neuroblasts and is involved in the development of
abdominal and salivary imaginal cells. In the mouse,
both Lhx6 and Lhx8 are expressed in the developing
forebrain and branchial arches, and loss of Lhx8 in the
mouse results in cleft palate formation.
Islet subfamily
Lmx subfamily
LIM-kinase
Adapter Proteins
Enigma family
Focal adhesions
Future Prospects
The catalog of nuclear LIM proteins is nearly complete. One high-affinity target, NLI, is the basis for
combinatorial association of nuclear LIM proteins
into a transcriptional `code' underlying developmental choices. How these complexes operate in the context of other transcriptional regulators remains to be
determined. The catalog of cytoplasmic LIM proteins
is incomplete and other associated protein motifs are
likely to accompany the LIM domains. Most, if not all,
interact with and regulate the cytoskeleton. In contrast to nuclear LIM domains, a common high-affinity
target for cytoplasmic LIM domains has not been
identified and specific recognition sites remain to be
determined. In both nuclear and cytoplasmic proteins
LIM domains function as recognition modules for
macromolecular assemblies.
Further Reading
Jurata LW, Pfaff SL and Gill GN (1998) The nuclear LIM domain
interactor NLI mediates homo- and heterodimerization of
LIM domain transcription factors. Journal of Biological Chemistry 273: 3152–3157.
Perez-Alvarado GC, Miles C, Michelsen JW et al. (1994) Structure of the carboxy-terminal LIM domain from the cysteine-rich
protein CRP. Nature Structural Biology 1: 388–398.
Limb Development
R Johnson
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0763
Figure 1 Key stages of vertebrate limb development as illustrated in the chick embryo. Limb initiation begins within
the lateral plate mesoderm, which proliferates to form limb buds at the forelimb (FL) and hindlimb (HL) levels. At this
stage outgrowth becomes dependent on the apical ectodermal ridge (AER) and axial pattern is specified through the
action of multiple signaling centers (see text). After further development, a cartilage model of the adult limb is
formed. Illustrated here is the skeletal pattern of a chick wing at 8 days of incubation. Note the conserved features of
all tetrapod limbs: a single proximal long bone (humerus) followed by two long bones (radius and ulna) and capped by
carpals and digits. (Adapted from Johnson and Tabin, 1997.)
Figure labels: en-1, wnt 7a, lmx1b, shh, fgfs, Fgf-10; A/P pattern, D/V pattern, distal outgrowth, AER formation.
References
cascade and feedback loop to integrate growth and patterning of the developing limb bud. Cell 79: 993–1003.
Logan C, Hornbruch A, Campbell I and Lumsden A (1997) The
role of Engrailed in establishing the dorsoventral axis of the
chick limb. Development 124: 2317–2324.
Loomis CA, Harris E, Michaud J et al. (1996) The mouse
Engrailed-1 gene and ventral limb patterning. Nature 382:
360–363.
Loomis CA, Kimmel RA, Tong CX, Michaud J and Joyner AL
(1998) Analysis of the genetic pathway leading to formation
of ectopic apical ectodermal ridges in mouse Engrailed-1
mutant limbs. Development 125: 1137–1148.
MacArthur CA, Lawshe A, Xu J et al. (1995) FGF-8 isoforms
activate receptor splice forms that are expressed in
mesenchymal regions of mouse development. Development
121: 3603–3613.
MacCabe JA, Errick J and Saunders JWJ (1974) Ectodermal
control of the dorsoventral axis in the leg bud of the chick
embryo. Developmental Biology 39: 69–82.
Mahmood R, Bresnick J, Hornbruch A et al. (1995) A role for
FGF-8 in the initiation and maintenance of vertebrate limb
bud outgrowth. Current Biology 5: 797–806.
Manouvrier-Hanu S, Holder-Espinasse M and Lyonnet S (1999)
Genetics of limb anomalies in humans. Trends in Genetics 15:
409–417.
Masuya H, Sagai T, Moriwaki K and Shiroishi T (1997) Multigenic
control of the localization of the zone of polarizing activity in
limb morphogenesis in the mouse. Developmental Biology 182:
42–51.
Mercader N, Leonardo E, Piedra ME et al. (2000) Opposing RA
and FGF signals control proximodistal vertebrate limb development through regulation of Meis genes. Development 127:
3961–3970.
Min H, Danilenko DM, Scully SA et al. (1998) Fgf-10 is required
for both limb and lung development and exhibits striking
functional similarity to Drosophila branchless. Genes and
Development 12: 3156–3161.
Moon AM, Boulet AM and Capecchi MR (2000) Normal limb
development in conditional mutants of Fgf4. Development
127: 989–996.
Niswander L and Martin GR (1992) Fgf-4 expression during
gastrulation, myogenesis, limb and tooth development in
the mouse. Development 114: 755–768.
Niswander L and Martin GR (1993) FGF-4 and BMP-2 have
opposite effects on limb growth. Nature 361: 68–71.
Niswander L, Tickle C, Vogel A, Booth I and Martin GR (1993)
FGF-4 replaces the apical ectodermal ridge and directs outgrowth and patterning of the limb. Cell 75: 579–587.
Niswander L, Jeffrey S, Martin GR and Tickle C (1994) A
positive feedback loop coordinates growth and patterning
in the vertebrate limb. Nature 371: 609–612.
Ohuchi H, Nakagawa T, Yamamoto A et al. (1997) The mesenchymal factor, FGF10, initiates and maintains the outgrowth
of the chick limb bud through interaction with FGF8, an
apical ectodermal factor. Development 124: 2235–2244.
Parr BA and McMahon AP (1995) Dorsalizing signal Wnt-7a
required for normal polarity of D-V and A-P axes of mouse
limb. Nature 374: 350–353.
Riddle RD, Johnson RL, Laufer E and Tabin C (1993) Sonic
hedgehog mediates the polarizing activity of the ZPA. Cell
75: 1401–1416.
Riddle RD, Ensini M, Nelson C et al. (1995) Induction of the LIM
homeobox gene Lmx1 by WNT7a establishes dorsoventral
pattern in the vertebrate limb. Cell 83: 631–640.
Rijli FM and Chambon P (1997) Genetic interactions of Hox
genes in limb development: learning from compound
mutants. Current Opinion in Genetics and Development 7:
481–487.
Saunders JWJ (1948) The proximo-distal sequence of origin of
the parts of the chick wing and the role of the ectoderm.
Journal of Experimental Zoology 108: 363–403.
Saunders JWJ and Gasseling MT (1968) Ectoderm–mesenchymal
interaction in the origins of wing symmetry. In: Fleischmajer R
and Billingham RE (eds) Epithelial–Mesenchymal Interactions,
pp. 78–97. Baltimore, MD: Williams & Wilkins.
Sekine K, Ohuchi H, Fujiwara M et al. (1999) Fgf10 is essential
for limb and lung formation. Nature Genetics 21: 138–141.
Shubin N, Tabin C and Carroll S (1997) Fossils, genes and the
evolution of animal limbs. Nature 388: 639–648.
Storm EE and Kingsley DM (1996) Joint patterning defects
caused by single and double mutations in members of the
bone morphogenetic protein (BMP) family. Development 122:
3969–3979.
Storm EE, Huynh TV, Copeland NG et al. (1994) Limb alterations in brachypodism mice due to mutations in a new
member of the TGF beta-superfamily. Nature 368: 639–643.
Summerbell D (1974a) Interaction between the proximo-distal
and antero-posterior co-ordinates of positional value during
the specification of positional information in the early development of the chick limb-bud. Journal of Embryology and
Experimental Morphology 32: 227–237.
Summerbell D (1974b) A quantitative analysis of the effect of
excision of the AER from the chick limb-bud. Journal of
Embryology and Experimental Morphology 32: 651–660.
Sun X, Lewandoski M, Meyers EN et al. (2000) Conditional
inactivation of Fgf4 reveals complexity of signalling during
limb bud development. Nature Genetics 25: 83–86.
Thomas JT, Lin K, Nandedkar M et al. (1996) A human chondrodysplasia due to a mutation in a TGF-beta superfamily member. Nature Genetics 12: 315–317.
Tickle C, Shellswell G, Crawley A and Wolpert L (1976) Positional signalling by mouse limb polarising region in the chick
wing bud. Nature 259: 396–397.
Vogel A, Rodriguez C, Warnken W and Izpisua Belmonte JC
(1995) Dorsal cell fate specified by chick Lmx1 during vertebrate limb development. Nature 378: 716–720.
Vogel A, Rodriguez C and Izpisua-Belmonte JC (1996) Involvement of FGF-8 in initiation, outgrowth and patterning of the
vertebrate limb. Development 122: 1737–1750.
Vollrath D, Jaramillo-Babb VL, Clough MV et al. (1998) Loss-of-function mutations in the LIM-homeodomain gene, LMX1B,
in nail–patella syndrome. Human Molecular Genetics 7:
1091–1098.
Vortkamp A, Gessler M and Grzeschik KH (1991) GLI3 zinc-finger gene interrupted by translocations in Greig syndrome
families. Nature 352: 539–540.
Vortkamp A, Franz T, Gessler M and Grzeschik KH (1992) Deletion of GLI3 supports the homology of the human Greig
cephalopolysyndactyly syndrome (GCPS) and the mouse
mutant extra toes (Xt). Mammalian Genome 3: 461–463.
Vortkamp A, Lee K, Lanske B et al. (1996) Regulation of rate of
cartilage differentiation by Indian hedgehog and PTH-related
protein. Science 273: 613–622.
Wang B, Fallon JF and Beachy PA (2000) Hedgehog-regulated
processing of Gli3 produces an anterior/posterior repressor gradient in the developing vertebrate limb. Cell 100:
423–434.
Yang Y and Niswander L (1995) Interaction between the signaling molecules WNT7a and SHH during vertebrate limb
development: dorsal signals regulate anteroposterior patterning. Cell 80: 939–947.
Yang Y, Drossopoulou G, Chuang PT et al. (1997) Relationship
between dose, distance and time in Sonic Hedgehog-mediated regulation of anteroposterior polarity in the
chick limb. Development 124: 4393–4404.
Yonei-Tamura S, Endo T, Yajima H et al. (1999) FGF7 and FGF10
directly induce the apical ectodermal ridge in chick embryos.
Developmental Biology 211: 133–143.
Zakany J and Duboule D (1999) Hox genes in digit development
and evolution. Cell and Tissue Research 296: 19–25.
Zou H and Niswander L (1996) Requirement for BMP signaling
in interdigital apoptosis and scale formation. Science 272:
738–741.
LINE
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0764
Linkage
F W Stahl
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0766
Detection of Linkage
When two loci on the same chromosome are far apart,
they may fail to generate meiotic products whose
frequencies differ significantly from the expectations
of nonlinkage. Such a failure to demonstrate linkage of
two loci may be overcome with the demonstration of
their common linkage to a locus that lies between
them on the chromosome.
Much of genetics involves determining the location
in the genome of a newly identified gene (mapping the
gene). The first step in such mapping is determining on
which chromosome the locus is situated. Crosses of
the new mutant by strains that carry mutant alleles at
loci on each of the chromosomes may detect linkage to
one of those loci. Since only a finite number of haploid
meiotic products (or of meiotic tetrads) can be examined, the $\chi^2$ test is standardly employed to
determine when deviations from random assortment
should be taken seriously.
When tetrad data are available, an excess of parental
ditype meiotic tetrads over nonparental ditype tetrads
sensitively indicates linkage.
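The two tests described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function names are hypothetical, the counts in the usage note are invented, and the 5% critical value for one degree of freedom (3.841) is a conventional choice that the entry does not specify. The same two-class calculation serves for parental versus recombinant haploid products and for parental ditype versus nonparental ditype tetrads, since each pair of classes is expected in a 1:1 ratio in the absence of linkage.

```python
# Conventional 5% critical value of chi-square with 1 degree of freedom
# (an assumption of this sketch; the entry does not fix a significance level).
CHI2_CRITICAL_1DF_5PCT = 3.841


def chi_square_two_classes(observed_a: int, observed_b: int) -> float:
    """Chi-square statistic for two classes expected in a 1:1 ratio."""
    total = observed_a + observed_b
    if total == 0:
        raise ValueError("no observations")
    expected = total / 2
    return ((observed_a - expected) ** 2 + (observed_b - expected) ** 2) / expected


def suggests_linkage(parental_like: int, recombinant_like: int) -> bool:
    """True when the parental-type class is in significant excess.

    Use (parental, recombinant) counts for haploid products, or
    (parental ditype, nonparental ditype) counts for tetrads.
    """
    in_excess = parental_like > recombinant_like
    statistic = chi_square_two_classes(parental_like, recombinant_like)
    return in_excess and statistic > CHI2_CRITICAL_1DF_5PCT
```

For example, `suggests_linkage(60, 40)` evaluates a statistic of 4.0 and reports linkage at the 5% level, whereas `suggests_linkage(50, 50)` does not.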
Linkage to a Centromere
Since homologous centromeres segregate in the
first division of meiosis, relatively strong linkage of a
locus to a centromere is indicated by an excess of
first division over second division segregations;
weaker linkage is implied as long as the frequency of
second division segregation is less than 2/3, the frequency expected for random assortment of a locus
from its centromere. In the presence of positive
chiasma interference, second division segregation
may exceed 2/3, indicating that most of the tetrads
have a single exchange between that locus and its
centromere.
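As a rough illustration of the criterion above, the following Python sketch computes the frequency of second division segregation from tetrad counts and compares it with the 2/3 expected for random assortment. The function names and the counts are hypothetical. The distance estimate uses the standard approximation, not stated in this entry, that the locus-to-centromere map distance is half the second division segregation percentage; it holds only when multiple exchanges are rare.

```python
RANDOM_ASSORTMENT_SDS = 2 / 3  # expected second division frequency without linkage


def second_division_frequency(first_division: int, second_division: int) -> float:
    """Fraction of tetrads showing second division segregation."""
    total = first_division + second_division
    if total == 0:
        raise ValueError("no tetrads scored")
    return second_division / total


def centromere_linked(first_division: int, second_division: int) -> bool:
    """True when second division segregation falls below the 2/3 expectation.

    A significance test on the counts would be needed in practice.
    """
    return second_division_frequency(first_division, second_division) < RANDOM_ASSORTMENT_SDS


def centromere_distance_cm(first_division: int, second_division: int) -> float:
    """Approximate locus-to-centromere distance in centimorgans.

    Assumes at most one exchange per tetrad in the interval, so that only
    half the chromatids of a second division tetrad are recombinant.
    """
    return 50.0 * second_division_frequency(first_division, second_division)
```

With 80 first division and 20 second division tetrads, the second division frequency is 0.2 and the estimated distance is about 10 cM.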
Linkage Maps
Maps that reflect the degree of linkage between loci
can be constructed from observed recombination frequencies (see Centimorgan (cM)). The distances on
these linkage maps will accurately reflect physical
distances on the chromosome only if exchange frequencies are constant along the chromosome.
Linkage in Prokaryotes
In bacteria with a single chromosome, loci that are
close enough together on the chromosome to be transmitted together in phage-mediated transduction may
be referred to as `linked.' Similarly, when transformation is conducted with chromosomal DNA, markers
are `linked' if they are cotransformed as a result of
sometimes being on the same fragment created by
artifactual breakage of the chromosome. Unlinked
markers are transduced or transformed into the same
recipient cell at a frequency about equal to the product
of the transduction or transformation frequencies of
the individual markers.
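The product rule for unlinked markers can be written out as a short Python sketch. The function names and the frequencies in the usage note are illustrative assumptions, not data from any experiment; the point is only that an observed joint frequency well above the product of the individual frequencies suggests the markers travel on the same DNA fragment.

```python
def expected_cotransfer_if_unlinked(freq_a: float, freq_b: float) -> float:
    """Joint transduction or transformation frequency expected for unlinked markers."""
    return freq_a * freq_b


def cotransfer_excess(observed_joint: float, freq_a: float, freq_b: float) -> float:
    """Ratio of the observed joint frequency to the unlinked expectation.

    Values near 1 are consistent with unlinked markers; values much greater
    than 1 suggest linkage in the sense used for prokaryotes.
    """
    expected = expected_cotransfer_if_unlinked(freq_a, freq_b)
    if expected == 0:
        raise ValueError("individual frequencies must be nonzero")
    return observed_joint / expected
```

If two markers are each transferred at a frequency of 0.001 and 0.002, the unlinked expectation is 0.000002; an observed joint frequency of 0.0005 would exceed that expectation by a factor of about 250.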
In crosses with bacteriophages, as standardly conducted, recombination frequencies less than 50% cannot be taken as evidence of linkage because a fraction
of the progeny phage particles has lacked the opportunity to assort its genes. Linkage is implied by a pair
of loci that gives a significantly lower recombination
frequency than do the loci with the largest observed
values.
See also: Centimorgan (cM); Genetic
Recombination; Mapping Function; Tetrad
Analysis
Linkage Disequilibrium
N E Morton
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0767
but inaccurate rule of thumb is that $\theta = 0.01$ corresponds to about 1 megabase (Mb). By this approximation, $\theta = 10^{-5}$ corresponds to 1 kb. If $\theta$ is as small as
$10^{-6}$, the time since apes and hominids diverged is not
long enough to go halfway to equilibrium. Therefore
selection is not required to explain persistence of disequilibrium, which depends to a considerable extent
on episodes of population contraction. There have
been two major bottlenecks in human evolution. The
first was when two chromosomes that are nonhomologous in apes fused to form the chromosome 2 inherited by our species. The second bottleneck was when
we migrated out of Africa in the last 100 000 years. As
a consequence, LD is least for sub-Saharan Africa.
Lesser bottlenecks have occurred in the history of
particular populations.
LD may be measured in many ways. Some are confounded with significance tests, and therefore with
sample size. All are to some degree confounded with
allele frequencies. The most reliable and best validated
is the association probability $\rho_t$, which is made up of
two parts. Association that has diminished from an
initial value $\rho_0$ in founders is $\rho_{rt} = \rho_0 e^{-(1/2N + \theta)t}$,
where $N$ is the effective size over $t$ generations. Association that has built up by genetic drift since the
founders is $\rho_{ct} = L(1 - e^{-(1/2N + \theta)t})$, and $\rho_t = \rho_{rt} + \rho_{ct}$.
If $N$ is constant, the equilibrium value as $t \to \infty$ is
$L \approx 1/(1 + 2N\theta)$ if $\theta$ is small and $1/(1 + 2N)$ if $\theta = 0.5$.
The latter is negligible in real populations. If $1/2N$ is
small compared to $\theta$, $\rho_d$ follows the Malecot model for
isolation by distance, equating $t\theta$ to $\epsilon d$, where $d$ is
distance between loci. On the genetic scale d measures
recombination directly, with relatively larger sampling error over small distances. On the physical
scale d is only indirectly related to recombination,
but is more accurate if sequence-based. Choice should
be based on goodness of fit to the best available maps.
Analyzed as isolation by distance, LD provides a way
to compare allelic association for chromosome regions
in different populations, and therefore to detect variations in recombination, selective sweeps that reduced
haplotype diversity, and effects of population history
and structure. This information determines the optimal populations and density of markers for positional
cloning of genes affecting normal physiology and disease. Localization is more precise by LD than by
linkage. An alternative for multilocus haplotypes is
cladistic analysis when its assumptions to reduce the
number of independent variables are valid and the
causal region has been made small by LD or other
evidence.
See also: Bottleneck Effect; Genetic Drift
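The two-part association probability described in this entry can be evaluated numerically with a short Python sketch. The function name and parameter names are hypothetical, and the equilibrium value L is passed in as a parameter rather than derived, so the sketch makes no assumption about its exact form. The numbers in the usage note are arbitrary.

```python
import math


def association_probability(rho0: float, equilibrium: float,
                            effective_size: float, theta: float,
                            generations: float) -> float:
    """Association probability after a given number of generations.

    rho0           initial association in founders
    equilibrium    the value L approached as generations grow large
    effective_size effective population size N
    theta          recombination fraction between the two loci
    """
    decay = math.exp(-(1.0 / (2.0 * effective_size) + theta) * generations)
    remaining = rho0 * decay                 # association surviving from founders
    built_up = equilibrium * (1.0 - decay)   # association generated by drift
    return remaining + built_up
```

At zero generations the function returns the founder value, and for a very large number of generations it approaches the supplied equilibrium value; for instance, `association_probability(1.0, 0.1, 10000, 0.01, 0)` gives 1.0.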
Linkage Group
M A Cleary
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0768
Linkage among genes or genetic markers is determined by the frequency with which they are inherited
together. Genes that are frequently inherited together
tend to lie near each other on the same chromosome.
The frequency with which genes or markers are inherited together is measured by the percentage of recombination that occurs between them. A linkage group
is defined by all of the genes and markers for which
linkage has been established. An entire chromosome is
considered to be a linkage group.
See also: Crossing-Over; Independent
Assortment; Linkage Map
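Because a linkage group gathers every marker connected to another by established linkage, two markers that show no direct linkage to each other still fall in one group when both are linked to an intermediate marker. The following Python sketch illustrates that bookkeeping with a union-find structure. The function name, the marker names, and the list of linked pairs are hypothetical; deciding which pairs count as linked would rest on a statistical test of the recombination data.

```python
from typing import Iterable, List, Sequence, Tuple


def linkage_groups(markers: Sequence[str],
                   linked_pairs: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Group markers so that any chain of pairwise linkage joins them."""
    parent = {marker: marker for marker in markers}

    def find(marker: str) -> str:
        # Follow parent links to the group representative, shortening the path.
        while parent[marker] != marker:
            parent[marker] = parent[parent[marker]]
            marker = parent[marker]
        return marker

    for first, second in linked_pairs:
        parent[find(first)] = find(second)  # merge the two groups

    groups = {}
    for marker in markers:
        groups.setdefault(find(marker), []).append(marker)
    return sorted(sorted(members) for members in groups.values())
```

With markers `a`, `b`, `c`, and `d`, and linkage established only for the pairs (`a`, `b`) and (`b`, `c`), the function places `a`, `b`, and `c` in one group and leaves `d` on its own.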
Linkage Map
M F Seldin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0769
viewed as reflecting a biologic process in which individual variation may be influenced by a large number
of factors. For example, the recombination frequency
between homologous chromosomes can be substantially different in oogenesis than in spermatogenesis.
Although, in general, there is more frequent recombination in female meioses, different regions of the genome show different relationships and there are even
chromosomal segments in which the recombination
frequency is greater in male meioses. Other studies
have indicated that recombination frequency is itself
influenced by genetic factors; meiotic recombination
frequency may differ in crosses between different
strains of mice and to some extent in different human
populations. These factors must be considered when
utilizing information from mammalian linkage maps.
Other linkage maps contain large numbers of genes
and enough microsatellite markers to allow reasonable
integration with other markers not included in these
maps. The chromosomal positions of these markers
and genes can be used to map specific traits in a manner similar to that employed in human genome screening. The cross-specific linkage maps and their derived
composite maps (discussed above) have thus allowed a
large number of traits to be placed in specific intervals
and have facilitated positional cloning projects.
Linker DNA
I Schildkraut
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0770
Linker DNA is a short self-complementary palindromic DNA molecule which forms a blunt-ended duplex
containing a recognition sequence for a restriction
endonuclease. The linker DNA is generally blunt-end ligated between two blunt-ended DNA fragments
to introduce a restriction site.
See also: Restriction Endonuclease
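The defining properties of a linker can be checked with a brief Python sketch: a self-complementary sequence equals its own reverse complement, so two copies of the oligonucleotide anneal into a blunt-ended duplex. The function names are hypothetical and the 8-base sequence in the usage note is an illustrative example built around the EcoRI recognition sequence GAATTC.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


def is_self_complementary(sequence: str) -> bool:
    """True when the sequence is identical to its reverse complement."""
    return sequence.upper() == reverse_complement(sequence)


def contains_site(linker: str, recognition_site: str) -> bool:
    """True when the linker carries the given recognition sequence."""
    return recognition_site.upper() in linker.upper()
```

For the example sequence `GGAATTCC`, `is_self_complementary` returns `True` and `contains_site("GGAATTCC", "GAATTC")` returns `True`; changing the last base to give `GGAATTCA` breaks the self-complementarity.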
Uterine Leiomyoma
Pathology of Leiomyomas of the Uterus
leiomyomas (hormone replacement therapy) is associated with major side effects, more and more women
are directly seeking some form of surgery to remove
their fibroids. This has led to a situation in which in
the United States alone, uterine leiomyomas are the
leading indication for about 300 000 hysterectomies
performed annually.
Figure 1 Cytogenetics of uterine leiomyomas. Schematic representation of the different cytogenetic subgroups of
uterine leiomyomas: about 70% have a normal karyotype and about 30% an aberrant karyotype. Subgroups shown:
random changes; 12q14–q15 and/or 14q23–q24 involvement, mainly t(12;14)(q14–q15;q23–q24); 7q deletions, most
commonly involving q21–q22; trisomy 12; 6p21-pter involvement; t(1;2)(p36;p24).
review, see Jansen et al., 1999) consists of five exons
and spans about 175 kb (Figure 2A). The gene contains one large intron, i.e., intron 3, which spans about
140 kb. The HMGIC protein has three DNA-binding
domains (a 9 basic amino acid DNA-binding motif,
also referred to as the AT-hook) and an acidic
C-terminal domain. The location of the breakpoints
with respect to HMGIC is variable and has been
found 5′ to the gene, in its 3′ nontranslated region, as
well as in one of its introns. The intragenic breakpoints frequently occur in the large third intron. In
such cases, the three DNA-binding domains in the
N-terminal region of the protein become separated
from the acidic, C-terminal domain. Furthermore,
it is of interest to note that another member of the
HMG protein gene family, i.e., the HMGI(Y) gene
(Figure 2B), which maps at chromosome 6p21, is also
implicated in uterine leiomyoma.
HMG proteins (Bustin et al., 1990) are named after
their fast electrophoretic migration at acidic pH, and
were first discovered in the 1960s as contaminants in
calf thymus histone H1 preparations. They are operationally defined as small (mol.wt < 30 kDa) and
abundant, 2% TCA/25% perchloric acid-soluble,
nonhistone proteins, extractable from chromatin
with 0.35 M NaCl and having a high content of acidic
[Figure 2 diagram. (A) HMGIC gene map with intron sizes, and HMGIC protein map showing the three AT-hook DNA-binding domains (BD1, BD2, BD3) and the acidic domain. (B) Amino acid sequence alignment of HMGIC with HMGI and HMGY. (C) RAD51L1 mRNA splice variants with alternative terminal exons 11a, 11b, and 11c.]
Figure 2 Structure of the human HMGIC gene and its protein product and the human RAD51L1 gene products.
(A) On the HMGIC gene map, the exons are depicted as boxes, with the 5′- and 3′-untranslated regions represented
as shaded areas. The numbers below the map indicate the intron sizes in kilobase pairs. The dashed lines indicate
which regions of the HMGIC protein are encoded by the individual exons, and the amino acid numbering above
the protein map marks the boundaries of the various DNA-binding and acidic domains. (B) HMGIC amino acid
sequence aligned with HMGI and HMGY. (C) Schematic representation of the three alternative RAD51L1 mRNA
splice variants (exons are numbered). The relative positions of two highly conserved nucleotide-binding Walker
domains are marked by asterisks, and the numbers of amino acids encoded by the three alternative terminal coding
exons are indicated. Arrows mark the positions of chromosome breakpoints found in the RAD51L1 gene in various
uterine leiomyomas.
be kept in mind that this preference for certain AT-rich stretches has been shown to be caused by recognition of substrate structure rather than nucleotide
sequence.
As far as expression patterns of HMGI(Y) and
HMGIC are concerned, there seems to be a link to
cell proliferation. Expression of the HMGIC gene is
tightly linked to growth, since it is mainly expressed
during early development and in growing cells.
Furthermore, it responds to serum induction as a
delayed early response gene (Ayoubi et al., 1999).
Finally, homozygous disruption of the Hmgic gene
in mice leads to the pygmy phenotype (Zhou et al.,
1995). The observation that the HMGI proteins are
developmentally regulated and constitute abundant
proteins might indicate that they could be involved
in the regulation of many genes, some possibly
involved in cell growth. The fact that HMGI(Y) is
known to cause a more general regulatory effect on
transcription through modification of chromatin
structure by inducing DNA bends, thereby facilitating the assembly of transcriptionally active nucleoprotein complexes (Grosschedl et al., 1994), has
resulted in the definition of so-called `architectural
transcription factors' (Wolffe, 1994; Lovell-Badge,
1995), of which HMGI(Y) is the founding member.
Indeed, studies on the role of HMGI(Y) in the induction of IFN-β gene expression (Falvo et al., 1995) point
toward a more architectural role for HMGI(Y), and
have resulted in the model that the HMGI proteins as
a group, just like other architectural transcription factors, might function as `facilitators' of gene expression
(Figure 3). The intriguing question remains as to how
particular genetic changes in such facilitators result in
aberrant cell proliferation of a benign nature. Identification of the spectrum of their target genes is an
important objective for future research of the
HMGIC proteins.
[Figure 3 diagram. Labels shown in the upper part: GF, RAS, GAP, PI-3, PKB, MAPKKK, MAPKK, MAPK, TF, immediate early genes, HMGIC, target genes, proliferation, differentiation, basal transcription machinery, pol II. Labels shown in the lower part: LPP, focal adhesions, zyxin, tensin, VASP, vinculin, talin, FAK, CRP, actin, integrins, extracellular matrix.]
Figure 3 Signal transduction model for HMGIC and LPP. Schematic representation of the delayed, early response
activation of the HMGIC gene via the growth factor (GF) receptor (R)-mediated activation of MAP kinases (MAPK).
The HMGIC protein will bind to regulatory regions of its respective target genes. The bottom part of this figure
shows the localization of wild-type LPP in focal adhesions of which other structural components are also shown.
Lipoma
Pathology of Lipomas
Lipomas are benign neoplasms of adipose tissue. Histologically, they belong to the group of lipomatous
tumors that are classified as soft tissue tumors
(Enzinger and Weiss, 1995). Several types of benign
lipomatous tumors can be distinguished such as ordinary benign lipoma, angiolipoma, fibrolipoma, hibernoma, lipoblastoma, spindle cell/pleomorphic lipoma,
and atypical lipomatous tumors. Lipomas are one of
the most common soft tissue tumors and form part of
the daily practice of many surgical pathologists. With
rare exceptions they may occur at any age and at
almost any anatomical location. However, in general,
most of the lipomas become apparent between the
fourth and sixth decade and most of these are found
in the subcutaneous tissues of the upper back, neck,
shoulder, and abdomen, followed in frequency by the
Cytogenetics of Lipomas
usually due to translocations, inversions, or insertions.
In addition, rearrangements of chromosome region
1p36 have been found as well as supernumerary ring
chromosomes and complex karyotypes.
Apart from the fact that normal karyotypes are
more common in patients younger than 30 years old,
there appears to be no significant association between
the cytogenetic pattern and patient sex, age, or tumor
localization, size, or depth. Therefore, to date, the
pathogenetic basis and clinicopathological relevance
of the cytogenetic subtypes among lipomas remain
unexplained (Willen et al., 1998).
[Figure 4 diagram. Lipomas (solitary): <40% normal karyotype, >60% aberrant karyotype. The aberrant group is divided (2/3 and 1/3) according to 12q13q15 involvement or no 12q13q15 involvement. Translocation partners shown for 12q13q15 (1/4 and 3/4): 3q27q28, and random translocation or insertion partners (1p36, 1p34p32, 2p24p21, 2q35, 5q33, 11q13, 13q12q14, 21q21q22). Other labels: rearrangements of 13q, 6p23p21, and 1p36; supernumerary ring chromosomes; complex karyotypes. Inset partial karyotype: der 12, 3, 12, der 3. Lipomas (multiple): 98% normal karyotype.]
Figure 4 Cytogenetics of lipomas. Schematic representation of the different cytogenetic subgroups of lipomas. The
inserted picture represents the partial karyotype from a lipoma showing a t(3;12)(q27q28;q13q15). Arrowheads
indicate breakpoints.
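[Figure 5 diagram. Wild-type HMGIC (DBD1, DBD2, DBD3, AD) and wild-type LPP (proline-rich domain, LIM 1, LIM 2, LIM 3), with the t(3;12)(q27q28;q15) breakpoints marked. Upper part: HMGIC/LPP fusion proteins, each with DBD1 to DBD3 followed by LIM domains of LPP (LIM 1 to LIM 3 in one variant, LIM 2 and LIM 3 in the other). Lower part: LPP/HMGIC fusion proteins, with the proline-rich domain of LPP, with or without LIM 1, followed by the AD of HMGIC.]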
Figure 5 Schematic representation of wild-type HMGIC and LPP proteins and related fusion proteins predicted to
be expressed in lipomas. The wild-type LPP protein is predicted to consist of a proline-rich N-terminal domain and
three LIM domains in its C-terminal region. HMGIC consists of three N-terminal DNA-binding domains and an acidic
C-terminal tail domain. Hybrid transcripts encoding the two variants of HMGIC/LPP fusion proteins (upper part) and
the reciprocal LPP/HMGIC fusion protein (lower part) were detected in RT-PCR analysis of primary lipomas and
lipoma cell lines. DBD, DNA-binding domain; AD, acidic domain; LIM, LIM domain.
References
Liposarcoma
See: Myxoid Liposarcoma and FUS/TLS-CHOP
Fusion Genes
Little, Clarence
L Silver
LMO Family of LIM-Only Genes
Table 1  Chromosomal location

Gene    Human       Mouse    Human translocation
LMO1    11p15       7        t(11;14)(p15;q11)
LMO2    11p13       2        t(11;14)(p13;q11)
LMO3    12p12-13    6        nd
LMO4    1p22.3      3        nd
[Figure 1 diagram. (A) Erythroid complex bound to an E-box (CANNTG) and a GATA site. (B) Aberrant T cell complex: Ldb1, LMO2, and bHLH proteins bound to two E-boxes (CANNTG CANNTG).]
Figure 1 Lmo2 participates in DNA-binding complexes. (A) Erythroid Lmo2-containing complex. The Lmo2
protein interacts with Tal1 and with GATA1 in a complex comprising a Tal1-E47 dimer, binding an E-box (CANNTG),
and a GATA1 molecule, binding a GATA site, as part of an erythroid complex, which presumably regulates target
genes. (B) T cell Lmo2-containing aberrant complex. An analogous DNA-binding complex comprises bHLH
heterodimers linked by Lmo2 and Ldb1 proteins, binding to dual E-box sites.
The LIM domains of LMO1 and LMO2 can bind
various proteins, such as GATA1, GATA2, and
Ldb1/Nli1 protein. This array of interactions led to
the observation that Lmo2 can be found in an oligomeric complex in erythroid cells which involves Tal1,
E47, Ldb1, and Gata-1. This complex is able to bind
DNA through the GATA and bHLH components
thereby recognizing a unique bipartite DNA sequence
comprising an E-box separated by one helix turn from
a GATA site, with Lmo2 and Ldb1 proteins seeming to bridge the bipartite DNA-binding complex
(Figure 1A). Different Lmo2-containing complexes
may exist in different hematopoietic cell types,
which may differ in the types of protein factors
expressed and may control distinct sets of target genes.
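The bipartite recognition site described above can be illustrated with a short search routine. The following Python sketch is only an illustration: the function name `find_bipartite_sites` and the allowed gap of 8 to 12 bp between the E-box and the GATA motif are assumptions standing in for "one helix turn", and the search covers a single strand with the E-box upstream of the GATA site.

```python
import re

def find_bipartite_sites(seq, min_gap=8, max_gap=12):
    """Find an E-box (CANNTG) followed, after a short gap, by a GATA motif.

    The gap limits are illustrative assumptions approximating one helix
    turn; they are not taken from the text.
    """
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(
        r"(?=(CA[ACGT]{2}TG[ACGT]{%d,%d}GATA))" % (min_gap, max_gap)
    )
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]

# Example: an E-box, a 10 bp gap, and a GATA motif inside flanking sequence.
example = "TT" + "CAGCTG" + "ACGTACGTAC" + "GATA" + "CC"
print(find_bipartite_sites(example))
```

A fuller treatment would also scan the reverse strand and the opposite motif order; these are omitted to keep the sketch short.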
Protein-protein interactions are crucial control
points for normal cells and alterations in these are
important components in tumorigenesis after chromosomal translocations have taken place. Gain-of-function transgenic mouse models of LMO gene expression
induce clonal T cell leukaemia with a long latency,
indicating that the transgenes are necessary but not
sufficient to cause tumours. These mice show an accumulation of immature CD4−, CD8−, CD25+, CD44+
T cells in transgenic thymuses compared to nontransgenic littermates. Thus the role of Lmo2 in T-ALL is
to cause an inhibition in T cell differentiation. T-ALL
cells contain a Lmo2 complex which, like its analog in
erythroid cells, binds to a bipartite DNA recognition
site. Analysis of the components of this complex
showed that E47-Tal1 bHLH heterodimeric elements
were present as well as Lmo2 and the Ldb1 proteins
(Figure 1B). A possible role for the E-box–E-box
binding T cell complex is the regulation of specific
sets of target genes which, based on the difference in
DNA-binding site, would differ from those putative
genes controlled by the Lmo2-multimeric complex in
hematopoietic cells.
Further Reading
Locus
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0772
LOD Score
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1898
Long-Period Interspersion
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1899
Loss of Heterozygosity
(LOH)
P Rabbitts
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1590
epithelial origin, are characterized by highly aneuploid karyotypes with deletions as a common, frequently tumor-specific feature. If a patient's normal
and tumor DNA are compared at a locus known to be
heterozygous in that patient's normal DNA, it is possible to determine whether the tumor DNA has suffered genetic loss (deletion) encompassing that locus.
If it has, only one of the two alleles will be detectable,
and the locus will appear to be homozygous in the
tumor and will show loss of heterozygosity (LOH).
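The comparison just described can be written as a small decision rule. The Python sketch below is illustrative only: the function name `classify_locus`, the allele labels, and the three result strings are hypothetical, and real assays must also allow for normal-cell contamination of the tumor sample, which this sketch ignores.

```python
def classify_locus(normal_alleles, tumor_alleles):
    """Classify one locus from the alleles detected in normal and tumor DNA.

    Returns "uninformative" when the normal DNA is not heterozygous,
    "LOH" when only one of the two normal alleles is detectable in the
    tumor, "retained" when both are detectable, and "unclear" otherwise.
    """
    normal = set(normal_alleles)
    tumor = set(tumor_alleles)
    if len(normal) != 2:
        # LOH can only be scored at loci heterozygous in normal DNA.
        return "uninformative"
    shared = normal & tumor
    if len(shared) == 1:
        return "LOH"
    if len(shared) == 2:
        return "retained"
    return "unclear"

print(classify_locus(["A", "B"], ["A"]))       # one allele lost in the tumor
print(classify_locus(["A", "B"], ["A", "B"]))  # heterozygosity retained
print(classify_locus(["A", "A"], ["A"]))       # homozygous in normal DNA
```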
during treatment and can indicate relapse before obvious clinical symptoms appear.
References
Lotus japonicus
J Stougaard
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1665
can thus be divided into a series of genetically separable steps. For further studies the following genome
resources are being developed: a general genetic map
and bacterial artificial chromosome (BAC) libraries
for positional cloning of untagged mutants; recombinant inbred lines; and inventories of expressed
sequence tags (ESTs) sampling the gene expression
profiles from several tissues and growth conditions
(www.Viazusa.or.jp/een/index.html).
Sequencing of the L. japonicus genome has been
initiated. The sequences of the bacterial genes required
for nodulation and nitrogen fixation located on the
pSym plasmid of NGR234, and the complete genome
of Mesorhizobium loti, are available, together with a
wide selection of rhizobial mutants.
Like soybean, L. japonicus develops the determinate type of nodule. In contrast to, for example, pea
nodules with a persistent meristem, the meristematic
activity ceases early in determinate nodules developing on L. japonicus. After the initial phase of meristematic cell proliferation, determinate nodules grow
by expansion, giving a typical spherical shape. All
developmental stages from root hair curling to nodule
senescence are consequently phased in time. Root
nodule development is a rare example of induced and
dispensable organ formation in plants. Nodulation
mutants can be rescued on nitrogen-containing nutrient solution, and developmental control genes that
would compromise plant development and completion of the life cycle in other organogenic processes
could thus be identified from nodulation mutants.
See www.mbio.aau.dk/nchp/table1.html for a list of
literature on L. japonicus.
Further Reading
www.mbio.aau.dk/nchp/table1.html
Lung Cancer,
Chromosome Studies
P Rabbitts
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1591
Cytogenetic Analysis
Further Reading
Luria, Salvador
W C Summers
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0773
Luria–Delbrück
Experiment
W C Summers
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0774
Reference
Lycopersicon esculentum
(Tomato)
R Chetelat
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1671
cultivars, and derivatives of wild species such as alien
additions, substitutions, and introgression lines, are
maintained by the TGRC. Lastly, the collection also
includes over 1100 wild species accessions, representing nine Lycopersicon and four Solanum species, of
which all but two can be crossed to L. esculentum,
albeit with varying degrees of difficulty.
These wild populations contain a vast amount of
genetic diversity, in contrast to the cultigen which
is severely depleted, and are important sources of
enhanced disease resistance, yield, fruit quality, environmental stress tolerance, and other desiderata of
interest to breeders. Resistances to over 42 diseases
have been detected in the wild relatives, many of
which have been bred into the cultivated tomato;
cloning and sequencing of many of these resistance
genes has contributed to our understanding of the
molecular basis of plant-pathogen interactions. The
wild Lycopersicon species are also tolerant of abiotic
stresses encountered in their native habitats, which
include extreme aridity (e.g., Atacama desert), flooding and high humidity (e.g., equatorial jungle), saline
soils (e.g., coastal bluffs in Galapagos Islands), and
freezing or chilling temperatures at high elevations in
the Andes. Though bearing horticulturally unacceptable fruit, the wild species contain alleles that when
bred into cultivated tomato confer desired characteristics such as increased soluble solids, fruit color, size,
and yield.
Despite the complex genetic control of these fruit
traits, the application of molecular marker maps has
resolved quantitative trait loci (QTLs) for each of
them. In the case of fruit size, a single QTL (fw2.2)
accounts for a large portion of the difference between
wild and cultivated forms; the recent cloning of this
QTL (Frary et al., 2000) has contributed to our understanding of the molecular basis of plant domestication,
and has demonstrated that even genes for complex
traits such as yield can be isolated through the use of
molecular maps. Levels of diversity in Lycopersicon
species vary greatly, due in large part to differences
in mating systems, which include autogamy, facultative allogamy, and self-incompatibility of the gametophytic type; tomato is therefore a rich source of allelic
variation for evolutionary and molecular studies of
self-incompatibility, pollination biology, and many
other reproductive characters.
Information on tomato germplasm and many types
of genetic data are available through online databases.
The TGRC database (http://tgrc.ucdavis.edu) provides search tools, gene descriptions, and photos of
mutants and wild species from its collection. The
GRIN database (http://www.ars-grin.gov) allows
users to search the US Department of Agriculture's
entire National Plant Germplasm System, which
References
Lyon Hypothesis
See: X-Chromosome Inactivation
Lysenko, T.D./Lysenkoism
W C Summers
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0779
Figure 1
Lysenko
Lysenko was born in Karlovka, about 50 miles southwest of Kharkov, in Ukraine. His father was a small
farmer. Lysenko graduated from the Kiev Agricultural
Institute in 1925 and embarked upon a career of agronomical research, helped no doubt by the government
policy (vydvizhentsy) of that time to bring young
people of peasant and worker backgrounds into positions of leadership. He worked on practical breeding
problems, especially the control of the growing
periods of agricultural plants.
In 1925, immediately after his graduation, he went
to work at the newly established experimental station
at Kirovabad (in Azerbaijan) and he was entrusted
with work on breeding legumes for fodder and silage.
The need for such plants in that region did not correspond with the availability of reliable water from rains
or irrigation, so he attempted to find ways to alter the
growing seasons of legumes to produce fodder in the
autumn and winter or early spring, when sufficient
water was present. He sowed varieties of peas, vetch,
beans, and lentils in the fall and observed that some of
the peas and vetch survived the winter and produced a
crop early in the spring. From this research he concluded that ``By changing the external conditions it is
possible to change the behaviour of different plants of
the same variety'' (Lysenko, 1954, p. 18). In 1929 the
term `vernalization' was proposed for this plasticity
of plant varieties. This work was extended to cereals
and he claimed that spring-sown varieties could be
transformed into winter-sown forms by the proper
environmental manipulations. The results of this
work were reported first at the All-Union Genetics
Congress in Leningrad in January 1929.
Lysenko extended his experimental work to actual
field studies by inducing his father, Denis N. Lysenko,
to plant winter wheat in the spring. This crop was
apparently successful and Lysenko reported that:
In the same summer (1929) the Soviet public learned from
the press of the full and uniform earing of winter wheat
sown in the spring under practical farming conditions in
the Ukraine. (Lysenko, 1954, p. 23)
Heredity ``is inherent not only in the chromosomes, but in every particle of the living body''
(quoted in Huxley, 1949, p. 17).
By 1932, an agronomy journal, the Bulletin of Vernalization, began publication to report research in this
field, and by 1935 Lysenko was its editor, a position he
held until 1941. From the mid-1930s onward, Lysenko
became increasingly involved in spreading his beliefs
about agrobiology and vernalization in opposition
to what he saw as the erroneous theories based
on the work of Gregor Mendel, August Weismann,
and Thomas Hunt Morgan. His scientific work was
intimately interwoven with political issues in the
Soviet Union, and he was eventually relieved of most
of his leadership roles by 1965. In 1966 he was relegated to directorship of the Lenin Hills Agricultural
Experiment Station of the Academy of Sciences until
his death in 1976.
Lysenkoism
Lysenko's beliefs and theories were so at odds with the
rest of contemporary genetics, both inside (initially)
and outside the Soviet Union, that his doctrines came
to be known as Lysenkoism. He did not, however,
claim sole credit for his position; Lysenko cited a
rather obscure Russian horticulturist, plant breeder,
and patriot, Ivan V. Michurin (1855–1935), as his
inspiration and intellectual forerunner. Thus, he
usually presented his views as `Michurinist' and he
and his followers became known by that name.
Michurin worked with fruit trees and developed a
theory of `mentoring.'
By grafting twigs of old varieties of fruit trees on the
branches of a young variety, the latter acquires properties
which it lacks, these properties being transmitted to
it through the grafted twigs of the old varieties. (Lenin
Academy, 1949, pp. 38–39)
Further Reading
References
Lysine
E J Murgola
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0776
Figure 1 Lysine: H2N-CH(COOH)-CH2-CH2-CH2-CH2-NH2.
Lysis
E Kutter
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0780
Lysogeny
B S Guttman
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0781
Lysogeny is a condition in which a bacterial cell carries the genome of a virus in a relatively stable state.
Investigators of bacteriophage growth during the
1920s and 1930s were often puzzled by a strange
phenomenon: while some bacteria would produce
phage shortly after infection, other bacteria yielded
no phage and even appeared to be immune to infection
by the phage. However, in a culture of such resistant
bacteria, small amounts of phage appeared irregularly.
These puzzling bacteria were termed lysogenic because
it was supposed that some cells in a culture were
capable of lysing and producing the observed phage.
Max Delbrück refused to believe in the phenomenon
and ascribed the appearance of phage to sloppy technique, even though some of the investigators, notably
Eugene and Elisabeth Wollman, were known to be
scrupulous workers.
In 1950, Andre Lwoff and Antoinette Gutmann demonstrated the reality of lysogeny through painstaking
experiments with a strain of Bacillus megaterium.
They followed individual cells in microdrops of
broth by microscopic examination; each time a cell
divided, the daughter cells were separated into their
own drops by micromanipulation. Occasionally, a cell
would disappear from a drop, leaving behind phage
whose presence could be demonstrated by growth
on susceptible bacteria. In later experiments, Lwoff
demonstrated that when lysogenic cells are irradiated
with UV light, they lyse uniformly and liberate phage, a
phenomenon called phage induction. The hypothetical
intracellular state of the phage in a lysogenized bacterium was called a prophage. Mapping experiments by
Jacob and Elie Wollman (the son of the above Wollmans,
who were killed by the Nazis) then demonstrated
that the phage lambda prophage is located at a specific
site, near the genes for galactose metabolism. Dale
Kaiser provided strong evidence that the prophage
genome is integrated into the bacterial DNA so that
it is continuous with the bacterial DNA on either side.
Thus, when bacterial DNA replicates during each
round of reproduction, the prophage DNA is replicated as part of the whole genome. (The process of
lambda integration is discussed in an article of its own.)
The lysogenic state is maintained by a control system intrinsic to the phage. Phage lambda, which has
been most intensively studied, carries a single gene, cI,
that encodes a repressor protein. In a stable lysogenic
state, this protein binds to certain sites in the lambda
genome and represses transcription of all other lambda
genes. However, establishment of the lysogenic state is
a complex process involving the products of several
genes, binding to a series of regulatory sites. The heart
of the molecular decision between the lytic and lysogenic states involves a competition between the
repressor (cI) protein, which promotes lysogeny, and
the Cro protein, which promotes lytic growth. The
latter choice depends heavily on a complex process
of antitermination (see Antitermination Factors).
Furthermore, the decision involves proteins that measure the availability of energy, as signalled by the level
of cyclic AMP (cAMP) (see Cyclic AMP (cAMP)). A
cell with an adequate supply of glucose has a low level
of cAMP, and a phage entering such a cell is likely to
enter the lytic cycle; if the glucose level falls, the level of
cAMP rises, and a phage entering such a cell is more
likely to go lysogenic. In effect, the phage is determining whether the most prudent strategy for reproduction is a `short-term tactic' of using the available energy
for synthesis of a cellful of new phage or a `long-term
tactic' of producing more copies of its genome through
bacterial growth.
See also: Antitermination Factors; Cyclic AMP
(cAMP); Phage λ Integration and Excision
Lysozyme
E Kutter
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0782
Bacterial cells are generally protected from lysis induced by such factors as osmotic shock by having a cell
wall made of peptidoglycan, also called murein. The
entire peptidoglycan sack around each bacterial cell is in
fact one giant, covalently bonded bag-shaped molecule.
Growth of the cell requires that links of this sack be
opened up long enough to insert new links in between
them; penicillin leads to the death of growing bacterial
cells by interfering with the filling and resealing of
these small gaps in the cell's armor. Lysozymes are a particular class of enzymes that are able to attack this murein structure and thus generally effect the destruction
of the cell. In 1922, the Scottish physician Alexander
Fleming showed that saliva, tears, and sweat all contained a substance that could destroy bacteria. What he
was observing was in fact lysozyme, the first human
secretion shown to have chemotherapeutic properties.
Peptidoglycans are composed of long polysaccharides that are alternating copolymers of N-acetylglucosamine
(GlcNAc) and N-acetylmuramic acid (MurNAc), cross-linked
through unusual short peptides with
structures such as (L-Ala)-(D-Glu)-(L-Lys)-(D-Ala).
In gram-negative bacteria, the peptidoglycan sack
is generally only one layer thick and lies just inside an
outer membrane. In gram-positive bacteria, it has no
outer membrane cover but is many layers thick; this
thick sack is able to take up and retain the Gram stain,
giving these bacteria their name. In both groups of
bacteria, lysozymes catalyze the hydrolysis of the
glycosidic links between GlcNAc and MurNAc, dissolving the cell wall. Lysozymes are found throughout nature: in egg whites, in tears and sweat, and in
mucus. A number of bacteriophages also encode lysozymes to help them get in and out of cells. Other
phages make other endolysins, enzymes with peptidoglycan-degrading activity. The others have somewhat
different specificity but the same function as lysozyme, attacking the peptide crosslinker or the bond
on the other side of MurNAc.
Four families of endolysins have been identified:
1. The true lysozymes (glycosidases) that have just
been described, which include the products of
the bacteriophage T4 e gene (for endolysin) and
P22 gp19.
2. Transglycosylases, such as the phage lambda R protein and the product of the P2 phage K gene, which
attack the same bond as lysozyme but conserve
the glycosidic bond energy by forming a cyclic 1,6-disaccharide product. They catalyze the intramolecular transfer of the O-muramyl residue to its
own C6 hydroxyl group.
3. The amidases, such as bacteriophage T7 gp3.5, which
degrade the peptide bond between MurNAc and the
adjacent tetrapeptide crosslinker.
4. The endopeptidases, such as the Listeria monocytogenes A500
ply500, which degrade the peptide bond between
two tetrapeptides, cutting between m-DAP and Ala.
Lytic Phage
B S Guttman
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0783
M
Macronuclear
Development, in Ciliates
L A Klobutcher
Conjugation
Sexual reproduction is initiated by the pairing of cells
of compatible mating types. The micronuclei in each
cell then undergo meiosis to form haploid products.
Next, a haploid nucleus is passed to the mating partner, where it fuses with a resident haploid nucleus
to regenerate a diploid nucleus termed the zygotic
nucleus. The zygotic nucleus divides at least once by
mitosis, in the absence of cell division. Some of these
mitotic products are retained as micronuclei when the
cell resumes asexual growth, while others undergo
macronuclear development to form new macronuclei.
The specific factors responsible for determining micronuclear vs. macronuclear differentiation are unknown,
Macronuclear development in all characterized ciliates involves multiple rounds of DNA replication
that ultimately lead to a polyploid macronucleus (see
article on Macronucleus for the ploidy levels of representative ciliates). Various types of DNA rearrangement occur during this DNA amplification process,
with one of the major events being chromosome fragmentation (Figure 1A). Following fragmentation,
telomeric repeat sequences are quickly added to the
DNA ends of the macronuclear-destined sequences.
The fragmentation/telomere addition process is generally reproducible, but varying degrees of heterogeneity in the precise position of telomeric repeat
addition are observed for different species. In Tetrahymena thermophila, a 15 bp sequence element termed
the Cbs (Chromosome breakage sequence) is necessary
and sufficient to direct fragmentation and telomere
addition. Each Cbs resides in developmentally eliminated DNA and appears to work in an orientation-independent manner. A different conserved 10 bp
sequence (E-Cbs) is found adjacent to chromosome
fragmentation sites in the hypotrich Euplotes crassus.
E-Cbs can reside in either eliminated or macronuclear-destined DNA and is thought to direct fragmentation in an orientation-dependent manner, suggesting
that there may be significant differences in chromosome fragmentation among ciliates. Little is known as
yet concerning the molecular mechanism(s) of chromosome fragmentation and the proteins mediating it
have not been identified. In contrast, de novo telomere
addition is known to be catalyzed by the ribonucleoprotein telomerase. The ciliates are unique in respect
to their ability to efficiently `heal' the DNA ends
generated during macronuclear development by telomere addition.
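Because Cbs is described as working in an orientation-independent manner, a search for candidate fragmentation sites has to consider a sequence element on both strands. The Python sketch below shows one way this could be done; the function names are hypothetical, and `PLACEHOLDER_MOTIF` is an arbitrary 15 bp string used for illustration, not the actual Cbs sequence, which is not given in this article.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_sites(sequence, motif):
    """Return (position, strand) for each exact match of motif on either strand."""
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(sequence) - len(motif) + 1):
        window = sequence[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))   # element in forward orientation
        elif window == rc:
            hits.append((i, "-"))   # element in reverse orientation
    return hits

# Arbitrary 15 bp placeholder, NOT the real Cbs element.
PLACEHOLDER_MOTIF = "AAACCGGTTTGGCCA"

example = "GG" + PLACEHOLDER_MOTIF + "CC" + reverse_complement(PLACEHOLDER_MOTIF) + "T"
print(find_sites(example, PLACEHOLDER_MOTIF))
```

An orientation-dependent element such as E-Cbs would instead require keeping track of the strand of each hit when predicting which flanking DNA is retained.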
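[Figure 1 diagram. (A) Micronucleus to macronucleus: chromosome fragmentation with telomeric repeat addition, and removal of an IES. (B) Micronucleus to macronucleus: scrambled micronuclear segments reordered into a macronuclear minichromosome with (C4A4)n telomeric repeats at each end.]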
Figure 1 The three major types of DNA rearrangement that occur during ciliate macronuclear development. (A)
Chromosome fragmentation, generating the ends of two macronuclear minichromosomes, is illustrated. Following
fragmentation, species-specific telomeric repeats (C3–4A2–4) are added to the minichromosome ends. Also shown is
the removal of an IES by DNA breakage and rejoining. Macronuclear-destined DNA sequences are indicated as open
rectangles, the IES as a black rectangle, and a micronuclear-specific `spacer' sequence as a line. (B) Illustration of the
DNA scrambling observed for some oxytrichid genes. Segments of micronuclear DNA are reordered, and sometimes
inverted, during macronuclear development to form a macronuclear minichromosome. (Reproduced with permission
from Klobutcher and Herrick, 1997.)
during macronuclear development, with the concomitant rejoining of the flanking DNA (Figure 1A).
In Tetrahymena thermophila, about 6000 IESs are
excised. The IESs are generally a few kilobase pairs
in size, are bounded by 48 bp direct repeats of varying sequence, and share little similarity. There are
typically >50000 IESs in Paramecium and hypotrichous ciliates, and they are generally smaller, ranging
from 14 to *500 bp in size. The hypotrichs Euplotes
crassus, Oxytricha fallax, and O. trifallax also contain
large families of transposable elements that behave as
IESs during macronuclear development (i.e., they are
excised). The termini of the small IESs in Paramecium
and Euplotes are conserved and similar to the ends of
the Euplotes Tec transposon IESs. This observation
has bolstered suggestions that the IES excision process
originated from transposons that invaded the micronuclear genome. In this sense, the IESs can be viewed
as a form of `get-out-of-the-way' transposon, similar
to mobile introns and inteins that are removed by
RNA and protein splicing, respectively. The ability
of all these elements to be removed at some point
during the process of gene expression enhances their
ability to coexist with their host genomes. While the
relationship of IESs to transposons is less evident in
species such as Tetrahymena and other hypotrichs,
it is noteworthy that the analysis of either excision
intermediates or excision products in these species
DNA Scrambling
Some hypotrichous ciliates in the genus Oxytricha
have been found to undergo an additional DNA rearrangement process: unscrambling. The micronuclear
copies of some of the macronuclear chromosomes
in this group are not only interrupted, but the segments
that will form the macronuclear DNA molecule are
not in the correct order and in some cases inverted
(Figure 1B). The micronuclear copy of the macronuclear DNA molecule containing a DNA polymerase
alpha gene represents an extreme form of micronuclear
scrambling: it is split into at least 51 segments, most of
which are scrambled. Unscrambling during macronuclear development appears to be guided by 6–19 bp
repeats which flank the sequences that are ultimately
joined together. The origin of gene scrambling is unclear, but it may be related to the IES excision process.
changes in chromatin structure occur during macronuclear development. Tetrahymena genes have been
identified that encode development-specific proteins
which interact with DNA sequences eliminated
during macronuclear development. The eliminated
DNA appears to be heterochromatic, suggesting an
alternative chromatin structure. In addition, a variant
development-specific histone H3 protein in Euplotes
has been shown to be targeted to the developing
macronucleus and its expression correlated with a
change in nucleosome spacing. Chromatin remodeling
may be a prerequisite for ciliate DNA rearrangement,
or, alternatively, may be involved in the subsequent
process of DNA elimination.
Further Reading
Coyne RS, Chalker DL and Yao M-C (1996) Genome downsizing during ciliate development: nuclear division of labor
through chromosome restructuring. Annual Review of Genetics 30: 557–578.
Klobutcher LA and Herrick G (1997) Developmental genome
reorganization in ciliated protozoa: the transposon link. Progress in Nucleic Acid Research and Molecular Biology 56: 1–62.
Prescott DM (1997) Origin, evolution, and excision of internal
eliminated segments in germline genes of ciliates. Current
Opinion in Genetics and Development 7: 807–813.
Macronucleus
L A Klobutcher
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1456
Further Reading
Major Histocompatibility
Complex (MHC)
J Read and B J Smith
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0784
Malthus, Thomas
See: Darwin, Charles
Mammalian Genetics
(Mouse Genetics)
L Silver
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0787
do not harm their young, and after centuries of artificial selection, they are docile and easily handled.
But why study a mammal at all when animals like the
fruit fly Drosophila melanogaster and the nematode
Caenorhabditis elegans are even smaller and much
more amenable to genetic analysis? The answer is that
a significant portion of biological research is aimed at
understanding ourselves as human beings. And although many features of human biology, especially at
the cell and molecular level, are shared across a broad
spectrum of life, our most advanced organismal-level
characteristics are shared in a much more limited fashion with other animals. In particular, many aspects of
human development and disease are common only to
placenta-bearing mammals such as the mouse.
of departure of C. elegans and vertebrates from a common ancestor that lived on the order of one billion
years ago. Drosophila diverged from the vertebrate line somewhat later, approximately
700 million years ago. The divergence of mice and
people occurred relatively recently, about 60 million
years before present. Thus, humans and mice are ten
times more closely related to each other than either is
to flies or nematodes.
Although the haploid chromosome number associated with different mammalian species varies tremendously, the haploid content of mammalian DNA
remains constant at approximately 3 billion base pairs.
It is not only the size of the genome that has remained
constant among mammals; the underlying genomic
organization has remained the same as well.
Large genomic segments (on average, 10 to 20 million
base pairs) have been conserved intact between mice,
humans, and other mammals. In fact, the available data suggest that a rough replica of the human
genome could be built by simply breaking the mouse
genome into 130–170 pieces and pasting them back
together again in a new order. Although all mammals
are remarkably similar in their overall body plan, there
are some differences in the details of both development and metabolism, and occasionally these differences can prevent the extrapolation of mouse data to
humans and vice versa. Nevertheless, the mouse has
proven itself over and over again as being the model
experimental animal par excellence for studies of
nearly all aspects of human genetics.
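As a rough consistency check on the figures quoted above, the following sketch divides the approximate haploid genome size by the suggested number of rearranged pieces. The function name is illustrative, and the calculation assumes, for simplicity, that the whole genome is partitioned into pieces of equal size.

```python
GENOME_BP = 3_000_000_000  # approximate haploid mammalian genome size, as stated in the text


def mean_segment_mb(pieces: int, genome_bp: int = GENOME_BP) -> float:
    """Average piece size in megabases if the genome is cut into `pieces` equal parts."""
    if pieces <= 0:
        raise ValueError("pieces must be positive")
    return genome_bp / pieces / 1e6


if __name__ == "__main__":
    # The text suggests 130-170 pieces relate the mouse and human genomes.
    for pieces in (130, 170):
        print(f"{pieces} pieces -> about {mean_segment_mb(pieces):.1f} Mb each")
```

Under these simplifying assumptions the average piece is roughly 18 to 23 Mb, close to the upper end of the 10 to 20 million base pair range given for conserved segments.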
High-Resolution Genetics
With the automation and simplification of molecular
assays that have occurred over the last several years, it
has become possible to determine chromosomal map
positions to a very high degree of resolution. Genetic
studies of this type are relying increasingly on extremely polymorphic microsatellite loci to produce
anchored linkage maps, and large insert cloning
vectors such as yeast artificial chromosomes (YACs)
to move from the observation of a phenotype, to a
map of the loci that cause the phenotype, to clones of
the loci themselves. Thus, many of the advantages that
were once uniquely available to investigators studying
lower organisms, such as flies and worms, can now be
applied to the mouse through the three-way marriage
of genetics, molecular biology, and embryology.
How should one go about performing a mapping
project? The answer to this question will be determined by the nature of the problem at hand. Is there
a particular locus, or loci, of interest that you wish to
map? If so, at what level is the locus defined, and at
what resolution do you wish to map it? Is the locus
associated with a DNA clone, a protein-based polymorphism, or a gross phenotype visible only in the
context of the whole animal? Are you interested in
mapping a transgene insertion site unique to a single
line of animals? Do you have a new mutation found in
the offspring from a mutagenesis experiment? Alternatively, are you isolating clones to be used as potential DNA markers for a specific chromosome or
subchromosomal region with the need to know simply whether each clone maps to the correct chromosome or not? The answers to these questions will lead
to the choice of a general mapping strategy.
Mutant Phenotypes
For loci defined by phenotype alone, rapid mapping is
usually not possible. Interest in the new phenotype is
likely to lie within its novelty and, as such, the parental
strains used in all standard mapping panels are almost
certain to be wild-type at the guilty locus. Thus, a
broad-based recombinational analysis can be accomplished only by starting from scratch with a cross
between mutant animals and a standard strain. Before
one embarks on such a large-scale effort, it makes
sense to consider whether the mutant phenotype, or
the manner in which it was derived, can provide any
clues to the location of the underlying mutation. Is the
mutant phenotype similar to one that has been previously described in the literature? Does the nature of
the phenotype provide insight into a possible biochemical or molecular lesion?
The most efficient way to begin a search for potentially related loci is to search through an online database of genetic mapping information. Phenotypically
related loci can be uncovered by searching electronic databases for well-chosen keywords. Finally, one can carry out a computerized
online search through the entire biomedical literature.
Once again, this search need not be confined to the
mouse since similarity to a human phenotype can be
informative as well.
When a possible relationship with a previously characterized locus is uncovered, genetic studies should be
directed at proving or disproving identity. This is most
readily accomplished when the previously characterized locus, either human or mouse, has already been
cloned. A clone can be used to investigate the possibility of aberrant expression from mice that express
the new mutation. One can follow the segregation of
the cloned locus in animals that segregate the new
mutation. Absolute linkage would provide evidence
in support of an identity between the new mutation
and the previously characterized locus.
Even if the previously characterized mutant locus
has not yet been cloned, it may still be possible to
test a relationship between it and the newly defined
mutation. If the earlier mutation exists in a mouse strain
Map distances on bacterial chromosomes, determined from conjugation between sexually compatible
strains, are given in minutes, indicative of the time following cell contact at which a given locus is transferred
from the donor to the recipient cell. By convention,
these distances in Escherichia coli are normalized to a
total map length of 100 minutes.
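The normalization to 100 minutes can be illustrated with a small sketch. It assumes a circular map (so that the distance between two loci is the shorter of the two arcs) and a strictly proportional relationship between map minutes and physical length; the genome size of about 4.6 million base pairs is a rounded figure supplied here as an assumption, not a value taken from this entry, and the function names are illustrative.

```python
MAP_LENGTH_MIN = 100.0           # total E. coli map length by convention
ASSUMED_GENOME_BP = 4_600_000    # rounded chromosome size; an assumption for illustration


def minutes_between(locus_a: float, locus_b: float) -> float:
    """Shorter arc, in minutes, between two loci on a circular 100-minute map."""
    d = abs(locus_a - locus_b) % MAP_LENGTH_MIN
    return min(d, MAP_LENGTH_MIN - d)


def minutes_to_bp(minutes: float, genome_bp: int = ASSUMED_GENOME_BP) -> float:
    """Approximate physical length for a map interval, assuming uniform transfer."""
    return minutes * genome_bp / MAP_LENGTH_MIN


if __name__ == "__main__":
    gap = minutes_between(98.0, 2.0)  # loci on either side of the map origin
    print(f"{gap} min, roughly {minutes_to_bp(gap):.0f} bp")
```

With these assumptions one map minute corresponds to roughly 46 kb; real transfer rates are not perfectly uniform, so the conversion is only approximate.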
See also: Centimorgan (cM); Mapping Function
Map Expansion
F W Stahl
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0789
Reference
Mapping Function
F W Stahl
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0791
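Haldane's no-interference mapping function and its inverse:
$$R = \tfrac{1}{2}\left(1 - e^{-2M}\right), \qquad M = -\tfrac{1}{2}\ln(1 - 2R)$$
Haldane's mapping function with interference:
$$M = 0.7R - \tfrac{0.3}{2}\ln(1 - 2R)$$
Counting model, with $y = 2(m+1)M$:
$$R = \tfrac{1}{2}\left[1 - e^{-y}\sum_{i=0}^{m}\frac{y^{i}}{i!}\left(1 - \frac{i}{m+1}\right)\right]$$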
Figure 1 Haldane's mapping functions. (1) Sturtevant's function ($R = M$); (2) Haldane's mapping function with interference, $M = 0.7R - \tfrac{0.3}{2}\ln(1 - 2R)$; (3) Haldane's no-interference mapping function, $R = \tfrac{1}{2}\left(1 - e^{-2M}\right)$. Axes: recombinant frequency ($R$) against map distance ($M$).
Figure 2 Other linear mapping functions. (1) Sturtevant's function ($R = M$); (2) counting model with $m = 2$; (3) Kosambi's function; (4) Haldane's no-interference function. Axes: recombinant frequency ($R$) against map distance ($M$).
(Foss et al., 1993). For D. melanogaster, $m = 4$, while
for Neurospora crassa, $m = 2$. For values of $m$ up to
three, the function has been written in closed form.
For instance, when $m = 2$:
$$R = \tfrac{1}{2}\left[1 - \left(1 + 4M + 6M^{2}\right)e^{-6M}\right]$$
This counting model and Kosambi's function are
compared with Haldane's no-interference function
and with Sturtevant's complete interference function in Figure 2.
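The mapping functions compared in Figures 1 and 2 can be evaluated numerically with a short sketch such as the one below. The function names are illustrative. Two points are assumptions rather than statements from this entry: Sturtevant's function is capped at R = 0.5, and Kosambi's function is written in its standard textbook form, R = ½ tanh(2M). The counting model is coded with y = 2(m + 1)M, which reproduces the closed form quoted for m = 2 and reduces to Haldane's no-interference function when m = 0.

```python
import math


def sturtevant(M: float) -> float:
    """Complete interference: R = M (capped at 0.5 here, an assumption)."""
    return min(M, 0.5)


def haldane(M: float) -> float:
    """Haldane's no-interference function: R = (1 - exp(-2M)) / 2."""
    return 0.5 * (1.0 - math.exp(-2.0 * M))


def haldane_inverse(R: float) -> float:
    """Map distance from recombinant frequency: M = -ln(1 - 2R) / 2, for 0 <= R < 0.5."""
    return -0.5 * math.log(1.0 - 2.0 * R)


def haldane_with_interference(R: float) -> float:
    """Haldane's function with interference: M = 0.7R - 0.15 ln(1 - 2R)."""
    return 0.7 * R - 0.15 * math.log(1.0 - 2.0 * R)


def kosambi(M: float) -> float:
    """Kosambi's function in its standard form: R = tanh(2M) / 2."""
    return 0.5 * math.tanh(2.0 * M)


def counting_model(M: float, m: int) -> float:
    """Counting model with parameter m, using y = 2(m + 1)M."""
    y = 2.0 * (m + 1) * M
    total = sum(y**i / math.factorial(i) * (1.0 - i / (m + 1)) for i in range(m + 1))
    return 0.5 * (1.0 - math.exp(-y) * total)


if __name__ == "__main__":
    for M in (0.1, 0.3, 0.5, 1.0):
        print(M, sturtevant(M), counting_model(M, 2), kosambi(M), haldane(M))
```

At any given map distance the printed values fall in the order Sturtevant, counting model (m = 2), Kosambi, Haldane, from highest to lowest recombinant frequency, which reflects decreasing strength of interference.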
Circular Functions
Functions for circular maps must depart from
Haldane's function because each pair of markers is
linked by a short arc and a long arc, both of which
need to be 'broken' to effect recombination. A model-based function for bacteriophage T4, which has a
circularly permuted chromosome, achieved linkage
circularity by assuming a mixed population composed
of chromosomes that had termini in one or the other
arc in proportion to the relative linkage lengths of those
arcs. These chromosomes were assumed to undergo a
succession of spatially clustered exchanges between
Further Reading
References
Mapping Panel
M F Seldin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0790
Marfan Syndrome
R E Pyeritz
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0792
The Marfan syndrome (MFS) is an autosomal dominant, heritable disorder of connective tissue characterized by clinical findings in multiple tissues and
organ systems, including the eyes, skeleton, muscles,
heart, major arteries, lungs, and skin. The cause is a
defect in the extracellular microfibril.
Clinical Manifestations
The manifestations of MFS show considerable variability in expression, and people need not show all
features to warrant the diagnosis. In general, people
with MFS have disproportionately tall stature, with
arms and legs particularly long. The ribs also overgrow, and push the sternum in (pectus excavatum) or
out (pectus carinatum). Joint laxity is common, but
congenital contractures of the elbows and digits also
occur. Ligamentous laxity also contributes to abnormal
spinal curvature (scoliosis) and to flat feet (pes planus).
The palate tends to be narrow and high, and the teeth
crowded and maloccluded. While skeletal muscle is
underdeveloped in many, which contributes to the
asthenic habitus, few people are functionally weak.
Dislocation of the ocular lens can be present at birth
or appear at any time during growth of the eye. Nearsightedness (myopia), strabismus, and astigmatism are
common ocular signs. If the ocular problems are not
detected in early childhood, amblyopia becomes a permanent problem. Glaucoma and cataract frequently
occur in young and middle-aged adults with MFS. The
lung is subject to spontaneous pneumothorax from
rupture of apical blebs (5%). The dura stretches in
the lumbosacral region producing a dilated thecal sac
(dural ectasia) and occasionally anterior meningoceles, which can cause pain and weakness.
The complications that reduce life expectancy in
MFS by half are mainly cardiovascular. The first segment of the aorta (aortic root) may be enlarged at
birth, and typically dilates progressively throughout
life. This process is painless, and in the absence of
appropriate imaging, goes unrecognized until the
symptoms of aortic regurgitation or life-threatening
aortic dissection appear. Mitral valve prolapse occurs
in most people with MFS, and has a tendency to progress to mitral regurgitation, which is the most common indication for cardiac surgery in children.
Prevalence
The MFS is one end of a spectrum of heritable disorders of connective tissue, and drawing the line of
demarcation among these other disorders is based
on arbitrary diagnostic criteria. Estimates of the
prevalence of classic MFS in all populations range
from 1 per 3000 to 1 per 10 000. No racial or ethnic group
seems predisposed. About 25–30% of affected people
are the first case in the family due to a new mutation
in the egg or the sperm that led to that person's conception.
Cause
The cause of MFS, defined in 1991, is a mutation in the
gene encoding fibrillin-1, the principal component of
the extracellular microfibril. This gene, FBN1, spans
over 100 kb of chromosome 15, and consists of 65
Management
Early diagnosis is crucial to effective management,
which is much easier when a family history of MFS
raises suspicion. Even today, some patients are not
detected until a major complication occurs. Diagnosis
of the first case in any family should prompt evaluation of close relatives.
Early evaluation by an ophthalmologist familiar
with MFS is key to preventing amblyopia. With
improved ocular surgery, lens removal for valid indications is much less risky. Little can be done to affect
stature. Screening for scoliosis should begin in early
childhood. Bracing may be effective, but curves
greater than about 40° require surgical stabilization.
Severe pectus excavatum can be surgically repaired to
improve respiratory mechanics and surgical access to
the heart and aorta.
Central to managing the cardiovascular features is
echocardiography: the size of the aortic root can be
followed, cardiac and valvular function quantified,
and the effects of therapy gauged. The most effective
therapy is early administration of a β-adrenergic
blocking agent. The intent is to reduce both heart
rate and impulse of ejection in order to reduce
hemodynamic stress on the aorta and delay or prevent
dilatation and dissection. When the aortic root
reaches 50–55 mm in the adult, strong consideration
should be given to prophylactic aortic replacement.
The long-term responses to this approach have
resulted in average life expectancy rising over the
past three decades from the fourth to the seventh
decade.
Further Reading
Marker
B S Guttman
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0793
Marker Effect
P J Hastings
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0794
nearby does not occur when the two markers are very
close (the reason for this is not known). The persistence of heteroduplex gives a very high frequency of
recombination with other markers nearby because
one nucleotide strand will always show recombination. This effect can also be produced in Saccharomyces cerevisiae by the introduction of a palindrome
for use as a marker; palindromes within heteroduplex
appear to be corrected inefficiently.
Deletions or the inclusion of heterologous sequences influence the amount of recombination in
several systems, perhaps by interfering with the
migration of a Holliday junction or other branch structure. Another kind of marker effect reported in several
fungi stems from the creation of a specific DNA
sequence by the introduction of a mutation for use as
a marker. This effect can be explained as the creation of
a binding site for a protein that creates a recombination hot spot.
See also: Hot Spot of Recombination; Map
Expansion; Mismatch Repair (Long/Short Patch)
Marker Rescue
I Schildkraut
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0795
Marker rescue is a method that analyzes gene organization and establishes the relationship of a mutation
to a physical location on a DNA molecule. In a simple
conception of marker rescue, a mutation carried by a
virus that prevents the virus from forming plaques can
be mapped to one of the many DNA fragments that
are created by digestion of the wild-type virus with a
restriction endonuclease. First, the DNA fragments of
the viral DNA are physically separated. Cells are then
simultaneously infected with the mutant virus and
transfected with one of the DNA fragments, each fragment being tested separately.
The DNA fragment that carries the wild-type copy of
the allele that is defective in the mutant virus recombines with the viral DNA and enables the
virus to form a plaque, that is, it rescues the virus. All the other
DNA fragments do not contain the wild-type copy of
the mutant allele and when tested do not give rise to
a plaque. Marker rescue determines which DNA
restriction fragment carries the mutation in the mutant
virus. Marker rescue can show that the ordered presentation of genes along a chromosome correlates to
their linear occurrence in the DNA molecule.
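The logic of the experiment described above can be sketched as a small simulation. The fragment names, coordinates, and mutation position below are hypothetical, and the model assumes the idealized case in which a plaque forms if and only if the transfected wild-type fragment spans the mutated site.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    name: str
    start: int  # first base of the restriction fragment (inclusive)
    end: int    # one past the last base (exclusive)


def forms_plaque(fragment: Fragment, mutation_pos: int) -> bool:
    """Idealized co-transfection: rescue occurs only if the fragment covers the mutation."""
    return fragment.start <= mutation_pos < fragment.end


def map_mutation(fragments, plaque_observed):
    """Return the fragments whose co-transfection gave plaques."""
    return [f for f in fragments if plaque_observed[f.name]]


if __name__ == "__main__":
    # Hypothetical restriction map of a wild-type viral genome.
    fragments = [Fragment("A", 0, 1200), Fragment("B", 1200, 3100), Fragment("C", 3100, 5000)]
    hidden_mutation = 2500  # unknown to the experimenter; used only to simulate results
    results = {f.name: forms_plaque(f, hidden_mutation) for f in fragments}
    rescued = map_mutation(fragments, results)
    print(results)
    print("Mutation maps to fragment(s):", [f.name for f in rescued])
```

In practice rescue is a matter of plaque frequency rather than a strict yes or no, so a real analysis would compare plaque counts against a no-fragment control.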
See also: Genetic Marker
Masked mRNA
N Standart
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0796
Maternal Effect
K J Kemphues
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0797
Maternal Inheritance
J Poulton
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0798
Mitochondrial Diseases
Mitochondrial DNA (mtDNA) mutants cause diverse
phenotypes in different organisms due to impaired
respiratory chain function: the petit colony morphology with loss of aerobic respiration in yeast, cytoplasmic male sterility and nonchromosomal stripe
in higher plants, neurological or multisystem disease
in man. The latter include mtDNA rearrangements
(Holt et al., 1988) which cause sporadic Kearns–Sayre
syndrome, Pearson syndrome (Rötig et al.,
1988), or maternally inherited diabetes and deafness
(Ballinger et al., 1994); point mutations which cause
mitochondrial encephalopathy with lactic acidosis and
stroke-like episodes (MELAS) (Goto et al., 1990),
myoclonic epilepsy with ragged red fibres (MERRF)
(Shoffner et al., 1990), and Leber's hereditary optic
neuropathy (LHON) (Wallace et al., 1988). Heteroplasmy (the presence of both normal and mutant
mtDNA in a single individual) is present in many
such mtDNA diseases, so that the proportion of
mutant mtDNA in any cell or tissue may vary from
0% to 100%. In most disorders, there appears to be a
threshold effect such that a high level of
mutant mtDNA in a tissue results in symptoms, and preferential accumulation of mutant mtDNAs in affected tissues appears to
explain their progressive nature (Poulton et al., 1995;
Weber et al., 1997).
References
Mating Types
L A Casselton
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0799
References
three mating type genes are all different, but have the
same general function of being DNA binding proteins
(transcription factors) that regulate transcription of a
specific subset of genes. In both types of cells, genes
encoding general mating functions, such as components of a signaling pathway mentioned below, are
expressed independently of the mating type proteins;
they are expressed constitutively. In haploid MATa
cells, genes required for expression of MATa-cell-specific mating functions are also constitutively
expressed because the a1 protein has no role in haploid
cells. In MATα cells, the MATα genes have two
functions: to repress the transcription of MATa-cell-specific genes; and to activate the transcription of
MATα-cell-specific genes. The α2 protein is a repressor and the α1 protein is an activator.
The genes that are expressed in a mating type-specific way enable cells to identify a compatible partner. MATa and MATα cells detect each other by means
of small, secreted peptide pheromones. Each mating
type produces its distinct pheromone (a-factor or
α-factor) together with a cell-surface pheromone
receptor that will only bind pheromone produced by
cells of the other mating type. Pheromone binding
triggers an intracellular signal transduction pathway.
This is known as the pheromone response pathway,
and it leads to activation of a transcription factor
responsible for activating transcription of genes
required to bring about cell fusion and subsequent
nuclear fusion. The pheromone signaling pathway is
one of five mitogen-activated protein kinase
(MAPK) pathways found in yeast. Amongst the genes
that are activated by the pheromone response are
those that lead to the production of cell surface proteins that make the compatible cells adhere to each
other. Another activated protein has the dual role of
arresting the cell cycle so that cells can fuse before
DNA replication, as well as being a scaffold protein
that links the site on the cell surface where receptor
activation occurred to the proteins that determine the
orientation of the cytoskeleton. Mating cells are thus
able to respond to the direction from which the pheromone is coming and to reorient their growth to form
mating projections that will enable them to fuse.
Once compatible cells have fused, there is no need
to send and respond to mating signals. The genes
required for signaling are repressed, or fail to be activated, and the diploid cell follows a new developmental program that will ultimately lead to meiosis. A
remarkably simple mechanism enables a cell to sense
that it has successfully mated. The a1 protein encoded
by the gene at the MATa locus and the α2 protein
encoded by one of the genes at the MATα locus are
produced in different haploid cells, but following cell
fusion, these two proteins are in the same cell and form
a heterodimeric protein complex. This complex is a
diploid cell-specific transcription factor. It represses
directly or indirectly the transcription of all genes
required for mating and permits the activation of
genes required for meiosis.
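The regulatory logic described above can be summarized as a simplified Boolean sketch. It is an illustration only: the function and key names are hypothetical, each regulator is treated as simply present or absent, and the model follows the text in assuming that the a1/α2 heterodimer shuts off all mating genes, whether directly or indirectly.

```python
def cell_type_program(mat_alleles):
    """Simplified Boolean model of mating-type control in budding yeast.

    `mat_alleles` is a set containing "a", "alpha", or both (a diploid after fusion).
    """
    a1 = "a" in mat_alleles          # encoded at MATa; no role in haploid cells
    alpha1 = "alpha" in mat_alleles  # activator of alpha-cell-specific genes
    alpha2 = "alpha" in mat_alleles  # repressor of a-cell-specific genes
    heterodimer = a1 and alpha2      # a1/alpha2 complex, formed only after cell fusion

    return {
        # a-specific genes are on by default and are switched off by alpha2.
        "a_specific_genes": not alpha2,
        # alpha-specific genes need alpha1 and are off once the heterodimer forms.
        "alpha_specific_genes": alpha1 and not heterodimer,
        # General mating functions are constitutive in haploids, repressed in diploids.
        "general_mating_genes": not heterodimer,
        # The heterodimer permits activation of genes required for meiosis.
        "meiosis_permitted": heterodimer,
    }


if __name__ == "__main__":
    for alleles in ({"a"}, {"alpha"}, {"a", "alpha"}):
        print(sorted(alleles), cell_type_program(alleles))
```

The three outputs correspond to a haploid MATa cell, a haploid MATα cell, and the diploid formed by their fusion, in which all mating functions are off and meiosis becomes possible.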
In other fungi, for example, the bread mold Neurospora crassa, the genes found at the mating-type locus
are not all homologs of the yeast genes. However,
the proteins they encode are transcription factors
and lead to a similar cell-type expression of the signaling molecules that enable compatible mates to detect
each other. The mating-type genes of the mushroom
fungi, such as the ink cap Coprinus cinereus, are
remarkable in that they are multiallelic and there
may be several thousands of different mating types
in a population. Here, the mating-type genes encode
large families of proteins that are the homologs of
the yeast a1 and α2 proteins, and mate recognition
depends on a sensitive dimerization domain that permits proteins from compatible partners to dimerize,
but not those from incompatible partners. The pheromones and receptors of these fungi help to determine
mating type and are also members of a large family,
where subtle differences in amino acid sequence are
sufficient for a pheromone to distinguish between
compatible and incompatible receptors.
The genetic mechanisms that enable fungal cells to
signal and respond to each other during mating are
universal and illustrate in relatively simple systems
some of the complex cellular mechanisms involved in
mating in all eukaryotic organisms.
See also: Mating-Type Genes and their Switching
in Yeasts
saccharomyces pombe) and budding yeasts (Saccharomyces cerevisiae) have shown that these organisms
have chosen very different mechanisms of cell-type
change where individual haploid cells express one or
the other mating cell type during mitotic growth.
Interestingly, the pattern of cell division in both yeasts
is analogous to the stem cell pattern of cell division
found in the growth of many self-renewing tissues of
mammals and in other species. As the mechanism of
asymmetric cell division is crucial to explain cellular
differentiation during development in higher systems,
these yeasts have provided different paradigms for
mechanisms of asymmetric cell division. This process
generates cells of opposite type, which can mate and
undergo meiosis and sporulation. This review discusses the mechanism of cell-type change in these
two distantly related organisms as a model system
for answering questions about gene regulation in
eukaryotes.
Figure 1 (A) Arrangement of expressed MAT and silent HMLα and HMRa loci on chromosome 3 of Saccharomyces
cerevisiae. HMR is located about 120 kb to the right, while HML is about 180 kb to the left of MAT. The MAT
interconversion occurs by replacing the Y region with that derived from the HMLα (747 bp) or HMRa (642 bp) locus. The W
(723 bp), X (704 bp), Z1 (239 bp), and Z2 (88 bp) boxes represent sequence homology shared by these loci. This
homology is used for pairing these elements during the process of gene conversion. The DSB at the MAT Y/Z1
boundary catalyzed by the HO-endonuclease initiates recombination. The dot represents the centromere. (B)
Arrangement of expressed mat1 and silent mat2-P and mat3-M loci on chromosome 2 of Schizosaccharomyces pombe.
mat2 is located 17 kb distal to mat1 and mat3 is situated 11 kb distal to mat2. Switching occurs by replacing the allele-specific mat1 sequences with those derived from the mat2-P or mat3-M locus. Short sequence homologies H1 (59 bp),
H2 (135 bp), and H3 (57 bp) flank the P-specific 1104 bp (jagged line) and M-specific 1128 bp (straight line) regions. The
vertical arrow indicates a strand-specific imprint that initiates gene conversion for switching of mat1 only. Long
arrows in both drawings represent unidirectional gene conversion.
copy of a specific donor is transposed into the transcriptionally active MAT/mat1 locus by recombination.
Directionality of Switching
In both Sc. pombe and Sa. cerevisiae donors are chosen
nonrandomly such that nearly 80% of switches occur
to the opposite allele, which contains nonhomologous
allele-specific sequences. Both yeasts have evolved
mechanisms whereby the specific donor is preferentially chosen, based on cell type but regardless of the
donor's genetic content. This was discovered because
when the donor's genetic content was swapped
experimentally, cells in both yeasts preferentially
underwent futile, homologous cassette replacements.
However, the precise mechanism of donor choice is
different in these organisms. In Sc. pombe, it is thought
to be regulated by chromatin structural changes of the
donor loci, probably in a cell-type specific fashion,
but in Sa. cerevisiae, the control lies with a distantly
located site on chromosome 3 that influences availability of the left chromosome arm for recombination.
Further Reading
Maxam-Gilbert Sequencing
See: DNA Sequencing
Maximum Likelihood
H Kishino and M Hasegawa
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0803
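$$I\bigl(g : f(\cdot \mid \theta)\bigr) = \sum_{m=1}^{M} g(z_m)\log\frac{g(z_m)}{f(z_m \mid \theta)} = E_Z \log g(Z) - E_Z \log f(Z \mid \theta)$$
$$E_Z \log f(Z \mid \theta) \approx \frac{1}{n}\sum_{i=1}^{n}\log f(Z_i \mid \theta)$$
$$\ell(\theta \mid Z_1, \ldots, Z_n) = \sum_{i=1}^{n}\log f(Z_i \mid \theta) = \log L(\theta \mid Z_1, \ldots, Z_n), \qquad L(\theta \mid Z_1, \ldots, Z_n) = \prod_{i=1}^{n} f(Z_i \mid \theta)$$
[Figure: tree diagram; surviving labels are A, v, k1, k2, Xi0, Xi1, and Xi2. Caption not recovered.]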
Likelihood of a Tree
Sequence data consist of homologous sites after
alignment. Assuming a model of substitution with
independence among sites, the log likelihood of a
given tree i is:
$$\ell_i(\theta_i \mid X) = \sum_{h=1}^{n}\log f_i(X_h \mid \theta_i)$$
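$$2\log\frac{L_1(\hat\theta_1 \mid X)}{L_0(\hat\theta_0 \mid X)} = 2\left[\ell_1(\hat\theta_1 \mid X) - \ell_0(\hat\theta_0 \mid X)\right]$$
$$\hat V = \frac{n}{n-1}\sum_{h=1}^{n}\left[\log\frac{f_1(X_h \mid \hat\theta_1)}{f_0(X_h \mid \hat\theta_0)} - \frac{1}{n}\sum_{h'=1}^{n}\log\frac{f_1(X_{h'} \mid \hat\theta_1)}{f_0(X_{h'} \mid \hat\theta_0)}\right]^{2}$$
A normalized statistic $z = \left[\ell_1(\hat\theta_1 \mid X) - \ell_0(\hat\theta_0 \mid X)\right]/\sqrt{\hat V}$ with absolute value
larger than 2 can be regarded as significant.
Maximum log likelihoods of more than two topologies follow a multivariate normal distribution. When a
topology is compared not with a prespecified tree but
with the maximum likelihood tree, the normalized statistic z should be compared with the distribution of the
maximum of the multivariate normal random variable.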
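The comparison of two tree topologies by a normalized statistic z can be sketched as follows. The sketch assumes that per-site log likelihoods under each tree have already been computed by some phylogenetics program; the function name and the numbers in the usage example are hypothetical. The variance is estimated from the site-to-site variation in the log-likelihood differences, with the n/(n − 1) correction.

```python
import math


def normalized_difference(site_loglik_1, site_loglik_0):
    """Return (delta, variance, z) for two trees from per-site log likelihoods.

    delta is the difference in total log likelihood (tree 1 minus tree 0);
    variance is n/(n-1) times the sum of squared deviations of the per-site
    differences from their mean; z = delta / sqrt(variance).
    """
    n = len(site_loglik_1)
    if n != len(site_loglik_0):
        raise ValueError("both trees must be evaluated on the same sites")
    if n < 2:
        raise ValueError("at least two sites are needed to estimate the variance")

    diffs = [a - b for a, b in zip(site_loglik_1, site_loglik_0)]
    delta = sum(diffs)
    mean = delta / n
    variance = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    if variance == 0.0:
        raise ValueError("per-site differences are identical; z is undefined")
    return delta, variance, delta / math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical per-site log likelihoods for four aligned sites.
    tree1 = [-1.0, -2.0, -1.5, -3.0]
    tree0 = [-1.1, -1.8, -1.8, -3.0]
    delta, variance, z = normalized_difference(tree1, tree0)
    print(f"delta = {delta:.3f}, variance = {variance:.4f}, z = {z:.3f}")
    print("significant" if abs(z) > 2 else "not significant at |z| > 2")
```

As the text notes, the |z| > 2 rule applies when the alternative tree is specified in advance; comparing against the maximum likelihood tree chosen from the same data calls for a different reference distribution.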
Further Reading
Akaike H (1974) A new look at the statistical model identification. IEEE Transactions on Automatic Control AC-19: 716–723.
Dayhoff MO, Schwartz RM and Orcutt BC (1978) A model
of evolutionary change in proteins. In: Dayhoff MO (ed.)
Atlas of Protein Sequence and Structure, vol. 5, suppl. 3, pp.
345–352. Washington, DC: National Biomedical Research
Foundation.
Felsenstein J (1981) Evolutionary trees from DNA sequences:
maximum likelihood approach. Journal of Molecular Evolution
17: 368–376.
Galtier N, Tourasse N and Gouy M (1999) A nonhyperthermophilic common ancestor to extant life forms. Science
283: 220–221.
Goldman N (1993) Statistical tests of models of DNA substitution. Journal of Molecular Evolution 36: 182–198.
Hasegawa M, Kishino H and Yano T (1985) Dating of the
human–ape splitting by a molecular clock of mitochondrial
DNA. Journal of Molecular Evolution 22: 160–174.
Kishino H, Miyata T and Hasegawa M (1990) Maximum likelihood inference of protein phylogeny and the origin of
chloroplasts. Journal of Molecular Evolution 31: 151–160.
Kullback S (1959) Information Theory and Statistics. New York:
John Wiley.
Nielsen R and Yang Z (1998) Likelihood models for detecting
positively selected amino acid sites and application to the
HIV-1 envelope gene. Genetics 148: 929–936.
Thorne JL, Goldman N and Jones DT (1996) Combining protein
evolution and secondary structure. Molecular Biology and
Evolution 13: 666–673.
Yang Z (1995) A space–time process model for the evolution of
DNA sequences. Genetics 139: 993–1005.
Yang Z, Nielsen R, Goldman N and Pedersen A-MK (2000)
Codon-substitution models for heterogeneous selection
pressure at amino acid sites. Genetics 155: 431–449.
MC29 Avian
Myelocytomatosis Virus
M Frame
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1592
The avian myelocytomatosis virus, MC29, is an oncogenic replication-defective retrovirus that encodes an
oncogenic fusion protein between viral gag sequences
and sequences derived from the coding regions of
c-Myc. Specifically, the hybrid gene arose by deletion of
5′ coding sequences of the c-myc gene, these being
substituted with a 5′ region of the viral gag gene, and by
a small number of base substitutions in c-myc resulting in a few amino acid changes. The resulting v-Myc
phospho-protein (p110-gag-myc) transforms avian
hematopoietic target cells and fibroblasts in vitro,
and induces tumors in vivo, most likely as a result of
its DNA-binding transcription factor activity that is
conferred by the Myc-related portion of the fusion
protein. The activity of v-Myc is required for both
proliferation and long-term survival of transformed
cells.
See also: Retroviruses
McClintock, Barbara
N Fedoroff
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0804
expression, but that they are not the essential regulatory components of most genes. Yet McClintock's
studies on autonomous Spm elements and genes with
nonautonomous Spm insertions provided the first
genetic insights into regulatory gene function.
McClintock was also the first to describe what is now
called `epigenetic' regulation, which she discovered
in her studies on the Spm and Ac transposons.
Although McClintock's early contributions were
recognized by her election to the National Academy
of Sciences in 1944, the importance of her discovery
of transposition was not immediately appreciated.
Transposons were not identified in another organism
until more than a decade after their discovery in maize.
Their ubiquity became increasingly apparent through
the 1970s and their pervasiveness was understood in
the 1980s. The Nobel Prize in Physiology or Medicine
was belatedly awarded to Barbara McClintock in 1983
for her discovery of transposition almost four decades
earlier.
See also: Transposable Elements; Transposons as
Tools
Medicago truncatula
C Staehelin
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1666
L. (alfalfa or lucerne), a polymorphic Medicago species, whose genetic analysis is complicated by abundant repetitive DNA, hybridization, polyploidy, and
domestication. M. truncatula develops root nodules
(indeterminate nodule type) in symbiosis with the
nitrogen-fixing bacteria Rhizobium meliloti (Rhizobiaceae), whose genome has been completely sequenced. M. truncatula was selected as a model plant
for studying the genetic and molecular processes of
nodule formation. Moreover, M. truncatula provides
the opportunity to study symbiotic associations with
arbuscular mycorrhizal fungi as well as resistance to
plant pathogens.
The genome size of M. truncatula is approximately
5 × 10^8 bp, and the current genetic map comprises 348
markers on eight linkage groups and covers 1400 cM
(about 400 kb/cM). A bacterial artificial chromosome
(BAC) library has been constructed. Genome comparison between M. truncatula and other legumes revealed
a high level of macro- and microsynteny. Map-based
cloning of a number of symbiosis-related loci is in
progress. Insertion mutagenesis programs (using
either transposon or T-DNA tagging strategies) have
been initiated. Efficient Agrobacterium tumefaciens-mediated transformation systems have been developed for specific M. truncatula ecotypes (Jemalong,
R108). M. truncatula plants that overexpress the
early nodulin gene enod40 exhibited accelerated
development of root nodules and increased mycorrhizal colonization. Expressed sequence tags (ESTs)
have been obtained from various tissues including
roots, nodulated roots, and mycorrhizal roots. Chemical mutagenesis has generated a set of various M.
truncatula mutant lines, including mutants unable to
form root nodules. Among them, a number of mutants
are blocked at early stages of the symbiosis and do not
display calcium spiking (sharp oscillations of cytoplasmic calcium ion concentration) in the root hairs,
a response induced by rhizobial lipochitooligosaccharide signals (Nod-factors). These mutants are also
impaired in their ability to interact with mycorrhizal
fungi. Another M. truncatula mutant is insensitive to
the plant hormone ethylene and can be hyperinfected
by R. meliloti.
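The map density quoted above (about 400 kb/cM) follows from dividing the genome size by the length of the genetic map. A minimal sketch of that arithmetic, taking the genome size as 5 × 10^8 bp and using only figures given in the text; the variable names are illustrative:

```python
# Figures taken from the text; the names are illustrative only.
genome_size_bp = 5e8   # approximate genome size, in base pairs
map_length_cm = 1400   # total length of the genetic map, in centimorgans
markers = 348          # number of markers on the current map

kb_per_cm = genome_size_bp / map_length_cm / 1000   # average physical distance per cM
cm_per_marker = map_length_cm / markers             # average spacing between markers

print(f"{kb_per_cm:.0f} kb/cM")
print(f"{cm_per_marker:.1f} cM per marker")
```

The result, roughly 357 kb/cM, is consistent with the rounded value in the text. It is a genome-wide average only; the relationship between physical and genetic distance generally varies along a chromosome.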
Further Reading
http://chrysie.tamu.edu/medicago/
http://sequence.toulouse.inra.fr/Mtruncatula.html
http://www.ncgr.org/research/mgi/
http://www.tigr.org/tdb/mtgi/
Medical Genetics
See also: Clinical Genetics
Meiosis
P B Moens
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0807
Definition
Meiosis is defined as the cellular and nuclear processes
that reduce the chromosomal content per nucleus
from two sets to one set. In most organisms, two sets
of chromosomes (diploid) are reduced to one set
(haploid) (see Chromosome Pairing, Synapsis).
When the haploid cell becomes involved in the process
of fertilization, it is referred to as a `gamete.' If a cell
with one set of chromosomes goes on to proliferate, it
is called a `gametophytic generation.' This occurs in
many fungi, ferns, and, for a few divisions, in plants.
Many variations in the meiotic process have evolved
that are of particular adaptive value to specific organisms. The products of meiosis in organisms with three
or four sets of chromosomes are usually unbalanced
because of difficulties in the segregation and assortment of chromosomes. Some of the mechanics of
meiosis are presented in the articles on Chiasma, and
Synaptonemal Complex.
Benefits of Meiosis
Because meiosis can introduce genetic variation, it is
intuitively assumed to be of biological benefit. This
concept, however, needs qualification. Clearly, if an
organism or a population is genetically fine-tuned to
a given environment, it would be counterproductive
to break up the balanced genetic makeup by meiosis.
Thus, one would expect to see asexual mechanisms, or
at least reduced genetic variability, evolve in a sexual
population under stable conditions. Numerous examples have been cited of such adaptations. They
include vegetative reproduction as an alternative to
sexual reproduction under stable conditions (e.g.,
strawberries), or temporary asexual reproduction during stable conditions followed by sexual reproduction
under unfavorable conditions (e.g., aphids). Complete
asexual reproduction derived from sexual reproduction has been reported in cases of wide dispersion in
which male-female contact becomes highly tenuous.
Reduced genetic variability through permanent translocation heterozygosity exists in a number of plant
species, but the relationship to environmental factors
is not obvious. In some insects and spiders, the sex-linked chromosome complex may become extensive
(e.g., some termites), thereby also limiting genetic
variability.
Meiotic Mistakes
It would be an oversimplification to assume that any
biological process is executed flawlessly. However,
most mistakes that take place in cells of the body are
usually of minor importance and can be attended to by
repair or replacement. The consequences of meiotic
mistakes, on the other hand, can be disastrous, because
the resulting individual may be affected in its entirety.
In humans, statistics show that 7.5% of all conceptions carry lethal chromosome defects and 0.5% have
nonlethal chromosomal aberrations. Half of these are
the result of missegregation at meiosis in the male or
female parent. Typically, instead of one of a pair of
chromosomes going to each nucleus, both end up in
one nucleus and none in the other. If fertilized, the first
cell would result in an offspring with three instead of
two copies of the chromosome, while the other would
carry a single chromosome donated by the other parent. For the larger chromosomes, this imbalance is
lethal, but for the sex chromosomes, X and Y, the
imbalance is tolerated because of the cell's ability to
shut down most of the X chromosome when present
in more than one copy (as in normal females). However, there are still more or less severe developmental
problems arising from this aneuploidy (incorrect
numbers of chromosomes). In humans, a trisomy
(three chromosomes) involving the small chromosome 21 can be viable, resulting in the condition
known as Down syndrome. However, monosomy (a
single chromosome) is lethal.
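The bookkeeping behind trisomy and monosomy can be sketched in a few lines. The following Python example is a deliberately simplified model, not a description of the full meiotic process: it tracks the copy number of a single chromosome in the two nuclei that receive a homologous pair, and assumes the other parent contributes one normal copy. The function names are illustrative.

```python
def segregate(missegregation=False):
    """Copies of one chromosome in the two nuclei that receive a homologous pair."""
    # Normal segregation sends one homolog to each nucleus.
    # Missegregation sends both to one nucleus and none to the other.
    return (2, 0) if missegregation else (1, 1)

def zygote_type(copies):
    """Name the chromosome count of the fertilized cell for this chromosome."""
    return {1: "monosomy", 2: "normal (two copies)", 3: "trisomy"}[copies]

OTHER_PARENT = 1  # assumption: the other parent donates exactly one copy

for faulty in (False, True):
    for gamete_copies in segregate(faulty):
        print(faulty, gamete_copies, zygote_type(gamete_copies + OTHER_PARENT))
```

With missegregation, one product yields a trisomic zygote and the other a monosomic one, which is the pair of outcomes described in the text.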
It has been reasoned that organisms with offspring
numbering in the thousands, such as plants, tend to
have less stringent control of meiosis. In exceptional
cases, organisms such as some crustaceans and plants
have very large numbers of chromosomes (hundreds),
all of which appear to be repeat copies of one or a few
chromosomes. In such cases, there does not appear to
be a careful assortment at either mitosis or meiosis.
A variety of plant species can tolerate unbalanced
products of meiosis better than animals such as vertebrates. The existence of chromosomal variation has
resulted in the observable evolution of new species
by natural or artificial selection of favorable types.
In wheat, an ancestral wheat-like species with chromosomes A1, A1, A2, A2, etc., hybridized with a closely
related species with chromosomes B1, B1, B2, B2, etc.
The resulting plant is of type A1, B1, A2, B2, etc., and
is infertile because the A chromosomes cannot properly match at meiosis with the B chromosomes and no
balanced gametes can form. However, after doubling
of the chromosome number per cell, which is not an
uncommon event in plants, the plant is of type A1, A1,
A2, A2, B1, B1, B2, B2, . . . It is fully fertile because at
meiosis, the A chromosomes pair with A chromosomes and B chromosomes pair with B chromosomes,
so that each gamete has a complete set of each type.
Such a plant is called an `allotetraploid' or `amphidiploid.' It is a new species in the sense that it is genetically isolated from both ancestors, because meiosis in
crosses between the allotetraploid and the ancestors
produces chromosomally unbalanced gametes. It is
estimated that about 8000 years ago, a further new
species arose through the hybridization between the
Emmer wheat with the AABB genomes and a wild
wheat species with DD chromosomes. After doubling
of the ABD chromosomes, the present-day bread
wheat, Triticum aestivum, with genomes AA BB DD
resulted.
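The pairing argument in the wheat example can be made concrete with a toy model. The Python sketch below assumes, as a simplification, that a plant can form balanced gametes only if every chromosome has an identical partner to pair with; real pairing behavior is more complex. The chromosome labels follow the text, and the function names are illustrative.

```python
from collections import Counter

def can_pair(chromosomes):
    """Toy criterion: every chromosome type must be present an even number of times."""
    return all(count % 2 == 0 for count in Counter(chromosomes).values())

def double(chromosomes):
    """Doubling of the chromosome number per cell."""
    return sorted(chromosomes * 2)

hybrid = ["A1", "B1", "A2", "B2"]   # hybrid of the A and B species
allotetraploid = double(hybrid)     # A1, A1, A2, A2, B1, B1, B2, B2

# Cross back to the A ancestor: one gamete from each parent.
backcross = ["A1", "A2", "B1", "B2"] + ["A1", "A2"]

print(can_pair(hybrid))          # the A and B chromosomes have no partners
print(can_pair(allotetraploid))  # every chromosome has a partner
print(can_pair(backcross))       # the B chromosomes are left unpaired
```

Under this criterion the original hybrid and the backcross cannot pair all their chromosomes, while the doubled plant can, which mirrors the infertility, restored fertility, and genetic isolation described above.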
has very few genes other than the sex-determining
region (SRY). However, development is not entirely
normal in these cases. Unfortunately, the SRY of
humans is close to the pseudoautosomal region so
that occasionally, by accident, there is a crossover at
meiosis that transfers the SRY to the X chromosome.
The result is an XXy individual who expresses male
characteristics. By contrast, in mice, the SRY is far
away from the pseudoautosomal region, so these accidents are less frequent. There are many forms of sex
determination that are entirely different from the XX/
XY type, but they are beyond the scope of this article.
Further Reading
Meiotic Drive, Mouse
t Haplotypes
The best studied example of meiotic drive in the house
mouse is the t haplotype, which biases DNA transmission by disrupting spermatogenesis. t haplotypes are
a selfish form of chromosome 17 that are found in
natural populations of all subspecies of the house
mouse. They comprise a large 20 cM (centimorgan)
region, which is approximately the proximal third
of the chromosome. Within this region are a series of
four major nonoverlapping inversions which suppress
recombination across the region in +/t heterozygotes
so that t haplotypes are inherited as a single genetic
unit. t haplotypes show segregation bias in male mice
only. In +/t females, segregation is normal and offspring are produced in the expected Mendelian ratios.
In contrast, in +/t males, the t haplotype is transmitted
to over 90% of the offspring. This is known as transmission ratio distortion (TRD) and is a consequence
of the production of wild-type sperm that are functionally inactivated due to motility defects. Multiple
independent loci are involved in drive. Three to five
t complex distorter loci (Tcds) have been described.
These vary in strength and act additively on a single,
centrally located t complex responder (Tcr) locus to
produce the high transmission bias in favor of the
t haplotype. The mechanism by which this occurs is
still unclear and investigations are ongoing.
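The size of the distortion is easiest to see as expected offspring counts. The Python sketch below compares normal Mendelian transmission from a heterozygous (+/t) female with distorted transmission from a heterozygous male. The text gives the male figure only as "over 90%", so the value 0.9 used here is a lower-bound assumption, and the function name is illustrative.

```python
def expected_t_carriers(n_offspring, transmission):
    """Expected number of offspring that inherit the t haplotype from a +/t parent."""
    return n_offspring * transmission

MENDELIAN = 0.5   # +/t females: normal segregation
DISTORTED = 0.9   # +/t males: assumed lower bound for "over 90%"

for parent, rate in (("+/t female", MENDELIAN), ("+/t male", DISTORTED)):
    carriers = expected_t_carriers(100, rate)
    print(f"{parent}: {carriers:.0f} t : {100 - carriers:.0f} +")
```

Per 100 offspring, this gives an expected 50 : 50 split from the female and at least 90 : 10 in favor of t from the male.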
t haplotypes have not become fixed in natural
populations owing to several, strong counterbalancing
forces. All males that inherit two t haplotypes are
References
Meiotic Product
I Ruvinsky
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0810
Melanoma, Cytogenetic Studies
J Limon
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.1593
The majority of cytogenetic studies of malignant melanoma have been performed on human tumors; a few
studies have also been performed on transplantable
melanomas in rodents. Cytogenetic studies on the
human malignant melanoma have revealed that, as in
most human cancers, melanoma cells display acquired,
clonal chromosome aberrations. The most consistently observed numerical changes have been losses of
chromosomes 10 and 9, and gain of chromosome 7.
Among structural aberrations, the most common
have been del(6q) or other rearrangements, including
i(6)(p10), that lead to loss of 6q. Results from chromosome transfer experiments have provided functional
evidence for the presence of a tumor suppressor gene
on chromosome 6, which may be acting early in the
pathway of tumor formation. All aberrations mentioned above may play an important role in the tumorigenesis and development of malignant melanoma.
In addition, various abnormalities of chromosome 1,
often resulting in loss of 1p material, and, with a
lower frequency, abnormalities of chromosomes 7, 9
(mostly affecting 9p), 11, and 17 were observed. However, the cytogenetic pattern of cutaneous malignant
melanoma outlined above concerns predominantly
metastatic tumors, since only approximately 20%
of all abnormal malignant melanoma karyotypes
have been obtained from primary tumors. In general,
the karyotypes of metastatic melanoma are more
complex, with higher modal chromosomal numbers
and higher numbers of structural chromosome
abnormalities. It seems that rearrangements of chromosome 11 are later events in tumor progression,
and may represent an indicator for a less favorable
clinical outcome. An increase in number of chromosome 7, often accompanied by enhanced expression
Mendel, Gregor
G R Fraser
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0813
There are, of course, other examples of `prematurity' in scientific discovery, prematurity in this context
being defined by Stent as follows: ``A discovery is
premature if its implications cannot be connected by
a series of simple logical steps to canonical, or generally accepted, knowledge.'' There is a good case,
nevertheless, for arguing that Mendel's discovery
transcends these other instances, both in the quality
of its `prematurity' and in its importance which has led
to the passing of his name into everyday language in
the form of words such as mendelian and mendelism.
Although very few of Mendel's experimental notes
have survived, we know that between 1857 and 1863,
he investigated the laws of the origin and development
in Pisum of variable hybrids in connection with seven
pairs of traits. It is difficult to conceive how Mendel
could have had the prescience and good fortune to
have chosen just these traits in just this species,
whose study enabled him to demonstrate the basic
laws of heredity and to create clarity and order out
of the chaos which had long characterized this area of
biology. This degree of prescience seems to
border on the preternatural; however that may be, it is
a fact that Mendel was not able to repeat the results
which he obtained with Pisum in subsequent experiments involving several other plant species.
Mendel's insight was so profound that his concepts
of dominance and recessivity remain entirely valid
today. Thus, he denoted the round shape of the ripe
pea seeds as dominating over the angular wrinkled
shape which, temporarily receding from view in the
F1 hybrid generation and reappearing in a ratio of 1:3
in the F2 generation, he denoted as recessive. Among
the plants with round seeds of the F2 generation, he
showed a ratio of 2:1 if he differentiated in the F3
generation bred by self-fertilization between the
``meaning of the dominating trait as a hybrid (i.e.,
producing F3 plants with round and wrinkled seeds
in the ratio of 3:1) and as a parental (i.e., producing
only F3 plants with round seeds) trait.'' Thus, in his
analysis of this monofactorial experiment, as it came
to be called, he clearly appreciated the difference
between the appearance of the dominating trait, or
phenotype, and its hereditary basis, or genotype. As
a trained physicist, he commanded combinatorial
mathematics to an extent which enabled him to interpret the ratios obtained in his bi- and trifactorial
experiments, and to extrapolate these results in mathematical terms to general predictions involving n pairs
of factors.
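The ratios described above can be reproduced by simple enumeration. The Python sketch below assumes the standard reading of Mendel's scheme: for each pair of factors, self-fertilization of a hybrid (written Aa here) yields AA, Aa, and aa in the proportions 1:2:1, and pairs of factors combine independently. The symbols and function names are illustrative.

```python
from collections import Counter
from itertools import product

# Offspring of a self-fertilized hybrid "Aa" for one pair of factors: AA : Aa : aa = 1 : 2 : 1
SINGLE_PAIR = {"AA": 1, "Aa": 2, "aa": 1}

def selfed_offspring(n_pairs):
    """Weighted genotype counts among offspring of a self-fertilized n-factor hybrid."""
    counts = Counter()
    for combo in product(SINGLE_PAIR, repeat=n_pairs):
        weight = 1
        for genotype in combo:
            weight *= SINGLE_PAIR[genotype]
        counts[combo] = weight
    return counts

def phenotype_counts(genotype_counts):
    """Collapse genotypes to appearances: only 'aa' shows the recessive trait."""
    counts = Counter()
    for combo, weight in genotype_counts.items():
        appearance = tuple("recessive" if g == "aa" else "dominant" for g in combo)
        counts[appearance] += weight
    return counts

for n in (1, 2, 3):
    genotypes = selfed_offspring(n)
    phenotypes = phenotype_counts(genotypes)
    ratio = ":".join(str(v) for v in sorted(phenotypes.values(), reverse=True))
    print(f"n={n}: {len(genotypes)} genotypes, {len(phenotypes)} phenotypes, ratio {ratio}")
```

For one pair of factors the enumeration gives the 3:1 ratio of dominating to recessive appearance, and among the dominating class the hybrid (Aa) and parental (AA) forms occur 2:1, as in the text. For n pairs it gives 3^n genotypes and 2^n phenotypes among 4^n weighted combinations, for example 9:3:3:1 when n = 2.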
These arithmetical ratios, through which Mendel
demonstrated the particulate inheritance of traits in
the pea, seem in retrospect to be extremely simple.
However, this simplicity is deceptive, being apparent
only with the benefit of hindsight; no one had had an
As far as the lack of immediate successors is concerned, it would be an error to suppose this to have
been due to the inaccessibility of the published report
of Mendel's 1865 lectures. Mendel corresponded with
the leading scientists in the field, and sent a reprint of
his publication to the most prominent among them,
Nägeli, as well as describing his work to him in detail
in the course of an extensive correspondence over a
number of years. In fact, Mendel ordered 40 reprints
of his publication, and these reached colleagues all
over Europe; some of these reprints have come to
light for the first time relatively recently, often
uncut. In addition, the journal itself, Verhandlungen
des Naturforschenden Vereines (Brünn), was not an
obscure one, and it is known to have reached the
libraries of the Royal Society and the Linnean Society
in London, among many other academies, universities, and institutes throughout the world of learning.
Further Reading
Mendel's Laws
R Lewis
Copyright 2001 Academic Press
doi: 10.1006/rwgn.2001.0818