
Chapter 5: Introduction to Proteins: The Primary Level of Protein Structure

- Protein function is determined by protein structure, which is determined, in turn, by the
structures and properties of the various amino acids that make up the protein.

5.1: Amino Acids

A. Structure of the alpha-amino acids


- Proteins are polymers. The monomers that make up proteins are the alpha amino acids.
- They are called alpha amino acids because the amino group (-NH2) is attached to the alpha
carbon (the carbon next to the COOH group)
- The alpha carbon of every amino acid is attached to:
o An amino group
o A carboxyl group
o A hydrogen atom
o A side chain (R group)
- Different alpha amino acids are distinguished by their side chains
o The pKa of the COOH is about 2
o The pKa of the alpha amino group is about 10
o At physiological pH (about 7) the carboxylic acid group and the alpha
amino group will both be ionized to yield the zwitterion form.
- There are 20 common kinds of amino acids, each with a different R group.
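
A rough check on the zwitterion claim above, using the Henderson-Hasselbalch relation with the approximate pKa values quoted in these notes (2 for the carboxyl group, 10 for the alpha amino group). The function name is illustrative only, and real pKa values vary somewhat between amino acids.

```python
def fraction_deprotonated(pH: float, pKa: float) -> float:
    """Fraction of an ionizable group in its deprotonated (conjugate base)
    form, from the Henderson-Hasselbalch relation."""
    return 1.0 / (1.0 + 10 ** (pKa - pH))


pH = 7.0
# Carboxyl group: deprotonated form is -COO(-)
carboxylate = fraction_deprotonated(pH, 2.0)
# Amino group: the charged form is the protonated one, -NH3(+)
ammonium = 1.0 - fraction_deprotonated(pH, 10.0)

print(f"fraction present as -COO(-) at pH 7: {carboxylate:.5f}")
print(f"fraction present as -NH3(+) at pH 7: {ammonium:.5f}")
```

Both fractions come out very close to 1, which is why the doubly ionized (zwitterion) form dominates near neutral pH.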

B. Stereochemistry of the alpha amino acids


- Amino acids are asymmetrical and this eventually causes asymmetry in the entire protein =
great specificity with regards to structure and function
- Thus, we must understand stereochemistry

- Central alpha carbon is sp3 hybridized, so all alpha amino acids have a tetrahedral
geometry.
- All amino acids, except glycine (because it has two H atoms on the alpha carbon), are chiral
o The stereocenter is at the alpha carbon
o The stereochemistry is designated D- or L-
o There is preference for L-amino acids in proteins.
o In a Fischer projection (COOH at the top, R group at the bottom), L-amino acids have the amino group on the left.
o Isoleucine and threonine have an additional stereocenter at the beta carbon.
- Stereochemistry of amino acids promotes the formation of the secondary structure of proteins
(alpha helix and beta sheet), which, in turn, is a determinant of the overall structure of the
protein, and the overall structure determines function.
C. Properties of Amino Acid Side Chains: Classes of alpha-amino acids
- Amino Acids with Nonpolar Aliphatic Side Chains
o Glycine, alanine, valine, leucine, and isoleucine
o They all have aliphatic side chains. Aliphatic just means they are open chains (straight
or branched) as opposed to aromatic rings
o Aliphatic side chains are hydrophobic because they are made of chains of
hydrocarbons.
o The more hydrophobic amino acids, such as isoleucine, are usually found within the
core of a protein molecule, where they are shielded from water.

o Glycine is a special case


Glycine is very small and is often found on the surface of proteins, because its
small size allows it to form tight turns in proteins.
It is still hydrophobic, but the fact that it is small allows it to be
on the surface instead of buried within the protein.

- Other Nonpolar Amino Acids


o Proline and Methionine
o Proline
Only amino acid in this group in which the side chain forms a covalent bond
with the alpha amino group (ie, a secondary amine ring)
The proline side chain has a primarily aliphatic character, but, like glycine, it
is frequently found on the surface of proteins because the rigid ring of proline
is well suited for turns.
o Methionine
Not aliphatic
Sulfur has nearly the same electronegativity as carbon, so the thioether side chain is
quite hydrophobic instead of hydrophilic (polar)

- Amino Acids with Nonpolar Aromatic Side Chains


o Phenylalanine, tyrosine, and tryptophan
o Phenylalanine
Very hydrophobic because its side chain is an aromatic hydrocarbon ring
o Tyrosine
Hydrophobic, but this is tempered by the polar group (-OH) in its side chain
It can also ionize (be deprotonated) at high pH
Absorbs UV light at 280 nm
o Tryptophan
Hydrophobic, but this is tempered by the polar group (the indole N-H) in its side chain
Absorbs UV light at 280 nm

- Amino Acids with Polar Side Chains


o Serine, cysteine, threonine, asparagine, and glutamine
o All have polar side chains that can form multiple H bonds with water molecules
and/or other good H bond donors and acceptors.
o As such, they are found on the surface of proteins where they can interact with
aqueous environment.
o The OH group of serine and the SH group of cysteine are good nucleophiles and
often play key roles in enzyme activity
o Cysteine
The thiol (-SH) has a pKa = 8.3 and ionizes at moderately high pH
The oxidation of two cysteine side chains yields a disulfide bond (cystine).
Disulfide bonds stabilize the active structure of a protein that contains them
o Asparagine and Glutamine
Have uncharged polar (amide) side chains
Hydrophilic and tend to be on surface of protein in contact with water

- Amino Acids with Positively Charged (Basic) Side Chains. (ALSO POLAR)
o Histidine, lysine and arginine
o Lysine (side chain pKa = 10) and arginine (pKa = 12.5) are the more basic amino
acids (histidine side chain pKa = 6)
o Because lysine and arginine are so basic, their side chains are almost always
positively charged at physiological pH.
o Arginine has a guanidino group and is the most basic amino acid due to the
resonance stabilization of the protonated side chain.
o Unique behavior of histidine
The imidazole ring in histidine loses its proton at about pH 6.
When incorporated into proteins, its pKa ranges from 6.5-7.4
The value of pKa for an ionizable side chain is sensitive to proximity of other
charged groups
Because the histidine side chain has a pKa near physiological pH, it is often
involved in proton transfers during enzymatic catalysis.
o Basic amino acids are strongly polar so they are usually found on the surface of
proteins or in substrate binding clefts.

- Amino Acids with Negatively Charged (Acidic) Side Chains (ALSO POLAR)
o Aspartic acid (pKa = 3.9) and glutamic acid (pKa=4.2)
o The side chain pKa values are so low that the negatively charged form of the side
chain typically predominates under physiological conditions, even when they are
incorporated into protein
o Therefore we usually refer to them as aspartate and glutamate (the conjugate bases
rather than the acids)
o They are very hydrophilic and tend to be on the surface of a protein.
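
To make the charge statements in this section concrete, the sketch below estimates what fraction of each ionizable side chain carries its charge at pH 7.4, using the side chain pKa values listed in these notes. The pH of 7.4, the dictionary layout, and the function name are assumptions for illustration; as noted above, pKa values shift inside real proteins.

```python
# (side chain pKa from the notes, sign of the charge when ionized)
SIDE_CHAINS = {
    "His": (6.0, +1),
    "Lys": (10.0, +1),
    "Arg": (12.5, +1),
    "Asp": (3.9, -1),
    "Glu": (4.2, -1),
    "Cys": (8.3, -1),
}


def charged_fraction(pH: float, pKa: float, sign: int) -> float:
    """Fraction of the group in its charged form (Henderson-Hasselbalch).

    Basic groups (+1) are charged when protonated;
    acidic groups (-1) are charged when deprotonated.
    """
    if sign > 0:
        return 1.0 / (1.0 + 10 ** (pH - pKa))
    return 1.0 / (1.0 + 10 ** (pKa - pH))


pH = 7.4  # assumed physiological pH
for name, (pKa, sign) in SIDE_CHAINS.items():
    frac = charged_fraction(pH, pKa, sign)
    label = "+" if sign > 0 else "-"
    print(f"{name}: {frac:6.1%} carries a {label} charge at pH {pH}")
```

With these inputs, Lys, Arg, Asp, and Glu are almost fully charged, while His and Cys are mostly neutral, which matches the special status the notes give to histidine.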

D. Rare Genetically Encoded Amino Acids


- Found in nature: selenocysteine and pyrrolysine
- Encoded genetically and incorporated into proteins
- Found in a small number of proteins (mainly in archaea and eubacteria)

E. Modified Amino Acids


- Amino acids can undergo post-translational modification, resulting in modified amino acids
with unique properties.
5.2: Peptides and the Peptide Bond

- The peptide (amide) bond


o Amino acids can be covalently linked together by formation of an amide bond between the
alpha-carboxylic acid group on one amino acid and the alpha-amino group on another.
This involves loss of water (dehydration/condensation)
o The products formed by such linkage are called peptides
o A peptide with 2 amino acids is a dipeptide; a peptide with 4 amino acids is a tetrapeptide.
- N- and C- termini
o The end of the peptide that carries an amino group is called the N-terminus
o The other end, which carries a carboxylate group, is called the C-terminus
o The portion of each amino acid that remains in the chain is called an amino acid residue
o When specifying an amino acid residue in a peptide, the suffix -yl is used.

A. Structure of the Amide bond


- The amide carbonyl and the amide N-H bonds are nearly parallel
- The atoms from the first alpha carbon to the second alpha carbon are coplanar
- There is little twisting possible around the peptide bond because the bond has a
substantial fraction of double-bond character

- A peptide bond exists in two possible configurations, trans and cis


o These are related by rotation around the carbonyl C-N bond
o The trans is usually favored
Why? In the cis configuration the R groups on adjacent alpha carbons can
experience steric interactions
o The major exception to the trans rule is an amide bond that is X-Pro, where X is any
other amino acid.
o For X-Pro bonds, cis is sometimes allowed, but trans is still favored about 4:1

B. Stability and Formation of the Peptide Bond


- Peptide bond formation (a condensation/dehydration reaction) has a free energy change of
about +10 kJ/mol, so it is thermodynamically unfavorable.
- What IS favored is the hydrolysis of a peptide bond. Although it is favored, it occurs very
slowly at physiological pH and temperature.
- But there are ways we can hydrolyze peptide bonds:
o Strong mineral acid (6 M HCl) cleaves all peptide bonds (including the Asn and the
Gln side chain amide bonds)
o Particular chemical reagents that cleave at specific sites (e.g., CNBr cleaves at Met)
o Proteolytic enzymes (proteases) that cleave at specific sites
- The action of some proteolytic enzymes and reagents
o Know trypsin, chymotrypsin, carboxypeptidase A, and cyanogen bromide (a chemical reagent, not an enzyme)
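
As a rough illustration of what a free energy change of about +10 kJ/mol means, the sketch below converts it into an equilibrium constant with K = exp(-ΔG/RT). The temperature of 310 K (about 37 °C) is an assumption, and the calculation ignores concentration and standard-state details.

```python
import math

R = 8.314                  # gas constant, J/(mol*K)
T = 310.0                  # ASSUMED temperature, about 37 C, in kelvin
dG_formation = 10_000.0    # J/mol, the +10 kJ/mol quoted in the notes

# Equilibrium constant for peptide bond formation (condensation)
K_formation = math.exp(-dG_formation / (R * T))
# Hydrolysis is the reverse reaction, so its constant is the reciprocal
K_hydrolysis = 1.0 / K_formation

print(f"K (formation):  {K_formation:.3f}")
print(f"K (hydrolysis): {K_hydrolysis:.1f}")
```

K for formation comes out at roughly 0.02, so equilibrium lies on the side of hydrolysis. That is a thermodynamic statement only; as noted above, the uncatalyzed reaction is still very slow.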

C. Peptides
- Always write the N-terminal residue to the left and the C-terminal residue to the right
- In the structure of a peptide, we distinguish between the side chains and the main chain (or
peptide backbone), which is composed of the atoms that make up the peptide bonds, namely
the alpha-NH, the alpha-C, and the alpha-C=O groups of each amino acid residue in the peptide
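
Using the N-terminus-to-C-terminus writing convention, site-specific cleavage can be sketched as simple string splitting. The notes do not list the cleavage rules, so the ones used here are the standard textbook simplifications and should be treated as assumptions: trypsin cuts on the C-terminal side of Lys (K) and Arg (R), and CNBr cuts on the C-terminal side of Met (M). The peptide sequence is hypothetical, and real exceptions (such as a following Pro) are ignored.

```python
def cleave_after(sequence: str, residues: str) -> list[str]:
    """Split a peptide (one-letter codes, written N- to C-terminus)
    on the C-terminal side of every residue found in `residues`."""
    fragments = []
    start = 0
    for i, aa in enumerate(sequence):
        # Do not cut after the final residue: there is no bond to cleave
        if aa in residues and i < len(sequence) - 1:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    fragments.append(sequence[start:])
    return fragments


peptide = "GAKFMLRWS"  # hypothetical sequence for illustration

print("trypsin:", cleave_after(peptide, "KR"))
print("CNBr:   ", cleave_after(peptide, "M"))
```

Comparing fragment sets produced by two different reagents is the basic idea behind ordering fragments by overlap.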

D. Local Electrostatic Effects on Side Chain pKa Values (Polypeptides as Polyampholytes)


- In addition to the free amino group at the N-terminus and the free carboxylate group at the C-
terminus, polypeptides usually contain some amino acids that have ionizable groups on their
side chains
- These groups are all weakly acidic or weakly basic
- Amino acid side chains display a range of pKa values in different proteins due to differences
in the local electrostatic environment. Example:
- To illustrate how ionization of side chains affects the molecular charge of a protein, consider
the titration of the following tetrapeptide.

- So we start off at pH= 0 (very acidic)


o This pH is below the pKa of any of the ionizable groups (the -NH3+ and the -COOH groups)
o So all the ionizable groups will be in the protonated form
o Overall charge of the peptide is +2 at this point
- If we raise pH (by titrating with NaOH), then the various ionizable groups will lose protons
at pH values in the vicinity of their pKa.
o Near pH 1.8 (its pKa), the COOH on the right loses its proton
o Near pH 4.2 (its pKa), the COOH on the left loses its proton
- As the pH of the solution increases, more groups become deprotonated; thus, the net positive
charge decreases and passes through zero at the isoelectric point (pI = 6.0)
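
The titration described above can be sketched numerically. The two carboxyl pKa values (1.8 and 4.2) come from the notes; the two amino group pKa values (7.8 and 10.0) are assumptions chosen for illustration, since the original figure with the tetrapeptide is not available here. The function names are also illustrative.

```python
def net_charge(pH, acidic_pKas, basic_pKas):
    """Average net charge from Henderson-Hasselbalch fractions.

    Acidic groups contribute -1 when deprotonated;
    basic groups contribute +1 when protonated.
    """
    negative = sum(1.0 / (1.0 + 10 ** (pKa - pH)) for pKa in acidic_pKas)
    positive = sum(1.0 / (1.0 + 10 ** (pH - pKa)) for pKa in basic_pKas)
    return positive - negative


def isoelectric_point(acidic_pKas, basic_pKas, lo=0.0, hi=14.0):
    """Find the pH of zero net charge by bisection.

    Works because net charge only decreases as pH rises.
    """
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if net_charge(mid, acidic_pKas, basic_pKas) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


acids = [1.8, 4.2]    # carboxyl pKa values from the notes
bases = [7.8, 10.0]   # ASSUMED amino group pKa values (not from the notes)

print(f"net charge at pH 0: {net_charge(0.0, acids, bases):+.2f}")
print(f"estimated pI:       {isoelectric_point(acids, bases):.2f}")
```

With these inputs the net charge starts near +2 at pH 0 and crosses zero close to pH 6, midway between the pKa values on either side of the neutral form.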

5.3: Proteins: Polypeptides of Defined Sequence


- Every protein has a unique, defined amino acid sequence, its primary structure
- Over extended periods of time, proteins evolve, and such evolution is reflected in changes in
their amino acid sequences.
o Conservative changes conserve the chemical properties and/or size of the side chain
(ex: exchanging Asp for Glu)
o Nonconservative changes do not conserve them (ex: exchanging Asp for Ala)
- Some proteins contain two or more polypeptide chains held together by noncovalent
interactions (e.g., hemoglobin) or by covalent disulfide bonds (e.g., insulin)
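
One simple way to label a substitution as conservative or nonconservative is to ask whether both residues fall in the same side chain class from section 5.1. The sketch below does exactly that; treating "same class" as "conservative" is a simplifying assumption (size is ignored), and the names are illustrative.

```python
# Side chain classes as grouped in section 5.1 (one-letter codes)
CLASSES = {
    "nonpolar aliphatic": "GAVLI",
    "other nonpolar": "PM",
    "aromatic": "FYW",
    "polar uncharged": "SCTNQ",
    "basic": "HKR",
    "acidic": "DE",
}

# Invert to a lookup table: residue -> class name
SIDE_CHAIN_CLASS = {
    aa: name for name, letters in CLASSES.items() for aa in letters
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues belong to the same side chain class."""
    return SIDE_CHAIN_CLASS[original] == SIDE_CHAIN_CLASS[replacement]


print("Asp -> Glu conservative?", is_conservative("D", "E"))
print("Asp -> Ala conservative?", is_conservative("D", "A"))
```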
5.4: From Gene to Protein
A. The Genetic Code
- Triplets of nucleotides (codons) are used to code for each amino acid, allowing 64 different
combinations, which is more than enough to code for 20 amino acids.
- Some amino acids have multiple codons
- The genetic code specifies RNA triplets that correspond to each amino acid residue.
- Translation of mRNA into protein sequence begins with an AUG codon which encodes Met
in eukaryotes and fMet in prokaryotes and eukaryotic organelles.
o AUG = START
o UAA, UAG, UGA = STOP
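
A minimal sketch of translation using the standard genetic code. The 64-letter string lists the amino acid (or `*` for STOP) for every codon, with bases ordered U, C, A, G at each position. The mRNA string is hypothetical, and the function ignores details such as fMet and reading-frame ambiguity.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base U
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG until a STOP codon or the end."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":        # UAA, UAG, or UGA
            break
        protein.append(aa)
    return "".join(protein)


print(len(CODON_TABLE), "codons in the table")
print(translate("GGAUGGCUAAAUAAGC"))  # hypothetical mRNA
```

Counting the entries per amino acid in `CODON_TABLE` shows directly which amino acids have multiple codons.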

B. Post-translational Processing of Proteins


- After the ribosome forms the polypeptide chain, the chain has to fold into the correct 3D
structure and may form disulfide bonds.
- One modification involves specific proteolytic cleavage, which shortens the length of the
peptide chain.
- One example of a polypeptide that undergoes modification is insulin
o When it leaves the ribosome it is a long polypeptide called preproinsulin.
o The residues at the N-terminus of preproinsulin serve as a signal peptide (leader
sequence) and essentially bring it to cellular machinery that will transport it across
hydrophobic cell membranes
o The leader sequence is then cut off by a specific protease, leaving proinsulin.
o Proinsulin folds into a stable conformation allowing disulfide bonds to form.
o The connecting sequence is then cut out by protease, yielding the finished insulin.

- Because proinsulin is not an active hormone, it can be produced and stored in the pancreas at
high concentrations, whereas similar high levels of active insulin would be toxic.
5.5: From Gene Sequence to Protein Function
- The degree of similarity between protein sequences is determined from a procedure called
alignment.
- Primary sequence analysis can be used to predict protein function because similarities in
aligned sequences are correlated with similarities in protein structure and function
- With at least 25% amino acid sequence identity, two proteins will have similar structure, and
very likely similar function.
- Gene sequence analysis can be used to identify protein variants associated with human
disease or resistance to certain treatment
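
Percent identity can be computed once two sequences have been aligned. The sketch below assumes the alignment already exists (equal-length strings with `-` for gaps) and counts identity over columns where neither sequence has a gap. That choice of denominator is one convention among several, and the sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent of ungapped aligned columns with identical residues."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue          # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical pre-aligned sequences
a = "MKT-AYIAK"
b = "MKSQAYLAK"
print(f"{percent_identity(a, b):.1f}% identity")
```

A result well above the roughly 25% threshold mentioned above would suggest similar structure, though the alignment length also matters in practice.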

5.6: Protein Sequence Homology


- Protein sequence similarity is also used to map evolutionary relationships between
organisms.
- Protein sequences are classified as homologous in cases where any sequence similarity is
thought to be the result of common evolutionary ancestry.
- If sequence similarity arises via convergent evolution instead, the two protein sequences
would be similar but not homologous.
- For organisms that have a common ancestor, a great degree of protein sequence homology
indicates a closer evolutionary relationship, whereas a lesser degree of homology indicates
greater divergence.

- In a sequence logo, the type size used to represent the one-letter code of the amino acid is
correlated with the relative frequency of that amino acid at a given position.
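
The per-position relative frequencies behind a sequence logo can be tallied as sketched below. The aligned sequences are hypothetical. Many logo tools additionally scale letter heights by information content, which this sketch does not attempt.

```python
from collections import Counter


def column_frequencies(aligned: list[str]) -> list[dict[str, float]]:
    """Relative frequency of each residue at each aligned position."""
    n = len(aligned)
    return [
        {aa: count / n for aa, count in Counter(column).items()}
        for column in zip(*aligned)   # iterate over alignment columns
    ]


# Hypothetical aligned sequences of equal length
sequences = ["GDSL", "GDTL", "GESL", "GDSI"]

for position, freqs in enumerate(column_frequencies(sequences), start=1):
    print(position, freqs)
```

A position where one residue has frequency 1.0 (here the first column) would be drawn with a single full-size letter, marking it as strongly conserved.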

Tools of Biochemistry 5A: Protein Expression and Purification


- To determine structure and/or functional properties of a specific protein, it is necessary to
separate that protein from the other biomolecules in the cell.
- This can be done by increasing the concentration of the desired protein within the cell and
exploiting specific interactions between the protein and materials used in the purification
process.

A. Recombinant Protein Expression


- So to separate the protein, we usually need to increase its abundance.
- Recombinant protein expression allows researchers to produce proteins of interest at
relatively high concentrations within cells and also enables the production of site-directed
mutants, which are protein variants with designed amino acid sequence alterations.

- Basically, we know that the amino acid sequence for a protein is determined by the sequence
of DNA in the gene that codes that protein.
- We also know that bacteria have pieces of extrachromosomal DNA called plasmids.
Plasmids are capable of autonomous replication in a bacterial cell.
- This is good because what scientists can do is cut open that plasmid at a desired spot, and
splice in a gene encoding the protein they want to make more of.
- E. coli is a bacterium into which we can insert plasmids (expression vectors), which are on
the order of 2-10 kilobases in length.
- A selection marker is usually a protein that confers resistance to an antibiotic that is
included in the cell growth medium
o Only those cells that have taken up the vector, and are thereby capable of expressing
the desired protein, will survive in the growth medium.

B. The Purification Process


- Now that we've increased the concentration of the protein, we need to be able to separate it
from all the other cellular components.
- Purification begins with lysis of the cell.
- This is followed by centrifugation to remove unbroken cells and insoluble cell parts (e.g.
membranes) to yield an extract, called the cell lysate, which contains the soluble proteins and
other biomolecules in the cell.
- Then to finally purify it we have three options:
o (1) Affinity Chromatography
o (2) Ion Exchange Chromatography
o (3) Size-Exclusion Chromatography
- Chromatographic purification is the result of interactions between the proteins in the cell
lysate and the matrix within the column.
o The more strongly a protein interacts with the matrix, the later it will elute from the
column.
- Affinity Chromatography
- Ion Exchange Chromatography

- Size Exclusion Chromatography
