
STRING Database: Known and Predicted Protein-Protein Interactions (PPIs)
Mireia Calvo
John González
Carles Navarro
Bioinformatics, March 2014
Index
STRING Database

Database architecture

Annotation Statistics

Annotation format and example

Curation

Programmatic access

Access condition

Taken from http://www.nature.com/
STRING Database
Search Tool for the Retrieval of Interacting
Genes/Proteins (http://string-db.org/)
2000: Web-server and search tool
2003: + Predictome and COG, scoring scheme
2005: + Data sources + scores, associations across organisms
2007: STRING 7, AJAX-based web navigation
2009: STRING 8, URL-based programming interface (API)
2011: STRING 9.0, interactive network viewer, 3D structures, third-party resources
2013: STRING v9.1, automated mining, statistical enrichment
Database architecture
Primary databases: BIND, DIP, GRID, HPRD, IntAct and MINT
Metadatabase: PID
Prediction database: STRING v9.1
Curated data from Biocarta, BioCyc, GO, KEGG and Reactome
STRING v9.1
o Items: proteins, species, COGs
o Network: nodes, edges, scores
o Evidence: datasets, abstracts, predictions
o Homology: all-against-all BLAST

JSON (JavaScript Object Notation)
*.tsv
Annotation Statistics
Latest version: 9.1 (2013)

5,214,234 proteins

1133 organisms

>200 million interactions
http://upload.wikimedia.org/wikipedia/commons/b/ba/Tree_of_Living_Organisms_2.png
Annotation format and example
Browser: name, identifier or amino acid sequence of one or multiple proteins
Summary view: predicted associations, sorted by score
o Input gene
o Predicted Functional Partners
Selection of a gene: similar proteins
Selection of a score bullet: breakdown of the individual prediction method
Combined association score

Selected association: detailed evidence breakdown

Selected protein: detailed information
Network view
o Red: fusion evidence
o Green: neighborhood evidence
o Blue: co-occurrence evidence
o Purple: experimental evidence
o Yellow: textmining evidence
o Light blue: database evidence
o Black: coexpression evidence
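The combined association score integrates the per-channel evidence listed above into a single confidence value. As a minimal sketch, assuming the channels are treated as independent and each score lies in [0, 1], the combination described in the STRING publications can be written as 1 minus the product of (1 - score) over all channels. The function name, the channel keys and the example scores below are illustrative assumptions, and the prior correction applied by released STRING versions is deliberately left out.

```python
def combined_score(channel_scores):
    """Combine per-channel scores as 1 - prod(1 - s).

    Simplified sketch: assumes independent evidence channels and
    omits the prior correction used by released STRING versions.
    """
    remaining = 1.0
    for channel, score in channel_scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {channel!r} must be in [0, 1]")
        # Probability that this channel does NOT support the association.
        remaining *= 1.0 - score
    return 1.0 - remaining


# Hypothetical scores for one protein pair (not real STRING data).
example = {
    "neighborhood": 0.0,
    "fusion": 0.0,
    "cooccurrence": 0.0,
    "coexpression": 0.3,
    "experiments": 0.6,
    "databases": 0.0,
    "textmining": 0.5,
}

print(round(combined_score(example), 3))
```

A channel with score 0 leaves the result unchanged, so the combined value is never lower than the strongest individual channel.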
Selected protein
o Actions
o Information
o SMART domain
o Protein structure
o Identity
Neighborhood view
Runs of genes occurring repeatedly in close
neighborhood in genomes
Co-occurrence view
Linked proteins across species
Fusion view
Individual gene fusion events per species

Coexpression view
Genes coexpressed in the same or in
other species (transferred by homology)
Experiments view
Significant protein interaction datasets,
gathered from other protein-protein
interaction databases
Databases view
Significant protein interaction groups,
gathered from curated databases
Textmining view
Significant protein interaction groups,
extracted from the abstracts and scientific
literature

Curation
Data source
o Imported data (COG, Ensembl, Biocarta, KEGG, Swiss-Prot)
o Text mining
o Predicted data

Inside the COG database: manual curation
Outside the COG database: automatic curation


Programmatic access
Web-browser API
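The URL-based interface can be illustrated with a short sketch that builds a request URL and parses a tab-separated response. Several things here are assumptions rather than content from the slides: the base URL and the `network` method follow the current public STRING API, which may differ from the v9.1 interface; the helper names `build_request_url` and `parse_tsv` are hypothetical; and the sample response is made up for illustration. No network request is sent.

```python
import csv
import io
from urllib.parse import urlencode

# Assumption: base URL of the current public STRING API.
STRING_API_BASE = "https://string-db.org/api"


def build_request_url(method, identifiers, species=None, output_format="tsv"):
    """Build a URL of the form <base>/<format>/<method>?<parameters>."""
    # Multiple identifiers are separated by a carriage return,
    # which urlencode turns into %0D.
    params = {"identifiers": "\r".join(identifiers)}
    if species is not None:
        params["species"] = species  # NCBI taxonomy identifier
    return f"{STRING_API_BASE}/{output_format}/{method}?{urlencode(params)}"


def parse_tsv(text):
    """Parse a tab-separated response into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


url = build_request_url("network", ["TP53", "MDM2"], species=9606)
print(url)

# Made-up response body, used only to exercise the parser.
sample = "preferredName_A\tpreferredName_B\tscore\nTP53\tMDM2\t0.999\n"
for row in parse_tsv(sample):
    print(row["preferredName_A"], row["preferredName_B"], row["score"])
```

To query the live service, the generated URL could be passed to `urllib.request.urlopen` and the decoded body handed to `parse_tsv`; the column names returned by the server should be checked against the STRING API documentation.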
Access condition
Four types of licence
Commercial license
Academic license
Creative Commons Attribution 3.0 License
Creative Commons Attribution-Noncommercial-Share Alike 3.0 License
Thank you for your
attention
Questions?