Over the years the three-dimensional structures of a large number of proteins have been obtained, primarily through X-ray crystallographic techniques. By relating this known 3D-structural information to the primary structure of proteins, patterns have emerged that can be used for the prediction of 3D-structure based on the primary structure of the protein. From these predicted structures, insight into the function of the protein can be obtained.
Why are such predictive abilities useful? With the sequencing of the human and other genomes rapidly progressing, an important next step is to determine the functions of the proteins encoded by genes. In some cases genes are discovered by mapping mutations that cause some discernible phenotype, perhaps cancer or some developmental defect. By analyzing the structure of the proteins encoded by these genes, insight into the mechanism of the genetic defect can be obtained and, perhaps, insight into a treatment for it.
The links to the following web-based programs allow the user to search for matches to a query protein sequence in sequence databases, databases of known 3D structures (PDB files), and databases of recognized domains, folds and motifs. Other programs predict secondary structure and subcellular location. Still others attempt to predict overall 3D structure even when homology to existing PDB files is low. This is not an exhaustive list, but simply reflects links I have used for teaching and research purposes. For a more comprehensive list of options, try the listings at The CMS Molecular Biology Resource, ExPASy Proteomics Research Tools, Amos' WWW links page, and assorted tools at NCBI.
3D-Structure Visualization
3D structures are determined primarily using X-ray crystallography and NMR spectroscopy. Known 3D structures are usually cataloged as PDB files (*.pdb), which can be read as text, showing a header containing background information followed by coordinate data. They can also be opened by several programs that give a manipulable image of the protein. My favorites are the standalone program Rasmol and the related (and probably preferable) Netscape 4.7x plug-in Chime, both freeware and both easily downloaded and installed. Support and links for both are available at http://www.umass.edu/microbio/rasmol/.
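Because a PDB file is plain text with fixed-width columns, the header and coordinate data can also be read with a few lines of code. The following is a minimal sketch in Python, assuming the standard column layout for ATOM/HETATM records; the function name `parse_pdb` and the sample records (with made-up coordinates) are illustrative only and do not come from any real structure.

```python
def parse_pdb(pdb_text):
    """Split PDB text into header lines and a list of atom records."""
    header, atoms = [], []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            # Fixed-width fields: slices are 0-based, PDB columns are 1-based.
            atoms.append({
                "atom": line[12:16].strip(),     # columns 13-16: atom name
                "residue": line[17:20].strip(),  # columns 18-20: residue name
                "chain": line[21],               # column 22: chain identifier
                "res_num": int(line[22:26]),     # columns 23-26: residue number
                "x": float(line[30:38]),         # columns 31-38
                "y": float(line[38:46]),         # columns 39-46
                "z": float(line[46:54]),         # columns 47-54
            })
        elif record in ("HEADER", "TITLE", "COMPND"):
            header.append(line.rstrip())
    return header, atoms


# Illustrative records only; the coordinates are invented for this example.
SAMPLE = "\n".join([
    "HEADER    ILLUSTRATIVE RECORDS ONLY",
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504",
    "ATOM      2  CA  ALA A   1      11.639   6.071  -5.147",
])

header, atoms = parse_pdb(SAMPLE)
print(header)
for a in atoms:
    print(a["atom"], a["residue"], a["chain"], a["res_num"], a["x"], a["y"], a["z"])
```

Slicing by column rather than splitting on whitespace matters here, because adjacent fields in real PDB files can run together without a separating space.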
PDBLite
Search the PDB database for a specific macromolecular structure.
PDB to MultiGIF Page
This site will convert a PDB file to a rotating, animated GIF picture that can be incorporated into a web page or PowerPoint presentation.
Sequence Searches and Alignments
BLAST
BLAST (Basic Local Alignment Search Tool) is a program that allows similarity searches through the various nucleotide and protein databases. Separate programs are available for protein-protein, nucleotide-nucleotide, and protein-nucleotide searches.
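To give a feel for what a "local alignment" score is, here is a small Python sketch. It is not BLAST: BLAST is a fast heuristic that uses substitution matrices and statistical scoring, whereas this sketch uses plain Smith-Waterman dynamic programming with a simple match/mismatch/gap scheme. The function name and the scoring values are assumptions chosen for illustration.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score of a and b (Smith-Waterman, linear gaps)."""
    prev = [0] * (len(b) + 1)   # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0]              # column 0 is always 0
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A score never drops below 0, which is what makes the alignment local.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


# The shared stretch "CDEF" gives 4 matches x 2 points each.
print(local_alignment_score("ACDEFG", "XXCDEFYY"))  # 8
```

A database search conceptually repeats this comparison against every stored sequence and ranks the hits; BLAST gets its speed by only extending alignments around short exact or near-exact word matches.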
Secondary Structure Prediction
Domain Databases and Sequence Analysis
Simple Modular Architecture Research Tool (SMART)
Maintained by EMBL. Allows text searches for proteins containing combinations of domains.
ScanProsite
On the ExPASy Molecular Biology Server of the Swiss Institute of Bioinformatics.
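PROSITE patterns are written in a compact notation (for example, N-{P}-[ST]-{P} for the N-glycosylation site) that maps closely onto regular expressions. The sketch below, in Python, converts the common elements of that notation and scans a sequence. It is a simplification under stated assumptions, not how ScanProsite itself is implemented: the function names are hypothetical, the test sequence is made up, and rarer syntax (such as anchors inside square brackets) and PROSITE profiles are not handled.

```python
import re


def prosite_to_regex(pattern):
    """Translate common PROSITE pattern syntax into a Python regex string."""
    table = {
        "-": "",     # element separator
        "x": ".",    # any residue
        "{": "[^",   # {ABC}: any residue except A, B or C
        "}": "]",
        "(": "{",    # (n) or (n,m): repetition counts
        ")": "}",
        "<": "^",    # pattern anchored at the N-terminus
        ">": "$",    # pattern anchored at the C-terminus
    }
    return "".join(table.get(ch, ch) for ch in pattern.rstrip("."))


def find_motif(pattern, sequence):
    """Return (1-based position, matched text) for every hit, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


print(prosite_to_regex("N-{P}-[ST]-{P}"))               # N[^P][ST][^P]
print(find_motif("N-{P}-[ST]-{P}", "MKNGSAANPTQNVTL"))  # made-up sequence
```

The lookahead wrapper in `find_motif` is there so that overlapping matches are reported, which a plain `finditer` would skip.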
InterPro
European Bioinformatics Institute site that allows text and sequence searches of multiple databases, including TIGR, Swiss-Prot and SMART.
Conserved Domain Database (CDD)
Site maintained by NCBI. Text and sequence searches of the SMART and Pfam databases. Links to CDART (Conserved Domain Architecture Retrieval Tool) output listing proteins that contain a particular domain, with GenBank links.
Prediction of 3-Dimensional Structure
3D-PSSM
This multifaceted program may give you some insights into the structure and function of your protein of interest that other programs don't. You enter the sequence of your protein as well as keywords describing it. The program then:
Performs a BLAST search of protein structure databases for PDB structure files
Performs a protein database search (BLAST) to find sequence alignments. Results are in a nonlinked FASTA format, so you may prefer the NCBI version of protein BLAST search.
Searches for PROSITE motifs
Performs a secondary structure analysis
Performs a modified BLAST for PDB structure files that also includes structure/function keywords
If there is an existing PDB structure file for your protein, the program will find it in the initial search and present it in the results table showing structural alignments. If a PDB structure does not exist, it will use a BLAST search to find PDB files that have some homology to your protein, and will then create a model based on that structure. Tabulated results have links to SCOP (Structural Classification of Proteins), descriptions of the relevant protein family and fold class, and a link to a PDB file, which can be opened using Rasmol or Chime, that superimposes your sequence on the 3D model.
Subcellular Localization
PSORT
The subcellular or extracellular location of your protein of interest can provide insight into the function of the protein. For example, the presence of a nuclear localization sequence (NLS) might support a hypothesized function as a transcription factor.
This program analyzes the sequence of your protein for signal sequences and sequences characteristic of the targeting of proteins to the nucleus, chloroplast, mitochondria, peroxisomes, etc. Different programs are available for plant/bacterial proteins and for animal/yeast proteins.
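As a rough illustration of sequence-based targeting signals, the Python sketch below flags runs of basic residues, the hallmark of classical monopartite nuclear localization sequences. This is a deliberately crude stand-in and not PSORT's actual rule set: the function name, the "four or more consecutive K/R" threshold, and the query sequence are assumptions for illustration. The query embeds the well-known SV40 large T antigen NLS (PKKKRKV) in otherwise made-up flanking residues.

```python
import re


def find_basic_clusters(sequence, min_len=4):
    """Return (1-based position, run) for each run of >= min_len Lys/Arg residues."""
    pattern = re.compile("[KR]{%d,}" % min_len)
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence)]


# Made-up sequence containing the SV40 large T antigen NLS, PKKKRKV.
print(find_basic_clusters("MAPKKKRKVED"))  # [(4, 'KKKRK')]
```

A hit from a rule this simple is only a hint; real predictors combine several signals and weigh them against the rest of the sequence.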
Physical Characteristics of Polypeptides
ExPASy ProtParam Tool
Calculates the pI, extinction coefficient, and amino acid composition from a polypeptide sequence.
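Two of these quantities are simple enough to sketch by hand. The Python example below counts amino acid composition and estimates the molar extinction coefficient at 280 nm from the numbers of Trp, Tyr and cystine residues, using the commonly cited per-residue values of 5500, 1490 and 125 M^-1 cm^-1. The function names are hypothetical, the test sequence is made up, and whether the tool uses exactly these values should be checked against its own documentation. The pI calculation is omitted because it depends on a chosen set of pKa values.

```python
from collections import Counter


def composition(sequence):
    """Return {residue: (count, percent of total)} for a one-letter sequence."""
    counts = Counter(sequence)
    total = len(sequence)
    return {aa: (n, 100.0 * n / total) for aa, n in sorted(counts.items())}


def extinction_coefficient(sequence, cystines=0):
    """Estimated molar extinction coefficient at 280 nm, in M^-1 cm^-1.

    cystines is the number of disulfide bonds; it cannot be read from the
    sequence alone, so it defaults to 0 (all Cys assumed reduced).
    """
    counts = Counter(sequence)
    return 5500 * counts["W"] + 1490 * counts["Y"] + 125 * cystines


seq = "MKWVTFYW"  # made-up peptide: 2 Trp, 1 Tyr
print(composition(seq))
print(extinction_coefficient(seq))  # 2*5500 + 1*1490 = 12490
```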
ExPASy PeptideMass
Calculates the masses of peptides resulting from proteolytic digests.
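The idea behind such a calculation can be sketched in Python as follows. The sketch assumes a trypsin-like rule (cleave after K or R unless the next residue is P), no missed cleavages, no modifications, and monoisotopic masses of neutral peptides; the function names and the input sequence are made up for illustration, and the real tool offers many more enzymes and options.

```python
# Monoisotopic residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_digest(sequence):
    """Cleave after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        next_aa = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


for pep in tryptic_digest("AKRPGKLM"):  # made-up sequence
    print(pep, round(peptide_mass(pep), 4))
```

For the made-up input, the digest yields AK, RPGK and LM: the R-P bond is left intact because of the proline rule.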
Free-Standing Programs
ANTHEPROT (Analyze The Proteins)
This is a nice, free, downloadable program that does secondary structure predictions, helical wheels, and multiple alignments.
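For readers unfamiliar with helical wheels: the plot places each residue around a circle as seen looking down the helix axis. An ideal alpha helix has about 3.6 residues per turn, so successive residues are offset by 100 degrees. The Python sketch below computes those angles; it illustrates the geometry only, is not ANTHEPROT's code, and the function name and peptide are made up.

```python
def helical_wheel_angles(sequence, degrees_per_residue=100.0):
    """Return (residue, angle in degrees) pairs for a helical wheel projection."""
    return [(aa, (i * degrees_per_residue) % 360.0)
            for i, aa in enumerate(sequence)]


# Made-up peptide; residues with similar angles lie on the same face of the helix.
for aa, angle in helical_wheel_angles("LKELLKKL"):
    print(aa, angle)
```

Residues three or four positions apart land within 60 degrees of each other, which is why hydrophobic residues spaced that way form one face of an amphipathic helix.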