Conformational Likeness - Tutorial
Last Update August 19, 1997
Overview
Methodology
Examples
Comparison to Other Methods
Conformational Likeness (CL) is a methodology for finding proteins with common 3-D motifs, where a motif is a complete polypeptide chain or a fragment of a polypeptide chain. Motifs are defined by a set of geometric-, physicochemical- and sequence-based properties. CL can be used in one of two modes:
The methodology is differentiated from other methods by the variety of properties that can be used in the alignment and the speed with which it operates. The speed makes it Web accessible.
Step 1 Define likeness profiles for each structure in the PDB based upon pentamer values of:
A total of 495 properties are stored.
Step 2 Characterize these properties in a way that makes biological sense when querying:
For efficiency, each query will use a subset of properties from the 7 groups.
At this point you have a conformational likeness database for all protein structures in the PDB. Our goal is to update this nightly so that it remains current with the primary source of structural data.
There are now two options:
Step 3 (for detailed one-on-one comparison)
Step 3a Calculate partial differences - done differently for different properties
Step 3b Convert partial differences to an absolute scale - a profile - by comparison to random pentamers.
Step 3c Use dynamic programming to align profiles
Step 3d Display the results as:
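As an illustration of Steps 3a to 3c only (not of the display formats in Step 3d), the following is a minimal sketch under stated assumptions. The names `likeness` and `align_profiles`, the `gap` and `baseline` parameters, and the sample numbers are hypothetical and do not come from CL itself. The sketch assumes a single scalar property per pentamer, uses an absolute difference as the partial difference, scales it by its rank among differences observed for random pentamer pairs, and aligns with a plain global (Needleman-Wunsch style) dynamic program. The tutorial does not say which dynamic programming variant CL uses.

```python
import bisect


def likeness(diff, random_diffs_sorted):
    """Step 3b stand-in: fraction of random pentamer pairs that differ MORE than diff.

    random_diffs_sorted is an ascending list of differences between random
    pentamers (hypothetical reference data). Returns a value in [0, 1].
    """
    rank = bisect.bisect_right(random_diffs_sorted, diff)
    return 1.0 - rank / len(random_diffs_sorted)


def align_profiles(a, b, random_diffs_sorted, gap=-0.5, baseline=0.5):
    """Step 3c stand-in: global alignment of two per-pentamer property lists.

    A pair scores (likeness - baseline), so pairs that are no more alike than
    typical random pentamers are penalized. gap and baseline are assumed values.
    Returns (total score, list of aligned index pairs).
    """
    def pair_score(i, j):
        # Step 3a stand-in: partial difference of one scalar property.
        return likeness(abs(a[i] - b[j]), random_diffs_sorted) - baseline

    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i - 1, j - 1),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the corner to recover which positions were paired.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if score[i][j] == score[i - 1][j - 1] + pair_score(i - 1, j - 1):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


# Hypothetical data: differences seen between random pentamers, two profiles.
random_diffs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
total, pairs = align_profiles([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 9.0, 3.0, 4.0], random_diffs)
print(total, pairs)
```

In this toy case the extra value 9.0 in the second profile is skipped with a single gap, and the remaining four positions pair up one-to-one.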
Step 4 (for comparison against the whole database)
The methodology used depends on whether you are searching for a fragment or a complete polypeptide chain. For meaningful results (i.e. not too many and not too few hits), fragments demand a more exact match than whole polypeptide chains.
Step 4a (for protein fragments)
A fast search is performed for fragments of the same size where the likeness exceeds a user-defined likeness threshold between 0 and 1. This is performed on a subset of properties.
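The idea of a same-size fragment search with a likeness threshold can be sketched as a sliding window. This is an illustration only: the function name `fragment_hits`, the `scale` parameter, and the per-position likeness formula (1 minus a scaled absolute difference, clipped at 0) are assumptions made for the example, not CL's actual scoring.

```python
def fragment_hits(query, chain, threshold, scale=1.0):
    """Return (start, likeness) for windows of chain that resemble query.

    query and chain are lists holding one scalar property value per pentamer
    (a hypothetical simplification). Each window has the same length as the
    query; a window is reported when its mean likeness exceeds threshold.
    Hits are sorted from most to least alike.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie between 0 and 1")
    k = len(query)
    if k == 0:
        raise ValueError("query must not be empty")

    hits = []
    for start in range(len(chain) - k + 1):
        window = chain[start:start + k]
        # Assumed per-position likeness: 1 for identical values, falling
        # linearly to 0 once the values differ by `scale` or more.
        per_position = [max(0.0, 1.0 - abs(q - c) / scale)
                        for q, c in zip(query, window)]
        mean_likeness = sum(per_position) / k
        if mean_likeness > threshold:
            hits.append((start, mean_likeness))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical values: the query pattern occurs once, starting at index 1.
hits = fragment_hits([0.0, 1.0, 0.0], [5.0, 0.0, 1.0, 0.0, 5.0], threshold=0.9)
print(hits)
```

Raising the threshold toward 1 demands a more exact match and returns fewer windows, which matches the remark that fragments need stricter matching than whole chains.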
Step 4b (for complete polypeptide chains)
This is more difficult since you would like to find matches when only part of the chain matches part of other chains in the database - you do not want common features lost because of parts of the chains that do not match. This introduces the idea of frequencies of particular conformational features within a given polypeptide chain. If the given chain has no alpha helix, i.e. a frequency of zero, this is a strong indicator for use in comparison. That is, while it will not distinguish between a beta barrel type structure and a beta sandwich type structure, it will reduce the number of possible structures significantly. The frequency of other properties can be used in a similar way.
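A frequency-based pre-filter of this kind might look like the sketch below. Everything in it is an assumption made for illustration: the one-letter labels (`H` for helix, `E` for strand, `C` for other), the `tolerance` parameter, and the function names are hypothetical, and CL's stored properties are richer than a three-state label.

```python
from collections import Counter


def feature_frequencies(labels, features="HEC"):
    """Fraction of positions in a chain that carry each feature label."""
    if not labels:
        raise ValueError("chain must not be empty")
    counts = Counter(labels)
    return {f: counts[f] / len(labels) for f in features}


def prefilter(query_labels, database, tolerance=0.1):
    """Keep chains whose feature frequencies all lie within tolerance of the query's.

    database maps a chain name to its label string (hypothetical layout).
    """
    query_freq = feature_frequencies(query_labels)
    kept = []
    for name, labels in database.items():
        freq = feature_frequencies(labels)
        if all(abs(query_freq[f] - freq[f]) <= tolerance for f in query_freq):
            kept.append(name)
    return kept


# Hypothetical chains: the query has no helix at all (frequency of zero).
database = {
    "all_beta": "EEEECEEEEC",
    "all_alpha": "HHHHHHCCHH",
    "mixed": "HHHEEECCEE",
}
kept = prefilter("EEEECCEEEE", database)
print(kept)
```

As the text notes, such a filter cannot separate two all-beta folds from each other, but it cheaply removes chains whose composition rules them out before any detailed comparison.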
For both Steps 4a and 4b, results are displayed as a list of entities (polypeptide chains for the most part) with a likeness index. At the very least, the starting chain or fragment will be returned with a likeness index of 1.0.
What structures are conformationally similar to the cAMP-dependent protein kinase (PDB code 1ATP)?
We have found that NeighDist (distances associated with decamers preceding and following the pentamer) is a good parameter to use in finding structures with a similar overall topology. A database search with:
revealed the following 31 structures (based upon a June 1997 database):
1) 1.000 # 1ATP:E # $C-/AMP$-DEPENDENT PROTEIN KINASE (E.C.2.7.1.37)
2) 0.856 # 2CPK:E # $C-/AMP$-DEPENDENT PROTEIN KINASE (E.C.2.7.1.37)
3) 0.854 # 1YDR:E # MOL_ID: 1; MOLECULE: C-AMP-DEPENDENT PROTEIN KINA
4) 0.838 # 1YDS:E # MOL_ID: 1; MOLECULE: C-AMP-DEPENDENT PROTEIN KINA
5) 0.819 # 1APM:E # $C-/AMP$-DEPENDENT PROTEIN KINASE (E.C.2.7.1.37)
6) 0.817 # 1YDT:E # MOL_ID: 1; MOLECULE: C-AMP-DEPENDENT PROTEIN KINA
7) 0.750 # 1CDK:B # MOL_ID: 1; MOLECULE: CAMP-DEPENDENT PROTEIN KINAS
8) 0.723 # 1CDK:A # MOL_ID: 1; MOLECULE: CAMP-DEPENDENT PROTEIN KINAS
9) 0.518 # 1CMK:E # CAMP-DEPENDENT PROTEIN KINASE CATALYTIC SUBUNIT (
10) 0.511 # 1CSN:_ # MOLECULE: CASEIN KINASE-1; EC: 2.7.1.-; HETEROGEN
11) 0.440 # 2CSN:_ # MOL_ID: 1; MOLECULE: CASEIN KINASE-1; CHAIN: NULL
12) 0.421 # 1CTP:E # CAMP-DEPENDENT PROTEIN KINASE (E.C.2.7.1.37) (CAP
13) 0.388 # 1BMF:E # MOL_ID: 1; MOLECULE: BOVINE MITOCHONDRIAL F1-ATPA
14) 0.384 # 1GOL:_ # MOL_ID: 1; MOLECULE: EXTRACELLULAR REGULATED KINA
15) 0.374 # 1CXT:A # MOL_ID: 1; MOLECULE: DIMETHYLSULFOXIDE REDUCTASE;
16) 0.367 # 1BMF:D # MOL_ID: 1; MOLECULE: BOVINE MITOCHONDRIAL F1-ATPA
17) 0.365 # 1EFR:E # MOL_ID: 1; MOLECULE: BOVINE MITOCHONDRIAL F1-ATPA
18) 0.364 # 1FIN:A # MOL_ID: 1; MOLECULE: CYCLIN-DEPENDENT KINASE 2; C
19) 0.363 # 1KOB:B # MOL_ID: 1; MOLECULE: TWITCHIN; CHAIN: A, B; FRAGM
20) 0.362 # 1EFR:D # MOL_ID: 1; MOLECULE: BOVINE MITOCHONDRIAL F1-ATPA
21) 0.360 # 1COW:D # MOL_ID: 1; MOLECULE: BOVINE MITOCHONDRIAL F1-ATPA
22) 0.360 # 1CKJ:B # MOL_ID: 1; MOLECULE: RECOMBINANT CASEIN KINASE I
23) 0.350 # 1COW:E # MOL_ID: 1; MOLECULE: BOVINE MITOCHONDRIAL F1-ATPA
24) 0.349 # 1CXS:A # MOL_ID: 1; MOLECULE: DIMETHYLSULFOXIDE REDUCTASE;
25) 0.343 # 1GPM:A # MOL_ID: 1; MOLECULE: GMP SYNTHETASE; CHAIN: A, B,
26) 0.342 # 1KOB:A # MOL_ID: 1; MOLECULE: TWITCHIN; CHAIN: A, B; FRAGM
27) 0.342 # 1TCO:A # MOL_ID: 1; MOLECULE: SERINE/THREONINE PHOSPHATASE
28) 0.333 # 1JST:C # MOL_ID: 1; MOLECULE: CYCLIN-DEPENDENT KINASE-2; C
29) 0.333 # 1FIN:C # MOL_ID: 1; MOLECULE: CYCLIN-DEPENDENT KINASE 2; C
30) 0.332 # 1BMF:F # MOL_ID: 1; MOLECULE: BOVINE MITOCHONDRIAL F1-ATPA
31) 0.325 # 1JST:A # MOL_ID: 1; MOLECULE: CYCLIN-DEPENDENT KINASE-2; C
This is the list of structures that would be expected, and it corresponds to the list found with DALI with two exceptions: 1IRK, the tyrosine receptor kinase, and 1PHK, the phosphorylase kinase. (Note that DALI did not detect the mitogen-activated kinase, 1GOL.)
Notice the difference in the likeness values of, say, 1ATP:E (1.000) and 1CTP:E (0.421). The difference is the shift between the open and closed conformations that occurs on substrate binding. 1ATP:E is a closed conformation, whereas 1CTP:E is an open conformation. This difference is clearly seen in the detailed alignment by conformational likeness using:
The stereo view clearly shows the displacement of the smaller upper lobe relative to the more stable larger lobe and illustrates the sensitivity of topological similarity to small changes in overall conformation.
How sensitive is the method in detecting the similarity reported by Vriend and Sander (Proteins 11:52-58, 1991) between ferredoxin (2Fe-2S) and ubiquitin?
Ubiquitin is involved in protein breakdown via covalent conjugates, whereas ferredoxin is an electron carrier in the photoreduction of cytochrome c. That is, there is no apparent functional similarity and no significant sequence homology.
Here we use a local and topological comparison between the two proteins to detect a similarity. The cutoff is based upon pentapeptide likeness. The structure superposition is on the best fragment and not the whole structure.
The following indicates the strong structural homology that exists between these two proteins.
In fact, the structure alignment agrees closely with that of Vriend and Sander.
The structure superposition shown at the beginning of all CL pages illustrates the homology between these two structures.
How good a job can we do of detecting an EF hand as a classic calcium-binding motif? Consider two well-known calcium-binding proteins:
Both have two calcium binding domains as shown in the following:
The approximate motifs are:
These intersections are quite clear on the following conformational likeness plot:
This is reported in a Proteins paper currently in the review stage.