If you use the DSSP software or databank please cite the appropriate paper:

Joosten RP, te Beek TAH, Krieger E, Hekkelman ML, Hooft RWW, Schneider R, Sander C, Vriend G. A series of PDB related databases for everyday needs. Nuc. Acids Res. 2010; 39:D411-D419.
Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 1983; 22:2577-2637.

Using DSSP data

DSSP provides an elaborate description of the secondary structure elements in a protein structure, including backbone hydrogen bonding and the topology of β-sheets. The most popular feature is the per-residue assignment of secondary structure with a single character code:

  • H = α-helix
  • B = residue in isolated β-bridge
  • E = extended strand, participates in β ladder
  • G = 3₁₀-helix
  • I = π-helix
  • P = κ-helix (poly-proline II helix)
  • T = hydrogen-bonded turn
  • S = bend
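As a small illustration of working with these codes, the sketch below stores the code table in a dictionary and counts how often each code occurs in a per-residue assignment string. The names `DSSP_CODES` and `summarize` are hypothetical helpers, not part of DSSP, and the example string is made up.

```python
from collections import Counter

# One-letter DSSP codes as described in this document.
# "S" (bend) is the structure code mentioned in the KAPPA column description.
DSSP_CODES = {
    "H": "alpha-helix",
    "B": "residue in isolated beta-bridge",
    "E": "extended strand, participates in beta ladder",
    "G": "3-10 helix",
    "I": "pi-helix",
    "P": "kappa-helix (poly-proline II helix)",
    "T": "hydrogen-bonded turn",
    "S": "bend",
}


def summarize(ss_string):
    """Count occurrences of each code; blanks and unknown codes are grouped as 'other'."""
    counts = Counter()
    for code in ss_string:
        counts[code if code in DSSP_CODES else "other"] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical assignment string: four helix residues, a gap, two strand residues.
    print(summarize("HHHH EE"))
```

Grouping blanks under a single key keeps the summary compact; a real parser may prefer to treat a blank as its own "no assignment" state.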

The full DSSP output is provided in two formats. The legacy DSSP format was originally designed for structure models in PDB format. Now, 40 years later, the PDB format has become obsolete because it cannot capture the large structure models that modern structural biology methods produce. The mmCIF format is the data format of choice for structural biology: it has no size limitations for structure models and can hold extensive annotations and metadata. DSSP now writes its data straight to mmCIF files by default. The legacy DSSP format can still be written, but only for structure models that fit within its limits.

DSSP format

The output from DSSP contains secondary structure assignments and other information. Extract from 3kew.dssp (header):

The first few lines are taken from the input model file, then some general statistics about the model and hydrogen bonding are given. The histograms describe the distribution of sizes of secondary structure elements. For instance, this structure has three helices, one short one consisting of 4 residues and two longer ones of 16 and 17 residues. Note that beta sheets are described as a collection of ladders, rather than strands. Ladders can be seen as two strands together with the hydrogen bonds as the rungs of the ladder. More formal definitions are given in the Kabsch and Sander paper.

The model statistics are followed by a detailed per-residue description. Extract from 3kew.dssp (continued):

Below is a brief description of the data columns. More details are described in the Kabsch and Sander paper.

# RESIDUE

Two columns of residue numbers. The first column is DSSP's sequential residue number, starting at the first residue actually in the model and including chain breaks; this number is used to refer to residues throughout. The second column gives the numbering used in the structure model ('residue number', 'insertion code' and 'chain identifier'); these are given for reference only.

AA

One-letter amino acid code; non-standard residues are marked as X. CYS residues in an SS-bridge are marked by a lower-case letter: the first bridged cysteine in the sequence and its partner elsewhere in the sequence are both marked a. The next bridged cysteine that is not yet marked and its partner are both marked b, and so on. Unbridged cysteines remain marked as C.
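The lower-case convention for bridged cysteines means the amino acid column needs a small amount of post-processing before it can be used as a plain sequence. The sketch below shows one way to do that; `normalize_residue` and `bridge_pairs` are hypothetical helper names, and the example sequence is invented.

```python
def normalize_residue(aa):
    """Map a DSSP amino acid letter back to a standard one-letter code.

    Lower-case letters denote cysteines in an SS-bridge, so they become "C".
    Everything else (including "X" for non-standard residues) is returned unchanged.
    """
    return "C" if aa.islower() else aa


def bridge_pairs(aa_column):
    """Group the positions of bridged cysteines by their lower-case label."""
    pairs = {}
    for index, aa in enumerate(aa_column):
        if aa.islower():
            pairs.setdefault(aa, []).append(index)
    return pairs


if __name__ == "__main__":
    column = "MaKbGaCb"  # hypothetical AA column with two SS-bridges
    print("".join(normalize_residue(aa) for aa in column))
    print(bridge_pairs(column))
```

The positions returned here are indices into the given string, not DSSP sequential residue numbers; mapping between the two is left to the caller.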

STRUCTURE

The first column (under the S) gives a one-letter summary of secondary structure, intended to approximate crystallographers' intuition. This summary is based on the next 8 columns, which are the principal result of the DSSP analysis of the atomic coordinates. More details are given in the Kabsch and Sander paper.

BP1 and BP2

Residue numbers of the first and (if available) second beta bridge partner. The letter marks the β-sheet that contains the bridges.

ACC

Solvent accessibility of the residue. Several effects can make the reported values differ from what is expected:

  • Effects leading to larger than expected values: the solvent exposure calculation ignores unusual residues, like ACE, and residues with incomplete backbone. It also ignores HETATOMS, like a heme or metal ligands. Also, side chains may not have all atoms explicitly modeled.
  • Effects leading to smaller than expected values: in complexes, e.g. a dimer, solvent exposure is calculated for the entire assembly, not for the monomer. Also, atom OXT of C-terminal residues is treated like a side chain atom if it is listed as part of the last residue.
  • Unknown or non-standard residues are named X on output and are not checked for the expected number of sidechain atoms.
  • All explicit water molecules, like other hetatoms, are ignored.

N-H-->O etc.

Hydrogen bonds; e.g. -3,-1.4 means that this residue (i) has its HN atom H-bonded to O of residue i-3 with an electrostatic H-bond energy of -1.4 kcal/mol. There are two columns for each type of H-bond, to allow for bifurcated H-bonds. Note: the marked H-bonds are the best and second-best candidates. The second best, and on rare occasions even the best, may be unrealistically poor H-bonds.
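The "offset,energy" notation can be unpacked with a few lines of code. The sketch below is a minimal illustration, not DSSP's own parser: `parse_hbond` and `partner_index` are hypothetical names, and treating an offset of 0 as "no bond recorded" is an assumption to be checked against real output.

```python
def parse_hbond(field):
    """Split an H-bond field such as "-3,-1.4" into (offset, energy in kcal/mol)."""
    offset_text, energy_text = field.split(",")
    return int(offset_text), float(energy_text)


def partner_index(i, field):
    """Return the sequential number of the H-bond partner of residue i.

    Assumption: an offset of 0 means that no H-bond is recorded in this column,
    in which case None is returned.
    """
    offset, _energy = parse_hbond(field)
    if offset == 0:
        return None
    return i + offset


if __name__ == "__main__":
    print(parse_hbond("-3,-1.4"))        # offset and energy for the example above
    print(partner_index(10, "-3,-1.4"))  # residue 10 bonds to residue 10 - 3
```

Because the two candidate columns may contain unrealistically weak bonds, callers will usually want to filter on the energy value as well as on the offset.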

TCO

The cosine of the angle between C=O of residue i and C=O of residue i-1. For α-helices, TCO is near +1; for β-sheets, TCO is near -1. These values are descriptive and not used for structure definition.
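The TCO definition translates directly into a dot product of two bond vectors. The following sketch assumes each atom is given as an (x, y, z) tuple; the function name `tco` and the example coordinates are illustrative only.

```python
import math


def tco(c_prev, o_prev, c_curr, o_curr):
    """Cosine of the angle between the C=O vectors of residue i-1 and residue i.

    Each argument is an (x, y, z) coordinate tuple for the named backbone atom.
    """
    v_prev = [o - c for c, o in zip(c_prev, o_prev)]
    v_curr = [o - c for c, o in zip(c_curr, o_curr)]
    dot = sum(a * b for a, b in zip(v_prev, v_curr))
    norm = math.sqrt(sum(a * a for a in v_prev)) * math.sqrt(sum(b * b for b in v_curr))
    return dot / norm


if __name__ == "__main__":
    # Two carbonyls pointing the same way (helix-like): value close to +1.
    print(tco((0, 0, 0), (0, 0, 1.2), (1.5, 0, 1.5), (1.5, 0, 2.7)))
    # Two carbonyls pointing in opposite directions (sheet-like): value close to -1.
    print(tco((0, 0, 0), (0, 0, 1.2), (1.5, 0, 1.5), (1.5, 0, 0.3)))
```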

KAPPA

Virtual bond angle (bend angle) defined by the three Cα atoms of residues i-2, i, and i+2. Used to define bends (structure code S).
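A bend angle of this kind can be computed from three Cα positions. The sketch below assumes the angle is measured between the direction from Cα(i-2) to Cα(i) and the direction from Cα(i) to Cα(i+2), so that a perfectly straight chain gives 0°; this convention and the function name `kappa` are assumptions, and the Kabsch and Sander paper remains the reference for the exact definition.

```python
import math


def kappa(ca_before, ca_center, ca_after):
    """Bend angle in degrees from the C-alpha atoms of residues i-2, i and i+2.

    Assumed convention: angle between (center - before) and (after - center),
    so a straight chain yields 0 and a sharp reversal approaches 180.
    """
    v1 = [b - a for a, b in zip(ca_before, ca_center)]
    v2 = [b - a for a, b in zip(ca_center, ca_after)]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm = math.sqrt(sum(a * a for a in v1)) * math.sqrt(sum(b * b for b in v2))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cosine))


if __name__ == "__main__":
    print(kappa((0, 0, 0), (1, 0, 0), (2, 0, 0)))  # collinear points, no bend
    print(kappa((0, 0, 0), (1, 0, 0), (1, 1, 0)))  # right-angle change of direction
```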

ALPHA

Virtual torsion angle (dihedral angle) defined by the four Cα atoms of residues i-1, i, i+1, and i+2. Used to define chirality (structure code + or -).
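A torsion angle over four points can be computed by projecting the outer bonds onto the plane perpendicular to the central bond. The sketch below uses a common atan2-based formulation; the helper names are hypothetical, and the sign convention should be checked against actual DSSP output before relying on it.

```python
import math


def _sub(a, b):
    return [x - y for x, y in zip(a, b)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees defined by four points, e.g. consecutive C-alpha atoms."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    length = math.sqrt(_dot(b1, b1))
    b1 = [x / length for x in b1]  # unit vector along the central bond
    # Components of the outer bonds perpendicular to the central bond.
    v = _sub(b0, [_dot(b0, b1) * x for x in b1])
    w = _sub(b2, [_dot(b2, b1) * x for x in b1])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical coordinates: the first and last points are a quarter turn apart.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

The sign of the returned angle is what a chirality code would be derived from, which is why verifying the convention against a known structure matters.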

PHI and PSI

The peptide backbone torsion angles as described in the IUPAC standard.

X-CA, Y-CA, and Z-CA

A copy of the Cα atom coordinates in the structure model.

DSSP data in mmCIF files

The mmCIF-formatted DSSP output carries the same information as the DSSP format but in a more scalable way and with a formal description captured in an mmCIF dictionary. It is designed to be machine readable. Developers who create software to read these annotations can use our extension to the mmCIF dictionary on GitHub. Note: for the sake of speed, the solvent accessibility is not calculated by default when using mmCIF output. The command-line switch --calculate-accessibility can be used to switch this feature on.


STRIDE -- A web server for secondary structure assignment from known atomic coordinates of proteins

  • It implements a knowledge-based algorithm that makes combined use of hydrogen bond energy and statistically derived backbone torsional angle information and is optimized to return resulting assignments in maximal agreement with crystallographers' designations.
  • The web server allows visualization of the secondary structure, as well as contact and Ramachandran maps for any file uploaded by the user with atomic coordinates in the Protein Data Bank (PDB) format.
  • A searchable database of STRIDE assignments for the latest PDB release is also provided.

Application to assign secondary structure to proteins

PDB-REDO/dssp


This is a rewrite of DSSP, now offering full mmCIF support. The difference with previous releases of DSSP is that it now writes out an annotated mmCIF file by default, storing the secondary structure information in the _struct_conf category.

Another new feature in this version of DSSP is that it now defines Poly-Proline helices as well.

The DSSP program was designed by Wolfgang Kabsch and Chris Sander to standardize secondary structure assignment. DSSP is a database of secondary structure assignments (and much more) for all protein entries in the Protein Data Bank (PDB). DSSP is also the program that calculates DSSP entries from PDB entries.

DSSP does not predict secondary structure.

Requirements

A good, modern compiler is needed to build the mkdssp program since it uses many new C++20 features.

The new makefile for dssp will take care of downloading and building all requirements automatically. So in theory, building is as simple as:

See the manual page for more info. Or even better, see the DSSP website.


Protein secondary structure assignment using residual networks

  • Original Paper
  • Published: 23 August 2022
  • Volume 28, article number 269 (2022)


  • Jisna Vellara Antony (ORCID: orcid.org/0000-0001-5210-9583) 1,
  • Roosafeed Koya (ORCID: orcid.org/0000-0002-6018-4580) 1,
  • Pulinthanathu Narayanan Pournami (ORCID: orcid.org/0000-0002-8846-2044) 1,
  • Gopakumar Gopalakrishnan Nair (ORCID: orcid.org/0000-0002-4801-9259) 1 &
  • Jayaraj Pottekkattuvalappil Balakrishnan (ORCID: orcid.org/0000-0002-9924-9046) 1


Proteins are constructed from amino acid sequences. Their structural classifications include primary, secondary, tertiary, and quaternary, with tertiary and quaternary structures influencing protein function. Because a protein’s structure is inextricably connected to its biological function, machine learning algorithms that can better anticipate the structures have the potential to lead to new scientific discoveries in human health and improve our capacity to develop new treatments. Protein secondary structure assignment enriches the structural and functional understanding of proteins. It helps in protein structure comparison and classification studies, besides facilitating secondary and tertiary structure prediction systems. Several secondary structure assignment methods have been developed since the 1980s, most of which are based on hydrogen bond analysis and atomic coordinate features. However, the assignment process becomes complex when protein data includes missing atoms. Deep neural networks are often referred to as universal function approximators because they can approximate any function to produce the desired output when properly designed and trained. Optimised deep learning architectures have already proven their ability to increase performance in a wide range of problems. Recently, the ResNet architecture has garnered significant interest due to its applicability in various areas, including image classification and protein contact map prediction. The proposed model, which is based on the ResNet architecture, assigns secondary structures using Cα atom coordinates. The model achieved an accuracy of 94% when evaluated against the benchmark and independent test sets. The findings encourage the development of new deep learning-based methods that are more generalised across various protein learning tasks. Furthermore, it allows computational biologists to delve deeper into integrating these techniques with experimental methods. 
The model codes are available at: https://github.com/jisnava/ResNet_for_Structure_Assignments/.


Data availability

The data is available at: https://github.com/jisnava/ResNet_for_Structure_Assignments/.

Code availability

The model codes are openly available at: https://github.com/jisnava/ResNet_for_Structure_Assignments/.


Acknowledgements

The authors thank the Centre for Computational Modelling and Simulation (CCMS) and Central Computer Centre (CCC) at the National Institute of Technology Calicut, for providing the NVIDIA DGX station facility to train the deep neural network architectures.

It is part of my (V. A. Jisna) PhD work at the National Institute of Technology Calicut, India. The research is funded by the Ministry of Human Resource Development, India.

Author information

Authors and affiliations

Department of Computer Science and Engineering, National Institute of Technology Calicut, Kattangal, Kerala, 673601, India

Jisna Vellara Antony, Roosafeed Koya, Pulinthanathu Narayanan Pournami, Gopakumar Gopalakrishnan Nair & Jayaraj Pottekkattuvalappil Balakrishnan


Contributions

Jisna Vellara Antony (JVA) did the conceptualisation and dataset construction. JVA and Roosafeed Koya (RK) implemented the models. Jayaraj Pottekkattuvalappil Balakrishnan (JPB), Pulinthanathu Narayanan Pournami (PNP), and Gopakumar Gopalakrishnan Nair (GGN) supervised the project. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Jisna Vellara Antony.

Ethics declarations

Competing interests

The authors declare no competing interests.


About this article

Antony, J.V., Koya, R., Pournami, P.N. et al. Protein secondary structure assignment using residual networks. J Mol Model 28, 269 (2022). https://doi.org/10.1007/s00894-022-05271-z

Received: 31 October 2021

Accepted: 12 August 2022

Published: 23 August 2022

DOI: https://doi.org/10.1007/s00894-022-05271-z


  • Protein secondary structure
  • Secondary structure assignments
  • Neural networks
  • Deep learning
  • Residual networks

bioRxiv

Assigning Secondary Structure in Proteins using AI


Knowledge about protein structure assignment enriches the structural and functional understanding of proteins. Accurate and reliable structure assignment data is crucial for secondary structure prediction systems. Since the 1980s, various methods based on hydrogen bond analysis and atomic coordinate geometry, followed by machine learning, have been employed in protein structure assignment. However, the assignment process becomes challenging when atoms are missing from protein files. We develop a multi-class classifier named DLFSA for assigning protein Secondary Structure Elements (SSE) using Convolutional Neural Networks (CNN). A fast and efficient GPU-based parallel procedure extracts fragments from protein files. The model implemented in this work is trained with a subset of protein fragments and achieves 88.1% and 82.5% train and test accuracy, respectively. Our model uses only Cα coordinates for secondary structure assignments. The model is also successfully tested on a few full-length proteins. Results from the fragment-based studies demonstrate the feasibility of applying deep learning solutions to structure assignment problems.

Competing Interest Statement

The authors have declared no competing interest.

jayarajpb{at}nitc.ac.in , prayagh.m{at}gmail.com

https://github.com/jisnava/DLFSA/


Subject Area

  • Bioinformatics


Secondary structure assignment tool

Is there any CA-based online tool/web-server available for secondary structure assignment of proteins except DSSP or STRIDE?

DSSP is a database of secondary structure assignments (and much more) for all protein entries in the Protein Data Bank (PDB)

STRIDE Protein secondary structure assignment with stride, Basic assignment, Visual assignment, Contact map, Ramachandran Plot

  • protein-structure


PolyprOnline: polyproline helix II and secondary structure assignment database

Affiliations

  • 1, 2, 3 Inserm U1134, Paris, France; Université Paris Diderot, Sorbonne Paris Cité, UMR_S 1134, Paris, France; Institut National de la Transfusion Sanguine, Paris, France; and Laboratory of Excellence GR-Ex, Paris, France.
  • PMID: 25380779
  • PMCID: PMC4224144
  • DOI: 10.1093/database/bau102

The polyproline helix type II (PPII) is a regular protein secondary structure with remarkable features. Many studies have highlighted different crucial biological roles supported by this local conformation, e.g. in the interactions between biological macromolecules. Although PPII is less frequently present than regular secondary structures such as canonical alpha helices and beta strands, it corresponds to 3-10% of residues. Up to now, PPII is not assigned by most popular assignment tools, and therefore, remains insufficiently studied. PolyprOnline database is, therefore, dedicated to PPII structure assignment and analysis to facilitate the study of PPII structure and functional roles. This database is freely accessible from www.dsimb.inserm.fr/dsimb_tools/polyproline.

© The Author(s) 2014. Published by Oxford University Press.

References

  • 1. Tyagi M., Bornot A., Offmann B., et al. (2009) Analysis of loop boundaries using different local structure assignment methods. Protein Sci., 18, 1869–1881.
  • 2. Schrodinger L.L.C. (2010) The PyMOL Molecular Graphics System, Version 1.3r1.
  • 3. Humphrey W., Dalke A., Schulten K. (1996) VMD - visual molecular dynamics. J. Mol. Graph., 14, 33–38.
  • 4. Pettersen E.F., Goddard T.D., Huang C.C., Couch G.S., Greenblatt D.M., Meng E.C., Ferrin T.E. (2004) UCSF Chimera - a visualization system for exploratory research and analysis. J. Comput. Chem., 25, 1605–1612.
  • 5. Offmann B., Tyagi M., de Brevern A.G. (2007) Local protein structures. Curr. Bioinformatics, 2, 165–202.

Romain Chebrek, Sylvain Leonard, Alexandre G. de Brevern, Jean-Christophe Gelly

Introduction

Fifty percent of local protein conformations consist of the two regular secondary structures, i.e. α helices and β sheets, while the rest of the protein structure consists essentially of turns, which can overlap the two previous local conformations, and coil ( 1 ). Regular secondary structures are fundamental descriptors for analysing and understanding the structure and function of proteins at the molecular level. As such, they are used automatically to visualize protein 3D structures in popular software such as PyMOL ( 2 ), VMD ( 3 ) or Chimera ( 4 ). Secondary structure assignment is thus an essential step in studying protein architecture and folding and in predicting 3D protein structure. Besides α helices and β sheets, a number of other regular secondary structures are often ignored, despite their importance in biological processes ( 5 ). Among these, the polyproline II helix (PPII) is of significant interest. The PPII conformation was first identified in the 1950s in the collagen helix by Pauling and Corey ( 6 ), and in structures containing many repeating proline residues ( 7 ). It was not until the end of the 1990s that this conformation was shown to occur frequently in globular proteins ( 8 ), with a very high conservation ratio of 80–100% in protein families sharing 20% sequence identity or more, a ratio close to the conservation found for α helices and β strands ( 9 ). Depending on the tool used for secondary structure assignment, PPII frequency varies in the range of 3–10% of all conformations, with a common core of more than 1.6% of assignments shared by all tools ( 10 ). Other studies have reported similar frequencies: in a recent review ( 11 ), Adzhubei and co-workers estimated that about 2% of residues in the Protein Data Bank are in PPII helices of three or more residues. For historical reasons, this conformation has been called the ‘polyproline helix’, although most PPIIs include non-proline residues and some contain no proline at all ( 8 ).

In terms of local structure, polyproline II is a left-handed helical conformation with average dihedral angle values of Φ = −75° and Ψ = +145°. Unlike classical regular secondary structures, PPIIs are not usually associated with conventional stabilizing internal hydrogen bonds, owing to this extremely extended conformation. PPII is a far more extended helix than the classical α-helix (pitch of 5.4 Å/turn, 3.6 residues per turn): it has a helical pitch of 9.3 Å/turn and 3 residues per turn. Because of this extended conformation and high solvent exposure, residues in PPII are available for interactions with other molecular partners. It has therefore been suggested that they may have an important functional role, particularly in protein–protein or protein–nucleic acid interactions and recognition ( 12 , 13 ). Regrettably, PPIIs are still insufficiently studied. PPII is not assigned by the most common secondary structure assignment methods, such as Dictionary of Protein Secondary Structure (DSSP; 14 ) and STRIDE ( 15 ), and therefore newly solved protein structures are not annotated with PPII in the Protein Data Bank ( 16 ). Here we introduce a new assignment method and a dedicated webserver for PPII.
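To make the comparison of helix geometries concrete, the rise per residue can be derived from the pitch and the number of residues per turn quoted above. The short sketch below is illustrative only; the function name `rise_per_residue` is a hypothetical helper, not part of PolyprOnline.

```python
def rise_per_residue(pitch_angstrom: float, residues_per_turn: float) -> float:
    """Axial rise per residue (in Å) = helical pitch / residues per turn."""
    return pitch_angstrom / residues_per_turn


# Values quoted in the text for the two helix types.
alpha_rise = rise_per_residue(5.4, 3.6)  # about 1.5 Å per residue
ppii_rise = rise_per_residue(9.3, 3.0)   # about 3.1 Å per residue

print(f"alpha-helix rise: {alpha_rise:.2f} Å/residue")
print(f"PPII rise:        {ppii_rise:.2f} Å/residue")
print(f"PPII is about {ppii_rise / alpha_rise:.1f} times more extended per residue")
```

With these figures, each PPII residue advances roughly twice as far along the helix axis as an α-helix residue, which is consistent with the description of PPII as a far more extended helix.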

Aim and overview of database

The PolyprOnline database ( http://www.dsimb.inserm.fr/dsimb_tools/polyproline ) contains secondary structure assignments for a large subset of the Protein Data Bank. It can also process any new user-submitted structure dynamically. Unlike other databases established for protein secondary structure analysis, PolyprOnline focuses on PPII, an assignment that is rarely documented in experimentally solved structures or in services and tools dedicated to the analysis of protein structures. For instance, 2struc ( http://2struc.cryst.bbk.ac.uk/about/ ) assigns proteins to three secondary structure states using six different algorithms ( 17 ), but none of them addresses PPII assignment. More general tools such as PDBsum ( 18 ) give the assignment from a single method, PROMOTIF ( 19 ) in this case, with no details about PPII. As mentioned above, this assignment is especially important because PPII is the third most abundant regular secondary structure, just behind the α-helix and β-strand, and is involved in various functions related to molecular interactions such as protein–protein and protein–nucleic acid binding. However, assignments made with different tools disagree, so our database provides assignments from the four main methods developed so far ( 10 ).

Results and features

The data flow and processing steps performed by the system are summarized in Figure 1 .

Figure 1. Data flow in the PolyprOnline system. The system can be accessed in two ways: through ‘Simple query’, for the analysis of one or more protein structures from their PDB codes, and through ‘Advanced query’, for more complex queries using criteria such as resolution (Å), protein length, or the minimal and maximal number or percentage of residues in PPII conformation assigned by a particular tool. The last type of advanced query allows a local structure search on specific positions using secondary structure patterns. It is also possible to upload and process a PDB file dynamically if it is absent from the database. The query is then processed so that it can be interpreted by our database management system. When a PDB structure is not found in the database, the PDB file can be downloaded from the Protein Data Bank website and processed dynamically by the system. The PolyprOnline webserver displays the results as a summary of all proteins, identified by PDB code, title, size, resolution and PPII content, in a table that can be sorted by the values in each column ( Figure 2 ). From this table, the analysis of each individual protein can be accessed ( Figure 3 ).

Two types of search are available through the main interface, both detailed in the legend of Figure 1 : a simple search (analysis of one or more protein structures) and an advanced search that uses specific criteria to perform more complex queries. One of the most interesting features is the secondary structure pattern query. This search looks for a fragment of a specified conformation within protein structures using a simple regular expression pattern. Pattern search follows the classical rules for regular expressions. It is possible to use conformation code letters (e.g. HHHH-PPEEE) and to introduce wildcards (e.g. HHH**PP*-). It is also possible to specify the minimal and maximal conformation length (e.g. PPPX{1,8}PP).
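The pattern query can be illustrated with a small sketch that runs a pattern against a one-letter assignment string. The text does not fully specify the server's pattern syntax, so this example assumes that `*` and `X` each stand for any single conformation letter and that `{m,n}` is the usual repetition range; the function names and the example assignment string are hypothetical.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a query such as 'PPPX{1,8}PP' into a Python regex.

    Assumption (not confirmed by the source): '*' and 'X' match any single
    conformation letter; '{m,n}' keeps its usual regex meaning.
    """
    parts = []
    for ch in pattern:
        if ch in "*X":
            parts.append(".")            # wildcard: any one state
        elif ch in "{},0123456789":
            parts.append(ch)             # repetition range, kept as-is
        else:
            parts.append(re.escape(ch))  # literal state letter, e.g. 'H' or '-'
    return re.compile("".join(parts))


def find_fragments(assignment: str, pattern: str) -> list[tuple[int, str]]:
    """Return (0-based start index, matched fragment) for every hit."""
    regex = pattern_to_regex(pattern)
    return [(m.start(), m.group()) for m in regex.finditer(assignment)]


# Made-up assignment string, for illustration only.
assignment = "--HHHH-PPEEE--PPPTTPP-"
print(find_fragments(assignment, "HHHH-PPEEE"))   # [(2, 'HHHH-PPEEE')]
print(find_fragments(assignment, "PPPX{1,8}PP"))  # [(14, 'PPPTTPP')]
```

Because the coil state is coded as ‘-’, the literal characters are escaped before compilation so that they cannot be misread as regular expression syntax.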

The PolyprOnline webserver offers the following outputs:

A table of sortable results

Results are displayed in a table that can be sorted by the values in each column ( Figure 2 ). The results in the table can also be downloaded directly in text format. All proteins in the table are identified by PDB code, title, size, resolution and PPII content. The assignment of each protein can also be downloaded in classical FASTA format.

Figure 2. Results. At the top of the table, a pie chart showing statistics of secondary structure content for each tool over all entries is generated dynamically. The table gives information on each selected protein chain. Each line corresponds to a PDB chain and each column to an attribute describing the entry. Entries can be ordered and re-ordered alphabetically (PDB, Title) or numerically (length, resolution, PPII number and percentage), in ascending or descending order. A free-text search on a specific field is also possible. Each detailed analysis can be accessed from this table.

Individual protein data and analysis

The PolyprOnline web server provides access to different assignment methods and allows visualization of both regular secondary structures and PPII helices ( Figure 3 ). We have recently underlined the discrepancies between the three secondary structure assignment methods able to assign PPIIs, and proposed a novel PPII assignment built on the de facto standard assignment method, DSSP ( 10 , 14 ). To better visualize the secondary structure and PPII assignments given by PROSS ( 21 ), SEGNO ( 22 ), XTLSSTR ( 23 ) and our DSSP-PPII ( 10 , 14 ), they are all displayed below the sequence. A one-letter code represents each specific conformation. Letters are coloured according to the more general class of secondary structure (helix residues in red, strand in green, PPII helix in blue, non-regular secondary structure in grey and coil in dark grey) for a fast visualization of the overall local structures. All data from the analysed protein structure can be downloaded.

Figure 3. Detailed analysis of a protein structure (3KWEA; 25 ). ( A ) The sequence and the secondary structures assigned by four different protein secondary structure assignment methods are shown as a 1D alignment. A one-letter code represents each specific conformation. Letters are coloured according to the more general class of secondary structure (i.e. helix residues in red, strand in green, PPII helix in blue and non-regular secondary structure in grey). ( B ) Ramachandran plots give the distribution of φ and ψ torsion angles of PPII amino acids for each method. The most frequent areas for α-helix and β-sheet are shown in the background of the plot (represented by a colour scale). Statistics about these areas were derived from our previous study. Residues assigned as PPII are shown as white points. ( C ) Full 3D structure visualization and animation of the different assignments can be displayed dynamically using a JMol applet (Cα trace only, cartoon). Local conformations are coloured with the same colour scheme as in the 1D alignment in ( A ), i.e. helix residues in red, strand in green, PPII helix in blue and non-regular secondary structure in grey.

Ramachandran plots give the distribution of φ and ψ torsion angles for each assignment method. The most frequent areas for α-helix and β-sheet are shown in the background of the plot (represented by a colour scale). Statistics about these areas were derived from our previous study ( 10 ). Residues assigned as PPII are shown as white points. The image is mouse-sensitive and gives additional information on the residue number, residue type and φ and ψ angle values of each residue assigned as PPII. Indeed, the assignments provided by the various tools can differ considerably. The Ramachandran plot lets the user inspect the distribution of φ and ψ values of PPII residues visually and helps in judging the relevance of each assignment.

Three-dimensional protein structures can be visualized and manipulated through a JMol applet ( 24 ). It displays the secondary structure assignments from all four methods and details the positions of secondary structures, with particular emphasis on PPII. This visualization can also be useful for observing differences between assignments directly on the protein structure.

Protein structures dataset

A subset of the experimental protein structures in the PDB was selected using the PISCES webserver ( 20 ) on the basis of the experimental method (X-ray crystallography), structure quality (resolution value below 3.0 Å and R-factor lower than 1.0) and limited redundancy (proteins share no more than 90% sequence identity with each other). The full list of selected structures comprises 24 761 protein chains and is available from the database. The list is regularly updated.
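As a rough illustration of the quality criteria, the sketch below filters a list of chain records by experimental method, resolution and R-factor. It is not the PISCES procedure: the redundancy culling at 90% sequence identity is not reproduced, and the `Chain` record, the `passes_quality` helper and the example entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chain:
    pdb_chain: str     # hypothetical identifier, e.g. PDB code plus chain letter
    method: str        # experimental method
    resolution: float  # in Å
    r_factor: float


def passes_quality(chain: Chain,
                   max_resolution: float = 3.0,
                   max_r_factor: float = 1.0) -> bool:
    """Keep X-ray structures with resolution and R-factor below the cut-offs."""
    return (chain.method == "X-ray"
            and chain.resolution < max_resolution
            and chain.r_factor < max_r_factor)


# Made-up records, for illustration only.
chains = [
    Chain("demoA", "X-ray", 1.8, 0.19),
    Chain("demoB", "X-ray", 3.4, 0.25),  # rejected: resolution too coarse
    Chain("demoC", "NMR", 0.0, 0.0),     # rejected: not an X-ray structure
]
selected = [c.pdb_chain for c in chains if passes_quality(c)]
print(selected)  # ['demoA']
```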

Assignment of PPII and other secondary structures

Currently, only a limited number of tools can assign PPII. The tools available today are XTLSSTR, PROSS (version September 2004) and SEGNO (version 3.1). To this list we have added DSSP-PPII, a PPII assignment program developed in our laboratory and based on DSSP (CMBI version 2000) ( 10 ). As explained above, multiple tools are needed because PPII assignments from different methods have been shown to yield different results ( 10 ).

Secondary structures assigned by PROSS ( 21 ) are as follows: α helix (H), β turn (T), β strand (E), PPII (P) and coil (C). Assignments are based exclusively on the Φ and Ψ dihedral angles.

The XTLSSTR algorithm ( 23 ) uses two angles and three distances to assign secondary structure from the coordinates in PDB files. It assigns the following secondary structures: α helix (H and h), 3₁₀ helix (G and g), hydrogen-bonded β turn (T), non-hydrogen-bonded β turn (N), extended β strand (E and e) and PPII (P and p).

SEGNO ( 22 ) also uses the Φ and Ψ dihedral angles, coupled with other angles, to assign secondary structures. It assigns α helix (H), β-strand (E and e), isolated β-strand (B and b), 3₁₀ helix (G and g), π-helix (I), coil (O, coded as ‘-’ in this database) and PPII (P and p).

DSSP-PPII is a new method for PPII assignment recently developed in our laboratory ( 10 ). It is based on the most popular secondary structure assignment tool, DSSP ( 14 ). DSSP assignment relies on the identification of precise hydrogen bond patterns corresponding to regular secondary structures. The PPII assignment strategy follows a simple set of basic rules chosen to give the highest agreement with the PROSS, SEGNO and XTLSSTR methods. PPII is assigned solely in the coil region, to stretches of at least two consecutive coil residues with Φ = −75° ± ε and Ψ = +145° ± ε, where ε = 29°. The basic DSSP assignment defines eight types of secondary structure: α helix (H), extended β strand in parallel and/or anti-parallel β-sheet conformation (E), isolated β-strand (B), 3₁₀ helix (G), π-helix (I), hydrogen-bonded turn (T), bend (S) and coil (O, coded as ‘-’ in this database). The PPII helix (P) is added to this basic assignment.
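The DSSP-PPII rule described above can be sketched in a few lines. This is a minimal reading of the rule as stated in the text, not the authors' program: it assumes that ‘coil’ means only residues coded ‘-’ (bends and turns are left untouched), that the angle window is inclusive, and that residues with missing angles break a stretch. The function names and the example data are hypothetical.

```python
PHI_REF, PSI_REF, EPSILON = -75.0, 145.0, 29.0  # degrees, as given in the text


def in_ppii_window(phi: float, psi: float) -> bool:
    """True if (phi, psi) lies within ±EPSILON of the PPII reference angles.

    The window (-104..-46, 116..174) does not cross ±180°, so no angle
    wrap-around handling is needed here.
    """
    return abs(phi - PHI_REF) <= EPSILON and abs(psi - PSI_REF) <= EPSILON


def assign_ppii(dssp: str, phis: list, psis: list, min_len: int = 2) -> str:
    """Relabel coil stretches ('-') of at least min_len PPII-like residues as 'P'."""
    out = list(dssp)
    run = []  # indices of the current stretch of PPII-like coil residues

    def flush():
        if len(run) >= min_len:
            for i in run:
                out[i] = "P"
        run.clear()

    for i, (state, phi, psi) in enumerate(zip(dssp, phis, psis)):
        if (state == "-" and phi is not None and psi is not None
                and in_ppii_window(phi, psi)):
            run.append(i)
        else:
            flush()
    flush()
    return "".join(out)


# Made-up example: nine residues with their DSSP states and dihedral angles.
dssp = "HH---S--E"
phis = [-60, -60, -75, -80, -120, -75, -70, -60, -120]
psis = [-45, -45, 145, 150, 130, 145, 140, 30, 130]
print(assign_ppii(dssp, phis, psis))  # HHPP-S--E
```

In the example, residues 3 and 4 form a two-residue coil stretch inside the window and become ‘P’, whereas the single PPII-like coil residue at position 7 is left as coil because it does not reach the minimum length of two.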

Web interface and Database

The database management system used by PolyprOnline is MySQL. The web interface is written mainly in PHP, Perl, R and JavaScript.

Conclusion and interesting case study

To better understand structure/function and structure/architecture relationships, the advanced search interface of PolyprOnline can be used to find proteins with a high PPII content. A query based on PPII frequency, or on the presence of a long PPII helix, can thus highlight different properties and peculiarities. Proteins with the highest PPII content show an over-representation of functions related to interaction mechanisms and/or binding, which is consistent with the observations in ( 11 ). For example, Figure 4 shows proteins involved in various functions such as binding to cyclin-dependent kinases (A), cell adhesion (B), self-binding (C), neurotoxicity, an effect that involves blockade of acetylcholine receptors (D), and anti-freeze activity, where solvent interaction is fundamental (E). With more than 72% of its residues in PPII conformation, this anti-freeze protein has the highest percentage of PPII in our database. In these examples, the organization of the PPII helices shows features typical of this regular conformation, namely rather isolated and exposed prolines in the cyclin-dependent kinase regulatory subunit (A), as well as features of other regular secondary structures: (i) similarities with α helix motifs, such as the PPII-beta-beta motif in Thrombospondin (B) and Atratoxin from cobra venom (D), and (ii) analogy with both alpha and beta motifs, as in GTP-binding protein Obg (C) and snow flea anti-freeze protein (E), where the PPII helices are arranged as a bundle of six anti-parallel PPII helices. All these PPIIs share a broad exposure to the solvent, as already highlighted in previous studies ( 11 ). Note that these proteins are extreme cases in terms of PPII content and are provided for illustrative purposes. The longest continuous PPII helix, 13 residues long, is found in a lyase (2VK8A; 31 ). This quick analysis highlights the utility of the PolyprOnline database for PPII studies.

Figure 4. Some examples of proteins with a high number of PPII conformations revealed by the PolyprOnline database. ( A ) Cyclin-dependent kinase regulatory subunit (1CKSA; 26 ), ( B ) Thrombospondin (1LSLA; 27 ), ( C ) GTP-binding protein Obg (1UDXA; 28 ), ( D ) Atratoxin (1V6PA; 29 ) and ( E ) snow flea anti-freeze protein (2PNEA; 30 ). β sheets appear in cyan, while α helices are in red with the internal face in yellow. PPII helices are in violet, with the internal face in pink. Some PPII arrangements are very well organized into an anti-parallel six-helix bundle, as in snow flea anti-freeze protein ( E ) or GTP-binding protein Obg ( C ). Other architectures are remarkable: the β-β-PPII or PPII-β-β architecture found in Thrombospondin ( B ) and Atratoxin ( D ) resembles the well-known β-β-α or α-β-β motifs, which are built with an α helix in place of the PPII. The cyclin-dependent kinase regulatory subunit ( A ) does not show any specific PPII arrangement.

Acknowledgements

The authors would like to thank Stéphane Téletchéa for corrections and comments on the manuscript.

Funding: This work was supported by grants from the Ministry of Research (France); University Paris Diderot, Sorbonne Paris Cité (France); the National Institute for Blood Transfusion (INTS, France); the Institute for Health and Medical Research (INSERM, France); and ‘Investissements d'avenir’, Laboratory of Excellence GR-Ex (France) to R.C., S.L., A.G.B. and J.-C.G. Funding for open access charge: Institute for Health and Medical Research (INSERM, France).


Citation details: Chebrek,R., Leonard,S., de Brevern,A.G., et al. PolyprOnline: polyproline helix II and secondary structure assignment database. Database (2014) Vol. 2014: article ID bau102; doi:10.1093/database/bau102


Romain Chebrek, Sylvain Leonard, Alexandre G. de Brevern, Jean-Christophe Gelly, PolyprOnline: polyproline helix II and secondary structure assignment database, Database , Volume 2014, 2014, bau102, https://doi.org/10.1093/database/bau102


The polyproline helix type II (PPII) is a regular protein secondary structure with remarkable features. Many studies have highlighted different crucial biological roles supported by this local conformation, e.g. in the interactions between biological macromolecules. Although PPII is less frequently present than regular secondary structures such as canonical alpha helices and beta strands, it corresponds to 3–10% of residues. Up to now, PPII is not assigned by most popular assignment tools, and therefore, remains insufficiently studied. PolyprOnline database is, therefore, dedicated to PPII structure assignment and analysis to facilitate the study of PPII structure and functional roles. This database is freely accessible from www.dsimb.inserm.fr/dsimb_tools/polyproline .

Fifty percent of local protein conformations consist of the two regular secondary structures, i.e. α helices and β sheets, while the remainder of the protein structure essentially consists of turns, which can overlap the two previous local conformations, and coil ( 1 ). Regular secondary structures are fundamental descriptors for analysing and understanding the structure and function of proteins at the molecular level. As such, they are automatically used to visualize protein 3D structures in popular software such as PyMOL ( 2 ), VMD ( 3 ) or Chimera ( 4 ). Secondary structure assignment is therefore an essential step in studying protein architecture and folding and in predicting 3D protein structure. Besides α helices and β sheets, a number of other regular secondary structures are often ignored, despite their importance in biological processes ( 5 ). Among these, the polyproline II helix (PPII) is of significant interest. The PPII conformation was first identified in the 1950s in the collagen helix by Pauling and Corey ( 6 ), and in structures containing many repeating proline residues ( 7 ). It was not until the end of the 1990s that this conformation was shown to occur frequently in globular proteins ( 8 ), with a very high conservation ratio of 80–100% in protein families sharing 20% sequence identity or more, a ratio close to that found for α helices and β strands ( 9 ). Depending on the tool used for secondary structure assignment, PPII frequency varies in the range of 3–10% of all conformations, with a common core of more than 1.6% of assignments shared by all tools ( 10 ). Other studies have reported similar frequencies: in a recent review, Adzhubei and co-workers ( 11 ) estimated that about 2% of residues in the Protein Data Bank are in PPII helices of three or more residues. For historical reasons, this conformation has been called ‘polyproline helix’, although most PPIIs include non-proline residues and some contain no proline at all ( 8 ).

In terms of local structure conformation, polyproline II is a left-handed helical conformation with average dihedral angle values of Φ = −75° and Ψ = +145°. Unlike classical regular secondary structures, PPIIs are not usually associated with conventional stabilizing internal hydrogen bonds, because of this extremely extended conformation. PPII is a far more extended helix than the classical α-helix (5.4 Å/turn, 3.6 residues per turn): it has a helical pitch of 9.3 Å/turn and 3 residues per turn. Because of this extended conformation and high solvent exposure, residues in PPII are available for interactions with other molecular partners. It has thus been suggested that PPIIs might have an important functional role, particularly in protein–protein or protein–nucleic acid interactions and recognition ( 12 , 13 ). Regrettably, PPIIs are still insufficiently studied. PPII is not assigned by the most common secondary structure assignment methods, such as Dictionary of Protein Secondary Structure (DSSP; 14 ) and STRIDE ( 15 ), and therefore newly solved protein structures are not annotated with PPII in the Protein Data Bank ( 16 ). Here we introduce a new assignment method and a dedicated webserver for PPII.
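The pitch and residues-per-turn values quoted above translate directly into a rise and a rotation per residue, which makes the difference in extension between the two helices concrete. The short sketch below only does that arithmetic; the function names are illustrative, and the negative sign for a left-handed twist is a convention assumed here rather than something stated in the text.

```python
def rise_per_residue(pitch_angstrom: float, residues_per_turn: float) -> float:
    """Axial advance per residue (in Å) = helical pitch / residues per turn."""
    return pitch_angstrom / residues_per_turn


def twist_per_residue(residues_per_turn: float, left_handed: bool = False) -> float:
    """Rotation per residue in degrees; negative for a left-handed helix (assumed convention)."""
    angle = 360.0 / residues_per_turn
    return -angle if left_handed else angle


# Values quoted in the text: alpha helix 5.4 Å/turn with 3.6 residues per turn,
# PPII 9.3 Å/turn with 3 residues per turn.
alpha_rise = rise_per_residue(5.4, 3.6)
ppii_rise = rise_per_residue(9.3, 3.0)
ppii_twist = twist_per_residue(3.0, left_handed=True)

print(f"alpha helix rise: {alpha_rise:.2f} Å per residue")
print(f"PPII rise:        {ppii_rise:.2f} Å per residue")
print(f"PPII twist:       {ppii_twist:.1f} degrees per residue")
```

With these numbers each PPII residue advances roughly twice as far along the helix axis as an α-helix residue, which is what "far more extended" means in practice.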

The PolyprOnline database ( http://www.dsimb.inserm.fr/dsimb_tools/polyproline ) contains secondary structure assignments for a large subset of the Protein Data Bank. It can also process any new structure submitted by a user. Unlike other databases established for protein secondary structure analysis, PolyprOnline focuses on PPII, an assignment that is rarely documented in experimentally solved structures or in services and tools dedicated to the analysis of protein structures. For instance, 2struc ( http://2struc.cryst.bbk.ac.uk/about/ ) assigns proteins to three secondary structure states using six different algorithms ( 17 ), but none of them addresses PPII assignment. More general tools such as PDBsum ( 18 ) give the assignment of a single method, PROMOTIF ( 19 ) in this case, with no details about PPII. As previously mentioned, this assignment is especially important because PPII is the third most abundant regular secondary structure, just behind α-helix and β-strand, and is involved in various functions related to molecular interactions, such as protein–protein and protein–nucleic acid binding. However, assignments made with different tools show discrepancies, so our database provides assignments from the four main methods developed so far ( 10 ).

The data flow and processing steps performed by the system are summarized in Figure 1 .

Figure 1. Data flow in the PolyprOnline system. The system can be accessed in two ways: through ‘Simple query’, for the analysis of one or more protein structures identified by their PDB code, and through ‘Advanced query’, for more complex queries using criteria such as resolution (Å), protein length, and the minimal and maximal number or percentage of residues in PPII conformation assigned by a particular tool. The last type of advanced query allows a local structure search on specific positions using secondary structure patterns. It is also possible to upload and process a PDB file dynamically if it is absent from the database. The query is then processed so that it can be interpreted by our Database Management System. If a PDB structure is not found in the database, its PDB file can be downloaded from the Protein Data Bank website and processed dynamically by the system. The PolyprOnline webserver displays results as a summary of all proteins, identified by PDB code, title, size, resolution and PPII content, printed in a table that can be sorted on the values of its columns ( Figure 2 ). From this table, the detailed analysis of each protein can be accessed ( Figure 3 ).

Through the main interface, two types of search are possible. Both are detailed in the legend of Figure 1 : simple search (analysis of one or more protein structures) and advanced search, which uses specific criteria to perform more complex queries. One of the most interesting features is the secondary structure pattern query. This search looks for a fragment of specified conformation within protein structures using a simple regular expression pattern. Pattern search follows the classical rules for regular expressions. It is possible to use conformation code letters (e.g. HHHH-PPEEE) and to introduce wildcards (e.g. HHH**PP*-). It is also possible to specify the minimal and maximal length of a conformation (e.g. PPPX{1,8}PP).
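To illustrate how such a pattern query behaves, the sketch below translates the three example patterns into Python regular expressions and searches a per-residue assignment string. The server's exact pattern grammar is not described here, so the translation rests on two assumptions: that `*` and `X` each stand for any single conformation state, and that `{m,n}` bounds the repetition of the preceding symbol. The function names and the test strings are hypothetical.

```python
import re
from typing import List, Tuple


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a secondary structure pattern into a compiled regex.

    Assumed semantics (not confirmed by the PolyprOnline documentation):
    '*' and 'X' match any single state; '{m,n}' is kept as a repetition bound;
    every other character (H, E, P, '-', ...) matches itself.
    """
    parts = []
    for ch in pattern:
        if ch in "*X":
            parts.append(".")            # wildcard: any one conformation letter
        elif ch in "{},0123456789":
            parts.append(ch)             # repetition bounds such as {1,8}
        else:
            parts.append(re.escape(ch))  # literal state, '-' included
    return re.compile("".join(parts))


def find_fragments(assignment: str, pattern: str) -> List[Tuple[int, str]]:
    """Return (0-based start index, matched fragment) for each non-overlapping match."""
    regex = pattern_to_regex(pattern)
    return [(m.start(), m.group()) for m in regex.finditer(assignment)]


# Made-up assignment strings, one letter per residue.
print(find_fragments("--PPP-EEPP--", "PPPX{1,8}PP"))
print(find_fragments("HHH--PPE-", "HHH**PP*-"))
print(find_fragments("EEHHHH-PPEEE", "HHHH-PPEEE"))
```

Matching on the assignment string rather than on the sequence is the point of the design: the query describes a local fold, whatever residues happen to form it.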

The PolyprOnline webserver offers the following outputs:

A table of sortable results

Results are displayed in a table that can be sorted according to the values in different columns ( Figure 2 ). The results in the table can also be downloaded directly in text format. All proteins in the table are identified by PDB code, title, size, resolution and PPII content. The assignment of each protein can also be downloaded in classical FASTA format.

Figure 2. Results. At the top of the table, a dynamically generated pie chart displays the secondary structure content statistics of all entries for each tool. The table gives information on each selected protein chain. Each line corresponds to a PDB chain and each column to an attribute describing the entry. Entries can be ordered and re-ordered alphabetically (PDB, Title) or numerically (length, resolution, PPII number and percentage), in ascending or descending order. A free text search on a specific field is also possible. Each detailed analysis can be accessed from this table.

Individual protein data and analysis

The PolyprOnline web server provides access to different assignment methods and allows visualization of both regular secondary structure and PPII helix ( Figure 3 ). We have recently underlined the discrepancies between the three different secondary structure methods able to assign PPIIs, and proposed a novel PPII assignment using the de facto standard DSSP assignment method ( 10 , 14 ). To better visualize the secondary structure and PPII assignments given by PROSS ( 21 ), SEGNO ( 22 ), XTLSSTR ( 23 ) and our DSSP-PPII ( 10 , 14 ), they are all displayed below the sequence. A one-letter code is used to represent each specific conformation. Letters are coloured according to the more general class of secondary structure (e.g. helix residues in red, strand in green, PPII helix in blue, non-regular secondary structure in grey, and coil in dark grey) for fast visualization of the overall local structures. All data from the analysed protein structure can be downloaded.

Figure 3. Detailed analysis of a protein structure (3KWEA; 25 ). ( A ) The sequence and the secondary structures assigned by four different protein secondary structure assignment methods are printed as a 1D alignment. A one-letter code is used to represent each specific conformation. Letters are coloured according to the more general class of secondary structure (i.e. helix residues in red, strand in green, PPII helix in blue and non-regular secondary structure in grey). ( B ) Ramachandran plots give the distribution of φ and ψ torsion angles of PPII amino acids for each method. The most frequent areas for α-helix and β-sheet are shown in the background of the plot (represented by a colour scale). Statistics about these areas were derived from our previous study. Residues assigned as PPII are represented as white points. ( C ) Full 3D structure visualization and animation of the different assignments can be displayed dynamically using a JMol applet (Cα trace only, cartoon). Local conformations are coloured with the same colour scheme as in the 1D alignment in ( A ), i.e. helix residues in red, strand in green, PPII helix in blue and non-regular secondary structure in grey.

Ramachandran plots give the distribution of φ and ψ torsion angles for each assignment method. The most frequent areas for α-helix and β-sheet are shown in the background of the plot (represented by a colour scale). Statistics about these areas were derived from our previous study ( 10 ). Residues assigned as PPII are represented as white points. The image is mouse sensitive and gives additional information on the number, nature and φ and ψ angle values of each residue assigned as PPII. The assignments provided by the various tools can differ considerably. The Ramachandran plot lets the user inspect the distribution of φ and ψ values of PPII residues visually and helps to assess the relevance of each assignment.

Three-dimensional protein structures can be visualized and manipulated with a JMol applet ( 24 ). It displays the secondary structures assigned by each of the four methods and details the positions of secondary structures, with a particular emphasis on PPII. This visualization is also useful for observing differences between assignments directly in the protein structure.

Protein structures dataset

A subset of the experimental protein structures in the PDB was selected using the PISCES webserver ( 20 ), based on the experimental method (X-ray, RX), the quality of the structures (resolution lower than 3.0 Å and R-factor lower than 1.0) and limited redundancy (proteins share no more than 90% sequence identity with each other). The full list of selected structures comprises 24 761 protein chains and is available in the database. The list is regularly updated.

Assignment of PPII and other secondary structures

Currently, only a limited number of tools can assign PPII. The tools available today are XTLSSTR, PROSS (version September 2004) and SEGNO (version 3.1). To this list we have added our DSSP-based PPII program, built on DSSP (CMBI version 2000) and developed in our laboratory ( 10 ). As previously explained, the use of multiple tools is necessary because PPII assignments made with different methods have been shown to yield different results ( 10 ).

Secondary structures assigned by PROSS ( 21 ) are as follows: α helix (H), β turn (T), β strand (E), PPII (P) and coil (C). Assignments are based exclusively on Φ and Ψ dihedral angles.

The XTLSSTR algorithm ( 23 ) uses two angles and three distances to assign secondary structure from the coordinates in PDB files. It assigns the following secondary structures: α helix (H and h), 3₁₀ helix (G and g), hydrogen-bonded β turn (T), non-hydrogen-bonded β turn (N), extended β strand (E and e) and PPII (P and p).

SEGNO ( 22 ) also uses the Φ and Ψ dihedral angles, coupled with other angles, to assign secondary structures. It assigns α helix (H), β-strand (E and e), isolated β-strand (B and b), 3₁₀ helix (G and g), π-helix (I), coil (O, coded as ‘-’ in this database) and PPII (P and p).

DSSP-PPII is a new method for PPII assignment recently developed in our laboratory ( 10 ). It is based on the most popular secondary structure assignment tool, DSSP ( 14 ), whose assignment relies on identifying precise hydrogen bond patterns corresponding to regular secondary structures. The PPII assignment strategy uses a simple set of basic rules chosen to give the highest agreement with the PROSS, SEGNO and XTLSSTR methods. PPII is assigned solely in the coil region, for at least two consecutive coil residues with Φ = −75° ± ε and Ψ = +145° ± ε, where ε = 29°. The basic DSSP assignment defines eight types of secondary structure: α helix (H), extended β strand in parallel and/or anti-parallel β-sheet conformation (E), isolated β-strand (B), 3₁₀ helix (G), π-helix (I), hydrogen-bonded turn (T), bend (S) and coil (O, coded as ‘-’ in this database). The PPII helix (P) has been added to this basic assignment.
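The rule stated above (coil residues only, Φ and Ψ within ε = 29° of −75° and +145°, at least two consecutive residues) can be sketched in a few lines. This is a minimal illustration of the rule as worded in this section, not the DSSP-PPII program itself: the function names are hypothetical, coil is assumed to be encoded as '-', and residues with an undefined angle (given as `None`) are assumed never to qualify.

```python
from typing import Optional, Sequence

PHI_REF, PSI_REF, EPSILON = -75.0, 145.0, 29.0  # degrees, as given in the text


def in_ppii_region(phi: Optional[float], psi: Optional[float]) -> bool:
    """True if (phi, psi) lies within EPSILON of the PPII reference angles."""
    if phi is None or psi is None:   # e.g. chain termini: angle undefined (assumption)
        return False
    return abs(phi - PHI_REF) <= EPSILON and abs(psi - PSI_REF) <= EPSILON


def assign_ppii(dssp: str,
                phis: Sequence[Optional[float]],
                psis: Sequence[Optional[float]],
                coil: str = "-",
                min_run: int = 2) -> str:
    """Relabel runs of >= min_run consecutive coil residues in the PPII region as 'P'."""
    if not (len(dssp) == len(phis) == len(psis)):
        raise ValueError("dssp, phis and psis must have the same length")
    states = list(dssp)
    candidate = [s == coil and in_ppii_region(phi, psi)
                 for s, phi, psi in zip(dssp, phis, psis)]
    i, n = 0, len(states)
    while i < n:
        if not candidate[i]:
            i += 1
            continue
        j = i
        while j < n and candidate[j]:   # extend the run of candidate residues
            j += 1
        if j - i >= min_run:            # isolated candidates stay coil
            states[i:j] = ["P"] * (j - i)
        i = j
    return "".join(states)


# Made-up example: residues 2 and 3 form a PPII run; the last residue is isolated.
dssp = "HH---E-"
phis = [-60.0, -60.0, -75.0, -80.0, -150.0, -120.0, -75.0]
psis = [-45.0, -45.0, 145.0, 150.0, 150.0, 130.0, 145.0]
print(assign_ppii(dssp, phis, psis))
```

Because the rule only rewrites coil positions, every helix, strand, turn and bend assigned by DSSP is left untouched.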

Web interface and Database

The database management system used by PolyprOnline is MySQL. The PolyprOnline web interface is written mainly in PHP, Perl, R and JavaScript.

To better understand structure/function and structure/architecture relationships, the advanced search interface of PolyprOnline can be used to find proteins with a high PPII content. A query based on PPII frequency, or on the presence of a long PPII helix, can highlight different properties and peculiarities. Proteins with the highest PPII content show an over-representation of functions related to interaction mechanisms and/or binding, which is consistent with the observations in ( 11 ). Figure 4 shows examples involved in various functions: binding to cyclin-dependent kinases (A), cell adhesion (B), self-binding (C), neurotoxicity, an effect that involves blockade of acetylcholine receptors (D), and anti-freeze activity, where solvent interaction is fundamental (E). With more than 72% of its residues in PPII conformation, this anti-freeze protein has the highest percentage of PPII in our database. In these examples, the organization of the PPIIs shows both features specific to this regular conformation, namely rather isolated and exposed prolines in the cyclin-dependent kinase regulatory subunit (A), and features of other regular secondary structures: (i) similarities with α helix motifs, such as the PPII-beta-beta motif in Thrombospondin (B) and Atratoxin from cobra venom (D), and (ii) analogies with both alpha and beta motifs, as in GTP-binding protein Obg (C) and snow flea anti-freeze protein (E), where the PPII helices are arranged as a bundle of six anti-parallel PPII helices. All these PPIIs have in common a broad exposure to the solvent, as already highlighted in previous studies ( 11 ). Please note that these proteins are extreme cases in terms of PPII content and are provided for illustrative purposes. The longest continuous PPII helix, 13 residues long, is found in a lyase (2VK8A; 31 ). This quick analysis highlights the utility of the PolyprOnline database for the study of PPII.
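The two quantities used in this case study, the percentage of residues in PPII and the length of the longest continuous PPII helix, are easy to derive from a one-letter assignment string such as those the database lets users download. The sketch below is an illustration under assumptions: the function name is hypothetical, and only the letter 'P' is counted as PPII by default (the lower-case 'p' used by some tools can be added through the `ppii_states` argument).

```python
import re
from typing import Tuple


def ppii_stats(assignment: str, ppii_states: str = "P") -> Tuple[int, float, int]:
    """Return (number of PPII residues, PPII percentage, longest PPII run length)."""
    n = len(assignment)
    if n == 0:
        return 0, 0.0, 0
    count = sum(ch in ppii_states for ch in assignment)
    # Consecutive PPII letters form one helix; keep the longest such run.
    runs = re.findall(f"[{re.escape(ppii_states)}]+", assignment)
    longest = max((len(run) for run in runs), default=0)
    return count, 100.0 * count / n, longest


# Made-up assignment string: 5 of 10 residues are PPII, longest helix is 3 residues.
print(ppii_stats("--PPP-EEPP"))
```

Sorting or filtering a set of chains on these values corresponds to what the advanced query does with its minimal and maximal PPII number or percentage criteria.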

Figure 4. Some examples of proteins with a high number of PPII conformations revealed by the PolyprOnline database. ( A ) Cyclin-dependent kinase regulatory subunit (1CKSA; 26 ), ( B ) Thrombospondin (1LSLA; 27 ), ( C ) GTP-binding protein Obg (1UDXA; 28 ), ( D ) Atratoxin (1V6PA; 29 ) and ( E ) snow flea anti-freeze protein (2PNEA; 30 ). β sheets appear in cyan, while α helices are in red with their internal face in yellow. PPIIs are in violet, with their internal face in pink. Some PPII arrangements are very well organized as a bundle of six anti-parallel helices, as in snow flea anti-freeze protein ( E ) or in GTP-binding protein Obg ( C ). Other architectures are remarkable: the β-β-PPII or PPII-β-β architecture found in Thrombospondin ( B ) and Atratoxin ( D ) resembles the well-known β-β-α or α-β-β motif, which is built with an α helix instead of a PPII. The cyclin-dependent kinase regulatory subunit ( A ) does not show any specific PPII arrangement.

The authors would like to thank Stéphane Téletchéa for corrections and comments on the manuscript.

Funding: This work was supported by grants from the Ministry of Research (France); University Paris Diderot, Sorbonne Paris Cité (France); the National Institute for Blood Transfusion (INTS, France); the Institute for Health and Medical Research (INSERM, France); and ‘Investissements d'avenir’, Laboratory of Excellence GR-Ex (France) to R.C., S.L., A.G.B. and J.-C.G. Funding for open access charge: Institute for Health and Medical Research (INSERM, France).

1. Tyagi M., Bornot A., Offmann B. et al. (2009) Analysis of loop boundaries using different local structure assignment methods. Protein Sci., 18, 1869–1881.

2. Schrodinger L.L.C. (2010) The PyMOL Molecular Graphics System, Version 1.3r1.

3. Humphrey W., Dalke A., Schulten K. (1996) VMD - visual molecular dynamics. J. Mol. Graph., 14, 33–38.

4. Pettersen E.F., Goddard T.D., Huang C.C., Couch G.S., Greenblatt D.M., Meng E.C., Ferrin T.E. (2004) UCSF Chimera - a visualization system for exploratory research and analysis. J. Comput. Chem., 13, 1605–1612.

5. Offmann B., Tyagi M., de Brevern A.G. (2007) Local protein structures. Curr. Bioinformatics, 165–202.

6. Pauling L., Corey R.B. (1951) The structure of fibrous proteins of the collagen-gelatin group. Proc. Natl. Acad. Sci. USA, 37, 272–281.

7. Cowan P.M., McGavin S., North A.C. (1955) The polypeptide chain configuration of collagen. Nature, 176, 1062–1064.

8. Adzhubei A.A., Sternberg M.J. (1993) Left-handed polyproline II helices commonly occur in globular proteins. J. Mol. Biol., 229, 472–493.

9. Adzhubei A.A., Sternberg M.J. (1994) Conservation of polyproline II helices in homologous proteins: implications for structure prediction by model building. Protein Sci., 3, 2395–2410.

10. Mansiaux Y., Joseph A.P., Gelly J.C. et al. (2011) Assignment of polyproline II conformation and analysis of sequence–structure relationship. PLoS One, 6, e18401.

11. Adzhubei A.A., Sternberg M.J., Makarov A.A. (2013) Polyproline-II helix in proteins: structure and function. J. Mol. Biol., 425, 2100–2132.

12. Bermudez A., Calderon D., Moreno-Vranich A. et al. (2014) Gauche side-chain orientation as a key factor in the search for an immunogenic peptide mixture leading to a complete fully protective vaccine. Vaccine, 32, 2117–26.

13. Chevrier L., de Brevern A., Hernandez E. et al. (2013) PRR repeats in the intracellular domain of KISS1R are important for its export to cell membrane. Mol. Endocrinol., 27, 1004–1014.

14. Kabsch W., Sander C. (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 22, 2577–2637.

15. Frishman D., Argos P. (1995) Knowledge-based protein secondary structure assignment. Proteins, 23, 566–579.

16. Berman H.M., Westbrook J., Feng Z. et al. (2000) The protein data bank. Nucleic Acids Res., 28, 235–242.

17. Klose D.P., Wallace B.A., Janes R.W. (2010) 2Struc: the secondary structure server. Bioinformatics, 26, 2624–2625.

18. Laskowski R.A. (2001) PDBsum: summaries and analyses of PDB structures. Nucleic Acids Res., 29, 221–222.

19. Hutchinson E.G., Thornton J.M. (1996) PROMOTIF - a program to identify and analyze structural motifs in proteins. Protein Sci., 5, 212–220.

20. Wang G., Dunbrack R.L. Jr. (2005) PISCES: recent improvements to a PDB sequence culling server. Nucleic Acids Res., 33, W94–W98.

21. Srinivasan R., Rose G.D. (1999) A physical basis for protein secondary structure. Proc. Natl. Acad. Sci. USA, 96, 14258–14263.

22. Cubellis M.V., Cailliez F., Lovell S.C. (2005) Secondary structure assignment that accurately reflects physical and evolutionary characteristics. BMC Bioinformatics, 6 Suppl 4, S8.

23. King S.M., Johnson W.C. (1999) Assigning secondary structure from protein coordinate data. Proteins, 35, 313–320.

24. Jmol: an open-source Java viewer for chemical structures in 3D. http://www.jmol.org/

25. Pena K.L., Castel S.E., de Araujo C. et al. (2010) Structural basis of the oxidative activation of the carboxysomal gamma-carbonic anhydrase, CcmM. Proc. Natl. Acad. Sci. USA, 107, 2455–2460.

26. Parge A.S., Arvai H.E., Murtari D.J. et al. (1993) Human CksHs2 atomic structure: a role for its hexameric assembly in cell cycle control. Science, 262, 387–395.

27. Tan K., Duquette M., Liu J.H. et al. (2002) Crystal structure of the TSP-1 type 1 repeats: a novel layered fold and its biological implication. J. Cell Biol., 159, 373–382.

28. Kukimoto-Niino M., Murayama K., Inoue M. et al. (2004) Crystal structure of the GTP-binding protein Obg from Thermus thermophilus HB8. J. Mol. Biol., 337, 761–770.

29. Lou X., Liu Q., Tu X. et al. (2004) The atomic resolution crystal structure of atratoxin determined by single wavelength anomalous diffraction phasing. J. Biol. Chem., 279, 39094–39104.

30. Pentelute B.L., Gates Z.P., Tereshko V. et al. (2008) X-ray structure of snow flea antifreeze protein determined by racemic crystallization of synthetic protein enantiomers. J. Am. Chem. Soc., 130, 9695–9701.

31. Kutter S., Weiss M.S., Wille G. et al. (2009) Covalently bound substrate at the regulatory site of yeast pyruvate decarboxylases triggers allosteric enzyme activation. J. Biol. Chem., 284, 12136–12144.


  • DOI: 10.1093/database/bau102
  • Corpus ID: 17521069

PolyprOnline: polyproline helix II and secondary structure assignment database

  • Romain Chebrek , Sylvain Léonard , +1 author Jean-Christophe Gelly
  • Published in Database J. Biol. Databases… 6 November 2014
  • Biology, Computer Science
  • Database: The Journal of Biological Databases and Curation

Ask This Paper

By using this feature, you agree to AI2's terms and conditions and that you will not submit any sensitive or confidential info.

AI2 may include your prompts and inputs in a public dataset for future AI research and development. Please check the box to opt-out.

Ask a question about " "

Supporting statements, figures from this paper.

figure 1

COMMENTS

  1. DSSP

    Introduction. The DSSP program was designed by Wolfgang Kabsch and Chris Sander to standardize secondary structure assignment. DSSP is a database of secondary structure assignments (and much more) for all protein entries in the Protein Data Bank (PDB). DSSP is also the program that calculates DSSP entries from PDB entries.

  2. Using DSSP data

    Using DSSP data. DSSP provides an elaborate description of the secondary structure elements in a protein structure, including backbone hydrogen bonding and the topology of β-sheets. The most popular feature is the per-residue assignment of secondary structure with a single character code: H = α-helix. B = residue in isolated β-bridge.

  3. DSSP

    DSSP is a database of secondary structure assignments (and much more) for all protein entries in the Protein Data Bank (PDB). DSSP is also the name of the program that calculates DSSP entries from PDB entries. The above means there are actually two ways of looking at DSSP. First of all there are the precalculated DSSP files for each PDB entry.

  4. STRIDE: a web server for secondary structure assignment from known

    Here, we report a dedicated STRIDE web server and a database of secondary structure assignments. STRIDE WEB SERVER AND DATABASE. The STRIDE web server, written in the Python programming language, makes accessible all functions implemented in the STRIDE software and also provides several additional visualization tools (Figure 1).

  5. Stride Homepage

    Stride Services. This server offers an interactive interface to the secondary structure assignment program STRIDE. The method is presented in detail in: Frishman D, Argos P. Knowledge-Based Protein Secondary Structure Assignment. Proteins: Structure, Function, and Genetics 23:566-579 (1995). When using this server, please cite: Heinig, M ...

  6. STRIDE: Protein secondary structure assignment from atomic ...

    It relies on database-derived recognition parameters with the crystallographers' secondary structure definitions as a standard-of-truth. Please see Frishman and Argos [1] for a detailed description of the algorithm. Frishman, D. & Argos, P. (1995) Knowledge-based secondary structure assignment. Proteins: Structure, Function and Genetics, 23, 566-579.

  7. STRIDE: a web server for secondary structure assignment from known

    The STRIDE web server provides access to this tool and allows visualization of the secondary structure, as well as contact and Ramachandran maps for any file uploaded by the user with atomic coordinates in the Protein Data Bank (PDB) format. A searchable database of STRIDE assignments for the latest PDB release is also provided.

  8. 2Struc: the secondary structure server

    Abstract. Summary: The defined secondary structure of proteins method is often considered the gold standard for assignment of secondary structure from three-dimensional coordinates. However, there are alternative methods. '2Struc: The Secondary Structure Server' has been created as a single point of access for eight different secondary ...

  9. STRIDE -- A web server for secondary structure assignment from known

    The web server allows visualization of the secondary structure, as well as contact and Ramachandran maps for any file uploaded by the user with atomic coordinates in the Protein Data Bank (PDB) format. A searchable database of STRIDE assignments for the latest PDB release is also provided.

  10. DSSPcont: continuous secondary structure assignments for proteins

    Hence, secondary structure assignments are important to assure the optimal yield of experimental structures and to cleverly select the targets for structural genomics. ... Users may also access a DSSPcont database of pre-calculated assignments for all PDB records; this database is updated weekly with all new PDB entries. The interface is very ...

  11. GitHub

    The DSSP program was designed by Wolfgang Kabsch and Chris Sander to standardize secondary structure assignment. DSSP is a database of secondary structure assignments (and much more) for all protein entries in the Protein Data Bank (PDB). DSSP is also the program that calculates DSSP entries from PDB entries. DSSP does not predict secondary ...

  12. (PDF) STRIDE: a Web server for secondary structure assignment from

    STRIDE is a software tool for secondary structure assignment from atomic resolution protein structures. It implements a knowledge-based algorithm that makes combined use of hydrogen bond energy ...

  13. Protein secondary structure assignment revisited: a detailed analysis

    When needed, secondary structure assignments are reduced to three classes ... KAKSI takes a PDB file as input and prints the assigned secondary structure (and other data of interest) in an XML output. K2R reads a KAKSI XML output file and outputs the data in various FASTA format files by default. K2R allows users to easily implement any new ...

  14. Assigning secondary structure in proteins using AI

    Knowledge about protein structure assignment enriches the structural and functional understanding of proteins. Accurate and reliable structure assignment data is crucial for secondary structure prediction systems. Since the 1980s, various methods based on hydrogen bond analysis and atomic coordinate geometry, followed by machine learning, have been employed in protein structure assignment ...

  16. Protein secondary structure assignment using residual networks

    Knowledge about the secondary structure assignment contributes to a better understanding of proteins. Pauling and Corey identified the fundamental protein secondary structure elements (SSE), known as α-helices and β-sheets, in 1951. Together with irregular substructures called coils or loops, these regular substructures (α-helices and β-sheets) form the three-state classification (Q3 ...

  17. Assigning Secondary Structure in Proteins using AI

    Knowledge about protein structure assignment enriches the structural and functional understanding of proteins. Accurate and reliable structure assignment data is crucial for secondary structure prediction systems. Since the '80s, various methods based on hydrogen bond analysis and atomic coordinate geometry, followed by machine learning, have been employed in protein structure assignment.

  18. webservice

    DSSP is a database of secondary structure assignments (and much more) for all protein entries in the Protein Data Bank (PDB). STRIDE: protein secondary structure assignment with STRIDE (basic assignment, visual assignment, contact map, Ramachandran plot).

  19. PolyprOnline: polyproline helix II and secondary structure assignment

    Although PPII is less frequently present than regular secondary structures such as canonical alpha helices and beta strands, it corresponds to 3-10% of residues. Up to now, PPII is not assigned by most popular assignment tools, and therefore, remains insufficiently studied. PolyprOnline database is, therefore, dedicated to PPII structure ...

  20. PolyprOnline: polyproline helix II and secondary structure assignment

    In fact, PPII assignment is not done with the most common methods of secondary structure assignment, such as Dictionary of Protein Secondary Structure (DSSP; 14) and STRIDE, and therefore, ... However, assignments using different tools show discrepancies; thus our database provides assignments with the four main methods developed so far.

  21. PolyprOnline: polyproline helix II and secondary structure assignment

    PolyprOnline: polyproline helix II and secondary structure assignment database. Sylvain Leonard. 2014, Database.

  22. PolyprOnline: polyproline helix II and secondary structure assignment

    Regrettably, PPIIs are still insufficiently studied. In fact, PPII assignment is not done with the most common methods of secondary structure assignment, such as Dictionary of Protein Secondary Structure (DSSP; 14) and STRIDE (15), and therefore, newly solved protein structures are not assigned with PPII in Protein DataBank (16). Here we ...

  23. PolyprOnline: polyproline helix II and secondary structure assignment

    The polyproline helix type II (PPII) is a regular protein secondary structure with remarkable features, but it is not assigned by most popular assignment tools and therefore remains insufficiently studied. Many studies have highlighted different crucial biological roles supported by this ...
