The genome, the entire set of twirly helical DNA, carries the code for life in its genes. Genes dictate how an organism exhibits different traits and characteristics. However, genes alone do not tell us everything about the body's diseases and disorders. The next step after decoding the genome is to analyse the proteome: the full set of proteins, including peptides (shorter proteins), that are built from amino acids (the basic units of a protein) and produced according to the information in the genes. These proteins are essential components of our cells and keep our body functioning.
The Human Proteome Organization (HUPO) launched the Human Proteome Project (HPP) in 2010, a decade after the Human Genome Project released its decoded information about the elusive genome. The HPP is an international collaboration which aims to assemble, analyse and understand the molecular nature of the proteome. In a recent study, researchers from HUPO, including researchers from the Indian Institute of Technology Bombay (IIT Bombay), discuss the highly stringent standards the HPP uses to process and classify human proteins. The study was published in the journal Nature Communications.
The Project has two main objectives. First, it aims to catalogue the parts of the complex human proteome by establishing reliable standards. Second, it intends to make proteomics a necessary part of life science studies, so that the myriad roles proteins play in diseases can be understood. The HPP also provides crucial biochemical data, such as the changes a protein undergoes after it is synthesised, which cannot be obtained from the genome alone.
“The progressing technology of bioinformatics depends largely on data analysis, and still faces limitations in false discovery of protein function. The growth of the proteomics arena has understood this problem and provided stringencies in terms of protein and peptide identifications,” says Prof Sanjeeva Srivastava from IIT Bombay, the lead author of this study from India.
The HPP relies on four resources to enforce stringency in correctly identifying and classifying proteins. First, it uses antibodies to identify proteins, and the Project details the antibody-based techniques used to locate proteins and understand their roles. Second, it employs mass spectrometry (MS), a method that identifies proteins and peptides by measuring the mass-to-charge ratio of their ions, and adheres to set standards for the instruments and workflows used to process the raw MS data.
For the third resource, the HPP draws on pathology to provide the epidemiological evidence, access to clinical samples, and diagnostic regulatory policies needed to identify the proteins responsible for several disorders. Finally, it compiles all this information into a knowledge base (KB) that contains the structural and functional information about proteins and makes it available to the community. One such KB, neXtProt, includes MS data (obtained from various other databases), antibody data, the interactions among proteins and the influence of the genes.
The neXtProt database sorts proteins into five classes of credibility called Protein Existence (PE) levels. PE1 includes proteins that have clear experimental evidence for their structure and function. PE2 includes proteins whose structure and function have not been entirely identified. PE3 covers proteins inferred from their possible similarity to other proteins. PE4 proteins are predicted only from the precursor genomic data. PE5 usually consists of incorrectly analysed proteins. As of 2020, 90.4% (approximately 17,900 proteins) of the human proteome has credible PE1 evidence. This leaves the remaining 9.6% (about 1,900 proteins, also called the “missing proteome”) at the PE2, PE3 and PE4 levels, yet to be identified at high stringency.
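The PE classification and the split between PE1 and "missing" proteins can be illustrated with a small Python sketch. It is not neXtProt's own code or data format: the accession names and the toy list of entries are hypothetical, the level descriptions paraphrase the paragraph above, and leaving PE5 out of the percentage is an assumption inferred from the fact that the PE1 and PE2 to PE4 shares quoted above add up to 100%.

```python
from collections import Counter

# PE levels as summarised in the article (paraphrased, not official wording).
PE_LEVELS = {
    1: "clear experimental evidence",
    2: "structure and function not entirely identified",
    3: "inferred from similarity to other proteins",
    4: "predicted from genomic data only",
    5: "usually incorrectly analysed entries",
}

# The "missing proteome" is everything at PE2, PE3 or PE4.
MISSING_LEVELS = {2, 3, 4}


def summarise(entries):
    """Count PE1 and missing proteins in (accession, pe_level) pairs.

    Assumption: PE5 entries are left out of the denominator.
    """
    counts = Counter(level for _, level in entries)
    pe1 = counts[1]
    missing = sum(counts[level] for level in MISSING_LEVELS)
    considered = pe1 + missing
    share = 100.0 * pe1 / considered if considered else 0.0
    return {"pe1": pe1, "missing": missing, "pe1_share": share}


# Hypothetical toy data: made-up accessions, not real neXtProt entries.
toy_entries = [
    ("PROT_A", 1), ("PROT_B", 1), ("PROT_C", 1), ("PROT_D", 1),
    ("PROT_E", 1), ("PROT_F", 1), ("PROT_G", 1),
    ("PROT_H", 2), ("PROT_I", 3), ("PROT_J", 4),
    ("PROT_K", 5),  # dubious entry, ignored in the percentage
]

summary = summarise(toy_entries)
print(summary)
```

In this toy list, seven of the ten PE1 to PE4 entries are PE1, so the PE1 share is 70%. The same calculation applied to the real database counts is what yields the 90.4% figure.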
Protein assay tests have long been used in medical diagnostics, but they are prone to inaccuracy. The tools of proteomics, used alongside genomics, can help detect many pathogenic infections, including SARS-CoV-2 (the virus that causes COVID-19), and understand disorders such as cancer and cardiovascular conditions.
“The initiative of HPP in identifying and characterizing human proteome has opened new avenues in the field of Proteomics. The advancement of time and technology will add new milestones which will enhance the understanding of human biology and expedite the role of proteomics in diagnosis, prognosis and precision medicine-based applications,” adds Mr Deeptarup Biswas, a PhD scholar at IIT Bombay who was a part of this study, about the further goals of the HPP.
|Author(s) of the research paper||Subash Adhikari, Edouard C. Nice, Eric W. Deutsch, Lydie Lane, Gilbert S. Omenn, Stephen R. Pennington, Young-Ki Paik, Christopher M. Overall, Fernando J. Corrales, Ileana M. Cristea, Jennifer E. Van Eyk, Mathias Uhlén, Cecilia Lindskog, Daniel W. Chan, Amos Bairoch, James C. Waddington, Joshua L. Justice, Joshua LaBaer, Henry Rodriguez, Fuchu He, Markus Kostrzewa, Peipei Ping, Rebekah L. Gundry, Peter Stewart, Sanjeeva Srivastava, Sudhir Srivastava, Fabio C. S. Nogueira, Gilberto B. Domont, Yves Vandenbrouck, Maggie P. Y. Lam, Sara Wennersten, Juan Antonio Vizcaino, Marc Wilkins, Jochen M. Schwenk, Emma Lundberg, Nuno Bandeira, Gyorgy Marko-Varga, Susan T. Weintraub, Charles Pineau, Ulrike Kusebauch, Robert L. Moritz, Seong Beom Ahn, Magnus Palmblad, Michael P. Snyder, Ruedi Aebersold & Mark S. Baker.|
|Contact email||Mark Baker: firstname.lastname@example.org|
|Title of research||A high stringency blueprint of the human proteome|
|Bibliographic Info||Nature Communications, 16 October 2020, Vol. 11, Article 5301|
|Funding Information||Parts of this work were supported by grants to ProteoRed PRB3-ISCIII, PT17/0019/0001 Comunidad de Madrid Grant B2017/BMD-3817 (F.J.C.); Korean Ministry of Health and Welfare HI13C2098 and HI16C0257 (Y.K.P.); NIH grants P30ES017885 and U24CA210967 (G.S.O.), 5U01HL-13104204, PADOM-SPO11347 and PARYB-SPO112285 (M.P.S.); NCI CPTAC U24CA210985 and NCI EDRN U24CA115102 (D.W.C.); NIH National Institute of General Medical Sciences R01GM087221 (E.W.D./R.L.M.) and R24GM127667 (E.W.D.); NIH National Institute on Aging U19AG023122 (R.L.M.); NSF DBI-1933311 (E.W.D.); CIHR COVID-19 Rapid Research Funding (F20-01013), CIHR Foundation Grant FDN:14840 and Canada Research Chair (C.M.O.); Investissement d'Avenir Infrastructures Nationales en Biologie et Santé ANR-10-INBS-08 (Proteomics French Infrastructure, ProFI) (Y.V.); Wellcome Trust WT101477MA and 208391/Z/17/Z (J.A.V.); Knut and Alice Wallenberg Foundation (M.U., C.L., J.M.S., E.L.); Brazilian CAPES 88887.130697, CNPq 440613/2016-7, FAPERJ E-26/210.173/2018 (G.B.D.) and FAPERJ E-26/202.650/2018 (F.C.S.N.); Australian Commonwealth NCRIS (M.S.B.); NHMRC 1010303 (M.S.B., E.C.N.); Cancer Council NSW RG19-04 (M.S.B., S.B.A., E.C.N.); Cancer Institute NSW Fellowship 15/ECF/1-38 (S.B.A.); Sydney Vital CINSW Translational Cancer Research Centre grant (M.S.B., S.B.A., S.A.); 'Fight on the Beaches' (M.S.B., S.B.A., E.C.N., S.A.) funding and an International Macquarie Research Excellence Scholarship (S.A.)|
|Article written by||Parvathi Nair|
|Image credits||Photo by National Cancer Institute|