Volume 14 (2): 118-125, 2013

Predicted structure of an unknown protein sequence from US patent 8211428


Akash Kumar1, Mohd. Zakir Khawaja2, Ahsan Ul Haq Qureshi3


1Department of Bioinformatics, 2Department of Biotechnology, 1,2Uttaranchal College of Science & Technology, Dehradun, India; 3CENT & Chemistry Department, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia




Kumar A, Zakir Khawaja M, Ul Haq Qureshi A. Predicted structure of an unknown protein sequence from US patent 8211428A. Onl J Bioinform., 14 (2): 118-125, 2013.

A protein sequence from an unknown organism (US Patent 8211428), retrieved from the GenPept database (ACCESSION AFO13014), was characterized by 3D structure prediction, molecular weight, theoretical pI, atomic composition, transmembrane segment prediction and domain identification. The retrieved sequence was used for homology modelling to identify the organism and its species. The atomic composition of the 260-residue sequence was C1222H1916N354O371S17, a total of 3880 atoms, with a molecular weight of 28047.8 Da, suggesting that the atomic composition determines the sequence and that an alteration in any atom could change the sequence and its structure. The instability index (II) was 32.17, which classifies the protein as stable (values below 40 indicate stability). The theoretical pI was estimated to be 7.52, the pH at which the protein carries no net electrical charge. The protein domain of the submitted sequence was identified as Trypsin, with active sites at positions 120, 212 and 73 and an E-value of 1.4e-74. 3D modelling with the 3D-JIGSAW server produced a structure that, when analyzed with RasMol, contained 228 groups, 1753 bonds, 3 helices, 18 strands and 32 turns. BLASTP showed that the unknown protein sequence has complete similarity with a Homo sapiens sequence and probably belongs to that species.
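The atom count and molecular weight reported in the abstract can be cross-checked directly from the atomic composition. The sketch below is an illustration only, not the authors' procedure: the helper names (parse_formula, total_atoms, molecular_weight) are hypothetical, and the atomic masses are rounded standard average values assumed for this check.

```python
import re

# Rounded standard average atomic masses in Da (an assumption of this sketch).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def parse_formula(formula):
    """Return {element: count} for a flat formula such as 'C2H6O'."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # A missing number means a single atom of that element.
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def total_atoms(formula):
    """Total number of atoms in the formula."""
    return sum(parse_formula(formula).values())


def molecular_weight(formula):
    """Average molecular weight in Da from the element counts."""
    return sum(ATOMIC_MASS[e] * n for e, n in parse_formula(formula).items())


# Atomic composition as stated in the abstract.
FORMULA = "C1222H1916N354O371S17"

if __name__ == "__main__":
    print("atoms:", total_atoms(FORMULA))
    print("molecular weight (Da): %.1f" % molecular_weight(FORMULA))
```

With these masses the formula gives 3880 atoms and a weight of roughly 28.05 kDa, within 1 Da of the reported 28047.8, which is consistent with the value being in daltons rather than kilodaltons.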


Keywords: 3D modeling, BLASTP, E-Value, Query Coverage.