

 Online Journal of Bioinformatics 

 Volume 10 (1):74-81, 2009.

A statistical method for verification of rareness in DNA sequences using sequence encoding.


Meena K (a), Menaka K (b), Sundar TV (d)*, Subramanian KR (c)


(a) Principal and Head, Department of Computer Science, (b) Lecturer, Department of I.T. & Applications, (c) Professor, Department of M.C.A., Shrimati Indira Gandhi College, Tiruchirapalli 620 002, and (d) Lecturer (SS), Postgraduate and Research Department of Physics, National College, Tiruchirappalli 620 001, India.



Meena K, Menaka K, Sundar TV, Subramanian KR. A statistical method for verification of rareness in DNA sequences using sequence encoding. Online J Bioinformatics, 10 (1):74-81, 2009.

The consensus sequence of a family of related sequences is the sequence that shows the characteristics common to most members of the family. Consensus sequences are important in various DNA sequencing analyses and applications and are a convenient way of characterizing a family of molecules. Detection of rareness or anomaly in such sequences, followed by in-depth analysis, may reveal significant information about the structure, function and biochemical pathways of the genes, which is of much importance to biologists. This paper describes a new method of encoding DNA sequences for sequence analysis and the subsequent application of a statistical method, Analysis of Variance, to verify the nature of consensus among a family of given related sequences. The existence of consensus or rareness is judged from the calculated variation within the sequence data and within the group of sequences. For illustration, some DNA consensus sequences corresponding to human repetitive elements submitted to Repbase have been used.

KEY-WORDS: Rareness detection, sequence encoding, consensus sequence.
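
The general idea summarized in the abstract, numerically encoding each sequence and then comparing variation between and within sequences with Analysis of Variance, can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' procedure: the abstract does not specify the encoding scheme, so the mapping `BASE_CODE` (A=1, C=2, G=3, T=4) and the helper names `encode` and `one_way_anova_f` are hypothetical. The F statistic is the standard one-way ANOVA ratio of the between-group mean square to the within-group mean square, with each encoded sequence treated as one group.

```python
from typing import Dict, List, Sequence

# Hypothetical numeric encoding; the paper's actual scheme is not given in the abstract.
BASE_CODE: Dict[str, int] = {"A": 1, "C": 2, "G": 3, "T": 4}


def encode(seq: str) -> List[int]:
    """Map a DNA string to a list of integers using the assumed BASE_CODE table."""
    return [BASE_CODE[base] for base in seq.upper()]


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> float:
    """Return the one-way ANOVA F statistic, treating each sequence as one group."""
    k = len(groups)                      # number of groups (sequences)
    n = sum(len(g) for g in groups)      # total number of observations
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")

    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    if ss_within == 0:
        # Degenerate case: no variation inside any group.
        return float("inf") if ss_between > 0 else 0.0

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


if __name__ == "__main__":
    # Toy family of sequences, made up for illustration only.
    family = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
    f_stat = one_way_anova_f([encode(s) for s in family])
    print(f"F = {f_stat:.4f}")
```

A small F value indicates that the encoded sequences do not differ much in their means, which is in the spirit of a consensus, while a large value points to at least one sequence that departs from the rest. Note that any integer coding of the four bases imposes an arbitrary ordering, so the result depends on the assumed encoding.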