Volume 12(2):274-288, 2011.

Neighbour joining microarray data clustering algorithm


B. Rajendran1, Vel Murugan K.2, Premnath D.3*, Patric Gomez4


Department of Bioinformatics, Karunya University, India.




Rajendran B, Murugan V, Premnath K, Gomez P., Neighbour joining microarray data clustering algorithm, Onl J Bioinform., 12(2):274-288, 2011.

Gene clustering groups related genes into the same cluster. The K-means clustering algorithm is commonly used for gene expression analysis, but it has drawbacks that affect clustering accuracy. Neighbour-Joining (NJ) has been widely used for phylogenetic reconstruction because it combines computational efficiency with reasonable accuracy, and RapidNJ is an extension of the algorithm that reduces the average clustering time. However, the O(n²) space consumption of RapidNJ is a problem when inferring phylogenies from large data sets. This work describes a method that reduces memory requirements and enables RapidNJ to process large data sets. An improved search heuristic for RapidNJ also improved its performance on data sets. The performance of RapidNJ was evaluated in terms of accuracy and running time on lymphoma and leukemia data sets.


Keywords: Gene clustering, DNA, microarray, Neighbour Joining, RapidNJ
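
For orientation, the joining step that the abstract refers to can be sketched as follows. This is a minimal illustration of the canonical Neighbour-Joining criterion, not RapidNJ and not the authors' implementation; the function names `nj_pick_pair` and `nj_join` are hypothetical. It stores the full n x n distance matrix, which is the O(n²) space cost mentioned above, and it scans every pair exhaustively, which is the search that RapidNJ's heuristic is meant to shorten.

```python
def nj_pick_pair(d):
    """Return ((i, j), q): the pair minimising the standard NJ criterion
    Q(i, j) = (n - 2) * d[i][j] - r[i] - r[j], where r[i] is the i-th row sum.
    d is a symmetric n x n distance matrix given as a list of lists."""
    n = len(d)
    r = [sum(row) for row in d]          # row sums, one per cluster
    best, best_q = None, float("inf")
    for i in range(n):                   # exhaustive scan over all pairs
        for j in range(i + 1, n):
            q = (n - 2) * d[i][j] - r[i] - r[j]
            if q < best_q:
                best, best_q = (i, j), q
    return best, best_q


def nj_join(d, i, j):
    """Merge clusters i and j into a new node u and return the reduced matrix.
    The new node is placed last; d(u, k) = (d[i][k] + d[j][k] - d[i][j]) / 2."""
    n = len(d)
    keep = [k for k in range(n) if k not in (i, j)]
    reduced = [[d[a][b] for b in keep] for a in keep]
    new_row = [0.5 * (d[i][k] + d[j][k] - d[i][j]) for k in keep]
    for row, value in zip(reduced, new_row):
        row.append(value)                # distance from each kept node to u
    reduced.append(new_row + [0.0])      # row for the new node u
    return reduced
```

Repeating `nj_pick_pair` followed by `nj_join` until three or fewer clusters remain gives the usual NJ tree topology; for gene clustering, the input matrix would hold pairwise distances between expression profiles, and the choice of distance measure is left open here.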