©1996-2019 All Rights Reserved. Online Journal of Bioinformatics. This journal satisfies the
refereeing requirements (DEST) for the Higher Education Research Data Collection (Australia).
OJB™ Online Journal of Bioinformatics©
Volume 8 (2):139-153, 2007.
Shape-to-String Mapping: A Novel Approach To Clustering Time-Index Biomics Data
Antoine W¹, Miernyk JA¹,²,³
¹Department of Biochemistry, ²USDA, Agricultural Research Service, Plant Genetics Research Unit,
and ³Interdisciplinary Plant Group,
ABSTRACT
Antoine W, Miernyk JA. Shape-to-String Mapping: A Novel Approach To
Clustering Time-Index Biomics Data. Onl J Bioinform. 8(2):139-153, 2007.
Herein we describe a qualitative approach for clustering time-index biomics
data. The data are transformed into angles derived from the intensity ratios
between adjacent time points. A code then maps the numerical time-index data
to a qualitative representation that captures the features defining the shape
of the expression pattern as a function of time. The problem of clustering
time-index biomics data is thereby either solved directly or reduced to a
problem similar to the well-studied task of clustering protein sequence data.
For datasets with few time points, the words derived from the transformation
are adequate to define clusters.
Dissimilarities between the newly defined objects can be estimated, and
the resulting distance matrix can be used for further clustering. Results from
transcript profiling of developing soybean embryos illustrate the utility of
the method. Comparative mapping of the intensity ratios and the angles by
multidimensional scaling and Procrustes analysis revealed otherwise cryptic
information within the data.
To compare the effectiveness of the code in clustering the data, Euclidean
distance matrices were calculated from the words and the corresponding gene
list using the PHYLogeny Inference Package (PHYLIP) algorithms and the Point
Accepted Mutation (PAM) scoring matrix.
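
The transformation summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the angle formula, the code alphabet, or the bin boundaries, so everything below is an assumption made for illustration. Here the angle of each step is taken as the arctangent of the log2 intensity ratio over a unit time step, the five-letter alphabet (D, d, f, u, U) and its thresholds are hypothetical, and a plain Hamming distance stands in for the PAM-style scoring mentioned in the abstract.

```python
import math
from collections import defaultdict


def ratios_to_angles(profile):
    """Return one angle (degrees) per pair of adjacent time points.

    Assumption: angle = atan(log2(ratio)) with a unit time step.
    Intensities must be strictly positive.
    """
    return [math.degrees(math.atan(math.log2(b / a)))
            for a, b in zip(profile, profile[1:])]


def angle_to_letter(angle):
    """Map an angle to a letter using hypothetical bin boundaries."""
    if angle <= -45:
        return "D"  # strong decrease
    if angle < -10:
        return "d"  # moderate decrease
    if angle <= 10:
        return "f"  # roughly flat
    if angle < 45:
        return "u"  # moderate increase
    return "U"      # strong increase


def shape_word(profile):
    """Encode the shape of a time-index profile as a word."""
    return "".join(angle_to_letter(a) for a in ratios_to_angles(profile))


def cluster_by_word(profiles):
    """Group profile names that share exactly the same word."""
    clusters = defaultdict(list)
    for name, profile in profiles.items():
        clusters[shape_word(profile)].append(name)
    return dict(clusters)


def hamming(word_a, word_b):
    """Count mismatched positions between two equal-length words."""
    if len(word_a) != len(word_b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(word_a, word_b))


def distance_matrix(words):
    """Pairwise Hamming distances, usable as input for further clustering."""
    return [[hamming(a, b) for b in words] for a in words]


# Made-up intensities for three hypothetical genes at four time points.
profiles = {
    "geneA": [1.0, 4.0, 4.0, 1.0],
    "geneB": [5.0, 20.0, 20.0, 5.0],
    "geneC": [8.0, 2.0, 2.0, 3.0],
}
clusters = cluster_by_word(profiles)
distances = distance_matrix(list(clusters))
```

In this sketch geneA and geneB differ fivefold in absolute intensity but yield the same word, because only the ratios between adjacent time points enter the encoding. With few time points the words alone define the groups. For longer series, the distance matrix could be passed to a standard clustering routine.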
Key words: String Map, Cluster, Biomics