Semi-Parametric Model-Based Clustering for DNA Microarray Data
Title | Semi-Parametric Model-Based Clustering for DNA Microarray Data |
Publication Type | Conference Papers |
Year of Publication | 2006 |
Authors | Han B, Davis LS |
Conference Name | Pattern Recognition, 2006. ICPR 2006. 18th International Conference on |
Date Published | 2006 |
Keywords | DNA microarray data; Gaussian kernel; Gaussian mixtures; curvature fitting; data representation; expectation maximization; gene expression data; maximum likelihood; mean-shift procedure; semiparametric model-based clustering; DNA; Gaussian processes; biology computing; genetics; pattern clustering |
Abstract | Various clustering methods have been proposed for the analysis of gene expression data, but conventional clustering algorithms have several critical limitations, notably the need to set parameters such as the number of clusters and the initial cluster centers. In this paper, we propose a semi-parametric model-based clustering algorithm in which the underlying model is a mixture of Gaussians. Each gene expression data point defines a Gaussian kernel, so the uncertainty of microarray data is naturally integrated into the data representation. Our algorithm provides a principled method to automatically determine the parameters (the number of components in the mixture and the mean, covariance, and weight of each Gaussian) by the mean-shift procedure (Comaniciu and Meer, 1999) and curvature fitting. After this initialization, the expectation maximization (EM) algorithm is employed for clustering to achieve maximum likelihood (ML). The performance of our algorithm is compared with that of the standard EM algorithm using real as well as synthetic data. |
DOI | 10.1109/ICPR.2006.1044 |
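
The pipeline outlined in the abstract (mode finding by mean shift, initialization of a Gaussian mixture from the modes, then EM refinement) can be illustrated with a minimal sketch. This is not the authors' implementation, and it simplifies in two places: it uses one fixed isotropic bandwidth instead of a per-gene uncertainty kernel, and it estimates each initial covariance from the points assigned to a mode instead of by curvature fitting. The function names (`mean_shift_modes`, `init_from_modes`, `em_gmm`) and all parameter defaults are hypothetical.

```python
import numpy as np

def mean_shift_modes(X, bandwidth, n_iter=200, tol=1e-6, merge_tol=None):
    """Shift every point uphill on a Gaussian kernel density estimate and
    return the distinct convergence points (modes) and a mode label per point."""
    merge_tol = bandwidth if merge_tol is None else merge_tol
    Y = X.copy()
    for _ in range(n_iter):
        # Squared distances from each current position to every data point.
        d2 = ((Y[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        W = np.exp(-0.5 * d2 / bandwidth ** 2)
        Y_new = (W @ X) / W.sum(axis=1, keepdims=True)  # kernel-weighted mean
        shift = np.abs(Y_new - Y).max()
        Y = Y_new
        if shift < tol:
            break
    # Merge convergence points that lie within merge_tol of an existing mode.
    modes, labels = [], np.empty(len(X), dtype=int)
    for i, y in enumerate(Y):
        for k, m in enumerate(modes):
            if np.linalg.norm(y - m) < merge_tol:
                labels[i] = k
                break
        else:
            modes.append(y)
            labels[i] = len(modes) - 1
    return np.array(modes), labels

def init_from_modes(X, modes, labels, reg=1e-6):
    """Assumed initialization: weight = fraction of points per mode,
    covariance = scatter of those points around the mode (not curvature fitting)."""
    d = X.shape[1]
    weights = np.bincount(labels, minlength=len(modes)) / len(X)
    covs = []
    for k in range(len(modes)):
        diff = X[labels == k] - modes[k]
        covs.append(diff.T @ diff / len(diff) + reg * np.eye(d))
    return weights, modes.copy(), np.array(covs)

def log_gauss(X, mean, cov):
    """Log density of a multivariate Gaussian, evaluated row-wise on X."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, (X - mean).T)
    return -0.5 * (d * np.log(2 * np.pi) + 2 * np.log(np.diag(L)).sum() + (z ** 2).sum(0))

def em_gmm(X, weights, means, covs, n_iter=100, tol=1e-8, reg=1e-6):
    """Standard EM for a Gaussian mixture, started from the given parameters."""
    n, d = X.shape
    prev = -np.inf
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        logp = np.stack([np.log(w) + log_gauss(X, m, c)
                         for w, m, c in zip(weights, means, covs)], axis=1)
        mx = logp.max(axis=1, keepdims=True)
        log_norm = mx + np.log(np.exp(logp - mx).sum(axis=1, keepdims=True))
        resp = np.exp(logp - log_norm)
        ll = log_norm.sum()
        # M-step: re-estimate weights, means, and covariances.
        Nk = resp.sum(axis=0)
        weights = Nk / n
        means = (resp.T @ X) / Nk[:, None]
        covs = np.array([
            (resp[:, k, None] * (X - means[k])).T @ (X - means[k]) / Nk[k] + reg * np.eye(d)
            for k in range(len(Nk))
        ])
        if ll - prev < tol:
            break
        prev = ll
    return weights, means, covs, resp

# Usage on synthetic data: two well-separated 2-D clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (100, 2)), rng.normal(5, 0.5, (100, 2))])
modes, labels = mean_shift_modes(X, bandwidth=1.0)
weights, means, covs, resp = em_gmm(X, *init_from_modes(X, modes, labels))
print(len(modes), means.round(2))
```

The number of mixture components is never passed in: it is the number of modes that mean shift finds, which is the property the abstract emphasizes. The bandwidth still has to be chosen in this sketch, and a poor choice can merge or split clusters.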