Edward Wijaya, Hajime Harada & Paul Horton
AIST, Computational Biology Research Center
Proceedings of Bioscience and BioTechnology (BSBT2008), Sānyà, China, December 2008.
We report the results of fitting mixture models to the distribution of expression values for individual genes over a broad range of normal tissues, which we call the marginal distribution of the gene. The base distributions used were normal, lognormal and gamma. The expectation-maximization algorithm was used to learn the model parameters. Experiments with artificial data were performed to ascertain the robustness of learning. Applying the procedure to data from two publicly available microarray datasets, we conclude that the lognormal distribution performed best for modeling the marginal distributions of gene expression. Our results should provide guidance in the development of informed priors or gene-specific normalization for use with gene network inference algorithms.
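To make the fitting procedure concrete, the following is a minimal sketch of expectation-maximization for a mixture of lognormal components, assuming strictly positive expression values for a single gene. It is not the authors' implementation: the function name `fit_lognormal_mixture`, its parameters, the quantile-based initialization and the stopping rule are all illustrative assumptions, and the normal and gamma base distributions are not covered. The sketch relies on the fact that a lognormal mixture in x is a normal mixture in log x, so the updates are the standard Gaussian-mixture ones applied to the log-transformed values.

```python
import math


def fit_lognormal_mixture(x, k=2, n_iter=200, tol=1e-9):
    """EM for a k-component lognormal mixture (hypothetical helper).

    x must contain strictly positive values. Returns (weights, mu, sigma, ll),
    where mu and sigma are the component parameters on the log scale and ll is
    the log-likelihood of x evaluated in the last E-step.
    """
    # A lognormal mixture in x is a normal mixture in y = log(x).
    y = sorted(math.log(v) for v in x)
    n = len(y)
    sum_y = sum(y)

    # Assumed initialization: means at evenly spaced quantiles of y,
    # a shared standard deviation, and uniform mixing weights.
    mu = [y[int((j + 0.5) * n / k)] for j in range(k)]
    mean_y = sum_y / n
    sd = math.sqrt(sum((v - mean_y) ** 2 for v in y) / n) or 1.0
    sigma = [sd] * k
    w = [1.0 / k] * k

    half_log_2pi = 0.5 * math.log(2.0 * math.pi)
    prev_ll = -math.inf
    ll = prev_ll
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        resp = []
        # Jacobian term: log f(x) = log N(log x) - log x for each point.
        ll = -sum_y
        for v in y:
            log_p = [
                math.log(w[j]) - math.log(sigma[j]) - half_log_2pi
                - 0.5 * ((v - mu[j]) / sigma[j]) ** 2
                for j in range(k)
            ]
            m = max(log_p)
            lse = m + math.log(sum(math.exp(lp - m) for lp in log_p))
            ll += lse
            resp.append([math.exp(lp - lse) for lp in log_p])

        # M-step: weighted maximum-likelihood updates for each component.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12  # guard against empty component
            w[j] = nj / n
            mu[j] = sum(r[j] * v for r, v in zip(resp, y)) / nj
            var = sum(r[j] * (v - mu[j]) ** 2 for r, v in zip(resp, y)) / nj
            sigma[j] = math.sqrt(var) + 1e-6  # floor to avoid collapse to zero

        # Stop once the log-likelihood no longer improves appreciably.
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll

    return w, mu, sigma, ll
```

Calling `fit_lognormal_mixture(values, k=2)` on the expression values of one gene returns the mixing weights and the log-scale means and standard deviations of each component. Comparing the returned log-likelihood across different values of `k`, or against fits with other base distributions, would need a separate model-selection criterion that is not sketched here.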