We develop scalable statistical methods to analyze massive genomic data sets.

Our people use computer science, statistics, and genetics to turn data into knowledge.


We aim to improve the diagnosis and treatment of cancer and other genetic diseases.

Learn more about our ongoing research projects or view our publications.



A mixed membership model for heterogeneous tumor subtype classification.



A hierarchical Bayesian model to detect rare single nucleotide variants in next-generation sequencing data.



A scalable genomic data search engine.


We are training and supporting a new generation of investigators who are proficient in computer science, statistics, and genetics and who can tackle the most difficult problems in genomic medicine.

BME2211: Biomedical Data Analysis and Programming

Learn to program with MATLAB and analyze real biomedical data sets using data analysis methods including dimensionality reduction, classification, clustering, and regression.

BME595L: Machine Learning for Biomedical Informatics

A probabilistic graphical model approach to statistical machine learning. Learn theoretical and practical aspects of statistical inference and their application to biomedical data analysis problems.

About Us

Our people grow through theory and practice.

Patrick Flaherty, Principal Investigator

Patrick is broadly interested in scalable, accurate algorithms for gaining understanding from massive biomedical data.

Hachem Saddiki, Graduate Student

Hachem focuses on developing efficient inference algorithms for Bayesian statistical models that identify biomarkers from genomic assays on heterogeneous tumor samples. More generally, he is interested in machine learning methods for large-scale genomic data sets.

Fan Zhang, Graduate Student

Fan earned a master's degree in Bioinformatics from Jilin University, China, in 2012. She then worked as a research assistant at the Chinese Academy of Sciences, focusing on bio-signal processing. She is interested in next-generation sequencing (NGS) data analysis and in biomarkers based on microarray data. Her current work concentrates on detecting rare single nucleotide variants in NGS data using Bayesian statistical models.

Chuangqi Wang, Graduate Student

Chuangqi earned his master's degree in Electronics Engineering & Computer Science (EECS), with a concentration in robot motion planning, from Peking University in 2012. He has also worked on image processing and robotics at the Chinese Academy of Sciences (CAS). Currently, he is interested in statistical machine learning methods for large-scale biomedical data.

Erin Esco, Undergraduate Student

Erin is a sophomore majoring in Computer Science at WPI. She is interested in working with large data sets: developing analysis pipelines and finding solutions for storing and mining data.