As a researcher at the National Cancer Institute, part of the National Institutes of Health, I work broadly in genomics and the bioinformatics of high-throughput data analysis. In addition to data analysis and biologically driven projects, I have a strong interest in developing software and tools for analysing and interpreting genomic data, and I help to lead the Bioconductor project. Take a look at my publications page or my Google Scholar profile for details.

Recent Blog Posts

Publicly Available Human Genome Variant Databases

Tuesday, October 21, 2014

Back in the day, there was one variant database, and only one--dbSNP. Then came HapMap and 1000 Genomes. Now there are many such resources, each of which can be useful for annotating and filtering variants found in our own data. Here, I have pulled together a few of these resources along with a little relevant information about each.

Convert from Sweave to R markdown vignettes

Sunday, September 21, 2014

I recently converted my GEOquery vignette from Sweave to R Markdown and wrote up a few notes on the process.

Software cataloging made simple--the R DESCRIPTION file

Saturday, August 30, 2014

The NIH has just released a Request for Information: Input on Information Resources for Data-Related Standards Widely Used in Biomedical Science. While the R package system is no panacea, it *is* a successful system by many measures and represents a practical approach to managing the metadata associated with software. At the core of this system is the R DESCRIPTION file, which is computable, human-readable, and extensible.
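
To make "computable" concrete, here is a minimal sketch of reading a DESCRIPTION file as plain `Field: value` records, with indented lines treated as continuations of the previous field. It is written in Python only to show that the format is easy to consume outside R; within R itself, `read.dcf()` does this job. The package name, versions, and the `ExampleNote` field are hypothetical and exist only for illustration, and the parser deliberately ignores finer points of the format.

```python
def parse_description(text):
    """Parse 'Field: value' text into a dict, joining continuation lines."""
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0] in " \t":
            # An indented line continues the previous field's value.
            if current is None:
                raise ValueError("continuation line before any field")
            fields[current] += " " + line.strip()
        else:
            name, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"malformed line: {line!r}")
            current = name.strip()
            fields[current] = value.strip()
    return fields


def split_deps(value):
    """Split a comma-separated dependency field into individual entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


# Hypothetical DESCRIPTION content, for illustration only.
SAMPLE = """\
Package: examplepkg
Version: 0.1.0
Title: A Hypothetical Example Package
Depends: R (>= 3.1.0)
Imports: methods,
    utils
ExampleNote: an extra, non-standard field
"""

desc = parse_description(SAMPLE)
print(desc["Package"], desc["Version"])
print(split_deps(desc["Imports"]))
print(desc["ExampleNote"])
```

The extra field is carried through the parser like any other, which is one way to read the "extensible" claim: a tool that does not recognise a field can pass it along untouched.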