As a researcher at the National Cancer Institute, part of the National Institutes of Health, I work broadly in genomics and the bioinformatics of high-throughput data analysis. In addition to data analysis and biologically driven projects, I have a strong interest in developing software and tools for analysing and interpreting genomic data, and I help lead the Bioconductor project. Take a look at my publications page or my Google Scholar profile for details.
Recent Blog Posts
Publicly Available Human Genome Variant Databases
Tuesday, October 21, 2014
Back in the day, there was one variant database, and only one--dbSNP. Then came HapMap and 1000 Genomes. Now there are many such resources, each of which can be useful for annotating and filtering variants found in our own data. Here, I have pulled together a few of them, with a little relevant information about each.
Convert from Sweave to R markdown vignettes
Sunday, September 21, 2014
I recently converted my GEOquery vignette from Sweave to R markdown and wrote up a few notes on the process.
Software cataloging made simple--the R DESCRIPTION file
Saturday, August 30, 2014
The NIH has just released a Request for Information: Input on Information Resources for Data-Related Standards Widely Used in Biomedical Science. While the R package system is no panacea, it *is* a successful system by many measures and represents a practical approach to managing the metadata associated with software. At the core of this system is the R DESCRIPTION file, which is computable, human-readable, and extensible.