Simpleaffy: easy analysis routines for Affymetrix data

Simpleaffy is a Bioconductor package designed to provide a starting point for exploring Affymetrix data, and to provide functions for some of the most common tasks we found ourselves doing over and over again. It also provides access to many of the standard QC functions recommended for Affymetrix arrays.

It is based on the affy package, which does most of the hard work. affy provides a variety of functions for processing Affymetrix data, with many more in affycomp. Even so, some tasks (such as computing t-tests and fold changes between replicate groups, plotting scatter plots and generating tables of annotated 'hits') require a bit of coding, and some of the most commonly used functions can be slower than we would like. This package aims to provide high-level methods for these routine analysis tasks, and many of them have been re-implemented in C for speed.
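
To make the replicate-group comparison concrete, here is a minimal sketch of a fold change and a t statistic for a single probeset. It is written in Python for illustration only and is not simpleaffy's code: the function names, the example values and the choice of Welch's unequal-variance t-test are all assumptions, and expression values are assumed to be on a log2 scale already.

```python
import math
from statistics import mean, variance


def log2_fold_change(group_a, group_b):
    """Difference of mean log2 expression, group_b minus group_a.

    Assumes the inputs are already log2-transformed, so a result of 1.0
    corresponds to a two-fold increase in group_b.
    """
    return mean(group_b) - mean(group_a)


def welch_t(group_a, group_b):
    """Welch's t statistic and its approximate degrees of freedom.

    Welch's test is an assumption made for this sketch; it does not
    require the two replicate groups to share a variance.
    """
    na, nb = len(group_a), len(group_b)
    # Sample variances (n - 1 denominator), scaled by group size.
    sa, sb = variance(group_a) / na, variance(group_b) / nb
    t = (mean(group_b) - mean(group_a)) / math.sqrt(sa + sb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    return t, df


# Hypothetical log2 expression values for one probeset, three replicates each.
control = [7.0, 7.2, 6.8]
treated = [9.0, 9.1, 8.9]

fc = log2_fold_change(control, treated)
t, df = welch_t(control, treated)
print(f"log2 fold change = {fc:.2f}, t = {t:.2f}, df = {df:.2f}")
```

In practice the same calculation is repeated for every probeset on the array, which is why a compiled implementation pays off.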

Quality Control (QC) of Affymetrix data

Affymetrix recommend a series of QC metrics that should be used to check that arrays have hybridised correctly and that sample quality is acceptable. Simpleaffy provides a series of QC functions that can be used to assess the arrays in a project.

Implementation of MAS 5.0 algorithms

Simpleaffy provides a fast C implementation of the MAS 5.0 algorithm, which generates expression levels for each probeset.
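
As a rough illustration of the kind of calculation involved, the sketch below shows a one-step Tukey biweight, the robust average that Affymetrix's published description of the MAS 5.0 statistical algorithms uses to summarise probe-level values into a single probeset value. It is a simplified Python sketch, not simpleaffy's C code: the function name and example values are hypothetical, and the background correction, ideal-mismatch and scaling steps of the full algorithm are left out.

```python
from statistics import median


def tukey_biweight(values, c=5.0, epsilon=1e-4):
    """One-step Tukey biweight: a robust, outlier-resistant average.

    c is the tuning constant and epsilon guards against division by zero
    when all values are identical. The defaults are the ones given in
    Affymetrix's algorithm description; treat them as assumptions here.
    """
    m = median(values)
    # Median absolute deviation from the median.
    s = median(abs(x - m) for x in values)
    weights = []
    for x in values:
        u = (x - m) / (c * s + epsilon)
        # Values far from the median get weight zero.
        weights.append((1 - u * u) ** 2 if abs(u) <= 1 else 0.0)
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


# Hypothetical log2 probe-level values for one probeset, with one outlier.
probe_values = [8.0, 8.1, 7.9, 8.05, 7.95, 14.0]
print(tukey_biweight(probe_values))
```

The outlying value receives zero weight, so the summary stays close to the bulk of the probes instead of being pulled towards it as a plain mean would be.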

As with any re-implementation of an algorithm, variations can occur, and we've done a significant amount of testing to see how close simpleaffy gets to the values generated by Affymetrix's implementations. You should be aware of these differences and, if in any doubt, use MAS 5.0 or GCOS to generate your expression calls. You can find details of which arrays have been tested, how the testing was done and its results, here.