Microarray introduction

An Introduction to Microarrays

Microarray technology enables the concurrent collection and mining of expression data from thousands of genes (Brown and Botstein, 1999; Kerr and Churchill, 2001). It has been used to analyse gene expression in a diverse range of biological processes. Examples include measuring gene expression responses to pathogens to support drug development (Grunblatt, 2004; Shultz et al., 2004), studying fruit firmness in strawberries (Salentijn et al., 2003), investigating diurnal patterns in expression (Schaffer et al., 2001; Qian et al., 2003), and analysing gene expression in developmental mutants (Moon et al., 2003; Ohgishi et al., 2004).

Detection of transcript by microarrays

Microarrays detect the mRNA transcripts present at a given moment, providing an indication of protein abundance, though mRNA and protein levels do not always correlate (Gygi et al., 1999). A recent poll of microarray users by the National Cancer Institute suggests that differences in protein amounts correlate with mRNA levels in only fifty percent of cases (http://www.cancer.gov/tarp). Microarrays provide information on the expression of a large number of genes: over 24000 gene sequences (~94% of the total gene complement) in the case of the Affymetrix full Arabidopsis genome chip, ATH1. There are, however, disadvantages associated with the technology. The number of genes analysed can make interpretation of the data difficult, individual hybridisations can be noisy, generating variation between experiments, and single data points may prove unreliable. This is particularly the case for genes with low expression levels. Furthermore, the most highly expressed genes, or those showing the largest differences in expression in a particular comparison, may not be the most biologically relevant. Often genes with known biological functions show only a slight, though significant, change in transcript levels (Causton et al., 2003).

Microarray Experiment Difficulties

To minimise the problems associated with microarray experiments, careful experimental design is imperative. Variation in the data usually arises from three sources: measurement error associated with the reading of fluorescent signals; natural variability in the biological system; and technical variation due to the extraction, labelling and hybridisation of samples (Churchill, 2002). It is therefore important to reduce the number of non-essential variables. Unless environmental factors such as light, temperature, humidity and time of sampling are themselves under study, they should be kept as constant as possible. Pooling samples helps to reduce variability between individual samples (Bakay et al., 2002). Replication of microarray experiments increases confidence in the results. However, because of the high costs involved it is often not feasible to repeat experiments, especially when dealing with commercial arrays (Causton et al., 2003).
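The value of replication can be illustrated with a small calculation. The following is a minimal sketch, not a method taken from the cited papers: the gene names and signal values are hypothetical, and the coefficient of variation is used here only as one convenient way to compare the spread of replicate measurements for a strongly and a weakly expressed gene.

```python
from statistics import mean, stdev

def replicate_summary(replicates):
    """Return (mean, sample standard deviation, coefficient of variation)
    for the replicate signals of one gene."""
    m = mean(replicates)
    s = stdev(replicates)  # sample standard deviation, needs at least 2 replicates
    cv = s / m if m else float("inf")
    return m, s, cv

# Hypothetical signal intensities from three replicate hybridisations.
signals = {
    "gene_high": [1000.0, 1100.0, 900.0],
    "gene_low": [10.0, 30.0, 20.0],
}

summary = {gene: replicate_summary(values) for gene, values in signals.items()}
for gene, (m, s, cv) in summary.items():
    print(f"{gene}: mean={m:.1f} sd={s:.1f} cv={cv:.2f}")
```

In this made-up data the weakly expressed gene has the larger relative spread, which is the kind of unreliability that a single hybridisation cannot reveal.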

Types of array

Two major types of expression microarray platform are currently in use: cDNA based and oligonucleotide based. Both types of array can be spotted onto either porous substrates, such as nylon or nitrocellulose, or non-porous surfaces, such as a polymer or glass slide (Causton et al., 2003). cDNA arrays are typically made up of PCR products spotted onto the array. This type of array enables clone banks and DNA from limited templates to be spotted. Disadvantages of cDNA arrays are that contamination can occur between spots and that it can be difficult to distinguish between the signals of closely related genes. The PCR-amplified fragments used on a cDNA array are typically 400 bp to 1000 bp in size (Causton et al., 2003).

Oligonucleotide arrays

The second array class uses oligonucleotides, which are much shorter, varying between 25 and 80 bp in length. These are either pre-synthesised and spotted onto the array, or synthesised directly on the array substrate. Oligonucleotide-based arrays have advantages over cDNA arrays: they suffer less from contamination between spots, it is easier to standardise the sequence of each spot, and the oligonucleotides can be purchased in a form that is ready for spotting.

Spotted Microarray

The spotted microarray is hybridised with probes derived from the mRNA of the biological samples being assessed. In the technique known as dye swapping, in which multiple extracts are hybridised to arrays, the mRNA is typically reverse transcribed into cDNA and labelled with a spectrally distinguishable red (Cy5) or green (Cy3) fluorescent dye (Kerr and Churchill, 2001). Samples are then washed over the microarray, allowing labelled cDNA strands complementary to sequences on the microarray to bind. Generally two dyes are used, because with a single dye there is little measure of the amount of DNA deposited at any particular spot. With two dyes, the fluorescence of one can be measured relative to the other, with the sample containing the higher level of transcript producing the greater signal.
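The relative measurement described above is commonly expressed as a log ratio of the two dye intensities. The sketch below assumes the usual dye-swap layout, in which the same two samples are hybridised twice with the dye assignments reversed; the function names and intensity values are illustrative only and are not taken from the cited work.

```python
import math

def log2_ratio(cy5, cy3):
    """Log2 of the red (Cy5) to green (Cy3) intensity for one spot."""
    return math.log2(cy5 / cy3)

def dye_swap_average(forward, swapped):
    """Combine a dye-swapped pair of measurements for one spot.

    forward: (cy5, cy3) with sample A labelled Cy5 and sample B labelled Cy3
    swapped: (cy5, cy3) with sample B labelled Cy5 and sample A labelled Cy3
    Returns the average log2(A/B), in which a dye-specific bias common to
    both arrays cancels out.
    """
    m_forward = log2_ratio(*forward)  # estimates log2(A/B)
    m_swapped = log2_ratio(*swapped)  # estimates log2(B/A)
    return (m_forward - m_swapped) / 2

# Hypothetical spot intensities: sample A has about four times more transcript.
result = dye_swap_average(forward=(800.0, 200.0), swapped=(250.0, 1000.0))
print(result)
```

A positive value indicates more transcript in sample A, a negative value more in sample B, and a value near zero indicates no difference.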

Single Extract Technique

Another common method is the single-extract technique (Causton et al., 2003). This involves the use of commercial arrays such as the Affymetrix GeneChip™ (www.affymetrix.com), and relies upon comparing the spot intensities of oligonucleotide probes hybridised on different array chips. This has the advantage that comparisons can be made between chips from multiple sources, allowing for changes in experimental focus and the sharing of datasets with other researchers. Following hybridisation the array is scanned, producing a 16-bit greyscale TIFF image. Microarray image analysis software measures the pixel intensities in this image to determine the relative fluorescence of each spot. The intensity of each spot is then normalised, allowing comparisons of spots within and between arrays. Normalisation itself is an adjustment of the average value of an experimental array to that of a baseline array.
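The normalisation described above can be sketched as a simple global scaling. This is a minimal illustration that assumes a plain arithmetic mean and made-up intensities; commercial software may use a different average or additional steps, and the function name is hypothetical.

```python
from statistics import mean

def scale_to_baseline(experiment, baseline):
    """Scale the experimental array so that its mean intensity equals
    the mean intensity of the baseline array.

    Returns the scaled intensities and the scaling factor applied.
    """
    factor = mean(baseline) / mean(experiment)
    return [value * factor for value in experiment], factor

# Hypothetical spot intensities from two chips.
baseline_chip = [100.0, 200.0, 300.0]
experiment_chip = [50.0, 100.0, 150.0]

scaled, factor = scale_to_baseline(experiment_chip, baseline_chip)
print(factor, scaled)
```

After scaling, a spot on the experimental chip can be compared directly with the same spot on the baseline chip, since overall differences in brightness between the two scans have been removed.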

Microarray data analysis - Analysis of microarray experiments

After the raw data have been transformed into a gene expression matrix, they are analysed using data analysis software such as GeneSpring™ or the Affymetrix Microarray Suite, and clustering software packages such as Genesis or Rcluster (genex.ncgr.org/genex/rcluster/help.html). These packages allow microarray experiments to be analysed through the interpretation of gene expression levels. The earlier 8K AtGenome1 and the 24K ATH1 Affymetrix Arabidopsis GeneChips were used in the array experiments described in this chapter, and the data were analysed using the Affymetrix Microarray Suite and Genesis.
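A gene expression matrix holds one row per gene and one column per array or condition, and clustering groups genes whose rows behave similarly. The sketch below is not the algorithm used by Genesis or Rcluster; it only shows, with hypothetical genes and values, how Pearson correlation between rows can serve as the similarity measure from which clustering typically starts.

```python
import math

def pearson(x, y):
    """Pearson correlation between two expression profiles of equal length."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def most_similar_pair(matrix):
    """Return (gene1, gene2, r) for the pair of genes with the highest correlation."""
    genes = sorted(matrix)
    best = None
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            r = pearson(matrix[g1], matrix[g2])
            if best is None or r > best[2]:
                best = (g1, g2, r)
    return best

# Hypothetical expression matrix: rows are genes, columns are four time points.
matrix = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],
    "gene_c": [4.0, 3.0, 2.0, 1.0],
}

best = most_similar_pair(matrix)
print(best)
```

Here gene_a and gene_b rise together and would be grouped first, even though their absolute levels differ, while gene_c shows the opposite pattern.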
