Overview
Spot quantitation assigns numerical values to the spots
imaged by the scanner system. The fluorescence signal in each spot
is encoded in a pixellated image file. The quantitation process
is based on standard image processing and recognition techniques
that have been adapted to find circular spots in a regular grid
pattern.
The process begins with the output of a laser scanner system, which
is two image files (one for Cy3 and one for Cy5) in Tagged Image
File Format (TIFF), Windows Bitmap (BMP), or another common image
format. Each image consists of a grid of pixels, each of which holds
a 16-bit grayscale value (see the Laser Scanning summary).
Quantitation begins with the reading of these pixellated images
into a quantitation software package. Packages
currently in use include Biodiscovery's Imagene 3.0 (1), Michael
Eisen's freeware package Scanalyze (2), and Imaging Research's ArrayVision
(3).
The basic unit of quantitation is the microarray spot, typically
around 100 µm in diameter. Scanner resolution is typically 10 µm,
so a spot is about 10 pixels across and covers approximately 75
pixels. A well-captured spot
should have sharp edges and only a small amount of variation in
its individual pixel values. Often, the quantitation software
performs a filtering step to smooth outlier pixel values (a single
intense pixel against a background of low-intensity pixels, for
example) by applying a moving median or moving average filter. The
quality of the
pixellated spot on the image ultimately depends on the physical
spotting process, how well the spot DNA was crosslinked to the slide
substrate, and how well mixed the hybridization solution was
when it was applied to the spot DNA.
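The pixel count and the outlier smoothing described above can be
sketched in a few lines. This is only an illustration, not the method
of any package named here: the function names pixels_per_spot and
median_filter_3x3 are hypothetical, the image is assumed to be a plain
list of rows, and the 3x3 window size is an assumption.

```python
import math
from statistics import median


def pixels_per_spot(spot_diameter_um, pixel_size_um):
    """Approximate number of pixels covered by a circular spot."""
    radius_px = (spot_diameter_um / pixel_size_um) / 2
    return math.pi * radius_px ** 2


def median_filter_3x3(image):
    """Return a smoothed copy of image (a list of rows of pixel values).

    Each pixel is replaced by the median of its 3x3 neighbourhood.
    Edge pixels use only the neighbours that exist (an assumption;
    real packages may pad the image instead).
    """
    rows, cols = len(image), len(image[0])
    smoothed = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out_row.append(median(window))
        smoothed.append(out_row)
    return smoothed
```

For a 100 µm spot scanned at 10 µm the first function gives a little
under 80 pixels, in line with the rough figure quoted above. A single
intense pixel surrounded by low values is replaced by the low value,
which is the behaviour wanted from a moving median.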
Superposing images
In order to compare Cy3 and Cy5 signal in a given spot, the pixels
in its Cy3 image must be matched with the corresponding pixels in
the Cy5 image. The software package will have a feature to align
the Cy3 and Cy5 images of each spot, with tools to translate,
rotate, shrink, and expand one image relative to the other until
the two are accurately superposed.
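Translation, rotation, shrinking, and expansion together amount to a
simple coordinate transform. The sketch below is a minimal example
under stated assumptions: make_transform is a hypothetical helper,
rotation and scaling are taken about the image origin, and the order
(scale, rotate, then translate) is a choice made here, not one taken
from any of the packages.

```python
import math


def make_transform(dx=0.0, dy=0.0, angle_deg=0.0, scale=1.0):
    """Build a function that maps (x, y) in one image onto the other.

    The point is scaled, rotated about the origin by angle_deg,
    and then shifted by (dx, dy).
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    def apply(x, y):
        xs, ys = scale * x, scale * y
        return (cos_t * xs - sin_t * ys + dx,
                sin_t * xs + cos_t * ys + dy)

    return apply


# Example: the Cy5 image is assumed to sit 2 pixels left of and
# 1 pixel below the Cy3 image, so Cy5 coordinates are shifted back.
cy5_to_cy3 = make_transform(dx=2, dy=-1)
```

Once such a mapping has been found, each Cy5 pixel can be paired with
the Cy3 pixel at the transformed coordinate.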
Definition of the template
The template is a framework for locating each spot in
an array, and defining the regions on the slide used to compensate
for background fluorescence. It looks like a grid of circles, in
the same spacing as the spots on the array it is modelling. A template
consists of (1) the metagrid and subgrid layout, (2) the spot diameter,
(3) spot finding algorithm parameters, (4) specification of background
region for each spot, and (5) labels for each spot (a GenBank
accession number, for example). The metagrid layout consists of the
number of rows and columns of subgrids in the metagrid and the
distance between adjacent subgrids,
both vertically and horizontally. The subgrid layout similarly consists
of the number of rows and columns of spots and the spot center-to-center
distance in the vertical and horizontal directions. Spot-finding algorithm
parameters vary with algorithm. Typically they specify a measure
of spot shape detection sensitivity (how far from a perfect circle
a spot can be), a measure of the spot's edge (whether to look for
a sharp or a diffuse edge) and a contrast discrimination (what intensity
in a pixel constitutes a spot-level intensity and what constitutes
a background level intensity). Background specification varies with
software package, but it is usually specified for each spot, and
it is usually a ring-shaped or similar region around the spot.
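The layout part of a template can be pictured as a small data
structure that generates the nominal center of every spot. The sketch
below is illustrative only: the Template class and its field names are
hypothetical, the subgrid pitch is assumed to be measured between the
upper left spot centers of adjacent subgrids, and all distances are
assumed to share one unit (pixels or µm).

```python
from dataclasses import dataclass


@dataclass
class Template:
    meta_rows: int          # rows of subgrids in the metagrid
    meta_cols: int          # columns of subgrids in the metagrid
    sub_rows: int           # rows of spots in each subgrid
    sub_cols: int           # columns of spots in each subgrid
    subgrid_pitch_x: float  # horizontal distance between adjacent subgrids
    subgrid_pitch_y: float  # vertical distance between adjacent subgrids
    spot_pitch_x: float     # horizontal center-to-center spot distance
    spot_pitch_y: float     # vertical center-to-center spot distance
    spot_diameter: float
    origin_x: float = 0.0   # center of the first spot of the first subgrid
    origin_y: float = 0.0

    def spot_centers(self):
        """Yield ((meta_row, meta_col, sub_row, sub_col), (x, y))."""
        for mr in range(self.meta_rows):
            for mc in range(self.meta_cols):
                for sr in range(self.sub_rows):
                    for sc in range(self.sub_cols):
                        x = (self.origin_x + mc * self.subgrid_pitch_x
                             + sc * self.spot_pitch_x)
                        y = (self.origin_y + mr * self.subgrid_pitch_y
                             + sr * self.spot_pitch_y)
                        yield (mr, mc, sr, sc), (x, y)
```

The spot-finding parameters, background region, and spot labels would
be further fields on the same structure; they are left out here
because their form varies with the software package.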
Template alignment and quantitation
Once the template has been specified, it must be aligned with the
two superposed images. Alignment is done in two steps. Manual alignment
is first used to place the template roughly onto the superposed
images. The mouse is used to move the template, or parts of the
template, to align with the array spots on the images. Translation,
rotation, stretching, or shrinking of the template can be done.
Automatic alignment is then used to make finer alignments, aligning
the template's spot circles with the spots on the array using the
spot finding algorithm in the software. Spot locations are approximately
known from the manual alignment. Automatic alignment is done iteratively
until the template-to-image fit is satisfactory. Quantitation consists
of pressing a button; each spot's signal and background is calculated
from the pertinent pixels, and statistics are computed. The basic
statistics calculated are medians and means of spot signal and spot
background pixels. Standard deviations of pixel intensities in a
spot are sometimes calculated as a way of assessing the uniformity
of physical spot deposition and hybridization. Additional statistics
are often calculated as diagnostics, used to determine if a spot
has merged with another or whether it is contaminated with dust.
The output of quantitation is placed into a spreadsheet file. Each
entry in the spreadsheet is the information on a single spot, and
contains the spot coordinate (metagrid and subgrid row/column),
the spot's clone ID (derived from the template), the Cy3 and Cy5
spot intensity values (medians or means or both), and additional
statistics.
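The per-spot calculation can be sketched as follows. This is a minimal
example under stated assumptions rather than the procedure of any
particular package: quantify_spot is a hypothetical name, the image is
a list of rows of pixel values, the spot is a circle centered at
(cx, cy), and the background is taken from a ring of width ring_width
immediately outside the spot.

```python
from statistics import mean, median, pstdev


def quantify_spot(image, cx, cy, spot_radius, ring_width):
    """Summarize the signal and background pixels of one spot."""
    outer_radius = spot_radius + ring_width
    signal, background = [], []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            dist_sq = (x - cx) ** 2 + (y - cy) ** 2
            if dist_sq <= spot_radius ** 2:
                signal.append(value)        # inside the spot circle
            elif dist_sq <= outer_radius ** 2:
                background.append(value)    # inside the background ring
    return {
        "signal_mean": mean(signal),
        "signal_median": median(signal),
        "signal_sd": pstdev(signal),        # uniformity of the spot
        "background_mean": mean(background),
        "background_median": median(background),
    }
```

The function is run once per channel on the superposed images, so each
spot ends up with one set of Cy3 statistics and one set of Cy5
statistics. It scans the whole image for clarity; a practical version
would look only at the pixels near the spot.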
Normalization to equalize overall fluorescence intensity
Looking at the entire set of spot intensities, the overall mean
fluorescence gathered from the Cy3-labelled probe often differs
from that of the Cy5-labelled probe. Possible causes
are unequal laser power and PMT settings in the scanner
when the scans were taken, differential incorporation of the fluor
into the two probes, and variability of fluor efficacy from lot
to lot. It is expected that in a large array, where only a small
subset of genes' mRNA will exhibit differing expression levels in
the control and treatment probes, the overall fluorescence intensity
should be about the same. This is because the incubation period
in hybridization is the same for both probes, since they are mixed
together, so the competition for any given cDNA molecule on a spot
should be 'fair'.
Therefore it is sometimes necessary to perform a specific type
of normalization (equalization) of the overall fluorescence signals
so that their means are approximately equal. This can be done in
three ways. The first, statistical equalization, adjusts fluorescence
intensities using information on every spot on a slide. In the median
adjustment, the fluorescence signal of each spot for a given probe
is divided by the median signal of that probe on
that slide (4). The standard deviation adjustment is similar,
using the standard deviation in lieu of the median (5). The second,
interval-based adjustment, divides spots into groups according to
the predefined interval in which their summed Cy3 and Cy5 intensity
falls. For each
such group, the Cy3 and Cy5 values are averaged separately,
and the ratio of the two averages is taken. This ratio is used to
scale one of the channels so that it is equivalent to the other.
The third type of equalization
uses positive control spots on the slide. These positive controls
are known to be expressed equally in the control and experimental
samples. Their Cy3 and Cy5 values are averaged and a ratio is taken;
one of the channels is then scaled by this ratio, as in the
interval-based adjustment.
Once equalization is done, the spreadsheet with the equalized values
can be saved, and spot quantitation is complete. The data can now
be imported into statistical packages for cluster analysis and other
analyses.
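The interval-based adjustment described above can be sketched as
follows. The details are assumptions made for illustration: the
function name interval_equalize is hypothetical, the interval cut
points are supplied by the user as an ascending list, and Cy3 is the
channel that is scaled (the text only says that one channel is
adjusted to the other).

```python
from bisect import bisect_right
from statistics import mean


def interval_equalize(cy3, cy5, boundaries):
    """Scale Cy3 so that its mean matches the Cy5 mean in each interval.

    boundaries holds ascending cut points on the summed Cy3 + Cy5
    intensity; spots whose sums fall between the same pair of cut
    points form one group.
    """
    groups = {}
    for i, (green, red) in enumerate(zip(cy3, cy5)):
        interval = bisect_right(boundaries, green + red)
        groups.setdefault(interval, []).append(i)

    adjusted = list(cy3)
    for members in groups.values():
        mean3 = mean(cy3[i] for i in members)
        mean5 = mean(cy5[i] for i in members)
        if mean3 > 0:                      # leave an all-zero group untouched
            ratio = mean5 / mean3
            for i in members:
                adjusted[i] = cy3[i] * ratio
    return adjusted
```

Because each interval gets its own ratio, this adjustment can correct
a channel imbalance that changes with overall intensity, which a
single slide-wide factor cannot.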
References
1. Biodiscovery, Inc. (www.biodiscovery.com).
2. Scanalyze freeware package, written by Michael Eisen. Register
at http://rana.Stanford.EDU/software/.
3. Imaging Research (http://imaging.brocku.ca/).
4. Worley, J. et al. (2000). Systems approach to DNA microarrays.
In Microarray Biochip Technology, M. Schena, ed., Eaton
Publishing, Natick, MA.
5. Shealy, R.T., personal communication.
Spot Quantitation of a Scanned Image Pair
This protocol is a general recipe for spot quantitation; it is
not specific to any software package.
Read both images. Read the Cy3 and Cy5 TIFF image files
into the software package.
Align the images. The software package will have a feature
to align each spot's Cy3 and Cy5 images, with tools
to translate, rotate, shrink, and expand one image relative to the
other. Accurate superposition is important
because, in order to compare the Cy3 and Cy5 signal
in a given spot, the pixels in its Cy3 image must be matched with
the corresponding pixels in the Cy5 image. With both images loaded
in the package, perform the alignment.
Define the template. Open the template definition window.
Specify the following values. All distances can be specified in
either pixels or µm.
- Metagrid rows and columns (number of rows and columns of subgrids).
- Subgrid rows and columns (number of rows and columns of spots
in each subgrid).
- Center-to-center distance (CTC) between adjacent spots in a
subgrid.
- Vertical and horizontal distance between adjacent subgrids.
This distance is typically measured between upper left spot centers
in adjacent subgrids.
- Spot diameter.
- Spot finding algorithm parameters. This varies with algorithm.
- Width of background region ring.
- Label naming each spot.
After specifying the above, press the "Create Template" button.
A template will appear over the image. It will not be aligned to
the spots on the image. Save the template in a template file.
Template alignment and quantitation. Access a template by
creating one as above or opening an existing template (in a template
file). Activate the template; mouse handles will appear on it. Use
the mouse to position the template as accurately as possible
over the two superposed images. Use the translation, rotation, stretching,
and shrinking features by manipulating their handles on the template
using the mouse. After the template is approximately aligned, perform
an automatic alignment by pressing the appropriate button. Repeat
until the template spot circles are accurately outlining the spots
on the images. The automatic alignment uses the spot finding algorithm
to position the template accurately. When the template is accurately
aligned, save it in a template file, and press the "Quantitate"
button. The spots' signal and background will be computed and written
to a spreadsheet window in the software.
Channel Equalization. Looking at the entire set of spot
intensities, the overall mean fluorescence gathered
from the Cy3-labelled probe often differs from that of the
Cy5-labelled probe. In this case it is necessary to perform a
special type of normalization
called equalization to make the overall intensities roughly the
same. Use one of the two methods below.
Channel Equalization - Statistical. To assess the amount
of normalization needed, open the scatterplot feature in the
software. This feature plots all spots' Cy3 and Cy5 values
on a scatterplot, draws the line y=x (unit line), and draws the
regression line of Cy5 on Cy3. If the regression line deviates from
the unit line by a significant amount, do one of the following:
- Compute the median Cy3 and median Cy5 intensity. Divide each
Cy3 value by the Cy3 median and each Cy5 value by the Cy5 median,
save the results, and replot.
- Compute the standard deviation of the Cy3 values and the Cy5
values. Divide each Cy3 value by the Cy3 standard deviation
and each Cy5 value by the Cy5 standard deviation, save the results,
and replot.
With either method, the regression line should now closely approximate
the unit line. If the original regression line was far from
the unit line, rescanning may be warranted, so that equalization
occurs primarily during signal acquisition.
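For readers working outside a quantitation package, the check and the
two adjustments above can be sketched as follows. The helper names
regression_line and scale_by are hypothetical, and ordinary least
squares is assumed for the regression of Cy5 on Cy3; a given package
may fit its line differently.

```python
from statistics import mean, median, pstdev


def regression_line(x, y):
    """Least-squares slope and intercept for the regression of y on x."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def scale_by(values, statistic):
    """Divide every value by a summary statistic of the whole channel."""
    divisor = statistic(values)
    return [v / divisor for v in values]


# Example with made-up intensities in which Cy5 is twice as bright as Cy3.
cy3 = [100.0, 200.0, 300.0, 400.0, 500.0]
cy5 = [200.0, 400.0, 600.0, 800.0, 1000.0]
slope_before, _ = regression_line(cy3, cy5)

# Median adjustment; pass pstdev instead of median for the
# standard deviation adjustment.
cy3_eq = scale_by(cy3, median)
cy5_eq = scale_by(cy5, median)
slope_after, intercept_after = regression_line(cy3_eq, cy5_eq)
```

In this example the slope moves from 2 to about 1. One point worth
knowing about the standard deviation adjustment: after it, the
regression slope equals the correlation coefficient of the two
channels, so it is close to 1 only when Cy3 and Cy5 are strongly
correlated.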
Channel Equalization - Control Spots. To assess the amount
of normalization needed, open the scatterplot feature in the
software. This feature plots all spots' Cy3 and Cy5 values
on a scatterplot, draws the line y=x (unit line), and draws the
regression line of Cy5 on Cy3. If the regression line deviates from
the unit line by a significant amount, find the location
of the control spots in the spreadsheet. Select them, and compute
the average Cy3 and average Cy5 values of these spots. Calculate
the ratio of the Cy5 average to the Cy3 average, and multiply every
spot's Cy3 value by this
ratio. After saving the results, replot to see whether the regression
line approximates the unit line.
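The control-spot calculation is short enough to write out. In this
sketch the function name control_spot_equalize is hypothetical, and
the control spots are assumed to be identified by their positions
(indices) in the lists of Cy3 and Cy5 values.

```python
from statistics import mean


def control_spot_equalize(cy3, cy5, control_indices):
    """Scale all Cy3 values by the Cy5/Cy3 ratio of the control spots.

    Returns the ratio and the adjusted Cy3 values; Cy5 is unchanged.
    """
    mean3 = mean(cy3[i] for i in control_indices)
    mean5 = mean(cy5[i] for i in control_indices)
    ratio = mean5 / mean3
    return ratio, [value * ratio for value in cy3]
```

After the adjustment the control spots have, on average, equal Cy3 and
Cy5 intensities, and every other spot has been scaled by the same
factor.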
Saving the data. Save each spot's Cy3 and Cy5 mean, median,
and other statistics in an Excel spreadsheet format for use by the
statistical analysis software.
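If the values are handled outside the quantitation package, a
comma-separated file is a simple stand-in for the spreadsheet, since
spreadsheet and statistics programs can generally open it. The sketch
below is an assumption-laden example: the column names in FIELDS and
the function write_spots are hypothetical, and CSV is used here in
place of a native Excel file.

```python
import csv

# Hypothetical column layout: spot coordinate, clone ID, channel medians.
FIELDS = ["meta_row", "meta_col", "sub_row", "sub_col",
          "clone_id", "cy3_median", "cy5_median"]


def write_spots(rows, handle):
    """Write one line per spot to an open text file handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
```

When writing to a real file, open it with newline="" as the csv module
documentation recommends, for example
open("spots.csv", "w", newline="").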