celintensityread

Read probe intensities from Affymetrix CEL files

Syntax

ProbeStructure = celintensityread(CELFiles, CDFFile)

ProbeStructure = celintensityread(..., 'CELPath', CELPathValue, ...)
ProbeStructure = celintensityread(..., 'CDFPath', CDFPathValue, ...)
ProbeStructure = celintensityread(..., 'PMOnly', PMOnlyValue, ...)
ProbeStructure = celintensityread(..., 'Verbose', VerboseValue, ...)

Input Arguments

CELFiles

Any of the following:

  • String specifying a single CEL file name.

  • '*', which reads all CEL files in the current folder.

  • ' ', which opens the Select CEL Files dialog box from which you select the CEL files. From this dialog box, you can press and hold Ctrl or Shift while clicking to select multiple CEL files.

  • Cell array of CEL file names.

CDFFile

Either of the following:

  • String specifying a CDF file name.

  • ' ', which opens the Select CDF File dialog box from which you select the CDF file.

CELPathValue

String specifying the path and folder where the files specified in CELFiles are stored.

CDFPathValue

String specifying the path and folder where the file specified in CDFFile is stored.

PMOnlyValue

Property to include or exclude the mismatch (MM) probe intensity values in the returned structure. Enter true to return only perfect match (PM) probe intensities. Enter false to return both PM and MM probe intensities. Default is true.

VerboseValue

Controls the display of a progress report showing the name of each CEL file as it is read. When VerboseValue is false, no progress report is displayed. Default is true.

Output Arguments

ProbeStructure

MATLAB® structure containing information from the CEL files, including probe intensities, probe indices, and probe set IDs.

Description

ProbeStructure = celintensityread(CELFiles, CDFFile) reads the specified Affymetrix® CEL files and the associated CDF library file (created from Affymetrix GeneChip® arrays for expression or genotyping assays), and then creates ProbeStructure, a structure containing information from the CEL files, including probe intensities, probe indices, and probe set IDs. CELFiles is a string or cell array of CEL file names. CDFFile is a string specifying a CDF file name.

If you set CELFiles to '*', then it reads all CEL files in the current folder. If you set CELFiles to ' ', then it opens the Select CEL Files dialog box from which you select the CEL files. From this dialog box, you can press and hold Ctrl or Shift while clicking to select multiple CEL files.

If you set CDFFile to ' ', then it opens the Select CDF File dialog box from which you select the CDF file.

ProbeStructure = celintensityread(..., 'PropertyName', PropertyValue, ...) calls celintensityread with optional properties that use property name/property value pairs. You can specify one or more properties in any order. Each PropertyName must be enclosed in single quotation marks and is case insensitive. These property name/property value pairs are as follows:


ProbeStructure = celintensityread(..., 'CELPath', CELPathValue, ...) specifies a path and folder where the files specified by CELFiles are stored.

ProbeStructure = celintensityread(..., 'CDFPath', CDFPathValue, ...) specifies a path and folder where the file specified by CDFFile is stored.

ProbeStructure = celintensityread(..., 'PMOnly', PMOnlyValue, ...) includes or excludes the mismatch (MM) probe intensity values. When PMOnlyValue is true, celintensityread returns only perfect match (PM) probe intensities. When PMOnlyValue is false, celintensityread returns both PM and MM probe intensities. Default is true.

You can learn more about the Affymetrix CEL files and download sample files from:

http://www.affymetrix.com/support/technical/sample_data/demo_data.affx

    Tip   Reading many CEL files, or a single large CEL file, can require a large amount of memory from the operating system. If you receive memory-related errors or have trouble reading CEL files, try the following:

    • Increase the virtual memory (swap space) for your operating system (with a recommended initial size of 3,069 and a maximum size of 16,368) as described in Memory Usage.

    • Set the 3 GB switch (32-bit Windows® XP only) as described in Memory Usage.

ProbeStructure contains the following fields.

Field

Description

CDFName

File name of the Affymetrix CDF library file.

CELNames

Cell array of names of the Affymetrix CEL files.

NumChips

Number of CEL files read into the structure.

NumProbeSets

Number of probe sets in each CEL file.

NumProbes

Number of probes in each CEL file.

ProbeSetIDs

Cell array of the probe set IDs from the Affymetrix CDF library file.

ProbeIndices

Column vector containing probe indexing information. Probes within a probe set are numbered 0 through N - 1, where N is the number of probes in the probe set.

GroupNumbers

Column vector containing group numbers for probes within the probe set. For gene expression data, the group number for all probes is 1. For SNP (genotyping) data, the group numbers for probes are:

  • 1 — Allele A – (sense)

  • 2 — Allele B – (sense)

  • 3 — Allele A + (antisense)

  • 4 — Allele B + (antisense)

PMIntensities

Matrix containing perfect match (PM) probe intensity values. Each row corresponds to a probe, and each column corresponds to a CEL file. The rows are ordered the same way as in ProbeIndices, and the columns are ordered the same way as in the CELFiles input argument.

MMIntensities (optional)

Matrix containing mismatch (MM) probe intensity values. Each row corresponds to a probe, and each column corresponds to a CEL file. The rows are ordered the same way as in ProbeIndices, and the columns are ordered the same way as in the CELFiles input argument.
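
Because the numbering in ProbeIndices restarts at 0 for each probe set, a value of 0 can be used to locate the first row of each probe set in PMIntensities. The following is a minimal sketch of that idea, not part of celintensityread itself. It uses a small synthetic structure S with made-up values in place of a real ProbeStructure, and it assumes that the k-th block of rows corresponds to the k-th entry of ProbeSetIDs. The per-set median is only an illustrative summary.

```matlab
% Synthetic stand-in for a structure returned by celintensityread.
% All names and values below are made up for illustration.
S.ProbeSetIDs   = {'setA'; 'setB'; 'setC'};
S.ProbeIndices  = [0 1 2 0 1 0 1 2 3]';   % numbering restarts at 0 per probe set
S.PMIntensities = [110 120; 130 125;  90  95; ...
                   200 210; 220 215; ...
                    50  55;  60  65;  70  75;  80  85];

% A probe index of 0 marks the first row of each probe set.
setStarts = find(S.ProbeIndices == 0);
setEnds   = [setStarts(2:end) - 1; numel(S.ProbeIndices)];

% Extract the PM intensities of the second probe set
% (assumed to correspond to S.ProbeSetIDs{2}).
k     = 2;
rows  = setStarts(k):setEnds(k);
pmSet = S.PMIntensities(rows, :);

% Summarize every probe set, one row per set and one column per CEL file.
numSets    = numel(setStarts);
setMedians = zeros(numSets, size(S.PMIntensities, 2));
for i = 1:numSets
    setMedians(i, :) = median(S.PMIntensities(setStarts(i):setEnds(i), :), 1);
end
```

With a real ProbeStructure, the same indexing applies to MMIntensities when that field is present, because its rows follow the same order as ProbeIndices.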

ProbeStructure = celintensityread(..., 'Verbose', VerboseValue, ...) controls the display of a progress report showing the name of each CEL file as it is read. When VerboseValue is false, no progress report is displayed. Default is true.

Examples

The following example assumes that the HG_U95Av2.CDF library file is stored at D:\Affymetrix\LibFiles\HGGenome and that your current folder contains CEL files associated with this CDF library file. The celintensityread function reads all the CEL files in the current folder and the CDF file from the specified folder. The rmabackadj function then performs background adjustment on the PM probe intensities in the PMIntensities field of PMProbeStructure.

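% Read all CEL files in the current folder; the CDF file is in another folder.
PMProbeStructure = celintensityread('*', 'HG_U95Av2.CDF', ...
    'CDFPath', 'D:\Affymetrix\LibFiles\HGGenome');
% Background-adjust the PM probe intensities.
BackAdjustedMatrix = rmabackadj(PMProbeStructure.PMIntensities);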

The following example opens the Select CEL Files and Select CDF File dialog boxes so that you can choose the CEL files and the CDF file to read:

PMProbeStructure = celintensityread(' ', ' ');
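
The optional properties can also be combined in a single call. The following sketch reads a named list of CEL files, returns both PM and MM intensities, and suppresses the progress report. The CEL file names and the D:\Affymetrix\Data folder are hypothetical placeholders, so substitute your own before running it.

```matlab
% Hypothetical CEL file names and data folder; replace them with your own.
celFiles = {'sample1.CEL', 'sample2.CEL'};

ProbeStructure = celintensityread(celFiles, 'HG_U95Av2.CDF', ...
    'CELPath', 'D:\Affymetrix\Data', ...
    'CDFPath', 'D:\Affymetrix\LibFiles\HGGenome', ...
    'PMOnly',  false, ...   % also return the MMIntensities field
    'Verbose', false);      % do not display the progress report

% Both matrices have one row per probe and one column per CEL file,
% with columns in the same order as celFiles.
pm = ProbeStructure.PMIntensities;
mm = ProbeStructure.MMIntensities;
```

Passing a cell array instead of '*' fixes the column order of the intensity matrices, because the columns follow the order of the CELFiles input argument.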