Evaluating sequencing methods against microarrays for copy number analysis

Sequencing is increasingly popular as a platform for copy number analysis. But how does it stack up against traditional methods (typically microarrays)? To answer this, or at least begin to, the question has to be reframed around which array technology is being compared with which sequencing method, what the scientific goals are (broadly, and as they relate to copy number analysis), and, equally important, the nature of the samples.

A recent paper (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2919738/) shows that cost-effective sequencing techniques can be used to detect large copy number changes. The low-depth techniques have low resolution (e.g., in the paper 0.04x coverage resolved CNVs of 15 kb), but are cost-competitive. At increased, but still low, depth (e.g., 0.3x), the resolution and cost compare favorably to some aCGH platforms, but the technique has some limitations. Obviously, focal events smaller than the window will not be resolved. Perhaps less apparent, the technique cannot be used to determine the zygosity of the input DNA. For both constitutional and cancer cases, this limitation makes the technique poorly suited to finding copy-neutral LOH (loss of heterozygosity) regions. However, unlike aCGH, the technique can be used to estimate the ploidy of the sample by summing the number of reads per chromosome and dividing by the chromosome length; relative read depths can then be used to infer copy number at points along the genome.
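The read-depth arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the paper: the function names, the toy chromosome lengths, and the choice to normalize against the median (which assumes most of the genome sits at the baseline copy number) are all assumptions made for the example, and real pipelines also correct for GC content and mappability.

```python
import math
from statistics import median


def chromosome_read_density(read_counts, chrom_lengths):
    """Reads per base for each chromosome (read count / chromosome length)."""
    return {c: read_counts[c] / chrom_lengths[c] for c in read_counts}


def relative_copy_number(densities, baseline_copies=2):
    """Scale densities so the median chromosome sits at baseline_copies.

    Assumption: the median chromosome is at the baseline copy number.
    """
    ref = median(densities.values())
    return {c: baseline_copies * d / ref for c, d in densities.items()}


def window_log2_ratios(window_counts):
    """log2 of each window's read count relative to the median window."""
    ref = median(window_counts)
    return [math.log2(n / ref) if n > 0 else float("-inf")
            for n in window_counts]


# Toy, hypothetical inputs: three equal-length "chromosomes".
counts = {"chrA": 1000, "chrB": 1000, "chrC": 1500}
lengths = {"chrA": 1_000_000, "chrB": 1_000_000, "chrC": 1_000_000}

densities = chromosome_read_density(counts, lengths)
copies = relative_copy_number(densities)

# Five fixed-size windows along one chromosome; the third has half the
# reads of its neighbours and the fifth has twice as many.
ratios = window_log2_ratios([100, 100, 50, 100, 200])
```

In this toy input chrC carries 1.5 times the read density of the others, so it comes out near three copies against a baseline of two, and the half-depth window gets a negative log2 ratio. Note that nothing here says anything about zygosity, which is exactly the limitation discussed above.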

Simple CGH arrays are also unable to determine zygosity, but SNP array technology (such as the Affymetrix CytoScan HD and OncoScan products, Illumina OMNI arrays, and Agilent CGH+SNP) solves this problem. With SNP arrays, the zygosity of the sample within a given region of the genome can be inferred from the allelic balance (commonly reported as the B-allele frequency). Interpreting SNP array data is complicated, especially in cancer samples, by sample heterogeneity; in most cases, though, the underlying aberrations can be detected with a high degree of confidence, along with copy-neutral LOH.
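To make the allelic-balance idea concrete, here is a rough Python sketch of how B-allele frequency and total signal together can point to copy-neutral LOH. Everything in it is illustrative: the function names and thresholds are hypothetical, and vendor software derives B-allele frequency from calibrated genotype clusters rather than from the raw intensity ratio used here.

```python
def b_allele_frequency(a_intensity, b_intensity):
    """Simplified BAF: the B allele's share of the total allele signal."""
    total = a_intensity + b_intensity
    return b_intensity / total if total > 0 else None


def looks_copy_neutral_loh(bafs, log_r_ratios,
                           het_band=(0.35, 0.65),
                           max_het_fraction=0.05,
                           max_abs_lrr=0.15):
    """Flag a region whose total signal looks normal but which lacks
    heterozygous SNPs.

    All thresholds are hypothetical defaults chosen for illustration.
    """
    if not bafs or not log_r_ratios:
        return False
    low, high = het_band
    het_fraction = sum(1 for b in bafs if low <= b <= high) / len(bafs)
    mean_lrr = sum(log_r_ratios) / len(log_r_ratios)
    return het_fraction <= max_het_fraction and abs(mean_lrr) <= max_abs_lrr


# Toy regions, six SNPs each (made-up values).
normal_bafs = [0.0, 0.5, 1.0, 0.48, 0.02, 0.97]
loh_bafs = [0.01, 0.99, 0.03, 0.98, 0.0, 1.0]
flat_lrr = [0.02, -0.03, 0.01, 0.0, -0.01, 0.02]
deleted_lrr = [-0.5, -0.55, -0.45, -0.5, -0.6, -0.4]

normal_call = looks_copy_neutral_loh(normal_bafs, flat_lrr)
cn_loh_call = looks_copy_neutral_loh(loh_bafs, flat_lrr)
deletion_call = looks_copy_neutral_loh(loh_bafs, deleted_lrr)
```

Only the second region is flagged: it has no SNPs in the heterozygous band yet its log R ratio is flat. The third region also lacks heterozygous SNPs, but its reduced signal suggests a deletion rather than a copy-neutral event. A fixed band like this would struggle with heterogeneous tumor samples, where the heterozygous cluster splits and shifts rather than disappearing.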

The limitations of low-depth sequencing are also circumvented, easily enough, at a depth of coverage sufficient for accurate base calling. Here, not only can zygosity be deduced, but small structural variants and somatic mutations can be detected as well, creating a more complete picture than arrays alone. Problems still exist, though. At this depth, the cost and analysis burden of sequencing data may become problematic, particularly if the primary goal of genome-wide sequencing is copy number detection (as opposed to copy number being a by-product of some other objective). For instance, techniques to detect LOH from sequencing (including exome sequencing) have been developed, but they require a depth of coverage that significantly increases the cost. The ExomeCNV paper tested the technique for finding copy number and LOH using about 40x depth of coverage on melanoma samples (http://bioinformatics.oxfordjournals.org/content/early/2011/08/09/bioinformatics.btr462.full.pdf).

Archival FFPE samples pose challenges for both arrays and NGS methods. The low DNA input, sample heterogeneity (since archival FFPE samples are commonly cancer samples), and sample degradation all make analyzing FFPE samples difficult. The current wisdom is that sequencing methods ought to have an advantage here, since they don’t require DNA amplification and the degraded DNA can be sequenced directly. But at least one array method, using molecular inverted probes (MIP), promises to work at least as well on FFPE samples, and perhaps better (http://www.sciencedirect.com/science/article/pii/S2210776212001627).

From where we stand today, the way to arrive at a clear answer as to whether sequencing or arrays make sense for copy number analysis is to frame the question in terms of the nature of the sample, the analysis objectives, and the cost constraints. You may conclude, for example, that low-depth sequencing is a cost-effective and robust alternative to a low-resolution aCGH platform. But if you work with cancer samples, or with constitutional samples requiring LOH detection, you may decide, depending on your goals, that the best approach is a high-density SNP array combined with targeted re-sequencing. In any event, the technology is moving at a rapid pace, and you can be reasonably assured that whatever is the right answer ‘today’ is likely to change.

Please comment and add your thoughts — there’s so much happening at the moment that this likely (a) contains errors, (b) is out of date, and/or (c) misses some key points.

About Louis Culot
