The TaqMan® SNP genotyping technology uses the 5’ nuclease activity of Taq polymerase to generate a fluorescent signal during PCR. For each SNP, the assay uses two TaqMan® probes that differ in sequence only at the SNP site, with one probe complementary to the wild-type allele and the other to the variant allele. The technique relies on FRET (fluorescence resonance energy transfer): each probe carries a covalently linked 5’ reporter dye and 3’ quencher dye, with a different reporter dye on the wild-type and variant allele probes. While a probe is intact, its fluorescence is suppressed because the quencher dye is close to the reporter dye. In the PCR annealing step, the TaqMan® probes hybridize to the targeted SNP site. During PCR extension, the 5’ nuclease activity of Taq polymerase cleaves the hybridized probe and separates the reporter dye from the quencher dye, which increases the characteristic fluorescence of that reporter dye. Cleavage occurs only on perfectly hybridized probes, since a probe containing a mismatched base will not be recognized by the Taq polymerase.
At the end of the PCR reaction, the fluorescent signal of each of the two reporter dyes is measured. The ratio of the two signals indicates the genotype of the sample (see figure).
In most assays, the fluorescent signals of the two reporter dyes are normalized using the signal of a third, passive reference dye (e.g. ROX). The reference dye does not take part in the PCR reaction, so its intensity corrects for well-to-well differences in volume and optical signal rather than for amplification. Typically, the reporter dye signals are visualized in a plot. A number of related genotyping systems use the reporter-quencher technology (e.g. Invader®, Molecular Beacons®, Scorpion® and other probe technologies) and can be analyzed and visualized in the same way as TaqMan® probes using the SNP calling plugin.
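As a rough illustration of the normalization and signal ratio described above, the following Python sketch divides both reporter signals by the reference dye signal and derives a simple allele fraction. The function names, the plain division and the fraction formula are assumptions made for illustration; they are not taken from the plugin.

```python
def normalize_signals(allele1_raw, allele2_raw, reference_raw):
    """Divide both reporter dye signals by the reference dye signal (e.g. ROX)."""
    if reference_raw <= 0:
        raise ValueError("reference dye signal must be positive")
    return allele1_raw / reference_raw, allele2_raw / reference_raw


def allele_fraction(x, y):
    """Return the share of the total signal carried by the second reporter dye.

    Values near 0 or 1 suggest a homozygous sample and values near 0.5 a
    heterozygous one (illustrative rule of thumb, not a calling algorithm).
    """
    total = x + y
    if total <= 0:
        return None  # no usable signal, e.g. an empty well
    return y / total


# Hypothetical raw fluorescence readings for one well.
x, y = normalize_signals(2000.0, 500.0, 1000.0)
print(x, y, allele_fraction(x, y))
```

Working with a fraction of the total signal keeps the value bounded between 0 and 1, which is easier to compare across wells than a raw ratio that grows without limit when one signal approaches zero.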
SNP genotyping in BIONUMERICS
The BIONUMERICS SNP calling plugin provides a fully automated platform for reliable SNP calling and genotyping. The plugin is very flexible and accommodates different workflows for TaqMan SNP genotyping data analysis:
- Perform an auto-calling in other software and import the calls and their corresponding confidence values in BIONUMERICS.
- Perform an auto-calling in BIONUMERICS during import.
- Import data as “No call” and perform an auto-calling in the SNP calling window.
Through its rich and integrated environment, BIONUMERICS offers a number of features that no other software tool can deliver:
- Powerful databasing including user and security features and optional audit trails compliant with the highest standards.
- One platform to analyze SNP genotyping data and numerous other techniques such as sequences, microsatellites, phenotypic features, etc.
- Unparalleled degree of automation, from import of raw data to reporting of results and problem samples. In addition, the software is fully scriptable in Python to achieve any degree of customization. BIONUMERICS is therefore extremely suitable for high-throughput analysis.
- BIONUMERICS offers a myriad of data mining, analysis, and statistical tools, enabling clustering, statistical analysis and hypothesis testing on large data sets.
Importing SNP data
At present, the plugin supports five different file formats (other formats can be added on request):
- Applied Biosystems 7900HT Fast Real-Time PCR System: .TXT files containing processed 384-well plate reads.
- BMG Labtech 384-well microplate readers: .DAT files containing raw fluorescence values in microplate layout.
- Douglas Scientific: .TXT files generated by the Douglas Scientific Array Tape system, in 384-well format.
- Fluidigm Dynamic Array: .CSV files containing up to 9,216 data points (96 samples tested against 96 SNPs).
- Tecan Safire/Infinite: .TXT files generated by Tecan Safire or Tecan Infinite microplate readers, in 96-well and 384-well microplate formats.
Multiple files can be imported in batch, and optionally, the software can perform automatic SNP calling during the import.
Automatic SNP calling
The automatic SNP calling algorithm is the most crucial element of the plugin and determines how efficient and successful SNP genotyping is in a high-throughput setting. BIONUMERICS uses an iterative seed-based partitioning algorithm to call the SNP clusters according to the existing genotypes. The auto-call settings are divided over two tabs: the general settings and thresholds for SNP assignment (left image), and the advanced settings for the partitioning algorithm (right image). While the default partitioning settings usually perform very well, they can be optimized for any particular data source. The algorithm also allows SNPs to be called successfully in polyploid genomes (theoretically up to heptaploidy).
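The details of the plugin's partitioning algorithm are not given here. To give a feel for what an iterative seed-based partitioning can look like, the following Python sketch uses a plain k-means-style loop as a stand-in: each point is assigned to its nearest seed, each seed moves to the mean of its members, and the two steps repeat until the assignments stop changing. The function name, the seed positions and the Euclidean distance are all assumptions; this is not the BIONUMERICS implementation.

```python
import math


def partition_from_seeds(points, seeds, max_iter=50):
    """Cluster (x, y) points starting from one seed position per genotype."""
    centers = [tuple(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest center for every point.
        new_labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the partition has converged
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # a center without members keeps its position
                centers[k] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return labels, centers


# Hypothetical normalized signals for six wells of a diploid sample set.
points = [(2.0, 0.1), (1.9, 0.2), (1.0, 1.0), (1.1, 0.9), (0.1, 2.0), (0.2, 1.8)]
# One seed per expected genotype: homozygous 1, heterozygous, homozygous 2.
seeds = [(2.0, 0.0), (1.0, 1.0), (0.0, 2.0)]
labels, centers = partition_from_seeds(points, seeds)
print(labels)
```

In this sketch, a polyploid sample set would simply use more seeds, one per expected dosage class. A production caller would also need thresholds for "No call" and NTC wells, which this example omits.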
The SNP calling window
Single or multiple SNP files can be opened in the SNP calling window. All SNPs found in the selected files are listed (upper left panel), from which an individual SNP can be selected to show in the plot window. SNP calls can also be plotted per file or for selected files (bottom left panel).
On the plot, SNP calls can be selected using the lasso selection tool.
By default, SNP calls are shown in different colors according to the defined zygosity calls, the NTC calls and the 'No calls' (figure above). These colors can be changed by the user. The statistical confidence of the calls can also be used as a basis for plot colors (right). In a third view, colors are based on the SNP files.
SNP calls can be sorted and filtered by their confidence, ROX signal or other parameters.
The plot can be displayed with Cartesian coordinates, polar coordinates and contrast coordinates (below).
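The relation between these coordinate systems can be sketched in a few lines of Python. The polar conversion is standard. For contrast coordinates, the sketch assumes one commonly used definition, (x - y) / (x + y) plotted against the total signal x + y; whether the plugin uses exactly this definition is an assumption, and the function names are hypothetical.

```python
import math


def to_polar(x, y):
    """Convert Cartesian signals to (radius, angle in degrees)."""
    return math.hypot(x, y), math.degrees(math.atan2(y, x))


def to_contrast(x, y):
    """Convert Cartesian signals to (contrast, strength).

    Assumed definition: contrast = (x - y) / (x + y), strength = x + y.
    """
    total = x + y
    if total == 0:
        return None  # contrast is undefined without any signal
    return (x - y) / total, total


# Hypothetical normalized signals for a heterozygous-looking well.
print(to_polar(1.0, 1.0))
print(to_contrast(1.0, 1.0))
```

Polar and contrast views both separate the genotype information (angle or contrast) from the overall signal strength, which can make clusters easier to distinguish when signal intensities vary between wells.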
Selected SNP calls can be changed by the user. For each SNP call, the software displays whether it was assigned manually or automatically. Tab-delimited files with calls and confidence information can be exported for any selection of samples in the database.
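For readers who post-process such exports in their own scripts, the following Python sketch shows how a list of calls could be filtered by confidence and written as tab-delimited text. The column names, the dictionary layout and the function name are hypothetical and do not reflect the plugin's actual export format.

```python
import csv
import io


def calls_to_tsv(calls, min_confidence=0.0):
    """Return tab-delimited text for all calls at or above a confidence cutoff.

    Rows are sorted from highest to lowest confidence.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["sample", "snp", "call", "confidence", "assigned"])
    for c in sorted(calls, key=lambda c: c["confidence"], reverse=True):
        if c["confidence"] >= min_confidence:
            writer.writerow(
                [c["sample"], c["snp"], c["call"], f'{c["confidence"]:.2f}', c["assigned"]]
            )
    return buffer.getvalue()


# Hypothetical calls; "assigned" records manual versus automatic assignment.
example_calls = [
    {"sample": "S1", "snp": "snp01", "call": "AA", "confidence": 0.99, "assigned": "auto"},
    {"sample": "S2", "snp": "snp01", "call": "No call", "confidence": 0.40, "assigned": "auto"},
    {"sample": "S3", "snp": "snp01", "call": "AB", "confidence": 0.95, "assigned": "manual"},
]
print(calls_to_tsv(example_calls, min_confidence=0.9))
```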
Analysis and genotyping
Analysis in BIONUMERICS does not end with SNP calling. The software offers numerous advanced tools for data mining, clustering, screening, identification and statistical analysis. The rich analysis platform, in combination with the powerful databasing capacities, makes BIONUMERICS well suited for long-term genotyping projects on a lab-wide or inter-laboratory basis. Most importantly, BIONUMERICS offers equally powerful tools for various other genotyping techniques, including AFLP band scoring, MLPA analysis, and genome sequencing and SNP analysis. Results from different techniques can be combined in various ways to obtain more conclusive analyses.
The figure below is an example of a two-way clustering of carrot samples and SNPs, nicely illustrating the correlation between groups of samples on the one hand and groups of SNPs on the other hand.
Samples that have specific genotypes (combinations of zygosities) can be screened for throughout projects and databases using the advanced query builder. To achieve an even higher level of screening and automation power, decision networks can be built.
This application is implemented in BIONUMERICS as the SNP calling plugin, which is license-based. Please contact us for licensing information.