Cluster analysis based on pairwise similarities

BIONUMERICS calculates pairwise similarity values and performs cluster analysis on up to 20,000 database entries for any type of experiment. Different similarity and distance coefficients are available for each data type, for example:

Fingerprints: Pearson product-moment correlation, cosine correlation, Dice (or Nei and Li), Jaccard, Jeffrey's X, Ochiai, and number of different bands. Fuzzy logic and area sensitivity are available as additional options for band-based coefficients. Banding patterns also have optimization and tolerance settings that can be adjusted from trace to trace, and the most suitable tolerance settings can be determined statistically.
Characters: Gower, rank correlation, Canberra metric, simple matching, Bray-Curtis, Chebyshev, Euclidean distance, etc. The categorical coefficient is suitable for multi-state data such as VNTR, MLST, and antibiotic (AB) resistance patterns.
Sequences: Similarities calculated from pairwise and multiple sequence alignments, using Needleman-Wunsch, Wilbur-Lipman, or BIONUMERICS' proprietary algorithm.
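
To make a few of these coefficients concrete, the following is a minimal Python sketch of Dice, Jaccard, Pearson correlation, and a simple categorical coefficient. It is an illustration only and not the BIONUMERICS implementation: the function names and example data are hypothetical, band matching is assumed to have been done already (so each fingerprint is a 0/1 presence vector), and the categorical coefficient is assumed here to be the fraction of characters with identical states. Tolerance, optimization, fuzzy logic, and area sensitivity are not modeled.

```python
from math import sqrt

def dice(a, b):
    """Dice (Nei and Li) coefficient for two 0/1 band-presence vectors."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x in a if x) + sum(1 for y in b if y)
    return 2 * shared / total if total else 1.0

def jaccard(a, b):
    """Jaccard coefficient: shared bands divided by bands present in either lane."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return shared / union if union else 1.0

def pearson(x, y):
    """Pearson product-moment correlation of two densitometric curves."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    var_x = sum((p - mean_x) ** 2 for p in x)
    var_y = sum((q - mean_y) ** 2 for q in y)
    return cov / sqrt(var_x * var_y)

def categorical(a, b):
    """Assumed definition: fraction of characters (e.g. loci) with identical states."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)

# Hypothetical example data
lane_a = [1, 1, 0, 1, 0, 1]          # band present (1) or absent (0) per band class
lane_b = [1, 0, 0, 1, 1, 1]
mlst_a = [1, 3, 1, 1, 1, 1, 3]       # allele numbers for seven loci
mlst_b = [1, 3, 1, 1, 1, 4, 3]

print(dice(lane_a, lane_b))          # 0.75
print(jaccard(lane_a, lane_b))       # 0.6
print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0
print(categorical(mlst_a, mlst_b))   # 6 of 7 loci identical
```

Band-based coefficients such as Dice and Jaccard depend only on which bands are shared, whereas Pearson correlation compares whole curves and needs no band assignment.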

Several clustering methods are available for calculating dendrograms from pairwise similarity values: unweighted pair group method with arithmetic mean (UPGMA), complete linkage (furthest neighbor), single linkage (nearest neighbor), Ward, centroid, median, Neighbor Joining, Bio-Neighbor Joining, NeighborNet clustering, and the Correlation Eliminator and Partial Correlation Eliminator methods.
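
As an illustration of how a dendrogram is derived from pairwise values, the sketch below implements UPGMA in plain Python. It is a simplified example rather than the BIONUMERICS algorithm: the function name `upgma`, the entry labels, and the similarity percentages are hypothetical, and similarities are assumed to be converted to distances as 100 minus the similarity.

```python
from itertools import combinations

def upgma(labels, dist):
    """Average-linkage (UPGMA) clustering on a square distance matrix.

    Returns the merges in order as (members_1, members_2, distance).
    """
    clusters = [(i,) for i in range(len(labels))]  # each entry starts alone
    merges = []

    def avg(c1, c2):
        # UPGMA distance: mean of all pairwise distances between members
        return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Join the two closest clusters
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: avg(*pair))
        merges.append((sorted(labels[i] for i in c1),
                       sorted(labels[i] for i in c2),
                       avg(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical pairwise similarities in percent
labels = ["A", "B", "C", "D"]
sim = [
    [100, 90, 60, 50],
    [90, 100, 70, 40],
    [60, 70, 100, 80],
    [50, 40, 80, 100],
]
dist = [[100 - s for s in row] for row in sim]

merges = upgma(labels, dist)
for left, right, d in merges:
    print(left, right, "joined at similarity", 100 - d)
```

With these values, A and B join first at 90% similarity, C and D join at 80%, and the two pairs join at 55%. Complete and single linkage differ only in replacing the mean in `avg` with the maximum or minimum pairwise distance.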

An interactive wizard guides the input of parameters and options, which makes cluster analysis more accessible to users with little statistical background.

Supported in BIONUMERICS configurations:

BIONUMERICS-GEL

BIONUMERICS-MALDI

BIONUMERICS-SEQ

BIONUMERICS-SUITE

Relevant applications:

AFLP-based codominant band scoring

Bacterial community fingerprinting

Molecular surveillance networks

Multi-Locus VNTR Analysis (MLVA)

Multilocus sequence typing (MLST) analysis

Mycobacterial Interspersed Repetitive Units (MIRU) typing