MICROBES AND MICROCHIPS
COMMONLY CLUSTERING
FINAL FUN FACT (will do at end)
100

Richness and evenness increase this type of diversity.

What is alpha diversity?
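
As a study aid, the link between richness, evenness, and alpha diversity can be sketched with the Shannon index, one common alpha diversity measure. The function names and the three sample count vectors below are illustrative assumptions, not taken from the quiz.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in props)


def pielou_evenness(counts):
    """Pielou's J = H' / ln(S), where S is richness (number of observed taxa)."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        return 0.0  # evenness is not meaningful with fewer than two taxa
    return shannon_index(counts) / math.log(richness)


# Hypothetical read counts per taxon in three samples.
even_sample = [25, 25, 25, 25]        # 4 taxa, perfectly even
skewed_sample = [97, 1, 1, 1]         # 4 taxa, dominated by one
richer_sample = [20, 20, 20, 20, 20]  # 5 taxa, perfectly even

for name, sample in [("even", even_sample),
                     ("skewed", skewed_sample),
                     ("richer", richer_sample)]:
    print(name, round(shannon_index(sample), 3), round(pielou_evenness(sample), 3))
```

With equal richness, the even sample scores higher than the skewed one; with equal evenness, the richer sample scores higher than the 4-taxon one.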

100

Compared with paralogs, these are more often functionally similar at some level.

What are orthologs?

100

As Simon Fraser University's first Chancellor, this physicist recommended to the provincial government that SFU be located atop the summit of Burnaby Mountain.

Who is Gordon Shrum?

200

Copy number variations, marker regions that differ in quality, the absence of universal genes for viruses, and a limited ability to characterise novel taxa beyond placing them in a phylogenetic tree all make this type of sequencing approach to microbiome analysis challenging.

What is amplicon sequencing?

200

Clustal multiple sequence alignment uses an example of this general type of clustering.

What is hierarchical clustering?
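
A minimal sketch of agglomerative hierarchical clustering with average linkage, similar in spirit to the way a guide tree is built from pairwise distances before progressive alignment. This is not Clustal's actual code; the function name, sequence labels, and distance values are hypothetical.

```python
def average_linkage(dist, labels):
    """Merge the two closest clusters repeatedly; return the tree as nested tuples."""
    # Each cluster stores (member indices into dist, subtree built so far).
    clusters = {i: ([i], labels[i]) for i in range(len(labels))}
    next_id = len(labels)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                members_a, members_b = clusters[a][0], clusters[b][0]
                # Average of all pairwise distances between the two clusters.
                d = sum(dist[i][j] for i in members_a for j in members_b)
                d /= len(members_a) * len(members_b)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = (clusters[a][0] + clusters[b][0], (clusters[a][1], clusters[b][1]))
        del clusters[a], clusters[b]
        clusters[next_id] = merged
        next_id += 1
    return next(iter(clusters.values()))[1]


# Hypothetical pairwise distances between four sequences.
labels = ["seqA", "seqB", "seqC", "seqD"]
dist = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.85, 0.95],
    [0.8, 0.85, 0.0, 0.2],
    [0.9, 0.95, 0.2, 0.0],
]

tree = average_linkage(dist, labels)
print(tree)  # (('seqA', 'seqB'), ('seqC', 'seqD'))
```

The nested tuples record the merge order: the closest pair joins first, and the root joins the last two clusters.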

300

Defined terms for genes, clustered into a hierarchy, connected by logical relationships, are commonly called this.

What is the Gene Ontology?

300

This is the number of clusters into which a k-means algorithm partitions the data.

What is k?
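
To make the role of k concrete, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It assumes Euclidean distance and, for reproducibility, seeds the centroids with the first k points; real implementations usually pick starting centroids more carefully. The function name and data points are hypothetical.

```python
import math


def kmeans(points, k, iterations=100):
    """Partition points into exactly k clusters; return (assignments, centroids)."""
    # Simplifying assumption: the first k points serve as initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    assignments = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignments, centroids


# Hypothetical 2-D data with two obvious groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.2, 0.8), (8.8, 9.2), (0.9, 1.1), (9.1, 8.9)]

assignments, centroids = kmeans(points, k=2)
print(assignments)  # [0, 1, 0, 1, 0, 1]
```

The algorithm always returns k centroids, whether or not the data actually contain k natural groups.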

400

This type of indexing is used by DIAMOND, but not BLAST, which makes DIAMOND a whole lot faster.

What is double indexing?

400

These are two disadvantages of k-means clustering.

Note any two of:
1. Must choose k, the number of clusters, in advance.
2. Data must be numerical so that Euclidean distance can be computed.
3. Clusters with non-spherical geometry may not be found.
4. Outliers mess things up!
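
The outlier problem comes from the centroid being a plain mean. A small sketch, using hypothetical points, shows how a single distant point drags the centroid away from the rest of its cluster:

```python
def centroid(points):
    """Mean position of a set of points, as used in the k-means update step."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


# Hypothetical tight cluster, then the same cluster plus one distant outlier.
cluster = [(1.0, 1.0), (1.0, 3.0), (3.0, 1.0), (3.0, 3.0)]
with_outlier = cluster + [(52.0, 52.0)]

print(centroid(cluster))       # (2.0, 2.0)
print(centroid(with_outlier))  # (12.0, 12.0)
```

One point out of five moves the centroid well outside the original four points.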

500

Not referring to a Star Trek episode, this type of seeding in DIAMOND adds higher sensitivity for matches and creates fewer redundant clusters.

What are spaced seeds?
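
A minimal sketch of why spaced seeds raise sensitivity. In a seed shape, "1" marks a position that must match and "0" marks a position that is ignored. The shapes and sequences below are hypothetical and are not DIAMOND's actual seed shapes, and DIAMOND's use of a reduced amino acid alphabet is left out.

```python
def seed_key(window, shape):
    """Keep only the characters at positions where the shape has a '1'."""
    return "".join(ch for ch, bit in zip(window, shape) if bit == "1")


def seed_hits(query, subject, shape):
    """Return (query_pos, subject_pos) pairs whose windows agree at all '1' positions."""
    span = len(shape)
    index = {}
    for j in range(len(subject) - span + 1):
        index.setdefault(seed_key(subject[j:j + span], shape), []).append(j)
    hits = []
    for i in range(len(query) - span + 1):
        for j in index.get(seed_key(query[i:i + span], shape), []):
            hits.append((i, j))
    return hits


# Hypothetical peptides that differ at one position (V vs I).
query = "MKVLAT"
subject = "MKILAT"

# Both shapes have weight 4 (four positions that must match).
print(seed_hits(query, subject, "1111"))   # []
print(seed_hits(query, subject, "11011"))  # [(0, 0)]
```

The contiguous seed finds nothing because every 4-letter window spans the mismatch, while the spaced seed of the same weight skips over it.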

500

These are two examples of when a reciprocal best BLAST hit could misidentify a paralog as an ortholog.

What are (1) gene deletion events, (2) incomplete genome information (or other such answers)?
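
The gene deletion case can be sketched with a toy reciprocal best hit (RBH) procedure. The gene names and bit scores are hypothetical: X and Y stand for two paralogous copies produced by an ancestral duplication, and each table maps a query gene to its hits in the other genome.

```python
def best_hit(scores, query):
    """Return the highest-scoring hit for a query gene, or None if it has no hits."""
    hits = scores.get(query, {})
    return max(hits, key=hits.get) if hits else None


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (gene_a, gene_b) where each gene is the other's best hit."""
    pairs = []
    for gene_a in a_vs_b:
        gene_b = best_hit(a_vs_b, gene_a)
        if gene_b is not None and best_hit(b_vs_a, gene_b) == gene_a:
            pairs.append((gene_a, gene_b))
    return sorted(pairs)


# Both genomes kept both copies: RBH pairs the true orthologs.
full_a_vs_b = {"A_X": {"B_X": 400, "B_Y": 180}, "A_Y": {"B_Y": 390, "B_X": 175}}
full_b_vs_a = {"B_X": {"A_X": 400, "A_Y": 175}, "B_Y": {"A_Y": 390, "A_X": 180}}
print(reciprocal_best_hits(full_a_vs_b, full_b_vs_a))
# [('A_X', 'B_X'), ('A_Y', 'B_Y')]

# Genome A lost copy Y and genome B lost copy X: the surviving paralogs
# are each other's only (and therefore best) hit.
loss_a_vs_b = {"A_X": {"B_Y": 180}}
loss_b_vs_a = {"B_Y": {"A_X": 180}}
print(reciprocal_best_hits(loss_a_vs_b, loss_b_vs_a))
# [('A_X', 'B_Y')]
```

In the second case RBH reports A_X and B_Y as a pair even though they are paralogs. An incomplete genome assembly that is missing a gene would produce the same pattern.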