Microbes and Microchips
What's in a Name?
Commonly Clustering
(R)evolution
Bonus "fun fact" question for the end
100

Richness and evenness increase this type of diversity.

Alpha diversity
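
To see how richness and evenness each feed into alpha diversity, here is a minimal sketch using the Shannon index and Pielou's evenness. The function names and the sample counts are made up for illustration and do not come from the course material.

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def pielou_evenness(counts):
    """Evenness J = H / ln(richness); defined here as 0.0 for fewer than two taxa."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        return 0.0
    return shannon_index(counts) / math.log(richness)

# Hypothetical read counts per taxon in three samples.
even_sample = [25, 25, 25, 25]      # 4 taxa, perfectly even
skewed_sample = [97, 1, 1, 1]       # 4 taxa, dominated by one
rich_sample = [10] * 10             # 10 taxa, perfectly even

for name, sample in [("even", even_sample),
                     ("skewed", skewed_sample),
                     ("rich", rich_sample)]:
    print(name, round(shannon_index(sample), 3), round(pielou_evenness(sample), 3))
```

With the same richness, the even sample scores higher than the skewed one; with the same evenness, the richer sample scores higher.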

100

Defined terms for genes, clustered into a hierarchy, connected by logical relationships, are commonly called this.

Gene ontology

100

This type of classification is performed using no knowledge of prior classes.

unsupervised

100

Compared with paralogs, these are more often functionally similar at some level.

Orthologs

200

Copy number variation, uneven marker-region quality, the lack of universal genes for viruses, and a limited ability to characterise novel taxa beyond placing them in a phylogenetic tree make this type of sequencing approach for microbiome analysis challenging.

Amplicon sequencing

200

GO is NOT this

any one of (1) a dictated standard (2) a way to unify databases (3) a definition of evolutionary relationships

200

Clustal multiple sequence alignment uses an example of this general type of clustering.

hierarchical clustering

200

This is something that you should do when trying to compute ortholog cutoffs

ask a statistician!

300

This is the recommended output of metagenomics now, instead of OTUs

amplicon sequence variants

300

This is the name of a subset of bio-related ontologies developed according to a set of best practices in ontology development

OBO Foundry

300

This is how many clusters a k-means algorithm will cluster data into.

K
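
A bare-bones k-means sketch shows why the answer is simply K: the algorithm returns as many clusters as it is asked for, whatever the data looks like. The toy points and the choice of the first k points as starting centroids are assumptions made for reproducibility, not a recommended initialisation.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of floats, using Euclidean distance."""
    # Assumption: seed the centroids with the first k points (deterministic).
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    labels = [min(range(k), key=lambda i: math.dist(p, centroids[i]))
              for p in points]
    return labels, centroids

# Hypothetical 2-D data forming three visible blobs.
points = [(1.0, 1.0), (5.0, 5.0), (9.0, 1.0),
          (1.2, 0.8), (5.2, 4.9), (9.1, 1.2)]

labels_k3, _ = kmeans(points, k=3)
labels_k2, _ = kmeans(points, k=2)
print(labels_k3)
print(labels_k2)
```

Asking for k=2 on three blobs still yields exactly two clusters, which is also why having to choose k up front is a drawback.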

300

Ortholuge can identify orthologs correctly when they show this characteristic, making them look like paralogs

unusual divergence

300

As Simon Fraser University's first Chancellor, this physicist recommended to the provincial government that SFU be located atop the summit of Burnaby Mountain.

Who is Gordon Shrum?

400

This type of metagenomic analysis uses a database of annotated genes.

reference-based analysis

400

This is a network of terms used to store ontological data in GO databases

directed acyclic graph (DAG)
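
A small sketch of why a DAG, rather than a tree, fits an ontology: a term may have more than one parent, and every ancestor is still reachable by following parent links. The term labels below are illustrative placeholders, not real GO terms or identifiers.

```python
# Toy "is_a" edges: each term maps to its list of parent terms.
# All labels are hypothetical and not taken from the Gene Ontology.
PARENTS = {
    "root_process": [],
    "metabolism": ["root_process"],
    "cell_activity": ["root_process"],
    "cell_metabolism": ["metabolism", "cell_activity"],  # two parents
}

def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen

print(sorted(ancestors("cell_metabolism")))
```

Because edges only point from child to parent and never loop back, the upward walk always terminates at the root.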

400

These are two disadvantages of k-means clustering.

Any two of:
1. Must choose k, the number of clusters.
2. Data must be numerical for Euclidean distance.
3. Clusters with other (non-spherical) geometry may not be found.
4. Outliers mess things up!

400

This event can cause a gene to undergo accelerated divergence

either (1) a gene rearrangement that puts a gene under the control of a different promoter or (2) different evolutionary pressure

500

These are two ways you can increase the quality of microbiome sequencing

any two of: maintaining microbiome integrity, avoiding selective enrichment/depletion of microbes, reducing contamination, distinguishing total vs. active microbes, etc.

500

These are the three classification categories found in Gene Ontology

(1) molecular function (2) cellular component (3) biological process

500

Good clustering leads to groups with what type of similarity?

high intra-class similarity and low inter-class similarity
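
One rough way to check this is to compare the mean pairwise distance within clusters against the mean distance between clusters; a small distance corresponds to high similarity. The helper name, the toy points, and both labelings are assumptions for illustration only.

```python
import math
from itertools import combinations

def mean_intra_inter(points, labels):
    """Mean Euclidean distance within clusters and between clusters."""
    intra, inter = [], []
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        (intra if lp == lq else inter).append(math.dist(p, q))
    return sum(intra) / len(intra), sum(inter) / len(inter)

# Hypothetical data: two tight pairs of points, far apart from each other.
points = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9)]

good_intra, good_inter = mean_intra_inter(points, [0, 0, 1, 1])
bad_intra, bad_inter = mean_intra_inter(points, [0, 1, 0, 1])

print("good labeling:", round(good_intra, 2), round(good_inter, 2))
print("bad labeling: ", round(bad_intra, 2), round(bad_inter, 2))
```

The good labeling has a small within-cluster distance and a large between-cluster distance; the bad labeling reverses that pattern.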

500

These are two examples of when a reciprocal best BLAST hit could misidentify a paralog as an ortholog.

(1) gene deletion events, (2) incomplete genome information
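
The gene deletion case can be sketched with a toy reciprocal-best-hit routine. Assume an ancestral duplication produced copies 1 and 2 in both species, so A1/B1 and A2/B2 are the true ortholog pairs. All gene names and scores are invented, and the dictionaries stand in for parsed BLAST results rather than any real BLAST output format.

```python
def best_hit(scores, query):
    """Return the highest-scoring subject for a query, or None if no hits."""
    hits = scores.get(query, {})
    return max(hits, key=hits.get) if hits else None

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where a's best hit is b and b's best hit is a."""
    pairs = []
    for gene_a in a_vs_b:
        gene_b = best_hit(a_vs_b, gene_a)
        if gene_b is not None and best_hit(b_vs_a, gene_b) == gene_a:
            pairs.append((gene_a, gene_b))
    return pairs

# Complete genomes: both copies present in both species (made-up scores).
full_a_vs_b = {"A1": {"B1": 400, "B2": 180}, "A2": {"B1": 175, "B2": 390}}
full_b_vs_a = {"B1": {"A1": 400, "A2": 175}, "B2": {"A1": 180, "A2": 390}}
full_pairs = reciprocal_best_hits(full_a_vs_b, full_b_vs_a)

# Differential loss: species A lost A2 and species B lost B1.
# A1 and B2 are paralogs, yet each is now the other's best hit.
loss_a_vs_b = {"A1": {"B2": 180}}
loss_b_vs_a = {"B2": {"A1": 180}}
loss_pairs = reciprocal_best_hits(loss_a_vs_b, loss_b_vs_a)

print(full_pairs)
print(loss_pairs)
```

The same false pairing appears if A2 and B1 exist but are simply missing from incomplete genome assemblies.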