Richness and evenness increase this type of diversity.
Alpha diversity
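As an illustration of how richness and evenness both raise alpha diversity, here is a minimal sketch using the Shannon index. The Shannon index is one common alpha diversity measure chosen here as an assumption (the card does not name a specific metric), and the count vectors are made-up toy data.

```python
import math

def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Toy abundance counts (hypothetical data, 100 reads per sample)
even_two = [50, 50]              # 2 taxa, perfectly even
even_four = [25, 25, 25, 25]     # 4 taxa, perfectly even (higher richness)
uneven_four = [85, 5, 5, 5]      # 4 taxa, one dominant (lower evenness)

for name, sample in [("even_two", even_two),
                     ("even_four", even_four),
                     ("uneven_four", uneven_four)]:
    print(name, round(shannon_index(sample), 3))
```

With these toy counts, the four-taxon even sample scores higher than both the two-taxon sample (less rich) and the dominated four-taxon sample (less even).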
Defined terms for genes, clustered into a hierarchy, connected by logical relationships, are commonly called this.
Gene ontology
This type of classification is performed using no knowledge of prior classes.
unsupervised
These are more often functionally similar at some level, versus paralogs.
Orthologs
Copy number variation, marker regions of uneven quality, the absence of universal genes for viruses, and a limited ability to characterise novel taxa beyond placing them in a phylogenetic tree make this type of sequencing approach for microbiome analysis challenging.
Amplicon sequencing
GO is NOT this
any one of (1) a dictated standard (2) a way to unify databases (3) a definition of evolutionary relationships
Clustal multiple sequence alignment uses an example of this general type of clustering.
hierarchical clustering
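To make the idea of hierarchical clustering concrete, here is a minimal agglomerative sketch. It uses single linkage on 1-D toy values purely as an assumption for illustration; it is not Clustal's actual guide-tree method, and the function name `agglomerate` is hypothetical.

```python
def agglomerate(values):
    """Single-linkage agglomerative clustering of 1-D values.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    which is the information a dendrogram would display.
    """
    clusters = [[v] for v in values]   # start with every item in its own cluster
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the closest pair of clusters (single linkage = nearest members)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return merges

merges = agglomerate([1.0, 2.0, 9.0, 10.5])
for left, right, dist in merges:
    print(left, "+", right, "at distance", dist)
```

The closest items merge first and the two resulting groups merge last, giving the nested hierarchy that distinguishes this approach from a flat partition.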
This is something that you should do when trying to compute ortholog cutoffs
ask a statistician!
This is the recommended output of metagenomics now, instead of OTUs
amplicon sequence variants
This is the name of a subset of bio-related ontologies developed according to a set of best practices in ontology development
OBO Foundry
This is how many clusters a k-means algorithm will cluster data into.
K
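The role of k as a user-supplied input can be seen in a minimal sketch of Lloyd's algorithm, the usual k-means procedure. Seeding the centroids with the first k points is an assumption made here to keep the example deterministic (real implementations usually seed randomly), and the points are toy data.

```python
import math

def kmeans(points, k, iterations=20):
    """Lloyd's algorithm. Returns (centroids, labels).

    Assumption: the first k points are used as the initial centroids.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid (Euclidean distance)
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0),
          (1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

The algorithm never decides how many clusters exist; it returns exactly as many as the k it is given.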
Ortholuge can identify orthologs correctly when they show this characteristic, making them look like paralogs
unusual divergence
As Simon Fraser University's first Chancellor, this Physicist recommended to the provincial government that SFU be located atop the summit of Burnaby Mountain.
Who is Gordon Shrum?
This type of metagenomic analysis uses a database of annotated genes.
reference-based analysis
This is a network of terms used to store ontological data in GO databases
directed acyclic graph (DAG)
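A short sketch may help show why a DAG rather than a simple tree is needed: a term can have more than one parent. The term names and relationships below are a hypothetical toy ontology, not actual GO content, and `ancestors` is an illustrative helper.

```python
# Toy ontology (hypothetical terms): each term maps to its list of parents.
parents = {
    "process": [],
    "transport": ["process"],
    "localization": ["process"],
    "ion transport": ["transport", "localization"],  # two parents: not a tree
}

def ancestors(term, parents):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen

print(sorted(ancestors("ion transport", parents)))
```

Because edges are directed from child to parent and no path loops back, the upward walk always terminates, and a gene annotated to a specific term is implicitly annotated to all of its ancestors.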
These are two disadvantages of k-means clustering.
Any two of:
1. Must choose k - the number of clusters.
2. Data must be numerical for Euclidean distance
3. Clusters with other geometry may not be found
4. Outliers mess things up!
This event can cause a gene to undergo accelerated divergence
either (1) gene rearrangement that puts a gene under the control of a different promoter or (2) different evolutionary pressure
These are 2 ways you can increase the quality of microbiome sequencing
any two of: maintaining microbiome integrity, avoiding selective enrichment/depletion of microbes, reducing contamination, distinguishing total vs. active microbes, etc.
These are the three classification categories found in Gene Ontology
(1) molecular function (2) cellular component (3) biological process
Good clustering leads to groups with what type of similarity?
high intra-class similarity and low inter-class similarity
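One way to check this criterion numerically is sketched below. It measures distance rather than similarity (an assumption for illustration), so a good clustering shows a small mean within-cluster distance and a large mean between-cluster distance. The points and labelings are toy data, and `mean_distances` is a hypothetical helper.

```python
import math
from itertools import combinations

def mean_distances(points, labels):
    """Return (mean intra-cluster distance, mean inter-cluster distance)."""
    intra, inter = [], []
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        (intra if lp == lq else inter).append(math.dist(p, q))
    return sum(intra) / len(intra), sum(inter) / len(inter)

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
good = mean_distances(points, [0, 0, 1, 1])   # neighbours grouped together
bad = mean_distances(points, [0, 1, 0, 1])    # neighbours split apart

print("good:", good)
print("bad: ", bad)
```

For the sensible labeling the within-cluster distance is much smaller than the between-cluster distance; for the poor labeling the relationship reverses.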
These are two examples of when a reciprocal best BLAST hit could misidentify a paralog as an ortholog.
(1) gene deletion events, (2) incomplete genome information
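The gene deletion case can be illustrated with a minimal reciprocal best hit sketch. The gene names and similarity scores are made up, and the helper functions are hypothetical; real pipelines would parse BLAST output instead. Here a1/b1 and a2/b2 are assumed to be the true ortholog pairs, with a1 and a2 being paralogs of each other.

```python
def best_hit(query, scores):
    """scores maps (query, subject) to a similarity score; return the top subject."""
    hits = {s: v for (q, s), v in scores.items() if q == query}
    return max(hits, key=hits.get) if hits else None

def reciprocal_best_hits(a_to_b, b_to_a):
    """Pairs (a, b) where each gene is the other's best hit."""
    pairs = []
    for a in sorted({q for q, _ in a_to_b}):
        b = best_hit(a, a_to_b)
        if b is not None and best_hit(b, b_to_a) == a:
            pairs.append((a, b))
    return pairs

def drop(scores, lost):
    """Remove every comparison involving a deleted gene."""
    return {k: v for k, v in scores.items() if not (set(k) & lost)}

# Hypothetical scores between genome A (a1, a2) and genome B (b1, b2)
a_to_b = {("a1", "b1"): 300, ("a1", "b2"): 180, ("a2", "b1"): 170, ("a2", "b2"): 310}
b_to_a = {("b1", "a1"): 300, ("b1", "a2"): 170, ("b2", "a1"): 180, ("b2", "a2"): 310}

complete = reciprocal_best_hits(a_to_b, b_to_a)

# Genome A has lost a2 and genome B has lost b1
lost = {"a2", "b1"}
after_loss = reciprocal_best_hits(drop(a_to_b, lost), drop(b_to_a, lost))

print(complete)
print(after_loss)
```

With both genomes complete, the true ortholog pairs are recovered. After the deletions, a1 and b2 become each other's best remaining hit and are reported as a pair even though they are paralogs. Incomplete genome information produces the same effect, since a missing gene looks identical to a deleted one.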