A name for genetic sequences that are similar due to shared evolutionary ancestry
What are homologous sequences?
A DNA sequence, often located far from a gene, that transcription factors bind to alter the expression of that gene
What is an enhancer region/sequence?
Simultaneous measurements of two or more modalities in single cells
What is single-cell multiomics?
The type of enzyme that comes from phage defense systems, and always cuts a particular molecule at the same target sequence
What are restriction enzymes/nucleases?
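To make "always cuts at the same target sequence" concrete, here is a minimal sketch that scans one strand for a recognition site and cuts at a fixed offset inside it. It uses EcoRI (site GAATTC, cut after the G) as the example; the function names `find_sites` and `digest` are illustrative, and the sketch ignores the second strand and sticky ends.

```python
def find_sites(dna, site="GAATTC"):
    """Return 0-based start positions of every occurrence of `site` in `dna`."""
    dna = dna.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


def digest(dna, site="GAATTC", cut_offset=1):
    """Split `dna` at each site, cutting `cut_offset` bases into the site.

    cut_offset=1 models EcoRI, which cuts between G and AATTC.
    Only the top strand is modeled here.
    """
    cuts = [p + cut_offset for p in find_sites(dna, site)]
    fragments = []
    prev = 0
    for c in cuts:
        fragments.append(dna[prev:c])
        prev = c
    fragments.append(dna[prev:])
    return fragments


fragments = digest("AAGAATTCTTGAATTCGG")
```

Because the enzyme recognizes the same site every time, the same input sequence always yields the same fragments.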
We use the expression of these genes to annotate scRNA-seq clusters with function or cell type
What are marker genes?
The part of the genome that directly codes for RNA and protein
What are coding regions or exons?
The sequencing method that enables both ends of the DNA fragment to be read, and is used in ChIP-seq experiments
What is paired-end sequencing?
An experimental artifact in which mRNAs from two (or more) cells receive the same cell barcode
What is a doublet?
The first step in CRISPR-Cas immunity
What is spacer acquisition?
The statistical test we use in GWAS to test the following null hypothesis: the frequencies of the two alleles are the same in cases and controls
What is Pearson's Chi-Square test?
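As a worked example of the test on a single variant, the sketch below computes Pearson's chi-square statistic for a 2x2 table of allele counts (cases vs. controls, allele A vs. allele B). The function name `allelic_chi_square` and the counts are made up for illustration. No continuity correction is applied, and the p-value uses the closed form for 1 degree of freedom.

```python
import math


def allelic_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson's chi-square test on a 2x2 allele-count table.

    Rows are cases/controls, columns are the two alleles.
    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    observed = [[case_a, case_b], [ctrl_a, ctrl_b]]
    row_totals = [sum(row) for row in observed]
    col_totals = [case_a + ctrl_a, case_b + ctrl_b]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if allele frequency is independent of status
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(X >= chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p = allelic_chi_square(60, 40, 40, 60)
```

In a real GWAS this test is repeated for every variant, so the p-value threshold has to account for multiple testing.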
When alleles occur together more often than can be accounted for by chance
What is linkage disequilibrium?
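"More often than can be accounted for by chance" can be quantified. A minimal sketch, assuming we already know the haplotype frequency `p_ab` and the two allele frequencies `p_a` and `p_b` (all hypothetical inputs): D is the excess of the observed haplotype frequency over the product expected under independence, and r^2 rescales D^2 to the range 0 to 1.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, r^2) for two biallelic loci.

    p_ab: frequency of haplotypes carrying allele A and allele B together
    p_a, p_b: frequencies of allele A and allele B on their own
    """
    # Under independence the haplotype frequency would be p_a * p_b
    d = p_ab - p_a * p_b
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2


d, r2 = linkage_disequilibrium(p_ab=0.5, p_a=0.5, p_b=0.5)
```

When D is 0 the alleles are in linkage equilibrium; when r^2 is 1, knowing one allele fully determines the other.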
The name for a type of plot commonly used to visualize bigWig signal near elements of interest, like enhancers
What are tornado plots?
In microfluidics, this is sequenced along with cDNA, corresponds to a single cell, and can be used to distinguish between mRNAs from different cells
What is a cell barcode?
This allows the CRISPR-Cas system to distinguish "self" from "non-self" DNA
What is a protospacer adjacent motif (PAM)?
In spatial transcriptomics, the cell barcode is replaced by this
What is a spatial barcode?
A genomic feature often found at the 5' end of genes, rich in a dinucleotide that is otherwise underrepresented in human DNA because methyl-C deaminates to T
What are CpG islands (CGI)?
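One way to see the underrepresentation is the observed/expected CpG ratio: the number of CG dinucleotides divided by the number expected from the C and G content alone. The sketch below computes it, with a rough island check using commonly cited thresholds (length of at least 200 bp, GC content above 50%, ratio above 0.6). These thresholds and the function names are assumptions for illustration, not part of this deck.

```python
def gc_content(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Rough check against the assumed thresholds described above."""
    return (
        len(seq) >= min_len
        and gc_content(seq) > min_gc
        and cpg_obs_exp(seq) > min_obs_exp
    )


ratio = cpg_obs_exp("CCGG")
```

A ratio well below 1 means CpG is depleted relative to base composition; CpG islands are the regions where it is not.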
The read counts in these regions cannot be used for normalization of ChIP-seq data
What are ChIP-seq peaks? (Anyone remember why?)
The algorithm we use to graph cell-cell similarities, in which each cell is connected to a fixed number (k) of its most similar cells
What is the k-nearest neighbor (kNN) graph?
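A minimal sketch of the idea, assuming each cell is a point in some low-dimensional space (for example, its top principal components) and similarity is Euclidean distance. The function name `knn_graph` is illustrative; real pipelines use approximate neighbor search rather than this brute-force loop.

```python
import math


def knn_graph(points, k):
    """Connect each point to its k nearest neighbors by Euclidean distance.

    Returns a dict mapping each point index to a list of neighbor indices,
    ordered from nearest to farthest.
    """
    graph = {}
    for i, p in enumerate(points):
        # Distance from cell i to every other cell
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        graph[i] = [j for _, j in dists[:k]]
    return graph


cells = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
graph = knn_graph(cells, k=1)
```

Every cell gets exactly k outgoing edges, regardless of how dense its neighborhood is. Clusters are then found on this graph by a separate community-detection step.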
The type of pooled CRISPR screen typically used for "pathway-specific" readouts, like γ-globin expression
What is a flow cytometry screen?
The three main steps in the Needleman-Wunsch algorithm for finding an optimal global alignment between protein or DNA sequences
What are setting up a matrix, scoring the matrix, and identifying the optimal alignment?
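The three steps map directly onto code. The sketch below assumes a simple scoring scheme (match +1, mismatch -1, linear gap penalty -1). These values are illustrative defaults, and real protein alignments would use a substitution matrix instead.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of a and b. Returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    # Step 1: set up the matrix, with gap penalties along the first row/column
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    # Step 2: score the matrix, each cell taking the best of three moves
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Step 3: trace back from the bottom-right corner to recover an alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


result = needleman_wunsch("ACGT", "AGT")
```

When several alignments tie for the best score, this traceback returns just one of them, preferring diagonal moves.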
The most common approach used for multiple sequence alignment (MSA), which works by constructing a succession of pairwise alignments
What is progressive alignment, or CLUSTAL?
In ChIP-seq peak calling, MACS2 accounts for background signal by modeling the ChIP-seq reads using this distribution
What is a Poisson distribution?
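To illustrate how a Poisson background model turns a read count into a p-value, here is a minimal sketch. `lam` is the expected number of background reads in a window and `k` is the observed count; both are hypothetical inputs the caller supplies. How MACS2 itself estimates the background rate is not reproduced here.

```python
import math


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), with lam > 0.

    Terms are computed in log space so large counts do not overflow.
    """
    cdf = sum(
        math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
        for i in range(k)
    )
    return 1.0 - cdf


# Chance of seeing 20 or more reads when the background predicts 2
p_value = poisson_sf(20, 2.0)
```

A window is a candidate peak when this p-value is very small. Computing it as 1 minus the CDF loses precision for extremely small p-values, so treat this as a teaching sketch rather than production code.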
After scaling (to account for variability in read depth), we often use this transformation to upweight genes that are more likely to be non-uniformly expressed
What is Pearson residual transformation?
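One common formulation, which may differ in detail from the variant used in class, compares each count to the value expected if the gene were expressed uniformly across cells, then divides by the expected standard deviation under a negative binomial model. In the sketch below the overdispersion `theta=100` and the clipping at the square root of the number of cells are assumed defaults, and read depth is handled through the expected value rather than a separate scaling step.

```python
import math


def pearson_residuals(counts, theta=100.0):
    """Pearson residuals for a cells x genes count matrix (list of lists).

    Expected count: mu = cell_total * gene_total / grand_total
    Residual: (x - mu) / sqrt(mu + mu^2 / theta), clipped to +/- sqrt(n_cells)
    """
    n_cells = len(counts)
    cell_totals = [sum(row) for row in counts]
    gene_totals = [sum(col) for col in zip(*counts)]
    grand_total = sum(cell_totals)
    clip = math.sqrt(n_cells)

    residuals = []
    for row, cell_total in zip(counts, cell_totals):
        out = []
        for x, gene_total in zip(row, gene_totals):
            mu = cell_total * gene_total / grand_total
            if mu == 0:
                out.append(0.0)  # gene or cell with no counts at all
                continue
            r = (x - mu) / math.sqrt(mu + mu * mu / theta)
            out.append(max(-clip, min(clip, r)))
        residuals.append(out)
    return residuals


uniform = pearson_residuals([[5, 5], [5, 5]])
specific = pearson_residuals([[10, 0], [0, 10]])
```

Uniformly expressed genes get residuals near zero, while genes concentrated in a subset of cells get large positive residuals there, which is what upweights them downstream.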
These are the four experimental components needed for a successful CRISPR screen
What are model, perturbation, challenge (or "assay"), and read-out?
We use this high-throughput experimental technique to measure protein expression
What is mass spectrometry or mass cytometry?