The technical term for the average value of a data set
What is the mean?
This term refers to the overall management and control of data assets within an organization, ensuring data quality, compliance, and security
What is data governance?
Type of query language that is used to access data warehouses
What is SQL?
A chart that is used to show how two numeric variables are related
What is a scatterplot?
Name the two programming languages commonly used in analytics engineering for data manipulation and analysis
What are Python and SQL?
The most common measure of spread or dispersion in a data set
What is standard deviation?
This type of cyber attack involves gaining unauthorized access to systems
What is hacking?
Type of key uniquely identifying each record in a fact table
What is a surrogate key?
A type of plot that shows trends and cycles over time
What is a line graph?
This dbt command builds the documentation site whose lineage graph visualizes the DAG of model dependencies
What is dbt docs generate?
Coding language that is commonly used for statistical analysis and data science
What is R?
This law gives customers control over how companies use their personal data
What is GDPR?
Data warehouse design technique that organizes data into a central fact table joined to surrounding dimension tables
What is a star schema?
This technique visualizes text data by word frequency
What is a word cloud?
This technique models data as nodes/edges in a graph database
What is a graph data model?
Statistical method that calculates a line that best fits a set of data points
What is linear regression?
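Worked example for the card above: a minimal sketch of an ordinary least-squares fit for a single predictor, assuming plain Python lists as input. The function name `fit_line` and the sample points are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical points that lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```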
This technique obscures sensitive data like credit cards and social security numbers
What is masking?
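Worked example for the card above: a minimal sketch of one common masking style, which hides every digit except the last few and keeps separators in place. The helper `mask_digits` and the sample numbers are hypothetical; real systems may prefer tokenization or format-preserving encryption instead.

```python
def mask_digits(value, visible=4, mask_char="*"):
    """Mask all digits except the last `visible` ones, leaving separators intact."""
    total_digits = sum(ch.isdigit() for ch in value)
    to_mask = max(total_digits - visible, 0)
    masked = []
    for ch in value:
        if ch.isdigit() and to_mask > 0:
            masked.append(mask_char)
            to_mask -= 1
        else:
            masked.append(ch)
    return "".join(masked)


# Made-up test values, not real account numbers.
card = mask_digits("4111-1111-1111-1234")
ssn = mask_digits("123-45-6789")
print(card)
print(ssn)
```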
Technique which creates aggregated views of data for reporting and analysis
What is OLAP (online analytical processing)?
This chart is commonly used to show hierarchical or tree-structured data
What is a dendrogram?
A data modeling technique that graphically represents the entities, relationships, and attributes within a system
What is an entity relationship diagram?
The formula for calculating the sample variance of a data set
What is the sum of squared deviations from the mean, divided by n - 1?
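Worked example for the card above: a minimal sketch of the n - 1 (sample) variance written out by hand, with the standard deviation taken as its square root. The function name `sample_variance` and the data values are hypothetical; the standard library's `statistics.variance` computes the same quantity.

```python
import statistics


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical example values
variance = sample_variance(data)
std_dev = variance ** 0.5  # standard deviation is the square root of the variance
print(variance, std_dev)
print(statistics.variance(data))  # library equivalent, for comparison
```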
This application layer firewall examines traffic before it reaches backend servers
What is a WAF (web application firewall)?
A NoSQL database that stores semi-structured data as JSON-like documents of field-value pairs
What is MongoDB?
This principle states that extra info should not distract from key relations
What is the data-ink ratio?
This type of join is typically fastest for equality joins on large, unsorted data sets
What is a hash join?
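Worked example for the card above: a minimal in-memory sketch of the build-and-probe idea behind a hash join, assuming rows are dictionaries and the join is an inner equi-join on one key. The function `hash_join` and the sample tables are hypothetical; database engines add partitioning, spilling to disk, and cost-based choice of the build side.

```python
from collections import defaultdict


def hash_join(left, right, key):
    """Inner equi-join of two lists of dicts on `key`."""
    # Build phase: index the (ideally smaller) right input by join key.
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)

    # Probe phase: look up each left row's key in the hash index.
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined


# Hypothetical sample tables.
orders = [
    {"customer_id": 1, "amount": 30},
    {"customer_id": 2, "amount": 15},
    {"customer_id": 1, "amount": 50},
    {"customer_id": 3, "amount": 5},
]
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Lin"},
]
result = hash_join(orders, customers, "customer_id")
for row in result:
    print(row)
```

Each input is scanned once, so the work grows roughly with the combined size of the two inputs rather than with their product, as it would in a naive nested loop.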