Computer Science
Machine Learning
Big Data
Probability and Statistics
Databases
200

Defined as a step-by-step procedure or formula for solving a problem, this term is fundamental in computer science and mathematics.

What is an Algorithm?

200

This occurs when a model learns training data too well, including noise and details, leading to poor performance on new data.

What is overfitting?

200

This storage repository holds a vast amount of raw data in its native format until it is needed for analysis.

What is a Data Lake?

200

These are the 3 main measures of central tendency.

What are mean, median, and mode?
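
As a quick illustration of the three measures, here is a minimal sketch using Python's standard `statistics` module. The sample values are made up for the example and are not part of the original clue.

```python
import statistics

# Hypothetical sample values, chosen only for illustration.
values = [2, 3, 3, 5, 7, 10]

mean_value = statistics.mean(values)      # arithmetic average of the values
median_value = statistics.median(values)  # middle of the sorted values
mode_value = statistics.mode(values)      # most frequently occurring value

print(mean_value, median_value, mode_value)
```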

200

This language is often used to manage and manipulate data stored in relational databases and allows users to retrieve, insert, update, and delete data with ease.  

What is SQL?

400

This brilliant mathematician and codebreaker, known as the father of modern computing, played a pivotal role in cracking the German Enigma code during World War II.

Who is Alan Turing?

400

This machine learning model is inspired by the structure and function of biological brains.

What is a neural network?

400

This term refers to the process of extracting useful patterns and insights from large datasets that are too complex or extensive for traditional data-processing software to handle.

What is Data Mining?

400

This measure of variation and dispersion in a set of values is represented by the Greek letter sigma (σ).

What is Standard Deviation?
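
For a population of N values with mean μ, the standard deviation is σ = sqrt(Σ(x − μ)² / N). The sketch below computes it by hand and compares it with `statistics.pstdev` from the standard library. The sample values are an assumption made for illustration.

```python
import math
import statistics

# Hypothetical sample values, chosen only for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Population mean.
mu = sum(values) / len(values)

# Population standard deviation: square root of the mean squared deviation.
sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))

# The standard library computes the same quantity.
library_sigma = statistics.pstdev(values)

print(mu, sigma, library_sigma)
```

Note that `statistics.stdev` (without the `p`) divides by N − 1 instead of N, which is the usual choice when the values are a sample rather than the whole population.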

400

This type of key is used to link two tables together, establishing a relationship between them.

What is a foreign key?

600

This crucial component of a CPU performs basic mathematical and logical operations on data. It's abbreviated as ALU.

What is the Arithmetic Logic Unit?

600

This machine learning method allows models to analyze and cluster unlabeled data sets. These algorithms discover hidden patterns or data groupings without the need for human intervention. 

What is unsupervised learning?

600

This technology framework, often associated with Big Data, is designed for distributed storage and processing of large datasets across clusters of computers. Its mascot is a yellow elephant.

What is Hadoop?

600

This statistical method models the relationship between a dependent variable and one or more independent variables by fitting a linear equation to observed data points.

What is linear regression?

600

This set of properties, often expressed as an acronym, ensures database transactions are reliably processed, maintaining data integrity even in the event of failures.

What is ACID?

800

Named after its Dutch inventor, this algorithm finds the shortest path between nodes in a graph with non-negative weights.

What is Dijkstra's algorithm?
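
One common way to implement this algorithm uses a priority queue. The sketch below is a minimal version built on Python's `heapq`. The function name `dijkstra`, the adjacency-list layout, and the small example graph are all assumptions made for illustration.

```python
import heapq

def dijkstra(graph, source):
    """Return the shortest distance from source to every reachable node.

    graph maps each node to a list of (neighbor, weight) pairs.
    All weights are assumed to be non-negative.
    """
    dist = {source: 0}
    heap = [(0, source)]  # (distance so far, node)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry; a shorter path was already found
        for neighbor, weight in graph.get(node, []):
            new_d = d + weight
            if new_d < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_d
                heapq.heappush(heap, (new_d, neighbor))
    return dist

# Hypothetical example graph.
graph = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 1)],
    "C": [("B", 2), ("D", 5)],
    "D": [],
}
distances = dijkstra(graph, "A")
print(distances)
```

In this example the direct edge from A to B costs 4, but the route through C costs 1 + 2 = 3, so the shorter route is the one recorded.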

800

This commonly-used machine learning algorithm combines the output of multiple decision trees to reach a single result. 

What is Random Forest?

800

A fast and general-purpose cluster computing system, this unified analytics engine for big data processing originated at UC Berkeley and is now maintained by the Apache Software Foundation.

What is Spark?

800

This type of distribution is often called a bell curve and is symmetrical around its mean.

What is a normal distribution?

800

In a database, this sequence of operations is treated as a single unit, which must either complete in full or have no effect at all to maintain data integrity.

What is a transaction?

1000

Often considered one of the most important problems in computer science, this question asks whether every problem whose solution can be verified in polynomial time can also be solved in polynomial time. 

What is the P versus NP problem?

1000

This subset of machine learning uses multilayered neural networks to simulate the complex decision-making power of the human brain.

What is Deep Learning?

1000

Developed at Google during the early 2000s, this popular programming model simplifies the process of writing applications that process vast amounts of data in parallel across a distributed cluster of computers.

What is MapReduce?

1000

This distribution gives the probability of observing a given number of successes in a fixed number of independent trials, each of which has only two possible outcomes.

What is a binomial distribution?
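
For n independent trials, each succeeding with probability p, the probability of exactly k successes is C(n, k) · p^k · (1 − p)^(n − k). The sketch below evaluates that formula with `math.comb`. The helper name `binomial_pmf` and the coin-flip numbers are assumptions made for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical example: exactly 2 heads in 4 flips of a fair coin.
two_heads = binomial_pmf(2, 4, 0.5)

# The probabilities over all possible counts of successes sum to 1.
total = sum(binomial_pmf(k, 4, 0.5) for k in range(5))

print(two_heads, total)
```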

1000

This type of database management system diverges from traditional SQL databases by offering flexible schema designs and scalability for handling large volumes of unstructured and semi-structured data.

What is NoSQL?
