Query Join Order
CAP Theorem
Cassandra Schema Design
Data Distribution Strategies
NoSQL+
100

This type of join results when there is no join condition specified between two tables.

What is a Cartesian product?
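
A minimal sketch of the idea, using two made-up in-memory tables (`colors` and `sizes` are illustrative names only): with no join condition, every row of one table is paired with every row of the other.

```python
from itertools import product

# Two tiny hypothetical tables, each a list of rows.
colors = ["red", "green", "blue"]
sizes = ["S", "L"]

# No join condition: every row on the left pairs with every row on the right.
cartesian = list(product(colors, sizes))
row_count = len(cartesian)  # always len(colors) * len(sizes)
```

The row count is always the product of the two table sizes, which is why an accidental Cartesian product gets expensive so quickly.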

100

In a network partition, this kind of system will serve all requests but may return stale data.

What is an AP system?

100

This component of a primary key determines how data is distributed across nodes.

What is the partition key?

100

This classic strategy for assigning keys to nodes fails when the number of nodes changes.

What is modulo-N partitioning?
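
The failure mode can be sketched as follows, assuming an MD5-based `stable_hash` helper and synthetic `user-…` keys (both are illustrative choices, not part of any particular system): count how many keys change owner when the cluster grows from three nodes to four.

```python
import hashlib

def stable_hash(key: str) -> int:
    """Deterministic hash; Python's built-in hash() is salted per process."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

def node_for(key: str, num_nodes: int) -> int:
    """Modulo-N partitioning: the owner is simply hash(key) mod N."""
    return stable_hash(key) % num_nodes

keys = [f"user-{i}" for i in range(10_000)]

# A key stays put only if it maps to the same node index under both N values.
moved = sum(node_for(k, 3) != node_for(k, 4) for k in keys)
fraction_moved = moved / len(keys)
print(f"{fraction_moved:.0%} of keys change owner going from 3 to 4 nodes")
```

With a uniform hash, a key keeps its owner only when `h % 3 == h % 4`, which holds for 3 of every 12 hash values, so roughly three quarters of the keys have to move.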

100

This class of NoSQL systems is designed to associate BLOBs or JSON strings with keys.

What is an object or document store?

200

This kind of optimizer uses static strategies rather than data statistics to determine join order.

What is a rule-based optimizer?

200

This kind of system will reject requests rather than return inconsistent data during a partition.

What is a CP system?

200

This component of a primary key determines how rows are sorted within a partition.

What is the clustering key?

200

This technique minimizes data movement by assigning nodes to points on a ring, but it does not make it easy to balance data evenly when new nodes join.

What is ring-based consistent hashing?
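
A minimal sketch of the ring, assuming one point per node, an MD5-based `ring_position` helper, and hypothetical node names: each key belongs to the first node found clockwise from the key's position, so adding a node only takes keys away from its clockwise successor.

```python
import bisect
import hashlib

def ring_position(name: str) -> int:
    """Map a node name or key to a point on the ring (illustrative hash choice)."""
    return int.from_bytes(hashlib.md5(name.encode()).digest()[:8], "big")

class Ring:
    def __init__(self, nodes):
        # One point per node, sorted by position around the ring.
        self._points = sorted((ring_position(n), n) for n in nodes)
        self._positions = [p for p, _ in self._points]

    def owner(self, key: str) -> str:
        # First node clockwise from the key, wrapping around at the end.
        i = bisect.bisect_right(self._positions, ring_position(key))
        return self._points[i % len(self._points)][1]

keys = [f"user-{i}" for i in range(10_000)]
before = Ring(["node-a", "node-b", "node-c"])
after = Ring(["node-a", "node-b", "node-c", "node-d"])

# Keys that change owner should all land on the newly added node.
changed = [k for k in keys if before.owner(k) != after.owner(k)]
moved_only_to_new_node = all(after.owner(k) == "node-d" for k in changed)
```

The balancing problem in the clue shows up here too: all of the moved keys come from a single existing node, so the other nodes get no relief when `node-d` joins.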

200

This term describes a system whose nodes are not specialized and can each perform all tasks for a subset of the data.

What is shared-nothing?

300

This kind of optimizer uses data size and filtering selectivity to choose join order.

What is a cost-based optimizer?

300

This CAP theorem choice is only possible for single-node or local-area-network systems.

What is CA?

300

This SQL clause is not very useful in Cassandra because the sorting of keys is already determined.

What is the ORDER BY clause?

300

This building block of the most refined data distribution strategy we analyzed splits the hashing ring into many small ranges such that each physical node can own many of them.

What is a virtual node or vnode?

300

According to Stonebraker's "The End of an Architectural Era" paper, this factor prompted the move away from traditional RDBMS systems.

What is the increase in memory speeds relative to disk?

400

This is the primary reason a rule-based join optimizer may choose a suboptimal join order.

What is ignoring table size or selectivity?

400

The core idea behind the CAP impossibility proof is that a node must make this kind of decision without knowing if a partition has occurred.

What is whether to wait for a response (consistency) or respond immediately (availability)?

400

This occurs when one of your partition keys has significantly more load than others.

What is hot spotting or unbalanced partitions?

400

Walking the ring clockwise to look for the next vnode that is owned by a different physical machine than the vnode a key initially landed on accomplishes this.

What is finding the physical node responsible for the second copy of the data for that key?
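
The clockwise walk can be sketched like this, assuming eight vnodes per machine, an MD5-based `ring_position` helper, and hypothetical node names (none of these values come from a specific system):

```python
import bisect
import hashlib

def ring_position(name: str) -> int:
    """Map a vnode label or key to a point on the ring (illustrative hash choice)."""
    return int.from_bytes(hashlib.md5(name.encode()).digest()[:8], "big")

def build_ring(physical_nodes, vnodes_per_node=8):
    """Each physical node owns several vnodes scattered around the ring."""
    return sorted(
        (ring_position(f"{node}#{v}"), node)
        for node in physical_nodes
        for v in range(vnodes_per_node)
    )

def replicas(ring, key, copies=2):
    """Walk clockwise from the key, skipping vnodes whose machine is already chosen."""
    positions = [p for p, _ in ring]
    start = bisect.bisect_right(positions, ring_position(key))
    owners = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node not in owners:
            owners.append(node)
        if len(owners) == copies:
            break
    return owners

ring = build_ring(["node-a", "node-b", "node-c"])
primary, secondary = replicas(ring, "user-42")
```

Skipping vnodes that belong to an already chosen machine is what keeps the second copy off the same hardware as the first.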

400

This concurrency control technique allows readers to access older versions without blocking writers.

What is MVCC?

500

A cost-based optimizer prefers this join algorithm when no useful indexes are present and one relation is smaller than the other, using the smaller side to minimize memory overhead during the build phase.

What is a hash join?
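
A minimal in-memory sketch of the build and probe phases, using made-up `departments` and `employees` rows (a real engine would also handle spilling to disk, which is omitted here):

```python
from collections import defaultdict

def hash_join(small, large, small_key, large_key):
    """Equi-join two lists of dict rows, building the hash table on the smaller side."""
    # Build phase: hash the smaller relation so the table uses less memory.
    table = defaultdict(list)
    for row in small:
        table[row[small_key]].append(row)

    # Probe phase: stream the larger relation and look up matches.
    joined = []
    for row in large:
        for match in table.get(row[large_key], []):
            joined.append({**match, **row})
    return joined

departments = [{"dept_id": 1, "dept": "db"}, {"dept_id": 2, "dept": "os"}]
employees = [
    {"name": "ann", "dept_id": 1},
    {"name": "bob", "dept_id": 2},
    {"name": "cy", "dept_id": 1},
    {"name": "dee", "dept_id": 3},  # no matching department, so no output row
]
rows = hash_join(departments, employees, "dept_id", "dept_id")
```

Each relation is scanned once, and only the smaller one has to fit in the hash table.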

500

In this type of network, nodes cannot distinguish lost messages from significantly delayed ones.

What is an asynchronous network?

500

What is wrong with the following schema?

What is not enough partitions?

500

If a fourth node joins a three-node system that uses vnodes and keeps one copy of each piece of data, this is the fraction of its data each existing node should send to the new node.

What is one fourth?
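
The arithmetic can be checked with a short sketch, assuming data is perfectly balanced before and after the join (variable names are illustrative):

```python
from fractions import Fraction

old_nodes, new_nodes = 3, 4

# Balanced shares of the total data set before and after the join.
share_before = Fraction(1, old_nodes)  # each node holds 1/3
share_after = Fraction(1, new_nodes)   # each node should hold 1/4

# What each existing node gives up, as a share of the total data set.
sent_per_node = share_before - share_after

# The same amount expressed as a fraction of that node's own data.
fraction_of_own_data = sent_per_node / share_before

# What the new node ends up with after receiving from all three.
new_node_total = old_nodes * sent_per_node
```

Each existing node gives up 1/12 of the total data set, which is one fourth of what it held, and the new node ends up with the same 1/4 share as everyone else.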

500

This mechanism in Dynamo tracks causality between multiple versions of the same key and enables reconciliation during read operations.

What are vector clocks?
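
A minimal sketch of the comparison, representing each clock as a dict from a hypothetical server name to a counter (`sx`, `sy`, and `sz` are made-up names, and this is not Dynamo's actual code):

```python
def descends(a: dict, b: dict) -> bool:
    """True if clock a has seen every event that clock b has."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def compare(a: dict, b: dict) -> str:
    """Classify the causal relationship between two versions of a key."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a supersedes b"
    if descends(b, a):
        return "b supersedes a"
    return "concurrent"  # neither descends from the other: reconcile on read

v1 = {"sx": 1}
v2 = {"sx": 2}           # sx wrote again on top of v1
v3 = {"sx": 2, "sy": 1}  # sy updated v2
v4 = {"sx": 2, "sz": 1}  # sz updated v2 independently

newer = compare(v2, v1)
conflict = compare(v3, v4)
```

Versions that come back as concurrent are the ones that need to be reconciled during a read.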
