BigQuery
Dataflow
Pub/Sub & IoT
Dataproc
Machine Learning
100

Define a query

What is a request to retrieve information from a database?

100

This is a tool for developing and executing a wide range of data processing patterns on very large datasets (e.g. performing the transformations described in ETL)

What is Cloud Dataflow?

100

Pub/Sub is ...

What is a service to help customers capture data and rapidly pass massive amounts of messages between other GCP big data tools and other software applications with world-class security?

100

Hadoop is ...

What is a set of tools and technologies which enables a cluster of computers to store and process large volumes of data?

100

Define ML

What is a branch of computer science that is focused on enabling computers to recognize patterns in data - without humans telling the computer how to recognize the patterns?

200

DOUBLE JEOPARDY!*

Typically, queries are written in this language

*DJ IF you can tell me what SQL stands for

What is SQL?

What is Structured Query Language?
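
To make the SQL clue concrete, here is a minimal sketch of a query that retrieves information from a database. It uses Python's built-in sqlite3 module rather than BigQuery, and the `sales` table, its columns, and the `top_products` helper are hypothetical names made up for illustration.

```python
import sqlite3


def top_products(min_sales):
    """Return (product, amount) rows with amount >= min_sales, largest first."""
    # Throwaway in-memory database; the table and rows are illustrative only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (product TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("widget", 120), ("gadget", 45), ("gizmo", 300)],
    )
    # The query itself: SELECT picks columns, WHERE filters rows, ORDER BY sorts.
    rows = conn.execute(
        "SELECT product, amount FROM sales WHERE amount >= ? ORDER BY amount DESC",
        (min_sales,),
    ).fetchall()
    conn.close()
    return rows


print(top_products(100))  # [('gizmo', 300), ('widget', 120)]
```

The same SELECT / WHERE / ORDER BY shape is what "familiar SQL" refers to in the BigQuery cards, although BigQuery is queried through its own tools rather than sqlite3.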

200

DOUBLE JEOPARDY!*

Dataflow can read from or write to these five products (plus some) available on GCP

*DJ IF you can tell me what this makes Dataflow ideal for

What are Google Cloud Storage, BigQuery, Bigtable, Spanner, Firestore, etc?

What is building data pipelines that read from multiple data sources, process the data, and write the processed output to a final destination?

200

Pub/Sub stands for ...

What is publish/subscribe?
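
The publish/subscribe pattern behind the name can be sketched in a few lines. This is an in-memory toy under stated assumptions, not the Cloud Pub/Sub client library: the `Broker` class, its methods, and the `sensor-readings` topic are hypothetical names used only to show that publishers and subscribers know about a topic, not about each other.

```python
from collections import defaultdict


class Broker:
    """Toy message broker: routes each published message to a topic's subscribers."""

    def __init__(self):
        # Maps topic name -> list of subscriber callbacks.
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback to receive every message published to topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver message to all subscribers of topic; return how many received it."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


received = []
broker = Broker()
broker.subscribe("sensor-readings", received.append)
broker.subscribe("sensor-readings", lambda m: received.append(m.upper()))
delivered = broker.publish("sensor-readings", "temp=21")
print(delivered, received)  # 2 ['temp=21', 'TEMP=21']
```

A managed service adds what this sketch leaves out, such as durable storage, delivery across machines, and security.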

200

DOUBLE JEOPARDY!

Define Dataproc

What is a fully managed service for running Apache Spark and Apache Hadoop clusters in a simpler, more cost-efficient way?

200

DOUBLE JEOPARDY!*

Name the different dataset types

*DJ IF you can explain what each dataset does


What are training, validation, and testing?
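
For the Double Jeopardy part, a minimal sketch of how one dataset is usually divided into the three types. The `split_dataset` helper and the 80/10/10 fractions are assumptions chosen for illustration, not something the cards specify.

```python
import random


def split_dataset(examples, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle examples and split them into training, validation, and test lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    # Training: the examples the model learns its parameters from.
    train = shuffled[:n_train]
    # Validation: used while building the model to compare settings and variants.
    validation = shuffled[n_train:n_train + n_val]
    # Testing: held back until the end for a final check on unseen data.
    test = shuffled[n_train + n_val:]
    return train, validation, test


train, validation, test = split_dataset(range(100))
print(len(train), len(validation), len(test))  # 80 10 10
```

The key point is that the three lists never overlap, so the test set stays unseen until the final evaluation.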

300

This is how BigQuery works/the details about the product

What is Google's fully managed, petabyte-scale, low-cost analytics data warehouse? BigQuery is serverless, so there is no infrastructure to manage and you don't need a database administrator. You can focus on analyzing data to find meaningful insights, use familiar SQL, and take advantage of our pay-as-you-go model.

300

True or false: Dataflow cannot handle an unbounded or "infinite" dataset streaming in from a continuously updating source (like Pub/Sub).

What is false?

300

DOUBLE JEOPARDY!*


List the business and technical value props of Pub/Sub

*DJ IF you can tell the differences between them


Business: What are availability, throughput, and latency?

Technical: What are sources, sinks, and transforms?

300

Why can customers run their big data jobs more efficiently?

What are per-second billing, quick spin-up of resources, and no more underutilized clusters?

300

This was developed by Google and has become the leading open source tool for building ML models

What is TensorFlow?

400

DOUBLE JEOPARDY!*

These are the five high-level value propositions

*DJ IF you can explain in your own words what each of these props means

What are speed, scale, and agility; enterprise ready; managed services and cutting-edge technology; new data available instantly; and ad hoc queries?

400

DOUBLE JEOPARDY!*

Explain what batch and streaming data are

Batch: What is one discrete job? Historical data, etc


Streaming: What is endless incoming data?
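
The difference between the two can be sketched in plain Python. This is an illustration only and does not use the Dataflow SDK; `batch_total` and `streaming_totals` are hypothetical helper names. The batch function needs the whole dataset before it answers once, while the streaming function emits an updated answer for every new element and never has to reach an end.

```python
import itertools


def batch_total(records):
    """Batch: one discrete job over a complete, bounded dataset."""
    return sum(records)


def streaming_totals(source):
    """Streaming: yield a running total as each element arrives from the source."""
    total = 0
    for value in source:
        total += value
        yield total


historical = [3, 1, 4]
print(batch_total(historical))  # 8

endless = itertools.count(1)  # 1, 2, 3, ... with no end
first_five = list(itertools.islice(streaming_totals(endless), 5))
print(first_five)  # [1, 3, 6, 10, 15]
```

Note that the streaming version only stops here because `islice` cuts it off after five results; the source itself is unbounded.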

400

Define Cloud IoT

What is a set of fully managed and integrated services that allow customers to easily and securely connect, manage, and ingest data from devices across the globe at a large scale, process and analyze/visualize that data in real time, and then act on it for greater operational efficiency?

400

These are the four programming/language models that you can easily use with Dataproc

What are MapReduce, Pig, Hive, and Spark?
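
As a memory aid for the first of these, here is a minimal single-machine sketch of the MapReduce idea as a word count. It is plain Python, not the Hadoop or Spark API, and the function names are hypothetical; a real cluster would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Map: turn one input line into (word, 1) pairs."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all values that share the same key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine each key's values into a single result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce_phase(shuffle_phase(pairs))


print(word_count(["big data", "Big query"]))  # {'big': 2, 'data': 1, 'query': 1}
```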

400

These are the six MLaaS offerings

What are TensorFlow, Speech API, Vision API, Natural Language API, Translation API, and Jobs API?

500

DOUBLE JEOPARDY!*

Name the four technical value props of BigQuery

*DJ IF you can explain in your own words how we address each prop

What are data warehouse, centralize data for machine learning, big data, and multiple data marts?

500

Why is Dataflow found in the process phase?

What is a fully managed service for transforming and enriching data in stream (real-time) and batch (historical) modes with equal reliability and expressiveness?

500

DOUBLE JEOPARDY!*

List the business and technical value props of IoT

*DJ IF you can name how we address each prop

Business: What are reducing risk, optimizing costs, and growing?

Technical: What are securely connecting things, scaling big data, and actionable insights and machine learning?

500

DOUBLE JEOPARDY!*

Name the business challenges and the technical challenges with Dataproc

Business: What are cost-effectiveness, spend, and ease of use?

Technical: What are idle clusters, scaling inflexibility, and high CPU cores and GPUs?

500

DOUBLE JEOPARDY!*

Name the four key products of ML

*DJ IF you can tell me what each of them does

What are Cloud AutoML, Cloud TPU, Cloud Machine Learning Engine, and Dialogflow Enterprise Edition?

Cloud AutoML: Train custom ML models

Cloud TPUs: Hardware optimized for ML

Cloud ML Engine: Large-scale ML service

Dialogflow: Create conversational experiences across devices and platforms