Apache Spark, Apache Flink, Apache Kafka
What is the latest version of Apache Spark?
4.2.0
What is the name of the annual conference about Apache Flink?
Flink Forward
When was Kafka created, and what is its name based on?
2011; it is named after the writer Franz Kafka
What is a framework?
A framework is software that provides reusable, generic functionality which developers can extend or customize to create complete solutions.
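Illustration: a minimal sketch of the definition above, assuming a made-up mini-framework. The names Pipeline, UppercaseWords, accept and transform are hypothetical and do not come from Spark, Flink or Kafka. The framework supplies the generic loop, and the developer only fills in the parts that differ.

```python
from abc import ABC, abstractmethod


class Pipeline(ABC):
    """Hypothetical mini-framework: owns the generic filter/transform loop."""

    def run(self, records):
        # Generic, reusable control flow supplied by the framework.
        results = []
        for record in records:
            if self.accept(record):
                results.append(self.transform(record))
        return results

    def accept(self, record):
        # Default behaviour that developers may customize by overriding.
        return True

    @abstractmethod
    def transform(self, record):
        """Extension point that each application must implement."""


class UppercaseWords(Pipeline):
    """Application code: only the custom parts are written here."""

    def accept(self, record):
        # Skip records that are empty or contain only whitespace.
        return bool(record.strip())

    def transform(self, record):
        return record.strip().upper()


result = UppercaseWords().run(["spark", "  ", " flink ", "kafka"])
print(result)
```

The framework calls the developer's code, not the other way round; that is the main difference from an ordinary library.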
When and where was Apache Spark founded?
Spark was originally developed at the University of California, Berkeley's AMPLab starting in 2009.
When and where did the first Apache Flink project start?
2010, at a university in Berlin
What are brokers?
Servers in a Kafka cluster that store and replicate topics
FILL IN THE BLANKS
Frameworks often include support programs, ......, software development kits, code libraries, ......, and ......
Compilers, toolsets, APIs
How many companies use Apache Spark?
Thousands, including 80% of the Fortune 500
List at least two examples of the Apache Flink applications we have talked about
Fraud detection and anomaly detection (event-driven applications), file system monitoring (data pipelines & ETL), quality monitoring and ad-hoc analysis of live data (stream & batch analytics)
What is a topic?
An immutable, ordered log that clients can read
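Illustration: a toy, in-memory sketch of the "immutable, ordered log" idea. TopicLog, append and read are hypothetical names and not the Kafka client API. A real Kafka topic is split into partitions, with ordering guaranteed within each partition; this sketch models a single partition only.

```python
class TopicLog:
    """Toy stand-in for a single-partition topic (not the Kafka API)."""

    def __init__(self):
        self._records = []

    def append(self, value):
        # Records are only ever added at the end; the offset is their position.
        self._records.append(value)
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        # Reading never removes or changes records, so many clients
        # can read the same log independently, each from its own offset.
        return tuple(self._records[offset:offset + max_records])


log = TopicLog()
offsets = [log.append(v) for v in ("order-created", "order-paid", "order-shipped")]
first_reader = log.read(0)   # starts from the beginning
late_reader = log.read(2)    # starts from offset 2
print(offsets, first_reader, late_reader)
```

Because existing records are never modified, a client that reads again from the same offset gets the same records in the same order.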
List at least three of the key features of Apache Spark
Data streaming, SQL analytics, data science at scale, machine learning
List at least three benefits of Apache Flink
Scalability, connectors, in-memory performance, processing of bounded and unbounded data sets, exactly-once consistency
What is replication?
Keeping copies of a topic's data on several brokers, so that Kafka continues working even when a broker fails
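Illustration: a simplified sketch of why replication keeps data available. ToyCluster and its methods are hypothetical. Real Kafka replicates per partition using a leader and follower replicas; the model below only assumes that every live replica receives each record and that reads fall back to any replica that is still alive.

```python
class ToyCluster:
    """Hypothetical model of replication (not how Kafka is implemented)."""

    def __init__(self, broker_ids, replication_factor):
        if replication_factor > len(broker_ids):
            raise ValueError("replication factor cannot exceed the number of brokers")
        self.logs = {b: [] for b in broker_ids}          # broker id -> its copy of the log
        self.replicas = list(broker_ids)[:replication_factor]
        self.alive = set(broker_ids)

    def write(self, value):
        # Copy the record to every replica that is currently alive.
        for broker in self.replicas:
            if broker in self.alive:
                self.logs[broker].append(value)

    def fail(self, broker_id):
        # Simulate a broker crash.
        self.alive.discard(broker_id)

    def read_all(self):
        # Serve reads from the first replica that is still alive.
        for broker in self.replicas:
            if broker in self.alive:
                return list(self.logs[broker])
        raise RuntimeError("no live replica holds this topic")


cluster = ToyCluster(["b1", "b2", "b3"], replication_factor=2)
cluster.write("a")
cluster.write("b")
cluster.fail("b1")                 # one of the two replicas goes down
after_failure = cluster.read_all() # still served by the copy on b2
print(after_failure)
```

With a replication factor of 1 the same failure would make the data unavailable, which is why more than one copy is kept.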