TOPICS
A) Introductory Concepts
- Write code to connect to a Kafka cluster
- Distinguish between leaders and followers and work with replicas
- Explain what a segment is and explore retention
- Use the CLI to work with topics, producers, and consumers
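As a sketch of that CLI workflow (the topic name and broker address are placeholders; on Apache Kafka distributions the tools carry a `.sh` suffix, e.g. `kafka-topics.sh`):

```shell
# Create a topic with 3 partitions on a (hypothetical) local broker
kafka-topics --create --topic demo-topic --partitions 3 --replication-factor 1 \
  --bootstrap-server localhost:9092

# Type messages on stdin to produce them to the topic
kafka-console-producer --topic demo-topic --bootstrap-server localhost:9092

# Read the topic from the start in another terminal
kafka-console-consumer --topic demo-topic --from-beginning \
  --bootstrap-server localhost:9092
```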
B) Working with Producers
- Describe the work a producer performs and the core components needed to produce messages
- Create producers and specify configuration properties
- Explain how to configure producers to confirm that Kafka has received messages
- Delve into how batching works and explore batching configurations
- Explore reacting to failed delivery and tuning producers with timeouts
- Use the APIs for Java, C#/.NET, or Python to create a producer
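The acknowledgment, batching, and timeout topics above all come down to producer configuration. As a minimal sketch (the property names are standard Kafka producer configs in the Java-client style; the broker address is a placeholder):

```python
def producer_config(bootstrap_servers: str) -> dict:
    """Build a producer configuration tuned for confirmed delivery and batching."""
    return {
        "bootstrap.servers": bootstrap_servers,
        # acks=all: the leader waits for all in-sync replicas before confirming receipt
        "acks": "all",
        # Batching: wait up to 10 ms to fill batches of up to 32 KB
        "linger.ms": 10,
        "batch.size": 32768,
        # Fail a send that is not acknowledged within 30 s, retrying up to 5 times
        "delivery.timeout.ms": 30000,
        "retries": 5,
    }

config = producer_config("localhost:9092")
print(config["acks"])  # -> all
```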
C) Consumers, Groups, and Partitions
- Create and manage consumers and their property files
- Illustrate how consumer groups and partitions provide scalability and fault tolerance
- Explore managing consumer offsets
- Tune fetch requests
- Explain how consumer groups are managed and their benefits
- Compare and contrast group management strategies and when you might use each
- Use the API for Java, C#/.NET, or Python to create a consumer
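Groups, offsets, and fetch tuning likewise map onto consumer configuration. A sketch using the standard Java-client property names (shown as a Python dict for illustration; the broker address is a placeholder):

```python
def consumer_config(group_id: str) -> dict:
    """Build a consumer configuration for a member of a consumer group."""
    return {
        "bootstrap.servers": "localhost:9092",  # assumed local broker
        "group.id": group_id,             # consumers sharing a group.id split partitions
        "enable.auto.commit": False,      # commit offsets manually after processing
        "auto.offset.reset": "earliest",  # where to start when no offset is committed
        # Fetch tuning: a fetch returns once 1 KB is available or 500 ms has passed
        "fetch.min.bytes": 1024,
        "fetch.max.wait.ms": 500,
    }

print(consumer_config("orders-app")["group.id"])  # -> orders-app
```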
D) Streaming and Kafka Streams
- Develop an appreciation for what streaming applications can do for you on the job
- Describe Kafka Streams and explore streams properties and topologies
- Compare and contrast streams and tables, and relate events in streams to records/messages in topics
- Write an application using the Streams DSL (Domain-Specific Language)
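Kafka Streams itself is a Java library, but the stream/table duality described above can be sketched in a few lines of plain Python (no Kafka dependencies; the keys and events are made up): a stream is the full sequence of events, while a table keeps only the latest value per key.

```python
# A stream: every event, in order, as (key, value) records
stream = [
    ("user-1", "logged_in"),
    ("user-2", "logged_in"),
    ("user-1", "logged_out"),
]

def to_table(events) -> dict:
    """Replay a stream of (key, value) events into a table of latest values."""
    table = {}
    for key, value in events:
        table[key] = value  # later events overwrite earlier ones per key
    return table

print(to_table(stream))  # {'user-1': 'logged_out', 'user-2': 'logged_in'}
```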
E) Introduction to Confluent ksqlDB
- Describe how Kafka Streams and ksqlDB relate
- Explore the ksqlDB CLI
- Use ksqlDB to filter and transform data
- Compare and contrast types of ksqlDB queries
- Leverage ksqlDB to perform time-based stream operations
- Write a ksqlDB query that relates data between two streams or a stream and a table
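As a sketch of the last item, here is the shape of a ksqlDB persistent query that joins a stream to a table (the stream, table, and column names are hypothetical):

```sql
-- Enrich each order event with the customer's name from a lookup table
CREATE STREAM enriched_orders AS
  SELECT o.order_id, o.amount, c.name
  FROM orders_stream o
  JOIN customers_table c ON o.customer_id = c.customer_id
  EMIT CHANGES;
```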
F) Schemas and the Confluent Schema Registry
- Describe Kafka schemas and how they work
- Write an Avro-compatible schema and explore using Protobuf and JSON schemas
- Write schemas that can evolve
- Write and read messages using schema-enabled Kafka client applications
- Using Avro and the API for Java, C#/.NET, or Python, write a schema-enabled producer or consumer that leverages the Confluent Schema Registry
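A hand-written Avro record schema is just JSON, so it can be sketched and inspected directly (the record and field names here are illustrative). Giving a new field a default, as with `email` below, is what lets the schema evolve without breaking existing readers.

```python
import json

avro_schema = json.loads("""
{
  "type": "record",
  "name": "Customer",
  "namespace": "com.example",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": "string"},
    {"name": "email", "type": ["null", "string"], "default": null}
  ]
}
""")

field_names = [f["name"] for f in avro_schema["fields"]]
print(field_names)  # ['id', 'name', 'email']
```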
G) Kafka Connect
- List some of the components of Kafka Connect and describe how they relate
- Set configurations for components of Kafka Connect
- Describe Connect integration and how data flows between applications and Kafka
- Explore some use cases where Kafka Connect makes development efficient
- Use Kafka Connect in conjunction with other tools to process data in motion in the most efficient way
- Create a connector and import data from a database to a Kafka cluster
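Importing from a database comes down to posting a connector configuration to the Connect REST API. A sketch of such a configuration (the connector name, database URL, and column are hypothetical; the property names follow Confluent's JDBC source connector):

```python
import json

# This JSON is what you would POST to the Connect worker's /connectors endpoint
connector = json.loads("""
{
  "name": "postgres-source",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:postgresql://localhost:5432/inventory",
    "mode": "incrementing",
    "incrementing.column.name": "id",
    "topic.prefix": "db-"
  }
}
""")

# Each imported table lands in a topic named with the prefix, e.g. "db-orders"
print(connector["config"]["topic.prefix"])  # -> db-
```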
H) Security
- Describe Kafka's security protocols
- Identify common threats and their countermeasures
- Configure authentication (authn) and authorization (authz)
- Enable encryption of data in transit
- Control access with ACLs (access control lists)
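As a sketch of how those topics surface in a client configuration (the property names follow the librdkafka/confluent-kafka style; the credentials and file path are placeholders, not working values):

```python
def secure_client_config() -> dict:
    """Client settings combining encryption in transit with SASL authentication."""
    return {
        "security.protocol": "SASL_SSL",    # TLS encryption + SASL authentication
        "sasl.mechanism": "SCRAM-SHA-512",  # authn: credential-based SASL mechanism
        "sasl.username": "app-user",        # placeholder credentials
        "sasl.password": "change-me",
        "ssl.ca.location": "/etc/kafka/ca.pem",  # CA used to verify the brokers
    }

print(secure_client_config()["security.protocol"])  # -> SASL_SSL
```

Authorization (ACLs) is configured on the broker side, granting principals like `app-user` access to specific topics and groups.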
I) Design Decisions and Considerations
- Delve into how compaction affects consumer offsets
- Explore how consumers work with offsets in scenarios outside of normal processing behavior, and understand how to manipulate offsets to deal with anomalies
- Evaluate decisions about consumer and partition counts and how they relate
- Address decisions that arise from default key-based partitioning and consider alternative partitioning strategies
- Configure producers to deliver messages without duplicates and with ordering guarantees
- List ways to manage large message sizes
- Describe how to work with messages in transactions and how Kafka enables transactions
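Duplicate-free, ordered delivery and transactions are again configuration decisions on the producer. A sketch using standard producer property names (the transactional id and broker address are placeholders):

```python
def exactly_once_producer_config() -> dict:
    """Producer settings for de-duplicated, ordered, transactional writes."""
    return {
        "bootstrap.servers": "localhost:9092",
        # The broker de-duplicates retried sends and preserves per-partition order
        "enable.idempotence": True,
        "acks": "all",                      # required when idempotence is enabled
        # Setting a transactional.id allows begin/commit of atomic multi-topic writes
        "transactional.id": "orders-tx-1",  # placeholder id
    }

print(exactly_once_producer_config()["enable.idempotence"])  # -> True
```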
J) Robust Development & Testing
- Compare and contrast error-handling options with Kafka Connect, including the dead letter queue
- Distinguish between various categories of testing
- List considerations for stress and load testing a Kafka system
Table of Contents
- Introduction
- Conceptual Knowledge of Kafka
- Conceptual Knowledge of Confluent Platform
- Conceptual Knowledge of Building Applications
- Network Technologies Relating to Confluent and Kafka
- Kafka Producer Applications
- Kafka Consumer Applications
- Kafka Streams Applications
- ksqlDB Queries
- Apache Avro and the Schema Registry
- Integrating Kafka with External Systems with Kafka Connect
- Security
- Testing
- Tutorials
Introduction
In this module, we’ll cover material that addresses specific areas of Confluent and Kafka development. While these areas are called out in the certification exam, you’ll also find them critical for success in the Kafka and Confluent developer role. This isn’t an exhaustive list, but it will give you an understanding of where to start and what to research for a deeper dive.
Conceptual Knowledge of Kafka
Topics, brokers, producers, consumers, schemas, clusters… This is the foundation upon which everything else you learn will be built. Make sure that you fully understand the basics of Kafka before moving on to anything else.
One of the best places to start with this is this video from Confluent’s Tim Berglund: https://www.confluent.io/what-is-apache-kafka/
Conceptual Knowledge of Confluent Platform
Of course, the best source for this is Confluent itself: https://docs.confluent.io/
Conceptual Knowledge of Building Applications
- https://www.confluent.io/blog/event-streaming-platform-1/
- This link appeared above, but pay close attention to the application section, which gives an overview of apps within the ecosystem.
- https://developer.confluent.io/
- https://docs.confluent.io/platform/current/app-development/index.html
- https://developer.okta.com/blog/2019/11/19/java-kafka — This is important because it also discusses security, which should be considered from the start.
Network Technologies Relating to Confluent and Kafka
- https://segment.com/blog/kafka-optimization/
- Discusses strategies to optimize a Kafka network and how doing so can lead to big savings.
- https://blog.cloudflare.com/squeezing-the-firehose/
Kafka Producer Applications
Creating a basic producer application is one of the most fundamental tasks for a Kafka developer.
- https://kafka-tutorials.confluent.io/creating-first-apache-kafka-producer-application/kafka.html
- This tutorial walks you through creating your first producer application with the KafkaProducer class.
- https://kafka-python.readthedocs.io/en/master/
- If you’re a Python programmer, you’ll find helpful examples of both a producer and consumer written with the kafka-python framework. It’s designed to be Java-like in its approach and will be helpful to understand how a typical producer is built. Many other languages are supported. A web search can provide guidance for your chosen development language.
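In kafka-python, serialization is handled by a callable you pass as `value_serializer` when constructing `KafkaProducer`; it must return bytes. A minimal sketch of such a serializer (the payload fields are made up):

```python
import json

def serialize_value(value: dict) -> bytes:
    """JSON-encode a message value to bytes, as a KafkaProducer value_serializer."""
    return json.dumps(value).encode("utf-8")

# With a running broker you would pass this to the producer, roughly:
#   KafkaProducer(bootstrap_servers="localhost:9092", value_serializer=serialize_value)
payload = serialize_value({"order_id": 42, "amount": 9.99})
print(payload)  # b'{"order_id": 42, "amount": 9.99}'
```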
Kafka Consumer Applications
- https://docs.confluent.io/platform/current/clients/consumer.html
- https://kafka.apache.org/26/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html
- https://docs.confluent.io/3.0.0/clients/consumer.html
- https://docs.confluent.io/platform/current/installation/configuration/consumer-configs.html
- Shows the various consumer configuration parameters available on Confluent Platform.
- https://www.sohamkamani.com/golang/working-with-kafka/
- If you’re working in Go/Golang, you’ll find this example helpful. Again, other languages are available.
Kafka Streams Applications
- https://kafka.apache.org/documentation/streams/
- The official Kafka Streams documentation, including the developer guide for the Streams DSL.
ksqlDB Queries
ksqlDB is one of the most important tools in your toolbox for getting work done. Here are a few resources to help you understand and apply its powerful abilities.
- https://www.confluent.io/kafka-summit-london18/ksql-201-a-deep-dive-into-query-processing/
- https://www.confluent.io/blog/intro-to-ksqldb-sql-database-streaming/
- https://www.confluent.io/blog/kafka-streams-vs-ksqldb-compared/
- This is a helpful blog post about the differences between ksqlDB and Kafka Streams and how to choose the best fit for your use case.
- https://ksqldb.io/
- Here’s the main site for ksqlDB
Apache Avro and the Schema Registry
- https://www.confluent.io/blog/avro-kafka-data/
- This talks about the importance of Avro and provides context for using it with Kafka data.
- https://docs.confluent.io/platform/current/schema-registry/index.html
- Confluent Platform provides some additional features for working with Avro
- https://docs.confluent.io/platform/current/schema-registry/schema_registry_tutorial.html
- This link provides some great tutorials for working with the Schema Registry.
- https://medium.com/@stephane.maarek/introduction-to-schemas-in-apache-kafka-with-the-confluent-schema-registry-3bf55e401321
Integrating Kafka with External Systems with Kafka Connect
One of the strengths of Kafka is its ability to integrate with a variety of external resources. Here are several links to help you understand the concepts and practical application of Kafka Connect and working with connectors.
- https://docs.confluent.io/5.5.0/connect/index.html
- This is a great resource for understanding what Kafka Connect is and how to use it, and it will help get you started.
- https://confluent.buzzsprout.com/186154/1265780-why-kafka-connect-ft-robin-moffatt
- This podcast provides a case for using Kafka Connect. The discussion covers a variety of use cases as well as some of the lesser-known features.
- https://docs.confluent.io/home/connect/overview.html
Security
- https://kafka.apache.org/documentation/#security
- https://docs.confluent.io/platform/current/security/index.html
- https://medium.com/@stephane.maarek/introduction-to-apache-kafka-security-c8951d410adf
- This is a great blog post that discusses why you should care about security for Kafka, talks through some of the standard security protocols and countermeasures for the platform, and serves as a handy guide to encryption and ACLs.
Testing
Here are a number of resources to help get you started:
- https://docs.confluent.io/cloud/current/client-apps/testing.html
- This covers unit, integration, performance, chaos, and other forms of testing.
- https://kafka.apache.org/21/documentation/streams/developer-guide/testing.html
- This discusses testing for Kafka Streams
- https://www.confluent.io/blog/testing-kafka-streams/
Tutorials
Along the way you’ll find tutorials for many of the topics listed above. Here’s a great collection of them, all in one place: