Kafka Fundamentals for Beginners

3.2 Key Components

Understanding brokers, producers, consumers, and topics in Kafka.

Kafka Key Components

Topics

A topic is a named stream where Kafka stores data. Topics are divided into partitions to enable scalability and parallel processing; a topic-creation sketch follows the list below.

Topic Structure

  • Think of a topic like a folder
  • Partitions are like sections within that folder
  • Data is stored and managed within partitions
  • Partitions enable horizontal scaling
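
As a concrete starting point, here is a minimal sketch in Java using the kafka-clients AdminClient API to create a topic with several partitions. The broker address, topic name, and counts are assumptions for illustration, not values from this lesson.

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Properties;

public class CreateTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Assumed broker address; point this at your own cluster.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (AdminClient admin = AdminClient.create(props)) {
            // Hypothetical topic "orders": 3 partitions, each replicated to 2 brokers.
            NewTopic topic = new NewTopic("orders", 3, (short) 2);
            admin.createTopics(List.of(topic)).all().get(); // block until the cluster confirms
        }
    }
}
```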

Replication for Fault Tolerance

Kafka replicates each partition across multiple brokers; if one broker fails, a replica on another broker keeps the data available.

Single Partition Topic

For a topic with one partition:

  • Partition 1 has two replicas
  • Replica 1_1 on Broker A (leader by default)
  • Replica 1_2 on Broker B (follower)
  • The leader handles all read and write operations
  • If Broker A fails, Replica 1_2 on Broker B becomes the new leader (a sketch for inspecting this layout follows the list)
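
To check which broker leads a partition, a small hedged sketch with the same AdminClient API can describe the topic and print each partition's leader and follower replicas (the topic name `payments` is an assumption):

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.TopicDescription;

import java.util.List;
import java.util.Properties;

public class DescribeTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed address

        try (AdminClient admin = AdminClient.create(props)) {
            TopicDescription desc = admin.describeTopics(List.of("payments"))
                                         .allTopicNames().get()
                                         .get("payments");
            // One line per partition: the leader serves reads/writes, replicas hold copies.
            desc.partitions().forEach(p ->
                    System.out.printf("partition %d: leader=%s replicas=%s%n",
                            p.partition(), p.leader(), p.replicas()));
        }
    }
}
```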

Multi-Partition Topics

When a topic has multiple partitions, data distribution improves performance:

Example with 2 Partitions:

  • Partition 1: Replicated across Broker A and Broker B
  • Partition 2: Replicated across Broker A and Broker C

This distribution ensures:

  • If Broker B fails, Partition 1 data remains available on Broker A
  • If Broker C fails, Partition 2 data remains available on Broker A
  • Multiple consumers can read from different partitions simultaneously
  • Better load balancing across the cluster (a sketch that reproduces this exact replica layout follows the list)
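
If you want to reproduce this exact layout, NewTopic also accepts an explicit replica assignment. This is a sketch under the assumption that Brokers A, B, and C have broker IDs 0, 1, and 2, and that Kafka's zero-based partition numbering maps Partition 1/2 in the text to partitions 0/1:

```java
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Map;
import java.util.Properties;

public class AssignedReplicasExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed address

        // Partition 0 → brokers 0 (A) and 1 (B); partition 1 → brokers 0 (A) and 2 (C).
        // The first broker in each list is the preferred leader.
        Map<Integer, List<Integer>> assignment = Map.of(
                0, List.of(0, 1),
                1, List.of(0, 2));

        try (AdminClient admin = AdminClient.create(props)) {
            admin.createTopics(List.of(new NewTopic("orders", assignment))).all().get();
        }
    }
}
```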

Brokers

Brokers are the backbone of Kafka: the intermediaries between producers and consumers.

Broker Responsibilities

  • Store messages sent by producers
  • Distribute messages to consumers
  • Manage partitions and replicas
  • Handle data replication and persistence
  • Ensure reliability through redundancy

Broker Architecture

```text
Producer → Broker1 → Consumer
           ↓ Replicates
           Broker2
```

Brokers work together to ensure data is:

  • Properly stored
  • Replicated for fault tolerance
  • Available for consumer access

Metadata Management

Zookeeper (Legacy)

In Zookeeper mode, Zookeeper acts as the external coordinator for the Kafka cluster:

Key Functions:

  • Manages broker coordination
  • Handles controller election
  • Stores topic configurations
  • Tracks cluster state
  • Ensures synchronization across distributed system

Zookeeper is essential for:

  • High availability
  • Fault tolerance
  • Broker leadership management

KRaft Mode (Replacement)

KRaft is Kafka's self-managed metadata system that replaces Zookeeper; it became production-ready in Kafka 3.3 and is the only supported mode from Kafka 4.0 onward:

Benefits:

  • Kafka handles metadata internally
  • Simpler architecture
  • Improved performance
  • No external dependency
  • Reduced operational complexity

Comparison:

| Feature | Zookeeper Mode | KRaft Mode |
|---------|----------------|------------|
| Metadata Management | External (Zookeeper) | Internal (Kafka) |
| Architecture Complexity | Higher | Lower |
| Operational Overhead | More | Less |
| Performance | Good | Better |

Producers and Consumers

Producers

Producers send data to Kafka brokers:

  • Push messages to specific topics
  • Data is routed to a partition, by default by hashing the message key
  • Records that share a key land in the same partition, preserving their order (see the producer sketch below)
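
A minimal producer sketch in Java (kafka-clients); the topic `payments`, the key `user-42`, and the broker address are assumptions for illustration:

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.Properties;

public class ProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Records with the same key hash to the same partition,
            // so per-key ordering is preserved.
            producer.send(new ProducerRecord<>("payments", "user-42", "charged 19.99"));
            producer.flush(); // make sure the record actually leaves the client
        }
    }
}
```

Because the default partitioner hashes the key, every record keyed `user-42` lands in the same partition and is read back in the order it was sent.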

Consumers

Consumers retrieve data from Kafka brokers:

  • Pull messages from topics
  • Read from partition leaders
  • Track their position using offsets (see the consumer sketch below)
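
A matching consumer sketch; the group id `payment-processors` is an assumption, and offset commits are left to the client's default auto-commit:

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class ConsumerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "payment-processors");      // assumed group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("payments"));
            while (true) {
                // Pull a batch; the consumer tracks its position via offsets.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> r : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            r.partition(), r.offset(), r.value());
                }
            }
        }
    }
}
```

Starting a second copy with the same group.id makes Kafka split the topic's partitions between the two instances, which is exactly the parallelism described next.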

Partition-Based Processing

Kafka uses partitions to distribute load and improve performance:

  • Multiple consumers can read from different partitions simultaneously
  • Each partition maintains message order
  • Parallelism improves throughput

Payment Topic Example

Single Partition Configuration

For a Payment Topic with one partition:

Components:

  • Replica 1_1 on Broker A (leader)
  • Replica 1_2 on Broker B (follower)
  • Zookeeper manages metadata
  • Producer sends payment data to leader
  • Consumer reads from leader

Limitation: within a single consumer group, only one consumer instance can be active, because each partition is assigned to at most one consumer per group. Additional consumer instances in the same group will be idle.

Two Partition Configuration

For a Payment Topic with two partitions:

Partition 1:

  • Replica 1_1 on Broker A (leader)
  • Replica 1_2 on Broker B (follower)

Partition 2:

  • Replica 2_1 on Broker B (leader)
  • Replica 2_2 on Broker C (follower)

Metadata Management:

  • Zookeeper tracks which replica is the leader
  • Handles automatic failover if a broker fails
  • Ensures continuity of service

Data Distribution:

  • Odd-numbered transactions → Partition 1
  • Even-numbered transactions → Partition 2
  • Better load distribution (note: this odd/even routing needs a custom partitioner, since Kafka's default routes by key hash; see the sketch after this list)
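
Kafka's default partitioner routes by key hash, so the odd/even split above would come from a custom Partitioner. A hedged sketch, assuming the message key is a numeric transaction id in string form:

```java
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;

import java.util.Map;

// Routes by transaction number: odd ids → partition 0, even ids → partition 1.
// (Kafka numbers partitions from 0, so "Partition 1/2" in the text map to 0/1 here.)
public class OddEvenPartitioner implements Partitioner {
    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        long txId = Long.parseLong((String) key); // assumes a non-null numeric key
        return (txId % 2 == 1) ? 0 : 1;
    }

    @Override public void configure(Map<String, ?> configs) { }
    @Override public void close() { }
}
```

It would be registered on the producer with props.put(ProducerConfig.PARTITIONER_CLASS_CONFIG, OddEvenPartitioner.class.getName()).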

Consumer Parallelism:

  • Two consumer instances can run simultaneously
  • Each consumer reads from one partition
  • Parallel processing improves throughput
  • Single consumer would read from both partitions

Summary

Kafka's key components work together to provide:

  • Topics and Partitions: Organize and distribute data
  • Brokers: Store, replicate, and serve data
  • Replication: Ensure fault tolerance
  • Metadata Management: Coordinate cluster operations (Zookeeper/KRaft)
  • Producers: Send data to topics
  • Consumers: Read data from topics
  • Parallel Processing: Enable scalability through partitions

This architecture allows Kafka to handle high-throughput, fault-tolerant data streaming at scale.