4.1 Creating and Managing Topics
Commands and configuration for managing Kafka topics.
Overview
Managing Kafka topics requires a functional Kafka environment with Zookeeper and at least one broker. This lesson covers the essential commands and configurations for creating and managing topics using Docker and the Kafka CLI.
Environment Setup
Docker Compose Configuration
We use Docker Compose to simplify the setup process with a single configuration file that defines:
- Zookeeper Configuration: Coordinator for Kafka, helping brokers stay in sync
- Kafka Broker 1: First broker with its unique advertised name
- Kafka Broker 2: Second broker for multi-broker demonstration
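Each broker advertises itself under a unique name, such as 'kafka1' or 'kafka2'. Clients use the advertised name for every connection after the initial bootstrap, so the name must resolve from wherever the client runs.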
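A compose file matching this description might look like the sketch below. The lesson does not show the actual file, so the image names and tags (confluentinc/cp-zookeeper and confluentinc/cp-kafka at 7.5.0), the service name zookeeper, and the output filename are all assumptions. The broker names and ports (kafka1:9092, kafka2:9093) come from the commands used in this lesson.

```shell
# Write a hypothetical compose file for one Zookeeper node and two brokers.
# Assumptions: image names/tags, the 'zookeeper' service name, and the filename.
COMPOSE_FILE="${COMPOSE_FILE:-docker-compose.sketch.yml}"

cat > "$COMPOSE_FILE" <<'EOF'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:7.5.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181

  kafka1:
    image: confluentinc/cp-kafka:7.5.0
    depends_on: [zookeeper]
    ports: ["9092:9092"]
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9092
      # The name clients are told to connect to; must resolve on the client side.
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka1:9092
      # Only two brokers exist, so internal topics cannot use three replicas.
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 2

  kafka2:
    image: confluentinc/cp-kafka:7.5.0
    depends_on: [zookeeper]
    ports: ["9093:9093"]
    environment:
      KAFKA_BROKER_ID: 2
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_LISTENERS: PLAINTEXT://0.0.0.0:9093
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka2:9093
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 2
EOF

echo "wrote $COMPOSE_FILE"
```

With a file like this, `docker-compose -f docker-compose.sketch.yml up -d` would start all three containers. Each broker gets its own host port so that both advertised names can point at localhost without colliding.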
Starting the Environment
    docker-compose up -d
Kafka CLI Setup
Download and Extract Kafka
- Visit the Apache Kafka Downloads page
- Download the binary release matching your setup (e.g., kafka_2.13-3.x.x, where 2.13 is the Scala version and 3.x.x is the Kafka version)
- Extract the contents:

    tar -xvzf kafka_2.13-3.x.x.tgz
    cd kafka_2.13-3.x.x
DNS Configuration
Since Kafka brokers in Docker advertise themselves by names like 'kafka1' and 'kafka2', you need to map these names to localhost.
Linux/Mac Setup
Edit the hosts file:

    sudo vim /etc/hosts

Add these lines:

    127.0.0.1 kafka1
    127.0.0.1 kafka2

Windows Setup
- Open Notepad as Administrator
- Open the file: C:\Windows\System32\drivers\etc\hosts
- Add the same lines as above
- Save the file
Verify DNS Setup
Test the configuration:

    ping kafka1
    ping kafka2

You should see replies from 127.0.0.1 for both.
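If ping is blocked or unavailable, the hosts file can be inspected directly. The sketch below uses a hypothetical helper, has_local_mapping, that reports whether a name is mapped to 127.0.0.1. It assumes a Unix-style hosts file and is not part of Kafka.

```shell
# has_local_mapping NAME [HOSTS_FILE]
# Hypothetical helper: succeeds when HOSTS_FILE maps NAME to 127.0.0.1.
has_local_mapping() {
  name=$1
  hosts_file=${2:-/etc/hosts}
  # Strip comments, then look for a 127.0.0.1 line listing NAME as a whole field.
  sed 's/#.*//' "$hosts_file" | awk -v n="$name" '
    $1 == "127.0.0.1" { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
    END { exit (found ? 0 : 1) }'
}

for broker in kafka1 kafka2; do
  if has_local_mapping "$broker"; then
    echo "$broker: mapped to 127.0.0.1"
  else
    echo "$broker: missing from hosts file"
  fi
done
```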
Connecting to Brokers
Use the --bootstrap-server option to specify which broker to connect to:

    kafka-topics.sh --list --bootstrap-server kafka1:9092,kafka2:9093

Creating Topics
Basic Create Command
Create a topic with specific parameters:

    kafka-topics.sh --create --topic my_topic \
      --bootstrap-server kafka1:9092,kafka2:9093 \
      --partitions 3 \
      --replication-factor 2

Command Parameters
- --topic: Name identifier for your data stream
- --bootstrap-server: Brokers to connect to (comma-separated)
- --partitions: Number of partitions for parallel processing
- --replication-factor: Number of replicas for redundancy
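The four parameters above can be wrapped in a small function so that scripts create topics the same way every time. This is a sketch, not part of the Kafka distribution: create_topic, KAFKA_TOPICS, and BOOTSTRAP are hypothetical names. The --if-not-exists flag belongs to kafka-topics.sh and is assumed to be available in the 3.x releases this lesson targets.

```shell
# KAFKA_TOPICS can be overridden when kafka-topics.sh is not on PATH
# (for example, KAFKA_TOPICS=bin/kafka-topics.sh).
KAFKA_TOPICS="${KAFKA_TOPICS:-kafka-topics.sh}"
BOOTSTRAP="${BOOTSTRAP:-kafka1:9092,kafka2:9093}"

# create_topic NAME PARTITIONS REPLICATION_FACTOR
create_topic() {
  topic=$1
  partitions=$2
  replication=$3
  # --if-not-exists makes the call safe to repeat: an existing topic is left untouched.
  "$KAFKA_TOPICS" --create --if-not-exists \
    --topic "$topic" \
    --bootstrap-server "$BOOTSTRAP" \
    --partitions "$partitions" \
    --replication-factor "$replication"
}
```

Calling `create_topic my_topic 3 2` would then issue the same request as the basic create command.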
Behind the Scenes
When you create a topic:
- Zookeeper Coordination: Manages metadata about the topic
- Leader Selection: Kafka assigns partition leaders across brokers
- Replica Distribution: Replicas are distributed for fault tolerance
- Metadata Update: Cluster metadata is updated with new topic information
Example: Payment Topic

    kafka-topics.sh --create --topic payment \
      --bootstrap-server kafka1:9092,kafka2:9093 \
      --partitions 2 \
      --replication-factor 2

This creates:
- A topic named 'payment'
- 2 partitions for parallel processing
- 2 replicas per partition for redundancy
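The storage cost of these settings follows from simple arithmetic: every partition is stored once per replica. A quick calculation for the payment topic, assuming the two-broker cluster from this lesson and an even spread of replicas:

```shell
# Count the partition replicas the 'payment' topic places on the cluster.
partitions=2
replication_factor=2
brokers=2   # assumption: the two-broker setup described in this lesson

total_replicas=$((partitions * replication_factor))
echo "total replicas: $total_replicas"                      # prints "total replicas: 4"
echo "average per broker: $((total_replicas / brokers))"    # prints "average per broker: 2"
```

With two brokers and a replication factor of 2, each broker ends up holding a copy of every partition.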
Listing Topics
View all topics in the cluster:

    kafka-topics.sh --list --bootstrap-server kafka1:9092

Describing Topics
Get detailed information about a specific topic:

    kafka-topics.sh --describe --topic payment \
      --bootstrap-server kafka1:9092

This shows:
- Partition count
- Replication factor
- Leader for each partition
- In-Sync Replicas (ISR)
- Replica distribution
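Comparing the replica list with the ISR list is the usual way to spot a partition that has lost a replica. The sketch below filters --describe output for such partitions. It assumes the tab-separated layout with `Partition:`, `Replicas:`, and `Isr:` fields; the exact layout can vary between Kafka versions, and under_replicated is a hypothetical helper name.

```shell
# under_replicated: read --describe output on stdin and print the number of
# every partition whose ISR is smaller than its replica list.
# Assumes tab-separated "Partition: N", "Replicas: a,b", "Isr: a,b" fields.
under_replicated() {
  awk -F'\t' '
    /Partition: / {
      part = ""; replicas = ""; isr = ""
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^Partition: /) part = substr($i, 12)
        if ($i ~ /^Replicas: /)  replicas = substr($i, 11)
        if ($i ~ /^Isr: /)       isr = substr($i, 6)
      }
      if (split(replicas, r, ",") != split(isr, s, ",")) print part
    }'
}
```

A typical call would be `kafka-topics.sh --describe --topic payment --bootstrap-server kafka1:9092 | under_replicated`. No output means every replica is in sync. kafka-topics.sh also offers an --under-replicated-partitions flag for --describe, which is the simpler choice when no further scripting is needed.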
Deleting Topics
Remove a topic:
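
    kafka-topics.sh --delete --topic my_topic \
      --bootstrap-server kafka1:9092

Note: Topic deletion must be enabled in the broker configuration with delete.topic.enable=true. This has been the default since Kafka 1.0, so it only needs attention if it was explicitly turned off.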
Best Practices
Partition Count
- Consider your throughput requirements
- More partitions = more parallelism
- Balance between parallelism and overhead
- Typical range: 1-100 partitions per topic
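One commonly cited rule of thumb turns the throughput requirement into a starting partition count: divide the target throughput by what a single partition sustains on the producer side and on the consumer side, then take the larger result. The sketch below applies it with made-up figures. The function names and all numbers are assumptions, and real per-partition throughput has to be measured on your own hardware.

```shell
# ceil_div A B: integer ceiling of A / B.
ceil_div() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# estimate_partitions TARGET PRODUCER_PER_PARTITION CONSUMER_PER_PARTITION
# All three values use the same unit (for example MB/s).
estimate_partitions() {
  by_producer=$(ceil_div "$1" "$2")
  by_consumer=$(ceil_div "$1" "$3")
  if [ "$by_producer" -gt "$by_consumer" ]; then
    echo "$by_producer"
  else
    echo "$by_consumer"
  fi
}

# Hypothetical figures: 100 MB/s target, 10 MB/s per partition when producing,
# 20 MB/s per partition when consuming.
estimate_partitions 100 10 20   # prints 10
```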
Replication Factor
- Minimum: 2 for production
- Recommended: 3 for critical data
- Must be ≤ number of brokers
- Higher replication = better fault tolerance but more storage
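The broker-count limit is easy to check before sending a create request. A minimal pre-flight sketch; check_replication_factor is a hypothetical helper, and the broker count has to be supplied by the caller:

```shell
# check_replication_factor RF BROKERS
# Succeeds when 1 <= RF <= BROKERS; otherwise prints the reason to stderr.
check_replication_factor() {
  rf=$1
  brokers=$2
  if [ "$rf" -lt 1 ]; then
    echo "replication factor must be at least 1" >&2
    return 1
  fi
  if [ "$rf" -gt "$brokers" ]; then
    echo "replication factor $rf exceeds broker count $brokers" >&2
    return 1
  fi
}
```

On the two-broker cluster from this lesson, `check_replication_factor 3 2` fails, which matches what happens when a replication factor of 3 is requested there: each replica of a partition has to live on a different broker.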
Naming Conventions
- Use descriptive names (e.g., 'payment', 'user-events')
- Avoid special characters except hyphens and underscores
- Consider naming hierarchy (e.g., 'app.domain.event-type')
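Kafka itself accepts only ASCII letters, digits, periods, underscores, and hyphens in topic names, limits them to 249 characters, and rejects the names "." and "..". The sketch below checks a candidate name against those rules. valid_topic_name is a hypothetical helper, not a Kafka tool.

```shell
# valid_topic_name NAME: succeeds when NAME satisfies Kafka's topic name rules.
valid_topic_name() {
  name=$1
  [ -n "$name" ] || return 1
  [ "${#name}" -le 249 ] || return 1
  case $name in
    .|..) return 1 ;;                     # reserved names
    *[!A-Za-z0-9._-]*) return 1 ;;        # any character outside the legal set
  esac
  return 0
}

for candidate in payment user-events app.domain.event-type "pay ment"; do
  if valid_topic_name "$candidate"; then
    echo "ok:      $candidate"
  else
    echo "invalid: $candidate"
  fi
done
```

Mixing periods and underscores across topic names is best avoided as well, since the two characters can collide in metric names.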
Topic Configuration
Retention Settings
Control how long data is retained:

    kafka-topics.sh --create --topic my_topic \
      --bootstrap-server kafka1:9092 \
      --config retention.ms=86400000  # 24 hours

Compression
Enable compression for better storage efficiency:

    kafka-topics.sh --create --topic my_topic \
      --bootstrap-server kafka1:9092 \
      --config compression.type=lz4

Common Configuration Parameters
| Parameter | Description | Default |
|-----------|-------------|---------|
| retention.ms | How long to keep messages (milliseconds) | 7 days |
| retention.bytes | Maximum size per partition | Unlimited |
| compression.type | Compression algorithm | producer (keep the codec set by the producer) |
| cleanup.policy | delete or compact | delete |
| min.insync.replicas | Minimum ISR for writes | 1 |
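These parameters can also be changed on an existing topic with kafka-configs.sh. The sketch below converts hours to the milliseconds that retention.ms expects and applies the value. hours_to_ms, set_retention_hours, KAFKA_CONFIGS, and BOOTSTRAP are hypothetical names; the kafka-configs.sh flags are the ones shipped with Kafka.

```shell
KAFKA_CONFIGS="${KAFKA_CONFIGS:-kafka-configs.sh}"
BOOTSTRAP="${BOOTSTRAP:-kafka1:9092}"

# hours_to_ms HOURS: convert hours to milliseconds.
hours_to_ms() {
  echo $(( $1 * 60 * 60 * 1000 ))
}

# set_retention_hours TOPIC HOURS: update retention.ms on an existing topic.
set_retention_hours() {
  topic=$1
  hours=$2
  "$KAFKA_CONFIGS" --bootstrap-server "$BOOTSTRAP" \
    --entity-type topics --entity-name "$topic" \
    --alter --add-config "retention.ms=$(hours_to_ms "$hours")"
}

hours_to_ms 24    # prints 86400000
hours_to_ms 168   # prints 604800000 (the 7-day default)
```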
Troubleshooting
Cannot Connect to Broker
- Verify DNS configuration (hosts file)
- Check broker is running: docker ps
- Verify port accessibility
- Check advertised listener configuration
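The port check can be scripted by splitting the bootstrap list into host and port pairs and probing each one. This sketch assumes the nc (netcat) utility is installed and supports the -z and -w options, which varies by platform. split_bootstrap and check_brokers are hypothetical helpers.

```shell
# split_bootstrap LIST: print one "host port" pair per line
# from a comma-separated host:port list.
split_bootstrap() {
  echo "$1" | tr ',' '\n' | while IFS=: read -r host port; do
    if [ -n "$host" ]; then
      echo "$host $port"
    fi
  done
}

# check_brokers LIST: probe every broker's port (assumes nc with -z and -w).
check_brokers() {
  split_bootstrap "$1" | while read -r host port; do
    if nc -z -w 2 "$host" "$port" 2>/dev/null; then
      echo "$host:$port reachable"
    else
      echo "$host:$port NOT reachable"
    fi
  done
}
```

Running `check_brokers kafka1:9092,kafka2:9093` covers both the hosts-file mapping and the port mapping in one step, because the probe only succeeds if the name resolves and the port accepts connections.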
Topic Creation Fails
- Ensure sufficient brokers for replication factor
- Check broker logs for errors
- Verify Zookeeper is running and accessible
- Check permissions and ACLs if security is enabled
Partition Rebalancing
After adding brokers, use kafka-reassign-partitions.sh to redistribute partitions evenly across the cluster.
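The reassignment tool works from JSON files. Its --generate mode takes a file naming the topics to move and proposes a new assignment for a given broker list. The sketch below writes that input file. write_topics_to_move is a hypothetical helper, and the broker IDs in the commented commands are placeholders for your own cluster.

```shell
# write_topics_to_move FILE TOPIC...: write the JSON that
# kafka-reassign-partitions.sh reads via --topics-to-move-json-file.
write_topics_to_move() {
  out_file=$1
  shift
  {
    printf '{"version":1,"topics":['
    sep=""
    for topic in "$@"; do
      printf '%s{"topic":"%s"}' "$sep" "$topic"
      sep=","
    done
    printf ']}\n'
  } > "$out_file"
}

# Typical follow-up (not run here; broker IDs 1,2,3 are placeholders):
#   write_topics_to_move topics.json payment
#   kafka-reassign-partitions.sh --bootstrap-server kafka1:9092 \
#     --topics-to-move-json-file topics.json --broker-list "1,2,3" --generate
# Save the proposed assignment to a file, then apply it with
#   --reassignment-json-file <saved file> --execute
```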
Summary
Topic management is fundamental to working with Kafka. Key takeaways:
- Use Docker for easy environment setup
- Configure DNS for broker name resolution
- Choose partition count based on parallelism needs
- Set replication factor for desired fault tolerance
- Apply appropriate retention and compression settings
- Monitor and adjust configurations as needed
Proper topic configuration ensures optimal performance, reliability, and maintainability of your Kafka cluster.