Kafka Fundamentals for Beginners
3.4 Real-Time vs Batch Processing

Understanding the differences between real-time and batch processing in Kafka.

Overview

This lesson explores two common patterns for interacting with Kafka: real-time processing and batch processing. Each has its strengths and challenges, particularly when dealing with retention windows and error handling.

Real-Time Processing

In real-time processing, producers send events to Kafka the moment they occur, and consumers read them almost immediately.

Use Cases

  • Fraud detection requiring immediate action
  • Real-time analytics and monitoring
  • Alert systems
  • Live dashboards
  • Transaction processing

Advantages

  • Immediate data availability
  • Quick response to events
  • Lower latency
  • Continuous data flow
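
To make this concrete, here is a minimal sketch of a real-time pipeline, assuming the confluent-kafka Python client and a broker at localhost:9092; the topic name, group id, and handle_event function are illustrative, not part of this lesson's setup.

```python
from confluent_kafka import Producer, Consumer

def handle_event(value):
    # Placeholder for immediate, per-event business logic (e.g., fraud checks).
    print("processing", value)

# Producer side: publish each event the moment it occurs.
producer = Producer({"bootstrap.servers": "localhost:9092"})
producer.produce("transactions", key="txn-1001", value='{"amount": 42.50}')
producer.flush()  # block until the broker acknowledges the event

# Consumer side: poll continuously and act on each record as it arrives.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "realtime-demo",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["transactions"])

while True:
    msg = consumer.poll(timeout=1.0)  # returns as soon as a record is available
    if msg is None:
        continue                      # nothing new yet; keep polling
    if msg.error():
        print("consumer error:", msg.error())
        continue
    handle_event(msg.value())
```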

Batch Processing

Batch processing involves collecting data over a specific interval and processing it all at once at a later time.

Key Considerations

  • Balance batch size with processing time
  • Ensure no data loss within retention window
  • Optimize for throughput over latency

Use Cases

  • ETL operations
  • Report generation
  • Bulk data transformations
  • Scheduled analytics
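
A batch-style consumer might look like the following sketch, again assuming the confluent-kafka Python client; the topic, group id, batch size, and process_batch helper are illustrative. Note how offsets are committed only after the whole batch succeeds, which favors throughput over latency.

```python
from confluent_kafka import Consumer

def process_batch(records):
    # Placeholder for the bulk transformation (ETL step, report build, etc.).
    print(f"processing {len(records)} records")

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "nightly-etl",
    "auto.offset.reset": "earliest",
    "enable.auto.commit": False,  # commit only after the whole batch succeeds
})
consumer.subscribe(["transactions"])

# Pull up to 10,000 records in one call (or whatever arrives within the
# timeout), then process them in a single pass.
batch = consumer.consume(num_messages=10_000, timeout=30.0)
records = [m.value() for m in batch if m.error() is None]
process_batch(records)
consumer.commit()  # advance offsets only once the batch is safely processed
```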

Bad Data: A Common Challenge

Bad data is a universal challenge for both processing modes, but each handles it differently.

Real-Time Processing Error Handling

Characteristics:

  • Data flows continuously from producer to consumer
  • Immediate action possible on bad data
  • Can retry, skip, or push to a Dead Letter Queue (DLQ), as sketched below
  • Flexible error handling with little impact on subsequent data flow

Challenge:

  • If bad data isn't handled quickly, it can delay subsequent messages
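
A hedged sketch of this flow, assuming the confluent-kafka Python client: retry once, then park the record on a hypothetical transactions.dlq topic so the stream keeps moving.

```python
from confluent_kafka import Consumer, Producer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "realtime-handler",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["transactions"])
dlq = Producer({"bootstrap.servers": "localhost:9092"})

def handle_event(value):
    # Placeholder: real per-event logic; assume it raises on bad data.
    ...

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    try:
        handle_event(msg.value())          # normal path
    except Exception:
        try:
            handle_event(msg.value())      # one quick retry for transient faults
        except Exception:
            # Park the poison record on a dead-letter topic and move on,
            # so one bad message never stalls the rest of the stream.
            dlq.produce("transactions.dlq", key=msg.key(), value=msg.value())
            dlq.poll(0)  # serve delivery callbacks without blocking
```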

Batch Processing Error Handling

Characteristics:

  • Data accumulates over an interval before processing
  • Error handling must fit within retention window
  • Delayed processing can cause message expiration

Challenge:

  • If retention is 1 hour and processing is delayed, messages might expire before being processed

Batch Processing Example: Payment System

Let's examine a concrete example with these parameters:

  • Transaction Rate: 1 transaction per second (TPS)
  • Retention Window: 1 hour
  • Batch Interval: 45 minutes

Calculation

Producer Side (45-minute interval):

  • Rate: 1 TPS
  • Total transactions: 45 min × 60 sec × 1 TPS = 2,700 transactions

Consumer Side (15-minute processing window):

  • Time available: 60 min retention - 45 min batch = 15 minutes
  • Transactions to process: 2,700
  • Required rate: 2,700 ÷ (15 × 60) = 3 TPS

| Time Interval | Rate | Transactions |
|---------------|------|--------------|
| Producer (0-45 min) | 1 TPS | 2,700 |
| Consumer (45-60 min) | 3 TPS | 2,700 |
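
The arithmetic above is easy to verify in a few lines of Python:

```python
TPS = 1                      # producer rate, transactions per second
RETENTION_MIN = 60           # broker retention window
BATCH_INTERVAL_MIN = 45      # accumulation interval before processing starts

produced = BATCH_INTERVAL_MIN * 60 * TPS           # 2,700 transactions
window_min = RETENTION_MIN - BATCH_INTERVAL_MIN    # 15 minutes left to consume
required_tps = produced / (window_min * 60)        # 3.0 TPS

print(produced, window_min, required_tps)  # -> 2700 15 3.0
```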

Error Handling Challenges

Single Error Impact

When the consumer encounters a single bad record that takes 1 minute to handle:

  • Original time available: 15 minutes (900 seconds)
  • Time left after the error: 14 minutes (840 seconds)
  • Remaining transactions: 2,699
  • Required rate rises to 2,699 ÷ 840 ≈ 3.2 TPS, so even one error tightens the window

Multiple Errors Impact

Assuming 5% bad data rate:

  • Total transactions: 2,700
  • Bad transactions: 2,700 × 0.05 = 135 transactions
  • Error handling time: 135 transactions × 1 minute = 135 minutes

Critical Problem:

  • Error handling time (135 min) > Available window (15 min)
  • Error handling time (135 min) > Retention window (60 min)
  • Result: Data expires before it can be processed
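
Again, the numbers check out in a few lines of Python:

```python
produced = 2_700          # transactions in the batch
error_rate = 0.05         # 5% bad data
handling_min = 1          # minutes spent per bad transaction

bad = int(produced * error_rate)       # 135 bad transactions
error_time_min = bad * handling_min    # 135 minutes of error handling

print(error_time_min > 15)   # True: exceeds the 15-minute processing window
print(error_time_min > 60)   # True: exceeds the retention window itself
```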

This highlights the importance of:

  • Minimizing error rates
  • Optimizing error handling strategies
  • Proper data retention configuration

Solutions to Error Handling Challenges

Solution 1: Increase Data Retention

Approach:

  • Extend retention from 1 hour to 2+ hours
  • Provides more time to process all transactions, even with errors

Advantages:

  • Most reliable for critical data
  • Ensures no data loss
  • Handles error spikes

Trade-offs:

  • Requires more storage (increased costs)
  • More on-disk data can slow broker operations when the cluster is busy
  • Higher infrastructure requirements

Best for: Critical data where loss is unacceptable
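
As one illustration, topic retention can be changed at runtime via the retention.ms setting; the sketch below assumes the confluent-kafka AdminClient and a hypothetical payments topic.

```python
from confluent_kafka.admin import AdminClient, ConfigResource

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

# Raise retention on the topic from 1 hour to 2 hours (retention.ms is in ms).
# Note: alter_configs replaces the topic's existing overrides, so newer
# clients may prefer incremental_alter_configs for a single setting.
resource = ConfigResource(
    ConfigResource.Type.TOPIC,
    "payments",
    set_config={"retention.ms": str(2 * 60 * 60 * 1000)},
)
futures = admin.alter_configs([resource])
futures[resource].result()  # raises if the broker rejects the change
```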

Solution 2: Reduce Error Handling Time

Approach:

  • Send bad transactions to Dead Letter Queue (DLQ)
  • Consumer skips errors and processes healthy transactions first
  • Failed data isolated for later analysis

Advantages:

  • Consumer remains efficient
  • Doesn't get stuck on errors
  • Failed data available for debugging
  • Smooth processing continues

Trade-offs:

  • Additional infrastructure required
  • DLQ needs monitoring
  • Complexity in error recovery process

Best for: Systems with frequent but manageable errors
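
A minimal sketch of the DLQ handoff, assuming the confluent-kafka Python client; the payments.dlq topic name is hypothetical. Copying the source topic, partition, offset, and error text into headers preserves enough context for later debugging.

```python
from confluent_kafka import Producer

dlq = Producer({"bootstrap.servers": "localhost:9092"})

def send_to_dlq(msg, error):
    """Forward a failed record to the dead-letter topic, keeping its
    origin and the error text so it can be analyzed and replayed later."""
    dlq.produce(
        "payments.dlq",                  # illustrative DLQ topic name
        key=msg.key(),
        value=msg.value(),
        headers={
            "src.topic": msg.topic(),
            "src.partition": str(msg.partition()),
            "src.offset": str(msg.offset()),
            "error": str(error),
        },
    )
    dlq.poll(0)  # serve delivery callbacks without blocking the consumer
```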

Solution 3: Skip Bad Data

Approach:

  • Configure consumer to log and skip bad data entirely
  • Process only healthy messages
  • No retry or DLQ

Advantages:

  • Simplest implementation
  • Keeps system running without interruptions
  • No additional infrastructure

Trade-offs:

  • Potential data loss
  • Skipped data needs review later
  • Operational overhead for investigation

Best for: Systems where:

  • Errors are rare
  • Some data loss is tolerable
  • Simplicity is prioritized
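
A minimal log-and-skip sketch, assuming a confluent-kafka style message object; handle_event is a placeholder:

```python
import logging

log = logging.getLogger("consumer")

def handle_event(value):
    # Placeholder: real per-record logic; assume it raises on bad data.
    ...

def process(msg):
    try:
        handle_event(msg.value())
    except Exception as exc:
        # Log enough context to find the record later, then move on.
        # No retry, no DLQ: once the offset is committed, the record is gone
        # from this pipeline unless someone replays it manually.
        log.warning("skipping bad record at %s[%d]@%d: %s",
                    msg.topic(), msg.partition(), msg.offset(), exc)
```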

Solution Comparison

| Solution | Time Required | Storage Cost | Complexity | Data Loss Risk |
|----------|---------------|--------------|------------|----------------|
| Increase Retention | More | Higher | Low | None |
| Use DLQ | Normal | Normal | Higher | None |
| Skip Bad Data | Less | Lower | Low | Some |

Choosing the Right Solution

Consider these factors:

  • Data criticality: How important is every transaction?
  • Error frequency: How often do errors occur?
  • Budget constraints: What are the storage costs?
  • Operational capacity: Can you manage complex error handling?
  • System priorities: Throughput vs. reliability vs. cost?

Decision Matrix

High-value, critical data:

  • Solution 1 (Increase Retention) + Solution 2 (DLQ)

Moderate importance, manageable error rate:

  • Solution 2 (DLQ)

Low criticality, rare errors:

  • Solution 3 (Skip) with logging

Summary

Both real-time and batch processing have their place in Kafka architectures:

Real-Time Processing:

  • Immediate action on data
  • Flexible error handling
  • Lower latency
  • Ideal for time-sensitive operations

Batch Processing:

  • Efficient for bulk operations
  • Must carefully manage retention windows
  • Error handling impacts processing time
  • Requires thoughtful strategy for reliability

The key to successful batch processing is balancing:

  • Batch size and frequency
  • Retention configuration
  • Error handling strategy
  • Infrastructure costs
  • Data criticality

By understanding these trade-offs, you can design a Kafka-based system that meets your specific requirements for performance, reliability, and cost.