Soroosh Khodami

Why and when should we consider Stream Processing frameworks in our solutions

Stream processing frameworks are powerful, but they bring substantial complexity. Learn when to use them, and more importantly, when not to.

#1 · about 2 minutes

Differentiating stream processing from event processing

Stream processing focuses on transforming continuous data streams, whereas event processing is about making decisions and triggering actions based on individual messages.
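
To make the distinction concrete, here is a minimal plain-Java sketch (no framework, and not code from the talk): the event-processing side reacts to each message on its own, while the stream-processing side computes a result over the flow of messages. The method and variable names are made up for illustration.

```java
import java.util.List;

public class EventVsStream {

    // Event processing: make a decision and trigger an action per message.
    static void handleEvent(double temperature) {
        if (temperature > 30.0) {
            System.out.println("ALERT: temperature too high");
        }
    }

    // Stream processing: derive a continuous result from the stream as a whole.
    static double runningAverage = 0.0;
    static long count = 0;

    static void updateRunningAverage(double temperature) {
        count++;
        runningAverage += (temperature - runningAverage) / count;
    }

    public static void main(String[] args) {
        for (double t : List.of(21.0, 35.5, 28.2)) {
            handleEvent(t);            // per-message decision
            updateRunningAverage(t);   // computation over the whole stream
        }
        System.out.println("Average so far: " + runningAverage);
    }
}
```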

#2 · about 2 minutes

Handling out-of-order data with event time

Stream processing frameworks can handle out-of-order messages by processing them according to when the event actually occurred (event time) rather than when it was received (processing time).
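
As a rough illustration, here is a minimal sketch using the Apache Flink 1.x DataStream API (the talk covers the concept, not this exact code): a WatermarkStrategy tells the framework to read the timestamp embedded in each record and to tolerate events arriving a few seconds late. The SensorReading type and the sample data are invented for the example.

```java
import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class EventTimeExample {

    // Made-up record type for illustration.
    public static class SensorReading {
        public String sensorId;
        public long timestampMillis;
        public double value;

        public SensorReading() {}

        public SensorReading(String sensorId, long timestampMillis, double value) {
            this.sensorId = sensorId;
            this.timestampMillis = timestampMillis;
            this.value = value;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<SensorReading> readings = env.fromElements(
                new SensorReading("s1", 1_000L, 20.1),
                new SensorReading("s1", 3_000L, 20.5),   // arrives before the next one...
                new SensorReading("s1", 2_000L, 20.3));  // ...but happened earlier (out of order)

        // Use the timestamp carried inside each record (event time) and
        // tolerate events that arrive up to 5 seconds late.
        readings.assignTimestampsAndWatermarks(
                        WatermarkStrategy
                                .<SensorReading>forBoundedOutOfOrderness(Duration.ofSeconds(5))
                                .withTimestampAssigner((reading, recordTs) -> reading.timestampMillis))
                .print();

        env.execute("event-time-example");
    }
}
```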

#3 · about 2 minutes

Understanding message delivery guarantees

Frameworks provide mechanisms for exactly-once processing, which prevents duplicate message processing and is critical for financial systems.
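
For example, in Apache Flink exactly-once processing is driven by checkpointing; the sketch below (illustrative values, not from the talk) enables it on the execution environment. End-to-end exactly-once additionally requires sources and sinks that participate in the checkpointing protocol, such as transactional Kafka producers; in Kafka Streams the equivalent switch is the processing.guarantee=exactly_once_v2 configuration.

```java
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ExactlyOnceConfig {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Take a consistent snapshot of all operator state every 10 seconds,
        // so each record affects the state exactly once after recovery.
        env.enableCheckpointing(10_000, CheckpointingMode.EXACTLY_ONCE);

        // ... build and execute the pipeline as usual ...
    }
}
```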

#4 · about 3 minutes

Building data pipelines with sources and operators

Data pipelines are constructed by chaining operators that read from a source, apply transformations like filtering or joining, and write to a sink.
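
A minimal Flink pipeline following that shape might look like the sketch below, with fromElements standing in for a real source (Kafka, files, ...) and print() standing in for a real sink; the order-parsing logic is invented for illustration.

```java
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SimplePipeline {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Source: in a real job this would be Kafka, a file system, etc.
        DataStream<String> source = env.fromElements("order:42", "heartbeat", "order:7");

        source
                .filter(line -> line.startsWith("order:"))          // operator: drop irrelevant records
                .map(line -> Integer.parseInt(line.substring(6)))   // operator: extract the order id
                .returns(Types.INT)                                  // help Flink infer the lambda's output type
                .print();                                             // sink: write to stdout

        env.execute("simple-pipeline");
    }
}
```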

#5 · about 5 minutes

Using windowing to process continuous data streams

Windowing groups unbounded data into finite chunks for processing, with types like tumbling, sliding, and session windows serving different analytical needs.
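
A tumbling-window sketch in Flink could look like this (illustrative data, not from the talk): click counts are summed per user in fixed 10-second windows. Sliding and session windows follow the same pattern with SlidingProcessingTimeWindows and ProcessingTimeSessionWindows.

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class WindowingExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // (user id, click count) pairs; a real job would read these from a source.
        DataStream<Tuple2<String, Integer>> clicks = env.fromElements(
                Tuple2.of("user-a", 1), Tuple2.of("user-b", 1), Tuple2.of("user-a", 1));

        clicks
                .keyBy(click -> click.f0)                                      // partition by user
                .window(TumblingProcessingTimeWindows.of(Time.seconds(10)))   // fixed 10-second windows
                .sum(1)                                                        // total clicks per user per window
                .print();

        env.execute("windowing-example");
    }
}
```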

#6 · about 1 minute

Joining data from multiple real-time streams

You can combine data from multiple streams using familiar concepts like inner joins and cross joins to create enriched data outputs.
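
As a sketch of a windowed inner join in Flink (the streams, keys, and sample data are invented for the example): orders and payments are matched on order id within 30-second windows to produce an enriched output record.

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class StreamJoinExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<Tuple2<String, String>> orders = env.fromElements(
                Tuple2.of("order-1", "laptop"), Tuple2.of("order-2", "phone"));
        DataStream<Tuple2<String, Double>> payments = env.fromElements(
                Tuple2.of("order-1", 999.0), Tuple2.of("order-2", 499.0));

        orders.join(payments)
                .where(order -> order.f0)            // join key on the order stream
                .equalTo(payment -> payment.f0)      // join key on the payment stream
                .window(TumblingProcessingTimeWindows.of(Time.seconds(30)))
                .apply((order, payment) -> order.f1 + " paid " + payment.f1)  // enriched output
                .print();

        env.execute("stream-join-example");
    }
}
```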

#7 · about 2 minutes

Implementing complex logic with stateful processing

Stateful processing allows operators to store and retrieve data in memory, enabling complex logic like tracking user behavior or detecting fraud patterns over time.
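
A minimal Flink sketch of keyed state (the use case, names, and threshold are invented for illustration): a per-user counter kept in ValueState emits an alert once the same key has produced more than five events. Flink checkpoints this state together with the stream position.

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

// Per-user event counter; input is assumed to be keyed by user id.
public class LoginCounter extends KeyedProcessFunction<String, String, String> {

    private transient ValueState<Integer> countState;

    @Override
    public void open(Configuration parameters) {
        // Declare the per-key state handle.
        countState = getRuntimeContext().getState(
                new ValueStateDescriptor<>("login-count", Types.INT));
    }

    @Override
    public void processElement(String event, Context ctx, Collector<String> out) throws Exception {
        Integer current = countState.value();
        int updated = (current == null ? 0 : current) + 1;
        countState.update(updated);

        if (updated > 5) {
            out.collect("Suspicious activity for user " + ctx.getCurrentKey());
        }
    }
}
```

It would be wired into a pipeline with something along the lines of events.keyBy(e -> userIdOf(e)).process(new LoginCounter()), where userIdOf is a hypothetical key extractor.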

#8 · about 1 minute

Overview of popular stream processing frameworks

Key frameworks for stream processing include Apache Flink, Apache Beam, Spark Streaming, and Kafka Streams, with cloud platforms offering managed services.

#9 · about 4 minutes

Comparing Spring Boot vs Apache Beam performance

A practical benchmark shows that while Apache Beam offers higher throughput, a standard Spring Boot and Redis setup can be sufficient and more cost-effective for many use cases.

#10 · about 3 minutes

Weighing the benefits and significant drawbacks

While powerful, stream processing frameworks are difficult to maintain and debug and come with a steep learning curve for development teams.

#11 · about 1 minute

Real-world use cases for stream processing

Stream processing is heavily used in industries like gaming for anti-cheat systems, IoT for real-time traffic data, and finance for fraud detection.

#12 · about 2 minutes

Learning resources and communicating with stakeholders

Before adopting these complex frameworks, it is crucial to manage stakeholder expectations about the high cost and difficulty of implementing and changing data pipelines.
