Streaming data on Kinesis is a signature SAA-C03 topic because real-time architectures appear in every modern customer scenario — IoT telemetry, clickstream analytics, log aggregation, fraud detection, and change-data-capture pipelines. The exam tests whether a Solutions Architect can pick the right streaming service (Amazon Kinesis Data Streams, Amazon Data Firehose, Amazon Managed Service for Apache Flink, or Amazon MSK) under tight latency, durability, ordering, and cost constraints. Miss the trap between Kinesis Data Streams and Amazon Data Firehose and you miss a guaranteed two to four questions per attempt.
This study note decodes every exam-relevant behaviour of the Kinesis family for SAA-C03 — shard mechanics, partition keys, hot shards, on-demand vs provisioned capacity, retention windows from one to 365 days, the Kinesis Client Library (KCL), enhanced fan-out, cross-account stream sharing, Firehose destinations (Amazon S3, Amazon Redshift, Amazon OpenSearch Service, Splunk, generic HTTP), Lambda transformation, dynamic partitioning, Apache Flink stream processing, and Amazon MSK including MSK Serverless and MSK Connect. Each section ends with the decision rule the exam expects you to apply on sight.
What is Streaming Data on AWS?
Streaming data is an unbounded sequence of records produced continuously by many sources (devices, applications, databases, clickstreams) and consumed by one or more downstream systems in near-real-time. Unlike batch data, streaming data never ends, records must be processed in order (at least per key), and the system must scale horizontally as producers multiply. On AWS, the streaming data Kinesis family plus Amazon MSK are the managed building blocks that replace self-hosted Apache Kafka, Apache Flink, or Apache Storm clusters.
The streaming data Kinesis portfolio has four pillars:
- Amazon Kinesis Data Streams (KDS) — the durable, replayable, ordered log. You write producers, you write consumers, you control the shard count.
- Amazon Data Firehose (formerly Kinesis Data Firehose) — the fully managed delivery pipeline that batches, transforms, compresses, and lands streaming data in Amazon S3, Amazon Redshift, Amazon OpenSearch Service, Splunk, HTTP endpoints, or partner SaaS.
- Amazon Managed Service for Apache Flink (formerly Kinesis Data Analytics) — the serverless Apache Flink runtime for stateful stream processing in Java, Scala, Python, or SQL via Zeppelin notebooks.
- Amazon MSK — fully managed Apache Kafka, available as provisioned clusters or MSK Serverless, with MSK Connect for Kafka Connect connectors.
Streaming data Kinesis appears in Domain 1 (Design Secure Architectures), Domain 2 (Resilient Architectures), Domain 3 (High-Performing Architectures), and Domain 4 (Cost-Optimized Architectures) of the SAA-C03 exam guide. Expect one streaming-data-Kinesis question per domain on average.
Why Streaming Data Kinesis Matters for SAA-C03
SAA-C03 Domain 3 (High-Performing Architectures) explicitly lists "determine high-performing and/or scalable storage solutions" and "design high-performing and elastic compute solutions" — both of which the streaming data Kinesis family solves. Domain 4 (Cost-Optimized Architectures) tests the Kinesis Data Streams on-demand vs provisioned trade-off and the Amazon Data Firehose fully-managed zero-ops cost advantage. Domain 2 (Resilient Architectures) tests the streaming data Kinesis retention window (1 to 365 days) as the replay-safety feature. Domain 1 (Secure Architectures) tests KMS encryption, VPC endpoints, IAM resource policies, and cross-account streaming data Kinesis access.
Plain-Language Explanation: Streaming Data Kinesis
The streaming data Kinesis family sounds abstract until you map it onto everyday systems. Three analogies crack it open.
Analogy 1 — The postal sorting facility (the postal system)
Picture a 24/7 postal sorting facility.
- Amazon Kinesis Data Streams is the main conveyor belt that runs through the middle of the building. Every parcel (record) lands on one of several parallel lanes (shards), and every lane is sorted by the destination ZIP code printed on the parcel (partition key). Parcels stay on the belt for one day by default, up to 365 days if you pay for extended retention, so multiple downstream inspectors can photograph them later.
- Amazon Data Firehose is the delivery truck that waits at the loading dock. You do not see individual parcels — the truck leaves only when the cargo hits 5 MiB or 300 seconds (buffer hints), and it always drives to the address you pre-configured (Amazon S3 warehouse, Amazon Redshift archive, Amazon OpenSearch Service index, Splunk dock, or a third-party HTTP endpoint).
- Amazon Managed Service for Apache Flink is the inspector standing next to the conveyor, reading barcodes as parcels fly past, computing a rolling count of parcels per ZIP per minute, and sending the aggregate to a dashboard — stateful stream processing, no batch.
- Amazon MSK is the same conveyor belt, but built to the Apache Kafka blueprint so that companies with Kafka-trained staff can plug their existing Kafka tooling straight in.
If the exam says "simply write streaming data to S3 with Parquet and a KMS key", the answer is the delivery truck (Amazon Data Firehose). If the exam says "multiple consumers must each read every record, replay seven days back", the answer is the conveyor belt with retention (Amazon Kinesis Data Streams).
Analogy 2 — Train lines on a station map (the transit system)
Think of streaming data Kinesis as the train system of a mega city.
- A shard is a single track. Adding shards adds parallel tracks that can carry more trains per minute. A stream with four shards ingests four times the throughput of a one-shard stream.
- The partition key is the routing stamp on the ticket. Passengers with the same stamp always board the same track, so ordering is preserved per key. A bad stamp design (everyone stamps "VIP") jams one track — a hot shard.
- On-demand mode is the metro that auto-adds trains when crowds arrive; provisioned mode is the commuter railway where you pre-book carriages and pay even when empty.
- Enhanced fan-out is a private express track granted to one VIP consumer — 2 MiB/s dedicated throughput per shard per consumer, not shared with other riders.
- KCL (Kinesis Client Library) is the station master that coordinates which carriage each consumer boards and checkpoints where they stopped reading.
Analogy 3 — The kitchen brigade (the kitchen)
Or picture a restaurant kitchen mid-service.
- Amazon Kinesis Data Streams is the pass — orders fly past, every line cook (consumer) grabs a copy, and the paper tickets stay pinned for a day so the expediter can reconcile.
- Amazon Data Firehose is the dishwasher conveyor — plates go in dirty, come out rinsed, dried, and stacked at the back exit (Amazon S3) with zero human tuning.
- Amazon Managed Service for Apache Flink is the sous-chef shouting running totals ("table five has ordered three mains, two desserts, running check 84 dollars") in real time.
- Amazon MSK is the same pass, but the chef insists on the Italian kitchen convention (Apache Kafka), not the French one (Kinesis Data Streams).
Keep the three pictures handy and every streaming data Kinesis question becomes a routing puzzle.
Core Operating Principles — Streaming Data Kinesis Fundamentals
Every streaming data Kinesis service is built on four shared principles.
- Partitioned, ordered log. Records land in shards (KDS) or partitions (MSK). Order is guaranteed per shard, not across shards. The partition key is the customer's responsibility.
- Pull-based or push-based consumption. KDS and MSK are pull-based — consumers poll. Amazon Data Firehose is push-based — the service pushes to your destination on its buffer schedule.
- Retention decouples producers from consumers. Records survive in the streaming data Kinesis buffer for a configurable window (KDS: 24 hours default, 7 days standard, 365 days extended; MSK: until disk fills or the configured retention hours; Firehose: only until the batch lands at the destination).
- Throughput scales horizontally. Add shards to KDS, add brokers to MSK, or let MSK Serverless and Firehose auto-scale for you.
A shard is the base throughput unit of an Amazon Kinesis Data Stream. Each shard provides 1 MiB/s or 1,000 records/s of write capacity and 2 MiB/s of read capacity shared across classic consumers (or 2 MiB/s per enhanced-fan-out consumer). Stream capacity equals shard count multiplied by per-shard limits. Source ↗
A partition key is a Unicode string (up to 256 characters) supplied by the producer that Kinesis Data Streams hashes (MD5) to select which shard receives the record. Records with the same partition key always land on the same shard, so ordering is preserved per key. Source ↗
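The hash-to-shard routing above can be sketched in a few lines of Python. This is a simplified model, not the service implementation — real shards own explicit hash-key ranges returned by DescribeStream — but an evenly split range behaves the same way:

```python
import hashlib

def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Model Kinesis routing: MD5-hash the partition key, then map the
    128-bit value onto evenly sized shard hash-key ranges."""
    hashed = int.from_bytes(
        hashlib.md5(partition_key.encode("utf-8")).digest(), "big"
    )
    range_size = 2 ** 128 // shard_count
    return min(hashed // range_size, shard_count - 1)

# The same key always routes to the same shard, preserving per-key order.
same = shard_for_key("device-42", 4) == shard_for_key("device-42", 4)
```

Because MD5 output is effectively uniform, high-cardinality keys spread evenly across shards — which is exactly why a low-cardinality key causes the hot-shard problem covered below.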
Amazon Kinesis Data Streams — The Durable Streaming Log
Amazon Kinesis Data Streams (KDS) is the foundational streaming data Kinesis service. It stores records in shards, preserves order per partition key, retains records for 24 hours by default (extended up to 365 days), and exposes two consumer models: classic shared-throughput and enhanced-fan-out.
Shards, throughput, and record size
Each KDS shard sustains:
- Writes: 1 MiB/s or 1,000 records/s, whichever comes first.
- Classic reads: 2 MiB/s or 5 GetRecords calls/s, shared across all classic consumers.
- Enhanced-fan-out reads: 2 MiB/s per registered consumer per shard, dedicated (HTTP/2 push).
A record is limited to 1 MiB (payload plus partition key). The exam loves to ask what happens when a producer writes a 2 MiB record — the answer is that the PutRecord call is rejected with a validation error regardless of shard count; the 1 MiB record limit is a hard cap that adding shards cannot lift (ProvisionedThroughputExceededException is the separate throttling error for exceeding per-shard write rates).
Partition keys and hot shards
The partition key is the design knob that determines parallelism. A well-distributed key (user_id, device_id, session_id) spreads records uniformly. A skewed key (constant string, country code with one dominant country, or a low-cardinality attribute) concentrates records on one shard — a hot shard — and the entire stream throttles at 1 MiB/s regardless of shard count.
Adding shards does NOT fix a hot shard. If your partition key sends 90 percent of traffic to key "US", resharding from 4 to 16 shards still routes all "US" records to one shard. The fix is to either (a) change the partition key to a higher-cardinality attribute, (b) append a random suffix (user_id + random 0-9) to spread synthetic keys, or (c) pre-aggregate with the Kinesis Producer Library (KPL) so fewer partition-key decisions are made. Source ↗
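Fix (b), the random suffix, can be sketched as a tiny producer-side helper. This is an illustrative pattern, not a library API — and note the trade-off in the comment: consumers must merge the synthetic keys back together, and strict ordering across suffixes is lost:

```python
import random

def salted_partition_key(base_key: str, salt_buckets: int = 10) -> str:
    """Spread one hot logical key across N synthetic partition keys.
    Trade-off: per-key ordering now only holds within each suffix, and
    consumers must aggregate across all N suffixes to rebuild the key."""
    return f"{base_key}#{random.randrange(salt_buckets)}"
```

With salt_buckets=10, the hot key "US" becomes "US#0" through "US#9", giving the MD5 hash ten distinct inputs to spread across shards.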
On-demand mode vs provisioned mode
Amazon Kinesis Data Streams ships in two capacity modes:
- Provisioned mode: you specify the shard count, you pay per shard-hour plus PUT payload units. Cheapest for steady, predictable workloads where you can right-size. You are responsible for resharding (SplitShard, MergeShards) as traffic grows.
- On-demand mode: AWS auto-scales the shard count up to 200 MiB/s write and 400 MiB/s read per stream by default. You pay per GB ingested and per GB retrieved, no shard math. Best for spiky, unknown, or new workloads.
Choose Amazon Kinesis Data Streams on-demand when throughput is unpredictable, bursts exceed 2x the baseline, or you do not want to own resharding. Choose provisioned when you have a predictable baseline above roughly 250 shard-hours per month — the per-shard-hour price is about 4x cheaper than the on-demand equivalent GB price at steady load. Source ↗
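In boto3, the two capacity modes differ only in the parameters passed to create_stream. A sketch of a request-building helper — the parameter names (StreamName, ShardCount, StreamModeDetails) follow the real Kinesis CreateStream API, while the helper itself is illustrative:

```python
def create_stream_params(name, mode="ON_DEMAND", shard_count=None):
    """Build kwargs for kinesis_client.create_stream(**params)."""
    params = {
        "StreamName": name,
        "StreamModeDetails": {"StreamMode": mode},
    }
    if mode == "PROVISIONED":
        if shard_count is None:
            raise ValueError("provisioned mode requires a shard count")
        params["ShardCount"] = shard_count  # you own resharding from here on
    return params
```

On-demand needs no shard count at all — that absence is the whole value proposition the exam is probing for.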
Retention: 1 day, 7 days, 365 days
The default retention period is 24 hours. Extended retention up to 7 days is a toggle with no extra per-GB cost (only per-shard-hour premium). Long-term retention from 7 to 365 days is billed per GB-month stored and per GB retrieved with GetRecords beyond seven days. Long-term retention is the answer whenever the exam asks "replay events up to a year ago without moving them to Amazon S3 first."
Many SAA-C03 questions state that a consumer failed for three days and ask whether records can be replayed. If the stream is on default retention (24 hours), the data is gone. The fix is to raise retention to 7 days (or 365 days) BEFORE the outage — you cannot extend retention retroactively to recover already-expired data. Always assume default unless the scenario specifies otherwise. Source ↗
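Retention is changed with IncreaseStreamRetentionPeriod / DecreaseStreamRetentionPeriod, which take a value in hours. A small guard for the valid range (24 hours to 365 days) — the request keys match the real boto3 call; the validation wrapper is an illustrative sketch:

```python
MIN_RETENTION_HOURS = 24          # the 1-day default
MAX_RETENTION_HOURS = 365 * 24    # 8760 hours = long-term maximum

def retention_params(stream_name, hours):
    """Kwargs for kinesis_client.increase_stream_retention_period(**params)."""
    if not MIN_RETENTION_HOURS <= hours <= MAX_RETENTION_HOURS:
        raise ValueError(
            f"retention must be {MIN_RETENTION_HOURS}-{MAX_RETENTION_HOURS} hours"
        )
    return {"StreamName": stream_name, "RetentionPeriodHours": hours}
```

Raising retention to 168 hours (7 days) before an outage is what makes the three-day replay scenario above recoverable.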
KCL — the Kinesis Client Library
The Kinesis Client Library (KCL) is the AWS-provided consumer framework (Java, Python, Node.js, .NET, Ruby) that:
- Discovers shards and assigns them to worker instances.
- Checkpoints progress in an Amazon DynamoDB table (one table per application name).
- Handles shard splits, merges, and worker failures.
- Guarantees at-least-once delivery per record per application.
Multiple applications can share the same stream by each running its own KCL application with a unique application name (hence a unique checkpoint table). This is how fan-out to multiple independent consumers works in classic mode.
- KCL uses Amazon DynamoDB for checkpointing — that table counts toward your DynamoDB bill.
- One KCL worker processes one or more shards; never more than one worker per shard per application.
- Scale horizontally by adding EC2/ECS workers up to the shard count; beyond that, extra workers idle.
- KCL 2.x adds HTTP/2 push for enhanced fan-out. Source ↗
Enhanced fan-out
Enhanced fan-out (EFO) is the answer when multiple consumers need low-latency, dedicated read throughput. Each registered consumer gets:
- 2 MiB/s per shard dedicated (not shared with other EFO consumers).
- HTTP/2 push delivery with sub-200 ms end-to-end latency.
- Up to 20 consumers per stream.
EFO costs extra per consumer-shard-hour and per GB retrieved, so reserve it for consumers that actually need the isolation. In classic mode, adding a fifth consumer to a single shard means all five fight for the same 2 MiB/s; with EFO, each gets its own 2 MiB/s pipe.
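The classic-vs-EFO arithmetic in that last sentence reduces to one line of code — a toy model of the per-shard read bandwidth each consumer sees:

```python
def per_consumer_read_mibps(consumer_count, enhanced_fan_out):
    """Per-shard read bandwidth (MiB/s) each consumer effectively gets."""
    if enhanced_fan_out:
        return 2.0                    # dedicated pipe per registered consumer
    return 2.0 / consumer_count       # classic consumers share one 2 MiB/s pipe
```

Five classic consumers on one shard each see 0.4 MiB/s; five EFO consumers each keep the full 2 MiB/s.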
Cross-account stream sharing
A Kinesis Data Stream can be read or written from another AWS account using resource-based IAM policies attached to the stream (since 2023) or, classically, cross-account IAM role assumption. The exam pattern is: account A owns the stream; account B runs the consumer; attach a resource policy on the stream that allows account B's role to call SubscribeToShard, GetRecords, and DescribeStream. AWS KMS keys used for stream encryption must also grant the consumer account Decrypt permission.
Producers: SDK, KPL, Kinesis Agent, Firehose-as-producer
- AWS SDK PutRecord / PutRecords — lowest-level, highest flexibility.
- Kinesis Producer Library (KPL) — async, batching, aggregation (multiple logical records per Kinesis record), retries. Pair with KCL for automatic de-aggregation.
- Amazon Kinesis Agent — a Java-based log-tailer installed on EC2 that ships log files to KDS or Firehose.
- Amazon Data Firehose can itself read from a KDS as its source, then land the data in S3/Redshift/OpenSearch without writing any code.
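The lowest-level producer path above can be sketched as a PutRecords request builder. The request shape (Records, Data, PartitionKey) matches the real API; the event schema and the user_id key field are assumptions for illustration:

```python
import json

def put_records_request(stream_name, events, key_field="user_id"):
    """Build kwargs for kinesis_client.put_records(**request).
    Each record entry needs Data (bytes) and a PartitionKey (string)."""
    return {
        "StreamName": stream_name,
        "Records": [
            {
                "Data": json.dumps(event).encode("utf-8"),
                "PartitionKey": str(event[key_field]),  # high-cardinality key
            }
            for event in events
        ],
    }
```

PutRecords accepts up to 500 records per call, and individual records can fail while the call succeeds — production producers must check FailedRecordCount in the response and retry the failed entries (which the KPL does for you).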
Amazon Data Firehose — The Zero-Ops Delivery Pipeline
Amazon Data Firehose (renamed from Kinesis Data Firehose in 2024) is the fully managed streaming data Kinesis service that ingests records and delivers them to a pre-configured destination with batching, compression, encryption, format conversion, and optional Lambda transformation. There are no shards to size, no consumers to run, no checkpointing to manage.
Firehose destinations
Amazon Data Firehose supports the following destinations:
- Amazon S3 — the most common destination; supports GZIP, Snappy, ZIP, Parquet, and ORC format conversion.
- Amazon Redshift — Firehose lands the batch in S3 first, then issues a COPY into Redshift.
- Amazon OpenSearch Service (and Amazon OpenSearch Serverless) — direct index writes with optional S3 backup.
- Splunk — via HTTPS event collector (HEC).
- Generic HTTP endpoint — any HTTPS endpoint, retries with configurable back-off.
- Partner destinations — Datadog, MongoDB Atlas, New Relic, Coralogix, Logz.io, Dynatrace, Honeycomb, Sumo Logic, Elastic Cloud.
- Apache Iceberg tables (in S3 via Glue Data Catalog).
The exam always tests the S3, Redshift, OpenSearch, and Splunk destinations.
Amazon Data Firehose is not a pub/sub system. You cannot attach arbitrary consumers to a Firehose stream; there is exactly one destination per delivery stream. If the scenario requires multiple independent consumers reading the same stream, the correct service is Amazon Kinesis Data Streams (with multiple KCL applications or enhanced-fan-out consumers), not Amazon Data Firehose. Source ↗
Buffering: size and time
Firehose batches records by two buffer hints, whichever trips first:
- Buffer size: 1 MiB to 128 MiB (default 5 MiB for S3, 1 MiB for OpenSearch).
- Buffer interval: 0 to 900 seconds (default 300 seconds).
A smaller buffer means fresher data at the cost of more, smaller destination files — which in turn means more Amazon S3 PUT requests and more expensive Athena scans later. A larger buffer is cheaper but delays visibility.
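The "whichever trips first" rule is worth internalising, since the exam tests it in both directions (freshness vs file size). A toy model of the buffering decision:

```python
def should_flush(buffered_mib, elapsed_seconds,
                 size_hint_mib=5, interval_hint_s=300):
    """Firehose-style buffering: flush when EITHER hint trips first.
    Defaults mirror the S3-destination defaults (5 MiB / 300 s)."""
    return buffered_mib >= size_hint_mib or elapsed_seconds >= interval_hint_s
```

A high-volume stream trips the size hint long before 300 seconds elapse; a trickle stream sits until the interval hint forces a small file out.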
Lambda transformation
Each Firehose delivery stream can optionally invoke an AWS Lambda function to transform records in-flight — parse JSON, add fields, convert formats, drop PII, or mask sensitive values. The Lambda function receives a batch, returns the transformed batch with per-record result status (Ok, Dropped, or ProcessingFailed). Failed records go to an S3 error bucket.
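The record contract is fixed: Firehose invokes the function with base64-encoded records, and each returned record must echo its recordId with one of the three result values. A minimal masking transform — the event/return shapes follow the documented Firehose transformation contract, while the ssn field is a hypothetical PII attribute chosen for illustration:

```python
import base64
import json

def handler(event, context):
    """Firehose transformation Lambda: strip a PII field from JSON records."""
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))
        payload.pop("ssn", None)  # hypothetical PII field to drop
        output.append({
            "recordId": record["recordId"],  # must match the input record
            "result": "Ok",                  # Ok | Dropped | ProcessingFailed
            "data": base64.b64encode(
                json.dumps(payload).encode("utf-8")
            ).decode("utf-8"),
        })
    return {"records": output}
```

Returning "Dropped" silently discards a record; "ProcessingFailed" routes it to the configured S3 error prefix.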
Dynamic partitioning
Dynamic partitioning is the Firehose feature that writes records to Amazon S3 under partition prefixes derived from record content — for example, s3://bucket/year=2026/month=04/day=20/customer_id=42/. Without dynamic partitioning, you get a single YYYY/MM/DD/HH/ prefix controlled by delivery time, which fragments Athena partition pruning. Dynamic partitioning supports:
- JQ-style expressions on JSON records to extract partition values.
- Lambda-provided partition keys for non-JSON records.
- Per-record pricing surcharge — billed per GB partitioned.
The canonical SAA-C03 answer for "land JSON streaming data in Amazon S3 with Parquet format and partition pruning for Amazon Athena" is Amazon Data Firehose with record format conversion (JSON to Parquet using the AWS Glue schema registry) plus dynamic partitioning. No EMR, no Glue job, no custom Lambda — all fully managed. Source ↗
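What dynamic partitioning computes can be modelled in pure Python. Real Firehose evaluates JQ expressions service-side against each record; this sketch only shows the resulting Hive-style prefix shape, and the event_time / customer_id fields are assumed record attributes:

```python
import json

def partition_prefix(record_json):
    """Derive a Hive-style S3 prefix from record content, mimicking what
    JQ expressions such as .event_time[0:4] would extract in Firehose."""
    rec = json.loads(record_json)
    ts = rec["event_time"]  # assumed ISO-8601 field, e.g. "2026-04-20T12:00:00Z"
    return (f"year={ts[0:4]}/month={ts[5:7]}/day={ts[8:10]}/"
            f"customer_id={rec['customer_id']}/")
```

Prefixes keyed like this let Athena prune to a single day and customer instead of scanning the whole bucket.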
Firehose sources
A Firehose delivery stream reads from one source:
- Direct PUT — producers call PutRecord / PutRecordBatch on Firehose directly.
- Amazon Kinesis Data Streams — Firehose is a KDS consumer, pulling records and delivering them.
- Amazon MSK (MSK cluster or MSK Serverless) — Firehose consumes Kafka topics and lands them in S3/Redshift/OpenSearch.
- AWS IoT, Amazon CloudWatch Logs, Amazon CloudWatch Events, AWS WAF logs, Amazon VPC flow logs, Amazon Route 53 Resolver query logs — native integrations.
Amazon Managed Service for Apache Flink (formerly Kinesis Data Analytics)
Amazon Managed Service for Apache Flink is the serverless Apache Flink runtime that lets you build stateful stream-processing applications without managing Flink clusters. It was renamed from Kinesis Data Analytics in 2023; the exam may still use either name.
Three authoring modes
- Apache Flink applications (Java, Scala, Python) — full DataStream or Table API; package a JAR or ZIP, upload to S3, submit. Best for complex event processing, sessionisation, joins, and windowed aggregations.
- Studio notebooks with Apache Zeppelin — interactive SQL, Python, or Scala notebooks backed by Flink. Great for ad-hoc exploration and prototyping that can be promoted to production applications.
- SQL applications (legacy Kinesis Data Analytics for SQL) — the original SQL-on-stream engine. AWS has marked this as legacy and recommends Flink SQL via Studio for new workloads.
Sources and sinks
Managed Service for Apache Flink reads from Kinesis Data Streams, Amazon MSK, Amazon MSK Serverless, and other Flink-supported connectors (Kafka, Kinesis Firehose, Amazon S3, Amazon DynamoDB Streams). It writes to the same destinations plus Amazon Timestream, Amazon OpenSearch, Amazon Redshift (via JDBC), and Amazon S3 (via the Flink S3 sink). The typical SAA-C03 pattern is:
KDS -> Managed Service for Apache Flink -> KDS -> Firehose -> S3 (Parquet)
or
MSK -> Managed Service for Apache Flink -> OpenSearch (real-time dashboard)
State, checkpointing, and scaling
Flink applications are stateful — windowed aggregates, joins, pattern detection. Managed Service for Apache Flink automatically checkpoints state to a service-managed backend and restores on failure. Scaling is by Kinesis Processing Units (KPU), each giving 1 vCPU and 4 GB of memory; parallelism auto-scales between MinParallelism and MaxParallelism based on CPU load.
Whenever the scenario demands stateful work across records — windowed aggregations, stream-to-stream joins, pattern detection — the answer is Amazon Managed Service for Apache Flink. Amazon Data Firehose cannot do windowed joins; Amazon Kinesis Data Streams alone is only a transport; AWS Lambda triggered by KDS can do single-record transforms but not stateful cross-record windowing at scale. Source ↗
Amazon MSK — Managed Apache Kafka
Amazon MSK is the fully managed Apache Kafka service. Two deployment modes exist.
MSK provisioned
You pick the broker instance type (kafka.m7g.large, kafka.m5.4xlarge, etc.), the number of brokers per Availability Zone (two or three), and the EBS storage size per broker. AWS runs the brokers, ZooKeeper (or KRaft in Kafka 3.5+), patching, and failure recovery. You manage:
- Topics and partitions — Kafka's equivalent of shards.
- Replication factor — typically three across three AZs.
- Retention — per topic, by time (retention.ms) or size (retention.bytes).
- ACLs, SASL/SCRAM, IAM auth, TLS — access control.
MSK Serverless
MSK Serverless removes broker sizing. You create a cluster, create topics, and pay per GB ingested, per GB stored, and per partition-hour. AWS auto-scales partitions and throughput up to 200 MiB/s write per cluster by default. Ideal for unpredictable Kafka workloads.
MSK Connect
MSK Connect is the managed Kafka Connect service. It runs Kafka Connect workers (source or sink connectors) as a managed fleet on top of MSK or any Apache Kafka cluster. Common connectors:
- Source: Debezium CDC from MySQL/Postgres, Amazon S3 source, JDBC.
- Sink: Amazon S3 sink, Amazon OpenSearch sink, Snowflake, Redshift, MongoDB.
MSK Connect auto-scales worker count and bills per worker-hour.
Choose Amazon MSK when (a) the customer already runs Apache Kafka or uses Kafka-native tooling (Debezium, Kafka Streams, Schema Registry), (b) they need message sizes larger than 1 MiB (default Kafka limit is 1 MiB but configurable to 10+ MiB), or (c) they need longer retention than 365 days (Kafka retention is bounded only by disk). Choose Amazon Kinesis Data Streams when the team has no Kafka expertise and wants a lighter-weight, fully AWS-native stream with tight IAM, KMS, and VPC integration. Source ↗
Decision Tree — When to Choose KDS vs Firehose vs MSK vs Flink
The SAA-C03 exam nearly always phrases streaming data Kinesis questions as "which service should the architect choose?" Use this decision tree.
Step 1 — Is the destination a single well-known sink?
- If the pipeline ends in Amazon S3, Amazon Redshift, Amazon OpenSearch, or Splunk with no multi-consumer fan-out and no stateful processing → Amazon Data Firehose. Zero-ops, lowest effort, cheapest on small scale.
Step 2 — Do multiple independent consumers need their own copy of the stream?
- If yes, and each consumer may read at different offsets, replay history, or need dedicated throughput → Amazon Kinesis Data Streams (classic or enhanced fan-out). Firehose cannot fan out to multiple arbitrary consumers.
Step 3 — Is the team already running Apache Kafka or does the payload exceed 1 MiB or need > 365 day retention?
- If yes → Amazon MSK (provisioned if predictable, MSK Serverless if bursty). MSK Connect handles Debezium CDC and S3 sink patterns.
Step 4 — Does the workload require stateful windowed processing, joins, or pattern detection?
- If yes → Amazon Managed Service for Apache Flink, typically chained in front of KDS or MSK.
Step 5 — Is throughput unpredictable and the team wants zero capacity planning?
- Prefer on-demand mode for KDS, MSK Serverless for Kafka, and Firehose for delivery. All three auto-scale.
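The five steps above can be encoded as a function — a simplified sketch of the decision tree, useful as a self-test while revising (the flag names are mine, not exam vocabulary):

```python
def choose_streaming_service(single_known_sink=False,
                             multi_consumer_or_replay=False,
                             kafka_or_large_or_long_retention=False,
                             stateful_windowing=False):
    """Simplified SAA-C03 streaming decision tree (steps 1-4 above)."""
    # Step 1: one well-known sink, nothing else special -> Firehose.
    if single_known_sink and not (multi_consumer_or_replay
                                  or kafka_or_large_or_long_retention
                                  or stateful_windowing):
        return "Amazon Data Firehose"
    # Step 3: Kafka ecosystem, >1 MiB messages, or >365-day retention -> MSK.
    if kafka_or_large_or_long_retention:
        return "Amazon MSK"
    # Step 4: stateful windowed processing -> Flink (fed by KDS or MSK).
    if stateful_windowing:
        return "Amazon Managed Service for Apache Flink"
    # Step 2: multiple independent consumers or replay -> KDS.
    if multi_consumer_or_replay:
        return "Amazon Kinesis Data Streams"
    return "re-read the scenario"
```

Real questions combine flags (e.g. stateful processing on top of a Kafka estate means Flink reading from MSK), so treat the function as a mnemonic, not an oracle.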
Question stem: "Ingest 500 MB/s of clickstream JSON, convert to Parquet, land in S3 for Athena, no custom consumers." Correct answer: Amazon Data Firehose. Wrong-but-tempting answer: Amazon Kinesis Data Streams plus Lambda plus S3. The Lambda-based chain is more code, more failure modes, and more expensive. If only one destination is needed and no replay/fan-out is required, always pick Amazon Data Firehose. Source ↗
Security — Encryption, VPC, and Access Control
Every streaming data Kinesis service supports the same AWS security primitives, but the exam loves to test the differences.
Encryption at rest
- KDS uses AWS KMS server-side encryption (SSE-KMS); enabled per stream, key is customer-managed or AWS-managed. Consumers must have kms:Decrypt on the key.
- Firehose encrypts the in-flight buffer with KMS; destination encryption is destination-specific (S3 SSE-S3/SSE-KMS, Redshift cluster encryption, OpenSearch domain encryption).
- MSK encrypts at rest with KMS on the broker EBS volumes.
Encryption in transit
- KDS, Firehose, and MSK all support TLS endpoints.
- MSK additionally supports in-cluster broker-to-broker TLS (must be enabled at cluster creation).
VPC endpoints and private connectivity
- KDS and Firehose have interface VPC endpoints so producers and consumers in a private VPC never traverse the public internet.
- MSK runs entirely inside your VPC by design; brokers live on customer ENIs.
IAM and resource policies
- KDS resource-based policies allow cross-account access without role assumption (launched 2023).
- Firehose access is granted via IAM policies on the delivery-stream ARN; the Firehose IAM role needs permission to write to the destination and read the source.
- MSK IAM auth lets Kafka clients authenticate with SigV4 instead of SASL/SCRAM or TLS mutual auth.
When a Kinesis stream in account A is encrypted with a customer-managed KMS key and a consumer in account B is granted read access via resource policy, the consumer STILL fails until account A's KMS key policy grants account B's role kms:Decrypt. This two-door pattern (IAM grant + KMS key-policy grant) is a classic SAA-C03 trap. Source ↗
Performance and Cost Patterns
KDS performance limits cheat sheet
- Per-shard write: 1 MiB/s or 1,000 records/s.
- Per-shard classic read: 2 MiB/s, 5 GetRecords/s, shared.
- Per-shard EFO read: 2 MiB/s per consumer, up to 20 consumers.
- Max record size: 1 MiB.
- Retention: 24h default, 7d standard extended, 365d long-term.
- On-demand default throughput: 200 MiB/s write, 400 MiB/s read per stream (quota-raisable).
Firehose performance limits cheat sheet
- Direct PUT throughput: starts at 1 MiB/s or 1,000 records/s per region per account (quota-raisable).
- Buffer size: 1 to 128 MiB, default 5 MiB.
- Buffer interval: 0 to 900 s, default 300 s.
- Max record size: 1 MiB.
MSK performance knobs
- Broker instance type (kafka.m7g.large up to kafka.m5.24xlarge).
- Number of brokers per AZ (one to three per AZ, three AZs).
- EBS storage per broker (1 GB to 16 TB, autoscaling optional).
- Replication factor (usually three across three AZs).
- In-cluster network throughput governs peer-replication headroom.
Cost knobs
- KDS provisioned: shard-hour + PUT payload units + extended retention shard-hour + long-term retention GB-month + EFO consumer-shard-hour + EFO data-retrieval GB.
- KDS on-demand: per-GB-ingested + per-GB-retrieved + GB-month retention.
- Firehose: per-GB-ingested (plus per-GB format conversion, per-GB dynamic partitioning, per-GB VPC delivery, per-GB backup to S3).
- MSK provisioned: broker-hour + EBS-GB-month + data-transfer + optional tiered storage GB-month.
- MSK Serverless: per-GB-ingested + per-GB-stored + partition-hour.
For under roughly 500 MiB/s of clickstream or log data with a single S3 destination, Amazon Data Firehose (direct PUT) is usually cheaper than Amazon Kinesis Data Streams + custom Lambda because you pay no shard-hours, no Lambda invocations, and no state management. Above that scale or with replay requirements, provisioned KDS plus Firehose-as-consumer tends to win on unit cost. Source ↗
High-Availability and Disaster Recovery Patterns
Same-region resilience
- KDS and Firehose are regional, multi-AZ by default. No configuration required.
- MSK requires you to distribute brokers across three AZs at cluster creation; a single-AZ MSK cluster is not recommended.
Cross-region replication
- KDS cross-region: no native mirror. Pattern is Lambda or Managed Service for Apache Flink consuming the source stream and writing to a stream in the target region.
- Firehose cross-region: destination can be in another region for HTTP endpoints and S3 (with extra data-transfer cost).
- MSK cross-region: use MSK Replicator (2023) for managed mirror-maker-like replication between two MSK clusters across regions.
Replay and backfill
KDS and MSK both support seeking by sequence number / offset / timestamp. Firehose does not support replay once a batch has been delivered — if you need to reprocess, you must re-ingest from the original source or read the S3 backup bucket.
Common Architectures on the SAA-C03 Exam
Clickstream to data lake
Web app -> KPL -> KDS -> Firehose (Parquet + dynamic partitioning) -> S3 -> Athena / QuickSight.
Uses KDS for replay, Firehose for zero-ops delivery, Parquet for scan cost, dynamic partitioning for Athena partition pruning.
IoT telemetry with real-time anomaly detection
IoT devices -> AWS IoT Core -> rule to KDS -> Managed Service for Apache Flink (windowed anomaly detection) -> KDS alerts stream -> Lambda -> SNS.
Flink handles stateful sessionisation; KDS holds seven-day replay history.
Log aggregation to OpenSearch and S3
Amazon CloudWatch Logs subscription filter -> Firehose -> (Lambda transform) -> OpenSearch + S3 backup.
Single-destination delivery, no fan-out needed, Firehose wins.
Change-data-capture from RDS to data lake
RDS MySQL -> Debezium on MSK Connect -> MSK -> MSK Connect S3 sink -> S3 Iceberg table.
Kafka ecosystem is the natural fit; MSK Connect hosts Debezium without you managing EC2.
Multi-tenant streaming platform
SaaS customers (many AWS accounts) -> cross-account KDS resource policy -> central stream in shared account -> Managed Service for Apache Flink for per-tenant aggregation -> Firehose -> per-tenant S3 prefixes via dynamic partitioning.
Operational Pitfalls and Exam Traps
Using the current epoch second as the partition key creates a rolling hot shard — every producer in the same second hashes to the same shard. Prefer high-cardinality keys (user_id, device_id, session_id) and let the shard hash do the time distribution. Source ↗
Amazon Data Firehose does NOT guarantee cross-record ordering to destinations. Batches are buffered, optionally transformed in parallel by Lambda, and landed with best-effort ordering. If global ordering matters, use KDS or MSK with a single partition per key and a single-threaded consumer, not Firehose.
KPL can aggregate many small records into a single Kinesis record of up to 1 MiB, then KCL de-aggregates on the consumer side. This dramatically lowers the PUT-payload-unit bill on high-frequency, small-payload producers (IoT, metrics) without consumer code changes. Source ↗
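Producer-side aggregation amounts to greedy bin packing under the 1 MiB record cap. The real KPL wraps batches in a protobuf envelope that KCL transparently unpacks; this sketch only models the size accounting:

```python
MAX_RECORD_BYTES = 1024 * 1024  # 1 MiB Kinesis record limit

def aggregate(payloads, max_bytes=MAX_RECORD_BYTES):
    """Greedily pack small payloads (bytes objects) into batches that
    each fit inside one Kinesis record. Illustrative only: real KPL
    adds a protobuf envelope with per-subrecord partition keys."""
    batches, current, size = [], [], 0
    for payload in payloads:
        if current and size + len(payload) > max_bytes:
            batches.append(current)      # flush the full batch
            current, size = [], 0
        current.append(payload)
        size += len(payload)
    if current:
        batches.append(current)
    return batches
```

Packing, say, a thousand 1 KB metrics into one PUT turns a thousand PUT payload units of requests into one record's worth, which is where the bill savings come from.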
Streaming Data Kinesis vs Other AWS Messaging — Quick Disambiguation
SAA-C03 mixes streaming data Kinesis with Amazon SQS, Amazon SNS, and Amazon EventBridge. The quick rule:
- Amazon SQS — point-to-point queue, single consumer per message, no order beyond FIFO queues, up to 14-day retention.
- Amazon SNS — pub/sub fan-out to subscribers, no retention, push-only.
- Amazon EventBridge — event bus with schema registry, content-based routing, SaaS partner events; low throughput per rule.
- Amazon Kinesis Data Streams — ordered replayable log, multi-consumer, high throughput, up to 365-day retention.
- Amazon MSK — same as KDS but Apache Kafka API and Kafka ecosystem.
- Amazon Data Firehose — fully managed delivery to a single destination; no consumer applications, no replay.
If the exam says "replay seven days of events for a new downstream consumer added next week", the answer is streaming data Kinesis (KDS or MSK), not SQS or SNS.
FAQ — Streaming Data Kinesis for SAA-C03
Q1. When should I pick Amazon Kinesis Data Streams on-demand over provisioned mode?
Pick on-demand when traffic is unpredictable, spiky by more than 2x the baseline, or you do not want to own resharding. On-demand charges per GB ingested and per GB retrieved; provisioned charges per shard-hour regardless of utilisation. For steady workloads above roughly 250 shard-hours per month, provisioned is typically 30-50 percent cheaper per GB. For new workloads, always start on-demand, observe for a month, and switch to provisioned if the baseline is stable.
Q2. What is the difference between classic Kinesis consumers and enhanced fan-out?
Classic consumers share 2 MiB/s per shard across all classic consumers, pull via GetRecords every 1 second or longer, and end-to-end latency is 200 ms to 1 s. Enhanced fan-out registers each consumer separately, gives each 2 MiB/s per shard dedicated, pushes via HTTP/2 for sub-200 ms latency, and scales to 20 consumers per stream. EFO costs extra per consumer-shard-hour and per GB retrieved, so use it only when isolation or latency matters.
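The capacity gap is easy to quantify. The 2 MiB/s per-shard read limit is the real service quota; the shard and consumer counts are arbitrary example figures:

```python
SHARD_READ_MIBPS = 2   # read throughput limit per shard
shards = 10
classic_consumers = 3

# Classic (shared-throughput) consumers split each shard's 2 MiB/s.
per_consumer_classic = SHARD_READ_MIBPS * shards / classic_consumers

# Enhanced fan-out: every registered consumer gets its own dedicated
# 2 MiB/s per shard, independent of the other consumers.
per_consumer_efo = SHARD_READ_MIBPS * shards

print(round(per_consumer_classic, 1), per_consumer_efo)  # 6.7 vs 20
```

Adding a fourth classic consumer shrinks everyone's share again; adding a fourth EFO consumer changes nothing for the existing three.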
Q3. Can Amazon Data Firehose deliver to more than one destination from a single delivery stream?
No. Each Firehose delivery stream has exactly one primary destination. The optional S3 backup bucket is a failure-case bucket, not a second destination. If you need the same data in S3 and OpenSearch and Splunk, either (a) create three separate Firehose delivery streams, each with its own source, or (b) put a Kinesis Data Stream in front and have three Firehose delivery streams read from the same KDS.
Q4. How do I prevent a hot shard in Amazon Kinesis Data Streams?
Design the partition key for high cardinality — user_id, device_id, session_id, request_id. Avoid low-cardinality keys (country code, status flag, constant). If a natural key is skewed, append a random 0-9 suffix to create synthetic sub-keys, then reaggregate downstream. Monitor IncomingBytes and IncomingRecords per shard with CloudWatch and reshard (split the hot shard) or redesign the key if one shard consistently exceeds 80 percent utilisation. Remember that simply adding shards does not fix a hot key — the hash still lands on the same shard.
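The salting trick looks like this in practice. A minimal sketch, assuming a skewed key `tenant-42` and ten salt buckets (both hypothetical); the producer appends the salt, the consumer strips it to reaggregate:

```python
import random
from collections import defaultdict

def salted_key(hot_key: str, salt_buckets: int = 10) -> str:
    # Spread one skewed partition key across N synthetic sub-keys,
    # each of which hashes to a (likely) different shard.
    return f"{hot_key}-{random.randrange(salt_buckets)}"

# Producer side: "tenant-42" dominates traffic, so salt it.
keys = [salted_key("tenant-42") for _ in range(1_000)]
assert len(set(keys)) == 10   # ten sub-keys instead of one hot key

# Consumer side: strip the salt suffix to reaggregate per original key.
totals = defaultdict(int)
for k in keys:
    original, _, _salt = k.rpartition("-")
    totals[original] += 1
assert totals["tenant-42"] == 1_000
```

The trade-off: per-key ordering now only holds within each sub-key, so only salt keys whose consumers can tolerate (or re-sort) that.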
Q5. When should I choose Amazon MSK over Amazon Kinesis Data Streams?
Choose Amazon MSK when (a) the team already uses Apache Kafka and wants to lift-and-shift without rewriting producers and consumers, (b) you need Kafka-native tooling like Debezium, Kafka Streams, Schema Registry, ksqlDB, or (c) payload size exceeds the 1 MiB KDS limit. Choose Amazon Kinesis Data Streams when the team has no Kafka expertise, wants tight integration with AWS Lambda triggers, IAM-native auth, KMS encryption, and is happy with the 1 MiB record ceiling and 365-day retention. MSK Serverless removes broker sizing and is the closest Kafka equivalent to KDS on-demand.
Q6. Can I replay old records from Amazon Data Firehose?
No. Firehose is a one-way delivery pipeline — once a batch is pushed to the destination, Firehose does not keep a copy you can replay. The S3 backup bucket (if configured) holds failed records only. If replay is a requirement, put Amazon Kinesis Data Streams in front (1 to 365 day retention) with Firehose as the consumer, and replay from KDS by resetting the consumer checkpoint.
Q7. How does cross-account access work for Amazon Kinesis Data Streams?
Two paths exist. (a) Resource-based policy on the stream (launched 2023) grants specific actions (SubscribeToShard, GetRecords, DescribeStream) to principals in other accounts — no role assumption needed. (b) Classic cross-account IAM role in the stream-owning account, assumed by the consumer account via STS. If the stream is encrypted with a customer-managed KMS key, the KMS key policy must ALSO grant the consumer account kms:Decrypt, otherwise reads fail with AccessDenied at the KMS layer even when the IAM side is correct.
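As an illustration of path (a), a stream resource policy of roughly this shape grants the Kinesis side; the account IDs, region, and stream name are placeholders. The KMS key policy in the stream-owning account still needs a matching statement allowing `kms:Decrypt` for the same consumer principal:

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "CrossAccountConsumer",
    "Effect": "Allow",
    "Principal": { "AWS": "arn:aws:iam::222222222222:root" },
    "Action": [
      "kinesis:DescribeStream",
      "kinesis:GetShardIterator",
      "kinesis:GetRecords",
      "kinesis:SubscribeToShard"
    ],
    "Resource": "arn:aws:kinesis:us-east-1:111111111111:stream/orders"
  }]
}
```

On the exam, "IAM is correct but reads still fail with AccessDenied" on an encrypted stream almost always points at the missing KMS key-policy grant.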
Summary — Streaming Data Kinesis Cheat Sheet for SAA-C03
- Amazon Kinesis Data Streams — ordered, replayable, multi-consumer log. Shards, partition keys, 1 MiB records, 24h default / 365d extended retention. On-demand for unknown traffic, provisioned for steady baseline.
- Amazon Data Firehose — zero-ops delivery to S3, Redshift, OpenSearch, Splunk, HTTP. Buffer size/interval, Lambda transform, dynamic partitioning, Parquet/ORC conversion. Single destination, no replay.
- Amazon Managed Service for Apache Flink — serverless Flink for stateful stream processing, windows, joins, pattern detection. Java/Scala/Python or Zeppelin Studio SQL. Scales on KPUs.
- Amazon MSK — managed Apache Kafka. Provisioned for predictable workloads with Kafka ecosystem needs; MSK Serverless for bursty traffic; MSK Connect for Debezium CDC and S3 sinks; MSK Replicator for cross-region mirroring.
- Hot shards are fixed by better partition keys, not by more shards.
- Enhanced fan-out gives each consumer a dedicated 2 MiB/s per shard with HTTP/2 push.
- Firehose is NOT pub/sub — one destination per delivery stream.
- KMS cross-account requires both IAM and KMS key-policy grants.
- Replay requirement means KDS or MSK, never Firehose alone.
Master this streaming data Kinesis playbook and you will pick the right service on the first read of every SAA-C03 streaming question — the difference between a borderline pass and a comfortable 80-plus score.