---
title: 'Kafka Basics: Tables vs Streams'
teaser: 'Learn the difference between a Table and a Stream in Kafka and when to use
  them.

  '
tags: kafka,data
author: Edward Loveall
published_on: 2019-08-26
---

When consuming topics with [Kafka Streams],
there are two kinds of data you'll want to work with.
One is a stream
and one is a table.

[kafka streams]: https://kafka.apache.org/documentation/streams/

Let's look at some data:

### Users

```
| key | data                       |
| --- | -------------------------- |
| 17  | name: Gail,  color: Green  |
| 201 | name: Oscar, color: Red    |
| 11  | name: Sam,   color: Purple |
| 201 | name: Oscar, color: Orange |
```

### Purchases

```
| key | data                           |
| --- | ------------------------------ |
| 384 | title: Soap,       price: $7   |
| 385 | title: TV,         price: $500 |
| 386 | title: Basketball, price: $15  |
| 387 | title: Sunglasses, price: $24  |
```

These look like tables,
but don't be fooled.
They are streams.
Every time new data is produced for one of these streams,
a new record
(a `key` with attached `data`)
is added to the end of the stream.
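
As a rough illustration of that append-only behavior,
here is a minimal sketch in plain Python.
It does not use Kafka at all:
the list `users_stream` and the helper `produce` are hypothetical stand-ins
for a topic and a producer.

```python
# A plain list standing in for a topic (an assumption for illustration only).
users_stream = []


def produce(stream, key, data):
    """Append a new record to the end; earlier records are never modified."""
    stream.append((key, data))


produce(users_stream, 201, {"name": "Oscar", "color": "Red"})
produce(users_stream, 201, {"name": "Oscar", "color": "Orange"})

# Both records for key 201 are kept, in the order they were produced.
print(len(users_stream))
```

Producing a second record for the same key does not replace the first one;
the stream simply grows.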

The data is mostly self-explanatory,
but I'll point out that the **Users** topic has two entries for `Oscar`
where he starts with the color `Red`
and changes it to `Orange`.
This will be used later.

## All Data Are Streams

To clear one thing up,
all Kafka topics are stored as a stream.
The difference is:
when we want to consume that topic,
we can either consume it as a table
or a stream.
Let's look at how they're different.

## Tables

Take the **Users** topic above.
If we want to look at all of our users
and their chosen color,
we only want to see the _latest_ version of each user
and their color.
We only want to see `Oscar` once,
with his current `color`.

This is what the [KTable] type in Kafka Streams does.
It takes the stream of records from a topic
and reduces it down to the latest record for each key.

[ktable]: https://kafka.apache.org/10/javadoc/org/apache/kafka/streams/kstream/KTable.html
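
To make the idea concrete,
here is a minimal sketch in plain Python of what "latest record per key" means.
It is not the Kafka Streams API;
`users_topic` and `as_table` are hypothetical names,
and the records are the ones from the **Users** example above.

```python
# Records from the Users topic, in the order they were produced.
users_topic = [
    (17, {"name": "Gail", "color": "Green"}),
    (201, {"name": "Oscar", "color": "Red"}),
    (11, {"name": "Sam", "color": "Purple"}),
    (201, {"name": "Oscar", "color": "Orange"}),
]


def as_table(records):
    """Reduce a sequence of (key, value) records to the latest value per key."""
    table = {}
    for key, value in records:
        table[key] = value  # a later record for the same key overwrites the earlier one
    return table


users_table = as_table(users_topic)
for key, user in users_table.items():
    print(key, user["name"], user["color"])
```

Four records go in and three entries come out:
`Oscar` appears once, with the color `Orange`.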

## Streams

When we want to work with a stream,
we grab _all records_ from it.
A good example is the **Purchases** stream above.
If we want to see how much money we made,
we go through every record in our **Purchases** topic,
add up all the prices,
and get our number.

This is what the [KStream] type in Kafka Streams is for.

[kstream]: https://kafka.apache.org/10/javadoc/org/apache/kafka/streams/kstream/KStream.html
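
The same calculation can be sketched in plain Python.
Again, this is not the Kafka Streams API;
`purchases_topic` and `total_price` are hypothetical names,
and prices are assumed to be whole dollars as in the **Purchases** example above.

```python
# Records from the Purchases topic, in the order they were produced.
purchases_topic = [
    (384, {"title": "Soap", "price": 7}),
    (385, {"title": "TV", "price": 500}),
    (386, {"title": "Basketball", "price": 15}),
    (387, {"title": "Sunglasses", "price": 24}),
]


def total_price(records):
    """Visit every record in the stream and sum the prices."""
    return sum(value["price"] for _key, value in records)


total = total_price(purchases_topic)
print(total)  # 7 + 500 + 15 + 24 = 546
```

Every record contributes to the result,
so nothing is collapsed by key the way it would be in a table.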

## Tables For Nouns, Streams For Verbs

I've found it helpful to think of tables as representing nouns
(users, songs, cars)
and streams as verbs
(buys, plays, drives).
This is because with a noun,
we mostly want the current state of that noun:
the current document
or the current flight.
But with verbs,
we need to see the trail of how we got here:
the history of edits to this document
or the path this plane took to its destination.

## Resources

While they are slightly different,
a table is also sometimes called a `changelog stream`.
In truth, [everything is a stream](#all-data-are-streams)
and KTables are an _abstraction_ over that stream.
Similarly, a stream is sometimes called a `record stream`
and the same abstraction principle applies.
You may see this terminology come up when looking into Kafka.

- https://docs.confluent.io/current/streams/concepts.html
