Gangnam Style view count broke YouTube once. Why is counting a f***ing hard engineering problem?
count++ isn’t simple counting
count++. Seems easy, right?
It is. Until it’s every view on YouTube.
Then the same one-line operation turns into the most architecturally consequential problem in your stack.
Simple lie
Everyone starts small. For all the right reasons.
Throw it in Postgres, add an index, scale up the box if needed, sprinkle some Redis if it gets hot. Done. Like reallllyyy done.
Right up until the moment your platform goes viral. That’s the kind of problem YouTube deals with every day.
A video goes viral and your “simple counter row” becomes the slowest object in your entire system.
Counting itself isn’t one problem. It’s a combination of 7 problems wearing a trench coat, and YouTube has dealt with all of them.
The hard problems
1. Counter overflow. Sounds boring. It’s not.
“Gangnam Style” pushed past 2^31 - 1 (2,147,483,647) views, the largest value a signed 32-bit integer can hold, and the view counter could no longer represent it. YouTube migrated to 64-bit and bought themselves roughly nine quintillion views of headroom.
But that’s a relatively easy gotcha.
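To see what that overflow looks like, here is a minimal sketch in Python. It assumes the counter behaves like a fixed-width signed integer. `increment_int32` and `increment_int64` are hypothetical helper names, and `ctypes` is used only to mimic the wraparound, not to suggest how YouTube actually stored the value.

```python
import ctypes

INT32_MAX = 2**31 - 1  # 2,147,483,647, the largest signed 32-bit value


def increment_int32(count: int) -> int:
    """Increment the way a signed 32-bit field would: wrap around on overflow."""
    return ctypes.c_int32(count + 1).value


def increment_int64(count: int) -> int:
    """Same increment, but stored in a signed 64-bit field."""
    return ctypes.c_int64(count + 1).value


print(increment_int32(INT32_MAX))  # -2147483648: the counter goes negative
print(increment_int64(INT32_MAX))  # 2147483648: plenty of room left
```

One extra view past the limit and the 32-bit field flips to a large negative number, while the 64-bit field just keeps counting.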
2. Volume & Concurrency. A million events per second means a million writes per second.
At YouTube’s scale, think on the order of a million view events per second (a round number, for illustration). But raw volume isn’t even the worst part. Concurrency is.
If you use Postgres, every increment on a hot counter takes a row lock. Postgres uses MVCC, so an UPDATE writes a new row version and leaves the old one behind as a dead tuple, meaning your hot row is generating thousands of dead versions per second that vacuum has to chase down.
The row lock serializes everyone. The fsync on commit caps you again.
You’ve got a 96-core machine watching one logical row’s lock chain decide the throughput of your entire product. The fanciest hardware in the world bottlenecked by one number.
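The ceiling is easy to put numbers on. A rough sketch, assuming every increment has to hold the same row lock for the full duration of its commit. The hold times below are made-up illustrations, not Postgres measurements, and `max_increments_per_second` is a hypothetical helper.

```python
def max_increments_per_second(lock_hold_ms: float) -> float:
    """Upper bound on throughput when every writer must take the same lock in turn."""
    return 1000.0 / lock_hold_ms


TARGET_PER_SECOND = 1_000_000  # the "million events per second" from the text

for hold_ms in (0.1, 1.0, 5.0):
    ceiling = max_increments_per_second(hold_ms)
    shortfall = TARGET_PER_SECOND / ceiling
    print(f"lock held {hold_ms} ms -> at most {ceiling:,.0f} increments/s "
          f"({shortfall:,.0f}x short of the target)")
```

Notice that the core count never appears in the formula. That is the whole point: serialized work doesn’t get faster when you add hardware.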
3. Distribution. Events happen in Tokyo. The counter lives in Iowa.
Physics is in the room, and it doesn’t care about your roadmap. You cannot make this fast if every increment needs a round trip to one global database.
4. Late events. A view happens at 09:59:55 and arrives at 10:00:55.
View events usually flow through a stream processor that tracks progress with watermarks. That event belongs to the 9 AM bucket, and the 9 AM bucket has already closed.
Now what? How long do you wait? Do you process that late event? What about the event that shows up tomorrow?
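One common answer is “allowed lateness”: keep a closed bucket open to stragglers for a fixed grace period, and route anything later to a correction path. Here is a minimal sketch of that idea in plain Python, not any particular stream processor’s API. `HourlyViewCounter`, `hms`, and the lateness values are hypothetical, and the watermark is advanced by hand to keep the example small.

```python
from collections import defaultdict

WINDOW_SECONDS = 3600  # hourly buckets


def hms(h: int, m: int, s: int) -> int:
    """Seconds since midnight, to keep the example readable."""
    return h * 3600 + m * 60 + s


class HourlyViewCounter:
    def __init__(self, allowed_lateness: int):
        self.allowed_lateness = allowed_lateness  # grace period in seconds
        self.watermark = 0                        # "nothing older than this is expected"
        self.counts = defaultdict(int)            # bucket start -> views
        self.too_late = []                        # events for a correction path

    def advance_watermark(self, now: int) -> None:
        self.watermark = max(self.watermark, now)

    def on_event(self, event_time: int) -> bool:
        """Count the event if its bucket is still accepting writes."""
        bucket_start = event_time - event_time % WINDOW_SECONDS
        bucket_end = bucket_start + WINDOW_SECONDS
        if bucket_end + self.allowed_lateness <= self.watermark:
            self.too_late.append(event_time)      # bucket is sealed for good
            return False
        self.counts[bucket_start] += 1
        return True


# The view from the text: happened at 09:59:55, arrives at 10:00:55.
patient = HourlyViewCounter(allowed_lateness=120)
patient.advance_watermark(hms(10, 0, 55))
print(patient.on_event(hms(9, 59, 55)))  # True: still inside the grace period

strict = HourlyViewCounter(allowed_lateness=30)
strict.advance_watermark(hms(10, 0, 55))
print(strict.on_event(hms(9, 59, 55)))   # False: the 9 AM bucket is sealed
```

The grace period is the knob. A longer one means more accurate buckets but later results; a shorter one means fast results and more corrections after the fact.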
5. Exactness is contested. Did the user actually watch, or click and bounce? Bot? Self-view? Same person refreshing?
6. Failures. A consumer crashes mid-batch. A network splits. Did the event get counted? Twice? Lost? Pick: at-most-once, at-least-once, exactly-once. Each one wrecks your architecture in a different direction.
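To make the “Twice?” case concrete, here is a small sketch of at-least-once delivery hitting a counter, with and without deduplication by event ID. The `ViewCounter` class and the event IDs are hypothetical, and the ever-growing `seen_ids` set is a simplification that a real system would have to bound.

```python
class ViewCounter:
    def __init__(self, dedupe: bool):
        self.dedupe = dedupe
        self.seen_ids = set()  # simplification: grows without bound
        self.count = 0

    def handle(self, event_id: str) -> None:
        if self.dedupe:
            if event_id in self.seen_ids:
                return         # redelivery: already counted, ignore it
            self.seen_ids.add(event_id)
        self.count += 1


# A consumer crashed after processing view-2 but before acknowledging it,
# so the broker delivers view-2 again.
deliveries = ["view-1", "view-2", "view-2", "view-3"]

naive = ViewCounter(dedupe=False)
idempotent = ViewCounter(dedupe=True)
for event_id in deliveries:
    naive.handle(event_id)
    idempotent.handle(event_id)

print(naive.count)       # 4: the redelivered view is counted twice
print(idempotent.count)  # 3: at-least-once delivery plus an idempotent handler
```

The deduplicating version gets the right answer, but it pays for it with state that has to be stored, replicated, and eventually expired.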
7. Hot keys: no matter how you shard, one celebrity’s post will overwhelm one physical row.
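A standard way to relieve a hot key is to stop having one row. The sketch below shows a sharded counter: writes land on a random sub-counter, and reads add them up. It is an illustration of the general technique, not a description of YouTube’s implementation. `ShardedCounter` and the shard count are hypothetical, and each list slot stands in for a separate physical row.

```python
import random


class ShardedCounter:
    def __init__(self, num_shards: int = 16):
        # Each slot stands in for a separate row (e.g. "video123#07").
        self.shards = [0] * num_shards

    def increment(self, amount: int = 1) -> None:
        # Spread writes so no single row absorbs all of them.
        shard = random.randrange(len(self.shards))
        self.shards[shard] += amount

    def value(self) -> int:
        # Reads get more expensive: they must visit every shard.
        return sum(self.shards)


views = ShardedCounter(num_shards=16)
for _ in range(10_000):
    views.increment()

print(views.value())  # 10000
```

The trade is explicit: writes get cheaper and reads get more expensive, which is usually the right direction for a view counter.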
YouTube uses Bigtable
Of course, YouTube doesn’t use Postgres for view counts. They use Bigtable.
The mental model:
LSM tree = append to log + insert into memtable
no row locks, no in-place updates, no vacuum
Every Bigtable write is just an append to a commit log plus an in-memory insert into a sorted structure.
No random disk write. No lock that serializes writers to the same row; writes are timestamped and merged later.
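The write path described above can be sketched as a toy LSM store. This is a teaching model under heavy assumptions, not Bigtable’s code: `TinyLSM` is a hypothetical name, the “commit log” is a Python list rather than a file, and compaction and log truncation are left out.

```python
class TinyLSM:
    def __init__(self, memtable_limit: int = 3):
        self.commit_log = []   # stand-in for the append-only log on disk
        self.memtable = {}     # in-memory buffer (real ones are kept sorted)
        self.sstables = []     # immutable sorted runs, newest last
        self.memtable_limit = memtable_limit

    def put(self, key: str, value: int) -> None:
        self.commit_log.append((key, value))  # 1. sequential append, no seek
        self.memtable[key] = value            # 2. in-memory insert, no row lock
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self) -> None:
        # Write the memtable out as one sorted, immutable run.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}

    def get(self, key: str):
        # Newest data wins: check the memtable, then runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.sstables):
            if key in run:
                return run[key]
        return None


store = TinyLSM(memtable_limit=3)
store.put("video-a", 1)
store.put("video-b", 1)
store.put("video-c", 1)   # memtable is full, so it is flushed to a sorted run
store.put("video-a", 2)   # the newer value lands in the fresh memtable

print(store.get("video-a"))  # 2: memtable shadows the older flushed value
print(store.get("video-b"))  # 1: served from the flushed run
```

Nothing is ever updated in place. The old value of video-a still sits in the flushed run until a background compaction, omitted here, merges it away.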
When a tablet (a contiguous range of rows) gets too hot, it splits automatically and the load redistributes. A single row can’t be split, though, which is why hot keys made the list above.
A single Bigtable cluster eats millions of writes per second without sweating. The view counter you see ticking up under a viral video? That’s an atomic increment on one row, absorbed at memory speed.
But Bigtable can’t tell you the top 100 most-viewed videos. There’s no ORDER BY view_count DESC LIMIT 100. No secondary indexes by default.
That’s a whole different topic. We might cover it separately.
But... do we even need to count precisely?
Sometimes the right count is a wrong, approximate one.
Reddit uses it. Google Analytics uses it. The number you see is statistically a lie (close enough though). It’s also correct enough that nobody cares.
Count-Min Sketch does the trick for approximate frequency counts.
It fits the kind of work YouTube has to do: tracking real-time view frequencies, spotting trending topics, and identifying the "Top K" most viewed across massive global traffic.
A Count-Min Sketch can be visualised as a tiny 2D grid of counters (say 5 rows × 1000 columns) paired with one hash function per row.
To record an item, you hash it through each function and bump the cell it lands on in every row: 5 hashes, 5 increments, done.
To ask “how many times have I seen this?”, you hash it again, look up those same 5 cells, and return the minimum value. Why min?
Because different items can collide on the same cell and inflate it, but they almost never collide in all rows, so the smallest cell is your cleanest read.
You get approximate frequency counts for billions of items in a few kilobytes of memory. The estimate is always an upper bound (it lies high, never low). And two sketches merge by just adding their cells together, which is why distributed streaming systems keep reaching for it.
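Here is a minimal Count-Min Sketch in Python that follows the description above: a 5 × 1000 grid, one hash per row, increment on write, minimum on read, cell-wise addition on merge. Deriving the per-row hashes by prefixing the row number to a SHA-256 input is an assumption made for simplicity; production implementations typically use faster hash families.

```python
import hashlib


class CountMinSketch:
    def __init__(self, depth: int = 5, width: int = 1000):
        self.depth = depth  # number of rows, one hash function each
        self.width = width  # counters per row
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item: str):
        # One (row, column) pair per row; the row number salts the hash.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def add(self, item: str, count: int = 1) -> None:
        for row, col in self._cells(item):
            self.table[row][col] += count

    def estimate(self, item: str) -> int:
        # Collisions only inflate cells, so the minimum is the least polluted read.
        return min(self.table[row][col] for row, col in self._cells(item))

    def merge(self, other: "CountMinSketch") -> None:
        if (self.depth, self.width) != (other.depth, other.width):
            raise ValueError("sketches must have the same shape to merge")
        for row in range(self.depth):
            for col in range(self.width):
                self.table[row][col] += other.table[row][col]


tokyo = CountMinSketch()
iowa = CountMinSketch()
for _ in range(3):
    tokyo.add("gangnam-style")
iowa.add("gangnam-style", 2)
iowa.add("cat-video")

tokyo.merge(iowa)  # combine regional sketches by adding cells
print(tokyo.estimate("gangnam-style"))  # at least 5, never less
```

The merge is what makes this fit the distribution problem from earlier: each region can keep its own sketch and ship it to a central one, with no per-event round trip.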
Summary
This is the mental model: counting accuracy is a knob, not a constant. Pick it per access pattern and distribution.
The dashboard showing "1.2M views" can be an estimate that's off by 1% (a HyperLogLog count of unique viewers, say) and nobody cares.
But if you are not viral, then a single tick from 99 views to 100 views may matter to you.
Think differently when designing your systems.

