Why Online User Counting Starts With Heartbeats

The online user count in a live room looks like a simple number, but building one made me more careful about what it actually means.

If I only count connections, page refreshes, network jitter, background tabs, and abnormal disconnects all make the number unreliable. The business sees "how many people are watching now". The system has to answer what behavior still counts as online.

Online Is Not the Same as Connected

After a user enters a live room, the connection may drop, the page may remain open, the browser may go into the background, or the network may briefly disconnect. Connection state is only one signal. It is not the same as real presence.

I prefer using heartbeats to express viewing state. As long as the client keeps reporting within the rules, the system treats it as active. After a defined period without a heartbeat, the user is removed from the online set.

This does not produce perfect truth. It produces a rule that can be explained.
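A minimal sketch of such a rule, in Python. The 10-second interval and the allowance of three missed beats are assumptions chosen for illustration, not values from a real system; the point is only that the timeout is written down and can be explained.

```python
# Assumed values for illustration: the client reports every 10 seconds,
# and up to 3 missed beats are tolerated to absorb brief network jitter.
HEARTBEAT_INTERVAL_S = 10.0
MISSED_BEATS_ALLOWED = 3


def is_online(
    last_heartbeat_at: float,
    now: float,
    interval_s: float = HEARTBEAT_INTERVAL_S,
    missed_allowed: int = MISSED_BEATS_ALLOWED,
) -> bool:
    """Return True if the last heartbeat falls inside the timeout window."""
    timeout_s = interval_s * missed_allowed
    return (now - last_heartbeat_at) <= timeout_s
```

With these assumed values, a session whose last heartbeat was 20 seconds ago still counts as online, and one that has been silent for more than 30 seconds does not.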

Cache Fits Short-Term State

Online state changes quickly. Writing every heartbeat directly to the database is not a good fit. Cache is better for this short-lived state.

I keep room, user, or session-level state in cache, then use expiration time and cleanup logic to maintain the online set. The database stores more stable records: live sessions, historical peaks, and statistical snapshots.

This reduces database pressure and separates realtime statistics from historical data.
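The split can be sketched roughly as follows. `PresenceCache` is a hypothetical in-memory stand-in for a cache with key expiration, and the `history` list stands in for a database table; both names, and the 30-second TTL, are assumptions for illustration.

```python
import time
from typing import Callable


class PresenceCache:
    """Hypothetical in-memory stand-in for a cache with key expiration."""

    def __init__(self, ttl_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_s = ttl_s
        self.clock = clock
        # (room_id, session_id) -> time at which the entry expires
        self._expires_at: dict[tuple[str, str], float] = {}

    def heartbeat(self, room_id: str, session_id: str) -> None:
        # Each heartbeat only refreshes the expiry; nothing touches the database.
        self._expires_at[(room_id, session_id)] = self.clock() + self.ttl_s

    def online_count(self, room_id: str) -> int:
        # Entries past their expiry are ignored even before they are cleaned up.
        now = self.clock()
        return sum(
            1
            for (room, _session), expires in self._expires_at.items()
            if room == room_id and expires > now
        )


def take_snapshot(cache: PresenceCache, room_id: str, history: list[dict]) -> dict:
    """Append one low-frequency snapshot row to the durable history."""
    row = {"room_id": room_id, "at": cache.clock(), "online": cache.online_count(room_id)}
    history.append(row)
    return row
```

Heartbeats arrive many times per minute per viewer, while `take_snapshot` might run once every few seconds per room, so the durable store sees a small, steady write rate regardless of audience size.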

Cleanup Matters More Than Entry

In online-user counting, entering a room is easy to handle. Leaving is harder.

Users do not always trigger a normal leave event. Closing the browser, losing network, or locking a phone may leave the system without an explicit exit signal. Without expiration cleanup, the online count only becomes more inflated over time.

So I treat heartbeat timeout and state cleanup as core logic, not a supplement. The credibility of the count depends heavily on how the system handles silent users.
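As a sketch of that priority, the example below treats the explicit leave event as an optimization and the timeout sweep as the path that must always work. The function names and the plain dictionary of last-seen timestamps are assumptions for illustration.

```python
def leave(last_seen: dict[str, float], session_id: str) -> None:
    """Handle an explicit leave event. Safe to call for unknown or repeated ids."""
    last_seen.pop(session_id, None)


def sweep_silent_sessions(
    last_seen: dict[str, float], now: float, timeout_s: float
) -> list[str]:
    """Remove sessions silent for longer than timeout_s and return their ids."""
    expired = [sid for sid, seen in last_seen.items() if now - seen > timeout_s]
    for sid in expired:
        del last_seen[sid]
    return expired
```

A periodic job would call `sweep_silent_sessions` on a fixed schedule. Returning the removed ids makes it possible to log how many users left silently versus through a normal exit.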

Operations Need Trends, Not Only Current Values

The business value of online counts is not only the current number.

Operations care when the live room reaches its peak, which entry point brings in more viewers, whether comments and online changes move together, and how sessions compare historically. Current online count is only the start. Peaks, trends, and time distribution are more valuable.

If the underlying online rule is unclear, later statistics lose credibility.
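A small sketch of how peaks and a per-minute trend could be derived from stored snapshots. The `(timestamp_seconds, online_count)` tuple shape is an assumption for illustration, not a real schema.

```python
from collections import defaultdict

# Assumed snapshot shape: (timestamp in seconds, online count at that moment).
Snapshot = tuple[float, int]


def peak(snapshots: list[Snapshot]) -> Snapshot:
    """Return the snapshot with the highest count (earliest one on ties).

    Raises ValueError if snapshots is empty.
    """
    return max(snapshots, key=lambda s: s[1])


def per_minute_max(snapshots: list[Snapshot]) -> dict[int, int]:
    """Bucket snapshots by minute and keep the highest count in each bucket."""
    buckets: dict[int, int] = defaultdict(int)
    for at, online in snapshots:
        minute = int(at // 60)
        buckets[minute] = max(buckets[minute], online)
    return dict(sorted(buckets.items()))
```

Both functions are only as trustworthy as the counts inside the snapshots, which is why the online rule has to be settled first.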

One Number Contains a Set of Rules

Online-user statistics reminded me that many simple-looking metrics need explicit rules behind them.

When does a user count as entering? When do they count as leaving? How long without a heartbeat means expiration? What happens if the same user opens multiple pages? Does backgrounding affect statistics? If these questions are not answered, the number is only decoration.
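One way to make those answers explicit is to write them down as a small rules object that the counting function must consult. The field names, defaults, and the `Session` shape below are hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OnlineRules:
    """Hypothetical rule set; every default here is an assumption."""
    heartbeat_timeout_s: float = 30.0
    count_by: str = "user"          # "user" collapses multiple pages; "session" counts each
    count_background: bool = True   # whether backgrounded pages still count


@dataclass(frozen=True)
class Session:
    session_id: str
    user_id: str
    last_heartbeat_at: float
    in_background: bool = False


def count_online(sessions: list[Session], now: float, rules: OnlineRules) -> int:
    """Count online viewers according to the stated rules."""
    active = [
        s for s in sessions
        if now - s.last_heartbeat_at <= rules.heartbeat_timeout_s
        and (rules.count_background or not s.in_background)
    ]
    if rules.count_by == "user":
        # The same user with several open pages is counted once.
        return len({s.user_id for s in active})
    return len(active)
```

Changing a rule then becomes a visible configuration change rather than a silent shift in what the number means.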

When I look at realtime statistics now, I first ask whether the number can be explained. An explainable number may not be perfect, but it can help operations judge. An unexplainable number can mislead the business even when it looks realtime.


©2026 Eddie Xu. All rights reserved