Live Streaming Basics

RTA live streaming enables a client to follow the leading edge of a live session without polling.

Important

RTA live streaming is designed for use downstream from storage, rather than as an upstream telemetry distribution protocol.

The REST API must be available to provide Configuration, and ATLAS expects to be able to retrieve the Session and read any flushed Data.

WebSocket Communication

When a user loads a Session from the REST API, the client checks whether the Session is in the "open" state. If it is, and a WebSocket URI has been configured, the client tries to open a connection using the Session Identity as part of the URI.

A WebSocket URI uses the ws:// or wss:// scheme. These are the WebSocket counterparts of http:// and https://, so wss:// is the TLS-secured form.

Example

For session identity abc123:

wss://example.com/rta/v2/sessions/abc123/stream
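The client-side decision described above can be sketched as follows. This is a minimal illustration, not the ATLAS implementation: the function name `stream_uri`, the idea of a configured base URI, and the `/sessions/<identity>/stream` path layout are assumptions inferred from the example URI.

```python
from typing import Optional
from urllib.parse import quote


def stream_uri(session_state: str,
               session_identity: str,
               base_uri: Optional[str]) -> Optional[str]:
    """Return the WebSocket URI for a session, or None if streaming does not apply.

    Hypothetical helper: the path layout mirrors the example URI above and
    may differ in a real deployment.
    """
    # Only sessions in the "open" state have a leading edge to follow,
    # and streaming is skipped when no WebSocket URI has been configured.
    if session_state != "open" or not base_uri:
        return None
    if not base_uri.startswith(("ws://", "wss://")):
        raise ValueError("WebSocket URI must use the ws:// or wss:// scheme")
    # Percent-encode the identity so it is safe to embed as one path segment.
    identity = quote(session_identity, safe="")
    return f"{base_uri.rstrip('/')}/sessions/{identity}/stream"
```

For example, `stream_uri("open", "abc123", "wss://example.com/rta/v2")` yields the URI shown above, while a session that is not open, or a missing base URI, yields `None`.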

The server immediately starts sending session metadata updates using a simple protobuf-based protocol, and clients can send requests to subscribe to events and data. The flow of data is selective and predominantly unidirectional, so it is both efficient and insensitive to connection latency.

Redis Communication

The Stream Service provides a reference implementation of the WebSocket protocol and receives its data for distribution via Redis, which acts as a broker between the ingest points and the edge services.

This decoupling helps ensure that edge services can be scaled and made fault-tolerant, and that ingest processes can be short-lived and unaffected by client activity.

Data is carried using a simple protocol built on Redis Streams, which also provides buffering to cover the gap between flushed data — which should be available via the REST API — and the leading edge. This gap will vary depending on the storage technology: it could be milliseconds or hours.
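One way to reason about the size of that buffer is as a time-based retention window. Redis Streams entry IDs default to the form `<milliseconds>-<sequence>`, and Redis 6.2 and later can trim a stream by minimum ID (`XTRIM ... MINID`). The sketch below only computes the trim threshold. The function name `trim_threshold_id` is hypothetical, and it assumes the stream uses the default time-based IDs, which this page does not confirm for the RTA protocol.

```python
import time
from typing import Optional


def trim_threshold_id(retention_seconds: float,
                      now_ms: Optional[int] = None) -> str:
    """Return the smallest stream entry ID still inside the retention window.

    Assumes default Redis Streams IDs of the form "<ms>-<seq>".
    Entries with a lower ID are older than the window and could be trimmed.
    """
    if retention_seconds < 0:
        raise ValueError("retention_seconds must not be negative")
    if now_ms is None:
        # Wall-clock time in milliseconds, matching the default ID time part.
        now_ms = int(time.time() * 1000)
    cutoff_ms = max(0, now_ms - int(retention_seconds * 1000))
    # Sequence 0 is the first possible entry within the cutoff millisecond.
    return f"{cutoff_ms}-0"
```

A storage backend that flushes within seconds might use a retention of a minute or two, while one that flushes hourly would need a window of more than an hour.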

The MAT.OCS.RTA.StreamBuffer NuGet Package (.NET Core) provides an implementation for ingest processes to send data to Redis, and the protocol is documented for the benefit of ingest pipelines written in other languages.

Complications

The live stream may not originate in the same process that writes data to persistent storage.
This is likely when integrating with existing infrastructure or using off-the-shelf connectors.

This raises several complications:

Difficult to synchronize session metadata

The process writing to persistent storage might have more metadata available than the process creating the live data stream.

The client needs to mitigate this by combining metadata from the REST API and the WebSocket using heuristics, and guidelines are available for doing so.
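As an illustration of what such a heuristic could look like, the sketch below treats the REST API metadata as the baseline and overlays any non-empty values from the live stream. This is one possible rule chosen for the example, not the documented guidelines, and the field names in the usage example are hypothetical.

```python
from typing import Any, Dict


def merge_session_metadata(rest: Dict[str, Any],
                           live: Dict[str, Any]) -> Dict[str, Any]:
    """Combine session metadata from the REST API and the WebSocket.

    Assumed heuristic: start from the REST metadata (which may be richer)
    and let the live stream override a field only when it carries a value.
    """
    merged = dict(rest)  # copy so the caller's dictionary is not modified
    for key, value in live.items():
        if value is None or value == "" or value == [] or value == {}:
            continue  # the live stream has nothing useful for this field
        merged[key] = value
    return merged


rest_meta = {"identifier": "abc123", "driver": "A", "state": "open"}
live_meta = {"identifier": "abc123", "driver": None, "lap_count": 3}
combined = merge_session_metadata(rest_meta, live_meta)
```

Here `combined` keeps `driver` from the REST API, because the live stream did not supply one, and gains `lap_count` from the live stream.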

Difficult to track how much data has been flushed

Many storage technologies do not guarantee immediate consistency, and separation between ingest processes can make it nearly impossible to determine with perfect accuracy when data has been flushed. If the flush point is not known, there generally cannot be a seamless join between data from the REST API and data streamed over the WebSocket.

There are strategies that might mitigate this in specific situations — such as inserting checkpoints or tracking progress based on time — but in general it is more robust simply to configure the server environment to buffer stream data in Redis for a fixed period based on the known system characteristics.

The client needs to mitigate this by merging data where the REST API and WebSocket data overlap.
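A minimal sketch of such a merge is shown below, assuming samples are `(timestamp, value)` pairs and that a timestamp uniquely identifies a sample within one parameter. Preferring the flushed value when both sources report the same timestamp is an assumption made for the example, and the name `merge_samples` is hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Tuple[int, float]  # (timestamp, value)


def merge_samples(flushed: List[Sample], live: List[Sample]) -> List[Sample]:
    """Merge flushed (REST API) and live (WebSocket) samples into one series.

    Overlapping timestamps are de-duplicated, keeping the flushed value.
    """
    by_time: Dict[int, float] = {}
    for timestamp, value in live:
        by_time[timestamp] = value
    for timestamp, value in flushed:
        by_time[timestamp] = value  # flushed data wins where both exist
    # Sorting the (timestamp, value) pairs orders the series by time.
    return sorted(by_time.items())


flushed_data = [(1, 1.0), (2, 2.0), (3, 3.0)]
live_data = [(3, 3.5), (4, 4.0)]
series = merge_samples(flushed_data, live_data)
```

In this example the two sources overlap at timestamp 3, so `series` contains four samples and keeps the flushed value 3.0 for that timestamp.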