Fix: S2S Compression Bug Causes Panic Near 2MiB Limit
Hey guys! Today, we're diving deep into a tricky bug that can cause some serious headaches in your server-to-server (s2s) streaming. This bug surfaces when dealing with compression, specifically when the compression process inadvertently expands data near the 2 MiB frame limit, leading to a panic. Let's break down the issue, understand the root cause, and explore a recommended fix to keep your services running smoothly.
Detail Bug Report
For those who want to dive into the nitty-gritty details, you can check out the original bug report here: https://app.detail.dev/org_89d327b3-b883-4365-b6a3-46b6701342a9/bugs/bug_7019df41-099b-4651-a159-8165f7cbc7cf
Summary: The Heart of the Matter
- Context: The s2s streaming protocol, implemented in `api/src/v1/stream/s2s.rs`, uses compression (specifically Gzip and Zstd) to shrink protobuf messages before transmission, reducing bandwidth usage and speeding up server-to-server traffic.
- Bug: Here's the catch: when compressing large chunks of data close to the 2 MiB frame limit, the compression process itself can increase the overall size of the data, because compression algorithms add header and framing overhead. If the compressed data exceeds the `MAX_FRAME_BYTES` limit, the `SessionMessage::encode()` function panics, effectively crashing the server. This is especially likely for data that is already compressed, encrypted, or inherently random, since such data doesn't compress well and can easily exceed the limit once the overhead is added.
- Actual vs. Expected: The `CompressedData::compress()` method compresses any payload of at least 1024 bytes, and it always does so when that condition is met, even if compression makes the payload larger. The expected behavior is to detect when compression isn't beneficial (i.e., it increases the payload size or pushes it past the frame size limit) and fall back to sending the data uncompressed. This would avoid the panic and ensure the data is still transmitted, albeit without the benefits of compression.
- Impact: The practical impact of this bug is significant. When servers encounter large, incompressible payloads in the s2s streaming protocol, they can crash, leading to service disruptions, data loss, and a generally poor user experience.
Code with Bug: Spotting the Culprit
The problematic code lies within the `CompressedData::compress()` method in `api/src/v1/stream/s2s.rs`:
```rust
fn compress(compression: CompressionAlgorithm, data: Vec<u8>) -> std::io::Result<Self> {
    if compression == CompressionAlgorithm::None || data.len() < COMPRESSION_THRESHOLD_BYTES {
        return Ok(Self {
            compression: CompressionAlgorithm::None,
            payload: data.into(),
        });
    }
    let mut buf = Vec::with_capacity(data.len());
    match compression {
        CompressionAlgorithm::Gzip => {
            let mut encoder = GzEncoder::new(buf, Compression::default());
            encoder.write_all(data.as_slice())?;
            buf = encoder.finish()?;
        }
        CompressionAlgorithm::Zstd => {
            zstd::stream::copy_encode(data.as_slice(), &mut buf, 0)?;
        }
        CompressionAlgorithm::None => unreachable!("handled above"),
    };
    let payload = Bytes::from(buf.into_boxed_slice());
    Ok(Self {
        compression, // <-- BUG 🔴 returns compressed even if larger/exceeds frame limit
        payload,
    })
}
```
The issue is that the code unconditionally returns the compressed data, even if it's larger than the original or exceeds the frame limit.
Later, in `SessionMessage::encode()`:

```rust
pub fn encode(&self) -> Bytes {
    let encoded_size = FLAG_TOTAL_SIZE + self.payload_size();
    assert!(
        encoded_size <= MAX_FRAME_BYTES,
        "payload exceeds encoder limit"
    );
    // ...
}
```
This assertion is where the panic occurs when the encoded size exceeds the `MAX_FRAME_BYTES` limit.
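To make the failure mode concrete, here is a minimal, self-contained sketch of that assertion, assuming the constants implied by the report (`FLAG_TOTAL_SIZE = 1`, `MAX_FRAME_BYTES = 2 MiB`); `encode_len` is a hypothetical stand-in for the size check inside `SessionMessage::encode()`, not a function from `s2s.rs`:

```rust
const MAX_FRAME_BYTES: usize = 2 * 1024 * 1024; // 2_097_152
const FLAG_TOTAL_SIZE: usize = 1;

/// Hypothetical stand-in for the size check in `SessionMessage::encode()`:
/// computes the encoded frame size and panics if it exceeds the limit.
fn encode_len(payload_size: usize) -> usize {
    let encoded_size = FLAG_TOTAL_SIZE + payload_size;
    assert!(encoded_size <= MAX_FRAME_BYTES, "payload exceeds encoder limit");
    encoded_size
}
```

A payload of 2,097,100 bytes encodes fine, but a payload that compression has expanded to 2,097,453 bytes trips the assertion and panics.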
Evidence (Example): Seeing is Believing
Let's consider a payload of 2,097,100 bytes of random (incompressible) data:
- Input: uncompressed data = 2,097,100 bytes
- After compression (gzip): ~353 bytes of overhead → 2,097,453 bytes
- Encoded size: `FLAG_TOTAL_SIZE (1) + 2,097,453 = 2,097,454` bytes
- Frame limit: `2,097,454 > MAX_FRAME_BYTES (2,097,152)` → panic at the `encode()` assertion
- If sent uncompressed: `1 + 2,097,100 = 2,097,101` bytes → fits within the limit
This clearly demonstrates that forcing compression can push near-limit payloads beyond the frame size, causing a panic. However, if the data were sent uncompressed, it would fit within the limit, and the process would succeed.
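The arithmetic can be checked mechanically. A small sketch, assuming the same constants as above; `fits_in_frame` is a hypothetical helper introduced here for illustration, not part of the source:

```rust
const MAX_FRAME_BYTES: usize = 2 * 1024 * 1024; // 2_097_152
const FLAG_TOTAL_SIZE: usize = 1;

/// Hypothetical helper: does a payload of this size, plus the one-byte
/// framing flag, fit within a single frame?
fn fits_in_frame(payload_len: usize) -> bool {
    FLAG_TOTAL_SIZE + payload_len <= MAX_FRAME_BYTES
}
```

The original 2,097,100-byte payload fits, while the same payload after ~353 bytes of gzip overhead does not.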
Recommended Fix: A Smart Solution
The solution is to modify the CompressedData::compress() method to be more intelligent about when it applies compression. Specifically, it should check if compression actually reduces the size of the data or if it would cause the frame size limit to be exceeded. If either of these conditions is true, the method should fall back to sending the data uncompressed.
Here's the proposed fix:
```rust
fn compress(compression: CompressionAlgorithm, data: Vec<u8>) -> std::io::Result<Self> {
    if compression == CompressionAlgorithm::None || data.len() < COMPRESSION_THRESHOLD_BYTES {
        return Ok(Self {
            compression: CompressionAlgorithm::None,
            payload: data.into(),
        });
    }
    let mut buf = Vec::with_capacity(data.len());
    match compression {
        CompressionAlgorithm::Gzip => {
            let mut encoder = GzEncoder::new(buf, Compression::default());
            encoder.write_all(data.as_slice())?;
            buf = encoder.finish()?;
        }
        CompressionAlgorithm::Zstd => {
            zstd::stream::copy_encode(data.as_slice(), &mut buf, 0)?;
        }
        CompressionAlgorithm::None => unreachable!("handled above"),
    };
    let compressed_frame_size = FLAG_TOTAL_SIZE + buf.len();
    if buf.len() >= data.len() || compressed_frame_size > MAX_FRAME_BYTES { // <-- FIX 🟢 fallback when compression expands or would overflow frame
        return Ok(Self {
            compression: CompressionAlgorithm::None,
            payload: data.into(),
        });
    }
    Ok(Self {
        compression,
        payload: Bytes::from(buf.into_boxed_slice()),
    })
}
```
Explanation of the Fix
- Check Compression Efficiency: the line `if buf.len() >= data.len() || compressed_frame_size > MAX_FRAME_BYTES` is the heart of the fix. It checks two crucial conditions. First, `buf.len() >= data.len()` determines whether the compressed data is at least as large as the original; if compression increases the size, it's not beneficial, and we should skip it. Second, `compressed_frame_size > MAX_FRAME_BYTES` checks whether the compressed data, including the framing overhead (`FLAG_TOTAL_SIZE`), exceeds the maximum allowed frame size; if it does, we also skip compression.
- Fallback to Uncompressed: if either of the above conditions is true, the code executes the early `return Ok(Self { ... })` block. This creates a `CompressedData` instance with `compression: CompressionAlgorithm::None`, indicating that the data will be sent uncompressed, and `payload: data.into()`, which contains the original, uncompressed data.
- Proceed with Compression (if Beneficial): if both conditions are false (i.e., compression reduces the size and the compressed data fits within the frame limit), the code proceeds to create a `CompressedData` instance with the compressed data and the appropriate compression algorithm.
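The decision rule above can be unit-tested in isolation. Here is a minimal sketch, assuming the same constants as before; `should_use_compressed` is a hypothetical extraction of the fix's condition for illustration, not a function that exists in `s2s.rs`:

```rust
const MAX_FRAME_BYTES: usize = 2 * 1024 * 1024; // 2_097_152
const FLAG_TOTAL_SIZE: usize = 1;

/// Hypothetical extraction of the fix's fallback condition: keep the
/// compressed payload only if it is strictly smaller than the original
/// AND, with framing overhead, still fits within the frame limit.
fn should_use_compressed(original_len: usize, compressed_len: usize) -> bool {
    compressed_len < original_len
        && FLAG_TOTAL_SIZE + compressed_len <= MAX_FRAME_BYTES
}
```

A compressible payload (say, 1 MiB that gzip shrinks to 100 KiB) keeps compression, while the incompressible near-limit payload from the example (2,097,100 bytes expanded to 2,097,453) falls back to uncompressed.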
Benefits of the Fix
- Prevents Panics: By intelligently deciding when to compress, the fix avoids the scenario where compression leads to exceeding the frame size limit, thus preventing panics and server crashes.
- Maintains Functionality: Even when compression is skipped, the data is still transmitted, ensuring that the s2s communication remains functional.
- Optimizes Bandwidth (When Possible): The fix still leverages compression when it's beneficial, reducing bandwidth usage and improving performance.
By implementing this fix, you can ensure that your s2s streaming protocol is more robust and reliable, especially when dealing with large, potentially incompressible payloads. Happy coding, folks!