Commit 6c06564 (parent: 551c9d6)

Add: FFmpeg based end-to-end latency measurement

3 files changed: +374 −0 lines changed

Lines changed: 181 additions & 0 deletions
@@ -0,0 +1,181 @@
# End-to-End Latency Measurement — Media Transport Library

This document describes a simple solution for measuring end-to-end latency in Media Transport Library.

## Overview

The solution is based on FFmpeg's ability to print current timestamps on the sender side (Tx) and the receiver side (Rx), and the use of Optical Character Recognition (OCR) to read the timestamps out of each received video frame and calculate the delta.

The choice of OCR is determined by the fact that the text can be effectively recognized even if the picture is affected by a lossy video compression algorithm somewhere in the transmission path. To achieve proper accuracy of the measurement, both Tx and Rx host machines should be synchronized using Precision Time Protocol (PTP).

> Only the ST2110-20 and ST2110-22 video payloads are supported.

```mermaid
flowchart LR
    tx-file((Input video file))
    tx-ffmpeg(Tx FFmpeg)
    mtl1(ST2110)
    NET(network)
    mtl2(ST2110)
    rx-ffmpeg(Rx FFmpeg)
    rx-file((Output video file))

    tx-file --> tx-ffmpeg --> mtl1 --> NET --> mtl2 --> rx-ffmpeg --> rx-file

    classDef netStyle fill:#ffcccc;
    class NET netStyle;
```

## How it works

1. Tx side – The user starts FFmpeg with a special configuration to stream video via ST2110.
1. Rx side – The user starts FFmpeg with a special configuration to receive the video stream from ST2110.
1. Tx side – FFmpeg prints the current timestamp as large text at the top of each video frame and transmits the frame via the network.
1. Rx side – FFmpeg prints the current timestamp as large text at the bottom of each video frame received from the network and saves it to disk.
1. After transmission is done, the resulting MPEG video file is on the disk on the Rx side.
1. The user runs the solution script against the MPEG file; the script recognizes the Tx and Rx timestamps in each frame and calculates the average latency from the difference between the timestamps. Additionally, the script generates a latency diagram and stores it in JPEG format on the disk.
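
The per-frame computation in the last step can be sketched as follows. This is an illustrative helper, not part of the shipped script; it assumes timestamps are rendered as `HH:MM:SS:mmm`, the format used by the drawtext commands and the OCR regex below.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%H:%M:%S:%f"  # HH:MM:SS:mmm as rendered by drawtext

def frame_latency_ms(tx_text: str, rx_text: str) -> float:
    """Latency of one frame, from the OCR-recognized Tx and Rx timestamps."""
    tx = datetime.strptime(tx_text, TIMESTAMP_FORMAT)
    rx = datetime.strptime(rx_text, TIMESTAMP_FORMAT)
    # The timestamps carry only the time of day, so both hosts must be
    # PTP-synchronized for the difference to be meaningful.
    return (rx - tx).total_seconds() * 1000

print(frame_latency_ms("13:49:54:100", "13:49:54:664"))
```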
36+
37+
## Sample latency diagram
38+
39+
<img src="ffmpeg_based_latency_solution_diagram.jpg" width="520" alt="End-to-End Latency diagram">
40+
41+
## Important notice on latency measurement results
42+
43+
> Please note the calculated average latency is highly dependent on the hardware configuration and CPU background load, and cannot be treated as an absolute value. The provided solution can only be used for comparing the latency in different network configurations and video streaming parameters, as well as latency stability checks.
44+
45+
46+
## Build and install steps
47+
48+
> It is assumed that Media Transport Library is installed on the Tx and Rx host machines according to [Build Guide](build.md).
49+
50+
If FFmpeg Plugin was installed earlier, remove its directory before proceeding with the following.
51+
52+
1. Install required packages
53+
```bash
54+
sudo apt install libfreetype6-dev libharfbuzz-dev libfontconfig1-dev
55+
```
56+
1. Clone build and install FFmpeg.
57+
```bash
58+
git clone https://github.com/FFmpeg/FFmpeg.git
59+
cd FFmpeg
60+
git checkout release/7.0
61+
# apply the build patch
62+
git am <repo_dir>/ecosystem/ffmpeg_plugin/7.0/*.patch
63+
# copy the mtl in/out implementation code
64+
cp <repo_dir>/ecosystem/ffmpeg_plugin/mtl_*.c -rf libavdevice/
65+
cp <repo_dir>/ecosystem/ffmpeg_plugin/mtl_*.h -rf libavdevice/
66+
./configure --enable-shared --enable-mtl --enable-libfreetype --enable-libharfbuzz --enable-libfontconfig
67+
make -j "$(nproc)"
68+
sudo make install
69+
sudo ldconfig
70+
```
71+
1. Install Tesseract OCR
72+
```bash
73+
apt install tesseract-ocr
74+
```
75+
1. Install Python packages
76+
```bash
77+
pip install opencv-python~=4.11.0 pytesseract~=0.3.13 matplotlib~=3.10.3
78+
```
1. Set up time synchronization on the host machines

   > Make sure `network_interface_1` and `network_interface_2` are connected to the same network.

   * __host-1 Controller clock__

     ```bash
     sudo ptp4l -i <network_interface_1> -m 2
     sudo phc2sys -a -r -r -m
     ```

   * __host-2 Worker clock__

     ```bash
     sudo ptp4l -i <network_interface_2> -m 2 -s
     sudo phc2sys -a -r
     ```

## Example – Measuring transmission latency between two FFmpeg instances on different hosts

This example demonstrates sending a video file from the first FFmpeg instance to the second FFmpeg instance via Media Transport Library, and then calculating the transmission latency from the recorded video.

1. Start the Receiver side FFmpeg instance

   ```bash
   sudo ffmpeg -y \
       -f mtl_st20p \
       -p_port 0000:af:01.0 \
       -p_sip 192.168.96.2 \
       -p_rx_ip 239.168.85.20 \
       -udp_port 20000 \
       -payload_type 96 \
       -fps 59.94 \
       -pix_fmt yuv422p10le \
       -video_size 1920x1080 \
       -i - \
       -vf \
       "drawtext=fontsize=40: \
       text='Rx timestamp %{localtime\\:%H\\\\\:%M\\\\\:%S\\\\\:%3N}': \
       x=10: y=70: fontcolor=white: box=1: boxcolor=black: boxborderw=10" \
       -vcodec mpeg4 -qscale:v 3 recv.mp4
   ```

1. Start the Sender side FFmpeg instance

   ```bash
   sudo ffmpeg -i <video-file-path> \
       -vf \
       "drawtext=fontsize=40: \
       text='Tx timestamp %{localtime\\:%H\\\\\:%M\\\\\:%S\\\\\:%3N}': \
       x=10: y=10: fontcolor=white: box=1: boxcolor=black: boxborderw=10" \
       -f mtl_st20p \
       -fps 59.94 \
       -p_port 0000:af:01.1 \
       -p_sip 192.168.96.3 \
       -p_tx_ip 239.168.85.20 \
       -udp_port 20000 \
       -payload_type 96 -
   ```
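
The quoting in the `text=` argument is dense, but the `%{localtime\\:...}` expansion simply renders the wall-clock time as `HH:MM:SS:mmm`, which is what the OCR script later searches for. As a sanity check, a hypothetical Python helper (not part of the solution) can reproduce the rendered text and match it against the same regex the script uses:

```python
import re
from datetime import datetime

# Same pattern as the OCR script uses to find timestamps in a frame.
TIMESTAMP_PATTERN = r"\b\d{2}:\d{2}:\d{2}:\d{3}\b"

def render_timestamp(now: datetime, side: str = "Tx") -> str:
    """Format a datetime the way the drawtext overlay renders it."""
    millis = now.microsecond // 1000
    return f"{side} timestamp {now:%H:%M:%S}:{millis:03d}"

line = render_timestamp(datetime(2025, 6, 2, 13, 49, 54, 123000))
print(line)  # Tx timestamp 13:49:54:123
assert re.search(TIMESTAMP_PATTERN, line)
```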

When sending a raw video file, e.g. in a YUV format, you have to explicitly specify the file format `-f rawvideo`, the pixel format `-pix_fmt`, and the video resolution `-s WxH`:

```bash
ffmpeg -f rawvideo -pix_fmt yuv422p10le -s 1920x1080 -i <video-file-path> ...
```

It is also recommended to provide the read rate `-readrate` at which FFmpeg will read frames from the file:

```bash
ffmpeg -f rawvideo -readrate 2.4 -pix_fmt yuv422p10le -s 1920x1080 -i <video-file-path> ...
```

The `-readrate` value is calculated from the video frame rate using the following equation: $readrate=framerate\div25$. Use the pre-calculated values from the table below.

| frame_rate | readrate        |
|------------|-----------------|
| 25         | 25 / 25 = 1     |
| 50         | 50 / 25 = 2     |
| 60         | 60 / 25 = 2.4   |
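
The table encodes $readrate=framerate\div25$, presumably because `-readrate 1` means native reading speed and the rawvideo demuxer defaults to 25 fps. A one-line sketch (illustrative, not part of the repository):

```python
def readrate(framerate: float) -> float:
    """Return the -readrate value for a given video frame rate."""
    # Assumption: the rawvideo input is read at 25 fps at native speed,
    # so a multiplier of framerate / 25 reads frames at the desired rate.
    return framerate / 25

for fps in (25, 50, 60):
    print(f"frame_rate {fps} -> readrate {readrate(fps)}")
# e.g. frame_rate 60 -> readrate 2.4
```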

1. Run the script located in `<repo_dir>/tests/tools/latency_measurement` against the recorded MPEG file. The first argument is the input video file path. The second argument is the optional path of the latency diagram JPEG file to be generated.

   ```bash
   python text_detection.py recv.mp4 recv-latency.jpg
   ```

   Console output

   ```bash
   ...
   Processing Frame: 235
   Processing Frame: 236
   Processing Frame: 237
   Processing Frame: 238
   Processing Frame: 239
   Processing Frame: 240
   Saving the latency chart to: recv-latency.jpg
   File: recv.mp4 | Last modified: 2025-06-02 13:49:54 UTC
   Resolution: 640x360 | FPS: 25.00
   Average End-to-End Latency: 564.61 ms
   ```

See the [Sample latency diagram](#sample-latency-diagram).

## Customization

If you modify the FFmpeg commands and change the parameters of the `drawtext` filter, especially `fontsize`, `x`, `y`, or `text`, you must adjust the Python script __text_detection.py__ accordingly; refer to the function `extract_text_from_region(image, x, y, font_size, length)`.
(Binary file, 149 KB — not rendered)

Lines changed: 193 additions & 0 deletions
@@ -0,0 +1,193 @@
1+
import os
2+
import re
3+
import sys
4+
from concurrent.futures import ThreadPoolExecutor
5+
from datetime import datetime
6+
7+
import cv2 as cv
8+
import matplotlib.pyplot as plt
9+
import numpy as np
10+
import pytesseract
11+
12+
13+
def is_display_attached():
14+
# Check if the DISPLAY environment variable is set
15+
return "DISPLAY" in os.environ
16+
17+
18+
def extract_text_from_region(image, x, y, font_size, length):
19+
"""
20+
Extracts text from a specific region of the image.
21+
:param image: The image to extract text from.
22+
:param x: The x-coordinate of the top-left corner of the region.
23+
:param y: The y-coordinate of the top-left corner of the region.
24+
:param font_size: The font size of the text.
25+
:param length: The length of the text to extract.
26+
:return: The extracted text.
27+
"""
28+
margin = 5
29+
y_adjusted = max(0, y - margin)
30+
x_adjusted = max(0, x - margin)
31+
height = y + font_size + margin
32+
width = x + length + margin
33+
# Define the region of interest (ROI) for text extraction
34+
roi = image[y_adjusted:height, x_adjusted:width]
35+
36+
# Use Tesseract to extract text from the ROI
37+
return pytesseract.image_to_string(roi, lang="eng")
38+
39+
40+
def process_frame(frame_idx, frame):
41+
print("Processing Frame: ", frame_idx)
42+
43+
timestamp_format = "%H:%M:%S:%f"
44+
timestamp_pattern = r"\b\d{2}:\d{2}:\d{2}:\d{3}\b"
45+
46+
# Convert frame to grayscale for better OCR performance
47+
frame = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
48+
49+
line_1 = extract_text_from_region(frame, 10, 10, 40, 600)
50+
line_2 = extract_text_from_region(frame, 10, 70, 40, 600)
51+
52+
# Find the timestamps(Type: string) in the extracted text using regex
53+
tx_time = re.search(timestamp_pattern, line_1)
54+
rx_time = re.search(timestamp_pattern, line_2)
55+
56+
if tx_time is None or rx_time is None:
57+
print("Error: Timestamp not found in the expected format.")
58+
return 0
59+
60+
# Convert the timestamps(Type: string) to time (Type: datetime)
61+
tx_time = datetime.strptime(tx_time.group(), timestamp_format)
62+
rx_time = datetime.strptime(rx_time.group(), timestamp_format)
63+
64+
if tx_time is None or rx_time is None:
65+
print("Error: Timestamp not found in the expected format.")
66+
return 0
67+
68+
if tx_time > rx_time:
69+
print("Error: Transmit time is greater than receive time.")
70+
return 0
71+
72+
time_difference = rx_time - tx_time
73+
time_difference_ms = time_difference.total_seconds() * 1000
74+
return time_difference_ms
75+
76+
77+
def main():
78+
if len(sys.argv) < 2:
79+
print("Usage: python text-detection.py <input_video_file> <output_image_name>")
80+
sys.exit(1)
81+
82+
input_video_file = sys.argv[1]
83+
cap = cv.VideoCapture(input_video_file)
84+
if not cap.isOpened():
85+
print("Fatal: Could not open video file.")
86+
sys.exit(1)
87+
88+
frame_idx = 0
89+
time_differences = []
90+
91+
with ThreadPoolExecutor(max_workers=40) as executor:
92+
futures = []
93+
while True:
94+
ret, frame = cap.read()
95+
if not ret:
96+
break
97+
98+
futures.append(executor.submit(process_frame, frame_idx, frame))
99+
frame_idx += 1
100+
101+
for future in futures:
102+
time_differences.append(future.result())
103+
104+
# Filter out zero values from time_differences
105+
non_zero_time_differences = [td for td in time_differences if td != 0]
106+
107+
# Calculate the average latency excluding zero values
108+
if non_zero_time_differences:
109+
average_latency = np.mean(non_zero_time_differences)
110+
111+
# Filter out anomaly peaks that differ more than 25% from the average for average calculation
112+
filtered_time_differences = [
113+
td
114+
for td in non_zero_time_differences
115+
if abs(td - average_latency) <= 0.25 * average_latency
116+
]
117+
118+
# Calculate the average latency using the filtered data
119+
filtered_average_latency = np.mean(filtered_time_differences)
120+
else:
121+
print(
122+
"Fatal: No timestamps recognized in the video. No data for calculating latency."
123+
)
124+
sys.exit(1)
125+
126+
# Plot the non-zero data
127+
plt.plot(non_zero_time_differences, marker="o")
128+
plt.title("End-to-End Latency — Media Transport Library")
129+
plt.xlabel("Frame Index")
130+
plt.ylabel("Latency, ms")
131+
plt.grid(True)
132+
133+
# Adjust the layout to create more space for the text
134+
plt.subplots_adjust(bottom=0.5)
135+
136+
# Prepare text for display and stdout
137+
average_latency_text = (
138+
f"Average End-to-End Latency: {filtered_average_latency:.2f} ms"
139+
)
140+
file_name = os.path.basename(input_video_file)
141+
file_mod_time = datetime.fromtimestamp(os.path.getmtime(input_video_file)).strftime(
142+
"%Y-%m-%d %H:%M:%S"
143+
)
144+
file_info_text = f"File: {file_name} | Last modified: {file_mod_time} UTC"
145+
width = int(cap.get(cv.CAP_PROP_FRAME_WIDTH))
146+
height = int(cap.get(cv.CAP_PROP_FRAME_HEIGHT))
147+
fps = cap.get(cv.CAP_PROP_FPS)
148+
video_properties_text = f"Resolution: {width}x{height} | FPS: {fps:.2f}"
149+
150+
cap.release()
151+
152+
# Display text on the plot
153+
plt.text(
154+
0.5,
155+
-0.55,
156+
average_latency_text,
157+
horizontalalignment="center",
158+
verticalalignment="center",
159+
transform=plt.gca().transAxes,
160+
)
161+
plt.text(
162+
0.5,
163+
-0.85,
164+
file_info_text,
165+
horizontalalignment="center",
166+
verticalalignment="center",
167+
transform=plt.gca().transAxes,
168+
)
169+
plt.text(
170+
0.5,
171+
-1,
172+
video_properties_text,
173+
horizontalalignment="center",
174+
verticalalignment="center",
175+
transform=plt.gca().transAxes,
176+
)
177+
if is_display_attached():
178+
plt.show()
179+
180+
if len(sys.argv) == 3:
181+
filename = sys.argv[2]
182+
if not filename.endswith(".jpg"):
183+
filename += ".jpg"
184+
print("Saving the latency chart to: ", filename)
185+
plt.savefig(filename, format="jpg", dpi=300)
186+
187+
# Print text to stdout
188+
print(file_info_text)
189+
print(video_properties_text)
190+
print(average_latency_text)
191+
192+
193+
main()
