Commit d022dfa

Add: implement end-to-end latency measurement documentation and text detection script

1 parent f3da799 commit d022dfa
File tree

3 files changed: +345 −0 lines changed
doc/LatencyMeasurement.md

Lines changed: 180 additions & 0 deletions
@@ -0,0 +1,180 @@
# End-to-End Latency Measurement — Media Transport Library

This document describes a simple solution for measuring end-to-end latency in Media Transport Library.

## Overview

The solution is based on the ability of FFmpeg to print current timestamps on the sender side (Tx) and the receiver side (Rx), combined with Optical Character Recognition (OCR) to read the timestamps from each received video frame and calculate the delta. OCR was chosen because the text can be recognized reliably even when the picture passes through a lossy video compression algorithm somewhere in the transmission path. To achieve proper measurement accuracy, both the Tx and Rx host machines should be synchronized using Precision Time Protocol (PTP).

> Only the video payloads ST2110-20 and ST2110-22 are supported.
```mermaid
flowchart LR
tx-file((Input video file))
tx-ffmpeg(Tx FFmpeg)
mtl1(ST2110)
NET(network)
mtl2(ST2110)
rx-ffmpeg(Rx FFmpeg)
rx-file((Output video file))

tx-file --> tx-ffmpeg --> mtl1 --> NET --> mtl2 --> rx-ffmpeg --> rx-file

classDef netStyle fill:#ffcccc;
class NET netStyle;
```
## How it works

1. Tx side – The user starts FFmpeg with a special configuration to stream video via ST2110.
1. Rx side – The user starts FFmpeg with a special configuration to receive the video stream from ST2110.
1. Tx side – FFmpeg prints the current timestamp as large text at the top of each video frame and transmits the frame via the network.
1. Rx side – FFmpeg prints the current timestamp as large text at the bottom of each video frame received from the network and saves the result to disk.
1. After the transmission is done, the resulting MPEG video file is on the disk on the Rx side.
1. The user runs the solution script against the MPEG file; the script recognizes the Tx and Rx timestamps in each frame and calculates the average latency from the difference between the timestamps. Additionally, the script generates a latency diagram and stores it in JPEG format on the disk.
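The per-frame delta computed in the last step mirrors the calculation in `text-detection.py`; as a minimal sketch (the `%H:%M:%S:%f` format matches the `drawtext` timestamps shown later in this document):

```python
from datetime import datetime

# Format of the timestamps burned into the frames by the drawtext filter
TIMESTAMP_FORMAT = "%H:%M:%S:%f"

def latency_ms(tx_text: str, rx_text: str) -> float:
    """Milliseconds elapsed between the Tx and Rx timestamp strings."""
    tx = datetime.strptime(tx_text, TIMESTAMP_FORMAT)
    rx = datetime.strptime(rx_text, TIMESTAMP_FORMAT)
    return (rx - tx).total_seconds() * 1000.0

print(latency_ms("13:49:54:100", "13:49:54:665"))
```

Note that Python's `%f` right-pads "100" to 100000 microseconds, i.e. it reads the three `%3N` digits as milliseconds, which is what the solution script relies on.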

## Sample latency diagram

<img src="png/ffmpeg-based-latency-solution-diagram.jpg" width="520">
## Important notice on latency measurement results

> Please note that the calculated average latency is highly dependent on the hardware configuration and the CPU background load, and cannot be treated as an absolute value. The provided solution should only be used for comparing latency across different network configurations and video streaming parameters, as well as for latency stability checks.

## Build and install steps

> It is assumed that Media Transport Library is installed on the Tx and Rx host machines according to the [Build Guide](build.md).

If the FFmpeg plugin was installed earlier, remove its directory before proceeding with the following steps.

1. Install the required packages:

   ```bash
   sudo apt install libfreetype6-dev libharfbuzz-dev libfontconfig1-dev
   ```

1. Clone, build, and install FFmpeg:

   ```bash
   git clone https://github.com/FFmpeg/FFmpeg.git
   cd FFmpeg
   git checkout release/7.0
   # apply the build patch
   git am <repo_dir>/ecosystem/ffmpeg_plugin/7.0/*.patch
   # copy the MTL in/out implementation code
   cp <repo_dir>/ecosystem/ffmpeg_plugin/mtl_*.c -rf libavdevice/
   cp <repo_dir>/ecosystem/ffmpeg_plugin/mtl_*.h -rf libavdevice/
   ./configure --enable-shared --enable-mtl --enable-libfreetype --enable-libharfbuzz --enable-libfontconfig
   make -j "$(nproc)"
   sudo make install
   sudo ldconfig
   ```

1. Install Tesseract OCR:

   ```bash
   sudo apt install tesseract-ocr
   ```

1. Install the required Python packages:

   ```bash
   pip install opencv-python~=4.11.0 pytesseract~=0.3.13 matplotlib~=3.10.3
   ```

1. Set up time synchronization on the host machines.

   > Make sure `network_interface_1` and `network_interface_2` are connected to the same network.

   * __host-1 Controller clock__

     ```bash
     sudo ptp4l -i <network_interface_1> -m 2
     sudo phc2sys -a -r -r -m
     ```

   * __host-2 Worker clock__

     ```bash
     sudo ptp4l -i <network_interface_2> -m 2 -s
     sudo phc2sys -a -r
     ```
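Before measuring, it can be useful to sanity-check the synchronization quality from the `ptp4l`/`phc2sys` console output. The snippet below is only a sketch: it assumes log lines containing `offset <value>` in nanoseconds, as typically printed with `-m`; adjust the regex to whatever your build actually prints.

```python
import re
import statistics

# Assumed log-line shape, e.g. "ptp4l[100.1]: master offset -12 s2 freq +800 path delay 1350";
# adjust the pattern if your ptp4l/phc2sys output differs.
OFFSET_RE = re.compile(r"offset\s+(-?\d+)")

def offset_stats(log_lines):
    """Return (mean, stdev) of the clock offsets, in nanoseconds, found in the log."""
    offsets = [int(m.group(1))
               for line in log_lines
               for m in [OFFSET_RE.search(line)] if m]
    if len(offsets) < 2:
        raise ValueError("not enough offset samples in the log")
    return statistics.mean(offsets), statistics.stdev(offsets)

sample = [
    "ptp4l[100.1]: master offset -12 s2 freq +800 path delay 1350",
    "ptp4l[101.1]: master offset 8 s2 freq +812 path delay 1348",
    "ptp4l[102.1]: master offset -4 s2 freq +805 path delay 1352",
]
print(offset_stats(sample))
```

If the offsets stay within a few microseconds, PTP synchronization is far more accurate than the millisecond resolution of the burned-in timestamps.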

## Example – Measuring transmission latency between two FFmpeg instances on different hosts

This example demonstrates sending a video file from the first FFmpeg instance to the second FFmpeg instance via Media Transport Library, and then calculating the transmission latency from the recorded video.
1. Start the Receiver side FFmpeg instance:

   ```bash
   sudo ffmpeg -y \
     -f mtl_st20p \
     -p_port 0000:af:01.0 \
     -p_sip 192.168.96.2 \
     -p_rx_ip 239.168.85.20 \
     -udp_port 20000 \
     -payload_type 96 \
     -fps 59.94 \
     -pix_fmt yuv422p10le \
     -video_size 1920x1080 \
     -i - \
     -vf \
     "drawtext=fontsize=40: \
     text='Rx timestamp %{localtime\\:%H\\\\\:%M\\\\\:%S\\\\\:%3N}': \
     x=10: y=70: fontcolor=white: box=1: boxcolor=black: boxborderw=10" \
     -vcodec mpeg4 -qscale:v 3 recv.mp4
   ```

1. Start the Sender side FFmpeg instance:

   ```bash
   sudo ffmpeg -i <video-file-path> \
     -vf \
     "drawtext=fontsize=40: \
     text='Tx timestamp %{localtime\\:%H\\\\\:%M\\\\\:%S\\\\\:%3N}': \
     x=10: y=10: fontcolor=white: box=1: boxcolor=black: boxborderw=10" \
     -f mtl_st20p \
     -fps 59.94 \
     -p_port 0000:af:01.1 \
     -p_sip 192.168.96.3 \
     -p_tx_ip 239.168.85.20 \
     -udp_port 20000 \
     -payload_type 96 -
   ```

   When sending a raw video file, e.g. in a YUV format, you have to explicitly specify the file format `-f rawvideo`, the pixel format `-pix_fmt`, and the video resolution `-s WxH`:

   ```bash
   ffmpeg -f rawvideo -pix_fmt yuv422p10le -s 1920x1080 -i <video-file-path> ...
   ```

   It is also recommended to provide the read rate `-readrate` at which FFmpeg will read frames from the file:

   ```bash
   ffmpeg -f rawvideo -readrate 2.4 -pix_fmt yuv422p10le -s 1920x1080 -i <video-file-path> ...
   ```
   The `-readrate` value is calculated from the `-frame_rate` parameter value using the following equation: $readrate = framerate \div 25$. Use the pre-calculated values from the table below.

   | frame_rate | readrate      |
   |------------|---------------|
   | 25         | 25 / 25 = 1   |
   | 50         | 50 / 25 = 2   |
   | 60         | 60 / 25 = 2.4 |

1. Run the script located in `<repo_dir>/script` against the recorded MPEG file. The first argument is the input video file path. The second argument is the optional path of the latency diagram JPEG file to be generated.

   ```bash
   python text-detection.py recv.mp4 recv-latency.jpg
   ```
   Console output:

   ```text
   ...
   Processing Frame: 235
   Processing Frame: 236
   Processing Frame: 237
   Processing Frame: 238
   Processing Frame: 239
   Processing Frame: 240
   Saving the latency chart to: recv-latency.jpg
   File: recv.mp4 | Last modified: 2025-06-02 13:49:54 UTC
   Resolution: 640x360 | FPS: 25.00
   Average End-to-End Latency: 564.61 ms
   ```

See the [Sample latency diagram](#sample-latency-diagram).
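The `-readrate` values in the table above follow directly from the stated equation; a one-line helper reproduces them:

```python
def readrate(frame_rate: float) -> float:
    """-readrate value for a given input frame rate (readrate = framerate / 25)."""
    return frame_rate / 25.0

for fps in (25, 50, 60):
    print(fps, "->", readrate(fps))
```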

## Customization

When modifying the FFmpeg commands, if you change parameters of the `drawtext` filter, especially `fontsize`, `x`, `y`, or `text`, you also have to adjust the Python script __text-detection.py__; please refer to the function `extract_text_from_region(image, x, y, font_size, length)`.
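For example, if the Tx `drawtext` were moved to `x=20, y=30` with `fontsize=48` (hypothetical values), the crop coordinates in the script would have to move with it. The helper below restates the ROI arithmetic of `extract_text_from_region` so the adjusted bounds can be checked without running OCR:

```python
def text_region_bounds(x, y, font_size, length, margin=5):
    """(row_start, row_end), (col_start, col_end) of the OCR crop for a
    drawtext rendered at (x, y); mirrors extract_text_from_region."""
    y0, x0 = max(0, y - margin), max(0, x - margin)
    return (y0, y + font_size + margin), (x0, x + length + margin)

# Hypothetical adjusted placement: x=20, y=30, fontsize=48
print(text_region_bounds(20, 30, 48, 600))
# Default placement used by the script: x=10, y=10, fontsize=40
print(text_region_bounds(10, 10, 40, 600))
```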

script/text-detection.py

Lines changed: 165 additions & 0 deletions
@@ -0,0 +1,165 @@
import os
import re
import sys
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone

import cv2 as cv
import matplotlib.pyplot as plt
import numpy as np
import pytesseract


def is_display_attached():
    # Check if the DISPLAY environment variable is set
    return 'DISPLAY' in os.environ


def extract_text_from_region(image, x, y, font_size, length):
    """
    Extracts text from a specific region of the image.

    :param image: The image to extract text from.
    :param x: The x-coordinate of the top-left corner of the region.
    :param y: The y-coordinate of the top-left corner of the region.
    :param font_size: The font size of the text.
    :param length: The length of the text to extract.
    :return: The extracted text.
    """
    margin = 5
    y_adjusted = max(0, y - margin)
    x_adjusted = max(0, x - margin)
    height = y + font_size + margin
    width = x + length + margin
    # Define the region of interest (ROI) for text extraction
    roi = image[y_adjusted:height, x_adjusted:width]

    # Use Tesseract to extract text from the ROI
    return pytesseract.image_to_string(roi, lang='eng')


def process_frame(frame_idx, frame):
    print("Processing Frame:", frame_idx)

    timestamp_format = "%H:%M:%S:%f"
    timestamp_pattern = r'\b\d{2}:\d{2}:\d{2}:\d{3}\b'

    # Convert the frame to grayscale for better OCR performance
    frame = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)

    line_1 = extract_text_from_region(frame, 10, 10, 40, 600)
    line_2 = extract_text_from_region(frame, 10, 70, 40, 600)

    # Find the timestamps (type: str) in the extracted text using regex
    tx_time = re.search(timestamp_pattern, line_1)
    rx_time = re.search(timestamp_pattern, line_2)

    if tx_time is None or rx_time is None:
        print("Error: Timestamp not found in the expected format.")
        return 0

    # Convert the timestamp strings to datetime objects
    tx_time = datetime.strptime(tx_time.group(), timestamp_format)
    rx_time = datetime.strptime(rx_time.group(), timestamp_format)

    if tx_time > rx_time:
        print("Error: Transmit time is greater than receive time.")
        return 0

    time_difference = rx_time - tx_time
    time_difference_ms = time_difference.total_seconds() * 1000
    return time_difference_ms


def main():
    if len(sys.argv) < 2:
        print("Usage: python text-detection.py <input_video_file> <output_image_name>")
        sys.exit(1)

    input_video_file = sys.argv[1]
    cap = cv.VideoCapture(input_video_file)
    if not cap.isOpened():
        print("Fatal: Could not open video file.")
        sys.exit(1)

    frame_idx = 0
    time_differences = []

    with ThreadPoolExecutor(max_workers=40) as executor:
        futures = []
        while True:
            ret, frame = cap.read()
            if not ret:
                break

            futures.append(executor.submit(process_frame, frame_idx, frame))
            frame_idx += 1

        for future in futures:
            time_differences.append(future.result())

    # Filter out zero values (frames where OCR failed) from time_differences
    non_zero_time_differences = [td for td in time_differences if td != 0]

    # Calculate the average latency excluding zero values
    if non_zero_time_differences:
        average_latency = np.mean(non_zero_time_differences)

        # Filter out anomalous peaks that differ by more than 25% from the average
        filtered_time_differences = [
            td for td in non_zero_time_differences if abs(td - average_latency) <= 0.25 * average_latency
        ]

        # Calculate the average latency using the filtered data
        filtered_average_latency = np.mean(filtered_time_differences)
    else:
        print("Fatal: No timestamps recognized in the video. No data for calculating latency.")
        sys.exit(1)

    # Plot the non-zero data
    plt.plot(non_zero_time_differences, marker='o')
    plt.title('End-to-End Latency — Media Transport Library')
    plt.xlabel('Frame Index')
    plt.ylabel('Latency, ms')
    plt.grid(True)

    # Adjust the layout to create more space for the text
    plt.subplots_adjust(bottom=0.5)

    # Prepare text for display and stdout
    average_latency_text = f'Average End-to-End Latency: {filtered_average_latency:.2f} ms'
    file_name = os.path.basename(input_video_file)
    file_mod_time = datetime.fromtimestamp(
        os.path.getmtime(input_video_file), tz=timezone.utc).strftime('%Y-%m-%d %H:%M:%S')
    file_info_text = f'File: {file_name} | Last modified: {file_mod_time} UTC'
    width = int(cap.get(cv.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv.CAP_PROP_FRAME_HEIGHT))
    fps = cap.get(cv.CAP_PROP_FPS)
    video_properties_text = f'Resolution: {width}x{height} | FPS: {fps:.2f}'

    cap.release()

    # Display the text below the plot
    plt.text(0.5, -0.55, average_latency_text,
             horizontalalignment='center', verticalalignment='center',
             transform=plt.gca().transAxes)
    plt.text(0.5, -0.85, file_info_text,
             horizontalalignment='center', verticalalignment='center',
             transform=plt.gca().transAxes)
    plt.text(0.5, -1, video_properties_text,
             horizontalalignment='center', verticalalignment='center',
             transform=plt.gca().transAxes)

    if is_display_attached():
        plt.show()

    if len(sys.argv) == 3:
        filename = sys.argv[2]
        if not filename.endswith('.jpg'):
            filename += '.jpg'
        print("Saving the latency chart to:", filename)
        plt.savefig(filename, format='jpg', dpi=300)

    # Print the summary to stdout
    print(file_info_text)
    print(video_properties_text)
    print(average_latency_text)


if __name__ == "__main__":
    main()
