The qtivoverlay element is a hardware-accelerated, in-place drawing and blitting plugin designed to render visual overlay objects directly on top of incoming video frames, such as YUV or RGB buffers. By compositing overlays directly onto the frame data, it enables the processed video to be displayed on a screen or forwarded for encoding with minimal additional overhead. This makes it well suited for real-time video applications where performance and low latency are important.

The element supports a wide range of overlay content, including user-defined graphics and annotations such as logos, bounding boxes, custom labels, date and time, buffer timestamps, privacy masks, and other visual indicators. In addition, qtivoverlay can render dynamic overlay information carried in GstMeta, which is commonly used to attach AI or analytics results to each frame for post-processing and visualization.

To achieve efficient overlay rendering, qtivoverlay combines CPU-based drawing with GPU-based blending:
CPU rendering with Cairo: Overlay content is first drawn using the open-source Cairo graphics library into compact, memory-efficient overlay buffers.
GPU hardware blending: These rendered overlay buffers are then blended with the main video frame using GPU hardware acceleration.
This hybrid approach helps reduce memory usage and improves overall performance, especially in pipelines handling high-resolution or high-frame-rate video streams.

The overlays supported by the element can be described in two ways:

Through element properties

These are overlays configured manually by the user, including:
Static images or logos
Bounding boxes
Date and/or time
Buffer timestamps
Custom text
Privacy masks
Through buffer metadata

These overlays are attached to each input frame as metadata, and are typically added by the qtimetamux plugin. This metadata is used to draw machine-learning-related overlays such as:
Detection overlays, including bounding boxes
Segmentation overlays, such as semantic masks or mask images
Classification overlays, such as labels or user text
Pose graph overlays
In summary, qtivoverlay is a hardware-accelerated, in-place image drawing and blitting plugin for overlaying visual annotations on video frames. It supports both manually configured overlays and metadata-driven overlays, making it suitable for use cases such as video analytics, AI inference visualization, and privacy masking.
# Replace $HOME with the appropriate device path before running the commands.
# For QLI: /root
# For Ubuntu: /home/ubuntu
# Modify this based on your platform and ensure files are copied to the correct location on the device.
# Run from your host machine; replace <user> and <device-ip>
ssh <user>@<device-ip> "mkdir -p $HOME/{models,labels,media,media/output}"
scp yolo_x_w8a8.tflite <user>@<device-ip>:$HOME/models/
scp yolov8.json <user>@<device-ip>:$HOME/labels/
scp ai_demo_sample.mp4 <user>@<device-ip>:$HOME/media/
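Because `$HOME` inside double quotes is expanded by the host shell before ssh or scp runs, one way to follow the "replace $HOME" instruction is to keep the device path in a separate variable. The sketch below is only an illustration: `device_home` and `DEVICE_HOME` are hypothetical names, not part of the platform tooling, and the two paths are the ones listed in the comments above.

```shell
# Hypothetical helper: map a platform name to the device home directory
# given in the instructions (QLI: /root, Ubuntu: /home/ubuntu).
device_home() {
  case "$1" in
    qli)    echo "/root" ;;
    ubuntu) echo "/home/ubuntu" ;;
    *)      echo "unknown platform: $1" >&2; return 1 ;;
  esac
}

# Usage (run on the host; replace <user> and <device-ip>):
#   DEVICE_HOME=$(device_home qli)
#   ssh <user>@<device-ip> "mkdir -p $DEVICE_HOME/models $DEVICE_HOME/labels $DEVICE_HOME/media/output"
#   scp yolo_x_w8a8.tflite <user>@<device-ip>:$DEVICE_HOME/models/
```

Keeping the path in one variable means the same commands can be reused across platforms by changing a single argument.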
3
Connect to device
# Run from your host machine; replace <user> and <device-ip>
ssh <user>@<device-ip>
Manually set multiple custom bounding boxes as a list of GstStructures with a unique name and three parameters: position, dimensions, and color. The position and dimensions are mandatory for new entries.
Manually set multiple custom BGRA images as a list of GstStructures with a unique name and three mandatory parameters: path, resolution, and destination.
Manually set multiple masks as a list of GstStructures with a unique name and parameters color and either circle=<X, Y, RADIUS> or rectangle=<X, Y, WIDTH, HEIGHT> (polygon also supported).
Manually set multiple custom strings as a list of GstStructures with a unique name and four parameters: contents, fontsize, position, and color. The contents field is mandatory for new entries.
Manually set timestamps using GstStructures. Use Date/Time for displaying formatted date and time with optional parameters format, fontsize, position, and color. Use PTS/DTS for displaying buffer timestamps with optional parameters fontsize, position, and color.
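The property descriptions above all share one pattern: a unique name followed by comma-separated parameters. As a rough sketch of that pattern, the hypothetical helper `gst_struct` below joins a name and its `key=value` fields into a single structure string. The color value format and the use of angle-bracket lists for position and dimensions are assumptions based on the circle and rectangle syntax shown above, and the wrapping needed to pass a list of such structures to the element is not covered here.

```shell
# Hypothetical helper: join a structure name and its key=value fields
# with commas. Runs in a subshell so its variables stay local.
gst_struct() (
  out="$1"; shift
  for field in "$@"; do
    out="$out,$field"
  done
  printf '%s\n' "$out"
)

# Example structures; names, values, and the color format are placeholders.
gst_struct Mask1 color=0xFF0000FF 'circle=<400,400,200>'
gst_struct Box1 'position=<100,100>' 'dimensions=<200,150>'
```

Quoting each field separately keeps the shell from interpreting the angle brackets as redirections.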
Incoming video buffers may contain additional metadata in the form of GstMeta. The element inspects each buffer for supported metadata types and renders any detected metadata directly on top of the corresponding image frame. Each supported metadata type is processed and displayed according to its own visual representation.
This metadata describes a region of interest (ROI) within the frame, typically representing an object such as a person, vehicle, keyboard, or any other detected item. A rectangle is drawn around the ROI using the specified X/Y coordinates, width, and height. The rectangle is rendered with a visible border and a transparent interior so that the object inside the region remains visible.
This metadata describes a collection of landmark points that may represent human pose key-points, facial features, hand joints, or other connected points of interest. Both the individual points and the lines connecting related points are drawn on the image, allowing the landmark structure to be visualized clearly.
This metadata contains a list of possible labels or classifications for the entire image. All associated labels are rendered in the top-left corner of the frame, making the classification results easy to view at a glance.
This metadata represents optical flow motion vectors, which describe the movement of pixels between two sequential video frames. The vectors indicate motion direction and magnitude, making them useful for visualizing frame-to-frame movement in the video stream.
In addition to buffer metadata, the plugin supports user-defined overlay objects that can be configured through properties. These overlays can be added, updated, or removed during runtime. Once configured, each overlay remains visible on every incoming video frame until the user explicitly removes it.
Static images such as logos, icons, or watermarks can be added using the images property.
All images provided through this property must be in raw BGRA format. The property payload follows the GStreamer structure layout and requires the following mandatory parameters when first set:

Parameter    Description
path         URL or path to the image file
resolution   Width and height of the image in pixels
destination  Top-left X/Y coordinates where the image should be placed, along with its final rendered size
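Since destination combines a top-left position with a rendered size, placing an image relative to another corner takes a small calculation. The following sketch computes a rectangle anchored to the bottom-right corner of the frame. The helper name `dest_bottom_right` is hypothetical, and the `<X,Y,WIDTH,HEIGHT>` ordering is an assumption inferred from the description of the destination parameter.

```shell
# Hypothetical helper: destination rectangle that anchors a DST_W x DST_H
# image to the bottom-right corner of a FRAME_W x FRAME_H frame, leaving
# MARGIN pixels to the right and bottom edges.
dest_bottom_right() (
  frame_w=$1; frame_h=$2; dst_w=$3; dst_h=$4; margin=$5
  x=$(( frame_w - dst_w - margin ))
  y=$(( frame_h - dst_h - margin ))
  printf '<%d,%d,%d,%d>\n' "$x" "$y" "$dst_w" "$dst_h"
)

# A 320x180 logo on a 1920x1080 frame with a 20-pixel margin.
dest_bottom_right 1920 1080 320 180 20   # prints <1580,880,320,180>
```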
Text overlays such as captions, titles, annotations, or labels can be set using the strings property. The property payload follows the GStreamer structure layout and requires the following parameters when first set:

Parameter  Description
contents   Text content to be displayed
fontsize   Font size of the text
position   Top-left X/Y coordinates where the text should appear
Timestamps such as the current date/time or buffer timestamps can be displayed using the timestamps property. The property payload follows the GStreamer structure layout and requires the following parameters when first set:

Parameter  Description
format     Date/time formatting string. Not applicable for PTS/DTS
fontsize   Font size of the timestamp
position   Top-left X/Y coordinates where the timestamp should be placed
Privacy masks can be used to obscure specific portions of the image using the masks property. The property payload follows the GStreamer structure layout and requires a unique name along with a color value and one of the supported shape definitions:

Shape      Syntax                                 Description
circle     circle=<X, Y, RADIUS>                  Draws a circular mask centered at the specified X/Y position with the given radius
rectangle  rectangle=<X, Y, WIDTH, HEIGHT>        Draws a rectangular mask using the specified coordinates and dimensions
polygon    polygon=<X1, Y1, X2, Y2, X3, Y3, ...>  Draws a polygonal mask defined by multiple coordinate points
Rectangles used to highlight a region or object within the frame can be configured using the bboxes property. The property payload follows the GStreamer structure layout and requires the following parameters when first set:

Parameter  Description
position   Top-left X/Y coordinates where the rectangle should be placed
Sample pipeline displaying user-defined text positioned at the top-left corner of the video frames (live preview from camera), along with a circular privacy mask applied to conceal a selected region of the image.
Sample pipeline containing a single-stage AI inference that performs object detection on the video frame. The AI stage generates ROI metadata in string format and attaches it to the main frame. This metadata is then rendered as an overlay using the qtivoverlay plugin and displayed on the output video.
If any downloaded file is a .zip archive, extract it on your host machine before copying:
unzip filename.zip
2
Copy files to device
# Run from your host machine; replace <user> and <device-ip>
ssh <user>@<device-ip> "mkdir -p $HOME/{models,labels}"
scp yolox_w8a8.tflite <user>@<device-ip>:$HOME/models/
scp yolov8.json <user>@<device-ip>:$HOME/labels/
3
Connect to device
# Run from your host machine; replace <user> and <device-ip>
ssh <user>@<device-ip>
This pipeline demonstrates how to blit a static image (e.g., a company logo or watermark) onto a live video stream using the images property of qtivoverlay. The image is loaded once at pipeline start and composited onto every frame at the specified position and size.
The image file must be in raw BGRA format. Save your image as logo.bgra before running this pipeline.
1
Prepare Required Files
File        Description                                                        Save as
Logo image  Raw BGRA format image file (e.g., your company logo or watermark)  logo.bgra
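A raw BGRA file carries no header, so its byte count should equal width x height x 4 for the resolution you pass to the element. The sketch below shows one possible way to produce and sanity-check such a file on the host. It assumes ffmpeg is installed and that the source image is a 480x360 file named logo.png; neither is stated in this guide, and the helper names `bgra_size` and `check_bgra` are hypothetical.

```shell
# Expected size in bytes of a raw BGRA image: 4 bytes per pixel.
bgra_size() {
  echo $(( $1 * $2 * 4 ))
}

# Succeeds only if FILE is exactly WIDTH x HEIGHT x 4 bytes long.
check_bgra() {
  [ "$(( $(wc -c < "$1") ))" -eq "$(bgra_size "$2" "$3")" ]
}

# Usage on the host (assumes ffmpeg is installed and logo.png is 480x360):
#   ffmpeg -i logo.png -f rawvideo -pix_fmt bgra logo.bgra
#   check_bgra logo.bgra 480 360 && echo "logo.bgra size matches 480x360"
```

A size mismatch usually means the resolution given to the element will not match the file contents, so it is worth checking before copying the file to the device.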
2
Copy files to device
# Run from your host machine; replace <user> and <device-ip>
ssh <user>@<device-ip> "mkdir -p $HOME/media"
scp logo.bgra <user>@<device-ip>:$HOME/media/
3
Connect to device
# Run from your host machine; replace <user> and <device-ip>
ssh <user>@<device-ip>
Applying Timestamp — DD/MM/YY and Time as Text Overlay
This pipeline demonstrates how to render the current date in DD/MM/YYYY format and the current time as a live text overlay on each video frame using the timestamps property of qtivoverlay. The timestamp is updated automatically on every buffer.
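To preview what a date and time pattern produces before using it in a pipeline, the pattern can be tried with the `date` command. This sketch assumes the format parameter of the timestamps property accepts strftime-style specifiers, which this guide does not confirm; `TS_FORMAT` and `preview_ts` are hypothetical names used only for illustration.

```shell
# Assumed strftime-style pattern for DD/MM/YYYY followed by the time of day.
TS_FORMAT='%d/%m/%Y %H:%M:%S'

# Print the current host time rendered with the given pattern.
preview_ts() {
  date +"$1"
}

preview_ts "$TS_FORMAT"
```

The output reflects the host clock at the moment the command runs, so it only illustrates the layout of the rendered text.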
This pipeline demonstrates how to apply multiple privacy masks of different shapes — a circle and a rhombus (polygon) — to obscure sensitive regions of the video frame using the masks property of qtivoverlay. Both masks are rendered simultaneously on every frame.
Multiple masks can be combined in a single masks property by listing multiple GStreamer structures separated by commas. Each mask is rendered independently and can use a different shape, position, and color.
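A rhombus can be expressed as a four-point polygon derived from its center and half-diagonals. The sketch below computes those points and builds one circle and one polygon structure. The helper `rhombus_points`, the mask names, and the color values are placeholders, and how the structures are wrapped and quoted inside the masks property value is left out because it depends on the element's accepted syntax.

```shell
# Hypothetical helper: print the four vertices of a rhombus centered at
# (CX, CY) with horizontal half-diagonal RX and vertical half-diagonal RY,
# in the order top, right, bottom, left.
rhombus_points() (
  cx=$1; cy=$2; rx=$3; ry=$4
  printf '<%d,%d,%d,%d,%d,%d,%d,%d>\n' \
    "$cx" "$((cy - ry))" "$((cx + rx))" "$cy" \
    "$cx" "$((cy + ry))" "$((cx - rx))" "$cy"
)

# Two independent mask structures; names and color values are placeholders.
circle_mask="Mask1,color=0x000000FF,circle=<300,300,120>"
rhombus_mask="Mask2,color=0x000000FF,polygon=$(rhombus_points 640 360 100 60)"

printf '%s\n%s\n' "$circle_mask" "$rhombus_mask"
```

For a center of (640, 360) with half-diagonals 100 and 60, the polygon becomes `<640,300,740,360,640,420,540,360>`.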