Overview
qtivcomposer is a video composition element that combines multiple video streams into a single output frame using GPU hardware acceleration. At a fundamental level, video composition involves taking multiple independent video frames and arranging them into one output frame according to defined spatial rules. These rules determine the position of each input frame and, when necessary, resize it to fit its destination window in the output frame. Resizing may involve either downscaling or upscaling, depending on the source and destination rectangles.

qtivcomposer is commonly used in scenarios such as:
- Multi-camera systems, such as surround-view and security grid layouts
- Picture-in-picture compositions
- Video wall applications
- Overlaying auxiliary video streams onto a primary video feed

Example Pipeline
Download Required Files
| File | Download | Save as |
|---|---|---|
| Sample video | Input video | video.mp4 |
Hierarchy
GObject
GstObject
GstElement
GstAggregator
GstVideoAggregator
qtivcomposer
Pad Templates
sink
| Capabilities | |
|---|---|
| video/x-raw | format: { NV12, NV21, UYVY, YUY2, P010_10LE, RGBA, BGRA, ARGB, ABGR, RGBx, BGRx, xRGB, xBGR, RGB, BGR, GRAY8, NV12_Q08C } width: [1, 32767] height: [1, 32767] framerate: [0/1, 255/1] |
| Availability: On request | |
| Direction: sink | |
src
| Capabilities | |
|---|---|
| video/x-raw | format: { NV12, NV21, UYVY, YUY2, P010_10LE, RGBA, BGRA, ARGB, ABGR, RGBx, BGRx, xRGB, xBGR, RGB, BGR, GRAY8, NV12_Q08C } width: [1, 32767] height: [1, 32767] framerate: [0/1, 255/1] |
| Presence: Always | |
| Direction: source | |
Pad Properties
sink
| Property | Description |
|---|---|
| alpha | Alpha channel value. Type: Double. Default: 1. Range: 0 - 1. Flags: readable/writable. |
| crop | The crop rectangle in the format <X, Y, WIDTH, HEIGHT>. Type: GstValueArray of type gint. Default: "<0, 0, 0, 0>". Flags: readable/writable. Example: crop="<0,0,1280,720>" |
| dimensions | The destination rectangle width and height in the format <WIDTH, HEIGHT>. If not set, the input dimensions are used. Type: GstValueArray of type gint. Default: "< >". Flags: readable/writable. Example: dimensions="<1280,720>" |
| flip-horizontal | Flips the video frame horizontally. Type: Boolean. Default: false. Flags: readable/writable. |
| flip-vertical | Flips the video frame vertically. Type: Boolean. Default: false. Flags: readable/writable. |
| position | The X and Y coordinates of the destination rectangle's top-left corner in the format <X, Y>. Type: GstValueArray of type gint. Default: "< >". Flags: readable/writable. Example: position="<10,20>" |
| rotate | Specifies the rotation to apply to the video. Type: Enum. Default: 0, "none". Range: (0): none; (1): 90CW - Rotate 90 degrees clockwise; (2): 90CCW - Rotate 90 degrees counter-clockwise; (3): 180 - Rotate 180 degrees. Flags: readable/writable. Example: rotate="90CW" (or) rotate=1 |
| zorder | Z-axis order. By default, follows the order of creation. Type: Integer. Default: -1. Range: -1 - 2147483647. Flags: readable/writable. |
Element Properties
| Property | Description |
|---|---|
| background | Background color. Type: Unsigned Integer. Default: 4286611584. Range: 0 - 4294967295. Flags: readable/writable. |
| engine | Engine backend used for the conversion operations. Type: Enum. Default: 2, "gles". Range: (0): none - No backend used; (2): gles - Use OpenGLES based video converter; (3): fcv - Use FastCV based video converter. Flags: readable/writable. Example: engine="gles" (or) engine=2 |
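
The default background value is easier to read in hexadecimal. The following Python sketch splits the 32-bit value into its four bytes. The helper name split_bytes is hypothetical, and the mapping of those bytes to color channels is not stated in the property description, so the code makes no claim about which byte is alpha, red, green, or blue.

```python
DEFAULT_BACKGROUND = 4286611584  # documented default of the background property


def split_bytes(value: int) -> tuple:
    """Split a 32-bit unsigned value into four bytes, most significant first.

    Hypothetical helper for illustration; the channel meaning of each byte
    is an open question here and is deliberately left unnamed.
    """
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("background must fit in the range 0 - 4294967295")
    return tuple((value >> shift) & 0xFF for shift in (24, 16, 8, 0))


print(hex(DEFAULT_BACKGROUND))          # 0xff808080
print(split_bytes(DEFAULT_BACKGROUND))  # (255, 128, 128, 128)
```

Three equal bytes of 128 with one byte at 255 suggest an opaque mid-gray, whichever channel order the element uses.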
Architecture, Processing, and Role in Vision AI
qtivcomposer is a multi-sink GStreamer element with one source pad. Each sink pad accepts one input stream, typically through dynamically requested pads such as sink_0, sink_1, and so on. The source pad produces the final composed frame.
Internally, the element includes four main parts: a pad manager that tracks active inputs, a layout configuration that defines where and how each input appears in the output, a hardware composition backend that performs the composition, and a buffer manager that enforces DMA-compatible memory usage.
At setup time, caps negotiation defines the output frame, including resolution and color format. Input streams must be compatible with the compositor. When needed, hardware-supported scaling and format conversion are applied during composition.
Each sink pad is configured with layout metadata that specifies:
- Output position (x, y)
- Output size (width, height)
- Z-order
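
As a minimal sketch of how this per-pad layout metadata maps onto the pad properties listed earlier, the Python snippet below formats one layout as property assignments. The PadLayout class and its to_properties method are hypothetical helpers, not part of the plugin. The pad::property form is the usual gst-launch-1.0 syntax for setting properties on request pads, and it is assumed here that qtivcomposer accepts it.

```python
from dataclasses import dataclass


@dataclass
class PadLayout:
    """Layout metadata for one sink pad (hypothetical helper for illustration)."""
    x: int
    y: int
    width: int
    height: int
    zorder: int = -1  # -1 is the documented default: follow creation order

    def to_properties(self, pad: str) -> str:
        """Format the layout as gst-launch style pad property assignments."""
        parts = [
            f'{pad}::position="<{self.x}, {self.y}>"',
            f'{pad}::dimensions="<{self.width}, {self.height}>"',
        ]
        # Emit zorder only when it differs from the default.
        if self.zorder >= 0:
            parts.append(f"{pad}::zorder={self.zorder}")
        return " ".join(parts)


# Two 640x360 inputs placed next to each other in a 1280x360 output.
left = PadLayout(x=0, y=0, width=640, height=360)
right = PadLayout(x=640, y=0, width=640, height=360)
print(left.to_properties("sink_0"))
print(right.to_properties("sink_1"))
```

The printed fragments are intended to be placed after the element name in a pipeline description.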
Usage
Side-by-side composition of two camera streams
This pipeline uses qtivcomposer to combine two NV12 camera streams into a single output frame arranged side by side, with the resulting output displayed on the screen.

Picture-in-picture (PiP) composition of two camera streams
This example uses qtivcomposer to overlay a secondary video stream on top of a primary stream, creating a PiP layout in a single output frame.
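
The geometry of a PiP layout can be worked out before any pad properties are set. The Python sketch below computes the inset rectangle for a given output size. The function name pip_rect, the default scale of one quarter, and the 16-pixel margin are illustrative assumptions, not values taken from the plugin.

```python
def pip_rect(out_w, out_h, scale=0.25, margin=16, corner="bottom-right"):
    """Return (x, y, width, height) of the inset window.

    Hypothetical helper: the inset is scaled relative to the output frame
    and anchored to one corner with a fixed margin.
    """
    if corner not in ("top-left", "top-right", "bottom-left", "bottom-right"):
        raise ValueError(f"unknown corner: {corner}")
    w = int(out_w * scale)
    h = int(out_h * scale)
    x = margin if corner.endswith("left") else out_w - w - margin
    y = margin if corner.startswith("top") else out_h - h - margin
    return x, y, w, h


# Primary stream fills the 1920x1080 output; secondary sits bottom-right.
primary = (0, 0, 1920, 1080)
inset = pip_rect(1920, 1080)
print(primary, inset)  # (0, 0, 1920, 1080) (1424, 794, 480, 270)
```

The first two numbers of each tuple correspond to the position property and the last two to dimensions. The inset pad would also need to be stacked above the primary pad through zorder, on the assumption that a higher value is drawn on top.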
Multi‑Stream Vision AI Video Wall
This example demonstrates how qtivcomposer serves as the final aggregation point for building a video wall: each stream runs its own AI inference pipeline, renders the resulting AI metadata as an overlay, and is then composed with the other streams into a single output frame.
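
For a video wall, each stream needs its own destination rectangle. The Python sketch below splits the output frame into a near-square grid and returns one cell per stream. The function name grid_layout and the choice of a near-square grid are assumptions made for illustration; any other arrangement works as long as each pad receives a position and dimensions.

```python
import math


def grid_layout(n, out_w, out_h):
    """Return a list of (x, y, width, height) cells for n streams.

    Hypothetical helper: columns = ceil(sqrt(n)), rows = ceil(n / columns),
    cells filled left to right, top to bottom.
    """
    if n < 1:
        raise ValueError("at least one stream is required")
    cols = math.ceil(math.sqrt(n))
    rows = math.ceil(n / cols)
    cell_w, cell_h = out_w // cols, out_h // rows
    return [
        ((i % cols) * cell_w, (i // cols) * cell_h, cell_w, cell_h)
        for i in range(n)
    ]


# Four streams on a 1920x1080 wall give a 2x2 grid of 960x540 cells.
for index, cell in enumerate(grid_layout(4, 1920, 1080)):
    print(f"sink_{index}", cell)
```

With a stream count that does not fill the grid, such as three streams in a 2x2 grid, the unused cell shows the background color.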

Download Required Files
| File | Download | Save as |
|---|---|---|
| YOLO model | Qualcomm Model | yolov8_det_quantized.tflite |
| YOLO labels | Labels | yolov8.json |
