Overview

The qtivtransform plugin is a GStreamer video transformation element designed for real-time, hardware-accelerated manipulation of video frames. It plays a critical role in video and vision processing pipelines where performance, flexibility, and low latency are essential.

This plugin enables developers to apply a variety of transformations - such as resizing, rotating, flipping, cropping, and color conversion - all while maintaining high throughput. Modern multimedia applications often require video frames to be transformed before further processing. For example:

Resizing: A video stream may need to be resized to fit the native resolution of a screen (e.g., resizing 1920×1080 to 1280×720 for a smaller display).
Rotating: Mobile devices or webcams may produce rotated frames depending on how the device is held. Rotating the frame ensures proper display alignment.
Flipping/Mirroring: Front cameras often produce mirrored images. Horizontal flipping corrects this for natural viewing.
Cropping: Cropping can help convert non-standard aspect ratios (e.g., 4:3 or 21:9) to standard ones like 16:9 for compatibility with downstream encoders or players.
Color conversion: Transform between formats (e.g., RGB to NV12) for compatibility with downstream components.

qtivtransform addresses these needs efficiently by leveraging GPU acceleration, making it suitable for a wide range of platforms - from low-power embedded systems to high-performance multimedia setups.

Example Pipeline

Download Required Files

File	Download	Save as
Sample video	Input video	`video.mp4`

Copy files to device

# Replace $HOME with the appropriate device path before running the commands.
# For QLI:    /root
# For Ubuntu: /home/ubuntu
# Modify this based on your platform and ensure files are copied to the correct location on the device.
# Run from your host machine — replace <user> and <device-ip>

ssh <user>@<device-ip> "mkdir -p $HOME/{media,media/output}"
scp video.mp4   <user>@<device-ip>:$HOME/media/

Connect to device

# Run from your host machine — replace <user> and <device-ip>
ssh <user>@<device-ip>

Set environment variables

Run the commands below on your device:

mkdir -p $HOME/{media,media/output}
export SRC_VIDEO_NAME=video.mp4

Run the pipeline

gst-launch-1.0 -e \
filesrc location=$HOME/media/$SRC_VIDEO_NAME ! \
decodebin ! \
videoconvert ! \
video/x-raw,format=NV12 ! \
qtivtransform ! \
videoconvert ! \
waylandsink sync=false

Hierarchy

GObject
   GstObject
      GstElement
         GstBaseTransform
            qtivtransform

Pad Templates

sink

Capabilities
`video/x-raw`	`format: { NV12, NV21, YUY2, P010_10LE, NV12_10LE32, RGBA, BGRA, ARGB, ABGR, RGBx, BGRx, xRGB, xBGR, RGB, BGR, GRAY8, NV12_Q08C }` `width: [1, 32767]` `height: [1, 32767]` `framerate: [0/1, 255/1]`
Availability: Always
Direction: sink

src

Capabilities
`video/x-raw`	`format: { NV12, NV21, YUY2, P010_10LE, RGBA, BGRA, ARGB, ABGR, RGBx, BGRx, xRGB, xBGR, RGB, BGR, RGBP, BGRP, GRAY8, NV12_Q08C }` `width: [1, 32767]` `height: [1, 32767]` `framerate: [0/1, 255/1]`
Availability: Always
Direction: source
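
The two format lists are not identical: NV12_10LE32 appears only on the sink pad, while RGBP and BGRP appear only on the src pad. The sketch below is one way to check a planned conversion against the lists before building a pipeline. The helper names `pad_supports` and `can_convert` are hypothetical and not part of the plugin. A format pair being listed on both pads does not guarantee that every engine backend handles that particular conversion.

```shell
#!/bin/sh
# Format lists copied from the sink and src pad templates above.
SINK_FORMATS="NV12 NV21 YUY2 P010_10LE NV12_10LE32 RGBA BGRA ARGB ABGR RGBx BGRx xRGB xBGR RGB BGR GRAY8 NV12_Q08C"
SRC_FORMATS="NV12 NV21 YUY2 P010_10LE RGBA BGRA ARGB ABGR RGBx BGRx xRGB xBGR RGB BGR RGBP BGRP GRAY8 NV12_Q08C"

# pad_supports PAD FORMAT
# Returns 0 if FORMAT is listed for PAD ("sink" or "src"), 1 if not,
# and 2 if PAD is not a known pad name. (Hypothetical helper.)
pad_supports() {
    case "$1" in
        sink) list=$SINK_FORMATS ;;
        src)  list=$SRC_FORMATS ;;
        *)    return 2 ;;
    esac
    for f in $list; do
        if [ "$f" = "$2" ]; then
            return 0
        fi
    done
    return 1
}

# can_convert IN_FORMAT OUT_FORMAT
# Returns 0 only if IN_FORMAT is a sink format and OUT_FORMAT is a src format.
can_convert() {
    pad_supports sink "$1" && pad_supports src "$2"
}

if can_convert NV12 RGB; then
    echo "NV12 -> RGB: listed on both pads"
fi
if ! can_convert RGBP RGB; then
    echo "RGBP -> RGB: RGBP is not listed as a sink format"
fi
```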

Element Properties

Property	Description
`background`	Defines the background color used when the destination rectangle does not fill the entire output frame. `Type: Unsigned Integer` `Default: 4286611584` `Range: 0 - 4294967295` `Flags: readable/writable`
`crop`	Defines the crop rectangle on the input frame in the format `<X, Y, WIDTH, HEIGHT>`. Cropping is applied immediately upon frame reception and cannot be time-synchronized. `Type: GstValueArray of type gint` `Default: "<0, 0, 0, 0>"` `Flags: readable/writable` `Example: crop="<0,0,1280,720>"`
`destination`	Specifies the destination rectangle within the output frame in the format `<X, Y, WIDTH, HEIGHT>`. Useful for positioning the transformed content within a larger output frame. `Type: GstValueArray of type gint` `Default: "<0, 0, 0, 0>"` `Flags: readable/writable` `Example: destination="<0,0,1280,720>"`
`engine`	Engine backend used for the transformation operations. `Type: Enum` `Default: 2, "gles"` `Range:` `(0): none - No backend used` `(2): gles - Use OpenGLES based video converter` `(3): fcv - Use FastCV-based video converter` `Flags: readable/writable` `Example: engine="gles" (or) engine=2`
`engine-param`	Additional parameters for the selected engine backend. `Type: String` `Default: NULL` `Flags: readable/writable`
`flip-horizontal`	If set to true, the video frame is flipped horizontally (mirrored). Commonly used for correcting front-facing camera output. `Type: Boolean` `Default: false` `Flags: readable/writable`
`flip-vertical`	If set to true, the video frame is flipped vertically. Useful for correcting upside-down camera feeds. `Type: Boolean` `Default: false` `Flags: readable/writable`
`rotate`	Specifies the rotation to apply to the video frame. `Type: Enum` `Default: 0, "none"` `Range:` `(0): none - No rotation` `(1): 90CW - Rotate 90 degrees clockwise` `(2): 90CCW - Rotate 90 degrees counter-clockwise` `(3): 180 - Rotate 180 degrees` `Flags: readable/writable` `Example: rotate="90CW" (or) rotate=1`
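
The `background` default of 4286611584 is easier to read in hexadecimal, where it is 0xFF808080. The sketch below packs four 8-bit values into the single unsigned integer the property expects. The helper name `pack_color` is hypothetical, and this page does not state which byte maps to which color channel, so the arguments are treated purely by position (most significant byte first).

```shell
#!/bin/sh
# pack_color B3 B2 B1 B0
# Combines four 8-bit values (0-255) into one 32-bit unsigned integer,
# most significant byte first. The channel meaning of each byte is NOT
# specified on this page; treat any ARGB/RGBA reading as an assumption.
pack_color() {
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

pack_color 255 128 128 128     # 4286611584, the documented default
printf '0x%08X\n' 4286611584   # 0xFF808080
```

The resulting number can be passed directly on the command line, for example `qtivtransform background=4278190080` for the value 0xFF000000.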

Cropping Behavior

Cropping in qtivtransform is applied directly to the input frame, before any other transformation operations. This means the crop region is extracted from the original frame as it arrives at the input pad, ensuring that subsequent steps - such as scaling or color conversion - operate only on the cropped region. This approach improves performance by reducing the number of pixels processed downstream and ensures that transformations are applied precisely to the intended area of the frame. The diagram below illustrates this behavior: the crop region is selected from the full input frame, and only that region is passed forward for further processing.

Internal Architecture

The qtivtransform plugin operates as a GStreamer element with a straightforward but efficient internal architecture. It consists of two pads:

Input Pad: Receives video frames from upstream elements (e.g. camera source, decoder).
Output Pad: Sends transformed frames downstream (e.g. display, encoder).

Processing Flow

Caps negotiation

The output format and resolution of qtivtransform are determined by the output caps negotiated during GStreamer pipeline setup.
These caps define the resolution, color format, and other properties that downstream elements expect, and qtivtransform adapts its output accordingly.
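
In a `gst-launch-1.0` pipeline, the usual way to steer this negotiation is a caps string placed directly after the element. The sketch below builds such a string. The helper name `out_caps` is hypothetical; any field left out of the string remains open for negotiation with the downstream element.

```shell
#!/bin/sh
# out_caps WIDTH HEIGHT [FORMAT]
# Prints a raw-video caps string. FORMAT is optional; when omitted,
# the pixel format is left to caps negotiation. (Hypothetical helper.)
out_caps() {
    caps="video/x-raw,width=$1,height=$2"
    if [ -n "${3:-}" ]; then
        caps="$caps,format=$3"
    fi
    echo "$caps"
}

# Print the tail of a pipeline that requests 1280x720 RGB output.
echo "qtivtransform ! $(out_caps 1280 720 RGB)"
# prints: qtivtransform ! video/x-raw,width=1280,height=720,format=RGB
```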

Property Parsing and Configuration

Transformation parameters (e.g., rotation, flip, crop region, destination rectangle) are parsed from the element’s properties.
These parameters are stored in an internal configuration structure that guides the transformation logic.

Frame Reception

Incoming video frames are received through the sink pad (input pad).
The plugin validates the frame format and dimensions.

Buffer Allocation and Pool Negotiation

The plugin expects DMA-backed buffers for zero-copy performance and GPU compatibility.
If the upstream element does not provide DMA buffers:
- qtivtransform offers its own buffer pool via GStreamer’s allocation mechanism.
- If the upstream accepts the pool, compatible buffers are allocated.
- If not, the plugin fails during negotiation and reports an error.
This ensures buffer memory layout is optimized for the selected engine.

Transformation Execution

The GPU engine performs the requested operations (e.g., rotation, cropping, resizing).
Scaling is performed before color conversion to minimize GPU memory bandwidth usage.

Frame Output

The transformed frame is pushed to the source pad (output pad).
Downstream elements (e.g. encoders, displays) receive the processed frame for further handling.

Buffer Management and Pool Requirements

qtivtransform is designed to operate efficiently with DMA-allocated buffers, which are essential for zero-copy performance and hardware acceleration. To ensure compatibility and optimal throughput, the plugin enforces specific buffer handling requirements:

DMA Buffer Requirement
- The plugin expects incoming buffers to be DMA-backed.
- This is crucial for interoperability with GPU-based backends and for minimizing memory copies during transformation.
Buffer Pool Negotiation
- If the upstream element does not provide DMA buffers, qtivtransform can offer its own buffer pool to the upstream element via the standard GStreamer allocation mechanism.
- This allows the upstream plugin to allocate compatible buffers from qtivtransform’s pool.
Fallback Behavior
- If the upstream element does not support buffer pool negotiation (i.e., cannot accept a pool from qtivtransform), the plugin will not function correctly.
- In such cases, pipeline setup will fail, and an error will be reported during negotiation.

This behavior ensures that buffer allocation is tightly controlled and optimized for the transformation backend.

Usage

Downscale a YUV video stream to RGB format

This pipeline demonstrates how qtivtransform can be used to convert and downscale a YUV (NV12) video stream to RGB format.
This is useful in scenarios where:
- You need to dump RGB frames to disk for use in a custom image processing or computer vision algorithm that expects raw RGB input.
- A downstream plugin or application requires RGB format instead of NV12.

gst-launch-1.0 -e --gst-debug=2 qticamsrc ! video/x-raw,width=1920,height=1080,format=NV12,framerate=5/1 ! queue ! qtivtransform ! video/x-raw,width=1280,height=720,format=RGB ! multifilesink location=$HOME/media/frame_%d.rgb
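
When dumping raw frames, it helps to know how large each file should be so that truncated or padded output can be spotted. The sketch below computes the size of one frame for a few of the supported formats. The helper name `frame_bytes` is hypothetical, and the arithmetic assumes tightly packed rows with no stride or alignment padding, which may not hold for hardware-allocated buffers.

```shell
#!/bin/sh
# frame_bytes FORMAT WIDTH HEIGHT
# Prints the size in bytes of one tightly packed frame.
# ASSUMPTION: no row stride or plane alignment padding.
frame_bytes() {
    case "$1" in
        # 8-bit 4:2:0: full-resolution Y plane plus a half-size UV plane
        NV12|NV21) echo $(( $2 * $3 * 3 / 2 )) ;;
        # 3 bytes per pixel
        RGB|BGR) echo $(( $2 * $3 * 3 )) ;;
        # 4 bytes per pixel
        RGBA|BGRA|ARGB|ABGR|RGBx|BGRx|xRGB|xBGR) echo $(( $2 * $3 * 4 )) ;;
        # 1 byte per pixel
        GRAY8) echo $(( $2 * $3 )) ;;
        *) echo "frame_bytes: format not handled: $1" >&2; return 1 ;;
    esac
}

frame_bytes NV12 1920 1080   # 3110400 (input frame)
frame_bytes RGB 1280 720     # 2764800 (each frame_%d.rgb file)
```

Under the same assumption, the 5 fps pipeline above writes roughly 13.8 MB of RGB data per second.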

Rotation and horizontal flip

This pipeline captures video frames from a Qualcomm camera source (qticamsrc) at 1920×1080 resolution in NV12 format. The frames are passed to qtivtransform, which applies a 90-degree clockwise rotation and a horizontal flip. After transformation, the frames are encoded using the hardware-accelerated H.264 encoder (v4l2h264enc). The encoded stream is parsed (h264parse), multiplexed into an MP4 container (mp4mux), and written to disk via filesink.

gst-launch-1.0 -e --gst-debug=2 qticamsrc ! video/x-raw,width=1920,height=1080,format=NV12,framerate=30/1 ! queue ! qtivtransform rotate=90CW flip-horizontal=true ! v4l2h264enc capture-io-mode=4 output-io-mode=4 ! h264parse ! mp4mux ! queue ! filesink location="$HOME/media/video.mp4"
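
Geometrically, a 90-degree rotation swaps the width and height of a frame, while a 180-degree rotation and the flips leave them unchanged. The sketch below works out the rotated dimensions for the `rotate` values listed in the properties table. The helper name `rotated_size` is hypothetical. Whether qtivtransform negotiates the swapped dimensions on its own when no output caps are given is not stated on this page, so treat the result as the size to request explicitly in the output caps if needed.

```shell
#!/bin/sh
# rotated_size ROTATE WIDTH HEIGHT
# Prints "WIDTH HEIGHT" of the frame after the given rotation.
# ROTATE uses the enum nicks from the properties table. (Hypothetical helper.)
rotated_size() {
    case "$1" in
        90CW|90CCW) echo "$3 $2" ;;   # quarter turn: dimensions swap
        none|180)   echo "$2 $3" ;;   # dimensions unchanged
        *) echo "rotated_size: unknown rotation: $1" >&2; return 1 ;;
    esac
}

rotated_size 90CW 1920 1080   # 1080 1920
rotated_size 180 1920 1080    # 1920 1080
```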

Cropping and Scaling

The following GStreamer pipeline demonstrates how qtivtransform performs cropping and scaling before encoding the video:
- Input: 1280×720 NV12 video stream from qticamsrc.
- Cropping: qtivtransform crops a 640×360 region starting at (320,180) from the input frame. This is done immediately upon receiving the frame.
- Scaling: The cropped region is then scaled up to 1920×1080 as specified by the output caps.
- Encoding: The transformed frame is encoded using v4l2h264enc and saved as an MP4 file.
```
gst-launch-1.0 -e --gst-debug=2 qticamsrc ! video/x-raw,width=1280,height=720,format=NV12,framerate=30/1 ! queue ! qtivtransform crop="<320,180,640,360>" ! video/x-raw,width=1920,height=1080 ! v4l2h264enc capture-io-mode=4 output-io-mode=4 ! h264parse ! mp4mux ! queue ! filesink location="$HOME/media/video.mp4"
```
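
For the aspect-ratio use case (for example, turning a 4:3 frame into 16:9), the crop rectangle has to be computed from the input size. The sketch below finds the largest centered rectangle with a given aspect ratio and prints it in the `<X,Y,WIDTH,HEIGHT>` form used by the `crop` property. The helper name `center_crop` is hypothetical, and rounding all values down to even numbers is an assumption made here to suit chroma-subsampled formats such as NV12; the plugin's actual alignment requirements are not documented on this page.

```shell
#!/bin/sh
# center_crop IN_W IN_H AR_W AR_H
# Prints the largest centered crop rectangle with aspect ratio AR_W:AR_H
# as "<X,Y,WIDTH,HEIGHT>". (Hypothetical helper.)
center_crop() {
    in_w=$1; in_h=$2; ar_w=$3; ar_h=$4
    if [ $(( in_w * ar_h )) -gt $(( in_h * ar_w )) ]; then
        # Input is wider than the target: keep full height, trim the sides.
        h=$in_h
        w=$(( in_h * ar_w / ar_h ))
    else
        # Input is taller than (or equal to) the target: keep full width.
        w=$in_w
        h=$(( in_w * ar_h / ar_w ))
    fi
    # ASSUMPTION: round size and offset down to even values.
    w=$(( w / 2 * 2 ))
    h=$(( h / 2 * 2 ))
    x=$(( (in_w - w) / 4 * 2 ))
    y=$(( (in_h - h) / 4 * 2 ))
    echo "<$x,$y,$w,$h>"
}

center_crop 1280 960 16 9    # <0,120,1280,720>
center_crop 2560 1080 16 9   # <320,0,1920,1080>
```

The printed value can be quoted and passed to the element, for example `qtivtransform crop="<0,120,1280,720>"`.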
