
# qtimlonnx

> ONNX Runtime inference element for GStreamer AI and multimedia pipelines.

<Note>
  **qtimlonnx** is not available in the current build. Support for this plugin will be enabled in a future release.
</Note>

# Overview

**qtimlonnx** is a GStreamer inference element that executes **ONNX** models as part of AI and multimedia pipelines. The element operates entirely in **tensor mode**: it accepts input tensors on its sink pad and produces output tensors on its source pad according to the model's input and output specifications.

The element is limited to **model execution**. It does not perform preprocessing, tensor reshaping, batching, layout conversion, or model-specific post-processing. These functions are expected to be handled by adjacent elements in the pipeline. As a result, upstream elements must provide tensors that already match the model requirements, and downstream elements must interpret the output tensors produced by inference.

**qtimlonnx** supports two ONNX Runtime execution providers: **CPU** and **QNN**. The QNN provider offloads inference to the **Qualcomm AI accelerator / NPU** through the QNN runtime. This allows the same pipeline structure to be deployed across different hardware targets and tuned for different performance, latency, and power requirements. The element is intended for real-time and embedded AI pipelines where inference is one stage in a larger modular processing flow.

### Key Responsibilities

**qtimlonnx** is responsible for:

* Loading and executing an ONNX model via the ONNX Runtime
* Accepting preformatted input tensors from upstream elements
* Producing output tensors that match the model output signature
* Negotiating tensor data types and dimensions with adjacent elements
* Propagating tensor metadata required by downstream elements
* Automatically extracting quantization parameters (scale and zero-point) from the ONNX model graph and dequantizing quantized outputs to `FLOAT32` when required
* Automatically detecting NCHW or NHWC memory layout for 4-D output tensors and advertising the layout in output caps

In practice, **qtimlonnx** serves as the inference stage in the pipeline, while tensor preparation and result interpretation are handled externally.

## Example Pipeline

<Steps>
  <Step title="Download Required Files">
    | File             | Download                                                                                                                                               | Save as              |
    | ---------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | -------------------- |
    | YOLOX W8A8 model | [Qualcomm AI Hub — YOLOX](https://aihub.qualcomm.com/iot/models/yolox)                                                                                 | `yolo_x_w8a8.onnx`   |
    | Detection labels | <a href="../labels/yolov8.json" download="yolov8.json">yolov8.json</a>                                                                                 | `yolov8.json`        |
    | Sample video     | <a href="https://github.com/qualcomm/sample-apps-for-qualcomm-linux/raw/refs/heads/main/qualcomm-linux/artifacts/videos/demo_samples/">Input video</a> | `ai_demo_sample.mp4` |

    <Note>
      If any downloaded file is a `.zip` archive, extract it on your host machine before copying:
      `unzip filename.zip`
    </Note>
  </Step>

  <Step title="Copy files to device">
    <CodeGroup>
      ```bash SCP (SSH) theme={null}
      # Replace $HOME with the appropriate device path before running the commands.
      # For QLI:    /root
      # For Ubuntu: /home/ubuntu
      # Modify this based on your platform and ensure files are copied to the correct location on the device.
      # Run from your host machine — replace <user> and <device-ip>

      ssh <user>@<device-ip> "mkdir -p $HOME/{models,labels,media,media/output}"
      scp yolo_x_w8a8.onnx          <user>@<device-ip>:$HOME/models/
      scp yolov8.json                  <user>@<device-ip>:$HOME/labels/
      scp ai_demo_sample.mp4   <user>@<device-ip>:$HOME/media/
      ```
    </CodeGroup>
  </Step>

  <Step title="Connect to device">
    <CodeGroup>
      ```bash SSH theme={null}
      # Run from your host machine — replace <user> and <device-ip>
      ssh <user>@<device-ip>
      ```
    </CodeGroup>
  </Step>

  <Step title="Set environment variables">
    Run the following commands on your device:

    ```bash theme={null}
    export MODEL_NAME=yolo_x_w8a8.onnx
    export LABELS_NAME=yolov8.json
    export SRC_VIDEO_NAME=ai_demo_sample.mp4
    ```
  </Step>

  <Step title="Run the pipeline">
    ```bash theme={null}
    gst-launch-1.0 -e --gst-debug=2 \
    filesrc location=$HOME/media/$SRC_VIDEO_NAME ! qtdemux ! h264parse ! \
    v4l2h264dec capture-io-mode=4 output-io-mode=4 ! video/x-raw,format=NV12 ! queue ! \
    tee name=t ! qtimetamux name=obj_mux ! qtivoverlay ! waylandsink fullscreen=true sync=false \
    t. ! queue ! qtimlvconverter ! queue ! qtimlonnx model=$HOME/models/$MODEL_NAME execution-provider=qnn backend-path="/usr/lib/libQnnHtp.so" ! queue ! \
    qtimlpostprocess module=yolov8 labels=$HOME/labels/$LABELS_NAME settings="{\"confidence\": 51.0}" ! text/x-raw ! queue ! obj_mux.
    ```
  </Step>
</Steps>

# Hierarchy

[GObject](https://docs.gtk.org/gobject/)<br />
   <Icon icon="arrow-turn-down-right" iconType="solid" />[GstObject](https://gstreamer.freedesktop.org/documentation/gstreamer/gstobject.html?gi-language=c)<br />
      <Icon icon="arrow-turn-down-right" iconType="solid" />[GstElement](https://gstreamer.freedesktop.org/documentation/gstreamer/gstelement.html?gi-language=c)<br />
         <Icon icon="arrow-turn-down-right" iconType="solid" />[GstBaseTransform](https://gstreamer.freedesktop.org/documentation/base/gstbasetransform.html?gi-language=c)<br />
            <Icon icon="arrow-turn-down-right" iconType="solid" />qtimlonnx

# Pad Templates

### sink

| Capabilities             |                                                                                          |
| ------------------------ | ---------------------------------------------------------------------------------------- |
| `neural-network/tensors` | `format: { INT8, UINT8, INT16, UINT16, INT32, UINT32, INT64, UINT64, FLOAT16, FLOAT32 }` |
| Availability: *Always*   |                                                                                          |
| Direction: *sink*        |                                                                                          |

### src

| Capabilities             |                                                                                          |
| ------------------------ | ---------------------------------------------------------------------------------------- |
| `neural-network/tensors` | `format: { INT8, UINT8, INT16, UINT16, INT32, UINT32, INT64, UINT64, FLOAT16, FLOAT32 }` |
| Availability: *Always*   |                                                                                          |
| Direction: *source*      |                                                                                          |

# Element Properties

| Property               | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| ---------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `backend-path`         | Absolute path to the QNN backend shared library such as `libQnnHtp.so`. Used only when `execution-provider=qnn`. Determines which Qualcomm hardware accelerator is targeted.<br /><br />`Type: String`<br />`Default: NULL`<br />`Flags: readable/writable (construct)`                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| `execution-provider`   | Selects the ONNX Runtime Execution Provider (EP) used for inference.<br /><br />`Type: Enum `<br />`Default: 0, "cpu"`<br />`Range:`<br />    `(0): cpu - Default ONNX Runtime CPU execution. Runs all operations on the host CPU`<br />    `(1): qnn - Qualcomm QNN Execution Provider. Offloads inference to Qualcomm hardware via the QNN SDK. Requires backend-path to be set`<br />`Flags: readable/writable`                                                                                                                                                                                                                                                                                                                                |
| `htp-performance-mode` | Controls the power and performance trade-off on the Qualcomm Hexagon HTP. Applicable only when `execution-provider=qnn`.<br /><br />`Type: Enum `<br />`Default: 0, "default"`<br />`Range:`<br />    `(0): default - Default performance mode`<br />    `(1): burst - Maximum performance with highest power consumption`<br />    `(2): balanced - Balanced performance and power`<br />    `(3): low-balanced - Lower balanced performance`<br />    `(4): high-performance - High performance mode`<br />    `(5): extreme-power - Extreme power performance`<br />    `(6): low-power - Lowest power consumption`<br />    `(7): sustained-high-performance - Sustained high performance without throttling`<br />`Flags: readable/writable` |
| `model`                | Path to the ONNX model file. This property is required and must reference a valid `.onnx` model file. The model is loaded when the element transitions from `NULL` to `READY` state.<br /><br />`Type: String`<br />`Default: NULL`<br />`Flags: readable/writable (construct)`                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| `optimization-level`   | Controls the ONNX Runtime graph optimization level applied when loading the model. Higher levels may reduce inference latency but increase model load time.<br /><br />`Type: Enum `<br />`Default: 2, "enable-extended"`<br />`Range:`<br />    `(0): disable-all - Disable all graph optimizations`<br />    `(1): enable-basic - Basic optimizations such as constant folding`<br />    `(2): enable-extended - Extended optimizations including operator fusion`<br />    `(3): enable-all - All optimizations including layout and memory optimizations`<br />`Flags: readable/writable`                                                                                                                                                     |
| `threads`              | Number of intra-operation threads assigned to the ONNX Runtime session. Primarily affects CPU execution. May have limited impact when using QNN execution provider.<br /><br />`Type: Unsigned Integer`<br />`Default: 1`<br />`Range: 1 - 16`<br />`Flags: readable/writable (construct)`                                                                                                                                                                                                                                                                                                                                                                                                                                                        |

# Input and Output Behavior

### Input Tensors

**qtimlonnx** exposes a single sink pad, but it supports both single-input and multi-input models. For multi-input models, all required tensors are delivered through the same sink pad as a tensor set.

Input tensors must be fully prepared before they reach **qtimlonnx**. Expected tensor layout, shape, data type, and batch size are determined by:

* the ONNX model input signature (read at engine initialization)
* caps negotiation with upstream elements

Typical upstream elements include:

* [`qtimlvconverter`](qtimlvconverter) for scaling, color conversion, normalization, and quantization
* [`qtibatch`](qtibatch) for batch construction

**qtimlonnx** does not modify, reshape, batch, or reinterpret incoming tensors. It wraps them directly as ONNX Runtime input tensors (zero-copy) and passes them to the runtime as received.

### Output Tensors

**qtimlonnx** exposes a single source pad and produces output tensors according to the model output signature. The single source pad does not limit the element to a single tensor. Models with multiple output tensors are fully supported, and all outputs are emitted together on the same pad.

The element supports:

* single-output and multi-output models
* arbitrary tensor ranks, including batch and depth dimensions
* quantized and floating-point outputs

Output tensors are typically consumed by downstream post-processing elements, which decode model-specific results such as classification scores, detection boxes, segmentation masks, landmarks, or other structured outputs.

# Quantization and Dequantization

**qtimlonnx** can optionally dequantize quantized output tensors (such as `UINT8` or `INT8`) into `FLOAT32`. This conversion uses quantization parameters (scale and zero-point) extracted directly from the ONNX model graph at engine initialization time.

### Quantization Parameter Extraction

At initialization, **qtimlonnx** parses the ONNX model graph and locates `QuantizeLinear` nodes whose outputs match the model's declared output tensor names. For each such node, the plugin reads:

* `scale`: read from the second input of the `QuantizeLinear` node (a graph initializer)
* `zero_point`: read from the third input of the `QuantizeLinear` node (a graph initializer)

These values are stored per output tensor and used during inference to perform dequantization.
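
The lookup described above can be sketched on a toy graph. This is a minimal illustration, not the plugin's implementation: the graph is modelled with plain dictionaries rather than the ONNX protobuf types, and the names `extract_output_qparams`, `boxes_scale`, and `boxes_zp` are hypothetical. Falling back to a zero-point of 0 when the third input is absent follows the ONNX operator default and is an assumption about the element's behavior.

```python
from typing import Dict, List, Tuple


def extract_output_qparams(nodes: List[dict],
                           initializers: Dict[str, float],
                           output_names: List[str]) -> Dict[str, Tuple[float, int]]:
    """Map each quantized model output to its (scale, zero_point) pair."""
    wanted = set(output_names)
    qparams = {}
    for node in nodes:
        if node["op_type"] != "QuantizeLinear":
            continue
        for out in node["output"]:
            if out not in wanted:
                continue
            inputs = node["input"]
            # Second input: scale initializer.
            scale = initializers[inputs[1]]
            # Third input: zero-point initializer (assumed 0 when absent).
            zero_point = initializers[inputs[2]] if len(inputs) > 2 else 0
            qparams[out] = (scale, zero_point)
    return qparams


# Hypothetical two-node graph: Conv followed by QuantizeLinear.
nodes = [
    {"op_type": "Conv", "input": ["images", "w"], "output": ["conv_out"]},
    {"op_type": "QuantizeLinear",
     "input": ["conv_out", "boxes_scale", "boxes_zp"],
     "output": ["boxes"]},
]
initializers = {"w": 1.0, "boxes_scale": 0.0625, "boxes_zp": 12}

print(extract_output_qparams(nodes, initializers, ["boxes"]))
# prints {'boxes': (0.0625, 12)}
```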

### Conditional Output Dequantization

Dequantization is performed only when the downstream path requires `FLOAT32` tensors. In practice, this is enabled when downstream caps negotiation indicates that floating-point output is needed.

When dequantization is applied, **qtimlonnx**:

* reads the per-tensor `scale` and `zero_point` extracted from the model graph
* applies the standard dequantization formula:

```text theme={null}
output_float = (quantized_value - zero_point) × scale
```

* produces `FLOAT32` tensors for downstream processing
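
As a worked example of the formula, the sketch below dequantizes a few `UINT8` values in plain Python. The function name `dequantize` and the sample `scale` and `zero_point` values are illustrative assumptions, not values taken from any particular model.

```python
from typing import List, Sequence


def dequantize(values: Sequence[int], scale: float, zero_point: int) -> List[float]:
    """Apply output_float = (quantized_value - zero_point) * scale element-wise."""
    return [(v - zero_point) * scale for v in values]


# Hypothetical UINT8 output tensor with scale=0.5 and zero_point=128.
print(dequantize([128, 130, 0, 255], scale=0.5, zero_point=128))
# prints [0.0, 1.0, -64.0, 63.5]
```

A quantized value equal to the zero-point always maps to `0.0`, and each step of one in the quantized domain corresponds to one `scale` in the floating-point domain.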

### When Dequantization Is Skipped

Dequantization is not performed when:

* downstream elements accept only quantized tensor types
* no downstream element negotiates `FLOAT32`
* the model output tensor is not produced by a `QuantizeLinear` node with valid quantization metadata

In these cases, the output tensor is forwarded in its original quantized representation.

This behavior allows the same downstream processing path to support both quantized and floating-point models where applicable, while avoiding unnecessary conversion.
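
The conditions above reduce to a small predicate. The sketch below is one possible reading, assuming that caps negotiation settles on a single output format; `needs_dequantization` and the `QUANTIZED_FORMATS` set are hypothetical names used only for illustration.

```python
# Assumption: only 8-bit integer outputs are treated as quantized here.
QUANTIZED_FORMATS = {"INT8", "UINT8"}


def needs_dequantization(model_output_format: str,
                         negotiated_format: str,
                         has_qparams: bool) -> bool:
    """Return True only when a quantized output must be converted to FLOAT32."""
    return (model_output_format in QUANTIZED_FORMATS
            and negotiated_format == "FLOAT32"
            and has_qparams)


print(needs_dequantization("UINT8", "FLOAT32", True))    # prints True
print(needs_dequantization("UINT8", "UINT8", True))      # prints False
print(needs_dequantization("UINT8", "FLOAT32", False))   # prints False
```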

# Supported Data Types

**qtimlonnx** supports the tensor data types provided by the ONNX Runtime and the selected execution provider, subject to caps negotiation with adjacent elements.

Supported data types include:

* `INT8`
* `UINT8`
* `INT16`
* `UINT16`
* `INT32`
* `UINT32`
* `INT64`
* `UINT64`
* `FLOAT16`
* `FLOAT32`

The element does not impose additional data-type restrictions beyond those required by the ONNX Runtime, the selected execution provider, and the negotiated pipeline caps.

# NCHW / NHWC Layout Detection

For 4-D output tensors, **qtimlonnx** automatically detects the memory layout (NCHW or NHWC) at engine initialization by traversing the ONNX model graph backwards from each output tensor and inspecting `Transpose` node permutations:

* A `Transpose` with `perm=[0,2,3,1]` indicates the output is **NHWC** (the node converted NCHW→NHWC).
* A `Transpose` with `perm=[0,3,1,2]` indicates the output is **NCHW** (the node converted NHWC→NCHW).
* If no `Transpose` node is found, the output defaults to **NCHW** (the ONNX standard).

When any output tensor is detected as NCHW, the plugin adds `layout=nchw` to the source pad caps. Downstream elements (such as [`qtimlpostprocess`](qtimlpostprocess)) use this field to perform the necessary in-place NCHW→NHWC transpose before processing.
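
The detection rules can be sketched as a backward walk over a toy graph. This is an illustration under stated assumptions: the graph is modelled with plain dictionaries, the walk follows only the first input of each node, and the name `detect_layout` is hypothetical. The actual traversal strategy of the element is not documented here.

```python
from typing import List


def detect_layout(nodes: List[dict], output_name: str) -> str:
    """Walk backwards from an output tensor and classify it by Transpose perm."""
    producers = {out: node for node in nodes for out in node["output"]}
    visited = set()
    name = output_name
    while name in producers and name not in visited:
        visited.add(name)
        node = producers[name]
        if node["op_type"] == "Transpose":
            if node.get("perm") == [0, 2, 3, 1]:
                return "NHWC"  # node converted NCHW -> NHWC
            if node.get("perm") == [0, 3, 1, 2]:
                return "NCHW"  # node converted NHWC -> NCHW
        if not node["input"]:
            break
        name = node["input"][0]  # assumption: follow the first input only
    return "NCHW"  # no Transpose found: ONNX default


# Hypothetical graph with one transposed output and one plain output.
nodes = [
    {"op_type": "Conv", "input": ["images", "w"], "output": ["feat"]},
    {"op_type": "Transpose", "input": ["feat"], "output": ["mask"],
     "perm": [0, 2, 3, 1]},
    {"op_type": "Conv", "input": ["images", "w2"], "output": ["logits"]},
]

print(detect_layout(nodes, "mask"))    # prints NHWC
print(detect_layout(nodes, "logits"))  # prints NCHW
```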

# Batch and Depth Model Support

**qtimlonnx** supports models with batch and multi-dimensional tensor inputs and outputs, including tensors with explicit batch and depth dimensions.

Examples include:

* batched tensors: `N × H × W × C`
* multi-dimensional tensors: `N × D × H × W × C`

The element treats these dimensions transparently and passes tensors to the ONNX Runtime according to the negotiated shape. It does not construct batches, reshape tensors, or reinterpret tensor dimensions internally.

Batch construction must be handled by upstream elements such as [`qtibatch`](qtibatch).

This behavior keeps inference predictable across single-frame, batched, and higher-dimensional workflows.
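
To make the shapes above concrete, the sketch below computes the payload size of a batched and a five-dimensional tensor. It assumes densely packed tensors with no padding or stride alignment, which may not hold for every buffer the pipeline allocates; `tensor_nbytes` is a hypothetical helper.

```python
from math import prod
from typing import Sequence

# Element sizes in bytes for the formats listed in the pad templates.
BYTES_PER_ELEMENT = {
    "INT8": 1, "UINT8": 1,
    "INT16": 2, "UINT16": 2, "FLOAT16": 2,
    "INT32": 4, "UINT32": 4, "FLOAT32": 4,
    "INT64": 8, "UINT64": 8,
}


def tensor_nbytes(shape: Sequence[int], fmt: str) -> int:
    """Return the size in bytes of a densely packed tensor."""
    return prod(shape) * BYTES_PER_ELEMENT[fmt]


# Batch of four 224x224 RGB frames (N x H x W x C), UINT8.
print(tensor_nbytes((4, 224, 224, 3), "UINT8"))          # prints 602112
# Single clip of 16 frames (N x D x H x W x C), FLOAT32.
print(tensor_nbytes((1, 16, 112, 112, 3), "FLOAT32"))    # prints 2408448
```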

# Execution Providers

An ONNX Runtime **Execution Provider (EP)** defines the hardware backend used to run a model. Execution providers allow **qtimlonnx** to offload inference from the default CPU interpreter to an optimized backend such as Qualcomm's HTP/DSP.

**qtimlonnx** supports two execution providers. The provider is selected through the `execution-provider` property and controls how the ONNX Runtime dispatches model operations during inference.

### CPU

Runs the model on the default ONNX Runtime CPU interpreter.

* **Backend**: CPU
* **Use case**: reference execution, debugging, or systems without hardware acceleration
* **Additional configuration**: none required

### QNN (Qualcomm Neural Network)

Offloads inference to the AI Accelerator / NPU via the ONNX Runtime's QNN Execution Provider.

* **Backend**: Qualcomm AI Accelerator / NPU
* **Use case**: hardware-accelerated inference on Qualcomm SoCs for quantized and floating-point models
* **Additional configuration required**:
  * `backend-path` — absolute path to the QNN backend shared library (e.g. `libQnnHtp.so`)
  * `htp-performance-mode` — optional power/performance trade-off setting
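
The property constraints for the two providers can be sketched as a small helper that assembles a `gst-launch-1.0` fragment. This is an illustrative sketch: `build_qtimlonnx_fragment` is a hypothetical function, and the checks only mirror the constraints stated in the property table (required `model`, `threads` in the range 1 - 16, absolute `backend-path` for `qnn`).

```python
import posixpath
from typing import Optional

VALID_PROVIDERS = ("cpu", "qnn")


def build_qtimlonnx_fragment(model: str,
                             execution_provider: str = "cpu",
                             backend_path: Optional[str] = None,
                             threads: int = 1) -> str:
    """Return a gst-launch-1.0 fragment for qtimlonnx after basic checks."""
    if not model:
        raise ValueError("model is required")
    if execution_provider not in VALID_PROVIDERS:
        raise ValueError(f"unknown execution-provider: {execution_provider}")
    if not 1 <= threads <= 16:
        raise ValueError("threads must be in the range 1 - 16")
    parts = ["qtimlonnx", f"model={model}",
             f"execution-provider={execution_provider}", f"threads={threads}"]
    if execution_provider == "qnn":
        # backend-path is documented as an absolute path on the device.
        if not backend_path or not posixpath.isabs(backend_path):
            raise ValueError("qnn requires an absolute backend-path")
        parts.append(f"backend-path={backend_path}")
    return " ".join(parts)


print(build_qtimlonnx_fragment("/root/models/yolo_x_w8a8.onnx", "qnn",
                               "/usr/lib/libQnnHtp.so"))
# prints qtimlonnx model=/root/models/yolo_x_w8a8.onnx execution-provider=qnn threads=1 backend-path=/usr/lib/libQnnHtp.so
```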

When `execution-provider=qnn`, the plugin performs the following initialization sequence:

1. Registers `libonnxruntime_providers_qnn_abi.so` with the ONNX Runtime environment using `RegisterExecutionProviderLibrary`.
2. Enumerates available QNN EP devices using `GetEpDevices`.
3. Attaches the QNN EP to the session options using `SessionOptionsAppendExecutionProvider_V2`, passing `backend_path` and `htp_performance_mode` as provider options.
4. Creates the ONNX Runtime session with the configured QNN EP.

# Runtime Memory Behavior and GAP Handling

**qtimlonnx** operates within the memory model of the ONNX Runtime. Input buffers from the pipeline are mapped read-only and wrapped directly as ONNX Runtime input tensors using `CreateTensorWithDataAsOrtValue`, avoiding a copy of the input data. Output tensors are written into DMA-backed output buffers allocated from the element's `GstMLBufferPool`.

### ONNX Runtime Memory Model

The ONNX Runtime manages its own internal memory for:

* intermediate activation tensors
* output tensors (allocated by the runtime and then copied into the pipeline output buffer)

### GAP Buffer Handling

**qtimlonnx** is GAP-aware and correctly handles input buffers marked with `GST_BUFFER_FLAG_GAP`.

When a GAP buffer is received, the element skips inference and forwards the buffer downstream. This preserves timing and synchronization while explicitly indicating that no valid inference input is available for that timestamp.

GAP buffers commonly appear in conditional AI pipelines, such as cascaded workflows where later inference stages run only when earlier stages produce valid regions of interest.
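
The pass-through behavior can be modelled without GStreamer. In the sketch below, `BufferFlag.GAP` is a stand-in for `GST_BUFFER_FLAG_GAP`, and `Buffer`, `process`, and the `infer` callback are hypothetical names; the real element operates on `GstBuffer` objects inside its transform function.

```python
from dataclasses import dataclass
from enum import Flag, auto
from typing import Callable, List


class BufferFlag(Flag):
    NONE = 0
    GAP = auto()  # stand-in for GST_BUFFER_FLAG_GAP


@dataclass
class Buffer:
    pts: int
    flags: BufferFlag = BufferFlag.NONE
    payload: str = ""


def process(buffers: List[Buffer], infer: Callable[[str], str]) -> List[Buffer]:
    """Run inference on normal buffers and forward GAP buffers untouched."""
    out = []
    for buf in buffers:
        if BufferFlag.GAP in buf.flags:
            out.append(buf)  # skip inference, keep the timestamp
        else:
            out.append(Buffer(buf.pts, buf.flags, infer(buf.payload)))
    return out


frames = [Buffer(0, payload="frame0"),
          Buffer(33, BufferFlag.GAP),
          Buffer(66, payload="frame2")]
results = process(frames, infer=str.upper)
print([(b.pts, b.payload) for b in results])
# prints [(0, 'FRAME0'), (33, ''), (66, 'FRAME2')]
```

Every input timestamp still appears in the output, so downstream muxing and synchronization see a continuous stream even when a stage has nothing to infer.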

# Usage

### Single-Stage AI Inference on Live Camera Stream

This example demonstrates real-time ONNX inference on a live camera stream using a single **qtimlonnx** instance with the QNN execution provider. Inference results are attached to each `GstBuffer` as `MLMeta`, allowing downstream elements to access synchronized metadata directly from the frame. An overlay stage renders annotations such as bounding boxes, labels, or keypoints before display.

<Steps>
  <Step title="Download Required Files">
    | File             | Download                                                               | Save as            |
    | ---------------- | ---------------------------------------------------------------------- | ------------------ |
    | YOLOX W8A8 model | [Qualcomm AI Hub — YOLOX](https://aihub.qualcomm.com/iot/models/yolox) | `yolo_x_w8a8.onnx` |
    | Detection labels | <a href="../labels/yolov8.json" download="yolov8.json">yolov8.json</a> | `yolov8.json`      |

    <Note>
      If any downloaded file is a `.zip` archive, extract it on your host machine before copying:
      `unzip filename.zip`
    </Note>
  </Step>

  <Step title="Copy files to device">
    <CodeGroup>
      ```bash SCP (SSH) theme={null}
      # Replace $HOME with the appropriate device path before running the commands.
      # For QLI:    /root
      # For Ubuntu: /home/ubuntu
      # Modify this based on your platform and ensure files are copied to the correct location on the device.
      # Run from your host machine — replace <user> and <device-ip>

      ssh <user>@<device-ip> "mkdir -p $HOME/{models,labels}"
      scp yolo_x_w8a8.onnx            <user>@<device-ip>:$HOME/models/
      scp yolov8.json   <user>@<device-ip>:$HOME/labels/
      ```
    </CodeGroup>
  </Step>

  <Step title="Connect to device">
    <CodeGroup>
      ```bash SSH theme={null}
      # Run from your host machine — replace <user> and <device-ip>
      ssh <user>@<device-ip>
      ```
    </CodeGroup>
  </Step>

  <Step title="Set environment variables">
    Run the following commands on your device:

    ```bash theme={null}
    export MODEL_NAME=yolo_x_w8a8.onnx
    export LABELS_NAME=yolov8.json
    ```
  </Step>

  <Step title="Run the pipeline">
    ```bash Run the pipeline theme={null}
    gst-launch-1.0 -e --gst-debug=2 \
    qticamsrc ! video/x-raw,width=1920,height=1080,format=NV12,framerate=30/1 ! queue ! tee name=t \
    t. ! queue ! qtimetamux name=metamux ! queue ! qtivoverlay ! waylandsink fullscreen=true sync=false \
    t. ! queue ! qtimlvconverter ! queue ! qtimlonnx model=$HOME/models/$MODEL_NAME execution-provider=qnn backend-path="/usr/lib/libQnnHtp.so" ! queue ! \
    qtimlpostprocess results=10 module=yolov8 labels=$HOME/labels/$LABELS_NAME ! video/x-raw,format=BGRA,width=960,height=540 ! queue ! metamux.
    ```
  </Step>
</Steps>

### Two-Stage Daisy Chain AI Inference on File Stream

This example demonstrates a two-stage ONNX inference workflow using two **qtimlonnx** instances. The first model operates on full video frames after preprocessing by a [`qtimlvconverter`](qtimlvconverter) configured for full-frame input. Inference results, such as detected objects, are attached to the corresponding video buffer and propagated downstream. The second model runs once for each object detected by the first stage. A second [`qtimlvconverter`](qtimlvconverter), configured for ROI-based processing, crops each detected region from the input frame and prepares it as input for the second **qtimlonnx** instance.

<Steps>
  <Step title="Download Required Files">
    | File                               | Download                                                                                                                                               | Save as                  |
    | ---------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------ |
    | Detection model                    | [Qualcomm AI Hub — YOLOX](https://aihub.qualcomm.com/iot/models/yolox)                                                                                 | `yolo_x_w8a8.onnx`       |
    | Detection labels                   | <a href="../labels/yolov8.json" download="yolov8.json">yolov8.json</a>                                                                                 | `yolov8.json`            |
    | Classification model (InceptionV3) | [Qualcomm AI Hub — InceptionV3](https://aihub.qualcomm.com/iot/models/inception_v3)                                                                    | `mobilenet_v2_w8a8.onnx` |
    | Classification labels              | <a href="../labels/mobilenet.json" download="mobilenet.json">mobilenet.json</a>                                                                        | `mobilenet.json`         |
    | Sample video                       | <a href="https://github.com/qualcomm/sample-apps-for-qualcomm-linux/raw/refs/heads/main/qualcomm-linux/artifacts/videos/demo_samples/">Input video</a> | `ai_demo_sample.mp4`     |

    <Note>
      If any downloaded file is a `.zip` archive, extract it on your host machine before copying:
      `unzip filename.zip`
    </Note>
  </Step>

  <Step title="Copy files to device">
    <CodeGroup>
      ```bash SCP (SSH) theme={null}
      # Replace $HOME with the appropriate device path before running the commands.
      # For QLI:    /root
      # For Ubuntu: /home/ubuntu
      # Modify this based on your platform and ensure files are copied to the correct location on the device.
      # Run from your host machine — replace <user> and <device-ip>

      ssh <user>@<device-ip> "mkdir -p $HOME/{models,labels,media}"
      scp yolo_x_w8a8.onnx        <user>@<device-ip>:$HOME/models/
      scp yolov8.json             <user>@<device-ip>:$HOME/labels/
      scp mobilenet_v2_w8a8.onnx  <user>@<device-ip>:$HOME/models/
      scp mobilenet.json          <user>@<device-ip>:$HOME/labels/
      scp ai_demo_sample.mp4      <user>@<device-ip>:$HOME/media/
      ```
    </CodeGroup>
  </Step>

  <Step title="Connect to device">
    <CodeGroup>
      ```bash SSH theme={null}
      # Run from your host machine — replace <user> and <device-ip>
      ssh <user>@<device-ip>
      ```
    </CodeGroup>
  </Step>

  <Step title="Set environment variables">
    Run the following commands on your device:

    ```bash theme={null}
    export MODEL_NAME_1=yolo_x_w8a8.onnx
    export LABELS_NAME_1=yolov8.json
    export MODEL_NAME_2=mobilenet_v2_w8a8.onnx
    export LABELS_NAME_2=mobilenet.json
    export SRC_VIDEO_NAME=ai_demo_sample.mp4
    ```
  </Step>

  <Step title="Run the pipeline">
    ```bash Run the pipeline theme={null}
    gst-launch-1.0 -e --gst-debug=2 \
    qtimlvconverter name=stage_01_preproc \
    qtimlonnx model=$HOME/models/$MODEL_NAME_1 execution-provider=qnn backend-path="/usr/lib/libQnnHtp.so" htp-performance-mode=1 name=stage_01_inference \
    qtimlpostprocess name=stage_01_postproc results=10 module=yolov8 labels=$HOME/labels/$LABELS_NAME_1 \
    qtimlvconverter name=stage_02_preproc \
    qtimlonnx model=$HOME/models/$MODEL_NAME_2 execution-provider=qnn backend-path="/usr/lib/libQnnHtp.so" htp-performance-mode=1 name=stage_02_inference \
    qtimlpostprocess name=stage_02_postproc module=mobilenet labels=$HOME/labels/$LABELS_NAME_2 \
    filesrc location=$HOME/media/$SRC_VIDEO_NAME ! qtdemux ! h264parse ! v4l2h264dec capture-io-mode=4 output-io-mode=4 ! video/x-raw,format=NV12 ! queue ! tee name=t_split_1 \
    t_split_1. ! queue ! metamux_1. \
    t_split_1. ! queue ! stage_01_preproc. stage_01_preproc. ! queue ! stage_01_inference. stage_01_inference. ! queue ! stage_01_postproc. stage_01_postproc. ! text/x-raw ! queue ! metamux_1. \
    qtimetamux name=metamux_1 ! queue ! qtiobjtracker algo=bytetrack ! queue ! tee name=t_split_2 \
    t_split_2. ! queue ! metamux_2. \
    t_split_2. ! queue ! stage_02_preproc. stage_02_preproc. ! queue ! stage_02_inference. stage_02_inference. ! queue ! stage_02_postproc. stage_02_postproc. ! text/x-raw ! queue ! metamux_2. \
    qtimetamux name=metamux_2 ! queue ! qtivoverlay ! queue ! waylandsink fullscreen=true
    ```
  </Step>
</Steps>
