# Inferencing

> Pipeline construction for running AI inference

The inference plugin is responsible for executing the AI model on the prepared input tensor. The QIM SDK supports multiple inference runtimes, each encapsulated within a dedicated GStreamer plugin. This architecture allows for straightforward replacement and integration of inference engines, depending on the target platform or model format.

Supported runtimes include:

* [**qtimlsnpe**](../plugin-reference/qtimlsnpe) — SNPE (Qualcomm Neural Processing SDK): Executes models in DLC format on Qualcomm Snapdragon platforms.
* [**qtimlqnn**](../plugin-reference/qtimlqnn) — QNN (Qualcomm AI Engine Direct): Supports models optimized for QNN.
* [**qtimltflite**](../plugin-reference/qtimltflite) — TFLite / Lite-RT: Enables execution of TensorFlow Lite models.
* [**qtimlonnx**](../plugin-reference/qtimlonnx) — ONNX Runtime: Enables execution of ONNX models.

All plugins leverage hardware acceleration provided by Qualcomm NPUs and GPUs for optimal performance.
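
Because each runtime lives in its own plugin, an application can pick the inference element from the model file it is given. The following is a minimal sketch of that idea, not part of the QIM SDK: the `select_inference_plugin` helper and the extension table are hypothetical, and only the `.dlc` entry comes from the list above. `qtimlqnn` is left out because this page does not name a file format for QNN models.

```python
from pathlib import Path

# Hypothetical mapping from model file extension to inference plugin.
# ".dlc" follows the DLC format named in the list above; ".tflite" and
# ".onnx" are the conventional extensions and are assumptions here.
PLUGIN_BY_EXTENSION = {
    ".dlc": "qtimlsnpe",
    ".tflite": "qtimltflite",
    ".onnx": "qtimlonnx",
}


def select_inference_plugin(model_path: str) -> str:
    """Return the plugin name matching the model file's extension."""
    suffix = Path(model_path).suffix.lower()
    try:
        return PLUGIN_BY_EXTENSION[suffix]
    except KeyError:
        raise ValueError(f"No inference plugin known for {suffix!r}") from None


if __name__ == "__main__":
    print(select_inference_plugin("resnext101-w8a8.tflite"))  # qtimltflite
```

Keeping the mapping in one table means that swapping the runtime only changes the element name in the pipeline description, while the surrounding elements stay the same.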

### Run example on device

The example below uses the [ResNeXt101](https://aihub.qualcomm.com/iot/models/resnext101) model with the [`qtimltflite`](../plugin-reference/qtimltflite) plugin to classify objects in a video stream.

<img src="https://mintcdn.com/qimsdk/xdnKhBBjxpS5mUYP/qimsdk-overview/images/inference.png?fit=max&auto=format&n=xdnKhBBjxpS5mUYP&q=85&s=85a8a15900b0b978b9b12463d14436fc" alt="Inference pipeline diagram" width="1851" height="417" data-path="qimsdk-overview/images/inference.png" />

<Steps>
  <Step title="Download Required Files">
    | File                  | Download                                                                                                                                               | Save as                  |
    | --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------ |
    | ResNeXt101 W8A8 model | [Qualcomm AI Hub — ResNeXt101](https://aihub.qualcomm.com/iot/models/resnext101)                                                                       | `resnext101-w8a8.tflite` |
    | Sample video          | <a href="https://github.com/qualcomm/sample-apps-for-qualcomm-linux/raw/refs/heads/main/qualcomm-linux/artifacts/videos/demo_samples/">Input video</a> | `ai_demo_sample.mp4`     |

    <Note>
      If any downloaded file is a `.zip` archive, extract it on your host machine before copying:
      `unzip filename.zip`
    </Note>
  </Step>

  <Step title="Copy files to device">
    Create the required directories and transfer the downloaded files to your device.

    <CodeGroup>
      ```bash SCP (SSH) theme={null}
      # Replace $HOME with the appropriate device path before running the commands:
      #   QLI:    /root
      #   Ubuntu: /home/ubuntu
      # As written, $HOME is expanded by your host shell, not on the device,
      # so substitute the device path explicitly.
      # Run from your host machine and replace <user> and <device-ip>.
      ssh <user>@<device-ip> "mkdir -p $HOME/{models,labels,media,media/output}"
      scp resnext101-w8a8.tflite <user>@<device-ip>:$HOME/models/
      scp ai_demo_sample.mp4 <user>@<device-ip>:$HOME/media/
      ```
    </CodeGroup>
  </Step>

  <Step title="Connect to device">
    <CodeGroup>
      ```bash SSH theme={null}
      ssh <user>@<device-ip>
      ```
    </CodeGroup>
  </Step>

  <Step title="Set environment variables">
    ```bash theme={null}
    export MODEL_NAME=resnext101-w8a8.tflite
    export SRC_VIDEO_NAME=ai_demo_sample.mp4
    export VIDEO_SOURCE="filesrc location=$HOME/media/$SRC_VIDEO_NAME ! qtdemux ! h264parse ! v4l2h264dec capture-io-mode=4 output-io-mode=4 ! video/x-raw,format=NV12"
    ```
  </Step>

  <Step title="Run example on device">
    <Tabs>
      <Tab title="GStreamer Command line">
        ```bash theme={null}
        gst-launch-1.0 -e \
          filesrc location=$HOME/media/$SRC_VIDEO_NAME ! qtdemux ! h264parse ! \
          v4l2h264dec capture-io-mode=4 output-io-mode=4 ! video/x-raw,format=NV12 ! queue ! \
          qtimlvconverter name=preprocess ! queue ! \
          qtimltflite name=inference \
            delegate=external \
            external-delegate-path=libQnnTFLiteDelegate.so \
            external-delegate-options="QNNExternalDelegate,backend_type=htp;" \
            model=$HOME/models/$MODEL_NAME ! queue ! \
          multifilesink location=$HOME/media/tensor.bin sync=false
        ```
      </Tab>

      <Tab title="GStreamer Python application">
        * **Python source code:** [gst-ai-video-inference.py](https://github.com/qualcomm/gst-plugins-imsdk/tree/main/gst-python-examples/gst-ai-video-inference.py)

        * **Run:**

          ```bash theme={null}
          python3 gst-ai-video-inference.py -s "$VIDEO_SOURCE"
          ```
      </Tab>

      <Tab title="GStreamer C/C++ application">
        * **Application source code:** [gst-ai-video-inference](https://github.com/qualcomm/gst-plugins-imsdk/tree/main/gst-sample-apps/gst-ai-video-inference)

        * **Build your application:**

                  <Tabs>
                    <Tab title="Yocto">
                      <a href="../advanced/yocto-build#steps-to-build-custom-application">
                        Steps to build custom application
                      </a>
                    </Tab>

                    <Tab title="Ubuntu">
                      <a href="../advanced/ubuntu-build#steps-to-build-custom-application">
                        Steps to build custom application
                      </a>
                    </Tab>
                  </Tabs>

        * **Run:**

          ```bash theme={null}
          gst-ai-video-inference -s "$VIDEO_SOURCE"
          ```
      </Tab>
    </Tabs>
  </Step>
</Steps>
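
The command-line pipeline from the steps above can also be assembled as a string in application code. The sketch below is illustrative only and is not taken from the linked sample applications: `build_inference_pipeline` is a hypothetical helper that joins the same elements and properties shown in the `gst-launch-1.0` command.

```python
import os


def build_inference_pipeline(video_path: str, model_path: str, output_path: str) -> str:
    """Assemble the pipeline description used in the command-line example."""
    elements = [
        f"filesrc location={video_path}",
        "qtdemux",
        "h264parse",
        "v4l2h264dec capture-io-mode=4 output-io-mode=4",
        "video/x-raw,format=NV12",
        "queue",
        # Preprocessing: converts video frames into the model's input tensor.
        "qtimlvconverter name=preprocess",
        "queue",
        # Inference: TFLite runtime with the QNN external delegate.
        "qtimltflite name=inference delegate=external "
        "external-delegate-path=libQnnTFLiteDelegate.so "
        'external-delegate-options="QNNExternalDelegate,backend_type=htp;" '
        f"model={model_path}",
        "queue",
        f"multifilesink location={output_path} sync=false",
    ]
    return " ! ".join(elements)


if __name__ == "__main__":
    home = os.path.expanduser("~")
    print(
        build_inference_pipeline(
            video_path=f"{home}/media/ai_demo_sample.mp4",
            model_path=f"{home}/models/resnext101-w8a8.tflite",
            output_path=f"{home}/media/tensor.bin",
        )
    )
```

On a device with GStreamer's Python bindings installed, a string like this could be passed to `Gst.parse_launch` after `Gst.init`. That step is omitted here so the sketch runs without the SDK.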

### Expected output

The pipeline runs the classification model on each frame of the video stream. In the command-line example, `multifilesink` writes the raw output tensors to `$HOME/media/tensor.bin`. The [`qtimltflite`](../plugin-reference/qtimltflite) plugin automatically reads tensor specifications from the model and propagates them to adjacent plugins, so no manual tensor configuration is required.
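
To sanity-check the output of the command-line example, the file written by `multifilesink` can be inspected offline. The sketch below rests on several unverified assumptions: that `tensor.bin` holds a single output tensor, that the values are little-endian 32-bit floats, and that each index corresponds to one class score. The helper names are hypothetical, so verify the layout for your model before relying on it.

```python
import os
import struct


def decode_float32_tensor(data: bytes) -> list:
    """Interpret raw bytes as little-endian float32 values (assumed layout)."""
    count = len(data) // 4
    return list(struct.unpack(f"<{count}f", data[: count * 4]))


def top_k(scores, k=5):
    """Return (index, score) pairs for the k highest scores."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in ranked[:k]]


if __name__ == "__main__":
    path = os.path.expanduser("~/media/tensor.bin")
    if os.path.exists(path):
        with open(path, "rb") as f:
            scores = decode_float32_tensor(f.read())
        for index, score in top_k(scores):
            print(f"class index {index}: {score:.4f}")
    else:
        print(f"{path} not found; run the pipeline first")
```

The indices printed here are positions in the model's output tensor. Mapping them to human-readable labels requires the label file that matches the model.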

By default, all QIM SDK inference plugins dequantize their output tensors, so downstream elements receive floating-point values.
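
For context, dequantization in affine-quantized models (such as W8A8 TFLite models) maps each integer value back to a real number using a per-tensor scale and zero point. The sketch below illustrates that arithmetic only and is not the plugins' implementation. The `scale` and `zero_point` values are hypothetical; real values are stored in the model.

```python
def dequantize(quantized, scale, zero_point):
    """Map integer tensor values to floats: real = scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in quantized]


if __name__ == "__main__":
    # Hypothetical quantization parameters chosen for easy arithmetic.
    raw = [0, 128, 255]
    print(dequantize(raw, scale=0.5, zero_point=128))  # [-64.0, 0.0, 63.5]
```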

Now that we can take video input from a data source and run hardware-accelerated preprocessing and inference on each frame, let's turn to postprocessing the results into meaningful data.
