# Object Detection

> Building object detection pipelines with QIM SDK

In this example, the pipeline analyzes each frame of a video stream to identify and localize multiple objects, such as people and vehicles. For each detected object, it reports a bounding box and a confidence score. This example uses the [YOLOX](https://aihub.qualcomm.com/iot/models/yolox) model from Qualcomm AI Hub.

<img src="https://mintcdn.com/qimsdk/xdnKhBBjxpS5mUYP/qimsdk-overview/images/obj-detect.png?fit=max&auto=format&n=xdnKhBBjxpS5mUYP&q=85&s=193b9b5e8749977fa5fac5206fa6b79e" alt="gst-ai-video-detection" width="2500" height="745" data-path="qimsdk-overview/images/obj-detect.png" />

The detection pipeline is structurally identical to the classification pipeline, with two key differences:

* The inference plugin is configured with a detection model (YOLOX) instead of a classification model
* The [`qtimlpostprocess`](../plugin-reference/qtimlpostprocess) plugin uses the `yolov8` module with a higher `results` count (8 in this example) to capture multiple detections per frame

## Run example on device

<Steps>
  <Step title="Download required files">
    | File             | Download                                                                                                                                               | Save as                  |
    | ---------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------ |
    | YOLOX W8A8 model | [Qualcomm AI Hub — YOLOX](https://aihub.qualcomm.com/iot/models/yolox)                                                                                 | `yolox_quantized.tflite` |
    | Detection labels | <a href="../labels/coco.txt" download="coco.txt">coco.txt</a>                                                                                          | `coco.txt`               |
    | Sample video     | <a href="https://github.com/qualcomm/sample-apps-for-qualcomm-linux/raw/refs/heads/main/qualcomm-linux/artifacts/videos/demo_samples/">Input video</a> | `ai_demo_sample.mp4`     |

    <Note>
      If any downloaded file is a `.zip` archive, extract it on your host machine before copying:
      `unzip filename.zip`
    </Note>
  </Step>

  <Step title="Copy files to device">
    Create the required directories and transfer the downloaded files to your device.

    <CodeGroup>
      ```bash SCP (SSH) theme={null}
      # Run from your host machine. Replace <user> and <device-ip>.
      # Replace $HOME with the home directory on the device before running the commands:
      #   QLI:    /root
      #   Ubuntu: /home/ubuntu
      # Adjust the paths for your platform so the files land in the correct location on the device.
      ssh <user>@<device-ip> "mkdir -p $HOME/{models,labels,media,media/output}"
      scp yolox_quantized.tflite <user>@<device-ip>:$HOME/models/
      scp coco.txt               <user>@<device-ip>:$HOME/labels/
      scp ai_demo_sample.mp4     <user>@<device-ip>:$HOME/media/
      ```
    </CodeGroup>
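
    Inside double quotes, `$HOME` is expanded by the shell on your host machine, not on the device, which is why the comments above ask you to replace it. One way to avoid editing every command is to keep the device path in its own variable. The sketch below assumes a hypothetical helper, `device_home`, and a hypothetical variable, `DEVICE_HOME`; neither is part of the QIM SDK. It only prints the commands so you can review them before running anything.

    ```bash
    # Hypothetical helper: map a platform name to the device home directory
    # listed in the comments above (QLI -> /root, Ubuntu -> /home/ubuntu).
    device_home() {
      case "$1" in
        qli)    echo "/root" ;;
        ubuntu) echo "/home/ubuntu" ;;
        *)      echo "unknown platform: $1" >&2; return 1 ;;
      esac
    }

    # Hypothetical variable holding the device-side path.
    DEVICE_HOME="$(device_home ubuntu)"

    # Print the commands for review instead of running them.
    # <user> and <device-ip> are placeholders, as in the commands above.
    echo "ssh <user>@<device-ip> \"mkdir -p $DEVICE_HOME/models $DEVICE_HOME/labels $DEVICE_HOME/media/output\""
    echo "scp yolox_quantized.tflite <user>@<device-ip>:$DEVICE_HOME/models/"
    echo "scp coco.txt <user>@<device-ip>:$DEVICE_HOME/labels/"
    echo "scp ai_demo_sample.mp4 <user>@<device-ip>:$DEVICE_HOME/media/"
    ```

    Listing each directory explicitly also avoids relying on brace expansion in the device's login shell.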
  </Step>

  <Step title="Connect to device">
    <CodeGroup>
      ```bash SCP (SSH) theme={null}
      ssh <user>@<device-ip>
      ```
    </CodeGroup>
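
    Once connected, it can help to confirm that the GStreamer elements used by this example are installed before building the pipeline. The following is a minimal sketch that relies on the standard `gst-inspect-1.0 --exists` check; the `missing_elements` function and the `GST_INSPECT` override are hypothetical names introduced here for illustration.

    ```bash
    # Elements named in this example's pipeline.
    ELEMENTS="qtimlvconverter qtimltflite qtimlpostprocess qtimetamux qtivoverlay"

    # Hypothetical override so the check can be pointed at a different binary.
    GST_INSPECT="${GST_INSPECT:-gst-inspect-1.0}"

    # Print each element the inspect tool cannot find.
    # Returns non-zero if at least one element is missing.
    missing_elements() {
      status=0
      for element in "$@"; do
        if ! "$GST_INSPECT" --exists "$element" >/dev/null 2>&1; then
          echo "$element"
          status=1
        fi
      done
      return "$status"
    }

    # Word splitting of $ELEMENTS is intentional: one argument per element.
    if missing_elements $ELEMENTS; then
      echo "All pipeline elements are available."
    else
      echo "The elements listed above were not found."
    fi
    ```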
  </Step>

  <Step title="Set environment variables">
    ```bash theme={null}
    export MODEL_NAME=yolox_quantized.tflite
    export LABELS_NAME=coco.txt
    export SRC_VIDEO_NAME=ai_demo_sample.mp4
    export VIDEO_SOURCE="filesrc location=$HOME/media/$SRC_VIDEO_NAME ! qtdemux ! h264parse ! v4l2h264dec capture-io-mode=4 output-io-mode=4 ! video/x-raw,format=NV12"
    ```
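
    Before running the pipeline, you may want to confirm that the three files are where the variables above expect them. The sketch below assumes the directory layout from the copy step (`models`, `labels`, `media` under the home directory); `check_inputs` is a hypothetical helper, not an SDK command.

    ```bash
    # Report any input file that is missing under the given base directory.
    # Falls back to the file names from the download table if the variables are unset.
    check_inputs() {
      base="$1"
      missing=0
      for path in \
        "$base/models/${MODEL_NAME:-yolox_quantized.tflite}" \
        "$base/labels/${LABELS_NAME:-coco.txt}" \
        "$base/media/${SRC_VIDEO_NAME:-ai_demo_sample.mp4}"; do
        if [ ! -f "$path" ]; then
          echo "missing: $path" >&2
          missing=1
        fi
      done
      return "$missing"
    }

    if check_inputs "$HOME"; then
      echo "All input files found."
    else
      echo "Copy the missing files before running the pipeline."
    fi
    ```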
  </Step>

  <Step title="Run example on device">
    <Tabs>
      <Tab title="GStreamer Command line">
        ```bash theme={null}
        gst-launch-1.0 $VIDEO_SOURCE ! \
          tee name=t \
          t. ! qtimlvconverter name=preprocess ! queue ! \
               qtimltflite name=inference delegate=external \
                 external-delegate-path=libQnnTFLiteDelegate.so \
                 external-delegate-options="QNNExternalDelegate,backend_type=htp;" \
                 model=$HOME/models/$MODEL_NAME ! queue ! \
               qtimlpostprocess name=postprocess results=8 module=yolov8 \
                 labels=$HOME/labels/$LABELS_NAME settings='{"confidence": 51.0}' bbox-stabilization=true ! \
               text/x-raw ! metamux. \
          t. ! qtimetamux name=metamux ! qtivoverlay ! waylandsink sync=true fullscreen=true
        ```
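
        The `settings` property in the command above takes a JSON string with a `confidence` key. If you want to experiment with other thresholds, one option is to build that string from a value instead of editing the command by hand. This is a sketch only: `postprocess_settings` and `POSTPROCESS_SETTINGS` are hypothetical names, and it assumes the threshold uses the same scale as the `51.0` shown above.

        ```bash
        # Build the JSON string for the settings property of qtimlpostprocess.
        # Rejects empty input and anything containing characters other than digits and dots.
        postprocess_settings() {
          case "$1" in
            ''|*[!0-9.]*) echo "confidence must be a number: $1" >&2; return 1 ;;
          esac
          printf '{"confidence": %s}' "$1"
        }

        # Same threshold as the command above.
        POSTPROCESS_SETTINGS="$(postprocess_settings 51.0)"
        echo "settings='$POSTPROCESS_SETTINGS'"
        ```

        To use it, replace `settings='{"confidence": 51.0}'` in the `gst-launch-1.0` command with `settings="$POSTPROCESS_SETTINGS"`; the shell passes the same argument in both cases.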
      </Tab>

      <Tab title="GStreamer Python application">
        * **Python source code:** [gst-ai-video-detection.py](https://github.com/qualcomm/gst-plugins-imsdk/tree/main/gst-python-examples/gst-ai-video-detection.py)

        * **Run:**

          ```bash theme={null}
          python3 gst-ai-video-detection.py -s "$VIDEO_SOURCE" -o display
          ```
      </Tab>

      <Tab title="GStreamer C/C++ application">
        * **Application source code:** [gst-ai-video-detection](https://github.com/qualcomm/gst-plugins-imsdk/tree/main/gst-sample-apps/gst-ai-video-detection)

        * **Build your application:**

          <Tabs>
            <Tab title="Yocto">
              <a href="../advanced/yocto-build#steps-to-build-custom-application">
                Steps to build custom application
              </a>
            </Tab>

            <Tab title="Ubuntu">
              <a href="../advanced/ubuntu-build#steps-to-build-custom-application">
                Steps to build custom application
              </a>
            </Tab>
          </Tabs>

        * **Run:**

          ```bash theme={null}
          gst-ai-video-detection -s "$VIDEO_SOURCE" -o display
          ```
      </Tab>
    </Tabs>
  </Step>
</Steps>

## Expected output

Detection results are overlaid on each video frame: bounding boxes and class labels are rendered on top of the original image in real time.

<img src="https://mintcdn.com/qimsdk/xdnKhBBjxpS5mUYP/qimsdk-overview/images/obj-expected_output.png?fit=max&auto=format&n=xdnKhBBjxpS5mUYP&q=85&s=8ff11c509df2ab5cfff1aac2e49b8c23" alt="gst-ai-video-detection" width="2260" height="1267" data-path="qimsdk-overview/images/obj-expected_output.png" />
