# Image Classification

> Building image classification pipelines with QIM SDK

In an image classification system, the pipeline analyzes each frame of a video stream and assigns labels that reflect the scene's content, such as identified objects or scene categories. Let's walk through an example of building an image classification pipeline with the QIM SDK using the ResNeXt101 image classification model, which can be [downloaded from Qualcomm's AI Hub](https://aihub.qualcomm.com/iot/models/resnext101).

The pipeline in this example looks like this:

<img src="https://mintcdn.com/qimsdk/xdnKhBBjxpS5mUYP/qimsdk-overview/images/image-classification-pipeline.png?fit=max&auto=format&n=xdnKhBBjxpS5mUYP&q=85&s=9b958169ff4ab7e300ac936c9eeb657a" alt="Diagram of image classification pipeline" width="2478" height="739" data-path="qimsdk-overview/images/image-classification-pipeline.png" />

<Note>Refer to [Building AI Pipelines](../qimsdk-overview/sdkoverview) for more general information about each element of an AI pipeline.</Note>

## Run example on device

<Steps>
  <Step title="Download Required Files">
    | File                  | Download                                                                                                                                               | Save as                 |
    | --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ | ----------------------- |
    | ResNeXt101 W8A8 model | [Qualcomm AI Hub — ResNeXt101](https://aihub.qualcomm.com/iot/models/resnext101)                                                                       | `resnet101-w8a8.tflite` |
    | Classification labels | <a href="../labels/imagenet.txt" download="imagenet.txt">imagenet.txt</a>                                                                              | `imagenet.txt`          |
    | Sample video          | <a href="https://github.com/qualcomm/sample-apps-for-qualcomm-linux/raw/refs/heads/main/qualcomm-linux/artifacts/videos/demo_samples/">Input video</a> | `ai_demo_sample.mp4`    |

    <Note>
      If any downloaded file is a `.zip` archive, extract it on your host machine before copying:
      `unzip filename.zip`
    </Note>
  </Step>

  <Step title="Copy files to device">
    Create the required directories and transfer the downloaded files to your device.

    <CodeGroup>
      ```bash SCP (SSH) theme={null}
      # $HOME expands on your host machine, so replace it with the matching device path before running the commands.
      # For QLI:    /root
      # For Ubuntu: /home/ubuntu
      # Modify this based on your platform and ensure files are copied to the correct location on the device.
      # Run from your host machine — replace <user> and <device-ip>
      ssh <user>@<device-ip> "mkdir -p $HOME/{models,labels,media,media/output}"
      scp resnet101-w8a8.tflite  <user>@<device-ip>:$HOME/models/
      scp imagenet.txt           <user>@<device-ip>:$HOME/labels/
      scp ai_demo_sample.mp4     <user>@<device-ip>:$HOME/media/
      ```
    </CodeGroup>
  </Step>

  <Step title="Connect to device">
    <CodeGroup>
      ```bash SSH theme={null}
      ssh <user>@<device-ip>
      ```
    </CodeGroup>
  </Step>

  <Step title="Set environment variables">
    ```bash theme={null}
    export MODEL_NAME=resnet101-w8a8.tflite
    export LABELS_NAME=imagenet.txt
    export SRC_VIDEO_NAME=ai_demo_sample.mp4
    export VIDEO_SOURCE="filesrc location=$HOME/media/$SRC_VIDEO_NAME ! qtdemux ! h264parse ! v4l2h264dec capture-io-mode=4 output-io-mode=4 ! video/x-raw,format=NV12"
    ```
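
    Before launching the pipeline, it can help to confirm that the files referenced by these variables are actually on the device. The following is a minimal sketch, not part of the QIM SDK: `check_inputs` is a hypothetical helper name, and it only tests that each path is a regular file.

    ```bash
    # Illustrative preflight: print every expected input file that is missing.
    # check_inputs is a hypothetical helper, not part of the QIM SDK.
    check_inputs() {
      missing=0
      for path in "$@"; do
        if [ ! -f "$path" ]; then
          echo "missing: $path"
          missing=1
        fi
      done
      # Exit status is 0 when every file exists, 1 otherwise.
      return "$missing"
    }
    ```

    For this example, call it with `"$HOME/models/$MODEL_NAME"`, `"$HOME/labels/$LABELS_NAME"`, and `"$HOME/media/$SRC_VIDEO_NAME"` as arguments. No output and a zero exit status mean that all three files are in place.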
  </Step>

  <Step title="Run example on device">
    <Tabs>
      <Tab title="GStreamer Command line">
        ```bash theme={null}
        gst-launch-1.0 $VIDEO_SOURCE ! \
          tee name=t \
          t. ! qtimlvconverter name=preprocess ! queue ! \
               qtimltflite name=inference delegate=external \
                 external-delegate-path=libQnnTFLiteDelegate.so \
                 external-delegate-options="QNNExternalDelegate,backend_type=htp;" \
                 model=$HOME/models/$MODEL_NAME ! queue ! \
               qtimlpostprocess name=postprocess results=1 module=mobilenet-softmax \
                 labels=$HOME/labels/$LABELS_NAME settings='{"confidence": 51.0}' ! \
               text/x-raw ! metamux. \
          t. ! qtimetamux name=metamux ! qtivoverlay ! waylandsink sync=true fullscreen=true
        ```
      </Tab>

      <Tab title="GStreamer Python application">
        * **Python source code:** [gst-ai-video-classification.py](https://github.com/qualcomm/gst-plugins-imsdk/tree/main/gst-python-examples/gst-ai-video-classification.py)

        * **Run:**

          ```bash theme={null}
          python3 gst-ai-video-classification.py -s "$VIDEO_SOURCE" -o display
          ```
      </Tab>

      <Tab title="GStreamer C/C++ application">
        * **Application source code:** [gst-ai-video-classification](https://github.com/qualcomm/gst-plugins-imsdk/tree/main/gst-sample-apps/gst-ai-video-classification)

        * **Build your application:**

                  <Tabs>
                    <Tab title="Yocto">
                      <a href="../advanced/yocto-build#steps-to-build-custom-application">
                        Steps to build custom application
                      </a>
                    </Tab>

                    <Tab title="Ubuntu">
                      <a href="../advanced/ubuntu-build#steps-to-build-custom-application">
                        Steps to build custom application
                      </a>
                    </Tab>
                  </Tabs>

        * **Run:**

          ```bash theme={null}
          gst-ai-video-classification -s "$VIDEO_SOURCE" -o display
          ```
      </Tab>
    </Tabs>
  </Step>
</Steps>

## Expected output

The classification result is overlaid in the top-left corner of the frame.

<img src="https://mintcdn.com/qimsdk/xdnKhBBjxpS5mUYP/qimsdk-overview/images/classification-camel.png?fit=max&auto=format&n=xdnKhBBjxpS5mUYP&q=85&s=ac0df27ab2fd0c441db483b744d31bdb" alt="Image of a camel classification" width="800" height="448" data-path="qimsdk-overview/images/classification-camel.png" />

## Stream splitting via tee

One of the powerful features of **GStreamer** is the ability to split a video or audio stream into multiple branches, so that the same stream can be processed or consumed in different ways simultaneously. This example uses the `tee` element to split the original video stream. One branch runs through the AI processing stages to generate classifications; `qtimetamux` then recombines its output with the original video stream so that the detected label can be displayed on top of the original image. Note that `tee` does not create threads by itself: it pushes each buffer to its branches in turn from the upstream streaming thread, so a blocked branch stalls the others. Placing a `queue` element in each branch gives that branch its own streaming thread and avoids this blocking.
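
The branching pattern can be tried in isolation, without any QIM SDK plugins. The sketch below is not taken from the example application: it assumes the stock `videotestsrc` and `fakesink` elements are installed, and the names `TEE_DEMO_PIPELINE` and `run_tee_demo` are purely illustrative.

```bash
# Illustrative only: a two-branch tee built from stock GStreamer elements.
# Each branch contains a queue so a stalled branch cannot block the other.
TEE_DEMO_PIPELINE='videotestsrc num-buffers=120 ! tee name=t ! queue ! fakesink t. ! queue ! fakesink'

# Hypothetical helper: launches the demo pipeline (requires gst-launch-1.0).
run_tee_demo() {
  # Left unquoted on purpose so the shell splits the string into arguments.
  gst-launch-1.0 $TEE_DEMO_PIPELINE
}
```

Calling `run_tee_demo` on a machine with GStreamer installed pushes 120 test frames through both branches and then exits. In the classification pipeline, the second `fakesink` is where the display branch would sit, and the first branch is where the inference stages would go.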

## Combining inference results with the original image

The [qtimetamux](../plugin-reference/qtimetamux) element attaches the AI inference results to the original NV12 video frame as custom GStreamer metadata. This keeps each video frame synchronized with its associated AI metadata, so downstream elements can access both for visualization, network streaming, or automated decision-making.

## Adding a text overlay on top of the image

The [qtivoverlay](../plugin-reference/qtivoverlay) element reads the AI metadata and renders visual overlays, such as bounding boxes and labels, directly onto the video frame without requiring buffer duplication.
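
If the pipeline fails to start, one quick check is whether the elements named on this page are registered on the device. The sketch below relies on the `--exists` option of `gst-inspect-1.0`; the helper name `missing_elements` and the `GST_INSPECT` override variable are assumptions made for illustration, not part of the QIM SDK.

```bash
# Illustrative check: print each named element that GStreamer cannot find.
# missing_elements and GST_INSPECT are hypothetical names for this sketch;
# GST_INSPECT only exists so the inspection tool can be swapped out.
missing_elements() {
  for element in "$@"; do
    if ! ${GST_INSPECT:-gst-inspect-1.0} --exists "$element" >/dev/null 2>&1; then
      echo "$element"
    fi
  done
}
```

For this example, run `missing_elements qtimlvconverter qtimltflite qtimlpostprocess qtimetamux qtivoverlay`. An empty result means all five elements were found. If `gst-inspect-1.0` itself is not installed, every element is reported as missing.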
