Segmentation differs fundamentally from classification and detection. Classification and detection models output discrete results (class labels, bounding boxes, confidence scores), whereas segmentation models generate pixel-wise masks that delineate object boundaries within each frame. This example uses the DeepLabV3+ MobileNet model from Qualcomm AI Hub.

For segmentation, the qtimlpostprocess plugin outputs an RGBA image mask rather than structured metadata. The qtivcomposer plugin blends this mask with the original video frame, with sink_1::alpha=0.5 setting the opacity of the mask input, so the qtivoverlay plugin is not needed. The order of inputs to qtivcomposer matters: the video frame must be connected first and the segmentation mask second, so that the mask is composited on top of the frame.
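To make the effect of sink_1::alpha=0.5 concrete, the blend can be written per pixel. This is a sketch that assumes conventional "source-over" compositing; the symbols F, M, a_M and O are introduced here for illustration, and the exact arithmetic inside qtivcomposer may differ.

```latex
% F(p): color of the video frame at pixel p
% M(p): color of the RGBA mask at pixel p, a_M(p) in [0,1]: the mask's own alpha
% \alpha = 0.5: the value set with sink_1::alpha
\[
  O(p) = \bigl(1 - \alpha\, a_M(p)\bigr)\, F(p) + \alpha\, a_M(p)\, M(p)
\]
```

Under this assumption, pixels where the mask is fully transparent (a_M = 0) show the original frame unchanged, and pixels where the mask is opaque (a_M = 1) show an equal mix of frame and mask color.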

Run example on device

1

Download Required Files

File                              Download                        Save as
DeepLabV3+ MobileNet W8A8 model   Qualcomm AI Hub (DeepLabV3+)    deeplabv3_plus_mobilenet.tflite
Segmentation labels               dv3-argmax.json                 dv3-argmax.json
Sample video                      Input video                     ai_demo_sample.mp4
If any downloaded file is a .zip archive, extract it on your host machine before copying: unzip filename.zip
2

Copy files to device

Create the required directories and transfer the downloaded files to your device.
# Run from your host machine; replace <user> and <device-ip>.
# Single quotes make $HOME expand on the device rather than on the host.
ssh <user>@<device-ip> 'mkdir -p $HOME/models $HOME/labels $HOME/media/output'
# scp resolves relative remote paths against the remote home directory.
scp deeplabv3_plus_mobilenet.tflite <user>@<device-ip>:models/
scp dv3-argmax.json <user>@<device-ip>:labels/
scp ai_demo_sample.mp4 <user>@<device-ip>:media/
3

Connect to device

ssh <user>@<device-ip>
4

Set environment variables

export MODEL_NAME=deeplabv3_plus_mobilenet.tflite
export LABELS_NAME=dv3-argmax.json
export SRC_VIDEO_NAME=ai_demo_sample.mp4
export VIDEO_SOURCE="filesrc location=$HOME/media/$SRC_VIDEO_NAME ! qtdemux ! h264parse ! v4l2h264dec capture-io-mode=4 output-io-mode=4 ! video/x-raw,format=NV12"
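
Before launching the pipeline, it can help to confirm that every file it reads is actually where the variables above point. The following is a minimal sketch; preflight is a hypothetical helper name introduced here, not part of any Qualcomm or GStreamer tooling.

```shell
# Hypothetical pre-flight check: report every path that is missing or unreadable.
# Returns 0 when all paths are readable, 1 otherwise.
preflight() {
    status=0
    for path in "$@"; do
        if [ ! -r "$path" ]; then
            echo "not readable: $path" >&2
            status=1
        fi
    done
    return $status
}

# Usage on the device, after the exports above:
# preflight "$HOME/models/$MODEL_NAME" \
#           "$HOME/labels/$LABELS_NAME" \
#           "$HOME/media/$SRC_VIDEO_NAME" && echo "all inputs found"
```

A failed check usually means a file was saved under a different name than the table in step 1 lists, or the copy in step 2 did not complete.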
5

Run example on device

gst-launch-1.0 $VIDEO_SOURCE ! \
  tee name=t \
  t. ! queue ! mixer. \
  t. ! qtimlvconverter name=preprocess ! queue ! \
       qtimltflite name=inference delegate=external \
         external-delegate-path=libQnnTFLiteDelegate.so \
         external-delegate-options="QNNExternalDelegate,backend_type=htp;" \
         model=$HOME/models/$MODEL_NAME ! queue ! \
       qtimlpostprocess name=postprocess module=deeplab-argmax \
         labels=$HOME/labels/$LABELS_NAME ! mixer. \
  qtivcomposer name=mixer sink_1::alpha=0.5 ! video/x-raw,format=NV12 ! \
  waylandsink sync=true fullscreen=true
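
The mask opacity in the pipeline above is fixed at 0.5. To experiment with other values, the composer fragment can be generated from a parameter. This is a sketch only: composer_with_alpha is a hypothetical helper, and it assumes that sink_1::alpha accepts values from 0.0 (mask hidden) to 1.0 (mask opaque).

```shell
# Hypothetical helper: print the qtivcomposer fragment for a given mask opacity.
# Assumption: sink_1::alpha takes a value between 0.0 and 1.0.
composer_with_alpha() {
    alpha=$1
    # The regex rejects non-numeric input; awk performs the floating-point range check.
    if ! awk -v a="$alpha" 'BEGIN { exit !(a ~ /^[0-9]*\.?[0-9]+$/ && a + 0 >= 0 && a + 0 <= 1) }'; then
        echo "alpha must be between 0 and 1: $alpha" >&2
        return 1
    fi
    printf 'qtivcomposer name=mixer sink_1::alpha=%s' "$alpha"
}

# Usage: in the gst-launch-1.0 command, replace
#   qtivcomposer name=mixer sink_1::alpha=0.5
# with
#   $(composer_with_alpha 0.3)
```

Lower values make the original video more visible; higher values make the segmentation classes easier to distinguish.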

Expected output

The segmentation mask is blended on top of the original video frame in real time.