GStreamer plugin architecture
The QIM SDK encapsulates hardware complexity within a modular plugin architecture, so developers do not have to manage low-level platform libraries or hardware-specific details that vary across Qualcomm chipsets and generations. Each QIM SDK plugin maps directly to a dedicated hardware accelerator (video encode/decode, camera ISP, GPU, display, or AI/ML accelerator), giving developers a unified API surface for building multimedia and AI pipelines with minimal integration overhead.
Camera Architecture
The QIM SDK abstracts the underlying camera driver and hardware through a client-server architecture, shielding developers from low-level camera management and enabling rapid construction of single-stream, multi-stream, multi-client, and multi-camera applications. At its core, the qticamsrc GStreamer element operates as a client to the le-camera service, a system-level service responsible for managing the camera HAL, stream lifecycle, buffer allocation, and capture control. This separation of responsibilities allows qticamsrc to expose the full camera capability set to the GStreamer pipeline through source pads, camera properties, and control signals.
At runtime, qticamsrc performs the following on behalf of the pipeline:
- Opens and configures the target camera device
- Creates and manages video and image output streams
- Receives captured frames from the le-camera service
- Wraps camera buffers as GstBuffer objects and pushes them downstream through the corresponding source pads
- Returns buffers to the camera service upon downstream consumption
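As a rough illustration of the pipeline side of this arrangement, the sketch below builds a gst-launch-style description that links one qticamsrc source pad to a display sink. The helper name `camera_preview_pipeline` and the caps values (NV12, 1280x720, 30 fps) are assumptions made for illustration; the formats and resolutions actually offered depend on the camera and platform.

```python
def camera_preview_pipeline(width=1280, height=720, fps=30):
    """Build a gst-launch-style description: camera source -> caps filter -> display."""
    # The caps filter constrains what the camera source pad negotiates.
    # NV12 and the default size are illustrative assumptions, not SDK defaults.
    caps = (
        f"video/x-raw,format=NV12,width={width},height={height},"
        f"framerate={fps}/1"
    )
    return " ! ".join(["qticamsrc", caps, "waylandsink"])


print(camera_preview_pipeline())
```

The resulting string can be passed to `gst-launch-1.0` on the target, or to `Gst.parse_launch` from an application.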

Related Information
Camera use cases

AI Inference Architecture
The QIM SDK provides end-to-end support for AI/ML video and audio analytics pipelines, covering the full processing chain from raw media ingestion through model inference to real-time result visualization. AI pipelines in the QIM SDK are built around a tensor-based, three-stage processing model: preprocessing, inference, and postprocessing. Each stage is handled by a dedicated plugin, and the stages execute in parallel across consecutive frames (source capture at frame N-2, preprocessing at N-1, inference at N, and postprocessing at N+1), which keeps each hardware block busy and improves pipeline throughput. The pipeline source can be live camera frames via qticamsrc (YUV) or offline media via filesrc (with format conversion); the sink can be an on-screen display via waylandsink or local storage via filesink.
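One way to read the staggered execution described above is as a four-slot software pipeline in which a frame advances by one stage per time slot. The toy model below is only an illustration of that idea: the stage names come from the text, while the one-slot-per-stage timing and the function name `stage_schedule` are assumptions and do not describe the SDK's actual scheduler.

```python
STAGES = ("capture", "preprocess", "inference", "postprocess")


def stage_schedule(num_frames):
    """Map each time slot to the frame index every stage is working on."""
    slots = {}
    for frame in range(num_frames):
        for delay, stage in enumerate(STAGES):
            # A frame enters stage k exactly k slots after it was captured.
            slots.setdefault(frame + delay, {})[stage] = frame
    return slots


for slot, work in sorted(stage_schedule(5).items()):
    print(slot, work)
```

In this model, once the pipeline has filled, all four stages are busy in every slot, each on a different frame, so steady-state throughput is bounded by the slowest stage rather than by the sum of the stage latencies.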
The qtimlvconverter preprocessing plugin converts raw video frames into normalized tensors by performing color space conversion, resizing, and mean subtraction. It queries tensor dimensions and format requirements directly from the downstream inference plugin at runtime, making it a fully generic, model-agnostic preprocessing stage compatible with all inference backends. All inference plugins operate in tensor-in / tensor-out mode and share the same qtimlvconverter for preprocessing. QIM SDK supports the following inference engines, each with hardware delegate support for acceleration across Qualcomm’s neural processing subsystems:
| Plugin | Engine |
|---|---|
| qtimltflite | LiteRT / TensorFlow Lite |
| qtimlsnpe | Snapdragon Neural Processing Engine (SNPE) |
| qtimlqnn | Qualcomm Neural Network (QNN) |
| qtimlonnx | ONNX Runtime |

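Because every backend shares the same preprocessing element, switching engines only changes one element in the pipeline description. The sketch below captures the table as a lookup; the plugin names come from the table, while the short engine keys and the helper name `inference_fragment` are hypothetical. Model paths, delegate selection, and the postprocessing stage are deliberately left out because their properties are not described in this section.

```python
# Plugin names are taken from the table above; the short keys are
# hypothetical labels chosen for this sketch.
INFERENCE_PLUGINS = {
    "tflite": "qtimltflite",
    "snpe": "qtimlsnpe",
    "qnn": "qtimlqnn",
    "onnx": "qtimlonnx",
}


def inference_fragment(engine):
    """Return the preprocessing + inference part of a pipeline description."""
    try:
        plugin = INFERENCE_PLUGINS[engine]
    except KeyError:
        raise ValueError(f"unsupported engine: {engine!r}") from None
    # qtimlvconverter is shared by every backend, so only the second
    # element changes when the engine changes.
    return f"qtimlvconverter ! {plugin}"


print(inference_fragment("snpe"))
```

The properties each inference plugin actually accepts can be listed on the target with `gst-inspect-1.0 <plugin>`.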
The qtivoverlay plugin consumes the structured ML metadata produced by postprocessing and renders inference results directly onto the video buffer in real time. The annotated stream can then be rendered to a display, encoded and stored locally, or streamed over a network.
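Returning to the preprocessing stage, the mean subtraction and normalization attributed to qtimlvconverter can be illustrated numerically. This is a plain-Python sketch of the arithmetic only: the per-channel `mean` and `sigma` values are hypothetical (real values depend on the model), and the color conversion and resizing steps are omitted.

```python
def normalize_pixel(rgb, mean=(127.5, 127.5, 127.5), sigma=(127.5, 127.5, 127.5)):
    """Subtract the per-channel mean, then divide by the per-channel sigma."""
    return tuple((c - m) / s for c, m, s in zip(rgb, mean, sigma))


def frame_to_tensor(pixels, **kwargs):
    """Normalize a row-major list of RGB pixels into a list of float triples."""
    return [normalize_pixel(p, **kwargs) for p in pixels]


# With the assumed defaults, 8-bit values in [0, 255] map to [-1.0, 1.0].
print(frame_to_tensor([(0, 0, 0), (255, 255, 255)]))
```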

Related Information
Run AI/ML use cases

Video Architecture
The QIM SDK provides hardware-accelerated video encode and decode through V4L2-based plugins that interface directly with the Qualcomm multimedia engine. Encoding and decoding run entirely on the dedicated VPU (Video Processing Unit), which offloads this work from the CPU so that camera capture, AI inference, and video processing can run concurrently without competing for CPU cycles. The architecture supports AVC (H.264) and HEVC (H.265) for both the encode and decode paths. Encoded output can be written to file via a multiplexer or streamed over the network. Decoded output is delivered as DMA-buf-backed NV12 frames, compatible with all downstream QIM SDK plugins, including AI preprocessing and display.
Encode
Video frames captured from the camera via qticamsrc are passed directly to the hardware encoder. The encoded bitstream is parsed and multiplexed into an MP4 or MPEGTS container before being written to file or transmitted over the network.

| Component | Description |
|---|---|
| qticamsrc | Captures video streams from the ISP camera. See Camera Architecture. |
| v4l2h264enc | Encodes the video stream to AVC (H.264) format using the V4L2 driver. |
| v4l2h265enc | Encodes the video stream to HEVC (H.265) format using the V4L2 driver. |
| h264parse / h265parse | Parses the encoded bitstream and inserts headers required for downstream muxing or streaming. |
| mp4mux / mpegtsmux | Multiplexes the encoded stream into an MP4 or MPEGTS container. |
| filesink | Writes the muxed container to the file system. |

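The component chain in the table can be written out as a pipeline description. The sketch below assembles one from the element names listed above; the helper name `encode_pipeline`, the mapping from file extension to muxer, and the example path are assumptions for illustration, and encoder properties such as bitrate are omitted.

```python
import os

# Encoder/parser pairs and muxers as listed in the component table.
ENCODERS = {
    "h264": ("v4l2h264enc", "h264parse"),
    "h265": ("v4l2h265enc", "h265parse"),
}
# Choosing the muxer from the file extension is a convention of this sketch.
MUXERS = {".mp4": "mp4mux", ".ts": "mpegtsmux"}


def encode_pipeline(codec, output_path):
    """Build a camera -> encoder -> parser -> muxer -> file description."""
    if codec not in ENCODERS:
        raise ValueError(f"unsupported codec: {codec!r}")
    extension = os.path.splitext(output_path)[1].lower()
    if extension not in MUXERS:
        raise ValueError(f"no muxer known for extension: {extension!r}")
    encoder, parser = ENCODERS[codec]
    elements = [
        "qticamsrc",
        encoder,
        parser,
        MUXERS[extension],
        f"filesink location={output_path}",
    ]
    return " ! ".join(elements)


print(encode_pipeline("h264", "/tmp/out.mp4"))  # hypothetical output path
```

When a description like this is run with `gst-launch-1.0`, the `-e` option forwards end-of-stream on interrupt so that mp4mux can finalize the file.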
Decode
A video file is demultiplexed, the elementary stream is parsed, and the hardware decoder produces raw NV12 frames for display or downstream AI processing. Decode parameters are exposed as GStreamer element properties for direct application control.
| Component | Description |
|---|---|
| filesrc | Reads the video container from the file system. |
| qtdemux | Demultiplexes the container into separate elementary streams. |
| h264parse / h265parse | Parses the encoded elementary stream before decoding. |
| v4l2h264dec | Decodes the AVC (H.264) video stream to raw NV12 frames using the V4L2 driver. |
| v4l2h265dec | Decodes the HEVC (H.265) video stream to raw NV12 frames using the V4L2 driver. |
| waylandsink | Receives decoded NV12 frames as GBM buffers and submits them to the Weston compositor for display. |

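The decode chain can be sketched the same way, using only the elements named in the table. The helper name `decode_pipeline` and the example path are hypothetical, and decoder properties are omitted.

```python
# Parser/decoder pairs as listed in the component table.
DECODERS = {
    "h264": ("h264parse", "v4l2h264dec"),
    "h265": ("h265parse", "v4l2h265dec"),
}


def decode_pipeline(codec, input_path):
    """Build a file -> demuxer -> parser -> decoder -> display description."""
    if codec not in DECODERS:
        raise ValueError(f"unsupported codec: {codec!r}")
    parser, decoder = DECODERS[codec]
    elements = [
        f"filesrc location={input_path}",
        "qtdemux",
        parser,
        decoder,
        "waylandsink",
    ]
    return " ! ".join(elements)


print(decode_pipeline("h265", "/tmp/clip.mp4"))  # hypothetical input path
```

This sketch links only the video track. qtdemux exposes one source pad per stream, so a file that also carries audio would need a second branch from the demuxer.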
Related Information
Video encode and decode use cases
Video playback use cases

Audio Architecture
Audio capture and playback in the QIM SDK are handled through the pulsesrc and pulsesink GStreamer plugins, which interface with the system-level PulseAudio server. The PulseAudio server in turn communicates with the ALSA driver to interact with the underlying audio hardware.
Audio encode and decode use open-source GStreamer plugins (such as flacenc, flacparse, flacdec, lamemp3enc, mpegaudioparse, mpg123audiodec) combined with the PulseAudio integration, enabling a complete audio pipeline without requiring custom hardware-specific code.
Capture
The pulsesrc plugin captures raw PCM audio from the microphone through the PulseAudio server, which handles hardware interaction via the ALSA driver. The PCM stream can then be encoded or written directly to file.

| Component | Description |
|---|---|
| pulsesrc | Captures PCM audio samples from the microphone via the PulseAudio server. |
| PulseAudio server | Interfaces with the ALSA driver to acquire audio data from the hardware. |
| Encode / filesink | Optionally encodes and writes the captured audio to a file. |

Playback
The pulsesink plugin receives PCM audio data and routes it through the PulseAudio server for hardware playback. It supports both live audio sources and encoded audio files once they have been decoded.

| Component | Description |
|---|---|
| filesrc | Reads audio data from a file. |
| Decode | Decodes encoded audio (e.g., MP3, FLAC, WAV) to raw PCM using open-source decoder plugins. |
| pulsesink | Sends PCM audio data to the PulseAudio server for playback on the audio hardware. |
| PulseAudio server | Interfaces with the ALSA driver to route audio to the hardware output. |

Encode
Captured PCM audio is encoded using an open-source encoder, parsed, and multiplexed into a container before being written to file.
| Component | Description |
|---|---|
| pulsesrc | Captures PCM audio from the microphone. |
| PulseAudio server | Interacts with the ALSA driver to supply raw audio data. |
| Encode | Encodes PCM to a compressed format (MP3, FLAC, AAC) using open-source plugins. |
| Parse | Parses the encoded bitstream before muxing. |
| mp4mux / mpegtsmux | Multiplexes the encoded audio into an MP4 or MPEGTS container. |
| filesink | Writes the container to the file system. |

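As one concrete instance of this chain, the sketch below builds an MP3 recording pipeline from the open-source elements mentioned earlier (lamemp3enc, mpegaudioparse). The helper name `audio_record_pipeline` and the example path are hypothetical, and `audioconvert` is an addition not listed in this section: it is a standard GStreamer element assumed here to adapt the captured sample format to what the encoder accepts.

```python
def audio_record_pipeline(output_path, muxer="mp4mux"):
    """Build a microphone -> MP3 encoder -> parser -> muxer -> file description."""
    if muxer not in ("mp4mux", "mpegtsmux"):
        raise ValueError(f"unsupported muxer: {muxer!r}")
    elements = [
        "pulsesrc",        # PCM capture via the PulseAudio server
        "audioconvert",    # assumption: format adaptation before encoding
        "lamemp3enc",      # Encode stage (MP3)
        "mpegaudioparse",  # Parse stage
        muxer,
        f"filesink location={output_path}",
    ]
    return " ! ".join(elements)


print(audio_record_pipeline("/tmp/voice.mp4"))  # hypothetical output path
```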
Decode
An encoded audio file is read, demultiplexed, decoded back to PCM, and played back through pulsesink.

| Component | Description |
|---|---|
| filesrc | Reads the audio container from the file system. |
| qtdemux | Demultiplexes the container to extract the audio elementary stream. |
| Decode | Decodes the compressed audio to raw PCM using open-source decoder plugins. |
| pulsesink | Sends the decoded PCM to the PulseAudio server for playback. |
| PulseAudio server | Interfaces with the ALSA driver to route audio to the hardware output. |

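A matching playback sketch for an MP4 file carrying MP3 audio is shown below, using the mpegaudioparse and mpg123audiodec elements mentioned earlier. The helper name `audio_playback_pipeline` and the example path are hypothetical, and `audioconvert` is again an assumed addition for format adaptation rather than something this section specifies.

```python
def audio_playback_pipeline(input_path):
    """Build a file -> demuxer -> parser -> MP3 decoder -> speaker description."""
    elements = [
        f"filesrc location={input_path}",
        "qtdemux",         # extracts the audio elementary stream
        "mpegaudioparse",
        "mpg123audiodec",  # Decode stage (MP3 -> PCM)
        "audioconvert",    # assumption: format adaptation before playback
        "pulsesink",       # PCM playback via the PulseAudio server
    ]
    return " ! ".join(elements)


print(audio_playback_pipeline("/tmp/voice.mp4"))  # hypothetical input path
```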
Related Information
Audio use cases

Graphics and Display Architecture
The QIM SDK uses the Wayland display protocol with the Weston compositor to manage display composition and output. Weston runs as an independent process and handles all display composition using OpenGL ES, communicating with the display hardware through the DRM/KMS subsystem. QIM SDK plugins deliver video buffers as GBM-backed DMA-buf handles. Weston receives these buffers via the Wayland protocol and composites them directly onto the display without intermediate CPU copies, maintaining the zero-copy path from camera capture or video decode through to the screen.
| Component | Description |
|---|---|
| Wayland/GLES client | GStreamer plugins such as waylandsink implement the Wayland client protocol to submit video buffers to Weston. |
| Weston server | Implements the Wayland compositor. Uses KMS to configure the display and OpenGL ES with DRM for hardware-accelerated compositing. |
| SDM back-end | Display hardware abstraction layer (HAL) that provides the DRM/KMS platform implementation for Weston. |
| libGBM | Buffer allocation and sharing library. Provides a DMA-backed allocator enabling zero-copy buffer sharing between the GPU, display, and other hardware blocks. |
| EGL sub-driver | Interface between GBM/EGL and the Wayland protocol, allowing GStreamer elements to share hardware-allocated buffers with the Weston compositor. |
