Run context binaries (.bin/.dlc)
Some models from AI Hub (and from internal Qualcomm releases) are released as context binaries (.bin files). A context binary contains the model plus hardware-specific optimizations, and can be run with Qualcomm tools that use the Qualcomm® AI Runtime SDK directly. Examples are Genie (to run LLMs) and VoiceAI ASR (to run voice transcription); but you can also run context binaries directly from Python using QAI AppBuilder.
Not portable: Context binaries are tied to both the AI Engine Direct SDK version and your hardware target.
Finding supported models
Models in context binary format can be found in a few places:
Qualcomm AI Hub (note that these come in .dlc format - you'll need to convert them to .bin files, see below): Under 'Chipset', select:
RB3 Gen 2 Vision Kit: 'Qualcomm QCS6490 (Proxy)'
RUBIK Pi 3: 'Qualcomm QCS6490 (Proxy)'
IQ-9075 EVK: 'Qualcomm QCS9075 (Proxy)'
Under 'Runtime', select "Qualcomm® AI Runtime".
Under 'Chipset', select:
RB3 Gen 2 Vision Kit: 'Qualcomm QCS6490'
RUBIK Pi 3: 'Qualcomm QCS6490'
IQ-9075 EVK: 'Qualcomm QCS9075'
Note that the NPU only supports quantized models. Floating point models (or layers) will be automatically moved back to the CPU.
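Quantized here typically means a format like w8a8 (8-bit weights, 8-bit activations), as in the model filename used later on this page. As an illustration only, here is a minimal numpy sketch of the affine quantization scheme such models rely on; the scale and zero-point values are made up, and real models store per-tensor or per-channel parameters chosen during quantization:

```python
import numpy as np

def quantize(x, scale, zero_point):
    # Affine quantization to uint8: q = round(x / scale) + zero_point, clipped to [0, 255]
    q = np.round(x / scale) + zero_point
    return np.clip(q, 0, 255).astype(np.uint8)

def dequantize(q, scale, zero_point):
    # Inverse mapping back to float32
    return (q.astype(np.float32) - zero_point) * scale

x = np.array([-1.0, 0.0, 0.5, 1.0], dtype=np.float32)
scale, zero_point = 1.0 / 127, 128   # made-up example parameters
q = quantize(x, scale, zero_point)
x_hat = dequantize(q, scale, zero_point)
# x_hat matches x to within one quantization step (scale)
```

Layers that cannot be expressed this way (i.e. that stay in floating point) are what the runtime moves back to the CPU.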
From .dlc -> .bin
If your model comes in .dlc format (a portable serialized format), you'll first need to convert it to a context binary (.bin file). You do this with qnn-context-binary-generator (part of the AI Runtime SDK). Open a terminal on your development board, or an ssh session to it, and:
Install the AI Runtime SDK - Community Edition:
Convert the .dlc file into a .bin file:
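A hedged sketch of the conversion command: qnn-context-binary-generator loads the .dlc through the SDK's libQnnModelDlc.so helper and serializes it for the HTP (NPU) backend. The SDK install path, library directory name, and flags vary between SDK versions, so verify against qnn-context-binary-generator --help on your board:

```shell
# Assumption: QNN_SDK_ROOT points at your AI Runtime SDK install; adjust to your setup.
export QNN_SDK_ROOT=/opt/qcom/aistack/qairt/2.35.0
qnn-context-binary-generator \
    --model "${QNN_SDK_ROOT}/lib/aarch64-oe-linux-gcc11.2/libQnnModelDlc.so" \
    --dlc_path inception_v3-inception-v3-w8a8.dlc \
    --backend "${QNN_SDK_ROOT}/lib/aarch64-oe-linux-gcc11.2/libQnnHtp.so" \
    --binary_file inception_v3-inception-v3-w8a8 \
    --output_dir output
```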
You should now have output/inception_v3-inception-v3-w8a8.bin. Note that this file is not portable. It's tied to both the AI Engine Direct SDK version and your hardware target.
Troubleshooting
If conversion fails (e.g. with Failed to create dlc handle with code 1002 for dlc file), there may be a mismatch between the QNN version that created the .dlc file and the QNN version on your development board. Run:
In this example the file was created by QAIRT 2.37.0, but the development board runs 2.35.0. Ask the person who gave you the .dlc file for one exported with QNN 2.35.0 instead.
Example: Inception-v3 (Python)
Here's how you can run an image classification model (downloaded from AI Hub) on the NPU using QAI AppBuilder. Open a terminal on your development board, or an ssh session to it, and:
Build the AppBuilder wheel with QNN bindings:
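The build step might look like the following; the repository name and build commands are assumptions based on the QAI AppBuilder project (ai-engine-direct-helper), so follow its README for the authoritative steps:

```shell
# Assumption: QAI AppBuilder lives in the ai-engine-direct-helper repository.
git clone https://github.com/quic/ai-engine-direct-helper.git
cd ai-engine-direct-helper
# Build the wheel on the device itself so it binds against the local QNN libraries.
pip install setuptools wheel
python setup.py bdist_wheel
pip install dist/qai_appbuilder-*.whl
```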
Now create a new folder for the application:
Download the .dlc file from AI Hub, and convert it to a .bin file:
Create a new file context_demo.py and add:
Run the example:
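As a sketch, context_demo.py could look like the following. The QNNConfig/QNNContext names come from the QAI AppBuilder project; the library path, the 224x224 input size, and the preprocessing are assumptions - check your SDK install and the model card on AI Hub for the exact input spec:

```python
import numpy as np
from PIL import Image
from qai_appbuilder import QNNConfig, QNNContext, Runtime, LogLevel, ProfilingLevel

# Point the runtime at the QNN libraries and select the NPU (HTP) backend.
# The library path is an assumption -- use the lib directory of your SDK install.
QNNConfig.Config("/opt/qcom/aistack/qairt/lib", Runtime.HTP, LogLevel.WARN,
                 ProfilingLevel.BASIC)

# Load the context binary produced in the conversion step.
model = QNNContext("inception_v3", "output/inception_v3-inception-v3-w8a8.bin")

# Preprocess: 224x224 RGB scaled to [0, 1] (assumed -- check the model card).
img = Image.open("image.jpg").convert("RGB").resize((224, 224))
x = (np.asarray(img).astype(np.float32) / 255.0)[np.newaxis, ...]

# Inference takes a list of input tensors and returns a list of outputs.
probs = model.Inference([x])[0]
print("Top class index:", int(np.argmax(probs)))
```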
Great! You have now run a model in context binary format on the NPU.