Hi @kimminsu38oo
Yes, the ExecuTorch QNN Intermediate Output Debugger is used to debug accuracy issues by comparing per-tensor intermediate outputs against CPU reference results. QHAS and optrace are used for performance analysis, based on dumps generated from the .pte file.
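The per-tensor comparison can be sketched as follows. This is a minimal illustration of the general technique (cosine similarity plus maximum absolute difference between a delegate output and its CPU reference), not the debugger's actual API; the function name and thresholds are hypothetical.

```python
import math

def compare_tensors(qnn_out, cpu_out, atol=1e-2):
    # Hypothetical helper: compare one flattened QNN intermediate output
    # against the CPU reference using two common accuracy metrics.
    dot = sum(q * c for q, c in zip(qnn_out, cpu_out))
    nq = math.sqrt(sum(q * q for q in qnn_out))
    nc = math.sqrt(sum(c * c for c in cpu_out))
    cosine = dot / (nq * nc + 1e-12)          # ~1.0 means same direction
    max_diff = max(abs(q - c) for q, c in zip(qnn_out, cpu_out))
    return cosine, max_diff, max_diff <= atol  # pass/fail at tolerance atol

cosine, max_diff, ok = compare_tensors([1.0, 2.0, 3.0], [1.01, 1.99, 3.0])
```

A per-op sweep of such comparisons is what lets you localize the first layer where the QNN output diverges from the CPU result.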

You can refer to the section about generate-optrace-and-qhas.
Please note that the input order in the context binary may differ from the source model. You can check the input order in the JSON file generated by <QNN_SDK_ROOT>/bin/x86_64-linux-clang/qnn-context-binary-utility --context_binary $1 --json_file $2.
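Once you have the JSON file, the input order can be read out with a few lines of Python. Note this is a hedged sketch: the field names below ("info", "graphs", "graphInputs", "graphName") are assumptions about the qnn-context-binary-utility output layout and may differ between QNN SDK versions, so adapt them to the file you actually get; the sample document is fabricated for illustration.

```python
import json

# Fabricated excerpt imitating the utility's JSON output (field names are
# assumptions; check your SDK version's actual output).
SAMPLE = """
{
  "info": {
    "graphs": [
      {"info": {"graphName": "forward",
                "graphInputs": [{"info": {"name": "tokens"}},
                                {"info": {"name": "atten_mask"}}]}}
    ]
  }
}
"""

def input_order(doc):
    # Map each graph name to its input tensor names, in context-binary order.
    order = {}
    for graph in doc["info"]["graphs"]:
        info = graph["info"]
        order[info["graphName"]] = [t["info"]["name"] for t in info["graphInputs"]]
    return order

print(input_order(json.loads(SAMPLE)))
```

Comparing this list against the source model's input signature tells you whether you need to reorder inputs before feeding the context binary.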

The following shows how to generate an optrace and a QHAS with llama.py for stories260K.

Reproduce command

python examples/qualcomm/oss_scripts/llama/llama.py -b build-andro…

Answer selected by liu-mengyang