Neuron Compiler Fails to Trace Supported Ops — Only ~10% Coverage on ConvLSTM3D + Attention Model #1202

@kayhustle

Description

I've spent several days trying to compile a ConvLSTM3D-based Keras model with AWS NeuronX (TF 2.10.1, latest Neuron SDK). My use case runs inference every 15 minutes (total inference time under 15 seconds), so Inf2 would be a great fit instead of keeping a GPU-backed instance up 24/7 or waiting up to 10 minutes for it to stop between inference calls.

The model uses ConvLSTM3D, Conv3D, Dense, BatchNormalization, Embedding layers, and a simple custom temporal attention layer.

According to the official Neuron Ops List, these ops are either directly supported or composed of supported ops.

I validated the model with analyze_model(), which reported ~78% of ops supported.
However, when actually tracing with tfnx.trace(), Neuron compiles only ~11-13% of the graph onto the device, even with minimal control flow and no unnecessary ops. That coverage wouldn't work for a production deployment.

Key concerns:

Discrepancy between analyze_model report and actual trace compile percentage

Poor support for common ops like LSTM/ConvLSTM or MatMul inside attention mechanisms

Lack of transparency on how control flow ops block entire graph segments from tracing

No clear documentation on how to structure models for better compilation rates

I’m happy to provide example code and the full trace logs if you want.
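In the meantime, here is a sketch of the kind of minimal repro I can share. Layer sizes and input shapes are placeholders chosen small on purpose, and the tfnx.trace call follows the tensorflow-neuronx documentation; it has to run on an Inf2 instance with the Neuron SDK installed, so that part is commented out here:

```python
import tensorflow as tf

# Minimal model with the same op mix as the real one (shapes are placeholders).
# ConvLSTM3D expects 6D input: (batch, time, dim1, dim2, dim3, channels).
inp = tf.keras.Input(shape=(4, 8, 8, 8, 1))
x = tf.keras.layers.ConvLSTM3D(2, kernel_size=3, padding="same")(inp)
x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.GlobalAveragePooling3D()(x)
out = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inp, out)

# On an Inf2 instance with tensorflow-neuronx installed:
# import tensorflow_neuronx as tfnx
# example = tf.random.uniform((1, 4, 8, 8, 8, 1))
# traced = tfnx.trace(model, example)  # logs the fraction of ops placed on Neuron
# traced.save("convlstm3d_neuron")
```

Even on this stripped-down model, the traced percentage is far below what analyze_model() reports for the full model.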
Thanks

Metadata

    Labels

    compiler, documentation
