# RuntimeError: Triton compilation failed: unsupported instruction 'mma.sync.aligned.m16n8k16.row.col.f16.f16.f16'

- **ID:** `cuda/triton-asm-unsupported`
- **Domain:** cuda
- **Category:** build_error
- **Verification:** ai_generated
- **Fix Rate:** 80%

## Root Cause

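A Triton kernel emits a PTX instruction (here `mma.sync.aligned.m16n8k16`, a tensor-core matrix multiply-accumulate) that the target GPU architecture does not support. This typically happens on an older GPU such as the NVIDIA T4 (sm_75), or when the kernel is compiled for the wrong compute capability.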
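To confirm whether the hardware is the cause, it can help to check the device's compute capability before launching the kernel. The sketch below is a minimal illustration, not part of Triton or PyTorch: the helper names `supports_m16n8k16` and `current_capability` are hypothetical, and the sm_80 threshold is taken from the workaround in this entry. Only `torch.cuda.is_available()` and `torch.cuda.get_device_capability()` are real PyTorch calls.

```python
# Threshold assumed from this entry: the m16n8k16 f16 MMA needs sm_80 or newer.
MIN_CAPABILITY = (8, 0)


def supports_m16n8k16(capability):
    """Return True if a (major, minor) compute capability meets the threshold."""
    major, minor = capability
    return (major, minor) >= MIN_CAPABILITY


def current_capability():
    """Return the (major, minor) capability of the current CUDA device, or None."""
    try:
        import torch  # imported lazily so the pure check above works without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return tuple(torch.cuda.get_device_capability())


if __name__ == "__main__":
    cap = current_capability()
    if cap is None:
        print("No CUDA device visible (or PyTorch is not installed).")
    else:
        print(f"sm_{cap[0]}{cap[1]}: m16n8k16 supported = {supports_m16n8k16(cap)}")
```

On a T4 the capability tuple is `(7, 5)`, which falls below the threshold, consistent with the failure described here.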

## Version Compatibility

| Component | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| Triton 2.1.0 | active | — | — |
| PyTorch 2.2.0 | active | — | — |
| NVIDIA T4 (sm_75) | active | — | — |

## Workarounds

1. **Run the kernel on a GPU with compute capability >= 8.0 (Ampere or newer).** If that is not possible, keep Triton out of the execution path by setting the environment variable `TORCHDYNAMO_USE_TRITON=0`, so that PyTorch falls back to non-Triton CUDA kernels. Check that your PyTorch version honors this variable. (80% success)
   ```
   export TORCHDYNAMO_USE_TRITON=0
   ```
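Where an environment variable is not convenient, a similar fallback can be expressed in code by choosing the `torch.compile` backend from the device capability. This is a swapped-in technique rather than the workaround recorded above: it assumes the kernel comes from `torch.compile` with the default `inductor` backend, and it uses the `aot_eager` backend, which does not generate Triton kernels. The helper names `choose_backend` and `compile_for_device` are hypothetical.

```python
def choose_backend(capability):
    """Pick a torch.compile backend name from a (major, minor) capability.

    Assumption from this entry: sm_80 or newer can run the failing instruction,
    so the Triton-based "inductor" backend is kept there. Anything older, or an
    unknown device (None), gets "aot_eager", which skips Triton code generation.
    """
    if capability is not None and tuple(capability) >= (8, 0):
        return "inductor"
    return "aot_eager"


def compile_for_device(model):
    """Compile `model` with a backend suited to the current CUDA device."""
    import torch  # imported lazily so choose_backend works without PyTorch

    cap = torch.cuda.get_device_capability() if torch.cuda.is_available() else None
    return torch.compile(model, backend=choose_backend(cap))
```

Falling back this way gives up Triton's fused kernels on older GPUs, so it trades performance for a build that completes. Hand-written `@triton.jit` kernels are not affected by the backend choice and would still need a capability guard.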

## Dead Ends

- **** — The error is hardware-limited, not software. (70% fail)
- **** — Triton uses its own JIT compiler, independent of TensorExpr. (90% fail)