# RuntimeError: Triton compilation failed: LLVM ERROR: out of memory when compiling kernel with large shared memory

- **ID:** `cuda/triton-compilation-llvm-crash`
- **Domain:** cuda
- **Category:** build_error
- **Verification:** ai_generated
- **Fix Rate:** 72%

## Root Cause

The Triton JIT compiler invokes LLVM to optimize and lower kernel code. When a kernel requests excessive shared memory (more than 48 KB per block, the default limit on most GPUs), LLVM's allocations for register spilling and optimization passes can grow until compilation aborts with an out-of-memory error.
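
To make the 48 KB threshold concrete, here is a rough sketch of how tile sizes translate into a per-block shared memory footprint. It assumes a tiled matmul-style kernel that buffers one A tile and one B tile per pipeline stage; that buffering model, the function names, and the default `dtype_bytes` and `num_stages` values are illustrative assumptions, not something Triton reports.

```python
# Back-of-the-envelope shared memory estimate for a tiled matmul-style kernel.
# Assumption (illustrative only): each pipeline stage buffers one A tile and one B tile.

SHARED_LIMIT_BYTES = 48 * 1024  # per-block limit cited above for most GPUs


def estimate_shared_bytes(block_m, block_n, block_k, dtype_bytes=2, num_stages=3):
    """Return the estimated bytes of shared memory staged per block."""
    a_tile = block_m * block_k * dtype_bytes
    b_tile = block_k * block_n * dtype_bytes
    return (a_tile + b_tile) * num_stages


def fits_in_shared(block_m, block_n, block_k, **kwargs):
    """True when the estimate stays within SHARED_LIMIT_BYTES."""
    return estimate_shared_bytes(block_m, block_n, block_k, **kwargs) <= SHARED_LIMIT_BYTES


if __name__ == "__main__":
    for tile in [(128, 128, 64), (64, 64, 32)]:
        print(tile, estimate_shared_bytes(*tile), fits_in_shared(*tile))
```

Under these assumptions a 128x128x64 tile in a 2-byte dtype with three stages needs about 96 KB, twice the limit, while a 64x64x32 tile needs 24 KB. The real footprint depends on what the compiler decides to stage, so treat the estimate as a screening tool only.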

## Version Compatibility

| Version | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| Triton 2.3.0 | active | — | — |
| CUDA 12.5 | active | — | — |
| LLVM 18.1.0 | active | — | — |
| PyTorch 2.5.0 | active | — | — |

## Workarounds

1. **Reduce shared memory usage in the Triton kernel by decreasing the block size or staging fewer buffers in shared memory.** For example, change the `tl.constexpr` `BLOCK_SIZE` from 128 to 64. Shared memory is a per-block resource, so size buffers for the block as a whole rather than per thread. (80% success)
2. **Set the environment variable `TRITON_MAX_SHARED_MEMORY` to a lower value (e.g., 32768 bytes) before running the script, to force Triton to generate kernels within limits.** Confirm that your Triton version honors this variable before relying on it. (75% success)
   ```
   export TRITON_MAX_SHARED_MEMORY=32768
   ```
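
The block-size reduction in workaround 1 can also be automated as a fallback loop. The sketch below is a minimal illustration, assuming the failure surfaces as a Python `RuntimeError` as in the title of this entry; if LLVM aborts the process outright, no `try`/`except` can catch it. `launch_with_fallback` and `fake_launch` are hypothetical helpers, and `fake_launch` only simulates a kernel that fails above a block size of 64.

```python
def launch_with_fallback(launch, block_sizes=(128, 64, 32)):
    """Try each block size in order; return (block_size, result) for the first success."""
    last_error = None
    for block_size in block_sizes:
        try:
            return block_size, launch(block_size)
        except RuntimeError as err:
            # Remember the failure and retry with the next, smaller block size.
            last_error = err
    raise RuntimeError(f"no block size in {block_sizes} compiled") from last_error


def fake_launch(block_size):
    """Hypothetical stand-in for a real kernel launch; fails above 64."""
    if block_size > 64:
        raise RuntimeError("Triton compilation failed: LLVM ERROR: out of memory")
    return f"ok@{block_size}"


if __name__ == "__main__":
    print(launch_with_fallback(fake_launch))
```

In real code, `launch` would wrap the kernel call, for example `lambda bs: my_kernel[grid](x, y, n, BLOCK_SIZE=bs)`, where `my_kernel`, `grid`, and the arguments are placeholders for your own kernel.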

## Dead Ends

- **Adding more system RAM**: The error is not about total system memory but about LLVM's internal allocation limits during compilation; more RAM does not help if the kernel design is flawed. (90% fail)
- **Changing Triton cache settings**: Caching is unrelated to compilation memory; it only affects reuse of compiled kernels. (95% fail)
