# RuntimeError: CUDA error: peer access is not supported between these two devices (cudaErrorPeerAccessUnsupported)

- **ID:** `cuda/peer-access-unsupported-by-hardware`
- **Domain:** cuda
- **Category:** runtime_error
- **Error Code:** `cudaErrorPeerAccessUnsupported`
- **Verification:** ai_generated
- **Fix Rate:** 70%

## Root Cause

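The two GPUs cannot access each other's memory directly, so any operation that requires peer-to-peer (P2P) access between them fails. The usual cause is hardware topology (for example, GPUs that sit under different PCIe switches or root complexes), or P2P being disabled at the driver level.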
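
To confirm which device pairs are affected, the check below is a minimal sketch that assumes PyTorch with CUDA is installed. `torch.cuda.can_device_access_peer` is an existing PyTorch call; the helper names `unsupported_pairs` and `report_from_torch` are hypothetical and exist only for this illustration.

```python
from typing import Callable, List, Tuple


def unsupported_pairs(
    device_count: int, can_access: Callable[[int, int], bool]
) -> List[Tuple[int, int]]:
    """Return every ordered (src, dst) pair where src cannot access dst as a peer."""
    return [
        (src, dst)
        for src in range(device_count)
        for dst in range(device_count)
        if src != dst and not can_access(src, dst)
    ]


def report_from_torch() -> None:
    """Print the device pairs without P2P support on this machine."""
    # Imported lazily so unsupported_pairs stays usable without PyTorch.
    import torch

    count = torch.cuda.device_count()
    pairs = unsupported_pairs(count, torch.cuda.can_device_access_peer)
    if not pairs:
        print(f"All {count} visible GPUs report peer access to each other.")
    for src, dst in pairs:
        print(f"GPU {src} cannot access GPU {dst} as a peer")
```

Calling `report_from_torch()` on the affected machine lists the pairs that will trigger this error. Running `nvidia-smi topo -m` shows the interconnect topology behind those results.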

## Version Compatibility

| Version | Status | Introduced | Deprecated |
|---------|--------|------------|------------|
| CUDA 11.0 | active | — | — |
| CUDA 12.0 | active | — | — |
| CUDA 12.3 | active | — | — |

## Workarounds

1. **Disable NCCL's peer-to-peer transport by setting `NCCL_P2P_DISABLE=1` before the script starts. In PyTorch DistributedDataParallel code, set `os.environ['NCCL_P2P_DISABLE'] = '1'` before the process group is initialized. NCCL then falls back to shared-memory or network communication instead of P2P.** (75% success)
   ```
   # train.py is a placeholder for your own script
   NCCL_P2P_DISABLE=1 python train.py
   ```
2. **Run one process per GPU (for example, via `torch.multiprocessing`), call `torch.cuda.set_device(rank)` in each process, and communicate through `torch.distributed` rather than direct cross-device access. Setting `NCCL_SHM_DISABLE=1` also turns off NCCL's shared-memory transport and pushes traffic onto the network transport, which is slower, so add it only if shared memory is failing as well.** (70% success)
   ```
   # train.py is a placeholder for your own script
   torchrun --nproc_per_node=gpu train.py
   ```
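
The two workarounds can be combined in one launcher. The following is a minimal sketch, assuming a single node, PyTorch built with NCCL, and a free port; the names `nccl_fallback_env`, `worker`, and `launch`, along with the address `127.0.0.1` and port `29500`, are assumptions made for this example and not part of any library.

```python
import os
from typing import Dict


def nccl_fallback_env(
    master_addr: str = "127.0.0.1", master_port: int = 29500
) -> Dict[str, str]:
    """Environment for a single-node run with NCCL's P2P transport turned off."""
    return {
        "MASTER_ADDR": master_addr,       # assumed rendezvous address
        "MASTER_PORT": str(master_port),  # assumed free port
        "NCCL_P2P_DISABLE": "1",
    }


def worker(rank: int, world_size: int) -> None:
    """Entry point for one process, bound to GPU `rank`."""
    # Imported lazily so nccl_fallback_env stays usable without PyTorch.
    import torch
    import torch.distributed as dist

    # Set the variables before init_process_group so NCCL reads them
    # when it initializes.
    os.environ.update(nccl_fallback_env())
    torch.cuda.set_device(rank)
    dist.init_process_group(backend="nccl", rank=rank, world_size=world_size)

    # Smoke test: every rank contributes 1, so the sum equals world_size.
    value = torch.ones(1, device=f"cuda:{rank}")
    dist.all_reduce(value)
    print(f"rank {rank}: all_reduce -> {value.item()}")

    dist.destroy_process_group()


def launch() -> None:
    """Spawn one worker process per visible GPU."""
    import torch
    import torch.multiprocessing as mp

    world_size = torch.cuda.device_count()
    mp.spawn(worker, args=(world_size,), nprocs=world_size)
```

Call `launch()` from under an `if __name__ == "__main__":` guard, since `mp.spawn` starts fresh interpreter processes that re-import the module. Adding `NCCL_DEBUG=INFO` to the environment makes NCCL log its setup, which helps confirm that P2P is no longer in use.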

## Dead Ends

- **Forcing P2P on through software flags:** Software flags cannot override a hardware limitation; the call still fails. (90% fail)
- **Rebooting the system:** A reboot does not change the GPU topology; if the hardware does not support P2P, it stays unsupported. (80% fail)
