GID
tensorflow
config_error
ai_generated
true
InternalError: CUDA_ERROR_INVALID_DEVICE: invalid device ordinal
ID: tensorflow/gpu-visible-devices-invalid-id
90% Fix Rate
85% Confidence
1 Evidence
2023-05-10 First Seen
Version Compatibility
| Version | Status | Introduced | Deprecated | Notes |
|---|---|---|---|---|
| tensorflow 2.12 | active | — | — | — |
| tensorflow 2.13 | active | — | — | — |
| tensorflow 2.14 | active | — | — | — |
| cuda 11.8 | active | — | — | — |
| cuda 12.0 | active | — | — | — |
Root Cause
The CUDA_VISIBLE_DEVICES environment variable references a GPU index that does not exist on the system.
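To make the root cause concrete, here is a minimal sketch that checks a CUDA_VISIBLE_DEVICES value against a known GPU count. The function name `invalid_ordinals` and the `gpu_count` parameter are hypothetical, introduced only for illustration; the count itself would come from a tool such as nvidia-smi. UUID-style entries are skipped rather than validated.

```python
def invalid_ordinals(visible_devices: str, gpu_count: int) -> list[str]:
    """Return the entries of a CUDA_VISIBLE_DEVICES value that are not
    valid integer indices for a machine with `gpu_count` GPUs.

    Hypothetical helper for illustration; not part of TensorFlow or CUDA.
    """
    bad = []
    for entry in visible_devices.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # UUID-style entries (e.g. "GPU-...") are accepted by CUDA but are
        # not checked by this sketch.
        if entry.startswith(("GPU-", "MIG-")):
            continue
        # Valid indices are 0 .. gpu_count - 1; anything else (including
        # negative or non-numeric entries) is reported.
        if not entry.isdigit() or int(entry) >= gpu_count:
            bad.append(entry)
    return bad


# Example: a machine with 2 GPUs and a value that names 4 of them.
print(invalid_ordinals("0,1,2,3", gpu_count=2))
```

An empty result means every integer index in the value refers to an existing GPU.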
Official Documentation
https://www.tensorflow.org/guide/gpu#limiting_gpu_memory_growth
Workarounds
- 95% success: List the available GPUs with nvidia-smi, then set CUDA_VISIBLE_DEVICES to a valid index. For example: export CUDA_VISIBLE_DEVICES=0 (if only one GPU exists). In Python, set the variable before importing TensorFlow: import os; os.environ['CUDA_VISIBLE_DEVICES'] = '0'; import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))
- 85% success: Remove CUDA_VISIBLE_DEVICES entirely to let TensorFlow auto-detect all GPUs: unset CUDA_VISIBLE_DEVICES
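Both workarounds can be applied from Python, as sketched below. The helper names `pin_gpu`, `clear_gpu_pin`, and `visible_gpus` are hypothetical. The sketch assumes the variable is changed before TensorFlow is first imported in the process, which is why the import is deferred into a function.

```python
import os


def pin_gpu(index: int, env=None) -> str:
    """Workaround 1: restrict the process to one GPU index.

    Hypothetical helper; call it before TensorFlow is imported.
    """
    env = os.environ if env is None else env
    if index < 0:
        raise ValueError("GPU index must be non-negative")
    env["CUDA_VISIBLE_DEVICES"] = str(index)
    return env["CUDA_VISIBLE_DEVICES"]


def clear_gpu_pin(env=None) -> None:
    """Workaround 2: remove the variable so all GPUs can be auto-detected."""
    env = os.environ if env is None else env
    env.pop("CUDA_VISIBLE_DEVICES", None)


def visible_gpus():
    """List the GPUs TensorFlow can see (requires TensorFlow to be installed)."""
    # Imported lazily so the environment variable is already in place.
    import tensorflow as tf
    return tf.config.list_physical_devices("GPU")


# Typical use at the top of a script:
#   pin_gpu(0)             # or clear_gpu_pin()
#   print(visible_gpus())
```

The `env` parameter exists only so the helpers can be exercised against a plain dictionary instead of the real process environment.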
Dead Ends
Common approaches that don't work:
- Reinstalling CUDA drivers (95% fail): The issue is not driver installation but environment variable misconfiguration; reinstalling drivers does not fix the ordinal mapping.
- Setting CUDA_VISIBLE_DEVICES to all GPUs (e.g., '0,1,2,3') blindly (70% fail): If the system has fewer GPUs than specified, the error persists; the correct approach is to query available devices first.
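Querying the available devices first can be scripted. The sketch below is one way to do it, assuming nvidia-smi is on the PATH; the function names `parse_gpu_indices` and `query_gpu_indices` are hypothetical, and the sample text in the usage lines is illustrative rather than real output.

```python
import subprocess


def parse_gpu_indices(csv_output: str) -> list[int]:
    """Extract GPU indices from `index,name` CSV lines (no header)."""
    indices = []
    for line in csv_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # The first comma-separated field is the GPU index.
        first_field = line.split(",", 1)[0].strip()
        indices.append(int(first_field))
    return indices


def query_gpu_indices() -> list[int]:
    """Ask nvidia-smi for the GPU indices present on this machine."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,  # raise if nvidia-smi exits with an error
    )
    return parse_gpu_indices(result.stdout)


# Illustrative input only; real device names will differ.
sample = "0, Example GPU A\n1, Example GPU B\n"
print(parse_gpu_indices(sample))
```

The number of indices returned bounds the values that are safe to put in CUDA_VISIBLE_DEVICES. Note that nvidia-smi orders devices by PCI bus ID, while CUDA's default ordering may differ unless CUDA_DEVICE_ORDER=PCI_BUS_ID is set, so the same index may not name the same physical card in both.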