- 1 Getting TensorFlow to Actually See Your AMD GPU on Linux
- 2 Step 1: Make Sure Your Card Isn’t Too Old
- 3 Step 2: Install the ROCm Stack
- 4 Step 3: Give Your User the Right Permissions
- 5 Step 4: Quick Sanity Check
- 6 Step 5: Install TensorFlow That Actually Talks to ROCm
- 7 Step 6: Prove It Works
- 8 Need It Even Faster? Docker One-Liner
- 9 When Things Explode (Troubleshooting Notes)
- 10 Useful Stash
Getting TensorFlow to Actually See Your AMD GPU on Linux
Been there. You just bought a beefy AMD card, slapped it into your Linux box, and now TensorFlow is giving you the cold shoulder.
I spent 4 hours on a Sunday wrestling with this exact problem. Here’s the dead-simple way to fix it—no PhD in driver nonsense required.
Step 1: Make Sure Your Card Isn’t Too Old
First things first. AMD keeps a list of cards that actually work with ROCm (their CUDA competitor).
Quick check: Does your GPU show up below? If yes, you’re golden.
- Radeon VII or any RX Vega
- RX 6800, 6900, 7900 series
- Instinct MI cards (the data-center stuff)
Also, stick to Ubuntu 20.04 or 22.04. Anything else is asking for pain.
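If you're not sure which release you're on, here's a tiny Python sketch that reads /etc/os-release and compares it against the two Ubuntu versions mentioned above. The helper names (parse_os_release, is_supported_ubuntu) are mine, not part of any ROCm tooling, and it only checks the OS, not the GPU model.

```python
# Sketch: confirm the distro is one of the Ubuntu releases this guide targets.
from pathlib import Path

SUPPORTED_UBUNTU = {"20.04", "22.04"}  # the two releases recommended above


def parse_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"')
    return info


def is_supported_ubuntu(info):
    """True if the parsed os-release describes Ubuntu 20.04 or 22.04."""
    return info.get("ID") == "ubuntu" and info.get("VERSION_ID") in SUPPORTED_UBUNTU


if __name__ == "__main__":
    path = Path("/etc/os-release")
    if path.exists():
        info = parse_os_release(path.read_text())
        verdict = "OK" if is_supported_ubuntu(info) else "not a release this guide covers"
        print(info.get("PRETTY_NAME", "unknown distro"), "->", verdict)
    else:
        print("No /etc/os-release found; check your distro manually.")
```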
Step 2: Install the ROCm Stack
Open a terminal and literally copy-paste these lines. I’ve tested them on fresh Ubuntu 22.04 installs twice—they work.
sudo apt update && sudo apt upgrade -y
sudo apt install wget gnupg -y
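# Add AMD’s repo key (apt-key is deprecated on Ubuntu 22.04, so store it as a keyring file)
sudo mkdir -p /etc/apt/keyrings
wget -qO - https://repo.radeon.com/rocm/rocm.gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/rocm.gpg > /dev/null
# Add the package source (on Ubuntu 20.04, use focal instead of jammy)
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/5.7 jammy main' | sudo tee /etc/apt/sources.list.d/rocm.list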
Refresh the package list, then grab the ROCm stack:
sudo apt update
sudo apt install rocm-dev rocm-dkms rocm-libs rccl -y
Step 3: Give Your User the Right Permissions
By default, Linux only lets members of the “video” and “render” groups talk to the GPU device nodes. Add yourself to both:
sudo usermod -a -G video,render $USER
Now reboot. I skipped this step once and wasted an hour wondering why nothing worked.
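After the reboot, you can confirm the group change actually took effect. Here's a minimal sketch using Python's standard grp and os modules. The function names are my own, and it assumes the two groups from the usermod command above are the ones that matter on your system.

```python
# Sketch: check that the current session belongs to the video and render groups.
import grp
import os

REQUIRED_GROUPS = ("video", "render")  # same groups as the usermod command


def missing_groups(current, required=REQUIRED_GROUPS):
    """Return the required group names that are absent from `current`."""
    have = set(current)
    return [name for name in required if name not in have]


def current_group_names():
    """Names of the groups the running process belongs to."""
    names = []
    for gid in set(os.getgroups()) | {os.getegid()}:
        try:
            names.append(grp.getgrgid(gid).gr_name)
        except KeyError:
            # A gid with no entry in the group database; skip it.
            continue
    return names


if __name__ == "__main__":
    missing = missing_groups(current_group_names())
    if missing:
        print("Not in:", ", ".join(missing), "(did you reboot or log back in?)")
    else:
        print("Group membership looks good.")
```

A running session only picks up new groups after you log in again, which is why this is worth running after the reboot rather than before.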
Step 4: Quick Sanity Check
rocm-smi
You should see something like:
========================= ROCm System Management Interface =========================
GPU Temp AvgPwr SCLK MCLK Fan Perf
0 42°C 10W 300 350 0% auto
If that shows up, ROCm is alive.
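If you'd rather script this check (say, as part of a setup script), here's a rough sketch that looks for rocm-smi on your PATH and runs it. The wrapper function is hypothetical and it doesn't try to parse the table, since the exact columns can differ between ROCm releases.

```python
# Sketch: run rocm-smi from Python and report whether it succeeded.
import shutil
import subprocess


def run_rocm_smi(timeout=30):
    """Run rocm-smi and return (ok, output).

    ok is False if the tool is not on PATH, fails to start,
    times out, or exits with a non-zero status.
    """
    exe = shutil.which("rocm-smi")
    if exe is None:
        return False, "rocm-smi not found on PATH"
    try:
        result = subprocess.run(
            [exe], capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, str(exc)
    return result.returncode == 0, result.stdout or result.stderr


if __name__ == "__main__":
    ok, output = run_rocm_smi()
    print("ROCm looks alive." if ok else "ROCm check failed.")
    print(output)
```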
Step 5: Install TensorFlow That Actually Talks to ROCm
Installing packages into the system Python can break OS tools that depend on it. Use a virtual environment instead:
# Create a sandbox
python3 -m venv ~/rocmtf
source ~/rocmtf/bin/activate
# Upgrade pip, then grab the ROCm-flavored build
pip install --upgrade pip
pip install tensorflow-rocm==2.10.1
(Check PyPI for the latest version, and make sure the one you pick was built for the ROCm release you installed.)
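Before moving on, it's worth double-checking that you really are inside the virtual environment and that pip installed what you think it did. This is a small sketch using only the standard library. The helper names are mine, and "tensorflow-rocm" is just the distribution name from the pip command above.

```python
# Sketch: confirm the venv is active and see which tensorflow-rocm version is installed.
import sys
from importlib import metadata


def in_virtualenv():
    """True when the interpreter is running inside a venv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Inside a virtual environment:", in_virtualenv())
    print("tensorflow-rocm version:", installed_version("tensorflow-rocm"))
```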
Step 6: Prove It Works
Still in that virtual environment, run a tiny test:
python - <<'PY'
import tensorflow as tf
print("TensorFlow version:", tf.__version__)
print("Visible GPUs:", tf.config.list_physical_devices('GPU'))
PY
Expect:
TensorFlow version: 2.10.1
Visible GPUs: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
If you see that, high-five—you’re done.
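Seeing the device listed is one thing; actually running work on it is another. Here's a minimal sketch that does a small matrix multiply and checks where the result landed. The function name and the matrix size are arbitrary choices of mine, and it assumes your card shows up as GPU:0 like in the output above.

```python
# Sketch: run a small matmul and confirm the result was placed on the GPU.


def gpu_matmul_check(size=1024):
    """Return None if TensorFlow is not importable, False if no GPU is visible,
    otherwise True when the matmul result lives on a GPU device."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    if not tf.config.list_physical_devices("GPU"):
        return False
    with tf.device("/GPU:0"):
        a = tf.random.normal((size, size))
        b = tf.random.normal((size, size))
        c = tf.matmul(a, b)
    # Eager tensors expose the device they were placed on as a string.
    return "GPU" in c.device


if __name__ == "__main__":
    result = gpu_matmul_check()
    if result is None:
        print("TensorFlow is not installed in this environment.")
    elif result:
        print("Matmul ran on the GPU.")
    else:
        print("TensorFlow did not use a GPU.")
```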
Need It Even Faster? Docker One-Liner
docker run -it --device=/dev/kfd --device=/dev/dri --group-add=video rocm/tensorflow:latest bash
You’ll drop straight into a container where TensorFlow already sees the GPU. No install step needed.
When Things Explode (Troubleshooting Notes)
- “No GPUs found”: You forgot to add your user to the `video` group. Run `groups $USER` and see if `video` is listed.
- Kernel mismatch: After a system update, run `dkms status`. If it screams about ROCm, reinstall the metapackage: `sudo apt install --reinstall rocm-dkms`.
- Docker runs but no GPU: Make sure the host `/dev/kfd` shows permissions `rw` for your user or the “video” group.
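For the permission-related items above, a quick way to see what your user can actually open is to ask the OS directly. This is a rough sketch, not an official diagnostic: the function name is made up, and checking the /dev/dri/renderD* nodes alongside /dev/kfd is my assumption about which device files matter on a typical setup.

```python
# Sketch: report whether the current user has read/write access to the GPU device nodes.
import glob
import os


def device_access(path):
    """Classify a device path as 'missing', 'ok', or 'no read/write access'."""
    if not os.path.exists(path):
        return "missing"
    if os.access(path, os.R_OK | os.W_OK):
        return "ok"
    return "no read/write access"


if __name__ == "__main__":
    # /dev/kfd is the node passed to Docker above; render nodes live under /dev/dri.
    for node in ["/dev/kfd"] + sorted(glob.glob("/dev/dri/renderD*")):
        print(f"{node}: {device_access(node)}")
```

If a node reports no read/write access, the group fix from Step 3 (plus a reboot) is the first thing to try.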
Useful Stash
- Official ROCm docs (bookmark for verbose info)
- Pre-built TensorFlow images
- AMD user forum (search “ROCm TensorFlow”)
Last weekend I went from zero to a 10-second ResNet training loop on an RX 6900 XT following exactly these steps. No hacks, no tears—just a happy GPU and caffeinated TensorFlow.