Setting Up TensorFlow on Linux with AMD GPUs (ROCm Guide)

By Noman Mohammad

Getting TensorFlow to Actually See Your AMD GPU on Linux

Been there. You just bought a beefy AMD card, slapped it into your Linux box, and now TensorFlow is giving you the cold shoulder.

I spent 4 hours on a Sunday wrestling with this exact problem. Here’s the dead-simple way to fix it—no PhD in driver nonsense required.

Step 1: Make Sure Your Card Isn’t Too Old

First things first. AMD keeps a list of cards that actually work with ROCm (their CUDA competitor).

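Quick check: Does your GPU show up below? If yes, you’re probably golden. AMD’s supported-hardware list changes with each ROCm release, so confirm against the list for the version you install.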

  • Radeon VII or any RX Vega
  • RX 6800, 6900, 7900 series
  • Instinct MI cards (the data-center stuff)

Also, stick to Ubuntu 20.04 or 22.04. Anything else is asking for pain.
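Not sure which release you're on? A few lines of Python can read /etc/os-release and compare it with the two Ubuntu versions above. This is only a minimal sketch: the helper names `parse_os_release` and `is_recommended_ubuntu` are made up for illustration, and the check covers nothing beyond the distro ID and version.

```python
from pathlib import Path

RECOMMENDED = {"20.04", "22.04"}  # the Ubuntu releases this guide sticks to


def parse_os_release(text):
    """Turn the KEY=value lines of an os-release file into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"').strip("'")
    return info


def is_recommended_ubuntu(info):
    """True when the distro is Ubuntu and the version is one this guide covers."""
    return info.get("ID") == "ubuntu" and info.get("VERSION_ID") in RECOMMENDED


os_release = Path("/etc/os-release")
if os_release.exists():
    info = parse_os_release(os_release.read_text())
    verdict = "recommended" if is_recommended_ubuntu(info) else "not covered by this guide"
    print(info.get("PRETTY_NAME", "unknown distro"), "->", verdict)
```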

Step 2: Install the ROCm Stack

Open a terminal and literally copy-paste these lines. I’ve tested them on fresh Ubuntu 22.04 installs twice—they work.

sudo apt update && sudo apt upgrade -y
sudo apt install wget gnupg -y

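# Add AMD’s repo key (apt-key is deprecated on Ubuntu 22.04, so store it as a keyring file)
sudo mkdir -p --mode=0755 /etc/apt/keyrings
wget -qO - https://repo.radeon.com/rocm/rocm.gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/rocm.gpg > /dev/null

# Add the package source (on Ubuntu 20.04, replace jammy with focal)
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/5.7 jammy main' | sudo tee /etc/apt/sources.list.d/rocm.list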

Refresh the package list, then grab the ROCm stack:

sudo apt update
sudo apt install rocm-dev rocm-dkms rocm-libs rccl -y

Step 3: Give Your User the Right Permissions

Linux restricts the GPU device nodes (/dev/kfd and /dev/dri) to members of the video and render groups. Add yourself to both:

sudo usermod -a -G video,render $USER

Now reboot so the new group membership takes effect and the kernel driver loads. I skipped this step once and wasted an hour wondering why nothing worked.
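After the reboot, you can confirm that your session actually picked up both groups. The sketch below inspects the groups of the running process, not just what is written in /etc/group, so it catches the "added but never logged back in" case. The function names are hypothetical helpers, not part of any ROCm tooling.

```python
import grp
import os

REQUIRED_GROUPS = ("video", "render")  # the groups from the usermod command


def current_group_names():
    """Names of the groups the running process belongs to."""
    names = set()
    for gid in set(os.getgroups()) | {os.getegid()}:
        try:
            names.add(grp.getgrgid(gid).gr_name)
        except KeyError:
            # GID with no entry in the group database; skip it
            continue
    return names


def missing_groups(required, current):
    """Return the required groups that are absent from `current`, in order."""
    return [name for name in required if name not in current]


missing = missing_groups(REQUIRED_GROUPS, current_group_names())
print("Missing groups:", missing or "none")
```

If a group shows up as missing even though usermod succeeded, the session most likely predates the change, and logging out or rebooting should clear it.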

Step 4: Quick Sanity Check

rocm-smi

You should see something like:

========================= ROCm System Management Interface =========================
GPU  Temp   AvgPwr  SCLK  MCLK  Fan  Perf
0    42°C   10W     300   350   0%   auto

If that shows up, ROCm is alive.
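If you'd rather script this check (say, as part of a setup script), here is a rough Python wrapper. It assumes ROCm lives under /opt/rocm, the usual default prefix, in case rocm-smi isn't on your PATH. The `find_tool` and `run_tool` helpers are illustrative names only, and the script just reports the exit code and raw output without trying to parse the table.

```python
import os
import shutil
import subprocess


def find_tool(name, extra_dirs=("/opt/rocm/bin",)):
    """Locate `name` on PATH, then in a few extra directories; None if absent."""
    found = shutil.which(name)
    if found:
        return found
    for directory in extra_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def run_tool(name, timeout=30):
    """Run the tool with no arguments; return (exit_code, stdout) or None."""
    path = find_tool(name)
    if path is None:
        return None
    proc = subprocess.run([path], capture_output=True, text=True, timeout=timeout)
    return proc.returncode, proc.stdout


result = run_tool("rocm-smi")
if result is None:
    print("rocm-smi not found; is the ROCm stack installed?")
else:
    code, output = result
    print(f"rocm-smi exited with {code}")
    print(output)
```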

Step 5: Install TensorFlow That Actually Talks to ROCm

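Installing packages into the system Python can break OS tools that depend on it. Use a virtual environment instead (on Ubuntu you may need sudo apt install python3-venv first):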

# Create a sandbox
python3 -m venv ~/rocmtf
source ~/rocmtf/bin/activate

# Upgrade pip, then grab the ROCm-flavored build
pip install --upgrade pip
pip install tensorflow-rocm==2.10.1

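(Check PyPI for the latest version, and pick one built for the ROCm release you installed. Each tensorflow-rocm wheel targets specific ROCm versions.)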
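To see which build actually landed in the virtual environment, you can query the package metadata without importing TensorFlow at all. A small sketch, with `installed_version` as a made-up helper name. It also looks for the plain `tensorflow` package, because both wheels provide the same `tensorflow` import package and having the two side by side is worth noticing.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string for `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for package in ("tensorflow-rocm", "tensorflow"):
    print(f"{package}: {installed_version(package) or 'not installed'}")
```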

Step 6: Prove It Works

Still in that virtual environment, run a tiny test:

python - <<'PY'
import tensorflow as tf
print("TensorFlow version:", tf.__version__)
print("Visible GPUs:", tf.config.list_physical_devices('GPU'))
PY

Expect:

TensorFlow version: 2.10.1
Visible GPUs: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

If you see that, high-five—you’re done.
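Want a bit more proof than a device listing? Run an actual computation on the card. The sketch below multiplies two random matrices on the first GPU and reports where the result lives. `gpu_matmul_check` is a hypothetical helper, and the matrix size is arbitrary. It is a smoke test, not a benchmark.

```python
def gpu_matmul_check(size=1024):
    """Multiply two random matrices on GPU:0 and report the device used."""
    try:
        import tensorflow as tf
    except ImportError:
        return "tensorflow is not installed in this environment"

    if not tf.config.list_physical_devices("GPU"):
        return "no GPU visible to TensorFlow"

    # Pin the ops to the first GPU so a silent CPU fallback can't hide a problem
    with tf.device("/GPU:0"):
        a = tf.random.uniform((size, size))
        b = tf.random.uniform((size, size))
        c = tf.matmul(a, b)
    return f"matmul of shape {tuple(c.shape)} ran on {c.device}"


print(gpu_matmul_check())
```

The first call can take a while because the GPU kernels have to be loaded, so a slow start alone is not a sign of trouble.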

Need It Even Faster? Docker One-Liner

docker run -it --device=/dev/kfd --device=/dev/dri --group-add=video rocm/tensorflow:latest bash

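You’ll drop straight into a container where TensorFlow already sees the GPU. No TensorFlow install step is needed inside the container, but the host still needs the amdgpu kernel driver, since that is what exposes /dev/kfd.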

When Things Explode (Troubleshooting Notes)

  • “No GPUs found”: Your user is probably missing from the video or render group. Run groups $USER and check that both are listed, then log out and back in (or reboot) so the change applies.
  • Kernel mismatch: After a system update, run dkms status. If it reports the ROCm module as failed or missing for the running kernel, reinstall the metapackage: sudo apt install --reinstall rocm-dkms.
  • Docker runs but no GPU: Make sure /dev/kfd on the host is readable and writable (rw) by your user or by the video or render group.
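For the permission-related cases above, it helps to see exactly who owns the device nodes and whether your current user can open them. Here is a minimal diagnostic sketch. `describe_node` is an illustrative helper, and the /dev/dri/renderD* pattern assumes the usual naming of render nodes.

```python
import glob
import grp
import os
import stat


def describe_node(path):
    """Summarize owner group, mode, and read/write access for `path`."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return {"path": path, "exists": False}
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)  # GID without a name in the group database
    return {
        "path": path,
        "exists": True,
        "group": group,
        "mode": stat.filemode(st.st_mode),
        "read_write": os.access(path, os.R_OK | os.W_OK),
    }


# /dev/kfd is the compute interface; renderD* are the per-GPU render nodes
for node in ["/dev/kfd"] + sorted(glob.glob("/dev/dri/renderD*")):
    print(describe_node(node))
```

If `exists` is False for /dev/kfd, the kernel driver is likely not loaded. If it exists but `read_write` is False, compare the reported group with the output of groups $USER.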

Useful Stash

Last weekend I went from zero to a 10-second ResNet training loop on an RX 6900 XT following exactly these steps. No hacks, no tears—just a happy GPU and caffeinated TensorFlow.
