GPU Newbee

doneske

Well-Known Member
USA team member
Finally got hold of the Radeon Pro W5500 adapter (RDNA, Navi 14). Shut down one of the systems and physically installed the card in the 2nd PCIe x16 slot. I left the existing air-cooled Radeon HD 6670 in the machine because it has an HDMI port I could use for communicating with the system. The W5500 only has 4 DP ports, I don't have a DP cable, and my monitor doesn't have a DP port. Anyway, pressed the power button and nothing happened. Turned off the power supply, unplugged the power cord, replugged it and turned the power supply back on. Pressed the power button again and the system LEDs and lights just blinked (they should stay lit). Removed the W5500 and nothing happened when pressing the power button. I assume the power supply was blown or damaged in some way. It is an 850W power supply but is about 5 years old.

Bit the bullet, took a second identical system, removed its old VGA card (the W5500 comes with a DP to DVI cable; didn't notice that right away), installed the W5500, and the system powered up fine. Tried to install the latest AMD Pro driver and got compile errors during the DKMS install. The system recognizes the card (firmware works) and clinfo installs and works, but BOINC doesn't recognize the card (driver not working). Most likely this is because the kernel is at a higher level than the driver supports. The AMD doc says the driver is for CentOS 8.1 and I'm at 8.2; both use the 4.18 kernel, and the difference is the ABI level. This is why I don't run GPU work.

I suspect the solution is the 21.Q1 driver, but it hasn't been released yet. The doc says it was due out on 2/10/2021 but it's not there yet. The card works and I can attach a monitor to it and use it as a terminal, but that's it. Now just waiting for a new driver to be released.
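For anyone wanting to see the "same 4.18 kernel, different ABI level" point in concrete terms, here is a minimal Python sketch. The two release strings are illustrative assumptions (they are not taken from the post), and `split_kernel_release` is a hypothetical helper name. It only compares version strings; it says nothing about whether a given DKMS module will actually build.

```python
import re


def split_kernel_release(release):
    """Split a release string such as '4.18.0-193.el8.x86_64' into
    (upstream_version, build_number)."""
    match = re.match(r"^(\d+\.\d+\.\d+)-(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return match.group(1), int(match.group(2))


# Illustrative release strings (assumed for the example, not from the thread).
supported = split_kernel_release("4.18.0-147.el8.x86_64")
running = split_kernel_release("4.18.0-193.el8.x86_64")

same_upstream = supported[0] == running[0]
same_build = supported[1] == running[1]
print(f"same upstream version: {same_upstream}, same build: {same_build}")
```

On a live system, `platform.release()` from the standard library returns the running kernel's release string, which could be passed to the same helper.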
 

Vester

Well-Known Member
USA team member
That is why my GPU work is done on Windows 10. I have tried for hours (days, even) to run GPU projects on Ubuntu and Linux Mint. I tried on GhostBSD the other day without success.

Here is some background information that I used, but I am a Linux novice. I was able to get the OpenCL package using Synaptic Package Manager. I downloaded and installed amdgpu-pro-20.20-1098277-ubuntu-20.04 from AMD support. Installation was guided by reading this topic at Einstein. The post by ROBL was helpful in the installation of the drivers.

I have not been able to run amdgpu-utils yet and that will be necessary in order to run Linux Mint on my mining rig, 4HD7990. Amdgpu-utils is available in Software Manager, but that is an old version. The latest is gpu-utils-3.0.0 by Ricks-Lab. I believe my problem lies in the editing of Grub as described in the README. There is better information in the User Guide.

Good luck!
 

Vester

Well-Known Member
USA team member
This post by mdxi at WCG may help you, doneske.
you should have OpenCL 2.1 support on that RX550 card. Both cards are Polaris12 so that should be no problem. I have the RX550 myself in a PC, running latest AMD drivers for it and it has been crunching away with no problems.

Thanks for saying this; it made me look a little deeper. It's a Linux problem, and more specifically a Mesa/AMD problem. Until October of 2020 the Mesa OCL implementation was 1.1-only. It's now almost entirely 1.2 compliant... but it lacks support for 'new image types', so the driver still only advertises 1.1

If anyone else is interested, this is covered in this article and its comments. There are possibly some workarounds in the comments. Also, Mesa OCL feature support is trackable here.
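A quick way to see which OpenCL version a driver advertises is to look at the "Device Version" lines in clinfo's output. The sketch below is only a rough illustration: the sample text is made up for the example (device name and Mesa version are assumptions), and `device_opencl_versions` is a hypothetical helper, not part of clinfo or BOINC.

```python
import re


def device_opencl_versions(clinfo_text):
    """Return a list of (major, minor) tuples, one per 'Device Version' line."""
    versions = []
    for line in clinfo_text.splitlines():
        match = re.match(r"\s*Device Version\s+OpenCL (\d+)\.(\d+)", line)
        if match:
            versions.append((int(match.group(1)), int(match.group(2))))
    return versions


# Made-up sample in the general shape of clinfo output (an assumption).
sample = """\
  Device Name                                     Example GPU
  Device Version                                  OpenCL 1.1 Mesa 20.2.6
"""

found = device_opencl_versions(sample)
needs = (1, 2)  # minimum version a project might require (assumed)
print(found, "meets minimum:", any(v >= needs for v in found))
```

To try it against a real system, the output of running `clinfo` could be captured with `subprocess.run(["clinfo"], capture_output=True, text=True).stdout` and passed in place of `sample`.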
 

Nick Name

Administrator
USA team member
Trying to install Nvidia drivers on Linux is bad enough, although it's a lot better than it was. AMD is even worse. The average home user will never tolerate stuff like editing GRUB; they'll stick with Windows.
 

doneske

Well-Known Member
USA team member
The problem is a mixture of the card and the driver code. The card is an enterprise-class card and requires the Radeon Pro Enterprise driver for support. Enterprises usually don't update their systems very often, so the driver doesn't get updated as often to support later kernel releases. The available driver only supports CentOS 8.1 or Ubuntu 18.04.4 LTS (which is old; 21.04 is coming out next month). What I have decided to do is revert the system to Ubuntu 18.04.4 LTS and try to reinstall the driver. I read something recently that AMD is announcing a mid-range card next week and that they have indicated they will have a supply of cards available on a weekly basis going forward. Hopefully, I'll be able to get an RX 6800 XT soon. If I can get a recent card, I may try to use strictly open source drivers (AMDGPU and ROCm) going forward. It will simplify things to only use the Ubuntu repositories for updates and not have to download drivers from AMD. AMD has gotten better about getting code out to support their cards closer to release dates. I saw they have already added some PCI IDs to the kernel for future unnamed cards.
 

doneske

Well-Known Member
USA team member
Reverting to 18.04.4 didn't work. The 20.Q3 driver for the W5500 wouldn't compile with the 4.15 kernel installed with 18.04.4 LTS. There was a note indicating that with the 4.15 kernel one should use the Radeon Pro Software Adrenalin for Linux at the 18.2 level. Couldn't find the Adrenalin version for Linux anywhere on the AMD site. Just on a lark, decided to install Ubuntu 20.04.2 LTS and try the ROCm code. Added the ROCm repositories to Ubuntu and installed the rocm-dkms code. Worked like a charm. Installed the BOINC client and it recognized the card and OpenCL version at startup. Now trying to find a project that works on the card. Tried MilkyWay and it failed with a CL_OUT_OF_HOST_MEMORY error. I'll try a PrimeGrid project and see if any of them work.
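Task logs sometimes show only a numeric OpenCL status instead of the name, so a small lookup can help when comparing failures across projects. The values below are the standard status codes from the OpenCL headers (CL_OUT_OF_HOST_MEMORY is -6); the `describe` helper is a hypothetical name, and only the first few codes are listed.

```python
# Subset of OpenCL status codes as defined in the Khronos cl.h header.
OPENCL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}


def describe(code):
    """Translate a numeric OpenCL status into its symbolic name."""
    return OPENCL_ERRORS.get(code, f"unknown OpenCL error {code}")


print(describe(-6))
```

Note that the name only says the host-side allocation failed from the runtime's point of view; it does not by itself show whether the cause is the application, the driver stack, or the machine.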
 

Nick Name

Administrator
USA team member
I don't have any good ideas but note that I had that same error on Minecraft on my Win8 machine. In my case MilkyWay is the only project that works on my Radeon VII; everything else fails with various errors. Maybe you'll find the opposite condition, all other projects work. :LOL: There's something specific to this machine, either the Win8 or the AMD / Nvidia combo or both, that causes problems for AMD apps. I've been wanting to upgrade for a while but am waiting for some sanity to return to the computer parts market. I recently saw RX 570 cards selling for $350 at MicroCenter. Absolutely insane!
 

doneske

Well-Known Member
USA team member
Since Minecraft and PrimeGrid (sieve and genefer) work fine, I'm happy with the install. I just needed to see something work successfully on the ROCm stack. It sure is nice not having to mess around with the packaged drivers. The next test will be the 21.04 release of Ubuntu and how quickly ROCm can support it.
 

Nick Name

Administrator
USA team member
Awesome. I will say that if you can get through the pain of a Linux installation and get it working, it works very well.
 

doneske

Well-Known Member
USA team member
Another little nugget I found just browsing around in the /opt/rocm/bin directory. The rocm-smi command gives me the GPU temp, power draw, and utilization percent (among other things). Entering: watch -n 2 /opt/rocm/bin/rocm-smi will give the stats every 2 seconds until I terminate the command with Ctrl-C.
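If `watch` isn't available, or the readings need to be logged from a script, the same idea can be sketched in Python. This is only a rough equivalent of the command above: `poll` is a hypothetical helper, the path is the one from the post, and it simply prints whatever the command writes to standard output without parsing it.

```python
import subprocess
import time

ROCM_SMI = "/opt/rocm/bin/rocm-smi"  # path as given in the post above


def poll(command, interval_seconds=2.0, count=None):
    """Run `command` repeatedly and print its output, roughly like `watch -n`.

    With count=None the loop runs until interrupted (Ctrl-C).
    Returns the number of runs completed.
    """
    runs = 0
    while count is None or runs < count:
        result = subprocess.run(command, capture_output=True, text=True)
        print(result.stdout, end="")
        runs += 1
        # Sleep only if another run will follow.
        if count is None or runs < count:
            time.sleep(interval_seconds)
    return runs


# Example (commented out so the sketch does not loop forever when pasted):
# poll([ROCM_SMI], interval_seconds=2.0)
```

Passing `count` gives a fixed number of samples, which can be handy for a quick check before and after starting a GPU task.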
 

Nick Name

Administrator
USA team member
It seems to be missing on my system; maybe it wasn't part of the package at the time I installed it. One thing I miss versus the Nvidia system is fan control. I might look into upgrading if I can control the fans. Temps are 74 and the fans probably aren't running faster than 50%.
 

doneske

Well-Known Member
USA team member
I installed the 4.0 version of ROCm. I saw an article a while back that seemed to indicate that AMD was working on a utilities module but it wasn't quite there yet. I'll see if I can go back and find it.
 