Folding@Home WU 134XX

BeauZaux

Well-Known Member
USA team member
I've read about the problems with these important WUs and the work being done on them, but I haven't seen any workarounds. I run FAH on 4 Radeon GPU machines and get a lot of 13415s. The 3 AMD CPU machines run these WUs fine, though not for many points. But my one Intel CPU machine crashes FAH every night when these WUs hit 100%. All of them are running strictly on GPUs. I selected Alzheimer's, but still get these WUs. I guess I could try cancer or another cause, or just run PrimeGrid.
Any clues or suggestions?
 

Nick Name

Administrator
USA team member
I just looked. I wasn't aware of this problem, but most of these are failing on my RX 590 too. I'm going to try blocking that project's IP: 18.188.125.154

Some have reported that blocking the IP causes its own issues. I've already blocked another server, one that kept giving me work but wouldn't accept uploads, without any obvious problems.

Changing the project/research preference won't do anything. If you select Alzheimer's but there's no work available, the client will grab whatever it can instead. I'm not even sure that setting is respected right now, since there's a high priority on the COVID-19 work.
 

BeauZaux

Well-Known Member
USA team member
As on previous days, I uninstalled and reinstalled FAH and reconfigured the GPUs as 0/0/-1 and 1/0/-1 (not the first time). Last night the 990 completed 11761, 13415, 11761 on GPU 0 and 13415, 11760, 11760 on GPU 1 without failing. There were seven 13415 WUs across all machines last night. Maybe they fixed the WU? Fingers crossed.

Still, the 990 machine oddly shows an estimated credit of 9405 for 117XX WUs until the very end, then awards final credits in the 50-60k range.

I had a feeling that changing the preference wouldn't work, but I didn't think of blocking the server IP for a WU. If the problem returns, I'll try that.
It's been crashing not long after I hit the sack, leaving the GPUs sitting idle the rest of the night... :mad:

Thanks, Nick
:USA:
 

Nick Name

Administrator
USA team member
In the ~24 hours since applying the block, I haven't received any work from that project, and the F@H client stayed busy all day. That's good news! Hopefully those WUs get fixed soon.
 

BeauZaux

Well-Known Member
USA team member
The 990 crashed FAH again last night, and the ASUS was very slow with little credit. The Dell and ASRock are running them fine. So I've put the block in on the ASUS and the 990 for tonight. We'll see how that goes.
 

Nick Name

Administrator
USA team member
There may be more, but the reports I saw on these were from folks running the RX 570 on Win10. They're failing on my 590 under Pop!_OS too, so there must be something about that GPU architecture or driver causing the problem. I'd expect them to fail on the RX 4xx series as well, since they're basically the same cards.

[edit] I forgot to add that everything is still running smoothly after applying the block. The client tends to hang periodically, but blocking IPs doesn't seem to make it worse. [/edit]
 

BeauZaux

Well-Known Member
USA team member
Well, my 990 has a pair of RX 470s and the ASUS has one RX 470 and an HD 7970. The Dell and ASRock have single RX 470s. All the RX 470s are Sapphire cards, on Win 7.
 

Nick Name

Administrator
USA team member
It's not surprising to me that those tasks are failing, then. It's a little surprising that there aren't more comments about it on the F@H forum, but most people probably aren't aware of it.

The block is working perfectly here. I blocked it in the router; in my case there's an option for blocking in, out or both, and I did both. I haven't messed with Windows firewall in a long time, but it shouldn't be too hard. You could also use the old hosts file block. No problems at all here, so I don't think blocking servers is in itself responsible for any client problems.
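
For the Windows firewall route, here is a minimal Python sketch that only builds the `netsh advfirewall` commands for an inbound and an outbound block on one address. It doesn't run them; they would need to be pasted into an elevated command prompt. The function name `windows_block_rules` and the rule label are made up for illustration, and the IP is the one mentioned earlier in this thread.

```python
import ipaddress


def windows_block_rules(ip, label="FAH block"):
    """Return netsh commands that block traffic to and from one IP.

    Hypothetical helper: it only builds command strings and never
    touches the firewall itself.
    """
    addr = ipaddress.ip_address(ip)  # raises ValueError on a malformed address
    rules = []
    for direction in ("in", "out"):
        rules.append(
            f'netsh advfirewall firewall add rule name="{label} {addr} {direction}" '
            f"dir={direction} action=block remoteip={addr}"
        )
    return rules


if __name__ == "__main__":
    # Print the two commands so they can be reviewed before running them.
    for cmd in windows_block_rules("18.188.125.154"):
        print(cmd)
```

Giving each rule its own name makes it easy to delete just the outbound or just the inbound rule later if the block needs to be opened on one side.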
 

BeauZaux

Well-Known Member
USA team member
The block worked fine last night, but I had problems with drivers on 2 machines today and reverted to the 2019 drivers. I got the 990 remedied, but the ASUS is being stubborn.
It has a 470 and a 7970. Tomorrow I'll swap the 7970 with one of my single-470 machines.

I blocked it on the other machines too and feel bad about it, but they need to fix those WUs. They're a waste of good GPU time if they fail to complete.

Thanks for the assistance.
 

BeauZaux

Well-Known Member
USA team member
So I stopped 134XX by blocking the inbound and outbound traffic for its Work Server IP. Now I've received WU 16448, whose Collection Server IP is the same as WU 134XX's Work Server IP. So I opened the outbound side for that IP. Was that the right thing to do?

I reconfigured all my machines again (never can leave well enough alone), so I think my Estimated Credits (EC) are low due to my machines being unknown, again. Both of the WU 16448s I received are showing an EC of 10980 with an ETA of a little over 2 hours. It looks a lot like 134XX, but with a different project and manager.

We shall see, but any insight y'all can share should be interesting.

Thanks, all.
Happy Independence Day
:USA:
 

BeauZaux

Well-Known Member
USA team member
Forget all that. I guess the collection server is never contacted by the client. And the WU 16448s were good for 65170 & 64101 credits at around 3 hours each on my RX 470s.
 

BeauZaux

Well-Known Member
USA team member
One of my machines idled most of the night two nights ago trying to retrieve 134XX WUs, which I had blocked. So last night, hoping they may have fixed those WUs, I unblocked their IP. I got a 13414 for 115563 credits, a 13416 for 152041 credits, and another 13416 for 151302 credits. Not sure, but I think they were around 6 hours each on my dual RX 470 machine. Tonight I will open the firewall on the 3X RX 470 machine that crashed several times on 13415 last week. I'll let y'all know how it goes tomorrow.
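
To compare these WUs with others, the credits can be scaled to a points-per-day rate. Below is a rough Python sketch using the numbers from this post. The 6-hour run time is the approximate figure given above, not a measured value, and `points_per_day` is just an illustrative helper name.

```python
def points_per_day(credits, hours):
    """Scale the credit earned on one WU to a 24-hour rate."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return credits * 24 / hours


if __name__ == "__main__":
    # (WU, final credits, approximate run time in hours) as reported above.
    # The run times are rough guesses, so treat the rates as ballpark only.
    results = [
        ("13414", 115563, 6),
        ("13416", 152041, 6),
        ("13416", 151302, 6),
    ]
    for wu, credits, hours in results:
        print(f"WU {wu}: ~{points_per_day(credits, hours):,.0f} points/day")
```

The rate is per WU, so it reflects what one GPU slot would earn if it ran that kind of WU around the clock.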
 

BeauZaux

Well-Known Member
USA team member
The MSI 990 (3X RX 470) machine crashed (FAHcore22) on WU 13416 this morning. So I'm done folding 134XX on that machine. The IP block is going back up for it.

The ASUS 1055T (dual RX 470) had no problems, so it will continue with them and any others.
 

Nick Name

Administrator
USA team member
I had a couple of jobs come in last week that apparently came from an unblocked server but then tried to upload to one that is blocked. It seems like the work server and collection server aren't 100% specific to an IP address. The jobs wouldn't upload even after I turned the block off, so I've turned it back on, added a couple more and will let them expire.
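
One way to catch this before a WU gets stuck is to compare the servers a WU lists against the block list. This is only a sketch: `blocked_endpoints` is a hypothetical helper, 192.0.2.10 is a placeholder from the documentation address range and not a real F@H server, and the IPs for a real WU would have to be read from the client by hand.

```python
def blocked_endpoints(block_list, work_server, collection_server):
    """Report which of a WU's two servers are on the local block list."""
    blocked = set(block_list)
    return {
        "work_server": work_server in blocked,
        "collection_server": collection_server in blocked,
    }


if __name__ == "__main__":
    block_list = ["18.188.125.154"]  # the IP blocked earlier in this thread
    # Placeholder work server; the collection server reuses the blocked IP.
    status = blocked_endpoints(block_list, "192.0.2.10", "18.188.125.154")
    for role, is_blocked in status.items():
        print(f"{role}: {'BLOCKED' if is_blocked else 'ok'}")
```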
 

BeauZaux

Well-Known Member
USA team member
Argh! I let the crashed WU try to run out (over 12 hours) and it crashed again with 4 seconds left on the ETA. It wanted to start again. I don't know how else to get rid of a WU without deleting the GPU in the config, which meant I had to delete all 3 GPUs and reconfigure. That's not always a good thing.
 