WCG Beta Testing GPU Work

Nick Name

Administrator
USA team member
I got a few more tonight, but there aren't many in the queue. I realized that I had the cache set very low, as I normally do. I increased it from about 0.25 days to a full day; hopefully that will get me a few more when tasks are available.
 

Nick Name

Administrator
USA team member
I got a few more tonight, but there aren't many in the queue. I realized that I had the cache set very low, as I normally do. I increased it from about 0.25 days to a full day; hopefully that will get me a few more when tasks are available.
Worked like a charm. I now have 30+ tasks on hand, which will keep me busy for a while. (y)
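
For anyone who would rather set the cache in a file than in BOINC Manager, here is a minimal sketch of the same change. It assumes the "cache" above maps to BOINC's `work_buf_min_days` preference in `global_prefs_override.xml`; the helper name `build_cache_override` is made up for illustration, and the extra-days value of 0 is a guess, not something stated in the post.

```python
import xml.etree.ElementTree as ET


def build_cache_override(min_days, extra_days=0.0):
    """Build a global_prefs_override.xml fragment (hypothetical helper).

    min_days   -> "store at least N days of work"
    extra_days -> "store up to an additional N days of work" (assumed 0 here)
    """
    if min_days < 0 or extra_days < 0:
        raise ValueError("buffer sizes must be non-negative")
    root = ET.Element("global_preferences")
    ET.SubElement(root, "work_buf_min_days").text = f"{min_days:.2f}"
    ET.SubElement(root, "work_buf_additional_days").text = f"{extra_days:.2f}"
    return ET.tostring(root, encoding="unicode")


old_override = build_cache_override(0.25)  # roughly the previous setting
new_override = build_cache_override(1.0)   # the full-day cache described above
print(new_override)
```

The resulting text would go in the BOINC data directory as `global_prefs_override.xml`, after which the client has to re-read its local preferences. Changing the same value in BOINC Manager's computing preferences should have the same effect.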
 

Vester

Well-Known Member
USA team member
I am giving it another chance on my HD 7990s. Queue not full and work units are about two minutes. It appears that this project likes my video cards. Graphics card performance.

Edit: After a few shorter-running work units, I started getting longer ones in the 600 to 950 points per WU range.
 

Jason Jung

Well-Known Member
USA team member
We have the tech update we've been waiting for:

"...running OpenPandemics at a higher speed will cause the research team to focus the majority of their time and energy on preparing input data sets and archiving returned data rather than performing analysis of the results and moving the interesting results to the next step in the pipeline. As a result, the project will remain at its current speed for the foreseeable future."
 

supdood

Well-Known Member
USA team member
That is unfortunate. It was great to see what you all could do when the supply was plentiful.

For those with GPUs, if you aren't already doing so, you could change the project resource share so that if WCG has OPNG units available you'll get them and, if not, your GPU will be kept busy with your other projects. For WCG, this setting is under the device profile on the website in the Cross Project Settings. Set Project Weight to something higher (I think the range is between 0 and 10,000) than your other projects if you want to prioritize it. I don't think that there is anything you could do for a WCG/F@H split other than babysit it.
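
To put rough numbers on the weight setting: as far as I understand it, BOINC splits a device's time between projects in proportion to their resource shares over the long run, and when one project has no work the others simply fill in. The sketch below only does that arithmetic. The second project and its weight of 100 are assumptions for illustration (100 is the usual BOINC default), and `share_fractions` is a made-up helper, not part of any BOINC tool.

```python
def share_fractions(shares):
    """Return each project's long-run fraction of device time.

    shares maps a project name to its resource share (project weight).
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one project needs a positive share")
    return {name: value / total for name, value in shares.items()}


# Hypothetical setup: WCG raised to 10,000, one other project left at 100.
shares = {"World Community Grid": 10000, "Other project": 100}
fractions = share_fractions(shares)

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%} of GPU time when both have work")
```

With those assumed weights WCG would get about 99% of the GPU whenever OPNG work is available, and the other project would get the whole GPU whenever it is not.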
 

Vester

Well-Known Member
USA team member
The Radeon HD 7990s loved the WCG tasks. Not much was said about it, but FP64 performance is important at WCG.
 

RONNIE

Well-Known Member
USA team member
We have the tech update we've been waiting for:

"...running OpenPandemics at a higher speed will cause the research team to focus the majority of their time and energy on preparing input data sets and archiving returned data rather than performing analysis of the results and moving the interesting results to the next step in the pipeline. As a result, the project will remain at its current speed for the foreseeable future."
It seems we've waited so long that Jason Jung now has a long beard! But this is the opposite of what we were expecting: they are not very interested in our GPUs...
 

RONNIE

Well-Known Member
USA team member
That is unfortunate. It was great to see what you all could do when the supply was plentiful.

For those with GPUs, if you aren't already doing so, you could change the project resource share so that if WCG has OPNG units available you'll get them and, if not, your GPU will be kept busy with your other projects. For WCG, this setting is under the device profile on the website in the Cross Project Settings. Set Project Weight to something higher (I think the range is between 0 and 10,000) than your other projects if you want to prioritize it. I don't think that there is anything you could do for a WCG/F@H split other than babysit it.
We can also change the WCG profile and opt to receive no GPU tasks at all.
 

doneske

Well-Known Member
USA team member
This is a constant nagging problem with WCG specifically (and IBM generally). I don't know if anyone knows, but IBM acquired monitoring software long ago (circa the 1990s), and yet it would only run on Windows or Linux (not IBM's own product, AIX). As an IBM customer, I couldn't run IBM software on IBM hardware. Why? Developers couldn't get access to AIX machines. They could only get access to Windows and Linux. Go figure. The clustered filesystem is an IBM creation, and IBM has experts who have written the code for the filesystem, yet it sounds like WCG doesn't have access to those resources. Once again, go figure. My son does technical support for IBM Spectrum Scale. But after saying all that, why did the researchers spend the effort creating a GPU version of the code if they didn't have the resources to take advantage of it? In other words, why create a faster version of the program if you are only going to run it at CPU speeds anyway? Things that make you go...DUH!!!
 

Vester

Well-Known Member
USA team member
For those with GPUs, if you aren't already doing so, you could change the project resource share so that if WCG has OPNG units available you'll get them and, if not, your GPU will be kept busy with your other projects. For WCG, this setting is under the device profile on the website in the Cross Project Settings. Set Project Weight to something higher (I think the range is between 0 and 10,000) than your other projects if you want to prioritize it.
I couldn't readily find the Cross Project Settings yesterday and was too busy to spend time looking for it. I awakened at 0230 this morning and found that my computer, 4HD7990, was slow to respond. It errored on three WCG OPNG tasks. It was trying to run five tasks per GPU for both WCG and Milkyway@home at the same time. I set WCG to 10,000. Thanks for the information, doneske.

(My eldest son works in IT in Tokyo. He is a big Plan 9/Inferno fan and an acquaintance of Rob Pike.)
 

Jason Jung

Well-Known Member
USA team member
"Right now the 7 day average of batches for GPU is 727 batches/day and CPU is 185 batches/day. This is above the 500 batches/day goal. As a result, we reduced the GPU pace 2 days ago from 3000 workunits/day to 1700 workunits/day. We have been attempting to reduce the pace of the CPU work as well. Unfortunately, there is a fairly large set of users who only have OpenPandemics selected and thus we are not able to shift as much of the CPU power to our other projects as I had hoped. We will continue to adjust things where we can to support the 500/batches per day goal and to shift work to GPU as much as possible without making the CPU work intermittent." --knreed

I'm hoping that 1700 workunits/day thing is a typo. A single workstation or GPU server could handle that.
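
For scale, here is the arithmetic on the figures in that quote, as a quick sketch. Every number comes straight from knreed's post; nothing is assumed about how batches relate to workunits.

```python
# Figures quoted from the tech update (7-day averages, batches per day).
gpu_batches_per_day = 727
cpu_batches_per_day = 185
goal_batches_per_day = 500

total_batches = gpu_batches_per_day + cpu_batches_per_day
overshoot = total_batches / goal_batches_per_day

# GPU pace change quoted in the same update (workunits per day).
old_gpu_pace = 3000
new_gpu_pace = 1700
reduction = 1 - new_gpu_pace / old_gpu_pace

print(f"Combined: {total_batches} batches/day, {overshoot:.2f}x the goal")
print(f"GPU pace cut: {reduction:.1%}")
```

So the combined rate is 912 batches/day against a goal of 500, and the GPU pace was cut by a bit over 43%.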
 

Vester

Well-Known Member
USA team member
I had an OPNG task on a Radeon HD 7990 that took 55 minutes to reach 100%. I checked after five minutes to see if it had reached completion and BOINC Manager crashed. After restarting BOINC Manager, the task restarted from the beginning. I wonder what happened to the checkpoints that are burning up SSDs? (Edit: Maybe they worked. It completed the second time in under 15 minutes.)
 

Nick Name

Administrator
USA team member
This is disappointing from the user side but if the project is getting the data it needs that's the main thing. I'm still of the opinion they should scrap the CPU app so CPU resources can be put to use where they're actually needed.
 