Forums : Technical Support : No CPU time again

Peter Hucker of the Scottish Boinc Team
Joined: 5 Jul 11
Posts: 22
Credit: 934,708
RAC: 13,322
Message 22991 - Posted: 27 Mar 2022, 14:55:37 UTC

Not sure why I'm bothering to ask in here as I never get an answer, but one of my hosts is idling about: http://www.cosmologyathome.org/results.php?hostid=452344 I thought I'd sorted the nonsense with VB tasks by downgrading VB from 6 to 5, but one host is taking 14.5 minutes per task, using bugger all CPU time, then sending them back. For some reason half of those are marked as OK on this end and half as a computation error.

Peter Hucker of the Scottish Boinc Team
Joined: 5 Jul 11
Posts: 22
Credit: 934,708
RAC: 13,322
Message 22993 - Posted: 30 Mar 2022, 19:23:59 UTC - in response to Message 22991.  
Last modified: 30 Mar 2022, 19:24:48 UTC

I'll reply to myself since nobody's in here. Don't let this run on lots of cores. I have three 24-core machines, and running a task across all 24 cores on this project doesn't work. You need to limit each task to, say, 6 cores and let the machine run 4 of them at once, using an app_config.xml in the Cosmology project folder:

<app_config>
   <app_version>
       <app_name>camb_boinc2docker</app_name>
       <plan_class>vbox64_mt</plan_class>
       <avg_ncpus>6</avg_ncpus>
       <cmdline>--nthreads 6</cmdline>
   </app_version>
</app_config>

Nflight
Joined: 4 Aug 07
Posts: 7
Credit: 1,307,493
RAC: 9,236
Message 22995 - Posted: 1 Apr 2022, 12:47:23 UTC - in response to Message 22991.  

Peter, your posts are appreciated. Thank you for the knowledge you share, which can make a difference to those who don't have all the answers. You are a godsend!!

poppinfresh99
Joined: 1 Mar 22
Posts: 18
Credit: 542,434
RAC: 4,283
Message 22997 - Posted: 6 Apr 2022, 12:50:29 UTC - in response to Message 22995.  

Instead, you could probably change “Max # CPUs” on this project’s settings webpage.

Peter Hucker of the Scottish Boinc Team
Joined: 5 Jul 11
Posts: 22
Credit: 934,708
RAC: 13,322
Message 22999 - Posted: 6 Apr 2022, 21:52:25 UTC - in response to Message 22997.  

Instead, you could probably change “Max # CPUs” on this project’s settings webpage.
I wasn't sure if that was CPU cores per task or CPU cores per computer. I want to use all the cores, just run each task on 6 of them, otherwise it fails.

poppinfresh99
Joined: 1 Mar 22
Posts: 18
Credit: 542,434
RAC: 4,283
Message 23000 - Posted: 7 Apr 2022, 12:39:07 UTC - in response to Message 22999.  

I think “Max # CPUs” is for the whole project.

I also thought that <avg_ncpus> was for the whole project...
https://www.cosmologyathome.org/faq.php#limit-cpu
https://boinc.berkeley.edu/wiki/Client_configuration#Project-level_configuration
That is, for the whole application (not project), but camb_boinc2docker should be the only application we are running.

.clair.
Joined: 4 Nov 07
Posts: 626
Credit: 12,068,402
RAC: 0
Message 23002 - Posted: 8 Apr 2022, 0:16:16 UTC
Last modified: 8 Apr 2022, 0:25:36 UTC

Max # jobs - applies to the project: a limit on "Tasks in progress", possibly per computer
Max # CPUs - applies to each work unit
Each has a limit of 8 on the prefs page
It was discovered not long after camb_boinc2docker started that VB goes mental if more than 8 CPUs are used, and many tasks have errors for no good reason, unless you are lucky.

Max # jobs - 8 (options are 1 to 8, or "No Limit", which is about 300; I didn't count them)
Max # CPUs - 6 (options are 1 to 8; don't use "No Limit")

With these settings a 24-core system should have 4 "running" work units (24 cores / 6 CPUs per task) and 4 "ready to start".
And there is no need for an app_config file.

7 cpu`s takes 10<11 minit on my opteron16

Peter Hucker of the Scottish Boinc Team
Joined: 5 Jul 11
Posts: 22
Credit: 934,708
RAC: 13,322
Message 23004 - Posted: 8 Apr 2022, 7:01:42 UTC - in response to Message 23002.  
Last modified: 8 Apr 2022, 7:03:34 UTC

7 cpu`s takes 10<11 minit on my opteron16
Pah! I'm doing a job in 3 minutes on an i5-8600K, and 4 at once in 6 minutes on a Ryzen 9 3900XT. I SHALL PREVAIL!!!!

poppinfresh99
Joined: 1 Mar 22
Posts: 18
Credit: 542,434
RAC: 4,283
Message 23005 - Posted: 8 Apr 2022, 12:53:37 UTC - in response to Message 23002.  
Last modified: 8 Apr 2022, 12:58:04 UTC

Max # jobs - applies to the project: a limit on "Tasks in progress", possibly per computer
Max # CPUs - applies to each work unit


This is great to know! I tested it, and I was wrong about "Max # CPUs".

It was discovered not long after camb_boinc2docker started that VB goes mental if more than 8 CPUs are used, and many tasks have errors for no good reason, unless you are lucky.


It's great to know that "Max # CPUs" is needed to fix this and fixes this so nicely!

Is this more-than-8-CPU bug true for VB in general (that is, not just for Cosmology@home)? Maybe it's just for CERTAIN VirtualBox situations??
https://forums.virtualbox.org/viewtopic.php?f=2&t=87445&start=15

Peter Hucker of the Scottish Boinc Team
Joined: 5 Jul 11
Posts: 22
Credit: 934,708
RAC: 13,322
Message 23006 - Posted: 8 Apr 2022, 22:46:58 UTC - in response to Message 23005.  
Last modified: 8 Apr 2022, 22:47:30 UTC

Is this more-than-8-CPU bug true for VB in general (that is, not just for Cosmology@home)? Maybe it's just for CERTAIN VirtualBox situations??
https://forums.virtualbox.org/viewtopic.php?f=2&t=87445&start=15
I don't have any problems running loads of VB except:

If it's the computer I'm trying to use for web browsing etc, more than half the cores on VB grinds it to a halt - it's fine on a Boinc only machine though.

Cosmology needs to run on 6 cores for each task, not 24. But I can happily run four 6-core tasks on a 24-core machine I'm not using for anything else. Virtually no errors.

.clair.
Joined: 4 Nov 07
Posts: 626
Credit: 12,068,402
RAC: 0
Message 23007 - Posted: 9 Apr 2022, 0:22:23 UTC - in response to Message 23004.  
Last modified: 9 Apr 2022, 0:23:35 UTC

7 cpu`s takes 10<11 minit on my opteron16
Pah! I'm doing a job in 3 minutes on an i5-8600K, and 4 at once in 6 minutes on a Ryzen 9 3900XT. I SHALL PREVAIL!!!!

Yup , the opteron iz little faster than an Athlon XP3200 in single threaded crunching [benchmarks] ,
I know this coz mine only died last year crunching Einstein {I don't like chucking stuff if it still workz}
Hopteron is / waz a cheap webserver cpu in its day , but crap at math .
It waz a cheep system {ebay parts} when I got it , I soon found out why .

.clair.
Joined: 4 Nov 07
Posts: 626
Credit: 12,068,402
RAC: 0
Message 23008 - Posted: 9 Apr 2022, 0:37:01 UTC - in response to Message 23005.  

Max # jobs - applies to the project: a limit on "Tasks in progress", possibly per computer
Max # CPUs - applies to each work unit

This is great to know! I tested it, and I was wrong about "Max # CPUs".
It was discovered not long after camb_boinc2docker started that VB goes mental if more than 8 CPUs are used, and many tasks have errors for no good reason, unless you are lucky.

It's great to know that "Max # CPUs" is needed to fix this and fixes this so nicely!
Is this more-than-8-CPU bug true for VB in general (that is, not just for Cosmology@home)? Maybe it's just for CERTAIN VirtualBox situations??
https://forums.virtualbox.org/viewtopic.php?f=2&t=87445&start=15

From reading the posts on the VB forum and the linked bug tracker, that is the kind of stuff that was happening here on Cosmology when camb_boinc2docker (CB2D) started, so several tasks of up to 8 cores each ran a lot faster than one `big` one, if it ran at all.

Peter Hucker of the Scottish Boinc Team
Joined: 5 Jul 11
Posts: 22
Credit: 934,708
RAC: 13,322
Message 23009 - Posted: 9 Apr 2022, 0:55:25 UTC - in response to Message 23007.  

7 cpu`s takes 10<11 minit on my opteron16
Pah! I'm doing a job in 3 minutes on an i5-8600K, and 4 at once in 6 minutes on a Ryzen 9 3900XT. I SHALL PREVAIL!!!!

Yup , the opteron iz little faster than an Athlon XP3200 in single threaded crunching [benchmarks] ,
I know this coz mine only died last year crunching Einstein {I don't like chucking stuff if it still workz}
Hopteron is / waz a cheap webserver cpu in its day , but crap at math .
It waz a cheep system {ebay parts} when I got it , I soon found out why .
I like to keep old stuff running too, even though it may be more sensible to chuck it and buy something that uses less electricity.
I've never managed to break a CPU though, MBs occasionally expire, and GPUs expire a lot.

poppinfresh99
Joined: 1 Mar 22
Posts: 18
Credit: 542,434
RAC: 4,283
Message 23015 - Posted: 16 Apr 2022, 17:29:04 UTC - in response to Message 23000.  
Last modified: 16 Apr 2022, 17:32:41 UTC

I also thought that <avg_ncpus> was for the whole project...
https://www.cosmologyathome.org/faq.php#limit-cpu


I now see that the FAQ linked in the quote above also has
<max_concurrent>1</max_concurrent>
in addition to avg_ncpus, so what Peter said about just using avg_ncpus (without setting max_concurrent to 1) should work.
Though “Max # CPUs” on this project’s settings webpage is still easiest :)

*If* someone also wanted to set max_concurrent, I think "Max # jobs" on this project’s settings webpage would work and be easiest.
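
For reference, here is a rough sketch of what an app_config.xml combining both settings might look like. It is not copied from the FAQ: the app name, plan class and 6-thread values are taken from Peter's file earlier in the thread, and the max_concurrent value of 4 is only an assumption for a 24-core host (24 / 6 = 4).

```xml
<app_config>
   <!-- Assumed value: run at most 4 tasks of this app at once (24 cores / 6 per task) -->
   <app>
       <name>camb_boinc2docker</name>
       <max_concurrent>4</max_concurrent>
   </app>
   <!-- Per-task settings as in Peter's file: each task is budgeted 6 CPUs -->
   <app_version>
       <app_name>camb_boinc2docker</app_name>
       <plan_class>vbox64_mt</plan_class>
       <avg_ncpus>6</avg_ncpus>
       <cmdline>--nthreads 6</cmdline>
   </app_version>
</app_config>
```

After saving the file in the project folder, the client has to re-read it (Options, then "Read config files" in BOINC Manager, or restart the client) before the change takes effect.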

poppinfresh99
Joined: 1 Mar 22
Posts: 18
Credit: 542,434
RAC: 4,283
Message 23016 - Posted: 16 Apr 2022, 21:12:00 UTC - in response to Message 23015.  

*If* someone also wanted to set max_concurrent, I think "Max # jobs" on this project’s settings webpage would work and be easiest.


Though I just learned that "Max # jobs" and max_concurrent are a bit different...
- "Max # jobs" is the number of tasks that are downloaded to the client
- max_concurrent is the number of tasks that are run at a time (and maybe its infinite-download bug was fixed in the latest release of BOINC): https://github.com/BOINC/boinc/issues/4322

Sagittarius Lupus
Joined: 12 Apr 11
Posts: 2
Credit: 645,531
RAC: 322
Message 23045 - Posted: 21 May 2022, 8:26:33 UTC

Thank you for this thread. I hadn't been able to get any tasks from this project to run for literally years (with a dozen other BOINC projects on this system, some of which also use VBox) until I happened to look here.

The advice to limit the camb_boinc2docker application to 8 cores or fewer if no CPU time or output is observed is essential -- it really ought to be pinned somewhere if not made a project default.

Peter Hucker of the Scottish Boinc Team
Joined: 5 Jul 11
Posts: 22
Credit: 934,708
RAC: 13,322
Message 23046 - Posted: 21 May 2022, 9:19:48 UTC

Unfortunately this seems to be one of those projects (like Rosetta) where nobody who works there is present in the forums, so nothing will get changed. We're on our own.

Sagittarius Lupus
Joined: 12 Apr 11
Posts: 2
Credit: 645,531
RAC: 322
Message 23047 - Posted: 24 May 2022, 15:00:01 UTC

How very sad. That's many thousands of work units over the years just left on the table, which they could have had for the low, low price of paying attention.

Peter Hucker of the Scottish Boinc Team
Joined: 5 Jul 11
Posts: 22
Credit: 934,708
RAC: 13,322
Message 23048 - Posted: 24 May 2022, 15:41:02 UTC

It's possible they get enough work back and the scientists couldn't deal with any more results anyway. This is the case with GPU work on covid over at WCG (before they broke the server completely for 3 months....)