Forums : Technical Support : Longer / heavier WUs?

Brian Silvers

Joined: 11 Dec 07
Posts: 420
Credit: 270,580
RAC: 0
Message 8365 - Posted: 24 May 2009, 2:50:42 UTC - in response to Message 8363.  
Last modified: 24 May 2009, 2:51:57 UTC

My PC can't crunch Cosmology anymore. I get the message:
19/05/2009 10:31:54 Cosmology@Home Message from server: CAMB needs 476.84 MB RAM but only 460.33 MB is available for use.

Increase your Virtual Memory (Paging File size).


He could try that, but the computer is shared with Einstein and WCG and has only 512 MB of memory. The best suggestion is to upgrade to at least 1 GB of memory, preferably 2 GB, and to avoid running other projects while Cosmology is running, until this project's tasks become less demanding...
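The server message quoted above comes down to comparing what the task says it needs with what the client is allowed to use. A minimal sketch of that comparison follows. The function names are hypothetical, and the 90% limit is an assumed preference value chosen for illustration, not something taken from the project or from BOINC's source.

```python
def usable_ram_mb(total_ram_mb: float, max_percent: float) -> float:
    """RAM the client may hand to tasks, given a percentage limit (assumed)."""
    return total_ram_mb * max_percent / 100.0


def task_fits(required_mb: float, available_mb: float) -> bool:
    """True when the task's stated requirement fits in the available RAM."""
    return required_mb <= available_mb


# Figures quoted in the server message.
required = 476.84
available = 460.33

print(task_fits(required, available))   # False: the task is refused
print(round(required - available, 2))   # shortfall in MB: 16.51

# With 1 GB installed and the same assumed 90% limit, the task would fit.
print(task_fits(required, usable_ram_mb(1024, 90)))  # True
```

Under these assumptions the machine is short by only about 16 MB, which is why a memory upgrade rather than a larger paging file is the suggestion here.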
ID: 8365
jkforde
Joined: 15 Apr 09
Posts: 3
Credit: 5,420
RAC: 0
Message 8367 - Posted: 25 May 2009, 11:33:12 UTC


Just to add my piece: I've had to suspend C@H because it was making the machine sluggish. Einstein, MilkyWay and ClimatePrediction are running fine in the background, so please fix this hog!
ID: 8367
Tamaster
Joined: 20 May 09
Posts: 1
Credit: 10,920
RAC: 0
Message 8414 - Posted: 6 Jun 2009, 15:55:11 UTC

I'll keep it simple:

Since I'm new here, I pulled down a couple WU for testing and got some
pretty wide variability in execution time. I have no issues with the
claimed vs. granted, but the first one finished in 6.8Hrs compute time
and I had to kill one off to get the last two completed before deadline.

I've got a dual-core Athlon 64 overclocked (STABLE) to 3.36 GHz with 4 GB
of RAM and plenty of disk and swap and it is *_STILL_* taking over 19 hrs
of compute to complete a 17 hr WU from COSMO.

These are monthly jobs being scheduled for a 2 week completion.
Expect them to run out of time even on a fast machine.

And I'm not even touching the subject of memory footprint...

Sorry. But not on my time.
ID: 8414
Brian Silvers
Joined: 11 Dec 07
Posts: 420
Credit: 270,580
RAC: 0
Message 8415 - Posted: 6 Jun 2009, 21:59:44 UTC - in response to Message 8414.  
Last modified: 6 Jun 2009, 22:01:41 UTC

I'll keep it simple:

Since I'm new here, I pulled down a couple WU for testing and got some
pretty wide variability in execution time. I have no issues with the
claimed vs. granted, but the first one finished in 6.8Hrs compute time
and I had to kill one off to get the last two completed before deadline.

I've got a dual-core Athlon 64 overclocked (STABLE) to 3.36 GHz with 4 GB
of RAM and plenty of disk and swap and it is *_STILL_* taking over 19 hrs
of compute to complete a 17 hr WU from COSMO.

These are monthly jobs being scheduled for a 2 week completion.
Expect them to run out of time even on a fast machine.

And I'm not even touching the subject of memory footprint...

Sorry. But not on my time.


The issue is not as clear cut...

What happened is that, since you had just attached here, BOINC overestimated your system's capability to process tasks from this project. Every new project starts out with a duration correction factor of 1.0. If the initial estimate is wrong, the factor rises or falls depending on whether tasks are processed slower or faster than estimated: a task that finishes faster lowers the factor, and a task that finishes slower raises it. Once a few tasks have been processed, the BOINC CPU scheduler knows more about how to handle tasks from a project. This applies to any project you first join, not just this one.
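As a rough illustration of that learning process, here is a simplified model in Python. It is only a sketch: the smoothing rule, the `weight` constant and the function names are assumptions made for this example and are not BOINC's actual update rule. The 17-hour estimate and the 19-hour and 6.8-hour runtimes are the figures from the post being answered.

```python
def update_dcf(dcf: float, estimated_hours: float, actual_hours: float,
               weight: float = 0.5) -> float:
    """Move the correction factor toward the observed actual/estimate ratio.

    `weight` is a hypothetical smoothing constant chosen for illustration.
    """
    observed_ratio = actual_hours / estimated_hours
    return dcf + weight * (observed_ratio - dcf)


def corrected_estimate(raw_estimate_hours: float, dcf: float) -> float:
    """Runtime estimate after the correction factor has been applied."""
    return raw_estimate_hours * dcf


dcf = 1.0  # starting value for a newly attached project

# A task estimated at 17 h that actually took 19 h pushes the factor up.
dcf = update_dcf(dcf, estimated_hours=17.0, actual_hours=19.0)
print(dcf > 1.0)                             # True
print(corrected_estimate(17.0, dcf) > 17.0)  # True: later estimates get longer

# A task that finishes faster than estimated pulls the factor back down.
lower = update_dcf(dcf, estimated_hours=17.0, actual_hours=6.8)
print(lower < dcf)                           # True
```

A larger factor means each task is expected to take longer, so fewer tasks fit before the deadline and the client asks for less work.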

Another factor is that you have that specific computer attached to multiple projects. That will cause contention with the current application here because of, as you noticed, its extremely large memory requirement.

Basically, BOINC just went through a learning phase with what you asked of it. If you allow work from here again, your system should only request 1 or 2 tasks, not 4 at once like you got the other day, because the duration correction factor and the resource allocations are more properly taken into account now. Note that all of that happens on your local system. It had to learn that asking for as many as it did was too aggressive, so it will be more conservative. It's the way BOINC was designed to operate...
ID: 8415
billy ewell 1931
Joined: 5 Nov 07
Posts: 2
Credit: 252,700
RAC: 0
Message 8523 - Posted: 16 Aug 2009, 16:37:06 UTC
Last modified: 16 Aug 2009, 16:38:45 UTC

I was quite surprised to finish task 6613565 at 16 Aug. 15:55:19 UTC with BOINC Manager showing about 100.55 hours of CPU time to complete, AND then to check the results on my account and find that only 37.9 hours of CPU time was recorded and the credit given was the apparent standard 420!! Therefore, according to the results page, my dual-core 2.66 CPU earned 11.08 credits per hour, and according to BOINC Manager I earned a whopping 4.18 credits per hour. I find it interesting that 62.65 hours of actual CPU time simply vanished.
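The figures in this post can be checked with a few lines of Python. This is just the arithmetic from the paragraph above; the variable and function names are made up for the example.

```python
CREDIT = 420.0  # credit granted for the task, as reported in the post


def credits_per_hour(credit: float, cpu_hours: float) -> float:
    """Credit earned per hour of CPU time."""
    return credit / cpu_hours


results_page_hours = 37.9   # CPU time recorded on the results page
manager_hours = 100.55      # CPU time shown by BOINC Manager

print(f"{credits_per_hour(CREDIT, results_page_hours):.2f}")  # 11.08
print(f"{credits_per_hour(CREDIT, manager_hours):.2f}")       # 4.18
print(f"{manager_hours - results_page_hours:.2f}")            # 62.65
```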

The other work units I recently completed with three other computers did not encounter the same discrepancy. No big problem; just an observation.

Bill
ID: 8523
MattShizzle

Joined: 23 Aug 09
Posts: 3
Credit: 248,875
RAC: 30
Message 8576 - Posted: 7 Sep 2009, 22:49:21 UTC - in response to Message 8523.  
Last modified: 7 Sep 2009, 23:05:03 UTC

I'm having the same problem. I have a fairly fast computer and only this morning uploaded my first task here since joining on AUG 23. The deadlines are way too soon - the only reason I got it in on time was I left my computer on overnight while I slept. As was mentioned elsewhere, it also slows other things down which BOINC isn't supposed to do. The other task is 2 hours past deadline and less than 40% complete.
Note I have 2 other projects - World Community and climateprediction. I've finished 5 or 6 WC ones and am about 35% through the climate one - but the deadline for that one is a YEAR away!
ID: 8576
kevint
Joined: 30 Aug 07
Posts: 46
Credit: 6,502,980
RAC: 0
Message 8578 - Posted: 10 Sep 2009, 2:08:29 UTC
Last modified: 10 Sep 2009, 2:13:37 UTC

780M ??

You really got to be kidding!

I have 4 WU's running and the smallest one is 350M - the largest is 780M - I am getting the message "Waiting for memory"
I can not run 4 WU's on my quads without the entire box slowing down to grandma speed.

These WU's are worse than uFluids!

For the long crunch times and huge memory footprint, credit needs to be adjusted upwards of 3x

Why is there such a large memory footprint?? Is there something being done about it?
ID: 8578
Brian Silvers
Joined: 11 Dec 07
Posts: 420
Credit: 270,580
RAC: 0
Message 8583 - Posted: 12 Sep 2009, 15:44:13 UTC - in response to Message 8578.  
Last modified: 12 Sep 2009, 16:01:54 UTC


Why is there such a large memory footprint??


Because the app they have out now is providing them the data that they want...and they feel the amount of memory is appropriate.


Is there something being done about it?


No indications of life from any of the project admins or scientists in 2-2.5 months now... If it weren't for the fact that there is an actual physical project sitting out in space that this BOINC project is supposedly helping, I'd have given up on this project completely sometime last year... They just can't get it together when it comes to the BOINC project... It appears that in their mind, since they are getting the data they want/need and the application is stable, it doesn't matter how much memory the application consumes. The only thing that will change their opinion is if the return rate of the data from the BOINC project slows down too much. Of course that's assuming the data we're providing is of any criticality. If it's not of much criticality, then why bother having the BOINC project at all?

Anyway....we're pretty much on auto-pilot and have been for the past 6 months. After the server crash and partial recovery (partial because some things still don't work right), they put the new application in place and set the auto-pilot...with a few trickles of info, but nothing in over 2 months now...
ID: 8583
Emanuel
Joined: 28 Oct 07
Posts: 31
Credit: 316,100
RAC: 0
Message 8584 - Posted: 12 Sep 2009, 22:54:05 UTC - in response to Message 8583.  

It's getting to the point where I'm thinking it would be better to boycott the project until we get some attention. Unfortunately I doubt a lot of people check the forum on a regular basis, so it would be hard to make a dent in the amount of contribution.
ID: 8584
Brian Silvers
Joined: 11 Dec 07
Posts: 420
Credit: 270,580
RAC: 0
Message 8585 - Posted: 12 Sep 2009, 23:06:40 UTC - in response to Message 8584.  
Last modified: 12 Sep 2009, 23:08:22 UTC

It's getting to the point where I'm thinking it would be better to boycott the project until we get some attention. Unfortunately I doubt a lot of people check the forum on a regular basis, so it would be hard to make a dent in the amount of contribution.


"Boycotting" is not really helpful. Many people have been "boycotted" by the project itself anyway, because anything with less than about 1 GB of memory is going to run into serious problems running the application. Now we're seeing people with computers that have 2-8 GB of memory complaining about performance issues. They need to move the application off to GPUs if they want this much memory usage. If they did that, they'd get results back faster too, but I don't know whether faster is needed, given that the Planck mission runs for another year...

Space / Physics projects are my projects of choice, but I continually shake my head at the abysmal project management of the BOINC project here...
ID: 8585
Emanuel
Joined: 28 Oct 07
Posts: 31
Credit: 316,100
RAC: 0
Message 8586 - Posted: 13 Sep 2009, 13:24:51 UTC - in response to Message 8585.  

Yeah, I know.. If everyone was as bothered by this as I am (we are?), boycotting the project en masse would be easy to do.. unfortunately they still get enough crunchers (apparently) as it is, so individual users are essentially powerless.

The sad truth is I wouldn't boycott them because I want more credits or even because I feel ignored, but because I want to see that BOINC is a suitable platform for doing science, and I seriously doubt this project is getting anywhere near as much done as it could be. In the (distant) future, when we obtain the technology to make robots do most of our work for us, are we going to just do nothing as a species? Or are we going to contribute our resources, whether it be our computers or our minds, to advancement in science? *sigh* One can dream..
ID: 8586
rroonnaalldd
Joined: 10 Apr 08
Posts: 18
Credit: 147,580
RAC: 0
Message 8587 - Posted: 14 Sep 2009, 15:21:13 UTC - in response to Message 8586.  

I think you have only three choices: boycott, detach or keep crunching. They will get the data crunched with or without you. It is the same as with uFluids. :(

ID: 8587
koschi
Joined: 19 Apr 08
Posts: 2
Credit: 479,965
RAC: 0
Message 8589 - Posted: 19 Sep 2009, 5:24:30 UTC

I don't mind a big memory footprint or run time, if the WUs finish at all!
I calculated some WUs in July and now again in September; some finish, some do not. The latter ones are simply cycling between 60 and 77.499%.
Have a look at http://nopaste.info/4a924b9a2e.html, where I recorded the status of one WU over some time. I only reduced "waiting to run" lines to single ones.
Run time on a Q9550/4GB is already around 3 days; the system runs on 64-bit Linux.

Because of this behaviour I set Cosmo to no new work in July. I gave it another try now and the situation is still the same.

Now is this a problem on my side or of the work units / app?
ID: 8589
.clair.
Joined: 4 Nov 07
Posts: 604
Credit: 9,523,843
RAC: 12,917
Message 8591 - Posted: 19 Sep 2009, 21:04:31 UTC

Cosmo seems to have a long time between checkpoints, so it may be that
if you do not have `keep wu in memory` set to `yes`, this explains it getting stuck, if other projects on the PC complete OK.
But this means that even with 4 GB you may run out of RAM.
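To show why that setting could matter, here is a toy simulation in Python. It assumes, purely for illustration, that a task removed from memory when it is preempted restarts from its last checkpoint, and that checkpoints are written at fixed amounts of progress. The function, its parameters and the numbers are all hypothetical and are not taken from the Cosmology application.

```python
from typing import Optional


def slices_to_finish(total_work: int, checkpoint_every: int, slice_length: int,
                     keep_in_memory: bool,
                     max_slices: int = 1000) -> Optional[int]:
    """Return how many run slices the task needs, or None if it never finishes."""
    progress = 0          # work units completed so far
    last_checkpoint = 0   # progress that has been saved to disk
    for slice_number in range(1, max_slices + 1):
        for _ in range(slice_length):
            progress += 1
            if progress % checkpoint_every == 0:
                last_checkpoint = progress
            if progress >= total_work:
                return slice_number
        # The task is preempted here. If it is not kept in memory,
        # everything since the last checkpoint is lost.
        if not keep_in_memory:
            progress = last_checkpoint
    return None


# Checkpoints further apart than one run slice: the task keeps rolling back.
print(slices_to_finish(100, checkpoint_every=30, slice_length=20,
                       keep_in_memory=False))  # None
# Same task, kept in memory while preempted: it finishes.
print(slices_to_finish(100, checkpoint_every=30, slice_length=20,
                       keep_in_memory=True))   # 5
# Frequent checkpoints also avoid the problem without using extra RAM.
print(slices_to_finish(100, checkpoint_every=10, slice_length=20,
                       keep_in_memory=False))  # 5
```

In this model a task that is switched out more often than it checkpoints never gets past the same stretch of work, which would look like the progress cycling described earlier in the thread. The cost of keeping tasks in memory is the extra RAM they occupy while idle.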
ID: 8591
MattShizzle
Joined: 23 Aug 09
Posts: 3
Credit: 248,875
RAC: 30
Message 8601 - Posted: 28 Sep 2009, 1:34:48 UTC
Last modified: 28 Sep 2009, 1:38:06 UTC

Well, I thought I was close to finishing another one, but now there is more than 25 hours left (and the time left is going up, not down) and it is due at 3 in the afternoon of the 30th (I'm on EDT). If this one times out before finishing, I'm suspending CAH until I hear they changed it to give a reasonable amount of time to finish.

eta: well since I made this entry the time left is now less than 4 hours.
ID: 8601
koschi
Joined: 19 Apr 08
Posts: 2
Credit: 479,965
RAC: 0
Message 8602 - Posted: 28 Sep 2009, 19:33:36 UTC - in response to Message 8591.  

Cosmo seems to have a long time between checkpoints, so it may be that
if you do not have `keep wu in memory` set to `yes`, this explains it getting stuck, if other projects on the PC complete OK.
But this means that even with 4 GB you may run out of RAM.



'keep wu in memory' did the trick. Those monsters finally finished.

Thanks!
ID: 8602
.clair.
Joined: 4 Nov 07
Posts: 604
Credit: 9,523,843
RAC: 12,917
Message 8603 - Posted: 28 Sep 2009, 22:04:55 UTC

Pleased to see it helped,
I do wonder why the checkpoints are as far apart as they are.
ID: 8603
Brian Silvers
Joined: 11 Dec 07
Posts: 420
Credit: 270,580
RAC: 0
Message 8604 - Posted: 29 Sep 2009, 0:19:20 UTC - in response to Message 8603.  

Pleased to see it helped,
I do wonder why the checkpoints are as far apart as they are.


Who knows? This, like the download error, would be so easy to change, but either nobody from the project is willing to work on it or they are not willing to say that they are working on it...

I could have this all wrong, but I think more frequent checkpointing should reduce the memory load, since the app wouldn't have to carry around as much working memory. At least I think it wouldn't... They could have a scientifically proven fact though that the calculations needed for the next phase of a task complete in, on average, 2 hours, for all we know...

However, the team seems to have this opinion:



ID: 8604
Misfit
Volunteer tester
Joined: 9 Jun 07
Posts: 150
Credit: 237,789
RAC: 0
Message 8608 - Posted: 3 Oct 2009, 19:12:53 UTC - in response to Message 8591.  

Cosmo seems to have a long time between checkpoints, so it may be that
if you do not have `keep wu in memory` set to `yes`, this explains it getting stuck, if other projects on the PC complete OK.
But this means that even with 4 GB you may run out of RAM.

I have 4GB and running 2 cosmo units I'm typically at 73% memory usage. So leaving them sit in memory isn't an option for me.
me@rescam.org
ID: 8608