Forums : General Topics : Work Unit of 80+ hours??

Profile microchip

Send message
Joined: 19 Jul 08
Posts: 19
Credit: 1,257,211
RAC: 6,201
Message 9494 - Posted: 23 Sep 2011, 10:14:12 UTC

Hi,

I have almost completed a WU that has taken more than 80 hours to compute. Even on recent hardware I think this is far too long: C@H is not multi-threaded, so each WU runs on a single core. Can we please get somewhat smaller WUs?
ID: 9494 · Report as offensive     Reply Quote
.clair.

Send message
Joined: 4 Nov 07
Posts: 604
Credit: 10,881,302
RAC: 0
Message 9495 - Posted: 23 Sep 2011, 20:18:39 UTC
Last modified: 23 Sep 2011, 20:22:13 UTC

That was a very long running work unit, though at least it validated ok.
Makes me think that something odd happened to it during processing.
I don't like it if they run longer than 100k secs on my old Athlon XP 3000 machine,
and it's a snail compared to recent PCs.

edit - smaller work units,
I think we've got two chances there,
and one of them is no chance.
ID: 9495 · Report as offensive     Reply Quote
Profile Benjamin Wandelt
Volunteer moderator
Project administrator
Project scientist

Send message
Joined: 24 Jun 07
Posts: 192
Credit: 15,273
RAC: 0
Message 9501 - Posted: 5 Oct 2011, 12:42:02 UTC - in response to Message 9495.  
Last modified: 5 Oct 2011, 12:42:24 UTC

Hi clive and microchip -

Could you explain to me why you have a preference for short work units? It's an honest question.

From the project manager's side, longer work units are almost always better, since they reduce communication and server overheads. On your end, you can always ask to run several work units at the same time, which is a trivial way to multithread.

Thank you -
Ben
Creator of Cosmology@Home
ID: 9501 · Report as offensive     Reply Quote
.clair.

Send message
Joined: 4 Nov 07
Posts: 604
Credit: 10,881,302
RAC: 0
Message 9503 - Posted: 5 Oct 2011, 23:47:33 UTC

Sorry Ben, it is not that I am wishing for shorter-running work units; it is much more that I need a faster PC to run them on. My comment was recalling that some time ago it was said that work units may become longer and use more RAM in the future. Sorry for putting it badly.
ID: 9503 · Report as offensive     Reply Quote
Profile microchip

Send message
Joined: 19 Jul 08
Posts: 19
Credit: 1,257,211
RAC: 6,201
Message 9504 - Posted: 6 Oct 2011, 18:14:26 UTC

Hi Ben,

I'd like somewhat smaller units because when I get such a long one, BOINC goes into "high priority" mode and dedicates one of the cores to that WU for the whole time it takes to compute, so nothing but C@H can run on that core. I also crunch for other projects and would like BOINC to switch between them every hour, but with a C@H WU in high priority mode this doesn't happen.

So either send smaller WUs or increase the reporting deadline so that BOINC does not go into high priority mode from the start.
ID: 9504 · Report as offensive     Reply Quote
Profile Benjamin Wandelt
Volunteer moderator
Project administrator
Project scientist

Send message
Joined: 24 Jun 07
Posts: 192
Credit: 15,273
RAC: 0
Message 9505 - Posted: 9 Oct 2011, 22:58:25 UTC - in response to Message 9504.  

Ok, thanks for the feedback. Sounds like we need to look at increasing the reporting deadline.

Happy crunching -

Ben

Creator of Cosmology@Home
ID: 9505 · Report as offensive     Reply Quote
Profile Benjamin Wandelt
Volunteer moderator
Project administrator
Project scientist

Send message
Joined: 24 Jun 07
Posts: 192
Credit: 15,273
RAC: 0
Message 9506 - Posted: 10 Oct 2011, 0:07:37 UTC - in response to Message 9504.  

I just checked and our reporting deadline is set to 15 days. 80 hours is about 3.33 days, so it seems like there would be plenty of time.

At what point does your client go into high-priority mode? Could this be a setting you can change at your end?

Thanks,
Ben
Creator of Cosmology@Home
ID: 9506 · Report as offensive     Reply Quote
mickydl*

Send message
Joined: 4 Jan 10
Posts: 1
Credit: 222,180
RAC: 0
Message 9509 - Posted: 14 Oct 2011, 19:30:56 UTC - in response to Message 9506.  

3.33 days is only true if you run the machine 24/7. Although many of us do, some don't.
My machines are switched off during the night and run an average of 15 hours a day.

Another thing I encountered during one of the last WUs I crunched is that the checkpointing seems to be quite inefficient. The WU had reached about 80% when I switched the machine off for the night. When I started everything the next day, it continued from its last checkpoint at 60% (a loss corresponding to several hours of work). I didn't investigate this any further, so I don't know if this is the normal behavior or just an unlucky coincidence.

So, let's make a new calculation :-)

80 hours / 15 hours per day = 5.33 days
Add a 5% penalty (maybe 10%?) for checkpointing and you get 5.6 days (5.87 with 10%).
If at least one other project is being crunched on the same machine, double the times. So now we have 11.2 days (or 11.73 days with the 10% penalty).
OK, it's still less than the 15-day deadline, but we are getting closer. Add yet another project to BOINC on that machine and you're likely to run into the "high priority" problem.
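
The estimate above can be written as a small Python sketch. It is only a rough model, not how BOINC itself schedules work: it assumes every attached project gets an equal share of the machine's running time, and the function name `days_to_finish` and its parameters are made up for illustration. The 80 hours, 15 hours per day, 5% / 10% penalties and the 15-day deadline are the figures quoted in this thread.

```python
def days_to_finish(cpu_hours, hours_on_per_day, checkpoint_penalty, n_projects):
    """Calendar days until a task finishes.

    Assumption: the machine's running time is split equally between
    n_projects, and checkpoint_penalty is the fraction of work redone
    after restarts (0.05 means 5%).
    """
    days_alone = cpu_hours / hours_on_per_day
    return days_alone * (1 + checkpoint_penalty) * n_projects


DEADLINE_DAYS = 15  # reporting deadline quoted earlier in the thread

for penalty in (0.05, 0.10):
    for projects in (1, 2, 3):
        days = days_to_finish(80, 15, penalty, projects)
        status = "misses" if days > DEADLINE_DAYS else "meets"
        print(f"{projects} project(s), {penalty:.0%} penalty: "
              f"{days:.2f} days ({status} the deadline)")
```

Under these assumptions, two projects still fit inside the 15 days, while a third project pushes the estimate past the deadline.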

Regards,
Michael
ID: 9509 · Report as offensive     Reply Quote
Profile microchip

Send message
Joined: 19 Jul 08
Posts: 19
Credit: 1,257,211
RAC: 6,201
Message 9616 - Posted: 29 Oct 2011, 20:05:38 UTC
Last modified: 29 Oct 2011, 20:10:06 UTC

Forget the 80 hours WU. I just finished one that took 127 hours to compute on an AMD Phenom II x6 1090T CPU and what did I get for it? A measly 420 credits. Ridiculous. BOINC was running it for the past 3 days in high priority mode. This is enough.

Ben, please increase the deadline or send smaller WUs, and also review the credits system.

Also, I am with mickydl*. I crunch for a bunch of other projects too and have never seen such a long WU. The high priority mode easily becomes a problem when you also crunch for other projects, even though my server runs 24/7.
ID: 9616 · Report as offensive     Reply Quote
.clair.

Send message
Joined: 4 Nov 07
Posts: 604
Credit: 10,881,302
RAC: 0
Message 9618 - Posted: 30 Oct 2011, 4:01:06 UTC

Do you have "Leave applications in memory while suspended" ticked
in the BOINC Manager preferences? If not, it will make work units take a very long time to finish.
Checkpoints in Cosmo are a long way apart, and if this is not ticked you lose work done, on all other projects as well, whenever BM switches between projects.
With it ticked, BM will move the data out to swap / virtual memory until it is needed again.
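
To get a feel for how much time widely spaced checkpoints can cost, here is a rough Python sketch. It is a toy model, not BOINC's actual behavior: it assumes the task is removed from memory at every project switch and restarts from its last checkpoint, that checkpoints fall at fixed intervals of completed work, and that every run slice has the same length. The function name and the numbers are invented for illustration.

```python
def cpu_minutes_with_rollback(work, checkpoint_every, run_slice):
    """CPU minutes needed to finish `work` minutes of computation when the
    task is preempted after every `run_slice` minutes and resumes from its
    last checkpoint (checkpoints occur every `checkpoint_every` minutes of
    completed work). All arguments are whole minutes.
    """
    saved = 0  # progress stored in the most recent checkpoint
    spent = 0  # CPU minutes consumed so far, including redone work
    while True:
        remaining = work - saved
        if remaining <= run_slice:
            # The task finishes within this slice.
            return spent + remaining
        reached = saved + run_slice
        # Roll back to the last checkpoint passed during this slice.
        new_saved = (reached // checkpoint_every) * checkpoint_every
        if new_saved == saved:
            raise RuntimeError(
                "no checkpoint reached within one slice; "
                "the task would never finish"
            )
        spent += run_slice
        saved = new_saved


# A 10-hour task, run in 2-hour slices.
print(cpu_minutes_with_rollback(600, 90, 120))  # checkpoints every 90 minutes
print(cpu_minutes_with_rollback(600, 1, 120))   # checkpoints every minute
```

In this model the 10-hour task needs 780 minutes of CPU time with 90-minute checkpoints, against 600 minutes when it checkpoints every minute. If a slice is shorter than the checkpoint spacing, the task never gets past its last checkpoint at all, which is the extreme case of the slowdown described here.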
ID: 9618 · Report as offensive     Reply Quote
Profile microchip

Send message
Joined: 19 Jul 08
Posts: 19
Credit: 1,257,211
RAC: 6,201
Message 9619 - Posted: 30 Oct 2011, 9:37:52 UTC - in response to Message 9618.  

Do you have "Leave applications in memory while suspended" ticked
in the BOINC Manager preferences? If not, it will make work units take a very long time to finish.
Checkpoints in Cosmo are a long way apart, and if this is not ticked you lose work done, on all other projects as well, whenever BM switches between projects.
With it ticked, BM will move the data out to swap / virtual memory until it is needed again.


Yes, I have it ...
ID: 9619 · Report as offensive     Reply Quote
.clair.

Send message
Joined: 4 Nov 07
Posts: 604
Credit: 10,881,302
RAC: 0
Message 9621 - Posted: 30 Oct 2011, 22:28:56 UTC

Sorry that it did not help.
I had a look at the specs of your PC; it has everything it needs to run well.
You do run a lot of other projects on it.
Do the other projects you do work units for run for similar times on their systems,
or is it just Cosmo that is running slow?
ID: 9621 · Report as offensive     Reply Quote
Profile microchip

Send message
Joined: 19 Jul 08
Posts: 19
Credit: 1,257,211
RAC: 6,201
Message 9623 - Posted: 31 Oct 2011, 13:19:37 UTC - in response to Message 9621.  

Sorry that it did not help.
I had a look at the specs of your PC; it has everything it needs to run well.
You do run a lot of other projects on it.
Do the other projects you do work units for run for similar times on their systems,
or is it just Cosmo that is running slow?


It is just cosmology@home that has such long work units. All other projects I'm attached to have WUs between 1 hour and 34 hours (excluding the GPU projects).

I really don't mind running very long WUs, but the deadline should be extended for those. Also the credits system on here definitely needs a review. :)
ID: 9623 · Report as offensive     Reply Quote
Profile Ananas

Send message
Joined: 19 Jan 08
Posts: 180
Credit: 2,500,290
RAC: 0
Message 9631 - Posted: 2 Nov 2011, 4:20:13 UTC

Workunits here usually run much shorter: between 4 and 12 hours would be normal (lately more of the shorter ones occur), and I haven't had such a long-running one so far. I guess there was something wrong with the workunit, or the host somehow had a problem with it.

Normal Cosmo result credits are not too low here on average, and deadlines are quite comfortable.

If your next one runs that long again, I would rather suspect that something on your host collides with Cosmo's resource requirements.
ID: 9631 · Report as offensive     Reply Quote
Profile microchip

Send message
Joined: 19 Jul 08
Posts: 19
Credit: 1,257,211
RAC: 6,201
Message 9635 - Posted: 3 Nov 2011, 15:46:35 UTC - in response to Message 9631.  

Workunits here usually run much shorter: between 4 and 12 hours would be normal (lately more of the shorter ones occur), and I haven't had such a long-running one so far. I guess there was something wrong with the workunit, or the host somehow had a problem with it.

Normal Cosmo result credits are not too low here on average, and deadlines are quite comfortable.

If your next one runs that long again, I would rather suspect that something on your host collides with Cosmo's resource requirements.


Actually, I tried on a different host and I get the same result. WUs between 36 and 70 hours (which I aborted as it was a test) so it's not just this one host that gets them.

About the credits, I have to disagree. I favor a credit system (also used by virtually all other projects) that grants credit based on the amount of work you do, not a fixed-credit system like the one here.

If you run other projects as well and get such a long WU, the deadline definitely becomes a problem, hence why I asked to increase it a bit :)
ID: 9635 · Report as offensive     Reply Quote
Profile microchip

Send message
Joined: 19 Jul 08
Posts: 19
Credit: 1,257,211
RAC: 6,201
Message 9715 - Posted: 30 Nov 2011, 22:20:54 UTC
Last modified: 30 Nov 2011, 22:22:53 UTC

OK, after further investigation I found out why C@H is running such long tasks on my host. It is because of a shitty checkpointing issue.

Suppose a WU has crunched to 85%. When BOINC suspends it to run another WU from some project and then goes back to the C@H WU, instead of continuing from 85%, it starts crunching it from 70% instead.

I guess Ben doesn't care much about adding more checkpoints (as I've seen others complain as well) so I'll also not care much and am detaching from C@H. Enough is enough. Good bye!
ID: 9715 · Report as offensive     Reply Quote
Profile cykodennis

Send message
Joined: 31 May 10
Posts: 234
Credit: 4,896,378
RAC: 0
Message 9724 - Posted: 4 Dec 2011, 10:45:23 UTC - in response to Message 9715.  

The WUs are as big as they have to be.
Work that has to be done. And that is the way it should be.
ID: 9724 · Report as offensive     Reply Quote
