Long running WU aborted by project - no credit
Author | Message |
---|---|
Ruud van der Kroef Joined: 25 Aug 07 Posts: 12 Credit: 3,654,064 RAC: 0 |
I have a number of systems running unattended; one of them is synstar04. This morning I noticed a C@H WU running at high priority. It was at approx. 65% and had already been running for about 350 hours. Time to completion was another 250 hours, and the due date was something like 19-09-2008. I checked the Tasks Status Page for this system, but no tasks are displayed. I decided to do an Update Project on the Projects page of BOINCManager. Then I looked at the Work page of BOINCManager and found that the WU had been aborted by the project. The Messages page of BOINCManager shows the following:
The task has disappeared, does not show in the Task Status Page, no credits have been granted. 350 hours of computation time wasted. Has this problem been seen before? I cannot imagine that I am the first one to run into this. I have searched the Forum, did find other complaints about not getting credit, but nothing about this particular incident. Regards, Ruud |
Phoneman1 Joined: 5 Nov 07 Posts: 113 Credit: 3,100,327 RAC: 0 |
"Has this problem been seen before? I cannot imagine that I am the first one to run into this." I've not seen any Cosmo task go much over 5.25 hours since my old P4 was retired with a failed power supply a few months ago. It could be a rogue set of variables, but I think it might have to do with checkpointing and how often BOINC switches between tasks if you run multiple projects on your machine. If the switch interval is shorter than the interval between checkpoints, no progress can be made unless BOINC doesn't need to do a switch. That could explain how you got 65% in 350 hours of processing. I have heard of something similar on climate prediction. As for the task disappearing, that can probably be explained by the fact that the task would have been re-issued to someone else on the 19th, and if they completed it on that day it would have been archived within 10 days (if your original wingman had completed on time too). The project would have issued a cancel task request on the 19th, but you didn't pick it up, presumably because no update was done. I'd start by checking the general preferences on your account here, and if you use the home / work / school options, check those too. Also check the local settings on the machine in question. You should be safe leaving the switch interval at 60 minutes or more (not less). Because a project can cancel a task but the cancellation doesn't reach your machine until an update is made, it may be worth installing boinccmd and setting up a scheduled task to run an update via that software every 24 hours. Choose an odd time, not on the hour or half hour. If that had been in place, your long-running task would have been canceled on the 19th or 20th, not the 30th. Phoneman1 |
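Editor's sketch of the scheduled update Phoneman1 describes: a small Python wrapper around `boinccmd --project URL update`. The project URL below is a placeholder and the helper names are hypothetical; it also assumes `boinccmd` is on the PATH and the local client accepts the request (some installations need extra options such as an RPC password).

```python
import subprocess

# Placeholder (assumption): replace with the project URL shown in your BOINC client.
PROJECT_URL = "http://example.org/project/"


def build_update_command(project_url, boinccmd_path="boinccmd"):
    """Return the argument list for 'boinccmd --project URL update'."""
    return [boinccmd_path, "--project", project_url, "update"]


def request_update(project_url, boinccmd_path="boinccmd"):
    """Ask the local BOINC client to contact the project's scheduler.

    Raises FileNotFoundError if boinccmd is not installed or not on the PATH.
    """
    cmd = build_update_command(project_url, boinccmd_path)
    return subprocess.run(cmd, capture_output=True, text=True, check=False)
```

Calling `request_update(PROJECT_URL)` once a day from cron or the Windows Task Scheduler, at an odd minute such as 03:17, would let a server-side cancel reach an unattended host within 24 hours.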
Joined: 8 Aug 07 Posts: 54 Credit: 527,780 RAC: 0 |
I've got a 266G p4 xpsp3 and have NEVER run into a WU that long! The WUs have never gone over about 11 hrs on that box! A clear conscience is usually the sign of a bad memory |
Ruud van der Kroef Joined: 25 Aug 07 Posts: 12 Credit: 3,654,064 RAC: 0 |
It has been a while since I reported this problem, but I would like to thank the respondents.
I have checked the preferences, and the switch interval is in all profiles left unchanged: 60 min.
As I mentioned in my problem description, the tasks were running at 'high priority'. From experience (I have not checked any documentation on this), that means that task switching has been disabled for that task. I also think that in these cases updates are disabled. In the meantime I have found more of these tasks, but I will discuss them in a separate message. Thanks and regards, Ruud |
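Editor's sketch of the switch-interval hypothesis raised earlier in the thread, as a deliberately simplified model. It assumes (this is not confirmed anywhere in the thread) that a preempted task is removed from memory and later restarts from its last checkpoint, so only whole checkpoint intervals completed inside a time slice survive. The function names are hypothetical.

```python
def saved_minutes_per_slice(switch_interval, checkpoint_interval):
    """Minutes of work that survive one time slice under the model above.

    Both arguments are whole minutes. Work done after the last checkpoint
    in a slice is lost when the client switches to another task.
    """
    completed_checkpoints = switch_interval // checkpoint_interval
    return completed_checkpoints * checkpoint_interval


def efficiency(switch_interval, checkpoint_interval):
    """Fraction of the CPU time in a slice that becomes saved progress."""
    saved = saved_minutes_per_slice(switch_interval, checkpoint_interval)
    return saved / switch_interval
```

Under this model a 60-minute switch interval with checkpoints every 90 minutes saves nothing, while checkpoints every 20 minutes lose nothing. It only applies when the client actually switches tasks, which a task running at high priority may avoid.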
Ruud van der Kroef Joined: 25 Aug 07 Posts: 12 Credit: 3,654,064 RAC: 0 |
On Wednesday last week (05/11) I discovered that on this same host, synstar04, there are 3 C@H tasks running at high priority:
Task ID 11328240, name wu_101908_001039_0_2_0: processor time 263h, progress 43%, remaining time 217h
Task ID 11332781, name wu_101908_003211_0_2_0: processor time 226h, progress 36.5%, remaining time 226h
Task ID 11350047, name wu_101908_001559_1_1_1: processor time 248h, progress 36%, remaining time 241h
I aborted task wu_101908_001039_0_2_0 just to see what would happen. The task webpage for that computer shows the task as aborted, as expected, and of course without any credit granted, but look at the low claimed credit: only 560.80 for 816,696.90 CPU seconds. Another question: should I leave the remaining tasks (or maybe just one) running to see what will happen? Or should I kill them both, as it seems like a waste of computer time? Any suggestions? (Personally I think I will keep one running just to see what will happen.) Thanks, Ruud |
sygopet Joined: 2 Aug 08 Posts: 27 Credit: 204,771 RAC: 0 |
".....look at the low claimed credit: only 560.80 for 816,696.90 CPU seconds." Something is wrong! With your setup you should definitely be processing units within a few hours. I wouldn't have thought continuing with the others would have any worth either. It could be worth trying a project reset. It may be significant that two of the units (and possibly the third) you mention are ones where others have had downloading problems, so you've probably just got rubbish units. You might have a claimed credit of 560, but the standard award is just 140 at present. |
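Editor's sketch putting the numbers quoted in this thread side by side as credit per CPU hour. The "typical" figure combines the 140-credit standard award mentioned here with the roughly 5.25-hour runtime Phoneman1 reported on a different machine, so treat it as a rough comparison only. The helper name is hypothetical.

```python
def credit_per_hour(credit, cpu_seconds):
    """Credit earned (or claimed) per hour of CPU time."""
    return credit / (cpu_seconds / 3600.0)


# Aborted task: 560.80 claimed credit for 816,696.90 CPU seconds (about 227 hours).
aborted_rate = credit_per_hour(560.80, 816_696.90)

# Rough comparison (assumption): 140 credits for a task of about 5.25 hours.
typical_rate = credit_per_hour(140.0, 5.25 * 3600)

print(f"aborted task: {aborted_rate:.2f} credits/hour")
print(f"typical task: {typical_rate:.2f} credits/hour")
```

Under these assumptions the aborted task claimed roughly a tenth of the hourly credit of a normal task.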
web03 Joined: 29 Aug 07 Posts: 4 Credit: 314,240 RAC: 0 |
ok - some things to think about. It looks like you are running version 5.10.45. Any reason why you haven't upgraded to the current version of BOINC - 6.2.19? Second - what about your PC statistics? These can be found on the Computer Summary page (but are not viewable by us). Here's mine as an example: % of time BOINC client is running: 99.8956 %; While BOINC running, % of time host has an Internet connection: 98.6076 %; While BOINC running, % of time work is allowed: 99.974 %; Average CPU efficiency: 0.979113; Task duration correction factor: 1.160849. Third - have you recently re-run benchmarks? Maybe something is wrong there. Wendy |
Ruud van der Kroef Joined: 25 Aug 07 Posts: 12 Credit: 3,654,064 RAC: 0 |
OK, some answers: sygopet: "..." As you advised, I have killed both tasks, also because the progress looks very slow, so it might take quite some time to finish, if ever. web03: "ok - some things to think about." No special reason for not upgrading, other than that it is a lot of work upgrading 50+ boxes. "Second - what about your pc statistics? This can be found on the Computer Summary page (but not viewable by us). here's mine as an example." My statistics for this client are: % of time BOINC client is running: 100 %; While BOINC running, % of time work is allowed: 99.9914 %; Average CPU efficiency: 0.999058; Task duration correction factor: 16.420957. "Third - have you recently re-ran benchmarks? Maybe something is wrong there." I think they run automatically. Looking at the Messages tab of BOINCManager, I found they run every 5 days or so. Thanks, Ruud |
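Editor's sketch comparing the two task duration correction factors (DCF) posted above. It assumes the client scales a task's raw runtime estimate by the host's DCF; the 5-hour raw estimate is a hypothetical input, not a value from the thread.

```python
def corrected_estimate_hours(raw_estimate_hours, dcf):
    """Runtime estimate after applying the task duration correction factor."""
    return raw_estimate_hours * dcf


RUUD_DCF = 16.420957   # synstar04, as posted by Ruud
WENDY_DCF = 1.160849   # web03's host, as posted by Wendy

# How much longer the same raw estimate becomes on Ruud's host.
ratio = RUUD_DCF / WENDY_DCF

# Hypothetical 5-hour raw estimate on each host.
print(corrected_estimate_hours(5.0, RUUD_DCF))
print(corrected_estimate_hours(5.0, WENDY_DCF))
```

A factor this far above 1 suggests tasks on synstar04 have been taking many times longer than the project's estimates, which fits the multi-hundred-hour runtimes reported.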