URGENT Problems Discussion Thread
Author | Message |
---|---|
Joined: 17 Jul 07 Posts: 302 Credit: 5,006,319 RAC: 0 |
I'm getting a whole bunch of WUs that immediately throw -161 errors again. PS: These are the same errors discussed in the signal 11 thread. However, they overload BOINC so badly that I can't look at the messages. I run BOINC as a service, and I can't even get it to set "no new tasks"; I've had to stop the service, because CAMB (2 instances running, should be 4) locks up the BOINC Manager. I don't know why I am seeing 2 instances of CAMB with 4 WUs running. Normally I see 4 instances (as I should). Note: this started on my Windows machine; my Linux boxes are seeing it now too.
=============================================================== Dumping objects ->
{74} normal block at 0x00AA5970, 32 bytes long. Data: <camb_1.25_window> 63 61 6D 62 5F 31 2E 32 35 5F 77 69 6E 64 6F 77
{73} normal block at 0x00AA5888, 184 bytes long. Data: < pY > 00 00 00 00 CD CD CD CD 70 59 AA 00 CD CD CD CD
{68} normal block at 0x00AA2E98, 12 bytes long. Data: < D J R > D8 44 AA 00 B0 4A AA 00 88 52 AA 00
c:\documents and settings\skruger2.uiuc\my documents\boinc\api\boinc_api.c(160) : {63} normal block at 0x00AA4A80, 4 bytes long. Data: < > 00 00 C0 00
c:\documents and settings\skruger2.uiuc\my documents\boinc\lib\parse.c(140) : {62} normal block at 0x00AA5AB0, 90 bytes long. Data: < <color_scheme>T> 0A 3C 63 6F 6C 6F 72 5F 73 63 68 65 6D 65 3E 54
Object dump complete.
</stderr_txt>
<message>
<file_xfer_error> <file_name>wu_091107_200107_8_2_0</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>wu_091107_200107_8_2_1</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>wu_091107_200107_8_2_2</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>wu_091107_200107_8_2_3</file_name> <error_code>-161</error_code> </file_xfer_error>
<file_xfer_error> <file_name>wu_091107_200107_8_2_4</file_name> <error_code>-161</error_code> </file_xfer_error>
================================== Boinc Button Abuser In Training >My Shrubbers< |
Joined: 3 Aug 07 Posts: 35 Credit: 153,234 RAC: 0 |
Happened again (see my post above). I also noticed this time that BOINC temporarily freezes up while attempting to crunch these units, and I get a message at the top of the client that the local host is not responding. This lasts 30 to 60 seconds, then the units error out. |
Volunteer moderator Project administrator Project developer Joined: 1 Apr 07 Posts: 662 Credit: 13,742 RAC: 0 |
Apparently, just downgrading the server software isn't enough. I'm going to try to check with the BOINC people to see what changed in the newest version so I can pinpoint the problems. Scott Kruger Project Administrator, Cosmology@Home |
Volunteer moderator Project administrator Project developer Joined: 1 Apr 07 Posts: 662 Credit: 13,742 RAC: 0 |
I need some more information on the errors: 1) Does the WU quit immediately, or does it run for a while? If it doesn't quit immediately, how long does it run? 2) Does this happen to *every* WU you get or just some? If only some, about what percentage? EDIT: It seems like multi-core processors and 5.10 clients are the only ones not erroring out. Therefore, I've raised the minimum client version to 5.10. Sorry if this inconveniences some of you, but it's a necessary step at this point. EDIT 2: Never mind, the bug affects everybody. I've got to talk to the BOINC people about this. Scott Kruger Project Administrator, Cosmology@Home |
Volunteer tester Joined: 22 May 07 Posts: 110 Credit: 353,577 RAC: 7 |
"I need some more information on the errors:" It happened to all the WUs I received last; they are listed in my results all in a row. Some I received were from a different batch, and they ran fine. It seems to happen immediately; at least my messages tab says so. Edit: I just happened to watch some fail. They seem to freeze BOINC for some time, probably about half a minute, as that's what the messages tell me. It's not much waste, but more than nothing ;) Greetings from Sänger |
StratCat Joined: 20 Jul 07 Posts: 26 Credit: 263,747 RAC: 0 |
Hi - I am having exactly the same symptoms as Saenger. The new WUs download, or attempt to download, and seem to attempt to start, but no time is incremented or decremented under the "Task" tab of the BOINC Manager. The BOINC Manager then freezes up for a minute or so, and the status of the newly downloaded WUs changes to "computation error". This has occurred with every WU downloaded since last night (several dozen). My system: Intel C2Q (G0 stepping), Intel i965P chipset, Win XP Pro SP2, BOINC ver 5.10.13. Hope this helps. Best regards. |
Volunteer moderator Project administrator Project developer Joined: 1 Apr 07 Posts: 662 Credit: 13,742 RAC: 0 |
I upgraded the server software to the newest version and changed the template files. I'm making some more workunits to test whether or not this works. BTW, the required client version is back to 5.8. Scott Kruger Project Administrator, Cosmology@Home |
Joined: 9 Sep 07 Posts: 89 Credit: 2,201,260 RAC: 0 |
"I need some more information on the errors:" Scott, I think it's not the client version. This is one of 29 error results from my 5.10.20 Windows rig. The errors show exactly the same symptoms as Saenger (5.10.8) and StratCat (5.10.13) describe. mic. |
Volunteer moderator Project administrator Project developer Joined: 1 Apr 07 Posts: 662 Credit: 13,742 RAC: 0 |
"I need some more information on the errors:" Yes, I already noticed this; that's why I removed the 5.10 requirement. It's strange how WUs never error out on my Core 2 Duo machine (I haven't been able to test my Pentium M machine, since it's currently crunching WUs from last month). Scott Kruger Project Administrator, Cosmology@Home |
Volunteer tester Joined: 22 May 07 Posts: 110 Credit: 353,577 RAC: 7 |
"Yes, I already noticed this; that's why I removed the 5.10 requirement." Mine is a C2D as well, an E6750 @ 3.6GHz running Ubuntu 7.04 and BOINC 5.10.8. It started with WU #461074 (wu_091107_190003_1) in my list; the last good one was #460806 (wu_091107_184225_1). Some of the good ones were crunched later, after the first bad ones failed, but they had a smaller WU number. Greetings from Sänger |
Nvgnte Joined: 24 Jun 07 Posts: 49 Credit: 920,725 RAC: 12 |
One of my crunchers (P4-1800 MHz, WinXP, BOINC 5.10.20) is almost done with this one with no problems at all. However, one WU downloaded a couple of hours ago on another PC (laptop, P4-2400, XP, 5.10.20), 976484, crashed as reported earlier. EDIT: In fact, JRenkar did crunch the first one successfully :) The land of a God who could not accept / his false right to freedom - Tierra Santa. Download my first eBook, Amaneceres |
Joined: 11 Aug 07 Posts: 63 Credit: 1,843,380 RAC: 0 |
Seems as though all mine are client errors now:
999584 471171 12 Sep 2007 20:27:55 UTC 22 Sep 2007 20:27:55 UTC In Progress Unknown New --- --- ---
985088 468485 12 Sep 2007 13:04:37 UTC 12 Sep 2007 14:14:04 UTC Over Client error Compute error 0.00 0.00 ---
985083 468482 12 Sep 2007 13:04:04 UTC 12 Sep 2007 14:14:04 UTC Over Client error Compute error 0.02 0.00 ---
985075 468478 12 Sep 2007 13:04:04 UTC 12 Sep 2007 14:14:04 UTC Over Client error Compute error 0.00 0.00 ---
984377 468130 12 Sep 2007 12:51:01 UTC 12 Sep 2007 13:04:04 UTC Over Client error Compute error 0.00 0.00 ---
984187 468035 12 Sep 2007 12:51:01 UTC 12 Sep 2007 13:04:04 UTC Over Client error Compute error 0.00 0.00 ---
977778 465131 12 Sep 2007 17:28:35 UTC 22 Sep 2007 17:28:35 UTC In Progress Unknown New --- --- ---
977138 464815 12 Sep 2007 17:22:52 UTC 12 Sep 2007 19:22:39 UTC Over Client error Compute error 0.00 0.00 ---
977131 464812 12 Sep 2007 17:22:36 UTC 12 Sep 2007 19:22:39 UTC Over Client error Compute error 0.00 0.00 ---
977105 464799 12 Sep 2007 17:20:38 UTC 12 Sep 2007 19:22:39 UTC Over Client error Compute error 0.00 0.00 ---
974820 463657 12 Sep 2007 19:27:21 UTC 12 Sep 2007 20:27:55 UTC Over Client error Compute error 0.00 0.00 ---
974791 463642 12 Sep 2007 19:26:49 UTC 12 Sep 2007 20:27:55 UTC Over Client error Compute error 0.00 0.00 ---
974589 463541 12 Sep 2007 19:27:04 UTC 12 Sep 2007 20:27:55 UTC Over Client error Compute error 0.00 0.00 ---
974570 463532 12 Sep 2007 19:23:10 UTC 12 Sep 2007 19:26:49 UTC Over Client error Compute error 0.00 0.00 ---
974564 463529 12 Sep 2007 19:22:53 UTC 12 Sep 2007 19:26:49 UTC Over Client error Compute error 0.00 0.00 ---
974562 463528 12 Sep 2007 19:22:39 UTC 12 Sep 2007 19:26:49 UTC Over Client error Compute error 0.00 0.00 ---
973192 462861 12 Sep 2007 0:44:15 UTC 12 Sep 2007 1:51:26 UTC Over Client error Compute error 0.00 0.00 ---
971478 462004 12 Sep 2007 16:13:02 UTC 12 Sep 2007 17:22:36 UTC Over Client error Compute error 0.00 0.00 ---
971469 461999 12 Sep 2007 16:12:46 UTC 12 Sep 2007 17:22:36 UTC Over Client error Compute error 0.00 0.00 ---
969991 461260 12 Sep 2007 0:40:27 UTC 12 Sep 2007 0:44:15 UTC Over Client error Compute error 0.00 0.00 ---
First page only. :( |
Joined: 9 Sep 07 Posts: 89 Credit: 2,201,260 RAC: 0 |
BTW, the \\-pest came over the forum. Already went through that over at SAH... \'\'\'\'\'\' <<---no \\ typed here! mic. |
Volunteer moderator Project administrator Project developer Joined: 1 Apr 07 Posts: 662 Credit: 13,742 RAC: 0 |
"BTW, the \\-pest came over the forum." Yeah... I saw that. I'm attempting to find a fix for the code somewhere. Scott Kruger Project Administrator, Cosmology@Home |
Joined: 17 Jul 07 Posts: 302 Credit: 5,006,319 RAC: 0 |
All new WUs are now erroring out after a full run. It appears to happen on all platforms: Q6600/Windows, AMD X2/Linux, P4 D/Windows.
==================== CPU time 6775.63
stderr out
<core_client_version>5.10.8</core_client_version>
<![CDATA[
<stderr_txt>
wrapper: starting running camb_1.25_x86_64-pc-linux-gnu
wrapper: running ../../projects/www.cosmologyathome.org/camb_1.25_x86_64-pc-linux-gnu (params.ini)
</stderr_txt>
<message>
<file_xfer_error> <file_name>wu_091207_150352_0_0_5</file_name> <error_code>-161</error_code> </file_xfer_error>
</message>
]]>
Validate state Initial
Claimed credit 24.1131324070043
Granted credit 0
=============== Boinc Button Abuser In Training >My Shrubbers< |
Volunteer moderator Project administrator Project developer Joined: 1 Apr 07 Posts: 662 Credit: 13,742 RAC: 0 |
"All new WUs are now erroring out after a full run. Appears to happen on all platforms:" My memory is faulty; I forgot that this version of CAMB only has 5 output files, instead of the 6 in the new version that I haven't released yet. Therefore, the template file was wrong. Anyway, I corrected it. Let's see if this helps. Scott Kruger Project Administrator, Cosmology@Home |
caferace Joined: 1 Aug 07 Posts: 24 Credit: 287,830 RAC: 0 |
OK, Scott. All my existing WUs were properly aborted. Now I'm getting "no work from project" messages on all my boxes. Example:
9/12/2007 7:17:48 PM|Cosmology@Home|Sending scheduler request: To fetch work
9/12/2007 7:17:48 PM|Cosmology@Home|Requesting 297534 seconds of new work
9/12/2007 7:17:53 PM|Cosmology@Home|Scheduler RPC succeeded [server version 601]
9/12/2007 7:17:53 PM|Cosmology@Home|Message from server: No work sent
9/12/2007 7:17:53 PM|Cosmology@Home|Message from server: (there was work but it was committed to other platforms)
9/12/2007 7:17:53 PM|Cosmology@Home|Deferring communication for 7 sec
9/12/2007 7:17:53 PM|Cosmology@Home|Reason: requested by project
9/12/2007 7:17:53 PM|Cosmology@Home|Deferring communication for 6 min 5 sec
9/12/2007 7:17:53 PM|Cosmology@Home|Reason: no work from project
cheers, -jim |
Joined: 3 Aug 07 Posts: 35 Credit: 153,234 RAC: 0 |
I was allowed to download one unit. Then the "committed to other platforms" messages started (3800-plus units according to the server status page). The unit started OK and seems to be crunching. |
caferace Joined: 1 Aug 07 Posts: 24 Credit: 287,830 RAC: 0 |
"Was allowed to download one unit." An update on mine shows the same situation, even on the dual-core boxen. -jim |
Seventh Serenity Joined: 2 Jul 07 Posts: 5 Credit: 16,070 RAC: 0 |
My P4 is getting the "there was work, but it was committed to other platforms" error too. It's almost dry, since it's been getting the error for several hours. |
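
For anyone who wants to tally the -161 failures from saved result pages instead of reading them by eye, here is a minimal sketch. It is not part of BOINC or of the project's tooling: the function names `find_xfer_errors` and `count_by_code` are made up for illustration, and it assumes the stderr text has been copied out of a result page in the same form as the fragments posted in this thread. A regular expression is used rather than an XML parser because the posted fragments are not always well-formed (some are missing a closing `</message>`).

```python
import re
from collections import Counter

# Matches one <file_xfer_error> entry in the form seen in the posted results.
XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>\s*"
    r"</file_xfer_error>"
)


def find_xfer_errors(stderr_text):
    """Return (file_name, error_code) pairs found in a result's stderr text."""
    return [
        (m.group("name"), int(m.group("code")))
        for m in XFER_ERROR.finditer(stderr_text)
    ]


def count_by_code(stderr_text):
    """Count how many output files failed with each error code."""
    return Counter(code for _, code in find_xfer_errors(stderr_text))


# Sample text copied from the first report in this thread (two entries only).
sample = (
    "<message> <file_xfer_error> <file_name>wu_091107_200107_8_2_0</file_name> "
    "<error_code>-161</error_code> </file_xfer_error> "
    "<file_xfer_error> <file_name>wu_091107_200107_8_2_1</file_name> "
    "<error_code>-161</error_code> </file_xfer_error>"
)

if __name__ == "__main__":
    for name, code in find_xfer_errors(sample):
        print(name, code)
    print(dict(count_by_code(sample)))
```

Counting per output file rather than per result makes it easier to see whether every expected output file of a WU is failing or only some of them, which is the kind of mismatch the template-file fix above was addressing.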