Forums : Technical Support : URGENT Problems Discussion Thread

Profile ohiomike
Joined: 17 Jul 07
Posts: 302
Credit: 5,006,319
RAC: 0
Message 2652 - Posted: 12 Sep 2007, 13:34:35 UTC
Last modified: 12 Sep 2007, 13:46:59 UTC

I'm getting a whole bunch of WUs that immediately throw -161 errors again.
PS: These are the same ones discussed in the signal 11 thread. However, they overload BOINC so badly that I can't look at the messages.
I run BOINC as a service, and I can't even get it to set "no new tasks". I've had to stop the service, because CAMB (2 instances running; there should be 4) locks up the BOINC Manager. I don't know why I am seeing 2 instances of CAMB with 4 WUs running. Normally I see 4 instances (as I should).

* Note this started out on my Windows machine, my Linux boxes are seeing it now also.

===============================================================
Dumping objects ->
{74} normal block at 0x00AA5970, 32 bytes long.
Data: <camb_1.25_window> 63 61 6D 62 5F 31 2E 32 35 5F 77 69 6E 64 6F 77
{73} normal block at 0x00AA5888, 184 bytes long.
Data: < pY > 00 00 00 00 CD CD CD CD 70 59 AA 00 CD CD CD CD
{68} normal block at 0x00AA2E98, 12 bytes long.
Data: < D J R > D8 44 AA 00 B0 4A AA 00 88 52 AA 00
c:\documents and settings\skruger2.uiuc\my documents\boinc\api\boinc_api.c(160) : {63} normal block at 0x00AA4A80, 4 bytes long.
Data: < > 00 00 C0 00
c:\documents and settings\skruger2.uiuc\my documents\boinc\lib\parse.c(140) : {62} normal block at 0x00AA5AB0, 90 bytes long.
Data: < <color_scheme>T> 0A 3C 63 6F 6C 6F 72 5F 73 63 68 65 6D 65 3E 54
Object dump complete.


</stderr_txt>
<message>
<file_xfer_error>
<file_name>wu_091107_200107_8_2_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
<file_name>wu_091107_200107_8_2_1</file_name>
<error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
<file_name>wu_091107_200107_8_2_2</file_name>
<error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
<file_name>wu_091107_200107_8_2_3</file_name>
<error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
<file_name>wu_091107_200107_8_2_4</file_name>
<error_code>-161</error_code>
</file_xfer_error>
==================================



Boinc Button Abuser In Training >My Shrubbers<
ID: 2652 · Report as offensive
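
For anyone wading through many failed results, the `<file_xfer_error>` entries in a dump like the one above can be tallied with a short script. This is only a sketch: the name `tally_xfer_errors` is made up for illustration, and it uses a regular expression rather than an XML parser because the pasted fragment is not a complete XML document.

```python
import re
from collections import Counter

# Matches one <file_xfer_error> entry and captures the file name and code.
XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>\s*"
    r"</file_xfer_error>"
)

def tally_xfer_errors(text):
    """Return (list of (file_name, error_code), Counter of error codes)."""
    entries = [(m.group("name"), int(m.group("code")))
               for m in XFER_ERROR.finditer(text)]
    return entries, Counter(code for _, code in entries)

# Two entries copied from the dump above.
sample = """
<file_xfer_error>
<file_name>wu_091107_200107_8_2_0</file_name>
<error_code>-161</error_code>
</file_xfer_error>
<file_xfer_error>
<file_name>wu_091107_200107_8_2_1</file_name>
<error_code>-161</error_code>
</file_xfer_error>
"""

entries, counts = tally_xfer_errors(sample)
for name, code in entries:
    print(name, code)
print(dict(counts))
```

Grouping by error code makes it easy to see whether every output file of a result failed the same way, as they do in this dump.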
Profile Campion

Joined: 3 Aug 07
Posts: 35
Credit: 153,234
RAC: 0
Message 2658 - Posted: 12 Sep 2007, 15:27:38 UTC

Happened again (see my post above).

I also noticed this time that BOINC temporarily freezes up while
attempting to crunch these units. I also get a message at the top
of the client that the local host is not responding. This lasts
30 to 60 seconds, then the units error out.




ID: 2658 · Report as offensive
Profile Scott
Volunteer moderator
Project administrator
Project developer
Joined: 1 Apr 07
Posts: 662
Credit: 13,742
RAC: 0
Message 2666 - Posted: 12 Sep 2007, 18:11:43 UTC

Apparently, just downgrading the server software isn't enough. I'm going to try to check with the BOINC people to see what changed in the newest version so I can pinpoint the problems.
Scott Kruger
Project Administrator, Cosmology@Home
ID: 2666 · Report as offensive
Profile Scott
Volunteer moderator
Project administrator
Project developer
Joined: 1 Apr 07
Posts: 662
Credit: 13,742
RAC: 0
Message 2668 - Posted: 12 Sep 2007, 18:19:54 UTC
Last modified: 12 Sep 2007, 18:45:40 UTC

I need some more information on the errors:

1) Does the WU quit immediately or does it run for a while? If it doesn't immediately quit, how long does it go?
2) Does this happen to *every* WU you get or just some? If only some, about what percentage?

EDIT: It seems like multi-core processors and 5.10 clients are the only ones not erroring out. Therefore, I've raised the minimum client version to 5.10. Sorry if this inconveniences some of you, but it's a necessary step at this point.

EDIT 2: Never mind, the bug affects everybody. I've got to talk to the BOINC people about this.
Scott Kruger
Project Administrator, Cosmology@Home
ID: 2668 · Report as offensive
Profile Saenger
Volunteer tester
Joined: 22 May 07
Posts: 110
Credit: 282,157
RAC: 0
Message 2669 - Posted: 12 Sep 2007, 18:32:51 UTC - in response to Message 2668.  
Last modified: 12 Sep 2007, 19:22:02 UTC

I need some more information on the errors:

1) Does the WU quit immediately or does it run for a while? If it doesn't immediately quit, how long does it go?
2) Does this happen to *every* WU you get or just some? If only some, about what percentage?


It happened to all the WUs I received most recently; they are listed in my results all in a row. Some that I received were from a different batch, and they ran fine.
It seems to happen immediately; at least my message tab says so.

edit:
I just happened to watch some fail. They seem to freeze BOINC for some time, probably about half a minute, as that's what the messages tell me. It's not much waste, but more than nothing ;)
Greetings from Sänger
ID: 2669 · Report as offensive
StratCat

Joined: 20 Jul 07
Posts: 26
Credit: 263,710
RAC: 0
Message 2671 - Posted: 12 Sep 2007, 19:43:44 UTC

Hi -

I am having exactly the same symptoms as Saenger.

The new WUs d/l, or attempt to d/l, and seem to attempt to start, but no time is incrementing or decrementing under the "Task" tab of the BOINC mgr. The BOINC mgr then freezes up for a minute or so, and the status of the newly d/l'd WUs changes to "computation error".

This occurs with every WU downloaded since last night (several dozen).

My system:

Intel C2Q G0 stepping
Intel i965P Chipset
Win XP-Pro SP2
BOINC ver 5.10.13

Hope this helps.

Best regards.

Team Ars Technica
The Dogs of War - Chicago Chapter
ID: 2671 · Report as offensive
Profile Scott
Volunteer moderator
Project administrator
Project developer
Joined: 1 Apr 07
Posts: 662
Credit: 13,742
RAC: 0
Message 2673 - Posted: 12 Sep 2007, 20:04:03 UTC

I upgraded the server software to the newest version and changed the template files. I'm making some more workunits to test whether or not this works.

BTW, the required client version is back to 5.8.
Scott Kruger
Project Administrator, Cosmology@Home
ID: 2673 · Report as offensive
Profile speedimic
Joined: 9 Sep 07
Posts: 89
Credit: 2,201,260
RAC: 0
Message 2674 - Posted: 12 Sep 2007, 20:08:23 UTC

I need some more information on the errors:

1) Does the WU quit immediately or does it run for a while? If it doesn't immediately quit, how long does it go?
2) Does this happen to *every* WU you get or just some? If only some, about what percentage?

EDIT: It seems like multi-core processors and 5.10 clients are the only ones not erroring out. Therefore, I've raised the minimum client version to 5.10. Sorry if this inconveniences some of you, but it's a necessary step at this point.

EDIT 2: Never mind, the bug affects everybody. I've got to talk to the BOINC people about this.


Scott,

I don't think it's the client version. This is one of 29 error results from my 5.10.20 Windows rig.

Errors show exactly the same symptoms as Saenger(5.10.8) and StratCat(5.10.13) describe.


mic.


ID: 2674 · Report as offensive
Profile Scott
Volunteer moderator
Project administrator
Project developer
Joined: 1 Apr 07
Posts: 662
Credit: 13,742
RAC: 0
Message 2675 - Posted: 12 Sep 2007, 20:12:00 UTC - in response to Message 2674.  

I need some more information on the errors:

1) Does the WU quit immediately or does it run for a while? If it doesn't immediately quit, how long does it go?
2) Does this happen to *every* WU you get or just some? If only some, about what percentage?

EDIT: It seems like multi-core processors and 5.10 clients are the only ones not erroring out. Therefore, I've raised the minimum client version to 5.10. Sorry if this inconveniences some of you, but it's a necessary step at this point.

EDIT 2: Never mind, the bug affects everybody. I've got to talk to the BOINC people about this.


Scott,

I don't think it's the client version. This is one of 29 error results from my 5.10.20 Windows rig.

Errors show exactly the same symptoms as Saenger(5.10.8) and StratCat(5.10.13) describe.


Yes, I already noticed this; that's why I removed the 5.10 requirement.

It's strange how WUs do not error out ever on my Core 2 Duo machine (I haven't been able to test my Pentium M machine, since it's currently crunching WUs from last month).
Scott Kruger
Project Administrator, Cosmology@Home
ID: 2675 · Report as offensive
Profile Saenger
Volunteer tester
Joined: 22 May 07
Posts: 110
Credit: 282,157
RAC: 0
Message 2677 - Posted: 12 Sep 2007, 20:19:32 UTC - in response to Message 2675.  
Last modified: 12 Sep 2007, 20:19:52 UTC

Yes, I already noticed this; that's why I removed the 5.10 requirement.

It's strange how WUs do not error out ever on my Core 2 Duo machine (I haven't been able to test my Pentium M machine, since it's currently crunching WUs from last month).

Mine is a C2D as well, an E6750 @ 3.6 GHz running Ubuntu 7.04 and BOINC 5.10.8.
It started with WU #461074 (wu_091107_190003_1) in my list, last good one was #460806 (wu_091107_184225_1). Some of the good ones were crunched later, after the first bad ones failed, but they had a smaller WU number.
Greetings from Sänger
ID: 2677 · Report as offensive
Nvgnte
Joined: 24 Jun 07
Posts: 49
Credit: 538,016
RAC: 2,755
Message 2678 - Posted: 12 Sep 2007, 20:20:20 UTC
Last modified: 12 Sep 2007, 20:29:17 UTC

One of my crunchers (P4 1800 MHz, WinXP, BOINC 5.10.20) is almost done with this one, with no problems at all.

However, one WU downloaded a couple of hours ago on another PC (laptop, P4 2400, XP, 5.10.20), 976484, crashed as reported earlier.

EDIT: In fact, JRenkar did crunch the first one successfully :)
The land of a God who could not accept / his false right to freedom - Tierra Santa
Download my first eBook, Amaneceres
ID: 2678 · Report as offensive
Profile Beezlebub

Joined: 11 Aug 07
Posts: 63
Credit: 1,843,380
RAC: 0
Message 2680 - Posted: 12 Sep 2007, 20:38:40 UTC
Last modified: 12 Sep 2007, 20:39:23 UTC

Seems as though all mine are client errors now:

Result ID | Work unit ID | Sent | Time reported or deadline | Server state | Outcome | Client state | CPU time (sec) | Claimed credit | Granted credit
999584 | 471171 | 12 Sep 2007 20:27:55 UTC | 22 Sep 2007 20:27:55 UTC | In Progress | Unknown | New | --- | --- | ---
985088 | 468485 | 12 Sep 2007 13:04:37 UTC | 12 Sep 2007 14:14:04 UTC | Over | Client error | Compute error | 0.00 | 0.00 | ---
985083 | 468482 | 12 Sep 2007 13:04:04 UTC | 12 Sep 2007 14:14:04 UTC | Over | Client error | Compute error | 0.02 | 0.00 | ---
985075 | 468478 | 12 Sep 2007 13:04:04 UTC | 12 Sep 2007 14:14:04 UTC | Over | Client error | Compute error | 0.00 | 0.00 | ---
984377 | 468130 | 12 Sep 2007 12:51:01 UTC | 12 Sep 2007 13:04:04 UTC | Over | Client error | Compute error | 0.00 | 0.00 | ---
984187 | 468035 | 12 Sep 2007 12:51:01 UTC | 12 Sep 2007 13:04:04 UTC | Over | Client error | Compute error | 0.00 | 0.00 | ---
977778 | 465131 | 12 Sep 2007 17:28:35 UTC | 22 Sep 2007 17:28:35 UTC | In Progress | Unknown | New | --- | --- | ---
977138 | 464815 | 12 Sep 2007 17:22:52 UTC | 12 Sep 2007 19:22:39 UTC | Over | Client error | Compute error | 0.00 | 0.00 | ---
977131 | 464812 | 12 Sep 2007 17:22:36 UTC | 12 Sep 2007 19:22:39 UTC | Over | Client error | Compute error | 0.00 | 0.00 | ---
977105 | 464799 | 12 Sep 2007 17:20:38 UTC | 12 Sep 2007 19:22:39 UTC | Over | Client error | Compute error | 0.00 | 0.00 | ---
974820 | 463657 | 12 Sep 2007 19:27:21 UTC | 12 Sep 2007 20:27:55 UTC | Over | Client error | Compute error | 0.00 | 0.00 | ---
974791 | 463642 | 12 Sep 2007 19:26:49 UTC | 12 Sep 2007 20:27:55 UTC | Over | Client error | Compute error | 0.00 | 0.00 | ---
974589 | 463541 | 12 Sep 2007 19:27:04 UTC | 12 Sep 2007 20:27:55 UTC | Over | Client error | Compute error | 0.00 | 0.00 | ---
974570 | 463532 | 12 Sep 2007 19:23:10 UTC | 12 Sep 2007 19:26:49 UTC | Over | Client error | Compute error | 0.00 | 0.00 | ---
974564 | 463529 | 12 Sep 2007 19:22:53 UTC | 12 Sep 2007 19:26:49 UTC | Over | Client error | Compute error | 0.00 | 0.00 | ---
974562 | 463528 | 12 Sep 2007 19:22:39 UTC | 12 Sep 2007 19:26:49 UTC | Over | Client error | Compute error | 0.00 | 0.00 | ---
973192 | 462861 | 12 Sep 2007 0:44:15 UTC | 12 Sep 2007 1:51:26 UTC | Over | Client error | Compute error | 0.00 | 0.00 | ---
971478 | 462004 | 12 Sep 2007 16:13:02 UTC | 12 Sep 2007 17:22:36 UTC | Over | Client error | Compute error | 0.00 | 0.00 | ---
971469 | 461999 | 12 Sep 2007 16:12:46 UTC | 12 Sep 2007 17:22:36 UTC | Over | Client error | Compute error | 0.00 | 0.00 | ---
969991 | 461260 | 12 Sep 2007 0:40:27 UTC | 12 Sep 2007 0:44:15 UTC | Over | Client error | Compute error | 0.00 | 0.00 | ---

First page only. :(
ID: 2680 · Report as offensive
Profile speedimic
Joined: 9 Sep 07
Posts: 89
Credit: 2,201,260
RAC: 0
Message 2682 - Posted: 12 Sep 2007, 20:54:13 UTC
Last modified: 12 Sep 2007, 20:54:46 UTC

BTW, the \\-pest came over the forum.

Already went through that over at SAH...

\'\'\'\'\'\' <<---no \\ typed here!


mic.


ID: 2682 · Report as offensive
Profile Scott
Volunteer moderator
Project administrator
Project developer
Joined: 1 Apr 07
Posts: 662
Credit: 13,742
RAC: 0
Message 2683 - Posted: 12 Sep 2007, 21:26:37 UTC - in response to Message 2682.  

BTW, the \\-pest came over the forum.

Already went through that over at SAH...

\'\'\'\'\'\' <<---no \\ typed here!


Yeah... I saw that. I'm attempting to find a fix for the code somewhere.
Scott Kruger
Project Administrator, Cosmology@Home
ID: 2683 · Report as offensive
Profile ohiomike
Joined: 17 Jul 07
Posts: 302
Credit: 5,006,319
RAC: 0
Message 2684 - Posted: 13 Sep 2007, 1:02:59 UTC
Last modified: 13 Sep 2007, 1:08:22 UTC

All new WUs are now erroring out after a full run. Appears to happen on all platforms:
Q6600/Windows
AMD X2/Linux
P4 D/Windows

====================
CPU time 6775.63
stderr out

<core_client_version>5.10.8</core_client_version>
<![CDATA[
<stderr_txt>
wrapper: starting
running camb_1.25_x86_64-pc-linux-gnu
wrapper: running ../../projects/www.cosmologyathome.org/camb_1.25_x86_64-pc-linux-gnu (params.ini)

</stderr_txt>
<message>
<file_xfer_error>
<file_name>wu_091207_150352_0_0_5</file_name>
<error_code>-161</error_code>
</file_xfer_error>

</message>
]]>

Validate state Initial
Claimed credit 24.1131324070043
Granted credit 0
===============





Boinc Button Abuser In Training >My Shrubbers<
ID: 2684 · Report as offensive
Profile Scott
Volunteer moderator
Project administrator
Project developer
Joined: 1 Apr 07
Posts: 662
Credit: 13,742
RAC: 0
Message 2685 - Posted: 13 Sep 2007, 1:57:47 UTC

All new WUs are now erroring out after a full run. Appears to happen on all platforms:
Q6600/Windows
AMD X2/Linux
P4 D/Windows

====================
CPU time 6775.63
stderr out

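<core_client_version>5.10.8</core_client_version>
<![CDATA[
<stderr_txt>
wrapper: starting
running camb_1.25_x86_64-pc-linux-gnu
wrapper: running ../../projects/www.cosmologyathome.org/camb_1.25_x86_64-pc-linux-gnu (params.ini)

</stderr_txt>
<message>
<file_xfer_error>
<file_name>wu_091207_150352_0_0_5</file_name>
<error_code>-161</error_code>
</file_xfer_error>

</message>
]]>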

Validate state Initial
Claimed credit 24.1131324070043
Granted credit 0
===============





My memory is faulty; I forgot that this version of CAMB only has 5 output files, instead of the 6 in the new version that I haven't released yet. Therefore, the template file was wrong.

Anyway, I corrected it. Let's see if this helps out.
Scott Kruger
Project Administrator, Cosmology@Home
ID: 2685 · Report as offensive
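
The mismatch Scott describes (a template that lists more output files than the application writes) is the kind of thing a quick pre-flight check could catch before workunits go out. The sketch below is an illustration under stated assumptions, not the project's actual tooling: the template text is a made-up, simplified fragment, and `count_output_files`, `check_template`, and `EXPECTED_OUTPUTS` are hypothetical names. It simply counts `<file_info>` blocks and compares the total with the number of files the application is known to produce.

```python
import re

# Number of output files this CAMB version writes, per the post above.
EXPECTED_OUTPUTS = 5

def count_output_files(template_text):
    """Count <file_info> blocks in a (simplified) result template."""
    return len(re.findall(r"<file_info>", template_text))

def check_template(template_text, expected=EXPECTED_OUTPUTS):
    """Return None if the counts agree, else a human-readable warning."""
    found = count_output_files(template_text)
    if found == expected:
        return None
    return (f"template declares {found} output files, "
            f"but the application writes {expected}")

def make_template(n):
    """Build a hypothetical template fragment declaring n output files."""
    return "\n".join(
        f"<file_info>\n  <name>out_{i}</name>\n</file_info>" for i in range(n)
    )

# Six declared files against an application that writes five.
print(check_template(make_template(6)))
# A matching template passes silently (prints None).
print(check_template(make_template(5)))
```

With six declared files and only five written, there is nothing to upload for the sixth, which would fit the single `wu_091207_150352_0_0_5` failure in the log above. As far as I recall, -161 is `ERR_NOT_FOUND` in BOINC's error numbers, but treat that as an assumption and check it against the BOINC source.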
caferace
Joined: 1 Aug 07
Posts: 24
Credit: 287,830
RAC: 0
Message 2686 - Posted: 13 Sep 2007, 2:21:15 UTC - in response to Message 2685.  


My memory is faulty; I forgot that this version of CAMB only has 5 output files, instead of the 6 in the new version that I haven't released yet. Therefore, the template file was wrong.

Anyway, I corrected it. Let's see if this helps out.


OK, Scott. All my existing WUs were properly aborted. Now I'm getting "no work from project" messages on all my boxes. Example:

9/12/2007 7:17:48 PM|Cosmology@Home|Sending scheduler request: To fetch work
9/12/2007 7:17:48 PM|Cosmology@Home|Requesting 297534 seconds of new work
9/12/2007 7:17:53 PM|Cosmology@Home|Scheduler RPC succeeded [server version 601]
9/12/2007 7:17:53 PM|Cosmology@Home|Message from server: No work sent
9/12/2007 7:17:53 PM|Cosmology@Home|Message from server: (there was work but it was committed to other platforms)
9/12/2007 7:17:53 PM|Cosmology@Home|Deferring communication for 7 sec
9/12/2007 7:17:53 PM|Cosmology@Home|Reason: requested by project
9/12/2007 7:17:53 PM|Cosmology@Home|Deferring communication for 6 min 5 sec
9/12/2007 7:17:53 PM|Cosmology@Home|Reason: no work from project


cheers,

-jim

ID: 2686 · Report as offensive
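
Client message logs like the one above are pipe-delimited (timestamp, project, message), so the server's replies are easy to pull out when comparing several machines. A minimal sketch, assuming that three-field layout holds for every line; `parse_log_line` and `server_messages` are names invented for this example:

```python
from collections import defaultdict

def parse_log_line(line):
    """Split 'timestamp|project|message' into a 3-tuple, or return None."""
    parts = line.strip().split("|", 2)
    if len(parts) != 3:
        return None
    return tuple(parts)

def server_messages(lines):
    """Collect the text of 'Message from server:' lines, keyed by project."""
    prefix = "Message from server: "
    found = defaultdict(list)
    for line in lines:
        parsed = parse_log_line(line)
        if parsed is None:
            continue  # skip anything that is not a three-field log line
        _, project, message = parsed
        if message.startswith(prefix):
            found[project].append(message[len(prefix):])
    return dict(found)

# Lines copied from the log above.
log = """\
9/12/2007 7:17:53 PM|Cosmology@Home|Scheduler RPC succeeded [server version 601]
9/12/2007 7:17:53 PM|Cosmology@Home|Message from server: No work sent
9/12/2007 7:17:53 PM|Cosmology@Home|Message from server: (there was work but it was committed to other platforms)
9/12/2007 7:17:53 PM|Cosmology@Home|Deferring communication for 6 min 5 sec
""".splitlines()

for project, messages in server_messages(log).items():
    print(project, messages)
```

Splitting with a maximum of two splits keeps any `|` characters inside the message text intact.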
Profile Campion

Joined: 3 Aug 07
Posts: 35
Credit: 153,234
RAC: 0
Message 2687 - Posted: 13 Sep 2007, 2:22:24 UTC

Was allowed to download one unit.

Then the "committed to other platforms" messages started (3800-plus units according to the server status page).

Started ok and seems to be crunching.




ID: 2687 · Report as offensive
caferace
Joined: 1 Aug 07
Posts: 24
Credit: 287,830
RAC: 0
Message 2688 - Posted: 13 Sep 2007, 2:37:51 UTC - in response to Message 2687.  

Was allowed to download one unit.

Then the "committed to other platforms" messages started (3800-plus units according to the server status page).

Started ok and seems to be crunching.



Update on mine is the same situation, even on the dual-core boxen.

-jim

ID: 2688 · Report as offensive
Seventh Serenity
Joined: 2 Jul 07
Posts: 5
Credit: 16,070
RAC: 0
Message 2693 - Posted: 13 Sep 2007, 7:02:44 UTC

My P4 is getting the "there was work, but it was committed to other platforms" error too - it's almost dry since it's been getting the error for several hours.
ID: 2693 · Report as offensive