Forums : Technical Support : Is the validator running?

ohiomike
Joined: 17 Jul 07
Posts: 302
Credit: 5,006,319
RAC: 0
Message 3404 - Posted: 24 Oct 2007, 1:47:58 UTC
Last modified: 24 Oct 2007, 1:50:19 UTC

Just curious. I have completed a WU on 2 different machines (both AMD X2s), and they still show pending and "initial". These machines have not had a non-validating WU in 3 or 4 thousand WUs so far.
WU 632327.
Excuse me if I am "jumping the gun" here.


Boinc Button Abuser In Training >My Shrubbers<
ID: 3404
Jayargh
Volunteer moderator
Volunteer tester
Joined: 25 Jun 07
Posts: 508
Credit: 2,282,158
RAC: 0
Message 3406 - Posted: 24 Oct 2007, 1:59:49 UTC
Last modified: 24 Oct 2007, 2:44:51 UTC

It might be that Scott needs to 'tweak' the validator, but usually the 'initial' state means the 2 results don't match well enough and the WU is waiting for a third result to reach its quorum. 'Initial' does not mean they won't validate; it just means a tighter control might be in use now.

EDIT: I now see the same thing as you, ohiomike, with this workunit.
ID: 3406
ohiomike
Joined: 17 Jul 07
Posts: 302
Credit: 5,006,319
RAC: 0
Message 3407 - Posted: 24 Oct 2007, 2:45:12 UTC - in response to Message 3406.  

It seems odd that an AMD 6000+ X2 won't validate against an AMD 5600+ X2, both running the same OS, etc. Neither has ever had a validate error before. We will have to see what the AMD 4600+ X2 it has been sent to thinks.


Boinc Button Abuser In Training >My Shrubbers<
ID: 3407
Jayargh
Volunteer moderator
Volunteer tester
Joined: 25 Jun 07
Posts: 508
Credit: 2,282,158
RAC: 0
Message 3409 - Posted: 24 Oct 2007, 2:50:30 UTC - in response to Message 3407.  
Last modified: 24 Oct 2007, 2:51:38 UTC

In both LHC and Nano I have had results sent out up to the maximum the project allows (5 here) before validating and giving credit to ALL workunits. New application, new ballgame is all :) Or a tweak is needed.
ID: 3409
Scott
Volunteer moderator
Project administrator
Project developer
Joined: 1 Apr 07
Posts: 662
Credit: 13,742
RAC: 0
Message 3412 - Posted: 24 Oct 2007, 3:25:51 UTC

There are too few results to make a judgment about whether the validator is having problems. I'll take a look at it in the morning.
Scott Kruger
Project Administrator, Cosmology@Home
ID: 3412
ohiomike
Joined: 17 Jul 07
Posts: 302
Credit: 5,006,319
RAC: 0
Message 3416 - Posted: 24 Oct 2007, 5:02:20 UTC - in response to Message 3412.  

It looks like none of the WUs are validating initially. Does changing the initial replication to 3 cause the validator to wait for all 3 results to be returned before it checks them?

Boinc Button Abuser In Training >My Shrubbers<
ID: 3416
Jayargh
Volunteer moderator
Volunteer tester
Joined: 25 Jun 07
Posts: 508
Credit: 2,282,158
RAC: 0
Message 3417 - Posted: 24 Oct 2007, 5:14:51 UTC - in response to Message 3416.  

If you look at the unfinished work of any given WU, it only sends out 2 initially. The additional 3rd goes out when the first 2 don't agree, and the project will continue to send a 4th and so on, up to the project limit, until 2 results agree to make the quorum, so 3 may not be the limit. ohiomike, as I said before, they have already been checked but do not agree, and may be sent again.

I suspect the new application's precision may not be up to par with the validator's, causing this snafu. However, Scott said he would look at it in the morning, so it's waiting time.
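
For anyone following along, the replication logic described above can be sketched roughly as follows. This is only an illustration, not the project's actual validator: the function names, the tolerance value, and the state strings are assumptions made up for the example. Only the quorum of 2 and the limit of 5 come from the posts in this thread.

```python
from itertools import combinations


def results_agree(a, b, rel_tol=1e-5):
    """Hypothetical fuzzy comparison: same length and every value within rel_tol."""
    return len(a) == len(b) and all(
        abs(x - y) <= rel_tol * max(abs(x), abs(y), 1.0) for x, y in zip(a, b)
    )


def check_workunit(returned, min_quorum=2, max_results=5):
    """Return (state, indices of the agreeing results) for one workunit.

    returned    -- list of results, each a list of floats
    min_quorum  -- how many results must agree (2 in this thread)
    max_results -- assumed project limit on returned results (5 in this thread)
    """
    if len(returned) >= min_quorum:
        # Look for any group of min_quorum results that all match the first one.
        for group in combinations(range(len(returned)), min_quorum):
            first = returned[group[0]]
            if all(results_agree(first, returned[i]) for i in group[1:]):
                return "valid", list(group)
    if len(returned) >= max_results:
        # Limit reached without agreement: nothing more is sent out.
        return "too many successful results", []
    # No agreement yet: results stay "initial" and another copy is issued.
    return "initial", []
```

With two results that differ, `check_workunit` reports "initial" and a third copy would go out; once any two of the returned results agree, it reports "valid" together with the indices of the matching pair.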
ID: 3417
STE\/E
Volunteer tester
Joined: 12 Jun 07
Posts: 375
Credit: 16,539,257
RAC: 0
Message 3419 - Posted: 24 Oct 2007, 8:32:43 UTC

I have turned in 86 of the new WUs with none validated so far. Some of them have 3 returned results, many have 2.
ID: 3419
Nothing But Idle Time
Joined: 27 Aug 07
Posts: 84
Credit: 148,380
RAC: 0
Message 3421 - Posted: 24 Oct 2007, 11:26:01 UTC

Shouldn't there be some data in the results? All I see in the two results (now sent to a third host) for one of my WUs is:

<core_client_version>5.10.20</core_client_version>
<![CDATA[
<stderr_txt>

</stderr_txt>
]]>


<core_client_version>5.10.13</core_client_version>
<![CDATA[
<stderr_txt>

</stderr_txt>
]]>


I'm used to seeing something like this:

<core_client_version>5.10.13</core_client_version>
<![CDATA[
<stderr_txt>
running camb_1.25_windows_intelx86.exe

**********
**********

Memory Leaks Detected!!!

Memory Statistics:
0 bytes in 0 Free Blocks.
322 bytes in 5 Normal Blocks.
4676 bytes in 4 CRT Blocks.
0 bytes in 0 Ignore Blocks.
0 bytes in 0 Client Blocks.
Largest number used: 9582 bytes.
...etc

Dumping objects ->
{76} normal block at 0x00976878, 32 bytes long.
...etc
</stderr_txt>
]]>
ID: 3421
ohiomike
Joined: 17 Jul 07
Posts: 302
Credit: 5,006,319
RAC: 0
Message 3423 - Posted: 24 Oct 2007, 11:57:13 UTC
Last modified: 24 Oct 2007, 12:43:55 UTC

So far I have collected 60 pending results with 1.26 and 1.27.
At least 1 WU has gotten 0 credits because of "Too many successful results".
As far as I can tell, no WU has validated yet on any of the 6 machines I have running.



Boinc Button Abuser In Training >My Shrubbers<
ID: 3423
Campion
Joined: 3 Aug 07
Posts: 35
Credit: 153,234
RAC: 0
Message 3425 - Posted: 24 Oct 2007, 12:38:48 UTC

I was looking at some of my results last night and saw that I too was getting "Memory Leaks detected".

Sounds bad, is it?

Would hate to be returning junk results.




ID: 3425
ohiomike
Joined: 17 Jul 07
Posts: 302
Credit: 5,006,319
RAC: 0
Message 3426 - Posted: 24 Oct 2007, 12:41:56 UTC - in response to Message 3425.  
Last modified: 24 Oct 2007, 12:43:06 UTC

The "Memory Leaks detected" message was standard for the 1.25 app, so it is not an issue.
Unfortunately, if you look at your new results, nothing running the 1.26 or 1.27 apps has validated yet; that may be an issue.

Boinc Button Abuser In Training >My Shrubbers<
ID: 3426
Scott
Volunteer moderator
Project administrator
Project developer
Joined: 1 Apr 07
Posts: 662
Credit: 13,742
RAC: 0
Message 3431 - Posted: 24 Oct 2007, 14:27:08 UTC
Last modified: 24 Oct 2007, 15:32:38 UTC

The issue should now be fixed. Apparently, there were some magic numbers in the validator code that didn't apply to the new results. I've changed them to be more general, so now we should see some validation.

EDIT: More fixes.
Scott Kruger
Project Administrator, Cosmology@Home
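
As an illustration of how a "magic number" in a comparison routine can stop new results from validating, here is a minimal sketch. It is purely hypothetical: the actual validator code is not shown in this thread, and the row count, tolerance, and function names below are assumptions chosen for the example.

```python
import math

# Hypothetical magic number: the expected number of output values is baked in,
# so results from an application with a different output layout never match.
EXPECTED_ROWS = 3  # assumption for illustration only


def compare_fixed(a, b, tol=1e-5):
    """Comparison that silently rejects anything not shaped like the old output."""
    if len(a) != EXPECTED_ROWS or len(b) != EXPECTED_ROWS:
        return False
    return all(math.isclose(x, y, rel_tol=tol) for x, y in zip(a, b))


def compare_general(a, b, tol=1e-5):
    """More general comparison: only require that both results share a shape."""
    if len(a) != len(b) or not a:
        return False
    return all(math.isclose(x, y, rel_tol=tol) for x, y in zip(a, b))


old_result = [1.0, 2.0, 3.0]        # layout the fixed check was written for
new_result = [1.0, 2.0, 3.0, 4.0]   # layout from a hypothetical newer app

print(compare_fixed(old_result, old_result))    # True
print(compare_fixed(new_result, new_result))    # False, despite identical data
print(compare_general(new_result, new_result))  # True
```

In the fixed version, two identical results from the newer layout still fail to match, which would leave every workunit waiting for more results in the way reported above.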
ID: 3431
ohiomike
Joined: 17 Jul 07
Posts: 302
Credit: 5,006,319
RAC: 0
Message 3458 - Posted: 24 Oct 2007, 18:18:50 UTC - in response to Message 3431.  
Last modified: 24 Oct 2007, 18:28:01 UTC

Could you look at 631387?
It was sent to 4 different machines, all returned "success", but 0 credit was granted because of "too many successful results". I am curious because 2 of the machines are mine and have each run > 3000 WUs without a validation error.
Also found 632225 which is the same case on one of my Windows machines.

Boinc Button Abuser In Training >My Shrubbers<
ID: 3458
Scott
Volunteer moderator
Project administrator
Project developer
Joined: 1 Apr 07
Posts: 662
Credit: 13,742
RAC: 0
Message 3462 - Posted: 24 Oct 2007, 18:55:16 UTC - in response to Message 3458.  

That had to do with the validator shenanigans earlier. All results are now considered valid and credit has been granted to users/teams.

Sorry about that.
Scott Kruger
Project Administrator, Cosmology@Home
ID: 3462
Conan
Joined: 28 Aug 07
Posts: 169
Credit: 1,333,424
RAC: 872
Message 3520 - Posted: 26 Oct 2007, 11:18:19 UTC - in response to Message 3462.  

Scott, here are 4 results that have been given zero credit even though they look OK to me. They are still in an 'initial' state but have been given no points. In each case the other two results have been granted credit.
1343362
1346137
1345647
1345649

Plus, I would still like my computer totals to add up to my overall total; they still differ by 4,000. I like to know what my computer outputs are, and now I can't work it out, as they are missing hundreds of points from their individual totals.
ID: 3520
speedimic
Joined: 9 Sep 07
Posts: 89
Credit: 2,201,260
RAC: 0
Message 3522 - Posted: 26 Oct 2007, 15:48:56 UTC

I think someone should merge the threads...

Similar problems here.


mic.


ID: 3522
Conan
Joined: 28 Aug 07
Posts: 169
Credit: 1,333,424
RAC: 872
Message 3562 - Posted: 27 Oct 2007, 3:04:34 UTC - in response to Message 3520.  

Here is another: 1345766

Also, the difference between my computer totals and my overall total amounts to the pending work units that you gave credit for when switching over from the old application to the new one. My difference is 4,000 and I have added up 3,900 from my result tables, so I probably missed one: Host 2573 is missing 1,200 (I suspect 1,300, as I think an old WU has been deleted), Host 4475 is missing 1,500 and Host 4648 is missing 1,200.
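
The host figures quoted above can be checked with a few lines of arithmetic. This is just a sketch using the numbers from this post; the variable names are made up for the example.

```python
# Credit missing per host, as listed in the post (host ID -> credits).
missing = {2573: 1200, 4475: 1500, 4648: 1200}
overall_gap = 4000  # difference between the overall total and the host totals

counted = sum(missing.values())      # credits accounted for so far
unexplained = overall_gap - counted  # credits still unaccounted for

# If Host 2573 is really short 1,300 (the suspected deleted WU), the gap closes.
adjusted = dict(missing)
adjusted[2573] = 1300

print(counted, unexplained, sum(adjusted.values()) == overall_gap)
# prints: 3900 100 True
```

The leftover 100 credits would be consistent with the suspicion that a single work unit is missing from the tally.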

Here is one, Result 1223744, that I can't explain.
I was sent this result on 10/10/07 and returned it on 11/10/07 from Host 2573. As it had not been validated because no one else had finished the WU, I should have gotten 100 credits before the new application started.
It had also been sent to Host 3473 and was "cancelled didn't need" as that host had not returned it yet.
Now for some reason the WU was created again on 26/10 and sent out on 26/10 to Hosts 568 and 6064.
When Host 568 returned this new WU (CAMB 2.00, I think) it got the 100 credits and I got 0.00.
Host 6064 has not completed yet.
What has happened here?
ID: 3562
Jayargh
Volunteer moderator
Volunteer tester
Joined: 25 Jun 07
Posts: 508
Credit: 2,282,158
RAC: 0
Message 3563 - Posted: 27 Oct 2007, 3:17:35 UTC

Conan, please see my post in the sticky "weird validation" thread.
ID: 3563
Conan
Joined: 28 Aug 07
Posts: 169
Credit: 1,333,424
RAC: 872
Message 3564 - Posted: 27 Oct 2007, 7:10:57 UTC - in response to Message 3563.  
Last modified: 27 Oct 2007, 7:11:40 UTC

That is OK, but the last one and this one I have listed were both 102307-type work units, and they state that they are CAMB 2.00 work units, yet they don't validate and give 0.00 credit.
ID: 3564