Forums : Technical Support : URGENT Problems Discussion Thread

XaaK
Joined: 26 Aug 07
Posts: 17
Credit: 2,229,710
RAC: 0
Message 5629 - Posted: 31 Mar 2008, 16:43:26 UTC - in response to Message 5628.  



I even wonder if Ben Wandelt http://cosmos.astro.uiuc.edu/pBen.php?style=explore
cares there is a problem with his project...


There, I fixed it for you. ;-)
ID: 5629

XaaK
Joined: 26 Aug 07
Posts: 17
Credit: 2,229,710
RAC: 0
Message 5631 - Posted: 31 Mar 2008, 17:12:59 UTC
Last modified: 31 Mar 2008, 17:19:14 UTC

Ok, I was just able to get into my pendings by playing around with some timeout values and using a custom program to get web pages.

edit: for exact stats

Current count = 2500 pending work units
Pending credit: 35,262.64
Average claim : 14.105


I'm getting some work, and uploading seems better than it was before, but there are still problems, along with the obviously huge backlog of pendings.
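
For anyone who wants to try the same trick, here is a minimal sketch of fetching a slow page with progressively longer timeouts. It is not the custom program mentioned above; the function name `fetch_with_timeouts` and the timeout values are assumptions chosen for illustration.

```python
import socket
import urllib.error
import urllib.request


def fetch_with_timeouts(url, timeouts=(30, 120, 300), opener=urllib.request.urlopen):
    """Try to fetch url, allowing a longer timeout on each attempt.

    Returns the response body as bytes, or None if every attempt failed.
    The timeout values are illustrative guesses, not tuned numbers.
    """
    for timeout in timeouts:
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except (socket.timeout, urllib.error.URLError) as exc:
            # Give up on this attempt and retry with the next, longer timeout.
            print(f"attempt with timeout={timeout}s failed: {exc}")
    return None
```

Call it with the address of whatever page keeps timing out in the browser. The `opener` parameter is only there so the function can be exercised without hitting the real server.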
ID: 5631

m4rtyn
Joined: 23 Aug 07
Posts: 18
Credit: 372,460
RAC: 0
Message 5632 - Posted: 31 Mar 2008, 17:57:31 UTC - in response to Message 5631.  

I'm getting some work, and uploading seems better than it was before,


I suspect the reason for that is that many users have realised that Cosmo is a dead duck and abandoned it for projects with admins who have a bit more respect for their contributors, thus lightening the load on the servers.


m4rtyn
************************** *************************
ID: 5632

Westsail and *Pyxey*
Joined: 19 Dec 07
Posts: 24
Credit: 889,050
RAC: 0
Message 5635 - Posted: 31 Mar 2008, 19:55:58 UTC

LOL @ the current server status page

"Transitioner backlog (hours) 335,276"

Hmm, guess it may be a while before we get credit for all this work in limbo.

They could start a new side DC project to help out...
Call it CosmosTransitioner@home or just CT@home.

;-)
ID: 5635

kevint
Joined: 30 Aug 07
Posts: 46
Credit: 6,502,980
RAC: 0
Message 5636 - Posted: 1 Apr 2008, 0:37:39 UTC

HUH ?????

Does this mean that these WUs are LOST ???

3/31/2008 6:12:06 PM|Cosmology@Home|Sending scheduler request: To fetch work. Requesting 755041 seconds of work, reporting 39 completed tasks
3/31/2008 6:17:19 PM||Project communication failed: attempting access to reference site
3/31/2008 6:17:20 PM|Cosmology@Home|Scheduler request failed: Timeout was reached
3/31/2008 6:17:21 PM||Access to reference site succeeded - project servers may be temporarily down.
3/31/2008 6:25:22 PM|Cosmology@Home|Sending scheduler request: Requested by user. Requesting 757986 seconds of work, reporting 39 completed tasks
3/31/2008 6:27:17 PM|Cosmology@Home|Scheduler request succeeded: got 0 new tasks
3/31/2008 6:27:17 PM|Cosmology@Home|You used the wrong URL for this project
3/31/2008 6:27:17 PM|Cosmology@Home|The correct URL is http://www.cosmologyathome.org/
3/31/2008 6:27:17 PM|Cosmology@Home|Using the wrong URL can cause problems in some cases.
3/31/2008 6:27:17 PM|Cosmology@Home|When convenient, detach this project, then reattach to http://www.cosmologyathome.org/
3/31/2008 6:27:17 PM|Cosmology@Home|Message from server: Completed result wu_032608_172616_1_0 refused: result already reported as success
3/31/2008 6:27:17 PM|Cosmology@Home|Message from server: Completed result wu_032608_172859_1_0 refused: result already reported as success
3/31/2008 6:27:17 PM|Cosmology@Home|Message from server: Completed result wu_032608_172311_0_0 refused: result already reported as success
3/31/2008 6:27:17 PM|Cosmology@Home|Message from server: Completed result wu_032608_172905_1_0 refused: result already reported as success
3/31/2008 6:27:17 PM|Cosmology@Home|Message from server: Completed result wu_032608_172622_1_0 refused: result already reported as success
3/31/2008 6:27:17 PM|Cosmology@Home|Message from server: Completed result wu_032608_172258_0_1 refused: result already reported as success

ID: 5636

Ananas
Joined: 19 Jan 08
Posts: 180
Credit: 2,500,290
RAC: 0
Message 5637 - Posted: 1 Apr 2008, 0:44:31 UTC - in response to Message 5636.  

HUH ?????

Does this mean that these WUs are LOST ???

...Message from server: Completed result wu_032608_172616_1_0 refused: result already reported as success ...


I don't think so. This usually means that your host reported the results to the scheduler and the report itself worked, but the scheduler's reply never made it back to your host.

So the results have been reported to the server successfully, but your host didn't delete them because it didn't know about the success.

Your host is now retrying the report, but since the first attempt already worked, the server won't accept the same report a second time.
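
The sequence described above can be sketched with a toy model. This is not BOINC's scheduler code; the `ToyScheduler` class and its replies are assumptions that only mimic the message seen in the log.

```python
class ToyScheduler:
    """Toy stand-in for a project scheduler; not BOINC's actual implementation."""

    def __init__(self):
        self.reported = set()

    def report(self, result_name):
        # A result that is already on record is refused, not stored twice.
        if result_name in self.reported:
            return "refused: result already reported as success"
        self.reported.add(result_name)
        return "accepted"


scheduler = ToyScheduler()

# First attempt: the server records the result, but assume the reply is lost
# on the way back, so the client sees a timeout and keeps the result queued.
first_reply = scheduler.report("wu_032608_172616_1_0")

# Retry: the client reports the same result again.
second_reply = scheduler.report("wu_032608_172616_1_0")

print(first_reply)
print(second_reply)
```

In this model the work is on record after the first attempt, so the refusal on the retry does not mean anything was lost.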
ID: 5637

Jayargh
Volunteer moderator
Volunteer tester
Joined: 25 Jun 07
Posts: 508
Credit: 2,282,158
RAC: 0
Message 5639 - Posted: 1 Apr 2008, 2:16:06 UTC

At least I can view my pendings now: I have about 320 pending results. The validation rate has not picked up much yet.
ID: 5639

Scott
Volunteer moderator
Project administrator
Project developer
Joined: 1 Apr 07
Posts: 662
Credit: 13,742
RAC: 0
Message 5640 - Posted: 1 Apr 2008, 4:23:22 UTC

Here's the scoop on the weekend shenanigans:

I left on Thursday to visit the UWM physics department (home of Einstein@Home). I made sure not to make any changes so things would at least stay at the status quo while I was gone. However, it just so happened that the mysql server decided to blow up (figuratively) while I was gone.

After a number of hours working on the machine today, I finally figured out what the problem was. Debian plays around with the mysql startup script a bit, adding a check of the database integrity. However, the integrity check faltered at some point and stalled. Because it was just sitting there with a hold on the results table, queries kept piling up and at some point, the number of pending queries exceeded the maximum number of connections and everything went to hell.

I removed Debian's unnecessary check, increased the max_connections parameter, and increased the share of the server's memory allotted to mysql. Hopefully this will prevent more issues in the future.

The server is pretty overworked now as you can imagine, so I think it would be best to wait until tomorrow to look into the validation issue.
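
For the curious, the pile-up described above can be illustrated with a toy simulation. It does not talk to MySQL at all; the function `simulate_pileup` and the numbers passed to it are assumptions made up for illustration.

```python
def simulate_pileup(arrivals_per_tick, max_connections, lock_held_ticks):
    """Count open and refused connections while a table lock is never released.

    Every arriving query takes a connection and then blocks on the lock, so
    connections are only ever added, never freed, for the whole duration.
    """
    open_connections = 0
    refused = 0
    for _ in range(lock_held_ticks):
        for _ in range(arrivals_per_tick):
            if open_connections < max_connections:
                open_connections += 1  # query connects, then waits on the lock
            else:
                refused += 1           # limit reached: the new connection is rejected
    return open_connections, refused


# Hypothetical load: 5 new queries per tick, a limit of 100, lock held for 30 ticks.
open_now, refused_now = simulate_pileup(5, 100, 30)
print(open_now, refused_now)
```

Raising the limit only delays the point where connections start being refused, which is why removing the stalled check matters more than the larger max_connections value.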
Scott Kruger
Project Administrator, Cosmology@Home
ID: 5640

Ananas
Joined: 19 Jan 08
Posts: 180
Credit: 2,500,290
RAC: 0
Message 5644 - Posted: 1 Apr 2008, 7:12:47 UTC

There is a sample "robots.txt" in the BOINC server package. You should consider installing that. Allowing search engines on all database-driven pages can generate quite high peaks in the database load.
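
As a rough illustration of what such a file does, here is a sketch that checks a couple of URLs against some rules using Python's standard robots.txt parser. The rules below are hypothetical; the sample file shipped with the BOINC server may list different paths.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; the real sample file may differ.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /forum_thread.php
Disallow: /results.php
"""

parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS_TXT.splitlines())

BASE = "http://www.cosmologyathome.org"

# A database-driven page that well-behaved crawlers should skip.
forum_allowed = parser.can_fetch("*", BASE + "/forum_thread.php?id=1")
# A page that is not listed and therefore stays crawlable.
index_allowed = parser.can_fetch("*", BASE + "/index.php")

print(forum_allowed, index_allowed)
```

Keep in mind that robots.txt is only a request; it reduces load from crawlers that honour it and does nothing against the ones that don't.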
ID: 5644

XaaK
Joined: 26 Aug 07
Posts: 17
Credit: 2,229,710
RAC: 0
Message 5645 - Posted: 1 Apr 2008, 13:41:43 UTC - in response to Message 5640.  
Last modified: 1 Apr 2008, 14:07:10 UTC

Here's the scoop on the weekend shenanigans:

I left on Thursday to visit the UWM physics department (home of Einstein@Home).
...

I'm pretty sure UWM has things called computers, and they're most likely connected to something called "The Internet". In case you weren't aware, you can use those computers that are connected to "The Internet" to look at web pages, just like this one. Had you been aware of these amazing technological developments, you would have been able to look at this page, and even post a message saying "Hey, I'm out of town and I'll be sure to have a look at it on Monday. Until then, hang in there".

Now, doing that would have taken, what, maybe 5 minutes? Is this project so trivial to you that you couldn't spare a few minutes from your amazingly exciting trip to UWM?

In case you don't understand the concept, let me explain further. Most of us users will tolerate problems with projects quite well if the project team at least pretends to care about the project.
ID: 5645

Scott
Joined: 31 Oct 07
Posts: 3
Credit: 108,180
RAC: 0
Message 5646 - Posted: 1 Apr 2008, 14:00:50 UTC - in response to Message 5645.  


I'm pretty sure UWM has things called computers, and they're most likely connected to something called "The Internet". In case you weren't aware, you can use those computers that are connected to "The Internet" to look at web pages, just like this one. Had you been aware of these amazing technological developments, you would have been able to look at this page, and even post a message saying "Hey, I'm out of town and I'll be sure to have a look at it on Monday. Until then, hang in there".


LOL! Amazingly to the point.
ID: 5646

sysfried
Joined: 24 Jun 07
Posts: 114
Credit: 5,294,457
RAC: 3
Message 5649 - Posted: 1 Apr 2008, 14:26:31 UTC - in response to Message 5646.  


I'm pretty sure UWM has things called computers, and they're most likely connected to something called "The Internet". In case you weren't aware, you can use those computers that are connected to "The Internet" to look at web pages, just like this one. Had you been aware of these amazing technological developments, you would have been able to look at this page, and even post a message saying "Hey, I'm out of town and I'll be sure to have a look at it on Monday. Until then, hang in there".


LOL! Amazingly to the point.


Scott is one of the BOINC project admins who really does care. Give him a break!
I'm sure many members who have been here for a while will agree. Scott is taking care of it and will fix it. He always has!

Best wishes,

Sysfried

Happy member of Team: Planet 3D Now!

ID: 5649

kevint
Joined: 30 Aug 07
Posts: 46
Credit: 6,502,980
RAC: 0
Message 5650 - Posted: 1 Apr 2008, 14:34:12 UTC - in response to Message 5640.  

Here's the scoop on the weekend shenanigans:



Scott,

Glad that you could get it back and running. Things seem to be better now, at least my machines are reporting back.
I have not checked to see if the pending units have been validated -

In the future, it would be nice to have someone check in once in a while, and at least make a post so we know someone is alive on that side of the wire. These BOINC projects, as you know, need some babysitting.

Again,
Thanks for your work -

ID: 5650

Brian D from Georgia
Joined: 25 Aug 07
Posts: 8
Credit: 345,890
RAC: 0
Message 5651 - Posted: 1 Apr 2008, 16:49:02 UTC

This is a good sign! The WUs in the validator queue are finally starting to drain. As of this posting, the number was 43,882. That number had been steadily increasing over the last 3 days, but it crested a couple of hours ago and is now receding.
ID: 5651

m4rtyn
Joined: 23 Aug 07
Posts: 18
Credit: 372,460
RAC: 0
Message 5654 - Posted: 1 Apr 2008, 17:56:16 UTC

As far as I can see, apart from the (very) slowly dropping number of WUs waiting to be validated, not much else has changed. All my hosts at two different locations are getting a lot of download errors.

01/04/2008 05:27:53|Cosmology@Home|Started download of params_033008_213410_1.ini
01/04/2008 05:27:55|Cosmology@Home|Finished download of params_033008_213410_1.ini
01/04/2008 05:27:55|Cosmology@Home|[error] MD5 check failed for params_033008_213410_1.ini
01/04/2008 05:27:55|Cosmology@Home|[error] expected d41d8cd98f00b204e9800998ecf8427e, got 7d048a8145a783dc36c1bfce032a347f
01/04/2008 05:27:55|Cosmology@Home|[error] Checksum or signature error for params_033008_213410_1.ini

And if not that then,

01/04/2008 05:28:42|Cosmology@Home|Message from server: No work sent
01/04/2008 05:28:42|Cosmology@Home|Message from server: (there was work but it was committed to other platforms)

I've even had a few "unable to open database" errors today.

Although reporting and uploads seem OK, nothing but old WUs from before the weekend are being validated.
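
One detail about the MD5 errors in the first log excerpt above: the "expected" value is the MD5 digest of empty input. That may mean the server recorded a checksum for an empty file, though that is only a guess from the log. A quick check, with `md5_of_file` as a hypothetical helper for comparing a downloaded file by hand:

```python
import hashlib

# The "expected" value copied from the error message in the log.
EXPECTED_IN_LOG = "d41d8cd98f00b204e9800998ecf8427e"

# MD5 digest of zero bytes of input.
empty_digest = hashlib.md5(b"").hexdigest()
print(empty_digest == EXPECTED_IN_LOG)


def md5_of_file(path):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

If the downloaded params file is not empty, its digest can never match that expected value, which would explain why every retry fails the same way.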
m4rtyn
************************** *************************
ID: 5654

Stwainer
Joined: 21 Jun 07
Posts: 18
Credit: 536,245
RAC: 0
Message 5655 - Posted: 1 Apr 2008, 18:04:55 UTC

I'm sure glad I'm not the admin of a BOINC project. It seems that people expect you to be on call 24/7 and if you don't tell complete strangers you'll be out of town for a few days and disconnected from the project, they get their panties in a bind. I guess admins can't have any other bit of life outside the project or 'forget' about the project for a few days.

Thank you Scott for fixing the problem quickly when you returned.
ID: 5655

the silver surfer
Joined: 11 Jan 08
Posts: 9
Credit: 48,970
RAC: 0
Message 5656 - Posted: 1 Apr 2008, 18:12:11 UTC
Last modified: 1 Apr 2008, 18:16:55 UTC

Attached a comp to COSMO a few minutes ago - all downloads o.k., started crunching the first WU. So everything seems to be stable on my side (at least as far as I can tell).

Kurt

BTW : Thank You, Scott!

ID: 5656

Nothing But Idle Time
Joined: 27 Aug 07
Posts: 84
Credit: 148,380
RAC: 0
Message 5657 - Posted: 1 Apr 2008, 19:13:06 UTC - in response to Message 5655.  

I'm sure glad I'm not the admin of a BOINC project. It seems that people expect you to be on call 24/7...
If you have a child (project) and you are the parent (project administrator) then yes, you are on duty and responsible for what happens 24/7; that's just the way it is.
...and if you don't tell complete strangers you'll be out of town for a few days and disconnected from the project, they get their panties in a bind. I guess admins can't have any other bit of life outside the project or 'forget' about the project for a few days.
Again, if you're the parent of older children then at least phone home and check in occasionally; if the children are young then you get someone to oversee things while you are away. Responsibility doesn't end at the door.
ID: 5657

m4rtyn
Joined: 23 Aug 07
Posts: 18
Credit: 372,460
RAC: 0
Message 5658 - Posted: 1 Apr 2008, 20:18:02 UTC - in response to Message 5655.  

I'm sure glad I'm not the admin of a BOINC project. It seems that people expect you to be on call 24/7 and if you don't tell complete strangers you'll be out of town for a few days and disconnected from the project, they get their panties in a bind. I guess admins can't have any other bit of life outside the project or 'forget' about the project for a few days.


Nobody has suggested 24/7 monitoring of the project, but some people donate a lot of time and expense to the project. In return it's hardly too much to expect someone to look in at least every couple of days. Being away from the project is no excuse, as it takes only two minutes to check out the forums from anywhere in the world.

m4rtyn
************************** *************************
ID: 5658

XaaK
Joined: 26 Aug 07
Posts: 17
Credit: 2,229,710
RAC: 0
Message 5660 - Posted: 1 Apr 2008, 21:01:11 UTC - in response to Message 5655.  



said by Stwainer
I'm sure glad I'm not the admin of a BOINC project.
...


So am I.
ID: 5660