URGENT Problems Discussion Thread
Author | Message |
---|---|
Joined: 26 Aug 07 Posts: 17 Credit: 2,229,710 RAC: 0 |
There, I fixed it for you. ;-) |
Joined: 26 Aug 07 Posts: 17 Credit: 2,229,710 RAC: 0 |
OK, I was just able to get into my pendings by playing around with some timeout values and using a custom program to fetch web pages. Edit, for exact stats: current count = 2500 pending work units; pending credit: 35,262.64; average claim: 14.105. I'm getting some work, and uploading seems better than it was before, but there are still problems, along with the obviously huge backlog of pendings. |
Joined: 23 Aug 07 Posts: 18 Credit: 372,460 RAC: 0 |
"I'm getting some work, and uploading seems better than it was before," I suspect the reason for that is that many users have realised that Cosmo is a dead duck and abandoned it for projects with admins who have a bit more respect for their contributors, thus lightening the load on the servers. m4rtyn |
Joined: 19 Dec 07 Posts: 24 Credit: 889,050 RAC: 0 |
LOL @ the current server status page: "Transitioner backlog (hours) 335,276". Hmm, guess it may be a while before we get credit for all this work in limbo. They could start a new side DC project to help out... Call it CosmosTransitioner@home or just CT@home. ;-) |
Joined: 30 Aug 07 Posts: 46 Credit: 6,502,980 RAC: 0 |
HUH ????? Does this mean that these WUs are LOST ??? 3/31/2008 6:12:06 PM|Cosmology@Home|Sending scheduler request: To fetch work. Requesting 755041 seconds of work, reporting 39 completed tasks 3/31/2008 6:17:19 PM||Project communication failed: attempting access to reference site 3/31/2008 6:17:20 PM|Cosmology@Home|Scheduler request failed: Timeout was reached 3/31/2008 6:17:21 PM||Access to reference site succeeded - project servers may be temporarily down. 3/31/2008 6:25:22 PM|Cosmology@Home|Sending scheduler request: Requested by user. Requesting 757986 seconds of work, reporting 39 completed tasks 3/31/2008 6:27:17 PM|Cosmology@Home|Scheduler request succeeded: got 0 new tasks 3/31/2008 6:27:17 PM|Cosmology@Home|You used the wrong URL for this project 3/31/2008 6:27:17 PM|Cosmology@Home|The correct URL is http://www.cosmologyathome.org/ 3/31/2008 6:27:17 PM|Cosmology@Home|Using the wrong URL can cause problems in some cases. 3/31/2008 6:27:17 PM|Cosmology@Home|When convenient, detach this project, then reattach to http://www.cosmologyathome.org/ 3/31/2008 6:27:17 PM|Cosmology@Home|Message from server: Completed result wu_032608_172616_1_0 refused: result already reported as success 3/31/2008 6:27:17 PM|Cosmology@Home|Message from server: Completed result wu_032608_172859_1_0 refused: result already reported as success 3/31/2008 6:27:17 PM|Cosmology@Home|Message from server: Completed result wu_032608_172311_0_0 refused: result already reported as success 3/31/2008 6:27:17 PM|Cosmology@Home|Message from server: Completed result wu_032608_172905_1_0 refused: result already reported as success 3/31/2008 6:27:17 PM|Cosmology@Home|Message from server: Completed result wu_032608_172622_1_0 refused: result already reported as success 3/31/2008 6:27:17 PM|Cosmology@Home|Message from server: Completed result wu_032608_172258_0_1 refused: result already reported as success |
Joined: 19 Jan 08 Posts: 180 Credit: 2,500,290 RAC: 0 |
"HUH ?????" I don't think so. This usually means that your host reported the results to the scheduler and the report itself worked, but the scheduler's reply didn't make it back to your host. So the results have been reported to the server successfully, but your host didn't delete them because it didn't know about the success. Your host is now retrying the report, but since the first attempt already worked, the server doesn't accept the report a second time. |
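The explanation above can be sketched as a toy model. This is only an illustration of the lost-reply scenario, not BOINC's actual scheduler code: the function names, the "accepted" reply, and the assumption that the client drops a result once the server says it already has it are all made up for the example. Only the refusal message text and the result name come from the log above.

```python
def scheduler_report(reported, result_name):
    """Toy scheduler (hypothetical): accept each result once, refuse repeats."""
    if result_name in reported:
        return (f"Completed result {result_name} refused: "
                "result already reported as success")
    reported.add(result_name)
    return "accepted"  # hypothetical reply text, not a real BOINC message


def client_report_all(reported, pending, reply_lost=False):
    """Toy client (hypothetical): report every pending result.

    If the reply is lost (for example a timeout), the client never learns
    that the server stored the result, so it keeps the result and retries.
    """
    replies = []
    for name in list(pending):
        reply = scheduler_report(reported, name)
        if reply_lost:
            continue  # the server has the result, the client does not know
        replies.append(reply)
        pending.remove(name)  # assumed: any reply means the server has it
    return replies


server_side = set()
client_side = ["wu_032608_172616_1_0"]

# First attempt: the server records the result, but the reply times out.
client_report_all(server_side, client_side, reply_lost=True)

# Retry: the server already has the result, so it refuses the duplicate.
second_try = client_report_all(server_side, client_side)
print(second_try[0])
```

In this model the work is not lost: the result is already stored on the server after the first attempt, and the refusal on the retry just tells the client so.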
Volunteer moderator Volunteer tester Joined: 25 Jun 07 Posts: 508 Credit: 2,282,158 RAC: 0 |
At least I can now see that I have about 320 pending results... the validation rate has not picked up much yet. |
Volunteer moderator Project administrator Project developer Joined: 1 Apr 07 Posts: 662 Credit: 13,742 RAC: 0 |
Here's the scoop on the weekend shenanigans: I left on Thursday to visit the UWM physics department (home of Einstein@Home). I made sure not to make any changes so things would at least stay at the status quo while I was gone. However, it just so happened that the MySQL server decided to blow up (figuratively) while I was gone. After a number of hours working on the machine today, I finally figured out what the problem was. Debian plays around with the MySQL startup script a bit, adding a check of the database integrity. However, the integrity check faltered at some point and stalled. Because it was just sitting there with a hold on the results table, queries kept piling up, and at some point the number of pending queries exceeded the maximum number of connections and everything went to hell. I removed Debian's unnecessary check, increased the max_connections parameter, and increased the share of the server's memory allotted to MySQL. Hopefully this will prevent more issues in the future. The server is pretty overworked now, as you can imagine, so I think it would be best to wait until tomorrow to look into the validation issue. Scott Kruger Project Administrator, Cosmology@Home |
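The pile-up described above can be illustrated with a toy model. This is a rough sketch, not how MySQL actually schedules connections: the function name and all numbers are invented for the example, and it assumes every blocked query simply holds its connection until the stalled check releases the table.

```python
def first_refused_tick(max_connections, arrivals_per_tick, stalled_ticks):
    """Return the tick at which a new query is first refused, or None.

    Toy assumption: while the table is held, every arriving query blocks
    and keeps its connection open, so open connections only ever grow.
    """
    open_connections = 0
    for tick in range(1, stalled_ticks + 1):
        for _ in range(arrivals_per_tick):
            if open_connections >= max_connections:
                return tick  # connection limit reached, new queries fail
            open_connections += 1
    return None  # the stall ended before the limit was reached


# Invented numbers: 10 new queries per tick, table held for 60 ticks.
low_limit = first_refused_tick(100, 10, 60)    # limit is hit during the stall
high_limit = first_refused_tick(1000, 10, 60)  # limit is never hit
print(low_limit, high_limit)
```

Under these assumptions a higher connection limit only buys time while the table stays held, which is consistent with removing the stalling check as well as raising max_connections.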
Joined: 19 Jan 08 Posts: 180 Credit: 2,500,290 RAC: 0 |
There is a sample "robots.txt" in the BOINC server package. You should consider installing that. Allowing search engines on all database-driven pages can generate quite high peaks in the database load. |
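As a rough illustration of what such a file does, the sketch below feeds a small robots.txt to Python's standard urllib.robotparser and asks whether a crawler may fetch a page. The rules and page names here are hypothetical and are not copied from the sample file shipped with BOINC, and "ExampleBot" is a made-up crawler name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep crawlers away from two database-driven pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /results.php
Disallow: /workunit.php
"""


def crawler_may_fetch(url, robots_txt=ROBOTS_TXT, agent="ExampleBot"):
    """Return True if a well-behaved crawler is allowed to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


blocked = crawler_may_fetch("http://www.cosmologyathome.org/results.php")
allowed = crawler_may_fetch("http://www.cosmologyathome.org/index.php")
print(blocked, allowed)
```

Note that robots.txt only helps with crawlers that choose to honour it; it is a request, not an access control.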
Joined: 26 Aug 07 Posts: 17 Credit: 2,229,710 RAC: 0 |
"Here's the scoop on the weekend shenanigans:" I'm pretty sure UWM has things called computers, and they're most likely connected to something called "The Internet". In case you weren't aware, you can use those computers that are connected to "The Internet" to look at web pages, just like this one. Had you been aware of these amazing technological developments, you would have been able to look at this page, and even post a message saying "Hey, I'm out of town and I'll be sure to have a look at it on Monday. Until then, hang in there". Now, doing that would have taken, what, maybe 5 minutes? Is this project so trivial to you that you couldn't spare a few minutes from your amazingly exciting trip to UWM? In case you don't understand the concept, let me explain further. Most of us users will tolerate problems with projects quite well if the project team at least pretends to care about the project. |
Scott Joined: 31 Oct 07 Posts: 3 Credit: 108,180 RAC: 0 |
LOL! Amazingly to the point. |
Joined: 24 Jun 07 Posts: 114 Credit: 5,296,905 RAC: 0 |
Scott is one of the BOINC Project Admins that really does care. Give him a break! I'm sure many members who have been here for a while will agree. Scott is taking care of it and will fix it. He always has! Best wishes, Sysfried Happy member of Team: Planet 3D Now! |
Joined: 30 Aug 07 Posts: 46 Credit: 6,502,980 RAC: 0 |
"Here's the scoop on the weekend shenanigans:" Scott, glad that you could get it back and running. Things seem to be better now; at least my machines are reporting back. I have not checked to see if the pending units have been validated. In the future, it would be nice to have someone check in once in a while and at least make a post so we know someone is alive on that side of the wire. These BOINC projects, as you know, need some babysitting. Again, thanks for your work. |
Brian D from Georgia Joined: 25 Aug 07 Posts: 8 Credit: 345,890 RAC: 0 |
This is a good sign! The WUs in the validator queue are finally starting to drain. As of this posting, the number was 43,882. That number had been steadily increasing the last 3 days but crested a couple of hours ago and is now receding. |
Joined: 23 Aug 07 Posts: 18 Credit: 372,460 RAC: 0 |
As far as I can see, apart from the slowly (very) dropping number of WUs waiting to be validated, not much else has changed. All my hosts at two different locations are getting a lot of download errors. 01/04/2008 05:27:53|Cosmology@Home|Started download of params_033008_213410_1.ini 01/04/2008 05:27:55|Cosmology@Home|Finished download of params_033008_213410_1.ini 01/04/2008 05:27:55|Cosmology@Home|[error] MD5 check failed for params_033008_213410_1.ini 01/04/2008 05:27:55|Cosmology@Home|[error] expected d41d8cd98f00b204e9800998ecf8427e, got 7d048a8145a783dc36c1bfce032a347f 01/04/2008 05:27:55|Cosmology@Home|[error] Checksum or signature error for params_033008_213410_1.ini And if not that, then: 01/04/2008 05:28:42|Cosmology@Home|Message from server: No work sent 01/04/2008 05:28:42|Cosmology@Home|Message from server: (there was work but it was committed to other platforms) I've even had a few "unable to open database" errors today. Although reporting and uploads seem OK, nothing but old WUs from before the weekend are being validated. m4rtyn |
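One detail in the log above is worth a closer look: the expected checksum, d41d8cd98f00b204e9800998ecf8427e, is the MD5 of zero bytes of input. A possible reading, which the thread does not confirm, is that the server recorded the checksum of an empty params file while the client downloaded a non-empty one. The short check below uses Python's hashlib; the helper names and the sample file content are made up for the example.

```python
import hashlib

# Expected checksum copied from the client log above.
EXPECTED_MD5 = "d41d8cd98f00b204e9800998ecf8427e"


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of data as a lowercase hex string."""
    return hashlib.md5(data).hexdigest()


def download_is_valid(data: bytes, expected_md5: str) -> bool:
    """Mimic the client-side check: compare the file's MD5 to the expected one."""
    return md5_hex(data) == expected_md5


empty_matches = download_is_valid(b"", EXPECTED_MD5)
# Hypothetical non-empty file content, just to show a mismatch.
nonempty_matches = download_is_valid(b"some_parameter = 1\n", EXPECTED_MD5)
print(empty_matches, nonempty_matches)
```

If that reading is right, any real parameter file would fail this check, which would match a run of download errors that re-downloading cannot fix on the client side.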
Stwainer Joined: 21 Jun 07 Posts: 18 Credit: 536,245 RAC: 0 |
I'm sure glad I'm not the admin of a BOINC project. It seems that people expect you to be on call 24/7 and if you don't tell complete strangers you'll be out of town for a few days and disconnected from the project, they get their panties in a bind. I guess admins can't have any other bit of life outside the project or 'forget' about the project for a few days. Thank you Scott for fixing the problem quickly when you returned. |
Joined: 11 Jan 08 Posts: 9 Credit: 49,099 RAC: 0 |
Attached a comp to COSMO a few minutes ago - All downloads o.k., started crunching the first wu. So everything seems to be stable on my side (at least as far as I can tell). Kurt BTW : Thank You, Scott! |
Nothing But Idle Time Joined: 27 Aug 07 Posts: 84 Credit: 148,380 RAC: 0 |
"I'm sure glad I'm not the admin of a BOINC project. It seems that people expect you to be on call 24/7..." If you have a child (project) and you are the parent (project administrator) then yes, you are on duty and responsible for what happens 24/7; that's just the way it is. "...and if you don't tell complete strangers you'll be out of town for a few days and disconnected from the project, they get their panties in a bind. I guess admins can't have any other bit of life outside the project or 'forget' about the project for a few days." Again, if you're the parent of older children then at least phone home and check in occasionally; if the children are young then you get someone to oversee things while you are away. Responsibility doesn't end at the door. |
Joined: 23 Aug 07 Posts: 18 Credit: 372,460 RAC: 0 |
"I'm sure glad I'm not the admin of a BOINC project. It seems that people expect you to be on call 24/7 and if you don't tell complete strangers you'll be out of town for a few days and disconnected from the project, they get their panties in a bind. I guess admins can't have any other bit of life outside the project or 'forget' about the project for a few days." Nobody has suggested 24/7 monitoring of the project, but some people donate a lot of time and expense to the project. In return it's hardly too much to expect someone to look in at least every couple of days. Being away from the project is no excuse, as it takes only two minutes to check the forums from anywhere in the world. m4rtyn |
Joined: 26 Aug 07 Posts: 17 Credit: 2,229,710 RAC: 0 |
So am I. |