Problem with space allocation and workunits not finishing



js
Joined: Jan 1 08
Posts: 3
ID: 5151
Credit: 40,610
RAC: 27
Message 4930 - Posted 6 Feb 2008 12:57:18 UTC

    Hello everyone,

    I am having trouble getting cosmology@home to run. I use boinc 5.10.28 under Linux. Having re-attached to Cosmology after the server restructuring a few days ago, I have the following problems:


    • My boinc installation is on a 1 GB USB stick. Local preferences limit disk usage to around 800 MB. I often get error messages saying that no WUs were downloaded because of insufficient disk space, but after leaving boinc alone for a while, it fetches WUs and finishes them, never using more than 200 or 300 MB.

    • After finishing a work unit, my boinc client exits with a symbol lookup error (see console snippet below). A subsequent restart lists the WU as "lost result".



    Do you have any ideas what I might do about this? Any hints are appreciated.

    Here is the output from the end of a WU's processing:


    >> Beginning Phase 3 <<
    >> ----------------- <<
    at z = 0.0000000E+00 sigma8 (all matter)= 0.8138058
    06-Feb-2008 13:42:13 [Cosmology@Home] Computation for task wu_013108_090634_0_2 finished
    ./boinc: symbol lookup error: ./boinc: undefined symbol: gzopen64

    [1]+ Exit 127 ./boinc


    After restarting the client, the following message is repeated:


    06-Feb-2008 13:42:29 [Cosmology@Home] Restarting task wu_013108_090634_0_2 using camb version 205
    06-Feb-2008 13:42:31 [Cosmology@Home] Task wu_013108_090634_0_2 exited with zero status but no 'finished' file
    06-Feb-2008 13:42:31 [Cosmology@Home] If this happens repeatedly you may need to reset the project.
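
The `undefined symbol: gzopen64` message comes from the dynamic linker: none of the libraries loaded for `./boinc` exported that function. `gzopen64` is zlib's large-file variant of `gzopen`, so one way to investigate is to check which zlib the client resolves and whether that library exports the symbol. A minimal sketch, assuming `ldd` and GNU binutils (`nm`) are installed and that the client links zlib dynamically; the function names `zlib_path` and `exports_symbol` are made up for illustration.

```shell
#!/bin/sh
# Print the path of the zlib that the dynamic linker resolves for a binary.
# Prints nothing if zlib is not linked dynamically or cannot be found.
zlib_path() {
    ldd "$1" 2>/dev/null | awk '/libz\.so/ && $3 ~ /^\// { print $3 }'
}

# Succeed (exit status 0) if shared library $1 defines dynamic symbol $2.
exports_symbol() {
    nm -D --defined-only "$1" 2>/dev/null |
        awk -v s="$2" '{ sub(/@.*/, "", $NF); if ($NF == s) found = 1 }
                       END { exit !found }'
}
```

Possible usage from the boinc directory: `exports_symbol "$(zlib_path ./boinc)" gzopen64 && echo present || echo missing`. If it prints `missing`, the installed zlib may simply be older than the one the client was built against.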

    Profile Jayargh
    Forum moderator
    Volunteer tester
    Joined: Jun 25 07
    Posts: 508
    ID: 191
    Credit: 2,282,158
    RAC: 427
    Message 4931 - Posted 6 Feb 2008 14:32:20 UTC

      The only thing that comes to mind is the memory issue. As a WU finishes, it can use 500 MB or more of memory to write the output files before they are deleted. So although you see usage in the 200-300 MB range while it is running, it jumps at completion. Maybe the 1 GB stick is not enough.

      js
      Joined: Jan 1 08
      Posts: 3
      ID: 5151
      Credit: 40,610
      RAC: 27
      Message 4932 - Posted 6 Feb 2008 15:05:13 UTC - in response to Message 4931.

        (...) maybe the 1gb stick is not enough.


        Thanks for your reply. I will try to copy the installation directory to a hard drive and see if I can reproduce the error. Anyway, I thought that in case of drive memory shortage, the error messages would be clearer...

        Profile Ananas
        Joined: Jan 19 08
        Posts: 158
        ID: 5845
        Credit: 674,130
        RAC: 632
        Message 4933 - Posted 6 Feb 2008 23:49:38 UTC

          Last modified: 7 Feb 2008 0:08:11 UTC

          The second part of the error might be a BOINC issue :

          Ticket #537 is titled ./boinc: symbol lookup error: ./boinc: undefined symbol: gzopen64

          The ticket is still open. I wonder why it's still prioritized as "minor"; it was downgraded to "minor" because it seemed to be a unique error, but now I guess it should go back to "major".


          p.s.: I found this, maybe it can fix your problem too

          js
          Joined: Jan 1 08
          Posts: 3
          ID: 5151
          Credit: 40,610
          RAC: 27
          Message 4939 - Posted 8 Feb 2008 16:34:25 UTC

            OK, first of all thanks for your replies. I have tried to take a closer look at the problem. I moved the boinc folder from my USB stick onto a hard drive and raised the disk usage limit to 9 GB using global_prefs_override.xml.
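
For readers who want to set the same kind of limit, here is a rough sketch of writing such an override file from the shell. It assumes the standard BOINC preference element `disk_max_used_gb`; the helper name `write_disk_override` and its arguments are hypothetical, and the client has to re-read its preferences (for example after a restart) before the new limit applies.

```shell
#!/bin/sh
# Write a minimal global_prefs_override.xml that caps BOINC's disk usage.
# $1 = BOINC data directory, $2 = disk limit in GB (hypothetical helper).
write_disk_override() {
    cat > "$1/global_prefs_override.xml" <<EOF
<global_preferences>
    <disk_max_used_gb>$2</disk_max_used_gb>
</global_preferences>
EOF
}
```

For example, `write_disk_override ~/boinc 9` would produce a file requesting a 9 GB limit. Note that this overwrites any existing override file in that directory.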

            Running from the hard disk, I could not reproduce the error. However, I doubt that disk space was really the cause: while processing several workunits, I monitored boinc's disk consumption every 2 seconds. The total never went above 129 MB for the boinc client plus the Cosmology and Einstein projects in the same directory. So I suppose that CAMB might falsely report "out of disk space" when, in fact, there is enough left...

            Monitoring, by the way, was done using

            watch -n 2 "du -sh ~/boinc | tee -a ~/boincsize"

            in bash. As far as I know, this method has no inherent systematic errors when checking a directory\'s size.
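
A variation on the command above, as a sketch only: logging the size in plain KiB together with a timestamp makes it easy to pull the peak value out of the log afterwards. The function names and the log file name are made up for illustration, and `date +%s` is assumed to be available (it is on GNU/Linux). Any polling approach can still miss a spike that appears and disappears between two samples.

```shell
#!/bin/sh
# Print the size of directory $1 in KiB (numeric, so it can be compared).
dir_size_kb() {
    du -sk "$1" | awk '{ print $1 }'
}

# Append one "epoch-seconds size_kb" sample for directory $1 to log file $2.
log_sample() {
    printf '%s %s\n' "$(date +%s)" "$(dir_size_kb "$1")" >> "$2"
}

# Print the largest size (second column) recorded in log file $1.
peak_kb() {
    awk '$2 > max { max = $2 } END { print max + 0 }' "$1"
}

# Example loop, commented out so the functions can be sourced on their own:
# while sleep 2; do log_sample "$HOME/boinc" "$HOME/boincsize.log"; done
```

After a run, `peak_kb ~/boincsize.log` prints the highest sampled size in KiB.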

