Forums : Technical Support : Computation error on every WU

Hammy
Joined: 7 May 18
Posts: 1
Credit: 61,204
RAC: 0
Message 22022 - Posted: 20 Dec 2018, 17:44:49 UTC

Hi guys. I am having zero success running Cosmology@Home on my iMac, which has fully up-to-date software. I have no idea what it is, but I have VirtualBox set up on my machine because I understand it is needed for this. I just downloaded 60 WUs and they ran quickly, all failing after 9 seconds of processing with the comment 'Computation error'.

I'm not a computer engineer and don't have any understanding of the technical aspects of this work. However I have been processing units for BOINC and pre BOINC since 2001 and have not had this problem previously.

Any ideas please?
ID: 22022
Jim1348
Joined: 17 Nov 14
Posts: 107
Credit: 4,425,470
RAC: 409
Message 22023 - Posted: 20 Dec 2018, 17:50:29 UTC - in response to Message 22022.  

While I know nothing about iMacs, the usual problem on other machines is that virtualization is not enabled in the BIOS. Even though VirtualBox installs "correctly", it won't run without it. Check your BIOS to see if you can find it.

Maybe this helps.
https://stackoverflow.com/questions/13580491/how-to-enable-support-of-cpu-virtualization-on-macbook-pro
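
As a rough illustration of the check behind that link: on an Intel Mac the CPU feature list can be read with `sysctl -n machdep.cpu.features`, and hardware virtualization shows up as the `VMX` flag. The sketch below assumes an Intel Mac; the helper names `has_vmx` and `read_mac_cpu_features` are made up for this example and are not part of BOINC or VirtualBox.

```python
import subprocess


def has_vmx(features: str) -> bool:
    """Return True if the CPU feature string lists the VMX (Intel VT-x) flag."""
    return "VMX" in features.upper().split()


def read_mac_cpu_features() -> str:
    """Ask macOS for the CPU feature list (assumes an Intel Mac with sysctl)."""
    result = subprocess.run(
        ["sysctl", "-n", "machdep.cpu.features"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

On the Mac itself, `has_vmx(read_mac_cpu_features())` should come back True if the virtualization extensions are visible to the operating system.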
ID: 22023
Jonathan
Joined: 27 Sep 17
Posts: 161
Credit: 7,580,022
RAC: 1,014
Message 22024 - Posted: 21 Dec 2018, 3:42:38 UTC - in response to Message 22022.  

I looked at your computer details. It is reporting VM extensions are enabled. I am going to guess you have too many processor cores assigned to a VM work unit. Try setting it to use half or less of your total cores.

The FAQ has a section on app_config.xml settings. Below is mine, which I edited to use two processors. You can copy and paste this into a text file and save it in your Cosmology@Home project directory. Mine is Boinc > Projects > www.cosmologyathome.org on a Windows computer. Save the file as 'app_config.xml', restart BOINC, and I think you will get it running.

<app_config>
 <app>
  <name>camb_boinc2docker</name>
 </app>
 <app_version>
  <app_name>camb_boinc2docker</app_name>
  <plan_class>vbox64_mt</plan_class>
  <avg_ncpus>2.000000</avg_ncpus>
  <max_ncpus>2.000000</max_ncpus>
 </app_version>
</app_config>
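
If you would rather not hand-edit the XML, the same file can be generated. This is only a sketch: `build_app_config` is a hypothetical helper, and it simply reproduces the element names and values from the file posted above rather than anything official from BOINC.

```python
import xml.etree.ElementTree as ET


def build_app_config(ncpus, app_name="camb_boinc2docker", plan_class="vbox64_mt"):
    """Build app_config.xml text with the same elements as the file posted above."""
    root = ET.Element("app_config")

    # <app> block: names the application the settings apply to.
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name

    # <app_version> block: CPU count given to each virtual machine task.
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "plan_class").text = plan_class
    ET.SubElement(version, "avg_ncpus").text = f"{ncpus:.6f}"
    ET.SubElement(version, "max_ncpus").text = f"{ncpus:.6f}"

    return ET.tostring(root, encoding="unicode")


xml_text = build_app_config(2)
```

Writing `xml_text` to a file named app_config.xml in the project directory and restarting BOINC would then be the same as the manual copy and paste.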
ID: 22024
dricker
Joined: 4 Feb 11
Posts: 13
Credit: 867,108
RAC: 1,096
Message 22063 - Posted: 4 Feb 2019, 16:12:31 UTC - in response to Message 22024.  

I am having the same issue with computation errors on my new Windows 10 Pro computer. VM is enabled in the BIOS. BOINC is trying to use 12 CPUs. I have tried limiting the number of CPUs in BOINC Manager to 50% but it does not seem to work. I have made an .xml file as described above but do not appear to have a cosmology project folder on my computer.

Any assistance will be greatly appreciated. Thanks.

Regards,

David
ID: 22063
Jim1348
Joined: 17 Nov 14
Posts: 107
Credit: 4,425,470
RAC: 409
Message 22064 - Posted: 4 Feb 2019, 17:34:12 UTC - in response to Message 22063.  

I have made an .xml file as described above but do not appear to have a cosmology project folder on my computer.

It looks like you have BOINC on the "D" drive; I see "D:\Documents\BOINC\", etc.

Chances are, BOINC is all mixed up. Uninstall BOINC, and try installing everything on the "C" drive.
ID: 22064
Jonathan
Joined: 27 Sep 17
Posts: 161
Credit: 7,580,022
RAC: 1,014
Message 22068 - Posted: 6 Feb 2019, 10:11:35 UTC - in response to Message 22063.  
Last modified: 6 Feb 2019, 10:18:15 UTC

I just looked at your successful camb_boinc2docker task, and the log indicates the virtual machine had two processors assigned. So I think you successfully created and used the app_config.xml file; otherwise it would have tried to grab all the processors for the virtual machine.

How many different projects do you have running?

http://www.cosmologyathome.org/result.php?resultid=5347934
ID: 22068
dricker
Joined: 4 Feb 11
Posts: 13
Credit: 867,108
RAC: 1,096
Message 22071 - Posted: 7 Feb 2019, 21:00:39 UTC - in response to Message 22068.  

Hi,

I tried moving everything to the C: drive but it didn't help. The .xml file appears to be working with everything on D: drive. I have ten BOINC projects, but not all of them run at the same time. I am now receiving an exceeded runtime error.

Any ideas?

Regards,

david
ID: 22071
dricker
Joined: 4 Feb 11
Posts: 13
Credit: 867,108
RAC: 1,096
Message 22072 - Posted: 7 Feb 2019, 21:06:39 UTC - in response to Message 22071.  

I just rebooted and when BOINC came up it had the following message:

Cosmology@Home: Notice from BOINC
Your app_config.xml file refers to an unknown application 'camb_boinc2docker'. Known applications: None
2/7/2019 3:04:20 PM

Regards,

David
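
The notice above compares the names in app_config.xml with the applications the client currently knows for the project. A small sketch of that comparison follows, with the list of known applications passed in by hand as an assumption; `referenced_apps` and `unknown_apps` are hypothetical helpers, not BOINC functions.

```python
import xml.etree.ElementTree as ET


def referenced_apps(app_config_text):
    """Collect every application name an app_config.xml refers to."""
    root = ET.fromstring(app_config_text)
    names = {el.text.strip() for el in root.findall("app/name") if el.text}
    names |= {
        el.text.strip()
        for el in root.findall("app_version/app_name")
        if el.text
    }
    return names


def unknown_apps(app_config_text, known_apps):
    """Return the referenced names that are not in the known-application list."""
    return sorted(referenced_apps(app_config_text) - set(known_apps))
```

With "Known applications: None", the known list is empty, so every name in the file is reported as unknown even when it is spelled correctly.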
ID: 22072
Jonathan
Joined: 27 Sep 17
Posts: 161
Credit: 7,580,022
RAC: 1,014
Message 22073 - Posted: 7 Feb 2019, 22:50:47 UTC - in response to Message 22072.  
Last modified: 7 Feb 2019, 22:54:48 UTC

You may want to suspend other projects temporarily and set 'camb_legacy: no' in your Cosmology@Home preferences.

What do you have set for your Cosmology@home preferences?

What do you have for your computing preferences?


Here is the app_config.xml I am using. I edited it to run only one camb_boinc2docker app at a time. You can always change that later. The camb_boinc2docker apps will use 2 GB of memory per work unit, and that doesn't depend on the number of processors assigned to the work unit. On your machine, valid settings would be from 1 to 6 (max_ncpus). max_concurrent controls how many camb_boinc2docker work units can run at the same time.

Please see the FAQ section on the app_config.xml file. Copy and paste got rid of the indentations.
________________________________________________________

<app_config>
<app>
<name>camb_boinc2docker</name>
<max_concurrent>1</max_concurrent>
</app>
<app_version>
<app_name>camb_boinc2docker</app_name>
<plan_class>vbox64_mt</plan_class>
<avg_ncpus>2.000000</avg_ncpus>
<max_ncpus>2.000000</max_ncpus>
</app_version>
</app_config>

__________________________________________________________
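
To make the arithmetic concrete: the cores in use scale with both settings, while memory scales only with the number of concurrent work units. A minimal sketch, assuming the 2 GB per work unit figure quoted above; `vm_footprint` is a made-up helper name.

```python
def vm_footprint(max_concurrent, ncpus_per_task, gb_per_task=2.0):
    """Cores and memory in use when max_concurrent work units run at once."""
    return {
        # Each running work unit holds ncpus_per_task cores.
        "cores": max_concurrent * ncpus_per_task,
        # Memory per work unit does not depend on the core count.
        "memory_gb": max_concurrent * gb_per_task,
    }


footprint = vm_footprint(max_concurrent=1, ncpus_per_task=2)
```

With the file above, that works out to 2 cores and 2 GB; raising max_concurrent to 3 would mean 6 cores and 6 GB.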
ID: 22073
dricker
Joined: 4 Feb 11
Posts: 13
Credit: 867,108
RAC: 1,096
Message 22074 - Posted: 8 Feb 2019, 1:00:54 UTC - in response to Message 22073.  

Resource share 100
Use CPU
Is it OK for Cosmology@Home and your team (if any) to email you?
Should Cosmology@Home show your computers on its web site?
Default computer location ---
Run only the selected applications
  camb_legacy: no
  camb_boinc2docker: yes
  planck_param_sims: yes
If no work for selected applications is available, accept work from other applications? no
Max # jobs No limit
Max # CPUs 2

ID: 22074
dricker
Joined: 4 Feb 11
Posts: 13
Credit: 867,108
RAC: 1,096
Message 22075 - Posted: 8 Feb 2019, 1:05:24 UTC - in response to Message 22074.  

These settings apply to all computers using this account except:
  computers where you have set preferences locally using the BOINC Manager
  Android devices

Computing
  Usage limits
    Use at most 50 % of the CPUs
    Use at most 60 % of CPU time
  When to suspend
    Suspend when computer is on battery
    Suspend when computer is in use
    Suspend GPU computing when computer is in use
    'In use' means mouse/keyboard input in last 3 minutes
    Suspend when no mouse/keyboard input in last --- minutes
    Suspend when non-BOINC CPU usage is above 25 %
    Compute only between ---
  Other
    Store at least 0.2 days of work
    Store up to an additional 0.3 days of work
    Switch between tasks every 120 minutes
    Request tasks to checkpoint at most every 60 seconds
Disk
  Use no more than 10 GB
  Leave at least 0.5 GB free
  Use no more than 50 % of total
Memory
  When computer is in use, use at most 50 %
  When computer is not in use, use at most 75 %
  Leave non-GPU tasks in memory while suspended
  Page/swap file: use at most 50 %
Network
  Usage limits
    Limit download rate to --- KB/second
    Limit upload rate to --- KB/second
    Limit usage to --- MB every --- days
  When to suspend
    Transfer files only between ---
  Other
    Skip data verification for image files
    Confirm before connecting to Internet
    Disconnect when done
ID: 22075
dricker
Joined: 4 Feb 11
Posts: 13
Credit: 867,108
RAC: 1,096
Message 22076 - Posted: 8 Feb 2019, 1:06:20 UTC - in response to Message 22075.  

I recreated the .xml file using the info in your last email. Thanks.

Regards,

David
ID: 22076
Jonathan
Joined: 27 Sep 17
Posts: 161
Credit: 7,580,022
RAC: 1,014
Message 22077 - Posted: 8 Feb 2019, 1:33:06 UTC - in response to Message 22076.  

'Use at most 60 % of CPU time' is probably the setting causing the time-out problem. I am not sure whether that setting works with the VirtualBox applications, but their completion time is estimated for 100% CPU time, so the throttling throws the estimate off.
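
As a back-of-the-envelope model only (it assumes run time stretches in proportion to the throttle, which may not be exactly how a throttled virtual machine behaves), the effect looks like this. `throttled_runtime` is a hypothetical helper and the 30-minute figure is just an example value.

```python
def throttled_runtime(estimated_minutes, cpu_time_percent):
    """Rough wall-clock time for a task estimated at 100% CPU time."""
    if not 0 < cpu_time_percent <= 100:
        raise ValueError("cpu_time_percent must be in (0, 100]")
    # Running only part of the time stretches the job by 100 / percent.
    return estimated_minutes * 100.0 / cpu_time_percent


example = throttled_runtime(30, 60)
```

Under that assumption a task estimated at 30 minutes would need about 50 minutes at the 60 % setting, which is the kind of gap that can run into a runtime limit.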

There is a good checklist for VirtualBox-related apps and troubleshooting at the LHC website.
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4161&postid=29359#29359

Your app_config.xml may not have been working all this time, and your VirtualBox machines were probably limited by your 'Max # CPUs 2' preference.

Are you finding your overall processor usage to be about 50% (six cores) with the 'Use at most 50 % of the CPUs' preference?

I am having trouble with VirtualBox 6.0.4 and 'Postponed: VM job unmanageable, restarting later.' errors so I would hold off on changing versions for now. I may go back to the 5 series version.

Were you able to find your BOINC projects folder for the app_config.xml file? Mine is in the default location for a BOINC install. The ProgramData directory is usually hidden. Mine goes in this folder:
'C:\ProgramData\BOINC\projects\www.cosmologyathome.org'
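
If the folder is hard to spot by hand, a short script can look for it under whichever BOINC data directory is in use. This is a sketch under the assumption that the data directory contains a `projects` subfolder as in the path above; `find_project_dirs` is a made-up helper name.

```python
from pathlib import Path


def find_project_dirs(data_dir, keyword="cosmologyathome"):
    """List project folders under <data_dir>/projects whose name contains keyword."""
    projects = Path(data_dir) / "projects"
    if not projects.is_dir():
        # No projects folder here, so this is probably not the data directory.
        return []
    return sorted(
        p for p in projects.iterdir()
        if p.is_dir() and keyword in p.name.lower()
    )
```

For example, `find_project_dirs(r"C:\ProgramData\BOINC")` on a default install, or the same call with the data directory on another drive if BOINC was installed there.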
ID: 22077
dricker
Joined: 4 Feb 11
Posts: 13
Credit: 867,108
RAC: 1,096
Message 22080 - Posted: 8 Feb 2019, 15:01:40 UTC - in response to Message 22077.  

Hi Jonathan,

After recreating the .xml file and disabling legacy everything seems to be working now. I will change the CPU time back to 100% to see what happens.

The check list is very exhaustive. I will need to work through it over the next couple of days.

Even with the 50% cores setting it seems I am still using all 12. I am not sure why that is.

I was able to put the .xml file in the correct projects folder, so that is good. BOINC seems to be using it as it is limiting the project to 2 cores and running one at a time.

I really appreciate your assistance. Thank you very much,

Regards,

David
ID: 22080
Jonathan
Joined: 27 Sep 17
Posts: 161
Credit: 7,580,022
RAC: 1,014
Message 22081 - Posted: 8 Feb 2019, 17:19:55 UTC - in response to Message 22080.  

If you are using BOINC Manager, you can try clicking on Options, Computing Preferences. Mine gives me a little notice at the top that my current preferences are pulled from the Cosmology@Home project preferences that I have set. I wonder if yours are getting pulled or overwritten from somewhere else.
ID: 22081
dricker
Joined: 4 Feb 11
Posts: 13
Credit: 867,108
RAC: 1,096
Message 22082 - Posted: 8 Feb 2019, 21:00:50 UTC - in response to Message 22081.  

Hi Jonathan,

I am using BOINC manager, and it is set to use local preferences. When I changed it to use Cosmology@home preferences it changed it for all my projects. I reset it to use local preferences and changed most of them to be similar to the Cosmology settings. The .xml file is still controlling the number of cores for Cosmology.

Regards,

David
ID: 22082
Tim Kelley
Joined: 1 Jul 18
Posts: 24
Credit: 1,646,084
RAC: 8,785
Message 22106 - Posted: 4 Mar 2019, 20:30:51 UTC

Hi, I'm having that same "Postponed: VM job unmanageable" error. I downgraded to VBox 6.0.2 and it's still happening. It seems to have started about a month ago, which I think coincided with an announcement of the release of a new C@H engine. Maybe that's the problem?

The estimates of how long a job has remaining to complete seem to gyrate wildly, jumping up to many hours (or even days) and then being slowly revised downwards again. I think sometimes the high estimate of time remaining "scares" the scheduler and causes the error message. They used to run like clockwork, 26 mins predicted and 26 mins to complete.
ID: 22106
Jonathan
Joined: 27 Sep 17
Posts: 161
Credit: 7,580,022
RAC: 1,014
Message 22107 - Posted: 4 Mar 2019, 23:17:10 UTC - in response to Message 22106.  

Tim, are you running any other Virtual Box projects? What projects are you running concurrently? Are you using an app_config.xml file?

I am just running Cosmology@Home (camb_boinc2docker) and LHC@home (Atlas tasks) using VirtualBox 6.0.4. It is running successfully and without errors. If I suspend LHC and run four concurrent 2-core tasks, I get the 'communication lost, postponed' errors. I think I traced it to VirtualBox being too busy with multiple virtual machines running at 100% processor usage. vboxwrapper_26200_windows_x86_64 appears to run at Low priority on Windows, while everything else VirtualBox-related is at Normal priority. vboxwrapper_26196_windows_x86_64 from the LHC project runs at Below Normal priority. I think the vboxwrapper just can't get updates as quickly as required.

I have eight true cores, and below is my app_config.xml for Cosmology@Home. You can edit the <max_concurrent> value to 1 or 2 for your machine. Info can be found in the FAQ section.

<app_config>
 <app>
  <name>camb_boinc2docker</name>
  <max_concurrent>3</max_concurrent>
 </app>
 <app_version>
  <app_name>camb_boinc2docker</app_name>
  <plan_class>vbox64_mt</plan_class>
  <avg_ncpus>2.000000</avg_ncpus>
  <max_ncpus>2.000000</max_ncpus>
 </app_version>
</app_config>
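
One way to arrive at a max_concurrent value like the one above is to leave a couple of cores free for VirtualBox and the wrapper. The rule below is an assumption made for illustration, not a documented BOINC formula, and `max_concurrent_for` is a hypothetical helper.

```python
def max_concurrent_for(total_cores, ncpus_per_task, spare_cores=2):
    """Suggest how many VM tasks to run at once while keeping spare_cores idle."""
    usable = max(total_cores - spare_cores, 0)
    # Always allow at least one task, even on a small machine.
    return max(usable // ncpus_per_task, 1)


suggested = max_concurrent_for(total_cores=8, ncpus_per_task=2)
```

With eight cores and 2-core tasks this gives 3, one fewer than the four concurrent tasks that produced the 'postponed' errors.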
ID: 22107
Tim Kelley
Joined: 1 Jul 18
Posts: 24
Credit: 1,646,084
RAC: 8,785
Message 22110 - Posted: 6 Mar 2019, 17:01:16 UTC
Last modified: 6 Mar 2019, 17:05:59 UTC

Scratch that idea of "scaring" the scheduler; I've seen it declare several jobs "unmanageable" even though there are less than 10 minutes predicted remaining calculation time, and less than 20 minutes total predicted calculation time. I have no idea what causes BOINC to deem a job "unmanageable."

I was running LHC at the same time, but I stopped, so there's only one concurrent virtual machine. Here's my app_config.xml:

<app_config>
 <app>
  <name>camb_boinc2docker</name>
  <max_concurrent>1</max_concurrent>
 </app>
 <app_version>
  <app_name>camb_boinc2docker</app_name>
  <plan_class>vbox64_mt</plan_class>
  <avg_ncpus>3</avg_ncpus>
 </app_version>
</app_config>
ID: 22110
Jonathan
Joined: 27 Sep 17
Posts: 161
Credit: 7,580,022
RAC: 1,014
Message 22112 - Posted: 7 Mar 2019, 4:38:51 UTC - in response to Message 22110.  

The two camb_boinc2docker errors currently showing under your computer's tasks were just 'normal' failures. I get these too. Just look for 'Optical depth is strange' in your log results. I usually get those errors on less than one percent of tasks. I also have a bunch of 'Timed out - no response' results, but those should just be from when the project ran out of disk space or something like that a while ago.
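
If you have saved the log text of several tasks, counting how many contain that message is straightforward. This is a sketch that assumes the logs are already available as plain strings; `count_matching_logs` and `failure_rate` are made-up helper names.

```python
def count_matching_logs(logs, marker="Optical depth is strange"):
    """Count task logs that contain the marker text."""
    return sum(1 for text in logs if marker in text)


def failure_rate(logs, marker="Optical depth is strange"):
    """Fraction of task logs that contain the marker (0.0 for an empty list)."""
    if not logs:
        return 0.0
    return count_matching_logs(logs, marker) / len(logs)
```

A rate of around one percent or less would match the 'normal' failure level described above.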
ID: 22112