Message boards : Announcements : Beta testing the new C@H
Author | Message |
---|---|
Hi all,
| |
ID: 20282 · ![]() | |
1. For this BETA test, please send only the new application: camb_boinc2docker | |
ID: 20283 · ![]() | |
Hi Marius,
<vbox_job>
<!-- Set as desired -->
<memory_size_mb>2048</memory_size_mb>
<!-- This is the VBox guest OS, not the host OS, so it stays this for all app_versions. -->
<os_name>Linux26_64</os_name>
<!-- These are all needed for boinc2docker -->
<enable_isocontextualization>1</enable_isocontextualization>
<enable_cache_disk>1</enable_cache_disk>
<enable_shared_directory/>
<enable_scratch_directory/>
<enable_network/>
<completion_trigger_file>completion_trigger_file</completion_trigger_file>
<!-- -->
<fraction_done_filename>results/progress</fraction_done_filename>
<minimum_checkpoint_interval>60</minimum_checkpoint_interval>
<enable_vm_savestate_usage/>
<disable_automatic_checkpoints/>
</vbox_job>
You ship the job file twice, as vbox_job.xml and as camb_boinc2docker_vbox_job.xml. The plain vbox_job.xml can be deleted; it is confusing because it is not used. | |
ID: 20284 · ![]() | |
When using the 'save state' method, the task still requires more than the 1000000000 bytes you allow. | |
ID: 20285 · ![]() | |
After setting "Request tasks to checkpoint at most every xx seconds" to a value higher than the runtime of the task, I'm able to complete tasks on Win7 x64 with vbox 5.0.0. So the 'maximum disk limit exceeded' error is definitely caused by the checkpoint snapshots. | |
ID: 20286 · ![]() | |
MT tasks with 8 threads: average elapsed time 947.76 seconds, average CPU time used 6700.41 seconds. | |
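To put the two averages above in perspective, here is a minimal sketch that turns them into an effective core count and a parallel efficiency. It takes the CPU figure as 6700.41 seconds; the function names are made up for illustration and are not part of BOINC.

```python
# Figures quoted in the post above.
ELAPSED_S = 947.76   # average wall-clock time per task, in seconds
CPU_S = 6700.41      # average CPU time per task, in seconds
THREADS = 8          # threads given to the multi-threaded task


def effective_cores(cpu_seconds, elapsed_seconds):
    """Average number of cores kept busy over the run."""
    return cpu_seconds / elapsed_seconds


def parallel_efficiency(cpu_seconds, elapsed_seconds, n_threads):
    """Fraction of the available thread-seconds actually spent computing."""
    return cpu_seconds / (elapsed_seconds * n_threads)


if __name__ == "__main__":
    print(f"effective cores: {effective_cores(CPU_S, ELAPSED_S):.2f}")
    print(f"efficiency: {parallel_efficiency(CPU_S, ELAPSED_S, THREADS):.1%}")
```

With these numbers the task keeps roughly 7 of the 8 threads busy on average.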
ID: 20287 · ![]() | |
Awesome, thanks for the useful comments! Replies below..
| |
ID: 20288 · ![]() | |
* Multi threaded: By default BOINC is going to allocate all free CPUs to the job. If you have 4 CPUs and in your computing preferences you tell BOINC to use 50% CPU time, it'll run it as a 2 CPU job. Is this a solution to what you guys are talking about, or am I misunderstanding?

1. The problem is that VBoxHeadless.exe runs at 'normal' priority, whereas normal BOINC tasks run at the lowest 'idle' priority. So your task competes with the user himself. Setting the CPUs to e.g. 50% is only a partial solution, because most crunchers want to use all cores, but at the lowest priority for BOINC. There is a command-line parameter --nthreads. Maybe you could use that, passing ncpus - 1 for --nthreads.
2. When your MT task starts, it pushes all other BOINC tasks that are already running into a waiting state. Those tasks may even lose a lot of computing time when 'Leave applications in memory' (LAIM) is not set, or get swapped to disk when LAIM is set but the system is low on memory. Your VM needs about 1.5 GB RAM.

* Crystal Pellet: Thanks good catch, there's an unnecessary vbox_job.xml in there. Btw, what is the <enable_vm_savestate_usage> tag, I'm not seeing that in the docs for vboxwrapper?

If you set that tag in your *job.xml file, together with the likewise undocumented disable_automatic_checkpoints tag, the VM saves its state immediately when a user suspends the task (LAIM off) or BOINC stops. The VM is saved, not powered off (although of course it is no longer running). After a resume nothing is lost, because the VM is restored from the very point where the user suspended it. In your setup the whole task could be lost when no checkpoint was made, or at least the time since the last checkpoint. That is also why my setup uses a minimum checkpoint interval of 60 seconds: no checkpoints are needed, but the checkpoint file is now updated more regularly. That file is also used for restoring the CPU seconds after a task resume. | |
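The difference between the two suspend behaviours described above can be sketched with a deliberately simplified model. It assumes checkpoints land exactly every `checkpoint_interval_s` seconds and ignores the time needed to save or restore the VM; the function name and parameters are hypothetical and not part of vboxwrapper.

```python
def work_lost_on_suspend(elapsed_s, checkpoint_interval_s, save_state):
    """Seconds of computation that must be redone after a suspend/resume.

    Simplified model (an assumption, not vboxwrapper's actual logic):
    - with save state, the VM resumes exactly where it stopped;
    - without it, the task restarts from the last checkpoint, or from
      the beginning if no checkpoint has been written yet.
    """
    if save_state:
        return 0
    return elapsed_s % checkpoint_interval_s


if __name__ == "__main__":
    # Suspended before the first checkpoint: everything so far is lost.
    print(work_lost_on_suspend(500, 600, save_state=False))
    # Suspended between the 2nd and 3rd checkpoint.
    print(work_lost_on_suspend(1500, 600, save_state=False))
    # Same moment, but with the VM state saved on suspend.
    print(work_lost_on_suspend(1500, 600, save_state=True))
```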
ID: 20289 · ![]() | |
I upped the job disk bound to 3gb, let me know if anyone still sees the disk errors. The 3gb should be an overestimate, I will work on tweaking the exact space / memory requirements. | |
ID: 20290 · ![]() | |
Thanks for the long-awaited update! I was wondering what had happened with this great project. I am looking forward to the changes being planned. Keep up the good work! | |
ID: 20293 · ![]() | |
You removed a needed file from the download directory: | |
ID: 20296 · ![]() | |
WUs seem to stop after 10 minutes with this message? | |
ID: 20297 · ![]() | |
Crystal Pellet: Yea, I noticed the file was gone and re-added it. It might have gotten deleted again at some other points too; I'll look into why the file deleter is getting it. Btw, your suggestion with the vm_save_state looks really great, I'm testing it now. Thanks! | |
ID: 20298 · ![]() | |
I've updated the project several times now ... The deleted WUs were actually download errors. I've returned no successful WUs yet, as they all hang or suspend themselves after 10 minutes ... | |
ID: 20299 · ![]() | |
Crystal Pellet: Yea, I noticed the file was gone and re-added it. It might have gotten deleted again at some other points too; I'll look into why the file deleter is getting it. Btw, your suggestion with the vm_save_state looks really great, I'm testing it now. Thanks!

Hi Marius,
That camb_boinc2docker_boinc_app file is gone again. At least it's not in the download directory.
I've successfully tested a way for users to reduce the number of cores for the virtual machine themselves. You don't have to do anything; the user only has to place the following file, named app_config.xml, in the project directory:
<app_config>
<project_max_concurrent>1</project_max_concurrent>
<app>
<name>camb_boinc2docker</name>
<max_concurrent>1</max_concurrent>
</app>
<app_version>
<app_name>camb_boinc2docker</app_name>
<plan_class>vbox64_mt</plan_class>
<avg_ncpus>7.000000</avg_ncpus>
<max_ncpus>7.000000</max_ncpus>
</app_version>
</app_config>
In the example I've reduced the number of cores to 7 on my 8-threaded machines. The VM is created and running with 7 cores. Results with 6 cores: http://beta.cosmologyathome.org/result.php?resultid=1951 http://beta.cosmologyathome.org/result.php?resultid=1939 Result with 7 cores: http://beta.cosmologyathome.org/result.php?resultid=1897 | |
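For users who would rather not edit the XML by hand, the following is a rough sketch that builds the same app_config.xml for a chosen core count. The element names and values mirror the file above; the helper `build_app_config` is hypothetical, and where the output should be saved depends on where your BOINC data directory lives.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, plan_class, ncpus):
    """Return app_config.xml text limiting the app to `ncpus` cores."""
    root = ET.Element("app_config")
    ET.SubElement(root, "project_max_concurrent").text = "1"

    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = "1"

    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "plan_class").text = plan_class
    # Same six-decimal format as the hand-written example.
    ET.SubElement(version, "avg_ncpus").text = f"{ncpus:.6f}"
    ET.SubElement(version, "max_ncpus").text = f"{ncpus:.6f}"

    if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # 7 cores on an 8-thread machine, as in the post above.
    print(build_app_config("camb_boinc2docker", "vbox64_mt", 7))
```

Save the printed text as app_config.xml in the project directory; the client has to re-read its config files (or be restarted) before the change takes effect.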
ID: 20300 · ![]() | |
STEVE: That camb_boinc2docker_boinc_app file (http://beta.cosmologyathome.org/download/2b0/camb_boinc2docker_boinc_app) is and should have been present for at least the last four hours. But I do still see your client giving errors downloading it. Can you try a project reset? Maybe a remove / add too? Other clients have been able to complete the exact same workunit after your client gave a download error on them, so my guess is the workunits / files are fine on the server. | |
ID: 20301 · ![]() | |
STEVE: That camb_boinc2docker_boinc_app file (http://beta.cosmologyathome.org/download/2b0/camb_boinc2docker_boinc_app) is and should have been present for at least the last four hours.

Shouldn't that file be one directory higher, in the download directory itself and not in /2b0/? It looks like it is deleted from the user's machine after every task.

Crystal Pellet: Very useful, thanks. To make sure I understand, the difference between this, and say, just lowering the "Use at most" CPU time option is that this targets camb_boinc2docker specifically, leaving other apps to use that last 8th core?

That's correct! The last core can be left free to support a GPU task, or another single-core CPU task can use it. The app_config.xml should be placed in the Cosmology project directory on the user's machine (for now, of course, the beta directory). | |
ID: 20302 · ![]() | |
STEVE: That camb_boinc2docker_boinc_app file (http://beta.cosmologyathome.org/download/2b0/camb_boinc2docker_boinc_app) is and should have been present for at least the last four hours. But I do still see your client giving errors downloading it. Can you try a project reset? Maybe a remove / add too? Other clients have been able to complete the exact same workunit after your client gave a download error on them, so my guess is the workunits / files are fine on the server

I haven't been able to get camb_boinc2docker_boinc_app to run for more than 10 minutes before it suspends and another task starts on the Win 8 laptop. I did get it to run on another PC that has Win 7 Pro installed, though ...

Question: are the camb_legacy WUs I'm getting multi-threaded too? They only run one at a time? | |
ID: 20303 · ![]() | |
Hello | |
ID: 20305 · ![]() | |
Steve: Yes, camb_legacy is single threaded. We have no plans to modify this app in the future, so it will stay single threaded, but if you've got the RAM, it's just as efficient to run multiple copies of it as if it were multithreaded. | |
ID: 20307 · ![]() | |