
Forums : General Topics : Jobs are being created with VT-x / AMD-V disabled

Evans CAH

Joined: 19 Apr 09
Posts: 7
Credit: 373,938
RAC: 14
Message 22066 - Posted: 5 Feb 2019, 10:36:00 UTC

All Cosmology jobs are being created with the VT-x / AMD-V flag reset. This makes running the project more or less impossible. The jobs run, but they clobber the machine so badly that it is unusable.

If I set the flag manually for each waiting VM, they and other VMs run normally.

This may not be a Cosmology problem - I think Atlas was doing the same thing. I can't verify that right now as there is no Atlas work available.

The only mention of VT-x in any log is in the VM log:

00:00:02.108645 HM: VT-x/AMD-V init method: LOCAL


but this entry looks the same for VMs with the flag set. I don't see a way to enforce this setting globally.

This is an AMD box.

Anyone else seeing this? Google says not.

VB 6.0.2
BOINC 7.14.2 & 7.15
Jonathan

Joined: 27 Sep 17
Posts: 144
Credit: 7,186,588
RAC: 5,242
Message 22067 - Posted: 6 Feb 2019, 9:41:24 UTC - in response to Message 22066.  
Last modified: 6 Feb 2019, 9:44:21 UTC

I am not seeing any unusual problems with my AMD computer.

I looked at your tasks and it looks like you recently aborted about 5 tasks using the GUI.
The error in the log was:
2019-02-05 11:56:24 (2664): ERROR: Vboxwrapper lost communication with VirtualBox, rescheduling task for a later time.
https://www.cosmologyathome.org/result.php?resultid=5272870

Is this the error you are having, so that you have to abort the work units, or is it on different work units? Do you have an example work unit? I am not sure what you mean by the VT-x / AMD-V flag problem. Can you explain where you are seeing this problem? Is it in the logs, the VirtualBox software, etc.?

I forgot to add that I am on VirtualBox 5.2.24. Occasionally I get the 'lost communication' problem, but these errors seem to have gone down since I limited Cosmology@home to the true number of processors I have (eight in total) using app_config.xml.
Jonathan

Joined: 27 Sep 17
Posts: 144
Credit: 7,186,588
RAC: 5,242
Message 22070 - Posted: 6 Feb 2019, 20:44:58 UTC - in response to Message 22067.  

I just upgraded VirtualBox to 6.0.4 and will let it run. I set 'no new tasks' and let the queue run out before upgrading. I only have cosmology running.
Evans CAH

Joined: 19 Apr 09
Posts: 7
Credit: 373,938
RAC: 14
Message 22091 - Posted: 13 Feb 2019, 15:09:37 UTC - in response to Message 22067.  
Last modified: 13 Feb 2019, 15:26:31 UTC

Thanks for the feedback. I am seeing it in the VirtualBox interface under Settings/System/Processor. Every Cosmology VM has AMD-V disabled. While the VMs are running, the machine is unusable.

The flag can be manually set, and the setting sticks.

I probably aborted jobs that are 'unmanageable'. This has nothing to do with AMD-V.

I should add I am using an app_config to limit each VM to two CPUs.

Anyone else on VB 6.0.2?
Jonathan

Joined: 27 Sep 17
Posts: 144
Credit: 7,186,588
RAC: 5,242
Message 22092 - Posted: 13 Feb 2019, 22:13:22 UTC - in response to Message 22091.  
Last modified: 13 Feb 2019, 22:17:31 UTC

"Enable Nested VT-x / AMD-V" is unchecked in VirtualBox. Is that what you are referring to?
It is unchecked on all of my Cosmology work units and isn't used. You only need VT-x / AMD-V on the 'host', not in the 'guest'.

I think your problem is running too many Cosmology work units at once. Try changing your 'app_config.xml' to run one concurrent task of either two or three CPU cores. VirtualBox runs at a higher priority than the regular BOINC tasks that don't use VirtualBox.
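
For anyone trying this, here is a minimal sketch of generating such an 'app_config.xml' with Python. The element names (`app`, `max_concurrent`, `app_version`, `plan_class`, `avg_ncpus`) are standard BOINC app_config elements. The app name `camb_boinc2docker` and plan class `vbox64_mt` are assumptions about how Cosmology@home names its VirtualBox application, so check them against your own client_state.xml before use.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, plan_class, max_concurrent, ncpus):
    """Return app_config.xml text that limits one app to `max_concurrent`
    running tasks, each budgeted `ncpus` CPU cores."""
    root = ET.Element("app_config")

    # <app> block: cap how many tasks of this app run at the same time.
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)

    # <app_version> block: CPU cores assigned to each task of this plan class.
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "plan_class").text = plan_class
    ET.SubElement(version, "avg_ncpus").text = str(ncpus)

    # Pretty-print where available (ET.indent exists from Python 3.9).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Assumed names, see the note above: one task at a time, two cores per task.
print(build_app_config("camb_boinc2docker", "vbox64_mt", 1, 2))
```

Save the printed text as app_config.xml in the project's directory under the BOINC data directory, then have the client re-read its config files (or restart it) so the limits take effect.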
Evans CAH

Joined: 19 Apr 09
Posts: 7
Credit: 373,938
RAC: 14
Message 22093 - Posted: 18 Feb 2019, 12:21:41 UTC - in response to Message 22092.  

That's what I'm referring to, yes. I just realized this setting enables nested hardware virtualization, new in VB 6.0, which obviously isn't needed.
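
To tell the two settings apart without clicking through each VM, something like the following sketch can help. It relies on `VBoxManage list vms` and `VBoxManage showvminfo <vm> --machinereadable`, which print `"name" {uuid}` lines and `key="value"` lines respectively. The key names `hwvirtex` and `nested-hw-virt` are my assumption of how VirtualBox 6.0 labels the two flags, so confirm them against your own output.

```python
import re
import subprocess


def parse_machinereadable(text):
    """Turn key="value" lines into a dict; lines without '=' are skipped."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue
        info[key.strip().strip('"')] = value.strip().strip('"')
    return info


def list_vm_ids(text):
    """Extract the UUIDs from lines of the form: "name" {uuid}"""
    return re.findall(r"\{([0-9a-fA-F-]+)\}\s*$", text, flags=re.MULTILINE)


def vboxmanage(*args):
    """Run VBoxManage (must be on PATH) and return its standard output."""
    result = subprocess.run(
        ["VBoxManage", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def report():
    """Print name, hardware virtualization and nested virtualization per VM."""
    for vm_id in list_vm_ids(vboxmanage("list", "vms")):
        info = parse_machinereadable(
            vboxmanage("showvminfo", vm_id, "--machinereadable")
        )
        # Assumed key names; .get() yields None if a key is absent.
        print(info.get("name"), info.get("hwvirtex"), info.get("nested-hw-virt"))
```

Calling `report()` as the same user that owns the BOINC VMs should list them; run as a different user, VBoxManage will not see them.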

BOINC is correctly assigning CPUs up to the maximum. After a bit of experimentation, I think my issue is exactly what you describe: that I (and BOINC) can't set the scheduler priority of individual VMs. On this box I am using VB for non-BOINC stuff that needs to be responsive. I can't just demote the VB service to a lower priority because that will demote all VMs, BOINC and non-BOINC.

I will try your app_config fix, but last I checked all Cosmology jobs had been 'unmanageable' for days and I had to humanely kill them.

For what it's worth there is another workaround, which is to allow BOINC network access only at night, when the machine doesn't have fussy users on it. The VM jobs won't run without internet access, so they politely wait. The side-effect is that you have to run a huge job cache so that the machine has something to do during the day.
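
A rough sketch of that workaround as a local preferences override, generated with Python. The element names (`net_start_hour`, `net_end_hour`, `work_buf_min_days`, `work_buf_additional_days`) are BOINC global preference elements as used in global_prefs_override.xml. The hours and buffer sizes are placeholder values for illustration, not recommendations.

```python
import xml.etree.ElementTree as ET


def build_prefs_override(net_start_hour, net_end_hour, min_buffer_days, extra_buffer_days):
    """Return global_prefs_override.xml text that restricts network use to a
    daily window and sets the size of the local work cache."""
    root = ET.Element("global_preferences")

    # Network is allowed only between these hours (24h clock; may wrap midnight).
    ET.SubElement(root, "net_start_hour").text = f"{net_start_hour:.2f}"
    ET.SubElement(root, "net_end_hour").text = f"{net_end_hour:.2f}"

    # Work cache: minimum buffer plus additional days of work to fetch.
    ET.SubElement(root, "work_buf_min_days").text = f"{min_buffer_days:.2f}"
    ET.SubElement(root, "work_buf_additional_days").text = f"{extra_buffer_days:.2f}"

    # Pretty-print where available (ET.indent exists from Python 3.9).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Placeholder values: network from 22:00 to 06:00, about three days of cache.
print(build_prefs_override(22, 6, 2.0, 1.0))
```

The same settings are available in the BOINC Manager under computing preferences, which is probably the simpler route on a single machine.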
Jonathan

Joined: 27 Sep 17
Posts: 144
Credit: 7,186,588
RAC: 5,242
Message 22094 - Posted: 18 Feb 2019, 17:37:37 UTC - in response to Message 22093.  

I think VirtualBox 5.2.8 isn't as fussy about the 'lost communication' issues. There are posts at the LHC@home forums on it too. I only had that issue when I was running more than two VirtualBox jobs at once. Link to my forum post at bottom.

You might be better off just sticking to the conventional BOINC applications that don't use VirtualBox. I think that Cosmology@home can run the VirtualBox jobs without network access.

https://www.cosmologyathome.org/forum_thread.php?id=7615
Evans CAH

Joined: 19 Apr 09
Posts: 7
Credit: 373,938
RAC: 14
Message 22563 - Posted: 16 Jun 2020, 8:22:40 UTC - in response to Message 22094.  

Related issue: a bug in the BOINC client meant that once system-level virtualization was switched off (e.g. after a BIOS flash or reset), BOINC would assume it was off forever. The effect was that VB projects would be ignored irrespective of the firmware setting. This was fixed in March 2020 in client 7.16 (see https://boinc.berkeley.edu/wiki/Release_Notes).
