Forums : Technical Support : Can't Complete Tasks, all end with: EXIT_TIME_LIMIT_EXCEEDED

crashtech
Joined: 10 May 17
Posts: 10
Credit: 4,013,446
RAC: 0
Message 22160 - Posted: 31 May 2019, 5:52:06 UTC

Hi,

I'd really like to resolve this problem so I can run Cosmology@Home without having to stay with the legacy application.

All the camb_boinc2docker tasks fail after between 10 and 11 minutes with the error EXIT_TIME_LIMIT_EXCEEDED.

Machine: https://www.cosmologyathome.org/show_host_detail.php?hostid=391524

Representative Task: https://www.cosmologyathome.org/show_host_detail.php?hostid=391524

Any help will be much appreciated.
ID: 22160
Jim1348
Joined: 17 Nov 14
Posts: 107
Credit: 4,425,470
RAC: 409
Message 22161 - Posted: 31 May 2019, 13:27:06 UTC - in response to Message 22160.  

That usually means that virtualization is not working. Make sure it is enabled in the motherboard BIOS. (Just because VirtualBox installs properly does not mean it is functioning.)
ID: 22161
crashtech
Joined: 10 May 17
Posts: 10
Credit: 4,013,446
RAC: 0
Message 22162 - Posted: 31 May 2019, 18:59:38 UTC
Last modified: 31 May 2019, 18:59:49 UTC

Thank you for replying! I wondered about that myself, but then again, the tasks are making it about halfway through before they fail. I am presuming that VT-x being disabled would not allow that much progress to occur, though that's only a guess. But there's also this, from the link to the machine in question: https://www.cosmologyathome.org/show_host_detail.php?hostid=391524

"Virtualbox (6.0.8r130520) installed, CPU has hardware virtualization support and it is enabled"

Somewhere I read that downgrading VBox may be required, but I wouldn't know to which version.
ID: 22162
crashtech
Joined: 10 May 17
Posts: 10
Credit: 4,013,446
RAC: 0
Message 22167 - Posted: 4 Jun 2019, 15:47:58 UTC

I still have not gained any insight on how to solve this problem. Sad because I find this project fascinating.
ID: 22167
Hal Bregg
Joined: 31 Oct 18
Posts: 22
Credit: 284,331
RAC: 2
Message 22168 - Posted: 6 Jun 2019, 13:16:59 UTC - in response to Message 22162.  
Last modified: 6 Jun 2019, 13:20:43 UTC

I would go for the VirtualBox version that is bundled with the BOINC installer (just check the BOINC client website). On a Windows host you can remove the existing VBox and run the combined installer. There is no need to remove the existing BOINC installation, as VirtualBox gets installed first. After that you will be prompted to continue with the installation of the BOINC client, which you can abort. Then reboot the PC.

On Linux, install the boinc-virtualbox package from the repository supplied by your distro.
ID: 22168
crashtech
Joined: 10 May 17
Posts: 10
Credit: 4,013,446
RAC: 0
Message 22169 - Posted: 10 Jun 2019, 21:32:55 UTC - in response to Message 22168.  

Thank you for your reply. This time around I installed Windows 10, with which I am more familiar, then installed the BOINC+VBox bundle. On this fresh install, I ran some other projects for a while, then tried Cosmology again, with exactly the same error: "197 (0x000000C5) EXIT_TIME_LIMIT_EXCEEDED". This occurs most of the way through the task:

http://www.cosmologyathome.org/result.php?resultid=12177949

The machine in question still shows hardware virtualization enabled:

http://www.cosmologyathome.org/show_host_detail.php?hostid=392685

I'm starting to think that other than the legacy projects, this project is for users that have a much deeper grasp of the use of VMs and how they interact with the network file system.
ID: 22169
crashtech
Joined: 10 May 17
Posts: 10
Credit: 4,013,446
RAC: 0
Message 22170 - Posted: 10 Jun 2019, 22:32:58 UTC

I have discovered a partial workaround that isn't a real solution, but might point to the actual problem. First, I tried re-running the CPU benchmarks; this was ineffective. Next, I increased the number of logical CPUs utilized per task from 4 to 8, and this helped, but some tasks were still taking too long. Finally, I increased the number to 12, and now the tasks seem to be finishing under the time limit, which seems arbitrarily short. I don't know if this time limit can be manipulated on the user end.
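
For context on where that limit comes from: as far as I understand the BOINC client, the limit is not a fixed number of minutes. It is derived by dividing the workunit's rsc_fpops_bound by the speed the client assumes for the app version. The sketch below uses made-up numbers (none are taken from this project) and a hypothetical "efficiency" factor to show how a task can run past the limit when each thread is slower than assumed, for example when logical cores share a physical core.

```python
def max_elapsed_seconds(rsc_fpops_bound: float, assumed_flops: float) -> float:
    """Elapsed-time limit: the workunit's FLOP bound divided by the assumed speed."""
    return rsc_fpops_bound / assumed_flops


def actual_runtime_seconds(work_fpops: float, assumed_flops: float,
                           efficiency: float) -> float:
    """Runtime when the host delivers only `efficiency` (0..1] of the assumed speed."""
    return work_fpops / (assumed_flops * efficiency)


# All values below are hypothetical and chosen only for illustration.
assumed_flops = 4 * 5e9   # 4 threads, each assumed to run at 5 GFLOPS
work_fpops = 1.0e13       # work the task really needs
fpops_bound = 1.2e13      # bound set by the project for this workunit

limit = max_elapsed_seconds(fpops_bound, assumed_flops)
full_speed = actual_runtime_seconds(work_fpops, assumed_flops, efficiency=1.0)
shared_cores = actual_runtime_seconds(work_fpops, assumed_flops, efficiency=0.6)

print(f"limit: {limit:.0f} s")
print(f"runtime at assumed speed: {full_speed:.0f} s")
print(f"runtime at 60% of assumed speed: {shared_cores:.0f} s")
```

If this picture is right, anything that lowers the real per-thread speed below the assumed one (Hyperthreading siblings, throttling, heavy load from other projects) eats into the margin between the two numbers.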
ID: 22170
Hal Bregg
Joined: 31 Oct 18
Posts: 22
Credit: 284,331
RAC: 2
Message 22171 - Posted: 11 Jun 2019, 7:00:49 UTC - in response to Message 22170.  
Last modified: 11 Jun 2019, 7:01:03 UTC

I wouldn't be able to tell if this is correct. I run max 2 cores per task and all my tasks are finishing just fine. Sometimes I run one WU per core with the same result.

I checked my failed WUs and found one with exactly the same error you described.

Other users and I are seeing the same EXIT_TIME_LIMIT_EXCEEDED issue on the nanoHUB project, which also uses boinc2docker, so maybe the WUs themselves are somehow faulty.

The project admin would be the best person to clarify the situation with those faulty WUs; however, he does not seem to be active on the forum.
ID: 22171
crashtech
Joined: 10 May 17
Posts: 10
Credit: 4,013,446
RAC: 0
Message 22172 - Posted: 11 Jun 2019, 15:58:47 UTC
Last modified: 11 Jun 2019, 15:59:46 UTC

My current hypothesis is that shutting off HT in hardware might solve the problem, but that is not a practical long-term solution for me.
ID: 22172
crashtech
Joined: 10 May 17
Posts: 10
Credit: 4,013,446
RAC: 0
Message 22173 - Posted: 11 Jun 2019, 16:32:22 UTC
Last modified: 11 Jun 2019, 16:32:49 UTC

An update on what I am finding:

If I set ncpus to 4 in app_config, allow the VMs to be created, then shut down BOINC, edit the ncpus to 12, then restart, the VMs still "think" there are 4 vCPUs, but they are allocated 12 by BOINC (I think). This is the only sure fire way I can get tasks to finish, but it can't be left alone this way, because once the tasks are finished, the VMs are deleted and new ones created with the values from app_config. If the values from app_config and the VM match, the tasks will time out. If they mismatch, with the app_config value being significantly higher, the tasks will complete successfully.
ID: 22173
crashtech
Joined: 10 May 17
Posts: 10
Credit: 4,013,446
RAC: 0
Message 22174 - Posted: 11 Jun 2019, 18:20:34 UTC
Last modified: 11 Jun 2019, 18:21:56 UTC

Using Process Lasso to virtually disable Hyperthreading on all the VBoxHeadless executables seems to have solved the problem. I do think this is a problem with the app. Since HT and SMT are nearly ubiquitous now, there should be a way to allow for slower completions due to using logical cores instead of physical cores.
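
For anyone who wants to see what "virtually disabling Hyperthreading" amounts to, here is a rough sketch of the affinity arithmetic: pick one logical CPU per physical core and build the matching bitmask. It assumes sibling hardware threads are numbered adjacently (0 and 1 share a core, 2 and 3 share the next one, and so on), which is common on Windows but not guaranteed. The function names are hypothetical; this is not Process Lasso's API, only the idea behind the setting.

```python
from typing import List


def one_thread_per_core(n_logical: int, threads_per_core: int = 2) -> List[int]:
    """Return the first logical CPU of each physical core.

    Assumes sibling threads are numbered adjacently (0/1, 2/3, ...).
    """
    return list(range(0, n_logical, threads_per_core))


def affinity_mask(cpus: List[int]) -> int:
    """Build a CPU affinity bitmask with one bit set per allowed logical CPU."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


# Example: a 6-core / 12-thread CPU like the one discussed in this thread.
cpus = one_thread_per_core(12)
mask = affinity_mask(cpus)
print(cpus, hex(mask))  # [0, 2, 4, 6, 8, 10] 0x555
```

Restricting a VBoxHeadless process to that set means its vCPU threads never share a physical core with each other, which would fit the observation that the tasks then finish inside the limit.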
ID: 22174
crashtech
Joined: 10 May 17
Posts: 10
Credit: 4,013,446
RAC: 0
Message 22175 - Posted: 12 Jun 2019, 0:26:42 UTC

Disabling HT seems to be the fix in this particular case. Whether this fix is yet a crutch for some other shortcoming will remain a question for those better versed in computer science. Happily, I will be able to run the more significant tasks of this project now; consider this matter closed.
ID: 22175
Hal Bregg
Joined: 31 Oct 18
Posts: 22
Credit: 284,331
RAC: 2
Message 22178 - Posted: 21 Jun 2019, 9:14:30 UTC - in response to Message 22175.  
Last modified: 21 Jun 2019, 9:14:36 UTC

Did you have a chance to solve the problem without disabling HT?
ID: 22178
mikey
Joined: 30 Oct 12
Posts: 46
Credit: 5,145,330
RAC: 0
Message 22179 - Posted: 21 Jun 2019, 20:19:22 UTC - in response to Message 22173.  

This can happen when a project ties the downloaded tasks to the app_config, or website, settings at the time the workunits are downloaded.
ID: 22179
Hal Bregg
Joined: 31 Oct 18
Posts: 22
Credit: 284,331
RAC: 2
Message 22180 - Posted: 25 Jun 2019, 10:01:07 UTC - in response to Message 22179.  

So the solution would be to change the settings on the project's website, delete the app_config files and reset the project in the BOINC client?
ID: 22180
Grant
Joined: 26 Aug 11
Posts: 1
Credit: 12,995
RAC: 0
Message 22187 - Posted: 8 Jul 2019, 18:25:19 UTC

Very much an amateur here. I am experiencing the exact same issue and error as the OP. I have virtualization enabled at the hardware level and just recently updated VirtualBox to the most recent version, and my WUs still work fine for 11 seconds and then stall and stop at 0.1% done. Unfortunately I had to stop accepting new work for this project, which is sad because I'd like my brand new, semi-beastly laptop to contribute to this.
ID: 22187
Jonathan
Joined: 27 Sep 17
Posts: 161
Credit: 7,580,022
RAC: 1,014
Message 22188 - Posted: 9 Jul 2019, 18:50:07 UTC - in response to Message 22187.  

See the FAQ section "How can I limit the number of CPUs used". Set avg_ncpus to 6 or less, that is, no more than the number of true cores your computer has. Turn off CPU throttling (the "use at most X% of CPU time" setting). You can limit the total number of cores used for BOINC in the preferences on a project's website or in the BOINC Manager. Try running the Cosmology project with only one concurrent task, using the setting from the FAQ, until you get it working; then you can play with the settings.
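
As a rough illustration of those two settings, the sketch below builds an app_config.xml with an avg_ncpus value and a limit of one concurrent task. The app name camb_boinc2docker comes from this thread; the plan class vbox64_mt and the values 2 and 1 are assumptions, so check the FAQ and your own client's event log for the right ones. The script only prints the XML; saving it into the project's folder in the BOINC data directory is left to you.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, plan_class: str,
                     avg_ncpus: int, max_concurrent: int) -> str:
    """Return the text of an app_config.xml with a per-task CPU count
    and a cap on how many tasks of this app run at once."""
    root = ET.Element("app_config")

    # <app>: limit the number of concurrently running tasks of this app.
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)

    # <app_version>: set how many CPUs each task is given.
    app_version = ET.SubElement(root, "app_version")
    ET.SubElement(app_version, "app_name").text = app_name
    ET.SubElement(app_version, "plan_class").text = plan_class
    ET.SubElement(app_version, "avg_ncpus").text = str(avg_ncpus)

    return ET.tostring(root, encoding="unicode")


# "vbox64_mt" is an assumed plan class, not confirmed in this thread.
xml_text = build_app_config("camb_boinc2docker", "vbox64_mt",
                            avg_ncpus=2, max_concurrent=1)
print(xml_text)
```

After changing the file, the client has to re-read its config files (or be restarted), and as noted earlier in the thread, VMs that already exist keep the vCPU count they were created with.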
ID: 22188
Tazarak_Ordinateur
Joined: 18 Oct 18
Posts: 1
Credit: 1,140,443
RAC: 0
Message 22280 - Posted: 12 Oct 2019, 10:12:43 UTC

I've downgraded to 5.2.8 and still no joy.

I'm also receiving the EXIT_TIME_LIMIT_EXCEEDED.

I've confirmed the other troubleshooting tips in this thread. I had just started to fiddle with process lasso, then reflected on why I was putting myself through so much trouble.

I'm afraid to say I give up on this project.
ID: 22280
Tim Kelley
Joined: 1 Jul 18
Posts: 24
Credit: 1,646,084
RAC: 8,785
Message 22328 - Posted: 3 Jan 2020, 20:17:51 UTC

Also experiencing the same problem, running two virtual CPUs each on two Win10 machines. The desktop completes roughly half of the tasks successfully; the others error out with EXIT_TIME_LIMIT_EXCEEDED. In the case of the laptop, I have to throttle the CPU or it overheats (six-core processor in a little laptop with limited cooling capability), so 100% of tasks end in error. WHY CAN'T THE PEOPLE MAKING THE WORK UNITS RAISE THE TIME LIMIT??? Is there any way to reach the people actually working on this project? Raising the limit would require us volunteers to do a lot less gymnastics and fewer workarounds.
ID: 22328
Jonathan
Joined: 27 Sep 17
Posts: 161
Credit: 7,580,022
RAC: 1,014
Message 22329 - Posted: 4 Jan 2020, 5:00:41 UTC - in response to Message 22328.  

VMs don't like throttling. Pick a different project for your laptop.
There is a checklist on the LHC forums that may help with VirtualBox tasks:

Checklist Version 3 for Atlas@Home (and other VM-based Projects) on your PC
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4161#29359
ID: 22329