Docker-based applications upgrade
Author | Message |
---|---|
Project administrator, Project developer, Project scientist Joined: 29 Jun 15 Posts: 470 Credit: 4,276 RAC: 0 |
The camb_boinc2docker and planck_param_sims apps are getting an upgrade which should further reduce the number of failed jobs. Read more about the upgrade in the comments. |
Project administrator, Project developer, Project scientist Joined: 29 Jun 15 Posts: 470 Credit: 4,276 RAC: 0 |
Over the next couple of days I'm going to upgrade the planck_param_sims and camb_boinc2docker applications. This somewhat major update might cause some of your currently-running jobs to fail. I'm going to do my best to avoid this, but just in case, I apologize in advance. I'm excited about this upgrade though. The way these two applications currently work is that they fire up a VM, and the first thing the VM does is download (if you don't have it yet) the Docker image which houses the actual analysis code. For various reasons, this can sometimes fail. In fact, somewhere between 3% and 10% of our jobs fail this way. It's far and away the biggest cause of failed jobs. What this update does is make it so the Docker images are downloaded by BOINC itself, so you will see the download and its progress in your "Transfers" tab, it will be retried if it fails, and other computation can continue while the download happens. Also, the intelligent way in which Docker avoids re-downloading the parts of an image you already have is preserved. As you can see, pretty awesome :) So sometime later today or tomorrow I'm upgrading the two apps to function in this way. You'll know they've been upgraded when you see the version number go to 2.0. Happy to hear feedback in this thread. |
Project administrator, Project developer, Project scientist Joined: 29 Jun 15 Posts: 470 Credit: 4,276 RAC: 0 |
Starting upgrade now. Will post back when done. EDIT: Upgrade done as of now. I *think* I managed to do it without killing any of your currently-running jobs. Let me know how the new version 2.0 apps go. |
rbpeake Joined: 27 Jun 07 Posts: 118 Credit: 61,883 RAC: 0 |
Error on this Planck job: http://www.cosmologyathome.org/result.php?resultid=40740603 |
Project administrator, Project developer, Project scientist Joined: 29 Jun 15 Posts: 470 Credit: 4,276 RAC: 0 |
"Error on this Planck job:" Thanks. Indeed, a few Planck jobs slipped through the cracks and got set to run the 2.0 version with the 1.X input files (which doesn't work). I think this should be fixed now, and in any case they'll cycle out soon enough. |
Joined: 15 Apr 08 Posts: 101 Credit: 4,535,998 RAC: 0 |
I'm glad to hear that explanation, because I was coming to say that my rate of failed tasks had greatly increased (not decreased). I'll keep an eye on it for the next few days and see if that stops. |
Pierce.Moore Joined: 8 Jun 16 Posts: 1 Credit: 8,109 RAC: 0 |
For some reason I didn't see this message in BOINC until right now. I read through it and had two thoughts: 1) That is a fantastic upgrade. I am a Docker fanboy and love the shift toward having the BOINC manager handle the downloads/transfers natively. 2) I learned that I didn't know how to check for error rates or failed jobs. I know it's been weeks since this upgrade occurred, but at least for future reference, is there a piece of documentation or a forum post explaining how to check that? If one of you fine people can point me in that direction with a link, it would be much appreciated! |
Jim1348 Joined: 17 Nov 14 Posts: 134 Credit: 5,412,499 RAC: 14 |
"2) I learned that I didn't know how to check for error rates or failed jobs. I know it's been weeks since this upgrade occurred but at least for future reference is there a piece of documentation or a forum post explaining the process of checking that?" The only one for failed jobs that I know of is accessed via Community/Your Account/Tasks. Then you can click on "Error" to see just the ones that failed. That is pretty much standard for all BOINC projects, except WCG, which has its own system. |
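
For anyone who wants a number to go with the "Error" list on the Tasks page, here is a minimal sketch of how an error rate could be tallied by hand. It assumes you have copied the outcome of each task into a list yourself. The `error_rate` helper, the outcome labels, and the sample data are all hypothetical and are not part of BOINC or the project's site.

```python
from collections import Counter


def error_rate(outcomes):
    """Return the fraction of tasks whose outcome is "error".

    `outcomes` is a list of strings transcribed by hand from the
    Tasks page; the labels used here are assumptions for illustration.
    """
    if not outcomes:
        return 0.0
    counts = Counter(outcomes)
    return counts["error"] / len(outcomes)


# Hypothetical sample: 18 tasks that validated and 2 that errored.
sample = ["valid"] * 18 + ["error"] * 2
print(f"{error_rate(sample):.1%}")  # prints 10.0%
```

A result at or above the 3% to 10% range mentioned earlier in the thread would suggest your machine is seeing at least as many failures as the project did before the upgrade.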