
Message boards : News : Docker-based applications upgrade

Marius
Project administrator
Project developer
Project scientist
Joined: 29 Jun 15
Posts: 423
Credit: 4,276
RAC: 0
Message 21077 - Posted: 31 May 2016, 11:48:40 UTC
Last modified: 1 Jun 2016, 13:54:49 UTC

The camb_boinc2docker and planck_param_sims apps are getting an upgrade which should further reduce the number of failed jobs. Read more about the upgrade in the comments.

Marius
Project administrator
Project developer
Project scientist
Joined: 29 Jun 15
Posts: 423
Credit: 4,276
RAC: 0
Message 21078 - Posted: 31 May 2016, 11:48:50 UTC - in response to Message 21077.
Last modified: 31 May 2016, 11:54:10 UTC

Over the next couple of days I'm going to upgrade the planck_param_sims and camb_boinc2docker applications. This somewhat major update might cause some of your currently running jobs to fail. I'm going to do my best to avoid this, but just in case, I apologize in advance.

I'm excited about this upgrade, though. The way these two applications currently work is that they fire up a VM, and the first thing the VM does is download (if you don't have it yet) the Docker image which houses the actual analysis code. For various reasons, this can sometimes fail. In fact, somewhere between 3% and 10% of our jobs fail this way. It's far and away the biggest cause of failed jobs. What this update does is have BOINC itself download the Docker images, so you will see the download and its progress in your "Transfers" tab, it will be retried if it fails, and other computation can continue while the download happens. Also, the intelligent way in which Docker avoids re-downloading the parts of an image you already have is preserved. As you can see, pretty awesome :)
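To make the idea concrete, here is a rough sketch in Python of "only fetch the parts you don't have, and retry a part if its download fails". This is not the project's actual code or BOINC's API: the function names, the layer names, and the `fetch` callback are all made up for illustration.

```python
from typing import Callable, Iterable, List, Set


def layers_to_download(image_layers: Iterable[str], cached: Set[str]) -> List[str]:
    """Return, in order, the layers of an image that are not already cached."""
    return [layer for layer in image_layers if layer not in cached]


def download_with_retry(fetch: Callable[[str], bool], layer: str,
                        max_attempts: int = 3) -> bool:
    """Call fetch(layer) until it reports success or the attempts run out."""
    for _ in range(max_attempts):
        if fetch(layer):
            return True
    return False


def update_image(image_layers: Iterable[str], cached: Set[str],
                 fetch: Callable[[str], bool], max_attempts: int = 3) -> List[str]:
    """Download only the missing layers and return the list of layers fetched.

    `fetch` is a hypothetical stand-in for whatever actually transfers a layer.
    """
    fetched = []
    for layer in layers_to_download(image_layers, cached):
        if not download_with_retry(fetch, layer, max_attempts):
            raise RuntimeError(f"giving up on layer {layer}")
        cached.add(layer)
        fetched.append(layer)
    return fetched
```

With a cache that already holds two of three (hypothetical) layers, only the third is requested, and a second call for the same image requests nothing.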

So sometime later today or tomorrow I'm upgrading the two apps to function in this way. You'll know they've been upgraded when you see the version number go to 2.0. Happy to hear feedback in this thread.

Marius
Project administrator
Project developer
Project scientist
Joined: 29 Jun 15
Posts: 423
Credit: 4,276
RAC: 0
Message 21079 - Posted: 1 Jun 2016, 13:18:09 UTC - in response to Message 21078.
Last modified: 1 Jun 2016, 13:54:39 UTC

Starting upgrade now. Will post back when done.

EDIT: Upgrade done as of now. I *think* I managed to do it without killing any of your currently-running jobs. Let me know how the new version 2.0 apps go.

rbpeake
Joined: 27 Jun 07
Posts: 118
Credit: 50,318
RAC: 0
Message 21080 - Posted: 1 Jun 2016, 19:22:56 UTC
Last modified: 1 Jun 2016, 19:24:12 UTC

Error on this Planck job:

http://www.cosmologyathome.org/result.php?resultid=40740603

Marius
Project administrator
Project developer
Project scientist
Joined: 29 Jun 15
Posts: 423
Credit: 4,276
RAC: 0
Message 21081 - Posted: 2 Jun 2016, 9:54:03 UTC - in response to Message 21080.
Last modified: 2 Jun 2016, 9:59:13 UTC

Error on this Planck job:

http://www.cosmologyathome.org/result.php?resultid=40740603

Thanks. Indeed, a few Planck jobs slipped through the cracks and got set to run the 2.0 version with the 1.X input files (which doesn't work). I think this should be fixed now, and in any case they'll cycle out soon enough.

Thunder
Joined: 15 Apr 08
Posts: 101
Credit: 4,535,998
RAC: 2
Message 21082 - Posted: 2 Jun 2016, 12:54:42 UTC

I'm glad to hear that explanation, because I was coming to say that my rate of error tasks had greatly increased (not decreased). I'll keep an eye on it for the next few days and see if that stops.

Pierce.Moore
Joined: 8 Jun 16
Posts: 1
Credit: 8,109
RAC: 0
Message 21099 - Posted: 21 Jun 2016, 17:36:36 UTC

For some reason I didn't see this message in BOINC until right now. I read through it and had two thoughts:

1) That is a fantastic upgrade. I am a Docker fanboy and love the shift toward having the BOINC manager handle the downloads/transfers natively.

2) I learned that I didn't know how to check for error rates or failed jobs. I know it's been weeks since this upgrade occurred, but at least for future reference, is there a piece of documentation or a forum post explaining the process of checking that? If one of you fine people can point me in that direction with a link, it would be much appreciated!

Jim1348
Joined: 17 Nov 14
Posts: 48
Credit: 2,358,299
RAC: 0
Message 21100 - Posted: 21 Jun 2016, 17:57:07 UTC - in response to Message 21099.
Last modified: 21 Jun 2016, 17:58:24 UTC

2) I learned that I didn't know how to check for error rates or failed jobs. I know it's been weeks since this upgrade occurred but at least for future reference is there a piece of documentation or a forum post explaining the process of checking that?

The only one for failed jobs that I know of is accessed via Community/Your Account/Tasks. Then you can click on "Error" to see just the ones that failed. That is pretty much standard for all BOINC projects, except World Community Grid (WCG), which has its own system.
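If you want an actual error rate rather than just the list, you can read the task counts off that page and do the arithmetic yourself. A minimal sketch in Python, assuming you have typed the counts in by hand; the status labels and the numbers below are illustrative, not pulled from any BOINC API:

```python
def error_rate(counts: dict) -> float:
    """Fraction of completed tasks that ended in error.

    `counts` maps a status label to a number of tasks. Only "Valid",
    "Invalid" and "Error" are treated as completed here; anything else
    (for example tasks still in progress) is ignored. The labels are an
    assumption for this sketch.
    """
    completed = sum(counts.get(label, 0) for label in ("Valid", "Invalid", "Error"))
    if completed == 0:
        return 0.0
    return counts.get("Error", 0) / completed


# Hypothetical counts copied by hand from a Tasks page.
my_counts = {"Valid": 90, "Error": 10, "In progress": 4}
print(f"{error_rate(my_counts):.1%}")
```

Whether invalid results should count toward the denominator is a judgment call; this sketch includes them because they are finished tasks.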
