Message boards : Technical Support : Camb_Legacy 2.17 jobs restarting

David Guymer
Joined: 28 Jan 08
Posts: 2
Credit: 530,618
RAC: 2,297
Message 21341 - Posted: 10 Apr 2017, 12:17:45 UTC

Camb_Legacy 2.17 jobs are regressing to an earlier point upon a restart of my PC

Marius
Project administrator
Project developer
Project scientist
Joined: 29 Jun 15
Posts: 427
Credit: 4,276
RAC: 0
Message 21359 - Posted: 30 Apr 2017, 12:51:33 UTC - in response to Message 21341.

Camb_Legacy 2.17 jobs are regressing to an earlier point upon a restart of my PC

Hi David, the jobs checkpoint periodically, so you are likely seeing them regress to the previous checkpoint. Unfortunately, checkpointing is fairly costly, so we can't do it much more often than we currently do.
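
To illustrate why a restart loses some progress, here is a minimal sketch of periodic checkpointing in Python. It is not the project's actual code: the function names (`save_checkpoint`, `load_checkpoint`, `run`), the JSON file format, and the `stop_at` parameter used to simulate a PC restart are all assumptions made for this example.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state to a temp file, then atomically swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)  # a crash mid-write never leaves a half-written file


def load_checkpoint(path):
    """Return the last saved state, or a fresh one if no checkpoint exists."""
    if not os.path.exists(path):
        return {"step": 0, "total": 0}
    with open(path) as f:
        return json.load(f)


def run(path, n_steps, checkpoint_every, stop_at=None):
    """Do n_steps of toy work, saving a checkpoint every checkpoint_every steps.

    stop_at is a hypothetical knob that simulates a restart: the function
    returns at that step without saving, so unsaved progress is lost.
    """
    state = load_checkpoint(path)
    while state["step"] < n_steps:
        if stop_at is not None and state["step"] == stop_at:
            return state
        state["total"] += state["step"]  # stand-in for the real computation
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

In this sketch, a job interrupted at step 75 with a checkpoint every 30 steps resumes from step 60, so the progress shown after the restart is lower than it was just before it. Checkpointing more often shrinks that gap but spends more time writing state to disk.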

Close At Hand
Joined: 21 May 17
Posts: 2
Credit: 202,422
RAC: 290
Message 21446 - Posted: 23 May 2017, 14:46:33 UTC

I am seeing the camb_legacy jobs restart constantly. Some get to about 15 minutes then go back to 13 minutes, over and over and over again. It would crash and reload from a checkpoint forever if I let it. This is with VB 5.1.22

mmonnin
Joined: 29 Dec 16
Posts: 18
Credit: 420,063
RAC: 1,339
Message 21447 - Posted: 23 May 2017, 18:02:53 UTC - in response to Message 21446.

I am seeing the camb_legacy jobs restart constantly. Some get to about 15 minutes then go back to 13 minutes, over and over and over again. It would crash and reload from a checkpoint forever if I let it. This is with VB 5.1.22


For reference, the legacy tasks do not use VB. Just planck and docker.

Is the % actually going up and down or just the time estimate?

Close At Hand
Joined: 21 May 17
Posts: 2
Credit: 202,422
RAC: 290
Message 21448 - Posted: 23 May 2017, 20:41:12 UTC - in response to Message 21447.

Is the % actually going up and down or just the time estimate?

The checkpoint loads with 15:19 elapsed and about 24.769% done. At 16:16 elapsed, the % done drops to 24.029, but the elapsed time keeps increasing, and the % done continues increasing from there. Around 17:40 elapsed, it reloads the checkpoint.

mmonnin
Joined: 29 Dec 16
Posts: 18
Credit: 420,063
RAC: 1,339
Message 21449 - Posted: 24 May 2017, 15:22:04 UTC - in response to Message 21448.

I was wondering if the ETA was just way off (too short) to start with, and was being adjusted to a longer ETA at the completed checkpoints. That doesn't seem to be the case if it's going back to the same checkpoint.
