Add TransientImage backend

Aaron Graubert requested to merge github/fork/julianhess/image into master

Created by: julianhess

Alpha release of the TransientImage backend. Some features are not yet implemented (as noted in pipeline_options.md and in numerous TODOs throughout the code).

In addition to adding the backend class itself, I made a few changes to the orchestrator:

  • Only perform a hard controller reset on startup if backend.hard_reset_on_orch_init == True (a new property added to the base backend; default True) and slurm_conf_path is specified. Previously, a hard reset was performed whenever slurm_conf_path was specified, regardless of the backend.
  • Pass slurm_conf_path and type to the backend. This required updating the base backend constructor, since some implementations do not expect any arguments.
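
For reviewers, the new reset condition can be sketched roughly as below. This is a minimal illustration, not the code in this MR: `BaseBackend`, `NoResetBackend`, and `should_hard_reset` are hypothetical names, and treating "specified" as "not None" is an assumption. Only `hard_reset_on_orch_init` and `slurm_conf_path` come from the description above.

```python
from typing import Optional


class BaseBackend:
    """Hypothetical stand-in for the base backend; models only the new property."""

    # New property described in this MR; defaults to True.
    hard_reset_on_orch_init = True

    def __init__(self, **kwargs):
        # Assumption: the base constructor tolerates extra keyword arguments
        # (e.g. slurm_conf_path) so subclasses that ignore them still work.
        self.extra_kwargs = kwargs


class NoResetBackend(BaseBackend):
    """Hypothetical backend that opts out of the hard reset on startup."""

    hard_reset_on_orch_init = False


def should_hard_reset(backend: BaseBackend, slurm_conf_path: Optional[str]) -> bool:
    # New behavior: both conditions must hold.
    # Old behavior (for comparison): slurm_conf_path is not None.
    return bool(backend.hard_reset_on_orch_init) and slurm_conf_path is not None
```

A backend that sets the property to False skips the reset even when a Slurm config path is supplied, which was not possible under the old condition.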

I also updated the criteria the base backend uses to assess whether a cluster is ready. Previously, it only checked if the partition was ready; now, it checks whether there are any nodes within the partition ready to accept jobs.
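
To make the difference concrete, here is a rough sketch of the node-level check. It is illustrative only: `any_node_ready` and `READY_STATES` are hypothetical names, the input is assumed to be (partition, node state) pairs gathered however the backend queries Slurm, and which node states count as "ready to accept jobs" is an assumption rather than the exact rule used in this MR.

```python
from typing import Iterable, Tuple

# Assumption: node states treated as able to accept jobs.
READY_STATES = frozenset({"idle", "mix"})


def any_node_ready(nodes: Iterable[Tuple[str, str]], partition: str) -> bool:
    """Return True if at least one node in `partition` is in a ready state.

    `nodes` is an iterable of (partition_name, node_state) pairs.
    """
    return any(
        part == partition and state in READY_STATES
        for part, state in nodes
    )
```

Under this sketch, a partition whose nodes are all down or still booting is reported as not ready, even if the partition itself is up.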