
Add job avoidance

Aaron Graubert requested to merge github/fork/julianhess/develop into master

Created by: julianhess

The main contribution of this PR is (basic) job avoidance: when a job completes, we pickle its output dataframe; when another job is run with the same output directory, it searches for this pickle and, if one is found, compares against it to check whether the job specifications match.
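The avoidance check could be sketched roughly as follows. This is a hypothetical illustration, not Canine's actual implementation: the function names and the pickle filename (`outputs.pickle`) are made up, and "job specifications match" is approximated here as dataframe equality.

```python
# Hypothetical sketch of pickle-based job avoidance; names are illustrative,
# not Canine's actual API.
import os
import pickle

import pandas as pd

AVOIDANCE_FILE = "outputs.pickle"  # assumed name for the saved output dataframe

def save_job_state(output_dir: str, outputs: pd.DataFrame) -> None:
    """Pickle the output dataframe when a job completes."""
    with open(os.path.join(output_dir, AVOIDANCE_FILE), "wb") as f:
        pickle.dump(outputs, f)

def can_avoid(output_dir: str, outputs: pd.DataFrame) -> bool:
    """Return True if a previous run in output_dir matches this job's spec."""
    path = os.path.join(output_dir, AVOIDANCE_FILE)
    if not os.path.exists(path):
        return False  # no previous run to avoid against
    with open(path, "rb") as f:
        previous = pickle.load(f)
    # Treat the job as avoidable if the saved dataframe matches exactly
    return previous.equals(outputs)
```

A caller would invoke `can_avoid` before localizing inputs and skip the job entirely on a match.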

This PR also adds the following miscellaneous features:

  • Preemptible NFS automatic restarting (Docker backend)
    • Launches a background thread, active for the duration of the backend context manager, that checks every minute whether the NFS server has been preempted and restarts it if so
  • Automatically override to null any inputs that are absolute paths residing on the same NFS share and are not Canine outputs
    • E.g. if a user specifies an input with the absolute path /mnt/nfs/foo/bar/bah, and the output directory resides somewhere in /mnt/nfs, this input will not be symlinked into the inputs directory, since we assume that the user meant the absolute path as a string literal.
    • This is especially useful for running things like aligners, which often take a path to a reference FASTA and implicitly assume that ancillary files (e.g. indices) are present in the same directory, with the same basename as the reference FASTA.
    • Note that this only applies to absolute paths outside of a Canine directory structure. If we detect that the input path is a Canine output directory (via a simple regex), we will symlink it, since this is likely being run as part of a wolF workflow.
  • Track task runtime in seconds (not hours)
  • Bump required Python version to 3.7 (we are implicitly assuming ordered dicts)
    • If this is a problem, we can explicitly use OrderedDict when necessary.
  • Kill any jobs still running on cluster when we stop the cluster
    • I thought the Slurm worker daemons did this automatically, but I guess not!
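The preemptible-NFS watcher above could be sketched as a daemon thread tied to the backend's context manager. This is a minimal sketch under assumptions: the `check_fn`/`restart_fn` callables stand in for whatever the Docker backend actually uses to probe and restart the NFS server.

```python
# Hypothetical sketch of the NFS watchdog thread; check/restart functions are
# placeholders for the Docker backend's actual probe and restart logic.
import threading

class NFSWatchdog:
    """Background thread that polls the NFS server and restarts it if preempted."""

    def __init__(self, check_fn, restart_fn, interval: float = 60.0):
        self._check = check_fn      # returns True if the NFS server is up
        self._restart = restart_fn  # restarts the NFS server
        self._interval = interval   # poll every minute by default
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait doubles as an interruptible sleep: returns True (and ends
        # the loop) as soon as the stop event is set
        while not self._stop.wait(self._interval):
            if not self._check():
                self._restart()

    # Context-manager protocol: the watchdog lives exactly as long as the
    # backend context manager, as described above
    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()
```

Using `Event.wait` rather than `time.sleep` lets the thread shut down promptly when the backend context exits instead of blocking for up to a full polling interval.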
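The absolute-path override heuristic could look something like the sketch below. Everything here is illustrative: the function name, the `nfs_root` parameter, and especially the regex for recognizing a Canine output directory are assumptions, since the PR only says "a simple regex" without specifying it.

```python
# Hypothetical sketch of the absolute-path input override described above;
# the Canine-output regex is a made-up stand-in for the real pattern.
import os
import re

# Assumed shape of a Canine output path (e.g. .../outputs/<jobid>/...)
CANINE_OUTPUT_RE = re.compile(r"/outputs/\d+/")

def should_localize(input_path: str, nfs_root: str) -> bool:
    """Decide whether an input should be symlinked into the inputs directory.

    Absolute paths already on the NFS share are treated as string literals
    (not localized), unless they look like Canine outputs, in which case they
    are likely part of a wolF workflow and should be symlinked as usual.
    """
    if not os.path.isabs(input_path):
        return True  # relative paths are always localized
    if not input_path.startswith(nfs_root):
        return True  # absolute paths off the share are localized as usual
    # On-share absolute path: only localize if it looks like a Canine output
    return bool(CANINE_OUTPUT_RE.search(input_path))
```

This keeps the aligner use case working: a reference FASTA passed as `/mnt/nfs/ref/hg38.fa` stays a literal path, so the aligner can find the sibling index files next to it.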
