Add job avoidance
Created by: julianhess
The main contribution of this PR is (basic) job avoidance. When a job completes, we pickle its output dataframe; when another job is run with the same output directory, we search for this pickle and, if found, compare it against the current job's specifications to decide whether the job can be skipped.
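The avoidance check could be sketched roughly as follows. All names here are illustrative (not the actual functions or pickle filename in the codebase), and a plain dict stands in for the pickled output dataframe:

```python
import os
import pickle

def job_avoid(output_dir, job_spec):
    # Hypothetical sketch of the job-avoidance check. `job_spec` stands
    # in for the output dataframe; a plain dict is used for illustration.
    pickle_path = os.path.join(output_dir, "job_spec.pickle")
    if os.path.exists(pickle_path):
        with open(pickle_path, "rb") as f:
            prev = pickle.load(f)
        # Avoid re-running the job iff the specifications match
        return prev == job_spec
    # First run with this output directory: record the spec for
    # future avoidance checks
    with open(pickle_path, "wb") as f:
        pickle.dump(job_spec, f)
    return False
```

A second invocation with an identical spec and the same output directory would then be skipped, while a changed spec falls through to a normal run.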
This PR also adds the following miscellaneous features:
- Preemptible NFS automatic restarting (Docker backend)
  - Launches a background thread, active for the duration of the backend context manager, that checks every minute whether the NFS server has been preempted, and restarts it if so
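The watchdog pattern described above might look like this; `check_preempted` and `restart_nfs` are stand-ins for the backend's real probe and restart logic, and the interval is configurable:

```python
import threading

def start_nfs_watchdog(check_preempted, restart_nfs, interval=60.0):
    # Hypothetical sketch of the NFS watchdog thread. The callables are
    # placeholders for the backend's actual preemption check and restart.
    stop = threading.Event()

    def _loop():
        # Event.wait doubles as the periodic sleep and the shutdown
        # signal: it returns False on timeout, True once stop is set
        while not stop.wait(interval):
            if check_preempted():
                restart_nfs()

    t = threading.Thread(target=_loop, daemon=True)
    t.start()
    return stop  # the context manager sets this on exit to stop the loop
```

Returning the `Event` (rather than the thread) makes shutdown a one-liner in the context manager's `__exit__`.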
- Automatically override localization to null for inputs that are absolute paths residing on the same NFS share and are not Canine outputs
  - E.g. if a user specifies an input with the absolute path `/mnt/nfs/foo/bar/bah`, and the output directory resides somewhere in `/mnt/nfs`, this input will not be symlinked into the inputs directory, since we assume that the user meant the absolute path as a string literal.
  - This is especially useful for running things like aligners, which often take a path to a reference FASTA and implicitly assume that ancillary files (e.g. indices) are present in the same directory, with the same basename as the reference FASTA.
  - Note that this only applies to absolute paths outside of a Canine directory structure. If we detect that the input path is a Canine output directory (via a simple regex), we will symlink it, since this is likely being run as part of a wolF workflow.
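The decision rule above can be sketched like so. The regex and function names are illustrative only; the actual pattern used to recognize Canine output directories may differ:

```python
import os
import re

# Illustrative stand-in for the "simple regex" that recognizes a Canine
# output directory; the real pattern in the codebase may differ
CANINE_OUTPUT_RE = re.compile(r"/outputs/\d+/")

def should_localize(input_path, nfs_mount):
    # Sketch: decide whether to symlink an input into the inputs
    # directory, per the rules described above
    if not os.path.isabs(input_path):
        return True  # relative paths are always localized
    on_share = os.path.commonpath([input_path, nfs_mount]) == nfs_mount
    if on_share and not CANINE_OUTPUT_RE.search(input_path):
        # Treat as a string literal: e.g. a reference FASTA whose index
        # files must stay beside it under the same basename
        return False
    return True
```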
- Track task runtime in seconds (not hours)
- Bump required Python version to 3.7 (we are implicitly assuming ordered dicts)
  - If this is a problem, we can explicitly use `OrderedDict` when necessary.
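For context, the ordering guarantee we are relying on (a language-level guarantee since 3.7; in CPython 3.6 it was an implementation detail) looks like this; the input names here are made up:

```python
# Since Python 3.7, plain dicts preserve insertion order by spec, so
# inputs keyed by name iterate in the order they were declared
inputs = {"ref": "/mnt/nfs/ref.fa", "bam": "sample.bam", "threads": "4"}
assert list(inputs) == ["ref", "bam", "threads"]
```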
- Kill any jobs still running on cluster when we stop the cluster
  - I thought the Slurm worker daemons did this automatically, but I guess not!
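The teardown step amounts to something like the following; `scancel -u <user>` is standard Slurm and cancels all of that user's jobs, though the actual implementation may target specific job IDs instead:

```python
import subprocess

def cancel_all_user_jobs(user, dry_run=False):
    # Sketch: kill any jobs still on the cluster before stopping it.
    # scancel -u cancels every job belonging to the given user.
    cmd = ["scancel", "-u", user]
    if dry_run:
        return cmd  # expose the command for inspection instead of running
    subprocess.check_call(cmd)
```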