Reputation: 3
I am struggling to understand how snakemake submits jobs to slurm.
When I have a basic slurm sbatch script I usually add a line, such as
#SBATCH --mem=5G
to tell Slurm that the job may use at most 5 gigabytes of memory.
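For context, this is the kind of minimal sbatch script I mean (the job name and command are just placeholders):
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --cpus-per-task=1
#SBATCH --mem=5G
#SBATCH --time=00:10:00

my_command input.txt output.txt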
Now I am using snakemake together with slurm, running snakemake --configfile config.yaml --snakefile test.smk --profile simple/.
The profile file looks like this:
cluster:
  mkdir -p logs &&
  sbatch
    --partition={resources.partition}
    --cpus-per-task={threads}
    --gpus={resources.gpus}
    --mem={resources.mem_mb}
    --time={resources.runtime}
    --job-name={rule}
    --output=logs/{rule}.out
    --error=logs/{rule}.err
    --parsable
default-resources:
  - partition=batch
  - runtime=10
  - nodes=1
slurm: True
Now, most of the time I do not define resources in my snakemake rules, so it's just using the defaults (whatever those are?). However, I noticed that when an input file to a snakemake rule is fairly large (or I am passing in several larger files at once), I often get an error at startup along the lines of "slurm submission failed, cannot satisfy memory specification", and snakemake stops before it can get started. I then need to specify the memory manually, as done in this rule:
for plevel in plevels:
# PURPOSE: Link the previously generated ec.bin files to speed up hic runs.
rule:
name: f"run_link_bin_{plevel}"
input:
hifiasm_bin=expand("{output_directory}/hifi/hifiasm/{species_lower}.ec.bin", output_directory=config["output_directory"], species_lower=config["species_lower"]),
hifiasm_bin_reverse=expand("{output_directory}/hifi/hifiasm/{species_lower}.ovlp.reverse.bin", output_directory=config["output_directory"], species_lower=config["species_lower"]),
hifiasm_bin_source=expand("{output_directory}/hifi/hifiasm/{species_lower}.ovlp.source.bin", output_directory=config["output_directory"], species_lower=config["species_lower"]),
output:
ln_hifiasm_bin=expand("{output_directory}/hic/hifiasm/purge_level_{plevel}/{species_lower}.ec.bin", output_directory=config["output_directory"], species_lower=config["species_lower"], plevel=plevel),
ln_hifiasm_bin_reverse=expand("{output_directory}/hic/hifiasm/purge_level_{plevel}/{species_lower}.ovlp.reverse.bin", output_directory=config["output_directory"], species_lower=config["species_lower"], plevel=plevel),
ln_hifiasm_bin_source=expand("{output_directory}/hic/hifiasm/purge_level_{plevel}/{species_lower}.ovlp.source.bin", output_directory=config["output_directory"], species_lower=config["species_lower"], plevel=plevel),
        message: "Message: Link the previously generated ec.bin files to speed up the re-run."
resources:
            slurm_partition="bigmem",
mem_mb=1000000, # pointless here, but snakemake sees the large input size and wants more memory
shell:
"""
ln -s {input.hifiasm_bin} {output.ln_hifiasm_bin}
ln -s {input.hifiasm_bin_reverse} {output.ln_hifiasm_bin_reverse}
ln -s {input.hifiasm_bin_source} {output.ln_hifiasm_bin_source}
"""
As I am sure you noticed, allocating so much memory in this case makes absolutely no sense, as the rule just soft-links some (albeit fairly large) files. However, if I don't, snakemake cannot start running this rule.
So I guess my question is: how much memory does snakemake automatically request in a slurm job if the mem_mb parameter is not set within the rule? It seems to depend on the input file size? And what would be the best practice in this case?
Upvotes: 0
Views: 648
Reputation: 165
First, use the slurm executor.
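For example (this assumes Snakemake >= 8 with the snakemake-executor-plugin-slurm installed; values are placeholders), the profile can then be reduced to something like:
executor: slurm
jobs: 100
default-resources:
  - slurm_partition=batch
  - runtime=10
The executor builds the sbatch call itself, so you no longer have to maintain the cluster: command line by hand.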
If no resources are specified, snakemake uses its defaults. Those errors with large files happen because snakemake's default memory resource is 2*input.size_mb (and that might be more than you have available on your cluster).
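For example, you could pin the default yourself by adding mem_mb to the existing default-resources in the profile (4000 is just a placeholder; pick what fits your cluster):
default-resources:
  - partition=batch
  - runtime=10
  - nodes=1
  - mem_mb=4000
or pass it on the command line with --default-resources mem_mb=4000. Rules that genuinely need more can still override this in their own resources: section.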
Upvotes: 1
Reputation: 9062
So I guess my question is, how much memory does snakemake automatically request in a slurm job if the mem_mb parameter is not set within the rule?
I don't think snakemake allocates any resources automatically. If mem_mb is not present in a rule's resources, then snakemake will use whatever you have in default-resources, and if there is no entry for mem_mb in default-resources either, then the sbatch job will be submitted without the --mem option and you will get whatever default the cluster administrator has set.
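For example, a stripped-down sketch of a rule like your linking rule (paths and the 500 MB figure are just illustrative):
rule link_bin:
    input:
        "hifi/hifiasm/sample.ec.bin",
    output:
        "hic/hifiasm/sample.ec.bin",
    resources:
        mem_mb=500  # small fixed amount: the rule only creates a symlink
    shell:
        "ln -s $(realpath {input}) {output}"
Here the submitted job gets --mem=500 no matter how large the input file is, because the rule-level value takes precedence over anything in default-resources.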
I don't claim this to be a best practice, just a suggestion: set mem_mb in default-resources to a sensible default.
Upvotes: 0