dariober

Reputation: 9062

Handle environmental variables in config options

I have a Snakemake command line with configuration options like this:

snakemake --config \
    f1=$PWD/file1.txt \
    f2=$PWD/file2.txt \
    f3=/path/to/file3.txt \
    ...more key-value pairs \
    --directory /path/to/output/dir

file1.txt and file2.txt are expected to be in the same directory as the Snakefile; file3.txt is somewhere else. I need the file paths to be absolute, hence the $PWD variable, so that Snakemake can still find the files after changing to /path/to/output/dir.

Because I am starting to have several configuration options, I would like to move all the --config items to a separate YAML configuration file. The problem is: how do I pass the $PWD variable through to a configuration file?

I could have a dummy string in the yaml file indicating that that string is to be replaced by the directory where the Snakefile is (e.g. f1: <replace me>/file1.txt) but I feel it's awkward. Any better ideas? It may be that I should rethink how the files fileX.txt are passed to snakemake...

Upvotes: 1

Views: 603

Answers (2)

SultanOrazbayev

Reputation: 16581

One option is to use an external module, intake, to handle the environment variable integration. There is a similar answer, but a more specific example for this question is as follows.

The YAML file follows the syntax expected by intake: a field called sources contains nested entries, each of which specifies at the very least a (possibly local) url at which the file can be accessed:

# config.yml
sources:
  file1:
    args:
      url: "{{env(PWD)}}/file1.txt"
  file2:
    args:
      url: "{{env(PWD)}}/file2.txt"

Inside the Snakefile, the relevant code would be:

import intake
cat = intake.open_catalog('config.yml')
f1 = cat['file1'].urlpath
f2 = cat['file2'].urlpath

Note that for less verbose yaml files, intake provides syntax for parameterization, see the docs or this example.

Upvotes: 1

KeyboardCat

Reputation: 559

You can access the directory the Snakefile lives in with workflow.basedir. You might be able to get away with specifying relative paths in the config file and then building the absolute paths in your Snakefile, e.g.:

import pathlib
# workflow.basedir is the directory containing the Snakefile
file1 = pathlib.Path(workflow.basedir) / config["f1"]
file2 = pathlib.Path(workflow.basedir) / config["f2"]
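
For a mixed config like the one in the question (relative f1 and f2, absolute f3), the same join should work for every entry, because pathlib drops the left-hand side when the right-hand side is already absolute. Below is a minimal sketch of that behaviour outside Snakemake, assuming POSIX-style paths. The helper name resolve_config_path, the basedir string and the config dict are hypothetical stand-ins for workflow.basedir and the parsed config.

```python
import pathlib


def resolve_config_path(basedir, value):
    """Join a config path onto basedir; absolute values come back unchanged."""
    # pathlib discards the left operand when the right operand is absolute.
    return pathlib.Path(basedir) / value


# Hypothetical stand-ins for workflow.basedir and Snakemake's config dict.
basedir = "/path/to/workflow"
config = {"f1": "file1.txt", "f3": "/path/to/file3.txt"}

file1 = resolve_config_path(basedir, config["f1"])  # relative: joined onto basedir
file3 = resolve_config_path(basedir, config["f3"])  # absolute: kept as given
```

Under these assumptions, the config file can hold plain relative names for files that sit next to the Snakefile and full paths for everything else, with no placeholder string to substitute.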

Upvotes: 0
