Lamma

Reputation: 1507

yaml.dump seems to add two dashes to the 2nd row under the first key

I am using yaml.dump to generate YAML config files for snakemake, but I keep getting the error "Config file must be given as JSON or YAML with keys at top level".

I think it might be caused by incorrect formatting of the YAML files that yaml.dump produces:

The code that writes the file:

with open("yaml-config-files/"+args.name+".yaml", "w") as outfile:
    ruamel.yaml.dump(yaml_dict, outfile, default_flow_style=False)

The output:

- samples:
  - - Unmap_54_1.fastq
    - Unmap_51_2.fastq
    - Unmap_55_2.fastq
    - Unmap_50_1.fastq
    - Unmap_16_1.fastq
    - Unmap_13_2.fastq
    - Unmap_17_2.fastq
    - Unmap_12_1.fastq
    - Unmap_31_1.fastq
    - Unmap_34_2.fastq
    - Unmap_30_2.fastq
    - Unmap_35_1.fastq
    - Unmap_06_2.fastq
    - Unmap_03_1.fastq
    - Unmap_07_1.fastq
    - Unmap_02_2.fastq
    - Unmap_28_1.fastq
    - Unmap_21_2.fastq
    - Unmap_24_1.fastq
    - Unmap_25_2.fastq
    - Unmap_44_2.fastq
    - Unmap_41_1.fastq
    - Unmap_40_2.fastq
    - Unmap_02_1.fastq
    - Unmap_07_2.fastq
    - Unmap_03_2.fastq
    - Unmap_06_1.fastq
    - Unmap_25_1.fastq
    - Unmap_24_2.fastq
    - Unmap_21_1.fastq
    - Unmap_28_2.fastq
    - Unmap_40_1.fastq
    - Unmap_41_2.fastq
    - Unmap_44_1.fastq
    - Unmap_50_2.fastq
    - Unmap_55_1.fastq
    - Unmap_51_1.fastq
    - Unmap_54_2.fastq
    - Unmap_12_2.fastq
    - Unmap_17_1.fastq
    - Unmap_13_1.fastq
    - Unmap_16_2.fastq
    - Unmap_35_2.fastq
    - Unmap_30_1.fastq
    - Unmap_34_1.fastq
    - Unmap_31_2.fastq
    - Unmap_27_1.fastq
    - Unmap_22_2.fastq
    - Unmap_26_2.fastq
    - Unmap_23_1.fastq
    - Unmap_05_2.fastq
    - Unmap_01_2.fastq
    - Unmap_04_1.fastq
    - Unmap_09_2.fastq
    - Unmap_08_1.fastq
    - Unmap_42_1.fastq
    - Unmap_47_2.fastq
    - Unmap_43_2.fastq
    - Unmap_46_1.fastq
    - Unmap_52_2.fastq
    - Unmap_57_1.fastq
    - Unmap_53_1.fastq
    - Unmap_56_2.fastq
    - Unmap_37_2.fastq
    - Unmap_36_1.fastq
    - Unmap_33_2.fastq
    - Unmap_19_1.fastq
    - Unmap_18_2.fastq
    - Unmap_10_2.fastq
    - Unmap_15_1.fastq
    - Unmap_11_1.fastq
    - Unmap_14_2.fastq
    - Unmap_56_1.fastq
    - Unmap_53_2.fastq
    - Unmap_57_2.fastq
    - Unmap_52_1.fastq
    - Unmap_33_1.fastq
    - Unmap_36_2.fastq
    - Unmap_37_1.fastq
    - Unmap_14_1.fastq
    - Unmap_11_2.fastq
    - Unmap_15_2.fastq
    - Unmap_10_1.fastq
    - Unmap_18_1.fastq
    - Unmap_19_2.fastq
    - Unmap_23_2.fastq
    - Unmap_26_1.fastq
    - Unmap_22_1.fastq
    - Unmap_27_2.fastq
    - Unmap_08_2.fastq
    - Unmap_09_1.fastq
    - Unmap_04_2.fastq
    - Unmap_01_1.fastq
    - Unmap_05_1.fastq
    - Unmap_46_2.fastq
    - Unmap_43_1.fastq
    - Unmap_47_1.fastq
    - Unmap_42_2.fastq
- path_to_files:
  - /home/lamma/ABR/Each_reads

The 2nd row has an extra dash. Does anyone know why this is occurring, and could it be causing the error that I am seeing?

Edit: I am populating my yaml_dict as follows:

yaml_dict = [{'samples' : [[os.path.basename(file) for file in glob.glob(path+"/*."+args.type)]]},
             {'path_to_files' : [path]}]

Upvotes: 3

Views: 4354

Answers (2)

SitiSchu

Reputation: 1417

Using the code from the comments:

yaml_dict = [{'samples' : [[os.path.basename(file) for file in glob.glob(path+"/*."+args.type)]]}, {'path_to_files' : [path]}]

The expression [[os.pa...type)]] creates a list of lists (note the double []). Replace it with the following:

yaml_dict = [{'samples' : [os.path.basename(file) for file in glob.glob(path+"/*."+args.type)]}, {'path_to_files' : [path]}]

which only creates a single list.
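
One further point, offered as an assumption rather than something tested against snakemake here: the error message asks for keys at the top level, and even with the single list the dumped file still begins with "- samples:", so the top level of the document is a list of mappings. A minimal sketch of dumping one plain dict instead, using PyYAML's safe_dump (the question uses ruamel.yaml) and two file names and the path copied from the question's output in place of the glob result:

```python
import yaml

# Stand-ins for the glob result and path, copied from the question's output.
files = ["Unmap_54_1.fastq", "Unmap_51_2.fastq"]
path = "/home/lamma/ABR/Each_reads"

# As in the question (with the double brackets removed): a list of dicts.
as_list = [{"samples": files}, {"path_to_files": [path]}]

# Alternative: one dict, so the keys sit at the top level of the document.
as_dict = {"samples": files, "path_to_files": [path]}

list_text = yaml.safe_dump(as_list, default_flow_style=False)
dict_text = yaml.safe_dump(as_dict, default_flow_style=False)

# Load both back and look at the type of the top-level object.
list_top = yaml.safe_load(list_text)
dict_top = yaml.safe_load(dict_text)

print(type(list_top).__name__)  # list
print(type(dict_top).__name__)  # dict
```

If snakemake does expect a mapping at the top level, as the wording of the error suggests, the second form is the one to write to the config file.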

Upvotes: 2

GandhiGandhi

Reputation: 1350

I think the second dash could be coming from how the input data is structured. Here's an example of input data that produces it:

>>> import yaml
>>> x = {"samples": [[1,2,3]]}
>>> print(yaml.dump(x))
samples:
- - 1
  - 2
  - 3

>>> 
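
For contrast, here is a small sketch of the same data with a single list, again with PyYAML. The values 1, 2, 3 are just the placeholders from the example above, and loading the text back is one way to confirm what each item under the key actually is:

```python
import yaml

nested = {"samples": [[1, 2, 3]]}  # list inside a list, as in the example above
flat = {"samples": [1, 2, 3]}      # single list

flat_text = yaml.safe_dump(flat, default_flow_style=False)
print(flat_text)

# Round-trip both documents and inspect the first item under the key.
nested_item = yaml.safe_load(yaml.safe_dump(nested))["samples"][0]
flat_item = yaml.safe_load(flat_text)["samples"][0]
print(type(nested_item).__name__, type(flat_item).__name__)  # list int
```

With the single list each entry gets one dash, and the first item loads back as a plain value rather than as another list.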

Upvotes: 1
