Aadit M Shah

Reputation: 74234

How to read a python tuple using PyYAML?

I have the following YAML file named input.yaml:

cities:
  1: [0,0]
  2: [4,0]
  3: [0,4]
  4: [4,4]
  5: [2,2]
  6: [6,2]
highways:
  - [1,2]
  - [1,3]
  - [1,5]
  - [2,4]
  - [3,4]
  - [5,4]
start: 1
end: 4

I'm loading it using PyYAML and printing the result as follows:

import yaml

f = open("input.yaml", "r")
data = yaml.load(f, Loader=yaml.FullLoader)  # PyYAML 6.0+ requires an explicit Loader
f.close()

print(data)

The result is the following data structure:

{ 'cities': { 1: [0, 0]
            , 2: [4, 0]
            , 3: [0, 4]
            , 4: [4, 4]
            , 5: [2, 2]
            , 6: [6, 2]
            }
, 'highways': [ [1, 2]
              , [1, 3]
              , [1, 5]
              , [2, 4]
              , [3, 4]
              , [5, 4]
              ]
, 'start': 1
, 'end': 4
}

As you can see, each city and highway is represented as a list. However, I want them to be represented as tuples, so I manually convert them using comprehensions:

import yaml

f = open("input.yaml", "r")
data = yaml.load(f, Loader=yaml.FullLoader)  # PyYAML 6.0+ requires an explicit Loader
f.close()

data["cities"] = {k: tuple(v) for k, v in data["cities"].items()}
data["highways"] = [tuple(v) for v in data["highways"]]

print(data)

However, this seems like a hack. Is there some way to instruct PyYAML to directly read them as tuples instead of lists?

Upvotes: 39

Views: 69270

Answers (6)

idjaw

Reputation: 26600

I wouldn't call what you've done hacky for what you are trying to do. As I understand it, the alternative is to use Python-specific tags in your YAML file so that the data is represented appropriately when the file is loaded. However, this requires modifying your YAML file, which, if it is huge, is probably going to be pretty irritating and not ideal.

Look at the PyYAML documentation, which illustrates this further. Ultimately, you want to place !!python/tuple in front of each structure that you want represented as a tuple. With your sample data, it would look like this:

YAML FILE:

cities:
  1: !!python/tuple [0,0]
  2: !!python/tuple [4,0]
  3: !!python/tuple [0,4]
  4: !!python/tuple [4,4]
  5: !!python/tuple [2,2]
  6: !!python/tuple [6,2]
highways:
  - !!python/tuple [1,2]
  - !!python/tuple [1,3]
  - !!python/tuple [1,5]
  - !!python/tuple [2,4]
  - !!python/tuple [3,4]
  - !!python/tuple [5,4]
start: 1
end: 4

Sample code:

import yaml

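with open('y.yaml') as f:
    # !!python/tuple is a Python-specific tag: SafeLoader rejects it,
    # FullLoader (PyYAML 5.1+) understands it
    d = yaml.load(f, Loader=yaml.FullLoader)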

print(d)

Which will output:

{'cities': {1: (0, 0), 2: (4, 0), 3: (0, 4), 4: (4, 4), 5: (2, 2), 6: (6, 2)}, 'start': 1, 'end': 4, 'highways': [(1, 2), (1, 3), (1, 5), (2, 4), (3, 4), (5, 4)]}
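
If tagging a large file by hand is impractical, the post-load conversion from the question can be generalized so it does not depend on the key names. The following is only a minimal sketch: leaf_lists_to_tuples is a hypothetical helper (not part of PyYAML), and the dict literal stands in for whatever the loader returned.

```python
def leaf_lists_to_tuples(obj):
    """Recursively turn lists that hold no nested containers into tuples."""
    if isinstance(obj, dict):
        return {key: leaf_lists_to_tuples(value) for key, value in obj.items()}
    if isinstance(obj, list):
        items = [leaf_lists_to_tuples(value) for value in obj]
        # Keep outer lists (those that still contain containers) as lists
        if any(isinstance(value, (list, tuple, dict)) for value in items):
            return items
        return tuple(items)
    return obj


# Stand-in for the structure loaded from the untagged YAML file
loaded = {
    "cities": {1: [0, 0], 2: [4, 0]},
    "highways": [[1, 2], [5, 4]],
    "start": 1,
    "end": 4,
}
converted = leaf_lists_to_tuples(loaded)
print(converted)
```

The helper builds a new structure instead of mutating the loaded one, so the original data stays available if lists are needed elsewhere.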

Upvotes: 37

Karl I.

Reputation: 141

It might not be safe in some situations, but an easy fix for me was to store lists of tuples as their string representation. When reading back, convert the string to a list of tuples using eval().

atoms = [(7, 13, 14, 15)]  # list of tuples
ddict = {}                 # dict that gets dumped to YAML
grp = "atoms"              # example key

# when creating dict for YAML dump
ddict[grp] = str(atoms)  # convert list of tuples to string

# then after reading the YAML file
ddict[grp] = list(eval(ddict[grp]))  # list() for slight safety
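
A possibly safer variant of the same idea, as a sketch: the standard library's ast.literal_eval parses only Python literals (numbers, strings, tuples, lists, dicts and so on) and raises an error for anything else, so a string read from an untrusted file cannot execute code. The variable names below are illustrative only.

```python
import ast

atoms = [(7, 13, 14, 15)]  # list of tuples

# What would be written to the YAML file: a plain string
stored = str(atoms)

# What would be done after reading the string back
restored = ast.literal_eval(stored)
print(restored)
```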

Upvotes: 1

Anant

Reputation: 424

This worked for me -

config.yaml

cities:
    1: !!python/tuple [0,0]
    2: !!python/tuple [4,0]
    3: !!python/tuple [0,4]
    4: !!python/tuple [4,4]
    5: !!python/tuple [2,2]
    6: !!python/tuple [6,2]
highways:
    - !!python/tuple [1,2]
    - !!python/tuple [1,3]
    - !!python/tuple [1,5]
    - !!python/tuple [2,4]
    - !!python/tuple [3,4]
    - !!python/tuple [5,4]
start: 1
end: 4

main.py

import yaml

def tuple_constructor(loader, node):
    # Load the sequence of values from the YAML node
    values = loader.construct_sequence(node)
    # Return a tuple constructed from the sequence
    return tuple(values)

# Register the constructor with PyYAML
yaml.SafeLoader.add_constructor('tag:yaml.org,2002:python/tuple',
                                tuple_constructor)

# Load the YAML file
with open('config.yaml', 'r') as f:
    data = yaml.load(f, Loader=yaml.SafeLoader)

print(data)

Output:

{'cities': {1: (0, 0), 2: (4, 0), 3: (0, 4), 4: (4, 4), 5: (2, 2), 6: (6, 2)},
'highways': [(1, 2), (1, 3), (1, 5), (2, 4), (3, 4), (5, 4)], 
'start': 1, 
'end': 4}

Upvotes: 1

DanielBell99

Reputation: 1957

You treat a tuple as a list.

params.yaml

foo:
  bar: ["a", "b", "c"]

Source

Upvotes: 0

Olivier

Reputation: 41

I ran into the same problem as in the question and was not too satisfied with the two existing answers. While browsing the PyYAML documentation I found two really interesting functions: yaml.add_constructor and yaml.add_implicit_resolver.

The implicit resolver removes the need to tag every entry with !!python/tuple: it matches the scalar strings against a regex instead. I also wanted to use tuple syntax, that is, to write tuple: (10,120) rather than a list tuple: [10,120] that is converted to a tuple afterwards, which I personally found very annoying. I also did not want to install an external library. Here is the code:

import yaml
import re

# this is to convert the string written as a tuple into a python tuple
def yml_tuple_constructor(loader, node):
    # this little parser is really just for what I needed, feel free to change it!
    def parse_tup_el(el):
        # try to convert into int or float, else keep the string
        if el.isdigit():
            return int(el)
        try:
            return float(el)
        except ValueError:
            return el

    value = loader.construct_scalar(node)
    # remove the ( ) from the string
    tup_elements = value[1:-1].split(',')
    # remove the last element if the tuple was written as (x,b,)
    if tup_elements[-1] == '':
        tup_elements.pop(-1)
    tup = tuple(map(parse_tup_el, tup_elements))
    return tup

# !tuple is my own tag name, I think you could choose anything you want
yaml.add_constructor(u'!tuple', yml_tuple_constructor)
# this is to spot the strings written as tuple in the yaml
yaml.add_implicit_resolver(u'!tuple', re.compile(r"\(([^,\W]{,},){,}[^,\W]*\)"))

Finally by executing this:

>>> yml = yaml.load("""
   ...: cities:
   ...:   1: (0,0)
   ...:   2: (4,0)
   ...:   3: (0,4)
   ...:   4: (4,4)
   ...:   5: (2,2)
   ...:   6: (6,2)
   ...: highways:
   ...:   - (1,2)
   ...:   - (1,3)
   ...:   - (1,5)
   ...:   - (2,4)
   ...:   - (3,4)
   ...:   - (5,4)
   ...: start: 1
   ...: end: 4""", Loader=yaml.FullLoader)
>>> yml['cities']
{1: (0, 0), 2: (4, 0), 3: (0, 4), 4: (4, 4), 5: (2, 2), 6: (6, 2)}
>>> yml['highways']
[(1, 2), (1, 3), (1, 5), (2, 4), (3, 4), (5, 4)]

There could be a potential drawback with safe_load compared to load, which I did not test.
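
On the safe_load question: when no Loader argument is passed, yaml.add_constructor and yaml.add_implicit_resolver (PyYAML 5.1+) register on the non-safe loaders only, so yaml.safe_load would return the plain string "(0,0)". One way around that, sketched below under my own assumptions, is to register both on a SafeLoader subclass. TupleLoader, TUPLE_PATTERN and construct_int_tuple are hypothetical names, and the pattern is deliberately narrower than the one above: it accepts integers only.

```python
import re
import yaml


class TupleLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass; the global SafeLoader stays untouched."""


# Matches scalars such as (0,0), ( 1, -2 ) or (3,4,): integers only
TUPLE_PATTERN = re.compile(r"^\(\s*-?\d+(\s*,\s*-?\d+)*\s*,?\s*\)$")


def construct_int_tuple(loader, node):
    text = loader.construct_scalar(node)
    # Drop the parentheses, split on commas, skip the empty trailing part
    return tuple(int(part) for part in text.strip("()").split(",") if part.strip())


TupleLoader.add_constructor("!tuple", construct_int_tuple)
# first=["("] limits the regex check to scalars that start with "("
TupleLoader.add_implicit_resolver("!tuple", TUPLE_PATTERN, first=["("])

doc = """
cities:
  1: (0,0)
  2: (4,0)
highways:
  - (1,2)
  - (5,4)
start: 1
"""

data = yaml.load(doc, Loader=TupleLoader)
print(data)
```

Because the registration lives on the subclass, other code that calls yaml.safe_load keeps seeing ordinary strings.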

Upvotes: 4

Anthon

Reputation: 76812

Depending on where your YAML input comes from, your "hack" is a good solution, especially if you use yaml.safe_load() instead of the unsafe yaml.load(). If only the "leaf" sequences in your YAML file need to be tuples, you can do ¹:

import pprint
import ruamel.yaml
from ruamel.yaml.constructor import SafeConstructor


def construct_yaml_tuple(self, node):
    seq = self.construct_sequence(node)
    # only make "leaf sequences" into tuples, you can add dict 
    # and other types as necessary
    if seq and isinstance(seq[0], (list, tuple)):
        return seq
    return tuple(seq)

SafeConstructor.add_constructor(
    u'tag:yaml.org,2002:seq',
    construct_yaml_tuple)

with open('input.yaml') as fp:
    data = ruamel.yaml.safe_load(fp)
pprint.pprint(data, width=24)

which prints:

{'cities': {1: (0, 0),
            2: (4, 0),
            3: (0, 4),
            4: (4, 4),
            5: (2, 2),
            6: (6, 2)},
 'end': 4,
 'highways': [(1, 2),
              (1, 3),
              (1, 5),
              (2, 4),
              (3, 4),
              (5, 4)],
 'start': 1}

If you then need to process more material where sequences need to be "normal" lists again, use:

SafeConstructor.add_constructor(
    u'tag:yaml.org,2002:seq',
    SafeConstructor.construct_yaml_seq)

¹ This was done using ruamel.yaml, a YAML 1.2 parser, of which I am the author. You should be able to do the same with the older PyYAML if you only ever need to support YAML 1.1 and/or cannot upgrade for some reason.
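
For readers who stay on PyYAML, the footnote's suggestion might look roughly like the sketch below. It is an assumption-laden adaptation, not code from ruamel.yaml: TupleSafeLoader and construct_leaf_tuple are hypothetical names, and the constructor is registered on a SafeLoader subclass so that ordinary yaml.safe_load calls keep returning lists.

```python
import yaml


class TupleSafeLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass that builds leaf sequences as tuples."""


def construct_leaf_tuple(loader, node):
    # deep=True makes sure nested items are fully built before inspection
    seq = loader.construct_sequence(node, deep=True)
    # Only "leaf" sequences become tuples; sequences of containers stay lists
    if any(isinstance(item, (list, tuple, dict)) for item in seq):
        return seq
    return tuple(seq)


# add_constructor on the subclass leaves yaml.SafeLoader itself unchanged
TupleSafeLoader.add_constructor("tag:yaml.org,2002:seq", construct_leaf_tuple)

doc = """
cities:
  1: [0,0]
  2: [4,0]
highways:
  - [1,2]
  - [5,4]
start: 1
end: 4
"""

data = yaml.load(doc, Loader=TupleSafeLoader)
print(data)
```

Using a subclass also removes the need to restore the default sequence constructor afterwards.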

Upvotes: 7
