arodin

Reputation: 83

Pydantic - parse a list of objects from YAML configuration file

I want to read a list of objects from a YAML file:

- entry1:
   attribute: "Test1"
   amount: 1
   price: 123.45
- entry2:
   attribute: "Test1"
   amount: 10
   price: 56.78

For this data structure I created three nested models as follows:

# Models
class EntryValues(BaseModel):
    attribute: str
    amount: int
    price: float

class Entry(BaseModel):
    entry1: EntryValues
    entry2: EntryValues

class Config(BaseModel):
    __root__: list[Entry]

My code to read the YAML config file looks as follows:

# get YAML config path
def get_cfg_path() -> Path:
    return CWD

# read YAML file
def read_cfg(file_name: str, file_path: Path = None) -> YAML:
    if not file_path:
        file_path = get_cfg_path()

    if file_path:
        try:
            file = open(file_path / file_name, "r")
        except Exception as e:
            print(f"open file {file_name} failed", e)
            sys.exit(1)
        else:
            return load(file.read())
    else:
        raise Exception(f"Config file {file_name} not found!")

Now I want to unpack the values of the YAML into my model. For that I tried to unpack the values with the ** operator. I think I'm missing one more loop here, but I cannot get it to work.

# Unpack and create config file
def create_cfg(file_name: str = None) -> Config:
    config_file = read_cfg(file_name=file_name)
    _config = Config(**config_file.data)
    return _config

I would appreciate any help.

Update

So I played around with my model structure a bit without using the YAML file. I don't quite get why the following throws a ValidationError.

Consider the following list of entries (that's the same data structure I would receive from my YAML file):

entries = [
    {'entry1': {'attribute': 'Test1', 'amount': 1, 'price': 123.45}}, 
    {'entry2': {'attribute': 'Test2', 'amount': 10, 'price': 56.78}}
]

If I run the following simple loop, Pydantic throws a ValidationError:

for entry in entries:
    Entry(**entry)

Error:

ValidationError: 1 validation error for Entry
entry2
  field required (type=value_error.missing)

However, if the list only contains one entry dictionary, then it works:

class Entry(BaseModel):
    entry1: EntryValues
    #entry2: EntryValues

entries = [
    {'entry1': {'attribute': 'Test1', 'amount': 1, 'price': 123.45}}
]

for entry in entries:
    Entry(**entry)

Can someone explain this, or what I'm doing wrong here?

Upvotes: 2

Views: 16752

Answers (3)

JGC

Reputation: 6363

You might consider using the pydantic-yaml package instead (pip install pydantic-yaml).

I refactored the model since it looks like the structure is a list of one-key dictionaries.

from typing import Dict, List
from pydantic import BaseModel, RootModel
from pydantic_yaml import parse_yaml_file_as


class Entry(BaseModel):
    attribute: str
    amount: int
    price: float

class Config(RootModel):
    root: List[Dict[str, Entry]]

m = parse_yaml_file_as(Config, "demo.yml")
print(m)

The output, in this instance, would be:

root=[{'entry1': Entry(attribute='Test1', amount=1, price=123.45)}, {'entry2': Entry(attribute='Test1', amount=10, price=56.78)}]
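
If you would rather not add another dependency, the same model can probably be validated from data that has already been parsed. The following is only a sketch, assuming Pydantic v2 (where RootModel and model_validate are available); the inline data list stands in for whatever your YAML loader returns, and by_name is a hypothetical helper name for a flattened lookup.

```python
from typing import Dict, List

from pydantic import BaseModel, RootModel


class Entry(BaseModel):
    attribute: str
    amount: int
    price: float


class Config(RootModel):
    root: List[Dict[str, Entry]]


# Stand-in for the parsed YAML: a list of one-key dictionaries
data = [
    {"entry1": {"attribute": "Test1", "amount": 1, "price": 123.45}},
    {"entry2": {"attribute": "Test1", "amount": 10, "price": 56.78}},
]

# For a RootModel, model_validate takes the root value itself
config = Config.model_validate(data)

# Flatten the list of one-key dicts into a single name -> Entry lookup
by_name = {name: entry for item in config.root for name, entry in item.items()}
```

After this, by_name["entry1"].amount gives the validated integer, which is often more convenient than indexing into the list.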

Upvotes: 0

Dan Davis

Reputation: 1

Since you are trying to parse configuration, you might also consider using the pydantic-settings module instead of just pydantic. However, if all of your configuration comes from YAML, this doesn't make a lot of sense.

Upvotes: 0

Josh Friedlander

Reputation: 11657

In your update, the second case works and the first does not because the unpacking operator (**) passes the keys of a single dictionary as keyword arguments, so that one dictionary must contain every field the model requires. In the second case, Entry declares only entry1 and the dictionary supplies it. In the first case, Entry requires both entry1 and entry2, but they are spread across two dicts that are unpacked in separate Entry(...) calls, so each call is missing one required field. One possible workaround would be to merge them into a single dictionary. But as far as I understand, a better solution would be to change your YAML to provide a single mapping in the first place, by deleting the first two characters in each line:

entry1:
 attribute: "Test1"
 amount: 1
 price: 123.45
entry2:
 attribute: "Test1"
 amount: 10
 price: 56.78

and then:

_config = Config(__root__=[Entry(**entries)])
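
If you cannot change the YAML, the merge workaround mentioned above might look roughly like this. It is a minimal sketch that reuses the models from the question; the merged variable is a name introduced here for illustration, and the inline list stands in for the parsed file.

```python
from pydantic import BaseModel


class EntryValues(BaseModel):
    attribute: str
    amount: int
    price: float


class Entry(BaseModel):
    entry1: EntryValues
    entry2: EntryValues


# Same shape as the list parsed from the original YAML file
entries = [
    {"entry1": {"attribute": "Test1", "amount": 1, "price": 123.45}},
    {"entry2": {"attribute": "Test2", "amount": 10, "price": 56.78}},
]

# Merge the one-key dictionaries into a single mapping.
# If two items shared a key, the later one would overwrite the earlier one.
merged = {}
for item in entries:
    merged.update(item)

# Now a single dictionary carries both required fields
entry = Entry(**merged)
```

This only makes sense while the keys are unique and known in advance, since Entry hard-codes entry1 and entry2 as field names.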

Original answer:

There are a number of issues with your code, but I think what you're trying to do is parse the YAML into a dictionary and instantiate an EntryValues from each item. That would look something like this:

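from pathlib import Path
from typing import List

import yaml
from pydantic import BaseModel


class EntryValues(BaseModel):
    attribute: str
    amount: int
    price: float


def create_cfg(file_name: str) -> List[EntryValues]:
    # Read the file with PyYAML directly; yaml.safe_load accepts an open file
    with open(Path.cwd() / file_name) as f:
        entries = yaml.safe_load(f)  # a list of one-key dictionaries
    # Pair each item with its expected key and validate the inner mapping
    return [
        EntryValues(**di[name]) for di, name in zip(entries, ["entry1", "entry2"])
    ]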

Upvotes: 1
