Andrey Semakin

Reputation: 2755

Validate list of Schemas with constraints for list length using marshmallow

For example, I want to check that the data is a list of well-formed dicts and that the list has a length between 1 and 10.

from marshmallow import Schema, fields

class Record(Schema): 
    id = fields.Integer(required=True)
    # more fields here, let's omit them

schema = Record(many=True)
# somehow define that we have constraint on list length
# list length should be between 1 and 10 inclusive

# validation should fail
errors = schema.validate([])
assert errors  # length < 1
errors = schema.validate([{"id": i} for i in range(100)])
assert errors  # length > 10

# validation should succeed
errors = schema.validate([{"id": i} for i in range(5)])
assert not errors

Is it possible to define such constraints using marshmallow?


I need something like this, but I would like to avoid the additional level of nesting in the data:

from marshmallow.validate import Length

class BatchOfRecords(Schema):
    records = fields.Nested(
        Record,
        required=True,
        many=True,
        validate=Length(1, 10)
    )

UPD:

So to clarify the question, I would like to validate a list of dicts:

[
    {"id": 1},
    {"id": 2},
    ...
]

And not a dict with a key containing a list of dicts:

# it works but it introduces extra level of nesting,
# I want to avoid it
{
    "records": [
        {"id": 1},
        {"id": 2},
        ...
    ]
}

Upvotes: 3

Views: 7724

Answers (2)

l33tHax0r

Reputation: 1721

EDIT

So it is possible to validate collections using just marshmallow. You can pass the pass_many kwarg to the pre_load or post_load decorators. I did not have success with pre_load, but got it to work with post_load. With pass_many=True the hook receives the whole collection, so you can check its length after it has been loaded. I check the many argument so that the length is only checked when a collection of records is passed, rather than a single record.

from marshmallow import Schema, fields, ValidationError, post_load


class Record(Schema):
    id = fields.Integer(required=True)
    name = fields.String(required=True)
    status = fields.String(required=True)

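    @post_load(pass_many=True)
    def check_length(self, data, many, **kwargs):
        # Only check the length when a collection of records is loaded.
        if many:
            if len(data) < 1 or len(data) > 10:
                raise ValidationError(message=['Record length should be greater than 1 and less than 10.'],
                                      field_name='record')
        # A post_load hook must return the data, otherwise load() returns None.
        return data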

EDIT TEST CASES

from unittest import TestCase

from marshmallow import ValidationError

from stack_marshmallow import Record


class TestStackSchemasNonNested(TestCase):

    def test_empty_list(self):
        with self.assertRaises(ValidationError) as exc:
            Record(many=True).load([])
        self.assertEqual(exc.exception.messages['record'], ['Record length should be greater than 1 and less than 10.'])

    def test_happy_path(self):
        user_data = [{"id": "1", "name": "apple", "status": "OK"}, {"id": "2", "name": "apple", "status": 'OK'}]
        data = Record(many=True).load(user_data)
        self.assertEqual(len(data), 2)

    def test_invalid_values_with_valid_values(self):
        user_data = [{"id": "1", "name": "apple", "status": 'OK'}, {"id": "2"}]
        with self.assertRaises(ValidationError) as exc:
            Record(many=True).load(user_data)
        self.assertEqual(exc.exception.messages[1]['name'], ['Missing data for required field.'])
        self.assertEqual(exc.exception.messages[1]['status'], ['Missing data for required field.'])

    def test_too_many(self):
        user_data = [{"id": "1", "name": "apple", "status": "OK"},
                     {"id": "2", "name": "apple", "status": 'OK'},
                     {"id": "3", "name": "apple", "status": 'OK'},
                     {"id": "4", "name": "apple", "status": 'OK'},
                     {"id": "5", "name": "apple", "status": 'OK'},
                     {"id": "6", "name": "apple", "status": 'OK'},
                     {"id": "7", "name": "apple", "status": 'OK'},
                     {"id": "8", "name": "apple", "status": 'OK'},
                     {"id": "9", "name": "apple", "status": 'OK'},
                     {"id": "10", "name": "apple", "status": 'OK'},
                     {"id": "11", "name": "apple", "status": 'OK'},
                     ]
        with self.assertRaises(ValidationError) as exc:
            Record(many=True).load(user_data)
        self.assertEqual(exc.exception.messages['record'], ['Record length should be greater than 1 and less than 10.'])

EDIT SOURCES: https://marshmallow.readthedocs.io/en/stable/extending.html

You are very close. I added a little more complexity to Record because I don't think you will have just one field; otherwise I would simply use a List of Integers. I added some unit tests as well so you can see how to test it.

from marshmallow import Schema, fields, validate


class Record(Schema):
    id = fields.Integer(required=True)
    name = fields.String(required=True)
    status = fields.String(required=True)


class Records(Schema):
    records = fields.List(
        fields.Nested(Record),
        required=True,
        validate=validate.Length(min=1, max=10)
    )

TEST CASES

from unittest import TestCase

from marshmallow import ValidationError

from stack_marshmallow import Records


class TestStackSchemas(TestCase):

    def setUp(self):
        self.schema = Records()

    def test_empty_dict(self):
        with self.assertRaises(ValidationError) as exc:
            self.schema.load({})
        self.assertEqual(exc.exception.messages['records'], ['Missing data for required field.'])

    def test_empty_list_in_dict(self):
        with self.assertRaises(ValidationError) as exc:
            self.schema.load({"records": []})
        self.assertEqual(exc.exception.messages['records'], ['Length must be between 1 and 10.'])

    def test_missing_fields_in_single_record(self):
        with self.assertRaises(ValidationError) as exc:
            self.schema.load({"records": [{"id": 1}]})
        self.assertEqual(exc.exception.messages['records'][0]['name'], ['Missing data for required field.'])
        self.assertEqual(exc.exception.messages['records'][0]['status'], ['Missing data for required field.'])

    def test_list_too_long_and_invalid_records(self):
        with self.assertRaises(ValidationError) as exc:
            self.schema.load({"records":
                                  [{"id": 1, "name": "stack", "status": "overflow"},
                                   {"id": 2, "name": "stack", "status": "overflow"},
                                   {"id": 3, "name": "stack", "status": "overflow"},
                                   {"id": 4, "name": "stack", "status": "overflow"},
                                   {"id": 5, "name": "stack", "status": "overflow"},
                                   {"id": 6, "name": "stack", "status": "overflow"},
                                   {"id": 7, "name": "stack", "status": "overflow"},
                                   {"id": 8, "name": "stack", "status": "overflow"},
                                   {"id": 9, "name": "stack", "status": "overflow"},
                                   {"id": 10, "name": "stack", "status": "overflow"},
                                   {"id": 11, "name": "stack", "status": "overflow"}]})
        self.assertEqual(exc.exception.messages['records'], ['Length must be between 1 and 10.'])

Sources: https://marshmallow.readthedocs.io/en/stable/nesting.html and https://marshmallow.readthedocs.io/en/stable/examples.html

Upvotes: 4

Andrey Semakin

Reputation: 2755

What I want can be achieved with this small library: https://github.com/and-semakin/marshmallow-toplevel.

pip install marshmallow-toplevel

Usage:

from marshmallow import fields
from marshmallow.validate import Length
from marshmallow_toplevel import TopLevelSchema

class BatchOfRecords(TopLevelSchema):
    _toplevel = fields.Nested(
        Record,
        required=True,
        many=True,
        validate=Length(1, 10)
    )

Upvotes: 0
