Micheal J. Roberts

Reputation: 4170

Inserting missing records into an array of objects governed by a list of dates

I have a list of dates:

import datetime
from uuid import UUID

dates = [
    datetime.date(2019, 7, 30),
    datetime.date(2019, 7, 31),
    datetime.date(2019, 8, 1), 
    datetime.date(2019, 8, 2),
]

I have a list of survey result objects, each of which has a 'date' key:

survey_results = [
    {
        'survey': UUID('19934780-d860-497f-87e3-fc53a9490960'), 
        'date': datetime.date(2019, 7, 31), 
        'total_score': 16054, 
        'participants': 499, 
        'average_score': 32.17,
    },
    {
        'survey': UUID('19934780-d860-497f-87e3-fc53a9490960'), 
        'date': datetime.date(2019, 8, 1), 
        'total_score': 17894, 
        'participants': 553, 
        'average_score': 32.36,
    },
]

I'd like to check whether survey_results is missing a survey_result record for any of the dates in dates.

If a record already exists for a date, it should be left alone. If not, I would like to insert a survey_result record for the missing date in the correct position, "carrying over" the values from the previous date's record. However, if there is no earlier record to "carry over", I would need some sort of blank entry, leaving me with...

Expected output:

survey_results = [
    {
        'survey': UUID('19934780-d860-497f-87e3-fc53a9490960'), 
        'date': datetime.date(2019, 7, 30), 
        'total_score': 0, 
        'participants': 0, 
        'average_score': 0,
    },
    {
        'survey': UUID('19934780-d860-497f-87e3-fc53a9490960'), 
        'date': datetime.date(2019, 7, 31), 
        'total_score': 16054, 
        'participants': 499, 
        'average_score': 32.17,
    },
    {
        'survey': UUID('19934780-d860-497f-87e3-fc53a9490960'), 
        'date': datetime.date(2019, 8, 1), 
        'total_score': 17894, 
        'participants': 553, 
        'average_score': 32.36,
    },
    {
        'survey': UUID('19934780-d860-497f-87e3-fc53a9490960'), 
        'date': datetime.date(2019, 8, 2), 
        'total_score': 17894, 
        'participants': 553, 
        'average_score': 32.36,
    },
]

...for the dates list given above.

I've tried nested loops over both lists, but the matching is completely off. Maybe some sort of reducing or mapping is needed. I am completely lost, though, so most of what I have tried is junk.

The best I have so far is this:

# Dates that already have a survey result:
existing_dates = [record['date'] for record in survey_results]

# First existing date, to decide whether a missing record goes before or after it:
zeroth = existing_dates[0]

print('Zeroth: {}'.format(zeroth))

for date in dates:
    if date not in existing_dates:
        # This date corresponds to a missing record:
        print(date)

Upvotes: 0

Views: 31

Answers (1)

Andrej Kesely

Reputation: 195438

One possible solution uses heapq.merge from the standard library:

import datetime
from uuid import UUID

dates = [
    datetime.date(2019, 7, 30),
    datetime.date(2019, 7, 31),
    datetime.date(2019, 8, 1),
    datetime.date(2019, 8, 2),
]

survey_results = [
    {
        'survey': UUID('19934780-d860-497f-87e3-fc53a9490960'),
        'date': datetime.date(2019, 7, 31),
        'total_score': 16054,
        'participants': 499,
        'average_score': 32.17,
    },
    {
        'survey': UUID('19934780-d860-497f-87e3-fc53a9490960'),
        'date': datetime.date(2019, 8, 1),
        'total_score': 17894,
        'participants': 553,
        'average_score': 32.36,
    },
]

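from heapq import merge
from pprint import pprint

# Blank placeholder, used for any dates that precede the first real record.
last_record = {
    'survey': UUID('19934780-d860-497f-87e3-fc53a9490960'),
    'date': None,
    'total_score': 0,
    'participants': 0,
    'average_score': 0,
}
out = []

# Merge both sorted sequences into one stream ordered by date.
# Each item is either a record dict or a bare date.
for item in merge(
    sorted(survey_results, key=lambda r: r['date']),
    sorted(dates),
    key=lambda x: x['date'] if isinstance(x, dict) else x,
):
    if isinstance(item, dict):
        # A real record: keep a copy and remember it for carrying over.
        last_record = item.copy()
    else:
        if item == last_record['date']:
            # This date already has a record, appended on the previous step.
            continue
        # Missing date: carry the previous record over under the new date.
        last_record = last_record.copy()
        last_record['date'] = item

    out.append(last_record)

pprint(out)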

Prints:

[{'average_score': 0,
  'date': datetime.date(2019, 7, 30),
  'participants': 0,
  'survey': UUID('19934780-d860-497f-87e3-fc53a9490960'),
  'total_score': 0},
 {'average_score': 32.17,
  'date': datetime.date(2019, 7, 31),
  'participants': 499,
  'survey': UUID('19934780-d860-497f-87e3-fc53a9490960'),
  'total_score': 16054},
 {'average_score': 32.36,
  'date': datetime.date(2019, 8, 1),
  'participants': 553,
  'survey': UUID('19934780-d860-497f-87e3-fc53a9490960'),
  'total_score': 17894},
 {'average_score': 32.36,
  'date': datetime.date(2019, 8, 2),
  'participants': 553,
  'survey': UUID('19934780-d860-497f-87e3-fc53a9490960'),
  'total_score': 17894}]
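
For comparison, the same carry-over can be sketched without heapq, using a plain dictionary lookup keyed on date. This is only a minimal sketch, not part of the solution above: the names fill_missing, by_date and survey_id are hypothetical, and it assumes at most one record per date. Unlike the merge version, it drops any record whose date is not in dates.

```python
import datetime
from uuid import UUID


def fill_missing(dates, survey_results, survey_id):
    """Return one record per date, carrying the previous record forward.

    Hypothetical helper for illustration; assumes one record per date.
    """
    # Index the existing records by date for direct lookup.
    by_date = {record['date']: record for record in survey_results}

    # Blank placeholder used until the first real record is seen.
    previous = {
        'survey': survey_id,
        'date': None,
        'total_score': 0,
        'participants': 0,
        'average_score': 0,
    }

    filled = []
    for day in sorted(dates):
        # Take the real record if present, else the previous one,
        # and build a new dict stamped with the current date.
        previous = dict(by_date.get(day, previous), date=day)
        filled.append(previous)
    return filled


survey_id = UUID('19934780-d860-497f-87e3-fc53a9490960')
dates = [
    datetime.date(2019, 7, 30),
    datetime.date(2019, 7, 31),
    datetime.date(2019, 8, 1),
    datetime.date(2019, 8, 2),
]
survey_results = [
    {'survey': survey_id, 'date': datetime.date(2019, 7, 31),
     'total_score': 16054, 'participants': 499, 'average_score': 32.17},
    {'survey': survey_id, 'date': datetime.date(2019, 8, 1),
     'total_score': 17894, 'participants': 553, 'average_score': 32.36},
]

filled = fill_missing(dates, survey_results, survey_id)
```

Each element of filled is a new dict, so the original survey_results entries are not modified.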

Upvotes: 1
