Pixxel
Pixxel

Reputation: 71

How can I convert a list of dictionaries and nested lists into a pandas Dataframe for use in a markdown table generator?

I have the following list of dictionaries (with nested lists within):

compatibility = [
  {'region': 'NTSC-U',
   'stats': [
     {'os': 'Windows', 'state': 'Playable'}, 
     {'os': 'Linux', 'state': 'Playable'},
     {'os': 'Mac', 'state': 'N/A'}
   ]}, 
   {'region': 'PAL',
    'stats': [
      {'os': 'Windows', 'state': 'Playable'},
      {'os': 'Linux', 'state': 'N/A'},
       {'os': 'Mac', 'state': 'N/A'}
    ]}, 
    {'region': 'NTSC-J',
     'stats': [
       {'os': 'Windows', 'state': 'N/A'}, 
       {'os': 'Linux', 'state': 'N/A'},
       {'os': 'Mac', 'state': 'N/A'}
   ]}
]

I'm trying to use pytablewriter and pandas to generate a markdown table for this data but I can't seem to wrap my head around dealing with the unique values and column ordering for the dataframe.

Here's my function with an example of how this output would look:

def generate_table():
    compatibility = find_compatibility()
    writer = MarkdownTableWriter()
    writer.from_dataframe(
        pd.DataFrame({'NTSC-U': ['Playable', 'Playable', 'N/A'], 'PAL': ['Playable', 'N/A', 'N/A'], 
                      'NTSC-J': ['N/A', 'N/A', 'N/A']}, index=['Windows', 'Linux', 'Mac']),
        add_index_column=True,
    )
    writer.write_table()

How would I go about generating this? I've gotten as far as thinking about sets to pull the unique OS names but that doesn't seem to work with a list of dicts with nested lists. (And since the data structure sometimes only contains one region, it would need to be dynamic).

Any help is much appreciated!

Upvotes: 0

Views: 1102

Answers (1)

run-out
run-out

Reputation: 3184

You can start by normalizing your json, selecting 'stats' as the record path.

df = pd.json_normalize(compatibility, "stats", ["region"])
print(df)
        os     state  region
0  Windows  Playable  NTSC-U
1    Linux  Playable  NTSC-U
2      Mac       N/A  NTSC-U
3  Windows  Playable     PAL
4    Linux       N/A     PAL
5      Mac       N/A     PAL
6  Windows       N/A  NTSC-J
7    Linux       N/A  NTSC-J
8      Mac       N/A  NTSC-J

Then you just use pivot to get the desired output.

df = df.pivot("os", "region", values="state")
print(df)
region  NTSC-J    NTSC-U       PAL
os                                
Linux      N/A  Playable       N/A
Mac        N/A       N/A       N/A
Windows    N/A  Playable  Playable

Upvotes: 1

Related Questions