Reputation: 9
For this assignment I created a dictionary whose keys are state names (e.g. Alabama, Alaska, Arizona) and whose values are lists of regions for each state. The problem is that the lists have different lengths, since each state can have a different number of regions associated with it.
Example:
'Alabama': ['Auburn',
            'Florence',
            'Jacksonville',
            'Livingston',
            'Montevallo',
            'Troy',
            'Tuscaloosa',
            'Tuskegee'],
'Alaska': ['Fairbanks'],
'Arizona': ['Flagstaff', 'Tempe', 'Tucson'],
How can I load this into a pandas DataFrame? What I want is basically two columns, "State" and "Region", similar to what you would get from a "GroupBy" on state for the regions.
Upvotes: 1
Views: 80
Reputation: 99
You can also do this by flattening the dictionary into two parallel lists, although that is a slightly longer approach. For example:
import pandas as pd

Example = {'Alabama': ['Auburn', 'Florence', 'Jacksonville', 'Livingston', 'Montevallo', 'Troy', 'Tuscaloosa', 'Tuskegee'],
           'Alaska': ['Fairbanks'],
           'Arizona': ['Flagstaff', 'Tempe', 'Tucson']}
new_list_of_keys = []
new_list_of_values = []
keys = list(Example.keys())
values = list(Example.values())
# Repeat each state name once per region so both lists stay aligned
for i in range(len(keys)):
    for j in range(len(values[i])):
        new_list_of_values.append(values[i][j])
        new_list_of_keys.append(keys[i])
df = pd.DataFrame(zip(new_list_of_keys, new_list_of_values), columns=['State', 'Region'])
This gives the following output:
      State        Region
0   Alabama        Auburn
1   Alabama      Florence
2   Alabama  Jacksonville
3   Alabama    Livingston
4   Alabama    Montevallo
5   Alabama          Troy
6   Alabama    Tuscaloosa
7   Alabama      Tuskegee
8    Alaska     Fairbanks
9   Arizona     Flagstaff
10  Arizona         Tempe
11  Arizona        Tucson
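The same flattening can probably be written more compactly with a single list comprehension over the dictionary's items. This is only a sketch of the idea above; the names `states` and `rows` are my own, not from the question:

```python
import pandas as pd

# Sample data from the question; `states` is an assumed variable name.
states = {
    'Alabama': ['Auburn', 'Florence', 'Jacksonville', 'Livingston',
                'Montevallo', 'Troy', 'Tuscaloosa', 'Tuskegee'],
    'Alaska': ['Fairbanks'],
    'Arizona': ['Flagstaff', 'Tempe', 'Tucson'],
}

# Build one (state, region) tuple per region, in dictionary order.
rows = [(state, region)
        for state, regions in states.items()
        for region in regions]

df = pd.DataFrame(rows, columns=['State', 'Region'])
print(df)
```

This avoids indexing by position and keeps each state paired with its own regions by construction.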
Upvotes: 0
Reputation: 150735
If you are on pandas 0.25+, you can use explode (here states is your dictionary):
pd.Series(states).explode()
Output:
Alabama          Auburn
Alabama        Florence
Alabama    Jacksonville
Alabama      Livingston
Alabama      Montevallo
Alabama            Troy
Alabama      Tuscaloosa
Alabama        Tuskegee
Alaska        Fairbanks
Arizona       Flagstaff
Arizona           Tempe
Arizona          Tucson
dtype: object
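Note that explode on a Series returns a Series with the state names in the index, not the two-column frame the question asks for. One way to get there, as a sketch, is to name the index and reset it. The column names "State" and "Region" are taken from the question, and `states` is again assumed to hold the dictionary:

```python
import pandas as pd

# Sample data from the question; `states` is an assumed variable name.
states = {
    'Alabama': ['Auburn', 'Florence', 'Jacksonville', 'Livingston',
                'Montevallo', 'Troy', 'Tuscaloosa', 'Tuskegee'],
    'Alaska': ['Fairbanks'],
    'Arizona': ['Flagstaff', 'Tempe', 'Tucson'],
}

df = (
    pd.Series(states)               # index = state, value = list of regions
      .explode()                    # one row per region (pandas 0.25+)
      .rename_axis('State')         # name the index so it becomes a column
      .reset_index(name='Region')   # move the index out, name the values
)
print(df)
```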
You can also use concat, which works with most pandas versions:
pd.concat(pd.DataFrame({'state':k, 'Region':v}) for k,v in states.items())
Output:
     state        Region
0  Alabama        Auburn
1  Alabama      Florence
2  Alabama  Jacksonville
3  Alabama    Livingston
4  Alabama    Montevallo
5  Alabama          Troy
6  Alabama    Tuscaloosa
7  Alabama      Tuskegee
0   Alaska     Fairbanks
0  Arizona     Flagstaff
1  Arizona         Tempe
2  Arizona        Tucson
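In the concat output the index restarts at 0 for every state, because each per-state frame keeps its own index. If you would rather have one continuous index, passing ignore_index=True to concat should do it. A small sketch, again assuming the dictionary is called `states` and using "State" as the column name to match the question:

```python
import pandas as pd

# Sample data from the question; `states` is an assumed variable name.
states = {
    'Alabama': ['Auburn', 'Florence', 'Jacksonville', 'Livingston',
                'Montevallo', 'Troy', 'Tuscaloosa', 'Tuskegee'],
    'Alaska': ['Fairbanks'],
    'Arizona': ['Flagstaff', 'Tempe', 'Tucson'],
}

# One small frame per state; the scalar state name is broadcast
# across that state's list of regions.
frames = [pd.DataFrame({'State': k, 'Region': v}) for k, v in states.items()]

# ignore_index=True discards the per-frame indexes and renumbers from 0.
df = pd.concat(frames, ignore_index=True)
print(df)
```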
Upvotes: 2