Python: Generate dictionary from pandas dataframe with rows as keys and columns as values

Question

I have a dataframe that looks like this:

     Curricula Course1 Course2 Course3 ... CourseN
0       q1      c1      c2      NaN        NaN
1       q2      c14     c21     c1         NaN
2       q3      c2      c14     NaN        NaN
...
M       qm      c7      c9      c21

The number of courses differs from one curriculum to the next.

What I need is a dictionary from this dataframe looking like this:

{'q1': 'c1', 'q1': 'c2', 'q2': 'c14', 'q2': 'c21', 'q2': 'c1' ... }

Where the row names are my keys and for each row, the dictionary is filled with all the 'Curricula': 'Course' information that is given, excluding 'NaN' values.

What I tried so far was setting the index to the 'Curricula' column, transposing the dataframe and using the to_dict('records') method, but this resulted in the following output:

in:

df.set_index('Curricula')
df_transposed = df.transpose()
Dic = df_transposed.to_dict('records')

out:

[{0: 'q1', 1: 'q2', 2: 'q3', ... }, {0: 'c1', 1: 'c14', 2: 'c2' ...} ... {0: NaN, 1: 'c1', 2: NaN}]

So here the integer column labels are used as keys instead of the 'Curricula' column values I wanted, and additionally the NaN values are not excluded.

Does anyone have an idea how to fix that?

Best regards, Jan

user3483203 · Accepted Answer

Setup

import numpy as np
import pandas as pd

df = pd.DataFrame({'Curricula': {0: 'q1', 1: 'q2', 2: 'q3'},
 'Course1': {0: 'c1', 1: 'c14', 2: 'c2'},
 'Course2': {0: 'c2', 1: 'c21', 2: 'c14'},
 'Course3': {0: np.nan, 1: 'c1', 2: np.nan}})

print(df)

  Curricula Course1 Course2 Course3
0        q1      c1      c2     NaN
1        q2     c14     c21      c1
2        q3      c2     c14     NaN

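You can't have duplicate keys in a dictionary, so the output you describe isn't possible as written. However, you can use agg along with set_index and stack to create a list for each unique key. stack reshapes the course columns into one long Series indexed by 'Curricula' (dropping the NaN entries by default in the pandas versions this was written for), and groupby(level=0) then collects the courses for each key: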

df.set_index('Curricula').stack().groupby(level=0).agg(list).to_dict()

{'q1': ['c1', 'c2'], 'q2': ['c14', 'c21', 'c1'], 'q3': ['c2', 'c14']}
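As a side note, the attempt in the question kept integer keys because set_index returns a new DataFrame rather than modifying df, so its result was discarded. The following is a minimal sketch of an alternative that assigns the result and drops missing values explicitly instead of relying on the default behaviour of stack, which may differ between pandas versions. The names indexed and courses are illustrative only and do not come from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Curricula': ['q1', 'q2', 'q3'],
    'Course1': ['c1', 'c14', 'c2'],
    'Course2': ['c2', 'c21', 'c14'],
    'Course3': [np.nan, 'c1', np.nan],
})

# set_index returns a new DataFrame, so the result has to be assigned.
indexed = df.set_index('Curricula')

# Build one list of non-missing courses per row, keyed by the index value.
courses = {key: row.dropna().tolist() for key, row in indexed.iterrows()}

print(courses)
```

Note that this sketch assumes every 'Curricula' value appears only once. If the same key occurred in several rows, the later row would overwrite the earlier one, whereas the groupby version above would merge them into a single list.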
