JanB

Reputation: 199

Python: Generate dictionary from pandas dataframe with rows as keys and columns as values

I have a dataframe that looks like this:

     Curricula Course1 Course2 Course3 ... CourseN
0       q1      c1        c2     NaN        NaN
1       q2      c14       c21    c1         NaN
2       q3      c2        c14    NaN        NaN
...
M       qm      c7        c9     c21

The number of courses differs from one curriculum to the next.

What I need is a dictionary from this dataframe looking like this:

{'q1': 'c1', 'q1': 'c2', 'q2': 'c14', 'q2': 'c21', 'q2': 'c1' ... }

The row names should be my keys, and for each row the dictionary should be filled with all the 'Curricula': 'Course' pairs that are given, excluding NaN values.

What I tried so far was setting the index to the 'Curricula' column, transposing the dataframe, and calling to_dict('records'), but this resulted in the following output:

in:

df.set_index('Curricula')
df_transposed = df.transpose()
Dic = df_transposed.to_dict('records')

out:

[{0: 'q1', 1: 'q2', 2: 'q3', ... }, {0: 'c1', 1: 'c14', 2: 'c2' ...} ... {0: NaN, 1: 'c1', 2: NaN}]

So here the integer column labels are used as keys instead of the 'Curricula' column values I wanted, and additionally the NaN values are not excluded.

Does anyone have an idea how to fix that?

Best regards, Jan

Upvotes: 1

Views: 1693

Answers (1)

user3483203

Reputation: 51165

Setup

import numpy as np
import pandas as pd

df = pd.DataFrame({'Curricula': {0: 'q1', 1: 'q2', 2: 'q3'},
 'Course1': {0: 'c1', 1: 'c14', 2: 'c2'},
 'Course2': {0: 'c2', 1: 'c21', 2: 'c14'},
 'Course3': {0: np.nan, 1: 'c1', 2: np.nan}})

print(df)

  Curricula Course1 Course2 Course3
0        q1      c1      c2     NaN
1        q2     c14     c21      c1
2        q3      c2     c14     NaN

You can't have duplicate keys in a dictionary. However, you can use set_index and stack, followed by groupby and agg, to build a list of courses for each unique key:

df.set_index('Curricula').stack().groupby(level=0).agg(list).to_dict()

{'q1': ['c1', 'c2'], 'q2': ['c14', 'c21', 'c1'], 'q3': ['c2', 'c14']}   
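
The one-liner above relies on stack discarding the NaN cells, and that behaviour depends on the pandas version and on stack's options. As a minimal sketch of an alternative (not the only way to do it), the NaN values can be dropped explicitly per row. The variable name courses is just an illustration, and the sample frame mirrors the setup above:

```python
import numpy as np
import pandas as pd

# Same sample data as in the setup, rebuilt here so the snippet runs on its own.
df = pd.DataFrame({
    'Curricula': ['q1', 'q2', 'q3'],
    'Course1': ['c1', 'c14', 'c2'],
    'Course2': ['c2', 'c21', 'c14'],
    'Course3': [np.nan, 'c1', np.nan],
})

# Use 'Curricula' as the index, then build one list per row,
# dropping the NaN cells explicitly instead of relying on stack().
courses = {
    curriculum: row.dropna().tolist()
    for curriculum, row in df.set_index('Curricula').iterrows()
}

print(courses)
# {'q1': ['c1', 'c2'], 'q2': ['c14', 'c21', 'c1'], 'q3': ['c2', 'c14']}
```

Note that this assumes the 'Curricula' values are unique. If the same curriculum appeared in two rows, the later row would overwrite the earlier one, whereas the groupby version would merge them into a single list.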

Upvotes: 1
