Convert pandas DataFrame to dictionary using repeating cell values as keys

Question

I have a DataFrame like this:

Sol Col1    v1  Col2   v2    Col3  v3   Col4    v4  
1   Y_1_1   0   Y_1_2   1   Y_1_3   0   Y_1_4   0       
2   Y_1_1   0   Y_1_2   1   Y_1_3   0   Y_1_4   0       
3   Y_1_1   0   Y_1_2   0   Y_1_3   0   Y_1_4   0       
4   Y_1_1   0   Y_1_2   0   Y_1_3   1   Y_1_4   0   
5   Y_1_1   0   Y_1_2   0   Y_1_3   1   Y_1_4   0   
6   Y_1_1   0   Y_1_2   0   Y_1_3   0   Y_1_4   0       
7   Y_1_1   0   Y_1_2   1   Y_1_3   0   Y_1_4   0       
8   Y_1_1   0   Y_1_2   0   Y_1_3   0   Y_1_4   1       
9   Y_1_1   0   Y_1_2   1   Y_1_3   0   Y_1_4   0

I would like to transform it into a dictionary like this:

dic = {1: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0},
       2: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0},
       ...}

I am wondering whether I should replace the headers of the columns v1, v2, v3, ... with the string labels (Y_1_1, Y_1_2, etc.) and simply delete the columns that hold the variable names (Col1, Col2, ...).

I've found some examples of converting a DataFrame to a dictionary but, unless I'm mistaken, none of them solves my problem.

Is there a pythonic way to do this transformation?

jezrael · Accepted Answer

If every row has the same values in columns Col1 to ColN, you can use:

# use the `Sol` column as the index
df = df.set_index('Sol')

# take the first row and shift it by one position, so that each value
# column (v1, v2, ...) maps to the label stored in the column before it
d = df.iloc[0].shift().to_dict()

# select every second column (v1, v2, ...), rename them and convert to dict
out = df.iloc[:, 1::2].rename(columns=d).to_dict('index')
print(out)

{1: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0},
 2: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0}, 
 3: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 0, 'Y_1_4': 0}, 
 4: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 1, 'Y_1_4': 0}, 
 5: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 1, 'Y_1_4': 0}, 
 6: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 0, 'Y_1_4': 0}, 
 7: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0}, 
 8: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 0, 'Y_1_4': 1}, 
 9: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0}}
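For anyone who wants to try the rename approach end to end, here is a minimal self-contained sketch. It rebuilds only the first three rows of the question's data, and the names `names`, `values` and `mapping` are my own, not from the answer. It also builds the column mapping explicitly with `zip` instead of `shift()`, and checks the assumption that each label column holds a single value before relying on it.

```python
import pandas as pd

# Sample data reconstructed from the first three rows of the question.
df = pd.DataFrame({
    'Sol': [1, 2, 3],
    'Col1': ['Y_1_1'] * 3, 'v1': [0, 0, 0],
    'Col2': ['Y_1_2'] * 3, 'v2': [1, 1, 0],
    'Col3': ['Y_1_3'] * 3, 'v3': [0, 0, 0],
    'Col4': ['Y_1_4'] * 3, 'v4': [0, 0, 0],
}).set_index('Sol')

names = df.iloc[:, ::2]    # label columns: Col1, Col2, ...
values = df.iloc[:, 1::2]  # value columns: v1, v2, ...

# Renaming is only valid if every label column contains one distinct value.
assert (names.nunique() == 1).all()

# Map each value column to the label found in the first row.
mapping = dict(zip(values.columns, names.iloc[0]))

out = values.rename(columns=mapping).to_dict('index')
print(out)
```

If the assertion fails, the labels differ between rows and the row-wise approach is the one to use.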

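If the values in columns Col1 to ColN can differ between rows, use a dictionary comprehension that zips the values at even positions (the labels) with the values at odd positions (the numbers):

# here `df` is the original DataFrame, with `Sol` still a regular column
d = {k: dict(zip(list(v.values())[::2], list(v.values())[1::2]))
     for k, v in df.set_index('Sol').to_dict('index').items()}
print(d)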

{1: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0}, 
 2: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0},
 3: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 0, 'Y_1_4': 0}, 
 4: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 1, 'Y_1_4': 0}, 
 5: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 1, 'Y_1_4': 0}, 
 6: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 0, 'Y_1_4': 0}, 
 7: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0}, 
 8: {'Y_1_1': 0, 'Y_1_2': 0, 'Y_1_3': 0, 'Y_1_4': 1}, 
 9: {'Y_1_1': 0, 'Y_1_2': 1, 'Y_1_3': 0, 'Y_1_4': 0}}
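The row-wise idea can also be sketched by splitting the frame into label and value columns first, which avoids slicing the dict values twice. The two-row data below is hypothetical (the question's sample has identical labels in every row), and the variable names are my own.

```python
import pandas as pd

# Hypothetical data in which the labels differ from row to row.
df = pd.DataFrame({
    'Sol': [1, 2],
    'Col1': ['Y_1_1', 'Y_2_1'], 'v1': [0, 1],
    'Col2': ['Y_1_2', 'Y_2_2'], 'v2': [1, 0],
}).set_index('Sol')

names = df.iloc[:, ::2]    # label columns: Col1, Col2, ...
values = df.iloc[:, 1::2]  # value columns: v1, v2, ...

# Pair each row's labels with that row's values.
# .tolist() converts the cells to plain Python objects.
d = {
    sol: dict(zip(row_names, row_values))
    for sol, row_names, row_values in zip(
        df.index, names.to_numpy().tolist(), values.to_numpy().tolist()
    )
}
print(d)
```

Like the comprehension above, this assumes the columns strictly alternate between a label column and its value column.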
