Reputation: 309

Iterating over multiIndex dataframe

I have a data frame as shown below: [image: dataframe]

I am having trouble iterating over the rows. For every row fetched, I want to return the keys whose values match. For example, in the second row (the 2016-08-31 00:00:01 entry), df1 and df3 both have the compass value 4.0, so I want to return the keys that share that compass value, which are df1 and df3 in this case. I have been iterating over the rows using

for index, row in df.iterrows():

Upvotes: 0

Answers (1)

Josh

Reputation: 2835

Update

Okay, now that I understand your question better, this should work for you. First, change the shape of your dataframe with

dfs = df.stack().swaplevel(axis=0)

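This moves the attribute level (compass, accel, gyro) from the columns into the row index, so each row is keyed by (attribute, original index) and the remaining columns are the keys (df1, df3, and so on).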
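As a rough sketch of that reshape, here is a small made-up frame. The sample values, the df2 key, and the second timestamp are assumptions for illustration only; the only thing taken from the question is that the columns are a two-level MultiIndex with the keys on top and the attributes below.

```python
import pandas as pd

# Hypothetical sample data: three keys (df1..df3), each with three attributes.
columns = pd.MultiIndex.from_product(
    [["df1", "df2", "df3"], ["compass", "accel", "gyro"]]
)
df = pd.DataFrame(
    [
        [4.0, 1.0, 7.0, 5.0, 2.0, 8.0, 4.0, 3.0, 7.0],
        [6.0, 1.5, 9.0, 6.0, 1.5, 2.0, 3.0, 0.5, 9.0],
    ],
    index=pd.to_datetime(["2016-08-31 00:00:01", "2016-08-31 00:00:02"]),
    columns=columns,
)

# stack() moves the innermost column level (the attribute) into the row index;
# swaplevel() then puts the attribute first, so rows are (attribute, timestamp).
dfs = df.stack().swaplevel(axis=0)
print(dfs)
```

Each row of dfs now holds one attribute at one timestamp, with one column per key, which is what makes the row-wise duplicate check possible. Depending on the pandas version, stack() may emit a FutureWarning about its newer implementation.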

Then you can iterate the rows like before and extract the information you want. I'm just using print statements for everything, but you can put this in some more appropriate data structure.

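for index, row in dfs.iterrows():
    # Flag every value in the row that appears more than once
    dup_filter = row.duplicated(keep=False)
    # Column labels (df1, df2, ...) of the flagged values
    dfss = row[dup_filter].index.values
    print("Attribute:", index[0])
    print("Index:", index[1])
    print("Matches:", dfss, "\n")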

which will print out something like

.....

Attribute: compass
Index: 5
Matches: ['df1' 'df3']

Attribute: gyro
Index: 5
Matches: ['df1' 'df3']

Attribute: accel
Index: 6
Matches: ['df1' 'df3']

....
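If printing is not what you want, the matches could instead be collected into a dictionary keyed by (attribute, index). This is only a sketch: the sample frame and the name `matches` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data with keys on the top column level.
columns = pd.MultiIndex.from_product(
    [["df1", "df2", "df3"], ["compass", "accel", "gyro"]]
)
df = pd.DataFrame(
    [
        [4.0, 1.0, 7.0, 5.0, 2.0, 8.0, 4.0, 3.0, 7.0],
        [6.0, 1.5, 9.0, 6.0, 1.5, 2.0, 3.0, 0.5, 9.0],
    ],
    index=pd.to_datetime(["2016-08-31 00:00:01", "2016-08-31 00:00:02"]),
    columns=columns,
)
dfs = df.stack().swaplevel(axis=0)

matches = {}
for (attribute, timestamp), row in dfs.iterrows():
    dup_filter = row.duplicated(keep=False)
    # Only record rows where at least two keys share a value.
    if dup_filter.any():
        matches[(attribute, timestamp)] = list(row[dup_filter].index)

for key, keys_with_same_value in matches.items():
    print(key, keys_with_same_value)
```

Note that duplicated(keep=False) flags every repeated value in a row, so with more than three keys a row containing two different repeated values would list all of the keys involved together.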

You could also do it one attribute at a time by

dfs_compass = df.stack().swaplevel(axis=0).loc['compass']

and iterate through the rows with just the index.
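A minimal sketch of the per-attribute variant, again on made-up sample data (the values and the name `compass_matches` are assumptions): after selecting 'compass', the row index is just the original index, so the loop variable is a single label rather than a tuple.

```python
import pandas as pd

# Hypothetical sample data with keys on the top column level.
columns = pd.MultiIndex.from_product(
    [["df1", "df2", "df3"], ["compass", "accel", "gyro"]]
)
df = pd.DataFrame(
    [
        [4.0, 1.0, 7.0, 5.0, 2.0, 8.0, 4.0, 3.0, 7.0],
        [6.0, 1.5, 9.0, 6.0, 1.5, 2.0, 3.0, 0.5, 9.0],
    ],
    index=pd.to_datetime(["2016-08-31 00:00:01", "2016-08-31 00:00:02"]),
    columns=columns,
)

# Keep only the compass rows; the attribute level is dropped from the index.
dfs_compass = df.stack().swaplevel(axis=0).loc["compass"]

compass_matches = {}
for index, row in dfs_compass.iterrows():
    dup_filter = row.duplicated(keep=False)
    compass_matches[index] = list(row[dup_filter].index)

print(compass_matches)
```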

Old

If I understand your question correctly, you want to return the indexes of rows that have matching values on the second level of your columns, i.e. ('compass', 'accel', 'gyro'). If so, the following will work.

compass_match_indexes = []

for index, row in df.iterrows():
    # True for any compass value that repeats an earlier one in this row
    match_filter = row[:, 'compass'].duplicated()
    if len(row[:, 'compass'][match_filter]) > 0:
        compass_match_indexes.append(index)

You can then select from your dataframe with that list, e.g. df.loc[compass_match_indexes].

Another approach: you could transpose your DataFrame with df.T and then use the duplicated function.
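One possible reading of the transpose suggestion, sketched on made-up data (the sample values and the names `compass_t`, `dup_mask`, and `matches` are assumptions, not part of the original answer): transpose, pick one attribute with xs, and apply duplicated to each column, which now corresponds to one original row.

```python
import pandas as pd

# Hypothetical sample data with keys on the top column level.
columns = pd.MultiIndex.from_product(
    [["df1", "df2", "df3"], ["compass", "accel", "gyro"]]
)
df = pd.DataFrame(
    [
        [4.0, 1.0, 7.0, 5.0, 2.0, 8.0, 4.0, 3.0, 7.0],
        [6.0, 1.5, 9.0, 6.0, 1.5, 2.0, 3.0, 0.5, 9.0],
    ],
    index=pd.to_datetime(["2016-08-31 00:00:01", "2016-08-31 00:00:02"]),
    columns=columns,
)

# After transposing, rows are (key, attribute); keep only the compass rows.
compass_t = df.T.xs("compass", level=1)

# For each original row (now a column), flag keys whose value is repeated.
dup_mask = compass_t.apply(lambda col: col.duplicated(keep=False))

matches = {
    ts: dup_mask.index[dup_mask[ts].to_numpy()].tolist()
    for ts in dup_mask.columns
}
print(matches)
```

This avoids an explicit iterrows loop over the original frame, at the cost of handling one attribute at a time.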

Upvotes: 1
