Reputation: 337
Using pandas, I open some csv files in a loop and set the index to the cycleID
column, except the cycleID
column is not unique. See below:
for filename in all_files:
abfdata = pd.read_csv(filename, index_col=None, header=0)
abfdata = abfdata.set_index("cycleID", drop=False)
for index, row in abfdata.iterrows():
print(row['cycleID'], row['mean'])
This prints the 2 columns (cycleID
and mean
) of the dataframe I am interested in for further computations:
1 1.5020712104685252e-11
1 6.56683605063102e-12
2 1.3993315187144084e-11
2 -8.670502467042485e-13
3 7.0270625256163566e-12
3 9.509995221868016e-12
4 1.2901435995915644e-11
4 9.513106448422182e-12
The objective is to use the rows corresponding to the same cycleID
and calculate the difference between the mean
column values. So, if there are 8 rows in the table, the final array or list would store 4 values.
I want to make it scalable as well where there can be 3 or more rows with the same cycleID
s. In that case, each cycleID
could have 2 or more mean
differences.
Update: Instead of creating a new ques about it, I thought I'd add here. I used the diff and groupby approach as mentioned in the solution. It works great but I have this extra need to save one of the mean values (odd row or even row doesn't matter) in a new column and make that part of the new data frame as well. How do I do that?
Upvotes: 0
Views: 2206
Reputation: 9047
You can use groupby
s2= df.groupby(['cycleID'])['mean'].diff()
s2.dropna(inplace=True)
output
1 -8.453876e-12
3 -1.486037e-11
5 2.482933e-12
7 -3.388330e-12
8 3.000000e-12
d = [[1, 1.5020712104685252e-11],
[1, 6.56683605063102e-12],
[2, 1.3993315187144084e-11],
[2, -8.670502467042485e-13],
[3, 7.0270625256163566e-12],
[3, 9.509995221868016e-12],
[4, 1.2901435995915644e-11],
[4, 9.513106448422182e-12]]
df = pd.DataFrame(d, columns=['cycleID', 'mean'])
df2 = df.groupby(['cycleID']).diff().dropna().rename(columns={'mean': 'difference'})
df2['mean'] = df['mean'].iloc[df2.index]
difference mean
1 -8.453876e-12 6.566836e-12
3 -1.486037e-11 -8.670502e-13
5 2.482933e-12 9.509995e-12
7 -3.388330e-12 9.513106e-12
Upvotes: 1