A dataframe splitting problem in Pandas, any thoughts?

Question

The probe of an instrument cycles back and forth along the x direction while recording its position and acquiring measurements. The probe makes 10 cycles, say from 0 to 10 um and back, and records the measurements. This gives 2 columns of data: position and measurement. The position cycles 0 um -> 10 um -> 0 -> 10 -> 0..., but the values carry experimental error, so they are all different.

I need to split the dataframe at the beginning of each cycle. Any interesting strategy to tackle this problem? Please let me know if you need more info. Thanks in advance.

Below is a link to an example of the dataframe that I have: https://www.dropbox.com/s/af4r8lw5lfhwexr/Example.xlsx?dl=0

In this example the instrument made 3 cycles and generated the data (measurement). Cycle 1 = Index 0-20; Cycle 2 = Index 20-40; and Cycle 3 = Index 40-60. I need to divide this dataframe into 3 dataframes, one for each cycle (Index 0-20; Index 20-40; Index 40-60).

The tricky part is that the method needs to be "general" because each cycle can have a different number of datapoints (in this example that is fixed to 20), and different experiments can be performed with a different number of cycles.

Andy · Accepted Answer

My approach is to keep track of when the numbers start increasing again after decreasing, which determines the cycle number. Not very elegant, sorry.

import pandas as pd

df = pd.read_excel('Example.xlsx')

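def cycle(array):
    """Return a cycle number for every position in array."""
    increasing = True  # the probe starts by moving forward
    cycle_num = 0
    answer = []
    last = len(array) - 1
    for ind in range(len(array)):
        # The last point has no successor, so it stays in the current cycle.
        if ind < last:
            if array[ind+1] - array[ind] >= 0:
                # Rising again after a falling stretch: a new cycle starts here.
                if not increasing:
                    cycle_num += 1
                increasing = True
            else:
                increasing = False
        answer.append(cycle_num)
    return answer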


df['Cycle'] = cycle(df['Distance'].to_list())
# Group by the column name (not a one-element list) so get_group accepts a plain cycle number.
grouped = df.groupby('Cycle')

print(grouped.get_group(0))
print(grouped.get_group(1))
print(grouped.get_group(2))
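
The same idea can probably be written without an explicit loop, and the split can be made independent of the number of cycles. The sketch below is only an illustration under some assumptions: the function name label_cycles, the column name Measurement and the synthetic data are made up here (only Distance comes from the code above), and, like the loop version, it assumes the experimental error is smaller than the step between consecutive positions, otherwise a noisy point could look like a reversal and start a spurious cycle.

```python
import numpy as np
import pandas as pd


def label_cycles(position: pd.Series) -> pd.Series:
    """Label each row with a 0-based cycle number (hypothetical helper).

    A new cycle starts at a row whose next step is forward (>= 0)
    while the previous row's next step was backward.
    """
    # step[i] = position[i+1] - position[i]; the last row has no successor (NaN).
    step = position.shift(-1) - position
    # NaN >= 0 is False, so the last row never starts a new cycle.
    forward = step >= 0
    # Treat the row "before the first" as forward: the probe starts by moving forward.
    prev_forward = forward.shift(1, fill_value=True).astype(bool)
    new_cycle = forward & ~prev_forward
    # Each new-cycle flag bumps the running cycle counter.
    return new_cycle.cumsum()


# Synthetic stand-in for Example.xlsx: 3 cycles of 0 -> 10 -> back, with small bounded noise.
rng = np.random.default_rng(0)
base = np.tile([0, 2.5, 5, 7.5, 10, 7.5, 5, 2.5], 3)
df = pd.DataFrame({
    "Distance": base + rng.uniform(-0.1, 0.1, size=base.size),
    "Measurement": rng.normal(size=base.size),  # assumed column name
})

df["Cycle"] = label_cycles(df["Distance"])

# One dataframe per cycle, however many cycles the experiment had.
frames = [group for _, group in df.groupby("Cycle")]
print(len(frames))
```

Each entry of frames is a separate dataframe, so frames[0] corresponds to grouped.get_group(0) above, and cycles with different numbers of points are handled the same way.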
