evgenii ershenko
evgenii ershenko

Reputation: 489

pandas: how to count values in a selected by condition time frame window

For each value('X' column) we have time frame window which defined as ['T1', 'T2']:

df = pd.DataFrame({'X': [1, 1, 1, 3, 3, 3],
                   'T1': ['2015-12-30 22:30:00',
                          '2015-12-30 22:30:15',
                          '2015-12-30 22:30:20',
                          '2015-11-30 23:40:00',
                          '2015-11-30 23:42:00',
                          '2015-11-30 23:40:00'],
                   'T2': ['2015-12-30 22:31:00',
                          '2015-12-30 22:30:45',
                          '2015-12-30 22:30:20',
                          '2015-11-30 23:43:00',
                          '2015-11-30 23:42:50',
                          '2015-11-30 23:44:00'], 
                   })
    X   T1                      T2
0   1   2015-12-30 22:30:00     2015-12-30 22:31:00
1   1   2015-12-30 22:30:15     2015-12-30 22:30:45
2   1   2015-12-30 22:30:20     2015-12-30 22:30:20
3   3   2015-11-30 23:40:00     2015-11-30 23:43:00
4   3   2015-11-30 23:42:00     2015-11-30 23:42:50
5   3   2015-11-30 23:40:00     2015-11-30 23:44:00

I want to count the same values('X') whose time frames inside(or equal) the time frame defined for this value, so I want to get next result:

df = pd.DataFrame({'X': [1, 1, 1, 3, 3, 3],
                   'T1': ['2015-12-30 22:30:00',
                          '2015-12-30 22:30:15',
                          '2015-12-30 22:30:20',
                          '2015-11-30 23:40:00',
                          '2015-11-30 23:42:00',
                          '2015-11-30 23:40:00'],
                   'T2': ['2015-12-30 22:31:00',
                          '2015-12-30 22:30:45',
                          '2015-12-30 22:30:20',
                          '2015-11-30 23:43:00',
                          '2015-11-30 23:42:50',
                          '2015-11-30 23:44:00'],
                   'count': [3, 2, 1, 2, 1, 3],
                   })
    X   T1                      T2                      count
0   1   2015-12-30 22:30:00     2015-12-30 22:31:00     3
1   1   2015-12-30 22:30:15     2015-12-30 22:30:45     2
2   1   2015-12-30 22:30:20     2015-12-30 22:30:20     1
3   3   2015-11-30 23:40:00     2015-11-30 23:43:00     2
4   3   2015-11-30 23:42:00     2015-11-30 23:42:50     1
    3   2015-11-30 23:40:00     2015-11-30 23:44:00     3

Upvotes: 2

Views: 265

Answers (1)

Alexandre B.
Alexandre B.

Reputation: 5502

You can do it with apply.

Here is one way to do:

# Import module
import pandas as pd

# You dataframe
df = pd.DataFrame({ 'X': [1, 1, 1, 3, 3, 3],
                    'T1': ['2015-12-30 22:30:00',
                         '2015-12-30 22:30:15',
                         '2015-12-30 22:30:20',
                         '2015-11-30 23:40:00',
                         '2015-11-30 23:42:00',
                         '2015-11-30 23:40:00'],
                    'T2': ['2015-12-30 22:31:00',
                         '2015-12-30 22:30:45',
                         '2015-12-30 22:30:20',
                         '2015-11-30 23:43:00',
                         '2015-11-30 23:42:50',
                         '2015-11-30 23:44:00'], 
                   })

# Convert columns to datetime object
df["T1"] = pd.to_datetime(df.T1, format="%Y-%m-%d %H:%M:%S")
df["T2"] = pd.to_datetime(df.T2, format="%Y-%m-%d %H:%M:%S")

# Function to comptute the number of element in the interval
def countIntervals(row):
    return len(df[(df.X == row.X) & (df.T1 >= row.T1) & (df.T2 < row.T2)]) + 1

# Add new column
df["count"] = df.apply(countIntervals, axis=1)
print(df)
#    X                  T1                  T2  count
# 0  1 2015-12-30 22:30:00 2015-12-30 22:31:00      3
# 1  1 2015-12-30 22:30:15 2015-12-30 22:30:45      2
# 2  1 2015-12-30 22:30:20 2015-12-30 22:30:20      1
# 3  3 2015-11-30 23:40:00 2015-11-30 23:43:00      2
# 4  3 2015-11-30 23:42:00 2015-11-30 23:42:50      1
# 5  3 2015-11-30 23:40:00 2015-11-30 23:44:00      3

Upvotes: 2

Related Questions