I have a dataframe output from the python script which gives following output Datetime High Low Time 546 2021-06-15 14:30:00 15891.049805 15868.049805 14:30:00 547 2021-06-15 14:45:00 15883.000000 15869.900391 14:45:00 548 2021-06-15 15:00:00 15881.500000 15866.500000 15:00:00 549 2021-06-15 15:15:00 15877.750000 15854.549805 15:15:00 550 2021-06-15 15:30:00 15869.250000 15869.250000 15:30:00 i Want to remove all rows where time is equal to 15:30:00. tried different things but unable to do. Help please. import pandas as pd import datetime as dt from pandas.core.frame import DataFrame import yfinance as yf import numpy as np import xlwings as xw ticker = "^NSEI" # getting Intraday data for plotting 15 minutes and 75 minutes chart en = dt.date.today() + dt.timedelta(days=1) st = en - dt.timedelta(days=30) df = yf.download(ticker,start = st, end=en, interval = '15m')\ .drop(['Open', 'Close', 'Adj Close', 'Volume'], axis = 1).reset_index() df['Datetime'] = pd.to_datetime(df['Datetime'].astype(str).str[:-6]) df['Time'] = df['Datetime'].dt.time print(df)

Deleting columns with specific conditions

	Datetime	High	Low	Time
546	2021-06-15 14:30:00	15891.049805	15868.049805	14:30:00
547	2021-06-15 14:45:00	15883.000000	15869.900391	14:45:00
548	2021-06-15 15:00:00	15881.500000	15866.500000	15:00:00
549	2021-06-15 15:15:00	15877.750000	15854.549805	15:15:00
550	2021-06-15 15:30:00	15869.250000	15869.250000	15:30:00

Reputation: 8826

As Pandas also provides vectorized string operations across the dataframe, thus it's easy to get the rows those contain the string:

Data-Frame

>>> df
          Datetime         High          Low      Time
0  6/15/2021 14:30  15891.04981  15868.04981  14:30:00
1  6/15/2021 14:45  15883.00000  15869.90039  14:45:00
2  6/15/2021 15:00  15881.50000  15866.50000  15:00:00
3  6/15/2021 15:15  15877.75000  15854.54981  15:15:00
4  6/15/2021 15:30  15869.25000  15869.25000  15:30:00

Result:

Method One:

Using str.contains ...

>>> df[~df['Time'].str.contains('15:30:00')]
          Datetime         High          Low      Time
0  6/15/2021 14:30  15891.04981  15868.04981  14:30:00
1  6/15/2021 14:45  15883.00000  15869.90039  14:45:00
2  6/15/2021 15:00  15881.50000  15866.50000  15:00:00
3  6/15/2021 15:15  15877.75000  15854.54981  15:15:00

OR

if you are looking based on the Datetime

>>> df[~df['Datetime'].str.contains('15:30')]
          Datetime         High          Low      Time
0  6/15/2021 14:30  15891.04981  15868.04981  14:30:00
1  6/15/2021 14:45  15883.00000  15869.90039  14:45:00
2  6/15/2021 15:00  15881.50000  15866.50000  15:00:00
3  6/15/2021 15:15  15877.75000  15854.54981  15:15:00

OR

>>> df[~df.Time.str.contains("15:30") == True]
          Datetime         High          Low      Time
0  6/15/2021 14:30  15891.04981  15868.04981  14:30:00
1  6/15/2021 14:45  15883.00000  15869.90039  14:45:00
2  6/15/2021 15:00  15881.50000  15866.50000  15:00:00
3  6/15/2021 15:15  15877.75000  15854.54981  15:15:00

OR

>>> df[df['Time'].str.contains('15:30') == False]
          Datetime         High          Low      Time
0  6/15/2021 14:30  15891.04981  15868.04981  14:30:00
1  6/15/2021 14:45  15883.00000  15869.90039  14:45:00
2  6/15/2021 15:00  15881.50000  15866.50000  15:00:00
3  6/15/2021 15:15  15877.75000  15854.54981  15:15:00

OR

>>> df[df['Time'].str.contains('15:30') == 0]
          Datetime         High          Low      Time
0  6/15/2021 14:30  15891.04981  15868.04981  14:30:00
1  6/15/2021 14:45  15883.00000  15869.90039  14:45:00
2  6/15/2021 15:00  15881.50000  15866.50000  15:00:00
3  6/15/2021 15:15  15877.75000  15854.54981  15:15:00

Method two:

Using isin ...

>>> df[~df['Time'].isin(['15:30:00'])]
          Datetime         High          Low      Time
0  6/15/2021 14:30  15891.04981  15868.04981  14:30:00
1  6/15/2021 14:45  15883.00000  15869.90039  14:45:00
2  6/15/2021 15:00  15881.50000  15866.50000  15:00:00
3  6/15/2021 15:15  15877.75000  15854.54981  15:15:00

Method three:

Using Not equal to of dataframe and other, element-wise (binary operator ne).

>>> df[df.Time != '15:30:00']
          Datetime         High          Low      Time
0  6/15/2021 14:30  15891.04981  15868.04981  14:30:00
1  6/15/2021 14:45  15883.00000  15869.90039  14:45:00
2  6/15/2021 15:00  15881.50000  15866.50000  15:00:00
3  6/15/2021 15:15  15877.75000  15854.54981  15:15:00

OR

>>> df[df['Time'] != '15:30:00']
          Datetime         High          Low      Time
0  6/15/2021 14:30  15891.04981  15868.04981  14:30:00
1  6/15/2021 14:45  15883.00000  15869.90039  14:45:00
2  6/15/2021 15:00  15881.50000  15866.50000  15:00:00
3  6/15/2021 15:15  15877.75000  15854.54981  15:15:00

OR

>>> df[df['Time'].ne('15:30:00')]
          Datetime         High          Low      Time
0  6/15/2021 14:30  15891.04981  15868.04981  14:30:00
1  6/15/2021 14:45  15883.00000  15869.90039  14:45:00
2  6/15/2021 15:00  15881.50000  15866.50000  15:00:00
3  6/15/2021 15:15  15877.75000  15854.54981  15:15:00

Upvotes: 3

user13300964

Reputation:

You can try this:

import pandas as pd

test_data=pd.read_csv("test.csv")
test_data=test_data[test_data["Time"]!="15:30:00"]
print(test_data)

Just select rows based on condition.

Upvotes: 0

inarticulatus

Reputation: 175

The way I did was the following,

First we get the the time we want to remove from the dataset, that is 15:30:00 in this case.

Since the Datetime column is in the datetime format, we cannot compare the time as strings. So we convert the given time in the datetime.time() format.

rm_time = dt.time(15,30)

With this, we can go about using the DataFrame.drop()

df.drop(df[df.Datetime.dt.time == rm_time].index)

Upvotes: 2

Deleting columns with specific conditions

Answers (3)

Related Questions