Reputation: 146
I have a dataframe output from the python script which gives following output
Datetime | High | Low | Time | |
---|---|---|---|---|
546 | 2021-06-15 14:30:00 | 15891.049805 | 15868.049805 | 14:30:00 |
547 | 2021-06-15 14:45:00 | 15883.000000 | 15869.900391 | 14:45:00 |
548 | 2021-06-15 15:00:00 | 15881.500000 | 15866.500000 | 15:00:00 |
549 | 2021-06-15 15:15:00 | 15877.750000 | 15854.549805 | 15:15:00 |
550 | 2021-06-15 15:30:00 | 15869.250000 | 15869.250000 | 15:30:00 |
i Want to remove all rows where time is equal to 15:30:00. tried different things but unable to do. Help please.
import pandas as pd
import datetime as dt
from pandas.core.frame import DataFrame
import yfinance as yf
import numpy as np
import xlwings as xw
ticker = "^NSEI"
# getting Intraday data for plotting 15 minutes and 75 minutes chart
en = dt.date.today() + dt.timedelta(days=1)
st = en - dt.timedelta(days=30)
df = yf.download(ticker,start = st, end=en, interval = '15m')\
.drop(['Open', 'Close', 'Adj Close', 'Volume'], axis = 1).reset_index()
df['Datetime'] = pd.to_datetime(df['Datetime'].astype(str).str[:-6])
df['Time'] = df['Datetime'].dt.time
print(df)
Upvotes: 0
Views: 141
Reputation: 8826
As Pandas also provides vectorized string operations across the dataframe, thus it's easy to get the rows those contain the string:
Data-Frame
>>> df
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
4 6/15/2021 15:30 15869.25000 15869.25000 15:30:00
Result:
Method One:
Using str.contains
...
>>> df[~df['Time'].str.contains('15:30:00')]
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
OR
if you are looking based on the Datetime
>>> df[~df['Datetime'].str.contains('15:30')]
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
OR
>>> df[~df.Time.str.contains("15:30") == True]
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
OR
>>> df[df['Time'].str.contains('15:30') == False]
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
OR
>>> df[df['Time'].str.contains('15:30') == 0]
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
Method two:
Using isin
...
>>> df[~df['Time'].isin(['15:30:00'])]
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
Method three:
Using Not equal to of dataframe and other, element-wise (binary operator ne).
>>> df[df.Time != '15:30:00']
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
OR
>>> df[df['Time'] != '15:30:00']
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
OR
>>> df[df['Time'].ne('15:30:00')]
Datetime High Low Time
0 6/15/2021 14:30 15891.04981 15868.04981 14:30:00
1 6/15/2021 14:45 15883.00000 15869.90039 14:45:00
2 6/15/2021 15:00 15881.50000 15866.50000 15:00:00
3 6/15/2021 15:15 15877.75000 15854.54981 15:15:00
Upvotes: 3
Reputation:
You can try this:
import pandas as pd
test_data=pd.read_csv("test.csv")
test_data=test_data[test_data["Time"]!="15:30:00"]
print(test_data)
Just select rows based on condition.
Upvotes: 0
Reputation: 175
The way I did was the following,
First we get the the time we want to remove from the dataset, that is 15:30:00 in this case.
Since the Datetime column is in the datetime format, we cannot compare the time as strings. So we convert the given time in the datetime.time() format.
rm_time = dt.time(15,30)
With this, we can go about using the DataFrame.drop()
df.drop(df[df.Datetime.dt.time == rm_time].index)
Upvotes: 2