Reputation: 101
I have the following code, which correlates data brought in from PgSQL:
if wd is not None and dd is not None:
    alldata=np.concatenate((wd,dd))
    alldat_df=pd.DataFrame(alldata, index=None, columns=['datetime','rain', 'raindiff'])
    alldat_df.drop(alldat_df.loc[2708:2738].index, inplace=True)
    alldata=np.array(alldat_df)
    alldata[0,2]=0
    mask = (alldat_df['datetime'] > fdate) & (alldat_df['datetime'] <= tdate)
    ndf=alldat_df.loc[mask]
    ndf.loc[0,['raindiff']]=0
    ndf.index=ndf['datetime']
    ndf.drop(columns=['datetime'], inplace=True)
    davisdfnew=ndf.resample(bs, offset=bs, origin=fdate).sum()
    davisdfnew.rename(columns={'rain':'rain sum','raindiff':'raindiff sum'}, inplace=True)
if dd is None:
    alldat_df=pd.DataFrame(wd, index=None, columns=['datetime', 'rain', 'raindiff'])
    mask = (alldat_df['datetime'] > fdate) & (alldat_df['datetime'] <= tdate)
    ndf=alldat_df.loc[mask]
    ndf.loc[0,['raindiff']]=0
    ndf.index=ndf['datetime']
    ndf.drop(columns=['datetime'], inplace=True)
    davisdfnew=ndf.resample(bs, offset=bs, origin=fdate).sum()
    davisdfnew.rename(columns={'rain':'rain sum','raindiff':'raindiff sum'}, inplace=True)
if wd is None:
    alldat_df=pd.DataFrame(dd, index=None, columns=['datetime', 'rain', 'raindiff'])
    mask = (alldat_df['datetime'] > fdate) & (alldat_df['datetime'] <= tdate)
    ndf=alldat_df.loc[mask]
    ndf.loc[0,['raindiff']]=0
    ndf.index=ndf['datetime']
    ndf.drop(columns=['datetime'], inplace=True)
    davisdfnew=ndf.resample(bs, offset=bs, origin=fdate).sum()
    davisdfnew.rename(columns={'rain':'rain sum','raindiff':'raindiff sum'}, inplace=True)
When it runs and either of the first two if conditions is met, it throws the following warning:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
ndf.loc[0,['raindiff']]=0
but when the condition if wd is None is met, there is no warning. In all cases the value at ndf.loc[0,['raindiff']] is a NoneType object.
I would appreciate it if someone could shed some light on this!
Edited as per @john giorgio's comment:
wd=
array([[datetime.datetime(2021, 5, 20, 10, 45), 0.0, None],
[datetime.datetime(2021, 5, 20, 11, 0), 0.0, 0.0],
[datetime.datetime(2021, 5, 20, 11, 15), 0.0, 0.0],
...,
[datetime.datetime(2021, 6, 17, 22, 30), 96.6, 0.0],
[datetime.datetime(2021, 6, 17, 22, 45), 96.6, 0.0],
[datetime.datetime(2021, 6, 17, 23, 0), 96.6, 0.0]], dtype=object)
dd=
array([[datetime.datetime(2021, 6, 17, 15, 30, 42), 96.6, None],
[datetime.datetime(2021, 6, 17, 15, 35, 42), 96.6, 0.0],
[datetime.datetime(2021, 6, 17, 15, 40, 42), 96.6, 0.0],
...,
[datetime.datetime(2021, 6, 30, 23, 45, 41), 113.8, 0.0],
[datetime.datetime(2021, 6, 30, 23, 50, 41), 113.8, 0.0],
[datetime.datetime(2021, 6, 30, 23, 55, 41), 113.8, 0.0]],
dtype=object)
As I said, the warning occurs when wd exists. If both wd and dd exist, they are combined and duplicate datetimes are removed to give ndf; if only wd exists, ndf is formed from it alone. In both of these cases the warning occurs.
If only dd exists, ndf is formed from that, and the warning does not occur.
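For what it's worth, the same warning can be reproduced with a minimal slice-then-assign pattern; the data below is made up purely for illustration and is not taken from the code above:

import pandas as pd

df = pd.DataFrame({'rain': [0.0, 0.0, 96.6], 'raindiff': [None, 0.0, 0.0]})
sub = df.loc[df['rain'] <= 50]   # boolean slice; pandas flags sub as a possible copy of df
sub.loc[0, ['raindiff']] = 0     # assigning into the slice raises SettingWithCopyWarning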
Upvotes: 0
Views: 53
Reputation: 659
What you could try is resetting the index with .reset_index(drop=True) each time you take a subsample of your original dataset, before performing any other action.
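As a sketch, reusing the names from your code (alldat_df, fdate, tdate, mask):

# assuming alldat_df, fdate and tdate as in the question
mask = (alldat_df['datetime'] > fdate) & (alldat_df['datetime'] <= tdate)
ndf = alldat_df.loc[mask].reset_index(drop=True)  # a new, independent DataFrame with a fresh 0..n-1 index
ndf.loc[0, ['raindiff']] = 0                      # writes into ndf itself, so no SettingWithCopyWarning

Taking an explicit .copy() of the .loc[mask] slice is another common way to make the same intent clear to pandas.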
Upvotes: 1