Martijn van Amsterdam

Reputation: 326

Why is my data empty when I add it to my DataFrame column?

I use a filter to check for conditions in my DataFrame so I can mark the matching rows.

filtering = (dfsamen.shift(0).moving=='movingToclose') & (more conditions)
dffilter = pd.DataFrame(data=filtering, columns = ['filter'])
dffilter['DateTime'] = dfsamen['DateTime']

Output:

filtering

4     False
5     False
6      True
7      True

dffilter

4    False 2018-06-03 06:33:38.593
5    False 2018-06-03 06:33:39.197
6     True 2018-06-03 06:33:40.597
7     True 2018-06-03 06:33:41.800

But later I use the same code with different conditions and it doesn't work:

filtering2 = (dfsamen.shift(0).Input5==1) | (more conditions)
dffilter2 = pd.DataFrame(data=filtering2, columns=['filter2'])
dffilter2['DateTime'] = dfsamen['DateTime']

Output:

filtering2

4     False
5      True
6      True
7      True

dffilter2 (before the datetime is added)

Empty DataFrame
Columns: [filter2]
Index: []

dffilter2 (with datetime)

4      NaN 2018-06-03 06:33:38.593
5      NaN 2018-06-03 06:33:39.197
6      NaN 2018-06-03 06:33:40.597
7      NaN 2018-06-03 06:33:41.800

So why does my data disappear in the second filter when I add it to the column, even though the data exists in filtering2?

Upvotes: 1

Views: 95

Answers (1)

jezrael

Reputation: 862671

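The problem is your DataFrame constructor: the new DataFrame can end up with a different index (for example a default RangeIndex) than the data you assign to it. pandas aligns on index labels, so the data are not aligned and you get a NaN column for the rows whose index values differ.

One solution is to convert the values to NumPy arrays, which are assigned by position instead of by label: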
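The alignment behaviour described above can be reproduced in a small sketch. The names `left` and `dates` and their values are made up for illustration and are not taken from the question; the point is only that assigning a Series matches on index labels, while assigning its `.values` array matches on position.

```python
import pandas as pd

# Hypothetical data: the frame has a default RangeIndex (0, 1, 2),
# while the Series to be assigned is labelled 4, 5, 6.
left = pd.DataFrame({'filter': [True, False, False]})
dates = pd.Series(pd.date_range('2015-01-01', periods=3), index=[4, 5, 6])

# Assigning a Series aligns on index labels. No label matches,
# so every value in the new column is missing.
left['DateTime'] = dates
all_missing = bool(left['DateTime'].isna().all())

# Assigning the underlying array is positional, so the values are kept.
left['DateTime'] = dates.values
all_present = bool(left['DateTime'].notna().all())

print(all_missing, all_present)
```

Comparing `dffilter2.index` with `dfsamen.index` in the original code is a quick way to check whether the same kind of mismatch is happening there.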

filtering = (dfsamen.shift(0).moving=='movingToclose') & (more conditions)

dffilter = pd.DataFrame(data=filtering.values, columns = ['filter'])
dffilter['DateTime'] = dfsamen['DateTime'].values
print (dffilter)

Sample:

dfsamen = pd.DataFrame({
        'A':list('abc'),
        'DateTime':pd.date_range('2015-01-01', periods=3),
        'C':[7,8,9]
}, index=[4,5,6])

print (dfsamen)
   A   DateTime  C
4  a 2015-01-01  7
5  b 2015-01-02  8
6  c 2015-01-03  9

filtering = dfsamen.A == 'a'

dffilter = pd.DataFrame(data=filtering.values, columns = ['filter'])
dffilter['DateTime'] = dfsamen['DateTime'].values
print (dffilter)
   filter   DateTime
0    True 2015-01-01
1   False 2015-01-02
2   False 2015-01-03

Or use Series.to_frame to convert the Series to a one-column DataFrame, which keeps the original index:

filtering = dfsamen.A == 'a'

dffilter = filtering.to_frame('filter')
dffilter['DateTime'] = dfsamen['DateTime'].values
print (dffilter)
   filter   DateTime
4    True 2015-01-01
5   False 2015-01-02
6   False 2015-01-03
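Since to_frame keeps the index of the original Series, the `.values` step should not be strictly necessary in this variant, because both objects then carry the same labels. A minimal sketch under that assumption, reusing a trimmed copy of the sample data above:

```python
import pandas as pd

# Trimmed copy of the sample frame from the answer, with index 4, 5, 6.
dfsamen = pd.DataFrame({
    'A': list('abc'),
    'DateTime': pd.date_range('2015-01-01', periods=3),
}, index=[4, 5, 6])

filtering = dfsamen['A'] == 'a'

# to_frame keeps the index 4, 5, 6 and names the single column 'filter'.
dffilter = filtering.to_frame('filter')

# Both sides share the same labels, so label-based alignment keeps every value.
dffilter['DateTime'] = dfsamen['DateTime']
print(dffilter)
```

Using `.values` as in the answer gives the same result here and additionally guards against the case where the two indices differ.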

Upvotes: 1
