Reputation: 57
I have two dataframes in the same time resolution. From the first dataframe (in my case: df_data1) I only want to have all values ['A'] where ['B'] is < 90. And now I'd like filter my secound dataframe so that I have only the values with the same timestamp(timeindex) from my first dataframe
df_data1 = pd.io.parsers.read_csv(station_path, skiprows=0, index_col=0, na_values=[-999], names= names_header , sep=';', header=None , squeeze=True)
date A B
16.08.2013 03:00 -1 97
16.08.2013 03:15 -1 95
16.08.2013 03:30 0 92
16.08.2013 03:45 4 90
16.08.2013 04:00 18 88
16.08.2013 04:15 42 86
16.08.2013 04:30 73 83
16.08.2013 04:45 110 81
16.08.2013 05:00 151 78
Now I'd like to have all df_data['A'] where df_data['B'] is <90. So I do:
df_data = df_data[(df_data['B'] < 90)]
the second dataframe looks like:
df_data2 = pd.io.parsers.read_csv(station_path, skiprows=1, sep=";", index_col=False, header=None)
date w x y z
16.08.2013 03:00 0 0 0 0
16.08.2013 03:15 0 0 0 0
16.08.2013 03:30 0 0 0 0
16.08.2013 03:45 0 0 0 0
16.08.2013 04:00 0 0 0 0
16.08.2013 04:15 0 0 0 0
16.08.2013 04:30 47 47 48 0
16.08.2013 04:45 77 78 79 88
16.08.2013 05:00 111 112 113 125
Have anyone an idea to solve this?
I need the dataframes in the same shape cause furthermore I'd like to calculate the np.corrcoef
and so on.
Upvotes: 1
Views: 124
Reputation: 394469
Well your first part is pretty much done:
df_data = df_data[(df_data['B'] < 90)]
you can then access column A using df_data['A']
if your index values are the same in both df then this should work:
In [40]:
df1.loc[df_data.index]
Out[40]:
w x y z
date
2013-08-16 04:00:00 0 0 0 0
2013-08-16 04:15:00 0 0 0 0
2013-08-16 04:30:00 47 47 48 0
2013-08-16 04:45:00 77 78 79 88
2013-08-16 05:00:00 111 112 125 NaN
EDIT
Unclear why you get a KeyError
but you can use the following also:
df_data2[df_data2.index.isin(df_data1.index)]
This will handle any index values that are not present in your second df.
Upvotes: 2
Reputation: 57
to complete this:
but with the following expression it works well:
df_data2[df_data2.index.isin(df_data1.index)]
Upvotes: 1