Reputation: 154454
Is it possible to create a DataFrame from a list of series without duplicating their names?
For example, creating the same DataFrame as:
>>> pd.DataFrame({ "foo": data["foo"], "bar": other_data["bar"] })
But without needing to explicitly name the columns?
Upvotes: 1
Views: 101
Reputation: 3599
I prefer the explicit way, as presented in your original post, but if you really want to write certain names once, you could try this:
import pandas as pd
import numpy as np

def dictify(*args):
    # Each argument is a (name, source) pair; look the name up in its source.
    return {name: source[name] for name, source in args}

data = {'foo': np.random.randn(5)}
other_data = {'bar': np.random.randn(5)}
print(pd.DataFrame(dictify(('foo', data), ('bar', other_data))))
The output is as expected:
        bar       foo
0  0.533973 -0.477521
1  0.027354  0.974038
2 -0.725991  0.350420
3  1.921215  0.648210
4  0.547640  1.652310

[5 rows x 2 columns]
Upvotes: 0
Reputation: 68126
It seems like you want to join the DataFrames (this works similarly to a SQL join):
import numpy as np
import pandas

# randint's upper bound is exclusive, so high=11 yields integers from 0 to 10
# (random_integers is deprecated in NumPy)
df1 = pandas.DataFrame(
    np.random.randint(low=0, high=11, size=(10, 2)),
    columns=['foo', 'bar'],
    index=list('ABCDEFHIJK')
)
df2 = pandas.DataFrame(
    np.random.randint(low=0, high=11, size=(10, 2)),
    columns=['bar', 'bax'],
    index=list('DEFHIJKLMN')
)

# Outer join on the index: keep every label from either frame
df1[['foo']].join(df2['bar'], how='outer')
The on kwarg takes a list of columns or None. If None, it'll join on the indices of the two DataFrames. You just need to make sure that you're using a DataFrame for the left side, hence the double brackets to force df1[['foo']] to a DataFrame (df1['foo'] returns a Series).
This gives me:
   foo  bar
A    4  NaN
B    0  NaN
C   10  NaN
D    8    3
E    2    0
F    3    3
H    9   10
I    0    9
J    5    6
K    2    9
L  NaN    3
M  NaN    1
N  NaN    1
You can also do inner, left, and right joins.
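To make the difference between the join types concrete, here is a minimal sketch. The frames left and right and their values are made up for illustration and are not from the example above; only the index labels matter here.

```python
import pandas as pd

# Hypothetical frames whose indices only partly overlap (B and C are shared).
left = pd.DataFrame({"foo": [1, 2, 3]}, index=list("ABC"))
right = pd.DataFrame({"bar": [10, 20, 30]}, index=list("BCD"))

# Join on the index with each strategy and record which labels survive.
results = {}
for how in ("inner", "left", "right", "outer"):
    joined = left.join(right, how=how)
    results[how] = joined
    print(how, list(joined.index))
```

An inner join keeps only the labels found in both frames, left and right keep the labels of that side, and outer keeps the union, filling the gaps with NaN.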
Upvotes: 2
Reputation: 13768
Try pandas.concat, which takes a list of items to combine as its argument:
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.random.randn(100, 4), columns=list('abcd'))
df2 = pd.DataFrame(np.random.randn(100, 3), columns=list('xyz'))
df3 = pd.concat([df1['a'], df2['y']], axis=1)
Note that you need to use axis=1 to stack things together side by side and axis=0 (which is the default) to combine them one over the other.
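Applied to the question itself, this means the column names never have to be repeated, because concat labels each column with the name of the Series it came from. A minimal sketch, assuming data and other_data are DataFrames (the values below are made up for illustration):

```python
import pandas as pd

# Hypothetical stand-ins for the question's `data` and `other_data`.
data = pd.DataFrame({"foo": [1, 2, 3], "baz": [7, 8, 9]})
other_data = pd.DataFrame({"bar": [4.0, 5.0, 6.0]})

# Selecting a column gives a Series whose .name is the column label,
# so concat along axis=1 reuses "foo" and "bar" as the new column names.
combined = pd.concat([data["foo"], other_data["bar"]], axis=1)
print(combined)
```

Rows are aligned on the index, so Series with different indices would produce NaN wherever a label is missing from one of them.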
Upvotes: 3