Deseaus

Reputation: 3

Concatenate columns and use the column names as a binary feature in a new column

I currently have a DataFrame with two columns that contain the same type of data:

id   foo   bar
1    f1    b1
2    f2    b2
3    f3    b3

I know how to concatenate the two columns, but I would also like foo and bar to appear in an additional column as a binary feature indicating which column they originally came from, like so:

id   foobar  column
1    f1      foo
2    f2      foo
3    f3      foo
4    b1      bar
5    b2      bar
6    b3      bar

How can I achieve that?

Upvotes: 0

Views: 719

Answers (1)

Stefan

Reputation: 42905

You could do:

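import pandas as pd

df = pd.DataFrame({'foo': ['f1', 'f2', 'f3'], 'bar': ['b1', 'b2', 'b3']})

print(df)

  foo bar
0  f1  b1
1  f2  b2
2  f3  b3

# The keys must be listed in the same order as the Series passed to concat,
# otherwise each value is labelled with the wrong source column.
cols = ['foo', 'bar']
df = pd.concat([df[c] for c in cols], keys=cols).reset_index(level=0)
df.columns = ['source', ''.join(cols)]
print(df)

  source foobar
0    foo     f1
1    foo     f2
2    foo     f3
0    bar     b1
1    bar     b2
2    bar     b3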
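
If you also want the running id from 1 to 6 and the label column to be called column, as in the question, DataFrame.melt is one possible alternative. This is only a sketch and not part of the approach above: it assumes a pandas version that has DataFrame.melt (0.20 or later), and the name long_df is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2, 3],
    'foo': ['f1', 'f2', 'f3'],
    'bar': ['b1', 'b2', 'b3'],
})

# melt stacks the listed columns on top of each other: the former column
# name goes into `var_name`, the cell value into `value_name`.
long_df = df.melt(value_vars=['foo', 'bar'],
                  var_name='column', value_name='foobar')

# Put the value column first and add a fresh id running from 1 to n.
long_df = long_df[['foobar', 'column']]
long_df.insert(0, 'id', range(1, len(long_df) + 1))

print(long_df)
```

The rows come out in the order of value_vars, so the three foo values are followed by the three bar values. The original id column is dropped here because it is not listed in id_vars.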

Upvotes: 1
