Prachi
Prachi

Reputation: 544

How to fill a df column with range of values of 2 columns from another df?

Df1

A   B  C1  C2   D   E
a1  b1  2   4   d1  e1
a2  b2  1   2   d2  e2

Df2

A   B   C   D   E
a1  b1  2   d1  e1
a1  b1  3   d1  e1
a1  b1  4   d1  e1
a2  b2  1   d2  e2
a2  b2  2   d2  e2

How to make Df2 from Df1 in the fastest possible way? I tried using groupby and then within for loop used np.arange to fill Df2.C and then used pd.concat to make the final Df2. But this approach is very slow and doesn't seem very elegant and pythonic as well. Can somebody please help with this problem.

Upvotes: 2

Views: 70

Answers (2)

Scott Boston
Scott Boston

Reputation: 153460

Try this:

df1.assign(C = [np.arange(s, e+1) for s, e in zip(df1['C1'], df1['C2'])])\
   .explode('C')

Output:

    A   B  C1  C2   D   E  C
0  a1  b1   2   4  d1  e1  2
0  a1  b1   2   4  d1  e1  3
0  a1  b1   2   4  d1  e1  4
1  a2  b2   1   2  d2  e2  1
1  a2  b2   1   2  d2  e2  2

Upvotes: 2

user7864386
user7864386

Reputation:

One way is to melt df1, use groupby.apply to add ranges; then explode for the final output:

cols = ['A','B','D','E']
out = (df1.melt(cols, value_name='C').groupby(cols)['C']
       .apply(lambda x: range(x.min(), x.max()+1))
       .explode().reset_index(name='C'))

Output:

    A   B   D   E  C
0  a1  b1  d1  e1  2
1  a1  b1  d1  e1  3
2  a1  b1  d1  e1  4
3  a2  b2  d2  e2  1
4  a2  b2  d2  e2  2

Upvotes: 1

Related Questions