Pijay
Pijay

Reputation: 53

How to explode range (from two column) to rows

Hello all I have this:

df = pd.DataFrame({'name':['L1', 'L2'], 'from':['1', '5'], 'to':['3', '7']})

name    from    to
L1       1       3
L2       5       7

In fact I have a lot of row (3000000) and very big range like 1 to 9000.

What is the best way to have this kind of result (exploding the range in rows)

like:

name    n°
L1       1
L1       2
L1       3
L2       5
L2       6
L2       7

Thank's a lot

Upvotes: 3

Views: 298

Answers (3)

Dev Khadka
Dev Khadka

Reputation: 5471

Edited: sorry I got the questions wrong 1st time, I corrected by answer

As pandas data are stored as numpy array internally and numpy manipulations are fast you can use numpy manipulation to do it

import pandas as pd
import numpy as np

df = pd.DataFrame({'name':['L1', 'L2'], 'from':[1, 5], 'to':[3, 7]})

t_dict = {}
## find number of times each rows should be repeated
repeats = (df["to"]- df["from"]+1 ).values

## repeat names
t_dict["name"] = df["name"].values.repeat(repeats)

## repeat notes and add 0,1,2,3 .. seq to each repeated section
t_dict["n°"] = df["from"].values.repeat(repeats) + np.array([i for count in repeats for i in range(count)])

pd.DataFrame(t_dict)

output

name    n°
0   L1  1
1   L1  2
2   L1  3
3   L2  5
4   L2  6
5   L2  7

Upvotes: 0

jezrael
jezrael

Reputation: 863236

Use list comprehension with range and DataFrame constructor:

df[['from','to']] = df[['from','to']].astype(int)

zipped = zip(df['name'], df['from'], df['to'])
df = pd.DataFrame([(i, y) for i,j,k in zipped for y in range(j, k+1)], columns=['name','id'])
print (df)
  name  id
0   L1   1
1   L1   2
2   L1   3
3   L2   5
4   L2   6
5   L2   7

Another solution:

df[['from','to']] = df[['from','to']].astype(int)

s = df['to'] - df['from'] + 1
df = df.loc[df.index.repeat(s), ['name','from']].rename(columns={'from':'no'})
df['no'] += df.groupby(level=0).cumcount()
df = df.reset_index(drop=True)
print (df)
  name  no
0   L1   1
1   L1   2
2   L1   3
3   L2   5
4   L2   6
5   L2   7

Upvotes: 1

iDrwish
iDrwish

Reputation: 3103

(pd.concat([df, df[['from', 'to']]
           .applymap(int)
           .apply(lambda x: np.arange(x[0], x[1]), axis=1)], 
          axis=1)
.explode(0))

Output:

    name    from    to  0
0   L1         1    3   1
0   L1         1    3   2
1   L2         5    7   5
1   L2         5    7   6

Upvotes: 0

Related Questions