I have a Pandas dataframe that looks like this: Location Number Position New York 111 1 through 12 Dallas 222 1 through 12 San Francisco 333 1 through 4 I would like to basically explode the dataframe based on the Position column so that the output looks like this: Location Number Position New York 111 1 New York 111 2 New York 111 3 New York 111 4 New York 111 5 New York 111 6 New York 111 7 New York 111 8 New York 111 9 New York 111 10 New York 111 11 New York 111 12 Dallas 222 etc. etc. I would like to do this for every instance of Location and Number based on what the values in Position are. Is there a quick and easy way to do this??

Reputation: 37

Exploding a dataframe based on column value

I have a Pandas dataframe that looks like this:

Location	Number	Position
New York	111	1 through 12
Dallas	222	1 through 12
San Francisco	333	1 through 4

I would like to basically explode the dataframe based on the Position column so that the output looks like this:

Location	Number	Position
New York	111	1
New York	111	2
New York	111	3
New York	111	4
New York	111	5
New York	111	6
New York	111	7
New York	111	8
New York	111	9
New York	111	10
New York	111	11
New York	111	12
Dallas	222	etc. etc.

I would like to do this for every instance of Location and Number based on what the values in Position are.

Is there a quick and easy way to do this??

Upvotes: 0

Answers (1)

mozway

Reputation: 260410

One option using explode:

out = (df
 .assign(Position=[range(a, b+1) for x in df['Position']
                   for a,b in [map(int, x.split(' through '))]])
 .explode('Position')
)

Another approach using reindexing:

df2 = df['Position'].str.extract('(\d+) through (\d+)').astype(int)
#    0   1
# 0  1  12
# 1  1  12
# 2  1   4

rep = df2[1].sub(df2[0]).add(1)

out = (df
 .loc[df.index.repeat(rep)]
 .assign(Position=lambda d: d.groupby(level=0).cumcount().add(df2[0]))
)

output:

        Location  Number  Position
0       New York     111         1
0       New York     111         2
0       New York     111         3
0       New York     111         4
0       New York     111         5
0       New York     111         6
0       New York     111         7
0       New York     111         8
0       New York     111         9
0       New York     111        10
0       New York     111        11
0       New York     111        12
1         Dallas     222         1
1         Dallas     222         2
1         Dallas     222         3
1         Dallas     222         4
1         Dallas     222         5
1         Dallas     222         6
1         Dallas     222         7
1         Dallas     222         8
1         Dallas     222         9
1         Dallas     222        10
1         Dallas     222        11
1         Dallas     222        12
2  San Francisco     333         1
2  San Francisco     333         2
2  San Francisco     333         3
2  San Francisco     333         4

Upvotes: 1

Exploding a dataframe based on column value

Answers (1)

Related Questions