user3222101
user3222101

Reputation: 1330

How to split data to multiple rows in pandas on one condition?

I have the data in the below dataframe as:-

id name value year quarter 
1 an     2.3  2012 1
2 yu     3.5  2012 2
3 ij     3.1  2013 4
4 ij     2.1  2013 1

to be converted to below dataframe i.e. get month from quarter and split row into 3.

id name value year quarter month
1 an     2.3  2012 1       01
1 an     2.3  2012 1       02
1 an     2.3  2012 1       03
2 yu     3.5  2012 2       04
2 yu     3.5  2012 2       05
2 yu     3.5  2012 2       06
3 ij     3.1  2013 4       10  
3 ij     3.1  2013 4       11
3 ij     3.1  2013 4       12
4 ij     2.1  2013 1       01
4 ij     2.1  2013 1       02
4 ij     2.1  2013 1       03

Upvotes: 3

Views: 733

Answers (4)

BENY
BENY

Reputation: 323246

Using reindex with pd.to_datetime , and we adding the cumcount for each sub group

df=df.reindex(df.index.repeat(3))
df['Month']=pd.to_datetime(df[['year','quarter']].astype(str).apply('Q'.join,1)).dt.month+df.groupby(level=0).cumcount()
df
Out[1258]: 
   id name  value  year  quarter  Month
0   1   an    2.3  2012        1      1
0   1   an    2.3  2012        1      2
0   1   an    2.3  2012        1      3
1   2   yu    3.5  2012        2      4
1   2   yu    3.5  2012        2      5
1   2   yu    3.5  2012        2      6
2   3   ij    3.1  2013        4     10
2   3   ij    3.1  2013        4     11
2   3   ij    3.1  2013        4     12

Upvotes: 1

user3483203
user3483203

Reputation: 51155

First, create a DataFrame with the month ranges of each quarter in your current DataFrame:

m = pd.DataFrame([range(i*3-2, 3*i+1) for i in df.quater], index=df.quater)

         0   1   2
quater
1        1   2   3
2        4   5   6
4       10  11  12

Now join and stack:

df.set_index('quater').join(m.stack().reset_index(1, drop=True).rename('month'))

        id name  value  year  month
quater
1        1   an    2.3  2012      1
1        1   an    2.3  2012      2
1        1   an    2.3  2012      3
2        2   yu    3.5  2012      4
2        2   yu    3.5  2012      5
2        2   yu    3.5  2012      6
4        3   ij    3.1  2013     10
4        3   ij    3.1  2013     11
4        3   ij    3.1  2013     12

Upvotes: 2

Zero
Zero

Reputation: 76917

You could use repeat

In [360]: dff = df.loc[df.index.repeat(3)]

In [362]: dff.assign(month = dff.quater.sub(1) * 3 + dff.groupby('quater').cumcount() + 1)
Out[362]:
   id name  value  year  quater  month
0   1   an    2.3  2012       1      1
0   1   an    2.3  2012       1      2
0   1   an    2.3  2012       1      3
1   2   yu    3.5  2012       2      4
1   2   yu    3.5  2012       2      5
1   2   yu    3.5  2012       2      6
2   3   ij    3.1  2013       4     10
2   3   ij    3.1  2013       4     11
2   3   ij    3.1  2013       4     12

Upvotes: 2

piRSquared
piRSquared

Reputation: 294278

Create a quarter to month dataframe to merge on

q2m = pd.DataFrame([
    [(m - 1) // 3 + 1, m] for m in range(1, 13)],
    columns=['quarter', 'month']
)

df.merge(q2m)

   id name  value  year  quarter  month
0   1   an    2.3  2012        1      1
1   1   an    2.3  2012        1      2
2   1   an    2.3  2012        1      3
3   2   yu    3.5  2012        2      4
4   2   yu    3.5  2012        2      5
5   2   yu    3.5  2012        2      6
6   3   ij    3.1  2013        4     10
7   3   ij    3.1  2013        4     11
8   3   ij    3.1  2013        4     12

Upvotes: 3

Related Questions