Condo_programmer

Reputation: 236

Repeat rows based on field in Pandas

I am using pandas (the Python Data Analysis Library) and have a data table like the following:

+----------------+
| ID | Frequency |
+----------------+
| AA | 4         |
| BB | 2         |
| CC | 3         |
| DD | 1         |
+----------------+

I would like to generate a result table in which each original row is repeated as many times as its Frequency value indicates. In the new table, the Frequency field should count up from 1 to the original frequency value.

+----------------+
| ID | Frequency |
+----------------+
| AA | 1         |
| AA | 2         |
| AA | 3         |
| AA | 4         |
| BB | 1         |
| BB | 2         |
| CC | 1         |
| CC | 2         |
| CC | 3         |
| DD | 1         |
+----------------+

I have no idea how to start solving this, so any help would be appreciated.

Upvotes: 1

Views: 292

Answers (2)

Mike Chirico

Reputation: 3491

I think there is a built-in way to do this, but I can't find it in the documentation. The following seems to work for me:

>>> import pandas as pd
>>> d = pd.DataFrame([['AA',4],['BB',3],['CC',2],['DD',1]],columns=['id','frequency'])

>>> k=((' '+d.id)*d.frequency).apply(lambda x: pd.Series(x.split(' '))).stack()
>>> df = pd.DataFrame(k[(k.T != '')])

>>> df.reset_index(inplace=True)
>>> df.columns=['l0','frequency','id']

>>> df[['id','frequency']]
   id  frequency
0  AA          1
1  AA          2
2  AA          3
3  AA          4
4  BB          1
5  BB          2
6  BB          3
7  CC          1
8  CC          2
9  DD          1
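
For comparison, here is a minimal sketch of a more direct route that avoids the string round trip. It assumes the same sample data as above and relies on `Index.repeat` to duplicate the row labels and `groupby(level=0).cumcount()` to number the copies; the variable name `out` is only illustrative.

```python
import pandas as pd

d = pd.DataFrame([['AA', 4], ['BB', 3], ['CC', 2], ['DD', 1]],
                 columns=['id', 'frequency'])

# Repeat each row label as many times as its frequency value says,
# then select those labels to get the duplicated rows.
out = d.loc[d.index.repeat(d['frequency'])].copy()

# Number the copies of each original row 1..n. Grouping by the original
# index label (level=0) rather than by 'id' keeps this correct even if
# the same id appears on more than one input row.
out['frequency'] = out.groupby(level=0).cumcount() + 1

out = out.reset_index(drop=True)
print(out)
```

A row with a frequency of 0 simply disappears from the result with this approach, which may or may not be what you want.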

Upvotes: 1

seanmus

Reputation: 518

import pandas as pd

df = pd.DataFrame({"ID":["AA", "BB", "CC", "DD"], "Frequency": [4, 3, 2, 1]})
l = []
for idx, i in df.iterrows():
    x = i["Frequency"]
    while x > 0:
        d = {}
        d.update({"ID":i["ID"], "Frequency":i["Frequency"]-x+1 })
        l.append(d)
        x-=1

ndf = pd.DataFrame(l)
ndf = ndf[["ID", "Frequency"]]
print(ndf)

   ID  Frequency
0  AA          1
1  AA          2
2  AA          3
3  AA          4
4  BB          1
5  BB          2
6  BB          3
7  CC          1
8  CC          2
9  DD          1
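
The same idea can be sketched without the explicit `iterrows` loop, assuming a pandas version that provides `DataFrame.explode` (0.25 or later) and that every Frequency is at least 1. The helper name `counters` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"ID": ["AA", "BB", "CC", "DD"], "Frequency": [4, 3, 2, 1]})

# For each row, build the list [1, 2, ..., Frequency].
counters = pd.Series(
    [list(range(1, n + 1)) for n in df["Frequency"]],
    index=df.index,
)

ndf = (
    df.assign(Frequency=counters)   # replace each count with its list
      .explode("Frequency")         # one output row per list element
      .astype({"Frequency": int})   # explode leaves an object column
      .reset_index(drop=True)
)
print(ndf)
```

A Frequency of 0 would produce an empty list, which `explode` turns into NaN, so the integer cast would fail; filter such rows out first if they can occur.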

Upvotes: 1
