Repeat each row of a DataFrame N times to create a new DataFrame (N is defined by another column)

Question

I am a beginner when it comes to Pandas and I recently had a problem which I could not solve with MS Excel. I have a DataFrame df like:

    Jobtype  JobCount Degree
0   BI Analyst  3     Winf
1   BI Analyst  2     Wing
2   BI Analyst  5     Wiwi
3   Marketing   1     Winf
4   Marketing   2     Wing
5   Marketing   6     Wiwi

(Side note: these are German abbreviations for the degree)

What I want to do is create a new, empty DataFrame which has the columns "JobType" and "Degree" in it with the same values BUT instead of having a column "JobCount" I want the same row to be JobCount-times added to the new, empty DataFrame. For example, I want to have a new DataFrame looking like this:

    Jobtype     Studiengang
0   BI Analyst  Winf
1   BI Analyst  Winf
2   BI Analyst  Winf
3   BI Analyst  Wing
4   BI Analyst  Wing
...

So copy a row JobCount-times to the new DataFrame, without the JobCount column being in it (JobCount-times means the value for JobCount in that specific row). I actually solved this issue but it took me too long and I don't think that my solution is the right way, there has to be a more simple way to do so and therefore I ask for help since I couldn't find a proper solution.

My Code (which works but is way too complicated):

df = pd.read_csv("tete.csv", sep = ";")

cols = ["Jobtype", "Degree"]
new = pd.DataFrame(columns = cols)

df_row_count = range(0, len(df.index))
new_row_count = 0

for i in df_row_count:
    job_count = 0
    while job_count < df.iloc[i]["JobCount"]:
        new.loc[new_row_count] = df.loc[I]
        job_count += 1 
        new_row_count += 1

print(new)

cs95 · Accepted Answer

Let's try repeat and reindex:

(df.drop('JobCount', 1)
   .reindex(df.index.repeat(df['JobCount']))
   .reset_index(drop=True))

Or, if you want something a bit shorter, you can pop while repeating:

(df.reindex(df.index.repeat(df.pop('JobCount')))
   .reset_index(drop=True))

       Jobtype Degree
0   BI Analyst   Winf
1   BI Analyst   Winf
2   BI Analyst   Winf
3   BI Analyst   Wing
4   BI Analyst   Wing
5   BI Analyst   Wiwi
6   BI Analyst   Wiwi
7   BI Analyst   Wiwi
8   BI Analyst   Wiwi
9   BI Analyst   Wiwi
10   Marketing   Winf
11   Marketing   Wing
12   Marketing   Wing
13   Marketing   Wiwi
14   Marketing   Wiwi
15   Marketing   Wiwi
16   Marketing   Wiwi
17   Marketing   Wiwi
18   Marketing   Wiwi

Repeat each row of a DataFrame N times to create a new DataFrame (N is defined by another column)

Answers (2)

Related Questions