knowlede0989

Reputation: 43

How to repeat 2 columns every nth row in pandas?

I have a df that looks like this.

id  rent  place
0   Yes   colorado
0   yes   Mexico
0   yes   Brazil
1   yes   colorado
1   yes   Mexico
1   yes   Brazil
2   yes   colorado
2   yes   Mexico
2   yes   Brazil
3   yes   colorado
3   yes   Mexico
3   yes   Brazil

I need the "id" column to keep increasing by 1 after every three rows, and the values in the "place" column to repeat every 3 rows. I have no idea how to do this.

Upvotes: 2

Views: 380

Answers (2)

MiH

Reputation: 352

You could build your DataFrame row by row, and append the relevant row(s) as you desire.

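import pandas as pd

ids = [0, 1, 2, 3]
rent = [123, 'yes', 'yes']
place = ['colorado', 'Mexico', 'Brazil']

# Build one single-row frame per (id, place) pair, then join them once.
# DataFrame.append was removed in pandas 2.0, so pd.concat is used instead.
frames = []
for i in ids:
    for r, p in zip(rent, place):
        frames.append(pd.DataFrame({'rent': r, 'place': p}, index=[i]))
df = pd.concat(frames)

# Move the repeated ids out of the index into a regular "id" column
df.reset_index(inplace=True)
df.rename(columns={'index': 'id'}, inplace=True)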

Output df is:

   id  rent place
0   0   123 colorado
1   0   yes Mexico
2   0   yes Brazil
3   1   123 colorado
4   1   yes Mexico
5   1   yes Brazil
6   2   123 colorado
7   2   yes Mexico
8   2   yes Brazil
9   3   123 colorado
10  3   yes Mexico
11  3   yes Brazil
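
The same row-by-row idea can also be sketched without creating a frame per row. This is only a rough alternative, assuming every row gets the same rent value ("yes", as in the question): itertools.product enumerates each (id, place) pair and the DataFrame is built once from the resulting list. The names ids, places and rows are illustrative.

```python
from itertools import product

import pandas as pd

ids = [0, 1, 2, 3]
places = ["colorado", "Mexico", "Brazil"]

# product() yields every (id, place) pair with the id varying slowest,
# so each id is followed by the full list of places.
rows = [{"id": i, "rent": "yes", "place": p} for i, p in product(ids, places)]
df = pd.DataFrame(rows)
```

Building the list first and calling pd.DataFrame once avoids copying the frame on every iteration, which tends to matter once the number of ids grows.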

Upvotes: 2

anon01

Reputation: 11171

You can generate a new DataFrame like so:

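from itertools import cycle

import pandas as pd

N = 200  # total number of rows
places = cycle(["colorado", "mexico", "brazil"])  # endless iterator over the three places

# j // 3 increases the id by 1 after every three rows
data = {"id": [j // 3 for j in range(N)], "rent": True, "place": [next(places) for _ in range(N)]}
df = pd.DataFrame(data)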

Note that I've replaced rent with a boolean, which is less error-prone than text. Output:

     id  rent     place
0     0  True  colorado
1     0  True    mexico
2     0  True    brazil
3     1  True  colorado
4     1  True    mexico
..   ..   ...       ...
195  65  True  colorado
196  65  True    mexico
197  65  True    brazil
198  66  True  colorado
199  66  True    mexico

Alternatively, you can concatenate dfs and then sort them:

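places = ["colorado", "mexico", "brazil"]

# One sub-frame per place, each covering ids 0..N-1 (3 * N rows in total)
sub_dfs = [pd.DataFrame({"id": range(N), "rent": True, "place": place}) for place in places]

# A stable sort keeps the places in list order within each id;
# ignore_index=True renumbers the rows 0..3*N-1
df = pd.concat(sub_dfs, axis=0).sort_values("id", kind="stable", ignore_index=True)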
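
A vectorized variant is also possible. The following is a minimal sketch, assuming NumPy is available and that n_ids (a name introduced here for illustration) is the number of distinct ids wanted: np.repeat stretches each id over three rows, while np.tile repeats the whole list of places once per id.

```python
import numpy as np
import pandas as pd

places = ["colorado", "mexico", "brazil"]
n_ids = 4  # assumed number of distinct ids; adjust as needed

df = pd.DataFrame({
    # each id repeated len(places) times: 0, 0, 0, 1, 1, 1, ...
    "id": np.repeat(np.arange(n_ids), len(places)),
    # scalar value, broadcast to every row
    "rent": True,
    # the full list of places repeated n_ids times
    "place": np.tile(places, n_ids),
})
```

This needs neither a loop nor a sort, because the rows are generated in their final order.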

Upvotes: 1
