Reputation: 31
I asked this question yesterday but was unclear about a couple things so I'm reposting it here. Basically I have a dataframe of 13 columns and over 500 rows and I'm trying to add a header every x number of rows.
I'm a beginner so I've tried .concat and .append but I'm not sure if I really am doing it right
I have the variable header = ['Rk', 'Player', 'Age',...]
In: print(final.head())
out:
index Player Age Tm Pos GP G A P +/- PPP TOI
0 0 Nikita Kucherov 25 TBL RW 82 41 87 128 24 41 19:58
1 4 Brad Marchand 30 BOS LW 79 36 64 100 15 33 19:37
2 5 Sidney Crosby 31 PIT C 79 35 65 100 18 20 21:00
3 6 Nathan MacKinnon 23 COL C 82 41 58 99 20 31 22:05
4 7 Johnny Gaudreau 25 CGY LW 82 36 63 99 18 29 20:04
I want to print the header every 48 rows, if I wanted to print it every 2 rows it would look like this:
In: print(final.head())
out:
index Player Age Tm Pos GP G A P +/- PPP TOI
0 0 Nikita Kucherov 25 TBL RW 82 41 87 128 24 41 19:58
1 4 Brad Marchand 30 BOS LW 79 36 64 100 15 33 19:37
Player Age Tm Pos GP G A P +/- PPP TOI
2 5 Sidney Crosby 31 PIT C 79 35 65 100 18 20 21:00
3 6 Nathan MacKinnon 23 COL C 82 41 58 99 20 31 22:05
Player Age Tm Pos GP G A P +/- PPP TOI
4 7 Johnny Gaudreau 25 CGY LW 82 36 63 99 18 29 20:04
Note I don't really care about the what the value is for the index column for the header row when I insert multiple times, I'm pretty lenient for that part.
Upvotes: 0
Views: 1096
Reputation: 862661
It is possible, but if need working later with data, not recommended, because if mixing numeric values mixed with strings some function should failed:
N = 2
#N = 48 with real data
#get index of added values, omit first value
idx = df.index[::N][1:]
#repeat columns to DataFrame
arr = np.broadcast_to(df.columns, (len(idx),len(df.columns)))
df1 = pd.DataFrame(arr, index=idx, columns=df.columns)
#append original and sorting by index
df = df1.append(df).sort_index().reset_index(drop=True)
print (df)
index Player Age Tm Pos GP G A P +/- PPP TOI
0 0 Nikita Kucherov 25 TBL RW 82 41 87 128 24 41 19:58
1 4 Brad Marchand 30 BOS LW 79 36 64 100 15 33 19:37
2 index Player Age Tm Pos GP G A P +/- PPP TOI
3 5 Sidney Crosby 31 PIT C 79 35 65 100 18 20 21:00
4 6 Nathan MacKinnon 23 COL C 82 41 58 99 20 31 22:05
5 index Player Age Tm Pos GP G A P +/- PPP TOI
6 7 Johnny Gaudreau 25 CGY LW 82 36 63 99 18 29 20:04
EDIT For write each splitted DataFrame to separate sheet in one excel file use:
N = 2
#N = 48 with real data
with pd.ExcelWriter('file.xlsx') as writer:
for i, df1 in enumerate(np.split(df, range(N, len(df), N))):
df1.to_excel(writer, sheet_name=f'Sheet{i}', index=False)
EDIT1: For write all DataFrame to same sheetname:
#https://stackoverflow.com/a/33004253 + added index=False to df.to_excel
def multiple_dfs(df_list, sheets, file_name, spaces):
writer = pd.ExcelWriter(file_name,engine='xlsxwriter')
row = 0
for dataframe in df_list:
dataframe.to_excel(writer,sheet_name=sheets,startrow=row ,startcol=0, index=False)
row = row + len(dataframe.index) + spaces + 1
writer.save()
N = 2
#N = 48 with real data
dfs = np.split(df, range(N, len(df), N))
multiple_dfs(dfs, 'Steetname1', 'file.xlsx', 1)
Upvotes: 1