Reputation:
I have a string exactly like the one below. My goal is to split it into a dataframe, but I am having trouble getting it to work. I have tried searching on Stack Overflow but have got nowhere.
'Position Players Average Form\nGoalkeeper Manuel Neuer 4.17017132535\n Defender Diego Godin 4.14973163459\n Defender Giorgio Chiellini 4.10115207373\n Defender Thiago Silva 3.93318274318\n Defender Andrea Barzagli 3.85132973289\nMidfielder Arjen Robben 4.80556193806\nMidfielder Alexander Meier 4.51037598508\nMidfielder Franck Ribery 4.48063714064\nMidfielder David Silva 3.76028050109\n Forward Cristiano Ronaldo 7.87909462636\n Forward Zlatan Ibrahimovic 6.85401665065'
Is there a way to turn this into a dataframe, in a reproducible way so I could do it with other strings?
My goal dataframe would look as follows:
Position name Average
Goalkeeper Manuel 4.17017132535
Defender Diego 4.14973163459
Defender Giorgio 4.10115207373
Defender Thiago 3.93318274318
Defender Andrea 3.85132973289
Midfielder Arjen 4.80556193806
Midfielder Alexander 4.51037598508
Midfielder Franck 4.48063714064
Midfielder David 3.76028050109
Forward Cristiano 7.87909462636
Forward Zlatan 6.85401665065
I am new to pandas, so any help would be greatly appreciated.
Upvotes: 1
Views: 52
Reputation: 164683
This is one way.
import pandas as pd
mystr = 'Position Players Average Form\nGoalkeeper Manuel Neuer 4.17017132535\n Defender Diego Godin 4.14973163459\n Defender Giorgio Chiellini 4.10115207373\n Defender Thiago Silva 3.93318274318\n Defender Andrea Barzagli 3.85132973289\nMidfielder Arjen Robben 4.80556193806\nMidfielder Alexander Meier 4.51037598508\nMidfielder Franck Ribery 4.48063714064\nMidfielder David Silva 3.76028050109\n Forward Cristiano Ronaldo 7.87909462636\n Forward Zlatan Ibrahimovic 6.85401665065'
lst = mystr.split()
data = [lst[pos:pos+4] for pos in range(0, len(lst), 4)]
df = pd.DataFrame(data[1:], columns=data[0])
print(df)
# Position Players Average Form
# 0 Goalkeeper Manuel Neuer 4.17017132535
# 1 Defender Diego Godin 4.14973163459
# 2 Defender Giorgio Chiellini 4.10115207373
# 3 Defender Thiago Silva 3.93318274318
# 4 Defender Andrea Barzagli 3.85132973289
# 5 Midfielder Arjen Robben 4.80556193806
# 6 Midfielder Alexander Meier 4.51037598508
# 7 Midfielder Franck Ribery 4.48063714064
# 8 Midfielder David Silva 3.76028050109
# 9 Forward Cristiano Ronaldo 7.87909462636
# 10 Forward Zlatan Ibrahimovic 6.85401665065
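This method is not perfect: it assumes every row splits into exactly four whitespace-separated tokens, so the rows go out of alignment if a player's name has one word or more than two. The column labels are also misaligned, since Players holds the first name, Average the surname, and Form the number.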
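One way to cope with names of any length is to treat the first token of each line as the position and the last token as the number, with everything in between as the name. The following is a minimal sketch under that assumption; the helper name `parse_table` and the output column labels are my own choices for illustration, not part of the answer above.

```python
import pandas as pd

def parse_table(text):
    """Parse lines of the form '<position> <name ...> <number>' into a DataFrame."""
    rows = []
    # Skip the header line; its labels do not line up with the fields anyway.
    for line in text.strip().split("\n")[1:]:
        tokens = line.split()
        if len(tokens) < 3:
            continue  # ignore blank or malformed lines
        # First token is the position, last is the number, the rest is the name.
        position, *name_parts, value = tokens
        rows.append((position, " ".join(name_parts), float(value)))
    return pd.DataFrame(rows, columns=["Position", "Name", "Average"])

# Shortened sample of the string from the question.
mystr = ('Position Players Average Form\n'
         'Goalkeeper Manuel Neuer 4.17017132535\n'
         ' Defender Diego Godin 4.14973163459\n'
         ' Forward Cristiano Ronaldo 7.87909462636\n'
         ' Forward Zlatan Ibrahimovic 6.85401665065')

df = parse_table(mystr)
print(df)
```

If only the first name is wanted, as in the question, `df["Name"].str.split().str[0]` should give it.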
Upvotes: 1
Reputation: 1445
Here is one way to work around that, reading the string with read_csv:
import pandas as pd
from io import StringIO
data = StringIO('Position Players Average Form\nGoalkeeper Manuel Neuer 4.17017132535\n Defender Diego Godin 4.14973163459\n Defender Giorgio Chiellini 4.10115207373\n Defender Thiago Silva 3.93318274318\n Defender Andrea Barzagli 3.85132973289\nMidfielder Arjen Robben 4.80556193806\nMidfielder Alexander Meier 4.51037598508\nMidfielder Franck Ribery 4.48063714064\nMidfielder David Silva 3.76028050109\n Forward Cristiano Ronaldo 7.87909462636\n Forward Zlatan Ibrahimovic 6.85401665065')
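# Recent pandas versions reject sep="\n"; split on runs of whitespace instead.
# dtype=str keeps the numbers exactly as written in the string.
df = pd.read_csv(data, sep=r"\s+", dtype=str)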
print(df)
Output:
Position Players Average Form
0 Goalkeeper Manuel Neuer 4.17017132535
1 Defender Diego Godin 4.14973163459
2 Defender Giorgio Chiellini 4.10115207373
3 Defender Thiago Silva 3.93318274318
4 Defender Andrea Barzagli 3.85132973289
5 Midfielder Arjen Robben 4.80556193806
6 Midfielder Alexander Meier 4.51037598508
7 Midfielder Franck Ribery 4.48063714064
8 Midfielder David Silva 3.76028050109
9 Forward Cristiano Ronaldo 7.87909462636
10 Forward Zlatan Ibrahimovic 6.85401665065
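To get closer to the frame requested in the question, the original header can be skipped and replaced with labels that match the four fields. This is a minimal sketch, assuming every name is exactly two words; the label `surname` and the variable names `text` and `goal` are placeholders I introduced.

```python
import pandas as pd
from io import StringIO

# Shortened sample of the string from the question.
text = ('Position Players Average Form\n'
        'Goalkeeper Manuel Neuer 4.17017132535\n'
        ' Defender Diego Godin 4.14973163459\n'
        ' Forward Cristiano Ronaldo 7.87909462636')

# Skip the original header and supply one label per whitespace-separated field.
df = pd.read_csv(
    StringIO(text),
    sep=r"\s+",
    skiprows=1,
    header=None,
    names=["Position", "name", "surname", "Average"],
)

# Keep only the columns from the desired result; Average is parsed as float.
goal = df[["Position", "name", "Average"]]
print(goal)
print(goal.dtypes)
```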
Upvotes: 0