Reputation: 137
I want to convert a list of values into a pandas DataFrame, but the header values are inside the same list.
The list looks like this:
cols_head=['BATSMEN', '', 'R', 'B', '4s', '6s', 'SR', '', 'C Hemraj', 'c Mohammad Mithun b Mehidy Hasan Miraz', '9', '17', '2', '0', '52.94', '']
I scraped the values from the ESPN website; they form a scorecard. Now that the values are in a list, I want to convert them into a pandas DataFrame. When I convert the list into a DataFrame, I get this:
0
0 BATSMEN
1 Out
2 R
3 B
4 4s
5 6s
6 SR
7
8 C Hemraj
9 c Mohammad Mithun b Mehidy Hasan Miraz
10 9
11 17
12 2
13 0
14 52.94
The entries at index 0 to 7 are the column headers of the DataFrame.
This is the code I tried:
cols_head=[x.text.strip() for x in cell]
#print(cols_head)
List_values=cols_head[:-13]
df=pd.DataFrame(List_values)
I want the DataFrame output to look like this:
    BATSMEN                Out  R   B  4s  6s     SR
1  C Hemraj  C Mohammad Mithun  9  17   2   0  52.94
Upvotes: 2
Views: 103
Reputation: 164793
You can use a list comprehension. This approach also works if your list contains an arbitrary number of rows in the same format. Note that you actually have 8 columns; the last one is just labeled with an empty string.
import pandas as pd
data = ['BATSMEN', '', 'R', 'B', '4s', '6s', 'SR', '', 'C Hemraj', 'c Mohammad Mithun b Mehidy Hasan Miraz', '9', '17', '2', '0', '52.94', '']
n = 8
df = pd.DataFrame([data[n*i:n*(i+1)] for i in range(1, len(data) // n)],
                  columns=data[:n])
print(df)
# BATSMEN R B 4s 6s SR
# 0 C Hemraj c Mohammad Mithun b Mehidy Hasan Miraz 9 17 2 0 52.94
print(df.columns)
# Index(['BATSMEN', '', 'R', 'B', '4s', '6s', 'SR', ''], dtype='object')
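The frame above still has two columns labeled with an empty string, while the question asks for an 'Out' header and no trailing blank column. A minimal sketch of that cleanup follows; it assumes the second column is the dismissal text and that the last cell of every row is always blank, neither of which is confirmed beyond the sample data. The numeric conversion at the end is an optional extra, not part of the original answer.

```python
import pandas as pd

data = ['BATSMEN', '', 'R', 'B', '4s', '6s', 'SR', '',
        'C Hemraj', 'c Mohammad Mithun b Mehidy Hasan Miraz',
        '9', '17', '2', '0', '52.94', '']
n = 8  # cells per scorecard row, including the trailing blank

# Same chunking as above: the first n cells are headers, the rest are rows.
df = pd.DataFrame([data[n*i:n*(i+1)] for i in range(1, len(data) // n)],
                  columns=data[:n])

# Drop the trailing blank column by position, since '' is not a unique label.
df = df.iloc[:, :-1].copy()

# Assumption: the unlabeled second column holds the dismissal, so call it 'Out'.
cols = list(df.columns)
cols[1] = 'Out'
df.columns = cols

# Optional: the scraped cells are strings, so convert the stat columns to numbers.
num_cols = ['R', 'B', '4s', '6s', 'SR']
df[num_cols] = df[num_cols].apply(pd.to_numeric)

print(df)
```

Dropping by position rather than by name avoids the ambiguity of having two columns that are both called ''.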
Upvotes: 1
Reputation: 6543
This works for the data you have posted. It will need to be tweaked slightly if your list actually contains multiple rows of data.
import pandas as pd
cols_head=['BATSMEN', '', 'R', 'B', '4s', '6s', 'SR', '', 'C Hemraj', 'c Mohammad Mithun b Mehidy Hasan Miraz', '9', '17', '2', '0', '52.94', '']
headers = cols_head[:7]
data = cols_head[8:-1] # Ignores the two blanks at index 7 and 15
df = pd.DataFrame([data], columns=headers)
Output:
BATSMEN R B 4s 6s SR
0 C Hemraj c Mohammad Mithun b Mehidy Hasan Miraz 9 17 2 0 52.94
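If the list does contain several rows, the same slicing idea can be applied once per row. The sketch below is one possible tweak, assuming every row (header included) has exactly 8 cells and ends with a blank. The helper name `scorecard_to_frame` and the second batsman row are hypothetical, made up only to exercise the multi-row case.

```python
import pandas as pd

def scorecard_to_frame(cells, width=8):
    """Hypothetical helper: build a DataFrame from a flat list of scorecard cells.

    Assumes each row has `width` cells and that the last cell of each row
    is a blank that should be ignored.
    """
    if len(cells) % width != 0:
        raise ValueError("cell count is not a multiple of the row width")
    keep = width - 1  # cells to keep per row (everything except the trailing blank)
    headers = cells[:keep]
    # Start at `width` to skip the header row, then step one row at a time.
    rows = [cells[start:start + keep] for start in range(width, len(cells), width)]
    return pd.DataFrame(rows, columns=headers)

cols_head = ['BATSMEN', '', 'R', 'B', '4s', '6s', 'SR', '',
             'C Hemraj', 'c Mohammad Mithun b Mehidy Hasan Miraz',
             '9', '17', '2', '0', '52.94', '']

# Placeholder row, invented for illustration only (not real scorecard data).
extra_row = ['Player X', 'not out', '1', '2', '0', '0', '50.00', '']

df = scorecard_to_frame(cols_head + extra_row)
print(df)
```

With only the original 16 cells, the helper returns the same single-row frame as the code above.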
Upvotes: 0