Reputation: 107
Normally, when you want to turn a set of data into a data frame, you make a list for each column, create a dictionary from those lists, then create a data frame from the dictionary.
The data frame I want to create has 75 columns, all with the same number of rows. Defining lists one by one isn't going to work. Instead, I decided to make a single list and iteratively put a chunk of it into each column of the data frame. Here is an example where I turn a list into a data frame:
lst = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
# Example list
df =
   a  b  c  d  e
0  0  2  4  6  8
1  1  3  5  7  9
# Result I want from the example list
Here is my test code:
import pandas as pd
import numpy as np
dict = {'a':[], 'b':[], 'c':[], 'd':[], 'e':[]}
df = pd.DataFrame(dict)
# Here is my test data frame, it contains 5 columns and no rows.
lst = np.arange(10).tolist()
# This is my test list, it looks like this lst = [0, 1, …, 9]
for i in range(len(lst)):
    df.iloc[:, i] = df.iloc[:, i]\
        .append(pd.Series(lst[2 * i:2 * i + 2]))
# This code is supposed to put two entries per column for the whole data frame.
# For the first column, i = 0, so [2 * (0):2 * (0) + 2] = [0:2]
# df.iloc[:, 0] = lst[0:2], so df.iloc[:, 0] = [0, 1]
# Second column i = 1, so [2 * (1):2 * (1) + 2] = [2:4]
# df.iloc[:, 1] = lst[2:4], so df.iloc[:, 1] = [2, 3]
# This is how the code was supposed to allocate lst to df.
# However it outputs an error.
When I run this code I get this error:
ValueError: cannot reindex from a duplicate axis
When I add ignore_index = True, such that I have
for i in range(len(lst)):
    df.iloc[:, i] = df.iloc[:, i]\
        .append(pd.Series(lst[2 * i:2 * i + 2]), ignore_index = True)
I get this error:
IndexError: single positional indexer is out-of-bounds
After running the code, I check the contents of df. The output is the same whether I ignore the index or not.
In: df
Out:
   a    b    c    d    e
0  0  NaN  NaN  NaN  NaN
1  1  NaN  NaN  NaN  NaN
It seems that the first iteration of the loop runs fine, but the error occurs when it tries to fill the second column.
Does anybody know how to get this to work? Thank you.
Upvotes: 0
Views: 83
Reputation: 153500
IIUC:
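import numpy as np
import pandas as pd
lst = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
alst = np.array(lst)
# order='F' fills the 2-row array column by column, so consecutive values share a column
df = pd.DataFrame(alst.reshape(2, -1, order='F'), columns=list('abcde'))
print(df)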
Output:
   a  b  c  d  e
0  0  2  4  6  8
1  1  3  5  7  9
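For the 75-column case, the same reshape can be wrapped in a small function. This is only a sketch: `list_to_frame` and the `c0`, `c1`, … column names are made up for illustration, and it assumes the list length is an exact multiple of the number of columns.

```python
import numpy as np
import pandas as pd


def list_to_frame(values, columns):
    """Hypothetical helper: fill the columns one after another from a flat list."""
    arr = np.asarray(values)
    n_cols = len(columns)
    # Assumption: every column gets the same number of rows.
    if arr.size % n_cols != 0:
        raise ValueError("list length must be a multiple of the number of columns")
    # order='F' fills the array column by column, so each consecutive
    # chunk of the list ends up in its own column.
    return pd.DataFrame(arr.reshape(-1, n_cols, order='F'), columns=columns)


# The example from the question: 10 values spread over 5 columns.
df = list_to_frame(list(range(10)), list('abcde'))
print(df)

# Same idea with 75 columns and 2 rows each (column names are placeholders).
wide = list_to_frame(list(range(150)), [f'c{i}' for i in range(75)])
print(wide.shape)
```

Passing `-1` as the row count lets NumPy work out the number of rows from the list length, so only the column names need to be supplied.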
Upvotes: 1