Reputation: 554
I am trying to create a Pandas dataframe from a series of lists of unequal lengths. Ideally, what I'd like to do is have the values from the shorter lists repeat so that they match the longer lists that I'm trying to column bind together.
Here is an example of the input lists and the output I'm after:
name = ['acme corp']
id_num = ['123456']
year = ['2017']
vendors = ['toyota','honda']
paymets = ['100','5000']
name      | id_num | year | vendor | payment
acme corp | 123456 | 2017 | toyota | 100
acme corp | 123456 | 2017 | honda  | 5000
In case it matters, I am running this process in a for loop that is extracting data from 1.8 million xml files and then appending the data from each into a csv. Thanks for any pointers you can offer me!
Upvotes: 3
Views: 1677
Reputation: 32125
Pass the list of variables to the data parameter of pd.DataFrame, then apply a couple of transformations (transpose the result, then forward-fill the missing values):
pd.DataFrame(data=[name, id_num, year, vendors, paymets])
Out[99]:
           0      1
0  acme corp   None
1     123456   None
2       2017   None
3     toyota  honda
4        100   5000
pd.DataFrame(data=[name, id_num, year, vendors, paymets]).T.ffill()
Out[100]:
           0       1     2       3     4
0  acme corp  123456  2017  toyota   100
1  acme corp  123456  2017   honda  5000
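If you want the named columns from the question rather than 0..4, one option is to label the rows before transposing. This is only a sketch: the helper name lists_to_frame and the columns list are my own, with the labels taken from the desired output in the question.

```python
import pandas as pd

name = ['acme corp']
id_num = ['123456']
year = ['2017']
vendors = ['toyota', 'honda']
paymets = ['100', '5000']

# Assumed labels, copied from the desired output in the question.
columns = ['name', 'id_num', 'year', 'vendor', 'payment']


def lists_to_frame(lists, labels):
    """Hypothetical helper: one column per list, short lists repeated downward."""
    # Each list becomes a labelled row; shorter rows are padded with missing values.
    wide = pd.DataFrame(data=lists, index=labels)
    # Transpose so the labels become column names, then forward-fill the padding.
    return wide.T.ffill()


df = lists_to_frame([name, id_num, year, vendors, paymets], columns)
print(df)
```

Note that ffill() fills every missing cell, so this assumes the longer lists contain no genuinely missing values of their own. Inside your loop you could call the helper once per file and append with df.to_csv(path, mode='a', header=False, index=False), where path is whatever output file you are writing to.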
Upvotes: 3