Reputation: 7265
I have hundreds of dataframes, let's say they are named df1, ..., df250. I need to build a list from one column of each dataframe. I usually do this manually, but today there is too much data and doing it by hand is too prone to mistakes.
Here's what I did
list1 = df1['customer_id'].tolist()
list2 = df2['customer_id'].tolist()
..
list250 = df250['customer_id'].tolist()
This is very manual. Is there an easier way to do it?
Upvotes: 2
Views: 63
Reputation: 1092
Using the exec function enables you to execute Python code stored in a string:
for i in range(1, 251):
    s = "list" + str(i) + " = df" + str(i) + "['customer_id'].tolist()"
    exec(s)
Upvotes: 1
Reputation: 470
I'd use the following code. In this case there's no need to manually create a list of DataFrames.
cust_lists = {'list{}'.format(i): globals()['df{}'.format(i)]['customer_id'].tolist()
for i in range(1, 251)}
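Now you can access your lists from the cust_lists dict by name, like this:
`cust_lists['list1']`
A bare `list1` is not defined by the code above; it would only exist after copying the dict entries into the module namespace, for example with `globals().update(cust_lists)`.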
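As a rough sketch of the same name-based lookup, the helper below takes any mapping of names to dataframes (such as `globals()`) and skips numbers that have no matching variable. The function name `collect_column`, its parameters, and the tiny `df1`/`df2` frames are hypothetical, made up for illustration only.

```python
import pandas as pd


def collect_column(namespace, column, count, prefix="df"):
    """Return {"list<i>": values} for every <prefix><i> found in namespace."""
    result = {}
    for i in range(1, count + 1):
        name = f"{prefix}{i}"
        if name in namespace:  # skip numbers with no matching dataframe
            result[f"list{i}"] = namespace[name][column].tolist()
    return result


# Hypothetical stand-ins for the real df1 ... df250.
df1 = pd.DataFrame({"customer_id": [1, 2]})
df2 = pd.DataFrame({"customer_id": [3]})

# At module level, globals() maps variable names to the objects they hold.
cust_lists = collect_column(globals(), "customer_id", 250)
print(sorted(cust_lists))
```

Passing the namespace in explicitly keeps the lookup in one place and makes it easy to swap `globals()` for an ordinary dict later.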
Upvotes: 1
Reputation: 164773
The easier way is to take a step back and make sure you put your dataframes in a collection such as a list or dict. You can then perform operations easily in a scalable way.
For example:
dfs = {1: df1, 2: df2, 3: df3, ... , 250: df250}
lists = {k: v['customer_id'].tolist() for k, v in dfs.items()}
You can then access the results as lists[1], lists[2], etc.
There are other benefits. For example, you no longer pollute the namespace, you save the effort of explicitly defining variable names, and you can easily store and transport related collections of objects.
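A minimal sketch of this approach, assuming the dataframes can be put into the dict at the point where they are created. The three small frames here are made-up stand-ins; in practice each one would come from wherever df1 ... df250 are loaded.

```python
import pandas as pd

# Hypothetical data: build the dataframes directly into a dict keyed by number,
# instead of assigning each one to its own variable.
dfs = {
    i: pd.DataFrame({"customer_id": [i * 10 + 1, i * 10 + 2], "amount": [1.0, 2.0]})
    for i in range(1, 4)
}

# One comprehension replaces the 250 hand-written assignments.
lists = {k: v["customer_id"].tolist() for k, v in dfs.items()}

print(lists[1])  # [11, 12]
```

Because the keys are plain integers, later steps can loop over `lists.items()` without building variable names from strings.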
Upvotes: 3