Reputation: 1667
I would like to subset my dataframe based on a couple of lists of variables, that is:
list1=[var1,var2,var3]
list2=[var4,var5,var6]
data_final = data[list1,list2]
which produced this error:
TypeError: unhashable type: 'list'
If I provide a single list, everything works fine:
data_final = data[list1]
Below is a minimal example:
import pandas as pd
dict1 = [{'var0': 0, 'var1': 1, 'var2': 2},
         {'var0': 0, 'var1': 2, 'var2': 4},
         {'var0': 1, 'var1': 5, 'var2': 8},
         {'var0': 1, 'var1': 15, 'var2': 12}]
df = pd.DataFrame(dict1, index=['s1', 's2', 's3', 's4'])
list1=['var0']
list2=['var1','var2']
These two commands work fine:
df[list1]
df[list2]
But this one produces the above-mentioned error:
df[list1,list2]
Upvotes: 3
Views: 2690
Reputation: 498
To load any number of lists into a dataframe as rows (as long as the lists all have the same length), you would do the following:
import pandas as pd
l1 = [1,2,3]
l2 = [10,20,30]
col_name = ['c1','c2','c3']
row_name = ['r1','r2']
pd.DataFrame([l1,l2],columns=col_name, index=row_name)
    c1  c2  c3
r1   1   2   3
r2  10  20  30
To load any number of lists into a dataframe as columns, you can zip the lists together like so:
import pandas as pd
l1 = [1,2,3]
l2 = [10,20,30]
col_name = ['c1','c2']
row_name = ['r1','r2','r3']
zipped_list = list(zip(l1,l2))
pd.DataFrame(zipped_list,columns=col_name,index=row_name)
    c1  c2
r1   1  10
r2   2  20
r3   3  30
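As an alternative to zipping, a dict that maps each column name to its list should give the same column-wise layout. This is only a minimal sketch; `df_from_dict` and `df_from_zip` are illustrative names, not part of the answer above.

```python
import pandas as pd

l1 = [1, 2, 3]
l2 = [10, 20, 30]
row_name = ['r1', 'r2', 'r3']

# Each dict key becomes a column name, each list becomes that column's values
df_from_dict = pd.DataFrame({'c1': l1, 'c2': l2}, index=row_name)

# The zip-based construction from the answer, for comparison
df_from_zip = pd.DataFrame(list(zip(l1, l2)), columns=['c1', 'c2'], index=row_name)

print(df_from_dict)
print(df_from_dict.equals(df_from_zip))
```

The dict form keeps each column name next to its data, which can be easier to read when there are many lists.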
Hope that helps, py-on!
Upvotes: 2
Reputation: 2684
Is this the output you're expecting?
df[list1 + list2]
Out[106]:
    var0  var1  var2
s1     0     1     2
s2     0     2     4
s3     1     5     8
s4     1    15    12
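Here is a self-contained sketch of why this works, using the data from the question. Python reads `df[list1, list2]` as `df[(list1, list2)]`, so pandas receives a single tuple key rather than a list of column labels, while `list1 + list2` produces one flat list. The names `records`, `combined`, and `subset` are illustrative, and the exact exception type raised by the failing call may differ between pandas versions.

```python
import pandas as pd

records = [
    {'var0': 0, 'var1': 1, 'var2': 2},
    {'var0': 0, 'var1': 2, 'var2': 4},
    {'var0': 1, 'var1': 5, 'var2': 8},
    {'var0': 1, 'var1': 15, 'var2': 12},
]
df = pd.DataFrame(records, index=['s1', 's2', 's3', 's4'])

list1 = ['var0']
list2 = ['var1', 'var2']

# The comma builds a tuple of two lists, which is not a valid column key
try:
    df[list1, list2]
except Exception as err:  # exception type is version dependent
    print(type(err).__name__)

# Concatenating the lists gives a single flat list of column labels
combined = list1 + list2
subset = df[combined]
print(subset)
```

If the lists are built elsewhere and there are more than two of them, `itertools.chain.from_iterable` is another way to flatten them before indexing.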
Upvotes: 2
Reputation: 205
You need to put all of your column names in one flat list, not pass two separate lists:
data_final = data[[var1, var2, var3, var4, var5, var6]]
From the docs:
You can pass a list of columns to [] to select columns in that order. If a column is not contained in the DataFrame, an exception will be raised. Multiple columns can also be set in this manner.
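A short sketch of the two behaviours the quoted passage describes: columns come back in the order the list gives, and an unknown label raises an exception. It reuses the question's data; the label `'not_a_column'` and the variable names are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'var0': [0, 0, 1, 1], 'var1': [1, 2, 5, 15], 'var2': [2, 4, 8, 12]},
    index=['s1', 's2', 's3', 's4'],
)

# Columns are returned in the order given in the list
reordered = df[['var2', 'var0']]
print(reordered)

# A label that is not a column raises KeyError
missing_raised = False
try:
    df[['var0', 'not_a_column']]
except KeyError as err:
    missing_raised = True
    print('KeyError:', err)
```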
Upvotes: 2