Reputation: 25
I have a large data frame (367 rows × 342 columns) where multiple columns have the same prefix in their name. I am trying to make our code easier to use.
Current code:
test = pd.melt(protstack, id_vars="Majority protein IDs",
value_vars=['Intensity 01_1',
'Intensity 01_2',
'Intensity 01_3',
'Intensity 03_1',
'Intensity 03_2',
'Intensity 03_3',
'Intensity 04_1',
'Intensity 04_2',
'Intensity 04_3',
'Intensity 05_1',
'Intensity 05_2',
'Intensity 05_3',
'Intensity 06_1',
'Intensity 06_2',
'Intensity 06_3'],
var_name="SampleMeas", value_name="SpecInt"
)
Here is what I am trying to use instead, but I get the error "TypeError: unhashable type: 'list'":
valvarlist = [col for col in protstack if 'Intensity' in col],
[col for col in protstack if 'iBAQ' in col],
[col for col in protstack if 'LFQ intensity' in col]
#print(valvarlist)
test = pd.melt(protstack, id_vars="Majority protein IDs",
value_vars = valvarlist,
var_name="SampleMeas", value_name="SpecInt"
)
I have tried wrapping valvarlist in [], but I get the same error. When I check type(valvarlist), I get a tuple, which should be usable with melt.
Upvotes: 1
Views: 236
Reputation: 862591
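The trailing commas make valvarlist a tuple of three lists, but melt needs one flat list of column names. Create that list by chaining the conditions with or: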
alvarlist = [col for col in protstack if
('Intensity' in col) or ('iBAQ' in col) or ('intensity' in col)]
Or use str.contains on the column names, with | as the regex OR between the tested values:
alvarlist = df.columns[df.columns.str.contains('Intensity|iBAQ|intensity')]
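To see where the error most likely comes from, here is a minimal sketch. The frame and its column names are made up for illustration; only "Majority protein IDs" is taken from the question. It shows that three comprehensions separated by commas give a tuple whose elements are lists, and that joining them with + is another way to get one flat list of labels.

```python
import pandas as pd

# Hypothetical stand-in for the real 342-column frame.
protstack = pd.DataFrame(
    1,
    index=[0, 1],
    columns=["Majority protein IDs", "Intensity 01_1", "iBAQ 01_1", "LFQ intensity 01_1"],
)

# Comprehensions separated by commas build a tuple of lists,
# so each element is a list rather than a column label.
nested = (
    [col for col in protstack if "Intensity" in col],
    [col for col in protstack if "iBAQ" in col],
    [col for col in protstack if "LFQ intensity" in col],
)
print(type(nested), [type(part) for part in nested])

# Concatenating the lists with + gives one flat list of column labels.
flat = nested[0] + nested[1] + nested[2]
print(flat)
```

The single comprehension with or, or str.contains, reaches the same flat result in one step.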
Sample:
df = pd.DataFrame(1, columns=['Intensity1','iBAQ1','intensity4','intensity','ss'],
index=[0,1])
print (df)
   Intensity1  iBAQ1  intensity4  intensity  ss
0           1      1           1          1   1
1           1      1           1          1   1
protstack = df.columns
alvarlist = [col for col in protstack if
('Intensity' in col) or ('iBAQ' in col) or ('intensity' in col)]
print (alvarlist)
['Intensity1', 'iBAQ1', 'intensity4', 'intensity']
alvarlist = df.columns[df.columns.str.contains('Intensity|iBAQ|intensity')]
print (alvarlist)
Index(['Intensity1', 'iBAQ1', 'intensity4', 'intensity'], dtype='object')
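As a rough end-to-end sketch, the flat selection can then be passed to melt. The small frame below is an assumption for illustration: the id column name and the var_name/value_name arguments come from the question, while the other column names and all values are hypothetical.

```python
import pandas as pd

# Small stand-in frame; values and measurement column names are made up.
df = pd.DataFrame(
    {
        "Majority protein IDs": ["P1", "P2"],
        "Intensity 01_1": [1.0, 2.0],
        "iBAQ 01_1": [3.0, 4.0],
        "LFQ intensity 01_1": [5.0, 6.0],
        "ss": [7.0, 8.0],
    }
)

# Flat Index of the matching column labels, using the regex OR.
value_cols = df.columns[df.columns.str.contains("Intensity|iBAQ|intensity")]

# list(...) hands melt a plain list of labels.
test = pd.melt(
    df,
    id_vars="Majority protein IDs",
    value_vars=list(value_cols),
    var_name="SampleMeas",
    value_name="SpecInt",
)
print(test)
```

Columns that match none of the patterns, such as ss here, are left out of the melted result.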
Upvotes: 2