Reputation: 91
I have this nested list X_train
X_train = [['sunny', 'hot', 'high', 'FALSE'],
['sunny', 'hot', 'high', 'TRUE'],
['overcast', 'hot', 'high', 'FALSE'],
['rainy', 'mild', 'high', 'FALSE'],
['rainy', 'cool', 'normal', 'FALSE'],
['rainy', 'cool', 'normal', 'TRUE'],
['overcast', 'cool', 'normal', 'TRUE'],
['sunny', 'mild', 'high', 'FALSE'],
['sunny', 'cool', 'normal', 'FALSE'],
['rainy', 'mild', 'normal', 'FALSE'],
['sunny', 'mild', 'normal', 'TRUE'],
['overcast', 'mild', 'high', 'TRUE'],
['overcast', 'hot', 'normal', 'FALSE'],
['rainy', 'mild', 'high', 'TRUE']]
I want to generate a list whose nth element is the set of unique values in the nth column of X_train. So the expected output should be:
[{'overcast', 'rainy', 'sunny'},
{'cool', 'hot', 'mild'},
{'high', 'normal'},
{'FALSE', 'TRUE'}]
My code is as follows:
questions=[]
f=set({w for row in X_train for w in row})
questions+=[f]
The output is a single set gathering the unique values from all columns together (shown below), which is not what I expected. How should I correct my code to get the expected output? (I was advised to use set, but I am not sure how to apply it correctly.)
[{'FALSE',
'TRUE',
'cool',
'high',
'hot',
'mild',
'normal',
'overcast',
'rainy',
'sunny'}]
Any ideas to help me out please? Thanks in advance
Upvotes: 0
Views: 461
Reputation: 1
If you don't want to use zip, you can use this method instead. It is long, but simple and very basic:
X_train = [['sunny', 'hot', 'high', 'FALSE'],
['sunny', 'hot', 'high', 'TRUE'],
['overcast', 'hot', 'high', 'FALSE'],
['rainy', 'mild', 'high', 'FALSE'],
['rainy', 'cool', 'normal', 'FALSE'],
['rainy', 'cool', 'normal', 'TRUE'],
['overcast', 'cool', 'normal', 'TRUE'],
['sunny', 'mild', 'high', 'FALSE'],
['sunny', 'cool', 'normal', 'FALSE'],
['rainy', 'mild', 'normal', 'FALSE'],
['sunny', 'mild', 'normal', 'TRUE'],
['overcast', 'mild', 'high', 'TRUE'],
['overcast', 'hot', 'normal', 'FALSE'],
['rainy', 'mild', 'high', 'TRUE']]
f = []
temp1 = set()
temp2 = set()
temp3 = set()
temp4 = set()
# Add each row's values to the set for the matching column
for i in X_train:
    temp1.add(i[0])
    temp2.add(i[1])
    temp3.add(i[2])
    temp4.add(i[3])
f.append(temp1)
f.append(temp2)
f.append(temp3)
f.append(temp4)
# Optional: drop the temporary names; the sets themselves stay in f
del temp1, temp2, temp3, temp4
print(f)
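The same idea can be written without hard-coding four temporary sets. The following is only a sketch: the function name unique_per_column and the small sample list are made up for illustration, and it assumes every row has the same number of entries as the first row (a longer row would raise an IndexError).

```python
def unique_per_column(rows):
    """Return one set of unique values per column (assumes equal-length rows)."""
    if not rows:
        return []
    # One empty set per column, sized from the first row
    column_sets = [set() for _ in rows[0]]
    for row in rows:
        for index, value in enumerate(row):
            column_sets[index].add(value)
    return column_sets


# Hypothetical small sample, shaped like X_train but shorter
sample = [['sunny', 'hot'], ['rainy', 'hot'], ['sunny', 'cool']]
result = unique_per_column(sample)
print(result)
```

Calling unique_per_column(X_train) should then give the four sets from the question, since the number of columns is taken from the data rather than written out by hand.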
Upvotes: 0
Reputation: 2602
A concise way of getting your expected output is with list(map(set, zip(*X_train))).
zip(*X_train) switches rows and columns, giving something roughly equivalent to:
[['sunny', 'sunny', 'overcast', 'rainy', 'rainy', 'rainy', 'overcast',
  'sunny', 'sunny', 'rainy', 'sunny', 'overcast', 'overcast', 'rainy'],
 ['hot', 'hot', 'hot', 'mild', 'cool', 'cool', 'cool',
  'mild', 'cool', 'mild', 'mild', 'mild', 'hot', 'mild'],
 ['high', 'high', 'high', 'high', 'normal', 'normal', 'normal',
  'high', 'normal', 'normal', 'normal', 'high', 'normal', 'high'],
 ['FALSE', 'TRUE', 'FALSE', 'FALSE', 'FALSE', 'TRUE', 'TRUE',
  'FALSE', 'FALSE', 'FALSE', 'TRUE', 'TRUE', 'FALSE', 'TRUE']]
Then each list in the list is mapped to a set, and the map object is converted to a list.
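To make the two steps easier to follow, here is a small sketch that runs them separately. The list X_small and the intermediate names are made up for illustration. Note that zip actually yields tuples rather than lists, which is why the description above says "roughly equivalent".

```python
# Hypothetical small sample, shaped like X_train but shorter
X_small = [['sunny', 'hot'], ['rainy', 'hot'], ['sunny', 'cool']]

# Step 1: * passes each row as a separate argument, so zip pairs up
# the first items of every row, then the second items, and so on.
columns = list(zip(*X_small))
print(columns)  # [('sunny', 'rainy', 'sunny'), ('hot', 'hot', 'cool')]

# Step 2: map applies set to each column, and list collects the results.
unique_values = list(map(set, columns))
print(unique_values)
```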
Upvotes: 0
Reputation: 92440
You can zip() the list to get the columns. Unpacking the list with * so that each row is passed as a separate argument is the trick here. Then just take sets of the columns:
X_train = [['sunny', 'hot', 'high', 'FALSE'],
['sunny', 'hot', 'high', 'TRUE'],
['overcast', 'hot', 'high', 'FALSE'],
['rainy', 'mild', 'high', 'FALSE'],
['rainy', 'cool', 'normal', 'FALSE'],
['rainy', 'cool', 'normal', 'TRUE'],
['overcast', 'cool', 'normal', 'TRUE'],
['sunny', 'mild', 'high', 'FALSE'],
['sunny', 'cool', 'normal', 'FALSE'],
['rainy', 'mild', 'normal', 'FALSE'],
['sunny', 'mild', 'normal', 'TRUE'],
['overcast', 'mild', 'high', 'TRUE'],
['overcast', 'hot', 'normal', 'FALSE'],
['rainy', 'mild', 'high', 'TRUE']]
values = [set(col) for col in zip(*X_train)]
Gives you values:
[{'overcast', 'rainy', 'sunny'},
{'cool', 'hot', 'mild'},
{'high', 'normal'},
{'FALSE', 'TRUE'}]
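One caveat: sets are unordered, so the order in which the items print inside each set may differ from what is shown. If a stable order matters, one option (a sketch, using a made-up shortened sample in place of X_train) is to sort each set into a list:

```python
# Hypothetical small sample, shaped like X_train but shorter
sample = [['sunny', 'hot'], ['rainy', 'hot'], ['sunny', 'cool']]

# Same zip(*...) idea, but each column's unique values become a sorted list
sorted_values = [sorted(set(col)) for col in zip(*sample)]
print(sorted_values)  # [['rainy', 'sunny'], ['cool', 'hot']]
```

This gives lists instead of sets, so it only fits if you don't need set operations on the result afterwards.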
Upvotes: 7