Reputation: 1
I want to turn a list of bigrams into a list of tokens using Python 3.6.
I have something like:
input_list = [('hi', 'my'), ('my', 'name'), ('name', 'is'), ('is', 'x')]
I want to turn this into:
output_list = ['hi', 'my', 'name', 'is', 'x']
Upvotes: 0
Views: 95
Reputation: 15872
If you do not want to create a separate list to store the flattened values, and would like to save space and avoid explicit loops, you may try this:
from itertools import chain
lst = [('hi', 'my'), ('my', 'name'), ('name', 'is'), ('is', 'x')]
flattened = chain(*lst)
elems = list(dict.fromkeys(flattened))
print(elems)
Here chain(*lst) unpacks the tuples and flattens them into an iterator object, as opposed to actually storing them in a list. You could convert those values to a set and back to a list, but that may scramble the ordering. Instead, the values are used as the keys of a dictionary. As a dictionary cannot have duplicate keys, it keeps only the unique elements, so listing the keys of that dict gives the unique elements of the flattened list. NOTE: Insertion order is guaranteed by the language only from Python 3.7; in CPython 3.6 it is an implementation detail. Also note that a token that genuinely occurs more than once in the original text is reduced to a single occurrence.
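To make the deduplication caveat concrete, here is a minimal sketch using a made-up input (the bigrams of "the cat saw the dog", which is not from the question). It uses chain.from_iterable, which flattens the same way as chain(*lst) without unpacking the list into arguments:

```python
from itertools import chain

# Hypothetical input: bigrams of "the cat saw the dog", where "the" repeats.
bigrams = [("the", "cat"), ("cat", "saw"), ("saw", "the"), ("the", "dog")]

# Flatten lazily into an iterator of single tokens.
flattened = chain.from_iterable(bigrams)

# dict.fromkeys keeps only the first occurrence of each token.
deduplicated = list(dict.fromkeys(flattened))
print(deduplicated)  # ['the', 'cat', 'saw', 'dog']
```

The second "the" is gone, so this approach only reproduces the original token list when every token is unique.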
Upvotes: 0
Reputation: 36450
If all input follows that structure, I would extract the first element of the first tuple, then the last element of every tuple, that is:
input_list = [("hi", "my"), ("my", "name"), ("name", "is"), ("is", "x")]
output_list = [input_list[0][0]]+[i[-1] for i in input_list]
print(output_list) # ['hi', 'my', 'name', 'is', 'x']
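As a hedged sketch, the same expression can be wrapped in a small function that also guards against an empty list, which would otherwise raise an IndexError on input_list[0]. The name bigrams_to_tokens is hypothetical, chosen here only for illustration:

```python
def bigrams_to_tokens(bigrams):
    """Rebuild the token list from consecutive bigrams (hypothetical helper)."""
    if not bigrams:
        # No bigrams means there are no tokens to recover.
        return []
    # First element of the first tuple, then the last element of every tuple.
    return [bigrams[0][0]] + [pair[-1] for pair in bigrams]


input_list = [("hi", "my"), ("my", "name"), ("name", "is"), ("is", "x")]
print(bigrams_to_tokens(input_list))  # ['hi', 'my', 'name', 'is', 'x']
print(bigrams_to_tokens([]))  # []
```

Because it never deduplicates, this version keeps tokens that occur more than once, provided the bigrams really are consecutive.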
I used the following Python features: [0][0] means the first element of the first element (if that is not clear, I suggest searching for nesting first), [-1] means the last element (the first element counting from the end), and + "glues" two lists together.
Upvotes: 0
Reputation: 386
You can start by using a list comprehension to flatten the list and then take a set of the result (note that a set is unordered, so the original token order is not preserved):
flat_list = [x for sublist in input_list for x in sublist]
output_list = set(flat_list)
output_list
{'hi', 'is', 'my', 'name', 'x'}
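If the original order matters, one possible variant (a sketch, not part of the answer above) tracks already-seen tokens in a set while appending to a list. It does not rely on dictionary ordering, so it should behave the same on Python 3.6:

```python
input_list = [("hi", "my"), ("my", "name"), ("name", "is"), ("is", "x")]

# Flatten the list of tuples into a single list of tokens.
flat_list = [x for sublist in input_list for x in sublist]

seen = set()       # tokens already emitted
output_list = []   # unique tokens in first-seen order
for token in flat_list:
    if token not in seen:
        seen.add(token)
        output_list.append(token)

print(output_list)  # ['hi', 'my', 'name', 'is', 'x']
```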
Upvotes: 1