Reputation: 37
I already have code which maps my input to this:
['vita', 'oscura', 'smarrita', 'dura', 'forte', 'paura', 'morte', 'trovai', 'scorte', 'v’intrai']
I want this:
[('vita', 'oscura', 1), ('oscura', 'smarrita', 1), ('smarrita', 'dura', 1), ('dura', 'forte', 1), ...]
I thought I could do this with a lambda function: for every line, I ask for the first row, first item, and then for the first row, second column. That fails with an index out of range error. Any pointers on how I could go about this?
This is my code so far:
def lower_clean_str(x):
    punc='!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
    lowercased_str = x.lower()
    for ch in punc:
        lowercased_str = lowercased_str.replace(ch, '')
    return lowercased_str
clean_dcr=dcr.map(lower_clean_str)
print(clean_dcr.take(10))
#we split on whitespace as in ex1, notice how this time we take [-1] to grab only the last word
clean_dcr=clean_dcr.map(lambda line: line.split()[-1])
print(clean_dcr.take(10))
#this gives an error
#clean_dcr=clean_dcr.map((lambda line:line[0][0],line[0][1])),1)
#print(clean_dcr.take(3))
Upvotes: 1
Views: 62
Reputation: 1247
For Python 3.10 and above, one can use itertools.pairwise.
A sample code snippet:
import itertools
input_list = ['vita', 'oscura', 'smarrita', 'dura', 'forte', 'paura', 'morte', 'trovai', 'scorte', 'v’intrai']
output = [element + (1, ) for element in itertools.pairwise(input_list)]
For Python versions below 3.10, one can use the reference implementation of pairwise, which is also mentioned in the link.
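For older interpreters, a minimal sketch of such a fallback could look like the following. It follows the well-known `tee`-based recipe for pairwise; the shortened `input_list` is only for illustration, and the code assumes the words are already available as a plain local Python list (for example the result of `take`), not as a distributed collection.

```python
from itertools import tee

def pairwise(iterable):
    # pairwise('ABCD') yields ('A', 'B'), ('B', 'C'), ('C', 'D')
    a, b = tee(iterable)
    next(b, None)  # advance the second iterator by one element
    return zip(a, b)

# Shortened sample input, for illustration only
input_list = ['vita', 'oscura', 'smarrita', 'dura', 'forte']

# Append the constant 1 to every consecutive pair
output = [pair + (1,) for pair in pairwise(input_list)]
print(output)
```

An input with fewer than two elements simply produces an empty list, so no index error can occur.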
Upvotes: 1