Reputation: 570
I am a beginner learning PySpark, and I am getting the error mentioned in the title. I have read similar questions and tried what is suggested here, but it still doesn't help: https://stackoverflow.com/questions/20441035/unsupported-operand-types-for-int-and-str
Here is a snippet of my code:
age=lines.map(lambda x: x.split(',')[2])
friends=lines.map(lambda x: x.split(',')[3])
rdd=lines.map(lambda x: int(x.split(',')[2]) +","+ int(x.split(',')[3]))
totalsByAge = rdd.mapValues(lambda x: (x, 1)).reduceByKey(lambda x, y: (x[0] + y[0], x[1] + y[1]))
averagesByAge = totalsByAge.mapValues(lambda x: x[0] / x[1])
results = averagesByAge.collect()
for result in results:
    print(result)
I convert the fields to int inside map, but I still get this error:
rdd=lines.map(lambda x: int(x.split(',')[2]) +","+ int(x.split(',')[3]))
TypeError: unsupported operand type(s) for +: 'int' and 'str'
I also tried removing the "+", but I could not get the syntax right.
Upvotes: 2
Views: 4298
Reputation: 7028
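You are adding integers and a string: int(...) + "," + int(...) mixes int and str operands, which Python's + does not allow. Concatenating the two strings first and casting the result would not work either, because int("33,385") raises a ValueError. Since your next step calls mapValues and reduceByKey, the RDD needs to hold (key, value) pairs, so return a tuple of the two converted fields instead of joining them with a comma.
rdd=lines.map(lambda x: (int(x.split(',')[2]), int(x.split(',')[3])))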
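To check the parsing logic without starting Spark, here is a minimal plain-Python sketch. It reproduces the TypeError from the question and shows one way to build (age, friends) pairs of ints. The sample rows, the assumed column layout (id,name,age,friends), and the helper names parse_line and average_by_key are hypothetical and not taken from the question; average_by_key is only a stand-in for the mapValues/reduceByKey steps, not Spark itself.

```python
from collections import defaultdict

# Hypothetical sample rows; assumed layout is id,name,age,friends.
sample_lines = [
    "0,Will,33,385",
    "1,Jean,33,2",
    "2,Hugh,55,221",
]


def parse_line(line):
    """Return an (age, friends) tuple of ints, i.e. a key-value pair."""
    fields = line.split(",")
    return (int(fields[2]), int(fields[3]))


def average_by_key(pairs):
    """Plain-Python stand-in for mapValues + reduceByKey + mapValues."""
    totals = defaultdict(lambda: (0, 0))  # key -> (running sum, count)
    for key, value in pairs:
        running_sum, count = totals[key]
        totals[key] = (running_sum + value, count + 1)
    return {key: s / c for key, (s, c) in totals.items()}


# The expression from the question fails in plain Python too:
# int + str is not defined, regardless of Spark.
try:
    int("33") + "," + int("385")
    error_message = None
except TypeError as exc:
    error_message = str(exc)

averages = average_by_key(parse_line(line) for line in sample_lines)
print(error_message)
print(averages)
```

In the Spark job, the same helper could be passed to map directly, for example rdd = lines.map(parse_line), and the remaining mapValues and reduceByKey calls from the question would then receive the pairs they expect.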
Upvotes: 2