Reputation: 4618
I have a mixed dataset in which some values are strings and some are bytes, as follows.
mydata={'data mining': [b'data', b'text mining', b"artificial intelligence"], 'neural networks': ['cnn', 'rnn', "artificial intelligence"]}
My code is as follows:
for key, value in mydata.items():
    for item in value:
        print(type(item))
Since some of the values are bytes, I wanted to convert them to strings, so I changed the code above as follows.
for key, value in mydata.items():
    for item in value:
        print(type(item.decode("utf-8")))
However, I then get an error: AttributeError: 'str' object has no attribute 'decode'
I also tried:
for key, value in mydata.items():
    for item in value:
        if type(item) == 'str':
            print(type(item))
But it did not work for me.
Is there a way to resolve this issue?
Upvotes: 2
Views: 908
Reputation: 15130
The following implements the suggestions from the comments: check whether each list element is a bytes object and, if so, decode it. Since bytes objects are immutable, I replace the list element with its decoded version.
mydata = {'data mining': [b'data', b'text mining', b'artificial intelligence'], 'neural networks': ['cnn', 'rnn', "artificial intelligence"]}
for items in mydata.values():
    for i, item in enumerate(items):
        if isinstance(item, bytes):
            items[i] = item.decode()
print(mydata)
# OUTPUT
# {'data mining': ['data', 'text mining', 'artificial intelligence'], 'neural networks': ['cnn', 'rnn', 'artificial intelligence']}
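If you would rather leave the original lists untouched, a variant along these lines might work. This is only a sketch: to_text and decoded are hypothetical names I made up for illustration, and I am assuming the bytes are UTF-8 encoded, as in the question. It also shows why the type(item) == 'str' check in the question never matches: type() returns the class str, not the string 'str'.

```python
def to_text(item, encoding="utf-8"):
    """Return item as str, decoding it first if it is a bytes object.

    Hypothetical helper; assumes the bytes were encoded with `encoding`.
    """
    if isinstance(item, bytes):
        return item.decode(encoding)
    return item


mydata = {
    'data mining': [b'data', b'text mining', b'artificial intelligence'],
    'neural networks': ['cnn', 'rnn', 'artificial intelligence'],
}

# Build a new dict of new lists; mydata itself is not modified.
decoded = {key: [to_text(item) for item in items] for key, items in mydata.items()}
print(decoded)

# type(item) is the class str, which is never equal to the string 'str'.
print(type('cnn') == 'str')
print(type('cnn') == str)
```

The last two lines print False and True respectively. Comparing against the class works, but isinstance(item, bytes) is the more usual way to write this kind of check.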
Upvotes: 2