Vicky

Reputation: 829

Decode a UTF-8 Python list into a list of strings

I have a list containing a single string, like this:

[u'ROWKEY\ufffdACCOUNTID\ufffdACCOUNTIDDSC']

How do I convert the above list into the list below in Python?

['ROWKEY','ACCOUNTID','ACCOUNTIDDSC']

Upvotes: 1

Views: 6874

Answers (4)

Kian

Reputation: 1350

You should encode your string, not decode it. The list you provided (an array of strings, as you put it) contains a single unicode string. Representing a unicode string as a string of bytes is known as encoding; use u'...'.encode(encoding). Split the string on the u'\ufffd' separator with split() to break it into smaller strings, then encode each piece.

lst = [u'ROWKEY\ufffdACCOUNTID\ufffdACCOUNTIDDSC']
new_list = [i.encode('utf8') for i in lst[0].split(u'\ufffd')]
print(new_list)

Output (on Python 2) would be:

['ROWKEY', 'ACCOUNTID', 'ACCOUNTIDDSC']
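
On Python 3 the encode step is probably unnecessary, because every str is already Unicode and encoding produces bytes objects instead. A minimal sketch, assuming Python 3 and the same input as above (the names parts and encoded are only illustrative):

```python
lst = ['ROWKEY\ufffdACCOUNTID\ufffdACCOUNTIDDSC']

# Splitting on U+FFFD alone already gives plain str values on Python 3.
parts = lst[0].split('\ufffd')
print(parts)

# Encoding each piece yields bytes objects rather than str.
encoded = [p.encode('utf8') for p in parts]
print(encoded)
```

So if plain strings are what you need on Python 3, the split alone should be enough.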

Upvotes: 3

Rakesh

Reputation: 82765

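Use a regex with re.split. Note that the pattern below splits on every character that is not an ASCII letter, not only on u'\ufffd'.

Example: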

import re

l = u'ROWKEY\ufffdACCOUNTID\ufffdACCOUNTIDDSC'
print(re.split(r"[^a-zA-Z]", l))

Output:

[u'ROWKEY', u'ACCOUNTID', u'ACCOUNTIDDSC']
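
If the field names might contain digits or underscores, a narrower pattern that matches only the separator may be safer. A rough sketch, assuming Python 3 and a made-up input string (s, broad and narrow are hypothetical names, and the sample data is not from the question):

```python
import re

# Hypothetical input with an underscore and a digit inside the fields.
s = 'ROW_KEY\ufffdACCOUNT1\ufffdACCOUNTIDDSC'

# Splits on every non-letter, so '_' and '1' also act as separators.
broad = re.split(r'[^a-zA-Z]', s)
print(broad)

# Splits only on the U+FFFD separator and keeps the fields intact.
narrow = re.split('\ufffd', s)
print(narrow)
```

The broad pattern breaks fields apart and leaves an empty string where two separators are adjacent, while the narrow one returns the three fields unchanged.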

Upvotes: 1

Antwane

Reputation: 22598

Use str.split()

>>> [u'ROWKEY\ufffdACCOUNTID\ufffdACCOUNTIDDSC'][0].split(u"\ufffd")
[u'ROWKEY', u'ACCOUNTID', u'ACCOUNTIDDSC']

Upvotes: 1

Taher A. Ghaleb

Reputation: 5240

Do it like this:

old_list = [u'ROWKEY\ufffdACCOUNTID\ufffdACCOUNTIDDSC']
new_list = old_list[0].split(u'\ufffd')
print(new_list)
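
If the list can hold more than one such string, the same split can be applied to every entry. A small sketch, assuming Python 3; split_fields is a hypothetical helper and the second list entry is invented for illustration:

```python
def split_fields(items, sep='\ufffd'):
    """Split every string in items on sep and return one flat list."""
    return [field for item in items for field in item.split(sep)]

# The second entry is made up to show the multi-item case.
old_list = ['ROWKEY\ufffdACCOUNTID\ufffdACCOUNTIDDSC', 'NAME\ufffdSTATUS']
new_list = split_fields(old_list)
print(new_list)
```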

Hope it helps.

Upvotes: 1
