Reputation: 291
I have a question string look like this
'{"type":"2","question_id":"\\u5c0d\\u65bc\\u7d93\\u71df\\u4e00\\u6bb5\\u611f\\u60c5\\uff0c\\u59b3\\u89ba\\u5f97\\u6700\\u91cd\\u8981\\u7684\\u95dc\\u9375\\u662f\\u4ec0\\u9ebc\\u5462\\uff1f","text":"\\u5fcd \\u8b93\\u5c0d\\u65b9"}'
I only want the text part, which is "\u5fcd \u8b93\u5c0d\u65b9",
but need to clean it to print out,
any suggestions?
Thank you
Upvotes: 1
Views: 1870
Reputation: 369274
The string looks like a json after unicode-escape decoding:
>>> s = '{"type":"2","question_id":"...","text":"\\u5fcd \\u8b93\\u5c0d\\u65b9"}'
>>> s.encode().decode('unicode-escape') # `encode` is not needed in python 2.x
'{"type":"2","question_id":"對於經營一段感情,妳覺得最重要的關鍵是什麼呢?","text":"忍 讓對方"}'
You can use json.loads
to deserialize the json:
>>> import json
>>> print(json.loads(s.encode().decode('unicode-escape'))['text'])
'忍 讓對方'
Upvotes: 2