Reputation: 110502
I am receiving an XML feed which has values such as:
<Theme>Valentine's Day</Theme>
<Copyright>© Ventures. All Rights Reserved.</Copyright>
I need to parse the values and store them in a MySQL database. What would be the best way to cleanse the values so I can insert "Valentine's Day" and "<copyright symbol> Ventures. All Rights Reserved."? There are 20+ different markings like this.
Doing a straight INSERT, I get the following error:
Warning: Incorrect string value: '\xA9 1987...' for column 'title' at row 1
Upvotes: 0
Views: 199
Reputation: 376002
If you parse the XML with a real XML parser, you'll get Unicode strings as text. You can then encode them as UTF-8:
title = text.encode('utf8')
The resulting title can be written to your database, though many details are still unclear because we don't know how you're writing to it.
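As a rough sketch of the idea, here is how it might look with the standard library's xml.etree.ElementTree in Python 3. The sample feed, the wrapping item element, and the extract_fields helper are made up for illustration; only the Theme and Copyright element names come from the question. Note that in Python 3 encode() returns bytes, and most database drivers accept the Unicode string directly if the connection and column are set to a UTF-8 character set, so the explicit encode step may not be needed there.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; the <item> wrapper is an assumption,
# the <Theme> and <Copyright> elements are from the question.
FEED = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b"<item>"
    b"<Theme>Valentine's Day</Theme>"
    b"<Copyright>\xc2\xa9 Ventures. All Rights Reserved.</Copyright>"
    b"</item>"
)


def extract_fields(xml_bytes):
    """Parse the feed and return {tag: text} with text as Unicode strings."""
    root = ET.fromstring(xml_bytes)
    return {child.tag: child.text for child in root}


fields = extract_fields(FEED)

# The parser has already decoded the bytes, so this is a Unicode string.
copyright_text = fields["Copyright"]

# Encode explicitly only if the database layer expects UTF-8 bytes.
title = copyright_text.encode("utf8")
```

The parser decodes the feed using the encoding named in the XML declaration, so the copyright sign arrives as a single Unicode character rather than a raw byte such as \xA9.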
Upvotes: 2
Reputation: 110502
Specify the source encoding and then encode the string to UTF-8.
# -*- coding: utf-8 -*-
title = text.encode('utf8')
Upvotes: 0