Niklas Rosencrantz

Reputation: 26652

What should be done to enable foreign chars?

I'm looking for a way to support non-ASCII characters in my report output. The output is HTML that I convert to PDF so that App Engine can send it as an email attachment. The output can't handle international characters such as åäö.

The code that makes the report is

class Report(webapp2.RequestHandler):

    def get(self):
        # Create a conversion request from HTML to PDF.
        users = User.query()
        today = date.today()
        startdate = date(today.year, today.month, 1) # first day of month   
        html = None     
        for user in users: 
            if user.activity() > 0:
                logging.info('found active user %s %s' % (user.firstname, user.lastname))
                html = '<html><body><table border="1">'
                level = user.level()
                distributor = user
                while distributor.has_downline():
                    downline = User.query(User.sponsor == distributor.key).order(User.lastname).fetch()
                    for person in downline:  # to this for whole downline
                        orders = model.Order.all().filter('distributor_id =' , person.key.id()).filter('created >' , startdate).filter('status =', 'PAID').fetch(999999)
                        silver = 0
                        name = person.firstname +' '+ person.lastname
                        for order in orders:
                            logging.info('found orders')
                            for idx,item in enumerate(order.items):
                                purchase = model.Item.get_by_id(long(item.id()))
                                amount = int(order.amounts[idx])
                                silver = silver + amount*purchase.silver/1000.000 
                            if len(name) > 13:
                                name = name[:13]  # truncate to the first 13 characters
                            html = html + '<tr><td>' + str(order.created.date().day)+'/'+ str(order.created.date().month )+'</td><td>' + filters.makeid(person.key.id()) +'</td><td>' + name + '</td><td>' + str(order.key().id()) + '</td><td>' + str(silver) 
                            dist_level = order.dist_level
                            bonus = 0   
                            if level == 5 and dist_level == 4:                          
                                bonus = 0.05
                            if level == 5 and dist_level == 3:
                                bonus = 0.1
                            if level == 5 and dist_level == 2:
                                bonus = 0.13
                            if level == 5 and dist_level == 1:
                                bonus = 0.35

                            if level == 4 and dist_level == 3:                          
                                bonus = 0.05
                            if level == 4 and dist_level == 2:
                                bonus = 0.08
                            if level == 4 and dist_level == 1:
                                bonus = 0.3

                            if level == 3 and dist_level == 2:                          
                                bonus = 0.03
                            if level == 3 and dist_level == 1:
                                bonus = 0.25

                            if level == 2 and dist_level == 1:                          
                                bonus = 0.2

                            html = html + '</td><td>' + str(bonus) + '</td><td>' + str(order.total)
                            bonusmoney = bonus * float(order.total)
                            html = html + '</td><td>' + str(bonusmoney) + '</td></tr>'

                        distributor = person

                html = html + '</table>'

            asset = conversion.Asset("text/html", html, "test.html")
            conversion_obj = conversion.Conversion(asset, "application/pdf")        
            rpc = conversion.create_rpc()
            conversion.make_convert_call(rpc, conversion_obj)

            result = rpc.get_result()
            if result.assets:
                for asset in result.assets:
                    logging.info('emailing report')# to %s' % user.email)
                    message = mail.EmailMessage(sender='[email protected]',
                                    subject='Report %s %s' % (user.firstname, user.lastname))
                    message.body = 'Here is the monthly report'
                    message.to = '[email protected]'
                    message.bcc = '[email protected]'
                    message.attachments = ['report.pdf', asset.data]
                    message.send()
                    logging.info('message sent')

What should be done to enable foreign chars in this case? Thanks

Upvotes: 0

Views: 149

Answers (5)

rbanffy

Reputation: 2521

You may want to check the htmlentitydefs module (html.entities in Python 3). Also, declare the encoding in your input forms and pages so that all data is encoded uniformly, or at least always know which encoding you are dealing with and act accordingly.
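One way to act on the entity idea is to sidestep the encoding question and emit only ASCII, with every other character written as a numeric character reference. The sketch below is written for Python 3 and uses the standard `html.escape` plus the `xmlcharrefreplace` error handler; `to_ascii_html` is a hypothetical helper name, and the sample name is made up. On the Python 2.7 runtime the question uses, the same `.encode('ascii', 'xmlcharrefreplace')` call should work on a `unicode` object, but that is an assumption I have not checked against the Conversion API.

```python
import html


def to_ascii_html(text):
    """Escape markup characters, then write non-ASCII characters as numeric references."""
    escaped = html.escape(text)  # & < > " ' become entities
    # Anything outside ASCII becomes &#NNN; so the result is plain ASCII.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


# Hypothetical table cell built the same way the report builds its rows.
cell = "<td>" + to_ascii_html("Åsa Öberg & Co") + "</td>"
print(cell)  # <td>&#197;sa &#214;berg &amp; Co</td>
```

Because the result contains only ASCII, it reads the same whether the consumer assumes UTF-8 or Latin-1.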

Upvotes: 0

Kjartan Sverrisson

Reputation: 123

I've had a similar experience with my code. My native language is Icelandic, so we tend to have a lot of non-English characters in our data. On that occasion I had done my due diligence on all fronts except making sure the .py file itself was saved as UTF-8 and not ANSI, which for some reason produced unpredictable results.

So if all else fails, check your document encoding and make sure it's set to utf-8.
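A minimal sketch of what a correctly declared source file looks like. The strings are made-up examples; the important parts are the coding line and the fact that the editor really saves the file as UTF-8. The `\u` escapes are shown as a fallback that does not depend on how the file was saved.

```python
# -*- coding: utf-8 -*-
# The line above must be the first or second line of the file (PEP 263).
# Python 2 rejects non-ASCII bytes in a source file without it;
# Python 3 assumes UTF-8 by default.

# Icelandic text typed directly into the source file.
GREETING = u"Góðan daginn"

# The same string spelled with escapes, independent of the file's encoding.
GREETING_ESCAPED = u"G\u00f3\u00f0an daginn"

# Equal only if the declared encoding matches what the editor actually saved.
print(GREETING == GREETING_ESCAPED)
```

If the comparison comes out false, the declaration and the bytes on disk disagree, which is the situation described above.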

Upvotes: 1

Deestan

Reputation: 17138

The two Holy Rules for Unicode programming:

  • Be certain of what encoding your input has, and decode it appropriately.
  • Be explicit about what encoding your output has.

Your code appears to rely on User.query decoding the input (otherwise your sorting won't work!), but it doesn't do anything about the second point. Do the following:

  • Make sure that the person.firstname fields are indeed Unicode strings. If you need to inspect the contents, look at the hexadecimal representation of each character; just printing the string out causes some automatic encoding and hides the source representation. If they aren't Unicode strings, fix User.query.
  • Explicitly encode your output as UTF-8, and explicitly mark the output as UTF-8 encoded (do as grodzik suggested - that should work).

Just adding an encoding=UTF8 mark to the output may give the correct end result, but that is purely by accident if you don't have control of the intermediate data. (You will also notice that your sorting will be off.)
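The two rules and the hex inspection can be sketched roughly as follows. All three function names are hypothetical helpers, and the sample bytes are made up; the code is written so that it runs on Python 3 (on Python 2, `bytes` is simply an alias for `str`, so the same checks should apply).

```python
def code_points(text):
    """Show each character as a hex code point, so no implicit encoding hides the data."""
    return " ".join("U+%04X" % ord(ch) for ch in text)


def decode_input(raw, encoding="utf-8"):
    """Rule 1: turn bytes into text at the boundary, using a known encoding."""
    return raw.decode(encoding) if isinstance(raw, bytes) else raw


def encode_output(text, encoding="utf-8"):
    """Rule 2: turn text back into bytes explicitly, right before it leaves the program."""
    return text.encode(encoding)


raw_name = b"\xc3\x85sa"       # hypothetical UTF-8 bytes for "Åsa" arriving from outside
name = decode_input(raw_name)
print(code_points(name))       # U+00C5 U+0073 U+0061
print(encode_output(name) == raw_name)
```

If `code_points` shows two code points such as U+00C3 U+0085 where a single U+00C5 was expected, the data was decoded with the wrong encoding somewhere upstream.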

Upvotes: 3

grodzik

Reputation: 274

You should add a <head> element to the html = '<html><body><table border="1">' string. It should contain this meta tag: <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

If the utf-8 charset doesn't help, one of the iso-8859-x charsets might.
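A rough sketch of what that could look like, with the declared charset and the actual bytes kept in agreement. `build_report_html` and its `rows` argument are hypothetical stand-ins for the loop in the question, and whether the Conversion API honours the meta tag is an assumption I can't confirm. The code targets Python 3; on Python 2 the literals would need to be `u'...'` strings.

```python
def build_report_html(rows):
    """Return the report as UTF-8 bytes, with the charset declared in <head>."""
    parts = [
        '<html><head>',
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">',
        '</head><body><table border="1">',
    ]
    for name, total in rows:
        parts.append('<tr><td>%s</td><td>%s</td></tr>' % (name, total))
    parts.append('</table></body></html>')
    # Encode once, at the very end, using the same charset the meta tag announces.
    return ''.join(parts).encode('utf-8')


# Hypothetical row: a name with non-ASCII characters and an order total.
report = build_report_html([("Åsa Öberg", 120.5)])
print(type(report).__name__, len(report))
```

The resulting bytes are what would be handed to `conversion.Asset("text/html", ...)` in place of the plain `html` string.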

Upvotes: 1

jcomeau_ictx

Reputation: 38422

It looks as though it's handling them, but the encoding is wrong. You might need to do something like name.decode('utf-8').encode('latin-1').

Of course, I'm only guessing at the encodings.
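To take some of the guesswork out, it helps to reproduce the garbling on purpose and compare it with what the PDF shows. This Python 3 sketch assumes the typical case of UTF-8 bytes being read as Latin-1; the actual pair of encodings in the report pipeline is not known from the question.

```python
original = "åäö"

# UTF-8 bytes read by something that assumes Latin-1: each character turns into two.
utf8_bytes = original.encode("utf-8")
garbled = utf8_bytes.decode("latin-1")
print(garbled)  # Ã¥Ã¤Ã¶

# Reversing the wrong step recovers the text, which confirms the diagnosis.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == original)  # True

# The transcoding suggested above, expressed on bytes: UTF-8 in, Latin-1 out.
latin1_bytes = utf8_bytes.decode("utf-8").encode("latin-1")
print(latin1_bytes)  # b'\xe5\xe4\xf6'
```

If the PDF shows pairs like "Ã¥" where "å" was expected, the consumer is treating UTF-8 as Latin-1, and either declaring UTF-8 or transcoding to Latin-1 should help.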

Upvotes: 1
