Reputation: 309
I have an encoding problem when I try to crawl YouTube (an Arabic channel):
#!/usr/bin/python
# -*- coding: utf8 -*-
from django.core.management.base import BaseCommand, CommandError
import requests, lxml, re
from lxml import html
class Command(BaseCommand):
    def handle(self, *args, **options):
        r = requests.get("https://www.youtube.com/user/aljazeerachannel/videos?view=0")
        root = lxml.html.fromstring(r.content)
        for data in root.xpath('.//*[@id="branded-page-body"]/div/div/div[1]/div/div[2]/ul/li[1]/span/span/a'):
            print data.text
The result is:
[root@vmi9105 buzzbal]# python manage.py youtube
اÙتخابات اÙÙجاÙس اÙبÙدÙØ© Ù٠سÙØ·ÙØ© عÙÙاÙ
Upvotes: 3
Views: 667
Reputation: 21
Try this; it solved my problem in Python:
f"{yourString}".encode('latin-1').decode("utf-8")
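To show why this works, here is a minimal Python 3 sketch. It assumes the garbling came from UTF-8 bytes being decoded as Latin-1, which is what output like the question's suggests. The helper name `fix_mojibake` and the sample word are made up for illustration; they are not from the question.

```python
def fix_mojibake(text):
    """Undo UTF-8 bytes that were wrongly decoded as Latin-1.

    Assumption: every character in `text` is in the Latin-1 range
    (U+0000 to U+00FF); otherwise encode() raises UnicodeEncodeError.
    """
    return text.encode("latin-1").decode("utf-8")


# Hypothetical sample: an Arabic word, garbled the same way on purpose.
original = "انتخابات"
garbled = original.encode("utf-8").decode("latin-1")

repaired = fix_mojibake(garbled)
print(repaired == original)
```

This only repairs text after the fact. If the string was decoded with a different codec, or some bytes were dropped along the way, the round trip may raise an error or return the wrong text, so decoding the response as UTF-8 in the first place is probably the more robust fix.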
Upvotes: 2