Reputation: 15
I am trying to scrape the data from the link below, a link,
and I am saving it to a CSV file.
I got all the movie names, but in the wrong format. This is what I am getting in the CSV:
T h e " " S h a w s h a n k " " R e d e m p t i o n
T h e " " G o d f a t h e r
T h e " " G o d f a t h e r : " " P a r t " " I I
T h e " " D a r k " " K n i g h t
1 2 " " A n g r y " " M e n
S c h i n d l e r ' s " " L i s t
It should be:
The Shawshank Redemption
The Godfather
The Godfather: Part II
The Dark Knight
I tried:
from bs4 import BeautifulSoup
import requests
import csv
url = 'https://www.imdb.com/chart/top'
res = requests.get(url)
soup = BeautifulSoup(res.text)
movie = soup.find_all(class_='titleColumn')
for names in movie:
    for name in names.find_all('a'):
        movies=list(name.text)
        # print(movies)
        # IN CSV
        with open('TopMovies.csv', 'a') as csvFile:
            writer = csv.writer(csvFile, delimiter = ' ')
            writer.writerow(movies)
        csvFile.close()
        print(movies)
print("Successfully inserted")
Please let me know what I need to change in my code.
Thanks
Upvotes: 0
Views: 56
Reputation: 195418
The problem is in the line movies=list(name.text): you are creating a list in which each item is a single character of the string name.text.
Instead of list(), you can use a list comprehension, movies = [name.text for name in names.find_all('a')]:
from bs4 import BeautifulSoup
import requests
import csv
url = 'https://www.imdb.com/chart/top'
res = requests.get(url)
soup = BeautifulSoup(res.text, 'html.parser')  # name the parser explicitly
movie = soup.find_all(class_='titleColumn')
for names in movie:
    # keep each title as one whole string instead of splitting it into characters
    movies = [name.text for name in names.find_all('a')]
    # append the title as one CSV row; the with block closes the file on exit
    with open('TopMovies.csv', 'a', newline='') as csvFile:
        writer = csv.writer(csvFile, delimiter=' ')
        writer.writerow(movies)
    print(movies)
print("Successfully inserted")
This will create TopMovies.csv correctly.
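To see the difference without hitting the site, here is a minimal offline sketch. It assumes only the standard library; row_as_written is a hypothetical helper introduced for illustration, and the title is hard-coded rather than scraped. It shows what csv.writer emits for list(title) versus [title] when the delimiter is a space.

```python
import csv
import io


def row_as_written(fields, delimiter=" "):
    """Hypothetical helper: return the text csv.writer emits for one row."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter=delimiter).writerow(fields)
    # Drop the line terminator the writer appends to each row.
    return buffer.getvalue().rstrip("\r\n")


title = "The Godfather"  # hard-coded stand-in for name.text

# list(title) splits the string into characters, so every character
# becomes its own CSV field, and each space character gets quoted.
print(row_as_written(list(title)))

# Wrapping the title in a list keeps it as a single field.
print(row_as_written([title]))
```

The first call prints T h e " " G o d f a t h e r, which is the format from the question. The second prints "The Godfather": the quotes appear in the raw file because the field contains the space delimiter, and a spreadsheet program strips them when it reads the file with the same delimiter.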
Upvotes: 2