Reputation: 11
I recently started learning Python and web scraping with bs4. I queried a website and received this output as dictionaries:
{'title': 'Finance and Automation', 'description': 'Finance and Automation '}
{'title': 'Business and News', 'description': 'Business and News <a href = "cnn.com"'}
{'title': 'Politics and Economy', 'description': 'Politics and Economy <a href = "cnn.com"'}
This is basically the code that initializes the dictionary:
myList = []
data = soup.find_all('div', class_='news-description')
for i in data:
    getTitle = title.a.text
    getDesc = desc.a.text
    final_data = {
        'title': getTitle,
        'description': getDesc
    }
    print(final_data)
    myList.append(final_data)
After printing, I get the output shown above. How can I replace every cnn.com in the 'description' value with something like google.com before appending the dictionary to the list?
UPDATE: I ran .replace(), but got this error:
TypeError: 'NoneType' object is not callable
I think it's because the first element doesn't have cnn.com. How do I handle such cases?
Upvotes: 1
Views: 42
Reputation: 6554
I think it's because the first element doesn't have cnn.com?
No, it is because the description is None rather than a string. Just check that the description is not None:
for i in data:
    getTitle = title.a.text
    getDesc = ''  # initialize empty string
    if desc.a.text:
        # or
        # if desc.a.text is not None:
        getDesc = desc.a.text
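The None check can also be folded into a small helper so the loop body stays short. This is only a sketch: `replace_in_description` is a hypothetical name, not something from bs4, and it assumes you pass it the text you already extracted (a str or None). Note that str.replace simply returns the string unchanged when the substring is missing, so an entry without cnn.com is not a problem by itself. One other thing worth checking, as a guess about the cause: calling .replace() on a bs4 tag object instead of on its .text can also produce "'NoneType' object is not callable", because bs4 looks up unknown attributes as child tags and returns None.

```python
def replace_in_description(text, old="cnn.com", new="google.com"):
    """Return text with old swapped for new; a None input becomes ''.

    Hypothetical helper for illustration; old/new defaults follow the question.
    """
    if text is None:
        return ""
    # str.replace leaves the string untouched when old does not occur
    return text.replace(old, new)


print(replace_in_description('Business and News <a href = "cnn.com"'))
print(replace_in_description("Finance and Automation "))
print(replace_in_description(None))
```

Inside the loop you would then write something like getDesc = replace_in_description(desc.a.text).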
Upvotes: 0
Reputation: 28630
Give this a try. Let me know if it works:
myList = []
data = soup.find_all('div', class_='news-description')
for i in data:
    getTitle = title.a.text
    if type(desc.a.text) == str:
        getDesc = desc.a.text.replace('cnn.com', 'google.com')
    else:
        getDesc = desc.a.text
    final_data = {
        'title': getTitle,
        'description': getDesc
    }
    print(final_data)
    myList.append(final_data)
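If you want to try the same idea without hitting the website, here is a rough standalone sketch. It uses the three dictionaries from the question's output as stand-in data, and `build_list` is a made-up name for illustration; in the real script the values would come from the scraped tags instead.

```python
# Stand-in records copied from the question's printed output (not live data)
scraped = [
    {'title': 'Finance and Automation', 'description': 'Finance and Automation '},
    {'title': 'Business and News', 'description': 'Business and News <a href = "cnn.com"'},
    {'title': 'Politics and Economy', 'description': 'Politics and Economy <a href = "cnn.com"'},
]


def build_list(records, old="cnn.com", new="google.com"):
    """Copy each record, replacing old with new in string descriptions."""
    result = []
    for record in records:
        desc = record.get("description")
        # Only strings have .replace; None (or anything else) passes through
        if isinstance(desc, str):
            desc = desc.replace(old, new)
        result.append({"title": record.get("title"), "description": desc})
    return result


my_list = build_list(scraped)
for item in my_list:
    print(item)
```

The isinstance check does the same job as the type(...) == str comparison above, and entries without cnn.com come through unchanged.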
Upvotes: 1