Eric

Reputation: 53

Scraping with BeautifulSoup in Python raises "'str' object has no attribute 'find_all'"

I am trying to scrape a page in Python with BeautifulSoup, but I get the error 'str' object has no attribute 'find_all'. The expected result is a list in which each entry is paired with a number.

Here is my code

import requests
from bs4 import BeautifulSoup

url = "https://ja.wikipedia.org/wiki/メインページ"

response= requests.get(url)
soup = BeautifulSoup(response.content, "html.parser")
today = soup.find("div", attrs={"id": "on_this_day"}).text

entries = today.find_all("li")
today_list = []
index = 1

for entry in entries:
    today_list.append([index, entry.get_text()])
    index += 1
print(today_list)

The error message

AttributeError                            Traceback (most recent call last)
<ipython-input-10-c70240e5052b> in <module>
     8 today = soup.find("div", attrs={"id": "on_this_day"}).text
     9 
---> 10 entries = today.find_all("li")
    11 today_list = []
    12 index = 1

AttributeError: 'str' object has no attribute 'find_all'

Could you please help?

Upvotes: 0

Views: 1603

Answers (2)

Joe Thor

Reputation: 1260

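Remove the .text from the assignment to the today variable. .text returns the element's text as a plain string, with the HTML stripped away.

Without .text, today stays a BeautifulSoup Tag, so its .find_all method can pull all the <li> tags.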

import requests
from bs4 import BeautifulSoup

url = "https://ja.wikipedia.org/wiki/メインページ"

response= requests.get(url)
soup = BeautifulSoup(response.content, "html.parser")
today = soup.find("div", attrs={"id": "on_this_day"})

entries = today.find_all("li")
today_list = []
index = 1

for entry in entries:
    today_list.append([index, entry.get_text()])
    index += 1
print(today_list)

This prints:

[[1, '天武天皇が日本で初めて肉食・狩猟を禁じる詔を発する(675年 - 天武天皇4年4月17日))'], [2, 'フランドル伯ボードゥアン9世がラテン帝国の初代皇帝に即位(1204年)'], [3, 'スコットランドの元女王メアリーがイングランドに亡命(1568年)'], [4, '松尾芭蕉、北へ向けて江戸を出立。『奥の細道』の旅へ(1689年 - 元禄2年3月27日)'], [5, 'ローマ教皇ベネディクトゥス15世がジャンヌ・ダルクを列聖(1920年)'], [6, '第1回アカデミー賞授賞式(1929年)'], [7, '東京、大阪、名古屋の3証券取引所が取引再開(1949年)'], [8, '韓国で5・16軍事クーデター(1961年)'], [9, '田部井淳子ら日本女子登山隊が女性初のエベレスト登頂に成功(1975年)'], [10, 'ヒマラヤ山脈のシッキム王国が、国民投票の結果に基づきインドに合併される(1975年)'], [11, '初の実用的なパーソナルコンピュータ「Apple II」が発売(1977年)'], [12, 'オウム真理教教祖・麻原彰晃を逮捕(1995年)']]
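
If it helps, here is a rough offline sketch of the same idea that can be run without hitting Wikipedia. The SAMPLE_HTML string and the numbered_entries helper are made up for illustration (the real page markup may differ), and the sketch swaps the manual index counter for enumerate and adds a guard for the case where the div is not found, since soup.find returns None then.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; the real markup may differ.
SAMPLE_HTML = """
<div id="on_this_day">
  <ul>
    <li>First event (1204)</li>
    <li>Second event (1568)</li>
  </ul>
</div>
"""


def numbered_entries(html, div_id="on_this_day"):
    """Return [[1, text], [2, text], ...] for each <li> inside the div."""
    soup = BeautifulSoup(html, "html.parser")
    # No .text here: keep the Tag so that find_all is still available.
    container = soup.find("div", attrs={"id": div_id})
    if container is None:
        # soup.find returns None when nothing matches.
        return []
    items = container.find_all("li")
    # enumerate(..., start=1) replaces the manual index counter.
    return [[i, li.get_text(strip=True)] for i, li in enumerate(items, start=1)]


today_list = numbered_entries(SAMPLE_HTML)
print(today_list)
# prints [[1, 'First event (1204)'], [2, 'Second event (1568)']]
```

To use it on the live page, pass response.content in place of SAMPLE_HTML.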

Upvotes: 0

Shubham Periwal

Reputation: 2248

The error message says it all

AttributeError: 'str' object has no attribute 'find_all'

So you're trying to access the find_all() attribute of a str object, which means something that should be a BeautifulSoup element has become a plain string.

Look at this line:

today = soup.find("div", attrs={"id": "on_this_day"}).text

The .text at the end turns the result into a string. You don't want a string here, so just remove it and there's your solution:

today = soup.find("div", attrs={"id": "on_this_day"})
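
If you want to see the difference for yourself, a quick check along these lines should do it. This is only a sketch on a tiny made-up HTML string (not the real Wikipedia page): it compares what soup.find gives you with and without .text.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Tiny made-up document, used only to compare the two types.
html = '<div id="on_this_day"><ul><li>a</li><li>b</li></ul></div>'
soup = BeautifulSoup(html, "html.parser")

tag = soup.find("div", attrs={"id": "on_this_day"})  # a bs4 Tag
text = tag.text                                       # a plain str

print(isinstance(tag, Tag))        # True: a Tag can be searched further
print(isinstance(text, str))       # True: the HTML structure is gone
print(hasattr(text, "find_all"))   # False: this is what raises AttributeError
print(len(tag.find_all("li")))     # 2
```

Printing type(today) right before the failing line is a quick way to spot this kind of problem in your own code.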

Upvotes: 1
