Python: how to get all the content of a dynamically loaded web page

Question

I use selenium.webdriver to log in to Facebook and fetch the HTML page of a public figure, such as https://www.facebook.com/DonaldTrump/?fref=ts, because I want to crawl the post content from this page.

I found that selenium.webdriver only returns the part of the web page that is in the current screen. For example, after logging in to Facebook and requesting https://www.facebook.com/DonaldTrump/?fref=ts, I get only the few posts visible in the current screen, although the page actually contains many more posts.

In a browser I have to roll the mouse wheel many times before the page reaches its bottom, but my program gets only the limited content in the current screen. Could you please tell me how to solve this, or suggest other methods or libraries besides selenium that can log in to Facebook and get all the content of the target page (not only the content in the current screen)?

The program that I wrote is:

import time

import requests

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

import glovar  # assumed: the asker's own module holding shared state (not shown in the question)

FACEBOOK_URL_PREFIX = "https://www.facebook.com/"

def web_public_figure(self, p_figure_name):
    # remove the spaces from p_figure_name, e.g. "Donald trump" -> "Donaldtrump"
    p_figure_name_str = p_figure_name.replace(" ", "")
    params = "/?fref=ts"

    p_f_web_url = FACEBOOK_URL_PREFIX + p_figure_name_str + params

    # open the login page
    login_url = "https://www.facebook.com/login.php?login_attempt=1&lwv=110"
    glovar.webdriver_browser = webdriver.Chrome()
    glovar.webdriver_browser.get(login_url)

    # fill in the user credentials (placeholders, supply your own) and submit
    user = glovar.webdriver_browser.find_element(By.CSS_SELECTOR, "#email")
    user.send_keys('YOUR_EMAIL')
    password = glovar.webdriver_browser.find_element(By.CSS_SELECTOR, "#pass")
    password.send_keys('YOUR_PASSWORD')
    login = glovar.webdriver_browser.find_element(By.CSS_SELECTOR, "#loginbutton")
    login.click()
    time.sleep(10)  # give the login time to complete

    # if the login failed, the browser is back on the login page
    if "login" in glovar.webdriver_browser.current_url:
        glovar.webdriver_browser.close()
        return None  # do not keep using a closed browser

    # load the public figure's page and return its HTML
    glovar.webdriver_browser.get(p_f_web_url)
    html_p_f_page = glovar.webdriver_browser.page_source

    return html_p_f_page

p_figure_name is "Donald trump", but html_p_f_page is only part of the whole page https://www.facebook.com/DonaldTrump/?fref=ts (only the part in the current screen).

There also seems to be a "see all" button on the page. Could you please tell me how to get all the content of such a page, possibly using a library other than selenium?


Answers (1)
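
A page that loads more posts as you scroll will usually only contain, in page_source, what has been loaded so far. One common approach, sketched here under assumptions, is to scroll to the bottom with JavaScript, wait, and repeat until the page height stops growing, then read page_source. The function names scroll_to_bottom and get_full_page_source, and the parameters pause and max_rounds, are made up for this illustration. The sketch only uses the Selenium driver methods get, execute_script and page_source. It assumes the page grows on scroll, which may not match how the Facebook page actually behaves, and it does not click any "see all" button.

```python
import time


def scroll_to_bottom(driver, pause=2.0, max_rounds=50):
    """Scroll down until the page height stops growing.

    `driver` is a Selenium WebDriver (or any object with execute_script).
    `pause` and `max_rounds` are illustrative parameters: the wait after
    each scroll in seconds, and a cap so an endless feed cannot loop forever.
    Returns the number of scrolls performed.
    """
    get_height = "return document.body.scrollHeight"
    last_height = driver.execute_script(get_height)
    rounds = 0
    while rounds < max_rounds:
        # jump to the current bottom of the page
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        rounds += 1
        time.sleep(pause)  # give the page time to load the next batch
        new_height = driver.execute_script(get_height)
        if new_height == last_height:
            break  # nothing new was loaded, assume the bottom was reached
        last_height = new_height
    return rounds


def get_full_page_source(driver, url, pause=2.0, max_rounds=50):
    """Open `url`, scroll as far as possible, and return the loaded HTML."""
    driver.get(url)
    scroll_to_bottom(driver, pause=pause, max_rounds=max_rounds)
    return driver.page_source
```

In the question's program this would replace the last two steps, for example html_p_f_page = get_full_page_source(glovar.webdriver_browser, p_f_web_url). A larger pause helps on slow connections, and max_rounds bounds how many batches of posts are fetched.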
