Thomas

Reputation: 1212

Can't get progress bar to work in python rich

I'm trying to add a progress bar to my code with Rich. However, the bar does not move while the code is running; it only jumps to 100% after everything has finished. Can anyone help? My code:

theme = Theme({'success': 'bold green',
              'error': 'bold red', 'enter': 'bold blue'})
console = Console(theme=(theme))
for i in track(range(1), description='Scraping'):
    global pfp
    global target_id
    chrome_options = Options()
    chrome_options.add_argument("--headless")
    driver = webdriver.Chrome(options=chrome_options)
    begining_of_url = "https://lookup.guru/"
    whole_url = begining_of_url + str(target_id)
    driver.get(whole_url)
    wait = WebDriverWait(driver, 10)
    wait.until(EC.visibility_of_element_located((By.XPATH, "//img")))
    # find_elements_by_tag_name was removed in Selenium 4; this form works in both
    images = driver.find_elements(By.TAG_NAME, 'img')
    for image in images:
        global pfp
        pfp = (image.get_attribute('src'))
        break
    if pfp == "a":
        console.print("User not found \n", style='error')
        userInput()
    img_data = requests.get(pfp).content
    with open('pfpimage.png', 'wb') as handler:
        handler.write(img_data)
    filePath = "pfpimage.png"
    searchUrl = 'https://yandex.com/images/search'
    files = {'upfile': ('blob', open(filePath, 'rb'), 'image/jpeg')}
    params = {'rpt': 'imageview', 'format': 'json',
              'request': '{"blocks":[{"block":"b-page_type_search-by-image__link"}]}'}
    response = requests.post(searchUrl, params=params, files=files)
    query_string = json.loads(response.content)[
                              'blocks'][0]['params']['url']
    img_search_url = searchUrl + '?' + query_string
    webbrowser.open(whole_url)
    webbrowser.open(img_search_url)
    console.print("Done!", style='success')

Edit: For clarity, I want the progress bar to update as the code works through each step. There is only one URL to scrape. For example, it would start at 0%, and after global pfp the bar would change to x%.

Thanks for any help :)

Upvotes: 0

Views: 11868

Answers (3)

Rusca8

Reputation: 612

For Rich to render live output in PyCharm's Run window, you need to enable "Emulate terminal in output console". Open the Run menu (at the top), choose Edit Configurations, then Modify options, and tick that option.


I had the same problem (the bar appeared only after the process was done), and this solved it.
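One quick way to check whether this is the cause: as far as I understand, Rich only animates a live display when it believes it is writing to a terminal, and an IDE output pane without terminal emulation usually does not report itself as one. The sketch below uses only the standard library; looks_like_terminal is a hypothetical helper name, not part of Rich.

```python
import sys


def looks_like_terminal(stream=None):
    """Return True if the stream reports itself as an interactive terminal."""
    stream = sys.stdout if stream is None else stream
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())


# Run this the same way you run your script (IDE Run window, terminal, ...).
print("stdout looks like a terminal:", looks_like_terminal())
```

If this prints False in the Run window but True in a regular terminal, the missing terminal emulation is the likely culprit. Rich's Console also accepts force_terminal=True, though whether the bar then renders cleanly depends on the output pane.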

Upvotes: 0

Thomas

Reputation: 1212

The problem was that with for i in track(range(1), description='Scraping'): the loop has only one iteration, so the bar can only jump to 100% once that iteration has finished. Increasing the range() value would make the bar update more often, but it would also run the whole loop body that many times. To fix this I used the Progress class from rich.progress instead.

After importing Progress and adapting the example from the Rich documentation, I got:

from rich.progress import Progress
import time

with Progress() as progress:

    task1 = progress.add_task("[red]Scraping", total=100)

    while not progress.finished:
        progress.update(task1, advance=0.5)
        time.sleep(0.5)

Essentially:

  • At task1 = progress.add_task("[red]Scraping", total=100) a bar is created with a maximum value of 100.
  • The code indented under while not progress.finished: will loop until the bar is at 100%.
  • At progress.update(task1, advance=0.5) the bar's completed amount is increased by 0.5 (the total stays at 100).
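The same add_task/update pattern also works without a while loop when the work is a fixed series of steps. The following is a minimal sketch under stated assumptions: run_with_progress, the step functions, and the weights are all hypothetical and only illustrate advancing the bar by a different amount after each step.

```python
import time


def run_with_progress(steps, progress, task_id):
    """Run (weight, func) steps in order, advancing the bar after each one."""
    for weight, func in steps:
        func()
        progress.update(task_id, advance=weight)


def main():
    # Rich is only needed for the display itself.
    from rich.progress import Progress

    # Hypothetical stand-ins for real work; the weights are rough guesses
    # at how long each step takes relative to the others.
    steps = [
        (10, lambda: time.sleep(0.2)),  # e.g. start the browser
        (60, lambda: time.sleep(1.2)),  # e.g. load the page
        (30, lambda: time.sleep(0.6)),  # e.g. download the image
    ]
    total = sum(weight for weight, _ in steps)

    with Progress() as progress:
        task_id = progress.add_task("[red]Scraping", total=total)
        run_with_progress(steps, progress, task_id)
```

Calling main() in a real terminal should show the bar moving to 10%, 70% and then 100%. Deriving the total from the weights avoids having to keep a hard-coded total in sync with the number of update calls.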

Therefore, for my specific example, my end result code was:

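import json
import webbrowser

import requests
from rich.console import Console
from rich.progress import Progress
from rich.theme import Theme
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# target_id, pfp and userInput() are defined elsewhere in the script.
theme = Theme({'success': 'bold green',
               'error': 'bold red', 'enter': 'bold blue'})
console = Console(theme=theme)
bartotal = 100  # the 25 updates of 4 below add up to this total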

with Progress() as progress:
    task1 = progress.add_task("[magenta bold]Scraping...", total=bartotal)
    while not progress.finished:
                console.print("\nDeclaring global variables", style='success')
                global pfp
                progress.update(task1, advance=4)
                global target_id
                progress.update(task1, advance=4)
                console.print("\nSetting up Chrome driver", style='success')
                chrome_options = Options()
                progress.update(task1, advance=4)
                chrome_options.add_argument("--headless")
                progress.update(task1, advance=4)
                driver = webdriver.Chrome(options=chrome_options)
                progress.update(task1, advance=4)
                console.print("\nCreating url for lookup.guru",
                              style='success')
                begining_of_url = "https://lookup.guru/"
                progress.update(task1, advance=4)
                whole_url = begining_of_url + str(target_id)
                progress.update(task1, advance=4)
                driver.get(whole_url)
                progress.update(task1, advance=4)
                console.print(
                    "\nWaiting up to 10 seconds for lookup.guru to load", style='success')
                wait = WebDriverWait(driver, 10)
                progress.update(task1, advance=4)
                wait.until(EC.visibility_of_element_located(
                    (By.XPATH, "//img")))
                progress.update(task1, advance=4)
                console.print("\nScraping images", style='success')
                # find_elements_by_tag_name was removed in Selenium 4; this form works in both
                images = driver.find_elements(By.TAG_NAME, 'img')
                progress.update(task1, advance=4)
                for image in images:
                    global pfp
                    pfp = (image.get_attribute('src'))
                    break
                progress.update(task1, advance=4)
                if pfp == "a":
                    console.print("User not found \n", style='error')
                    userInput()
                progress.update(task1, advance=4)
                console.print(
                    "\nDownloading image to current directory", style='success')
                img_data = requests.get(pfp).content
                progress.update(task1, advance=4)
                with open('pfpimage.png', 'wb') as handler:
                    handler.write(img_data)
                progress.update(task1, advance=4)
                filePath = "pfpimage.png"
                progress.update(task1, advance=4)
                console.print("\nUploading to yandex.com", style='success')
                searchUrl = 'https://yandex.com/images/search'
                progress.update(task1, advance=4)
                files = {'upfile': ('blob', open(
                    filePath, 'rb'), 'image/jpeg')}
                progress.update(task1, advance=4)
                params = {'rpt': 'imageview', 'format': 'json',
                          'request': '{"blocks":[{"block":"b-page_type_search-by-image__link"}]}'}
                progress.update(task1, advance=4)
                response = requests.post(searchUrl, params=params, files=files)
                progress.update(task1, advance=4)
                query_string = json.loads(response.content)[
                                          'blocks'][0]['params']['url']
                progress.update(task1, advance=4)
                img_search_url = searchUrl + '?' + query_string
                progress.update(task1, advance=4)
                console.print("\nOpening lookup.guru", style='success')
                webbrowser.open(whole_url)
                progress.update(task1, advance=4)
                console.print("\nOpening yandex images", style='success')
                webbrowser.open(img_search_url)
                progress.update(task1, advance=4)
                console.print("\nDone!", style='success')
                progress.update(task1, advance=4)

Upvotes: 0

Will McGugan

Reputation: 2468

In order to show a progress bar, Rich needs to know how many steps are involved and when you finish a step. The track function can get this information automatically from a sequence. You're using this in your example, but your sequence only has a single element, so you go from 0 to 100% in a single step.

If you want to track progress of something you need a sequence that defines the work to be done. For instance if you had a list of urls to scrape, you might do something like this:

from rich.progress import track
SCRAPE_URLS = ["https://example.org", "https://google.org", ...]
for url in track(SCRAPE_URLS):
    scrape(url)

The progress bar will advance for every URL.
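When there is only one URL, the sequence could instead be the steps of the job itself. This is a rough sketch of that idea, not something from the Rich documentation: run_pipeline and the three step functions are hypothetical placeholders, and track is passed in as a wrapper around the list of steps.

```python
import time


def run_pipeline(steps, wrap=iter):
    """Call each step in order and collect the return values.

    `wrap` decorates the sequence being iterated, e.g. with track().
    """
    return [step() for step in wrap(steps)]


def main():
    from rich.progress import track

    # Hypothetical steps; each sleep stands in for real work.
    def load_page():
        time.sleep(1.0)
        return "page loaded"

    def find_image():
        time.sleep(0.3)
        return "image found"

    def download_image():
        time.sleep(0.7)
        return "image saved"

    steps = [load_page, find_image, download_image]
    run_pipeline(steps, wrap=lambda seq: track(seq, description="Scraping"))
```

Calling main() advances the bar by one third after each step. Every step counts equally here, so a slow step makes the bar sit still for a while before it moves.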

Upvotes: 2
