Reputation: 103
I tried to scrape the reviews for an app using the google-play-scraper package, following the code from its README on GitHub: https://github.com/JoMingyu/google-play-scraper. But it doesn't seem to scrape all reviews!
I'm not sure whether it's possible to scrape all reviews at once, since the lang and country arguments default to 'en' and 'us'. I was scraping several apps that are used exclusively in Germany, so I set both country and language to 'de'. I know some reviewers will have a foreign Play Store account, but for an app that only exists in Germany this share shouldn't be large. Yet for many apps I've tried, the gap between the number of reviews stated on the Google Play website and the number actually scraped is implausibly large.
Here's my code:
import numpy as np
import pandas as pd
from google_play_scraper import Sort, app, reviews_all

# Fetch every review the store exposes for this app (German locale, newest first)
app_reviews = reviews_all(
    'de.flaschenpost.app',
    sleep_milliseconds=0,
    lang='de',
    country='de',
    sort=Sort.NEWEST,
)
For this app, Google Play has 69,600 reviews, but only 7,608 are scraped. Other examples: de.hafas.android.db (184,247 reviews, 51,552 scraped), de.materna.bbk.mobile.app (24,401 reviews, 12,896 scraped).
Am I missing something? Thanks a lot!
EDIT: Thanks to Joaquin for pointing me in the right direction. Every scraped review includes a written comment, so the larger number probably counts all ratings, including those from users who only left 1-5 stars and didn't write anything!
Upvotes: 4
Views: 3461
Reputation: 1
Try checking how many comments you can load manually by scrolling down, using apps that don't have many reviews. You will see that it corresponds exactly to the number of comments the code scraped. So it's either a limitation imposed by Google Play or as Joaquin said. Either way, the code scrapes as much as you could load and copy manually.
Upvotes: 0
Reputation: 103
Thanks to Joaquin for pointing me in the right direction. Every scraped review includes a written comment, so the larger number probably counts all ratings, including those from users who only left 1-5 stars and didn't write anything!
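For anyone who wants to check this on their own data, here is a minimal sketch. The helper name summarize_reviews and the sample data are made up for illustration; the only thing it assumes about the scraper output is that each review is a dict with a 'content' field holding the comment text.

```python
def summarize_reviews(reviews, total_ratings):
    """Compare scraped reviews against the overall rating count shown in the store.

    reviews: list of dicts, each assumed to have a 'content' key (comment text).
    total_ratings: the larger count shown on the store page.
    """
    # Count reviews whose comment text is non-empty after stripping whitespace
    with_text = sum(1 for r in reviews if (r.get("content") or "").strip())
    return {
        "scraped": len(reviews),
        "with_text": with_text,
        # Ratings that have no corresponding scraped review
        "ratings_without_scraped_review": total_ratings - len(reviews),
        "share_scraped": len(reviews) / total_ratings if total_ratings else 0.0,
    }


# Hypothetical sample data standing in for the output of reviews_all(...)
sample = [
    {"content": "Schnelle Lieferung", "score": 5},
    {"content": "App stürzt ab", "score": 1},
]
summary = summarize_reviews(sample, total_ratings=10)
print(summary)
```

With real data you would pass app_reviews as the first argument and the store's count as the second. If scraped and with_text come out equal, every scraped entry has a written comment, which fits the explanation that the larger number also counts star-only ratings. I believe the dict returned by app(...) has a 'ratings' field you could use for total_ratings, but treat that as an assumption and check the README.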
Upvotes: 2