KeepLearning

Reputation: 23

Automate google play search items in a list

I am working on a Python project where I need to find out which apps a company owns. For example, I have a list:

company_name = ['Airbnb', 'WeFi']

I would like to write a Python function or program that does the following:

1. Automatically search the Play Store for each item in the list.

2. Check whether the company name matches, even if it only matches the first part of the name, e.g. "Airbnb" will match "Airbnb,inc".

3. Click into the app's page and read its category.

4. If the company has more than one app, do the same for all of its apps.

5. Store each app's information as a tuple: (app name, category).

6. The desired end result is a list of tuples.

eg:

print(company_name[0])
print(type(company_name[0]))

outcome:
airbnb
tuple

print(company_name[0][0])

outcome:
[('airbnb','Travel')]

This combines several areas of knowledge, and I am new to Python, so please give me some direction on how to start writing the code.

I have learned that Selenium can automate the "load more" function, but I am not sure exactly which package I should use.

Upvotes: 1

Views: 1504

Answers (2)

Antony Hatchkins

Reputation: 33974

Here's another option for searching Google Play programmatically:
https://github.com/facundoolano/google-play-scraper/#list

var gplay = require('google-play-scraper');

gplay.list({
    category: gplay.category.GAME_ACTION,
    collection: gplay.collection.TOP_FREE,
    num: 2
  })
  .then(console.log, console.log);

(It's Node.js, not Python, though.)

Upvotes: 0

Peter234

Reputation: 1052

I've written a little demo that may help you to achieve your goal. I used requests and Beautiful Soup. It's not exactly what you wanted but it can be adapted easily.

import requests
import bs4

company_name = "airbnb"
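def get_company(company_name):
    """Return developer page URLs whose name contains company_name."""
    # Compare in lowercase so that "Airbnb" and "airbnb" both match
    company_name = company_name.lower()
    # Passing the query via params lets requests URL-encode it
    r = requests.get("https://play.google.com/store/search", params={"q": company_name})
    soup = bs4.BeautifulSoup(r.text, "html.parser")
    subtitles = soup.find_all("a", {"class": "subtitle"})
    dev_urls = []
    for title in subtitles:
        try:
            text = title.attrs["title"].lower()
        # Sometimes there is a subtitle without a title attribute on Google Play;
        # catch the error and skip that entry
        except KeyError:
            continue
        if company_name in text:
            url = "https://play.google.com" + title.attrs["href"]
            dev_urls.append(url)
    return dev_urls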

def get_company_apps_url(dev_url):
    """Return the URLs of all apps listed on a developer page."""
    r = requests.get(dev_url)
    soup = bs4.BeautifulSoup(r.text, "html.parser")
    titles = soup.find_all("a", {"class": "title"})
    return ["https://play.google.com" + title.attrs["href"] for title in titles]

def get_app_category(app_url):
    """Return (developer name, app name, category) for an app page."""
    r = requests.get(app_url)
    soup = bs4.BeautifulSoup(r.text, "html.parser")
    developer_name = soup.find("span", {"itemprop": "name"}).text
    app_name = soup.find("div", {"class": "id-app-title"}).text
    category = soup.find("span", {"itemprop": "genre"}).text
    return (developer_name, app_name, category)

dev_urls = get_company("airbnb")
apps_urls = get_company_apps_url(dev_urls[0])
get_app_category(apps_urls[0])

>>> get_company("airbnb")
['https://play.google.com/store/apps/developer?id=Airbnb,+Inc']
>>> get_company_apps_url("https://play.google.com/store/apps/developer?id=Airbnb,+Inc")
['https://play.google.com/store/apps/details?id=com.airbnb.android']
>>> get_app_category("https://play.google.com/store/apps/details?id=com.airbnb.android")
('Airbnb, Inc', 'Airbnb', 'Travel & Local')
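
The substring check in get_company is quite loose: searching for "air" would also match "Airbnb, Inc". If you want something closer to the rule in the question, where "Airbnb" should match "Airbnb,inc", you could compare the leading words of the developer name instead. This is only a sketch; matches_company is a hypothetical helper that I made up for illustration, and it only looks at ASCII letters and digits.

```python
import re

def matches_company(company_name, developer_name):
    """Return True if developer_name starts with the words of company_name.

    Hypothetical helper, not part of the demo above. The comparison is
    case-insensitive and treats anything other than ASCII letters and
    digits as a separator.
    """
    # Split both names into lowercase alphanumeric words
    company_words = re.findall(r"[a-z0-9]+", company_name.lower())
    developer_words = re.findall(r"[a-z0-9]+", developer_name.lower())
    if not company_words:
        return False
    # The developer name must begin with every word of the company name
    return developer_words[:len(company_words)] == company_words
```

Inside get_company you could then replace `if company_name in text:` with `if matches_company(company_name, text):`.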

The same script run with "google":

dev_urls = get_company("google")
apps_urls = get_company_apps_url(dev_urls[0])
for app in apps_urls:
    print(get_app_category(app))

('Google Inc.', 'Google Duo', 'Communication')
('Google Inc.', 'Google Translate', 'Tools')
('Google Inc.', 'Google Photos', 'Photography')
('Google Inc.', 'Google Earth', 'Travel & Local')
('Google Inc.', 'Google Play Games', 'Entertainment')
('Google Inc.', 'Google Calendar', 'Productivity')
('Google Inc.', 'YouTube', 'Media & Video')
('Google Inc.', 'Chrome Browser - Google', 'Communication')
('Google Inc.', 'Google Cast', 'Tools')
('Google Inc.', 'Google Sheets', 'Productivity')
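
To get closer to the list of (app name, category) tuples asked for in the question, the three functions could be chained for every company in the list. Here is a minimal sketch; collect_company_apps is a hypothetical name, and the three fetching functions are passed in as parameters (my own design choice) so the loop can be tried out without hitting the network.

```python
def collect_company_apps(company_names, find_developers, list_apps, read_app):
    """Map each company name to a list of (app_name, category) tuples.

    Hypothetical glue code. The three callables are assumed to behave like
    get_company, get_company_apps_url and get_app_category above.
    """
    results = {}
    for company in company_names:
        apps = []
        # The search helper is given a lowercase name
        for dev_url in find_developers(company.lower()):
            for app_url in list_apps(dev_url):
                # read_app returns (developer, app name, category);
                # only the last two are kept
                _developer, app_name, category = read_app(app_url)
                apps.append((app_name, category))
        results[company] = apps
    return results
```

It would be called as `collect_company_apps(['Airbnb', 'WeFi'], get_company, get_company_apps_url, get_app_category)`. I used a dict keyed by company name rather than a bare list, so you can still tell which tuples belong to which company.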

Upvotes: 3
