Reputation: 13
I am trying to scrape the PSN store, specifically the link below. I want to grab the names and prices of the games that are on sale.
https://store.playstation.com/#!/en-us/2-for-1/cid=STORE-MSF77008-PLAYCOLLMULTIBUY
import requests
from bs4 import BeautifulSoup
r = requests.get(url)
soup = BeautifulSoup(r.content, "html.parser")
The data I want is what you see when you right-click on the webpage and choose Inspect. For Firewatch, for example, it looks like this:
<h3 class="cellTitle">Firewatch</h3>
<li class="buyPrice ">$19.99</li>
Now when I print soup.prettify(), I get this:
html,body,div,span,applet,object,iframe,h1,h2,h3,h4,h5,h6,p,blockquote,pre,a,abbr,acronym,address,big,cite,code,del,dfn,em,img,ins,kbd,q,s,samp,small,strike,strong,sub,sup,tt,var,b,u,i,center,dl,dt,dd,ol,ul,li,fieldset,form,label,legend,table,caption,
without any of the actual data.
I must be doing something wrong with the functions, but the guides I am reading and other people's questions all seem to do exactly what I am doing.
Upvotes: 1
Views: 1042
Reputation: 2140
With the help of PhantomJS (http://phantomjs.org/download.html) and Selenium you can do this.
Steps:
1. In a terminal or cmd, run: pip install selenium
2. Download PhantomJS and unzip it, then put "phantomjs.exe" on your Python path, for example C:\Python27 on Windows.
Then use this code; it will give you the desired result:
from selenium import webdriver
import time
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

url = "https://store.playstation.com/#!/en-us/2-for-1/cid=STORE-MSF77008-PLAYCOLLMULTIBUY"
# Note: webdriver.PhantomJS was removed in Selenium 4, so this needs an older Selenium release
driver = webdriver.PhantomJS()
driver.get(url)
# Wait up to 10 seconds for the JavaScript-rendered game titles to appear
element = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.CSS_SELECTOR, ".cellTitle")))
gamenames = driver.find_elements(By.CLASS_NAME, 'cellTitle')
prices = driver.find_elements(By.CLASS_NAME, 'buyPrice')
links = driver.find_elements(By.CLASS_NAME, 'permalink')
time.sleep(2)
if len(gamenames) == len(prices):
    for i in range(len(prices)):
        print("The Name of Game is :" + gamenames[i].text + " The Price for Which is : " + prices[i].text + " The url for it is: " + links[i].get_attribute('href'))
else:
    print("Parsing failed because some data was not parsed properly. Try again.")
driver.quit()
It will print:
The Name of Game is :Yu-Gi-Oh! Legacy of the Duelist The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/yu-gi-oh-legacy-of-the-duelist/cid=UP0101-CUSA02718_00-YGOLEGACYOFDUELB
The Name of Game is :Firewatch The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/firewatch/cid=UP0146-CUSA04107_00-FIREWATCH0000000
The Name of Game is :The Escapists The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/the-escapists/cid=UP4064-CUSA01880_00-THEESCAPISTS0000
The Name of Game is :Oxenfree The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/oxenfree/cid=UP0962-CUSA04950_00-OXENBASEENUS0000
The Name of Game is :Duke Nukem 3D: 20th Anniversary World Tour The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/duke-nukem-3d-20th-anniversary-world-tour/cid=UP0292-CUSA04899_00-PAGODA0000000000
The Name of Game is :Primal Carnage: Extinction The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/primal-carnage-extinction/cid=UP0505-CUSA03371_00-PRIMALCARNAGE000
The Name of Game is :The Bunker The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/the-bunker/cid=UP4459-CUSA06057_00-THEBUNKERGAMEPS4
The Name of Game is :Shantae and the Pirate's Curse The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/shantae-and-the-pirate's-curse/cid=UP2053-CUSA01609_00-SHANTAECURSENA01
The Name of Game is :Pure Pool The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/pure-pool/cid=UP2070-CUSA00328_00-UPUREPOOL0000001
The Name of Game is :Banner Saga 2 The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/banner-saga-2/cid=UP0134-CUSA04444_00-THEBANNERSAGA2VE
The Name of Game is :Armello™ The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/armello/cid=UP1120-CUSA03300_00-00ARMELLOONESCEA
The Name of Game is :Gone Home: Console Edition The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/gone-home-console-edition/cid=UP1012-CUSA01228_00-GONEHOME00000000
The Name of Game is :Amplitude The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/amplitude/cid=UP8802-CUSA02480_00-HMXAMPLITUDE2015
The Name of Game is :Dangerous Golf™ The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/dangerous-golf/cid=UP1898-CUSA05385_00-TFEDANGEROUSGOLF
The Name of Game is :Pure Hold'em World Poker Championship The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/pure-hold'em-world-poker-championship/cid=UP2070-CUSA01104_00-UPUREPOKER000001
The Name of Game is :Hard Reset Redux The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/hard-reset-redux/cid=UP1050-CUSA04041_00-HARDRESET0000000
The Name of Game is :Lifeless Planet: Premier Edition The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/lifeless-planet-premier-edition/cid=UP0604-CUSA05475_00-LIFELESSPLANETPS
The Name of Game is :The Escapists: The Walking Dead The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/the-escapists-the-walking-dead/cid=UP4064-CUSA04182_00-THEESCAPISTSWD00
The Name of Game is :100ft Robot Golf The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/100ft-robot-golf/cid=UP0476-CUSA04678_00-100FTGAMEPS4SIEA
The Name of Game is :Kholat The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/kholat/cid=UP1561-CUSA04464_00-KHOLATGAME000000
The Name of Game is :Pure Chess® Complete Bundle The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/pure-chess-complete-bundle/cid=UP2070-CUSA00240_00-B000000000000337
The Name of Game is :Rogue Stormers The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/rogue-stormers/cid=UP4402-CUSA06052_00-ROGUESTORMERS000
The Name of Game is :SNOW Beta The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/snow-beta/cid=UP2862-CUSA06096_00-0000000000000001
The Name of Game is :Assault Suit Leynos The Price for Which is : $19.99The url for it is: https://store.playstation.com/#!/en-us/games/assault-suit-leynos/cid=UP4034-CUSA04727_00-ASLEYNOS00000000
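The loop in the code above compares only the name and price counts, yet it also indexes into links, so a shorter links list would raise an IndexError. A minimal sketch of one way to guard against that, assuming the three lists have already been reduced to plain strings (pair_listings is a hypothetical helper name, and the sample values are copied from the output above):

```python
def pair_listings(names, prices, links):
    """Return (name, price, link) rows, or raise if the lists do not line up."""
    if not (len(names) == len(prices) == len(links)):
        raise ValueError(
            "mismatched counts: %d names, %d prices, %d links"
            % (len(names), len(prices), len(links))
        )
    return list(zip(names, prices, links))


# Sample strings standing in for element.text and get_attribute('href')
rows = pair_listings(
    ["Firewatch", "Oxenfree"],
    ["$19.99", "$19.99"],
    [
        "https://store.playstation.com/#!/en-us/games/firewatch/cid=UP0146-CUSA04107_00-FIREWATCH0000000",
        "https://store.playstation.com/#!/en-us/games/oxenfree/cid=UP0962-CUSA04950_00-OXENBASEENUS0000",
    ],
)
for name, price, link in rows:
    print("%s | %s | %s" % (name, price, link))
```

With Selenium you would pass [e.text for e in gamenames] and so on; that wiring is left out here so the sketch runs without a browser.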
Hope this is what you were looking for.
Upvotes: 0
Reputation: 5942
I looked into this website a bit. If you open the link in a browser, you will first see a "loading..." text. When you make the request, you only get that initial piece of the page; the rest of the data is not in the response because it is loaded afterwards by JavaScript. You could use a Selenium-like solution for this website.
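One way to see why a plain request misses the listing: everything after the # in the URL is a fragment, which stays in the browser and is not sent to the server. A minimal sketch using only the Python 3 standard library (it makes no network request, it just splits the URL from the question; the variable names are my own):

```python
from urllib.parse import urlsplit

url = "https://store.playstation.com/#!/en-us/2-for-1/cid=STORE-MSF77008-PLAYCOLLMULTIBUY"
parts = urlsplit(url)

# Drop the fragment to get the address an HTTP client actually requests.
requested = parts._replace(fragment="").geturl()

print("requested from the server:", requested)
print("left for the page's JavaScript:", parts.fragment)
```

So the response is the store's generic start page, and the script running in a real browser reads the fragment to decide which sale to load. That is why a browser-driving tool such as Selenium sees the games and the plain request does not.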
Upvotes: 1