b0baboi

Reputation: 27

Parsing JavaScript using re.findall

I have several problems that I am trying to tackle.

First, I am trying to parse this JavaScript that I got from an HTML page.

$(document).ready(function() { $('#commodity-show-thumbnails').bxSlider({ mode: 'vertical', auto: false, controls: true, pager: false, minSlides: 4, maxSlides: 4, moveSlides: 1, slideWidth: 250 }); itemSelector('commodity-show-form', 'commodity-show-addcart-submit', [['color', 'Choose color'], ['size', 'Choose size']], { "39805": { "params": ["Smokey Blue/Mica Blue", "36"]}, "39806": { "params": ["Smokey Blue/Mica Blue", "36,5"]}, "39807": { "params": ["Smokey Blue/Mica Blue", "37,5"]}, "39808": { "params": ["Smokey Blue/Mica Blue", "38"]}, "39809": { "params": ["Smokey Blue/Mica Blue", "38,5"]}, "39810": { "params": ["Smokey Blue/Mica Blue", "39"]}, "39811": { "params": ["Smokey Blue/Mica Blue", "40"]}, "39812": { "params": ["Smokey Blue/Mica Blue", "40,5"]}, "39814": { "params": ["Smokey Blue/Mica Blue", "42"]} }, [39805,39806,39807,39808,39809,39810,39811,39812,39814], 'main-cart', 'commodity-show-image'); });

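import ast
import re

res = re.findall(r'{ "params": (.+?)}', text)  # text is where the JavaScript text is stored
final = [ast.literal_eval(i) for i in res]  # literal_eval parses the list literals without running arbitrary code
print(final)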

I got the following output:

[['Smokey Blue/Mica Blue', '36'], ['Smokey Blue/Mica Blue', '36,5'], ['Smokey Blue/Mica Blue', '37,5'], ['Smokey Blue/Mica Blue', '38'], ['Smokey Blue/Mica Blue', '38,5'], ['Smokey Blue/Mica Blue', '39'], ['Smokey Blue/Mica Blue', '40'], ['Smokey Blue/Mica Blue', '40,5'], ['Smokey Blue/Mica Blue', '42']]

But now I don't know where to go from here. I want to find the value 39805 from

{ "39805": { "params": ["Smokey Blue/Mica Blue", "36"]}. How would I parse it so that, if I look up the value associated with 36, it gives me 39805?

I am sorry, but I am really bad at parsing and pretty new to this.

Upvotes: 0

Views: 89

Answers (2)

saulspatz

Reputation: 5261

EDIT: I just realized that in some cases the size has two numbers, like "36,5". I assume this means 36 and a half. My original script didn't account for that, which is why it gave the wrong answer (which I carelessly didn't notice). Here's a revised script that seems to work:

import re
text='''$(document).ready(function() { $('#commodity-show-thumbnails').bxSlider({ mode: 'vertical', auto: false, controls: true, pager: false, minSlides: 4, maxSlides: 4, moveSlides: 1, slideWidth: 250 }); itemSelector('commodity-show-form', 'commodity-show-addcart-submit', [['color', 'Choose color'], ['size', 'Choose size']], { "39805": { "params": ["Smokey Blue/Mica Blue", "36"]}, "39806": { "params": ["Smokey Blue/Mica Blue", "36,5"]}, "39807": { "params": ["Smokey Blue/Mica Blue", "37,5"]}, "39808": { "params": ["Smokey Blue/Mica Blue", "38"]}, "39809": { "params": ["Smokey Blue/Mica Blue", "38,5"]}, "39810": { "params": ["Smokey Blue/Mica Blue", "39"]}, "39811": { "params": ["Smokey Blue/Mica Blue", "40"]}, "39812": { "params": ["Smokey Blue/Mica Blue", "40,5"]}, "39814": { "params": ["Smokey Blue/Mica Blue", "42"]} }, [39805,39806,39807,39808,39809,39810,39811,39812,39814], 'main-cart', 'commodity-show-image'); });'''
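# Group 1: the ID key; group 2: the size; group 3: the optional ",5" suffix
pattern = re.compile(r' "([0-9]+).*?params.*?([0-9]+(,5)?)')

# findall yields (id, size, suffix) tuples; build a size -> id mapping
s = {b: a for a, b, _ in pattern.findall(text)}

print(s['36'], s['36,5'])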

Now this prints 39805 39806, which looks right to me.

Here's all the data:

for a in sorted(s):
    print(a, s[a])
36 39805
36,5 39806
37,5 39807
38 39808
38,5 39809
39 39810
40 39811
40,5 39812
42 39814
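
The `(,5)?` group above only covers half sizes written as `,5`. A minimal sketch of a slightly more general variant, assuming every entry has the shape `"ID": { "params": ["color", "size"]}`, captures the whole quoted size string instead of just its digits. The name `size_to_id` and the shortened `text` sample are illustrative, not part of the original page.

```python
import re

# Shortened sample in the same shape as the question's JavaScript
# (assumption: every entry looks like "ID": { "params": ["color", "size"]}).
text = '''itemSelector('f', 's', [['color', 'Choose color'], ['size', 'Choose size']], { "39805": { "params": ["Smokey Blue/Mica Blue", "36"]}, "39806": { "params": ["Smokey Blue/Mica Blue", "36,5"]}, "39814": { "params": ["Smokey Blue/Mica Blue", "42"]} }, [39805,39806,39814], 'main-cart', 'commodity-show-image');'''

# Group 1: the numeric ID key.
# Group 2: the full quoted size string, so "36,5" needs no special case.
pattern = re.compile(
    r'"(\d+)":\s*\{\s*"params":\s*\[\s*"[^"]*",\s*"([^"]*)"\s*\]'
)

# Invert the (id, size) pairs into a size -> id lookup table.
size_to_id = {size: item_id for item_id, size in pattern.findall(text)}

print(size_to_id["36"], size_to_id["36,5"])
```

Anchoring on the `"params": [...]` structure rather than on the first run of digits should also keep the match from drifting if a color name ever contains a number.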

Upvotes: 1

Mohammad Yusuf

Reputation: 17064

You can get the ID associated with size 36 like this:

import re
import ast

a="""$(document).ready(function() { $('#commodity-show-thumbnails').bxSlider({ mode: 'vertical', auto: false, controls: true, pager: false, minSlides: 4, maxSlides: 4, moveSlides: 1, slideWidth: 250 }); itemSelector('commodity-show-form', 'commodity-show-addcart-submit', [['color', 'Choose color'], ['size', 'Choose size']], { "39805": { "params": ["Smokey Blue/Mica Blue", "36"]}, "39806": { "params": ["Smokey Blue/Mica Blue", "36,5"]}, "39807": { "params": ["Smokey Blue/Mica Blue", "37,5"]}, "39808": { "params": ["Smokey Blue/Mica Blue", "38"]}, "39809": { "params": ["Smokey Blue/Mica Blue", "38,5"]}, "39810": { "params": ["Smokey Blue/Mica Blue", "39"]}, "39811": { "params": ["Smokey Blue/Mica Blue", "40"]}, "39812": { "params": ["Smokey Blue/Mica Blue", "40,5"]}, "39814": { "params": ["Smokey Blue/Mica Blue", "42"]} }, [39805,39806,39807,39808,39809,39810,39811,39812,39814], 'main-cart', 'commodity-show-image'); });"""
# Grab the { "id": {...}, ... } object literal from the JavaScript
b = re.findall(r'.*?({ ".*?} }).*}', a)[0]

d1 = ast.literal_eval(b)
print(d1, '\n')

# Python 3: print() is a function and dict.items() replaces iteritems()
for item_id, entry in d1.items():
    if entry['params'][1] == '36':
        print(item_id)

Output:

{'39805': {'params': ['Smokey Blue/Mica Blue', '36']}, '39806': {'params': ['Smokey Blue/Mica Blue', '36,5']}, '39807': {'params': ['Smokey Blue/Mica Blue', '37,5']}, '39808': {'params': ['Smokey Blue/Mica Blue', '38']}, '39809': {'params': ['Smokey Blue/Mica Blue', '38,5']}, '39810': {'params': ['Smokey Blue/Mica Blue', '39']}, '39811': {'params': ['Smokey Blue/Mica Blue', '40']}, '39812': {'params': ['Smokey Blue/Mica Blue', '40,5']}, '39814': {'params': ['Smokey Blue/Mica Blue', '42']}} 

39805
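
Because the object literal in this particular snippet uses double-quoted keys and strings, it also happens to be valid JSON, so `json.loads` may be an alternative to evaluating it as a Python literal. The following is a sketch under that assumption; it also inverts the mapping so a size can be looked up directly instead of looping. The helper name `size_to_id_map` and the shortened `js` sample are hypothetical.

```python
import json
import re

# Shortened sample in the same shape as the question's JavaScript.
js = '''itemSelector('f', 's', [['color', 'Choose color'], ['size', 'Choose size']], { "39805": { "params": ["Smokey Blue/Mica Blue", "36"]}, "39806": { "params": ["Smokey Blue/Mica Blue", "36,5"]}, "39814": { "params": ["Smokey Blue/Mica Blue", "42"]} }, [39805,39806,39814], 'main-cart', 'commodity-show-image');'''


def size_to_id_map(source):
    """Return {size: id} parsed from the first '{ "..." } }' object in source."""
    match = re.search(r'\{ ".*?\} \}', source)
    if match is None:
        raise ValueError("no object literal found")
    # Assumption: the matched text is valid JSON (double quotes only).
    items = json.loads(match.group(0))
    # params is [color, size]; index 1 is the size string.
    return {entry["params"][1]: item_id for item_id, entry in items.items()}


lookup = size_to_id_map(js)
print(lookup["36"])      # ID for size 36
print(lookup.get("41"))  # None, because that size is not listed
```

If a page listed several colors with the same sizes, the sizes would collide as keys, so the `(color, size)` pair would be a safer key in that case.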

Upvotes: 1
