Trouble web scraping data from a tag's inline style attribute

Question

So I have a couple of spans with inline styles:





 //width=0px

I want to extract the "px" value and store it into an array. When we hit a span with width=0px, that signifies the end of that array. So the above will look like this:

array1 = [8, 16, 13, 20]

array2 = [5, 3, 90, 200]

We can use an arraylist of integer arrays to store the data.

What I have so far is very basic:

Elements spanWidths = doc.select("span");

So far this produces: "border:...;width:8px;..."

I believe regex is the way to solve this, but I'm not very familiar with it. Any help?

marcus erronius · Accepted Answer

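The regex would be \bwidth\s*:\s*(\d+)px. Run it against each span's style string, then take the value from the first capture group. That is, call .group(1) on the Matcher after find() returns true. In a Java string literal the backslashes have to be doubled: "\\bwidth\\s*:\\s*(\\d+)px".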
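
To make the capture-group step concrete, here is a minimal sketch of how the regex could be combined with the "0px ends the array" rule from the question. It assumes the style strings have already been pulled out of the spans (with jsoup that would be something like element.attr("style") on each element of spanWidths). The class name WidthExtractor, the method groupWidths, and the sample style strings are made up for illustration, and it uses a List<List<Integer>> rather than an ArrayList of int[] because the group sizes aren't known up front.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; not part of jsoup or the original question.
class WidthExtractor {
    // Regex from the answer; backslashes are doubled inside a Java string literal.
    private static final Pattern WIDTH = Pattern.compile("\\bwidth\\s*:\\s*(\\d+)px");

    /** Collects widths into groups, closing the current group at each width of 0. */
    static List<List<Integer>> groupWidths(List<String> styles) {
        List<List<Integer>> groups = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        for (String style : styles) {
            Matcher m = WIDTH.matcher(style);
            if (!m.find()) {
                continue; // this style has no "width: Npx" declaration
            }
            int width = Integer.parseInt(m.group(1)); // first capture group = the digits
            if (width == 0) {
                // A 0px span marks the end of the current group.
                if (!current.isEmpty()) {
                    groups.add(current);
                    current = new ArrayList<>();
                }
            } else {
                current.add(width);
            }
        }
        if (!current.isEmpty()) {
            groups.add(current); // keep a trailing group that has no 0px terminator
        }
        return groups;
    }

    public static void main(String[] args) {
        // Made-up style strings standing in for the spans in the question.
        List<String> styles = List.of(
            "border:1px solid;width:8px;height:10px", "width:16px", "width:13px",
            "width:20px", "width:0px",
            "width:5px", "width:3px", "width:90px", "width:200px", "width:0px");
        System.out.println(groupWidths(styles)); // prints [[8, 16, 13, 20], [5, 3, 90, 200]]
    }
}
```

One caveat: because - is not a word character, \bwidth also matches inside properties such as max-width or border-width. If the spans use those, the pattern would need to be tightened.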
