Reputation: 10686
I have the code:
img = f.read.scan(/<img/)
img = img.length
links = f.read.scan(/<a/)
links = links.length
div = f.read.scan(/<div/)
div = div.length
The program opens a link, say http://stackoverflow.com. It then prints img, links, and div. For some reason, no matter what website I choose, it returns 0 for links and div, but returns the correct number for img. Why is this?
Upvotes: 1
Views: 108
Reputation: 1499
f.read reads the whole file on the first call, so the second and third calls return an empty string, and scanning an empty string yields zero matches. See http://www.ruby-doc.org/core-1.9.3/IO.html#method-i-read:
If length is omitted or is nil, it reads until EOF and the encoding conversion is applied. It returns a string even if EOF is met at beginning.
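The documented behaviour can be reproduced without a network connection. This is only a minimal sketch: it assumes a StringIO stands in for the opened page, and the HTML string and variable names are made up for illustration.

```ruby
require 'stringio'

# Hypothetical stand-in for the opened page (not the asker's real input).
f = StringIO.new('<div><img src="a.png"><a href="#">x</a></div>')

first  = f.read   # consumes everything up to EOF
second = f.read   # already at EOF, so this is an empty string

img_count  = first.scan(/<img/).length   # scans the real content
link_count = second.scan(/<a/).length    # scans "", so nothing can match

puts img_count
puts link_count
```

The first count reflects the page, while the second is always zero because there is nothing left to read.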
You could move the input pointer back to the beginning after the first read (for example with f.rewind), but that only works for seekable inputs such as files. It is simpler to read the whole data into a buffer once and run every scan on that. See @Hauleth's answer for an example.
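For completeness, repositioning the pointer might look like the sketch below. It assumes the input is seekable; a StringIO with made-up HTML is used here as a stand-in for a file.

```ruby
require 'stringio'

# Hypothetical seekable input standing in for a file.
f = StringIO.new('<div><a href="#">x</a><a href="#">y</a></div>')

div_count = f.read.scan(/<div/).length
f.rewind                                # move the read position back to the start
link_count = f.read.scan(/<a/).length   # reads the full content again

puts div_count
puts link_count
```

This re-reads the input for every count, which is why reading once into a variable is usually the better choice.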
Upvotes: 3
Reputation: 23586
Because reading the file also advances the file pointer, the later reads have nothing left to return. Write it this way (I also chained the method calls):
content = f.read
img = content.scan(/<img/).length
links = content.scan(/<a/).length
div = content.scan(/<div/).length
Upvotes: 4