Extracting particular element in HTML file and inserting into CSV

Question

I have an HTML table stored in a file. I want to take the value of each td in the table that has a particular describedby attribute, like so:

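<td describedby="grid_1-1">Value for CSV</td>
<td describedby="grid_1-1">Value for CSV2</td>
<td describedby="grid_1-1">Value for CSV3</td>
<td describedby="grid_1-2">Value for CSV4</td>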

and I want to put them into a CSV file, with each value on its own line.

So for the file above, the CSV produced would be:

Value for CSV
Value for CSV2
Value for CSV3

Value for CSV4 would be ignored because its describedby is "grid_1-2", not "grid_1-1".

I have tried the code below, but no matter what I try there seems to be (a) a blank line between each printed line and (b) a comma separating each character.

So the output looks more like:

V,a,l,u,e,f,o,r,C,S,V,

V,a,l,u,e,f,o,r,C,S,V,2

What silly thing have I done now?

Thanks :)

import csv
import os
from bs4 import BeautifulSoup

with open("C:\Users\ADMIN\Desktop\test.html", 'r') as orig_f:
    soup = BeautifulSoup(orig_f.read())
    results = soup.findAll("td", {"describedby":"grid_1-1"})
    with open('C:\Users\ADMIN\Desktop\Deploy.csv', 'wb') as fp:
        a = csv.writer(fp, delimiter=',')
        for result in results :
            a.writerows(result)

Padraic Cunningham · Accepted Answer

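If result is a string inside a list, you need to wrap it in another list. writerows expects an iterable of rows, each of which is itself an iterable of fields, so when a row is a bare string it iterates over the string and writes every character as a separate field:

a.writerows([result])  # wrap in a list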
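To see why the characters end up comma-separated, here is a minimal sketch using only the standard library. It assumes Python 3, and the helper `rows_written` and the sample string are made up for illustration; plain strings stand in for the td contents.

```python
import csv
import io


def rows_written(write):
    """Run write(writer) against an in-memory buffer and return the lines it produced."""
    buf = io.StringIO(newline="")
    write(csv.writer(buf, delimiter=","))
    return buf.getvalue().splitlines()


value = "Value"  # stand-in for the text of one td

# Each element of the outer list is a row; a string used as a row is
# iterated character by character, so every character becomes a field.
per_char = rows_written(lambda w: w.writerows([value]))

# Wrapping the string in its own list makes it a single field of one row.
wrapped = rows_written(lambda w: w.writerows([[value]]))

# writerow takes one row directly, so a one-element list gives one field.
single = rows_written(lambda w: w.writerow([value]))

print(per_char)  # ['V,a,l,u,e']
print(wrapped)   # ['Value']
print(single)    # ['Value']
```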

In your case you should use writerow and extract the text from each td tag in results:

a.writerow([result.text])  # write the text from the td element

You already have all the matching td tags in your results list, so you just need to extract the text from each one with .text.
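Putting the pieces together, a complete version might look roughly like the sketch below. It makes a few assumptions that are not in the question: it targets Python 3 (so the file is opened in text mode with newline="", which the csv documentation recommends and which avoids the extra blank lines on Windows, instead of 'wb'), it uses an inline sample string and a temporary directory rather than the Desktop paths, it names the parser explicitly as "html.parser", and the helper names `extract_values` and `write_column` are hypothetical.

```python
import csv
import os
import tempfile

from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the contents of test.html.
SAMPLE_HTML = """
<table>
  <tr><td describedby="grid_1-1">Value for CSV</td></tr>
  <tr><td describedby="grid_1-1">Value for CSV2</td></tr>
  <tr><td describedby="grid_1-1">Value for CSV3</td></tr>
  <tr><td describedby="grid_1-2">Value for CSV4</td></tr>
</table>
"""


def extract_values(markup, describedby="grid_1-1"):
    """Return the text of every td whose describedby attribute matches."""
    soup = BeautifulSoup(markup, "html.parser")
    cells = soup.find_all("td", {"describedby": describedby})
    return [cell.get_text(strip=True) for cell in cells]


def write_column(values, path):
    """Write one value per CSV row."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="") as fp:
        writer = csv.writer(fp, delimiter=",")
        for value in values:
            writer.writerow([value])  # one-element list -> one field per row


values = extract_values(SAMPLE_HTML)
out_path = os.path.join(tempfile.mkdtemp(), "Deploy.csv")
write_column(values, out_path)

with open(out_path, newline="") as fp:
    written = [row for row in csv.reader(fp)]
print(written)
```

On Python 2, as in the question, the 'wb' mode for the output file is the right choice and the rest of the logic stays the same.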
