user1718373

Reputation: 81

Python: creating/formatting CSV files with columns and rows

I have a web scraper that saves the scraped data into a CSV file. The data looks like this:

random text
Johm May
1234 Big Street
Atlanta, GA 30331
acre .14 small
random text
Jane Jones
4321 Little Street
Atlanta, GA 30322
acre .07 small
random text

I would like to:

(1) Add a header row with the columns Name,Street,,Address <--- Note that this sample is delimited by commas.

(2) Add leading commas to the address lines I posted above. An example would be:

jane jones
,4321 Little Street
,,Atlanta, GA 30344
,,,acre .07 small
,,,random text

Note how the commas push each line into the desired column, with the unneeded data (acre .07 small and random text) pushed past the named columns.

How do I do this in Python? I can do it by hand, but I'm dealing with thousands of addresses and I need a simple way to do this in Python.

Is it possible to pull all the data into a list after it has been scraped, assign a variable for each comma prefix (like a = , b = ,, c = ,,,), join the right variable to each line in the list, and then save it again?

I also need to add the column header line: Name,Street,,Address

Upvotes: 0

Views: 2767

Answers (1)

Mu Mind

Reputation: 11194

I'm just guessing what you mean on a lot of this, since your question seems to be missing some details, but this should get you something similar to what you want:

import csv

# Python 3: open the CSV in text mode with newline='' so the csv module
# controls the line endings itself
with open('data.txt', 'r') as f:
    with open('data.csv', 'w', newline='') as csv_out:
        # strip the trailing newline from each input line as it is read
        line_iter = (l.rstrip('\n') for l in f)
        writer = csv.writer(csv_out)
        writer.writerow(['Name', 'Street', '', 'Address'])
        try:
            next(line_iter)    # discard 'random text' (?)
            while True:
                # each record is five lines; each goes one column further right
                writer.writerow([next(line_iter), '', '', ''])
                writer.writerow(['', next(line_iter), '', ''])
                writer.writerow(['', '', next(line_iter), ''])
                writer.writerow(['', '', '', next(line_iter)])
                writer.writerow(['', '', '', next(line_iter)])
        except StopIteration:
            pass        # reached end of file

It gives this output for your example data above:

Name,Street,,Address
Johm May,,,
,1234 Big Street,,
,,"Atlanta, GA 30331",
,,,acre .14 small
,,,random text
Jane Jones,,,
,4321 Little Street,,
,,"Atlanta, GA 30322",
,,,acre .07 small
,,,random text
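
If the scraped lines are already in a list, as the question suggests, the same layout could be produced without reading a file. The following is only a rough sketch for Python 3, assuming every record is exactly five lines after one leading line to discard (which is my guess from the sample data). The names COLUMN_FOR_LINE, staircase_rows and write_staircase are made up for illustration:

```python
import csv
import io

# Column index for each line of one 5-line record: name, street,
# city/state/zip, then two unneeded lines pushed into the last column.
# The 5-line layout is an assumption based on the sample data.
COLUMN_FOR_LINE = [0, 1, 2, 3, 3]
HEADER = ['Name', 'Street', '', 'Address']


def staircase_rows(lines, skip=1):
    """Yield one 4-column row per input line, skipping the first `skip` lines."""
    for i, line in enumerate(lines[skip:]):
        row = [''] * len(HEADER)
        # position within the current record decides the target column
        row[COLUMN_FOR_LINE[i % len(COLUMN_FOR_LINE)]] = line
        yield row


def write_staircase(lines, out):
    """Write the header and the shifted rows to any file-like object."""
    writer = csv.writer(out, lineterminator='\n')
    writer.writerow(HEADER)
    writer.writerows(staircase_rows(lines))


sample = [
    'random text',
    'Jane Jones',
    '4321 Little Street',
    'Atlanta, GA 30322',
    'acre .07 small',
    'random text',
]
buf = io.StringIO()
write_staircase(sample, buf)
print(buf.getvalue())
```

To save to disk instead, pass a file opened with open('data.csv', 'w', newline='') as the out argument. Letting csv.writer build the rows, rather than joining comma strings by hand, means fields that contain a comma (like the city line) get quoted automatically.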

Upvotes: 2
