Reputation: 81
I have a web scraper that saves the scraped data into a CSV file. The data looks like this:
random text
Johm May
1234 Big Street
Atlanta, GA 30331
acre .14 small
random text
Jane Jones
4321 Little Street
Atlanta, GA 30322
acre .07 small
random text
I would like to:
(1) Add in the columns Name,Street,,Address <--- Note that this sample is delimited by a comma.
(2) Add commas to the address results I posted above. An example would be:
jane jones
,4321 Little Street
,,Atlanta, GA 30344
,,,acre .07 small
,,,random text
Note how the commas are used to push each line to the desired column with the unneeded data acre .07 small and random text being pushed away from the named columns.
How do I do this in Python? I can do it by hand, but I'm dealing with thousands of addresses and I need a simple way to do this in Python.
Is it possible to pull all the data into a list after it has been scraped, assign variables for the commas (like a = , b = ,, c = ,,,), join the right variable to a specific line in the list, and then save it again?
Also, I need to add the column info as well: columns Name,Street,,Address
Upvotes: 0
Views: 2767
Reputation: 11194
I'm just guessing what you mean on a lot of this, since your question seems to be missing some details, but this should get you something similar to what you want:
    import csv

    # Python 3: open the CSV in text mode with newline='' so the csv
    # module controls the line endings itself.
    with open('data.txt', 'r') as f, open('data.csv', 'w', newline='') as csv_out:
        line_iter = (line.rstrip('\n') for line in f)
        writer = csv.writer(csv_out)
        writer.writerow(['Name', 'Street', '', 'Address'])
        try:
            next(line_iter)  # discard the leading 'random text' (?)
            while True:
                # Each record is five lines; put each one in its own column.
                writer.writerow([next(line_iter), '', '', ''])  # name
                writer.writerow(['', next(line_iter), '', ''])  # street
                writer.writerow(['', '', next(line_iter), ''])  # city, state, ZIP
                writer.writerow(['', '', '', next(line_iter)])  # acreage
                writer.writerow(['', '', '', next(line_iter)])  # 'random text'
        except StopIteration:
            pass  # reached end of file
It gives this output for your example data above:
Name,Street,,Address
Johm May,,,
,1234 Big Street,,
,,"Atlanta, GA 30331",
,,,acre .14 small
,,,random text
Jane Jones,,,
,4321 Little Street,,
,,"Atlanta, GA 30322",
,,,acre .07 small
,,,random text
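If you already have the scraped lines in a list, as you describe, here is a rough sketch of the same idea working in memory. I'm assuming Python 3 and the same five-lines-per-record layout guessed at above. The names `stagger`, `COLUMN_FOR_LINE` and `HEADER` are just ones I made up for illustration. Instead of gluing comma strings onto each line by hand, it picks a column index per line and lets the `csv` module write the row. That matters because "Atlanta, GA 30331" contains a comma of its own and has to be quoted, which plain string joining would not do.

```python
import csv
import io

# Assumed layout: every record is five lines, in this order:
# name, street, city/state/ZIP, acreage, trailing 'random text'.
# The last two both land in the 'Address' column (index 3).
COLUMN_FOR_LINE = [0, 1, 2, 3, 3]
HEADER = ['Name', 'Street', '', 'Address']


def stagger(lines):
    """Return CSV text with each scraped line pushed into its column."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator='\n')
    writer.writerow(HEADER)
    body = lines[1:]  # skip the leading 'random text' line
    for i, text in enumerate(body):
        row = [''] * len(HEADER)
        # Position within the record decides which column gets the text.
        row[COLUMN_FOR_LINE[i % len(COLUMN_FOR_LINE)]] = text
        writer.writerow(row)
    return out.getvalue()


sample = [
    'random text',
    'Johm May',
    '1234 Big Street',
    'Atlanta, GA 30331',
    'acre .14 small',
    'random text',
]
print(stagger(sample))
```

To save the result, write the returned string to a file, or pass an open file (opened with `newline=''`) to `csv.writer` instead of the `io.StringIO` buffer.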
Upvotes: 2