Reputation: 97
I'd like advice on how best to approach this problem. My CSV file looks like this:
,02/12/2013,03/12/2013,04/12/2013,05/12/2013,06/12/2013,07/12/2013,08/12/2013,
06:00,"06:00 World Sport","06:00 World Sport","06:00 World Sport","06:00 World Sport","06:00 World Sport","06:00 World Sport","06:00 World Sport",06:00
,,,,,,,,
06:15,,,,,,,,06:15
,,,,,,,,
06:30,"06:30 Inside Africa: November 29, 2013","06:30 African Voices: Agatha Achindu","06:30 Inside the Louvre","06:30 Talk Asia: Franz Harary","06:30 Blueprint","06:30 Inside the Middle East","06:30 CNNGo",06:30
What I need to do is collect the dates from the header row (however many there are in one sheet) and put the matching date in front of every entry, separated by a comma, like this:
02/12/2013, "06:00 World Sport", 03/12/2013, "06:00 World Sport", 04/12/2013, "06:00 World Sport"...
02/12/2013, "06:30 Inside Africa: November 29, 2013", 03/12/2013, "06:30 African Voices.."
And my starting code was like this:
import fileinput
import re

# fnames (input file names) and output (an open file) are defined elsewhere
try:
    for line in fileinput.input(fnames):
        if re.search(r'\d{2}/\d{2}/\d{4}.*', line):
            line_date = re.findall(r'\d{2}/\d{2}/\d{4}', line)[0]
            output.write(line_date + '\n')
        if re.search(r'\".+?\"', line):
            line_sadrzaj = re.findall(r'\".+?\"', line)[0]
            output.write(line_sadrzaj + '\n')
finally:
    output.close()
Do you have a better idea for this problem?
Maybe this way:
for line in fileinput.input(fnames):
    if re.search(r'\d{2}/\d{2}/\d{4}.*', line):
        line_date = re.findall(r'\d{2}/\d{2}/\d{4}.*', line)[0]
        line_split = re.split(r'\,', line_date)
        for line1 in line_split:
            var = line1
            output.write(var + '\n')
    if re.search(r'\".+?\".*', line):
        line_sadrzaj = re.findall(r'\".+?\".*', line)[0]
        line_split1 = re.split(r'\,', line_sadrzaj)
        for line2 in line_split1:
            var2 = line2
            output.write(var2 + '\n')
        #output.write(line_sadrzaj+'\n')
Upvotes: 1
Views: 2517
Reputation: 101162
You don't need regex at all; just use the csv module to read the CSV file, then transform the result into your desired output.
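A quick illustration of why, as a sketch assuming Python 3 (the sample row is shortened from the question's data, and the variable names are only for illustration): splitting on commas cuts a quoted title that itself contains a comma, while `csv.reader` keeps each quoted field whole.

```python
import csv
import io

# Shortened sample row from the question; the first title contains a comma.
row = '06:30,"06:30 Inside Africa: November 29, 2013","06:30 CNNGo",06:30\n'

# A naive split on commas cuts the quoted title in two.
naive = row.strip().split(',')

# csv.reader honours the quotes and returns one field per column.
parsed = next(csv.reader(io.StringIO(row)))

print(len(naive))   # 5
print(len(parsed))  # 4
print(parsed[1])    # 06:30 Inside Africa: November 29, 2013
```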
Example:
import csv

with open('csv.csv') as text:
    table = list(csv.reader(text))

# get all dates (skipping first and last column)
dates = table[0][1:-1]
# get all shows (skipping first and last column and empty rows)
shows = filter(''.join, (t[1:-1] for t in table[1:]))
# join dates and shows back together and do some formatting
for line in [zip(dates, s) for s in shows]:
    print(', '.join('{}, "{}"'.format(*t) for t in line))
Result:
02/12/2013, "06:00 World Sport", 03/12/2013, "06:00 World Sport", 04/12/2013, "06:00 World Sport", 05/12/2013, "06:00 World Sport", 06/12/2013, "06:00 World Sport", 07/12/2013, "06:00 World Sport", 08/12/2013, "06:00 World Sport"
02/12/2013, "06:30 Inside Africa: November 29, 2013", 03/12/2013, "06:30 African Voices: Agatha Achindu", 04/12/2013, "06:30 Inside the Louvre", 05/12/2013, "06:30 Talk Asia: Franz Harary", 06/12/2013, "06:30 Blueprint", 07/12/2013, "06:30 Inside the Middle East", 08/12/2013, "06:30 CNNGo"
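If you want the same idea in a reusable form, here is a minimal sketch assuming Python 3. The helper names `pair_dates_with_shows` and `format_line` are hypothetical, and the sample is a two-day excerpt of the question's data read from a string instead of a file.

```python
import csv
import io


def pair_dates_with_shows(rows):
    """Return one list of (date, show) pairs per non-empty schedule row.

    `rows` is the parsed CSV table: the first row holds the dates, and
    the first and last columns hold the time labels.
    """
    dates = rows[0][1:-1]
    result = []
    for row in rows[1:]:
        shows = row[1:-1]
        if not any(shows):  # skip blank rows and time-only rows
            continue
        result.append(list(zip(dates, shows)))
    return result


def format_line(pairs):
    """Format pairs as: date, "show", date, "show", ..."""
    return ', '.join('{}, "{}"'.format(date, show) for date, show in pairs)


# Two-day excerpt of the question's data, used here instead of a file.
sample = (
    ',02/12/2013,03/12/2013,\n'
    '06:00,"06:00 World Sport","06:00 World Sport",06:00\n'
    ',,,\n'
    '06:15,,,06:15\n'
)
table = list(csv.reader(io.StringIO(sample)))
lines = [format_line(p) for p in pair_dates_with_shows(table)]
print('\n'.join(lines))
```

To write to a file rather than print, you could open an output file (any name you like) in `'w'` mode and write each entry of `lines` followed by a newline.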
Upvotes: 3