Reputation: 13
Please help with my code.
I am getting
IndexError: list index out of range
when I use
split(",")[1] and split(",")[2]
These work fine instead:
split(",")[0] and split(",")[-1]
I'd appreciate your help.
My data looks like this:
INPUT.csv
col0 col1 col2 col3 col4
blue, eight, line, aaa [email protected],[email protected],[email protected]
green, nine, square, bbb [email protected],[email protected],[email protected]
Expected output:
OUTPUT.csv
col0 col1 col2 col3 col4 col5 col6
blue eight line aaa [email protected] [email protected] [email protected]
green, nine, square, bbb [email protected] [email protected] [email protected]
My code so far:
import csv

with open('INPUT.csv', 'r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    with open('OUTPUT.csv', 'w', encoding='utf-8') as new_file:
        fieldnames = ['col0', 'col1', 'col2', 'col3', 'col4', 'col5', 'col6']
        csv_writer = csv.DictWriter(new_file, lineterminator='\n',
                                    fieldnames=fieldnames)
        for row in csv_reader:
            csv_writer.writerow({
                "col0": row["col0"],
                "col1": row["col1"],
                "col4": row["col4"].split(",")[0].strip(),
                "col5": row["col4"].split(",")[1].strip(),
                "col6": row["col4"].split(",")[2].strip(),
            })
Upvotes: 0
Views: 92
Reputation: 365717
You're reading the file as comma-separated values. So, look at this line:
green, nine, square, bbb [email protected],[email protected],[email protected]
The values, separated by commas, are:
green
nine
square
bbb [email protected]
[email protected]
[email protected]
So, your column 4 is [email protected]. When you try to split that on commas, of course it doesn't have any, so you get back only one result, and then you ask for the second and third values, which don't exist.
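To see this for yourself, here is a minimal sketch that feeds one such line through csv.reader. The addresses are made-up placeholders (the real ones aren't shown here), and io.StringIO stands in for the file:

```python
import csv
import io

# Hypothetical stand-in for one data row of INPUT.csv; addresses are placeholders.
line = "green, nine, square, bbb a@example.com,b@example.com,c@example.com\n"

# The csv module splits on every comma, including the ones between addresses.
fields = next(csv.reader(io.StringIO(line)))
print(fields)

# Index 4 holds a single address, so splitting it on "," yields one element.
parts = fields[4].split(",")
print(parts)

error = None
try:
    parts[1]  # there is no second element
except IndexError as exc:
    error = exc
    print("IndexError:", exc)
```

Note that the fourth value still has "bbb" glued to the first address, because there is no comma between them.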
You need to fix your CSV file to actually be a CSV file.
That includes putting a comma after the bbb column, and after each column in the header.
And, more importantly, it means not using commas inside columns when you're using the same commas to separate the columns. The result is at best ambiguous, and therefore it can't be parsed.
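Ways around this include quoting any column that contains commas, or using a different delimiter, either between the columns or between the values inside a column.
(You could almost use ", " as a column delimiter here, but that's really hacky, and any human editing your file is going to break it.)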
Here's an example that could work:
col0, col1, col2, col3, col4
blue, eight, line, aaa, "[email protected],[email protected],[email protected]"
green, nine, square, bbb, "[email protected],[email protected],[email protected]"
Even with all that messy spacing (that you always get from human-edited files), this can be parsed cleanly and unambiguously with the right dialect parameters:
csv_reader = csv.DictReader(csv_file, skipinitialspace=True)
Now, each row looks like this:
{'col0': 'blue',
'col1': 'eight',
'col2': 'line',
'col3': 'aaa',
'col4': '[email protected],[email protected],[email protected]'}
… so now, you can call row["col4"].split(",") and get back:
['[email protected]', '[email protected]', '[email protected]']
And then, [1] and [2] will work.
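Here's a small sketch of that parse, assuming the address list is wrapped in double quotes so its commas stay inside one column. The addresses are placeholders, and io.StringIO stands in for the file:

```python
import csv
import io

# Placeholder data: header plus one row, with the address list quoted.
data = (
    'col0, col1, col2, col3, col4\n'
    'blue, eight, line, aaa, "a@example.com,b@example.com,c@example.com"\n'
)

# skipinitialspace=True drops the space after each delimiter, which also
# lets the parser recognize the opening quote of the last column.
reader = csv.DictReader(io.StringIO(data), skipinitialspace=True)
row = next(reader)
print(row["col4"])

emails = row["col4"].split(",")
print(emails[1], emails[2])
```

Without skipinitialspace=True, the space before the opening quote would stop the csv module from treating the field as quoted, and the commas inside it would split the row again.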
However, you still have at least one more problem in your code. Your desired output includes columns 2 and 3, but you're explicitly leaving them out of the writerow call.
While we're at it, there's no reason to try to cram 7 lines of code into one expression. So, why not just split the row once?
col456 = row["col4"].split(",")
And then, we can just modify row in-place:
row["col4"], row["col5"], row["col6"] = col456
… and now:
csv_writer.writerow(row)
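Putting the pieces together, a rough end-to-end sketch might look like this. It assumes the input has been fixed so the address list is quoted, uses placeholder addresses, and uses io.StringIO in place of INPUT.csv and OUTPUT.csv so it runs on its own. The writeheader() call and the strip() are my additions, since the expected output shows a header row:

```python
import csv
import io

# Stand-in for a corrected INPUT.csv (placeholder addresses, quoted list).
source = io.StringIO(
    'col0, col1, col2, col3, col4\n'
    'blue, eight, line, aaa, "a@example.com,b@example.com,c@example.com"\n'
    'green, nine, square, bbb, "d@example.com,e@example.com,f@example.com"\n'
)
target = io.StringIO()  # stand-in for OUTPUT.csv

fieldnames = ["col0", "col1", "col2", "col3", "col4", "col5", "col6"]
reader = csv.DictReader(source, skipinitialspace=True)
writer = csv.DictWriter(target, fieldnames=fieldnames, lineterminator="\n")
writer.writeheader()

for row in reader:
    # Split once, then spread the three addresses over col4..col6.
    col456 = [part.strip() for part in row["col4"].split(",")]
    row["col4"], row["col5"], row["col6"] = col456
    writer.writerow(row)  # col2 and col3 are carried along unchanged

output = target.getvalue()
print(output)
```

The unpacking assumes exactly three addresses per row; a row with more or fewer would raise ValueError, so real data may need a check there.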
Upvotes: 4
Reputation: 81604
If string does not contain any ',' then string.split(',') will return a list with a single element, the entire string. In this case, string.split(',')[1] will obviously raise IndexError.
That is also why [0] and [-1] appeared to work: li[0] == li[-1] when li is a list with a single element.
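A quick sketch of that behaviour, using a placeholder address, plus one hedged way to guard the index instead of letting it raise:

```python
text = "b@example.com"  # placeholder value with no comma in it

li = text.split(",")
print(li)               # a one-element list holding the whole string
print(li[0] == li[-1])  # first and last are the same element

# Guarded access: fall back to None when there is no second element.
second = li[1] if len(li) > 1 else None
print(second)
```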
Upvotes: 1