Reputation: 3973
I have the following list of lists as below:
my_list = [
['first-column', 'DisplayName', 'FLOW TRIGGERED: 636e56d390c8c0910d592cc6', 'ClassificationType', 'NLU', 'KeyPhrases', 'MetaIntent', 'Description', 'test description', 'SampleSentences', [], 'Regexes'],
['first-column', 'DisplayName', 'FLOW TRIGGERED: 636e56d390c8c0910d592cc6', 'ClassificationType', 'NLU', 'KeyPhrases', 'MetaIntent', 'Description', 'test description', 'SampleSentences', [], 'Regexes'],
['first-column', 'DisplayName', 'FLOW TRIGGERED: 636e56d490c8c01802592cd1', 'ClassificationType', 'NLU', 'KeyPhrases', 'MetaIntent', 'Description', 'test description', 'SampleSentences', ['Pressemitteilung?\n', 'Pressemeldung?\n', 'Wo finde ich den Schlussbericht zur Messe?\n'], 'Regexes'],
['first-column', 'DisplayName', 'FLOW TRIGGERED: 636e56d490c8c0edac592cd8', 'ClassificationType', 'NLU', 'KeyPhrases', 'MetaIntent', 'Description', 'test description', 'SampleSentences', ['Aussteller?\n', 'Ausstellerverzeichnis 2022?\n', 'Welche Aussteller waren 2022 dabei?\n', 'Ausstellerliste 2022?\n', 'Welche Unternehmen waren als Aussteller vertreten?\n'], 'Regexes'],
['first-column', 'DisplayName', 'FLOW TRIGGERED: 636e56d490c8c01739592ce0', 'ClassificationType', 'NLU', 'KeyPhrases', 'MetaIntent', 'Description', 'test description', 'SampleSentences', ['Wie hoch war die Ausstellerzahl 2022?\n', 'Wie viele Unternehmen waren vor Ort\n', 'Anzahl Aussteller?\n', 'Ausstellerzahl?\n', 'Wie viele Aussteller waren auf der Messe vertreten?\n'], 'Regexes']
]
I am using the above list to write a CSV file as below:
import csv

rows = zip(*my_list)
with open('test.csv', "w") as f:
    writer = csv.writer(f, lineterminator='\n\n')
    for row in rows:
        writer.writerow(row)
My CSV currently looks like this:
first-column,first-column,first-column
[],[],"['Pressemitteilung?', 'Pressemeldung?', 'Wo finde ich den Schlussbericht zur Messe?']"
Regexes,Regexes,Regexes
But this is not exactly what I need. Each item of the inner lists should go on its own row, like this:
first-column,first-column,first-column,first-column,first-column
DisplayName,DisplayName,DisplayName,DisplayName,DisplayName
FLOW TRIGGERED: 636e56d390c8c0910d592cc6,FLOW TRIGGERED: 636e56d390c8c0910d592cc6,FLOW TRIGGERED: 636e56d490c8c01802592cd1,FLOW TRIGGERED: 636e56d490c8c0edac592cd8,FLOW TRIGGERED: 636e56d490c8c01739592ce0
ClassificationType,ClassificationType,ClassificationType,ClassificationType,ClassificationType
NLU,NLU,NLU,NLU,NLU
KeyPhrases,KeyPhrases,KeyPhrases,KeyPhrases,KeyPhrases
MetaIntent,MetaIntent,MetaIntent,MetaIntent,MetaIntent
Description,Description,Description,Description,Description
test description,test description,test description,test description,test description
SampleSentences,SampleSentences,SampleSentences,SampleSentences,SampleSentences
[],[],Pressemitteilung?,Aussteller?,Wie hoch war die Ausstellerzahl 2022?
[],[],Pressemeldung?,Ausstellerverzeichnis 2022?,Wie viele Unternehmen waren vor Ort
[],[],Wo finde ich den Schlussbericht zur Messe?,Welche Aussteller waren 2022 dabei?,Anzahl Aussteller?
[],[],[],Ausstellerliste 2022?,Ausstellerzahl?
[],[],[],Welche Unternehmen waren als Aussteller vertreten?,Wie viele Aussteller waren auf der Messe vertreten?
Regexes,Regexes,Regexes,Regexes,Regexes
How can I iterate over the inner lists so that my CSV looks like the above? I tried this:
with open('test.csv', "w") as f:
    writer = csv.writer(f, lineterminator='\n\n')
    for row in rows:
        writer.writerow(row)
        writer.writerow(row[1])
But this produces weird output. I am new to Python.
Can someone help me fix this?
Thank you, Best Regards
Upvotes: 2
Views: 89
Reputation: 340
I do not believe this is a standard operation, as it looks like you want to write additional rows to your CSV file based on the contents of the row you are currently processing.
One possible solution checks whether any cell in the row holds a populated list, pads the other cells to that length, and then writes one CSV row per list item:
import csv

my_list = [
    ['first-column', [], 'Regexes'],
    ['first-column', [], 'Regexes'],
    ['first-column', ['Pressemitteilung?', 'Pressemeldung?', 'Wo finde ich den Schlussbericht zur Messe?'], 'Regexes']
]

def list_length(l):
    # Non-list cells count as length 0.
    return len(l) if isinstance(l, list) else 0

def pad_list(l, size):
    # Wrap scalar cells in a list, then pad with empty lists up to `size`.
    if not isinstance(l, list):
        l = [l]
    l.extend([[]] * (size - len(l)))
    return l

rows = zip(*my_list)
with open('test.csv', "w") as f:
    writer = csv.writer(f, lineterminator='\n')
    for row in rows:
        max_len = max([list_length(element) for element in row])
        if max_len > 0:
            # At least one cell holds a populated list: write one CSV row per item.
            row = [pad_list(element, max_len) for element in row]
            subrows = zip(*row)
            for sub in subrows:
                writer.writerow(sub)
        else:
            writer.writerow(row)
Which outputs:
first-column,first-column,first-column
[],[],Pressemitteilung?
[],[],Pressemeldung?
[],[],Wo finde ich den Schlussbericht zur Messe?
Regexes,Regexes,Regexes
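The data in the question has a trailing '\n' on every sample sentence, which the shortened list above leaves out. As a rough sketch of the same padding idea, assuming those newlines should be stripped and that empty slots should be written as the literal text [], itertools.zip_longest can do the padding. The helper name expand_row and its fill parameter are made up for this illustration, and an in-memory buffer stands in for the file:

```python
import csv
import io
from itertools import zip_longest


def expand_row(row, fill="[]"):
    """Turn one transposed row into one or more CSV rows (hypothetical helper)."""
    # Rows without any populated list are written unchanged.
    if not any(isinstance(cell, list) and cell for cell in row):
        return [list(row)]
    # Treat every cell as a column of values; strip trailing newlines from list items.
    columns = [
        [str(item).rstrip("\n") for item in cell] if isinstance(cell, list) else [cell]
        for cell in row
    ]
    # zip_longest pads the shorter columns with `fill`.
    return [list(sub) for sub in zip_longest(*columns, fillvalue=fill)]


my_list = [
    ["first-column", [], "Regexes"],
    ["first-column", ["Pressemitteilung?\n", "Pressemeldung?\n"], "Regexes"],
    ["first-column", ["Aussteller?\n"], "Regexes"],
]

buffer = io.StringIO()  # stands in for the open file
writer = csv.writer(buffer, lineterminator="\n")
for row in zip(*my_list):
    writer.writerows(expand_row(row))

csv_text = buffer.getvalue()
print(csv_text)
# first-column,first-column,first-column
# [],Pressemitteilung?,Aussteller?
# [],Pressemeldung?,[]
# Regexes,Regexes,Regexes
```

Unlike pad_list, this version does not modify the lists inside my_list, so the input can be reused afterwards.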
If you don't want the newline at the end of the file, you need to handle the last row explicitly, for example:
import csv

my_list = [
    ['first-column', [], 'Regexes'],
    ['first-column', [], 'Regexes'],
    ['first-column', ['Pressemitteilung?', 'Pressemeldung?', 'Wo finde ich den Schlussbericht zur Messe?'], 'Regexes']
]

def list_length(l):
    return len(l) if isinstance(l, list) else 0

def pad_list(l, size):
    if not isinstance(l, list):
        l = [l]
    l.extend([[]] * (size - len(l)))
    return l

def parse_row(write, row, new_line):
    max_len = max([list_length(element) for element in row])
    if max_len > 0:
        row = [pad_list(element, max_len) for element in row]
        subrows = list(zip(*row))
        # Every sub-row except the last always ends with a newline;
        # only the last one follows the caller's new_line flag.
        for sub in subrows[:-1]:
            write(sub, True)
        write(subrows[-1], new_line)
    else:
        write(row, new_line)

rows = [list(row) for row in zip(*my_list)]
with open('test.csv', "w", newline='') as f:
    # The writer adds no terminator itself; write() adds '\n' on request.
    writer = csv.writer(f, lineterminator='')

    def write(row, new_line):
        writer.writerow(row)
        if new_line:
            f.write('\n')

    for row in rows[:-1]:
        parse_row(write, row, True)
    # The last row is written without a trailing newline.
    parse_row(write, rows[-1], False)
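Another way to drop the final newline, sketched here under the assumption that the whole file fits in memory: let csv.writer terminate every row as usual, collect the text in an io.StringIO buffer, and cut off the last terminator before writing. The function name to_csv_text is hypothetical:

```python
import csv
import io


def to_csv_text(rows):
    """Render rows as CSV text without a newline after the last row."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    text = buffer.getvalue()
    # Remove only the single terminator that follows the final row.
    return text[:-1] if text.endswith("\n") else text


rows = [["first-column", "first-column"], ["Regexes", "Regexes"]]
csv_text = to_csv_text(rows)
print(repr(csv_text))
# 'first-column,first-column\nRegexes,Regexes'
```

The resulting string can then be written in one call, for example with f.write(csv_text) on a file opened with newline=''.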
Upvotes: 2