ValueError: arrays must all be same length - print dataframe to CSV

Question

Thanks for stopping by! I was hoping to get some help creating a CSV from a pandas DataFrame. Here is my code:

a = ldamallet[bow_corpus_new[:21]]
b = data_text_new

print(a)
print("\n")
print(b)

d = {'Preprocessed Document': b['Preprocessed Document'].tolist(), 
     'topic_0': a[0][1], 
     'topic_1': a[1][1], 
     'topic_2': a[2][1], 
     'topic_3': a[3][1], 
     'topic_4': a[4][1], 
     'topic_5': a[5][1], 
     'topic_6': a[6][1], 
     'topic_7': a[7][1], 
     'topic_8': a[8][1], 
     'topic_9': a[9][1], 
     'topic_10': a[10][1],
     'topic_11': a[11][1], 
     'topic_12': a[12][1],
     'topic_13': a[13][1],
     'topic_14': a[14][1],
     'topic_15': a[15][1],
     'topic_16': a[16][1],
     'topic_17': a[17][1],
     'topic_18': a[18][1],
     'topic_19': a[19][1]}

print(d)

df = pd.DataFrame(data=d)
df.to_csv("test.csv", index=False)

The data:

print(a): each row is a list of (topic number, topic percentage) tuples

[[(topic number: 0, topic percentage),...(19, #)], [(topic distribution for next row, #)...(19, .819438),...(#,#),...]

print(b)

Here is my error:

This is the size of the dataframe:

This is what I wished it looked like:

Any help would be greatly appreciated :)

Matt Cremeens · Accepted Answer

It might be easiest to collect the second value of each tuple, across all of the rows, into its own list. Something like this:

topic_0 = []
topic_1 = []
topic_2 = []
# ...and so on
for i in a:
    # i is one row: a list of (topic number, topic percentage) tuples
    topic_0.append(i[0][1])
    topic_1.append(i[1][1])
    topic_2.append(i[2][1])
    # ...and so on

Then you can make your dictionary like so:

d = {'Preprocessed Document': b['Preprocessed Document'].tolist(),
     'topic_0': topic_0,
     'topic_1': topic_1,
     # etc.
     }
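
If writing out twenty lists by hand gets tedious, the same idea can be sketched with a loop. This is only a minimal sketch: the small `a` and `docs` values below are made-up stand-ins for the real `ldamallet[...]` output and `b['Preprocessed Document'].tolist()`, and it assumes every row lists all topics in order (topic 0 first), as in the question.

```python
import pandas as pd

# Hypothetical stand-in for `a`: one list of (topic number, weight)
# tuples per document, every topic present and in order.
a = [
    [(0, 0.10), (1, 0.60), (2, 0.30)],
    [(0, 0.25), (1, 0.25), (2, 0.50)],
]
# Hypothetical stand-in for b['Preprocessed Document'].tolist()
docs = ["first document", "second document"]

num_topics = len(a[0])

# Build one column per topic: for topic t, take the weight
# (second tuple element) from every row.
d = {"Preprocessed Document": docs}
for t in range(num_topics):
    d[f"topic_{t}"] = [row[t][1] for row in a]

# Every column now has one entry per document, so the lengths match.
df = pd.DataFrame(data=d)
df.to_csv("test.csv", index=False)
```

With the original code, each `a[k][1]` is a single (topic number, weight) tuple from one row, so its length is 2 while the document column has one entry per document; that mismatch is what pandas complains about.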
