Array length does not match index length when mixing list and dataframe columns

I have 2 dataframes and a list, and I want to combine them into a single pandas dataframe.

List m1, dataframe test_subdata and dataframe predicciones:

len(m1)
438
test_subdata.shape
(438, 8)
predicciones.shape
(438, 3)

Basically, I want to build a dataframe of shape (438, 3) from the values above:

result_frame = pd.DataFrame({'index': test_subdata['id'], 'match_1': m1, 
                             'pred1': predicciones['pred1']})

but when I do so, the following error appears:

ValueError: array length 438 does not match index length 841

Any idea what is happening?

PS: When I combine only one dataframe with the list, everything works, and combining just the 2 dataframes works too.

Upvotes: 2

Views: 12488

Answers (1)

Bharath M Shetty

Reputation: 30605

You are getting the array length mismatch error because of the index each Series carries. Either reset the index beforehand or pass only the values, i.e.

result_frame = pd.DataFrame({'index': test_subdata['id'].values, 'match_1': m1, 
                         'pred1': predicciones['pred1'].values})
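The other option, resetting the index first, could look roughly like the sketch below. The small frames are hypothetical stand-ins for the question's data (3 rows instead of 438), and the example assumes the rows of both frames are already in the same order, since both fixes pair rows by position.

```python
import pandas as pd

# Hypothetical stand-ins for the question's data; note the different indexes.
test_subdata = pd.DataFrame({'id': [10, 11, 12]}, index=[0, 1, 2])
predicciones = pd.DataFrame({'pred1': [0.1, 0.2, 0.3]}, index=[2, 3, 4])
m1 = ['a', 'b', 'c']

# Drop the old labels so both Series share the same 0..n-1 index.
ids = test_subdata['id'].reset_index(drop=True)
preds = predicciones['pred1'].reset_index(drop=True)

# The indexes now match, so the list lines up with them.
result_frame = pd.DataFrame({'index': ids, 'match_1': m1, 'pred1': preds})
print(result_frame.shape)
```

Unlike .values, this keeps the columns as Series until the frame is built, which can be handy if you want to inspect them first.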

Explanation

test_subdata['id'] and predicciones['pred1'] are Series. If their indexes differ, the dataframe constructor aligns them on the union of both indexes and fills the labels missing from either side with NaN. The resulting index is therefore longer than either Series (841 labels in your case, almost double the 438 rows). To make your existing approach work, make sure both dataframes have the same index.

Since the length of m1 (438) doesn't match the length of the resulting index (841), you get the array length mismatch error.
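To see the effect on a small scale, here is a minimal sketch with made-up data (3 rows per frame, indexes overlapping on one label). The frames and values are assumptions for illustration only, not the question's actual data.

```python
import pandas as pd

# Hypothetical frames whose indexes share only the label 2.
test_subdata = pd.DataFrame({'id': [10, 11, 12]}, index=[0, 1, 2])
predicciones = pd.DataFrame({'pred1': [0.1, 0.2, 0.3]}, index=[2, 3, 4])
m1 = ['a', 'b', 'c']

# Two Series alone: the constructor aligns them on the union of the
# index labels (0..4), so the result has 5 rows with NaN in the gaps.
aligned = pd.DataFrame({'index': test_subdata['id'],
                        'pred1': predicciones['pred1']})
print(len(aligned))

# Adding the 3-element list now fails, because 3 != 5.
error_message = None
try:
    pd.DataFrame({'index': test_subdata['id'], 'match_1': m1,
                  'pred1': predicciones['pred1']})
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

This also explains the PS in the question: two Series on their own simply get aligned, and a single Series plus a list of the same length has nothing to conflict with.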

Upvotes: 5
