Reputation: 81
I have several arrays and I want them all to have the same size, padding the ones with fewer elements with NaN. Some of the arrays contain integers and others contain strings.
For example:
a = ['Nike']
b = [1,5,10,15,20]
c = ['Adidas']
d = [150, 2]
I have tried
max_len = max(len(a),len(b),len(c),len(d))
empty = np.empty(max_len - len(a))
a = np.asarray(a) + empty
empty = np.empty(max_len - len(b))
b = np.asarray(b) + empty
I do the same with all of the arrays, but an error occurs (TypeError: only integer scalar arrays can be converted to a scalar index).
I am doing this because I want to build a DataFrame in which each array is a separate column.
Thank you in advance.
Upvotes: 2
Views: 3612
Reputation: 88236
I'd suggest using lists since you also have strings. Here's one way using zip_longest:
from itertools import zip_longest
a, b, c, d = map(list,(zip(*zip_longest(a,b,c,d, fillvalue=float('nan')))))
print(a)
# ['Nike', nan, nan, nan, nan]
print(b)
# [1, 5, 10, 15, 20]
print(c)
# ['Adidas', nan, nan, nan, nan]
print(d)
# [150, 2, nan, nan, nan]
Another approach could be:
max_len = len(max([a,b,c,d], key=len))
a, b, c, d = [l+[float('nan')]*(max_len-len(l)) for l in [a,b,c,d]]
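Since the goal is a DataFrame with one column per list, the padded rows from zip_longest can also be passed to pandas directly. This is a minimal sketch, not the only way to do it; the column names 'a' to 'd' are placeholders chosen for the example.

```python
from itertools import zip_longest

import pandas as pd

a = ['Nike']
b = [1, 5, 10, 15, 20]
c = ['Adidas']
d = [150, 2]

# zip_longest yields one tuple per row and pads the shorter lists with NaN.
rows = list(zip_longest(a, b, c, d, fillvalue=float('nan')))

# Column names here are placeholders, not part of the original question.
df = pd.DataFrame(rows, columns=['a', 'b', 'c', 'd'])
print(df)
```

Columns that end up holding both numbers and NaN (such as d) are stored as floats, while b keeps its integers because it needs no padding.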
Upvotes: 3
Reputation: 12992
You can pass the lists directly to pandas, which pads the shorter ones with NaN. Note that each list becomes a row here:
>>> import pandas as pd
>>> a = ['Nike']
>>> b = [1,5,10,15,20]
>>> c = ['Adidas']
>>> d = [150, 2]
>>> pd.DataFrame([a, b, c, d])
        0     1     2     3     4
0    Nike   NaN   NaN   NaN   NaN
1       1   5.0  10.0  15.0  20.0
2  Adidas   NaN   NaN   NaN   NaN
3     150   2.0   NaN   NaN   NaN
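The output above has one row per list. If each list should be a column instead, as the question asks, one option is to wrap each list in a pd.Series and build the frame from a dict, since pandas aligns Series on their index and fills the gaps with NaN. A sketch, with the column names 'a' to 'd' assumed for illustration:

```python
import pandas as pd

a = ['Nike']
b = [1, 5, 10, 15, 20]
c = ['Adidas']
d = [150, 2]

# Each Series gets a default 0..n-1 index; the DataFrame constructor takes
# the union of those indexes and fills missing positions with NaN.
named_lists = {'a': a, 'b': b, 'c': c, 'd': d}  # placeholder column names
df = pd.DataFrame({name: pd.Series(values) for name, values in named_lists.items()})
print(df)
```

Transposing the row-wise frame with .T would also give columns, but every column would then keep the mixed object dtype, whereas building from Series lets each column have its own dtype.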
Upvotes: 0
Reputation: 84
You should use numpy.append(arr, values, axis) to append to an array. In your example that would be ans = np.append(a, empty).
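One caveat when padding this way: np.empty returns uninitialized memory rather than NaN, and appending floats to an array of strings makes NumPy convert everything to strings. A sketch that avoids both by using np.full and dtype=object; the helper name pad_with_nan is made up for this example:

```python
import numpy as np

def pad_with_nan(values, length):
    """Return `values` as an object array padded with NaN up to `length`."""
    # dtype=object keeps strings and integers as they are.
    arr = np.asarray(values, dtype=object)
    # np.full fills the padding with NaN; np.empty would leave it uninitialized.
    padding = np.full(length - len(arr), np.nan, dtype=object)
    return np.append(arr, padding)

a = ['Nike']
b = [1, 5, 10, 15, 20]
max_len = max(len(a), len(b))

padded_a = pad_with_nan(a, max_len)
padded_b = pad_with_nan(b, max_len)
print(padded_a)
print(padded_b)
```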
Upvotes: 1