Reputation: 19
Given a single-column DataFrame, is it possible, using only pandas calls, to shuffle the rows into a random order and then split the result sequentially into multiple columns of length n?
import numpy as np
import pandas as pd
df = pd.read_csv('info.csv', low_memory=False, index_col=0)
df.head(5)
Which initially reads as:
list
0 A
1 B
2 C
3 D
4 E
Then, to randomise the order:
df = df.apply(np.random.permutation)
df.head(5)
Which then reads as:
list
0 C
1 E
2 A
3 B
4 D
I have tried a modified version of the call below, but I am not sure it is the right approach:
df = pd.DataFrame([list[n:n+2] for n in range(0, len(list), 2)], columns=columnNames)
I would like a final DataFrame in the format below, where in this case each column has 3 rows:
list1 list2 ... listn
0 C B ...
1 E D ...
2 A ... ...
Is this possible with a single-line pandas expression?
Thanks in advance!
Upvotes: 1
Views: 35
Reputation: 863166
You can use a dictionary comprehension of Series to create the DataFrame, which also works when the Series have different lengths:
L = np.random.permutation(df['list'])
N = 3
df = (pd.DataFrame({i: pd.Series(L[n:n+N]) for i, n in enumerate(range(0, len(L), N))})
        .add_prefix('list'))
print(df)
list0 list1
0 A D
1 C B
2 E NaN
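For a version that runs without info.csv, here is a minimal sketch of the same dictionary-comprehension idea. The sample data, the fixed seed and the helper name split_into_columns are assumptions added for illustration only:

```python
import numpy as np
import pandas as pd


def split_into_columns(values, n, prefix="list"):
    """Split a 1-D array into consecutive columns of at most n rows each."""
    # Each chunk becomes its own Series; pandas aligns them on the index,
    # so a shorter final chunk is padded with NaN.
    chunks = {
        i: pd.Series(values[start:start + n])
        for i, start in enumerate(range(0, len(values), n))
    }
    return pd.DataFrame(chunks).add_prefix(prefix)


# Hypothetical stand-in for the data read from info.csv.
df = pd.DataFrame({"list": list("ABCDE")})

# Seeded only so that repeated runs give the same shuffle.
rng = np.random.default_rng(0)
shuffled = rng.permutation(df["list"].to_numpy())

result = split_into_columns(shuffled, n=3)
print(result)
```

Reading the result column by column gives back the shuffled order, with NaN filling the unused cells of the last column.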
A non-loop solution; whether it is faster is best tested on your own data:
N = 3
df = (pd.DataFrame({'a': np.random.permutation(df['list'])})
        .assign(b=lambda x: x.index // N, c=lambda x: x.index % N)
        # keyword arguments: pivot no longer accepts positional ones in pandas 2.0+
        .pivot(index='c', columns='b', values='a')
        .add_prefix('list')
        .rename_axis(index=None, columns=None))
print(df)
list0 list1
0 B D
1 A C
2 E NaN
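To check whether the non-loop version is actually faster, both approaches can be timed on the same shuffled input. This is only a rough sketch: the function names with_dict and with_pivot, the sample values and the run count are assumptions, and timings will vary with data size and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

N = 3
# Hypothetical sample data; swap in the shuffled column from your own frame.
values = np.random.default_rng(0).permutation(
    np.array(list("ABCDEFGHIJ"), dtype=object)
)


def with_dict(values, n):
    # Dictionary comprehension: one Series per chunk of n values.
    return pd.DataFrame(
        {i: pd.Series(values[s:s + n]) for i, s in enumerate(range(0, len(values), n))}
    ).add_prefix("list")


def with_pivot(values, n):
    # Non-loop variant: derive column (b) and row (c) positions from the index.
    return (
        pd.DataFrame({"a": values})
        .assign(b=lambda x: x.index // n, c=lambda x: x.index % n)
        .pivot(index="c", columns="b", values="a")
        .add_prefix("list")
        .rename_axis(index=None, columns=None)
    )


a = with_dict(values, N)
b = with_pivot(values, N)

# Compare cell by cell, treating the NaN padding as an empty string.
same = a.fillna("").to_numpy().tolist() == b.fillna("").to_numpy().tolist()
print("same layout:", same)

for func in (with_dict, with_pivot):
    seconds = timeit.timeit(lambda: func(values, N), number=100)
    print(f"{func.__name__}: {seconds:.4f} s for 100 runs")
```

On a small frame like this the difference is unlikely to matter, so it is worth repeating the timing with a realistic number of rows before choosing.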
Upvotes: 2