Reputation: 442
I have this array (the result of a similarity calculation). It is a list of lists of tuples, like this:
example = [[(a,b), (c,d)], [(a1,b1), (c1,d2)] …]
example contains 121,044 lists of 30 tuples each.
I want a pandas DataFrame containing just the second value of each tuple (i.e. b, d, b1, d2), without spending too much time computing it.
Do you have any ideas?
Upvotes: 5
Views: 2763
Reputation: 164843
For numeric data, you can use numpy indexing directly. This should be more efficient than a list comprehension, as pandas uses numpy internally to store data in contiguous memory blocks.
import numpy as np
import pandas as pd
example = [[(1,2), (3,4)], [(5,6), (7,8)]]
# np.array(example) has shape (2, 2, 2); [..., 1] keeps the second value of each tuple
df = pd.DataFrame(np.array(example)[..., 1],
                  columns=['col1', 'col2'])
print(df)
   col1  col2
0     2     4
1     6     8
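As a rough sketch of how the same indexing behaves on data shaped more like the question's (many rows, several tuples per row), the snippet below builds a small hypothetical stand-in for the similarity output. The values and sizes are made up for illustration. It assumes every inner list has the same length and every tuple holds numbers, since otherwise np.array cannot build a regular 3-D array.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data: 4 rows of 3 (index, score) tuples.
# The real data in the question would be 121044 rows of 30 tuples.
example = [[(j, i + j * 0.5) for j in range(3)] for i in range(4)]

arr = np.array(example)   # shape (4, 3, 2): rows, tuples per row, tuple fields
second = arr[..., 1]      # shape (4, 3): second field of every tuple
df = pd.DataFrame(second)
print(df.shape)
```

The ellipsis in arr[..., 1] stands for "all leading axes", so the expression is equivalent to arr[:, :, 1] here.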
Upvotes: 1
Reputation: 863791
Use a nested list comprehension:
df = pd.DataFrame([[y[1] for y in x] for x in example])
print(df)
    0   1
0   b   d
1  b1  d2
df = pd.DataFrame([[y[1] for y in x] for x in example], columns=['col1','col2'])
print(df)
  col1 col2
0    b    d
1   b1   d2
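For numeric input, a quick way to convince yourself that the list comprehension and numpy indexing give the same values is to build both frames and compare them. This is only a sketch on a tiny sample list. The names df_list and df_numpy are introduced here for illustration.

```python
import numpy as np
import pandas as pd

example = [[(1, 2), (3, 4)], [(5, 6), (7, 8)]]
cols = ['col1', 'col2']

# Nested list comprehension: pick element 1 of every tuple, row by row.
df_list = pd.DataFrame([[y[1] for y in x] for x in example], columns=cols)

# numpy indexing: convert to a 3-D array, then take index 1 on the last axis.
df_numpy = pd.DataFrame(np.array(example)[..., 1], columns=cols)

# Compare the underlying values (dtypes may differ by platform).
same = bool((df_list.to_numpy() == df_numpy.to_numpy()).all())
print(same)
```

The list comprehension also works when the tuples hold non-numeric or mixed values, which is the case where converting to a numpy array is less attractive.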
Upvotes: 3