Reputation: 863
Yes, this has been discussed a lot, and similar questions have been downvoted multiple times, but I still can't figure this one out.
Say I have a dataframe like this:
df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))
I want to end up with four separate lists (a, b, c and d), each holding the data from one column.
Logically (to me anyway) I would do:
list_of_lst = df.values.T.astype(str).tolist()
for column in df.columns:
    i = 0
    while i < len(df.columns) - 1:
        column = list_of_lst[1]
        i = i + 1
But assigning variable names in a loop is not doable/recommended...
Any suggestions how I can get what I need?
Upvotes: 2
Views: 38
Reputation: 164773
You can transpose your dataframe and use df.T.values.tolist(). But, if you are manipulating numeric arrays thereafter, it's advisable you skip the tolist() part.
df = pd.DataFrame(np.random.randint(0, 100, size=(5, 4)), columns=list('ABCD'))
#     A   B   C   D
# 0  17  56  57  31
# 1   3  44  15   0
# 2  94  36  87  30
# 3  44  49  56  76
# 4  29   5  35  24
list_of_lists = df.T.values.tolist()
# [[17, 3, 94, 44, 29],
# [56, 44, 36, 49, 5],
# [57, 15, 87, 56, 35],
# [31, 0, 30, 76, 24]]
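To illustrate the point about skipping tolist(), here is a minimal sketch that keeps the columns as NumPy arrays and unpacks them into separate names. The small fixed frame and the variable names (columns_as_rows, a_plus_b, a_mean) are made up for illustration, and it assumes a pandas version that provides DataFrame.to_numpy().

```python
import numpy as np
import pandas as pd

# Small fixed frame (illustrative data) so the results are easy to check by eye.
df = pd.DataFrame({"A": [17, 3, 94], "B": [56, 44, 36],
                   "C": [57, 15, 87], "D": [31, 0, 30]})

# Transpose so that each row of the array is one original column.
columns_as_rows = df.to_numpy().T

# Unpacking a 2-D array yields one 1-D array per row.
a, b, c, d = columns_as_rows

# Vectorised arithmetic works directly on arrays; plain lists would need a loop.
a_plus_b = a + b    # element-wise sum of columns A and B
a_mean = a.mean()   # mean of column A
```

Unpacking like this only works when the number of columns is known in advance; otherwise indexing columns_as_rows by position is the safer choice.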
Upvotes: 0
Reputation: 1351
retList = dict()
for i in df.columns:
    iterator = df[i].tolist()
    retList[i] = iterator
You'd get a dictionary with the keys as the column names and values as the list of values in that column.
Modify it to any data structure you want.
list(retList.values()) will give you a list of four inner lists, one per column. (In Python 3, retList.values() on its own returns a view rather than a list.)
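The same loop can be written more compactly as a dict comprehension. This is just a sketch on a small made-up frame; col_lists and list_of_lists are illustrative names, not part of the answer above.

```python
import pandas as pd

# Illustrative data; any DataFrame works the same way.
df = pd.DataFrame({"A": [5, 7, 5], "B": [9, 1, 2],
                   "C": [4, 8, 4], "D": [5, 3, 2]})

# One list per column, keyed by column name.
col_lists = {name: df[name].tolist() for name in df.columns}

# dict.values() is a view in Python 3; wrap it in list() to get a list of lists.
list_of_lists = list(col_lists.values())
```

Dictionaries keep insertion order in Python 3.7+, so list_of_lists follows the column order of df.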
Upvotes: 0
Reputation: 863166
I think the best approach is to create a dictionary of lists with DataFrame.to_dict:
np.random.seed(456)
df = pd.DataFrame(np.random.randint(0,10,size=(10, 4)), columns=list('ABCD'))
print (df)
   A  B  C  D
0  5  9  4  5
1  7  1  8  3
2  5  2  4  2
3  2  8  4  8
4  5  6  0  9
5  8  2  3  6
6  7  0  0  3
7  3  5  6  6
8  3  8  9  6
9  5  1  6  1
d = df.to_dict('list')
print (d['A'])
[5, 7, 5, 2, 5, 8, 7, 3, 3, 5]
If you really want separate A, B, C and D lists:
for k, v in df.to_dict('list').items():
    globals()[k] = v
print (A)
[5, 7, 5, 2, 5, 8, 7, 3, 3, 5]
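If the column names are known up front, plain tuple unpacking gives the same named lists without writing to globals(), which only has this effect at module level and hides the assignments from readers and linters. A minimal sketch, assuming exactly four columns and using made-up data:

```python
import pandas as pd

# Illustrative data with the same column names as above.
df = pd.DataFrame({"A": [5, 7, 5], "B": [9, 1, 2],
                   "C": [4, 8, 4], "D": [5, 3, 2]})

# Dictionary of lists, keyed by column name.
d = df.to_dict("list")

# Unpack in column order; this raises ValueError if df does not have exactly four columns.
A, B, C, D = (d[name] for name in df.columns)
```

The explicit names on the left-hand side make it obvious which variables exist, at the cost of having to list them by hand.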
Upvotes: 1