cJc

Reputation: 863

How to make separate lists out of multiple dataframe columns?

Yes, this has been discussed a lot, and similar questions have been downvoted multiple times, but I still can't figure this one out.

Say I have a dataframe like this:

df = pd.DataFrame(np.random.randint(0,100,size=(100, 4)), columns=list('ABCD'))

I want to end up with four separate lists (a, b, c and d), each holding the data from one column.

Logically (to me anyway) I would do:

list_of_lst = df.values.T.astype(str).tolist()

for column in df.columns:
    i = 0
    while i < len(df.columns) - 1:
        column = list_of_lst[1]
        i = i + 1

But assigning variable names in a loop is not doable/recommended...

Any suggestions how I can get what I need?

Upvotes: 2

Views: 38

Answers (3)

jpp

Reputation: 164773

You can transpose your dataframe and call df.T.values.tolist(). However, if you are going to manipulate the data as numeric arrays afterwards, it's advisable to skip the tolist() part and keep the NumPy array.

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(5, 4)), columns=list('ABCD'))

#     A   B   C   D
# 0  17  56  57  31
# 1   3  44  15   0
# 2  94  36  87  30
# 3  44  49  56  76
# 4  29   5  35  24

list_of_lists = df.T.values.tolist()

# [[17, 3, 94, 44, 29],
#  [56, 44, 36, 49, 5],
#  [57, 15, 87, 56, 35],
#  [31, 0, 30, 76, 24]]
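
If you want four separate names rather than one nested list, ordinary tuple unpacking is one option. This is only a minimal sketch: it assumes the frame has exactly four columns, and it writes out the example values above by hand instead of using random data so the result is reproducible.

```python
import pandas as pd

# Same values as the 5x4 example frame above, typed out explicitly.
df = pd.DataFrame({
    "A": [17, 3, 94, 44, 29],
    "B": [56, 44, 36, 49, 5],
    "C": [57, 15, 87, 56, 35],
    "D": [31, 0, 30, 76, 24],
})

# Transposing puts one column per row, so tolist() yields one list per column.
# Unpacking raises ValueError if the number of columns is not exactly four.
a, b, c, d = df.T.values.tolist()

print(a)  # [17, 3, 94, 44, 29]
print(d)  # [31, 0, 30, 76, 24]
```

This avoids creating variable names dynamically, at the cost of hard-coding the number of columns.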

Upvotes: 0

Thalish Sajeed

Reputation: 1351

retList = dict()
for i in df.columns:
    iterator = df[i].tolist()
    retList[i] = iterator

You'd get a dictionary with the keys as the column names and values as the list of values in that column.

Modify it to any data structure you want.

list(retList.values()) will give you a list of length 4, with each inner list holding the values of one column. (In Python 3, retList.values() on its own returns a view object rather than a list.)
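
The same idea can be written more compactly as a dict comprehension. This is a sketch on a small made-up two-column frame; the names columns_as_lists and list_of_lists are just for illustration.

```python
import pandas as pd

# Small hypothetical frame, only to keep the output easy to check.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# One dictionary entry per column: column name -> list of that column's values.
columns_as_lists = {col: df[col].tolist() for col in df.columns}

# Wrap the values view in list() to get an actual list of lists.
list_of_lists = list(columns_as_lists.values())

print(columns_as_lists)  # {'A': [1, 2, 3], 'B': [4, 5, 6]}
print(list_of_lists)     # [[1, 2, 3], [4, 5, 6]]
```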

Upvotes: 0

jezrael

Reputation: 863166

I think the best approach is to create a dictionary of lists with DataFrame.to_dict:

np.random.seed(456) 

df = pd.DataFrame(np.random.randint(0,10,size=(10, 4)), columns=list('ABCD'))
print (df)
   A  B  C  D
0  5  9  4  5
1  7  1  8  3
2  5  2  4  2
3  2  8  4  8
4  5  6  0  9
5  8  2  3  6
6  7  0  0  3
7  3  5  6  6
8  3  8  9  6
9  5  1  6  1

d = df.to_dict('list')
print (d['A'])
[5, 7, 5, 2, 5, 8, 7, 3, 3, 5]

If you really want separate A, B, C and D lists:

# Writing into globals() works at module level, but it is generally discouraged.
for k, v in df.to_dict('list').items():
    globals()[k] = v

print (A)
[5, 7, 5, 2, 5, 8, 7, 3, 3, 5]
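
If the column names are known in advance, you can get the same four names without touching globals(). A rough sketch, assuming the columns are exactly A, B, C and D and using only the first three rows of the frame above to keep it short:

```python
from operator import itemgetter

import pandas as pd

# First three rows of the example frame above, typed out explicitly.
df = pd.DataFrame({
    "A": [5, 7, 5],
    "B": [9, 1, 2],
    "C": [4, 8, 4],
    "D": [5, 3, 2],
})

d = df.to_dict(orient="list")

# itemgetter with several keys returns a tuple of the matching values,
# which is then unpacked into ordinary local names.
A, B, C, D = itemgetter("A", "B", "C", "D")(d)

print(A)  # [5, 7, 5]
print(D)  # [5, 3, 2]
```

The trade-off is that the names are spelled out by hand, which makes it obvious to a reader where A, B, C and D come from.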

Upvotes: 1
