Reputation: 7994
Here is my dataframe, which is called df:
University  Subject  Colour
Melb        Math     Red
            English  Blue
Sydney      Math     Green
            Arts     Yellow
            English  Green
Ottawa      Med      Blue
            Math     Yellow
Both University and Subject are index levels of this dataframe.
When I do this
print(df.to_dict('index'))
I get
{(Melb, Math): {'Colour': Red}, (Melb, English): {'Colour': Blue}, ...
When I do this
print(df["Colour"])
I get this
University  Subject  Colour
Melb        Math     Red
            English  Blue
Sydney      Math     Green
            Arts     Yellow
            English  Green
Ottawa      Med      Blue
            Math     Yellow
When I do
print(df["University"])
I get an error
KeyError: 'University'
What I want is a way to read each value separately:
one read for the University, another for the Subject, and a third for the Colour.
How can I do that?
Upvotes: 1
Views: 292
Reputation: 484
A quicker way to do this is to use Python's built-in zip function, which is usually faster than manually running a for loop over the index.
index_list = list(zip(*df.index)) #transpose the index once: one tuple per index level
university_list = index_list[0]
subject_list = index_list[1]
colour_list = list(df['Colour'])
print(index_list)
Output:
[('Melb','Melb','Sydney',...),('Math','English','Math','Arts',...)]
You will get a list of tuples, where each tuple corresponds to one index column and has one entry per row (so repeated values such as 'Melb' appear more than once).
(The tuples are in left-to-right order: the 1st index column is the first tuple, the 2nd index column is the second tuple, and so on.)
Now, to get a separate list for each index column, you can simply do:
Universities = list(index_list[0]) #separate list for University: ['Melb','Melb','Sydney',...]
Subjects = list(index_list[1]) #separate list for Subject: ['Math','English','Math','Arts',...]
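For anyone who wants to try this end to end, here is a minimal sketch. The DataFrame construction is my own reconstruction of the table in the question (the asker did not post the code that built it), and the variable names are only illustrative. It also shows `get_level_values`, which looks a level up by name instead of by position.

```python
import pandas as pd

# Assumption: rebuild the sample frame from the values shown in the question.
df = pd.DataFrame(
    {
        "University": ["Melb", "Melb", "Sydney", "Sydney", "Sydney", "Ottawa", "Ottawa"],
        "Subject": ["Math", "English", "Math", "Arts", "English", "Med", "Math"],
        "Colour": ["Red", "Blue", "Green", "Yellow", "Green", "Blue", "Yellow"],
    }
).set_index(["University", "Subject"])

# zip(*df.index) transposes the (University, Subject) tuples
# into one tuple per index level, with one entry per row.
universities, subjects = (list(level) for level in zip(*df.index))

# The same values, selected by level name rather than by position.
universities_by_name = df.index.get_level_values("University").tolist()

# Distinct universities, in order of first appearance.
unique_universities = df.index.get_level_values("University").unique().tolist()

print(universities)
print(subjects)
print(unique_universities)
```

Note that the per-level lists repeat a value for every row it covers; call `.unique()` on the level if you only want the distinct names.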
For a column that is not part of the index, you can simply do:
column_data = list(df['column_name'])
#which in your case will be
colour_list = list(df['Colour'])
Now, imagine a case where you need the whole dataframe as a list of tuples, where each tuple holds the data of one column (index columns included).
The list will look something like:
[(Col-1_data, ,...),(Col-2_data, ,...),...]
To achieve this, you have to reset the index, fetch the data, and then set the index again. The code below does that:
index_names = list(df.index.names) #save the current index names so that we can reassign them later
df.reset_index(inplace = True)
dataframe_raw_list = df.values.tolist() #a list of lists, where each inner list is one row of the dataframe
df.set_index(index_names, inplace = True)
dataframe_columns_list = list(zip(*dataframe_raw_list)) #a list of tuples, where each tuple is one column of the dataframe
Output:
[(Col-1_data, ,...),(Col-2_data, ,...),...]
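If you would rather not modify df in place, a possible variant is to work on the copy that `reset_index()` returns. This is only a sketch: the sample frame is my reconstruction of the question's table, and `flat`, `rows`, `columns` and `by_name` are made-up names.

```python
import pandas as pd

# Assumption: rebuild the sample frame from the values shown in the question.
df = pd.DataFrame(
    {
        "University": ["Melb", "Melb", "Sydney", "Sydney", "Sydney", "Ottawa", "Ottawa"],
        "Subject": ["Math", "English", "Math", "Arts", "English", "Med", "Math"],
        "Colour": ["Red", "Blue", "Green", "Yellow", "Green", "Blue", "Yellow"],
    }
).set_index(["University", "Subject"])

# reset_index() without inplace=True returns a new frame,
# so df itself keeps its (University, Subject) index.
flat = df.reset_index()

rows = flat.values.tolist()     # one inner list per row
columns = list(zip(*rows))      # one tuple per column, index columns first
by_name = flat.to_dict("list")  # column name -> list of values

print(columns[0])               # all University values
print(by_name["Colour"])        # all Colour values
```

The dictionary form can be handier than the positional tuples, because each column is looked up by its name.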
Upvotes: 6
Reputation: 10624
You can get the first level of your index ('University') with this:
[i[0] for i in df.index]
Similarly, for the second level ('Subject'):
[i[1] for i in df.index]
Also, if you just want the values of the 'Colour' column, you can get them with:
list(df['Colour'])
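If the goal is to read the three values together, row by row, one option is to iterate over the 'Colour' column with `items()`, which yields each index tuple along with its value. A small sketch, assuming the frame looks like the one in the question (the construction below and the name `records` are mine):

```python
import pandas as pd

# Assumption: rebuild the sample frame from the values shown in the question.
df = pd.DataFrame(
    {
        "University": ["Melb", "Melb", "Sydney", "Sydney", "Sydney", "Ottawa", "Ottawa"],
        "Subject": ["Math", "English", "Math", "Arts", "English", "Med", "Math"],
        "Colour": ["Red", "Blue", "Green", "Yellow", "Green", "Blue", "Yellow"],
    }
).set_index(["University", "Subject"])

records = []
# With a two-level index, each key is a (University, Subject) tuple.
for (university, subject), colour in df["Colour"].items():
    records.append((university, subject, colour))
    print(university, subject, colour)
```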
Upvotes: 1