SG9

Reputation: 231

Creating a sub dataframe from a dataframe based on a column value

I want to create a DataFrame that includes only av_rating for the title "Law & Order: Special Victims Unit" from the following CSV file. How can I do that? The code below gives me the same data as subset instead of only the rows for the title "Law & Order: Special Victims Unit".

import pandas as pd
ratings = pd.read_csv("http://becomingvisual.com/python4data/tv.csv")
ratings
subset = ratings[["title", "av_rating"]]
subset
for i in subset["title"]:
  if i == "Law & Order: Special Victims Unit":  
    subset_1 = subset[["title", "av_rating"]]
subset_1 

Upvotes: 0

Views: 58

Answers (2)

skchandra

Reputation: 99

@JGM's answer shows the right approach for the OP's question. The key point is that the filter must compare the title column, not the av_rating column.


ratings[ratings['title'] == 'Law & Order: Special Victims Unit'][['title','av_rating']]

The .loc method can also be used to filter:


ratings.loc[ratings['title'] == 'Law & Order: Special Victims Unit',  ['title','av_rating'] ]

ratings['title'] == 'Law & Order: Special Victims Unit' is the row filter (a boolean mask aligned to the index)

['title','av_rating'] is the column selector
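Here is a minimal sketch of both forms on a small hand-made frame instead of the real CSV. The rows, the rating values, and the extra season column are invented for illustration; only the column names title and av_rating come from the question.

```python
import pandas as pd

# Hypothetical stand-in for tv.csv; all values below are made up.
ratings = pd.DataFrame({
    "title": [
        "Law & Order: Special Victims Unit",
        "Some Other Show",
        "Law & Order: Special Victims Unit",
    ],
    "season": [1, 1, 2],  # hypothetical extra column, dropped by the selector
    "av_rating": [8.1, 7.4, 8.3],
})

target = "Law & Order: Special Victims Unit"

# Chained form: filter the rows first, then pick the columns.
chained = ratings[ratings["title"] == target][["title", "av_rating"]]

# .loc form: row filter and column selector in a single indexing step.
via_loc = ratings.loc[ratings["title"] == target, ["title", "av_rating"]]

print(via_loc)
print(chained.equals(via_loc))
```

Both forms should return the same rows and columns. The .loc form does it in one indexing operation, which is usually the safer choice if you later want to assign into the result.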

Upvotes: 1

JGM

Reputation: 1

First you need to create a filter like this:

filter = ratings.title == 'Law & Order: Special Victims Unit'

The filter is a boolean Series that looks something like:

filter = [True, True, False, True, ... , True, False]

True occurs where ratings.title == 'Law & Order: Special Victims Unit' and False occurs where ratings.title != 'Law & Order: Special Victims Unit', naturally.

Then use the filter to select the rows that you want:

final_data = ratings[filter]

The variable final_data contains the old data ONLY where filter == True (ratings.title == 'Law & Order: Special Victims Unit'). Now, to select only title and av_rating:

final_data_titles = final_data[["title", "av_rating"]]
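The same steps as a runnable sketch, assuming a tiny made-up frame in place of the CSV (the rows and ratings are invented). The variable is called mask here only to avoid shadowing Python's built-in filter; that name is my choice, not part of the answer.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
ratings = pd.DataFrame({
    "title": [
        "Law & Order: Special Victims Unit",
        "Some Other Show",
        "Law & Order: Special Victims Unit",
    ],
    "av_rating": [8.1, 7.4, 8.3],
})

# Step 1: build the boolean mask, one True/False per row.
mask = ratings["title"] == "Law & Order: Special Victims Unit"
print(mask.tolist())  # [True, False, True]

# Step 2: keep only the rows where the mask is True.
final_data = ratings[mask]

# Step 3: keep only the columns of interest.
final_data_titles = final_data[["title", "av_rating"]]
print(final_data_titles)
```

By contrast, the loop in the question assigns subset[["title", "av_rating"]] every time a title matches, and that expression selects all rows, so nothing is ever dropped.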

You can use query too:

final_rating = ratings.query("title == 'Law & Order: Special Victims Unit'")

Then,

final_data_titles = final_rating[["title", "av_rating"]]
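A small sketch of the query route, again on invented sample rows. It passes the title in through a local variable with @, which saves you from nesting quotes inside the query string; the variable name target is just an illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
ratings = pd.DataFrame({
    "title": [
        "Law & Order: Special Victims Unit",
        "Some Other Show",
        "Law & Order: Special Victims Unit",
    ],
    "av_rating": [8.1, 7.4, 8.3],
})

target = "Law & Order: Special Victims Unit"

# @target refers to the local Python variable inside the query expression.
final_rating = ratings.query("title == @target")

# Keep only the two columns of interest.
final_data_titles = final_rating[["title", "av_rating"]]
print(final_data_titles)
```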

Upvotes: 0
