Reputation: 231
I want to create a data frame that contains only the av_rating values for the title "Law & Order: Special Victims Unit" from the following CSV file. How can I do that? The code below gives me the same data as subset instead of only the rows for that title.
import pandas as pd
ratings = pd.read_csv("http://becomingvisual.com/python4data/tv.csv")
ratings
subset = ratings[["title", "av_rating"]]
subset
for i in subset["title"]:
    if i == "Law & Order: Special Victims Unit":
        subset_1 = subset[["title", "av_rating"]]
subset_1
Upvotes: 0
Views: 58
Reputation: 99
@JGM's answer addresses the OP's question, with one small change: the filter should be applied to the title column rather than the av_rating column.
ratings[ratings['title'] == 'Law & Order: Special Victims Unit'][['title','av_rating']]
The .loc method can also be used to filter:
ratings.loc[ratings['title'] == 'Law & Order: Special Victims Unit', ['title','av_rating'] ]
Here ratings['title'] == 'Law & Order: Special Victims Unit' is the row (index) filter and ['title','av_rating'] is the column selector.
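As a minimal sketch of the .loc form, the snippet below uses a small hand-made frame instead of downloading tv.csv. The rows, the rating values and the extra season column are made up for illustration; only the title and av_rating column names come from the question.

```python
import pandas as pd

# Hypothetical stand-in for tv.csv: made-up rows, plus a made-up "season" column.
ratings = pd.DataFrame({
    "title": [
        "Law & Order: Special Victims Unit",
        "Show A",
        "Law & Order: Special Victims Unit",
        "Show B",
    ],
    "season": [1, 1, 2, 1],
    "av_rating": [8.1, 7.4, 8.3, 6.9],
})

target = "Law & Order: Special Victims Unit"

# First argument to .loc: boolean row filter; second argument: columns to keep.
subset_1 = ratings.loc[ratings["title"] == target, ["title", "av_rating"]]
print(subset_1)
```

Passing the row filter and the column list in a single .loc call avoids chaining two indexing operations.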
Upvotes: 1
Reputation: 1
First you need to create a filter like this:
filter = ratings.av_rating == 'Law & Order: Special Victims Unit'
The filter is a boolean Series, one value per row, that looks something like this:
filter = [True, True, False, True, ... , True, False]
True occurs where ratings.av_rating == 'Law & Order: Special Victims Unit' and False occurs where ratings.av_rating != 'Law & Order: Special Victims Unit', naturally.
Then, you need to use the filter, to select the values that you want:
final_data = ratings[filter]
The variable final_data contains only the rows of the original data where filter is True (ratings.av_rating == 'Law & Order: Special Victims Unit'). Now, to select only the title and av_rating columns:
final_data_titles = final_data[["title", "av_rating"]]
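The two steps above can be sketched on a tiny hand-made frame to see what the boolean filter actually holds. This is only an illustration: the rows are made up, the variable is called mask here to avoid shadowing Python's built-in filter, and the comparison is assumed to be on the title column, which is what the question asks for.

```python
import pandas as pd

# Hypothetical stand-in for tv.csv with made-up rows.
ratings = pd.DataFrame({
    "title": [
        "Law & Order: Special Victims Unit",
        "Show A",
        "Law & Order: Special Victims Unit",
        "Show B",
    ],
    "av_rating": [8.1, 7.4, 8.3, 6.9],
})

# Step 1: build the boolean mask, one True/False per row.
mask = ratings["title"] == "Law & Order: Special Victims Unit"
print(mask.tolist())  # [True, False, True, False]
print(mask.sum())     # 2 matching rows

# Step 2: keep the rows where the mask is True, then pick the columns.
final_data = ratings[mask]
final_data_titles = final_data[["title", "av_rating"]]
print(final_data_titles)
```

Checking mask.sum() is a quick way to confirm that the comparison matched any rows before using it.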
You can use query too:
final_rating = ratings.query("av_rating == 'Law & Order: Special Victims Unit'")
Then,
final_data_titles = final_rating[["title", "av_rating"]]
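A minimal sketch of the query route, again on made-up rows and assuming the comparison is on the title column as the question asks. The target variable is introduced here only for illustration; it is passed into the expression with the @ prefix, so the title does not need to be quoted inside the query string.

```python
import pandas as pd

# Hypothetical stand-in for tv.csv with made-up rows.
ratings = pd.DataFrame({
    "title": [
        "Law & Order: Special Victims Unit",
        "Show A",
        "Law & Order: Special Victims Unit",
        "Show B",
    ],
    "av_rating": [8.1, 7.4, 8.3, 6.9],
})

target = "Law & Order: Special Victims Unit"

# @target refers to the local Python variable of the same name.
final_rating = ratings.query("title == @target")
final_data_titles = final_rating[["title", "av_rating"]]
print(final_data_titles)
```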
Upvotes: 0