Vedant Tripathi

Reputation: 55

Replacing/Selecting values in columns using loc. Pandas

I am using the label-based indexer loc to find all the rows where the value is "UN" in each column of a list called "columns". However, with the code below, loc seems to stop as soon as it does not find "UN" at the first index, and it returns only the first row.

columns=["median","age","capital"]  # this is the list of columns

recent_grads is my DataFrame.

for column in columns:
    recent_grads.loc[0:172 == 'UN',column]

This is the 'median' column:

recent_grads["median"]

0        NaN
1      75000
2      73000
3      70000
4      65000
5      65000
6         UN
7      62000
8      60000
9      60000
10     60000
11     60000
12     60000
13     60000
14     58000
15     57100
16     57000
17     56000
18     54000
19     54000
20     53000
21     53000
22     52000
23     52000
24     51000
25     50000
26     50000
27     50000
28     50000
29     50000
       ...  
143    32000
144    32000
145    31500
146    31000
147    31000
148    31000
149    30500
150    30000
151    30000
152    30000
153    30000
154    30000
155    30000
156    30000
157    30000
158    29000
159    29000
160    29000
161    29000
162    28000
163    28000
164    28000
165    27500
166    27000
167    27000
168    26000
169    25000
170    25000
171    23400
172    22000
Name: median, Length: 173, dtype: object

And this is the output of my code:

recent_grads.loc[0:172 == 'UN',"median"]

output:

0    NaN
Name: median, dtype: object

When I choose a different starting index:

recent_grads.loc[3:172 == ['UN'],"median"]

the output is different:

Series([], Name: median, dtype: object)

Upvotes: 1

Views: 1451

Answers (2)

knh190

Reputation: 2882

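I think you want to search for 'UN' in a column within the first 172 records. If so:

# returns a DataFrame with the matching rows
first = df.head(172)
first[first[column] == 'UN']

Note: DataFrame.filter selects rows or columns by label, not by value, so it cannot take a boolean mask. Docs: https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.filter.html#pandas.DataFrame.filter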

Update:

If you want to use loc, easy:

df.head(172).loc[df[column] == 'UN']

Unlike the accepted answer, this does not convert the result to a list, which would create a new object and could use more memory when your data is large. Staying with the native DataFrame methods is therefore likely to be more efficient.
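
As a side note on why the original expression returned only row 0: Python evaluates the comparison before it builds the slice, so `0:172 == 'UN'` is read as `0:(172 == 'UN')`, that is, `0:False`. Since `False` equals 0 and loc slices include the end label, only row 0 comes back, and with a start of 3 the slice is empty. The small helper class below is hypothetical and exists only to show which key an indexer such as loc would receive; it is a sketch, not part of pandas.

```python
class ShowKey:
    """Hypothetical helper: returns whatever is written inside the brackets."""

    def __getitem__(self, key):
        return key


show = ShowKey()

# Same bracket contents as in the question.
row_key, col_key = show[0:172 == 'UN', "median"]
print(row_key)   # slice(0, False, None)
print(col_key)   # median

# The variant with a different start and a list on the right-hand side.
row_key_3, _ = show[3:172 == ['UN'], "median"]
print(row_key_3)  # slice(3, False, None)
```

In both cases no column value is ever compared with 'UN'; the comparison involves only the integer 172.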

Upvotes: 1

anky

Reputation: 75080

If you want the index labels of the rows containing 'UN', use:

list_of_index=list(recent_grads[recent_grads['median'].str.contains('UN',na=False)].index)

or:

list_of_index = list(recent_grads.loc[recent_grads['median']=='UN'].index)

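where:

recent_grads.loc[recent_grads['median']=='UN']

selects the rows whose 'median' value is exactly 'UN' (str.contains in the first variant also matches values that merely contain 'UN').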
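
To apply the same boolean mask to every column in the question's list, a loop along these lines should work. This is a minimal sketch: the toy DataFrame and its values are made up for illustration, and `un_rows` is a hypothetical name for the collected result.

```python
import pandas as pd

# Toy stand-in for recent_grads; the values are invented for this example.
recent_grads = pd.DataFrame({
    "median": [None, "75000", "UN", "62000"],
    "age": ["UN", "30", "41", "UN"],
    "capital": ["10", "UN", "UN", "5"],
})
columns = ["median", "age", "capital"]

# For each column, build a boolean mask and collect the matching index labels.
un_rows = {}
for column in columns:
    mask = recent_grads[column] == "UN"
    un_rows[column] = recent_grads.loc[mask, column].index.tolist()

print(un_rows)
# {'median': [2], 'age': [0, 3], 'capital': [1, 2]}
```

The mask is computed from the column's values, so every row is checked, and missing values simply compare as not equal to 'UN'.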

Upvotes: 1
