John

Reputation: 51

In Python pandas, how do I match column elements on their first 2 digits?

I am fairly new to Python. I am trying to build a function that looks at the first 2 digits of each element in a column and, on a match, writes the result to a new column such as Region.

For example,

   Adres  AreaCode Region
0  SArea    123191      A
1  BArea    122929      A
2  AArea    132222      B

I want the function to look at just the first 2 digits of AreaCode and fill a new Region column that classifies each row by those digits. In this case, 12 would give me A and 13 would give me B.

I already tried this

df.loc[df.AreaCode == 123191, 'Region'] = 'A'

This worked when matching the entire AreaCode, but I have no idea how to modify it so that I can match on the first 2 digits only.

I also tried this:

df.loc[df.AreaCode.str.contains == 12, 'Region' ] = 'A' 

but it gives me the error:

AttributeError: Can only use .str accessor with string values,
                which use np.object_ dtype in pandas

How do I fix this? Thanks a lot for helping!

Upvotes: 3

Views: 138

Answers (5)

Sreeram TP

Reputation: 11907

First convert the data type to str like this

df.AreaCode = df.AreaCode.astype('str')

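Then check whether the code begins with the number like this (note the .str accessor, which a Series needs for string methods)

df.loc[df.AreaCode.str.startswith('12'), 'Region' ] = 'A'

Assuming you need NaN in the rows which don't start with 12, you can do a map like this (it needs import numpy as np)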

df['Region'] = df['AreaCode'].map(lambda x : 'A' if x.startswith('12') else np.nan )
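If several prefixes have to be classified at once, one option is to slice the first two characters and look them up in a dict. This is only a sketch: the sample frame is copied from the question, and the name prefix_to_region and its contents are assumptions made for illustration.

```python
import pandas as pd

# Sample data from the question; AreaCode is stored as integers.
df = pd.DataFrame({
    "Adres": ["SArea", "BArea", "AArea"],
    "AreaCode": [123191, 122929, 132222],
})

# Hypothetical lookup table: first two digits -> region label.
prefix_to_region = {"12": "A", "13": "B"}

# Cast to str, slice the first two characters, then look each prefix up.
# Prefixes that are missing from the dict end up as NaN.
df["Region"] = df["AreaCode"].astype(str).str[:2].map(prefix_to_region)
print(df)
```

With this approach, adding another region only means adding another entry to the dict.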

Upvotes: 0

Rhae

Reputation: 94

See if this helps -

First convert Area code column dtype to string with

df.AreaCode = df.AreaCode.astype(str)

Then filter on the first characters of the column. The line below selects the rows whose code starts with 12 and whose Region is already A

df.loc[(df.AreaCode.str.startswith('12')) & (df.Region=='A')]

Upvotes: 2

mirmo

Reputation: 376

Try this

df.loc[df.AreaCode.astype(str).str.startswith("12") == True, 'Region' ]

The line below gives you a Series with True/False for each row, which becomes the filter for the DataFrame.

df.AreaCode.astype(str).str.startswith("12")

Passing that boolean Series to df.loc makes it a filter; the == True comparison is optional.
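To see what the boolean Series looks like and how it filters, here is a small sketch using the sample data from the question (the variable names mask and matches are just illustrative).

```python
import pandas as pd

# Sample data from the question; AreaCode is stored as integers.
df = pd.DataFrame({
    "Adres": ["SArea", "BArea", "AArea"],
    "AreaCode": [123191, 122929, 132222],
})

# Boolean Series: True where the code, read as text, starts with "12".
mask = df["AreaCode"].astype(str).str.startswith("12")

# Passing the mask to .loc keeps only the matching rows.
matches = df.loc[mask]
print(mask.tolist())
print(matches)
```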

Upvotes: 2

w-m

Reputation: 11232

I tried this df.loc[df.AreaCode.str.contains == 12, 'Region' ] = 'A' but it gives me the error: AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas

You could simply convert it to a string, then use the same code:

df.loc[df.AreaCode.astype(str).str.startswith('12'), 'Region' ] = 'A'
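Put together as a runnable sketch for both regions from the question (the frame is rebuilt from the sample table, and the list of prefix/region pairs is an assumption for illustration):

```python
import pandas as pd

# Sample data from the question; AreaCode is stored as integers.
df = pd.DataFrame({
    "Adres": ["SArea", "BArea", "AArea"],
    "AreaCode": [123191, 122929, 132222],
})

# Convert once so the .str accessor is available.
codes = df["AreaCode"].astype(str)

# Assign a region label to every row whose code starts with the prefix.
for prefix, region in [("12", "A"), ("13", "B")]:
    df.loc[codes.str.startswith(prefix), "Region"] = region

print(df)
```

Rows that match none of the prefixes are left as NaN in Region.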

Upvotes: 2

Mammu yedukondalu

Reputation: 156

This should work, provided AreaCode is already a string column; if it holds integers, convert it with astype(str) first.

df.loc[df.AreaCode.str.startswith('12'), 'Region' ] = 'A'

Upvotes: 1
