the_learning_child

Reputation: 111

How to identify whether an element in a column is an integer or a string?

I am trying to identify whether each element in a pandas Series is an integer or a string. The dtype of this column is object.

transaction id
654656
546466
654646
844886
C846464
C384448
C468788
C873316

The elements with a C prefix are strings and the rest are integers.

I tried to use if/else, but I got an error:

for n in data_clean['transaction id']:
    if data_clean['transaction id'].is_integer():
        data_clean['transaction status'] = 1
    elif data_clean['transaction id'].is_str():
        data_clean['transaction status'] = 0

I expect the output to be a new column containing "Ordered" if the element is an integer and "Cancelled" if it is a string.

Upvotes: 0

Views: 209

Answers (4)

Bo Reppen

Reputation: 21

You can use np.where to pick between two choices based on a condition. Where the transaction ID is numeric we put "Ordered", otherwise we put "Cancelled". I'm assuming there are only those two cases; if there are more, you can define a list of conditions and a corresponding list of choices.

df['transaction status'] = np.where(df['transaction id'].str.isnumeric().astype(int), 'Ordered', 'Cancelled')

Output:

  transaction id transaction status
0         654656            Ordered
1         546466            Ordered
2         654646            Ordered
3         844886            Ordered
4        C846464          Cancelled
5        C384448          Cancelled
6        C468788          Cancelled
7        C873316          Cancelled
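
If there do turn out to be more than two cases, np.select takes a list of conditions and a matching list of choices. This is only a rough sketch: the "X000001" row and the "Unknown" fallback label are made up for illustration and are not in the question's data.

```python
import numpy as np
import pandas as pd

# Small sample; "X000001" is a hypothetical ID added to show the fallback case
df = pd.DataFrame(
    {"transaction id": ["654656", "546466", "C846464", "C384448", "X000001"]}
)

ids = df["transaction id"]

# Conditions are checked in order; the first one that matches wins
conditions = [
    ids.str.isnumeric(),      # digits only
    ids.str.startswith("C"),  # C prefix
]
choices = ["Ordered", "Cancelled"]

# "Unknown" is an assumed label for anything matching neither condition
df["transaction status"] = np.select(conditions, choices, default="Unknown")
print(df)
```

With only two cases, np.where as above is simpler.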

Upvotes: 0

user11553043

Reputation:

Maybe something like this for each iteration in your for loop:

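# n is the current element from `for n in data_clean['transaction id']`;
# check the element itself, not the whole column (which is always a Series)
if isinstance(n, int):
    X = 1
else:
    X = 0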

Upvotes: 0

the_learning_child

Reputation: 111

data_clean['transaction status'] = pd.notna(pd.to_numeric(data_clean['transaction id'], errors='coerce')).astype(int)

First, pd.to_numeric tries to convert the column to a numeric dtype. The rows for cancelled transactions contain strings, so they fail to parse. Setting errors='coerce' gives NaN for those rows instead of raising.

Second, with pd.notna, NaNs get set to False and numbers get set to True.

Third, astype(int) converts True/False to 1/0.
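
Here are the same three steps split out so each intermediate result can be inspected. It's a sketch on a small sample of the data; the "status label" column name is my own, added only to show how to get the "Ordered"/"Cancelled" text the question asks for.

```python
import pandas as pd

# Small sample of the data from the question
data_clean = pd.DataFrame(
    {"transaction id": ["654656", "546466", "C846464", "C384448"]}
)

# Step 1: values that cannot be parsed as numbers become NaN
numeric = pd.to_numeric(data_clean["transaction id"], errors="coerce")

# Step 2: True where parsing succeeded, False where it gave NaN
is_number = pd.notna(numeric)

# Step 3: True/False -> 1/0
data_clean["transaction status"] = is_number.astype(int)

# Optional: text labels instead of 1/0 ("status label" is a made-up column name)
data_clean["status label"] = is_number.map({True: "Ordered", False: "Cancelled"})
print(data_clean)
```

One caveat: pd.to_numeric also accepts values such as "12.5", so this checks "parses as a number" rather than "is an integer". That makes no difference for the IDs shown here.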

Upvotes: 0

Chris

Reputation: 29742

Use pandas.Series.str.isnumeric():

df['transaction status'] = df['transaction id'].str.isnumeric().astype(int)
print(df)

Output:

  transaction id  transaction status
0         654656                   1
1         546466                   1
2         654646                   1
3         844886                   1
4        C846464                   0
5        C384448                   0
6        C468788                   0
7        C873316                   0
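
Note that this assumes every element is stored as a string. If the object column actually holds real Python ints alongside the strings, the .str methods return NaN for the non-string elements (as far as I know), and astype(int) will then fail. A possible workaround, sketched on a made-up mixed column, is to cast to str first:

```python
import pandas as pd

# Assumed scenario: real ints and strings mixed in one object column
df = pd.DataFrame(
    {
        "transaction id": pd.Series(
            [654656, 546466, "C846464", "C384448"], dtype=object
        )
    }
)

# Cast every element to str so that .str.isnumeric() sees only strings
is_numeric = df["transaction id"].astype(str).str.isnumeric()

df["transaction status"] = is_numeric.astype(int)
print(df)
```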

Upvotes: 3
