kyle preston

Reputation: 37

I want to remove a `%` from entries in a column to create floats without removing other strings (pandas)

I have a df that looks something like this:

col1 col2 col3
80% 10% SP
90% 0% SP
90% 10% SP
70% SP 20%
90% SP 0%

As you can see, the values have a % sign appended to them. I would usually remove it with `pd.to_numeric()` or with `df['col2'].str.rstrip('%').astype('float') / 100`. However, I cannot do that here, because the columns also contain strings such as SP, and the conversion throws an error.
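The failure described above can be reproduced with a small sketch. It assumes the frame is built by hand from the sample rows; the names `col1_as_float` and `raised` are only illustrative.

```python
import pandas as pd

# Sample data copied from the table above
df = pd.DataFrame({
    "col1": ["80%", "90%", "90%", "70%", "90%"],
    "col2": ["10%", "0%", "10%", "SP", "SP"],
    "col3": ["SP", "SP", "SP", "20%", "0%"],
})

# col1 holds only percentages, so the usual idiom works
col1_as_float = df["col1"].str.rstrip("%").astype("float") / 100

# col2 also holds "SP", which cannot be parsed as a float
raised = False
try:
    df["col2"].str.rstrip("%").astype("float") / 100
except ValueError as exc:
    raised = True
    print(f"conversion failed: {exc}")
```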

Any ideas as to how to do this?

Upvotes: 1

Views: 269

Answers (2)

Rob Raymond

Reputation: 31146

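A regular-expression replace() works. I also call astype() so that columns containing only numbers become float64. Note that prefixing the digits with a dot only gives the right value for two-digit percentages (and 0%): 5% would become .5 and 100% would become .100.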

import io
import pandas as pd

# Sample data from the question, separated by runs of whitespace
df = pd.read_csv(io.StringIO("""col1 col2 col3
80% 10% SP
90% 0% SP
90% 10% SP
70% SP 20%
90% SP 0%"""), sep=r"\s+")

df.replace("^([0-9]+)%$", r".\1", regex=True).astype("float64", errors="ignore")

output

 col1 col2 col3
  0.8  .10   SP
  0.9   .0   SP
  0.9  .10   SP
  0.7   SP  .20
  0.9   SP   .0
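If percentages with one or three digits can occur, a variant that divides by 100 instead of rewriting the string may be safer. The following is a minimal sketch under that assumption, not part of the answer above; `percent_to_fraction` is a hypothetical helper name, and the pattern assumes non-negative values such as `5%` or `12.5%`.

```python
import pandas as pd

def percent_to_fraction(col: pd.Series) -> pd.Series:
    """Convert entries like '80%' to 0.8 and leave everything else untouched."""
    text = col.astype(str)
    # True only for entries that look like a non-negative percentage
    is_pct = text.str.fullmatch(r"\d+(?:\.\d+)?%")
    # Non-numeric entries such as 'SP' become NaN here and are masked out below
    numbers = pd.to_numeric(text.str.rstrip("%"), errors="coerce") / 100
    # Keep the converted number for percentages, the original value otherwise
    return numbers.where(is_pct, col.astype(object))

df = pd.DataFrame({
    "col1": ["80%", "90%", "90%", "70%", "90%"],
    "col2": ["10%", "0%", "10%", "SP", "SP"],
    "col3": ["SP", "SP", "SP", "20%", "0%"],
})

# Apply the helper column by column
result = df.apply(percent_to_fraction)
print(result)
```

Columns that contain only percentages come back as floats, while mixed columns keep an object dtype holding both floats and the original strings.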

Upvotes: 0

Pyd

Reputation: 6159

Using applymap (deprecated in favor of DataFrame.map as of pandas 2.1):

df.applymap(lambda x : float(x.strip('%')) / 100 if x.endswith('%') else x)
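The one-liner above calls `x.endswith` on every cell, so it would fail on non-string values such as NaN. A slightly more defensive sketch of the same element-wise idea follows; `strip_percent` and `elementwise` are illustrative names, and the `hasattr` check is only an assumption about how one might support both older and newer pandas versions.

```python
import pandas as pd

def strip_percent(value):
    """Turn '80%' into 0.8; return any other value unchanged."""
    if isinstance(value, str) and value.endswith("%"):
        return float(value.rstrip("%")) / 100
    return value

df = pd.DataFrame({
    "col1": ["80%", "90%", "90%", "70%", "90%"],
    "col2": ["10%", "0%", "10%", "SP", "SP"],
    "col3": ["SP", "SP", "SP", "20%", "0%"],
})

# Use DataFrame.map where it exists, otherwise fall back to applymap
elementwise = df.map if hasattr(df, "map") else df.applymap
result = elementwise(strip_percent)
print(result)
```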

Upvotes: 1
