Reputation: 2231
I have the following problem, I have two separate python scripts. The first converts a CSV file to a new format (dropping certain columns and removing dollar signs)
A separate scripts reads all of the converted CSV files and create a single large script from those, grouping the data by certain keys.
All of this works well enough, except for the fact that I keep running into the following problem.
One of the columns in both files Money Out sometimes contains values starting with a '-' (-235.93). I want to remove this symbol from the file.
I've put the following code in both scripts, yet I am unable to remove the symbol. The data is sometimes numeric and sometimes a string. I thought that unifying it to string and using the 'replace' method would solve the problem, yet it didnt. I also used the build in pandas method, yet to no avail.
df['Money Out'] = df['Money Out'].astype(str)
df['Money Out'] = df['Money Out'].replace('-', '')
df['Money Out'].replace('-', '', inplace=True)
Does anyone have an idea of what I might be doing wrong?
Upvotes: 0
Views: 83
Reputation: 142106
Your code:
df['Money Out'].replace('-', '', inplace=True)
is using Series.replace
which is a different thing (replaces an exact match of the column or a regex with a value) than the Series.str.replace
function (more traditional replace a substring present anywhere with another), you can use:
df['Money Out'] = df['Money Out'].str.replace('-', '')
to remove any hyphens from your text.
Upvotes: 1
Reputation: 325
Try to do this
df['Money Out'] = df.apply(lambda r: ("%s"%(r['Money Out'] or "0.0")).replace("-",""), axis=1)
Upvotes: 1