Reputation: 1468
I'm working with a large set of CSV files (tables), and I need to remove the cells that contain letters and keep the numeric cells.
For example:
p1 p2 p3 p4 p5
dcf23e 2322 acc41 4212 cdefd
So in this case, I only want to remove dcf23e, acc41 and cdefd. After removing those strings, I want to keep their positions as empty cells.
How would I do this? Thanks in advance.
The code I've tried is below. It strips the non-numeric characters from a string, but the problem is that a string like 23cdgf2 becomes 232, which is not what I want. Also, when I then try to convert the strings to int for calculations, some of them have turned into decimals, because a string like 123def.24 becomes 123.24.
temp = ''.join([c for c in temp if c in '1234567890.'])  # Strip all non-numeric characters
# Now convert the strings to integers for calculations. A helper wraps int() because blank strings cannot be converted to int
def mk_int(s):
    s = s.strip()
    return int(s) if s else 0
mk_int(temp)
print(temp)
Upvotes: 1
Views: 184
Reputation: 6449
Have you tried a for loop with a try statement?
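temp = ['dcf23e', '2322', 'acc41', '4212', 'cdefd']
numeric = []
for element in temp:
    try:
        int(element)  # raises ValueError for non-numeric strings
    except ValueError:
        continue  # drop the non-numeric element
    numeric.append(element)
print(numeric)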
Or, if you want to convert the values to int and replace the non-numeric ones with 0, you can write this:
temp = ['dcf23e', '2322', 'acc41', '4212', 'cdefd']
for index, element in enumerate(temp):
    try:
        temp[index] = int(element)  # convert numeric strings to int
    except ValueError:
        temp[index] = 0  # replace non-numeric strings with 0
print(temp)
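If the cells may also hold decimals (the question mentions values like 123.24) and the positions should stay as empty cells, the same try/except idea could be sketched with float(). The helper name blank_non_numeric is made up for illustration. Note that float() also accepts strings such as 'nan', 'inf' and '1e5', which may or may not be wanted.

```python
def blank_non_numeric(cells):
    """Return a new list where cells that float() cannot parse become ''."""
    cleaned = []
    for cell in cells:
        try:
            float(cell)  # raises ValueError for strings like 'dcf23e'
        except ValueError:
            cleaned.append('')  # keep the position as an empty cell
        else:
            cleaned.append(cell)  # keep the original numeric text
    return cleaned


row = ['dcf23e', '2322', 'acc41', '4212', 'cdefd', '123.24']
print(blank_non_numeric(row))
```

Building a new list avoids modifying the list while iterating over it.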
Upvotes: 0
Reputation:
I would use a simple setup for doing quick tests.
a = 'dcf23e 2322 acc41 4212 cdefd'
cleaned_val = lambda v: v if v.isdigit() else ''
[cleaned_val(val) for val in a.split()]
This keeps each string that consists only of digits and puts an empty string in place of every other one:
['', '2322', '', '4212', '']
However, this provides the strings only. If you want to convert the values into integers (replacing the wrong ones with 0 instead), change your lambda:
convert_to_int = lambda v: int(v) if v.isdigit() else 0
[convert_to_int(val) for val in a.split()]
Your new results will be all valid integers:
[0, 2322, 0, 4212, 0]
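One caveat worth checking before relying on isdigit(): it only accepts non-empty strings made up entirely of digit characters, so decimals, negative numbers and empty strings are all rejected. A quick check, using made-up sample values:

```python
samples = ['2322', '123.24', '-17', '', '23cdgf2']

# str.isdigit() is True only for non-empty strings in which every character is a digit
results = {s: s.isdigit() for s in samples}
print(results)
```

Only '2322' passes here, so data with decimal or signed values would need a different test.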
Upvotes: 2
Reputation: 1808
Use a regex:
import re
def convert_string_to_blank(_str):
    return ['' if re.findall("[a-zA-Z]+", c) else c for c in _str.split()]
Or use isalpha:
def convert_string_to_blank(_str):
    return ['' if any(c.isalpha() for c in s) else s for s in _str.split()]
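Since the question is about CSV files, here is a minimal sketch of applying the same per-cell check with the standard csv module. It assumes a comma-delimited table with a header row. The function name blank_alpha_cells and the in-memory sample data are made up for illustration.

```python
import csv
import io


def blank_alpha_cells(src, dst):
    """Copy CSV rows from src to dst, blanking cells that contain a letter."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator='\n')
    writer.writerow(next(reader))  # copy the header row unchanged (assumes one exists)
    for row in reader:
        writer.writerow(['' if any(c.isalpha() for c in cell) else cell
                         for cell in row])


# In-memory sample standing in for a real file opened with open(..., newline='')
source = io.StringIO('p1,p2,p3,p4,p5\ndcf23e,2322,acc41,4212,cdefd\n')
target = io.StringIO()
blank_alpha_cells(source, target)
print(target.getvalue())
```

Working on file-like objects keeps the function usable with both real files and in-memory test data.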
Upvotes: 2
Reputation: 2168
Compile the regex for performance and split the string for correctness:
import re
regex = re.compile(r'.*\D+.*')
def my_parse_fun(line):
    return [regex.sub('', emt) for emt in line.split()]
Following AbhiP's answer, you can also do:
[val if val.isdigit() else '' for val in line.split()]
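If decimal values such as 123.24 should also survive, a stricter pattern with fullmatch is one option. This is only a sketch: the pattern and the name keep_numeric are assumptions, and it deliberately rejects signs and exponents.

```python
import re

# Digits with an optional fractional part, e.g. '2322' or '123.24'
NUMERIC = re.compile(r'\d+(?:\.\d+)?')


def keep_numeric(line):
    """Blank every whitespace-separated cell that is not a plain number."""
    return [cell if NUMERIC.fullmatch(cell) else '' for cell in line.split()]


print(keep_numeric('dcf23e 2322 acc41 4212 cdefd 123.24 123def.24'))
```

Because fullmatch must cover the whole cell, a mixed value like 123def.24 is blanked instead of being collapsed into 123.24.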
Upvotes: 3