Jessi

Reputation: 1468

How do I remove strings that contain letters in Python?

I'm working with a large set of CSV files (tables), and I need to blank out the cells that contain letters while keeping the numeric cells.

For example:

   p1     p2      p3       p4      p5
 dcf23e   2322   acc41   4212     cdefd

So in this case, I only want to remove dcf23e, acc41 and cdefd. After removing those strings, I want to keep their positions as empty cells.

How would I do this? Thanks in advance.

Here is the code I've tried. It strips the non-numeric characters from a string, but the problem is that a string like 23cdgf2 becomes 232, which is not what I want. Also, when I then try to convert the strings to int for calculations, some of them have become decimals, because a string like 123def.24 turns into 123.24.

temp = ''.join([c for c in temp if c in '1234567890.']) # Strip all non-numeric characters
# Now convert the string to an integer for calculations.
# A helper is used because a blank string cannot be passed to int().
def mk_int(s):
    s = s.strip()
    return int(s) if s else 0
mk_int(temp)
print(temp)

Upvotes: 1

Views: 184

Answers (4)

Liam

Reputation: 6449

Have you tried a for loop with a try statement?

temp = ['dcf23e', '2322', 'acc41', '4212', 'cdefd']
kept = []
for element in temp:
    try:
        int(element)  # raises ValueError for non-numeric strings
    except ValueError:
        continue  # drop this element
    kept.append(element)
print(kept)

Or, if you want to convert each value to an int (using 0 for the non-numeric ones), you can write this:

temp = ['dcf23e', '2322', 'acc41', '4212', 'cdefd']
for index, element in enumerate(temp):
    try:
        temp[index] = int(element)  # convert numeric strings in place
    except ValueError:
        temp[index] = 0  # non-numeric strings become 0
print(temp)
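The same try/except idea can also be wrapped in a small helper that leaves an empty cell behind, which is closer to what the question asks for. This is only a minimal sketch, assuming the cells are already split into a list of strings; `blank_non_int` is a hypothetical name used for illustration.

```python
def blank_non_int(cell):
    """Return the cell unchanged if it parses as an int, else an empty string."""
    try:
        int(cell)  # int() raises ValueError for strings such as 'dcf23e'
    except ValueError:
        return ''
    return cell


row = ['dcf23e', '2322', 'acc41', '4212', 'cdefd']
cleaned = [blank_non_int(cell) for cell in row]
print(cleaned)  # ['', '2322', '', '4212', '']
```

Note that int() also accepts a leading sign, so a cell like '-5' is kept, while a decimal such as '123.24' is blanked.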

Upvotes: 0

user142650

Reputation:

I would use a simple setup for doing quick tests.

a = 'dcf23e   2322   acc41   4212     cdefd'
cleaned_val = lambda v: v if v.isdigit() else ''
[cleaned_val(val) for val in a.split()]

This keeps each string that is a valid number and puts an empty string in place of the others:

['', '2322', '', '4212', '']

However, this gives you strings only. If you want to convert the values to integers (replacing the non-numeric ones with 0 instead), change your lambda:

convert_to_int = lambda v: int(v) if v.isdigit() else 0

[convert_to_int(val) for val in a.split()]

Your new results will be all valid integers:

[0, 2322, 0, 4212, 0]
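Since the question is about CSV files, the same isdigit() check could be applied cell by cell with the standard csv module. This is only a sketch under some assumptions: the inline sample text stands in for a real file, the first row is assumed to be a header, and `clean_rows` is a hypothetical name.

```python
import csv
import io


def clean_rows(lines):
    """Yield rows with every non-digit cell replaced by an empty string."""
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:  # empty input: nothing to yield
        return
    yield header  # assumed header row, passed through unchanged
    for row in reader:
        yield [cell if cell.strip().isdigit() else '' for cell in row]


# Inline sample standing in for a real file (hypothetical layout).
sample = io.StringIO("p1,p2,p3,p4,p5\ndcf23e,2322,acc41,4212,cdefd\n")
cleaned = list(clean_rows(sample))
print(cleaned[1])  # ['', '2322', '', '4212', '']
```

With a real file, the StringIO object would be replaced by a file opened with newline='', as the csv module documentation recommends.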

Upvotes: 2

Sinux

Reputation: 1808

Use a regex:

import re
def convert_string_to_blank(_str):
    return ['' if re.findall("[a-zA-Z]+", c) else c for c in _str.split()]

Or use isalpha:

def convert_string_to_blank(_str):
    return ['' if any(c.isalpha() for c in s) else s for s in _str.split()]
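One thing worth checking before choosing: the letter test here is not equivalent to a str.isdigit() test. A value with no letters, such as a decimal or a negative number, survives the isalpha check but is blanked by isdigit. A small sketch of the difference, with hypothetical helper names:

```python
def keep_if_digits(s):
    # Keeps only strings made up entirely of digits.
    return s if s.isdigit() else ''


def keep_if_no_letters(s):
    # Keeps any string that contains no alphabetic character.
    return '' if any(c.isalpha() for c in s) else s


for value in ['2322', '123.24', '-7', '23cdgf2']:
    print(value, repr(keep_if_digits(value)), repr(keep_if_no_letters(value)))
```

If the cells are later passed to int(), the stricter isdigit() behaviour is probably the safer choice, since '123.24' would otherwise raise a ValueError.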

Upvotes: 2

Patrick the Cat

Reputation: 2168

Compile the regex for performance, and split the string for correctness:

import re
regex = re.compile(r'.*\D+.*')
def my_parse_fun(line):
    return [regex.sub('', emt) for emt in line.split()]

From AbhiP's answer, you can also do:

[val if val.isdigit() else '' for val in line.split()]

Upvotes: 3
