Reputation: 23
I have a variety of values in a text field of a CSV.
Some values look like this: AGM00BALDWIN, AGM00BOUCK.
However, some are duplicates, which changes the names to AGM00BOUCK01, AGM00COBDEN01, AGM00COBDEN02.
My goal is to write a specific ID to values NOT containing a numeric suffix.
Here is the code so far:
prov_count = 3000
prov_ID = 0
items = (name, x, y)
xy_tup = tuple(items)
if "*1" not in name and "*2" not in name:
    prov_ID = prov_count + 1
else:
    prov_ID = ""
It seems that the wildcard isn't the right approach here, but I can't find an appropriate solution.
Upvotes: 2
Views: 94
Reputation: 2575
There are different ways to do it. One uses the isdigit function:
a = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]
for i in a:
    if i[-1].isdigit():  # can check i[-1] and i[-2] to require two digits
        print(i)
Or with a regex:
import re
a = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]
pat = re.compile(r"^.*\d$") # can use "\d\d" instead of "\d" for 2 numbers
for i in a:
    if pat.match(i):
        print(i)
Another:
for i in a:
    if i[-1:] in map(str, range(10)):  # compare the last character against "0".."9"
        print(i)
All of the above methods print the inputs that have a numeric suffix (negate the test with not to get the names without one):
AGM00BOUCK01
AGM00COBDEN01
AGM00COBDEN02
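Since the question asks for the names without a numeric suffix, here is a minimal sketch of the inverted check. The variable names `names` and `unsuffixed` are only illustrative, and it assumes a "numeric suffix" means the last character is a digit.

```python
names = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]

# Keep only the names whose last character is not a digit.
# Slicing with [-1:] (rather than [-1]) avoids an IndexError on empty strings.
unsuffixed = [n for n in names if not n[-1:].isdigit()]

print(unsuffixed)  # ['AGM00BALDWIN', 'AGM00BOUCK']
```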
Upvotes: 1
Reputation: 3485
You can use slicing to get the last 2 characters of the element and then check whether they are '01' or '02':
l = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]
for i in l:
    if i[-2:] in ('01', '02'):
        print('{} is a duplicate'.format(i))
Output:
AGM00BOUCK01 is a duplicate
AGM00COBDEN01 is a duplicate
AGM00COBDEN02 is a duplicate
Another way would be to use the str.endswith method:
l = ["AGM00BALDWIN", "AGM00BOUCK", "AGM00BOUCK01", "AGM00COBDEN01", "AGM00COBDEN02"]
for i in l:
    if i.endswith('01') or i.endswith('02'):
        print('{} is a duplicate'.format(i))
So your code would look like this:
prov_count = 3000
prov_ID = 0
items = (name, x, y)
xy_tup = tuple(items)
if name[-2:] not in ('01', '02'):  # name has no duplicate suffix, so it gets an ID
    prov_ID = prov_count + 1
else:
    prov_ID = ""
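As a rough sketch, the same check can be wrapped in a small function. The name `assign_prov_id` and the `suffixes` parameter are hypothetical, not from the question. It relies on str.endswith accepting a tuple of suffixes, and it assumes duplicates only ever end in '01' or '02'. A name ending in '03' would slip through, so extend the tuple or switch to a digit test if that can happen.

```python
def assign_prov_id(name, prov_count, suffixes=("01", "02")):
    """Return prov_count + 1 for names without a duplicate suffix, else ""."""
    # str.endswith accepts a tuple and returns True if any suffix matches.
    if name.endswith(suffixes):
        return ""
    return prov_count + 1


print(assign_prov_id("AGM00BALDWIN", 3000))        # 3001
print(repr(assign_prov_id("AGM00BOUCK01", 3000)))  # ''
```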
Upvotes: 0
Reputation: 7903
Using regular expressions seems appropriate here:
import re
pattern = re.compile(r'(\d+$)')
prov_count = 3000
prov_ID = 0
items = (name, x, y)
xy_tup = tuple(items)
if pattern.search(name) is None:  # search, not match: match only tests at the start of the string
    prov_ID = prov_count + 1
else:
    prov_ID = ""
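One detail worth illustrating: re.match only tries to match at the beginning of the string, while re.search scans the whole string, and both return None (not False) when nothing matches. A small sketch follows. The names `trailing_digits` and `has_numeric_suffix` are made up for this example.

```python
import re

# One or more digits anchored at the end of the string.
trailing_digits = re.compile(r"\d+$")


def has_numeric_suffix(name):
    # search() scans the whole string, so the trailing digits are found.
    return trailing_digits.search(name) is not None


# match() starts at index 0, where "A" is not a digit, so it finds nothing.
print(trailing_digits.match("AGM00BOUCK01"))  # None
print(has_numeric_suffix("AGM00BOUCK01"))     # True
print(has_numeric_suffix("AGM00BALDWIN"))     # False
```

The digits in the middle of "AGM00BALDWIN" do not count, because the `$` anchor requires the digits to sit at the very end.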
Upvotes: 1