jay a

Reputation: 722

How to subtract strings in Python

Basically, if I have a string 'AJ' and another string 'AJYF', I would like to be able to write 'AJYF'-'AJ' and get 'YF'.

I tried this but got a syntax error.

As a side note, the subtractor will always be shorter than the string it is subtracted from. Also, the subtractor will always be a prefix of that string. For instance, if I have 'GTYF' and I want to subtract a string of length 3 from it, that string has to be 'GTY'.

If it is possible, the full function I am trying to do is convert a string to a list based on how long each item in the list is supposed to be. Is there any way of doing that?
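One possible reading of this request, sketched under the assumption that the item lengths are supplied as a list of integers (the function name split_by_lengths and the sample input are hypothetical, not from the question):

```python
def split_by_lengths(s, lengths):
    """Cut s into consecutive pieces whose sizes are given by lengths."""
    parts = []
    pos = 0
    for n in lengths:
        # Take the next n characters, then advance past them.
        parts.append(s[pos:pos + n])
        pos += n
    return parts


print(split_by_lengths('AJYFGT', [2, 1, 3]))  # ['AJ', 'Y', 'FGT']
```

Slicing by position avoids any string "subtraction": each piece is taken directly from its offset.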

Upvotes: 52

Views: 172507

Answers (6)

Tommy

Reputation: 144

I just had the same problem, and I solved it this way:

a='AJYF'
b='AJ'
if a[:len(b)]==b: a=a[len(b):]
print(a)     #prints 'YF'

a='GTYF'
b='F'
if a[-len(b):]==b: a=a[:-len(b)]
print(a)     #prints 'GTY'
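On Python 3.9 and later, the same two checks are available as the built-in string methods removeprefix and removesuffix. A minimal sketch, assuming Python 3.9+ (the wrapper names subtract_prefix and subtract_suffix are hypothetical):

```python
# Requires Python 3.9+ for str.removeprefix / str.removesuffix.

def subtract_prefix(s, prefix):
    """Return s without the leading prefix; s is unchanged if it does not start with prefix."""
    return s.removeprefix(prefix)


def subtract_suffix(s, suffix):
    """Return s without the trailing suffix; s is unchanged if it does not end with suffix."""
    return s.removesuffix(suffix)


print(subtract_prefix('AJYF', 'AJ'))  # YF
print(subtract_suffix('GTYF', 'F'))   # GTY
print(subtract_prefix('AJYF', 'XY'))  # AJYF (no match, unchanged)
```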

Upvotes: 0

Jyothi Ram

Reputation: 1

This works only when every character appears at most once in each string, and it returns an unordered set of characters rather than a string:

print(set(string1) ^ set(string2))
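A short sketch of what the set approach returns and where it loses information (the sample values are assumptions chosen for illustration):

```python
string1 = 'AJYF'
string2 = 'AJ'

# Symmetric difference: characters found in exactly one of the two strings.
# The result is a set, so the original character order is not preserved.
diff = set(string1) ^ set(string2)
print(sorted(diff))  # ['F', 'Y']

# Repeated characters collapse into one set element, so the second 'AJ' is lost.
print(sorted(set('AJAJ') ^ set('AJ')))  # []
```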

Upvotes: -1

Tom Barron

Reputation: 1594

I think what you want is this:

a = 'AJYF'
b = a.replace('AJ', '')
print(b)     # produces 'YF'
a = 'GTYF'
b = a.replace('GTY', '')
print(b)     # produces 'F'

Upvotes: 29

moctarjallo

Reputation: 1615

If you insist on using the '-' operator, use a class that overrides the __sub__ dunder method, combined with one of the solutions provided above:

class String(object):
    def __init__(self, string):
        self.string = string

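    def __sub__(self, other):
        # Strip other only when it is a prefix of this string.
        if self.string.startswith(other.string):
            return self.string[len(other.string):]
        # Otherwise return the string unchanged instead of an implicit None.
        return self.string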

    def __str__(self):
        return self.string


sub1 = String('AJYF') - String('AJ')
sub2 = String('GTYF') - String('GTY')
print(sub1)
print(sub2)

It prints:

YF
F
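A variation on the same idea, sketched as a str subclass so that the result is itself subtractable and plain strings can appear on the right-hand side. The class name SubStr is hypothetical, and returning the string unchanged when the prefix does not match is an assumption about the desired behavior:

```python
class SubStr(str):
    """Hypothetical str subclass whose '-' operator removes a leading prefix."""

    def __sub__(self, other):
        other = str(other)
        if self.startswith(other):
            # Slicing a str subclass yields a plain str, so wrap it again
            # to keep the result subtractable.
            return SubStr(self[len(other):])
        # Assumption: leave the string unchanged when other is not a prefix.
        return SubStr(self)


print(SubStr('AJYF') - 'AJ')        # YF
print(SubStr('GTYF') - 'G' - 'TY')  # F
```

Note that the left operand must be a SubStr; a plain 'AJYF' - SubStr('AJ') still raises TypeError because str defines no subtraction and this sketch does not define __rsub__.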

Upvotes: 4

bli

Reputation: 8184

replace can remove more than you want if the second string occurs at several positions:

s1 = 'AJYFAJYF'
s2 = 'AJ'
if s1.startswith(s2):
    s3 = s1.replace(s2, '')
s3
# 'YFYF'

You can add an extra argument to replace to indicate that you want only one replacement to happen:

if s1.startswith(s2):
    s3 = s1.replace(s2, '', 1)
s3
# 'YFAJYF'

Or you could use the re module:

import re
if s1.startswith(s2):
    s3 = re.sub('^' + re.escape(s2), '', s1)  # escape s2 so it is matched literally
s3
# 'YFAJYF'

The '^' anchors the match to the start of s1, so s2 is removed only at the first position.
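If s2 may contain regex metacharacters such as '.' or '+', it needs to be escaped before being placed in a pattern. A short sketch of the difference (strip_prefix_re is a hypothetical helper name, and the sample strings are assumptions):

```python
import re


def strip_prefix_re(s1, s2):
    """Remove s2 from the start of s1, treating s2 as literal text."""
    return re.sub('^' + re.escape(s2), '', s1)


print(strip_prefix_re('AJYFAJYF', 'AJ'))  # YFAJYF
print(strip_prefix_re('a.b-a.b', 'a.b'))  # -a.b

# Without escaping, '.' matches any character, so 'axb' is removed as well.
print(re.sub('^' + 'a.b', '', 'axb-a.b'))  # -a.b
print(strip_prefix_re('axb-a.b', 'a.b'))   # axb-a.b (unchanged)
```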

Yet another approach, suggested in the comments, would be to take out the first len(s2) characters from s1:

if s1.startswith(s2):
    s3 = s1[len(s2):] 
s3
# 'YFAJYF'

Some tests using the %timeit magic in IPython (Python 2.7.12, IPython 5.1.0) suggest that this last approach is the fastest:

In [1]: s1 = 'AJYFAJYF'

In [2]: s2 = 'AJ'

In [3]: %timeit s3 = s1[len(s2):]
The slowest run took 24.47 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 87.7 ns per loop

In [4]: %timeit s3 = s1[len(s2):]
The slowest run took 32.58 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 87.8 ns per loop

In [5]: %timeit s3 = s1[len(s2):]
The slowest run took 21.81 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 87.4 ns per loop

In [6]: %timeit s3 = s1.replace(s2, '', 1)
The slowest run took 17.64 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 230 ns per loop

In [7]: %timeit s3 = s1.replace(s2, '', 1)
The slowest run took 17.79 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 228 ns per loop

In [8]: %timeit s3 = s1.replace(s2, '', 1)
The slowest run took 16.27 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 234 ns per loop

In [9]: import re

In [10]: %timeit s3 = re.sub('^' + s2, '', s1)
The slowest run took 82.02 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 1.85 µs per loop

In [11]: %timeit s3 = re.sub('^' + s2, '', s1)
The slowest run took 12.82 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 1.86 µs per loop

In [12]: %timeit s3 = re.sub('^' + s2, '', s1)
The slowest run took 13.08 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 1.84 µs per loop
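To rerun this comparison outside IPython, here is a sketch using the standard timeit module. The loop count and the dictionary names are assumptions, and absolute numbers will differ by machine and Python version:

```python
import re
import timeit

s1 = 'AJYFAJYF'
s2 = 'AJ'

# The three statements being compared, keyed by a short label.
candidates = {
    'slice': "s1[len(s2):]",
    'replace': "s1.replace(s2, '', 1)",
    're.sub': "re.sub('^' + re.escape(s2), '', s1)",
}

loops = 100_000
timings = {}
for name, stmt in candidates.items():
    # timeit.timeit returns the total time in seconds for `loops` executions.
    total = timeit.timeit(stmt, globals={'s1': s1, 's2': s2, 're': re}, number=loops)
    timings[name] = total / loops * 1e9  # nanoseconds per loop
    print(f'{name}: {timings[name]:.1f} ns per loop')
```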

Upvotes: 13

Shubham Namdeo

Reputation: 1925

An easy solution is:

>>> string1 = 'AJYF'
>>> string2 = 'AJ'
>>> if string2 in string1:
...     string1.replace(string2,'')
'YF'
>>>

Upvotes: 67
