ForzaLeclerc

Reputation: 13

How to filter out numbers with specific digits?

I want to do this: given a number N, how many numbers smaller than N contain only the digits 0, 4, or 7? For example, for N = 50 the good numbers are 0, 4, 7, 40, 44, 47, so total_count = 6.

db = 0
badnums = '1235689'
lsgood = []
for i in range(0, n + 1):
    szamok = [int(j) for j in str(i)]
    for i in szamok:
        if str(i) not in badnums:
            lsgood.append(i)
            db += 1
print(lsgood,db)

Upvotes: 1

Views: 1453

Answers (5)

Nathan

Reputation: 10306

Iterating over all numbers less than n may not be the most efficient approach if n is large (though I haven't done any timings; this would be worth doing if performance matters). Instead, we can generate only the numbers that are potentially valid (all numbers with at most as many digits as n that use only 0, 4, or 7) and then eliminate any that aren't (those greater than or equal to n).

from itertools import product

def n_good(n: int = 50, allowed_digits: str = "047") -> int:
    n_digits = len(str(n))
    possible_nums = [int("".join(num)) for num in product(set(allowed_digits + "0"), repeat=n_digits)]
    if "0" not in allowed_digits:
        possible_nums = [num for num in possible_nums if "0" not in str(num)]
    return sum(1 for num in possible_nums if num < n)

assert n_good(50) == 6 and n_good(50, "47") == 4
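To make the idea concrete, here is a minimal sketch of the candidates that product generates for n = 50. The names candidates and below_50 are only for illustration and are not part of the function above.

```python
from itertools import product

# All two-character strings over "047", converted to int.
# Leading zeros collapse, so "04" becomes 4 and "00" becomes 0.
candidates = sorted(int("".join(t)) for t in product("047", repeat=2))
print(candidates)  # [0, 4, 7, 40, 44, 47, 70, 74, 77]

# Keep only the candidates that are actually smaller than n = 50.
below_50 = [c for c in candidates if c < 50]
print(below_50, len(below_50))  # [0, 4, 7, 40, 44, 47] 6
```

Only 9 candidates are built here instead of 50 numbers being tested, which is where the speed-up for large n comes from.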

EDIT: I did a couple of quick timings compared to one of the other answers:

def n_good_range(n):
    return sum(1 for i in range(n + 1) if all(d in '047' for d in str(i)))

| n            | 10_000           | 100_000          | 1_000_000         |
| ------------ | ---------------- | ---------------- | ----------------- |
| n_good       | 138 µs ± 9.48 µs | 464 µs ± 44.3 µs | 1.28 ms ± 42.5 µs |
| n_good_range | 9.05 ms ± 319 µs | 91.5 ms ± 4.4 ms | 942 ms ± 57.9 ms  |

(timings done using %%timeit in IPython)

EDIT2: based on @Jan's insightful comment, I fixed the function to work even when 0 is not one of the allowed digits. This will make things slightly slower sometimes, but I found that n_good(1_000_000, "47") only took 1.7 ms (compared with 1.3 ms for n_good(1_000_000, "047")).

Upvotes: 2

Jan Christoph Terasa

Reputation: 5935

Solution using itertools.combinations_with_replacement (only works properly for two digits)

This solution uses combinations_with_replacement from the itertools module, since we are looking for all combinations of digits, with repetition allowed, that form a number smaller than the target value n. Since combinations_with_replacement('ABC', 2) returns AA AB AC BB BC CC, and thus treats BA the same as AB, we have to append the digits in reverse order to the back of the string to cover the other orderings as well, i.e. the code uses '04740' instead of '047' internally.
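As a rough sketch of why the padding is needed, and of the two-digit limitation mentioned in the heading, the snippet below compares the plain and padded digit strings. The helper name generated_numbers is hypothetical; it only mirrors the set built inside the answer's function.

```python
from itertools import combinations_with_replacement

# Without padding, pairs such as "40" or "74" never appear.
plain = ["".join(c) for c in combinations_with_replacement("047", 2)]
print(plain)  # ['00', '04', '07', '44', '47', '77']

def generated_numbers(n_digits, digs="047"):
    """Numbers produced from the padded digit string (illustrative helper)."""
    padded = digs + digs[-2::-1]  # "047" becomes "04740"
    return {
        int("".join(combo))
        for length in range(1, n_digits + 1)
        for combo in combinations_with_replacement(padded, length)
    }

# With padding, every number of up to two digits is covered.
print(sorted(generated_numbers(2)))  # [0, 4, 7, 40, 44, 47, 70, 74, 77]

# With three digits some valid numbers are missed, for example 404.
print(404 in generated_numbers(3))  # False
```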

This solution avoids looping over all values, which can be more efficient for large N. Instead, the outer Python loop runs only about log10(n) times, once per digit count:

def good_digits(n, digs='047'):
    import itertools

    digs = digs + digs[-2::-1]
    l = len(str(n)) 
    s = set(int("".join(i)) for j in range(1,l+1) for i in itertools.combinations_with_replacement(digs, j)) 
    return sum(1 for i in s if i < n)

good_digits(50, '047') # 6

Python 3.8 Assignment Expressions

Using Python 3.8 Assignment Expressions, we can avoid the summing of the values in the set, at the cost of the assignment expression:

def good_digits_38(n, digs='047'):
    import itertools

    digs = digs + digs[-2::-1]
    l = len(str(n)) 
    s = set(r for j in range(1,l+1) for i in itertools.combinations_with_replacement(digs, j) if (r:=int("".join(i))) < n) 
    return len(s)

good_digits_38(50, '047') # 6

Permutations

If you want the permutations instead of combinations with replacement, e.g. excluding "44", use this solution for permutations:

def good_digits_perm(n, digs='047'):
    import itertools

    l = len(str(n)) 
    s = set(int("".join(i)) for j in range(1,l+1) for i in itertools.permutations(digs, j))
    return sum(1 for i in s if i < n)

good_digits_perm(50, '047') # 5

Upvotes: 0

MegaIng

Reputation: 7886

You have a few problems with your code. In that case, it always helps to step through it, either by hand or with the help of a debugger (one is integrated in many IDEs). Let's take a look at your code:

n = 50
db = 0
badnums = '1235689'
lsgood = []
for i in range(0, n + 1):
    szamok = [int(j) for j in str(i)]
    for i in szamok:
        if str(i) not in badnums:
            lsgood.append(i)
            db += 1
print(lsgood,db)

The first thing that should catch your eye is the reuse of the variable name i. You shouldn't do that, so we rename the inner variable to j:

n = 50
db = 0
badnums = '1235689'
lsgood = []
for i in range(0, n + 1):
    szamok = [int(j) for j in str(i)]
    for j in szamok:
        if str(j) not in badnums:
            lsgood.append(?)
            db += 1
print(lsgood,db)

But now we have the first interesting mistake. What do we want to append to lsgood? You appended the inner i, which was always a single digit. This meant that lsgood only contained single digits, which is not what you wanted. So we append the outer i:

n = 50
db = 0
badnums = '1235689'
lsgood = []
for i in range(0, n + 1):
    szamok = [int(j) for j in str(i)]
    for j in szamok:
        if str(j) not in badnums:
            lsgood.append(i)
            db += 1
print(lsgood,db)

Now we get the following output:

[0, 4, 7, 10, 14, 17, 20, 24, 27, 30, 34, 37, 40, 40, 41, 42, 43, 44, 44, 45, 46, 47, 47, 48, 49, 50] 26

These are way too many. If we look closely, it counts every number containing at least one of '047', once per good digit (which is why 40, 44 and 47 appear twice). That is not what we want. So we start to look into it and discover the magic that is else on for-loops.

n = 50
db = 0
badnums = '1235689'
lsgood = []
for i in range(0, n + 1):
    szamok = [int(j) for j in str(i)]
    for j in szamok:
        if str(j) in badnums:
            break # The number contains a bad digit, we don't want it
    else:
        lsgood.append(i) # All digits passed the test, it only contains good digits
        db += 1
print(lsgood, db)

This gives the intended output:

[0, 4, 7, 40, 44, 47] 6
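If for/else is new to you, here is a minimal sketch of how it behaves in isolation. The function name only_good_digits is made up for this illustration: the else branch runs only when the loop finishes without hitting break.

```python
def only_good_digits(number, good="047"):
    """Return True if every digit of number is one of the good digits."""
    for digit in str(number):
        if digit not in good:
            break  # a bad digit was found, so the else branch is skipped
    else:
        # Reached only if the loop ran to completion without a break.
        return True
    return False

print(only_good_digits(44))  # True
print(only_good_digits(41))  # False
```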

As others showed, this can be shortened significantly, but I think this version is the easiest for a beginner to understand.

Upvotes: 2

SpghttCd

Reputation: 10860

You can simplify your code using sets:

n = 50

lsgood = []
for i in range(n):
    if set(list(str(i))).issubset(list('047')):
        lsgood.append(i)

print(lsgood, len(lsgood))

# [0, 4, 7, 40, 44, 47] 6
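As a possible further simplification (a sketch, not a required change): a string can be passed to set directly, and the <= operator on sets performs the same subset test as issubset.

```python
n = 50
good = set("047")

# set(str(i)) holds the distinct digits of i; <= checks "is a subset of".
lsgood = [i for i in range(n) if set(str(i)) <= good]

print(lsgood, len(lsgood))  # [0, 4, 7, 40, 44, 47] 6
```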

Upvotes: 0

blhsing

Reputation: 106513

You can use sum with a generator expression like this:

sum(1 for i in range(n + 1) if all(d in '047' for d in str(i)))

This returns: 6

Note that this includes the number 44, which your expected output does not.
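One detail worth checking is the upper bound: range(n + 1) also tests n itself, while "smaller than N" would call for range(n). The sketch below makes the difference visible; the function count_good and its inclusive parameter are assumptions added for illustration only.

```python
def count_good(n, inclusive=False, good="047"):
    """Count numbers below n (or up to n if inclusive) using only good digits."""
    upper = n + 1 if inclusive else n
    return sum(1 for i in range(upper) if all(d in good for d in str(i)))

# For n = 50 both bounds agree, because 50 itself is not a good number.
print(count_good(50), count_good(50, inclusive=True))  # 6 6

# For n = 47 they differ, because 47 is a good number.
print(count_good(47), count_good(47, inclusive=True))  # 5 6
```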

Upvotes: 5
