Ashish
Ashish

Reputation: 1952

Generate cartesian product

I am trying to generate some strings like

AAAA0000
AAAA0001
...
..
...
ZZZZ9999

I can do it with below logic & it will work

   for A in {A..Z}
    do
        for B in {A..Z}
            do
                for C in {A..Z}
                    do
                    for D in {A..Z}
                        do
                        for E in {0..9}
                            do
                            for F in {0..9}
                                do
                                for G in {0..9}
                                    do
                                        for H in {0..9}
                                        do
                                            echo $A$B$C$D$E$F$G$H

                                        done
                                    done
                                done
                            done
                        done
                    done
                done
            done
        done
    done

This is exhausting way of doing it (Though it solves the problem)

Can anyone help in some efficient way of doing same

Upvotes: 1

Views: 343

Answers (2)

Andreas Louv
Andreas Louv

Reputation: 47099

You can combine multiply bracket expansions with: {A..Z}{A..Z}.... But the cartesian product is above 4 billions combinations (26 * 26 * 26 * 26 * 10 * 10 * 10 * 10), which will be stored in memory (Around 38 GB if not more).

You can however use Perl or similar to generate this list:

perl -le 'print for "AAAA0000" .. "ZZZZ9999"' > output.txt

Expect it to take some time
This will generate a file that is about 38 GB in size:

26 * 26 * 26 * 26 * 10 * 10 * 10 * 10 * (8 + 1) = 41,127,840,000 bytes
^    ^    ^    ^    ^    ^    ^    ^     ^   ^
A-Z  A-Z  A-Z  A-Z  0-9  0-9  0-9  0-9   |   new line
                                         8 bytes wide AAAA0000, AAAA0001, ...

41127840000 / 1024 / 1024 / 1024 = 38.30 GB
^             ^      ^      ^
Bytes         KB     MB     GB

Upvotes: 6

F. Hauri  - Give Up GitHub
F. Hauri - Give Up GitHub

Reputation: 70822

Try this:

printf "%s\n" {A..Z}{A..Z}{A..Z}{A..Z}{0..9}{0..9}{0..9}{0..9}

...

Hem, Of course, this may consume some memory!

This smaller form work on my computer:

printf "%s\n" {A..F}{A..F}{A..F}{A..F}{0..9}{0..9}{0..9}{0..9} | wc 
12960000 12960000 116640000

Of course is no a programming language, but you could:

recurs () { 
    local level=$1 upper=$2;
    shift 2;
    ((level)) && { 
        for i in $@ ;do
            recurs $[level-1] $upper$i $@;
        done
    } || { 
        for i in $@ ;do
            echo $upper$i;
        done
    }
}

This will make recursion job:

recurs 3 '' a b c d e f | tee >(sed -ne '1p;$p') >(wc) >/dev/null ;sleep .2
aaaa
   1296    1296    6480
ffff

The second argument is left part string, so for doing thing like asked:

while read lhs ;do
    recurs 3 $lhs {0..4}
  done < <(
    recurs 3 '' {A..F}
) | tee >(sed -ne '1p;$p') >(wc) >/dev/null ;sleep .2
AAAA0000
 810000  810000 7290000
FFFF4444

And, if you really wanna see how quick could be bash:

time while read lhs ;do
    recurs 3 $lhs {0..9}
  done < <(
    recurs 3 '' {A..Z}
) | tee >(sed -ne '1p;$p') >(wc) >/dev/null ;sleep .2

Upvotes: 3

Related Questions