384X21
384X21

Reputation: 6643

How to make a random but partial shuffle in Python?

Instead of a complete shuffle, I am looking for a partial shuffle function in python.

Example : "string" must give rise to "stnrig", but not "nrsgit"

It would be better if I can define a specific "percentage" of characters that have to be rearranged.

Purpose is to test string comparison algorithms. I want to determine the "percentage of shuffle" beyond which an(my) algorithm will mark two (shuffled) strings as completely different.

Update :

Here is my code. Improvements are welcome !

import random

percent_to_shuffle = int(raw_input("Give the percent value to shuffle : "))
to_shuffle = list(raw_input("Give the string to be shuffled : "))

num_of_chars_to_shuffle = int((len(to_shuffle)*percent_to_shuffle)/100)

for i in range(0,num_of_chars_to_shuffle):
    x=random.randint(0,(len(to_shuffle)-1))
    y=random.randint(0,(len(to_shuffle)-1))
    z=to_shuffle[x]
    to_shuffle[x]=to_shuffle[y]
    to_shuffle[y]=z

print ''.join(to_shuffle)

Upvotes: 7

Views: 3894

Answers (5)

jan zegan
jan zegan

Reputation: 1657

maybe like so:

>>> s = 'string'
>>> shufflethis = list(s[2:])
>>> random.shuffle(shufflethis)
>>> s[:2]+''.join(shufflethis)
'stingr'

Taking from fortran's idea, i'm adding this to collection. It's pretty fast:

def partial_shuffle(st, p=20):
    p = int(round(p/100.0*len(st)))

    idx = range(len(s))
    sample = random.sample(idx, p)

    res=str()
    samptrav = 1

    for i in range(len(st)):
        if i in sample:
            res += st[sample[-samptrav]]
            samptrav += 1
            continue
        res += st[i]

    return res

Upvotes: 0

fortran
fortran

Reputation: 76077

Your problem is tricky, because there are some edge cases to think about:

  • Strings with repeated characters (i.e. how would you shuffle "aaaab"?)
  • How do you measure chained character swaps or re arranging blocks?

In any case, the metric defined to shuffle strings up to a certain percentage is likely to be the same you are using in your algorithm to see how close they are.

My code to shuffle n characters:

import random
def shuffle_n(s, n):
  idx = range(len(s))
  random.shuffle(idx)
  idx = idx[:n]
  mapping = dict((idx[i], idx[i-1]) for i in range(n))
  return ''.join(s[mapping.get(x,x)] for x in range(len(s)))

Basically chooses n positions to swap at random, and then exchanges each of them with the next in the list... This way it ensures that no inverse swaps are generated and exactly n characters are swapped (if there are characters repeated, bad luck).

Explained run with 'string', 3 as input:

idx is [0, 1, 2, 3, 4, 5]
we shuffle it, now it is [5, 3, 1, 4, 0, 2]
we take just the first 3 elements, now it is [5, 3, 1]
those are the characters that we are going to swap
s t r i n g
  ^   ^   ^
t (1) will be i (3)
i (3) will be g (5)
g (5) will be t (1)
the rest will remain unchanged
so we get 'sirgnt'

The bad thing about this method is that it does not generate all the possible variations, for example, it could not make 'gnrits' from 'string'. This could be fixed by making partitions of the indices to be shuffled, like this:

import random

def randparts(l):
    n = len(l)
    s = random.randint(0, n-1) + 1
    if s >= 2 and n - s >= 2: # the split makes two valid parts
        yield l[:s]
        for p in randparts(l[s:]):
            yield p
    else: # the split would make a single cycle
        yield l

def shuffle_n(s, n):
    idx = range(len(s))
    random.shuffle(idx)
    mapping = dict((x[i], x[i-1])
        for i in range(len(x))
        for x in randparts(idx[:n]))
    return ''.join(s[mapping.get(x,x)] for x in range(len(s)))

Upvotes: 3

Karl Knechtel
Karl Knechtel

Reputation: 61617

Evil and using a deprecated API:

import random
# adjust constant to taste
# 0 -> no effect, 0.5 -> completely shuffled, 1.0 -> reversed
# Of course this assumes your input is already sorted ;)
''.join(sorted(
    'abcdefghijklmnopqrstuvwxyz',
    cmp = lambda a, b: cmp(a, b) * (-1 if random.random() < 0.2 else 1)
))

Upvotes: 0

eumiro
eumiro

Reputation: 213005

import random

def partial_shuffle(a, part=0.5):
    # which characters are to be shuffled:
    idx_todo = random.sample(xrange(len(a)), int(len(a) * part))

    # what are the new positions of these to-be-shuffled characters:
    idx_target = idx_todo[:]
    random.shuffle(idx_target)

    # map all "normal" character positions {0:0, 1:1, 2:2, ...}
    mapper = dict((i, i) for i in xrange(len(a)))

    # update with all shuffles in the string: {old_pos:new_pos, old_pos:new_pos, ...}
    mapper.update(zip(idx_todo, idx_target))

    # use mapper to modify the string:
    return ''.join(a[mapper[i]] for i in xrange(len(a)))

for i in xrange(5):
    print partial_shuffle('abcdefghijklmnopqrstuvwxyz', 0.2)

prints

abcdefghljkvmnopqrstuxwiyz
ajcdefghitklmnopqrsbuvwxyz
abcdefhwijklmnopqrsguvtxyz
aecdubghijklmnopqrstwvfxyz
abjdefgcitklmnopqrshuvwxyz

Upvotes: 1

jsbueno
jsbueno

Reputation: 110476

This is a problem simpler than it looks. And the language has the right tools not to stay between you and the idea,as usual:

import random

def pashuffle(string, perc=10):
    data = list(string)
    for index, letter in enumerate(data):
        if random.randrange(0, 100) < perc/2:
            new_index = random.randrange(0, len(data))
            data[index], data[new_index] = data[new_index], data[index]
    return "".join(data)

Upvotes: 4

Related Questions