Sailormoon
Sailormoon

Reputation: 267

sort array based on custom alphabet order python

I want to write a code to sort arr based on customAl order, and not use sorted function.

customAl = [dshbanfmg]
arr = [bba,abb,baa,mggfba,mffgh......]

psudo code:

def sortCA(arr, customAl):
    dt = {}
    generate dt order based on customAl
    look up and sort arr

    return result


newArr = [bba,baa,abb,mffgh,mggfba......]

I know there's a similiar question but the answer is wrapped in sorted function which I don't wish to use. anyone has a better solution than unsorted, or dictionary which takes space?

Sorting string values according to a custom alphabet in Python

Upvotes: 0

Views: 914

Answers (1)

recnac
recnac

Reputation: 3744

In my opinion, programming is a trade-off, it depends on which part you care most.

Specifically, in this scenario, you can choose to trade time for space by str.index, or you can trade space for time with an extra index dict:

customAl = 'dshbanfmg'
arr = ['bba', 'abb', 'baa', 'mggfba', 'mffgh']

# trade time for space
# no extra space but, but O(n) to index
def sortCA1(arr, customAl):
    return sorted(arr, key=lambda x: [customAl.index(c) for c in x])

# trade space for time
# extra space O(n), but O(1) to index
def sortCA2(arr, customAl):
    dt = {c: i for i, c in enumerate(customAl)}
    return sorted(arr, key=lambda x: [dt[c] for c in x])

# output: ['bba', 'baa', 'abb', 'mffgh', 'mggfba']

Here is a version which not use sorted function, we can use a bucket based on custom alphabet order. split the arr by 1st char, if one bucket has multiple elements then split by 2nd char recursively...kind of radix sort: one thing to mention, the length is different, so we should add a bucket to record none index str.

def sortCA3(arr, customAl):
    dt = {c: i + 1 for i, c in enumerate(customAl)}  # keep 0 for none bucket

    def bucket_sort(arr, start):
        new_arr = []
        buckets = [[] for _ in range(len(customAl) + 1)]

        for s in arr:
            if start < len(s):
                buckets[dt[s[start]]].append(s)
            else:
                buckets[0].append(s)

        for bucket in buckets:
            if len(bucket) == 1:
                new_arr += bucket
            elif len(bucket) > 1:
                new_arr += bucket_sort(bucket, start+1)
        return new_arr

    return bucket_sort(arr, 0)

test and output

customAl = 'dshbanfmg'
arr = ['bba', 'bb', 'abb', 'baa', 'mggfba', 'mffgh']  # add `bb` for test
print(sortCA4(arr, customAl))

Upvotes: 2

Related Questions