vip
vip

Reputation: 53

Unsorting a sorted array and preserving the original order of the array elements in C

I have my initial array ( random or duplicates allowed) which needs to be sorted first and then needs to be unsorted the sorted array back to the original array with the order preserved.

Ex:
    orig_array: char[512] = { 1, 2, 4, 1,3,5,a,b,c,........}

    Sorted array: char[512] = { 1, 1, 2,2,3.....)  ( using sorting algorithms )

    Unsorting back to original order: { 1, 2,4,1,3,......}

I am stuck on the way on how to implement unsorting back to the original order in which they are created.

Some background on why I'm asking:

Unsorting the array is done completely on a different system. The sorted array is sent out to the network and is received by the host. The original data must be preserved.

I have a compression module on both ends to compress and decompress data. Compression module is CPU expensive. My idea was compression on data sequences sorted shouldn't consume much CPU and should be pretty quick to encrypt and decrypt on both ends which decreases the overall latency

Upvotes: 2

Views: 3238

Answers (4)

dbush
dbush

Reputation: 223739

You say you want to sort and unsort because the data needs to be compressed (which takes time) before being sent over a network, and compressing sorted data "should" be faster than compressing unsorted data.

This sounds like a case of premature optimization. "Should" be faster sounds like a guess. You should test compressing sorted data vs. compressing unsorted data to see if there's any difference. You would also need to take the sorting and unsorted time into account. There's a good chance that the cost of sorting would be much more any any possible gain from compressing a sorted list.

Even if you find that compressing sorted data is faster, it may not matter.

Here's the problem with unsorting.

Forget about software for a moment. Suppose I give you the following sorted list:

2 7 9 10 15 18 19 20 26 29

Now unsort it.

How would you do it? Without any additional information, there's no way to know what the "proper" order is. It may very well have been sorted to begin with, in which case it's already "unsorted".

At a minimum, you would need to know what position each value was in when the list is unsorted. So if the unsorted list looked like this:

10 9 26 29 18 2 20 7 19 15

You would need another list giving the original index of each value in the sorted list:

5 7 1 0 9 4 8 6 2 3

But now you have two lists of numbers, so you've doubled the size of the data you need to compress and send. On top of that, one of those lists is unsorted. So you're back to where you started.

The only possible way you could gain anything is if the items in the original list are not just numbers but a large aggregate data type, so that the list of indexes is small compared to the original list. In that case, you could add the original list index to the datatype to simplify keeping track of it. Then you would again have to test if compressing a sorted list is faster than an unsorted list. But again, the chance that compressing a sorted list is faster is small, even more so if the list contains complex data types.

My recommendation: just compress and send the unsorted list.

Upvotes: 1

gfdevelop
gfdevelop

Reputation: 317

You can create a copy of your original array before you sort it or you can create another array of pointers and sort pointers... The memory is not a problem, the time yes, don't lose more time to find magic function that not exist...

Upvotes: 0

einpoklum
einpoklum

Reputation: 131527

If you sort "in-place" - that is, reorder elements in the actual array, you cannot unsort. Sorry.

You could, instead, sort "out-of-place", or sort an array of pointers to elements in the array (using the array data for comparison) - keeping the original array as-is.

See also: In-place algorithms on Wikipedia

Upvotes: 4

John Bollinger
John Bollinger

Reputation: 180181

You cannot reverse a sort without retaining additional data. A trivial way to do that is to retain a copy of the data in their original order, for instance by performing an out-of-place sort. That's fast and simple, but it's not the only alternative.

Another way you could provide for reversing the process (per @lurker) is to make a list of the changes (swaps / permutations) performed on the array, in the order they are performed, and to reverse by going backward through that list and reversing each individual step. That's fairly simple, and approximately as fast as the original sort, but it requires more memory in the average-case asymptotic limit than just keeping a copy does.

Yet another way is to track the element permutation performed by the sort. You can do this by creating an auxiliary array of element indices, one per element to be sorted, initially in order. As the sort proceeds, it mirrors each element move with a corresponding move in the auxiliary array. When the sort is finished, the auxiliary array tells you what the original index of each element was, and you can use that information to return the elements to the original order. This is no less memory efficient than keeping a copy of the original data, and the order restoration can be performed in linear time.

Alternatively, you can avoid sorting the original data at all. If you instead sort an array of pointers to the data or, equivalently, indices of the data then you can find the sorted order without losing the original order in the first place.

Upvotes: 3

Related Questions