djangofan
djangofan

Reputation: 29669

How to randomize order of data in 3 columns

I have 3 columns of data in SQL Server 2005 :

LASTNAME FIRSTNAME CITY

I want to randomly re-order these 3 columns (and munge the data) so that the data is no longer meaningful. Is there an easy way to do this? I don't want to change any data, I just want to re-order the index randomly.

Upvotes: 2

Views: 445

Answers (4)

Andriy M
Andriy M

Reputation: 77687

In SQL Server 2005+ you could prepare a ranked rowset containing the three target columns and three additional computed columns filled with random rankings (one for each of the three target columns). Then the ranked rowset would be joined with itself three times using the ranking columns, and finally each of the three target columns would be pulled from their own instance of the ranked rowset. Here's an illustration:

WITH sampledata (FirstName, LastName, CityName) AS (
  SELECT 'John', 'Doe', 'Chicago' UNION ALL
  SELECT 'James', 'Foe', 'Austin' UNION ALL
  SELECT 'Django', 'Fan', 'Portland'
),
ranked AS (
  SELECT
    *,
    FirstNameRank = ROW_NUMBER() OVER (ORDER BY NEWID()),
    LastNameRank  = ROW_NUMBER() OVER (ORDER BY NEWID()),
    CityNameRank  = ROW_NUMBER() OVER (ORDER BY NEWID())
  FROM sampledata
)
SELECT
  fnr.FirstName,
  lnr.LastName,
  cnr.CityName
FROM ranked fnr
  INNER JOIN ranked lnr ON fnr.FirstNameRank = lnr.LastNameRank
  INNER JOIN ranked cnr ON fnr.FirstNameRank = cnr.CityNameRank

This is the result:

FirstName LastName CityName
--------- -------- --------
James     Fan      Chicago
John      Doe      Portland
Django    Foe      Austin

Upvotes: 0

Jhonny D. Cano -Leftware-
Jhonny D. Cano -Leftware-

Reputation: 18013

I suggest using newid with checksum for doing randomization

 SELECT LASTNAME, FIRSTNAME, CITY FROM table ORDER BY CHECKSUM(NEWID())

Upvotes: 0

VoteyDisciple
VoteyDisciple

Reputation: 37803

When you say "re-order" these columns, do you mean that you want some of the last names to end up in the first name column? Or do you mean that you want some of the last names to get associated with a different first name and city?

I suspect you mean the latter, in which case you might find a programmatic solution easier (as opposed to a straight SQL solution). Sticking with SQL, you can do something like:

UPDATE the_table
SET lastname = (SELECT lastname FROM the_table ORDER BY RAND())

Depending on what DBMS you're using, this may work for only one line, may make all the last names the same, or may require some variation of syntax to work at all, but the basic approach is about right. Certainly some trials on a copy of the table are warranted before trying it on the real thing.

Of course, to get the first names and cities to also be randomly reordered, you could apply a similar query to either of those columns. (Applying it to all three doesn't make much sense, but wouldn't hurt either.)

Since you don't want to change your original data, you could do this in a temporary table populated with all rows.

Finally, if you just need a single random value from each column, you could do it in place without making a copy of the data, with three separate queries: one to pick a random first name, one a random last name, and the last a random phone number.

Upvotes: 2

Jeff Ferland
Jeff Ferland

Reputation: 18292

select *, rand() from table order by rand();

I understand some versions of SQL have a rand() that doesn't change for each line. Check for yours. Works on MySQL.

Upvotes: -1

Related Questions