sipsorcery
sipsorcery

Reputation: 30699

Randomise dates in MySql sample data

I've created a sample data set in a MySql database that I need to use to test generating some reports with. For one particular table I've got 100k records that were all created within a few hours of each other. Because of the way the sample data was generated I couldn't set the datetime columns on the 100k records to the desired values.

How can I use an SQL statement to randomise the datetime columns on the 100k records so instead of being within a few hours of each other they are spread across a month? As an example I'd like to spread the records evenly across the interval 1st Oct 2012 to 31st Oct 2012.

Upvotes: 3

Views: 256

Answers (2)

Jason McCreary
Jason McCreary

Reputation: 72981

Intriguing question. I have needed something similar before. The following is a proof of concept.

SET @start_timestamp = 1349049600;
SET @end_timestamp = 1351641600;
SET @offset = @end_timestamp - @start_timestamp;

SELECT FROM_UNIXTIME(@start_timestamp + FLOOR(RAND() * @offset));

A Fail Note

Although flow control statements exist in MySQL (WHILE, LOOP, etc), they are only available within stored programs.

I tried to JOIN an existing table against the SELECT directly. However, the timestamp was always the same value.

I tried to create a dates table and seed it using BENCHMARK(100000, INSERT ...). But it only accepts scalar expressions for the second argument.

In the end, you could overcome all of the above by writing a stored program. However, that seemed overkill as I believe in your case adapting this POC into your original script makes the most sense.

Upvotes: 1

GregD
GregD

Reputation: 2867

We can use RAND() function. Following MySQL Reference Manual, to obtain a random integer R in the range i <= R < j, use the expression FLOOR(i + RAND() * (j – i)). So for every row in your sample table, you'd need to run the following query:

SET @year = '2012';
SET @month = '10';
SET @day = FLOOR(1 + RAND() * (31 - 1));
SET @hour = FLOOR(0 + RAND() * (23 - 0));
SET @minute = FLOOR(0 + RAND() * (59 - 0));
SET @second = FLOOR(0 + RAND() * (59 - 0));

SET @date = CONCAT(@year,'-',@month,'-',@day,'-',@hour,'-',@minute,'-',@second);

UPDATE `sample_tbl`
   SET `date` = STR_TO_DATE(@date,'%Y-%c-%e-%k-%i-%s');

Upvotes: 2

Related Questions