Stack Overflow Asked by smiley on August 1, 2020
These are my table columns:
ID || Date || Description || Priority
My goal is to insert random test data of 2000 rows with date ranging between (7/1/2019 – 7/1/2020) and randomize the priority from list (High, Medium, Low).
I know how to insert random numbers but I am stuck with the date and the priority fields.
If I need to write code, any pointers on how do I do it?
Just want to be clear – I have issue with randomizing and inserting from a given list
CREATE TABLE mytable (
id SERIAL PRIMARY KEY,
date DATE NOT NULL,
description TEXT,
priority ENUM('High','Medium','Low') NOT NULL
);
INSERT INTO mytable (date, priority)
SELECT '2019-07-01' + INTERVAL FLOOR(RAND()*365) DAY,
ELT(1+FLOOR(RAND()*3), 'High', 'Medium', 'Low')
FROM DUAL;
The fake table DUAL
is a special keyword. You can select from it, and it always returns exactly one row. But it has no real columns with data, so you can only select expressions.
Do this INSERT a few times and you get:
mysql> select * from mytable;
+----+------------+-------------+----------+
| id | date | description | priority |
+----+------------+-------------+----------+
| 1 | 2019-10-20 | NULL | Medium |
| 2 | 2020-05-17 | NULL | High |
| 3 | 2020-06-25 | NULL | Low |
| 4 | 2020-05-06 | NULL | Medium |
| 5 | 2019-09-30 | NULL | High |
| 6 | 2019-08-06 | NULL | Low |
| 7 | 2020-02-21 | NULL | High |
| 8 | 2019-11-10 | NULL | High |
| 9 | 2019-07-30 | NULL | High |
+----+------------+-------------+----------+
Here's a trick to use the number of rows in the table itself to insert the same number of rows, basically doubling the number of rows:
INSERT INTO mytable (date, priority)
SELECT '2019-07-01' + INTERVAL FLOOR(RAND()*365) DAY,
ELT(1+FLOOR(RAND()*3), 'High', 'Medium', 'Low')
FROM mytable;
Just changing FROM DUAL
to FROM mytable
I change from selecting one row, to selecting the current number of rows from the table. But the values I insert are still random expressions, not the values already in those rows. So I get new rows with new random values.
Then repeat this INSERT as many times as you want to double the number of rows.
Read also about the ELT() function.
Correct answer by Bill Karwin on August 1, 2020
You seem to be looking for something like this. A basic random sample is:
select t.*
from t
where date >= '2019-07-01' and date < '2020-07-01'
order by random()
fetch first 2000 rows only;
Of course, the function for random()
varies by database, as does the logic for limiting rows. This should get about the same distribution of priorities as in the original data.
If you want the rows to come by priority first, then use:
select t.*
from t
where date >= '2019-07-01' and date < '2020-07-01'
order by (case when priority = 'High' then 1 when priority = 'Medium' then 2 else 3 end),
random()
fetch first 2000 rows only;
Answered by Gordon Linoff on August 1, 2020
Get help from others!
Recent Answers
Recent Questions
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP