Database Administrators Asked by Sharvari on November 14, 2021
I have a table with the columns concurrent_users, concurrent_timestamp and concurrent_date.
Sample data:
concurrent_users   concurrent_timestamp      concurrent_date
3                  2020-06-24 07:20:00 UTC   2020-06-24
7                  2020-06-24 08:20:00 UTC   2020-06-24
8                  2020-07-05 06:20:00 UTC   2020-07-05
2                  2020-07-05 03:20:00 UTC   2020-07-05
I want to list the date and timestamp where concurrent_users is highest for each date.
Expected output:
concurrent_users   concurrent_timestamp      concurrent_date
7                  2020-06-24 08:20:00 UTC   2020-06-24
8                  2020-07-05 06:20:00 UTC   2020-07-05
I tried the following queries:
Error 1
select concurrent_timestamp, max(concurrent_users)
from sample_data
group by concurrent_date;
But this gives an error because the concurrent_timestamp column is not included in the group by clause.
When I put concurrent_timestamp in group by, I get …
Error 2
select concurrent_timestamp, max(concurrent_users)
from sample_data
group by concurrent_timestamp ;
This does not give the desired output; instead it lists more than one record for each date.
How can I solve this?
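Just apply the DATE() function to concurrent_timestamp and group by the result. This returns each date with its highest concurrent_users value, but not the timestamp of the row that holds it: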
select
DATE(concurrent_timestamp) timestamp_date
,max(concurrent_users) max_concurrent_users
from sample_data
group by DATE(concurrent_timestamp);
Answered by RolandoMySQLDBA on November 14, 2021
In PostgreSQL you can simply use DISTINCT ON:
SELECT DISTINCT ON (concurrent_date) *
FROM sample_data
ORDER BY concurrent_date, concurrent_users DESC;
We don't need the redundant column concurrent_date for this at all. Casting concurrent_timestamp on the fly is very cheap - overall cheaper than storing the functionally dependent value redundantly.
SELECT DISTINCT ON (concurrent_timestamp::date) *
FROM sample_data
ORDER BY concurrent_timestamp::date, concurrent_users DESC;
DISTINCT ON is typically fastest for few rows per group. Depending on table definition and data distribution, there may be (much) faster solutions.
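One detail worth spelling out: if two rows on the same date share the highest concurrent_users, DISTINCT ON returns an arbitrary one of them unless ORDER BY breaks the tie. The following is a minimal sketch for PostgreSQL using the table and column names from the question; the tie-break rule (keep the earliest timestamp) and the NULLS LAST handling are assumptions, not something the question specifies.

```sql
-- Sketch for PostgreSQL; names as in the question.
-- Assumed tie-break: on equal concurrent_users within a date,
-- keep the row with the earliest concurrent_timestamp.
-- NULLS LAST only matters if concurrent_users can be NULL, because
-- a descending sort in PostgreSQL puts NULLs first by default.
SELECT DISTINCT ON (concurrent_date) *
FROM   sample_data
ORDER  BY concurrent_date,
          concurrent_users DESC NULLS LAST,
          concurrent_timestamp;
```

The leading ORDER BY expression has to match the DISTINCT ON expression; the extra keys only decide which row wins within each date.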
Answered by Erwin Brandstetter on November 14, 2021
In MySQL, you can do something like this:
CREATE TABLE sample_data (
  `concurrent_users` INTEGER,
  `concurrent_timestamp` VARCHAR(30),
  `concurrent_date` DATE
);
INSERT INTO sample_data (`concurrent_users`, `concurrent_timestamp`, `concurrent_date`) VALUES
  (3, '2020-06-24 07:20:00 UTC', '2020-06-24'),
  (7, '2020-06-24 08:20:00 UTC', '2020-06-24'),
  (8, '2020-07-05 06:20:00 UTC', '2020-07-05'),
  (2, '2020-07-05 03:20:00 UTC', '2020-07-05');
-- Join each row to the maximum of its own date. Matching on the date as well
-- keeps a row from pairing with the maximum of a different date.
select s.concurrent_timestamp, s1.maxid
from sample_data s
INNER JOIN (
  select concurrent_date, max(concurrent_users) maxid
  from sample_data
  group by concurrent_date
) s1 ON s.concurrent_date = s1.concurrent_date
    AND s.concurrent_users = s1.maxid;
concurrent_timestamp    | maxid
:---------------------- | ----:
2020-06-24 08:20:00 UTC |     7
2020-07-05 06:20:00 UTC |     8
Answered by nbk on November 14, 2021
This is a "groupwise-max" problem. You can't do it with a simple query; you need a subquery. There are many examples under the tag I added. Or you can read my blog: http://mysql.rjweb.org/doc.php/groupwise_max
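As an illustration of the subquery pattern, here is one possible sketch using a window function. It is not taken from the linked blog. It assumes MySQL 8.0 or later (it should also run on PostgreSQL), the table and column names from the question, and a hypothetical tie-break that keeps the earliest timestamp when several rows share the maximum.

```sql
-- Sketch: number the rows within each date, highest concurrent_users first,
-- then keep only the first row per date.
-- The alias names "ranked" and "rn" are made up for this example.
SELECT concurrent_users, concurrent_timestamp, concurrent_date
FROM (
    SELECT s.*,
           ROW_NUMBER() OVER (
               PARTITION BY concurrent_date
               ORDER BY concurrent_users DESC, concurrent_timestamp
           ) AS rn
    FROM sample_data AS s
) AS ranked
WHERE rn = 1
ORDER BY concurrent_date;
```

With the four sample rows from the question, this keeps the 08:20 row for 2020-06-24 and the 06:20 row for 2020-07-05. Swapping ROW_NUMBER() for RANK() would return all tied rows instead of exactly one per date.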
Answered by Rick James on November 14, 2021