Find timestamp for highest number of concurrent users by date

Question

I have a table with the following columns and sample data:
concurrent_users  concurrent_timestamp       concurrent_date
3                 2020-06-24 07:20:00 UTC    2020-06-24
7                 2020-06-24 08:20:00 UTC    2020-06-24
8                 2020-07-05 06:20:00 UTC    2020-07-05
2                 2020-07-05 03:20:00 UTC    2020-07-05

For each date, I want to list the timestamp at which concurrent_users is highest.
Expected output:
concurrent_users  concurrent_timestamp       concurrent_date
7                 2020-06-24 08:20:00 UTC    2020-06-24
8                 2020-07-05 06:20:00 UTC    2020-07-05

I tried the following query:
Error 1
select concurrent_timestamp, max(concurrent_users)
from sample_data
group by concurrent_date;

But this gives an error because the concurrent_timestamp column is not included in the GROUP BY clause.
When I put concurrent_timestamp in the GROUP BY instead, I get ...
Error 2
select concurrent_timestamp, max(concurrent_users)
from sample_data
group by concurrent_timestamp ;

This does not give the desired output; instead, it lists more than one record for each date.
How can I solve this?

RolandoMySQLDBA · Answer

Just apply the DATE() function to concurrent_timestamp when grouping. Note that this returns the maximum for each date, but not the timestamp at which it occurred:
select
    DATE(concurrent_timestamp) timestamp_date
   ,max(concurrent_users) max_concurrent_users
from sample_data
group by DATE(concurrent_timestamp);

Erwin Brandstetter · Answer

In PostgreSQL you can simply use DISTINCT ON:
SELECT DISTINCT ON (concurrent_date) *
FROM   sample_data
ORDER  BY concurrent_date, concurrent_users DESC;

We don't need the redundant column concurrent_date for this at all. Casting concurrent_timestamp on the fly is very cheap - overall cheaper than storing the functionally dependent value redundantly.
SELECT DISTINCT ON (concurrent_timestamp::date) *
FROM   sample_data
ORDER  BY concurrent_timestamp::date, concurrent_users DESC;
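
For readers unfamiliar with DISTINCT ON, the logic of the queries above can be sketched outside the database: sort by date and then by user count descending, and keep the first row of each date. This is only an illustration in plain Python of what the query asks for, not how PostgreSQL executes it; the tuple layout and the function name first_row_per_date are assumptions made for the example.

```python
from itertools import groupby
from operator import itemgetter

# Rows as (concurrent_users, concurrent_timestamp, concurrent_date),
# copied from the sample data in the question.
rows = [
    (3, "2020-06-24 07:20:00 UTC", "2020-06-24"),
    (7, "2020-06-24 08:20:00 UTC", "2020-06-24"),
    (8, "2020-07-05 06:20:00 UTC", "2020-07-05"),
    (2, "2020-07-05 03:20:00 UTC", "2020-07-05"),
]


def first_row_per_date(rows):
    """Mimic ORDER BY concurrent_date, concurrent_users DESC,
    then keep only the first row of each date."""
    ordered = sorted(rows, key=lambda r: (r[2], -r[0]))
    # groupby needs its input sorted by the grouping key, which it is.
    return [next(group) for _, group in groupby(ordered, key=itemgetter(2))]


for row in first_row_per_date(rows):
    print(row)
```

If two rows on the same date tie for the highest count, this sketch keeps whichever sorts first; in the SQL, adding a further ORDER BY expression is what makes the choice deterministic.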

See:

Select first row in each GROUP BY group?

DISTINCT ON is typically fastest for few rows per group. Depending on table definition and data distribution, there may be (much) faster solutions. See:

Optimize GROUP BY query to retrieve latest row per user

nbk · Answer

You can do something like this in MySQL:

CREATE TABLE sample_data (
  `concurrent_users` INTEGER,
  `concurrent_timestamp` VARCHAR(30),
  `concurrent_date` DATE
);

INSERT INTO sample_data
  (`concurrent_users`, `concurrent_timestamp`, `concurrent_date`)
VALUES
  ('3', '2020-06-24 07:20:00 UTC', '2020-06-24'),
  ('7', '2020-06-24 08:20:00 UTC', '2020-06-24'),
  ('8', '2020-07-05 06:20:00 UTC', '2020-07-05'),
  ('2', '2020-07-05 03:20:00 UTC', '2020-07-05');

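-- Join on the date as well as the count, so that a row from another date with the same count cannot match.
select s.concurrent_timestamp, s1.maxid
from sample_data s INNER JOIN (select concurrent_date, max(concurrent_users) maxid
from sample_data
group by concurrent_date) s1
  ON s.concurrent_date = s1.concurrent_date
 AND s.concurrent_users = s1.maxid;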

concurrent_timestamp    | maxid
:---------------------- | ----:
2020-06-24 08:20:00 UTC |     7
2020-07-05 06:20:00 UTC |     8
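
One point worth checking in a join of this kind is that it matches on concurrent_date as well as on the maximum; otherwise a row from a different date that happens to carry the same count is returned too. The sketch below runs the idea on an in-memory SQLite database through Python's sqlite3 module, as a stand-in for MySQL. The fifth row is a hypothetical addition, not part of the question's data, included only to make the collision visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sample_data (
        concurrent_users     INTEGER,
        concurrent_timestamp TEXT,
        concurrent_date      TEXT
    )
""")
conn.executemany(
    "INSERT INTO sample_data VALUES (?, ?, ?)",
    [
        (3, "2020-06-24 07:20:00 UTC", "2020-06-24"),
        (7, "2020-06-24 08:20:00 UTC", "2020-06-24"),
        (8, "2020-07-05 06:20:00 UTC", "2020-07-05"),
        (2, "2020-07-05 03:20:00 UTC", "2020-07-05"),
        # Hypothetical extra row: not the maximum for 2020-07-05, but its
        # count equals the maximum of 2020-06-24.
        (7, "2020-07-05 09:20:00 UTC", "2020-07-05"),
    ],
)

# Derived table with the per-date maximum, joined back on date AND count.
JOIN_ON_DATE_AND_COUNT = """
    SELECT s.concurrent_date, s.concurrent_timestamp, s1.maxid
    FROM sample_data s
    INNER JOIN (
        SELECT concurrent_date, MAX(concurrent_users) AS maxid
        FROM sample_data
        GROUP BY concurrent_date
    ) s1
      ON  s.concurrent_date  = s1.concurrent_date
      AND s.concurrent_users = s1.maxid
    ORDER BY s.concurrent_date
"""

daily_peaks = conn.execute(JOIN_ON_DATE_AND_COUNT).fetchall()
for row in daily_peaks:
    print(row)
```

With the date condition in place, the hypothetical 09:20 row is ignored and one row per date comes back. If two rows on the same date tie for the maximum, this join returns both of them.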

Rick James · Answer

This is a "groupwise-max" problem. You can't do it with a simple query; you need a subquery. There are many examples under the tag I added. Or you can read my blog: http://mysql.rjweb.org/doc.php/groupwise_max

Answered by Rick James on November 14, 2021
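
As an illustration of one common groupwise-max pattern (a sketch, not taken from the linked blog), the rows can be numbered within each date by descending user count in a subquery, keeping only the first of each. This relies on window functions, which are available in MySQL 8.0+, PostgreSQL and SQLite 3.25+. The example runs against an in-memory SQLite database through Python's sqlite3 module as a stand-in; the function name peak_per_date and the tie-break on the earliest timestamp are assumptions made for the example.

```python
import sqlite3


def peak_per_date(rows):
    """Return one (users, timestamp, date) row per date: the row with the
    highest concurrent_users, with ties broken by the earliest timestamp."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sample_data ("
        "concurrent_users INTEGER, "
        "concurrent_timestamp TEXT, "
        "concurrent_date TEXT)"
    )
    conn.executemany("INSERT INTO sample_data VALUES (?, ?, ?)", rows)
    query = """
        SELECT concurrent_users, concurrent_timestamp, concurrent_date
        FROM (
            SELECT *,
                   ROW_NUMBER() OVER (
                       PARTITION BY concurrent_date
                       ORDER BY concurrent_users DESC, concurrent_timestamp
                   ) AS rn
            FROM sample_data
        ) AS ranked
        WHERE rn = 1
        ORDER BY concurrent_date
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


sample = [
    (3, "2020-06-24 07:20:00 UTC", "2020-06-24"),
    (7, "2020-06-24 08:20:00 UTC", "2020-06-24"),
    (8, "2020-07-05 06:20:00 UTC", "2020-07-05"),
    (2, "2020-07-05 03:20:00 UTC", "2020-07-05"),
]
for row in peak_per_date(sample):
    print(row)
```

Because ROW_NUMBER() assigns exactly one row the number 1 in each partition, this form returns a single row per date even when several rows tie for the maximum.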
