Stack Overflow Asked by Maxsteel on January 1, 2022
I have a table that looks like this:
host, job, folder, file, mtime
Folder names are not unique: the same folder name can appear for a job spread across different hosts. For each folder name, I need to pick the row whose file mtime is the maximum across all folders with that name on all hosts. Roughly, I need something like this:
Select the (host, folder) pair whose (host, job, folder) tuple has max(max(file mtime))
Example:
1, j1, f1, e1, 2
2, j1, f1, e2, 0
2, j1, f1, e9, 3
3, j1, f1, e3, 2
1, j2, f2, e4, 3
2, j2, f2, e5, 4
3, j2, f2, e6, 5
1, j3, f3, e7, 6
2, j3, f3, e8, 7
The result would be:
2, j1, f1, e9, 3
3, j2, f2, e6, 5
2, j3, f3, e8, 7
The table is huge, so I am trying to find the best possible way to do this. Thanks
A window function like ROW_NUMBER() should provide the best performance:
SELECT host, job, folder, file, mtime
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY folder, job ORDER BY mtime DESC) rn
  FROM tablename
) t  -- derived-table alias: optional in SQLite, required in MySQL and SQL Server
WHERE rn = 1
Results:
| host | job | folder | file | mtime |
| ---- | --- | ------ | ---- | ----- |
| 2 | j1 | f1 | e9 | 3 |
| 3 | j2 | f2 | e6 | 5 |
| 2 | j3 | f3 | e8 | 7 |
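To check the window-function query against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite 3.25 or later (the first release with window functions); the in-memory database, the column types, the added ORDER BY, and the variable names are illustrative assumptions, not part of the original answer.

```python
import sqlite3

# Sample rows from the question: (host, job, folder, file, mtime)
rows = [
    (1, "j1", "f1", "e1", 2),
    (2, "j1", "f1", "e2", 0),
    (2, "j1", "f1", "e9", 3),
    (3, "j1", "f1", "e3", 2),
    (1, "j2", "f2", "e4", 3),
    (2, "j2", "f2", "e5", 4),
    (3, "j2", "f2", "e6", 5),
    (1, "j3", "f3", "e7", 6),
    (2, "j3", "f3", "e8", 7),
]

conn = sqlite3.connect(":memory:")
# Column types are an assumption; the question does not state them.
conn.execute(
    "CREATE TABLE tablename "
    "(host INTEGER, job TEXT, folder TEXT, file TEXT, mtime INTEGER)"
)
conn.executemany("INSERT INTO tablename VALUES (?, ?, ?, ?, ?)", rows)

# Number the rows of each (folder, job) group from newest to oldest mtime,
# then keep only the newest row of each group.
query = """
SELECT host, job, folder, file, mtime
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY folder, job ORDER BY mtime DESC) rn
  FROM tablename
) t
WHERE rn = 1
ORDER BY folder, job
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```

The ORDER BY on the outer query only makes the output order deterministic; it does not change which rows are selected.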
Answered by forpas on January 1, 2022
You can filter with a subquery:
select t.*
from mytable t
where t.mtime = (
  select max(t1.mtime)
  from mytable t1
  where t1.folder = t.folder and t1.job = t.job
)
For performance, consider an index on (folder, job, mtime).
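As a rough illustration of that index, the sketch below creates it in an in-memory SQLite database through Python's sqlite3 module and then runs the subquery filter. The index name idx_mytable_folder_job_mtime, the column types, and the reduced data set are assumptions made for the example. EXPLAIN QUERY PLAN is SQLite-specific and its output format varies between versions, so the plan is only printed, not interpreted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are an assumption; the question does not state them.
conn.execute(
    "CREATE TABLE mytable "
    "(host INTEGER, job TEXT, folder TEXT, file TEXT, mtime INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        (1, "j1", "f1", "e1", 2),
        (2, "j1", "f1", "e9", 3),
        (1, "j2", "f2", "e4", 3),
        (3, "j2", "f2", "e6", 5),
    ],
)

# Hypothetical index name. The equality columns (folder, job) come first and
# mtime last, so the max(mtime) of one group can be read from the index.
conn.execute(
    "CREATE INDEX idx_mytable_folder_job_mtime ON mytable (folder, job, mtime)"
)

query = """
select t.*
from mytable t
where t.mtime = (
  select max(t1.mtime)
  from mytable t1
  where t1.folder = t.folder and t1.job = t.job
)
order by t.folder, t.job
"""

# Print the plan SQLite chooses (format depends on the SQLite version).
for step in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(step)

result = conn.execute(query).fetchall()
print(result)
```

Other databases have their own tools for inspecting a plan, so the same index should be checked with the target system's equivalent before relying on it.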
You did not specify how you want to handle potential top ties (rows that relate to the same folder and job with the maximum mtime): this query does return them.
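To make the tie behaviour concrete, here is a small sketch using Python's sqlite3 module (SQLite 3.25 or later assumed for the window function). The data is hypothetical: two files in the same folder and job share the maximum mtime. The subquery filter keeps both rows, while a ROW_NUMBER() filter keeps exactly one of them, and which one is not defined unless the ORDER BY is extended with a tiebreaker.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable "
    "(host INTEGER, job TEXT, folder TEXT, file TEXT, mtime INTEGER)"
)
# Hypothetical data: e1 and e9 tie on the maximum mtime (3) for (f1, j1).
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        (1, "j1", "f1", "e1", 3),
        (2, "j1", "f1", "e9", 3),
        (3, "j1", "f1", "e3", 2),
    ],
)

# Subquery filter: every row that matches the group maximum is returned.
with_ties = conn.execute(
    """
    select t.*
    from mytable t
    where t.mtime = (
      select max(t1.mtime)
      from mytable t1
      where t1.folder = t.folder and t1.job = t.job
    )
    order by t.host
    """
).fetchall()

# ROW_NUMBER() filter: exactly one row per (folder, job) group survives.
one_per_group = conn.execute(
    """
    select host, job, folder, file, mtime
    from (
      select *, row_number() over (partition by folder, job order by mtime desc) rn
      from mytable
    ) t
    where rn = 1
    """
).fetchall()

print(len(with_ties), "rows with ties kept")
print(len(one_per_group), "row with ROW_NUMBER()")
```

If ties should be kept while still using a window function, replacing ROW_NUMBER() with RANK() is the usual way to get the same rows as the subquery filter.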
Answered by GMB on January 1, 2022