Database Administrators Asked by JShark on October 28, 2021
I have been tuning a query/ indexes in a dev environment, but cannot replicate the new query plan when applying the same changes to a UAT environment.
Specifically, in the UAT environment the optimizer chooses to ignore a particular index and instead performs a nonclustered index seek on an existing unique constraint, followed by a key lookup on the clustered index.
An abridged version of the query is:
select d.dim_date_id
, f.dim_form_id
, d.date
INTO #TMP
from DWH.dbo.tbl_fact_outcome f
join DWH.dbo.tbl_dim_date d ON d.date >= DATEADD(DAY,-1,f.known_from) and d.date < f.known_to
join DWH.dbo.tbl_dim_form df ON f.dim_form_id = df.dim_form_id
join DWH.dbo.tbl_dim_question Q ON f.dim_question_id = Q.dim_question_id
where (d.flag_latest_day = 'Y' or d.flag_end_of_month = 'Y' or (d.flag_end_of_week = 'Y' AND d.flag_latest_week = 'Y'))
and d.flag_future_day = 'N'
and df.flag_latest_form = 'Y'
and f.deleted = 0
and q.question_key like 'R%'
and d.date >= '14/07/2020'
The index in question is:
CREATE NONCLUSTERED INDEX [ix_tbl_dim_date_flag_future_day_includes] ON [dbo].[tbl_dim_date]
([flag_future_day] ASC)
INCLUDE([dim_date_id],[date],[flag_end_of_month],[flag_latest_day],[flag_end_of_week],[flag_latest_week])
It’s not a large table and should only return one or two dates which filter results from subsequent joins accordingly. Adding other fields to key columns in index made no difference in the dev environment – the only improvement came from the order/ shape of the query plan.
Both environments have the exact same table structure and indexes and are using the same query.
The dev environment does have around half the volume of data overall, but even if I amend the date range in either to return a similar number of records the query plans in both environments stay as they are.
There was a dramatic improvement on performance in the dev environment since this query plan has been used and I am fairly confident the same would be seen in my UAT environment.
I have tried using query hints to force the index and/ or the order but haven’t been able to replicate this plan.
Essentially I have two questions:
I am using SQL Server 2014 enterprise edition.
EDIT – 20/07/2020 – Actual execution plans added
Actual good execution plan from dev environment – running this took 37 seconds:
https://www.brentozar.com/pastetheplan/?id=BJkr-Q7gP
Actual bad execution plan from UAT – running this took 33m 42s: https://www.brentozar.com/pastetheplan/?id=H1nsxX7ew
This is not a direct answer to your question however few observation which might put you in right direction to solve the issue. First few observations from actual plan:
You may read more on this link how number of rows could affect query plan and other joining conditions.
I downloaded the XML for good and bad plan. If you use SentryOne Plan explorer, below is the one which is highlighted in bad plan:
Number of rows read is in billions and it is 3 times than that of actually required.
In the case of good plan, I/O residual is much lesser.
So, you need to solve this issue first and also point raised by David Browne on table tbl_fact_outcome. Below are some of the pages where it could help you:
https://sqlperformance.com/2016/06/sql-indexes/actual-rows-read-warnings-plan-explorer
"Warnings: Operation caused residual I/O" versus key lookups
https://www.sqlshack.com/the-impact-of-residual-predicates-in-a-sql-server-index-seek-operation/
I hope this helps you.
Answered by Learning_DBAdmin on October 28, 2021
This was totally a workaround, but it worked for me. I had 4 servers running the same database with different data sets. Three of them would chose an optimal plan, but the fourth would never choose that plan, and was consistently taking significantly longer. In my case I could see the good plans seeking one of the tables for a handful of rows out of around 100,000 first, then doing the other joins. The bad plan would do that seek last and would seek that table 500,000 plus times. I also tried the query hints and re-writing the query in different ways with no luck.
So my work around was to manually query that table for those handful of rows first into a temp table, then join to that instead of the full table.
So in your case could you run something like this first into a temp table and then join against it instead of the whole DWH.dbo.tbl_dim_date table:
select d.dim_date_id
,d.date
,d.flag_future_day
from DWH.dbo.tbl_dim_date d
where (d.flag_latest_day = 'Y' or d.flag_end_of_month = 'Y'
or (d.flag_end_of_week = 'Y' AND d.flag_latest_week = 'Y'))
and d.flag_future_day = 'N'
and d.date >= '14/07/2020'
Not sure in your case if the above query would be selective enough to be really helpful and I realize that it is a workaround. But sometimes workarounds work!
Another thought, are you able to modify that NCI you included? It might help if you added the date column to the indexed columns instead of the INCLUDE columns. There are a lot of filters done in that query based on that date column.
Answered by John on October 28, 2021
Get help from others!
Recent Answers
Recent Questions
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP