Think that you have enough rows for that index?
Maybe not. Recently I was working on some demo scripts for a future presentation which included things to look for in an execution plan. Namely, RID/Key look ups. Naturally, I had built out a table, applied a clustered and non-clustered index, and started to run my queries. Based on my queries, I fully expected a nice look up to happen, which would prove that my scripts were working as I expected.
CREATE TABLE Beaker (id INT IDENTITY(1,1) , fname VARCHAR(100) , lname NVARCHAR(100) , age tinyint , bankbalance bigint --19 digits ) GO INSERT dbo.Beaker VALUES ('fname', 'lname', 34, 9223372036854775807) GO 15000 CREATE CLUSTERED INDEX IX_ClustedIndex ON dbo.Beaker (ID) GO CREATE NONCLUSTERED INDEX IX_NC_Index ON dbo.Beaker (fname) GO -- updated some values just because UPDATE dbo.Beaker SET fname = 'bob' WHERE ID between 100 and 250 GO -- key look up SELECT ID, age FROM dbo.Beaker WHERE fname = 'bob' GO
Nope. I was wrong.
I couldn’t believe it. I know that I had coded things up correctly. I started to surf the net to see where I had screwed up. I still couldn’t believe it!
I then ran across a posting from Pinal Dave (Blog|Twitter) talking about the same thing that I was in my presentation. Of course, his demo scripts are much nicer than mine! I hope, with practice and time, I’ll get there. =) Anyway, I happen to notice that he had used a record set that was 100k rows where mine was only 15k. I thought, would that make a difference? I’m not sure why it would? So I tried it.
I truncated my table and completely reloaded the data, only this time it was set for 150k rows vs only 15k. Once done, I re-ran my query and presto chango, the key look up is there!
This made me start to think, why would the optimizer do that? The only thing that I can think of (and it’s late) is that the optimizer when evaluating the best execution plan, must decided that doing a table (IE: clustered index) scan is faster than doing a key lookup from a non-clustered index simply because of the size of the data itself.
This makes sense right? Thus, when I bumped the row count to 150k thousand rows, now the optimizer has a lot more to play with and decides that doing a key look up is the best route. I did some quick google searches however I wasn’t able to (quickly) confirm my answers but I’ll continue to look around.
If you think I’m wrong, please feel free to correct me. I’m here to learn. =)
I guess the moral of the story is that if you expect for the optimizer to throw back a particular type of plan operator and it doesn’t, you might want to give it more “wood” to play with. It might surprise you. Enjoy!
© 2011, John Morehouse. All rights reserved.