Throw more wood on the fire!

Think that you have enough rows for that index?

Maybe not.  Recently I was working on some demo scripts for a future presentation which included things to look for in an execution plan.  Namely, RID/Key look ups.  Naturally, I had built out a table, applied a clustered and non-clustered index, and started to run my queries.  Based on my queries, I fully expected a nice look up to happen, which would prove that my scripts were working as I expected.

CREATE TABLE Beaker (id INT IDENTITY(1,1)
, fname VARCHAR(100)
, lname NVARCHAR(100)
, age tinyint
, bankbalance bigint  --19 digits
)
GO
INSERT dbo.Beaker VALUES ('fname', 'lname', 34, 9223372036854775807)
GO 15000
CREATE CLUSTERED INDEX IX_ClustedIndex ON dbo.Beaker (ID)
GO
CREATE NONCLUSTERED INDEX IX_NC_Index ON dbo.Beaker (fname)
GO
-- updated some values just because
UPDATE dbo.Beaker
SET fname = 'bob'
WHERE ID between 100 and 250
GO
-- key look up
SELECT ID, age
FROM dbo.Beaker
WHERE fname = 'bob'
GO

Nope.  I was wrong.

I couldn’t believe it.  I know that I had coded things up correctly. I started to surf the net to see where I had screwed up. I still couldn’t believe it!

I then ran across a posting from Pinal Dave (Blog|Twitter) talking about the same thing that I was in my presentation. Of course, his demo scripts are much nicer than mine! I hope, with practice and time, I’ll get there. =) Anyway, I happen to notice that he had used a record set that was 100k rows where mine was only 15k. I thought, would that make a difference? I’m not sure why it would? So I tried it.

I truncated my table and completely reloaded the data, only this time it was set for 150k rows vs only 15k. Once done, I re-ran my query and presto chango, the key look up is there!

A key lookup
Thar she blows! Key lookup off port bow!

This made me start to think, why would the optimizer do that?  The only thing that I can think of (and it’s late) is that the optimizer when evaluating the best execution plan, must decided that doing a table (IE: clustered index) scan is faster than doing a key lookup from a non-clustered index simply because of the size of the data itself.

This makes sense right?  Thus, when I bumped the row count to 150k thousand rows, now the optimizer has a lot more to play with and decides that doing a key look up is the best route.  I did some quick google searches however I wasn’t able to (quickly) confirm my answers but I’ll continue to look around.

If you think I’m wrong, please feel free to correct me.  I’m here to learn.  =)

I guess the moral of the story is that if you expect for the optimizer to throw back a particular type of plan operator and it doesn’t, you might want to give it more “wood” to play with.  It might surprise you.  Enjoy!

© 2011, John Morehouse. All rights reserved.

One Reply to “Throw more wood on the fire!”

Hey you! Leave me a comment and start a discussion!

This site uses Akismet to reduce spam. Learn how your comment data is processed.