Quick Script: Index Sizes

In the world of performance monitoring, there are a number of ways to identify problems. Each method is different, and no single approach is necessarily better than another.

Over the years, I’ve started looking at metrics around non-clustered indexes to help find problems.  Once, while working at a client site, I was astounded by how rapidly they would implement changes.  This quick rate of change usually had detrimental effects on the performance of the application.  And, as usual, once a change was in Production, it was difficult to fix things.

One day, I happened to be looking at their flagship application’s database.  The table I was focused on was about 60GB in size.  This, in itself, was not a cause for alarm.  However, the 120GB worth of non-clustered index space was.  Did this mean that we had a performance issue?  Not necessarily.  In my opinion, it meant that we had the potential for a performance issue.
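If you just want a quick, rough check of a single table before running the full script later in this post, something along these lines works against sys.dm_db_partition_stats (the table name below is a placeholder; substitute your own):

-- Quick single-table sanity check: heap/clustered space vs. total non-clustered space
SELECT CASE WHEN i.index_id IN (0,1) THEN 'Heap/Clustered' ELSE 'NonClustered' END AS IndexType
     , SUM(ps.used_page_count) * 8 AS UsedSpaceKB
FROM sys.dm_db_partition_stats AS ps
    INNER JOIN sys.indexes AS i
        ON i.object_id = ps.object_id
        AND i.index_id = ps.index_id
WHERE ps.object_id = OBJECT_ID('dbo.YourBigTable')  -- placeholder table name
GROUP BY CASE WHEN i.index_id IN (0,1) THEN 'Heap/Clustered' ELSE 'NonClustered' END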

Here is the common problem that I see.

  1. Ted writes a query for the application.
  2. Ted knows he needs an index so creates a non-clustered index to cover his query.  Awesome!
  3. Paul also writes a query for the application.
  4. Paul is just as smart as Ted and so he writes another non-clustered index for his query.
  5. Rinse and Repeat

Shockingly, this occurs more often than you’d think.  Instead of checking to see whether an existing index might already cover their respective queries, they just blindly create a new one.

To help identify this potential issue, you could simply compare the total number of non-clustered indexes on the table; however, in my opinion, that doesn’t tell you the whole story.  Without knowing how large those indexes are, the picture is incomplete.

The script below will give you a better picture.  It compares the heap/clustered index size against the aggregate size of the non-clustered indexes.  The script is configured to flag tables whose non-clustered indexes total more than twice the size of the table itself.  Your mileage may vary, and you can certainly adjust the threshold to your liking.

If you run the script against one of your databases and the script alerts you to a potential issue, you will have to further evaluate that particular table.

DISCLAIMER: Do not run code you find on the internet in your production environment without testing it very thoroughly.  Period.  

/***************************************************************
-- Author: John Morehouse
-- Date: April 2015
-- T: @SQLRUS
-- E: john@jmorehouse.com
-- B: http://sqlrus.com

--THIS CODE AND INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A PARTICULAR PURPOSE.

--IN OTHER WORDS: USE AT YOUR OWN RISK.

--AUTHOR ASSUMES ZERO LIABILITY OR RESPONSIBILITY.

--You may alter this code for your own purposes.
--You may republish altered code as long as you give due credit.
***************************************************************/

;WITH TableIndexes (tableName , datapages, datasize)
	AS (
				SELECT st.name, si.dpages, (CAST(si.dpages as BIGINT) * 8192)/ 1024 as 'dataSize'
				FROM sys.tables st
						INNER JOIN sys.sysindexes si ON st.object_id = si.id
				WHERE st.type = 'U' -- user tables
						AND si.indid IN (0,1) -- Heap or clustered index
	)
	SELECT
			ss.name as 'Schema'
			, st.name as 'Table Name'
			, MAX(ti.datasize) AS 'Table Size(KB)'
			, SUM((CAST(si.dpages as BIGINT)*8192)/1024) AS 'Total NC Index Size(KB)'
			, SUM(si.rowcnt) as 'Total Row Count'
			, COUNT(si.indid) as 'NonClustered Index Count'
			, CASE
					WHEN MAX(ti.datasize*2) < SUM((CAST(si.dpages as BIGINT)*8192 )/1024) THEN 'WARNING'
					ELSE 'OK'
					END AS 'Response'
			FROM sys.tables st
				INNER JOIN sys.sysindexes si on st.object_id = si.id
				INNER JOIN TableIndexes ti ON st.name = ti.tableName
				INNER JOIN sys.schemas ss on st.schema_id = ss.schema_id
		WHERE st.type = 'U' -- user tables
				AND si.indid > 1
		GROUP BY ss.name
					, st.name
		ORDER BY [Schema],[Table Name]

This script can be useful to help identify potential issues.   It is just another tool for your toolbox.

Enjoy!


I’m Speaking! SQL Saturday #397 – Omaha

SQLSAT397_web

I recently blogged about the opportunity to speak at the Omaha SQL Server User Group.  In a nutshell, despite being a co-founder of the group back in 2007, I had never presented to the group as a speaker.  In July, I finally got that chance.

Now I have another opportunity.  Just like with the user group, despite being an organizer of the first two SQL Saturdays in Omaha (91 & 197), I had never submitted to be a presenter.  With SQL Saturday #397, I finally submitted, and two of my sessions were selected.  As such, this weekend I’ll be traveling back to my hometown.  I’m looking forward to seeing old friends, enjoying great food, and, of course, presenting to the SQL community.

The schedule is packed with fantastic speakers, and the event is being held in a great venue, so this is definitely an event you don’t want to miss.  If you like good people, free training, and hanging out with other geeks, come on out to Mammel Hall on Saturday, August 15th, 2015.  Join us and learn something.

Here are the abstracts for my sessions.  Please do come out and see me!

Social Networking Is Alive!

Social Media is all over the news today. Did you see that on Facebook? Johnny 5 said what on Twitter? Can you believe Susie moved to that company? Do any of those sound familiar? No?  Never heard of Twitter or LinkedIn? Have no fear, this is the session for you!  Social networking can be a very confusing and intimidating place. What makes it even worse is when you have no clue where to start.  In this session, we will go over different facets of social networking and media and how you can use them to enhance not only your personal life but, just as importantly, your professional one. For example, did you know you can get help from some of the top minds in the SQL Server profession for free by using the #SQLHELP hash tag? We will show you how! We’ll be covering that and more using Facebook, LinkedIn and Twitter.

10 Things Every DBA Should Know!

Are you new to the world of databases? Are you a developer wanting to improve your SQL skills? If you answered ‘Yes’ to either of those questions, then this is the session for you! John will be reviewing things that he’s seen in his experience working with databases that every DBA should know and understand. He will demonstrate and/or explain everyday things like Auto Shrink, SELECT *, index basics and 7 others! You will walk away with knowledge you can implement right away.

FREE Training

Remember that this event is a FREE day of awesome SQL Server training and professional development.  The only cost to you is if you decide to purchase lunch! Check out the schedule and register.  In today’s economy, how can you afford to pass up FREE training?!?!

If you make it out to either of my sessions, make sure to come up and introduce yourself, as I’d love to meet you! I may look like a mean DBA, but I promise that I’m not.  =)

See you in Omaha!


SQL Saturday Speaker Marketing – With Powershell

pass_marketing_2015

If you follow the Twitter account for SQL PASS, you probably noticed that they are actively promoting individual Summit 2015 speakers.  I think this is a fantastic marketing strategy not only for the individual speakers but also for the Summit itself.

I’m currently involved with organizing a SQL Saturday event (SQL Saturday #403 – Louisville).  With these events, I am always on the lookout for ways to make our event even better.  When I saw PASS doing this type of marketing, I knew that we had to do it for our event.

Now, if you look at the event schedule, we have 36 speakers.  I’ll admit it would be relatively easy to copy and paste things together for a tweet.  But what about those SQL Saturdays that have 75 speakers, like Atlanta?  With 75 speakers, the copy-and-paste process would become unwieldy quickly.

sqlsat_xml_feed_example

One of the benefits of the SQL Saturday website is that it offers an XML feed for every event.  This XML feed provides a number of useful data elements, including the speakers, the sessions, and the event itself.  Given that I like to make things suck less, I wanted to find a way to dynamically generate Twitter content using this XML feed.  Once I have the content, I can schedule the tweets appropriately in my favorite social media application.

Enter Powershell.  Stage left.

Powershell has many abilities.  One of those is the ability to consume and manipulate XML data.  This is a perfect solution to generate what I need.  I modified the format of the tweet from the one that PASS was using.  In short, I wanted the script to:

  • Print out the length of the tweet so that adjustments could be made if longer than 140 characters
  • Include a link to each individual session
  • Automatically shorten the session URL
  • Include session title
  • Automatically use the appropriate event hash tag, namely the number or the name
  • If the speaker included their Twitter handle, include that as well
  • Output the results either to the screen or to a CSV file

Bit.ly URL Shortening

posh_tweet_1

First, let’s get set up to shorten the session URL.  There are a number of services that can do this; however, I chose to use Bit.ly, for no specific reason other than that I was able to find examples of how to communicate with their service on the internet.  In order to do this, I had to create an account at Bit.ly and generate an application token.  This token will allow the script to automatically pass in the long URL and retrieve the shortened version.

It is worth noting that you should not publicize the token as someone could use it to impersonate you when communicating with the service.

Let’s dive into the script!

The first thing that we are going to do in the script is declare some variables that will be used later in the script. We will also create a function to shorten the URL. This function will be called as we iterate through the XML data.

#Tweet Format
#Speaker [name] presents [title] [SessionURL] /cc [twitter handle] [event hash tag]

$eventNum = "403"
$OAuthToken =  "[bit.ly token goes here]"
$outputfile = "c:\temp\sqlsat$($eventnum)_tweets.csv"
$outtofile = 1

#declare our array to hold our tweets
$tweets = @()

# used to shorten the session URL for a better character count for Twitter
# internals borrowed from http://powershellnotebook.com/2014/10/29/powershell-shorten-links-with-the-bit-ly-api/
# function wrapper was written by John Morehouse

function Get-ShortURL{
    Param([string]$longURL, 
            [string]$OAuthToken)

  # Make the call
  $MyURL=Invoke-WebRequest `
    -Uri https://api-ssl.bitly.com/v3/shorten `
    -Body @{access_token=$OAuthToken;longURL=$LongURL} `
    -Method Get

  #Get the elements from the returned JSON 
  $MyURLjson = $MyURL.Content | convertfrom-json
   
  # Print out the shortened URL 
  write-output $MyURLjson.data.url 
}   

Given that we now have the function available, we can start to parse the XML data. To do this, we will use the Invoke-WebRequest cmdlet and store the results in the $xdoc object.

#let's get the XML from the SQL Saturday website
$xdoc = Invoke-WebRequest -Uri "http://www.sqlsaturday.com/eventxml.aspx?sat=$eventNum" -UseBasicParsing

If you examine the $xdoc object, you will notice that there are a number of elements that you can use. For this purpose, we will only need certain elements from the Speaker and Event objects.

#we only need a subset of each node of the XML, mainly the speakers and the sessions
$speakers = $xdoc.GuidebookXML.speakers.speaker | select importid, name, twitter
$sessions = $xdoc.GuidebookXML.events.event | select importID, title

To my knowledge, there isn’t an easy way to “join” two Powershell objects together, so instead I opted to use a ForEach loop (if someone knows a better method, please share; I would love to know).  I start with the $speakers object, and for every speaker within the object, we need to get their respective session from the $sessions object. The $sessions object is piped into the Where-Object cmdlet using the criteria that the session import ID is equal to the speaker import ID. In SQL terms, this would be the foreign key between the two objects.

foreach ($speaker in $speakers){
$session = $sessions | where-object {$_.importid -eq $speaker.importID}

Now that we have the speaker and their respective session, we need to clean up some data. The SQL Saturday website does not appear to sanitize the Twitter field. This means that speakers can enter a range of values for their Twitter handle: things like “http://twitter.com/sqlrus”, “sqlrus”, or “@sqlrus”.

Another issue I ran into is with the title. The site stores the title along with HTML-encoded characters. For example, ampersands appear as "&amp;" and apostrophes as "&#39;". Since we want the tweet to be as clean as possible, we will need to adjust the title for those characters.

#sanitize the data some
#if the twitter value is less than 2 characters, just set it to a blank value
IF ($speaker.twitter.Length -le 2){
    $twitter = ""
}
#if the twitter value is larger than 1 and begins with https, replace it with an @
ELSEIF (($speaker.twitter.Length -gt 1 ) -and ($speaker.twitter.substring(0,5) -eq "https")){
    $twitter = "/cc " + $speaker.twitter.Replace("https://twitter.com/","@")
}
#if the twitter value is larger than 1 and begins with http, replace it with an @
ELSEIF (($speaker.twitter.Length -gt 1 ) -and ($speaker.twitter.substring(0,4) -eq "http")){
    $twitter = "/cc " + $speaker.twitter.Replace("http://twitter.com/","@")
}
#if the first character is NOT an @, add one
ELSEIF ($speaker.twitter.substring(0,1) -ne "@"){
    $twitter = "/cc @" + $speaker.twitter
}
#else build in the /cc string
ELSE {$twitter = "/cc " + $speaker.twitter}

#clean up the title if there are any ASCII encoded characters
$title = $session.title.TrimEnd().replace("#39;","'").replace("amp;","&")

At this point, we’ve cleaned up the Twitter information and the title. Now we are ready to get the shortened URL. Since each session has a unique ID for each event, it is pretty simple to dynamically build out the long URL. We will then pass the long URL along with the OAuth security token into the function.

#get the short URL
$longurl = "http://sqlsaturday.com/$eventNum/Sessions/Details.aspx?sid=" +$session.importID
$shortURL = Get-ShortURL -longURL $longURL -OAuthToken $OAuthToken

Now we can bring it all together.  This will insert every individual tweet into our array.

#bring it all together and put it in the array
  $tweets += "Speaker " + $speaker.name + " presents `"" + $title + "`" " + $shortURL + " " + $twitter + " " + $xdoc.GuidebookXML.guide.twitterHashtag

Our array is now fully populated with all of the dynamically created tweets.  Since I wanted the ability to either output the contents to the screen or a CSV, we have to do some conditional logic.  We also provide the array some formatting so that things look nice and clean.

#provide some formatting to our array
$format = @{Label="Length";Expression={$_.Length}}, @{Expression={$_};Label = "Tweet"}

#if the $outtofile variable is 1, export it to CSV.  If not, pump it to the screen.  
If ($outtofile -eq 1){
    $tweets | select-object -Property $format  | export-csv $outputfile
  } else {
    $tweets | select-object -Property $format | format-table -AutoSize
  }

The output to the screen looks like this:
posh_tweet_2

If I had wanted to output the results to a CSV file, then it would look like this:

posh_tweet_5

In our case, we are using the CSV to track when each tweet goes out as well as who is sending it out.

Final Product

Now that I have the tweets created, all I have to do is schedule them accordingly.  I chose to use Buffer because the interface is clean and simple.  If you have a free account, you are limited to 10 tweets in the queue, so you will have to manage it accordingly.

Once you have created a Buffer account, you will need to link it to your Twitter account.  The process is simple to do.  After linking the accounts, you can start to add tweets to the queue.

As you can see below, I’ve got one ready to be added for Bill Fellows.    This particular tweet will only be going to my Twitter feed.  However, my Buffer account is linked to my Twitter feed, my LinkedIn Profile and my Facebook account so I could also publish the content there as well.

posh_tweet_3

posh_tweet_4

I simply add it to the queue and Buffer does the rest! You can also schedule it for a specific date and time.  Whether it’s been scheduled or is just sitting in the queue, at the appropriate time Buffer will deliver the tweet for you! You might notice that Buffer kindly converts the short URL to its own version.  You might have this same experience with your own social media tool, so it is just something to be aware of.

Summary

In today’s world, we want our events to be as successful as possible.  With the number of SQL Saturdays available almost every weekend, every little bit of marketing helps to draw in attendees.  Not only will our speakers get more press time, the event itself will too.  Hopefully a script like this will make things easier.

You can download the entire script here.

Enjoy!


I’m Speaking! SQL Saturday #402 – Indianapolis

SQLSAT402_web

One of the benefits, among several, of moving to Kentucky is the close proximity to several SQL Saturday events.  I am honored to have been selected to speak at SQL Saturday #402 in Indianapolis, Indiana. Indianapolis is a great town, and I’m looking forward to attending my 4th SQL Saturday of 2015 there.  Thankfully, it is only a few hours north of Louisville, so I don’t have to fly this time and it’s an easy drive.

The schedule is packed full of SQL goodness. So if you like free training and hanging out with other geeks, and you have nothing else planned on Saturday, August 8th, 2015, come on out and get your learn on.  Join us and learn something.

Oh, and I’m speaking!

SQL Server Databaseology: A Deep Dive Into Database Internals

Have you ever taken apart a toaster or an alarm clock just to see how it worked? Ever wondered how that database actually functions at the record level, behind the scenes? SQL Server Databaseology is the study of SQL Server databases and their structures down to the very core of the records themselves. In this session, we will explore some of the deep inner workings of a SQL Server database at the record and page level.  You will walk away with a better understanding of how SQL Server stores data and that knowledge that will allow you to build better and faster databases.

If you do attend, please feel free to introduce yourself. While I can’t attest to the other speakers, I know that I don’t bite!

See you in Indy!


I’m Speaking! Edmonton SQL Server User Group

PASS Chapter Logo

I am honored to be speaking at the Edmonton SQL Server User Group.  I always find it humbling to be asked to present, and I thank the group for the opportunity.  My travel budget will not allow me to be onsite for this meeting, so this will be a remote presentation.

If you are lucky enough to be in the area, here are the location details:

CWB Canadian Room
Concourse Level (lower Level)
Canadian Western Bank Building
10303 Jasper Ave
Edmonton, Alberta T5J 3X6
CA

The meeting will be on Thursday, July 23rd, 2015 @ 6PM (MDT).

SQL Server Databaseology: A Deep Dive Into Database Internals

Have you ever taken apart a toaster or an alarm clock just to see how it worked? Ever wondered how that database actually functions at the record level, behind the scenes? SQL Server Databaseology is the study of SQL Server databases and their structures down to the very core of the records themselves. In this session, we will explore some of the deep inner workings of a SQL Server database at the record and page level.  You will walk away with a better understanding of how SQL Server stores data and that knowledge that will allow you to build better and faster databases.

Please make sure to RSVP so that the organizers have an accurate head count!

See you Thursday!

 


Validating Cluster Instance Owner with Powershell

A few weeks ago, for T-SQL Tuesday, I wrote about how I use Powershell to manage failing over clusters.  At work, we have now implemented that script in our patching process, and it has reduced our patching time significantly.  However, the process was still missing something: a validation step to ensure that the script was doing what was expected.  Without automated validation, the on-call person still had to VPN into the environment and manually check the status of each cluster.

After discussing this with the team, we determined that all we needed was an email confirmation post process.  This email would need to be sent out after the clusters are failed over.  This would allow the on-call person to easily identify via email if something was incorrect.

The email needed to contain a couple of things to make it useful.

  1. The expected owner of each cluster
  2. Who currently owns the active node

We are using this script to manage several clustered instances.

The Parts

The first step is to setup the metadata about the email itself.

import-module failoverclusters
$FromAddress = "from@somedomain.com"
$ToAddress = "me@somedomain.com"
$MessageSubject = "Before Patching - Cluster Node Owners"
$SendingServer = "smtp.somedomain.com"

Next we will create a hash table. We will use this to neatly keep track of what we expect for each respective clustered instance. There might be better ways to do this, but I thought that this was sufficient for what we needed.

$expected = @{
"Cluster1" = "Owner1";
"Cluster2" = "Owner2";
"Cluster3" = "Owner3";
"Cluster4" = "Owner4"
}

Now we will define a style sheet for the HTML email message. This style sheet will be applied to the two tables that will be embedded into the message.

$style = "<style>BODY{font-family: Arial; font-size: 10pt;}"
$style = $style + "TABLE{border: 1px solid black; border-collapse: collapse;}"
$style = $style + "TH{border: 1px solid black; background: #dddddd; padding: 5px; }"
$style = $style + "TD{border: 1px solid black; padding: 5px; }"
$style = $style + "</style>"

Now we need to get information about each cluster instance. There are probably several ways to accomplish this, but I chose to use a PowerShell array.

#build the current status array
$list = @((get-clustergroup -cluster Cluster1)
, (get-clustergroup -cluster Cluster2)
, (get-clustergroup -cluster Cluster3)
, (get-clustergroup -cluster Cluster4))

We can now start to build out the body of the message. Let’s begin with the expected outcome.  One thing to note is that a hash table is a single PowerShell object.  In order to do some manipulation, such as applying the style sheet, I had to use the GetEnumerator() method.  This “unwraps” each object.

#this will get us the expected outcome into a table format
$body1 = $expected.GetEnumerator() | sort-object -property name | select-object name, value | convertto-html -head $style | out-string

The second part of the body for the message will be the current status of each cluster.  Similar to the hash table, we have to get each object out of the array that we created.  We do this by piping the list into a ForEach-Object cmdlet.

#this will get us the current status of each cluster into a table format
$body2 = $list | % {$_} | select Cluster,name, ownernode,state | convertto-html -head $style | out-string

Now that we have the two body elements, we can combine them into a single body element. Since the message will be in HTML format, I added wording around each table so that it is easier to read.

#bring it all together into a single body to email
$body = "Expected: " + $body1 + " Current Status: " + $body2

Finally, we can send the message. We will be using the send-MailMessage cmdlet to accomplish this.

#send the message
send-MailMessage -SmtpServer $SendingServer -To $ToAddress -From $FromAddress -Subject $MessageSubject -Body $body -BodyAsHtml

The Whole Pie

import-module failoverclusters
$FromAddress = "from@somedomain.com"
$ToAddress = "me@somedomain.com"
$MessageSubject = "Before Patching - Cluster Node Owners"
$SendingServer = "smtp.somedomain.com"

$expected = @{
"Cluster1" = "Owner1";
"Cluster2" = "Owner2";
"Cluster3" = "Owner3";
"Cluster4" = "Owner4"
}

$style = "<style>BODY{font-family: Arial; font-size: 10pt;}"
$style = $style + "TABLE{border: 1px solid black; border-collapse: collapse;}"
$style = $style + "TH{border: 1px solid black; background: #dddddd; padding: 5px; }"
$style = $style + "TD{border: 1px solid black; padding: 5px; }"
$style = $style + "</style>"

#build the current status array
$list = @((get-clustergroup -cluster Cluster1)
, (get-clustergroup -cluster Cluster2)
, (get-clustergroup -cluster Cluster3)
, (get-clustergroup -cluster Cluster4)) 

#this will get us the expected outcome into a table format
$body1 = $expected.GetEnumerator() | sort-object -property name | select-object name, value | convertto-html -head $style | out-string

#this will get us the current status of each cluster into a table format
$body2 = $list | % {$_} | select Cluster,name, ownernode,state | convertto-html -head $style | out-string 

#bring it all together into a single body to email
$body = "Expected: " + $body1 + " Current Status: " + $body2 

#send the message
send-MailMessage -SmtpServer $SendingServer -To $ToAddress -From $FromAddress -Subject $MessageSubject -Body $body -BodyAsHtml

Summary

The result is a nicely formatted HTML message, as shown below.

You can see that the email is easy to understand.  This will allow us to quickly validate that things are where they should be. Once the patching is completed, we’ll use this same email validation to confirm that things are the way that they should be.


I’m Speaking! Omaha SQL Server/BI User Group

omaha_skyline_400x400

Well, it has finally happened. In the 7+ years that I helped to run the Omaha SQL/BI User Group, I never once gave a presentation of my own.  Of course I spoke, but it was usually about housekeeping items for the group or to help facilitate a “Bring Us Your Problems” session.  This is about to change.

As luck would have it, I’m headed back to Omaha for a vacation.  By sheer coincidence, my trip coincides with the July 2015 meeting of the Omaha SQL/BI User group.  I could not have planned it better.

I am excited to again speak at the meeting, only this time, I’m giving my own presentation.

SQL Server Databaseology: A Deep Dive Into Database Internals

Have you ever taken apart a toaster or an alarm clock just to see how it worked? Ever wondered how that database actually functions at the record level, behind the scenes? SQL Server Databaseology is the study of SQL Server databases and their structures down to the very core of the records themselves. In this session, we will explore some of the deep inner workings of a SQL Server database at the record and page level.  You will walk away with a better understanding of how SQL Server stores data and that knowledge that will allow you to build better and faster databases.

Logistics

The meeting will be held on Wednesday, July 8th, 2015 @ 6PM at Farm Credit Services of America.  Farm Credit Services of America will be our sponsor for the evening.

Please make sure to RSVP so that there is an accurate head count for food.

See you there!

 


T-SQL Tuesday #67 – Extended Events

TSQL2sDay150x150

This month’s T-SQL Tuesday topic is about Extended Events and comes from Jes Borland (B|T).

As in most environments, we utilize a mixture of Windows logins and SQL logins.  Over the years, the number of SQL logins we use has grown.  Knowing that it is a better practice to utilize Windows accounts, we now want to reduce the number of SQL logins.

So how do we do that? You could just disable the accounts and see what squawks, right?  The other alternative, which is what we chose, is to enable several types of monitoring that will allow us to see which accounts are being actively used.  Once we have a defined list, we can weed out the inactive SQL logins as well as move the active ones to Windows accounts.

There are a couple of ways to do this that I’m aware of.

  1. Profiler
  2. Traces
  3. Login Triggers
  4. Extended Events
  5. Audits
  6. Others?

Each of these methods has its pros and cons, so I would advise that you consider each one carefully.

In full disclosure, I do not have much experience with extended events, so I wasn’t even certain that they would work for what I wanted.  Thankfully, it turns out they did!

Brief History on Extended Events

In the beginning, in my opinion, extended events were a bear to use.  In SQL Server 2008 R2 and older, there wasn’t an easy way to configure and utilize them.  There were ways around this, such as using add-in tools.

xe_1

SQL Server 2014

With the release of SQL Server 2012, there is a GUI interface that allows you to create and configure an extended events session.

As you can see on the left, just expand the Management node within SSMS, then Extended Events, and then Sessions.

By default, you will notice you have a system_health session already running.  This is very much like the black box default trace that started to show up in SQL Server 2005.
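If you want to confirm what is running on your own instance without the GUI, a minimal query against the XE DMVs will list the active sessions; you should see system_health in the results:

-- List the Extended Events sessions that are currently running
SELECT name
FROM sys.dm_xe_sessions
ORDER BY name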

I will let you experiment on creating your own extended events session.  There are a number of resources on the internet that will help you accomplish this.

TL;DR

Instead of creating a very long blog post about how to create an extended events session, I have provided the script that I generated.

DISCLAIMER:  Please test accordingly. NEVER run code that you find on the internet against your Production systems without testing first. You have been warned!

IF EXISTS(SELECT 1 from sys.dm_xe_sessions where name = 'MonitorSQLLogins')
    BEGIN
	   DROP EVENT SESSION [MonitorSQLLogins] ON SERVER
    END
GO
CREATE EVENT SESSION [MonitorSQLLogins] ON SERVER 
ADD EVENT sqlserver.connectivity_ring_buffer_recorded(
    ACTION(sqlserver.client_app_name,sqlserver.client_connection_id,sqlserver.client_hostname,sqlserver.context_info,sqlserver.server_principal_name,sqlserver.session_id)
    WHERE ([sqlserver].[session_nt_user]=N'NULL')),
ADD EVENT sqlserver.login(
    ACTION(sqlserver.session_id,sqlserver.session_nt_username,sqlserver.session_server_principal_name)
    WHERE ([sqlserver].[session_nt_user]=N'')) 
ADD TARGET package0.event_file(SET filename=N'MonitorSQLLogins'),
ADD TARGET package0.histogram(SET filtering_event_name=N'sqlserver.login',source=N'sqlserver.session_server_principal_name')
WITH (MAX_MEMORY=4096 KB
,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=ON,STARTUP_STATE=OFF)
GO
ALTER EVENT SESSION MonitorSQLLogins
ON SERVER STATE = START
GO

There are a couple of things to note.

First, I did not want to capture all of the logins, only the SQL logins.  In order to do that, I put a filter (WHERE clause) on the events, limiting them to a session_nt_user of NULL or blank.  This effectively limits the list to SQL logins, because a Windows login would have a value in this field.

Second, I am using a histogram target to keep track of the logins.  The histogram will aggregate the number of times a login occurred.  You could also use the ring_buffer or an event_file target.  Given that I do not care about when the login occurred, just that it did, a histogram seemed more appropriate.

Testing

In order to test, I created a SQL login named ‘LoginTest1’.   Once the account was created, I simply started a new query window and authenticated using those credentials.
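If you want to follow along, here is a minimal sketch for creating a throwaway test login; the password is just a placeholder, and you would drop the login once testing is done:

USE [master];
GO
-- Create a disposable SQL login for testing the session
CREATE LOGIN [LoginTest1] WITH PASSWORD = N'Str0ng_Placeholder_1!', CHECK_POLICY = ON;
GO
--DROP LOGIN [LoginTest1];  -- cleanup when finished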

The Results

Once the extended event session is capturing data, the next hurdle is how to query it.  You can watch the live data stream or you can use this query:

SELECT name, target_name, CAST(st.target_data AS xml)
FROM sys.dm_xe_session_targets AS st
JOIN sys.dm_xe_sessions AS s
ON (s.address = st.event_session_address)
WHERE s.name = 'MonitorSQLLogins'

This query will return the target_data from the session as XML as shown below:

xe_3

xe_4

By clicking on the hyperlink that is returned, you can see a better representation of which accounts are logging into your server.

In this case, the LoginTest1 user account has logged in twice.

One thing to note is the number of “buckets” shown here.  This represents the number of groups that will be tracked by the histogram.  In my case, since I am grouping by login, if there were more than 256 logins (256 is the default), the session would stop logging, meaning any further logins would not be captured, regardless of whether or not they belonged to a new group. Given that I do not have more than 256 SQL logins on my server, this was not an issue for me.

You can adjust this value either by using the GUI when the session is created or by adjusting this line in the above script:

ADD TARGET package0.histogram(SET filtering_event_name=N'sqlserver.login',slots=(512),source=N'sqlserver.session_server_principal_name')
xe_5

If I wanted 512 buckets

Summary

Now that I’ve got a method to track SQL Logins, I can periodically check the histogram to see what accounts are logging into the instance.
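For that periodic check, rather than eyeballing the raw XML, you can shred the histogram into rows.  This is a sketch that assumes the histogram target’s default HistogramTarget/Slot XML shape and the session name used above:

;WITH xeTarget AS (
    SELECT CAST(st.target_data AS xml) AS target_data
    FROM sys.dm_xe_session_targets AS st
        INNER JOIN sys.dm_xe_sessions AS s
            ON s.address = st.event_session_address
    WHERE s.name = 'MonitorSQLLogins'
        AND st.target_name = 'histogram'
)
-- Turn each histogram slot into a login name and a count
SELECT slot.value('(value)[1]', 'nvarchar(128)') AS LoginName
     , slot.value('@count', 'bigint') AS LoginCount
FROM xeTarget AS xt
    CROSS APPLY xt.target_data.nodes('/HistogramTarget/Slot') AS x(slot)
ORDER BY LoginCount DESC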

Watch for a future blog post on how I’ll develop a process to extract this data into a more usable format!

If you have any interest in hosting one of these T-SQL Tuesday events, you can contact the founder, Adam Machanic.  He is always on the lookout for someone to host.  The only requirements are that you have participated in at least two T-SQL Tuesday events and that you have an idea for a topic.

 


Why NULL Placement Matters

showmethebenjamins

Not those Benjamins

Benjamin Franklin once said, “A place for everything and everything in its place.”  While I assume that he was not thinking about NULL values in a SQL Server table, the premise is still correct.  Some data professionals believe that NULL values do not belong in any table design.  However, the reality is that they exist, and I do not see that changing anytime soon.

Several years ago, I learned that when a table is designed, putting the columns that will hold mostly NULL values at the end of the table will help with performance.  To be honest, at the time I took that lesson at face value; I never really understood why it would make a difference. Recently, I ran across my notes from that lesson and decided to investigate it further.

Let’s examine a situation where a table has a variety of NULL placements.  We will create a table, NullPlacementDemo, that has five columns in total: an identity column along with four variable-length character columns.

USE Scratch
GO
IF OBJECT_ID('dbo.NullPlacementDemo') IS NOT NULL
	BEGIN
		DROP TABLE NullPlacementDemo
	END
GO
CREATE TABLE NullPlacementDemo (id int identity(1,1), col1 varchar(10), col2 varchar(10), col3 varchar(10),col4 varchar(10))
GO

Now that the table structure is created, let’s put some data in it.  I will insert three records.

  1. This record will have values in columns 1 & 2, with 3 & 4 NULL
  2. This record will have values in columns 1 & 3, with 2 & 4 NULL
  3. This record will have values in columns 1 & 4, with 2 & 3 NULL

INSERT INTO NullPlacementDemo (col1, col2)
	SELECT REPLICATE('a',10), REPLICATE('b',10)
GO
INSERT INTO NullPlacementDemo (col1, col3)
	SELECT REPLICATE('a',10), REPLICATE('b',10)
GO
INSERT INTO NullPlacementDemo (col1, col4)
	SELECT REPLICATE('a',10), REPLICATE('b',10)
GO

Note that, in order to make the calculations easier, I am filling the variable-length columns to their maximum capacity of ten characters.  Also note that the amount of data inserted is the same for all three records.

Anatomy of a Record

In order to understand why NULL placement matters, we need to investigate the anatomy of the record itself.  I highly recommend the post by Paul Randal, “Inside the Storage Engine: Anatomy of a Record”.  That post does an excellent job of explaining the crucial elements that make up each record.

NullPlacementMatters_3

Taking the information from Paul’s blog, we can deduce that the size of each record should be 37 bytes.  Remember that we inserted the exact same amount of data for all three records.

Regarding the calculations in the image above, the tag bytes, NULL bitmap location, number of columns in the NULL bitmap, and variable column count are fixed values.  The others are determined by how the table is configured (a full tally appears after the list below):

  • Fixed Data Length – the total size in bytes of the fixed-length data.  We have an integer data type for the identity column, so four bytes
  • Bit for every column – ‘N’ represents the total number of columns: 5/8 = 0.625, but you cannot have a partial byte, so we round up to one byte
  • Variable Column Offset – ‘N’ represents the total number of variable-length columns that have a value. We only inserted two values, therefore four bytes
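Since the tally itself lives in the image above, here is a reconstruction of the math, assuming the record layout described in Paul’s post:

/* Reconstruction of the 37-byte tally for the first record:
      2 bytes  - record tag bytes (status bits)
      2 bytes  - pointer to the NULL bitmap (end of the fixed-length portion)
      4 bytes  - fixed-length data (the int identity column)
      2 bytes  - number of columns in the NULL bitmap
      1 byte   - NULL bitmap (5 columns, rounded up to 1 byte)
      2 bytes  - variable-length column count
      4 bytes  - variable column offset array (2 populated columns x 2 bytes)
     20 bytes  - variable-length data (two varchar(10) values filled to capacity)
     --------
     37 bytes total
*/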

In order to confirm this, we can use two of my favorite undocumented tools, DBCC IND and DBCC PAGE to look at the internal structure of the records.

Remember that we need to turn on trace flag 3604 so that the output of the DBCC PAGE command is sent to the SSMS window.

DBCC IND(Scratch,NullPlacementDemo,1)
GO
DBCC TRACEON(3604)
DBCC PAGE('Scratch',1,300,1)
DBCC TRACEOFF(3604)
GO

NullPlacementMatters_1

The results of DBCC IND tell me that I need to look at page ID 300 to see the contents of the page.  The page ID will potentially be different on your system.

Once we determine which page to look at, we’ll use the DBCC PAGE command to look at the page itself.  For simplicity, we are going to use a dump style of one.  This will display the size of each corresponding record.

NullPlacementMatters_2

We confirmed that the first record is 37 bytes, which matches the calculation above.

Wait.  Hold the phone!

The other record sizes are different! How can that be?  We inserted the EXACT same data each time!  You will recall that in the first record the two values were adjacent to each other.  The second record had a NULL value between them, and the third record had two NULL values between them.   The only thing different between all three records was the NULL placement.

The Truth Shall Set You Free

The truth lies in how SQL Server processes the values for each column along with the values in the NULL bitmap.  Paul’s article from above teaches us, among other things:

  • The NULL bitmap keeps track of whether or not a column has a NULL value, even if the column is non-nullable
  • The variable column offset stores the end of each column value

Let’s look closer at the NULL bitmap, specifically for the second record.  Remember that it’s zero-based, thus the second row is in slot one. Using the output of DBCC PAGE, we can see the NULL bitmap byte, which is shown in hexadecimal.

NullPlacementMatters_4

If we convert the hexadecimal value 0x14 (20 in decimal) to binary, we get the value 10100.   A zero indicates that the column is not null.  Reading right to left, this value tells SQL Server that:

  • the first two columns are not null (zero) – 10100
  • the third column is null – 10100
  • the fourth column is not null – 10100
  • the fifth column is null – 10100

Using this information in conjunction with the variable column offset array, SQL Server has to keep track of that third column, even though it is NULL.  Recall that the variable column offset array stores the end location of each column value.   This allows SQL Server to know where the next column starts, as well as providing an easy way to determine how big each value is.

Even though the value in the third column is NULL, SQL Server has to allocate two bytes in order to track where that column ends.  If it did not allocate these two bytes, it wouldn’t know where the fourth column started, since, according to the NULL bitmap, that column is not null.

We can further see this by again looking at the DBCC Page output:

NullPlacementMatters_5

The underlined three in the above image represents the number of variable-length columns that are present.  This means that there are three two-byte entries for these columns.  They are:

  • Col1 – 0x001d (not null)
  • Col2 – 0x001d (Null)
  • Col3 – 0x0027 (not null)

These are highlighted in the above image.  If we convert 0x001d to decimal, this is 29.  If we count over 29 bytes from the beginning of the record, we end up at the last ’61’, as shown below:

NullPlacementMatters_6

This represents the end of the first column.   The second column, which is NULL, has a length of 0 so also ends at the same location, 0x001d.

The third two-byte entry, 0x0027, is 39 in decimal.  This correlates to the last ’62’ in the output above.   You’ll also notice that this is the same as the length of the record.
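If you want to double-check those hex-to-decimal conversions, a trivial query will do it:

-- Sanity check of the offset values from the offset array
SELECT CAST(0x001d AS int) AS EndOfFirstColumn   -- 29
     , CAST(0x0027 AS int) AS EndOfThirdColumn   -- 39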

Since SQL Server has to store these extra two bytes for the NULL value, the record is two bytes larger, or 39 bytes.  Given that the third record has two NULL values between columns one and four, that record is four bytes larger than the first, or 41 bytes in total.

Summary

At the end of the day, SQL Server is all about how many bytes it has to read and write to get the job done.  If you have a table with 100 columns, a mixture of nullable columns, and multi-million row counts, could this impact you?  Possibly.  If the table design calls for nullable columns and you organize them appropriately, you will help reduce the number of bytes that SQL Server has to handle.  A number of factors may affect your mileage, of course, but reducing the amount of I/O is never a bad thing.


SQL Server 2008 R2 SP3 Install Errors

Whenever it is time to install a service pack, every DBA silently hopes that nothing will go wrong.  I would surmise that, in most cases, installing the service pack is a non-event: no flashing lights, no fireworks, no applause, and no major hiccups.  This post is not about one of those times.

The Install

Several weeks ago I attempted to install SQL Server 2008 R2 Service Pack 3 on one of our production instances.   The server was already at SP2 so it was to be a routine install. However, as the installer launched, this error appeared:

sp3setuperror_1

The issue stems from missing files in the installer cache.  When applications, including SQL Server, are installed, certain files are placed within the installer cache.  These files are subsequently used when installing service packs and cumulative updates.  For some unknown reason, b40c768.msi was missing.

My initial search led me to a Microsoft Support page titled “How to restore the missing Windows Installer cache files and resolve problems that occur during a SQL Server update“.  Perfect.   The article has a lot of detail in it and explains the issue fairly well.  It suggests two methods to help resolve the issue, one being a script and the other manual intervention.   I opted for the former, the script.

The Script

The instructions are straightforward.

  1. Download the script to a folder on the affected server, IE: c:\temp
  2. Open an elevated command prompt
  3. Navigate to the folder where the script is located
  4. Execute this command: Cscript FindSQLInstalls.vbs %computername%_sql_install_details.txt

The script will collect information about which packages may or may not be missing from the installer cache.  Note that the script writes its output to a text file in the same folder as the script itself and will overwrite any files that might already be present.  If you wish to run it multiple times, I would suggest changing the file names accordingly.

The output of the script looks like this:

sp3_script_output


You will see that in the above example, it’s shouting at me that the MSI file doesn’t exist in the path D:\x64\set\sql_engine_core_inst_msi\.  This is because the install media isn’t in the CD-ROM drive.  This was expected.

You will further notice that it says the installer cache file b40c756.msi was located and no further actions are needed.  If this particular cache file were missing, the output would notify me, and I would use the command it provides to copy the missing MSI file from the install media to the location it specifies.

Now that I knew where the missing file might be located, I set out to find it.  In this particular case, I located the file on the Service Pack 2 install media.  If you weren’t aware, those service packs (and cumulative updates) are self-extracting files; you can manually extract their contents to a location of your choice so that you can locate the file you need.

Once I located the missing MSI file in the extracted contents, I was able to successfully copy it using the command provided in the output file.

At this point, it was rinse and repeat.  I followed the same steps for each file that was missing from the installer cache.  All in all, there were 3 or 4 missing files.  Once I had all of the files replaced, I re-ran the SP3 installer.

Then this happened.

sp3_error

Looks like we have another issue.  This error was much more cryptic and not very useful.  So, of course, I went back to Google and did some more research.  In addition, I checked the event viewer on the server; nothing of interest was there.   I also thought to check the bootstrap log, and here I found an interesting error:

sp3_error_2


You will note that the error happened right after it started to look at sql_as_CPU64, or b40c73a.msi (two lines above the censored section). I thought that I had mis-copied this particular file, so I copied it over again.  That did not solve the issue; I still got the same error.

I eventually turned to a forum I frequent, Ask SQL Server Central.   You can read the question that I posted here.  Unfortunately, I did not get any responses, which meant I had to keep trying to fix it myself.  I really did not want to call Microsoft Support, but I would if I had to.

TL;DR

The answer lies within the bootstrap log.  The error wasn’t referring to the line above it, as I had thought; the “sql_as_CPU64.msi” file was processed just fine.  The error was occurring on the file AFTER that one.   However, you will notice that the file it is trying to consume is not listed. Using the bootstrap log, I combed back through it, looking for all of the missing installer cache files.  I finally found one that wasn’t accounted for within the log file, so that had to be the one.

It turns out that when I copied the MSI file from the source media, I mistakenly copied an MSP file, which is distinctly different.  I found the correct file and copied it back over to the server.

A relaunch of the installer led to… success!

Success

Resources

Here are some additional resources that I found.  These are not listed in any particular order.

Lessons Learned

I learned a number of lessons throughout this process.

  1. Pay attention to the details.  There is a distinct difference between an MSI and an MSP file.  The former is an installer file; the latter is an installer patch file.
  2. Keep notes on what you have tried.  I found that this issue was a combination of several different things, and I should have kept better documentation on each one.  I found myself repeating steps I had already tried because I did not document them well enough.
  3. The internet is your friend.
  4. Do not be shy to use the resources you have available to you, IE: forums.  They are there to help you.