Finding Composite Primary Key Columns

You never know when the idea for a script will make an appearance.

I had to work with composite primary keys recently while working on a project for a client.  More specifically, I had to create a process that would dynamically (dynamic SQL?! Say it isn’t so?!) handle composite keys in an efficient manner.

Usually, a primary key is just a single column that uniquely identifies a row within a table.  However, a composite primary key consisting of 2 or more columns can be created.  Regardless if the primary key is singular or composite, it provides identical functionality.  In this particular instance, this process would perform data modifications based on the columns that composed the primary key.  Thus I needed to be able to determine what columns are in the key.

There is a limit to the number of columns you can have in a composite key.  In SQL Server 2016 and newer, the limit is 32 columns and prior to that a primary key could have up to 16 columns.   Keep in mind, neither limit is a goal but it’s there if you need it.

The Parts & Pieces

As I started to work on this, my first thought was that it would be helpful to know how many tables had a composite primary key.  This would give me an idea on how many tables I was dealing with.  Thankfully, SQL Server has this information by using system DMVs (dynamic management views) along with the COL_NAME function.

Note: the COL_NAME function will only work with SQL Server 2008 and newer.  

Below is my attempt to get a result of how many columns compose the primary key:

In looking at the AdventureWorks2014 database, the above script will give you an output that looks like this:

Cool!  Now I knew which tables had a primary key comprised of multiple columns as well as how many columns were in the key definition.

Taking things a step further, I thought that it would be useful to see what columns the primary key is composed of in that same output.  This part becomes a little more complicated to gather as we need to get the list of columns, which could be 1 – 32 columns (or 1-16 columns depending on the version of SQL Server) into a comma delimited list.

Using the STUFF function along with XML PATH is a quick and efficient way to generate a comma delimited list of string values.

Thus this script was born:

We can see the output of this query below:

Note that this output has the schema and table name just like the first result set.

Using a CTE, we can tie these two result sets together and get a clean unified look.

We can see from below that now we have a nice result set that tells use not only how many columns are in the primary key but also what those keys are.

 

Update: Based on comments, I’ve removed the CTE and replaced it with a in-line query as well as included the schema for each object.  This should be a cleaner look.  The GitHub repository has been updated with this change. 

Summary

When working with composite primary keys, it’s a good idea to know how many columns as well as what columns are in the key.  This query helped me in creating the process my client needed. It just might help you to figure out some logic when writing dynamic SQL!

You can download the full script from here.

Enjoy!

 

© 2018, John Morehouse. All rights reserved.

5 Replies to “Finding Composite Primary Key Columns”

  1. The “unified look” query has an oversight. Line 24 joins on table name only. A database has two tables in different schemas with the same name, the query will show the two tables as having a composite primary key.

    The corrected query is below. The changes are to the CTE query to include object_id and change the join to join on object_id.

    WITH mycte AS (
    SELECT i2.object_id
    , STUFF(
    (SELECT ‘,’ + COL_NAME(ic.object_id,ic.column_id)
    FROM sys.indexes i1
    INNER JOIN sys.index_columns ic ON i1.object_id = ic.object_id AND i1.index_id = ic.index_id
    WHERE i1.is_primary_key = 1
    AND i1.object_id = i2.object_id AND i1.index_id = i2.index_id
    FOR XML PATH(”)),1,1,”) AS PK
    FROM sys.indexes i2
    INNER JOIN sys.objects o ON i2.object_id = o.object_id
    WHERE i2.is_primary_key = 1
    AND o.type_desc = ‘USER_TABLE’
    )
    — Use this select to get the count, join to the CTE and get the column list
    SELECT SCHEMA_NAME(o.schema_id) AS ‘Schema’
    , OBJECT_NAME(i.object_id) AS ‘TableName’
    , COUNT(COL_NAME(ic.object_id,ic.column_id)) AS ‘Primary_Key_Column_Count’
    , mycte.PK AS ‘Primary_Key_Columns’
    FROM sys.indexes i
    INNER JOIN sys.index_columns ic ON i.object_id = ic.object_id AND i.index_id = ic.index_id
    INNER JOIN sys.objects o ON i.object_id = o.object_id
    INNER JOIN mycte ON mycte.object_id = i.object_id
    WHERE i.is_primary_key = 1
    AND o.type_desc = ‘USER_TABLE’
    GROUP BY SCHEMA_NAME(o.schema_id)
    , OBJECT_NAME(i.object_id)
    , mycte.PK
    HAVING COUNT(‘Primay_Key_Column_Count’) > 1
    ORDER BY ‘Schema’, ‘TableName’ ASC

  2. Interesting as it lead me to a PK clustered with 4 fields( 95 bytes)? A note on using a CTE’s. CTE’s are not always as efficient as embedding a subquery or using a derived query, it just depends. In the re-worked example below I have included the stuff as a subquery and a derived table for the column count. By using a derived table the count can be used in the WHERE clause with out an aggregate. The Count can be done as a CTE but at a cost of addition CPU. In SSMS Show plan the plan cost if I run both queries is 69% and 31%.
    As always in SQL is just depends…

    SELECT SCHEMA_NAME(o.schema_id) AS ‘Schema’
    , OBJECT_NAME(i.object_id) AS ‘TableName’
    , pkc.PrimaryKeyColumnCount
    , STUFF((SELECT ‘, ‘ + COL_NAME(ic.object_id,ic.column_id)
    FROM sys.indexes i1
    INNER JOIN sys.index_columns ic ON i1.object_id = ic.object_id AND i1.index_id = ic.index_id
    WHERE i1.is_primary_key = 1
    AND i1.object_id = o.object_id AND i1.index_id = i.index_id
    FOR XML PATH(”)),1,1,”) AS ‘Primary_Key_Columns’
    FROM sys.indexes i
    INNER JOIN sys.objects o ON i.object_id = o.object_id
    INNER JOIN ( SELECT OBJECT_ID, index_ID, COUNT(*) AS PrimaryKeyColumnCount
    FROM sys.index_Columns
    GROUP BY OBJECT_ID, index_ID
    ) pkcON pkc.OBJECT_ID = o.object_ID AND pkc.index_ID = i.index_ID
    WHERE i.is_primary_key = 1
    AND o.type_desc = ‘USER_TABLE’
    AND pkc.PrimaryKeyColumnCount > 1
    ORDER BY ‘TableName’ ASC;

  3. Hi Dan! Thanks for stopping by! I’ve updated the query to include the in-line query rather than a CTE. I agree that it’ll perform better as well as provide a cleaner look. Thanks!

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.