rss feed blog search engine
 
Search rss blog search engine
 
SQL Fool  
Released:  11/3/2008 10:27:55 PM
RSS Link:  http://sqlfool.com/?feed=rss2
Last View 1/4/2009 8:32:02 PM
Last Refresh 1/8/2009 12:06:45 PM
Page Views 195
Comments:  Read user comments (1)
Save It Add to Technorati Add to Del.icio.us Add to Furl Add to Yahoo My Web 2.0 Add to My MSN Add to Google Add to My Yahoo! SQL Fool



Description:



Adventures in SQL Tuning - a blog for the rest of us


Contents:

SQLFool on Vacation

Just a quick note to let everyone know that I’ll be on vacation until Jan 14th. I’m not taking my laptop and I’m packing away my iPhone, so I won’t be posting any blogs until I return. :)
Happy Holidays!

Michelle




Indexing for Partitioned Tables

So you’ve partitioned your table, and now you’re ready to performance tune. As with any table, indexing is a great place to start. And if you’re like most people new to partitioning, you probably created all of your indexes on the partitioned scheme, either by design or unintentionally.

Let’s take a step back here and discuss what a partitioned index is. A partitioned index, like a partitioned table, separates data into different physical structures (partitions) under one logical name. Specifically, each partition in a nonclustered index contains its own B-tree structure with a subset of rows, based upon the partitioning scheme. By default, an unpartitioned nonclustered index has just one partition.

Keep in mind, when you create an index on a partitioned table, i.e.

CREATE NONCLUSTERED INDEX IX_myIndex
    ON dbo.myTable(myColumn);
 

… you are creating the index on the partitioned scheme by default. In order to create a NON-partitioned index on that same table, you would need to explicitly declare “On [FileGroup]“, i.e.

CREATE NONCLUSTERED INDEX IX_myIndex
    ON dbo.myTable(myColumn)
    ON [PRIMARY];
 

 

But should you partition your index? That depends on how you use it. In fact, most environments will probably want to use a mix of partitioned and non-partitioned indexes. I’ve found that that partitioned indexes perform better when aggregating data or scanning partitions. Conversely, you’ll probably find that, if you need to locate a single, specific record, nothing performs better than a non-partitioned index on that column.

Let’s walk through some examples and see how they perform. I’ll bring back my trusty ol’ orders table for this.

/* Create a partition function. */
CREATE Partition FUNCTION
    [test_monthlyDateRange_pf] (DATETIME)
    AS Range RIGHT FOR VALUES
    (‘2009-01-01′, ‘2009-01-08′, ‘2009-01-15′
    , ‘2009-01-22′, ‘2009-01-29′);
Go

/* Associate the partition function with a partition scheme. */
CREATE Partition Scheme test_monthlyDateRange_ps
    AS Partition test_monthlyDateRange_pf
    All TO ([PRIMARY]);
Go

/* Create a partitioned table. */
CREATE TABLE dbo.orders
(
      order_id  INT IDENTITY(1,1)   Not Null
    , orderDate DATETIME            Not Null
    , orderData SMALLDATETIME       Not Null

    CONSTRAINT PK_orders PRIMARY KEY CLUSTERED
    (
        order_id
      , orderDate
    )
) ON test_monthlyDateRange_ps(orderDate);
Go

/* Create some records to play with. */
SET NOCOUNT ON;

DECLARE @endDate DATETIME = ‘2009-01-01′;

WHILE @endDate < ‘2009-02-01′
BEGIN

    INSERT INTO dbo.orders
    SELECT @endDate, @endDate;
   
    SET @endDate = DATEADD(MINUTE, 1, @endDate);

END;

SET NOCOUNT OFF;

/* Let’s create an aligned, partitioned index. */
CREATE NONCLUSTERED INDEX IX_orders_aligned
    ON dbo.orders(order_id)
    ON test_monthlyDateRange_ps(orderDate);
    /* you don’t actually need to declare the last
       line of this unless you want to create the
       index on a different partitioning scheme.   */

   
/* Now let’s create an unpartitioned index. */
CREATE NONCLUSTERED INDEX IX_orders_unpartitioned
    ON dbo.orders(order_id)
    ON [PRIMARY];
 

 

Now that we have both a partitioned and an unpartitioned index, let’s take a look at our sys.partitions table:

/* Let’s take a look at our index partitions */
SELECT i.name
    , i.index_id
    , p.partition_number
    , p.ROWS
FROM sys.partitions AS p
Join sys.indexes AS i
    ON p.OBJECT_ID = i.OBJECT_ID
   And p.index_id = i.index_id
WHERE p.OBJECT_ID = OBJECT_ID(‘orders’)
ORDER BY i.index_id, p.partition_number;
 

 

sys.partitions

sys.partitions

As expected, both of our partitioned indexes, PK_orders and IX_orders_aligned, have 6 partitions, with a subset of rows on each partition. Our unpartitioned non-clustered index, IX_orders_unpartitioned, on the other hand has just 1 partition containing all of the rows.

Now that we have our environment set up, let’s run through some different queries and see the performance impact of each type of index.

/* Query 1, specific record look-up, covered */
SELECT order_id, orderDate
FROM dbo.orders WITH (INDEX(IX_orders_aligned))
WHERE order_id = 25000;
 
SELECT order_id, orderDate
FROM dbo.orders WITH (INDEX(IX_orders_unpartitioned))
WHERE order_id = 25000;

 

Query 1

Query 1

The unpartitioned index performs significantly better when given a specific record to look-up. Now let’s try the same query, but utilizing a scan instead of a seek:

/* Query 2, specific record look-up, uncovered */
SELECT order_id, orderDate, orderData
FROM dbo.orders WITH (INDEX(IX_orders_aligned))
WHERE order_id = 30000;
 
SELECT order_id, orderDate, orderData
FROM dbo.orders WITH (INDEX(IX_orders_unpartitioned))
WHERE order_id = 30000;

 

Query 2

Query 2

Again we see that the non-partitioned index performs better with the single-record look-up. This can lead to some pretty dramatic performance implications. So when *would* we want to use a partitioned index? Two instances immediately pop to mind. First, partition switching can only be performed when all indexes on a table are aligned. Secondly, partitioned indexes perform better when manipulating large data sets. To see this in action, let’s try some simple aggregation…

/* Query 3, aggregation */
SELECT CAST(ROUND(CAST(orderdate AS FLOAT), 0, 1) AS SMALLDATETIME) AS ‘order_date’
    , COUNT(*)
FROM dbo.orders WITH (INDEX(IX_orders_aligned))
WHERE orderDate Between ‘2009-01-01′ And ‘2009-01-07 23:59′
GROUP BY CAST(ROUND(CAST(orderdate AS FLOAT), 0, 1) AS SMALLDATETIME)
ORDER BY CAST(ROUND(CAST(orderdate AS FLOAT), 0, 1) AS SMALLDATETIME);
 
SELECT CAST(ROUND(CAST(orderdate AS FLOAT), 0, 1) AS SMALLDATETIME) AS ‘order_date’
    , COUNT(*)
FROM dbo.orders WITH (INDEX(IX_orders_unpartitioned))
WHERE orderDate Between ‘2009-01-01′ And ‘2009-01-07 23:59′
GROUP BY CAST(ROUND(CAST(orderdate AS FLOAT), 0, 1) AS SMALLDATETIME)
ORDER BY CAST(ROUND(CAST(orderdate AS FLOAT), 0, 1) AS SMALLDATETIME);

 

Query 3

Query 3

As you can see, partitioned indexes perform better when aggregating data. This is just a simple example, but the results can be even more dramatic in a large production environment. This is one of the reasons why partitioned tables and indexes are especially beneficial in data warehouses.

So now you have a general idea of what a partitioned index is and when to use a partitioned index vs a non-partitioned index. Ultimately, your indexing needs will depend largely on the application and how the data is used. When in doubt, test, test, test! So to recap…

  • Specify “On [FileGroup]“ to create an unpartitioned index on a partitioned table
  • Consider using non-partitioned indexes for single-record look-ups
  • Use partitioned indexes for multiple records and data aggregations
  • To enable partition switching, all indexes on the table must be aligned.

For more information on partitioning, check out my other partitioning articles:

Partitioning Example
Partitioning 101
Tips for Large Data Stores




Error Handling in T-SQL

Error handling is one of those things in SQL Server that just doesn’t get the attention it deserves. Even a properly constructed stored procedure can still result in error, such as primary key or unique constraint errors.

Why should you care? Consider this real-world example:

You’re a DBA monitoring a well-performing environment. You deploy a new application to production. Suddenly, performance degrades but you do not know why. You look in your error log and see a whole mess of primary key errors. Digging into your newly deployed application, you find that you are now making an extra (and unnecessary) insert to the database, which is resulting in error and causing your performance issues.

This is just one example of many. Fortunately, SQL 2005 has really simplified the error handling process with features such as the Try/Catch block.

The basic components of error handling are:

  • Try…Catch block (2005/2008)
  • Error identification
  • Transaction handling
  • Error logging (optional)
  • Error notification

As an early holiday gift, here’s a generic error handling process to get you started:

IF OBJECTPROPERTY(OBJECT_ID(‘dbo.dba_logError_sp’), N‘IsProcedure’) = 1
BEGIN
    DROP PROCEDURE dbo.dba_logError_sp;
    PRINT ‘Procedure dba_logError_sp dropped’;
END;
Go

IF OBJECTPROPERTY(OBJECT_ID(‘dbo.dba_errorLog’), N‘IsTable’) IS Null
BEGIN

    CREATE TABLE dbo.dba_errorLog
    (         errorLog_id       INT IDENTITY(1,1)
            , errorType         CHAR(3)    
                CONSTRAINT [DF_errorLog_errorType] DEFAULT ’sys’
            , errorDate         DATETIME       
                CONSTRAINT [DF_errorLog_errorDate] DEFAULT(GETDATE())
            , errorLine         INT
            , errorMessage      NVARCHAR(4000)
            , errorNumber       INT
            , errorProcedure    NVARCHAR(126)
            , procParameters    NVARCHAR(4000)
            , errorSeverity     INT
            , errorState        INT
            , databaseName      NVARCHAR(255)
        CONSTRAINT PK_errorLog_errorLogID PRIMARY KEY CLUSTERED
        (
            errorLog_id
        )
    );

    PRINT ‘Table dba_errorLog created’;

END;
Go

SET ANSI_Nulls ON;
SET Ansi_Padding ON;
SET Ansi_Warnings ON;
SET ArithAbort


Home  


 


Link to us




RSS Feed of new blogs                                                   Home        Feed Map        Submit Feed      Link to Us       Contact