Partitioning in SQL Server 2005
Partitioning allows storage of data within a table to be organized in multiple units that can be spread across more than one filegroup in a database. Using this feature, data is partitioned by rows and various defined subsets of rows are split into separate units. From the point of view of database access, the table is treated as a single logical entity.
Advantages of Partitioning
- When implemented correctly, partitioning leads to improvement in database performance and manageability.
- You can consider partitioning when the table contains very large amounts of data and need to improve the performance. Cases where specific subsets of data have different types of processing requirements are candidates for partitioning.
- Table containing active Job Numbers for various departments can be partitioned into units containing active Job Numbers for each department
- Table containing Machine Readings for the last 12 months rolling period can be partitioned into units containing Machine Readings for each month
- Before implementing Partitioning a thorough analysis needs to be performed – gather requirements, focus on the design and maintenance aspects.
- Overhead – Partitioning has will require a maintenance overhead and will introduce a level of complexity. If the performance is acceptable, then consider the implications of these overhead issues.
- Filegroup: Database files are stored in data files which are subdivided into file groups. Spreading a database over multiple files can improve performance, especially by storing files on different disk drives.
- Partitions: Horizontal division of a table into units that can be spread across more than one filegroup.
Steps for Setting up Partitions for a Table
- Create the Partition Function
- Create the Partition Scheme
- Create the table using the Partition Scheme
In this sample, we will consider a Billing application for a fictitious corporate internet provider company Netar. Our application collects the details of individual connection records in the ConnDetails table. We will partition the ConnDetails table into 12 partitions – a partition for each month in the last 12 months rolling period.
1. ConnDetails Table
2. Create the Partition Function
In the CREATE PARTITION FUNCTION statement, you can specify the value criteria on which the table will be partitioned. Specifying the LEFT argument will include the value as the upper boundary of the first partition. Specifying the RIGHT argument will include the first value as the lower boundary of the second partition.
In our example, we are partitioning based on a datetime value. If the time component is not specified in a datetime value, it defaults to 12:00:00 AM. In this example, the RIGHT argument is more intuitive to set the partition boundaries.
CREATE PARTITION FUNCTION [MonthlyConnPF] (datetime) AS RANGE RIGHT FOR VALUES (‘20060101’, ‘20060201’, ‘20060301’, ‘20060401’, ‘20060501’, ‘20060601’, ‘20060701’, ‘20060801’, ‘20060901’, ‘20061001’, ‘20061101’, ‘20061201’);
- As per the SQL Server documentation: "All data types are valid for use as partitioning columns, except text, ntext, image, xml, timestamp, varchar(max), nvarchar(max), varbinary(max), alias data types, or CLR user-defined data types."
3. Create the Partition Scheme
Now we specify the file groups to be used for the partitions that were declared using the boundary conditions in step 2.
CREATE PARTITION SCHEME MonthlyConnPS AS PARTITION MonthlyConnPF TO (fg01, fg02, fg03, fg03, fg04, fg05, fg06, fg07, fg08, fg09, fg10, fg11, fg12, fg13);
13 filegroup values are specified in the above code – 1 to 12 for each of the partitions based on the boundary values in the Partition Function and the 13th for values that fall outside the first and/or last partition boundary.
You can specify to use the PRIMARY file group to store data in the primary file group.
4. Create the Partitioned Table
CREATE TABLE [dbo].[ConnDetails]
ConnId int IDENTITY,
Use the ConnDetails table as a normal SQL Server table for insert, updates, selects and deletes.
When data is inserted into the table, it will be inserted into the corresponding partition depending on the ConnEndTime field value.
You can also set a constraint on the ConnEndTime column to check the date range, depending on the implementation requirements.
When you define partitioning based on a sliding window of values, such as the rolling 12 month period in our sample above, you need to design the on-going maintenance of the partitioned table. This activity can be automated to run at the end of the sliding window such as the start of the new month, in our example.
The following steps are prescribed for the maintenance of sliding window partitions (Reference Link listed below):
- Manage the staging table for the partition that will be switched IN.
- Manage the second staging table for the partition that will be switched OUT.
- Roll the old data OUT and the new data IN to the partitioned table.
- Drop the staging tables.
- Back up the filegroup.
Special attention should be paid to this aspect of the implementation. Taking certain design decisions such as having certain tables on the same file groups, emptying data out of partitions at the right time, disabling constraints/indexes at specific points lead a large amount of optimization and performance enhancement.
You can also specify partitioning for Indexes.
Partition Data and MetaData
$Partition: Query individual partitions of a partitioned table or index
Example : SELECT $PARTITION. MonthlyConnPF (20061010) ;
Sys.partition functions: Information on individual partition functions
Sys.partition_range_values: Information on boundary values of partition functions.
Sys.partitions_parameters: Information on individual parameters of partition functions
Sys.partition_schemes: Information on all partition schemes in a database.
Sys.data_spaces: Information on data spaces for file groups or partition schemes.
Sys.partitions: Information on all partitions in a database