Monday, July 10, 2017

SQL Server -T-SQL RowNumber() vs Rank() vs DenseRank() vs NTILE()

How if we found some things in life that we want to be done so smoothly, so perfect without any flaw or fracas that something like a wizard or some magic that makes things simpler and fine.
SQL Server comes with some magic words like the above phrase that makes simple the life of a developer or DBA to a great extent. And these magic words are RowNumber(), Rank(), DenseRank() and NTile().


We can do a lot more than what the words means by itself like starting from removing duplicates from a database, or getting the highest or second highest salaried employees in the organization, generating on the fly sequence number or identity column, or simply rank any column on the basis of its values, etc. The ranking functions set a rank to each of the records as per the condition specified. All these ranking function has its own meaning and purpose.
Let’s have a table to use to explain the effects with example. Let’s consider the dbo.employee table as below:
DeptNo
empname
salary
101
Ram
5000
102
Abhishek
7500
101
kumar
5000
104
prasad
6570
102
Jumla
7500
101
Harkesh
12000
101
John
4000
Run the below query to create the table in your database and to insert data into it.

CREATE TABLE [dbo].[employee](
        [DeptNo] [INT] NULL,
        [empname] [VARCHAR](50) NULL,
        [salary] [FLOAT] NULL
)
INSERT INTO [dbo].[employee]([DeptNo],[empname],[salary])
SELECT '101', 'Ram', '5000' UNION All
SELECT '102', 'Abhishek', '7500' UNION All
SELECT '101', 'kumar', '5000' UNION All
SELECT '104', 'prasad', '6570' UNION All
SELECT '102', 'Jumla', '7500' UNION All
SELECT '101', 'Harkesh', '12000' UNION All
SELECT '101', 'John', '4000'

Here in this article, I will elaborate each of these functions in brief and with examples.
Each of these functions has a similar type of syntax in T – SQL.

[ROWNUMBER( )/RANK()/DENSERANK()/NTILE()] OVER ( [ < partition_by_clause > ] < order_by_clause > )
Here <partition by clause> is optional, but is very useful when dealing with complex logic. It will partition the result set into sub sets on which the ranking functions should work. The ranking functions (windowed functions) will work on each partition separately. If we don’t specify the partition by clause, the whole result set will be considered as single partition.
When we use “Partition by “, we need to specify the columns on which it will partition the data.
For example, in the above dbo.employee table, if we want to partition by DeptNo column, then the employee table will be divided into 3 partitions as shown below.


Now the ranking functions / windowed functions will affect each of these partitions separately.
<Over> Clause: This will prepare the result set before the associated ranking function or windowed function is applied i.e. like partitioning dataset or ordering (order by clause) it on certain criteria.

Let’s check each of the ranking functions one by one.

No comments:

Search This Blog

DAX - Grouping on multiple columns and the count

Please go thorugh the below URL for entire soultion. http://community.powerbi.com/t5/Desktop/DAX-Grouping-on-multiple-columns-and-the-cou...