One of the most obvious and useful set of window functions are ranking functions where rows from your result set are ranked according to a certain scheme. There are three ranking functions:
ROW_NUMBER() 
RANK() 
DENSE_RANK() 
The difference is easy to remember. For the examples, let’s assume we have this table (using PostgreSQL syntax):
1
2
3
4
5
 
CREATE TABLE t(v) AS
SELECT * FROM (
  VALUES('a'),('a'),('a'),('b'),
        ('c'),('c'),('d'),('e')
) t(v)
  
ROW_NUMBER()
… assigns unique numbers to each row within the PARTITION given the ORDER BY clause. So you’d get:
1
2
 
SELECT v, ROW_NUMBER() OVER()
FROM t
  
Note that some SQL dialects (e.g. SQL Server) require an explicit ORDER BY clause in the OVER() clause:
1
2
 
SELECT v, ROW_NUMBER() OVER(ORDER BY v)
FROM t
  
The above query returns:
| V | ROW_NUMBER |
|---|------------|
| a |          1 |
| a |          2 |
| a |          3 |
| b |          4 |
| c |          5 |
| c |          6 |
| d |          7 |
| e |          8 |
RANK()
… behaves like ROW_NUMBER(), except that “equal” rows are ranked the same. If we substitute RANK() into our previous query:
1
2
 
SELECT v, RANK() OVER(ORDER BY v)
FROM t
  
… then the result we’re getting is this:
| V | RANK |
|---|------|
| a |    1 |
| a |    1 |
| a |    1 |
| b |    4 |
| c |    5 |
| c |    5 |
| d |    7 |
| e |    8 |
As you can see, much like in a sports ranking, we have gaps between the different ranks. We can avoid those gaps by using
DENSE_RANK()
Trivially, DENSE_RANK() is a rank with no gaps, i.e. it is “dense”. We can write:
1
2
 
SELECT v, DENSE_RANK() OVER(ORDER BY v)
FROM t
  
… to obtain
| V | DENSE_RANK |
|---|------------|
| a |          1 |
| a |          1 |
| a |          1 |
| b |          2 |
| c |          3 |
| c |          3 |
| d |          4 |
| e |          5 |
One interesting aspect of DENSE_RANK() is the fact that it “behaves like” ROW_NUMBER() when we add the DISTINCT keyword.
1
2
 
SELECT DISTINCT v, DENSE_RANK() OVER(ORDER BY v)
FROM t
  
… to obtain
| V | DENSE_RANK |
|---|------------|
| a |          1 |
| b |          2 |
| e |          5 |
| d |          4 |
| c |          3 |
In fact, ROW_NUMBER() prevents you from using DISTINCT, because ROW_NUMBER() generates unique values across the partition before DISTINCT is applied:
1
2
3
 
SELECT DISTINCT v, ROW_NUMBER() OVER(ORDER BY v)
FROM t
ORDER BY 1, 2
  
DISTINCT has no effect:
| V | ROW_NUMBER |
|---|------------|
| a |          1 |
| a |          2 |
| a |          3 |
| b |          4 |
| c |          5 |
| c |          6 |
| d |          7 |
| e |          8 |
ROW_NUMBER()RANK()DENSE_RANK()
1 
2 
3 
4 
5 
 | CREATE TABLE t(v) ASSELECT * FROM (  VALUES('a'),('a'),('a'),('b'),        ('c'),('c'),('d'),('e')) t(v) | 
PARTITION given the ORDER BY clause. So you’d get:
1 
2 
 | SELECT v, ROW_NUMBER() OVER()FROM t | 
ORDER BY clause in the OVER() clause:
1 
2 
 | SELECT v, ROW_NUMBER() OVER(ORDER BY v)FROM t | 
ROW_NUMBER(), except that “equal” rows are ranked the same. If we substitute RANK() into our previous query:
1 
2 
 | SELECT v, RANK() OVER(ORDER BY v)FROM t | 
DENSE_RANK() is a rank with no gaps, i.e. it is “dense”. We can write:
1 
2 
 | SELECT v, DENSE_RANK() OVER(ORDER BY v)FROM t | 
DENSE_RANK() is the fact that it “behaves like” ROW_NUMBER() when we add the DISTINCT keyword.
1 
2 
 | SELECT DISTINCT v, DENSE_RANK() OVER(ORDER BY v)FROM t | 
ROW_NUMBER() prevents you from using DISTINCT, because ROW_NUMBER() generates unique values across the partition before DISTINCT is applied:
1 
2 
3 
 | SELECT DISTINCT v, ROW_NUMBER() OVER(ORDER BY v)FROM tORDER BY 1, 2 | 
DISTINCT has no effect:Putting it all together
A good way to understand the three ranking functions is to see them all in action side-by-side. Run this query
1
2
3
4
5
6
7
 
SELECT
  v, 
  ROW_NUMBER() OVER(ORDER BY v),
  RANK()       OVER(ORDER BY v),
  DENSE_RANK() OVER(ORDER BY v)
FROM t
ORDER BY 1, 2
  
… or this one (using the SQL standard WINDOW clause, to reuse window specifications):
1
2
3
4
5
6
7
 
SELECT
  v, 
  ROW_NUMBER() OVER(w),
  RANK()       OVER(w),
  DENSE_RANK() OVER(w)
FROM t
WINDOW w AS (ORDER BY v)
  
… to obtain:
| V | ROW_NUMBER | RANK | DENSE_RANK |
|---|------------|------|------------|
| a |          1 |    1 |          1 |
| a |          2 |    1 |          1 |
| a |          3 |    1 |          1 |
| b |          4 |    4 |          2 |
| c |          5 |    5 |          3 |
| c |          6 |    5 |          3 |
| d |          7 |    7 |          4 |
| e |          8 |    8 |          5 |
Note that unfortunately, the WINDOW clause is not supported in all databases.
1 
2 
3 
4 
5 
6 
7 
 | SELECT  v,   ROW_NUMBER() OVER(ORDER BY v),  RANK()       OVER(ORDER BY v),  DENSE_RANK() OVER(ORDER BY v)FROM tORDER BY 1, 2 | 
WINDOW clause, to reuse window specifications):
1 
2 
3 
4 
5 
6 
7 
 | SELECT  v,   ROW_NUMBER() OVER(w),  RANK()       OVER(w),  DENSE_RANK() OVER(w)FROM tWINDOW w AS (ORDER BY v) | 
WINDOW clause is not supported in all databases.
Yorumlar
Yorum Gönder