In SQLSERVER ALL, PERCENT, CUBE keywords, ROLLUP keywords and GROUPING function

In SQLSERVER ALL, PERCENT, CUBE keywords, ROLLUP keywords and GROUPING function

To create a test table

 1 USE [tempdb]
 2 GO
 3 
 4 CREATE TABLE #temptb(id INT ,NAME VARCHAR(200))
 5 GO
 6 
 7 INSERT INTO [#temptb] ( [id], [NAME] )
 8 SELECT 1,'China' UNION ALL
 9 SELECT 2,'China' UNION ALL
10 SELECT 3,'Britain' UNION ALL
11 SELECT 4,'Britain' UNION ALL
12 SELECT 5,'The United States of America' UNION ALL
13 SELECT 6,'The United States of America' UNION ALL
14 SELECT null, 'France' UNION ALL
15 SELECT 8,'France' 
16 GO
17 
18 SELECT * FROM [#temptb]
19 GO

First look at the syntax of the SELECT statement:

1 SELECT [ ALL | DISTINCT ] [ topSubclause ] aliasedExpr 
2       [{ , aliasedExpr }] FROM fromClause [ WHERE whereClause ] [ GROUP BY groupByClause [ HAVING havingClause ] ] [ ORDER BY orderByClause ]
3 or
4 SELECT VALUE [ ALL | DISTINCT ] [ topSubclause ] expr FROM fromClause [ WHERE whereClause ] [ GROUP BY groupByClause [ HAVING havingClause ] ] [ ORDER BY orderByClause

The ALL keyword: Specified in the can display repeated rows in the result set, which is the default keyword, that is to say, when you are not using the ALL keyword in the query, the default is added on the ALL keyword

For example, the following two SQL statements are equivalent, in fact, will put the select out duplicate records

1 --Which two statements are equivalent
2 SELECT * FROM [#temptb]
3 GO
4 -------------------------------------------
5 SELECT ALL * FROM [#temptb]
6 GO

If you need to have a unique value select, to filter out the duplicate values need to useThe DISTINCT keyword

1 SELECT DISTINCT([NAME]) FROM [#temptb]

And when you put an SQL statement, the ALL field in parentheses, it becomes an expression, for example, the following SQL statement

1 SELECT ALL([NAME]+'Hello') AS 'The country' FROM [#temptb]

In my last article: Table (query and delete duplicate records)

Record the maximum in the Name ID, which has a SQL statement

1 SELECT  *
2 FROM    [#temptb] a
3 WHERE   ID!<ALL ( SELECT    ID
4                   FROM      [#temptb]
5                   WHERE     Name = a.Name )

If you remove the ALL keyword meeting how?

Because the sub query needs is an expression, so need to use the ALL keyword to turn them into an expression, so use ALL


The ALL keyword can also put in GROUP BY

Here are two cases, one is the SQL statement with a where clause, the other is no where clause in the SQL statement

A:

1 SELECT AVG(id) FROM [#temptb] WHERE NAME='France' GROUP BY ALL NAME
2 SELECT AVG(id) FROM [#temptb] WHERE NAME='France'  GROUP BY NAME

For there is no line with the conditions of the group, there is no consistent with name='of France', as the aggregate value of column value is NULL

Without the ALL keyword, GROUP BY clause will not display is not in line with the conditions of the group

Situation two:

1 SELECT AVG(id) FROM [#temptb]  GROUP BY ALL NAME
2 SELECT AVG(id) FROM [#temptb]  GROUP BY  NAME

When no where clause in the SQL statement, query results are the same


The ALL keyword can also put after UNION

1 USE [GPOSDB]
2 GO
3 INSERT INTO [dbo].[SystemPara] ( [ParaValue], [Name], [Description] )
4 SELECT 'nihao','nihao','nihao' UNION ALL
5 SELECT 'nihao','nihao','nihao' 

The PERCENT keyword

The PERCENT keyword with the TOP keyword need

Concentrate output N percent of rows from the result, the N must be an integer between 0~100.

1 SELECT TOP 10 PERCENT * from [#temptb]
2 GO


The SQL statement is: the number of records from the [#temptb] table output 10%, because order is used without a by clause, so this record is random

Because the [#temptb] table has 8 entries, 8*10%=0.8 four to five homes in after the equivalent of a record

1 SELECT TOP 30 PERCENT * from [#temptb]
2 GO

8*30%=2.4 four to five homes in after the equivalent of three records, SQLSERVER here even if four to five homes in three records, he also can output larger number, namely three records


The CUBE keyword

The CUBE keyword: if you need in the result set contains not only the normal line is provided by the GROUP BY, also contains a summary row, you can use the CUBE keyword. The CUBE keyword is used together with GROUP BY

When using the CUBE keyword, you can use the GROUPING function to output an extra column, when the results are normal line, return 0; when the line is the summary, return 1.

1 SELECT  AVG(id) AS 'The average', GROUPING(NAME) AS 'Whether the summary'
2 FROM    [#temptb]
3 GROUP BY NAME
4         WITH CUBE

The last line shows the GROUP BY records the number of lines, a total of 4 rows, and in the total row (i.e. the last line) is summarized the column shows the 1, that is the total row

The Grouping keyword

Indicates whether the specified list of GROUP polymerization in BY list of type.

In the result set, if GROUPING returns 1 indicates that polymerization; return 0 indicates no polymerization.

If GROUP BY is specified, the GROUPING can only be used in the SELECT <select> List, HAVING and ORDER in the BY clause.

GROUPING is used to distinguish the standard null values and returned by the ROLLUP, CUBE or GROUPING SETS null.

As a ROLLUP, CUBE or GROUPING SETS operating results returned by the NULL is a special application of NULL.

It is in the result set as shown in the placeholder, said the.


The following example will block SalesQuota and polymerization of SaleYTD number. Application of GROUPING function in the SalesQuota column.

1 USE [AdventureWorks];
2 GO
3 SELECT  SalesQuota, SUM(SalesYTD) 'TotalSalesYTD',
4         GROUPING(SalesQuota) AS 'Grouping'
5 FROM    Sales.SalesPerson
6 GROUP BY SalesQuota
7         WITH ROLLUP;
8 GO

The results show two null values under SalesQuota.

The first NULL representatives from the table in this column is null group.

Second NULL located in the ROLLUP operating the added summary rows.

Summary display all SalesQuota group TotalSalesYTD number, and in the Grouping column indicates 1.




The simple summary reports using Transact-SQL

The application generates a simple summary statements can use the following Transact-SQL elements:

ROLLUP, CUBE or GROUPING SETS operator. This is an extension to the GROUP BY clause of the SELECT statement.

COMPUTE or COMPUTE BY operator. The two operator is also associated with GROUP BY.

The operator generates a result set, which contains the details for each project, but also summary rows for each group, the total row shows the total polymerization of the group.

The GROUP BY clause can be used to generate each polymerization results and contains no detail rows.

Applications should use Analysis instead of Services, CUBE, ROLLUP, COMPUTE or COMPUTE BY.

In particular, CUBE and ROLLUP should be used only in cannot access the OLE DB or ADO environments, such as scripts or stored procedures.

Support COMPUTE and COMPUTE BY is maintained for backward compatibility.

Should be preferred ROLLUP operator instead of COMPUTE or COMPUTE BY. Generated by the COMPUTE or the COMPUTE BY summary value as separate multiple result sets are returned,

These results also inserts containing details between each row result set; or as the result set contains total return, added after the main result set.

Treatment of these multiple result sets complexity will increase the application code. A server cursor does not support COMPUTE, also does not support the COMPUTE BY.

But ROLLUP support server cursor. CUBE and ROLLUP will produce a single result set, which contains an embeddedSubtotalAndTotalThat's ok.

In addition, the query optimizer can sometimes generate more efficient than COMPUTE and COMPUTE BY to generate the execution plan execution plans for ROLLUP.

If you use these operators without GROUP BY, it returns a single result sets, each of which corresponds to a line that contains the group, polymerization subtotal rows. No detail rows in the result set.


SQLSERVER Cube, the use of RollUp

Cube , RollUpTo query the data aggregation, often used in the statistics, especially do statements, used in the Select statement

Below is a comparison of the two statistical methods

SQL script:

 1 USE [tempdb]
 2 GO
 3 CREATE TABLE t_test
 4 (
 5   id INT ,
 6   productName VARCHAR(200) ,
 7   price MONEY ,
 8   num INT ,
 9   amount INT ,
10   operatedate DATETIME
11 )
12 GO
13 
14 --Insert random data
15 DECLARE @i INT 
16 DECLARE @rand MONEY
17 DECLARE @date DATETIME
18 DECLARE @index INT 
19 DECLARE @DateBase INT 
20 SET @date = '2012-10-23'
21 SET @i = 1
22 WHILE ( @i < 18 ) 
23     BEGIN
24         SET @rand = RAND() * 20
25         SET @index = CAST(RAND() * 3 AS INT)
26         SET @DateBase = CAST(RAND() * 10 AS INT)
27  
28         INSERT  INTO t_test ( id, productName, price, num, amount, operatedate )
29         VALUES  ( @i, 'product' + CAST (@index AS VARCHAR(10)), @rand, 100,
30                   @rand * 100, @date + @DateBase )
31         SET @i = @i + 1
32     END
33  
34  
35 SELECT  *  FROM    t_test

By two kinds of patterns.:

 1 --By two kinds of patterns.: 
 2  
 3 SELECT  CASE WHEN GROUPING(operatedate) = 1 THEN 'Subtotal'
 4              ELSE CONVERT(VARCHAR(10), operatedate, 120)
 5         END AS Date, CASE WHEN GROUPING(productName) = 1 THEN 'Subtotal'
 6                         ELSE productName
 7                    END AS Product name, SUM(amount) / SUM(num) AS The average price, SUM(num) AS Number,
 8         SUM(amount) AS Amount of money
 9 FROM    t_test
10 GROUP BY operatedate, productName  WITH ROLLUP;   
11 -------------------------------------------------------------------
12 SELECT  CASE WHEN GROUPING(operatedate) = 1 THEN 'Subtotal'
13              ELSE CONVERT(VARCHAR(10), operatedate, 120)
14         END AS Date, CASE WHEN GROUPING(productName) = 1 THEN 'Subtotal'
15                         ELSE productName
16                    END AS Product name, SUM(amount) / SUM(num) AS The average price, SUM(num) AS Number,
17         SUM(amount) AS Amount of money
18 FROM    t_test
19 GROUP BY operatedate, productName WITH CUBE; 

ROLLUP According to the packet sequence, the first field in operatedate packet, statistics within the group, finally together

1 SELECT  CASE WHEN GROUPING(operatedate) = 1 THEN 'Subtotal'  --With GROUPING it is summary, in this example, the last line is the total row
2              ELSE CONVERT(VARCHAR(10), operatedate, 120)
3         END AS Date, CASE WHEN GROUPING(productName) = 1 THEN 'Subtotal'
4                         ELSE productName
5                    END AS Product name, SUM(amount) / SUM(num) AS The average price, SUM(num) AS Number,
6         SUM(amount) AS Amount of money
7 FROM    t_test
8 GROUP BY operatedate, productName  WITH ROLLUP;   --Because the operatedate and productName fields in the GROUPING function of statistical or summary, so GROUP BY operatedate and productName need to be added to the two fields

CUBE Will the grouping field statistics, all as the example, first find the subtotal to date, which is the total amount of products each day, then the statistics of each product the total amount, finally gives the total aggregate.

ROLLUPAndCUBEThe difference is: the total of the first field each packet ROLLUP will only go back and statistical group by subtotal and the first field

Grouping (field) are used to distinguish the current line is it right? Subtotal rows produced, Grouping (field name) that is =1 statistics, Grouping (field name =0) that is the rows in the table

Can be used in case, where.


Another example

SQL script:

1 USE [tempdb] 2 GO 3 CREATE TABLE Sales (EmpId INT, Yr INT, Sales MONEY) 4 INSERT Sales VALUES(1, 2005, 12000) 5 INSERT Sales VALUES(1, 2006, 18000) 6 INSERT Sales VALUES(1, 2007, 25000) 7 INSERT Sales VALUES(2, 2005, 15000) 8 INSERT Sales VALUES(2, 2006, 6000) 9 INSERT Sales VALUES(3, 2006, 20000) 10 INSERT Sales VALUES(3, 2007, 24000) 11 12 SELECT * FROM [dbo].[Sales]

ROLLUP

1 SELECT EmpId, Yr, SUM(Sales) AS Sales
2 FROM Sales
3 GROUP BY EmpId, Yr WITH ROLLUP

CUBE

1 SELECT EmpId, Yr, SUM(Sales) AS Sales
2 FROM Sales
3 GROUP BY EmpId, Yr WITH CUBE

CUBE ROLLUP and more than a year statistics, statistics of the 2005, 2006, 2007 sales

Can use the diagram to express

ROLLUP

CUBE



The difference between CUBE and ROLLUP verification

ROLLUPAndCUBEThe difference is: the total of the first field each packet ROLLUP will only go back and statistical group by subtotal and the first field

We modify the above experiment

1 USE [tempdb] 2 GO 3 CREATE TABLE Sales (EmpId INT,productName VARCHAR(200), Yr INT, Sales MONEY) 4 GO 5 INSERT Sales VALUES(1,'product2', 2005, 12000) 6 INSERT Sales VALUES(1,'product1', 2005, 18000) 7 INSERT Sales VALUES(1,'product0', 2006, 25000) 8 INSERT Sales VALUES(1,'product2', 2007, 15000) 9 INSERT Sales VALUES(2,'product1', 2005, 60000) 10 INSERT Sales VALUES(2,'product1', 2006, 22000) 11 INSERT Sales VALUES(2,'product0', 2007, 24000) 12 INSERT Sales VALUES(3,'product0', 2005, 32000) 13 INSERT Sales VALUES(3,'product2', 2006, 42000) 14 INSERT Sales VALUES(3,'product0', 2007, 24000) 15 GO 16 17 SELECT * FROM [dbo].[Sales]

ROLLUP

1 SELECT EmpId, Yr,[productName], SUM(Sales) AS Sales
2 FROM Sales
3 GROUP BY EmpId, Yr,[productName] WITH ROLLUP

CUBE

1 SELECT EmpId, Yr,[productName], SUM(Sales) AS Sales
2 FROM Sales
3 GROUP BY EmpId, Yr,[productName] WITH CUBE

You can see the CUBE in addition to EmpId field, also counts GROUP BY behind the Yr and productName these two fields

ROLLUP counts only the EmpId field


Summary

These keywords and the function of the normal application for statistics are very useful, if you are familiar with these function words, in the development of which will be handy

In addition, personally feel that the PERCENT keyword can be used in the paging.


If have a wrong place, welcome Paizhuan Oh o(∩_∩)o

Posted by Allen at November 25, 2013 - 12:24 PM