Showing posts with label Sort. Show all posts
Showing posts with label Sort. Show all posts

Tempdb should be some % size of Large Database in SQL Instance ?

It doesn't seem like that!!!

There is NO any arithmetic formula to calculate Tempdb Size. Yes ?

Do we have any ?

Yes. still we can calculate/figure out by performing the following things when storing Intermediate results

1. Memory Spill - Causing by Hash Or Sort operation
2. Rebuilding Index along with SORT_IN_TEMPDB Option
3. DBCC CHECKDB on Larg Database
4. Using Temp (#/##) Table
5. Using multiple aggregations with huge data

Tempdb doesn't behave like User Databases (Say suppose, If the User database grows up to 500 GB and It'll remain same even the SQL Instance restarts. Right ?

But, The Tempdb will be recreated with the Size to whatever the size it was last set to!!!

To avoid Memory Spill:
- Omit the ORDER BY clause if you do not need the result set to be ordered.
- If ORDER BY is required, eliminate the column that participates in the multiple range scans from the ORDER BY clause.
- Using an index hint, force the optimizer to use a different access path on the table in question.
- Rewrite the query to produce a different query execution plan.
- Force serial execution of the query by adding the MAXDOP = 1 option to the end of the query or index operation

UNION operator is not eliminating duplicate data. When ? and Why ?


Normally, UNION operator will elimate the duplicate data from the result sets.

But, In some situation, It'll not eliminate the duplicate data.

When ?
Incase, If we have used any Case-Sensitive collations on columns, database, server.

Why ?
The UNION operation depend on the collation what we have used in columns, Database, Server.

Scenario...
/*One table created, column with case-sensitive(CS) collation*/
If OBJECT_ID('Tb_Sample1') Is Null
Create Table Tb_Sample1
(
Id Int Identity(1,1),
Column1 Varchar(10) Collate SQL_Latin1_General_CP1_CS_AS,
Column2 Varchar(10)
)
Go


/*Inserting Duplicate data, But the case differs*/
Insert Tb_Sample1(Column1, Column2) Values('SQL','Server')
Insert Tb_Sample1(Column1, Column2) Values('MS','Access')
Insert Tb_Sample1(Column1, Column2) Values('Adobe','Reader')
Insert Tb_Sample1(Column1, Column2) Values('sql','Server')
Go


/*Fetching data using UNION operator*/
Select Column1 , Column2 From Tb_Sample1 where Column1 in('SQL', 'MS','sql')
Union
Select Column1 , Column2 From Tb_Sample1 where Column1 in('Adobe')


/*Expected result was...*/
Column1Column2
AdobeReader
MSAccess
sqlServer


/*But, The actual result is...*/
Column1Column2
AdobeReader
MSAccess
SQLServer
sqlServer

The UNION operator compares the data depend on Collation what we have used in Columns, Database, Server. So, It could not be considered as a duplicate and eliminate it.

How to fix it ?
- The collation has to be changed to Case-Insensitive as given below
Alter Table Tb_Sample1 Alter Column Column1 Varchar(10) Collate SQL_Latin1_General_CP1_CI_AS
Go


or

- Use COLLATE statement and force the case-insensitive collation when using the UNION operator as given below
Select Column1 Collate SQL_Latin1_General_CP1_CI_AS as Column1, Column2 From Tb_Sample1 where Column1 in('SQL', 'MS','sql')
UNION
Select Column1 , Column2 From Tb_Sample1 where Column1 in('Adobe')


Important:
- UNION operator automatically applies DISTINCT SORT operator internally.
- So, The duplicate data will be eliminated and SORTED from the final result set. But, UNION ALL will NOT work depend on Collation and not do any data Sorting.