Harsha's

Tuesday, September 13, 2011

Friday, September 9, 2011

Very good article about data compression

http://msdn.microsoft.com/en-us/library/dd894051(v=sql.100).aspx

"A deadlock occurs when two or more tasks permanently block each other by each task having a lock on a resource which the other tasks are trying to lock [msdn]"

When SQL Server detects a deadlock between two tasks, it terminates one of them. The terminated task is called the deadlock victim. By default we do not choose the victim: SQL Server picks it using its own calculation, normally the task that is least expensive to roll back. What if we don't want a particular task to become the victim? We can influence the choice with SET DEADLOCK_PRIORITY.

Set this option at the start of a transaction or task. It accepts LOW, NORMAL, HIGH, or an integer from -10 to 10 (LOW is -5, NORMAL is 0 and HIGH is 5). When two tasks deadlock, the one with the lower priority is chosen as the victim.

Syntax: SET DEADLOCK_PRIORITY HIGH
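
As a small sketch of the numeric form, the snippet below sets a priority by number and then reads it back for the current session. It assumes SQL Server 2008 or later, where sys.dm_exec_sessions exposes a deadlock_priority column.

```sql
-- Numeric form: -5 is the same priority as LOW.
SET DEADLOCK_PRIORITY -5;

-- Read back the priority of the current session.
SELECT session_id, deadlock_priority
FROM sys.dm_exec_sessions
WHERE session_id = @@SPID;
```

The setting applies only to the connection that runs it, so each connection has to set its own priority.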

Consider two tasks (connections), T1 and T2, that deadlock each other. Suppose that when you run them normally, T1 becomes the victim, but you want T2 to be the victim instead. To protect T1, set its DEADLOCK_PRIORITY higher than the DEADLOCK_PRIORITY of T2.

In T1, at the top of the query:

SET DEADLOCK_PRIORITY HIGH

GO

In T2, at the top of the query:

SET DEADLOCK_PRIORITY NORMAL

GO

Now open two connections and run the queries. T2 will be the victim.
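
To try this out, one possible way to force a deadlock is sketched below. The tables dbo.DeadlockA and dbo.DeadlockB are hypothetical and exist only for this test. Each connection updates the two tables in the opposite order, and the session with the lower priority should be rolled back with error 1205.

```sql
-- Setup (run once). Table names are hypothetical.
CREATE TABLE dbo.DeadlockA (Id INT PRIMARY KEY, Val INT);
CREATE TABLE dbo.DeadlockB (Id INT PRIMARY KEY, Val INT);
INSERT INTO dbo.DeadlockA VALUES (1, 0);
INSERT INTO dbo.DeadlockB VALUES (1, 0);
GO

-- Connection 1 (T1): higher priority, so it should survive.
SET DEADLOCK_PRIORITY HIGH;
BEGIN TRAN;
UPDATE dbo.DeadlockA SET Val = Val + 1 WHERE Id = 1;
WAITFOR DELAY '00:00:10';   -- gives time to start T2 in the other window
UPDATE dbo.DeadlockB SET Val = Val + 1 WHERE Id = 1;
COMMIT;

-- Connection 2 (T2): start this in a second window while T1 is waiting.
SET DEADLOCK_PRIORITY NORMAL;
BEGIN TRAN;
UPDATE dbo.DeadlockB SET Val = Val + 1 WHERE Id = 1;
WAITFOR DELAY '00:00:10';
UPDATE dbo.DeadlockA SET Val = Val + 1 WHERE Id = 1;
COMMIT;
```

Run the setup first, then the T1 batch in one window and the T2 batch in another within a few seconds of each other.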

Using the concept of Unpivot

Pivoting and unpivoting are very handy features in SQL Server, unpivoting in particular. Consider an example: an Employee table has the fields ID, Name, HomePh, MobilePh and WorkPh, and holds more than 100 million records. Now a new requirement comes in to store a fax number as well. What options do we have? One is to use a plain ALTER TABLE statement to add the new field.

ALTER TABLE Employee
ADD Faxnum VARCHAR(13)

With 100 million records, altering the table may still be acceptable. But what if there are a billion records and another requirement comes in to add one more field, say ManagerPh? We have to alter the table again. A better way to handle this kind of situation is to use UNPIVOT. Look at the pattern: HomePh, MobilePh, WorkPh, FaxNum... all hold the same kind of value (phone numbers). In that sense the table can be called PIVOTED. (Remember this is just an example for understanding; to actually pivot data we need to use an aggregate function on one of the fields.)

We can unpivot this table. Create an employee phone table with the fields ID, PhoneType and Number, and fill it from the existing wide Employee table.
To do that we can use UNPIVOT, or alternatively UNION/UNION ALL.

Using UNPIVOT:

SELECT ID,PhoneType,Number
FROM Employee
UNPIVOT(Number FOR PhoneType IN(HomePh,MobilePh,Workph)) UNP

This turns the fields into records. So if a new phone type needs to be added, there is no need to change the schema of the unpivoted table; we just need to update the query.

Using UNION ALL

SELECT ID, 'HomePh' AS PhoneType, HomePh AS Number
FROM Employee
UNION ALL
SELECT ID, 'MobilePh',MobilePh
FROM Employee
UNION ALL
SELECT ID, 'WorkPh',WorkPh
FROM Employee

The UNION ALL statement above returns the same rows as UNPIVOT, except that UNPIVOT leaves out rows where the number is NULL. I used UNION ALL here. UNION can also be used, but logically UNION performs a DISTINCT on the selected columns, which hurts performance, especially when the table holds a huge amount of data.
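
As a rough sketch of how the unpivoted design could be put in place, the script below creates a narrow table and loads it from the wide one. The table name EmployeePhone, the column sizes, the sample fax number and the employee ID 1 are all assumptions made for illustration. UNPIVOT also requires HomePh, MobilePh and WorkPh to share the same data type and length, assumed here to be VARCHAR(13).

```sql
-- Hypothetical narrow table; the name EmployeePhone is an assumption.
CREATE TABLE EmployeePhone
(
ID INT
,PhoneType VARCHAR(20)
,Number VARCHAR(13)
);

-- One-time load from the wide Employee table.
INSERT INTO EmployeePhone (ID, PhoneType, Number)
SELECT ID, PhoneType, Number
FROM Employee
UNPIVOT(Number FOR PhoneType IN (HomePh, MobilePh, WorkPh)) UNP;

-- A new phone type is now just another row; no ALTER TABLE is needed.
INSERT INTO EmployeePhone (ID, PhoneType, Number)
VALUES (1, 'FaxNum', '555-0100');
```

With this layout, adding ManagerPh later would again be a matter of inserting rows with a new PhoneType value.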

Thursday, August 25, 2011

Find First Sunday of a month

There are several ways to find the first Sunday of a month. The following is one of them.

DECLARE @dtDate DATETIME
SET @dtDate = GETDATE() -- Replace with any other date
SELECT DATEADD(WEEKDAY,(8-(DATEPART(WEEKDAY,DATEADD(mm,DATEDIFF(m,0,@dtDate),0))))%7,DATEADD(mm,DATEDIFF(m,0,@dtDate),0))

First look at
DATEDIFF(m,0,@dtDate). 0 in datetime corresponds to 1900-01-01, so this calculates the number of months from 1900-01-01 to the given date. We then add that difference back to the anchor date.
DATEADD(mm,DATEDIFF(m,0,@dtDate),0). This gives the first day of the given month. Next we have to find the weekday of that day. With the default us_english setting (DATEFIRST 7), SQL Server numbers Sunday as 1 and Saturday as 7.
DATEPART(WEEKDAY,DATEADD(mm,DATEDIFF(m,0,@dtDate),0)). This gives a number from 1 to 7. We subtract this number from 8 because, in a month that starts on a Sunday, the second Sunday would be day 8. We then take the remainder after dividing by 7, as there are seven days in a week. This step is required because if the first day of the month is itself a Sunday, then without the modulo the logic would return the second Sunday as the first Sunday.
Once we have this value, we add that many days to the first day of the month, and the final query is

SELECT DATEADD(WEEKDAY,(8-(DATEPART(WEEKDAY,DATEADD(mm,DATEDIFF(m,0,@dtDate),0))))%7,DATEADD(mm,DATEDIFF(m,0,@dtDate),0))

To get the first Monday, replace 8 with 9; for the first Tuesday, replace 8 with 10; and so on up to the first Saturday. To get the second Sunday, add 7 to the number of days being added; for the third Sunday add 14, and so on. It's that simple.
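
The same calculation can be broken into named steps, which may be easier to follow. This is only a sketch: the variable names @FirstOfMonth and @Offset are made up for illustration, the inline DECLARE initializers assume SQL Server 2008 or later, and SET DATEFIRST 7 is set explicitly so that Sunday is numbered 1.

```sql
SET DATEFIRST 7;   -- Sunday = 1, Saturday = 7

DECLARE @dtDate DATETIME = '20110913';   -- any date in the month of interest

-- First day of the month containing @dtDate.
DECLARE @FirstOfMonth DATETIME = DATEADD(mm, DATEDIFF(m, 0, @dtDate), 0);

-- Days to add to reach the first Sunday (0 when the month starts on a Sunday).
DECLARE @Offset INT = (8 - DATEPART(WEEKDAY, @FirstOfMonth)) % 7;

SELECT DATEADD(DAY, @Offset, @FirstOfMonth)     AS FirstSunday
      ,DATEADD(DAY, @Offset + 7, @FirstOfMonth) AS SecondSunday;
-- For September 2011 this returns 2011-09-04 and 2011-09-11.
```

Trying a date in May 2011, which starts on a Sunday, shows why the modulo matters: the offset becomes 0 and the first Sunday is 2011-05-01.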

Tuesday, August 23, 2011

Masking field in Database table

Today I came across a situation where my manager suddenly asked me to mask a field in a table. Generally people use symmetric key encryption for this. More info is available at http://msdn.microsoft.com/en-us/library/ms179331.aspx. But what if it's an immediate request and you are not familiar with encryption? A quick and easy way is to use VARBINARY. Take an example

CREATE TABLE Emp
(
EmpId INT IDENTITY(1,1)
,EmpName VARCHAR(50)
,SSN VARCHAR(15)
,MaskedSSN AS CAST(SSN AS VARBINARY(50))
)

INSERT INTO Emp
SELECT 'ABCD','123-45-3215'
UNION
SELECT 'FERG','879-55-4312'

In the above example I am using a computed column, MaskedSSN. Because it is not persisted, this field is recalculated every time a query is run against the table. Now do a SELECT.

SELECT EmpID,EmpName,SSN,MaskedSSN FROM Emp
The output will be

EmpID EmpName SSN MaskedSSN
1 ABCD 123-45-3215 0x3132332D34352D33323135
2 FERG 879-55-4312 0x3837392D35352D34333132

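The SSN field is now masked. Keep in mind that this only obscures the value; it is not encryption, and the value can be cast back to text. You can create a view that returns EmpID, EmpName and MaskedSSN AS SSN, and remove access to the table itself. If you want the actual SSN field in the table to be masked by default, use the following script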
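
A minimal sketch of the view approach follows. The view name vEmp and the role name ReportingRole are hypothetical, and the sketch assumes the Emp table lives in the dbo schema and that the view and the table have the same owner.

```sql
-- View that exposes only the masked value under the name SSN.
CREATE VIEW dbo.vEmp
AS
SELECT EmpId, EmpName, MaskedSSN AS SSN
FROM dbo.Emp;
GO

-- ReportingRole is a made-up role name for illustration.
GRANT SELECT ON dbo.vEmp TO ReportingRole;
REVOKE SELECT ON dbo.Emp FROM ReportingRole;
```

Users in that role can then query dbo.vEmp but not the base table.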

CREATE TABLE Emp
(
EmpId INT IDENTITY(1,1)
,EmpName VARCHAR(50)
,SSN VARBINARY(256)
)

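-- varchar is not implicitly converted to varbinary, so cast explicitly
INSERT INTO Emp (EmpName, SSN)
SELECT 'ABCD',CAST('123-45-3215' AS VARBINARY(256))
UNION
SELECT 'FERG',CAST('879-55-4312' AS VARBINARY(256))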

When you do a select, the output will be

EmpID EmpName SSN
1 ABCD 0x3132332D34352D33323135
2 FERG 0x3837392D35352D34333132

If you want to find the actual SSN, you can use the following query
SELECT EmpID,EmpName,CAST(SSN AS VARCHAR(256)) AS SSN FROM Emp
which will give the actual SSN.

EmpID EmpName SSN
1 ABCD 123-45-3215
2 FERG 879-55-4312

Friday, August 19, 2011

Implementing INTERSECT ALL

We know that SQL Server has built-in keywords for set operations.

1.UNION
2.UNION ALL
3.INTERSECT
4.EXCEPT

Consider tables A and B. If we use A UNION B, the result set is the records from both tables A and B without repetition, i.e., duplicates are eliminated. If we want to include duplicates, we have UNION ALL. If we use A UNION ALL B, the result set is all the records from both A and B.

In the same way, we have INTERSECT. This returns the records that exist in both tables A and B, without repetition (duplicates). What if we want to include duplicates as well? SQL Server has no INTERSECT ALL operation, but we can still achieve this programmatically. There is a rule for it: if a record appears x times in table A and y times in table B, INTERSECT ALL returns the record minimum(x,y) times. For example, if a record exists 3 times in table A and 4 times in table B, INTERSECT ALL returns the record 3 times, because the record is certain to intersect 3 times across the two tables. Here is how we can achieve this

;WITH INTERSECTALL
AS
(
SELECT ROW_NUMBER() OVER(PARTITION BY [col1,col2,..coln] ORDER BY (SELECT 0)) AS RowNum
,[col1,col2,...coln]
FROM db.schema.tablename

INTERSECT

SELECT ROW_NUMBER() OVER(PARTITION BY [col1,col2,..coln] ORDER BY (SELECT 0)) AS RowNum
,[col1,col2,...coln]
FROM db.schema.tablename
)
SELECT [col1,col2,...coln]
FROM INTERSECTALL

In the above CTE, col1,col2,...coln are the columns that you want from tables A and B. Remember that the same number of columns must be selected from both tables and that their data types must be compatible.
db is the name of the database
schema is the schema to which the table belongs
tablename is the name of the table (table A in the first SELECT and table B in the second).
(SELECT 0) in the ORDER BY clause of ROW_NUMBER() tells SQL Server that the ordering is not important. You may use any constant.

Here I use ROW_NUMBER() to number the occurrences of each row within its table, so the n-th copy of a row in one table can only match the n-th copy in the other. Since ROW_NUMBER() shouldn't be returned in the final result, I have put the code in a CTE and pull only the required fields from it. Here is an example. Suppose you have two tables, Employees and Customers, and you want to show the Country, Region and City fields that exist in both tables, with duplicates (INTERSECT ALL; remember the number of repetitions is MIN(x,y)). The code would look like this

WITH INTERSECTALL
AS
(
SELECT ROW_NUMBER() OVER(PARTITION BY country,region,city ORDER BY (SELECT 0)) AS RowNum
,country
,city
,region
FROM HR.Employees

INTERSECT

SELECT ROW_NUMBER() OVER(PARTITION BY country,region,city ORDER BY (SELECT 0)) AS RowNum
,country
,city
,region
FROM Sales.Customers
)
SELECT country
,city
,region
FROM INTERSECTALL
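
To see the MIN(x,y) rule in action without real tables, here is a small sketch using two table variables. The names @A, @B and val are made up for illustration, and the multi-row VALUES syntax assumes SQL Server 2008 or later. The value 'x' appears 3 times in @A and 4 times in @B.

```sql
DECLARE @A TABLE (val VARCHAR(10));
DECLARE @B TABLE (val VARCHAR(10));

INSERT INTO @A (val) VALUES ('x'), ('x'), ('x'), ('y');
INSERT INTO @B (val) VALUES ('x'), ('x'), ('x'), ('x'), ('z');

;WITH INTERSECTALL
AS
(
SELECT ROW_NUMBER() OVER(PARTITION BY val ORDER BY (SELECT 0)) AS RowNum
,val
FROM @A

INTERSECT

SELECT ROW_NUMBER() OVER(PARTITION BY val ORDER BY (SELECT 0)) AS RowNum
,val
FROM @B
)
SELECT val
FROM INTERSECTALL;
-- Returns 'x' three times: only row numbers 1 to 3 exist on both sides.
```

The values 'y' and 'z' are dropped because each exists in only one of the two tables.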