Thursday, September 16, 2010

Stored Procedure best practices

There is a nice presentation that was found on the web published by Kevin Kline. The actual topic starts from slide 9 / page 10. This indeed will be very helpful. The performance of the procedures increases significantly.

http://www.quest.com/sql_server/pdfs/Stored-Procedure-BP.pdf

Wednesday, September 15, 2010

Optimizing SSIS packages

When I was working for a client, I faced a serious problem with SSIS package. The SSIS package is simple, extract data from csv file and two databases and load the data into a single table. I used a flat file source, two OLEDB Sources, a data conversion and one OLEDB destination. Once the package was created in the dev, I executed it and it took whooping 5 hours to execute. I was scared of the time it took. We were supposed to schedule a job with this package that executes every 30 mins. So, I started digging into the package. There were several fields from the source which were not used. Instead of using a table in the source, I wrote a SQL command that selects only the desired fields. In the destination mappings were updated. I re ran the package and it did not make a significant impact on the execution time. I was confused of what to do. Went into the destination table and checked the schema. Wow, its the index that caused all the problems. There is one clustered and three non clustered indexes on the table. I disabled the indexes and re ran the package. Its amazing the package executed in 15 minutes. So, I added few steps in the package where before the dataflow begins, the indexes are disabled and once the dataflow executes, the indexes are rebuilt. This way the performance was amazing and the job is scheduled to run the package every 30 mins. These are the few steps which would be helpful to optimize the SSIS package.

1. While extracting the data from source use query to pull only the required fields
2. If there are indexes on the destination table, disable them and enable the indexes once the data load is complete.
3. If data is extracted and loaded in the same server, use SQL server Destination instead of OLEDB Destination.

There are some other optimization techniques but, they depend on the environment and the kind of package thats being developed. Since SSIS is mostly about data, pull and push be optimized first.

The magic of number 9

The number 9 plays a significant role in many of our lives. 9 is considered lucky by many people who believe in numerology. But, there is a magic that 9 has. The sum of 9s will be a 9. For instance consider 99. Add 9 and 9 which is 18 and again add 1 and 8, the result is 9. No matter how many 9 are taken, the sum of all those 9s will be a 9. 9 is the only number which has this significance.

Tuesday, September 14, 2010

Performance hit with Indexes

One main reason for the index creation on field(s) of a table is to fetch the results quickly. There is a performance hit because of the index. This happens while loading the data (inserting the data). Sometimes, the data insert would take quite a long time if an index exists on a table. To over come this issue, disable the indexes on a table while inserting the data. Once the insert is done, re enable the index.

The following is the syntax to disable the index

ALTER INDEX [IndexName] ON Database.Schema.tablename DISABLE

The following is the syntax to enable the index

ALTER INDEX [IndexName] ON Database.Schema.tablename REBUILD

The main reasons for performance hit while inserting data on a table with indexes are

1. If a table has a clustered index, each and every record inserted has to be sorted. If the table has less records, it will not be an issue. If the table has huge amounts of data, imagine sorting of the whole table for each and every record insert (takes significant amount of time and resources).

Optimizing Complex T-SQL queries / stored procedures

Generally when asked about optimizing T-SQL queries, people say to do this and that. But, after working extensively on SQL Server, I found that there are no fixed rules set forth that would work in all the senarios. The optimization technique that is used in one query may reduce the performance when applied in another query resulting in longer execution time. But, there is a way in SQL Server to identify and apply optimization technique. Its called the "EXECUTION PLAN" which is an option available in query analyzer.

The following steps will result in identying better optimization rule.

1. Take the query / stored procedure that is taking too long to execute.
2. In the Query Analyzer click on the show estimated execution plan.
3. A graphical representation will be shown how the compiler executes the code section and an estimated cost will be shown. The general rule of thumb is to confine this cost below 20%.
4. If there are table scans, create non-clustered indexes so that the execution will be faster.
5. Make sure that the fields that are used in joins have indexes on them.
6. Avoid sub - queries. Instead use joins (this may not be ideal in all situations the vice versa may work).
7. If a stored procedure needs to be optimized, see for the temp tables. Temp tables use lot of memory and disk storage should be kept in mind. If the temp table is used in the very next piece of code and no where else later, try replacing the temp table with a Common Table Expression (CTE).
8. CTEs scope is very limited to the next statement. But one advantage is CTE stores the data in the RAM and gives the data on the fly.

These are some of the steps that can be followed. But, again the optimization completely depends on the kind of query and environment. Also, there are some additional steps that can be followed while optimizing a stored procedure. The following are the steps that can be looked at.

1. Use SET NOCOUNT to ON so that the number  of records affected will not be returned.
2. Generally system stored procedures that exist in Master database starts with sp_ProcName. Avoid a prefix sp while creating stored procedures in databases other than master because SQL Server initially scans master database for the stored procedures starting with sp. If a procedure is created with sp_proc in some user's database and the same procedure exists in master, the user proc will not execute.

Installing, Configuring and customizing work item templates for Team Foundation Server 2010

Team Foundation Server is a Microsoft Product offering source control, data collection, reporting and project tracking. Installation of the Team Foundation Server for the first time for a novice will be a little bit tricky. One major thing that should be remembered is TFS 2010 does not work on Windows Server 2003 64 bit edition. Inorder to install and use TFS the following softwares are required.

1. Team Foundation Server 2010
2. Visual Studio Ultimate 2010

Install TFS 2010. After installation is done, TFS needs to be configured. There are two different types of configurations that can be set.
1. Basic
2. Advanced.

More to come....