Rahul Raj: August 2014

Aug 17, 2014

Understanding Execution Plan

Cluster Index Scan
If there is no WHERE clause or it is on column which is not indexed, then SQL Server scans the entire table. If table has a Clustered Index it will do a Clustered Index Scan otherwise it will do a Table Scan.

Cluster Index Seek
If where clause is on cluster index then it will do cluster index seek.

Non Cluster Index Seek

CREATE NONCLUSTERED INDEX [IX_sku]
ON [dbo].[Products] ([sku])
GO

In following case SQL Server will ideally do Non Cluster Index Seek

SELECT ID, SKU FROM [dbo].[Products] WHERE SKU = '123456'

Non Cluster Index Seek and Key Lookup on the clustered index

If where clause is on non-cluster index column and you are selecting * or columns other than the column on which table has non-cluster and cluster index , then SQL Server first does Index Seek (nonclustered) and then do a key lookup (clustered). The other operation which happens is Nested Loops which joins results from Index Seek and the Key Lookup. Included columns should be columns that you don't search on, but will return.

SELECT ID, SKU,Description FROM [dbo].[Products] WHERE SKU = '123456'

You can improve performance in above scenario by Creating a Covering Index or by Creating an Index with Included Columns. This way SQL Server doesn't have to do Key Lookup.

CREATE NONCLUSTERED INDEX [IX_LastName]
ON [dbo].[Products] ([sku],[Description])

Joining on non-like datatypes frequently causes table scan even when indexes are present

2005 introduced Dynamic Management Views (DMVs) which can be used to see how sql server uses index. This can also be used for Identifying Unused Indexes.

DMV - sys.dm_db_index_operational_stats

SELECT 
 OBJECT_NAME(A.[OBJECT_ID]) AS [OBJECT NAME], 
 I.[NAME] AS [INDEX NAME], 
 A.LEAF_INSERT_COUNT, 
 A.LEAF_UPDATE_COUNT, 
 A.LEAF_DELETE_COUNT 
FROM   
 SYS.DM_DB_INDEX_OPERATIONAL_STATS (db_id(),NULL,NULL,NULL ) A 
INNER JOIN SYS.INDEXES AS I ON 
 I.[OBJECT_ID] = A.[OBJECT_ID] 
    AND I.INDEX_ID = A.INDEX_ID 
WHERE  
 OBJECTPROPERTY(A.[OBJECT_ID],'IsUserTable') = 1
 AND OBJECT_NAME(A.[OBJECT_ID]) = 'Products'

SYS.DM_DB_INDEX_USAGE_STATS

SELECT 
 OBJECT_NAME(S.[OBJECT_ID]) AS [OBJECT NAME], 
 I.[NAME] AS [INDEX NAME], 
 USER_SEEKS, 
 USER_SCANS, 
 USER_LOOKUPS, 
 USER_UPDATES 
FROM   
 SYS.DM_DB_INDEX_USAGE_STATS AS S 
INNER JOIN 
 SYS.INDEXES AS I 
 ON I.[OBJECT_ID] = S.[OBJECT_ID] 
 AND I.INDEX_ID = S.INDEX_ID 
WHERE  
 OBJECTPROPERTY(S.[OBJECT_ID],'IsUserTable') = 1
    AND S.database_id = DB_ID() 
 AND OBJECT_NAME(S.[OBJECT_ID]) = 'Products'

The seeks refer to how many times an index seek occurred for that index. A seek is the fastest way to access the data, so this is good.
The scans refers to how many times an index scan occurred for that index. A scan is when multiple rows of data had to be searched to find the data. Scans are something you want to try to avoid.
The lookups refer to how many times the query required data to be pulled from the clustered index or the heap (does not have a clustered index). Lookups are also something you want to try to avoid.
The updates refers to how many times the index was updated due to data changes which should correspond to the first query above.

To manually remove a single plan or all plans from the cache, use DBCC FREEPROCCACHE
Reference:
https://www.simple-talk.com/sql/learn-sql-server/sql-server-index-basics/

Aug 14, 2014

Agile Methodology - Scrum, Kanban

Scrum

In scrum you have small team working on list of small, concrete deliverables. You divide big organization into small cross-functional and self-organizing teams.

It has fixed length iteration (usually 2 – 4 weeks) at the end of which you have shippable code.

Each items in the iteration (sprint) are estimated and team commits on finishing it by the end of sprint.

At the beginning of iteration team pulls out items from the product backlog, based on the product owner’s priorities and how much the team thinks they can complete in one iteration.

After each iteration the process is optimized by having retrospective.

Since you working on a small thing which is part of a large group, you integrate regularly to see the whole.

Scrum prescribes three roles: Product Owner (sets product vision & priorities), Team (implements the product) and Scrum Master (removes impediments and provides process leadership).

Scrum resistance change within the iteration, which means you typically don't add new item in the middle of iteration as team is commited with items pulled in the sprint.

Scrum prescribes estimation and velocity(per iteration how much hour of work a team can complete).

Burndown chart prescribed

Kanban

Kanban leaves almost everything open. The only constraints are visualize your workflow and limit your WIP.

Split the work into pieces, write each item on a card and put on the wall

Use named columns to illustrate where each item is in the workflow.

Limit work in progress ( WIP), this way proper flow to finish line is always maintained.

Measure the lead time which is average time to complete one item.

It does not prescribe any roles but that doesn't mean you shouldn't have. You are free to add role(Product Owner) which make sense to your project. In a big project Project Manager role may be great idea, perhaps that’s the guy who helps multiple teams & product owners synchronize with each other. In a small project that role might be waste, or worse, might lead to suboptimization and micromanagement.

Kanban board doesn't need to be reset or start over.

In Kanban, cross-functional teams are optional, and a board doesn’t need to be owned by one specific team. A board is related to one workflow, not necessarily one team.

In Kanban, estimation is not prescribed. Uses Lead time as default metric for planning and process improvement.

Lean Principals

Fail Fast, Learn Rapidly
Planning and Commitment are two different thing. Plan thoroughly and commit sparingly.
Think Big (Look at the whole Value Stream), Act Small
There is no process which cannot be improved. Continuous Improvement is never-ending.
Designs evolve, so do not waste time locking it down early.
Release Early, Release Often
Predictions of the future are always going to be inaccurate, by locking down designs, processes, requirements, etc., we lose our ability to respond to the inevitable unpredictability. Change is inevitable!
Build quality from the start -- don't rely on testing it in later. The job of testing is to prevent defects not find defects
Specify the most important things and deliver them fast, then take the next most important things. And repeat. We do not know what we do not know, so it is not possible to define everything (and have it remain unchanged) up-front.
Everything not adding value to the customer is waste - stop doing it
Don't equate speed with hacking. A fast moving development team must have excellent reflexes and a well-disciplined "Stop the Line" culture.
Treat each other as our Customer. Find good people and let them do their job.
When a defect is found-Stop the Line, Find the Cause, Fix it

Aug 11, 2014

Asynchronous vs Synchronous Method

ASP.Net processes a request (http) by using a thread from thread pool. The processing thread cannot service any other request if its being processed synchronously. This can eventually result in condition known as thread starvation. After this condition server starts queuing up the request and if the queue becomes full server starts rejecting request with 503 status code - Server Too Busy.

In IIS you can set Threads Per Processor Limit and the Queue Length (IIS->Web Server(Feature View)->ASP)

Example of how you can write async action in MVC/Web api

 public async Task Create(Product product)
 {
                .   
                .
                .
var response = await client.PostAsJsonAsync(uri, product);
                .
                .  
}

In the above example, code will be executed synchronously until it hits await key word. The await keyword registers rest of the method as a callback on the task and then it immediately return. This will free up the processing thread. When the awaited task completes, it will invoke that callback to resume the execution of the method.

This is mostly helpful for I/O bound operations. For I/O intensive activity thread tells kernel what I/O it wants done, after which it continues on, rather than sitting and waiting for response from kernel. When kernel is done with I/O it callback so that rest of the processing can be done.

On the hand if the operation is primarily CPU intensive using asynchronous action will not provide any benefits rather it will make things worse because of its overhead.

http://msdn.microsoft.com/en-us/library/hh873177(v=vs.110).aspx

Aug 9, 2014

Protecting your Mvc page from CSRF

Adding @Html.AntiForgeryToken() to a view does following

 
 sets __RequestVerificationToken Cookie
 add hidden input __RequestVerificationToken to the page

And then you can add attribute ValidateAntiForgeryToken to the HttpPost action which will validate __RequestVerificationToken between cookie and posted hidden input value. This helps defend against cross-site request forgery. If any of the values are missing or the values does not match (it has some encryption logic so if you try to compare, the value it will not looksame), it will throw HttpAntiForgeryException.

With the absence of attribute ValidateAntiForgeryToken, your site can be easily prone to csrf. Refer to following for a quick way to create this condition

<body>
<form name="badform" method="post" action="http://localhost/product-ui/Product/Create">
<input type="hidden" name="sku" value="1234" />
<input type="hidden" name="overview" value="something...." />
</form>
<script type="text/javascript">
document.badform.submit();
</script>
</body>