Workshops

Instructor
Schedule
Feedbacks
Webcasts
Contacts




Sign up

for Email

Updates


Parallel Sort and Merge Join – Watch out for unpredictability in performance:

When SQL Server sorts or does merge join in parallel the query performance highly depends on other concurrent CPU intensive activities taking place on the server.

In the below example you will see that when we execute a query with MAXDOP 0 it completes in less than a second. When we have a CPU intensive query executing over a CPU core, the above query with MAXDOP 0 takes 42 seconds. In reality it need not just be 42 seconds, it could be many minutes or hours depending on data volume. Also possible solutions are discussed in webcast 3.

To read additional articles I wrote click here.

This and similar topics are covered in my workshops. If you are interested in learning practical techniques to improve your SQL server performance, register for the upcoming 3 Day Level 400 Microsoft SQL Server 2012 and 2008 Performance Monitoring & Tuning Hands-on Workshop in Helsinki, Finland during February 27-March 01, 2013, click here to register. These are hands-on workshops with a maximum of 12 participants and not lectures. For consulting engagements click here.

Enough explanation, let’s see a real world example.

The best way to learn is to practice. To create the below tables and reproduce the behavior, join the mailing list by using this link: www.sqlworkshops.com/ml and I will send you the table creation script.

Let’s execute the query with MAXDOP 0:

The reason I am using convert is to trick the optimizer to grant additional memory. The memory estimated by the optimizer without convert is not enough to perform an in-memory sort. The sort spills to tempdb without the convert leading to non optimal performance. You can learn more about this from my webcasts or watching my SQL Bits V session at http://go.microsoft.com/?linkid=9712350 or reading the MSSQL Tip based on our webcast, http://www.mssqltips.com/tip.asp?tip=1955 (written by Matteo Lorini).

set statistics time on

go

 

--Example provided by www.sqlworkshops.com

declare @order_date datetime, @order_status int, @order_id int,

      @order_description varchar(4000)

select @order_date = order_date, @order_status = order_status,

            @order_id = order_id, @order_description = convert(varchar(4000), order_description)

      from Orders

      where order_date between '2005-01-01' and '2005-06-30'

      order by order_id

      option (maxdop 0)

go

--Query takes less than a second

Now let’s execute a CPU intensive query over another session:

--Example provided by www.sqlworkshops.com

--Execute this CPU intensive query on a different session

declare @i int

while 1=1

select @i = count(*) from sys.objects option (maxdop 1)

go

Now let’s execute the above query with MAXDOP 0:

--Example provided by www.sqlworkshops.com

declare @order_date datetime, @order_status int, @order_id int,

      @order_description varchar(4000)

select @order_date = order_date, @order_status = order_status,

            @order_id = order_id, @order_description = convert(varchar(4000), order_description)

      from Orders

      where order_date between '2005-01-01' and '2005-06-30'

      order by order_id

      option (maxdop 0)

go

 

--Query takes about 42 second

--How many seconds did it take on your server ?

The reason for the performance issue is the way SQL Server implements merging or order preserving exchanges and SQL Server’s CPU/thread scheduling architecture. You can see our ‘Gather Streams’ has ‘Order By’, the sorted output of the parallel thread has to be merged by ‘Gather Streams’ maintaining the sorted order.

Every parallel thread supplies a page of sorted rows (packet) and the ‘Gather Streams’ merges these pages. Once ‘Gather Streams’ runs out of rows from a thread it waits for the next page of sorted rows from that thread. If that thread (which has to supply the next page of sorted rows) is executing on a CPU core that is busy executing a CPU intensive query, then the thread has to wait up to 4ms to supply the next page of sorted rows.

Duration depends on how many CPU cores you have, with more cores, each thread sorts less rows.

4ms is the scheduling quantum in SQL Server. Refer to the article www.sqlworkshops.com/cpupressure I wrote a while ago and the webcasts for additional details.

Some of you might think well in that case I will not execute queries in parallel, will set MAXDOP to 1. There are cases where you have to use all the CPU cores (return on investment (ROI), Maximizing your existing hardware CPU, memory and disk); you cannot tell your customers ‘well everyone has to wait while I execute this report using 1 CPU core keeping the rest of the CPU cores idle for the next hour’. You have to find creative ways to resolve these issues. One way to solve the issue and bring back query performance predictability is to partition the workload; I discuss this in my webcast (www.sqlworkshops.com/webcasts).

This issue is just not limited to Sort; below you can see SQL Server chooses to execute the query in parallel using merge join.

The query executes faster with merge join when the CPU intensive query is not executing:

The query does parallel merge join:

When we force hash join, the query performance is similar:

The query does parallel hash join:

Let's execute the CPU intensive query on a CPU core:

Now the Merge join query is very slow, takes in my case 19 seconds:

In the execution plan you can see merge join:

Now when we force a hash join, the query is fast.

In the execution plan you can see hash join:

This is because hash join does not suffer from CPU/thread scheduling architecture like merge join. Parallel hash join suffers from memory pressure due to uneven row distribution, similar to parallel sort, which I explain in my webcasts.

In a situation like above, when the cost of parallel merge and parallel hash are close, one can choose hash (with an hint) as it gives performance predictability. And again the other possibility is to partition the workload (Soft-NUMA) like I discuss in webcast 3.

I explain these concepts with detailed examples in my webcasts (www.sqlworkshops.com/webcasts), I recommend you to watch them. The best way to learn is to practice. To create the above tables and reproduce the behavior, join the mailing list at www.sqlworkshops.com/ml and I will send you the relevant SQL Scripts.

Register for the upcoming 3 Day Level 400 Microsoft SQL Server 2012 and 2008 Performance Monitoring & Tuning Hands-on Workshop in Helsinki, Finland during February 27-March 01, 2013, click here to register. These are hands-on workshops with a maximum of 12 participants and not lectures. For consulting engagements click here.

Disclaimer and copyright information:
This article refers to organizations and products that may be the trademarks or registered trademarks of their various owners.

Copyright of this article belongs to R Meyyappan / www.sqlworkshops.com. You may freely use the ideas and concepts discussed in this article with acknowledgement (www.sqlworkshops.com), but you may not claim any of it as your own work.

This article is for informational purposes only; you use any of the suggestions given here entirely at your own risk.

SQL Workshop Feedbacks

Ramesh's consulting customers include Austria: Allgemeines Krankenhaus Linz, BMD Systemhaus, Bundesrechenzentrum, bwin, C07 IT Operations, CMC Markets, derStandard, ERGO Insurance Service, Fabasoft Austria, Greentube, IAEA (UN), Interwetten, INTERSPORT, Kapsch, Linz AG, Mayr-Melnhof Karton, NÖM, ÖBB, OMV, Österreichische Post, RACON West Software, Raiffeisen Bankengruppe, Raiffeisen Informatik, S.Spitz, Salzburg AG, Siemens, TIWAG, UC4, Vinzenz Gruppe, Wiener Zeitung, Wirtschaftskammer Österreich; Bulgaria: MobilTel; Czech Republic: Česká pojišťovna, ČSOB Bank, GTS Novera, Telefónica O2, Vodafone; Denmark: Danfoss, Egmont, NNIT; Finland: Cap Gemini, Kemira, Pohjola; Germany: JAM Software, SAP AG; Netherlands: De Nederlandsche Bank; Norway: Acta, Aker Solutions, Det Norske Veritas(DNV), Komplett; Russia: Mediatel; Serbia: Tarkett; Slovakia: Adastra Corporation, VÚB Banka; Slovenia: Bankart, Iskratel, KDD, Krka, Mobitel, Pošta Slovenije, Sava, SRC.SI; South Africa: De Beers, Kumba, Sasol; Spain: CIE Automotive, Gas Natural, ITP; Sweden: Sony Ericsson; Switzerland: F. Hoffmann-La Roche, Philip Morris International; UAE: Dubai Customs. For past consulting engagement feedbacks click here.

COPYRIGHT AND TRADEMARK NOTICES:

The copyright in all material provided on this site is held by SQLWorkshops.com or by the original creator of the material. Except as stated herein, none of the material may be copied, reproduced, distributed, republished, downloaded, displayed, posted or transmitted in any form or by any means, including, but not limited to, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of SQLWorkshops.com or the copyright owner. Permission is granted to download the webcast on this Site for personal, non-commercial use only, provided you do not modify the webcast and that you retain all copyright and other proprietary notices contained in the webcast. You also may not, without SQLWorkshops.com permission, "mirror" any webcast contained on this Site on any other server. This permission terminates automatically if you breach any of these terms or conditions. Upon termination, you must immediately destroy any downloaded webcast. Any unauthorized use of any webcast contained on this Site may violate copyright laws, trademark laws, the laws of privacy and publicity, and communications regulations and statutes.
Microsoft is a registered trademark of Microsoft Corporation in the United States and/or other countries. SQL Server is a trademark of Microsoft Corporation in the United States and/or other countries. This webcast may refer to organizations and products that may be the trademarks or registered trademarks of their various owners.
You may freely use the ideas and concepts discussed in this webcast, but you may not claim any of it as your own work. You may quote freely, with acknowledgements to SQLWorkshops.com.
This article is for informational purposes only; you use any of the suggestions given here entirely at your own risk.

 

© 2010 www.sqlworkshops.com. All rights reserved. Terms of Use