|
Parallel Sort and Merge Join Parallel Sort and Merge Join – Watch out for unpredictability in performance:
When SQL Server sorts
or does merge join in parallel the query performance highly
depends on other concurrent CPU intensive activities taking
place on the server.
In the below example you
will see that when we execute a query with MAXDOP 0 it completes
in less than a second. When we have a CPU intensive query
executing over a CPU core, the above query with MAXDOP 0 takes
42 seconds. In reality it need not just be 42 seconds, it could
be many minutes or hours depending on data volume. Also possible
solutions are discussed in webcast
3.
To read additional articles I wrote
click
here.
This and similar topics are covered in my
workshops. If you are interested in learning practical
techniques to improve your SQL server performance, register for the upcoming 3 Day Level 400 Microsoft SQL Server 2012 and 2008 Performance Monitoring & Tuning Hands-on Workshop in Helsinki, Finland
during
February 27-March 01, 2013, click
here to register. These
are hands-on workshops with a maximum of 12 participants and not
lectures. For consulting engagements click
here.
Enough explanation,
let’s see a real world example.
The best way to learn is
to practice. To create the below tables and reproduce the
behavior, join the mailing list by using this link:
www.sqlworkshops.com/ml and
I will send you the table creation script.
Let’s execute the query with MAXDOP 0:
The reason I am using convert is to trick the
optimizer to grant additional memory. The memory estimated by
the optimizer without convert is not enough to perform an in-memory sort. The
sort spills to tempdb without the convert leading to non optimal
performance. You can learn more
about this from my webcasts or watching my SQL Bits V session at
http://go.microsoft.com/?linkid=9712350
or reading the MSSQL Tip based on our webcast,
http://www.mssqltips.com/tip.asp?tip=1955
(written by
Matteo Lorini).
set
statistics
time on
go
--Example provided
by www.sqlworkshops.com
declare
@order_date datetime,
@order_status int,
@order_id int,
@order_description varchar(4000)
select
@order_date = order_date, @order_status = order_status,
@order_id = order_id,
@order_description =
convert(varchar(4000),
order_description)
from
Orders
where
order_date between
'2005-01-01'
and
'2005-06-30'
order
by order_id
option
(maxdop
0)
go
--Query takes less than a second

Now let’s execute a CPU intensive query
over another session:
--Example provided
by www.sqlworkshops.com
--Execute this CPU
intensive query on a different session
declare
@i int
while
1=1
select
@i =
count(*)
from
sys.objects
option
(maxdop 1)
go


Now let’s execute the above query with
MAXDOP 0:
--Example provided
by www.sqlworkshops.com
declare
@order_date datetime,
@order_status int,
@order_id int,
@order_description varchar(4000)
select
@order_date = order_date, @order_status = order_status,
@order_id = order_id,
@order_description =
convert(varchar(4000),
order_description)
from
Orders
where
order_date between
'2005-01-01'
and
'2005-06-30'
order
by order_id
option
(maxdop
0)
go
--Query takes about
42 second
--How many seconds
did it take on your server ?
 The reason for the performance issue is the
way SQL Server implements merging or order preserving exchanges
and SQL Server’s CPU/thread scheduling architecture. You can see
our ‘Gather Streams’ has ‘Order By’, the sorted output of the
parallel thread has to be merged by ‘Gather Streams’ maintaining
the sorted order.

Every
parallel thread supplies a page of sorted rows (packet) and the
‘Gather Streams’ merges these pages. Once ‘Gather Streams’ runs
out of rows from a thread it waits for the next page of sorted
rows from that thread. If that thread (which has to supply the
next page of sorted rows) is executing on a CPU core that is
busy executing a CPU intensive query, then the thread has to
wait up to 4ms to supply the next page of sorted rows.
Duration depends on how many CPU cores you have, with more
cores, each thread sorts less rows.
4ms is the scheduling quantum in SQL
Server. Refer to the article
www.sqlworkshops.com/cpupressure
I wrote a while ago
and the webcasts for additional details.

Some
of you might think well in that case I will not execute queries
in parallel, will set MAXDOP to 1.
There are cases where you
have to use all the CPU cores (return on investment (ROI),
Maximizing your existing hardware CPU, memory and disk); you
cannot tell your customers ‘well everyone has to wait while I
execute this report using 1 CPU core keeping the rest of the CPU
cores idle for the next hour’. You have to find creative ways to
resolve these issues. One way to solve the issue and bring back
query performance predictability is to partition the workload; I
discuss this in my webcast (www.sqlworkshops.com/webcasts).
This issue is just not limited to Sort;
below you can see SQL Server chooses to execute the query in
parallel using merge join.
The query executes faster with merge join
when the CPU intensive query is not executing:
 The query does parallel merge join:
 When we force hash join, the query
performance is similar:
 The query does parallel hash join:
 Let's execute the CPU intensive query
on a CPU core:


Now the Merge join query is very slow, takes in
my case 19 seconds:
 In the execution plan you can see merge
join:
 Now when we force a hash join, the query is
fast.
 In the execution plan you can see hash
join:
 This is because hash join does not suffer
from CPU/thread scheduling architecture like merge join.
Parallel hash join suffers from memory pressure due to uneven
row distribution, similar to parallel sort, which I explain in
my webcasts.
In a situation like above, when the cost of
parallel merge and parallel hash are close, one can choose hash
(with an hint) as it gives performance predictability.
And again the other possibility is to partition the workload
(Soft-NUMA) like I discuss in
webcast
3.
I explain these concepts with detailed examples in
my webcasts (www.sqlworkshops.com/webcasts), I recommend
you to watch them. The best way to learn is to practice. To
create the above tables and reproduce the behavior, join the
mailing list at
www.sqlworkshops.com/ml and I will send you the
relevant SQL Scripts.
Register for the upcoming 3 Day Level 400 Microsoft SQL Server 2012 and 2008 Performance Monitoring & Tuning Hands-on Workshop in Helsinki, Finland
during
February 27-March 01, 2013, click
here to register. These
are hands-on workshops with a maximum of 12 participants and not
lectures. For consulting engagements click
here.
Disclaimer and copyright information:
This article refers to organizations and products that may be
the trademarks or registered trademarks of their various owners.
Copyright of this article belongs to
R Meyyappan
/ www.sqlworkshops.com. You may freely use the ideas and
concepts discussed in this article with acknowledgement
(www.sqlworkshops.com), but you may not claim any of it as your
own work.
This article is for informational purposes
only; you use any of the suggestions given here entirely at your
own risk.

Ramesh's consulting customers include Austria:
Allgemeines Krankenhaus Linz, BMD Systemhaus,
Bundesrechenzentrum, bwin, C07 IT Operations, CMC Markets,
derStandard, ERGO Insurance Service, Fabasoft Austria,
Greentube, IAEA (UN), Interwetten, INTERSPORT, Kapsch, Linz
AG, Mayr-Melnhof Karton, NÖM, ÖBB, OMV, Österreichische
Post, RACON West Software, Raiffeisen Bankengruppe,
Raiffeisen Informatik, S.Spitz, Salzburg AG, Siemens, TIWAG,
UC4, Vinzenz Gruppe, Wiener Zeitung, Wirtschaftskammer
Österreich; Bulgaria: MobilTel;
Czech Republic: Česká pojišťovna, ČSOB Bank, GTS
Novera, Telefónica O2, Vodafone; Denmark:
Danfoss, Egmont, NNIT; Finland: Cap Gemini,
Kemira, Pohjola; Germany: JAM Software, SAP
AG; Netherlands: De Nederlandsche Bank;
Norway: Acta, Aker Solutions, Det Norske
Veritas(DNV), Komplett; Russia: Mediatel;
Serbia: Tarkett; Slovakia:
Adastra Corporation, VÚB Banka; Slovenia:
Bankart, Iskratel, KDD, Krka, Mobitel, Pošta Slovenije,
Sava, SRC.SI; South Africa: De Beers,
Kumba, Sasol; Spain: CIE Automotive, Gas
Natural, ITP; Sweden: Sony Ericsson;
Switzerland: F. Hoffmann-La Roche, Philip
Morris International; UAE: Dubai Customs.
For past consulting engagement feedbacks click
here.
COPYRIGHT AND TRADEMARK NOTICES:
The copyright in all material provided on this site is held
by SQLWorkshops.com or by the original creator of the material.
Except as stated herein, none of the material may be copied,
reproduced, distributed, republished, downloaded, displayed,
posted or transmitted in any form or by any means, including,
but not limited to, electronic, mechanical, photocopying,
recording, or otherwise, without the prior written permission of
SQLWorkshops.com or the copyright owner. Permission is granted
to download the webcast on this Site for personal,
non-commercial use only, provided you do not modify the webcast
and that you retain all copyright and other proprietary notices
contained in the webcast. You also may not, without
SQLWorkshops.com permission, "mirror" any webcast contained on
this Site on any other server. This permission terminates
automatically if you breach any of these terms or conditions.
Upon termination, you must immediately destroy any downloaded
webcast. Any unauthorized use of any webcast contained on this
Site may violate copyright laws, trademark laws, the laws of
privacy and publicity, and communications regulations and
statutes.
Microsoft is a registered trademark of Microsoft Corporation in
the United States and/or other countries. SQL Server is a
trademark of Microsoft Corporation in the United States and/or
other countries. This webcast may refer to organizations and
products that may be the trademarks or registered trademarks of
their various owners.
You may freely use the ideas and concepts discussed in this
webcast, but you may not claim any of it as your own work. You
may quote freely, with acknowledgements to SQLWorkshops.com.
This article is for informational purposes only; you use any of
the suggestions given here entirely at your own risk.
|