|
Parallel Query Memory More
CPU cores may not always lead to better performance – MAXDOP and
query memory distribution in spotlight:
More hardware normally
delivers better performance, but there are exceptions where it
can hinder performance.
Understanding these exceptions and working around it is a major
part of SQL Server performance tuning.
When a memory allocating
query executes in parallel, SQL Server distributes memory to
each task that is executing part of the query in parallel. In
our example the sort operator that executes in parallel divides
the memory across all tasks assuming even distribution of rows.
Common memory allocating queries are that perform Sort and do
Hash Match operations like Hash Join or Hash Aggregation or Hash
Union.
This and similar topics are covered in my
workshops. If you are interested in learning practical
techniques to improve your SQL server performance, register for the upcoming 3 Day Level 400 Microsoft SQL Server 2012 and 2008 Performance Monitoring & Tuning Hands-on Workshop in Helsinki, Finland
during
February 27-March 01, 2013, click
here to register. These
are hands-on workshops with a maximum of 12 participants and not
lectures. For consulting engagements click
here.
In reality, how often
are column values evenly distributed, think about an example;
are employees working for your company distributed evenly across
all the Zip codes or mainly concentrated in the headquarters?
What happens when you sort result set based on Zip codes? Do all
products in the catalog sell equally or are few products hot
selling items?
One of my customers tested the below
example on a 24 core server with various MAXDOP settings and
here are the results: MAXDOP 1: CPU time = 1185 ms, elapsed
time = 1188 ms MAXDOP 4: CPU time = 1981 ms, elapsed time =
1568 ms MAXDOP 8: CPU time = 1918 ms, elapsed time = 1619 ms
MAXDOP 12: CPU time = 2367 ms, elapsed time = 2258 ms MAXDOP
16: CPU time = 2540 ms, elapsed time = 2579 ms MAXDOP 20: CPU
time = 2470 ms, elapsed time = 2534 ms MAXDOP 0: CPU time =
2809 ms, elapsed time = 2721 ms - all 24 cores. In the above
test, when the data was evenly distributed, the elapsed time of
parallel query was always lower than serial query.
Why does the query get
slower and slower with more CPU cores / higher MAXDOP? Maybe you
can answer this question after reading the article; let me know:
rmeyyappan@sqlworkshops.com.
To read additional articles I wrote
click
here.
Well
you get the point, let’s see an example.
The best way to learn is
to practice. To create the below tables and reproduce the
behavior, join the mailing list by using this link:
www.sqlworkshops.com/ml and
I will send you the table creation script.
Let’s update the Employees table with 49
out of 50 employees located in Zip code 2001.
update
Employees set Zip
= EmployeeID
/ 400 +
1 where EmployeeID
% 50 =
1
update
Employees set Zip
= 2001
where EmployeeID % 50
!= 1
go
update
statistics Employees
with
fullscan
go
Let’s create the temporary table #FireDrill
with all possible Zip codes.
drop
table #FireDrill
go
create
table #FireDrill
(Zip
int
primary key)
insert
into #FireDrill
select
distinct Zip from
Employees
update
statistics #FireDrill
with
fullscan
go
Let’s execute the query serially with
MAXDOP 1.
--Example provided by www.sqlworkshops.com
--Execute query with uneven Zip code distribution
--First
serially with MAXDOP 1
set
statistics
time on
go
declare
@EmployeeID int,
@EmployeeName varchar(48),@zip
int
select
@EmployeeName = e.EmployeeName,
@zip = e.Zip
from Employees e
inner
join #FireDrill fd
on (e.Zip
= fd.Zip)
order
by e.Zip
option
(maxdop
1)
go
The query took 1011 ms to complete.

The execution plan shows the 77816 KB of
memory was granted while the estimated rows were 799624.

No Sort Warnings in SQL Server Profiler.

Now let’s execute the query in parallel
with MAXDOP 0.
--Example provided by www.sqlworkshops.com
--Execute query with uneven Zip code distribution
--In
parallel with MAXDOP 0
set
statistics
time on
go
declare
@EmployeeID int,
@EmployeeName varchar(48),@zip
int
select
@EmployeeName = e.EmployeeName,
@zip = e.Zip
from Employees e
inner
join #FireDrill fd
on (e.Zip
= fd.Zip)
order
by e.Zip
option
(maxdop
0)
go
The query took 1912 ms to complete.

The execution plan shows the 79360 KB of
memory was granted while the estimated rows were 799624.

The estimated number of rows between serial
and parallel plan are the same. The parallel plan has slightly
more memory granted due to additional overhead.
Sort properties shows the rows are unevenly
distributed over the 4 threads.

Sort Warnings in SQL Server Profiler.

Intermediate Summary: The reason for the
higher duration with parallel plan was sort spill. This is due
to uneven distribution of employees over Zip codes, especially
concentration of 49 out of 50 employees in Zip code 2001.
Now let’s update the Employees table and
distribute employees evenly across all Zip codes.
update
Employees set Zip
= EmployeeID
/ 400 +
1
go
update
statistics Employees
with
fullscan
go
Let’s execute the query serially with
MAXDOP 1.
--Example provided by www.sqlworkshops.com
--Execute query with uneven Zip code distribution
--Serially with MAXDOP 1
set
statistics
time on
go
declare
@EmployeeID int,
@EmployeeName varchar(48),@zip
int
select
@EmployeeName = e.EmployeeName,
@zip = e.Zip
from Employees e
inner
join #FireDrill fd
on (e.Zip
= fd.Zip)
order
by e.Zip
option
(maxdop
1)
go
The query took 751 ms to complete.

The execution plan shows the 77816 KB of
memory was granted while the estimated rows were 784707.

No Sort Warnings in SQL Server Profiler.

Now let’s execute the query in parallel
with MAXDOP 0.
--Example provided by www.sqlworkshops.com
--Execute query with uneven Zip code distribution
--In
parallel with MAXDOP 0
set
statistics
time on
go
declare
@EmployeeID int,
@EmployeeName varchar(48),@zip
int
select
@EmployeeName = e.EmployeeName,
@zip = e.Zip
from Employees e
inner
join #FireDrill fd
on (e.Zip
= fd.Zip)
order
by e.Zip
option
(maxdop
0)
go
The query took 661 ms to complete.

The execution plan shows the 79360 KB of
memory was granted while the estimated rows were 784707.

Sort properties shows the rows are evenly
distributed over the 4 threads.

No Sort Warnings in SQL Server Profiler.

Intermediate Summary:
When employees were distributed unevenly,
concentrated on 1 Zip code, parallel sort spilled while serial
sort performed well without spilling to tempdb. When the
employees were distributed evenly across all Zip codes, parallel
sort and serial sort did not spill to tempdb. This shows uneven
data distribution may affect the performance of some parallel
queries negatively. For detailed discussion of memory
allocation, refer to webcasts available at
www.sqlworkshops.com/webcasts.
Some of you might conclude from the above
execution times that parallel query is not faster even when
there is no spill. Below you can see when we are joining limited
amount of Zip codes, parallel query will be fasted since it can
use Bitmap Filtering.
Let’s update the Employees table with 49
out of 50 employees located in Zip code 2001.
update
Employees set Zip
= EmployeeID
/ 400 +
1 where EmployeeID
% 50 =
1
update
Employees set Zip
= 2001
where EmployeeID % 50
!= 1
go
update
statistics Employees
with
fullscan
go
Let’s create the temporary table #FireDrill
with limited Zip codes.
drop
table #FireDrill
go
create
table #FireDrill
(Zip
int
primary key)
insert
into #FireDrill
select
distinct Zip
from
Employees where Zip
between 1800
and 2001
update
statistics #FireDrill
with
fullscan
go
Let’s execute the query serially with
MAXDOP 1.
--Example provided by www.sqlworkshops.com
--Execute query with uneven Zip code distribution
--Serially with MAXDOP 1
set
statistics
time on
go
declare
@EmployeeID int,
@EmployeeName varchar(48),@zip
int
select
@EmployeeName = e.EmployeeName,
@zip = e.Zip
from Employees e
inner
join #FireDrill fd
on (e.Zip
= fd.Zip)
order
by e.Zip
option
(maxdop
1)
go
The query took 989 ms to complete.

The execution plan shows the 77816 KB of
memory was granted while the estimated rows were 785594.

No Sort Warnings in SQL Server Profiler.

Now let’s execute the query in parallel
with MAXDOP 0.
--Example provided by www.sqlworkshops.com
--Execute query with uneven Zip code distribution
--In
parallel with MAXDOP 0
set
statistics
time on
go
declare
@EmployeeID int,
@EmployeeName varchar(48),@zip
int
select
@EmployeeName = e.EmployeeName,
@zip = e.Zip
from Employees e
inner
join #FireDrill fd
on (e.Zip
= fd.Zip)
order
by e.Zip
option
(maxdop
0)
go
The query took 1799 ms to complete.

The execution plan shows the 79360 KB of
memory was granted while the estimated rows were 785594.

Sort Warnings in SQL Server Profiler.

The estimated number of rows between serial
and parallel plan are the same. The parallel plan has slightly
more memory granted due to additional overhead.
Intermediate Summary: The reason for the
higher duration with parallel plan even with limited amount of
Zip codes was sort spill. This is due to uneven distribution of
employees over Zip codes, especially concentration of 49 out of
50 employees in Zip code 2001.
Now let’s update the Employees table and
distribute employees evenly across all Zip codes.
update
Employees set Zip
= EmployeeID
/ 400 +
1
go
update
statistics Employees
with
fullscan
go
Let’s execute the query serially with
MAXDOP 1.
--Example provided by www.sqlworkshops.com
--Execute query with uneven Zip code distribution
--Serially with MAXDOP 1
set
statistics
time on
go
declare
@EmployeeID int,
@EmployeeName varchar(48),@zip
int
select
@EmployeeName = e.EmployeeName,
@zip = e.Zip
from Employees e
inner
join #FireDrill fd
on (e.Zip
= fd.Zip)
order
by e.Zip
option
(maxdop
1)
go
The query took 250
ms to complete.

The execution plan shows the 9016 KB of
memory was granted while the estimated rows were 79973.8.

No Sort Warnings in SQL Server Profiler.

Now let’s execute the query in parallel
with MAXDOP 0.
--Example provided by www.sqlworkshops.com
--Execute query with uneven Zip code distribution
--In
parallel with MAXDOP 0
set
statistics
time on
go
declare
@EmployeeID int,
@EmployeeName varchar(48),@zip
int
select
@EmployeeName = e.EmployeeName,
@zip = e.Zip
from Employees e
inner
join #FireDrill fd
on (e.Zip
= fd.Zip)
order
by e.Zip
option
(maxdop
0)
go
The query took 85 ms to complete.

The execution plan shows the 13152 KB of
memory was granted while the estimated rows were 784707.

No Sort Warnings in SQL Server Profiler.

Here you see, parallel query is much faster
than serial query since SQL Server is using Bitmap Filtering to
eliminate rows before the hash join.
Parallel queries are very good for performance, but in some
cases it can hinder performance. If one identifies the reason
for these hindrances, then it is possible to get the best out of
parallelism. I covered many aspects of monitoring and tuning
parallel queries in webcasts (www.sqlworkshops.com/webcasts)
and articles (www.sqlworkshops.com/articles).
I suggest you to watch the webcasts and read the articles to
better understand how to identify and tune parallel query
performance issues.
Summary:
One has to avoid sort spill over tempdb and the chances
of spills are higher when a query executes in parallel with
uneven data distribution. Parallel query brings its own
advantage, reduced elapsed time and reduced work with Bitmap
Filtering. So it is important to understand how to avoid spills
over tempdb and when to execute a query in parallel.
I explain these concepts with detailed examples in
my webcasts (www.sqlworkshops.com/webcasts), I recommend
you to watch them. The best way to learn is to practice. To
create the above tables and reproduce the behavior, join the
mailing list at
www.sqlworkshops.com/ml and I will send you the
relevant SQL Scripts.
Register for the upcoming 3 Day Level 400 Microsoft SQL Server 2012 and 2008 Performance Monitoring & Tuning Hands-on Workshop in Helsinki, Finland
during
February 27-March 01, 2013, click
here to register. These
are hands-on workshops with a maximum of 12 participants and not
lectures. For consulting engagements click
here.
Disclaimer and copyright information:
This article refers to organizations and products that may be
the trademarks or registered trademarks of their various owners.
Copyright of this article belongs to
R Meyyappan
/ www.sqlworkshops.com. You may freely use the ideas and
concepts discussed in this article with acknowledgement
(www.sqlworkshops.com), but you may not claim any of it as your
own work.
This article is for informational purposes
only; you use any of the suggestions given here entirely at your
own risk.

Ramesh's consulting customers include Austria:
Allgemeines Krankenhaus Linz, BMD Systemhaus,
Bundesrechenzentrum, bwin, C07 IT Operations, CMC Markets,
derStandard, ERGO Insurance Service, Fabasoft Austria,
Greentube, IAEA (UN), Interwetten, INTERSPORT, Kapsch, Linz
AG, Mayr-Melnhof Karton, NÖM, ÖBB, OMV, Österreichische
Post, RACON West Software, Raiffeisen Bankengruppe,
Raiffeisen Informatik, S.Spitz, Salzburg AG, Siemens, TIWAG,
UC4, Vinzenz Gruppe, Wiener Zeitung, Wirtschaftskammer
Österreich; Bulgaria: MobilTel;
Czech Republic: Česká pojišťovna, ČSOB Bank, GTS
Novera, Telefónica O2, Vodafone; Denmark:
Danfoss, Egmont, NNIT; Finland: Cap Gemini,
Kemira, Pohjola; Germany: JAM Software, SAP
AG; Netherlands: De Nederlandsche Bank;
Norway: Acta, Aker Solutions, Det Norske
Veritas(DNV), Komplett; Russia: Mediatel;
Serbia: Tarkett; Slovakia:
Adastra Corporation, VÚB Banka; Slovenia:
Bankart, Iskratel, KDD, Krka, Mobitel, Pošta Slovenije,
Sava, SRC.SI; South Africa: De Beers,
Kumba, Sasol; Spain: CIE Automotive, Gas
Natural, ITP; Sweden: Sony Ericsson;
Switzerland: F. Hoffmann-La Roche, Philip
Morris International; UAE: Dubai Customs.
For past consulting engagement feedbacks click
here.
COPYRIGHT AND TRADEMARK NOTICES:
The copyright in all material provided on this site is held
by SQLWorkshops.com or by the original creator of the material.
Except as stated herein, none of the material may be copied,
reproduced, distributed, republished, downloaded, displayed,
posted or transmitted in any form or by any means, including,
but not limited to, electronic, mechanical, photocopying,
recording, or otherwise, without the prior written permission of
SQLWorkshops.com or the copyright owner. Permission is granted
to download the webcast on this Site for personal,
non-commercial use only, provided you do not modify the webcast
and that you retain all copyright and other proprietary notices
contained in the webcast. You also may not, without
SQLWorkshops.com permission, "mirror" any webcast contained on
this Site on any other server. This permission terminates
automatically if you breach any of these terms or conditions.
Upon termination, you must immediately destroy any downloaded
webcast. Any unauthorized use of any webcast contained on this
Site may violate copyright laws, trademark laws, the laws of
privacy and publicity, and communications regulations and
statutes.
Microsoft is a registered trademark of Microsoft Corporation in
the United States and/or other countries. SQL Server is a
trademark of Microsoft Corporation in the United States and/or
other countries. This webcast may refer to organizations and
products that may be the trademarks or registered trademarks of
their various owners.
You may freely use the ideas and concepts discussed in this
webcast, but you may not claim any of it as your own work. You
may quote freely, with acknowledgements to SQLWorkshops.com.
This article is for informational purposes only; you use any of
the suggestions given here entirely at your own risk.
|