Plan Installation

Microsoft Skill Measured: This objective may include but is not limited to: evaluate installation requirements; design the installation of SQL Server and its components (including drives and service accounts); plan scale up vs. scale out basics; plan for capacity, including if/when to shrink, grow, autogrow, and monitor growth; manage the technologies that influence SQL architecture (including service broker, full text, and scale out); design the storage for new databases (drives, filegroups, partitioning); design database infrastructure; configure a SQL Server standby database for reporting purposes; Windows-level security and service level security; Core mode installation; benchmark a server before using it in a production environment (SQLIO, Tests on SQL Instance); choose the right hardware

What I see:

– Installation requirements
– Service accounts
– Scale up vs. scale out
– Database shrinking, growing, autogrowth, growth monitoring
– RAID

Installation Requirements

                For a complete listing of installation requirements, please see MSDN’s Hardware and Software Requirements for Installing SQL Server 2012.

Service Accounts

                For a complete reference on all service account considerations regarding virtual accounts, domain accounts, permissions granted during setup, etc., please see the BOL documentation on Configure Windows Service Accounts and Permissions.

Scale Up vs. Scale Out

                To learn the differences between the two scalability models, see this Wikipedia reference.

Database Shrinking, Growing, Autogrowth, and Growth Monitoring

                As a general rule of thumb, routine database shrinking is not recommended.  Shrinking the log is sometimes necessary if, for instance, the database is in the full recovery model, transaction log backups haven't been taken frequently enough, and the log has grown out of control.  Data file shrinking, on the other hand, causes index fragmentation, as Paul Randal's blog article on the topic demonstrates; please reference Paul's article for the reasoning and proof against it.
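If a one-off log shrink truly is necessary, a minimal sketch looks like the following (DBCC SHRINKFILE takes a target size in MB; the logical name 'AdventureWorks2012_Log' is this sample database's default log file name, so verify yours against sys.database_files first):

use AdventureWorks2012;
go
-- confirm the logical file names and current sizes (size is in 8 KB pages)
select name, type_desc, size
from sys.database_files;
go
-- one-off shrink of the log file down to a 100 MB target
dbcc shrinkfile (N'AdventureWorks2012_Log', 100);
go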

                Conversely, database file growth is a necessary action during the lifetime of a database.  There are two ways to control it:  manually (with ALTER DATABASE… MODIFY FILE…), and automatically (by setting autogrowth).  The manual process is simply specifying a new (larger) file size:

use master;
go
alter database AdventureWorks2012
modify file
(
       name = 'AdventureWorks2012_Data',
       size = 200MB
);
go

The automatic way is to set autogrowth with the FILEGROWTH file specification.  When this is set, the respective file grows by a fixed size increment or by a percentage increment:

alter database AdventureWorks2012
modify file
(
       name = 'AdventureWorks2012_Data',
       filegrowth = 20MB
);
go

With the above setting, when the database file with the logical name 'AdventureWorks2012_Data' becomes full, SQL Server will automatically grow it by 20 MB (provided it hasn't reached MAXSIZE).  The increment can be specified in KB, MB, GB, TB, or as a percentage (%).  To monitor file growth, there are a couple of avenues: you can use a SQL Server Agent alert to be notified when a log file has grown, you can utilize Extended Events to capture file growth events, and perfmon exposes a counter for this.  As for data file growth, you can monitor data file size and used space to get a view of growth and, if growth is managed manually, when it needs to happen.
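As a hedged sketch of that last point, the following query reports each file's allocated size versus used space for the current database (FILEPROPERTY's 'SpaceUsed' property returns pages, so both figures are converted to MB):

use AdventureWorks2012;
go
-- allocated size vs. used space, in MB, for each file in the current database
select
       name,
       type_desc,
       size * 8 / 1024 as size_mb,
       fileproperty(name, 'SpaceUsed') * 8 / 1024 as used_mb
from sys.database_files;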

RAID

                See this BOL reference on different RAID configurations.

References/Documentation

– MSDN reference on Hardware and Software Requirements for Installing SQL Server 2012
– BOL reference on Configure Windows Service Accounts and Permissions
– Wikipedia reference on Scalability
– BOL reference on RAID Levels


Optimize Queries

(This is part of the Study Guide series, 70-457)
Microsoft’s Measured Skill description: This objective may include but is not limited to: understand statistics; read query plans; plan guides; DMVs; hints; statistics IO; dynamic vs. parameterized queries; describe the different join types (HASH, MERGE, LOOP) and describe the scenarios in which they would be used
What I see:
· understand statistics
· query hints
· statistics IO
· join types (HASH, MERGE, LOOP)
Understand Statistics
                Statistics are the way that SQL Server records and uses the data distribution of tables and indexes.  They allow the query optimizer to choose an appropriate plan based on row counts, histograms, and density information.  Fresh statistics are necessary for the optimizer to make the best possible decision, but stale statistics can fool SQL Server into thinking it has found the best plan when in fact it is a sub-optimal one.  For a great read and more information on statistics, see this Idera post by Donabel Santos on Understanding SQL Server Statistics.
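As a quick, hedged illustration of working with statistics (the commands are standard; the index name below is simply one that ships with AdventureWorks2012):

use AdventureWorks2012;
go
-- view the header, density vector, and histogram for an index's statistics
dbcc show_statistics ('Person.Person', 'PK_Person_BusinessEntityID');
go
-- manually refresh the table's statistics with a full scan
update statistics Person.Person with fullscan;
go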
Query Hints
                Query hints are a way to tell the optimizer what to do, regardless of what it might have chosen on its own.  A few popular query hints are KEEP PLAN, MAXDOP, OPTIMIZE FOR, and RECOMPILE.  For instance, MAXDOP will override the instance-level max degree of parallelism configuration for that query.  RECOMPILE will cause SQL Server to discard the query execution plan after the query has completed, as opposed to persistently storing it for later use.  Please see BOL for a full list of query hints and corresponding explanations.  All of these query hints are probably fair game on the exam, so a cursory knowledge of what they do will benefit you.
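A minimal sketch of how query hints are attached through the OPTION clause (the hint choices here are arbitrary, purely for illustration):

use AdventureWorks2012;
go
-- force a serial plan and discard the plan once the query completes
select p.LastName, p.FirstName
from Person.Person p
option (maxdop 1, recompile);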
STATISTICS IO
                The SET statement SET STATISTICS IO is used to output statistics regarding disk activity for the executed T-SQL queries.  To see a working example of this, execute the below T-SQL and view the Messages window to see the disk/cache activity:
use AdventureWorks2012;
go
set statistics io on;
select *
from HumanResources.Department;
set statistics io off;
This gives us information such as scan count, logical reads (from the data cache/memory), physical reads (from disk), read-ahead reads (pages read from disk into cache for anticipated future reads), and the LOB equivalents of the aforementioned statistics.  This is a great way to see if a query or a subset of queries is hitting the disk too often.
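For reference, the Messages output takes roughly the following shape (the counts here are illustrative only; yours will vary with data size and cache state):

Table 'Department'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.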
Join Types
                There are three particular join types the optimizer can choose to utilize:
Hash Join – this join takes the smaller of the two sets, builds an in-memory hash table from it (sized by the query's memory grant), and then probes with the other set by computing a hash value for each row and comparing it against the hash table.  To see this join in action, utilize the following query (notice a relatively small table that can easily fit into memory as the hashed input):
use AdventureWorks2012;
go
select *
from HumanResources.Department d
inner join HumanResources.EmployeeDepartmentHistory edh
on d.DepartmentID = edh.DepartmentID;
The execution plan should show a Hash Match join operator:
[execution plan screenshot]
Merge Join – this join reads each of its two inputs only once, and it can show performance gains when both inputs are already sorted on the join keys:
use AdventureWorks2012;
go
select *
from Person.Person p
inner join Person.BusinessEntity be
on p.BusinessEntityID = be.BusinessEntityID;
The execution plan will show a Merge Join operator:
[execution plan screenshot]
Loop Join – this join (nested loops) does just as its name states: for every row of one data set (the outer input), the other data set (the inner input) is searched for matching rows:
use AdventureWorks2012;
go
select
       p.LastName,
       bea.AddressTypeID
from Person.Person p
inner join Person.BusinessEntityAddress bea
on p.BusinessEntityID = bea.BusinessEntityID
where bea.AddressTypeID = 5;
The execution plan for the above query will show a Nested Loops join operator:
[execution plan screenshot]

Modify Data by Using INSERT, UPDATE, and DELETE Statements

(This is part of the Study Guide series, 70-457)
Microsoft’s Measured Skill description: This objective may include but is not limited to: given a set of code with defaults, constraints, and triggers, determine the output of a set of DDL; know which SQL statements are best to solve common requirements; use output statement
What I see:
· OUTPUT statement
OUTPUT Statement
                The OUTPUT clause can be used to pipe affected data from a corresponding DML statement (INSERT, UPDATE, DELETE) or MERGE.  Through the use of this clause, the data can be saved to a table (user table, table variable, temp table, etc.) or returned to the client.  The below example shows two of the many ways you can utilize the OUTPUT clause:
use MeasureMySkills;
go
if exists
(
       select *
       from sys.tables
       where name = 'TestDataTable'
)
       drop table TestDataTable;
go
create table TestDataTable
(
       id int identity(1, 1) not null,
       SomeString nvarchar(128) not null,
       AnotherInt int not null
);
go
-- show the output of the inserted data to the client
insert into TestDataTable(SomeString, AnotherInt)
output inserted.id, inserted.SomeString, inserted.AnotherInt
values
       ('hello', 34),
       ('goodbye', 49),
       ('hola', 60),
       ('adios', 78);
-- create a temp table to hold the deleted data
create table #DeletedData
(
       id int not null,
       SomeString nvarchar(128) not null,
       AnotherInt int not null
);
go
-- output the deleted data to the temp table
delete from TestDataTable
output deleted.*
into #DeletedData
where AnotherInt in (34, 49);
select *
from #DeletedData;
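The OUTPUT clause also works with UPDATE, where both the deleted (before) and inserted (after) virtual tables are available.  A hedged sketch, continuing with the rows still left in the table above:

-- show before-and-after values for the updated rows
update TestDataTable
set AnotherInt = AnotherInt + 1
output
       deleted.id,
       deleted.AnotherInt as OldInt,
       inserted.AnotherInt as NewInt
where SomeString = 'hola';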

Create and Alter Stored Procedures

(This is part of the Study Guide series, 70-457)
Microsoft’s Measured Skill description: This objective may include but is not limited to: write a stored procedure to meet a given set of requirements; branching logic; create stored procedures and other programmatic objects; techniques for developing stored procedures; different types of stored procedure results; create a stored procedure for data access layer; program stored procedures, triggers, and functions with T-SQL
What I see:
· stored procedure results
· create stored procedures
Stored Procedure Results
                I go into quite a bit of detail about the different types of stored procedure results, and returning data to the calling code, in this SQL Salt blog post.  In summary, stored procedures have a few ways of returning data, but oftentimes there are different requirements for each method, such as returning a result set or returning an execution status (success or failure).  To see examples and a fuller explanation, read the linked blog post.
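As a minimal, hedged sketch of the common mechanisms (a result set, an OUTPUT parameter, and a RETURN status), the procedure and parameter names below are made up purely for illustration:

use MeasureMySkills;
go
create procedure GetDepartmentInfo
       @dept_count int output
as
begin
       -- 1) result set returned to the caller
       select 1 as DepartmentID, N'Engineering' as Name;
       -- 2) output parameter
       set @dept_count = 1;
       -- 3) return status (0 conventionally means success)
       return 0;
end
go
-- call it, capturing both the return status and the output parameter
declare @rc int, @cnt int;
exec @rc = GetDepartmentInfo @dept_count = @cnt output;
select @rc as ReturnStatus, @cnt as DeptCount;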
Create Stored Procedures
                Again, this was covered in the same SQL Salt blog post as above.  Not much to it, but there are a lot of “gotchas” that could arise.  Read the blog post, but also see BOL for corner-cases that aren’t always widely known.

Implement Data Types

(This is part of the Study Guide series, 70-457)
Microsoft’s Measured Skill description: This objective may include but is not limited to: use appropriate data; understand the uses and limitations of each data type; impact of GUID (newid, newsequentialid) on database performance, when to use which data type for columns
What I see:
· appropriate data types
· NEWID() vs. NEWSEQUENTIALID()
Appropriate Data Types
                This is one of those can't-really-explain-it-in-a-blog-post topics, but I think what Microsoft is getting at here is that you should know the limitations (for the ranges of data stored as well as storage consumption) of the many data types that SQL Server uses.  For instance, know the differences (precision, storage) between datetime and datetime2, and when to use one or the other.  See BOL for a refresher on the data types.
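A quick, hedged illustration of that datetime vs. datetime2 point: datetime is fixed at 8 bytes with roughly 3.33 ms precision, while datetime2 stores 6 to 8 bytes depending on its fractional-seconds precision.

declare @dt  datetime     = getdate();
declare @dt2 datetime2(2) = sysdatetime();
-- expect 8 bytes for datetime and 6 bytes for datetime2(2)
select datalength(@dt) as DatetimeBytes, datalength(@dt2) as Datetime2Bytes;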
NEWID() vs. NEWSEQUENTIALID()
                I wrote a blog post on this exact topic; for extensive information, review that.  But in summary, NEWID() does not generate sequential GUIDs, making it a bad candidate for a clustered index key: it can lead to high amounts of page splits and index fragmentation.  NEWSEQUENTIALID() solves that problem by generating sequential identifiers with each call.
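Note that NEWSEQUENTIALID() can only be called inside a column DEFAULT constraint.  A minimal sketch (the table and constraint names are made up for illustration):

use MeasureMySkills;
go
create table GuidKeyed
(
       id uniqueidentifier not null
              constraint DF_GuidKeyed_id default newsequentialid()
              primary key clustered,
       payload nvarchar(64) not null
);
go
-- each insert receives a GUID greater than the previous one
-- (the sequence can restart after a server reboot)
insert into GuidKeyed(payload)
values (N'first'), (N'second');
select id, payload
from GuidKeyed;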
References
· BOL reference on Data Types

Implement Sub-Queries

(This is part of the Study Guide series, 70-457)
Microsoft’s Measured Skill description: This objective may include but is not limited to: identify problematic elements in query plans; pivot and unpivot; apply operator; cte statement; with statement
What I see:
· pivot and unpivot
· APPLY operator
· CTEs
PIVOT and UNPIVOT
                In my opinion, the best resource for learning and reviewing PIVOT and UNPIVOT is BOL, which offers a great explanation with very helpful examples.  Please see this reference for further reading.
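For a flavor of the syntax, here is a hedged sketch using a made-up QuarterlySales table; the two required pieces of a PIVOT are the aggregate and the IN list of values to spread into columns:

use MeasureMySkills;
go
-- hypothetical table, purely for illustration
create table QuarterlySales
(
       SalesYear int not null,
       SalesQuarter char(2) not null,
       Amount money not null
);
go
insert into QuarterlySales(SalesYear, SalesQuarter, Amount)
values
       (2011, 'Q1', 100),
       (2011, 'Q2', 150),
       (2012, 'Q1', 120),
       (2012, 'Q2', 180);
go
-- rotate the SalesQuarter values into columns, one row per year
select SalesYear, [Q1], [Q2]
from QuarterlySales
pivot
(
       sum(Amount)
       for SalesQuarter in ([Q1], [Q2])
) as pvt;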
APPLY Operator
                The APPLY operator is used to invoke a table-valued function (TVF) once for each row of the outer table, passing columns of that row as parameters; the function's results are then "applied" to the result set.  Below shows an example of this:
use MeasureMySkills;
go
create table NumbersTable
(
       id int identity(1, 1) not null,
       MyNumber int not null
);
go
insert into NumbersTable(MyNumber)
values
       (3),
       (7),
       (12),
       (5),
       (93),
       (132);
go
create function AddMore
(
       @original_num int,
       @add_by int
)
returns table
as
       return
       (
              select
                     @original_num + @add_by as NewNum
       )
go
select
       nt.id,
       nt.MyNumber,
       ao.NewNum
from NumbersTable nt
cross apply AddMore(nt.MyNumber, 27) ao;
As you can see in the above example, by using the CROSS APPLY clause with the AddMore() TVF, the result set now contains the additional column NewNum with the calculated values from the function.
CTEs
                A Common Table Expression (CTE) is a temporary, named result set that is scoped to, and can only be referenced by, the single statement that immediately follows it.  Its definition starts with the WITH keyword.  It is commonplace to put a semi-colon in front of the WITH, since the preceding statement must be terminated so SQL Server knows the WITH isn't part of it.
use AdventureWorks2012;
go
;with PersonCTE as
(
       select
              BusinessEntityID,
              FirstName,
              LastName,
              EmailPromotion
       from Person.Person
)
select
       EmailPromotion,
       count(*) as EmailPromoCount
from PersonCTE
group by EmailPromotion;
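A CTE can also reference itself.  As a minimal, hedged sketch, the following recursive CTE generates the numbers 1 through 10:

;with Numbers as
(
       -- anchor member
       select 1 as n
       union all
       -- recursive member, runs until the filter stops it
       select n + 1
       from Numbers
       where n < 10
)
select n
from Numbers;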
References
· BOL reference on Using APPLY
If there are any comments, questions, issues, or suggestions please feel free to leave a comment below or email me at sqlsalt@gmail.com.