Tag Archives: xml

How To Flatten Xml Data in SQL Server

In this article we will see how we can flatten a xml data and make it more readable. Let us understand the scenario first and then dive into the solution. To understand the scenario we will create a xml file. Load the data in the file into a table in SQL Server.

Step 1:  Create a xml file and store it in the C drive. Below is the content of the file

<?xml version="1.0" ?>
<Root>
<Person>
<Name>John</Name>
<Age>30</Age>
<Hobbies>
<Hobby>PhotoGraphy</Hobby>
<Hobby>Driving</Hobby>
</Hobbies>
</Person>
<Person>
<Name>Wayne</Name>
<Age>21</Age>
<Hobbies>
<Hobby>Fishing</Hobby>
<Hobby>Cooking</Hobby>
</Hobbies>
</Person>
</Root>

Step 2 : Save the file as XmlSource.xml on the C drive. You can choose a different location also.

Step 3: Run the following script and create a table which will store the xml data.

IF EXISTS (SELECT 1 FROM sys.objects WHERE name = 'XmlTable')
DROP TABLE XmlTable
go
CREATE TABLE XmlTable
(
XmLId INT IDENTITY(1, 1) NOT NULL PRIMARY KEY ,
XmlData XML NOT NULL
)

Step 4: Run the below script to insert the contents of the xml file into the table

INSERT INTO XmlTable(XmlData)
SELECT * FROM OPENROWSET(BULK 'C:\XmlSource.xml', SINGLE_BLOB)
AS ImportSource
GO

Step 5: Check if the data has been inserted correctly or not.

xml1

 

If we observe we will see that the contents of the xmlData row are in xml format. So how do we make it something like below
xml2

Here you go. The below query will make it happen

SELECT pref.value('(Name/text())[1]', 'varchar(50)') AS Name ,
pref.value('(Age/text())[1]', 'varchar(50)') AS Age,
pref.value('(Hobbies/Hobby/text())[1]', 'varchar(50)') AS Hobby1,
pref.value('(Hobbies/Hobby/text())[2]', 'varchar(50)') AS Hobby2 
FROM
XmlTable
CROSS APPLY
XmlData.nodes('//Person') AS People ( pref )

The quick way to extract the xml data is to use the CROSS APPLY operator which uses a function
against each row and then adds the data to the result set. We then combine this with a function
that can be executed on the XML data type called NODES

 

 


Advertisement

OPTIMIZE for UNKNOWN

In this article I will discuss about the ‘Optimize for Unknown’ query hint that was introduced in SQL Server 2008 and how we can use it.

So what is ‘OPTIMIZE for UNKNOWN’?
‘OPTIMIZE for UNKNOWN’ directs the query optimizer to use the standard rules it uses if no parameter values have been passed.The optimizer will use statistical data instead of the initial values for all variables when the query is compiled and optimized.

Now the big question is how does this help? Well ‘OPTIMIZE for UNKNOWN’ is one of the mechanisms which helps us to avoid parameter sniffing.

So what is parameter sniffing?
I will explain this with an example. I will use the Adventureworks2012 database for demonstration purpose on SQL Server 2012. However you can use any other version but not lower than SQL Server 2008.

Let us run the below query on the database and observe the output and execution plan

USE AdventureWorks2012
GO
SELECT * FROM sales.SalesOrderDetail
WHERE ProductID = 744

Row Count: 13
ofu1Now lets take a look at one of the aspects of the execution plan of the above query. Make a note that the actual number of rows is 13 and the estimated number of rows is 44.
ofu2Let us have a look at the plan xml for the run time values and compiled time values:
ofu3Now let us run another query with a different value.

USE AdventureWorks2012
GO
SELECT * FROM Sales.SalesOrderDetail
Where ProductID = 707

Let us observe the row count, execution plan and plan xml values as we did for the previous query
Row Count:
ofu4Execution plan: Make a note that the estimated number of rows is still 44.5 which was for the parameter used in the previous query i.e 744
ofu5Plan XML for Compiled and Runtime values: Make a note here that the compiled value used by the query optimizer here to create the plan is 744 which was the value that the previous query used.
ofu6This is called parameter sniffing where the optimizer sniffs the current parameter value during compilation.

Can this cause trouble?
Yes, it can. When a non-similar parameter is passed when a plan is compiled for the first time, the plan that the optimizer will find in the memory may not be the optimal one for that parameter passed. This will can result in a plan that is suboptimal and can cause a devastating effect on performance.

So how can we use ‘OPTIMIZE for UNKNOWN’ to avoid parameter sniffing. let us understand that with the below example.

We are going to run the above 2 queries here also but with the Optimize query hint in the second query:

USE AdventureWorks2012
GO
SELECT * FROM sales.SalesOrderDetail
WHERE ProductID = 744
go
SELECT * FROM sales.SalesOrderDetail
WHERE ProductID = 707
OPTION (OPTIMIZE FOR UNKNOWN)--Query Hint

Observations post execution of the two queries:
1)  It creates 2 different plans:
ofu72) Estimated and Actual row calculation for the two queries respectively:
Query 1:
ofu8Query 2:
ofu93) Compiled value and RunTime value from Plan xml:
Query 1: With parameter value 744
ofu10  Query 2: With parameter value 707 and query hint (Optimize for Unknown)
ofu11 So with the use of the query hint the optimizer does not use the value that it got from the cache but generates a new plan based on the available stats in the database.

What are the kind of stored procedures that can be a victim of Parameter Sniffing?
A)  SP’s which has optional parameters
B)  SP’s which has parameters in range operators.

%d bloggers like this: