As we all uses XML Query instead of row level query to insert/update bulk data into a database in order to achieve better performance. But when there is a large dataset having multilevel xml nodes, the performance degrade significantly.
While doing performance tuning in “AcceleRATE” I came across with “XML indexes” on an xml column.
Below is the step by step process to achieve this.
1. Create a temporary table with on identity column with primary key and one other column with xml datatype.
CREATE TABLE #OptimizedXML(id INT IDENTITY PRIMARY KEY, SurveyXML xml NOT NULL)
2. Create a primary XML index
CREATE PRIMARY XML INDEX PrimaryXMLIndex ON #OptimizedXML(SurveyXML)3. Create a secondary XML index
CREATE XML INDEX XMLDataStore_XmlCol_PATH ON #OptimizedXML(SurveyXML) USING XML INDEX PrimaryXMLIndex FOR PATH4. Now insert the values into a table
INSERT INTO #OptimizedXML VALUES(@SurveyXML)5. Here we are ready to perform the same XML query with some tweak to achieve performance.
SELECT F.Manager.value('@fname','varchar(255)') as fname, F.Manager.value('@lname', 'varchar(255)') as lname, F.Manager.value('@title', 'varchar(255)') as title, F.Manager.value('@email', 'varchar(255)') as email, F.Manager.value('@domainid', 'int') as domainid, F.Manager.value('@ID', 'int') as userid, F.Manager.value('@encryptedmail', 'varchar(500)') as encryptedemail FROM #OPtimizedXML X cross apply X.SurveyXML.nodes('Surveys[1]/Survey/Manager') AS F(Manager)
By doing the above tweak, the query which was earlier taking more than 2 minutes to execute, now takes just 10-15 seconds.
Additionally, there are few points, which we need to keep in mind while using this approach.
We must have one primary key (clustered index) on a table; it is required to create a XML index.
There must be one primary xml index followed by one or more secondary xml indexes on xml column(s).
While querying the table, we have to use a cross apply on the node where in we want to fetch data.