Thursday, February 16, 2012

Cross Joining two XML columns, or two (or more) XML variables

Hello,
I am trying to filter an XML variable @.RoleList with the approved roles in a
similiarly structured XML variable called @.ApprovedRoles .
What i mean by "filter" is that i would like to retrieve into an XML
variable only those
"roles" in my @.RoleList that are in the @.ApprovedRoles XML variable.
Any suggestions for this?
Of course I could go the "nvarchar" way, and parse both lists and then after
a comparison ouput only return those roles that match the roles in the
@.ApprovedRoles XML variable.
But how about an SQL Server XQUery way of doing it?
A way to cross or inner *joins* two (or more) XML variables, or cross or
inner *join* two (or more) XML columns?
For examples i tried using the below example:
declare @.RoleList xml, @.ApprovedRoles xml
Set @.RoleList = '<roles><role>N12</role><role>N12</role><role>N13</role>
<role>S39</role><role>S14</role></roles>'
Set @.ApprovedRoles = '<roles><role>N12</role><role>S39</role>
</roles>'
In filterning @.RoleList with @.ApprovedRoles, I would expect the output of:
'<role>N12</role><role>N12</role><role>S39</role>'
Do joint to XML columns i tried it with the table below and then inserted
then inserted @.RoleList & @.ApprovedRoles to the table.
Create Table #XmlTable
(
aRoleList xml,
aAuthorisedList xml
)
Insert INTO @.XmlTable( aRoleList, aAuthorisedList)
VALUES (@.RoleList,@.ApprovedRoles)
The question is, how would I join the columns aRoleList aAuthorisedList, to
ouput all nodes() from aRoleList that are authorized?
I tried also using the two XML variables (@.RoleList,@.ApprovedRoles, not
needing to put them into a table) above, but can't find the syntax to make an
XQuery joining XMLs from to separate contexts.
Any suggestions to this too?
declare @.temp nvarchar(200)
set @.temp = cast(@.ApprovedRolesas nvarchar(200) )
Select x1.query('
for $Step1 in /roles,
$Step2 in sql:variable("@.temp")
return $Step2
') as RoleListsHere
from
@.aaRoleList.nodes('//roles') T(x1)
cross apply @.ApprovedRoles.nodes('//roles') TT(xx1)
where TT.xx1.exist('/roles/role') = 1
Regards,
Paul
How about:
SELECT rl.node.value('.','varchar(20)')
FROM @.RoleList.nodes('/roles/role') rl(node)
INNER JOIN @.ApprovedRoles.nodes('/roles/role') ar(node)
ON rl.node.value('.','varchar(20)') =
ar.node.value('.','varchar(20)')
Marc
|||or as xml (you might choose to add a DISTINCT too):
SELECT rl.node.value('.','varchar(20)') as 'role'
FROM @.RoleList.nodes('/roles/role') rl(node)
INNER JOIN @.ApprovedRoles.nodes('/roles/role') ar(node)
ON rl.node.value('.','varchar(20)') =
ar.node.value('.','varchar(20)')
FOR XML PATH(''), ROOT('roles')
Marc
|||Not bad.
I thought now instead of hard-coding the XPath expression in the Node()
function to instead put the 2 XPaths in variables.
So the (additional) question would be.
How would we make it dynamic? I.e., where @.RoleList and @.ApprovedRoles are
inputs, but also inputs wouold be the 2 XPaths that point just where the
comparison should start.
The problem I ran into is because I put the Select statement in a nvarchar
variable (see below), @.queryForRole, that I latter executed in SQL Server.
Well, this means that @.RoleList, @.ApprovedRoles - my XML variables need to
be also converted to nvarchar.
But something like '<roles><role>N12</role></roles>'.node(...) doesn't work.
Another idea is to try to make it dynamic by using table columns (taken from
my original example). I.e., tempdb.#XmlTable.aRoleList.nodes('/roles/role')
rl(node), but this returns the error (translated from German)
'tempdb.#XmlTable.aRoleList.nodes' is not recongnised as a Funktion name.'
So any ideas on making it dynamic?
By this i mean both XML variables (@.RoleList and @.ApprovedRoles) and their 2
companion XPaths, that point to just where in the XML, the comparison should
start.
For example the above could be 4 inputs to a user function where the user
function returns the xml of approved roles.
Here's the first variant i tried putting the Select in an nvarchar variable.
SET @.firstXpath = '/roles/role'
SET @.secondXpath = '/roles/role'
set @.queryForRole = 'SELECT rl.node.value(''.'',''varchar(20)'') as ''role''
FROM @.RoleList.nodes(''' + @.firstXpath + ''') rl(node)
INNER JOIN @.ApprovedRoles .nodes(''' + @.secondXpath + ''') ar(node)
ON rl.node.value(''.'',''varchar(20)'') =
ar.node.value(''.'',''varchar(20)'')
FOR XML PATH(''''), ROOT(''roles'')'
exec(@.queryForRole)
Regards,
Paul
"Marc Gravell" wrote:

> or as xml (you might choose to add a DISTINCT too):
> SELECT rl.node.value('.','varchar(20)') as 'role'
> FROM @.RoleList.nodes('/roles/role') rl(node)
> INNER JOIN @.ApprovedRoles.nodes('/roles/role') ar(node)
> ON rl.node.value('.','varchar(20)') =
> ar.node.value('.','varchar(20)')
> FOR XML PATH(''), ROOT('roles')
> Marc
>
>
|||The thing to remember is the the XPath cannot be *directly* dynamic,
so must be concatenated into the query (watch for injection). The
arguments, however, can be parameterised and invoked using
sp_ExecuteSQL. I've changed the names in the dynamic query just to
highlight (to you, not the database) that they aren't related to the
outer query.
You might also want to look at the old "dynamic xpath in XQUERY
exist() function" thread in this forum, where I posted something
similar, with more comments on injection.
Marc
DECLARE @.query nvarchar(max), @.xpath1 nvarchar(100), @.xpath2
nvarchar(100)
SELECT @.xpath1 = N'/roles/role', @.xpath2 = N'/roles/role'
SET @.query = N'
SELECT rl.node.value(''.'',''varchar(20)'') as ''role''
FROM @.Var1.nodes(''' + @.xpath1 + ''') rl(node)
INNER JOIN @.Var2.nodes(''' + @.xpath2 + ''') ar(node)
ON rl.node.value(''.'',''varchar(20)'') =
ar.node.value(''.'',''varchar(20)'')
FOR XML PATH(''''), ROOT(''roles'')
'
EXEC sp_ExecuteSQL @.query, N'@.Var1 xml, @.Var2 xml',
@.RoleList, @.ApprovedRoles
|||Not bad - again :-)
Yes, by chance i also participated (i was the questioner) in the "dynamic
xpath in XQUERY exist() function" thread, and I agree it is important to use
sql:variable(@.myVar) for user input. And this pertains to the below too.
I am trying to use now, what we discussed in this thread, to create a user
function that returns XML.
I could put it in a stored procedure (the procedure would then (so it seems)
return the result set of query executed by sp_ExecuteSQL.
Would you know the way to do it in a user function?
And if not (or also), your take on using the stored proc alternative.
Regards,
Paul
What i tried is given right here below:
Create FUNCTION [dbo].[CreateFunctionCrossJoinXML]
(
@.RoleList XML,
@.RoleListXPath nvarchar(50),
@.ApprovedRoles XML,
@.ApprovedRolesXPath nvarchar(50),
@.RootPathOfBothXMLs nvarchar(50)
)
RETURNS XML
AS
BEGIN
declare @.DynamicQuery nvarchar(max), @.TheReturnXML xml
declare @.xpath1 nvarchar(100), @.xpath2
nvarchar(100)
SELECT @.xpath1 = N'/roles/role', @.xpath2 = N'/roles/role'
SET @.DynamicQuery = N'
SELECT role.node.value(''.'',''varchar(20)'') as ''role''
FROM @.DynamicRoleList.nodes(''' + @.RoleListXPath + ''') role(node)
INNER JOIN
@.DynamicApprovedRoles.nodes(''' + @.ApprovedRoles + ''')
approved(node)
ON role.node.value(''.'',''varchar(20)'') =
approved.node.value(''.'',''varchar(20)'')
FOR XML PATH(''''), ROOT(''' + @.RootPathOfBothXMLs + ''')
'
EXEC sp_ExecuteSQL @.DynamicQuery, N'@.DynamicRoleList xml,
@.DynamicApprovedRoles xml , @.RootPathOfBothXMLs
nvarchar(50)',
@.RoleList, @.ApprovedRoles, @.RootPathOfBothXMLs
END
sp_ExecuteSQL
Regards,
Paul
"Marc Gravell" wrote:

> The thing to remember is the the XPath cannot be *directly* dynamic,
> so must be concatenated into the query (watch for injection). The
> arguments, however, can be parameterised and invoked using
> sp_ExecuteSQL. I've changed the names in the dynamic query just to
> highlight (to you, not the database) that they aren't related to the
> outer query.
> You might also want to look at the old "dynamic xpath in XQUERY
> exist() function" thread in this forum, where I posted something
> similar, with more comments on injection.
> Marc
> DECLARE @.query nvarchar(max), @.xpath1 nvarchar(100), @.xpath2
> nvarchar(100)
> SELECT @.xpath1 = N'/roles/role', @.xpath2 = N'/roles/role'
> SET @.query = N'
> SELECT rl.node.value(''.'',''varchar(20)'') as ''role''
> FROM @.Var1.nodes(''' + @.xpath1 + ''') rl(node)
> INNER JOIN @.Var2.nodes(''' + @.xpath2 + ''') ar(node)
> ON rl.node.value(''.'',''varchar(20)'') =
> ar.node.value(''.'',''varchar(20)'')
> FOR XML PATH(''''), ROOT(''roles'')
> '
> EXEC sp_ExecuteSQL @.query, N'@.Var1 xml, @.Var2 xml',
> @.RoleList, @.ApprovedRoles
>
>
>
|||> Yes, by chance i also participated (i was the questioner) in the "dynamic
> xpath in XQUERY exist() function" thread
Oops; I should have checked...

> Would you know the way to do it in a user function?
there isn't one...
Firstly:
Msg 557, Level 16, State 2, Line 2
Only functions and extended stored procedures can be executed from
within a function.
However, you also have trouble getting the returned xml value out of
your EXEC and into the RETURN statement/table.
As for an SP route; note that you can perhaps tie the result into an
OUT param if you are concerned about the unstructured nature of a
results-grid (and since you only have a single result). Shown below (I
haven't tidied it up much; 'tis quite rough...).
Finally - this data is very simple; xml could well be overkill. Just
thought I'd mention it...
Marc
DROP PROC XmlSP
GO
Create PROC XmlSP
(
@.RoleList XML,
@.RoleListXPath nvarchar(50),
@.ApprovedRoles XML,
@.ApprovedRolesXPath nvarchar(50),
@.RootPathOfBothXMLs nvarchar(50),
@.Result xml OUT
)
AS
BEGIN
DECLARE @.DynamicQuery nvarchar(max), @.TheReturnXML xml
DECLARE @.tmp TABLE (col xml)
SET @.DynamicQuery = N'
DECLARE @.Result xml
SELECT @.Result = (
SELECT role.node.value(''.'',''varchar(20)'') as ''role''
FROM @.DynamicRoleList.nodes(''' + @.RoleListXPath + ''')
role(node)
INNER JOIN
@.DynamicApprovedRoles.nodes(''' + @.ApprovedRolesXPath +
''')
approved(node)
ON role.node.value(''.'',''varchar(20)'') =
approved.node.value(''.'',''varchar(20)'')
FOR XML PATH(''''), ROOT(''' + @.RootPathOfBothXMLs + ''')
)
SELECT @.Result
'
INSERT @.tmp(col)
EXEC sp_ExecuteSQL @.DynamicQuery, N'@.DynamicRoleList xml,
@.DynamicApprovedRoles xml , @.RootPathOfBothXMLs
nvarchar(50)',
@.RoleList, @.ApprovedRoles, @.RootPathOfBothXMLs
SELECT TOP 1 @.Result = col FROM @.tmp
END
GO
DECLARE @.Result xml
EXEC XmlSP '<roles><role>N12</role><role>N12</role><role>N13</role>
<role>S39</role><role>S14</role></
roles>', 'roles/role',
'<roles><role>N12</role><role>S39</role></roles>' , 'roles/
role','roles',
@.Result OUT
SELECT @.Result
|||tidied, removed table-var, and added injection detection
DROP PROC XmlJoin
GO
Create PROC XmlJoin
(
-- inputs
@.Left xml, @.LeftXPath nvarchar(50),
@.Right XML, @.RightXPath nvarchar(50) = NULL,
-- outputs
@.Result xml OUTPUT,
-- xml options
@.Path nvarchar(50) = 'node', @.Row nvarchar(50) = '', @.Root
nvarchar(50) = 'xml',
-- query options
@.Distinct bit = 0,
@.Debug bit = 0
)
AS
BEGIN
DECLARE @.Query nvarchar(max)
-- verify inputs
SET @.LeftXPath = ISNULL(@.LeftXPath, '')
SELECT @.RightXPath = ISNULL(@.RightXPath, @.LeftXPath),
@.Path = ISNULL(@.Path, N'node'),
@.Row = ISNULL(@.Row, N''), @.Root = ISNULL(@.Root, N'')
IF CHARINDEX(N'''', @.LeftXPath+@.RightXPath+@.Path+@.Row+@.Root) > 0
BEGIN
RAISERROR ('Invalid character',16,1)
RETURN
END
-- build command
SET @.Query = N'
SELECT @.Result = (
SELECT ' + CASE @.Distinct WHEN 1 THEN N'DISTINCT ' ELSE N'' END
+ 'l.n.value(''.'',''varchar(20)'') AS [' + @.Path + N']
FROM @.Left.nodes(''' + @.LeftXPath + N''') l(n)
INNER JOIN @.Right.nodes(''' + @.RightXPath + N''') r(n)
ON l.n.value(''.'',''varchar(20)'') =
r.n.value(''.'',''varchar(20)'')
FOR XML PATH(''' + @.Row + N'''), ROOT(''' + @.Root + N''')
)'
-- debug?
IF @.Debug = 1
BEGIN
PRINT @.Query
SELECT @.Left AS [@.Left], @.Right AS [@.Right]
END
-- exec
EXEC sp_ExecuteSQL @.Query,
N'@.Left xml, @.Right xml, @.Result xml OUTPUT',
@.Left, @.Right, @.Result OUTPUT
END
GO
SET NOCOUNT ON
DECLARE @.Result xml
EXEC XmlJoin
@.Left =
'<roles><role>N12</role><role>N12</role><role>N13</role><role>S39</role><role>S14</role></roles>',
@.LeftXPath = 'roles/role',
@.Right = '<roles><role>N12</role><role>S39</role></roles>',
@.Result = @.Result OUTPUT,
@.Path = 'role', @.Root = 'roles', @.Debug = 1
SELECT @.Result
|||Not bad (again) -- to say the least.
I agree, the data in the example is simple and joining it as XML is
overkill, but your example has several finesses and techniques and is as such
complete and ready to use for larger @.Left and @.Right nodes() that can be
joined (assuming they're @.LeftXPath and @.RightXPath nodes() have parallel
structures). The @.* variables refer to your last example.
This example is also interesting as it uses flexible dynamic SQL, has nifty
debugging, and can inspire variation.
This thread, i think is nearly finished, but of course - any comments
concerning performance consideration/optimisations are welcome. Or if any
misunderstanding hase cropt up in my coments here.
I am also including in the below a way to match elements from two comma
delimited strings. The example below just uses nvarchar. It is by far less
flexible than your example, but can be used where no injection detection is
needed and for more simple data.
CREATE FUNCTION [dbo].[IsNameInString]
(
@.LeftIems nvarchar(max),
@.RightItems nvarchar(max)
)
RETURNS bit
AS
BEGIN
DECLARE @.aLeftItemToCheck nvarchar(100),
@.aRightItem nvarchar(100),
@.TempList nvarchar(max),
@.Pos int,
@.aPos int,
@.Count int
SET @.LeftIems = LTRIM(RTRIM(@.LeftIems))+ ','
SET @.Pos = CHARINDEX(',', @.LeftIems, 1)
IF REPLACE(@.LeftIems, ',', '') <> ''
BEGIN
WHILE @.Pos > 0
BEGIN
SET @.aLeftItemToCheck = ''
SET @.aLeftItemToCheck = LTRIM(RTRIM(LEFT(@.LeftIems, @.Pos - 1)))
IF @.aLeftItemToCheck <> ''
BEGIN
Set @.TempList = ''
Set @.aPos = 0
SET @.TempList = LTRIM(RTRIM(@.RightItems))+ ','
SET @.aPos = CHARINDEX(',', @.TempList, 1)
IF REPLACE(@.TempList, ',', '') <> ''
BEGIN
WHILE @.aPos > 0
BEGIN
SET @.aRightItem = LTRIM(RTRIM(LEFT(@.TempList, @.aPos - 1)))
IF @.aRightItem <> ''
Begin
If @.aRightItem = @.aLeftItemToCheck
BEGIN
RETURN 1
End
End
SET @.TempList = RIGHT(@.TempList, LEN(@.TempList) - @.aPos)
SET @.aPos = CHARINDEX(',', @.TempList, 1)
END
END
END
SET @.LeftIems = RIGHT(@.LeftIems, LEN(@.LeftIems) - @.Pos)
SET @.Pos = CHARINDEX(',', @.LeftIems, 1)
END
END
RETURN 0
END
Regards,
Paul
"Marc Gravell" wrote:

> tidied, removed table-var, and added injection detection
> DROP PROC XmlJoin
> GO
> Create PROC XmlJoin
> (
> -- inputs
> @.Left xml, @.LeftXPath nvarchar(50),
> @.Right XML, @.RightXPath nvarchar(50) = NULL,
> -- outputs
> @.Result xml OUTPUT,
> -- xml options
> @.Path nvarchar(50) = 'node', @.Row nvarchar(50) = '', @.Root
> nvarchar(50) = 'xml',
> -- query options
> @.Distinct bit = 0,
> @.Debug bit = 0
> )
> AS
> BEGIN
> DECLARE @.Query nvarchar(max)
> -- verify inputs
> SET @.LeftXPath = ISNULL(@.LeftXPath, '')
> SELECT @.RightXPath = ISNULL(@.RightXPath, @.LeftXPath),
> @.Path = ISNULL(@.Path, N'node'),
> @.Row = ISNULL(@.Row, N''), @.Root = ISNULL(@.Root, N'')
> IF CHARINDEX(N'''', @.LeftXPath+@.RightXPath+@.Path+@.Row+@.Root) > 0
> BEGIN
> RAISERROR ('Invalid character',16,1)
> RETURN
> END
> -- build command
> SET @.Query = N'
> SELECT @.Result = (
> SELECT ' + CASE @.Distinct WHEN 1 THEN N'DISTINCT ' ELSE N'' END
> + 'l.n.value(''.'',''varchar(20)'') AS [' + @.Path + N']
> FROM @.Left.nodes(''' + @.LeftXPath + N''') l(n)
> INNER JOIN @.Right.nodes(''' + @.RightXPath + N''') r(n)
> ON l.n.value(''.'',''varchar(20)'') =
> r.n.value(''.'',''varchar(20)'')
> FOR XML PATH(''' + @.Row + N'''), ROOT(''' + @.Root + N''')
> )'
> -- debug?
> IF @.Debug = 1
> BEGIN
> PRINT @.Query
> SELECT @.Left AS [@.Left], @.Right AS [@.Right]
> END
> -- exec
> EXEC sp_ExecuteSQL @.Query,
> N'@.Left xml, @.Right xml, @.Result xml OUTPUT',
> @.Left, @.Right, @.Result OUTPUT
> END
> GO
> SET NOCOUNT ON
> DECLARE @.Result xml
> EXEC XmlJoin
> @.Left =
> '<roles><role>N12</role><role>N12</role><role>N13</role><role>S39</role><role>S14</role></roles>',
> @.LeftXPath = 'roles/role',
> @.Right = '<roles><role>N12</role><role>S39</role></roles>',
> @.Result = @.Result OUTPUT,
> @.Path = 'role', @.Root = 'roles', @.Debug = 1
> SELECT @.Result
>
>
|||Regarding performance of the xml code; I guess there are two aspects
here... sp_ExecuteSQL benefits in that it uses the query cache, so any
uses of the same xpaths (with different xml) have the chance to share
a query plan to reduce recompilation. Of course, xml performance is a
black art, and the usage of value() pretty-much precludes a lot of
indexing - but as a general-purpose utility it does the job.
Performance-critical code should run against tables (not variables)
which can have XML indexing applied; and queries should be moved
*inside* the xml query (via sql:parameter) so that the xml indexing
can be used - but that isn't what we are discussing... just an aside.
The CSV stuff is interesting; the parsing approach that you've used
will probably work, but I'd be rather tempted to cobble together a
table function (UDF) that splits a /single/ delimted string, and call
it twice (perhaps into @.tables or #tables, or perhaps just "as is") -
this will let SQL Server do what it does best: set based operations.
Besides which, a simple "split this string" is so useful for day-to-
day operations that such a function should exist in every database; it
also demonstrates a level of "do one thing well; then join 'em
together" re-use.
Actually, it *really* annoys me that MS haven't added an optimised
method for delimited string splitting, rather than having to write it
in TSQL which simply isn't ideal for the job; I know you can call CLR
code in 2005/2008, but I doubt that it is worth the overhead of
loading the runtime... I've resisted CLR/SQL code so far, and it just
seems so... unnecessary!
Example code for a join (once you have the UDF written - apols for the
name; not my choice ;-p)
DECLARE @.Left varchar(max), @.Right varchar(max)
SELECT @.Left = 'a,b,d,c,g,h',
@.Right = 'h,b,e,b'
SELECT l.Value
FROM SplitMaxArray_udf(@.Left,',',1,1) l
INNER JOIN SplitMaxArray_udf(@.Right,',',1,1) r
ON r.Value = l.Value
or if you just want to check for a single match you could use EXISTS.
You can get UDF TSQL for CSV->table easily enough on t'net, so I won't
bother posting that unless prompted.
Marc

No comments:

Post a Comment