Friday, March 11, 2016

replace text inside sql with XML output

Leave a Comment

I have this query:

SELECT [content_id]       ,[content_html]       ,[date_created]       ,folder_id   FROM content ct   where folder_id=126 order by content_title   FOR XML PATH('PressRelease'), ROOT ('PressReleases') 

When I run this query this is the XML file that is generated:

<PressReleases>   <PressRelease>     <content_id>6442452927</content_id>     <content_html>&lt;root&gt;&lt;Date&gt;2015-12-02&lt;/Date&gt;         &lt;Description&gt;&lt;p class="customHeader"&gt;jobs to Philadelphia.&lt;/p&gt;         &lt;p&gt;mtext in here.&lt;/p&gt;         &lt;p&gt;mtext in here.&lt;/p&gt;         &lt;/Description&gt;         &lt;SEO&gt;&lt;h1&gt;Pennsylvania Location&lt;/h1&gt;         &lt;div class="bulletRightBar"&gt; The move was made possible in part by the Philadelphia Jobs          Credit&lt;/div&gt; &lt;/SEO&gt; &lt;/root&gt;</content_html>     <date_created>2015-12-02T09:47:12</date_created>     <folder_id>126</folder_id>   </PressRelease>   <PressReleases> 

What I need is this XML file:

<PressReleases>   <PressRelease>     <content_id>6442452927</content_id>     <content_html><root><Date>2015-12-02</Date>         <Description>&lt;p class="customHeader"&gt;jobs to Philadelphia.&lt;/p&gt;         &lt;p&gt;mtext in here.&lt;/p&gt;         &lt;p&gt;mtext in here.&lt;/p&gt;         </Description>         <SEO>&lt;h1&gt;Pennsylvania Location&lt;/h1&gt;         &lt;div class="bulletRightBar"&gt; The move was made possible in part by the Philadelphia Jobs          Credit&lt;/div&gt; </SEO> </root></content_html>     <date_created>2015-12-02T09:47:12</date_created>     <folder_id>126</folder_id>   </PressRelease>   <PressReleases> 

Inside the <content_html> I want to make <root> <date> and <Description> as XML elements but leave the rest as encoded html.

Here's a screenshot for the sql resultenter image description here

4 Answers

Answers 1

It's not pretty, but you could cast the XML field as a string, use the REPLACE function and cast it back to XML like below. You might want to create a function that does this as the line would get with a lot of replacing:

SELECT [content_id]       ,cast(REPLACE(cast([content_html] as varchar(max)),'&lt;root&gt;','<root>') as xml)       ,[date_created]       ,folder_id   FROM content ct   where folder_id=126 order by content_title   FOR XML PATH('PressRelease'), ROOT ('PressReleases') 

or here is a way to call it with a function

CREATE FUNCTION [dbo].[XML_Replace]         (@XML_Field XML) RETURNS XML   BEGIN  DECLARE @xml varchar(max)   DECLARE @xml_Mid varchar(max)  DECLARE @strtBigInt bigint  , @endBigInt bigint   SET @xml = cast(@XML_Field as varchar(max))  SET @strtBigInt = CHARINDEX('&lt;Description&gt;',@xml)  SET @endBigInt = CHARINDEX('&lt;/SEO&gt;',@xml)   SET @xml_Mid = SUBSTRING(@xml, @strtBigInt+19,@endBigInt-@strtBigInt-19);   RETURN(cast(REPLACE(REPLACE(REPLACE(REPLACE(substring(@xml,0,@strtBigInt+19),'&lt;','<'),'&gt;','>') +    @xml_Mid +  REPLACE(REPLACE(substring(@xml,@endBigInt,Len(@xml)),'&lt;','<'),'&gt;','>'),'&lt;/Description&gt;','</Description>'),'&lt;SEO&gt;','<SEO>') as xml)); END 

Then use the funtion in your code:

SELECT [content_id]       ,dbo.XML_Replace([content_html]) as content_html       ,[date_created]       ,folder_id   FROM content ct   where folder_id=126 order by content_title   FOR XML PATH('PressRelease'), ROOT ('PressReleases') 

Answers 2

It looks like [content_html] is stored as string (perhaps nvarchar?). It's contents looks like a well-formed XML. If that is the case it can be CAST to XML to get properly incorporated into result set:

-- Test Data Set WITH content AS (     SELECT         6442452927 AS [content_id],         N'<root><Date>2015-12-02</Date>         <Description><p class="customHeader">jobs to Philadelphia.</p>         <p>mtext in here.</p>         <p>mtext in here.</p>         </Description>         <SEO><h1>Pennsylvania Location</h1>         <div class="bulletRightBar"> The move was made possible in part by the Philadelphia Jobs          Credit</div> </SEO> </root>' AS [content_html],         CAST('2015-12-02T09:47:12' AS DATETIME2) AS [date_created],         126 AS [folder_id],         1 AS [content_title] ) -- Query SELECT [content_id]         ,CAST((Select            CAST([content_html] AS XML).query('/root/Date/node()') AS [Date]           ,CAST(CAST([content_html] AS XML).query('/root/Description/node()') AS nvarchar(max)) As [Description]           ,CAST(CAST([content_html] AS XML).query('/root/SEO/node()') AS nvarchar(max)) AS [SEO]           FOR XML PATH('root'), ROOT ('content_html')) As XML)       ,[date_created]       ,[folder_id]   FROM content ct WHERE folder_id=126 ORDER BY content_title FOR XML PATH('PressRelease'), ROOT ('PressReleases') 

Answers 3

The tricky thing about this is that your content_html field contains neither XML or HTML. It's a very strange way to encode the data. If possible you might want to convert the data in the field so that it is XML, which would make this query very easy. For example, you would convert the field in the data you provided to:

<root>   <Date>2015-12-02</Date>   <Description>&lt;p class="customHeader"&gt;jobs to Philadelphia.&lt;/p&gt;         &lt;p&gt;mtext in here.&lt;/p&gt;         &lt;p&gt;mtext in here.&lt;/p&gt;         </Description>   <SEO>&lt;h1&gt;Pennsylvania Location&lt;/h1&gt;         &lt;div class="bulletRightBar"&gt; The move was made possible in part by the Philadelphia Jobs          Credit&lt;/div&gt; </SEO> </root> 

If you can't change the way the data is stored, one way to get at it is to do some converting back and forth using the code below. The sub-query creates a field called [content_xml] that converts your data in the content_html field to an XML data type. This assumes that all of your data can be converted to valid XML. Then for each node under root in the XML, it rebuilds the node and wraps the data with CDATA so that it preserves the formatting you desire.

--set up test data create table #content     ([content_id] bigint     ,[content_html] varchar(max)     ,[date_created] datetime     ,folder_id int     , content_title varchar(50)     )  insert into #content values ( 6442452927,  '&lt;root&gt;&lt;Date&gt;2015-12-02&lt;/Date&gt;         &lt;Description&gt;&lt;p class="customHeader"&gt;jobs to Philadelphia.&lt;/p&gt;         &lt;p&gt;mtext in here.&lt;/p&gt;         &lt;p&gt;mtext in here.&lt;/p&gt;         &lt;/Description&gt;         &lt;SEO&gt;&lt;h1&gt;Pennsylvania Location&lt;/h1&gt;         &lt;div class="bulletRightBar"&gt; The move was made possible in part by the Philadelphia Jobs          Credit&lt;/div&gt; &lt;/SEO&gt; &lt;/root&gt;',  '2015-12-02T09:47:12',  126,  'Content Title')  --Output select [content_id],     (     SELECT         CONVERT(xml,         '<' + convert(varchar(max),T.c.query('local-name(.)')) + '>'          + (select convert(xml, '<![CDATA[' + convert(varchar(max),T.c.query('node()')) + ']]>') for xml path(''))         + '</' + convert(varchar(max),T.c.query('local-name(.)')) + '>'         )     FROM content_xml.nodes('/root/*') T(c)     for xml path(''), type     ) as [content_html/root]     ,[date_created]     ,folder_id from     (     SELECT *,          convert(xml,convert(xml,content_html).value('.','varchar(max)')) [content_xml]     FROM #content ct      ) AllData order by content_title FOR XML PATH('PressRelease'), ROOT ('PressReleases') 

Answers 4

Plz try this one

SELECT [content_id]       ,CONVERT(XML,REPLACE(REPLACE([content_html],'&lt;','<'),'&gt;','>')) AS [content_html]       ,[date_created]       ,folder_id FROM content ct  WHERE folder_id=126 ORDER BY content_title FOR XML PATH('PressRelease'), ROOT ('PressReleases') 
If You Enjoyed This, Take 5 Seconds To Share It

0 comments:

Post a Comment