Tuesday, August 30, 2016

Nokogiri XSLT transform using multiple source XML files

I want to transform XML using Nokogiri. I built an XSL stylesheet and it all works fine; I also tested it in IntelliJ. My data comes from two XML files.

My problem occurs when I try to get Nokogiri to do the transform. I can't seem to find a way to get it to parse multiple source files.

This is the code I am using from the documentation:

require 'nokogiri'

doc1 = Nokogiri::XML(File.read('F:/transcoder/xslt_repo/core_xml.xml'))
xslt = Nokogiri::XSLT(File.read('F:/transcoder/xslt_repo/google.xsl'))

puts xslt.transform(doc1)

I tried:

require 'nokogiri'

doc1 = Nokogiri::XML(File.read('F:/transcoder/xslt_repo/core_xml.xml'))
doc2 = Nokogiri::XML(File.read('F:/transcoder/xslt_repo/file_data.xml'))
xslt = Nokogiri::XSLT(File.read('F:/transcoder/xslt_repo/test.xsl'))

puts xslt.transform(doc1, doc2)

However, it seems transform only takes one source document, so at the moment I am only able to transform half the data I need:

<?xml version="1.0"?>
<package package_id="LB000001">
  <asset_metadata>
    <series_title>test asset 1</series_title>
    <season_title>Number 1</season_title>
    <episode_title>ET 1</episode_title>
    <episode_number>1</episode_number>
    <license_start_date>21-07-2016</license_start_date>
    <license_end_date>31-07-2016</license_end_date>
    <rating>15</rating>
    <synopsis>This is a test asset</synopsis>
  </asset_metadata>
  <video_file>
    <file_name/>
    <file_size/>
    <check_sum/>
  </video_file>
  <image_1>
    <file_name/>
    <file_size/>
    <check_sum/>
  </image_1>
</package>

How can I get this to work?

Edit:

This is core_metadata.xml, which is generated by a PHP code block with data that comes from a database.

<?xml version="1.0" encoding="utf-8"?>
<manifest task_id="00000000373">
  <asset_metadata>
    <material_id>LB111111</material_id>
    <series_title>This is a test</series_title>
    <season_title>This is a test</season_title>
    <season_number>1</season_number>
    <episode_title>that test</episode_title>
    <episode_number>2</episode_number>
    <start_date>23-08-2016</start_date>
    <end_date>31-08-2016</end_date>
    <ratings>15</ratings>
    <synopsis>this is a test</synopsis>
  </asset_metadata>
  <file_info>
    <source_filename>LB111111</source_filename>
    <number_of_segments>2</number_of_segments>
    <segment_1 seg_1_start="00:00:10.000" seg_1_dur="00:01:00.000"/>
    <segment_2 seg_2_start="00:02:00.000" seg_2_dur="00:05:00.000"/>
    <conform_profile definition="hd" aspect_ratio="16f16">ffmpeg -i S_PATH/F_NAME.mp4 SEG_CONFORM 2&gt; F:/Transcoder/logs/transcode_logs/LOG_FILE.txt</conform_profile>
    <transcode_profile profile_name="xbox" package_type="tar">ffmpeg -f concat -i T_PATH/CONFORM_LIST TRC_PATH/F_NAME.mp4 2&gt; F:/Transcoder/logs/transcode_logs/LOG_FILE.txt</transcode_profile>
    <target_path>F:/profiles/xbox</target_path>
  </file_info>
</manifest>

The second XML file (file_data.xml) is created dynamically during the transcode process by Nokogiri:

<?xml version="1.0"?>
<file_data>
  <video_file>
    <file_name>LB111111_xbox_230816114438.mp4</file_name>
    <file_size>141959922</file_size>
    <md5_checksum>bac7670e55c0694059d3742285079cbf</md5_checksum>
  </video_file>
  <image_1>
    <file_name>test</file_name>
    <file_size>test</file_size>
    <md5_checksum>test</md5_checksum>
  </image_1>
</file_data>

I managed to work around this issue by hard-coding document() calls to file_data.xml into the XSLT file:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
    <package>
        <xsl:attribute name="package_id">
            <xsl:value-of select="manifest/asset_metadata/material_id"/>
        </xsl:attribute>
        <asset_metadata>
            <series_title>
                <xsl:value-of select="manifest/asset_metadata/series_title"/>
            </series_title>
            <season_title>
                <xsl:value-of select="manifest/asset_metadata/season_title"/>
            </season_title>
            <episode_title>
                <xsl:value-of select="manifest/asset_metadata/episode_title"/>
            </episode_title>
            <episode_number>
                <xsl:value-of select="manifest/asset_metadata/episode_number"/>
            </episode_number>
            <license_start_date>
                <xsl:value-of select="manifest/asset_metadata/start_date"/>
            </license_start_date>
            <license_end_date>
                <xsl:value-of select="manifest/asset_metadata/end_date"/>
            </license_end_date>
            <rating>
                <xsl:value-of select="manifest/asset_metadata/ratings"/>
            </rating>
            <synopsis>
                <xsl:value-of select="manifest/asset_metadata/synopsis"/>
            </synopsis>
        </asset_metadata>
        <video_file>
            <file_name>
                <xsl:value-of select="document('file_data.xml')/file_data/video_file/file_name"/>
            </file_name>
            <file_size>
                <xsl:value-of select="document('file_data.xml')/file_data/video_file/file_size"/>
            </file_size>
            <check_sum>
                <xsl:value-of select="document('file_data.xml')/file_data/video_file/md5_checksum"/>
            </check_sum>
        </video_file>
        <image_1>
            <file_name>
                <xsl:value-of select="document('file_data.xml')/file_data/image_1/file_name"/>
            </file_name>
            <file_size>
                <xsl:value-of select="document('file_data.xml')/file_data/image_1/file_size"/>
            </file_size>
            <check_sum>
                <xsl:value-of select="document('file_data.xml')/file_data/image_1/md5_checksum"/>
            </check_sum>
        </image_1>
    </package>
</xsl:template>
</xsl:stylesheet>

I then use Saxon to do the transform:

xslt = "java -jar C:/SaxonHE9-7-0-7J/saxon9he.jar #{temp}core_metadata.xml #{temp}#{profile}.xsl > #{temp}#{file_name}.xml"
system(xslt)

I would love to find a way to do this without having to hard-code file_data.xml into the XSLT.

1 Answer

Merge XML Documents and Transform

You'll have to do a bit of work to combine the XML content before your XSL transformation. @the-Tin-Man has a nice answer to a similar question in the archives, which can be adapted for your use case.

Let's say we have the following sample content:

<!--a.xml-->
<?xml version="1.0"?>
<xml>
  <packages>
    <package>Data here for A</package>
    <package>Another Package</package>
  </packages>
</xml>
<!--end a.xml-->

<!--b.xml-->
<?xml version="1.0"?>
<xml>
  <packages>
    <package>B something something</package>
  </packages>
</xml>
<!--end b.xml-->

And we want to apply the following XSLT template:

<!--transform.xslt-->
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="//packages">
  <html>
  <body>
    <h2>Packages</h2>
    <ol>
      <xsl:for-each select="./package">
        <li><xsl:value-of select="text()"/></li>
      </xsl:for-each>
    </ol>
  </body>
  </html>
</xsl:template>
</xsl:stylesheet>
<!--end transform.xslt-->

If we have parallel document structure, as in this case, we can merge the two XML documents' content together and pass that along for transformation.

require 'nokogiri'

doc1 = Nokogiri::XML(File.read('./a.xml'))
doc2 = Nokogiri::XML(File.read('./b.xml'))

# Move every <package> from b.xml into the first <packages> element of a.xml.
moved_packages = doc2.search('package')
doc1.at('/descendant::packages[1]').add_child(moved_packages)

xslt = Nokogiri::XSLT(File.read('./transform.xslt'))

# Transform the merged document.
puts xslt.transform(doc1)

This would generate the following output:

<html><body>
<h2>Packages</h2>
<ol>
<li>Data here for A</li>
<li>Another Package</li>
<li>B something something</li>
</ol>
</body></html>

If your XML documents have varying structure, you may benefit from an intermediary XML nodeset that you add your content to, rather than the shortcut of merging document 2 content into document 1.
