XSL multiple file input
-
- Posts: 80
- Joined: Wed Jan 14, 2009 12:50 pm
XSL multiple file input
Is there a way to use XSLT over multiple input files (specified by location or wildcard) to produce a single output file?
For instance, I have a directory with a varying number of XML files, all of which have the same XML structure:
FILE1.xml
<a><header>some metadata A</header><content>file contents</content></a>
FILE2.xml
<a><header>some metadata B</header><content>file contents</content></a>
FILE3.xml
<a><header>some metadata C</header><content>file contents</content></a>
FILE4.xml
<a><header>some metadata D</header><content>file contents</content></a>
I'd like to be able to run a transformation across all the files that happen to be in the directory at any particular time to get a list of the filenames (or full paths, I don't mind which) and some of their XML content, e.g. the <header> elements:
FILENAME HEADER INFO
FILE1.xml some metadata A
FILE2.xml some metadata B
FILE3.xml some metadata C
FILE4.xml some metadata D
The contents of the directory change fairly often, so if, for instance, this file is added:
FILE5.xml
<a><header>some metadata E</header><content>file contents</content></a>
I'd like not to have to change anything in the transformation (or the scenario) in order to get the new result:
FILENAME HEADER INFO
FILE1.xml some metadata A
FILE2.xml some metadata B
FILE3.xml some metadata C
FILE4.xml some metadata D
FILE5.xml some metadata E
Any suggestions for how to do it - or indeed advice that it's not possible - gratefully received.
-
- Site Admin
- Posts: 2095
- Joined: Thu Jan 09, 2003 2:58 pm
Re: XSL multiple file input
You can do that with XSLT 2.0. For example, if you select Saxon 9 as the XSLT engine, the following should give you a list of all the XML files in the same folder as the stylesheet:
Code:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:text>FILENAME HEADER INFO</xsl:text>
    <!-- every *.xml file in the stylesheet's folder -->
    <xsl:for-each select="collection('.?select=*.xml')">
      <!-- &#10; is a line break, so each file starts on a new line -->
      <xsl:text>&#10;</xsl:text>
      <xsl:value-of select="document-uri(.)"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="/a/header"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
Best Regards,
George
George Cristian Bina
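A possible variation on the stylesheet above, offered only as a sketch: it prints just the file name rather than the full URI, and it starts from a named template so that no source document is needed to launch the transformation. The template name "main" and the tab separator are assumptions made for illustration; they are not part of the original answer.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="text"/>
  <!-- "main" is an assumed name for the initial template; any name works -->
  <xsl:template name="main">
    <xsl:text>FILENAME&#9;HEADER INFO</xsl:text>
    <!-- the relative URI is resolved against the stylesheet's location -->
    <xsl:for-each select="collection('.?select=*.xml')">
      <xsl:text>&#10;</xsl:text>
      <!-- last path segment of the document URI, i.e. the file name -->
      <xsl:value-of select="tokenize(document-uri(.), '/')[last()]"/>
      <xsl:text>&#9;</xsl:text>
      <xsl:value-of select="/a/header"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the Saxon command line, the initial template can be selected with the -it:main option. With match="/", as in the original, the scenario has to be given some XML file as input, even though its content is not used.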
-
- Site Admin
- Posts: 2095
- Joined: Thu Jan 09, 2003 2:58 pm
Re: XSL multiple file input
One more thing: for additional information on the syntax for the Saxon 9 collection function please see
http://www.saxonica.com/documentation/s ... tions.html
Best Regards,
George
George Cristian Bina
-
- Posts: 80
- Joined: Wed Jan 14, 2009 12:50 pm
Re: XSL multiple file input
Thanks. That's really helpful. I have a follow-up question or two, with regard to doing more complex things with this ...
(1) Can I combine collections in the usual way in select expressions, e.g.
Code:
select="collection('../A?select=*.xml')|collection('../B?select=*.xml')"
This seems to work OK, but I just wanted to be sure.
(2) When processing collections, is it possible to use xsl:for-each-group rather than xsl:for-each? I'd like to group items within a collection by their declared schema (i.e. group by each document root element's attribute xsi:noNamespaceSchemaLocation) so that I deal with all the XML files with one schema together and then all with the next schema and so on?
I wondered about:
Code:
<xsl:for-each-group select="collection('../A?select=*.xml')|collection('../B?select=*.xml')" group-by="//*/@xsi:noNamespaceSchemaLocation">
But I only seem to get the first document per group when I then show the <xsl:value-of select="current-group()">
(3) This means then that I don't seem to have a group of files to deal with together as the input to another <xsl:for-each>. And in fact, inside each group I'd like to sort files (what <xsl:for-each select=???> do I need to select each file in turn?) ideally by just their filename (irrespective of the path to that file, i.e. what <xsl:sort select=???>)?
-
- Site Admin
- Posts: 2095
- Joined: Thu Jan 09, 2003 2:58 pm
Re: XSL multiple file input
The following stylesheet works OK:
Code:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="2.0">
  <xsl:output indent="yes"/>
  <xsl:template match="/">
    <test>
      <files>
        <xsl:for-each select="collection('../a?select=*.xml')|collection('../b?select=*.xml')">
          <file>
            <xsl:value-of select="document-uri(.)"/>
          </file>
        </xsl:for-each>
      </files>
      <grouped>
        <xsl:for-each-group select="collection('../a?select=*.xml')|collection('../b?select=*.xml')"
          group-by="/*/@xsi:noNamespaceSchemaLocation">
          <group schema="{current-grouping-key()}">
            <xsl:for-each select="current-group()">
              <file>
                <xsl:value-of select="document-uri(.)"/>
              </file>
            </xsl:for-each>
          </group>
        </xsl:for-each-group>
      </grouped>
    </test>
  </xsl:template>
</xsl:stylesheet>
On my test files it gives:
Code:
<?xml version="1.0" encoding="UTF-8"?>
<test xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <files>
    <file>file:/Users/george/test/xslt/a/test.xml</file>
    <file>file:/Users/george/test/xslt/b/test.xml</file>
    <file>file:/Users/george/test/xslt/b/testb.xml</file>
  </files>
  <grouped>
    <group schema="test.xsd">
      <file>file:/Users/george/test/xslt/a/test.xml</file>
      <file>file:/Users/george/test/xslt/b/test.xml</file>
    </group>
    <group schema="testb.xsd">
      <file>file:/Users/george/test/xslt/b/testb.xml</file>
    </group>
  </grouped>
</test>
Best Regards,
George
George Cristian Bina
-
- Site Admin
- Posts: 2095
- Joined: Thu Jan 09, 2003 2:58 pm
Re: XSL multiple file input
Please note that the above does not resolve the schema file paths. In the example, test.xml in folder a and test.xml in folder b both refer to test.xsd, but those are two different schema files, one in folder a and the other in folder b. To correctly group files that refer to the same schema file, you can use something like the following:
Code:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="2.0">
  <xsl:output indent="yes"/>
  <xsl:template match="/">
    <test>
      <files>
        <xsl:for-each select="collection('../a?select=*.xml')|collection('../b?select=*.xml')">
          <file>
            <xsl:value-of select="document-uri(.)"/>
          </file>
        </xsl:for-each>
      </files>
      <grouped>
        <!-- document() resolves the schema location relative to each XML file -->
        <xsl:for-each-group select="collection('../a?select=*.xml')|collection('../b?select=*.xml')"
          group-by="document-uri(document(/*/@xsi:noNamespaceSchemaLocation))">
          <group schema="{current-grouping-key()}">
            <xsl:for-each select="current-group()">
              <file>
                <xsl:value-of select="document-uri(.)"/>
              </file>
            </xsl:for-each>
          </group>
        </xsl:for-each-group>
      </grouped>
    </test>
  </xsl:template>
</xsl:stylesheet>
Best Regards,
George
George Cristian Bina
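Question (3) about sorting is not answered in the thread, so here is a hedged sketch of one way it might be done. It reuses the a and b folders from the example above and makes two changes that are assumptions rather than anything George posted: the grouping key is computed with resolve-uri() against each document's base URI, which yields an absolute schema URI without having to load the schema file, and each group is sorted by the last path segment of the document URI.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="2.0">
  <xsl:output indent="yes"/>
  <xsl:template match="/">
    <grouped>
      <!-- group-by is evaluated once per document, with that document as context -->
      <xsl:for-each-group
        select="collection('../a?select=*.xml') | collection('../b?select=*.xml')"
        group-by="resolve-uri(/*/@xsi:noNamespaceSchemaLocation, base-uri(/*))">
        <group schema="{current-grouping-key()}">
          <!-- current-group() holds every document that shares this key -->
          <xsl:for-each select="current-group()">
            <!-- sort on the file name only, ignoring the folder part -->
            <xsl:sort select="tokenize(document-uri(.), '/')[last()]"/>
            <file>
              <xsl:value-of select="document-uri(.)"/>
            </file>
          </xsl:for-each>
        </group>
      </xsl:for-each-group>
    </grouped>
  </xsl:template>
</xsl:stylesheet>
```

Documents whose root element has no xsi:noNamespaceSchemaLocation attribute get an empty grouping key and so fall into no group. Selecting current-group() in an inner xsl:for-each, rather than printing it with xsl:value-of, is what gives access to each file in turn.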