Dummy DTDs and Catalog files for transformation

philgooch
Posts: 2
Joined: Fri Jul 15, 2005 2:21 pm

Post by philgooch »

Hi there

Having read a previous post (http://www.oxygenxml.com/forum/topic3418.html), I found that I could define a dummy DTD in my catalog to work around the problem of transforming XML files for which I do not have the DTD. This works really well: the Xerces parser complains in the Oxygen message pane, but it still allows the transformation to proceed.

What's interesting is that you can add entity declarations to the dummy DTD and these are resolved automatically when transforming with the Saxon XSLT processor. This is helpful if you have an XML document for which you do not have the DTD but which contains a bunch of XHTML character entities: all you need to do is point to them in your dummy.dtd and they will be resolved and rendered correctly.
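As a rough illustration of the entity declarations mentioned above, here is a minimal Python sketch that generates them from the standard library's `html.entities.name2codepoint` table. The function name `dummy_dtd_entities` and the idea of generating the declarations with a script are assumptions made for illustration, not something taken from the earlier post; the output is meant to be pasted into dummy.dtd.

```python
from html.entities import name2codepoint

# Entities that XML already predefines; redeclaring them needs special
# escaping rules, so this sketch simply leaves them out.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def dummy_dtd_entities(names=None):
    """Return <!ENTITY ...> declarations for the given HTML entity names.

    With names=None, every entity in name2codepoint is declared.
    An unknown name raises KeyError.
    """
    if names is None:
        names = name2codepoint.keys()
    lines = []
    for name in sorted(names):
        if name in XML_PREDEFINED:
            continue
        # Declare the entity as a numeric character reference.
        lines.append('<!ENTITY %s "&#%d;">' % (name, name2codepoint[name]))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print only the declarations needed for a document using &eacute; and &nbsp;
    print(dummy_dtd_entities(["eacute", "nbsp"]), end="")
```

Declaring only the entities a document actually uses keeps the dummy DTD short, but generating the full set once is probably simpler if the documents come from many sources.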

However, for this to work I still need to add a dummy DOCTYPE declaration at the top of each file so that the catalog maps it to dummy.dtd.

So my question is: is it possible to define a 'default' catalog mapping for documents for which there is no DTD declaration? I.e. given a document

<foo>
<bar>[full document content here containing stuff like &eacute; etc]</bar>
</foo>

could a dummy DTD be invoked through a suitable entry in the catalog file?

This would be an immense help as it would pretty much allow any XML document from any source, with or without a DTD and with or without unresolved entity references, to be transformed.

Cheers

Phil
adrian
Posts: 2879
Joined: Tue May 17, 2005 4:01 pm

Re: Dummy DTDs and Catalog files for transformation

Post by adrian »

Hello,

Unfortunately I don't see a way to make this work for transformations. The catalog resolver needs a system ID or a public ID to resolve through an XML Catalog. If there is no DOCTYPE declaration in the XML file, then there is no public or system ID to be resolved and the catalog resolver is not even involved in the process.

Additionally, for transformations there is no common concept of specifying a separate DTD/schema that is involved in the transformation. You can only specify an XML source, an XSLT stylesheet and the output.

For validation, on the other hand, this is very simple to resolve in Oxygen with the help of a generic Document Type Association that provides a DTD/schema for any file. But as I said, this makes no difference for a transformation.

In conclusion, I'm afraid you will still have to make the changes in the XML documents for this to work.
You could use the Find/Replace in Files tool to add a DOCTYPE declaration to all the files that you need to process.

For example you could add a DOCTYPE declaration after the XML declaration in every file. The problem is that you have to specify in the DOCTYPE the root element name that the XML document is using.

Find -> Find/Replace in Files:
- Enable the Regular expression option.

- Search for the XML declaration. Text to find:

Code:

\Q<?xml version="1.0" encoding="UTF-8"?>\E
\Q and \E mark the start and end of a literal (escaped) sequence in the regular expression, so the characters between them are matched exactly as written.

- Replace with:

Code:

$0\n<!DOCTYPE root SYSTEM "dummy.dtd">
$0 is the matched text (the XML declaration), which you want to keep. Replace root with the name of the root element that the documents use.
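If the files do not all share the same root element, or not all of them start with exactly that XML declaration, a small script may be easier than Find/Replace. The following Python sketch is only an illustration under stated assumptions: the function name `add_doctype` is hypothetical, and it locates the root element with a regular expression rather than an XML parser, since a parser would likely reject documents that contain undeclared entities such as &eacute;.

```python
import re

# An XML declaration is only valid at the very start of the document.
XML_DECL = re.compile(r'\A<\?xml\b.*?\?>', re.DOTALL)
# Comments are stripped before looking for the root, in case they contain tags.
COMMENT = re.compile(r'<!--.*?-->', re.DOTALL)
# First start tag; "<?" and "<!" constructs do not match this pattern.
START_TAG = re.compile(r'<([A-Za-z_][\w.:-]*)')


def add_doctype(xml_text, dtd_system_id="dummy.dtd"):
    """Insert a DOCTYPE naming the document's own root element.

    Documents that already contain a DOCTYPE are returned unchanged.
    """
    if "<!DOCTYPE" in xml_text:
        return xml_text
    root = START_TAG.search(COMMENT.sub("", xml_text))
    if root is None:
        raise ValueError("no root element found")
    doctype = '<!DOCTYPE %s SYSTEM "%s">' % (root.group(1), dtd_system_id)
    decl = XML_DECL.match(xml_text)
    if decl:
        # Keep the XML declaration first, then the DOCTYPE on its own line.
        return xml_text[:decl.end()] + "\n" + doctype + xml_text[decl.end():]
    return doctype + "\n" + xml_text


if __name__ == "__main__":
    sample = '<?xml version="1.0" encoding="UTF-8"?>\n<foo>\n<bar>caf&eacute;</bar>\n</foo>'
    print(add_doctype(sample))
```

The system ID defaults to "dummy.dtd" to match the replacement string above; it still has to be mapped to the real dummy DTD in the catalog.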

Regards,
Adrian
Adrian Buza
<oXygen/> XML Editor, Schema Editor and XSLT Editor/Debugger
http://www.oxygenxml.com