XHTML to DocBook conversion
Hi all,
I'm currently working for a small company that runs an online shop.
Over the years I have written a lot of detailed product sheets in XHTML, and each time we changed our website engine (from a home-made one to Prestashop or Magento) for various business reasons, I had to modify those product sheets to make them compatible with the new website structure.
So I have decided to convert all our product sheets to DocBook 5. All our new product sheets are now written directly in DocBook 5 using the Oxygen tools.
I have written an XSL stylesheet to transform those product sheets into our current XHTML output format (the new Oxatis SaaS engine), and I use a transformation scenario to generate my XHTML fragments automatically; all the outputs are XHTML fragments, not whole pages.
I'm now looking to do the opposite: XHTML fragments to DocBook 5. This would let me recover the existing product sheets written over the years in XHTML.
Is there a tool that can automatically transform fragments of XHTML code to DocBook?
It is important to do this with a script or another automatic method (not copy/paste) because we already have hundreds, maybe thousands, of product sheets.
Thanks for your help
Vincent
Re: XHTML to DocBook conversion
Hi Vincent,
We have a batch converter add-on for Oxygen allowing you to transform multiple HTML or XHTML documents to DocBook 4 or 5:
https://www.oxygenxml.com/doc/versions/ ... addon.html
Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
Re: XHTML to DocBook conversion
Hi Radu,
I have installed the add-on and it works fine!
I have to use another XSL stylesheet to transform the add-on's DocBook output so that it matches my own DocBook structure, which contains some attributes (xml:id or xml:lang) and uses section/simplesection.
Thank you for your help.
Is it possible to customize the output document?
Some attributes (class) in the original XHTML files would be helpful to keep, but they don't appear in the output XML file. I know it is difficult to map XHTML classes to DocBook attributes, but the class attribute is useful for mapping <div> tags in XHTML to <section> tags in DocBook.
Regards
Vincent
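For readers following along, the kind of post-processing stylesheet described in this post could look roughly like the sketch below. It is not Vincent's actual stylesheet. It assumes the add-on's output is in the DocBook 5 namespace; the `default-lang` parameter and the rule of giving every nested section a generated `xml:id` are hypothetical choices made only for illustration.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:db="http://docbook.org/ns/docbook">

  <!-- Hypothetical parameter: language to stamp on the root element. -->
  <xsl:param name="default-lang" select="'en'"/>

  <!-- Identity template: copy everything unchanged by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Root element: add xml:lang when the converter did not set one. -->
  <xsl:template match="/*">
    <xsl:copy>
      <xsl:if test="not(@xml:lang)">
        <xsl:attribute name="xml:lang"><xsl:value-of select="$default-lang"/></xsl:attribute>
      </xsl:if>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Nested sections without an identifier get a generated xml:id.
       [parent::*] excludes the root element, which the template above handles. -->
  <xsl:template match="db:section[not(@xml:id)][parent::*]">
    <xsl:copy>
      <xsl:attribute name="xml:id"><xsl:value-of select="generate-id()"/></xsl:attribute>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Values returned by `generate-id()` are unique within one transformation run but are not guaranteed to be the same from run to run, so identifiers that must stay stable would need to be derived from something in the content instead.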
Re: XHTML to DocBook conversion
Post by Cosmin Duna »
Hi Vincent,
The conversion is based on XSLT, so you can change the output by modifying the add-on's XSLT stylesheets. After the add-on is installed, Oxygen saves its files in this directory: "C:\Users\user_name\AppData\Roaming\com.oxygenxml\extensions\v24.0\plugins\https_www.oxygenxml.com_InstData_Addons_default_updateSite.xml\oxygen-batch-converter-3.1.0".
The stylesheets are inside the "oxygen-batch-converter-3.1.0.jar" file in the "lib" directory; to edit them, open this jar in Oxygen's "Archive Browser" view.
The "stylesheets/docbook/xhtml2db4.xsl" file is used for the DocBook 4 conversion and the "stylesheets/docbook/xhtml2db5.xsl" file for DocBook 5. To keep the "class" attribute in the output, open one of these files and add this template:
Code:
<xsl:template match="@class">
    <xsl:copy-of select="."/>
</xsl:template>
You have to restart the application after making the modifications. After that, the "class" attributes will be preserved in the output of the HTML to DocBook conversion.
Regards,
Cosmin
Cosmin Duna
<oXygen/> XML Editor
http://www.oxygenxml.com
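To illustrate the div-to-section mapping raised earlier in the thread, here is a minimal standalone sketch. It is independent of the add-on's own stylesheets, whose internal templates are not shown here, so it is not a drop-in patch for "xhtml2db5.xsl". It assumes the XHTML fragments are wrapped in an element that declares the XHTML namespace, that each classed div has a single `h2` child serving as its title, and that `role` (a common DocBook attribute) is an acceptable home for the class value; all three are assumptions made for the example.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:h="http://www.w3.org/1999/xhtml"
    xmlns="http://docbook.org/ns/docbook"
    exclude-result-prefixes="h">

  <!-- Hypothetical rule: a div carrying a class becomes a DocBook section,
       with the class value kept in the "role" attribute. -->
  <xsl:template match="h:div[@class]">
    <section role="{@class}">
      <!-- Assumption: the single h2 child of the div holds the section title. -->
      <title><xsl:value-of select="h:h2[1]"/></title>
      <!-- Process the remaining children; h2 was consumed as the title. -->
      <xsl:apply-templates select="node()[not(self::h:h2)]"/>
    </section>
  </xsl:template>

  <!-- XHTML paragraphs map to DocBook para. -->
  <xsl:template match="h:p">
    <para><xsl:apply-templates/></para>
  </xsl:template>

</xsl:stylesheet>
```

With these rules, a fragment such as `<div class="features"><h2>Features</h2><p>...</p></div>` (the class name is made up) would come out as a `section` with `role="features"`, a `title`, and a `para`. A real mapping would need one rule per class that matters and a decision about what to do with divs that have no heading.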