How to enforce the encoding in the output html
-
- Posts: 7
- Joined: Wed Apr 25, 2018 2:24 pm
How to enforce the encoding in the output html
Post by Tarja Koski »
Hello,
I'm using Oxygen XML Editor to create HTML Help (CHM) output from DITA source files. I am using a customized htmlhelp plugin based on DITA-OT 2.5.4. It worked fine with Oxygen XML Editor 23.1. I have now installed Oxygen XML Editor 26.1 and modified my htmlhelp plugin so that it works with the latest DITA-OT included in Oxygen XML Editor 26.1. I can successfully build the CHM file and everything looks fine.
There is only one problem. My source DITA files are in Finnish, so they contain a lot of Scandinavian characters. These are displayed correctly in the HTML Help viewer, in the table of contents, and on the Index tab. But when I type a word containing Scandinavian characters in the text field on the Search tab and press Enter, the result is "No topics found" even though such topics exist.
The generated HTML files now seem to have charset=utf-8, whereas with the older DITA-OT it was charset=iso-8859-1. Could this be the reason why the full-text search cannot find the Scandinavian characters? Is there a way to force my htmlhelp plugin to set the character encoding to iso-8859-1 and generate the HTML files accordingly?
Any help would be much appreciated, thank you.
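Editor's note: to confirm which charset the generated HTML files actually declare, a small script can scan the output folder. This is a minimal sketch, not part of the original post; the function names and the output folder path are hypothetical, and it only reads the charset declared in each file's first 4 KB rather than detecting the real byte encoding.

```python
import re
from pathlib import Path

# Matches both <meta charset="..."> and the http-equiv form
# <meta http-equiv="Content-Type" content="text/html; charset=...">.
CHARSET_RE = re.compile(rb'charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE)


def declared_charset(html_bytes):
    """Return the first charset declared in the first 4 KB, lowercased, or None."""
    match = CHARSET_RE.search(html_bytes[:4096])
    return match.group(1).decode("ascii").lower() if match else None


def report_charsets(output_dir):
    """Map every file matching *.htm* under output_dir to its declared charset."""
    results = {}
    for path in sorted(Path(output_dir).rglob("*.htm*")):
        results[str(path)] = declared_charset(path.read_bytes())
    return results
```

Calling `report_charsets("out/htmlhelp")` (the folder name is an assumption) returns a dictionary of file paths and declared charsets, which makes it easy to compare the output of the old and new DITA-OT builds.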
Re: How to enforce the encoding in the output html
Hello Tarja,
I think you are right that the switch to charset=utf-8 is the cause. Oxygen applies some patches to the DITA Open Toolkit engine, and one of them, created a long time ago, makes the generated HTML files use UTF-8. If I recall correctly, it was meant to fix a problem with generating CHM output containing Greek letters. I added an internal issue to remove this patch, as it also seemed to cause problems for some of our Chinese users.
A possible hackish workaround:
- Close Oxygen.
- Using a tool like 7-Zip, open the JAR library "OXYGEN_INSTALL_DIR/frameworks/dita/DITA-OT/plugins/com.oxygenxml.dost.patches/lib/oxygen-dost-patches.jar". Inside the JAR, find the file "org/dita/dost/util/codepages.xml", remove it, and then save the JAR archive.
- Then start Oxygen and try to publish again.
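Editor's note: the JAR edit in the steps above can also be scripted instead of done by hand in 7-Zip. The following is a minimal sketch under stated assumptions, not an official Oxygen tool: the function name `remove_jar_entry` is hypothetical, Oxygen must be closed while it runs, and the JAR path must be adjusted to the real installation folder. It rewrites the archive without the entry and keeps a `.bak` copy of the original.

```python
import shutil
import zipfile
from pathlib import Path

# Entry named in the workaround above.
ENTRY = "org/dita/dost/util/codepages.xml"


def remove_jar_entry(jar_path, entry=ENTRY):
    """Rewrite jar_path without `entry`, keeping a .bak copy of the original.

    Returns True if the entry was found and removed, False otherwise.
    """
    jar_path = Path(jar_path)
    backup = jar_path.with_name(jar_path.name + ".bak")
    tmp = jar_path.with_name(jar_path.name + ".tmp")
    removed = False
    # ZIP archives cannot delete entries in place, so copy everything
    # except the unwanted entry into a temporary archive.
    with zipfile.ZipFile(jar_path) as src, \
            zipfile.ZipFile(tmp, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            if item.filename == entry:
                removed = True
                continue
            dst.writestr(item, src.read(item.filename))
    if removed:
        shutil.copy2(jar_path, backup)  # keep the original for rollback
        tmp.replace(jar_path)
    else:
        tmp.unlink()  # nothing to change; discard the temporary copy
    return removed
```

To roll back, rename the `.bak` file over the modified JAR.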
Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
Re: How to enforce the encoding in the output html
Post by Tarja Koski »
Hello Radu,
This worked! The HTML files now have charset=iso-8859-1, the Scandinavian characters are displayed correctly, and the full-text search finds them.
Thank you very much!
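Editor's note: the thread does not spell out why the charset matters for search. One plausible reading, which is an assumption and not confirmed here, is that the search index and the typed search term end up in different byte encodings. The sketch below only illustrates that the same word has different bytes in UTF-8 and ISO-8859-1; the function name and the sample word are hypothetical.

```python
def encoded_forms(word):
    """Return the UTF-8 and ISO-8859-1 byte sequences for the same word."""
    return word.encode("utf-8"), word.encode("iso-8859-1")


utf8_bytes, latin1_bytes = encoded_forms("käyttöohje")  # sample Finnish word

# The non-ASCII letters take two bytes in UTF-8 but one in ISO-8859-1,
# so a byte-level comparison between the two forms cannot match.
print(utf8_bytes)
print(latin1_bytes)
print(latin1_bytes in utf8_bytes)

# UTF-8 bytes read as ISO-8859-1 show the familiar mojibake.
print(utf8_bytes.decode("iso-8859-1"))
```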
Re: How to enforce the encoding in the output html
Hi Tarja,
Great, thanks for the feedback. The official fix will be included in the DITA-OT bundled with Oxygen 27 (November this year).
Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
-
- Posts: 601
- Joined: Wed Oct 16, 2019 3:47 pm
Re: How to enforce the encoding in the output html
Post by julien_lacour »
Hello,
Oxygen 27.0 is now available. In this version, the encoding for Scandinavian characters has been fixed.
Regards,
Julien