Word to DITA- image names and alt text
-
- Posts: 22
- Joined: Mon Jul 09, 2012 5:30 pm
Word to DITA- image names and alt text
Hi,
Hopefully I have the correct forum group. I'm working on converting a Word .docx document into XML. I followed the instructions by Radu at http://www.oxygenxml.com/forum/post28564.html#p28564 which have been very helpful. I have two questions about getting image info converted correctly:
1) After opening my Word document and viewing the document.xml file via the Archive Browser I see the following entry for an image and the alt text that I input into the Word doc:
<wp:docPr id="249" name="Picture 1" descr="Content Rating Description" title="Content Rating title"/>
After running the DOCX DITA transform the topic has:
<image href="media/image1.png"><alt>media/image1.png</alt></image>
So it doesn't keep the alt title that I had.
I noticed C:\Program Files (x86)\Oxygen XML Editor 15\frameworks\dita\DITA-OT\plugins\net.sourceforge.dita4publishers.word2dita\xsl\simple2dita.xsl file has this entry which is what I assume I'll need to update, I just don't know what to update it to:
<image href="{$imageUrl}">
<alt><xsl:sequence select="$imageUrl"/></alt>
</image>
Anyone know what I should change this to?
2) Is there a way to get more meaningful names assigned to pictures? In the media folder they are all called image1.png, image2.png, and so on. I looked in Word to figure out how to name them but couldn't find anything. We can just manually rename them in the folder, but I was curious whether someone had found a way to do it in Word.
Thanks,
Belinda
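For anyone who wants to check which alt text Word actually stored before touching the stylesheets, here is a minimal sketch in Python. It only reads the wp:docPr attributes shown in the post above from word/document.xml inside the .docx archive. The function name read_image_alt_text is made up for this illustration and is not part of Oxygen or the word2dita plugin, and the sketch does not change the conversion itself.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace bound to the "wp" prefix in WordprocessingML drawings.
WP_NS = "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"


def read_image_alt_text(docx):
    """Return one dict per wp:docPr element found in word/document.xml.

    `docx` may be a file path or a binary file-like object.
    Hypothetical helper, written only for this illustration.
    """
    with zipfile.ZipFile(docx) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    entries = []
    for doc_pr in root.iter(f"{{{WP_NS}}}docPr"):
        entries.append({
            "id": doc_pr.get("id"),
            "name": doc_pr.get("name"),
            "descr": doc_pr.get("descr"),  # the alt text description
            "title": doc_pr.get("title"),  # the alt text title
        })
    return entries
```

Calling read_image_alt_text("mydoc.docx") (the file name is a placeholder) returns a list that can be compared by hand against the <alt> elements in the converted topic.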
-
- Posts: 9420
- Joined: Fri Jul 09, 2004 5:18 pm
Re: Word to DITA- image names and alt text
Hi Belinda,
I am not very familiar with the Word to DITA processing, so my advice would be to post these questions on the DITA Users List Yahoo group. Eliot Kimber, the expert who developed the Word to DITA plugins, is registered there and may be able to help further.
From what I can see, in the XSL:
OXYGEN_INSTALL_DIR/frameworks/dita/DITA-OT/plugins/net.sourceforge.dita4publishers.word2dita/xsl/wordml2simple.xsl
there are several places where <image> tags are generated in a special namespace by matching certain MS Office elements.
Then, in the simple2dita.xsl stylesheet you found, the image elements generated in that first step are processed further to produce the DITA image tags.
Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
-
- Posts: 35
- Joined: Wed Oct 26, 2016 8:19 pm
- Location: India
Re: Word to DITA- image names and alt text
Hi Radu, is there a solution for this? After conversion, images should take the alt text from the Word document as their names. Or is there a way to rename the images so that the references in the XML files get updated as well?
-Sathya
-
- Site Admin
- Posts: 125
- Joined: Wed Dec 12, 2018 5:33 pm
Re: Word to DITA- image names and alt text
Post by Cosmin Duna
Hi Sathya,
This is an old discussion thread. In the meantime, we created an add-on named Batch Documents Converter that includes a Word to DITA conversion.
More information about the add-on is available here: https://www.oxygenxml.com/doc/versions/ ... addon.html
This conversion should preserve alternate text on images.
As far as I know, Word doesn't keep the original names of the images in its internal structure. So, after the conversion, you have to use the "Rename resource" refactoring action to rename the images and update the references to them. See this documentation topic for more information: https://www.oxygenxml.com/doc/versions/ ... _resources
We also have a webinar that presents various migrations, including Word to DITA, and the refactoring actions that can be applied after conversion: https://www.oxygenxml.com/events/2021/w ... oring.html
Best regards,
Cosmin
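For readers working outside Oxygen, the rename-and-update idea can be sketched as a small Python script. This is not the "Rename resource" action and makes no claim about how that action works internally. It assumes the .dita topics sit directly in one folder, that image hrefs are written relative to that folder with double quotes, and it uses a plain text replacement rather than an XML parser. The function name rename_image and the new file name in the usage note are made up for this illustration.

```python
from pathlib import Path


def rename_image(topic_dir, old_href, new_href):
    """Rename one image and rewrite matching href attributes in the topics.

    Assumptions (not taken from Oxygen): topics are *.dita files directly
    in `topic_dir`, and both hrefs are relative to that folder.
    Returns the list of topic files that were changed.
    """
    root = Path(topic_dir)
    old_path = root / old_href
    new_path = root / new_href
    if new_path.exists():
        # Refuse to overwrite an existing image.
        raise FileExistsError(str(new_path))
    old_path.rename(new_path)

    changed = []
    for topic in sorted(root.glob("*.dita")):
        text = topic.read_text(encoding="utf-8")
        # Only href attributes are rewritten; element text is left alone.
        updated = text.replace(f'href="{old_href}"', f'href="{new_href}"')
        if updated != text:
            topic.write_text(updated, encoding="utf-8")
            changed.append(topic)
    return changed
```

For example, rename_image("out", "media/image1.png", "media/content-rating.png") would rename the file and update every topic in the out folder that points to it. Working on a copy of the converted output is advisable, since the text replacement is deliberately simple.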
Cosmin Duna
<oXygen/> XML Editor
http://www.oxygenxml.com