How to find special characters?
One of the documents I'm editing in Oxygen 21.1 triggers the "Special characters detected" warning message. I would like to search for those characters to check whether they are "legitimate" document content or the result of a transformation problem (it's a converted docx, so it can contain a variety of ugly content...).
So, is there a way (regex search) to detect all characters / Unicode control codes that may trigger the "Special characters detected" message?
The only "foreign" characters I was able to identify in the document were some Greek letters, but since Greek doesn't require bidirectional text layout, I doubt they are responsible for triggering the message.

Re: How to find special characters?
Hi,
I'm afraid we do not yet have a way in the application to report which complex characters were detected. Usually the message is triggered when characters combine (the font may render one symbol for multiple characters). This means, for example, that when you move the cursor with the arrow keys, special code is triggered to jump over the combining characters as if they were one symbol.
Enabling the support for complex characters usually slows down opening and editing the document.
There is an Oxygen GitHub project containing lots of sample plugins which you can download as a zip:
https://github.com/oxygenxml/wsaccess-j ... le-plugins
I just uploaded a plugin folder there called "determineComplexLayoutChars", which can be copied to the "OXYGEN_INSTALL_DIR\plugins" folder. After you restart Oxygen, the plugin adds a new contextual menu action when an XML document is opened in the Text editing mode. This new "Determine Complex Layout Chars" action should run a detection and then report all such characters in the results view.
Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
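To make the "characters combine" case above concrete, here is a minimal Python sketch. It is not part of Oxygen or of the sample plugin; the function name `combining_marks` and the sample string are invented for illustration. It only shows that a base letter followed by a combining mark is stored as two code points even though it renders as one symbol.

```python
import unicodedata


def combining_marks(text):
    """Return (index, code point, name) for every combining mark in text."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(text)
        # General categories Mn, Mc and Me are the Unicode "Mark" classes.
        if unicodedata.category(ch).startswith("M")
    ]


# 'e' followed by COMBINING ACUTE ACCENT: displayed as a single accented letter.
sample = "cafe\u0301"

print(len(sample))              # 5 code points, although 4 symbols are displayed
print(combining_marks(sample))  # [(4, 'U+0301', 'COMBINING ACUTE ACCENT')]
```

An editor that moves the cursor by code point would stop between the "e" and its accent, which is the situation the special cursor handling is meant to avoid.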
Re: How to find special characters?
Post by patjporter »
Hello, can you please provide specific instructions on how to download these files and install them on a Mac?
Thank you,
Patrick
Re: How to find special characters?
Hi Patrick,
Download a zip containing the entire project contents:
https://github.com/oxygenxml/wsaccess-j ... master.zip
Inside the zip there are folders; each folder is an Oxygen plugin.
Copy, for example, the "determineComplexLayoutChars" folder to the "OXYGEN_INSTALL_DIR\plugins" folder and then restart Oxygen.
Open an XML document in the Text editing mode and right-click inside it; there is a new menu item, "Determine Complex Layout Chars".
Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
Re: How to find special characters?
The menu item appears, but selecting it has no effect for me in XML Editor 26.0, build 2023100905 on macOS Sequoia 15.0.
After installing the plugin and restarting Oxygen, I open a file that gives the warning about "The document ... contains bidirectional text (such as Arabic or Hebrew), South/South-Eastern Asian text, or special characters (such as combining characters) that require special handling to ensure proper editing."
When I right-click in text mode and select Determine Complex Layout Chars, nothing happens. The same is true if I select all the text or manually insert some bidi text.
Regards,
David
Re: How to find special characters?
Hi David,
This sample plugin, which I created about 5 years ago, used non-API Java code to test for such bidi characters. After we upgraded from Java 8 to newer Java versions, those versions reject the use of non-API Java code by default.
I just updated the JavaScript code of the plugin to avoid the non-API Java code, so maybe you can update your plugin with the new code:
https://github.com/oxygenxml/wsaccess-j ... sAccess.js
Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
Re: How to find special characters?
Ok thanks! I also found them using Find in files and the following regular expression:
[\p{M}\p{IsArabic}\p{IsHebrew}\p{IsHan}\p{IsHiragana}\p{IsKatakana}\p{IsHangul}\p{IsThai}\p{IsLao}\p{IsKhmer}]
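For checking files outside Oxygen, the same idea can be sketched in Python. This is only an approximation of the regular expression above, not an equivalent: Python's standard `re` module does not support `\p{...}` classes, so the sketch tests the Unicode general category for `\p{M}` and, as a stated assumption, matches Unicode character names by prefix as a rough stand-in for the script classes. The names `SCRIPT_NAME_PREFIXES`, `classify` and `find_complex_chars` are hypothetical, and the result is not guaranteed to match what triggers Oxygen's warning.

```python
import unicodedata

# Rough stand-ins for the script classes in the regex above. Matching on
# Unicode character names is an assumption of this sketch, not an exact
# equivalent of \p{IsArabic}, \p{IsHan}, and so on.
SCRIPT_NAME_PREFIXES = (
    "ARABIC", "HEBREW", "CJK", "HIRAGANA", "KATAKANA",
    "HANGUL", "THAI", "LAO", "KHMER",
)


def classify(ch):
    """Return a short reason if ch looks like a complex-layout character, else None."""
    if unicodedata.category(ch).startswith("M"):  # same idea as \p{M}
        return "combining mark"
    name = unicodedata.name(ch, "")
    for prefix in SCRIPT_NAME_PREFIXES:
        if name.startswith(prefix + " "):
            return prefix
    return None


def find_complex_chars(text):
    """Yield (line, column, code point, reason) for each flagged character (1-based)."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            reason = classify(ch)
            if reason:
                yield line_no, col, f"U+{ord(ch):04X}", reason


# Hypothetical sample: one line with Hebrew text, one with a combining accent.
sample = "<p>abc \u05e9\u05dc\u05d5\u05dd</p>\n<p>e\u0301</p>"
for hit in find_complex_chars(sample):
    print(hit)
```

Greek letters are not flagged by this sketch, which is consistent with the original poster's suspicion that they were not the cause of the warning.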