RegEx to find text in a map
-
- Posts: 20
- Joined: Mon Aug 04, 2014 5:18 pm
RegEx to find text in a map
Hi folks,
I'm terrible at RegEx and I'm hoping someone here can help me. I have a bunch of maps that have topicrefs that contain two elements, a difficulty and a topicsubject, like this:
Code: Select all
<topicref href="../questions/cfaL1_question_00086_13.dita">
<topicsubject keyref="cfa_l1_los_567"/>
<difficulty value="intermediate"/>
</topicref>
Each map could have one or a hundred of these topicrefs. I'm trying to do a find that will give me each one of these as a separate result. I have tried a variety of regex, but each time, the result gives me the first <topicref all the way down to the last </topicref>, so it's not terribly helpful. This is the code I've been using:
Code: Select all
(topicref href=").*[">][\r\n][ ]*(<topicsubject keyref=").*["\/>][\r\n][ ]*(<difficulty value=").*[">][\r\n][ ]*(<\/topicref>)
Does anyone have any tips on how to have each <topicref> container return a result?
Thanks!
Peyton
-
- Posts: 9421
- Joined: Fri Jul 09, 2004 5:18 pm
Re: RegEx to find text in a map
Hi Peyton,
By default, regular expression quantifiers like ".*" are greedy. You can make them non-greedy by appending a "?":
https://stackoverflow.com/questions/230 ... xpressions
Anyway, when it comes to searching for XML structure, you should try our XPath Builder view. You can change its scope to run on multiple files and run a simple XPath expression like this one:
Code: Select all
//topicref[topicsubject[@keyref]][difficulty[@value]]
Why is a regexp not a good idea when searching for XML content? Because you may have situations like this:
Code: Select all
<topicsubject keyref="cfa_l1_los_567"></topicsubject>
The tag is written in expanded form, but it is equivalent to the collapsed form, and it is difficult to express this with a regexp.
Or you may have XML comments somewhere inside the topicref, and again you need to account for this in the regexp, leading to ugly, hard-to-understand expressions which may not find all cases.
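To make the greedy versus non-greedy point concrete, here is a minimal sketch in Python. Python's `re` module is used purely for illustration and the regex engine behind the Find/Replace dialog may differ in details. The sample map and its file names are made up for this example.

```python
import re

# Hypothetical map fragment modeled on the topicrefs in this thread.
# The second topicsubject is written in expanded form on purpose.
SAMPLE = """<map>
  <topicref href="../questions/q1.dita">
    <topicsubject keyref="los_1"/>
    <difficulty value="easy"/>
  </topicref>
  <topicref href="../questions/q2.dita">
    <topicsubject keyref="los_2"></topicsubject>
    <difficulty value="intermediate"/>
  </topicref>
</map>"""

# Greedy: ".*" (with DOTALL) runs on to the LAST </topicref>,
# so the whole list comes back as one big match.
greedy = re.findall(r"<topicref\b.*</topicref>", SAMPLE, flags=re.DOTALL)

# Non-greedy: ".*?" stops at the FIRST </topicref> after each start tag,
# so every topicref is a separate match.
lazy = re.findall(r"<topicref\b.*?</topicref>", SAMPLE, flags=re.DOTALL)

# A pattern that insists on the collapsed form <topicsubject .../>
# silently skips the topicref that uses the expanded form.
strict = re.findall(
    r"<topicref\b[^>]*>\s*<topicsubject\b[^>]*/>.*?</topicref>",
    SAMPLE,
    flags=re.DOTALL,
)

print(len(greedy), len(lazy), len(strict))
```

The last pattern shows why a regexp is fragile here: the two topicrefs are equivalent as XML, but only one of them is found.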
Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
-
- Posts: 20
- Joined: Mon Aug 04, 2014 5:18 pm
Re: RegEx to find text in a map
Actually, it's not exactly what I needed (I just checked the XML doc). I see the topicref, but is there a way to pull that whole chunk? I really just need to know the file name and the difficulty value.
-
- Posts: 9421
- Joined: Fri Jul 09, 2004 5:18 pm
Re: RegEx to find text in a map
Hi,
Two possible ways:
1) Run an XPath like this:
Code: Select all
//topicref[topicsubject[@keyref]][difficulty[@value]]/concat(@href, ' - ', difficulty/@value)
The problem is that double-clicking a result will not take you to the place where the topicref is located.
2) In the Find/Replace in Files dialog, set the expression to find as:
Code: Select all
(.*)
Check the "Regular Expressions" and "Dot matches all" checkboxes.
Set the "Restrict to XPath" field to this value:
Code: Select all
//topicref[topicsubject[@keyref]][difficulty[@value]]
Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
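For anyone who wants the same "file name - difficulty" list outside the editor, the XPath in option 1 can be approximated with Python's standard `xml.etree.ElementTree`. This is only a sketch under assumptions: the sample map, its file names, and the helper name `href_and_difficulty` are hypothetical, and the map is assumed to use no namespaces.

```python
import xml.etree.ElementTree as ET

# Hypothetical map; the third topicref has no difficulty and should be skipped.
SAMPLE = """<map>
  <topicref href="../questions/q1.dita">
    <topicsubject keyref="los_1"/>
    <difficulty value="easy"/>
  </topicref>
  <topicref href="../questions/q2.dita">
    <topicsubject keyref="los_2"></topicsubject>
    <!-- a comment inside the topicref does not disturb the parser -->
    <difficulty value="intermediate"/>
  </topicref>
  <topicref href="../questions/q3.dita">
    <topicsubject keyref="los_3"/>
  </topicref>
</map>"""


def href_and_difficulty(map_xml):
    """Return 'href - difficulty' for each topicref that has both a
    topicsubject with @keyref and a difficulty with @value."""
    root = ET.fromstring(map_xml)
    rows = []
    for ref in root.iter("topicref"):
        # ElementTree supports the [@attrib] predicate on child lookups.
        subject = ref.find("topicsubject[@keyref]")
        difficulty = ref.find("difficulty[@value]")
        if subject is None or difficulty is None:
            continue
        rows.append(f"{ref.get('href', '')} - {difficulty.get('value')}")
    return rows


for row in href_and_difficulty(SAMPLE):
    print(row)
```

Because the file is parsed as XML, the expanded `<topicsubject></topicsubject>` form and the comment are handled without any special cases.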