Fast way to get total count of found items, resources, and sources
- Posts: 3
- Joined: Fri Apr 07, 2023 1:24 am
Fast way to get total count of found items, resources, and sources
Post by natemyersibm »
Hello!
In OxygenXML Author 25.0, I'm looking for a fast and simple way to see the total number of search results, the number of files that contain them, and the number of sources those files come from. I searched the oxygenxml website and forum, and I watched a related video, but I didn't quite see what I'm looking for.
Details:
- Open/Find Resource gives a count of matching resources, but not an obvious total count of found items when 'In content' is selected.
- Find/Replace in Files gives a count of found items, but not an obvious total count of matching resources.
- In Preferences, Open/Find Resource prefs include a mandatory search results limiter. Fortunately, in Open/Find Resource a count still appears of total matching resources despite the limiter, but only the limited number appear in the result set. This limit can be out of proportion with web-based search results that hit the same doc collection.
Requests:
- One string that gives a total count of found items in a total count of resources. Example: "452 items found in 126 files".
- An option similar to the above that includes multiple linked sources, along with a details option that could offer a breakdown:
  - Example: 17,362 items found in 887 files across 3 sources (Details)
    - 13,002 items found in 686 files in source MyProject
    - 3,300 items found in 100 files in source OtherTeam'sProject
    - 1,060 items found in 101 files in source OtherDepartment'sProject
- In Open/Find Resources, a string that shows how many times an item is found in each matching resource.
- An option (maybe a checkbox) associated with the results limiter in Open/Find Resource prefs to page large found sets if they exceed the limit. Example: "Use multiple pages to display all results". Even better would be to just do this by default. That way, you can keep the Open/Find Resource window performant and responsive by displaying the first X results (where X is a settable integer, as it is today) and then lazy-loading subsequent found items across additional 'pages' (separated lists) within the Open/Find Resource window.
Re: Fast way to get total count of found items, resources, and sources
Hi,
When you use the "In content" search in the Open/Find Resources dialog, Oxygen returns the files that contain the searched string, whether it occurs once or several times. It does not report how many times the string occurs in each of those files.
Please see some more remarks below:
"In Preferences, Open/Find Resource prefs include a mandatory search results limiter. Fortunately, in Open/Find Resource a count still appears of total matching resources despite the limiter, but only the limited number appear in the result set. This limit can be out of proportion with web-based search results that hit the same doc collection."
We impose this limit on the returned search results for performance reasons, and you can change it from the preferences page. I do not understand the "This limit can be out of proportion" part. Searching behaves differently in different places: a web search (depending on the search engine) may, for example, split the results into pages, and it may also limit the total number of found items. We chose not to display the "Open/Find Resource" results in pages and to impose this global limit on the returned search results.
About this request:
"In Open/Find Resources, a string that shows how many times an item is found in each matching resource."
The indexer we are using does not have this capability; it just returns the resources that contain the searched string one or more times. Indexers generally work like this because they are optimized for speed. For example, when you search for a word with Google, it returns a link to a web page but does not tell you how many times the word occurs in that page.
The intended purpose of the "Open/Find Resources" dialog is to find resources containing certain words. You seem to want to use it to create some kind of report, but I'm afraid we cannot change the "Open/Find Resources" dialog to also display the number of matches per file.
"An option (maybe a checkbox) associated with the results limiter in Open/Find Resource prefs to page large found sets if they exceed the limit. Example: "Use multiple pages to display all results". Even better would be to just do this by default. That way, you can keep the Open/Find Resource window performant and responsive by displaying the first X results (where X is a settable integer, as it is today) and then lazy-loading subsequent found items across additional 'pages' (separated lists) within the Open/Find Resource window."
Sounds like an interesting improvement request. I added an internal issue based on it; the issue ID is pasted below for future reference:
EXM-53082 Open/Find Resources - present results in pages instead of imposing returned results limit
"Find/Replace in Files gives a count of found items, but not an obvious total count of matching resources."
Yes, but here you may have more flexibility to create a custom report. For example, after Find/Replace in Files shows its matches in the Results view, you can select all of them, right-click, and choose "Save results as XML". Once you have the XML document containing all matches, maybe you can create an XSLT stylesheet that is applied to it to produce a report for you.
Searching with Find/Replace in Files is of course not identical to searching with Open/Find Resources: "Open/Find Resources" can find resources that contain a set of words even if the words are not consecutive in the original document...
Regards,
Radu
Radu Coravu
<oXygen/> XML Editor
http://www.oxygenxml.com
- Posts: 918
- Joined: Thu May 02, 2019 2:32 pm
Re: Fast way to get total count of found items, resources, and sources
Post by chrispitude »
If you save the results as XML as Radu suggested, the stylesheet below computes the total number of matches and lists each file with its match count, highest first. You could even paste the results XML and the stylesheet into the .NET XSLT Fiddle site and experiment interactively with different ways of querying the data.
Code:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xpath-default-namespace="http://www.oxygenxml.com/ns/report"
    exclude-result-prefixes="#all"
    version="3.0">
  <xsl:output indent="yes"/>
  <!-- remember matches by their file name -->
  <xsl:key name="matches-by-file" match="incident" use="systemID"/>
  <xsl:template match="report">
    <results>
      <!-- show total match count -->
      <total-count><xsl:value-of select="count(incident)"/></total-count>
      <!-- show matches by file, sorted by number of matches in file -->
      <count-by-file>
        <xsl:for-each-group select="incident" group-by="systemID">
          <xsl:sort select="count(key('matches-by-file', current-grouping-key()))" order="descending"/>
          <file href="{current-grouping-key()}">
            <count><xsl:value-of select="count(key('matches-by-file', current-grouping-key()))"/></count>
          </file>
        </xsl:for-each-group>
      </count-by-file>
    </results>
  </xsl:template>
</xsl:stylesheet>
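For anyone who would rather script the roll-up than run a transformation, the same idea can be sketched in Python. This is only a sketch under assumptions: it relies on the structure the stylesheet implies (a `report` root element in the `http://www.oxygenxml.com/ns/report` namespace, one `incident` child per match, each with a `systemID` child holding the file URL). The function name `summarize` and the sample data are made up for illustration and are not part of Oxygen.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace taken from the stylesheet; the element names (report, incident,
# systemID) are assumed to match the saved results file.
NS = {"r": "http://www.oxygenxml.com/ns/report"}


def summarize(xml_text):
    """Return (summary string, Counter of matches per file)."""
    root = ET.fromstring(xml_text)
    # One Counter entry per distinct file URL, counting its incidents.
    per_file = Counter(
        incident.findtext("r:systemID", default="", namespaces=NS).strip()
        for incident in root.findall("r:incident", NS)
    )
    total = sum(per_file.values())
    summary = f"{total:,} items found in {len(per_file):,} files"
    return summary, per_file


# Made-up sample standing in for a real "Save results as XML" export.
SAMPLE = """<report xmlns="http://www.oxygenxml.com/ns/report">
  <incident><systemID>file:/docs/a.dita</systemID></incident>
  <incident><systemID>file:/docs/a.dita</systemID></incident>
  <incident><systemID>file:/docs/b.dita</systemID></incident>
</report>"""

summary, per_file = summarize(SAMPLE)
print(summary)
# Files with the most matches first.
for href, count in per_file.most_common():
    print(f"{count:>6}  {href}")
```

With a real export you would read the file contents and pass them to `summarize`; if the saved XML uses different element names, the two lookups are the only lines that need to change.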
- Posts: 3
- Joined: Fri Apr 07, 2023 1:24 am
Re: Fast way to get total count of found items, resources, and sources
Post by natemyersibm »
Thanks Radu for the thorough reply, and thanks chrispitude for the sample XML stylesheet!
I used Open/Find Resource as a possible place to display roll-up data because it's available any time as a View, but maybe I can simplify my request by focusing on the Find/Replace in Files function and the Results window. Currently, when no 'Grouped by' option is selected, the Results window shows several column headers: Description - (number of items), Resource, System ID, and Location. The Results window contains all the information needed to calculate and display a roll-up count, so it would be great, especially for find operations across large collections, if the OxygenXML UI could present that roll-up data somewhere rather than requiring an export, stylesheet, or other custom report.
FWIW, it would also be great if the Results window could behave as other Views, and perhaps retain recent find operation tabs.
I'm making this request because the advent of LLMs seems to be quickly changing content development to become much more collaborative and business-oriented, and may give rise to a more DevOps-style culture (think "DocOps") that emphasizes content administration over authorship. In such an environment, instantaneous on-screen real-time roll-up data that can be used to rapidly assess and respond to content needs across large doc collections (e.g., in an online meeting) could become more important, and I'd like to be ready.
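To make the multi-source roll-up concrete, the calculation could look roughly like the Python sketch below. It is not anything Oxygen provides. It assumes each found item is available as a file URL (for example, the System ID column of the Results window) and that a "source" can be recognized by a URL prefix; the `SOURCES` mapping, the paths, and the function name `rollup` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping of source name -> URL prefix; adjust to your projects.
SOURCES = {
    "MyProject": "file:/work/MyProject/",
    "OtherTeam'sProject": "file:/work/OtherTeam/",
}


def rollup(system_ids, sources):
    """Build the summary line and the per-source detail lines.

    system_ids: one file URL per found item (duplicates are expected).
    sources: mapping of source name -> URL prefix (an assumption here).
    """
    items = defaultdict(int)  # source name -> number of found items
    files = defaultdict(set)  # source name -> distinct files with matches
    for url in system_ids:
        # First matching prefix wins; unmatched URLs go to "(other)".
        name = next((n for n, p in sources.items() if url.startswith(p)), "(other)")
        items[name] += 1
        files[name].add(url)

    total_items = sum(items.values())
    total_files = sum(len(f) for f in files.values())
    lines = [
        f"{total_items:,} items found in {total_files:,} files "
        f"across {len(items):,} sources"
    ]
    # Largest sources first, mirroring the breakdown in the example.
    for name in sorted(items, key=items.get, reverse=True):
        lines.append(
            f"- {items[name]:,} items found in {len(files[name]):,} "
            f"files in source {name}"
        )
    return lines


# Made-up matches for illustration only.
matches = [
    "file:/work/MyProject/a.dita",
    "file:/work/MyProject/a.dita",
    "file:/work/MyProject/b.dita",
    "file:/work/OtherTeam/c.dita",
]
print("\n".join(rollup(matches, SOURCES)))
```

The point of the sketch is that a single pass over the found items is enough to produce both the one-line summary and the breakdown, which is why it seems feasible to show this directly in the UI.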
Last edited by natemyersibm on Fri May 05, 2023 9:41 pm, edited 1 time in total.