un-pretty print

Post by **xofer** » Fri Nov 03, 2006 2:00 am

I'm new to this and using 6.2 so maybe this has been taken care of.
Why can't I reverse the formatting of pretty print? I know about the Preferences/Editor/Format but from posts I've seen it looks like it takes manual customization to get rid of the whitespace-only text nodes.
I just want to go back to the way it was before I hit that button.
Thanks

Post by **george** » Fri Nov 03, 2006 10:39 am

Hi,

You can look at the format and indent action as a function that takes the initial document as input and outputs the document after the formatting and indenting is applied. There is an infinite number of input documents that give the same output. Here is an example, where _ stands for a space:

<a>_
__<b/>
</a>

and

<a>__
___<b/>
</a>

will give the same result after format and indent (with the default settings).
Therefore it is impossible to determine what the document was before the format and indent was applied.
However, oXygen has undo/redo support, so if you want to go back to the document as it was before a format and indent operation, use the Undo action.
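
To make the many-to-one argument concrete, here is a minimal sketch in Python. It is not how oXygen implements format and indent; `format_and_indent` is a hypothetical stand-in built on the standard library's `xml.etree.ElementTree.indent` (Python 3.9 or later), and the two strings are the two documents above with the underscores written as real spaces.

```python
import xml.etree.ElementTree as ET


def format_and_indent(source: str) -> str:
    """Hypothetical stand-in for a format and indent action (not oXygen's code)."""
    root = ET.fromstring(source)
    # Whitespace-only text between elements is replaced by fresh indentation.
    ET.indent(root, space="  ")
    return ET.tostring(root, encoding="unicode")


# The two example documents, with "_" written as a real space.
doc_one = "<a> \n  <b/>\n</a>"
doc_two = "<a>  \n   <b/>\n</a>"

same_input = doc_one == doc_two
same_output = format_and_indent(doc_one) == format_and_indent(doc_two)

print(same_input)   # False: the inputs differ in their whitespace
print(same_output)  # True: the formatted results are identical
```

Because both inputs collapse to the same output, nothing in the formatted text records which one it came from, so the only reliable way back is to keep the original (undo, or a saved copy).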

Best Regards,
George

Post by **xofer** » Fri Nov 03, 2006 4:37 pm

Thanks for the reply. Your explanation matches one I found here >> http://www.xmltraining.biz/whitespace
I like using pretty print while I'm working for navigation purposes so having to immediately undo it after applying isn't ideal.
Is the solution in the link above (adding the 'strip-space' to the xslt) the way to go?

At the moment I'm the holdout in my company as everyone else likes xmlspy. When I pretty print our documents, I not only double their length but some of the elements no longer work due to added space in such things as <footnote> links.

My only options at the moment are to not use pretty print in Oxygen or switch to xmlspy.

Thanks -Chris

Post by **george** » Fri Nov 03, 2006 6:10 pm

Hi Chris,

If you want the footnote element to be left as it is then you have a couple of possibilities. One is to add the attribute xml:space="preserve" to the element you want left exactly as it is in the initial document. Another possibility is to add footnote to the list of preserve space elements (that is a list of XPath expressions with some limitations), see Options -> Preferences -> Editor -> Format -> XML -> Preserve space elements. You can press F1 once you are on that preference page to get the contextual help that describes the options on that page.
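
As a rough illustration of the first possibility, the sketch below shows how a formatter can honor xml:space="preserve". It is an assumption about the general technique, not oXygen's implementation; `indent_respecting_preserve` and the sample document are hypothetical names made up for this example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:space under the reserved XML namespace.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def indent_respecting_preserve(elem, level=0, space="  "):
    """Indent element-only content, leaving xml:space="preserve" subtrees untouched."""
    if len(elem) == 0 or elem.get(XML_SPACE) == "preserve":
        return
    child_indent = "\n" + space * (level + 1)
    # Only whitespace-only text is rewritten; real text is never changed.
    if not (elem.text or "").strip():
        elem.text = child_indent
    for child in elem:
        indent_respecting_preserve(child, level + 1, space)
        if not (child.tail or "").strip():
            child.tail = child_indent
    # Dedent after the last child so the closing tag lines up with the opening tag.
    last = elem[-1]
    if not (last.tail or "").strip():
        last.tail = "\n" + space * level


source = (
    '<doc><footnote xml:space="preserve"><link/> <link/></footnote>'
    "<section><p/></section></doc>"
)
root = ET.fromstring(source)
indent_respecting_preserve(root)
print(ET.tostring(root, encoding="unicode"))
```

In the printed result the section element is indented, while the single space between the two link elements inside footnote is exactly as it was in the input.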

Best Regards,
George

Post by **xofer** » Fri Nov 03, 2006 6:41 pm

Thanks George. Your solution may be the one I apply. One of my co-workers also suggested setting the 'line width' to zero in Preferences/Editor/Format. While not as 'pretty' as I like, it does break a footnote in a long sentence right after the first tag instead of somewhere else in the footnote, which was the problem.

One other question. Will it ever be an option to see all these added invisible elements that pretty print adds? I would think if the program knows what to insert, it should be able to identify them separately.

Thanks - Chris

Post by **george** » Fri Nov 03, 2006 6:56 pm

Hi Chris,

I am not sure I understand what you mean by invisible elements that are added by the format and indent... format and indent does not add XML elements to your document.
Anyway, you can save a copy of the file, apply the format and indent, and then compare the result with the saved copy without ignoring whitespace. You can choose Words as the diff algorithm to see exactly where spaces were added.

Best Regards,
George

Post by **xofer** » Fri Nov 03, 2006 7:20 pm

Hey George
I guess this is part of my ignorance. When I said invisible elements, I was going off of something I saw on another site >> "Whitespace consists of one or more space (#x20) characters, carriage returns (#xD), line feeds (#xA), or tabs (#x9). "
I realize these are added to affect the html but I thought when pretty print executed, it was adding tabs everywhere. Which is why I was asking why they couldn't be undone.

Does that make sense?

When I do the compare, how come it picks up on the added spaces when 'Words' is used?

Thanks - Chris

Post by **george** » Fri Nov 03, 2006 7:27 pm

Hi Chris,

The format and indent adds whitespace to the document, but again, an infinite number of input documents can have the same result document after format and indent, so it is not possible to have an unformat action because it cannot guess what the document was like before the format and indent was applied.

If you use an XML-based diff algorithm, it will not detect the spaces inside start tags, for instance when the format and indent splits a start tag by placing an attribute on a new line. That is because there is no difference at the XML level; a parser will see the same thing. If you use the Words algorithm, a word-level comparison is performed and you will also be able to see these whitespace changes.
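
The difference between the two kinds of comparison can be sketched in Python. This is only an approximation of the idea, not oXygen's diff algorithms: canonical XML from the standard library (`xml.etree.ElementTree.canonicalize`, Python 3.8 or later) stands in for an XML-level comparison, and `word_level_changes` is a hypothetical helper that compares words and whitespace runs as plain text.

```python
import difflib
import re
import xml.etree.ElementTree as ET

# The same element before and after a start tag is split across two lines.
before = '<a href="x" id="n1">text</a>'
after = '<a href="x"\n   id="n1">text</a>'

# XML-level view: a parser sees the same element, so the canonical forms match.
xml_level_equal = ET.canonicalize(before) == ET.canonicalize(after)


def word_level_changes(old: str, new: str):
    """Compare words and whitespace runs as plain text tokens (hypothetical helper)."""
    old_tokens = re.findall(r"\s+|\S+", old)
    new_tokens = re.findall(r"\s+|\S+", new)
    matcher = difflib.SequenceMatcher(None, old_tokens, new_tokens, autojunk=False)
    return [
        (tag, old_tokens[i1:i2], new_tokens[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


changes = word_level_changes(before, after)
print(xml_level_equal)  # True: no difference at the XML level
print(changes)          # the single space that became a newline plus indentation
```

The XML-level check reports no difference, while the text-level check reports one changed whitespace token, which is the kind of change the Words algorithm makes visible.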

Best Regards,
George