Deleting XML Elements recursively

Comrade · 5 February 2016 12:09

Hi guys,

I have an XML document that I’m filling out with data from Excel, but any elements that still have no value at the end need to be removed from the XML doc. I can delete empty elements using the XPath “//<myelement>//*[not(*)]” in the remove XML node operation, and that works fine on anything like <element></element> or <element/>. However, when I run this it only deletes the node, not the whitespace around it. So, for example, if I have:

and I use that XPath to delete XML nodes, it leaves me with

<element>

</element>

with whitespace in between <element> and </element>. If I run the same operation again, I would expect it to delete the <element> node (as that is now an empty element). However, that doesn’t work, I think because it is no longer classed as an empty element but as one with the whitespace as its value…

Is there a way to delete an element once the element nested inside it has been deleted?

jan · 22 August 2021 22:49

You could try

//*[normalize-space(.)=''][not(*)]

where [not(*)] matches elements without child elements and [normalize-space(.)=''] matches elements whose value is empty once whitespace is stripped.

So in your case the task would look like this:

[screenshot of the task]

After executing it the first time you will get:

<element> </element>

If you execute it a second time, you will get an empty document. You could obviously change the logic to repeat the remove operation until there are no more elements to be removed.
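
For anyone who wants to try the "repeat until nothing is left to remove" idea outside the tool, here is a rough sketch in Python. It is only an illustration under some assumptions: it uses the standard library's xml.etree.ElementTree, which does not support normalize-space(), so the XPath predicate is rewritten as a plain Python check. The function names and the sample document are made up for the example, and unlike the XPath above the root element itself is never removed.

```python
import xml.etree.ElementTree as ET

XML_WS = " \t\r\n"  # the characters XML treats as whitespace


def is_blank_leaf(elem):
    """No child elements and no text other than XML whitespace."""
    return len(elem) == 0 and not (elem.text or "").strip(XML_WS)


def remove_keeping_tail(parent, child):
    """Detach child but keep the text that follows it (its tail)."""
    idx = list(parent).index(child)
    tail = child.tail or ""
    if idx == 0:
        parent.text = (parent.text or "") + tail
    else:
        previous = parent[idx - 1]
        previous.tail = (previous.tail or "") + tail
    parent.remove(child)


def prune_pass(root):
    """One pass: remove every blank leaf below root, return the count."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {c: p for p in root.iter() for c in p}
    # Select first, then remove, like evaluating the XPath once.
    matches = [e for e in root.iter() if e is not root and is_blank_leaf(e)]
    for elem in matches:
        remove_keeping_tail(parent_of[elem], elem)
    return len(matches)


def prune_until_stable(root):
    """Repeat prune_pass until a pass removes nothing; return passes that removed something."""
    passes = 0
    while prune_pass(root):
        passes += 1
    return passes


# Hypothetical sample: <inner/> goes in pass 1, <element> in pass 2.
doc = ET.fromstring("<root><keep>1</keep><element>\n  <inner/>\n</element></root>")
passes = prune_until_stable(doc)
print(passes, ET.tostring(doc, encoding="unicode"))
```

Selecting all matches before removing any of them is what makes one pass behave like one run of the remove operation: a parent only becomes a candidate on the following pass.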

Comrade · 5 February 2016 16:30

Yep, got that working, thanks. I actually had a few elements without values but with attributes that I didn’t want deleted, so I used the XPath “//*[normalize-space(.)=''][not(*)][not(@*)]”, which did exactly what I wanted.
Thanks again
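
In case it helps anyone reading later, below is a small Python sketch of what the extra [not(@*)] condition changes. It is an approximation, not the tool's own behaviour: it assumes the standard library's xml.etree.ElementTree (which cannot evaluate normalize-space()), so the three predicates are written as one Python function, and the element names a to f are invented for the example.

```python
import xml.etree.ElementTree as ET

XML_WS = " \t\r\n"  # the characters XML treats as whitespace


def is_removable(elem):
    """Rough equivalent of //*[normalize-space(.)=''][not(*)][not(@*)]."""
    has_text = bool((elem.text or "").strip(XML_WS))
    has_children = len(elem) > 0
    has_attributes = bool(elem.attrib)
    return not has_text and not has_children and not has_attributes


# Hypothetical sample: <b> is empty but carries an attribute, so it is kept.
doc = ET.fromstring('<root><a/><b id="7"/><c> </c><d>text</d><e><f/></e></root>')
matched = [el.tag for el in doc.iter() if is_removable(el)]
print(matched)  # ['a', 'c', 'f']
```

Note that <e> is not matched here because it still contains <f/>; it would only qualify on a later pass, after <f/> has been removed.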