XML Diff
To run a basic comparison, pass both XML lists to a diff helper:
def xml1 = new XmlSlurper().parseText(new File('/path/to/file1').text).toList();
def xml2 = new XmlSlurper().parseText(new File('/path/to/file2').text).toList();
AbstractDiffHelper diffHelper = new XmlDiffHelper(xml1, xml2);
diffHelper.calcDiff();
assert diffHelper.isSimilar();
Although you can compare a single XML document at a time, the system was originally designed to compare result sets.
For a DB comparison, pass in a List<GroovyRowResult>.
For an XML comparison, pass in a manually created List<NodeChild>, or build one with XmlUtil.walkXmlByPath():
def xmlString = """
<result>
<id>getRegion</id>
<data>
<region>
<state>ON</state>
</region>
<region>
<state>QB</state>
</region>
<region>
<state>AB</state>
</region>
</data>
<errors/>
</result>
""";
...
// This produces a List<NodeChild> containing the "region" nodes.
def xml = XmlUtil.walkXmlByPath("data.region", xmlString);
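To make the result of that call more concrete, here is a minimal sketch of the same selection done with plain XmlSlurper and GPath. It is only an assumption that XmlUtil.walkXmlByPath() behaves roughly like this; the dotted-path handling below is illustrative and is not taken from the library's source.

```groovy
// Groovy 3+; on Groovy 2.x drop this import (groovy.util.XmlSlurper is imported by default).
import groovy.xml.XmlSlurper

def xmlString = '''
<result>
  <id>getRegion</id>
  <data>
    <region><state>ON</state></region>
    <region><state>QB</state></region>
    <region><state>AB</state></region>
  </data>
  <errors/>
</result>
'''

def root = new XmlSlurper().parseText(xmlString)

// Follow the dotted path one segment at a time, starting below the root element.
def selected = 'data.region'.split(/\./).inject(root) { node, name -> node."${name}" }

// list() turns the GPath result into a List of NodeChild objects.
def regions = selected.list()
def states = regions.collect { it.state.text() }

assert regions.size() == 3
assert states == ['ON', 'QB', 'AB']
```

Each entry of regions is one "region" element, which is the shape of list the diff helpers expect as input.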
For both diff helpers, the properties outputList1 and outputList2
contain the nodes that did not match.
NOTE! If at least one element inside an XML structure does not match,
the entire XML is added to the output.
<!-- XML1 -->
<result>
<id>getRegion</id>
<data>
<region>
<state>ON</state>
</region>
<region>
<state>QB</state>
</region>
<region>
<state>AB</state>
</region>
</data>
<errors/>
</result>
<!-- XML2 -->
<result>
<id>getRegion</id>
<data>
<region>
<state>ON</state>
</region>
<region>
<state>BC</state>
</region>
<region>
<state>MN</state>
</region>
</data>
<errors/>
</result>
...
// Region ON has been removed from the result set because it matches completely in both XMLs
log.info(diffHelper.outputList1.toString()); // [<region>QB</region>, <region>AB</region>]
log.info(diffHelper.outputList2.toString()); // [<region>BC</region>, <region>MN</region>]
When the system ignores a specified element, all other parts of the XML are still compared.
However, if you ignore an XML node, all of its internal elements are ignored as well.
At the moment, there are 3 properties that consume a List or Map of xmlPaths to the elements
that must be ignored (and a value, where applicable). In the future, the code will be modified to take
all parameters through a single property. You can also provide a closure
that accepts one NodeChild element and returns a Boolean value. In the closure you must check the XML path and/or value.
There are a few options for ignoring nodes during comparison:
- Ignore attributes
<!-- XML1 -->
<node attr1="val1" attr2="val2" attr3="val3">
<subNode>subNodeVal</subNode>
</node>
<!-- XML2 -->
<node attr1="val1-1" attr2="val2-1" attr3="val3">
<subNode>subNodeVal</subNode>
</node>
To make these XMLs match, attributes attr1 and attr2 have to be ignored:
diffHelper.ignoreAttrs = [
'@attr1',
'@attr2'
];
- Ignore nodes
<!-- XML1 -->
<node attr="val">
<subNode1>subNodeVal1</subNode1>
<subNode2>subNodeVal2</subNode2>
<subNode3>subNodeVal3</subNode3>
</node>
<!-- XML2 -->
<node>
<subNode1>subNodeVal1-1</subNode1>
<subNode2>subNodeVal2-2</subNode2>
<subNode3>subNodeVal3</subNode3>
</node>
To make these XMLs match, nodes subNode1 and subNode2 have to be ignored:
diffHelper.ignoreNodes = [
'subNode1',
'subNode2'
];
- Ignore Nodes/Attributes with values
<!-- XML1 -->
<node>
<subNode attr="val1">subNodeVal1</subNode>
<subNode attr="val2">subNodeVal2</subNode>
</node>
<!-- XML2 -->
<node>
<subNode attr="val1">subNodeVal1-1</subNode>
<subNode attr="val2">subNodeVal2</subNode>
</node>
To make these XMLs match, the subNode node whose attr attribute equals val1 has to be ignored:
diffHelper.ignoreNodesWValues = [
'subNode' : 'val1'
];
NOTE! The value can be a regular expression.
- Depth search
<!-- XML1 -->
<node attr="val1">
<subNode attr="subVal1">subNodeVal</subNode>
</node>
<!-- XML2 -->
<node attr="val1">
<subNode attr="subVal2">subNodeVal</subNode>
</node>
To make these XMLs match, the attr attribute that belongs to node subNode has to be ignored:
diffHelper.ignoreAttrs = [
'subNode.@attr'
];
The same rule applies to the other ignorable settings. If the node or attribute
that has to be ignored is unique within the XML tree, there is no need to specify any parent nodes
for that element. However, if more than one element matches the provided xmlPath,
all of those elements will be ignored.
- Ignorable Closure
<dealer>
<SpecialProperties>
<property>NVD</property>
<property>Special</property>
</SpecialProperties>
</dealer>
---
xdh.ignoreCommand = {NodeChild XML ->
return XML.name() == "property" && XML.localText()[0] == "NVD";
};
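To see which elements such a closure would select, it can be tried outside the diff helper. The sketch below uses only plain XmlSlurper and GPath; how the library itself applies ignoreCommand during comparison is not shown here and is not assumed.

```groovy
// Groovy 3+; on Groovy 2.x drop this import (groovy.util.XmlSlurper is imported by default).
import groovy.xml.XmlSlurper

def dealer = new XmlSlurper().parseText('''
<dealer>
  <SpecialProperties>
    <property>NVD</property>
    <property>Special</property>
  </SpecialProperties>
</dealer>
''')

// Same predicate as above: true for <property> elements whose own text is "NVD".
def ignoreCommand = { node ->
    node.name() == 'property' && node.localText()[0] == 'NVD'
}

// '**' walks the tree depth-first; findAll keeps the nodes the closure flags.
def flagged = dealer.'**'.findAll(ignoreCommand)

assert flagged.size() == 1
assert flagged[0].text() == 'NVD'
```

Checking the closure this way before assigning it to ignoreCommand helps confirm that it flags only the intended nodes.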
Sometimes the XML structure is unsorted. The system has 2 options that control how strictly order is compared.
<!-- XML1 -->
<result>
<id>getRegion</id>
<data>
<region>
<state>ON</state>
</region>
<region>
<state>QB</state>
</region>
<region>
<state>AB</state>
</region>
</data>
<errors/>
</result>
<!-- XML2 -->
<result>
<id>getRegion</id>
<data>
<region>
<state>ON</state>
</region>
<region>
<state>AB</state>
</region>
<region>
<state>QB</state>
</region>
</data>
<errors/>
</result>
Default behavior: compare without checking the order of the provided List<NodeChild>
AbstractDiffHelper diffHelperUnOrdered = new XmlDiffHelper(xml1, xml2);
diffHelperUnOrdered.orderlySafeMode = false; // Can be omitted
diffHelperUnOrdered.calcDiff();
assert diffHelperUnOrdered.isSimilar() == true;
You can also check the order of elements:
AbstractDiffHelper diffHelperOrdered = new XmlDiffHelper(xml1, xml2);
diffHelperOrdered.orderlySafeMode = true;
diffHelperOrdered.calcDiff();
assert diffHelperOrdered.isSimilar() == false;
log.info(diffHelperOrdered.outputList1.toString()); // [<region>QB</region>, <region>AB</region>]
log.info(diffHelperOrdered.outputList2.toString()); // [<region>AB</region>, <region>QB</region>]
By default, the system compares the whole structure of the provided XML as is. However, the order of internal elements may also differ, and it is possible to ignore it.
<!-- XML1 -->
<node>
<subNode1>subNodeVal1</subNode1>
<subNode2>subNodeVal2</subNode2>
<subNode3>subNodeVal3</subNode3>
</node>
<!-- XML2 -->
<node>
<subNode3>subNodeVal3</subNode3>
<subNode2>subNodeVal2</subNode2>
<subNode1>subNodeVal1</subNode1>
</node>
Compare inner elements without regard to order:
AbstractDiffHelper diffHelperUnOrdered = new XmlDiffHelper(xml1, xml2);
diffHelperUnOrdered.orderlySafeChildrenMode = false;
diffHelperUnOrdered.calcDiff();
assert diffHelperUnOrdered.isSimilar() == true;
Ordered comparison is much faster; use it whenever possible. The system has no "sorting" mechanism. In an unordered comparison, the system compares elements from both lists: if an XML matches, it is deleted from the list; if not, it is kept for the next iteration. Running an unordered comparison on a large number of unordered XMLs with similar structure may cause performance issues. To prevent that, you can specify "needleHelpers", so that the system first tries to pre-match those elements and compares the internal elements only if the pre-match passes. NOTE! Consider every case individually: pre-matching can sometimes be slightly slower, although the performance decrease in that case is not significant.
Use the needleHelper property. The syntax is similar to that of the "ignorable" properties:
diffHelper.needleHelper = [
'subNode.@attr'
];
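The effect of a pre-match can be sketched with plain lists. This is a hypothetical stand-in for the matching loop, not the library's code: the unmatched() helper, the map-based elements, and the use of "id" as the needle are all assumptions made for illustration.

```groovy
// Hypothetical helper: returns the elements of each list that found no partner.
// 'needle' is a cheap pre-match key; 'deepEquals' stands for the expensive full comparison.
def unmatched(List left, List right, Closure needle, Closure deepEquals) {
    def out1 = []
    def remaining = new ArrayList(right)
    left.each { a ->
        // Short-circuit: the deep comparison runs only when the needle already matches.
        int idx = remaining.findIndexOf { b -> needle(a) == needle(b) && deepEquals(a, b) }
        if (idx >= 0) {
            remaining.removeAt(idx)   // matched elements are deleted from the list
        } else {
            out1 << a                 // unmatched elements end up in the output
        }
    }
    [out1, remaining]
}

def list1 = [[id: 1, state: 'ON'], [id: 2, state: 'QB'], [id: 3, state: 'AB']]
def list2 = [[id: 3, state: 'AB'], [id: 2, state: 'BC'], [id: 1, state: 'ON']]

def deepCalls = 0
def deep = { a, b -> deepCalls++; a == b }

def (out1, out2) = unmatched(list1, list2, { it.id }, deep)

assert out1 == [[id: 2, state: 'QB']]
assert out2 == [[id: 2, state: 'BC']]
assert deepCalls == 3   // one deep comparison per element instead of one per pair tried
```

With a constant needle such as { 0 }, the same input needs six deep comparisons, which is the kind of overhead a well-chosen needle avoids. A needle that rarely differs between elements gives no such saving and only adds its own cost.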