Similarity background and Demo
In complex data sets it is often necessary to compare different sets of criteria (attributes). The similarity analysis, an important feature of PyHasse, serves this purpose.
In the similarity analysis we calculate the proximity of different posets (partially ordered sets) based on the same ground set G. Within one partial order, the relation between two objects a, b may be
a < b,
a > b,
a || b (a and b are incomparable),
a ≅ b (a and b are equivalent).
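As a minimal sketch of how these four outcomes can arise, assume that each object is described by a tuple of attribute values and that the order is the component-wise (product) order. That assumption, the function name `relation`, and the sample data are illustrative only and are not taken from PyHasse itself.

```python
def relation(x, y):
    """Return '<', '>', '=' or '||' for two equal-length attribute tuples.

    Assumption: component-wise (product) order, i.e. x <= y if and only if
    every attribute of x is <= the corresponding attribute of y.
    """
    if len(x) != len(y):
        raise ValueError("objects must have the same number of attributes")
    le = all(a <= b for a, b in zip(x, y))
    ge = all(a >= b for a, b in zip(x, y))
    if le and ge:
        return "="   # equivalent (written ≅ in the text)
    if le:
        return "<"
    if ge:
        return ">"
    return "||"      # incomparable


# Hypothetical data: four objects with two attributes each.
data = {"a": (1, 2), "b": (2, 3), "c": (3, 1), "d": (1, 2)}

print(relation(data["a"], data["b"]))  # <
print(relation(data["b"], data["a"]))  # >
print(relation(data["a"], data["c"]))  # ||
print(relation(data["a"], data["d"]))  # =
```

Two objects are incomparable here as soon as one attribute ranks them one way and another attribute ranks them the opposite way.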
When two partial orders (G, ≤₁) and (G, ≤₂) are to be compared, the relation of each pair of objects (say, pairs taken from objects a, b, c, d, e, f) is determined in both orders, and the resulting combinations are counted. For these combinations we use the shorthand notation <<, <>, ||< etc., where the first symbol is the relation in the first partial order and the second symbol is the relation in the second.
Most important are the entries like >> or <<, which count the »isotone« character of both partial orders (ISO), and the entries like ><, <>, which contribute to the »antitone« character, i.e. to the conflicts between the two partial orders (ANTI). Following Rademaker et al. (2008), two posets are in conflict, or »contradict each other« (are antitone), on two objects x, y if x < y holds in one poset and x > y in the other.
There are further combinations to consider: <||, >||, ||>, ||<, ≅||, ||≅ and || || are considered indifferent (IND); combinations such as >≅, <≅, ≅<, ≅> are called weak isotone (WISO). Finally, the entry of type ≅≅ contributes to equivalence relations (IDE).
The appropriately normalized contributions of ISO, ANTI, IND, WISO and IDE are the final result. It is left open how this similarity profile is mapped onto a single number, for example onto the Tanimoto index, because the importance of WISO, IND and IDE may vary between applications. Summing up, in order to describe the behavior of two partial orders in a compact way we use the wording:
|ISO|isotone: the matchings (<, <) or (>, >)|
|ANTI|antitone: the matchings (>, <) or (<, >)|
|WISO|weak isotone: the matchings (<, ≅), (>, ≅), (≅, <), (≅, >)|
|IND|indifferent: all matchings in which at least one of the two relations is incomparability|
|IDE|equivalent: the matching (≅, ≅)|
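The counting scheme summarized above can be sketched as follows. This is an illustrative sketch, not PyHasse's implementation. The function names `classify` and `similarity_profile`, the representation of each partial order as a dictionary from object pairs to relation symbols (with `=` standing for ≅), the sample relations, and the normalization by the number of object pairs are all assumptions.

```python
from collections import Counter
from itertools import combinations

CLASSES = ("ISO", "ANTI", "WISO", "IND", "IDE")


def classify(r1, r2):
    """Map a matching of relation symbols ('<', '>', '=', '||') to a class."""
    if "||" in (r1, r2):
        return "IND"    # incomparability in at least one order
    if r1 == "=" and r2 == "=":
        return "IDE"    # equivalent in both orders
    if "=" in (r1, r2):
        return "WISO"   # equivalent in exactly one order
    return "ISO" if r1 == r2 else "ANTI"


def similarity_profile(objects, order1, order2):
    """Classify every unordered object pair and return normalized shares.

    Dividing by the number of pairs is an assumption; the text only
    says that the contributions are "appropriately normalized".
    """
    counts = Counter(
        classify(order1[pair], order2[pair])
        for pair in combinations(objects, 2)
    )
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in CLASSES}


# Hypothetical relations on the ground set {a, b, c, d}.
objects = ["a", "b", "c", "d"]
order1 = {("a", "b"): "<", ("a", "c"): "<", ("a", "d"): "<",
          ("b", "c"): "=", ("b", "d"): "||", ("c", "d"): "||"}
order2 = {("a", "b"): "<", ("a", "c"): "<", ("a", "d"): ">",
          ("b", "c"): "<", ("b", "d"): ">", ("c", "d"): ">"}

profile = similarity_profile(objects, order1, order2)
print({c: round(v, 2) for c, v in profile.items()})
# {'ISO': 0.33, 'ANTI': 0.17, 'WISO': 0.17, 'IND': 0.33, 'IDE': 0.0}
```

The profile is returned as five separate shares rather than a single score, in line with the remark that the mapping onto one number is left open.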
Recent examples of the application of the similarity analysis to an environmental health data set are given by Voigt et al. (2010a, 2012a, b) and Bruggemann et al. (2013, submitted to ESPR).