ccu.structure.comparator

This module defines the Comparator class.

The Comparator class can be used to determine the similarity of two structures as follows:

>>> import ase
>>> from ccu.structure.comparator import Comparator
>>> co1 = ase.Atoms("CO", positions=[[0, 0, 0], [1, 0, 0]])
>>> co2 = ase.Atoms("CO", positions=[[0, 1, 1], [1, 1, 1]])
>>> oc = ase.Atoms("OC", positions=[[0, 0, 0], [1, 0, 0]])
>>> Comparator.check_similarity(co1, co2)
True
>>> Comparator.check_similarity(co1, oc)
False
class ccu.structure.comparator.Comparator[source]

Bases: object

An object which compares the similarity of two structures.

static _missing_displacements(all_displacements: Iterable[NDArray], minimally_displaced_ordering: Iterable[NDArray]) list[NDArray][source]

Determines the displacements not in the minimally displaced ordering (M.D.O.).

Parameters:
  • all_displacements – All displacements.

  • minimally_displaced_ordering – The displacements in the minimally displaced ordering (M.D.O.).

Returns:

The missing displacements.
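
As a rough illustration of the idea (not the library's implementation), the sketch below collects the vectors that appear in one collection but not in the other. The function name `missing_displacements`, the use of plain tuples instead of NumPy arrays, and the exact-equality, duplicate-aware comparison are all assumptions made for the example.

```python
def missing_displacements(all_displacements, ordering):
    """Return the vectors of all_displacements that are absent from ordering.

    Hypothetical helper: vectors are compared by exact equality and each
    entry of ordering can account for only one entry of all_displacements.
    """
    remaining = [tuple(v) for v in ordering]
    missing = []
    for vector in all_displacements:
        key = tuple(vector)
        if key in remaining:
            # Consume one match so that duplicates are counted correctly.
            remaining.remove(key)
        else:
            missing.append(key)
    return missing


all_d = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
mdo = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0)]
result = missing_displacements(all_d, mdo)  # [(0.0, 0.0, 1.0)]
```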

static calculate_cumulative_displacement(fingerprint1: Fingerprint, fingerprint2: Fingerprint) float[source]

Calculates the cumulative displacement for fingerprint2.

The cumulative displacement is calculated for fingerprint2 relative to the corresponding atomic positions in fingerprint1.

The cumulative displacement is defined as follows:

Note that each row in each numpy.ndarray associated with each histogram key corresponds to a displacement vector between two atoms. With each such displacement vector in the histogram of fingerprint1, we can identify a corresponding displacement vector in the histogram of fingerprint2 as the displacement vector associated with the same histogram key and index.

We then define a difference vector as the difference between a displacement vector in fingerprint1 and its counterpart in fingerprint2. The set of all difference vectors is defined on the basis of fingerprint1. That is, if \(X\) is the set of all displacement vectors in fingerprint1 and \(Y\) is the set of all corresponding vectors in fingerprint2, the set of all difference vectors is the set of all vectors \(x - y\) where \(x\) is a displacement vector in \(X\) and \(y\) is the corresponding displacement vector in \(Y\).

Note that this requires that the histogram of fingerprint2 include all the keys that the histogram of fingerprint1 includes. Additionally, for each key in the histogram of fingerprint1, the value in fingerprint2 must include at least as many displacement vectors as the value in fingerprint1.

The cumulative displacement is then defined as the sum of the norms of all the difference vectors corresponding to fingerprint1 and fingerprint2, that is, \(\sum_{x \in X} \lVert x - y \rVert\) with \(y\) the counterpart of \(x\) in \(Y\).

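Parameters:
  • fingerprint1 – The Fingerprint whose histogram serves as the reference.

  • fingerprint2 – The Fingerprint for which the cumulative displacement is calculated relative to fingerprint1.
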
Returns:

A float representing the cumulative displacement for fingerprint2 relative to fingerprint1.
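
The definition above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: histograms are modelled as plain dicts mapping chemical symbols to lists of 3-tuples (standing in for numpy.ndarray rows), and the function name `cumulative_displacement` is hypothetical.

```python
import math


def cumulative_displacement(hist1, hist2):
    """Sum the norms of the difference vectors, keyed on hist1.

    Assumes hist2 has every key of hist1 and at least as many vectors
    under each of those keys, as the definition requires.
    """
    total = 0.0
    for key, vectors1 in hist1.items():
        vectors2 = hist2[key]  # raises KeyError if hist2 lacks the key
        if len(vectors2) < len(vectors1):
            raise ValueError(f"too few displacement vectors for {key!r}")
        # Vectors are paired by index; extra vectors in hist2 are ignored.
        for x, y in zip(vectors1, vectors2):
            total += math.dist(x, y)  # norm of x - y
    return total


hist1 = {"C": [(0.0, 0.0, 0.0)], "O": [(1.0, 0.0, 0.0)]}
hist2 = {"C": [(0.0, 0.0, 0.0)], "O": [(1.0, 3.0, 4.0)]}
total = cumulative_displacement(hist1, hist2)  # 5.0
```

Because the pairing is by key and index, the value depends on the order of the vectors in the second histogram, which is why an ordering step is useful before this quantity is compared against a tolerance.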

static check_similarity(structure1: Atoms, structure2: Atoms, tol: float = 0.05) bool[source]

Determines similarity of two structures within a given tolerance.

Parameters:
  • structure1 – An Atoms instance representing the first structure to compare.

  • structure2 – An Atoms instance representing the second structure to compare.

  • tol – A float specifying the tolerance, in Angstroms, for the average cumulative displacement. Defaults to 5e-2. The average cumulative displacement is the cumulative displacement between each set of Fingerprints (ccu.structure.fingerprint.Fingerprint) derived from structure1 and structure2, divided by the number of atoms represented in the Fingerprint.

Returns:

A bool indicating whether or not the two structures are similar within the specified tolerance.

Note

The notion of similarity here can be summarized as:

Two structures are similar if they can be superimposed via a translation operation.
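
To make the notion of translational similarity concrete, here is a simplified stand-alone sketch. It does not use Fingerprint objects and is not how Comparator is implemented: it tries each translation that maps the first atom of one structure onto a same-element atom of the other and greedily pairs the remaining atoms. The function name, the greedy pairing, and the use of symbol and position lists in place of Atoms objects are assumptions for illustration.

```python
import math


def similar_by_translation(symbols1, positions1, symbols2, positions2, tol=0.05):
    """Return True if a pure translation superimposes the two structures.

    Hypothetical helper: tol bounds the average per-atom mismatch.
    """
    if sorted(symbols1) != sorted(symbols2):
        return False
    if not symbols1:
        return True
    for symbol2, anchor in zip(symbols2, positions2):
        if symbol2 != symbols1[0]:
            continue
        # Translation that carries the first atom of structure 1 onto anchor.
        shift = [a - b for a, b in zip(anchor, positions1[0])]
        unused = list(range(len(symbols2)))
        total = 0.0
        for symbol1, position in zip(symbols1, positions1):
            moved = [p + s for p, s in zip(position, shift)]
            candidates = [j for j in unused if symbols2[j] == symbol1]
            best = min(candidates, key=lambda j: math.dist(moved, positions2[j]))
            total += math.dist(moved, positions2[best])
            unused.remove(best)
        if total / len(symbols1) <= tol:
            return True
    return False


co1 = (["C", "O"], [(0, 0, 0), (1, 0, 0)])
co2 = (["C", "O"], [(0, 1, 1), (1, 1, 1)])
oc = (["O", "C"], [(0, 0, 0), (1, 0, 0)])
same = similar_by_translation(*co1, *co2)      # True
flipped = similar_by_translation(*co1, *oc)    # False
```

The two results mirror the module-level example: a shifted CO molecule counts as similar, while the molecule with C and O exchanged does not, because no translation maps one onto the other.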

static cosort_fingerprints(fingerprints1: Sequence[Fingerprint], fingerprints2: Sequence[Fingerprint]) tuple[Fingerprint, ...][source]

Determines the minimally displaced ordering of the second sequence of fingerprints.

The minimally displaced ordering of the second Fingerprint list relative to the first is the ordering of the second supplied iterable of Fingerprints which minimizes the cumulative displacement across the two iterables of Fingerprints.

Parameters:
  • fingerprints1 – An iterable containing Fingerprint instances.

  • fingerprints2 – An iterable containing Fingerprint instances.

Note that the two iterables must be of the same length and that the values returned by the ccu.structure.fingerprint.Fingerprint.values methods of all Fingerprint instances across the two iterables must be of the same length.

Returns:

A tuple containing the ordering of fingerprints2 which minimizes the cumulative displacement across the two iterables of Fingerprints.

Raises:

RuntimeError – Unable to find minimally displaced fingerprint.
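
One way to picture the minimally displaced ordering is a brute-force search over permutations, sketched below. This is an illustrative assumption rather than the method the library uses: fingerprints are modelled as plain dicts of symbol to list of 3-tuples, and the names `pair_cost` and `cosort` are hypothetical. The factorial cost makes it suitable only for small inputs.

```python
import itertools
import math


def pair_cost(hist1, hist2):
    """Cumulative displacement of hist2 relative to hist1 (index-wise)."""
    return sum(
        math.dist(x, y)
        for key, vectors1 in hist1.items()
        for x, y in zip(vectors1, hist2[key])
    )


def cosort(fingerprints1, fingerprints2):
    """Return the ordering of fingerprints2 with the lowest total cost."""
    if len(fingerprints1) != len(fingerprints2):
        raise ValueError("both sequences must have the same length")
    best = min(
        itertools.permutations(fingerprints2),
        key=lambda perm: sum(pair_cost(a, b) for a, b in zip(fingerprints1, perm)),
    )
    return tuple(best)


fps1 = [{"O": [(1.0, 0.0, 0.0)]}, {"O": [(0.0, 1.0, 0.0)]}]
fps2 = [{"O": [(0.0, 1.0, 0.0)]}, {"O": [(1.0, 0.0, 0.0)]}]
ordered = cosort(fps1, fps2)  # fps2 with its two entries swapped
```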

static cosort_histograms(fingerprint1: Fingerprint, fingerprint2: Fingerprint) dict[str, ndarray][source]

Minimizes the cumulative displacement of atoms in each fingerprint.

Given the first fingerprint, this method determines the ordering of the second fingerprint’s histogram which minimizes the cumulative displacement of atoms in each structure.

The two supplied Fingerprints need not have the same keys or the same number of entries under each key. Such cases are handled as follows:

Let \(k\) be a key in both the histograms of fingerprint1 and fingerprint2. Let \(p\) be the iterable corresponding to the key \(k\) in the histogram of fingerprint1, and let \(q\) be the iterable corresponding to the key \(k\) in the histogram of fingerprint2.

If \(len(p) > len(q)\), then \(q\) is ordered according to its match with the first \(len(q)\) elements of \(p\).

If \(len(p) <= len(q)\), then \(q\) is ordered according to the best match with \(p\) and the first \(len(p)\) elements of \(q\).

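Parameters:
  • fingerprint1 – The Fingerprint whose histogram serves as the reference.

  • fingerprint2 – The Fingerprint whose histogram is reordered to minimize the cumulative displacement relative to fingerprint1.
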
Returns:

A dict constructed from fingerprint2._histogram mapping chemical symbols to a numpy.ndarray containing the displacement vectors to atoms with the corresponding chemical symbol. The order of the displacement vectors is such that the cumulative displacement of the displacement vectors is minimized relative to fingerprint1._histogram.
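
The per-key ordering rules can be sketched as below. This is one possible reading of the rules, not the library's implementation: when \(q\) is at least as long as \(p\), the sketch assumes that the best matches to \(p\) come first and any leftover vectors of \(q\) are appended unchanged. Histograms are modelled as dicts of symbol to list of 3-tuples, and `cosort_vectors` and `cosort_histogram_dicts` are hypothetical names.

```python
import itertools
import math


def cosort_vectors(p, q):
    """Order q so its leading entries best match the leading entries of p."""
    n = min(len(p), len(q))
    target = p[:n]  # first len(q) elements of p when p is longer
    best = min(
        itertools.permutations(q, n),
        key=lambda choice: sum(math.dist(x, y) for x, y in zip(target, choice)),
    )
    leftovers = list(q)
    for vector in best:
        leftovers.remove(vector)
    # Assumption: unmatched vectors keep their original relative order.
    return list(best) + leftovers


def cosort_histogram_dicts(hist1, hist2):
    """Reorder each entry of hist2 against hist1; unshared keys are copied."""
    return {
        key: cosort_vectors(hist1[key], q) if key in hist1 else list(q)
        for key, q in hist2.items()
    }


p = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
q = [(0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
sorted_q = cosort_vectors(p, q)
# [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
sorted_hist = cosort_histogram_dicts({"O": p}, {"O": q, "H": [(2.0, 0.0, 0.0)]})
```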