Mathematical helper functions#

Helper functions for mathematical and physical operations.

maicos.lib.math.FT(t: ndarray, x: ndarray, indvar: bool = True) ndarray | Tuple[ndarray, ndarray][source]#

Discrete Fourier transformation using fast Fourier transformation (FFT).

Parameters:
  • t (numpy.ndarray) – Time values of the time series.

  • x (numpy.ndarray) – Function values corresponding to the time series.

  • indvar (bool) – If True, returns the FFT and frequency values. If False, returns only the FFT.

Returns:

If indvar is True, returns a tuple (k, xf2) where:
  • k (numpy.ndarray): Frequency values corresponding to the FFT.

  • xf2 (numpy.ndarray): FFT of the input function, scaled by the time range and phase shifted.

If indvar is False, returns the FFT (xf2) directly as a numpy.ndarray.

Return type:

tuple(numpy.ndarray, numpy.ndarray) or numpy.ndarray

Raises:

RuntimeError – If the time series is not equally spaced.

Example

>>> t = np.linspace(0, np.pi, 4)
>>> x = np.sin(t)
>>> k, xf2 = FT(t, x)
>>> k
array([-3. , -1.5,  0. ,  1.5])
>>> np.round(xf2, 2)
array([ 0.  +0.j  , -0.68+0.68j,  1.36+0.j  , -0.68-0.68j])

See also

iFT()

For the inverse fourier transform.

maicos.lib.math.center_cluster(ag: AtomGroup, weights: ndarray) ndarray[source]#

Calculate the center of the atomgroup with respect to some weights.

Parameters:
Returns:

com – The center with respect to the weights.

Return type:

numpy.ndarray

Without proper treatment of periodic boundrary conditions (PBC) most algorithms will result in wrong center calculations. As shown below without treating PBC the center of mass is located in the center of the box

+-----------+
|           |
| 1   x   2 |
|           |
+-----------+

However, the distance accross the box boundary is shorter and therefore the center with PBC should be located somwhere else. The correct way to calculate the center is described in Bai and Breen[1] where coordinates of the particles are projected on a circle and weighted by their mass in this two dimensional space. The center of mass is obtained by transforming this point back to the corresponding point in the real system. This is done seperately for each dimension.

Reasons for doing this include the analysis of clusters in periodic boundrary conditions and consistent center of mass calculation across box boundraries. This procedure results in the right center of mass as seen below

+-----------+
|           |
x 1       2 |
|           |
+-----------+
maicos.lib.math.compute_form_factor(q: float, atom_type: str) float[source]#

Calculate the form factor \(f(q)\).

\(f(q)\) is expressed in terms of the scattering vector as

\[f(q) = \sum_{i=1}^4 a_i e^{-b_i q^2/(4\pi)^2} + c \,.\]

The coefficients \(a_{1,\dots,4}\), \(b_{1,\dots,4}\) and \(c\) are also known as Cromer-Mann X-ray scattering factors and are documented in Prince[2].

For determining the elements maicos.lib.tables.atomtypes is used and the Cromer-Mann X-ray scattering factors are stored in maicos.lib.tables.CM_parameters.

Parameters:
  • q (float) – The magnitude of the scattering vector in reciprocal angstroms (1/Å).

  • atom_type (str) – The type of the atom for which the form factor is calculated.

Returns:

The calculated form factor for the specified atom type and q.

Return type:

float

maicos.lib.math.compute_rdf_structure_factor(rdf: ndarray, r: ndarray, density: float) Tuple[ndarray, ndarray][source]#

Computes the structure factor based on the radial distribution function (RDF).

The structure factor \(S(q)\) based on the RDF \(g(r)\) is given by

\[S(q) = 1 + 4 \pi \rho \int_0^\infty \mathrm{d}r r \frac{\sin(qr)}{q} (g(r) - 1)\,\]

where \(q\) is the magnitude of the scattering vector. The calculation is performed via a discrete sine transform as implemented in scipy.fftpack.dst().

For an example take a look at Small-angle X-ray scattering.

Parameters:
  • rdf (numpy.ndarray) – radial distribution function

  • r (numpy.ndarray) – equally spaced distance array on which rdf is defined

  • density (float) – number density of particles

Returns:

  • q (numpy.ndarray) – array of q points

  • struct_factor (numpy.ndarray) – structure factor

maicos.lib.math.correlation(a: ndarray, b: ndarray | None = None, subtract_mean: bool = False) ndarray[source]#

Calculate correlation or autocorrelation.

Uses fast fourier transforms to give the correlation function of two arrays, or, if only one array is given, the autocorrelation. Setting subtract_mean=True causes the mean to be subtracted from the input data.

Parameters:
  • a (numpy.ndarray) – The first input array to calculate the correlation

  • b (numpy.ndarray) – The second input array. If None, autocorrelation of a is calculated.

  • subtract_mean (bool) – If True, subtract the mean from the input data.

Returns:

The correlation or autocorrelation function.

Return type:

numpy.ndarray

maicos.lib.math.correlation_time(timeseries: ndarray, method: str = 'sokal', mintime: int = 3, sokal_factor: float = 8) float[source]#

Compute the integrated correlation time of a time series.

The integrated correlation time (in units of the sampling interval) is given by

\[\tau = \sum\limits_{t=1}^{N_\mathrm{cut}} C(t) \left(1 - \frac{t}{N}\right)\]

where \(N_\mathrm{cut} < N\) is a subset of the time series of length \(N\) and \(C(t)\) is the discrete-time autocorrelation function. To obtain the upper limit of the sum \(N_\mathrm{cut}\) two different methods are provided:

  1. For “chodera” [3] \(N_\mathrm{cut}\) is given by the time when \(C(t)\) crosses zero the first time.

  2. For “sokal” [4] \(N_\mathrm{cut}\) is determined iteratively by stepwise increasing until

    \[N_\mathrm{cut} \geq c \cdot \tau\]

    where \(c\) is the constant sokal_factor. If the condition is never fulfilled, -1 is returned, indicating that the time series does not provide sufficient statistics to estimate a correlation time.

While both methods give the same correlation time for a smooth time series that decays to 0, “sokal” will results in a more reasonable result for actual time series that are noisy and cross zero several times.

Parameters:
  • timeseries (numpy.ndarray) – The time series used to calculate the correlation time from.

  • method ({"sokal", "chodera"}) – Method to choose summation cutoff \(N_\mathrm{cut}\).

  • mintime (int) – Minimum possible value for \(N_\mathrm{cut}\).

  • sokal_factor (float) – Cut-off factor \(c\) for the Sokal method.

Returns:

tau – Integrated correlation time \(\tau\). If -1 (only for method="sokal") the provided time series does not provide sufficient statistics to estimate a correlation time.

Return type:

float

Raises:
  • ValueError – If mintime is larger than the length of the timeseries.

  • ValueError – If method is not one of “sokal” or “chodera”.

References

maicos.lib.math.iFT(k: ndarray, xf: ndarray, indvar: bool = True) ndarray | Tuple[ndarray, ndarray][source]#

Inverse Fourier transformation using fast Fourier transformation (FFT).

Takes the frequency series and the function as arguments. By default, returns the iFT and the time series. Setting indvar=False means the function returns only the iFT.

Parameters:
  • k (numpy.ndarray) – The frequency series.

  • xf (numpy.ndarray) – The function series in the frequency domain.

  • indvar (bool) – If True, return both the iFT and the time series. If False, return only the iFT.

Returns:

If indvar is True, returns a tuple containing the time series and the iFT. If indvar is False, returns only the iFT.

Return type:

tuple(numpy.ndarray, numpy.ndarray) or numpy.ndarray

Raises:

RuntimeError – If the time series is not equally spaced.

See also

FT()

For the Fourier transform.

maicos.lib.math.new_mean(old_mean: float, data: float, length: int) float[source]#

Compute the arithmetic mean of a series iteratively.

Compute the arithmetic mean of n samples based on an existing mean of n-1 and the n-th value.

Given the mean of a data series

\[\bar x_N = \frac{1}{N} \sum_{n=1}^N x_n\]

we seperate the last value

\[\bar x_N = \frac{1}{N} \sum_{n=1}^{N-1} x_n + \frac{x_N}{N}\]

and multiply 1 = (N - 1)/(N - 1)

\[\begin{split}\bar x_N = \frac{N-1}{N} \frac{1}{N-1} \\ \sum_{n=1}^{N-1} x_n + \frac{x_N}{N}\end{split}\]

The first term can be identified as the mean of the first N - 1 values and we arrive at

\[\bar x_N = \frac{N-1}{N} \bar x_{N-1} + \frac{x_N}{N}\]
Parameters:
  • old_mean (float) – arithmetic mean of the first n - 1 samples.

  • data (float) – n-th value of the series.

  • length (int) – Length of the updated series, here called n.

Returns:

new_mean – Updated mean of the series of n values.

Return type:

float

Examples

The mean of a data set can easily be calculated from the data points. However this requires one to keep all data points on hand until the end of the calculation.

>>> np.mean([1, 3, 5, 7])
4.0

Alternatively, one can update an existing mean, this requires only knowledge of the total number of samples.

>>> new_mean(np.mean([1, 3, 5]), data=7, length=4)
4.0
maicos.lib.math.new_variance(old_variance: float | ndarray, old_mean: float | ndarray, new_mean: float | ndarray, data: float | ndarray, length: int) float | ndarray[source]#

Calculate the variance of a timeseries iteratively.

The variance of a timeseries \(x_n\) can be calculated iteratively by using the following formula:

\[S_n = S_n-1 + (n-1) * (x_n - \bar{x}_n-1)^2 / (n-1)\]

Here, \(\bar{x}_n\) is the mean of the timeseries up to the \(n\)-th value.

Floating point imprecision can lead to slight negative variances leading non defined standard deviations. Therefore a negetaive variance is set to 0.

Parameters:
  • old_variance (float, numpy.ndarray) – The variance of the first n-1 samples.

  • old_mean (float) – The mean of the first n-1 samples.

  • new_mean (float, numpy.ndarray) – The mean of the full n samples.

  • data (float, numpy.ndarray) – The n-th value of the series.

  • length (int) – Length of the updated series, here called n.

Returns:

new_variance – Updated variance of the series of n values.

Return type:

float

Examples

The data set [1, 5, 5, 1] has a variance of 4.0

>>> np.var([1, 5, 5, 1])
4.0

Knowing the total number of data points, this operation can be performed iteratively.

>>> new_variance(
...     old_variance=np.var([1, 5, 5]),
...     old_mean=np.mean([1, 5, 5]),
...     new_mean=np.mean([1, 5, 5, 1]),
...     data=1,
...     length=4,
... )
4.0
maicos.lib.math.scalar_prod_corr(a: ndarray, b: ndarray | None = None, subtract_mean: bool = False) ndarray[source]#

Give the corr. function of the scalar product of two vector timeseries.

Arguments should be given in the form a[t, i], where t is the time variable along which the correlation is calculated, and i indexes the vector components.

Parameters:
  • a (numpy.ndarray) – The first vector timeseries of shape (t, i).

  • b (numpy.ndarray) – The second vector timeseries of shape (t, i). If None, correlation with itself is calculated.

  • subtract_mean (bool) – If True, subtract the mean from the timeseries before calculating the correlation.

Returns:

The correlation function of the scalar product of the vector timeseries.

Return type:

numpy.ndarray

maicos.lib.math.symmetrize(m: ndarray, axis: None | int | Tuple[int] = None, inplace: bool = False) ndarray[source]#

Symmeterize an array.

The shape of the array is preserved, but the elements are symmetrized with respect to the given axis.

Parameters:
  • m (array_like) – Input array to symmetrize

  • axis (int, tuple(int)) – Axis or axes along which to symmetrize over. The default, axis=None, will symmetrize over all of the axes of the input array. If axis is negative it counts from the last to the first axis. If axis is a tuple of ints, symmetrizing is performed on all of the axes specified in the tuple.

  • inplace (bool) – Do symmetrizations inplace. If False a new array is returned.

Returns:

out – the symmetrized array

Return type:

array_like

Notes

symmetrize uses np.flip() for flipping the indices.

Examples

>>> A = np.arange(10).astype(float)
>>> A
array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
>>> symmetrize(A)
array([4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5])
>>> symmetrize(A, inplace=True)
array([4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5])
>>> A
array([4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5])

It also works for arrays with more than 1 dimensions in a general dimension.

>>> A = np.arange(20).astype(float).reshape(2, 10).T
>>> A
array([[ 0., 10.],
       [ 1., 11.],
       [ 2., 12.],
       [ 3., 13.],
       [ 4., 14.],
       [ 5., 15.],
       [ 6., 16.],
       [ 7., 17.],
       [ 8., 18.],
       [ 9., 19.]])
>>> symmetrize(A)
array([[9.5, 9.5],
       [9.5, 9.5],
       [9.5, 9.5],
       [9.5, 9.5],
       [9.5, 9.5],
       [9.5, 9.5],
       [9.5, 9.5],
       [9.5, 9.5],
       [9.5, 9.5],
       [9.5, 9.5]])
>>> symmetrize(A, axis=0)
array([[ 4.5, 14.5],
       [ 4.5, 14.5],
       [ 4.5, 14.5],
       [ 4.5, 14.5],
       [ 4.5, 14.5],
       [ 4.5, 14.5],
       [ 4.5, 14.5],
       [ 4.5, 14.5],
       [ 4.5, 14.5],
       [ 4.5, 14.5]])
maicos.lib.math.transform_cylinder(positions: ndarray, origin: ndarray, dim: int) ndarray[source]#

Transform positions into cylinder coordinates.

The origin of th coordinate system is at origin, the direction of the cylinder is defined by dim.

Parameters:
  • positions (numpy.ndarray) – Cartesian coordinates (x,y,z)

  • origin (numpy.ndarray) – Origin of the new cylindrical coordinate system (x,y,z).

  • dim (int) – Direction of the cylinder axis (0=x, 1=y, 2=z).

Returns:

Positions in cylinder coordinates (r, phi, z)

Return type:

numpy.ndarray

maicos.lib.math.transform_sphere(positions: ndarray, origin: ndarray) ndarray[source]#

Transform positions into spherical coordinates.

The origin of the new coordinate system is at origin.

Parameters:
  • positions (numpy.ndarray) – Cartesian coordinates (x,y,z)

  • origin (numpy.ndarray) – Origin of the new spherical coordinate system (x,y,z).

Returns:

Positions in spherical coordinates (\(r\), phi, theta)

Return type:

numpy.ndarray

maicos.lib._cmath.compute_structure_factor(positions, dimensions, start_q, end_q, mintheta, maxtheta, weights)#

Calculates \(S(\vert q \vert)\) for all possible \(q\) values.

Returns the \(q\) values as well as the structure factor.

Use via from maicos.lib.math import compute_structure_factor

Parameters:
  • positions (numpy.ndarray) – position array.

  • dimensions (numpy.ndarray) – dimensions of the cell.

  • startq (float) – Starting q (1/Å).

  • endq (float) – Ending q (1/Å).

  • mintheta (float) – Minimal angle (°) between the q vectors and the z-axis.

  • maxtheta (float) – Maximal angle (°) between the q vectors and the z-axis.

  • weights (numpy.ndarray) – Atomic quantity whose \(S(\vert q \vert)\) we are computing. Provide an array of 1 that has the same size as the postions, i.e np.ones(len(positions)), for the standard structure factor

Returns:

The q values and the corresponding structure factor.

Return type:

tuple(numpy.ndarray, numpy.ndarray)