Mathematical helper functions¶
Helper functions for mathematical and physical operations.
- maicos.lib.math.FT(t: ndarray, x: ndarray, indvar: bool = True) ndarray | Tuple[ndarray, ndarray] [source]¶
Discrete Fourier transformation using fast Fourier transformation (FFT).
- Parameters:
t (numpy.ndarray) – Time values of the time series.
x (numpy.ndarray) – Function values corresponding to the time series.
indvar (bool) – If
True
, returns the FFT and frequency values. IfFalse
, returns only the FFT.
- Returns:
- If
indvar
isTrue
, returns a tuple(k, xf2)
where: k
(numpy.ndarray): Frequency values corresponding to the FFT.xf2
(numpy.ndarray): FFT of the input function, scaled by the time range and phase shifted.
If indvar is
False
, returns the FFT (xf2
) directly as anumpy.ndarray
.- If
- Return type:
- Raises:
RuntimeError – If the time series is not equally spaced.
Example
>>> t = np.linspace(0, np.pi, 4) >>> x = np.sin(t) >>> k, xf2 = FT(t, x) >>> k array([-3. , -1.5, 0. , 1.5]) >>> np.round(xf2, 2) array([ 0. +0.j , -0.68+0.68j, 1.36+0.j , -0.68-0.68j])
See also
iFT()
For the inverse fourier transform.
- maicos.lib.math.center_cluster(ag: AtomGroup, weights: ndarray) ndarray [source]¶
Calculate the center of the atomgroup with respect to some weights.
- Parameters:
ag (MDAnalysis.core.groups.AtomGroup) – Group of atoms to calculate the center for.
weights (numpy.ndarray) – Weights in the shape of ag.
- Returns:
com – The center with respect to the weights.
- Return type:
Without proper treatment of periodic boundrary conditions (PBC) most algorithms will result in wrong center calculations. As shown below without treating PBC the center of mass is located in the center of the box
+-----------+ | | | 1 x 2 | | | +-----------+
However, the distance accross the box boundary is shorter and therefore the center with PBC should be located somwhere else. The correct way to calculate the center is described in Bai and Breen[1] where coordinates of the particles are projected on a circle and weighted by their mass in this two dimensional space. The center of mass is obtained by transforming this point back to the corresponding point in the real system. This is done seperately for each dimension.
Reasons for doing this include the analysis of clusters in periodic boundrary conditions and consistent center of mass calculation across box boundraries. This procedure results in the right center of mass as seen below
+-----------+ | | x 1 2 | | | +-----------+
- maicos.lib.math.compute_form_factor(q: float, atom_type: str) float [source]¶
Calculate the form factor \(f(q)\).
\(f(q)\) is expressed in terms of the scattering vector as
\[f(q) = \sum_{i=1}^4 a_i e^{-b_i q^2/(4\pi)^2} + c \,.\]The coefficients \(a_{1,\dots,4}\), \(b_{1,\dots,4}\) and \(c\) are also known as Cromer-Mann X-ray scattering factors and are documented in Prince[2].
For determining the elements
maicos.lib.tables.atomtypes
is used and the Cromer-Mann X-ray scattering factors are stored inmaicos.lib.tables.CM_parameters
.
- maicos.lib.math.compute_rdf_structure_factor(rdf: ndarray, r: ndarray, density: float) Tuple[ndarray, ndarray] [source]¶
Computes the structure factor based on the radial distribution function (RDF).
The structure factor \(S(q)\) based on the RDF \(g(r)\) is given by
\[S(q) = 1 + 4 \pi \rho \int_0^\infty \mathrm{d}r r \frac{\sin(qr)}{q} (g(r) - 1)\,\]where \(q\) is the magnitude of the scattering vector. The calculation is performed via a discrete sine transform as implemented in
scipy.fftpack.dst()
.For an example take a look at Small-angle X-ray scattering.
- Parameters:
rdf (numpy.ndarray) – radial distribution function
r (numpy.ndarray) – equally spaced distance array on which rdf is defined
density (float) – number density of particles
- Returns:
q (numpy.ndarray) – array of q points
struct_factor (numpy.ndarray) – structure factor
- maicos.lib.math.correlation(a: ndarray, b: ndarray | None = None, subtract_mean: bool = False) ndarray [source]¶
Calculate correlation or autocorrelation.
Uses fast fourier transforms to give the correlation function of two arrays, or, if only one array is given, the autocorrelation. Setting
subtract_mean=True
causes the mean to be subtracted from the input data.- Parameters:
a (numpy.ndarray) – The first input array to calculate the correlation
b (numpy.ndarray) – The second input array. If
None
, autocorrelation ofa
is calculated.subtract_mean (bool) – If
True
, subtract the mean from the input data.
- Returns:
The correlation or autocorrelation function.
- Return type:
- maicos.lib.math.correlation_time(timeseries: ndarray, method: str = 'sokal', mintime: int = 3, sokal_factor: float = 8) float [source]¶
Compute the integrated correlation time of a time series.
The integrated correlation time (in units of the sampling interval) is given by
\[\tau = \sum\limits_{t=1}^{N_\mathrm{cut}} C(t) \left(1 - \frac{t}{N}\right)\]where \(N_\mathrm{cut} < N\) is a subset of the time series of length \(N\) and \(C(t)\) is the discrete-time autocorrelation function. To obtain the upper limit of the sum \(N_\mathrm{cut}\) two different methods are provided:
For “chodera” [3] \(N_\mathrm{cut}\) is given by the time when \(C(t)\) crosses zero the first time.
For “sokal” [4] \(N_\mathrm{cut}\) is determined iteratively by stepwise increasing until
\[N_\mathrm{cut} \geq c \cdot \tau\]where \(c\) is the constant
sokal_factor
. If the condition is never fulfilled,-1
is returned, indicating that the time series does not provide sufficient statistics to estimate a correlation time.
While both methods give the same correlation time for a smooth time series that decays to 0, “sokal” will results in a more reasonable result for actual time series that are noisy and cross zero several times.
- Parameters:
timeseries (numpy.ndarray) – The time series used to calculate the correlation time from.
method ({"sokal", "chodera"}) – Method to choose summation cutoff \(N_\mathrm{cut}\).
mintime (int) – Minimum possible value for \(N_\mathrm{cut}\).
sokal_factor (float) – Cut-off factor \(c\) for the Sokal method.
- Returns:
tau – Integrated correlation time \(\tau\). If
-1
(only formethod="sokal"
) the provided time series does not provide sufficient statistics to estimate a correlation time.- Return type:
- Raises:
ValueError – If mintime is larger than the length of the timeseries.
ValueError – If method is not one of “sokal” or “chodera”.
References
- maicos.lib.math.iFT(k: ndarray, xf: ndarray, indvar: bool = True) ndarray | Tuple[ndarray, ndarray] [source]¶
Inverse Fourier transformation using fast Fourier transformation (FFT).
Takes the frequency series and the function as arguments. By default, returns the iFT and the time series. Setting indvar=False means the function returns only the iFT.
- Parameters:
k (numpy.ndarray) – The frequency series.
xf (numpy.ndarray) – The function series in the frequency domain.
indvar (bool) – If
True
, return both the iFT and the time series. IfFalse
, return only the iFT.
- Returns:
If indvar is
True
, returns a tuple containing the time series and the iFT. If indvar isFalse
, returns only the iFT.- Return type:
- Raises:
RuntimeError – If the time series is not equally spaced.
See also
FT()
For the Fourier transform.
- maicos.lib.math.new_mean(old_mean: float, data: float, length: int) float [source]¶
Compute the arithmetic mean of a series iteratively.
Compute the arithmetic mean of n samples based on an existing mean of n-1 and the n-th value.
Given the mean of a data series
\[\bar x_N = \frac{1}{N} \sum_{n=1}^N x_n\]we seperate the last value
\[\bar x_N = \frac{1}{N} \sum_{n=1}^{N-1} x_n + \frac{x_N}{N}\]and multiply 1 = (N - 1)/(N - 1)
\[\begin{split}\bar x_N = \frac{N-1}{N} \frac{1}{N-1} \\ \sum_{n=1}^{N-1} x_n + \frac{x_N}{N}\end{split}\]The first term can be identified as the mean of the first N - 1 values and we arrive at
\[\bar x_N = \frac{N-1}{N} \bar x_{N-1} + \frac{x_N}{N}\]- Parameters:
- Returns:
new_mean – Updated mean of the series of n values.
- Return type:
Examples
The mean of a data set can easily be calculated from the data points. However this requires one to keep all data points on hand until the end of the calculation.
>>> print(np.mean([1, 3, 5, 7])) 4.0
Alternatively, one can update an existing mean, this requires only knowledge of the total number of samples.
>>> print(new_mean(np.mean([1, 3, 5]), data=7, length=4)) 4.0
- maicos.lib.math.new_variance(old_variance: float | ndarray, old_mean: float | ndarray, new_mean: float | ndarray, data: float | ndarray, length: int) float | ndarray [source]¶
Calculate the variance of a timeseries iteratively.
The variance of a timeseries \(x_n\) can be calculated iteratively by using the following formula:
\[S_n = S_n-1 + (n-1) * (x_n - \bar{x}_n-1)^2 / (n-1)\]Here, \(\bar{x}_n\) is the mean of the timeseries up to the \(n\)-th value.
Floating point imprecision can lead to slight negative variances leading non defined standard deviations. Therefore a negetaive variance is set to 0.
- Parameters:
old_variance (float, numpy.ndarray) – The variance of the first n-1 samples.
old_mean (float) – The mean of the first n-1 samples.
new_mean (float, numpy.ndarray) – The mean of the full n samples.
data (float, numpy.ndarray) – The n-th value of the series.
length (int) – Length of the updated series, here called n.
- Returns:
new_variance – Updated variance of the series of n values.
- Return type:
Examples
The data set
[1, 5, 5, 1]
has a variance of4.0
>>> print(np.var([1, 5, 5, 1])) 4.0
Knowing the total number of data points, this operation can be performed iteratively.
>>> print( ... new_variance( ... old_variance=np.var([1, 5, 5]), ... old_mean=np.mean([1, 5, 5]), ... new_mean=np.mean([1, 5, 5, 1]), ... data=1, ... length=4, ... ) ... ) 4.0
- maicos.lib.math.scalar_prod_corr(a: ndarray, b: ndarray | None = None, subtract_mean: bool = False) ndarray [source]¶
Give the corr. function of the scalar product of two vector timeseries.
Arguments should be given in the form a[t, i], where t is the time variable along which the correlation is calculated, and i indexes the vector components.
- Parameters:
a (numpy.ndarray) – The first vector timeseries of shape (t, i).
b (numpy.ndarray) – The second vector timeseries of shape (t, i). If
None
, correlation with itself is calculated.subtract_mean (bool) – If
True
, subtract the mean from the timeseries before calculating the correlation.
- Returns:
The correlation function of the scalar product of the vector timeseries.
- Return type:
- maicos.lib.math.symmetrize(m: ndarray, axis: None | int | Tuple[int] = None, inplace: bool = False) ndarray [source]¶
Symmeterize an array.
The shape of the array is preserved, but the elements are symmetrized with respect to the given axis.
- Parameters:
m (array_like) – Input array to symmetrize
axis (int, tuple(int)) – Axis or axes along which to symmetrize over. The default,
axis=None
, will symmetrize over all of the axes of the input array. If axis is negative it counts from the last to the first axis. If axis is atuple
of ints, symmetrizing is performed on all of the axes specified in thetuple
.inplace (bool) – Do symmetrizations inplace. If
False
a new array is returned.
- Returns:
out – the symmetrized array
- Return type:
array_like
Notes
symmetrize uses
np.flip()
for flipping the indices.Examples
>>> A = np.arange(10).astype(float) >>> A array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.]) >>> symmetrize(A) array([4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5]) >>> symmetrize(A, inplace=True) array([4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5]) >>> A array([4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5, 4.5])
It also works for arrays with more than 1 dimensions in a general dimension.
>>> A = np.arange(20).astype(float).reshape(2, 10).T >>> A array([[ 0., 10.], [ 1., 11.], [ 2., 12.], [ 3., 13.], [ 4., 14.], [ 5., 15.], [ 6., 16.], [ 7., 17.], [ 8., 18.], [ 9., 19.]]) >>> symmetrize(A) array([[9.5, 9.5], [9.5, 9.5], [9.5, 9.5], [9.5, 9.5], [9.5, 9.5], [9.5, 9.5], [9.5, 9.5], [9.5, 9.5], [9.5, 9.5], [9.5, 9.5]]) >>> symmetrize(A, axis=0) array([[ 4.5, 14.5], [ 4.5, 14.5], [ 4.5, 14.5], [ 4.5, 14.5], [ 4.5, 14.5], [ 4.5, 14.5], [ 4.5, 14.5], [ 4.5, 14.5], [ 4.5, 14.5], [ 4.5, 14.5]])
- maicos.lib.math.transform_cylinder(positions: ndarray, origin: ndarray, dim: int) ndarray [source]¶
Transform positions into cylinder coordinates.
The origin of th coordinate system is at origin, the direction of the cylinder is defined by dim.
- Parameters:
positions (numpy.ndarray) – Cartesian coordinates (x,y,z)
origin (numpy.ndarray) – Origin of the new cylindrical coordinate system (x,y,z).
dim (int) – Direction of the cylinder axis (0=x, 1=y, 2=z).
- Returns:
Positions in cylinder coordinates (r, phi, z)
- Return type:
- maicos.lib.math.transform_sphere(positions: ndarray, origin: ndarray) ndarray [source]¶
Transform positions into spherical coordinates.
The origin of the new coordinate system is at origin.
- Parameters:
positions (numpy.ndarray) – Cartesian coordinates (x,y,z)
origin (numpy.ndarray) – Origin of the new spherical coordinate system (x,y,z).
- Returns:
Positions in spherical coordinates (\(r\), phi, theta)
- Return type:
- maicos.lib._cmath.compute_structure_factor(positions, dimensions, qmin, qmax, thetamin, thetamax, weights)¶
Calculates scattering vectors and corresponding structure factors.
Use via
from maicos.lib.math import compute_structure_factor
The structure factors are calculated according to
\[S(\boldsymbol{q}) = \left [ \sum\limits_{k=1}^N w_j(q) \cos(\boldsymbol{qr}_j) \right ]^2 + \left [ \sum\limits_{k=1}^N w_j(q) \sin(\boldsymbol{qr}_j) \right ]^2 \,.\]where \(\boldsymbol{r}_j\) is the positions vector of particle \(k\), \(\boldsymbol{q}\) is scattering vector and the \(w_j\) are optional weights. The possible scattering vectors are determined by the given cell
dimensions
.Results are returned as arrays with three dimensions, where the index of each dimensions referers to the Miller indices \(hkl\). Based on the Miller indices and the returned length of the scattering vector the actual scattering vector can be obtained by
\[q_{hkl} = \vert \boldsymbol{q} \vert \frac{2\pi}{L_{hkl}}\]where \(\vert \boldsymbol{q} \vert\) are the returned lengths of the scattering vector and \(L_{hkl}\) are the components of the simulation cell.
- Parameters:
positions (numpy.ndarray) – position array.
dimensions (numpy.ndarray) – dimensions of the cell.
qmin (float) – Starting scattering vector length (1/Å).
qmax (float) – Ending scattering vector length (1/Å).
thetamin (float) – Minimal angle (°) between the scattering vectors and the z-axis.
thetamax (float) – Maximal angle (°) between the scattering vectors and the z-axis.
weights (numpy.ndarray) – Atomic quantity whose \(S(\vert q \vert)\) we are computing. Provide an array of
1
that has the same size as the postions, h.enp.ones(len(positions))
, for the standard structure factor.
- Returns:
The length of the scattering vectors and the corresponding structure factors.
- Return type: