Analysis Module

Statistical analysis and metrics.

cdiutils.analysis.stats.kde_from_histogram(counts, bin_edges)

Compute the kernel density estimate (KDE) from the histogram counts and bin edges returned by numpy.histogram.

Parameters:
  • counts (np.ndarray) – the number of elements in each bin.

  • bin_edges (np.ndarray) – the limits of each bin.

Returns:

A tuple of the x values at which the KDE is evaluated and the corresponding y values (the KDE estimate).

Return type:

tuple[np.ndarray, np.ndarray]
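
To illustrate the idea of building a KDE from binned data rather than raw samples, here is a minimal NumPy-only sketch. It is not the cdiutils implementation: the helper name kde_from_counts, the n_points and bandwidth arguments, and the normal-reference bandwidth rule are all assumptions made for this example. Each bin is represented by its centre and weighted by its count.

```python
import numpy as np


def kde_from_counts(counts, bin_edges, n_points=200, bandwidth=None):
    """Weighted Gaussian KDE built from histogram counts (illustrative only)."""
    counts = np.asarray(counts, dtype=float)
    bin_edges = np.asarray(bin_edges, dtype=float)

    # Represent each bin by its centre, weighted by its share of the counts.
    centres = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    weights = counts / counts.sum()

    if bandwidth is None:
        # Assumption: normal-reference rule of thumb on the weighted data.
        mean = np.sum(weights * centres)
        std = np.sqrt(np.sum(weights * (centres - mean) ** 2))
        bandwidth = 1.06 * std * counts.sum() ** (-1 / 5)

    # Evaluate the sum of weighted Gaussians on a regular grid.
    x = np.linspace(bin_edges[0], bin_edges[-1], n_points)
    z = (x[:, None] - centres[None, :]) / bandwidth
    y = (weights * np.exp(-0.5 * z**2)).sum(axis=1)
    y /= bandwidth * np.sqrt(2 * np.pi)
    return x, y


# Synthetic data: 5000 samples from a normal distribution centred on 1.0.
rng = np.random.default_rng(0)
samples = rng.normal(loc=1.0, scale=0.2, size=5000)
counts, bin_edges = np.histogram(samples, bins=50)
x, y = kde_from_counts(counts, bin_edges)
```

The returned pair has the same shape as the documented return value (x values, KDE values), so it can be plotted over the histogram bars directly.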

cdiutils.analysis.stats.find_isosurface(amplitude, nbins=100, sigma_criterion=3, plot=False, show=False, save=None)

Estimate the isosurface value from the amplitude distribution.

This function computes the isosurface value based on the amplitude distribution of a 3D volume. The isosurface is calculated as: mu - sigma_criterion * sigma, where mu is the mean and sigma is the standard deviation of the distribution.

Parameters:
  • amplitude (np.ndarray) – The 3D amplitude volume.

  • nbins (int, optional) – The number of bins to use for the histogram. Defaults to 100.

  • sigma_criterion (float, optional) – The number of standard deviations subtracted from the mean to obtain the isosurface value. Defaults to 3.

  • plot (bool, optional) – Whether to generate a plot of the histogram and density estimate. Defaults to False.

  • show (bool, optional) – Whether to display the plot. Defaults to False.

  • save (str, optional) – File path to save the plot if generated. Defaults to None.

Returns:

The isosurface value. If plot or show is True, also returns the matplotlib figure object.

Return type:

tuple[float, plt.Axes] | float
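
The formula mu - sigma_criterion * sigma can be sketched as follows. This is a simplified illustration, not the library's code: it assumes mu and sigma are the plain sample mean and standard deviation of the non-zero voxels, whereas find_isosurface may estimate them differently (for example from the histogram or its density estimate). The helper name isosurface_sketch is hypothetical.

```python
import numpy as np


def isosurface_sketch(amplitude, sigma_criterion=3.0):
    """Return mu - sigma_criterion * sigma for the non-zero voxels."""
    # Assumption: voxels equal to zero are background and are ignored.
    values = amplitude[amplitude > 0]
    mu = values.mean()
    sigma = values.std()
    return mu - sigma_criterion * sigma


# Synthetic 3D volume: a 20x20x20 object with amplitude close to 1,
# embedded in an empty 40x40x40 array.
rng = np.random.default_rng(0)
amplitude = np.zeros((40, 40, 40))
amplitude[10:30, 10:30, 10:30] = rng.normal(1.0, 0.05, size=(20, 20, 20))

iso = isosurface_sketch(amplitude, sigma_criterion=3)
```

With a mean near 1 and a standard deviation near 0.05, the result lands close to 0.85; a larger sigma_criterion lowers the threshold and keeps more voxels inside the isosurface.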

cdiutils.analysis.stats.get_histogram(data, support=None, bins=50, density=False, region='overall')

Calculate the histogram and, optionally, the kernel density estimate (KDE) of the data. A support mask can be applied to the data beforehand, and the surface and bulk histograms can be calculated as well.

Parameters:
  • data (np.ndarray) – the data to be analysed.

  • support (optional) – the support mask to be applied to the data before histogram calculation. If None, no mask is applied. Defaults to None.

  • bins (int, optional) – number of bins for the histogram. Defaults to 50.

  • density (bool, optional) – whether to normalise the histogram to form a probability density function. Defaults to False.

  • region (str, optional) – region of the data to be analysed. Can be “overall”, “surface”, “bulk” or “all”. Defaults to “overall”.

Returns:

A dictionary containing the histograms for the specified region(s). If kde is True, also includes the KDEs.

Return type:

dict
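
As an illustration of the "overall", "surface" and "bulk" regions, the sketch below splits a support mask with NumPy only. It rests on assumptions that the documentation above does not confirm: the surface is taken to be the one-voxel-thick outer layer of the support (the support minus its erosion), and the returned dictionary maps each region name to the (counts, bin_edges) pair from numpy.histogram. The helper names erode and region_histograms are hypothetical.

```python
import numpy as np


def erode(support):
    """Erode a boolean mask by one voxel (face-connected neighbours)."""
    # Pad with False so that voxels on the array border count as surface.
    padded = np.pad(support.astype(bool), 1, constant_values=False)
    eroded = padded.copy()
    for axis in range(padded.ndim):
        for shift in (-1, 1):
            # A voxel survives only if its neighbour on this side is set.
            eroded &= np.roll(padded, shift, axis=axis)
    # Remove the padding again.
    return eroded[(slice(1, -1),) * padded.ndim]


def region_histograms(data, support, bins=50, density=False):
    """Histogram the data inside the support, its surface and its bulk."""
    support = support.astype(bool)
    bulk = erode(support)
    surface = support & ~bulk
    masks = {"overall": support, "surface": surface, "bulk": bulk}
    return {
        name: np.histogram(data[mask], bins=bins, density=density)
        for name, mask in masks.items()
    }


# Synthetic data with a cubic 8x8x8 support.
rng = np.random.default_rng(0)
data = rng.normal(size=(16, 16, 16))
support = np.zeros(data.shape, dtype=bool)
support[4:12, 4:12, 4:12] = True

histograms = region_histograms(data, support, bins=20)
overall_counts, overall_edges = histograms["overall"]
```

For this cubic support, the bulk is the inner 6x6x6 block and the surface is the remaining shell, so the surface and bulk counts add up to the overall counts.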

cdiutils.analysis.stats.plot_histogram(ax, counts, bin_edges, kde_x=None, kde_y=None, color='lightcoral', fwhm=True, bar_args=None, kde_args=None)

Plot the bars of a histogram as well as the kernel density estimate.

Parameters:
  • ax (plt.Axes) – the matplotlib ax to plot the histogram on.

  • counts (np.ndarray) – the count in each bin from np.histogram().

  • bin_edges (np.ndarray) – the bin edge values from np.histogram().

  • kde_x (np.ndarray, optional) – the x values used to calculate the kernel density estimate values.

  • kde_y (np.ndarray, optional) – the (y) values of the kernel density estimate.

  • color (ColorType, optional) – the colour of the bar and line. Defaults to “lightcoral”.

  • fwhm (bool, optional) – whether to calculate and plot the full width at half maximum (FWHM) of the kernel density estimate. Defaults to True.

  • bar_args (dict, optional) – additional arguments for the matplotlib bar function.

  • kde_args (dict, optional) – additional arguments for the matplotlib fill_between function. Can include boolean “fill” and float “fill_alpha” to control whether to fill the kde area and the alpha value of the fill. Defaults to None.

Returns:

The FWHM of the kernel density estimate if fwhm is True, otherwise None.

Return type:

float
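
The full width at half maximum of a sampled curve such as (kde_x, kde_y) can be estimated as sketched below. This is one possible approach, not necessarily the one plot_histogram uses: it locates the outermost samples at or above half of the maximum and linearly interpolates the two crossings. The helper name fwhm_from_curve is hypothetical.

```python
import numpy as np


def fwhm_from_curve(x, y):
    """Estimate the FWHM of a sampled curve by linear interpolation."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    half = y.max() / 2.0

    # Indices of the first and last samples at or above half maximum.
    above = np.where(y >= half)[0]
    left, right = above[0], above[-1]

    def crossing(i, j):
        # Linear interpolation of the half-maximum crossing between i and j.
        return x[i] + (half - y[i]) * (x[j] - x[i]) / (y[j] - y[i])

    # Fall back to the grid limits if the curve never drops below half.
    x_left = crossing(left - 1, left) if left > 0 else x[0]
    x_right = crossing(right, right + 1) if right < len(x) - 1 else x[-1]
    return x_right - x_left


# Check against a Gaussian, whose FWHM is 2 * sqrt(2 * ln 2) * sigma.
sigma = 0.5
kde_x = np.linspace(-5, 5, 2001)
kde_y = np.exp(-0.5 * (kde_x / sigma) ** 2)

width = fwhm_from_curve(kde_x, kde_y)
expected = 2 * np.sqrt(2 * np.log(2)) * sigma
```

For a multimodal curve this measures the distance between the outermost half-maximum crossings, which may span several peaks.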