Datasets

class mpol.datasets.GriddedDataset(*args: Any, **kwargs: Any)
Parameters:
  • coords (GridCoords) – an object already instantiated from the GridCoords class. If providing this, cannot provide cell_size or npix.

  • vis_gridded (torch.Tensor of torch.complex128) – the gridded visibility data stored in a “packed” format (pre-shifted for fft)

  • weight_gridded (torch.Tensor) – the weights corresponding to the gridded visibility data, also in a packed format

  • mask (torch.Tensor of torch.bool) – a boolean mask to index the non-zero locations of vis_gridded and weight_gridded in their packed format.

  • nchan (int) – the number of channels in the image (default = 1).

After initialization, the GriddedDataset provides the non-zero cells of the gridded visibilities and weights as a 1D vector via the following instance variables. This means that any individual channel information has been collapsed.

Variables:
  • vis_indexed – 1D complex tensor of visibility data

  • weight_indexed – 1D tensor of weight values

If you index the output of the Fourier layer in the same manner using self.mask, then the model and data visibilities can be directly compared using a loss function.
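The indexing described above can be sketched with plain PyTorch tensors. This is a minimal illustration, not MPoL code: `mask`, `model_cube`, `vis_indexed`, and `weight_indexed` are stand-ins built by hand, and the weighted squared residual at the end is an assumed, illustrative loss rather than the library's own loss function.

```python
import torch

nchan, npix = 2, 4

# Stand-in for a packed boolean mask of shape (nchan, npix, npix);
# True marks cells that contain gridded data.
mask = torch.zeros((nchan, npix, npix), dtype=torch.bool)
mask[0, 0, 1] = True
mask[0, 2, 3] = True
mask[1, 1, 1] = True

# Stand-in for the packed model visibilities produced by a Fourier layer.
torch.manual_seed(0)
model_cube = torch.complex(
    torch.randn(nchan, npix, npix, dtype=torch.float64),
    torch.randn(nchan, npix, npix, dtype=torch.float64),
)

# Stand-ins for the dataset's vis_indexed and weight_indexed vectors.
vis_indexed = torch.zeros(3, dtype=torch.complex128)
weight_indexed = torch.ones(3, dtype=torch.float64)

# Boolean indexing collapses the channel and spatial axes into one 1D vector.
model_indexed = model_cube[mask]

# Illustrative weighted squared residual (an assumption, not MPoL's loss).
residual = vis_indexed - model_indexed
loss = torch.sum(weight_indexed * (residual.real**2 + residual.imag**2))
```

Because the data and the model are indexed with the same mask, element i of `model_indexed` corresponds to element i of `vis_indexed`.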

add_mask(mask: ArrayLike) → None

Apply an additional mask to the data. Only works as a data limiting operation (i.e., mask is more restrictive than the mask already attached to the dataset).

Parameters:

mask (2D numpy array or PyTorch tensor) – boolean mask (in packed format) to apply to the dataset. Assumes the input will be broadcast across all channels.
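A data-limiting mask of this kind can be sketched as a logical AND between the existing mask and the new one. The NumPy sketch below assumes that is what the operation amounts to; `existing_mask` and `extra_mask` are hypothetical arrays, not attributes of the real class.

```python
import numpy as np

nchan, npix = 3, 4

# Stand-in for the mask already attached to a dataset, shape (nchan, npix, npix).
existing_mask = np.zeros((nchan, npix, npix), dtype=bool)
existing_mask[:, 0, :] = True  # first packed row holds data in every channel

# A 2D packed mask that keeps only the first two columns.
extra_mask = np.zeros((npix, npix), dtype=bool)
extra_mask[:, :2] = True

# The 2D mask broadcasts across the channel axis. A logical AND can only
# remove cells from the existing mask, never add new ones.
combined = np.logical_and(existing_mask, extra_mask)
```

Cells that are True in `extra_mask` but False in `existing_mask` stay False, which is why the operation can only restrict the data.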

forward(modelVisibilityCube: torch.Tensor) → torch.Tensor
Parameters:

modelVisibilityCube (complex torch.Tensor) – model visibility cube with shape (nchan, npix, npix) to be indexed, in “pre-packed” format, as output by mpol.fourier.FourierCube.forward()

Returns:

1D torch tensor of indexed model samples, collapsed across cube dimensions.

Return type:

torch complex tensor

property ground_mask: torch.Tensor

The boolean mask, arranged in ground format.

Returns:

3D mask cube of shape (nchan, npix, npix)

Return type:

torch.Tensor of torch.bool
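The relationship between the packed and ground layouts can be sketched with an FFT shift. This assumes that ground format is the packed cube with `fftshift` applied over the two spatial axes; the sketch uses NumPy for brevity and a hand-built mask rather than a real dataset.

```python
import numpy as np

nchan, npix = 1, 4

# Packed mask with a single True cell at the zero-frequency corner [0, 0].
packed_mask = np.zeros((nchan, npix, npix), dtype=bool)
packed_mask[0, 0, 0] = True

# Assumption: ground format is the packed cube shifted over the spatial axes,
# which moves the zero-frequency cell to the centre of the array.
ground_mask = np.fft.fftshift(packed_mask, axes=(1, 2))
```

After the shift, the zero-frequency cell sits at index `[npix // 2, npix // 2]` in each channel.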

class mpol.datasets.Dartboard(coords: GridCoords, q_edges: ndarray[Any, dtype[floating[Any]]] | None = None, phi_edges: ndarray[Any, dtype[floating[Any]]] | None = None)

A polar coordinate grid relative to a GridCoords object, reminiscent of a dartboard layout. The main utility of this object is to support splitting a dataset along radial and azimuthal bins for k-fold cross validation.

Parameters:
  • coords (GridCoords) – an object already instantiated from the GridCoords class. If providing this, cannot provide cell_size or npix.

  • q_edges (1D numpy array) – an array of radial bin edges to set the dartboard cells in \([\mathrm{k}\lambda]\). If None, defaults to 12 log-linearly spaced radial bins stretching from 0 to the \(q_\mathrm{max}\) represented by coords.

  • phi_edges (1D numpy array) – an array of azimuthal bin edges to set the dartboard cells in [radians], over the domain \([0, \pi]\), which is also implicitly mapped to the domain \([-\pi, \pi]\) to preserve the Hermitian nature of the visibilities. If None, defaults to 8 equally spaced azimuthal bins spanning \(0\) to \(\pi\).
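Custom bin edges are ordinary NumPy arrays. The sketch below builds eight equally spaced azimuthal bins and one possible set of twelve radial bins; `q_max` is a hypothetical value that would normally come from the GridCoords object, and the logarithmic radial spacing is an illustrative choice, not the library default.

```python
import numpy as np

# Hypothetical maximum baseline in kilolambda; in practice this would be
# taken from the GridCoords object.
q_max = 1000.0

# Eight equally spaced azimuthal bins from 0 to pi require nine edges.
phi_edges = np.linspace(0.0, np.pi, num=8 + 1)

# Twelve radial bins: an edge at zero, then logarithmically spaced edges
# up to q_max (illustrative spacing, not the library default).
q_edges = np.concatenate(([0.0], np.geomspace(10.0, q_max, num=12)))
```

Arrays built this way could then be supplied as the `q_edges` and `phi_edges` arguments. N bins always require N + 1 monotonically increasing edges.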

get_polar_histogram(qs: ndarray[Any, dtype[floating[Any]]], phis: ndarray[Any, dtype[floating[Any]]]) → ndarray[Any, dtype[floating[Any]]]

Calculate a histogram in polar coordinates, using the bin edges defined by q_edges and phi_edges during initialization. Data coordinates should include the points for the Hermitian visibilities.

Parameters:
  • qs – 1d array of q values \([\lambda]\)

  • phis – 1d array of datapoint azimuth values [radians] (must be the same length as qs)

Returns:

2d integer numpy array of cell counts, i.e., how many datapoints fell into each dartboard cell.
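A polar histogram of this kind can be sketched with `numpy.histogram2d`. The edges and datapoints below are made up, and `qs` and `q_edges` are assumed to share the same units; whether the method is implemented this way internally is also an assumption.

```python
import numpy as np

q_edges = np.array([0.0, 10.0, 100.0, 1000.0])   # 3 radial bins
phi_edges = np.linspace(0.0, np.pi, num=4 + 1)   # 4 azimuthal bins

# Made-up datapoints: radial distance and azimuth of each visibility.
qs = np.array([5.0, 50.0, 55.0, 500.0])
phis = np.array([0.1, 1.0, 1.1, 3.0])

# Rows index radial bins, columns index azimuthal bins.
counts, _, _ = np.histogram2d(qs, phis, bins=[q_edges, phi_edges])
counts = counts.astype(int)
```

Here `counts` has shape (3, 4), and the two points near q = 50 fall into the same cell, `counts[1, 1]`.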

get_nonzero_cell_indices(qs: ndarray[Any, dtype[floating[Any]]], phis: ndarray[Any, dtype[floating[Any]]]) → ndarray[Any, dtype[integer[Any]]]

Return a list of the cell indices that contain data points, using the bin edges defined by q_edges and phi_edges during initialization. Data coordinates should include the points for the Hermitian visibilities.

Parameters:
  • qs – 1d array of q values \([\lambda]\)

  • phis – 1d array of datapoint azimuth values [radians] (must be the same length as qs)

Returns:

list of cell indices where cell contains at least one datapoint.
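Given a 2D array of cell counts, the occupied cells can be recovered with `numpy.argwhere`. This is a sketch under the assumption that the returned indices are [q_cell, phi_cell] pairs; the `counts` array is made up.

```python
import numpy as np

# Made-up cell counts: rows are radial bins, columns are azimuthal bins.
counts = np.array([
    [1, 0, 0, 0],
    [0, 2, 0, 0],
    [0, 0, 0, 1],
])

# Each row of the result is one [q_cell, phi_cell] pair with at least one datapoint.
cell_indices = np.argwhere(counts > 0)
```

For k-fold cross validation, an array like `cell_indices` could be shuffled and split into k groups, with each group defining the cells held out for one fold.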

build_grid_mask_from_cells(cell_index_list: ndarray[Any, dtype[integer[Any]]]) → ndarray[Any, dtype[bool_]]

Create a boolean mask of size (npix, npix) (in packed format) corresponding to the vis_gridded and weight_gridded quantities of the GriddedDataset.

Parameters:

cell_index_list (list) – list or iterable containing [q_cell, phi_cell] index pairs to include in the mask.

Returns: (numpy array) 2D boolean mask in packed format.
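The construction of such a mask can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the library's implementation: the grid spacing `du`, the bin edges, and the cell list are made up, and the frequency coordinates are laid out directly in packed (zero-frequency-first) order with `numpy.fft.fftfreq`. Each azimuthal bin is also tested against its range shifted by \(-\pi\) so that Hermitian counterparts are included.

```python
import numpy as np

npix = 8
du = 1.0  # hypothetical uv cell spacing

# Frequency coordinates in packed (zero-frequency-first) order.
u = np.fft.fftfreq(npix) * npix * du
uu, vv = np.meshgrid(u, u)   # uu varies along columns, vv along rows
qq = np.hypot(uu, vv)        # radial distance of each grid cell
phi = np.arctan2(vv, uu)     # azimuth of each grid cell, in [-pi, pi]

q_edges = np.array([0.0, 2.5, 6.0])
phi_edges = np.array([0.0, np.pi / 2, np.pi])
cell_index_list = [[0, 0]]   # [q_cell, phi_cell] pairs to include

mask = np.zeros((npix, npix), dtype=bool)
for q_cell, phi_cell in cell_index_list:
    q_lo, q_hi = q_edges[q_cell], q_edges[q_cell + 1]
    p_lo, p_hi = phi_edges[phi_cell], phi_edges[phi_cell + 1]
    in_q = (qq >= q_lo) & (qq < q_hi)
    in_phi = (phi >= p_lo) & (phi < p_hi)
    # Hermitian counterpart of the same azimuthal bin, shifted by -pi.
    in_phi_conj = (phi >= p_lo - np.pi) & (phi < p_hi - np.pi)
    mask |= in_q & (in_phi | in_phi_conj)
```

With these inputs, the cell at (u, v) = (1, 1) and its Hermitian counterpart at (-1, -1) are both selected, while (u, v) = (-1, 1) falls in a different azimuthal bin and is not. This sketch does not treat points lying exactly on the bin boundaries specially.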