%run notebook_setup

Making a Mock Dataset#

In this tutorial, we'll explore how you might construct a mock dataset from a known sky brightness distribution. In many ways, this problem is already solved in a realistic manner by CASA's simobserve task. However, by replicating the key parts of this process with the MPoL framework, we can easily make direct comparisons to images produced using RML techniques.

In a nutshell, this process works by

  1. taking a known sky brightness distribution (i.e., a mock “true” image)

  2. inserting it into an mpol.images.ImageCube

  3. using the mpol.fourier.NuFFT to predict visibilities at provided \(u,v\) locations

  4. adding noise

The final two steps are relatively straightforward. The first two steps are conceptually simple but there are several technical caveats one should be aware of, which we’ll cover now.

Choosing a mock sky brightness distribution#

You can choose a mock sky brightness distribution from a simulation, a parametric model, or even an image from the Internet. For this tutorial, we'll use a JPEG image, since it will highlight many of the problems one might run into. First, we'll download the image and display it.

# use python to download an image
import requests

image_url="https://cdn.eso.org/images/screen/alma-eight_close.jpg"

img_data = requests.get(image_url).content
with open('alma.jpg', 'wb') as handler:
    handler.write(img_data)
from IPython.display import Image
Image("alma.jpg")

The ALMA antennas. Credit: ALMA (ESO/NAOJ/NRAO)#

There are several operations we will need to perform on this image before it is suitable to insert into an mpol.images.ImageCube.

  1. convert the JPEG image to greyscale float64 values

  2. make the image square (if necessary)

  3. choose a target ImageCube size via cell_size and npix. The range of acceptable image dimensions depends on the range of \(u,v\) coordinates

  4. if the raw image size is larger than the target size of the ImageCube, smooth and downsample the image

  5. scale flux units to be Jy/arcsec^2

To perform image manipulations, we’ll use the Pillow library.

Using Pillow to greyscale, apodize, pad, and resize#

from PIL import Image, ImageOps, ImageMath
import numpy as np

im_raw = Image.open("alma.jpg")

# convert to greyscale
im_grey = ImageOps.grayscale(im_raw)

# get image dimensions
xsize, ysize = im_grey.size
print(im_grey.mode)
L

In converting the JPEG image to greyscale (mode “L” or “luminance”), the Pillow library has reduced the color channels to a single axis of 8-bit unsigned integers, which can take on values from 0 to 255. More info on the modes is available in the Pillow documentation. We can see the greyscale image

im_grey
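As a quick optional check, we can confirm the 8-bit storage by inspecting the underlying array (a sanity check only; it is not needed for the rest of the tutorial).

# optional: confirm the greyscale image is stored as 8-bit unsigned integers
arr8 = np.asarray(im_grey)
print(arr8.dtype, arr8.min(), arr8.max())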

Now let’s think about how to make the image square. Depending on the image, we can either crop the longer dimension or pad the larger dimension. Before we do that, though, we also need to think about the image edges.

Because the discrete Fourier transform is used to take an image to the visibility plane, we make the assumption that the image is infinite and periodic in space beyond the field of view, i.e., that it tiles to infinity. Therefore, to avoid introducing spurious spatial frequencies from discontinuous edges, it is a good idea to make sure that the edges of the image all have the same value. The simplest thing to do here is to taper the image edges so that they all go to 0. We can do this by multiplying the image by an apodization function, like the Hann window. We'll multiply two 1D Hann windows to create a 2D apodization window.

xhann = np.hanning(xsize)
yhann = np.hanning(ysize)
# each is already normalized to a max of 1
# so hann is also normed to a max of 1
# broadcast to 2D
hann = np.outer(yhann, xhann)

# now convert the numpy array to a Pillow object
# scale to 0 - 255 and then convert to uint8
hann8 = np.uint8(hann * 255)
im_apod = Image.fromarray(hann8)

We can visualize the 2D Hann apodization window

im_apod

And then use image math to multiply the apodization window against the greyscaled image

im_res = ImageMath.eval("a * b", a=im_grey, b=im_apod)

To give an image with a vignette-like appearance.

im_res

Now, let’s pad the image to be square

max_dim = np.maximum(xsize, ysize)
im_pad = ImageOps.pad(im_res, (max_dim, max_dim))
im_pad

Great, we now have a square, apodized image.

The next thing we should fix is that a 1280 x 1280 image is still more pixels than we need for most ALMA observations. That is, the spatial resolution or “beam size” of most ALMA observations is such that, for a single-pointing observation, we wouldn't need this many pixels to represent the full information content of the image. Therefore, let's resize the image to be a bit smaller.

npix = 500
im_small = im_pad.resize((npix,npix))
im_small
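Note that Pillow's resize uses its own default resampling filter during the downsample. If you notice aliasing artifacts with a different source image, one option (a sketch, not used in the rest of this tutorial) is to request an antialiasing filter such as Lanczos explicitly.

# optional alternative: Lanczos resampling to reduce aliasing during the downsample
im_small_aa = im_pad.resize((npix, npix), resample=Image.LANCZOS)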

Exporting to a Numpy array and setting flux scale#

Now that we have done the necessary image preparation, we're ready to leave the Pillow library and work with numpy arrays and PyTorch tensors. First we convert from a Pillow object to a numpy array

import numpy as np
a = np.array(im_small)

We can see that this array is now a 32-bit integer array (it was promoted from an 8-bit integer array during the ImageMath operation to preserve precision).

a
array([[0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       ...,
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0],
       [0, 0, 0, ..., 0, 0, 0]], dtype=int32)

We will convert this array to a float64 type and normalize its max value to 1.

b = a.astype("float64")
b = b/b.max()

Now, we can plot this array using matplotlib's imshow with the origin="lower" argument, as we normally do with arrays of data from MPoL.

import matplotlib.pyplot as plt
plt.imshow(b, origin="lower")
<matplotlib.image.AxesImage at 0x7f10580fa550>

In doing so, we've uncovered an additional problem: the image is upside down! We can fix this using

c = np.flipud(b)
plt.imshow(c, origin="lower")
<matplotlib.image.AxesImage at 0x7f104c106c10>

In this example, we’re only working with a single-channel mock sky brightness distribution, so we only need to add an extra channel dimension to make an image cube. If we were working with a multi-channel sky brightness distribution, we could repeat the above transformations for each channel of the image cube.

d = np.expand_dims(c, axis=0)

Now let's choose how big we want our mock sky brightness to be on the sky. Adjusting the cell_size changes the maximum spatial frequency that can be represented in the image. That is, a smaller cell_size will enable an image to carry higher spatial frequencies. Changing the number of pixels in the image via npix will change the number of \(u,v\) cells between 0 and the maximum spatial frequency. We effectively chose npix when we performed the resize operation, so all that's left is to choose the cell_size.

cell_size = 0.03 # arcsec
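To build some intuition for this choice, we can do a quick back-of-the-envelope calculation of the maximum spatial frequency that a 0.03 arcsec pixel can Nyquist sample (we will cross-check this against the actual baseline distribution below).

# back-of-the-envelope: a pixel of cell_size [arcsec] Nyquist samples spatial
# frequencies up to 1 / (2 * cell_size) cycles per radian, i.e., in units of wavelengths
arcsec_to_rad = np.pi / (180.0 * 3600.0)
max_freq = 1 / (2 * cell_size * arcsec_to_rad)  # wavelengths
print("maximum representable spatial frequency ~ {:.0f} klambda".format(max_freq / 1e3))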

The final task is to scale the amplitude of the image to the desired level. The ImageCube object will expect the input tensor to be in units of Jy/arcsec^2.

Let's assume that we would like the total flux of our mock image to be 30 Jy, which is a very bright source for ALMA band 6. Then again, the noise levels in the mock baseline distribution we plan to use are relatively high, the baseline distribution lacks short spacings, and we want to make sure our source shines through.

So, if we have assigned each pixel to be 0.03 arcseconds on each side, then each pixel has an area of

pixel_area = cell_size**2 # arcsec^2
print(pixel_area, "arcsec^2")
0.0009 arcsec^2

What is the current flux of the image?

# if the raw image is in units of Jy/arcsec^2, then to calculate the total flux
# we convert to Jy/pixel by multiplying by the pixel area
# and then sum over all pixels
old_flux = np.sum(d * pixel_area)
print(old_flux, "Jy")
17.617448544297993 Jy

So, if we want the image to have a total flux of 30 Jy, we need to multiply by a factor of

flux_scaled = 30/old_flux * d
print("Total flux of image is now {:.1f} Jy".format(np.sum(flux_scaled * pixel_area)))
Total flux of image is now 30.0 Jy

Initializing ImageCube#

Now, we’ll convert the numpy array to a PyTorch tensor

import torch
img_tensor = torch.tensor(flux_scaled.copy())

And finally, we’ll shift the tensor from a “Sky Cube” to a “Packed Cube” as the ImageCube expects

from mpol import utils
img_tensor_packed = utils.sky_cube_to_packed_cube(img_tensor)
from mpol.images import ImageCube
image = ImageCube.from_image_properties(cell_size=cell_size, npix=npix, nchan=1, cube=img_tensor_packed)

If you want to double-check that the image was correctly inserted, you can do

# double check it went in correctly
plt.imshow(np.squeeze(utils.packed_cube_to_sky_cube(image()).detach().numpy()), origin="lower")

to see that it’s upright and not flipped.

Obtaining \(u,v\) baseline and weight distributions#

One of the key use cases for producing a mock dataset from a known sky brightness is to test the ability of an RML algorithm to recover the “true” image. \(u,v\) baseline distributions from real interferometric arrays like ALMA, VLA, and others are highly structured sampling distributions that are difficult to accurately replicate using distributions available to random number generators.

Therefore, we always recommend generating fake data using \(u,v\) distributions from real datasets, or using those produced by realistic simulators like CASA's simobserve task. In this example, we'll just use the baseline distribution from the mock dataset we've used in many of the tutorials. You can see a plot of it in the Gridding and Diagnostic Images tutorial. We'll only need the \(u,v\) and weight arrays.

from astropy.utils.data import download_file
from mpol.__init__ import zenodo_record

# load the mock dataset of the ALMA logo
fname = download_file(
    f"https://zenodo.org/record/{zenodo_record}/files/logo_cube.noise.npz",
    cache=True,
    show_progress=True,
    pkgname="mpol",
)

# select the components for a single channel
chan = 4
d = np.load(fname)
uu = d["uu"][chan]
vv = d["vv"][chan]
weight = d["weight"][chan]

MPoL has a helper routine to calculate the maximum cell_size that can still Nyquist sample the highest spatial frequency in the baseline distribution.

max_uv = np.max(np.array([uu,vv]))
max_cell_size = utils.get_maximum_cell_size(max_uv)
print("The maximum cell_size that will still Nyquist sample the spatial frequency represented by the maximum u,v value is {:.2f} arcseconds".format(max_cell_size))
assert cell_size < max_cell_size
The maximum cell_size that will still Nyquist sample the spatial frequency represented by the maximum u,v value is 0.09 arcseconds

Thankfully, we see that we already chose a sufficiently small cell_size.

Making the mock dataset#

With the ImageCube, \(u,v\) and weight distributions now in hand, generating the mock visibilities is relatively straightforward using the mpol.fourier.make_fake_data() routine. This routine uses the NuFFT to produce loose visibilities at the \(u,v\) locations and then adds random Gaussian noise to the visibilities, drawn from a probability distribution set by the value of the weights.

from mpol import fourier
# will have the same shape as the uu, vv, and weight inputs
data_noise, data_noiseless = fourier.make_fake_data(image, uu, vv, weight)

print(data_noise.shape)
print(data_noiseless.shape)
print(data_noise)
(1, 325080)
(1, 325080)
[[ 1.12276852-0.5931206j  10.61054223+0.83476027j -1.0241611 +1.12850301j
  ... -1.15520768-2.83258039j  1.01975462-3.07777726j
   1.92658731-0.61666572j]]
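For reference, the noise added here follows the standard convention that the weight is \(w = 1/\sigma^2\), where \(\sigma\) is the standard deviation of the real and imaginary components. A minimal sketch of the equivalent manual operation (assuming, as the printed output above suggests, that data_noiseless is a plain numpy array) would look like

# sketch: draw Gaussian noise with sigma = 1/sqrt(weight) for the real and imaginary parts
rng = np.random.default_rng(seed=42)
sigma = 1 / np.sqrt(weight)
data_manual = data_noiseless + rng.normal(scale=sigma) + 1.0j * rng.normal(scale=sigma)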

Now you could save this to disk. Since this is a continuum dataset, we'll remove the channel dimension from the mock visibilities

data = np.squeeze(data_noise)
# data = np.squeeze(data_noiseless)
np.savez("mock_data.npz", uu=uu, vv=vv, weight=weight, data=data)

Now you could use this dataset just like any other when doing RML inference, with the added benefit of a “ground truth” reference image to compare against.
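For example, a later analysis script might reload the mock visibilities like this (a minimal sketch):

# reload the saved mock visibilities
archive = np.load("mock_data.npz")
uu, vv, weight, data = archive["uu"], archive["vv"], archive["weight"], archive["data"]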

Verifying the mock dataset#

To make sure the whole process worked OK, we'll load the visibilities and then make a dirty image. We'll set the coordinates of the gridder and dirty image to be exactly the same as those of our input image, so that we can make a pixel-to-pixel comparison. Note that this isn't strictly necessary, though. We could make a range of images with different cell_size and npix values.

from mpol import coordinates, gridding

# we'll set the image coordinates to match those of the original image
coords = coordinates.GridCoords(cell_size=cell_size, npix=npix)

imager = gridding.DirtyImager(
    coords=coords,
    uu=uu,
    vv=vv,
    weight=weight,
    data_re=np.squeeze(np.real(data)),
    data_im=np.squeeze(np.imag(data)),
)
# expected thermal noise: sigma = 1/sqrt(sum(weight)) for a naturally weighted point source
C = 1 / np.sum(weight)
noise_estimate = C * np.sqrt(np.sum(weight))
print(noise_estimate, "Jy / dirty beam")
0.0025689290972358914 Jy / dirty beam
img, beam = imager.get_dirty_image(weighting="briggs", robust=1.0, unit="Jy/arcsec^2")
chan = 0
kw = {"origin": "lower", "interpolation": "none", "extent": imager.coords.img_ext}
fig, ax = plt.subplots(ncols=2, figsize=(6.0, 4))
ax[0].imshow(beam[chan], **kw)
ax[0].set_title("beam")
ax[1].imshow(img[chan], **kw)
ax[1].set_title("image")
for a in ax:
    a.set_xlabel(r"$\Delta \alpha \cos \delta$ [${}^{\prime\prime}$]")
    a.set_ylabel(r"$\Delta \delta$ [${}^{\prime\prime}$]")
fig.subplots_adjust(left=0.14, right=0.90, wspace=0.35, bottom=0.15, top=0.9)

We can even subtract this on a pixel-by-pixel basis and compare to the original image.

chan = 0
kw = {"origin": "lower", "interpolation": "none", "extent": imager.coords.img_ext}
fig, ax = plt.subplots(ncols=3, figsize=(6.0, 3))

ax[0].imshow(flux_scaled[chan], **kw)
ax[0].set_title("original")

ax[1].imshow(img[chan], **kw)
ax[1].set_title("dirty image")

ax[2].imshow(flux_scaled[chan] - img[chan], **kw)
ax[2].set_title("difference")

ax[0].set_xlabel(r"$\Delta \alpha \cos \delta$ [${}^{\prime\prime}$]")
ax[0].set_ylabel(r"$\Delta \delta$ [${}^{\prime\prime}$]")

for a in ax[1:]:
    a.xaxis.set_ticklabels([])
    a.yaxis.set_ticklabels([])

fig.subplots_adjust(left=0.14, right=0.90, wspace=0.2, bottom=0.15, top=0.9)

The subtraction reveals some interesting artefacts.

  1. the dirty image and difference image have substantial emission in regions away from the true locations of flux. This is because the dirty beam sidelobes spread flux from the center of the image to other regions. CLEAN or RML would remove most of these features.

  2. the difference image has fine-featured residuals in the center, corresponding to the edges of the antenna dishes and support structures. This is because the dirty beam has some Briggs weighting applied to it, and is closer to natural weighting than uniform weighting. This means that the spatial resolution of the dirty image is not as high as that of the original image, and thus high spatial frequency features, like the edges of the antennae, are not reproduced in the dirty image. Pushing the beam closer to uniform weighting would capture some of these finer structured features, but at the expense of higher thermal noise in the image (see the sketch after this list).

  3. the faint “halo” surrounding the antennas in the original image (the smooth blue sky and brown ground, in the actual JPEG) has been spatially filtered out of the dirty image. This is because this mock baseline distribution was generated for a more extended ALMA configuration without a sufficient number of short baselines.
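As a quick experiment on the second point, you could re-image the same visibilities with a lower robust value, pushing the weighting closer to uniform, and compare the results. A sketch:

# re-image with robust=-1.0: closer to uniform weighting, i.e., higher resolution
# at the expense of higher thermal noise
img_uni, beam_uni = imager.get_dirty_image(weighting="briggs", robust=-1.0, unit="Jy/arcsec^2")
plt.imshow(img_uni[0], origin="lower", extent=imager.coords.img_ext)
plt.title("dirty image, robust=-1.0")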