S-1 based DEM generation

Masters programme | E-portfolio
Semester I & II
Remote Sensing Applications

Juli 6, 2022
DEM, SAR

Investigation into the usability of radar-based terrain models for analysing the landscape-altering effects of the Brumadinho dam collapse (Brazil, 2019)

Intro: SAR based DEM generation

The following explanations providing an overview on SAR-based DEM generation are mainly based on Braun 2021. For further information see the compilation of references at the end of this post.

With radar imaging not only amplitude/intensity but also phase information is retained. φ is proportional to the two-way travel distance 2R divided by the transmitted wavelength λ.

Basically, this phase information related to distance information can be exploited in two main contexts: topographic mapping and surface deformation analyses. The underlying idea in both cases is to use multiple SAR images to analyse and focus on the different well-known contributors to phase variations in a combined product of these single images. This combined product is called interferogram and retrieved by cross-multiplying the phase of the first image with the complex conjugate of the second image. Phase variations φ visible in interferograms can stem from topography, the Earth’s curvature, surface deformations, atmospheric contributions and system-inherent noise:

$$\varphi = \varphi_{topo} + \varphi_{flat} + \varphi_{disp} + \varphi_{atm} + \varphi_{noise} $$

Whereas the goal for surface deformation analyses is to focus on line-of-sight displacements $\varphi_{disp}$ part by excluding or reducing all other factors, topographic mapping tries to highlight the influence of topographic variations $\varphi_{topo}$. Note that the retrieval of height information via this radar interferometric approach is fundamentally different compared to the derivation of 3D information from stereoscopic mapping (e.g. UAV data and structure from motion methods). The topographic height variation observed in the interferogram can be translated into height by consideration of the following relationship:

$$ \varphi_{topo} = -\frac{4\pi }{\lambda } \frac{b_{perp}}{R \ast sin\theta} \ast H$$

where $ b_{perp}$ is the distance between the satellite’s position when taking the two images, $\lambda$ is the wavelength, R is the range distance between the sensor and the Earth, θ the incidence angle and H the height ambiguity.

The height ambiguity refers to one 2pi fringe cycle and depends on the specific values for the other parameters in the equation above. If topographic height variations are the dominating source of phase variations, the corresponding interferogram will exhibit a clear fringe pattern. This first of all requires a coregistration of the used images with sub-pixel accuracy to derive at accurately stacked images without geometrical errors. Also, the flat earth contribution needs to be removed during interferogram formation. Subsequent filtering approaches, such as an application of the Goldstein filter based on the Fast Fourier Transform, are recommended to reduce noise. After these steps, the (filtered) interferogram represents the first intermediate product within the processing chain towards DEM generation. Alongside, the local coherence as a cross-correlation measure between reference and secondary image phases is often calculated in order to assess the quality of the interferogram. Low coherence values can be a result of various decorrelation sources, the most significant of which are the following:

scattering mechanism changes between both images (temporal decorrelation)
incidence angle changes between both acquisitions (baseline decorrelation)
penetration of the radar wave into mediums (volume decorrelation)
thermal noise and other antenna-related sources of bias
processing-induced errors such as low coregistration quality

These decorrelation sources introduce noise into the interferogram and complicate the retrieval of accurate height information in the next steps of phase unwrapping and height conversion. Phase unwrapping refers to the process of integrating the cyclic phase measure over the entire image along closed paths in order to get a continuous height image. Finally, height conversion deals with the geocoding of the continuous image and translation into actual geophysical units, e.g. meters above sea level.

Deriving DEMs from S1 data is commonly considered to be a challenging task as the image acquisition was optimised for differential interferometry posing different requirements regarding temporal and geometrical baselines. The complications which frequently arise may best be illustrated by taking a quick look at two of the most used SAR-based elevation models and their underlying satellite missions:

Shuttle Radar Topography Mission (SRTM)

SRTM was conducted in 2000 to produce a nearly global DEM with a spatial resolution of 30m. The images necessary for forming the interferogram were acquired at the same time by two antennas separated by about 60m. This bistatic or single-pass interferometry setting is particularly suited for DEM generation as temporal decorrelation introduced by changes in the surface or atmosphere is minimised and the perpendicular baseline can be specified very precisely.

German Aerospace Centre DEM

This DEM at a spatial resolution of 12.5m is based on images acquired by TerraSAR-X and TanDEM-X. Again, the bistatic constellation with simultaneous image acquisitions is used here, though the constellation is built of two independent satellites moving in a helix-shaped orbit.

Compared to these settings, Sentinel 1 operates in a monostatic mode meaning that only repeat-pass interferometry with images acquired at different times can be performed. This brings the problem of temporal phase decorrelation, mostly over vegetated areas which – given the wavelength of C-Band Sentinel 1 – lose their coherent scattering properties within minutes. Additionally, the perpendicular baseline is suboptimal as the narrow orbital tube of Sentinel 1 leads to height ambiguities of more than 500m. Smaller height ambiguities allowing more precise descriptions of terrain features require to have two Sentinel 1 scenes with a baseline larger than 150m – a condition that is rarely met. Nonetheless, the high temporal resolution and open data policy of Sentinel 1 makes it an interesting source to be exploited for DEM generations.

Case study: Brumadinho dam collapse

Back in 2019, the Brumadinho dam collapse was one of the most devastating events in the brazilian history. Some key facts are listed below:

Place: Córrego do Feijão iron ore mine complex
Time: 25 January 2019
Consequences: >200 deaths

As its evident from the Sentinel 2 timeseries, the event did not only cause lots of deaths but also altered the landscape across several kilometers. The failure occured across the full 720 m width and 86 m height of the dam and more than 11.7 million m³ of tailings were released downstream. From a research point of view, one interesting question refers to the ability to quantify the landscape-altering effect of the event in terms of the volume of eroded material. To this aim, the current study tries to use differential terrain models derived from Sentinel 1. The motivation for using Sentinel 1 is twofold:

First, there is a poor availability of elevation data for this area. Usually, one would rely on highly accurate LIDAR or drone data to generate elevation models for such a small-scale area. To the best of my knowledge, no such data exists in the current case, so that S1 based DEMs may be suited to get an inital estimation.

Second, compared to other similar mass movements associated with the dislocation of material, this one can be considered a major event with dimensions and volumes that are potentially suitable to be detected via Sentinel 1 despite the well-known inaccuracies in DEM generation.

Data & Methods

First, focussing on the dam collapse event requires an accurate spatial delineation of the event. To this end, the two closest cloudfree Sentinel 2 images, one before and one after the event, were selected. As the mudflow is likely to appear spectrally homogeneous on the second image, this image is chosen to perform a segmentation. Based on the sudden NDVI decrease between the two image acquisitions together with constraints regarding the minimum object size, all segments which do not correspond to area effect by the event are filtered out.

After the delineation of the AoI, the main part of the analyses is a comparison of pre- and post-event DEMs. The generation of DEMs via Sentinel 1 has been automated recently within the framework of the SLIDEM project by creating a corresponding docker container that solves the following tasks:

querying of scene pairs fulfilling minimum requirements regarding temporal and geometric baselines via ASF Baseline search tool executed in the background
downloading the scene pair(s)
executing a standard DEM creation workflow via ESAs SNAP and outputting main intermediate and final results, i.e. wrapped and unwrapped phase, coherence and DEM

In order to mitigate the flaws in S1 based DEMs, multiple DEMs were created for the pre-event and post-event period. This enables a subsequent selection of those, which have the highest coherence values and are thus likely to be more reliable than others. Only the most reliable pre-event DEM and the most reliable post-event DEM are then used to create a differential DEM. Additionally, the second (and third) most reliable DEM are included to create a mask based on a stack of multiple differential DEMs. The underlying idea is that high uncertainties in the differential DEM product can be detected and excluded if there is a huge variation across various differential DEMs, each based on a pair of reliable DEMs.

Results & Discussion

A. Event delineation

The two closest S2 images could be obtained for the 7th January and 1st February. The corresponding NDVI images clearly show the change due to the mudflow destroying the vegetation previously being in place. As visible in the differential NDVI to the right, NDVI decreases mainly occur in the central part for the mudflow, though some decreases correspond to areas unrelated to the actual event. These declines are due to other ongoing lanscape changes, for example, mine constructions in the southern part. Still, by applying the described object-based thresholding on mean NDVI changes and minimum object sizes, the outline of the event could be created and other changed areas excluded.

Note that there are way more elaborated ways to delineate the mudflow. One could for example also rely on SAR coherence images to detect changes and the object-based criteria max be expanded beyond the simple NDVI and size parameters. However, for the purposes of the current analyses, the current approach is sufficient and enables to derive at delineations in a fast and easy manner.

B. Volume estimation

Within the series of pre- and post-event DEMs presented below, clear differences in the calculated heights can be seen in some cases. A large homogeneity and low variance of the results within the pre- and post-event groups combined with a large intergroup heterogeneity due to the dam collapse is not apparent at first glance. It should be noted that for all of the terrain models presented, low overall coherence values were calculated for the underlying interferograms. Despite the low temporal baselines of a few days, the maximum coherence is 0.3. This is mainly explained by the dense vegetation within the study area subsequently leading to strong deviations in the derived elevation models. It is interesting to observe that for both the 2018 and 2019 model series, there seems to be a tendency for lower coherence values and larger variances for models based on S1 scenes of later months (October, November). Given the spatial location of the study area of about 20°S, this seems plausible in that increasing precipitation in the southern hemisphere summer makes enhanced atmospheric noise likely.

The low resilience of the derived elevation models with their associated errors is then subsequently also shown in the differential DEM. Despite selection of the individual DEMs with the highest coherence values as the basis of the differential formation, elevation changes of more than +-100m show up in places where no elevation change would be expected. While apparent subsidence is consistently indicated in the mountainous northeast, the vast remaining part of the study area is characterised by supposedly moderate elevation. Most of these changes can be classified as false positives, as no significant changes in relief are observed within the year except for the area affected by the dam collapse. However, if we now take a closer look at this area affected by dam collapse, a surprisingly plausible representation of the elevation changes emerges. In the northernmost part of the immediate dam collapse area, subsidence of several tens of metres is indicated, while in the central and southern part of the mudflow, elevations/risings can be interpreted as a sign of material accumulation. To what extent a quantification of material quantities based on this differential DEM is reliable can be doubted in view of the generally high errors. But at least the possibility to detect the event as such can be confirmed by the precise correspondence of the delineated boundaries of the event with the identified subsidence area (and possibly also accumulation area).

Finally, if masking of unreliable values is applied using the procedure described above, at least parts of the uplift and subsidence areas previously identified as inaccurate can be excluded. The changes within the delineated event area, on the other hand, are described in a consistent manner across different differential DEMs, resulting in low standard deviations and corresponding retention of the areas in the masked differential DEM.

References & Further Literature

SAR-based DEM generation

Abad, L., Hölbling, D., Dabiri, Z., & Robson, B. (2022). An open-source Python package for DEM generation and landslide volume estimation based on Sentinel-1 imagery. https://doi.org/10.5194/egusphere-egu22-693
Braun, A. (2021). DEM generation with Sentinel-1: Workflow and Challenges. Open Geosciences, 13, 532–569.
Braun, A. (2021). Retrieval of digital elevation models from Sentinel-1 radar data – open applications, techniques, and limitations. Open Geosciences, 13(1), 532–569. https://doi.org/10.1515/geo-2020-0246
Dabiri, Z., Hölbling, D., Abad, L., Helgason, J. K., Sæmundsson, Þ., & Tiede, D. (2020). Assessment of Landslide-Induced Geomorphological Changes in Hítardalur Valley, Iceland, Using Sentinel-1 and Sentinel-2 Data. Applied Sciences, 10(17), 5848. https://doi.org/10.3390/app10175848

Brumadinho dam collapse event

F. Gama, F., Mura, J. C., R. Paradella, W., & G. de Oliveira, C. (2020). Deformations Prior to the Brumadinho Dam Collapse Revealed by Sentinel-1 InSAR Data Using SBAS and PSI Techniques. Remote Sensing, 12(21), 3664. https://doi.org/10.3390/rs12213664
Grebby, S., Sowter, A., Gluyas, J., Toll, D., Gee, D., Athab, A., & Girindran, R. (2021). Advanced analysis of satellite data reveals ground deformation precursors to the Brumadinho Tailings Dam collapse. Communications Earth & Environment, 2(1), 1–9. https://doi.org/10.103/s43247-020-00079-2
Lumbroso, D., Davison, M., Body, R., & Petkovšek, G. (2021). Modelling the Brumadinho tailings dam failure, the subsequent loss of life and how it could have been reduced. Natural Hazards and Earth System Sciences, 21(1), 21–37. https://doi.org/10.5194/nhess-21-21-2021
Silva Rotta, L. H., Alcântara, E., Park, E., Negri, R. G., Lin, Y. N., Bernardo, N., Mendes, T. S. G., & Souza Filho, C. R. (2020). The 2019 Brumadinho tailings dam collapse: Possible cause and impacts of the worst human and environmental disaster in Brazil. International Journal of Applied Earth Observation and Geoinformation, 90, 102119. https://doi.org/10.1016/j.jag.2020.102119

Appendix - Scripts

				
					# load standard libraries
import os
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import xarray as xr
import rioxarray as rxr

# load segmentation libraries
from skimage.segmentation import felzenszwalb
from skimage.segmentation import slic
from skimage.segmentation import mark_boundaries

# show warnings only once
import warnings
warnings.filterwarnings(action='once')

# allow multiple outputs per cell
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

# set plotting style to default
plt.style.use('default')

# data dirs
s1_dir = os.path.join(os.getcwd(), "SLIDEM-python", "data", "results")
s2_dir = os.path.join(os.getcwd(), "SLIDEM-python", "data", "s2")
print(f"{len(os.listdir(s1_dir))} DEM products are available in the directory")
print(f"{len(os.listdir(s2_dir))} S2 images are available in the directory")

				
					# delineate mudflow based on NDVI
# ndvi calculating function
def calculate_ndvi(prod = "s2_20190107"):
    # define dirs
    _s2_dir = os.path.join(s2_dir, prod)
    red_band_dir = os.path.join(_s2_dir, [prod for prod in os.listdir(_s2_dir) if "L2A_B04" in prod][0])
    nir_band_dir = os.path.join(_s2_dir, [prod for prod in os.listdir(_s2_dir) if "L2A_B08" in prod][0])

    # get data
    red_band = rxr.open_rasterio(red_band_dir)
    nir_band = rxr.open_rasterio(nir_band_dir)

    # NDVI calculation
    ndvi = (nir_band - red_band)/(nir_band + red_band)
    return ndvi

# calculate pre/post event ndvi
ndvi_pre = calculate_ndvi(prod = "s2_20190107")
ndvi_post = calculate_ndvi(prod = "s2_20190201")
ndvi_diff = ndvi_post - ndvi_pre

# create segments based on post ndvi
# alternative slic sgementation
# segments = slic(ndvi_post.to_numpy(), n_segments=400, compactness=0.1, channel_axis=0)
segments = felzenszwalb(ndvi_post.to_numpy(), scale=200, min_size=50, channel_axis=0)
segments = xr.DataArray(segments, coords={'x': ndvi_post["x"],'y': ndvi_post["y"]}, dims=["y", "x"])

# plotting ndvi & changes
f, ax = plt.subplots(1, 3, figsize=(20,5))
ndvi_pre.plot(ax=ax[0], cmap='YlGn', vmin=-1, vmax=1)
ax[0].set(title="NDVI 01/07")
ax[0].set_axis_off()
ndvi_post.plot(ax=ax[1], cmap='YlGn', vmin=-1, vmax=1)
ax[1].set(title="NDVI 02/01")
ax[1].set_axis_off()
ndvi_diff.plot(ax=ax[2])
xr.plot.contour(segments, cmap="black", levels=len(np.unique(segments.to_numpy())), linewidths=1)
ax[2].set(title="NDVI difference")
ax[2].set_axis_off()
plt.show();

# object-based delineation of mudflow
# thresholding bases on NDVI difference
interim_I = ndvi_post.groupby(segments).where(ndvi_diff.groupby(segments).map(func=np.mean) < -0.4)
# removal of smaller objects (10 ha)
interim_II = ndvi_post.groupby(segments).where(ndvi_post.groupby(segments).count() > 1000)
mudflow = interim_I * interim_II

# convert to binary mask
mudflow = mudflow.where(mudflow.isnull(), other=1)
mudflow = mudflow.where(~mudflow.isnull(), other=0)

# plot result overlayed on optical image
s2_dir = os.path.join(s2_dir, "s2_20190201")
blue_band_dir = os.path.join(s2_dir, [prod for prod in os.listdir(s2_dir) if "L2A_B02" in prod][0])
green_band_dir = os.path.join(s2_dir, [prod for prod in os.listdir(s2_dir) if "L2A_B03" in prod][0])
red_band_dir = os.path.join(s2_dir, [prod for prod in os.listdir(s2_dir) if "L2A_B04" in prod][0])

# get data
blue_band = rxr.open_rasterio(blue_band_dir)
green_band = rxr.open_rasterio(green_band_dir)
red_band = rxr.open_rasterio(red_band_dir)

blue_band.name = 'blue'
green_band.name = 'green'
red_band.name = 'red'

# use concat to create DataArray with multiple bands, merge for Dataset
# rgb_post = xr.merge([blue_band, green_band, red_band])
rgb_post = xr.concat(
    [blue_band, green_band, red_band], 
    pd.Index([1,2,3], name='band')
)

# plot
f, ax = plt.subplots(figsize=(5,5))
xr.plot.imshow(rgb_post, vmax=0.15)
xr.plot.contour(mudflow[0,:,:], cmap="black", levels=len(np.unique(mudflow.to_numpy())), linewidths=1)
ax.set(title="post-event RGB & mudflow")
ax.set_axis_off();

				
					# plot subset of elevation models plus coherence
years = [prod.split('_')[-1][:4] for prod in os.listdir(s1_dir)]
prods_2018 = [prod[0] for prod in zip(os.listdir(s1_dir), years) if prod[1] == "2018"]
prods_2019 = [prod[0] for prod in zip(os.listdir(s1_dir), years) if prod[1] == "2019"]

def plot_dems_coherence(products = prods_2018):
    f, ax = plt.subplots(2, 5, figsize=(20,8))
    for idx, prod in enumerate(products):
        # get basic prod infos
        elev_dir = os.path.join(s1_dir, prod, prod.split('_',1)[-1] + '_elevation.tif')
        coh_dir = os.path.join(s1_dir, prod, prod.split('_',1)[-1] + '_coherence.tif')
        year = prod.split("_")[1][:4]
        _dateI = prod.split("_")[1][4:6] + "/" + prod.split("_")[1][6:8]
        _dateII = prod.split("_")[2][4:6] + "/" + prod.split("_")[2][6:8]
        date = _dateI + " - " + _dateII

        # read data & define na values
        elev = rxr.open_rasterio(elev_dir)
        elev = elev.where(elev != -99999)
        coh = rxr.open_rasterio(coh_dir)
        mean_coh = np.mean(coh)
        coh = coh.where(coh != -99999)

        # calculate stats (mean coherence)
        coh_mean = round(coh.mean(["x", "y"], skipna=True).values.item(), 2)

        # plotting data
        elev.plot(ax=ax[0, idx], cmap='terrain', vmin=500, vmax=1500)
        ax[0, idx].set(title=f"{date}\nDEM")
        ax[0, idx].set_axis_off()

        coh.plot(ax=ax[1, idx])
        ax[1, idx].set(title=f"coherence\n{coh_mean}")
        ax[1, idx].set_axis_off()

    # show output
    plt.tight_layout()
    plt.show();

plot_dems_coherence(prods_2018)
plot_dems_coherence(prods_2019)

				
					# calculate differential DEM
# read dems function
def read_dem(prod = prods_2018[1]):
    elev_dir = os.path.join(s1_dir, prod, prod.split('_',1)[-1] + '_elevation.tif')
    dem = rxr.open_rasterio(elev_dir)
    dem = dem.where(dem != -99999)
    return dem 

# best pair of DEM (based on max coherence)
DEM_2018 = read_dem(prods_2018[1])
DEM_2019 = read_dem(prods_2019[0])

# differencing 
DEM_2019 = DEM_2019.interp(x=DEM_2018["x"], y=DEM_2018["y"])
DEM_diff = DEM_2019 - DEM_2018

# reprojecting mudflow for overlay 
mudflow_reproj = mudflow.drop("band").drop("group").rio.reproject_match(DEM_diff)
mudflow_reproj = mudflow_reproj.where((mudflow_reproj == 0) | (mudflow_reproj == 1))[0,:,:]

# plotting data
f, ax = plt.subplots(1, 3, figsize=(20,5))
DEM_2018.plot(ax=ax[0], cmap='terrain', vmin=500, vmax=1500)
ax[0].set(title="DEM 2018")
ax[0].set_axis_off()
DEM_2019.plot(ax=ax[1], cmap='terrain', vmin=500, vmax=1500)
ax[1].set(title="DEM 2019")
ax[1].set_axis_off()
DEM_diff.plot(ax=ax[2], cmap="RdBu_r", vmin=-100, vmax=100)
xr.plot.contour(mudflow_reproj, cmap="black", linewidths=1)
ax[2].set(title="differential DEM")
ax[2].set_axis_off()
plt.show();

				
					# calculate multiple differential DEMs
# select best DEMs
idx_dem_2018 = [0,1,2]
idx_dem_2019 = [0,1]
idxs = [(x,y) for x in idx_dem_2018 for y in idx_dem_2019]

# array for DEM diffs
single_dem_diffs_bands = []

# perform calculations
for i, idx_pair in enumerate(idxs):
    DEM_2018 = read_dem(prods_2018[idx_pair[0]])
    DEM_2019 = read_dem(prods_2019[idx_pair[1]])

    # differencing 
    DEM_2019 = DEM_2019.interp(x=DEM_2018["x"], y=DEM_2018["y"])
    DEM_diff = DEM_2019 - DEM_2018
    if i != 0:
        DEM_diff = DEM_diff.interp(x=single_dem_diffs_bands[0]["x"], y=single_dem_diffs_bands[0]["y"])
    single_dem_diffs_bands.append(DEM_diff)

    # reprojecting mudflow for overlay 
    mudflow_reproj = mudflow.drop_vars("band").drop_vars("group").rio.reproject_match(DEM_diff)
    mudflow_reproj = mudflow_reproj.where((mudflow_reproj == 0) | (mudflow_reproj == 1))[0,:,:]

# create diff DEM stack
dems_diffs = xr.concat(single_dem_diffs_bands, pd.Index(np.arange(len(idxs)), name='band'))

# mask mean diff based on high variance in results
diff_std = dems_diffs.std(dim="band").rolling(x=10, y=10, center=True).mean(skipna=False)
masked_diff_mean = dems_diffs.mean(dim="band").where(diff_std < 40, other=np.nan)
masked_diff_mean = masked_diff_mean.where(dems_diffs.std(dim="band") > 0, other=np.nan)
masked_diff_mean = masked_diff_mean.rolling(x=2, y=2, center=True).min(skipna=False)

# plotting
f, ax = plt.subplots(2, figsize=(5,8))
diff_std.plot(ax = ax[0])
xr.plot.contour(mudflow_reproj, cmap="black", linewidths=1, ax=ax[0])
ax[0].set(title=f"standard deviation \ndifferential DEMs")
ax[0].set_axis_off()
masked_diff_mean.plot(ax = ax[1], cmap="RdBu_r", vmin=-100, vmax=100)
xr.plot.contour(mudflow_reproj, cmap="black", linewidths=1)
ax[1].set(title=f"masked differential DEM")
ax[1].set_axis_off()
plt.tight_layout()
plt.show();

Masters programme | E-portfolio Semester I & IIRemote Sensing Applications