5 End notes
5.1 Acknowlegements
Image and data processing by Earth Observation Group, Payne Institute for Public Policy, Colorado School of Mines. DMSP data collected by US Air Force Weather Agency.
Special thanks to …
Daynan Crull, Trevor Monroe and the world bank for their Open Night Lights tutorial, presented as part of the 3rd Annual Geo4Dev Symposium & Workshop December 10-11, 2020.
Robin Lovelace, Jakub Nowosad, and Jannes Muenchow for the very helpful Geocomputation with R, which is now available as a physical book from CRC Press.
Edzer Pebesma and all those who have been working on sf and terra in recent years. July 2021 overview: https://edzer.github.io/UseR2021/#1
Kyle Walker, Matt Herman, and Kris Eberwein for the tidycensus package
The RStudio team and other contributors to the tidyverse and tidymodels package families.
Max Kuhn and Julia Silge for Tidy Modeling with R. I used Version 0.0.1.9010 (2021-07-19).
CRAN maintainers and the R Foundation for fostering a high-quality, growing R ecosystem.
5.2 Data provenance
5.2.1 Version 4 DMSP-OLS Nighttime Lights Time Series
From https://eogdata.mines.edu/products/dmsp/
Since the 1970s, the U.S. Air Force Defense Meteorological Satellite Program (DMSP) has operated satellite sensors capable of detecting the visible and near-infrared (VNIR) emissions from cities and towns. The DMSP Operational Linescan System (OLS) acquires global daytime and nighttime imagery of the Earth in two spectral bands (VIS and TIR). The nighttime “VIS” bandpass straddles the VNIR portion of the spectrum (0.5 to 0.9 um). The VIS band signal is intensified at night using a photomultiplier tube (PMT), making it possible to detect faint VNIR emission sources. The PMT system was implemented for the detection of clouds at night. An unanticipated consequence of the nighttime light intensification is the detection of city lights, gas flares, and fires.
Since 1997, EOG had been pioneering in combining the imagery taken by DMSP-OLS to produce global Nighttime Light maps. With annual data stretch from 1992 to 2013, making DMSP Nighttime Light the longest data series available for nocturnal remote sensing on human activities.
From https://eogdata.mines.edu/dmsp/downloadV4composites.html
The files are cloud-free composites made using all the available archived DMSP-OLS smooth resolution data for calendar years. In cases where two satellites were collecting data - two composites were produced. The products are 30 arc second grids, spanning -180 to 180 degrees longitude and -65 to 75 degrees latitude.
Citations:
These data sets are the results of years of algorithm development and production efforts. The data are here for you and others to use in any way you like and have no copyright. Please acknowledge our efforts by including a brief data source attribution and one or more of the following references in anything you write where our data is utilized. Data source attribution “Earth Observation Group, Colorado School of Mines.”
References:
Elvidge, C. D., Baugh, K. E., Kihn, E. A., Kroehl, H. W., & Davis, E. R. (1997). Mapping city lights with nighttime data from the DMSP Operational Linescan System. Photogrammetric Engineering and Remote Sensing, 63(6), 727-734.
Baugh, K., Elvidge, C. D., Ghosh, T., & Ziskin, D. (2010). Development of a 2009 stable lights product using DMSP-OLS data. Proceedings of the Asia-Pacific Advanced Network, 30(0), 114.
5.2.2 Radiance data
I use Radiance Calibrated values(Hsu et al. 2015), a transformation of the Digital Number (DN) made available by the EOG group, which among other things, offers greater dynamic range than the six-bit diginal number values, which can be saturated at high values.
From https://eogdata.mines.edu/products/dmsp/#radcal :
The Operational Linescan System (OLS) flown on the Defense Meteorological Satellite Program (DMSP) satellites, has a unique capability to record low light imaging data at night worldwide. These data are archived at the National Oceanic and Atmospheric Administration (NOAA) National Geophysical Data Center (NGDC). The useful data record stretches back to 1992 and is ongoing. The OLS visible band detector observes radiances about one million times dimmer than most other Earth observing satellites. The sensor is typically operated in a high gain setting to enable the detection of moonlit clouds. However, with six bit quantization and limited dynamic range, the recorded data are saturated in the bright cores of urban centers. A limited set of observations have been obtained at low lunar illumination were obtained where the gain of the detector was set significantly lower than its typical operational setting (sometimes by a factor of 100). By combining these sparse data acquired at low gain settings with the operational data acquired at high gain settings, we have produced a set of global nighttime lights product with no sensor saturation. This product can be related to radiances based on the pre-flights sensor calibration.
See the Global Radiance Calibrated Nighttime Lights Product Readme for a summary of the methodology.
These files align with the US decennial census, which has the highest-quality demographic data. I downloaded the image files on 2021-07-17.
F12-F15_20000103-20001229_rad_v4.geotiff.tgz (F12-F15_20000103-20001229_rad_v4); using F12-F15_20000103-20001229_rad_v4.avg_vis.tif
F16_20100111-20110731_rad_v4.geotiff.tgz (F16_20100111-20110731_rad_v4); using F16_20100111-20110731_rad_v4.avg_vis.tif
5.2.3 Intercalibration
https://eogdata.mines.edu/products/dmsp/
The EOG’s Radiance Calibrated products require intersatellite and inter-annual calibration in order to make valid comparisons. The coefficient table is available in DMSP-OLS Radiance Calibrated Nighttime Lights Time Series with Intercalibration(Hsu et al. 2015):
The Defense Meteorological Satellite Program-Operational Linescan System (DMSP-OLS) stable lights products are made using operational OLS data collected at high gain settings, resulting in sensor saturation on brightly lit areas, such as city centers. This has been a paramount shortcoming of the DMSP-OLS stable lights time series. This study outlines a methodology that greatly expands the dynamic range of the OLS data using observations made at different fixed-gain settings, and by incorporating the areas not affected by saturation from the stable lights product. The radiances for the fixed-gain data are computed based on each OLS sensor’s pre-flight calibration. The result is a product known as the OLS radiance calibrated nighttime lights. A total of eight global datasets have been produced, representing years from 1996 to 2010. To further facilitate the usefulness of these data for time-series analyses, corrections have been made to counter the sensitivity differences of the sensors, and coefficients are provided to adjust the datasets to allow inter-comparison.
Page 1868 includes the relevant coefficients, as does the Global Radiance Calibrated Nighttime Lights Product Readme:
Inter-satellite Calibration Multiplier.
Satellite | Base Gain (dB) | Multiplier | Radiance @ DN1 (W/cm2/sr) |
---|---|---|---|
F14 | 55 | 0.82 | 1.23E-10 |
F16 | 55 | 1.0 | 1.50E-10 |
Note all [radiance] products are inter-satellite calibrated to the base gain setting of 55 dB of satellite F16. (page 1872)
Thus, to perform inter-satellite calibration, I multiplied the radiance values in the F14 GeoTIFF image by 0.82. While this improves the comparability in time series, it contributes its own sources of error, as summarized below.
I take the Inter-annual Calibration Coefficients directly from the Global Radiance Calibrated Nighttime Lights Product Readme; see rows annotated with “<-.”
NOTE: (1) This table is preliminary, and needs to be updated.
(2) There is no data for "F16_20051128-20061224_rad_v4" for it being
the reference dataset.
(3) F16_20100111-20101209_rad_v4 and F16_20100111-20110731_rad_v4 share
the same data for there is onyl small diferrence from adding images
from year 2011.
Equation
Y=Coeff0+Coeff1*X
Y-File
F16_20051128-20061224_rad_v4
X-File Pow Coeff0 Coeff1 R2 N_Pt
F12_19960316-19970212_rad_v4 1 4.336 0.915 0.971 20540
F12_19990119-19991211_rad_v4 1 1.423 0.780 0.980 20846
F12-F15_20000103-20001229_rad_v4 1 3.658 0.710 0.980 20866 <-
F14-F15_20021230-20031127_rad_v4 1 3.736 0.797 0.980 20733
F14_20040118-20041216_rad_v4 1 1.062 0.761 0.984 20844
F16_20100111-20101209_rad_v4 1 2.196 1.195 0.981 20848 <-
F16_20100111-20110731_rad_v4 1 -1.987 1.246 0.981 20848
To make values comparable across years, I further transform the 2000 radiance values as follows, where X is the vector of radiance values: \[Y=C_0+C_1 * X = 3.658 + 0.710 * X\]
Similarly for 2010: \[Y=C_0+C_1 * X = 2.196 + 1.195 * X\]
Note the \(R^2\) values above. This adjustment is introducing error.
5.3 Assumptions and limitations
5.3.1 Limitations
5.3.1.1 Limitations in nighttime lights data
The World Bank’s Open Nighttime Lights tutorial (WorldBank 2020) summarizes the following limitations (among others):
2.2.2. Limitations and challenges of DMSP-OLS
- Low radiometric resolution (6-bit data, values range from 0-63)
- No on-board calibration
- Large spatial resolution (4.9km for nighttime VIS band)
- Saturation in urban cores
These limitations can lead to challenges for scientists using these data for their work. For example, drawing conclusions from the DMSP-OLS stable lights series may be challenging in low-density urban areas …. In addition, the extent and intensity of lit areas cannot directly delimit urban regions due to the “blooming” effect.
Therefore I use the Radiance Calibrated products. Note however that they are not absolute radiance(Hsu et al. 2015)
Many users would like to relate the DN values in the Radiance Calibrated products to actual radiance. The fixed-gain image avoids the automatic gain variation by VDGA which is not recorded in the data stream. So theoretically, by applying the preflight sensor calibration, the actual radiance could be derived from the DN. Nevertheless, even under such controlled operation, there is no means to address the degradation of the equipment without an on board calibration device. Moreover, the uncertainty brought by blending Stable Light products, which were taken under variable gain settings, make it even more unlikely to derive the exact radiance from the Radiance Calibrated product. As a result, it is suggested that the Radiance Calibrated product is only suitable for analyses which do not require actual radiance. Nevertheless, the conversion factor from DN to radiance for the Radiance Calibrated products is listed in Table 3 for reference. Note all products are inter-satellite calibrated to the base gain setting of 55 dB of satellite F16. (page 1872)
and
Because the DMSP-OLS does not carry an on-board calibration device for the visible band, it is not possible to track how actual radiance is being converted to DN even if the auto-adjust gain setting is fixed. Instead we rely on the preflight calibration made for each OLS instrument prior to launch. Therefore, even though the Radiance Calibrated products are made from data collected under much more controlled setting than the Stable Lights products, the radiance values should still be considered relative and not absolute. (page 1874)
In this analysis I refer to these Radiance Calibrated values as “radiance,” “radiance values,” and “light” corresponding to variables light_avg
and the transformed log_light_avg
.
5.3.1.2 Limitations in intercalibration
As summarized in (Hsu et al. 2015)
Being freed from the saturation problem, Radiance Calibrated products need a reference area which not only is stable enough but also provides samples with brightness ranging from very low light to the world’s highest levels. Los Angeles was taken as the reference for the Radiance Calibrated products for two reasons. First, Los Angeles has long been a mature metropolis and the light change is negligible. Second, being a metropolis, it can provide samples with high DNs from the city center, as well as low DNs from the suburban area.
While the distribution of light data from L.A. over the time period may have been relatively stable, the differences probably weren’t zero, and the error or bias is not quantified. In fact the readme includes the following related to inter-annual calibration:
- This table is preliminary, and needs to be updated.
with these dates in the readme file:
Date: 01/15/2015
Modified: 09/09/2020
Typical uncertainty of sensors noted in (Chander et al. 2013):
In general, the absolute radiometric calibration for most optical sensors is specified to an uncertainty of 5%; hence, the likely uncertainty (in reflectance units) on a measured reflectance value of, for example, 0.4 will be ±5% of 0.4 equal to ±0.02. (page 1058)
Scene Variability can source of error, however in this case, use of light averaged over data collected at various times throughout the year minimizes this source of error.
The intercalibration of satellite instruments often requires comparing observations from different instruments coincident in space, time, and viewing geometry. As these are never exactly aligned, thresholds are usually applied to define the collocations. The choice of these thresholds directly impacts the uncertainty of the comparison, partially due to the scene variability within the range of the collocation criteria. The collocation criteria represent tradeoffs between the errors on each collocation and the number of collocations available. Collocated observations from a pair of satellite instruments are not sampled at exactly the same place or time. Variations in the atmosphere and surface during the interval between their observations introduce errors when comparing their collocated radiances. The greater the collocated observations interval, the larger the contribution of the scene’s variability to the total error budget. (page 1061)
5.3.2 Census data and state and county geometry
I employed three sources of demographic data:
- US census 2000, which provided total population by county and median income by county.
- US census 2010, which provided total population by county. This decennial census did not provide median household income (in fact, the 2010 census did away with the “long form” that provided the data for the sf3 table).
- I initially used the five-year ACS (2006-2010), which provided total population by county and median income by county, however once I decided not to use median income, I didn’t need the ACS. Which is good since ACS population estimates are ~2% lower than the 2010 census for the counties I examined in this analysis, and I did not need to deal with differences in methodology between the decennial census and ACS, or adjust for inflation.
The tidycensus package provides the population data from 2000 and 2010 decennial censuses. Median income per family comes from the decennial 2000 census and the American Community Survey (ACS) five-year estimates 2006-2010 provided by the US Census Bureau. tidycensus
also provides state and county geometry via the tigris package (and corresponding extracts from the Census Bureau’s Master Address File/Topologically Integrated Geographic Encoding and Referencing (TIGER) database).
5.3.3 Population and population density
I used these county boundaries to calculate the area of each county and, combined with decennial census population, each county’s population density. I included land and water area when calculating total area, since my calculation of average light radiance per county used the same boundaries that included land and water within those boundaries
I did not do anything to adjust for light blooming beyond whatever the EOG did in their Radiance Calibrated products. Instead I assume any remaining blooming artifacts are similar in the 2000 and 2010 products.
On the coasts and in other high-traffic areas, census population (and therefore population density) may not capture vacationers and other itinerant people that contribute to population density, the capacity of the build environment, and (if my causal conjecture holds), the degree to which an area is built up and generates light. I assume any effect due to itinerant people is small enough (or consistent across the time interval) that it can be ignored.
5.3.4 Economic development / GDP / business cycle
The US was in the middle of the Great Recession in 2010. Could this be a factor in the lower radiance levels that year compared to 2000? Starting in 2008, the recession depressed new investments, including building projects that would have created more nighttime light.
5.4 Project organization
5.4.1 Data preparation pipeline implemented to with targets package
I used the targets
package to define a pipeline for data preparation. Dependencies among the set of functions are specified in the _targets file and executed via tar_make()
. They include the following:
tar_manifest() %>%
arrange(name) %>%
gt()
## here() starts at /Users/dmoul/rwork/nightlight
name | command | pattern |
---|---|---|
all_df | model_prepare_data(county_sf) | NA |
all_df_c | model_prepare_data(county_sf_c) | NA |
bbox_vec | get_bbox(census_df, these_states) | NA |
census_df | get_census_data(all_states, census_year_a, "census") | NA |
census_df_c | get_census_data(all_states, census_year_c, "census") | NA |
census_focus_df | census_focus(census_df, these_states, bbox_vec) | NA |
census_focus_df_c | census_focus(census_df_c, these_states, bbox_vec) | NA |
county_sf | get_county_light_sf(census_focus_df, light_sf) | NA |
county_sf_c | get_county_light_sf(census_focus_df_c, light_sf_c) | NA |
light_points_df | get_light_levels(bbox_vec, source_light_file_2000, \n intersat_calibration = 0.82, interannual_calibration = c(3.658, \n 0.71)) | NA |
light_points_df_c | get_light_levels(bbox_vec, source_light_file_2010, \n intersat_calibration = 1, interannual_calibration = c(2.196, \n 1.195)) | NA |
light_sf | get_light_sf(light_points_df, state_boundaries_geom) | NA |
light_sf_c | get_light_sf(light_points_df_c, state_boundaries_geom) | NA |
state_boundaries_geom | state_boundaries(census_focus_df) | NA |
tar_visnetwork()
draws a dependency graph. Zoom in to see the labels on the nodes:
tar_visnetwork()
## here() starts at /Users/dmoul/rwork/nightlight
5.4.2 Session information
sessionInfo()
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur 10.16
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] splines stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] gt_0.3.1 multilevelmod_0.0.0.9000 yardstick_0.0.8
## [4] workflowsets_0.1.0 workflows_0.2.3 tune_0.1.6
## [7] rsample_0.1.0 recipes_0.1.16 parsnip_0.1.7
## [10] modeldata_0.1.1 infer_1.0.0 dials_0.0.9
## [13] broom_0.7.9 tidymodels_0.1.3 hrbrthemes_0.8.6
## [16] scales_1.1.1 units_0.7-2 tidycensus_1.0
## [19] patchwork_1.1.1 ggdag_0.2.3 dagitty_0.3-1
## [22] ggridges_0.5.3 terra_1.3-22 sf_1.0-2
## [25] glue_1.4.2 forcats_0.5.1 stringr_1.4.0
## [28] dplyr_1.0.7 purrr_0.3.4 readr_2.0.1
## [31] tidyr_1.1.3 tibble_3.1.4 ggplot2_3.3.5
## [34] tidyverse_1.3.1 targets_0.7.0
##
## loaded via a namespace (and not attached):
## [1] readxl_1.3.1 uuid_0.1-4 backports_1.2.1 systemfonts_1.0.2
## [5] plyr_1.8.6 igraph_1.2.6 sp_1.4-5 listenv_0.8.0
## [9] digest_0.6.27 foreach_1.5.1 htmltools_0.5.2 fansi_0.5.0
## [13] checkmate_2.0.0 magrittr_2.0.1 doParallel_1.0.16 tzdb_0.1.2
## [17] globals_0.14.0 modelr_0.1.8 gower_0.2.2 extrafont_0.17
## [21] extrafontdb_1.0 hardhat_0.1.6 colorspace_2.0-2 rvest_1.0.1
## [25] rappdirs_0.3.3 haven_2.4.3 xfun_0.25 rgdal_1.5-23
## [29] callr_3.7.0 crayon_1.4.1 jsonlite_1.7.2 iterators_1.0.13
## [33] survival_3.2-13 tigris_1.4.1 gtable_0.3.0 ipred_0.9-11
## [37] V8_3.4.2 Rttf2pt1_1.3.9 DBI_1.1.1 Rcpp_1.0.7
## [41] GPfit_1.0-8 foreign_0.8-81 proxy_0.4-26 lava_1.6.9
## [45] prodlim_2019.11.13 htmlwidgets_1.5.3 httr_1.4.2 ellipsis_0.3.2
## [49] pkgconfig_2.0.3 nnet_7.3-16 sass_0.4.0 dbplyr_2.1.1
## [53] utf8_1.2.2 here_1.0.1 tidyselect_1.1.1 rlang_0.4.11
## [57] DiceDesign_1.9 visNetwork_2.0.9 munsell_0.5.0 cellranger_1.1.0
## [61] tools_4.1.0 cli_3.0.1 generics_0.1.0 evaluate_0.14
## [65] fastmap_1.1.0 yaml_2.2.1 processx_3.5.2 knitr_1.33
## [69] fs_1.5.0 tidygraph_1.2.0 future_1.22.1 xml2_1.3.2
## [73] compiler_4.1.0 rstudioapi_0.13 curl_4.3.2 e1071_1.7-8
## [77] reprex_2.0.1 lhs_1.1.1 bslib_0.2.5.1 stringi_1.7.4
## [81] ps_1.6.0 gdtools_0.2.3 lattice_0.20-44 Matrix_1.3-4
## [85] classInt_0.4-3 vctrs_0.3.8 pillar_1.6.2 lifecycle_1.0.0
## [89] furrr_0.2.3 jquerylib_0.1.4 data.table_1.14.0 maptools_1.1-1
## [93] raster_3.4-13 R6_2.5.1 bookdown_0.23 KernSmooth_2.23-20
## [97] parallelly_1.27.0 codetools_0.2-18 boot_1.3-28 MASS_7.3-54
## [101] assertthat_0.2.1 rprojroot_2.0.2 withr_2.4.2 parallel_4.1.0
## [105] hms_1.1.0 grid_4.1.0 rpart_4.1-15 timeDate_3043.102
## [109] class_7.3-19 rmarkdown_2.10 pROC_1.17.0.1 lubridate_1.7.10