I’ve been doing a lot of work with networks and graph theory over the years, both for my job at DWR and as a central component of my dissertation. I’ve been doing some stuff with spatial graphs as part of an analysis of salmonid habitat in the Russian River estuary, and this work involves analyses based on network representations of gridded data.
In a previous post I showed you a very long and arduous process to convert polylines to networks using ArcGIS. I initially turned to ArcGIS for converting rasters to networks as well, focusing on using the Create Fishnet and Polygon Neighbors tools to convert rasters to shapefiles and generate the network edge list. This turned out to be a very chalenging approach: generating the fishnet took massive computing power, and my computer repeatedly failed to complete the polygon neighbor analysis due to the large number of polygons. I needed a different solution.
I eventually realized that the
raster
package for R had the tools I needed to generate a network directly
from the raster—no shapefiles needed! And it’s only a few lines of
code. Generating the node and edge list of the network are just a single
function call each: the node list and their attributes can be generated
using the function getValues
, while the adjacent
function generates
a two-column matrix that is equivalent to a (bi)directional edge list.
I’ll use the test raster provided by the package to demonstrate:
library(raster)
r = raster(system.file("external/test.grd", package="raster"))
First we generate the node list using getValues()
. We also get the location
info with a call to coordinates()
:
library(tidyverse)
nodes = tibble(value = getValues(r)) %>%
bind_cols(as_tibble(coordinates(r)))
The edge list is generated with a call to adjacent()
. We want the
adjacency information for every pixel, so we pass the complete set
of pixel indices for the cells
argument. In this case only ask for
adjacency in the four cardinal directions, i.e. we don’t check for
diagonal adjacency:
edges = as_tibble(adjacent(r, cell = 1:ncell(r), directions = 4, sorted = TRUE))
Note that the node and edge lists include nodata
pixels, which we
almost certainly don’t care about. It doesn’t noticeably impact the process
for this example, but for larger rasters it’s probably smart to filter
out the nodata
pixels beforehand and only retrieve adjacency info
for valid pixels. Here we simply filter out the nodata
nodes after
creating the graph via the
tidygraph
package:
library(tidygraph)
g = tbl_graph(nodes, edges, TRUE) %N>%
filter(!is.na(value))
And that’s it! This is fine for smaller rasters, but you’ll start
hitting problems with memory allocation as your rasters get larger.
I’m still working on how to handle that, but the fact that you can
choose to analyze adjacencies for a subset of pixels when using
adjacent()
means that it’s easy to break up the analysis into chunks.
I’ll write another post once I work out a good process.
Comments
Want to leave a comment? Visit this post's issue page on GitHub (you'll need a GitHub account).