Megan's IS415 Journey: Hands-On Exercise 11

1.0 Overview

Like always, before we start: just what is geographical accessibility? Well, the answer’s in the name: it’s about how easy a location is to be accessed - its capacity to be reached by other locations. Determining geographical accessibility is critical in urban planning - for example, the placement of transport infrastructure so as to connect healthcare facilities and public amenities to the people in residential areas.

Here’s this week’s recommended reading: a primer on what geographical accessibility is and how it’s measured. It might come off as a little heavy on the theory (and math side), so if you’d prefer to see accessibility in action - why not look at accessibility in terms of healthcare in this study?

2.0 Setup

2.1 Packages Used

The packages we’ll be introducing today are:

SpatialAcc: used for modelling geographical accessibility
ggstatsplot: an extension of ggplot2 that is used for statistical analysis

In addition, we’ll be using packages from our previous exercises:

sf: used for importing, managing, and processing geospatial data
tmap: used for plotting thematic maps, such as choropleth and bubble maps
tidyverse: used for importing, wrangling and visualising data (and other data science tasks!)

In addition, the following tidyverse packages will be used (for attribute data handling):

readr for importing delimited files (.csv)
dplyr for wrangling and transforming data
ggplot2 for visualising data

Show code

packages = c('sf', 'tmap', 'SpatialAcc', 'ggstatsplot', 'reshape2', 'tidyverse')
for(p in packages){
  if(!require(p, character.only = T)){
    install.packages(p)
  }
  library(p, character.only = T)
}

2.2 Data Used

The datasets used for this exercise are:

MP14_SUBZONE_NO_SEA_PL: a polygon feature data in ESRI shapefile format from data.gov.sg, providing information of URA 2014 Master Plan Planning Subzone boundary data, in SVY21 projected coordinates system
hexagons: A 250m radius hexagons GIS data in ESRI shapefile format which was created by using st_make_grid() of the sf package
ELDERCARE: GIS data showing location of eldercare service, from data.gov.sg. Note that there are two versions: one in ESRI shapefile format, and the other in Google kml file format. For this exercise, we’ll be using the ESRI shapefile.
OD_Matrix: a distance matrix in csv format. There are six fields:
- origin_id: the unique id values of the origin (i.e. fid of hexagon data set.), -destination_id: the unique id values of the destination (i.e. fid of ELDERCARE data set.), -entry_cost: the perpendicular distance between the origins and the nearest road), -network_cost: the actual network distance from the origin and destination, -exit_cost: the perpendicular distance between the destination and the nearest road), and -total_cost: the summation of entry_cost, network_cost and exit_cost.

All the values of the cost related fields are in metres.

2.3 Geospatial Data Importing + Wrangling

Importing Geospatial Data

Show code

# output: simple features object
mpsz <- st_read(dsn = "data/geospatial", layer = "MP14_SUBZONE_NO_SEA_PL")

Reading layer `MP14_SUBZONE_NO_SEA_PL' from data source 
  `C:\IS415\IS415_msty\_posts\2021-10-31-hands-on-exercise-11\data\geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 323 features and 15 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
Projected CRS: SVY21

Show code

hexagons <- st_read(dsn = "data/geospatial", layer = "hexagons")

Reading layer `hexagons' from data source 
  `C:\IS415\IS415_msty\_posts\2021-10-31-hands-on-exercise-11\data\geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 3125 features and 6 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: 2667.538 ymin: 21506.33 xmax: 50010.26 ymax: 50256.33
Projected CRS: SVY21 / Singapore TM

Show code

eldercare <- st_read(dsn = "data/geospatial", layer = "ELDERCARE")

Reading layer `ELDERCARE' from data source 
  `C:\IS415\IS415_msty\_posts\2021-10-31-hands-on-exercise-11\data\geospatial' 
  using driver `ESRI Shapefile'
Simple feature collection with 120 features and 19 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 14481.92 ymin: 28218.43 xmax: 41665.14 ymax: 46804.9
Projected CRS: SVY21 / Singapore TM

Updating CRS Information

As we can see, our mpsz does not have any EPSG information. Let’s address that, and use ESPG 3414 which is the SVY21 projected coordinates system specific to Singapore:

Show code

mpsz <- st_transform(mpsz, 3414)
eldercare <- st_transform(eldercare, 3414)
hexagons <- st_transform(hexagons, 3414)

# check the newly-transformed sf for the correct ESPG code 3414
st_crs(mpsz)

Coordinate Reference System:
  User input: EPSG:3414 
  wkt:
PROJCRS["SVY21 / Singapore TM",
    BASEGEOGCRS["SVY21",
        DATUM["SVY21",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4757]],
    CONVERSION["Singapore Transverse Mercator",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",1.36666666666667,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",103.833333333333,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",1,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",28001.642,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",38744.572,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["northing (N)",north,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["easting (E)",east,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Cadastre, engineering survey, topographic mapping."],
        AREA["Singapore - onshore and offshore."],
        BBOX[1.13,103.59,1.47,104.07]],
    ID["EPSG",3414]]

Cleaning + Updating Attribute Fields

As we can see, there are many redundant fields in eldercare and hexagons. In addition, we should add new fields, capacity and demand into eldercare and hexagons respectively - these can be derived with the mutate() function of the dplyr package.

Show code

eldercare <- eldercare %>%
  select(fid, ADDRESSPOS) %>%
  mutate(capacity = 100)

hexagons <- hexagons %>%
  select(fid) %>%
  mutate(demand = 100)

Notice: For the purpose of this hands-on exercise, a constant value of 100 is used. In practice, the actual demand of the hexagon and actual capacity of the eldercare centre should be used instead.

2.4 Aspatial Data Importing + Wrangling

Importing Distance Matrix

Show code

# output: tibble data.frame
ODMatrix <- read_csv("data/aspatial/OD_Matrix.csv", skip = 0)

Tidying distance matrix

Here’s what our imported ODMatrix looks like - it’s organised columnwise:

However, most of the modelling packages in R expect a matrix like this:

To explain the expected matrix: the rows represent origins (i.e. also know as from field) and the columns represent destination (i.e. also known as to field.)

Well, our matrix needs to be transformed from a thin format to that fat format! We’ll use the spread() fucntion of our tidyr package, like so:

Show code

distmat <- ODMatrix %>%
  select(origin_id, destination_id, total_cost) %>%
  spread(destination_id, total_cost)%>%
  select(c(-c('origin_id')))

Note: We can also use pivot_wider(), which is the newer and updated function!

Show code

distmat <- ODMatrix %>%
  select(origin_id, destination_id, total_cost) %>%
  pivot_wider(names_from=destination_id, values_from=total_cost)%>%
  select(c(-c('origin_id')))

Currently, the distance is measured in metres because SVY21 projected coordinate system is used - to make it easier for our analysis, we should convert from metres ot kilometres:

Show code

distmat_km<-as.matrix(distmat/1000)

3.0 Modelling and Visualising Accessibility using Hansen Method

3.1 Computing Hansen’s accessibility

Now that our data’s done, we’re ready to dive into the meat of today’s exercise: computing geographical accessibility! First up is computing Hansen’s accessibility by using the ac() function of our newly-introduced SpatialAcc package.

Let’s check how to use ac() first:

Taken from the SpatialAcc Package Guide

Now, let’s try it out!

Show code

acc_Hansen <- data.frame(ac(hexagons$demand,
                            eldercare$capacity,
                            distmat_km, 
                            #actually, the d0 won't affect the results in this situation
                            d0 = 50,
                            power = 2, 
                            family = "Hansen"))

Oh, yikes!! Look at the default field name:

That’s too long and messy - we’ll rename it.

Show code

colnames(acc_Hansen) <- "accHansen"

Much better 😄 Next, we have to convert the data table into tibble format:

Show code

acc_Hansen <- tbl_df(acc_Hansen)

Lastly, we’ll join the acc_Hansen tibble data frame with the hexagons simple feature data frame using the bind_cols() function of our dplyr package.

Show code

hexagon_Hansen <- bind_cols(hexagons, acc_Hansen)

3.2 Visualising Hansen’s accessibility

Still remember how we extracted the map extent in our Hands-On Exercise 9, when we were wrangling our geospatial data in Section 2.3? We’ll be doing the same thing here with st_bbox():

Show code

mapex <- st_bbox(hexagons)

Now, let’s plot:

Show code

tmap_mode("plot")
tm_shape(hexagon_Hansen,
         bbox = mapex) + 
  tm_fill(col = "accHansen",
          n = 10,
          style = "quantile",
          border.col = "black",
          border.lwd = 1) +
tm_shape(eldercare) +
  tm_symbols(size = 0.1) +
  tm_layout(main.title = "Accessibility to eldercare: Hansen method",
            main.title.position = "center",
            main.title.size = 2,
            legend.outside = FALSE,
            legend.height = 0.45, 
            legend.width = 3.0,
            legend.format = list(digits = 6),
            legend.position = c("right", "top"),
            frame = TRUE) +
  tm_compass(type="8star", size = 2) +
  tm_scale_bar(width = 0.15) +
  tm_grid(lwd = 0.1, alpha = 0.5)

3.3 Statistical graphic visualisation

In this section, we’ll compare the distribution of Hansen’s accessibility values by URA Planning Region. Firstly, we need to add the planning region field into hexagon_Hansen:

Show code

hexagon_Hansen <- st_join(hexagon_Hansen, mpsz, 
                          join = st_intersects)

Next, we’ll plot the distribution by using the boxplot graphical method:

Show code

ggplot(data=hexagon_Hansen, 
       aes(y = log(accHansen), 
           x= REGION_N)) +
  geom_boxplot() +
  geom_point(stat="summary", 
             fun.y="mean", 
             colour ="red", 
             size=2)

4.0 Modelling and Visualising Accessibility using KD2SFCA Method

This section’s going to be a repeat of what we did above, only with a different method! And since we’ve explained the code already, this time we’ll combine the codes into one chunk for efficiency 👍

4.1 Computing KD2SFCA’s accessibility

Show code

acc_KD2SFCA <- data.frame(ac(hexagons$demand,
                            eldercare$capacity,
                            distmat_km, 
                            d0 = 50,
                            power = 2, 
                            family = "KD2SFCA"))
colnames(acc_KD2SFCA) <- "accKD2SFCA"
acc_KD2SFCA <- tbl_df(acc_KD2SFCA)
hexagon_KD2SFCA <- bind_cols(hexagons, acc_KD2SFCA)

4.2 Visualising KD2SFCA’s accessibility

Show code

tmap_mode("plot")
tm_shape(hexagon_KD2SFCA,
         bbox = mapex) + 
  # note that 'mapex' is reused for our bbox argument!
  tm_fill(col = "accKD2SFCA",
          n = 10,
          style = "quantile",
          border.col = "black",
          border.lwd = 1) +
tm_shape(eldercare) +
  tm_symbols(size = 0.1) +
  tm_layout(main.title = "Accessibility to eldercare: KD2SFCA method",
            main.title.position = "center",
            main.title.size = 2,
            legend.outside = FALSE,
            legend.height = 0.45, 
            legend.width = 3.0,
            legend.format = list(digits = 6),
            legend.position = c("right", "top"),
            frame = TRUE) +
  tm_compass(type="8star", size = 2) +
  tm_scale_bar(width = 0.15) +
  tm_grid(lwd = 0.1, alpha = 0.5)

4.3 Statistical graphic visualisation

Now, we are going to compare the distribution of KD2CFA accessibility values by URA Planning Region. Like before, we need to add the planning region field into hexagon_KD2SFCA, like so:

Show code

hexagon_KD2SFCA <- st_join(hexagon_KD2SFCA, mpsz, 
                          join = st_intersects)

Plotting time! We’ll be using the boxplot graphical method again:

Show code

ggplot(data=hexagon_KD2SFCA, 
       aes(y = accKD2SFCA, 
           x= REGION_N)) +
  geom_boxplot() +
  geom_point(stat="summary", 
             fun.y="mean", 
             colour ="red", 
             size=2)

5.0 Modelling and Visualising Accessibility using Spatial Accessibility Measure (SAM) Method

5.1 Computing SAM accessibility

Show code

acc_SAM <- data.frame(ac(hexagons$demand,
                         eldercare$capacity,
                         distmat_km, 
                         d0 = 50,
                         power = 2, 
                         family = "SAM"))
colnames(acc_SAM) <- "accSAM"
acc_SAM <- tbl_df(acc_SAM)
hexagon_SAM <- bind_cols(hexagons, acc_SAM)

5.2 Visualising SAM’s accessibility

Show code

tmap_mode("plot")
tm_shape(hexagon_SAM,
         bbox = mapex) +
  # note that 'mapex' is reused for our bbox argument!
  tm_fill(col = "accSAM",
          n = 10,
          style = "quantile",
          border.col = "black",
          border.lwd = 1) +
tm_shape(eldercare) +
  tm_symbols(size = 0.1) +
  tm_layout(main.title = "Accessibility to eldercare: SAM method",
            main.title.position = "center",
            main.title.size = 2,
            legend.outside = FALSE,
            legend.height = 0.45, 
            legend.width = 3.0,
            legend.format = list(digits = 3),
            legend.position = c("right", "top"),
            frame = TRUE) +
  tm_compass(type="8star", size = 2) +
  tm_scale_bar(width = 0.15) +
  tm_grid(lwd = 0.1, alpha = 0.5)

5.3 Statistical graphic visualisation

Now, we are going to compare the distribution of SAM accessibility values by URA Planning Region. As always, we need to add the planning region field into hexagon_SAM:

Show code

hexagon_SAM <- st_join(hexagon_SAM, mpsz, 
                       join = st_intersects)

One last plot with the boxplot graphical method:

Show code

ggplot(data=hexagon_SAM, 
       aes(y = accSAM, 
           x= REGION_N)) +
  geom_boxplot() +
  geom_point(stat="summary", 
             fun.y="mean", 
             colour ="red", 
             size=2)

6.0 Ending Notes & Thank You

With that, we’ve learned how to model, compute and visualise accessibility measures!

And now - we come to the end of our hands-on exercises. This past few months have been full of learning experiences: from our baby steps of importing and basic mapping, to flexing our geospatial biceps with (advanced!) point pattern analysis, to real-life implementation and application with segmentation and accessibility. And that’s not exhaustive - there’s so much more we learned together, and I hope this journey has been as fulfilling for you as it has for me! 💖

My greatest and warmest thanks for Prof. Kam Tin Seong, who’s worked tirelessly to provide all sorts of resources, from data to readings, and for his readiness in answering clarification emails! For those who want to learn even more, his lesson blog can be found here. And thank you to all my classmates who’ve helped address our collective doubts with targeted questions - nothing beats a shared learning experience 😄