Session 11
2024-04-17
Exam 1 (in groups) [coef 33%] : given on feb. 27th for mar. 12th. DONE
Exam 2 (in groups) [coef 33%] : given on apr. 16th for apr. 30th
Exam 3 (individual) [coef 33%] : given on apr. 16th for may. 7th
sf
Simple Features for R website.
sf
stands for Simple Features.
Its goal is to consolidate the functionality of previous packages (sp
, rgeos
, and rgdal
) into one.
It simplifies the manipulation of spatial data using simple objects.
Compatible with tidy data principles: works well with the tidyverse
syntax, including the pipe operator |>
.
Main author and maintainer: Edzer Pebesma (also the author of the sp
package).
sf
object:Importing:
Reading layer `martinique' from data source
`https://minio.lab.sspcloud.fr/kantunez/diffusion/dsr/martinique.gpkg'
using driver `GPKG'
Simple feature collection with 34 features and 23 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 690574.4 ymin: 1592426 xmax: 736126.5 ymax: 1645660
Projected CRS: WGS 84 / UTM zone 20N
Exporting
The GeoPackage (gpkg) format is an open format (not tied to an operating system) and implemented as a SQLite database.
Also note the existence of the GeoParquet format, extremely efficient for processing large volumes of spatial data.
Coordinate systems/projections are identified using a code called an EPSG code:
Obtain the projection using st_crs()
(epsg code) and modify it using st_transform()
.
sf
uses spherical geometry by default for unprojected data since version 1.0, thanks to the s2
library. See https://r-spatial.github.io for more details.
Coordinate Reference System:
User input: WGS 84 / UTM zone 20N
wkt:
PROJCRS["WGS 84 / UTM zone 20N",
BASEGEOGCRS["WGS 84",
ENSEMBLE["World Geodetic System 1984 ensemble",
MEMBER["World Geodetic System 1984 (Transit)"],
MEMBER["World Geodetic System 1984 (G730)"],
MEMBER["World Geodetic System 1984 (G873)"],
MEMBER["World Geodetic System 1984 (G1150)"],
MEMBER["World Geodetic System 1984 (G1674)"],
MEMBER["World Geodetic System 1984 (G1762)"],
MEMBER["World Geodetic System 1984 (G2139)"],
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]],
ENSEMBLEACCURACY[2.0]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4326]],
CONVERSION["UTM zone 20N",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",0,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",-63,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",0.9996,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",500000,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",0,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Engineering survey, topographic mapping."],
AREA["Between 66°W and 60°W, northern hemisphere between equator and 84°N, onshore and offshore. Anguilla. Antigua and Barbuda. Bermuda. Brazil. British Virgin Islands. Canada - New Brunswick; Labrador; Nova Scotia; Nunavut; Prince Edward Island; Quebec. Dominica. Greenland. Grenada. Guadeloupe. Guyana. Martinique. Montserrat. Puerto Rico. St Kitts and Nevis. St Barthelemy. St Lucia. St Maarten, St Martin. St Vincent and the Grenadines. Trinidad and Tobago. Venezuela. US Virgin Islands."],
BBOX[0,-66,84,-60]],
ID["EPSG",32620]]
Default display:
Keep only the geometry:
Units: [m]
[,1] [,2] [,3] [,4] [,5]
[1,] 0.000 35297.56 3091.501 12131.617 17136.310
[2,] 35297.557 0.00 38332.602 25518.913 18605.249
[3,] 3091.501 38332.60 0.000 15094.702 20226.198
[4,] 12131.617 25518.91 15094.702 0.000 7177.011
[5,] 17136.310 18605.25 20226.198 7177.011 0.000
Simple union:
Using a grouping variable:
Creating a polygon.
st_intersection()
extracts the part of mtq that intersects with the created polygon.
st_sample()
creates random points on the map.
st_join
performs a spatial join coupled with aggregation.
It is possible to finely control the join (points on boundaries, etc.) with the join
argument, which defaults to st_intersects()
. See here for a precise definition of possible spatial predicates.
Simple feature collection with 3 features and 1 field
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 697601.7 ymin: 1598817 xmax: 710461.9 ymax: 1640521
Projected CRS: WGS 84 / UTM zone 20N
INSEE_COM geom
1 97201 POLYGON ((699261.2 1637681,...
2 97202 POLYGON ((709840 1599026, 7...
2.1 97202 POLYGON ((709840 1599026, 7...
Simple feature collection with 6 features and 2 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 695444.4 ymin: 1598817 xmax: 717731.2 ymax: 1645182
Projected CRS: WGS 84 / UTM zone 20N
INSEE_COM n geom
1 97201 1 POLYGON ((699261.2 1637681,...
2 97202 2 POLYGON ((709840 1599026, 7...
3 97203 4 POLYGON ((706092.8 1642964,...
4 97204 1 POLYGON ((697314.8 1623213,...
5 97205 1 POLYGON ((702756.8 1617978,...
6 97206 1 POLYGON ((709840 1599026, 7...
A Voronoi diagram is a plane partitioning into cells (adjacent regions, called Voronoi polygons) based on a discrete set of points. Each cell encloses a single point and forms the set of points in the plane that are closer to that point than to any other.
In this section, we will visualize data on personal injury road traffic accidents in 2019.
This is a data.frame
containing two columns, providing information about the latitude and longitude of road accidents.
accidents.2019.paris <- readRDS(paste0(repository,"/accidents2019_paris.RDS"))
head(accidents.2019.paris)
# A tibble: 6 × 13
Num_Acc id_vehicule grav sexe an_nais lat long com int lum voie
<dbl> <chr> <dbl> <fct> <dbl> <dbl> <dbl> <chr> <fct> <fct> <chr>
1 2.02e11 138 305 660 1 Fémi… 1983 48.9 2.28 75117 Inte… Nuit… BD P…
2 2.02e11 138 305 661 4 Masc… 1975 48.9 2.28 75117 Inte… Nuit… BD P…
3 2.02e11 138 305 658 4 Fémi… 1988 48.8 2.41 75120 Inte… Nuit… COUR…
4 2.02e11 138 305 658 1 Masc… 1991 48.8 2.41 75120 Inte… Nuit… COUR…
5 2.02e11 138 305 658 4 Masc… 1987 48.8 2.41 75120 Inte… Nuit… COUR…
6 2.02e11 138 305 659 1 Masc… 1996 48.8 2.41 75120 Inte… Nuit… COUR…
# ℹ 2 more variables: catr <fct>, catv <fct>
We need to convert this data.frame
into a spatial object (sf
) using the st_as_sf
function. Simply specify the column names containing the coordinates and the projection, here CRS = 4326
(WGS 84). We then transform this data into CRS = 2154
(Lambert 93).
Several solutions exist for creating interactive maps with R. mapview
, leaflet
and mapdeck
are the main ones. For simplicity, we focus here on mapview
.
Interactive maps are not necessarily very relevant for representing geostatistical information. However, they are useful for exploring databases. Let’s see an example with mapview
regarding fatal accidents in Paris in 2019.
When you click on a point, the values of various variables in the database appear. This can help in exploring the database. Let’s customize a bit….
Let’s customize a bit more…
However, adding a legend for the size of proportional circles is not easily done.
ggplot2
Once again, various R packages are used to create static maps:
ggplot2
is a widely used package for various types of plots and has been specifically adapted for maps (geom_sf
).tmap
package contains advanced functionalities based on the logic of ggplot2
.mapsf
(successor of cartography
) relies on a so-called “base R” language and allows for basic as well as advanced cartographic representations.For simplicity’s sake, we’ll concentrate here on ggplot2
, a well-known package for all types of graphics.
ggplot2
.Components of the Grammar
e.g., f(data) → x position, y position, size, shape, color
e.g., points, lines, bars, texts
e.g., f([0, 100]) → [0, 5] px
e.g., Segmentation of data based on one or more factors
e.g., mean, count, regression…
geom_sf
A brief introduction/reminder of graphic semiology:
We are working with the cartographic layer of IRIS1.
library(sf)
library(dplyr)
iris.75 <- st_read(paste0(repository, "/iris_75.gpkg"), stringsAsFactors = F)
Reading layer `iris_75' from data source
`https://minio.lab.sspcloud.fr/kantunez/diffusion/dsr/iris_75.gpkg'
using driver `GPKG'
Simple feature collection with 992 features and 2 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 643075.6 ymin: 6857477 xmax: 661086.2 ymax: 6867081
Projected CRS: RGF93 v1 / Lambert-93
Let’s count the number of people injured (nbacc
) per IRIS:
acc_iris <- iris.75 |> st_join(accidents.2019.paris) |>
group_by(CODE_IRIS) |> dplyr::summarize(nb_acc = n(),
nb_acc_grav = sum(if_else(grav%in%c(2,3), 1, 0),
na.rm = TRUE),
nb_vl = sum(if_else(catv == "VL seul", 1, 0),
na.rm = TRUE),
nb_edp = sum(if_else(catv == "EDP à moteur", 1, 0),
na.rm = TRUE),
nb_velo = sum(if_else(catv == "Bicyclette", 1, 0),
na.rm = TRUE)
)
head(acc_iris,1)
Simple feature collection with 1 feature and 6 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 651771.6 ymin: 6862103 xmax: 652179 ymax: 6862406
Projected CRS: RGF93 v1 / Lambert-93
# A tibble: 1 × 7
CODE_IRIS nb_acc nb_acc_grav nb_vl nb_edp nb_velo geom
<chr> <int> <dbl> <dbl> <dbl> <dbl> <MULTIPOLYGON [m]>
1 751010101 27 1 11 1 4 (((652130.3 6862122, 652126…
library(ggplot2)
ggplot() +
geom_sf(data = acc_iris, colour = "ivory3", fill = "ivory") +
geom_sf(data = acc_iris |> st_centroid(),
aes(size= nb_acc), colour="#E84923CC", show.legend = 'point') +
scale_size(name = "Nombre d'accidents",
breaks = c(1,10,100,200),
range = c(0,5)) +
coord_sf(crs = 2154, datum = NA,
xlim = st_bbox(iris.75)[c(1,3)],
ylim = st_bbox(iris.75)[c(2,4)]) +
theme_minimal() +
theme(panel.background = element_rect(fill = "ivory",color=NA),
plot.background = element_rect(fill = "ivory",color=NA),legend.position = "bottom") +
labs(title = "Nombre d'accidents de la route à Paris par iris",
caption = "fichier BAAC 2019, ONISR\nantuki & comeetie, 2023",x="",y="")
library(RColorBrewer) #pour les couleurs des palettes
# Quintiles de la part des accidents ayant eu lieu à vélo
perc_velo = 100*acc_iris$nb_velo/acc_iris$nb_acc
bks <- c(0,round(quantile(perc_velo[perc_velo!=0],na.rm=TRUE,probs=seq(0,1,0.25)),1))
# Intégration dans la base de données
acc_iris <- acc_iris |> mutate(txaccvelo = 100*nb_velo/nb_acc,
txaccvelo_cat = cut(txaccvelo,bks,include.lowest = TRUE))
# Carte
ggplot() +
geom_sf(data = iris.75,colour = "ivory3",fill = "ivory") +
geom_sf(data = acc_iris, aes(fill = txaccvelo_cat)) +
scale_fill_brewer(name = "Part (En %)",
palette = "Reds",
na.value = "grey80") +
coord_sf(crs = 2154, datum = NA,
xlim = st_bbox(iris.75)[c(1,3)],
ylim = st_bbox(iris.75)[c(2,4)]) +
theme_minimal() +
theme(panel.background = element_rect(fill = "ivory",color=NA),
plot.background = element_rect(fill = "ivory",color=NA),legend.position="bottom") +
labs(title = "Part des accidentés à vélos",
subtitle = "par arrondissement à Paris en 2019",
caption = "fichier BAAC 2019, ONISR\nantuki & comeetie, 2023",
x = "", y = "")
mapsf
Below is an example of a similar map created using the mapsf
library syntax.
library(mapsf)
mf_theme("default",cex=0.9,mar=c(0,0,1.2,0),bg="ivory")
mf_init(x = acc_iris, expandBB = c(0, 0, 0, .15))
mf_map(acc_iris,add = TRUE,col = "ivory2")
# Plot symbols with choropleth coloration
mf_map(
x = acc_iris |> st_centroid(),
var = c("nb_acc", "txaccvelo"),
type = "prop_choro",
border = "grey50",
lwd = 0.1,
leg_pos = c("topright","right"),
leg_title = c("Nombre d'accidents", "Part des accidentés à vélo"),
breaks = c(0,8,15,25,100),
nbreaks = 5,
inches= 0.16,
pal = "Reds",
leg_val_rnd = c(0, 0),
leg_frame = c(TRUE, TRUE)
)
mf_layout(
title = "Nombre d'accidents de la route et proportion d'accidents impliquant des vélos",
credits = "fichier BAAC 2019, ONISR\nantuki & comeetie, 2023",
frame = TRUE)
Catch up!
Work on your graded exam
Handbooks, videos, cheatsheets