4 Spatial SQL II
This week we’re going to be looking at Spatial Joins. We’ll be working with DuckDB (though if you want to work with Postgres/PgAdmin, go ahead!). The notebook below contains some exercises on spatial joins using the NYC data we’ve been working with, as well as an extension that looks at ship-to-ship transfer detection using AIS data.
5 Spatial Joins Exercises
Here's a reminder of some of the functions we have seen. Hint: they should be useful for the exercises!
sum(expression)
: aggregate to return a sum for a set of recordscount(expression)
: aggregate to return the size of a set of recordsST_Area(geometry)
returns the area of the polygonsST_AsText(geometry)
returns WKTtext
ST_Contains(geometry A, geometry B)
returns the true if geometry A contains geometry BST_Distance(geometry A, geometry B)
returns the minimum distance between geometry A and geometry BST_DWithin(geometry A, geometry B, radius)
returns the true if geometry A is radius distance or less from geometry BST_GeomFromText(text)
returnsgeometry
ST_Intersects(geometry A, geometry B)
returns the true if geometry A intersects geometry BST_Length(linestring)
returns the length of the linestringST_Touches(geometry A, geometry B)
returns the true if the boundary of geometry A touches geometry BST_Within(geometry A, geometry B)
returns the true if geometry A is within geometry B
Also remember the tables we have available:
nyc_census_blocks
- name, popn_total, boroname, geom
nyc_streets
- name, type, geom
nyc_subway_stations
- name, routes, geom
nyc_neighborhoods
- name, boroname, geom
6 Exercises
What subway station is in 'Little Italy'? What subway route is it on?
What are all the neighborhoods served by the 6-train? (Hint: The
routes
column in thenyc_subway_stations
table has values like 'B,D,6,V' and 'C,6')After 9/11, the 'Battery Park' neighborhood was off limits for several days. How many people had to be evacuated?
What neighborhood has the highest population density (persons/km2)?
7 Ship-to-Ship Transfer Detection
Now for a less structured exercise. We’re going to look at ship-to-ship transfers. The idea is that two ships meet up in the middle of the ocean, and one ship transfers cargo to the other. This is a common way to avoid sanctions, and is often used to transfer oil from sanctioned countries to other countries. We’re going to look at a few different ways to detect these transfers using AIS data.
7.1 Step 1
Create a spatial database using the following AIS data:
https://storage.googleapis.com/qm2/casa0025_ships.csv
Each row in this dataset is an AIS ‘ping’ indicating the position of a ship at a particular date/time, alongside vessel-level characteristics.
It contains the following columns:
vesselid
: A unique numerical identifier for each ship, like a license platevessel_name
: The ship’s namevsl_descr
: The ship’s typedwt
: The ship’s Deadweight Tonnage (how many tons it can carry)v_length
: The ship’s length in metersdraught
: How many meters deep the ship is draughting (how low it sits in the water). Effectively indicates how much cargo the ship is carryingsog
: Speed over Ground (in knots)date
: A timestamp for the AIS signallat
: The latitude of the AIS signal (EPSG:4326)lon
: The longitude of the AIS signal (EPSG:4326)
Create a table called ‘ais’ where each row is a different AIS ping, with no superfluous information. Construct a geometry column.
Create a second table called ‘vinfo’ which contains vessel-level information with no superfluous information.
7.2 Step 2
Use a spatial join to identify ship-to-ship transfers in this dataset. Two ships are considered to be conducting a ship to ship transfer IF:
- They are within 500 meters of each other
- For more than two hours
- And their speed is lower than 1 knot
Some things to consider: make sure you’re not joining ships with themselves. Try working with subsets of the data first while you try different things out.