4 Spatial SQL II

This week we’re going to be looking at Spatial Joins. We’ll be working with DuckDB (though if you want to work with Postgres/PgAdmin, go ahead!). The notebook below contains some exercises on spatial joins using the NYC data we’ve been working with, as well as an extension that looks at ship-to-ship transfer detection using AIS data.

Spatial Joins

5 Spatial Joins Exercises

Here's a reminder of some of the functions we have seen. Hint: they should be useful for the exercises!

sum(expression): aggregate to return a sum for a set of records
count(expression): aggregate to return the size of a set of records
ST_Area(geometry) returns the area of the polygons
ST_AsText(geometry) returns WKT text
ST_Contains(geometry A, geometry B) returns the true if geometry A contains geometry B
ST_Distance(geometry A, geometry B) returns the minimum distance between geometry A and geometry B
ST_DWithin(geometry A, geometry B, radius) returns the true if geometry A is radius distance or less from geometry B
ST_GeomFromText(text) returns geometry
ST_Intersects(geometry A, geometry B) returns the true if geometry A intersects geometry B
ST_Length(linestring) returns the length of the linestring
ST_Touches(geometry A, geometry B) returns the true if the boundary of geometry A touches geometry B
ST_Within(geometry A, geometry B) returns the true if geometry A is within geometry B

Also remember the tables we have available:

nyc_census_blocks
- name, popn_total, boroname, geom
nyc_streets
- name, type, geom
nyc_subway_stations
- name, routes, geom
nyc_neighborhoods
- name, boroname, geom

6 Exercises

What subway station is in 'Little Italy'? What subway route is it on?
What are all the neighborhoods served by the 6-train? (Hint: The routes column in the nyc_subway_stations table has values like 'B,D,6,V' and 'C,6')
After 9/11, the 'Battery Park' neighborhood was off limits for several days. How many people had to be evacuated?
What neighborhood has the highest population density (persons/km2)?

7 Ship-to-Ship Transfer Detection

Now for a less structured exercise. We’re going to look at ship-to-ship transfers. The idea is that two ships meet up in the middle of the ocean, and one ship transfers cargo to the other. This is a common way to avoid sanctions, and is often used to transfer oil from sanctioned countries to other countries. We’re going to look at a few different ways to detect these transfers using AIS data.

7.1 Step 1

Create a spatial database using the following AIS data:

https://storage.googleapis.com/qm2/casa0025_ships.csv

Each row in this dataset is an AIS ‘ping’ indicating the position of a ship at a particular date/time, alongside vessel-level characteristics.

It contains the following columns:

vesselid: A unique numerical identifier for each ship, like a license plate
vessel_name: The ship’s name
vsl_descr: The ship’s type
dwt: The ship’s Deadweight Tonnage (how many tons it can carry)
v_length: The ship’s length in meters
draught: How many meters deep the ship is draughting (how low it sits in the water). Effectively indicates how much cargo the ship is carrying
sog: Speed over Ground (in knots)
date: A timestamp for the AIS signal
lat: The latitude of the AIS signal (EPSG:4326)
lon: The longitude of the AIS signal (EPSG:4326)

Create a table called ‘ais’ where each row is a different AIS ping, with no superfluous information. Construct a geometry column.

Create a second table called ‘vinfo’ which contains vessel-level information with no superfluous information.

7.2 Step 2

Use a spatial join to identify ship-to-ship transfers in this dataset. Two ships are considered to be conducting a ship to ship transfer IF:

They are within 500 meters of each other
For more than two hours
And their speed is lower than 1 knot

Some things to consider: make sure you’re not joining ships with themselves. Try working with subsets of the data first while you try different things out.