2 Introduction to SQL
3 Getting Started
This week we’ll be learning how to use SQL to query data from a database:
Clicking the link will take you to Google Colab, where you can run the notebook in your browser. You can also download the notebook and run it locally if you prefer.
Once you’ve completed this, you can test your knowledge by answering the questions in the lab below. You can also access it on Google Colab here.
4 Lab
Datasets:
The following datasets are used in this lab. You don’t need to download them manually, they can be accessed directly from the notebook.
# %pip install duckdb duckdb-engine jupysql
import duckdb
%load_ext sql
%config SqlMagic.autopandas = True
%config SqlMagic.feedback = False
%config SqlMagic.displaycon = False
4.1 Question 1: Creating Tables
Create a database, then write a SQL query to create a table named nyc_subway_stations
and load the data from the file nyc_subway_stations.tsv
into it. Similarly, create a table named nyc_neighborhoods
and load the data from the file nyc_neighborhoods.tsv
into it.
4.2 Question 2: Column Filtering
Write a SQL query to display the ID
, NAME
, and BOROUGH
of each subway station in the nyc_subway_stations
dataset.
4.3 Question 3: Row Filtering
Write a SQL query to find all subway stations in the nyc_subway_stations
dataset that are located in the borough of Manhattan.
4.4 Question 4: Sorting Results
Write a SQL query to list the subway stations in the nyc_subway_stations
dataset in alphabetical order by their names.
4.5 Question 5: Unique Values
Write a SQL query to find the distinct boroughs represented in the nyc_subway_stations
dataset.
4.6 Question 6: Counting Rows
Write a SQL query to count the number of subway stations in each borough in the nyc_subway_stations
dataset.
4.7 Question 7: Aggregating Data
Write a SQL query to list the number of subway stations in each borough, sorted in descending order by the count.
4.8 Question 8: Joining Tables
Write a SQL query to join the nyc_subway_stations
and nyc_neighborhoods
datasets on the borough name, displaying the subway station name and the neighborhood name.
4.9 Question 9: String Manipulation
Write a SQL query to display the names of subway stations in the nyc_subway_stations
dataset that contain the word “St” in their names.
4.10 Question 10: Filtering with Multiple Conditions
Write a SQL query to find all subway stations in the nyc_subway_stations
dataset that are in the borough of Brooklyn and have routes that include the letter “R”.