Using SQL to find the missing packages in the mail delivery service’s database

1 Missing Package Problem Scenario

Imagine you are a mail clerk for the city of Boston who oversees the delivery of mail across the city. For the most part, all packages sent are eventually delivered. However, every once in while, a mystery lands on your desk: a missing package! For each customer that comes to you with a report of a missing package, your task is to help each customer find their missing package and answer their relevant questions using just the information in the mail delivery service’s database, packages.db, which contains data on the transit of packages around the city.

The specific problems to solve and the schema of the database are described below.

1.1 Problem 1: The Missing Letter

The first report of a missing package comes from Anneke. Anneke walks up to your counter and tells you the following:

Clerk, my name’s Anneke. I live over at 900 Somerville Avenue. Not long ago, I sent out a special letter. It’s meant for my friend Varsha. She’s starting a new chapter of her life at 2 Finnegan Street, uptown. (That address, let me tell you: it was a bit tricky to get right the first time.) The letter is a congratulatory note—a cheery little paper hug from me to her, to celebrate this big move of hers. Can you check if it’s made its way to her yet?

Your job is to find out:

  1. At what type of address did the Lost Letter end up?

  2. At what address did the Lost Letter end up?

1.2 Problem 2: The Devious Delivery

The second report of a missing package comes from a mysterious fellow from out of town. They walk up to your counter and tell you the following:

Good day to you, deliverer of the mail. You might remember that not too long ago I made my way over from the town of Fiftyville. I gave a certain box into your reliable hands and asked you to keep things low. My associate has been expecting the package for a while now. And yet, it appears to have grown wings and flown away. Ha! Any chance you could help clarify this mystery? Afraid there’s no “From” address. It’s the kind of parcel that would add a bit more… quack to someone’s bath times, if you catch my drift.

Your job is to find out:

  1. At what type of address did the Devious Delivery end up?
  2. What were the contents of the Devious Delivery?

1.3 Problem 3: The Forgotten Gift

The third report of a missing package comes from a grandparent who lives down the street from the post office. They approach your counter and tell you the following:

Oh, excuse me, Clerk. I had sent a mystery gift, you see, to my wonderful granddaughter, off at 728 Maple Place. That was about two weeks ago. Now the delivery date has passed by seven whole days and I hear she still waits, her hands empty and heart filled with anticipation. I’m a bit worried wondering where my package has gone. I cannot for the life of me remember what’s inside, but I do know it’s filled to the brim with my love for her. Can we possibly track it down so it can fill her day with joy? I did send it from my home at 109 Tileston Street.

Your job is to find out:

  1. What are the contents of the Forgotten Gift?
  2. Who has the Forgotten Gift?

2 Schema

packages.db represents all recent package deliveries in the city of Boston with the help of following entities:

  • Drivers, who are the people that deliver packages
  • The packages themselves
  • Addresses, such as 1234 Main Street
  • Scans of packages, which represent confirmations a delivery driver picked up or dropped off a given package

These entities are related per the entity relationship (ER) diagram below:

ER diagram of the packages database

Let us now establish the SQLite connection to the database using DBI package and also list the tables in the database.

Code
library(DBI)
con <- dbConnect(RSQLite::SQLite(), "packages.db") # establish SQLite connection to the database

dbListTables(con) #list the tables in the database
[1] "addresses" "drivers"   "packages"  "scans"    

Click the tabsets below to learn more about the schema of each individual table in the packages.db.

The addresses table contains the following columns:

  • id, which is the ID of the address
  • address, which is the street address itself (i.e., 7660 Sharon Street)
  • type, which is the type of address (i.e., residential, commercial, etc.)

The sample data of the addresses table can be seen below.

Code
SELECT * FROM "addresses"
LIMIT 5;
5 records
id address type
1 7660 Sharon Street Residential
2 60 Drake Place Residential
3 88 City Point Court Residential
4 266 Dorchester Avenue Residential
5 2 Otis Place Residential

The drivers table contains the following columns:

  • id, which is the ID of the driver
  • name, which is the first name of the driver

The sample data of the drivers table can be seen below.

Code
SELECT * FROM "drivers"
LIMIT 5;
5 records
id name
1 Matthew
2 Isabel
3 Julianna
4 Varsha
5 Jacob

The packages table contains the following columns:

  • id, which is the ID of the package
  • contents, which contains the contents of the package
  • from_address_id, which is the ID of the address from which the package was sent
  • to_address_id, which is the ID of the address to which the package was sent. It’s not necessarily where it ended up!

The sample data of the packages table can be seen below.

Code
SELECT * FROM "packages"
LIMIT 5;
5 records
id contents from_address_id to_address_id
1 Sticky notes 8575 4635
2 Shears 8431 7641
3 Index cards 2107 3624
4 Glue stick 6954 3740
5 Trading cards 4530 5564

The scans table contains the following columns:

  • id, which is the ID of the scan
  • driver_id, which is the ID of the driver who created the scan
  • package_id, which is the ID of the package scanned
  • address_id, which is the ID of the address where the package was scanned
  • action, which indicates whether the package was picked up (“Pick”) or dropped off (“Drop”)
  • timestamp, which is the day and time at which the package was scanned

The sample data of the scans table can be seen below.

Code
SELECT * FROM "scans"
LIMIT 5;
5 records
id driver_id package_id address_id action timestamp
1 11 8502 1063 Pick 2023-07-11 15:16:05.340221
2 8 3320 9551 Pick 2023-07-11 15:19:39.359315
3 1 2879 9589 Pick 2023-07-11 15:23:47.208594
4 20 2240 3464 Pick 2023-07-11 15:28:16.920400
5 1 8664 8013 Pick 2023-07-11 15:32:42.265139

3 Solution to Problem 1: The Missing Letter

From the problem statement, it is clear that we know the from address, i.e., 900 Somerville Avenue, and the package is some kind of letter. However, we do not know the to address.

To answer the above two questions, we need to query the packages table to find the id of the packages using the from address, i.e., ‘900 Somerville Avenue’ and filtering for entries that contain the word ‘letter’ in the ‘contents’ column. Once we know the package_id, we can find the address_id, since the package was delivered somewhere, by setting the ‘action’ column to ‘Drop’ by querying the ‘scans’ table. Finally, we can then determine the type of address and the address where the letter ended up by querying the ‘addresses’ table using the obtained ‘address_id’.

Code

SELECT "type", "address" FROM "addresses"
WHERE "id" IN (
    SELECT "address_id" FROM "scans"
    WHERE "package_id" = (
        SELECT "id" FROM "packages"
        WHERE "from_address_id" = 
        (SELECT "id" FROM "addresses" WHERE "address" = '900 Somerville Avenue')
        AND 
        "contents" LIKE '%letter%'
        ) 
    AND "action" = 'Drop' 
);
1 records
type address
Residential 2 Finnigan Street

So, the answers to the questions are as follows:

  1. At what type of address did the Lost Letter end up?
    • Residential
  2. At what address did the Lost Letter end up?
    • 2 Finnigan Street

4 Solution to Problem 2: The Devious Delivery

Here, all we know is that the from address is not provided, and the package was delivered somewhere.

To answer the first question, we need to query the ‘scans’ table to find the address_id using the ‘package_id’ where the from address is NULL and set the ‘action’ column to ‘Drop’ as the package was delivered. After obtaining the address_id, we can use it to find the the type of address where the package ended up by querying the ‘addresses’ table.

For the second question, i.e., to know the contents of the package, we just need to query the ‘packages’ table where the from address is NULL.

Code

SELECT "type" FROM "addresses"
WHERE "id" = (
    SELECT "address_id" FROM "scans"
    WHERE "package_id" IN (
        SELECT "id" FROM "packages"
        WHERE "from_address_id" IS NULL
    )
    AND "action" = 'Drop' 
);
1 records
type
Police Station
Code

SELECT "contents" FROM "packages"
WHERE "from_address_id" IS NULL;
1 records
contents
Duck debugger

So, the answers to the questions are as follows:

  1. At what type of address did the Devious Delivery end up?
    • Police Station
  2. What were the contents of the Devious Delivery?
    • Duck debugger

5 Solution to Problem 3: The Forgotten Gift

In this problem, we know the from address, i.e., 109 Tileston Street, and the to address, i.e., 728 Maple Place, but we do not know the contents of the package. The additional crucial information that we know is the package is yet to be delivered. This implies that the package is still with the driver.

To solve the first question, we need to query the ‘packages’ table where the from address is ‘109 Tileston Street’ and the to address is ‘728 Maple Place’.

For the second question, all we need to know is the name of that driver since the package is still with the driver. For this, we need to query the ‘scans’ table to get the driver_id of that single driver involved. In doing so, we need to make sure that we get only one ‘driver_id’ by conditioning the query on columns that we know so far, i.e., ‘package_id’, which can be found using the from address and the to address, ‘address_id’, which cannot be the from address, and the ‘action’ is ‘Pick’. Once we know the ‘driver_id’, we can find the name of the driver using that.

Code

SELECT "contents" FROM "packages"
WHERE "from_address_id" = (
    SELECT "id" FROM "addresses"
    WHERE "address" = '109 Tileston Street'
)
AND "to_address_id" = (
    SELECT "id" FROM "addresses"
    WHERE "address" = '728 Maple Place'
);
1 records
contents
Flowers
Code

SELECT "name" FROM "drivers"
WHERE "id" IN (
    SELECT "driver_id" FROM "scans"
    WHERE "package_id" = (
        SELECT "id" FROM "packages"
        WHERE "to_address_id" = (
            SELECT "id" FROM "addresses"
            WHERE "address" = '728 Maple Place'
        )
        AND "from_address_id" = (
            SELECT "id" FROM "addresses"
            WHERE "address" = '109 Tileston Street'
        )
    )
    AND "address_id" != (
        SELECT "id" FROM "addresses"
        WHERE "address" = '109 Tileston Street'
    )
    AND "action" = "Pick"
);
1 records
name
Mikel

So, the answers to the questions are as follows:

  1. What are the contents of the Forgotten Gift?
    • Flowers
  2. Who has the Forgotten Gift?
    • Mikel
Back to top