Using SQL Join statements in Join tables
While working through the SQL section of the Flatiron School’s Access Labs course, I found my first example of multiple JOIN
statements. To understand how using multiple JOIN
statements worked, it helped to think about what the query was returning at each level. Just like you would do when iterating over a nested array. To give context, here are the 3 tables we are working with:
and the code itself:
Before we jump into any of that though, lets talk about why we even need to JOIN
3 tables instead of just using 2.
The limits of “has many” and “belongs to” relationships
Sticking with cats and owners metaphor, an owner “has many” cats, and each cat “belongs to” an owner. We could easily make the tables owners and cats, and in the cats table create a column containing the unique id of that cat’s owner from the owners table. We could see all the cats of an owner by looking for the owner’s id in the cats table:
SELECT cats.name
FROM cats
WHERE owner_id = 3
But what if a cat likes to get multiple dinners from every house on the block? Now a cat “has many” owners, and each owner thinks they only “belong to” that cat. That fat, sneaky cat.
The has many/belongs to relationship is no longer very efficient or clear because cats and owners can be both sides, they are now in a “many to many” relationship. To account for all relationships possible, we use a “join table.” in this case it’s the one on the lower right, cats_owners. It “joins” the two other tables by simply putting the id’s of owners and cats in the same row.
Using a join table with one JOIN
By querying cats_owners and joining it with either the owners or cats table, we can find what cats and owners are associated. Here we will ask who Grumpy Cat eats with:
SELECT owners.name AS grumpy_eats_with
FROM cats_owners
INNER JOIN owners
on cats_owners.owner_id = owners.id
WHERE cats_owners.cat_id = 3;grumpy_eats_with
----------------
Sophie
Penny
We can see that cat number 3 (Grumpy Cat) has dinner at two houses: Sophie’s and Penny’s. But this is clunky. By only joining 2 of the 3 tables, we can’t get the name of both cats and owners at the same time. Worse, we’ll have to alter that SQL code every time we add more cats and owners. Like cats, developers strive to do as little as possible, so this code is no good.
To always know every relationship, we have to join all three tables at once, but how does that actually work?
A SQL query essentially returns a table
As you know, a SQL query can return a specific value or rows. Take this example, it will return all rows in owners and cats_owners where an owner’s id is found to match in both tables:
Our return value is essentially a new table, right? It’s the combination of the relevant rows from cats_owners and cats. Let’s chain another JOIN
to this new “table” and see what happens:
And here we see all 3 tables joined together, but only the rows where a cats_owners.cat_id that was in the previously returned “table” matches a new cats.id. That way we’ll only get cats who have owners, not Hana or Lil’ Bub. But this is a huge table, and we only care about the two names columns, so lets SELECT
those instead of *
Now that we only used SELECT
on cats.name and owners.name (aliased to avoid two columns named “name”), our final return is nicely formatted to show what cat is associated with what owner. It’s the same structure as our cats_owners table, but instead of the id’s, we see names.
The original code
Now this code:
is more clear. Our first join combines cats_owners and owners together, then it adds in cats. Then, on this newly returned “table,” it uses a WHERE
statement to only return the sum of the net_worth of all the cats who are owned by Sophie (she’s the owner with the id of 2).
Keep track of each level
Joining multiple tables may sound complicated, it’s really similar to dealing with any other layered data, even simple arrays. Don’t psych yourself out when searching through huge tables. Always take a second to understand the value you’re working with at each level, and you’ll never get too lost.
Happy coding everyone,
Mike