5 Practice Interview Questions on SQL Joins

Question 1

1. What is a JOIN clause?

Answer

Let's start with the basics: what is a JOIN clause, anyway? If you're familiar with SQL, you've probably used JOINs extensively in the past. But using them is different from being able to explain what they do — so let's go over a high-level definition to start us off.

Simply put, a JOIN combines records from two separate tables. In SQL, we often come across situations in which related data is stored in two separate tables. JOINs help bring that data back together.

Here's an example: let's say we have two tables in a database that tracks sales for an e-commerce company. One table is called customers, and contains data on individual customers, like first_name and email. The second table is called orders, and contains information on individual orders that have been placed, like order_placed_date.

Each order in our database is placed by a customer, but we don't keep the customer's information within the orders table. Why not? Because if the same customer placed multiple orders, and we kept track of customer information within the orders table, we'd be duplicating data unnecessarily. By separating customer information into its own customers table, we can reduce redundancy and make updates and changes much easier to handle.

So, we include a field called customer_id within each record on the orders table. This ID is linked to a customer_id on the customers table, which contains non-redundant data for each individual customer.

When we want to bring two tables together, we use a JOIN statement to combine data as necessary.

Here's how these two tables might look in practice:

CREATE TABLE customers (
	customer_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
	first_name VARCHAR(255) NOT NULL,
	last_name VARCHAR(255) NOT NULL,
	email VARCHAR(255) NOT NULL
);


CREATE TABLE orders (
	order_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
	customer_id INT NOT NULL,
	order_placed_date DATE NOT NULL,
	FOREIGN KEY (customer_id) REFERENCES customers (customer_id)
);
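To make the idea concrete, here is a minimal sketch of a JOIN across these two tables. It assumes Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for the MySQL-style schema above (so AUTO_INCREMENT becomes SQLite's INTEGER PRIMARY KEY), and the sample customers and dates are invented purely for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema above (an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL,
    email TEXT NOT NULL
);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    order_placed_date TEXT NOT NULL,
    FOREIGN KEY (customer_id) REFERENCES customers (customer_id)
);
-- Hypothetical sample data.
INSERT INTO customers VALUES
    (1, 'Ada', 'Lopez', 'ada@example.com'),
    (2, 'Ben', 'Okoye', 'ben@example.com');
INSERT INTO orders VALUES
    (1, 1, '2024-01-05'),
    (2, 1, '2024-02-11'),
    (3, 2, '2024-02-12');
""")

# Each order row is combined with the customer row that shares its customer_id.
rows = conn.execute("""
    SELECT o.order_id, c.first_name, c.email, o.order_placed_date
    FROM orders AS o
    JOIN customers AS c ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()

for row in rows:
    print(row)
```

Ada's details are stored once in customers, yet they appear alongside both of her orders in the joined result.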

Question 2

2. What are the different types of SQL JOIN clauses, and how are they used?

Answer

In SQL, a JOIN clause is used to return a table that merges the contents of two or more other tables together. For example, if we had two tables — one containing information on Customers and another containing information on the Orders various customers have placed — we could use a JOIN clause to bring them together to create a new table: a complete list of orders by customer, with all necessary information to make shipments.

There are multiple types of JOIN clauses, and they all serve slightly different functions:

INNER JOIN returns the rows for which there is a match in both tables specified. It's the default join type, so if you just type JOIN without a qualifier such as LEFT or RIGHT, an INNER JOIN will be used.
LEFT JOIN will return all results from the left table in your statement, matched against rows in the right table when possible. If a row in the left table does not contain a corresponding match in the right table, it will still be listed — with NULL values in columns for the right table.
RIGHT JOIN will return all results from the right table in your statement, matched against rows in the left table when possible. If a row in the right table does not contain a corresponding match in the left table, it will still be listed — with NULL values in columns for the left table.
FULL JOIN will return all results from both the left and the right tables in your statement. If there are rows in the left table that do not match the right table or vice versa, all data will still be pulled in, but SQL will output NULL values in the columns that have no match. (Note that some databases, including MySQL, do not support FULL JOIN directly.)
CROSS JOIN returns the Cartesian product of two tables — in other words, each individual row of the left table matched with each individual row of the right table.
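The differences between these join types are easiest to see by comparing results on the same data. The following is a rough sketch, assuming Python's built-in sqlite3 module and a simplified, hypothetical version of the customers and orders tables in which one customer ('Cy') has placed no orders. RIGHT JOIN and FULL JOIN are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, hypothetical tables: 3 customers, 3 orders, 'Cy' has no orders.
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, first_name TEXT NOT NULL);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers (customer_id)
);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Ben'), (3, 'Cy');
INSERT INTO orders VALUES (1, 1), (2, 1), (3, 2);
""")

def run(join_sql):
    """Run the same SELECT with a different join clause plugged in."""
    return conn.execute(
        "SELECT c.first_name, o.order_id FROM customers AS c "
        + join_sql
        + " ORDER BY c.customer_id, o.order_id"
    ).fetchall()

# Only customers that have at least one matching order.
inner_rows = run("INNER JOIN orders AS o ON c.customer_id = o.customer_id")

# Every customer; 'Cy' appears with None (NULL) in the order column.
left_rows = run("LEFT JOIN orders AS o ON c.customer_id = o.customer_id")

# Every customer paired with every order: 3 x 3 rows.
cross_rows = run("CROSS JOIN orders AS o")

print(len(inner_rows), len(left_rows), len(cross_rows))
```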

Question 3

3. What is the difference between INNER JOIN and LEFT JOIN?

Answer

When constructing a SELECT query that combines two or more tables, choosing the right JOIN type is half the battle. So how do we know when to use INNER JOIN, and when to use the more complex variants like RIGHT JOIN and LEFT JOIN?

Simply put, INNER JOIN should be used when we want to exclude all records that do not have a match in both of the tables we're joining.

Let's check out an example of this: let's say that we have two tables, one called students and one called advisors. The students table contains a field called advisor_id that references the advisor_id in the advisors table:

CREATE TABLE `advisors` (
  `advisor_id` int(11) NOT NULL AUTO_INCREMENT,
  `first_name` varchar(255) NOT NULL,
  `last_name` varchar(255) NOT NULL,
  PRIMARY KEY (`advisor_id`)
);

CREATE TABLE `students` (
  `student_id` int(11) NOT NULL AUTO_INCREMENT,
  `first_name` varchar(255) NOT NULL,
  `last_name` varchar(255) NOT NULL,
  `advisor_id` int(11) DEFAULT NULL,
  PRIMARY KEY (`student_id`),
  KEY `advisor_id` (`advisor_id`),
  CONSTRAINT `students_ibfk_1` FOREIGN KEY (`advisor_id`) REFERENCES `advisors` (`advisor_id`)
);

Here's the catch: not all students have an advisor, and not all advisors are assigned to students:

SELECT * FROM students;

/*
+------------+------------+------------+------------+
| student_id | first_name | last_name  | advisor_id |
+------------+------------+------------+------------+
|          1 | Tanisha    | Blake      |          2 |
|          2 | Jess       | Goldsmith  |       NULL |
|          3 | Tracy      | Wu         |          3 |
|          4 | Alvin      | Grand      |          1 |
|          5 | Felix      | Zimmermann |          2 |
+------------+------------+------------+------------+
5 rows in set (0.00 sec)
*/

SELECT * FROM advisors;
/*
+------------+------------+-----------+
| advisor_id | first_name | last_name |
+------------+------------+-----------+
|          1 | James      | Francis   |
|          2 | Amy        | Cheng     |
|          3 | Lamar      | Alexander |
|          4 | Anita      | Woods     |
+------------+------------+-----------+
4 rows in set (0.00 sec)
*/

If we wanted to pull information on students and their advisors, excluding students who are not assigned to an advisor, we would use an INNER JOIN:

SELECT s.first_name AS student_name, a.first_name AS advisor_name
FROM students AS s
INNER JOIN advisors AS a ON s.advisor_id = a.advisor_id;

/*
+--------------+--------------+
| student_name | advisor_name |
+--------------+--------------+
| Alvin        | James        |
| Tanisha      | Amy          |
| Felix        | Amy          |
| Tracy        | Lamar        |
+--------------+--------------+
4 rows in set (0.00 sec)
*/

Since we're using an INNER JOIN, the results of this query exclude students for whom the advisor_id is set to NULL. Let's take a look at what happens when we use a LEFT JOIN instead:

SELECT s.first_name AS student_name, a.first_name AS advisor_name
FROM students AS s
LEFT JOIN advisors AS a ON s.advisor_id = a.advisor_id;

/*
+--------------+--------------+
| student_name | advisor_name |
+--------------+--------------+
| Alvin        | James        |
| Tanisha      | Amy          |
| Felix        | Amy          |
| Tracy        | Lamar        |
| Jess         | NULL         |
+--------------+--------------+
5 rows in set (0.01 sec)
*/

Notice that our data set now contains information on all students, even those for whom the advisor is set to NULL!
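One practical consequence of this difference is that a LEFT JOIN can be filtered down to just the unmatched rows. The sketch below assumes Python's built-in sqlite3 module as a stand-in for MySQL and loads the same sample students and advisors shown above. It checks which rows the LEFT JOIN adds on top of the INNER JOIN, then uses a WHERE ... IS NULL filter to list students without an advisor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE advisors (
    advisor_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL
);
CREATE TABLE students (
    student_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL,
    advisor_id INTEGER REFERENCES advisors (advisor_id)
);
INSERT INTO advisors VALUES
    (1, 'James', 'Francis'), (2, 'Amy', 'Cheng'),
    (3, 'Lamar', 'Alexander'), (4, 'Anita', 'Woods');
INSERT INTO students VALUES
    (1, 'Tanisha', 'Blake', 2), (2, 'Jess', 'Goldsmith', NULL),
    (3, 'Tracy', 'Wu', 3), (4, 'Alvin', 'Grand', 1),
    (5, 'Felix', 'Zimmermann', 2);
""")

query = """
    SELECT s.first_name, a.first_name
    FROM students AS s
    {join} JOIN advisors AS a ON s.advisor_id = a.advisor_id
"""
inner_rows = set(conn.execute(query.format(join="INNER")).fetchall())
left_rows = set(conn.execute(query.format(join="LEFT")).fetchall())

# Rows that only the LEFT JOIN returns: students with no matching advisor.
only_in_left = left_rows - inner_rows

# The same students, found directly by keeping only the unmatched rows.
unadvised = conn.execute("""
    SELECT s.first_name
    FROM students AS s
    LEFT JOIN advisors AS a ON s.advisor_id = a.advisor_id
    WHERE a.advisor_id IS NULL
""").fetchall()

print(only_in_left, unadvised)
```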

Question 4

4. What is the difference between LEFT JOIN and RIGHT JOIN?

Answer

LEFT JOIN and RIGHT JOIN do very similar things: each returns the results of a JOIN query including all records from one of the two tables. The only difference is that LEFT JOIN keeps all records from the left table of the query, and RIGHT JOIN keeps all records from the right table!

To make this a bit clearer, let's use our example of students and advisors above: a LEFT JOIN of students onto advisors will show a list of all students, even those who do not have advisors — because students is the LEFT table:

SELECT s.first_name AS student_name, a.first_name AS advisor_name
FROM students AS s
LEFT JOIN advisors AS a ON s.advisor_id = a.advisor_id;

/*
+--------------+--------------+
| student_name | advisor_name |
+--------------+--------------+
| Alvin        | James        |
| Tanisha      | Amy          |
| Felix        | Amy          |
| Tracy        | Lamar        |
| Jess         | NULL         |
+--------------+--------------+
5 rows in set (0.01 sec)
*/

Notice that the advisor 'Anita' is excluded in the table above, because she is not assigned to any students.

On the other hand, a RIGHT JOIN of students onto advisors will show a list of all students who are assigned to advisors, plus a list of advisors not assigned to students — because advisors is the RIGHT table:

SELECT s.first_name AS student_name, a.first_name AS advisor_name
FROM students AS s
RIGHT JOIN advisors AS a ON s.advisor_id = a.advisor_id;

/*
+--------------+--------------+
| student_name | advisor_name |
+--------------+--------------+
| Alvin        | James        |
| Tanisha      | Amy          |
| Felix        | Amy          |
| Tracy        | Lamar        |
| NULL         | Anita        |
+--------------+--------------+
5 rows in set (0.00 sec)
*/

Notice that the student 'Jess' is excluded in the table above, because she is not assigned to an advisor.
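Because the two joins mirror each other, a RIGHT JOIN can always be rewritten as a LEFT JOIN with the table order swapped. The sketch below illustrates this, assuming Python's built-in sqlite3 module as a stand-in for MySQL and a trimmed-down version of the sample tables (first names only). It uses the swapped LEFT JOIN form, which also runs on older SQLite versions that lack RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Trimmed-down sample tables (last_name omitted for brevity).
CREATE TABLE advisors (advisor_id INTEGER PRIMARY KEY, first_name TEXT NOT NULL);
CREATE TABLE students (
    student_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    advisor_id INTEGER REFERENCES advisors (advisor_id)
);
INSERT INTO advisors VALUES (1, 'James'), (2, 'Amy'), (3, 'Lamar'), (4, 'Anita');
INSERT INTO students VALUES
    (1, 'Tanisha', 2), (2, 'Jess', NULL), (3, 'Tracy', 3),
    (4, 'Alvin', 1), (5, 'Felix', 2);
""")

# Equivalent to: FROM students AS s RIGHT JOIN advisors AS a ON ...
# advisors is now the left table, so every advisor is kept.
swapped_rows = conn.execute("""
    SELECT s.first_name AS student_name, a.first_name AS advisor_name
    FROM advisors AS a
    LEFT JOIN students AS s ON s.advisor_id = a.advisor_id
    ORDER BY a.advisor_id, s.student_id
""").fetchall()

for row in swapped_rows:
    print(row)
```

The rows match the RIGHT JOIN output above: 'Anita' appears with a NULL student, and 'Jess' is absent.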

Question 5

5. How should data be structured to facilitate JOIN clauses in a one-to-many relationship? What about a many-to-many relationship?

Answer

This one is a bit trickier, and is an interesting database design question.

Generally, one-to-many relationships are structured using a single FOREIGN KEY. Consider our example of customers and orders above:

CREATE TABLE customers (
	customer_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
	first_name VARCHAR(255) NOT NULL,
	last_name VARCHAR(255) NOT NULL,
	email VARCHAR(255) NOT NULL
);


CREATE TABLE orders (
	order_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
	customer_id INT NOT NULL,
	order_placed_date DATE NOT NULL,
	FOREIGN KEY (customer_id) REFERENCES customers (customer_id)
);

This is a one-to-many relationship, because one customer can place multiple orders, but one order cannot be assigned to more than one customer. As such, we've defined it with a simple foreign key in the orders table pointing to a given customer_id, and we can use JOIN clauses in our SELECT queries fairly easily.
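As a rough illustration of how this structure behaves, the sketch below assumes Python's built-in sqlite3 module as a stand-in for MySQL, with invented sample rows. It counts orders per customer through the foreign key, then shows that the constraint rejects an order pointing at a customer that does not exist. SQLite only enforces foreign keys when PRAGMA foreign_keys is switched on, which is a SQLite-specific detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign keys are not enforced unless this is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL
);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    order_placed_date TEXT NOT NULL,
    FOREIGN KEY (customer_id) REFERENCES customers (customer_id)
);
-- Hypothetical sample data: one customer has many orders, one has none.
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Ben'), (3, 'Cy');
INSERT INTO orders VALUES
    (1, 1, '2024-01-05'), (2, 1, '2024-02-11'), (3, 2, '2024-02-12');
""")

# One row per customer; COUNT ignores the NULL order_id produced for 'Cy'.
orders_per_customer = conn.execute("""
    SELECT c.first_name, COUNT(o.order_id) AS order_count
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()

# An order for a non-existent customer violates the foreign key.
try:
    conn.execute("INSERT INTO orders VALUES (4, 99, '2024-03-01')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(orders_per_customer, rejected)
```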

Many-to-many relationships are a bit more complicated. For example, what if we had an orders table and a products table with a many-to-many relationship: any order can contain multiple products, and any product can be assigned to multiple orders. How would we structure our database?

The answer: we use an intermediary mapping table with two FOREIGN KEYs. Consider the following:

CREATE TABLE orders (
	order_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
	order_placed_date DATE NOT NULL
);


CREATE TABLE products (
	product_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
	name VARCHAR(255) NOT NULL,
	price INT NOT NULL
);

CREATE TABLE products_to_orders (
	product_to_order_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
	order_id INT NOT NULL,
	product_id INT NOT NULL,
	FOREIGN KEY (order_id) REFERENCES orders(order_id),
	FOREIGN KEY (product_id) REFERENCES products(product_id)
);

Above, we've created a separate table called products_to_orders that maps items on the products table to items on the orders table. Each row in our products_to_orders table represents one product-order combination, so that multiple products can be assigned to one order — and a single product can be assigned to multiple orders.

In this example, we need to use two JOIN clauses in a single query to link all these tables together: one to link products_to_orders to products, and one to link products_to_orders to orders.
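A query with those two JOIN clauses might look like the following sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the sample products, prices (treated here as whole cents, which is an assumption), and dates are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    order_placed_date TEXT NOT NULL
);
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    price INTEGER NOT NULL
);
CREATE TABLE products_to_orders (
    product_to_order_id INTEGER PRIMARY KEY,
    order_id INTEGER NOT NULL,
    product_id INTEGER NOT NULL,
    FOREIGN KEY (order_id) REFERENCES orders (order_id),
    FOREIGN KEY (product_id) REFERENCES products (product_id)
);
-- Hypothetical sample data: order 1 has two products; 'Mouse' is on two orders.
INSERT INTO orders VALUES (1, '2024-03-01'), (2, '2024-03-02');
INSERT INTO products VALUES (1, 'Keyboard', 4500), (2, 'Mouse', 1500);
INSERT INTO products_to_orders VALUES (1, 1, 1), (2, 1, 2), (3, 2, 2);
""")

# First JOIN: orders -> mapping table. Second JOIN: mapping table -> products.
order_lines = conn.execute("""
    SELECT o.order_id, p.name
    FROM orders AS o
    JOIN products_to_orders AS po ON po.order_id = o.order_id
    JOIN products AS p ON p.product_id = po.product_id
    ORDER BY o.order_id, p.product_id
""").fetchall()

for line in order_lines:
    print(line)
```

Each row of the result corresponds to one row of products_to_orders, with the IDs resolved into order and product details.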
