Top 10 SQL interview questions for tech professionals
Doing an SQL interview as part of your job application?
Get some advance practice with our free sample questions.
SQL, a programming language used to communicate with databases, is one of the most useful skills to know for recruiting in the tech industry. Pronounced "ess-cue-ell" and short for Structured Query Language, this incredible tool is a must-have for analyzing large data sets. In particular, it shines when applied to relational databases — unique tables of data that are all related to each other in some way.
Because SQL is so ubiquitous in the tech world, many companies conduct SQL interviews before extending job offers. This helps ensure that job applicants — particularly for roles in project management, analytics, business intelligence, and software engineering — are comfortable using SQL on the job.
If you’ve got an upcoming SQL interview, you’re probably wondering what sorts of questions you might get. Recruiters are generally vague on details, and it can be scary to walk into a conversation like this one blind.
So, we’ve provided our list of the 10 most common SQL interview questions so that you can get some practice in before your exam. With a little bit of advance preparation, you’ll feel prepared and confident on interview day.
New: Sign up for a free SQL mini-course
Make sure you're prepared for the interview that could change your career. Sign up to receive five free must-learn lessons in preparation for your SQL interview. Start preparing today!
1. What is a relational database, and what is SQL? Please provide examples as part of your explanation.
Even though you probably know what relational databases are — and what SQL itself is — it can be tough to come up with coherent, simple explanations during a live interview. To that end, be sure that you've prepared in advance to answer this simple question. Here are some tips:
A
That's a bit of a vague definition, so let's take a look at a relational database in practice. A simple version of a relational database for an online retailer might contain two separate data tables:
- Customers. A list of customer information, including customer names, contact information, and shipping preferences. Each record in this database contains a unique
customer_id
field by which the customer can be identified. - Orders. A list of orders purchased at the retailer's website. Each order listing also contains a
customer_id
field, which is used to link that order's details with the specific customer who placed the order.
Of course, we wouldn't need a multi-table database if we simply included customer information on the Orders
table. But that wouldn't be particularly efficient: if a single customer placed multiple orders, his or her name, contact information, and shipping preferences would be listed on multiple lines of the Orders
table — leading to unnecessary duplication and an unmanageably large database. Instead, we create a relational database to save space and show how different pieces of data are linked together.
2. What are the different types of SQL JOIN
clauses, and how are they used?
In SQL, a JOIN
clause is used to return a table that merges the contents of two or more other tables together. For example, if we had two tables — one containing information on Customers
and another containing information on the Orders
various customers have placed — we could use a JOIN
clause to bring them together to create a new table: a complete list of orders by customer, with all necessary information to make shipments.
There are multiple types of JOIN
clauses, and they all serve slightly different functions:
INNER JOIN
returns a list of rows for which there is a match in both tables specified. It's the default join type, so if you just typeJOIN
without specifying any other conditions, anINNER JOIN
will be used.LEFT JOIN
will return all results from the left table in your statement, matched against rows in the right table when possible. If a row in the left table does not contain a corresponding match in the right table, it will still be listed — withNULL
values in columns for the right table.RIGHT JOIN
will return all results from the right table in your statement, matched against rows in the left table when possible. If a row in the right table does not contain a corresponding match in the left table, it will still be listed — withNULL
values in columns for the left table.FULL JOIN
will return all results from both the left and the right tables in your statement. If there are instances in which rows from the left table do not match the right table or vice versa, all data will still be pulled in — but SQL will outputNULL
values in all columns that are not matched.CROSS JOIN
returns theCartesian product of two tables — in other words, each individual row of the left table matched with each individual row of the right table.
JOIN
clauses, check out our article on JOIN interview questions!3. Why is this query not returning the expected results?
We have 1000 total rows in the orders
table:
SELECT * FROM orders;
-- 1000 rows in set (0.05 sec)
And 23 of those orders are from the user with customer_id
= 45:
SELECT * FROM orders WHERE customer_id = 45;
-- 23 rows in set (0.10 sec)
Yet, when we SELECT
the number of orders that are not from customer_id
= 45, we only get 973 results:
SELECT * FROM orders WHERE customer_id <> 45;
-- 973 rows in set (0.11 sec)
973 + 23 = 996. But shouldn't the number of orders with customer_id
equal to 45 plus the number of orders with customer_id
not equal to 45 equal 1000? Why is this query not returning the expected results?
The answer: this data set most likely contains order
values with a NULL
customer_id
. When using the SELECT
clause with conditions, rows with the NULL
value will not match against either the = or the <> operator.
Our second query above could be modified as follows to produce the expected results:
SELECT * FROM orders WHERE (customer_id <> 45 OR customer_id IS NULL);
-- 977 rows in set (0.11 sec)
New: Sign up for a free SQL mini-course
Make sure you're prepared for the interview that could change your career. Sign up to receive five free must-learn lessons in preparation for your SQL interview. Start preparing today!
4. Why does one of these queries work while the other does not?
Consider the following query, which returns the expected results:
SELECT CASE WHEN (3 IN (1, 2, 3, NULL)) THEN 'Three is here!'
ELSE "Three isn't here!" END AS result;
/*
+----------------+
| result |
+----------------+
| Three is here! |
+----------------+
1 row in set (0.00 sec)
*/
The output "Three is here!"
is shown, because the value 3 is included in the IN
clause. But what about the following query?
SELECT CASE WHEN (3 NOT IN (1, 2, NULL)) THEN "Three isn't here!"
ELSE 'Three is here!' END AS result;
/*
+----------------+
| result |
+----------------+
| Three is here! |
+----------------+
1 row in set (0.00 sec)
*/
Three is not included in the second set — so why does our query mistakenly deliver the output, "Three is here!"
?
The answer, once again, has to do with the way MYSQL handles NULL
values. Let's take a closer look. In our first query, we ask whether the value 3
is included in the set (1, 2, 3, NULL)
. Our statement is functionally equivalent to the following:
SELECT CASE WHEN ((3 = 1) OR (3 = 2) OR (3 = 3) OR (3 = NULL))
THEN 'Three is here!'
ELSE "Three isn't here!" END AS result;
/*
+----------------+
| result |
+----------------+
| Three is here! |
+----------------+
1 row in set (0.00 sec)
*/
Since 3 is definitely equal to 3, one of our OR
conditions is met, and the statement outputs, "Three is here!"
. Our second statement, on the other hand, asks whether the value 3
is NOT
included in the set (1, 2, NULL)
. This statement is functionally equivalent to the following:
SELECT CASE WHEN ((3 <> 1) AND (3 <> 2) AND (3 <> NULL))
THEN "Three isn't here!"
ELSE "Three is here!" END AS result;
/*
+----------------+
| result |
+----------------+
| Three is here! |
+----------------+
1 row in set (0.00 sec)
*/
In this case, the conditional check 3 <> NULL
fails, because in ANSI-standard SQL, we need to use the IS NULL
statement rather than the <>
operator.
5. Can you construct a basic INNER JOIN
?
Consider our customers
and orders
tables, with the following respective schema:
CREATE TABLE `customers` (
`customer_id` int(11) NOT NULL AUTO_INCREMENT,
`first_name` varchar(255) NOT NULL,
`last_name` varchar(255) NOT NULL,
`email` varchar(255) NOT NULL,
`address` varchar(255) DEFAULT NULL,
`city` varchar(255) DEFAULT NULL,
`state` varchar(2) DEFAULT NULL,
`zip_code` varchar(5) DEFAULT NULL,
PRIMARY KEY (`customer_id`)
);
CREATE TABLE `orders` (
`order_id` int(11) NOT NULL AUTO_INCREMENT,
`customer_id` int(11) NOT NULL,
`order_placed_date` date NOT NULL,
PRIMARY KEY (`order_id`),
KEY `customer_id` (`customer_id`),
FOREIGN KEY (`customer_id`) REFERENCES `customers` (`customer_id`)
);
Can you construct a simple SELECT
statement that uses an INNER JOIN
to combine all information from both the customers
and orders
tables?
The answer here is really simple. Here's how we'd do it:
SELECT * FROM orders
INNER JOIN customers
ON orders.customer_id = customers.customer_id;
6. Working with the AS
statement
We've written a query based on the orders
table above to select all orders from the year 2016. But something is wrong with our query. Can you figure out what it is?
SELECT order_id, customer_id, YEAR(order_placed_date) AS order_year
FROM orders
WHERE order_year = 2016;
Here's the answer: order_year
is an YEAR(order_placed_date)
. It turns out that in SQL, aliases can only be referenced in GROUP BY
, ORDER BY
, and HAVING
clauses — they can't be used in WHERE
clauses. Running the above code will produce the following result:
--ERROR 1054 (42S22): Unknown column 'order_year' in 'where clause'
To fix this problem, we need to reiterate the definition of the order_year
alias in the WHERE
clause like so:
SELECT order_id, customer_id, YEAR(order_placed_date) AS order_year
FROM orders
WHERE YEAR(order_placed_date) = 2016;
--498 rows in set (0.00 sec)
7. Using the SUM
function
Consider the following database schema:
CREATE TABLE `products` (
`product_id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`price` decimal(19,4) NOT NULL,
PRIMARY KEY (`product_id`)
);
CREATE TABLE `order_products` (
`order_product_id` int(11) NOT NULL AUTO_INCREMENT,
`order_id` int(11) NOT NULL,
`product_id` int(11) NOT NULL,
PRIMARY KEY (`order_product_id`),
KEY `order_id` (`order_id`),
KEY `product_id` (`product_id`),
FOREIGN KEY (`order_id`) REFERENCES `orders` (`order_id`),
FOREIGN KEY (`product_id`) REFERENCES `products` (`product_id`)
)
Can you write a query that finds the total order price (e.g., the sum of product.price
from each order) for all order_id
s?
This question is a bit tough, as we'll have to use both the SUM
function and the GROUP BY
clause to aggregate orders by order_id
. Here's how we do it:
SELECT order_id, SUM(price) AS total_order_price
FROM order_products
INNER JOIN products
ON order_products.product_id = products.product_id
GROUP BY order_id;
--1000 rows in set (0.01 sec)
8. How are INSERT
, UPDATE
, and DELETE
used in SQL databases?
SELECT
queries are useful, but they only allow you to query data — in other words, they only let you ask questions about data that already exists. But what if you want to manipulate your database — by creating new data, updating existing data, or removing data from your tables entirely?
To accomplish these goals, you'll need to use the INSERT
, UPDATE
, and DELETE
functions — three capabilities that are just as important, or more important, than SELECT
. Here's some more information on what these statements do:
INSERT
does just what it sounds like — it inserts new rows into a database. This is the primary tool used to build the actual records within a database. Here's an example: if a new student joins a school, that school might update itsstudents
table byINSERT
ing a new record.UPDATE
makes changes to existing records. For example, if the database administrator accidentally mistyped our new student's name when adding her to the database usingINSERT
, the situation could be remedied with a quickUPDATE
of the student's 'name' field.DELETE
permanently removes records from the database. It should be used carefully, as over-ambitiousDELETE
commands can wipe out portions of a database you actually want to keep. If our new student decided to quit school and move on to another program, the database administrator might run aDELETE
command on that student's row — being careful toLIMIT
the input so that only one record max is deleted.
9. What comparison operators can be used in SQL?
Comparison operators are most commonly found within WHERE
clauses. They help SQL users refine queries so that they only include data that matches certain criteria — for example, orders that were placed in the year 2017
or customers whose first name is 'Jonathan'
. Having a strong understanding of comparison operators is critical if you want to write effective, robust SELECT
queries to analyze data.
Before interviewing, it's highly recommended that you study up on comparison operators and how they're used in SQL's WHERE
clause. Check out this list of some of the most common ones:
Operator | Meaning |
---|---|
= |
Equal to |
<> or != |
Not equal to |
> |
Greater than |
>= |
Greater than or equal to |
< |
Less than |
< |
Less than or equal to |
> |
Greater than |
IN() |
Matches a value within another list |
NOT |
The opposite of a given condition |
BETWEEN() |
Within a specified range (inclusive) |
IS NULL |
Checks to see whether a given value is NULL (generally, =NULL does not work for this) |
IS NOT NULL |
Checks to see whether a given value is not NULL (generally, =!NULL does not work for this) |
LIKE |
Checks to see whether a given value matches a simple pattern using the '%' and '_' symbols |
EXISTS |
Returns TRUE if query matches at least one row in set |
10. What have you used SQL for in past jobs?
This one is a softer question, but no less important than the others listed on this page! If you're walking into an SQL interview, you're bound to be asked about your prior experience with databases, and you should have a strong, compelling answer ready.
To prepare for this question, think about the following: Have you used SQL at your jobs in the past? With which databases did you interact? Were you primarily responsible for modifying databases using INSERT
and UPDATE
queries, or did you mainly query databases using SELECT
? How were the databases with which you interacted structured? Were there any complications? What did you learn through your work that you could apply to your new job?
If you haven't used SQL in the past, don't worry — be honest with your interviewer, and explain that although you don't have extensive live experience with SQL, you've done significant research on your own to teach yourself the ropes. Express your excitement about using SQL in your new job, and appreciation at being offered the opportunity to try your hand at database manipulation.
Congrats — you've learned the important stuff, and now have a solid foundation for success in your interview! But practice makes perfect — so scroll down for a free SQL practice challenge.
New: Sign up for a free SQL mini-course
Make sure you're prepared for the interview that could change your career. Sign up to receive five free must-learn lessons in preparation for your SQL interview. Start preparing today!
Free practice question
Want more SQL interview prep? Try your hand with this free practice question:
Cocoa Confections is a small bakery that sells brownies, cookies, pies, and other delicious treats to customers online. It keeps records of all of its online sales in an SQL database that is automatically populated as customers place orders on its site.
In its database, Cocoa Confections has a customers
table to keep track of customer contact information, and an orders
table to keep track of various orders that those customers have placed. The schema of these tables is as follows:
CREATE TABLE customers (
customer_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
first_name VARCHAR(255) NOT NULL,
last_name VARCHAR(255) NOT NULL,
email VARCHAR(255) NOT NULL,
address VARCHAR(255) DEFAULT NULL,
city VARCHAR(255) DEFAULT NULL,
state VARCHAR(2) DEFAULT NULL,
zip_code VARCHAR(5) DEFAULT NULL,
);
CREATE TABLE orders (
order_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
customer_id INT NOT NULL,
order_placed_date DATE NOT NULL,
FOREIGN KEY (customer_id) REFERENCES customers (customer_id)
);
It's the end of 2016, and the owner of Cocoa Confections wants to write an SQL query that finds the COUNT
of orders placed in 2016 by customer e-mail address. She wants to ORDER
the results by the COUNT
of orders placed in 2016, descending, so that she can personally send thank-you e-mails to Cocoa Confection's top customers by order volume.
Can you write a query that will help the owner of Cocoa Confections find the COUNT
of all orders
placed in 2016, by customer e-mail address, sorted descending?
The owner of Cocoa Confections wants to find the COUNT
of orders placed by e-mail address. But the email
field is located within the customers
table, not the orders
table. What type of statement do we need to use to combine the two tables together?
INNER JOIN
To combine our two tables together, we'll use a JOIN
. Since the orders
table has a customer_id
field, we can match that against the customer_id
in the customers
table to link up the two separate data sources. We'll use an INNER JOIN
so that data is only included if the customer_id
listed matches both of our input tables. Try the following to get you started — it'll compile a list of all orders in our database with all requisite customer information appended:
SELECT
*
FROM
orders
INNER JOIN
customers ON orders.customer_id = customers.customer_id;
COUNT(*)
Now that we have a complete list of orders with e-mail addresses included, we can use the COUNT(*)
and GROUP BY
functions to get a list of orders by e-mail address. This will return a COUNT
of all orders in our database by e-mail address. We'll also restrict our SELECT
clause so that it only pulls the e-mail address and the COUNT
of orders from our database, excluding all other fields.
SELECT
email, COUNT(*)
FROM
orders
INNER JOIN
customers on orders.customer_id = customers.customer_id
GROUP BY
email;
/*
Results:
+------------------------------+----------+
| email | COUNT(*) |
+------------------------------+----------+
| adawson@yahoo.com | 6 |
| aferguson@gmail.com | 16 |
| amy.cummings@yahoo.com | 13 |
...100 rows in set (0.01 sec)
*/
WHERE
and ORDER BY
We're not done yet! We still need to restrict the dates of our pull to 2016 only. How do we filter for particular results in SQL? Using a WHERE
clause! We'll also need to use an ORDER BY
clause to ensure that our results are sorted descending by number of orders. These two clauses, combined, should look something like this:
WHERE
order_placed_date BETWEEN CAST('2016-01-01' AS DATE) AND CAST('2016-12-31' AS DATE)
ORDER BY
2016_num_orders DESC;
Note that we need to use the CAST
function to ensure that our numerical inputs are properly interpreted as dates by MySQL.
Your solution will need to combine the customers
table with the orders
table. Does it do so?
INNER JOIN
?To combine the customers
and orders
tables together, an INNER JOIN
query will be most effective. Does the solution you've created use an INNER JOIN
statement?
Don't forget that in order to find the total number of orders placed by e-mail address, you'll need to use SQL's COUNT(*)
or SUM
functions, which also necessitate a GROUP BY
clause in your query. Does your solution contain those features?
Don't forget that we're only pulling orders from the year 2016
, so we'll need to restrict our pull to that year only. We'll also need to arrange our results in descending order by total COUNT
. Have you got those features in your query?
Great! Sounds like you're on the right track, so let's move on to the solution.
Here's our full solution to the problem:
SELECT
customers.email, COUNT(*) AS 2016_num_orders
FROM
orders
INNER JOIN
customers on orders.customer_id = customers.customer_id
WHERE
orders.order_placed_date BETWEEN CAST('2016-01-01' AS DATE) AND CAST('2016-12-31' AS DATE)
GROUP BY
customers.email
ORDER BY
2016_num_orders DESC;
/*
Results:
+------------------------------+-----------------+
| email | 2016_num_orders |
+------------------------------+-----------------+
| doreen.glover@yahoo.com | 14 |
| rpayne@hotmail.com | 10 |
| fobrien@gmail.com | 10 |
...99 rows in set (0.01 sec)
*/
Check out what's happening above:
First, we use a SELECT
statement to pull specific fields from our database. We'll pull the email
field, which is located on the customers
table, and the aggregated COUNT(*)
of records, using the AS
clause to refer to this column as 2016_num_orders
.
We'll use a FROM orders
clause to note that we're pulling this data from our orders
table. But since the email
field isn't located on the orders
table, we'll also need to perform an INNER JOIN
linking the orders.customer_id
field to the customers.customer_id
field.
We'll use a WHERE
clause to restrict our data range to 2016 only using two criteria: >= '2016-01-01'
and <= '2016-12-31'
.
Since we're using a COUNT(*)
function to sum records, we'll need to use a GROUP BY
clause to tell SQL that we want to group our results by email
address.
Finally, we'll tell SQL that we want our results arranged by number of orders, from greatest to least, using ORDER BY
.
Our query is complete!