Master SQL Joins (Chapter 5)
Welcome to my blog where we explore the magic of SQL Joins and Set Operations. Uncover the art of combining, refining, and comparing data from different sources. Join me on this journey as we simplify the complexities and empower you to wield these tools for impactful data exploration.
SQL Joins
In SQL, a JOIN operation is used to combine rows from two or more tables based on a related column between them. It allows us to retrieve data from multiple tables in a single query, which is especially useful when we need to work with related data stored in separate tables. It’s like putting puzzle pieces together to see the whole picture. We use columns that have matching values in both tables to connect them, and different types of JOINs give us different ways to combine the information.
Now here a question rise, why do we need multiple tables for related data?
Using multiple tables for related data in a database helps organize information, reduce redundancy, ensure data accuracy, and make it easier to manage and retrieve specific details. This approach promotes efficiency, scalability, and better control over data access and changes.
Think of a database as a big organized storage system for information. Imagine we are storing data about a school. We could put all the details about students, classes, teachers, and subjects in a single huge list, but that would get messy and confusing quickly.
Instead, we use separate folders (tables) for different types of information. One folder holds student names and details, another has class information, and so on. These folders are connected through common pieces of information, like student IDs.
Why? Because of this separation:
- Makes Updates Easier: If a student changes their address, we only have to update it in one place, and all their records are automatically updated.
- Reduces Mistakes: Less chance of entering the same information in different ways by mistake.
- Helps Find Things: Need to know all the students in a particular class? we just look in the class folder, rather than searching through all the data.
- Keeps Things Fast: Searching and sorting are quicker in smaller, organized folders.
- Saves Space: we’re not repeating the same information for each related thing (like student names for each class they’re in).
There are different types of JOIN operations:
Cross Join
A CROSS JOIN in SQL is like creating all possible combinations between two sets of data, resulting in a Cartesian product. It combines every row from the first table with every row from the second table, creating a new table with all possible pairs of rows.
#CROSS JOIN
select t1.name, t2.course_id from table1 t1
cross join table2 t2
#it will return all the combination of name and course_id
Inner Join
This type of join returns only the rows that have matching values in both tables. Rows with non-matching values are excluded from the result. An INNER JOIN in SQL is like combining two puzzle pieces that fit perfectly together. It’s a way to bring data from two tables together, showing only the matching rows between them.
#Inner Join
select t1.name, t2.order_id from table1 t1
inner join table2 t2
on t1.order_id = t2.order_id
Left Join
A LEFT JOIN in SQL is like combining two sets of data while keeping all the rows from the left table and only the matching rows from the right table. If there’s no match in the right table, the result will still include the rows from the left table, but with NULL values in the columns from the right table.
#Left Join
select t1.name, t2.price from table1 t1
left join table2 t2
on t1.order_id = t2.order_id
Right Join
A RIGHT JOIN in SQL is similar to a LEFT JOIN, but with the roles of the tables reversed. It keeps all the rows from the right table and only the matching rows from the left table. If there’s no match in the left table, the result will still include the rows from the right table, but with NULL values in the columns from the left table.
#Right Join
select t1.name, t2.price from table1 t1
right join table2 t2
on t1.order_id = t2.order_id
Full Outer Join
A FULL OUTER JOIN is a type of join that combines the results of both a LEFT JOIN and a RIGHT JOIN. It returns all rows from both tables, matching rows where they exist, and filling in NULL values where there are no matches. However, it’s important to note that MySQL does not have a built-in FULL OUTER JOIN operator like some other database systems. Instead, you can simulate a FULL OUTER JOIN using a combination of LEFT JOIN and UNION.
#Left Join
select t1.name, t2.price from table1 t1
left join table2 t2
on t1.order_id = t2.order_id
union
#Right Join
select t1.name, t2.price from table1 t1
right join table2 t2
on t1.order_id = t2.order_id
Self Join
A self join in SQL is a type of join where a table is joined with itself. It’s useful when we have a table with hierarchical or related data, and we want to compare or combine rows from the same table based on certain conditions.
To perform a self join, we use table aliases to create two different “views” of the same table within the same query. This allows you to treat the table as if you had two separate instances of it, which you can then use to establish relationships between rows.
#Self Join
select t1.name, t2.name as 'secret_santa 'from table1 t1
join table1 t2
on t1.assigned_color = t2.color
SET OPERATION
Set operations in SQL refer to combining the results of multiple SELECT queries into a single result set based on certain conditions or criteria.
While set operations are not typically referred to as “joins,” they allow us to combine and manipulate data in a way that is similar to how joins work. Set operations include UNION, INTERSECT, and EXCEPT (or MINUS, in some databases).
Here’s an explanation of each set operation:
- UNION: The UNION operation combines the results of two or more SELECT queries into a single result set, removing duplicates. Each SELECT query must have the same number of columns, and the columns’ data types must be compatible.
SELECT column1 FROM table1
UNION
SELECT column2 FROM table2;
- INTERSECT: The INTERSECT operation returns the common rows that appear in the results of two or more SELECT queries. Like UNION, the number of columns and their data types must match.
SELECT column1 FROM table1
INTERSECT
SELECT column1 FROM table2;
- EXCEPT (MINUS): The EXCEPT operation returns the rows that appear in the result of the first SELECT query but not in the result of the second SELECT query. It’s similar to finding the difference between two sets.
SELECT column1 FROM table1
EXCEPT
SELECT column1 FROM table2;
As we reach the conclusion of our blog, I want to express my heartfelt appreciation to you, our dedicated reader. I hope this guide has equipped you with practical insights and tools to elevate your data manipulation prowess. Remember, the journey of learning is a continuous one, and I’m excited to accompany you as you explore new horizons in the realm of data analysis. Thank you for joining me on this enlightening adventure, and stay tuned for more insights in the future. ❤️