Let's start by creating sample data, and then demonstrate performing inner joins and left joins using both SQL and Python Pandas.
Creating Sample Data in SQL:
We will create two tables, "customers" and "orders", with a common column "customer_id" that we will use for joining.
-- Create customers table
CREATE TABLE customers (
customer_id INT PRIMARY KEY,
customer_name VARCHAR(100),
city VARCHAR(100)
);
-- Insert data into customers table
INSERT INTO customers (customer_id, customer_name, city)
VALUES (1, 'John Doe', 'New York'),
(2, 'Jane Smith', 'Los Angeles'),
(3, 'Mike Johnson', 'Chicago');
-- Create orders table
CREATE TABLE orders (
order_id INT PRIMARY KEY,
customer_id INT,
order_date DATE,
amount DECIMAL(10, 2)
);
-- Insert data into orders table
INSERT INTO orders (order_id, customer_id, order_date, amount)
VALUES (1, 1, '2023-01-01', 100.50),
(2, 1, '2023-02-01', 200.75),
(3, 2, '2023-01-15', 150.00),
(4, 3, '2023-02-10', 75.25);
Now let's move on to performing inner joins and left joins using SQL.
Performing Inner Join in SQL:
To perform an inner join between the "customers" and "orders" tables based on the "customer_id" column:
SELECT c.customer_name, o.order_id, o.order_date, o.amount
FROM customers c
INNER JOIN orders o ON c.customer_id = o.customer_id;
This will give you the result of the inner join, which combines matching rows from both tables based on the "customer_id" column.
Performing Left Join in SQL:
To perform a left join between the "customers" and "orders" tables based on the "customer_id" column:
SELECT c.customer_name, o.order_id, o.order_date, o.amount
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id;
This will give you the result of the left join, which combines all rows from the left table ("customers") and matching rows from the right table ("orders") based on the "customer_id" column.
Now, let's see how we can perform inner joins and left joins using Python Pandas.
Performing Inner Join in Python Pandas:
To perform an inner join using Python Pandas, we can use the merge()
function:
import pandas as pd
# Create customers DataFrame
customers = pd.DataFrame({
'customer_id': [1, 2, 3],
'customer_name': ['John Doe', 'Jane Smith', 'Mike Johnson'],
'city': ['New York', 'Los Angeles', 'Chicago']
})
# Create orders DataFrame
orders = pd.DataFrame({
'order_id': [1, 2, 3, 4],
'customer_id': [1, 1, 2, 3],
'order_date': ['2023-01-01', '2023-02-01', '2023-01-15', '2023-02-10'],
'amount': [100.50, 200.75, 150.00, 75.25]
})
# Perform inner join
result_inner = pd.merge(customers, orders, on='customer_id')
print(result_inner)
This will give you the result of the inner join, similar to what we obtained in SQL.
Performing Left Join in Python Pandas:
To perform a left join using Python Pandas, we can use the merge()
function with the how='left'
parameter:
import pandas as pd
# Create customers DataFrame
customers = pd.DataFrame({
'customer_id': [1, 2, 3],
'customer_name': ['John Doe', 'Jane Smith', 'Mike Johnson'],
'city': ['New York', 'Los Angeles', 'Chicago']
})
# Create orders DataFrame
orders = pd.DataFrame({
'order_id': [1, 2, 3, 4],
'customer_id': [1, 1, 2, 3],
'order_date': ['2023-01-01', '2023-02-01', '2023-01-15', '2023-02-10'],
'amount': [100.50, 200.75, 150.00, 75.25]
})
# Perform left join
result_left = pd.merge(customers, orders, on='customer_id', how='left')
print(result_left)
This willgive you the result of the left join, where all rows from the left DataFrame ("customers") are included, and matching rows from the right DataFrame ("orders") are joined based on the "customer_id" column.
That's how you can perform inner joins and left joins using SQL and Python Pandas. Let me know if there's anything else I can help you with!