I’ll be skipping past the standard SQL techniques, assuming we’re all familiar with the basics because in today’s data-driven world, mastering advanced SQL isn’t just useful for day-to-day analysis, it’s crucial for standing out in live-coding interviews, where you’re expected to navigate and manipulate complex datasets quickly and confidently.

1️⃣ GROUP BY vs. DISTINCT vs. Window Functions

FeatureGROUP BYDISTINCTWindow Functions
Use CaseAggregatesRemoves duplicatesRanking, cumulative sums
PerformanceMediumFastCan be slow with large data

Example: Count unique customers per region.

SELECT region, COUNT(DISTINCT customer_id) 
FROM customers 
GROUP BY region;

2️⃣ Joins (Inner, Left, Right, Full, Self Joins, Cross Joins)

Use Case: Combine data from multiple tables efficiently.

Example: Find customer orders and their details.

SELECT customers.name, orders.order_date, orders.total_amount 
FROM customers 
LEFT JOIN orders ON customers.customer_id = orders.customer_id;

3️⃣ CTEs (Common Table Expressions) & Subqueries

✅ CTEs (WITH cte AS (…)) → Make queries more readable
✅ Subqueries → Nested queries within SELECT, FROM, or WHERE

Example: Compute top customers by revenue using a CTE.

WITH customer_revenue AS (
  SELECT customer_id, SUM(total_amount) AS total_revenue 
  FROM orders 
  GROUP BY customer_id
)
SELECT * FROM customer_revenue 
ORDER BY total_revenue DESC 
LIMIT 10;

Example: Find customers who spent more than the average order amount using a subquery.

SELECT customer_id, total_amount 
FROM orders 
WHERE total_amount > (SELECT AVG(total_amount) FROM orders);

4️⃣ EXISTS vs. IN vs. JOIN (Performance Considerations)

Key Functions:

  • EXISTS → Stops at first match (efficient for large datasets)
  • IN → Works well for small lists but slower for large datasets
  • JOIN → Preferred when retrieving related columns

Example: Find customers who have placed at least one order.

SELECT name 
FROM customers 
WHERE EXISTS (
  SELECT 1 
  FROM orders 
  WHERE orders.customer_id = customers.customer_id
);

5️⃣ Running Totals & Cumulative Sums

Key Function: SUM() OVER (ORDER BY column)
Use Case: Running totals of revenue, sales, or cumulative counts.

Example: Compute cumulative revenue over time.

SELECT order_date, total_amount, 
       SUM(total_amount) OVER (ORDER BY order_date) AS running_total 
FROM orders;

6️⃣ Moving Averages (Rolling Window Aggregates)

Key Function: AVG() OVER (ORDER BY column ROWS BETWEEN N PRECEDING AND CURRENT ROW)

Example: Compute 7-day rolling average of revenue.

SELECT order_date, total_amount, 
       AVG(total_amount) OVER (ORDER BY order_date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS moving_avg 
FROM orders;

7️⃣ Date & Time Manipulation

Key Functions:

  • DATE_TRUNC() → Truncate date to year, month, day, etc.
  • EXTRACT(YEAR FROM date) → Get year, month, or day from a date
  • DATEADD(interval, value, date) → Add/subtract days/months
  • DATEDIFF(end_date, start_date) → Difference between dates

Example: Find total sales per month.

SELECT DATE_TRUNC('month', order_date) AS month, SUM(total_amount) 
FROM orders 
GROUP BY month;

Example: Find orders placed in the last 7 days.

SELECT * FROM orders 
WHERE order_date >= CURRENT_DATE - INTERVAL '7 days';

8️⃣ Handling NULLs

Key Functions:

  • COALESCE(column, default_value) → Replace NULL with a default value
  • NULLIF(value1, value2) → Returns NULL if both values are equal
  • IS NULL / IS NOT NULL → Filter NULL values

Example: Replace missing values with 0.

SELECT customer_id, COALESCE(total_amount, 0) AS total_amount 
FROM orders;

Example: Handle division by zero using NULLIF.

SELECT revenue / NULLIF(orders, 0) 
FROM sales;

9️⃣ Recursive CTEs (Hierarchical Data)

Key Function: Recursive WITH
Use Case: Hierarchical relationships (e.g., org charts, folder structures).

Example: Find all employees reporting to a specific manager.

WITH RECURSIVE employee_hierarchy AS (
  SELECT employee_id, manager_id, 1 AS depth 
  FROM employees 
  WHERE manager_id IS NULL -- Start with the CEO
  
  UNION ALL
  
  SELECT e.employee_id, e.manager_id, h.depth + 1 
  FROM employees e 
  JOIN employee_hierarchy h ON e.manager_id = h.employee_id
)
SELECT * FROM employee_hierarchy;

🔟 Pivoting & Unpivoting Data (CASE WHEN & PIVOT)

Key Functions:

  • CASE WHEN for manual pivoting
  • PIVOT() (if supported)
  • UNPIVOT() for reversing pivot tables

Example: Pivot sales data by region.

🔹 Before (Raw Table)

regiontotal_amount
North5000
South3000
East7000
West4000
North2000
South1500
SELECT 
  SUM(CASE WHEN region = 'North' THEN total_amount END) AS north_sales,
  SUM(CASE WHEN region = 'South' THEN total_amount END) AS south_sales,
  SUM(CASE WHEN region = 'East' THEN total_amount END) AS east_sales,
  SUM(CASE WHEN region = 'West' THEN total_amount END) AS west_sales
FROM orders;

🔹 After (Simulated Pivoted Table with CASE WHEN)

north_salessouth_saleseast_saleswest_sales
7000450070004000

Example: Pivot data so that each region is a row and each product becomes a column

🔹 Before (Raw Table)

RegionProductTotalSales
EastLaptop2000
WestLaptop1500
EastPhone1200
WestPhone1300
SELECT 
    Region,
    ISNULL([Laptop], 0) AS Laptop,
    ISNULL([Phone], 0) AS Phone
FROM (
    SELECT Region, Product, TotalSales
    FROM Sales
) AS SourceTable
PIVOT (
    SUM(TotalSales)
    FOR Product IN ([Laptop], [Phone])
) AS PivotTable;

🔹 After (Pivoted Table)

RegionLaptopPhone
East20001200
West15001300

1️⃣1️⃣ Performance Optimization (Indexing, Query Execution Plans)

Key Concepts:

  • Use indexes (CREATE INDEX) to speed up queries.
  • Analyze query performance with EXPLAIN ANALYZE.
  • Optimize JOIN orders (use smaller tables first).
  • Use LIMIT or TOP to improve efficiency in large queries.

Example: Add an index to speed up searches.

CREATE INDEX idx_orders_customer ON orders (customer_id);

1️⃣2️⃣ Web Scraping & Large-Scale Data Aggregation

Key Techniques:

  • Extracting data from APIs (JSON functions in SQL)
  • Combining web-scraped datasets using UNION
  • Cleaning and normalizing extracted data

Example: Aggregating data from multiple sources into a single table.

SELECT name, address 
FROM web_data_source1 
UNION 
SELECT name, address 
FROM web_data_source2;

🛠 SQL Interview Practice Checklist

  • ✅ Aggregation Queries (SUM, COUNT, AVG, MIN, MAX, GROUP BY)
  • ✅ Window Functions (Running Sums, Moving Averages, Ranking, LAG/LEAD)
  • ✅ Joins (INNER, OUTER, SELF, CROSS Joins)
  • ✅ Date/Time Manipulation (DATE_TRUNC, EXTRACT, DATEDIFF)
  • ✅ Handling NULLs (COALESCE, NULLIF, IS NULL)
  • ✅ Recursive Queries (Recursive CTEs, Hierarchical Data Processing)
  • ✅ Pivoting Data (CASE WHEN, PIVOT(), UNPIVOT())
  • ✅ Performance Tuning (INDEXES, Query Execution Plans, EXISTS vs. IN)
  • ✅ Web Scraping & Large-Scale Data Aggregation (JSON, API Data, UNION, ETL Processes)