SQL Book

Using SQL Cross Join - the report writers secret weapon

The SQL CROSS JOIN is an often overlooked part of SQL but it can prove invaluable if you are aware of which situations it can be useful in.

A Cross Join (also sometimes known as a Cartesian Join) results in every row of one table being joined to every row of another table. The CROSS JOIN does not establish a relationship between the two tables like an INNER JOIN or OUTER JOIN would by specifying which columns to join on.

The example below creates two basic tables and performs a cross join on them:

CREATE TABLE Customer (
       CustomerID int,
       CustomerName varchar(50)
)
CREATE TABLE Product (
       ProductID int,
       ProductName varchar(50)
)
INSERT INTO Customer VALUES (1, 'Joe Bloggs')
INSERT INTO Customer VALUES (2, 'Bob Marley')
INSERT INTO Customer VALUES (3, 'Jimi Hendrix')

INSERT INTO Product VALUES (1, 'MP3 Player')
INSERT INTO Product VALUES (2, 'LCD TV')

SELECT * FROM Customer CROSS JOIN Product

=======
Results:
=======
CustomerID  CustomerName   ProductID   ProductName
----------- --------------- ------------ ----------------------
1           Joe Bloggs        1           MP3 Player
1           Joe Bloggs        2           LCD TV
2           Bob Marley        1           MP3 Player
2           Bob Marley        2           LCD TV
3           Jimi Hendrix      1           MP3 Player
3           Jimi Hendrix      2           LCD TV


You can see from the above example that a cross join gets a record from the first table and then creates a new row for every row in the 2nd table. It then does the same for the next record in the first table etc.

When there is no WHERE filters applied to the CROSS JOIN the number of records returned will be the number of records in Table1 x the number of records in Table2. i.e. in the above example this is 3 x 2 = 6. With even moderate sized tables of a few thousand records you can see this can quickly produce very large resultsets that take large amounts of memory and can greatly reduce the performance of the application. e.g. 2000 customers CROSS JOIN 2000 products with no WHERE clause = 4 million records.

Using the WHERE Clause with the CROSS JOIN

We can use the WHERE clause with the CROSS JOIN to filter the records returned:

SELECT * FROM Customer CROSS JOIN Product
WHERE Customer.CustomerID = 2

=======
Results:
=======
CustomerID  CustomerName   ProductID   ProductName
----------- --------------- ------------ ----------------------
2           Bob Marley        1           MP3 Player
2           Bob Marley        2           LCD TV

Many text books cite that CROSS JOINS should be avoided due to the very large recordsets they create and that the only reason for using CROSS JOIN is to create some dummy data that can be used for testing your queries with. There are however, much more useful and powerful reasons for using CROSS JOIN.

Using CROSS JOIN as part of a report query

For an example of how we can utilise the CROSS JOIN when writing a query to generate a report let us use the example that we want to generate a report of how many items were sold for each customer and each product. A common attempt to try this would be:

-- assumes a 'Sales' table with the following columns:
-- CustomerID, ProductID, Quantity

SELECT C.CustomerName, P.ProductName, SUM(S.Quantity) As NumberSold
FROM Customer C INNER JOIN Sales S
       ON C.CustomerID = S.CustomerID INNER JOIN Product P
       ON P.ProductID = S.ProductID
GROUP BY CustomerName, ProductName
ORDER BY CustomerName, ProductName

Remember we want evidence of each customer and product in our resultset even if there have been no sales for that customer / product combination. The problem with the above query is that if a customer or product has no sales associated with it there would be no record of the customer or product in the resultset. i.e. If NumberSold is 0 then there is no evidence of the customer or product existing.

We can solve this problem by using a CROSS JOIN in combination with another sub query. If we use a CROSS JOIN query as the basis of our report we can guarantee that we will get every Customer / Product combination in our resultset:

SELECT C.CustomerName, P.ProductName
FROM Customer C CROSS JOIN Product P

We now need to add the NumberSold column onto our resultset and we can do this by using a derived table and joining it to our original query by using a LEFT OUTER JOIN:

SELECT C.CustomerName, P.ProductName, S.NumberSold
FROM Customer C CROSS JOIN Product P
       LEFT OUTER JOIN
       (SELECT CustomerID, ProductID, SUM(Quantity) As NumberSold
       FROM Sales
       GROUP BY CustomerID, ProductID) S
       ON C.CustomerID = S.CustomerID AND
       P.ProductID = S.ProductID
ORDER BY C.CustomerName, P.ProductName

========
Results:
========

CustomerName   ProductName      NumberSold
------------- --------------- -----------
Bob Marley     LCD TV               1
Bob Marley     MP3 Player         NULL
Jimi Hendrix   LCD TV              5
Jimi Hendrix   MP3 Player        NULL
Joe Bloggs     LCD TV             NULL
Joe Bloggs     MP3 Player         3

Great! This query does what we wanted but if the NumberSold was 0 then the LEFT JOIN gives us NULL in the NumberSold column. We can convert NULL to 0 in our application or in SQL Server we can use the ISNULL() function to convert NULL to 0. This is demonstrated below where we amend the first line of our SELECT statement.

SELECT C.CustomerName, P.ProductName, IsNull(S.NumberSold, 0) As NumberSold

This join pattern of: Table1 CROSS JOIN Table2 LEFT OUTER JOIN DerivedTable3 is the secret weapon of the report writer and can be used to provide an efficient, elegant solution to the 'missing rows' problem.

Adding row filters to our report

WHERE clauses can be applied in both the outer query (the query with the CROSS JOIN) and the inner query (the aggregating part) to filter the data.

For instance to filter the data by Customer or Product we would add a WHERE clause to the outer query:

WHERE C.CustomerName LIKE 'J%'

To filter data by Sales characteristics such as DateOfSale or SalesValue we would add a WHERE clause to the inner query:

-- assumes Sales table has a column called SalesValue
SELECT C.CustomerName, P.ProductName, IsNull(S.NumberSold, 0) As NumberSold
FROM Customer C CROSS JOIN Product P
       LEFT OUTER JOIN
       (SELECT CustomerID, ProductID, SUM(Quantity) As NumberSold
       FROM Sales
       WHERE SalesValue > 100.00
       GROUP BY CustomerID, ProductID) S
       ON C.CustomerID = S.CustomerID AND
       P.ProductID = S.ProductID
WHERE C.CustomerName LIKE 'J%'
ORDER BY C.CustomerName, P.ProductName

Conclusion

In this article we have seen how the SQL CROSS JOIN works by combining each row in one table with each row in another table. This can produce very large resultsets that can bring a database to it knees. To keep the size of the resultset down it is advisable to:

  • Specify which fields to select from each table in the cross join rather than using SELECT *
  • Use a WHERE clause to reduce the number of rows returned

The CROSS JOIN is a powerful tool when used for creating reports that require each row from one table to be displayed for one or more rows of another table. The CROSS JOIN solution is:

Table1 CROSS JOIN Table2 LEFT JOIN DerivedTable3 (where the DerivedTable3 is an aggregating sub query.)

A WHERE Clause can be added to both the outer query and the inner query to filter the data according to the needs of the report.

We hope you find this useful in helping you develop your reports.

Join our newsletter

Subscribe to our free weekly newsletter and get exclusive SQL tips and articles straight to your inbox. 

 Your email address is only used to send you our newsletter.