The following report was created to test the hypothesis that some film genres will outperform others in the video rental industry. It identifies the average stock turnover of videos, and groups the results by genre to identify genres that have above (and below) average turnover when compared with other genres.
This attached sample code demonstrates the use of aggregate functions and WHERE clauses. The most common issue with SQL code is incorrect use of WHERE clauses and aggregate functions. This can lead to slow query performance, high data charges and in some cases incomplete or otherwise inaccurate query results.
Data source: Sakila test database for MySQL.
-- This query calculates a number of statistics for each genre (or 'category.name' as they are called in the database schema.)
-- The query is grouped by category.name to create one row per category, and uses the inbuilt COUNT and DISTINCT functions to
-- calculate metrics for the number of distinct titles and the number of sales for that genre.
select category.name, COUNT(DISTINCT(inventory.inventory_id)) as 'no_of_titles', COUNT(rental.rental_id) as 'no_of_sales', COUNT(rental.rental_id)/COUNT(DISTINCT(inventory.inventory_id)) as 'avg_sale_per_title_by_genre'
FROM inventory LEFT OUTER JOIN film_category on inventory.film_id = film_category.film_id
LEFT OUTER JOIN category on film_category.category_id = category.category_id
LEFT OUTER JOIN rental on inventory.inventory_id = rental.inventory_id
WHERE YEAR(rental.rental_date)=2005
GROUP BY category.name;
-- LEFT OUTER joins are used to ensure that empty categories, and titles with no sales are still represented in the query
-- results and are correctly included in the average turnover statistics.
--
-- Note: some database engines allow the simpler expression 'no_of_titles' / 'no_of_sales' as 'avg_sale_per_title_by_genre' in
-- place of line 2, however this is not strict SQL and will cause an error in some popular engines including MySQL.