# Lab 6

Objectives:

- Learn how to group records
- Apply aggregate functions

## Aggregate functions in SQL

Aggregate functions are used within queries to return a single result row based on processing a group of records. They can be used within the SELECT list and in ORDER BY and HAVING clauses.

CAUTION: do not use any aggregate functions in the WHERE clause !!!

Aggregate functions can be used in conjunction with the keywords

- DISTINCT – duplicates in the group are eliminated and only afterwards the function is applied
- ALL – the default, makes the aggregate function to consider all the values in the group

COUNT ({ *|[DISTINCT|ALL] expr}) |
Returns the number of rows. If the argument is *, the function will return he size of the group by considering also the null values, otherwise only non-null values are counted. The argument may be CHAR, VARCHAR2, NUMBER, DATE. | SELECT COUNT(*) AS "Total number of students" FROM students;
SELECT COUNT(NVL(scholarship,0)) AS "Total number of students" FROM students; SELECT COUNT(scholarship) AS "# students earning scholarships" FROM students; SELECT COUNT(DISTINCT scholarship) AS "# scholarship types" FROM students; |

AVG ([DISTINCT|ALL]n) |
Returns the average. The argument must be numeric. | SELECT AVG(value) FROM grades; |

MAX ([DISTINCT|ALL]expr) |
Returns the maximum. The argument may be of type CHAR, VARCHAR2, NUMBER, DATE | SELECT MAX(scholarship) FROM students; |

MIN ([DISTINCT|ALL]expr) |
Returns the minimum. The argument may be of type CHAR, VARCHAR2, NUMBER, DATE | SELECT MIN(scholarship) FROM students; |

STDDEV ([DISTINCT|ALL]x) |
Returns the standard deviation. The argument must be numeric. | SELECT STDDEV(scholarship) AS "Standard deviation" FROM students; |

VARIANCE ([DISTINCT|ALL]x) |
Returns the variance. The argument must be numeric. | SELECT VARIANCE(scholarship) AS "Variance" FROM students; |

SUM([DISTINCT|ALL]n) |
Returns the sum. The argument must be numeric. | SELECT SUM(scholarship) FROM students; |

All aggregate functions ignore NULLs. An exception is COUNT with argument * which will treat NULL as a value. If we want NULLs to be considered by aggregate functions, we can make use of the NVL function to substitute nulls by a certain value (ex. 0).

If the group/data does not contain any rows or all rows have NULLs, the aggregate function will return NULL. An exception is COUNT which will return 0 or other number.

## Grouping data

All the examples above apply the aggregate function to a single group containing all the rows in the table, the result consisting of only one row.

However, the aggregate functions are usually used in conjunction with the GROUP BY clause which partitions the rows of the table into several groups; the DBMS will apply the aggregate function to each group, returning one result per group.

Let's revisit the syntax of the Oracle query:

SELECT [DISTINCT | ALL] {* | [column_expression [AS new_name]] [,...] } FROM table_name [alias] [, ...] [WHERE conditiion] [GROUP BY expression1 [HAVING condition] ] [ORDER BY expression2 [ASC|DESC][,…]]

Example:

SELECT MAX(value) FROM grades GROUP BY stud_id; --How many lines are returned? What is the meaning of each line?

Individual field values cannot be accessed in the SELECT list unless the GROUP BY clause contains the field. This means that all the fields/columns in the SELECT list which are not given as arguments to aggregate functions must also appear in the GROUP BY clause.

Example:

SELECT MAX(value), course_id FROM grades GROUP BY stud_id; --Doesn't work?

Groups may be crated based on several columns: Example:

SELECT year, count(*) FROM courses GROUP BY year; --How many groups? SELECT credits, count(*) FROM courses GROUP BY credits; --How many groups? SELECT year, credits, count(*) FROM courses GROUP BY year, credits ORDER BY year, credits; --How many groups?

It is not mandatory that the attributes used within the GROUP BY clause appear in the select list; but it is recommended in order to be able to interpret the results.

## Filtering groups

The WHERE clause eliminates individual rows before grouping takes place.

The HAVING clause is used only in conjunction with the GROUP BY clause in order to eliminate groups based on the results returned by the aggregate functions or based on column values that characterize the entire group and not based on individual row values.

Example:

SELECT course_id, COUNT(value) FROM grades GROUP BY course_id; --How many groups? SELECT course_id, COUNT(value) FROM grades WHERE value>8 GROUP BY course_id; --How many groups? Why? SELECT course_id, COUNT(value) FROM grades GROUP BY course_id HAVING COUNT(value)>8; --How many groups? Why?

## Exercises

- Show the number of students in each year of study.
- Show the number of students in each group of each year of study. Order ascending based on the year and then on the group.
- Show the number of students in each group of each year of study and indicate how many of them earn scholarships.
- Show the amount of money the faculty spends for all the scholarships.
- Compute the average amount of money per student allocated by the faculty for scholarships. (consider that the students who do not receive scholarships actually receive an amount equal to 0)
- Compute the distribution of the grade values (how many 10s, how many 9s,etc.). Order descending based on the grade value.
- Show the number of grades for each day of the week. Order descending based on the number of grades.
- Show the number of grades for each day of the week. Order ascending based on the week day.
- For each student who received at least one grade, show the last_name and the average of his/her grades. Order descending on the average.
- Modify the previous query to show also the students who did not receive any grade. Their average will be null.
- Modify the previous query to show for the students who did not receive any grade 0 for the average.
- Modify the previous query to show only the students with the average greater than 8.
- Show the last_name, the smallest grade, the highest grade and the average grade only for the students that have all their grades greater than or equal to 7.
- Show the last_names and the average of their grades for the students receiving at least 4 grades.
- For the students in group A2 year 3, show their names and their average grade.
- Show the highest average grade obtained by a student.
- For each course, show the name, the lowest grade and the highest grade.