Duh?
Now, let’s recover from that terrible start to a blog post. And by recover I mean put up this lame educational post a day after Charlie’s awesomely nerdy post from South Dakota.
For the second part of this pointless series on SQL, I’ll continue my discussion on select statements. This time around I’ll cover the ORDER BY clause, the GROUP BY clause, and the SUM function. If you need to get caught up, here’s Part 1.
ORDER BY
You can use the ORDER BY clause to sort your results by a certain field. Going back to my Babe Ruth example Part 1, we can order the results by the number of home runs in each season.
mysql> select playerID,yearID,HR
-> from Batting
-> where playerID='ruthba01'
-> order by HR desc;
+----------+--------+----+
| playerID | yearID | HR |
+----------+--------+----+
| ruthba01 | 1927 | 60 |
| ruthba01 | 1921 | 59 |
| ruthba01 | 1928 | 54 |
| ruthba01 | 1920 | 54 |
| ruthba01 | 1930 | 49 |
| ruthba01 | 1926 | 47 |
...
Here, I’ve used HR in the ORDER BY clause and set it to DESC (descending). You can also use ASC (ascending), but that is the default behavior of ORDER BY.
Using ORDER BY, we can also see the greatest single season home run totals across history.
mysql> select playerID,yearID,HR
-> from Batting
-> order by HR desc
-> limit 10;
+-----------+--------+----+
| playerID | yearID | HR |
+-----------+--------+----+
| bondsba01 | 2001 | 73 |
| mcgwima01 | 1998 | 70 |
| sosasa01 | 1998 | 66 |
| mcgwima01 | 1999 | 65 |
| sosasa01 | 2001 | 64 |
| sosasa01 | 1999 | 63 |
| marisro01 | 1961 | 61 |
| ruthba01 | 1927 | 60 |
| ruthba01 | 1921 | 59 |
| foxxji01 | 1932 | 58 |
+-----------+--------+----+
10 rows in set (0.33 sec)
Notice the end of the statement where I use the LIMIT clause. We can use this to only show the first few fields. Here, I have shown the first ten.
GROUP BY
A lot of times, the tables that you work with contain a lot of raw data. Each record might represent some small piece of the puzzle. But you might be interested in the bigger picture or a more summarized form of the data. The GROUP BY clause is one way to deal with this. We can choose a field in the table to group all the rows around. However, GROUP BY in itself is not very useful. We have to combine it with another function. Here is an example that uses the COUNT(*) function.
mysql> select playerID, count(*)
-> from Batting
-> where playerID='bondsba01'
-> group by playerID;
+-----------+----------+
| playerID | count(*) |
+-----------+----------+
| bondsba01 | 22 |
+-----------+----------+
1 row in set (0.02 sec)
This query shows us how many season Barry Bonds played (or has played so far, but that’s still up in the air), since one row in the Batting table represents one player-year*.
SUM
Another function that we can use with GROUP BY is the SUM function. Here, I will use GROUP BY to find the top ten career home run totals of all time.
mysql> select playerID, sum(HR)
-> from Batting
-> group by playerID
-> order by sum(HR) desc
-> limit 10;
+-----------+---------+
| playerID | sum(HR) |
+-----------+---------+
| bondsba01 | 762 |
| aaronha01 | 755 |
| ruthba01 | 714 |
| mayswi01 | 660 |
| sosasa01 | 609 |
| griffke02 | 593 |
| robinfr02 | 586 |
| mcgwima01 | 583 |
| killeha01 | 573 |
| palmera01 | 569 |
+-----------+---------+
10 rows in set (0.49 sec)
There you have it. Another post in the database. Zing!
Next time, I will be covering joins.
*Actually, if you want to get technical, one row in the Batting table represents a player-stint. A player can have more than one stint per year if he is traded mid-season.
A SEQUEL SEQUEL. Haaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa………….