Wednesday, August 01, 2012

mysql: getting a column in a row that maximizes another column

I can't explain the problem in a better way, but I'll provide an example of what I'm trying to solve here. Say you have a table t:


productidprice
keyboard123
keyboard255
keyboard399
keyboard418
chair129
chair215
chair39
chair442

If you want the greatest value of column price, you'll just

SELECT max(price) FROM t

However, if you want to know the greatest price for each product, you'll need group:

SELECT product,max(price) FROM t GROUP BY product

All good. However, what if you need to get the id column for the row that has the maximum price for each group? That is, you want to get the value 3 for keyboard, and 4 for chair.

The ideal solution would be to have an proper aggregation function to do that, for instance VALUE( id HAVING price = max(price) ) or something like that. But MySQL (and AFAIK, Oracle 10g) don't provide such thing.

There's a lot of possible solutions, and I'll summarize the ones I know.

This one is a trick that works when the column you want to maximize (or minimize) has a maximum size.

SELECT product, max(price), substr( max( concat( lpad( price, 10, 0 ), id ) ), 11 ) as id FROM t GROUP BY product

That one uses string functions also, but I think it's neater (mainly because it's the only one I came up by myself) (however, it won't work if the SEPARATOR character occurs in that column):

SELECT product, max(price), substring_index( group_concat( id order by price desc separator '|' ), '|', 1 ) FROM t GROUP BY product

Here's one that uses subqueries:

SELECT product, max(price),
( SELECT id FROM t t2 WHERE t2.product = t.product ORDER BY price DESC LIMIT 1 )
FROM t
GROUP BY product

However, you can't do this with temporary tables. Also, you can't refer to GROUP columns defined outside the subquery inside the subquery (or else you could just add AND t2.price = max(t.price), and it should be nice).


This one relies on an undocumented feature of grouping in MySQL:

SELECT product, max(price), id
FROM(
  SELECT product, price, id
  FROM t
  ORDER BY product, price DESC
) x
GROUP BY product

If we don't specify the column id neither as a GROUPed column or in an aggregation function (as MAX), standard SQL databases just won't run; however, MySQL has a  ONLY_FULL_GROUP_BY  flag (that I think is off by default) that allow you to do this - but it is not specified which value of id will be returned. It could be from any row belonging to that GROUP, but experience shows that it will always return the value from the last row. So, you perform an ORDER BY, then GROUP BY. It is not guarranteed that it will work, but it does (currently).








This page is powered by Blogger. Isn't yours?