Django: don’t use distinct and order_by across relations

By | December 7, 2010

I needed to get a list of project objects that had Time objects attached to them that had been updated by a specific user. I wanted the list to be ordered by the most recently updated Time object, and importantly, I wanted the list of project objects to be distinct (since there are multiple time objects attached to any one project).

I was trying to make the following query work in django:

As the note here describes, this particular combination (distinct and order_by on a related field) doesn’t work so well. The related table (Time, in this case) columns are being added to the query’s SELECT clause, giving me multiple copies of projects that I wanted to be distinct.

There is a Django feature request to support named fields in the call to distinct, but it is not incorporated into trunk yet, mostly due to database backend support.

After some searching and pondering, I was able to get the same list of projects using aggregates instead:

This solution to the problem doesn’t seem to be suggested often, so I thought I’d take the time to mention it.

Leave a Reply

Your email address will not be published. Required fields are marked *