Skip to content

Modify Django To Support Binary Order By

by Paul Kenjora on August 20th, 2007

Binary order by is different from simple order by in that it is case sensitive. Python’s string comparison operators are also case sensitive. So for any order_by query on a string column in MySQL the results are not in the order Python expects. This leads to serious problems when you try to do simple things like binary searches on the returned list of objects.

The Basics

A simple default sort under MySQL translates to "ORDER BY" and produce something like:

String Columns

  1. bat
  2. cat
  3. Dad
  4. mad
  5. not
  6. no_
  7. Wow

Number Columns

  1. 1
  2. 2
  3. 7
  4. 12
  5. 23

A binary sort under MySQL translates to "ORDER BY BINARY" and produce something like:

String Columns

  1. Dad
  2. Wow
  3. bat
  4. cat
  5. mad
  6. no_
  7. not

Number Columns

  1. 1
  2. 12
  3. 2
  4. 23
  5. 7

The binary sort goes purely by ASCII value of the characters, and hence picks up cases, just like Python string compare. The default MySQL sort ignores case. Try it, sort the above using MySQL and Python to get the following rules:

Strings: Python = MySQL ORDER BY BINARY

Numbers: Python = MySQL ORDER BY

I ran into the problem above while trying to fetch records sorted by string and then doing a binary search on the record list in Python code. Running a unique query for the data each time was not an option. Because the MySQL and Python ordering (string comparison) did not match the binary search was skipping wildly around in some instances. I decided to fix the issue in Django code rather than working around the issue. The fix is surprisingly simple.

The New Order By Function

Because the “BINARY” modifier may have undesirable effects on number sorting in MySQL and I’m not familiar with Postgress or other databases, I decided to make it optional. I also like the "__" syntax used in Django queries because its a safe bet everyone is adhering to the standard when naming columns. The result is a new modifier that can be added to the "order_by" function called "__binary". A sample call would look like the following:

SomeModel.objects.all().order_by(‘-some_column__binary’)

Modify Django To Perform Binary Order By

To get things to work right I modified about 5 lines of code and made sure the change was transparent to existing logic. The theory is to check for "__binary" at the end of the order by string and add "BINARY " to the SQL string. Then strip off the new "__binary" modifier so the rest of the function doesnt notice anything. Make the following modifications to implement binary sorting in your Django copy. All changes are marked by "ADDED:".

Conclusion

The above code could use some improvements. Currently it will mark all clauses as "BINARY" even if only one is specified, it needs to be extended to support chained "order_by" functions. This was not critical to my application. I do not know how Postgress or other databases support binary ordering, comments welcome. With a bit more research this functionality would be a great addition to Django code base. Given enough time Imay check this in sometime, unless one of you decided to do it before then. Hope this helps.

blog comments powered by Disqus