?

Log in

No account? Create an account
entries friends calendar profile Elf Sternberg's Pendorwright Projects Previous Previous Next Next
Using MySQL’s Full Text Search with Django - Elf M. Sternberg
elfs
elfs
Using MySQL’s Full Text Search with Django

It drives me nuts that we in the Django community rely on Solr or Haystack to provide us with full-text search when MySQL provides a perfectly functional full-text search feature, at least at the table level and for modest projects. I understand that not every app runs on MySQL, but mine do, and I’m sure many of you are running exactly that, and could use this technique without modification.

Well, after much digging, I found an article on MercuryTide’s website covering custom QuerySets with FULLTEXT and relevance, and built this library around it.

I used this rather than Django’s internal filter keyword search, because this technique adds an additional aggregated value, the relevance of the search terms to the search. This is useful in sorting the search, something not automatically provided by the QuerySet.filter() mechanism.

You must create the indexes against which the search will be conducted. For performance reasons, if you’re importing a massive collection of data, it’s better to import all of the data and then create the index. More importantly, when you declare that a SearchManager to be used by a Model, you declare it thusly:

class Book(models.Model):
    ...
    objects = SearchManager()

When you do, you must add an index that corresponds to that list of fields:

CREATE FULLTEXT INDEX book_text_index ON books_book (title, summary)

Notice how the contents of the index correspond with the contents of the Search Manager.  Or you can automate the process with South:

    def forwards(self, orm):
        db.execute('CREATE FULLTEXT INDEX book_text_index ON books_book (title, summary)')

    def backwards(self, orm):
        db.execute('DROP INDEX book_text_index on books_book')

To use the library is fairly trivial. If there is only one index (which can encompass several columns) for any table, you call

books = Book.objects.search('The Metamorphosis').order_by('-relevance')

If there’s more than one index, you specify the index by the list of fields:

books = Book.objects.search('The Metamorphosis', ('title', 'summary')).order_by('-relevance')

Note that that’s a tuple, and must be.

If you specify fields that are not part of a FULLTEXT index, the error message will include lists of viable indices.   It will also tell you if there are no indices.  (Getting that to work was tricky, as it involved database introspection and the decoration of methods, so I’m especially proud of it.)

The library is fully available on my github account: django_mysqlfulltextsearch

This entry was automatically cross-posted from Elf's technical journal, ElfSternberg.com

Tags: , ,

2 comments or Leave a comment
Comments
wolfwings From: wolfwings Date: October 23rd, 2010 12:06 am (UTC) (Link)
Sadly, most stuff just ignores MySQL's FULLTEXT for a different reason: It only works on MyISAM tables. :-/ InnoDB usage blocks it, and more and more stuff is running/focussing on InnoDB from what I've been seeing since it allows for row-level locking instead of shutting down an entire table to lock it to update it.

This same stuff could be applied to a Sphinx install pretty easilly though, since Sphinx runs a subset of the MySQL command-set for communication with front-end apps.
en_ki From: en_ki Date: October 25th, 2010 02:54 pm (UTC) (Link)
Yeah, I like the part where I can have full-text search XOR referential integrity. "Herp a derp", as the kids say.

2 comments or Leave a comment