Mysql full-text index query with SqlAlchemy
结论:
- mysql fulltext index基本可用(试了natural language mode和boolean mode),尤其在表结构比较简单的前提下
- sqlalchemy自带的match关键字只能match单列的fulltext索引,多列的我们用如下办法自制 – 仅用于查询的话,完全够用
- 最重要的一点:数据量大时,一个query尽量只使用where match against,与其他的where some_column = ‘hello’要分开写(做成subquery)。如果写在一起,极有可能只有fulltext索引生效,而其他列的简单查询所对应的index不生效。
注册自己的Match子句
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
from sqlalchemy.ext.compiler import compiles from sqlalchemy.sql.expression import ClauseElement from sqlalchemy import literal class Match(ClauseElement): def __init__(self, columns, value): self.columns = columns self.value = literal(value) @compiles(Match) def _match(element, compiler, **kw): return "MATCH (%s) AGAINST (%s)" % ( ", ".join(compiler.process(c, **kw) for c in element.columns), compiler.process(element.value) ) query(Model).filter(Match([Model.x, Model.y], "some value")) #实施查询 |
所有条件一起用于查询(前面1个filter不能触发已有index)
1 2 3 4 5 6 7 |
data = ErrorLog.query\ .filter(and_(ErrorLog.devId == thisMachine.devId, ErrorLog.dateTime.between(logStart, logEnd))) \ .filter(Match([ErrorLog.col1, ErrorLog.col2, ErrorLog.col3, ErrorLog.col4], "distributed system")) \ .order_by(ErrorLog.dateTime.desc()) \ .offset(offset).limit(size).all() |
先做普通字段查询,join后再做fulltext检索(subquery和query各自使用自己的index)
1 2 3 4 5 6 7 8 9 10 |
subq = db.session.query(ErrorLog.id.label('errorLogId'))\ .filter(and_(ErrorLog.devId == thisMachine.devId, ErrorLog.dateTime.between(logStart, logEnd))) \ .order_by(ErrorLog.dateTime.desc()) \ .subquery() data = ErrorLog.query.join(subq, ErrorLog.id == subq.c.errorLogId)\ .filter(Match([ErrorLog.col1, ErrorLog.col2, ErrorLog.col3, ErrorLog.col4], "distributed system")) \ .offset(offset).limit(size).all() |
参考自:https://stackoverflow.com/questions/14971619/proper-use-of-mysql-full-text-search-with-sqlalchemy
文章的脚注信息由WordPress的wp-posturl插件自动生成