For better testability, among other reasons, it is good to keep the SQLAlchemy session configuration non-global, as described very well in the question "How to setup SQLAlchemy session in Celery tasks with no global variable" (also discussed in https://github.com/celery/celery/issues/3561).
Now the question is how to handle metadata elegantly. If my understanding is correct, metadata can be reflected once, e.g.:
    from sqlalchemy import MetaData, create_engine

    engine = create_engine(DB_URL, encoding='utf-8', pool_recycle=3600, pool_size=10)
    # db_session = get_session()  # this is the old global session
    meta = MetaData()
    meta.reflect(bind=engine)
Reflecting on every task execution is bad for performance, and metadata is a more or less stable structure that is thread-safe as long as we only read it.
However, the metadata sometimes changes (Celery is not the "owner" of the DB schema), which causes errors in workers.
What would be an elegant way to deal with meta in a testable way while still being able to react to underlying DB changes? (Alembic is in use, if that is relevant.)
I was thinking of using an Alembic version change as the signal to re-reflect, but I am not quite sure how to make it work nicely in Celery. For instance, if more than one worker senses a change at the same time, the global meta may be handled in a non-thread-safe way.
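To make the idea concrete, here is a rough sketch of what I have in mind, not a tested solution. All names (MetadataCache, read_version, reflect, sqlalchemy_cache) are made up for illustration, and the wiring function assumes Alembic's default alembic_version table with its version_num column. The cache never mutates the shared metadata: on a version change it reflects into a fresh object under a lock and swaps the reference, so concurrent readers keep using the old object until the new one is ready.

```python
import threading
from typing import Callable, Generic, Optional, Tuple, TypeVar

T = TypeVar("T")


class MetadataCache(Generic[T]):
    """Caches reflected metadata and rebuilds it when the schema version changes.

    Hypothetical helper: both collaborators are injected, so tests can pass
    plain functions instead of a real database.
    """

    def __init__(self, read_version: Callable[[], str], reflect: Callable[[], T]) -> None:
        self._read_version = read_version
        self._reflect = reflect
        self._lock = threading.Lock()
        # (version, metadata) is replaced as one tuple, so readers never see a mix.
        self._state: Optional[Tuple[str, T]] = None

    def get(self) -> T:
        current = self._read_version()
        state = self._state
        if state is None or state[0] != current:
            with self._lock:
                # Re-check under the lock: another thread may have refreshed already.
                state = self._state
                if state is None or state[0] != current:
                    state = (current, self._reflect())
                    self._state = state
        return state[1]


def sqlalchemy_cache(engine) -> MetadataCache:
    """Wire the cache to a real engine (assumes Alembic's default version table)."""
    # Imported here so the cache class itself has no third-party dependency.
    from sqlalchemy import MetaData, text

    def read_version() -> str:
        with engine.connect() as conn:
            return conn.execute(text("SELECT version_num FROM alembic_version")).scalar()

    def reflect() -> MetaData:
        meta = MetaData()  # always a fresh object, the old one is left untouched
        meta.reflect(bind=engine)
        return meta

    return MetadataCache(read_version, reflect)
```

A task would then call cache.get() at the start instead of touching a module-level meta, at the cost of one small version query per task. With the prefork pool each worker process would hold its own cache, so the lock only matters for thread-based pools. I am not sure this is the most elegant way, hence the question.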
If it matters, Celery is used standalone in this case: there are no web framework modules or apps in the Celery app. The problem is also simpler because only SQLAlchemy Core is in use, not the ORM.