Background
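mlflow iterates quickly, with roughly one feature release a month. As of November 1, 2020 it has reached version 1.11.0.
The mlflow release notes show that as early as 1.10.0 it offered better feature support for the model registry, as well as the ability to logically delete experiments.
Those features are missing in mlflow 1.4.0. Experiment deletion matters most: once there are many experiments, the list becomes cluttered and hard to manage. That is why we upgrade mlflow.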
Preparation and upgrade
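Following the earlier post on setting up and using mlflow, first create one conda environment for mlflow 1.4.0 and one for mlflow 1.11.0.
Assuming the environments already exist and are named mlflow-1.4.0 and mlflow-1.11.0, run: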
conda activate mlflow-1.11.0
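For readers who have not created the two environments yet, here is a minimal sketch of how it could be done. The environment names follow this article. Everything else is an assumption: python=3.6 is taken from the paths in the traceback further down, mysqlclient is guessed from the MySQLdb driver visible in the logs, and boto3 is guessed from the s3:// artifact root. The helper make_env_cmds is hypothetical and only prints the commands, so they can be reviewed before running.

```shell
# Hypothetical helper: print the commands that create a conda env
# for one mlflow version. Nothing is executed here.
# Assumptions: Python 3.6, mysqlclient as the MySQL driver, boto3 for S3.
make_env_cmds() {
  version="$1"
  echo "conda create -y -n mlflow-${version} python=3.6"
  echo "conda run -n mlflow-${version} pip install mlflow==${version} mysqlclient boto3"
}
```

To actually create both environments, the printed commands can be piped to a shell, for example: make_env_cmds 1.4.0 | sh, then make_env_cmds 1.11.0 | sh.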
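Following the mlflow db upgrade documentation, run:
mlflow db upgrade mysql://user:passwd@host:port/db
For example:
mlflow db upgrade mysql://root:root@localhost/mlflow
where user and passwd are the MySQL credentials, host and port locate the MySQL server, and db is the database mlflow uses as its backend store.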
If the command succeeds, you will see output like the following:
2020/11/02 10:24:50 INFO mlflow.store.db.utils: Updating database tables
INFO [alembic.runtime.migration] Context impl MySQLImpl.
INFO [alembic.runtime.migration] Will assume non-transactional DDL.
INFO [alembic.runtime.migration] Running upgrade 2b4d017a5e9b -> cfd24bdc0731, Update run status constraint with killed
INFO [alembic.runtime.migration] Running upgrade cfd24bdc0731 -> 0a8213491aaa, drop_duplicate_killed_constraint
WARNI [0a8213491aaa_drop_duplicate_killed_constraint_py] Failed to drop check constraint. Dropping check constraints may not be supported by your SQL database. Exception content: (MySQLdb._exceptions.ProgrammingError) (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'CHECK status' at line 1")
[SQL: ALTER TABLE runs DROP CHECK status]
(Background on this error at: http://sqlalche.me/e/f405)
INFO [alembic.runtime.migration] Running upgrade 0a8213491aaa -> 728d730b5ebd, add registered model tags table
INFO [alembic.runtime.migration] Running upgrade 728d730b5ebd -> 27a6a02d2cf1, add model version tags table
INFO [alembic.runtime.migration] Running upgrade 27a6a02d2cf1 -> 84291f40a231, add run_link to model_version
If you now switch back to the mlflow 1.4.0 environment and run:
mlflow server \
  --backend-store-uri mysql://root:root@localhost/mlflow \
  --host 0.0.0.0 -p 5002 \
  --default-artifact-root s3://mlflow
the server fails with this error:
2020/11/02 10:25:41 ERROR mlflow.cli: Error initializing backend store
2020/11/02 10:25:41 ERROR mlflow.cli: Detected out-of-date database schema (found version 84291f40a231, but expected 2b4d017a5e9b). Take a backup of your database, then run 'mlflow db upgrade <database_uri>' to migrate your database to the latest schema. NOTE: schema migration may result in database downtime - please consult your database's documentation for more detail.
Traceback (most recent call last):
  File "/Users/ljh/opt/miniconda3/envs/mlflow-1.4.0-dev/lib/python3.6/site-packages/mlflow/cli.py", line 263, in server
    initialize_backend_stores(backend_store_uri, default_artifact_root)
  File "/Users/ljh/opt/miniconda3/envs/mlflow-1.4.0-dev/lib/python3.6/site-packages/mlflow/server/handlers.py", line 97, in initialize_backend_stores
    _get_tracking_store(backend_store_uri, default_artifact_root)
  File "/Users/ljh/opt/miniconda3/envs/mlflow-1.4.0-dev/lib/python3.6/site-packages/mlflow/server/handlers.py", line 83, in _get_tracking_store
    _tracking_store = _tracking_store_registry.get_store(store_uri, artifact_root)
  File "/Users/ljh/opt/miniconda3/envs/mlflow-1.4.0-dev/lib/python3.6/site-packages/mlflow/tracking/_tracking_service/registry.py", line 37, in get_store
    return builder(store_uri=store_uri, artifact_uri=artifact_uri)
  File "/Users/ljh/opt/miniconda3/envs/mlflow-1.4.0-dev/lib/python3.6/site-packages/mlflow/server/handlers.py", line 54, in _get_sqlalchemy_store
    return SqlAlchemyStore(store_uri, artifact_uri)
  File "/Users/ljh/opt/miniconda3/envs/mlflow-1.4.0-dev/lib/python3.6/site-packages/mlflow/store/tracking/sqlalchemy_store.py", line 99, in __init__
    mlflow.store.db.utils._verify_schema(self.engine)
  File "/Users/ljh/opt/miniconda3/envs/mlflow-1.4.0-dev/lib/python3.6/site-packages/mlflow/store/db/utils.py", line 52, in _verify_schema
    "more detail." % (current_rev, head_revision))
mlflow.exceptions.MlflowException: Detected out-of-date database schema (found version 84291f40a231, but expected 2b4d017a5e9b). Take a backup of your database, then run 'mlflow db upgrade <database_uri>' to migrate your database to the latest schema. NOTE: schema migration may result in database downtime - please consult your database's documentation for more detail.
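This error is the sign that the upgrade succeeded: the database is now at schema revision 84291f40a231, which mlflow 1.4.0 does not know (it expects 2b4d017a5e9b). Despite the "out-of-date" wording, it is the 1.4.0 installation, not the database, that is behind.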
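The schema revision can also be checked without starting an old server. The sketch below shows two ways, under stated assumptions. schema_revisions is a hypothetical helper that pulls the "found" and "expected" revisions out of an mlflow error line fed on stdin. current_revision is a hypothetical helper that reads the revision straight from MySQL, relying on alembic keeping it in its default alembic_version table. The mysql client is assumed to be installed, and -p makes it prompt for the password.

```shell
# Hypothetical helper: read mlflow log text on stdin and print
# "<found> <expected>" for the first schema-mismatch line.
schema_revisions() {
  sed -n 's/.*found version \([0-9a-f]*\), but expected \([0-9a-f]*\).*/\1 \2/p' | head -n 1
}

# Hypothetical helper: ask MySQL for the current alembic revision.
# Arguments: user, host, database. Prompts for the password.
current_revision() {
  mysql -N -B -u "$1" -p -h "$2" "$3" -e 'SELECT version_num FROM alembic_version'
}
```

With the setup used in this article, current_revision root localhost mlflow should print 84291f40a231 once the migration has been applied.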
Now run the server from the mlflow 1.11.0 conda environment:
mlflow server \
  --backend-store-uri mysql://root:root@localhost/mlflow \
  --host 0.0.0.0 -p 5003 \
  --default-artifact-root s3://mlflow
The web UI loads normally, which completes the upgrade of mlflow from 1.4.0 to 1.11.0.
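Besides opening the page in a browser, the server can be smoke-tested from the command line. This is only a sketch: server_ok is a hypothetical helper, curl is assumed to be installed, and the experiments/list endpoint is assumed from the mlflow 1.x REST API, so adjust the path if your version differs.

```shell
# Hypothetical helper: succeed if the tracking server answers a REST call.
# $1 is host:port, e.g. localhost:5003 (the port used in this article).
# Assumption: the 1.x REST API exposes experiments/list.
server_ok() {
  curl -fsS "http://$1/api/2.0/mlflow/experiments/list" > /dev/null
}
```

Usage: server_ok localhost:5003 && echo "server is up".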
Notes
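If this is a production system, back up the database first, because the upgrade is not guaranteed to succeed. If it fails, restore directly from the backup or follow the failure-handling procedure.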
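The backup-first rule can be sketched as a small wrapper that refuses to migrate unless the dump succeeded. backup_and_upgrade is a hypothetical helper, not part of mlflow. It assumes mysqldump is installed and that the database is reachable on the default port. Passing the password on the command line is done here only for brevity and is visible to other local users.

```shell
# Hypothetical helper: dump the database, then migrate only if the dump worked.
# Arguments: user, password, host, database.
backup_and_upgrade() {
  user="$1"; pass="$2"; host="$3"; db="$4"
  dump="${db}-$(date +%Y%m%d%H%M%S).sql"
  # --single-transaction takes a consistent snapshot of InnoDB tables.
  mysqldump --single-transaction -u "$user" -p"$pass" -h "$host" "$db" > "$dump" || return 1
  mlflow db upgrade "mysql://${user}:${pass}@${host}/${db}"
}
```

Run it from the mlflow 1.11.0 environment, for example: backup_and_upgrade root root localhost mlflow. If the migration fails, the dump can be loaded back with the mysql client (mysql -u root -p mlflow < the dump file).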