Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add versions to export import of models, resolve global variables, pass host around the initializers #23

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

kongzii
Copy link

@kongzii kongzii commented Jun 29, 2022

Hi, thank you for this project. I am working on a simple script that will use this to migrate runs and models between our development and production mlflow server in the company, however, I found a couple of issues so I tried to fix them.

  • I removed usage of global mlflow clients because it caused the creation of mlruns directory on import of this library, e.g. doing from mlflow_export_import.run import export_run, import_run would create a new local mlflow server, if the MLFLOW_TRACKING_URI variable is not set.
  • I added support for model versions to the exporting of the model, so now you can export specific version, not just the tag
  • I added optional host argument to the initializators, it is passed to the clients that the class is using, the reason is that before, you was able to pass custom mlflow client, but not the host, so the host could be different than tracking URI of the client.
  • I added the return of run id for the import_model function.
  • I added condition to the if not dst_source.startswith("dbfs:") and not dst_source.startswith("s3:") and not os.path.exists(dst_source), because it was failing on S3 path, saying it doesn't exist.
  • Removed a few (hopefully?) unnecessary mlflow.set_experiment, because it was setting a global experiment, not for the particular client.

The result is that now, you are able to use this library from the other python scripts, without need of covering environment variables, example:

       MODEL_NAME = '...'
       EXPERIMENT_NAME = '...'
       TMP_DIR = '...'
       SOURCE_TRACKING_URI = '...'
       TARGET_TRACKING_URI = '...'

        export_client = mlflow.tracking.MlflowClient(tracking_uri=SOURCE_TRACKING_URI)
        model_exporter = export_model.ModelExporter(
            mlflow_client=export_client,
            host=SOURCE_TRACKING_URI,
            versions=["11", "12"],
        )
        model_exporter.export_model(MODEL_NAME, TMP_DIR)

        import_client = mlflow.tracking.MlflowClient(tracking_uri=TARGET_TRACKING_URI)
        model_importer = import_model.ModelImporter(
            mlflow_client=import_client,
            host=TARGET_TRACKING_URI,
            await_creation_for=600,
        )
        exported_run_id = model_importer.import_model(
            MODEL_NAME,
            TMP_DIR,
            experiment_name=EXPERIMENT_NAME,
        )

I will do probably more contributions in the future, but I wanted to get your feedback before there is too many of changes.

amesar and others added 3 commits June 29, 2022 02:19
Signed-off-by: Peter Jung <peter@jung.ninja>
Signed-off-by: Peter Jung <peter@jung.ninja>
@kongzii kongzii force-pushed the add-versions-to-export-import branch 2 times, most recently from 640323f to b3c171f Compare June 29, 2022 12:22
@amesar amesar force-pushed the master branch 3 times, most recently from 403afa9 to 3a33a6d Compare June 30, 2022 19:23
@kongzii
Copy link
Author

kongzii commented Jul 12, 2022

Hi @amesar, sorry to bother you, but do you have any feedback for this PR, please? Is there something I should change?

@kongzii
Copy link
Author

kongzii commented Aug 11, 2022

Hi @amesar, any feedback for this, please? If you don't want to merge this, no problem, I can fork the project and do the required modifications just for me, but I thought it would be a shame to not collaborate on a project like this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants