Find the best way to import weights into Keras #58

Open
zaleslaw opened this issue May 18, 2021 · 4 comments
Labels
good second issue Good for advanced contributors research Research or reproducing code from science paper

Comments

@zaleslaw
Collaborator

zaleslaw commented May 18, 2021

We can load a model configuration from the Keras JSON format and pre-trained weights stored in h5 format.
But how do we export a model trained in KotlinDL to Keras or pure TensorFlow?

Currently, we support exporting weights to txt files. It's a baseline solution.
We need research here to find the best way to export weights.

There are a few open questions:

  1. Which part of weight loading should be implemented on the KotlinDL side? Or do we need a special loader for Keras in Python (published as a separate pip package)?
  2. Should we export to txt, to another popular format, or to NumPy arrays so the weights are easy to parse on the Python side?
  3. Is it easy to write an h5 writer that stores the weights in the Keras weight format?
  4. Will that approach be universal enough to reuse for exporting weights to ONNX or PyTorch? Or will we eat the elephant in parts and develop a separate approach for each target system?
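
To make question 2 concrete, here is a minimal sketch of what a NumPy-readable export could look like from the Kotlin side. It encodes one tensor in the NumPy `.npy` format (version 1.0). The function name `encodeNpy` is hypothetical and is not part of KotlinDL. The sketch also assumes the weights are available as a flat, row-major `FloatArray` with a known shape, which may not match how KotlinDL holds them internally.

```kotlin
import java.io.ByteArrayOutputStream
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical helper (not KotlinDL API): encodes a flat, row-major FloatArray
// and its shape as the bytes of a NumPy .npy file, format version 1.0.
fun encodeNpy(values: FloatArray, shape: IntArray): ByteArray {
    require(shape.fold(1) { acc, dim -> acc * dim } == values.size) {
        "shape does not match the number of values"
    }
    // The shape is written as a Python tuple literal; a 1-D shape needs a trailing comma.
    val shapeText = if (shape.size == 1) "(${shape[0]},)" else shape.joinToString(", ", "(", ")")
    val dict = "{'descr': '<f4', 'fortran_order': False, 'shape': $shapeText, }"
    // 10 bytes precede the header: magic (6) + version (2) + header length (2).
    // The header is space-padded and newline-terminated so the data starts on a 64-byte boundary.
    val unpadded = 10 + dict.length + 1
    val header = dict + " ".repeat((64 - unpadded % 64) % 64) + "\n"

    val out = ByteArrayOutputStream()
    out.write(0x93)                                        // first magic byte
    out.write("NUMPY".toByteArray(Charsets.US_ASCII))      // rest of the magic string
    out.write(byteArrayOf(1, 0))                           // format version 1.0
    out.write(
        ByteBuffer.allocate(2).order(ByteOrder.LITTLE_ENDIAN)
            .putShort(header.length.toShort()).array()     // header length, little-endian uint16
    )
    out.write(header.toByteArray(Charsets.US_ASCII))

    // Payload: little-endian float32 values in C (row-major) order.
    val data = ByteBuffer.allocate(values.size * 4).order(ByteOrder.LITTLE_ENDIAN)
    values.forEach { data.putFloat(it) }
    out.write(data.array())
    return out.toByteArray()
}
```

If these bytes are written to a file such as `kernel.npy` (an illustrative name), `numpy.load` on the Python side should be able to read it back as a `float32` array of the given shape. Mapping the arrays onto Keras layers would still need a small loader, which is the subject of question 1.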

Let's discuss the possible approaches, with their advantages and disadvantages, here in this issue.

P.S. Read the Contributing Guidelines.

@zaleslaw added the good first issue, research, and good second issue labels and removed the good first issue label on May 18, 2021
@dosier
Contributor

dosier commented Jun 14, 2021

Maybe we can abstract out the serialisation part, then provide a default serialiser for the Keras weight format (this seems most fitting for KotlinDL). If we abstract the serialisation part, people can implement their own serialisers and provide them as dependencies for niche use cases. Maybe inspiration can be taken from Ktor's features system.

@zaleslaw
Collaborator Author

Yeah, it's a great idea to separate out the serialization API (right now it's built on the Keras format for model saving).
Which parts of Ktor do you mean? I'm not familiar with this functionality.

@dosier
Contributor

dosier commented Jun 16, 2021

Never mind the Ktor part; it came to mind because it has a nice syntax for handling serialisation (e.g. https://ktor.io/docs/kotlin-serialization.html#register_arbitrary_converter).
But come to think of it, we probably don't want to define the serialiser when we define the model, but rather let the user provide it when saving the model. That way they can easily save the same model using multiple serialisers.

Though a case could be made for both, it would definitely be nice to have an option to easily define the serialiser when defining the model. I'd guess most users will only use one anyway.

They don't have to be mutually exclusive either; maybe the save function could take an optional parameter for when you want to use a different serialiser than the one provided during model creation.
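
A rough sketch of how the two options could coexist: the model takes a default serialiser at creation, and `save` takes an optional override. Every name here (`WeightSerializer`, `TextWeightSerializer`, `CsvWeightSerializer`, `SketchModel`) is hypothetical and does not exist in KotlinDL, and the two formats are toy stand-ins rather than the real txt or Keras formats.

```kotlin
// All names in this block are hypothetical; none of this is existing KotlinDL API.
interface WeightSerializer {
    // Turns named weight tensors (flattened here for brevity) into bytes.
    fun serialize(weights: Map<String, FloatArray>): ByteArray
}

// Toy text format: one line per tensor, "name: v1 v2 ...".
object TextWeightSerializer : WeightSerializer {
    override fun serialize(weights: Map<String, FloatArray>): ByteArray =
        weights.entries
            .joinToString("\n") { (name, values) -> "$name: ${values.joinToString(" ")}" }
            .toByteArray(Charsets.UTF_8)
}

// Toy CSV format: one line per tensor, "name,v1,v2,...".
object CsvWeightSerializer : WeightSerializer {
    override fun serialize(weights: Map<String, FloatArray>): ByteArray =
        weights.entries
            .joinToString("\n") { (name, values) -> "$name,${values.joinToString(",")}" }
            .toByteArray(Charsets.UTF_8)
}

// Stand-in for a model: the serialiser chosen at creation is the default,
// and save() accepts an optional override for a single call.
class SketchModel(
    private val weights: Map<String, FloatArray>,
    private val defaultSerializer: WeightSerializer = TextWeightSerializer
) {
    fun save(serializer: WeightSerializer = defaultSerializer): ByteArray =
        serializer.serialize(weights)
}
```

With this shape, `model.save()` uses whatever was configured at creation, while `model.save(CsvWeightSerializer)` writes the same weights in a second format without touching the model definition.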

@zaleslaw zaleslaw added this to the 0.4 milestone Sep 13, 2021
@zaleslaw zaleslaw removed this from the 0.4 milestone Dec 15, 2021
@dosier
Contributor

dosier commented Oct 9, 2022

What are your thoughts on using https://github.com/Kotlin/kotlinx.serialization to implement a serialisation strategy for the H5 format? I could probably make a PR for this. However, one drawback of kotlinx.serialization is that it depends on a compiler Gradle plugin, which will make it incompatible with the new K2 compiler (at least for the next few months). But if this is not a problem, we can use it to implement a custom serializer for the H5 format.
However, tbh I am not entirely sure what the direct benefits of using this would be, compared with writing a custom parser from scratch. I guess integration, and ease of converting between formats.
