🐛 fix knowledge king oom bug #150
base: main
- Upgrade jtokkit from 0.4.0 to 0.5.0.
- Use the lazy-loading registry, a new method on the Encodings class, so that not every encoding type is initialized up front.
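To illustrate why a lazy registry can help with memory, here is a minimal sketch of the difference between building every encoding at construction time and building one only when it is first requested. `ToyEncoding`, `EagerRegistry`, and `LazyRegistry` are hypothetical stand-ins written for this example; they are not JTokkit classes, and the whitespace tokenizer is only a placeholder for a real vocabulary table.

```kotlin
// Hypothetical stand-in for a tokenizer encoding; NOT a JTokkit class.
class ToyEncoding(val name: String) {
    // Placeholder tokenizer: counts whitespace-separated words.
    fun countTokens(text: String): Int =
        text.split(Regex("\\s+")).count { it.isNotEmpty() }
}

// Names used only as labels for this sketch.
val ENCODING_NAMES = listOf("r50k_base", "p50k_base", "cl100k_base")

// Eager variant: every encoding is built when the registry is created.
class EagerRegistry {
    var loads = 0
        private set

    private val encodings: Map<String, ToyEncoding> =
        ENCODING_NAMES.associateWith { loads++; ToyEncoding(it) }

    fun getEncoding(name: String): ToyEncoding = encodings.getValue(name)
}

// Lazy variant: an encoding is built on first request and then cached.
class LazyRegistry {
    var loads = 0
        private set

    private val cache = mutableMapOf<String, ToyEncoding>()

    fun getEncoding(name: String): ToyEncoding {
        require(name in ENCODING_NAMES) { "unknown encoding: $name" }
        return cache.getOrPut(name) { loads++; ToyEncoding(name) }
    }
}

fun main() {
    val eagerRegistry = EagerRegistry()
    val lazyRegistry = LazyRegistry()
    println("eager loads at construction: ${eagerRegistry.loads}")
    println("lazy loads at construction: ${lazyRegistry.loads}")

    val encoding = lazyRegistry.getEncoding("cl100k_base")
    lazyRegistry.getEncoding("cl100k_base") // second lookup hits the cache
    println("lazy loads after two lookups: ${lazyRegistry.loads}")
    println("tokens: ${encoding.countTokens("hello lazy world")}")
}
```

In this sketch the eager registry pays for all three encodings even if only one is ever used, while the lazy one pays for exactly the encoding that is requested.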
🤖 OpenAI
High-level Summary: The code changes in this pull request involve upgrading the [...]
File Changes: Please review the specific file changes for more detailed information on each modification.
 fun measureNumOfTokens(string: String): Int {
-    val encoding: Encoding = registry.getEncoding(encodingType)
-    val encoded: List<Int> = encoding.encode(string)
-    return encoded.size
+    return encoding.countTokens(string)
 }
🤖 OpenAI
The `measureNumOfTokens` function has been modified to use the `countTokens` method of the `encoding` object. This change improves the efficiency of token counting by utilizing the built-in functionality provided by the `jtokkit` library.
Suggested change:
fun measureNumOfTokens(string: String): Int {
    return encoding.countTokens(string)
}
LGTM
fun measureNumOfTokens(string: String): Int = encoding.countTokens(string)
Fixed.
companion object {
    private val registry = Encodings.newLazyEncodingRegistry()
    private lateinit var encoding: Encoding
}

init {
    encoding = registry.getEncodingForModel(ModelType.GPT_3_5_TURBO)
}
Q: Can't this be initialized directly here?
companion object {
private val registry = Encodings.newLazyEncodingRegistry()
private val encoding = registry.getEncodingForModel(ModelType.GPT_3_5_TURBO)
}
Yes, it can be initialized directly. Fixed.
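For context on the review question above, here is a minimal sketch of how the two shapes differ at runtime. A `lateinit` field in a companion object that is assigned from the class's `init` block is reassigned on every construction, whereas a directly initialized `val` is evaluated once. The counters and the `lookupFor...` functions are hypothetical stand-ins for a call such as `registry.getEncodingForModel(...)`; no JTokkit API is used here.

```kotlin
// Hypothetical counters standing in for the cost of an expensive lookup.
var initBlockLookups = 0
var directLookups = 0

fun lookupForInitBlock(): String { initBlockLookups++; return "encoding" }
fun lookupForDirectInit(): String { directLookups++; return "encoding" }

class WithInitBlock {
    companion object {
        private lateinit var encoding: String
    }

    init {
        // Runs on every construction and overwrites the shared field.
        encoding = lookupForInitBlock()
    }

    fun current(): String = encoding
}

class WithDirectInit {
    companion object {
        // Evaluated once, when the companion object is initialized.
        private val encoding: String = lookupForDirectInit()
    }

    fun current(): String = encoding
}

fun main() {
    repeat(3) { WithInitBlock() }
    repeat(3) { WithDirectInit() }
    println("init-block lookups: $initBlockLookups")
    println("direct-init lookups: $directLookups")
}
```

Direct initialization also lets the field be a non-null `val`, so it cannot be read before it has been assigned.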
Kudos, SonarCloud Quality Gate passed! 0 Bugs. No Coverage information.
Why need this change? / Root cause:
Changes made:
Test Scope / Change impact:
Summary by OpenAI
Release Notes:
- [...] `jtokkit` version from 0.4.0 to 0.5.1.
- [...] `JTokkit` class.
- [...] `JTokkit` class.
- [...] `measureNumOfTokens` function to use the `countTokens` method of the encoding object.