Encoder improvement #1332
Conversation
val modelMetricsFullHistory: ModelMetricsFullHistory = ModelMetricsFullHistory()
val modelMetrics: ModelMetrics = ModelMetricsWithTimeWindow(20)
val modelMetricsWithTimeWindow: ModelMetrics = ModelMetricsWithTimeWindow(1000)
Is this wise? As this is per endpoint, and there might be hundreds in an API, this might end up being the whole history. Maybe something like 100 could be more reasonable, unless there was some specific reason for such a value. Note that, in typical experiments with a 1-hour budget, the total number of calls is typically in the range of 100k to 300k.
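To illustrate the memory concern being raised, here is a minimal sketch of what a fixed-size time-window metric could look like. The class and method names are hypothetical, not EvoMaster's actual ModelMetricsWithTimeWindow: the point is only that memory stays bounded by the window size, no matter how many calls an endpoint receives.

```kotlin
// Hypothetical sketch: a metric that keeps only the last `windowSize`
// observations, so per-endpoint memory is bounded regardless of the
// total number of calls made during the search.
class TimeWindowMetric(private val windowSize: Int) {
    private val window = ArrayDeque<Double>()

    fun observe(value: Double) {
        window.addLast(value)
        if (window.size > windowSize) {
            window.removeFirst() // drop the oldest observation
        }
    }

    fun mean(): Double =
        if (window.isEmpty()) 0.0 else window.sum() / window.size

    fun size(): Int = window.size
}
```

With a window of 1000 and hundreds of endpoints, as noted above, the windows combined could approach the full history of a 100k–300k call run; a window of 100 would cap the per-endpoint cost at a tenth of that.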
companion object {
    private const val NOT_400 = 200
    private const val MAX_SAMPLES = 10_000 // Fixed-size buffer for samples to avoid memory overload
Why is this a constant in KNN, but a variable (with the same 10_000 value) in KDE?
Basically, it must be a constant. I'll fix it.
/** Cap on stored samples, based on reservoir-style uniform downsampling, to avoid memory overload. */
companion object {
    private const val MAX_SAMPLES = 20_000
Why do some have 10_000 and others 20_000? It is a bit confusing. It might also open us to criticism from reviewers, unless we can document good reasons for this (and, in such a case, those good reasons should be briefly mentioned here).
Agree, I'll add it to the Config with a clear explanation
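Since the doc comment above mentions reservoir-style uniform downsampling, here is a minimal sketch of the classic technique (Algorithm R), under the assumption that this is what the cap refers to; the class name and API are illustrative, not the code in this PR.

```kotlin
import kotlin.random.Random

// Hypothetical sketch of reservoir-style uniform downsampling (Algorithm R):
// keep at most `maxSamples` items, where each item seen so far has an equal
// probability of being in the reservoir. Memory is capped without biasing
// the kept samples toward the start or the end of the stream.
class Reservoir<T>(private val maxSamples: Int, private val rng: Random = Random) {
    private val samples = ArrayList<T>(maxSamples)
    private var seen = 0L

    fun add(item: T) {
        seen++
        if (samples.size < maxSamples) {
            // buffer not full yet: keep everything
            samples.add(item)
        } else {
            // replace a uniformly chosen slot with probability maxSamples / seen
            val j = rng.nextLong(seen)
            if (j < maxSamples) {
                samples[j.toInt()] = item
            }
        }
    }

    fun snapshot(): List<T> = samples.toList()
}
```

If the cap is moved into the Config as agreed, documenting the uniformity guarantee there would also address the reviewer concern about justifying the 10_000 vs 20_000 values.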
//TODO any better way to use this accuracy?
if (!randomness.nextBoolean(accuracy)) {
/**
 * Skips repair if the classifier is still weak, as indicated by low accuracy and F1-score
AND, or OR?
I think AND, because the classifier needs to perform well in both accuracy and precision–recall; otherwise, its knowledge of true positives is no better than a random guess.
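The gating logic being discussed can be sketched as follows. This is an assumption-laden illustration, not the PR's actual code: the function name, threshold value, and signature are hypothetical, but it shows the AND semantics together with the probabilistic skip driven by `randomness.nextBoolean(accuracy)`.

```kotlin
import kotlin.random.Random

// Hypothetical sketch: repair is attempted only when the classifier is
// strong on BOTH accuracy and F1-score, and even then only with a
// probability proportional to the accuracy.
fun shouldAttemptRepair(
    accuracy: Double,
    f1Score: Double,
    rng: Random = Random,
    threshold: Double = 0.5 // illustrative cut-off, not the PR's value
): Boolean {
    // AND, not OR: a classifier that is accurate but has poor
    // precision/recall (or vice versa) is no better than a random
    // guess at identifying true positives.
    if (accuracy <= threshold || f1Score <= threshold) {
        return false
    }
    // probabilistic skip: higher accuracy -> higher chance of repairing
    return rng.nextDouble() < accuracy
}
```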