Convert activation functions to numpower #381
base: 3.0
Conversation
Sam 8 LeakyReLU and refactoring
Sam 9 ReLU and ReLU6
Sam 10 SELU and Sigmoid
Sam 11 SiLU
Sam 12 Softmax and Softplus functions
Sam 13 Softsign
Very nice work @apphp and @SkibidiProduction ... I think this is exactly what we need for the first round of integration with NumPower. I had a few questions and comments that may change the outcome of the PR so I'm just going to leave it at that for now until we get that sorted.
Overall, fantastic usage of unit tests and good code quality. I love to see it.
Andrew
$$

## Parameters
| # | Name | Default | Type | Description |
|---|---|---|---|---|
| 1 | alpha | 1.0 | float | The value at which leakage will begin to saturate. Ex. alpha = 1.0 means that the output will never be less than -1.0 when inactivated. |

## Size and Performance
ELU is a simple function and is well-suited for deployment on resource-constrained devices or when working with large neural networks.
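For reference, the saturation behaviour described for `alpha` above follows from the standard ELU definition (added here as a reference note, not part of the diff):

$$
\mathrm{ELU}(x) =
\begin{cases}
x & \text{if } x > 0 \\
\alpha \left(e^{x} - 1\right) & \text{if } x \le 0
\end{cases}
$$

With $\alpha = 1.0$, the negative branch approaches $-1$ as $x \to -\infty$, which is the saturation value described in the parameter table.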
How did you come up with these size and performance details? I'm noticing that some differ from my understanding. For example, it is not necessarily true, when taken in the context of all activation functions, that ELU is a simple function or well-suited for resource-constrained devices.
Perhaps it would actually be more confusing to offer this somewhat subjective explanation. In addition, in practice, activation functions have very little impact on the total runtime of the network, so taking the effort here to detail their performance is somewhat distracting.
How do you feel about dropping this "Size and Performance" section altogether, not being opinionated about individual activation functions, and instead letting the user discover the nuances of each activation function for themselves? However, if there is something truly outstanding about a particular activation function's performance characteristics, then let's make sure to include that in the description of the class. For example, ReLU is outstanding because it is the simplest activation function in the group. Maybe there's another activation function that has an associated kernel that is particularly optimized, etc.
I agree, we can remove this section entirely; it was too subjective.
So, should we remove them entirely?
Yes, remove the section, but if there is something unique about a particular function's performance characteristics, we can put that info in the description. What do you think?
*/
public function activate(NDArray $input) : NDArray
{
    // Calculate |x|
I don't feel that these comments provide enough value to justify their existence. I can understand what is going on clearly given your great usage of variables and naming.
Will be removed
Activation implementations
Swapped out custom Tensor code for NumPower APIs across all functions: ReLU, LeakyReLU, ELU, GELU, HardSigmoid, SiLU, Tanh, Sigmoid, Softmax, Softplus, Softsign, ThresholdedReLU, etc.
Updated derivative methods to use NumPower's derivative helpers (an illustrative sketch follows below).
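As a rough illustration of the approach, here is a minimal sketch of what a NumPower-backed ReLU could look like. The `ReLU` class shape, the `differentiate()` signature, and the assumption that `NDArray::maximum()` and `NDArray::greater()` accept a scalar operand and broadcast it are for illustration only, not code taken from this PR.

```php
<?php

/**
 * Minimal sketch only. Assumes NDArray::maximum() and NDArray::greater()
 * accept a scalar second operand and broadcast it element-wise.
 */
class ReLU
{
    /**
     * Apply the activation element-wise: max(0, x).
     */
    public function activate(NDArray $input) : NDArray
    {
        return NDArray::maximum($input, 0.0);
    }

    /**
     * Derivative of ReLU: 1 where x > 0, 0 elsewhere.
     */
    public function differentiate(NDArray $input) : NDArray
    {
        return NDArray::greater($input, 0.0);
    }
}
```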
Tests
Refactored unit tests to assert against NumPower outputs (see the example after this list).
Adjusted tolerances and assertions to match NumPower's numeric behavior.
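To show what the tolerance-based assertions look like in practice, here is a hypothetical PHPUnit case; the `ReLU` class above, `NDArray::array()`, and `toArray()` are assumptions used for illustration rather than code from the PR.

```php
<?php

use PHPUnit\Framework\TestCase;

/**
 * Hypothetical test sketch. Assumes NDArray::array() builds an NDArray
 * from a PHP list and toArray() converts it back to a plain array.
 */
class ReLUTest extends TestCase
{
    public function testActivate() : void
    {
        $input = NDArray::array([-2.0, 0.0, 3.5]);

        $expected = [0.0, 0.0, 3.5];

        $output = (new ReLU())->activate($input)->toArray();

        // Compare element-wise with a small delta to absorb float32 rounding.
        foreach ($expected as $i => $value) {
            $this->assertEqualsWithDelta($value, $output[$i], 1e-6);
        }
    }
}
```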
Documentation
Added/updated images under docs/images/activation-functions/ to illustrate each activation curve and its derivative using the new implementations.
Cleaned up corresponding markdown to reference the updated diagrams.
Code cleanup
Aligned naming conventions and method signatures with NumPower's API.
Minor style fixes (whitespace, imports, visibility).