## Description
So all the pieces are now in place, and we can move from the current pattern using implicit params:

```julia
using Flux

ps = Flux.params(model)
opt = Flux.Optimise.ADAM()
gs = gradient(() -> loss(model(x), y), ps)
Flux.Optimise.update!(opt, ps, gs)
```
to the one using explicit parameters and Optimisers.jl:

```julia
using Flux, Optimisers

opt_state = Optimisers.setup(Optimisers.Adam(), model)
∇model = gradient(m -> loss(m(x), y), model)[1]
opt_state, model = Optimisers.update!(opt_state, model, ∇model)

# or the non-mutating version:
# opt_state, model = Optimisers.update(opt_state, model, ∇model)
```
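For a fully self-contained picture, here is a minimal runnable sketch of the explicit style; the two-layer model, the random `x`/`y` data, and the hand-written MSE loss are made up purely for illustration:

```julia
using Flux, Optimisers

# Toy model, data, and loss, only for illustration.
model = Chain(Dense(4 => 8, relu), Dense(8 => 1))
x, y = rand(Float32, 4, 16), rand(Float32, 1, 16)
loss(ŷ, y) = sum(abs2, ŷ .- y) / length(y)

# Explicit-parameter style: the optimiser state is set up from the model
# itself, and the gradient is taken with respect to the model.
opt_state = Optimisers.setup(Optimisers.Adam(), model)
∇model = gradient(m -> loss(m(x), y), model)[1]
opt_state, model = Optimisers.update!(opt_state, model, ∇model)
```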
## Code
- Upgrade or remove `train!(loss, ::Params, data, ::AbstractOptimiser)` (see the sketch after this list).
- A replacement for iterating over `Flux.params(m)` ... or just fix it, broken in #2054 (Make params non-differentiable, closes #2040 & #2048):

```julia
julia> gradient(m -> (sum(norm, Flux.params(m))), (x=[1,2.0], y=[3.0]))
(nothing,)
```
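For the first two items above, a minimal sketch of what an explicit-parameter training loop and a `params`-free norm penalty could look like; the `explicit_train!` and `penalty` names and their exact signatures are hypothetical, not the final API:

```julia
using Flux, Optimisers

# Hypothetical explicit-style replacement for
# train!(loss, ::Params, data, ::AbstractOptimiser).
# Here `loss(m, x, y)` takes the model as its first argument, and
# `opt_state` comes from Optimisers.setup.
function explicit_train!(loss, model, data, opt_state)
    for (x, y) in data
        ∇model = gradient(m -> loss(m, x, y), model)[1]
        opt_state, model = Optimisers.update!(opt_state, model, ∇model)
    end
    return opt_state, model
end

# One possible substitute for iterating over Flux.params(m) in a norm
# penalty: flatten the trainable parameters with Optimisers.destructure.
penalty(m) = sum(abs2, first(Optimisers.destructure(m)))
```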
## Documentation
- Rewrite https://fluxml.ai/Flux.jl/stable/training/optimisers/ -- done in #2114 (Re-write training docs)
- Better documentation for Optimisers.jl
- Doc improvement for working with custom model types: Optimisers.jl#84
## Examples
- Port model zoo examples -- tag "update"
- Help porting downstream libraries and check there are no surprises
  - GraphNeuralNetworks.jl
@mcabbott @ToucheSir @darsnack feel free to add to this