When looking at stacktraces from Flux models, it's not uncommon to see types that fill a screen or more. This is especially noticeable with large, nested models such as those found in Metalhead. One approach that has been tried to reduce the printing noise is https://github.com/FluxML/Zygote.jl/blob/master/src/compiler/show.jl, which purposefully leaves out certain type parameters. I think it would be worthwhile to explore a similar approach for Flux container layers. We could turn something like this:
Flux.Chain{Tuple{
  Flux.Chain{Tuple{
    Flux.Chain{Tuple{
      Flux.Conv{2, 4, typeof(identity), Array{Float32, 4}, Vector{Float32}},
      typeof(Metalhead.Layers._flatten_spatial),
      typeof(identity)
    }},
    Metalhead.Layers.ClassTokens{Array{Float32, 3}},
    Metalhead.Layers.ViPosEmbedding{Matrix{Float32}},
    Flux.Dropout{Float64, Colon, Random.TaskLocalRNG},
    Flux.Chain{Vector{Flux.Chain{Tuple{
      Flux.SkipConnection{Flux.Chain{Tuple{
        Flux.LayerNorm{typeof(identity), Flux.Scale{typeof(identity), Vector{Float32}, Vector{Float32}}, Float32, 1},
        Metalhead.Layers.MHAttention{
          Flux.Dense{typeof(identity), Matrix{Float32}, Bool}, Flux.Dropout{Float64, Colon, Random.TaskLocalRNG},
          Flux.Chain{Tuple{
            Flux.Dense{typeof(identity), Matrix{Float32}, Vector{Float32}},
            Flux.Dropout{Float64, Colon, Random.TaskLocalRNG}
          }}
        }
      }}, typeof(+)},
      Flux.SkipConnection{Flux.Chain{Tuple{
        Flux.LayerNorm{typeof(identity), Flux.Scale{typeof(identity), Vector{Float32}, Vector{Float32}}, Float32, 1},
        Flux.Chain{Tuple{
          Flux.Dense{typeof(NNlib.gelu), Matrix{Float32}, Vector{Float32}},
          Flux.Dropout{Float64, Colon, Random.TaskLocalRNG},
          Flux.Dense{typeof(identity), Matrix{Float32}, Vector{Float32}},
          Flux.Dropout{Float64, Colon, Random.TaskLocalRNG}
        }}
      }}, typeof(+)}
    }}}},
    Metalhead.var"#120#121"
  }},
  Flux.Chain{Tuple{
    Flux.LayerNorm{typeof(identity), Flux.Scale{typeof(identity), Vector{Float32}, Vector{Float32}}, Float32, 1},
    Flux.Dense{typeof(NNlib.tanh_fast), Matrix{Float32}, Vector{Float32}}
  }}
}}
(note, manually formatted!)
into one of these (other combinations of braces and ellipses are welcome):
Flux.Chain(Flux.Chain, Flux.Chain)
Flux.Chain(Flux.Chain(..), Flux.Chain(..))
Flux.Chain(Flux.Chain{..}, Flux.Chain{..})
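
For concreteness, here is a rough sketch of how the first variant could be produced by overloading `Base.show` for the container's type. It does not use Flux itself: `MyChain`, `MyDense` and `shortname` are hypothetical stand-ins invented for this example, and the overload is only one possible way to elide parameters, not what Zygote or Flux actually does.

```julia
# Hypothetical stand-ins for Flux.Chain and Flux.Dense (NOT Flux's definitions).
struct MyChain{T}
    layers::T
end

struct MyDense{F,M,B}
    act::F
    weight::M
    bias::B
end

# Hypothetical helper: print just the type's name, dropping its parameters.
# Anything that is not a plain DataType falls back to its normal printing.
shortname(P) = P isa DataType ? string(nameof(P)) : string(P)

# Overload how the *type* MyChain{Tuple{...}} is printed: list only the
# names of the direct children instead of their full parameter lists.
function Base.show(io::IO, ::Type{MyChain{T}}) where {T<:Tuple}
    print(io, "MyChain(")
    join(io, (shortname(P) for P in T.parameters), ", ")
    print(io, ")")
end

dense() = MyDense(identity, zeros(Float32, 2, 2), zeros(Float32, 2))
m = MyChain((MyChain((dense(), dense())), dense()))

println(typeof(m))  # prints: MyChain(MyChain, MyDense)
```

The overload only changes how the type is displayed, not the type itself, so dispatch is unaffected. The trade-off is that the elided parameters are no longer visible wherever the type is printed, which is presumably why the exact amount of detail to keep is worth discussing.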
Thoughts?