Type instabilities lead to insane number of CPU allocations on grouped convolutions #520

Description

@mashu
Contributor
Julia 1.9.1
  [052768ef] CUDA v4.4.0
  [872c559c] NNlib v0.9.1

Here is the MWE

using CUDA
using NNlib

function mwe()
    channels = 256
    x = rand(Float32,1024, channels, 64)
    w = rand(Float32,2, 1, channels)
    @info "NNlib.conv"
    NNlib.conv(x, w, groups=channels);
    @time NNlib.conv(x, w, groups=channels);
    @info "NNlib.depthwiseconv"
    NNlib.depthwiseconv(x,w);
    @time NNlib.depthwiseconv(x,w);
    @info "Done"
end

Output from running the above twice:

julia> DepthwiseMWE.mwe()
[ Info: NNlib.conv
  0.031946 seconds (12.84 k allocations: 82.142 MiB, 10.82% gc time)
[ Info: NNlib.depthwiseconv
  0.032803 seconds (70 allocations: 79.931 MiB, 19.57% gc time)
[ Info: Done

julia> DepthwiseMWE.mwe()
[ Info: NNlib.conv
  0.031491 seconds (12.84 k allocations: 82.142 MiB, 30.70% gc time)
[ Info: NNlib.depthwiseconv
  0.029980 seconds (69 allocations: 79.931 MiB, 18.81% gc time)
[ Info: Done

Expected result: ~70 CPU allocations, not ~12,840 (12.84 k). In a deeper network this puts considerable pressure on the GC and kills performance.

I tried depthwiseconv in my code, but it has another problem: it's not GPU friendly.

So the options are either making depthwiseconv GPU friendly or fixing the insane allocations of conv.

Activity

mashu commented on Jul 5, 2023

Contributor, Author

Closing: there is no issue when running on GPU.

x_d = CUDA.CuArray(x)
w_d = CUDA.CuArray(w)
CUDA.@time Flux.conv(x_d, w_d, groups=256);

 0.002870 seconds (127 CPU allocations: 5.984 KiB) (1 GPU allocation: 127.875 MiB, 0.46% memmgmt time)

Title changed from "Depthwise convolutions lead to insane number of CPU allocations or GPU version broken" to "Type instabilities lead to insane number of CPU allocations on grouped convolutions" on Jul 5, 2023

ToucheSir commented on Jul 5, 2023

Member

Since the MWE has a lot of useful information, I'm taking the liberty of reopening this with a different focus.
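
For anyone picking this up, here is a minimal sketch of how the CPU path could be probed for type instability. It is not taken from the thread. It assumes only that `NNlib.conv` accepts the `groups` keyword as in the MWE above; the wrapper name `grouped` and the smaller array shapes are made up for illustration. `@inferred` and `@code_warntype` only look at the outermost call, so an instability deeper in the call stack may not show up here.

```julia
using NNlib
using Test               # provides @inferred
using InteractiveUtils   # provides @code_warntype (loaded automatically in the REPL)

# Smaller shapes than the MWE, same layout: (width, channels, batch)
channels = 8
x = rand(Float32, 64, channels, 2)
w = rand(Float32, 2, 1, channels)

# Hypothetical wrapper so the keyword argument is passed inside a function
grouped(x, w, g) = NNlib.conv(x, w; groups=g)

# Warm-up call so compilation is not included in the measurement
y = grouped(x, w, channels)

# Bytes allocated on the CPU by a single call after warm-up
bytes = @allocated grouped(x, w, channels)
@info "grouped conv" outsize=size(y) bytes

# @inferred throws if the actual return type does not match the inferred one
stable = try
    @inferred grouped(x, w, channels)
    true
catch
    false
end
@info "outer return type inferred?" stable

# Prints the typed code for the outer call; non-concrete types are highlighted
@code_warntype grouped(x, w, channels)
```

If `stable` comes back `true` while the allocation count stays high, the next step would probably be to repeat the same checks on the internal functions that `conv` dispatches to for `groups > 1`.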

Participants: @mashu, @ToucheSir, @mcabbott

Issue #520 · FluxML/NNlib.jl