Improve compilation time #618
Description
Compilation time (swift build) is very slow:
$ time swift build # clean build
[61/61] Linking libTensorFlow.dylib
swift build 506.63s user 5.63s system 253% cpu 3:22.07 total
$ echo "// Test." >> Sources/TensorFlow/Layer.swift # trivial change
$ time swift build # incremental build
[3/3] Linking libTensorFlow.dylib
swift build 80.78s user 0.42s system 99% cpu 1:21.25 total
I'm not sure when exactly it got so bad. Let's try to improve this!
Action items
- Identify compilation hot spots via profiling: pprof or Xcode Instruments.
This document describes Swift compiler performance tips.
Identifying hot spots using profiling tools like pprof or Xcode Instruments seems like a great first step. @marcrasi previously used pprof to generate TensorFlow module compilation flamegraphs; perhaps that work can be polished and open-sourced in this repository.
Type-checking is one big source of slowdown. Here are some sorted results from swift build -Xswiftc -Xfrontend -Xswiftc -debug-time-function-bodies (from Gist):
# Worst offenders, time in milliseconds.
(1767, "global function 'gelu'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Operators/Math.swift:1225:13')
(1767, "global function 'gelu'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Operators/Math.swift:1225:13')
(1767, "global function 'gelu'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Operators/Math.swift:1225:13')
(1866, "instance method 'sha512()'", '/Users/danielzheng/swift-apis/Sources/Tensor/TensorUtilities.swift:149:10')
(2220, "instance method 'update(_:along:)'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Optimizers/MomentumBased.swift:152:17')
(2220, "instance method 'update(_:along:)'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Optimizers/MomentumBased.swift:152:17')
(2220, "instance method 'update(_:along:)'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Optimizers/MomentumBased.swift:152:17')
(2304, "global function 'root'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Operators/Math.swift:1375:13')
(2304, "global function 'root'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Operators/Math.swift:1375:13')
(2304, "global function 'root'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Operators/Math.swift:1375:13')
(2528, "global function 'hingeLoss(predicted:expected:reduction:)'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Loss.swift:109:13')
(2528, "global function 'hingeLoss(predicted:expected:reduction:)'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Loss.swift:109:13')
(2528, "global function 'hingeLoss(predicted:expected:reduction:)'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Loss.swift:109:13')
(3796, "global function 'categoricalHingeLoss(predicted:expected:reduction:)'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Loss.swift:139:13')
(3796, "global function 'categoricalHingeLoss(predicted:expected:reduction:)'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Loss.swift:139:13')
(3796, "global function 'categoricalHingeLoss(predicted:expected:reduction:)'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Loss.swift:139:13')
(4102, "instance method 'update(_:along:)'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Optimizers/MomentumBased.swift:429:17')
(4102, "instance method 'update(_:along:)'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Optimizers/MomentumBased.swift:429:17')
(4102, "instance method 'update(_:along:)'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Optimizers/MomentumBased.swift:429:17')
(4128, "global function 'cosineSimilarity'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Operators/Math.swift:1492:13')
(4128, "global function 'cosineSimilarity'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Operators/Math.swift:1492:13')
(4128, "global function 'cosineSimilarity'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Operators/Math.swift:1492:13')
(15152, "global function 'logCoshLoss(predicted:expected:reduction:)'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Loss.swift:157:13')
(15152, "global function 'logCoshLoss(predicted:expected:reduction:)'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Loss.swift:157:13')
(15152, "global function 'logCoshLoss(predicted:expected:reduction:)'", '/Users/danielzheng/swift-apis/Sources/TensorFlow/Loss.swift:157:13')
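A list like the one above can be produced by collecting the compiler's timing warnings and sorting them. The following is a minimal sketch, assuming the warnings have the form `<file>:<line>:<col>: warning: <declaration> took <N>ms to type-check (limit: ...)`. The names `TypeCheckTiming`, `parseTiming`, and `slowest` are hypothetical helpers for illustration, not part of any existing tool.

```swift
import Foundation

// Hypothetical record for one "took Nms to type-check" warning.
struct TypeCheckTiming: Hashable {
    let milliseconds: Int
    let declaration: String
    let location: String
}

// Parses one warning line; returns nil for lines that do not match the assumed format.
func parseTiming(_ line: String) -> TypeCheckTiming? {
    guard
        let warning = line.range(of: ": warning: "),
        let suffix = line.range(
            of: "ms to type-check", options: .backwards,
            range: warning.upperBound..<line.endIndex),
        let took = line.range(
            of: " took ", options: .backwards,
            range: warning.upperBound..<suffix.lowerBound),
        let ms = Int(line[took.upperBound..<suffix.lowerBound])
    else { return nil }
    return TypeCheckTiming(
        milliseconds: ms,
        declaration: String(line[warning.upperBound..<took.lowerBound]),
        location: String(line[..<warning.lowerBound]))
}

// Returns the slowest declarations first. The same declaration can be
// reported more than once, so duplicates are dropped before sorting.
func slowest(in output: String, limit: Int = 10) -> [TypeCheckTiming] {
    let timings = output.split(separator: "\n").compactMap { parseTiming(String($0)) }
    return Array(Set(timings).sorted { $0.milliseconds > $1.milliseconds }.prefix(limit))
}

// Usage with two sample warning lines.
let sample = """
timing.swift:4:13: warning: global function 'logCoshLoss(predicted:expected:reduction:)' took 7400ms to type-check (limit: 1ms)
timing.swift:15:13: warning: global function 'logCoshLossTest(predicted:expected:reduction:)' took 134ms to type-check (limit: 1ms)
"""
for timing in slowest(in: sample) {
    print(timing.milliseconds, timing.declaration, timing.location)
}
```

In practice the input would be the captured output of the build rather than a string literal.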
Idea from @allevato: provide a contextual type for literal expressions to help the type checker.
It works:
import TensorFlow

@differentiable
public func logCoshLoss<Scalar: TensorFlowFloatingPoint>(
    predicted: Tensor<Scalar>,
    expected: Tensor<Scalar>,
    reduction: @differentiable (Tensor<Scalar>) -> Tensor<Scalar> = _mean
) -> Tensor<Scalar> {
    let x = predicted - expected
    // Original code.
    return reduction(x + softplus(Tensor(-2) * x) - log(Tensor(2)))
}

@differentiable
public func logCoshLossTest<Scalar: TensorFlowFloatingPoint>(
    predicted: Tensor<Scalar>,
    expected: Tensor<Scalar>,
    reduction: @differentiable (Tensor<Scalar>) -> Tensor<Scalar> = _mean
) -> Tensor<Scalar> {
    let x = predicted - expected
    // Tony's suggestion: provide contextual type for literals.
    return reduction(x + softplus(Tensor(-2 as Scalar) * x) - log(Tensor(2 as Scalar)))
}
$ swift -Xfrontend -debug-time-function-bodies timing.swift
timing.swift:4:13: warning: global function 'logCoshLoss(predicted:expected:reduction:)' took 7400ms to type-check (limit: 1ms)
public func logCoshLoss<Scalar: TensorFlowFloatingPoint>(
^
timing.swift:15:13: warning: global function 'logCoshLossTest(predicted:expected:reduction:)' took 134ms to type-check (limit: 1ms)
public func logCoshLossTest<Scalar: TensorFlowFloatingPoint>(
^
I believe this should not be necessary, and may be a deficiency in the Swift type checker, specifically in bidirectional type-checking and constraint solving. Whenever a contextual type exists, constraints should propagate from the outside in, so Scalar is the only possible type for the literals 2 and -2 in logCoshLoss. If constraint solving instead starts from the type variables for 2 and -2 (which have many possible types), a huge disjunction constraint may be generated, leading to a big slowdown.
It would be nice to write a Swift forums question with a minimal reproducer of similar bad type-checking performance for literals with contextual type.
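As a possible starting point for such a reproducer, here is a minimal sketch that shows the two spellings side by side without depending on TensorFlow. `Box` is a hypothetical stand-in for `Tensor`, and `implicitLiterals` / `explicitLiterals` are made-up names. Whether this small example actually shows a measurable type-checking difference is untested; it only isolates the pattern.

```swift
// Hypothetical stand-in for Tensor<Scalar>: a generic wrapper with arithmetic operators.
struct Box<Scalar: BinaryFloatingPoint> {
    var value: Scalar
    init(_ value: Scalar) { self.value = value }

    static func + (lhs: Box, rhs: Box) -> Box { Box(lhs.value + rhs.value) }
    static func - (lhs: Box, rhs: Box) -> Box { Box(lhs.value - rhs.value) }
    static func * (lhs: Box, rhs: Box) -> Box { Box(lhs.value * rhs.value) }
}

// Literal types are left for the type checker to infer from the surrounding expression.
func implicitLiterals<Scalar: BinaryFloatingPoint>(_ x: Box<Scalar>) -> Box<Scalar> {
    return x + Box(-2) * x - Box(2)
}

// Each literal is given an explicit contextual type with `as Scalar`.
func explicitLiterals<Scalar: BinaryFloatingPoint>(_ x: Box<Scalar>) -> Box<Scalar> {
    return x + Box(-2 as Scalar) * x - Box(2 as Scalar)
}

// Both spellings compute the same value; only the work done by the type checker may differ.
let input = Box(3.0)
print(implicitLiterals(input).value, explicitLiterals(input).value)
```

Compiling a file like this with the timing flag shown earlier would show whether the two functions differ in type-checking time on a given toolchain.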
Suggestion from @rxwei: splitting larger files into more, smaller files can help multithreaded compilation (swift build), since one thread can be spawned per file. Some files like Sources/TensorFlow/Operators/Math.swift (currently 2835 lines) are huge and can be split.
Note that we have one huge generated Swift file for TensorFlow bindings: Sources/TensorFlow/Bindings/RawOpsGenerated.swift (currently 36743 lines). That probably takes a while to compile.
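To find other candidates for splitting, the source files can be ranked by line count. The following is a rough sketch using Foundation. `newlineCount` and `swiftFilesByLineCount` are hypothetical helper names, and the `Sources/TensorFlow` path is assumed to be relative to the directory the script is run from.

```swift
import Foundation

// Counts newline bytes, which matches what `wc -l` reports.
func newlineCount(in text: String) -> Int {
    return text.utf8.filter { $0 == UInt8(ascii: "\n") }.count
}

// Walks `root` recursively and returns .swift files sorted by line count, largest first.
// Returns an empty array if the directory cannot be enumerated.
func swiftFilesByLineCount(under root: String) -> [(path: String, lines: Int)] {
    guard let enumerator = FileManager.default.enumerator(atPath: root) else { return [] }
    var results: [(path: String, lines: Int)] = []
    for case let relativePath as String in enumerator where relativePath.hasSuffix(".swift") {
        let fullPath = URL(fileURLWithPath: root).appendingPathComponent(relativePath).path
        // Skip files that cannot be read as UTF-8 text.
        guard let text = try? String(contentsOfFile: fullPath, encoding: .utf8) else { continue }
        results.append((path: relativePath, lines: newlineCount(in: text)))
    }
    return results.sorted { $0.lines > $1.lines }
}

// Usage: print the five largest Swift files under the assumed source directory.
for entry in swiftFilesByLineCount(under: "Sources/TensorFlow").prefix(5) {
    print(entry.lines, entry.path)
}
```

Line count is only a proxy for compile time, so the results are best read together with the per-function timings.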