@graemenail graemenail commented Sep 14, 2022

TODO

  • Manual check of training / scoring

Description

Adds oneDNN. This PR should enable a completely open-source compilation of Marian.

In contrast to #937, this PR also allows MKL-based builds. When compiled with -DUSE_DNNL=ON, oneDNN is used for sgemm, even if -DUSE_MKL=ON is requested.
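The precedence described above can be sketched as a small compile-time selection. This is an illustration only, not code from this PR: the enum, the function name `selectSgemmBackend`, and the generic-BLAS fallback are assumptions. Only the rule that `USE_DNNL` takes priority over `USE_MKL` comes from the description.

```cpp
// Hypothetical sketch of the sgemm backend precedence; not Marian's actual code.
enum class SgemmBackend { OneDNN, MKL, GenericBLAS, None };

// useDnnl / useMkl mirror the -DUSE_DNNL / -DUSE_MKL CMake options.
// blasFound stands in for "some other BLAS was detected" (an assumption here).
constexpr SgemmBackend selectSgemmBackend(bool useDnnl, bool useMkl, bool blasFound) {
  return useDnnl   ? SgemmBackend::OneDNN       // oneDNN wins whenever it is enabled
       : useMkl    ? SgemmBackend::MKL          // MKL is used only when oneDNN is off
       : blasFound ? SgemmBackend::GenericBLAS  // assumed fallback, not stated in the PR
                   : SgemmBackend::None;
}

// Requesting both backends still selects oneDNN for sgemm.
static_assert(selectSgemmBackend(true, true, true) == SgemmBackend::OneDNN,
              "USE_DNNL takes priority over USE_MKL");
// An MKL-only build keeps using MKL.
static_assert(selectSgemmBackend(false, true, true) == SgemmBackend::MKL,
              "MKL is retained when oneDNN is disabled");
```

In this sketch, enabling `USE_MKL` alongside `USE_DNNL` does not change the sgemm backend; MKL remains available to whatever other code paths need it.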

This PR also adds caching of Boost and cleans up the debug build directory in CI. These changes can be split off into a separate PR if necessary. During testing, Windows builds would fail after running out of disk space when building both debug and release. Since oneDNN JIT profiling was disabled, the build sizes have been smaller.

Related: marian-nmt/marian-regression-tests#86
Closes: #706 (Consider oneDNN instead of MKL for SGEMM)
Supersedes: #937

List of changes:

  • Adds oneDNN as a submodule
  • Uses oneDNN for sgemm
  • Retains MKL for backwards compatibility
  • LSH (with rotation) now explicitly requires MKL, to avoid silently falling back to less performant code paths. For the record, BLAS_FOUND alone is sufficient here; a user willing to accept a performance degradation could omit the MKL_FOUND condition.
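The LSH guard from the last item can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the code in this PR: the constant `kLshRotationEnabled` is hypothetical, and the default macro values below merely stand in for whatever the build system would define.

```cpp
// Stand-in defaults for illustration; in a real build these would come from CMake.
#ifndef BLAS_FOUND
#define BLAS_FOUND 1  // assumed: some BLAS implementation was detected
#endif
#ifndef MKL_FOUND
#define MKL_FOUND 0   // assumed: MKL was not detected
#endif

// LSH (with rotation) is enabled only when MKL is present, even though
// BLAS_FOUND alone would be enough to compile a slower path.
#if BLAS_FOUND && MKL_FOUND
constexpr bool kLshRotationEnabled = true;   // hypothetical name
#else
constexpr bool kLshRotationEnabled = false;  // explicit opt-out instead of a silent slow path
#endif
```

Dropping `&& MKL_FOUND` from the condition in this sketch corresponds to the opt-in described above: the feature would then build against any BLAS, at the cost of performance.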

Added dependencies: Intel oneDNN

How to test

Ran the 1-million-sentence WNGT21 test set through both the MKL and oneDNN builds.

The static binaries are now larger.

Regression test results
Skipped:
- tests/server/test_ende_cpu.sh
Failed:
- tests/decoder/intgemm/test_intgemm_16bit.sh
- tests/decoder/intgemm/test_intgemm_16bit_avx2.sh
- tests/decoder/intgemm/test_intgemm_16bit_sse2.sh
- tests/decoder/intgemm/test_intgemm_8bit.sh
- tests/decoder/intgemm/test_intgemm_8bit_avx2.sh
- tests/decoder/intgemm/test_intgemm_8bit_ssse3.sh
- tests/models/wngt19/test_model_base_fbgemm_packed8.sh
Logs:
- /home/gnail/projects/mkl-onednn/marian-dev/regression-tests/tests/decoder/intgemm/test_intgemm_16bit.sh.log
- /home/gnail/projects/mkl-onednn/marian-dev/regression-tests/tests/decoder/intgemm/test_intgemm_16bit_avx2.sh.log
- /home/gnail/projects/mkl-onednn/marian-dev/regression-tests/tests/decoder/intgemm/test_intgemm_16bit_sse2.sh.log
- /home/gnail/projects/mkl-onednn/marian-dev/regression-tests/tests/decoder/intgemm/test_intgemm_8bit.sh.log
- /home/gnail/projects/mkl-onednn/marian-dev/regression-tests/tests/decoder/intgemm/test_intgemm_8bit_avx2.sh.log
- /home/gnail/projects/mkl-onednn/marian-dev/regression-tests/tests/decoder/intgemm/test_intgemm_8bit_ssse3.sh.log
- /home/gnail/projects/mkl-onednn/marian-dev/regression-tests/tests/models/wngt19/test_model_base_fbgemm_packed8.sh.log
---------------------
Ran 19 tests in 00:00:0.000s, 11 passed, 1 skipped, 7 failed
FAILED

Checklist

  • I have tested the code manually
  • I have run regression tests
  • I have read and followed CONTRIBUTING.md
  • I have updated CHANGELOG.md

@graemenail graemenail marked this pull request as ready for review November 14, 2022 17:03