Collaborator

stweil commented Mar 10, 2022

I am currently reworking the build process to support more modern Linux distributions and Python versions. Ideally, the full test matrix of Ubuntu LTS versions and Python versions from 3.6 to 3.10 should build fine.

My changes address several issues:

  • Check early whether the required TensorFlow versions are available and skip those submodules which cannot be built because TensorFlow is missing.
  • Use different rules for Python 3.6 and 3.10 when building ocrd_kraken.
  • Disable ocrd_cis unconditionally because it requires an old calamari_ocr, which causes version conflicts.
  • Address potential version conflicts for Python modules which are required by different OCR-D processors by using constraints.
  • Reduce the number of sub_venvs by installing all modules which work with a recent TensorFlow in the main venv.
  • Install the Python module wheel early in all virtual environments, so it is no longer needed as a dependency.
  • ... and other changes.

This PR is not meant for merging, but for discussing the different aspects with the goal of finding a consensus on the right solution.

stweil marked this pull request as draft March 10, 2022 11:08

Collaborator Author

stweil commented Mar 10, 2022

I wonder whether we really need export PIP ?= pip3 and would prefer to replace $(PIP) by a simple pip. @bertsky, you added that macro in commit 7807ae6. Are there use cases where it is helpful? It is also exported. Are there submodules which need it? I found core and cor-asv-ann which use PIP, but it looks like they also don't need it.

Member

kba commented Mar 16, 2022

> I wonder whether we really need export PIP ?= pip3 and would prefer to replace $(PIP) by a simple pip

We don't need it any more; we did need it when we still had to support Python 2.7 under certain circumstances. In fact, using anything but pip/pip3 for $(PIP) could cause inconsistencies if people override it with a pip that belongs to a different minor version of Python than the venv.
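
To illustrate the inconsistency described above, here is a minimal shell sketch (not part of this PR) that compares the Python version reported by a pip executable with the version of a given interpreter. The helper names pip_python_version and check_pip_matches are hypothetical, and the parsing assumes the usual `pip --version` output, which ends in "(python X.Y)".

```shell
#!/bin/sh
# Sketch only: the helper names are hypothetical and not part of ocrd_all.

# Print the Python version ("X.Y") found in a `pip --version` output line,
# e.g. "pip 22.0.4 from /venv/lib/python3.10/site-packages/pip (python 3.10)".
pip_python_version() {
    printf '%s\n' "$1" | sed -n 's/.*(python \([0-9][0-9]*\.[0-9][0-9]*\)).*/\1/p'
}

# Succeed only if the given pip command belongs to the same Python minor
# version as the given interpreter (for example "$VIRTUAL_ENV/bin/python").
check_pip_matches() {
    pip_cmd=$1
    python_cmd=$2
    want=$("$python_cmd" -c 'import sys; print("%d.%d" % sys.version_info[:2])')
    have=$(pip_python_version "$("$pip_cmd" --version)")
    [ "$want" = "$have" ]
}

# Example call (commented out because it depends on the local setup):
# check_pip_matches pip3 "$VIRTUAL_ENV/bin/python" || echo "pip does not match the venv" >&2
```

Calling pip as `python -m pip` with the venv's interpreter avoids the mismatch altogether, because pip then always runs under that interpreter.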

Collaborator Author

stweil commented Mar 16, 2022

Now 3 of 5 Python versions complete the build: https://github.com/stweil/ocrd_all/actions/runs/1993530591.

The remaining failures occur when building ocrd_kraken (see pull request #288 which I included here) and might be fixed in the CI running now.

ACTIVATE_VENV = $(VIRTUAL_ENV)/bin/activate

ifeq (0, $(MAKELEVEL))
ifeq ($(MAKECMDGOALS), all)

Collaborator Author

TODO: also support builds with several targets, for example make all check.
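
One possible way to handle several goals (a sketch, not what this PR currently does) is GNU make's $(filter ...) function, which tests whether a word occurs anywhere in $(MAKECMDGOALS) instead of comparing the whole list with a single name. The shell snippet below writes a throwaway demo Makefile and runs it. The function name goals_demo and the demo Makefile are hypothetical; only MAKECMDGOALS and CHECK_SUBENVS come from the code under review.

```shell
#!/bin/sh
# Sketch only: goals_demo and the temporary Makefile are made up for illustration.
goals_demo() {
    demo_mk=$(mktemp) || return 1
    cat > "$demo_mk" <<'EOF'
# True if "all" is one of the goals, e.g. for "make all" and "make all check".
ifneq ($(filter all,$(MAKECMDGOALS)),)
CHECK_SUBENVS := true
endif
$(info CHECK_SUBENVS=$(CHECK_SUBENVS))
# Dummy targets with a no-op recipe so that the goals exist.
.PHONY: all check
all check: ; @:
EOF
    make -s -f "$demo_mk" "$@"
    demo_status=$?
    rm -f "$demo_mk"
    return "$demo_status"
}

# Requires GNU make.
if command -v make >/dev/null 2>&1; then
    goals_demo all check    # prints CHECK_SUBENVS=true
    goals_demo check        # prints CHECK_SUBENVS=
fi
```

Note that MAKECMDGOALS is empty for a plain make without arguments, even when the default goal is all, so that case would still need separate handling.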

ifeq ($(MAKECMDGOALS), all)
CHECK_SUBENVS := true
endif
ifeq ($(MAKECMDGOALS), check)

Collaborator Author

TODO: also support builds with several targets, for example make all check.

create_venv := $(shell $(PYTHON) -m venv $(SUB_VENV)/headless-tf21 && bash -c "source $(SUB_VENV)/headless-tf21/bin/activate && pip install -U pip setuptools wheel")
endif

# Try to install different versions of Tensorflow.

Collaborator Author

TODO: This takes some time when the installation is actually performed (or attempted), so a log message would be useful.
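
A minimal sketch of such a log message, assuming the installation attempt is started from a shell: the hypothetical wrapper log_step announces the step on stderr, runs the command, and then reports the elapsed time and the exit status. Inside the Makefile itself, GNU make's $(info ...) function could print a message before the $(shell ...) call instead; the wrapper is only one option.

```shell
#!/bin/sh
# Sketch only: log_step is a hypothetical helper, not part of ocrd_all.

# Usage: log_step MESSAGE COMMAND [ARGS...]
# Logs to stderr so that the output of the command on stdout stays untouched.
log_step() {
    log_step_message=$1
    shift
    log_step_start=$(date +%s)
    printf '>>> %s ...\n' "$log_step_message" >&2
    log_step_status=0
    "$@" || log_step_status=$?
    log_step_end=$(date +%s)
    printf '>>> %s: done in %ss (exit status %s)\n' "$log_step_message" \
        "$((log_step_end - log_step_start))" "$log_step_status" >&2
    return "$log_step_status"
}

# Example call (commented out because it creates a venv):
# log_step "Creating sub-venv headless-tf21" python3 -m venv "$SUB_VENV/headless-tf21"
```

The wrapper returns the exit status of the wrapped command, so a failed installation attempt can still be detected by the caller.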

Collaborator Author

stweil commented Mar 19, 2022

Now all builds pass.

Collaborator Author

stweil commented Mar 23, 2022

The updated branch still passes for all builds: https://github.com/stweil/ocrd_all/actions/runs/2027885009.

stweil force-pushed the tensorflow branch 2 times, most recently from 31335e4 to 3d2605a March 23, 2022 15:47

else
cd $< ; $(MAKE) patch-pix2pixhd
# Hack for Python 3.10 which fails to install ocrd-fork-pylsd 0.0.4 from PyPI.
. $(ACTIVATE_VENV) && pip install git+https://github.com/kba/pylsd.git

Collaborator Author

stweil commented Mar 23, 2022

@kba, @bertsky, could you have a look at why pip3.10 install ocrd-fork-pylsd fails while pip3.10 install git+https://github.com/kba/pylsd.git works fine? I get this build error:

  gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Isource/include -I/home/stweil/src/github/OCR-D/venv3.10-20220323/include -I/usr/local/include/python3.10 -c source/src/lsd.cpp -o build/temp.linux-x86_64-3.10/source/src/lsd.o
  source/src/lsd.cpp:77:10: fatal error: lsd.h: No such file or directory
     77 | #include <lsd.h>
        |          ^~~~~~~
  compilation terminated.

Member

I tried to build a manylinux binary wheel for 3.10. Can you try

pip install --only-binary :all: ocrd-fork-pylsd

and see if that works for you?

Collaborator Author

That works, and pip3.10 install ocrd-fork-pylsd works now, too.

Collaborator Author

stweil commented Mar 24, 2022

Latest run passes all builds: https://github.com/stweil/ocrd_all/actions/runs/2029562843.

stweil added 20 commits March 31, 2022 15:56
Signed-off-by: Stefan Weil <[email protected]>
They are required because of new package conflicts for Python < 3.10.

Signed-off-by: Stefan Weil <[email protected]>

Collaborator Author

stweil commented Mar 31, 2022

@bertsky, @kba, I compared the build rules here with those from git master and wrote the complete make all results into two separate virtual environments. The result shows that my build rules reduce the disk usage significantly (from 11605 MB to 6876 MB, about 41%), mainly by reducing the total number of required virtual environments from four to two (one main venv, one sub venv). That should also result in faster builds and smaller Docker images, and so fix the time problems in CircleCI, where the image uploads account for a significant part of the run time.

stweil@ocr-02:~/src/github/OCR-D$ du -msc venv3.7-20220331-*
11605	venv3.7-20220331-master
6876	venv3.7-20220331-tensor
18480	total
stweil@ocr-02:~/src/github/OCR-D$ du -msc venv3.7-20220331-*/*
74	venv3.7-20220331-master/bin
594	venv3.7-20220331-master/build
2	venv3.7-20220331-master/build.log
11	venv3.7-20220331-master/include
2985	venv3.7-20220331-master/lib
1	venv3.7-20220331-master/lib64
3	venv3.7-20220331-master/libexec
1	venv3.7-20220331-master/locale
1	venv3.7-20220331-master/pyvenv.cfg
1	venv3.7-20220331-master/requirements.txt
103	venv3.7-20220331-master/share
7837	venv3.7-20220331-master/sub-venv
74	venv3.7-20220331-tensor/bin
594	venv3.7-20220331-tensor/build
1	venv3.7-20220331-tensor/build.log
11	venv3.7-20220331-tensor/include
4006	venv3.7-20220331-tensor/lib
1	venv3.7-20220331-tensor/lib64
3	venv3.7-20220331-tensor/libexec
1	venv3.7-20220331-tensor/locale
1	venv3.7-20220331-tensor/pyvenv.cfg
1	venv3.7-20220331-tensor/requirements.txt
103	venv3.7-20220331-tensor/share
2087	venv3.7-20220331-tensor/sub-venv
18480	total
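
The savings can be recomputed from the du output above with a small shell sketch. The helper name reduction is hypothetical; the numbers are the megabyte values printed by du.

```shell
#!/bin/sh
# Sketch only: the helper name is hypothetical; the input values are the
# megabyte numbers from the du output.

# Print the absolute and relative reduction from $1 (before) to $2 (after).
reduction() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%d MB saved (%.1f%%)\n", before - after, (before - after) * 100 / before }'
}

reduction 11605 6876   # whole venv: 4729 MB saved (40.7%)
reduction 7837 2087    # sub-venv:   5750 MB saved (73.4%)
```

The lib directory of the main venv grows from 2985 MB to 4006 MB, which matches the approach of installing more modules in the main venv, but the shrinking sub-venv directory outweighs that growth by far.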

Collaborator Author

stweil commented Apr 2, 2022

Now I also have timing and size results for a local make docker-maximum-cuda-git (single-threaded):

OCR-D:master
0.60user 0.55system 56:52.93elapsed 0%CPU (0avgtext+0avgdata 64456maxresident)k
3848inputs+3024outputs (39major+10597minor)pagefaults 0swaps
ocrd/all                  maximum-cuda-git   d737c47c1285   2 hours ago     32.8GB

stweil:tensorflow
0.71user 0.58system 45:54.85elapsed 0%CPU (0avgtext+0avgdata 67572maxresident)k
2280inputs+2712outputs (5major+11208minor)pagefaults 0swaps
ocrd/all                  maximum-cuda-git   a8fc92d9f811   2 minutes ago   23.7GB
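
For a quick comparison of the two runs, the following shell sketch converts the elapsed times to seconds and computes the relative savings. The helper name elapsed_to_seconds is hypothetical, and it assumes the m:ss.cc format shown above (longer runs would be reported as h:mm:ss and need extra handling).

```shell
#!/bin/sh
# Sketch only: the helper name is hypothetical; the values are taken from the results above.

# Convert an elapsed time in "m:ss.cc" format to whole seconds (fraction dropped).
elapsed_to_seconds() {
    printf '%s\n' "$1" | awk -F: '{ printf "%d\n", $1 * 60 + $2 }'
}

master=$(elapsed_to_seconds 56:52.93)   # 3412
branch=$(elapsed_to_seconds 45:54.85)   # 2754
saved=$((master - branch))
echo "build time: ${saved}s saved ($((saved * 100 / master))%)"
# prints: build time: 658s saved (19%)

# Image sizes in GB from the image listing.
awk 'BEGIN { printf "image size: %.1f GB saved (%.1f%%)\n", 32.8 - 23.7, (32.8 - 23.7) * 100 / 32.8 }'
# prints: image size: 9.1 GB saved (27.7%)
```

In other words, the branch builds about 11 minutes faster and produces an image that is about 9 GB smaller in this single-threaded local run.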

Collaborator Author

stweil commented Jun 16, 2023

This proof of concept is no longer up to date, so it can be closed.

stweil closed this Jun 16, 2023
stweil deleted the tensorflow branch June 16, 2023 20:38