bertsky commented Feb 19, 2022

On Python 3.8, loading the existing HDF5 models for the TensorFlow processors tiseg and layout-analysis fails with errors.

However, TensorFlow offers a more stable alternative: SavedModel directories. I have converted the existing models and adapted the code to make them runnable again.

Now, how do we redistribute these? I have uploaded them as tarballs here and here. But really they should go to https://ocr-d-repo.scc.kit.edu/models/dfki as well.

As soon as we get OCR-D/core#800 done, we should then be able to update the resource list in ocrd-tool.json, right?

Another dependency is in the processors using ocrolib.morph, i.e. nlbin and textline: OCR-D/ocropy#2. @kba, as soon as you have merged and published ocrd-fork-ocropy==1.4.0a4, this is ready to go.

Robert Sachunsky added 4 commits February 19, 2022 03:40
- move model loading into `setup` in constructor context
- allow directories as models (TF SavedModel format), too
- use correct pageId
- simplify and polish
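
A minimal sketch of how accepting both model formats could look. This is not the code from this PR: the function names `model_format` and `load_any_model` are hypothetical, and the loader assumes a TensorFlow 2.x release that bundles Keras 2, where `tf.keras.models.load_model` accepts either kind of path. The only detection rule used is that a SavedModel is a directory containing `saved_model.pb`.

```python
from pathlib import Path


def model_format(path):
    """Classify a model path as 'savedmodel' (directory) or 'hdf5' (single file)."""
    p = Path(path)
    # A SavedModel is a directory that contains a saved_model.pb file.
    if p.is_dir() and (p / "saved_model.pb").is_file():
        return "savedmodel"
    # Legacy Keras models are single HDF5 files.
    if p.is_file() and p.suffix.lower() in (".h5", ".hdf5"):
        return "hdf5"
    raise ValueError(f"not a recognised model path: {path}")


def load_any_model(path):
    """Load either format (hypothetical helper, assumes TF 2.x with Keras 2)."""
    model_format(path)  # fail early on unrecognised paths
    # Imported lazily so that the path check above works without TensorFlow.
    from tensorflow.keras.models import load_model
    return load_model(str(path))
```

Calling such a loader once during processor setup, rather than once per page, is what keeps the model in memory across pages.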
Robert Sachunsky added 11 commits February 20, 2022 15:22
use custom dataset class for in-memory PIL.Image passing
instead of file-based repurposed `AlignedDataset`
(this is faster and more reliable: OCR-D does not guarantee us
 a `.filename` for derived images; also, this no longer creates
 temporary files in the input fileGrp)
after decoding, convert tensor to array with due respect for
proper channel and dynamic range coding (instead of ad-hoc
conversion); then resize while still in RGB and re-binarize
(instead of ad-hoc binarization followed by resizing in binary)
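
The conversion order described in this commit message can be sketched roughly as follows. This is not the code from the PR: it assumes the generator output is a channel-first (C, H, W) float array in the range [-1, 1] (the usual pix2pixHD normalization), and it uses bicubic resampling and a fixed threshold of 128 as stand-ins. The function name `tensor_to_binary_image` is hypothetical.

```python
import numpy as np
from PIL import Image


def tensor_to_binary_image(chw, size, threshold=128):
    """Convert a (3, H, W) array in [-1, 1] to a bilevel PIL image of `size` (w, h)."""
    # For a torch tensor, pass tensor.detach().cpu().numpy() instead.
    arr = np.asarray(chw, dtype=np.float32)
    # Map the dynamic range [-1, 1] to [0, 255] before casting to 8 bit.
    arr = np.clip((arr + 1.0) / 2.0 * 255.0, 0, 255).round().astype(np.uint8)
    # Channel-first (C, H, W) to channel-last (H, W, C), as PIL expects.
    arr = np.ascontiguousarray(np.transpose(arr, (1, 2, 0)))
    rgb = Image.fromarray(arr)
    # Resize while still in RGB, so interpolation sees the full value range.
    rgb = rgb.resize(size, resample=Image.BICUBIC)
    # Only now reduce to grayscale and re-binarize.
    gray = rgb.convert("L")
    return gray.point(lambda v: 255 if v >= threshold else 0, mode="1")
```

Thresholding last means the interpolated gray values decide where the ink boundary falls, instead of resampling an image that has already been reduced to two levels.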
- rebase on pix2pixHD#293 (CPU-only option, Torch>=1.0,
  less verbose, arg passing)
- pass args to pix2pixHD directly (instead of `sys.argv`
  hijacking)
- no unnecessary verbosity (and only through loggers)
- move model loading into startup context via `setup` fn
- rename params:
  * `imgresize` → `resize_mode`,
  * `resizeHeight` → `resize_height`
  * `resizeWidth` → `resize_width`
- add proper documentation
- fix region-level results
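
To illustrate the difference between passing arguments directly and hijacking `sys.argv`, here is a small sketch. The parser below is a hypothetical stand-in, not pix2pixHD's actual option parser; only the three option names come from the renamed parameters in this commit, and the default values are assumptions.

```python
import argparse


def build_parser():
    # Hypothetical stand-in parser; defaults are assumed for illustration.
    parser = argparse.ArgumentParser(prog="dewarp-sketch")
    parser.add_argument("--resize_mode", default="resize_and_crop")
    parser.add_argument("--resize_height", type=int, default=1024)
    parser.add_argument("--resize_width", type=int, default=1024)
    return parser


def parse_direct(params):
    """Turn a parameter dict into an argument list and parse it explicitly."""
    args = []
    for key, value in params.items():
        args += ["--" + key, str(value)]
    # Passing the list explicitly leaves sys.argv untouched.
    return build_parser().parse_args(args)


opts = parse_direct({"resize_mode": "none", "resize_width": 2048})
print(opts.resize_mode, opts.resize_width, opts.resize_height)
```

Because `parse_args` receives its own list, the command line of the calling processor is never overwritten, and unrelated options cannot leak into the wrapped library.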
(just BIN is not enough / not as good / not realistic)
bertsky changed the title from "Use SavedModel instead of HDF5 format" to "Use SavedModel instead of HDF5 format, fix dewarping" Feb 20, 2022

bertsky commented Feb 20, 2022

Now also depends on NVIDIA/pix2pixHD#293, and contains various other fixes, mostly regarding dewarping.

Fixes #34, #35, #40, #60, #61, #72, #73, #77, #87, #88, and probably #42 (see below – with resize_mode=none).

With better upsampling/re-binarization, the quality of the dewarper has also improved a little. Downsampling in the first place is obviously not a good idea (but that is what the default resize_mode=resize_and_crop does). One can always increase resize_width/resize_height, or use resize_mode=none to get full-size quality at the cost of more memory and time.
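
To get a feeling for that trade-off, one can compare pixel counts. The sketch below rests on two assumptions that are not stated in this PR: that the default target size is 1024×1024, and that memory and time grow roughly in proportion to the number of pixels fed to the network. The page size is a made-up example (A4 at 300 DPI), and the function name is hypothetical.

```python
def relative_pixel_count(page_size, target_size):
    """Ratio of full-size pixels to resized pixels; both sizes are (width, height)."""
    page_w, page_h = page_size
    target_w, target_h = target_size
    return (page_w * page_h) / (target_w * target_h)


# Example: an A4 page scanned at 300 DPI versus an assumed 1024x1024 default.
ratio = relative_pixel_count((2480, 3508), (1024, 1024))
print(f"full size has about {ratio:.1f} times as many pixels")
```

Under these assumptions, resize_mode=none on such a page would process roughly eight times as many pixels as the default.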

Here are some examples based on the dfki-testdata test case (after binarization and cropping):

dewarped with default settings:

| before | after |
| --- | --- |
| dfki-crop-test | dfki-dewarp-test-bin |

dewarped with default settings but on GPU:

| before | after |
| --- | --- |
| dfki-crop-test | dfki-dewarp-test-bin-gpu |

dewarped with larger size (less resampling/interpolation):

| before | after |
| --- | --- |
| dfki-crop-test | dfki-dewarp-test-bin-large |

dewarped with original/full image size:

| before | after |
| --- | --- |
| dfki-crop-test | dfki-dewarp-test-bin-full |

dewarped on cropped but raw RGB (just to show that the models have not been trained on such data):

| before | after |
| --- | --- |
| dfki-crop-test | dfki-dewarp-test-raw |

@kba kba merged commit 01aea45 into OCR-D:master Feb 22, 2022

bertsky commented Feb 22, 2022

> Now, how do we redistribute these? I have uploaded them as tarballs here and here. But really they should go to https://ocr-d-repo.scc.kit.edu/models/dfki as well.

Like I said, we still need to upload the new models, and update the resource URLs. (This is the reason the CI still fails.)


bertsky commented Feb 22, 2022

> Fixes #34, #35, #40, #60, #61, #72, #73, #77, #87, #88, and probably #42

BTW I forgot to link these (and my formulation is not covered by autolinking). Please close them.
