06 Apr 19:49

benfred

e151b01

v1.0.0

What's Changed

Assume 'merlin' is a first party package for isort by @karlhigley in #1420
End-to-end inference POC migration to new ensemble API by @jperez999 in #1391
Update test_integration.sh by @albert17 in #1422
update test_tf4rec.py by @radekosmulski in #1424
Fix lambda dtype issue in PyTorch Multi-GPU training example notebook by @jperez999 in #1425
Prevent dataloaders from using GPU memory when CPU device is selected by @jperez999 in #1429
Fix dtype bug with GroupBy operator when aggs is a string by @jperez999 in #1430
Fix typo in example notebook by @L0Z1K in #1390
Extract Triton Ensemble DAG to merlin.systems package by @karlhigley in #1426
Add TagAs and related wrapper classes by @radekosmulski in #1414
docs: Add preview doc build to PR by @mikemckiernan in #1432
Docs script by @jperez999 in #1433
docs: Ensure that parent review directory exists by @mikemckiernan in #1434
Update reqs by @albert17 in #1406
Handle aiobotocore v2.0+ in test_s3 by @benfred in #1439
Update to work with the latest merlin-core by @benfred in #1441
Add intersphinx mappings for merlin.core by @benfred in #1440
Updates Container tests by @albert17 in #1445
Asvdb fix for integration testing by @jperez999 in #1413
remove setuptools by @jperez999 in #1460
Update imports for classes that moved to merlin-core by @karlhigley in #1447
Reactivate hugectr Criteo integration test by @jperez999 in #1457
Wrapper for TagAs did not work by @bschifferer in #1462
Set up automated docstring coverage checks by @karlhigley in #1454
doc: Update matrix for 22.03 by @mikemckiernan in #1450
Remove Systems library from nvtabular by @jperez999 in #1456
Fix bug in Criteo download notebook by @bschifferer in #1453
Add deprecation warnings to modules that moved to core by @karlhigley in #1466
Hard-code the Workflow output dtypes for HugeCTR in Triton by @karlhigley in #1468
AWS SageMaker by @bschifferer in #1421
Improve Workflow error about mismatched dtypes by @karlhigley in #1465
Exclude additional directories and boost docstring coverage req to 35 percent by @karlhigley in #1471
fix(docs): Restore the version picker by @mikemckiernan in #1474
Documentation fixes from the docstring scrub by @benfred in #1475
Add missing --user flag to natsort CI install by @karlhigley in #1476
Change merlin level NVT import to transforms (from transform) by @karlhigley in #1472
Move merlin.core.worker to merlin.io.worker by @karlhigley in #1477
Fix merlin.core.worker imports by @benfred in #1482
Use quieter DeprecationWarning instead of FutureWarning by @karlhigley in #1486
Remove imports to deprecated modules by @benfred in #1487
README updates by @benfred in #1478
Add troubleshooting for OOM errors with NVTabular dataloaders by @bschifferer in #1373
Upgrade poetry dependencies by @benfred in #1489
Note in the README that installing with pip runs only on CPU by @karlhigley in #1494
Add deprecation warnings to loader, inference, framework_utils by @karlhigley in #1492
Add merlin.transforms.ops sub-package by @karlhigley in #1491
fix for 1455 by @jperez999 in #1497
Restrict running on pandas 1.4.x by @benfred in #1496
Fixing Criteo Inference for TensorFlow and HugeCTR by @bschifferer in #1500
docs: Add a redirect page by @mikemckiernan in #1499
Final updates for 1.0 release by @benfred in #1501
update to compatible dtype by @jperez999 in #1503
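
Several entries above add deprecation warnings for modules that moved, and #1486 switches them from FutureWarning to DeprecationWarning. Python's default warning filters display FutureWarning to everyone, but ignore DeprecationWarning unless it is triggered by code running in `__main__`, which is why the latter is "quieter" for end users. The following is a minimal sketch of that pattern using only the standard library; `moved_function` and its message are hypothetical stand-ins, not NVTabular's actual code.

```python
import warnings


def moved_function():
    """Hypothetical stand-in for an API that now lives in another package."""
    warnings.warn(
        "moved_function is deprecated; import it from its new location",  # hypothetical text
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to this function
    )
    return "result"


def call_and_collect(func):
    """Call func and return its result plus the categories of warnings it raised."""
    with warnings.catch_warnings(record=True) as caught:
        # Override the default filters so every warning is recorded.
        warnings.simplefilter("always")
        result = func()
    return result, [w.category for w in caught]


result, categories = call_and_collect(moved_function)
print(result, [c.__name__ for c in categories])
```

To see such warnings outside of tests, run Python with `-W default` or set the `PYTHONWARNINGS` environment variable.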

New Contributors

@radekosmulski made their first contribution in #1424
@L0Z1K made their first contribution in #1390

Full Changelog: v0.11.0...v1.0.0

Contributors

benfred, karlhigley, and 6 other contributors

01 Mar 22:06

karlhigley

v0.11.0

4da878b

v0.11.0

What's Changed

Docs: Update URL to Criteo notebook by @mikemckiernan in #1383
Update support_matrix.rst by @lgardenhire in #1375
Support min_val for categorical features in DataGen by @bschifferer in #1369
Fix null_size logic in Categorify op by @rjzamora in #1386
Fix CUDA version doc by @albert17 in #1387
Fixes tests utils imports by @albert17 in #1393
Exit integration by @albert17 in #1395
Fix lambdaop call by @jperez999 in #1394
Add ReduceDtypeSize op by @benfred in #1398
Fix remove_inputs usage in export_pytorch_ensemble by @karlhigley in #1389
Param to send test results by @albert17 in #1405
Migrate io, graph, dispatch, worker, and utils to merlin.core by @karlhigley in #1384
Import Distributed and Serial execution-manager utilities from merlin-core by @rjzamora in #1380
Pin merlin-core to a specific commit to avoid breaking changes by @karlhigley in #1409
Rename merlin.graph to merlin.dag by @jperez999 in #1411
Add DropLowCardinality op by @benfred in #1412
Update merlin-core to v0.1.1 (instead of main branch) by @karlhigley in #1419

New Contributors

@mikemckiernan made their first contribution in #1383

Full Changelog: v0.10.0...v0.11.0

Contributors

benfred, karlhigley, and 6 other contributors

02 Feb 17:06

benfred

v0.10.0

525a1ed

v0.10.0

What's Changed

schema metadata propagation by @jperez999 in #1354
Create TagSet as a container that resolves conflicts between tags (like continuous and categorical) by @jperez999 in #1360
Update support_matrix.rst by @lgardenhire in #1363
Raise an error when the actual dtype produced by an operator doesn't match the schema by @jperez999 in #1362
Deprecate client from Dataset, Workflow, and DatasetInspector by @rjzamora in #1318
fixes asv display to one metric per notebook and does not repeat metrics by @jperez999 in #1366
Keras loader nvt dataset usage by default if available by @jperez999 in #1374
Fixes hash_crossed with cudf 21.12 by @albert17 in #1376
Fixes tests by @albert17 in #1377
Support custom Python operators in the Triton operator/ensemble API by @jperez999 in #1368
Use new fsspec.parquet module to accelerate reads from remote storage by @rjzamora in #1241

Full Changelog: v0.9.0...v0.10.0

Contributors

albert17, rjzamora, and 2 other contributors

11 Jan 23:54

benfred

v0.9.0

9077681

v0.9.0

What's Changed

Workflow for adding issues to the backlog by @benfred in #1305
Set the priority and date added fields for new issues. by @benfred in #1308
Label issues not created by nvidia-merlin members by @benfred in #1309
moved tf import to after tf config is completed by @jperez999 in #1311
Fix Triton import for _convert_string2pytorch_dtype by @karlhigley in #1312
Apply NVT graph API/DSL to building Triton ensembles by @jperez999 in #1292
Fixes tests by @albert17 in #1326
Activates Blossom CI by @albert17 in #1324
Add a compute_input_schema method to operators by @jperez999 in #1330
removed column_types.json from nvtabular by @jperez999 in #1317
working refit as expected by user by @jperez999 in #1338
Update support_matrix.rst by @lgardenhire in #1336
HugeCTR Multihot Training-Inference example by @albert17 in #1329
Triton setup via merlin graph api by @jperez999 in #1339
removed parents selector logic in selector setter, by @jperez999 in #1343
Switch to packaging.version.Version for version checks by @benfred in #1345
fix for storage name bug in path creation by @jperez999 in #1347
Fix multiGPU Pytorch MovieLens by @bschifferer in #1319
Update dead links in Documentation by @SimonCW in #1342
Fixes cudf 21.10 error by @albert17 in #1350
Fixes unit tests for containers by @albert17 in #1349
Create an explicit mapping between Operator input and output columns by @jperez999 in #1348
Updates notebooks for cudf 21.10 by @albert17 in #1353
Revert notebook by @albert17 in #1355
Update conda packages to cudf >= 21.10 and add pynvml by @benfred in #1356
Fix writing out workflows to S3 by @benfred in #1357

New Contributors

@SimonCW made their first contribution in #1342

Full Changelog: v0.8.0...v0.9.0

Contributors

benfred, karlhigley, and 5 other contributors

07 Dec 23:29

benfred

v0.8.0

dc9c04d

v0.8.0

What's Changed

Allow writing workflows to cloud storage by @benfred in #1232
Avoid copy of remote-data buffer in call to read_parquet by @rjzamora in #1239
Update container references to merlin 21.11 by @benfred in #1242
Fix numpy version in CI by @karlhigley in #1255
Modularize the Triton inference model for NVT Workflows by @karlhigley in #1252
Dl cpu by @jperez999 in #1245
fixes for schema saving and writing by @jperez999 in #1215
decouple io from schema by @jperez999 in #1161
Remove non-existent Torch uint dtypes from Triton conversion utils by @karlhigley in #1270
utf-8 when opening notebooks by @albert17 in #1271
Add 'pad' option for the ListSlice op by @benfred in #1262
End-to-end Inference support for Transformers4Rec Tensorflow Models by @rnyak in #1256
fix lookup error on typo in tags for target by @jperez999 in #1281
Fix resolution of tags to column names when executing Workflows by @jperez999 in #1285
Extract all knowledge of Triton from the serving-time WorkflowRunners by @karlhigley in #1257
Extract an abstract graph package from NVT Workflows by @karlhigley in #1265
dataset duck typing for dataloader by @jperez999 in #1272
Reduce device-memory footprint in Categorify fit by @rjzamora in #1259
Fixes for ListSlice operator with padding by @benfred in #1288
Update support_matrix.rst by @lgardenhire in #1243
Fix notebook tests broken by recent graph refactoring by @karlhigley in #1293
add init file for import support by @jperez999 in #1300
add missing dependencies to poetry by @benfred in #1298
Fix inference issues for end-to-end TF example for Transformers4Rec by @karlhigley in #1299
Uninstall NVT (removing versions from PyPI) before installing NVT in CI by @karlhigley in #1303
Updates integration tests by @albert17 in #1294
fix train_test split by @rnyak in #1291
fix arbitrary output file number bug, shrink number of files and warn… by @jperez999 in #1301

Full Changelog: v0.7.1...v0.8.0

Contributors

benfred, karlhigley, and 5 other contributors

04 Nov 01:54

benfred

v0.7.1

21c0f7a

v0.7.1

NVTabular v0.7.1 (2 November 2021)

Improvements

Add LogOp support for list features #1153
Add Normalize operator support for list features #1154
Add DataLoader.epochs() method and Dataset.to_iter(epochs=) argument #1147
Add ValueCount operator for recording of multihot min and max list lengths #1171

Bug Fixes

Fix Criteo inference #1198
Fix performance regressions in Criteo benchmark #1222
Fix error in JoinGroupby op #1167
Fix Filter/JoinExternal key error #1143
Fix LambdaOp transforming dependency values #1185
Fix reading parquet files with list columns from GCS #1155
Fix TargetEncoding with dependencies as the target #1165
Fix Categorify op to calculate unique count stats for Nulls #1159

24 Sep 03:45

benfred

v0.7.0

b55c57c

v0.7.0

NVTabular v0.7.0

Improvements

Add column tagging API #943
Export dataset schema when writing out datasets #948
Make dataloaders aware of schema #947
Standardize a Workflow's representation of its output columns #372
Add multi-gpu training example using PyTorch Distributed #775
Speed up reading Parquet files from remote storage like GCS or S3 #1119
Add utility to convert TFRecord datasets to Parquet #1085
Add multihot support for PyTorch inference #719
Add options to reserve categorical indices in the Categorify() op #1074
Update notebooks to work with CPU only systems #960
Save output from Categorify op in a single table for HugeCTR #946
Add a keyset file for HugeCTR integration #1049

Bug Fixes

Fix category counts written out by the Categorify op #1128
Fix HugeCTR inference example #1130
Fix make_feature_column_workflow bug in Categorify if features have vocabularies of varying size. #1062
Fix TargetEncoding op on CPU only systems #976
Fix writing empty partitions to Parquet files #1097

11 Aug 21:01

benfred

v0.6.1

cec2402

v0.6.1

NVTabular v0.6.1

Bug Fixes

Fix installing package via pip #1030
Fix inference with groupby operator #1019
Install tqdm with conda package #1030
Fix workflow output_dtypes with empty partitions #1028

03 Aug 18:44

benfred

v0.6.0

886d5b8

v0.6.0

NVTabular v0.6.0

Improvements

Add CPU support #534
Speed up inference on Triton Inference Server #744
Add support for session based recommenders #355
Add PyTorch Dataloader support for Sparse Tensors #500
Add ListSlice operator for truncating list columns #734
Categorical ids sorted by frequency #799
Add ability to select a subset of a ColumnGroup #809
Add option to use Rename op to give a single column a new fixed name #825
Add a 'map' function to KerasSequenceLoader, which enables sample weights #667
Add JoinExternal option on nvt.Dataset in addition to cudf #370
Allow passing ColumnGroup to get_embedding_sizes #732
Add ability to name LambdaOp and provide a better default name in graph visualizations #860

Bug Fixes

Fix make_feature_column_workflow for Categorical columns #763
Fix Categorify output dtypes for list columns #963
Fix inference for Outbrain example #669
Fix dask metadata after calling workflow.to_ddf() #852
Fix out of memory errors #896, #971
Fix normalize output when stdev is zero #993
Fix using UCX with a dask cluster on Merlin containers #872
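
The "Fix normalize output when stdev is zero" entry (#993) refers to a classic edge case: standardizing a constant column divides by zero. The following is a minimal sketch of one possible guard, written in plain Python for illustration. It is an assumption about the general technique, not NVTabular's implementation; the function name and the choice to map constant columns to 0.0 are hypothetical.

```python
import statistics


def normalize(values):
    """Standardize values to zero mean and unit variance.

    Assumption for this sketch: when the standard deviation is zero
    (all values identical), return zeros instead of dividing by zero.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        # Constant column: every value equals the mean, so the centered value is 0.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


constant = normalize([5.0, 5.0, 5.0])
varying = normalize([1.0, 3.0])
print(constant, varying)
```

Without the guard, the constant column would raise ZeroDivisionError in plain Python, or yield NaN or inf values in array libraries.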

26 May 18:21

benfred

v0.5.3

94d2960

v0.5.3

Bug Fixes

Fix Shuffling in Torch DataLoader #818
Fix "Unsupported type_id conversion" in Triton inference for string columns #813
Fix HugeCTR inference backend Merlin#8

Releases: NVIDIA-Merlin/NVTabular
