Commit f50d041

sfc-gh-anavalos and Snowflake Authors authored
Project import generated by Copybara. (#116)
GitOrigin-RevId: 6fc3ce416ce5843fc01936fc61bf5480ae9f791f
Co-authored-by: Snowflake Authors <[email protected]>
1 parent 0bdaf0b commit f50d041

142 files changed: +4356 -1493 lines changed

BUILD.bazel

Lines changed: 2 additions & 0 deletions
@@ -3,6 +3,8 @@ load("//:packages.bzl", "PACKAGES")
 load("//bazel:py_rules.bzl", "py_wheel")
 load("//bazel/requirements:rules.bzl", "generate_pyproject_file")

+package(default_visibility = ["//visibility:public"])
+
 exports_files([
     "CHANGELOG.md",
     "README.md",

CHANGELOG.md

Lines changed: 21 additions & 2 deletions
@@ -1,6 +1,26 @@
 # Release History

-## 1.6.1 (TBD)
+## 1.6.2 (TBD)
+
+### Bug Fixes
+
+- Modeling: Support XGBoost version that is larger than 2.
+
+- Data: Fix multiple epoch iteration over `DataConnector.to_torch_datapipe()` DataPipes.
+- Generic: Fix a bug that when an invalid name is provided to argument where fully qualified name is expected, it will
+  be parsed wrongly. Now it raises an exception correctly.
+- Model Explainability: Handle explanations for multiclass XGBoost classification models
+- Model Explainability: Workarounds and better error handling for XGB>2.1.0 not working with SHAP==0.42.1
+
+### New Features
+
+- Data: Add top-level exports for `DataConnector` and `DataSource` to `snowflake.ml.data`.
+- Data: Add native batching support via `batch_size` and `drop_last_batch` arguments to `DataConnector.to_torch_dataset()`
+- Feature Store: update_feature_view() supports taking feature view object as argument.
+
+### Behavior Changes
+
+## 1.6.1 (2024-08-12)

 ### Bug Fixes

@@ -17,7 +37,6 @@
 ### New Features

 - Enable `set_params` to set the parameters of the underlying sklearn estimator, if the snowflake-ml model has been fit.
-- Data: Add top-level exports for `DataConnector` and `DataSource` to `snowflake.ml.data`.
 - Data: Add `snowflake.ml.data.ingestor_utils` module with utility functions helpful for `DataIngestor` implementations.
 - Data: Add new `to_torch_dataset()` connector to `DataConnector` to replace deprecated DataPipe.
 - Registry: Option to `enable_explainability` set to True by default for XGBoost, LightGBM and CatBoost as PuPr feature.
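
The `batch_size` and `drop_last_batch` arguments named in the 1.6.2 changelog entry follow a common batching convention. The sketch below illustrates that convention in plain Python. It assumes `drop_last_batch` discards a final batch smaller than `batch_size`, as `drop_last` does in PyTorch's `DataLoader`. `iter_batches` is a hypothetical helper for illustration and is not part of `snowflake.ml.data`.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(rows: Iterable[T], batch_size: int, drop_last_batch: bool = False) -> Iterator[List[T]]:
    """Yield lists of up to batch_size rows (hypothetical helper, illustration only)."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    batch: List[T] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # Assumed semantics: a trailing partial batch is kept unless drop_last_batch is set.
    if batch and not drop_last_batch:
        yield batch


full = list(iter_batches(range(7), batch_size=3))
trimmed = list(iter_batches(range(7), batch_size=3, drop_last_batch=True))
```

Under these assumed semantics, `full` ends with a one-element batch while `trimmed` contains only complete batches.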

CONTRIBUTING.md

Lines changed: 5 additions & 1 deletion
@@ -304,7 +304,7 @@ Example:

 ## Unit Testing

-Write `pytest` or Python `unittest` style unit tests.
+Write Python `unittest` style unit tests. Pytest is allowed, but not recommended.

 ### `unittest`

@@ -320,6 +320,10 @@ from absl.testing import absltest
 # instead of
 # from unittest import TestCase, main
 from absl.testing.absltest import TestCase, main
+
+# Call main.
+if __name__ == '__main__':
+    absltest.main()
 ```

 `absltest` provides better `bazel` integration which produces a more detailed XML

bazel/environments/conda-env-snowflake.yml

Lines changed: 4 additions & 1 deletion
@@ -28,6 +28,7 @@ dependencies:
 - lightgbm==3.3.5
 - mlflow==2.3.1
 - moto==4.0.11
+- mypy==1.10.0
 - networkx==2.8.4
 - numpy==1.23.5
 - packaging==23.0
@@ -54,14 +55,16 @@ dependencies:
 - snowflake-snowpark-python==1.17.0
 - sphinx==5.0.2
 - sqlparse==0.4.4
+- starlette==0.27.0
 - tensorflow==2.12.0
 - tokenizers==0.13.2
 - toml==0.10.2
 - torchdata==0.6.1
 - transformers==4.32.1
+- types-PyYAML==6.0.12.12
 - types-protobuf==4.23.0.1
 - types-requests==2.30.0.0
 - types-toml==0.10.8.6
-- typing-extensions==4.5.0
+- typing-extensions==4.6.3
 - werkzeug==2.2.2
 - xgboost==1.7.3

bazel/environments/conda-env.yml

Lines changed: 6 additions & 6 deletions
@@ -14,11 +14,6 @@ dependencies:
 - cachetools==4.2.2
 - catboost==1.2.0
 - cloudpickle==2.2.1
-- conda-forge::accelerate==0.22.0
-- conda-forge::mypy==1.5.1
-- conda-forge::starlette==0.27.0
-- conda-forge::types-PyYAML==6.0.12
-- conda-forge::types-cachetools==4.2.2
 - conda-libmamba-solver==23.7.0
 - coverage==6.3.2
 - cryptography==39.0.1
@@ -33,6 +28,7 @@ dependencies:
 - lightgbm==3.3.5
 - mlflow==2.3.1
 - moto==4.0.11
+- mypy==1.10.0
 - networkx==2.8.4
 - numpy==1.23.5
 - packaging==23.0
@@ -59,18 +55,22 @@ dependencies:
 - snowflake-snowpark-python==1.17.0
 - sphinx==5.0.2
 - sqlparse==0.4.4
+- starlette==0.27.0
 - tensorflow==2.12.0
 - tokenizers==0.13.2
 - toml==0.10.2
 - torchdata==0.6.1
 - transformers==4.32.1
+- types-PyYAML==6.0.12.12
 - types-protobuf==4.23.0.1
 - types-requests==2.30.0.0
 - types-toml==0.10.8.6
-- typing-extensions==4.5.0
+- typing-extensions==4.6.3
 - werkzeug==2.2.2
 - xgboost==1.7.3
 - pip
 - pip:
   - --extra-index-url https://pypi.org/simple
+  - accelerate==0.22.0
+  - types-cachetools==4.2.2
   - peft==0.5.0

bazel/environments/conda-gpu-env.yml

Lines changed: 6 additions & 6 deletions
@@ -14,11 +14,6 @@ dependencies:
 - cachetools==4.2.2
 - catboost==1.2.0
 - cloudpickle==2.2.1
-- conda-forge::accelerate==0.22.0
-- conda-forge::mypy==1.5.1
-- conda-forge::starlette==0.27.0
-- conda-forge::types-PyYAML==6.0.12
-- conda-forge::types-cachetools==4.2.2
 - conda-libmamba-solver==23.7.0
 - coverage==6.3.2
 - cryptography==39.0.1
@@ -33,6 +28,7 @@ dependencies:
 - lightgbm==3.3.5
 - mlflow==2.3.1
 - moto==4.0.11
+- mypy==1.10.0
 - networkx==2.8.4
 - numpy==1.23.5
 - nvidia::cuda==11.7.*
@@ -61,19 +57,23 @@ dependencies:
 - snowflake-snowpark-python==1.17.0
 - sphinx==5.0.2
 - sqlparse==0.4.4
+- starlette==0.27.0
 - tensorflow==2.12.0
 - tokenizers==0.13.2
 - toml==0.10.2
 - torchdata==0.6.1
 - transformers==4.32.1
+- types-PyYAML==6.0.12.12
 - types-protobuf==4.23.0.1
 - types-requests==2.30.0.0
 - types-toml==0.10.8.6
-- typing-extensions==4.5.0
+- typing-extensions==4.6.3
 - werkzeug==2.2.2
 - xgboost==1.7.3
 - pip
 - pip:
   - --extra-index-url https://pypi.org/simple
+  - accelerate==0.22.0
+  - types-cachetools==4.2.2
   - peft==0.5.0
   - vllm==0.2.1.post1

bazel/requirements/requirements.schema.json

Lines changed: 0 additions & 5 deletions
@@ -59,11 +59,6 @@
   "pattern": "^$|^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\\.(0|[1-9][0-9]*))*((a|b|rc|alpha|beta)(0|[1-9][0-9]*))?(\\.post(0|[1-9][0-9]*))?(\\.dev(0|[1-9][0-9]*))?$",
   "type": "string"
 },
-"from_channel": {
-  "default": "https://repo.anaconda.com/pkgs/snowflake",
-  "description": "The channel where the package come from, set if not from Snowflake Anaconda Channel.",
-  "type": "string"
-},
 "gpu_only": {
   "default": false,
   "description": "The package is required when running in an environment where GPU is available.",
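
The version `pattern` shown as a context line in this hunk can be exercised directly. The sketch below copies that pattern into Python's `re` module, with JSON's `\\.` unescaped to `\.`, and checks a few version strings, several of them taken from the environment files in this commit. `is_valid_version` is a hypothetical helper name, and using `fullmatch` rather than a JSON Schema validator is an assumption made for brevity.

```python
import re

# Pattern copied from the schema context line; JSON's "\\." becomes "\." here.
VERSION_PATTERN = re.compile(
    r"^$|^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*"
    r"((a|b|rc|alpha|beta)(0|[1-9][0-9]*))?"
    r"(\.post(0|[1-9][0-9]*))?(\.dev(0|[1-9][0-9]*))?$"
)


def is_valid_version(text: str) -> bool:
    """Return True if text is empty or matches the schema's version pattern."""
    return VERSION_PATTERN.fullmatch(text) is not None


accepted = [v for v in ["", "1.6.2", "6.0.12.12", "0.2.1.post1", "1.0rc1"] if is_valid_version(v)]
rejected = [v for v in ["1.07", "11.7.*", "v1.0"] if not is_valid_version(v)]
```

The pattern accepts the empty string through its leading `^$` alternative, and it rejects leading zeros within a release segment as well as wildcard pins such as `11.7.*`.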

ci/conda_recipe/meta.yaml

Lines changed: 2 additions & 2 deletions
@@ -17,7 +17,7 @@ build:
   noarch: python
 package:
   name: snowflake-ml-python
-  version: 1.6.1
+  version: 1.6.2
 requirements:
   build:
     - python
@@ -45,7 +45,7 @@ requirements:
     - snowflake-snowpark-python>=1.17.0,<2
     - sqlparse>=0.4,<1
     - typing-extensions>=4.1.0,<5
-    - xgboost>=1.7.3,<2
+    - xgboost>=1.7.3,<2.1
     - python>=3.8,<3.12
   run_constrained:
     - catboost>=1.2.0, <2
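
The effect of widening the `xgboost` upper bound from `<2` to `<2.1` can be checked with a small comparison. This is a minimal sketch that handles only purely numeric dotted versions; it is not a PEP 440 implementation, and real tooling would more likely rely on the `packaging` library. `parse_release` and `in_range` are hypothetical helpers, and `2.0.3` is an illustrative version chosen for the example.

```python
from typing import Tuple


def parse_release(version: str, width: int = 4) -> Tuple[int, ...]:
    """Turn a purely numeric dotted version like '2.0.3' into a zero-padded tuple."""
    parts = [int(piece) for piece in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def in_range(version: str, lower: str, upper: str) -> bool:
    """Check lower <= version < upper using tuple comparison."""
    return parse_release(lower) <= parse_release(version) < parse_release(upper)


# Bounds taken from the meta.yaml hunk: old "<2", new "<2.1".
accepted_before = in_range("2.0.3", "1.7.3", "2")
accepted_after = in_range("2.0.3", "1.7.3", "2.1")
```

With the old bound the illustrative 2.0.x release is rejected; with the new bound it is accepted, while 2.1.0 and later remain excluded.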

codegen/codegen_rules.bzl

Lines changed: 0 additions & 2 deletions
@@ -90,7 +90,6 @@ def autogen_estimators(module, estimator_info_list):
     "//snowflake/ml/_internal/exceptions:exceptions",
     "//snowflake/ml/_internal/utils:temp_file_utils",
     "//snowflake/ml/_internal/utils:query_result_checker",
-    "//snowflake/ml/_internal/utils:pkg_version_utils",
     "//snowflake/ml/_internal/utils:identifier",
     "//snowflake/ml/model:model_signature",
     "//snowflake/ml/model/_signatures:utils",
@@ -181,7 +180,6 @@ def autogen_snowpark_pandas_tests(module, module_root_dir, snowpark_pandas_estim
     "//snowflake/ml/_internal/snowpark_pandas:snowpark_pandas_lib",
     "//snowflake/ml/utils:connection_params",
 ],
-compatible_with_snowpark = False,
 timeout = "long",
 legacy_create_init = 0,
 shard_count = 5,

codegen/sklearn_wrapper_generator.py

Lines changed: 6 additions & 3 deletions
@@ -1153,15 +1153,18 @@ def generate(self) -> "XGBoostWrapperGenerator":
         super().generate()

         # Populate XGBoost specific values
-        self.estimator_imports_list.append("import xgboost")
+        self.estimator_imports_list.extend(["import sklearn", "import xgboost"])
         self.test_estimator_input_args_list.extend(
             ["random_state=0", "subsample=1.0", "colsample_bynode=1.0", "n_jobs=1"]
         )
-        self.score_sproc_imports = ["xgboost"]
+        self.score_sproc_imports = ["xgboost", "sklearn"]
         # TODO(snandamuri): Replace cloudpickle with joblib after latest version of joblib is added to snowflake conda.
         self.supported_export_method = "to_xgboost"
         self.unsupported_export_methods = ["to_sklearn", "to_lightgbm"]
-        self.deps = "f'numpy=={np.__version__}', f'xgboost=={xgboost.__version__}', f'cloudpickle=={cp.__version__}'"
+        self.deps = (
+            "f'numpy=={np.__version__}', f'scikit-learn=={sklearn.__version__}', "
+            + "f'xgboost=={xgboost.__version__}', f'cloudpickle=={cp.__version__}'"
+        )
         self._construct_string_from_lists()
         return self
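
The new `self.deps` value is a string of f-string source fragments, not a list of requirements. The sketch below shows how such a string could expand into pinned requirement strings once it is evaluated inside generated code. It assumes the generator's template places the string inside a list literal, which this hunk does not show. The `SimpleNamespace` objects are hypothetical stand-ins for the real modules, and the scikit-learn version is made up for illustration.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for the imported modules; only __version__ is needed.
np = SimpleNamespace(__version__="1.23.5")
sklearn = SimpleNamespace(__version__="1.3.0")  # illustrative version, not from the commit
xgboost = SimpleNamespace(__version__="1.7.3")
cp = SimpleNamespace(__version__="2.2.1")

# Same string the generator assigns to self.deps after this change.
deps_source = (
    "f'numpy=={np.__version__}', f'scikit-learn=={sklearn.__version__}', "
    + "f'xgboost=={xgboost.__version__}', f'cloudpickle=={cp.__version__}'"
)

# Assumption: the generated code wraps the fragments in a list literal.
pinned = eval(
    f"[{deps_source}]",
    {"np": np, "sklearn": sklearn, "xgboost": xgboost, "cp": cp},
)
```

Under that assumption, `pinned` holds one exact-version requirement per package, now including `scikit-learn`.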
