Commit 6186ce6

Authored by sfc-gh-anavalos and Snowflake Authors

Project import generated by Copybara. (#118)
GitOrigin-RevId: ac6dd60ea2f93da707c56f842e5afd9935987137
Co-authored-by: Snowflake Authors <[email protected]>

1 parent: f50d041

File tree: 259 files changed (+8301, −22148 lines)


.pre-commit-config.yaml

Lines changed: 3 additions & 4 deletions
```diff
@@ -1,6 +1,5 @@
 ---
-exclude: ^(.*egg.info.*|.*/parameters.py$|.*\.py_template|.*/experimental/.*|.*/fixtures/.*|docs/source/_themes/.*)
-minimum_pre_commit_version: 3.4.0
+exclude: ^(.*egg.info.*|.*/parameters.py$|.*\.py_template|.*/experimental/.*|.*/fixtures/.*|docs/source/_themes/.*|.*\.patch)
 repos:
   - repo: https://github.com/asottile/pyupgrade
     rev: v2.31.1
@@ -65,7 +64,7 @@ repos:
       - id: markdownlint-fix
         language_version: 16.20.2
   - repo: https://github.com/keith/pre-commit-buildifier
-    rev: 6.0.0
+    rev: 7.3.1
     hooks:
       - id: buildifier
         args:
@@ -84,7 +83,7 @@ repos:
         exclude_types:
           - image
   - repo: https://github.com/lyz-code/yamlfix
-    rev: 1.13.0
+    rev: 1.16.1
     hooks:
       - id: yamlfix
         args:
```
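Pre-commit matches the `exclude` pattern against every file path before running any hook, so the new `.*\.patch` alternative keeps all hooks away from patch files. A minimal sketch of how the updated regex behaves, using Python's `re` with hypothetical paths:

```python
import re

# The updated exclude pattern from this commit (the trailing `|.*\.patch` is new).
EXCLUDE = re.compile(
    r"^(.*egg.info.*|.*/parameters.py$|.*\.py_template|.*/experimental/.*"
    r"|.*/fixtures/.*|docs/source/_themes/.*|.*\.patch)"
)

# Hypothetical file paths, to show which ones the hooks now skip.
for path in ["third_party/fix_build.patch", "snowflake/ml/model.py", "pkg/parameters.py"]:
    print(path, "->", "excluded" if EXCLUDE.match(path) else "checked")
# third_party/fix_build.patch and pkg/parameters.py are excluded; model.py is still checked.
```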

CHANGELOG.md

Lines changed: 20 additions & 5 deletions
```diff
@@ -1,6 +1,25 @@
 # Release History
 
-## 1.6.2 (TBD)
+## 1.6.3
+
+- Model Registry (PrPr) has been removed.
+
+### Bug Fixes
+
+- Registry: Fixed a bug where an unexpected normalization was applied when a package whose name
+  does not follow PEP 508 was provided while logging a model.
+- Registry: Fixed a `not a valid remote uri` error when logging mlflow models.
+- Registry: Fixed a bug where `ModelVersion.run` was called in a nested way.
+- Registry: Fixed an issue that led to `log_model` failure when the local package version contains
+  parts other than the base version.
+
+### New Features
+
+- Data: Improved `DataConnector.to_pandas()` performance when loading from Snowpark DataFrames.
+- Model Registry: Allow users to set a model task while using `log_model`.
+- Feature Store: FeatureView supports ON_CREATE or ON_SCHEDULE initialize mode.
+
+## 1.6.2 (2024-09-04)
 
 ### Bug Fixes
 
@@ -18,8 +37,6 @@
 - Data: Add native batching support via `batch_size` and `drop_last_batch` arguments to `DataConnector.to_torch_dataset()`
 - Feature Store: update_feature_view() supports taking feature view object as argument.
 
-### Behavior Changes
-
 ## 1.6.1 (2024-08-12)
 
 ### Bug Fixes
@@ -42,8 +59,6 @@
 - Registry: Option to `enable_explainability` set to True by default for XGBoost, LightGBM and CatBoost as PuPr feature.
 - Registry: Option to `enable_explainability` when registering SHAP supported sklearn models.
 
-### Behavior Changes
-
 ## 1.6.0 (2024-07-29)
 
 ### Bug Fixes
```
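Two of the Registry fixes concern packaging semantics: PEP 508 name normalization and local version parts. The distinction is easy to see with the `packaging` library (a minimal sketch for illustration, not the code of the fix itself):

```python
from packaging.utils import canonicalize_name
from packaging.version import Version

# Name normalization collapses case, dots, and underscores. A dependency whose
# declared name deviates from PEP 508 could therefore be rewritten unexpectedly.
print(canonicalize_name("My_Package.Name"))  # -> my-package-name

# A local version carries parts beyond the base version; per the changelog,
# versions like this previously caused `log_model` to fail.
v = Version("1.6.3+local.build.1")
print(v.base_version, "/", v.local)  # -> 1.6.3 / local.build.1
```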

bazel/requirements/templates/meta.tpl.yaml

Lines changed: 1 addition & 3 deletions
```diff
@@ -11,11 +11,9 @@ build:
 requirements:
   build:
     - python
-    - bazel >=6.0.0
+    - bazel==6.3.0
   run:
     - python>=3.8,<3.12
-  run_constrained:
-    - openjpeg !=2.4.0=*_1  # [win]
 
 about:
   home: https://github.com/snowflakedb/snowflake-ml-python
```

ci/build_and_run_tests.sh

Lines changed: 27 additions & 5 deletions
```diff
@@ -1,7 +1,7 @@
 #!/bin/bash
 
 # Usage
-# build_and_run_tests.sh <workspace> [-b <bazel path>] [--env pip|conda] [--mode merge_gate|continuous_run] [--with-snowpark] [--report <report_path>]
+# build_and_run_tests.sh <workspace> [-b <bazel path>] [--env pip|conda] [--mode merge_gate|continuous_run] [--with-snowpark] [--with-spcs-image] [--report <report_path>]
 #
 # Args
 # workspace: path to the workspace, SnowML code should be in snowml directory.
@@ -14,6 +14,7 @@
 #     continuous_run (default): run all tests. (For nightly run. Alias: release)
 #     quarantined: run all quarantined tests.
 #   with-snowpark: Build and test with snowpark in snowpark-python directory in the workspace.
+#   with-spcs-image: Build and test with spcs-image in spcs-image directory in the workspace.
 #   snowflake-env: The environment of the snowflake, use to determine the test quarantine list
 #   report: Path to xml test report
 #
@@ -29,14 +30,15 @@ PROG=$0
 
 help() {
     local exit_code=$1
-    echo "Usage: ${PROG} <workspace> [-b <bazel path>] [--env pip|conda] [--mode merge_gate|continuous_run|quarantined] [--with-snowpark] [--snowflake-env <sf_env>] [--report <report_path>]"
+    echo "Usage: ${PROG} <workspace> [-b <bazel path>] [--env pip|conda] [--mode merge_gate|continuous_run|quarantined] [--with-snowpark] [--with-spcs-image] [--snowflake-env <sf_env>] [--report <report_path>]"
     exit "${exit_code}"
 }
 
 WORKSPACE=$1 && shift || help 1
 BAZEL="bazel"
 ENV="pip"
 WITH_SNOWPARK=false
+WITH_SPCS_IMAGE=false
 MODE="continuous_run"
 PYTHON_VERSION=3.8
 PYTHON_ENABLE_SCRIPT="bin/activate"
@@ -86,6 +88,9 @@ while (($#)); do
         shift
         PYTHON_VERSION=$1
         ;;
+    --with-spcs-image)
+        WITH_SPCS_IMAGE=true
+        ;;
     -h | --help)
         help 0
         ;;
@@ -260,11 +265,18 @@ else
     # Build SnowML
     pushd ${SNOWML_DIR}
     # Build conda package
-    conda build --prefix-length 50 --python=${PYTHON_VERSION} --croot "${WORKSPACE}/conda-bld" ci/conda_recipe
+    conda build -c conda-forge --override-channels --prefix-length 50 --python=${PYTHON_VERSION} --croot "${WORKSPACE}/conda-bld" ci/conda_recipe
     conda build purge
     popd
 fi
 
+if [[ "${WITH_SPCS_IMAGE}" = true ]]; then
+    pushd ${SNOWML_DIR}
+    # Build SPCS Image
+    source model_container_services_deployment/ci/build_and_push_images.sh
+    popd
+fi
+
 # Start testing
 pushd "${TEMP_TEST_DIR}"
 
@@ -281,6 +293,11 @@ if [[ -n "${JUNIT_REPORT_PATH}" ]]; then
 fi
 
 if [ "${ENV}" = "pip" ]; then
+    if [ "${WITH_SPCS_IMAGE}" = true ]; then
+        COMMON_PYTEST_FLAG+=(-m "spcs_deployment_image and not pip_incompatible")
+    else
+        COMMON_PYTEST_FLAG+=(-m "not pip_incompatible")
+    fi
     # Copy wheel package
     cp "${WORKSPACE}/snowflake_ml_python-${VERSION}-py3-none-any.whl" "${TEMP_TEST_DIR}"
 
@@ -302,10 +319,15 @@ if [ "${ENV}" = "pip" ]; then
 
     # Run the tests
     set +e
-    TEST_SRCDIR="${TEMP_TEST_DIR}" python -m pytest "${COMMON_PYTEST_FLAG[@]}" -m "not pip_incompatible" tests/integ/
+    TEST_SRCDIR="${TEMP_TEST_DIR}" python -m pytest "${COMMON_PYTEST_FLAG[@]}" tests/integ/
    TEST_RETCODE=$?
     set -e
 else
+    if [ "${WITH_SPCS_IMAGE}" = true ]; then
+        COMMON_PYTEST_FLAG+=(-m "spcs_deployment_image and not conda_incompatible")
+    else
+        COMMON_PYTEST_FLAG+=(-m "not conda_incompatible")
+    fi
     # Create local conda channel
     conda index "${WORKSPACE}/conda-bld"
 
@@ -319,7 +341,7 @@ else
 
     # Run integration tests
     set +e
-    TEST_SRCDIR="${TEMP_TEST_DIR}" conda run -p testenv --no-capture-output python -m pytest "${COMMON_PYTEST_FLAG[@]}" -m "not conda_incompatible" tests/integ/
+    TEST_SRCDIR="${TEMP_TEST_DIR}" conda run -p testenv --no-capture-output python -m pytest "${COMMON_PYTEST_FLAG[@]}" tests/integ/
     TEST_RETCODE=$?
     set -e
 
```
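The script now builds the pytest marker expression once into `COMMON_PYTEST_FLAG` instead of repeating `-m` at each call site, and the SPCS path selects only image-deployment tests. A hedged sketch of how tests would opt into those markers (the marker names come from the script; the tests themselves are hypothetical):

```python
import pytest

@pytest.mark.pip_incompatible
def test_conda_only_feature():
    # Deselected in pip runs by `-m "not pip_incompatible"`.
    ...

@pytest.mark.spcs_deployment_image
def test_deploy_model_to_spcs_image():
    # Selected only by `-m "spcs_deployment_image and ..."`, i.e. when the
    # script is invoked with --with-spcs-image.
    ...
```

Markers like these would typically be registered in `pytest.ini` or `pyproject.toml` so that runs with `--strict-markers` do not reject them.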

ci/conda_recipe/README.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -6,7 +6,7 @@ Conda's guide on building a conda package from a wheel:
 To invoke conda build:
 
 ```sh
-conda build --prefix-length=0 --python=[3.8|3.9|3.10|3.11] ci/conda_recipe
+conda build -c conda-forge --override-channels --prefix-length=0 --python=[3.8|3.9|3.10|3.11] ci/conda_recipe
 ```
 
 - `--prefix-length=0`: prevent the conda build environment from being created in
````

ci/conda_recipe/meta.yaml

Lines changed: 4 additions & 5 deletions
```diff
@@ -17,11 +17,11 @@ build:
   noarch: python
 package:
   name: snowflake-ml-python
-  version: 1.6.2
+  version: 1.6.3
 requirements:
   build:
     - python
-    - bazel >=6.0.0
+    - bazel==6.3.0
   run:
     - absl-py>=0.15,<2
     - aiohttp!=4.0.0a0, !=4.0.0a1
@@ -39,7 +39,7 @@ requirements:
     - requests
     - retrying>=1.3.3,<2
     - s3fs>=2022.11,<2024
-    - scikit-learn>=1.2.1,<1.4
+    - scikit-learn>=1.2.1,<1.6
     - scipy>=1.9,<2
     - snowflake-connector-python>=3.5.0,<4
     - snowflake-snowpark-python>=1.17.0,<2
@@ -54,11 +54,10 @@ requirements:
     - pytorch>=2.0.1,<2.3.0
     - sentence-transformers>=2.2.2,<3
     - sentencepiece>=0.1.95,<1
-    - shap==0.42.1
+    - shap>=0.42.0,<1
     - tensorflow>=2.10,<3
     - tokenizers>=0.10,<1
     - torchdata>=0.4,<1
     - transformers>=4.32.1,<5
-    - openjpeg !=2.4.0=*_1  # [win]
 source:
   path: ../../
```
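The recipe relaxes the exact `shap==0.42.1` pin to a range and widens the scikit-learn ceiling. The effect of a specifier change can be checked with the `packaging` library (a minimal sketch with sample versions):

```python
from packaging.specifiers import SpecifierSet
from packaging.version import Version

old_pin, new_range = SpecifierSet("==0.42.1"), SpecifierSet(">=0.42.0,<1")

for v in ["0.42.0", "0.42.1", "0.46.0", "1.0.0"]:
    print(v, Version(v) in old_pin, Version(v) in new_range)
# Only 0.42.1 satisfies the old pin; every release from 0.42.0 up to (not
# including) 1.0.0 satisfies the new range.
```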

ci/targets/local_only.txt

Lines changed: 0 additions & 2 deletions
```diff
@@ -1,2 +0,0 @@
-//snowflake/ml/model/_deploy_client/image_builds/inference_server:gpu_test
-//snowflake/ml/model/_deploy_client/image_builds/inference_server:main_vllm_test
```

ci/targets/quarantine/prod3.txt

Lines changed: 6 additions & 4 deletions
```diff
@@ -1,5 +1,7 @@
-//tests/integ/snowflake/ml/model:deployment_to_snowservice_integ_test
-//tests/integ/snowflake/ml/registry:model_registry_snowservice_integ_test
-//tests/integ/snowflake/ml/model:spcs_llm_model_integ_test
+//snowflake/ml/model/_packager/model_handlers_test:mlflow_test
 //tests/integ/snowflake/ml/extra_tests:xgboost_external_memory_training_test
-//tests/integ/snowflake/ml/registry:model_registry_snowservice_merge_gate_integ_test
+//tests/integ/snowflake/ml/modeling/ensemble:isolation_forest_test
+//tests/integ/snowflake/ml/modeling/linear_model:sgd_one_class_svm_test
+//tests/integ/snowflake/ml/modeling/preprocessing:k_bins_discretizer_test
+//tests/integ/snowflake/ml/registry/model:registry_mlflow_model_test
+//tests/integ/snowflake/ml/registry/services/...
```

ci/targets/slow.txt

Lines changed: 0 additions & 3 deletions
```diff
@@ -1,3 +0,0 @@
-//tests/integ/snowflake/ml/model:deployment_to_snowservice_integ_test
-//tests/integ/snowflake/ml/registry:model_registry_snowservice_integ_test
-//tests/integ/snowflake/ml/model:spcs_llm_model_integ_test
```

codegen/sklearn_wrapper_generator.py

Lines changed: 30 additions & 1 deletion
```diff
@@ -1058,12 +1058,41 @@ def generate(self) -> "SklearnWrapperGenerator":
             ]
             self.test_estimator_input_args_list.append(f"dictionary={dictionary}")
 
+        if WrapperGeneratorFactory._is_class_of_type(self.class_object[1], "Isomap"):
+            # Use a higher n_neighbors for Isomap to balance accuracy and performance.
+            self.test_estimator_input_args_list.append("n_neighbors=30")
+
+        if WrapperGeneratorFactory._is_class_of_type(
+            self.class_object[1], "KNeighborsClassifier"
+        ) or WrapperGeneratorFactory._is_class_of_type(self.class_object[1], "RadiusNeighborsClassifier"):
+            # Use distance-based weighting to reduce ties and improve prediction accuracy.
+            self.test_estimator_input_args_list.append("weights='distance'")
+
+        if WrapperGeneratorFactory._is_class_of_type(self.class_object[1], "Nystroem"):
+            # Set specific parameters for Nystroem to ensure a meaningful transformation.
+            # - `gamma`: Controls the shape of the RBF kernel. A lower value such as 0.1
+            #   helps generate larger transformation values in the output, making the
+            #   transformation less sensitive to small variations in the input data. It also
+            #   balances between underfitting and overfitting for most datasets.
+            # - `n_components`: Specifies a larger number of components for the approximation,
+            #   which enhances the accuracy of the kernel approximation. This is especially
+            #   useful for higher-dimensional data or when a more precise transformation is needed.
+            self.test_estimator_input_args_list.append("gamma=0.1")
+            self.test_estimator_input_args_list.append("n_components=200")
+
         if WrapperGeneratorFactory._is_class_of_type(self.class_object[1], "SelectKBest"):
             # Set the k of SelectKBest features transformer to half the number of columns in the dataset.
             self.test_estimator_input_args_list.append("k=int(len(cols)/2)")
 
         if "n_components" in self.original_init_signature.parameters.keys():
-            if WrapperGeneratorFactory._is_class_of_type(self.class_object[1], "SpectralBiclustering"):
+            if self.original_class_name == "KernelPCA":
+                # Explicitly set 'n_components' to half the number of input columns
+                # (int(len(cols)/2)) to ensure consistency between implementations.
+                # This is necessary because the default behavior might differ, with
+                # 'n_components' otherwise defaulting to the minimum of the number of
+                # features or samples, potentially leading to discrepancies between
+                # the implementations.
+                self.test_estimator_input_args_list.append("n_components=int(len(cols)/2)")
+            elif WrapperGeneratorFactory._is_class_of_type(self.class_object[1], "SpectralBiclustering"):
                 # For spectral biclustering, set the number of singular vectors to consider to the
                 # number of input cols and the number of best vectors to select to half the number
                 # of input cols.
                 self.test_estimator_input_args_list.append("n_components=len(cols)")
```
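The Nystroem arguments injected here are ordinary scikit-learn parameters; a minimal sketch of what the generated test effectively constructs (synthetic data, shapes assumed for illustration):

```python
import numpy as np
from sklearn.kernel_approximation import Nystroem

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))  # synthetic stand-in for the generated test dataset

# gamma=0.1 flattens the RBF kernel; n_components=200 picks 200 landmark samples,
# so the approximate feature map outputs shape (n_samples, 200).
feature_map = Nystroem(kernel="rbf", gamma=0.1, n_components=200, random_state=0)
X_transformed = feature_map.fit_transform(X)
print(X_transformed.shape)  # (500, 200)
```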

codegen/sklearn_wrapper_template.py_template

Lines changed: 1 addition & 0 deletions
```diff
@@ -389,6 +389,7 @@ class {transform.original_class_name}(BaseTransformer):
         """
         self._infer_input_output_cols(dataset)
         super()._check_dataset_type(dataset)
+
         model_trainer = ModelTrainerBuilder.build_fit_transform(
             estimator=self._sklearn_object,
             dataset=dataset,
```

codegen/transformer_autogen_test_template.py_template

Lines changed: 1 addition & 1 deletion
```diff
@@ -182,7 +182,7 @@ class {transform.test_class_name}(TestCase):
         # TODO(snandamuri): HistGradientBoostingRegressor is returning different results in different envs.
         # Needs further debugging.
         if {transform._is_hist_gradient_boosting_regressor}:
-            num_diffs = (~np.isclose(actual_arr, sklearn_numpy_arr)).sum()
+            num_diffs = (~np.isclose(actual_arr, sklearn_numpy_arr, rtol=1.e-2, atol=1.e-2)).sum()
             num_example = sklearn_numpy_arr.shape[0]
             assert num_diffs < 0.1 * num_example
         elif (not {transform._is_deterministic}) or (not {transform._is_deterministic_cross_platform} and platform.system() == 'Windows'):
```
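The change loosens the comparison to a 1% relative and absolute tolerance before an element counts as a mismatch; the surrounding assertion still requires fewer than 10% of rows to differ. A small numpy illustration with synthetic arrays:

```python
import numpy as np

actual = np.array([1.0000, 2.0030, 3.5])
expected = np.array([1.0005, 2.0000, 3.0])

# Default tolerances (rtol=1e-5, atol=1e-8) flag all three elements as different;
# the template's new rtol/atol of 1e-2 forgives the first two small deviations.
strict = (~np.isclose(actual, expected)).sum()
loose = (~np.isclose(actual, expected, rtol=1.e-2, atol=1.e-2)).sum()
print(strict, loose)  # 3 1
```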
