
Commit 3cbf8f1

sfc-gh-anavalos and Snowflake Authors authored
Project import generated by Copybara. (#107)
GitOrigin-RevId: a23c1817783b50e3eb626411cb222d74c60c578d
Co-authored-by: Snowflake Authors <[email protected]>
1 parent f0ff796 commit 3cbf8f1

144 files changed · +5190 −793 lines changed


CHANGELOG.md

Lines changed: 31 additions & 3 deletions
@@ -1,6 +1,36 @@
 # Release History

-## 1.5.3
+## 1.5.4
+
+### Bug Fixes
+
+- Model Registry (PrPr): Fix 401 Unauthorized issue when deploying model to SPCS.
+- Feature Store: Downgrades exceptions to warnings for few property setters in feature view. Now you can set
+  desc, refresh_freq and warehouse for draft feature views.
+- Modeling: Fix an issue with calling `OrdinalEncoder` with `categories` as a dictionary and a pandas DataFrame
+- Modeling: Fix an issue with calling `OneHotEncoder` with `categories` as a dictionary and a pandas DataFrame
+
+### New Features
+
+- Registry: Allow overriding `device_map` and `device` when loading huggingface pipeline models.
+- Registry: Add `set_alias` method to `ModelVersion` instance to set an alias to model version.
+- Registry: Add `unset_alias` method to `ModelVersion` instance to unset an alias to model version.
+- Registry: Add `partitioned_inference_api` allowing users to create partitioned inference functions in registered
+  models. Enable model inference methods with table functions with vectorized process methods in registered models.
+- Feature Store: add 3 more columns: refresh_freq, refresh_mode and scheduling_state to the result of
+  `list_feature_views()`.
+- Feature Store: `update_feature_view()` supports updating description.
+- Feature Store: add new API `refresh_feature_view()`.
+- Feature Store: add new API `get_refresh_history()`.
+- Feature Store: Add `generate_training_set()` API for generating table-backed feature snapshots.
+- Feature Store: Add `DeprecationWarning` for `generate_dataset(..., output_type="table")`.
+- Feature Store: `update_feature_view()` supports updating description.
+- Feature Store: add new API `refresh_feature_view()`.
+- Feature Store: add new API `get_refresh_history()`.
+- Model Development: OrdinalEncoder supports a list of array-likes for `categories` argument.
+- Model Development: OneHotEncoder supports a list of array-likes for `categories` argument.
+
+## 1.5.3 (06-17-2024)

 ### Bug Fixes

@@ -9,8 +39,6 @@
 - Registry: Fix an issue that leads to incorrect result when using pandas Dataframe with over 100,000 rows as the input
   of `ModelVersion.run` method in Stored Procedure.

-### Behavior Changes
-
 ### New Features

 - Registry: Add support for TIMESTAMP_NTZ model signature data type, allowing timestamp input and output.
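The Registry and Feature Store entries above introduce new public methods. Below is a minimal sketch of how they might be exercised; the connection parameters, model name, alias, and feature view identifiers are placeholders, and the exact argument forms of `refresh_feature_view()` and `get_refresh_history()` are assumptions inferred from the changelog wording, not confirmed by this commit.

```python
# Illustrative sketch only: all names in <...> / ALL_CAPS are placeholders.
from snowflake.snowpark import Session
from snowflake.ml.registry import Registry
from snowflake.ml.feature_store import FeatureStore

connection_parameters = {  # placeholder credentials
    "account": "<account>",
    "user": "<user>",
    "password": "<password>",
}
session = Session.builder.configs(connection_parameters).create()

# Registry: set/unset an alias on a model version (new in 1.5.4).
registry = Registry(session=session)
mv = registry.get_model("MY_MODEL").version("V1")
mv.set_alias("champion")
mv.unset_alias("champion")

# Feature Store: refresh a feature view and inspect its refresh history (new in 1.5.4).
fs = FeatureStore(
    session=session, database="MY_DB", name="MY_SCHEMA", default_warehouse="MY_WH"
)
fv = fs.get_feature_view("MY_FEATURE_VIEW", "V1")
fs.refresh_feature_view(fv)               # argument form assumed
refresh_history = fs.get_refresh_history(fv)  # argument form assumed
```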

bazel/environments/conda-env-snowflake.yml

Lines changed: 2 additions & 1 deletion
@@ -51,7 +51,7 @@ dependencies:
   - sentencepiece==0.1.99
   - shap==0.42.1
   - snowflake-connector-python==3.10.0
-  - snowflake-snowpark-python==1.15.0
+  - snowflake-snowpark-python==1.17.0
   - sphinx==5.0.2
   - sqlparse==0.4.4
   - tensorflow==2.12.0
@@ -63,4 +63,5 @@ dependencies:
   - types-requests==2.30.0.0
   - types-toml==0.10.8.6
   - typing-extensions==4.5.0
+  - werkzeug==2.2.2
   - xgboost==1.7.3

bazel/environments/conda-env.yml

Lines changed: 2 additions & 1 deletion
@@ -56,7 +56,7 @@ dependencies:
   - sentencepiece==0.1.99
   - shap==0.42.1
   - snowflake-connector-python==3.10.0
-  - snowflake-snowpark-python==1.15.0
+  - snowflake-snowpark-python==1.17.0
   - sphinx==5.0.2
   - sqlparse==0.4.4
   - tensorflow==2.12.0
@@ -68,6 +68,7 @@ dependencies:
   - types-requests==2.30.0.0
   - types-toml==0.10.8.6
   - typing-extensions==4.5.0
+  - werkzeug==2.2.2
   - xgboost==1.7.3
   - pip
   - pip:

bazel/environments/conda-gpu-env.yml

Lines changed: 2 additions & 1 deletion
@@ -58,7 +58,7 @@ dependencies:
   - sentencepiece==0.1.99
   - shap==0.42.1
   - snowflake-connector-python==3.10.0
-  - snowflake-snowpark-python==1.15.0
+  - snowflake-snowpark-python==1.17.0
   - sphinx==5.0.2
   - sqlparse==0.4.4
   - tensorflow==2.12.0
@@ -70,6 +70,7 @@ dependencies:
   - types-requests==2.30.0.0
   - types-toml==0.10.8.6
   - typing-extensions==4.5.0
+  - werkzeug==2.2.2
   - xgboost==1.7.3
   - pip
   - pip:

bazel/requirements/templates/meta.tpl.yaml

Lines changed: 2 additions & 0 deletions
@@ -14,6 +14,8 @@ requirements:
     - bazel >=6.0.0
   run:
     - python>=3.8,<3.12
+  run_constrained:
+    - openjpeg !=2.4.0=*_1  # [win]

 about:
   home: https://github.com/snowflakedb/snowflake-ml-python

ci/conda_recipe/meta.yaml

Lines changed: 4 additions & 3 deletions
@@ -17,7 +17,7 @@ build:
   noarch: python
 package:
   name: snowflake-ml-python
-  version: 1.5.3
+  version: 1.5.4
 requirements:
   build:
     - python
@@ -42,7 +42,7 @@ requirements:
     - scikit-learn>=1.2.1,<1.4
     - scipy>=1.9,<2
     - snowflake-connector-python>=3.5.0,<4
-    - snowflake-snowpark-python>=1.15.0,<2
+    - snowflake-snowpark-python>=1.17.0,<2
     - sqlparse>=0.4,<1
    - typing-extensions>=4.1.0,<5
     - xgboost>=1.7.3,<2
@@ -51,13 +51,14 @@ requirements:
     - catboost>=1.2.0, <2
     - lightgbm>=3.3.5,<5
     - mlflow>=2.1.0,<2.4
-    - pytorch>=2.0.1,<3
+    - pytorch>=2.0.1,<2.3.0
     - sentence-transformers>=2.2.2,<3
     - sentencepiece>=0.1.95,<1
     - shap==0.42.1
     - tensorflow>=2.10,<3
     - tokenizers>=0.10,<1
     - torchdata>=0.4,<1
     - transformers>=4.32.1,<5
+    - openjpeg !=2.4.0=*_1  # [win]
 source:
   path: ../../

ci/targets/quarantine/prod3.txt

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
 //tests/integ/snowflake/ml/model:deployment_to_snowservice_integ_test
 //tests/integ/snowflake/ml/registry:model_registry_snowservice_integ_test
 //tests/integ/snowflake/ml/model:spcs_llm_model_integ_test
+//tests/integ/snowflake/ml/extra_tests:xgboost_external_memory_training_test
+//tests/integ/snowflake/ml/lineage:lineage_integ_test

codegen/BUILD.bazel

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@ filegroup(
     srcs = [
         "init_template.py_template",
         "sklearn_wrapper_template.py_template",
+        "snowpark_pandas_autogen_test_template.py_template",
         "transformer_autogen_test_template.py_template",
     ],
 )

codegen/build_file_autogen.py

Lines changed: 40 additions & 1 deletion
@@ -5,6 +5,7 @@

     python3 snowflake/ml/experimental/amauser/transformer/build_file_autogen.py
 """
+
 import os
 from dataclasses import dataclass, field
 from typing import List
@@ -13,6 +14,7 @@
 from absl import app

 from codegen import sklearn_wrapper_autogen as swa
+from snowflake.ml.snowpark_pandas import imports


 @dataclass(frozen=True)
@@ -23,7 +25,10 @@ class ModuleInfo:


 MODULES = [
-    ModuleInfo("sklearn.linear_model", ["OrthogonalMatchingPursuitCV", "QuantileRegressor"]),
+    ModuleInfo(
+        "sklearn.linear_model",
+        ["OrthogonalMatchingPursuitCV", "QuantileRegressor"],
+    ),
     ModuleInfo(
         "sklearn.ensemble",
         [
@@ -170,6 +175,27 @@ def get_test_build_file_content(module: ModuleInfo, module_root_dir: str) -> str
     )


+def get_snowpark_pandas_test_build_file_content(module: imports.ModuleInfo, module_root_dir: str) -> str:
+    """Generates the content of BUILD.bazel file for snowpark_pandas test directory of the given module.
+
+    Args:
+        module: Module information.
+        module_root_dir: Relative directory path of the module source code.
+
+    Returns:
+        Returns content of the BUILD.bazel file for module test directory.
+    """
+    return (
+        'load("//codegen:codegen_rules.bzl", "autogen_snowpark_pandas_tests")\n'
+        f'load("//{module_root_dir}:estimators_info.bzl", "snowpark_pandas_estimator_info_list")\n'
+        'package(default_visibility = ["//snowflake/ml/snowpark_pandas"])\n'
+        "\nautogen_snowpark_pandas_tests(\n"
+        f'    module = "{module.module_name}",\n'
+        f'    module_root_dir = "{module_root_dir}",\n'
+        "    snowpark_pandas_estimator_info_list=snowpark_pandas_estimator_info_list\n)"
+    )
+
+
 def main(argv: List[str]) -> None:
     del argv  # Unused.

@@ -200,6 +226,19 @@ def main(argv: List[str]) -> None:
         os.makedirs("/".join(test_build_file_path.split("/")[:-1]), exist_ok=True)
         open(test_build_file_path, "w").write(test_build_file_content)

+    for module in imports.MODULES:
+        if len(module.exclude_list) > 0 and len(module.include_list) > 0:
+            raise ValueError(f"Both include_list and exclude_list can't be specified for module {module.module_name}!")
+
+        module_root_dir = swa.AutogenTool.module_root_dir(module.module_name)
+        test_build_file_path = os.path.join(TEST_OUTPUT_PATH, module_root_dir, "BUILD.bazel")
+
+        # Snowpandas test build file:
+        # Contains genrules and py_test rules for all the snowpark_pandas estimators.
+        test_build_file_content = get_snowpark_pandas_test_build_file_content(module, module_root_dir)
+        os.makedirs("/".join(test_build_file_path.split("/")[:-1]), exist_ok=True)
+        open(test_build_file_path, "w").write(test_build_file_content)
+

 def get_estimators_info_file_content(module: ModuleInfo) -> str:
     """Returns information of all the transformer and estimator classes in the given module.

codegen/codegen_rules.bzl

Lines changed: 43 additions & 1 deletion
@@ -13,6 +13,9 @@ ESTIMATOR_TEMPLATE_BAZEL_PATH = "//codegen:sklearn_wrapper_template.py_template"
 ESTIMATOR_TEST_TEMPLATE_BAZEL_PATH = (
     "//codegen:transformer_autogen_test_template.py_template"
 )
+SNOWPARK_PANDAS_TEST_TEMPLATE_BAZEL_PATH = (
+    "//codegen:snowpark_pandas_autogen_test_template.py_template"
+)
 INIT_TEMPLATE_BAZEL_PATH = "//codegen:init_template.py_template"
 SRC_OUTPUT_PATH = ""
 TEST_OUTPUT_PATH = "tests/integ"
@@ -113,7 +116,7 @@ def autogen_tests_for_estimators(module, module_root_dir, estimator_info_list):
     List of generated build rules for every class in the estimator_info_list
     1. `genrule` with label `generate_test_<estimator-class-name-snakecase>` to auto-generate
        integration test for the estimator's wrapper class.
-    2. `py_test` rule with label `test_<estimator-class-name-snakecase>` to build the auto-generated
+    2. `py_test` rule with label `<estimator-class-name-snakecase>_test` to build the auto-generated
        test files from the `generate_test_<estimator-class-name-snakecase>` rule.
     """
     cmd = get_genrule_cmd(
@@ -145,3 +148,42 @@ def autogen_tests_for_estimators(module, module_root_dir, estimator_info_list):
         shard_count = 5,
         tags = ["autogen"],
     )
+
+def autogen_snowpark_pandas_tests(module, module_root_dir, snowpark_pandas_estimator_info_list):
+    """Generates `genrules` and `py_test` rules for every snowpark pandas estimator
+    List of generated build rules for every class in the snowpark_pandas_estimator_info_list
+    1. `genrule` with label `generate_test_snowpark_pandas_<estimator-class-name-snakecase>` to auto-generate
+       integration test for the estimator.
+    2. `py_test` rule with label `<estimator-class-name-snakecase>_snowpark_pandas_test` to build the auto-generated
+       test files from the `generate_test_snowpark_pandas_<estimator-class-name-snakecase>` rule.
+    """
+    cmd = get_genrule_cmd(
+        gen_mode = "SNOWPARK_PANDAS_TEST",
+        template_path = SNOWPARK_PANDAS_TEST_TEMPLATE_BAZEL_PATH,
+        module = module,
+        output_path = TEST_OUTPUT_PATH,
+    )
+
+    for e in snowpark_pandas_estimator_info_list:
+        py_genrule(
+            name = "generate_test_snowpark_pandas_{}".format(e.normalized_class_name),
+            outs = ["{}_snowpark_pandas_test.py".format(e.normalized_class_name)],
+            tools = [AUTO_GEN_TOOL_BAZEL_PATH],
+            srcs = [SNOWPARK_PANDAS_TEST_TEMPLATE_BAZEL_PATH],
+            cmd = cmd.format(e.class_name),
+            tags = ["autogen_build"],
+        )
+
+        py_test(
+            name = "{}_snowpark_pandas_test".format(e.normalized_class_name),
+            srcs = [":generate_test_snowpark_pandas_{}".format(e.normalized_class_name)],
+            deps = [
+                "//snowflake/ml/snowpark_pandas:snowpark_pandas_lib",
+                "//snowflake/ml/utils:connection_params",
+            ],
+            compatible_with_snowpark = False,
+            timeout = "long",
+            legacy_create_init = 0,
+            shard_count = 5,
+            tags = ["snowpark_pandas_autogen"],
+        )

codegen/estimator_autogen_tool.py

Lines changed: 3 additions & 2 deletions
@@ -39,9 +39,10 @@
 flags.DEFINE_string(
     "gen_mode",
     None,
-    "Options: ['SRC', 'TEST']."
+    "Options: ['SRC', 'TEST', 'SNOWPARK_PANDAS_TEST']."
     + "SRC mode generates source code for snowflake wrapper for all the estimator objects in the given modules.\n"
-    + "TEST mode generates integration tests for all the auto generated python wrappers in the given module.\n",
+    + "TEST mode generates integration tests for all the auto generated python wrappers in the given module.\n"
+    + "SNOWPARK_PANDAS_TEST mode generates snowpark pandas integration tests in the given module.\n",
 )
 flags.DEFINE_string(
     "bazel_out_dir", None, "Takes bazel out directory as input to compute relative path to bazel-bin folder"

codegen/sklearn_wrapper_autogen.py

Lines changed: 10 additions & 4 deletions
@@ -18,14 +18,16 @@
 class GenMode(Enum):
     SRC = "SRC"
     TEST = "TEST"
+    SNOWPARK_PANDAS_TEST = "SNOWPARK_PANDAS_TEST"


 class AutogenTool:
     """Tool to auto-generate estimator wrappers and integration test for estimator wrappers.

     Args:
-        gen_mode: Possible values {GenMode.SRC, GenMode.TEST}. Tool generates source code for estimator
-            wrappers or integration tests for generated estimator wrappers based on the selected mode.
+        gen_mode: Possible values {GenMode.SRC, GenMode.TEST, GenMode.SNOWPARK_PANDAS_TEST}. Tool generates source code
+            for estimator wrappers or integration tests for generated estimator wrappers or snowpark_pandas based on the
+            selected mode.
         template_path: Path to file containing estimator wrapper or test template code.
         output_path : Path to the root of the destination folder to write auto-generated code.
         class_list: Allow list of estimator classes. If specified, wrappers or tests will be generated for only
@@ -138,7 +140,8 @@ def _generate_src_files(
     def _generate_test_files(
         self, module_name: str, generators: Iterable[swg.WrapperGeneratorBase], skip_code_gen: bool = False
     ) -> List[str]:
-        """Autogenerate integ tests for snowflake estimator wrappers for the given SKLearn or XGBoost module.
+        """Autogenerate integ tests for snowflake estimator wrappers or snowpark_pandas for the given SKLearn or XGBoost
+        module.

         Args:
             module_name: Module name to process.
@@ -153,7 +156,10 @@ def _generate_test_files(

         generated_files_list = []
         for generator in generators:
-            test_output_file_name = os.path.join(self.output_path, generator.estimator_test_file_name)
+            if self.gen_mode == GenMode.TEST:
+                test_output_file_name = os.path.join(self.output_path, generator.estimator_test_file_name)
+            else:
+                test_output_file_name = os.path.join(self.output_path, generator.snowpark_pandas_test_file_name)
             generated_files_list.append(test_output_file_name)
             if skip_code_gen:
                 continue