AutoML is not working properly with a large volume of data (30 million rows with 150 features)

### Sparkling Water Version

3.5

### Issue description

Expected behavior:
Since Sparkling Water can train individual models such as XGBoost on this data, the AutoML API should also run on it.
Observed behavior:
Sparkling Water can train individual models such as XGBoost, but it fails to run with the AutoML API.

### Programming language used

Python

### Programming language version

3.11

### What environment are you running Sparkling Water on?

Cloud Managed Spark (like Databricks, AWS Glue)

### Environment version info

15.4 LTS (includes Apache Spark 3.5.0, Scala 2.12)

### Brief cluster specification

Runtime 15.4.x-scala2.12; 1 driver with 64 GB memory and 8 cores; 7 workers with 64 GB memory and 8 cores
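
For context, here is a rough back-of-the-envelope sizing of the dataset against the cluster. It is only a sketch, not a measurement: the row and feature counts come from the title, the worker figures come from the specification above (read as per-worker memory), and the 8 bytes per value is an assumption that every feature is stored as an uncompressed 64-bit double. Actual in-memory size in H2O will differ, and the estimate says nothing about the working memory AutoML needs while training.

```python
# Rough sizing sketch. Values marked ASSUMPTION are not from the report.
ROWS = 30_000_000        # from the issue title
FEATURES = 150           # from the issue title
BYTES_PER_VALUE = 8      # ASSUMPTION: uncompressed 64-bit double per feature value

WORKERS = 7              # from the cluster specification
WORKER_MEMORY_GB = 64    # from the cluster specification, ASSUMED to be per worker


def raw_dataset_gb(rows: int, features: int, bytes_per_value: int) -> float:
    """Uncompressed size of a dense numeric table, in decimal gigabytes."""
    return rows * features * bytes_per_value / 1e9


dataset_gb = raw_dataset_gb(ROWS, FEATURES, BYTES_PER_VALUE)
total_worker_gb = WORKERS * WORKER_MEMORY_GB

print(f"raw dataset: {dataset_gb:.1f} GB")
print(f"total worker memory: {total_worker_gb} GB")
print(f"memory / data ratio: {total_worker_gb / dataset_gb:.1f}x")
```

Under these assumptions the raw table is about 36 GB against 448 GB of total worker memory.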

### Relevant log output

```shell
No error logs are available because the process keeps running for a long time.
```


### Code to reproduce the issue

_No response_

AutoML is not working properly with a large volume of data (30 million rows with 150 features) #5747