Ray memory usage not respecting defaults? #235

I was using swifter for a groupby/apply. It ran through Ray and spawned 32 workers, which simply ran out of memory (OOM). I thought I might be able to make it run on fewer workers by default, so I added the import and set the default `npartitions` to 8, but this seemed to have no effect.

I then tried `df.swifter.set_npartitions(npartitions=8).groupby("ticket_id", group_keys=False).apply(create_ticket_object)`, also to no avail.

I saw the comment in the documentation saying that the call to `set_defaults` needs to occur before the DataFrame is instantiated. Since I'm using `duckdb` to query a number of CSVs, it occurred to me that the DataFrame might be created in a different manner when I run something like this:

```python
import duckdb

# `path` (defined elsewhere) is the directory that holds the CSV files.
df = duckdb.sql(
    """
    SELECT *
    FROM read_csv(
        ?,
        delim = ',',
        quote = '"',
        header = true,
        skip = 1,
        null_padding = true,
        parallel = false, -- we need this because the data is quoted
        all_varchar = true,
        max_line_size = 10000000
    );
    """,
    params=(f"{path}/*.csv",),
).df()
```
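For reference, here is a minimal sketch of the ordering I understand the documentation to require: register the swifter defaults first, and only then build the DataFrame. The helper name `load_after_defaults` and its `load` argument are hypothetical, introduced only for illustration. The sketch assumes a swifter version that exposes `set_defaults`, and it skips that step if swifter is not installed. Whether this ordering actually changes the number of Ray workers used for a groupby/apply is exactly what I am unsure about.

```python
import importlib.util
from typing import Callable, TypeVar

T = TypeVar("T")


def load_after_defaults(load: Callable[[], T], npartitions: int = 8) -> T:
    """Register swifter defaults, then build the DataFrame via `load`.

    `load` is any zero-argument callable that returns the DataFrame,
    e.g. a lambda wrapping the duckdb query shown above.
    """
    # Only touch swifter if it is importable in this environment.
    if importlib.util.find_spec("swifter") is not None:
        from swifter import set_defaults

        # Assumption: the defaults must be registered before the
        # DataFrame exists, as the documentation comment suggests.
        set_defaults(npartitions=npartitions)

    # The DataFrame is created only after the defaults are in place.
    return load()
```

With this in place, the query above would be wrapped as `load_after_defaults(lambda: duckdb.sql(...).df())`, so the DataFrame is guaranteed to be created after the defaults are registered.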
