`TypeError: data type 'boolean' not understood` when updating to dask-sql 2024.5.0 and installing dask-expr 1.1.14 #1346

@teresama

Updating dask-sql to version 2024.5.0 requires dask-expr to be installed.
On my machine I have the following installed:

pandas 2.2.3
dask 2024.9.0
dask-expr 1.1.14
dask_sql 2024.5.0
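
For anyone trying to reproduce this, the installed versions can be confirmed from inside the same virtual environment. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is made up for illustration, and the distribution names passed to it are assumed to match what pip reports.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


# Distribution names as listed above (hypothetical helper, real stdlib calls).
for name, ver in installed_versions(["pandas", "dask", "dask-expr", "dask-sql"]).items():
    print(name, ver or "not installed")
```

Running this in both the working and the failing environment makes it easier to see which packages changed between them.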

I am getting the error:
.venv/lib/python3.10/site-packages/dask/utils.py", line 1241, in __call__
    return getattr(__obj, self.method)(*args, **kwargs)
TypeError: data type 'boolean' not understood

when running:

import pandas as pd
import dask.dataframe as dd
from dask_sql import Context

data = {
    "column8": []
}
df = pd.DataFrame(data)

ddf = dd.from_pandas(df, npartitions=1)

c = Context()
c.create_table("tablename", ddf)
query = """
WITH
    sampled_table AS (
        SELECT "column8" AS "NEW_NAME"
        FROM tablename t
    ),
    table2 AS (
        SELECT "NEW_NAME" AS output1, COUNT(*) AS output2
        FROM sampled_table t
        GROUP BY "NEW_NAME"
    ),
    outputtable AS (
        SELECT *
        FROM table2 t
        WHERE output1 IS NOT NULL
    )
SELECT *
FROM outputtable"""

result = c.sql(query)
print(result.compute()) 

This code worked before the update (with dask==2024.1.1 and dask-sql==2024.3.0).
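
One possible reading of the message, offered as an assumption rather than a confirmed diagnosis: `"boolean"` is the string name of pandas' nullable `BooleanDtype`, and NumPy raises exactly this `TypeError` when asked to build a dtype from that string. The sketch below only demonstrates that NumPy behavior; it does not show where in dask or dask-sql the string is passed along. The helper `numpy_understands` is hypothetical.

```python
import numpy as np


def numpy_understands(dtype_name: str) -> bool:
    """Return True if NumPy can build a dtype from the given string."""
    try:
        np.dtype(dtype_name)
        return True
    except TypeError:
        # NumPy reports: data type '<name>' not understood
        return False


print(numpy_understands("bool"))     # NumPy's own boolean dtype name
print(numpy_understands("boolean"))  # pandas' nullable extension dtype name
```

If this is what happens here, the `IS NOT NULL` filter on an empty column would be a plausible place to look, since it produces a boolean mask, but that remains a guess.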
