🐛 Potentially missing extra in ibis dependency #6

@alexkolo

Description

To get icanexplain working, I had to install the following packages, which the ibis module requires:

- pyarrow~=18.0.0
- pyarrow_hotfix~=0.6

I encountered this on Windows 10 with Python 3.12.3 and on Linux (WSL) with Python 3.11.10.
I can't say whether this is specific to my setup or a general problem, but I thought I should let you know in case you want to check.

Using icanexplain==0.3.0.
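
A quick way to check whether an environment is affected is to test whether the two modules listed above are importable. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical and is not part of icanexplain or ibis.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Module names taken from the package list above.
    print(missing_modules(["pyarrow", "pyarrow_hotfix"]))
```

An empty list would mean both modules are already available; any name printed is one that ibis would fail to import.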

Possible fix

Change this line:

https://github.com/carbonfact/icanexplain/blob/2b5b1730d5aa16c07891530f5619e2fe2ea9062e/pyproject.toml#L10

to this:

ibis-framework = {extras = ["duckdb"], version = "^9.5.0"}

According to this file, it would install the missing modules.
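
One way to double-check which packages an extra pulls in is to read the installed distribution's declared requirements. This sketch assumes ibis-framework is installed and that its metadata gates optional requirements behind an `extra == "..."` marker, as is usual for Python package metadata; `requirements_for_extra` is a hypothetical helper, not an ibis or icanexplain function.

```python
from importlib.metadata import PackageNotFoundError, requires


def requirements_for_extra(requirement_strings, extra):
    """Keep only the requirement strings gated behind the given extra."""
    # Markers may be written with either quote style, so check both.
    needles = (f'extra == "{extra}"', f"extra == '{extra}'")
    return [r for r in requirement_strings if any(n in r for n in needles)]


if __name__ == "__main__":
    try:
        # requires() returns None when a distribution declares no requirements.
        declared = requires("ibis-framework") or []
    except PackageNotFoundError:
        declared = []  # ibis-framework is not installed in this environment
    for requirement in requirements_for_extra(declared, "duckdb"):
        print(requirement)
```

If pyarrow and pyarrow_hotfix show up in the printed list, the `duckdb` extra would cover the missing modules.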

How to reproduce

  • pip install icanexplain==0.3.0
import pandas as pd
import icanexplain as ice
data: list[dict[str, int]] = [
    {"year": 2019, "n_bookings": 1_000, "revenue_per_booking": 200},
    {"year": 2020, "n_bookings": 1_000, "revenue_per_booking": 220},
    {"year": 2021, "n_bookings": 1_500, "revenue_per_booking": 220},
    {"year": 2022, "n_bookings": 1_700, "revenue_per_booking": 225},
]
revenue: pd.DataFrame = pd.DataFrame(data=data)
explainer = ice.SumExplainer(fact="revenue_per_booking", period="year", count="n_bookings")
explanation: pd.DataFrame = explainer(revenue)

Traceback

{
	"name": "ModuleNotFoundError",
	"message": "No module named 'pyarrow'",
	"stack": "---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[4], line 2
      1 explainer = ice.SumExplainer(fact=\"revenue_per_booking\", period=\"year\", count=\"n_bookings\")
----> 2 explanation: pd.DataFrame = explainer(revenue)

File ..\site-packages\\icanexplain\\__init__.py:76, in Unpacker.__call__(self, table)
     75 def __call__(self, table):
---> 76     explanation = self._explanation(table)
     77     explanation_fmt = self._format(explanation)
     78     if is_pandas_dataframe(table):

File ..\site-packages\\icanexplain\\__init__.py:42, in coerce_table.<locals>._impl(self, table)
     39 @functools.wraps(method)
     40 def _impl(self, table):
     41     if is_pandas_dataframe(table):
---> 42         return method(self, ibis.memtable(table[self._necessary_columns]))
     43     if is_polars_dataframe(table):
     44         return method(self, ibis.memtable(table[self._necessary_columns]))

File ..\site-packages\\ibis\\expr\\api.py:462, in memtable(data, columns, schema, name)
    457 if columns is not None and schema is not None:
    458     raise NotImplementedError(
    459         \"passing `columns` and schema` is ambiguous; \"
    460         \"pass one or the other but not both\"
    461     )
--> 462 return _memtable(data, name=name, schema=schema, columns=columns)

File ..\site-packages\\ibis\\common\\dispatch.py:140, in lazy_singledispatch.<locals>.call(arg, *args, **kwargs)
    137 @functools.wraps(func)
    138 def call(arg, *args, **kwargs):
    139     impl = dispatcher.dispatch(type(arg))
--> 140     return impl(arg, *args, **kwargs)

File .\site-packages\\ibis\\expr\\api.py:475, in _memtable(data, columns, schema, name)
    465 @lazy_singledispatch
    466 def _memtable(
    467     data: pd.DataFrame | Any,
   (...)
    471     name: str | None = None,
    472 ) -> Table:
    473     import pandas as pd
--> 475     from ibis.formats.pandas import PandasDataFrameProxy
    477     if not isinstance(data, pd.DataFrame):
    478         df = pd.DataFrame(data, columns=columns)

File ..\site-packages\\ibis\\formats\\pandas.py:20
     18 from ibis.formats import DataMapper, SchemaMapper, TableProxy
     19 from ibis.formats.numpy import NumpyType
---> 20 from ibis.formats.pyarrow import PyArrowData, PyArrowSchema, PyArrowType
     22 if TYPE_CHECKING:
     23     import polars as pl

File ..site-packages\\ibis\\formats\\pyarrow.py:5
      1 from __future__ import annotations
      3 from typing import TYPE_CHECKING, Any
----> 5 import pyarrow as pa
      6 import pyarrow_hotfix  # noqa: F401
      8 import ibis.expr.datatypes as dt

ModuleNotFoundError: No module named 'pyarrow'"
}
