QST: Subject: User Experience Issue - NumPy Types in DataFrame Results Breaking Readability #61607

COderHop · 2025-06-08T09:05:24Z

Research

I have searched the [pandas] tag on StackOverflow for similar questions.
I have asked my usage related question on StackOverflow.

Link to question on StackOverflow

None

Question about pandas

Issue Description
TL;DR: Since pandas 2.0+, .tolist() and similar methods return NumPy types instead of native Python types, severely impacting user experience and data readability.
Problem Example
Before (pandas 1.x):
df.index.tolist()

Returns: [0, 1, 2, 3, 4] # Clean, readable

Now (pandas 2.x):
df.index.tolist()

Returns: [np.int64(0), np.int64(1), np.int64(2), np.int64(3), np.int64(4)] # Verbose, confusing

Impact on User Experience

Poor Readability: Results are cluttered with np.int64(), np.float64() wrappers
Debugging Nightmare: Harder to quickly scan and understand data
Display Issues: When printing or logging, output is unnecessarily verbose
User Confusion: Many users don't understand why they're seeing NumPy types
Breaking Change: Existing code expectations broken without clear migration path

Current Workarounds Are Painful
Users now need to write additional code for basic operations:
# Instead of simple:
indices = df.index.tolist()

# We need:
indices = [int(x) for x in df.index.tolist()]
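For readers who want a reusable form of this workaround, here is a minimal sketch that relies only on documented NumPy behavior: every NumPy scalar has an `.item()` method that returns the equivalent Python object, and `ndarray.tolist()` converts its elements to Python scalars. The helper name `to_native` is hypothetical and is not part of pandas or NumPy.

```python
import numpy as np


def to_native(values):
    """Return a list in which NumPy scalars are replaced by Python scalars.

    Hypothetical helper for illustration; non-NumPy items pass through unchanged.
    """
    return [v.item() if isinstance(v, np.generic) else v for v in values]


# A list mixing NumPy scalars with ordinary Python objects.
mixed = [np.int64(0), np.float64(1.5), 2, "a"]
native = to_native(mixed)
print(native)                                # [0, 1.5, 2, 'a']
print([type(v).__name__ for v in native])    # ['int', 'float', 'int', 'str']

# For a whole array, ndarray.tolist() already yields Python scalars.
arr = np.arange(5, dtype=np.int64)
arr_as_list = arr.tolist()
print(arr_as_list)                           # [0, 1, 2, 3, 4]
```

Unlike `int(x)`, the `.item()` call keeps floats as floats and leaves strings and other Python objects untouched, so the same helper works for mixed columns.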
The Core Problem
DataFrames are meant for data analysis and exploration. The primary use case is human-readable data inspection, not performance-critical numerical computation at the .tolist() level.
Suggested Solutions

Add a parameter: .tolist(native_types=True) (default True for user-facing methods)
Separate methods: Keep .tolist() for NumPy types, add .tolist_clean() for Python types
Configuration option: Allow users to set pandas behavior globally
Revert the change: Prioritize user experience over marginal performance gains

Why This Matters
Pandas' strength has always been its ease of use and intuitive behavior. This change sacrifices user experience for performance gains that most users don't need when calling .tolist().
The goal of data analysis is insight, not fighting with data types.
Request
Please consider reverting this behavior or providing a simple, built-in solution. The current situation forces every pandas user to write boilerplate code for basic data inspection.
Thank you for maintaining this incredible library. I hope we can find a solution that balances performance with the user-friendly experience that makes pandas great.

Environment:

pandas: 2.2.3
numpy: 1.26.4
Impact: All DataFrame operations returning lists

simonjayhawkins · 2025-06-08T17:52:46Z

Thanks @COderHop for the report.

It appears that you have a good grasp of the issue. IIRC this has been reported/discussed before but I can't find it at this time.

Breaking Change: Existing code expectations broken without clear migration path

I do not agree that this is true from the pandas perspective. NumPy changed the repr of its scalars, and pandas continues to return NumPy types as before. Only the repr has changed, and that should not really be considered a pandas issue.

However, to be fair, many users were probably unaware before that their lists contained NumPy types rather than Python types, which would perhaps have been the more logical design choice. Had pandas changed the return type, however, that would have been a breaking change.
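The distinction between the type and its repr can be checked directly with NumPy alone. The sketch below is illustrative and not taken from the thread. It assumes that the scalar repr differs between NumPy 1.x and 2.x as described above, and that on NumPy 2.x the `legacy="1.25"` print option restores the older scalar repr. The variable names are placeholders.

```python
import numpy as np

x = np.int64(0)

# The object is a NumPy scalar on every NumPy version.
type_name = type(x).__name__
print(type_name)      # int64

# str() shows the bare value; repr() is what printing a list uses.
print(str(x))         # 0
print(repr(x))        # "0" on NumPy 1.x, "np.int64(0)" on NumPy 2.x
print([x, x])         # each element is shown via its repr

# Value semantics do not depend on how the scalar prints.
same_value = (x == 0) and isinstance(x, np.integer)
print(same_value)     # True

# On NumPy 2.x, the legacy print option switches scalar reprs back
# to the bare value. This changes global state, so it is reset afterwards.
numpy_major = int(np.__version__.split(".")[0])
legacy_repr = None
if numpy_major >= 2:
    np.set_printoptions(legacy="1.25")
    legacy_repr = repr(x)
    np.set_printoptions(legacy=False)
    print(legacy_repr)
```

Because only the printed form differs, code that compares or computes with these values behaves the same either way. The difference shows up in logs, doctests, and interactive output.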

Please consider reverting this behavior or providing a simple, built-in solution.

IIRC other discussions have suggested making this breaking change in a future release, changing the return type of some operations for which returning standard Python objects would be appropriate. This seems reasonable to me.

Even though I'm sure this is a duplicate issue, I'll leave it open until I can find the other issues or until someone else points us in the right direction.

@mroeschke IIRC you did some PRs at some point related to this to fix CI?

COderHop added the Usage Question and Needs Triage (issue that has not been reviewed by a pandas team member) labels on Jun 8, 2025