Add markdown output format for execute_sql to reduce token usage by ~50% #297
Merged
calreynolds merged 1 commit into databricks-solutions:main on Mar 16, 2026
SQL results are consumed by LLMs via MCP, but the JSON array-of-objects format repeats every column name on every row — wasting ~42% of the payload on redundant keys. For a 100-row × 10-column result, JSON produces ~27K chars vs ~14K for a markdown table. This adds an `output_format` parameter (default: "markdown") to `execute_sql` and `execute_sql_multi`. Markdown tables state column names once in the header, which LLMs parse natively. Use `output_format="json"` for backwards compatibility. Closes databricks-solutions#296
calreynolds (Collaborator) approved these changes on Mar 16, 2026 and left a comment:
Thank you! Great PR 👍
Summary
- Adds `output_format` parameter to `execute_sql` and `execute_sql_multi` MCP tools
- Default is `"markdown"`, which returns a markdown table string instead of JSON array-of-objects
- `output_format="json"` preserves existing behavior for backwards compatibility

Closes #296
Problem
`execute_sql` returns results as a JSON array of objects, repeating every column name on every row:

```json
[
  {"event_id": "EVT-10001", "event_name": "Concert A", "venue": "Arena 1", "city": "NYC"},
  {"event_id": "EVT-10002", "event_name": "Concert B", "venue": "Arena 2", "city": "Chicago"}
]
```

Since MCP tools are consumed by LLMs, this is extremely wasteful: ~42% of the payload is redundant column names. In a real session with 107 SQL queries, this caused 4 context compaction events and burned ~64M tokens.
Solution
Return a markdown table by default, so column names are stated once in the header row instead of on every row.
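A minimal sketch of what such a formatter could look like. The name `format_results_markdown`, the `(no rows)` placeholder, and the escaping choices are assumptions for illustration; the PR's actual `_format_results_markdown()` helper may differ.

```python
from typing import Any, Dict, List


def _cell(value: Any) -> str:
    """Render one cell: None becomes empty, pipes and newlines are neutralized."""
    if value is None:
        return ""
    text = str(value)
    # A raw "|" would be read as a column separator, so escape it.
    return text.replace("|", "\\|").replace("\n", " ")


def format_results_markdown(rows: List[Dict[str, Any]]) -> str:
    """Hypothetical formatter: list of row dicts -> markdown table string."""
    if not rows:
        return "(no rows)"  # assumed placeholder for an empty result set
    columns = list(rows[0].keys())  # column order taken from the first row
    lines = [
        "| " + " | ".join(_cell(c) for c in columns) + " |",
        "| " + " | ".join("---" for _ in columns) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(_cell(row.get(c)) for c in columns) + " |")
    return "\n".join(lines)


rows = [
    {"event_id": "EVT-10001", "event_name": "Concert A", "venue": "Arena 1", "city": "NYC"},
    {"event_id": "EVT-10002", "event_name": "Concert B", "venue": "Arena 2", "city": "Chicago"},
]
print(format_results_markdown(rows))
```

Run against the two example rows from the Problem section, this prints a four-line table in which each column name appears only in the header line.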
For a 100-row × 10-column result set, JSON produces ~27K chars versus ~14K for a markdown table.
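The comparison can be roughly reproduced with synthetic data. The sketch below builds a made-up 100-row × 10-column result and compares character counts for the two formats; the column names and values are invented, so the exact numbers will differ from the figures quoted in this PR.

```python
import json

# Synthetic 100-row x 10-column result; names and values are invented for illustration.
columns = [f"column_name_{i}" for i in range(10)]
rows = [{c: f"value_{r}_{i}" for i, c in enumerate(columns)} for r in range(100)]

# JSON array of objects: every key is repeated on every row.
json_payload = json.dumps(rows)

# Markdown table: keys appear once, in the header.
header = "| " + " | ".join(columns) + " |"
divider = "| " + " | ".join("---" for _ in columns) + " |"
body = ["| " + " | ".join(row[c] for c in columns) + " |" for row in rows]
markdown_payload = "\n".join([header, divider, *body])

saving = 1 - len(markdown_payload) / len(json_payload)
print(f"json: {len(json_payload)} chars")
print(f"markdown: {len(markdown_payload)} chars")
print(f"saving: {saving:.0%}")
```

The saving grows with the length of the column names relative to the values, since the JSON form pays for each name once per row.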
Changes
- `databricks-mcp-server/databricks_mcp_server/tools/sql.py`: adds `_format_results_markdown()` helper and `output_format` parameter to both `execute_sql` and `execute_sql_multi`
- `databricks-mcp-server/tests/test_sql_output_format.py`: 7 unit tests covering empty results, single/multiple rows, None handling, pipe escaping, column-name-once guarantee, and size comparison

No changes to `databricks-tools-core`; formatting is applied at the MCP server layer only.

Test plan
- Run `pytest tests/test_sql_output_format.py`
- Call `execute_sql` with default format and verify markdown output
- Call `execute_sql` with `output_format="json"` and verify JSON output
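
The two manual checks could be rehearsed against a stand-in like the one below. `render_results` and `to_markdown` are hypothetical names, not the server's code; the sketch only illustrates how an `output_format` argument with a `"markdown"` default and a `"json"` fallback might be dispatched. Rejecting unknown values with `ValueError` is an assumption.

```python
import json
from typing import Any, Dict, List, Union

Rows = List[Dict[str, Any]]


def to_markdown(rows: Rows) -> str:
    """Compact table builder for this sketch (no escaping or None handling)."""
    if not rows:
        return ""
    columns = list(rows[0])
    lines = [
        "| " + " | ".join(columns) + " |",
        "| " + " | ".join(["---"] * len(columns)) + " |",
    ]
    lines += ["| " + " | ".join(str(row[c]) for c in columns) + " |" for row in rows]
    return "\n".join(lines)


def render_results(rows: Rows, output_format: str = "markdown") -> Union[str, Rows]:
    """Hypothetical dispatch on output_format."""
    if output_format == "markdown":
        return to_markdown(rows)
    if output_format == "json":
        return rows  # original array-of-objects shape, unchanged
    raise ValueError(f"unsupported output_format: {output_format!r}")


rows = [{"event_id": "EVT-10001", "city": "NYC"}]
default_result = render_results(rows)                     # markdown string
json_result = render_results(rows, output_format="json")  # list of dicts
print(default_result)
print(json.dumps(json_result))
```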