feat(mcp): cap db_execute_query result size to protect agent context#167
feat(mcp): cap db_execute_query result size to protect agent context#167aprimakina wants to merge 2 commits into
Conversation
LLMs using db_execute_query tend to SELECT raw rows and process locally, blowing up the context window and running slowly. Cap the data the tool returns and steer agents toward doing the work in SQL. - Row cap (per result set) via new mcp_max_rows config (default 100), overridable per call by a max_rows parameter, hard-capped at 10000. - 256 KiB byte safety net across the response for very wide rows. - On truncation, abort the in-flight query (context cancel, no pgx drain), flag `truncated`, and return an actionable `notice`. Always keep at least one row so an oversized first row never yields an empty result. - Server instructions + tool description push aggregation/filtering/ pagination into SQL. MCP-only; CLI query paths are unchanged. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Askir
left a comment
There was a problem hiding this comment.
Not 100% sure about the cancellation logic it seems a bit weird that we would roll everything back just because you hit the row limit.
Also Claude told me this :D
mcp_max_rows is missing from
tiger config showtable output
| // Cancel before the deferred rows.Close() so pgx aborts the query | ||
| // instead of draining the remaining rows. | ||
| cancel() |
There was a problem hiding this comment.
Does this make sense? If the LLM provide a multi statement string, this will cancel everything and roll it back according to claude. That means If there is two selects the second one might never fire and if it mixes updates and deletes in those will be rolled back.
Should we maybe just do the truncation client side? So first processing everything and load it into memory but then truncate the individual result sets after execution?
There was a problem hiding this comment.
I agree with Jascha. I don't think we should cancel the query here, just drain the rows. I think calling rows.Close() is probably sufficient for this result set, and then also br.Close() to close the entire batch (i.e. additional result sets).
There was a problem hiding this comment.
I'm a bit skeptical of the need for this feature. For one thing, LLMs can already limit the number of results returned by including a LIMIT in the query itself. But also, many agent harnesses (e.g. Claude Code) will automatically send tool call results to a file if they're too large (instead of putting them directly into the context), and then the agent can use tools like grep or jq to search through the results in the file. So I'm not really convinced we need to add yet another way to limit/control the size of tool call results.
I guess I'm not wholly opposed to a configurable user limit (i.e. the mcp_max_rows config setting), but I do think the max_rows parameter and the hard upper cap are probably unnecessarily complicating things and limiting what users can do. See my comments below about that. Just my two cents.
|
|
||
| // No schema Default: it would be injected by the SDK and shadow the | ||
| // configured mcp_max_rows fallback when max_rows is omitted. | ||
| schema.Properties["max_rows"].Description = fmt.Sprintf("Maximum number of rows to return per result set. When the query produces more, the result set is truncated (the response indicates this) and the in-flight query is aborted to avoid streaming and buffering data the model won't use. Defaults to the configured mcp_max_rows (%d). Hard-capped at %d. Prefer aggregating or filtering in SQL over raising this.", config.DefaultMCPMaxRows, mcpMaxRowsCeiling) |
There was a problem hiding this comment.
This is a lot of text for a single field description, and it feels somewhat duplicative with the notice that gets returned when results are truncated. I'd try to make this a little more concise, if possible. Each field description uses up context, and we have lots of tools/fields, so we need to be careful or our MCP will end up eating a ton of context.
| DO THE WORK IN THE DATABASE. PostgreSQL is far more efficient at processing data than fetching raw rows and computing in your context. Push computation into SQL: aggregate with GROUP BY / COUNT / SUM / AVG / MIN / MAX, filter with WHERE, sort and take the top N with ORDER BY ... LIMIT, and join/transform server-side. Avoid SELECT * on large tables; project only the columns you need. | ||
|
|
||
| RESULTS ARE CAPPED. Each result set returns at most max_rows rows (default 100, configurable via mcp_max_rows), and the total response size is bounded. If a result set is truncated, the response sets "truncated": true and includes a "notice" — refine the query in SQL (aggregate, filter, or paginate with LIMIT/OFFSET) rather than re-running it to pull everything. Only raise max_rows when you genuinely need more raw rows in a single call. |
There was a problem hiding this comment.
Again, all of this content feels very duplicative to me. I don't think we need to beat a dead horse. We should aim to explain this all just once, and in as few words as possible, imo.
There was a problem hiding this comment.
Trimmed the text 👍
| // Cancel before the deferred rows.Close() so pgx aborts the query | ||
| // instead of draining the remaining rows. | ||
| cancel() |
There was a problem hiding this comment.
I agree with Jascha. I don't think we should cancel the query here, just drain the rows. I think calling rows.Close() is probably sufficient for this result set, and then also br.Close() to close the entire batch (i.e. additional result sets).
| // The command tag is unreliable after an abort; report rows returned. | ||
| rowsAffected = int64(len(resultRows)) |
There was a problem hiding this comment.
I think another big benefit of not cancelling the query is that we could return the true number of rows that the query returned here (as returned in the command tag), even when the results are truncated. I think that's important info, personally. It could help the LLM decide how to proceed.
There was a problem hiding this comment.
Nice idea 👍 Since we no longer cancel, the command tag is valid, so rows_affected now reports the true total even when truncated.
| // mcpMaxRowsCeiling is the hard upper bound on rows returned per result set, | ||
| // regardless of the configured or per-call max_rows. | ||
| mcpMaxRowsCeiling = 10000 |
There was a problem hiding this comment.
I'm personally not in favor of including a hard upper limit on the number of rows that can be returned. I think tools should give good defaults, but allow users to override them and do what they want at the end of the day. If a user wants to return more than 10,000 rows, they should be allowed to configure mcp_max_rows with a higher limit (or even disable the limit entirely, imo). I don't think we should prevent users from making their own decisions with respect to how they fetch and process their own data 🤷♂️.
There was a problem hiding this comment.
Removed the hard upper limit ✅
| TimeoutSeconds int `json:"timeout_seconds,omitempty"` | ||
| Role string `json:"role,omitempty"` | ||
| Pooled bool `json:"pooled,omitempty"` | ||
| MaxRows int `json:"max_rows,omitempty"` |
There was a problem hiding this comment.
I'm skeptical that we really need this parameter at all. LLMs can already limit the number of rows returned from a query by adding a LIMIT (and I think they're typically pretty good at doing this). Do we really need to expose a different knob to control basically the same thing (but with different semantics)? Won't this just confuse LLMs? Ultimately, a LIMIT is always the better choice, since it allows the query planner to optimize the query differently and avoiding sending rows over the wire in the first place. By exposing this parameter, it feels like we've nudging the LLM to use the parameter instead of a LIMIT, even though it's not going to be as efficient.
| n := configured | ||
| if requested > 0 { | ||
| n = requested | ||
| } |
There was a problem hiding this comment.
If I'm reading this correctly, it looks like the LLM can override the configured limit (i.e. the mcp_max_rows config setting) by sending it's own limit via the max_rows tool parameter. Is that really what we want? What's the point of having a global config setting if the LLM can just disregard it?
See my other comment about whether we really need the max_rows parameter at all (I'm skeptical).
There was a problem hiding this comment.
Resolved by dropping the param, mcp_max_rows config is now authoritative, and the model can't override it.
6793132 to
9c02379
Compare
Incorporate review feedback on the result-limiting change: - Drain instead of cancel on truncation, preserving the implicit transaction so writes and later statements still run, and reporting the true row count from the command tag. - Remove the hard 10000-row ceiling and the per-call max_rows parameter; mcp_max_rows config is now the single authoritative row cap. - Trim duplicative guidance text in the tool description and notice. - Show mcp_max_rows in `tiger config show` table output. - Fix the rows_affected schema description, which contradicted itself after rows_affected became the true total. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
9c02379 to
acebe3d
Compare
Fair points — dropped both the |
LLMs using
db_execute_querytend to SELECT raw rows and process locally, blowing up the context window and running slowly. Cap the data the tool returns and steer agents toward doing the work in SQL.mcp_max_rowsconfig (default 100). Not a per-call parameter — the config is the single knob, so the model can't override it.LIMITin the query remains the preferred tool when the agent knows how many rows it wants.rows_affectedreports the true total. The response setstruncated: trueand returns anoticesteering the agent to aggregate / filter / paginate in SQL instead of re-running the query.MCP-only; CLI query paths are unchanged.