Summary
--emit neo4j materializes only a small fraction of the call graph: on daytrader8 the emitted graph has 287 J_CALLS edges, while the analysis.json call_graph from an equivalent level-2 run has ~2077 edges. Edges are missing even when both endpoint :JCallable nodes exist in the graph, so it is not (only) the external-target gating.
Environment
- codeanalyzer-java 2.4.0 (--analysis-level 2, live Bolt push)
Evidence
- The call graph is essentially deterministic: two --emit json runs gave 2077 and 2076 call_graph edges.
- The same project with --emit neo4j produced 287 J_CALLS edges.
- Of 1855 fully-resolved edges in analysis.json (both endpoints are app callables present in the symbol table), only ~287 unique edges have a J_CALLS edge in the graph; ~1523 are absent despite both endpoint :JCallable nodes existing:
// for an absent edge, both nodes resolve but no relationship connects them:
MATCH (a:JCallable {id:$sid}) RETURN a // exists
MATCH (b:JCallable {id:$tid}) RETURN b // exists
MATCH (:JCallable {id:$sid})-[r:J_CALLS]->(:JCallable {id:$tid}) RETURN count(r) // 0
Example absent edges (both endpoints present as nodes):
...web.websocket.ActionMessage#... -> ...util.Log#error(...),
...web.prims.PingWebSocketJson#... -> ...web.websocket.JsonMessage#....
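The per-edge check above can be batched offline. The following is a minimal sketch, not the tool's own code: it assumes each call_graph entry in analysis.json has "source" and "target" vertices carrying type_declaration and signature (the "source"/"target" field names are an assumption), and that the node ids and (source id, target id) pairs have already been exported from Neo4j into Python sets. The function names are hypothetical.

```python
from __future__ import annotations

from typing import Iterable


def callable_id(vertex: dict) -> str:
    """Build the id described in this report: type_declaration + "#" + signature."""
    return f'{vertex["type_declaration"]}#{vertex["signature"]}'


def missing_edges(
    call_graph: Iterable[dict],
    node_ids: set[str],
    graph_edges: set[tuple[str, str]],
) -> list[tuple[str, str]]:
    """Return unique JSON edges whose endpoints both exist as nodes
    but which have no J_CALLS counterpart in the exported graph."""
    seen: set[tuple[str, str]] = set()
    missing: list[tuple[str, str]] = []
    for edge in call_graph:
        # "source"/"target" are assumed field names for the edge endpoints.
        pair = (callable_id(edge["source"]), callable_id(edge["target"]))
        if pair in seen:
            continue  # count each (source, target) pair once
        seen.add(pair)
        src, tgt = pair
        if src in node_ids and tgt in node_ids and pair not in graph_edges:
            missing.append(pair)
    return missing


# Hypothetical toy data, not taken from daytrader8.
sample_call_graph = [
    {"source": {"type_declaration": "a.A", "signature": "run()"},
     "target": {"type_declaration": "a.B", "signature": "log(String)"}},
    {"source": {"type_declaration": "a.A", "signature": "run()"},
     "target": {"type_declaration": "a.C", "signature": "send()"}},
]
sample_node_ids = {"a.A#run()", "a.B#log(String)", "a.C#send()"}
sample_graph_edges = {("a.A#run()", "a.C#send()")}

absent = missing_edges(sample_call_graph, sample_node_ids, sample_graph_edges)
```

Run against the real export, the length of the returned list should be comparable to the absent-edge count reported above if the ids are built the same way.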
Likely area
neo4j/GraphProjector.java, projectCallGraph →
b.edgeIfBothResolved("J_CALLS", new NodeRef("JSymbol","id",from), new NodeRef("JSymbol","id",to), props)
with from/to = type_declaration + "#" + signature. Since both endpoint nodes exist with exactly those ids, edgeIfBothResolved should retain them. That leaves two candidate causes: either the resolution key used during projection differs from the node id actually emitted (a normalization mismatch between the call_graph vertices' type_declaration/signature and the symbol-table ids), or the --emit neo4j path is projecting a much smaller call graph than --emit json. It is worth confirming whether the two emit paths build the SDG identically.
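One way to test the normalization hypothesis, if the keys the projector tries to resolve can be logged or dumped, is to classify them against the emitted node ids. This is a rough sketch under assumptions: the whitespace-stripping rule is only one candidate normalization, not a known behavior of GraphProjector, and classify_keys is a hypothetical helper.

```python
import re


def normalize(callable_id: str) -> str:
    """Candidate normalization (an assumption): drop all whitespace."""
    return re.sub(r"\s+", "", callable_id)


def classify_keys(projection_keys, node_ids):
    """Split projection keys into exact matches, matches that only
    succeed after normalization, and keys with no matching node id."""
    exact_ids = set(node_ids)
    by_normal = {normalize(n): n for n in exact_ids}
    report = {"exact": [], "normalized_only": [], "unmatched": []}
    for key in projection_keys:
        if key in exact_ids:
            report["exact"].append(key)
        elif normalize(key) in by_normal:
            # Same callable, different spelling: evidence of a key mismatch.
            report["normalized_only"].append((key, by_normal[normalize(key)]))
        else:
            report["unmatched"].append(key)
    return report


# Hypothetical ids for illustration only.
report = classify_keys(
    ["x.Log#error(String, Throwable)", "x.Log#error(String,Throwable)", "x.Gone#m()"],
    {"x.Log#error(String, Throwable)"},
)
```

A large "normalized_only" bucket would point at the first cause; a mostly "exact" result with far fewer keys than JSON edges would point at the second.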
Impact
Any consumer reading the call graph from Neo4j (e.g. the CLDK SDK's read-only Java backend) sees ~14% of the call edges the in-memory / analysis.json backend reports (287 of ~2077), so callers/callees, class call graphs, and reachability are all substantially incomplete.