fix(bigquery): close GAPIC storage transport and auth sessions to prevent socket leaks#17508
Conversation
There was a problem hiding this comment.
Code Review
This pull request updates the BigQuery client to close the transport directly via _transport.close() instead of _transport.grpc_channel.close(), updates corresponding unit and system tests, and adds a new TestPropertyGraphReference test class. The review feedback suggests improving the patch_tracked_requests context manager in system tests to avoid closing pre-existing sessions, and refactoring manual debug prints to use idiomatic assertion messages instead.
77adf99 to
f2ec799
Compare
f2ec799 to
43e6eec
Compare
| bigquery_magics = None | ||
|
|
||
| if sys.version_info < (3, 10): | ||
| if sys.version_info < (3, 10): # pragma: NO COVER |
There was a problem hiding this comment.
I'd prefer we add a comment explaining why this is safe. In this case, it's to protect from the user somehow installing this despite us dropping support in #17187
There was a problem hiding this comment.
Regarding the comment on the version check: I looked at the gapic-generator-python templates and how this is handled across the rest of the monorepo. The generator templates do not include an explanatory comment here, and the vast majority of our libraries don't use one. The code is essentially self-documenting since the warnings.warn string states why the check is firing (to warn users who are on an unsupported Python version). To maintain parity with our generated code and the rest of the libraries, I'd prefer to leave the comment off.
There was a problem hiding this comment.
Regarding the comment on the version check: I looked at the gapic-generator-python templates and how this is handled across the rest of the monorepo. The generator templates do not include an explanatory comment here, and the vast majority of our libraries don't use one. The code is essentially self-documenting since the warnings.warn string states why the check is firing (to warn users who are on an unsupported Python version). To maintain parity with our generated code and the rest of the libraries, I'd prefer to leave the comment off.
| client.close() | ||
|
|
||
| client.close() | ||
| import gc |
There was a problem hiding this comment.
I've recently experienced pytest still hanging onto resources even after explicit del calls (pola-rs/polars-bigquery-client#3). Might make sense to add a layer of indirection like I did in https://github.com/pola-rs/polars-bigquery-client/pull/3/changes#diff-d9ee89704040c8f986c8e0a4ab3757806eebe9caa54073a72a3672b5626df3cf
There was a problem hiding this comment.
I've reviewed the polars-bigquery PR. While pytest's reference-holding is a concern, I believe this PR is less susceptible because we’ve moved toward explicit closure of the transport and auth sessions (via the tracker) rather than relying solely on object destruction. Since we aren't dealing with Rust-backed objects that have opaque lifecycles, the explicit .close() calls should release the underlying sockets even if pytest lingers on the Python wrapper. I'll monitor for flakiness, but the current gc.collect() approach seems to resolve the leaks in my testing.
| self.assertEqual(len(rows), 100000) | ||
|
|
||
| connection.close() | ||
| import gc |
There was a problem hiding this comment.
same for this function re: splitting the gc logic tests into a separate module.
There was a problem hiding this comment.
Same as above comment.
This PR resolves resource leaks (specifically open sockets left in the
ESTABLISHEDstate) that occur during client lifecycle operations and credential refreshing in system/unit tests.The Problem
Transport Lifecycle: When closing the BigQuery Storage client, calling
_transport.grpc_channel.close()was insufficient for releasing all network resources. The full_transport.close()method needs to be invoked to tear down the underlying transport channel correctly.Dynamic Auth Sessions: Under certain authentication environments (like Workload Identity/GCE Metadata server inside Kokoro CI), the
google-authlibrary dynamically instantiates helper HTTP sessions to fetch access tokens. These sessions are not owned by the BigQuery client and are not closed automatically, leading to leaked sockets.Flaky Test Assertions: Socket count assertions in system tests were flaky because Python's garbage collection is non-deterministic, meaning sockets remained open in the operating system even after client close calls until a garbage collection cycle swept them.
Changes
Client & Transport Lifecycle:
_transport.close()instead of_transport.grpc_channel.close().Testing Improvements:
patch_tracked_requestsinterceptor to system/unit tests to track and explicitly close all dynamically spawned credential-refreshing HTTP sessions when the test context exits.gc.collect()calls to socket leak verification tests to force synchronous sweeping of unreferenced socket objects before asserting final socket counts.Code Coverage:
# pragma: NO COVERto Python version checks in__init__.pyfor code paths that render a deprecation warning code if attempted to run on Python runtimes <3.10 test matrix (these paths do not run in our CI/CD since we never execute code with the older runtimes).