fix(cncf-kubernetes): use Configuration.socket_options instead of monkey-patching urllib3 #68557

Open
anmolxlight wants to merge 1 commit into apache:main from anmolxlight:fix-tcp-keepalive-urllib3-v2

Conversation

@anmolxlight

Contributor

Description

The kubernetes_executor.enable_tcp_keepalive config option relied on monkey-patching urllib3's default_socket_options after module import. In urllib3 v2.x, the socket_options parameter of HTTPConnection.__init__ takes default_socket_options as its default value, and Python evaluates that default once, when the class body runs at import time. Reassigning the class attribute afterwards therefore has no effect on newly created connections, so the TCP keepalive configuration was silently a no-op.
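The default-argument behavior described above can be reproduced without urllib3. The following is a minimal sketch, assuming a hypothetical FakeConnection class that only mimics the shape of the signature; the option tuples are plain strings for illustration and are not real socket constants.

```python
class FakeConnection:
    """Hypothetical stand-in for a connection class; this is not urllib3."""

    default_socket_options = [("IPPROTO_TCP", "TCP_NODELAY", 1)]

    # The default below is bound to the list object that exists while the
    # class body runs. It is not looked up again on each call.
    def __init__(self, socket_options=default_socket_options):
        self.socket_options = socket_options


# Monkey-patch after definition: rebind the class attribute to a NEW list.
FakeConnection.default_socket_options = FakeConnection.default_socket_options + [
    ("SOL_SOCKET", "SO_KEEPALIVE", 1)
]

# A connection created without arguments still gets the original list.
patched_conn = FakeConnection()

# Passing the options explicitly is the only way the new list is seen.
explicit_conn = FakeConnection(socket_options=FakeConnection.default_socket_options)
```

Here patched_conn never sees the keepalive entry, while explicit_conn does, which is why this PR passes the options explicitly rather than patching the class attribute.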

This PR fixes the issue by passing socket options through the Kubernetes client's Configuration.socket_options field, which is properly threaded through:

ApiClient -> RESTClientObject.__init__ -> urllib3.PoolManager -> HTTPConnectionPool -> HTTPConnection.__init__

The kubernetes-client/python library already supports this: Configuration has a socket_options attribute that is checked in RESTClientObject.__init__ and forwarded to urllib3.PoolManager as a keyword argument.
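As a rough sketch of the approach (not the code in this PR), the options can be built as a list of (level, optname, value) tuples and assigned to the configuration object. The helper names, the idle/interval/count values, and the SimpleNamespace stand-in for Configuration are assumptions made so the example runs without the kubernetes package installed.

```python
import socket
from types import SimpleNamespace


def build_keepalive_socket_options(idle=120, interval=30, count=6):
    """Return (level, optname, value) tuples for a keepalive-enabled TCP socket.

    The numeric defaults are illustrative values only.
    """
    options = [
        (socket.IPPROTO_TCP, socket.TCP_NODELAY, 1),
        (socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),
    ]
    # The tuning constants are platform-dependent, so add them only if present.
    tuning = (
        ("TCP_KEEPIDLE", idle),
        ("TCP_KEEPINTVL", interval),
        ("TCP_KEEPCNT", count),
    )
    for name, value in tuning:
        optname = getattr(socket, name, None)
        if optname is not None:
            options.append((socket.IPPROTO_TCP, optname, value))
    return options


def enable_tcp_keepalive(configuration):
    """Hypothetical helper: store the options on the configuration object."""
    configuration.socket_options = build_keepalive_socket_options()


# Stand-in for a kubernetes client Configuration instance (assumption).
configuration = SimpleNamespace(socket_options=None)
enable_tcp_keepalive(configuration)
```

With a real Configuration object, the same assignment would happen before the ApiClient is constructed, so the options travel down the call chain listed above.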

Changes

  • kube_client.py: _enable_tcp_keepalive() now accepts a Configuration object and sets configuration.socket_options directly instead of monkey-patching urllib3. get_kube_client() calls it after obtaining the configuration object.
  • kubernetes_engine.py: Passes configuration through to _enable_tcp_keepalive
  • tests: Updated test_enable_tcp_keepalive to verify configuration.socket_options is set correctly

Notes

  • Includes TCP_NODELAY in the socket options because an explicit socket_options list replaces urllib3's defaults rather than extending them; this preserves the default urllib3 behavior of disabling Nagle's algorithm
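The effect of leaving TCP_NODELAY out can be checked on an unconnected socket. This is a small sketch using only the standard library; apply_socket_options and option_flags are hypothetical helpers that imitate applying each tuple with setsockopt, and no network connection is made.

```python
import socket


def apply_socket_options(sock, options):
    """Apply each (level, optname, value) tuple to the socket."""
    for level, optname, value in options:
        sock.setsockopt(level, optname, value)


def option_flags(options):
    """Create a TCP socket, apply the options, and report the resulting flags."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        apply_socket_options(sock, options)
        return {
            "nodelay": bool(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)),
            "keepalive": bool(sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)),
        }


keepalive_only = [(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)]
keepalive_with_nodelay = [(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)] + keepalive_only

# Keepalive alone leaves Nagle's algorithm enabled on the socket.
without_nodelay = option_flags(keepalive_only)
# Listing TCP_NODELAY explicitly keeps both behaviors.
with_nodelay = option_flags(keepalive_with_nodelay)
```

Comparing the two dictionaries shows that only the second list yields a socket with both keepalive and TCP_NODELAY set.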

Related issue

Closes: #68396

Reproduction

See https://github.com/jonminter-dojo/airflow-k8s-tcp-keepalive-repro for a minimal reproduction demonstrating the bug with the old approach.

@boring-cyborg (bot) added the area:providers, provider:cncf-kubernetes, and provider:google labels Jun 15, 2026
@anmolxlight anmolxlight force-pushed the fix-tcp-keepalive-urllib3-v2 branch from 9b380b4 to 27a9bca Compare June 15, 2026 15:28
…key-patching urllib3 (apache#68396)

The `enable_tcp_keepalive` config option in the Kubernetes provider
relied on monkey-patching urllib3's default_socket_options.
In urllib3 v2.x, the socket_options parameter in HTTPConnection.__init__
is evaluated as a default argument at import time, so post-import
changes are never picked up by new connections.

Fix by passing socket options through the Kubernetes client's
Configuration.socket_options field, which is properly threaded
through ApiClient -> RESTClientObject -> urllib3.PoolManager.

Also includes TCP_NODELAY in the socket options to preserve the
default urllib3 behavior of disabling Nagle's algorithm.

For the KubernetesHook case, keepalive is applied in
_TimeoutK8sApiClient.__init__ AFTER config loading so the
configuration already has the correct host/credentials.

Closes: apache#68396
@anmolxlight anmolxlight force-pushed the fix-tcp-keepalive-urllib3-v2 branch from 27a9bca to fa68e17 Compare June 15, 2026 20:03
Development

Successfully merging this pull request may close these issues.

Kubernetes provider kubernetes_executor.enable_tcp_keepalive does not actually enable TCP keepalives