RANGER-5658: Remove obsolete atlas.kafka.zookeeper.connect requirement from Tag Sync#1037

Open
ramackri wants to merge 3 commits into master from RANGER-5658-patch

ramackri commented Jun 29, 2026

Contributor

Summary

Remove the legacy atlas.kafka.zookeeper.connect configuration requirement from Ranger Tag Sync's Atlas Kafka source (RANGER-5658).

Tag Sync consumes Atlas entity notifications via kafka-clients using atlas.kafka.bootstrap.servers and atlas.kafka.entities.group.id. The zookeeper.connect property was never used by Atlas 2.4 KafkaNotification for consumer creation; Ranger only validated that it was set. With Kafka 3.9.x and KRaft brokers, clients do not use ZooKeeper for consumption.

Changes (RANGER-5658)

  • Drop TAGSYNC_ATLAS_ZOOKEEPER_ENDPOINT constant and startup validation from AtlasTagSource
  • Extract validateRequiredAtlasKafkaProperties() for unit testing
  • Remove TAG_SOURCE_ATLAS_KAFKA_ZOOKEEPER_CONNECT from install.properties, installprop2xml.properties, and setup.py
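
The validation rule that remains after this change can be sketched as follows. This is a Python illustration of the rule only: the real method is Java code in AtlasTagSource, and its signature and error type are not shown in this description, so the function name `validate_required_atlas_kafka_properties` and the use of `ValueError` here are assumptions.

```python
# Illustrative only: mirrors the rule described above, not the Java source.
REQUIRED_ATLAS_KAFKA_KEYS = (
    "atlas.kafka.bootstrap.servers",
    "atlas.kafka.entities.group.id",
)


def validate_required_atlas_kafka_properties(props):
    """Raise ValueError if a required Atlas Kafka property is missing or blank.

    atlas.kafka.zookeeper.connect is deliberately not checked.
    """
    missing = [
        key
        for key in REQUIRED_ATLAS_KAFKA_KEYS
        if not str(props.get(key) or "").strip()
    ]
    if missing:
        raise ValueError("Missing required properties: " + ", ".join(missing))
```

A configuration containing only the two keys above passes; one that omits either key is rejected, whether or not a ZooKeeper address is present.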

Additional change (TagSync install — Ranger destination credential)

During TagSync installation, setup.py stores the Ranger Admin destination password in the Java credential keystore (tagadmin.user.password) before the process exits. Previously this step always wrote the service account username (rangertagsync) as the secret value, even when install.properties defined a different rangerTagsync_password.

That mismatch breaks the common deployment pattern where:

  • TagSync reads classifications from Atlas REST (basic auth to Atlas), and
  • TagSync uploads imported tags to Ranger Admin via TagAdminRESTSink using basic auth (Hadoop simple security in TagSync's core-site.xml), rather than SPNEGO.

In those environments, an incorrect keystore password produces HTTP 401 on /service/tags/importservicetags/ and tag mappings never reach Ranger. A second script (updatetagadminpassword.py) at the end of setup.py was intended to correct the credential, but any install failure or early exit before that step left TagSync permanently unable to authenticate.

With this change, when rangerTagsync_password is set in install.properties, setup.py uses it for the initial keystore write, following the same pattern already used for the Atlas REST credential in the same file.
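
A minimal sketch of the selection logic, assuming the parsed install.properties is available as a dict. The helper name `choose_tagadmin_secret`, the dict shape, and the fallback to the username when the property is unset or blank are assumptions (the fallback mirrors the previous behavior); the actual setup.py code may differ.

```python
# Hypothetical helper; not the actual setup.py code.
DEFAULT_TAGSYNC_USER = "rangertagsync"


def choose_tagadmin_secret(install_props):
    """Pick the value to store under tagadmin.user.password.

    Prefer rangerTagsync_password when it is set and non-blank;
    otherwise fall back to the service account username (assumed).
    """
    password = install_props.get("rangerTagsync_password")
    if password and password.strip():
        return password
    return DEFAULT_TAGSYNC_USER
```

With this shape, the keystore receives the configured password on the first write, so authentication does not depend on a later script reaching the end of setup.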

Out of scope

  • Tag Sync HA (ranger-tagsync.server.ha.zookeeper.*) — unchanged; still requires ZooKeeper when HA is enabled
  • Atlas REST tag source — unaffected by ZK removal; benefits from the keystore credential fix

Upgrade note

Existing deployments may still have atlas.kafka.zookeeper.connect in conf/atlas-application.properties; it is harmless and can be removed manually. New installs no longer generate or require it.
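
For upgrades, a small script along these lines could flag the leftover key before it is removed by hand. It is a sketch and not part of this PR: the function name is hypothetical, and it assumes a simple properties file with one entry per line and `#` or `!` comment lines.

```python
import re

LEGACY_KEY = "atlas.kafka.zookeeper.connect"


def find_legacy_zookeeper_lines(text):
    """Return 1-based line numbers of active entries that set the legacy key."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped[0] in "#!":
            continue
        # Properties keys end at the first '=', ':' or whitespace character.
        key = re.split(r"[=:\s]", stripped, maxsplit=1)[0]
        if key == LEGACY_KEY:
            hits.append(lineno)
    return hits
```

It could be run against the contents of conf/atlas-application.properties; an empty result means there is nothing to clean up.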

Test plan

Unit tests

mvn test -pl tagsync -Drat.skip=true \
  -Dtest=AtlasTagSourceConfigTest \
  -Dsurefire.failIfNoSpecifiedTests=false

| Test | Verifies |
| --- | --- |
| validateRequiredAtlasKafkaProperties_acceptsBootstrapAndGroupWithoutZookeeper | No ZK property required when bootstrap + group are set |
| validateRequiredAtlasKafkaProperties_rejectsMissingBootstrapServers | Bootstrap servers still mandatory |
| validateRequiredAtlasKafkaProperties_rejectsMissingConsumerGroup | Consumer group still mandatory |

  • AtlasTagSourceConfigTest — 3 tests, 0 failures (local run)
  • Integration — Atlas REST TagSync → Ranger Admin: in a multi-service test environment (Atlas + Ranger Admin + TagSync on a shared container network), apply a classification to a governed Hive column in Atlas; confirm TagSync polls Atlas REST and imports tags without authentication errors; verify Ranger Admin shows new tag definitions and resource mappings for the target Hive service (createdBy TagSync user); confirm TagSync logs contain no repeated 401 responses on tag upload
  • Fresh Tag Sync install with Atlas Kafka source — conf/atlas-application.properties has no zookeeper.connect
  • Upgrade: existing install with legacy zookeeper.connect still starts
  • Tag Sync HA smoke test (if ranger-tagsync.server.ha.enabled=true) — leader election unchanged

…t from Tag Sync.

Tag Sync consumes Atlas notifications via kafka-clients (bootstrap.servers only).
The legacy zookeeper.connect property was never used by Atlas KafkaNotification
but was still required at startup and in installer templates.

Co-authored-by: Cursor <cursoragent@cursor.com>
@ramackri ramackri requested a review from mneethiraj June 29, 2026 14:07
Co-authored-by: Cursor <cursoragent@cursor.com>
@ramackri ramackri requested review from kumaab and rameeshm June 29, 2026 14:08
setup.py wrote tagadmin.user.password as the literal username 'rangertagsync'
instead of rangerTagsync_password from install.properties, causing 401 on
TagAdminRESTSink until updatetagadminpassword.py ran at end of setup.

Complements RANGER-5658 / PR #1037 (Atlas Kafka ZK cleanup); required for
docker Atlas REST TagSync with Kerberos Ranger Admin.

Co-authored-by: Cursor <cursoragent@cursor.com>