Skip to content

fix(release): Add more checks to release scripts#127

Open
NickLarsenNZ wants to merge 39 commits into
mainfrom
fix/release-scripts
Open

fix(release): Add more checks to release scripts#127
NickLarsenNZ wants to merge 39 commits into
mainfrom
fix/release-scripts

Conversation

@NickLarsenNZ

@NickLarsenNZ NickLarsenNZ commented May 20, 2026

Copy link
Copy Markdown
Member

Tip

This PR is intended to be compared with #125 so we can take the best ideas of both.

Summary

Harden the release scripts with shared validation, state assertions, and consistency improvements. The goal is to catch errors early (before mutations), make handover between colleagues safer, and reduce silent misbehavior.

Also incorporates features from #125: reduced duplication via shared helpers, and centralised variable derivation.

What changed

  • Created release/common.sh with shared functions sourced by all scripts
  • Added BATS tests for all common functions (release/test_common.bats)
  • All scripts now validate inputs before any mutations (validate_tag, validate_release_base_version, validate_what)
  • All scripts now check common dependencies early (check_common_dependencies)
  • Git state is asserted throughout: clean index, correct branch, correct repo, correct remote
  • Branch existence checks use exact matching instead of loose grep
  • Shallow clones removed (broke downstream tag/cherry-pick operations)
  • git commit -sam replaced with explicit git add of known files
  • pushd/popd with relative paths instead of cd with absolute paths
  • Renamed REPOSITORY to REMOTE for consistency
  • Fixed usage messages referencing wrong script names
  • Centralised variable derivation (derive_tag_vars, derive_branch_vars)
  • Centralised operator iteration (for_each_operator), cloning (ensure_clone), temp folder creation (ensure_temp_folder), and quote stripping (strip_double_quotes)
  • Renamed RELEASE to RELEASE_BASE for clarity

Checklist

Phase 1: Shared infrastructure

  • 1.1 - common.sh with validate_what, check_common_dependencies, assert_clean_index
  • 1.2 - validate_tag (CalVer, --no-rc flag) and validate_release_base_version
  • 1.3 - assert_cwd_is_repo
  • 1.4 - assert_on_branch
  • 1.5 - assert_remote_exists (checks github.com/stackabletech/$repo)

Phase 2: Apply shared functions to each script

  • 2.1 - create-release-branch.sh
  • 2.2 - create-release-candidate-branch.sh
  • 2.3 - merge-release-candidate.sh
  • 2.4 - tag-release-candidate.sh
  • 2.5 - post-release.sh

Phase 2b: Consolidate push_branch

  • 2.6 - Move push_branch to common.sh (deferred - current per-script implementations are short and clear)

Phase 3: Fix git state handling

  • 3.1 - Remove --depth 1 from clones
  • 3.2 - Replace git commit -sam with explicit git add + git commit -sm
  • 3.3 - assert_clean_index after commits
  • 3.4 - Subshells () for directory isolation (except release-branch-hygiene.sh which needs parent state)
  • 3.5 - Exact branch-existence checks (assert_remote_branch_exists, assert_remote_branch_not_exists, assert_tag_not_exists)
  • 3.6 - Commit divergence assertions (TODOs placed in scripts, deferred to future PR)

Phase 4: Improve dry-run and observability

  • 4.1 - Verify PR exists in merge-release-candidate.sh push mode (fails if missing). Dry-run skips the check to allow testing before PRs exist.
  • 4.2 - Summary output at end of each script (skipped - per-repo output already exists and set -e stops on first failure)
  • 4.3 - Document RC rejection in post-release.sh
  • 4.4 - Document demos branch-only workflow in README

Phase 5: Future-proofing (optional)

Not doing these now.

  • 5.1 - --non-interactive flag for merge-release-candidate.sh (deferred to future PR)
  • 5.2 - Progress/state file for handover resumability (deferred to future PR)
  • 5.3 - Two-pass operator loop (deferred to future PR)

Tip

The phases below incorporate improvements from @adwk67 in #125

Phase A: Reduce duplication

  • A.1 - derive_tag_vars (centralised variable derivation from tag)
  • A.2 - derive_branch_vars (centralised variable derivation from branch version)
  • A.3 - strip_double_quotes (replace inline quote stripping)
  • A.4 - for_each_operator (centralised operator iteration)
  • A.5 - ensure_clone (centralised clone-if-not-present)
  • A.6 - ensure_temp_folder (centralised temp folder creation)

Phase B: Improve existing functions

  • B.1 - Switch remote_branch_exists to git ls-remote (avoids modifying local refs)
  • B.2 - Switch assert_tag_not_exists to git ls-remote (now takes remote name as argument)
  • B.3 - Idempotent update_changelog (skip if tag already present, validate tag, use fixed-string matching)

Phase C: Add new safety checks

  • C.1 - No-op commit guard (extracted into commit_and_push_rc function)
  • C.2 - verify_release (post-mutation verification of Cargo, Helm, antora, changelog, etc.)
  • C.3 - shellcheck source=common.sh directive

Phase D: Bug fixes

  • D.1 - LIBGIT2_NO_PKG_CONFIG=1 workaround for non-NixOS users
  • D.2 - Add extra/ to git add paths in operator loop

Phase E: Update release-branch-hygiene.sh

  • E.1 - Apply all common.sh patterns to release-branch-hygiene.sh

Phase F: Future improvements

  • F.1 - Use subshells () instead of pushd/popd where functions don't propagate parent state (all scripts except release-branch-hygiene.sh which needs CREATED_PRS)

Ensures we are in a repo, or a specific repo.
Ensures we are on a specific branch.
Ensures we have a remote called origin, and it is for the repo we expect.
```
❯ bats release/test_common.bats
test_common.bats
 ✓ validate_what: accepts valid value
 ✓ validate_what: accepts last valid value
 ✓ validate_what: rejects invalid value
 ✓ validate_what: rejects value not in allowed set for this script
 ✓ validate_what: rejects empty value
 ✓ validate_tag: accepts final release tag
 ✓ validate_tag: accepts RC tag
 ✓ validate_tag: accepts multi-digit RC
 ✓ validate_tag: accepts month 12
 ✓ validate_tag: rejects month 0
 ✓ validate_tag: rejects month 13
 ✓ validate_tag: rejects leading zero on month
 ✓ validate_tag: rejects missing patch
 ✓ validate_tag: rejects empty tag
 ✓ validate_tag: rejects missing hyphen in rc
 ✓ validate_tag: --no-rc accepts final release
 ✓ validate_tag: --no-rc rejects RC tag
 ✓ validate_tag: rejects unknown flag
 ✓ validate_tag: rejects trailing arguments
 ✓ validate_release_base_version: accepts valid version
 ✓ validate_release_base_version: accepts month 11
 ✓ validate_release_base_version: rejects version with patch
 ✓ validate_release_base_version: rejects empty
 ✓ validate_release_base_version: rejects month 0
 ✓ validate_release_base_version: rejects leading zero
 ✓ assert_cwd_is_repo: passes in a git repo
 ✓ assert_cwd_is_repo: passes with matching name
 ✓ assert_cwd_is_repo: fails with wrong name
 ✓ assert_cwd_is_repo: fails outside a git repo
 ✓ assert_on_branch: passes on correct branch
 ✓ assert_on_branch: fails on wrong branch
 ✓ assert_on_branch: fails with empty argument
 ✓ assert_remote_exists: passes with SSH URL
 ✓ assert_remote_exists: passes with HTTPS URL
 ✓ assert_remote_exists: passes without .git suffix
 ✓ assert_remote_exists: fails with wrong repo name
 ✓ assert_remote_exists: fails with wrong org
 ✓ assert_remote_exists: fails with nonexistent remote
 ✓ assert_clean_index: passes on clean repo
 ✓ assert_clean_index: fails on staged changes
 ✓ assert_clean_index: fails on unstaged changes
 ✓ assert_clean_index: warns on untracked files, continues with y
 ✓ assert_clean_index: warns on untracked files, aborts with n

43 tests, 0 failures
```
…nctions for validating the tag and the "what"
Patch releases could break if the tag is too far back
We now have assertions that we are in the right place.
@NickLarsenNZ NickLarsenNZ self-assigned this May 20, 2026
NickLarsenNZ and others added 6 commits May 20, 2026 12:00
This sets commonly used variables based on the tag.

Also update RELEASE to RELEASE_BASE.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
This sets commonly used variables based on the given release base.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Replace all the manual double-quote stripping with it.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Allows us to apply a function with arguments _for each operator_ in the config.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Centralises the clone if not present check.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Centralizes `mkdir -p` of the temp folder.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
NickLarsenNZ and others added 5 commits May 20, 2026 14:59
This removes the need for fetching and then looking at local refs.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
This removes the need for fetching and then looking at local tags.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
NickLarsenNZ and others added 7 commits May 20, 2026 16:31
…ripts

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
…rgo build

This is a workaround for non-NixOS users where nix-shell provides libgit2 at build time but `LD_LIBRARY_PATH` doesn't include the nix store at runtime, causing a crash.

The proper fix belongs in the operator repos' nix shells.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
Note: This wasn't done for release-branch-hygiene.sh because it modifies global variables.

Co-authored-by: Andrew Kenworthy <andrew.kenworthy@stackable.de>
@NickLarsenNZ NickLarsenNZ requested a review from adwk67 May 21, 2026 07:33
@NickLarsenNZ NickLarsenNZ marked this pull request as ready for review May 21, 2026 07:34
@adwk67 adwk67 moved this to Development: In Review in Stackable Engineering May 21, 2026
Comment on lines +75 to +81
#
# LIBGIT2_NO_PKG_CONFIG: Workaround for non-NixOS users. nix-shell provides
# libgit2 at build time (found via pkg-config), but LD_LIBRARY_PATH may not
# include the nix store at runtime, causing a crash. This forces libgit2-sys
# to statically link its bundled copy instead. The proper fix belongs in the
# operator repos' nix shells.
LIBGIT2_NO_PKG_CONFIG=1 nix-shell --run 'make regenerate-charts'

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we consider this a hack? It causes openssl.so.3 issues for some users.
I think the fix is to include pkg-config for the operators.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to work out how to validate this.

I guess if @adwk67 runs it without the env var to reproduce the failure, then adds pkg-config to the checked out operator, and runs it again?

And for me, it works without the env var, so I just need to test the same way to ensure it still works.

Comment thread release/create-release-candidate-branch.sh Outdated
Comment thread release/create-release-candidate-branch.sh Outdated
Comment thread release/post-release.sh Outdated
Comment thread release/common.sh Outdated
fi

# gh authentication: if this fails you will need to e.g. gh auth login
gh auth status

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this fails, also echo something to the user (e.g. "It was not possible to verify authentication: please login with e.g. gh auth login") so that it is clear what to do.

@NickLarsenNZ NickLarsenNZ Jun 19, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested with a dummy script:

#!/usr/bin/env bash
#
# See README.md
#
set -euo pipefail
# set -x

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
# shellcheck source=common.sh
source "$SCRIPT_DIR/common.sh"

check_common_dependencies

With check_common_dependencies containing:

false || echo "You need to 'run gh auth login' before rerunning the script" && exit 1
# then
true || echo "You need to 'run gh auth login' before rerunning the script" && exit 1

to make sure it exists correctly, or continues if no error.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 5c7dc6d

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, no that didn't work. My test was flawed.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in fd6ea2c

NickLarsenNZ and others added 4 commits June 19, 2026 11:51
Co-authored-by: Nick <10092581+NickLarsenNZ@users.noreply.github.com>
Co-authored-by: Nick <10092581+NickLarsenNZ@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Development: In Review

Development

Successfully merging this pull request may close these issues.

2 participants