[DRAFT] experiment: add test sharding#17438
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces parallel execution for system tests and implements a sharding mechanism for CI jobs, including a new Python script to group packages and updates to the test runner script. Feedback on these changes focuses on improving reliability and safety: first, by using an EXIT trap in .kokoro/system.sh to guarantee cleanup of isolated gcloud configuration directories in case of test failures; and second, by avoiding global toggles of set -e in ci/run_conditional_tests.sh and instead capturing test exit codes using the || operator.
| # Isolate gcloud config for parallel execution | ||
| export CLOUDSDK_CONFIG=$(mktemp -d) |
There was a problem hiding this comment.
Since run_package_test runs with set -e enabled, any test failure or error will cause the subshell to exit immediately. This prevents the manual cleanup at the end of the function from running, leading to leaked temporary directories in /tmp.
Using an EXIT trap ensures that the temporary directory is reliably cleaned up when the subshell exits, regardless of whether the tests succeeded or failed.
| # Isolate gcloud config for parallel execution | |
| export CLOUDSDK_CONFIG=$(mktemp -d) | |
| # Isolate gcloud config for parallel execution | |
| export CLOUDSDK_CONFIG=$(mktemp -d) | |
| trap 'rm -rf "${CLOUDSDK_CONFIG}"' EXIT |
| # Clean up isolated gcloud config | ||
| rm -rf "${CLOUDSDK_CONFIG}" |
| set +e | ||
| ${test_script} | ||
| ret=$? | ||
| set -e |
There was a problem hiding this comment.
Toggling set -e globally can lead to unexpected side effects or mask errors in other parts of the script if not carefully managed. A more idiomatic and safer way to capture the exit status of a command under set -e is to use the || operator, which prevents the shell from exiting on failure without needing to disable set -e globally.
| set +e | |
| ${test_script} | |
| ret=$? | |
| set -e | |
| ret=0 | |
| ${test_script} || ret=$? |
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces a package sharding mechanism for CI testing, adding a new Python script ci/get_package_shards.py and updating ci/run_conditional_tests.sh to support package lists, unique coverage files, and fallback sharding. It also configures test coverage reporting for sqlalchemy-spanner and updates coverage requirements for google-cloud-firestore. The review feedback highlights two key improvements in ci/get_package_shards.py: resolving potential path separator mismatches on Windows by avoiding mixed usage of os.path.join and forward slashes, and refining the sharding distribution logic to prevent highly unbalanced shards when the shard count is capped.
| if not os.path.exists(subdir): | ||
| continue | ||
| # Use the same sorting as the shell script | ||
| pkg_dirs = [os.path.join(subdir, d) + '/' for d in os.listdir(subdir) if os.path.isdir(os.path.join(subdir, d))] |
There was a problem hiding this comment.
Using os.path.join and then appending a hardcoded forward slash (/) can result in mixed path separators on Windows (e.g., packages\\google-cloud-bigtable/). Since git diff outputs always use forward slashes, this mismatch will cause the startswith check in get_packages_to_test to fail on Windows environments.
Using a consistent forward slash format for package paths avoids this issue.
| pkg_dirs = [os.path.join(subdir, d) + '/' for d in os.listdir(subdir) if os.path.isdir(os.path.join(subdir, d))] | |
| pkg_dirs = [f"{subdir}/{d}/" for d in os.listdir(subdir) if os.path.isdir(os.path.join(subdir, d))] |
| shard_size = math.ceil(num_packages / num_shards) | ||
|
|
||
| shards = [] | ||
| for i in range(num_shards): | ||
| start = i * shard_size | ||
| end = min((i + 1) * shard_size, num_packages) | ||
| if start >= num_packages: | ||
| break | ||
| shard_packages = packages[start:end] |
There was a problem hiding this comment.
Using math.ceil(num_packages / num_shards) as a fixed chunk size can lead to highly unbalanced shards when the number of shards is capped at 10. For example, if there are 101 packages, num_shards is capped at 10, and shard_size becomes ceil(101/10) = 11. This results in 9 shards of size 11 and 1 shard of size 2, which is highly unbalanced and defeats the purpose of efficient parallel sharding.
Using a contiguous even distribution formula ensures that the difference in package count between any two shards is at most 1.
| shard_size = math.ceil(num_packages / num_shards) | |
| shards = [] | |
| for i in range(num_shards): | |
| start = i * shard_size | |
| end = min((i + 1) * shard_size, num_packages) | |
| if start >= num_packages: | |
| break | |
| shard_packages = packages[start:end] | |
| shards = [] | |
| for i in range(num_shards): | |
| start = (i * num_packages) // num_shards | |
| end = ((i + 1) * num_packages) // num_shards | |
| shard_packages = packages[start:end] |
Experimenting with test sharding
Unit Tests
unit test completestep, which is only green if all shards pass. This can be our new required check for unit testsSystem Tests
Fail Fast