feat(scripts): Add dependency version scanner tool#16867
feat(scripts): Add dependency version scanner tool#16867chalmerlowe wants to merge 38 commits intomainfrom
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a new dependency version scanner, including a configuration-driven regex scanner, a benchmarking tool, and comprehensive unit and integration tests. The review feedback highlights several areas for improvement: optimizing regex compilation in the scanner to avoid performance bottlenecks, using the tempfile module in the benchmark script to prevent race conditions, removing redundant code, improving test robustness by checking subprocess exit codes, and adhering to PEP 8 by moving imports to the top of files.
…boundaries for explicit_version_string
… to .scannerignore
| @@ -0,0 +1,34 @@ | |||
| import csv | |||
There was a problem hiding this comment.
It looks like the copytight header is missing (applies to all code files)
| Run the script from the repository root: | ||
|
|
||
| ```bash | ||
| python3 scripts/version_scanner/version_scanner.py -d <dependency> -v <version> [options] |
There was a problem hiding this comment.
When I ran this, I gt a ModuleNotFound error. is there a requirements.txt or anything that captures the dependencies?
| This plan outlines the approach to update Python packages to drop support for end-of-life Python runtimes (3.7, 3.8, 3.9) OR for deprecated dependencies, and ensure the packages are configured for modern Python. | ||
|
|
||
| #### High-Level Strategy | ||
| - **One Branch Per Package**: To keep PRs manageable and isolated, we suggest a dedicated worktree and branch for each package (e.g., `feat/drop-<dependency>-<version>-<package-name>` i.e. `feat/drop-protobuf-4.25.8-google-cloud-bigquery`). |
There was a problem hiding this comment.
This is only for hand-written packages, right? I assume others would get their updates through the generator?
Should we recommend doing a generator update first, to clean up most of the packages?
| @@ -0,0 +1,5 @@ | |||
| packages/google-cloud-access-context-manager | |||
| self.variables = self._compute_variables() | ||
|
|
||
| def _compute_variables(self) -> Dict[str, str]: | ||
| """Compute variables for interpolation from version string.""" |
There was a problem hiding this comment.
nit: more detailed comments/examples could be helpful for future maintainers. I'm not sure what a variable is, or the expected version string format
| try: | ||
| with open(file_path, 'r', encoding='utf-8', errors='ignore') as f: | ||
| skip_next = False | ||
| for line_num, line in enumerate(f, 1): |
There was a problem hiding this comment.
are there any issues with statements that span lines?
| def upload_to_drive(csv_path: str, matches: List[Dict[str, str]], github_repo: str = None, branch: str = "main") -> str: | ||
| """ | ||
| Upload matches to a Google Sheet in Drive. | ||
| """ |
There was a problem hiding this comment.
Is this necessary? It seems to add extra complexity, dependencies and test surface area, when Google Sheets makes it pretty easy to import a csv natively already
| parts = rel_root.split(os.sep) | ||
|
|
||
| # Monorepo filtering | ||
| if target_packages and parts[0] == "packages": |
There was a problem hiding this comment.
There's talk of separating the packages directory into separate ones for generated and handwritten libraries. Will that be easy to address here?
|
|
||
| package_group.add_argument( | ||
| "--package", | ||
| help="Specific subdirectory filter (useful for monorepos)" |
There was a problem hiding this comment.
Is this specific to the structure of the monorepo's package directory? Os is this more of a generic subdirectory filter?
This adds a utility with the ability to scan for common references to dependencies (Python runtimes and package dependencies) to facilitate updating code when runtimes and dependencies change.
< 3.8when searching for references to3.7since they are semantically equivalent even if syntactically different.