A collection of scripts for gathering and analyzing GitHub repository metrics and statistics, with a focus on Terraform-related repositories and resource management.
This repository contains utility scripts for collecting, analyzing, and reporting on GitHub repository data. These scripts are designed to help with POC (Proof of Concept) baseline analysis, resource statistics, and repository health metrics.
- GitHub CLI (gh): Required for the gather-poc-data.sh script
  - Installation: https://cli.github.com/
  - Authentication: Run gh auth login to authenticate
- jq: JSON processor for data manipulation
  - macOS: brew install jq
  - Linux: sudo apt-get install jq or sudo yum install jq
- bash: Version 4.0 or higher
- Valid GitHub authentication via gh auth login
- Access to the repositories you want to analyze
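The prerequisites above can be verified before running any script. The following is a minimal sketch of such a preflight check; the function names (require_cmd, require_bash_major, preflight) are hypothetical and are not taken from the scripts in this repository.

```bash
#!/usr/bin/env bash
# Preflight sketch: verify the tools listed in the prerequisites.
# All function names here are illustrative assumptions.

# Return 0 if the named command is on PATH; otherwise report it and return 1.
require_cmd() {
  local cmd="$1"
  if ! command -v "$cmd" >/dev/null 2>&1; then
    echo "missing dependency: $cmd" >&2
    return 1
  fi
}

# Return 0 if the running bash has at least the given major version.
require_bash_major() {
  local min="$1"
  if (( BASH_VERSINFO[0] < min )); then
    echo "bash ${min}.0 or higher required, found ${BASH_VERSION}" >&2
    return 1
  fi
}

# Run every check and return non-zero if any of them failed.
preflight() {
  local status=0
  require_bash_major 4 || status=1
  require_cmd gh || status=1
  require_cmd jq || status=1
  # gh auth status exits non-zero when no account is logged in.
  if command -v gh >/dev/null 2>&1 && ! gh auth status >/dev/null 2>&1; then
    echo "gh is not authenticated; run: gh auth login" >&2
    status=1
  fi
  return "$status"
}
```

A script could call `preflight || exit 1` near the top so that missing tools are reported together rather than one failure at a time.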
Collects comprehensive GitHub repository metrics for POC baseline analysis using GitHub CLI and GraphQL API.
This script gathers the following data from a specified GitHub repository:
- Contributors: List of repository contributors with their statistics
- Pull Requests: PR data including state, timing, authors, and reviews
- Reviews: Review analysis including reviewer statistics and PR review metrics
- Trends: Historical trends including monthly volume, wait times, review times, and reviewer concentration
./scripts/gather-poc-data.sh --org ORGANIZATION --repo REPOSITORY [OPTIONS]
- --org ORGANIZATION: GitHub organization name
- --repo REPOSITORY: Repository name within the organization
- --months NUMBER: Number of months to look back (default: 12)
- --output-dir PATH: Base output directory (default: ./poc-data)
- -h, --help: Display help message
# Basic usage with defaults (12 months, ./poc-data output)
./scripts/gather-poc-data.sh --org myorg --repo myrepo
# Custom time range and output directory
./scripts/gather-poc-data.sh --org myorg --repo myrepo --months 6 --output-dir ./data
# Analyze a specific repository
./scripts/gather-poc-data.sh --org exampleorg --repo examplerepo --months 24
The script creates the following directory structure:
<output-dir>/
├── contributors/
│   └── <org>-<repo>-contributors.json
├── prs/
│   └── <org>-<repo>-prs.json
├── reviews/
│   └── <org>-<repo>-reviews.json
└── trends/
    └── <org>-<repo>-trends.json
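Because the layout above is regular, the four file paths can be derived from the output directory, organization, and repository name. The sketch below shows one way to check that a run produced every expected file; output_path and check_outputs are hypothetical helpers, not part of gather-poc-data.sh.

```bash
# Build the path of one output file from the documented layout:
#   <output-dir>/<kind>/<org>-<repo>-<kind>.json
output_path() {
  local output_dir="$1" org="$2" repo="$3" kind="$4"
  printf '%s/%s/%s-%s-%s.json\n' "$output_dir" "$kind" "$org" "$repo" "$kind"
}

# Report which of the four expected files exist.
# Returns 0 only when all of them are present.
check_outputs() {
  local output_dir="$1" org="$2" repo="$3" kind path missing=0
  for kind in contributors prs reviews trends; do
    path="$(output_path "$output_dir" "$org" "$repo" "$kind")"
    if [[ -f "$path" ]]; then
      echo "found:   $path"
    else
      echo "missing: $path"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_outputs ./poc-data myorg myrepo` lists each expected file and returns a non-zero status if any is absent.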
Each JSON file includes:
- Metadata: Collection timestamp, parameters, and execution time
- Data: The collected metrics and analysis results
- Contributors (contributors/<org>-<repo>-contributors.json)
  - List of all contributors with their contribution statistics
- Pull Requests (prs/<org>-<repo>-prs.json)
  - PR details including number, title, state, timestamps
  - Author information
  - Review counts and details
  - Calculated metrics: wait time, review time, reviewer count
- Reviews (reviews/<org>-<repo>-reviews.json)
  - Review analysis with reviewer statistics
  - PR review metrics including wait times and review times
  - Reviewer activity breakdown
- Trends (trends/<org>-<repo>-trends.json)
  - Monthly PR volume trends
  - Wait time statistics (median, p95, mean, min, max)
  - Review time statistics (median, p95, mean, min, max)
  - Top reviewers and reviewer concentration metrics
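To make the median and p95 figures concrete, here is a small sketch of a percentile calculation over a list of durations. It uses the nearest-rank method, which is an assumption: this README does not state which percentile definition gather-poc-data.sh uses, and the percentile function is hypothetical.

```bash
# Nearest-rank percentile of whitespace-separated numbers read from stdin.
# Usage: percentile P    (P is an integer from 1 to 100)
# Note: with an even number of values, P=50 returns the lower middle value
# rather than the average of the two middle values.
percentile() {
  local p="$1"
  tr -s '[:space:]' '\n' | sort -n | awk -v p="$p" '
    NF { values[++n] = $1 }          # skip empty lines, keep sorted values
    END {
      if (n == 0) exit 1             # no input: nothing to report
      rank = int((p * n + 99) / 100) # ceil(p * n / 100) for integer p and n
      if (rank < 1) rank = 1
      print values[rank]
    }'
}
```

With ten wait times from 1 to 10, `percentile 50` selects the fifth value and `percentile 95` selects the tenth.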
- Pagination: Automatically handles GraphQL pagination for large datasets
- Rate Limiting: Detects and handles GitHub API rate limits with automatic retry
- Date Filtering: Efficiently filters PRs by creation date
- Cross-Platform: Works on both macOS and Linux
- Error Handling: Comprehensive error checking and validation
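The automatic retry mentioned above can be illustrated with a generic retry wrapper. This is a sketch of exponential backoff only; it is an assumption about the general approach, and the actual script may instead inspect GitHub's rate-limit responses and wait until the limit resets. The retry function is hypothetical.

```bash
# Retry a command with exponential backoff.
# Usage: retry MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
retry() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if (( attempt >= max_attempts )); then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$(( attempt + 1 ))
    delay=$(( delay * 2 ))   # double the wait after every failure
  done
}
```

A call such as `retry 5 2 gh api graphql -f query="$query"` would then try the request up to five times, waiting 2, 4, 8, and 16 seconds between attempts; the `query` variable is a placeholder for whatever GraphQL query is being sent.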
Collects all Terraform resource types in use within a repository and counts their frequency. Includes user-written modules but excludes external modules (from .terraform/ directory). Outputs results as a plain text file.
This script scans Terraform files (.tf) in a local repository and extracts all resource type declarations. It counts how often each resource type is used and writes the results to a plain text file. By default it excludes the .terraform and .git directories, so user-written modules are counted while downloaded external modules are not.
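The scan-and-count approach can be sketched with standard tools. The version below is an illustration under stated assumptions, not the implementation in collect-resource-types.sh: the function name count_resource_types is hypothetical, and it matches resource declarations with a regular expression rather than parsing HCL, so it only recognizes declarations whose type and name sit on the same line as the resource keyword.

```bash
# Count Terraform resource types under a directory.
# Usage: count_resource_types PATH [EXCLUDED_DIR...]
# Prints "type count" lines, most frequent first.
count_resource_types() {
  local root="$1"
  shift
  local -a find_args=("$root")
  local dir first=1

  # Prune excluded directory names anywhere in the tree.
  if (( $# > 0 )); then
    find_args+=( '(' -type d '(' )
    for dir in "$@"; do
      (( first )) || find_args+=( -o )
      find_args+=( -name "$dir" )
      first=0
    done
    find_args+=( ')' -prune ')' -o )
  fi

  # Capture the type from lines such as: resource "aws_instance" "web" {
  local pattern='s/^[[:space:]]*resource[[:space:]]+"([^"]+)"[[:space:]]+"[^"]+".*/\1/p'

  find "${find_args[@]}" -type f -name '*.tf' -exec sed -n -E "$pattern" {} + |
    sort | uniq -c | sort -k1,1nr -k2,2 |
    awk '{ printf "%s %s\n", $2, $1 }'
}
```

Calling `count_resource_types . .terraform .git` mirrors the documented default exclusions: files under a user-written modules directory are counted, while anything under .terraform is skipped.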
./scripts/collect-resource-types.sh [OPTIONS]
- --path PATH: Local path to Terraform repository
- --output-dir PATH: Output directory for text file (default: ./resource-stats)
- --exclude-dirs DIRS: Space-separated list of directories to exclude (default: .terraform .git)
- -h, --help: Display help message
# Analyze local repository
./scripts/collect-resource-types.sh --path /path/to/terraform/repo
# Analyze current directory
./scripts/collect-resource-types.sh --path .
# Custom output directory and exclusions
./scripts/collect-resource-types.sh --path ./my-terraform --output-dir ./stats --exclude-dirs ".terraform modules vendor"
The script creates a single text file:
<output-dir>/
└── <repo-name>-resource-types.txt
The text file contains:
- Header: Repository information and collection metadata
- Parameters: Configuration used (excluded directories)
- Statistics: Summary of the analysis
  - Files scanned
  - Total resources found
  - Unique resource types
  - Analysis duration
- Resource Type Counts: Sorted list of resource types with their frequency counts
Terraform Resource Type Statistics
===================================
Repository: examplerepo
Repository Path: /path/to/repo
Collection Timestamp: 2024-01-15 10:30:00 UTC
Parameters:
Excluded Directories: .terraform .git
Statistics:
Files Scanned: 42
Total Resources: 156
Unique Resource Types: 23
Analysis Duration: 2s
Resource Type Counts:
---------------------
aws_instance 45
aws_s3_bucket 12
aws_iam_role 8
...
- Smart Module Handling: Includes user-written modules but excludes external modules (from the .terraform/ directory)
- Local Analysis: Works directly with local Terraform repositories
- Custom Exclusions: Configurable directory exclusions
- Efficient Scanning: Fast file scanning and resource extraction
- Text Output: Human-readable plain text output with statistics and counts
When adding new scripts to this repository:
- Place scripts in the scripts/ directory
  - Use descriptive, kebab-case names (e.g., analyze-commits.sh)
- Follow the existing script structure:
  - Include a shebang: #!/usr/bin/env bash
  - Use set -o errexit, set -o nounset, set -o pipefail
  - Include a help function with -h, --help support
  - Add proper error handling and validation
- Document in this README:
  - Add a new section under "Scripts"
  - Include description, usage, arguments, examples, and output structure
  - Follow the same format as existing script documentation
- Make scripts executable:
  chmod +x scripts/your-script.sh
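The structure described above could look roughly like the following skeleton. It is a starting-point sketch rather than a copy of an existing script: the script name, the --output-dir option, and its ./output default are placeholders chosen for illustration.

```bash
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail

# Print the help text.
usage() {
  cat <<'EOF'
Usage: your-script.sh [OPTIONS]

Options:
  --output-dir PATH  Output directory (default: ./output)
  -h, --help         Display this help message
EOF
}

main() {
  local output_dir="./output"

  # Parse arguments, validating that options taking a value received one.
  while (( $# > 0 )); do
    case "$1" in
      --output-dir)
        if (( $# < 2 )); then
          echo "error: --output-dir requires a value" >&2
          return 2
        fi
        output_dir="$2"
        shift 2
        ;;
      -h|--help)
        usage
        return 0
        ;;
      *)
        echo "error: unknown option: $1" >&2
        usage >&2
        return 2
        ;;
    esac
  done

  # The script's real work would go here.
  echo "output directory: $output_dir"
}

main "$@"
```

Keeping the logic inside main with local variables avoids leaking globals, and returning a distinct status (2 here) for usage errors lets callers tell bad invocations apart from runtime failures.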
When contributing new scripts or improvements:
- Follow bash best practices and style guidelines
- Include error handling and input validation
- Add appropriate comments and documentation
- Test scripts on both macOS and Linux when possible
- Update this README with script documentation
[Add your license here]