Automatically clone, update, and archive GitHub repositories with version control and compression.
- ✅ Repository Management - Add, update, delete, and track GitHub repositories
- ✅ Automatic Updates - Hourly checks with 24-hour update intervals
- ✅ Versioned Archives - Creates compressed
tar.xzarchives for each update - ✅ Incremental Archives - Only archives changed files to save disk space (70-90% reduction)
- ✅ Status Detection - Detects archived/deleted repositories via GitHub API
- ✅ GraphQL Batch API - Efficiently batches GitHub API requests (up to 100 repos per query)
- ✅ Multiple Interfaces - Web app, desktop GUI, and command-line interface
- ✅ Search & Filter - Real-time search and status filtering (web app)
- ✅ Export/Import - Export repository lists to JSON, import from JSON
- ✅ Statistics Dashboard - View totals, disk usage, and archive counts
- ✅ Real-time Updates - Live log streaming via Server-Sent Events (web app)
-
Install dependencies:
pip install -r requirements.txt
-
Choose your interface:
- CLI
python src/github_repo_saver_cli.py add https://github.com/user/repo - Web App:
python src/github_repo_saver_web.pythen openhttp://localhost:5001 - Desktop GUI:
python src/github_repo_saver_gui.py
- CLI
Full-featured CLI for automation and scripting:
# Add a repository
python src/github_repo_saver_cli.py add https://github.com/user/repo
# Add repositories from file (one URL per line)
python src/github_repo_saver_cli.py add-bulk repos.txt
# Update all repositories
python src/github_repo_saver_cli.py update
# Update a specific repository
python src/github_repo_saver_cli.py update https://github.com/user/repo
# List all repositories
python src/github_repo_saver_cli.py list
# List repositories by status
python src/github_repo_saver_cli.py list --status archived
# Delete a repository
python src/github_repo_saver_cli.py delete https://github.com/user/repo
# Refresh repository statuses
python src/github_repo_saver_cli.py refresh-statuses
# Show statistics
python src/github_repo_saver_cli.py statsModern, responsive web interface accessible from any browser:
-
Run the server:
python src/github_repo_saver_web.py
-
Open your browser: Navigate to
http://localhost:5001 -
Features:
- Add repositories (single or bulk)
- Search and filter repositories
- Sortable table columns
- View archives and README files
- Export/import repository lists
- Real-time statistics dashboard
- Live activity log
Native desktop application using PyQt5:
-
Run the GUI:
python src/github_repo_saver_gui.py
-
Features:
- Add repositories via "Add Repo" or "Bulk Upload"
- View repository status, descriptions, and timestamps
- Open folders, view archives, and read README files
- Update individual or all repositories
- Delete repositories from the list
Git-Archiver-1/
├── src/ # Main Python source code
│ ├── repo_manager.py # Backend logic (core functionality)
│ ├── github_repo_saver_gui.py # Desktop GUI (PyQt5)
│ ├── github_repo_saver_web.py # Web application (Flask)
│ └── github_repo_saver_cli.py # Command-line interface
├── templates/ # Flask HTML templates
│ └── index.html
├── static/ # Flask static files
│ ├── css/
│ │ └── style.css
│ └── js/
│ └── app.js
├── config/ # Configuration files (auto-generated)
│ ├── cloned_repos.json # Repository database
│ └── auto_update_config.json # Auto-update settings
├── data/ # Cloned repositories (auto-generated)
│ └── [repo-name].git/
│ ├── [repo files]
│ └── versions/ # Archived versions
├── requirements.txt # Python dependencies
└── README.md # This file
- Add Repositories - Provide GitHub repository URLs (one or many)
- Automatic Cloning - Repositories are cloned to the
data/folder - Daily Checks - The app checks hourly for updates (updates if 24h passed)
- Versioned Archives - When updates are detected, compressed archives are created
- Status Tracking - Monitors repository status (active/archived/deleted) via GitHub API
- GitHub API Caching - Thread-safe cache with 5-minute TTL reduces API calls by 80-90%
- GraphQL Batch API - Batch fetches up to 100 repositories per query, reducing API calls by 90%+
- Incremental Archives - Only archives changed files, reducing archive sizes by 70-90%
- Smart Git Pulls - Checks for updates before pulling, skipping unnecessary operations
- Shallow Clones - Uses
--depth 1by default for 5-10x faster initial clones - Async Archive Creation - Archives are created in background threads, keeping the UI responsive
- Thread-Safe Operations - All file operations are protected to prevent data corruption
Performance: Bulk updates of 100 repos take ~5-8 minutes (60% faster than before). GraphQL batching reduces API rate limit issues significantly.
Feel free to contribute additional features!