Automating Repository Management using Rulesets and Custom Properties
Automating Repository Management in GitHub
In today's fast-paced development environment, managing multiple GitHub repositories efficiently is crucial for maintaining security, consistency, and compliance. This is especially true for large organisations managing hundreds of production repositories.
In this post, I'll show you how I built a system to automate GitHub repository management using Terraform, Python, and GitHub Actions. I did this for a large organisation on my project to help them manage compliance and security across all their repositories, and I feel it's worthwhile sharing how you can incorporate this into your own workload.
Here you can find the link to my GitHub repository - Repository
The Challenge
Large organisations often maintain hundreds, if not thousands, of repositories. The challenge many of them face is that they have no way to manage settings centrally and end up configuring each repository individually. Manually configuring and updating these settings across all repositories is time-consuming, error-prone, and nearly impossible to manage at scale.
Common challenges include:
- Consistency: Ensuring uniform protection rules across all repositories.
- Scalability: Managing protection rules for hundreds/thousands of repositories.
- Maintenance: Keeping rules updated as repositories are added or removed.
- Compliance: Meeting organisational governance requirements.
- Automation: Reducing manual intervention and human error.
Repository Structure
The repository follows a clean, modular structure:
github-repository-rules/
├── .github/
│   └── workflows/           # GitHub Actions workflows
├── components/              # Terraform configuration
├── custom-properties/       # Scripts for repository custom properties
├── scripts/                 # Utility scripts
├── production-repos.json    # List of production repositories
└── ReadMe.md                # Documentation
Understanding GitHub Rulesets and Custom Properties
Before diving into the solution, it's worth understanding the key GitHub features we're working with:
GitHub Rulesets
GitHub Rulesets are a powerful feature that allows you to define and enforce policies across repositories. They provide:
- Branch protection: Prevent force pushes, require pull request reviews, and enforce status checks.
- Tag protection: Control who can create and delete tags.
- Standardisation: Apply consistent rules across multiple repositories.
- Bypass options: Allow specific teams or users to bypass rules when necessary.
Rulesets can be applied at the organisation level, ensuring all repositories follow the same governance standards.
Custom Properties
GitHub's custom properties feature allows you to add metadata to repositories. These properties:
- Define specific attributes for repositories (e.g., "is_production" or "is_development")
- Help with categorisation and filtering
- Support programmatic decision-making based on repository attributes
- Enable better governance and reporting
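To illustrate the programmatic decision-making point, here is a minimal sketch that filters repositories by a custom property. It assumes input shaped like the JSON returned by GitHub's "list custom property values for organisation repositories" endpoint (GET /orgs/{org}/properties/values). The function name and the sample data are hypothetical and not part of the original project.

```python
def repos_with_property(property_values, name, value="true"):
    """Return names of repositories whose custom property `name` equals `value`.

    `property_values` is assumed to be a list of objects, each with a
    "repository_name" and a "properties" list of {"property_name", "value"}.
    """
    matches = []
    for repo in property_values:
        for prop in repo.get("properties", []):
            if prop.get("property_name") == name and prop.get("value") == value:
                matches.append(repo["repository_name"])
                break  # one match per repository is enough
    return matches


# Hypothetical sample data, not taken from a real organisation.
sample = [
    {"repository_name": "payments-api",
     "properties": [{"property_name": "is_production", "value": "true"}]},
    {"repository_name": "sandbox",
     "properties": [{"property_name": "is_production", "value": None}]},
]
print(repos_with_property(sample, "is_production"))
```

The same filter could feed reporting or decide which repositories a later automation step should touch.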
Key Components
1. Terraform Infrastructure as Code
The core functionality resides in the components directory, which contains Terraform modules to:
- Create and manage an organisation-level ruleset
- Configure branch protection rules for main and master branches
- Set up cloud resources for state management
The main Terraform configuration lets us set up the organisation rulesets via code, which makes the setup reusable and removes the need to configure it manually.
2. Repository Management
The production-repos.json file serves as the single source of truth for all repositories. This list is automatically updated by scheduled GitHub Actions workflows, which scan various configuration files across different repositories to identify production repositories.
3. Custom Properties
One of the most powerful features is the custom properties system. The Python script in the custom-properties directory:
- Defines a custom property called is_production at the organisation level
- Sets this property for all repositories listed in the production-repos.json file
- Verifies that the properties have been correctly applied
This makes it easy to identify and query production repositories programmatically.
NOTE: Custom properties are not yet managed via Terraform, which is why we had to use Python to create them. Hopefully in the future we will be able to automate this with Terraform, without needing another programming language!
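The "set this property for all listed repositories" step can be sketched roughly as follows. This is not the project's actual script. It assumes production-repos.json is a flat JSON array of repository names, and that values are written with GitHub's PATCH /orgs/{org}/properties/values endpoint, which takes a list of repository names per request. The 30-name batch limit is my assumption and should be checked against the API documentation; the function name is hypothetical.

```python
import json

MAX_REPOS_PER_REQUEST = 30  # assumed per-request limit; verify against the API docs


def build_property_payloads(repo_names, property_name="is_production", value="true"):
    """Split repository names into batches and build one request body per batch."""
    payloads = []
    for start in range(0, len(repo_names), MAX_REPOS_PER_REQUEST):
        batch = repo_names[start:start + MAX_REPOS_PER_REQUEST]
        payloads.append({
            "repository_names": batch,
            "properties": [{"property_name": property_name, "value": value}],
        })
    return payloads


# Hypothetical file contents: a flat JSON array of repository names.
repo_names = json.loads('["repo-a", "repo-b", "repo-c"]')
for body in build_property_payloads(repo_names):
    # Each body would be sent as the JSON payload of one PATCH request.
    print(json.dumps(body))
```

Building the payloads separately from sending them keeps the batching logic easy to test without touching the API.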
4. Automation with GitHub Actions
The repository leverages several GitHub Actions workflows:
- update-repos.yaml: Runs daily at midnight to update the list of production repositories from the configured URLs.
- terraform.yaml: Applies Terraform changes when the main branch is updated.
Self-Updating Architecture
One of the most elegant aspects of this solution is its self-updating nature. The system:
- Periodically scans configuration files in other repositories to identify production repositories.
- Updates its own production-repos.json file.
- Commits the changes back to the repository.
- This triggers a workflow to apply the updated repository rules.
This means no manual intervention is required to keep the repository list up to date, because the Python script pulls the list of repositories from external URLs. If you need to add a new repository, all you have to do is add its name to the file behind the URL, and it is picked up automatically by the nightly run at midnight.
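The update-and-commit step can be sketched as below. It is an illustration only, and assumes production-repos.json holds a sorted, flat JSON array of names. The function rewrites the file only when the discovered list differs, so a workflow can skip the commit when nothing has changed. The function name is hypothetical.

```python
import json
import tempfile
from pathlib import Path


def update_repo_list(path, discovered):
    """Rewrite the JSON list at `path` if `discovered` differs; return True if changed."""
    path = Path(path)
    current = json.loads(path.read_text()) if path.exists() else []
    updated = sorted(set(discovered))  # de-duplicate and keep a stable order
    if updated == current:
        return False  # nothing to commit
    path.write_text(json.dumps(updated, indent=2) + "\n")
    return True


# Usage with a temporary file standing in for production-repos.json.
with tempfile.TemporaryDirectory() as tmp:
    repo_file = Path(tmp) / "production-repos.json"
    print(update_repo_list(repo_file, ["repo-b", "repo-a"]))  # first run writes the file
    print(update_repo_list(repo_file, ["repo-a", "repo-b"]))  # same set, so no change
```

Sorting before comparing avoids spurious commits when the sources return the same names in a different order.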
Implementation Details
Rule Set Configuration:
The Terraform configuration creates an organisation-level ruleset that:
- Is named "Production Repositories".
- Targets both the main and master branches.
- Enforces a linear history.
- Requires at least one approving review for pull requests.
- Allows certain GitHub teams to bypass the protection rules, for example for emergency merges.
The beauty of this approach is that it centralizes rule management - a single Terraform resource applies consistent rules across many repositories.
resource "github_organization_ruleset" "default_ruleset" {
  name        = "Production Repositories"
  target      = "branch"
  enforcement = "active"

  # Apply the rules to the main and master branches only.
  conditions {
    ref_name {
      include = ["refs/heads/main", "refs/heads/master"]
      exclude = []
    }
  }

  rules {
    deletion                = false
    required_linear_history = true

    pull_request {
      required_approving_review_count = 1
    }
  }

  # Members of this team may bypass the rules (e.g. for emergency merges).
  # The data.github_team.admin data source must be declared elsewhere.
  bypass_actors {
    actor_id    = data.github_team.admin.id
    actor_type  = "Team"
    bypass_mode = "always"
  }
}
This is how the ruleset should look once the code has been applied. You can check this by going to Settings > Rules:
Custom Properties Implementation
The custom properties script uses the GitHub API to define and set properties:
import requests

# API_BASE and headers (including the auth token) are defined elsewhere in the script.
def define_custom_property(org_name):
    url = f"{API_BASE}/orgs/{org_name}/properties/schema/is_production"
    data = {
        "value_type": "true_false",
        "required": False,
        "default_value": "",
        "description": "Indicates if the repository is in production",
        "values_editable_by": "org_and_repo_actors"
    }
    response = requests.put(url, headers=headers, json=data)
    # Error handling and response processing
Below is a screenshot of how the custom properties look when a repository is assigned the is_production property, which indicates that the repository contains production code.
Repository Discovery
The repository discovery script scans several sources:
urls = [ replace-this-with-the-url-to-your-stored-repositories ]
Each URL is used to fetch the names of the repositories. In our case, the URL pointed to a file listing the names of all the repositories that contained production code.
The script extracts repository names from these files, ensuring a comprehensive list of production repositories. You would need to replace YOUR_ORG and the paths with your organisation's repositories that contain configuration information.
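As a rough sketch of the extraction step, the function below parses repository names out of fetched text. The file format is an assumption on my part (one name per line, optional owner/ prefix, # comments), since the real format depends on your own sources, and the function name is hypothetical.

```python
def extract_repo_names(text):
    """Parse repository names from text with one name per line.

    Blank lines and lines starting with '#' are skipped, and an optional
    'owner/' prefix is removed so only the repository name remains.
    Duplicates are dropped while keeping first-seen order.
    """
    names = []
    for line in text.splitlines():
        line = line.strip().rstrip("/")
        if not line or line.startswith("#"):
            continue
        name = line.split("/")[-1]
        if name not in names:
            names.append(name)
    return names


# Hypothetical file contents, standing in for the body fetched from one URL.
sample = """
# production services
YOUR_ORG/payments-api
web-frontend
YOUR_ORG/payments-api
"""
print(extract_repo_names(sample))
```

If your sources are YAML or JSON configuration files instead, only this parsing function would need to change.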
Conclusion
The github-repository-rules project showcases an elegant solution to a common enterprise challenge: managing GitHub repositories at scale. By combining Terraform for infrastructure as code, Python for flexible data processing, and GitHub Actions for automation, it creates a self-maintaining system that enforces organisational standards consistently.
The code demonstrates best practices in error handling, API interaction, and automation design. The solution is both robust and adaptable, capable of scaling to hundreds of repositories with minimal maintenance overhead.
For organisations struggling with repository governance, this approach provides a blueprint that can be customised to fit specific requirements. By investing in automation upfront, teams can free themselves from repetitive manual tasks and focus on higher-value work while improving security and compliance.
The real strength of this solution lies in its adaptability - as organisational needs evolve, the automation can be extended to incorporate new rules, properties, and repositories with minimal effort.