Wednesday, February 7, 2024

Guarding GitHub secrets in your organization

GitHub Actions Secrets security model is a breeze to understand when you are dealing with public, open-source projects. Here, contributors are encouraged to fork your repository and submit changes through a pull request.

But when you're working within a GitHub organization, where collaboration happens on private repositories, the game changes a bit. Forking is usually disabled and contributors are encouraged to commit their proposed changes to a separate branch and then do a pull request to the main branch. To keep things secure, branch protection rules are set for sensitive branches like main to prevent unauthorized push.

Sounds secure, right? But here's the catch: GitHub Action secrets are indeed available when running your workflow from a branch. This is a stark contrast to the fork model for public repos, where secrets are not passed to the runner when a workflow is triggered from a forked repository [1].

People who are new to GitHub orgnizations may be caught off guard here. For instance, if you have a secret, e.g. personal access token, that allowes pushing into a protected branch, you can request this secret into your workflow that runs in a branch and still push your code into the protected branch, effectively bypassing the branch protection rules.

One may argue that if you don't trust your colleagues users to such an extent then you defeinitely have other more important issue to solve (Conway's Law) but let's face it - we all occasionally do silly things innocently - it's part of being human.

But don't worry, there are native solutions to guard against such mishaps.

Let's dive into a practical use case. Suppose I'm a repo admin and would like to periodically run a code generation workflow, say, to update my auto-generated Opentofu configuraiton based on some external data. My main branch is protected, requiring all pull requests to be manually approved - this is, again, to encourage everyone in my organization to contribute by creating branches in my repo, while leaving me with the final say on each pull request to decide if the contribution makes sense.

One approach I could take is to issue a personal access token (PAT), define it as secret in the repo, and then use it in the checkout action to interact with the git repo as yourself:


   steps:
    - name: Checkout
      uses: actions/checkout@v4.1.1
      with:
        token: ${{ secrets.my_personal_access_token }}
  
    - name: Code gen
      run: ./generate_tf.sh
  
    - name: Commit changes if any
      uses: stefanzweifel/git-auto-commit-action@v5.0.0

While this approach works and seems to do what I want, it's severly flawed:

  • Any org member with the write access to the repo can access my PAT, e.g. when running a workflow in a branch, and hence to commit to the main branch.
  • Making things worth, they will get write access to any repository I have write access to, including my private repositories outside the organization in question!
    • Yes, with Fine-grained access tokens I can now limit token access to a particular repo, but as of Feb 2024 that's limited only to repos in my own account, i.e. to have a PAT applicable to repos in your orgnization, I will need to create a legacy token that allows writes everywhere my user has access to.

Bottom line - don't use PATs in organizations unless you are really willing to give access to everything you got on GitHub.

"Of course" I hear you saying, "let's not use my user, let's use a user. E.g. let's create a user called gh-bot, allow it to bypass pull request protection in certain repos, issue a PAT for them and use it?" - Sure, it will work but is similarly problematic, though not as dangerous as with your own PATs:

  • You need to take care of guarding credentials for that user, such as using a password manager, etc. Whoever has acess to these credentials, will have write access to the relevant repos.
  • If you reuse this user over multiple repos, then whoever gets access to the PAT (again, by running a workflow in the branch on one repo) will have access to all those repos as well.
  • You can of course define one such user for every repo but it's wastefull because you'll need to pay for each of them to have them in your org, and credential management gets even worth.

So what should we do? - We need some kind of machine user, a service account in GCP terms or IAM Role in AWS terms. Thankfully there is a such thing - GitHub Apps. Yes, it may sound confusing - we need an identity, not another app to develop; but don't worry, it provides exactly that and you won't need to write any app code. Here is how it works:

  • You create a GitHub app. You don't need to use real Homepage URLs or Callback URL, neither implement webhooks - just enter values that makes sense, context wise. Following my previous example let's call it codegen.myrepo.mycomapany.com.
  • In the app's permissions section, grant the app Read & Write access to Contents.
  • Add the app to the list of entities allowed to bypass pull request protection rules for your repo's main branch.
  • Store App ID and private key in your repo env vars / secrets.
  • Use actions/create-github-app-token to create temporary token to access your github repository.
Here is how your workflow will look like:

    steps:
    - uses: actions/create-github-app-token@v1.7.0
      id: codegen-bot-token
      with:
        app-id: ${{ vars.CODEGEN_APP_ID }}
        private-key: ${{ secrets.CODEGEN_PRIVATE_KEY }}

    - name: Checkout
      uses: actions/checkout@v4.1.1
      with:
        token: ${{ steps.codegen-bot-token.outputs.token }}

    - name: Code gen
      run: ./generate_tf.sh
  
    - name: Commit changes if any
      uses: stefanzweifel/git-auto-commit-action@v5.0.0

This solves the issue of having proper machine users - you can have one app (or more!) per repo and there is neither overhead of managing them nor additional cost incurred.

There is still one issue remaining - any user in your org with the write access to the repo can still have access to the app's private key defined in the repo's secret and hence bypass the main branch protenction rules. Indeed, the blast radius is now very limited to just one branch in one repo, but still, it feels pointless to have branch protection that can be easily (and accidentally) be bypassed by any user in your org.

Thankfully, again, there is a native solution for this - GitHub Actions Environments. (Not to be confused with GitHub Actions Environment Variables. Environments allow fine-grained control to who can access secrets and env vars defined in them.

So instead of storing our app ID / private key in the repo level-secrets, let's:

  • Create a new environment called "codegen"
  • Limit access to that environemnt to the main branch
  • Move our app ID and secret from repolevel into the environment-level

The last touch, is to request access to this environment within the workflow file: Here is how your workflow will look like:


    environment: codegen
    steps:
      ...

With that change in place, re-running your workflow will only work if run on the main. If your colleague will try to run it on any other branch in the repo (with innocous intentions most of the time), they won't be allowed:

That's it! Now our repo access is properly configured!

As you can see, designing GitHub actions workflows with princial of least priviledge mind is not obvious and quite often a well working setup can be surprisingly well open to unintended access. I hope this summary of my research on the subject will save you some time.

References

No comments:

Post a Comment