# Deploying Sourcegraph Executors natively on Kubernetes
The native Kubernetes Executors have a master Executor pod that schedules worker pods via the Kubernetes API. The master Executor pod manages the lifecycle of jobs, while the worker pods process the actual batch change or precise auto indexing job specs. For more details, see how it works.
## Requirements
Executors interact with the Kubernetes API to manage the lifecycle of individual executor jobs. The following are the RBAC Roles needed to run Executors natively on Kubernetes.
| API Groups | Resources | Verbs | Reason |
|---|---|---|---|
| `batch` | `jobs` | `create`, `delete` | Executors create Job pods to run processes. Once Jobs are completed, they are cleaned up. |
| `""` (core) | `pods`, `pods/log` | `get`, `list`, `watch` | Executors need to look up and stream logs from the Job Pods. |
Here's an example Role YAML to demonstrate the RBAC requirements for native Kubernetes Executors:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: sg-executor-batches-role
  namespace: default
rules:
  - apiGroups:
      - batch
    resources:
      - jobs
    verbs:
      - create
      - delete
  - apiGroups:
      - ""
    resources:
      - pods
      - pods/log
    verbs:
      - get
      - list
      - watch
  # Secrets are required post 5.5, when all pods run a single job
  - apiGroups:
      - ""
    resources:
      - secrets
    verbs:
      - create
      - delete
  # PVCs are required if KUBERNETES_JOB_VOLUME_TYPE is "pvc"
  # - apiGroups:
  #     - ""
  #   resources:
  #     - persistentvolumeclaims
  #   verbs:
  #     - create
  #     - delete
```
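On its own, a Role grants nothing; it must be bound to the ServiceAccount the Executor pod runs as. Below is a minimal RoleBinding sketch, assuming a ServiceAccount named `sg-executor` in the `default` namespace (both names are illustrative, and the Helm chart or Kustomize component may create this binding for you):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: sg-executor-batches-rolebinding
  namespace: default
subjects:
  - kind: ServiceAccount
    name: sg-executor  # illustrative; use the Executor's actual ServiceAccount
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: sg-executor-batches-role
```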
## Deployment
Native Kubernetes Executors can be deployed via either the `sourcegraph-executor-k8s` Helm chart or the `executors/k8s` Kustomize component.
### Steps

1. If you are deploying Executors for processing Batch Changes, set the `native-ssbc-execution` feature flag.
2. Configure the following environment variables on the Executor Deployment (see the example env block after these steps):
   - `EXECUTOR_FRONTEND_URL` should match the URL of your Sourcegraph instance.
   - `EXECUTOR_FRONTEND_PASSWORD` should match the `executors.accessToken` key in the Sourcegraph instance's site configuration.
   - Either `EXECUTOR_QUEUE_NAMES` or `EXECUTOR_QUEUE_NAME` should be set, depending on whether the Executor will process batch change or precise auto-indexing jobs.
   - `EXECUTOR_KUBERNETES_NAMESPACE` should be set to the Kubernetes namespace where you intend to run the worker pods. This should generally match the namespace where you deploy the Executor resources in the next step.

   Additional environment variables may need to be configured for your Executors deployment. See here for a complete list.
3. Deploy the native Kubernetes Executor resources to your Kubernetes cluster via either the `sourcegraph-executor-k8s` Helm chart or the `executors/k8s` Kustomize component.
   - For more details on how to work with the Sourcegraph Helm chart, see the Helm chart docs. (Note: the same Helm values `override.yaml` file can be used for both the main `sourcegraph` Helm chart and the `sourcegraph-executor-k8s` chart.)
   - For more details on how to configure the `executors/k8s` component for your Kustomize deployment, see the Kustomize configuration docs.
4. Once all the native Kubernetes Executor resources have been deployed to your cluster, confirm that the Executors are online by checking the Executors page under **Site admin > Executors > Instances**.
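As a reference, the environment variables from step 2 might look like the following on the Executor Deployment. All values here are placeholders (the URL, token, queue names, and namespace are illustrative); substitute your own:

```yaml
env:
  - name: EXECUTOR_FRONTEND_URL
    value: "https://sourcegraph.example.com"  # placeholder: your Sourcegraph instance URL
  - name: EXECUTOR_FRONTEND_PASSWORD
    value: "<executors.accessToken value>"    # placeholder: from site configuration
  - name: EXECUTOR_QUEUE_NAMES
    value: "batches,codeintel"                # assumed comma-separated list of queue names
  - name: EXECUTOR_KUBERNETES_NAMESPACE
    value: "default"                          # placeholder: namespace for worker pods
```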
## Additional Notes
### Firecracker
Executors deployed on Kubernetes do not use Firecracker.
If you have security concerns, consider deploying via Terraform or installing the binary directly.
### Job Scheduling
Note: Kubernetes has a default limit of 110 pods per node. If you run into this limit, you can lower the number of Job Pods running on a node by setting the `EXECUTOR_MAXIMUM_NUM_JOBS` environment variable.
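For example, to cap the Executor at 32 concurrent Job Pods (the value is illustrative), add the following to the Executor Deployment's environment:

```yaml
- name: EXECUTOR_MAXIMUM_NUM_JOBS
  value: "32"  # illustrative cap on concurrent Job Pods per Executor
```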
Executors deployed on Kubernetes require Jobs to be scheduled on the same Node as the Executor. This is to ensure that Jobs are able to access the same Persistent Volume as the Executor.
To ensure that Jobs are scheduled on the same Node as the Executor, the following environment variables can be set:

- `EXECUTOR_KUBERNETES_NODE_NAME`
- `EXECUTOR_KUBERNETES_NODE_SELECTOR`
- `EXECUTOR_KUBERNETES_NODE_REQUIRED_AFFINITY_MATCH_EXPRESSIONS`
- `EXECUTOR_KUBERNETES_NODE_REQUIRED_AFFINITY_MATCH_FIELDS`
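As an illustration, a node selector can pin Jobs to Nodes carrying a specific label. The label below is a placeholder, and the comma-separated `key=value` format is an assumption; verify both against the environment variable reference:

```yaml
- name: EXECUTOR_KUBERNETES_NODE_SELECTOR
  value: "executor-pool=batches"  # placeholder label; format assumed to be comma-separated key=value pairs
```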
#### Node Name
Using the Downward API, the `spec.nodeName` field can be used to set the `EXECUTOR_KUBERNETES_NODE_NAME` environment variable.

```yaml
- name: EXECUTOR_KUBERNETES_NODE_NAME
  valueFrom:
    fieldRef:
      fieldPath: spec.nodeName
```
This ensures that the Job is scheduled on the same Node as the Executor.
However, if the node does not have enough resources to run the Job, the Job will not be scheduled.
### Firewall Rules
The following firewall rules are highly recommended when running Executors on Kubernetes in a cloud environment:

- Disable access to internal resources.
- Disable access to `169.254.169.254` (the AWS / GCP Instance Metadata Service).
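One way to enforce the metadata-service rule from inside the cluster is a Kubernetes NetworkPolicy. The following is a minimal sketch, assuming your CNI plugin enforces NetworkPolicy and that the worker pods run in the `default` namespace (both are assumptions; adapt to your environment):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-instance-metadata
  namespace: default  # assumed namespace for Executor worker pods
spec:
  podSelector: {}  # applies to all pods in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 169.254.169.254/32  # block the AWS / GCP instance metadata service
```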
### Batch Changes
To run Batch Changes on Kubernetes, native Server-Side Batch Changes must be enabled.
### Docker Image
The Executor Docker image is available on Docker Hub at `sourcegraph/executor-kubernetes`.
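For example, referencing the image in a Deployment spec might look like the following (the tag is a placeholder; pin it to the release that matches your Sourcegraph version):

```yaml
containers:
  - name: executor
    image: sourcegraph/executor-kubernetes:5.2.0  # placeholder tag; match your Sourcegraph release
```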
### Example Configuration YAML
See the local development YAMLs for an example of how to configure the Executor in Kubernetes.