Enable service discovery for schema migrations in k8s deployments for GitLab.com
Context
In Add configuration to use service discovery to i... (#890 - closed) we added a way to configure and use service discovery for schema migrations.
This is a follow-up to that issue to actually enable the migrations in our environments.
Update: Based on the recent findings from #1006 (comment 1610698801), I'm rescoping this issue to target only:
- Asserting that we can use the database FQDN from the registry.
- Removing the service discovery configuration from the GitLab.com deployments and from the registry configuration setup.
With the above tackled in this issue we can follow up with:
- Add a new configuration field (to the registry database config section) for specifying the primary database host used for running migrations: #1142 (closed)
- Update GitLab.com deployments to use the new configuration in gstg and gprd: #1143 (closed)
Implementation plan
Once we finish verifying that the registry can resolve the primary address of the database (#890 (closed)), we should update
the environments (pre, gstg, gprd) so that the migrations are checked and run using service discovery.
This depends on Registry: add database discovery configuration (gitlab-org/charts/gitlab!3142 - merged) and Release Version v3.72.0-gitlab (#999 - closed)
Steps:
1. Update the K8s workloads on pre and gstg with the service discovery fields (https://gitlab.com/gitlab-com/gl-infra/k8s-workloads/gitlab-com/-/blob/master/releases/gitlab/values/pre.yaml.gotmpl#L56) added via gitlab-org/charts/gitlab!3142 (merged):
       discovery:
         enabled: false
         # in testing mode the registry will attempt to solve the configuration when `nameserver` is not empty.
         nameserver: fqdn.of.valid.dns
         port: 53
         primaryrecord: primary.database.fqdn.
         tcp: true
2. Monitor the logs and search for json.msg: "DNS lookup found SRV records" and json.msg: "DNS lookup found A records", or related errors.
3. If we can confirm that the registry can resolve the service names:
   - Add the settings to gprd and do the same checks in the logs.
   - Continue with step 4.
4. Make the necessary changes to the registry to use the resolved IP address for the database configuration to run schema migrations.
5. Release and deploy to pre and test that schema migrations are working.
6. Proceed to gprd!
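The log check in the steps above can be sketched as a small script. This is only a minimal sketch, assuming the registry logs have been exported as JSON lines with `msg` and `level` keys (the `json.msg` field suggests this shape, but it is an assumption). The function name `summarize_dns_logs` and the sample error message are hypothetical and do not come from the registry.

```python
import json
from collections import Counter

# The two success messages named in the steps above.
SUCCESS_MESSAGES = (
    "DNS lookup found SRV records",
    "DNS lookup found A records",
)


def summarize_dns_logs(lines):
    """Count DNS lookup successes and DNS-related errors in JSON log lines.

    Assumption: each line is a JSON object with `msg` and `level` keys.
    """
    counts = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue  # skip JSON values that are not objects
        msg = str(entry.get("msg", ""))
        if msg in SUCCESS_MESSAGES:
            counts[msg] += 1
        elif entry.get("level") == "error" and "DNS" in msg:
            counts["errors"] += 1
    return counts


# Made-up sample lines; the error message text is hypothetical.
sample = [
    '{"level": "info", "msg": "DNS lookup found SRV records"}',
    '{"level": "info", "msg": "DNS lookup found A records"}',
    '{"level": "error", "msg": "DNS lookup failed"}',
    "not json",
]
summary = summarize_dns_logs(sample)
```

A non-zero count for both success messages and a zero `errors` count would be the signal to move on to gprd. In practice the same filter would more likely be a saved search in the logging UI.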
If step 3 fails, we will need to work with infra to find the appropriate settings for the nameserver and the primaryrecord.
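When iterating on candidate values with infra, a quick sanity check of the settings before rolling them out may save a deploy cycle. The following is a minimal sketch, not the registry's own validation: the field names mirror the `discovery` block above, while the `DiscoveryConfig` class, its methods, and the specific checks (for example, treating a trailing dot on `primaryrecord` as required, as in the example value) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DiscoveryConfig:
    # Field names and defaults mirror the `discovery` block in the steps above.
    enabled: bool = False
    nameserver: str = ""
    port: int = 53
    primaryrecord: str = ""
    tcp: bool = True

    def attempts_lookup(self) -> bool:
        # Reading of the config comment: the registry attempts resolution
        # whenever `nameserver` is non-empty, even with `enabled: false`.
        return bool(self.nameserver)

    def nameserver_address(self) -> str:
        # host:port form of the DNS server to query.
        return f"{self.nameserver}:{self.port}"

    def problems(self) -> list:
        """Return human-readable issues; an empty list means no issue found."""
        found = []
        if not self.nameserver:
            found.append("nameserver is empty")
        if not 1 <= self.port <= 65535:
            found.append("port is outside 1-65535")
        if not self.primaryrecord:
            found.append("primaryrecord is empty")
        elif not self.primaryrecord.endswith("."):
            # Assumption: a fully qualified name with a trailing dot is expected.
            found.append("primaryrecord has no trailing dot")
        return found


cfg = DiscoveryConfig(
    nameserver="fqdn.of.valid.dns",
    primaryrecord="primary.database.fqdn.",
)
issues = cfg.problems()
```

With the placeholder values from the example block, `issues` is empty and `cfg.nameserver_address()` yields the `host:port` pair to try with a manual DNS query.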
Note: Depending on the success of this issue, please look at chore(database): set maxopen to 1 for migrations (#922 - closed) and make sure it's still viable. If so, ask for it to be scheduled in milestone N+1.