Reputation: 1588
I have a custom operator that listens to changes in a CRD I've defined in a Kubernetes cluster.
Whenever something changes in the custom resource, the operator reconciles and idempotently creates a secret (owned by the custom resource).
What I expect is for the operator to Reconcile only when something changes in the custom resource or in the secret owned by it.
What I observe is that, for some reason, the Reconcile function triggers for every CR on the cluster at strange intervals, without observable changes to related entities. I've tried focusing on a specific instance of the CR and following the times at which Reconcile was called for it. The intervals between these calls are very strange. The calls seem to alternate between two series: one starts at 10 hours and shrinks by 7 minutes at a time; the other starts at 7 minutes and grows by 7 minutes at a time.
To demonstrate, Reconcile was triggered at these times (give or take a few seconds):
00:00
09:53 (10 hours - 1*7 minute interval)
10:00 (0 hours + 1*7 minute interval)
19:46 (10 hours - 2*7 minute interval)
20:00 (0 hours + 2*7 minute interval)
29:39 (10 hours - 3*7 minute interval)
30:00 (0 hours + 3*7 minute interval)
Whenever the diminishing intervals drop below 7 hours, they reset to 10 hours. The same happens with the growing series: as soon as the intervals exceed 3 hours, they reset to 7 minutes.
My main question is: how can I investigate why Reconcile is being triggered?
I'm attaching the manifests for the CRD, a sample CR, and the operator:
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.4.1
  creationTimestamp: "2021-10-13T11:04:42Z"
  generation: 1
  name: databaseservices.operators.talon.one
  resourceVersion: "245688703"
  uid: 477f8d3e-c19b-43d7-ab59-65198b3c0108
spec:
  conversion:
    strategy: None
  group: operators.talon.one
  names:
    kind: DatabaseService
    listKind: DatabaseServiceList
    plural: databaseservices
    singular: databaseservice
  scope: Namespaced
  versions:
  - name: v1alpha1
    schema:
      openAPIV3Schema:
        description: DatabaseService is the Schema for the databaseservices API
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: DatabaseServiceSpec defines the desired state of DatabaseService
            properties:
              cloud:
                type: string
              databaseName:
                description: Foo is an example field of DatabaseService. Edit databaseservice_types.go
                  to remove/update
                type: string
              serviceName:
                type: string
              servicePlan:
                type: string
            required:
            - cloud
            - databaseName
            - serviceName
            - servicePlan
            type: object
          status:
            description: DatabaseServiceStatus defines the observed state of DatabaseService
            type: object
        type: object
    served: true
    storage: true
    subresources:
      status: {}
status:
  acceptedNames:
    kind: DatabaseService
    listKind: DatabaseServiceList
    plural: databaseservices
    singular: databaseservice
  conditions:
  - lastTransitionTime: "2021-10-13T11:04:42Z"
    message: no conflicts found
    reason: NoConflicts
    status: "True"
    type: NamesAccepted
  - lastTransitionTime: "2021-10-13T11:04:42Z"
    message: the initial names have been accepted
    reason: InitialNamesAccepted
    status: "True"
    type: Established
  storedVersions:
  - v1alpha1
---
apiVersion: operators.talon.one/v1alpha1
kind: DatabaseService
metadata:
  creationTimestamp: "2021-10-13T11:14:08Z"
  generation: 1
  labels:
    app: talon
    company: amber
    repo: talon-service
  name: db-service-secret
  namespace: amber
  resourceVersion: "245692590"
  uid: cc369297-6825-4fbf-aa0b-58c24be427b0
spec:
  cloud: google-australia-southeast1
  databaseName: amber
  serviceName: pg-amber
  servicePlan: business-4
---
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "75"
    secret.reloader.stakater.com/reload: db-credentials
    simpledeployer.talon.one/image: <path_to_image>/production:latest
  creationTimestamp: "2020-06-22T09:20:06Z"
  generation: 77
  labels:
    simpledeployer.talon.one/enabled: "true"
  name: db-operator
  namespace: db-operator
  resourceVersion: "245688814"
  uid: 900424cd-b469-11ea-b661-4201ac100014
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      name: db-operator
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        name: db-operator
    spec:
      containers:
      - command:
        - app/db-operator
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: OPERATOR_NAME
          value: db-operator
        - name: AIVEN_PASSWORD
          valueFrom:
            secretKeyRef:
              key: password
              name: db-credentials
        - name: AIVEN_PROJECT
          valueFrom:
            secretKeyRef:
              key: projectname
              name: db-credentials
        - name: AIVEN_USERNAME
          valueFrom:
            secretKeyRef:
              key: username
              name: db-credentials
        - name: SENTRY_URL
          valueFrom:
            secretKeyRef:
              key: sentry_url
              name: db-credentials
        - name: ROTATION_INTERVAL
          value: monthly
        image: <path_to_image>/production@sha256:<some_sha>
        imagePullPolicy: Always
        name: db-operator
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: db-operator
      serviceAccountName: db-operator
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2020-06-22T09:20:06Z"
    lastUpdateTime: "2021-09-07T11:56:07Z"
    message: ReplicaSet "db-operator-cb6556b76" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  - lastTransitionTime: "2021-09-12T03:56:19Z"
    lastUpdateTime: "2021-09-12T03:56:19Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  observedGeneration: 77
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1
Note: the Reconcile function returns
return ctrl.Result{Requeue: false, RequeueAfter: 0}
so an explicit requeue shouldn't be the reason for the repeated triggers.
Upvotes: 1
Views: 2354
Reputation: 29
I have the same problem. My Reconcile was triggered at these times:
00:00
09:03 (9 hours + 3 min)
18:06 (9 hours + 3 min)
00:09 (9 hours + 3 min)
The sync period is not set, so it should be the default. Kubernetes version: 1.20.11.
Upvotes: 0
Reputation: 41
This would require more info on how your controller is set up, for example, what sync period you have set. The behavior could be caused by the default sync period, which reconciles all watched objects at a fixed interval. From the controller-runtime documentation:
SyncPeriod determines the minimum frequency at which watched resources are reconciled. A lower period will correct entropy more quickly, but reduce responsiveness to change if there are many watched resources. Change this value only if you know what you are doing. Defaults to 10 hours if unset. There will be a 10 percent jitter between the SyncPeriod of all controllers so that all controllers will not send list requests simultaneously.
For more information, see: https://github.com/kubernetes-sigs/controller-runtime/blob/v0.11.2/pkg/manager/manager.go#L134
Upvotes: 1