4 Commits

Author SHA1 Message Date
Nelson Yen
80e4ce1769 mmkubernetes fix for apiserver error handling
Submitted on behalf of @abwaheed
- Added graceful handling of apiserver errors with unexpected responses,
  i.e., anything other than 200, 404, or 429. The idea is that a
  transient apiserver error state will recover, and we don't want
  mmkubernetes to miss metadata resolution for containers that don't
  have cached metadata. During these transient error states,
  mmkubernetes provides basic resolution of namespace and pod metadata,
  based on the container file path, for new pods whose metadata is not
  yet cached. Once the error state clears, mmkubernetes is expected to
  resume normal metadata resolution.
- Added a unit test case for the apiserver returning 500, with changes to
  the mock server
- Fixed an existing unit test that was failing due to a missing expected
  results file
- Added mmkubernetes unit tests to testbench
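
The path-based fallback described in this commit can be illustrated with a
minimal Python sketch. It is not the plugin's actual rulebase. It assumes the
common kubelet log layout
/var/log/containers/<pod>_<namespace>_<container>-<id>.log, and the names
`CONTAINER_LOG_RE` and `metadata_from_path` are hypothetical.

```python
import re

# Assumed layout (not taken from the commit):
#   /var/log/containers/<pod>_<namespace>_<container>-<64-hex-id>.log
CONTAINER_LOG_RE = re.compile(
    r"^/var/log/containers/"
    r"(?P<pod_name>[^_/]+)_"
    r"(?P<namespace_name>[^_/]+)_"
    r"(?P<container_name>.+)-"
    r"(?P<container_id>[0-9a-f]{64})\.log$"
)


def metadata_from_path(path):
    """Return basic metadata parsed from a container log path, or None.

    Hypothetical helper: it only recovers what the file name encodes,
    which is what remains available when the apiserver cannot be queried.
    """
    match = CONTAINER_LOG_RE.match(path)
    return match.groupdict() if match else None
```

A record for a new pod could then be annotated with `pod_name` and
`namespace_name` from the path alone while the apiserver returns, say, 500;
labels and annotations would only be available once lookups succeed again.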
2021-06-29 18:14:25 -07:00
Rich Megginson
3987cd929d mmkubernetes: action fails preparation cycle if kubernetes API destroys resource during bootup sequence
The plugin was not handling 404 Not Found correctly when looking
up pods and namespaces.  In this case, we assume the pod/namespace
was deleted, annotate the record with whatever metadata we have,
and cache the fact that the pod/namespace is missing so we don't
attempt to look it up again.
In addition, the plugin was not handling error 429 (apiserver busy) correctly.
In this case, it should also annotate the record with whatever
metadata it has, and _not_ cache anything.  By default the plugin
will retry every 5 seconds to connect to Kubernetes.  This
behavior is controlled by the new config param `busyretryinterval`.
This commit also adds impstats counters so that admins can
view the state of the plugin to see if the lookups are working
or are returning errors.  The stats are reported per-instance
or per-action to facilitate using multiple different actions
for different Kubernetes servers.
This commit also adds support for client cert auth to
Kubernetes via the two new config params `tls.mycert` and
`tls.myprivkey`.
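
The status handling described above can be sketched roughly as follows, in
Python rather than the plugin's C. This is an illustration under stated
assumptions, not the plugin's code: the class `PodLookup`, the `fetch`
callback, and the counter names are hypothetical. Only the 5 second default
and the cache/no-cache rules come from the commit message.

```python
import time


class PodLookup:
    """Hypothetical lookup wrapper; `fetch(key)` returns (status, metadata)."""

    def __init__(self, fetch, busy_retry_interval=5.0, clock=time.monotonic):
        self.fetch = fetch
        self.busy_retry_interval = busy_retry_interval  # seconds
        self.clock = clock
        self.cache = {}        # key -> metadata, or None for "known missing"
        self.busy_until = 0.0  # no API calls are made before this time
        # Counter names are made up for this sketch.
        self.stats = {"success": 0, "notfound": 0, "busy": 0, "error": 0}

    def lookup(self, key, basic):
        """Return the best available metadata for `key`."""
        if key in self.cache:
            cached = self.cache[key]
            return basic if cached is None else {**basic, **cached}
        if self.clock() < self.busy_until:
            return basic  # still backing off after a 429
        status, metadata = self.fetch(key)
        if status == 200:
            self.stats["success"] += 1
            self.cache[key] = metadata
            return {**basic, **metadata}
        if status == 404:
            self.stats["notfound"] += 1
            self.cache[key] = None  # remember that the resource is gone
        elif status == 429:
            self.stats["busy"] += 1  # nothing is cached for a busy server
            self.busy_until = self.clock() + self.busy_retry_interval
        else:
            self.stats["error"] += 1  # nothing is cached here either
        return basic
```

In every non-200 case the record is still annotated with whatever basic
metadata is already known. Only a 404 is remembered, so a deleted pod or
namespace is not looked up again.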
2018-09-14 12:42:06 -06:00
Rich Megginson
fb4a41ca47 mmkubernetes: stops working with non-kubernetes container names
When mmkubernetes encounters a record with a CONTAINER_NAME field
whose value does not match the rulebase, it returns an error and
stops processing all subsequent records.
The fix is to check the return value of ln_normalize to see if
it is a "hard" error or a "does not match" error.
This also adds a test for pod names with dots in them.
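
The distinction between a "hard" error and a "does not match" result can be
sketched generically in Python. This does not use liblognorm. The enum
`ParseResult`, the functions `normalize` and `process`, and the
`CONTAINER_NAME` pattern are all hypothetical stand-ins chosen for
illustration.

```python
import enum
import re


class ParseResult(enum.Enum):
    OK = 0
    NO_MATCH = 1    # the value is simply not a Kubernetes container name
    HARD_ERROR = 2  # the normalizer itself failed


# Hypothetical pattern: k8s_<container>_<pod>_<namespace>_...
K8S_NAME_RE = re.compile(
    r"^k8s_(?P<container_name>[^_]+)_(?P<pod_name>[^_]+)_"
    r"(?P<namespace_name>[^_]+)_"
)


def normalize(value):
    """Stand-in for the rulebase match; returns (result, fields)."""
    if not isinstance(value, str):
        return ParseResult.HARD_ERROR, None
    match = K8S_NAME_RE.match(value)
    if match is None:
        return ParseResult.NO_MATCH, None
    return ParseResult.OK, match.groupdict()


def process(records):
    """Annotate matching records; pass non-matching ones through unchanged."""
    out = []
    for record in records:
        if "CONTAINER_NAME" in record:
            result, fields = normalize(record["CONTAINER_NAME"])
            if result is ParseResult.HARD_ERROR:
                raise ValueError("normalizer failed on %r" % (record,))
            if result is ParseResult.OK:
                record = {**record, "kubernetes": fields}
        out.append(record)
    return out
```

The point of the design is that only a hard error aborts. A non-matching
container name leaves the record untouched and later records are still
processed.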
2018-07-26 14:21:55 -06:00
Rich Megginson
1d49aac5cb mmkubernetes: fix lnrules, add defaults, add test
Fix lnrules for CONTAINER_NAME

Add pkg check for lognorm >= 2.0.3 so we can set the macro
to enable ln_loadSamplesFromString

Add some reasonable default values for parameters, such as
kubernetesurl https://kubernetes.default.svc.cluster.local:443

Clean up sample.conf configuration file

Add test for mmkubernetes, including mock kubernetes service
2018-04-13 13:02:44 -06:00