Partial outage of flow and BGP ingest
Incident Report for Kentik SaaS EMEA Cluster
Postmortem

ROOT CAUSE

30% of our ingest servers restarted concurrently, which caused ingest load-balancing issues and resource constraints on the servers that remained online.
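
As a rough illustration of why a concurrent restart of this size strains the remaining servers, the sketch below computes the per-server load increase when a fraction of a pool goes offline. It assumes load is spread evenly across identical servers, which is a simplification and not a description of Kentik's actual load balancer; the function name `load_multiplier` is hypothetical.

```python
def load_multiplier(restart_fraction: float) -> float:
    """Return the factor by which per-server load grows on the surviving
    servers when `restart_fraction` of the pool is offline.

    Assumption (not from the incident report): total load is constant and
    is redistributed evenly across identical surviving servers.
    """
    if not 0.0 <= restart_fraction < 1.0:
        raise ValueError("restart_fraction must be in [0, 1)")
    return 1.0 / (1.0 - restart_fraction)


if __name__ == "__main__":
    # With 30% of servers restarting, each survivor carries about 1.43x
    # its normal load under the even-distribution assumption.
    print(f"{load_multiplier(0.30):.2f}x")
```

Under these assumptions, a pool would need roughly 43% spare capacity per server to absorb a 30% concurrent restart without hitting resource limits.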

RESOLUTION

Kentik Operations allowed the servers to come back online, monitored the automated restart of all Kentik software, and tracked the re-establishment of any BGP sessions that had flapped.

Additional ingest capacity will be brought online by 2022-10-14 to help mitigate this type of scenario in the future.

Posted Oct 07, 2022 - 16:30 UTC

Resolved
This incident has been resolved.
Posted Sep 16, 2022 - 00:19 UTC
Monitoring
BGP ingest has returned to full operation.
Posted Sep 15, 2022 - 22:47 UTC
Update
Flow ingest has returned to full operation.
Posted Sep 15, 2022 - 22:12 UTC
Identified
Some BGP sessions will bounce and limited amounts of flow may be lost.
Posted Sep 15, 2022 - 21:25 UTC
This incident affected: Flow Ingest and BGP Telemetry.