
ControlMonkey uses Terraform, an Infrastructure-as-Code (IaC) tool, to define the environment. The platform connects to each supported vendor and reverse-engineers live configurations into Terraform HCL code. It then creates versioned snapshots daily.
The workflow has three phases. First, the platform performs a full asset inventory after connecting a vendor. Second, it identifies which resources have no code coverage and flags them for the operator. Third, it enables daily configuration snapshots so teams have a known-good state to recover from.
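The second phase, finding resources with no code coverage, can be pictured as a set comparison between what the inventory discovered and what Terraform already manages. The following is a minimal Python sketch of that idea, not ControlMonkey's implementation. The `Resource` type, the `find_uncovered` helper, and the vendor and resource names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Resource:
    """A live resource found during the asset inventory (hypothetical shape)."""
    vendor: str
    kind: str
    resource_id: str


def find_uncovered(inventory: Iterable[Resource], managed_ids: Set[str]) -> List[Resource]:
    """Return inventoried resources whose IDs are not tracked in Terraform code."""
    return [r for r in inventory if r.resource_id not in managed_ids]


# Illustrative data only: three discovered resources, one already under code.
inventory = [
    Resource("example-vendor", "dns_record", "rec-1"),
    Resource("example-vendor", "firewall_rule", "fw-rule-7"),
    Resource("example-vendor", "vpn_tunnel", "vpn-2"),
]
managed_ids = {"rec-1"}

uncovered = find_uncovered(inventory, managed_ids)
for resource in uncovered:
    print(f"no code coverage: {resource.kind} {resource.resource_id}")
```

In this sketch, the two resources missing from `managed_ids` are the ones that would be flagged for the operator.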
“The way to back up your configuration is with infrastructure as code,” Twizer explained. “We specifically do that with Terraform, and our core technology, our secret sauce, is to take providers or vendors of infrastructure and reverse engineer existing configuration, live configuration, to code.”
Recovery is executed through a one-click restore. When an incident occurs, the platform uses Terraform automation to provision the last known-good configuration into a second tenant. Customers can also use ControlMonkey APIs to build automated recovery playbooks triggered from external alerting tools such as PagerDuty or Datadog.
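One step an automated recovery playbook would need is choosing which daily snapshot to restore. The Python sketch below illustrates that selection under a simplifying assumption: any snapshot taken before the incident started is treated as known-good. The `Snapshot` type, the `last_known_good` helper, and the snapshot paths are hypothetical; the sketch does not call ControlMonkey's APIs, whose interface is not described here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class Snapshot:
    """A versioned daily configuration snapshot (hypothetical shape)."""
    taken_at: datetime
    hcl_path: str


def last_known_good(snapshots: Iterable[Snapshot], incident_at: datetime) -> Optional[Snapshot]:
    """Pick the most recent snapshot taken strictly before the incident.

    Assumption: snapshots that predate the incident are uncorrupted.
    Returns None when no snapshot predates the incident.
    """
    candidates = [s for s in snapshots if s.taken_at < incident_at]
    return max(candidates, key=lambda s: s.taken_at, default=None)


# Illustrative data only: three daily snapshots and an incident on the third day.
snapshots = [
    Snapshot(datetime(2024, 1, 1, 2, 0), "snapshots/2024-01-01"),
    Snapshot(datetime(2024, 1, 2, 2, 0), "snapshots/2024-01-02"),
    Snapshot(datetime(2024, 1, 3, 2, 0), "snapshots/2024-01-03"),
]
incident_at = datetime(2024, 1, 3, 1, 0)

chosen = last_known_good(snapshots, incident_at)
if chosen is not None:
    print(f"restore from {chosen.hcl_path}")
```

A playbook triggered by an alerting tool could pass the alert's start time as `incident_at`, then hand the chosen snapshot to the restore step.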
Scope: Configuration recovery, not vendor availability
To be clear, ControlMonkey does not protect against provider outages. The platform addresses configuration recovery, not vendor availability monitoring.
ControlMonkey is designed primarily for a ransomware attack that deletes or corrupts network configurations rather than data. In that situation, workloads and data may be intact, but the network control plane is gone and applications become unreachable.
