Wilson Mar


Use Terraform’s sentinel command on Workspaces from a module registry, to automatically identify RBAC-based violations


Overview

This is a deep yet succinct tour of Hashicorp’s Terraform Enterprise (TFE), presented in light of its competitors in the market.

This is in addition to my tutorial on Terraform, which covers basic features of the open-source (free) edition of Terraform.

Terraform open-source runs on-prem or on a server you configure within a public cloud (AWS, Azure, GCP, etc.).

Terraform docs

“Terraform Enterprise” is licensed (paid) software which can run either as SaaS in the “Terraform Cloud” or on-prem. As with other SaaS products, Hashicorp provides HA (High Availability) resilience “under the hood” without effort by users.

VIDEO: Why should I consider Terraform Enterprise? by HashiCorp co-founder and CTO Armon Dadgar:

  • “Point and click” adoption of templates instead of hand-coding

  • Instead of users submitting to a manual ticketing queue (which breaks/slows agility), TFE auto-approves if policies are satisfied.

Why PaC (Policy as Code)

When each change in policy code triggers automated checks for compliance in CI/CD pipelines, it’s like having a Security Engineer always review every change, which is too costly to achieve manually.

Policy-as-code is crucial to scaling security governance across a large number of development teams.

Each review generates an attestation report as evidence to show auditors compliance (that security policies are continually enforced).

Policy as Code (PaC) means security policies are not just written in a Word document but defined and automated like source code and versioned in GitHub, which is how TFE treats them.

To “shift-left” checking earlier in the development lifecycle, verify locally on laptops before resources are deployed in the public cloud.

Costing & Optimization

TFE calculates the cost of each Terraform configuration BEFORE its resources are created in the cloud, along with the change in cost associated with each change in that configuration:

tfe-cost-delta-1918x554

Cost Estimation Policies by Kyle Ruddy (Sr. Tech Product Marketing Manager)
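
The before/after comparison described above comes down to simple arithmetic. Here is a minimal Python sketch (not Sentinel) of that idea. The dollar figures and the `max_monthly_increase` limit are hypothetical; real estimates come from TFE's own cost estimation feature.

```python
from decimal import Decimal

def cost_delta(prior_monthly: str, proposed_monthly: str) -> Decimal:
    """Return the change in estimated monthly cost (proposed minus prior)."""
    return Decimal(proposed_monthly) - Decimal(prior_monthly)

def within_budget(prior_monthly: str, proposed_monthly: str,
                  max_monthly_increase: str) -> bool:
    """Pass only when the increase stays at or below the allowed limit."""
    delta = cost_delta(prior_monthly, proposed_monthly)
    return delta <= Decimal(max_monthly_increase)

# Hypothetical figures, in USD per month.
print(cost_delta("120.50", "175.25"))               # 54.75
print(within_budget("120.50", "175.25", "50.00"))   # False
```

Decimal is used instead of float so that currency amounts compare exactly.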

TOOL: densify.com dynamically self-optimizes configurations. This “FinOps” approach works by updating tags in AWS with recommendations for server type, based on real-time cost and performance analysis. VIDEO: densify-real-time-807x261

Enforce use of Tags

Common tags include:

  • environment (prod/staging/dev, etc.)
  • cost_center
  • data_classification (secret, confidential, public, etc.)
  • etc.

The Environment tag is used by teams who want different rules to be applied based on environment:

  • prod environments should not be changed on Fridays
  • dev environments should not use big server types nor drives larger than 10 GB.
  • etc.
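
The environment-specific rules listed above can be sketched in Python (not Sentinel) as a function keyed off the environment tag. The `BIG_TYPES` set and the function name are hypothetical; only the Friday rule and the 10 GB limit come from the list.

```python
from datetime import date
from typing import List

FRIDAY = 4                    # date.weekday() counts Monday as 0
MAX_DEV_DISK_GB = 10          # limit taken from the list above
BIG_TYPES = {"m5.4xlarge"}    # hypothetical list of "big" server types

def environment_violations(environment: str, instance_type: str,
                           disk_gb: int, run_date: date) -> List[str]:
    """Return the rules broken by one proposed change, based on its environment tag."""
    found = []
    if environment == "prod" and run_date.weekday() == FRIDAY:
        found.append("prod must not change on Fridays")
    if environment == "dev" and instance_type in BIG_TYPES:
        found.append("dev must not use big server types")
    if environment == "dev" and disk_gb > MAX_DEV_DISK_GB:
        found.append("dev drives must not exceed 10 GB")
    return found

# 2021-01-22 was a Friday; 2021-01-20 was a Wednesday.
print(environment_violations("prod", "t2.micro", 8, date(2021, 1, 22)))
print(environment_violations("dev", "m5.4xlarge", 50, date(2021, 1, 20)))
```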

PROTIP: CAUTION: Policies refer to “resource_types” of specific services. Example:

  • “aws_vpn_gateway”
  • “aws_network_interface”
  • etc.

In the sample “enforce-mandatory-tags.sentinel” file, resources of new services not on the list are not evaluated, so the policy file needs to be updated when new AWS services become available. Deprecated services may cause an error when referenced.
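
To make that gap concrete, here is a small Python sketch (not the actual Sentinel policy) of a tag check driven by a fixed list of resource types. The resource dictionaries and the `aws_brand_new_service` type are hypothetical.

```python
MANDATORY_TAGS = ["environment", "cost_center", "data_classification"]
CHECKED_TYPES = {"aws_vpn_gateway", "aws_network_interface"}

def missing_tags(resources):
    """Map each checked resource address to the mandatory tags it lacks."""
    report = {}
    for res in resources:
        if res["type"] not in CHECKED_TYPES:
            continue  # not on the list: skipped without any warning
        missing = [t for t in MANDATORY_TAGS if t not in res.get("tags", {})]
        if missing:
            report[res["address"]] = missing
    return report

plan = [
    {"address": "aws_vpn_gateway.main", "type": "aws_vpn_gateway",
     "tags": {"environment": "dev"}},
    # Hypothetical new service type, untagged but never evaluated:
    {"address": "aws_brand_new_service.x", "type": "aws_brand_new_service",
     "tags": {}},
]
print(missing_tags(plan))
```

Only the VPN gateway is reported. The untagged resource of the unlisted type passes silently, which is why the list needs maintenance.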

QUESTION: How will you know when new AWS services become available or deprecated?

Also, Tags provide metadata for “intelligence” to be applied, as shown in the next topic.

Data saved from volatile disks

Some organizations scan their IaC to detect “violations” of a policy about ill-advised combinations of technologies.

For example, consider the risks of using the AWS EC2 I3 instance type:

Amazon Web Services (AWS) Elastic Compute Cloud (a.k.a. EC2) offers several types of server hardware.

Storage-optimized instances, labeled the “I3” instance type family, are designed for high-transaction, low-latency workloads that need high IOPS. I3 is built with Non-Volatile Memory Express (NVMe) SSDs, which are exposed as instance store volumes rather than as what Amazon calls “Elastic Block Store” (a.k.a. EBS). Data on these volumes persists only as long as the instance it’s bound to is up and running.

Alas, instance storage loses its data when the AWS EC2 instance is stopped or terminated. And it’s very common to stop an instance to save money or to downgrade/upgrade it.

So a smart policy would ensure that any configuration using such volatile storage has some mechanism to send data that needs to be retained to a more permanent receptacle.
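
One way such a policy could use tags as metadata is sketched below in Python (not Sentinel). The `backup_target` tag name is a hypothetical convention, and the family set lists only the I3 family discussed above.

```python
INSTANCE_STORE_FAMILIES = {"i3"}  # only the family named above; extend as needed

def needs_backup_plan(instance_type: str, tags: dict) -> bool:
    """True when an instance-store type has no declared destination for its data."""
    family = instance_type.split(".")[0]
    has_target = bool(tags.get("backup_target"))  # hypothetical tag name
    return family in INSTANCE_STORE_FAMILIES and not has_target

print(needs_backup_plan("i3.large", {}))                                # True
print(needs_backup_plan("i3.large", {"backup_target": "s3://bucket"}))  # False
print(needs_backup_plan("t2.micro", {}))                                # False
```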


Other PaC (OPA, AWS SCPs)

There is an alternative to the proprietary Terraform Sentinel policy language: Regula, which is processed by OPA (Open Policy Agent), pronounced like the Greek exclamation “oh pa!” (used to express enthusiasm, shock, or surprise, or just after having made a mistake). OPA is open-sourced at github.com/open-policy-agent by Styra.com, which provides support and training on OPA and the Rego language (https://www.openpolicyagent.org/docs/latest/policy-language/) for defining policy rules.

Any language to define policies needs to be a programming language with if/then/else using variables, loops referencing arrays, functions, etc. NOTE: Rego has Python-esque import statements, no semicolons, print() functions, list comprehensions, etc. Rego extends the Datalog query language.

OPA and its Rego language are backed by the CNCF (Cloud Native Computing Foundation) and thus used with Kafka, Kubernetes, JupiterOne, etc.

Vendors who use OPA and Rego include Harness and Spacelift SaaS. Spacelift built sophisticated tooling called Policy Workbench for capturing policy inputs and replaying evaluations, allowing you to tweak policies in a tight feedback loop until they reflect your business needs. Unlike Terraform, Spacelift can group and filter resources to understand the architecture or look up their history to get a glimpse of the evolution of your infrastructure.

OPA can run within GitHub Actions to alert about noncompliant IaC code.

github.com/iacsecurity/tool-compare lists policy checks and which tool performs them.

Post deployment, Pulumi finds unused resources daily and shuts them down.

Hashicorp Terraform Sentinel

TFE has its own proprietary policy language, Sentinel.

HCL (Hashicorp Configuration Language) allows comments, which JSON by definition does not recognize.

VIDEO: Introduction to Sentinel, HashiCorp Policy as Code Framework

tfe-pac-1436x436

The “Enterprise” license of Terraform includes a sentinel command which applies policy checks (“guardrails”) when provisioning infrastructure from a VCS (Version Control System such as GitHub) containing IaC (Infrastructure as Code).

tfe-flow-1396x613

Since January 20, 2021, an HCL-format (rather than JSON) sentinel.hcl configuration file specifies “third-generation” *.sentinel policy language files, each with an enforcement_level.


Install Sentinel with samples

Run this command from my public repo. It downloads everything needed after creating a project folder:

bash -c "$(curl -fsSL https://raw.githubusercontent.com/wilsonmar/tf-samples/main/tf-init.sh)" -v -i
   

Sections below describe the programs installed, repositories downloaded, and commands run.

The script recognizes several parameters to control what gets run:

-v (-verbose) prints out details about each step.

The script is designed to be invoked several times, but does not re-install unless the “-i” parameter is specified.

-file error-sentinel.hcl runs a policy file containing a rule that always returns false because the condition it defines is impossible. This lets you see what error messages look like when a policy run detects a violation. The command:

sentinel apply error.policy

Trace messages

The error.policy file was written to be recognized as a policy violation so you can see a sample error trace:

Execution trace. The information below will show the values of all
the rules evaluated. Note that some rules may be missing if
short-circuit logic was taken.
 
Note that for collection types and long strings, output may be
truncated; re-run "sentinel apply" with the -json flag to see the
full contents of these values.
 
The trace is displayed due to a failed policy.
 
Fail - error.policy
 
Description:
  This error-sentinel.policy always returns a violation (4 is never less than
  2).
 
error.policy:5:1 - Rule "main"
  Value:
    false
   

PROTIP: Notice that “false” is returned to indicate a policy violation.

error-sentinel.hcl contents

The policy is called from the error-sentinel.hcl file:

policy "error-sentinel" {
    source = "./error.sentinel"
    enforcement_level = "hard-mandatory"
}
   

Enforcement levels

What Sentinel does with rule violations depends on the Enforcement Level which defines the importance of that policy and thus what happens when a violation of that policy is detected:

  • “hard-mandatory” violations cause the run to be halted until the violation is resolved.

  • “soft-mandatory” violations can be overridden (ignored) on a case-by-case basis by any user with the “Manage Policy Overrides” permission. QUESTION: How?

  • “advisory” violations are surfaced to the user as informational and do not interrupt the run.
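
The three levels above amount to a small decision table. A minimal Python sketch (not Sentinel, and not how TFE implements it internally) follows. The outcome strings and the `override_granted` flag are hypothetical names.

```python
def run_outcome(level: str, passed: bool, override_granted: bool = False) -> str:
    """Describe what happens to a run after one policy is evaluated."""
    if passed:
        return "continue"
    if level == "advisory":
        return "continue with warning"
    if level == "soft-mandatory":
        # Someone with the override permission may let the run proceed.
        return "continue (overridden)" if override_granted else "halt pending override"
    if level == "hard-mandatory":
        return "halt"
    raise ValueError(f"unknown enforcement level: {level}")

for level in ("advisory", "soft-mandatory", "hard-mandatory"):
    print(level, "->", run_outcome(level, passed=False))
```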

PROTIP: NAMING CONVENTION: When a sentinel.hcl file has only one policy, name that policy the same as the sentinel.hcl file.

error-sentinel.policy contents

The main rule returns the value for the result of the entire policy.

So Sentinel expects there to be a main rule.

# This error-sentinel.policy always returns a violation (4 is never less than 2).
 
hour = 4  // hard-coded value.
 
main = rule { hour >= 0 and hour < 2 }
   

Sentinel policies are executed top-down.

STAR: https://docs.hashicorp.com/sentinel/language

https://docs.hashicorp.com/sentinel/intro/getting-started/logic

https://github.com/hashicorp/terraform-guides/blob/master/governance/third-generation/aws/aws-functions/aws-functions.sentinel

Common-Sentinel.hcl

My example-Sentinel.hcl file invokes the most common policies:

policy "enforce-mandatory-tags" {
    source = "./enforce-mandatory-tags.sentinel"
    enforcement_level = "hard-mandatory"
}
policy "restrict-allowed-vm-types" {
    source = "./restrict-allowed-vm-types.sentinel"
    enforcement_level = "soft-mandatory"
}
policy "restrict-app-service-to-https" {
    source = "./restrict-app-service-to-https.sentinel"
    enforcement_level = "advisory"
}
policy "no-prod-updates-on-fridays" {
    # Don't risk ruining someone's weekend:
    source = "./no-prod-updates-on-fridays.sentinel"
    enforcement_level = "soft-mandatory"
}
policy "aws-cis-4.1-networking-deny-public-ssh-acl-rules" {
  # (networking outside AWS to get Raw files in GitHub):
  source            = "https://raw.githubusercontent.com/hashicorp/terraform-foundational-policies-library/master/cis/aws/networking/aws-cis-4.1-networking-deny-public-ssh-acl-rules/aws-cis-4.1-networking-deny-public-ssh-acl-rules.sentinel"
  enforcement_level = "advisory"
}

Indentation is two or more spaces.

When policy files are referenced with a local (“./”) prefix, it is convenient to keep the Sentinel hcl file in the same folder.

./this.sentinel specifies the current folder.

../this.sentinel specifies the parent folder of the current file.

../../this.sentinel specifies a folder two levels up from the current file.

Relative folder references like those above work only if every repo follows a standard folder layout.

https://github.com/hashicorp/terraform-foundational-policies-library/tree/master/cis/aws/networking/aws-cis-4.1-networking-deny-public-ssh-acl-rules/test/aws-cis-4.1-networking-deny-public-ssh-acl-rules

Common Reference in Module

PROTIP: Because this policy is referenced by all and needs updating, it’s best that it be referenced as a module.

Variables

PROTIP: Avoid hard-coding tag values. Within policy files, use a variable defined by the sentinel.hcl file.

Imports

Standard (built-in) imports available to all Sentinel policies include:

  • time
  • types
  • units
  • version

Standard imports also come from Hashicorp products: Terraform, Consul, Nomad, Vault

Server Type Limits

QUESTION: Why does a central authority have to limit the VM types which can be requested?


Sentinel CLI install

Instead of following their DOCS, this procedure enables you to skip messing with PATH:

  1. Check the size and download counts (at time of writing):

    brew info sentinel

    CAUTION: Note that, at time of this writing, Sentinel is not even at version 1.0.

    sentinel: 0.18.4
    https://docs.hashicorp.com/sentinel
    /usr/local/Caskroom/sentinel/0.18.4 (21.7MB)
    From: https://github.com/Homebrew/homebrew-cask/blob/HEAD/Casks/sentinel.rb
    ==> Name
    Sentinel
    ==> Description
    Language and framework for policy as code
    ==> Artifacts
    sentinel (Binary)
    ==> Analytics
    install: 27 (30 days), 93 (90 days), 460 (365 days)
    
  2. Install from any folder:

    brew install sentinel
  3. Verify install by getting menu:

    sentinel
    Usage: sentinel [--version] [--help] <command> [<args>]
     
    Available commands are:
     apply      Execute a policy and output the result
     fmt        Format Sentinel policy to a canonical format
     test       Test policies
     version    Prints the Sentinel runtime version
    

    VIDEO: A Deep Dive into Sentinel: HashiCorp’s Policy as Code Framework by Nic Jackson, Developer Advocate at HashiCorp

    VSCode editor HCL add-on

    If you use VSCode (Visual Studio Code), an add-on provides syntax highlighting, code completion, and more when working with Terraform HCL and Sentinel code:

  4. Install the VSCode add-on (announced June 2020) which leverages a Terraform Language Server (terraform-ls) running locally.

    https://marketplace.visualstudio.com/items?itemName=HashiCorp.terraform

  5. Click “Install” on the website.
  6. Within VSCode, click “Install” if that appears.
  7. Click the reload button next to the extension.
  8. Open your desired workspace and/or the root folder containing your Terraform files.
  9. Modify extension configuration options: Navigate to the extension view within VS Code, select the settings cog and choose Extension settings, or alternatively, modify the .vscode/settings.json file in the root of your working directory.

  10. Update provider schemas for the terraform-ls language server:

    terraform init

    This creates a .terraform folder.

    Download Sentinel libraries

  11. Because Hashicorp changes its repos frequently, my script creates a folder to hold cloned repositories, then runs:

    git clone https://github.com/wilsonmar/tf-samples.git --depth 1
    cd tf-samples
    

    Hashicorp provides a small sample GUI app that displays pictures of cute cats:

    • https://github.com/hashicorp/hashicat-aws
    • https://github.com/hashicorp/hashicat-azure

  12. Clone several sample sentinel and policy files:

    Terraform Foundational Policies Library at https://github.com/hashicorp/terraform-foundational-policies-library

    https://www.terraform.io/docs/cloud/sentinel/examples.html = Example TFE policies in Hashicorp’s GitHub include cost and policy sub-commands.

    VIDEO: Limit the number of server instances requested in tf files.

    There are also example policy files at:

    • https://github.com/PacktPublishing/HashiCorp-Infrastructure-Automation-Certification-Guide/tree/master/chapter10/terraform-sentinel from a book viewed on OReilly.

    The above sample is adapted from another user.

    Policy File Naming Conventions

    Source    | Cloud Provider | reference | sorting | service | limit
    hashi     | aws            | cis       | 4.1     | network | -
    hashidocs | aws            | svcs      | 001     | tags    | mandatory
    hashidocs | any            | sample    | 001     | memory  | limit_1GB

    NOTE: Dashes are used to separate items and underlines are used to separate words within an item.

    The aws-functions module at https://github.com/hashicorp/terraform-guides/blob/master/governance/third-generation/aws/aws-functions/aws-functions.sentinel defines these functions:

    1. find_resources_with_standard_tags
    2. determine_role_arn
    3. get_assumed_roles
    4. validate_assumed_roles_with_list
    5. validate_assumed_roles_with_map
    6. filter_providers_by_regions
    7. validate_provider_in_allowed_regions

    https://www.terraform.io/docs/cloud/sentinel/manage-policies.html#the-sentinel-hcl-configuration-file

“For many organizations, part of this challenge is magnified by policies being spread out across the organization. Separate sets of policies existing in different physical and logical locations only create more barriers to enforce the procedures that matter. Policy gaps are often introduced due to a change in business logic or technology changes. If your development team updates authentication schemes, or operational teams make changes to the architecture of the organization’s environment, you are likely going to end up with a policy gap.” *

NOTE: There is a 1:1 match between state and config.

Imports and Modules

VIDEO: At the top of some Sentinel files are import statements:

import "tfrun"
import "tfplan"
 
workspace_name = tfrun.workspace.name
desired_instance_type = "t2.micro"
 
print("Checking that ", workspace_name, " is using desired_instance_type ", desired_instance_type )

https://www.hashicorp.com/blog/terraform-sentinel-v2-imports-now-in-technology-preview

VIDEO: Terraform Enterprise: Understanding Workspaces and Modules by Teddy.

Sentinel Modules allow Sentinel functions and rules to be defined in one file and used by Sentinel policies in other files. In other words, load in Sentinel code as an import.

The blog post from March 2020 by vancluever Introducing Sentinel v2 imports: https://discuss.hashicorp.com/t/sentinel-v0-15-0-introducing-modules/6579

Mocks in Sentinel HCL

Mocks are used during testing to simulate scenarios where a policy would pass or fail. See https://docs.hashicorp.com/sentinel/writing/testing#mocking

Mocking can be done by setting various parts of the configuration file which set static globals and imports.

One way to mock is to shift parameter values to reference globals, simulating an environment where variables are already defined instead of being retrieved for real. Example: https://docs.hashicorp.com/sentinel/writing/testing#mocking-globals

global "hour" {
  value = 14
}

Mocks can also be done by using an import.

https://github.com/hashicorp/terraform-guides/tree/master/governance/third-generation/aws/mocks

mock "time" {
  data = {
    now = {
      weekday_name = "Monday"
      hour         = 14
    }
  }
}
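
The idea behind both kinds of mock is the same: the rule reads its inputs from data the test supplies, so pass and fail cases can be simulated at will. Here is a Python sketch of that idea (not Sentinel's test framework). `weekday_rule` is a hypothetical rule, and `main_rule` mirrors the sample hour rule shown earlier.

```python
def main_rule(hour: int) -> bool:
    """Python mirror of the sample rule: true only when 0 <= hour < 2."""
    return 0 <= hour < 2

def weekday_rule(time_data: dict) -> bool:
    """Hypothetical rule: pass on any day except Friday."""
    return time_data["now"]["weekday_name"] != "Friday"

# Static stand-ins for values a real run would look up.
mock_global_hour = 14
mock_time = {"now": {"weekday_name": "Monday", "hour": 14}}

print(main_rule(mock_global_hour))  # False: 14 is outside the allowed window
print(weekday_rule(mock_time))      # True: Monday is allowed
```

Changing a mock value (for example, hour to 1 or weekday_name to "Friday") flips the result without touching the rule itself.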

Workspaces

Each Workspace consists of:

  • A Terraform configuration
  • Values for variables used by the configuration
  • Persistent stored state for the resources managed by the config
  • Historical state and run logs

VIDEO: Terraform Workflow at Scale: Best Practices

There is one set of each for each environment:

  • Dev
  • QA
  • Stage (green)
  • Prod (blue)

Policy Sets

Policy sets are groups of policies that can be enforced on workspaces.

A policy set can be enforced on designated workspaces or on all workspaces in the organization.
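
The scoping rule can be sketched in Python as a lookup over policy sets. The dictionary shape, the `global` flag, and the set and workspace names are all hypothetical, not the TFE data model.

```python
def policies_for(workspace: str, policy_sets: list) -> list:
    """Collect the policies from every set that applies to one workspace."""
    names = []
    for ps in policy_sets:
        if ps["global"] or workspace in ps["workspaces"]:
            names.extend(ps["policies"])
    return names

policy_sets = [
    {"name": "org-wide", "global": True, "workspaces": [],
     "policies": ["enforce-mandatory-tags"]},
    {"name": "prod-only", "global": False, "workspaces": ["prod"],
     "policies": ["no-prod-updates-on-fridays"]},
]
print(policies_for("dev", policy_sets))
print(policies_for("prod", policy_sets))
```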

RBAC

RBAC Roles to Subject SME

Each team gets a sample Workspace to manage certain resources (configs), with different RBAC permissions for each team/workspace:

SME Team / Workspace | AWS services               | Individuals
Networks             | WAF, VPC, subnets, ALB/ELB | Jane Doe
Security             | SG, IAM                    | John Doe
DBAs                 | RDS, DynamoDB              | Taylor Swift
Back-end SRE         | K8S, ASGs                  | JLo
Front-end SRE        | S3, EC2, Redis             | Dolly Parton
Front-end Dev        | Lambda, DNS name           | Billie Eilish
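
A mapping like this table is what makes RBAC-based violations detectable: a workspace that touches a service owned by another team can be flagged. Below is a Python sketch using three rows of the table. The function name is hypothetical, and this is an illustration of the idea, not how TFE enforces permissions.

```python
TEAM_SERVICES = {
    "Networks": {"WAF", "VPC", "subnets", "ALB/ELB"},
    "Security": {"SG", "IAM"},
    "Front-end SRE": {"S3", "EC2", "Redis"},
}

def rbac_violations(team: str, requested_services: list) -> list:
    """Return the requested services that fall outside the team's remit."""
    allowed = TEAM_SERVICES.get(team, set())
    return sorted(s for s in requested_services if s not in allowed)

print(rbac_violations("Networks", ["VPC", "IAM"]))      # ['IAM']
print(rbac_violations("Security", ["SG", "IAM"]))       # []
```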

As with the free “Community” version, Terraform automatically sequences the order of resources created (Network first, etc.).

SMEs (Subject Matter Experts) are needed because there are many nuances in configuring (and many ways to misconfigure) each service. “Best practices” to avoid vulnerabilities change quickly over time as services change and new vulnerabilities are discovered.

PROTIP: Begin with Terraform’s Reference implementation for a particular cloud provider (AWS).

PROTIP: Ideally, standards, Best Practices, and policy code for each service would be reviewed on a monthly basis, to review recent vulnerabilities and remediations. Video recording would enable others to catch up. Each policy reviewed would be accompanied by policy coding and a demo testing the service.

The review would be educational, so others can take over for the SME.

Module registry

Create a module registry containing reusable modules (created by experts) so consumers don’t need to know how each module works. Consumers would only need to supply a DNS name.

VIDEO: Getting Started with Terraform Enterprise

VIDEO: Terraform Cloud and Terraform Enterprise 101

Paste code such as this main.tf to provision, using defined Terraform Variables key/values:

    module "s3-webapp" {
        # resource
        source = "app.terraform.io/Hashicorp-Sam/web-app-container/aws" 
        version = "2.2.1"
        name = "${var.prefix}-app"
        port = "80"
        https_only = "false"
        region = var.region
        container_type = "docker"
        container_image = "myprojectx/myapp"
        # insert other variables here ...
    }

Terraform Enterprise also maintains (in encrypted form) Environment Variables for the cloud provider: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN.

Azure variables are instead: ARM_SUBSCRIPTION_ID, ARM_CLIENT_ID, ARM_CLIENT_SECRET, ARM_TENANT_ID.

[23:55] State files are automatically maintained by Terraform Enterprise.

[14:08] Within each Terraform Enterprise workspace are cloud provider credentials and TF_WARN_OUTPUT_ERRORS, CONFIRM_DESTROY.

[20:32] Cost restriction policies, such as: tfe-cost-policy

There are also policies to restrict the list of VM sizes that can be used.
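
A VM-size restriction is an allow-list check over the planned instances. Here is a minimal Python sketch (not Sentinel). The allow-list contents and the resource addresses are hypothetical.

```python
ALLOWED_TYPES = {"t2.micro", "t3.small"}  # hypothetical allow-list

def disallowed_instances(planned: dict) -> dict:
    """Return {resource address: instance type} for types not on the allow-list."""
    return {addr: vm_type for addr, vm_type in planned.items()
            if vm_type not in ALLOWED_TYPES}

planned = {
    "aws_instance.web": "t2.micro",
    "aws_instance.batch": "m5.4xlarge",
}
print(disallowed_instances(planned))  # {'aws_instance.batch': 'm5.4xlarge'}
```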

Define notifications.

Chain workspaces (dependencies between teams).

Change in GitHub auto-triggers runs.

Confirm deploy manually?

    tfe-confirm-manual

TFE Server Architecture

This data flow diagram shows TFE’s internal technology choices:

tfe-data-flow-2400x1288

Application Layer:

  • TFE Core - A Rails application at the center of Terraform Enterprise; consists of web frontends and background workers

  • TFE Services - A set of Go services that provide various pieces of key functionality for Terraform Enterprise

  • Terraform Workers - A fleet of isolated execution environments that perform Terraform Runs on behalf of Terraform Enterprise users

Coordination Layer:

  • Redis - Used for Rails caching and coordination between TFE Core’s web and background workers

  • RabbitMQ - Used for Terraform Worker job coordination

Storage Layer:

  • PostgreSQL Database - Serves as the primary store of Terraform Enterprise’s application data such as workspace settings and user settings

  • Blob Storage - Used for storage of Terraform state files, plan files, configuration, and output logs

  • HashiCorp Vault - Used for encryption of sensitive data. There are two types of Vault data in Terraform Enterprise - key material and storage backend data.

  • Configuration Data - The information provided and/or generated at install-time (e.g. database credentials, hostname, etc.)

https://www.terraform.io/docs/cloud/run/index.html


Sentinel coding

The types of Sentinel policies correspond to the three Sentinel imports:

  • tfplan - to restrict attributes of specific resources or data sources.

  • tfconfig - to restrict the configuration of Terraform modules, variables, resources, data sources, providers, provisioners, and outputs.

  • tfstate - to check whether previously provisioned resources, data sources, or outputs have attribute values that are no longer allowed by governance policies.

  • other imports to allow Sentinel policies to inspect workspace metadata attributes and cost estimates.

https://www.hashicorp.com/resources/writing-and-testing-sentinel-policies-for-terraform

Guy Barros defined several repos which set up Demostack workspaces in new TFE Organizations:

Execute plans against your Terraform code, then test the Sentinel policies against the generated plans:

  • https://github.com/GuyBarros/terraform-TFE-Demostack/blob/master/scripts/config_and_run_tf.sh

  • https://github.com/GuyBarros/terraform-aws-demostack
  • https://github.com/GuyBarros/terraform-azurerm-demostack

Run Sentinel files

Run the Sentinel Simulator with mocks generated from Terraform plans.

Alternately, trigger runs against workspaces that use that Terraform code:

  • Manually run locally using TFE CLI
  • Terraform UI with the remote backend
  • Call the Terraform API
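
For the API option, a run is requested with an HTTP POST. The Python sketch below only builds the request, so it can be inspected without credentials. The payload shape follows my understanding of the Runs API (a JSON:API document naming the workspace), so check it against the current API documentation. The token and workspace ID are placeholders.

```python
import json
import urllib.request

def build_run_request(hostname: str, token: str, workspace_id: str,
                      message: str) -> urllib.request.Request:
    """Build (but do not send) a POST that asks for a new run on a workspace."""
    payload = {
        "data": {
            "type": "runs",
            "attributes": {"message": message},
            "relationships": {
                "workspace": {"data": {"type": "workspaces", "id": workspace_id}}
            },
        }
    }
    return urllib.request.Request(
        url=f"https://{hostname}/api/v2/runs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/vnd.api+json",
        },
        method="POST",
    )

req = build_run_request("app.terraform.io", "TOKEN-PLACEHOLDER",
                        "ws-EXAMPLE", "Triggered via API")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```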

TFE-CLI

https://github.com/rgreinho/tfe-cli

The tfh program in https://github.com/hashicorp-community/tf-helper provides commands for performing operations relating to HashiCorp Terraform. The operations include interacting with Terraform Enterprise (TFE) and also reporting on and manipulating other Terraform artifacts. The scripts are not necessary to use Terraform Enterprise’s core workflows, but they offer a convenient interface for manual actions on the command line.

Its tfh pushconfig and tfh pushvars commands replace and extend the functionality of the deprecated terraform push command. Use it to upload configurations, start runs, and change and retrieve variables using the new Terraform Enterprise API at https://www.terraform.io/docs/enterprise/api/index.html


Previous videos

VIDEO: How to Transition from Terraform OSS to Enterprise

VIDEO: TFE Introduction and Workplace Setup from 2015

https://www.youtube.com/watch?v=tUMe7EsXYBQ&list=RDCMUC-AdvAxaagE9W2f0webyNUQ&index=16

https://learn.hashicorp.com/

VIDEO: HashiCorp Vault Enterprise and Open Source High-Availability Demo


Ensure follow-up with ServiceNow

AWS offers the AWS Security Hub service which publishes security alerts from other services across regions.

AWS Security Hub presents security alerts as a web page for users to manually review.

The cumbersome aspect here is that a central SOC team must individually analyze each alert, then notify the appropriate individuals to request remediation action.

AWS Security Hub also sends emails of crucial and high alerts to designated individuals, but emails often are not read.

So companies that have ServiceNow make use of the two-way automatic integration between ServiceNow and AWS Security Hub.

More: https://aws.amazon.com/blogs/mt/tag/servicenow/

Others

VIDEO: Provision to Production with Terraform Enterprise

VIDEO: Dependency injection for IaC by @rosemarywang https://github.com/joatmon08/manning-book/tree/main/live

https://manning.com/books/essential-infrastructure-as-code WATCHWANG40

https://learn.hashicorp.com/collections/terraform/policy

https://github.com/gruberdev/tf-free

https://www.youtube.com/watch?v=pWzkIX8WOBw Policy as Code with Terraform and Sentinel Azure DevOps

https://www.youtube.com/watch?v=_TtJ2Aco1gc Policy as Code with Terraform & Sentinel (May 8, 2020)