Archive Type

This document describes the design of the archive resource type for downloading and extracting archives.

Overview

The archive resource manages remote archives in three phases:

  • Download: Fetch archive from a URL to local filesystem
  • Extract: Unpack archive contents to a target directory
  • Cleanup: Optionally remove the archive file after extraction

Each phase runs only when the current state and configuration require it.

Provider Interface

Archive providers must implement the ArchiveProvider interface:

type ArchiveProvider interface {
    model.Provider

    Download(ctx context.Context, properties *model.ArchiveResourceProperties, log model.Logger) error
    Extract(ctx context.Context, properties *model.ArchiveResourceProperties, log model.Logger) error
    Status(ctx context.Context, properties *model.ArchiveResourceProperties) (*model.ArchiveState, error)
}

Method Responsibilities

Method    Responsibility
Status    Query archive file existence, checksum, and attributes, and whether the creates marker file exists
Download  Fetch archive from URL, verify checksum, set ownership
Extract   Unpack archive contents into the extract_parent directory

Status Response

The Status method returns an ArchiveState containing:

type ArchiveState struct {
    CommonResourceState
    Metadata *ArchiveMetadata
}

type ArchiveMetadata struct {
    Name          string    // Archive file path
    Checksum      string    // SHA256 hash of archive
    ArchiveExists bool      // Whether archive file exists
    CreatesExists bool      // Whether creates marker file exists
    Owner         string    // Archive file owner
    Group         string    // Archive file group
    MTime         time.Time // Modification time
    Size          int64     // File size in bytes
    Provider      string    // Provider name (e.g., "http")
}

The Ensure field in CommonResourceState is set to:

  • present if the archive file exists
  • absent if the archive file does not exist

Available Providers

Provider  Source           Documentation
http      HTTP/HTTPS URLs  HTTP

Ensure States

Value    Description
present  Archive must be downloaded (and optionally extracted)
absent   Archive file must not exist

Supported Archive Formats

Extension      Description
.tar.gz, .tgz  Gzip-compressed tar archive
.tar           Uncompressed tar archive
.zip           ZIP archive

The URL and local file name must have matching archive type extensions.

Apply Logic

┌─────────────────────────────────────────┐
│ Get current state via Status()          │
└─────────────────┬───────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────┐
│ Is current state desired state?         │
└─────────────────┬───────────────────────┘
              Yes │         No
                  ▼         │
          ┌───────────┐     │
          │ No change │     │
          └───────────┘     │
                            ▼
              ┌─────────────────────────┐
              │ What is desired ensure? │
              └─────────────┬───────────┘
                            │
            ┌───────────────┴───────────────┐
            │ absent                        │ present
            ▼                               ▼
    ┌───────────────┐             ┌─────────────────────┐
    │ Remove archive│             │ Download needed?    │
    │ file          │             │ (checksum mismatch  │
    └───────────────┘             │  or file missing)   │
                                  └─────────┬───────────┘
                                        Yes │         No
                                            ▼         │
                                    ┌───────────┐     │
                                    │ Download  │     │
                                    └─────┬─────┘     │
                                          │           │
                                          ▼           ▼
                                  ┌─────────────────────────┐
                                  │ Extract needed?         │
                                  │ (extract_parent set AND │
                                  │  (download occurred OR  │
                                  │   creates file missing))│
                                  └─────────┬───────────────┘
                                        Yes │         No
                                            ▼         │
                                    ┌───────────┐     │
                                    │ Extract   │     │
                                    └─────┬─────┘     │
                                          │           │
                                          ▼           ▼
                                  ┌─────────────────────────┐
                                  │ Cleanup enabled?        │
                                  └─────────┬───────────────┘
                                        Yes │         No
                                            ▼         ▼
                                    ┌───────────┐ ┌───────┐
                                    │ Remove    │ │ Done  │
                                    │ archive   │ └───────┘
                                    └───────────┘
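
The flow above can be read as a pure planning function. The following is a minimal sketch, not the type's actual code: Props, State, Action, and planActions are hypothetical names, and the function assumes the caller has already found the resource not to be in its desired state. It follows the diagram literally, so a missing creates marker (including an unset creates property) counts as "creates file missing".

```go
package main

// Action names one step of the apply flow (illustrative names).
type Action string

const (
	ActionRemove   Action = "remove"
	ActionDownload Action = "download"
	ActionExtract  Action = "extract"
	ActionCleanup  Action = "cleanup"
)

// Props is a simplified stand-in for the archive resource properties.
type Props struct {
	Ensure        string // "present" or "absent"
	Checksum      string // expected SHA256, empty when unset
	ExtractParent string
	Cleanup       bool
}

// State is a simplified stand-in for what Status() reports.
type State struct {
	ArchiveExists bool
	CreatesExists bool
	Checksum      string
}

// planActions lists the steps the diagram would take, in order.
func planActions(p Props, s State) []Action {
	if p.Ensure == "absent" {
		if s.ArchiveExists {
			return []Action{ActionRemove}
		}
		return nil
	}

	var actions []Action

	// Download when the file is missing or a requested checksum differs.
	download := !s.ArchiveExists || (p.Checksum != "" && s.Checksum != p.Checksum)
	if download {
		actions = append(actions, ActionDownload)
	}

	// Extract only when extract_parent is set and either a download
	// occurred or the creates marker is missing.
	if p.ExtractParent != "" && (download || !s.CreatesExists) {
		actions = append(actions, ActionExtract)
	}

	// Cleanup removes the archive at the end when enabled.
	if p.Cleanup {
		actions = append(actions, ActionCleanup)
	}
	return actions
}
```

For a fresh host with extract_parent and cleanup set, this yields download, extract, cleanup in that order.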

Idempotency

The archive resource uses multiple checks for idempotency:

State Checks (in order)

  1. Ensure absent: Archive file must not exist
  2. Creates file: If creates is set, the marker file must exist
  3. Archive existence: If cleanup: false, archive must exist
  4. Owner/Group: Archive file attributes must match
  5. Checksum: If specified, archive checksum must match

Decision Table

Condition                         Stable?
ensure: absent + archive missing  Yes
ensure: absent + archive exists   No (remove)
creates file exists               Yes (skip all)
creates file missing              No (extract needed)
cleanup: false + archive missing  No (download needed)
Archive checksum mismatch         No (re-download needed)
Archive owner/group mismatch      No (re-download needed)

Creates Property

The creates property provides idempotency for extraction:

- archive:
    - /tmp/app.tar.gz:
        url: https://example.com/app.tar.gz
        extract_parent: /opt/app
        creates: /opt/app/bin/app
        owner: root
        group: root

Behavior:

  • If /opt/app/bin/app exists, skip download and extraction
  • Useful when extracted files indicate successful prior extraction
  • Prevents re-extraction on every run

Cleanup Property

The cleanup property removes the archive after extraction:

- archive:
    - /tmp/app.tar.gz:
        url: https://example.com/app.tar.gz
        extract_parent: /opt/app
        creates: /opt/app/bin/app
        cleanup: true
        owner: root
        group: root

Requirements:

  • extract_parent must be set (cleanup only makes sense with extraction)
  • creates must be set to track extraction state

Behavior:

  • After successful extraction, remove the archive file
  • On subsequent runs, creates file prevents re-download

Checksum Verification

When checksum is specified:

- archive:
    - /tmp/app.tar.gz:
        url: https://example.com/app.tar.gz
        checksum: "a1b2c3d4..."
        owner: root
        group: root

Behavior:

  • Downloaded file is verified against SHA256 checksum
  • Existing file checksum is compared to detect changes
  • Checksum mismatch triggers re-download
  • Download fails if fetched content doesn’t match
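
Comparing an existing file against the configured checksum might look like the sketch below. It uses only the Go standard library; sha256File and checksumMatches are hypothetical stand-ins, not the project's util helpers, and treating an empty checksum as "nothing to verify" is an assumption.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"io"
	"os"
)

// sha256File streams the file through SHA-256 and returns the
// lowercase hex digest.
func sha256File(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// checksumMatches reports whether the file at path has the wanted
// digest. An empty want means no checksum was configured, so there
// is nothing to compare and the file is not read.
func checksumMatches(path, want string) (bool, error) {
	if want == "" {
		return true, nil
	}
	got, err := sha256File(path)
	if err != nil {
		return false, err
	}
	return got == want, nil
}
```

A false result from checksumMatches corresponds to the "checksum mismatch triggers re-download" case above.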

Authentication

Archives support two authentication methods:

Basic Authentication

- archive:
    - /tmp/app.tar.gz:
        url: https://private.example.com/app.tar.gz
        username: deploy
        password: "{{ lookup('data.password') }}"
        owner: root
        group: root

Custom Headers

- archive:
    - /tmp/app.tar.gz:
        url: https://api.example.com/releases/app.tar.gz
        headers:
          Authorization: "Bearer {{ lookup('data.token') }}"
        owner: root
        group: root

Required Properties

Property  Required  Description
url       Yes       Source URL for download
owner     Yes       Username that owns the archive file
group     Yes       Group that owns the archive file

URL Validation

URLs are validated during resource creation:

  • Must be valid URL format
  • Scheme must be http or https
  • Path must end with supported archive extension
  • Extension must match the name property extension
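
These rules can be sketched with the standard net/url package. validateArchiveURL and archiveExt are hypothetical names, and two details are assumptions rather than documented behavior: a URL without a host is rejected, and extensions are compared literally, so .tgz and .tar.gz do not match each other here.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// supportedExts lists the archive extensions named in this document.
var supportedExts = []string{".tar.gz", ".tgz", ".tar", ".zip"}

// archiveExt returns the supported extension that p ends with, or "".
func archiveExt(p string) string {
	for _, ext := range supportedExts {
		if strings.HasSuffix(p, ext) {
			return ext
		}
	}
	return ""
}

// validateArchiveURL applies the four rules above to a URL and the
// resource name (the local archive path).
func validateArchiveURL(rawURL, name string) error {
	u, err := url.Parse(rawURL)
	if err != nil {
		return fmt.Errorf("invalid url: %w", err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return fmt.Errorf("unsupported url scheme %q", u.Scheme)
	}
	if u.Host == "" {
		return fmt.Errorf("url %q has no host", rawURL)
	}
	ext := archiveExt(u.Path)
	if ext == "" {
		return fmt.Errorf("url path %q has no supported archive extension", u.Path)
	}
	if archiveExt(name) != ext {
		return fmt.Errorf("name %q does not match url extension %q", name, ext)
	}
	return nil
}
```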

Noop Mode

In noop mode, the archive type:

  1. Queries current state normally
  2. Computes what actions would be taken
  3. Sets appropriate NoopMessage:
    • “Would have downloaded”
    • “Would have extracted”
    • “Would have cleaned up”
    • “Would have removed”
  4. Reports Changed: true if changes would occur
  5. Does not call provider Download/Extract methods
  6. Does not remove files

Multiple actions are joined with “. ” (e.g., “Would have downloaded. Would have extracted”).
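
A minimal sketch of how such a message could be assembled. noopMessage and its boolean parameters are hypothetical; only the message strings and the separator come from the description above.

```go
package main

import "strings"

// noopMessage builds the combined noop message from the actions that
// would have run, joined with ". ".
func noopMessage(download, extract, cleanup, remove bool) string {
	var parts []string
	if download {
		parts = append(parts, "Would have downloaded")
	}
	if extract {
		parts = append(parts, "Would have extracted")
	}
	if cleanup {
		parts = append(parts, "Would have cleaned up")
	}
	if remove {
		parts = append(parts, "Would have removed")
	}
	return strings.Join(parts, ". ")
}
```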

Desired State Validation

After applying changes (in non-noop mode), the type verifies the archive reached the desired state by calling Status() again and checking all conditions. If validation fails, ErrDesiredStateFailed is returned.
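
The apply-then-verify pattern can be sketched as follows. This is illustrative only: applyAndVerify and its callback parameters are hypothetical, and the error value defined here is a local stand-in with an invented message, not the project's ErrDesiredStateFailed.

```go
package main

import "errors"

// ErrDesiredStateFailed is a local stand-in for the project's error.
var ErrDesiredStateFailed = errors.New("desired state not reached")

// applyAndVerify runs the change, then re-checks state. The stable
// callback stands in for calling Status() again and evaluating all
// desired-state conditions.
func applyAndVerify(apply func() error, stable func() (bool, error)) error {
	if err := apply(); err != nil {
		return err
	}
	ok, err := stable()
	if err != nil {
		return err
	}
	if !ok {
		return ErrDesiredStateFailed
	}
	return nil
}
```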

HTTP Provider

This document describes the implementation details of the HTTP archive provider for downloading and extracting archives from HTTP/HTTPS URLs.

Provider Selection

The HTTP provider is selected when:

  1. The URL scheme is http or https
  2. The archive file extension is supported (.tar.gz, .tgz, .tar, .zip)
  3. The required extraction tool (tar or unzip) is available in PATH

The IsManageable() function checks these conditions and returns a priority of 1 if all are met.
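
The three conditions can be sketched as below. The real IsManageable() signature is not shown in this document, so isManageable, toolFor, and inPath are hypothetical; the tool lookup is injected as a function so the check can be exercised without depending on the host, and returning priority 0 for an unmanageable resource is an assumption.

```go
package main

import (
	"net/url"
	"os/exec"
	"strings"
)

// toolFor returns the extraction tool for a supported archive name,
// or "" when the extension is not supported.
func toolFor(name string) string {
	switch {
	case strings.HasSuffix(name, ".tar.gz"),
		strings.HasSuffix(name, ".tgz"),
		strings.HasSuffix(name, ".tar"):
		return "tar"
	case strings.HasSuffix(name, ".zip"):
		return "unzip"
	}
	return ""
}

// isManageable checks scheme, extension, and tool availability and
// returns priority 1 when all three hold.
func isManageable(rawURL, name string, hasTool func(tool string) bool) (bool, int) {
	u, err := url.Parse(rawURL)
	if err != nil || (u.Scheme != "http" && u.Scheme != "https") {
		return false, 0
	}
	tool := toolFor(name)
	if tool == "" || !hasTool(tool) {
		return false, 0
	}
	return true, 1
}

// inPath reports whether the tool can be found in PATH.
func inPath(tool string) bool {
	_, err := exec.LookPath(tool)
	return err == nil
}
```

On a real host the check would be called as isManageable(url, name, inPath).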

Operations

Download

Process:

  1. Parse the URL and add Basic Auth credentials if username/password provided
  2. Create HTTP request with custom headers (if specified)
  3. Execute GET request via util.HttpGetResponse()
  4. Verify HTTP 200 status code
  5. Create temporary file in the same directory as the target
  6. Set ownership on temp file before writing content
  7. Copy response body to temp file
  8. Verify checksum if provided
  9. Atomic rename temp file to target path

Atomic Write Pattern:

[parent dir]/archive-name-* (temp file)
    ↓ set owner/group
    ↓ write content
    ↓ verify checksum
    ↓ rename
[parent dir]/archive-name (final file)

The temp file is created in the same directory as the target to ensure os.Rename() is atomic (same filesystem).
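
A minimal standard-library sketch of this pattern, assuming an io.Reader in place of the HTTP response body. atomicWrite, its chown hook, and the writeHello helper are hypothetical names. Unlike the provider, which hashes the temp file after writing, this sketch hashes the bytes while copying; the outcome is the same, a mismatch never reaches the target path.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// atomicWrite streams src into a temp file next to target, verifies
// the SHA-256 digest when wantSum is set, then renames into place.
// chown is an optional hook for setting ownership before writing.
func atomicWrite(target string, src io.Reader, wantSum string, chown func(path string) error) error {
	tmp, err := os.CreateTemp(filepath.Dir(target), filepath.Base(target)+"-*")
	if err != nil {
		return err
	}
	// After a successful rename the temp path is gone and this does nothing.
	defer os.Remove(tmp.Name())
	defer tmp.Close()

	if chown != nil {
		if err := chown(tmp.Name()); err != nil {
			return err
		}
	}

	h := sha256.New()
	if _, err := io.Copy(io.MultiWriter(tmp, h), src); err != nil {
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	if sum := hex.EncodeToString(h.Sum(nil)); wantSum != "" && sum != wantSum {
		return fmt.Errorf("checksum mismatch, expected %q got %q", wantSum, sum)
	}
	return os.Rename(tmp.Name(), target)
}

// writeHello exercises atomicWrite in a throwaway directory and
// returns the content that landed at the target path.
func writeHello(wantSum string) (string, error) {
	dir, err := os.MkdirTemp("", "archive-sketch")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, "app.tar.gz")
	if err := atomicWrite(target, strings.NewReader("hello"), wantSum, nil); err != nil {
		return "", err
	}
	data, err := os.ReadFile(target)
	return string(data), err
}
```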

Error Handling:

Condition          Behavior
HTTP non-200       Return error with status code
Write failure      Clean up temp file, return error
Checksum mismatch  Clean up temp file, return error with expected vs actual
Rename failure     Temp file cleaned up by defer

Authentication:

Method                        Implementation
Basic Auth                    URL userinfo is passed to HttpGetResponse(), which sets the Authorization header
Username/Password properties  Embedded in URL before request: url.UserPassword(username, password)
Custom Headers                Added to request via http.Header.Add()
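
How these pieces could fit together with the standard library is sketched below. buildRequest is a hypothetical helper that folds in what the document attributes to util.HttpGetResponse(); the internals of that function are not shown here, so treat the ordering as an assumption.

```go
package main

import (
	"net/http"
	"net/url"
)

// buildRequest embeds username/password into the URL, adds custom
// headers, and sets the Basic Auth header from the URL userinfo.
func buildRequest(rawURL, username, password string, headers map[string]string) (*http.Request, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	if username != "" {
		u.User = url.UserPassword(username, password)
	}

	req, err := http.NewRequest(http.MethodGet, u.String(), nil)
	if err != nil {
		return nil, err
	}
	for k, v := range headers {
		req.Header.Add(k, v)
	}

	// Let net/http construct the Authorization header.
	if req.URL.User != nil {
		pw, _ := req.URL.User.Password()
		req.SetBasicAuth(req.URL.User.Username(), pw)
	}
	return req, nil
}
```

Note that in this sketch SetBasicAuth overwrites a custom Authorization header when both are supplied; how the provider resolves that conflict is not stated.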

Extract

Process:

  1. Validate ExtractParent is set
  2. Create ExtractParent directory if it doesn’t exist (mode 0755)
  3. Determine archive type from file extension
  4. Execute appropriate extraction command

Extraction Commands:

Extension      Command
.tar.gz, .tgz  tar -xzf <archive> -C <extract_parent>
.tar           tar -xf <archive> -C <extract_parent>
.zip           unzip -d <extract_parent> <archive>

Command Execution:

Commands are executed via model.CommandRunner.ExecuteWithOptions() with:

Option   Value
Command  tar or unzip
Args     Extraction flags and paths
Cwd      ExtractParent directory
Timeout  1 minute

Error Handling:

Condition              Behavior
Unsupported extension  Return “archive type not supported” error
Command not found      Runner returns error
Non-zero exit code     Return error with exit code and stderr
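
The extract steps can be sketched with os/exec standing in for model.CommandRunner, whose API is not shown in this document. extractCommand and runExtract are hypothetical names; the commands, working directory, directory mode, and one-minute timeout follow the tables above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"
	"os/exec"
	"strings"
	"time"
)

// extractCommand maps the archive extension to a command and its
// arguments.
func extractCommand(archive, extractParent string) (string, []string, error) {
	switch {
	case strings.HasSuffix(archive, ".tar.gz"), strings.HasSuffix(archive, ".tgz"):
		return "tar", []string{"-xzf", archive, "-C", extractParent}, nil
	case strings.HasSuffix(archive, ".tar"):
		return "tar", []string{"-xf", archive, "-C", extractParent}, nil
	case strings.HasSuffix(archive, ".zip"):
		return "unzip", []string{"-d", extractParent, archive}, nil
	}
	return "", nil, errors.New("archive type not supported")
}

// runExtract validates the target, creates it if needed, and runs
// the extraction command there with a one-minute timeout.
func runExtract(ctx context.Context, archive, extractParent string) error {
	if extractParent == "" {
		return errors.New("extract parent is not set")
	}
	if err := os.MkdirAll(extractParent, 0o755); err != nil {
		return err
	}
	name, args, err := extractCommand(archive, extractParent)
	if err != nil {
		return err
	}

	ctx, cancel := context.WithTimeout(ctx, time.Minute)
	defer cancel()

	cmd := exec.CommandContext(ctx, name, args...)
	cmd.Dir = extractParent
	if out, err := cmd.CombinedOutput(); err != nil {
		return fmt.Errorf("%s failed: %w: %s", name, err, out)
	}
	return nil
}
```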

Status

Process:

  1. Initialize state with EnsureAbsent default
  2. Check if archive file exists via os.Stat()
  3. If exists: set EnsurePresent, populate metadata (size, mtime, owner, group, checksum)
  4. If Creates property set: check if creates file exists

Metadata Collected:

Field          Source
Name           From properties
Provider       “http”
ArchiveExists  os.Stat() success
Size           FileInfo.Size()
MTime          FileInfo.ModTime()
Owner          util.GetFileOwner() - resolves UID to username
Group          util.GetFileOwner() - resolves GID to group name
Checksum       util.Sha256HashFile()
CreatesExists  os.Stat() on Creates path
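
A portable sketch of the status process, limited to the fields that os.Stat() provides directly. archiveMeta, archiveState, and status are simplified hypothetical stand-ins for the model types; owner, group, and checksum collection are left out because they rely on the project's util helpers. This sketch also treats every stat error as "does not exist", which the real provider may not do.

```go
package main

import (
	"os"
	"time"
)

// archiveMeta holds the subset of metadata gathered in this sketch.
type archiveMeta struct {
	Name          string
	Provider      string
	ArchiveExists bool
	CreatesExists bool
	Size          int64
	MTime         time.Time
}

// archiveState pairs the ensure value with the metadata.
type archiveState struct {
	Ensure   string
	Metadata archiveMeta
}

// status starts from "absent" and upgrades to "present" when the
// archive file can be stat'ed.
func status(name, creates string) archiveState {
	st := archiveState{
		Ensure:   "absent",
		Metadata: archiveMeta{Name: name, Provider: "http"},
	}
	if info, err := os.Stat(name); err == nil {
		st.Ensure = "present"
		st.Metadata.ArchiveExists = true
		st.Metadata.Size = info.Size()
		st.Metadata.MTime = info.ModTime()
	}
	// The creates marker is only checked when the property is set.
	if creates != "" {
		if _, err := os.Stat(creates); err == nil {
			st.Metadata.CreatesExists = true
		}
	}
	return st
}
```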

Idempotency

The provider supports idempotency through the type’s isDesiredState() function:

State Checks (in order)

  1. Ensure Absent: If ensure: absent, archive must not exist
  2. Creates File: If Creates set and file doesn’t exist → not stable
  3. Archive Existence: If cleanup: false, archive must exist
  4. Owner/Group: Must match properties
  5. Checksum: If specified in properties, must match

Note: When cleanup: true, the creates property is required (enforced at validation time).

Decision Matrix

Archive Exists  Creates Exists  Checksum Match  Cleanup  Stable?
No              No              N/A             false    No (download needed)
Yes             No              Yes             false    No (extract needed)
Yes             Yes             Yes             false    Yes
No              Yes             N/A             true     Yes
Yes             Yes             Yes             true     No (cleanup needed)
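
The checks and the matrix can be combined into one predicate. This is a sketch under stated assumptions, not the type's isDesiredState() itself: Props and State are simplified stand-ins, and the rule that a leftover archive is unstable when cleanup is true is inferred from the last matrix row rather than from the ordered check list.

```go
package main

// Props is a simplified stand-in for the resource properties.
type Props struct {
	Ensure   string // "present" or "absent"
	Creates  string
	Cleanup  bool
	Owner    string
	Group    string
	Checksum string
}

// State is a simplified stand-in for the Status() result.
type State struct {
	ArchiveExists bool
	CreatesExists bool
	Owner         string
	Group         string
	Checksum      string
}

// isDesiredState applies the state checks in the documented order.
func isDesiredState(p Props, s State) bool {
	// 1. Ensure absent: the archive must not exist.
	if p.Ensure == "absent" {
		return !s.ArchiveExists
	}
	// 2. Creates file: the marker must exist when creates is set.
	if p.Creates != "" && !s.CreatesExists {
		return false
	}
	// With cleanup enabled the archive should be gone once extracted
	// (inferred from the last matrix row).
	if p.Cleanup {
		return !s.ArchiveExists
	}
	// 3. Archive existence when cleanup is false.
	if !s.ArchiveExists {
		return false
	}
	// 4. Owner and group must match.
	if s.Owner != p.Owner || s.Group != p.Group {
		return false
	}
	// 5. Checksum must match when specified.
	if p.Checksum != "" && s.Checksum != p.Checksum {
		return false
	}
	return true
}
```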

Checksum Verification

Algorithm: SHA-256

Implementation:

sum, err := util.Sha256HashFile(tempFile)
if err != nil {
    return err
}
if sum != properties.Checksum {
    return fmt.Errorf("checksum mismatch, expected %q got %q", properties.Checksum, sum)
}

Timing: Checksum is verified after download completes but before the atomic rename. This ensures:

  • Corrupted downloads are never placed at the target path
  • Temp file is cleaned up on mismatch
  • Clear error message with both expected and actual checksums

Security Considerations

Credential Handling

  • Credentials in URL are redacted in log messages via util.RedactUrlCredentials()
  • Basic Auth header is set by Go’s http.Request.SetBasicAuth(), not manually constructed

Archive Extraction

  • Extraction uses system tar/unzip commands
  • No path traversal protection beyond what the tools provide
  • ExtractParent must be an absolute path (validated in model)

Temporary Files

  • Created with os.CreateTemp() using pattern <archive-name>-*
  • Deferred removal ensures cleanup on all exit paths
  • Ownership set before content written

Platform Support

The provider is Unix-only due to:

  • Dependency on util.GetFileOwner() which uses syscall for UID/GID resolution
  • Dependency on util.ChownFile() for ownership management

Timeouts

Operation           Timeout                                Configurable
HTTP Download       1 minute (default in HttpGetResponse)  No
Archive Extraction  1 minute                               No

Neither timeout is currently configurable, so large archives may need longer timeouts in a future version.