Archive Type

This document describes the design of the archive resource type for downloading and extracting archives.

Overview

The archive resource manages remote archives in three phases:

- Download: Fetch the archive from a URL to the local filesystem
- Extract: Unpack the archive contents to a target directory
- Cleanup: Optionally remove the archive file after extraction

These phases are conditional on the current state and configuration.

Provider Interface

Archive providers must implement the ArchiveProvider interface:
type ArchiveProvider interface {
    model.Provider

    Download(ctx context.Context, properties *model.ArchiveResourceProperties, log model.Logger) error
    Extract(ctx context.Context, properties *model.ArchiveResourceProperties, log model.Logger) error
    Status(ctx context.Context, properties *model.ArchiveResourceProperties) (*model.ArchiveState, error)
}

Method Responsibilities

| Method | Responsibility |
|--------|----------------|
| Status | Query archive file existence, checksum, attributes, and creates file |
| Download | Fetch archive from URL, verify checksum, set ownership |
| Extract | Unpack archive contents to extract parent directory |
Status Response

The Status method returns an ArchiveState containing:

type ArchiveState struct {
    CommonResourceState
    Metadata *ArchiveMetadata
}

type ArchiveMetadata struct {
    Name          string    // Archive file path
    Checksum      string    // SHA256 hash of archive
    ArchiveExists bool      // Whether archive file exists
    CreatesExists bool      // Whether creates marker file exists
    Owner         string    // Archive file owner
    Group         string    // Archive file group
    MTime         time.Time // Modification time
    Size          int64     // File size in bytes
    Provider      string    // Provider name (e.g., "http")
}

The Ensure field in CommonResourceState is set to:

- present if the archive file exists
- absent if the archive file does not exist

Available Providers

| Provider | Source | Documentation |
|----------|--------|---------------|
| http | HTTP/HTTPS URLs | HTTP |

Ensure States

| Value | Description |
|-------|-------------|
| present | Archive must be downloaded (and optionally extracted) |
| absent | Archive file must not exist |

| Extension | Description |
|-----------|-------------|
| .tar.gz, .tgz | Gzip-compressed tar archive |
| .tar | Uncompressed tar archive |
| .zip | ZIP archive |

The URL and local file name must have matching archive type extensions.
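
The matching rule above can be sketched as follows. This is a minimal illustration, not the project's validation code: `archiveType` and `checkExtensions` are hypothetical helpers, and treating `.tgz` as equivalent to `.tar.gz` is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// archiveType maps a file name or URL path to an archive type label.
// Treating .tgz as an alias of .tar.gz is an assumption of this sketch.
func archiveType(path string) string {
	switch {
	case strings.HasSuffix(path, ".tar.gz"), strings.HasSuffix(path, ".tgz"):
		return "tar.gz"
	case strings.HasSuffix(path, ".tar"):
		return "tar"
	case strings.HasSuffix(path, ".zip"):
		return "zip"
	}
	return ""
}

// checkExtensions (hypothetical) verifies that the URL path and the local
// file name both carry a supported extension of the same archive type.
func checkExtensions(rawURL, name string) error {
	u, err := url.Parse(rawURL)
	if err != nil {
		return fmt.Errorf("invalid url: %w", err)
	}
	urlType, nameType := archiveType(u.Path), archiveType(name)
	if urlType == "" || nameType == "" {
		return fmt.Errorf("unsupported archive extension")
	}
	if urlType != nameType {
		return fmt.Errorf("archive type mismatch: url is %q, name is %q", urlType, nameType)
	}
	return nil
}

func main() {
	fmt.Println(checkExtensions("https://example.com/app.tar.gz", "/tmp/app.tar.gz"))
	fmt.Println(checkExtensions("https://example.com/app.zip", "/tmp/app.tar.gz"))
}
```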
Apply Logic

┌─────────────────────────────────────────┐
│ Get current state via Status()          │
└────────────────────┬────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────┐
│ Is current state desired state?         │
└────────────────────┬────────────────────┘
                     │
         ┌───────────┴───────────┐
         │ Yes                   │ No
         ▼                       ▼
   ┌───────────┐    ┌─────────────────────────┐
   │ No change │    │ What is desired ensure? │
   └───────────┘    └────────────┬────────────┘
                                 │
                 ┌───────────────┴───────────────┐
                 │ absent                        │ present
                 ▼                               ▼
      ┌─────────────────────┐         ┌─────────────────────┐
      │ Remove archive file │         │ Download needed?    │
      └─────────────────────┘         │ (checksum mismatch  │
                                      │  or file missing)   │
                                      └──────────┬──────────┘
                                                 │
                                         ┌───────┴───────┐
                                         │ Yes           │ No
                                         ▼               │
                                   ┌───────────┐         │
                                   │ Download  │         │
                                   └─────┬─────┘         │
                                         │               │
                                         ▼               ▼
                                    ┌─────────────────────────┐
                                    │ Extract needed?         │
                                    │ (extract_parent set AND │
                                    │  (download occurred OR  │
                                    │   creates file missing))│
                                    └────────────┬────────────┘
                                                 │
                                         ┌───────┴───────┐
                                         │ Yes           │ No
                                         ▼               │
                                   ┌───────────┐         │
                                   │ Extract   │         │
                                   └─────┬─────┘         │
                                         │               │
                                         ▼               ▼
                                    ┌─────────────────────────┐
                                    │ Cleanup enabled?        │
                                    └────────────┬────────────┘
                                                 │
                                         ┌───────┴───────┐
                                         │ Yes           │ No
                                         ▼               ▼
                                   ┌───────────┐     ┌───────┐
                                   │ Remove    │     │ Done  │
                                   │ archive   │     └───────┘
                                   └───────────┘

Idempotency

The archive resource uses multiple checks for idempotency:

State Checks (in order)

1. Ensure absent: Archive file must not exist
2. Creates file: If creates is set, the marker file must exist
3. Archive existence: If cleanup: false, archive must exist
4. Owner/Group: Archive file attributes must match
5. Checksum: If specified, archive checksum must match

Decision Table

| Condition | Stable? |
|-----------|---------|
| ensure: absent + archive missing | Yes |
| ensure: absent + archive exists | No (remove) |
| creates file exists | Yes (skip all) |
| creates file missing | No (extract needed) |
| cleanup: false + archive missing | No (download needed) |
| Archive checksum mismatch | No (re-download needed) |
| Archive owner/group mismatch | No (re-download needed) |

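
The five ordered checks can be read as a single predicate. The sketch below is an illustration only: `props` and `state` are simplified, hypothetical stand-ins for the model types, and comparing owner, group, and checksum only when the archive file exists is an assumption (otherwise a cleaned-up archive could never be stable).

```go
package main

import "fmt"

// props is a hypothetical, simplified stand-in for the resource properties.
type props struct {
	Ensure   string // "present" or "absent"
	Creates  string
	Cleanup  bool
	Owner    string
	Group    string
	Checksum string
}

// state is a hypothetical, simplified stand-in for the Status() result.
type state struct {
	ArchiveExists bool
	CreatesExists bool
	Owner         string
	Group         string
	Checksum      string
}

// isDesiredState applies the five state checks in the documented order.
func isDesiredState(p props, s state) bool {
	// 1. ensure: absent only requires the archive file to be gone.
	if p.Ensure == "absent" {
		return !s.ArchiveExists
	}
	// 2. If creates is set, the marker file must exist.
	if p.Creates != "" && !s.CreatesExists {
		return false
	}
	// 3. Without cleanup, the archive file itself must exist.
	if !p.Cleanup && !s.ArchiveExists {
		return false
	}
	// 4 and 5. Attributes are compared only for an existing archive
	// (an assumption of this sketch).
	if s.ArchiveExists {
		if s.Owner != p.Owner || s.Group != p.Group {
			return false
		}
		if p.Checksum != "" && s.Checksum != p.Checksum {
			return false
		}
	}
	return true
}

func main() {
	p := props{Ensure: "present", Creates: "/opt/app/bin/app", Owner: "root", Group: "root"}
	s := state{ArchiveExists: true, CreatesExists: false, Owner: "root", Group: "root"}
	fmt.Println(isDesiredState(p, s)) // false: the creates marker is missing
}
```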
Creates Property

The creates property provides idempotency for extraction:

- archive:
    - /tmp/app.tar.gz:
        url: https://example.com/app.tar.gz
        extract_parent: /opt/app
        creates: /opt/app/bin/app
        owner: root
        group: root

Behavior:

- If /opt/app/bin/app exists, skip download and extraction
- Useful when extracted files indicate successful prior extraction
- Prevents re-extraction on every run

Cleanup Property

The cleanup property removes the archive after extraction:

- archive:
    - /tmp/app.tar.gz:
        url: https://example.com/app.tar.gz
        extract_parent: /opt/app
        creates: /opt/app/bin/app
        cleanup: true
        owner: root
        group: root

Requirements:

- extract_parent must be set (cleanup only makes sense with extraction)
- creates must be set to track extraction state

Behavior:

- After successful extraction, the archive file is removed
- On subsequent runs, the creates file prevents re-download

Checksum Verification

When checksum is specified:

- archive:
    - /tmp/app.tar.gz:
        url: https://example.com/app.tar.gz
        checksum: "a1b2c3d4..."
        owner: root
        group: root

Behavior:

- The downloaded file is verified against the SHA256 checksum
- The checksum of an existing file is compared to detect changes
- A checksum mismatch triggers a re-download
- The download fails if the fetched content doesn't match

Authentication

Archives support two authentication methods:

Basic Authentication

- archive:
    - /tmp/app.tar.gz:
        url: https://private.example.com/app.tar.gz
        username: deploy
        password: "{{ lookup('data.password') }}"
        owner: root
        group: root

Custom Headers

- archive:
    - /tmp/app.tar.gz:
        url: https://api.example.com/releases/app.tar.gz
        headers:
          Authorization: "Bearer {{ lookup('data.token') }}"
        owner: root
        group: root

Required Properties

| Property | Required | Description |
|----------|----------|-------------|
| url | Yes | Source URL for download |
| owner | Yes | Username that owns the archive file |
| group | Yes | Group that owns the archive file |

URL Validation

URLs are validated during resource creation:

- Must be valid URL format
- Scheme must be http or https
- Path must end with a supported archive extension
- Extension must match the name property extension

Noop Mode

In noop mode, the archive type:

- Queries current state normally
- Computes what actions would be taken
- Sets appropriate NoopMessage:
  - "Would have downloaded"
  - "Would have extracted"
  - "Would have cleaned up"
  - "Would have removed"
- Reports Changed: true if changes would occur
- Does not call provider Download/Extract methods
- Does not remove files

Multiple actions are joined with ". " (e.g., "Would have downloaded. Would have extracted").

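
How the noop message might be assembled is sketched below. The `plan` struct and `noopMessage` function are hypothetical names used for illustration; only the message strings and the ". " separator come from the description above, and the ordering of the messages is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// plan is a hypothetical summary of the actions a real apply would take.
type plan struct {
	Remove   bool
	Download bool
	Extract  bool
	Cleanup  bool
}

// noopMessage returns the joined noop message and whether anything would change.
func noopMessage(p plan) (string, bool) {
	var msgs []string
	if p.Remove {
		msgs = append(msgs, "Would have removed")
	}
	if p.Download {
		msgs = append(msgs, "Would have downloaded")
	}
	if p.Extract {
		msgs = append(msgs, "Would have extracted")
	}
	if p.Cleanup {
		msgs = append(msgs, "Would have cleaned up")
	}
	return strings.Join(msgs, ". "), len(msgs) > 0
}

func main() {
	msg, changed := noopMessage(plan{Download: true, Extract: true})
	fmt.Println(msg, changed) // Would have downloaded. Would have extracted true
}
```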
Desired State Validation

After applying changes (in non-noop mode), the type verifies that the archive reached the desired state by calling Status() again and checking all conditions. If validation fails, ErrDesiredStateFailed is returned.

HTTP Provider

This document describes the implementation details of the HTTP archive provider for downloading and extracting archives from HTTP/HTTPS URLs.

Provider Selection

The HTTP provider is selected when:

- The URL scheme is http or https
- The archive file extension is supported (.tar.gz, .tgz, .tar, .zip)
- The required extraction tool (tar or unzip) is available in PATH

The IsManageable() function checks these conditions and returns a priority of 1 if all are met.
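
A minimal sketch of these three checks follows. The signature shown is hypothetical (the real IsManageable() signature is not given in this document), as is the `requiredTool` helper; `exec.LookPath` is used as one plausible way to test for a tool in PATH.

```go
package main

import (
	"fmt"
	"net/url"
	"os/exec"
	"strings"
)

// requiredTool (hypothetical) returns the extraction tool a file name needs.
func requiredTool(name string) (string, bool) {
	switch {
	case strings.HasSuffix(name, ".tar.gz"),
		strings.HasSuffix(name, ".tgz"),
		strings.HasSuffix(name, ".tar"):
		return "tar", true
	case strings.HasSuffix(name, ".zip"):
		return "unzip", true
	}
	return "", false
}

// isManageable is a sketch with an assumed signature. It returns whether
// the provider can handle the archive and the priority (1 when it can).
func isManageable(rawURL, name string) (bool, int) {
	u, err := url.Parse(rawURL)
	if err != nil || (u.Scheme != "http" && u.Scheme != "https") {
		return false, 0
	}
	tool, ok := requiredTool(name)
	if !ok {
		return false, 0
	}
	if _, err := exec.LookPath(tool); err != nil {
		return false, 0
	}
	return true, 1
}

func main() {
	// The result depends on whether tar is installed on this machine.
	fmt.Println(isManageable("https://example.com/app.tar.gz", "/tmp/app.tar.gz"))
}
```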

Operations

Download

Process:

1. Parse the URL and add Basic Auth credentials if username/password provided
2. Create HTTP request with custom headers (if specified)
3. Execute GET request via util.HttpGetResponse()
4. Verify HTTP 200 status code
5. Create temporary file in the same directory as the target
6. Set ownership on temp file before writing content
7. Copy response body to temp file
8. Verify checksum if provided
9. Atomic rename temp file to target path

Atomic Write Pattern:

[parent dir]/archive-name-*  (temp file)
    ↓ write content
    ↓ set owner/group
    ↓ verify checksum
    ↓ rename
[parent dir]/archive-name    (final file)

The temp file is created in the same directory as the target to ensure os.Rename() is atomic (same filesystem).

Error Handling:

| Condition | Behavior |
|-----------|----------|
| HTTP non-200 | Return error with status code |
| Write failure | Clean up temp file, return error |
| Checksum mismatch | Clean up temp file, return error with expected vs actual |
| Rename failure | Temp file cleaned up by defer |

Authentication:

| Method | Implementation |
|--------|----------------|
| Basic Auth | URL userinfo is passed to HttpGetResponse(), which sets the Authorization header |
| Username/Password properties | Embedded in URL before request: url.UserPassword(username, password) |
| Custom Headers | Added to request via http.Header.Add() |
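
The three rows above can be combined into one request builder, sketched here with net/http only. `buildRequest` is a hypothetical helper; the internals of util.HttpGetResponse() are not shown in this document, so moving the URL userinfo into the header with SetBasicAuth and then clearing it from the URL is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// buildRequest (hypothetical) embeds credentials in the URL, converts the
// URL userinfo into a Basic Auth header, and adds any custom headers.
func buildRequest(rawURL, username, password string, headers map[string]string) (*http.Request, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	if username != "" {
		u.User = url.UserPassword(username, password)
	}

	req, err := http.NewRequest(http.MethodGet, u.String(), nil)
	if err != nil {
		return nil, err
	}
	if req.URL.User != nil {
		pass, _ := req.URL.User.Password()
		req.SetBasicAuth(req.URL.User.Username(), pass)
		req.URL.User = nil // keep credentials out of the request URL
	}
	for k, v := range headers {
		req.Header.Add(k, v)
	}
	return req, nil
}

func main() {
	req, err := buildRequest("https://private.example.com/app.tar.gz", "deploy", "s3cret", nil)
	if err != nil {
		panic(err)
	}
	user, _, ok := req.BasicAuth()
	fmt.Println(req.URL.String(), user, ok)
}
```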

Extract

Process:

1. Validate ExtractParent is set
2. Create ExtractParent directory if it doesn't exist (mode 0755)
3. Determine archive type from file extension
4. Execute appropriate extraction command

Extraction Commands:

| Extension | Command |
|-----------|---------|
| .tar.gz, .tgz | tar -xzf <archive> -C <extract_parent> |
| .tar | tar -xf <archive> -C <extract_parent> |
| .zip | unzip -d <extract_parent> <archive> |

Command Execution:

Commands are executed via model.CommandRunner.ExecuteWithOptions() with:

| Option | Value |
|--------|-------|
| Command | tar or unzip |
| Args | Extraction flags and paths |
| Cwd | ExtractParent directory |
| Timeout | 1 minute |

Error Handling:

| Condition | Behavior |
|-----------|----------|
| Unsupported extension | Return "archive type not supported" error |
| Command not found | Runner returns error |
| Non-zero exit code | Return error with exit code and stderr |

Status

Process:

1. Initialize state with EnsureAbsent default
2. Check if archive file exists via os.Stat()
3. If exists: set EnsurePresent, populate metadata (size, mtime, owner, group, checksum)
4. If Creates property set: check if creates file exists

Metadata Collected:

| Field | Source |
|-------|--------|
| Name | From properties |
| Provider | "http" |
| ArchiveExists | os.Stat() success |
| Size | FileInfo.Size() |
| MTime | FileInfo.ModTime() |
| Owner | util.GetFileOwner() - resolves UID to username |
| Group | util.GetFileOwner() - resolves GID to group name |
| Checksum | util.Sha256HashFile() |
| CreatesExists | os.Stat() on Creates path |

Idempotency

The provider supports idempotency through the type's isDesiredState() function:

State Checks (in order)

1. Ensure Absent: If ensure: absent, archive must not exist
2. Creates File: If Creates set and file doesn't exist → not stable
3. Archive Existence: If cleanup: false, archive must exist
4. Owner/Group: Must match properties
5. Checksum: If specified in properties, must match

Note: When cleanup: true, the creates property is required (enforced at validation time).

Decision Matrix

| Archive Exists | Creates Exists | Checksum Match | Cleanup | Stable? |
|----------------|----------------|----------------|---------|---------|
| No | No | N/A | false | No (download needed) |
| Yes | No | Yes | false | No (extract needed) |
| Yes | Yes | Yes | false | Yes |
| No | Yes | N/A | true | Yes |
| Yes | Yes | Yes | true | No (cleanup needed) |

Checksum Verification

Algorithm: SHA-256

Implementation:

sum, err := util.Sha256HashFile(tempFile)
if err != nil {
    return err
}
if sum != properties.Checksum {
    return fmt.Errorf("checksum mismatch, expected %q got %q", properties.Checksum, sum)
}

Timing: The checksum is verified after the download completes but before the atomic rename. This ensures:

- Corrupted downloads are never placed at the target path
- Temp file is cleaned up on mismatch
- Clear error message with both expected and actual checksums

Security Considerations

Credential Handling

- Credentials in URL are redacted in log messages via util.RedactUrlCredentials()
- Basic Auth header is set by Go's http.Request.SetBasicAuth(), not manually constructed

Extraction

- Extraction uses system tar/unzip commands
- No path traversal protection beyond what the tools provide
- ExtractParent must be an absolute path (validated in model)

Temporary Files

- Created with os.CreateTemp() using pattern <archive-name>-*
- Deferred removal ensures cleanup on all exit paths
- Ownership set before content written

The provider is Unix-only due to:

- Dependency on util.GetFileOwner(), which uses syscall for UID/GID resolution
- Dependency on util.ChownFile() for ownership management

Timeouts

| Operation | Timeout | Configurable |
|-----------|---------|--------------|
| HTTP Download | 1 minute (default in HttpGetResponse) | No |
| Archive Extraction | 1 minute | No |

Large archives may require increased timeouts in future versions.