HTTP Provider

This document describes the implementation details of the HTTP archive provider for downloading and extracting archives from HTTP/HTTPS URLs.

Provider Selection

The HTTP provider is selected when:

  1. The URL scheme is http or https
  2. The archive file extension is supported (.tar.gz, .tgz, .tar, .zip)
  3. The required extraction tool (tar or unzip) is available in PATH

The IsManageable() function checks these conditions and returns a priority of 1 if all are met.
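
The three conditions can be sketched as follows. This is a minimal illustration, not the provider's actual code: `isManageable` and `toolForArchive` are hypothetical names, the extension is assumed to be read from the URL path, and the PATH lookup is injected as a function so it can be swapped out.

```go
package main

import (
	"fmt"
	"net/url"
	"os/exec"
	"strings"
)

// toolForArchive returns the extraction tool needed for a file name,
// or "" when the extension is not one of the supported ones.
func toolForArchive(name string) string {
	switch {
	case strings.HasSuffix(name, ".tar.gz"),
		strings.HasSuffix(name, ".tgz"),
		strings.HasSuffix(name, ".tar"):
		return "tar"
	case strings.HasSuffix(name, ".zip"):
		return "unzip"
	}
	return ""
}

// isManageable is a hypothetical stand-in for IsManageable(). It reports
// whether all three conditions hold and, if so, a priority of 1.
func isManageable(rawURL string, lookPath func(string) (string, error)) (bool, int) {
	u, err := url.Parse(rawURL)
	if err != nil || (u.Scheme != "http" && u.Scheme != "https") {
		return false, 0
	}
	tool := toolForArchive(u.Path)
	if tool == "" {
		return false, 0
	}
	if _, err := lookPath(tool); err != nil {
		return false, 0
	}
	return true, 1
}

func main() {
	ok, prio := isManageable("https://example.com/app.tar.gz", exec.LookPath)
	fmt.Println(ok, prio)
}
```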

Operations

Download

Process:

  1. Parse the URL and add Basic Auth credentials if username/password provided
  2. Create HTTP request with custom headers (if specified)
  3. Execute GET request via util.HttpGetResponse()
  4. Verify HTTP 200 status code
  5. Create temporary file in the same directory as the target
  6. Set ownership on temp file before writing content
  7. Copy response body to temp file
  8. Verify checksum if provided
  9. Atomic rename temp file to target path

Atomic Write Pattern:

[parent dir]/archive-name-* (temp file)
    ↓ set owner/group
    ↓ write content
    ↓ verify checksum
    ↓ rename
[parent dir]/archive-name (final file)

The temp file is created in the same directory as the target so that both paths are on the same filesystem, which is what allows os.Rename() to replace the target atomically.
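
The pattern above can be sketched with the standard library alone. This is an illustration under assumptions, not the provider's code: `writeAtomic` and `demo` are hypothetical names, ownership handling is omitted, and the checksum is computed while streaming rather than by re-reading the temp file as the provider does with util.Sha256HashFile().

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// writeAtomic streams body into a temp file next to target, optionally
// verifies a SHA-256 checksum, and renames the temp file into place.
func writeAtomic(target string, body io.Reader, wantSum string) error {
	tmp, err := os.CreateTemp(filepath.Dir(target), filepath.Base(target)+"-*")
	if err != nil {
		return err
	}
	// After a successful rename the temp name no longer exists, so this
	// removal only has an effect on the error paths.
	defer os.Remove(tmp.Name())

	// The provider sets owner/group at this point, before content is written.

	h := sha256.New()
	if _, err := io.Copy(io.MultiWriter(tmp, h), body); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	if got := hex.EncodeToString(h.Sum(nil)); wantSum != "" && got != wantSum {
		return fmt.Errorf("checksum mismatch, expected %q got %q", wantSum, got)
	}
	return os.Rename(tmp.Name(), target)
}

// demo exercises the success path and the mismatch path in a scratch directory.
func demo() (okErr, badErr error) {
	dir, err := os.MkdirTemp("", "atomic-demo")
	if err != nil {
		return err, err
	}
	defer os.RemoveAll(dir)
	target := filepath.Join(dir, "example.tar.gz")
	okErr = writeAtomic(target, strings.NewReader("archive bytes"), "")
	badErr = writeAtomic(target, strings.NewReader("archive bytes"), "deadbeef")
	return okErr, badErr
}

func main() {
	okErr, badErr := demo()
	fmt.Println("no checksum:", okErr)
	fmt.Println("bad checksum:", badErr)
}
```

Because the rename is the last step, a failed or mismatching download never touches an existing file at the target path.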

Error Handling:

| Condition | Behavior |
|---|---|
| HTTP non-200 | Return error with status code |
| Write failure | Clean up temp file, return error |
| Checksum mismatch | Clean up temp file, return error with expected vs actual |
| Rename failure | Temp file cleaned up by defer |

Authentication:

| Method | Implementation |
|---|---|
| Basic Auth | URL userinfo is passed to HttpGetResponse() which sets Authorization header |
| Username/Password properties | Embedded in URL before request: url.UserPassword(username, password) |
| Custom Headers | Added to request via http.Header.Add() |
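
A rough sketch of how these three mechanisms fit together using only net/http and net/url. `buildRequest` is a hypothetical helper; the internals of util.HttpGetResponse() are not shown in this document, so the step that turns URL userinfo into an Authorization header is an assumption about its behavior.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// buildRequest embeds credentials in the URL, converts the URL userinfo
// into a Basic Auth header, and adds any custom headers.
func buildRequest(rawURL, username, password string, headers map[string]string) (*http.Request, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	if username != "" {
		u.User = url.UserPassword(username, password)
	}
	req, err := http.NewRequest(http.MethodGet, u.String(), nil)
	if err != nil {
		return nil, err
	}
	// Assumed equivalent of what HttpGetResponse() does with URL userinfo.
	if u.User != nil {
		pw, _ := u.User.Password()
		req.SetBasicAuth(u.User.Username(), pw)
	}
	for k, v := range headers {
		req.Header.Add(k, v)
	}
	return req, nil
}

func main() {
	req, err := buildRequest("https://example.com/app.tar.gz", "alice", "s3cret",
		map[string]string{"X-Token": "abc"})
	if err != nil {
		fmt.Println(err)
		return
	}
	// Redacted() hides the password when the URL is printed.
	fmt.Println(req.Method, req.URL.Redacted(), req.Header.Get("X-Token"))
}
```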

Extract

Process:

  1. Validate ExtractParent is set
  2. Create ExtractParent directory if it doesn’t exist (mode 0755)
  3. Determine archive type from file extension
  4. Execute appropriate extraction command

Extraction Commands:

| Extension | Command |
|---|---|
| .tar.gz, .tgz | tar -xzf <archive> -C <extract_parent> |
| .tar | tar -xf <archive> -C <extract_parent> |
| .zip | unzip -d <extract_parent> <archive> |
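
The mapping in the table can be expressed as a small lookup. `extractCommand` is a hypothetical name used only for illustration; the flags and argument order are copied from the table.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// extractCommand returns the tool and arguments for an archive path,
// chosen by file extension.
func extractCommand(archive, extractParent string) (string, []string, error) {
	switch {
	case strings.HasSuffix(archive, ".tar.gz"), strings.HasSuffix(archive, ".tgz"):
		return "tar", []string{"-xzf", archive, "-C", extractParent}, nil
	case strings.HasSuffix(archive, ".tar"):
		return "tar", []string{"-xf", archive, "-C", extractParent}, nil
	case strings.HasSuffix(archive, ".zip"):
		return "unzip", []string{"-d", extractParent, archive}, nil
	}
	return "", nil, errors.New("archive type not supported")
}

func main() {
	name, args, err := extractCommand("/tmp/app.tar.gz", "/opt/app")
	fmt.Println(name, args, err)
}
```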

Command Execution:

Commands are executed via model.CommandRunner.ExecuteWithOptions() with:

| Option | Value |
|---|---|
| Command | tar or unzip |
| Args | Extraction flags and paths |
| Cwd | ExtractParent directory |
| Timeout | 1 minute |

Error Handling:

| Condition | Behavior |
|---|---|
| Unsupported extension | Return “archive type not supported” error |
| Command not found | Runner returns error |
| Non-zero exit code | Return error with exit code and stderr |
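
The interface of model.CommandRunner is not shown in this document, so the following sketch uses os/exec as a stand-in to illustrate a working directory, a timeout, and an error that carries the exit code and stderr. `runWithTimeout` and `demo` are hypothetical names, and the demo assumes a POSIX `sh` is available.

```go
package main

import (
	"bytes"
	"context"
	"errors"
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// runWithTimeout runs a command in dir, kills it when the timeout expires,
// and reports non-zero exits together with the captured stderr.
func runWithTimeout(timeout time.Duration, dir, name string, args ...string) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	var stderr bytes.Buffer
	cmd := exec.CommandContext(ctx, name, args...)
	cmd.Dir = dir
	cmd.Stderr = &stderr

	err := cmd.Run()
	if err == nil {
		return nil
	}
	if errors.Is(ctx.Err(), context.DeadlineExceeded) {
		return fmt.Errorf("%s timed out after %s", name, timeout)
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return fmt.Errorf("%s exited with code %d: %s",
			name, exitErr.ExitCode(), strings.TrimSpace(stderr.String()))
	}
	// Any other failure, for example the command not being found in PATH.
	return err
}

// demo runs one succeeding and one failing shell command.
func demo() (okErr, failErr error) {
	okErr = runWithTimeout(time.Minute, ".", "sh", "-c", "exit 0")
	failErr = runWithTimeout(time.Minute, ".", "sh", "-c", "echo boom >&2; exit 3")
	return okErr, failErr
}

func main() {
	okErr, failErr := demo()
	fmt.Println(okErr)
	fmt.Println(failErr)
}
```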

Status

Process:

  1. Initialize state with EnsureAbsent default
  2. Check if archive file exists via os.Stat()
  3. If exists: set EnsurePresent, populate metadata (size, mtime, owner, group, checksum)
  4. If Creates property set: check if creates file exists

Metadata Collected:

| Field | Source |
|---|---|
| Name | From properties |
| Provider | “http” |
| ArchiveExists | os.Stat() success |
| Size | FileInfo.Size() |
| MTime | FileInfo.ModTime() |
| Owner | util.GetFileOwner() - resolves UID to username |
| Group | util.GetFileOwner() - resolves GID to group name |
| Checksum | util.Sha256HashFile() |
| CreatesExists | os.Stat() on Creates path |
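
A simplified sketch of the status flow. `archiveState` and `status` are hypothetical names, the Ensure values are plain strings chosen for illustration, and the owner, group, and checksum fields are left out to keep the example short.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// archiveState holds a subset of the metadata listed in the table above.
type archiveState struct {
	Ensure        string
	ArchiveExists bool
	Size          int64
	MTime         time.Time
	CreatesExists bool
}

// status starts from "absent" and fills in metadata only when the
// archive file can be stat'ed.
func status(archivePath, creates string) archiveState {
	st := archiveState{Ensure: "absent"}
	if info, err := os.Stat(archivePath); err == nil {
		st.Ensure = "present"
		st.ArchiveExists = true
		st.Size = info.Size()
		st.MTime = info.ModTime()
	}
	if creates != "" {
		if _, err := os.Stat(creates); err == nil {
			st.CreatesExists = true
		}
	}
	return st
}

func main() {
	fmt.Printf("%+v\n", status("/tmp/app.tar.gz", "/opt/app/bin/app"))
}
```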

Idempotency

The provider supports idempotency through the type’s isDesiredState() function:

State Checks (in order)

  1. Ensure Absent: If ensure: absent, archive must not exist
  2. Creates File: If Creates set and file doesn’t exist → not stable
  3. Archive Existence: If cleanup: false, archive must exist
  4. Owner/Group: Must match properties
  5. Checksum: If specified in properties, must match

Note: When cleanup: true, the creates property is required (enforced at validation time).

Decision Matrix

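| Archive Exists | Creates Exists | Checksum Match | Cleanup | Stable? |
|---|---|---|---|---|
| No | No | N/A | false | No (download needed) |
| Yes | No | Yes | false | No (extract needed) |
| Yes | Yes | Yes | false | Yes |
| No | Yes | N/A | true | Yes |
| Yes | Yes | Yes | true | No (cleanup needed) |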
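
The checks and the matrix can be reproduced with a small function. This is a sketch and not the type's actual isDesiredState(): the `props` and `state` structs are hypothetical, and it assumes that owner, group, and checksum are compared only when the archive is expected to remain on disk (cleanup: false), since they cannot be read from a removed file.

```go
package main

import "fmt"

// props holds the desired properties; state holds what Status observed.
type props struct {
	Ensure   string
	Cleanup  bool
	Creates  string
	Checksum string
	Owner    string
	Group    string
}

type state struct {
	ArchiveExists bool
	CreatesExists bool
	Checksum      string
	Owner         string
	Group         string
}

// isDesiredState applies the checks in the documented order.
func isDesiredState(p props, s state) bool {
	if p.Ensure == "absent" {
		return !s.ArchiveExists
	}
	if p.Creates != "" && !s.CreatesExists {
		return false
	}
	if p.Cleanup {
		// Per the matrix, a leftover archive means cleanup is still needed.
		return !s.ArchiveExists
	}
	if !s.ArchiveExists {
		return false
	}
	if s.Owner != p.Owner || s.Group != p.Group {
		return false
	}
	return p.Checksum == "" || s.Checksum == p.Checksum
}

func main() {
	p := props{Ensure: "present", Creates: "/opt/app/bin/app", Checksum: "abc", Owner: "root", Group: "root"}
	s := state{ArchiveExists: true, CreatesExists: true, Checksum: "abc", Owner: "root", Group: "root"}
	fmt.Println(isDesiredState(p, s))
}
```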

Checksum Verification

Algorithm: SHA-256

Implementation:

sum, err := util.Sha256HashFile(tempFile)
if err != nil {
    return err // hashing failed; the deferred cleanup removes the temp file
}
if sum != properties.Checksum {
    return fmt.Errorf("checksum mismatch, expected %q got %q", properties.Checksum, sum)
}

Timing: Checksum is verified after download completes but before the atomic rename. This ensures:

  • Corrupted downloads are never placed at the target path
  • Temp file is cleaned up on mismatch
  • Clear error message with both expected and actual checksums
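
For reference, a file-hashing helper along the lines of util.Sha256HashFile() might look like the sketch below. The real helper's implementation is not shown in this document; `sha256HashFile` and `hashOfString` are hypothetical names, and the hex-encoded return value is an assumption.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

// sha256HashFile streams a file through SHA-256 and returns the
// hex-encoded digest, without loading the whole file into memory.
func sha256HashFile(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// hashOfString writes content to a scratch file and hashes that file.
func hashOfString(content string) (string, error) {
	f, err := os.CreateTemp("", "hash-demo-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())
	if _, err := f.WriteString(content); err != nil {
		f.Close()
		return "", err
	}
	if err := f.Close(); err != nil {
		return "", err
	}
	return sha256HashFile(f.Name())
}

func main() {
	sum, err := hashOfString("abc")
	fmt.Println(sum, err)
}
```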

Security Considerations

Credential Handling

  • Credentials in URL are redacted in log messages via util.RedactUrlCredentials()
  • Basic Auth header is set by Go’s http.Request.SetBasicAuth(), not manually constructed
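
The implementation of util.RedactUrlCredentials() is not shown here. As an illustration of the idea, Go's standard library offers url.URL.Redacted(), which replaces the password with a placeholder; `redact` below is a hypothetical wrapper around it and may differ from what the provider's helper emits.

```go
package main

import (
	"fmt"
	"net/url"
)

// redact returns rawURL with any password replaced, so the result is
// safe to include in log messages.
func redact(rawURL string) string {
	u, err := url.Parse(rawURL)
	if err != nil {
		// Never echo an unparseable URL, it may still contain credentials.
		return "<unparseable url>"
	}
	return u.Redacted()
}

func main() {
	fmt.Println(redact("https://alice:s3cret@example.com/app.tar.gz"))
}
```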

Archive Extraction

  • Extraction uses system tar/unzip commands
  • No path traversal protection beyond what the tools provide
  • ExtractParent must be an absolute path (validated in model)

Temporary Files

  • Created with os.CreateTemp() using pattern <archive-name>-*
  • Deferred removal ensures cleanup on all exit paths
  • Ownership set before content written

Platform Support

The provider is Unix-only due to:

  • Dependency on util.GetFileOwner() which uses syscall for UID/GID resolution
  • Dependency on util.ChownFile() for ownership management
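
To illustrate why owner lookup ties the provider to Unix, the sketch below resolves a file's owner and group through syscall.Stat_t, which is only populated on Unix-like systems. `fileOwner` is a hypothetical stand-in for util.GetFileOwner(), whose actual code is not shown in this document; falling back to numeric IDs when a name cannot be resolved is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"os/user"
	"strconv"
	"syscall"
)

// fileOwner returns the owner and group names of path. It compiles only
// on Unix-like systems because it relies on syscall.Stat_t.
func fileOwner(path string) (owner, group string, err error) {
	info, err := os.Stat(path)
	if err != nil {
		return "", "", err
	}
	st, ok := info.Sys().(*syscall.Stat_t)
	if !ok {
		return "", "", fmt.Errorf("no Unix stat data for %s", path)
	}
	// Start with numeric IDs and replace them with names when resolvable.
	owner = strconv.FormatUint(uint64(st.Uid), 10)
	group = strconv.FormatUint(uint64(st.Gid), 10)
	if u, err := user.LookupId(owner); err == nil {
		owner = u.Username
	}
	if g, err := user.LookupGroupId(group); err == nil {
		group = g.Name
	}
	return owner, group, nil
}

func main() {
	owner, group, err := fileOwner("/")
	fmt.Println(owner, group, err)
}
```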

Timeouts

| Operation | Timeout | Configurable |
|---|---|---|
| HTTP Download | 1 minute (default in HttpGetResponse) | No |
| Archive Extraction | 1 minute | No |

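Neither timeout is configurable yet, so a large archive that needs more than one minute to download or extract may fail. Configurable timeouts may be added in future versions.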