HTTP Provider
This document describes the implementation details of the HTTP archive provider for downloading and extracting archives from HTTP/HTTPS URLs.
Provider Selection
The HTTP provider is selected when:
- The URL scheme is http or https
- The archive file extension is supported (.tar.gz, .tgz, .tar, .zip)
- The required extraction tool (tar or unzip) is available in PATH
The IsManageable() function checks these conditions and returns a priority of 1 if all are met.
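The selection rules above can be sketched in Go as follows. This is a minimal illustration, not the provider's actual code: isManageable, requiredTool, and inPath are hypothetical names, the case-sensitive suffix match is an assumption, and the PATH lookup is passed in as a function so it can be replaced in tests.

```go
package main

import (
	"fmt"
	"net/url"
	"os/exec"
	"strings"
)

// requiredTool returns the extraction tool for a supported archive
// extension, or "" when the extension is not supported.
func requiredTool(path string) string {
	for _, ext := range []string{".tar.gz", ".tgz", ".tar"} {
		if strings.HasSuffix(path, ext) {
			return "tar"
		}
	}
	if strings.HasSuffix(path, ".zip") {
		return "unzip"
	}
	return ""
}

// inPath reports whether an executable with this name is found in PATH.
func inPath(name string) bool {
	_, err := exec.LookPath(name)
	return err == nil
}

// isManageable is a hypothetical stand-in for IsManageable(). hasTool is
// injected so the PATH lookup can be swapped out in tests.
func isManageable(rawURL string, hasTool func(string) bool) (bool, int) {
	u, err := url.Parse(rawURL)
	if err != nil || (u.Scheme != "http" && u.Scheme != "https") {
		return false, 0
	}
	tool := requiredTool(u.Path)
	if tool == "" || !hasTool(tool) {
		return false, 0
	}
	return true, 1 // priority 1 when every condition holds
}

func main() {
	ok, priority := isManageable("https://example.com/app.tar.gz", inPath)
	fmt.Println(ok, priority)
}
```

The result of main depends on whether tar is installed on the machine running it.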
Operations
Download
Process:
- Parse the URL and add Basic Auth credentials if username/password provided
- Create HTTP request with custom headers (if specified)
- Execute GET request via util.HttpGetResponse()
- Verify HTTP 200 status code
- Create temporary file in the same directory as the target
- Set ownership on temp file before writing content
- Copy response body to temp file
- Verify checksum if provided
- Atomic rename temp file to target path
Atomic Write Pattern:
The temp file is created in the same directory as the target so that both paths are on the same filesystem, which is what allows os.Rename() to replace the target atomically.
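A minimal sketch of this pattern, using only the standard library. The function names writeAtomic and demo are illustrative, and the ownership and checksum steps from the process list are left out to keep the example short.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// writeAtomic streams src into a temp file next to target, then renames it
// into place. Ownership and checksum steps are omitted for brevity.
func writeAtomic(target string, src io.Reader) error {
	tmp, err := os.CreateTemp(filepath.Dir(target), filepath.Base(target)+"-*")
	if err != nil {
		return err
	}
	// After a successful rename the temp path no longer exists, so this
	// deferred Remove only has an effect on the failure paths.
	defer os.Remove(tmp.Name())

	if _, err := io.Copy(tmp, src); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), target)
}

// demo writes a payload into a fresh temp directory and reports the file
// content and the number of directory entries left behind.
func demo() (string, int, error) {
	dir, err := os.MkdirTemp("", "atomic-demo")
	if err != nil {
		return "", 0, err
	}
	defer os.RemoveAll(dir)

	target := filepath.Join(dir, "app.tar.gz")
	if err := writeAtomic(target, strings.NewReader("payload")); err != nil {
		return "", 0, err
	}
	data, err := os.ReadFile(target)
	if err != nil {
		return "", 0, err
	}
	entries, err := os.ReadDir(dir)
	if err != nil {
		return "", 0, err
	}
	return string(data), len(entries), nil
}

func main() {
	content, n, err := demo()
	fmt.Println(content, n, err)
}
```

Readers of the target path see either the old file or the complete new one, never a partially written download.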
Error Handling:
| Condition | Behavior |
|---|---|
| HTTP non-200 | Return error with status code |
| Write failure | Clean up temp file, return error |
| Checksum mismatch | Clean up temp file, return error with expected vs actual |
| Rename failure | Temp file cleaned up by defer |
Authentication:
| Method | Implementation |
|---|---|
| Basic Auth | URL userinfo is passed to HttpGetResponse() which sets Authorization header |
| Username/Password properties | Embedded in URL before request: url.UserPassword(username, password) |
| Custom Headers | Added to request via http.Header.Add() |
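The table can be illustrated with a small standard-library sketch. buildRequest is a hypothetical helper, not the signature of util.HttpGetResponse(); stripping the userinfo from the request URL once the header is set is an assumption made here so that the credentials live in one place only.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// buildRequest embeds username/password in the URL, converts the userinfo
// into an Authorization header, and adds any custom headers.
func buildRequest(rawURL, username, password string, headers map[string]string) (*http.Request, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	if username != "" {
		u.User = url.UserPassword(username, password)
	}

	// Assumption: the request itself is built from a copy without userinfo.
	clean := *u
	clean.User = nil
	req, err := http.NewRequest(http.MethodGet, clean.String(), nil)
	if err != nil {
		return nil, err
	}
	if u.User != nil {
		pass, _ := u.User.Password()
		req.SetBasicAuth(u.User.Username(), pass)
	}
	for name, value := range headers {
		req.Header.Add(name, value)
	}
	return req, nil
}

func main() {
	req, err := buildRequest("https://example.com/app.tar.gz", "alice", "s3cret",
		map[string]string{"X-Token": "abc"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.URL.String(), req.Header.Get("X-Token"))
}
```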
Extract
Process:
- Validate ExtractParent is set
- Create ExtractParent directory if it doesn’t exist (mode 0755)
- Determine archive type from file extension
- Execute appropriate extraction command
Extraction Commands:
| Extension | Command |
|---|---|
| .tar.gz, .tgz | tar -xzf <archive> -C <extract_parent> |
| .tar | tar -xf <archive> -C <extract_parent> |
| .zip | unzip -d <extract_parent> <archive> |
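The mapping in this table could be expressed as a single lookup function. The sketch below is illustrative: extractCommand is a hypothetical name, and the error text mirrors the "archive type not supported" message listed under error handling.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// extractCommand maps an archive path to the tool and arguments from the
// extraction commands table. Name and signature are illustrative.
func extractCommand(archive, extractParent string) (string, []string, error) {
	switch {
	case strings.HasSuffix(archive, ".tar.gz"), strings.HasSuffix(archive, ".tgz"):
		return "tar", []string{"-xzf", archive, "-C", extractParent}, nil
	case strings.HasSuffix(archive, ".tar"):
		return "tar", []string{"-xf", archive, "-C", extractParent}, nil
	case strings.HasSuffix(archive, ".zip"):
		return "unzip", []string{"-d", extractParent, archive}, nil
	}
	return "", nil, errors.New("archive type not supported")
}

func main() {
	cmd, args, err := extractCommand("/tmp/app.tgz", "/opt/app")
	fmt.Println(cmd, args, err)
}
```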
Command Execution:
Commands are executed via model.CommandRunner.ExecuteWithOptions() with:
| Option | Value |
|---|---|
| Command | tar or unzip |
| Args | Extraction flags and paths |
| Cwd | ExtractParent directory |
| Timeout | 1 minute |
Error Handling:
| Condition | Behavior |
|---|---|
| Unsupported extension | Return “archive type not supported” error |
| Command not found | Runner returns error |
| Non-zero exit code | Return error with exit code and stderr |
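The execution options and error cases above can be approximated with os/exec. This is a sketch under stated assumptions: it does not reproduce model.CommandRunner.ExecuteWithOptions(), whose signature is not shown here, and runCommand and extractTimeout are hypothetical names.

```go
package main

import (
	"bytes"
	"context"
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// extractTimeout mirrors the fixed 1 minute limit from the options table.
const extractTimeout = time.Minute

// runCommand runs name with args in cwd and returns the exit code and
// captured stderr. A non-nil error covers timeouts, missing commands, and
// non-zero exit codes.
func runCommand(cwd string, timeout time.Duration, name string, args ...string) (int, string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	cmd := exec.CommandContext(ctx, name, args...)
	cmd.Dir = cwd
	var stderr bytes.Buffer
	cmd.Stderr = &stderr

	err := cmd.Run()
	if err == nil {
		return 0, "", nil
	}
	if ctx.Err() == context.DeadlineExceeded {
		return -1, stderr.String(), fmt.Errorf("%s timed out after %s", name, timeout)
	}
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		code := exitErr.ExitCode()
		return code, stderr.String(), fmt.Errorf("%s exited with code %d: %s", name, code, stderr.String())
	}
	// Any other failure, such as the command not being found in PATH.
	return -1, "", err
}

func main() {
	code, stderr, err := runCommand(".", extractTimeout, "tar", "--version")
	fmt.Println(code, stderr, err)
}
```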
Status
Process:
- Initialize state with EnsureAbsent default
- Check if archive file exists via os.Stat()
- If exists: set EnsurePresent, populate metadata (size, mtime, owner, group, checksum)
- If Creates property set: check if creates file exists
Metadata Collected:
| Field | Source |
|---|---|
| Name | From properties |
| Provider | “http” |
| ArchiveExists | os.Stat() success |
| Size | FileInfo.Size() |
| MTime | FileInfo.ModTime() |
| Owner | util.GetFileOwner() - resolves UID to username |
| Group | util.GetFileOwner() - resolves GID to group name |
| Checksum | util.Sha256HashFile() |
| CreatesExists | os.Stat() on Creates path |
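A reduced sketch of the status flow, covering only the fields that come straight from os.Stat(). The archiveState type and the status and demo functions are hypothetical; owner, group, and checksum collection are omitted, and the string values for the ensure state are placeholders for EnsureAbsent and EnsurePresent.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// archiveState holds a subset of the metadata fields from the table.
type archiveState struct {
	Ensure        string // "absent" by default, "present" if the archive exists
	ArchiveExists bool
	Size          int64
	MTime         time.Time
	CreatesExists bool
}

// status inspects the archive path and the optional creates path.
func status(archivePath, creates string) (archiveState, error) {
	st := archiveState{Ensure: "absent"}

	info, err := os.Stat(archivePath)
	switch {
	case err == nil:
		st.Ensure = "present"
		st.ArchiveExists = true
		st.Size = info.Size()
		st.MTime = info.ModTime()
	case !errors.Is(err, os.ErrNotExist):
		return st, err // e.g. permission denied
	}

	if creates != "" {
		_, err := os.Stat(creates)
		st.CreatesExists = err == nil
	}
	return st, nil
}

// demo reports the state before and after a 5-byte archive is written.
func demo() (archiveState, archiveState, error) {
	var none archiveState
	dir, err := os.MkdirTemp("", "status-demo")
	if err != nil {
		return none, none, err
	}
	defer os.RemoveAll(dir)

	archive := filepath.Join(dir, "app.zip")
	before, err := status(archive, "")
	if err != nil {
		return none, none, err
	}
	if err := os.WriteFile(archive, []byte("12345"), 0o644); err != nil {
		return none, none, err
	}
	after, err := status(archive, filepath.Join(dir, "installed"))
	return before, after, err
}

func main() {
	before, after, err := demo()
	fmt.Println(before.Ensure, after.Ensure, after.Size, err)
}
```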
Idempotency
The provider supports idempotency through the type’s isDesiredState() function:
State Checks (in order)
- Ensure Absent: If ensure: absent, archive must not exist
- Creates File: If Creates set and file doesn’t exist → not stable
- Archive Existence: If cleanup: false, archive must exist
- Owner/Group: Must match properties
- Checksum: If specified in properties, must match
Note: When cleanup: true, the creates property is required (enforced at validation time).
Decision Matrix
| Archive Exists | Creates Exists | Checksum Match | Cleanup | Stable? |
|---|---|---|---|---|
| No | No | N/A | false | No (download needed) |
| Yes | No | Yes | false | No (extract needed) |
| Yes | Yes | Yes | false | Yes |
| No | Yes | N/A | true | Yes |
| Yes | Yes | Yes | true | No (cleanup needed) |
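The ordered checks and the matrix can be combined into one predicate. The sketch below is an interpretation, not the type's actual isDesiredState(): the properties and observed structs are hypothetical, and the rule that a leftover archive is unstable when cleanup is true is inferred from the last two matrix rows rather than from the check list.

```go
package main

import "fmt"

// properties is a hypothetical subset of the resource properties.
type properties struct {
	Ensure   string // "present" or "absent"
	Creates  string
	Cleanup  bool
	Owner    string
	Group    string
	Checksum string // optional
}

// observed is a hypothetical subset of the state returned by Status.
type observed struct {
	ArchiveExists bool
	CreatesExists bool
	Owner         string
	Group         string
	Checksum      string
}

// isDesiredState applies the state checks in their documented order.
func isDesiredState(p properties, s observed) bool {
	if p.Ensure == "absent" {
		return !s.ArchiveExists
	}
	if p.Creates != "" && !s.CreatesExists {
		return false
	}
	if p.Cleanup {
		// Inferred from the matrix: with cleanup enabled, an archive that
		// is still on disk means cleanup is pending.
		return !s.ArchiveExists
	}
	if !s.ArchiveExists {
		return false
	}
	if s.Owner != p.Owner || s.Group != p.Group {
		return false
	}
	if p.Checksum != "" && s.Checksum != p.Checksum {
		return false
	}
	return true
}

func main() {
	p := properties{Ensure: "present", Creates: "/opt/app/bin", Owner: "root", Group: "root", Checksum: "abc"}
	s := observed{ArchiveExists: true, CreatesExists: true, Owner: "root", Group: "root", Checksum: "abc"}
	fmt.Println(isDesiredState(p, s))
}
```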
Checksum Verification
Algorithm: SHA-256
Implementation:
Timing: Checksum is verified after download completes but before the atomic rename. This ensures:
- Corrupted downloads are never placed at the target path
- Temp file is cleaned up on mismatch
- Clear error message with both expected and actual checksums
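A minimal sketch of verifying a temp file before it is renamed, assuming hex-encoded checksums compared case-insensitively. sha256File, verifyChecksum, and demo are illustrative names, not the util package's API.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// sha256File streams a file through SHA-256 and returns the hex digest.
func sha256File(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// verifyChecksum returns an error naming both values on mismatch.
func verifyChecksum(path, expected string) error {
	actual, err := sha256File(path)
	if err != nil {
		return err
	}
	if !strings.EqualFold(actual, expected) {
		return fmt.Errorf("checksum mismatch: expected %s, got %s", expected, actual)
	}
	return nil
}

// demo hashes a temp file containing "abc" and also checks it against a
// deliberately wrong checksum.
func demo() (string, bool, error) {
	dir, err := os.MkdirTemp("", "checksum-demo")
	if err != nil {
		return "", false, err
	}
	defer os.RemoveAll(dir)

	tmp := filepath.Join(dir, "download.tmp")
	if err := os.WriteFile(tmp, []byte("abc"), 0o600); err != nil {
		return "", false, err
	}
	actual, err := sha256File(tmp)
	if err != nil {
		return "", false, err
	}
	mismatchRejected := verifyChecksum(tmp, strings.Repeat("0", 64)) != nil
	return actual, mismatchRejected, nil
}

func main() {
	actual, rejected, err := demo()
	fmt.Println(actual, rejected, err)
}
```

In the download flow, the rename to the target path would only happen after verifyChecksum returns nil.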
Security Considerations
Credential Handling
- Credentials in URL are redacted in log messages via util.RedactUrlCredentials()
- Basic Auth header is set by Go’s http.Request.SetBasicAuth(), not manually constructed
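Redaction of this kind can be sketched with the standard library's URL.Redacted() method, which replaces the password with a fixed placeholder. This is a stand-in for illustration only: the actual output format of util.RedactUrlCredentials() is not specified here, and redactURL is a hypothetical name.

```go
package main

import (
	"fmt"
	"net/url"
)

// redactURL returns a log-safe form of raw. If the URL cannot be parsed,
// a fixed marker is returned instead of the raw input so that credentials
// never reach the log.
func redactURL(raw string) string {
	u, err := url.Parse(raw)
	if err != nil {
		return "<unparseable URL>"
	}
	return u.Redacted()
}

func main() {
	fmt.Println(redactURL("https://alice:s3cret@example.com/app.zip"))
	// prints: https://alice:xxxxx@example.com/app.zip
}
```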
Archive Extraction
- Extraction uses system tar/unzip commands
- No path traversal protection beyond what the tools provide
- ExtractParent must be an absolute path (validated in model)
Temporary Files
- Created with os.CreateTemp() using pattern <archive-name>-*
- Deferred removal ensures cleanup on all exit paths
- Ownership set before content written
Platform Support
The provider is Unix-only due to:
- Dependency on util.GetFileOwner() which uses syscall for UID/GID resolution
- Dependency on util.ChownFile() for ownership management
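To show why this ties the provider to Unix, here is a sketch of an owner lookup built on syscall.Stat_t, a type that only exists on Unix-like targets, so the file does not compile for Windows. fileOwnerIDs, ownerNames, and demo are hypothetical names and do not reflect the actual util.GetFileOwner() signature; falling back to numeric IDs when no name can be resolved is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/user"
	"strconv"
	"syscall"
)

// fileOwnerIDs reads the numeric UID and GID from the Unix stat data.
func fileOwnerIDs(path string) (uint32, uint32, error) {
	info, err := os.Stat(path)
	if err != nil {
		return 0, 0, err
	}
	st, ok := info.Sys().(*syscall.Stat_t)
	if !ok {
		return 0, 0, errors.New("no Unix stat data available")
	}
	return st.Uid, st.Gid, nil
}

// ownerNames resolves IDs to names, keeping the numeric form when the
// system has no matching user or group entry.
func ownerNames(uid, gid uint32) (string, string) {
	owner := strconv.FormatUint(uint64(uid), 10)
	group := strconv.FormatUint(uint64(gid), 10)
	if u, err := user.LookupId(owner); err == nil {
		owner = u.Username
	}
	if g, err := user.LookupGroupId(group); err == nil {
		group = g.Name
	}
	return owner, group
}

// demo creates a temp file and returns its owner UID together with the
// effective UID of the current process.
func demo() (uint32, int, error) {
	f, err := os.CreateTemp("", "owner-demo-*")
	if err != nil {
		return 0, 0, err
	}
	defer os.Remove(f.Name())
	f.Close()

	uid, _, err := fileOwnerIDs(f.Name())
	return uid, os.Geteuid(), err
}

func main() {
	uid, gid, err := fileOwnerIDs(".")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	owner, group := ownerNames(uid, gid)
	fmt.Println(owner, group)
}
```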
Timeouts
| Operation | Timeout | Configurable |
|---|---|---|
| HTTP Download | 1 minute (default in HttpGetResponse) | No |
| Archive Extraction | 1 minute | No |
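Neither timeout is configurable, so downloading or extracting a large archive can fail once the 1 minute limit is reached. Future versions may need to make these limits configurable.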