core

napt.core

Core orchestration for NAPT.

This module provides high-level orchestration functions that coordinate the complete workflow for recipe validation, package building, and deployment.

Two-Path Architecture:

The orchestration automatically selects the optimal path based on what each discovery strategy can do:

  • Version-First Path (web_scrape, api_github, api_json): These strategies can check the version without downloading the file. NAPT compares the discovered version to the cached version. If they match and the file already exists, the download is skipped entirely. This makes update checks very fast (~100-300ms) since no large installer files are downloaded.

  • File-First Path (url_download): This strategy requires downloading the file to extract the version. NAPT uses HTTP ETag headers to check if the file has changed. If the server responds with HTTP 304 (Not Modified), the existing cached file is reused, avoiding unnecessary re-downloads.
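
The file-first check can be pictured as a conditional GET. The sketch below is only an illustration under stated assumptions, not NAPT's implementation: `conditional_headers` and `should_reuse_cached_file` are hypothetical helpers, and the cache dictionary is assumed to carry the `etag` and `last_modified` keys that the state file stores.

```python
from typing import Optional


def conditional_headers(cache: Optional[dict]) -> dict[str, str]:
    """Build request headers for a conditional GET from a cached state entry."""
    headers: dict[str, str] = {}
    if not cache:
        return headers  # No cache: a plain, unconditional request
    if cache.get("etag"):
        headers["If-None-Match"] = cache["etag"]
    if cache.get("last_modified"):
        headers["If-Modified-Since"] = cache["last_modified"]
    return headers


def should_reuse_cached_file(status_code: int) -> bool:
    """HTTP 304 means the server copy is unchanged, so the cached file is reused."""
    return status_code == 304


print(conditional_headers({"etag": '"abc123"'}))  # {'If-None-Match': '"abc123"'}
print(should_reuse_cached_file(304))  # True
```

A 200 response would instead carry a new body, and its `ETag` header would be stored for the next run.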

Design Principles:

  • Each function has a single, clear responsibility
  • Functions return structured data (dataclasses) for easy testing and extension
  • Errors are raised as exceptions; the CLI layer formats them for user display
  • Discovery strategies are loaded dynamically via a registry pattern
  • Configuration is immutable once loaded
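
As a rough illustration of the registry idea, the sketch below maps strategy names to classes and looks them up by name. All names here (`register`, `lookup`, `DemoStrategy`) are hypothetical and chosen for the example; NAPT's own registry lives in its discovery package and may differ in shape and in the exception it raises.

```python
# Hypothetical registry: maps a strategy name to the class that implements it.
_REGISTRY: dict[str, type] = {}


def register(name: str):
    """Class decorator that records a strategy under the given name."""

    def decorator(cls: type) -> type:
        _REGISTRY[name] = cls
        return cls

    return decorator


def lookup(name: str) -> object:
    """Instantiate a registered strategy, or fail with a clear error."""
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(f"Unknown discovery strategy: {name}") from None


@register("demo")
class DemoStrategy:
    def get_version_info(self, app_config: dict) -> str:
        return "1.0.0"


strategy = lookup("demo")
print(strategy.get_version_info({}))  # 1.0.0
```

Because registration happens when the defining module is imported, a lookup only succeeds after that import has run.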
Example

Programmatic usage:

from pathlib import Path
from napt.core import discover_recipe

result = discover_recipe(
    recipe_path=Path("recipes/Google/chrome.yaml"),
    output_dir=Path("./downloads"),
)

print(f"App: {result.app_name}")
print(f"Version: {result.version}")
print(f"SHA-256: {result.sha256}")

# With a version-first strategy, the download may have been skipped if the version was unchanged.

derive_file_path_from_url

derive_file_path_from_url(url: str, output_dir: Path, app_id: str) -> Path

Derive file path from URL using same logic as download_file.

This function ensures version-first strategies can locate cached files without downloading by following the same naming convention as the download module (app-scoped subdirectory).

Parameters:

  • url (str, required): Download URL.
  • output_dir (Path, required): Base downloads directory.
  • app_id (str, required): Application identifier used to scope the subdirectory.

Returns:

  • Path: Expected path to the file under output_dir/app_id/.

Example

Get expected file path for a download URL:

from pathlib import Path
from napt.core import derive_file_path_from_url

path = derive_file_path_from_url(
    "https://example.com/app.msi",
    Path("./downloads"),
    "my-app",
)
# Returns: Path('downloads/my-app/app.msi')  (pathlib drops the leading './')
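
Since the filename comes from the URL's path component only, query strings and fragments do not end up in the cached filename. The snippet below illustrates this with a local copy of the function body from the source listing, so it runs without NAPT installed; the URL is made up for the example.

```python
from pathlib import Path
from urllib.parse import urlparse


def derive_file_path_from_url(url: str, output_dir: Path, app_id: str) -> Path:
    # Local copy of the function body shown in the source listing.
    filename = Path(urlparse(url).path).name
    return output_dir / app_id / filename


path = derive_file_path_from_url(
    "https://example.com/releases/app.msi?token=abc#latest",
    Path("downloads"),
    "my-app",
)
print(path.name)  # app.msi
```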

Source code in napt/core.py
def derive_file_path_from_url(url: str, output_dir: Path, app_id: str) -> Path:
    """Derive file path from URL using same logic as download_file.

    This function ensures version-first strategies can locate cached files
    without downloading by following the same naming convention as the
    download module (app-scoped subdirectory).

    Args:
        url: Download URL.
        output_dir: Base downloads directory.
        app_id: Application identifier used to scope the subdirectory.

    Returns:
        Expected path to the file under output_dir/app_id/.

    Example:
        Get expected file path for a download URL:
            ```python
            from pathlib import Path

            path = derive_file_path_from_url(
                "https://example.com/app.msi",
                Path("./downloads"),
                "my-app",
            )
            # Returns: Path('downloads/my-app/app.msi')
            ```

    """
    from urllib.parse import urlparse

    filename = Path(urlparse(url).path).name
    return output_dir / app_id / filename

discover_recipe

discover_recipe(recipe_path: Path, output_dir: Path | None = None, state_file: Path | None = Path('state/versions.json'), stateless: bool = False) -> DiscoverResult

Discover the latest version by loading config and downloading installer.

This is the main entry point for the 'napt discover' command. It orchestrates the entire discovery workflow using a two-path architecture optimized for version-first strategies.

The function uses duck typing to detect strategy capabilities:

VERSION-FIRST PATH (if strategy has get_version_info method):

  1. Load effective configuration (org + vendor + recipe merged)
  2. Call strategy.get_version_info() to discover version (no download)
  3. Compare discovered version to cached known_version
  4. If match and file exists -> skip download entirely (fast path!)
  5. If changed or missing -> download installer via download_file()
  6. Update state and return results
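
Steps 3 to 5 reduce to a single yes/no decision. The sketch below is a minimal illustration, assuming a cache entry shaped like the state file's per-app dictionary; `can_skip_download` is a hypothetical helper, not a NAPT function.

```python
import tempfile
from pathlib import Path
from typing import Optional


def can_skip_download(
    cache: Optional[dict], discovered_version: str, file_path: Path
) -> bool:
    """Return True only when the cached version matches and the file is on disk."""
    if not cache:
        return False  # First run: nothing to compare against
    if cache.get("known_version") != discovered_version:
        return False  # Version changed: a new installer is needed
    return file_path.exists()  # Same version, but the file may have been deleted


with tempfile.TemporaryDirectory() as tmp:
    installer = Path(tmp) / "app.msi"
    installer.write_bytes(b"stub")
    cache = {"known_version": "1.2.3"}

    unchanged = can_skip_download(cache, "1.2.3", installer)
    changed = can_skip_download(cache, "1.2.4", installer)
    missing = can_skip_download(cache, "1.2.3", installer.with_name("gone.msi"))
    first_run = can_skip_download(None, "1.2.3", installer)

print(unchanged, changed, missing, first_run)  # True False False False
```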

FILE-FIRST PATH (if strategy has only discover_version method):

  1. Load effective configuration (org + vendor + recipe merged)
  2. Call strategy.discover_version() with cached ETag
  3. Strategy handles conditional request (HTTP 304 vs 200)
  4. Extract version from downloaded file
  5. Update state and return results
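
The capability check that chooses between the two paths can be sketched as follows. The two strategy classes are hypothetical stand-ins with simplified signatures; only the `hasattr` test on `get_version_info` reflects what the source listing does.

```python
class VersionFirstStrategy:
    """Hypothetical strategy that can report a version without downloading."""

    def get_version_info(self, app_config: dict) -> str:
        return "2.0.0"


class FileFirstStrategy:
    """Hypothetical strategy that must fetch the file to learn its version."""

    def discover_version(self, app_config: dict, output_dir: str, cache=None) -> str:
        return "2.0.0"


def select_path(strategy: object) -> str:
    """Pick the workflow by checking for the optional get_version_info method."""
    if hasattr(strategy, "get_version_info"):
        return "version-first"
    return "file-first"


print(select_path(VersionFirstStrategy()))  # version-first
print(select_path(FileFirstStrategy()))  # file-first
```

Duck typing keeps the strategies free of a shared base class: adding `get_version_info` to a strategy is enough to opt it into the fast path.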

Parameters:

  • recipe_path (Path, required): Path to the recipe YAML file. Must exist and be readable. The path is resolved to absolute form.
  • output_dir (Path | None, default None): Directory to download the installer to. Created if it doesn't exist. If None, the directory is read from defaults.discover.output_dir in the effective configuration. The downloaded file will be named based on the Content-Disposition header or URL path.
  • state_file (Path | None, default Path('state/versions.json')): Path to the state file for version tracking and ETag caching. Set to None to disable.
  • stateless (bool, default False): If True, disable state tracking (no caching, always download).
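
For orientation, the state that `state_file` holds has roughly the shape below. The keys mirror what the source listing writes for schema version "2"; every value is a placeholder invented for this example, and serializing with `json` is an assumption based on the `.json` file extension.

```python
import json

# Placeholder values only; keys follow the cache entry built in the source listing.
example_state = {
    "metadata": {
        "napt_version": "0.0.0",  # placeholder, not a real release number
        "last_updated": "2025-01-01T00:00:00+00:00",  # placeholder timestamp
        "schema_version": "2",
    },
    "apps": {
        "my-app": {  # keyed by the recipe's app id
            "url": "https://example.com/app.msi",
            "etag": None,  # only useful for url_download
            "last_modified": None,  # only useful for url_download
            "sha256": "<hex digest of the installer>",
            "known_version": "1.2.3",
            "strategy": "api_github",
        }
    },
}

print(json.dumps(example_state, indent=2))
```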

Returns:

  • DiscoverResult: Discovery results and metadata including version, file path, and SHA-256 hash.
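
The fields available on the result can be read off the constructor call at the end of the source listing. The stand-in below reproduces those field names for illustration; the class name `DiscoverResultSketch` and all type annotations are assumptions, not NAPT's actual definition.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class DiscoverResultSketch:
    """Hypothetical stand-in; field names follow the constructor call in the listing."""

    app_name: str
    app_id: str
    strategy: str
    version: str
    version_source: str
    file_path: Path
    sha256: str
    status: str


result = DiscoverResultSketch(
    app_name="Example App",
    app_id="example-app",
    strategy="api_github",
    version="1.2.3",
    version_source="illustrative-source",  # placeholder value
    file_path=Path("downloads/example-app/app.msi"),
    sha256="<hex digest>",  # placeholder value
    status="success",
)
print(f"{result.app_name} {result.version} -> {result.file_path.name}")
# Example App 1.2.3 -> app.msi
```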

Raises:

  • ConfigError: On missing or invalid configuration fields (no app defined, missing 'source.strategy' field, unknown discovery strategy name), YAML parse errors (from config loader), or if recipe file doesn't exist.
  • NetworkError: On download failures or version extraction errors.

Example

Basic version discovery:

from pathlib import Path
from napt.core import discover_recipe

result = discover_recipe(
    Path("recipes/Google/chrome.yaml"),
    Path("./downloads"),
)
print(result.version)  # e.g. 141.0.7390.123

Handling errors:

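from pathlib import Path
from napt.core import discover_recipe
# ConfigError and NetworkError must also be imported; their module is not shown on this page.

try:
    result = discover_recipe(Path("invalid.yaml"), Path("."))
except ConfigError as e:
    print(f"Config error: {e}")
except NetworkError as e:
    print(f"Network error: {e}")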

Note

Discovery strategies must be registered before they are looked up; this function imports the four built-in strategy modules (api_github, api_json, url_download, web_scrape) itself, which registers them. The strategy type is detected via duck typing (a hasattr check for get_version_info). Version-first strategies (web_scrape, api_github, api_json) can skip the download entirely when the version is unchanged (fast-path optimization). The file-first strategy (url_download) uses ETag conditional requests. Downloaded files are written atomically (to a .part file that is then renamed). Progress output goes to stdout via the download module.
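
The ".part then renamed" approach can be sketched as below. This is a generic illustration of the technique, not the download module's code; `write_atomically` is a hypothetical helper, and a real downloader would stream chunks into the .part file rather than write one bytes object.

```python
import os
import tempfile
from pathlib import Path


def write_atomically(target: Path, data: bytes) -> Path:
    """Write to a sibling .part file, then rename it over the final name."""
    part = target.with_name(target.name + ".part")
    part.write_bytes(data)
    os.replace(part, target)  # Atomic when source and target share a filesystem
    return target


with tempfile.TemporaryDirectory() as tmp:
    final = write_atomically(Path(tmp) / "app.msi", b"installer bytes")
    leftovers = [p.name for p in Path(tmp).iterdir()]

print(leftovers)  # ['app.msi']
```

The benefit is that an interrupted download leaves only a .part file behind, so a partially written installer is never mistaken for a complete cached one.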

Source code in napt/core.py
def discover_recipe(
    recipe_path: Path,
    output_dir: Path | None = None,
    state_file: Path | None = Path("state/versions.json"),
    stateless: bool = False,
) -> DiscoverResult:
    """Discover the latest version by loading config and downloading installer.

    This is the main entry point for the 'napt discover' command. It orchestrates
    the entire discovery workflow using a two-path architecture optimized for
    version-first strategies.

    The function uses duck typing to detect strategy capabilities:

    VERSION-FIRST PATH (if strategy has get_version_info method):

    1. Load effective configuration (org + vendor + recipe merged)
    2. Call strategy.get_version_info() to discover version (no download)
    3. Compare discovered version to cached known_version
    4. If match and file exists -> skip download entirely (fast path!)
    5. If changed or missing -> download installer via download_file()
    6. Update state and return results

    FILE-FIRST PATH (if strategy has only discover_version method):

    1. Load effective configuration (org + vendor + recipe merged)
    2. Call strategy.discover_version() with cached ETag
    3. Strategy handles conditional request (HTTP 304 vs 200)
    4. Extract version from downloaded file
    5. Update state and return results

    Args:
        recipe_path: Path to the recipe YAML file. Must exist and be
            readable. The path is resolved to absolute form.
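        output_dir: Directory to download the installer to. Created if
            it doesn't exist. If None, the directory is read from
            defaults.discover.output_dir in the effective configuration.
            The downloaded file will be named based on the
            Content-Disposition header or URL path.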
        state_file: Path to state file for version tracking
            and ETag caching. Default is "state/versions.json". Set to None
            to disable.
        stateless: If True, disable state tracking (no caching,
            always download). Default is False.

    Returns:
        Discovery results and metadata including version, file path, and SHA-256 hash.

    Raises:
        ConfigError: On missing or invalid configuration fields (no app defined,
            missing 'source.strategy' field, unknown discovery strategy name),
            YAML parse errors (from config loader), or if recipe file doesn't exist.
        NetworkError: On download failures or version extraction errors.

    Example:
        Basic version discovery:
            ```python
            from pathlib import Path
            result = discover_recipe(
                Path("recipes/Google/chrome.yaml"),
                Path("./downloads")
            )
            print(result.version)  # 141.0.7390.123
            ```

        Handling errors:
            ```python
            try:
                result = discover_recipe(Path("invalid.yaml"), Path("."))
            except ConfigError as e:
                print(f"Config error: {e}")
            except NetworkError as e:
                print(f"Network error: {e}")
            ```

    Note:
        Discovery strategies must be registered before they are looked up;
        this function imports the four built-in strategy modules itself,
        which registers them. The strategy type is detected via duck typing
        (a hasattr check for get_version_info). Version-first strategies
        (web_scrape, api_github, api_json) can skip the download entirely
        when the version is unchanged (fast-path optimization). The
        file-first strategy (url_download) uses ETag conditional requests.
        Downloaded files are written atomically (to a .part file that is
        then renamed). Progress output goes to stdout via the download module.

    """
    logger = get_global_logger()

    # Load state file unless running in stateless mode
    state = None
    if not stateless and state_file:
        try:
            state = load_state(state_file)
            logger.verbose("STATE", f"Loaded state from {state_file}")
        except FileNotFoundError:
            logger.verbose("STATE", f"State file not found, will create: {state_file}")
            state = {
                "metadata": {"napt_version": __version__, "schema_version": "2"},
                "apps": {},
            }
        except Exception as err:
            logger.warning("STATE", f"Failed to load state: {err}")
            logger.verbose("STATE", "Continuing without state tracking")
            state = None

    # 1. Load and merge configuration
    logger.step(1, 4, "Loading configuration...")
    config = load_effective_config(recipe_path)

    # Resolve output_dir from config default when not provided by caller.
    if output_dir is None:
        output_dir = Path(config["defaults"]["discover"]["output_dir"])

    # 2. Extract the app configuration
    logger.step(2, 4, "Discovering version...")
    app = config.get("app")
    if not app:
        raise ConfigError(f"No app defined in recipe: {recipe_path}")

    app_name = app.get("name", "Unknown")
    app_id = app.get("id", "unknown-id")

    # 3. Get the discovery strategy name
    source = app.get("source", {})
    strategy_name = source.get("strategy")
    if not strategy_name:
        raise ConfigError(f"No 'source.strategy' defined for app: {app_name}")

    # 4. Get the strategy implementation
    # Import strategies to ensure they're registered
    import napt.discovery.api_github  # noqa: F401
    import napt.discovery.api_json  # noqa: F401
    import napt.discovery.url_download  # noqa: F401
    import napt.discovery.web_scrape  # noqa: F401

    strategy = get_strategy(strategy_name)

    # Get cache for this recipe from state
    cache = None
    if state and app_id:
        cache = state.get("apps", {}).get(app_id)
        if cache:
            logger.verbose("STATE", f"Using cache for {app_id}")
            if cache.get("known_version"):
                logger.verbose(
                    "STATE", f"  Cached version: {cache.get('known_version')}"
                )
            if cache.get("etag"):
                logger.verbose("STATE", f"  Cached ETag: {cache.get('etag')}")

    # 5. Run discovery: version-first or file-first path
    logger.step(3, 4, "Discovering version...")

    # Check if strategy supports version-first (has get_version_info method)
    download_url = None  # Track actual download URL for state file
    if hasattr(strategy, "get_version_info"):
        # VERSION-FIRST PATH (web_scrape, api_github, api_json)
        # Get version without downloading
        version_info = strategy.get_version_info(app)
        download_url = version_info.download_url  # Save for state file

        logger.verbose("DISCOVERY", f"Version discovered: {version_info.version}")

        # Check if we can use cached file (version match + file exists)
        if cache and cache.get("known_version") == version_info.version:
            # Derive file path from URL using same logic as download_file
            file_path = derive_file_path_from_url(
                version_info.download_url, output_dir, app_id
            )

            if file_path.exists():
                # Fast path: version unchanged, file exists, skip download!
                logger.info(
                    "CACHE",
                    f"Version {version_info.version} unchanged, using cached file",
                )
                logger.step(4, 4, "Using cached file...")
                sha256 = cache.get("sha256")
                discovered_version = DiscoveredVersion(
                    version_info.version, version_info.source
                )
                headers = {}  # No download occurred, no headers
            else:
                # File was deleted, re-download
                logger.warning(
                    "CACHE",
                    f"Cached file {file_path} not found, re-downloading",
                )
                logger.step(4, 4, "Downloading installer...")
                dl = download_file(
                    version_info.download_url,
                    output_dir / app_id,
                )
                file_path, sha256, headers = dl.file_path, dl.sha256, dl.headers
                discovered_version = DiscoveredVersion(
                    version_info.version, version_info.source
                )
        else:
            # Version changed or no cache, download new version
            if cache:
                logger.info(
                    "DISCOVERY",
                    (
                        f"Version changed: {cache.get('known_version')} -> "
                        f"{version_info.version}"
                    ),
                )
            logger.step(4, 4, "Downloading installer...")
            dl = download_file(
                version_info.download_url,
                output_dir / app_id,
            )
            file_path, sha256, headers = dl.file_path, dl.sha256, dl.headers
            discovered_version = DiscoveredVersion(
                version_info.version, version_info.source
            )
    else:
        # FILE-FIRST PATH (url_download only)
        # Must download to extract version (or use cached file via ETag)
        logger.step(4, 4, "Fetching installer...")
        discovered_version, file_path, sha256, headers = strategy.discover_version(
            app, output_dir, cache=cache
        )
        download_url = str(app.get("source", {}).get("url", ""))  # Use source.url

    # Update state with discovered information
    if state and app_id and state_file:
        from datetime import UTC, datetime

        if "apps" not in state:
            state["apps"] = {}

        # Extract ETag and Last-Modified from headers for next run
        etag = headers.get("ETag")
        last_modified = headers.get("Last-Modified")

        if etag:
            logger.verbose("STATE", f"Saving ETag for next run: {etag}")
        if last_modified:
            logger.verbose(
                "STATE", f"Saving Last-Modified for next run: {last_modified}"
            )

        # Build cache entry with new schema v2
        cache_entry = {
            "url": download_url
            or "",  # Actual download URL (from version_info or source.url)
            "etag": etag if etag else None,  # Only useful for url_download
            "last_modified": (
                last_modified if last_modified else None
            ),  # Only useful for url_download
            "sha256": sha256,
        }

        # Optional fields
        if discovered_version.version:
            cache_entry["known_version"] = discovered_version.version
        if strategy_name:
            cache_entry["strategy"] = strategy_name

        state["apps"][app_id] = cache_entry

        state["metadata"] = {
            "napt_version": __version__,
            "last_updated": datetime.now(UTC).isoformat(),
            "schema_version": "2",
        }

        try:
            save_state(state, state_file)
            logger.verbose("STATE", f"Updated state file: {state_file}")
        except Exception as err:
            logger.warning("STATE", f"Failed to save state: {err}")

    # 6. Return results
    return DiscoverResult(
        app_name=app_name,
        app_id=app_id,
        strategy=strategy_name,
        version=discovered_version.version,
        version_source=discovered_version.source,
        file_path=file_path,
        sha256=sha256,
        status="success",
    )