discovery

napt.discovery.base

Discovery strategy protocol, registry, and shared helpers.

A discovery strategy answers a single question: "what is the latest version of this app, and where can it be downloaded from?" Strategies return that answer as a RemoteVersion dataclass. They do not download files or touch the cache themselves; the orchestrator does.

Built-in strategies
  • api_github: queries the GitHub releases API for the latest tag.
  • api_json: extracts version and download URL from a JSON endpoint.
  • web_scrape: parses a vendor download page for both fields.

The fourth flow (url_download) is not a registered strategy. It downloads a fixed URL and extracts the version from the file itself, which is a different shape than the strategies in this module. The discovery orchestrator dispatches to that flow directly when a recipe uses strategy: url_download.
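The dispatch described above can be pictured with a small stand-alone sketch. The orchestrator's code is not shown on this page, so `choose_flow`, the `REGISTERED` set, and the returned labels are hypothetical names used only for illustration.

```python
from typing import Any

# Hypothetical stand-in for the names added through register_strategy.
REGISTERED = {"api_github", "api_json", "web_scrape"}


def choose_flow(app_config: dict[str, Any]) -> str:
    """Return which flow a recipe would take (illustrative only)."""
    name = app_config.get("discovery", {}).get("strategy")
    if name == "url_download":
        # File-first: download the file, then read the version from it.
        return "file-first"
    if name in REGISTERED:
        # Version-first: discover() followed by resolve_with_cache().
        return "version-first"
    raise ValueError(f"Unknown discovery strategy: {name!r}")


url_flow = choose_flow({"discovery": {"strategy": "url_download"}})
scrape_flow = choose_flow({"discovery": {"strategy": "web_scrape"}})
```

The point of the sketch is only that `url_download` is checked before any registry lookup, which is why it never needs to be registered.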

Design Philosophy
  • Strategies are typing.Protocol types. Implementations are matched structurally; no inheritance is required.
  • Strategies are pure functions of configuration. They hold no state, perform no file I/O, and have no awareness of the cache.
  • Registration is a side effect of importing each strategy module.
  • The resolve_with_cache helper turns a RemoteVersion into a StrategyResult by checking the cache and downloading if needed. Strategies don't call it themselves; the orchestrator does.
Example

Adding a new strategy to the codebase:

from napt.discovery.base import (
    RemoteVersion, register_strategy,
)

class GitlabReleasesStrategy:
    def discover(self, app_config):
        # Query GitLab API and parse the response...
        return RemoteVersion(
            version="1.2.3",
            download_url="https://gitlab.example.com/.../installer.msi",
            source="gitlab_releases",
        )

    def validate_config(self, app_config):
        errors = []
        if "project" not in app_config.get("discovery", {}):
            errors.append("Missing required field: discovery.project")
        return errors

register_strategy("gitlab_releases", GitlabReleasesStrategy)

RemoteVersion dataclass

Version and download URL discovered from a remote source.

Returned by every DiscoveryStrategy implementation. The orchestrator passes this to resolve_with_cache to decide whether the file needs to be re-downloaded.

Attributes:
  • version (str): Raw version string extracted from the remote source (for example, "140.0.7339.128").
  • download_url (str): URL the installer can be fetched from.
  • source (str): Name of the strategy that produced this result, used for logging and result reporting (for example, "api_github").

Source code in napt/discovery/base.py
@dataclass(frozen=True)
class RemoteVersion:
    """Version and download URL discovered from a remote source.

    Returned by every [DiscoveryStrategy][napt.discovery.base.DiscoveryStrategy]
    implementation. The orchestrator passes this to
    [resolve_with_cache][napt.discovery.base.resolve_with_cache] to decide
    whether the file needs to be re-downloaded.

    Attributes:
        version: Raw version string extracted from the remote source
            (for example, ``"140.0.7339.128"``).
        download_url: URL the installer can be fetched from.
        source: Name of the strategy that produced this result, used
            for logging and result reporting (for example, ``"api_github"``).
    """

    version: str
    download_url: str
    source: str

StrategyResult dataclass

Resolved discovery result, ready to be saved to state.

Returned by both the version-first flow (via resolve_with_cache) and the url_download flow. Captures everything the orchestrator needs to update the state cache and build a public DiscoverResult.

Attributes:
  • version (str): Version string for the resolved file.
  • version_source (str): Strategy name that produced this version (for example, "api_github" or "url_download").
  • file_path (Path): Path to the resolved installer on disk. This is either a freshly downloaded file or a previously cached file when the cache was reused.
  • sha256 (str): SHA-256 hex digest of the resolved file.
  • headers (dict[str, str]): HTTP response headers from the download. Empty when the cache was reused without a network call. Used to persist ETag / Last-Modified for the next conditional request.
  • download_url (str): URL the file came from. Stored in state so that future runs know where to re-fetch from if needed.
  • cached (bool): True when the file was reused from cache; False when it was downloaded.

Source code in napt/discovery/base.py
@dataclass(frozen=True)
class StrategyResult:
    """Resolved discovery result, ready to be saved to state.

    Returned by both the version-first flow (via
    [resolve_with_cache][napt.discovery.base.resolve_with_cache]) and the
    url_download flow. Captures everything the orchestrator needs to
    update the state cache and build a public
    [DiscoverResult][napt.results.DiscoverResult].

    Attributes:
        version: Version string for the resolved file.
        version_source: Strategy name that produced this version
            (for example, ``"api_github"`` or ``"url_download"``).
        file_path: Path to the resolved installer on disk. This is either
            a freshly downloaded file or a previously cached file when the
            cache was reused.
        sha256: SHA-256 hex digest of the resolved file.
        headers: HTTP response headers from the download. Empty when the
            cache was reused without a network call. Used to persist
            ``ETag`` / ``Last-Modified`` for the next conditional request.
        download_url: URL the file came from. Stored in state so that
            future runs know where to re-fetch from if needed.
        cached: True when the file was reused from cache; False when it
            was downloaded.
    """

    version: str
    version_source: str
    file_path: Path
    sha256: str
    headers: dict[str, str]
    download_url: str
    cached: bool

DiscoveryStrategy

Bases: Protocol

Protocol for version discovery strategies.

A strategy queries a remote source (API, web page, etc.) and returns the latest version plus its download URL. Strategies do not download files, touch the cache, or write to disk. Those concerns belong to the orchestrator.

Implementations need only a discover and a validate_config method with the signatures below.

Source code in napt/discovery/base.py
class DiscoveryStrategy(Protocol):
    """Protocol for version discovery strategies.

    A strategy queries a remote source (API, web page, etc.) and returns
    the latest version plus its download URL. Strategies do not download
    files, touch the cache, or write to disk. Those concerns belong to
    the orchestrator.

    Implementations need only a ``discover`` and a ``validate_config``
    method with the signatures below.
    """

    def discover(self, app_config: dict[str, Any]) -> RemoteVersion:
        """Discovers the latest version and its download URL.

        Args:
            app_config: Merged recipe configuration dict.

        Returns:
            Latest version, the URL it can be downloaded from, and the
            strategy's own name as the source identifier.

        Raises:
            ConfigError: On missing or invalid required configuration.
            NetworkError: On HTTP failures or version-extraction errors.

        """
        ...

    def validate_config(self, app_config: dict[str, Any]) -> list[str]:
        """Validates strategy-specific configuration fields without network calls.

        Implementations should check field presence, types, and format only.

        Args:
            app_config: Merged recipe configuration dict.

        Returns:
            Human-readable error messages. Empty when configuration is valid.

        """
        ...
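
A minimal sketch of what structural matching means in practice. The classes below are local stand-ins rather than imports from napt, `FixedStrategy` and `Incomplete` are hypothetical, and `@runtime_checkable` is added here only so that `isinstance` can be used for the demonstration (the documented protocol relies on static type checking).

```python
from dataclasses import dataclass
from typing import Any, Protocol, runtime_checkable


@dataclass(frozen=True)
class RemoteVersion:  # local stand-in mirroring the documented dataclass
    version: str
    download_url: str
    source: str


@runtime_checkable  # assumption: used only to enable isinstance() in this demo
class DiscoveryStrategy(Protocol):
    def discover(self, app_config: dict[str, Any]) -> RemoteVersion: ...

    def validate_config(self, app_config: dict[str, Any]) -> list[str]: ...


class FixedStrategy:  # hypothetical strategy; note that it has no base class
    def discover(self, app_config: dict[str, Any]) -> RemoteVersion:
        return RemoteVersion("1.0.0", "https://example.com/app.msi", "fixed")

    def validate_config(self, app_config: dict[str, Any]) -> list[str]:
        return []


class Incomplete:  # lacks validate_config, so it does not satisfy the protocol
    def discover(self, app_config: dict[str, Any]) -> RemoteVersion:
        return RemoteVersion("1.0.0", "https://example.com/app.msi", "incomplete")


strategy = FixedStrategy()
matches = isinstance(strategy, DiscoveryStrategy)
incomplete_matches = isinstance(Incomplete(), DiscoveryStrategy)
```

A runtime `isinstance` check against a protocol only verifies that the methods exist, not that their signatures agree; signature checking is left to the static type checker.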

discover

discover(app_config: dict[str, Any]) -> RemoteVersion

Discovers the latest version and its download URL.

Parameters:
  • app_config (dict[str, Any], required): Merged recipe configuration dict.

Returns:
  • RemoteVersion: Latest version, the URL it can be downloaded from, and the strategy's own name as the source identifier.

Raises:
  • ConfigError: On missing or invalid required configuration.
  • NetworkError: On HTTP failures or version-extraction errors.

Source code in napt/discovery/base.py
def discover(self, app_config: dict[str, Any]) -> RemoteVersion:
    """Discovers the latest version and its download URL.

    Args:
        app_config: Merged recipe configuration dict.

    Returns:
        Latest version, the URL it can be downloaded from, and the
        strategy's own name as the source identifier.

    Raises:
        ConfigError: On missing or invalid required configuration.
        NetworkError: On HTTP failures or version-extraction errors.

    """
    ...

validate_config

validate_config(app_config: dict[str, Any]) -> list[str]

Validates strategy-specific configuration fields without network calls.

Implementations should check field presence, types, and format only.

Parameters:
  • app_config (dict[str, Any], required): Merged recipe configuration dict.

Returns:
  • list[str]: Human-readable error messages. Empty when configuration is valid.

Source code in napt/discovery/base.py
def validate_config(self, app_config: dict[str, Any]) -> list[str]:
    """Validates strategy-specific configuration fields without network calls.

    Implementations should check field presence, types, and format only.

    Args:
        app_config: Merged recipe configuration dict.

    Returns:
        Human-readable error messages. Empty when configuration is valid.

    """
    ...

register_strategy

register_strategy(name: str, strategy_class: type[DiscoveryStrategy]) -> None

Registers a discovery strategy by name in the global registry.

Strategies call this at module import time so they're available when the orchestrator looks them up. Registering the same name twice overwrites the previous entry (intentional, to allow test monkey-patching).

Parameters:
  • name (str, required): Strategy name. This is the value used in recipe YAML files under discovery.strategy. Use lowercase with underscores.
  • strategy_class (type[DiscoveryStrategy], required): Class implementing DiscoveryStrategy. Type checkers verify protocol compliance statically.

Note

url_download is intentionally not registered here. It runs through a separate code path in the orchestrator because it downloads the file before it can determine the version, which does not fit the version-first contract.

Source code in napt/discovery/base.py
def register_strategy(name: str, strategy_class: type[DiscoveryStrategy]) -> None:
    """Registers a discovery strategy by name in the global registry.

    Strategies call this at module import time so they're available when
    the orchestrator looks them up. Registering the same name twice
    overwrites the previous entry (intentional, to allow test
    monkey-patching).

    Args:
        name: Strategy name. This is the value used in recipe YAML files
            under ``discovery.strategy``. Use lowercase with underscores.
        strategy_class: Class implementing
            [DiscoveryStrategy][napt.discovery.base.DiscoveryStrategy].
            Type checkers verify protocol compliance statically.

    Note:
        ``url_download`` is intentionally not registered here. It runs
        through a separate code path in the orchestrator because it
        downloads the file before it can determine the version, which
        does not fit the version-first contract.

    """
    _STRATEGY_REGISTRY[name] = strategy_class

get_strategy

get_strategy(name: str) -> DiscoveryStrategy

Returns a discovery strategy instance by name from the registry.

Strategies are instantiated on-demand because they are stateless. The strategy's module must already be imported for registration to have happened.

Parameters:
  • name (str, required): Registered strategy name. Case-sensitive.

Returns:
  • DiscoveryStrategy: New instance of the requested strategy.

Raises:
  • ConfigError: If the name is not registered. The message lists the available strategies for troubleshooting.

Source code in napt/discovery/base.py
def get_strategy(name: str) -> DiscoveryStrategy:
    """Returns a discovery strategy instance by name from the registry.

    Strategies are instantiated on-demand because they are stateless. The
    strategy's module must already be imported for registration to have
    happened.

    Args:
        name: Registered strategy name. Case-sensitive.

    Returns:
        New instance of the requested strategy.

    Raises:
        ConfigError: If the name is not registered. The message lists
            the available strategies for troubleshooting.

    """
    if name not in _STRATEGY_REGISTRY:
        available = ", ".join(_STRATEGY_REGISTRY.keys())
        raise ConfigError(
            f"Unknown discovery strategy: {name!r}. Available: {available or '(none)'}"
        )
    return _STRATEGY_REGISTRY[name]()
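
The registry behavior documented above (last registration wins, a fresh instance per lookup, case-sensitive names) can be tried out with a stand-alone sketch. `ConfigError`, `_REGISTRY`, `First`, and `Second` below are local stand-ins defined for this example and are not imported from napt.

```python
class ConfigError(Exception):
    """Local stand-in for napt's ConfigError."""


_REGISTRY: dict[str, type] = {}


def register_strategy(name: str, strategy_class: type) -> None:
    # Registering an existing name silently replaces the previous class.
    _REGISTRY[name] = strategy_class


def get_strategy(name: str):
    if name not in _REGISTRY:
        available = ", ".join(_REGISTRY) or "(none)"
        raise ConfigError(
            f"Unknown discovery strategy: {name!r}. Available: {available}"
        )
    # A new instance is created on every lookup.
    return _REGISTRY[name]()


class First:
    pass


class Second:
    pass


register_strategy("demo", First)
register_strategy("demo", Second)  # overwrites First

a = get_strategy("demo")
b = get_strategy("demo")

try:
    get_strategy("Demo")  # lookup is case-sensitive
except ConfigError as err:
    message = str(err)
```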

resolve_with_cache

resolve_with_cache(
    info: RemoteVersion,
    app_config: dict[str, Any],
    output_dir: Path,
    cache: dict[str, Any] | None,
) -> StrategyResult

Resolves a RemoteVersion to a StrategyResult.

Implements the version-first fast path: when the discovered version matches the cached version and the cached file still exists on disk, the download is skipped entirely. Otherwise the file is downloaded fresh from info.download_url.

Parameters:
  • info (RemoteVersion, required): Version and download URL produced by a strategy's discover call.
  • app_config (dict[str, Any], required): Merged recipe configuration. Used to read id for the per-app download subdirectory.
  • output_dir (Path, required): Base directory to download into. Files land in output_dir / app_id.
  • cache (dict[str, Any] | None, required): Cached state for this recipe (known_version, file_path, sha256), or None when no prior state exists or stateless mode is on.

Returns:
  • StrategyResult: Resolved version, file path, and download metadata. The cached field indicates whether the download was skipped.

Raises:
  • NetworkError: On download failures.

Source code in napt/discovery/base.py
def resolve_with_cache(
    info: RemoteVersion,
    app_config: dict[str, Any],
    output_dir: Path,
    cache: dict[str, Any] | None,
) -> StrategyResult:
    """Resolves a [RemoteVersion][napt.discovery.base.RemoteVersion] to a [StrategyResult][napt.discovery.base.StrategyResult].

    Implements the version-first fast path: when the discovered version
    matches the cached version and the cached file still exists on disk,
    the download is skipped entirely. Otherwise the file is downloaded
    fresh from ``info.download_url``.

    Args:
        info: Version and download URL produced by a strategy's
            [discover][napt.discovery.base.DiscoveryStrategy.discover] call.
        app_config: Merged recipe configuration. Used to read ``id``
            for the per-app download subdirectory.
        output_dir: Base directory to download into. Files land in
            ``output_dir / app_id``.
        cache: Cached state for this recipe (``known_version``,
            ``file_path``, ``sha256``), or ``None`` when no prior state
            exists or stateless mode is on.

    Returns:
        Resolved version, file path, and download metadata. The
        ``cached`` field indicates whether the download was skipped.

    Raises:
        NetworkError: On download failures.

    """
    logger = get_global_logger()
    app_id = app_config["id"]

    if cache and not is_newer(info.version, cache.get("known_version")):
        cached_path_str = cache.get("file_path")
        cached_sha = cache.get("sha256")
        if cached_path_str and cached_sha:
            cached_path = Path(cached_path_str)
            if cached_path.exists():
                logger.info(
                    "CACHE",
                    f"Version {info.version} unchanged, using cached file",
                )
                return StrategyResult(
                    version=info.version,
                    version_source=info.source,
                    file_path=cached_path,
                    sha256=cached_sha,
                    headers={},
                    download_url=info.download_url,
                    cached=True,
                )
            logger.warning(
                "CACHE",
                f"Cached file {cached_path} not found, re-downloading",
            )

    if cache and cache.get("known_version"):
        logger.info(
            "DISCOVERY",
            f"Version changed: {cache.get('known_version')} -> {info.version}",
        )

    dl = download_file(info.download_url, output_dir / app_id)
    return StrategyResult(
        version=info.version,
        version_source=info.source,
        file_path=dl.file_path,
        sha256=dl.sha256,
        headers=dl.headers,
        download_url=info.download_url,
        cached=False,
    )
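
The reuse-or-download decision at the top of `resolve_with_cache` can be isolated into a small predicate for illustration. This is a sketch under stated assumptions: `can_reuse_cache` is a hypothetical helper, and the `is_newer` below is a simplified stand-in that handles only dotted numeric versions and treats a missing known version as "newer"; napt's real `is_newer` is not shown on this page and may behave differently.

```python
import tempfile
from pathlib import Path
from typing import Any, Optional


def _as_tuple(version: str) -> tuple[int, ...]:
    return tuple(int(part) for part in version.split("."))


def is_newer(remote: str, known: Optional[str]) -> bool:
    """Simplified stand-in: dotted numeric versions only."""
    if not known:
        return True
    return _as_tuple(remote) > _as_tuple(known)


def can_reuse_cache(remote_version: str, cache: Optional[dict[str, Any]]) -> bool:
    """Mirror the fast-path conditions: same version, known hash, file on disk."""
    if not cache or is_newer(remote_version, cache.get("known_version")):
        return False
    path, sha = cache.get("file_path"), cache.get("sha256")
    return bool(path and sha and Path(path).exists())


with tempfile.TemporaryDirectory() as tmp:
    installer = Path(tmp) / "app.msi"
    installer.write_bytes(b"placeholder")
    cache = {"known_version": "1.2.3", "file_path": str(installer), "sha256": "abc"}

    same_version = can_reuse_cache("1.2.3", cache)   # fast path applies
    newer_version = can_reuse_cache("1.3.0", cache)  # version changed
    no_cache = can_reuse_cache("1.2.3", None)        # stateless or first run

    installer.unlink()
    file_missing = can_reuse_cache("1.2.3", cache)   # cached file was deleted
```

Only the first case skips the download; every other case falls through to a fresh download from `info.download_url`.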

napt.discovery.url_download

url_download discovery flow.

This module is intentionally not a DiscoveryStrategy. The strategies in napt.discovery.base produce a RemoteVersion from configuration alone (version-first). url_download cannot do that — it has no remote endpoint to query for the version, so it must download the installer and extract the version from the file's metadata. The discovery orchestrator special-cases strategy: url_download and dispatches to run_url_download directly.

Cache Strategy

Uses HTTP conditional requests. If a previous run stored an ETag or Last-Modified header in state, those are sent as If-None-Match / If-Modified-Since on the next request. A server response of HTTP 304 reuses the cached file without a re-download. This is a different mechanism than the version-first strategies, which compare version strings (no HTTP round-trip required to detect "no change" beyond the initial discovery query).
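
As a rough illustration of the header mapping described above, the sketch below builds conditional request headers from cached state. `conditional_headers` is a hypothetical helper written for this example; in napt the equivalent logic sits behind `download_file`, whose internals are not shown on this page. The `etag` and `last_modified` keys follow the cache fields documented for `run_url_download`.

```python
from typing import Any, Optional


def conditional_headers(cache: Optional[dict[str, Any]]) -> dict[str, str]:
    """Map cached validators onto HTTP conditional request headers."""
    headers: dict[str, str] = {}
    if not cache:
        return headers
    if cache.get("etag"):
        headers["If-None-Match"] = cache["etag"]
    if cache.get("last_modified"):
        headers["If-Modified-Since"] = cache["last_modified"]
    return headers


first_run = conditional_headers(None)
later_run = conditional_headers(
    {"etag": '"abc123"', "last_modified": "Wed, 21 Oct 2015 07:28:00 GMT"}
)
```

With no cached validators the request is unconditional, so the server always returns the full file.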

Supported File Types
  • .msi — version is read from the MSI ProductVersion property.
  • Other extensions raise ConfigError. For non-MSI installers, use a version-first strategy.
Recipe Example
discovery:
  strategy: url_download
  url: "https://vendor.example.com/installer.msi"

run_url_download

run_url_download(
    app_config: dict[str, Any],
    output_dir: Path,
    cache: dict[str, Any] | None = None,
) -> StrategyResult

Downloads a fixed URL and extracts the version from the resulting file.

Issues a conditional HTTP request when cache carries an ETag or Last-Modified. On HTTP 304 the cached file is reused; otherwise the fresh download is used. Either way, the version is extracted from the file (MSI ProductVersion today).

Parameters:
  • app_config (dict[str, Any], required): Merged recipe configuration dict containing discovery.url and id.
  • output_dir (Path, required): Base directory to download into. The file lands in output_dir / app_id.
  • cache (dict[str, Any] | None, default None): Cached state for this recipe (etag, last_modified, file_path, sha256), or None when no prior state exists or stateless mode is on.

Returns:
  • StrategyResult: Resolved version, file path, and download metadata. The cached field is True when HTTP 304 was used to reuse the previously downloaded file.

Raises:
  • ConfigError: If discovery.url is missing, or if the downloaded file is not an MSI (version extraction is not supported for other file types).
  • NetworkError: On download or version-extraction failures.

Source code in napt/discovery/url_download.py
def run_url_download(
    app_config: dict[str, Any],
    output_dir: Path,
    cache: dict[str, Any] | None = None,
) -> StrategyResult:
    """Downloads a fixed URL and extracts the version from the resulting file.

    Issues a conditional HTTP request when ``cache`` carries an ``ETag``
    or ``Last-Modified``. On HTTP 304 the cached file is reused; otherwise
    the fresh download is used. Either way, the version is extracted from
    the file (MSI ProductVersion today).

    Args:
        app_config: Merged recipe configuration dict containing
            ``discovery.url`` and ``id``.
        output_dir: Base directory to download into. The file lands
            in ``output_dir / app_id``.
        cache: Cached state for this recipe (``etag``, ``last_modified``,
            ``file_path``, ``sha256``), or ``None`` when no prior state
            exists or stateless mode is on.

    Returns:
        Resolved version, file path, and download metadata. The
        ``cached`` field is True when HTTP 304 was used to reuse the
        previously downloaded file.

    Raises:
        ConfigError: If ``discovery.url`` is missing, or if the
            downloaded file is not an MSI (version extraction is
            not supported for other file types).
        NetworkError: On download or version-extraction failures.

    """
    from napt.logging import get_global_logger

    logger = get_global_logger()
    source = app_config.get("discovery", {})
    url = source.get("url")
    if not url:
        raise ConfigError("url_download strategy requires 'discovery.url' in config")

    app_id = app_config["id"]

    logger.verbose("DISCOVERY", "Strategy: url_download (file-first)")
    logger.verbose("DISCOVERY", f"Source URL: {url}")

    etag = cache.get("etag") if cache else None
    last_modified = cache.get("last_modified") if cache else None
    if etag:
        logger.verbose("DISCOVERY", f"Using cached ETag: {etag}")
    if last_modified:
        logger.verbose("DISCOVERY", f"Using cached Last-Modified: {last_modified}")

    try:
        dl = download_file(
            url,
            output_dir / app_id,
            etag=etag,
            last_modified=last_modified,
        )
    except NotModifiedError:
        return _resolve_not_modified(url, cache, output_dir, app_id, logger)
    except (NetworkError, ConfigError):
        raise
    except Exception as err:
        raise NetworkError(f"Failed to download {url}: {err}") from err

    version = _extract_version(dl.file_path)
    return StrategyResult(
        version=version,
        version_source="url_download",
        file_path=dl.file_path,
        sha256=dl.sha256,
        headers=dl.headers,
        download_url=url,
        cached=False,
    )

validate_url_download_config

validate_url_download_config(app_config: dict[str, Any]) -> list[str]

Validates url_download configuration fields.

Called by napt.validation.validate_config to compose the url_download field rules into the overall recipe validation.

Parameters:
  • app_config (dict[str, Any], required): Merged recipe configuration dict.

Returns:
  • list[str]: Human-readable error messages. Empty when configuration is valid.

Source code in napt/discovery/url_download.py
def validate_url_download_config(app_config: dict[str, Any]) -> list[str]:
    """Validates url_download configuration fields.

    Called by [napt.validation.validate_config][] to compose the
    url_download field rules into the overall recipe validation.

    Args:
        app_config: Merged recipe configuration dict.

    Returns:
        Human-readable error messages. Empty when configuration is valid.

    """
    errors: list[str] = []
    source = app_config.get("discovery", {})

    if "url" not in source:
        errors.append("Missing required field: discovery.url")
    elif not isinstance(source["url"], str):
        errors.append("discovery.url must be a string")
    elif not source["url"].strip():
        errors.append("discovery.url cannot be empty")

    return errors

napt.discovery.web_scrape

Web scraping discovery strategy.

Fetches a vendor download page, locates a download link, and extracts the version from that link's URL. Use this when a vendor has neither a JSON API nor a GitHub releases feed.

Recipe Example (CSS selector — recommended):

discovery:
  strategy: web_scrape
  page_url: "https://www.7-zip.org/download.html"
  link_selector: 'a[href$="-x64.msi"]'
  version_pattern: "7z(\\d{2})(\\d{2})-x64"
  version_format: "{0}.{1}"     # transforms ("25", "01") -> "25.01"

Recipe Example (regex fallback):

discovery:
  strategy: web_scrape
  page_url: "https://vendor.example.com/downloads"
  link_pattern: 'href="(/files/app-v[0-9.]+-x64\\.msi)"'
  version_pattern: "app-v([0-9.]+)-x64"

Configuration Fields
  • page_url (required): URL of the page to scrape.
  • link_selector (optional): CSS selector identifying the download link's <a> element. Recommended over regex.
  • link_pattern (optional): Regex with one capture group around the link URL. Used when a CSS selector cannot pin the link down. Exactly one of link_selector / link_pattern is required.
  • version_pattern (required): Regex applied to the discovered link URL to extract the version. Capture groups are pulled out and combined with version_format.
  • version_format (optional, default "{0}"): Python format string referencing capture groups by index ({0}, {1}, ...). Use this when a single version field needs to be assembled from multiple captures.
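
To see how version_pattern and version_format combine, here is a minimal sketch using only the standard library. `extract_version` is a hypothetical helper that follows the same steps as the strategy (search, collect capture groups, format), and the 7-Zip file URL is an assumed example of what the scraped link might look like.

```python
import re


def extract_version(url: str, version_pattern: str, version_format: str = "{0}") -> str:
    """Apply version_pattern to a URL and assemble the result with version_format."""
    match = re.search(version_pattern, url)
    if match is None:
        raise ValueError(f"{version_pattern!r} did not match {url!r}")
    groups = match.groups()
    # With no capture groups, the whole match is used as the version.
    return version_format.format(*groups) if groups else match.group(0)


# Two capture groups joined by version_format.
seven_zip = extract_version(
    "https://www.7-zip.org/a/7z2501-x64.msi",  # assumed link URL
    r"7z(\d{2})(\d{2})-x64",
    "{0}.{1}",
)

# One capture group with the default format.
single_group = extract_version(
    "https://vendor.example.com/files/app-v1.2.3-x64.msi",
    r"app-v([0-9.]+)-x64",
)
```

Note that in recipe YAML the backslashes are doubled inside double-quoted strings (`"7z(\\d{2})..."`), whereas the Python raw strings above need only one.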
Finding a CSS Selector
  1. Open the download page in Chrome / Edge / Firefox.
  2. Right-click the download link -> Inspect.
  3. Right-click the highlighted element -> Copy -> Copy selector.
  4. Simplify the result. Common shapes:
    • a[href$=".msi"] (links ending in .msi)
    • a[href*="x64"] (links containing "x64")
    • a.download (links with class="download")
Note

The selector / pattern is expected to match exactly one link; if it matches several, the first match is used. Relative URLs in the page are resolved against page_url. CSS selector support requires BeautifulSoup4; the regex fallback does not.
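
The first-match and relative-URL behavior in the note can be checked with a short stand-alone sketch of the regex fallback path. The HTML snippet and `PAGE_URL` are made up for the example; only `re` and `urllib.parse.urljoin` from the standard library are used.

```python
import re
from urllib.parse import urljoin

PAGE_URL = "https://vendor.example.com/downloads"

# Made-up page content with two candidate links.
HTML = """
<a href="/files/app-v1.2.3-x64.msi">Download (64-bit)</a>
<a href="/files/app-v1.2.2-x64.msi">Previous release</a>
"""

pattern = re.compile(r'href="(/files/app-v[0-9.]+-x64\.msi)"')

# search() stops at the first match, so the second link is ignored.
match = pattern.search(HTML)
assert match is not None

# Use the first capture group when the pattern defines one.
href = match.group(1) if pattern.groups > 0 else match.group(0)

# Resolve the relative href against the page URL.
download_url = urljoin(PAGE_URL, href)
```

Because the href starts with `/`, `urljoin` replaces the whole path of `PAGE_URL` rather than appending to it.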

WebScrapeStrategy

Discovery strategy for scraping vendor download pages.

Source code in napt/discovery/web_scrape.py
class WebScrapeStrategy:
    """Discovery strategy for scraping vendor download pages."""

    def discover(self, app_config: dict[str, Any]) -> RemoteVersion:
        r"""Discovers version and download URL by scraping a vendor page.

        Fetches ``discovery.page_url``, locates a download link with
        either ``link_selector`` (CSS) or ``link_pattern`` (regex),
        and extracts the version from the matched link using
        ``version_pattern``.

        Args:
            app_config: Merged recipe configuration dict containing
                ``discovery.page_url``, exactly one of
                ``discovery.link_selector`` or ``discovery.link_pattern``,
                and ``discovery.version_pattern``.

        Returns:
            Discovered version, the matched link's URL, and
            ``"web_scrape"`` as the source identifier.

        Raises:
            ConfigError: On missing required configuration or when
                a selector / pattern matches nothing.
            NetworkError: On page fetch failure.

        """
        from napt.logging import get_global_logger

        logger = get_global_logger()
        # Validate configuration
        source = app_config.get("discovery", {})
        page_url = source.get("page_url")
        if not page_url:
            raise ConfigError(
                "web_scrape strategy requires 'discovery.page_url' in config"
            )

        link_selector = source.get("link_selector")
        link_pattern = source.get("link_pattern")

        if not link_selector and not link_pattern:
            raise ConfigError(
                "web_scrape strategy requires either 'discovery.link_selector' or "
                "'discovery.link_pattern' in config"
            )

        version_pattern = source.get("version_pattern")
        if not version_pattern:
            raise ConfigError(
                "web_scrape strategy requires 'discovery.version_pattern' in config"
            )

        version_format = source.get("version_format", _DEFAULT_VERSION_FORMAT)

        logger.verbose("DISCOVERY", "Strategy: web_scrape (version-first)")
        logger.verbose("DISCOVERY", f"Page URL: {page_url}")
        if link_selector:
            logger.verbose("DISCOVERY", f"Link selector (CSS): {link_selector}")
        if link_pattern:
            logger.verbose("DISCOVERY", f"Link pattern (regex): {link_pattern}")
        logger.verbose("DISCOVERY", f"Version pattern: {version_pattern}")

        # Download the HTML page
        logger.verbose("DISCOVERY", f"Fetching page: {page_url}")
        try:
            response = requests.get(page_url, timeout=30)
        except requests.exceptions.RequestException as err:
            raise NetworkError(f"Failed to fetch page: {err}") from err

        if not response.ok:
            raise NetworkError(
                f"Failed to fetch page: {response.status_code} {response.reason}"
            )

        html_content = response.text
        logger.verbose("DISCOVERY", f"Page fetched ({len(html_content)} bytes)")

        # Find download link using CSS selector or regex
        download_url = None

        if link_selector:
            # Use CSS selector with BeautifulSoup4
            soup = BeautifulSoup(html_content, "html.parser")
            element = soup.select_one(link_selector)

            if not element:
                raise ConfigError(
                    f"CSS selector {link_selector!r} did not match any elements on page"
                )

            # Get href attribute
            href = element.get("href")
            if not isinstance(href, str) or not href:
                raise ConfigError(
                    f"Element matched by {link_selector!r} has no href attribute"
                )

            logger.verbose("DISCOVERY", f"Found link via CSS: {href}")

            # Build absolute URL
            download_url = urljoin(page_url, href)

        elif link_pattern:
            # Use regex fallback
            try:
                pattern = re.compile(link_pattern)
                match = pattern.search(html_content)

                if not match:
                    raise ConfigError(
                        f"Regex pattern {link_pattern!r} did not match anything on page"
                    )

                # Get first capture group or full match
                if pattern.groups > 0:
                    href = match.group(1)
                else:
                    href = match.group(0)

                logger.verbose("DISCOVERY", f"Found link via regex: {href}")

                # Build absolute URL
                download_url = urljoin(page_url, href)

            except re.error as err:
                raise ConfigError(
                    f"Invalid link_pattern regex: {link_pattern!r}"
                ) from err

        else:
            raise ConfigError(
                "web_scrape strategy requires either 'discovery.link_selector' or "
                "'discovery.link_pattern' in config"
            )

        logger.verbose("DISCOVERY", f"Download URL: {download_url}")

        # Extract version from the download URL
        try:
            version_regex = re.compile(version_pattern)
            match = version_regex.search(download_url)

            if not match:
                raise ConfigError(
                    f"Version pattern {version_pattern!r} did not match "
                    f"URL {download_url!r}"
                )

            # Get captured groups
            groups = match.groups()

            if not groups:
                # No capture groups, use full match
                version_str = match.group(0)
            else:
                # Format using captured groups
                try:
                    version_str = version_format.format(*groups)
                except (IndexError, KeyError) as err:
                    raise ConfigError(
                        f"version_format {version_format!r} failed with "
                        f"groups {groups}: {err}"
                    ) from err

        except re.error as err:
            raise ConfigError(
                f"Invalid version_pattern regex: {version_pattern!r}"
            ) from err

        logger.verbose("DISCOVERY", f"Extracted version: {version_str}")

        return RemoteVersion(
            version=version_str,
            download_url=download_url,
            source="web_scrape",
        )

    def validate_config(self, app_config: dict[str, Any]) -> list[str]:
        """Validate web_scrape strategy configuration.

        Checks for required fields and correct types without making network calls.

        Args:
            app_config: The app configuration from the recipe.

        Returns:
            List of error messages (empty if valid).

        """
        errors = []
        source = app_config.get("discovery", {})

        # Check page_url
        if "page_url" not in source:
            errors.append("Missing required field: discovery.page_url")
        elif not isinstance(source["page_url"], str):
            errors.append("discovery.page_url must be a string")
        elif not source["page_url"].strip():
            errors.append("discovery.page_url cannot be empty")

        # Check that at least one link finding method is provided
        link_selector = source.get("link_selector")
        link_pattern = source.get("link_pattern")

        if not link_selector and not link_pattern:
            errors.append(
                "Missing required field: must provide either "
                "discovery.link_selector or discovery.link_pattern"
            )

        # Validate link_selector if provided
        if link_selector:
            if not isinstance(link_selector, str):
                errors.append("discovery.link_selector must be a string")
            elif not link_selector.strip():
                errors.append("discovery.link_selector cannot be empty")
            else:
                # Try to validate CSS selector syntax
                try:
                    # Test if selector is parseable
                    soup = BeautifulSoup("<html></html>", "html.parser")
                    soup.select_one(link_selector)  # Will raise if invalid
                except Exception as err:
                    errors.append(f"Invalid CSS selector: {err}")

        # Validate link_pattern if provided
        if link_pattern:
            if not isinstance(link_pattern, str):
                errors.append("discovery.link_pattern must be a string")
            elif not link_pattern.strip():
                errors.append("discovery.link_pattern cannot be empty")
            else:
                # Validate regex compiles
                try:
                    re.compile(link_pattern)
                except re.error as err:
                    errors.append(f"Invalid link_pattern regex: {err}")

        # Check version_pattern
        if "version_pattern" not in source:
            errors.append("Missing required field: discovery.version_pattern")
        elif not isinstance(source["version_pattern"], str):
            errors.append("discovery.version_pattern must be a string")
        elif not source["version_pattern"].strip():
            errors.append("discovery.version_pattern cannot be empty")
        else:
            # Validate regex compiles
            try:
                re.compile(source["version_pattern"])
            except re.error as err:
                errors.append(f"Invalid version_pattern regex: {err}")

        # Validate version_format if provided
        if "version_format" in source:
            if not isinstance(source["version_format"], str):
                errors.append("discovery.version_format must be a string")
            elif not source["version_format"].strip():
                errors.append("discovery.version_format cannot be empty")

        return errors

discover

discover(app_config: dict[str, Any]) -> RemoteVersion

Discovers version and download URL by scraping a vendor page.

Fetches discovery.page_url, locates a download link with either link_selector (CSS) or link_pattern (regex), and extracts the version from the matched link using version_pattern.

Parameters:

Name Type Description Default
app_config dict[str, Any]

Merged recipe configuration dict containing discovery.page_url, at least one of discovery.link_selector or discovery.link_pattern (the selector is used when both are set), and discovery.version_pattern.

required

Returns:

Type Description
RemoteVersion

Discovered version, the matched link's URL, and "web_scrape" as the source identifier.

Raises:

Type Description
ConfigError

On missing required configuration or when a selector / pattern matches nothing.

NetworkError

On page fetch failure.
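
The regex path through discover can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the strategy itself: the helper name extract_link_and_version, the HTML snippet, and the patterns are invented for the example, and the "{0}" default for version_format is an assumption (the real default lives in _DEFAULT_VERSION_FORMAT, whose value is not shown on this page).

```python
import re
from urllib.parse import urljoin


def extract_link_and_version(page_url, html, link_pattern, version_pattern,
                             version_format="{0}"):
    """Hypothetical helper mirroring the regex branch of discover()."""
    link_re = re.compile(link_pattern)
    link_match = link_re.search(html)
    if link_match is None:
        raise ValueError(f"{link_pattern!r} did not match anything on page")

    # First capture group if the pattern defines one, otherwise the full match.
    href = link_match.group(1) if link_re.groups > 0 else link_match.group(0)

    # Relative links are resolved against the page URL.
    download_url = urljoin(page_url, href)

    version_match = re.search(version_pattern, download_url)
    if version_match is None:
        raise ValueError(f"{version_pattern!r} did not match {download_url!r}")

    groups = version_match.groups()
    # No capture groups: use the full match. Otherwise fill version_format.
    version = version_format.format(*groups) if groups else version_match.group(0)
    return download_url, version


# Illustrative page content, not taken from a real vendor site.
html = '<a class="dl" href="/files/app-7_2_1-x64.msi">Download</a>'
url, version = extract_link_and_version(
    "https://vendor.example.com/download",
    html,
    link_pattern=r'href="([^"]+\.msi)"',
    version_pattern=r"app-(\d+)_(\d+)_(\d+)",
    version_format="{0}.{1}.{2}",
)
print(url)
print(version)
```

The positional placeholders in version_format correspond to the capture groups of version_pattern in order, which is what lets a recipe turn an underscore-separated filename into a dotted version.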

Source code in napt/discovery/web_scrape.py
def discover(self, app_config: dict[str, Any]) -> RemoteVersion:
    r"""Discovers version and download URL by scraping a vendor page.

    Fetches ``discovery.page_url``, locates a download link with
    either ``link_selector`` (CSS) or ``link_pattern`` (regex),
    and extracts the version from the matched link using
    ``version_pattern``.

    Args:
        app_config: Merged recipe configuration dict containing
            ``discovery.page_url``, at least one of
            ``discovery.link_selector`` or ``discovery.link_pattern``
            (the selector is used when both are set), and
            ``discovery.version_pattern``.

    Returns:
        Discovered version, the matched link's URL, and
        ``"web_scrape"`` as the source identifier.

    Raises:
        ConfigError: On missing required configuration or when
            a selector / pattern matches nothing.
        NetworkError: On page fetch failure.

    """
    from napt.logging import get_global_logger

    logger = get_global_logger()
    # Validate configuration
    source = app_config.get("discovery", {})
    page_url = source.get("page_url")
    if not page_url:
        raise ConfigError(
            "web_scrape strategy requires 'discovery.page_url' in config"
        )

    link_selector = source.get("link_selector")
    link_pattern = source.get("link_pattern")

    if not link_selector and not link_pattern:
        raise ConfigError(
            "web_scrape strategy requires either 'discovery.link_selector' or "
            "'discovery.link_pattern' in config"
        )

    version_pattern = source.get("version_pattern")
    if not version_pattern:
        raise ConfigError(
            "web_scrape strategy requires 'discovery.version_pattern' in config"
        )

    version_format = source.get("version_format", _DEFAULT_VERSION_FORMAT)

    logger.verbose("DISCOVERY", "Strategy: web_scrape (version-first)")
    logger.verbose("DISCOVERY", f"Page URL: {page_url}")
    if link_selector:
        logger.verbose("DISCOVERY", f"Link selector (CSS): {link_selector}")
    if link_pattern:
        logger.verbose("DISCOVERY", f"Link pattern (regex): {link_pattern}")
    logger.verbose("DISCOVERY", f"Version pattern: {version_pattern}")

    # Download the HTML page
    logger.verbose("DISCOVERY", f"Fetching page: {page_url}")
    try:
        response = requests.get(page_url, timeout=30)
    except requests.exceptions.RequestException as err:
        raise NetworkError(f"Failed to fetch page: {err}") from err

    if not response.ok:
        raise NetworkError(
            f"Failed to fetch page: {response.status_code} {response.reason}"
        )

    html_content = response.text
    logger.verbose("DISCOVERY", f"Page fetched ({len(html_content)} characters)")

    # Find download link using CSS selector or regex
    download_url = None

    if link_selector:
        # Use CSS selector with BeautifulSoup4
        soup = BeautifulSoup(html_content, "html.parser")
        element = soup.select_one(link_selector)

        if not element:
            raise ConfigError(
                f"CSS selector {link_selector!r} did not match any elements on page"
            )

        # Get href attribute
        href = element.get("href")
        if not isinstance(href, str) or not href:
            raise ConfigError(
                f"Element matched by {link_selector!r} has no href attribute"
            )

        logger.verbose("DISCOVERY", f"Found link via CSS: {href}")

        # Build absolute URL
        download_url = urljoin(page_url, href)

    elif link_pattern:
        # Use regex fallback
        try:
            pattern = re.compile(link_pattern)
            match = pattern.search(html_content)

            if not match:
                raise ConfigError(
                    f"Regex pattern {link_pattern!r} did not match anything on page"
                )

            # Get first capture group or full match
            if pattern.groups > 0:
                href = match.group(1)
            else:
                href = match.group(0)

            logger.verbose("DISCOVERY", f"Found link via regex: {href}")

            # Build absolute URL
            download_url = urljoin(page_url, href)

        except re.error as err:
            raise ConfigError(
                f"Invalid link_pattern regex: {link_pattern!r}"
            ) from err

    else:
        raise ConfigError(
            "web_scrape strategy requires either 'discovery.link_selector' or "
            "'discovery.link_pattern' in config"
        )

    logger.verbose("DISCOVERY", f"Download URL: {download_url}")

    # Extract version from the download URL
    try:
        version_regex = re.compile(version_pattern)
        match = version_regex.search(download_url)

        if not match:
            raise ConfigError(
                f"Version pattern {version_pattern!r} did not match "
                f"URL {download_url!r}"
            )

        # Get captured groups
        groups = match.groups()

        if not groups:
            # No capture groups, use full match
            version_str = match.group(0)
        else:
            # Format using captured groups
            try:
                version_str = version_format.format(*groups)
            except (IndexError, KeyError) as err:
                raise ConfigError(
                    f"version_format {version_format!r} failed with "
                    f"groups {groups}: {err}"
                ) from err

    except re.error as err:
        raise ConfigError(
            f"Invalid version_pattern regex: {version_pattern!r}"
        ) from err

    logger.verbose("DISCOVERY", f"Extracted version: {version_str}")

    return RemoteVersion(
        version=version_str,
        download_url=download_url,
        source="web_scrape",
    )

validate_config

validate_config(app_config: dict[str, Any]) -> list[str]

Validate web_scrape strategy configuration.

Checks for required fields and correct types without making network calls.

Parameters:

Name Type Description Default
app_config dict[str, Any]

The app configuration from the recipe.

required

Returns:

Type Description
list[str]

List of error messages (empty if valid).
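
The validator collects every problem it finds instead of stopping at the first one, so a recipe author sees all mistakes in a single pass. The sketch below shows that pattern for regex fields only. The helper check_regex_field and the sample configuration are hypothetical and stand in for the checks in the listing that follows; they are not part of napt.

```python
import re


def check_regex_field(source, name, required=True):
    """Hypothetical helper: return a list of problems for one regex field."""
    if name not in source:
        return [f"Missing required field: discovery.{name}"] if required else []
    value = source[name]
    if not isinstance(value, str):
        return [f"discovery.{name} must be a string"]
    if not value.strip():
        return [f"discovery.{name} cannot be empty"]
    try:
        re.compile(value)
    except re.error as err:
        return [f"Invalid {name} regex: {err}"]
    return []


# Invented configuration with two deliberate mistakes.
discovery = {
    "page_url": "https://vendor.example.com/download",
    "link_pattern": r'href="([^"]+\.msi"',  # unbalanced parenthesis
    "version_pattern": 42,                  # wrong type
}
errors = (
    check_regex_field(discovery, "link_pattern", required=False)
    + check_regex_field(discovery, "version_pattern")
)
for message in errors:
    print(message)
```

Both problems are reported together, and no network call is needed to find them.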

Source code in napt/discovery/web_scrape.py
def validate_config(self, app_config: dict[str, Any]) -> list[str]:
    """Validate web_scrape strategy configuration.

    Checks for required fields and correct types without making network calls.

    Args:
        app_config: The app configuration from the recipe.

    Returns:
        List of error messages (empty if valid).

    """
    errors = []
    source = app_config.get("discovery", {})

    # Check page_url
    if "page_url" not in source:
        errors.append("Missing required field: discovery.page_url")
    elif not isinstance(source["page_url"], str):
        errors.append("discovery.page_url must be a string")
    elif not source["page_url"].strip():
        errors.append("discovery.page_url cannot be empty")

    # Check that at least one link finding method is provided
    link_selector = source.get("link_selector")
    link_pattern = source.get("link_pattern")

    if not link_selector and not link_pattern:
        errors.append(
            "Missing required field: must provide either "
            "discovery.link_selector or discovery.link_pattern"
        )

    # Validate link_selector if provided
    if link_selector:
        if not isinstance(link_selector, str):
            errors.append("discovery.link_selector must be a string")
        elif not link_selector.strip():
            errors.append("discovery.link_selector cannot be empty")
        else:
            # Try to validate CSS selector syntax
            try:
                # Test if selector is parseable
                soup = BeautifulSoup("<html></html>", "html.parser")
                soup.select_one(link_selector)  # Will raise if invalid
            except Exception as err:
                errors.append(f"Invalid CSS selector: {err}")

    # Validate link_pattern if provided
    if link_pattern:
        if not isinstance(link_pattern, str):
            errors.append("discovery.link_pattern must be a string")
        elif not link_pattern.strip():
            errors.append("discovery.link_pattern cannot be empty")
        else:
            # Validate regex compiles
            try:
                re.compile(link_pattern)
            except re.error as err:
                errors.append(f"Invalid link_pattern regex: {err}")

    # Check version_pattern
    if "version_pattern" not in source:
        errors.append("Missing required field: discovery.version_pattern")
    elif not isinstance(source["version_pattern"], str):
        errors.append("discovery.version_pattern must be a string")
    elif not source["version_pattern"].strip():
        errors.append("discovery.version_pattern cannot be empty")
    else:
        # Validate regex compiles
        try:
            re.compile(source["version_pattern"])
        except re.error as err:
            errors.append(f"Invalid version_pattern regex: {err}")

    # Validate version_format if provided
    if "version_format" in source:
        if not isinstance(source["version_format"], str):
            errors.append("discovery.version_format must be a string")
        elif not source["version_format"].strip():
            errors.append("discovery.version_format cannot be empty")

    return errors

napt.discovery.api_github

GitHub releases discovery strategy.

Queries the GitHub releases API for the latest tag and the download URL of a matching asset. The version comes from the release tag (parsed with a regex); the download URL comes from the first asset whose filename matches asset_pattern.

Recipe Example
discovery:
  strategy: api_github
  repo: "git-for-windows/git"            # required, "owner/name"
  asset_pattern: "Git-.*-64-bit\\.exe$"  # required, regex on asset filename
  version_pattern: "v?([0-9.]+)"         # optional, default strips "v"
  prerelease: false                      # optional, default false
  token: "${GITHUB_TOKEN}"               # optional, supports env expansion
Configuration Fields
  • repo (required): GitHub repo as "owner/name".
  • asset_pattern (required): Regex matched against asset filename. First match wins. Case-sensitive by default; prefix with (?i) for case-insensitive matching.
  • version_pattern (optional): Regex for extracting the version from the release tag. Uses a named group (?P<version>...) or capture group 1 if present, otherwise the full match. Default: v?([0-9.]+).
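  • prerelease (optional, default false): When true, a latest release that is flagged as a pre-release is accepted. When false, discovery raises an error if the latest release is a pre-release.
  • token (optional): GitHub personal access token. Raises the API rate limit from 60 to 5000 requests/hour. A value written entirely as ${ENV_VAR} is replaced with that environment variable; if the variable is unset, the request is sent unauthenticated. For public repos the token needs no special permissions.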
Note

Discovery inspects only the single release returned by GitHub's /releases/latest endpoint. If no asset matches, or that release is a pre-release while prerelease: false, discovery raises an error rather than walking back through history.
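
The group precedence for version_pattern and the first-match rule for asset_pattern can be tried out in isolation. The following is a small sketch, assuming nothing beyond the standard library; the helper names, tags, and filenames are invented for illustration and do not come from a real release.

```python
import re


def version_from_tag(tag, version_pattern=r"v?([0-9.]+)"):
    """Illustrative helper: named group 'version', then group 1, then full match."""
    pattern = re.compile(version_pattern)
    match = pattern.search(tag)
    if match is None:
        raise ValueError(f"{version_pattern!r} did not match tag {tag!r}")
    if "version" in pattern.groupindex:
        return match.group("version")
    if pattern.groups > 0:
        return match.group(1)
    return match.group(0)


def first_matching_asset(names, asset_pattern):
    """Illustrative helper: the first filename the pattern finds, else None."""
    pattern = re.compile(asset_pattern)
    return next((name for name in names if pattern.search(name)), None)


# Sample tags and filenames are invented for the example.
default_version = version_from_tag("v2.47.1")
named_version = version_from_tag("release-2.47.1", r"release-(?P<version>[0-9.]+)")
suffixed_version = version_from_tag("v2.47.1.windows.1")
anchored_version = version_from_tag("v2.47.1.windows.1", r"v?(\d+(?:\.\d+)*)")

names = ["Git-2.47.1-32-bit.exe", "GIT-2.47.1-64-BIT.EXE", "Git-2.47.1-64-bit.exe"]
case_sensitive = first_matching_asset(names, r"Git-.*-64-bit\.exe$")
case_insensitive = first_matching_asset(names, r"(?i)Git-.*-64-bit\.exe$")

print(default_version, named_version, repr(suffixed_version), anchored_version)
print(case_sensitive, case_insensitive)
```

One detail worth noticing: because the default pattern's character class includes the dot, a tag with a textual suffix such as "v2.47.1.windows.1" yields "2.47.1." with a trailing dot. A recipe for such tags may want a tighter version_pattern like the anchored one above.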

ApiGithubStrategy

Discovery strategy for GitHub releases.

Source code in napt/discovery/api_github.py
class ApiGithubStrategy:
    """Discovery strategy for GitHub releases."""

    def discover(self, app_config: dict[str, Any]) -> RemoteVersion:
        r"""Discovers the latest GitHub release version and asset download URL.

        Queries the GitHub releases API for the latest release of the
        configured repository. Extracts the version from the release tag
        (via ``version_pattern``) and the download URL from the first
        asset matching ``asset_pattern``.

        Args:
            app_config: Merged recipe configuration dict containing
                ``discovery.repo`` and ``discovery.asset_pattern``,
                plus optional ``version_pattern``, ``prerelease``, and
                ``token`` fields.

        Returns:
            Latest version, the matched asset's download URL, and
            ``"api_github"`` as the source identifier.

        Raises:
            ConfigError: On missing or malformed required configuration,
                or when patterns do not match the release.
            NetworkError: On API failure, missing assets, or rejected
                pre-releases.

        """
        from napt.logging import get_global_logger

        logger = get_global_logger()
        # Validate configuration
        source = app_config.get("discovery", {})
        repo = source.get("repo")
        if not repo:
            raise ConfigError("api_github strategy requires 'discovery.repo' in config")

        # Validate repo format: exactly one "/" separating owner and name
        if repo.count("/") != 1:
            raise ConfigError(
                f"Invalid repo format: {repo!r}. Expected 'owner/repository'"
            )

        # Required: regex that selects the release asset
        asset_pattern = source.get("asset_pattern")
        if not asset_pattern:
            raise ConfigError(
                "api_github strategy requires 'discovery.asset_pattern' in config"
            )

        # Optional configuration
        version_pattern = source.get("version_pattern", _DEFAULT_VERSION_PATTERN)
        prerelease = source.get("prerelease", _DEFAULT_PRERELEASE)
        token = source.get("token")

        # Expand environment variables in token (e.g., ${GITHUB_TOKEN})
        if token:
            if token.startswith("${") and token.endswith("}"):
                env_var = token[2:-1]
                token = os.environ.get(env_var)
                if not token:
                    logger.verbose(
                        "DISCOVERY",
                        f"Warning: Environment variable {env_var} not set",
                    )

        logger.verbose("DISCOVERY", "Strategy: api_github (version-first)")
        logger.verbose("DISCOVERY", f"Repository: {repo}")
        logger.verbose("DISCOVERY", f"Version pattern: {version_pattern}")
        if asset_pattern:
            logger.verbose("DISCOVERY", f"Asset pattern: {asset_pattern}")
        if prerelease:
            logger.verbose("DISCOVERY", "Including pre-releases")

        # Fetch latest release from GitHub API
        api_url = f"https://api.github.com/repos/{repo}/releases/latest"
        headers = {
            "Accept": "application/vnd.github+json",
            "X-GitHub-Api-Version": "2022-11-28",
        }

        # Add authentication if token provided
        if token:
            headers["Authorization"] = f"token {token}"
            logger.verbose("DISCOVERY", "Using authenticated API request")

        logger.verbose("DISCOVERY", f"Fetching release from: {api_url}")

        try:
            response = requests.get(api_url, headers=headers, timeout=30)
        except requests.exceptions.RequestException as err:
            raise NetworkError(f"Failed to fetch GitHub release: {err}") from err

        if response.status_code == 404:
            raise NetworkError(f"Repository {repo!r} not found or has no releases")
        elif response.status_code == 403:
            raise NetworkError(
                f"GitHub API rate limit exceeded. Consider using a token. "
                f"Status: {response.status_code}"
            )
        elif not response.ok:
            raise NetworkError(
                f"GitHub API request failed: {response.status_code} "
                f"{response.reason}"
            )

        release_data = response.json()

        # Check if this is a prerelease and we don't want those
        if release_data.get("prerelease", False) and not prerelease:
            raise NetworkError(
                f"Latest release is a pre-release and prerelease=false. "
                f"Tag: {release_data.get('tag_name')}"
            )

        # Extract version from tag name
        tag_name = release_data.get("tag_name", "")
        if not tag_name:
            raise NetworkError("Release has no tag_name field")

        logger.verbose("DISCOVERY", f"Release tag: {tag_name}")

        try:
            pattern = re.compile(version_pattern)
            match = pattern.search(tag_name)
            if not match:
                raise ConfigError(
                    f"Version pattern {version_pattern!r} did not match "
                    f"tag {tag_name!r}"
                )

            # Try to get named capture group 'version' first, else use group 1,
            # else full match
            if "version" in pattern.groupindex:
                version_str = match.group("version")
            elif pattern.groups > 0:
                version_str = match.group(1)
            else:
                version_str = match.group(0)

        except re.error as err:
            raise ConfigError(
                f"Invalid version_pattern regex: {version_pattern!r}"
            ) from err
        except (ValueError, IndexError) as err:
            raise ConfigError(
                f"Failed to extract version from tag {tag_name!r} "
                f"using pattern {version_pattern!r}: {err}"
            ) from err

        logger.verbose("DISCOVERY", f"Extracted version: {version_str}")

        # Find matching asset
        assets = release_data.get("assets", [])
        if not assets:
            raise NetworkError(
                f"Release {tag_name} has no assets. "
                f"Check if assets were uploaded to the release."
            )

        logger.verbose("DISCOVERY", f"Release has {len(assets)} asset(s)")

        # Match asset by pattern
        matched_asset = None
        try:
            pattern = re.compile(asset_pattern)
        except re.error as err:
            raise ConfigError(
                f"Invalid asset_pattern regex: {asset_pattern!r}"
            ) from err

        for asset in assets:
            asset_name = asset.get("name", "")
            if pattern.search(asset_name):
                matched_asset = asset
                logger.verbose("DISCOVERY", f"Matched asset: {asset_name}")
                break

        if not matched_asset:
            available = [a.get("name", "(unnamed)") for a in assets]
            raise ConfigError(
                f"No assets matched pattern {asset_pattern!r}. "
                f"Available assets: {', '.join(available)}"
            )

        # Get download URL
        download_url = matched_asset.get("browser_download_url")
        if not download_url:
            raise NetworkError(f"Asset {matched_asset.get('name')} has no download URL")

        logger.verbose("DISCOVERY", f"Download URL: {download_url}")

        return RemoteVersion(
            version=version_str,
            download_url=download_url,
            source="api_github",
        )

    def validate_config(self, app_config: dict[str, Any]) -> list[str]:
        """Validate api_github strategy configuration.

        Checks for required fields and correct types without making network calls.

        Args:
            app_config: The app configuration from the recipe.

        Returns:
            List of error messages (empty if valid).

        """
        errors = []
        source = app_config.get("discovery", {})

        # Check required fields
        if "repo" not in source:
            errors.append("Missing required field: discovery.repo")
        elif not isinstance(source["repo"], str):
            errors.append("discovery.repo must be a string")
        elif not source["repo"].strip():
            errors.append("discovery.repo cannot be empty")
        else:
            # Validate repo format
            repo = source["repo"]
            if repo.count("/") != 1:
                errors.append(
                    "discovery.repo must be in format 'owner/repo' (e.g., 'git/git')"
                )

        if "asset_pattern" not in source:
            errors.append("Missing required field: discovery.asset_pattern")
        elif not isinstance(source["asset_pattern"], str):
            errors.append("discovery.asset_pattern must be a string")
        elif not source["asset_pattern"].strip():
            errors.append("discovery.asset_pattern cannot be empty")
        else:
            # Validate regex pattern syntax
            pattern = source["asset_pattern"]
            import re

            try:
                re.compile(pattern)
            except re.error as err:
                errors.append(f"Invalid asset_pattern regex: {err}")

        # Optional fields validation
        if "version_pattern" in source:
            if not isinstance(source["version_pattern"], str):
                errors.append("discovery.version_pattern must be a string")
            else:
                pattern = source["version_pattern"]
                import re

                try:
                    re.compile(pattern)
                except re.error as err:
                    errors.append(f"Invalid version_pattern regex: {err}")

        return errors

discover

discover(app_config: dict[str, Any]) -> RemoteVersion

Discovers the latest GitHub release version and asset download URL.

Queries the GitHub releases API for the latest release of the configured repository. Extracts the version from the release tag (via version_pattern) and the download URL from the first asset matching asset_pattern.

Parameters:

Name Type Description Default
app_config dict[str, Any]

Merged recipe configuration dict containing discovery.repo and discovery.asset_pattern, plus optional version_pattern, prerelease, and token fields.

required

Returns:

Type Description
RemoteVersion

Latest version, the matched asset's download URL, and "api_github" as the source identifier.

Raises:

Type Description
ConfigError

On missing or malformed required configuration, or when patterns do not match the release.

NetworkError

On API failure, missing assets, or rejected pre-releases.
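
Token handling is the part of discover most likely to surprise a recipe author, so here is a minimal sketch of it. The helper build_github_headers and the fake environment are hypothetical; the header names and the "token" authorization prefix are copied from the listing below.

```python
import os


def build_github_headers(token_setting, environ=os.environ):
    """Hypothetical helper mirroring the token handling in discover()."""
    token = token_setting
    # Only a value that is entirely "${NAME}" is treated as an env reference.
    if token and token.startswith("${") and token.endswith("}"):
        token = environ.get(token[2:-1])

    headers = {
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
    }
    # An unset variable leaves token empty, so the request stays anonymous.
    if token:
        headers["Authorization"] = f"token {token}"
    return headers


# Placeholder value, not a real credential.
fake_env = {"GITHUB_TOKEN": "example-token"}
with_token = build_github_headers("${GITHUB_TOKEN}", fake_env)
missing_var = build_github_headers("${MISSING}", fake_env)
no_token = build_github_headers(None, fake_env)

print(with_token.get("Authorization"))
print("Authorization" in missing_var, "Authorization" in no_token)
```

An unset environment variable does not fail discovery. The request simply falls back to the anonymous rate limit, which is why the strategy only logs a warning in that case.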

Source code in napt/discovery/api_github.py
def discover(self, app_config: dict[str, Any]) -> RemoteVersion:
    r"""Discovers the latest GitHub release version and asset download URL.

    Queries the GitHub releases API for the latest release of the
    configured repository. Extracts the version from the release tag
    (via ``version_pattern``) and the download URL from the first
    asset matching ``asset_pattern``.

    Args:
        app_config: Merged recipe configuration dict containing
            ``discovery.repo`` and ``discovery.asset_pattern``,
            plus optional ``version_pattern``, ``prerelease``, and
            ``token`` fields.

    Returns:
        Latest version, the matched asset's download URL, and
        ``"api_github"`` as the source identifier.

    Raises:
        ConfigError: On missing or malformed required configuration,
            or when patterns do not match the release.
        NetworkError: On API failure, missing assets, or rejected
            pre-releases.

    """
    from napt.logging import get_global_logger

    logger = get_global_logger()
    # Validate configuration
    source = app_config.get("discovery", {})
    repo = source.get("repo")
    if not repo:
        raise ConfigError("api_github strategy requires 'discovery.repo' in config")

    # Validate repo format: exactly one "/" separating owner and name
    if repo.count("/") != 1:
        raise ConfigError(
            f"Invalid repo format: {repo!r}. Expected 'owner/repository'"
        )

    # Required: regex that selects the release asset
    asset_pattern = source.get("asset_pattern")
    if not asset_pattern:
        raise ConfigError(
            "api_github strategy requires 'discovery.asset_pattern' in config"
        )

    # Optional configuration
    version_pattern = source.get("version_pattern", _DEFAULT_VERSION_PATTERN)
    prerelease = source.get("prerelease", _DEFAULT_PRERELEASE)
    token = source.get("token")

    # Expand environment variables in token (e.g., ${GITHUB_TOKEN})
    if token:
        if token.startswith("${") and token.endswith("}"):
            env_var = token[2:-1]
            token = os.environ.get(env_var)
            if not token:
                logger.verbose(
                    "DISCOVERY",
                    f"Warning: Environment variable {env_var} not set",
                )

    logger.verbose("DISCOVERY", "Strategy: api_github (version-first)")
    logger.verbose("DISCOVERY", f"Repository: {repo}")
    logger.verbose("DISCOVERY", f"Version pattern: {version_pattern}")
    if asset_pattern:
        logger.verbose("DISCOVERY", f"Asset pattern: {asset_pattern}")
    if prerelease:
        logger.verbose("DISCOVERY", "Including pre-releases")

    # Fetch latest release from GitHub API
    api_url = f"https://api.github.com/repos/{repo}/releases/latest"
    headers = {
        "Accept": "application/vnd.github+json",
        "X-GitHub-Api-Version": "2022-11-28",
    }

    # Add authentication if token provided
    if token:
        headers["Authorization"] = f"token {token}"
        logger.verbose("DISCOVERY", "Using authenticated API request")

    logger.verbose("DISCOVERY", f"Fetching release from: {api_url}")

    try:
        response = requests.get(api_url, headers=headers, timeout=30)
    except requests.exceptions.RequestException as err:
        raise NetworkError(f"Failed to fetch GitHub release: {err}") from err

    if response.status_code == 404:
        raise NetworkError(f"Repository {repo!r} not found or has no releases")
    elif response.status_code == 403:
        raise NetworkError(
            f"GitHub API rate limit exceeded. Consider using a token. "
            f"Status: {response.status_code}"
        )
    elif not response.ok:
        raise NetworkError(
            f"GitHub API request failed: {response.status_code} "
            f"{response.reason}"
        )

    release_data = response.json()

    # Check if this is a prerelease and we don't want those
    if release_data.get("prerelease", False) and not prerelease:
        raise NetworkError(
            f"Latest release is a pre-release and prerelease=false. "
            f"Tag: {release_data.get('tag_name')}"
        )

    # Extract version from tag name
    tag_name = release_data.get("tag_name", "")
    if not tag_name:
        raise NetworkError("Release has no tag_name field")

    logger.verbose("DISCOVERY", f"Release tag: {tag_name}")

    try:
        pattern = re.compile(version_pattern)
        match = pattern.search(tag_name)
        if not match:
            raise ConfigError(
                f"Version pattern {version_pattern!r} did not match "
                f"tag {tag_name!r}"
            )

        # Try to get named capture group 'version' first, else use group 1,
        # else full match
        if "version" in pattern.groupindex:
            version_str = match.group("version")
        elif pattern.groups > 0:
            version_str = match.group(1)
        else:
            version_str = match.group(0)

    except re.error as err:
        raise ConfigError(
            f"Invalid version_pattern regex: {version_pattern!r}"
        ) from err
    except (ValueError, IndexError) as err:
        raise ConfigError(
            f"Failed to extract version from tag {tag_name!r} "
            f"using pattern {version_pattern!r}: {err}"
        ) from err

    logger.verbose("DISCOVERY", f"Extracted version: {version_str}")

    # Find matching asset
    assets = release_data.get("assets", [])
    if not assets:
        raise NetworkError(
            f"Release {tag_name} has no assets. "
            f"Check if assets were uploaded to the release."
        )

    logger.verbose("DISCOVERY", f"Release has {len(assets)} asset(s)")

    # Match asset by pattern
    matched_asset = None
    try:
        pattern = re.compile(asset_pattern)
    except re.error as err:
        raise ConfigError(
            f"Invalid asset_pattern regex: {asset_pattern!r}"
        ) from err

    for asset in assets:
        asset_name = asset.get("name", "")
        if pattern.search(asset_name):
            matched_asset = asset
            logger.verbose("DISCOVERY", f"Matched asset: {asset_name}")
            break

    if not matched_asset:
        available = [a.get("name", "(unnamed)") for a in assets]
        raise ConfigError(
            f"No assets matched pattern {asset_pattern!r}. "
            f"Available assets: {', '.join(available)}"
        )

    # Get download URL
    download_url = matched_asset.get("browser_download_url")
    if not download_url:
        raise NetworkError(f"Asset {matched_asset.get('name')} has no download URL")

    logger.verbose("DISCOVERY", f"Download URL: {download_url}")

    return RemoteVersion(
        version=version_str,
        download_url=download_url,
        source="api_github",
    )
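
The version-extraction precedence in the source above (named group `version`, then capture group 1, then the whole match) can be tried in isolation. The sketch below is a minimal, standalone illustration: `extract_version` is a hypothetical helper name, the tags and patterns are made up, and a plain `ValueError` stands in for the module's `ConfigError`.

```python
import re


def extract_version(tag_name: str, version_pattern: str) -> str:
    """Return the version captured from a tag, using the same precedence."""
    pattern = re.compile(version_pattern)
    match = pattern.search(tag_name)
    if match is None:
        # Stand-in for ConfigError in the real module (assumption)
        raise ValueError(
            f"pattern {version_pattern!r} did not match tag {tag_name!r}"
        )
    if "version" in pattern.groupindex:  # 1. named group 'version'
        return match.group("version")
    if pattern.groups > 0:  # 2. first capture group
        return match.group(1)
    return match.group(0)  # 3. whole match


# Three illustrative patterns, one per branch
print(extract_version("v2.45.1", r"v(?P<version>\d+\.\d+\.\d+)"))
print(extract_version("release-2.45.1", r"release-([\d.]+)"))
print(extract_version("v2.45.1", r"\d+\.\d+\.\d+"))
```

All three calls print `2.45.1`; only the branch that produces the value differs.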

validate_config

validate_config(app_config: dict[str, Any]) -> list[str]

Validate api_github strategy configuration.

Checks for required fields and correct types without making network calls.

Parameters:

Name        Type            Description                             Default
app_config  dict[str, Any]  The app configuration from the recipe.  required

Returns:

Type       Description
list[str]  List of error messages (empty if valid).

Source code in napt/discovery/api_github.py
def validate_config(self, app_config: dict[str, Any]) -> list[str]:
    """Validate api_github strategy configuration.

    Checks for required fields and correct types without making network calls.

    Args:
        app_config: The app configuration from the recipe.

    Returns:
        List of error messages (empty if valid).

    """
    errors = []
    source = app_config.get("discovery", {})

    # Check required fields
    if "repo" not in source:
        errors.append("Missing required field: discovery.repo")
    elif not isinstance(source["repo"], str):
        errors.append("discovery.repo must be a string")
    elif not source["repo"].strip():
        errors.append("discovery.repo cannot be empty")
    else:
        # Validate repo format
        repo = source["repo"]
        if repo.count("/") != 1:
            errors.append(
                "discovery.repo must be in format 'owner/repo' (e.g., 'git/git')"
            )

    # Function-local import, done once for both regex checks below
    import re

    if "asset_pattern" not in source:
        errors.append("Missing required field: discovery.asset_pattern")
    elif not isinstance(source["asset_pattern"], str):
        errors.append("discovery.asset_pattern must be a string")
    elif not source["asset_pattern"].strip():
        errors.append("discovery.asset_pattern cannot be empty")
    else:
        # Validate regex pattern syntax
        try:
            re.compile(source["asset_pattern"])
        except re.error as err:
            errors.append(f"Invalid asset_pattern regex: {err}")

    # Optional fields validation
    if "version_pattern" in source:
        if not isinstance(source["version_pattern"], str):
            errors.append("discovery.version_pattern must be a string")
        else:
            # Validate regex pattern syntax
            try:
                re.compile(source["version_pattern"])
            except re.error as err:
                errors.append(f"Invalid version_pattern regex: {err}")

    return errors
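
To show what the returned list looks like, here is a trimmed, standalone sketch of the same accumulate-and-return pattern. `validate_github_discovery` is a hypothetical stand-in that covers only `repo` and `asset_pattern`, with condensed messages; it is not the module's method.

```python
import re
from typing import Any


def validate_github_discovery(app_config: dict[str, Any]) -> list[str]:
    """Hypothetical stand-in: checks only discovery.repo and asset_pattern."""
    errors: list[str] = []
    source = app_config.get("discovery", {})

    repo = source.get("repo")
    if not isinstance(repo, str) or not repo.strip():
        errors.append("discovery.repo must be a non-empty string")
    elif repo.count("/") != 1:
        errors.append("discovery.repo must be in format 'owner/repo'")

    asset_pattern = source.get("asset_pattern")
    if not isinstance(asset_pattern, str) or not asset_pattern.strip():
        errors.append("discovery.asset_pattern must be a non-empty string")
    else:
        try:
            re.compile(asset_pattern)  # syntax check only, no matching
        except re.error as err:
            errors.append(f"Invalid asset_pattern regex: {err}")

    return errors


good = {"discovery": {"repo": "git/git", "asset_pattern": r"\.exe$"}}
bad = {"discovery": {"repo": "git", "asset_pattern": "("}}

print(validate_github_discovery(good))  # empty list: nothing to report
for message in validate_github_discovery(bad):
    print(message)  # one line per problem; both problems are reported
```

Collecting every problem before returning, rather than raising on the first one, lets a caller show all recipe errors in a single pass.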

napt.discovery.api_json

JSON API discovery strategy.

Queries a JSON API endpoint for the latest version and download URL. Both fields are extracted from the response using JSONPath expressions.

Recipe Example
discovery:
  strategy: api_json
  api_url: "https://vendor.example.com/api/latest"  # required
  version_path: "version"                           # required, JSONPath
  download_url_path: "download_url"                 # required, JSONPath
  method: "GET"                                     # optional, GET or POST
  headers:                                          # optional
    Authorization: "Bearer ${API_TOKEN}"
    Accept: "application/json"
  body:                                             # optional, POST only
    platform: "windows"
    arch: "x64"
  timeout: 30                                       # optional, seconds

Nested response, with auth header:

discovery:
  strategy: api_json
  api_url: "https://vendor.example.com/api/releases"
  version_path: "stable.version"
  download_url_path: "stable.platforms.windows.x64"
  headers:
    Authorization: "Bearer ${API_TOKEN}"

Configuration Fields
  • api_url (required): JSON endpoint URL.
  • version_path (required): JSONPath expression locating the version string in the response (e.g. "version", "release.version").
  • download_url_path (required): JSONPath expression locating the installer download URL in the response.
  • method (optional, default "GET"): "GET" or "POST".
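  • headers (optional): HTTP headers to send. A string value that consists entirely of one ${ENV_VAR} reference is replaced with that variable's value; if the variable is unset or empty, the header is omitted.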
  • body (optional): Dict sent as a JSON body. Only used when method: POST.
  • timeout (optional, default 30): Request timeout in seconds.
Note

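JSONPath expressions are parsed with the jsonpath-ng library. Environment-variable expansion applies to string values in headers, and only when the whole value is a single ${VAR} reference; a value that embeds a reference inside other text, such as "Bearer ${API_TOKEN}", is sent unchanged by the discover source shown on this page. POST bodies are always sent as application/json.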
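
The header handling can be checked without a network call. The sketch below copies the expansion loop from the `discover` source on this page into a standalone function; `expand_headers` and the `EXAMPLE_*` variable names are hypothetical and exist only for the demonstration.

```python
import os


def expand_headers(headers: dict[str, str]) -> dict[str, str]:
    """Expand whole-value ${VAR} references; omit headers whose variable is unset."""
    expanded = {}
    for key, value in headers.items():
        if isinstance(value, str) and value.startswith("${") and value.endswith("}"):
            env_value = os.environ.get(value[2:-1])
            if env_value:  # unset or empty variable: the header is left out
                expanded[key] = env_value
        else:
            expanded[key] = value  # sent as-is, including embedded references
    return expanded


os.environ["EXAMPLE_API_TOKEN"] = "secret"  # hypothetical variable for the demo
os.environ.pop("EXAMPLE_MISSING", None)

result = expand_headers(
    {
        "X-Api-Key": "${EXAMPLE_API_TOKEN}",
        "X-Other": "${EXAMPLE_MISSING}",
        "Authorization": "Bearer ${EXAMPLE_API_TOKEN}",
        "Accept": "application/json",
    }
)
print(result)
```

In this run `X-Api-Key` becomes `secret`, `X-Other` is dropped, and the `Authorization` value keeps its literal `${EXAMPLE_API_TOKEN}` text because the reference is not the whole value.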

ApiJsonStrategy

Discovery strategy for JSON API endpoints.

Source code in napt/discovery/api_json.py
class ApiJsonStrategy:
    """Discovery strategy for JSON API endpoints."""

    def discover(self, app_config: dict[str, Any]) -> RemoteVersion:
        """Discovers version and download URL from a JSON API endpoint.

        Calls the configured ``api_url`` and extracts the version and
        download URL using JSONPath expressions. The HTTP method,
        headers, and body are configurable so the same strategy works
        for GET and POST endpoints.

        Args:
            app_config: Merged recipe configuration dict containing
                ``discovery.api_url``, ``discovery.version_path``, and
                ``discovery.download_url_path``, plus optional
                ``method``, ``headers``, and ``body`` fields.

        Returns:
            Discovered version, download URL, and ``"api_json"`` as
            the source identifier.

        Raises:
            ConfigError: On missing required configuration or when
                the JSONPath expressions do not match the response.
            NetworkError: On API request failure.

        """
        from napt.logging import get_global_logger

        logger = get_global_logger()
        # Validate configuration
        source = app_config.get("discovery", {})
        api_url = source.get("api_url")
        if not api_url:
            raise ConfigError(
                "api_json strategy requires 'discovery.api_url' in config"
            )

        version_path = source.get("version_path")
        if not version_path:
            raise ConfigError(
                "api_json strategy requires 'discovery.version_path' in config"
            )

        download_url_path = source.get("download_url_path")
        if not download_url_path:
            raise ConfigError(
                "api_json strategy requires 'discovery.download_url_path' in config"
            )

        # Optional configuration
        method = source.get("method", _DEFAULT_METHOD).upper()
        if method not in ("GET", "POST"):
            raise ConfigError(f"Invalid method: {method!r}. Must be 'GET' or 'POST'")

        headers = source.get("headers", {})
        body = source.get("body", {})
        timeout = source.get("timeout", _DEFAULT_TIMEOUT)

        logger.verbose("DISCOVERY", "Strategy: api_json (version-first)")
        logger.verbose("DISCOVERY", f"API URL: {api_url}")
        logger.verbose("DISCOVERY", f"Method: {method}")
        logger.verbose("DISCOVERY", f"Version path: {version_path}")
        logger.verbose("DISCOVERY", f"Download URL path: {download_url_path}")

        # Expand environment variables in headers
        expanded_headers = {}
        for key, value in headers.items():
            if (
                isinstance(value, str)
                and value.startswith("${")
                and value.endswith("}")
            ):
                env_var = value[2:-1]
                env_value = os.environ.get(env_var)
                if not env_value:
                    logger.verbose(
                        "DISCOVERY",
                        f"Warning: Environment variable {env_var} not set",
                    )
                else:
                    expanded_headers[key] = env_value
            else:
                expanded_headers[key] = value

        # Make API request
        logger.verbose("DISCOVERY", f"Calling API: {method} {api_url}")
        try:
            if method == "GET":
                response = requests.get(
                    api_url, headers=expanded_headers, timeout=timeout
                )
            else:  # POST
                response = requests.post(
                    api_url,
                    headers=expanded_headers,
                    json=body,
                    timeout=timeout,
                )
        except requests.exceptions.RequestException as err:
            raise NetworkError(f"Failed to call API: {err}") from err

        if not response.ok:
            raise NetworkError(
                f"API request failed: {response.status_code} {response.reason}"
            )

        logger.verbose("DISCOVERY", f"API response: {response.status_code} OK")

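        # Parse JSON response
        try:
            json_data = response.json()
        except ValueError as err:
            # requests raises a decode error that subclasses ValueError; it is
            # not always the stdlib json.JSONDecodeError (e.g. with simplejson)
            raise NetworkError(
                f"Invalid JSON response from API. Response: {response.text[:200]}"
            ) from err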

        logger.debug("DISCOVERY", f"JSON response: {json.dumps(json_data, indent=2)}")

        # Extract version using JSONPath
        logger.verbose("DISCOVERY", f"Extracting version from path: {version_path}")
        try:
            version_expr = jsonpath_parse(version_path)
            version_matches = version_expr.find(json_data)

            if not version_matches:
                raise ConfigError(
                    f"Version path {version_path!r} did not match anything "
                    f"in API response"
                )

            version_str = str(version_matches[0].value)
        except Exception as err:
            if isinstance(err, ConfigError):
                raise
            raise ConfigError(
                f"Failed to extract version using path {version_path!r}: {err}"
            ) from err

        logger.verbose("DISCOVERY", f"Extracted version: {version_str}")

        # Extract download URL using JSONPath
        logger.verbose(
            "DISCOVERY", f"Extracting download URL from path: {download_url_path}"
        )
        try:
            url_expr = jsonpath_parse(download_url_path)
            url_matches = url_expr.find(json_data)

            if not url_matches:
                raise ConfigError(
                    f"Download URL path {download_url_path!r} did not match "
                    f"anything in API response"
                )

            download_url = str(url_matches[0].value)
        except Exception as err:
            if isinstance(err, ConfigError):
                raise
            raise ConfigError(
                f"Failed to extract download URL using path "
                f"{download_url_path!r}: {err}"
            ) from err

        logger.verbose("DISCOVERY", f"Download URL: {download_url}")

        return RemoteVersion(
            version=version_str,
            download_url=download_url,
            source="api_json",
        )

    def validate_config(self, app_config: dict[str, Any]) -> list[str]:
        """Validate api_json strategy configuration.

        Checks for required fields and correct types without making network calls.

        Args:
            app_config: The app configuration from the recipe.

        Returns:
            List of error messages (empty if valid).

        """
        errors = []
        source = app_config.get("discovery", {})

        # Check required fields
        if "api_url" not in source:
            errors.append("Missing required field: discovery.api_url")
        elif not isinstance(source["api_url"], str):
            errors.append("discovery.api_url must be a string")
        elif not source["api_url"].strip():
            errors.append("discovery.api_url cannot be empty")

        if "version_path" not in source:
            errors.append("Missing required field: discovery.version_path")
        elif not isinstance(source["version_path"], str):
            errors.append("discovery.version_path must be a string")
        elif not source["version_path"].strip():
            errors.append("discovery.version_path cannot be empty")
        else:
            # Validate JSONPath syntax
            from jsonpath_ng import parse as jsonpath_parse

            try:
                jsonpath_parse(source["version_path"])
            except Exception as err:
                errors.append(f"Invalid version_path JSONPath: {err}")

        if "download_url_path" not in source:
            errors.append("Missing required field: discovery.download_url_path")
        elif not isinstance(source["download_url_path"], str):
            errors.append("discovery.download_url_path must be a string")
        elif not source["download_url_path"].strip():
            errors.append("discovery.download_url_path cannot be empty")
        else:
            # Validate JSONPath syntax
            from jsonpath_ng import parse as jsonpath_parse

            try:
                jsonpath_parse(source["download_url_path"])
            except Exception as err:
                errors.append(f"Invalid download_url_path JSONPath: {err}")

        # Optional fields validation
        if "method" in source:
            method = source["method"]
            if not isinstance(method, str):
                errors.append("discovery.method must be a string")
            elif method.upper() not in ["GET", "POST"]:
                errors.append("discovery.method must be 'GET' or 'POST'")

        if "headers" in source and not isinstance(source["headers"], dict):
            errors.append("discovery.headers must be a dictionary")

        if "body" in source and not isinstance(source["body"], dict):
            errors.append("discovery.body must be a dictionary")

        return errors

discover

discover(app_config: dict[str, Any]) -> RemoteVersion

Discovers version and download URL from a JSON API endpoint.

Calls the configured api_url and extracts the version and download URL using JSONPath expressions. The HTTP method, headers, and body are configurable so the same strategy works for GET and POST endpoints.

Parameters:

Name        Type            Description  Default
app_config  dict[str, Any]  Merged recipe configuration dict containing discovery.api_url, discovery.version_path, and discovery.download_url_path, plus optional method, headers, and body fields.  required

Returns:

Type           Description
RemoteVersion  Discovered version, download URL, and "api_json" as the source identifier.

Raises:

Type          Description
ConfigError   On missing required configuration or when the JSONPath expressions do not match the response.
NetworkError  On API request failure.

Source code in napt/discovery/api_json.py
def discover(self, app_config: dict[str, Any]) -> RemoteVersion:
    """Discovers version and download URL from a JSON API endpoint.

    Calls the configured ``api_url`` and extracts the version and
    download URL using JSONPath expressions. The HTTP method,
    headers, and body are configurable so the same strategy works
    for GET and POST endpoints.

    Args:
        app_config: Merged recipe configuration dict containing
            ``discovery.api_url``, ``discovery.version_path``, and
            ``discovery.download_url_path``, plus optional
            ``method``, ``headers``, and ``body`` fields.

    Returns:
        Discovered version, download URL, and ``"api_json"`` as
        the source identifier.

    Raises:
        ConfigError: On missing required configuration or when
            the JSONPath expressions do not match the response.
        NetworkError: On API request failure.

    """
    from napt.logging import get_global_logger

    logger = get_global_logger()
    # Validate configuration
    source = app_config.get("discovery", {})
    api_url = source.get("api_url")
    if not api_url:
        raise ConfigError(
            "api_json strategy requires 'discovery.api_url' in config"
        )

    version_path = source.get("version_path")
    if not version_path:
        raise ConfigError(
            "api_json strategy requires 'discovery.version_path' in config"
        )

    download_url_path = source.get("download_url_path")
    if not download_url_path:
        raise ConfigError(
            "api_json strategy requires 'discovery.download_url_path' in config"
        )

    # Optional configuration
    method = source.get("method", _DEFAULT_METHOD).upper()
    if method not in ("GET", "POST"):
        raise ConfigError(f"Invalid method: {method!r}. Must be 'GET' or 'POST'")

    headers = source.get("headers", {})
    body = source.get("body", {})
    timeout = source.get("timeout", _DEFAULT_TIMEOUT)

    logger.verbose("DISCOVERY", "Strategy: api_json (version-first)")
    logger.verbose("DISCOVERY", f"API URL: {api_url}")
    logger.verbose("DISCOVERY", f"Method: {method}")
    logger.verbose("DISCOVERY", f"Version path: {version_path}")
    logger.verbose("DISCOVERY", f"Download URL path: {download_url_path}")

    # Expand environment variables in headers
    expanded_headers = {}
    for key, value in headers.items():
        if (
            isinstance(value, str)
            and value.startswith("${")
            and value.endswith("}")
        ):
            env_var = value[2:-1]
            env_value = os.environ.get(env_var)
            if not env_value:
                logger.verbose(
                    "DISCOVERY",
                    f"Warning: Environment variable {env_var} not set",
                )
            else:
                expanded_headers[key] = env_value
        else:
            expanded_headers[key] = value

    # Make API request
    logger.verbose("DISCOVERY", f"Calling API: {method} {api_url}")
    try:
        if method == "GET":
            response = requests.get(
                api_url, headers=expanded_headers, timeout=timeout
            )
        else:  # POST
            response = requests.post(
                api_url,
                headers=expanded_headers,
                json=body,
                timeout=timeout,
            )
    except requests.exceptions.RequestException as err:
        raise NetworkError(f"Failed to call API: {err}") from err

    if not response.ok:
        raise NetworkError(
            f"API request failed: {response.status_code} {response.reason}"
        )

    logger.verbose("DISCOVERY", f"API response: {response.status_code} OK")

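    # Parse JSON response
    try:
        json_data = response.json()
    except ValueError as err:
        # requests raises a decode error that subclasses ValueError; it is
        # not always the stdlib json.JSONDecodeError (e.g. with simplejson)
        raise NetworkError(
            f"Invalid JSON response from API. Response: {response.text[:200]}"
        ) from err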

    logger.debug("DISCOVERY", f"JSON response: {json.dumps(json_data, indent=2)}")

    # Extract version using JSONPath
    logger.verbose("DISCOVERY", f"Extracting version from path: {version_path}")
    try:
        version_expr = jsonpath_parse(version_path)
        version_matches = version_expr.find(json_data)

        if not version_matches:
            raise ConfigError(
                f"Version path {version_path!r} did not match anything "
                f"in API response"
            )

        version_str = str(version_matches[0].value)
    except Exception as err:
        if isinstance(err, ConfigError):
            raise
        raise ConfigError(
            f"Failed to extract version using path {version_path!r}: {err}"
        ) from err

    logger.verbose("DISCOVERY", f"Extracted version: {version_str}")

    # Extract download URL using JSONPath
    logger.verbose(
        "DISCOVERY", f"Extracting download URL from path: {download_url_path}"
    )
    try:
        url_expr = jsonpath_parse(download_url_path)
        url_matches = url_expr.find(json_data)

        if not url_matches:
            raise ConfigError(
                f"Download URL path {download_url_path!r} did not match "
                f"anything in API response"
            )

        download_url = str(url_matches[0].value)
    except Exception as err:
        if isinstance(err, ConfigError):
            raise
        raise ConfigError(
            f"Failed to extract download URL using path "
            f"{download_url_path!r}: {err}"
        ) from err

    logger.verbose("DISCOVERY", f"Download URL: {download_url}")

    return RemoteVersion(
        version=version_str,
        download_url=download_url,
        source="api_json",
    )

validate_config

validate_config(app_config: dict[str, Any]) -> list[str]

Validate api_json strategy configuration.

Checks for required fields and correct types without making network calls.

Parameters:

Name        Type            Description                             Default
app_config  dict[str, Any]  The app configuration from the recipe.  required

Returns:

Type       Description
list[str]  List of error messages (empty if valid).

Source code in napt/discovery/api_json.py
def validate_config(self, app_config: dict[str, Any]) -> list[str]:
    """Validate api_json strategy configuration.

    Checks for required fields and correct types without making network calls.

    Args:
        app_config: The app configuration from the recipe.

    Returns:
        List of error messages (empty if valid).

    """
    errors = []
    source = app_config.get("discovery", {})

    # Check required fields
    if "api_url" not in source:
        errors.append("Missing required field: discovery.api_url")
    elif not isinstance(source["api_url"], str):
        errors.append("discovery.api_url must be a string")
    elif not source["api_url"].strip():
        errors.append("discovery.api_url cannot be empty")

    if "version_path" not in source:
        errors.append("Missing required field: discovery.version_path")
    elif not isinstance(source["version_path"], str):
        errors.append("discovery.version_path must be a string")
    elif not source["version_path"].strip():
        errors.append("discovery.version_path cannot be empty")
    else:
        # Validate JSONPath syntax
        from jsonpath_ng import parse as jsonpath_parse

        try:
            jsonpath_parse(source["version_path"])
        except Exception as err:
            errors.append(f"Invalid version_path JSONPath: {err}")

    if "download_url_path" not in source:
        errors.append("Missing required field: discovery.download_url_path")
    elif not isinstance(source["download_url_path"], str):
        errors.append("discovery.download_url_path must be a string")
    elif not source["download_url_path"].strip():
        errors.append("discovery.download_url_path cannot be empty")
    else:
        # Validate JSONPath syntax
        from jsonpath_ng import parse as jsonpath_parse

        try:
            jsonpath_parse(source["download_url_path"])
        except Exception as err:
            errors.append(f"Invalid download_url_path JSONPath: {err}")

    # Optional fields validation
    if "method" in source:
        method = source["method"]
        if not isinstance(method, str):
            errors.append("discovery.method must be a string")
        elif method.upper() not in ["GET", "POST"]:
            errors.append("discovery.method must be 'GET' or 'POST'")

    if "headers" in source and not isinstance(source["headers"], dict):
        errors.append("discovery.headers must be a dictionary")

    if "body" in source and not isinstance(source["body"], dict):
        errors.append("discovery.body must be a dictionary")

    return errors