
bug(providers): provider v2 profile lint does not validate L7 endpoint constraints (protocol/access/rules) #1714

@Tojaj


Agent Diagnostic

Skills loaded: openshell-cli, create-github-issue

Investigation: Traced the full lint path through three crates:

  1. Profile lint path (crates/openshell-cli/src/run.rs → crates/openshell-server/src/grpc/provider.rs → crates/openshell-providers/src/profiles.rs): The lint command calls validate_profile_set(), which delegates endpoint checking to endpoint_is_valid() at profiles.rs:1096. That function only validates that host is non-empty and port is in 1–65535. No other endpoint fields (protocol, access, enforcement, rules, deny_rules) are validated.

  2. Runtime validation (crates/openshell-sandbox/src/l7/mod.rs:515): validate_l7_policies() performs comprehensive L7 semantic checks including the rule at line 592: if !protocol.is_empty() && !has_rules && access.is_empty() — this is the check that catches the error at sandbox boot but is unreachable from the lint path.

  3. Dependency chain: openshell-cli → openshell-providers (lint-time validation) cannot reach openshell-sandbox (runtime validation). The L7 validation is trapped in the sandbox crate. openshell-policy (shared by both) has validate_sandbox_policy(), but it only checks safety concerns (process identity, filesystem path traversal, TLD wildcards), not L7 semantics.

Conclusion: The lint command has a validation gap — it accepts Provider v2 Profile endpoint configurations that will always fail at sandbox startup. The fix is to add L7 endpoint field validation to validate_profile_set() in openshell-providers.
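The gap can be illustrated with a minimal sketch. The `Endpoint` struct and the function names below are hypothetical simplifications, not the actual types in openshell-providers or openshell-sandbox; only the two conditions are taken from the diagnostic above (the host/port check and the expression quoted from line 592).

```rust
/// Hypothetical, simplified endpoint model. The real struct in
/// openshell-providers may have different fields and types.
#[derive(Debug, Default, Clone)]
pub struct Endpoint {
    pub host: String,
    pub port: u32,
    pub protocol: String, // empty when unset
    pub access: String,   // empty when unset
    pub has_rules: bool,
}

/// What the lint path checks today according to the diagnostic:
/// host is non-empty and port is in 1..=65535.
pub fn lint_accepts(ep: &Endpoint) -> bool {
    !ep.host.is_empty() && (1..=65535).contains(&ep.port)
}

/// The runtime condition quoted from validate_l7_policies():
/// a protocol is set, but neither rules nor access is present.
pub fn runtime_rejects(ep: &Endpoint) -> bool {
    !ep.protocol.is_empty() && !ep.has_rules && ep.access.is_empty()
}

/// First endpoint of the reproduction profile: protocol set,
/// access and rules both missing.
pub fn repro_endpoint() -> Endpoint {
    Endpoint {
        host: "openrouter.ai".to_string(),
        port: 443,
        protocol: "rest".to_string(),
        ..Default::default()
    }
}
```

Under these assumptions, `repro_endpoint()` satisfies `lint_accepts` and also triggers `runtime_rejects`, which is the mismatch this issue describes. Setting `access` to `read-write` clears the runtime condition.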

Description

A Provider v2 Profile bundles credentials, endpoints, and network policy into a single YAML definition. The openshell provider profile lint command is intended to validate these profiles before use, but it does not check L7 endpoint field semantics.

Actual behavior: openshell provider profile lint -f profile.yaml reports success for a profile with an endpoint that has protocol: rest but no access or rules field. When this profile is used to create a sandbox, the sandbox enters the Error phase with:

L7 policy validation failed: _provider_opencode_openrouter_provider.endpoints[0]:
  protocol requires rules or access to define allowed traffic

Expected behavior: The lint command should catch this error and report it as a diagnostic, since the endpoint configuration is invalid and will always fail at sandbox startup.

Reproduction Steps

  1. Create a Provider v2 Profile YAML with an endpoint that sets protocol but omits both access and rules:

    id: opencode-openrouter
    display_name: OpenCode (OpenRouter)
    description: OpenCode agent CLI configured to use OpenRouter.
    category: agent
    inference_capable: true
    
    credentials:
      - name: openrouter_api_key
        description: OpenRouter API Key
        env_vars: [OPENROUTER_API_KEY, OPENAI_API_KEY]
        required: true
        auth_style: bearer
        header_name: authorization
    
    discovery:
      credentials: [openrouter_api_key]
    
    endpoints:
      - host: openrouter.ai
        port: 443
        protocol: rest          # <-- protocol set
        enforcement: enforce
        # access and rules both missing — invalid but lint doesn't catch it
    
      - host: opencode.ai
        port: 443
    
    binaries:
      - /usr/local/bin/opencode

  2. Run lint; it passes:

    openshell provider profile lint -f profile.yaml

  3. Create a provider and sandbox using the profile; the sandbox fails:

    openshell provider create --name opencode --type generic --from-profile profile.yaml
    openshell sandbox create --provider opencode

  4. Add access: read-write to the first endpoint and repeat; the sandbox starts successfully.

Environment

  • OS: Fedora (Linux 7.0.10-201.fc44.x86_64)
  • OpenShell: built from main branch at commit 1f07bf04

Logs

Gateway logs when the sandbox fails:

[gateway] [INFO ] [openshell_server::compute] Sandbox phase changed
[gateway] [WARN ] [openshell_server::compute] Container exited with code 1

Sandbox container log:

Error: × L7 policy validation failed:
│ _provider_opencode_openrouter_provider.endpoints[0]: protocol requires
│ rules or access to define allowed traffic

Additional Context

The policy schema docs list access as "Required: No", which is correct in isolation — but when protocol is set, either access or rules must be provided. The docs could clarify this conditional requirement.

Other L7 checks that validate_l7_policies performs at sandbox runtime are also missing from profile lint:

  • rules and access are mutually exclusive
  • Unknown protocol values (must be rest/websocket/graphql/sql)
  • deny_rules require protocol to be set
  • deny_rules require either rules or access as a base
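As a rough sketch, the checks listed above plus the protocol/access/rules requirement could be expressed per endpoint as follows. `L7Endpoint` and `l7_diagnostics` are hypothetical names, the field layout is an assumption, and all message strings except "protocol requires rules or access to define allowed traffic" are illustrative wording rather than the runtime's actual output.

```rust
/// Hypothetical flattened view of the L7 fields on one endpoint.
/// Empty strings and empty vectors stand for "not set".
#[derive(Debug, Default, Clone)]
pub struct L7Endpoint {
    pub protocol: String,
    pub access: String,
    pub rules: Vec<String>,
    pub deny_rules: Vec<String>,
}

/// Protocol values named in this issue.
const KNOWN_PROTOCOLS: [&str; 4] = ["rest", "websocket", "graphql", "sql"];

/// Returns one message per violated rule; an empty vector means the
/// endpoint passes these checks.
pub fn l7_diagnostics(ep: &L7Endpoint) -> Vec<String> {
    let mut out = Vec::new();
    let has_protocol = !ep.protocol.is_empty();
    let has_access = !ep.access.is_empty();
    let has_rules = !ep.rules.is_empty();
    let has_deny = !ep.deny_rules.is_empty();

    if has_protocol && !KNOWN_PROTOCOLS.contains(&ep.protocol.as_str()) {
        out.push(format!("unknown protocol '{}'", ep.protocol));
    }
    if has_protocol && !has_rules && !has_access {
        out.push("protocol requires rules or access to define allowed traffic".to_string());
    }
    if has_rules && has_access {
        out.push("rules and access are mutually exclusive".to_string());
    }
    if has_deny && !has_protocol {
        out.push("deny_rules require protocol to be set".to_string());
    }
    if has_deny && !has_rules && !has_access {
        out.push("deny_rules require rules or access as a base".to_string());
    }
    out
}
```

An endpoint with only host and port (no L7 fields) yields no diagnostics in this sketch, which matches the second endpoint in the reproduction profile.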

Suggested fix

Add the relevant subset of L7 endpoint field validation checks from validate_l7_policies (crates/openshell-sandbox/src/l7/mod.rs) into validate_profile_set (crates/openshell-providers/src/profiles.rs), specifically in the endpoint validation loop at line 1071. This keeps the fix within the openshell-providers crate boundary and doesn't require restructuring the dependency graph.
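One possible shape for the endpoint loop, shown only as a sketch: `ProfileEndpoint` and `lint_endpoints` are hypothetical names, and the real loop in validate_profile_set may report diagnostics through a different type. The `endpoints[i]:` prefix mirrors the location format seen in the sandbox error message.

```rust
/// Hypothetical minimal endpoint; not the actual openshell-providers type.
#[derive(Debug, Default, Clone)]
pub struct ProfileEndpoint {
    pub host: String,
    pub port: u32,
    pub protocol: String, // empty when unset
    pub access: String,   // empty when unset
    pub has_rules: bool,
}

/// Walks every endpoint and collects index-prefixed diagnostics, keeping
/// the existing host/port check and adding the protocol requirement.
pub fn lint_endpoints(endpoints: &[ProfileEndpoint]) -> Vec<String> {
    let mut diagnostics = Vec::new();
    for (i, ep) in endpoints.iter().enumerate() {
        // Existing lint behavior described in this issue.
        if ep.host.is_empty() || !(1..=65535).contains(&ep.port) {
            diagnostics.push(format!(
                "endpoints[{i}]: host must be non-empty and port must be in 1-65535"
            ));
        }
        // Added L7 check: protocol without rules or access.
        if !ep.protocol.is_empty() && !ep.has_rules && ep.access.is_empty() {
            diagnostics.push(format!(
                "endpoints[{i}]: protocol requires rules or access to define allowed traffic"
            ));
        }
    }
    diagnostics
}
```

Run against the two endpoints of the reproduction profile, this sketch reports a single diagnostic for `endpoints[0]` and none for the plain host/port endpoint.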

Metadata

Labels: state:triage-needed (Opened without agent diagnostics and needs triage)
