8 changes: 6 additions & 2 deletions README.md
@@ -62,6 +62,7 @@ dotnet run --project src/SharpClaw.Code.Cli -- repl

# Run a one-shot prompt
dotnet run --project src/SharpClaw.Code.Cli -- prompt "Summarize this workspace"
dotnet run --project src/SharpClaw.Code.Cli -- --auto-approve shell --auto-approve-budget 2 prompt "Check git status and summarize"

# Inspect runtime health and status
dotnet run --project src/SharpClaw.Code.Cli -- doctor
@@ -105,12 +106,13 @@ Parity-oriented commands now include:
- `unshare` / `/unshare`
- `compact` / `/compact`
- `serve` / `/serve`
- `worktree` / `/worktree`
- `/sessions` as a friendlier alias over `/session list`

Primary workflow modes:

- `build`: normal coding-agent execution
- `plan`: analysis-first mode that blocks mutating tools
- `plan`: structured deep planning that blocks mutating tools and syncs planning-owned session todos
- `spec`: generates Kiro-style spec artifacts under `docs/superpowers/specs/<date>-<slug>/`

## Core Capabilities
@@ -202,6 +204,8 @@ dotnet test SharpClawCode.sln --filter "FullyQualifiedName~ParityScenarioTests"
| `--cwd <path>` | Working directory; defaults to the current directory |
| `--model <id>` | Model id or alias; `provider/model` forms are supported where configured |
| `--permission-mode <mode>` | `readOnly`, `workspaceWrite`, or `dangerFullAccess`; see [docs/permissions.md](docs/permissions.md) |
| `--auto-approve <scopes>` | Auto-approve specific elevated scopes such as `shell`, `network`, or `promptRead` |
| `--auto-approve-budget <n>` | Cap how many elevated operations may be auto-approved in the session |
| `--output-format text\|json` | Human-readable or structured output |
| `--primary-mode <mode>` | Workflow bias for prompts: `build`, `plan`, or `spec` |
| `--session <id>` | Reuse a specific SharpClaw session id for prompt execution |
@@ -211,7 +215,7 @@ dotnet test SharpClawCode.sln --filter "FullyQualifiedName~ParityScenarioTests"
| `--storage-root <path>` | External root for host-managed durable runtime state |
| `--session-store fileSystem\|sqlite` | Select the embedded session/event storage backend |

Subcommands include `prompt`, `repl`, `doctor`, `status`, `session`, `index`, `memory`, `models`, `usage`, `cost`, `stats`, `connect`, `hooks`, `skills`, `agents`, `todo`, `share`, `unshare`, `compact`, `serve`, `commands`, `mcp`, `plugins`, `tool-packages`, `acp`, `bridge`, and `version`.
Subcommands include `prompt`, `repl`, `doctor`, `status`, `session`, `index`, `memory`, `models`, `usage`, `cost`, `stats`, `connect`, `hooks`, `skills`, `agents`, `todo`, `share`, `unshare`, `compact`, `serve`, `commands`, `worktree`, `mcp`, `plugins`, `tool-packages`, `acp`, `bridge`, and `version`.

## Documentation Map

32 changes: 32 additions & 0 deletions docs/permissions.md
@@ -12,6 +12,29 @@

Default: **`WorkspaceWrite`**.

## Bounded auto-approval

The CLI and REPL now support finer-grained approval control without switching all the way to **`DangerFullAccess`**.

- CLI:
  - `--auto-approve shell,network`
  - `--auto-approve-budget 3`
- REPL:
  - `/approvals`
  - `/approvals set shell,promptRead 2`
  - `/approvals reset`

`ApprovalSettings` flows through `RuntimeCommandContext`, `RunPromptRequest`, `ToolExecutionContext`, and `PermissionEvaluationContext`.

Behavior:

- matching scopes are auto-approved only when the current rule/mode path would otherwise ask for approval
- explicit deny rules still win
- remembered approvals still short-circuit before budget consumption
- when the configured auto-approve budget is exhausted, the engine falls back to the normal approval transport

The auto-approve budget is process-local and session-scoped, similar to remembered approvals.
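
The ordering in the behavior list can be sketched as a single decision function. This is an illustrative sketch only: `ApprovalSketch`, `Decision`, and the `Evaluate` parameters are hypothetical stand-ins rather than the actual `PermissionPolicyEngine` types, and treating an unset budget as unlimited and matching scopes case-insensitively are assumptions that this diff does not confirm.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the engine's outcome; not a SharpClaw type.
public enum Decision { Allow, Deny, AutoApprove, AskUser }

public sealed class ApprovalSketch
{
    private readonly HashSet<string> _autoApproveScopes;
    private readonly int? _budget; // assumption: null means "no cap"
    private int _used;

    public ApprovalSketch(IEnumerable<string> autoApproveScopes, int? budget)
    {
        // Assumption: scope names are compared case-insensitively.
        _autoApproveScopes = new HashSet<string>(autoApproveScopes, StringComparer.OrdinalIgnoreCase);
        _budget = budget;
    }

    public Decision Evaluate(string scope, bool explicitlyDenied, bool needsApproval, bool remembered)
    {
        if (explicitlyDenied) return Decision.Deny;   // explicit deny rules still win
        if (!needsApproval) return Decision.Allow;    // the rule/mode path would not ask anyway
        if (remembered) return Decision.Allow;        // short-circuits before budget consumption
        if (!_autoApproveScopes.Contains(scope)) return Decision.AskUser;
        if (_budget is int cap && _used >= cap) return Decision.AskUser; // budget exhausted
        _used++;
        return Decision.AutoApprove;
    }
}
```

With `new ApprovalSketch(new[] { "shell" }, 2)`, two `shell` requests that need approval are auto-approved and a third falls back to `AskUser`; a remembered approval in between does not consume the budget.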

## Policy engine

**`PermissionPolicyEngine`** evaluates **`ToolExecutionRequest`** with **`PermissionEvaluationContext`** by running an ordered list of **`IPermissionRule`** instances:
@@ -57,6 +80,15 @@ Authenticated approvals are tenant-bound. If the runtime host context carries `T

When a rule returns **`RequireApproval`** with **`CanRememberApproval`**, an approved outcome may be **`Store`**d and reused via **`TryGet`**. In embedded-host flows, the remembered approval remains scoped to the current session and tenant context.

### Auto-approve budget tracking

**`IAutoApprovalBudgetTracker`** (**`AutoApprovalBudgetTracker`**) tracks how many elevated operations have been auto-approved for the current session/tenant key.

When `ApprovalSettings.AutoApproveBudget` is set:

- the first `AutoApproveBudget` matching operations consume the budget and are auto-approved
- once the budget is spent, later matching operations are no longer auto-approved and go through the normal approval path
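
A per-key counter of this kind might look like the sketch below. `BudgetTrackerSketch` and `TryConsume` are hypothetical names; the members of `IAutoApprovalBudgetTracker` are not shown in this diff, and representing the session/tenant key as a single string is an assumption.

```csharp
using System.Collections.Concurrent;

// Hypothetical sketch; the real tracker's API is not visible in this diff.
public sealed class BudgetTrackerSketch
{
    private readonly ConcurrentDictionary<string, int> _used = new();

    // Returns true and consumes one unit when the key still has budget left.
    public bool TryConsume(string sessionTenantKey, int budget)
    {
        while (true)
        {
            var current = _used.GetOrAdd(sessionTenantKey, 0);
            if (current >= budget)
            {
                return false; // exhausted: caller falls back to the normal approval path
            }

            // Compare-and-swap so concurrent tool calls cannot overspend the budget.
            if (_used.TryUpdate(sessionTenantKey, current + 1, current))
            {
                return true;
            }
        }
    }
}
```

The compare-and-swap loop matters when several tool calls for the same session are evaluated concurrently; a plain read-then-increment could approve more operations than the budget allows.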

## Tool execution context

**`ToolExecutionContext`** (`src/SharpClaw.Code.Tools/Models/ToolExecutionContext.cs`) carries **`IsInteractive`** (default **true** on the record). Parity tests set **`interactive: true/false`** to exercise approval vs deny paths.
14 changes: 14 additions & 0 deletions docs/runtime.md
@@ -53,6 +53,8 @@ Prompt references are resolved before provider execution. Outside-workspace file

When the effective **`PrimaryMode`** is **`Spec`**, the assembler appends a structured output contract that requires the model to return machine-readable requirements, design, and task content.

When the effective **`PrimaryMode`** is **`Plan`**, the assembler now appends a deep-planning JSON contract that requires the model to return summary, assumptions, risks, next action, and task data.

Conversation history is rebuilt from persisted session events and truncated by token budget before being attached to the next provider request. Assistant history prefers the persisted final turn output and only falls back to streamed provider deltas when needed.
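
Truncation by token budget can be sketched as a newest-first walk over the rebuilt history. The text does not say which end is dropped or how tokens are counted, so keeping the most recent messages and estimating roughly four characters per token are both assumptions; `HistoryTruncationSketch` is a hypothetical name.

```csharp
using System.Collections.Generic;

// Hypothetical sketch; not the runtime's actual history assembler.
public static class HistoryTruncationSketch
{
    // Assumption: a crude estimate of about 4 characters per token.
    public static int EstimateTokens(string text) => (text.Length + 3) / 4;

    // Keeps the newest messages that fit in the budget, returned oldest-first.
    public static List<string> Truncate(IReadOnlyList<string> messages, int tokenBudget)
    {
        var kept = new List<string>();
        var remaining = tokenBudget;

        for (var i = messages.Count - 1; i >= 0; i--)
        {
            var cost = EstimateTokens(messages[i]);
            if (cost > remaining)
            {
                break; // everything older than this message is dropped
            }

            remaining -= cost;
            kept.Add(messages[i]);
        }

        kept.Reverse();
        return kept;
    }
}
```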

Cross-session memory is sourced from:
@@ -77,6 +79,17 @@ The runtime injects only compact recall text and index freshness metadata. Detai

Each spec-mode prompt creates a fresh folder. If the same slug already exists, the runtime appends `-2`, `-3`, and so on instead of overwriting an existing spec set.
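
The suffixing rule can be sketched as follows. `SpecFolderSketch.ResolveUniqueFolder` is a hypothetical helper, not the runtime's implementation, and the sketch only checks for existing directories.

```csharp
using System.IO;

// Hypothetical sketch of the "-2, -3, ..." suffix rule described above.
public static class SpecFolderSketch
{
    public static string ResolveUniqueFolder(string root, string baseName)
    {
        var candidate = Path.Combine(root, baseName);

        // Start at 2 so the first collision yields "<baseName>-2".
        for (var suffix = 2; Directory.Exists(candidate); suffix++)
        {
            candidate = Path.Combine(root, $"{baseName}-{suffix}");
        }

        return candidate;
    }
}
```

The check and the later directory creation are separate steps here, so two concurrent spec runs could pick the same name; a production version would need to handle that race.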

## Plan workflow

**`IPlanWorkflowService`** handles the post-processing path for **`plan`** mode:

- parses the model response as structured JSON
- persists the latest deep-plan summary and next action into session metadata
- synchronizes planning-owned session todos through **`ITodoService`**
- returns a structured **`PlanExecutionResult`** on the turn result contract

Planning-managed todos are isolated by owner id (`deep-planning`) so manual session todos remain untouched.
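
The parse-then-reconcile flow might be sketched like this. The JSON property name `tasks`, the `TodoSketch` record, and the replace-all reconciliation strategy are assumptions made for illustration; only the `deep-planning` owner id comes from the text, and the real `ITodoService` signatures are not shown in this diff.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical todo shape; not the SharpClaw model.
public sealed record TodoSketch(string OwnerId, string Title);

public static class PlanSyncSketch
{
    public const string PlanningOwnerId = "deep-planning";

    // Assumption: the plan JSON carries a "tasks" array of strings.
    public static List<string> ParseTaskTitles(string json)
    {
        using var document = JsonDocument.Parse(json);
        var titles = new List<string>();
        var root = document.RootElement;

        if (root.ValueKind == JsonValueKind.Object
            && root.TryGetProperty("tasks", out var tasks)
            && tasks.ValueKind == JsonValueKind.Array)
        {
            foreach (var task in tasks.EnumerateArray())
            {
                if (task.ValueKind == JsonValueKind.String)
                {
                    titles.Add(task.GetString()!);
                }
            }
        }

        return titles;
    }

    // Replaces planning-owned todos and leaves todos from other owners untouched.
    public static List<TodoSketch> Sync(IEnumerable<TodoSketch> existing, IEnumerable<string> plannedTitles)
    {
        var result = existing.Where(todo => todo.OwnerId != PlanningOwnerId).ToList();
        result.AddRange(plannedTitles.Select(title => new TodoSketch(PlanningOwnerId, title)));
        return result;
    }
}
```

Filtering by owner id is what keeps manual session todos out of the reconciliation: only entries owned by `deep-planning` are ever replaced.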

## Operational diagnostics

**`OperationalDiagnosticsCoordinator`** runs injectable **`IOperationalCheck`** implementations:
@@ -102,6 +115,7 @@ The parity layer adds several runtime-owned services:
- **`IShareSessionService`** — creates and removes self-hosted share snapshots
- **`IHookDispatcher`** — executes configured hook processes for turn/tool/share/server events and exposes hook inspection/testing
- **`ITodoService`** — persists session and workspace todo items under session metadata and `.sharpclaw/tasks.json`
  - deep plan mode also uses `ITodoService.SyncManagedSessionTodosAsync(...)` to reconcile planning-owned session tasks
- **`IWorkspaceInsightsService`** — reconstructs durable usage, cost, and execution stats from persisted event logs

These services are intentionally small and runtime-owned rather than separate orchestration subsystems.
23 changes: 23 additions & 0 deletions src/SharpClaw.Code.Agents/Abstractions/ISubAgentOrchestrator.cs
@@ -0,0 +1,23 @@
using SharpClaw.Code.Agents.Models;
using SharpClaw.Code.Protocol.Models;
using SharpClaw.Code.Tools.Models;

namespace SharpClaw.Code.Agents.Abstractions;

/// <summary>
/// Executes bounded delegated subagent tasks on behalf of a parent agent tool call.
/// </summary>
public interface ISubAgentOrchestrator
{
    /// <summary>
    /// Executes the supplied delegated tasks using the bounded subagent worker.
    /// </summary>
    /// <param name="request">The delegated task batch.</param>
    /// <param name="context">The parent tool execution context.</param>
    /// <param name="cancellationToken">The cancellation token.</param>
    /// <returns>The batch execution result and emitted runtime events.</returns>
    Task<SubAgentBatchExecutionResult> ExecuteAsync(
        SubAgentBatchRequest request,
        ToolExecutionContext context,
        CancellationToken cancellationToken);
}
@@ -20,6 +20,7 @@ public static class AgentsServiceCollectionExtensions
    public static IServiceCollection AddSharpClawAgents(this IServiceCollection services)
    {
        services.AddOptions<AgentLoopOptions>();
        services.AddSingleton<ISubAgentOrchestrator, SubAgentOrchestrator>();
        services.AddSingleton<ToolCallDispatcher>();
        services.AddSingleton<ProviderBackedAgentKernel>();
        services.AddSingleton<IAgentFrameworkBridge, AgentFrameworkBridge>();
166 changes: 166 additions & 0 deletions src/SharpClaw.Code.Agents/Internal/SubAgentOrchestrator.cs
@@ -0,0 +1,166 @@
using System.Text.Json;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using SharpClaw.Code.Agents.Abstractions;
using SharpClaw.Code.Agents.Agents;
using SharpClaw.Code.Agents.Models;
using SharpClaw.Code.Protocol.Enums;
using SharpClaw.Code.Protocol.Events;
using SharpClaw.Code.Protocol.Models;
using SharpClaw.Code.Protocol.Serialization;
using SharpClaw.Code.Tools.Abstractions;
using SharpClaw.Code.Tools.Models;

namespace SharpClaw.Code.Agents.Internal;

/// <summary>
/// Executes delegated subagent tasks as bounded read-only child runs.
/// </summary>
public sealed class SubAgentOrchestrator(
    IServiceProvider serviceProvider,
    IToolExecutor toolExecutor,
    ILogger<SubAgentOrchestrator> logger) : ISubAgentOrchestrator
{
    /// <inheritdoc />
    public async Task<SubAgentBatchExecutionResult> ExecuteAsync(
        SubAgentBatchRequest request,
        ToolExecutionContext context,
        CancellationToken cancellationToken)
    {
        ArgumentNullException.ThrowIfNull(request);
        ArgumentNullException.ThrowIfNull(context);

        if (request.Tasks is not { Length: > 0 })
        {
            throw new InvalidOperationException("The subagent request must include at least one task.");
        }

        if (request.Tasks.Length > SubAgentToolContract.MaxTasks)
        {
            throw new InvalidOperationException($"The subagent request exceeds the limit of {SubAgentToolContract.MaxTasks} tasks.");
        }

        var runs = request.Tasks
            .Select((task, index) => ExecuteSingleAsync(task, index, context, cancellationToken))
            .ToArray();
        var completedRuns = await Task.WhenAll(runs).ConfigureAwait(false);

        var taskResults = completedRuns.Select(static run => run.TaskResult).ToArray();
        var events = completedRuns.SelectMany(static run => run.Events).ToArray();
        var result = new SubAgentBatchResult(
            Tasks: taskResults,
            CompletedCount: taskResults.Count(static task => task.Succeeded),
            FailedCount: taskResults.Count(static task => !task.Succeeded));

        return new SubAgentBatchExecutionResult(result, events);
    }

    private async Task<SingleTaskExecutionResult> ExecuteSingleAsync(
        SubAgentTaskRequest task,
        int index,
        ToolExecutionContext parentContext,
        CancellationToken cancellationToken)
    {
        ArgumentNullException.ThrowIfNull(task);
        var goal = task.Goal?.Trim();
        var expectedOutput = task.ExpectedOutput?.Trim();
        if (string.IsNullOrWhiteSpace(goal) || string.IsNullOrWhiteSpace(expectedOutput))
        {
            throw new InvalidOperationException("Each subagent task requires both goal and expectedOutput.");
        }

        var taskId = $"subtask-{index + 1:D2}-{Guid.NewGuid():N}";
        var delegatedTask = new DelegatedTaskContract(
            taskId,
            goal,
            expectedOutput,
            NormalizeConstraints(task.Constraints));

        try
        {
            var subAgentWorker = serviceProvider.GetRequiredService<SubAgentWorker>();
            var result = await subAgentWorker.RunAsync(
                new AgentRunContext(
                    SessionId: parentContext.SessionId,
                    TurnId: parentContext.TurnId,
                    Prompt: goal,
                    WorkingDirectory: parentContext.WorkingDirectory,
                    Model: string.IsNullOrWhiteSpace(parentContext.Model) ? "default" : parentContext.Model!,
                    PermissionMode: PermissionMode.ReadOnly,
                    OutputFormat: OutputFormat.Text,
                    ToolExecutor: toolExecutor,
                    Metadata: BuildChildMetadata(parentContext),
                    ParentAgentId: parentContext.AgentId,
                    DelegatedTask: delegatedTask,
                    PrimaryMode: PrimaryMode.Plan,
                    ToolMutationRecorder: null,
                    ConversationHistory: null,
                    IsInteractive: false,
                    ApprovalSettings: ApprovalSettings.Empty),
                cancellationToken).ConfigureAwait(false);

            return new SingleTaskExecutionResult(
                new SubAgentTaskResult(
                    TaskId: taskId,
                    Goal: goal,
                    ExpectedOutput: expectedOutput,
                    Succeeded: true,
                    Output: string.IsNullOrWhiteSpace(result.Output) ? "(no output)" : result.Output.Trim(),
                    ErrorMessage: null,
                    AgentId: result.AgentId),
                result.Events ?? []);
        }
        catch (OperationCanceledException)
        {
            throw;
        }
        catch (Exception exception)
        {
            logger.LogWarning(
                exception,
                "Delegated subagent task {TaskId} failed for session {SessionId}, turn {TurnId}.",
                taskId,
                parentContext.SessionId,
                parentContext.TurnId);

            return new SingleTaskExecutionResult(
                new SubAgentTaskResult(
                    TaskId: taskId,
                    Goal: goal,
                    ExpectedOutput: expectedOutput,
                    Succeeded: false,
                    Output: null,
                    ErrorMessage: exception.Message,
                    AgentId: SubAgentWorker.SubAgentId),
                []);
        }
    }

    private static string[] NormalizeConstraints(string[]? constraints)
        => constraints?
            .Where(static value => !string.IsNullOrWhiteSpace(value))
            .Select(static value => value.Trim())
            .Distinct(StringComparer.Ordinal)
            .ToArray()
        ?? [];

    private static Dictionary<string, string> BuildChildMetadata(ToolExecutionContext parentContext)
    {
        var metadata = new Dictionary<string, string>(StringComparer.Ordinal);
        if (parentContext.Metadata is not null
            && parentContext.Metadata.TryGetValue("provider", out var provider)
            && !string.IsNullOrWhiteSpace(provider))
        {
            metadata["provider"] = provider;
        }

        metadata[SharpClawWorkflowMetadataKeys.AgentAllowedToolsJson] = JsonSerializer.Serialize(
            SubAgentToolContract.AllowedReadOnlyTools,
            ProtocolJsonContext.Default.StringArray);
        return metadata;
    }

    private sealed record SingleTaskExecutionResult(
        SubAgentTaskResult TaskResult,
        IReadOnlyList<RuntimeEvent> Events);
}
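
Review note: the fan-out pattern used by `SubAgentOrchestrator` (start every task, await them together, convert per-task failures into results, let cancellation propagate) can be exercised in isolation with the sketch below. `FanOutSketch`, `TaskOutcomeSketch`, and the `worker` delegate are hypothetical stand-ins for `SubAgentWorker` and the result records, which need the rest of the solution to compile.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical result shape; not the SharpClaw SubAgentTaskResult.
public sealed record TaskOutcomeSketch(string Goal, bool Succeeded, string? Output, string? Error);

public static class FanOutSketch
{
    public static async Task<TaskOutcomeSketch[]> RunAllAsync(
        string[] goals,
        Func<string, CancellationToken, Task<string>> worker,
        CancellationToken cancellationToken)
    {
        // Start all runs first, then await them together; results keep input order.
        var runs = goals.Select(goal => RunOneAsync(goal, worker, cancellationToken)).ToArray();
        return await Task.WhenAll(runs).ConfigureAwait(false);
    }

    private static async Task<TaskOutcomeSketch> RunOneAsync(
        string goal,
        Func<string, CancellationToken, Task<string>> worker,
        CancellationToken cancellationToken)
    {
        try
        {
            var output = await worker(goal, cancellationToken).ConfigureAwait(false);
            return new TaskOutcomeSketch(goal, true, output, null);
        }
        catch (OperationCanceledException)
        {
            throw; // cancellation aborts the whole batch instead of becoming a task failure
        }
        catch (Exception exception)
        {
            // Any other failure is isolated to this task.
            return new TaskOutcomeSketch(goal, false, null, exception.Message);
        }
    }
}
```

One difference worth noting: in the orchestrator, the goal/expectedOutput validation in `ExecuteSingleAsync` runs before the `try` block, so an invalid task faults the whole `Task.WhenAll` instead of producing a failed task result.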
51 changes: 51 additions & 0 deletions src/SharpClaw.Code.Agents/Internal/SubAgentToolContract.cs
@@ -0,0 +1,51 @@
using SharpClaw.Code.Protocol.Models;
using SharpClaw.Code.Tools.BuiltIn;

namespace SharpClaw.Code.Agents.Internal;

internal static class SubAgentToolContract
{
    public const string ToolName = "use_subagents";
    public const int MaxTasks = 3;

    public static readonly string[] AllowedReadOnlyTools =
    [
        ReadFileTool.ToolName,
        GlobSearchTool.ToolName,
        GrepSearchTool.ToolName,
        WorkspaceSearchTool.ToolName,
        SymbolSearchTool.ToolName,
        ToolSearchTool.ToolName,
    ];

    public static readonly ProviderToolDefinition Definition = new(
        ToolName,
        "Delegate up to 3 bounded read-only repository investigation tasks to subagents. Use this for parallel codebase research, not for edits or shell commands.",
        """
        {
          "type": "object",
          "additionalProperties": false,
          "properties": {
            "tasks": {
              "type": "array",
              "minItems": 1,
              "maxItems": 3,
              "items": {
                "type": "object",
                "additionalProperties": false,
                "properties": {
                  "goal": { "type": "string" },
                  "expectedOutput": { "type": "string" },
                  "constraints": {
                    "type": "array",
                    "items": { "type": "string" }
                  }
                },
                "required": ["goal", "expectedOutput"]
              }
            }
          },
          "required": ["tasks"]
        }
        """);
}
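
Review note: a `use_subagents` arguments payload can be checked against the limits in this schema with a few lines of `System.Text.Json`. The sketch below is a hypothetical helper (`SubAgentArgsSketch.Validate` is not part of the PR) and covers only the task count and the two required string fields, not `additionalProperties` or `constraints`.

```csharp
using System.Text.Json;

// Hypothetical validation helper; mirrors the schema limits shown in the diff.
public static class SubAgentArgsSketch
{
    public const int MaxTasks = 3; // same value as SubAgentToolContract.MaxTasks

    // Returns null when the payload is acceptable, otherwise an error message.
    public static string? Validate(string argumentsJson)
    {
        using var document = JsonDocument.Parse(argumentsJson);
        var root = document.RootElement;

        if (root.ValueKind != JsonValueKind.Object
            || !root.TryGetProperty("tasks", out var tasks)
            || tasks.ValueKind != JsonValueKind.Array)
        {
            return "tasks must be an array";
        }

        var count = tasks.GetArrayLength();
        if (count < 1 || count > MaxTasks)
        {
            return $"tasks must contain between 1 and {MaxTasks} items";
        }

        foreach (var task in tasks.EnumerateArray())
        {
            if (!HasNonEmptyString(task, "goal") || !HasNonEmptyString(task, "expectedOutput"))
            {
                return "each task requires goal and expectedOutput";
            }
        }

        return null;
    }

    private static bool HasNonEmptyString(JsonElement element, string name)
        => element.ValueKind == JsonValueKind.Object
           && element.TryGetProperty(name, out var value)
           && value.ValueKind == JsonValueKind.String
           && !string.IsNullOrWhiteSpace(value.GetString());
}
```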