recovery: L402 and static address recovery from local backup file #1121
hieblmi wants to merge 12 commits into lightninglabs:master
Summary of Changes (Gemini Code Assist)
This pull request makes Loop more resilient by introducing a local recovery system. It allows users to restore their static address and L402 client state from an encrypted backup, so operations can continue after data loss or a fresh installation. The changes include automated backup creation on startup, a new CLI command for restoration, and a dedicated recovery service that handles encryption, key derivation, and integration with the existing static address and deposit management functionality.
Code Review
This pull request introduces a local recovery mechanism for Loop static addresses and L402 authentication state. It includes a new recovery package for managing encrypted backups, a recover CLI command, and a gRPC service to trigger the restoration process. On the daemon side, backups are automatically generated during startup. Feedback suggests that backup failures during startup should not prevent the daemon from running and recommends enhancing the atomic file writing logic with explicit synchronization and better temporary file cleanup.
backupFile, err := recoveryService.WriteBackup(d.mainCtx)
if err != nil {
	return fmt.Errorf("unable to write backup file: %w", err)
}
if backupFile != "" {
	infof("Wrote encrypted backup file to %s", backupFile)
}
Failing to write a backup file on startup should probably not prevent the entire daemon from starting. While backups are important, a failure here (e.g., due to temporary disk issues or permission problems) shouldn't cause a regression in the availability of the swap service. Consider logging the error and continuing instead of returning it.
recoveryService := recovery.NewService(
	d.cfg.DataDir, d.cfg.Network, d.lnd.Signer, d.lnd.WalletKit,
	staticAddressManager, depositManager,
)
backupFile, err := recoveryService.WriteBackup(d.mainCtx)
if err != nil {
	errorf("Unable to write backup file: %v", err)
} else if backupFile != "" {
	infof("Wrote encrypted backup file to %s", backupFile)
}

func writeFileAtomically(path string, data []byte, mode os.FileMode) error {
	tempPath := path + ".tmp"

	err := os.WriteFile(tempPath, data, mode)
	if err != nil {
		return err
	}

	return os.Rename(tempPath, path)
}
To ensure the durability of the backup file, it is recommended to Sync() the file before closing and renaming it. Additionally, using defer os.Remove(tempPath) ensures that the temporary file is cleaned up if the rename operation fails.
func writeFileAtomically(path string, data []byte, mode os.FileMode) error {
	tempPath := path + ".tmp"
	f, err := os.OpenFile(tempPath, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, mode)
	if err != nil {
		return err
	}
	defer func() {
		f.Close()
		_ = os.Remove(tempPath)
	}()
	if _, err := f.Write(data); err != nil {
		return err
	}
	if err := f.Sync(); err != nil {
		return err
	}
	if err := f.Close(); err != nil {
		return err
	}
	return os.Rename(tempPath, path)
}

a2e5f02 to 83e2547
5790124 to dddf020
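The durability suggestion for writeFileAtomically in the review above can also be written so that cleanup runs only on failure, which avoids a second Close and a Remove of an already-renamed file. This is a minimal sketch, not the PR's code: the writeAndReadBack helper and the recovery.backup file name are hypothetical and exist only to exercise the function.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomically writes data to path+".tmp", flushes it to stable
// storage, and renames it over path. The temp file is removed only when a
// step fails, so a successful call leaves nothing to clean up.
func writeFileAtomically(path string, data []byte, mode os.FileMode) (err error) {
	tempPath := path + ".tmp"

	f, err := os.OpenFile(tempPath, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, mode)
	if err != nil {
		return err
	}

	closed := false
	defer func() {
		if err == nil {
			return
		}
		if !closed {
			_ = f.Close()
		}
		_ = os.Remove(tempPath)
	}()

	if _, err = f.Write(data); err != nil {
		return err
	}
	if err = f.Sync(); err != nil {
		return err
	}
	closed = true
	if err = f.Close(); err != nil {
		return err
	}

	return os.Rename(tempPath, path)
}

// writeAndReadBack is a hypothetical demo helper: it writes into a fresh
// temp directory and returns what is on disk afterwards.
func writeAndReadBack(data string) (string, error) {
	dir, err := os.MkdirTemp("", "backup-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "recovery.backup") // hypothetical file name
	if err := writeFileAtomically(path, []byte(data), 0o600); err != nil {
		return "", err
	}
	// The temp file must be gone after a successful rename.
	if _, err := os.Stat(path + ".tmp"); !os.IsNotExist(err) {
		return "", fmt.Errorf("temp file still present")
	}
	got, err := os.ReadFile(path)
	return string(got), err
}

func main() {
	got, err := writeAndReadBack("encrypted-bytes")
	fmt.Println(got, err)
}
```

The sketch does not fsync the parent directory after the rename; whether that extra step is worth it depends on the durability the backup needs.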
Surface static-address deposits as soon as they appear in the wallet instead of waiting for the old six-confirmation readiness threshold. Reconcile the wallet view on startup, on each block, and on the polling ticker so mempool deposits are created immediately. Backfill the first confirmation height once those outputs confirm, protect unconfirmed deposits from expiry, and mark vanished unconfirmed outpoints as Replaced so RBFed-away deposits stop showing up in RPCs. Expose the new state through static-address RPCs by deriving availability and summary totals from stored deposit state, reporting sensible expiry data for unconfirmed outputs, and hiding Replaced records from normal listings.
Allow static loop-ins to select unconfirmed deposits because their CSV timeout has not started yet, while still preferring confirmed outputs during automatic selection. Keep confirmed-input requirements for channel opens and withdrawals now that Deposited includes mempool outputs. Filter unconfirmed deposits out of automatic selection for those flows and fail manual requests that reference them, so the client does not build PSBTs or withdrawal attempts with unusable inputs. Treat deposit.MinConfs as the legacy readiness threshold rather than the single source of truth for all flows. Loop-in readiness is now governed by server confirmation-risk policy, while withdrawals and channel opens keep their confirmed-input checks.
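The split between flows described in this commit can be sketched roughly as follows. The deposit type, its fields, and all function names are hypothetical illustrations rather than Loop's actual types, and the ordering shown (confirmed first, otherwise stable) is only an assumption about what "preferring confirmed outputs" means.

```go
package main

import (
	"fmt"
	"sort"
)

// deposit is a hypothetical, simplified view of a static-address deposit.
type deposit struct {
	outpoint   string
	confHeight int64 // 0 means the output is still in the mempool
}

func (d deposit) confirmed() bool { return d.confHeight > 0 }

// confirmedOnly is the automatic-selection filter for withdrawals and channel
// opens: unconfirmed deposits are silently skipped.
func confirmedOnly(deps []deposit) []deposit {
	var out []deposit
	for _, d := range deps {
		if d.confirmed() {
			out = append(out, d)
		}
	}
	return out
}

// validateManual rejects a manual withdrawal or channel-open request that
// references any unconfirmed deposit.
func validateManual(deps []deposit) error {
	for _, d := range deps {
		if !d.confirmed() {
			return fmt.Errorf("deposit %s is unconfirmed", d.outpoint)
		}
	}
	return nil
}

// preferConfirmed orders loop-in candidates so confirmed deposits come first
// while unconfirmed ones remain selectable.
func preferConfirmed(deps []deposit) []deposit {
	out := append([]deposit(nil), deps...)
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].confirmed() && !out[j].confirmed()
	})
	return out
}

func main() {
	deps := []deposit{{"a:0", 0}, {"b:1", 840000}, {"c:0", 0}}
	fmt.Println(len(confirmedOnly(deps)), validateManual(deps) != nil)
	fmt.Println(preferConfirmed(deps)[0].outpoint)
}
```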
Remove the old "no confirmed deposits available" error now that mempool deposits are listed immediately and can be selected for static loop-ins. Reproduce the server static-address deposit selection order in the CLI using the already-returned deposit metadata. This keeps the low-confirmation warning focused on the deposits auto-selection would actually choose, so users only see it when the swap payment may wait for the server confirmation-risk policy.
If InitHtlcAction creates the private swap invoice but fails before the loop-in is stored, the retry path otherwise leaves behind a live orphan invoice. Cancel that invoice on the early error path with a detached, timeout-limited context, and reuse the same helper when tearing down the monitor path. This keeps failed initialization attempts from leaving invoices that no local swap can complete.
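A detached, timeout-limited context of the kind described here can be built from the standard library alone. The sketch below assumes Go 1.21 or later for context.WithoutCancel; cancelInvoiceFunc and cancelOrphanInvoice are hypothetical names standing in for whatever client call and helper the PR actually uses.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// cancelInvoiceFunc is a hypothetical stand-in for the call that cancels an
// invoice by payment hash.
type cancelInvoiceFunc func(ctx context.Context, hash [32]byte) error

// cancelOrphanInvoice runs the cancellation on a context that keeps the
// parent's values but ignores its cancellation, bounded by its own timeout.
func cancelOrphanInvoice(parent context.Context, cancelInvoice cancelInvoiceFunc,
	hash [32]byte, timeout time.Duration) error {

	ctx, release := context.WithTimeout(context.WithoutCancel(parent), timeout)
	defer release()

	return cancelInvoice(ctx, hash)
}

// survivesParentCancel reports whether cleanup still sees a live context
// after the caller's context has already been canceled.
func survivesParentCancel() bool {
	parent, cancel := context.WithCancel(context.Background())
	cancel()

	err := cancelOrphanInvoice(parent, func(ctx context.Context, _ [32]byte) error {
		return ctx.Err() // nil while the detached context is alive
	}, [32]byte{}, 5*time.Second)

	return err == nil
}

func main() {
	fmt.Println(survivesParentCancel())
}
```

Detaching matters because the early error path often runs precisely when the request context is already canceled; the timeout keeps the cleanup from hanging indefinitely.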
FinalizeDepositAction only needs to tell the manager to remove the FSM from its active set, but the old synchronous send was still tied to the caller context and could race with request cancellation or a busy manager loop. Send the cleanup notification asynchronously and tie it to the FSM lifetime instead. Withdrawal completion no longer blocks while deposit locks are held just because the original request context was canceled.
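The asynchronous cleanup notification can be pictured as a goroutine that waits on either the manager or the FSM's own lifetime. This is a sketch under assumptions: notifyFinalized, the channel types, and the string outpoint are hypothetical, and the returned channel exists only so the outcome can be observed.

```go
package main

import "fmt"

// notifyFinalized sends outpoint to the manager without blocking the caller.
// The send is abandoned once fsmDone is closed, so it is bound to the FSM
// lifetime rather than to any request context.
func notifyFinalized(finalized chan<- string, fsmDone <-chan struct{},
	outpoint string) <-chan bool {

	delivered := make(chan bool, 1)
	go func() {
		select {
		case finalized <- outpoint:
			delivered <- true
		case <-fsmDone:
			delivered <- false
		}
	}()

	return delivered
}

func main() {
	finalized := make(chan string)
	fsmDone := make(chan struct{})

	delivered := notifyFinalized(finalized, fsmDone, "txid:0")
	fmt.Println(<-finalized, <-delivered)
}
```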
Keep replacement UTXOs as fresh deposits while preserving the original deposit record and selected outpoint snapshot for pending swaps. Before signing a static loop-in HTLC, check each original selected outpoint with GetTxOut(..., includeMempool=true). Cancel the pending invoice only when that check reports an original outpoint unavailable; lookup errors fail the action without canceling so transient chain backend errors do not incorrectly abandon the swap. Keep recovered loop-ins using their stored outpoint snapshot and cover replacement discovery and cancellation in tests.
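The three outcomes of the pre-signing check (proceed, cancel the invoice, or fail without canceling) can be sketched as a small decision function. Everything here is hypothetical: outpointLookup stands in for the GetTxOut(..., includeMempool=true) call, and the verdict names and fixedLookup helper are illustrative only.

```go
package main

import (
	"errors"
	"fmt"
)

// outpointLookup is a hypothetical stand-in for a chain lookup that includes
// the mempool. It reports whether the outpoint is still unspent.
type outpointLookup func(outpoint string) (unspent bool, err error)

type verdict int

const (
	proceedToSign   verdict = iota // all original outpoints are available
	cancelInvoice                  // an original outpoint is gone
	failKeepInvoice                // lookup failed; do not abandon the swap
)

// checkSelectedOutpoints checks every original selected outpoint before the
// HTLC is signed.
func checkSelectedOutpoints(outpoints []string, lookup outpointLookup) (verdict, error) {
	for _, op := range outpoints {
		unspent, err := lookup(op)
		if err != nil {
			// Transient backend errors must not cancel the invoice.
			return failKeepInvoice, fmt.Errorf("lookup %s: %w", op, err)
		}
		if !unspent {
			return cancelInvoice, fmt.Errorf("outpoint %s unavailable", op)
		}
	}
	return proceedToSign, nil
}

var errBackend = errors.New("chain backend unavailable")

// fixedLookup builds a lookup from static tables, for demonstration only.
func fixedLookup(unspent, failing map[string]bool) outpointLookup {
	return func(op string) (bool, error) {
		if failing[op] {
			return false, errBackend
		}
		return unspent[op], nil
	}
}

func main() {
	lookup := fixedLookup(map[string]bool{"a:0": true}, nil)
	v, err := checkSelectedOutpoints([]string{"a:0", "b:1"}, lookup)
	fmt.Println(v == cancelInvoice, err)
}
```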
ListUnspentDeposits now reports only wallet UTXOs that have an active Deposited record. That matches the static loop-in admission path and avoids exposing wallet-seen outputs that are not ready for loop-in selection. Make local notification fan-out non-blocking for best-effort categories so a slow subscriber cannot stall the notification manager while it holds the subscriber lock. Static loop-in sweep signing requests remain blocking because they are work requests required for sweepbatcher presigning and must not be dropped.
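The difference between best-effort and blocking fan-out comes down to whether the send sits inside a select with a default case. The following is a minimal sketch; fanOut and its parameters are hypothetical and it omits the subscriber lock mentioned above.

```go
package main

import "fmt"

// fanOut delivers msg to every subscriber. Best-effort categories drop the
// message for subscribers whose buffer is full; other categories block until
// each subscriber accepts it.
func fanOut(subs []chan string, msg string, bestEffort bool) (dropped int) {
	for _, sub := range subs {
		if !bestEffort {
			sub <- msg // work request: must not be dropped
			continue
		}
		select {
		case sub <- msg:
		default:
			dropped++ // slow subscriber: skip instead of stalling
		}
	}
	return dropped
}

func main() {
	fast := make(chan string, 1)
	slow := make(chan string, 1)
	slow <- "backlog" // buffer already full

	fmt.Println(fanOut([]chan string{fast, slow}, "deposit update", true))
}
```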
Wait for the server's static loop-in risk-accepted notification before starting the client payment deadline. The server may intentionally hold the swap at the confirmation-risk gate after HTLC signing, and the client deadline should not run while that server-side wait is still in progress. Cache risk-accepted notifications by swap hash inside the local notification manager and replay them to the per-swap subscriber. This covers both reconnects and the internal race where the global notification stream receives the server event before the static loop-in FSM registers its waiter.
Add client handling for the server's static loop-in risk-rejected notification. If the server aborts confirmation-risk waiting before payment, the client fails the local swap instead of waiting for a payment deadline that will never start. Cache rejected notifications by swap hash using the same replay path as accepted notifications, and clear the opposite cached state when a final risk decision is received. This keeps reconnect and subscription-order races from stranding the client in the risk wait.
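The cache-and-replay behavior for risk decisions can be sketched with one map of final decisions and one map of waiters. This is an assumption-laden illustration: riskCache, its methods, and the string swap hash are hypothetical, and storing a single decision per hash is just one way to get the "clear the opposite cached state" effect.

```go
package main

import (
	"fmt"
	"sync"
)

type riskDecision int

const (
	riskAccepted riskDecision = iota + 1
	riskRejected
)

// riskCache remembers the latest final decision per swap hash and replays it
// to a subscriber that registers after the server event arrived.
type riskCache struct {
	mu        sync.Mutex
	decisions map[string]riskDecision
	waiters   map[string]chan riskDecision
}

func newRiskCache() *riskCache {
	return &riskCache{
		decisions: make(map[string]riskDecision),
		waiters:   make(map[string]chan riskDecision),
	}
}

// record stores the decision, replacing any opposite cached state, and
// forwards it to a registered waiter.
func (c *riskCache) record(hash string, d riskDecision) {
	c.mu.Lock()
	defer c.mu.Unlock()

	c.decisions[hash] = d
	if ch, ok := c.waiters[hash]; ok {
		select {
		case <-ch: // discard a stale, unread decision
		default:
		}
		ch <- d
	}
}

// subscribe registers the per-swap waiter and replays a cached decision.
func (c *riskCache) subscribe(hash string) <-chan riskDecision {
	c.mu.Lock()
	defer c.mu.Unlock()

	ch := make(chan riskDecision, 1)
	if d, ok := c.decisions[hash]; ok {
		ch <- d
	}
	c.waiters[hash] = ch
	return ch
}

func main() {
	c := newRiskCache()
	c.record("swap-1", riskAccepted) // event arrives before the FSM subscribes
	fmt.Println(<-c.subscribe("swap-1") == riskAccepted)
}
```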
dddf020 to 5dfa10c
@hieblmi, remember to re-request review from reviewers when ready
Only the last commit is relevant for this PR; the rest is rebased on the dyn-conf-tracker PR.
Adds encrypted local recovery for static-address/L402 state.
The recovery backup is written once per paid L402 generation and contains the paid l402.token, Bitcoin network, L402 token metadata, the L402-bound static-address server key, protocol/expiry, main/change key families, first address height, and the V0 client pubkey needed to recreate the current concrete static-address row.
On fresh installs, Loop restores the latest valid backup before creating a new paid L402 generation. Existing installs backfill the immutable backup for their active generation. loop recover restores a specific backup file, or the latest valid backup in the active network directory when no file is provided.
The backup intentionally does not store mutable address cursors, per-address rows, server xpubs, pkScript, Taproot address strings, deposit FSM state, or scan gap/lookahead policy. These values are either derivable or recovered through wallet/chain scanning and reconciliation.