Skip to content

Add graph reinit on failure#4237

Open
atobiszei wants to merge 14 commits into
mainfrom
atobisze_mp_reinit_graph
Open

Add graph reinit on failure#4237
atobiszei wants to merge 14 commits into
mainfrom
atobisze_mp_reinit_graph

Conversation

@atobiszei

@atobiszei atobiszei commented May 22, 2026

Copy link
Copy Markdown
Collaborator
  • rename holder -> observerHolder

Ticket: CVS-187323

Copilot AI review requested due to automatic review settings May 22, 2026 14:12

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds automatic MediaPipe graph reinitialization on inference failure when using the graph queue/pool, to prevent returning an errored (“poisoned”) graph instance back to the pool for reuse.

Changes:

  • Introduces GraphHelper::reinitialize() plus an RAII GraphReinitGuard that rebuilds a fresh CalculatorGraph unless dismissed on the success path.
  • Hooks the guard into queue-based unary and streaming inference paths to trigger graph rebuild after failures.
  • Moves per-graph shutdown logic into GraphHelper’s destructor and simplifies GraphQueue destructor.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
src/mediapipe_internal/mediapipegraphexecutor.hpp Adds GraphReinitGuard around queue-path inference and dismisses it on success.
src/mediapipe_internal/graphqueue.hpp Adds GraphHelper destructor, reinitialize() declaration, and new GraphReinitGuard type.
src/mediapipe_internal/graphqueue.cpp Implements GraphHelper shutdown + reinit logic; defaults GraphQueue destructor.
Comments suppressed due to low confidence (2)

src/mediapipe_internal/graphqueue.cpp:142

  • reinitialize() sets graph to nullptr on ObserveOutputStream failure (graph.reset()). The next request acquiring this GraphHelper will dereference graphHelper->graph in GraphIdGuard (graph(*graphHelper->graph)), which can crash if graph is null. Ensure reinitialize() always leaves a non-null graph, or update acquisition logic to detect nullptr and rebuild/replace before use.
        if (!absStatus.ok()) {
            SPDLOG_ERROR("Graph reinitialize: ObserveOutputStream failed: {}", absStatus.ToString());
            graph.reset();
            return;
        }

src/mediapipe_internal/graphqueue.cpp:162

  • reinitialize() sets graph to nullptr on StartRun failure (graph.reset()). The next request acquiring this GraphHelper will dereference graphHelper->graph in GraphIdGuard (graph(*graphHelper->graph)), which can crash if graph is null. Ensure reinitialize() always leaves a non-null graph, or update acquisition logic to detect nullptr and rebuild/replace before use.
    if (!absStatus.ok()) {
        SPDLOG_ERROR("Graph reinitialize: StartRun failed: {}", absStatus.ToString());
        graph.reset();
        return;
    }

Comment thread src/mediapipe_internal/graphqueue.hpp Outdated
Comment on lines +93 to +95
~GraphReinitGuard() {
if (!dismissed) {
helper.reinitialize(config, sidePacketMaps);
Comment thread src/mediapipe_internal/graphqueue.cpp
Comment thread src/mediapipe_internal/graphqueue.cpp Outdated
Comment on lines +128 to +132
auto absStatus = graph->Initialize(config);
if (!absStatus.ok()) {
SPDLOG_ERROR("Graph reinitialize: Initialize failed: {}", absStatus.ToString());
graph.reset();
return;
Comment thread src/mediapipe_internal/graphqueue.cpp

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

Comment thread src/mediapipe_internal/graphqueue.hpp
Comment thread src/mediapipe_internal/graphqueue.cpp Outdated
Comment on lines +128 to +133
auto absStatus = graph->Initialize(config);
if (!absStatus.ok()) {
SPDLOG_ERROR("Graph reinitialize: Initialize failed: {}", absStatus.ToString());
graph.reset();
return;
}
Comment thread src/mediapipe_internal/graphqueue.cpp Outdated
Comment thread src/mediapipe_internal/graphqueue.cpp Outdated
Comment thread src/logging.cpp Outdated
@atobiszei atobiszei requested a review from mzegla June 3, 2026 11:00
try {
helper.reinitialize(config, sidePacketMaps);
} catch (const std::exception& e) {
SPDLOG_ERROR("GraphReinitGuard: reinitialize threw: {}", e.what());

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at this point we are in a state where we lost 1 of the graph from the pool. if it continues to throw, we might end up in a state where no more graphs are available
shouldnt we exit(1) here?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case next infer request will try to reinitialize on failure. It should be pretty rare occasion as reinitialization is basically the same as creation of graph from scratch when not using graph pool.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants