DeePKS: support collinear spin (nspin=2) #7433

Open
ErjieWu wants to merge 14 commits into deepmodeling:develop from ErjieWu:spin

ErjieWu (Collaborator) commented Jun 4, 2026

Summary

Adds collinear spin-polarized (nspin=2) support to the traditional DeePKS module, using a charge / magnetization channel decomposition. The equivariant version is intentionally left unchanged and stays a separate code path; nspin=4
(non-collinear) is future work. The nspin=1 path is unchanged (every spin branch is gated on nspin==2 && !deepks_equiv).

What changed

  • Data: per-atom magnetization-channel members dm_r_mag (rho_up - rho_dn), pdm_mag, gedm_mag, allocated only for nspin==2.
  • SCF correction: the projected density matrix and descriptor are computed for both the charge channel n = rho_up + rho_dn and the magnetization channel m = rho_up - rho_dn; the model is fed a 2-channel descriptor (1, nat, 2, des);
    one autograd pass fills gedm = dE/d(pdm^n) and gedm_mag = dE/d(pdm^m). The Hamiltonian correction is applied per spin as |alpha>(gedm +/- gedm_mag)<alpha|, toggling the operator's current_spin like Veff.
  • Forces / stress: the live getForceStress -> cal_f_delta path gains the magnetization contribution in a single pass (F = dm_r*nlm(gedm) + dm_r_mag*nlm(gedm_mag)), so the stress is still symmetrized and scaled exactly once.
  • Labels / band gap: every output/precalc label (dm_eig, gradvx, gvepsl, orbpre, vdpre/vdrpre, gevdm) carries a size-2 channel axis before the descriptor axis (channel 0 = charge, 1 = magnetization), matching the 2-channel
    model input; band-gap o_delta/orbpre are spin-resolved. Precalc kernels are unchanged (channel-agnostic); the interface calls them per channel and stacks (on rank 0, where the rank-0-assembled tensors are valid).
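
The per-spin combination gedm +/- gedm_mag follows from the chain rule applied to n = up + dn and m = up - dn. The sketch below checks this on a toy scalar energy; `energy`, `channel_grads`, and the scalar `p_up`/`p_dn` are hypothetical stand-ins for illustration, not the DeePKS model or its projected density matrices.

```python
def energy(n, m):
    """Toy stand-in for the model energy as a function of the two channels."""
    return 0.5 * n * n + 0.25 * m * m + 0.1 * n * m


def channel_grads(n, m):
    """Analytic channel gradients: gedm = dE/dn, gedm_mag = dE/dm."""
    gedm = n + 0.1 * m
    gedm_mag = 0.5 * m + 0.1 * n
    return gedm, gedm_mag


def per_spin_grads(p_up, p_dn):
    """Chain rule: dE/dp_up = gedm + gedm_mag, dE/dp_dn = gedm - gedm_mag."""
    n, m = p_up + p_dn, p_up - p_dn
    gedm, gedm_mag = channel_grads(n, m)
    return gedm + gedm_mag, gedm - gedm_mag


def energy_spin(p_up, p_dn):
    """Same toy energy, expressed in the spin-resolved variables."""
    return energy(p_up + p_dn, p_up - p_dn)


def central_diff(f, x, h=1e-6):
    """Central finite difference of a scalar function."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


p_up, p_dn = 0.7, 0.2
v_up, v_dn = per_spin_grads(p_up, p_dn)
fd_up = central_diff(lambda x: energy_spin(x, p_dn), p_up)
fd_dn = central_diff(lambda x: energy_spin(p_up, x), p_dn)
```

Here `v_up` and `v_dn` agree with the finite-difference derivatives with respect to the spin-resolved variables, which is the scalar analogue of applying |alpha>(gedm + gedm_mag)<alpha| to spin up and |alpha>(gedm - gedm_mag)<alpha| to spin down.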

Validation

  • tests/09_DeePKS: 313/313 pass under the standard np=4 harness (28 original + 3 new nspin=2 cases).
  • Non-magnetic CH4: nspin=2 == nspin=1 total energy to 2.5e-9 eV (the m->0 reduction) and forces to ~1e-5.
  • Polarized CH4 (nupdown=2): finite-difference force matches the analytic DeePKS force to 0.05% (the magnetization force term is correct).
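
A finite-difference force check of the kind described above can be sketched as follows. The energy surface here is a hypothetical one-dimensional Morse-like function, not a DeePKS energy, and the step size `h` is an assumption; only the comparison pattern (analytic force against a central difference, judged by relative error) mirrors the validation.

```python
import math


def toy_energy(x):
    """Hypothetical 1D energy surface (Morse-like); not a DeePKS energy."""
    return (1.0 - math.exp(-1.5 * (x - 1.1))) ** 2


def toy_force(x):
    """Analytic force F = -dE/dx for toy_energy."""
    e = math.exp(-1.5 * (x - 1.1))
    return -2.0 * (1.0 - e) * 1.5 * e


def fd_force(energy, x, h=1e-4):
    """Central finite-difference force, F ~ -(E(x+h) - E(x-h)) / (2h)."""
    return -(energy(x + h) - energy(x - h)) / (2.0 * h)


def relative_error(a, b):
    """Relative difference between two nonzero values."""
    return abs(a - b) / max(abs(a), abs(b))


x = 1.4
rel = relative_error(toy_force(x), fd_force(toy_energy, x))
force_matches = rel < 5e-4  # 0.05%, the agreement level quoted above
```

In a real check, the energies would come from separate SCF runs at displaced geometries, so the achievable agreement is also limited by SCF convergence.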

New tests (tests/09_DeePKS, 29-31)

The new tests use a small synthetic 2-channel demo model committed at Model_ProjOrb/model_nspin2_demo.ptg. It should be replaced with a real trained model in the future, which requires adapting the deepks-kit code:

  • 29_NO_GO_deepks_scf_nspin2 - gamma SCF + force + stress (net moment)
  • 30_NO_KP_deepks_scf_nspin2 - multi-k SCF (net moment)
  • 31_NO_GO_deepks_bandgap_nspin2 - output labels (all precalc) + band gap

Notes

The nspin=2 descriptor / precalc label layout places the channel axis (size 2) immediately before the descriptor axis, with channel 0 = charge and channel 1 = magnetization, so that deepks-kit can train a 2-channel model. A trained physical nspin=2 model is not included; the committed demo model is a small synthetic fixture for the integration tests only. The tests should be switched to a real model in the future.
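
The channel-before-descriptor layout can be illustrated with plain nested lists. The helper names `stack_channels` and `shape_of` are hypothetical, and the values are placeholders; the sizes nat=5 and des=18 are taken from the CH4 dm_eig shape reported in this PR.

```python
def stack_channels(desc_charge, desc_mag):
    """Stack two (nat, des) nested lists into a (nat, 2, des) nested list.

    Channel 0 holds the charge descriptor, channel 1 the magnetization one.
    """
    if len(desc_charge) != len(desc_mag):
        raise ValueError("both channels must cover the same atoms")
    stacked = []
    for row_n, row_m in zip(desc_charge, desc_mag):
        if len(row_n) != len(row_m):
            raise ValueError("both channels must have the same descriptor size")
        stacked.append([list(row_n), list(row_m)])
    return stacked


def shape_of(nested):
    """Shape of a regular, non-empty nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


nat, des = 5, 18  # CH4 sizes quoted in this PR
desc_charge = [[float(a * des + k) for k in range(des)] for a in range(nat)]
desc_mag = [[0.0] * des for _ in range(nat)]  # placeholder values
dm_eig = stack_channels(desc_charge, desc_mag)
```

On the training side, the equivalent NumPy operation would be `np.stack([charge, mag], axis=1)` on two (nat, des) arrays.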

Things to do:

  • Update deepks-kit to support two-channel (spin) data and training frame. (Not in this repository.)
  • Update the test cases, and add new cases where needed, using a model trained with deepks-kit.
  • Support for nspin=4.

ErjieWu and others added 6 commits June 4, 2026 17:47
Add charge/magnetization data members for collinear nspin=2 DeePKS without
changing the nspin=1 path:
  * dm_r_mag : (rho_up - rho_dn) real-space density matrix
  * pdm_mag  : magnetization projected density matrix
  * gedm_mag : dE/d(pdm_mag)
Allocated and freed only when nspin==2; for nspin==1 they stay null/empty and
the existing dm_r/pdm/gedm behave exactly as before. Scaffolding only, no
physics yet. tests/09_DeePKS: 284/284.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… model

Traditional (non-equivariant) nspin=2 collinear DeePKS now computes both the
charge and magnetization channels and evaluates a 2-channel model:
  * update_dmr gains nspin/mag options; setup_deepks fills dm_r_mag = rho_up - rho_dn
  * the operator builds pdm_mag and descriptor_mag, stacks charge+magnetization
    descriptors into model input (1, nat, 2, des), and autograds in one pass to
    fill gedm = dE/d(pdm) and gedm_mag = dE/d(pdm_mag)
The equivariant version stays a separate top-level branch; all magnetization
allocation/use is gated on nspin==2 && !deepks_equiv. The correction is not yet
applied per spin to the Hamiltonian (next step), so nspin=1 is unchanged.
tests/09_DeePKS: 284/284.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
SCF correction: calculate_HR takes the gedm to assemble from; for nspin=2
(traditional) the operator rebuilds V_delta_R per spin as
|alpha>(gedm +/- gedm_mag)<alpha| and toggles current_spin like Veff. The
label/output interface now feeds the 2-channel model for nspin=2 (previously
crashed a real 2-channel model post-SCF).

Forces/stress: the single live path is getForceStress -> cal_f_delta
(integral_part/ftable is dead code). cal_f_delta gains an optional
magnetization channel (dmr_mag, gedm_mag) and adds its contribution in the same
pass, so F_delta = dm_r * nlm(gedm) + dm_r_mag * nlm(gedm_mag) with correct
single symmetrize+weight on the stress.
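
The two-term form of F_delta is what a per-spin sum reduces to. The derivation below is a schematic sketch under two assumptions that are not stated in the commit message: that dm_r is the charge-channel density matrix rho_up + rho_dn, and that nlm(.) is linear in its gedm argument. Here g stands for gedm, g_m for gedm_mag, and s_sigma is the spin sign.

```latex
\begin{aligned}
F_\delta
  &= \sum_{\sigma \in \{\uparrow,\downarrow\}}
     \rho^{\sigma} \cdot \mathrm{nlm}\!\left(g + s_\sigma\, g_m\right),
     \qquad s_\uparrow = +1,\; s_\downarrow = -1 \\
  &= \left(\rho^{\uparrow} + \rho^{\downarrow}\right) \cdot \mathrm{nlm}(g)
   + \left(\rho^{\uparrow} - \rho^{\downarrow}\right) \cdot \mathrm{nlm}(g_m) \\
  &= \mathrm{dm\_r} \cdot \mathrm{nlm}(\mathrm{gedm})
   + \mathrm{dm\_r\_mag} \cdot \mathrm{nlm}(\mathrm{gedm\_mag})
\end{aligned}
```

Because both terms are accumulated in the same pass, the stress symmetrization and weighting act once on the summed result.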

Validation (synthetic 2-channel model):
  * non-magnetic CH4: nspin=2 == nspin=1 total energy to 2.5e-9 eV, forces to ~1e-5
  * polarized CH4 (nupdown=2): finite-difference force matches analytic to 0.05%
nspin=1 tests/09_DeePKS: 284/284 throughout. nspin=1 path unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
For nspin=2 (traditional), the interface now builds the magnetization-channel
descriptor before saving labels and reuses it for the 2-channel model. save_npy_d
gains an optional descriptor_mag: when present it writes dm_eig as (nat, 2, des)
- channel 0 = charge, channel 1 = magnetization - matching the 2-channel model
input, so deepks-kit can train on nspin=2 descriptors. nspin=1 writes (nat, des)
as before.

Validated: nspin=2 deepks_out_labels=1 on CH4 produces deepks_dm_eig.npy of
shape (5, 2, 18). nspin=1 tests/09_DeePKS: 284/284.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
For nspin=2 (traditional), every precalc/output label now carries the
charge/magnetization channel (channel 0 = charge, 1 = magnetization), inserted
before the descriptor dimension to match dm_eig (nat, 2, des):
  * gevdm computed for both pdm and pdm_mag
  * gradvx (force) and gvepsl (stress): gdmx/gdmepsl built from dm_r_mag, then
    gvx/gvepsl per channel, stacked
  * orbpre (bandgap): orbital_precalc per channel, stacked (cal_o_delta already
    sums spin)
  * vdpre / vdrpre (v_delta precalc) and gevdm export: per-channel + stacked
The precalc kernels are unchanged (channel-agnostic); only the interface calls
them per channel and stacks. deepks_spin2 hoisted to function scope.

Validated (synthetic 2-channel model, nupdown=2): label shapes gain the size-2
channel dim (dm_eig (5,2,18), gradvx (5,3,5,2,18), gvepsl (6,5,2,18),
orbpre (2,5,2,18), vdpre (2,33,33,5,2,18)); nspin=1 shapes unchanged.
tests/09_DeePKS (nspin=1): 284/284.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add spin-polarized DeePKS integration tests under tests/09_DeePKS, using a
small synthetic 2-channel (charge + magnetization) demo model committed as
Model_ProjOrb/model_nspin2_demo.ptg:
  * 29_NO_GO_deepks_scf_nspin2  : gamma SCF + force + stress (net moment)
  * 30_NO_KP_deepks_scf_nspin2  : multi-k SCF (net moment)
  * 31_NO_GO_deepks_bandgap_nspin2 : output labels (all precalc) + bandgap
All use nupdown to give a real moment so the magnetization channel is active.
tests/09_DeePKS now passes 313/313 (28 original + 3 new) under the standard
np=4 harness.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 4, 2026 17:05

Copilot AI left a comment

Pull request overview

This PR adds collinear spin-polarized (nspin=2) support to the traditional DeePKS LCAO path by introducing a charge (n = ρ↑ + ρ↓) and magnetization (m = ρ↑ - ρ↓) channel decomposition, while leaving the equivariant DeePKS path unchanged. It extends DeePKS SCF correction, force/stress contributions, and output-label/precalc plumbing to operate on a 2-channel descriptor layout and adds integration tests covering the new behavior.

Changes:

  • Add magnetization-channel DeePKS data structures (dm_r_mag, pdm_mag, gedm_mag) and compute a 2-channel descriptor for nspin=2 (traditional).
  • Apply spin-dependent Hamiltonian corrections by combining gedm ± gedm_mag, and include magnetization-channel contributions in the force/stress path.
  • Extend DeePKS output labels/precalc tensors to carry a size-2 channel axis (charge/magnetization) and add new nspin=2 regression tests.

Reviewed changes

Copilot reviewed 31 out of 32 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/09_DeePKS/CASES_CPU.txt | Registers new nspin=2 DeePKS CPU integration cases. |
| tests/09_DeePKS/29_NO_GO_deepks_scf_nspin2/STRU | Adds structure input for gamma-only nspin=2 SCF test. |
| tests/09_DeePKS/29_NO_GO_deepks_scf_nspin2/KPT | Adds K-points file for gamma-only nspin=2 SCF test. |
| tests/09_DeePKS/29_NO_GO_deepks_scf_nspin2/INPUT | Adds DeePKS nspin=2 SCF + force + stress inputs (gamma-only). |
| tests/09_DeePKS/29_NO_GO_deepks_scf_nspin2/README | Documents the new test case intent. |
| tests/09_DeePKS/29_NO_GO_deepks_scf_nspin2/result.ref | Adds reference outputs for the new nspin=2 SCF test. |
| tests/09_DeePKS/30_NO_KP_deepks_scf_nspin2/STRU | Adds structure input for multi-k nspin=2 SCF test. |
| tests/09_DeePKS/30_NO_KP_deepks_scf_nspin2/KPT | Adds multi-k K-points file for nspin=2 SCF test. |
| tests/09_DeePKS/30_NO_KP_deepks_scf_nspin2/INPUT | Adds DeePKS nspin=2 SCF inputs for multi-k system. |
| tests/09_DeePKS/30_NO_KP_deepks_scf_nspin2/README | Documents the new multi-k nspin=2 test. |
| tests/09_DeePKS/30_NO_KP_deepks_scf_nspin2/result.ref | Adds reference outputs for the new multi-k test. |
| tests/09_DeePKS/31_NO_GO_deepks_bandgap_nspin2/STRU | Adds structure input for nspin=2 bandgap + label outputs test. |
| tests/09_DeePKS/31_NO_GO_deepks_bandgap_nspin2/KPT | Adds K-points file for gamma-only nspin=2 bandgap test. |
| tests/09_DeePKS/31_NO_GO_deepks_bandgap_nspin2/INPUT | Adds DeePKS nspin=2 bandgap + output-label inputs. |
| tests/09_DeePKS/31_NO_GO_deepks_bandgap_nspin2/README | Documents the nspin=2 bandgap/labels case. |
| tests/09_DeePKS/31_NO_GO_deepks_bandgap_nspin2/result.ref | Adds reference outputs for nspin=2 labels/bandgap. |
| source/source_lcao/setup_deepks.cpp | Updates DeePKS setup to build the magnetization-channel real-space DM when nspin=2. |
| source/source_lcao/module_operator_lcao/deepks_lcao.h | Updates DeePKS HR builder API to accept a gedm pointer for spin-dependent builds. |
| source/source_lcao/module_operator_lcao/deepks_lcao.cpp | Implements 2-channel (charge/mag) SCF correction and per-spin HR rebuild logic. |
| source/source_lcao/module_deepks/LCAO_deepks.h | Adds magnetization-channel DeePKS members (dm_r_mag, pdm_mag, gedm_mag). |
| source/source_lcao/module_deepks/LCAO_deepks.cpp | Allocates and initializes magnetization-channel PDM/DMR/gedm storage for nspin=2. |
| source/source_lcao/module_deepks/LCAO_deepks_io.h | Extends descriptor .npy writer interface to optionally include magnetization-channel descriptor. |
| source/source_lcao/module_deepks/LCAO_deepks_io.cpp | Writes dm_eig as (nat, 2, des) when magnetization-channel descriptor is provided. |
| source/source_lcao/module_deepks/LCAO_deepks_interface.cpp | Plumbs 2-channel labels/precalc outputs and bandgap-related tensors for nspin=2. |
| source/source_lcao/module_deepks/deepks_pdm.h | Extends update_dmr API with nspin/magnetization flags. |
| source/source_lcao/module_deepks/deepks_pdm.cpp | Implements magnetization-channel DMR via sign change for spin-down DMK blocks. |
| source/source_lcao/module_deepks/deepks_force.h | Extends DeePKS force/stress kernel API with optional magnetization-channel inputs. |
| source/source_lcao/module_deepks/deepks_force.cpp | Adds magnetization-channel contributions into force/stress accumulation. |
| source/source_lcao/module_deepks/deepks_basic.h | Extends cal_edelta_gedm API to support optional magnetization-channel descriptor/PDM and output gradients. |
| source/source_lcao/module_deepks/deepks_basic.cpp | Builds 2-channel model input and extracts gradients for both charge and magnetization channels. |
| source/source_lcao/FORCE_STRESS.cpp | Passes magnetization-channel DM/gedm into DeePKS force/stress evaluation for nspin=2. |

Comment thread source/source_lcao/module_deepks/deepks_basic.cpp
Comment thread source/source_lcao/module_deepks/deepks_basic.cpp Outdated
Comment thread source/source_lcao/module_deepks/LCAO_deepks_interface.cpp Outdated
mohanchen added the Machine Learning (Issues related to the DeePKS), Refactor (Refactor ABACUS codes), and Features Needed (The features are indeed needed, and developers should have sophisticated knowledge) labels Jun 5, 2026
ErjieWu and others added 2 commits June 5, 2026 09:38
The cal_gvx/cal_gvepsl/cal_orbital_precalc/cal_v_delta_precalc helpers
assemble their output tensor only on the root rank; on other ranks the
returned tensor is undefined. Stacking the charge and magnetization
channels unconditionally therefore aborts on non-root ranks with
"tensor does not have a device". Restrict each two-channel torch::stack
to rank 0, matching where the labels are actually written.
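
The guard described in this commit can be sketched in a few lines. This is a Python analogue with hypothetical names: `None` stands in for the undefined tensor that the helpers return off the root rank, and a plain list stands in for torch::stack.

```python
def stack_on_root(rank, charge, mag):
    """Stack the two channel results only on rank 0.

    Off-root ranks receive undefined results (None here), so they skip
    the stack entirely instead of touching them.
    """
    if rank != 0:
        return None
    if charge is None or mag is None:
        raise RuntimeError("root rank must hold both channel results")
    return [charge, mag]  # channel 0 = charge, channel 1 = magnetization


root_result = stack_on_root(0, [1.0, 2.0], [0.1, 0.2])
other_result = stack_on_root(1, None, None)
```

Restricting the stack to the rank that writes the labels keeps the other ranks from ever dereferencing an undefined result.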

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
torch::autograd::grad(..., allow_unused=true) returns an undefined
gradient for any PDM block the model output does not depend on, which is
now reachable because the magnetization-channel PDM blocks are appended
to grad_inputs. Calling .accessor() on such a tensor aborts; check
.defined() and write zeros for missing gradients before accessing.
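
The defined-or-zero pattern can be sketched as follows. This is a Python analogue with a hypothetical helper name: `None` plays the role of an undefined gradient (in PyTorch's Python API, `torch.autograd.grad(..., allow_unused=True)` returns `None` for unused inputs), and the block sizes are assumed to be known from the corresponding PDM blocks.

```python
def fill_missing_gradients(grads, sizes):
    """Replace undefined gradients with zero blocks of the expected size.

    grads: one entry per PDM block, either a list of floats or None.
    sizes: expected length of each block.
    """
    filled = []
    for grad, size in zip(grads, sizes):
        if grad is None:
            # The output does not depend on this block: its gradient is zero.
            filled.append([0.0] * size)
        else:
            filled.append(list(grad))
    return filled


gedm_blocks = fill_missing_gradients([[1.0, 2.0], None], [2, 3])
```

Writing explicit zeros is consistent with the mathematics, since an input the output does not depend on has zero gradient.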

Also re-indent the deepks_v_delta == -2 magnetization block to match the
surrounding scope.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Copilot AI left a comment

Pull request overview

Copilot reviewed 31 out of 32 changed files in this pull request and generated 2 comments.

Comment thread source/source_lcao/module_deepks/deepks_basic.cpp
Comment thread source/source_lcao/module_operator_lcao/deepks_lcao.cpp Outdated
ErjieWu (Collaborator, Author) commented Jun 5, 2026

The current implementation simply adds a second group of variables to handle the magnetization term (up minus down), which is why many variables are named *_mag. In principle they could be merged with their counterparts without the _mag suffix, but that has not been done yet. Should we combine them?

ErjieWu and others added 6 commits June 5, 2026 11:28
Group each magnetization-channel argument directly after its charge
counterpart and pass it the same way as the base, so the paired
quantities stay together and similarly-named parameters no longer use
different calling conventions (which was easy to misuse):

  cal_edelta_gedm: descriptor/descriptor_mag, pdm/pdm_mag, gedm/gedm_mag
  cal_f_delta:     dmr/dmr_mag, gedm/gedm_mag
  save_npy_d:      descriptor/descriptor_mag

descriptor_mag and pdm_mag now take const& like their bases; the
two-channel path is selected by descriptor_mag being non-empty. As the
paired arguments are no longer trailing they carry no defaults, so the
nspin=1 / equivariant call sites pass empty containers and nullptr
explicitly.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Move the magnetization-channel arguments into a single optional group at
the end of each signature, with their charge counterparts grouped
immediately before. The _mag group carries defaults (empty container /
nullptr) so nspin=1 and equivariant call sites simply omit them instead
of passing explicit empty/null placeholders.

  cal_edelta_gedm(..., descriptor, pdm, gedm, descriptor_mag, pdm_mag, gedm_mag)
  cal_f_delta(..., dmr, gedm, dmr_mag, gedm_mag)
  save_npy_d(..., descriptor, descriptor_mag)
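
The calling convention above (charge arguments first, then an optional trailing magnetization group with defaults) can be illustrated with a simplified Python analogue. `cal_f_delta_sketch` is hypothetical: the real cal_f_delta is C++, takes more arguments, and contracts with nlm terms rather than a plain dot product.

```python
def cal_f_delta_sketch(dmr, gedm, dmr_mag=None, gedm_mag=None):
    """Toy scalar contraction with an optional magnetization group.

    nspin=1 and equivariant callers omit the trailing group; nspin=2
    callers pass both magnetization arguments together.
    """
    if (dmr_mag is None) != (gedm_mag is None):
        raise ValueError("dmr_mag and gedm_mag must be passed together")
    total = sum(d * g for d, g in zip(dmr, gedm))
    if dmr_mag is not None:
        # Magnetization contribution added in the same pass.
        total += sum(d * g for d, g in zip(dmr_mag, gedm_mag))
    return total


f_nspin1 = cal_f_delta_sketch([1.0, 2.0], [0.5, 0.25])
f_nspin2 = cal_f_delta_sketch([1.0, 2.0], [0.5, 0.25], [1.0, 0.0], [0.2, 0.3])
```

Keeping the optional group at the end lets existing single-channel call sites stay unchanged.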

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
No functional change; bring the nspin=2 additions into line with the
project .clang-format (Microsoft style, one argument per line, expanded
single-line braces).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two improvements to the nspin=2 path in contributeHR, no functional
change:

- calculate_HR now folds the per-spin combination gedm + sigma*gedm_mag
  into the small per-atom gedm copy it already builds, taking the
  magnetization channel and sign as optional arguments. This removes the
  inlmax x pdm_size temporary buffer (and its inlmax per-call heap
  allocations) that was rebuilt on every contributeHR call during SCF.

- The magnetization-channel cal_pdm/cal_descriptor is hoisted into its
  own block parallel to the charge-channel preparation, so the flow reads
  prepare-charge -> prepare-mag (nspin=2) -> evaluate-model, and the
  traditional model evaluation is a single cal_edelta_gedm call (an empty
  descriptor_mag selects the single-channel path for nspin=1).
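
Folding the spin combination into the per-atom copy can be sketched as below. The function name and the flat-list representation of one atom's gedm block are assumptions for illustration; the sign convention (+1 for spin up, -1 for spin down) follows from m = rho_up - rho_dn.

```python
def per_atom_gedm(gedm_block, gedm_mag_block=None, sign=1.0):
    """Copy one atom's gedm block, folding in sign * gedm_mag if given.

    With no magnetization block this is the plain nspin=1 copy, so no
    full-size temporary buffer is needed in either case.
    """
    if gedm_mag_block is None:
        return list(gedm_block)
    return [g + sign * gm for g, gm in zip(gedm_block, gedm_mag_block)]


block = [1.0, 2.0]
block_mag = [0.5, -0.5]
spin_up = per_atom_gedm(block, block_mag, sign=+1.0)
spin_dn = per_atom_gedm(block, block_mag, sign=-1.0)
plain = per_atom_gedm(block)
```

The combination is formed only for the block currently being assembled, which is what removes the per-call allocation described above.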

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The DeePKS contributeHR timer was started and stopped inside the
get_hr_cal() guard. After the nspin=2 per-spin V_delta_R build was moved
out of that guard, the timer ended before that work ran (and was not
started at all on the second-spin call). Start the timer at function
entry and stop it at the end so it brackets all work in every call,
matching Veff::contributeHR.
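
The timer fix amounts to bracketing the whole function rather than one guarded branch. A minimal Python sketch of that structure, using a hypothetical `timed` context manager and placeholder work items in place of the real ABACUS timer and operator code:

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, totals):
    """Accumulate wall time for everything executed inside the block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        totals[label] = totals.get(label, 0.0) + time.perf_counter() - start


def contribute_hr_sketch(need_hr_calc, totals, log):
    """Timer starts at entry and stops at exit, on every call."""
    with timed("contributeHR", totals):
        if need_hr_calc:
            log.append("evaluate model")  # guarded work (first call only)
        log.append("build per-spin V_delta_R")  # runs on every call


totals, log = {}, []
contribute_hr_sketch(True, totals, log)   # first-spin call
contribute_hr_sketch(False, totals, log)  # second-spin call
```

With the timer outside the guard, the second-spin call is timed even though it skips the guarded branch.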

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The deepks_fpre/spre labels in case 31 are large-magnitude sums (~113,
~47) whose last digits vary at the ~1e-7 level run to run, due to the
nondeterministic OpenMP critical accumulation in calculate_HR. That
occasionally exceeds the default 1e-7 tolerance. Add a per-case
threshold of 1e-5; the small-magnitude labels (edelta, deltas) are
bit-stable and remain tightly checked, and a real regression shifts
them far more than 1e-5.
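
Floating-point addition is not associative, so an accumulation whose order changes from run to run can change the last digits of the sum. The sketch below shows the effect on three numbers and then applies an absolute-threshold comparison; the label values 113.0 and 113.0000002 are made-up illustrations of a ~1e-7 level difference, and treating the harness threshold as an absolute difference is an assumption.

```python
def within_threshold(value, reference, threshold):
    """Absolute-difference comparison against a reference value."""
    return abs(value - reference) <= threshold


# Same three terms, two accumulation orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
order_matters = left_to_right != right_to_left

# Hypothetical large-magnitude label differing at the ~1e-7 level.
reference, value = 113.0, 113.0000002
passes_default = within_threshold(value, reference, 1e-7)
passes_per_case = within_threshold(value, reference, 1e-5)
```

The looser threshold only applies to the case that sets it, so the other cases keep the default tolerance.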

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Labels

Features Needed (The features are indeed needed, and developers should have sophisticated knowledge); Machine Learning (Issues related to the DeePKS); Refactor (Refactor ABACUS codes)

4 participants