Add geometric distribution (lpmf, cdf, lcdf, lccdf, rng)#3299
Add geometric distribution (lpmf, cdf, lcdf, lccdf, rng)#3299yasumorishima wants to merge 14 commits intostan-dev:developfrom
Conversation
Implements the geometric distribution as requested in stan-dev#3098. The geometric distribution models the number of failures before the first success, with PMF: P(n|theta) = theta * (1-theta)^n where n in {0, 1, 2, ...} and theta in (0, 1]. This uses the number-of-failures parameterization, consistent with the geometric distribution being a special case of the negative binomial distribution with r=1 (neg_binomial(1, theta/(1-theta))). Verified: geometric_lpmf(n, theta) == neg_binomial_lpmf(n, 1, theta/(1-theta)) for all tested values. Includes: - geometric_lpmf: log probability mass function with autodiff - geometric_cdf: cumulative distribution function with autodiff - geometric_lcdf: log CDF with autodiff - geometric_lccdf: log complementary CDF with autodiff - geometric_rng: random number generation via inverse CDF method - Unit tests for all functions including distribution checks - CDF test fixture for gradient verification Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Guard gradient computations against theta=0 division by zero in geometric_lpmf, geometric_cdf, and geometric_lcdf - Fix include ordering in prob.hpp (gaussian before geometric) - Fix signed/unsigned comparison in geometric_rng loop index Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Hi @yasumorishima ! Is there a reason we can't just have direct calls to the negative binomial function for all of these? So these would just be calls to neg_binomial that fix |
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
Refactor lpmf, cdf, lcdf, lccdf, and rng to delegate to the corresponding neg_binomial functions with alpha=1 and beta=theta/(1-theta), as suggested in review. Handle theta=1 (where beta diverges) with early returns. Add autodiff test HPPs for lpmf, lcdf, and lccdf.
f822a16 to
84319ae
Compare
|
Hi @SteveBronder, thanks for the review! Good point — I've refactored all five functions (lpmf, cdf, lcdf, lccdf, rng) The only special case is θ = 1, where β diverges. Each function handles this I also added autodiff test HPPs for lpmf, lcdf, and lccdf — all 1,412 tests Let me know if you'd like any further changes! |
Jenkins Console Log Machine informationNo LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 20.04.3 LTS Release: 20.04 Codename: focalCPU: G++: Clang: |
| template <bool propto, typename T_n, typename T_prob, | ||
| require_all_not_nonscalar_prim_or_rev_kernel_expression_t< | ||
| T_n, T_prob>* = nullptr> | ||
| inline return_type_t<T_prob> geometric_lpmf(const T_n& n, const T_prob& theta) { | ||
| using T_n_ref = ref_type_t<T_n>; | ||
| using T_prob_ref = ref_type_t<T_prob>; | ||
| static constexpr const char* function = "geometric_lpmf"; | ||
| check_consistent_sizes(function, "Outcome variable", n, | ||
| "Success probability parameter", theta); | ||
| if (size_zero(n, theta)) { | ||
| return 0.0; | ||
| } | ||
|
|
||
| T_n_ref n_ref = n; | ||
| T_prob_ref theta_ref = theta; | ||
| check_nonnegative(function, "Outcome variable", n_ref); | ||
| check_bounded(function, "Success probability parameter", value_of(theta_ref), | ||
| 0.0, 1.0); | ||
|
|
||
| // theta = 1 => deterministic: P(0) = 1, P(n>0) = 0 | ||
| // Cannot delegate since beta = theta / (1 - theta) diverges | ||
| scalar_seq_view<T_n_ref> n_vec(n_ref); | ||
| scalar_seq_view<T_prob_ref> theta_vec(theta_ref); | ||
| const size_t max_sz = max_size(n_ref, theta_ref); | ||
| bool all_theta_one = true; | ||
| for (size_t i = 0; i < max_sz; i++) { | ||
| if (value_of(theta_vec[i]) == 1.0) { | ||
| if (n_vec[i] > 0) { | ||
| return negative_infinity(); | ||
| } | ||
| } else { | ||
| all_theta_one = false; | ||
| } | ||
| } | ||
| if (all_theta_one) { | ||
| return 0.0; | ||
| } | ||
|
|
||
| // geometric(theta) = neg_binomial(1, theta / (1 - theta)) | ||
| if constexpr (is_stan_scalar_v<T_prob>) { | ||
| const auto beta = theta_ref / (1.0 - theta_ref); | ||
| return neg_binomial_lpmf<propto>(n_ref, 1, beta); | ||
| } else if constexpr (is_std_vector_v<T_prob>) { | ||
| std::vector<value_type_t<T_prob>> beta; | ||
| beta.reserve(stan::math::size(theta)); | ||
| for (size_t i = 0; i < stan::math::size(theta); i++) { | ||
| beta.push_back(theta_vec[i] / (1.0 - theta_vec[i])); | ||
| } | ||
| return neg_binomial_lpmf<propto>(n_ref, 1, beta); | ||
| } else { | ||
| const auto beta = elt_divide(theta_ref, subtract(1.0, theta_ref)); | ||
| return neg_binomial_lpmf<propto>(n_ref, 1, beta); | ||
| } | ||
| } |
There was a problem hiding this comment.
So looking at this I would prefer we did one of two styles
- Completely delegate to
neg_binomial_*. You do not need thestd::vectorhere to accumulate the non 1 values.
template <typename T_n, typename T_prob>
inline return_type_t<T_prob> geometric_cdf(const T_n& n, const T_prob& theta) {
using T_n_ref = ref_type_t<T_n>;
using T_prob_ref = ref_type_t<T_prob>;
static constexpr const char* function = "geometric_cdf";
check_consistent_sizes(function, "Outcome variable", n,
"Success probability parameter", theta);
decltype(auto) n_ref = as_array_or_scalar(to_ref(n));
decltype(auto) theta_ref = as_array_or_scalar(to_ref(theta));
check_bounded(function, "Success probability parameter", value_of(theta_ref),
0.0, 1.0);
if constexpr (is_stan_scalar_v<T_n>) {
if (n_ref < 0) {
return 0.0;
}
} else {
const bool is_any_nonpositive = (n_ref.array() < 0).any();
if (is_any_nonpositive) {
return 0.0;
}
}
if constexpr (is_stan_scalar_v<T_prob>) {
if (theta_ref == 1.0) {
return 1.0;
} else {
return neg_binomial_cdf(n, 1, theta_ref / (1.0 - theta_ref));
}
} else {
return theta_ref.binaryExpr(n_ref, [](const auto& theta_, const auto& n_) {
if (theta_ == 1.0) {
return return_type_t<T_prob>{1.0};
} else {
return neg_binomial_cdf(n_, 1, elt_divide(theta_, subtract(1.0, theta_)));
}
}).prod();
}
}- A full autodiff impl with the adjoint calculations
template <typename T_n, typename T_prob>
inline return_type_t<T_prob> geometric_cdf(const T_n& n,
const T_prob& theta) {
using T_partials_return = partials_return_t<T_prob>;
using T_n_ref = ref_type_t<T_n>;
using T_prob_ref = ref_type_t<T_prob>;
static constexpr const char* function = "geometric_cdf";
check_consistent_sizes(function, "Outcome variable", n,
"Success probability parameter", theta);
if (size_zero(n, theta)) {
return 1.0;
}
auto n_arr = as_array_or_scalar(to_ref(n));
auto theta_arr = as_array_or_scalar(to_ref(theta));
// Need to check theta > 0 && theta <= 1.0. Bounded checks >= and <=
check_bounded(function, "Success probability parameter", value_of(theta_arr),
std::numeric_limits<double>::min(), 1.0);
auto ops_partials = make_partials_propagator(theta_arr);
if constexpr (is_stan_scalar_v<T_n>) {
if (n_arr < 0) {
return ops_partials.build(0.0);
}
} else {
if (any(n_arr < 0)) {
return ops_partials.build(0.0);
}
}
auto log1m_theta = log1m(theta_arr);
auto P_i = -expm1((n_arr + 1.0) * log1m_theta);
auto P = prod(P_i);
if constexpr (is_autodiff_v<T_prob>) {
// d/dtheta [1 - (1 - theta)^(n + 1)]
// = (n + 1) * (1 - theta)^n
//
// The select handles theta == 1, n == 0, where the derivative is 1
// and the log form would otherwise run into 0 * -inf.
auto dP_dtheta = select(
n_arr == 0, 1.0, (n_arr + 1.0) * exp(n_arr * log1m_theta));
auto theta_partials = P * elt_divide(dP_dtheta, P_i);
if constexpr (is_stan_scalar_v<T_prob>) {
partials<0>(ops_partials) = sum(theta_partials);
} else {
partials<0>(ops_partials) = theta_partials;
}
}
return ops_partials.build(P);
}Replace neg_binomial_* delegation with a full autodiff implementation per PR stan-dev#3299 review. All four functions compute analytic partials on Eigen arrays directly, with explicit boundary handling for theta=0, theta=1, n=0, n=INT_MAX. Tests: 180/180 autodiff + 8/8 prim pass.
|
Thanks for the detailed review @SteveBronder! Went with option 2 — all four functions now use Boundary handling: short-circuit Tests: 180/180 autodiff distribution tests (var/vv/ffv) + 8/8 prim unit tests pass locally. Happy to iterate. |
Address CodeRabbit review on PR stan-dev#3299: with check_bounded(theta, 0.0, 1.0) (inclusive), passing theta = 0 produced -inf logp but the autodiff path still evaluated inv(theta_arr), poisoning the partial with +inf. Replace the inclusive bound with check_positive_finite + check_less_or_ equal in lpmf/cdf/lcdf/lccdf to match the documented domain (0, 1] and the existing geometric_rng pattern. With the tightened bound the theta == 0 short-circuits in cdf/lcdf become unreachable; remove them. Add an EXPECT_THROW unit test covering theta = 0 across all four distribution functions to lock in the new behavior.
|
Addressed @coderabbitai's review on In commit 31dcbd6:
Local results: prim unit test 9/9 PASS (incl. the new throw test), all 12 generated autodiff distribution test binaries PASS. |
geometric_lccdf: - Drop the any(n_arr < 0) early return that silently dropped contributions from positive n_i in mixed-sign vectors. P(N > n_i) = 1 for n_i < 0 is the multiplicative identity (log 0), not an absorbing element, so the term has to remain in the sum. Per-element select(n < 0, 0, ...) keeps both value and partials paths correct. - Replace the any(theta == 1.0) blanket guard with a scalar_seq_view per-element loop, so theta=1 only forces -inf when paired with n_i >= 0 at the same index. Previously theta=[0.5, 1.0], n=[2, -1] returned -inf instead of 3*log(0.5). geometric_lpmf: - Add the is_stan_scalar_v<T_prob> guard around partials<0> assignment so a scalar theta paired with a vector n collapses to a scalar partial via sum(...), matching cdf/lcdf/lccdf. Tests: lccdf_vectorized_mixed_sign and lccdf_vectorized_theta_one_alignment lock in both contract fixes.
|
Round 2 of CodeRabbit findings + an alignment bug Opus surfaced during self-review (commit
Tests (
prim 11/11 PASS, autodiff distribution lccdf v/vv/ffv 90/180/90 PASS. |
Summary
Implements the geometric distribution as requested in #3098.
geometric_lpmf(n, θ) == neg_binomial_lpmf(n, 1, θ/(1−θ))— verified numericallyFiles added/modified
stan/math/prim/prob/geometric_lpmf.hpp— log PMF with autodiff gradientsstan/math/prim/prob/geometric_cdf.hpp— CDF with autodiff gradientsstan/math/prim/prob/geometric_lcdf.hpp— log CDF with autodiff gradientsstan/math/prim/prob/geometric_lccdf.hpp— log CCDF with autodiff gradientsstan/math/prim/prob/geometric_rng.hpp— RNG via inverse CDF methodstan/math/prim/prob.hpp— include additions (alphabetical order)test/unit/math/prim/prob/geometric_test.cpp— unit tests (8 tests)test/prob/geometric/geometric_cdf_test.hpp— CDF gradient test fixtureTests
errorCheck— validates parameter constraints via test rigdistributionCheck— RNG samples match PMF (chi-squared test)error_check— domain errors for invalid θlpmf_values— hand-calculated PMF values including θ=1 edge casecdf_values— hand-calculated CDF valuescdf_below_support— returns 0 / -inf for n < 0lccdf_values— hand-calculated CCDF valuesrng_theta_one— θ=1 always returns 0Notes
Closes #3098
🤖 Generated with Claude Code