-
-
Notifications
You must be signed in to change notification settings - Fork 12
Avoid hangs and debug asserts on invalid parameters for Zipf #41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
53daaa7 to
ca480a8
Compare
ec0bf49 to
227e0dd
Compare
|
I think |
It should be OK. The range of possible inputs for |
9b7a6e4 to
924ee37
Compare
The generalized Zipf distribution is not well defined when its normalization constant is infinity (when n is inf and s is <= 1). Also, when s is inf, there is a reasonable limiting distribution, which can be produced with a small special case in Zipf::new.
924ee37 to
9a8000b
Compare
dhardy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The case where n.is_finite() && s = inf does, as you say, converge to a result of 1. Is this reasonable, given that infinities often occur due to loss of precision? That depends, but this is certainly not the first case where we have decided to adopt this approach, so it seems acceptable. (If for certain applications this is deemed unacceptable we could look at adding a feature flag for stricter bounds checking.)
The case where n = s = inf is more dubious; I think it would be more reasonable to consider this IllDefined.
Otherwise this PR is acceptable in my opinion.
| if !(n >= F::one()) { | ||
| return Err(Error::NTooSmall); | ||
| } | ||
| if n.is_infinite() && s <= F::one() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if n.is_infinite() && s <= F::one() { | |
| if n.is_infinite() && (s.is_infinite() || s <= F::one()) { |
CHANGELOG.mdentrySummary
The Zipf distribution sampler could panic (at
debug_assert!(t > F::zero());) when given invalid parameters (likes = inf); or hang in an infinite loop when sampling. This change rejects parameter combinations for which the generalized Zipf distribution is not well defined, and slightly modifies Zipf::new() so that sampling avoids the debug assert and produces the correct value whens = inf.Motivation
This another simple panic / hang issue that I found by fuzzing input parameters.
Details
The normalization constant (sum from i=1 to n of 1/i^s) of the generalized Zipf distribution needs to be positive and finite for the distribution to be well defined. if
s <= 1andn = inf, then the sum in the constant diverges.If
s = inf, then each term of the constant is 0, so the specific formula in Wikipedia will not work; however, the Zipf distribution approaches the distribution concentrated oni=1ass -> inf. The previous implementation computed ananin Zipf::new which triggered the debug assertion (or an infinite loop), but a small change avoids both.(Edit: originally this PR rejected the case
s = inf, but after posting it I changed my mind; in general, if a limiting distribution exists and is close enough to the distribution at the largest finite parameters, then producing that limiting distribution is safe and avoids having the library user write a special case to produce effectively that limiting distribution.)