feat: Support float 32 and float 64 in approx_distinct#23084
Conversation
2e71c6c to
58dbfa1
Compare
48921b9 to
c7599d2
Compare
c7599d2 to
d5f203c
Compare
|
@Jefffrey Thanks for the review! |
|
What is the usecase for doing (bitwise) distinct for floats? |
Would bitwise distinct fundamentally different from a normal (mathematical) distinct ? My intention was to just support float types here. |
|
The corner cases for floats such as -0.0, 0.0 and NaN are handled correctly because of this #22835. |
|
Ok, from reading up on this subject this whole pr is not a good idea. |
I am sorry to be a wet blanket @mkleen -- and I apologize if I have wasted any of your time. Thank you for all your contributions so far. They have made the project better. |
No worries, this is for what reviews are for. Thank you for pointing out I was on the wrong track! |
|
Just to make sure I have the same understanding: is the concern that it is ambiguous whether nearly contiguous float values should be treated as distinct? For example: -- Should these be treated as distinct or equal?
3.141590, 3.141591, 3.141592I think the +-0.0 corner cases are not the key issue here, since they only make the result differ by a small constant and can also be special-cased. I think it is still valuable to add a comment, or improve the error message, to explain why |
Yeah -- and more specifically I think the idea is that since floating point values are typically not precise (they are subject to rounding error, etc) semantically trying to say "what are the distinct set of these values that are not exact" doesn't make much sense
Agreed |
Which issue does this PR close?
approx_distinctfunction #22989 but does not close it. More types are coming.Rationale for this change
Float32andFloat64were not supported inapprox_distinctyet.Float32andFloat64can be hashed generically viacreate_hashesinHLLAccumulatorandHllGroupsAccumulator, exactly like the string/binary types over their bit patterns.Float32andFloat64don't implementHashon their native type so they can't useNumericHLLAccumulator.What changes are included in this PR?
HLLAccumulatorandHllGroupsAccumulatorto supportFloat32andFloat64.approx_distinct.rstandaggregate.slt.Are these changes tested?
Yes.
Are there any user-facing changes?
Yes,
approx_distinctsupports nowFloat32andFloat64but no breaking changes.