This is a collection of changes for enabling sane handling of Numeric columns
in SQL, JDBC (and Kafka). It has potential wider applicability for anyone
needing to handle Numeric data types.
The underlying SQL Standard for Numeric and Decimal columns is restrictive and flawed, to the point it is ignored by around half of databases examined, and not fully followed by (m)any other databases.
Leading the charge of those discarding the standard are PostgreSQL, Oracle, and SQLite. Many others, including MySQL, ignore parts of the standard. The only database that seems to follow the standard precisely is MS SQL Server!
For the databases that do not follow the standard, they tend to ignore some of
the restrictions on scale, though that is not the only way that databases
deviate from the specification.
More generally the result of this is that standard interfaces to databases are unnecessarily difficult.
There is a potential equivalence between the way strings with defined or
undefined lengths are stored and represented and the way Numerics could be
represented.
JDBC specifies two methods by which a client can get information about the
precision and scale of a Numeric column.
First, using DatabaseMetadata.getColumns() to get a ResultSet that
summarises data about a table. This approach is ambiguous as to how undefined
precision and scale should be represented. The columns in the ResultSet
are int but are not explicitly NOT NULL.
Second, using ResultSetMetadata.getPrecision() and
ResultSetMetadata.getScale() to get an int for the scale of a column in the
ResultSet. This is explicitly an int and there is no mechanism by which a
NULL can be returned.
The combination of these means that JDBC requires that information about
scale (and potentially precision) is lost when it is used to interact with
databases.
The meaning of precision and scale are NOT directly equivalent to the
mathematical concepts of "significant digits" and "decimal places", though
similar. In maths, "significant digits" are reasonably clearly defined
but do not entirely map onto the SQL standard precision. Similarly, "decimal
places" are not the same as scale.
precision is the max digits, including ones that are
mathematically insignificant. Therefore the concepts of precision and
"significant digits" are NOT the same.
When expressing a number as N * R^+-E where N is the number as an integer
and R is the radix (usually 2 or 10), scale is NOT equivalent to the
exponent E.
In SQL the scale is restricted to positive values less than or equal to
precision. In those cases, the scale is the number of digits of the integer
number that are after the decimal point. This restriction to only positive
values, and restriction relative to precision means the Mathematical and SQL
definitions are not quite equivalent.
Oracle allows scale outside the standard range, and allows (some) negative
values, and so is more like the mathematical definition of an exponent (but not
the same).
While in most cases the precision and scale directly inform the display
formatting of a value from a column, this muddies the waters and complicates the
relationship between the mathematical definitions and the SQL definitions. I am
ignoring the display implications of the values and leaving that to each
implementation. The focus is therefore on the mathematical value of what is
stored.
The solution is a multi-layered set of clarifications and changes to the meaning
of precision and scale for Numeric (and Decimal) columns in JDBC and a
generalisation of the SQL specification.
For the approach reconcile these differences, please see the individual documents for each area.