The BX floating point text format

The current SI is defined by published numerical values of physical constants in SI units. These numbers are expressed, as usual in print, as decimal numbers in scientific notation. They were (I believe) most likely produced on computing devices using something like the IEEE754 floating point format. The latter representation is often a slightly different number, due to the difference between the bases of the fractions (10 and 2). It is therefore not entirely clear which of the two (sometimes) distinct numbers is actually being specified. These differences are currently smaller than any actual experimental precision, but not by very much in some cases.

On closer inspection, the problem is not with the use of decimal digits in the representation, but rather the base of the exponent. To back up a step, scientific notation represents a rational number of a certain restricted sort. The number may be (in principle) any integer, but if the number has a denominator greater than one, that denominator must in fact be a power of 10. Computer floating point works exactly the same way except that the denominator must be a power of 2. Each format therefore has numbers it can't represent in any finite length mantissa, and the two sets do not coincide. The problem is worse because even an integer may be rounded off to a fixed precision.

The simplest thing to do, it seems to me, is to print the numbers in a format using decimal digits with a binary exponent - bx. The bx basically substitutes for the e in typical "%e" type notation. I have functions in C, Python, and Perl which are suitable for plugging into printf/scanf type formats (though I don't know what letter to use).

1bx1 means 1 times 2 to the 1, which is 2. 1bx5 is 32, as is 32bx0, but you could also say 2bx4 etc. But the way that bx usually works is to show you how the number is actually represented (in slightly mangled form) in the native floating point format.
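
The notation above can be tried out with a small sketch. This is not the bx.py from the distribution: the function names bx_to_fraction and double_to_bx are made up for illustration, and the assumption is simply that a bx string is an integer mantissa, the letters bx, and an integer power of 2. Only normal, nonzero, finite doubles are handled here.

```python
import struct
from fractions import Fraction

def bx_to_fraction(text):
    """Exact value of a string 'MbxE' (integer M times 2 to the E)."""
    mantissa, exponent = text.strip().split("bx")
    return Fraction(int(mantissa)) * Fraction(2) ** int(exponent)

def double_to_bx(x):
    """bx form of a normal double, read from its bit pattern (hypothetical helper)."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = "-" if bits >> 63 else ""
    biased = (bits >> 52) & 0x7FF            # 11-bit biased exponent field
    fraction = bits & ((1 << 52) - 1)        # 52 stored mantissa bits
    if biased == 0 or biased == 0x7FF:
        raise ValueError("zero, denormal, inf and nan are left out of this sketch")
    mantissa = fraction | (1 << 52)          # restore the implicit leading 1
    return f"{sign}{mantissa}bx{biased - 1075}"   # 1075 = bias 1023 + 52 mantissa bits

print(bx_to_fraction("1bx5"), bx_to_fraction("32bx0"), bx_to_fraction("2bx4"))  # 32 32 32
print(double_to_bx(32.0))                    # 4503599627370496bx-47
```

No floating point arithmetic is applied to x itself; the bits are only reinterpreted as an integer, so what is printed is what is stored.
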
For 64-bit IEEE754 double precision arithmetic, the integer 32 is 4503599627370496bx-47 - probably not what you expected. It's not intended to be human readable, though. It's an exact representation. For a given data type the mantissa (the first number) will always be roughly the same length, as you can see in the table below.

Note especially that Avogadro's number, which was given an exact integer value, is represented inaccurately. The represented number differs by 12,976,128 atoms from the printed value! Which of these two numbers is the "exact" value? The table below shows the approximate differences between the two representations. It was calculated using Intel 80-bit extended-precision floating point.

Consider the implication. The best clocks are approaching a precision where double precision floating point will have to be abandoned soon. That's easy enough when an 80-bit type is available. But when you switch, the value of many of the constants changes. This doesn't happen if you use bx for publishing and input, because a bx value is exactly represented when read into a larger data type - in fact it follows the same rules as hardware floating point type conversion. It will never be necessary to redefine bx for new data types. The "bx factor" is what you're already using in IEEE754!

Constant  SI Expression  SN factor        bx factor                Difference
N_A       1/mol          6.02214076e23    8973689019680023bx26     1.29761e+07
c         m/s            299792458        5029682823036928bx-24    0
h         kg m^2/s       6.62607015e-34   7747209898635537bx-163   1.70389e-50
e         A s            1.602176634e-19  6655181362828883bx-115   1.06265e-35
k         kg m^2/K s^2   1.380649e-23     4698105096070268bx-128   -9.21225e-40
nuCs      1/s            9192631770       4819586525429760bx-19    0

Here is the README file from the software download:

The software to do all of this - including the above demo - is here in this distribution.
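
As a cross-check of the table above, the Avogadro row can be reproduced with exact rational arithmetic. This is an independent sketch using only Python's standard fractions module, not the demo code from the distribution; the variable names are chosen here for illustration.

```python
from fractions import Fraction

printed = Fraction("6.02214076e23")          # the decimal value as published
stored = Fraction(6.02214076e23)             # exact value of the nearest double
from_bx = Fraction(8973689019680023) * 2**26 # the bx factor from the table

print(stored == from_bx)                     # True
print(printed - stored)                      # 12976128

# The speed of light is an integer small enough to survive unchanged.
print(Fraction(299792458.0) == Fraction(5029682823036928, 2**24))  # True
```

The bx factor and the double agree exactly, while the printed decimal differs from both by 12,976,128.
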
The C library is in the lib directory, with some basic docs and the demo code, which can be compiled and run with the provided script to_make, which also builds the library libbx.so (header file bx.h). The C library documentation follows, with a word about Perl and Python at the end.

bx is a library for printing floating point numbers in a compact format using decimal digits, which truly and correctly represents the numerical content, and also for reading them back in again. Functions for converting in both directions are provided, for the data types 'double' (IEEE-754 64-bit) and 'long double' (Intel extended precision 80-bit). The bx format supports the special values (-)0, (-)inf, and (-)nan. Conversion to and from denormal values is supported. Signalling nan is not treated specially. These functions do not perform floating point operations on their arguments, except to convert double to long double in the case of double denormals.

Requires -lm.

#include <bx.h>

char *f2bxl(long double x, char *ret);

Converts x to the bx string format in the string ret, which must already be allocated. Normal, denormal, and pseudo-normal values are properly represented. Zeroes, infinities, and nans also have special representations. Pseudo-infinities and pseudo-nans are treated as nan. Pseudo-zeroes are treated as nonzero pseudo-normals, which should never come up unless you're testing this feature.

char *f2bx(double x, char *ret);

Converts x to the bx string format in the string ret, which must already be allocated. Normal and denormal values are properly represented. Zeroes, infinities, and nans also have special representations. Internally denormals are passed to f2bxdl, which is the only case where a floating-point operation is performed on the x argument.

char *f2bxdl(long double x, long double d, char *ret);

Converts x to the bx string format in the string ret, which must already be allocated.
If d is zero, the number is represented in full precision, but if a nonzero value of d is passed the mantissa is truncated so that it has no more than d decimal digits. Since d is passed as a floating point number, an exact number of bits to remove can in effect be specified also. Internally d is converted into an integer shift width. Normal, denormal, and pseudo-normal values are properly represented. Zeroes, infinities, and nans also have special representations. Pseudo-infinities and pseudo-nans are treated as nan. Pseudo-zeroes are treated as nonzero pseudo-normals, which should never come up unless you're testing this feature.

char *f2bxd(double x, double d, char *ret);

Converts x to the bx string format in the string ret, which must already be allocated. If d is zero, the number is represented in full precision, but if a nonzero value of d is passed the mantissa is truncated so that it has no more than d decimal digits. Since d is passed as a floating point number, an exact number of bits to remove can in effect be specified also. Internally d is converted into an integer shift width. Normal and denormal values are properly represented. Zeroes, infinities, and nans also have special representations. Internally denormals are passed to f2bxdl, which is the only case where a floating-point operation is performed on the x argument.

char *f2bxnl(long double x, int of, char *ret);

Converts x to the bx string format in the string ret, which must already be allocated. If of is zero, the number is represented in full precision, but if a nonzero value of of is passed the mantissa is shifted right that many bits before representation (used internally for f2bxdl). Normal, denormal, and pseudo-normal values are properly represented. Zeroes, infinities, and nans also have special representations. Pseudo-infinities and pseudo-nans are treated as nan. Pseudo-zeroes are treated as nonzero pseudo-normals, which should never come up unless you're testing this feature.
f2bxnl((double), 11, ret) should be equivalent to f2bx, although the f2bx function should be used as it avoids an unnecessary FP conversion.

char *f2bxn(double x, int of, char *ret);

Converts x to the bx string format in the string ret, which must already be allocated. If of is zero, the number is represented in full precision, but if a nonzero value of of is passed the mantissa is shifted right that many bits before representation (used internally for f2bxd). Normal and denormal values are properly represented. Zeroes, infinities, and nans also have special representations. Internally denormals are passed to f2bxdl, which is the only case where a floating-point operation is performed on the x argument. f2bxn((single), 29, ret) should produce a value that can be read back into a single via conversion from bx2f.

long double bx2fl(char *bx);

Returns the floating point value of a valid bx string (may begin with space). Handles the special values (-)0, (-)inf, and (-)nan. If the format of the string is not recognized, or if the values of the mantissa or exponent are out of range of the long double data type, nan is returned and an error is generated on stderr. Denormal values are supported, and only in the case of denormals is any truncation performed - the mantissa may be shifted to the right to fit the representation. No rounding is otherwise performed. Values which came from f2bx* will always be fully accurate.

double bx2f(char *bx);

Returns the floating point value of a valid bx string (may begin with space). Handles the special values (-)0, (-)inf, and (-)nan. If the format of the string is not recognized, or if the values of the mantissa or exponent are out of range of the double data type, nan is returned and an error is generated on stderr. Denormal values are supported, and only in the case of denormals is any truncation performed - the mantissa may be shifted to the right to fit the representation. No rounding is otherwise performed.
Values which came from f2bx or f2bxd/n will always be fully accurate. In general strings generated by f2bxl will not be readable by bx2f, and should be passed to bx2fl instead. See the note under f2bxn about using this function for single precision.

bx.pl contains a Perl library with functions f2bx and bx2f, which are very similar to f2bxd/f2bx ($d defaults to 0) and bx2f in the C implementation. Normals, denormals, zeroes, infinities, and nans are handled, in double precision only. Floating-point operations are avoided.

bx.py contains basically the same thing for Python, but without support for inf and nan, and using some floating-point arithmetic. Normals should work correctly. I don't know the language well enough to plug in the compiled C version, which would be best in the long run.
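
The bit-shift option described under f2bxn above can be illustrated with a short sketch. This is not the library's code: double_to_bx_shifted and bx_to_fraction are hypothetical names, and it is an assumption here that the exponent is raised by the same number of bits that are dropped from the mantissa (which is what keeps the value unchanged when the dropped bits are zero). Only normal doubles are handled.

```python
import struct
from fractions import Fraction

def double_to_bx_shifted(x, shift=0):
    """bx form of a normal double after dropping `shift` low mantissa bits."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = "-" if bits >> 63 else ""
    biased = (bits >> 52) & 0x7FF
    if biased == 0 or biased == 0x7FF:
        raise ValueError("only normal values are handled in this sketch")
    mantissa = (bits & ((1 << 52) - 1)) | (1 << 52)   # full 53-bit mantissa
    # Assumption: every bit shifted out of the mantissa is added to the exponent.
    return f"{sign}{mantissa >> shift}bx{biased - 1075 + shift}"

def bx_to_fraction(text):
    """Exact value of a string 'MbxE' (integer M times 2 to the E)."""
    mantissa, exponent = text.strip().split("bx")
    return Fraction(int(mantissa)) * Fraction(2) ** int(exponent)

# A single precision value held in a double has 29 zero bits at the bottom
# of its mantissa (53 - 24 = 29), so shifting by 29 loses nothing.
single = struct.unpack(">f", struct.pack(">f", 0.1))[0]
short = double_to_bx_shifted(single, 29)
print(short, bx_to_fraction(short) == Fraction(single))

# A genuine double loses information under the same shift.
print(bx_to_fraction(double_to_bx_shifted(0.1, 29)) == Fraction(0.1))  # False
```

The same count explains the 11 in the f2bxnl note: the 64-bit mantissa of the 80-bit type minus 11 bits leaves the 53 bits of a double.
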