A fast alternative to the standard C/C++ pow() function. With adjustable accuracy-space tradeoff. (1400 words)
This technique is formed of two ideas: manipulation of IEEE-754 floating-point bits, and precalculation of a lookup table.
First, a fast pow:
‘A Fast, Compact Approximation of the Exponential Function’
Technical Report IDSIA-07-98;
Nicol N. Schraudolph;
IDSIA,
1998-06-24.
http://users.rsise.anu.edu.au/~nici/bib2html/b2hd-Schraudolph99.html
http://www.idsia.ch/idsiareport/IDSIA-07-98.ps.gz
— was rewritten to use float instead of double (avoiding the endianness platform dependence of the original). It works by inserting the submitted power into the exponent bits of the answer. The fractional part then makes a linear interpolation in the mantissa (simple and fast, but inaccurate). Or it can be improved as follows.
Second, an adjustable fast log:
‘Revisiting a basic function on current CPUs: A fast logarithm implementation
with adjustable accuracy’
Technical Report ICSI TR-07-002;
Oriol Vinyals, Gerald Friedland, Nikki Mirghafori;
ICSI,
2007-06-21.
http://www.icsi.berkeley.edu/cgi-bin/pubs/publication.pl?ID=002209
http://www.icsi.berkeley.edu/pubs/techreports/TR-07-002.pdf
— was adapted to fit the fast pow core (and modified to be twice as accurate). It uses some of the mantissa bits to index a lookup table; the table value then replaces the mantissa.
Precision is adjustable from 0 to 18. This is how many mantissa bits are used as the table index, and hence how large the table is: from 4 B (one 4-byte element) up to 1 MB (2^18 elements).
For precision 11 (8 KB table), the mean error is < 0.01%, and the max error is < 0.02% (relative, ie: abs(true - approx) / true). And, on one machine, it was over 9 times faster than the standard pow.
In C89 (and assuming 32-bit integers), the core is:
#include <math.h>   /* for pow, used when building the table */

const float _2p23 = 8388608.0f;


/**
 * Initialize powFast lookup table.
 *
 * @pTable     length must be 2 ^ precision
 * @precision  number of mantissa bits used, >= 0 and <= 18
 */
void powFastSetTable
(
   unsigned int* const pTable,
   const unsigned int  precision
)
{
   /* step along table elements and x-axis positions */
   float zeroToOne = 1.0f / ((float)(1 << precision) * 2.0f);       /* A */
   int   i;                                                         /* B */
   for( i = 0;  i < (1 << precision);  ++i )                        /* C */
   {
      /* make y-axis value for table element */
      const float f = ((float)pow( 2.0f, zeroToOne ) - 1.0f) * _2p23;
      pTable[i] = (unsigned int)( f < _2p23 ? f : (_2p23 - 1.0f) );

      zeroToOne += 1.0f / (float)(1 << precision);
   }                                                                /* D */
}


/**
 * Get pow (fast!).
 *
 * @val        power to raise radix to
 * @ilog2      one over log, to required radix, of two
 * @pTable     length must be 2 ^ precision
 * @precision  number of mantissa bits used, >= 0 and <= 18
 */
float powFastLookup
(
   const float          val,
   const float          ilog2,
   unsigned int* const  pTable,
   const unsigned int   precision
)
{
   /* build float bits */
   const int i = (int)( (val * (_2p23 * ilog2)) + (127.0f * _2p23) );

   /* replace mantissa with lookup */
   const int it = (i & 0xFF800000) | pTable[(i & 0x7FFFFF) >>      /* E */
      (23 - precision)];                                           /* F */

   /* convert bits to float
      (a type-punning cast; strict compilers may prefer a memcpy) */
   return *(const float*)( &it );
}
/**
* Get pow (fast!).
*
* @val power to raise radix to
* @ilog2 one over log, to required radix, of two
* @pTable length must be 2 ^ precision
* @precision number of mantissa bits used, >= 0 and <= 18
*/
float powFastLookup
(
const float val,
const float ilog2,
unsigned int* const pTable,
const unsigned int precision
)
{
/* build float bits */
const int i = (int)( (val * (_2p23 * ilog2)) + (127.0f * _2p23) );
/* replace mantissa with lookup */
const int it = (i & 0xFF800000) | pTable[(i & 0x7FFFFF) >> /* E */
(23 - precision)]; /* F */
/* convert bits to float */
return *(const float*)( &it );
}
Setting the ilog2 parameter:

   pow( 2, val) :  ilog2 = 1 / log2(2)   =  1
   pow( e, val) :  ilog2 = 1 / ln(2)     =  1.44269504088896
   pow(10, val) :  ilog2 = 1 / log10(2)  =  3.32192809488736
   pow( r, val) :  ilog2 = ln(r) / ln(2) =  ln(r) * 1.44269504088896
These two functions can easily be put in a small class/module that manages the storage and wraps calls for different radixes.
An IEEE-754 single-precision floating point number comprises a sign, exponent, and mantissa, with value: (sign) mantissa * 2^exponent. The exponent ranges from -126 to +127, the mantissa from 1 to 2 - 2^-23 (and the sign is + or -). The bit representation is slightly different. The exponent is 8 bits with 127 added (the bias), and the mantissa is 23 bits with an implicit extra 1 assumed, but not stored, as its next higher bit. This is the form of a usual (normalised) number.
| sign   | exponent      | mantissa                |
|--------|---------------|-------------------------|
| S      | EEEEEEEE      | MMMMMMMMMMMMMMMMMMMMMMM |
| bit 31 | bits 30 to 23 | bits 22 to 0            |
Considering the parts as plain integer numbers, the value is: (+/-)(1 + mantissa/2^23) * 2^(exponent - 127). So, for example, to make 2.5, set the exponent to 128 and the mantissa to 0x200000 (giving 1.25 * 2^1), shift them into position, and reinterpret as a float.
If you set the exponent to an integer number, and the mantissa to zero, you effectively get pow(2, number). That leaves fractional numbers. An engineering mathematics book will show that a^(b + c) = a^b * a^c, which in this case becomes 2^(int + frac) = 2^int * 2^frac. This separates the fractional calculation into something that is the same for all integers. And it matches the IEEE-754 machinery of mantissa * 2^exponent: setting the exponent bits to int and the mantissa bits to 2^frac, and reinterpreting, will do the calculation.
The range of 2^frac is 1 to just under 2, the same as the usual mantissa. So if the fractional part of the power is simply placed in the mantissa bits, it gives an approximation by linear interpolation (0 ≤ x < 1 → 1 ≤ y < 2). This is simplest and fastest, if up to 7% error is tolerable. Or a higher-order, perhaps spline, interpolation can be calculated. Or, as here, a precalculated table can be queried.
There are 23 bits of mantissa, but the maximum precision is 18. This is because 5 bits are used by the integer part of the power.
(For other radixes, multiply the power by ln(radix) / ln(2).)
The essential approximation is a ‘staircase’ function across the fraction range between successive integer powers. It has full float precision in its y values, but only limited, regular x intervals. Accuracy is proportional to the number of table elements.

Those limited intervals imply rounding considerations when setting and looking up the table values. The mantissa could be rounded before truncation for table look-up, but this costs a few instructions in speed. Instead it can be treated when setting the table: set the table values for the middles of the x intervals — so line A (in the code) starts the iteration half an increment across. This (practically) minimises error. (Better still would be the mean of the y values across each interval, but the improvement would be small in these circumstances.)
This solution has a small weakness: for radix two it produces inexact results for integer powers when it need not. This may be undesirable, since they are ‘special’. One treatment is to round the mantissa, as mentioned before:
in powFastSetTable, change lines A-C to:

   float zeroToOne = 0.0f;
   int   i;
   for( i = 0;  i < ((1 << precision) + 1);  ++i )
— in powFastLookup, change lines E-F to:

   /* (rounding mantissa-index before use) */
   const int it = (i & 0xFF800000) | pTable[((i & 0x7FFFFF) +
      (0x400000 >> precision)) >> (23 - precision)];
— and the table is one element longer for all precisions (so change the @pTable comments too).
Alternatively, add a special case to set the first table element (the one covering the integer value) to zero. This avoids the rounding cost, but almost doubles the max error in the first interval:
in powFastSetTable, add after line D:

   /* make integer power exact */
   pTable[0] = 0;
It is possible to have maximum-precision accuracy with only a medium table size (max error < 0.002%, using 4 KB instead of 1 MB). The cost in speed is quite small: a few extra instructions.
The power equation above suggests a further reformulation. Just as the fractional calculation is separable from the integer calculation, the fractional one can itself be separated into parts: eg. 2^(fracHigh + fracLow) = 2^fracHigh * 2^fracLow. Breaking it into two is probably the most useful.
Here is an excerpt from a C++ class:
// (udword is a typedef for an unsigned 32-bit integer)

PowFast2::PowFast2()
{
   setTable( tableH_m,  9, 9, false );
   setTable( tableL_m, 18, 9, true  );
}


void PowFast2::setTable
(
   float* const pTable,
   const udword precision,
   const udword extent,
   const bool   isRound
)
{
   // step along table elements and x-axis positions
   float zeroToOne = !isRound ?
      0.0f : (1.0f / (static_cast<float>(1 << precision) * 2.0f));
   for( int i = 0;  i < (1 << extent);  ++i )
   {
      // make y-axis value for table element
      pTable[i] = ::powf( 2.0f, zeroToOne );

      zeroToOne += 1.0f / static_cast<float>(1 << precision);
   }
}


inline
float PowFast2::lookup
(
   const float val,
   const float ilog2
) const
{
   const float _2p23 = 8388608.0f;

   // build float bits
   const int i = static_cast<int>( (val * (_2p23 * ilog2)) + (127.0f * _2p23) );

   // replace mantissa with combined lookups
   const float t  = tableH_m[(i >> 14) & 0x1FF] * tableL_m[(i >> 5) & 0x1FF];
   const int   it = (i & 0xFF800000) |
      (*reinterpret_cast<const int*>( &t ) & 0x7FFFFF);

   // convert bits to float
   return *reinterpret_cast<const float*>( &it );
}
The limit of this reformulation is separating out each bit. This yields 18 two-element tables, combinable with 17 multiplies. Since the first element of each table is 1, only 8.5 multiplies are needed on average, if branching is used to skip them.
Here are two source-code packages, including tests: in C++ as a simple constant class, and in C with an object interface:
http://www.hxa.name/articles/content/fast-pow-adjustable_hxa7241_2007.zip
The source code is available according to the (new) BSD license:
———
Copyright (c) 2007, Harrison Ainsworth / HXA7241.
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.