# Floating-point Numbers Aren't Real

## Revision as of 16:32, 14 December 2008

Floating-point numbers are not real numbers in the mathematical sense. Real numbers have infinite precision; floating-point numbers have fixed precision and resemble "badly-behaved" integers, because they are not evenly spaced throughout their range. If you have access to a 32-bit platform with single-precision IEEE floating-point, like **float** in C++ on Windows, assign 2147483647 (the largest signed 32-bit integer) to a float variable (**x**, say), and print it. You'll see 2147483648. Now print x - 64. Still 2147483648. Now print x - 65 and you'll get 2147483520! Why? Because the spacing between adjacent floats just below 2147483648 is 128, and floating-point operations round to the nearest floating-point number.

IEEE floating-point numbers are fixed-precision numbers based on base-two scientific notation: <math>1.d_1d_2d_3 \ldots d_{p-1} \times 2^e</math>. **p** is the precision (24 for **float**, 53 for **double**). The spacing between two consecutive numbers is <math>2^{1-p+e}</math>, which can be approximated by <math>\varepsilon |x|</math>, where <math>\varepsilon</math> is the *machine epsilon* (<math>2^{1-p}</math>). For a **float** just below <math>2^{31} = 2147483648</math>, the exponent is <math>e = 30</math>, so the spacing is <math>2^{1-24+30} = 2^7 = 128</math>, which is the gap seen in the experiment above.

Knowing the spacing in the neighborhood of a floating-point number can help you avoid classic numerical programming blunders. For example, if you're performing an iterative calculation, such as searching for the root of an equation, there's no sense in asking for greater precision than the number system can give in the neighborhood of the answer. Make sure that the tolerance you request is no smaller t
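The experiment and the spacing formula can be checked with a few lines of C++. What follows is a minimal sketch, not a definitive implementation: the helper names `sub_as_float`, `gap_above`, `gap_below`, and `estimated_gap` are hypothetical and chosen only for illustration, and the code assumes that **float** is an IEEE single-precision type with the default round-to-nearest mode. It uses the standard library's `std::nextafter` to find the neighboring float, and `std::numeric_limits<float>::epsilon()` for the machine epsilon.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Subtract and force the result into a float. The volatile store is an
// assumption-driven precaution: on platforms that evaluate expressions in
// wider registers, it makes sure the result is rounded to single precision.
float sub_as_float(float x, float y) {
    volatile float r = x - y;
    return r;
}

// Distance from x to the next larger representable float.
float gap_above(float x) {
    assert(std::isfinite(x));
    return std::nextafter(x, std::numeric_limits<float>::infinity()) - x;
}

// Distance from x to the next smaller representable float.
float gap_below(float x) {
    assert(std::isfinite(x));
    return x - std::nextafter(x, -std::numeric_limits<float>::infinity());
}

// The approximation epsilon * |x| from the text, where epsilon = 2^(1-p).
float estimated_gap(float x) {
    return std::numeric_limits<float>::epsilon() * std::fabs(x);
}
```

Under those assumptions, with `float x = 2147483647.0f;` the value stored is 2147483648, `sub_as_float(x, 64.0f)` is still 2147483648, and `sub_as_float(x, 65.0f)` is 2147483520. At that value `gap_below(x)` is 128 and `gap_above(x)` is 256, because 2147483648 is a power of two and the exponent changes there; `estimated_gap(x)` gives 256, which shows that the approximation is only accurate to within a factor of two.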