Bounds checking

1 Range checking
2 Index checking
3 Data validation
4 See also
5 References

In computer programming, bounds checking is any method of detecting whether a variable is within some bounds before it is used. It is usually applied either to ensure that a number fits into a given type (range checking) or to ensure that a variable used as an index into an array lies within the bounds of the array (index checking). For example, a value of 32768 about to be assigned to a sixteen-bit signed integer variable (whose bounds are −32768 to +32767) fails a range check, and an access to element 25 of an array with index range 0 through 9 fails an index check.

A failed bounds check usually results in the generation of some sort of exception signal.
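As an illustration, the two failing cases from the introduction can be reproduced in Python, which performs both kinds of check automatically and reports a failure as an exception. This is only a sketch: the helper names describe_store_int16 and describe_read are made up for this example, and the standard struct module stands in for a sixteen-bit signed variable.

```python
import struct


def describe_store_int16(value):
    """Try to store value as a signed sixteen-bit integer."""
    try:
        # "h" is the struct code for a signed 16-bit integer; packing
        # range-checks the value and raises struct.error on failure.
        struct.pack("<h", value)
    except struct.error as exc:
        return f"rejected: {exc}"
    return "stored"


def describe_read(items, index):
    """Try to read items[index] from a list."""
    try:
        return f"read {items[index]}"
    except IndexError as exc:
        # Python index-checks every list access.
        return f"rejected: {exc}"


ten_items = list(range(10))              # valid indexes are 0 through 9
range_ok = describe_store_int16(32767)   # within -32768..32767
range_bad = describe_store_int16(32768)  # one past the upper bound
index_ok = describe_read(ten_items, 9)
index_bad = describe_read(ten_items, 25)
```

In both failing cases the program receives an exception at the point of the faulty operation rather than silently continuing with a corrupted value.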

Because performing a bounds check on every use takes time, it is not always done. Bounds-checking elimination is a compiler optimization technique that removes bounds checks which can be proven unnecessary, which covers many common cases.
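The reasoning behind eliminating a check can be sketched by hand. In the hypothetical pair of functions below, the first checks the index on every iteration, while the second checks the loop limits once before the loop, after which every index produced by the loop is known to be valid. This is a manual illustration of the idea, not a description of what any particular compiler does, and Python itself still performs its own check on each access.

```python
def sum_checked_each_time(values, start, stop):
    """Sum values[start:stop], checking the index on every iteration."""
    total = 0
    for i in range(start, stop):
        if not 0 <= i < len(values):
            raise IndexError(f"index {i} out of range")
        total += values[i]
    return total


def sum_checked_once(values, start, stop):
    """Same result, but the bounds are checked once, before the loop."""
    # If the range is non-empty, every i satisfies start <= i < stop, so
    # checking the two loop limits covers all indexes the loop will use.
    if start < stop and not (0 <= start and stop <= len(values)):
        raise IndexError(f"range {start}..{stop - 1} out of range")
    total = 0
    for i in range(start, stop):
        total += values[i]  # no per-element check needed here
    return total
```

Both functions accept and reject the same arguments; the second simply does the comparison once instead of once per element.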

Many programming languages, such as C, never perform automatic bounds checking, in order to run faster. However, this leaves many off-by-one errors and buffer overflows uncaught. Many programmers believe these languages sacrifice too much for rapid execution. In his 1980 Turing Award lecture, C. A. R. Hoare described his experience in the design of an ALGOL 60 compiler that included bounds checking, saying:

A consequence of this principle is that every occurrence of every subscript of every subscripted variable was on every occasion checked at run time against both the upper and the lower declared bounds of the array. Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interest of efficiency on production runs. Unanimously, they urged us not to—they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous. I note with fear and horror that even in 1980, language designers and users have not learned this lesson. In any respectable branch of engineering, failure to observe such elementary precautions would have long been against the law.

Mainstream languages that enforce run-time bounds checking include Ada, C#, Haskell, Java, JavaScript, Lisp, PHP, Python, Ruby, and Visual Basic. The D and OCaml languages have run-time bounds checking that can be enabled or disabled with a compiler switch. C# also supports unsafe regions: sections of code that (among other things) temporarily suspend bounds checking to improve efficiency. These are useful for speeding up small time-critical bottlenecks without sacrificing the safety of the whole program.

Range checking

A range check verifies that a number lies within a certain range, for instance that a value about to be assigned to a sixteen-bit integer is within the capacity of a sixteen-bit integer. This is not quite the same as type checking. Other range checks may be more restrictive: for example, a variable that holds the number of a calendar month may be declared to accept only the range 1 to 12. Range checks are often applied to array indexes, because using a number outside the valid range as an index may cause the program to crash or may introduce security vulnerabilities (see buffer overflow). In Java, the runtime automatically performs a range check when an element of an array is accessed, and throws an exception if the index is out of range.
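A minimal sketch of the sixteen-bit and calendar-month cases follows. The function names check_range, to_int16 and to_month are hypothetical, and raising ValueError is just one reasonable way to signal the failure.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def check_range(value, low, high, name="value"):
    """Return value unchanged if low <= value <= high, else raise."""
    if not low <= value <= high:
        raise ValueError(f"{name} {value} is outside {low}..{high}")
    return value


def to_int16(value):
    """Range check against the capacity of a signed sixteen-bit integer."""
    return check_range(value, INT16_MIN, INT16_MAX, "int16")


def to_month(value):
    """A more restrictive range check: calendar months are 1 to 12."""
    return check_range(value, 1, 12, "month")
```

Here to_int16(32767) and to_month(12) return their argument, while to_int16(32768), to_month(0) and to_month(13) raise ValueError.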

Index checking

Index checking means that, in every expression that indexes an array, the index value is first checked against the bounds of the array, which were established when the array was defined. If an index is out of bounds, further execution is suspended via some sort of error.
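The following sketch shows one way such a check could look for an array whose lower and upper bounds are fixed when it is defined. The BoundedArray class is hypothetical and written only for illustration. Note that a plain Python list accepts negative indexes (counting from the end), so the explicit comparison here is stricter than the built-in behaviour.

```python
class BoundedArray:
    """Fixed-size array whose index bounds are set at definition time."""

    def __init__(self, lower, upper, fill=0):
        if upper < lower:
            raise ValueError("upper bound must not be below lower bound")
        self.lower = lower
        self.upper = upper
        self._items = [fill] * (upper - lower + 1)

    def _offset(self, index):
        # The index check: compare against both declared bounds before
        # the index is used to reach the underlying storage.
        if not self.lower <= index <= self.upper:
            raise IndexError(
                f"index {index} outside declared bounds "
                f"{self.lower}..{self.upper}"
            )
        return index - self.lower

    def __getitem__(self, index):
        return self._items[self._offset(index)]

    def __setitem__(self, index, value):
        self._items[self._offset(index)] = value


table = BoundedArray(0, 9)   # index range 0 through 9
table[9] = 7                 # within bounds
# table[25] and table[-1] would both raise IndexError
```

Because every read and write goes through the same _offset helper, no access path can skip the check.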

Pascal, Fortran, and Java have index checking ability. The VAX computer has an INDEX assembly instruction for array index checking which takes six operands, all of which can use any VAX addressing mode. The B6500 and similar Burroughs computers performed bounds checking in hardware, irrespective of which computer language had been compiled to produce the machine code.

Data validation

In the context of data collection and data quality, bounds checking refers to checking that the data is not trivially invalid. For example, a percentage measurement must be in the range 0 to 100; the height of an adult person must be in the range 0 to 3 meters.
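Applied to a batch of collected records, this kind of check might look like the sketch below. The field names percent and height_m, the sample records, and the choice to report a missing value as a problem are all assumptions made for the example.

```python
def find_out_of_bounds(records, limits):
    """Return (row, field, value) for every value outside its limits."""
    problems = []
    for row, record in enumerate(records):
        for field, (low, high) in limits.items():
            value = record.get(field)
            # Assumption: a missing value is reported as a problem too.
            if value is None or not low <= value <= high:
                problems.append((row, field, value))
    return problems


# Limits taken from the examples in the text.
LIMITS = {"percent": (0, 100), "height_m": (0, 3)}

records = [
    {"percent": 42.5, "height_m": 1.80},
    {"percent": 104.0, "height_m": 1.65},   # percentage above 100
    {"percent": 12.0, "height_m": 17.5},    # implausible adult height
]

problems = find_out_of_bounds(records, LIMITS)
# problems == [(1, "percent", 104.0), (2, "height_m", 17.5)]
```

A check of this kind only catches trivially invalid values; a height of 2.9 meters would pass even though it is almost certainly wrong.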

References

“On The Advantages Of Tagged Architecture”, IEEE Transactions On Computers, Volume C-22, Number 7, July, 1973.

“The Emperor’s Old Clothes”, The 1980 ACM Turing Award Lecture, CACM volume 24 number 2, February 1981, pp 75–83.

“Bounds Checking for C”, Richard Jones and Paul Kelly, Imperial College, July 1995.

“ClearPath Enterprise Servers MCP Security Overview”, Unisys, April 2006.

“Secure Virtual Architecture: A Safe Execution Environment for Commodity Operating Systems”, John Criswell, Andrew Lenharth, Dinakar Dhurjati, Vikram Adve, SOSP'07 21st ACM Symposium on Operating Systems Principles, 2007.

“Fail-Safe C”, Yutaka Oiwa. Implementation of the Memory-safe Full ANSI-C Compiler. ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2009), June 2009.

“address-sanitizer”, Timur Iskhodzhanov, Alexander Potapenko, Alexey Samsonov, Kostya Serebryany, Evgeniy Stepanov, Dmitriy Vyukov, LLVM Dev Meeting, November 18, 2011.

Safe C Library of Bounded APIs

"The Safe C Library". Dr. Dobb's Journal. February 20, 2009.

Safe C API—Concise solution of buffer overflow, The OWASP Foundation, OWASP AppSec, Beijing 2011
