Referential transparency (computer science)

Referential transparency and referential opacity are properties of parts of computer programs. An expression is said to be referentially transparent if it can be replaced with its value without changing the behavior of a program (in other words, yielding a program that has the same effects and output on the same input). The opposite term is referential opaqueness.

While in mathematics all function applications are referentially transparent, in programming this is not always the case. The importance of referential transparency is that it allows the programmer and the compiler to reason about program behavior. This can help in proving correctness, simplifying an algorithm, assisting in modifying code without breaking it, or optimizing code by means of memoization, common subexpression elimination or parallelization.

Referential transparency is one of the principles of functional programming; only referentially transparent functions can be memoized (transformed into equivalent functions which cache results). Some programming languages provide means to guarantee referential transparency. Some functional programming languages enforce referential transparency for all functions.

As referential transparency requires the same results for a given set of inputs at any point in time, a referentially transparent expression is therefore deterministic.

1 Examples and counterexamples
2 Contrast to imperative programming
3 Another example
4 See also
5 References

Examples and counterexamples

If all functions involved in the expression are pure functions, then the expression is referentially transparent. Also, some impure functions can be included in the expression if their values are discarded and their side effects are insignificant.

Consider a function that returns the input from some source. In pseudocode, a call to this function might be GetInput(Source) where Source might identify a particular disc file, the keyboard, etc. Even with identical values of Source, the successive return values will be different. Therefore, function GetInput() is neither deterministic nor referentially transparent.

A more subtle example is that of a function that uses a global variable (or a dynamically scoped variable, or a lexical closure) to help it compute its results. Since this variable is not passed as a parameter but can be altered, the results of subsequent calls to the function can differ even if the parameters are identical. In pure functional programming, destructive assignment is not allowed; thus, a function that uses global (or dynamically scoped^{[citation needed]}) variables may or may not be referentially transparent, depending on whether the global variables are immutable.

Arithmetic operations are referentially transparent: 5*5 can be replaced by 25, for instance. In fact, all functions in the mathematical sense are referentially transparent: sin(x) is transparent, since it will always give the same result for each particular x.

Assignments are not transparent. For instance, the C expression x = x + 1 changes the value assigned to the variable x. Assuming x initially has value 10, two consecutive evaluations of the expression yield, respectively, 11 and 12. Clearly, replacing x = x + 1 with either 11 or 12 gives a program with different meaning, and so the expression is not referentially transparent. However, calling a function such as int plusone(int x) {return x+1;} is transparent, as it will not implicitly change the input x and thus has no such side effects.

today() is not transparent, as if you evaluate it and replace it by its value (say, "Jan 1, 2001"), you don't get the same result as you will if you run it tomorrow. This is because it depends on a state (the time).

Contrast to imperative programming

If the substitution of an expression with its value is valid only at a certain point in the execution of the program, then the expression is not referentially transparent. The definition and ordering of these sequence points are the theoretical foundation of imperative programming, and part of the semantics of an imperative programming language.

However, because a referentially transparent expression can be evaluated at any time, it is not necessary to define sequence points nor any guarantee of the order of evaluation at all. Programming done without these considerations is called purely functional programming.

One advantage of writing code in a referentially transparent style is that given an intelligent compiler, static code analysis is easier and better code-improving transformations are possible automatically. For example, when programming in C, there will be a performance penalty for including a call to an expensive function inside a loop, even if the function call could be moved outside of the loop without changing the results of the program. The programmer would be forced to perform manual code motion of the call, possibly at the expense of source code readability. However, if the compiler is able to determine that the function call is referentially transparent, it can perform this transformation automatically.

The primary disadvantage of languages which enforce referential transparency is that it makes the expression of operations that naturally fit a sequence-of-steps imperative programming style more awkward and less concise. Such languages often incorporate mechanisms to make these tasks easier while retaining the purely functional quality of the language, such as definite clause grammars and monads.

With referential transparency, no difference is made or recognized between a reference to a thing and the corresponding thing itself. Without referential transparency, such difference can be easily made and utilized in programs.

Another example

As an example, let's use two functions, one which is referentially opaque, and the other which is referentially transparent:

 globalValue = 0; integer function rq(integer x) begin   globalValue = globalValue + 1;   return x + globalValue; end integer function rt(integer x) begin   return x + 1; end

The function rt is referentially transparent, which means that rt(x) = rt(y) if x = y. For instance, rt(6) = 6 + 1 = 7, rt(4) = 4 + 1 = 5, and so on. However, we can't say any such thing for rq because it uses a global variable which it modifies.

The referential opacity of rq makes reasoning about programs more difficult. For example, say we wish to reason about the following statement:

integer p = rq(x) + rq(y) * (rq(x) - rq(x));

One may be tempted to simplify this statement to:

integer p = rq(x) + rq(y) * (0);integer p = rq(x) + 0;integer p = rq(x);

However, this will not work for rq() because each occurrence of rq(x) evaluates to a different value. Remember, that the return value of rq is based on a global value which isn't passed in and which gets modified on each call to rq. This means that mathematical identities such as $x - x = 0$ no longer hold.

Such mathematical identities will hold for referentially transparent functions such as rt.

However, a more sophisticated analysis can be used to simplify the statement to:

integer a = globalValue; integer p = x + a + 1 + (y + a + 2) * (x + a + 3 - (x + a + 4)); globalValue = globalValue + 4;integer a = globalValue; integer p = x + a + 1 + (y + a + 2) * (x + a + 3 - x - a - 4)); globalValue = globalValue + 4;integer a = globalValue; integer p = x + a + 1 + (y + a + 2) * -1; globalValue = globalValue + 4;integer a = globalValue; integer p = x + a + 1 - y - a - 2; globalValue = globalValue + 4;integer p = x - y - 1; globalValue = globalValue + 4;

This takes more steps and requires a degree of insight into the code infeasible for compiler optimization.

Therefore, referential transparency allows us to reason about our code which will lead to more robust programs, the possibility of finding bugs that we couldn't hope to find by testing, and the possibility of seeing opportunities for optimization.

References

Søndergaard, Harald; Sestoft, Peter (1990). "Referential transparency, definiteness and unfoldability". Acta Informatica 27 (6): 505–517.

Davie, Antony (1992). An Introduction to Functional Programming Systems Using Haskell. New York: Cambridge University Press. p. 290. ISBN 0-521-27724-8.

Contents

Examples and counterexamples

Contrast to imperative programming

Another example

See also

References