History of Volatile modifier

What does a C program mean?

The C standard  defines the meaning of a C program in terms of an “abstract machine” that you can think of as a simple, non-optimizing interpreter for C.  The behavior of any given C implementation (a compiler plus target machine) must produce the same side effects as the abstract machine, but otherwise the standard mandates very little correspondence between the abstract machine and computation that actually happens on the target platform.  In other words, the C program can be thought of as a specification of desired effects, and the C implementation decides how to best go about making those effects happen.

As a simple example, consider this function:

int increment_number (int x)
{
int i;
for (i=0; i<3; i++)
{
x++;
}
return x;
}

The behavior of the abstract machine is clear: it creates a function-scoped variable named i which loops from 0 through 2, adding 1 to x on each iteration of the loop.  On the other hand, a good compiler emits code like this:

increment_number:
  movl 4(%esp), %eax
  addl $3, %eax
  ret

In the actual machine, i is never incremented , but the net effect is the same as if it had been.  In general, this gap between the abstract and actual semantics is considered to be a good thing, and the “as if” wording in the standard is what gives the optimizer the freedom to generate efficient code from a high-level specification.

What does volatile mean?

The problem with the gap between the abstract and concrete semantics is that C is a low-level language that was designed for implementing operating systems.  Writing OS code often requires that the gap between the abstract and actual semantics be narrowed or closed.  For example, if an OS wants to create a new page table, the abstract semantics for C fails to capture an important fact: that the programmer requires an actual page table that sits in RAM where it can be traversed by hardware.  If the C implementation concludes that the page table is useless and optimizes it away, the developers’ intent has not been served.  The canonical example of this problem is when a compiler decides that a developer’s code for zeroing sensitive data is useless and optimizes it away.  Since the C abstract machine was not designed to consider cases where this data may be snooped later, the optimization is legitimate (though obviously undesirable).

The C standard gives us just a few ways to establish a connection between the abstract and actual machines:

  • the arguments to main()
  • the return value from main()
  • the side-effecting functions in the C standard library
  • volatile variables

Most C implementations offer additional mechanisms, such as inline assembly and extra library functions not mandated by the standard.

The way the volatile connects the abstract and real semantics is this:

For every read from a volatile variable by the abstract machine, the actual machine must load from the memory address corresponding to that variable.  Also, each read may return a different value.  For every write to a volatile variable by the abstract machine, the actual machine must store to the corresponding address.  Otherwise, the address should not be accessed (with some exceptions) and also accesses to volatiles should not be reordered (with some exceptions).

In summary:

  • Volatile has no effect on the abstract machine; in fact, the C standard explicitly states that for a C implementation that closely mirrors the abstract machine (i.e. a simple C interpreter), volatile has no effect at all.
  • Accesses to volatile-qualified objects obligate the C implementation to perform certain operations at the level of the actual computation.

Where did volatile come from?

Historically, the connection between the abstract and actual machines was established mainly through accident: compilers weren’t good enough at optimizing to create an important semantic gap.  As optimizers improved, it became increasingly clear that a systematic solution was needed.  In an excellent USENET post  explained how volatile came about:

To take a specific example, UNIX device drivers are almost always coded entirely in C, and on the PDP-11 and similar memory-mapped I/O architectures, some device registers perform different actions upon a “read-byte”, “read-word”, “write-byte”, “write-word”, “read-modify-write”, or other variations of the memory-bus access cycles involved.  Trying to get the right type of machine code generated while coding the driver in C was quite tricky, and many hard-to-track-down bugs resulted.  With compilers other than Ritchie’s, enabling optimization often would change this behavior, too.  At least one version of the UNIX Portable C Compiler (PCC) had a special hack to recognize constructs like

((struct xxx *)0177450)->zzz

as being potential references to I/O space (device registers) and would avoid excessive optimization involving such expressions (where the constant lay within the Unibus I/O address range).  X3J11 decided that this problem had to be faced squarely, and introduced “volatile” to obviate the need for such hacks.  However, although it was proposed that conforming implementations be required to implement the minimum possible access “width” for volatile-qualified data, and that is the intent of requiring an implementation definition for it, it was not practical to insist on it in every implementation; thus, some latitude was allowed implementors in that regard.

It’s hard to overstate how bad an idea it is for a compiler to use strange heuristics about code structure to guess the developer’s intent.

PART2: The six deadly coding sins concerning volatile Modifier.

Advertisements

One thought on “History of Volatile modifier

  1. Pingback: The six deadly coding sins concerning volatile Modifier | Embeddedx

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s