It was too late to run the code that looked cool in C language

This is a "done" story

My name is Mihara from ACCESS CO., LTD. Thank you for reading Day 11 of ACCESS Advent Calendar 2016.

This is a "done" story.

In a nutshell, it's a story of "I purposely did a slow pattern in C with a C ++ virtual function".

If you can hear it and laugh at it, you don't need to read it (Oh, can you read only tomorrow's introduction at the end). If you don't get it right, can you ask for a bad story?

The beautiful code left by my seniors is slow

We have taken over the maintenance of the interpreter for a certain language that we implemented. Even with an interpreter, creating a whole language processing system can be a daunting task. I still admire the seniors who implemented it until it was fully operational.

However, water may leak from your good hands.

After the senior who was originally implemented left, a customer pointed out that the interpreter was slow. Since we are an embedded programming company, our customers are not end users but electronic device developers. The manufacturer did a lot of research and found evidence that the lexical analysis part was slow rather than the interpreter's main loop.

The slow part had the following structure. (It is not the product as it is. It is a mock code)


/*
 *Of a function that performs processing on the next character
 *Define the interface.
 */
typedef int (*handle_character_proc)(int next_character, ...);

/*
 *A function that performs processing on the next character
 *Hold in a function pointer.
 */
static handle_character_proc *state_proc;

/*
 *Set the function that performs the following processing.
 */
static void
SetStateProc(handle_character_proc next_proc)
{
  handle_character_proc = next_proc;
}


/*
 *Set the initial state.
 *That is,
 *Set the first function pointer.
 */
static int
SetInitialStateProc(int next_character)
{
  ... /*The first stage is omitted*/
  /*
   *Set the function pointer according to the first character.
   */
  switch (next_character) {
  case '?':
    SetStateProc(HandleCharacter1);
    break;
  case '=':
    SetStateProc(HandleCharacter2);
    break;
  /*
   *The case statement that sets the function pointer
   *It continues for as many characters as you want to process.
   */
  }
  ... /*The latter part is omitted*/
}

static int
HandleCharacter1(int next_character, ...)
{
  ... /*Perform processing on characters.*/
  switch (next_character) {
  ...
  /*
   *State transition by setting the next function pointer.
   */
  case '=':
    SetStateProc(HandleCharacter5);
    break;
  ...
  }
}

/*
 *Calling a function pointer is the processing for characters.
 */
int
ScanChar(int next_character, ...)
{
  int ret;

  ...
  ret = (*handle_character_proc)(next_character, ...);
  ...
}

I was looking at this code, thinking that it was beautiful, as expected ...

If you think about it, it ’s natural.

Come to think of it, it's not surprising that the above code is slow.

--Prologue and epilogue of function call every time one character is processed --C compiler optimization and CPU speculative execution are disabled

Function calls cost prologue and epilogue to create an environment, such as saving and reverting automatic variables.

The cost of calling a function is acceptable, considering the maintenance cost, if the processing performed by the function is large enough.

However, the code above is an interpreter lexical analyzer that calls a function each time it processes a single character. The internal processing is small, and the program ends up executing only the prologue and epilogue of the function call.

Some may say that C compiler optimization is for such cases, but this code is not optimized.

The value of a function pointer is frequently rewritten during the program and has no rules. So the C compiler can't make assumptions about function calls, and honestly has to generate code that reads function pointers and jumps.

In addition, CPU speculative execution is also disabled. Since the value of the function pointer is not known in advance, even if you try to execute a probable jump destination for the time being, you cannot find a "probable jump destination".

If you're a C ++ programmer, you've probably used techniques to replace slow virtual function calls with static calls. The above code is in C language, but I created a virtual function.

Muddy solution with stupid

The code above, no matter how it looks, needed to be fast immediately. So, I rewrote everything with a switch statement. The switch statement does not work for CPU branch prediction, but the function call disappears for the time being.

Then the processing time was cut in half. In the original code, half of the processing was the prologue and epilogue of the function call.

A bitter memory named "lesson"

One thing I learned bitterly is that branching processing with function pointers is cool but slow.

It has the advantage that the structure becomes clearer than the if statement or switch statement when the number of branches is huge. But instead, I learned to pay the cost of execution time, along with customer complaints.

Don't rush to the conclusion that you shouldn't do cool programming. In general, fast programs are syntactically beautiful.

Tomorrow's ACCESS Advent Calendar 2016 is @DaisukeKondo. Please look forward to it.

Recommended Posts