C/C++ Language Lawyer Goodness

Recently, I discovered the language-lawyer tag on Stack Overflow. Suffice it to say that over the past week or so, I've learned a lot about the C/C++ languages that I missed over the past decade or so, when I was mostly doing Java coding. The Java language itself has evolved (and I'm a little behind since I've only really been using Java 8/9). During that time, C++ got lambda expressions, automatic type deduction, defaulted functions, the nullptr constant, rvalue references, and much, much more. So the question becomes this: have these changes made the language better?

I would have to argue that the answer is no. Frankly, the more features you try to pack into a language, the more people have to learn. Those of us who already know the language can easily pick up new tricks along the way, which is probably why features keep getting added. Newcomers, however, have more and more to learn when they are getting started, and even the C/C++ concepts that existed when I started were hard enough.

Pointers continue to be the thing that makes these languages tricky, even when using classes to hide a lot of the complexity. Eventually you have to navigate through complex data structures, and you will end up doing some pointer arithmetic or assigning the result of an indirection to a local variable. Simple pointer expressions are not difficult to understand, but once you start combining them into larger declarations, it starts getting harder. Take the following example:

void (*bsd_signal(int, void (*)(int)))(int)

What the heck is going on here? Well, the thing that makes this really hard to understand is that it involves pointers to functions. Unfortunately, the syntax that was chosen for pointers to functions puts the return type first, then an asterisk and the name wrapped in parentheses, then the parameter types in a second set of parentheses. It's not entirely clear that bsd_signal is the thing being declared here either.

So let's break it down a little. First, take a look at this:

void (*)(int)

This is the type of a pointer to a function that takes an int and returns nothing. It has no name because it appears here only as a parameter type. Normally we'd give the pointer a name, which changes the syntax to this, assuming we are declaring the pointer as fp:

void (*fp)(int)

The unnamed function pointer type above is the second parameter of the bsd_signal function, the first parameter being an int. So what does bsd_signal return? It's not entirely clear. If you said void, you are wrong. It actually returns yet another function pointer, pointing to a function that takes an int and returns nothing. So there you have it. This isn't that complicated, and yet the syntax to declare it is convoluted. None of the changes that have come along really address this. Why? Because it isn't considered to be a flaw in the language. Parsing C code in our heads is part of the process, right?
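To make that reading concrete, here is a minimal C++ sketch of a function with the same shape as bsd_signal. The names install_handler, current_handler, record, record_negated and last_seen are hypothetical and exist only for illustration; this is not how any real signal API is implemented.

```cpp
// Hypothetical state so the effect of each handler can be observed.
int last_seen = 0;

// Two handlers that both match the type void (*)(int).
void record(int value) { last_seen = value; }
void record_negated(int value) { last_seen = -value; }

// The handler currently installed (hypothetical global, initially none).
void (*current_handler)(int) = nullptr;

// Same shape as bsd_signal: takes an int and a function pointer,
// and returns the function pointer that was installed before.
void (*install_handler(int /*id*/, void (*handler)(int)))(int) {
    void (*previous)(int) = current_handler;
    current_handler = handler;
    return previous;
}
```

Reading the declarator from the name outward gives the same answer as the prose: install_handler is a function of (int, void (*)(int)), and what it returns is a pointer to a function of (int) returning void.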

Thankfully, this example is the exception rather than the rule, and you can make it easier to read with typedefs or, in C++11 and later, using alias declarations:

typedef void (*int_func_pointer)(int);
typedef int_func_pointer (*bsd_signal_func_pointer)(int, int_func_pointer);

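This is better. By naming things a certain way and using the typedefs, I have now created something much easier to understand - bsd_signal_func_pointer is clearly a pointer to a function, and the second arg and return value are clearly function pointers as well. The function itself can then be declared as int_func_pointer bsd_signal(int, int_func_pointer), which reads like any ordinary prototype.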
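For comparison, here is a rough sketch of the same idea with C++11 using aliases, which put the new name on the left instead of burying it inside the declarator. The names handler_t, installer_t, swap_handler, stored_handler, noop and installer are hypothetical; the static_assert lines only check that each alias names the same type as the raw syntax.

```cpp
#include <type_traits>

// Alias for "pointer to function taking int and returning void".
using handler_t = void (*)(int);

// Alias for "pointer to a function shaped like bsd_signal".
using installer_t = handler_t (*)(int, handler_t);

// The aliases are exactly the types spelled out by the raw declarators.
static_assert(std::is_same<handler_t, void (*)(int)>::value,
              "handler_t should match the raw function pointer type");
static_assert(std::is_same<installer_t,
                           void (*(*)(int, void (*)(int)))(int)>::value,
              "installer_t should match the raw pointer-to-installer type");

// Hypothetical function with the bsd_signal shape, written with the alias.
handler_t stored_handler = nullptr;

handler_t swap_handler(int /*id*/, handler_t next) {
    handler_t previous = stored_handler;
    stored_handler = next;
    return previous;
}

// A do-nothing handler, used only to have something to install.
void noop(int) {}

// A pointer to the function, declared without any nested parentheses.
installer_t installer = swap_handler;
```

The second static_assert spells out what installer_t would look like without the alias, which is a fair illustration of why the alias is worth having.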

The key point is that although the language changes haven't addressed these issues, there have always been creative ways to address them yourself, and that continues to be the case. There are limitless ways in which C code can be really obscure, but if you set up some nice baseline syntactic sugar, you can make it much better.