An Alternative Syntax for Multiple Return Values in C-based Languages
Most functions do not need more than one return value, but when you do need
more, there is no easy way around it. Java has this problem: functions are limited to one return value, and in my experience it is often awkward
to work around this limitation. In most cases you either write more
functions than necessary or you need to create additional classes
that need even more constructors, accessor methods and documentation.
C++ and C# allow multiple return values using pointers and references. This solves the problem,
but I think it feels wrong, and the C++ syntax in particular does not
produce very readable code. So I am going to show you an alternative, derived
from Python's tuples. But first the existing syntax variants and their
disadvantages:
Classic C syntax
The classic C syntax for multiple return values is to use a pointer.
Here is an example function that parses an integer in a string. It returns
a boolean to show whether it was parsed successfully and the integer itself:
    // implementation:
    int parseInt(const char *str, bool *success) {
        const char *s = str;
        int r = 0;
        while (*s) {
            char c = *s++;   // read the current character, then advance
            if ((c < '0') || (c > '9')) {
                *success = false;
                return 0;
            }
            r = r * 10 + (c - '0');
        }
        *success = true;
        return r;
    }

    // invocation:
    bool s;
    int v = parseInt("2004", &s);
Disadvantages:
- Neither declaration nor invocation syntax indicates whether 'success' is really a return value. It may also be just an optimization for an input value (admittedly unlikely in this example), or it may be both an input and an output value. Only the documentation and the implementation can help
- You cannot find out whether null is allowed for 'success' without looking at the documentation or the implementation
- The compiler won't catch a bug if 'success' is not initialized before returning in some code paths, because it does not know the purpose of 'success'
Classic C syntax with null
This is the same as above, but it allows 0 to be passed for 'success' in order to make it optional (the default argument shown here requires C++; in plain C the caller passes 0 explicitly):
    // implementation:
    int parseInt(const char *str, bool *success = 0) {
        const char *s = str;
        int r = 0;
        while (*s) {
            char c = *s++;   // read the current character, then advance
            if ((c < '0') || (c > '9')) {
                if (success)
                    *success = false;
                return 0;
            }
            r = r * 10 + (c - '0');
        }
        if (success)
            *success = true;
        return r;
    }

    // invocation:
    int v = parseInt("2004");
Disadvantages:
- You still need to look at documentation/implementation to find out what success is good for
- The compiler will still not notice when success has not been set before returning, and the check whether 'success' is null adds another potential error
- Two additional lines of code were needed in the implementation to make 'success' optional
C++ syntax with references
    // implementation:
    int parseInt(const char *str, bool &success) {
        const char *s = str;
        int r = 0;
        while (*s) {
            char c = *s++;   // read the current character, then advance
            if ((c < '0') || (c > '9')) {
                success = false;
                return 0;
            }
            r = r * 10 + (c - '0');
        }
        success = true;
        return r;
    }

    // invocation:
    bool s;
    int v = parseInt("2004", s);
Advantages:
- References do not have the 'null' issue or other pointer problems
Disadvantages:
- The invocation does not have any hint that the second argument will be modified. This can make code very hard to read if you do not know the functions, because any function may modify any argument
- You still do not know whether 'success' is an input or an output value
- Default values are not possible; you always need to have a bool, even if you do not look at it
- The compiler won't notice the bug when 'success' is not initialized in some code paths, because it does not know the purpose of 'success'
C# syntax
This is the same function in C#. IMHO the C# syntax is vastly superior to
the C++ alternatives:
    // implementation:
    int parseInt(String str, out bool success) {
        char[] s = str.ToCharArray();   // C# puts the brackets on the type
        int r = 0;
        foreach (char c in s) {
            if ((c < '0') || (c > '9')) {
                success = false;
                return 0;
            }
            r = r * 10 + (c - '0');
        }
        success = true;
        return r;
    }

    // invocation:
    bool s;
    int v = parseInt("2004", out s);
Advantages:
- It's obvious in declaration and invocation that 'success' is an output argument (in/out arguments use the keyword 'ref')
- The compiler can check whether 'success' has been set by the function before returning
- There are no pointer issues
Disadvantages:
- Default arguments are not possible (a C# limitation)
- You always need to declare the bool before invoking the function
Using Python-style tuples
An alternative to the C# syntax would be using Python-like tuples. Tuples are comma-separated values in parentheses that can be on the left and right side of an assignment statement. The syntax would look like this:
    int x, y, z;
    (x, y, z) = (1, 2, 3);

    // The equivalent of the last line is:
    x = 1;
    y = 2;
    z = 3;

    // The source tuple can contain expressions as items:
    (x, y) = (z-2, 5*5);

    // the right side can have more items than the left (but not the other way round):
    (x, y) = (1, 2, 3, 4, 5);

    // the left side can be a regular value; then only the first item is taken:
    x = (1, 2, 3);

    // local variable declaration in a tuple
    (int a, int b, int c) = (10, 20, 30);

    // A tuple can combine several types, as long as the types of both sides match:
    (x, bool f, double d) = (5, true, 3.14);

    // Unlike other languages, the assignment is processed item-by-item:
    (x, y) = (5, 10);
    (x, y) = (y, x); // now x and y are both 10! Swapping is not possible.

    // When you embed the statement it returns the first item's value:
    if ( (f, a) = (true, x) ) {
        // always executed
    }
Note that tuples only exist as a helper construct for assignments. You cannot use
operators on them, they are not represented by an object, and they cannot be
used like arrays etc.
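To make the item-by-item rule concrete, here is a small C++ sketch that contrasts it with library-style tuple assignment. It is only an illustration: the function names swapViaTie and assignItemByItem are made up for this example, and std::tie is used as a stand-in for "tuples as objects", which is exactly what the proposal above is not.

```cpp
#include <tuple>

// Library-style tuple assignment: the right-hand tuple is built from copies
// of y and x before anything is assigned, so the values really are swapped.
std::tuple<int, int> swapViaTie(int x, int y) {
    std::tie(x, y) = std::make_tuple(y, x);
    return std::make_tuple(x, y);
}

// Item-by-item assignment as in the proposal: "(x, y) = (y, x)" behaves like
// "x = y; y = x;", so both variables end up with the old value of y.
std::tuple<int, int> assignItemByItem(int x, int y) {
    x = y;
    y = x;
    return std::make_tuple(x, y);
}
```

Called with (5, 10), the first function yields (10, 5) and the second (10, 10), which is the behavior the comment in the tuple example describes.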
Now that there are tuples it becomes easy to extend the function syntax to have several return values - just return a tuple:
    // implementation:
    (int, bool) parseInt(String str) {
        char[] s = str.ToCharArray();
        int r = 0;
        foreach (char c in s) {
            if ((c < '0') || (c > '9'))
                return (0, false);
            r = r * 10 + (c - '0');
        }
        return (r, true);
    }

    // invocation:
    (int v, bool s) = parseInt("2004");
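For comparison, the closest approximation I know of in existing C++ is returning a std::pair and unpacking it with a structured binding. This is only a sketch under some assumptions: it needs C++17, it takes a std::string instead of the String type used above, and the helper parseOrDefault is a made-up name that exists only to show the invocation.

```cpp
#include <string>
#include <utility>

// Returns the parsed value and a success flag as one pair,
// roughly mirroring "(int, bool) parseInt(String str)".
std::pair<int, bool> parseInt(const std::string &str) {
    int r = 0;
    for (char c : str) {
        if (c < '0' || c > '9')
            return {0, false};
        r = r * 10 + (c - '0');
    }
    return {r, true};
}

// Hypothetical helper showing the invocation.
int parseOrDefault(const std::string &str, int fallback) {
    auto [v, ok] = parseInt(str);   // C++17 structured binding
    return ok ? v : fallback;
}
```

Unlike the tuples proposed here, the pair is a real object with a type of its own, so this approximates the syntax but not the "helper construct only" idea.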
What I like most about the tuple syntax is that it makes the code more compact. In
this example it removed three lines of code. It is also a nice solution for optional return values.
If you don't need the second return value, just write
    int v = parseInt("2004");
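Dropping an unwanted return value can also be approximated in C++ today, although less tersely. The sketch below assumes a pair-returning parseInt and uses std::tie with std::ignore; parseValueOnly is a hypothetical helper name, not part of the proposal.

```cpp
#include <string>
#include <tuple>
#include <utility>

// Pair-returning parser: (value, success).
std::pair<int, bool> parseInt(const std::string &str) {
    int r = 0;
    for (char c : str) {
        if (c < '0' || c > '9')
            return {0, false};
        r = r * 10 + (c - '0');
    }
    return {r, true};
}

// Hypothetical helper: keeps the value and drops the success flag.
int parseValueOnly(const std::string &str) {
    int v = 0;
    std::tie(v, std::ignore) = parseInt(str);
    return v;
}
```

Writing parseInt("2004").first would do the same job; either way the caller has to say explicitly that the flag is discarded, whereas the proposed syntax discards it implicitly.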
You can name the return value and then use it like a C# reference argument. The C# function
    void inc2(ref int number1, ref int number2) {
        number1++;
        number2++;
    }
could be written as
    (int number1, int number2) inc2(int number1, int number2) {
        number1++;
        number2++;
    }
Note that input and output values have the same name and no return statement is needed, since the return values are named and already set. When you name output variables you can also combine initialized return values with the return statement. Here's an alternative implementation for parseInt():
    (int r = 0, bool success = true) parseInt(String str) {
        char[] s = str.ToCharArray();
        foreach (char c in s) {
            if ((c < '0') || (c > '9'))
                return (0, false);
            r = r * 10 + (c - '0');
        }
    }
Another two lines of code saved. Since a tuple may also consist of a single item, it's also possible to use the same syntax for a function with only one return value:
    (int r) add(int a, int b) { r = a+b; }
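Named, pre-initialized return values have a rough counterpart in C++ as well: a small struct with default member initializers. The following is a sketch under that assumption, not an equivalent; the struct name ParseResult is invented for the example, and C++ still needs an explicit return statement where the proposed syntax needs none.

```cpp
#include <string>

// Named results with initial values, standing in for
// "(int r = 0, bool success = true)". The struct name is hypothetical.
struct ParseResult {
    int r = 0;
    bool success = true;
};

ParseResult parseInt(const std::string &str) {
    ParseResult out;                 // starts as r = 0, success = true
    for (char c : str) {
        if (c < '0' || c > '9') {
            out.r = 0;
            out.success = false;
            return out;              // early exit, like "return (0, false)"
        }
        out.r = out.r * 10 + (c - '0');
    }
    return out;                      // explicit return is still required
}
```

The caller reads the results by name, for example parseInt("2004").r, which keeps the self-documenting quality of the named version at the cost of declaring an extra type.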
To summarize, I think that tuples are a better solution for the multiple-return-value problem than reference arguments. They feel more natural because they bundle input and output values in declaration and invocation. The concept is closer to Smalltalk's concept of messages, which makes it easier to create bindings to message-based protocols like SOAP. And last but not least, it would help you write shorter code.