Jan 24, 2004

An Alternative Syntax for Multiple Return Values in C-based Languages

Most functions do not need more than one return value, but when you need
more there's no easy way around it. Java has this problem: functions are
limited to one return value, and in my experience it is often complicated
to get along with this limitation. In most cases you either write more
functions than necessary or you need to create additional classes
that need even more constructors, accessor methods and documentation.


C++ and C# allow multiple return values using pointers and references. This solves the problem,
but I think it feels wrong and especially the C++ syntax does not
create very readable code. So I am going to show you an alternative, derived
from Python's tuples. But first the existing syntax variants and their
disadvantages:



Classic C syntax

The classic C syntax for multiple return values is to use a pointer.
Here is an example function that parses an integer in a string. It returns
a boolean to show whether it was parsed successfully and the integer itself:

// implementation:
int parseInt(const char *str, bool *success) {
	const char *s = str;
	int r = 0;
	while (*s) {
		char c = *s++;	// read the current character, then advance

		if ((c < '0') || (c > '9')) {
			*success = false;
			return 0;
		}
		r = r * 10 + (c - '0');
	}
	*success = true;
	return r;
}

// invocation:
bool s;
int v = parseInt("2004", &s);



Disadvantages:

  • Neither the declaration nor the invocation syntax indicates whether 'success'
    is really a return value. It may also be just an optimization for an input value
    (admittedly unlikely in this example) or may be both an input and an output
    value. Only the documentation and the implementation can tell you
  • You can not find out whether null is allowed for 'success' without
    looking at the documentation or the implementation
  • The compiler won't catch a bug if 'success' is not initialized before
    returning in some code paths, because it does not know the purpose of 'success'




Classic C syntax with null

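This is the same as above, but it allows a null pointer (0) for 'success' in order to
make it optional. Note that the default argument requires C++; plain C does not
have default arguments: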

// implementation:
int parseInt(const char *str, bool *success = 0) {
	const char *s = str;
	int r = 0;
	while (*s) {
		char c = *s++;	// read the current character, then advance
		if ((c < '0') || (c > '9')) {
			if (success)
				*success = false;
			return 0;
		}
		r = r * 10 + (c - '0');
	}
	if (success)
		*success = true;
	return r;
}

// invocation
int v = parseInt("2004");


Disadvantages:

  • You still need to look at documentation/implementation to find out what
    success is good for
  • The compiler will still not notice when success has not been set
    before returning, and the check whether 'success' is null adds another
    potential error
  • Two additional lines of code were needed in the implementation
    to make success optional



C++ syntax with references

// implementation:
int parseInt(const char *str, bool &success) {
	const char *s = str;
	int r = 0;
	while (*s) {
		char c = *s++;	// read the current character, then advance
		if ((c < '0') || (c > '9')) {
			success = false;
			return 0;
		}
		r = r * 10 + (c - '0');
	}
	success = true;
	return r;
}

// invocation:
bool s;
int v = parseInt("2004", s);



Advantages:

  • References do not have the 'null' issue or other pointer problems



Disadvantages:

  • The invocation does not have any hint that the second argument will be
    modified. This can make code very hard to read if you do not know
    the functions, because any function may modify any argument
  • You still do not know whether 'success' is an input or an output
    value
  • Default values are not possible, you always need to have a bool even
    if you do not look at it
  • The compiler won't notice the bug when 'success' is not initialized in
    some code paths, because it does not know the purpose of 'success'
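
The first point can be shown with a minimal sketch. The function names byValue, byReference and callSitesLookAlike are made up for this illustration. Both calls look exactly the same at the call site, yet only one of them changes the caller's variable:

```cpp
// Hypothetical functions that differ only in the parameter type.
void byValue(bool flag) { flag = true; }       // modifies a local copy only
void byReference(bool &flag) { flag = true; }  // modifies the caller's variable

// Returns true if the two calls behave as described in the comments.
bool callSitesLookAlike() {
	bool a = false, b = false;
	byValue(a);      // a is still false afterwards
	byReference(b);  // identical call syntax, but b is now true
	return !a && b;
}
```

Without looking at the declarations, a reader of the two calls has no way to tell them apart.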




C# syntax

This is the same function in C#. IMHO the C# syntax is vastly superior to
the C++ alternatives:

// implementation:
int parseInt(String str, out bool success) {
	char[] s = str.ToCharArray();
	int r = 0;
	foreach (char c in s) {
		if ((c < '0') || (c > '9')) {
			success = false;
			return 0;
		}
		r = r * 10 + (c - '0');
	}
	success = true;
	return r;
}

// invocation:
bool s;
int v = parseInt("2004", out s);



Advantages:

  • It's obvious in declaration and invocation that 'success' is an output
    argument (in/out arguments use the keyword 'ref')
  • The compiler can check whether 'success' has been set by the function
    before returning
  • There are no pointer issues

Disadvantages:

  • Default arguments are not possible (a C# limitation)
  • You always need to declare the bool before invoking the function



Using Python-style tuples

An alternative to the C# syntax would be using Python-like tuples.
Tuples are comma-separated values in parentheses that can be on the left and right
side of an assignment statement. The syntax would look like this:

int x, y, z;
(x, y, z) = (1, 2, 3);

// The equivalent of the last line is:
x = 1;
y = 2;
z = 3;

// The source tuple can contain expressions as items:
(x, y) = (z-2, 5*5);

// the right side can have more items than the left (but not the other way round):
(x, y) = (1, 2, 3, 4, 5);

// the left side can be a regular value; then only the first item is taken:
x = (1, 2, 3);

// local variable declaration in a tuple
(int a, int b, int c) = (10, 20, 30);

// A tuple can combine several types, as long as the types of both sides match:
(x, bool f, double d) = (5, true, 3.14);

// Unlike other languages, the assignment is processed item-by-item:
(x, y) = (5, 10);
(x, y) = (y, x);
// now x and y are both 10! Swapping is not possible.

// When you embed the statement it returns the first item's value:
if ( (f, a) = (true, x) ) {
	// always executed
}

Note that tuples only exist as a helper construct for assignments. You can not use
operators on them, they are not represented by an object, they can not be
used like arrays etc.
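
For comparison, here is a small C++ sketch of the two assignment orders. It uses std::tie and std::make_tuple from the tuple library that was later standardized in C++11; the function names assignViaTie and assignItemByItem are made up for this illustration. The library version copies the whole right side into a temporary before assigning anything, so it swaps, while the item-by-item order described above does not:

```cpp
#include <tuple>
#include <utility>

// Library tuples: make_tuple copies y and x first, then both are assigned.
std::pair<int, int> assignViaTie(int x, int y) {
	std::tie(x, y) = std::make_tuple(y, x);
	return {x, y};
}

// Item-by-item processing as proposed in this article: x = y; then y = x;
std::pair<int, int> assignItemByItem(int x, int y) {
	x = y;  // x now holds the old value of y
	y = x;  // y is assigned its own old value again
	return {x, y};
}
```

Called with (5, 10), assignViaTie yields (10, 5) and assignItemByItem yields (10, 10), which matches the example above.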

Now that there are tuples it becomes easy to extend the function syntax to have
several return values - just return a tuple:

// implementation:
(int, bool) parseInt(String str) {
	char[] s = str.ToCharArray();
	int r = 0;
	foreach (char c in s) {
		if ((c < '0') || (c > '9'))
			return (0, false);
		r = r * 10 + (c - '0');
	}
	return (r, true);
}

// invocation:
(int v, bool s) = parseInt("2004");

What I like most about that syntax is that it makes the code more compact. In
this example it removed three lines of code. It is also a nice solution for
optional return values.

If you don't need the second return value, just write

int v = parseInt("2004");
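
Something close to this can be approximated in C++17, which did not exist when this article was written. The following is only a sketch under that assumption, using std::pair and structured bindings; the demo() function is a hypothetical caller added for illustration. Unlike the proposed syntax, dropping the second value requires naming the member explicitly:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Returns the parsed value and a success flag as one pair.
std::pair<int, bool> parseInt(const std::string &str) {
	int r = 0;
	for (char c : str) {
		if (c < '0' || c > '9')
			return {0, false};
		r = r * 10 + (c - '0');
	}
	return {r, true};
}

// Hypothetical caller showing both ways to consume the result.
void demo() {
	// Structured bindings declare both variables at the call site.
	auto [v, s] = parseInt("2004");
	assert(s && v == 2004);

	// Ignoring the flag means selecting the first member by name.
	int w = parseInt("12x").first;
	assert(w == 0);
}
```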

You can name the return value and then use it like a C# reference
argument. The C# function

void inc2(ref int number1, ref int number2) {
	number1++;
	number2++;
}

could be written as

(int number1, int number2) inc2(int number1, int number2) {
	number1++;
	number2++;
}

Note that input and output values have the same name and no return
statement is needed, since the return values are named and already set.
When you name output variables you can also combine initialized return
values with the return statement. Here's an alternative implementation for
parseInt():

(int r = 0, bool success = true) parseInt(String str) {
	char[] s = str.ToCharArray();
	foreach (char c in s) {
		if ((c < '0') || (c > '9'))
			return (0, false);
		r = r * 10 + (c - '0');
	}
}

That saves another two lines of code. Since a tuple may also consist of a single
item, it's also possible to use the same syntax for a function with only one
return value:

(int r) add(int a, int b) {
	r = a+b;
}

To summarize, I think that tuples are a better solution for the multiple-return-value problem than reference arguments. They feel more natural because they bundle input and output values in declaration and invocation. The concept is closer to the Smalltalk concept of messages, which makes it easier to create bindings to message-based protocols like SOAP. And last but not least, it would help you write shorter code.

Comments

Just some thoughts to the c++ variant
>int parseInt(const char *str, bool &success)
>Disadvantages:
>The invocation does not have any hint that the second argument will be modified.
It has. If it wasn't supposed to be modified it would have been const bool& instead of bool&
A reference without const modifier as function argument should usually be "output values", at least atm i don't see a reason to have a non-const reference for any other purpose.

>Default values are not possible, you always need to have a bool even if you do not look at it
I'm not sure i'm getting this(probably you mean something else, than i understand), but default values are of course possible for reference parameters.

There is even a way to get your tuples in c++, e.g. take a look at Loki::Tuple.


By rischwa at Sat, 01/24/2004 - 03:31

>If it wasn't supposed to be modified it would have been const bool& instead of bool&

Hmm.. you're right, didn't think of it. You would have to depend on the API developer to use const for all unmodified values though, and it does not answer the question whether the argument will be read or not.



>I'm not sure i'm getting this (probably you mean something else than I understand), but default values are of course possible for reference parameters.

But they don't do what you want. You can't write

int parseInt(const char *str, bool &success = false) {
}

to save the API user from giving the second argument. You can avoid the problem by working around it with a second variable

static bool dummyBool;
int parseInt(const char *str, bool &success = dummyBool) {
}

but I would call that extremely ugly, and it won't look good in the API docs.




>There is even a way to get your tuples in c++, e.g. take a look at Loki::Tuple.


You can emulate almost every feature with plain C and a preprocessor; the only difference is how short and concise the syntax is.


By tjansen at Sat, 01/24/2004 - 05:15
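
One way to avoid the global dummy variable from the comment above is a second overload that discards the flag. This is only a sketch of a common C++ idiom and was not proposed by anyone in the thread; the variable name ignored is made up:

```cpp
// Full version: the caller must supply a bool to receive the flag.
int parseInt(const char *str, bool &success) {
	int r = 0;
	for (const char *s = str; *s; ++s) {
		if (*s < '0' || *s > '9') {
			success = false;
			return 0;
		}
		r = r * 10 + (*s - '0');
	}
	success = true;
	return r;
}

// Convenience overload: same parsing, but the flag is thrown away.
int parseInt(const char *str) {
	bool ignored;
	return parseInt(str, ignored);
}
```

The price is a second declaration to document, so this trades the ugly default argument for a slightly larger API.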

Why not do:

int parseInt( const char *str );

...

try
{
int v = parseInt( "2006" );
} catch(...) {
...
}

This is C++ after all :D


By Andre Eisenbach at Thu, 03/23/2006 - 18:08
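
The exception-based variant suggested in the comment above could be sketched as follows. The choice of std::invalid_argument and the helper parseOrDefault are assumptions made for this illustration; the comment itself leaves those details open:

```cpp
#include <stdexcept>

// Returns the parsed value; throws instead of reporting a success flag.
int parseInt(const char *str) {
	int r = 0;
	for (const char *s = str; *s; ++s) {
		if (*s < '0' || *s > '9')
			throw std::invalid_argument("parseInt: not a digit");
		r = r * 10 + (*s - '0');
	}
	return r;
}

// Hypothetical caller: falls back to a default when parsing fails.
int parseOrDefault(const char *str, int fallback) {
	try {
		return parseInt(str);
	} catch (const std::invalid_argument &) {
		return fallback;
	}
}
```

This removes the second return value altogether, at the cost of a try block wherever a parse failure is an expected case.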

>A reference without const modifier as function argument should usually be "output values".

Or an input/output value. In this latter case it would need to be initialized. I remember Ada lets you specify whether a parameter is "in", "out" or "in out". Does C# have "in out" too?


By [email protected] at Sat, 01/24/2004 - 20:50

Does C# has “in out


By tjansen at Sat, 01/24/2004 - 23:37

You can easily achieve what you proposed with a simple C++ struct template:

pair<int, bool> parseInt(const char*);

Of course, the ugly thing is that you must write code like this to use that function:

int v = parseInt("1024").first;
pair<int, bool> result = parseInt("1234");
if (result.second)
  // do something

The first problem could be easily alleviated by adding a cast operator to the first type of the template.

Further improvements would call for more arguments than just two typenames to the template, as well as an assignment operator (and copy constructor) against other, larger tuples.

Talking about the cost of such constructs, remember that practically every ABI out there has a method of returning ONE and one only return value. So, extending this to multiple values necessarily introduces the need of a structure and, therefore, an implicit first parameter.

Assume using namespace std;, since writing std::pair requires quoting the second : to &#58;. How do I make paragraphs in here?


By Thiago Macieira at Sat, 01/24/2004 - 05:13
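
The cast-operator idea from the comment above might look roughly like this. Result is a hypothetical struct template written for this sketch, not std::pair, which has no such conversion:

```cpp
// Hypothetical two-item result that converts implicitly to its first item.
template <typename First, typename Second>
struct Result {
	First first;
	Second second;
	operator First() const { return first; }
};

Result<int, bool> parseInt(const char *str) {
	int r = 0;
	for (const char *s = str; *s; ++s) {
		if (*s < '0' || *s > '9')
			return {0, false};
		r = r * 10 + (*s - '0');
	}
	return {r, true};
}
```

With this, `int v = parseInt("1024");` compiles without `.first`, while `Result<int, bool> result = parseInt("1234");` still gives access to `result.second`.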

>Talking about the cost of such constructs, remember that practically every ABI out there has a method of returning ONE and one only return value. So, extending this to multiple values necessarily introduces the need of a structure and, therefore, an implicit first parameter.



It's hardly possible with the Linux binary ABI or Java, but quite easy with CLI. Just add every 'out' argument to the list of return values, and have every 'ref' argument in both lists.



>How do I make paragraphs in here?


<br><br> :)

It's quite messy, but I am getting used to escaping text on this site. Since I started posting code snippets www.asciitable.org became one of my most frequently visited sites...


By tjansen at Sat, 01/24/2004 - 13:32

Tuples will be a part of the upcoming C++ Standard.
For more information: http://std.dkuug.dk/jtc1/sc22/wg21/docs/library_technical_report.html or http://www.cuj.com/documents/s=8250/cujcexp2106sutter/


By Christian Loose at Sat, 01/24/2004 - 11:37

Thanks, interesting, didn't know that. My problem with most template-based features is that they create pretty verbose code, and the compilers create horrible error messages when you do something wrong.


By tjansen at Sat, 01/24/2004 - 13:59

The code for the proposed extensions already exists as part of boost (same author):
http://www.boost.org/libs/tuple/doc/tuple_users_guide.html

BTW... Jan, your article series is quite interesting to read. Great stuff!


By Eva Brucherseifer at Sun, 01/25/2004 - 17:12
