Declarations and types

Declarations — Declarations and types

Description

The type and the use of declarations are explained. Also discussed is the calling convention (functions

are called by address, not by value, and so may change the caller’s arguments), and the use of external

globals.

Mata also has structures—the eltype is struct name—but these are not discussed here. For a

discussion of structures, see [M-2] struct.

Mata also has classes—the eltype is class name—but these are not discussed here. For a discussion

of classes, see [M-2] class.

Declarations are optional but, for careful work, their use is recommended.

Syntax

        declaration1 fcnname(declaration2)
        {
                declaration3
                . . .
        }

such as

        real matrix myfunction(real matrix X, real scalar i)
        {
                real scalar     j, k
                real vector     v
                . . .
        }

declaration1 is one of

        function
        type [function]
        void [function]

declaration2 is

        [type] argname [, [type] argname [, . . . ]]

where argname is the name you wish to assign to the argument.

declaration3 are lines of the form of either of

        type varname [, varname [, . . . ]]
        external [type] varname [, varname [, . . . ]]

type is defined as one of

        eltype orgtype          such as real vector
        eltype                  such as real
        orgtype                 such as vector

eltype and orgtype are each one of

        eltype                  orgtype
        -----------------------------------
        transmorphic            matrix
        numeric                 vector
        real                    rowvector
        complex                 colvector
        string                  scalar
        pointer

If eltype is not specified, transmorphic is assumed. If orgtype is not specified, matrix is assumed.

Remarks and examples

Remarks are presented under the following headings:

The purpose of declarations

Types, element types, and organizational types

Implicit declarations

Element types

Organizational types

Function declarations

Argument declarations

The by-address calling convention

Variable declarations

Linking to external globals

The purpose of declarations

Declarations occur in three places: in front of function definitions, inside the parentheses defining the function’s arguments, and at the top of the body of the function, defining private variables the function will use. For instance, consider the function

        real matrix swaprows(real matrix A, real scalar i1, real scalar i2)
        {
                real matrix     B
                real rowvector  v

                B = A
                v = B[i1, .]
                B[i1, .] = B[i2, .]
                B[i2, .] = v
                return(B)
        }

This function returns a copy of matrix A with rows i1 and i2 swapped.

There are three sets of declarations in the above function. First, there is a declaration in front of the

function name:

        real matrix swaprows(. . . )
        {
                . . .
        }

That declaration states that this function will return a real matrix.

The second set of declarations occurs inside the parentheses:

        . . . swaprows(real matrix A, real scalar i1, real scalar i2)
        {
                . . .
        }

Those declarations state that this function expects to receive three arguments, which we chose to call

A, i1, and i2, and which we expect to be a real matrix, a real scalar, and a real scalar, respectively.

The third set of declarations occurs at the top of the body of the function:

        . . . swaprows(. . . )
        {
                real matrix     B
                real rowvector  v

                . . .
        }

Those declarations state that we will use variables B and v inside our function and that, as a matter

of fact, B will be a real matrix and v a real row vector.

We could have omitted all those declarations. Our function could have read

        function swaprows(A, i1, i2)
        {
                B = A
                v = B[i1, .]
                B[i1, .] = B[i2, .]
                B[i2, .] = v
                return(B)
        }

and it would have worked just fine. So why include the declarations?

1. By including the outside declaration, we announced to other programs what to expect. They

can depend on swaprows() returning a real matrix because, when swaprows() is done,

Mata will verify that the function really is returning a real matrix and, if it is not, abort

execution.

Without the outside declaration, anything goes. Our function could return a real scalar in

one case, a complex row vector in another, and nothing at all in yet another case.

Including the outside declaration makes debugging easier.

2. By including the argument declaration, we announced to other programmers what they are

expected to pass to our function. We have made it easier to understand our function.

We have also told Mata what to expect and, if some other program attempts to use our

function incorrectly, Mata will stop execution.

Just as in (1), we have made debugging easier.

3. By including the inside declaration, we have told Mata what variables we will need and how we will be using them. Mata can do two things with that information: first, it can make sure that we are using the variables correctly (making debugging easier again), and second, Mata can produce more efficient code (making our function run faster).

Interactively, we admit that we sometimes define functions without declarations. For more careful work, however, we include them.
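As a rough illustration of the checking described in points 1 through 3, the sketch below defines the declared version of swaprows() and calls it once correctly. It assumes a fresh Mata session (no existing function named swaprows) and is meant to be run from a do-file; the matrix X is a made-up example.

```mata
mata:
real matrix swaprows(real matrix A, real scalar i1, real scalar i2)
{
        real matrix     B
        real rowvector  v

        B = A
        v = B[i1, .]
        B[i1, .] = B[i2, .]
        B[i2, .] = v
        return(B)
}

X = (1, 2 \ 3, 4 \ 5, 6)        // X is a hypothetical test matrix
swaprows(X, 1, 3)               // displays a copy of X with rows 1 and 3 exchanged
X                               // X itself is unchanged

// Uncommenting the next line should make Mata abort with a type error,
// because a string is passed where a real matrix was declared:
// swaprows("not a matrix", 1, 3)
end
```

With the undeclared version of the function, the commented-out call would not be rejected at the point of the call; any failure would surface later, inside the function body.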

Types, element types, and organizational types

When you use Mata interactively, you just willy-nilly create new variables:

: n = 2

: A = (1,2 \ 3,4)

: z = (sqrt(-4+0i), sqrt(4))

When you create a variable, you may not think about the type, but Mata does. n above is, to Mata,

a real scalar. A is a real matrix. z is a complex row vector.

Mata thinks of the type of a variable as having two parts:

1. the type of the elements the variable contains (such as real or complex) and

2. how those elements are organized (such as a row vector or a matrix).

We call those two parts the eltype (element type) and the orgtype (organizational type). The eltypes and orgtypes are

        eltype                  orgtype
        -----------------------------------
        transmorphic            matrix
        numeric                 vector
        real                    rowvector
        complex                 colvector
        string                  scalar
        pointer

You may choose one of each and so describe all the types Mata understands.
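One way to see the two parts of a type is to ask Mata directly. The sketch below uses the built-in functions eltype() and orgtype() on the three variables created interactively above; it is an illustration only, intended to be run from a do-file.

```mata
mata:
n = 2
A = (1, 2 \ 3, 4)
z = (sqrt(-4+0i), sqrt(4))

eltype(n), orgtype(n)           // real, scalar
eltype(A), orgtype(A)           // real, matrix
eltype(z), orgtype(z)           // complex, rowvector
end
```

Note that eltype() and orgtype() report what a variable currently contains, so they never return transmorphic, numeric, or vector; those appear only in declarations.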

Implicit declarations

When you do not declare an object, Mata behaves as if you declared it to be transmorphic matrix:

1. transmorphic means that the matrix can be real, complex, string, or pointer.

2. matrix means that the organization is to be r × c, r ≥ 0 and c ≥ 0.

At one point in your function, a transmorphic matrix might be a real scalar (real being a

special case of transmorphic and scalar being a special case of a matrix when r = c = 1), and

at another point, it might be a string colvector (string being a special case of transmorphic,

and colvector being a special case of a matrix when c = 1).

Consider our swaprows() function without declarations,

        function swaprows(A, i1, i2)
        {
                B = A
                v = B[i1, .]
                B[i1, .] = B[i2, .]
                B[i2, .] = v
                return(B)
        }

The result of compiling this function is just as if the function read

        transmorphic matrix swaprows(transmorphic matrix A,
                                     transmorphic matrix i1,
                                     transmorphic matrix i2)
        {
                transmorphic matrix     B
                transmorphic matrix     v

                B = A
                v = B[i1, .]
                B[i1, .] = B[i2, .]
                B[i2, .] = v
                return(B)
        }

When we declare a variable, we put restrictions on it.
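To make the lack of restrictions concrete, here is a small sketch in which one undeclared variable holds a real scalar at one point and a string colvector at another. The function name showtypes() is hypothetical, and the sketch assumes matastrict is off (the default) so that undeclared variables are accepted.

```mata
mata:
function showtypes()
{
        x = 5                                           // x is implicitly transmorphic matrix
        printf("%s %s\n", eltype(x), orgtype(x))        // prints: real scalar
        x = ("a" \ "b")                                 // same variable, new type
        printf("%s %s\n", eltype(x), orgtype(x))        // prints: string colvector
}

showtypes()
end
```

Had x been declared real scalar, the second assignment would not have been allowed.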

Element types

There are six eltypes, or element types:

1. transmorphic, which means real, complex, string, or pointer.

2. numeric, which means real or complex.

3. real, which means that the elements are real numbers, such as 1, 3, −50, and 3.14159.

4. complex, which means that each element is a pair of numbers, which are given the

interpretation a + bi. complex is a storage type; the number stored in a complex might

be real, such as 2 + 0i.

5. string, which means the elements are strings of text. Each element may contain up to 2,147,483,647 bytes and may, but need not, contain binary 0; that is, strings may be binary strings or text strings. Mata strings are similar to the strL type in Stata in that they can be very long and may contain binary 0. Mata strings, like all other strings in Stata, can contain Unicode characters and are stored in UTF-8 encoding.

6. pointer, which means the elements are pointers to (addresses of) other Mata matrices, vectors, scalars, or even functions; see [M-2] pointers.
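As a brief illustration of how numeric sits between transmorphic and real or complex, the sketch below declares a function that accepts either a real or a complex scalar. The function name twice() is hypothetical and the sketch assumes no function of that name already exists in the session.

```mata
mata:
// twice() is a made-up name used only for this illustration
numeric scalar twice(numeric scalar x)
{
        return(2*x)
}

twice(3)                // real argument: displays 6
twice(1 + 2i)           // complex argument: displays 2 + 4i

// A string argument is neither real nor complex, so uncommenting the
// next line should make Mata abort with a type error:
// twice("three")
end
```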

Organizational types

There are five orgtypes, or organizational types:

1. matrix, which means r × c, r ≥ 0 and c ≥ 0.

2. vector, which means 1 × n or n × 1, n ≥ 0.

3. rowvector, which means 1 × n, n ≥ 0.

4. colvector, which means n × 1, n ≥ 0.

5. scalar, which means 1 × 1.

Sharp-eyed readers will note that vectors and matrices can have zero rows or columns! See [M-2] void

for more information.
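The sketch below illustrates two of the points above under simple assumptions: a vector argument accepts either orientation, and a matrix may have zero rows. The function name addup() and the variable E are hypothetical.

```mata
mata:
// addup() is a made-up name; it accepts a 1 x n or an n x 1 argument
real scalar addup(real vector v)
{
        return(sum(v))
}

addup((1, 2, 3))        // row vector: displays 6
addup((1 \ 2 \ 3))      // column vector: displays 6

E = J(0, 3, .)          // a 0 x 3 real matrix
rows(E), cols(E)        // displays 0 and 3
end
```

Declaring the argument as rowvector or colvector instead would restrict callers to one orientation.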

Function declarations

Function declarations are the declarations that appear in front of the function name, such as

        real matrix swaprows(. . . )
        {
                . . .
        }

The syntax for what may appear there is

        function
        type [function]
        void [function]

Something must appear in front of the name, and if you do not want to declare the type (which makes

the type transmorphic matrix), you just put the word function:

        function swaprows(. . . )
        {
                . . .
        }

You may also declare the type and include the word function if you wish,

        real matrix function swaprows(. . . )
        {
                . . .
        }

but most programmers omit the word function; it makes no difference.

In addition to all the usual types, void is a type allowed only with functions—it states that the

function returns nothing:

        void _swaprows(real matrix A, real scalar i1, real scalar i2)
        {
                real rowvector  v

                v = A[i1, .]
                A[i1, .] = A[i2, .]
                A[i2, .] = v
        }

The function above returns nothing; it instead modifies the matrix it is passed. That might be useful to save memory, especially if every use of the original swaprows() was going to be

        A = swaprows(A, i1, i2)

In any case, we named this new function _swaprows() (note the underscore) to flag the user that there is something odd and deserving caution concerning the use of this function.

void, that is to say, returning nothing, is also considered a special case of a transmorphic matrix

because Mata secretly returns a 0 × 0 real matrix, which the caller just discards.

Argument declarations

Argument declarations are the declarations that appear inside the parentheses, such as

        . . . swaprows(real matrix A, real scalar i1, real scalar i2)
        {
                . . .
        }

The syntax for what may appear there is



        [type] argname [, [type] argname [, . . . ]]

The names are required—they specify how we will refer to the argument—and the types are optional.

Omit the type and transmorphic matrix is assumed. Specify the type, and it will be checked

when your function is called. If the caller attempts to use your function incorrectly, Mata will stop

the execution and complain.

The by-address calling convention

Arguments are passed to functions by address, not by value. If you change the value of an argument, you will change the caller’s argument. That is what made _swaprows() (above) work. The caller passed us A and we changed it. And that is why in the original version of swaprows(), the first line read

        B = A

We did our work on B and returned B. We did not want to modify the caller’s original matrix.

You do not ordinarily have to make copies of the caller’s arguments, but you do have to be careful if you do not want to change the argument. That is why in all the official functions (with the single exception of st_view(); see [M-5] st_view( )), if a function changes the caller’s argument, the function’s name starts with an underscore. The reverse logic does not hold: some functions start with an underscore and do not change the caller’s argument. The underscore signifies caution, and you need to read the function’s documentation to find out what it is you need to be cautious about.
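The effect of passing by address can be seen in a short sketch. It defines the void _swaprows() function from the section on function declarations and shows that the caller's matrix is changed in place. It assumes a fresh Mata session; X is a made-up example matrix.

```mata
mata:
void _swaprows(real matrix A, real scalar i1, real scalar i2)
{
        real rowvector  v

        v = A[i1, .]
        A[i1, .] = A[i2, .]
        A[i2, .] = v
}

X = (1, 2 \ 3, 4)       // hypothetical test matrix
_swaprows(X, 1, 2)      // nothing is returned or displayed
X                       // the caller's X is now (3, 4 \ 1, 2)
end
```

The copying version, swaprows(), would instead leave X as it was and return the swapped result.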

Variable declarations

The variable declarations are the declarations that appear at the top of the body of a function:

        . . . swaprows(. . . )
        {
                real matrix     B
                real rowvector  v

                . . .
        }

These declarations are optional. If you omit them, Mata will observe that you are using B and v in your code, and then Mata will compile your code just as if you had declared the variables to be transmorphic matrix, meaning that the resulting compiled code might be slightly less efficient than it could be, but that is all.

The variable declarations are optional unless you have set matastrict on (mata set matastrict on); see [M-3] mata set. Some programmers believe so strongly that variables ought to be declared that Mata provides this setting, which makes it issue an error when they forget.

In any case, these declarations, whether explicit or implicit, define the variables we will use. The variables we use in our function are private: it does not matter if there are other variables named B and v floating around somewhere. Private variables are created when a function is invoked and destroyed when the function ends. The variables are private but, as explained above, if we pass our variables to another function, that function may change their values. Most functions do not.
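The privacy of declared variables can be checked with a small sketch. A global named B is created interactively, and a function then uses its own private B; the global should be unaffected. The function name makeident() is hypothetical, and the sketch assumes a fresh Mata session.

```mata
mata:
B = "a global named B"          // global variable, created interactively

// makeident() is a made-up name used only for this illustration
real matrix makeident()
{
        real matrix     B       // private to makeident(); unrelated to the global B

        B = I(2)
        return(B)
}

makeident()                     // displays the 2 x 2 identity matrix
B                               // still displays: a global named B
end
```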

The syntax for declaring variables is

        type varname [, varname [, . . . ]]
        external [type] varname [, varname [, . . . ]]

real matrix B and real rowvector v match the ﬁrst syntax.

Linking to external globals

The second syntax has to do with linking to global variables. When you use Mata interactively and

type

: n = 2

you create a variable named n. That variable is global. When you code inside a function

        . . . myfunction(. . . )
        {
                external n
                . . .
        }

the n variable your function will use is the global variable named n. If your function were to examine the value of n right now, it would discover that it contained 2.

If the variable does not already exist, the statement external n will create it. Pretend that we had not previously defined n. If myfunction() were to examine the contents of n, it would discover that n is a 0 × 0 matrix. That is because we coded

external n

and Mata behaved as if we had coded

external transmorphic matrix n

Let’s modify myfunction() to read:

        . . . myfunction(. . . )
        {
                external real scalar n
                . . .
        }

Let’s consider the possibilities:

1. n does not exist. Here external real scalar n will create n—as a real scalar, of

course—and set its value to missing.

If n had been declared a rowvector, a 1 × 0 vector would have been created.

If n had been declared a colvector, a 0 × 1 vector would have been created.

If n had been declared a vector, a 0 × 1 vector would have been created. Mata could just

as well have created a 1 × 0 vector, but it creates a 0 × 1.

If n had been declared a matrix, a 0 × 0 matrix would have been created.

2. n exists, and it is a real scalar. Our function executes, using the global n.

3. n exists, and it is a real 1 × 1 rowvector, colvector, or matrix. The important thing

is that it is 1 × 1; our function executes, using the global n.

4. n exists, but it is complex or string or pointer, or it is real but not 1 × 1. Mata

issues an error message and aborts execution of our function.

Complicated systems of programs sometimes find it convenient to communicate via globals. Because globals are visible everywhere and can collide with other names, we recommend that you give your globals long names. A good approach is to put the name of your system as a prefix:

        . . . myfunction(. . . )
        {
                external real scalar    mysystem_n
                . . .
        }

For another approach to globals, see [M-5] findexternal( ) and [M-5] valofexternal( ).
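As a sketch of how a prefixed external global might be used, the function below keeps a running counter across calls. The names mysystem_next() and mysystem_counter are hypothetical, and the sketch assumes no global named mysystem_counter exists beforehand, so that the external declaration creates it as a real scalar set to missing (case 1 above).

```mata
mata:
// mysystem_next() and mysystem_counter are made-up names for illustration
real scalar mysystem_next()
{
        external real scalar    mysystem_counter

        // On the first call the global has just been created and is missing
        if (mysystem_counter >= .) mysystem_counter = 0

        mysystem_counter = mysystem_counter + 1
        return(mysystem_counter)
}

mysystem_next()         // displays 1
mysystem_next()         // displays 2
mysystem_counter        // the global is visible interactively: displays 2
end
```

Because the counter lives in a global, any other function that declares external real scalar mysystem_counter shares the same value, which is both the convenience and the risk of this approach.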

Also see

[M-2] Intro — Language definition

Stata, Stata Press, and Mata are registered trademarks of StataCorp LLC. Stata and

Stata Press are registered trademarks with the World Intellectual Property Organization

of the United Nations. StataNow and NetCourseNow are trademarks of StataCorp

LLC. Other brand and product names are registered trademarks or trademarks of their

respective companies. Copyright © 1985–2023 StataCorp LLC, College Station, TX,

For suggested citations, see the FAQ on citing Stata documentation.