This lecture is a slightly modified version of https://lectures.quantecon.org/jl/fundamental_types.html Thank you to the amazing Quantecon.org team!
In Julia, arrays and tuples are the most important data types for working with numerical data
In this lecture we give more details on these types
# using InstantiateFromURL
# activate the QuantEcon environment
# activate_github("QuantEcon/QuantEconLecturePackages", tag = "v0.9.5");
# load some packages
using LinearAlgebra, Statistics, Compat
(See multi-dimensional arrays documentation)
Since it is one of the most important types, we will start with arrays
Later, we will see how arrays (and all other types in Julia) are handled in a generic and extensible way
We’ve already seen some Julia arrays in action
a = [10, 20, 30]
a = [1.0, 2.0, 3.0]
The output tells us that the arrays are of types Array{Int64,1} and Array{Float64,1} respectively
Here Int64 and Float64 are types for the elements inferred by the compiler
We’ll talk more about types later
The 1 in Array{Int64,1} and Array{Float64,1} indicates that the array is one dimensional (i.e., a Vector)
This is the default for many Julia functions that create arrays
typeof(randn(100))
In Julia, one dimensional vectors are best interpreted as column vectors, which we will see when we take transposes
We can check the dimensions of a using the size() and ndims() functions
ndims(a)
size(a)
The syntax (3,) displays a tuple containing one element – the size along the one dimension that exists
In Julia, Vector and Matrix are just aliases for one- and two-dimensional arrays respectively
Array{Int64, 1} == Vector{Int64}
Array{Int64, 2} == Matrix{Int64}
Vector construction with , is then interpreted as a column vector
To see this, we can create a column vector and row vector more directly
[1, 2, 3] == [1; 2; 3] # both column vectors
[1 2 3] # a row vector is 2-dimensional
As we’ve seen, in Julia we have both (1, n) and (n, 1) arrays, which represent row and column vectors respectively
Why do we need both?
On one hand, dimension matters for matrix algebra
On the other, we use arrays in many settings that don’t involve matrix algebra
In such cases, we don’t care about the distinction between row and column vectors
This is why many Julia functions return flat arrays by default
zeros(3)
# This generalizes to matrices and higher dimensional arrays
zeros(2, 2)
# To return an array filled with a single value, use `fill`
fill(5.0, 2, 2)
# You can create an uninitialized array using the `Array()` constructor
# need to pass `undef`, which flags the memory as uninitialized
x = Array{Float64}(undef, 2, 2)
The printed values you see here are just garbage values (the existing contents of the allocated memory slots being interpreted as 64 bit floats)
If you need more control over the types, fill with a non-floating point:
fill(0, 2, 2) # fills with 0, not 0.0
# Or fill with a boolean type
fill(false, 2, 2) # produces a boolean matrix
For the most part, we will avoid directly specifying the types of arrays, and let the compiler deduce the optimal types on its own
The reasons for this, discussed in more detail in this lecture, are to ensure both clarity and generality
One place this can be inconvenient is when we need to create an array based on an existing array
First, note that assignment in Julia binds a name to a value, but does not make a copy of that type
x = [1, 2, 3]
y = x
y[1] = 2
x
In the above, y = x simply creates a new named binding called y which refers to whatever x currently binds to
To copy the data, you need to be more explicit
x = [1, 2, 3]
y = copy(x)
y[1] = 2
x
However, rather than making a copy of x, you may want to just have a similarly sized array
x = [1, 2, 3]
y = similar(x)
y
We can also use similar to pre-allocate a vector with a different size, but the same shape
x = [1, 2, 3]
y = similar(x, 4) # make a vector of length 4
# higher dimensions
x = [1, 2, 3]
y = similar(x, 2, 2) # make a 2x2 matrix
As we’ve seen, you can create one dimensional arrays from manually specified data like so
a = [10, 20, 30, 40]
In two dimensions we can proceed as follows
a = [10 20 30 40] # two dimensional, shape is 1 x n
ndims(a)
a = [10 20; 30 40] # 2 x 2
You might then assume that a = [10; 20; 30; 40] creates a two dimensional column vector but this isn’t the case
a = [10; 20; 30; 40]
ndims(a)
Instead transpose the matrix (or adjoint if complex)
a = [10 20 30 40]'
ndims(a)
We’ve already seen the basics of array indexing
a = [10 20 30 40]
a[end-1]
a[1:3]
For 2D arrays the index syntax is straightforward
a = randn(2, 2)
a[1, 1]
a[1, :] # first row
a[:, 1] # first column
Booleans can be used to extract elements
a = randn(2, 2)
b = [true false; false true]
a[b]
This is useful for conditional extraction, as we’ll see below
An aside: some or all elements of an array can be set equal to one number using slice notation
a = zeros(4)
a[2:end] .= 42
a
Using the : notation provides a slice of an array, copying the sub-array to a new array with a similar type
a = [1 2; 3 4]
b = a[:, 2]
@show b
a[:, 2] = [4, 5] # modify a
@show a
@show b;
A view on the other hand does not copy the value
a = [1 2; 3 4]
@views b = a[:, 2]
@show b
a[:, 2] = [4, 5]
@show a
@show b;
Note that the only difference is the @views macro, which will replace any slices with views in the expression
An alternative is to call the view function directly – though it is generally discouraged since it is a step away from the math
@views b = a[:, 2]
view(a, :, 2) == b
As with most programming in Julia, it is best to avoid prematurely assuming that @views will have a significant impact on performance, and to stress code clarity above all else
Another important lesson about @views is that they are not normal, dense arrays
a = [1 2; 3 4]
b_slice = a[:, 2]
@show typeof(b_slice)
@show typeof(a)
@views b = a[:, 2]
@show typeof(b);
The type of b is a good example of how types are not as they may seem
Similarly
a = [1 2; 3 4]
b = a' # transpose
typeof(b)
To copy into a dense array
a = [1 2; 3 4]
b = a' # transpose
c = Matrix(b) # convert to matrix
d = collect(b) # also `collect` works on any iterable
c == d
As we saw with transpose, sometimes types that look like matrices are not stored as a dense array
As an example, consider creating a diagonal matrix
d = [1.0, 2.0]
a = Diagonal(d)
As you can see, the type is 2×2 Diagonal{Float64,Array{Float64,1}}, which is not a 2-dimensional array
The reasons for this are both efficiency in storage, as well as efficiency in arithmetic and matrix operations
In every important sense, matrix types such as Diagonal are just as much a “matrix” as the dense matrices we have been using (see the introduction to types lecture for more)
@show 2a
b = rand(2,2)
@show b * a;
Another example is in the construction of an identity matrix, where a naive implementation is
b = [1.0 2.0; 3.0 4.0]
b - Diagonal([1.0, 1.0]) # poor style, inefficient code
# should do this instead
b = [1.0 2.0; 3.0 4.0]
b - I # good style, and note the lack of dimensions of I
While the implementation of I is a little abstract to go into at this point, a hint is:
typeof(I)
This is a UniformScaling type rather than an identity matrix, making it much more powerful and general
As discussed above, in Julia, the left hand side of an assignment is a “binding” or a label to a value
x = [1 2 3]
y = x # name `y` binds to whatever value `x` bound to
The consequence of this is that you can re-bind that name
x = [1 2 3]
y = x # name `y` binds to whatever `x` bound to
z = [2 3 4]
y = z # only changes name binding, not value!
@show (x, y, z);
What this means is that if a is an array and we set b = a then a and b point to exactly the same data in your RAM!
In the above, suppose you had meant to change the values of y (and hence of x, which shares the same data) to the values of z – you need to assign the values rather than the name
x = [1 2 3]
y = x # name `y` binds to whatever `x` bound to
z = [2 3 4]
y .= z # now dispatches the assignment of each element
@show (x, y, z);
Alternatively, you could have used y[:] = z
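A small sketch showing that y[:] = z has the same effect as y .= z (the array values are just illustrative):

```julia
x = [1 2 3]
y = x        # `y` binds to the same array as `x`
z = [2 3 4]
y[:] = z     # copies the elements of `z` into the existing array, same as `y .= z`
x            # `x` sees the change, since `x` and `y` share the same data
```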
This applies to in-place functions as well
First, define a simple function for a linear map
function f(x)
return [1 2; 3 4] * x # matrix * column vector
end
val = [1, 2]
f(val)
In general, these “out-of-place” functions are preferred to “in-place” functions, which modify the arguments
function f(x)
return [1 2; 3 4] * x # matrix * column vector
end
val = [1, 2]
y = similar(val)
function f!(out, x)
out .= [1 2; 3 4] * x
end
f!(y, val)
y
This demonstrates a key convention in Julia: functions which modify any of the arguments have names ending with ! (e.g. push!)
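A minimal illustration of this convention (the array values here are just for demonstration):

```julia
a = [1, 2]
push!(a, 3)           # `push!` appends in place, modifying `a` – hence the `!`
sort!(a, rev = true)  # `sort!` likewise sorts in place
a                     # now [3, 2, 1]
```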
We can also see a common mistake, where instead of modifying the arguments, the name binding is swapped
function f(x)
return [1 2; 3 4] * x # matrix * column vector
end
val = [1, 2]
y = similar(val)
function f!(out, x)
out = [1 2; 3 4] * x # MISTAKE! Should be .= or [:]
end
f!(y, val)
y
Note that scalars are always immutable, such that
y = [1 2]
y .-= 2 # y .= y .- 2, no problem
x = 5
# x .-= 2 # Fails!
x = x - 2 # subtle difference - creates a new value and rebinds the variable
In particular, there is no way to pass any immutable into a function and have it modified
x = 2
function f(x)
x = 3 # MISTAKE! does not modify x, creates a new value!
end
f(x) # cannot modify immutables in place
@show x;
This is also true for other immutable types such as tuples, as well as some vector types
using StaticArrays
xdynamic = [1, 2]
xstatic = @SVector [1, 2] # turns it into a highly optimized static vector
f(x) = 2x
@show f(xdynamic)
@show f(xstatic)
# inplace version
function g(x)
x .= 2x
return "Success!"
end
@show xdynamic
@show g(xdynamic)
@show xdynamic;
# g(xstatic) # fails, static vectors are immutable
Julia provides standard functions for acting on arrays, some of which we’ve already seen
a = [-1, 0, 1]
@show length(a)
@show sum(a)
@show mean(a)
@show std(a) # standard deviation
@show var(a) # variance
@show maximum(a)
@show minimum(a)
@show extrema(a) # (minimum(a), maximum(a))
To sort an array
b = sort(a, rev = true) # returns new array, original not modified
b = sort!(a, rev = true) # returns *modified original* array
b == a # tests if they have the same values
b === a # tests if the arrays are identical (i.e. share the same memory)
For two dimensional arrays, * means matrix multiplication
a = ones(1, 2)
b = ones(2, 2)
a * b
b * a'
To solve the linear system $ A X = B $ for $ X $ use A \ B
A = [1 2; 2 3]
B = ones(2, 2)
A \ B
inv(A) * B
Although the last two operations give the same result, the first one is numerically more stable and should be preferred in most cases
Multiplying two one dimensional vectors gives an error – which is reasonable since the meaning is ambiguous
ones(2) * ones(2)
If you want an inner product in this setting use dot() or the unicode \cdot<TAB>
dot(ones(2), ones(2))
Matrix multiplication using one dimensional vectors is a bit inconsistent – pre-multiplication by the matrix is OK, but post-multiplication gives an error
b = ones(2, 2)
b * ones(2)
# ones(2) * b # throws an error – can't post-multiply a matrix by a one dimensional vector
ones(2, 2) * ones(2, 2) # matrix multiplication
ones(2, 2) .* ones(2, 2) # element by element multiplication
.x means apply operator x elementwise – we saw the . before when talking about broadcasting
Similarly, op.(x) just applies operation op to all elements of argument x
A = -ones(2, 2)
A.^2 # square every element
However in practice some operations are mathematically valid without broadcasting, and hence the . can be omitted
ones(2, 2) + ones(2, 2) # same as ones(2, 2) .+ ones(2, 2)
Scalar multiplication is similar
A = ones(2, 2)
2 * A # same as 2 .* A
In fact you can omit the * altogether and just write 2A
Unlike MATLAB and other languages, scalar addition requires the .+ in order to correctly broadcast
x = [1, 2]
x .+ 1 # not x + 1
x .- 1 # not x - 1
Elementwise comparisons also use the .x style notation
a = [10, 20, 30]
b = [-100, 0, 100]
b .> a
a .== b
We can also do comparisons against scalars with parallel syntax
b
b .> 1
This is particularly useful for conditional extraction – extracting the elements of an array that satisfy a condition
a = randn(4)
a .< 0
a[a .< 0]
The primary function for changing the dimensions of an array is reshape()
a = [10, 20, 30, 40]
b = reshape(a, 2, 2)
b
Notice that this function returns a view on the existing array
This means that changing the data in the new array will modify the data in the old one!
b[1, 1] = 100 # continuing the previous example
b
a
To collapse an array along one dimension you can use dropdims()
a = [1 2 3 4] # two dimensional
# The return value is an array with the specified dimension “flattened”
dropdims(a, dims = 1)
Julia provides standard mathematical functions such as log
, exp
, sin
, etc.
log(1.0)
By default, these functions act elementwise on arrays
log.(1:4)
Note that we can get the same result with a comprehension or a more explicit loop
[ log(x) for x in 1:4 ]
# a comprehension is written in square brackets
# LinSpace is deprecated; use range instead
foo(x,y,z) = sin(x)*0.5y^0.5 + z^2
d = [foo(i,j,k) for i in 1:3, j in range(0.01,stop=0.1,length=5), k in [log(l) for l in 2:7]];
# generator expressions work in a similar way
# just leave the square brackets away
sum(1/n^2 for n=1:1000) # this allocates no temp array: very efficient!
# can have indices depend on each other
[(i,j) for i=1:3 for j=1:i]
# you can even condition on the indices
[(i,j) for i=1:3 for j=1:i if i+j == 4]
(See linear algebra documentation)
Julia provides a great deal of additional functionality related to linear operations
A = [1 2; 3 4]
det(A)
tr(A)
eigvals(A)
rank(A)
As with many other types, a Range can act as a vector
a = 10:12 # a range, equivalent to 10:1:12
@show Vector(a) # can convert, but shouldn't
b = Diagonal([1.0, 2.0, 3.0])
b * a .- [1.0; 2.0; 3.0]
a = 0.0:0.1:1.0 # 0.0, 0.1, 0.2, ..., 1.0
But care should be taken if the terminal node is not a multiple of the step size
maxval = 1.0
minval = 0.0
stepsize = 0.15
a = minval:stepsize:maxval # 0.0, 0.15, 0.3, ...
maximum(a) == maxval
To evenly space points where the maximum value is important, i.e., linspace in other languages
maxval = 1.0
minval = 0.0
numpoints = 10
a = range(minval, maxval, length=numpoints)
# or range(minval, stop=maxval, length=numpoints)
maximum(a) == maxval
To use the range(minval, maxval, length=numpoints) notation without the stop keyword, you will need Julia v1.1 or later, or using Compat in the header, as we do above
(See tuples and named tuples documentation)
We were introduced to tuples earlier, which provide high-performance immutable sets of distinct types
t = (1.0, "test")
t[1] # access by index
a, b = t # unpack
# t[1] = 3.0 # would fail as tuples are immutable
println("a = $a and b = $b")
As well as named tuples, which extend tuples with names for each argument
t = (val1 = 1.0, val2 = "test")
t.val1 # access by name
# a, b = t # bad style, better to unpack by name with @unpack
println("val1 = $(t.val1) and val2 = $(t.val2)") # access by name
While immutable, it is possible to manipulate tuples and generate new ones
t2 = (val3 = 4, val4 = "test!!")
t3 = merge(t, t2) # new tuple
Named tuples are a convenient and high-performance way to manage and unpack sets of parameters
function f(parameters)
α, β = parameters.α, parameters.β # poor style
# poor because we'll make errors once
# we add more parameters!
return α + β
end
parameters = (α = 0.1, β = 0.2)
f(parameters)
This functionality is aided by the Parameters.jl package and the @unpack macro
using Parameters
function f(parameters)
@unpack α, β = parameters # good style, less sensitive to errors
return α + β
end
parameters = (α = 0.1, β = 0.2)
f(parameters)
In order to manage default values, use the @with_kw macro
using Parameters
paramgen = @with_kw (α = 0.1, β = 0.2) # create named tuples with defaults
# creates named tuples, replacing defaults
@show paramgen() # calling without arguments gives all defaults
@show paramgen(α = 0.2)
@show paramgen(α = 0.2, β = 0.5);
An alternative approach, defining a new type using struct, tends to be more prone to accidental misuse and leads to a great deal of boilerplate code
For that, and other reasons of generality, we will use named tuples for collections of parameters where possible
Sometimes a variable, return type from a function, or value in an array needs to represent the absence of a value rather than a particular value
There are two distinct use cases for this
nothing (“software engineer’s null”): used where no value makes sense in a particular context due to a failure in the code, a function parameter not passed in, etc.
missing (“data scientist’s null”): used when a value would make conceptual sense, but it isn’t available
The value nothing is a single value of type Nothing
typeof(nothing)
An example of a reasonable use of nothing is if you need to have a variable defined in an outer scope, which may or may not be set in an inner one
function f(y)
x = nothing
if y > 0.0
# calculations to set `x`
x = y
end
# later, can check `x`
if x === nothing
println("x was not set")
else
println("x = $x")
end
x
end
@show f(1.0)
@show f(-1.0);
While in general you want to keep a variable name bound to a single type in Julia, this is a notable exception
Similarly, if needed, you can return a nothing from a function to indicate that it did not calculate as expected
function f(x)
if x > 0.0
return sqrt(x)
else
return nothing
end
end
x1 = 1.0
x2 = -1.0
y1 = f(x1)
y2 = f(x2)
# check results with === nothing
if y1 === nothing
println("f($x1) failed")
else
println("f($x1) successful")
end
As an aside, an equivalent way to write the above function is to use the ternary operator, which gives a compact if/then/else structure
function f(x)
x > 0.0 ? sqrt(x) : nothing # the "a ? b : c" pattern is the ternary
end
f(1.0)
We will sometimes use this form when it makes the code more clear (and it will occasionally make the code higher performance)
Regardless of how f(x) is written, the return type is an example of a union, where the result could be one of an explicit set of types
In this particular case, the compiler would deduce that the type would be a Union{Nothing,Float64} – that is, it returns either a floating point or a nothing
You will see this type directly if you use an array containing both types
x = [1.0, nothing]
When considering error handling, whether you want a function to return nothing or simply fail depends on whether the code calling f(x) is carefully checking the results
For example, if you were calling on an array of parameters where a priori you were not sure which ones will succeed, then
x = [0.1, -1.0, 2.0, -2.0]
y = f.(x)
# presumably check `y`
On the other hand, if the parameter passed is invalid and you would prefer not to handle a graceful failure, then using an assertion is more appropriate
function f(x)
@assert x > 0.0
sqrt(x)
end
f(1.0)
Finally, nothing is a good way to indicate an optional parameter in a function
function f(x; z = nothing)
if(z === nothing)
println("No z given with $x")
else
println("z = $z given with $x")
end
end
f(1.0)
f(1.0, z=3.0)
An alternative to nothing, which can be useful and sometimes higher performance, is to use NaN to signal that a value is invalid when returning from a function
function f(x)
if x > 0.0
return x
else
return NaN
end
end
f(0.1)
f(-1.0)
@show typeof(f(-1.0))
@show f(-1.0) == NaN # note, this fails!
@show isnan(f(-1.0)) # check with this
Note that in this case, the return type is Float64 for Float64 input, regardless of whether the value is valid
Keep in mind, though, that this only works if the return type of a function is Float64
(See exceptions documentation)
While returning a nothing can be a good way to deal with functions which may or may not return values, a more robust error handling method is to use exceptions
Unless you are writing a package, you will rarely want to define and throw your own exceptions, but will need to deal with them from other libraries
The key distinction for when to use an exception vs. return a nothing is whether an error is unexpected rather than a normal path of execution
An example of an exception is a DomainError, which signifies that a value passed to a function is invalid
# throws exception, turned off to prevent breaking notebook
# sqrt(-1.0)
# to see the error
try sqrt(-1.0); catch err; err end # catches the exception and prints it
Another example you will see is when the compiler cannot convert between types
# throws exception, turned off to prevent breaking notebook
# convert(Int64, 3.12)
# to see the error
try convert(Int64, 3.12); catch err; err end # catches the exception and prints it.
If these exceptions are generated from unexpected cases in your code, it may be appropriate to simply let them occur and ensure you can read the error
Occasionally you will want to catch these errors and try to recover, as we did above in the try block
function f(x)
try
sqrt(x)
catch err # enters if exception thrown
sqrt(complex(x, 0)) # convert to complex number
end
end
f(0.0)
f(-1.0)
(see “missing” documentation)
The value missing of type Missing is used to represent missing values in a statistical sense
For example, if you loaded data from a panel, and gaps existed
x = [3.0, missing, 5.0, missing, missing]
A key feature of missing is that it propagates through other function calls - unlike nothing
f(x) = x^2
@show missing + 1.0
@show missing * 2
@show missing * "test"
@show f(missing); # even user-defined functions
@show mean(x);
The purpose of this is to ensure that missing data does not silently produce meaningless numerical results
This even applies to the comparison of values
x = missing
@show x == missing
@show x === missing # === is an exception to the propagation
@show ismissing(x);
Here, ismissing is the canonical way to test the value
In the case where you would like to calculate a value without the missing values, you can use skipmissing
x = [1.0, missing, 2.0, missing, missing, 5.0]
@show mean(x)
@show mean(skipmissing(x))
@show coalesce.(x, 0.0); # replace missing with 0.0;
As missing is similar to R’s NA type, we will see more of missing when we cover DataFrames
This exercise uses matrix operations that arise in certain problems, including when dealing with linear stochastic difference equations
If you aren’t familiar with all the terminology don’t be concerned – you can skim read the background discussion and focus purely on the matrix exercise
With that said, consider the stochastic difference equation
$$ X_{t+1} = A X_t + b + \Sigma W_{t+1} \tag{1} $$
Here $ X_t $, $ b $, and $ W_t $ are $ n \times 1 $ vectors, $ A $ and $ \Sigma $ are $ n \times n $ matrices, and $ \{W_t\} $ is an iid shock process with mean zero and identity variance-covariance matrix
Let $ S_t $ denote the $ n \times n $ variance-covariance matrix of $ X_t $
Using the rules for computing variances in matrix expressions, it can be shown from (1) that $ \{S_t\} $ obeys
$$ S_{t+1} = A S_t A' + \Sigma \Sigma' \tag{2} $$
It can be shown that, provided all eigenvalues of $ A $ lie within the unit circle, the sequence $ \{S_t\} $ converges to a unique limit $ S $
This is the unconditional variance or asymptotic variance of the stochastic difference equation
As an exercise, try writing a simple function that solves for the limit $ S $ by iterating on (2) given $ A $ and $ \Sigma $
To test your solution, observe that the limit $ S $ is a solution to the matrix equation
$$ S = A S A' + Q \quad \text{where} \quad Q := \Sigma \Sigma' \tag{3} $$
This kind of equation is known as a discrete time Lyapunov equation
The QuantEcon package provides a function called solve_discrete_lyapunov that implements a fast “doubling” algorithm to solve this equation
Test your iterative method against solve_discrete_lyapunov using matrices
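One possible sketch of the iteration in (2) – the stopping tolerance, iteration cap, and test matrices below are our own illustrative choices, not taken from the lecture:

```julia
using LinearAlgebra

# iterate S_{t+1} = A S_t A' + Σ Σ' until the update is smaller than `tol`
function compute_asymptotic_var(A, Sigma; tol = 1e-10, maxiter = 50_000)
    Q = Sigma * Sigma'
    S = Q  # starting guess
    for _ in 1:maxiter
        S_new = A * S * A' + Q
        norm(S_new - S) < tol && return S_new
        S = S_new
    end
    error("iteration did not converge")
end

# illustrative matrices (eigenvalues of A lie inside the unit circle)
A = [0.8 -0.2; -0.1 0.7]
Sigma = [0.5 0.4; 0.4 0.6]
S = compute_asymptotic_var(A, Sigma)
# S should satisfy the Lyapunov equation S = A S A' + Σ Σ'
```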
Take a stochastic process for $ \{y_t\}_{t=0}^T $
$$ y_{t+1} = \gamma + \theta y_t + \sigma w_{t+1} $$
where $ w_{t+1} $ is distributed Normal(0,1)
Given these parameters, simulate the process
Hint: add alpha to a plot to make it transparent (e.g. histogram(vals, alpha = 0.5)) or use stephist(vals) to show just the step function for the histogram
Later, we will interpret some of these in this lecture
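A possible simulation sketch; the parameter values and length T below are placeholders, since the lecture's parameter list was not preserved here:

```julia
using Random

# simulate y_{t+1} = γ + θ y_t + σ w_{t+1}, with w ~ Normal(0, 1)
function simulate_ar1(γ, θ, σ, y0, T)
    y = zeros(T + 1)
    y[1] = y0
    for t in 1:T
        y[t + 1] = γ + θ * y[t] + σ * randn()
    end
    return y
end

Random.seed!(42)  # for reproducibility
vals = simulate_ar1(1.0, 0.8, 1.0, 0.0, 150)  # placeholder parameter values
# histogram(vals, alpha = 0.5)  # or stephist(vals), as suggested above
```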
Let the data generating process for a variable be
$$ y = a x_1 + b x_1^2 + c x_2 + d + \sigma w $$where $ y, x_1, x_2 $ are scalar observables, $ a,b,c,d $ are parameters to estimate, and $ w $ are iid normal with mean 0 and variance 1
First, let’s simulate data we can use to estimate the parameters
Then, simulate with different $ w $: draw w for the N values and then y from this simulated data if the parameters were $ a = 0.1, b = 0.2, c = 0.5, d = 1.0, \sigma = 0.1 $, repeating for M = 20 different simulations of the y for the N values
Finally, calculate ordinary least squares manually (i.e., put the observables into matrices and vectors, and directly use the equations for OLS rather than a package): using the M = 20 simulations, calculate the OLS estimates for $ a, b, c, d, \sigma $
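One possible sketch of the simulation and the manual OLS step. N = 50 is our own assumption (the original value was not preserved); the remaining parameters follow those stated above:

```julia
using LinearAlgebra, Statistics, Random

Random.seed!(1234)
N, M = 50, 20                        # N = 50 assumed; M = 20 as stated above
a, b, c, d, σ = 0.1, 0.2, 0.5, 1.0, 0.1

x1, x2 = randn(N), randn(N)
X = [x1 x1.^2 x2 ones(N)]            # observables, one column per parameter

# M simulations of y, each with a fresh draw of w
ys = [X * [a, b, c, d] + σ * randn(N) for _ in 1:M]

# manual OLS for each simulation: solve X β = y in the least squares sense,
# using the numerically stable `\` rather than inv(X'X) * X'y
beta_hats = [X \ y for y in ys]
beta_mean = mean(beta_hats)          # average estimates across the M simulations
# beta_mean should be close to [a, b, c, d]
```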