Linear Algebra Primer

Simon Dennis and Rachael Gibson

Table of Contents

  • Introduction
  • Tensors Explained
  • Vectors - Rank One Tensors
  • Matrices - Rank Two Tensors
  • Tensors of Rank Three and Above
  • Objective Checklist
  • References

    Introduction

    Tensors Explained

    Tensors are convenient ways of collecting together numbers. For instance, a vector, which is also known as a rank one tensor, could be used to describe the age, gender, and salary of an employee. If Jeff is 50 years old, is male (where male = 1 and female = 0) and earns $56000 per annum, then we could describe Jeff with the vector [50, 1, 56000] (see figure 1). Note that vectors (and tensors in general) are ordered. The vector [56000, 1, 50] would describe someone who was 56000 years old who made a total of $50 per annum!
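
    If you would like to play with this idea concretely, here is a minimal sketch in Python (the variable names are ours, not part of the text):

      # Jeff as a rank one tensor: [age, gender, salary], with male = 1 and female = 0
      jeff = [50, 1, 56000]

      # Order matters: swapping components describes a 56000 year old earning $50!
      not_jeff = [56000, 1, 50]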

    Figure 1: Examples of vectors. (a) a row vector describing Jeff, (b) a column vector, (c) a vector with N components.

    The rank one tensor described above has a dimension of three because it contains three components. There is no reason that vectors need be restricted to three dimensions, however. We could have added shoe size, for instance, to increase the dimension to four. Similarly, there is no reason that we need to restrict ourselves to a single row of numbers. A tensor with N rows and M columns is known as an NxM matrix and has a rank of two, indicating that the array of numbers extends in two directions (see figure 2).

    Figure 2: Examples of matrices. (a) a 2x2 matrix, (b) a 3x2 matrix, (c) an NxN matrix.

    The process of extending the number of directions in which the array extends can theoretically continue indefinitely, creating tensors of rank three, four, five etc. In the following sections, we will look at vectors, matrices and tensors of rank three (see figure 3) as they are critical to understanding the Matrix Model. Other models, such as the STAR model of analogical reasoning (Halford, Wiles, Humphreys and Wilson 1992), employ tensors of higher rank.

    Figure 3: A rank three tensor (NxNxN).

    Vectors - Rank One Tensors

    Tensors, and in particular vectors, can be represented in many different forms including:
    1. Cartesian form in which the components are enumerated explicitly. Figure 1 depicts vectors represented in Cartesian form.
    2. Geometric form in which the vector is plotted in N dimensional space. For instance, figure 4 shows the vector representing Jeff plotted in three dimensional space.
      Figure 4: The vector representing Jeff plotted in three dimensional space (Geometric form).
    3. Algebraic form in which a vector is represented as a bolded lower case letter (e.g. v). Algebraic form is a particularly concise form of representation, which makes it easy to talk about the operations that can be performed on vectors such as addition (e.g. w = v + t).
    4. Neural network form, in which a neural network architecture is diagrammed such that either a set of units or a set of weights contains the elements of the vector. For instance, a vector can be mapped to a two layer network (one input and one output layer) as depicted in Figure 5. The number of units in the input layer corresponds to the number of dimensions in the original vector, while the output layer contains only one unit. Each input unit is connected to each output unit. The input units represent one vector and the weights represent a second vector.

      Figure 5: The network corresponding to a vector memory.

      The output of this network is defined to be the dot product (or inner product) of the input and weight vectors. A dot product is calculated by multiplying together the values in the same position within the two vectors and then adding the results of these multiplications together to give a scalar (see Figure 6a). In the case of the neural network, this involves multiplying each input unit activation by the corresponding weight value and then adding. The dot product of two vectors represents the level of similarity between them, and it can be extended to higher rank tensors (see Figure 6b).

      Figure 6: The Dot Product.

      The dot product is expressed algebraically as a dot, that is, the dot product of the vectors v and w is written v.w.
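
      As an illustration, the dot product takes only a few lines of Python. This is a sketch of the calculation described above (the helper name is ours), not part of the BrainWave simulator:

        def dot(v, w):
            """Dot product: multiply the values in matching positions, then sum."""
            assert len(v) == len(w), "vectors must have the same dimension"
            return sum(vi * wi for vi, wi in zip(v, w))

        # Output of the vector memory network: input activations . weights
        print(dot([0.95, 0.32], [0.49, 0.87]))   # similarity of FROG and TOAD ~= 0.744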

      Learning occurs in this network by adding the input vectors into the weight vector. Vector addition superimposes vectors of the same dimension. It is calculated by adding together the elements in matching positions in each vector (see Figure 7a). In this way, multiple memories can be stored within the same vector. [Note: the network actually employs Hebbian learning (see Neural Networks by Example: Chapter three). However, when the output unit is fixed at one, Hebbian learning is identical to vector addition.]

      Figure 7: (a) Vector Addition, (b) Matrix Addition.

      Again, vector addition can be extended to tensors of arbitrary rank (see figure 7b). Vector addition is expressed algebraically with a plus sign (+). So if we wanted to talk about the dot product of v with the sum of w and x, we would write v.(w + x). Another useful property to keep in mind is that the dot product distributes over addition. That is:

      v.(w + x) = v.w + v.x
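
      The same goes for vector addition and the distributive property. In this sketch (again, our own helper names), two items are superimposed in a single memory vector and the identity above is checked numerically:

        dot = lambda v, w: sum(a * b for a, b in zip(v, w))   # as sketched above

        def add(v, w):
            """Vector addition: add the elements in matching positions."""
            return [vi + wi for vi, wi in zip(v, w)]

        frog, toad = [0.95, 0.32], [0.49, 0.87]
        memory = add(frog, toad)            # two items stored as one composite vector

        # The dot product distributes over addition:
        lhs = dot(frog, memory)             # frog . (frog + toad)
        rhs = dot(frog, frog) + dot(frog, toad)
        print(lhs, rhs)                     # both ~= 1.749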

    In the following exercises, you will build a vector network that learns to discriminate between stored items and new items (see figure 8).

    Figure 8: A vector memory network.

    Follow the instructions below to create the network and then work through the exercises.

    1. Load BrainWave
    2. Select 'New Hebbian Network'
    3. Set up the vector memory network:
      • Create two input units and one output unit.
      • Connect both input units to the output unit.
      • Set up VALUE objects for the units and weights.
    4. Create the data sets:
      • Create an Input Set containing the two input units
      • Create a Test Set containing the two input units
      • Create an Output Set containing the output unit
      • The items in this exercise are FROG [0.95, 0.32], TOAD [0.49, 0.87], KOALA [0.32, -0.95]. Add the FROG pattern to the input set.
      • Add a pattern containing a 1 to the output set.
      • Add all three items, FROG, TOAD and KOALA, to the test set.
Exercise 1: Train the network for one epoch and record the weights in the TRAIN FROG row of the following table. How have the weights changed?
Weight 1 Weight 2
TRAIN FROG
TRAIN FROG & KOALA
TRAIN FROG & TOAD

Exercise 2: Test each of the items, FROG, TOAD and KOALA, and record the match values (the activation of the output unit) in the second table. Explain the match values.

TEST FROG TEST TOAD TEST KOALA
TRAIN FROG
TRAIN FROG & KOALA
TRAIN FROG & TOAD

Exercise 3: Train the network for one more epoch and test again. What happens to the match values after a second training trial? Why?

Exercise 4: Add KOALA to the input set and an output value of 1 to the output set. Zero the weights (using the ACTIONS menu) and retrain the network on the updated input set. Test the network as before, recording the values in the table in the TRAIN FROG & KOALA row.

Exercise 5: Delete KOALA from the input set and add TOAD. Zero the weights, retrain and test as above, recording the values in the TRAIN FROG & TOAD row. You should have six weight values and nine match values for each training trial. Create a graph of the match values after the first training trial: plot three lines, one for each test item, against the three training conditions. Explain the shape of each line on the graph.

Exercise 6: For each of the three training conditions (FROG alone, FROG & KOALA, FROG & TOAD):

  1. Draw the geometric (graphical) representation of the weights,
  2. Provide the algebraic representation of the weights.


Matrices - Rank Two Tensors

The vector memory discussed above was capable of storing items so that it could later be determined whether they had been presented. A matrix memory allows two items to be associated - so that given one we can retrieve the other. Algebraically, a matrix is usually represented as a bolded upper case letter (e.g. M).

Associations are formed using the outer product operation. An outer product between two vectors is calculated by multiplying each element in one vector by each element in the other vector (see Figure 9). If the first vector has dimension d1 and the second vector dimension d2, the outer product matrix has dimension d1xd2. For instance, the outer product of a three dimensional vector and a two dimensional vector has dimension 3x2.

Figure 9: The outer product.

The outer product operation is expressed algebraically by placing the vectors to be multiplied next to each other. So the outer product of v and w is written as v w.
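
Here is a minimal sketch of the outer product in Python (again our own naming, not BrainWave code):

    def outer(v, w):
        """Outer product: multiply each element of v by each element of w.
        A d1 dimensional v and a d2 dimensional w give a d1 x d2 matrix."""
        return [[vi * wj for wj in w] for vi in v]

    # A three dimensional vector with a two dimensional vector gives a 3x2 matrix
    print(outer([1, 2, 3], [4, 5]))   # [[4, 5], [8, 10], [12, 15]]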

These association matrices are then added into the memory matrix (as in the vector memory case) - so that all associations are stored as a single composite. A matrix memory maps to a two layer network (one input and one output layer) as depicted in Figure 10. The number of input units corresponds to the number of rows in the original matrix, while the number of output units corresponds to the number of columns. Each input unit is connected to each output unit.

Figure 10: The network representation of a matrix.
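
Putting the pieces together, the sketch below stores two associations as a single composite matrix and retrieves a target by premultiplying the memory by a cue. The helper names are ours; the numbers are the FROG/KOALA cues and FLIES/LEAVES targets used in the exercises below:

    dot = lambda v, w: sum(a * b for a, b in zip(v, w))
    outer = lambda v, w: [[vi * wj for wj in w] for vi in v]

    def matrix_add(A, B):
        """Matrix addition: add the elements in matching positions."""
        return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

    def retrieve(cue, M):
        """Cued recall: output_j = sum_i cue_i * M[i][j]."""
        return [dot(cue, [row[j] for row in M]) for j in range(len(M[0]))]

    frog   = [0.5, -0.5, 0.5, -0.5]
    koala  = [0.5, 0.5, -0.5, -0.5]
    flies  = [0.7, 0.5, 0.5]
    leaves = [0.7, -0.5, -0.5]

    # Store FROG-FLIES and KOALA-LEAVES in one composite memory
    M = matrix_add(outer(frog, flies), outer(koala, leaves))

    # Cueing with FROG gives (frog.frog)*flies + (frog.koala)*leaves = 1*flies + 0*leaves
    print(retrieve(frog, M))   # ~= [0.7, 0.5, 0.5], i.e. FLIES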

In the following exercise you will use a matrix memory network to store and recall pairs of items.

Exercise 7: Load the simulator, BrainWave. From the NETWORKS menu - select Matrix Model (1). What rank tensor does this network implement? What are its dimensions?

Exercise 8: The items in this exercise are:

Cues:
FROG [0.5, -0.5, 0.5, -0.5]
KOALA [0.5, 0.5, -0.5, -0.5]
SNAIL [-0.5, 0.5, 0.5, -0.5]
TOAD [0.5, 0.4, 0.6, 0.45]
Targets:
FLIES [0.7, 0.5, 0.5]
LEAVES [0.7, -0.5, -0.5]
LETTUCE [0, -0.7, 0.7]
The input set contains the items FROG, KOALA and SNAIL, paired with items in the output set FLIES, LEAVES and LETTUCE, respectively. Another input item, TOAD [0.5, 0.4, 0.6, 0.45], can be used to test the network on unfamiliar input. Calculate the similarity value (i.e. the dot product) of the items FROG, KOALA, SNAIL and TOAD with themselves and each other, and record the values in the table below (a short checking sketch follows the table):
FROG KOALA SNAIL TOAD
FROG
KOALA
SNAIL
TOAD
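
If you would like to check your hand calculations, a few lines of Python will print the whole table (the dot helper sketched earlier is restated here as a one-liner):

    dot = lambda v, w: sum(a * b for a, b in zip(v, w))

    cues = {
        "FROG":  [0.5, -0.5, 0.5, -0.5],
        "KOALA": [0.5, 0.5, -0.5, -0.5],
        "SNAIL": [-0.5, 0.5, 0.5, -0.5],
        "TOAD":  [0.5, 0.4, 0.6, 0.45],
    }
    for a in cues:
        # One row of the similarity table: a against FROG, KOALA, SNAIL, TOAD
        print(a, [round(dot(cues[a], cues[b]), 3) for b in cues])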

Exercise 9: Train the network for one epoch. Test each of the items FROG, KOALA, SNAIL and TOAD. What output is produced in each case? (Give the output pattern and also describe the output patterns in terms of their similarity to FLIES, LEAVES and LETTUCE).

FROG
KOALA
SNAIL
TOAD

Exercise 10: Give the algebraic equation that describes the matrix memory formed from the three pairs of associates:

M =
Exercise 11: Give the equations that describe each of the retrievals in exercise 9. Use the similarity measures from the table above to simplify each equation to a weighted sum of the target patterns.
FROG
KOALA
SNAIL
TOAD

Tensors of Rank Three and Above

The final sort of tensor we need to demonstrate the Matrix Model is the rank three tensor. The rank three tensor allows a three way association to be represented. For instance, we could store the information that John loves Mary - [Loves, John, Mary] - or that Apple appeared with Pencil in List 1 - [List 1, Apple, Pencil].

A tensor of rank three maps to a three layer network (one input layer with two sets of units, one layer of hidden units, and one output layer) as depicted in Figure 11. The numbers of units in the input sets and the output set correspond to the dimensions of the tensor. The number of hidden units equals the number of units in one input set times the number of units in the other input set. Each hidden unit has a connection from one input unit in each input set, with a hidden unit existing for each possible combination. These hidden units are SigmaPi units, whose values are set to the product of the two input units to which they are connected. To implement a rank three tensor, the weights in the first layer are frozen at one. Consequently, a hidden unit's activation will equal the product of the activations of the input units to which it is connected. Each hidden unit is then connected to each output unit.

Figure 11: The network representation of a rank three tensor.
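
The same pattern extends to rank three. Here is a sketch (our own helper names, not BrainWave code) that stores one triple and retrieves its target, mirroring what the SigmaPi hidden units compute:

    def outer3(u, v, w):
        """Rank three outer product: T[i][j][k] = u[i] * v[j] * w[k]."""
        return [[[ui * vj * wk for wk in w] for vj in v] for ui in u]

    def retrieve3(cue, rel, T):
        """Each SigmaPi unit multiplies one cue unit by one relation unit;
        output_k = sum_ij cue_i * rel_j * T[i][j][k]."""
        return [sum(cue[i] * rel[j] * T[i][j][k]
                    for i in range(len(cue)) for j in range(len(rel)))
                for k in range(len(T[0][0]))]

    frog  = [0.5, -0.5, 0.5, -0.5]
    eats  = [0.5, -0.5, -0.5, 0.5]
    flies = [0.7, 0.5, 0.5]

    T = outer3(frog, eats, flies)     # store the triple [FROG, EATS, FLIES]

    # (frog.frog) * (eats.eats) * flies = 1 * 1 * flies
    print(retrieve3(frog, eats, T))   # ~= [0.7, 0.5, 0.5], i.e. FLIES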

In these exercises, you will use both rank two and three tensor networks to store and recall triples of items.

Exercise 12: Load the simulator, BrainWave. From the NETWORKS menu - select Matrix Model (2). What rank tensor does this network implement?

Exercise 13: The items in this exercise are:

Cues:

FROG [0.5, -0.5, 0.5, -0.5]
KOALA [0.5, 0.5, -0.5, -0.5]
SNAIL [-0.5, 0.5, 0.5, -0.5]
TOAD [0.5, 0.4, 0.6, 0.45]
Relations:
EATS [0.5, -0.5, -0.5, 0.5]
LIVES-IN [0.5, 0.5, -0.5, -0.5]
Targets:
FLIES [0.7, 0.5, 0.5]
LEAVES [0.7, -0.5, -0.5]
LETTUCE [0, -0.7, 0.7]
POND [0.89, 0.43, -0.22]
TREE [-0.22, 0.76, 0.62]
SHELL [0.43, -0.5, 0.76]
Notice that the vectors for the cues are the same as those used above. Also notice that EATS and LIVES-IN are orthogonal to each other - that is, they have a dot product of zero.

Calculate the similarity (dot product) table for the targets.

FLIES LEAVES LETTUCE POND TREE SHELL
FLIES
LEAVES
LETTUCE
POND
TREE
SHELL

Exercise 14: The cue+relation input set contains the items FROG-EATS, KOALA-EATS, SNAIL-EATS, FROG-LIVES_IN, KOALA-LIVES_IN and SNAIL-LIVES_IN, paired with items in the output set FLIES, LEAVES, LETTUCE, POND, TREE, and SHELL, respectively. Two other input items, TOAD-EATS and TOAD-LIVES_IN, can be used to test the network's response to unfamiliar input.

Train the network for one epoch. Test each of the items FROG-EATS, KOALA-EATS, SNAIL-EATS, FROG-LIVES_IN, KOALA-LIVES_IN, SNAIL-LIVES_IN, TOAD-EATS and TOAD-LIVES_IN. What output is produced in each case? (Give the output pattern and also describe the output patterns in terms of their similarity to FLIES, LEAVES, LETTUCE, POND, TREE and SHELL)

FROG-EATS

KOALA-EATS

SNAIL-EATS

FROG-LIVES_IN

KOALA-LIVES_IN

SNAIL-LIVES_IN

TOAD-EATS

TOAD-LIVES_IN


Exercise 15: How does the performance of this network compare with the performance of the network in Exercise 9? Why is it not as good?

Exercise 16: Give the algebraic equation that describes the matrix memory formed from the six pairs of associates:

M =

Exercise 17: Give the equations that describe each of the retrievals from exercise 14. Use the similarity measures from the table above to simplify each equation to a weighted sum of the target patterns.

FROG-EATS

KOALA-EATS

SNAIL-EATS

FROG-LIVES_IN

KOALA-LIVES_IN

SNAIL-LIVES_IN

TOAD-EATS

TOAD-LIVES_IN


Exercise 18: From the NETWORKS menu - select Matrix Model (3). What rank tensor does this network implement?



Exercise 19: The inputs and outputs for this network are the same as for the previous one, but the connections and hidden SigmaPi units perform different calculations on the inputs to try to achieve the correct outputs. Train the network for one epoch. Test each of the items FROG-EATS, KOALA-EATS, SNAIL-EATS, FROG-LIVES_IN, KOALA-LIVES_IN, SNAIL-LIVES_IN, TOAD-EATS and TOAD-LIVES_IN.

What output is produced in each case? (Give the output pattern and also describe the output patterns in terms of their similarity to FLIES, LEAVES, LETTUCE, POND, TREE and SHELL).

FROG-EATS

KOALA-EATS

SNAIL-EATS

FROG-LIVES_IN

KOALA-LIVES_IN

SNAIL-LIVES_IN

TOAD-EATS

TOAD-LIVES_IN


Exercise 20: Which of the two networks performs the memory task better? Why?

Exercise 21: Give the algebraic equation that describes the memory formed from the six triples of associates:

M =
Exercise 22: Give the equations that describe each of the cued recall tests from exercise 19. Use the similarity measures from the table above to simplify each equation to a weighted sum of the target patterns.
FROG-EATS

KOALA-EATS

SNAIL-EATS

FROG-LIVES_IN

KOALA-LIVES_IN

SNAIL-LIVES_IN

TOAD-EATS

TOAD-LIVES_IN


In this section, we have been looking at the way in which tensors of rank one, two and three can be used to store information. In the next section, we will examine the Matrix Model, which uses precisely this mechanism to explain the nature of human memory.

Objective Checklist

In this chapter, we have been looking at the Matrix Model of long term memory. The following is a checklist of the skills and knowledge you should acquire while working on this chapter. Go through the list and tick off those things you are confident you can do. For any outstanding item, refer back to the appropriate section or consult your tutor.
  • understand the distributed representation of items and associations
  • calculate the vector memory values when two patterns are superimposed, in terms of:
    • network weights,
    • Cartesian co-ordinates,
    • vector addition.

References