User:BenFrantzDale/Linear Algebra and Functional Analysis

This is a draft of some ideas I may scatter in appropriate places around Wikipedia or may just blog about.

Abstract

I think the college math curriculum for scientists and engineers could really be improved. The usual college math curriculum begins with multivariable calculus, linear algebra, and differential equations; from there, curricula go off in their own directions. Since college, I have become deeply familiar with linear algebra and functional analysis. With these tools, topics including differential equations, statistics, signal processing, control systems, computer vision, Fourier transforms, and many more have become much clearer.

Vectors

A vector is a mathematical construct that is generally introduced to quantify position and velocity in space. For example, a baseball's velocity at a particular time could be described by its components in the x, y, and z directions. This is a very useful application of vectors, but vectors can be much more than this and are indispensable in higher mathematics.

A vector space (aka a "linear space") is a collection of objects (called vectors) that, informally speaking, may be scaled and added. For example, if you throw a ball with velocity $\mathbf{u}$ from a car moving at velocity $\mathbf{v}$, the ball's velocity with respect to the ground is $\mathbf{u} + \mathbf{v}$; if it were moving twice as fast with respect to the ground, it would have a velocity of $2(\mathbf{u} + \mathbf{v})$. While it is common to write vectors with respect to a particular coordinate system (a "basis"), this is not required, and doing so belies the simplicity of a vector. A velocity vector simply means "that way, that fast".
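
To make this concrete, here is a minimal sketch in Python (with NumPy, and made-up numbers) of the two vector-space operations, adding and scaling:

```python
import numpy as np

u = np.array([10.0, 0.0, 3.0])  # ball's velocity relative to the car (made-up numbers)
v = np.array([25.0, 0.0, 0.0])  # car's velocity relative to the ground

w = u + v     # adding vectors: the ball's velocity relative to the ground
print(w)      # [35.  0.  3.]

print(2 * w)  # scaling a vector: twice as fast, in the same direction
```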

We can do other useful things with vectors: we can project one vector onto another, and we can measure the length of a vector. For example, if a baseball is flying through the air with velocity $\mathbf{v}$, we might want to know how fast it is moving across the ground. Given a vector, $\mathbf{g}$, pointing along the ground, there is an operation, $\operatorname{proj}_{\mathbf{g}}(\mathbf{v})$, that tells us how fast the ball is moving in the direction of $\mathbf{g}$. It might also be useful to know how fast the ball is moving overall, that is, its speed. We write this $\|\mathbf{v}\|$.

Implementation details

For practical applications, we need to be able to compute $\operatorname{proj}_{\mathbf{g}}(\mathbf{v})$ and $\|\mathbf{v}\|$. These are critical details, but they require that we pick a representation for our vectors. Keep in mind that vectors can be represented in any number of ways; we will pick one representation that is very useful. We will describe a vector, $\mathbf{v}$, in terms of three orthogonal vectors of length 1: $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$. This is the Cartesian representation:

$\mathbf{v} = v_x \mathbf{x} + v_y \mathbf{y} + v_z \mathbf{z}$.

Since we have operations to project one vector onto another, we can also say

$v_x = \operatorname{proj}_{\mathbf{x}}(\mathbf{v}), \qquad v_y = \operatorname{proj}_{\mathbf{y}}(\mathbf{v}), \qquad v_z = \operatorname{proj}_{\mathbf{z}}(\mathbf{v})$.

Also, since we have scalar multiplication and addition, we can write this as

$\mathbf{v} = \operatorname{proj}_{\mathbf{x}}(\mathbf{v})\,\mathbf{x} + \operatorname{proj}_{\mathbf{y}}(\mathbf{v})\,\mathbf{y} + \operatorname{proj}_{\mathbf{z}}(\mathbf{v})\,\mathbf{z}$.

The length of $\mathbf{v}$ can then be determined by the Pythagorean theorem:

$\|\mathbf{v}\| = \sqrt{v_x^2 + v_y^2 + v_z^2}$.

The projection operator, $\operatorname{proj}_{\mathbf{g}}(\mathbf{v})$, is a bit trickier to implement [diagrams needed]. We will first define another operation called an "inner product", also known as a "dot product". It is written in different ways in different disciplines, but it is easy to compute, and its result means "the amount that two vectors point in the same direction, times the lengths of the two vectors". This seems a bit odd at first, but if we divide that by the length of the second vector, we get "the amount that the first vector points in the same direction as the second, times the length of the first vector", which is the projection of the first vector onto the second. It seems roundabout, but it's the easiest way to compute what we want. The inner product of two vectors in Cartesian coordinates can be written as

$\mathbf{u} \cdot \mathbf{v} = u_x v_x + u_y v_y + u_z v_z$.

With the dot product defined, it is easy to define projection:

$\operatorname{proj}_{\mathbf{g}}(\mathbf{v}) = \frac{\mathbf{v} \cdot \mathbf{g}}{\|\mathbf{g}\|}$.

NEED MORE EXPLANATION AND DIAGRAMS. Note that the inner product provides a concise way to compute the magnitude of a vector:

$\|\mathbf{v}\| = \sqrt{\mathbf{v} \cdot \mathbf{v}}$.
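
As a sanity check on these formulas, here is a minimal Python/NumPy sketch of the dot product, the length, and the projection operator in Cartesian coordinates (the function names are mine, chosen to match the notation above):

```python
import numpy as np

def dot(u, v):
    """Inner product in Cartesian coordinates: u_x*v_x + u_y*v_y + u_z*v_z."""
    return float(np.sum(u * v))

def length(v):
    """Magnitude of v, computed as sqrt(v . v)."""
    return dot(v, v) ** 0.5

def proj(v, g):
    """Scalar projection of v onto g: (v . g) / ||g||."""
    return dot(v, g) / length(g)

v = np.array([3.0, 4.0, 12.0])
g = np.array([1.0, 0.0, 0.0])  # a vector along the ground, here the x direction

print(length(v))   # 13.0, matching the Pythagorean theorem: sqrt(9 + 16 + 144)
print(proj(v, g))  # 3.0, how fast the ball is moving in the direction of g
```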

Keep in mind that these details are just that: details. The important thing is that we have ways to compute the length of a vector and to project one vector onto another.


Core ideas [to explain]

  • Vectors, linear transformations, and tensors are first-class mathematical objects that exist free of chosen basis.
    • This means that reasoning about linear operations in any number of dimensions should be independent of the chosen basis.
    • This means that the exact numbers in a matrix aren't particularly meaningful as they depend on the basis.
    • This means that operations such as determinant, trace, singular value decomposition, and others all have geometric meaning (e.g., the determinant is the n-dimensional volume scale factor); the first sketch after this list demonstrates this basis-independence numerically.
    • This means a tensor, such as a stress tensor, is a first-class object just like a vector. A stress tensor doesn't just tell you the stress components in the x, y, z, xy, xz, and yz directions; it describes the stress state itself, independent of any basis.
    • This also relates to information hiding in software engineering.
  • Functions are vectors. Look at the definition of a vector space; note that functions have all the usual vector operations.
  • Integral transforms are really just infinite-dimensional forms of a change of basis.
  • There is no number that is the square root of negative one. When you see i, it's really just the two-by-two matrix that performs a ninety-degree counterclockwise rotation (demonstrated in the second sketch after this list).
  • Linearity is almost always an approximation (e.g., assuming infinitesimal deformation), but it is tremendously useful. That's why we do it.
  • Sine and cosine have the property that linear combinations of the two correspond to shifting the functions. That is, you can always solve $a \sin(x) + b \cos(x) = R \sin(x + \varphi)$ for $R$ and $\varphi$ (also verified in the second sketch after this list).
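
Here is a minimal NumPy sketch of the basis-independence point: rewriting a linear map $A$ in another basis as $B^{-1} A B$ changes every entry of the matrix, but not its determinant or trace. (The matrices here are random, generic examples of my choosing.)

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))  # a linear map, written in some basis
B = rng.standard_normal((3, 3))  # a change-of-basis matrix (generic, so invertible)

A_new = np.linalg.inv(B) @ A @ B  # the same map, written in the new basis

# The individual entries of A and A_new differ, but the geometry does not:
print(np.isclose(np.linalg.det(A), np.linalg.det(A_new)))  # True: volume scale factor
print(np.isclose(np.trace(A), np.trace(A_new)))            # True
```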
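
And a companion sketch of the last two points: the rotation matrix squares to $-I$ just as $i^2 = -1$, and a linear combination of sine and cosine is a shifted (and rescaled) sine. The particular values of a and b are arbitrary examples.

```python
import numpy as np

# "i" as the 2-by-2 matrix for a ninety-degree counterclockwise rotation:
J = np.array([[0.0, -1.0],
              [1.0,  0.0]])
print(np.allclose(J @ J, -np.eye(2)))  # True: J^2 = -I, just like i^2 = -1

# a*sin(x) + b*cos(x) = R*sin(x + phi), with R = hypot(a, b), phi = atan2(b, a):
a, b = 2.0, -1.5
R, phi = np.hypot(a, b), np.arctan2(b, a)
x = np.linspace(0.0, 2.0 * np.pi, 100)
print(np.allclose(a * np.sin(x) + b * np.cos(x), R * np.sin(x + phi)))  # True
```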

Results

The above observations make a lot of things simple that seemed nonsensical to me when I first saw them.

  • Linear differential equations are solved by taking a linear combination of eigenfunctions. This is what you are doing when you assume your solution is made of sines and cosines and solve for the coefficients; it seemed arbitrary when I first saw it. The second sketch after this list shows a finite-dimensional version.
  • The Fourier transform is tremendously useful for signal processing because (a) in a sinusoidal basis, differentiation is trivial, and (b) by the convolution theorem, convolution in a sinusoidal basis becomes pointwise multiplication (the equivalent of a diagonal matrix); the first sketch after this list verifies this numerically.
  • Solving differential equations is really a task of finding solutions to linear systems that just happen to be infinite-dimensional. You can approximate the functions through a variety of methods, such as finite element analysis.
  • Because functions are vectors, nonlinear function optimization makes sense geometrically as hill climbing in infinite dimensions: you just pick a direction to move and a distance to go, and repeat.
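
To make the Fourier point concrete, here is a minimal NumPy sketch (with random data of my choosing) verifying that circular convolution is pointwise multiplication in the Fourier basis:

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.standard_normal(64)
g = rng.standard_normal(64)
n = len(f)

# Circular convolution, computed directly from the definition...
conv = np.array([sum(f[k] * g[(m - k) % n] for k in range(n)) for m in range(n)])

# ...and via the FFT, where it is just pointwise multiplication:
conv_fft = np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)).real

print(np.allclose(conv, conv_fft))  # True, by the convolution theorem
```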
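
And a sketch of the eigenfunction point: discretize the second derivative (with fixed endpoints) as a matrix, and its eigenvectors come out as sampled sine waves, the familiar eigenfunctions of linear differential equations. The grid size is arbitrary.

```python
import numpy as np

n = 100
# Second-difference matrix: a finite-dimensional stand-in for d^2/dx^2
# with zero boundary conditions (up to an overall 1/h^2 scale factor).
D2 = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)

eigenvalues, V = np.linalg.eigh(D2)  # eigh sorts eigenvalues in ascending order

# The least-oscillatory mode (largest eigenvalue) should be one hump of a sine:
k = np.arange(1, n + 1)
sine_mode = np.sin(np.pi * k / (n + 1))
sine_mode /= np.linalg.norm(sine_mode)

v = V[:, -1]  # eigenvector for the largest eigenvalue
print(np.isclose(abs(v @ sine_mode), 1.0))  # True: the same vector, up to sign
```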