## My 2019 Mathematics A To Z: Norm

Today’s A To Z term is another free choice. So I’m picking a term from the world of … mathematics. There are a lot of norms out there. Many are specialized to particular roles, such as looking at complex-valued numbers, or vectors, or matrices, or polynomials.

Still they share things in common, and that’s what this essay is for. And I’ve brushed up against the topic before.

The norm, also, has nothing particular to do with “normal”. “Normal” is an adjective which attaches to every noun in mathematics. This is security for me as while these A-To-Z sequences may run out of X and Y and W letters, I will never be short of N’s.

## Norm.

A “norm” is the size of whatever kind of thing you’re working with. You can see where this is something we look for. It’s easy to look at two things and wonder which is the smaller.

There are many norms, even for one set of things. Some seem compelling. For the real numbers, we usually let the absolute value do this work. By “usually” I mean “I don’t remember ever seeing a different one except from someone introducing the idea of other norms”. For a complex-valued number, it’s usually the square root of the sum of the square of the real part and the square of the imaginary coefficient. For a vector, it’s usually the square root of the vector dot-product with itself. (Dot product is this binary operation that is like multiplication, if you squint, for vectors.) Again, for these, the “usually” means “always except when someone’s trying to make a point”.
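To make those three “usual” choices concrete, here is a minimal sketch in Python. The function names (`norm_real`, `norm_complex`, `dot`, `norm_vector`) are made up for illustration; nothing here is a standard library API beyond `math.sqrt` and `abs`.

```python
import math

def norm_real(x):
    # The usual norm of a real number: its absolute value.
    return abs(x)

def norm_complex(z):
    # Square root of (real part squared + imaginary coefficient squared).
    return math.sqrt(z.real ** 2 + z.imag ** 2)

def dot(u, v):
    # Dot product: multiply matching components, then add them up.
    return sum(a * b for a, b in zip(u, v))

def norm_vector(v):
    # Square root of the vector's dot product with itself.
    return math.sqrt(dot(v, v))

print(norm_real(-2.5))         # 2.5
print(norm_complex(3 + 4j))    # 5.0
print(norm_vector([1, 2, 2]))  # 3.0
```

Python's built-in `abs` already returns this same value for a complex number; the version above just spells the recipe out.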

Which is why we have the convention that there is a “the norm” for a kind of thing. The norm dignified as “the” is usually the one that looks as much as possible like the way we find distances between two points on a plane. I assume this is because we bring our intuition about everyday geometry to mathematical structures. You know how it is. Given an infinity of possible choices we take the one that seems least difficult.

Every sort of thing which can have a norm, that I can think of, is a vector space. This might be my failing imagination. It may also be that it’s quite easy to have a vector space. A vector space is a collection of things with some rules. Those rules are about adding the things inside the vector space, and multiplying the things in the vector space by scalars. These rules are not difficult requirements to meet. So a lot of mathematical structures are vector spaces, and the things inside them are vectors.

A norm is a function that has these vectors as its domain, and the non-negative real numbers as its range. And there are three rules that it has to meet. So. Give me a vector ‘u’ and a vector ‘v’. I’ll also need a scalar, ‘a’. Then the function f is a norm when:

1. $f(u + v) \le f(u) + f(v)$. This is a famous rule, called the triangle inequality. You know how in a triangle, the sum of the lengths of any two legs is greater than the length of the third leg? That’s the rule at work here.
2. $f(a\cdot u) = |a| \cdot f(u)$. This doesn’t have so snappy a name. Sorry. It’s something about being homogeneous, at least.
3. If $f(u) = 0$ then u has to be the additive identity, the vector that works like zero does.
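A spot check of the three rules can be sketched in a few lines of Python. This is only an illustration on a handful of sample vectors, not a proof, and the helper names (`check_norm_rules`, `squared_length`, and so on) are invented for this example. It compares the Euclidean length with the squared length, which looks like a size but fails two of the rules.

```python
import math

def euclidean(v):
    return math.sqrt(sum(x * x for x in v))

def squared_length(v):
    # Looks like a size, but is not a norm.
    return sum(x * x for x in v)

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def scale(a, u):
    return [a * x for x in u]

def check_norm_rules(f, u, v, a, tol=1e-12):
    # Rule 1: triangle inequality, f(u + v) <= f(u) + f(v).
    triangle = f(add(u, v)) <= f(u) + f(v) + tol
    # Rule 2: homogeneity, f(a * u) == |a| * f(u).
    homogeneous = abs(f(scale(a, u)) - abs(a) * f(u)) <= tol
    # Rule 3, checked on the samples: a nonzero vector never gets size zero.
    zero_only = all(f(w) > 0 for w in (u, v) if any(x != 0 for x in w))
    return triangle, homogeneous, zero_only

u, v, a = [3, 4], [1, 0], -2
print(check_norm_rules(euclidean, u, v, a))       # (True, True, True)
print(check_norm_rules(squared_length, u, v, a))  # (False, False, True)
```

Passing on one pair of vectors proves nothing in general, but a single failure is enough to rule a candidate out.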

Norms take on many shapes. They depend on the kind of thing we measure, and what we find interesting about those things. Some are familiar. Look at a Euclidean space, with Cartesian coordinates, so that we might write something like (3, 4) to describe a point. The “the norm” for this, called the Euclidean norm or the L2 norm, is the square root of the sum of the squares of the coordinates. So, 5. But there are other norms. The L1 norm is the sum of the absolute values of all the coordinates; here, 7. The L∞ norm is the largest single absolute value of any coordinate; here, 4.
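The three numbers for the point (3, 4) are quick to confirm. A small Python sketch, with function names chosen just for this example:

```python
import math

def l1_norm(v):
    # Sum of the absolute values of the coordinates.
    return sum(abs(x) for x in v)

def l2_norm(v):
    # Square root of the sum of the squares of the coordinates.
    return math.sqrt(sum(x * x for x in v))

def linf_norm(v):
    # Largest single absolute value among the coordinates.
    return max(abs(x) for x in v)

point = (3, 4)
print(l1_norm(point))    # 7
print(l2_norm(point))    # 5.0
print(linf_norm(point))  # 4
```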

A polynomial, meanwhile? Write it out as $a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_n x^n$. Take the absolute value of each of these $a_k$ terms. Then … you have choices. You could take those absolute values and add them up. That’s the L1 polynomial norm. Take those absolute values and square them, then add those squares, and take the square root of that sum. That’s the L2 norm. Take the largest absolute value of any of these coefficients. That’s the L∞ norm.
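As a rough sketch, assume we store a polynomial as a plain list of its coefficients, lowest power first (that representation and the function names are choices made for this example). Then the three polynomial norms come out like this for $1 - 2x + 2x^2$:

```python
import math

def poly_l1(coeffs):
    # Add up the absolute values of the coefficients.
    return sum(abs(a) for a in coeffs)

def poly_l2(coeffs):
    # Square the absolute values, add them, take the square root.
    return math.sqrt(sum(abs(a) ** 2 for a in coeffs))

def poly_linf(coeffs):
    # Largest absolute value of any coefficient.
    return max(abs(a) for a in coeffs)

# 1 - 2x + 2x^2, stored as [a_0, a_1, a_2].
p = [1, -2, 2]
print(poly_l1(p))    # 5
print(poly_l2(p))    # 3.0
print(poly_linf(p))  # 2
```

Using `abs` on each coefficient means the same code would also accept complex coefficients.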

These don’t look so different, even though points in space and polynomials seem to be different things. We designed the tool. We want it not to be weirder than it has to be. When we try to put a norm on a new kind of thing, we look for a norm that resembles the one we use for the old kind of thing. For example, when we want to define the norm of a matrix, we’ll typically rely on a norm we’ve already found for a vector. At least to set up the matrix norm; in practice, we might do a calculation that doesn’t explicitly use a vector’s norm, but gives us the same answer.
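One common way to do this, assumed here since the essay doesn't name a construction, is the induced norm: the size of a matrix is the largest size of $Av$ over all vectors $v$ of size 1. With the L1 vector norm that works out to the largest absolute column sum, a shortcut that never mentions a vector. The Python sketch below compares the shortcut with a crude search over unit vectors; it is hard-wired to two-column matrices and all names are invented for illustration.

```python
def l1_norm(v):
    return sum(abs(x) for x in v)

def apply(matrix, v):
    # Matrix times vector, with the matrix stored as a list of rows.
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]

def matrix_l1_by_columns(matrix):
    # The shortcut: largest absolute column sum.
    columns = range(len(matrix[0]))
    return max(sum(abs(row[j]) for row in matrix) for j in columns)

def matrix_l1_by_sampling(matrix, steps=100):
    # The definition, roughly: try many vectors v with l1_norm(v) == 1
    # and keep the largest l1_norm(A v). Two columns only.
    best = 0.0
    for k in range(-steps, steps + 1):
        t = k / steps
        for sign in (1, -1):
            v = [t, sign * (1 - abs(t))]
            best = max(best, l1_norm(apply(matrix, v)))
    return best

A = [[1, -2], [3, 4]]
print(matrix_l1_by_columns(A))   # 6
print(matrix_l1_by_sampling(A))  # 6.0
```

The sampled search only agrees exactly because the maximum happens to land on a sampled vector, (0, 1); in general it gives a lower bound on the true value.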

If we have a norm for some vector space, then we have an idea of distance. We can say how far apart two vectors are. It’s the norm of the difference between the vectors. This is called defining a metric on the vector space. A metric is that sense of how far apart two things are. What keeps a norm and a metric from being the same thing is that it’s possible to come up with a metric that doesn’t match any sensible norm.

It’s always possible to use a norm to define a metric, though. Doing that promotes our normed vector space to the dignified status of a “metric space”. Many of the spaces we find interesting enough to work in are such metric spaces. It’s hard to think of doing without some idea of size.
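Here is a small Python sketch of both halves of that story. The metric built from a norm is just the norm of the difference. For a metric that matches no norm, the example assumes the so-called discrete metric (0 if two things are equal, 1 otherwise), which the essay doesn't name but which is a standard example. A norm-based distance has to double when both vectors double; the discrete one doesn't budge. The function names are illustrative.

```python
import math

def euclidean(v):
    return math.sqrt(sum(x * x for x in v))

def metric_from_norm(norm):
    # Distance between u and v is the norm of their difference.
    def distance(u, v):
        return norm([a - b for a, b in zip(u, v)])
    return distance

def discrete_metric(u, v):
    # A legitimate metric: 0 for identical points, 1 for any others.
    return 0 if list(u) == list(v) else 1

d = metric_from_norm(euclidean)
u, v = [1, 2], [4, 6]
u2, v2 = [2 * x for x in u], [2 * x for x in v]

print(d(u, v))                  # 5.0
print(d(u2, v2))                # 10.0
print(discrete_metric(u, v))    # 1
print(discrete_metric(u2, v2))  # 1
```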

I’ve made it through one more week without missing a deadline! This and all the other Fall 2019 A To Z posts should be at this link. I remain open for subjects for the letters Q through T, and would appreciate nominations at this link. Thank you for reading and I’ll fill out the rest of this week with reminders of old A-to-Z essays.

## y-axis.

It’s easy to tell where you are on a line. At least it is if you have a couple tools. One is a reference point. Another is the ability to say how far away things are. Then if you say something is a specific distance from the reference point you can pin down its location to one of at most two points. If we add to the distance some idea of direction we can pin that down to at most one point. Real numbers give us a good sense of distance. Positive and negative numbers fit the idea of orientation pretty well.

To tell where you are on a plane, though, that gets tricky. A reference point and a sense of how far things are help. Knowing something is a set distance from the reference point tells you something about its position. But there’s still an infinite number of possible places the thing could be, unless it’s at the reference point.

The classic way to solve this is to divide space into a couple directions. René Descartes made a name for himself with, well, many things. But one of them, in mathematics, was to describe the positions of things by components. One component describes how far something is in one direction from the reference point. The next component describes how far the thing is in another direction.

This sort of scheme we see as laying down axes. One, conventionally taken to be the horizontal or left-right axis, we call the x-axis. The other direction — one perpendicular, or orthogonal, to the x-axis — we call the y-axis. Usually this gets drawn as the vertical axis, the one running up and down the sheet of paper. That’s not required; it’s just convention.

We surely call it the x-axis in echo of the use of x as the name for a number whose value we don’t know right away. (That, too, is a convention Descartes gave us.) x carries with it connotations of the unknown, the sought-after, the mysterious thing to be understood. The next axis we name y because … well, that’s a letter near x and we don’t much need it for anything else, I suppose. If we need another direction yet, if we want something in space rather than a plane, then the third axis we dub the z-axis. It’s perpendicular to the x- and the y-axis directions.

These aren’t the only names for these directions, though. It’s common and often convenient to describe positions of things using vector notation. A vector describes the relative distance and orientation of things. It’s compact symbolically. It lets one think of the position of things as a single variable, a single concept. Then we can talk about a position being a certain distance in the direction of the x-axis plus a certain distance in the direction of the y-axis. And, if need be, plus some distance in the direction of the z-axis.

The direction of the x-axis is often written as $\hat{i}$, and the direction of the y-axis as $\hat{j}$. The direction of the z-axis, if needed, gets written $\hat{k}$. The circumflex there indicates two things. First is that the thing underneath it is a vector. Second is that it’s a vector one unit long. A vector might have any length, including zero. It’s convenient to make some mention when it’s a nice one unit long.

Another popular notation is to write the direction of the x-axis as the vector $\hat{e}_1$, and the y-axis as the vector $\hat{e}_2$, and so on. This method offers several advantages. One is that we can talk about the vector $\hat{e}_j$, that is, some particular direction without pinning down just which one. That’s the equivalent of writing “x” or “y” for a number we don’t want to commit ourselves to just yet. Another is that we can talk about axes going off in two, or three, or four, or more directions without having to pin down how many there are. And then we don’t have to think of what to call them. x- and y- and z-axes make sense. w-axis sounds a little odd but some might accept it. v-axis? u-axis? Nobody wants that, trust me.

Sometimes people start the numbering from $\hat{e}_0$ so that the y-axis is the direction $\hat{e}_1$. Usually it’s either clear from context or else it doesn’t matter.
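The numbered-direction idea translates readily into code. In this Python sketch, `unit_vector` and `position` are hypothetical helpers: the first builds the direction $\hat{e}_j$ in however many dimensions you ask for, with a `start` argument covering both numbering conventions, and the second rebuilds a position as so much of each direction added together.

```python
def unit_vector(j, n, start=1):
    # The direction e_j in n dimensions: 1 in slot j, 0 everywhere else.
    # `start` is the index given to the first axis (1 or 0).
    return [1 if k == j else 0 for k in range(start, start + n)]

def position(components):
    # Add up (distance along axis j) times e_j over every axis.
    n = len(components)
    total = [0] * n
    for j, c in enumerate(components, start=1):
        e_j = unit_vector(j, n)
        total = [t + c * e for t, e in zip(total, e_j)]
    return total

print(unit_vector(1, 3))           # [1, 0, 0]
print(unit_vector(2, 3))           # [0, 1, 0]
print(unit_vector(1, 3, start=0))  # [0, 1, 0]
print(position([3, 4, 0]))         # [3, 4, 0]
```

With numbering from 1, the y-axis direction is `unit_vector(2, 3)`; with numbering from 0 it is `unit_vector(1, 3, start=0)`, and both give the same list.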