My All 2020 Mathematics A to Z: Wronskian


Today’s is another topic suggested by Mr Wu, author of the Singapore Maths Tuition blog. The Wronskian is named for Józef Maria Hoëne-Wroński, a Polish mathematician, born in 1778. He served in General Tadeusz Kosciuszko’s army in the 1794 Kosciuszko Uprising. After being captured and forced to serve in the Russian army, he moved to France. He kicked around Western Europe and its mathematical and scientific circles. I’d like to say this was all creative and insightful, but, well. Wikipedia describes him trying to build a perpetual motion machine. Trying to square the circle (also impossible). Building a machine to predict the future. The St Andrews mathematical biography notes his writing a summary of “the general solution of the fifth degree [polynomial] equation”. This doesn’t exist.

Both sources, though, admit that for all that he got wrong, there were flashes of insight and brilliance in his work. The St Andrews biography particularly notes that Wronski’s tables of logarithms were well-designed. This is a hard thing to feel impressed by. But it’s hard to balance information so that it’s compact yet useful. He wrote about the Wronskian in 1812; it wouldn’t be named for him until 1882. This was 29 years after his death, but it does seem likely he’d have enjoyed having a familiar thing named for him. I suspect he wouldn’t enjoy my next paragraph, but would enjoy the fight with me about it.

Color cartoon illustration of a coati in a beret and neckerchief, holding up a director's megaphone and looking over the Hollywood hills. The megaphone has the symbols + x (division obelus) and = on it. The Hollywood sign is, instead, the letters MATHEMATICS. In the background are spotlights, with several of them crossing so as to make the letters A and Z; one leg of the spotlights has 'TO' in it, so the art reads out, subtly, 'Mathematics A to Z'.
Art by Thomas K Dye, creator of the web comics Projection Edge, Newshounds, Infinity Refugees, and Something Happens. He’s on Twitter as @projectionedge. You can get to read Projection Edge six months early by subscribing to his Patreon.

Wronskian.

The Wronskian is a thing put into Introduction to Ordinary Differential Equations courses because students must suffer in atonement for their sins. Those who fail to reform enough must go on to the Hessian, in Partial Differential Equations.

To be more precise, the Wronskian is the determinant of a matrix. You find the determinant by adding and subtracting together products of the matrix's elements. It's not hard, but it is tedious, and gets more tedious pretty fast as the matrix gets bigger. (In Big-O notation, it's on the order of the cube of the matrix size. This is rough, for things humans do, although not bad as algorithms go.) The matrix here is made up of a bunch of functions and their derivatives. The functions need to be functions of a single variable. The derivatives, you need first, second, third, and so on, up to one less than the number of functions you have.

If you have two functions, f and g , you need their first derivatives, f' and g' . If you have three functions, f , g , and h , you need first derivatives, f' , g' , and h' , as well as second derivatives, f'' , g'' , and h'' . If you have N functions and here I’ll call them f_1, f_2, f_3, \cdots f_N , you need N-1 derivatives, f'_1, f''_1, f'''_1, \cdots f^{(N-1)}_1 and so on through f^{(N-1)}_N . You see right away this is a fun and exciting thing to calculate. Also why in intro to differential equations you only work this out with two or three functions. Maybe four functions if the class has been really naughty.

Go through your N functions and your N-1 derivatives and make a big square matrix. And then you go through calculating the determinant. This involves a lot of multiplying strings of these derivatives together. It’s a lot of work. But at least doing all this work gets you older.
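Here's a sketch of that computation for two functions, done in Python with the SymPy library (an assumption on my part; the essay doesn't commit to any particular software). The first row of the matrix holds the functions, the second their first derivatives, and the Wronskian is the determinant:

```python
import sympy as sp

x = sp.symbols('x')
f, g = sp.sin(x), sp.cos(x)

# Build the 2x2 Wronskian matrix: the functions in the first row,
# their first derivatives in the second.
W = sp.Matrix([[f, g],
               [sp.diff(f, x), sp.diff(g, x)]])

wronskian = sp.simplify(W.det())
print(wronskian)  # -sin(x)**2 - cos(x)**2 simplifies to -1
```

For three functions you'd add a third column and a row of second derivatives, and the determinant gets correspondingly more tedious, which is the point of the paragraph above.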

So one will ask why do all this? Why fit it into every Intro to Ordinary Differential Equations textbook, and why slip it into classes that have enough stuff going on?

One answer is that if the Wronskian is not zero for some values of the independent variable, then the functions that went into it are linearly independent. Mathematicians learn to like sets of linearly independent functions. We can treat functions like directions in space. Linear independence assures us none of these functions are redundant, pointing a way we already can describe. (Real people see nothing wrong in having north, east, and northeast as directions. But mathematicians would like as few directions in our set as possible.) The Wronskian being zero for every value of the independent variable seems like it should tell us the functions are linearly dependent. It doesn’t, not without some more constraints on the functions.
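That caveat at the end is easy to see with a classic counterexample, sketched here in Python (a numerical sketch of my own devising, not anything from the essay): the functions x² and x·|x| have a Wronskian of zero everywhere on the real line, yet no single constant relates one to the other on the whole line.

```python
# Classic counterexample: x**2 and x*|x| have Wronskian zero everywhere,
# yet they are linearly independent over the whole real line.
def f(x): return x * x
def g(x): return x * abs(x)

def wronskian(f, g, x, h=1e-6):
    # approximate first derivatives by central differences
    fp = (f(x + h) - f(x - h)) / (2 * h)
    gp = (g(x + h) - g(x - h)) / (2 * h)
    return f(x) * gp - fp * g(x)

for t in (-2.0, -0.5, 0.5, 2.0):
    print(wronskian(f, g, t))  # each of these is 0, up to rounding

# But no single constant c gives g(x) = c*f(x) for every x:
print(g(1.0) / f(1.0), g(-1.0) / f(-1.0))  # 1.0 on one side, -1.0 on the other
```

So a nonzero Wronskian somewhere proves independence, but a Wronskian that's zero everywhere proves nothing by itself.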

This is fine, but who cares? And, unfortunately, in Intro it’s hard to reach a strong reason to care. To this mathematics major, the emphasis on linearly independent functions felt misplaced. It’s the sort of thing we care about in linear algebra. Or some course where we talk about vector spaces. Differential equations do lead us into vector spaces. It’s hard to find a corner of analysis that doesn’t.

Every ordinary differential equation has a secret picture. This is a vector field. One axis in the field is the independent variable of the function. The other axes are the values of the function, and maybe its derivatives, depending on how many derivatives appear in the ordinary differential equation. To solve one particular differential equation is to find one path in this field. People who just use differential equations will want to find one path.

Mathematicians tend to be fine with finding one path. But they want to find what kinds of paths there can be. Are there paths which the differential equation picks out, by making paths near it stay near? Or by making paths that run away from it? And here is the value of the Wronskian. The Wronskian tells us about the divergence of this vector field. This gives us insight into how these paths behave. It’s in the same way that knowing where high- and low-pressure systems are describes how the weather will change. The Wronskian, by way of a thing called Liouville’s Theorem that I haven’t the strength to describe today, ties in to the Hamiltonian. And the Hamiltonian we see in almost every mechanics problem of note.

You can see where the mathematics PhD, or the physicist, would find this interesting. But what about the student, who would look at the symbols evoked by those paragraphs above with reasonable horror?

And here’s the second answer for what the Wronskian is good for. It helps us solve ordinary differential equations. Like, particular ones. An ordinary differential equation will (normally) have several linearly independent solutions. If you know all but one of those solutions, it’s possible to calculate the Wronskian and, from that, the last of the independent solutions. Since a big chunk of mathematics — particularly for science or engineering — is solving differential equations you see why this is something valuable. Allow that it’s tedious. Tedious work we can automate, or give to a research assistant to do.

One then asks what kind of differential equation would have all-but-one answer findable, and yield that last one only by long efforts of hard work. So let me show you an example ordinary differential equation:

y'' + a(x) y' + b(x) y = g(x)

Here a(x) , b(x) , and g(x) are some functions that depend only on the independent variable, x . Don’t know what they are; don’t care. The differential equation is a lot easier if a(x) and b(x) are constants, but we don’t insist on that.

This equation has a close cousin, one that’s easier to solve than the original. This cousin is called the homogeneous equation:

y'' + a(x) y' + b(x) y = 0

The left-hand-side, the parts with the function y that we want to find, is the same. It’s the right-hand-side that’s different, that’s a constant zero. This is what makes the new equation homogeneous. This homogeneous equation is easier and we can expect to find two functions, y_1 and y_2 , that solve it. If a(x) and b(x) are constant this is even easy. Even if they’re not, if you can find one solution, the Wronskian lets you generate the second.
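That generating trick can be sketched concretely. Abel's identity says the Wronskian of any two solutions of y'' + a(x) y' + b(x) y = 0 is a constant times exp(-∫ a dx), and reduction of order turns that into a second solution. Here's a SymPy sketch on a simple constant-coefficient example of my own choosing (the essay doesn't work one out):

```python
import sympy as sp

x = sp.symbols('x')
y1 = sp.exp(x)   # one known solution of y'' - 3y' + 2y = 0
a = -3           # the coefficient of y' in y'' + a(x) y' + b(x) y = 0

# Abel's identity: W(x) = C * exp(-integral of a); take C = 1
W = sp.exp(-sp.integrate(a, x))          # exp(3*x)

# Reduction of order: y2 = y1 * integral of W / y1**2
y2 = sp.simplify(y1 * sp.integrate(W / y1**2, x))
print(y2)  # exp(2*x)

# Check that y2 really solves the homogeneous equation
residual = sp.simplify(sp.diff(y2, x, 2) - 3*sp.diff(y2, x) + 2*y2)
print(residual)  # 0
```

Notice the Wronskian came from the equation's coefficients alone, before we knew the second solution. That's the magic being used.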

That’s nice for the homogeneous equation. But what if we care about the original, inhomogeneous one? The Wronskian serves us there too. Imagine that the inhomogeneous equation has some solution, which we’ll call y_p . (The ‘p’ stands for ‘particular’, as in “the solution for this particular g(x) ”.) But y_p + y_1 also has to solve that inhomogeneous differential equation. It seems startling but if you work it out, it’s so. (The key is the derivative of the sum of functions is the same as the sum of the derivative of functions.) y_p + y_2 also has to solve that inhomogeneous differential equation. In fact, for any constants C_1 and C_2 , it has to be that y_p + C_1 y_1 + C_2 y_2 is a solution.

I’ll skip the derivation; you have Wikipedia for that. The key is that knowing these homogeneous solutions, and the Wronskian, and the original g(x) , will let you find the y_p that you really want.
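The recipe Wikipedia would give you is variation of parameters: y_p = -y_1 ∫ y_2 g / W dx + y_2 ∫ y_1 g / W dx. Here's a SymPy sketch checking it on an example equation of my own choosing, y'' - 3y' + 2y = e^(3x):

```python
import sympy as sp

x = sp.symbols('x')
y1, y2 = sp.exp(x), sp.exp(2*x)   # homogeneous solutions of y'' - 3y' + 2y = 0
g = sp.exp(3*x)                    # the right-hand side

# The Wronskian of the homogeneous solutions
W = sp.simplify(y1 * sp.diff(y2, x) - sp.diff(y1, x) * y2)

# Variation of parameters for the particular solution
yp = sp.simplify(-y1 * sp.integrate(y2 * g / W, x)
                 + y2 * sp.integrate(y1 * g / W, x))
print(yp)  # exp(3*x)/2

# Check it against the original inhomogeneous equation
residual = sp.simplify(sp.diff(yp, x, 2) - 3*sp.diff(yp, x) + 2*yp - g)
print(residual)  # 0
```

All the tedium of the integrals is hidden inside `integrate`, which is the sense in which the work can be automated.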

My reading is that this is more useful in proving things true about differential equations, rather than particularly solving them. It takes a lot of paper and I don’t blame anyone not wanting to do it. But it’s a wonder that it works, and so well.

Don’t make your instructor so mad you have to do the Wronskian for four functions.


This and all the others in My 2020 A-to-Z essays should be at this link. All the essays from every A-to-Z series should be at this link. Thank you for reading.

My 2019 Mathematics A To Z: Norm


Today’s A To Z term is another free choice. So I’m picking a term from the world of … mathematics. There are a lot of norms out there. Many are specialized to particular roles, such as looking at complex-valued numbers, or vectors, or matrices, or polynomials.

Still they share things in common, and that’s what this essay is for. And I’ve brushed up against the topic before.

The norm, also, has nothing particular to do with “normal”. “Normal” is an adjective which attaches to every noun in mathematics. This is security for me: while these A-To-Z sequences may run out of X and Y and W letters, I will never be short of N’s.

Cartoony banner illustration of a coati, a raccoon-like animal, flying a kite in the clear autumn sky. A skywriting plane has written 'MATHEMATIC A TO Z'; the kite, with the letter 'S' on it to make the word 'MATHEMATICS'.
Art by Thomas K Dye, creator of the web comics Projection Edge, Newshounds, Infinity Refugees, and Something Happens. He’s on Twitter as @projectionedge. You can get to read Projection Edge six months early by subscribing to his Patreon.

Norm.

A “norm” is the size of whatever kind of thing you’re working with. You can see where this is something we look for. It’s easy to look at two things and wonder which is the smaller.

There are many norms, even for one set of things. Some seem compelling. For the real numbers, we usually let the absolute value do this work. By “usually” I mean “I don’t remember ever seeing a different one except from someone introducing the idea of other norms”. For a complex-valued number, it’s usually the square root of the sum of the square of the real part and the square of the imaginary coefficient. For a vector, it’s usually the square root of the vector dot-product with itself. (Dot product is this binary operation that is like multiplication, if you squint, for vectors.) Again, here the “usually” means “always except when someone’s trying to make a point”.

Which is why we have the convention that there is a “the norm” for a kind of thing. The norm dignified as “the” is usually the one that looks as much as possible like the way we find distances between two points on a plane. I assume this is because we bring our intuition about everyday geometry to mathematical structures. You know how it is. Given an infinity of possible choices we take the one that seems least difficult.

Every sort of thing which can have a norm, that I can think of, is a vector space. This might be my failing imagination. It may also be that it’s quite easy to have a vector space. A vector space is a collection of things with some rules. Those rules are about adding the things inside the vector space, and multiplying the things in the vector space by scalars. These rules are not difficult requirements to meet. So a lot of mathematical structures are vector spaces, and the things inside them are vectors.

A norm is a function that has these vectors as its domain, and the non-negative real numbers as its range. And there are three rules that it has to meet. So. Give me a vector ‘u’ and a vector ‘v’. I’ll also need a scalar, ‘a’. Then the function f is a norm when:

  1. f(u + v) \le f(u) + f(v) . This is a famous rule, called the triangle inequality. You know how in a triangle, the sum of the lengths of any two legs is greater than the length of the third leg? That’s the rule at work here.
  2. f(a\cdot u) = |a| \cdot f(u) . This doesn’t have so snappy a name. Sorry. It’s something about being homogeneous, at least.
  3. If f(u) = 0 then u has to be the additive identity, the vector that works like zero does.

Norms take on many shapes. They depend on the kind of thing we measure, and what we find interesting about those things. Some are familiar. Look at a Euclidean space, with Cartesian coordinates, so that we might write something like (3, 4) to describe a point. The “the norm” for this, called the Euclidean norm or the L2 norm, is the square root of the sum of the squares of the coordinates. So, 5. But there are other norms. The L1 norm is the sum of the absolute values of all the coefficients; here, 7. The L∞ norm is the largest single absolute value of any coefficient; here, 4.

A polynomial, meanwhile? Write it out as a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_n x^n . Take the absolute value of each of these a_k terms. Then … you have choices. You could take those absolute values and add them up. That’s the L1 polynomial norm. Take those absolute values and square them, then add those squares, and take the square root of that sum. That’s the L2 norm. Take the largest absolute value of any of these coefficients. That’s the L∞ norm.
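And that resemblance is literal: the same three little formulas work whether the tuple holds a point's coordinates or a polynomial's coefficients. A quick Python sketch (my own illustration; the coefficient example is made up):

```python
import math

def l1(coeffs):
    # sum of absolute values
    return sum(abs(c) for c in coeffs)

def l2(coeffs):
    # square root of the sum of squares: the Euclidean norm
    return math.sqrt(sum(abs(c)**2 for c in coeffs))

def linf(coeffs):
    # largest single absolute value
    return max(abs(c) for c in coeffs)

point = (3, 4)
print(l2(point), l1(point), linf(point))  # 5.0 7 4

# A polynomial is just its coefficient tuple: 1 - 2x + 3x^2
poly = (1, -2, 3)
print(l1(poly), linf(poly))  # 6 3
```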

These don’t look so different, even though points in space and polynomials seem to be different things. We designed the tool. We want it not to be weirder than it has to be. When we try to put a norm on a new kind of thing, we look for a norm that resembles the old kind of thing. For example, when we want to define the norm of a matrix, we’ll typically rely on a norm we’ve already found for a vector. At least to set up the matrix norm; in practice, we might do a calculation that doesn’t explicitly use a vector’s norm, but gives us the same answer.

If we have a norm for some vector space, then we have an idea of distance. We can say how far apart two vectors are. It’s the norm of the difference between the vectors. This is called defining a metric on the vector space. A metric is that sense of how far apart two things are. What keeps a norm and a metric from being the same thing is that it’s possible to come up with a metric that doesn’t match any sensible norm.
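The norm-of-the-difference recipe is short enough to sketch outright (again in Python, my own illustration):

```python
import math

def distance(u, v):
    # the metric a norm induces: d(u, v) = norm of (u - v),
    # here using the Euclidean norm on coordinate tuples
    diff = tuple(a - b for a, b in zip(u, v))
    return math.sqrt(sum(c * c for c in diff))

print(distance((3, 4), (0, 0)))  # 5.0
```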

It’s always possible to use a norm to define a metric, though. Doing that promotes our normed vector space to the dignified status of a “metric space”. Many of the spaces we find interesting enough to work in are such metric spaces. It’s hard to think of doing without some idea of size.


I’ve made it through one more week without missing deadline! This and all the other Fall 2019 A To Z posts should be at this link. I remain open for subjects for the letters Q through T, and would appreciate nominations at this link. Thank you for reading and I’ll fill out the rest of this week with reminders of old A-to-Z essays.

A Leap Day 2016 Mathematics A To Z: Vector


And as we approach the last letters of the alphabet, my Leap Day A To Z gets to the last of Gaurish’s requests.

Vector.

A vector’s a thing you can multiply by a number and then add to another vector.

Oh, I know what you’re thinking. Wasn’t a vector one of those things that points somewhere? A direction and a length in that direction? (Maybe dressed up in more formal language. I’m glad to see that apparently New Jersey Tech’s student newspaper is still The Vector and still uses the motto “With Magnitude And Direction”.) Yeah, that’s how we’re always introduced to it. Pointing to stuff is a good introduction to vectors. Nearly everyone finds their way around places. And it’s a good learning model, to learn how to multiply vectors by numbers and to add vectors together.

But thinking too much about directions, either in real-world three-dimensional space, or in the two-dimensional space of the thing we’re writing notes on, can be limiting. We can get too hung up on a particular representation of a vector. Usually that’s an ordered set of numbers. That’s all right as far as it goes, but why limit ourselves? A particular representation can be easy to understand, but as the scary people in the philosophy department have been pointing out for 26 centuries now, a particular example of a thing and the thing are not identical.

And if we look at vectors as “things we can multiply by a number, then add another vector to”, then we see something grand. We see a commonality in many different kinds of things. We can do this multiply-and-add with those things that point somewhere. Call those coordinates. But we can also do this with matrices, grids of numbers or other stuff it’s convenient to have. We can also do this with ordinary old numbers. (Think about it.) We can do this with polynomials. We can do this with sets of linear equations. We can do this with functions, as long as they’re defined for compatible domains. We can even do this with differential equations. We can see a unity in things that seem, at first, to have nothing to do with one another.
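Polynomials make the point quickly. Represent one by its coefficient tuple, and multiply-and-add is just ordinary componentwise arithmetic (a small Python sketch of my own):

```python
# Polynomials as coefficient tuples: p(x) = 1 + 2x, q(x) = 3 - x + x^2
p = (1, 2, 0)
q = (3, -1, 1)

def add(u, v):
    # add componentwise, as with any vectors
    return tuple(a + b for a, b in zip(u, v))

def scale(c, u):
    # multiply every component by the scalar c
    return tuple(c * a for a in u)

# 2p + q represents 5 + 3x + x^2
print(add(scale(2, p), q))  # (5, 3, 1)
```

The same two functions would serve for coordinates, for matrices flattened out, or for anything else that multiplies-and-adds.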

We call these collections of things “vector spaces”. It’s a space much like the space you happen to exist in is. Adding two things in the space together is much like moving from one place to another, then moving again. You can’t get out of the space. Multiplying a thing in the space by a real number is like going in one direction a short or a long or whatever great distance you want. Again you can’t get out of the space. This is called “being closed”.

(I know, you may be wondering if it isn’t question-begging to say a vector is a thing in a vector space, which is made up of vectors. It isn’t. We define a vector space as a set of things that satisfy a certain group of rules. The things in that set are the vectors.)

Vector spaces are nice things. They work much like ordinary space does. We can bring many of the ideas we know from spatial awareness to vector spaces. For example, we can usually define a “length” of things. And something that works like the “angle” between things. We can define bases, breaking down a particular element into a combination of standard reference elements. This helps us solve problems, by finding ways they’re shadows of things we already know how to solve. And it doesn’t take much to satisfy the rules of being a vector space. I think mathematicians studying new groups of objects look instinctively for how we might organize them into a vector space.

We can organize them further. A vector space that satisfies some rules about sequences of terms, and that has a “norm” which is pretty much a size, becomes a Banach space. It works a little more like ordinary three-dimensional space. A Banach space that has a norm defined by a certain common method is a Hilbert space. These work even more like ordinary space, but they don’t need anything in common with it. For example, the functions that describe quantum mechanics are in a Hilbert space. There’s a thing called a Sobolev Space, a kind of vector space that also meets criteria I forget, but the name has stuck with me for decades because it is so wonderfully assonant.

I mentioned how vectors are stuff you can multiply by numbers, and add to other vectors. That’s true, but it’s a little limiting. The thing we multiply a vector by is called a scalar. And the scalar is a number — real or complex-valued — so often it’s easy to think that’s the default. But it doesn’t have to be. The scalar just has to be an element of some field. A ‘field’ is a ring in which you can do addition, multiplication, and division (by anything except zero). So numbers are the obvious choice. They’re not the only ones, though. The scalar has to be able to multiply with the vector, since otherwise the entire concept collapses into gibberish. But we wouldn’t go looking among the gibberish except to be funny anyway.

The idea of the ‘vector’ is straightforward and powerful. So we see it all over a wide swath of mathematics. It’s one of the things that shapes how we expect mathematics to look.

The Set Tour, Part 12: What Can You Do With Functions?


I want to resume my tour of sets that turn up a lot as domains and ranges. But I need to spend some time explaining stuff before the next bunch. I want to talk about things that aren’t so familiar as “numbers” or “shapes”. We get into more abstract things.

We have to start out with functions. Functions are built of three parts: a set that’s the domain, a set that’s the range, and a rule that matches things in the domain to things in the range. But what’s a set? Sets are bunches of things. (If we want to avoid logical chaos we have to be more exact. But we’re not going near the zones of logical chaos. So we’re all right going with “sets are bunches of things”. WARNING: do not try to pass this off at your thesis defense.)

So if a function is a thing, can’t we have a set that’s made up of functions? Sure, why not? We can get a set by describing the collection of things we want in it. At least if we aren’t doing anything weird. (See above warning.)

Let’s pick out a set of functions. Put together a group of functions that all have the same set as their domain, and that have compatible sets as their range. The real numbers are a good pick for a domain. They’re also good for a range.

Is this an interesting set? Generally, a set is boring unless we can do something with the stuff in it. That something is, almost always, taking a pair of the things in the set and relating it to something new. Whole numbers, for example, would be trivia if we weren’t able to add them together. Real numbers would be a complicated pile of digits if we couldn’t multiply them together. Having things is nice. Doing stuff with things is all that’s meaningful.

So what can we do with a couple of functions, if they have the same domains and ranges? Let’s pick one out. Give it the name ‘f’. That’s a common name for functions. It was given to us by Leonhard Euler, who was brilliant in every field of mathematics, including in creating notation. Now let’s pick out a function again. Give this new one the name ‘g’. That’s a common name for functions, given to us by every mathematician who needed something besides ‘f’. (There are alternatives. One is to start using subscripts, like f1 and f2. That’s too hard for me to type. Another is to use different typefaces. Again, too hard for me. Another is to use lower- and upper-case letters, ‘f’ and ‘F’. Using alternate-case forms usually connotes that these two functions are related in some way. I don’t want to suggest that they are related here. So, ‘g’ it is.)

We can do some obvious things. We can add them together. We can create a new function, imaginatively named ‘f + g’. It’ll have the same domain and the same range as f and g did. What rule defines how it matches things in the domain to things in the range?

Mathematicians throw the term “obvious” around a lot. Also “intuitive”. What they mean is “what makes sense to me but I don’t want to write it down”. Saying that is fine if your mathematician friend knows roughly what you’d think makes sense. It can be catastrophic if she’s much smarter than you, or thinks in weird ways, and is always surprised other people don’t think like her. It’s hard to better describe it than “obvious”, though. Well, here goes.

Let me pick something that’s in the domain of both f and g. I’m going to call that x, which mathematicians have been doing ever since René Descartes gave us the idea. So “f(x)” is something in the range of f, and “g(x)” is something in the range of g. I said, way up earlier, that both of these ranges are the same set and suggested the real numbers there. That is, f(x) is some real number and I don’t care which just now. g(x) is also some real number and again I don’t care right now just which.

The function we call “f + g” matches the thing x, in the domain, to something in the range. What thing? The number f(x) + g(x). I told you, I can’t see any fair way to describe that besides being “obvious” and “intuitive”.

Another thing we’ll want to do is multiply a function by a real number. Suppose we have a function f, just like above. Give me a real number. We’ll call that real number ‘a’ because I don’t remember if you can do the alpha symbol easily on web pages. Anyway, we can define a function, ‘af’, the multiplication of the real number a by the function f. It has the same domain as f, and the same range as f. What’s its rule?

Let me say x is something in the domain of f. So f(x) is some real number. Then the new function ‘af’ matches the x in the domain with a real number. That number is what you get by multiplying ‘a’ by whatever ‘f(x)’ is. So there are major parts of your mathematician friend from college’s classes that you could have followed without trouble.
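Both rules fit in a few lines of Python, if you'll allow functions that build new functions (my own sketch, not anything from the essay):

```python
def add(f, g):
    # the rule for 'f + g': (f + g)(x) = f(x) + g(x)
    return lambda x: f(x) + g(x)

def scale(a, f):
    # the rule for 'af': (af)(x) = a * f(x)
    return lambda x: a * f(x)

f = lambda x: x * x        # f(x) = x^2
g = lambda x: 3 * x + 1    # g(x) = 3x + 1

h = add(f, scale(2.0, g))  # h(x) = f(x) + 2*g(x)
print(h(2.0))  # 4 + 2*7 = 18.0
```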

(Her class would have covered many more things, mind you, and covered these more cryptically.)

There’s more stuff we would like to do with functions. But for now, this is enough. This lets us turn a set of functions into a “vector space”. Vector spaces are kinds of things that work, at least a bit, like arithmetic. And mathematicians have studied these kinds of things. We have a lot of potent tools that work on vector spaces. So mathematicians develop a habit of finding vector spaces in what they study.

And I’m subject to that too. This is why I’ve spent such time talking about what we can do with functions rather than naming particular sets. I’ll pick up from that.

The Set Tour, Part 5: C^n


The next piece in this set tour is a hybrid. It mixes properties of the last two sets. And I’ll own up now that while it’s a set that gets used a lot, it’s one that gets used a lot in just some corners of mathematics. It’s got a bit of that “Internet fame”. In particular circles it’s well-known; venture outside those circles even a little, and it’s not. But it leads us into other, useful places.

Cn

C here is the set of complex-valued numbers. We may have feared them once, but now they’re friends, or at least something we can work peacefully with. n here is some counting number, just as it is with Rn. n could be one or two or forty or a hundred billion. It’ll be whatever fits the problem we’re doing, if we need to pin down its value at all.

The reference to Rn, another friend, probably tipped you off to the rest. The items in Cn are n-tuples, ordered sets of some number n of numbers. Each of those numbers is itself a complex-valued number, something from C. Cn gets typeset in bold, and often with that extra vertical stroke on the left side of the C arc. It’s handwritten that way, too.

As with Rn we can add together things in Cn. Suppose that we are in C2 so that I don’t have to type too much. Suppose the first number is (2 + i, -3 - 3*i) and the second number is (6 - 2*i, 2 + 9*i). There could be fractions or irrational numbers in the real and imaginary components, but I don’t want to type that much. The work is the same. Anyway, the sum will be another number in C2. The first term in that sum will be the sum of the first term in the first number, 2 + i, and the first term in the second number, 6 - 2*i. That in turn will be the sum of the real and of the imaginary components, so, 2 + 6 + i - 2*i, or 8 - i all told. The second term of the sum will be the sum of the second term of the first number, -3 - 3*i, and the second term of the second number, 2 + 9*i. That’s -3 - 3*i + 2 + 9*i or, all told, -1 + 6*i. The sum is the n-tuple (8 - i, -1 + 6*i).

And also as with Rn there really isn’t a multiplication of one element of Cn by another. Generally, we can’t do this in any useful way. We can multiply something in Cn by a scalar, a single real — or, why not, complex-valued — number, though.

So let’s start out with (8 - i, -1 + 6*i), a number in C2. And then pick a scalar, say, 2 + 2*i. It doesn’t have to be complex-valued, but, why not? The product of this scalar and this term will be another number in C2. Its first term will be the scalar, 2 + 2*i, multiplied by the first term in it, 8 - i. That’s (2 + 2*i) * (8 - i), or 2*8 - 2*i + 16*i - 2*i*i, or 2*8 - 2*i + 16*i + 2, or 18 + 14*i. And then its second term will be the scalar 2 + 2*i multiplied by the second term, -1 + 6*i. That’s (2 + 2*i)*(-1 + 6*i), or 2*(-1) + 2*6*i - 2*i + 2*6*i*i. And that’s -2 + 12*i - 2*i - 12, or -14 + 10*i. So the product is (18 + 14*i, -14 + 10*i).
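Both calculations are easy to check, since Python has complex numbers built in (it writes the imaginary unit as `1j` rather than i):

```python
u = (2 + 1j, -3 - 3j)
v = (6 - 2j, 2 + 9j)

# componentwise addition in C^2
s = tuple(a + b for a, b in zip(u, v))
print(s)  # ((8-1j), (-1+6j))

# scalar multiplication by the complex scalar 2 + 2i
a = 2 + 2j
product = tuple(a * c for c in s)
print(product)  # ((18+14j), (-14+10j))
```

Which matches the arithmetic worked out by hand above.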

So as with Rn, Cn creates a “vector space”. These spaces are useful in complex analysis. They’re also useful in the study of affine geometry, a corner of geometry that I’m sad to admit falls outside what I studied. I have tried reading up on them on my own, and I run aground each time. I understand the basic principles but never quite grasp why they are interesting. That’s my own failing, of course, and I’d be glad for a pointer that explained in ways I understood why they’re so neat.

I do understand some of what’s neat about them: affine geometry tells us what we can know about shapes without using the concept of “distance”. When you discover that we can know anything about shapes without the idea of “distance” your imagination should be fired. Mine is, too. I just haven’t followed from that to feel comfortable with the terminology and symbols of the field.

You could, if you like, think of Cn as being a specially-delineated version of R2n. This is just as you can see a complex number as an ordered pair of real numbers. But sometimes information is usefully thought of as a single, complex-valued number. And there is a value in introducing the idea of ordered sets of things that are not real numbers. We will see the concept again.


Also, the heck did I write an 800-word essay about the family of sets of complex-valued n-tuples and have Hemingway Editor judge it to be at the “Grade 3” reading level? I rarely get down to “Grade 6” when I do a Reading the Comics post explaining how Andertoons did a snarky-word-problem-answers panel. That’s got to be a temporary glitch.

A Summer 2015 Mathematics A To Z: orthogonal


Orthogonal.

Orthogonal is another word for perpendicular. So why do we need another word for that?

It helps to think about why “perpendicular” is a useful way to organize things. For example, we can describe the directions to a place in terms of how far it is north-south and how far it is east-west, and talk about how fast it’s travelling in terms of its speed heading north or south and its speed heading east or west. We can separate the north-south motion from the east-west motion. If we’re lucky these motions separate entirely, and we turn a complicated two- or three-dimensional problem into two or three simpler problems. If they can’t be fully separated, they can often be largely separated. We turn a complicated problem into a set of simpler problems with a nice and easy part plus an annoying yet small hard part.

And this is why we like perpendicular directions. We can often turn a problem into several simpler ones describing each direction separately, or nearly so.

And now the amazing thing. We can separate these motions because the north-south and the east-west directions are at right angles to one another. But we can describe something that works like an angle between things that aren’t necessarily directions. For example, we can describe an angle between things like functions that have the same domain. And once we can describe the angle between two functions, we can describe functions that make right angles between each other.

This means we can describe functions as being perpendicular to one another. An example. On the domain of real numbers from -1 to 1, the function f(x) = x is perpendicular to the function g(x) = x^2 . And when we want to study a more complicated function we can separate the part that’s in the “direction” of f(x) from the part that’s in the “direction” of g(x). We can treat functions, even functions we don’t know, as if they were locations in space. And we can study and even solve for the different parts of the function as if we were pinning down the north-south and the east-west movements of a thing.
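That "angle" comes from an inner product, here the integral of the product of the two functions over the domain. The claim about f(x) = x and g(x) = x² checks out in one line of SymPy (an assumption of mine for the tooling; the fact itself is the essay's):

```python
import sympy as sp

x = sp.symbols('x')

# inner product of f(x) = x and g(x) = x**2 on [-1, 1]
inner = sp.integrate(x * x**2, (x, -1, 1))
print(inner)  # 0: the functions are orthogonal on this interval
```

The integrand x³ is odd, so its integral over a symmetric interval vanishes, which is why the interval from -1 to 1 matters.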

So if we want to study, say, how heat flows through a body, we can work out a series of “directions” for functions, and work out the flow in each of those “directions”. These don’t have anything to do with left-right or up-down directions, but the concepts and the convenience is similar.

I’ve spoken about this in terms of functions. But we can define the “angle” between things for many kinds of mathematical structures. Once we can do that, we can have “perpendicular” pairs of things. I’ve spoken only about functions, but that’s because functions are more familiar than many of the mathematical structures that have orthogonality.

Ah, but why call it “orthogonal” rather than “perpendicular”? And I don’t know. The best I can work out is that it feels weird to speak of, say, the cosine function being “perpendicular” to the sine function when you can’t really say either is in any particular direction. “Orthogonal” seems to appeal less directly to physical intuition while still meaning something. But that’s my guess, rather than the verdict of a skilled etymologist.