Tagged: linear algebra

  • Joseph Nebus 6:00 pm on Sunday, 16 April, 2017 Permalink | Reply
    Tags: Amanda the Great, linear algebra, Skin Horse

    Reading the Comics, April 15, 2017: Extended Week Edition 


    It turns out last Saturday only had the one comic strip that was even remotely on point for me. And it wasn’t very on point either, but since it’s one of the Creators.com strips I’ve got the strip to show. That’s enough for me.

    Henry Scarpelli and Craig Boldman’s Archie for the 8th is just about how algebra hurts. Some days I agree.

    'Ugh! Achey head! All blocked up! Throbbing! Completely stuffed!' 'Sounds like sinuses!' 'No. Too much algebra!'

    Henry Scarpelli and Craig Boldman’s Archie for the 8th of April, 2017. Do you suppose Archie knew that Dilton was listening there, or was he just emoting his fatigue to himself?

    Ruben Bolling’s Super-Fun-Pak Comix for the 8th is an installation of They Came From The Third Dimension. “Dimension” is one of those oft-used words that’s come loose of any technical definition. We use it in mathematics all the time, at least once we get into Introduction to Linear Algebra. That’s the course that talks about how blocks of space can be stretched and squashed and twisted into each other. You’d expect this to be a warmup act to geometry, and I guess it’s relevant. But where it really pays off is in studying differential equations and how systems of stuff change over time. When you get introduced to dimensions in linear algebra they describe degrees of freedom, or how much information you need about a problem to pin down exactly one solution.

    It does give mathematicians cause to talk about “dimensions of space”, though, and these are intuitively at least like the two- and three-dimensional spaces that, you know, stuff moves in. That there could be more dimensions of space, ordinarily inaccessible, is an old enough idea we don’t really notice it. Perhaps it’s hidden somewhere too.

    Amanda El-Dweek’s Amanda the Great of the 9th started a story with the adult Becky needing to take a mathematics qualification exam. It seems to be prerequisite to enrolling in some new classes. It’s a typical set of mathematics anxiety jokes in the service of a story comic. One might tsk Becky for going through university without ever having a proper mathematics class, but then, I got through university without ever taking a philosophy class that really challenged me. Not that I didn’t take the classes seriously, but that I took stuff like Intro to Logic that I was already conversant in. We all cut corners. It’s a shame not to use chances like that, but there’s always so much to do.

    Mark Anderson’s Andertoons for the 10th relieves the worry that Mark Anderson’s Andertoons might not have got in an appearance this week. It’s your common kid at the chalkboard sort of problem, this one a kid with no idea where to put the decimal. As always happens I’m sympathetic. The rules about where to move decimals in this kind of multiplication come out really weird if the last digit, or worse, digits in the product are zeroes.

    Mel Henze’s Gentle Creatures is in reruns. The strip from the 10th is part of a story I’m so sure I’ve featured here before that I’m not even going to look up when it aired. But it uses your standard story problem to stand in for science-fiction gadget mathematics calculation.

    Dave Blazek’s Loose Parts for the 12th is the natural extension of sleep numbers. Yes, I’m relieved to see Dave Blazek’s Loose Parts around here again too. Feels weird when it’s not.

    Bill Watterson’s Calvin and Hobbes rerun for the 13th is a resisting-the-story-problem joke. But Calvin resists so very well.

    John Deering’s Strange Brew for the 13th is a “math club” joke featuring horses. Oh, it’s a big silly one, but who doesn’t like those too?

    Dan Thompson’s Brevity for the 14th is one of the small set of punning jokes you can make using mathematician names. Good for the wall of a mathematics teacher’s classroom.

    Shaenon K Garrity and Jeffrey C Wells’s Skin Horse for the 14th is set inside a virtual reality game. (This is why there’s talk about duplicating objects.) Within the game, the characters are playing that game where you start with a set number (in this case 20) of tokens and take turns removing a couple of them. The “rigged” part of it is that the house can, by perfect play, force a win every time. It’s a bit of game theory that creeps into recreational mathematics books and that I imagine is imprinted in the minds of people who grow up to design games.
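
    The strip doesn’t spell out the rules, so here’s a guess at how the rigging works, in a short Python sketch. I’m assuming the classic version: 20 tokens, players alternate removing one, two, or three, and whoever takes the last token wins. Working backwards from zero shows the losing positions are exactly the multiples of four, so the house only has to let the mark go first from 20 and then keep handing back a multiple of four.

        # A sketch of the token game, with rules I'm assuming (the strip doesn't give them):
        # 20 tokens, remove 1-3 per turn, whoever takes the last token wins.
        def winning_positions(start=20, moves=(1, 2, 3)):
            """tokens remaining -> True if the player about to move can force a win."""
            win = {0: False}   # no tokens left: the player to move has already lost
            for n in range(1, start + 1):
                # you can win if some legal move leaves your opponent in a losing spot
                win[n] = any(not win[n - m] for m in moves if m <= n)
            return win

        win = winning_positions()
        print([n for n in win if not win[n]])   # [0, 4, 8, 12, 16, 20] -- 20 is a loss for the mover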

     
  • Joseph Nebus 6:00 pm on Friday, 25 November, 2016 Permalink | Reply
    Tags: kernels, linear algebra, null space

    The End 2016 Mathematics A To Z: Kernel 


    I told you that Image thing would reappear. Meanwhile I learned something about myself in writing this.

    Kernel.

    I want to talk about functions again. I’ve been keeping like a proper mathematician to a nice general idea of what a function is. The sort where a function’s this rule matching stuff in a set called the domain with stuff in a set called the range. And I’ve tried not to commit myself to saying anything about what that domain and range are. They could be numbers. They could be other functions. They could be the set of DVDs you own but haven’t watched in more than two years. They could be collections of socks. Haven’t said.

    But we know what functions anyone cares about. They’re stuff that have domains and ranges that are numbers. Preferably real numbers. Complex-valued numbers if we must. If we look at more exotic sets they’re ones that stick close to being numbers: vectors made up of an ordered set of numbers. Matrices of numbers. Functions that are themselves about numbers. Maybe we’ll get to something exotic like a rotation, but then what is a rotation but spinning something a certain number of degrees? There are a bunch of unavoidably common domains and ranges.

    Fine, then. I’ll stick to functions with ranges that look enough like regular old numbers. By “enough” I mean they have a zero. That is, something that works like zero does. You know, add it to something else and that something else isn’t changed. That’s all I need.

    A natural thing to wonder about a function — hold on. “Natural” is the wrong word. Something we learn to wonder about in functions, in pre-algebra class where they’re all polynomials, is where the zeroes are. They’re generally not at zero. Why would we say “zeroes” to mean “zero”? That could let non-mathematicians think they knew what we were on about. By the “zeroes” we mean the things in the domain that get matched to the zero in the range. It might be zero; no reason it couldn’t, until we know what the function’s rule is. Just we can’t count on that.

    A polynomial we know has … well, it might have zero zeroes. Might have no zeroes. It might have one, or two, or so on. If it’s an n-th degree polynomial it can have up to n zeroes. And if it’s not a polynomial? Well, then it could have any conceivable number of zeroes and nobody is going to give you a nice little formula to say where they all are. It’s not that we’re being mean. It’s just that there isn’t a nice little formula that works for all possibilities. There aren’t even nice little formulas that work for all polynomials. You have to find zeroes by thinking about the problem. Sorry.

    But! Suppose you have a collection of all the zeroes for your function. That’s all the points in the domain that match with zero in the range. Then we have a new name for the thing you have. And that’s the kernel of your function. It’s the biggest subset in the domain with an image that’s just the zero in the range.

    So we have a name for the zeroes that isn’t just “the zeroes”. What does this get us?

    If we don’t know anything about the kind of function we have, not much. If the function belongs to some common kinds of functions, though, it tells us stuff.

    For example. Suppose the function has domain and range that are vectors. And that the function is linear, which is to say, easy to deal with. Let me call the function ‘f’. And let me pick out two things in the domain. I’ll call them ‘x’ and ‘y’ because I’m writing this after Thanksgiving dinner and can’t work up a cleverer name for anything. If f is linear then f(x + y) is the same thing as f(x) + f(y). And now something magic happens. If x and y are both in the kernel, then x + y has to be in the kernel too. Think about it. Meanwhile, if x is in the kernel but y isn’t, then f(x + y) is f(y). Again think about it.
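
    Here’s a small sketch of that closure property with a concrete linear function, f(v) = Mv for a particular matrix M. None of this is in the essay proper; it just uses sympy to find a basis for the kernel and check that adding two kernel members keeps you in the kernel.

        # A concrete linear map f(v) = M v, and its kernel, via sympy (my example, not the essay's).
        from sympy import Matrix

        M = Matrix([[1, 2, 3],
                    [2, 4, 6]])       # sends R^3 into R^2 and flattens a lot of it to zero

        kernel_basis = M.nullspace()  # basis vectors for everything f sends to zero
        x, y = kernel_basis
        print(M * x, M * y)           # both are the zero vector
        print(M * (x + y))            # and so is their sum: the kernel is closed under addition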

    What we can see is that the domain fractures into two directions. One of them, the direction of the kernel, is invisible to the function. You can move however much you like in that direction and f can’t see it. The other direction, perpendicular (“orthogonal”, we say in the trade) to the kernel, is visible. Everything that might change changes in that direction.

    This idea threads through vector spaces, and we study a lot of things that turn out to look like vector spaces. It keeps surprising us by letting us solve problems, or find the best-possible approximate solutions. This kernel gives us room to match some fiddly conditions without breaking the real solution. The size of the null space alone can tell us whether some problems are solvable, or whether they’ll have infinitely large sets of solutions.

    In this vector-space construct the kernel often takes on another name, the “null space”. This means the same thing. But it reminds us that superhero comics writers miss out on many excellent pieces of terminology by not taking advanced courses in mathematics.

    Kernels also appear in group theory, whenever we get into rings. We’re always working with rings. They’re nearly as unavoidable as vector spaces.

    You know how you can divide the whole numbers into odd and even? And you can do some neat tricks with that for some problems? You can do that with every ring, using the kernel as a dividing point. This gives us information about how the ring is shaped, and what other structures might look like the ring. This often lets us turn proofs that might be hard into a collection of proofs on individual cases that are, at least, doable. Tricks about odd and even numbers become, in trained hands, subtle proofs of surprising results.
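
    As a toy version of that odd-and-even trick, here’s a short sketch (mine, not the essay’s): the map sending each whole number to its remainder mod 2 has the even numbers as its kernel, and arithmetic done on the remainders agrees with arithmetic done on the original numbers. That agreement is what lets one check on remainders stand in for infinitely many cases.

        # The parity map n -> n % 2: its kernel is the even numbers, and the arithmetic survives.
        def parity(n):
            return n % 2

        a, b = 12345, 6789
        assert parity(a + b) == (parity(a) + parity(b)) % 2   # addition passes to remainders
        assert parity(a * b) == (parity(a) * parity(b)) % 2   # so does multiplication

        # "Is the sum of two odd numbers always even?" collapses to one check on remainders:
        print((1 + 1) % 2 == 0)   # True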

    We see vector spaces and rings all over the place in mathematics. Some of that’s selection bias. Vector spaces capture a lot of what’s important about geometry. Rings capture a lot of what’s important about arithmetic. We have understandings of geometry and arithmetic that transcend even our species. Raccoons understand space. Crows understand number. When we look to do mathematics we look for patterns we understand, and these are major patterns we understand. And there are kernels that matter to each of them.

    Some mathematical ideas inspire metaphors to me. Kernels are one. Kernels feel to me like the process of holding a polarized lens up to a crystal. This lets one see how the crystal is put together. I realize writing this down that my metaphor is unclear: is the kernel the lens or the structure seen in the crystal? I suppose the function has to be the lens, with the kernel the crystallization planes made clear under it. It’s curious I had enjoyed this feeling about kernels and functions for so long without making it precise. Feelings about mathematical structures can be like that.

     
    • Barb Knowles 8:42 pm on Friday, 25 November, 2016 Permalink | Reply

      Don’t be mad if I tell you I’ve never had a feeling about a mathematical structure, lol. But it is immensely satisfying to solve an equation. I’m not a math person. As an English as a New Language teacher, I have to help kids with algebra at times. I usually break out in a sweat and am ecstatic when I can actually help them.


      • Joseph Nebus 11:24 pm on Friday, 25 November, 2016 Permalink | Reply

        I couldn’t be mad about that! I don’t have feelings like that about most mathematical constructs myself. There’s just a few that stand out for one reason or another.

        I am intrigued by the ways teaching differs for different subjects. How other people teach mathematics (or physics) interests me too, but I’ve noticed some strong cultural similarities across different departments and fields. Other subjects have a greater novelty value for me.


        • Barb Knowles 11:42 pm on Friday, 25 November, 2016 Permalink | Reply

          My advisor in college (Romance Language major) told me that I should do well in math because it is a language, formulas are like grammar and there is a lot of memorization. Not being someone with math skills, I replied ummmm. I don’t think she was impressed, lol.


          • Joseph Nebus 9:22 pm on Wednesday, 30 November, 2016 Permalink | Reply

            I’m not sure that I could go along with the idea of mathematics as a language. But there is something that seems like a grammar to formulas. That is, there are formulas that just look right or look wrong, even before exploring their content. Sometimes a formula just looks … ungrammatical. Sometimes that impression is wrong. But there is something that stands out.

            As for mathematics skills, well, I think people usually have more skill than they realize. There’s a lot of mathematics out there, much of it not related to calculations, and it’d be amazing if none of it intrigued you or came easily.


  • Joseph Nebus 6:00 pm on Wednesday, 2 November, 2016 Permalink | Reply
    Tags: eigenvalues, linear algebra

    The End 2016 Mathematics A To Z: Algebra 


    So let me start the End 2016 Mathematics A To Z with a word everybody figures they know. As will happen, everybody’s right and everybody’s wrong about that.

    Algebra.

    Everybody knows what algebra is. It’s the point where suddenly mathematics involves spelling. Instead of long division we’re on a never-ending search for ‘x’. Years later we pass along gifs of either someone saying “stop asking us to find your ex” or someone who’s circled the letter ‘x’ and written “there it is”. And make jokes about how we got through life without using algebra. And we know it’s the thing mathematicians are always doing.

    Mathematicians aren’t always doing that. I expect the average mathematician would say she almost never does that. That’s a bit of a fib. We have a lot of work where we do stuff that would be recognizable as high school algebra. It’s just we don’t really care about that. We’re doing that because it’s how we get the problem we are interested in done. The most recent few pieces in my “Why Stuff can Orbit” series include a bunch of high school algebra-style work. But that was just because it was the easiest way to answer some calculus-inspired questions.

    Still, “algebra” is a much-used word. It comes back around the second or third year of a mathematics major’s career. It comes in two forms in undergraduate life. One form is “linear algebra”, which is a great subject. That field’s about how stuff moves. You get to imagine space as this stretchy material. You can stretch it out. You can squash it down. You can stretch it in some directions and squash it in others. You can rotate it. These are simple things to build on. You can spend a whole career building on that. It becomes practical in surprising ways. For example, it’s the field of study behind finding equations that best match some complicated, messy real data.

    The second form is “abstract algebra”, which comes in about the same time. This one is alien and baffling for a long while. It doesn’t help that the books all call it Introduction to Algebra or just Algebra and all your friends think you’re slumming. The mathematics major stumbles through confusing definitions and theorems that ought to sound comforting. (“Fermat’s Little Theorem”? That’s a good thing, right?) But the confusion passes, in time. There’s a beautiful subject here, one of my favorites. I’ve talked about it a lot.

    We start with something that looks like the loosest cartoon of arithmetic. We get a bunch of things we can add together, and an ‘addition’ operation. This lets us do a lot of stuff that looks like addition modulo numbers. Then we go on to stuff that looks like picking up floor tiles and rotating them. Add in something that we call ‘multiplication’ and we get rings. This is a bit more like normal arithmetic. Add in some other stuff and we get ‘fields’ and other structures. We can keep falling back on arithmetic and on rotating tiles to build our intuition about what we’re doing. This trains mathematicians to look for particular patterns in new, abstract constructs.
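
    If you want to see the “loosest cartoon of arithmetic” in action, here’s a quick brute-force check (my own toy example, nothing from the essay) that the numbers 0 through 4, with addition mod 5, really do behave like a tidy self-contained system: closed, associative, with an identity and inverses.

        # Brute-force check that {0,...,4} under addition mod 5 forms a group.
        from itertools import product

        elements = range(5)
        add = lambda a, b: (a + b) % 5

        closed       = all(add(a, b) in elements for a, b in product(elements, repeat=2))
        associative  = all(add(add(a, b), c) == add(a, add(b, c))
                           for a, b, c in product(elements, repeat=3))
        has_identity = all(add(a, 0) == a for a in elements)
        has_inverses = all(any(add(a, b) == 0 for b in elements) for a in elements)

        print(closed, associative, has_identity, has_inverses)   # True True True True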

    Linear algebra is not an abstract-algebra sort of algebra. Sorry about that.

    And there’s another kind of algebra that mathematicians talk about. At least once they get into grad school they do. There’s a huge family of these kinds of algebras. The family trait for them is that they share a particular rule about how you can multiply their elements together. I won’t get into that here. There are many kinds of these algebras. One that I keep trying to study on my own and crash hard against is Lie Algebra. That’s named for the Norwegian mathematician Sophus Lie. Pronounce it “lee”, as in “leaning”. You can understand quantum mechanics much better if you’re comfortable with Lie Algebras and so now you know one of my weaknesses. Another kind is the Clifford Algebra. This lets us create something called a “hypercomplex number”. It isn’t much like a complex number. Sorry. Clifford Algebra does lend itself to a construct called spinors. These help physicists understand the behavior of bosons and fermions. Every bit of matter seems to be either a boson or a fermion. So you see why this is something people might like to understand.

    Boolean Algebra is the algebra of this type that a normal person is likely to have heard of. It’s about what we can build using two values and a few operations. Those values by tradition we call True and False, or 1 and 0. The operations we call things like ‘and’ and ‘or’ and ‘not’. It doesn’t sound like much. It gives us computational logic. Isn’t that amazing stuff?
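
    Here’s about the smallest possible demonstration of that computational logic, checking De Morgan’s laws over every combination of True and False. (The example is mine, not something from the essay.)

        # Check De Morgan's laws over all possible truth assignments.
        from itertools import product

        for p, q in product([True, False], repeat=2):
            assert (not (p and q)) == ((not p) or (not q))
            assert (not (p or q)) == ((not p) and (not q))
        print("De Morgan's laws hold for every input")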

    So if someone says “algebra” she might mean any of these. A normal person in a non-academic context probably means high school algebra. A mathematician speaking without further context probably means abstract algebra. If you hear something about “matrices” it’s more likely that she’s speaking of linear algebra. But abstract algebra can’t be ruled out yet. If you hear a word like “eigenvector” or “eigenvalue” or anything else starting “eigen” (or “characteristic”) she’s more probably speaking of linear algebra. And if there’s someone’s name before the word “algebra” then she’s probably speaking of the last of these. This is not a perfect guide. But it is the sort of context mathematicians expect other mathematicians to notice.

     
    • John Friedrich 2:13 am on Thursday, 3 November, 2016 Permalink | Reply

      The cruelest trick that happened to me was when a grad school professor labeled the Galois Theory class “Algebra”. Until then, the lowest score I’d ever gotten in a math class was a B. After that, I decided to enter the work force and abandon my attempts at a master’s degree.


      • Joseph Nebus 3:32 pm on Friday, 4 November, 2016 Permalink | Reply

        Well, it’s true enough that it’s part of algebra. But I’d feel uncomfortable plunging right into that without the prerequisites being really clear. I’m not sure I’ve even run into a nice clear pop-culture explanation of Galois Theory past some notes about how there’s two roots to a quadratic equation and see how they mirror each other.


  • Joseph Nebus 3:00 pm on Wednesday, 2 March, 2016 Permalink | Reply
    Tags: linear algebra

    A Leap Day 2016 Mathematics A To Z: Basis 


    Today’s glossary term is one that turns up in many areas of mathematics. But these all share some connotations. So I mean to start with the easiest one to understand.

    Basis.

    Suppose you are somewhere. Most of us are. Where is something else?

    That isn’t hard to answer if conditions are right. If we’re allowed to point and the something else is in sight, we’re done. It’s when pointing and following the line of sight breaks down that we’re in trouble. We’re also in trouble if we want to say how to get from that something to yet another spot. How can we guide someone from one point to another?

    We have a good answer from everyday life. We can impose some order, some direction, on space. We’re familiar with this from the cardinal directions. We say where things on the surface of the Earth are by how far they are north or south, east or west, from something else. The scheme breaks down a bit if we’re at the North or the South pole exactly, but there we can fall back on pointing.

    When we start using north and south and east and west as directions we are choosing basis vectors. Vectors tell us how far to move and in what direction. Suppose we have two vectors that aren’t pointing in the same direction. Then we can describe any two-dimensional movement using them. We can say “go this far in the direction of the first vector and also that far in the direction of the second vector”. With the cardinal directions, we consider north and east, or east and south, or south and west, or west and north to be a pair of vectors going in different directions.
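
    Here’s a short numerical sketch of that “go this far along one vector, that far along the other” idea, using numpy to solve for the two amounts. The particular vectors and the target point are mine, picked just for illustration.

        # Describing one spot with two different choices of basis vectors.
        import numpy as np

        east   = np.array([1.0, 0.0])
        north  = np.array([0.0, 1.0])
        target = np.array([3.0, 4.0])             # the spot we want to describe

        B = np.column_stack([east, north])        # basis vectors as columns
        print(np.linalg.solve(B, target))         # [3. 4.]: 3 paces east, 4 paces north

        # The same spot in a clumsier, but still legal, basis: northeast and north.
        northeast = np.array([1.0, 1.0])
        B2 = np.column_stack([northeast, north])
        print(np.linalg.solve(B2, target))        # [3. 1.]: 3 along northeast, then 1 more north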

    (North and south, in this context, are the same thing. “Go twenty paces north” says the same thing as “go negative twenty paces south”. Most mathematicians don’t pull this sort of stunt when telling you how to get somewhere unless they’re trying to be funny without succeeding.)

    A basis vector is just a direction, and distance in that direction, that we’ve decided to be a reference for telling different points in space apart. A basis set, or basis, is the collection of all the basis vectors we need. What do we need? We need enough basis vectors to get to all the points in whatever space we’re working with.

    (If you are going to ask whether “east” doesn’t point in different directions as we go around the surface of the Earth, you’re doing very well. Please pretend we never move so far from where we start that anyone could notice the difference. If you can’t do that, please pretend the Earth has been smooshed into a huge flat square with north at one end and we’re only just now noticing.)

    We are free to choose whatever basis vectors we like. The worst that can happen if we choose a lousy basis is that we have to write out more things than we otherwise would. Our work won’t be less true, it’ll just be more tedious. But there are some properties that often make for a good basis.

    One is that the basis should relate to the problem you’re doing. Suppose you were in one of mathematicians’ favorite places, midtown Manhattan. There is a compelling grid here of avenues running north-south and streets running east-west. (Broadway we ignore as an implementation error retained for reasons of backwards compatibility.) Well, we pretend they run north-south and east-west. They’re actually a good bit clockwise of north-south and east-west. They do that to better match the geography of the island. A “north” avenue runs about parallel to the way Manhattan’s long dimension runs. In the circumstance, it would be daft to describe directions by true north or true east. We would say to go so many streets “north” and so many avenues “east”.

    Purely mathematical problems aren’t concerned with streets and avenues. But there will often be preferred directions. Mathematicians often look at the way a process alters shapes or redirects forces. There’ll be some directions where the alterations are biggest. There’ll be some where the alterations are shortest. Those directions are probably good choices for a basis. They stand out as important.

    We also tend to like basis vectors that are a unit length. That is, their size is 1 in some convenient unit. That’s for the same reason it’s easier to say how expensive something is if it costs 45 dollars instead of nine five-dollar bills. Or if you’re told it was 180 quarter-dollars. The length of your basis vector is just a scaling factor. But the more factors you have to work with the more likely you are to misunderstand something.

    And we tend to like basis vectors that are perpendicular to one another. They don’t have to be. But if they are then it’s easier to divide up our work. We can study each direction separately. Mathematicians tend to like techniques that let us divide problems up into smaller ones that we can study separately.

    I’ve described basis sets using vectors. They have intuitive appeal. It’s easy to understand directions of things in space. But the idea carries across into other things. For example, we can build functions out of other functions. So we can choose a set of basis functions. We can multiply them by real numbers (scalars) and add them together. This makes whatever function we’re interested in into a kind of weighted average of basis functions.
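
    As a quick sketch of that weighted-sum idea (my example, with the monomials 1, x, and x² as the basis functions): pick a target function, and let least squares find the weights that come closest to reproducing it.

        # Build an approximation to cos(x) as a weighted sum of the basis functions 1, x, x^2.
        import numpy as np

        x = np.linspace(0.0, 1.0, 50)
        basis = np.column_stack([np.ones_like(x), x, x**2])   # each column is one basis function
        target = np.cos(x)

        weights, *_ = np.linalg.lstsq(basis, target, rcond=None)
        approx = basis @ weights

        print(weights)                           # the recipe: how much of 1, x, and x^2 to use
        print(np.max(np.abs(approx - target)))   # worst error of the weighted sum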

    Why do that? Well, again, we often study processes that change shapes and directions. If we choose a basis well, though, the process changes the basis vectors in easy to describe ways. And many interesting processes let us describe the changing of an arbitrary function as the weighted sum of the changes in the basis vectors. By solving a couple of simple problems we get the ability to solve every interesting problem.

    We can even define something that works like the angle between functions. And something that works a lot like perpendicularity for functions.

    And this carries on to other mathematical constructs. We look for ways to impose some order, some direction, on whatever structure we’re looking at. We’re often successful, and can work with unreal things using tools like those that let us find our place in a city.

     
  • Joseph Nebus 3:00 pm on Friday, 6 November, 2015 Permalink | Reply
    Tags: linear algebra

    The Set Tour, Part 7: Matrices 


    I feel a bit odd about this week’s guest in the Set Tour. I’ve been mostly concentrating on sets that get used as the domains or ranges for functions a lot. The ones I want to talk about here don’t tend to serve the role of domain or range. But they are used a great deal in some interesting functions. So I loosen my rule about what to talk about.

    R^(m x n) and C^(m x n)

    R^(m x n) might explain itself by this point. If it doesn’t, then this may help: the “x” here is the multiplication symbol. “m” and “n” are positive whole numbers. They might be the same number; they might be different. So, are we done here?

    Maybe not quite. I was fibbing a little when I said “x” was the multiplication symbol. R^(2 x 3) is not a longer way of saying R^6, an ordered collection of six real-valued numbers. The x does represent a kind of product, though. What we mean by R^(2 x 3) is an ordered collection, two rows by three columns, of real-valued numbers. Say the “x” here aloud as “by” and you’re pronouncing it correctly.

    What we get is called a “matrix”. If we put into it only real-valued numbers, it’s a “real matrix”, or a “matrix of reals”. Sometimes mathematical terminology isn’t so hard to follow. Just as with vectors, R^n, it matters just how the numbers are organized. R^(2 x 3) means something completely different from what R^(3 x 2) means. And swapping which positions the numbers in the matrix occupy changes what matrix we have, as you might expect.

    You can add together matrices, exactly as you can add together vectors. The same rules even apply. You can only add together two matrices of the same size. They have to have the same number of rows and the same number of columns. You add them by adding together the numbers in the corresponding slots. It’s exactly what you would do if you went in without preconceptions.

    You can also multiply a matrix by a single number. We called this scalar multiplication back when we were working with vectors. With matrices, we call this scalar multiplication. If it strikes you that we could see vectors as a kind of matrix, yes, we can. Sometimes that’s wise. We can see a vector as a matrix in the set R^(1 x n) or as one in the set R^(n x 1), depending on just what we mean to do.

    It’s trickier to multiply two matrices together. As with vectors, multiplying the numbers in corresponding positions together doesn’t give us anything. What we do instead is a time-consuming but not actually hard process. But according to its rules, something in R^(m x n) we can multiply by something in R^(n x k). “k” is another whole number. The second thing has to have exactly as many rows as the first thing has columns. What we get is a matrix in R^(m x k).

    I grant you maybe didn’t see that coming. Also a potential complication: if you can multiply something in R^(m x n) by something in R^(n x k), can you multiply the thing in R^(n x k) by the thing in R^(m x n)? … No, not unless k and m are the same number. Even if they are, you can’t count on getting the same product. Matrices are weird things this way. They’re also gateways to weirder things. But it is a productive weirdness, and I’ll explain why in a few paragraphs.
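
    Here’s the shape bookkeeping in a few lines of numpy, which is not in the original post but may make the row-and-column rule concrete. A 2-by-3 matrix times a 3-by-4 matrix gives a 2-by-4 matrix; the reversed order isn’t even defined, and even square matrices usually refuse to commute.

        # The multiplication shape rule, and a reminder that order matters.
        import numpy as np

        A = np.arange(6).reshape(2, 3)     # something in R^(2 x 3)
        B = np.arange(12).reshape(3, 4)    # something in R^(3 x 4)

        print((A @ B).shape)               # (2, 4)
        try:
            B @ A                          # 3-by-4 times 2-by-3: rows and columns don't line up
        except ValueError as err:
            print("can't multiply that way:", err)

        P = np.array([[0, 1], [1, 0]])
        Q = np.array([[1, 2], [3, 4]])
        print(np.array_equal(P @ Q, Q @ P))   # False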

    A matrix is a way of organizing terms. Those terms can be anything. Real matrices are surely the most common kind of matrix, at least in mathematical usage. Next in common use would be complex-valued matrices, much like how we get complex-valued vectors. These are written C^(m x n). A complex-valued matrix is different from a real-valued matrix. The terms inside the matrix can be complex-valued numbers, instead of real-valued numbers. Again, sometimes, these mathematical terms aren’t so tricky.

    I’ve heard occasionally of people organizing matrices of other sets. The notation is similar. If you’re building a matrix of “m” rows and “n” columns out of the things you find inside a set we’ll call H, then you write that as H^(m x n). I’m not saying you should do this, just that if you need to, that’s how to tell people what you’re doing.

    Now. We don’t really have a lot of functions that use matrices as domains, and I can think of fewer that use matrices as ranges. There are a couple of valuable ones, ones so valuable they get special names like “eigenvalue” and “eigenvector”. (Don’t worry about what those are.) They take in R^(m x n) or C^(m x n) and return a set of real- or complex-valued numbers, or real- or complex-valued vectors. Not even those, actually. Eigenvalues and eigenvectors are only meaningful if there are exactly as many rows as columns. That is, for R^(m x m) and C^(m x m). These are known as “square” matrices, just as you might guess if you were shaken awake and ordered to say what you guessed a “square matrix” might be.
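
    For the curious, here’s what asking for eigenvalues and eigenvectors looks like in practice, with numpy and a small square matrix of my own choosing. The defining property is that each eigenvector only gets stretched by its eigenvalue, never turned.

        # Eigenvalues and eigenvectors of a small square matrix.
        import numpy as np

        A = np.array([[2.0, 1.0],
                      [1.0, 2.0]])              # something in R^(2 x 2)

        values, vectors = np.linalg.eig(A)      # eigenvalues, and eigenvectors as columns
        print(values)                           # 3.0 and 1.0, in some order

        for lam, v in zip(values, vectors.T):
            print(np.allclose(A @ v, lam * v))  # True, True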

    They’re important functions. There are some other important functions, with names like “rank” and “condition number” and the like. But they’re not many. I believe they’re not even thought of as functions, any more than we think of “the length of a vector” as primarily a function. They’re just properties of these matrices, that’s all.

    So why are they worth knowing? Besides the joy that comes of knowing something, I mean?

    Here’s one answer, and the one that I find most compelling. There is cultural bias in this: I come from an applications-heavy mathematical heritage. We like differential equations, which study how stuff changes in time and in space. It’s very easy to go from differential equations to ordered sets of equations. The first equation may describe how the position of particle 1 changes in time. It might describe how the velocity of the fluid moving past point 1 changes in time. It might describe how the temperature measured by sensor 1 changes as it moves. It doesn’t matter. We get a set of these equations together and we have a majestic set of differential equations.

    Now, the dirty little secret of differential equations: we can’t solve them. Most interesting physical phenomena are nonlinear. Linear stuff is easy. Small change 1 has effect A; small change 2 has effect B. If we make small change 1 and small change 2 together, this has effect A plus B. Nonlinear stuff, though … it just doesn’t work. Small change 1 has effect A; small change 2 has effect B. Small change 1 and small change 2 together has effect … A plus B plus some weird A times B thing plus some effect C that nobody saw coming and then C does something with A and B and now maybe we’d best hide.

    There are some nonlinear differential equations we can solve. Those are the result of heroic work and brilliant insights. Compared to all the things we would like to solve there’s not many of them. Methods to solve nonlinear differential equations are as precious as ways to slay krakens.

    But here’s what we can do. What we usually like to know about in systems are equilibriums. Those are the conditions in which the system stops changing. Those are interesting. We can usually find those points by boring but not conceptually challenging calculations. If we can’t, we can declare x_0 represents the equilibrium. If we still care, we leave calculating its actual values to the interested reader or hungry grad student.

    But what’s really interesting is: what happens if we’re near but not exactly at the equilibrium? Sometimes, we stay near it. Think of pushing a swing. However good a push you give, it’s going to settle back to the boring old equilibrium of dangling straight down. Sometimes, we go racing away from it. Think of trying to balance a pencil on its tip; if we did this perfectly it would stay balanced. It never does. We’re never perfect, or there’s some wind or somebody walks by and the perfect balance is foiled. It falls down and doesn’t bounce back up. Sometimes, whether it stays near or goes away depends on what way it’s away from the equilibrium.

    And now we finally get back to matrices. Suppose we are starting out near an equilibrium. We can, usually, approximate the differential equations that describe what will happen. The approximation may only be good if we’re just a tiny bit away from the equilibrium, but that might be all we really want to know. That approximation will be some linear differential equations. (If they’re not, then we’re just wasting our time.) And that system of linear differential equations we can describe using matrices.

    If we can write what we are interested in as a set of linear differential equations, then we have won. We can use the many powerful tools of matrix arithmetic — linear algebra, specifically — to tell us everything we want to know about the system. We can say whether a small push away from the equilibrium stays small, or whether it grows, or whether it depends. We can say how fast the small push shrinks, or grows (for a while). We can say how the system will change, approximately.
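
    Here’s a sketch of that payoff with made-up numbers, not anything from the original post. A damped pendulum near its hanging-straight-down equilibrium linearizes to a system d/dt (angle, speed) = M (angle, speed); the eigenvalues of M all having negative real parts is the linear-algebra way of saying a small push dies away.

        # Stability of an equilibrium, read off the eigenvalues of the linearized system.
        import numpy as np

        g_over_L = 9.81    # gravity over pendulum length (assumed values)
        damping  = 0.5

        M = np.array([[0.0,       1.0],
                      [-g_over_L, -damping]])

        eigenvalues = np.linalg.eigvals(M)
        print(eigenvalues)                    # a complex-conjugate pair
        print(np.all(eigenvalues.real < 0))   # True: a small push away from equilibrium shrinks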

    This is what I love in matrices. It’s not everything there is to them. But it’s enough to make matrices important to me.

     
  • Joseph Nebus 11:19 pm on Wednesday, 13 August, 2014 Permalink | Reply
    Tags: linear algebra

    Combining Matrices And Model Universes 


    I would like to resume talking about matrices and really old universes and the way nucleosynthesis in these model universes causes atoms to keep settling down to a peculiar but unchanging distribution.

    I’d already described how a matrix offers a nice way to organize elements, and in ways that encode information about the context of the elements by where they’re placed. That’s useful and saves some writing, certainly, although by itself it’s not that interesting. Matrices start to get really powerful when, first, the elements being stored are things on which you can do something like arithmetic with pairs of them. Here I mostly just mean that you can add together two elements, or multiply them, and get back something meaningful.

    This typically means that the matrix is made up of a grid of numbers, although that isn’t actually required, just, really common if we’re trying to do mathematics.

    Then you get the ability to add together and multiply together the matrices themselves, turning pairs of matrices into some new matrix, and building something that works a lot like arithmetic on these matrices.

    Adding one matrix to another is done in almost the obvious way: add the element in the first row, first column of the first matrix to the element in the first row, first column of the second matrix; that’s the first row, first column of your new matrix. Then add the element in the first row, second column of the first matrix to the element in the first row, second column of the second matrix; that’s the first row, second column of the new matrix. Add the element in the second row, first column of the first matrix to the element in the second row, first column of the second matrix, and put that in the second row, first column of the new matrix. And so on.

    This means you can only add together two matrices that are the same size — the same number of rows and of columns — but that doesn’t seem unreasonable.

    You can also do something called scalar multiplication of a matrix, in which you multiply every element in the matrix by the same number. A scalar is just a number that isn’t part of a matrix. This multiplication is useful, not least because it lets us talk about how to subtract one matrix from another: to find the difference of the first matrix and the second, scalar-multiply the second matrix by -1, and then add the first to that product. But you can do scalar multiplication by any number, by two or minus pi or by zero if you feel like it.
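
    Those two operations are simple enough to write out in a few lines of plain Python, which might make the slot-by-slot recipe concrete. This is just my illustration of the procedure described above, nothing optimized.

        # Slot-by-slot matrix addition, and scalar multiplication of every slot by one number.
        def matrix_add(A, B):
            return [[A[i][j] + B[i][j] for j in range(len(A[0]))]
                    for i in range(len(A))]

        def scalar_multiply(r, A):
            return [[r * A[i][j] for j in range(len(A[0]))]
                    for i in range(len(A))]

        A = [[1, 2], [3, 4]]
        B = [[5, 6], [7, 8]]
        print(matrix_add(A, B))                        # [[6, 8], [10, 12]]
        print(matrix_add(A, scalar_multiply(-1, B)))   # A - B: [[-4, -4], [-4, -4]]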

    I should say something about notation. When we want to write out these kinds of operations efficiently, of course, we turn to symbols to represent the matrices. We can, in principle, use any symbols, but by convention a matrix usually gets represented with a capital letter, A or B or M or P or the like. So to add matrix A to matrix B, with the result being matrix C, we can write out the equation “A + B = C”, which is about as simple as we could hope to see. Scalars are normally written in lowercase letters, often Greek letters, if we don’t know what the number is, so that the scalar multiplication of the number r and the matrix A would be the product “rA”, and we could write the difference between matrix A and matrix B as “A + (-1)B” or “A – B”.

    Matrix multiplication, now, that is done by a process that sounds like doubletalk, and it takes a while of practice to do it right. But there are good reasons for doing it that way and we’ll get to one of those reasons by the end of this essay.

    To multiply matrix A and matrix B together, we do multiply various pairs of elements from both matrix A and matrix B. The surprising thing is that we also add together sets of these products, per this rule.

    Take the element in the first row, first column of A, and multiply it by the element in the first row, first column of B. Add to that the product of the element in the first row, second column of A and the second row, first column of B. Add to that total the product of the element in the first row, third column of A and the third row, first column of B, and so on. When you’ve run out of columns of A and rows of B, this total is the first row, first column of the product of the matrices A and B.

    Plenty of work. But we have more to do. Take the product of the element in the first row, first column of A and the element in the first row, second column of B. Add to that the product of the element in the first row, second column of A and the element in the second row, second column of B. Add to that the product of the element in the first row, third column of A and the element in the third row, second column of B. And keep adding those up until you’re out of columns of A and rows of B. This total is the first row, second column of the product of matrices A and B.
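
    Written as code, the doubletalk becomes a compact triple loop: row i of the first matrix against column j of the second, products summed, the total dropped into row i, column j of the answer. Again, this is my own plain-Python rendering of the procedure just described.

        # Row-by-column matrix multiplication, exactly as described above.
        def matrix_multiply(A, B):
            rows, inner, cols = len(A), len(B), len(B[0])
            assert len(A[0]) == inner, "A needs as many columns as B has rows"
            C = [[0] * cols for _ in range(rows)]
            for i in range(rows):
                for j in range(cols):
                    C[i][j] = sum(A[i][k] * B[k][j] for k in range(inner))
            return C

        A = [[1, 2, 3],
             [4, 5, 6]]            # 2 rows, 3 columns
        B = [[7, 8],
             [9, 10],
             [11, 12]]             # 3 rows, 2 columns
        print(matrix_multiply(A, B))   # [[58, 64], [139, 154]]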

    This does mean that you can multiply matrices of different sizes, provided the first one has as many columns as the second has rows. And the product may be a completely different size from the first or second matrices. It also means it might be possible to multiply matrices in one order but not the other: if matrix A has four rows and three columns, and matrix B has three rows and two columns, then you can multiply A by B, but not B by A.

    My recollection on learning this process was that this was crazy, and the workload ridiculous, and I imagine people who get this in Algebra II, and don’t go on to using mathematics later on, remember the process as nothing more than an unpleasant blur of doing a lot of multiplying and addition for some reason or other.

    So here is one of the reasons why we do it this way. Let me define two matrices:

    A = \begin{pmatrix} 3/4 & 0 & 2/5 \\ 1/4 & 3/5 & 2/5 \\ 0 & 2/5 & 1/5 \end{pmatrix}

    B = \begin{pmatrix} 100 \\ 0 \\ 0 \end{pmatrix}

    Then matrix A times B is

    AB = \begin{pmatrix} 3/4 \cdot 100 + 0 \cdot 0 + 2/5 \cdot 0 \\ 1/4 \cdot 100 + 3/5 \cdot 0 + 2/5 \cdot 0 \\ 0 \cdot 100 + 2/5 \cdot 0 + 1/5 \cdot 0 \end{pmatrix} = \begin{pmatrix} 75 \\ 25 \\ 0 \end{pmatrix}

    You’ve seen those numbers before, of course: the matrix A contains the probabilities I put in my first model universe to describe the chances that over the course of a billion years a hydrogen atom would stay hydrogen, or become iron, or become uranium, and so on. The matrix B contains the original distribution of atoms in the toy universe, 100 percent hydrogen and nothing of anything else. And the product of A and B was exactly the distribution after that first billion years: 75 percent hydrogen, 25 percent iron, and no uranium.

    If we multiply the matrix A by that product again — well, you should expect we’re going to get the distribution of elements after two billion years, that is, 56.25 percent hydrogen, 33.75 percent iron, 10 percent uranium, but let me write it out anyway to show:

    \begin{pmatrix} 3/4 & 0 & 2/5 \\ 1/4 & 3/5 & 2/5 \\ 0 & 2/5 & 1/5 \end{pmatrix}\begin{pmatrix} 75 \\ 25 \\ 0 \end{pmatrix} = \begin{pmatrix} 3/4 \cdot 75 + 0 \cdot 25 + 2/5 \cdot 0 \\ 1/4 \cdot 75 + 3/5 \cdot 25 + 2/5 \cdot 0 \\ 0 \cdot 75 + 2/5 \cdot 25 + 1/5 \cdot 0 \end{pmatrix} = \begin{pmatrix} 56.25 \\ 33.75 \\ 10 \end{pmatrix}

    And if you don’t know just what would happen if we multiplied A by that product, you aren’t paying attention.

    This also gives a reason why matrix multiplication is defined this way. The operation captures neatly the operation of making a new thing — in the toy universe case, hydrogen or iron or uranium — out of some combination of fractions of an old thing — again, the former distribution of hydrogen and iron and uranium.

    Or here’s another reason. Since this matrix A has three rows and three columns, you can multiply it by itself and get a matrix of three rows and three columns out of it. That matrix — which we can write as A^2 — then describes how two billion years of nucleosynthesis would change the distribution of elements in the toy universe. A times A times A would give three billion years of nucleosynthesis; A^10 ten billion years. The actual calculating of the numbers in these matrices may be tedious, but it describes a complicated operation very efficiently, which we always want to do.
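
    Here’s the whole toy universe run from the keyboard, using the same matrix and starting distribution as above, so you can watch the settling-down happen without doing the arithmetic by hand. (numpy is my choice here; the numbers are the essay’s.)

        # Iterate the billion-year transition matrix on the starting distribution.
        import numpy as np

        A = np.array([[3/4, 0,   2/5],
                      [1/4, 3/5, 2/5],
                      [0,   2/5, 1/5]])
        b = np.array([100.0, 0.0, 0.0])     # start: all hydrogen

        for step in range(1, 4):
            b = A @ b
            print(step, b)                  # 1 -> [75, 25, 0],  2 -> [56.25, 33.75, 10], ...

        # Ten billion years at once, via the tenth power of A.
        print(np.linalg.matrix_power(A, 10) @ np.array([100.0, 0.0, 0.0]))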

    I should mention another bit of notation. We usually use capital letters to represent matrices; but, a matrix that’s just got one column is also called a vector. That’s often written with a lowercase letter, with a little arrow above the letter, as in \vec{x} , or in bold typeface, as in x. (The arrows are easier to put in writing, the bold easier when you were typing on typewriters.) But if you’re doing a lot of writing this out, and know that (say) x isn’t being used for anything but vectors, then even that arrow or boldface will be forgotten. Then we’d write the product of matrix A and vector x as just Ax.  (There are also cases where you put a little caret over the letter; that’s to denote that it’s a vector that’s one unit of length long.)

    When you start writing vectors without an arrow or boldface you start to run the risk of confusing what symbols mean scalars and what ones mean vectors. That’s one of the reasons that Greek letters are popular for scalars. It’s also common to put scalars to the left and vectors to the right. So if one saw “rMx”, it would be expected that r is a scalar, M a matrix, and x a vector, and if they’re not then this should be explained in text nearby, preferably before the equations. (And of course if it’s work you’re doing, you should know going in what you mean the letters to represent.)

     