Tagged: A-To-Z

  • Joseph Nebus 6:00 pm on Monday, 25 September, 2017
    Tags: A-To-Z

    The Summer 2017 Mathematics A To Z: Young Tableau 


    I had never heard of today’s entry topic three months ago. Indeed, three weeks ago I was still making guesses about just what Gaurish, author of For the love of Mathematics, was asking about. It turns out to be maybe the grand union of everything that’s ever been in one of my A To Z sequences. I overstate, but barely.

    Young Tableau.

    The specific thing that a Young Tableau is is beautiful in its simplicity. It could almost be a recreational mathematics puzzle, except that it isn’t challenging enough.

    Start with a couple of boxes laid in a row. As many or as few as you like.

    Now set another row of boxes. You can have as many as the first row did, or fewer. You just can’t have more. Set the second row of boxes — well, your choice. Either below the first row, or else above. I’m going to assume you’re going below the first row, and will write my directions accordingly. If you do things the other way you’re following a common enough convention. I’m leaving it on you to figure out what the directions should be, though.

    Now add in a third row of boxes, if you like. Again, as many or as few boxes as you like. There can’t be more than there are in the second row. Set it below the second row.

    And a fourth row, if you want four rows. Again, no more boxes in it than the third row had. Keep this up until you’ve got tired of adding rows of boxes.

    How many boxes do you have? I don’t know. But take the numbers 1, 2, 3, 4, 5, and so on, up to whatever the count of your boxes is. Can you fill in one number for each box? So that the numbers are always increasing as you go left to right in a single row? And as you go top to bottom in a single column? Yes, of course. Go in order: ‘1’ for the first box you laid down, then ‘2’, then ‘3’, and so on, increasing up to the last box in the last row.

    Can you do it in another way? Any other order?

    Except for the simplest of arrangements, like a single row of four boxes or three rows of one box atop another, the answer is yes. There can be many of them, turns out. Seven boxes, arranged three in the first row, two in the second, one in the third, and one in the fourth, have 35 possible arrangements. It doesn’t take a very big diagram to get an enormous number of possibilities. Could be fun drawing an arbitrary stack of boxes and working out how many arrangements there are, if you have some time in a dull meeting to pass.
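    If the dull meeting runs long, the count can be checked by machine. What follows is a minimal sketch using the hook length formula, a standard counting result I haven't otherwise brought up in this essay. The function names `hook_lengths` and `count_fillings` are made up for the illustration.

```python
from math import factorial

def hook_lengths(shape):
    """Hook length of every box in a diagram.

    `shape` lists the row lengths from top to bottom, e.g. (3, 2, 1, 1).
    """
    rows = list(shape)
    if any(r < 1 for r in rows):
        raise ValueError("every row needs at least one box")
    if any(a < b for a, b in zip(rows, rows[1:])):
        raise ValueError("a row can't be longer than the row above it")
    if not rows:
        return []
    # Column j is as tall as the number of rows that reach past column j.
    cols = [sum(1 for r in rows if r > j) for j in range(rows[0])]
    hooks = []
    for i, r in enumerate(rows):
        for j in range(r):
            arm = r - j - 1        # boxes to the right, in the same row
            leg = cols[j] - i - 1  # boxes below, in the same column
            hooks.append(arm + leg + 1)
    return hooks

def count_fillings(shape):
    """Number of ways to fill the boxes so rows and columns both increase."""
    product = 1
    for h in hook_lengths(shape):
        product *= h
    return factorial(sum(shape)) // product

# The seven-box example from the text: rows of 3, 2, 1, and 1 boxes.
print(count_fillings((3, 2, 1, 1)))  # → 35
```

    The hook of a box is the box itself, everything to its right, and everything below it. The formula divides the factorial of the box count by the product of all the hook lengths.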

    Let me step away from filling boxes. In one of its later, disappointing, seasons Futurama finally did a body-swap episode. The gimmick: two bodies could only swap the brains within them one time. So would it be possible to put Bender’s brain back in his original body, if he and Amy (or whoever) had already swapped once? The episode drew minor amusement in mathematics circles, and a lot of amazement in pop-culture circles. The writer, a mathematics major, found a proof that showed it was indeed always possible, even after many pairs of people had swapped bodies. The idea that a theorem was created for a TV show impressed many people who think theorems are rarer and harder to create than they necessarily are.

    It was a legitimate theorem, and in a well-developed field of mathematics. It’s about permutation groups. These are the study of the ways you can swap pairs of things. I grant this doesn’t sound like much of a field. There is a surprising lot of interesting things to learn just from studying how stuff can be swapped, though. It’s even of real-world relevance. Most subatomic particles of a kind — electrons, top quarks, gluons, whatever — are identical to every other particle of the same kind. Physics wouldn’t work if they weren’t. What would happen if we swap the electron on the left for the electron on the right, and vice-versa? How would that change our physics?

    A chunk of quantum mechanics studies what kinds of swaps of particles would produce an observable change, and what kind of swaps wouldn’t. When the swap doesn’t make a change we can describe this as a symmetric operation. When the swap does make a change, that’s an antisymmetric operation. And — the Young Tableau that’s a single row of two boxes? That matches up well with this symmetric operation. The Young Tableau that’s two rows of a single box each? That matches up with the antisymmetric operation.

    How many ways could you set up three boxes, according to the rules of the game? A single row of three boxes, sure. One row of two boxes and a row of one box. Three rows of one box each. How many ways are there to assign the numbers 1, 2, and 3 to those boxes, and satisfy the rules? One way to do the single row of three boxes. Also one way to do the three rows of a single box. There’s two ways to do the one-row-of-two-boxes, one-row-of-one-box case.
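    Those three-box counts are small enough to check by brute force. Here is a rough sketch that tries every ordering of the numbers and keeps the ones that obey the rules. The names `partitions` and `fillings` are hypothetical, chosen just for this example, and the approach is far too slow for big diagrams.

```python
from itertools import permutations

def partitions(n, largest=None):
    """Yield every legal stack of rows holding n boxes, longest row first."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def fillings(shape):
    """List every filling of `shape` whose rows and columns both increase."""
    n = sum(shape)
    found = []
    for perm in permutations(range(1, n + 1)):
        # Cut the ordering into rows of the requested lengths.
        rows, start = [], 0
        for length in shape:
            rows.append(perm[start:start + length])
            start += length
        rows_ok = all(row[j] < row[j + 1]
                      for row in rows for j in range(len(row) - 1))
        cols_ok = all(rows[i][j] < rows[i + 1][j]
                      for i in range(len(rows) - 1)
                      for j in range(len(rows[i + 1])))
        if rows_ok and cols_ok:
            found.append(rows)
    return found

for shape in partitions(3):
    print(shape, len(fillings(shape)))
```

    For three boxes this should report one filling for the single row, two for the row of two over a row of one, and one for the column of three.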

    What if we have three particles? How could they interact? Well, all three could be symmetric with each other. This matches the first case, the single row of three boxes. All three could be antisymmetric with each other. This matches the three rows of one box. Or you could have two particles that are symmetric with each other and antisymmetric with the third particle. Or two particles that are antisymmetric with each other but symmetric with the third particle. Two ways to do that. Two ways to fill in the one-row-of-two-boxes, one-row-of-one-box case.

    This isn’t merely a neat, aesthetically interesting coincidence. I wouldn’t spend so much time on it if it were. There’s a matching here that’s built on something meaningful. The different ways to arrange numbers in a set of boxes like this pair up with a select, interesting set of matrices whose elements are complex-valued numbers. You might wonder who introduced complex-valued numbers, let alone matrices of them, into evidence. Well, who cares? We’ve got them. They do a lot of work for us. So much work they have a common name, the “symmetric group over the complex numbers”. As my leading example suggests, they’re all over the place in quantum mechanics. They’re good to have around in regular physics too, at least in the right neighborhoods.

    These Young Tableaus turn up over and over in group theory. They match up with polynomials, because yeah, everything is polynomials. But they turn out to describe polynomial representations of some of the superstar groups out there. Groups with names like the General Linear Group (square matrices), or the Special Linear Group (square matrices with determinant equal to 1), or the Special Unitary Group (that thing where quantum mechanics says there have to be particles whose names are obscure Greek letters with superscripts of up to five + marks). If you’d care for more, here’s a chapter by Dr Frank Porter describing, in part, how you get from Young Tableaus to the obscure baryons.

    Porter’s chapter also lets me tie this back to tensors. Tensors have varied ranks, the number of different indices you can have on the things. What happens when you swap pairs of indices in a tensor? How many ways can you swap them, and what does that do to what the tensor describes? Please tell me you already suspect this is going to match something in Young Tableaus. They do this by way of the symmetries and permutations mentioned above. But they are there.

    As I say, three months ago I had no idea these things existed. If I ever ran across them it was from seeing the name at MathWorld’s list of terms that start with ‘Y’. The article shows some nice examples (with each row atop the previous one) but doesn’t make clear how much stuff this subject runs through. I can’t fit everything into a reasonable essay. (For example: the number of ways to arrange, say, 20 boxes into rows meeting these rules is itself a partition problem. Partition problems are probability and statistical mechanics. Statistical mechanics is the flow of heat, and the movement of the stars in a galaxy, and the chemistry of life.) I am delighted by what does fit.

  • Joseph Nebus 6:00 pm on Friday, 22 September, 2017
    Tags: A-To-Z

    The Summer 2017 Mathematics A To Z: X 


    We come now almost to the end of the Summer 2017 A To Z. Possibly also the end of all these A To Z sequences. Gaurish, of For the love of Mathematics, proposed that I talk about the obvious logical choice. The last promising thing I hadn’t talked about. I have no idea what to do for future A To Z’s, if they’re even possible anymore. But that’s a problem for some later time.

    X.

    Some good advice that I don’t always take. When starting a new problem, make a list of all the things that seem likely to be relevant. Problems that are worth doing are usually about things. They’ll be quantities like the radius or volume of some interesting surface. The amount of a quantity under consideration. The speed at which something is moving. The rate at which that speed is changing. The length something has to travel. The number of nodes something must go across. Whatever. This all sounds like stuff from story problems. But most interesting mathematics is from a story problem; we want to know what this property is like. Even if we stick to a purely mathematical problem, there’s usually a couple of things that we’re interested in and that we describe. If we’re attacking the four-color map theorem, we have the number of territories to color. We have, for each territory, the number of territories that touch it.

    Next, select a name for each of these quantities. Write it down, in the table, next to the term. The volume of the tank is ‘V’. The radius of the tank is ‘r’. The height of the tank is ‘h’. The fluid is flowing in at a rate ‘r’. The fluid is flowing out at a rate, oh, let’s say ‘s’. And so on. You might take a moment to go through and think out which of these variables are connected to which other ones, and how. Volume, for example, is surely something to do with the radius times something to do with the height. It’s nice to have that stuff written down. You may not know the thing you set out to solve, but you at least know you’ve got this under control.

    I recommend this. It’s a good way to organize your thoughts. It establishes what things you expect you could know, or could want to know, about the problem. It gives you some hint how these things relate to each other. It sets you up to think about what kinds of relationships you figure to study when you solve the problem. It gives you a lifeline, when you’re lost in a sea of calculation. It’s reassurance that these symbols do mean something. Better, it shows what those things are.

    I don’t always do it. I have my excuses. If I’m doing a problem that’s very like one I’ve already recently done, the things affecting it are probably the same. The names to give these variables are probably going to be about the same. Maybe I’ll make a quick sketch to show how the parts of the problem relate. If it seems like less work to recreate my thoughts than to write them down, I skip writing them down. Not always good practice. I tell myself I can always go back and do things the fully right way if I do get lost. So far that’s been true.

    So, the names. Suppose I am interested in, say, the length of the longest rod that will fit around this hallway corridor. Then I am in a freshman calculus book, yes. Fine. Suppose I am interested in whether this pinball machine can be angled up the flight of stairs that has a turn in it. Then I will measure things like the width of the pinball machine. And the width of the stairs, and of the landing. I will measure this carefully. Pinball machines are heavy and there are many hilarious sad stories of people wedging them into hallways and stairwells four and a half stories up from the street. But: once I have identified, say, ‘width of pinball machine’ as a quantity of interest, why would I ever refer to it as anything but?

    This is no dumb question. It is always dangerous to lose the link between the thing we calculate and the thing we are interested in. Without that link we are less able to notice mistakes in either our calculations or the thing we mean to calculate. Without that link we can’t do a sanity check, that reassurance that it’s not plausible we just might fit something 96 feet long around the corner. Or that we estimated that we could fit something of six square feet around the corner. It is common advice in programming computers to always give variables meaningful names. Don’t write ‘T’ when ‘Total’ or, better, ‘Total_Value_Of_Purchase’ is available. Why do we disregard this in mathematics, and switch to ‘T’ instead?

    First reason is, well, try writing this stuff out. Your hand (h) will fall off (foff) in about fifteen minutes, twenty seconds. (15′ 20”). If you’re writing a program, the programming environment you have will auto-complete the variable after one or two letters in. Or you can copy and paste the whole name. It’s still good practice to leave a comment about what the variable should represent, if the name leaves any reasonable ambiguity.

    Another reason is that sure, we do specific problems for specific cases. But a mathematician is naturally drawn to thinking of general problems, in abstract cases. We see something in common between the problem “a length and a quarter of the length is fifteen feet; what is the length?” and the problem “a volume plus a quarter of the volume is fifteen gallons; what is the volume?”. That one is about lengths and the other about volumes doesn’t concern us. We see a saving in effort by separating the quantity of a thing from the kind of the thing. This restores danger. We must think, after we are done calculating, about whether the answer could make sense. But we can minimize that, we hope. At the least we can check once we’re done to see if our answer makes sense. Maybe even whether it’s right.

    For centuries, as the things we now recognize as algebra developed, we would use words. We would talk about the “thing” or the “quantity” or “it”. Some impersonal name, or convenient pronoun. This would often get shortened because anything you write often you write shorter. “Re”, perhaps. In the late 16th century we start to see the “New Algebra”. Here mathematics starts looking like … you know … mathematics. We start to see stuff like “addition” represented with the + symbol instead of an abbreviation for “addition” or a p with a squiggle over it or some other shorthand. We get equals signs. You start to see decimals and exponents. And we start to see letters used in place of numbers whose value we don’t know.

    There are a couple kinds of “numbers whose value we don’t know”. One is the number whose value we don’t know, but hope to learn. This is the classic variable we want to solve for. Another kind is the number whose value we don’t know because we don’t care. I mean, it has some value, and presumably it doesn’t change over the course of our problem. But it’s not like our work will be so different if, say, the tank is two feet high rather than four.

    Is there a problem? If we pick our letters to fit a specific problem, no. Presumably all the things we want to describe have some clear name, and some letter that best represents the name. It’s annoying when we have to consider, say, the pinball machine width and the corridor width. But we can work something out.

    But what about general problems?

    Is $m b \cos(e) + b^2 \log(y) = \sqrt{e}$ an easy problem to solve?

    If we want to figure what ‘m’ is, yes. Similarly ‘y’. If we want to know what ‘b’ is, it’s tedious, but we can do that. If we want to know what ‘e’ is? Run and hide, that stuff is crazy. If you have to, do it numerically and accept an estimate. Don’t try figuring what that is.
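    To make that difference concrete, here is a small sketch of the easy cases and the hard one. It assumes ‘log’ means the natural logarithm, and the function names, sample values, and the bracket handed to the bisection search are all my own choices for illustration. Solving for ‘m’ or ‘y’ is one line of rearranging. Solving for ‘e’ means hunting numerically for where the two sides agree.

```python
import math

def residual(m, b, e, y):
    """Left side minus right side of  m b cos(e) + b^2 log(y) = sqrt(e)."""
    return m * b * math.cos(e) + b**2 * math.log(y) - math.sqrt(e)

def solve_for_m(b, e, y):
    """Rearranged by hand; needs b and cos(e) to be nonzero."""
    return (math.sqrt(e) - b**2 * math.log(y)) / (b * math.cos(e))

def solve_for_y(m, b, e):
    """Rearranged by hand; needs b to be nonzero."""
    return math.exp((math.sqrt(e) - m * b * math.cos(e)) / b**2)

def solve_for_e(m, b, y, lo, hi, tol=1e-12):
    """Bisection: estimate an e between lo and hi that balances the equation.

    Assumes the residual changes sign between lo and hi.
    """
    f_lo = residual(m, b, lo, y)
    f_hi = residual(m, b, hi, y)
    if f_lo * f_hi > 0:
        raise ValueError("residual must change sign between lo and hi")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = residual(m, b, mid, y)
        if f_lo * f_mid <= 0:
            hi = mid                # the sign change is in the lower half
        else:
            lo, f_lo = mid, f_mid   # the sign change is in the upper half
    return (lo + hi) / 2

# With m = b = y = 1 the equation collapses to cos(e) = sqrt(e).
e_estimate = solve_for_e(1.0, 1.0, 1.0, lo=0.0, hi=1.0)
print(e_estimate, residual(1.0, 1.0, e_estimate, 1.0))
```

    Bisection only ever promises an estimate, and only finds one solution inside the bracket it is given. That is about the best to hope for with ‘e’ here.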

    And so we’ve developed conventions. There are some letters that, except in weird circumstances, are coefficients. They’re numbers whose value we don’t know, but either don’t care about or could look up. And there are some that, by default, are variables. They’re the ones whose value we want to know.

    These conventions started forming, as mentioned, in the late 16th century. François Viète here made a name that lasts to mathematics historians at least. His texts described how to do algebra problems in the sort of procedural methods that we would recognize as algebra today. And he had a great idea for these letters. Use the whole alphabet, if needed. Use the consonants to represent the coefficients, the numbers we know but don’t care what they are. Use the vowels to represent the variables, whose values we want to learn. So he would look at that equation and see right away: it’s a terrible mess. (I exaggerate. He doesn’t seem to have known the = sign, and I don’t know offhand when ‘log’ and ‘cos’ became common. But suppose the rest of the equation were translated into his terminology.)

    It’s not a bad approach. Besides the mnemonic value of consonant-coefficient, vowel-variable, it’s true that we usually have fewer variables than anything else. The more variables in a problem the harder it is. If someone expects you to solve an equation with ten variables in it, you’re excused for refusing. So five or maybe six or possibly seven choices for variables is plenty.

    But it’s not what we settled on. René Descartes had a better idea. He had a lot of them, but here’s one. Use the letters at the end of the alphabet for the unknowns. Use the letters at the start of the alphabet for coefficients. And that is, roughly, what we’ve settled on. In my example nightmare equation, we’d suppose ‘y’ to probably be the variable we want to solve for.

    And so, and finally, x. It is almost the variable. It says “mathematics” in only two strokes. Even π takes more writing. Descartes used it. We follow him. It’s way off at the end of the alphabet. It starts few words, very few things, almost nothing we would want to measure. (Xylem … mass? Flow? What thing is the xylem anyway?) Even mathematical dictionaries don’t have much to say about it. The letter transports almost no connotations, no messy specific problems to it. If it suggests anything, it suggests the horizontal coordinate in a Cartesian system. It almost is mathematics. It signifies nothing in itself, but long use has given it an identity as the thing we hope to learn by study.

    And pirate treasure maps. I don’t know when ‘X’ became the symbol of where to look for buried treasure. My casual reading suggests “never”. Treasure maps don’t really exist. Maps in general don’t work that way. Or at least didn’t before cartoons. X marking the spot seems to be the work of Robert Louis Stevenson, renowned for creating a fanciful map and then putting together a book to justify publishing it. (I jest. But according to Simon Garfield’s On The Map: A Mind-Expanding Exploration of the Way The World Looks, his map did get lost on the way to the publisher, and he had to re-create it from studying the text of Treasure Island. This delights me to no end.) It makes me wonder if Stevenson was thinking of x’s service in mathematics. But the advantages of x as a symbol are hard to ignore. It highlights a point clearly. It’s fast to write. Its use might be coincidence.

    But it is a letter that does a needed job really well.

     
    • gaurish 1:34 am on Saturday, 23 September, 2017

      Nice post! I also liked the Joe Vanilla comic. I find it very weird that the English language is biased towards certain letters (like S, E) and has very few words starting with X, Y and Z: https://en.oxforddictionaries.com/explore/which-letters-are-used-most Why would someone create a sound which he/she can’t pronounce to start a word? (I ask this question because English is not my native language).


  • Joseph Nebus 6:00 pm on Wednesday, 20 September, 2017
    Tags: A-To-Z, infinite descent

    The Summer 2017 Mathematics A To Z: Well-Ordering Principle 


    It’s the last full week of the Summer 2017 A To Z! Four more essays and I’ll have completed this project and curl up into a word coma. But I’m not there yet. Today’s request is another from Gaurish, who’s given me another delightful topic to write about. Gaurish hosts a fine blog, For the love of Mathematics, which I hope you’ve given a try.

    Well-Ordering Principle.

    An old mathematics joke. Or paradox, if you prefer. What is the smallest whole number with no interesting properties?

    Not one. That’s for sure. We could talk about one forever. It’s the first number we ever know. It’s the multiplicative identity. It divides into everything. It exists outside the realm of prime or composite numbers. It’s … all right, we don’t need to talk about one forever. Two? The smallest prime number. The smallest even number. The only even prime. The only … yeah, let’s move on. Three; the smallest odd prime number. Triangular number. One of only two prime numbers that isn’t one more or one less than a multiple of six. Let’s move on. Four. A square number. The smallest whole number that isn’t 1 or a prime. Five. Prime number. First sum of two different prime numbers. Part of the first twin prime pair. Six. Smallest perfect number. Smallest product of two different prime numbers. Let’s move on.

    And so on. Somewhere around 22 or so, the imagination fails and we can’t think of anything not-boring about this number. So we’ve found the first number that hasn’t got any interesting properties! … Except that being the smallest boring number must be interesting. So we have to note that this is otherwise the smallest boring number except for that bit where it’s interesting. On to 23, which used to be the default funny number. 24. … Oh, carry on. Maybe around 31 things settle down again. Our first boring number! Except that, again, being the smallest boring number is interesting. We move on to 32, 33, 34. When we find one that couldn’t be interesting, we find that’s interesting. We’re left to conclude there is no such thing as a boring number.

    This would be a nice thing to say for numbers that otherwise get no attention, if we pretend they can have hurt feelings. But we do have to admit, 1729 is actually only interesting because it’s a part of the legend of Srinivasa Ramanujan. Enjoy the silliness for a few paragraphs more.

    (This is, if I’m not mistaken, a form of the heap paradox. Don’t remember that? Start with a heap of sand. Remove one grain; you’ve still got a heap of sand. Remove one grain again. Still a heap of sand. Remove another grain. Still a heap of sand. And yet if you did this enough you’d leave one or two grains, not a heap of sand. Where does that change?)

    Another problem, something you might consider right after learning about fractions. What’s the smallest positive number? Not one-half, since one-third is smaller and still positive. Not one-third, since one-fourth is smaller and still positive. Not one-fourth, since one-fifth is smaller and still positive. Pick any number you like and there’s something smaller and still positive. This is a difference between the positive integers and the positive real numbers. (Or the positive rational numbers, if you prefer.) The thing positive integers have is obvious, but it is not a given.

    The difference is that the positive integers are well-ordered, while the positive real numbers aren’t. Well-ordering we build on ordering. Ordering is exactly what you imagine it to be. Suppose you can say, for any two things in a set, which one is less than another. A set is well-ordered if whenever you have a non-empty subset you can pick out the smallest element. Smallest means exactly what you think, too.

    The positive integers are well-ordered. And more. The way they’re set up, they have a property called the “well-ordering principle”. This means any non-empty set of positive integers has a smallest number in it.

    This is one of those principles that seems so obvious and so basic that it can’t teach anything interesting. That it serves a role in some proofs, sure, that’s easy to imagine. But something important?

    Look back to the joke/paradox I started with. It proves that every positive integer has to be interesting. Every number, including the ones we use every day. Including the ones that no one has ever used in any mathematics or physics or economics paper, and never will. We can avoid that paradox by attacking the vagueness of “interesting” as a word. Are you interested to know the 137th number you can write as the sum of two cubes in two different ways? Before you say ‘yes’, consider whether you could name it ten days after you’ve heard the number.

    (Granted, yes, it would be nice to know the 137th such number. But would you ever remember it? Would you trust that it’ll be on some Wikipedia page that somehow is never threatened with deletion for not being noteworthy? Be honest.)

    But suppose we have some property that isn’t so mushy. Suppose that we can describe it in some way that’s indexed by the positive integers. Furthermore, suppose that we show that in any set of the positive integers it must be true for the smallest number in that set. What do we know?

    We know that it must be true for all the positive integers. There’s a smallest positive integer. The positive integers have this well-ordering principle. So any non-empty subset of the positive integers has some smallest member. If the property failed anywhere, the set of numbers where it fails would have a smallest member, and we’ve shown the property has to hold there after all. And if we can show that something or other is always true for the smallest number in a subset of the positive integers, there you go.

    This technique we call, when it’s introduced, induction. It’s usually a baffling subject because it’s usually taught like this: suppose the thing you want to show is indexed to the positive integers. Show that it’s true when the index is ‘1’. Show that if the thing is true for an arbitrary index ‘n’, then you know it’s true for ‘n + 1’. It’s baffling because that second part is hard to visualize. The student makes a lot of mistakes in learning, on examples of what the sum of the first ‘N’ whole numbers or their squares or cubes are. I don’t think induction is ever taught in this well-ordering principle method. But it does get used in proofs, once you get to the part of analysis where you don’t have to interact with actual specific numbers much anymore.

    The well-ordering principle also gives us the method of infinite descent. You encountered this in learning proofs about, like, how the square root of two must be an irrational number. In this, you show that if something is true for some positive integer, then it must also be true for some other, smaller positive integer. And therefore some other, smaller positive integer again. And again, forever. But a set of positive integers has to have a smallest member, so the descent can’t go on forever, and the thing can’t have been true to start with.

    It keeps creeping in. The Fundamental Theorem of Arithmetic says that every positive whole number larger than one is a product of a unique string of prime numbers. (Well, the order of the primes doesn’t matter. 2 times 3 times 5 is the same number as 3 times 2 times 5, and so on.) The well-ordering principle guarantees you can factor numbers into a product of primes. Watch this slick argument.

    Suppose you have the set of whole numbers, larger than one, that aren’t the product of prime numbers. Suppose it isn’t empty. There must, by the well-ordering principle, be some smallest number in that set. Call that number ‘n’. We know that ‘n’ can’t be prime, because if it were, then that would be its prime factorization. So it must be the product of at least two other numbers. Let’s suppose it’s two numbers. Call them ‘a’ and ‘b’. So, ‘n’ is equal to ‘a’ times ‘b’.

    Well, ‘a’ and ‘b’ have to be less than ‘n’. So they’re smaller than the smallest number that isn’t a product of primes. So, ‘a’ is the product of some set of primes. And ‘b’ is the product of some set of primes. And so, ‘n’ has to equal the primes that factor ‘a’ times the primes that factor ‘b’. … Which is the prime factorization of ‘n’. So, ‘n’ can’t be in the set of numbers that don’t have prime factorizations. And so there can’t be any numbers that don’t have prime factorizations. It’s for the same reason we worked out there aren’t any numbers with nothing interesting to say about them.
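    The argument is close enough to a recipe that it can be run. This is a minimal sketch, with `prime_factors` a name I made up for it: if ‘n’ is prime, that’s its factorization, and otherwise split it into a smaller ‘a’ and ‘b’ and factor those. It’s an illustration of the proof’s shape, not an efficient way to factor big numbers.

```python
from math import isqrt

def prime_factors(n):
    """Return a list of primes whose product is n, for a whole number n > 1."""
    if n < 2:
        raise ValueError("n must be a whole number larger than one")
    for a in range(2, isqrt(n) + 1):
        if n % a == 0:
            b = n // a
            # Both a and b are smaller than n, so each one factors into
            # primes; stick the two lists together to factor n.
            return prime_factors(a) + prime_factors(b)
    # Nothing between 2 and sqrt(n) divides n, so n is prime and is its
    # own factorization.
    return [n]

print(prime_factors(60))  # → [2, 2, 3, 5]
```

    The recursion has to stop because every call hands on numbers smaller than the one it started with, which is the well-ordering principle doing its job again.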

    And isn’t it delightful to find so simple a principle can prove such specific things?

     
    • gaurish 1:23 pm on Thursday, 21 September, 2017

      My favourite application of Fermat’s method of infinite descent: x^4+y^4=z^4 has no non-zero integer solutions. We can apply this method not only to solve many other Diophantine equations, but also the famous divisibility question from IMO: https://math.stackexchange.com/q/1897942


      • Joseph Nebus 8:27 pm on Friday, 22 September, 2017

        Oh good heavens, I remember the 1988 International Mathematics Olympiad question. Not from solving it myself, but from seeing it passed around as the sort of thing to practice on if I wanted to try my last year in high school. It felt back then like the sort of problem and argument just transmitted from space. I’m still not sure I’m comfortable with it.


  • Joseph Nebus 6:00 pm on Monday, 18 September, 2017
    Tags: A-To-Z, Jacobian

    The Summer 2017 Mathematics A To Z: Volume Forms 


    I’ve been reading Elke Stangl’s Elkemental Force blog for years now. Sometimes I even feel social-media-caught-up enough to comment, or at least to like posts. This is relevant today as I discuss one of Stangl’s suggestions for my letter-V topic.

    Volume Forms.

    So sometime in pre-algebra, or early in (high school) algebra, you start drawing equations. It’s a simple trick. Lay down a coordinate system, some set of axes for ‘x’ and ‘y’ and maybe ‘z’ or whatever letters are important. Look to the equation, made up of x’s and y’s and maybe z’s and so on. Highlight all the points with coordinates whose values make the equation true. This is the logical basis for saying (e.g.) that the straight line “is” $y = 2x + 1$.

    A short while later, you learn about polar coordinates. Instead of using ‘x’ and ‘y’, you have ‘r’ and ‘θ’. ‘r’ is the distance from the center of the universe. ‘θ’ is the angle made with respect to some reference axis. It’s as legitimate a way of describing points in space. Some classrooms even have a part of the blackboard (whiteboard, whatever) with a polar-coordinates “grid” on it. This looks like the lines of a dartboard. And you learn that some shapes are easy to describe in polar coordinates. A circle, centered on the origin, is ‘r = 2’ or something like that. A line through the origin is ‘θ = 1’ or whatever. The line that we’d called $y = 2x + 1$ before? … That’s … some mess. And now $r = 2\theta + 1$ … that’s not even a line. That’s some kind of spiral. Two spirals, really. Kind of wild.

    And something to bother you a while. $y = 2x + 1$ is an equation that looks the same as $r = 2\theta + 1$. You’ve changed the names of the variables, but not how they relate to each other. But one is a straight line and the other a spiral thing. How can that be?
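    A quick numerical poke shows the two really are different shapes. In this sketch the same rule, “second variable equals two times the first plus one”, gets read once as Cartesian coordinates and once as polar coordinates. The function names and the sample values are arbitrary picks of mine.

```python
import math

def read_as_cartesian(ts):
    """Treat each t as x and 2t + 1 as y."""
    return [(t, 2 * t + 1) for t in ts]

def read_as_polar(ts):
    """Treat each t as the angle and 2t + 1 as r, then convert to (x, y)."""
    points = []
    for t in ts:
        r = 2 * t + 1
        points.append((r * math.cos(t), r * math.sin(t)))
    return points

def collinear(points, tol=1e-9):
    """True if every point lies on the line through the first two points."""
    (x0, y0), (x1, y1) = points[0], points[1]
    return all(
        abs((x1 - x0) * (y - y0) - (y1 - y0) * (x - x0)) <= tol
        for x, y in points[2:]
    )

samples = [0.0, 0.5, 1.0, 1.5, 2.0]
print(collinear(read_as_cartesian(samples)))  # → True
print(collinear(read_as_polar(samples)))      # → False
```

    Same symbols, same arithmetic, and yet only one set of points sits on a straight line.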

    The answer, ultimately, is that the letters in the equations aren’t these content-neutral labels. They carry meaning. ‘x’ and ‘y’ imply looking at space a particular way. ‘r’ and ‘θ’ imply looking at space a different way. A shape has different representations in different coordinate systems. Fair enough. That seems to settle the question.

    But if you get to calculus the question comes back. You can integrate over a region of space that’s defined by Cartesian coordinates, x’s and y’s. Or you can integrate over a region that’s defined by polar coordinates, r’s and θ’s. The first time you try this, you find … well, that any region easy to describe in Cartesian coordinates is painful in polar coordinates. And vice-versa. Way too hard. But if you struggle through all that symbol manipulation, you get … different answers. Eventually the calculus teacher has mercy and explains. If you’re integrating in Cartesian coordinates you need to use “dx dy”. If you’re integrating in polar coordinates you need to use “r dr dθ”. If you’ve never taken calculus, never mind what this means. What is important is that “r dr dθ” looks like three things multiplied together, while “dx dy” is two.

    We get this explained as a “change of variables”. If we want to go from one set of coordinates to a different one, we have to do something fiddly. The extra ‘r’ in “r dr dθ” is what we get going from Cartesian to polar coordinates. And we get formulas to describe what we should do if we need other kinds of coordinates. It’s some work that introduces us to the Jacobian, which looks like the most tedious possible calculation ever at that time. (In Intro to Differential Equations we learn we were wrong, and the Wronskian is the most tedious possible calculation ever. This is also wrong, but it might as well be true.) We typically move on after this and count ourselves lucky it got no worse than that.
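
    For the polar case the Jacobian calculation is short enough to write out. This is the standard textbook computation, assuming the usual relations between (x, y) and (r, θ):

```latex
x = r\cos\theta, \qquad y = r\sin\theta

\frac{\partial(x, y)}{\partial(r, \theta)}
  = \det \begin{pmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{pmatrix}
  = r\cos^2\theta + r\sin^2\theta
  = r

dx \, dy = r \, dr \, d\theta
```

    The determinant is where the extra 'r' comes from.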

    None of this is wrong, even from the perspective of more advanced mathematics. It’s not even misleading, which is a refreshing change. But we can look a little deeper, and get something good from doing so.

    The deeper perspective looks at “differential forms”. These are about how to encode information about how your coordinate system represents space. They’re tensors. I don’t blame you for wondering if they would be. A differential form uses interactions between some of the directions in a space. A volume form is a differential form that uses all the directions in a space. And satisfies some other rules too. I’m skipping those because some of the symbols involved I don’t even know how to look up, much less make WordPress present.

    What’s important is the volume form carries information compactly. As symbols it tells us that this represents a chunk of space that’s constant no matter what the coordinates look like. This makes it possible to do analysis on how functions work. It also tells us what we would need to do to calculate specific kinds of problem. This makes it possible to describe, for example, how something moving in space would change.

    The volume form, and the tools to do anything useful with it, demand a lot of supporting work. You can dodge having to explicitly work with tensors. But you’ll need a lot of tensor-related materials, like wedge products and exterior derivatives and stuff like that. If you’ve never taken freshman calculus don’t worry: the people who have taken freshman calculus never heard of those things either. So what makes this worthwhile?

    Yes, person who called out “polynomials”. Good instinct. Polynomials are usually a reason for any mathematics thing. This is one of maybe four exceptions. I have to appeal to my other standard answer: “group theory”. These volume forms match up naturally with groups. There’s not only information about how coordinates describe a space to consider. There’s ways to set up coordinates that tell us things.

    That isn’t all. These volume forms can give us new invariants. Invariants are what mathematicians say instead of “conservation laws”. They’re properties whose value for a given problem is constant. This can make it easier to work out how one variable depends on another, or to work out specific values of variables.

    For example, classical physics problems like how a bunch of planets orbit a sun often have a “symplectic manifold” that matches the problem. This is a description of how the positions and momentums of all the things in the problem relate. The symplectic manifold has a volume form. That volume is going to be constant as time progresses. That is, there’s this way of representing the positions and speeds of all the planets that does not change, no matter what. It’s much like the conservation of energy or the conservation of angular momentum. And this has practical value. It’s the subject that brought my and Elke Stangl’s blogs into contact, years ago. It also has broader applicability.

    There’s no way to provide an exact answer for the movement of, like, the sun and nine-ish planets and a couple major moons and all that. So there’s no known way to answer the question of whether the Earth’s orbit is stable. All the planets are always tugging one another, changing their orbits a little. Could this converge in a weird way suddenly, on geologic timescales? Might the planet go flying off out of the solar system? It doesn’t seem like the solar system could be all that unstable, or it would have already. But we can’t rule out that some freaky alignment of Jupiter, Saturn, and Halley’s Comet might tweak the Earth’s orbit just far enough for catastrophe to unfold. Granted there’s nothing we could do about the Earth flying out of the solar system, but it would be nice to know if we face it, we tell ourselves.

    But we can answer this numerically. We can set a computer to simulate the movement of the solar system. But there will always be numerical errors. For example, we can’t use the exact value of π in a numerical computation. 3.141592 (and more digits) might be good enough for projecting stuff out a day, a week, a thousand years. But if we’re looking at millions of years? The difference can add up. We can imagine compensating for not having the value of π exactly right. But what about compensating for something we don’t know precisely, like, where Jupiter will be in 16 million years and two months?

    Symplectic forms can help us. The volume form represented by this space has to be conserved. So we can rewrite our simulation so that these forms are conserved, by design. This does not mean we avoid making errors. But it means we avoid making certain kinds of errors. We’re more likely to make what we call “phase” errors. We predict Jupiter’s location in 16 million years and two months. Our simulation puts it thirty degrees farther in its circular orbit than it actually would be. This is a less serious mistake to make than putting Jupiter, say, eight-tenths as far from the Sun as it would really be.
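
    A solar system is too much for a short example, so here is a sketch with a stand-in: a single mass on a spring, with energy (q² + p²)/2 for position q and momentum p. That choice, the step size, and the function names are all assumptions of mine. The first stepping rule is the plain Euler method, which stretches areas in the (q, p) plane by a factor of 1 + h² every step. The second, often called symplectic Euler, updates the momentum first and then uses the new momentum to update the position, and that map preserves area exactly.

```python
def explicit_euler(q, p, h):
    """Plain Euler step; the map has determinant 1 + h**2, so phase-space area grows."""
    return q + h * p, p - h * q

def symplectic_euler(q, p, h):
    """Update p first, then q with the new p; the map has determinant exactly 1."""
    p_new = p - h * q
    q_new = q + h * p_new
    return q_new, p_new

def energy(q, p):
    return 0.5 * (q * q + p * p)

def run(step, steps=1000, h=0.1):
    """Start at q = 1, p = 0 and record the energy after every step."""
    q, p = 1.0, 0.0
    energies = [energy(q, p)]
    for _ in range(steps):
        q, p = step(q, p, h)
        energies.append(energy(q, p))
    return energies

euler_energy = run(explicit_euler)
symplectic_energy = run(symplectic_euler)
print(euler_energy[-1])
print(min(symplectic_energy), max(symplectic_energy))
```

    The true energy is 0.5 forever. The plain method's energy grows without limit, which is the spring's version of a planet drifting off. The area-preserving method's energy wobbles but stays near 0.5, even though the simulated mass may be a little ahead of or behind where it should be in its cycle.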

    Volume forms seem, at first, a lot of mechanism for a small problem. And, unfortunately for students, they are. They’re more trouble than they’re worth for changing Cartesian to polar coordinates, or similar problems. You know, ones that the student already has some feel for. They pay off on more abstract problems. Tracking the movement of a dozen interacting things, say, or describing a space that’s very strangely shaped. Those make the effort to learn about forms worthwhile.

     
    • elkement (Elke Stangl) 7:56 am on Tuesday, 19 September, 2017 Permalink | Reply

      That was again very intriguing! I only came across volume forms as a technical term when learning General Relativity. It seems in theoretical statistical mechanics it was not necessary to use that term – despite these fancy integrals in phase spaces with trillions of dimensions that you need when proving things about the canonical ensemble, then move on to the grand canonical one etc. I wonder why. Because the geometry of the spaces or volumes considered are fairly simple after all? “Only highly symmetrical N-balls”?

      And – continuing from my comment on your post on Topology – again, Landau and Lifshitz had pulled it off in GR without “Volume Forms” if I recall correctly. But they explain the integration of tensors of different ranks in spaces with different dimensions separately, which is still doable when having 3D space or 4D spacetime in mind (they also put much emphasis on working out the 3D space-only tensor part of the 4D tensors) – but perhaps exactly these insights and generalizations that you allude to are lost without introducing Volume Forms.

      • Joseph Nebus 8:23 pm on Friday, 22 September, 2017 Permalink | Reply

        I suspect that, for most problems, the geometry of the phase spaces in statistical mechanics is pretty simple. The problems I’ve worked on have been easy enough in that regard, although there is a lot in the field (especially non-equilibrium statistical mechanics) that I just don’t know.

        Probably it does all come back to the perception of how hard these things are to pick up versus how much one wants to do with them. Or an estimate of the audience, and how likely they are to be familiar with something, and how much book space they’re willing to spend bringing readers up to speed.

  • Joseph Nebus 6:00 pm on Friday, 15 September, 2017 Permalink | Reply
    Tags: 20 Questions, A-To-Z, snakes, solitaire, spirals

    The Summer 2017 Mathematics A To Z: Ulam’s Spiral 


    Gaurish, of For the love of Mathematics, asked me about one of those modestly famous (among mathematicians) mathematical figures. Yeah, I don’t have a picture of it. Too much effort. It’s easier to write instead.

    Ulam’s Spiral.

    Boredom is unfairly maligned in our society. I’ve said this before, but that was years ago, and I have some different readers today. We treat boredom as a terrible thing, something to eliminate. We treat it as a state in which nothing seems interesting. It’s not. Boredom is a state in which anything, however trivial, engages the mind. We would not count the tiles on the floor, or time the rocking of a chandelier, or wonder what fraction of solitaire games can be won if we were never bored. A bored mind is a mind ready to discover things. We should welcome the state.

    Several times in the 20th century Stanislaw Ulam was bored. I mention solitaire games because, according to Ulam, he spent some time in 1946 bored, convalescent and playing a lot of solitaire. He got to wondering what’s the probability a particular solitaire game is winnable? (He was specifically playing Canfield solitaire. The game’s also called Demon, Chameleon, or Storehouse, if Wikipedia is right.) What’s the chance the cards can’t be played right, no matter how skilled the player is? It’s a problem impossible to do exactly. Ulam was one of the mathematicians designing and programming the computers of the day.

    He, with John von Neumann, worked out how to get a computer to simulate many, many rounds of cards. They would get an answer that I have never seen given in any history of the field. The field is Monte Carlo simulations. It’s built on using random numbers to conduct experiments that approximate an answer. (They’re also what my specialty is in. I mention this for those who’ve wondered what, if any, mathematics field I do consider myself competent in. This is not it.) The chance of a winnable deal is about 71 to 72 percent, although actual humans can’t hope to do more than about 35 percent. My evening’s experience with this Canfield Solitaire game suggests the chance of winning is about zero.
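
    Canfield is too fiddly to simulate in a few lines, so here is a sketch of the Monte Carlo idea on a much simpler card question of my own choosing: shuffle a deck and ask whether any card lands back in the slot it started in. The function names, the trial count, and the seed are all assumptions made for illustration.

```python
import random

def has_fixed_card(deck_size, rng):
    """Shuffle a deck and report whether any card ends up in its starting slot."""
    deck = list(range(deck_size))
    rng.shuffle(deck)
    return any(card == slot for slot, card in enumerate(deck))

def estimate(trials=20000, deck_size=52, seed=2017):
    """Fraction of random shuffles with at least one card in its starting slot."""
    rng = random.Random(seed)    # fixed seed so the experiment can be repeated
    hits = sum(has_fixed_card(deck_size, rng) for _ in range(trials))
    return hits / trials

print(estimate())
```

    This question happens to have a known exact answer, very nearly 1 − 1/e or about 0.632, so the estimate can be checked. The appeal of the method is that it works just as well when nobody knows the exact answer.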

    In 1963, Ulam told Martin Gardner, he was bored again during a paper’s presentation. Ulam doodled, and doodled something interesting enough to have a computer doodle more than mere pen and paper could. It was interesting enough to feature in Gardner’s Mathematical Games column for March 1964. It started with what the name suggested, a spiral.

    Write down ‘1’ in the center. Write a ‘2’ next to it. This is usually done to the right of the ‘1’. If you want the ‘2’ to be on the left, or above, or below, fine, it’s your spiral. Write a ‘3’ above the ‘2’. (Or below if you want, or left or right if you’re doing your spiral that way. You’re tracing out a right angle from the “path” of numbers before that.) A ‘4’ to the left of that, a ‘5’ to the left of that, a ‘6’ under that, a ‘7’ under that, an ‘8’ to the right of that, and so on. A spiral, for as long as your paper or your patience lasts. Now draw a circle around the ‘2’. Or a box. Whatever. Highlight it. Also do this for the ‘3’, and the ‘5’, and the ‘7’ and all the other prime numbers. Do this for every prime number on your spiral. And look at what’s highlighted.
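
    If your patience runs out before your paper does, a computer can do the doodling. This is a minimal sketch assuming the usual layout: ‘2’ to the right of ‘1’, then turning counterclockwise. The names `ulam_spiral` and `render` are mine, and primes are marked with ‘#’.

```python
def is_prime(n):
    """Trial division; plenty fast for a doodle-sized spiral."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def ulam_spiral(size, start=1):
    """Return a size x size grid (size odd) with start at the centre, spiralling outward."""
    grid = [[0] * size for _ in range(size)]
    row = col = size // 2
    d_row, d_col = 0, 1                  # the first move is to the right
    n = start
    last = start + size * size - 1
    grid[row][col] = n
    run_length = 1
    while n < last:
        for _ in range(2):               # each run length is used twice
            for _ in range(run_length):
                if n == last:
                    return grid
                row, col = row + d_row, col + d_col
                n += 1
                grid[row][col] = n
            d_row, d_col = -d_col, d_row  # quarter turn counterclockwise
        run_length += 1
    return grid

def render(grid):
    """One character per number: '#' for a prime, '.' for anything else."""
    return "\n".join("".join("#" if is_prime(v) else "." for v in row) for row in grid)

print(render(ulam_spiral(41)))
```

    Passing a different `start`, like 41, gives the variation where some other number sits in the center.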

    It looks like …

    It’s …

    Well, it’s something.

    It’s hard to say what exactly. There’s a lot of diagonal lines to it. Not uninterrupted lines. Every diagonal line has some spottiness to it. There are blank regions too. There are some long stretches of numbers not highlighted, many of them horizontal or vertical lines with no prime numbers in them. Those stop too. The eye can’t help seeing clumps, especially. Imperfect diagonal stitching across the fabric of the counting numbers.

    Maybe seeing this is some fluke. Start with another number in the center. 2, if you like. 41, if you feel ambitious. Repeat the process. The details vary. But the pattern looks much the same. Regions of dense-packed broken diagonals, all over the plane.

    It begs us to believe there’s some knowable pattern here. That we could get an artist to draw a figure, with each spot in the figure corresponding to a prime number. This would be great. We know many things about prime numbers, but we don’t really have any system to generate a lot of prime numbers. Not much better than “here’s a thing, try dividing it”. Back in the 80s and 90s we had the big Fractal Boom. Everybody got computers that could draw what passed for pictures. And we could write programs that drew them. The Ulam Spiral was a minor but exciting prospect there. Was it a fractal? I don’t know. I’m not sure if anyone knows. (The spiral like you’d draw on paper wouldn’t be. The spiral that went out to infinitely large numbers might conceivably be.) It seemed plausible enough for computing magazines to be interested in. Maybe we could describe the pattern by something as simple as the Koch curve (that wriggly triangular snowflake shape). Or as easy to program as the Mandelbrot Set.

    We haven’t found one. As keeps happening with prime numbers, the answers evade us. We can understand why diagonals should appear. Write a polynomial of the form 4n^2 + b n + c . Evaluate it for n of 1, 2, 3, 4, and so on. Highlight those numbers. This will tend to highlight numbers that, in this spiral, are diagonal or horizontal or vertical lines. A lot of polynomials like this give a string of some prime numbers. But the polynomials all peter out. The lines all have interruptions.
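
    Here is a sketch of one such polynomial in action. I've picked b = −2 and c = 1, which is my own choice for illustration, and assumed the usual spiral with ‘2’ to the right of ‘1’ and ‘3’ above the ‘2’. Positions are counted in steps right (x) and up (y) from the ‘1’.

```python
def spiral_positions(count):
    """Map each number 1..count to its (x, y) offset from the centre of the spiral."""
    positions = {1: (0, 0)}
    x = y = 0
    dx, dy = 1, 0                        # the first move is to the right
    n, run_length = 1, 1
    while n < count:
        for _ in range(2):               # each run length is used twice
            for _ in range(run_length):
                if n == count:
                    break
                x, y = x + dx, y + dy
                n += 1
                positions[n] = (x, y)
            dx, dy = -dy, dx             # quarter turn counterclockwise
        run_length += 1
    return positions

def diagonal_value(n):
    """The polynomial 4n^2 + bn + c with b = -2 and c = 1."""
    return 4 * n * n - 2 * n + 1

positions = spiral_positions(4000)
values = [diagonal_value(n) for n in range(1, 31)]
for v in values[:5]:
    print(v, positions[v])
```

    Every value of this polynomial sits on the diagonal running up and to the right from the ‘1’. Some of those values are prime and some are not, which is the interrupted-line effect.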

    There are other patterns. One, predating Ulam’s boring paper by thirty years, was made by Laurence Klauber. Klauber was a herpetologist of some renown, if Wikipedia isn’t misleading me. It claims his Rattlesnakes: Their Habits, Life Histories, and Influence on Mankind is still an authoritative text. I don’t know and will defer to people versed in the field. It also credits him with several patents in electrical power transmission.

    Anyway, Klauber’s Triangle sets a ‘1’ at the top of the triangle. The numbers ‘2 3 4’ under that, with the ‘3’ directly beneath the ‘1’. The numbers ‘5 6 7 8 9’ beneath that, the ‘7’ directly beneath the ‘3’. ‘10 11 12 13 14 15 16’ beneath that, the ‘13’ underneath the ‘7’. And so on. Again highlight the prime numbers. You get again these patterns of dots and lines. Many vertical lines. Some lines in isometric view. It looks like strands of Morse Code.
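
    The triangle is even easier to lay out than the spiral, since each row ends on a square number. A minimal sketch, with function names of my own invention and ‘#’ marking the primes:

```python
def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def klauber_rows(num_rows):
    """Row k (counting from 1) holds the 2k - 1 numbers from (k - 1)**2 + 1 up to k**2."""
    return [list(range((k - 1) ** 2 + 1, k ** 2 + 1)) for k in range(1, num_rows + 1)]

def render_triangle(num_rows):
    lines = []
    for k, row in enumerate(klauber_rows(num_rows), start=1):
        marks = "".join("#" if is_prime(v) else "." for v in row)
        lines.append(" " * (num_rows - k) + marks)   # pad so each row centres under the apex
    return "\n".join(lines)

print(render_triangle(30))
```

    The middle entry of each row runs 1, 3, 7, 13, and so on, matching the description above.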

    In 1994 Robert Sacks created another variant. This one places the counting numbers on an Archimedean spiral. Space the numbers correctly and highlight the primes. The primes will trace out broken curves. Some are radial. Some spiral in (or out, if you rather). Some open up islands. The pattern looks like a Saul Bass logo for a “Nifty Fifty”-era telecommunications firm or maybe an airline.
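
    What “space the numbers correctly” means isn’t spelled out here. The sketch below assumes the spacing usually given for the Sacks spiral: the number n goes at distance √n from the center, after √n full turns, so that each full turn lands on a perfect square. Treat that rule, and the function name, as my assumptions.

```python
import math

def sacks_point(n):
    """Place n at radius sqrt(n) after sqrt(n) full turns (assumed spacing)."""
    r = math.sqrt(n)
    theta = 2 * math.pi * r
    return r * math.cos(theta), r * math.sin(theta)

def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# Coordinates of the primes, ready to hand to whatever plotting tool you like.
prime_points = [sacks_point(n) for n in range(1, 2000) if is_prime(n)]
print(len(prime_points))
```

    Under this spacing the perfect squares all line up along one ray out from the center.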

    You can do more. Draw a hexagonal spiral. Triangular ones. Other patterns of laying down numbers. You get patterns. The eye can’t help seeing order there. We can’t quite pin down what it is. Prime numbers keep evading our full understanding. Perhaps it would help to doodle a little during a tiresome conference call.


    Stanislaw Ulam did enough fascinating numerical mathematics that I could probably do a sequence just on his work. I do want to mention one thing. It’s part of information theory. You know the game Twenty Questions. Play that, but allow for some lying. The game is still playable. Ulam did not invent this game; Alfréd Rényi did. (I do not know anything else about Rényi.) But Ulam ran across Rényi’s game, and pointed out how interesting it was, and mathematicians paid attention to him.

     
    • gaurish 9:04 am on Saturday, 16 September, 2017 Permalink | Reply

      “Yeah, I don’t have a picture of it. Too much effort. It’s easier to write instead.” :-)
      My interest in the Ulam spiral was due to its relation with an open problem in number theory to find a non-linear, non-constant polynomial which can take prime values infinitely many times. I am glad that you mentioned it. (https://mathoverflow.net/q/98431/90056)

      • Joseph Nebus 11:30 pm on Sunday, 17 September, 2017 Permalink | Reply

        I’m glad to give satisfaction. Also, regarding your link: gosh, I haven’t thought about Bunyakovski (as I learned the spelling) in years. Wow.

  • Joseph Nebus 6:00 pm on Wednesday, 13 September, 2017 Permalink | Reply
    Tags: A-To-Z

    The Summer 2017 Mathematics A To Z: Topology 


    Today’s glossary entry comes from Elke Stangl, author of the Elkemental Force blog. I’ll do my best, although it would have made my essay a bit easier if I’d had the chance to do another topic first. We’ll get there.

    Topology.

    Start with a universe. Nice thing to have around. Call it ‘M’. I’ll get to why that name.

    I’ve talked a fair bit about weird mathematical objects that need some bundle of traits to be interesting. So this will change the pace some. Here, I request only that the universe have a concept of “sets”. OK, that carries a little baggage along with it. We have to have intersections and unions. Those come about from having pairs of sets. The intersection of two sets is all the things that are in both sets simultaneously. The union of two sets is all the things that are in one set, or the other, or both simultaneously. But it’s hard to think of something that could have sets that couldn’t have intersections and unions.

    So from your universe ‘M’ create a new collection of things. Call it ‘T’. I’ll get to why that name. But if you’ve formed a guess about why, then you know. So I suppose I don’t need to say why, now. ‘T’ is a collection of subsets of ‘M’. Now let’s suppose these four things are true.

    First. ‘M’ is one of the sets in ‘T’.

    Second. The empty set ∅ (which has nothing at all in it) is one of the sets in ‘T’.

    Third. Whenever two sets are in ‘T’, their intersection is also in ‘T’.

    Fourth. Whenever two (or more) sets are in ‘T’, their union is also in ‘T’.

    Got all that? I imagine a lot of shrugging and head-nodding out there. So let’s take that. Your universe ‘M’ and your collection of sets ‘T’ are a topology. And that’s that.

    Yeah, that’s never that. Let me put in some more text. Suppose we have a universe that consists of two symbols, say, ‘a’ and ‘b’. There’s four distinct topologies you can make of that. Take the universe plus the collection of sets ∅, {a}, {b}, and {a, b}. That’s a topology. Try it out. That’s the first collection you would probably think of.

    Here’s another collection. Take this two-thing universe and the collection of sets ∅, {a}, and {a, b}. That’s another topology and you might want to double-check that. Or there’s this one: the universe and the collection of sets ∅, {b}, and {a, b}. Last one: the universe and the collection of sets ∅ and {a, b} and nothing else. That one barely looks legitimate, but it is. Not a topology: the universe and the collection of sets ∅, {a}, and {b}.

    The number of topologies grows surprisingly with the number of things in the universe. Like, if we had three symbols, ‘a’, ‘b’, and ‘c’, there would be 29 possible topologies. The universe of the three symbols and the collection of sets ∅, {a}, {b, c}, and {a, b, c}, for example, would be a topology. But the universe and the collection of sets ∅, {a}, {b}, {c}, and {a, b, c} would not. It’s a good thing to ponder if you need something to occupy your mind while awake in bed.

    With four symbols, there’s 355 possibilities. Good luck working those all out before you fall asleep. Five symbols have 6,942 possibilities. You realize this doesn’t look like any expected sequence. After ‘4’ the count of topologies isn’t anything obvious like “two to the number of symbols” or “the number of symbols factorial” or something.
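
    For anyone who would rather sleep, a computer can do the counting by brute force. This sketch tries every collection of subsets that includes the empty set and the whole universe, and keeps the ones closed under intersection and union. The function names are mine. For a finite universe, checking pairs of sets is enough to cover the “two (or more)” in the fourth rule.

```python
from itertools import combinations

def is_topology(collection, universe):
    """Check the four rules for a collection (a set of frozensets) on a finite universe."""
    if frozenset() not in collection or universe not in collection:
        return False
    for a in collection:
        for b in collection:
            if a & b not in collection or a | b not in collection:
                return False
    return True

def count_topologies(symbols):
    """Count the collections of subsets of `symbols` that satisfy all four rules."""
    universe = frozenset(symbols)
    subsets = [frozenset(c) for size in range(len(symbols) + 1)
               for c in combinations(symbols, size)]
    # The empty set and the universe are mandatory; every other subset is optional.
    optional = [s for s in subsets if s and s != universe]
    count = 0
    for mask in range(1 << len(optional)):
        collection = {frozenset(), universe}
        for i, s in enumerate(optional):
            if (mask >> i) & 1:
                collection.add(s)
        if is_topology(collection, universe):
            count += 1
    return count

for symbols in ("ab", "abc", "abcd"):
    print(len(symbols), count_topologies(symbols))
```

    This gets slow fast. Five symbols means over a billion collections to try, so anything past four wants a cleverer approach than this one.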

    Are you getting ready to call me on being inconsistent? In the past I’ve talked about topology as studying what we can know about geometry without involving the idea of distance. How’s that got anything to do with this fiddling about with sets and intersections and stuff?

    So now we come to that name ‘M’, and what it’s finally mnemonic for. I have to touch on something Elke Stangl hoped I’d write about, but a letter someone else bid on first. That would be a manifold. I come from an applied-mathematics background so I’m not sure I ever got a proper introduction to manifolds. They appeared one day in the background of some talk about physics problems. I think they were introduced as “it’s a space that works like normal space”, and that was it. We were supposed to pretend we had always known about them. (I’m translating. What we were actually told would be that it “works like \mathbb{R}^3”. That’s how mathematicians say “like normal space”.) That was all we needed.

    Properly, a manifold is … eh. It’s something that works kind of like normal space. That is, it’s a set, something that can be a universe. And it has to be something we can define “open sets” on. The open sets for the manifold follow the rules I gave for a topology above. You can make a collection of these open sets. And the empty set has to be in that collection. So does the whole universe. The intersection of two open sets in that collection is itself in that collection. The union of open sets in that collection is in that collection. If all that’s true, and every small enough patch of the universe looks like a patch of normal space, then we have a manifold.

    And now the piece that makes every pop mathematics article about topology talk about doughnuts and coffee cups. It’s possible that two topologies might be homeomorphic to each other. “Homeomorphic” is a term of art. But you understand it if you remember that “morph” means shape, and suspect that “homeo” is probably close to “homogenous”. Two things being homeomorphic means you can match their parts up. In the matching there’s nothing left over in the first thing or the second. And the relations between the parts of the first thing are the same as the relations between the parts of the second thing.

    So. Imagine the snippet of the number line for the numbers larger than -π and smaller than π. Think of all the open sets you can use to cover that. It will have a set like “the numbers bigger than 0 and less than 1”. A set like “the numbers bigger than -π and smaller than 2.1”. A set like “the numbers bigger than 0.01 and smaller than 0.011”. And so on.

    Now imagine the points that exist on a circle, if you’ve omitted one point. Let’s say it’s the unit circle, centered on the origin, and that what we’re leaving out is the point that’s exactly to the left of the origin. The open sets for this are the arcs that cover some part of this punctured circle. There’s the arc that corresponds to the angles from 0 to 1 radian measure. There’s the arc that corresponds to the angles from -π to 2.1 radians. There’s the arc that corresponds to the angles from 0.01 to 0.011 radians. You see where this is going. You see why I say we can match those sets on the number line to the arcs of this punctured circle. There’s some details to fill in here. But you probably believe me this could be done if I had to.
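
    One way to fill in some of those details is to write down the matching itself. This sketch assumes the obvious choice: send the number t to the point at angle t on the unit circle, and come back using the angle of the point. The function names are mine.

```python
import math

def to_circle(t):
    """Send a number t with -pi < t < pi to the point at angle t on the unit circle."""
    return math.cos(t), math.sin(t)

def to_line(point):
    """Go back: recover the angle, between -pi and pi, of a point on the punctured circle."""
    x, y = point
    return math.atan2(y, x)

for t in (-3.0, -1.0, 0.0, 0.01, 2.1, 3.0):
    print(t, to_circle(t), to_line(to_circle(t)))
```

    The one point of the circle that nothing gets sent to is the one exactly to the left of the origin, which is the point we agreed to leave out.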

    There’s two (or three) great branches of topology. One is called “algebraic topology”. It’s the one that makes for fun pop mathematics articles about imaginary rubber sheets. It’s called “algebraic” because this field makes it natural to study the holes in a sheet. And those holes tend to form groups and rings, basic pieces of Not That Algebra. The field (I’m told) can be interpreted as looking at functors on groups and rings. This makes for some neat tying-together of subjects this A To Z round.

    The other branch is called “differential topology”, which is a great field to study because it sounds like what Mister Spock is thinking about. It inspires awestruck looks where saying you study, like, Bayesian probability gets blank stares. Differential topology is about differentiable functions on manifolds. This gets deep into mathematical physics.

    As you study mathematical physics, you stop worrying about ever solving specific physics problems. Specific problems are petty stuff. What you like is solving whole classes of problems. A steady trick for this is to try to find some properties that are true about the problem regardless of what exactly it’s doing at the time. This amounts to finding a manifold that relates to the problem. Consider a central-force problem, for example, with planets orbiting a sun. A planet can’t move just anywhere. It can only be in places and moving in directions that give the system the same total energy that it had to start. And the same linear momentum. And the same angular momentum. We can match these constraints to manifolds. Whatever the planet does, it does it without ever leaving these manifolds. To know the shapes of these manifolds — how they are connected — and what kinds of functions are defined on them tells us something of how the planets move.

    The maybe-third branch is “low-dimensional topology”. This is what differential topology is for two- or three- or four-dimensional spaces. You know, shapes we can imagine with ease in the real world. Maybe imagine with some effort, for four dimensions. This kind of branches out of differential topology because having so few dimensions to work in makes a lot of problems harder. We need specialized theoretical tools that only work for these cases. Is that enough to count as a separate branch? It depends what topologists you want to pick a fight with. (I don’t want a fight with any of them. I’m over here in numerical mathematics when I’m not merely blogging. I’m happy to provide space for anyone wishing to defend her branch of topology.)

    But each grows out of this quite general, quite abstract idea, also known as “point-set topology”, that’s all about sets and collections of sets. There is much that we can learn from thinking about how to collect the things that are possible.

     
    • gaurish 5:31 pm on Thursday, 14 September, 2017 Permalink | Reply

      I am really happy that you didn’t start with “Topology is also known as rubber sheet geometry”.

      • Joseph Nebus 1:46 am on Friday, 15 September, 2017 Permalink | Reply

        Although I never know precisely what I’m going to write before I put in the first paragraph, I did resolve that I was going to put off rubber sheets, as well as coffee cups, as long as I possibly could.

    • elkement (Elke Stangl) 7:33 am on Tuesday, 19 September, 2017 Permalink | Reply

      Great post! I was interested in your take as there are different ways to introduce manifolds in theoretical physics – I worked through different General Relativity textbooks / courses in parallel: One lecturer insisted that you need to treat that stuff “with the rigor of a mathematician”, and he went to great lengths to point out why a manifold is different from “normal space”. Others use the typical physicist’s approach of avoiding all specialized terms like fiber bundles and pushbacks, calling everything a “vector field” and “space”, only alluding to comprehensible familiar structures that sort of work in the same way – and somehow still managed to get across the messages and theorems in the end. But the rigorous lecturer said that it was exactly confusing the actual space (or spacetime) and a manifold that had stalled and confused Einstein for many years – so I suppose one should really learn the mathematics thoroughly here …
      On the other hand from what you say it seems to me that manifolds have sort of emerged as a tool in physics, and so Einstein had to create or inspire new mathematics as he went along … while today we can build on this and after we learned the rigorous stuff it is probably OK to fall back into the typical physicist’s mode. (Landau / Lifshitz are my favorite resource in the latter class – they treat GR very concisely in the volume on the Classical Theory of Fields, part of their 10-volume Course of Theoretical Physics – and they use hardly any specialized term related to topologies).

      • Joseph Nebus 8:10 pm on Friday, 22 September, 2017 Permalink | Reply

        Thank you so. Well, I’ve shared just how I got introduced to manifolds myself. I come from a more mathematical physics background and it’s a little surprising how often things would be introduced casually, trusting that the precise details would be filled in later. Sometimes they even were. I don’t think that’s idiosyncratic to my school, although it was a heavily applied-mathematics department. (The joke was that we had two tracks, Applied Mathematics and More Applied Mathematics.)

        I’m not very well-studied in the history of modern physics, at least not in how the mathematical models develop. But I think that you have a good read on it, that we started to get manifolds because they solved some very specific niche problems well. And then treated rigorously they promised more, and then people started looking for problems they could solve. I think that’s probably more common a history for mathematical structures than people realize. But, as you point out, that doesn’t mean everyone’s going to see the tool as worth learning how to use.

  • Joseph Nebus 6:00 pm on Monday, 11 September, 2017 Permalink | Reply
    Tags: A-To-Z

    The Summer 2017 Mathematics A To Z: Sárközy’s Theorem 


    Gaurish, of For the love of Mathematics, gives me another chance to talk number theory today. Let’s see how that turns out.

    Sárközy’s Theorem.

    I have two pieces to assemble for this. One is in factors. We can take any counting number, a positive whole number, and write it as the product of prime numbers. 2038 is equal to the prime 2 times the prime 1019. 4312 is equal to 2 raised to the third power times 7 raised to the second times 11. 1040 is 2 to the fourth power times 5 times 13. 455 is 5 times 7 times 13.

    There are many ways to divide up numbers like this. Here’s one. Is there a square number among its factors? 2038 and 455 don’t have any. They’re each a product of prime numbers that are never repeated. 1040 has a square among its factors. 2 times 2 divides into 1040. 4312, similarly, has a square: we can write it as 2 squared times 2 times 7 squared times 11. So that is my first piece. We can divide counting numbers into squarefree and not-squarefree.
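
    The squarefree test is easy to hand to a computer. A minimal sketch, with a function name of my own choosing, that just tries dividing by each square in turn:

```python
def is_squarefree(n):
    """True when no square bigger than 1 divides the counting number n."""
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True

for n in (2038, 4312, 1040, 455):
    print(n, is_squarefree(n))
```

    There is no need to restrict ‘d’ to primes. If the square of a composite number divides n, the square of one of its prime factors does too, and that one gets tried first.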

    The other piece is in binomial coefficients. These are numbers, often quite big numbers, that get dumped on the high school algebra student as she tries to work with some expression like (a + b)^n . They’re also dumped on the poor student in calculus, as something about Newton’s binomial coefficient theorem. Which we hear is something really important. In my experience it wasn’t explained why this should rank up there with, like, the differential calculus. (Spoiler: it’s because of polynomials.) But it’s got some great stuff to it.

    Binomial coefficients are among those utility players in mathematics. They turn up in weird places. In dealing with polynomials, of course. They also turn up in combinatorics, and through that, probability. If you run, for example, 10 experiments each of which could succeed or fail, the chance you’ll get exactly five successes is going to be proportional to one of these binomial coefficients. That they touch on polynomials and probability is a sign we’re looking at a thing woven into the whole universe of mathematics. We saw them some in talking, last A-To-Z around, about Yang Hui’s Triangle. That’s also known as Pascal’s Triangle. It has more names too, since it’s been found many times over.

    The theorem under discussion is about central binomial coefficients. These are one specific coefficient in a row. The ones that appear, in the triangle, along the line of symmetry. They’re easy to describe in formulas. For a whole number ‘n’ that’s greater than or equal to zero, evaluate what we call 2n choose n:

    {{2n} \choose{n}} =  \frac{(2n)!}{(n!)^2}

    If ‘n’ is zero, this number is \frac{0!}{(0!)^2} or 1. If ‘n’ is 1, this number is \frac{2!}{(1!)^2} or 2. If ‘n’ is 2, this number is \frac{4!}{(2!)^2} or 6. If ‘n’ is 3, this number is (sparing the formula) 20. The numbers keep growing. 70, 252, 924, 3432, 12870, and so on.

    So. 1 and 2 and 6 are squarefree numbers. Not much arguing that. But 20? That’s 2 squared times 5. 70? 2 times 5 times 7. 252? 2 squared times 3 squared times 7. 924? That’s 2 squared times 3 times 7 times 11. 3432? 2 cubed times 3 times 11 times 13; there’s a 2 squared in there. 12870? 2 times 3 squared times it doesn’t matter anymore. It’s not a squarefree number.

    There’s a bunch of not-squarefree numbers in there. The question: do we ever stop seeing squarefree numbers here?

    So here’s Sárközy’s Theorem. It says that this central binomial coefficient {{2n} \choose{n}} is never squarefree as long as ‘n’ is big enough. András Sárközy showed in 1985 that this was true. How big is big enough? … We have a bound, at least, for this theorem. If ‘n’ is larger than the number 2^{8000} then the corresponding coefficient can’t be squarefree. It might not surprise you that the formulas involved here feature the Riemann Zeta function. That always seems to turn up for questions about large prime numbers.

    That’s a common state of affairs for number theory problems. Very often we can show that something is true for big enough numbers. I’m not sure there’s a clear reason why. When numbers get large enough it can be more convenient to deal with their logarithms, I suppose. And those look more like the real numbers than the integers. And real numbers are typically easier to prove stuff about. Maybe that’s it. This is vague, yes. But to ask ‘why’ some things are easy and some are hard to prove is a hard question. What is a satisfying ‘cause’ here?

    It’s tempting to say that since we know this is true for all ‘n’ above a bound, we’re done. We can just test all the numbers below that bound, and the rest is done. You can do a satisfying proof this way: show that eventually the statement is true, and show all the special little cases before it is. This particular result is kind of useless, though. 2^{8000} is a number that’s something like 2,400 digits long. For comparison, the total number of things in the universe is something like a number about 80 digits long. Certainly not more than 90. It’d take too long to test all those cases.

    That’s all right. Since Sárközy’s proof in 1985 there’ve been other breakthroughs. In 1988 P Goetgheluck proved it was true for a big range of numbers: every ‘n’ that’s larger than 4 and less than 2^{42,205,184} . That’s a number something more than 12 million digits long. In 1991 I Vardi proved we had no squarefree central binomial coefficients for ‘n’ greater than 4 and less than 2^{774,840,978} , which is a number about 233 million digits long. And then in 1996 Andrew Granville and Olivier Ramaré showed directly that this was so for all ‘n’ larger than 4.
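
    The digit counts for those two bounds can be checked without ever writing the numbers out. A power of two, 2^k, has floor(k log10 2) + 1 decimal digits. This is a small sketch of that arithmetic; the function name is my own.

```python
import math


def digits_in_power_of_two(exponent):
    """Number of decimal digits of 2**exponent, without building the number."""
    return math.floor(exponent * math.log10(2)) + 1


for exponent in (42_205_184, 774_840_978):
    print(exponent, digits_in_power_of_two(exponent))
```

    Floating-point logarithms are plenty accurate for counting digits at this scale, though I would not trust them to settle the very last digit of a much larger exponent.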

    So that 70 that turned up just a few lines in is the last squarefree one of these coefficients.
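
    We can at least watch the theorem hold for small cases. The sketch below lists every ‘n’ up to 200 whose central binomial coefficient is squarefree. The cutoff of 200 and the function names are my choices, nothing from the papers. The check relies on every prime factor of (2n)!/(n!)^2 being no bigger than 2n, so it is enough to try squares of numbers up to 2n.

```python
import math


def central_binomial(n):
    """The coefficient '2n choose n'."""
    return math.comb(2 * n, n)


def has_square_factor(n):
    """True when some d*d with 2 <= d <= 2n divides 2n choose n."""
    coefficient = central_binomial(n)
    return any(coefficient % (d * d) == 0 for d in range(2, 2 * n + 1))


squarefree_cases = [n for n in range(0, 201) if not has_square_factor(n)]
print(squarefree_cases)  # prints [0, 1, 2, 4]
```

    The last entry, 4, is the one that gives 70. This is a spot check of a finite range, of course, not a proof.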

    Is this surprising? Maybe, maybe not. I’ll bet most of you didn’t have an opinion on this topic twenty minutes ago. Let me share something that did surprise me, and continues to surprise me. In 1974 David Singmaster proved that any integer divides almost all the binomial coefficients out there. “Almost all” is here a term of art, but it means just about what you’d expect. Imagine the giant list of all the numbers that can be binomial coefficients. Then pick any positive integer you like. The number you picked will divide into so many of the giant list that the exceptions won’t be noticeable. So that square numbers like 4 and 9 and 16 and 25 should divide into most binomial coefficients? … That’s to be expected, suddenly. Into the central binomial coefficients? That’s not so obvious to me. But then so much of number theory is strange and surprising and not so obvious.

     
    • gaurish 2:56 pm on Tuesday, 12 September, 2017 Permalink | Reply

      Nice exposition, like always :-) Another place where this central binomial coefficient appears is in Paul Erdős’s proof of Bertrand’s postulate: https://en.wikipedia.org/wiki/Proof_of_Bertrand%27s_postulate


      • Joseph Nebus 1:39 am on Friday, 15 September, 2017 Permalink | Reply

        Thank you. And I’m not sure how I overlooked that, since Bertrand’s Postulate is such a nice, easy-to-understand result. (The Postulate, which can be proven, is that there is always at least one prime between a whole number ‘n’ and its double, ‘2n’. With Sárközy’s Theorem you can show this has to be true for numbers larger than 468. For the numbers from 1 up to 468, you can just check each case. It’s time-consuming but not hard.)


  • Joseph Nebus 6:00 pm on Wednesday, 6 September, 2017 Permalink | Reply
    Tags: A-To-Z, carousels, Rye Playland

    The Summer 2017 Mathematics A To Z: Quasirandom numbers 


    Gaurish, host of For the love of Mathematics, gives me the excuse to talk about amusement parks. You may want to brace yourself. Yes, this essay includes a picture. It would have included a video if I had enough WordPress privileges for that.

    Quasirandom numbers.

    Think of a merry-go-round. Or carousel, if you prefer. I will venture a guess. You might like merry-go-rounds. They’re beautiful. They can evoke happy thoughts of childhood when they were a big ride it was safe to go on. But they don’t often make one think of thrills. They’re generally sedate things. They don’t need to be. There’s no great secret to making a carousel a thrill ride. They knew it a century ago, when all the great American carousels were carved. It’s simple. Make the thing spin fast enough, at the five or six rotations per minute the ride was made for. There are places that do this yet. There’s the Cedar Downs ride at Cedar Point, Sandusky, Ohio. There’s the antique carousel at Crossroads Village, a historical village/park just outside Flint, Michigan. There’s the Derby Racer at Playland in Rye, New York. There’s the carousel in the Merry-Go-Round Museum in Sandusky, Ohio. Any of them are great rides. Two of them have a special edge. I’ll come back to them.

    Playland's Derby Racer in motion, at night, featuring a ride operator leaning maybe twenty degrees inward.

    Rye (New York) Playland Amusement Park’s is the fastest carousel I’m aware of running. Riders are warned ahead of time to sit so they’re leaning to the left, and the ride will not get up to full speed until the ride operator checks everyone during the ride. To get some idea of its speed, notice the ride operator on the left and how far she leans. She’s not being dramatic; that’s the natural stance. Also the tilt in the carousel’s floor is not camera trickery; it does lean like that. If you have a spare day in the New York City area and any interest in classic amusement parks, this is worth the trip.

    Randomness is a valuable resource. We know it’s key to many things. We have major fields of mathematics built on it. We can understand the behavior of variables without ever knowing what value they have. All we need is to know the chance they might be in some particular range. This makes possible all kinds of problems too complicated to do otherwise. We know it’s critical. Quantum mechanics would not work without randomness. Without quantum mechanics, matter doesn’t work. And that’s true randomness, the kind where something is unpredictable. It’s not the kind of randomness we talk about when we ask, say, what’s the chance someone was born on a Tuesday. That’s mere hidden information: if we knew the month and date and year of a person’s birth we would know whether they were born Tuesday or not. We need more.

    So the trouble is actually getting a random number. Well, a sequence of randomly drawn numbers. We rarely need this if we’re doing analysis. We can understand how some process changes the shape of a distribution without ever using the distribution. We can take derivatives of a function without ever evaluating the original function, after all.

    But we do need randomly drawn numbers. We do too much numerical work with them. For example, it’s impossible to exactly integrate most functions. Numerical methods can take a ferociously long time to evaluate. A family of methods called Monte Carlo rely on randomly-drawn values to estimate the integral. The results are strikingly good for the work required. But they must have random numbers. The name “Monte Carlo” is not some cryptic code. It is an expression of how randomly drawn numbers make the tool work.
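
    Here is a minimal sketch of the idea, under assumptions of my own choosing: the function to integrate is a quarter circle, so the exact answer is pi/4, and the seed and sample count are arbitrary. Python’s built-in generator stands in for the random source.

```python
import math
import random


def monte_carlo_integral(f, a, b, samples, rng):
    """Estimate the integral of f over [a, b] as (b - a) times the mean of f
    at randomly drawn points."""
    total = sum(f(rng.uniform(a, b)) for _ in range(samples))
    return (b - a) * total / samples


def quarter_circle(x):
    """Height of the unit circle above x, for x between 0 and 1."""
    return math.sqrt(1.0 - x * x)


rng = random.Random(2017)  # fixed seed so the run can be repeated
estimate = monte_carlo_integral(quarter_circle, 0.0, 1.0, 100_000, rng)
print(estimate, math.pi / 4)
```

    The estimate wanders around the true value by an amount that shrinks like one over the square root of the number of samples. That slow but dependable shrinking is what the method is buying with all those random draws.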

    It’s hard to get random numbers. Consider: we can’t write an algorithm to do it. If we were to write one, then we’d be able to predict what the sequence of numbers was. We have some recourse. We could set up instruments to rely on the randomness that seems to be in the world. Thermal fluctuations, for example, created by processes outside any computer’s control, can give us a pleasant dose of randomness. If we need higher-quality random numbers than that we can go to exotic equipment. Geiger counters watching the decay of a not-alarmingly-radioactive sample. Cosmic ray detectors watching the sky.

    Or we can write something that produces numbers that look random enough. They won’t really be random, and if we wait long enough we’ll notice the sequence repeats itself. But if we only need, say, ten numbers, who cares if the sequence will repeat after ten million numbers? (We’ll surely need more than ten numbers. But we can postpone the repetition until we’ve drawn far more than ten million numbers.)
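
    A toy version makes the repetition easy to see. This sketch uses a linear congruential generator, one common recipe for such sequences, with deliberately tiny constants that I picked for illustration (multiplier 5, increment 3, modulus 16). Nobody should use these constants for real work.

```python
def lcg_step(state, multiplier=5, increment=3, modulus=16):
    """One step of a linear congruential generator."""
    return (multiplier * state + increment) % modulus


def sequence_and_period(seed):
    """Run the generator until a state recurs; return the states visited and
    the length of the cycle it has fallen into."""
    first_seen = {}
    sequence = []
    state = seed
    while state not in first_seen:
        first_seen[state] = len(sequence)
        sequence.append(state)
        state = lcg_step(state)
    return sequence, len(sequence) - first_seen[state]


sequence, period = sequence_and_period(seed=0)
print(sequence)
print(period)  # prints 16
```

    With a modulus of 16 the sequence can’t avoid repeating after 16 numbers. Practical generators follow the same logic with a vastly longer cycle, which is how the repetition gets postponed past anything we will ever draw.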

    Two of the carousels I’ve mentioned have an astounding property. The horses in a file move. I mean, relative to each other. Some horse will start the race in front of its neighbors; some will start behind. The four move forward and back thanks to a mechanism of, I am assured, staggering complexity. There are only three carousels in the world that have it. There’s Cedar Downs at Cedar Point in Sandusky, Ohio; the Derby Racer at Playland in Rye, New York; and the Derby Racer at Blackpool Pleasure Beach in Blackpool, England. The mechanism in Blackpool’s hasn’t operated in years. The one at Playland’s had not run in years, but was restored for the 2017 season. My love and I made a trip specifically to ride that. (You may have heard of a fire at the carousel in Playland this summer. This was part of the building for their other, non-racing, antique carousel. My last information was that the carousel itself was all right.)

    These racing derbies have the horses in a file move forward and back in a “random” way. It’s not truly random. If you knew exactly which gears were underneath each horse, and where in their rotations they were, you could say which horse was about to gain on its partners and which was about to fall back. But all that is concealed from the rider. The horse patterns will eventually, someday, repeat. If the gear cycles aren’t interrupted by maintenance or malfunctions. But nobody’s going to ride any horse long enough to notice. We have in these rides a randomness as good as what your computer makes, at least for the purpose it serves.

    Cedar Point's Cedar Downs during the race, showing the blur of the ride's motion.

    The racing nature of Playland’s and Cedar Point’s derby racers means that every ride includes exciting extra moments of overtaking or falling behind your partners to the side. It also means quarreling with your siblings about who really won the race because your horse started like four feet behind your sister’s and it ended only two feet behind so hers didn’t beat yours and, long story short, there was some punching, there was some spitting, and now nobody is gonna be allowed to get ice cream at the Carvel’s (for Playland) or cheese on a stick (for Cedar Point). This is the Cedar Downs ride at Cedar Point, and focuses on the poles that move the horses.

    What does it mean to look random? Some things seem obvious. All the possible numbers ought to come up, sooner or later. Any particular possible number shouldn’t repeat too often. Any particular possible number shouldn’t go too long without repeating. There shouldn’t be clumps of numbers; if, say, ‘4’ turns up, we shouldn’t see ‘5’ turn up right away all the time.

    We can make the idea of “looking” random quite literal. Suppose we’re selecting numbers from 0 through 9. We can draw the random numbers we’ve picked. Use the numbers as coordinates. Say we pick four digits: 1, 3, 9, and 0. Then draw the point that’s at x-coordinate 13, y-coordinate 90. Then the next four digits. Let’s say they’re 4, 2, 3, and 8. Then draw the point that’s at x-coordinate 42, y-coordinate 38. And repeat. What will this look like?

    If it clumps up, we probably don’t have good random numbers. If we see lines that points collect along, or avoid, there’s a good chance our numbers aren’t very random. If there’s whole blocks of space that they occupy, and others they avoid, we may have a defective source of random numbers. We should expect the points to cover a space pretty uniformly. (There are more rigorous, logically sound, methods. The eye can be fooled easily enough. But it’s the same principle. We have some test that notices clumps and gaps.) But …

    The thing is, there’s always going to be some clumps. There’ll always be some gaps. Part of randomness is that it forms patterns, or at least things that look like patterns to us. We can describe how big a clump (or gap; it’s the same thing, really) is for any particular quantity of randomly drawn numbers. If we see clumps bigger than that we can throw out the numbers as suspect. But … still …

    Toss a coin fairly twenty times, and there’s no reason it can’t turn up tails sixteen times. This doesn’t happen often, but it will happen sometimes. Just luck. This surplus of tails should evaporate as we take more tosses. That is, we most likely won’t see 160 tails out of 200 tosses. We certainly will not see 1,600 tails out of 2,000 tosses. We know this as the Law of Large Numbers. Wait long enough and weird fluctuations will average out.
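
    The chances here can be worked out exactly from binomial coefficients. This sketch computes the chance of getting at least a given number of tails from a fair coin; the function name is mine, and “at least” is my reading of what a surplus of tails means.

```python
from fractions import Fraction
from math import comb


def chance_of_at_least(tails, tosses):
    """Exact chance that a fair coin shows `tails` or more tails in `tosses`."""
    favorable = sum(comb(tosses, k) for k in range(tails, tosses + 1))
    return Fraction(favorable, 2 ** tosses)


for tails, tosses in ((16, 20), (160, 200), (1600, 2000)):
    print(tails, tosses, float(chance_of_at_least(tails, tosses)))
```

    Sixteen or more tails out of twenty comes to 6196 chances in 1,048,576, a bit over half a percent. The same eighty-percent share at 200 or 2,000 tosses is so unlikely that “will not see” is a fair summary.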

    What if we don’t have time, though? For coin-tossing that’s silly; of course we have time. But for Monte Carlo integration? It could take too long to be confident we haven’t got too-large gaps or too-tight clusters.

    This is why we take quasi-random numbers. We begin with what randomness we’re able to manage. But we massage it. Imagine our coins example. Suppose after ten fair tosses we noticed there had been eight tails turn up. Then we would start tossing less fairly, trying to make heads more common. We would be happier if there were 12 rather than 16 tails after twenty tosses.

    Draw the results. We get now a pattern that looks still like randomness. But it’s a finer sorting; it looks like static tidied up some. The quasi-random numbers are not properly random. Knowing that, say, the last several numbers were odd means the next one is more likely to be even, the Gambler’s Fallacy put to work. But in aggregate, we trust, we’ll be able to enjoy the speed and power of randomly-drawn numbers. It shows its strengths when we don’t know just how finely we must sample a range of numbers to get good, reliable results.
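
    The coin story is an analogy; working quasi-random sequences are usually built by a fixed recipe instead. One standard example, which I am swapping in here and which the essay does not itself describe, is the van der Corput sequence: write the counting numbers in binary and mirror the digits around the binary point. The sketch compares the biggest gap it leaves in the interval from 0 to 1 with the biggest gap left by the same number of ordinary pseudorandom draws. The seed and the count of 15 points are arbitrary.

```python
import random


def van_der_corput(index, base=2):
    """The index-th van der Corput point: the digits of index, reflected
    about the radix point."""
    value, denominator = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denominator *= base
        value += digit / denominator
    return value


def largest_gap(points):
    """Widest stretch of [0, 1] that contains none of the points."""
    edges = [0.0] + sorted(points) + [1.0]
    return max(right - left for left, right in zip(edges, edges[1:]))


quasi = [van_der_corput(i) for i in range(1, 16)]
rng = random.Random(2017)
pseudo = [rng.random() for _ in range(15)]
print(largest_gap(quasi), largest_gap(pseudo))
```

    The fifteen quasi-random points land exactly on the sixteenths, so no gap is wider than 1/16. That is the smallest the largest gap can possibly be for fifteen points, and randomly drawn points will all but always do worse.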

    To carousels. I don’t know whether the derby racers have quasirandom outcomes. I would find believable someone telling me that all the possible orderings of the four horses in any file are equally likely. To know would demand detailed knowledge of how the gearing works, though. Also probably simulations of how the system would work if it ran long enough. It might be easier to watch the ride for a couple of days and keep track of the outcomes. If someone wants to sponsor me doing a month-long research expedition to Cedar Point, drop me a note. Or just pay for my season pass. You folks would do that for me, wouldn’t you? Thanks.

     
    • gaurish 6:55 pm on Wednesday, 6 September, 2017 Permalink | Reply

      I wonder how do you get ideas for such analogies. Felt happy after reading this article:-)


      • Joseph Nebus 1:22 am on Friday, 8 September, 2017 Permalink | Reply

        This was actually an analogy I had waiting to be unleashed. I’d been thinking about using the racing derbies as an exciting case for pseudorandom numbers for ages, and this gave me the excuse to actually do it.

        If I figure out how to upload videos I might do another essay about making pseudorandom sequences of numbers. I’ve got the movie footage of the Cedar Point and the Playland derbies. (Blackpool’s I visited with a barely-functional camera; it had gotten soaked in heavy rains a few days earlier. So I have precious few pictures of Blackpool Pleasure Beach and d’Efteling in the Netherlands. But that just gives me a pretext to go back and revisit both places.)


  • Joseph Nebus 6:00 pm on Monday, 4 September, 2017 Permalink | Reply
    Tags: A-To-Z

    The Summer 2017 Mathematics A To Z: Prime Number 


    Gaurish, host of For the love of Mathematics, gives me another topic for today’s A To Z entry. I think the subject got away from me. But I also like where it got.

    Prime Number.

    Something about ‘5’ that you only notice when you’re a kid first learning about numbers. You know that it’s a prime number because it’s equal to 1 times 5 and nothing else. You also know that once you introduce fractions, it’s equal to all kinds of things. It’s 10 times one-half and it’s 15 times one-third and it’s 2.5 times 2 and many other things. Why, you might ask the teacher, is it a prime number if it’s got a million billion trillion different factors? And when every other whole number has as many factors? If you get to the real numbers it’s even worse yet, although when you’re a kid you probably don’t realize that. If you ask, the teacher probably answers that it’s only the whole numbers that count for saying whether something is prime or not. And, like, 2.5 can’t be considered anything, prime or composite. This satisfies the immediate question. It doesn’t quite get at the underlying one, though. Why do integers have prime numbers while real numbers don’t?

    To maybe have a prime number we need a ring. This is a creature of group theory, or what we call “algebra” once we get to college. A ring consists of a set of elements, and a rule for adding them together, and a rule for multiplying them together. And I want this ring to have a multiplicative identity. That’s some number which works like ‘1’: take something, multiply it by that, and you get that something back again. Also, I want this multiplication rule to commute. That is, the order of multiplication doesn’t affect what the result is. (If the order matters then everything gets too complicated to deal with.) Let me say the things in the set are numbers. It turns out (spoiler!) they don’t have to be. But that’s how we start out.

    Whether the numbers in a ring are prime or not depends on the multiplication rule. Let’s take a candidate number that I’ll call ‘a’ to make my writing easier. If the only numbers whose product is ‘a’ are the pair of ‘a’ and the multiplicative identity, then ‘a’ is prime. If there’s some other pair of numbers that give you ‘a’, then ‘a’ is not prime.

    The integers — the positive and negative whole numbers, including zero — are a ring. And they have prime numbers just like you’d expect, if we figure out some rule about how to deal with the number ‘-1’. There are many other rings. There’s a whole family of rings, in fact, so commonly used that they have shorthand. Mathematicians write them as “Zn”, where ‘n’ is some whole number. They’re the integers, modulo ‘n’. That is, they’re the whole numbers from ‘0’ up to the number ‘n-1’, whatever that is. Addition and multiplication work as they do with normal arithmetic, except that if the result is less than ‘0’ we add ‘n’ to it. If the result is more than ‘n-1’ we subtract ‘n’ from it. We repeat that until the result is something from ‘0’ to ‘n-1’, inclusive.

    (We use the letter ‘Z’ because it’s from the German word for numbers, and a lot of foundational work was done by German-speaking mathematicians. Alternatively, we might write this set as “In”, where “I” stands for integers. If that doesn’t satisfy, we might write this set as “Jn”, where “J” stands for integers. This is because it’s only very recently that we’ve come to see “I” and “J” as different letters rather than different ways to write the same letter.)

    These modulo arithmetics are legitimate ones, good reliable rings. They make us realize how strange prime numbers are, though. Consider the set Z4, where the only numbers are 0, 1, 2, and 3. 0 times anything is 0. 1 times anything is whatever you started with. 2 times 1 is 2. Obvious. 2 times 2 is … 0. All right. 2 times 3 is 2 again. 3 times 1 is 3. 3 times 2 is 2. 3 times 3 is 1. … So that’s a little weird. The only product that gives us 3 is 3 times 1. So 3’s a prime number here. 2 isn’t a prime number: 2 times 3 is 2. For that matter even 1 is a composite number, an unsettling consequence.

    Or then Z5, where the only numbers are 0, 1, 2, 3, and 4. Here, there are no prime numbers. Each number is the product of at least one pair of other numbers. In Z6 we start to have prime numbers again. But Z7? Z8? I recommend these questions to a night when your mind is too busy to let you fall asleep.
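
    If the night is not long enough, a short program can do the checking. This sketch applies the working definition used above, that a number is prime when the only pair multiplying to it is itself and 1. The function name is made up for illustration, and it assumes ‘n’ is at least 2.

```python
def primes_in_Zn(n):
    """Elements of Z_n that are prime in the sense used in this essay (n >= 2)."""
    primes = []
    for a in range(n):
        # Every unordered pair (x, y) whose product is a, modulo n.
        pairs = {(x, y) for x in range(n) for y in range(x, n) if (x * y) % n == a}
        trivial = (min(1, a), max(1, a))  # the pair made of 1 and a itself
        if pairs == {trivial}:
            primes.append(a)
    return primes


for n in (4, 5, 6):
    print(n, primes_in_Zn(n))
```

    This finds 3 as the lone prime of Z4, nothing at all in Z5, and 5 as the lone prime of Z6. Changing the tuple in the last loop answers the other cases.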

    Prime numbers depend on context. In the crowded universe of all the rational numbers, or all the real numbers, nothing is prime. In the more austere world of the Gaussian Integers, familiar friends like ‘3’ are prime again, although ‘5’ no longer is. We recognize that as the product of 2 + \imath and 2 - \imath , themselves now prime numbers.
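
    The split of 5 comes from writing it as a sum of two squares: (a + b \imath)(a - b \imath) equals a^2 + b^2. An ordinary prime stops being prime among the Gaussian integers exactly when it can be written that way. Here is a small sketch that searches for such a pair; the function name is my own invention.

```python
def two_square_split(p):
    """Return (a, b) with a >= b >= 1 and a*a + b*b == p, or None if there is
    no such pair."""
    b = 1
    while 2 * b * b <= p:
        a = b
        while a * a + b * b <= p:
            if a * a + b * b == p:
                return a, b
            a += 1
        b += 1
    return None


for p in (2, 3, 5, 7, 11, 13):
    print(p, two_square_split(p))
```

    For 5 it returns (2, 1), matching the 2 + \imath and 2 - \imath above. For 3, 7, and 11 it returns None, and those stay prime.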

    So given that these things do depend on context. Should we care? Or let me put it another way. Suppose we contact a wholly separate culture, one that we can’t have influenced and one not influenced by us. It’s plausible that they should have a mathematics. Would they notice prime numbers as something worth study? Or would they notice them the way we notice, say, pentagonal numbers, a thing that allows for some pretty patterns and that’s about it?

    Well, anything could happen, of course. I’m inclined to think that prime numbers would be noticed, though. They seem to follow naturally from pondering arithmetic. And if one has thought of rings, then prime numbers seem to stand out. The way that Zn behaves changes in important ways if ‘n’ is a prime number. Most notably, if ‘n’ is prime (among the whole numbers), then we can define something that works like division on Zn. If ‘n’ isn’t prime (again), we can’t. This stands out. There are a host of other intriguing results that all seem to depend on whether ‘n’ is a prime number among the whole numbers. It seems hard to believe someone could think of the whole numbers and not notice the prime numbers among them.
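
    “Something that works like division” means every number other than 0 has a partner that multiplies with it to give 1. This sketch looks for those partners by brute force. The function name is hypothetical, and the search is only sensible for small ‘n’.

```python
def inverses_in_Zn(n):
    """Map each nonzero a in Z_n to a b with a * b = 1 modulo n, or to None
    when no such b exists."""
    table = {}
    for a in range(1, n):
        table[a] = next((b for b in range(1, n) if (a * b) % n == 1), None)
    return table


print(inverses_in_Zn(7))
print(inverses_in_Zn(8))
```

    In Z7 every nonzero number has a partner, so dividing by it makes sense. In Z8 the numbers 2, 4, and 6 have none, and there is nothing to divide by.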

    And they do stand out, as these reliably peculiar things. Many things about them (in the whole numbers) are easy to prove. That there are infinitely many, for example, you can prove to a child. And there are many things we have no idea how to prove. That there are infinitely many primes which are exactly two more than another prime, for example. Any child can understand the question. The one who can prove it will win what fame mathematicians enjoy. If it can be proved.

    They turn up in strange, surprising places. Just in the whole numbers we find some patches where there are many prime numbers in a row (Forty percent of the numbers 1 through 10!). We can find deserts; we know of a stretch of 1,113,106 numbers in a row without a single prime among them. We know it’s possible to find prime deserts as vast as we want. Say you want a gap between primes of at least size N. Then look at the numbers (N+1)! + 2, (N+1)! + 3, (N+1)! + 4, and so on, up to (N+1)! + N+1. None of those can be prime numbers. You must have a gap at least the size N. It may be larger; how do we know that (N+1)! + 1 is a prime number?
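
    The reason none of those can be prime is that (N+1)! + k is divisible by k whenever k is between 2 and N+1. Here is a sketch that builds such a desert for a few small N and confirms there is no prime in it. The names are mine, and trial division limits it to small cases.

```python
from math import factorial


def is_prime(n):
    """Trial-division primality test, fine for the small numbers used here."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def desert(gap_size):
    """The numbers (N+1)! + 2 through (N+1)! + (N+1), where N is gap_size."""
    base = factorial(gap_size + 1)
    return [base + k for k in range(2, gap_size + 2)]


for gap_size in (3, 5, 7):
    numbers = desert(gap_size)
    print(gap_size, numbers, any(is_prime(m) for m in numbers))
```

    For N of 3 this gives 26, 27, 28. These factorial deserts are rarely the first gap of their size, only the easiest ones to point at.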

    No telling. Well, we can check. See if any prime number divides into (N+1)! + 1. This takes a long time to do if N is all that big. There are no formulas we know that will make this easy or quick.

    We don’t call it a “prime number” if it’s in a ring that isn’t enough like the numbers. Fair enough. We shift the name to “prime element”. “Element” is a good generic name for a thing whose identity we don’t mean to pin down too closely. I’ve talked about the Gaussian Primes already, in an earlier essay and earlier in this essay. We can make a ring out of the polynomials whose coefficients are all integers. In that, x^2 + 1 is a prime. So is x^2 - 2 . If this hasn’t given you some ideas what other polynomials might be primes, then you have something else to ponder while trying to sleep. Thinking of all the prime polynomials is likely harder than you can do, though.

    Prime numbers seem to stand out, obvious and important. Humans have known about prime numbers for as long as we’ve known about multiplication. And yet there is something obscure about them. If there are cultures completely independent of our own, do they have insights which make prime numbers not such occult figures? How different would the world be if we knew all the things we now wonder about primes?

     
    • gaurish 1:51 am on Tuesday, 5 September, 2017 Permalink | Reply

      When I submitted this topic I didn’t expect algebraic number theory since for most people, prime numbers= analytic number theory. I really enjoyed this discussion about rings and modulo. My favourite statement: “To maybe have a prime number we need a ring. “


      • Joseph Nebus 1:19 am on Friday, 8 September, 2017 Permalink | Reply

        Thank you. Most of the time I spent preparing this was in thinking about what there was to say about primes that wasn’t sieves and cryptography. Once I thought about how ‘5’ isn’t always prime that’s when I knew I had it.


  • Joseph Nebus 6:00 pm on Saturday, 2 September, 2017 Permalink | Reply
    Tags: A-To-Z

    The Summer 2017 Mathematics A To Z: Open Set 


    Today’s glossary entry is another request from Elke Stangl, author of the Elkemental Force blog. I’m hoping this also turns out to be a well-received entry. Half of that is up to you, the kind reader. At least I hope you’re a reader. It’s already gone wrong, as it was supposed to be Friday’s entry. I discovered I hadn’t actually scheduled it while I was too far from my laptop to do anything about that mistake. This spoils the nice Monday-Wednesday-Friday routine of these glossary entries that dates back to the first one I ever posted and just means I have to quit forever and not show my face ever again. Sorry, Ulam Spiral. Someone else will have to think of you.

    Open Set.

    Mathematics likes to present itself as being universal truths. And it is. At least if we allow that the rules of logic by which mathematics works are universal. Suppose them to be true and the rest follows. But we start out with intuition, with things we observe in the real world. We’re happy when we can remove the stuff that’s clearly based on idiosyncratic experience. We find something that’s got to be universal.

    Sets are pretty abstract things, as mathematicians use the term. They get to be hard to talk about; we run out of simpler words that we can use. A set is … a bunch of things. The things are … stuff that could be in a set, or else that we’d rule out of a set. We can end up better understanding things by drawing a picture. We draw the universe, which is a rectangular block, sometimes with dashed lines as the edges. The set is some blotch drawn on the inside of it. Some shade it in to emphasize which stuff we want in the set. If we need to pick out a couple things in the universe we drop in dots or numerals. If we’re rigorous about the drawing we could create a Venn Diagram.

    When we do this, we’re giving up on the pure mathematical abstraction of the set. We’re replacing it with a territory on a map. Several territories, if we have several sets. The territories can overlap or be completely separate. We’re subtly letting our sense of geography, our sense of the spaces in which we move, infiltrate our understanding of sets. That’s all right. It can give us useful ideas. Later on, we’ll try to separate out the ideas that are too bound to geography.

    A set is open if whenever you’re in it, you can’t be on its boundary. We never quite have this in the real world, with territories. The border between, say, New Jersey and New York becomes this infinitesimally slender thing, as wide in space as midnight is in time. But we can, with some effort, imagine the state. Imagine being as tiny in every direction as the border between two states. Then we can imagine the difference between being on the border and being away from it.

    And not being on the border matters. If we are not on the border we can imagine the problem of getting to the border. Pick any direction; we can move some distance while staying inside the set. It might be a lot of distance, it might be a tiny bit. But we stay inside however we might move. If we are on the border, then there’s some direction in which any movement, however small, drops us out of the set. That’s a difference in kind between a set that’s open and a set that isn’t.

    I say “a set that’s open and a set that isn’t”. There are such things as closed sets. A set doesn’t have to be either open or closed. It can be neither, a set that includes some of its borders but not other parts of it. It can even be both open and closed simultaneously. The whole universe, for example, is both an open and a closed set. The empty set, with nothing in it, is both open and closed. (This looks like a semantic trick. OK, if you’re in the empty set you’re not on its boundary. But you can’t be in the empty set. So what’s going on? … The usual. It makes other work easier if we call the empty set ‘open’. And the extra work we’d have to do to rule out the empty set doesn’t seem to get us anything interesting. So we accept what might be a trick.) The definitions of ‘open’ and ‘closed’ don’t exclude one another.

    I’m not sure how this confusing state of affairs developed. My hunch is that the words ‘open’ and ‘closed’ evolved independent of each other. Why do I think this? An open set has its openness from, well, not containing its boundaries; from the inside there’s always a little more to it. A closed set has its closedness from sequences. That is, you can consider a string of points inside a set. Are these points leading somewhere? Is that point inside your set? If a string of points always leads to somewhere, and that somewhere is inside the set, then you have closure. You have a closed set. I’m not sure that the terms were derived with that much thought. But it does explain, at least in terms a mathematician might respect, why a set that isn’t open isn’t necessarily closed.

    Back to open sets. What does it mean to not be on the boundary of the set? How do we know if we’re on it? We can define sets by all sorts of complicated rules: complex-valued numbers of size less than five, say. Rational numbers whose denominator (in lowest form) is no more than ten. Points in space from which a satellite dropped would crash into the moon rather than into the Earth or Sun. If we have an idea of distance we could measure how far it is from a point to the nearest part of the boundary. Do we need distance, though?

    No, it turns out. We can get the idea of open sets without using distance. Introduce a neighborhood of a point. A neighborhood of a point is an open set that contains that point. It doesn’t have to be small, but that’s the connotation. And we get to thinking of little N-balls, circle or sphere-like constructs centered on the target point. It doesn’t have to be N-balls. But we think of them so much that we might as well say it’s necessary. If every point in a set has a neighborhood around it that’s also inside the set, then the set’s open.
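    A minimal sketch of that last sentence, on the ordinary number line. It leans on everyday distance to build the little balls, which is exactly the assumption this essay is trying to get away from, so take it as intuition only. The function name is made up for the illustration.

```python
def ball_radius_inside_interval(x, a, b):
    """Return a radius r > 0 such that (x - r, x + r) lies inside the open
    interval (a, b), or None if x is not strictly inside (a, b)."""
    if not (a < x < b):
        return None
    # Half the distance to the nearer endpoint always leaves room to spare.
    return min(x - a, b - x) / 2


# Every point strictly inside (0, 1) gets a little ball around it.
print(ball_radius_inside_interval(0.5, 0.0, 1.0))
print(ball_radius_inside_interval(0.999, 0.0, 1.0))
# The endpoint 0 gets none, which is why [0, 1) fails to be open.
print(ball_radius_inside_interval(0.0, 0.0, 1.0))
```

    However close a point sits to the edge of (0, 1), there is still some positive radius that fits. That is the "always a little more to it" property.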

    You’re going to accuse me of begging the question. Fair enough. I was using open sets to define open sets. This use is all right for an intuitive idea of what makes a set open, but it’s not rigorous. We can give in and say we have to have distance. Then we have N-balls and we can build open sets out of balls that don’t contain the edges. Or we can try to drive distance out of our idea of open sets.

    We can do it this way. Start off by saying the whole universe is an open set. Also that the union of any number of open sets is also an open set. And that the intersection of any finite number of open sets is also an open set. Does this sound weak? So it sounds weak. It’s enough. We get the open sets we were thinking of all along from this.
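    On a universe with only finitely many points these rules can be checked mechanically. Here is a rough sketch, assuming the universe and the candidate open sets are handed over as ordinary Python sets; the function name is mine, not standard. It also insists on the empty set, since the union of no sets at all is empty.

```python
from itertools import combinations


def is_topology(universe, open_sets):
    """Check the rules above on a finite universe.

    universe: a set of points. open_sets: a collection of subsets of it.
    On a finite universe checking pairs is enough: closure under pairwise
    unions and intersections gives closure under all of them.
    """
    opens = {frozenset(s) for s in open_sets}
    if frozenset(universe) not in opens:
        return False
    # The union of no sets at all is the empty set, so it has to be open too.
    if frozenset() not in opens:
        return False
    for a, b in combinations(opens, 2):
        if a | b not in opens or a & b not in opens:
            return False
    return True


points = {1, 2, 3}
print(is_topology(points, [set(), {1}, {1, 2}, {1, 2, 3}]))  # nested sets
print(is_topology(points, [set(), {1}, {2}, {1, 2, 3}]))     # {1, 2} is missing
```

    The first collection passes. The second fails because the union of {1} and {2} is not in the list. No distance appears anywhere in the check.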

    This works for the sets that look like territories on a map. It also works for sets for which we have some idea of distance, however strange it is to our everyday distances. It even works if we don’t have any idea of distance. This lets us talk about topological spaces, and study what geometry looks like if we can’t tell how far apart two points are. We can, for example, at least tell that two points are different. Can we find a neighborhood of one that doesn’t contain the other? Then we know they’re some distance apart, even without knowing what distance is.

    That we reached so abstract an idea of what an open set is without losing the idea’s usefulness suggests we’re doing well. So we are. It also shows why Nicolas Bourbaki, the famous nonexistent mathematician, thought set theory and its related ideas were the core of mathematics. Today category theory is a more popular candidate for the core of mathematics. But set theory is still close to the core, and much of analysis is about what we can know from the fact of sets being open. Open sets let us explain a lot.

     
    • elkement (Elke Stangl) 9:52 am on Sunday, 3 September, 2017 Permalink | Reply

      Thanks – beautifully written and very interesting :-)

    • gaurish 10:17 am on Sunday, 3 September, 2017 Permalink | Reply

      Whenever I study analysis/topology, I can’t stop myself from appreciating this simple yet powerful idea.

      • Joseph Nebus 1:17 am on Friday, 8 September, 2017 Permalink | Reply

        It’s a great concept, and one more powerful than it looks. It’s hard to explain how open-ness creeps in to everything, and why it offers something useful that closed-ness doesn’t.

  • Joseph Nebus 6:00 pm on Wednesday, 30 August, 2017 Permalink | Reply
    Tags: A-To-Z

    The Summer 2017 Mathematics A To Z: N-Sphere/N-Ball 


    Today’s glossary entry is a request from Elke Stangl, author of the Elkemental Force blog, which among other things has made me realize how much of interest there is to say about heat pumps. Well, you never know what’s interesting before you give it serious thought.

    N-Sphere/N-Ball.

    I’ll start with space. Mathematics uses a lot of spaces. They’re inspired by geometry, by the thing that fills up our room. Sometimes we make them different by simplifying them, by thinking of the surface of a table, or what geometry looks like along a thread. Sometimes we make them bigger, imagining a space with more directions than we have. Sometimes we make them very abstract. We realize that we can think of polynomials, or functions, or shapes as if they were points in space. We can describe things that work like distance and direction and angle for these more abstract things.

    What are useful things we know about space? Many things. Whole books full of things. Let me pick one of them. Start with a point. Suppose we have a sense of distance, of how far one thing is from another. Then we can have an idea of the neighborhood. We can talk about some chunk of space that’s near our starting point.

    So let’s agree on a space, and on some point in that space. You give me a distance. I give back to you — well, two obvious choices. One of them is all the points in that space that are exactly that distance from our agreed-on point. We know what this is, at least in the two kinds of space we grow up comfortable with. In three-dimensional space, this is a sphere. A shell, at least, centered around whatever that first point was. In two-dimensional space, on our desktop, it’s a circle. We know it can look a little weird: if we started out in a one-dimensional space, there’d be only two points, one on either side of the original center point. But it won’t look too weird. Imagine a four-dimensional space. Then we can speak of a hypersphere. And we can imagine that as being somehow a ball that’s extremely spherical. Maybe it pokes out of the rendering we try making of it, like a cartoon character falling out of the movie screen. We can imagine a five-dimensional space, or a ten-dimensional one, or something with even more dimensions. And we can conclude there’s a sphere for even that much space. Well, let it.

    What are spheres good for? Well, they’re nice familiar shapes. Even if they’re in a weird number of dimensions. They’re useful, too. A lot of what we do in calculus, and in analysis, is about dealing with difficult points. Points where a function is discontinuous. Points where the function doesn’t have a value. One of calculus’s reliable tricks, though, is that we can swap information about the edge of things for information about the interior. We can replace a point with a sphere and find our work is easier.

    The other thing I could give you. It’s a ball. That’s all the points that aren’t more than your distance away from our point. It’s the inside, the whole planet rather than just the surface of the Earth.

    And here’s an ambiguity. Is the surface a part of the ball? Should we include the edge, or do we just want the inside? And that depends on what we want to do. Either might be right. If we don’t need the edge, then we have an open set (stick around for Friday). This gives us the open ball. If we do need the edge, then we have a closed set, and so, the closed ball.
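    The three choices come down to one comparison each. A small sketch, assuming the ordinary straight-line distance (Python’s math.dist) and function names of my own choosing:

```python
import math


def on_sphere(point, center, radius, tol=1e-12):
    """True if point is exactly `radius` from center, up to rounding."""
    return math.isclose(math.dist(point, center), radius, abs_tol=tol)


def in_open_ball(point, center, radius):
    """Strictly closer than radius: the edge is left out."""
    return math.dist(point, center) < radius


def in_closed_ball(point, center, radius):
    """No farther than radius: the edge is included."""
    return math.dist(point, center) <= radius


center = (0.0, 0.0, 0.0)
edge_point = (5.0, 0.0, 0.0)
print(on_sphere(edge_point, center, 5.0))       # on the shell
print(in_open_ball(edge_point, center, 5.0))    # so not in the open ball
print(in_closed_ball(edge_point, center, 5.0))  # but in the closed ball
```

    The only difference between the open and the closed ball is whether the comparison is strict.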

    Balls are so useful. Take a chunk of space that you find interesting for whatever reason. We can represent that space as the joining together (the “union”) of a bunch of balls. Probably not all the same size, but that’s all right. We might need infinitely many of these balls to get the chunk precisely right, or as close to right as can be. But that’s all right. We can still do it. Most anything we want to analyze is easier to prove on any one of these balls. And since we can describe the complicated shape as this combination of balls, then we can know things about the whole complicated shape. It’s much the way we can know things about polygons by breaking them into triangles, and showing things are true about triangles.

    Sphere or ball, whatever you like. We can describe how many dimensions of space the thing occupies with the prefix. The 3-ball is everything close enough to a point that’s in a three-dimensional space. The 2-ball is everything close enough in a two-dimensional space. The 10-ball is everything close enough to a point in a ten-dimensional space. The 3-sphere is … oh, all right. Here we have a little squabble. People doing geometry prefer this to be the sphere in three dimensions. People doing topology prefer this to be the sphere whose surface has three dimensions, that is, the sphere in four dimensions. Usually which you mean will be clear from context: are you reading a geometry or a topology paper? If you’re not sure, oh, look for anything hinting at the number of spatial dimensions. If nothing gives you a hint maybe it doesn’t matter.

    Either way, we do want to talk about the family of shapes without committing ourselves to any particular number of dimensions. And so that’s why we fall back on ‘N’. ‘N’ is a good name for “the number of dimensions we’re working in”, and so we use it. Then we have the N-sphere and the N-ball, a sphere-like shape, or a ball-like shape, that’s in however much space we need for the problem.

    I mentioned something early on that I bet you paid no attention to. That was that we need a space, and a point inside the space, and some idea of distance. One of the surprising things mathematics teaches us about distance is … there’s a lot of ideas of distance out there. We have what I’ll call an instinctive idea of distance. It’s the one that matches what holding a ruler up to stuff tells us. But we don’t have to have that.

    I sense the grumbling already. Yes, sure, we can define distance by some screwball idea, but do we ever need it? To which the mathematician answers, well, what if you’re trying to figure out how far away something in midtown Manhattan is? Where you can only walk along streets or avenues and we pretend Broadway doesn’t exist? Huh? How about that? Oh, fine, the skeptic might answer. Grant that there can be weird cases where the straight-line ruler distance is less enlightening than some other scheme is.

    Well, there are. There exists a whole universe of different ideas of distance. There’s a handful of useful ones. The ordinary straight-line ruler one, the Euclidean distance, you get in a method so familiar it’s worth saying what you do. You find the coordinates of your two given points. Take the pairs of corresponding coordinates: the x-coordinates of the two points, the y-coordinates of the two points, the z-coordinates, and so on. Find the differences between corresponding coordinates. Take the absolute value of those differences. Square all those absolute-value differences. Add up all those squares. Take the square root of that. Fine enough.
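    Those steps translate almost line for line into code. A sketch, with points assumed to be tuples of coordinates and the function name chosen for the illustration:

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance, following the steps in the paragraph above."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    differences = [a - b for a, b in zip(p, q)]  # corresponding coordinates
    magnitudes = [abs(d) for d in differences]   # absolute values
    squares = [m ** 2 for m in magnitudes]       # square each one
    return math.sqrt(sum(squares))               # add them up, take the root


print(euclidean_distance((0, 0), (3, 4)))
```

    The absolute-value step changes nothing when the power is two. It earns its keep once the power is odd.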

    There’s a lot of novelty acts. For example, do that same thing, only instead of raising the differences to the second power, raise them to the 26th power. When you get the sum, instead of the square root, take the 26th root. There. That’s a legitimate distance. No, you will never need this, but your analysis professor might give you it as a homework problem sometime.

    Some are useful, though. Raising to the first power, and then eventually taking the first root, gives us something useful. Yes, raising to a first power and taking a first root isn’t doing anything. We just say we’re doing that for the sake of consistency. Raising to an infinitely large power, and then taking an infinitely great root, inspires angry glares. But we can make that idea rigorous. When we do it gives us something useful.

    And here’s a new, amazing thing. We can still make “spheres” for these other distances. On a two-dimensional space, the “sphere” with this first-power-based distance will look like a diamond. The “sphere” with this infinite-power-based distance will look like a square. On a three-dimensional space the “sphere” with the first-power-based distance looks like a … well, more complicated, three-dimensional diamond. The “sphere” with the infinite-power-based distance looks like a box. The “balls” in all these cases look like what you expect from knowing the spheres.
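    A rough sketch of this family of distances, so the diamond and the square can be checked point by point. The function name and the use of math.inf to stand for the infinite power are my own conventions here.

```python
import math


def p_distance(p, q, power):
    """Distance built on a power of 1 or more; math.inf gives the limiting case."""
    if power < 1:
        raise ValueError("power must be at least 1 for this to be a distance")
    gaps = [abs(a - b) for a, b in zip(p, q)]
    if power == math.inf:
        # The limit of ever-larger powers: only the biggest gap survives.
        return max(gaps)
    return sum(g ** power for g in gaps) ** (1 / power)


origin = (0.0, 0.0)
# Both points are distance 1 from the origin in the first-power distance:
# they sit on the diamond.
print(p_distance((1.0, 0.0), origin, 1), p_distance((0.5, 0.5), origin, 1))
# Both are distance 1 in the infinite-power distance: they sit on the square.
print(p_distance((1.0, 1.0), origin, math.inf),
      p_distance((1.0, -0.3), origin, math.inf))
```

    The corner (1, 1) is on the unit square of the infinite-power distance, but it is farther than 1 from the origin by the ruler distance, and farther still by the first-power one.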

    As with the ordinary ideas of spheres and balls these shapes let us understand space. Spheres offer a natural path to understanding difficult points. Balls offer a natural path to understanding complicated shapes. The different ideas of distance change how we represent these, and how complicated they are, but not the fact that we can do it. And it allows us to start thinking of what spheres and balls for more abstract spaces, universes made of polynomials or formed of trig functions, might be. They’re difficult to visualize. But we have the grammar that lets us speak about them now.

    And for a postscript: I also wrote about spheres and balls as part of my Set Tour a couple years ago. Here’s the essay about the N-sphere, although I didn’t exactly call it that. And here’s the essay about the N-ball, again not quite called that.

     
    • elkement (Elke Stangl) 6:25 pm on Wednesday, 30 August, 2017 Permalink | Reply

      Thanks a lot for picking my suggestion! Great essay – I need the idea of this general distance sink in …

      As you mentioned heat pumps :-) … my fondness of N-balls or spheres is related to that, well, sort of – related to thermodynamics / statistical mechanics. What fascinated me a long time ago was that, for extremely high N, ‘all the volume’ of the N-ball is concentrated in just a thin shell beneath the surface – which I tried to describe here https://elkement.blog/2017/06/17/spheres-in-a-space-with-trillions-of-dimensions/

      Something, that’s maybe rather trivial or only weird because it is hard to visualize a sphere in a space with 10^25 dimensions… And after your post now I wonder what would happen if we use the 10^25th root to define the distance?

      • Joseph Nebus 1:06 am on Friday, 8 September, 2017 Permalink | Reply

        Thank you, and I’m quite glad you like. … And yes, where the ‘volume’ is in an N-ball is a weird and wondrous thing. It maybe doesn’t break intuition, but it does challenge it at least.

        Using the 10^25th power-and-root to define distance would look, practically, like using the infinite power. In practice, that would select, between two points, whatever the longest distance along one of the axes is and pick that out. The rest of the axes would make for a tiny modification, but at that extreme a power it wouldn’t be noticeable. I’m told that when someone does need to simulate the infinite-power distance for numerical purposes they’ll just toss in a very large power. The error made by doing that should be smaller than the usual acceptable floating-point errors. My impression is that the power used would be closer to, like, a hundred or a thousand. But I don’t have experience directly in the field about this and don’t know why they wouldn’t just use the greatest-coordinate-difference if that’s what they wanted in the first place.

  • Joseph Nebus 6:00 pm on Monday, 28 August, 2017 Permalink | Reply
    Tags: A-To-Z, critical points, Morse theory

    The Summer 2017 Mathematics A To Z: Morse Theory 


    Today’s A To Z entry is a change of pace. It dives deeper into analysis than this round has gone so far. The term comes from Mr Wu, of the Singapore Maths Tuition blog, whom I thank for the request.

    Morse Theory.

    An old joke, as most of my academia-related ones are. The young scholar says to his teacher how amazing it was in the old days, when people were foolish, and thought the Sun and the Stars moved around the Earth. How fortunate we are to know better. The elder says, ah yes, but what would it look like if it were the other way around?

    There are many things to ponder packed into that joke. For one, the elder scholar’s awareness that our ancestors were no less smart or perceptive or clever than we are. For another, the awareness that there is a problem. We want to know about the universe. But we can only know what we perceive now, where we are at this moment. Even a note we’ve written in the past, or a message from a trusted friend, we can’t take uncritically. What we know is that we perceive this information in this way, now. When we pay attention to our friends in the philosophy department we learn that knowledge is even harder than we imagine. But I’ll stop there. The problem is hard enough already.

    We can put it in a mathematical form, one that seems immune to many of the worst problems of knowledge. In this form it looks something like this: what can we know about the universe, if all we really know is what things in that universe are doing near us? The things that we look at are functions. The universe we’re hoping to understand is the domain of the functions. One filter we use to see the universe is Morse Theory.

    We don’t look at every possible function. Functions are too varied and weird for that. We look at functions whose range is the real numbers. And they must be smooth. This is a term of art. It means the function has derivatives. It has to be continuous. It can’t have sharp corners. And it has to have lots of derivatives. The first derivative of a smooth function has to also be continuous, and has to also lack corners. And the derivative of that first derivative has to be continuous, and to lack corners. And the derivative of that derivative has to be the same. A smooth function we can differentiate over and over again, infinitely many times. None of those derivatives can have corners or jumps or missing patches or anything. This is what makes it smooth.

    Most functions are not smooth, in much the same way most shapes are not circles. That’s all right. There are many smooth functions anyway, and they describe things we find interesting. Or we think they’re interesting, anyway. Smooth functions are easy for us to work with, and to know things about. There’s plenty of smooth functions. If you’re interested in something else there’s probably a smooth function that’s close enough for practical use.

    Morse Theory builds on the “critical points” of these smooth functions. A critical point, in this context, is one where the derivative is zero. Derivatives being zero usually signal something interesting going on. Often they show where the function changes behavior. In freshman calculus they signal where a function changes from increasing to decreasing, so the critical point is a maximum. In physics they show where a moving body no longer has an acceleration, so the critical point is an equilibrium. Or where a system changes from one kind of behavior to another. And here — well, many things can happen.

    So take a smooth function. And take a critical point that it’s got. (And, erg. Technical point. The derivative of your smooth function, at that critical point, shouldn’t be having its own critical point going on at the same spot. That makes stuff more complicated.) It’s possible to approximate your smooth function near that critical point with, of course, a polynomial. It’s always polynomials. The shape of these polynomials gives you an index for these points. And that can tell you something about the shape of the domain you’re on.
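    For the record, the standard statement behind this is the Morse Lemma. In suitable coordinates centered on a critical point p of the kind just described, the function is exactly a sum of squares with some minus signs. The symbols here are my own labels: N is the number of dimensions, and k is the count of minus signs, which is the index.

```latex
f(x_1, \ldots, x_N) = f(p) - x_1^2 - \cdots - x_k^2 + x_{k+1}^2 + \cdots + x_N^2
```

    On a surface, where N is 2, an index of 0 is the bottom of a bowl, an index of 1 is a saddle, and an index of 2 is the top of a dome.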

    At least, it tells you something about what the shape is where you are. The universal model for this — based on skimming texts and papers and popularizations of this — is of a torus standing vertically. Like a doughnut that hasn’t tipped over, or like a tire on a car that’s working as normal. I suspect this is the best shape to use for teaching, as anyone can understand it while it still shows the different behaviors. I won’t resist.

    Imagine slicing this tire horizontally. Slice it close to the bottom, below the central hole, and the part that drops down is a disc. At least, it could be flattened out tolerably well to a disc.

    Slice it somewhere that intersects the hole, though, and you have a different shape. You can’t squash that down to a disc. You have a noodle shape. A cylinder at least. That’s different from what you got the first slice.

    Slice the tire somewhere higher. Somewhere above the central hole, and you have … well, it’s still a tire. It’s got a hole in it, but you could imagine patching it and driving on. There’s another different shape that we’ve gotten from this.

    Imagine we were confined to the surface of the tire, but did not know what surface it was. That we start at the lowest point on the tire and ascend it. From the way the smooth functions around us change we can tell how the surface we’re on has changed. We can see its change from “basically a disc” to “basically a noodle” to “basically a doughnut”. We could work out what the surface we’re on has to be, thanks to how these smooth functions around us change behavior.

    Occasionally we mathematical-physics types want to act as though we’re not afraid of our friends in the philosophy department. So we deploy the second thing we know about Immanuel Kant. He observed that knowing the force of gravity falls off as the square of the distance between two things implies that the things should exist in a three-dimensional space. (Source: I dunno, I never read his paper or book or whatever and dunno I ever heard anyone say they did.) It’s a good observation. Geometry tells us what physics can happen, but what physics does happen tells us what geometry they happen in. And it tells the philosophy department that we’ve heard of Immanuel Kant. This impresses them greatly, we tell ourselves.

    Morse Theory is a manifestation of how observable physics teaches us the geometry they happen on. And in an urgent way, too. Some of Edward Witten’s pioneering work in superstring theory was in bringing Morse Theory to quantum field theory. He showed a set of problems called the Morse Inequalities gave us insight into supersymmetric quantum mechanics. The link between physics and doughnut-shapes may seem vague. This is because you’re not remembering that mathematical physics sees “stuff happening” as curves drawn on shapes which represent the kind of problem you’re interested in. Learning what the shapes representing the problem look like is solving the problem.

    If you’re interested in the substance of this, the universally-agreed reference is J Milnor’s 1963 text Morse Theory. I confess it’s hard going to read, because it’s a symbols-heavy textbook written before the existence of LaTeX. Each page reminds one why typesetters used to get hazard pay, and not enough of it.

     
    • gaurish 5:40 am on Tuesday, 29 August, 2017 Permalink | Reply

      My favourite functions: “Most functions are not smooth, in much the same way most shapes are not circles.”

      • Joseph Nebus 12:59 am on Friday, 8 September, 2017 Permalink | Reply

        Thank you. Sometimes I think while writing that I’ve really hit something good, and that sentence was the one that gave me that feeling that week.

    • gaurish 5:41 am on Tuesday, 29 August, 2017 Permalink | Reply

      Typo: my favourite statement (not functions)

    • mathtuition88 12:01 pm on Tuesday, 29 August, 2017 Permalink | Reply

      Reblogged this on Singapore Maths Tuition.

    • elkement (Elke Stangl) 6:39 pm on Wednesday, 30 August, 2017 Permalink | Reply

      I totally like: “Occasionally we mathematical-physics types want to act as though we’re not afraid of our friends in the philosophy department.” :-)

      It’s an amazing A-Z – I am always late and in catch-up mode, but I am enjoying every post!

      • Joseph Nebus 1:09 am on Friday, 8 September, 2017 Permalink | Reply

        Thank you kindly. And never worry about being late; you can see how scrambled my schedule has been lately. I’m hoping to get down to Inbox 100 sometime this weekend, if all goes well.

  • Joseph Nebus 6:00 pm on Friday, 25 August, 2017 Permalink | Reply
    Tags: A-To-Z, L-Function

    The Summer 2017 Mathematics A To Z: L-function 


    I’m brought back to elliptic curves today thanks to another request from Gaurish, of the For The Love Of Mathematics blog. Interested in how that’s going to work out? Me too.

    So stop me if you’ve heard this one before. We’re going to make something interesting. You bring to it a complex-valued number. Anything you like. Let me call it ‘s’ for the sake of convenience. I know, it’s weird not to call it ‘z’, but that’s how this field of mathematics developed. I’m going to make a series built on this. A series is the sum of all the terms in a sequence. I know, it seems weird for a ‘series’ to be a single number, but that’s how that field of mathematics developed. The underlying sequence? I’ll make it in three steps. First, I start with all the counting numbers: 1, 2, 3, 4, 5, and so on. Second, I take each one of those terms and raise them to the power of your ‘s’. Third, I take the reciprocal of each of them. That’s the sequence. And when we add —

    Yes, that’s right, it’s the Riemann-Zeta Function. The one behind the Riemann Hypothesis. That’s the mathematical conjecture that everybody loves to cite as the biggest unsolved problem in mathematics now that we know someone did something about Fermat’s Last Theorem. The conjecture is about what the zeroes of this function are. What values of ‘s’ make this sum equal to zero? Some boring ones. Negative two, negative four, negative six, and so on. It has a lot of non-boring zeroes. All the ones we know of have an ‘s’ with a real part of ½. So far we know of at least 36 billion values of ‘s’ that make this add up to zero. They’re all ½ plus some imaginary number. We conjecture that this isn’t coincidence and all the non-boring zeroes are like that. We might be wrong. But it’s the way I would bet.
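    A small sketch of the three-step recipe, with a caveat worth saying out loud. Adding up the terms directly only settles on a value when the real part of ‘s’ is bigger than 1. The zeroes live outside that region, where the function is defined by extending it (analytic continuation), so this code can’t find them. The function name is made up for the illustration.

```python
import math


def zeta_partial_sum(s, terms):
    """Add up 1/n**s for n = 1 .. terms. Only meaningful when Re(s) > 1."""
    return sum(1 / n ** s for n in range(1, terms + 1))


# At s = 2 the full sum is known to be pi squared over six.
approx = zeta_partial_sum(2, 100_000)
print(approx, math.pi ** 2 / 6)
```

    The two printed numbers agree to about four decimal places. The sum converges slowly.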

    Anyone who’d be reading this far into a pop mathematics blog knows something of why the Riemann Hypothesis is interesting. It carries implications about prime numbers. It tells us things about a host of other theorems that are nice to have. Also they know it’s hard to prove. Really, really hard.

    Ancient mathematical lore tells us there are a couple ways to solve a really, really hard problem. One is to narrow its focus. Try to find as simple a case of it as you can solve. Maybe a second simple case you can solve. Maybe a third. This could show you how, roughly, to solve the general problem. Not always. Individual cases of Fermat’s Last Theorem are easy enough to solve. You can show that a^3 + b^3 = c^3 doesn’t have any non-boring answers where a, b, and c are all positive whole numbers. Same with a^5 + b^5 = c^5 , though it takes longer. That doesn’t help you with the general a^n + b^n = c^n .

    There’s another approach. It sounds like the sort of crazy thing Captain Kirk would get away with. It’s to generalize, to make a bigger, even more abstract problem. Sometimes that makes it easier.

    For the Riemann-Zeta Function there’s one compelling generalization. It fits into that sequence I described making. After taking the reciprocals of integers-raised-to-the-s-power, multiply each by some number. Which number? Well, that depends on what you like. It could be the same number every time, if you like. That’s boring, though. That’s just the Riemann-Zeta Function times your number. It’s more interesting if what number you multiply by depends on which integer you started with. (Do not let it depend on ‘s’; that’s more complicated than you want.) When you do that? Then you’ve created an L-Function.

    Specifically, you’ve created a Dirichlet L-Function. Dirichlet here is Peter Gustav Lejeune Dirichlet, a 19th century German mathematician who got his name on like everything. He did major work on partial differential equations, on Fourier series, on topology, in algebra, and on number theory, which is what we’d call these L-functions. There are other L-Functions, with identifying names such as Artin and Hecke and Euler, which get more directly into group theory. They look much like the Dirichlet L-Function. In building the sequence I described in the top paragraph, they do something else for the second step.

    The L-Function is going to look like this:

    L(s) = \sum_{n \ge 1}^{\infty} a_n \cdot \frac{1}{n^s}

    The sigma there means to evaluate the thing that comes after it for each value of ‘n’ starting at 1 and increasing, by 1, up to … well, something infinitely large. The a_n are the numbers you’ve picked. They’re values that depend on the index ‘n’, but don’t depend on the power ‘s’. This may look funny but it’s a standard way of writing the terms in a sequence.
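    To make the a_n concrete, here is one sample choice, picked by me for illustration: the numbers 1, 0, -1, 0 repeating, which is the non-trivial Dirichlet character modulo 4. The function names are hypothetical. With that choice and s = 1 the series becomes the old Leibniz series for π/4.

```python
import math


def chi4(n):
    """A sample choice of a_n: the non-trivial Dirichlet character mod 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def l_partial_sum(a, s, terms):
    """Add up a(n) / n**s for n = 1 .. terms."""
    return sum(a(n) / n ** s for n in range(1, terms + 1))


# With these a_n and s = 1 the series is 1 - 1/3 + 1/5 - 1/7 + ...,
# which converges (slowly) to pi / 4.
print(l_partial_sum(chi4, 1, 100_000), math.pi / 4)
```

    Swap in a function that always returns 1 and the same partial sum gives back the Riemann-Zeta series.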

    An L-Function has to meet some particular criteria that I’m not going to worry about here. Look them up before you get too far into your research. These criteria give us ways to classify different L-Functions, though. We can describe them by degree, much as we describe polynomials. We can describe them by signature, part of those criteria I’m not getting into. We can describe them by properties of the extra numbers, the ones in that fourth step that you multiply the reciprocals by. And so on. LMFDB, an encyclopedia of L-Functions, lists eight or nine properties usable for a taxonomy of these things. (The ambiguity is in what things you consider to depend on what other things.)

    What makes this interesting? For one, everything that makes the Riemann Hypothesis interesting. The Riemann-Zeta Function is a slice of the L-Functions. But there’s more. They merge into elliptic curves. Every elliptic curve corresponds to some L-Function. We can use the elliptic curve or the L-Function to prove what we wish to show. Elliptic curves are subject to group theory; so, we can bring group theory into these series.

    And then it gets deeper. It always does. Go back to that formula for the L-Function that I put in mathematical symbols. I’m going to define a new function. It’s going to look a lot like a polynomial. Well, that L(s) already looked a lot like a polynomial, but this is going to look even more like one.

    Pick a number τ. It’s complex-valued. Any number. All that I care is that its imaginary part be positive. In the trade we say that’s “in the upper half-plane”, because we often draw complex-valued numbers as points on a plane. The real part serves as the horizontal and the imaginary part serves as the vertical axis.

    Now go back to your L-Function. Remember those a_n numbers you picked? Good. I’m going to define a new function based on them. It looks like this:

    f(\tau) = \sum_{n \ge 1}^{\infty} a_n \left(  e^{2 \pi \imath \tau}\right)^n

    You see what I mean about looking like a polynomial? If τ is a complex-valued number, then e^{2 \pi \imath \tau} is just another complex-valued number. If we gave that a new name like ‘z’, this function would look like the sum of constants times z raised to positive powers. We’d never know it was any kind of weird polynomial.
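    Written out, with ‘q’ as the usual name for the renamed quantity (the essay suggests ‘z’; either letter is only a label). The middle piece is the reason for insisting on the upper half-plane: it keeps q inside the unit circle, which is what gives the sum a chance to converge.

```latex
q = e^{2 \pi \imath \tau}, \qquad
|q| = e^{-2 \pi \, \mathrm{Im}(\tau)} < 1 \quad \text{when } \mathrm{Im}(\tau) > 0, \qquad
f(\tau) = \sum_{n \ge 1}^{\infty} a_n q^n
```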

    Anyway. This new function ‘f(τ)’ has some properties. It might be something called a weight-2 Hecke eigenform, a thing I am not going to explain without charging someone by the hour. But see the logic here: every elliptic curve matches with some kind of L-Function. Each L-Function matches with some ‘f(τ)’ kind of function. Those functions might or might not be these weight-2 Hecke eigenforms.

    So here’s the thing. There was a big hypothesis formed in the 1950s that every rational elliptic curve matches to one of these ‘f(τ)’ functions that’s one of these eigenforms. It’s true. It took decades to prove. You may have heard of it, as the Taniyama-Shimura Conjecture. In the 1990s Wiles and Taylor proved this was true for a lot of elliptic curves, which is what proved Fermat’s Last Theorem after all that time. The rest of it was proved around 2000.

    As I said, sometimes you have to make your problem bigger and harder to get something interesting out of it.

    I mentioned this above. LMFDB is a fascinating site worth looking at. It’s got a lot of L-Function and Riemann-Zeta function-related materials.

     
  • Joseph Nebus 6:00 pm on Monday, 21 August, 2017 Permalink | Reply
    Tags: A-To-Z, Camille Jordan, Jordan Canonical Form

    The Summer 2017 Mathematics A To Z: Jordan Canonical Form 


    I made a mistake! I thought we had got to the end of the block of A To Z topics suggested by Gaurish, of the For The Love Of Mathematics blog. Not so and, indeed, I wonder if it wouldn’t be a viable writing strategy around here for me to just ask Gaurish to throw out topics and I have two weeks to write about them. I don’t think there’s a single unpromising one in the set.

    Jordan Canonical Form.

    Before you ask, yes, this is named for the Camille Jordan.

    So this is a thing from algebra. Particularly, linear algebra. And more particularly, matrices. Matrices are so much of linear algebra that you could be forgiven thinking they’re all of linear algebra. The thing is, matrices are a really good way of describing linear transformations. That is, where you take a block of space and stretch it out, or squash it down, or rotate it, or do some combination of these things. And stretching and squashing and rotating is a lot of what you’d ever want to do. Refer to any book on how to draw animated cartoons. The only thing matrices can’t do is have their eyes bug out huge when an attractive region of space walks past.

    Thing about a matrix is if you want to do something with it, you’re going to write it as a grid of numbers. It doesn’t have to be a grid of numbers. But about all the matrices anyone does anything with are grids of numbers. And that’s fine. They do an incredible lot of stuff. What’s not fine is that on looking at a huge block of numbers, the mind sees: huh. That’s a big block of numbers. Good luck finding what’s meaningful in them. To help find meaning we have a set of standard forms. We call them “canonical” or “normal” or some other approving term. They rearrange and change the terms in the matrix so that more interesting stuff is more obvious.

    Now you’re justified asking: how can we rearrange and change the terms in a matrix without changing what the matrix is? We can get away with doing this because we can show some rearrangements don’t change what we’re interested in. That covers the “how dare we” part of “how”. We do it by using matrix multiplication. You might remember from high school algebra that matrix multiplication is this agonizing process of multiplying every pair of numbers that ever existed together, then adding them all up, and then maybe you multiply something by minus one because you’re thinking of determinants, and it all comes out wrong anyway and you have to do it over? Yeah. Well, matrix multiplication is defined hard because it makes stuff like this work out. So that covers the “by what technique” part of “how”. We start out with some matrix, let me imaginatively name it A . And then we find some transformation matrix for which, eh, let’s say P is a good enough name. I’ll say why in a moment. Then we use that matrix and its multiplicative inverse P^{-1} . And we evaluate the product P^{-1} A P . This won’t just be the same old matrix we started with. Not usually. Promise. But what this will be, if we chose our matrix P correctly, is some new matrix that’s easier to read.

    The matrices involved here have to follow some rules. Most important, they’re all going to be square matrices. There’ll be more rules that your linear algebra textbook will tell you. Or your instructor will, after checking the textbook.
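
    If it helps to see that product P^{-1} A P actually worked out, here is a minimal Python sketch. The matrices A and P are ones made up for this example, and the helper names matmul and inverse_2x2 are only illustrative; they aren't from any library.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse_2x2(M):
    """Invert a 2x2 matrix exactly; raises ZeroDivisionError if it has no inverse."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

# Example matrices, chosen for illustration only.
A = [[4, 1], [2, 3]]
P = [[1, 1], [1, -2]]   # each column of P is an eigenvector of A

# The product P^{-1} A P: the same transformation, written in a new frame.
new_A = matmul(matmul(inverse_2x2(P), A), P)
print(new_A)
```

    For this particular A and P the product has nonzero entries only on the diagonal. A less carefully chosen P would give some other matrix, one no easier to read than A was.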

    So what makes a matrix easy to read? Zeroes. Lots and lots of zeroes. When we have a standardized form of a matrix it’s nearly all zeroes. This is for a good reason: zeroes are easy to multiply stuff by. And they’re easy to add stuff to. And almost everything we do with matrices, as a calculation, is a lot of multiplication and addition of the numbers in the matrix.

    What also makes a matrix easy to read? Everything important being on the diagonal. The diagonal is one of the two things you would imagine if you were told “here’s a grid of numbers, pick out the diagonal”. In particular it’s the one that goes from the upper left to the bottom right, that is, row one column one, and row two column two, and row three column three, and so on up to row 86 column 86 (or whatever). If everything is on the diagonal the matrix is incredibly easy to work with. If it can’t all be on the diagonal at least everything should be close to it. As close as possible.

    In the Jordan Canonical Form not everything is on the diagonal. I mean, it can be, but you shouldn’t count on that. But everything either will be on the diagonal or else it’ll be one row up from the diagonal. That is, row one column two, row two column three, row 85 column 86. Like that. There are a few other important pieces.

    First is the thing in the row above the diagonal will be either 1 or 0. Second is that on the diagonal you’ll have a sequence of all the same number. Like, you’ll get four instances of the number ‘2’ along this string of the diagonal. Third is that you’ll get a 1 in the row above every instance of this particular number except the first. Fourth is that you’ll get a 0 in the row above the first instance of this number.

    Yeah, that’s fussy to visualize. This is one of those things easiest to show in a picture. A Jordan canonical form is a matrix that looks like this:

    2 1 0 0 0 0 0 0 0 0 0 0
    0 2 1 0 0 0 0 0 0 0 0 0
    0 0 2 1 0 0 0 0 0 0 0 0
    0 0 0 2 0 0 0 0 0 0 0 0
    0 0 0 0 3 1 0 0 0 0 0 0
    0 0 0 0 0 3 0 0 0 0 0 0
    0 0 0 0 0 0 4 1 0 0 0 0
    0 0 0 0 0 0 0 4 1 0 0 0
    0 0 0 0 0 0 0 0 4 0 0 0
    0 0 0 0 0 0 0 0 0 -1 0 0
    0 0 0 0 0 0 0 0 0 0 -2 1
    0 0 0 0 0 0 0 0 0 0 0 -2

    This may have you dazzled. It dazzles mathematicians too. When we have to write a matrix that’s almost all zeroes like this we drop nearly all the zeroes. If we have to write anything we just write a really huge 0 in the upper-right and the lower-left corners.

    What makes this the Jordan Canonical Form is that the matrix looks like it’s put together from what we call Jordan Blocks. Look around the diagonals. Here’s the first Jordan Block:

    2 1 0 0
    0 2 1 0
    0 0 2 1
    0 0 0 2

    Here’s the second:

    3 1
    0 3

    Here’s the third:

    4 1 0
    0 4 1
    0 0 4

    Here’s the fourth:

    -1

    And here’s the fifth:

    -2 1
    0 -2

    And we can represent the whole matrix as this might-as-well-be-diagonal thing:

    First Block 0 0 0 0
    0 Second Block 0 0 0
    0 0 Third Block 0 0
    0 0 0 Fourth Block 0
    0 0 0 0 Fifth Block

    These blocks can be as small as a single number. They can be as big as however many rows and columns you like. Each individual block is some repeated number on the diagonal, and a repeated 1 in the row above the diagonal. You can call that line of entries the “superdiagonal”.
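
    The block structure is regular enough that a few lines of code can rebuild the whole matrix from a list of blocks. This is a sketch only: the function name jordan_matrix and the (number, size) way of describing a block are my own choices for the example.

```python
def jordan_matrix(blocks):
    """Build a Jordan-form matrix from (diagonal number, block size) pairs, in order."""
    n = sum(size for _, size in blocks)
    J = [[0] * n for _ in range(n)]
    start = 0
    for value, size in blocks:
        for k in range(size):
            J[start + k][start + k] = value        # the repeated number on the diagonal
            if k + 1 < size:
                J[start + k][start + k + 1] = 1    # the 1 just above it, inside the block only
        start += size
    return J

# The five blocks of the twelve-by-twelve example above.
blocks = [(2, 4), (3, 2), (4, 3), (-1, 1), (-2, 2)]
J = jordan_matrix(blocks)
for row in J:
    print(" ".join(str(x) for x in row))
```

    Reordering the list of blocks gives a different but equally valid Jordan Canonical Form.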

    (Mathworld, and Wikipedia, assert that sometimes the row below the diagonal — the “subdiagonal” — gets the 1’s instead of the superdiagonal. That’s fine if you like it that way, and it won’t change any of the real work. I have not seen these subdiagonal 1’s in the wild. But I admit I don’t do a lot of this field and maybe there’s times it’s more convenient.)

    Using the Jordan Canonical Form for a matrix is a lot like putting an object in a standard reference pose for photographing. This is a good metaphor. We get a Jordan Canonical Form by matrix multiplication, which works like rotating and scaling volumes of space. You can view the Jordan Canonical Form for a matrix as how you represent the original matrix from a new viewing angle that makes it easy to recognize. And this is why P is not a bad name for the matrix that does this work. We can see all this as “projecting” the matrix we started with into a new frame of reference. The new frame is maybe rotated and stretched and squashed and whatnot, compared to how we started. But it’s as valid a base. Projecting a mathematical object from one frame of reference to another usually involves calculating something that looks like P^{-1} A P so, projection. That’s our name.

    Mathematicians will speak of “the” Jordan Canonical Form for a matrix as if there were such a thing. I don’t mean that Jordan Canonical Forms don’t exist. They exist just as much as matrices do. It’s the “the” that misleads. You can put the Jordan Blocks in any order and have as valid, and as useful, a Jordan Canonical Form. But it’s easy to swap the orders of these blocks around — it’s another matrix multiplication, and a blessedly easy one — so it doesn’t matter which form you have. Get any one and you have them all.

    I haven’t said anything about what these numbers on the diagonal are. They’re the eigenvalues of the original matrix. I hope that clears things up.

    Yeah, not to anyone who didn’t know what a Jordan Canonical Form was to start with. Rather than get into calculations let me go to well-established metaphor. Take a sample of an unknown chemical and set it on fire. Put the light from this through a prism and photograph the spectrum. There will be lines, interruptions in the progress of colors. The locations of those lines and how intense they are tell you what the chemical is made of, and in what proportions. These are much like the eigenvectors and eigenvalues of a matrix. The eigenvectors tell you what the matrix is made of, and the eigenvalues how much of the matrix is those. This stuff gets you very far in proving a lot of great stuff. And part of what makes the Jordan Canonical Form great is that you get the eigenvalues right there in neat order, right where anyone can see them.
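
    For a small case of eigenvalues turning up on the diagonal, here is a sketch with a two-by-two matrix that can’t be made fully diagonal. The matrix A is made up for the example; its only eigenvalue is 2, counted twice. The matrix P and its inverse were worked out by hand, and matmul is just an illustrative helper.

```python
def matmul(X, Y):
    """Multiply two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Example matrix, chosen for illustration: its only eigenvalue is 2, repeated.
A = [[3, 1], [-1, 1]]

# First column of P is an eigenvector of A. The second is a vector that
# (A - 2I) sends to that eigenvector, a so-called generalized eigenvector.
P = [[1, 1], [-1, 0]]
P_inv = [[0, -1], [1, 1]]   # inverse of P, found by hand; P has determinant 1

J = matmul(matmul(P_inv, A), P)
print(J)   # prints [[2, 1], [0, 2]]
```

    The result is a single Jordan Block: the eigenvalue 2 twice on the diagonal and a lone 1 above it.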

    So! All that’s left is finding the things. The best way to find the Jordan Canonical Form for a given matrix is to become an instructor for a class on linear algebra and assign it as homework. The second-best way is to give the problem to your TA, who will type it in to Mathematica and return the result. It’s too much work to do most of the time. Almost all the stuff you could learn from having the thing in the Jordan Canonical Form you work out in the process of finding the matrix P that would let you calculate what the Jordan Canonical Form is. And once you had that, why go on?

    Where the Jordan Canonical Form shines is in doing proofs about what matrices can do. We can always put a square matrix into a Jordan Canonical Form, at least if we’re willing to let complex-valued numbers into it. So if we want to show something is true about matrices in general, we can show that it’s true for the simpler-to-work-with Jordan Canonical Form. Then show that shifting a matrix to or from the Jordan Canonical Form doesn’t change whether the thing we’re interested in is true. It exists in that strange space: it is quite useful, but never on a specific problem.

    Oh, all right. Yes, it’s the same Camille Jordan of the Jordan Curve and also of the Jordan Curve Theorem. That fellow.

     
    • elkement (Elke Stangl) 7:09 pm on Wednesday, 13 September, 2017 Permalink | Reply

      I really like the spectroscopy metaphor for eigenvectors and eigenvalues!

      Like

      • Joseph Nebus 1:40 am on Friday, 15 September, 2017 Permalink | Reply

        Thank you. It isn’t my metaphor originally, although I don’t know where I did pick it up. Very likely either a linear algebra text or my instructor.

        Liked by 1 person

  • Joseph Nebus 6:00 pm on Friday, 18 August, 2017 Permalink | Reply
    Tags: A-To-Z, George Berkeley, numerical integration

    The Summer 2017 Mathematics A To Z: Integration 


    One more mathematics term suggested by Gaurish for the A-To-Z today, and then I’ll move on to a couple of others. Today’s is a good one.

    Integration.

    Stand on the edge of a plot of land. Walk along its boundary. As you walk the edge pay attention. Note how far you walk before changing direction, even in the slightest. When you return to where you started consult your notes. Contained within them is the area you circumnavigated.

    If that doesn’t startle you perhaps you haven’t thought about how odd that is. You don’t ever touch the interior of the region. You never do anything like see how many standard-size tiles would fit inside. You walk a path that is as close to one-dimensional as your feet allow. And encoded in there somewhere is an area. Stare at that incongruity and you realize why integrals baffle the student so. They have a deep strangeness embedded in them.
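
    For a plot with straight sides the trick can be written down in a few lines. What follows is a sketch of the “shoelace” formula, which needs only the corners of the boundary, listed in the order you walk past them. The function name and the example plot are made up for illustration.

```python
def shoelace_area(corners):
    """Area enclosed by a polygon, given only its corners in walking order."""
    total = 0
    for i in range(len(corners)):
        x0, y0 = corners[i]
        x1, y1 = corners[(i + 1) % len(corners)]   # wrap around to the start
        total += x0 * y1 - x1 * y0
    return abs(total) / 2

# A 4-by-3 rectangular plot, walked counterclockwise.
plot = [(0, 0), (4, 0), (4, 3), (0, 3)]
print(shoelace_area(plot))   # prints 12.0
```

    Nothing in there ever looks at a point inside the plot.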

    We who do mathematics have always liked integration. Integrals grow, in the western tradition, out of geometry. Given a shape, what is a square that has the same area? There are shapes it’s easy to find the area for, given only straightedge and compass: a rectangle? Easy. A triangle? Just as straightforward. A polygon? If you know triangles then you know polygons. A lune, the crescent-moon shape formed by taking a circular cut out of a circle? We can do that. (If the cut is the right size.) A circle? … All right, we can’t do that, but we spent two thousand years trying before we found that out for sure. And we can do some excellent approximations.

    That bit of finding-a-square-with-the-same-area was called “quadrature”. The name survives, mostly in the phrase “numerical quadrature”. We use that to mean that we computed an integral’s approximate value, instead of finding a formula that would get it exactly. The otherwise obvious choice of “numerical integration” we use already. It describes computing the solution of a differential equation. We’re not trying to be difficult about this. Solving a differential equation is a kind of integration, and we need to do that a lot. We could recast a solving-a-differential-equation problem as a find-the-area problem, and vice-versa. But that’s bother, if we don’t need to, and so we talk about numerical quadrature and numerical integration.

    Integrals are built on two infinities. This is part of why it took so long to work out their logic. One is the infinity of number; we find an integral’s value, in principle, by adding together infinitely many things. The other is an infinity of smallness. The things we add together are infinitesimally small. That we need to take things, each smaller than any number yet somehow not zero, and in such quantity that they add up to something, seems paradoxical. Their geometric origins had to be merged into that of arithmetic, of algebra, and it is not easy. Bishop George Berkeley made a steady name for himself in calculus textbooks by pointing this out. We have worked out several logically consistent schemes for evaluating integrals. They work, mostly, by showing that we can make the error caused by approximating the integral smaller than any margin we like. This is a standard trick, or at least it is, now that we know it.

    That “in principle” above is important. We don’t actually work out an integral by finding the sum of infinitely many, infinitely tiny, things. It’s too hard. I remember in grad school the analysis professor working out by the proper definitions the integral of 1. This is as easy an integral as you can do without just integrating zero. He escaped with his life, but it was a close scrape. He offered the integral of x as a way to test our endurance, without actually doing it. I’ve never made it through that.
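
    A computer can at least show the shape of what the definition demands. This sketch adds up rectangles under f(x) = x between 0 and 1, using exact fractions; the choice of right-hand endpoints and the function name are assumptions made for the example, not the professor’s method.

```python
from fractions import Fraction

def right_riemann_sum_of_x(n):
    """Total area of n rectangles of width 1/n under f(x) = x on [0, 1]."""
    width = Fraction(1, n)
    # The k-th rectangle has height f(k/n) = k/n and the common width 1/n.
    return sum((k * width) * width for k in range(1, n + 1))

for n in (1, 10, 100, 1000):
    s = right_riemann_sum_of_x(n)
    print(n, s, "error:", s - Fraction(1, 2))
```

    The sum works out to (n + 1)/(2n), so the error is exactly 1/(2n). Name any margin you like and some n gets the error under it, which is the “smaller than any margin” idea in miniature.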

    But we do integrals anyway. We have tools on our side. We can show, for example, that if a function obeys some common rules then we can use simpler formulas. Ones that don’t demand so many symbols in such tight formation. Ones that we can use in high school. Also, ones we can adapt to numerical computing, so that we can let machines give us answers which are near enough right. We get to choose how near is “near enough”. But then the machines decide how long we’ll have to wait to get that answer.
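
    Here is a sketch of that bargain between us and the machine, using the trapezoid rule. We name the tolerance; the code keeps doubling the number of slices until two estimates in a row agree that closely. It’s a naive stopping rule that an unlucky function can fool, and the names trapezoid and quadrature are made up for the example.

```python
import math

def trapezoid(f, a, b, n):
    """Trapezoid-rule estimate of the integral of f from a to b, using n slices."""
    h = (b - a) / n
    return h * (f(a) / 2 + sum(f(a + k * h) for k in range(1, n)) + f(b) / 2)

def quadrature(f, a, b, tol):
    """Double the slice count until successive estimates differ by less than tol."""
    n = 1
    old = trapezoid(f, a, b, n)
    while True:
        n *= 2
        new = trapezoid(f, a, b, n)
        if abs(new - old) < tol:
            return new, n
        old = new

# The integral of sine from 0 to pi is exactly 2.
for tol in (1e-2, 1e-4, 1e-6):
    value, slices = quadrature(math.sin, 0.0, math.pi, tol)
    print(tol, value, slices)
```

    Tightening the tolerance raises the slice count, which is the machine deciding how long we wait.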

    The greatest tool we have on our side is the Fundamental Theorem of Calculus. Even the name promises it’s the greatest tool we might have. This rule tells us how to connect integrating a function to differentiating another function. If we can find a function whose derivative is the thing we want to integrate, then we have a formula for the integral. It’s that function we found. What a fantastic result.

    The trouble is it’s so hard to find functions whose derivatives are the thing we wanted to integrate. There are a lot of functions we can find, mind you. If we want to integrate a polynomial it’s easy. Sine and cosine and even tangent? Yeah. Logarithms? A little tedious but all right. A constant number raised to the power x? Also tedious but doable. A constant number raised to the power x^2? Hold on there, that’s madness. No, we can’t do that.
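
    Not having a formula doesn’t stop us getting numbers. As a sketch, take the constant to be 1/e, so the thing to integrate is e^{-x^2}, and estimate its integral from 0 to 1 with Simpson’s rule. The choice of constant, the interval, and the function name simpson are all assumptions for the example. The comparison value uses the error function, erf, which is a special function defined by essentially this integral.

```python
import math

def simpson(f, a, b, n=100):
    """Composite Simpson's rule with n slices; n has to be even."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        # Interior points alternate between weight 4 (odd k) and weight 2 (even k).
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

estimate = simpson(lambda x: math.exp(-x * x), 0.0, 1.0)
reference = math.sqrt(math.pi) / 2 * math.erf(1.0)
print(estimate, reference)
```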

    There is a weird grab-bag of functions we can find these integrals for. They’re mostly ones we can find some integration trick for. An integration trick is some way to turn the integral we’re interested in into a couple of integrals we can do and then mix back together. A lot of a Freshman Calculus course is a heap of tricks we’ve learned. They have names like “u-substitution” and “integration by parts” and “trigonometric substitution”. Some of them are really exotic, such as turning a single integral into a double integral because that leads us to something we can do. And there’s something called “differentiation under the integral sign” that I don’t know of anyone actually using. People know of it because Richard Feynman, in his fun memoir Surely You’re Joking, Mr. Feynman!: 250 Pages Of How Awesome I Was In Every Situation Ever, mentions how awesome it made him in so many situations. Mathematics, physics, and engineering nerds are required to read this at an impressionable age, so we fall in love with a technique no textbook ever mentions. Sorry.

    I’ve written about all this as if we were interested just in areas. We’re not. We like calculating lengths and volumes and, if we dare venture into more dimensions, hypervolumes and the like. That’s all right. If we understand how to calculate areas, we have the tools we need. We can adapt them to as many or as few dimensions as we need. By weighting integrals we can do calculations that tell us about centers of mass and moments of inertia, about the most and least probable values of something, about all of quantum mechanics.

    As often happens, this powerful tool starts with something anyone might ponder: what size square has the same area as this other shape? And then think seriously about it.

     
    • gaurish 7:24 am on Saturday, 19 August, 2017 Permalink | Reply

      I myself tried to write about integration a couple of years ago, but failed. This is much better. My favourite statement: “Their geometric origins had to be merged into that of arithmetic, of algebra, and it is not easy.”

      Liked by 1 person

      • Joseph Nebus 12:33 am on Thursday, 24 August, 2017 Permalink | Reply

        Aw, thank you kindly. It may be worth your trying to write again. We all come to new perspectives with time, and a variety of views are good for people trying to find one that helps them understand a thing.

        Liked by 1 person

      • elkement (Elke Stangl) 7:01 pm on Wednesday, 30 August, 2017 Permalink | Reply

        My favorite statement from this article is: “Integrals are built on two infinities. This is part of why it took so long to work out their logic. “

        Liked by 1 person

        • Joseph Nebus 1:14 am on Friday, 8 September, 2017 Permalink | Reply

          That was one of those happy sentences that’s really the whole essay, and everything else was just the run-up and the relaxation from. Have one of those and the rest of the writing is easy.

          Liked by 1 person

  • Joseph Nebus 6:00 pm on Wednesday, 16 August, 2017 Permalink | Reply
    Tags: A-To-Z, rank

    The Summer 2017 Mathematics A To Z: Height Function (elliptic curves) 


    I am one letter closer to the end of Gaurish’s main block of requests. They’re all good ones, mind you. This gets me back into elliptic curves and Diophantine equations. I might be writing about the wrong thing.

    Height Function.

    My love’s father has a habit of asking us to rate our hobbies. This turned into a new running joke over a family vacation this summer. It’s a simple joke: I shuffled the comparables. “Which is better, Bon Jovi or a roller coaster?” It’s still a good question.

    But as genial yet nasty as the spoof is, my love’s father asks natural questions. We always want to compare things. When we form a mathematical construct we look for ways to measure it. There’s typically something. We’ll put one together. We call this a height function.

    We start with an elliptic curve. The coordinates of the points on this curve satisfy some equation. Well, there are many equations they satisfy. We pick one representation for convenience. The convenient thing is to have an easy-to-calculate height. We’ll write the equation for the curve as

    y^2 = x^3 + Ax + B

    Here both ‘A’ and ‘B’ are some integers. This form might be unique, depending on whether a slightly fussy condition on prime numbers holds. (Specifically, if ‘p’ is a prime number and ‘p^4’ divides into ‘A’, then ‘p^6’ must not divide into ‘B’. Yes, I know you realized that right away. But I write to a general audience, some of whom are learning how to see these things.) Then the height of this curve is whichever is the larger number, four times the cube of the absolute value of ‘A’, or 27 times the square of ‘B’. I ask you to just run with it. I don’t know the implications of the height function well enough to say why, oh, 25 times the square of ‘B’ wouldn’t do as well. The usual reason for something like that is that some obvious manipulation makes the 27 appear right away, or disappear right away.
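
    The height itself takes one line to compute. A sketch, with the function name curve_height being my own label for it:

```python
def curve_height(A, B):
    """Height of the curve y^2 = x^3 + A x + B: the larger of 4|A|^3 and 27 B^2."""
    return max(4 * abs(A) ** 3, 27 * B ** 2)

# A few example curves, picked arbitrarily.
for A, B in [(-1, 0), (0, 1), (-2, 3)]:
    print(A, B, curve_height(A, B))
```

    This doesn’t check the fussy condition on prime numbers. It assumes you’ve handed it an ‘A’ and ‘B’ already in that form.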

    This idea of height feeds in to a measure called rank. “Rank” is a term the young mathematician encounters first while learning matrices. It’s the number of rows in a matrix that aren’t equal to some sum or multiple of other rows. That is, it’s how many different things there are among a set. You can see why we might find that interesting. So many topics have something called “rank” and it measures how many different things there are in a set of things. In elliptic curves, the rank is a measure of how complicated the curve is. We can imagine the rational points on the elliptic curve as things generated by some small set of starter points. The starter points have to be of infinite order. Starter points that don’t, don’t count for the rank. Please don’t worry about what “infinite order” means here. I only mention this infinite-order business because if I don’t then something I have to say about two paragraphs from here will sound daft. So, the rank is how many of these starter points you need to generate the elliptic curve. (WARNING: Call them “generating points” or “generators” during your thesis defense.)
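
    The matrix version of rank, at least, is something a short program can find. This sketch counts the independent rows by elimination with exact fractions. It’s for the matrix rank only; nothing this simple is known for the rank of an elliptic curve. The function name matrix_rank is made up for the example.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Count the rows that aren't sums or multiples of other rows."""
    M = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(M[0]) if M else 0
    rank = 0
    for col in range(n_cols):
        # Look for a not-yet-used row with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(M)) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[rank], M[pivot] = M[pivot], M[rank]
        # Subtract multiples of the pivot row to clear the column beneath it.
        for r in range(rank + 1, len(M)):
            factor = M[r][col] / M[rank][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[rank])]
        rank += 1
    return rank

# The second row is twice the first, so only two rows are really different.
print(matrix_rank([[1, 2, 3], [2, 4, 6], [1, 0, 1]]))   # prints 2
```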

    There’s no known way of guessing what the rank is if you just know ‘A’ and ‘B’. There are algorithms that can calculate the rank given a particular ‘A’ and ‘B’. But it’s not something like the quadratic formula where you can just do a quick calculation and know what you’re looking for. We don’t even know if the algorithms we have will work for every elliptic curve.

    We think that there’s no limit to the rank of elliptic curves. We don’t know this. We know there exist curves with ranks as high as 28. They seem to be rare [*]. I don’t know if that’s proven. But we do know there are elliptic curves with rank zero. A lot of them, in fact. (See what I meant two paragraphs back?) These are the elliptic curves that have only finitely many rational points on them.

    And there’s a lot of those. There’s a well-respected conjecture that the average rank, of all the elliptic curves there are, is ½. It might be. What we have been able to prove is that the average rank is less than or equal to 1.17. Also that it should be larger than zero. So we’re maybe closing in on the ½ conjecture? At least we know something. I admit that with this essay I’ve started wondering what we do know of elliptic curves.

    What do the height, and through it the rank, get us? I worry I’m repeating myself. By themselves they give us families of elliptic curves. Shapes that are similar in a particular and not-always-obvious way. And they feed into the Birch and Swinnerton-Dyer conjecture, which is the hipster’s Riemann Hypothesis. That is, it’s this big, unanswered, important problem that would, if answered, tell us things about a lot of questions that I’m not sure can be concisely explained. At least not why they’re interesting. We know some special cases, at least. Wikipedia tells me nothing’s proved for curves with rank greater than 1. Humanity’s ignorance on this point makes me feel slightly better pondering what I don’t know about elliptic curves.

    (There are some other things within the field of elliptic curves called height functions. There’s particularly a height of individual points. I was unsure which height Gaurish found interesting so chose one. The other starts by measuring something different; it views, for example, \frac{1}{2} as having a lower height than does \frac{51}{101} , even though the numbers are quite close in value. It develops along similar lines, trying to find classes of curves with similar behavior. And it gets into different unsolved conjectures. We have our ideas about how to think of fields.)
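
    For that other height, the usual starting point for a rational number is the larger of its numerator and denominator, sign ignored, once the fraction is in lowest terms. I’m assuming that is the measure meant above; the function name naive_height is my own label. A sketch:

```python
from fractions import Fraction

def naive_height(q):
    """Larger of |numerator| and |denominator| of q written in lowest terms."""
    q = Fraction(q)   # Fraction reduces to lowest terms automatically
    return max(abs(q.numerator), abs(q.denominator))

print(naive_height(Fraction(1, 2)))      # prints 2
print(naive_height(Fraction(51, 101)))   # prints 101
```

    The two numbers are about 0.005 apart in value and very far apart in height.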


    [*] Wikipedia seems to suggest we only know of one, provided by Professor Noam Elkies in 2006, and let me quote it in full. I apologize that it isn’t in the format I suggested at top was standard. Elkies way outranks me academically so we have to do things his way:

    y^2 + xy + y = x^3 - x^2 -  20,067,762,415,575,526,585,033,208,209,338,542,750,930,230,312,178,956,502 x + 34,481,611,795,030,556,467,032,985,690,390,720,374,855,944,359,319,180,361,266,008,296,291,939,448,732,243,429

    I can’t figure how to get WordPress to present that larger. I sympathize. I’m tired just looking at an equation like that. This page lists records of known elliptic curve ranks. I don’t know if the lack of any records more recent than 2006 reflects the page not having been updated or nobody having found a rank-29 curve. I fully accept the field might be more difficult than even doing maintenance on a web page’s content is.

     
    • gaurish 6:45 pm on Thursday, 17 August, 2017 Permalink | Reply

      Yet another beautiful post. You may like this lecture about the BSD conjecture: https://youtu.be/2gbQWIzb6Dg

      Like

      • Joseph Nebus 3:05 am on Saturday, 19 August, 2017 Permalink | Reply

        Thank you! And also thank you for the link. I never think to look for videos that would explain topics. In many ways I still think of the Internet as being in about 1998, when video was nothing more than a theoretical possibility that might someday finish loading before it glitches out.

        Like

  • Joseph Nebus 6:00 pm on Monday, 14 August, 2017 Permalink | Reply
    Tags: A-To-Z

    The Summer 2017 Mathematics A To Z: Gaussian Primes 


    Once more do I have Gaurish to thank for the day’s topic. (There’ll be two more chances this week, providing I keep my writing just enough ahead of deadline.) This one doesn’t touch category theory or topology.

    Gaussian Primes.

    I keep touching on group theory here. It’s a field that’s about what kinds of things can work like arithmetic does. A group is a set of things that you can add together. At least, you can do something that works like adding regular numbers together does. A ring is a set of things that you can add and multiply together.

    There are many interesting rings. Here’s one. It’s called the Gaussian Integers. They’re made of numbers we can write as a + b\imath , where ‘a’ and ‘b’ are some integers. \imath is what you figure, that number that multiplied by itself is -1. These aren’t the complex-valued numbers, you notice, because ‘a’ and ‘b’ are always integers. But you add them together the way you add complex-valued numbers together. That is, a + b\imath plus c + d\imath is the number (a + c) + (b + d)\imath . And you multiply them the way you multiply complex-valued numbers together. That is, a + b\imath times c + d\imath is the number (a\cdot c - b\cdot d) + (a\cdot d + b\cdot c)\imath .
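
    Those two rules fit in a few lines of code. In this sketch a Gaussian integer a + b\imath is stored as the pair (a, b); that representation and the names g_add and g_mul are choices made for the example.

```python
def g_add(z, w):
    """(a + bi) + (c + di) = (a + c) + (b + d)i"""
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def g_mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (ad + bc)i"""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

print(g_add((1, 2), (3, -5)))   # prints (4, -3)
print(g_mul((0, 1), (0, 1)))    # i times i: prints (-1, 0)
```

    Integers go in and integers come out, which is what makes this a ring of its own rather than just a corner of the complex-valued numbers.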

    We created something that has addition and multiplication. It picks up subtraction for free. It doesn’t have division. We can create rings that do, but this one won’t, any more than regular old integers have division. But we can ask what other normal-arithmetic-like stuff these Gaussian integers do have. For instance, can we factor numbers?

    This isn’t an obvious one. No, we can’t expect to be able to divide one Gaussian integer by another. But we can’t expect to divide a regular old integer by another, not and get an integer out of it. That doesn’t mean we can’t factor them. It means we divide the regular old integers into a couple classes. There’s prime numbers. There’s composites. There’s the unit, the number 1. There’s zero. We know prime numbers; they’re 2, 3, 5, 7, and so on. Composite numbers are the ones you get by multiplying prime numbers together: 4, 6, 8, 9, 10, and so on. 1 and 0 are off on their own. Leave them there. We can’t divide any old integer by any old integer. But we can say an integer is equal to this string of prime numbers multiplied together. This gives us a handle by which we can prove a lot of interesting results.

    We can do the same with Gaussian integers. We can divide them up into Gaussian primes, Gaussian composites, units, and zero. The words mean what they mean for regular old integers. A Gaussian composite can be factored into a product of Gaussian primes. Gaussian primes can’t be factored any further.

    If we know what the prime numbers are for regular old integers we can tell whether something’s a Gaussian prime. Admittedly, knowing all the prime numbers is a challenge. But a Gaussian integer a + b\imath will be prime whenever a couple simple-to-test conditions are true. First is if ‘a’ and ‘b’ are both not zero, but a^2 + b^2 is a prime number. So, for example, 5 + 4\imath is a Gaussian prime.

    You might ask, hey, would -5 - 4\imath also be a Gaussian prime? That’s also got components that are integers, and the squares of them add up to a prime number (41). Well-spotted. Gaussian primes appear in quartets. If a + b\imath is a Gaussian prime, so is -a -b\imath . And so are -b + a\imath and b - a\imath .

    There’s another group of Gaussian primes. These are the numbers a + b\imath where either ‘a’ or ‘b’ is zero. Then the other one has to be, apart from its sign, a regular old prime number. And if it’s positive, it has to be three more than a whole multiple of four. If it’s negative, then it’s three less than a whole multiple of four. So ‘3’ is a Gaussian prime, as is -3, and as is 3\imath and so is -3\imath .
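
    The two groups together make a test that’s short to code. This sketch stores a + b\imath as the pair (a, b). For the second group it takes the nonzero part to be, sign aside, an ordinary prime that leaves remainder three on division by four. The function names are made up for the example, and the trial-division is_prime is only meant for small numbers.

```python
def is_prime(n):
    """Ordinary primality by trial division; fine for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def is_gaussian_prime(a, b):
    """True if a + bi is a Gaussian prime."""
    if a != 0 and b != 0:
        # First group: neither part is zero and a^2 + b^2 is an ordinary prime.
        return is_prime(a * a + b * b)
    # Second group: one part is zero. Take the size of the other part.
    n = abs(a + b)
    return is_prime(n) and n % 4 == 3

for z in [(5, 4), (-5, -4), (3, 0), (0, -3), (2, 0), (5, 0), (15, 0)]:
    print(z, is_gaussian_prime(*z))
```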

    This has strange effects. Like, ‘3’ is a prime number in the regular old scheme of things. It’s also a Gaussian prime. But familiar other prime numbers like ‘2’ and ‘5’? Not anymore. Two is equal to (1 + \imath) \cdot (1 - \imath) ; both of those terms are Gaussian primes. Five is equal to (2 + \imath) \cdot (2 - \imath) . There are similar shocking results for 13. But, roughly, the world of composites and prime numbers translates into Gaussian composites and Gaussian primes. In this slightly exotic structure we have everything familiar about factoring numbers.

    You might have some nagging thoughts. Like, sure, two is equal to (1 + \imath) \cdot (1 - \imath) . But isn’t it also equal to (1 + \imath) \cdot (1 - \imath) \cdot \imath \cdot (-\imath) ? One of the important things about prime numbers is that every composite number is the product of a unique string of prime numbers. Do we have to give that up for Gaussian integers?

    Good nag. But no; the doubt is coming about because you’ve forgotten the difference between “the positive integers” and “all the integers”. If we stick to positive whole numbers then, yeah, (say) ten is equal to two times five and no other combination of prime numbers. But suppose we have all the integers, positive and negative. Then ten is equal to either two times five or it’s equal to negative two times negative five. Or, better, it’s equal to negative one times two times negative one times five. Or any of those times any even number of negative ones.

    Remember that bit about separating ‘one’ out from the world of primes and composites? That’s because the number one screws up these unique factorizations. You can always toss in extra factors of one, to taste, without changing the product of something. If we have positive and negative integers to use, then negative one does almost the same trick. We can toss in any even number of extra negative ones without changing the product. This is why we separate “units” out of the numbers. They’re not part of the prime factorization of any numbers.

    For the Gaussian integers there are four units. 1 and -1, \imath and -\imath . They are neither primes nor composites, and we don’t worry about how they would otherwise multiply the number of factorizations we get.

    But let me close with a neat, easy-to-understand puzzle. It’s called the moat-crossing problem. In the regular old integers it’s this: imagine that the prime numbers are islands in a dangerous sea. You start on the number ‘2’. Imagine you have a board that can be set down and safely crossed, then picked up to be put down again. Could you get from the start and go off to safety, which is infinitely far away? If your board is some, fixed, finite length?

    No, you can’t. The problem amounts to how big the gap between one prime number and the next-larger prime number can be. It turns out there’s no limit to that. That is, you give me a number, as small or as large as you like. I can find some prime number whose successor is more than your number away from it. There are arbitrarily large gaps between prime numbers.
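
    One standard way to see that the gaps have no limit: for any n of at least 2, the numbers n! + 2, n! + 3, and so on up to n! + n are all composite, since n! + k is divisible by k. That’s a run of n - 1 composites in a row. A minimal sketch of that argument, with made-up function names:

```python
from math import factorial


def is_prime(n):
    """Trial-division primality test for ordinary integers."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def composite_run(n):
    """The n - 1 consecutive integers n! + 2, ..., n! + n."""
    base = factorial(n)
    return [base + k for k in range(2, n + 1)]


run = composite_run(10)
print(len(run))                        # 9 consecutive numbers
print(any(is_prime(m) for m in run))   # False: no board shorter than the run gets across
```

    This is a sketch of why long gaps exist, not a search for the first gap of a given size; those turn up much earlier than n! does.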

    Gaussian primes, though? Since a Gaussian prime might have nearest neighbors in any direction? Nobody knows. We know there are arbitrarily large gaps. Pick a moat size; we can (eventually) find a Gaussian prime that’s at least that far away from its nearest neighbors. But this does not say whether it’s impossible to get from the smallest Gaussian primes (1 + \imath and its companions -1 + \imath and on) infinitely far away. We know there’s a moat of width 6 separating the origin of things from infinity. We don’t know whether there are bigger ones.

    You’re not going to solve this problem. Unless I have more brilliant readers than I know about; if I have ones who can solve this problem then I might be too intimidated to write anything more. But there is surely a pleasant pastime, maybe a charming game, to be made from this. Try finding the biggest possible moats around some set of Gaussian prime islands.

    Ellen Gethner, Stan Wagon, and Brian Wick’s A Stroll Through the Gaussian Primes describes this moat problem. It also sports some fine pictures of where the Gaussian primes are and what kinds of moats you can find. If you don’t follow the reasoning, you can still enjoy the illustrations.

     
  • Joseph Nebus 6:00 pm on Friday, 11 August, 2017 Permalink | Reply
    Tags: A-To-Z, computer programming, contravariant, covariant, functors

    The Summer 2017 Mathematics A To Z: Functor 


    Gaurish gives me another topic for today. I’m now no longer sure whether Gaurish hopes for me to become a topology blogger or a category theory blogger. I have the last laugh, though. I’ve wanted to get better-versed in both fields and there’s nothing like explaining something to learn about it.

    Functor.

    So, category theory. It’s a foundational field. It talks about stuff that’s terribly abstract. This means it’s powerful, but it can be hard to think of interesting examples. I’ll try, though.

    It starts with categories. These have three parts. The first part is a set of things. (There always is.) The second part is a collection of matches between pairs of things in the set. They’re called morphisms. The third part is a rule that lets us combine two morphisms into a new, third one. That is, suppose ‘a’, ‘b’, and ‘c’ are things in the set. Then there’s a morphism that matches a \rightarrow b , and a morphism that matches b \rightarrow c . And we can combine them into another morphism that matches a \rightarrow c . So we have a set of things, and a set of things we can do with those things. And the set of things we can do has structure of its own, a lot like a group’s: combining is associative, and every thing gets an identity morphism that matches it to itself.

    This describes a lot of stuff. Group theory fits seamlessly into this description. Most of what we do with numbers is a kind of group theory. Vector spaces do too. Most of what we do with analysis has vector spaces underneath it. Topology does too. Most of what we do with geometry is an expression of topology. So you see why category theory is so foundational.

    Functors enter our picture when we have two categories. Or more. They’re about the ways we can match up categories. But let’s start with two categories. One of them I’ll name ‘C’, and the other, ‘D’. A functor has to match everything that’s in the set of ‘C’ to something that’s in the set of ‘D’.

    And it does more. It has to match every morphism between things in ‘C’ to some other morphism, between corresponding things in ‘D’. It’s got to do it in a way that satisfies that combining, too. That is, suppose that ‘f’ and ‘g’ are morphisms for ‘C’. And that ‘f’ and ‘g’ combine to make ‘h’. Then, the functor has to match ‘f’ and ‘g’ and ‘h’ to some morphisms for ‘D’. The combination of whatever ‘f’ matches to and whatever ‘g’ matches to has to be whatever ‘h’ matches to.

    This might sound to you like a homomorphism. If it does, I admire your memory or mathematical prowess. Functors are about matching one thing to another in a way that preserves structure. Structure is the way that sets of things can interact. We naturally look for stuff made up of different things that have the same structure. Yes, functors are themselves a category. That is, you can make a brand-new category whose set of things are the functors between two other categories. This is a good spot to pause while the dizziness passes.
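
    While the dizziness passes, here is a small sketch of the combining rule from above. It makes a loud assumption that is not in the essay: treat kinds of data as the “things” and ordinary functions as the morphisms, so that “make a list of it” plays the functor. The names `compose` and `fmap` are chosen for this illustration.

```python
def compose(g, f):
    """Combine two morphisms: do f first, then g."""
    return lambda x: g(f(x))


def fmap(f):
    """The list functor on morphisms: turn f into a function acting on whole lists."""
    return lambda xs: [f(x) for x in xs]


f = lambda n: n + 1       # a morphism from integers to integers
g = lambda n: str(n)      # a morphism from integers to strings
h = compose(g, f)         # 'f' and 'g' combine to make 'h'

sample = [1, 2, 3]
via_h = fmap(h)(sample)                           # whatever 'h' matches to
via_parts = compose(fmap(g), fmap(f))(sample)     # combination of what 'f' and 'g' match to

print(via_h)                # ['2', '3', '4']
print(via_h == via_parts)   # True
```

    Checking one sample list is not a proof, of course. It only shows what the requirement looks like when it holds.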

    There are two kingdoms of functor. You tell them apart by what they do with the morphisms. Here again I’m going to need my categories ‘C’ and ‘D’. I need a morphism for ‘C’. I’ll call that ‘f’. ‘f’ has to match something in the set of ‘C’ to something in the set of ‘C’. Let me call the first something ‘a’, and the second something ‘b’. That’s all right so far? Thank you.

    Let me call my functor ‘F’. ‘F’ matches all the elements in ‘C’ to elements in ‘D’. And it matches all the morphisms on the elements in ‘C’ to morphisms on the elements in ‘D’. So if I write ‘F(a)’, what I mean is look at the element ‘a’ in the set for ‘C’. Then look at what element in the set for ‘D’ the functor matches with ‘a’. If I write ‘F(b)’, what I mean is look at the element ‘b’ in the set for ‘C’. Then pick out whatever element in the set for ‘D’ gets matched to ‘b’. If I write ‘F(f)’, what I mean is to look at the morphism ‘f’ between elements in ‘C’. Then pick out whatever morphism between elements in ‘D’ that that gets matched with.

    Here’s where I’m going with this. Suppose my morphism ‘f’ matches ‘a’ to ‘b’. Does the functor of that morphism, ‘F(f)’, match ‘F(a)’ to ‘F(b)’? Of course, you say, what else could it do? And the answer is: why couldn’t it match ‘F(b)’ to ‘F(a)’?

    No, it doesn’t break everything. Not if you’re consistent about swapping the order of the matchings. The normal everyday order, the one you’d thought couldn’t have an alternative, is a “covariant functor”. The crosswise order, this second thought, is a “contravariant functor”. Covariant and contravariant are distinctions that weave through much of mathematics. They particularly appear through tensors and the geometry they imply. In that introduction they tend to be difficult, even mean, creations, since in regular old Euclidean space they don’t mean anything different. They’re different for non-Euclidean spaces, and that’s important and valuable. The covariant versus contravariant difference is easier to grasp here.
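
    A contravariant example, as a hedged sketch. The assumption here, again not from the essay, is that functions stand in for morphisms, and that ‘F’ sends a kind of data to the yes-or-no tests you can run on it. A function from ‘a’ to ‘b’ then turns tests on ‘b’ into tests on ‘a’, which is the crosswise order. The name `contramap` is chosen for this illustration.

```python
def compose(g, f):
    """Combine two morphisms: do f first, then g."""
    return lambda x: g(f(x))


def contramap(f):
    """Send f: a -> b to a function taking tests on b to tests on a."""
    return lambda test_on_b: (lambda x: test_on_b(f(x)))


f = len                       # strings -> integers
g = lambda n: n % 2           # integers -> integers
is_zero = lambda n: n == 0    # a test on integers

# Consistent swapping: F(g after f) is F(f) after F(g), in that reversed order.
one_step = contramap(compose(g, f))(is_zero)
two_steps = compose(contramap(f), contramap(g))(is_zero)

words = ["", "a", "ab", "abc"]
print([one_step(w) for w in words])    # [True, False, True, False]
print([two_steps(w) for w in words])   # [True, False, True, False]
```

    Both tests ask whether a word has even length; they just get there by combining the pieces in opposite orders.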

    Functors work their way into computer science. The avenue here is in functional programming. That’s a method of programming in which instead of the normal long list of commands, you write a single line of code that holds like fourteen “->” symbols that makes the computer stop and catch fire when it encounters a bug. The advantage is that when you have the code debugged it’s quite speedy and memory-efficient. The disadvantage is if you have to alter the function later, it’s easiest to throw everything out and start from scratch, beginning from vacuum-tube-based computing machines. But it works well while it does. You just have to get the hang of it.

     
    • gaurish 9:55 am on Saturday, 12 August, 2017 Permalink | Reply

      Can you suggest a nice introductory book on category theory for beginners? What I understand is that they generalize the notions defined concretely in algebra (which were motivated by arithmetic), but I lack any concrete understanding.


    • mathtuition88 2:56 pm on Sunday, 13 August, 2017 Permalink | Reply

      “Categories for the Working Mathematician” by Mac Lane is good and foundational (recommended for serious readers). Another book “Cakes, Custard and Category Theory” by Eugenia Cheng is accessible even to laymen.


      • Joseph Nebus 5:08 pm on Sunday, 13 August, 2017 Permalink | Reply

        I’m grateful to MathTuition88 for the suggestion. I’m afraid I’m poorly-enough read in category theory I don’t have any good idea where beginners ought to start.


    • elkement (Elke Stangl) 1:59 pm on Friday, 18 August, 2017 Permalink | Reply

      May I ask a computer science question ;-) ? I tried to understand how this functor from category theory would be mapped onto (Ha – another level of mapping!! ;-)) a functor in C++ but was not very successful. In this discussion https://stackoverflow.com/questions/356950/c-functors-and-their-uses somebody says that a functor in category theory ‘has nothing to do with the C++ concept of functor’.

      Would you agree? Or if not, can you maybe explain how an ‘implementation’ of your functor example would look like in C++ (or some pseudo-code in some language…). Or keep that in mind for a future post if you ever want to return to that subject!

      Anyway: I really enjoy this series!!


      • Joseph Nebus 3:29 am on Saturday, 19 August, 2017 Permalink | Reply

        Hoo, boy, that’s a good question. I’m afraid I don’t have proper computer science training; what I do know is what I’ve picked up trying to do specific problems. In my defense, many of them lately have been database-related stuff that can benefit from these tools. Any time I need to impress the boss, I do a crash course of reading Stack Overflow for a couple weeks and rewrite some core bit of code until it breaks differently. But I will try, with the warning that I am speaking outside my actual proper training.

        To me, I see a reasonably straightforward connection between category-theory functors and C++ functors. We can look at functors as ways to match unary functions to other unary functions. This seems to me a good bit of what we’d do with C++ functors, describing ways to manipulate data without needing to know much about what the data is. If I may offer a counterbalancing Stack Overflow thread, https://stackoverflow.com/questions/2030863/in-functional-programming-what-is-a-functor has several people who seem to know what they’re talking about arguing in favor of programming-functors being enough like category-theory-functors to be enlightening.

        My understanding is that the functors of programming language Haskell are more obviously category-theory functors. But I haven’t done anything in Haskell, so I can’t say what is particularly good about doing this.


        • elkement (Elke Stangl) 6:56 pm on Wednesday, 30 August, 2017 Permalink | Reply

          Thanks, that was very helpful! Reading this discussion on Stack Overflow reminded me of Lisp – and then I googled for Lisp + Functors … https://en.wikipedia.org/wiki/Function_object#In_Lisp_and_Scheme – think I got it now: Quote from Wikipedia: “Many uses of functors in languages like C++ are simply emulations of the missing closure constructor. Since the programmer cannot directly construct a closure, they must define a class that has all of the necessary state variables, and also a member function.”
          It’s funny that the concept of closure feels rather natural in Lisp – not that complicated, or at least less complicated than the explanations of Functor sound…


          • Joseph Nebus 1:11 am on Friday, 8 September, 2017 Permalink | Reply

            Thank you, and let me offer something I keep not being able to believe I forget. John D Cook offers, among his many Twitter feeds, the Functor Fact of the Day account: https://twitter.com/FunctorFact

            It does go through phases of being about category theory directly and phases of being about programming, which helps me feel better thinking of what I’ve said about functors.


  • Joseph Nebus 6:00 pm on Wednesday, 9 August, 2017 Permalink | Reply
    Tags: A-To-Z

    The Summer 2017 Mathematics A To Z: Elliptic Curves 


    Gaurish, of the For The Love Of Mathematics blog, gives me another subject today. It’s one that isn’t about ellipses. Sad to say it’s also not about elliptic integrals. This is sad to me because I have a cute little anecdote about a time I accidentally gave my class an impossible problem. I did apologize. No, nobody solved it anyway.

    Elliptic Curves.

    Elliptic Curves start, of course, with polynomials. Particularly, they’re polynomials with two variables. We call them ‘x’ and ‘y’ because we have no reason to be difficult. They’re of at most third degree. That is, we can have terms like ‘x’ and ‘y^2’ and ‘x^2 y’ and ‘y^3’. Something with higher powers, like ‘x^4’ or ‘x^2 y^2’ (a fourth power, all together) is right out. Doesn’t matter. Start from this and we can do some slick changes of variables so that we can rewrite it to look like this:

    y^2 = x^3 + Ax + B

    Here, ‘A’ and ‘B’ are some numbers that don’t change for this particular curve. Also, we need it to be true that 4A^3 + 27B^2 doesn’t equal zero. It avoids problems. What we’ll be looking at are coordinates, values of ‘x’ and ‘y’ together which make this equation true. That is, it’s points on the curve. If you pick some real numbers ‘A’ and ‘B’ and draw all the values of ‘x’ and ‘y’ that make the equation true you get … well, there’s different shapes. They all look like those microscope photos of a water drop emerging and falling from a tap, only rotated clockwise ninety degrees.

    So. Pick any of these curves that you like. Pick a point. I’m going to name your point ‘P’. Now pick a point once more. I’m going to name that point ‘Q’. Now draw a line from P through Q. Keep drawing it. It’ll cross the original elliptic curve again. And that point is … not actually special. What is special is the reflection of that point. That is, the same x-coordinate, but flip the plus or minus sign for the y-coordinate. (WARNING! Do not call it “the reflection” at your thesis defense! Call it the “conjugate” point. It means “reflection”.) Your elliptic curve will be symmetric around the x-axis. If, say, the point with x-coordinate 4 and y-coordinate 3 is on the curve, so is the point with x-coordinate 4 and y-coordinate -3. So that reflected point is … something special.

    Kind of a curved-out less-than-sign shape.

    y^2 = x^3 - 1 . The water drop bulges out from the surface.

    This lets us do something wonderful. We can think of this reflected point as the sum of your ‘P’ and ‘Q’. You can ‘add’ any two points on the curve and get a third point. This means we can do something that looks like addition for points on the elliptic curve. And this means the points on this curve are a group, and we can bring all our group-theory knowledge to studying them. It’s a commutative group, too; ‘P’ added to ‘Q’ leads to the same point as ‘Q’ added to ‘P’.

    Let me head off some clever thoughts that make fair objections. What if ‘P’ and ‘Q’ are already reflections, so the line between them is vertical? That never touches the original elliptic curve again, right? Yeah, fair complaint. We patch this by saying that there’s one more point, ‘O’, that’s off “at infinity”. Where is infinity? It’s wherever your vertical lines end. Shut up, this can too be made rigorous. In any case it’s a common hack for this sort of problem. When we add that, everything’s nice. The ‘O’ serves the role in this group that zero serves in arithmetic: the sum of point ‘O’ and any point ‘P’ is going to be ‘P’ again.

    Second clever thought to head off: what if ‘P’ and ‘Q’ are the same point? There’s infinitely many lines that go through a single point so how do we pick one to find an intersection with the elliptic curve? Huh? If you did that, then we pick the tangent line to the elliptic curve that touches ‘P’, and carry on as before.
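
    The whole recipe, patches included, fits in a few lines. This is a minimal sketch using exact fractions so nothing gets rounded; the function name `add` and the choice to represent ‘O’ as Python’s `None` are assumptions made for this illustration. The formulas are the usual chord-and-tangent ones for y^2 = x^3 + Ax + B , with the reflection already built in.

```python
from fractions import Fraction

O = None  # the extra point "at infinity"


def add(P, Q, A):
    """Add points P and Q on y^2 = x^3 + A*x + B (the formulas never need B)."""
    if P is O:
        return Q
    if Q is O:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and y1 == -y2:
        # P and Q are reflections of each other: the line is vertical.
        return O
    if P == Q:
        slope = (3 * x1 * x1 + A) / (2 * y1)   # tangent line at P
    else:
        slope = (y2 - y1) / (x2 - x1)          # line from P through Q
    x3 = slope * slope - x1 - x2               # third crossing of the curve
    y3 = slope * (x1 - x3) - y1                # its reflection across the x-axis
    return (x3, y3)


# Two points on y^2 = x^3 + 1 (so A = 0): 1 = 0 + 1 and 9 = 8 + 1.
P = (Fraction(0), Fraction(1))
Q = (Fraction(2), Fraction(3))

print(add(P, Q, 0) == (-1, 0))          # True: P + Q is the point (-1, 0)
print(add(Q, Q, 0) == (0, 1))           # True: doubling Q uses the tangent line
print(add(P, Q, 0) == add(Q, P, 0))     # True: the group is commutative
print(add(P, (Fraction(0), Fraction(-1)), 0) is O)   # True: reflections sum to O
```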

    The curved-out less-than-sign shape has a noticeable c-shaped bulge on the end.

    y^2 = x^3 + 1 . The water drop is close to breaking off, but surface tension has not yet pinched off the falling form.

    There’s more. What kind of number is ‘x’? Or ‘y’? I’ll bet that you figured they were real numbers. You know, ordinary stuff. I didn’t say what they were, so left it to our instinct, and that usually runs toward real numbers. Those are what I meant, yes. But we didn’t have to. ‘x’ and ‘y’ could be in other sets of numbers too. They could be complex-valued numbers. They could be just the rational numbers. They could even be part of a finite collection of possible numbers. As long as the equation y^2 = x^3 + Ax + B is something meaningful (and some technical points are met) we can carry on. The elliptic curves, and the points we “add” on them, might not look like the curves we started with anymore. They might not look like anything recognizable anymore. But the logic continues to hold. We still create these groups out of the points on these lines intersecting a curve.

    By now you probably admit this is neat stuff. You may also think: so what? We can take this thing you never thought about, draw points and lines on it, and make it look very loosely kind of like just adding numbers together. Why is this interesting? No appreciation just for the beauty of the structure involved? Well, we live in a fallen world.

    It comes back to number theory. The modern study of Diophantine equations grows out of studying elliptic curves on the rational numbers. It turns out the group of points you get for that looks like a finite collection of points with some collection of integers hanging on. How long that collection of numbers is is called the ‘rank’, and there are deep mysteries at work. We know there are elliptic curves that have a rank as big as 28. Nobody knows if the rank can be arbitrarily high, though. And I believe we don’t even know if there are any curves with rank of, like, 27, or 25.

    Yeah, I’m still sensing skepticism out there. Fine. We’ll go back to the only part of number theory everybody agrees is useful. Encryption. We have roughly the same goals for every encryption scheme. We want it to be easy to encode a message. We want it to be easy to decode the message if you have the key. We want it to be hard to decode the message if you don’t have the key.

    The curved-out sign has a bulge with convex loops to it, so that it resembles the cut of a jigsaw puzzle piece.

    y^2 = 3x^3 - 3x + 3 . The water drop is almost large enough that its weight overcomes the surface tension holding it to the main body of water.

    Take something inside one of these elliptic curve groups. Especially one built over a finite field. Let me call your thing ‘g’. It’s really easy for you, knowing what ‘g’ is and what your field is, to raise it to a power. You can pretty well impress me by sharing the value of ‘g’ raised to some whole number ‘m’. Call that ‘h’.

    Why am I impressed? Because if all I know is ‘g’ and ‘h’, I have a heck of a time figuring out what ‘m’ is. Especially on these finite field groups there’s no obvious connection between how big ‘h’ is and how big ‘g’ is and how big ‘m’ is. Start with a big enough finite field and you can encode messages in ways that are crazy hard to crack.

    We trust. At least, if there are any ways to break the code quickly, nobody’s shared them. And there’s one of those enormous-money-prize awards waiting for someone who does know how to break such a code quickly. (I don’t know which. I’m going by what I expect from people.)
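
    To make the lopsidedness concrete, here is a toy sketch. Everything specific in it is an assumption for illustration: the curve y^2 = x^3 + 2x + 3 , the field of integers modulo 97, the starting point (3, 6), and the function names. Real systems use fields with hundreds of bits. Since the group operation is the point “addition”, raising ‘g’ to the power ‘m’ means adding ‘g’ to itself ‘m’ times, which repeated doubling does quickly. Recovering ‘m’ from ‘g’ and ‘h’ is done here the only obvious way, by trying every possibility.

```python
P_MOD, A, B = 97, 2, 3   # assumed toy curve: y^2 = x^3 + 2x + 3 over the integers mod 97
O = None                 # the point at infinity


def on_curve(P):
    """Check that P satisfies the curve equation modulo P_MOD."""
    if P is O:
        return True
    x, y = P
    return (y * y - (x ** 3 + A * x + B)) % P_MOD == 0


def add(P, Q):
    """Chord-and-tangent addition, with division replaced by modular inverses."""
    if P is O:
        return Q
    if Q is O:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return O
    if P == Q:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    slope %= P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mul(m, P):
    """'Raise P to the power m': add P to itself m times, by repeated doubling."""
    result, addend = O, P
    while m > 0:
        if m & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        m >>= 1
    return result


def brute_force_log(g, h):
    """Smallest k >= 1 with k copies of g summing to h, found by trying every k."""
    current = g
    for k in range(1, 2 * P_MOD + 3):   # the group has fewer than 2*P_MOD + 2 points
        if current == h:
            return k
        current = add(current, g)
    return None


g = (3, 6)               # on the curve: 36 = 27 + 6 + 3
h = scalar_mul(20, g)    # the easy direction
k = brute_force_log(g, h)
print(scalar_mul(k, g) == h)   # True, but only after walking through the group
```

    With 97 elements in the field the walk is instant. The trust in the paragraph above is that nobody knows a dramatically better walk when the field is enormous.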

    And then there’s fame. These were used to prove Fermat’s Last Theorem. Suppose there are some non-boring numbers ‘a’, ‘b’, and ‘c’, so that for some prime number ‘p’ that’s five or larger, it’s true that a^p + b^p = c^p . (We can separately prove Fermat’s Last Theorem for a power that isn’t a prime number, or a power that’s 3 or 4.) Then this implies properties about the elliptic curve:

    y^2 = x(x - a^p)(x + b^p)

    This is a convenient way of writing things since it showcases the a^p and b^p. It’s equal to:

    y^2 = x^3 + \left(b^p - a^p\right)x^2 - a^p b^p x

    (I was so tempted to leave an arithmetic error in there so I could make sure someone commented.)
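
    For anyone who wants to do the commenting, an expansion like this is easy to spot-check: multiply out x(x - a^p)(x + b^p) by hand to get x^3 + (b^p - a^p)x^2 - a^p b^p x , then compare the two forms at a pile of integer values. A minimal sketch; the function names are made up here, and the values of ‘a’, ‘b’, and ‘p’ are arbitrary test numbers, not solutions to anything.

```python
def factored(x, a, b, p):
    """The curve's right-hand side in factored form."""
    return x * (x - a ** p) * (x + b ** p)


def expanded(x, a, b, p):
    """The same right-hand side multiplied out; note the minus sign on the last term."""
    return x ** 3 + (b ** p - a ** p) * x ** 2 - (a ** p) * (b ** p) * x


agree = all(
    factored(x, a, b, p) == expanded(x, a, b, p)
    for x in range(-5, 6)
    for a in range(1, 4)
    for b in range(1, 4)
    for p in (5, 7)
)
print(agree)   # True
```

    Two cubics in x that agree at more than three values of x are the same cubic, so eleven values of x per choice of a, b, and p is plenty.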

    A little ball off to the side of a curved-out less-than-sign shape.

    y^2 = 3x^3 - 4x . The water drop has broken off, and the remaining surface rebounds to its normal meniscus.

    If there’s a solution to Fermat’s Last Theorem, then this elliptic equation can’t be modular. I don’t have enough words to explain what ‘modular’ means here. Andrew Wiles and Richard Taylor showed that the equation was modular. So there is no solution to Fermat’s Last Theorem except the boring ones. (Like, where ‘b’ is zero and ‘a’ and ‘c’ equal each other.) And it all comes from looking close at these neat curves, none of which looks like an ellipse.

    They’re named elliptic curves because we first noticed them when Carl Jacobi (yes, that Carl Jacobi) was studying the length of arcs of an ellipse. That’s interesting enough on its own. But it is hard. Maybe I could have fit in that anecdote about giving my class an impossible problem after all.

     
  • Joseph Nebus 6:00 pm on Monday, 7 August, 2017 Permalink | Reply
    Tags: A-To-Z

    The Summer 2017 Mathematics A To Z: Diophantine Equations 


    I have another request from Gaurish, of the For The Love Of Mathematics blog, today. It’s another change of pace.

    Diophantine Equations

    A Diophantine equation is a polynomial. Well, of course it is. It’s an equation, or a set of equations, setting one polynomial equal to another. Possibly equal to a constant. What makes this different from “any old equation” is the coefficients. These are the constant numbers that you multiply the variables, your x and y and x^2 and z^8 and so on, by. To make a Diophantine equation all these coefficients have to be integers. You know one well, because it’s that x^n + y^n = z^n thing that Fermat’s Last Theorem is all about. And you’ve probably seen ax + by = 1 . It turns up a lot because that’s a line, and we do a lot of stuff with lines.
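
    The line ax + by = 1 is one of the rare cases with a complete recipe. It has whole-number solutions exactly when ‘a’ and ‘b’ share no common factor, and the extended Euclidean algorithm hands you one. A minimal sketch, with function names chosen for this illustration and positive ‘a’ and ‘b’ assumed:

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g, where g is the gcd of a and b."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def solve_line(a, b):
    """One integer solution (x, y) of a*x + b*y == 1, or None if there is none."""
    g, x, y = extended_gcd(a, b)
    if g != 1:
        return None   # a common factor of a and b would have to divide 1
    return (x, y)


x, y = solve_line(17, 5)
print(17 * x + 5 * y)     # 1
print(solve_line(6, 4))   # None: the left side is always even
```

    Any other solution comes from adding a multiple of ‘b’ to x while subtracting the same multiple of ‘a’ from y.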

    Diophantine equations are interesting. There are a couple of cases that are easy to solve. I mean, at least that we can find solutions for. ax + by = 1 , for example, that’s easy to solve. x^n + y^n = z^n it turns out we can’t solve. Well, we can if n is equal to 1 or 2. Or if x or y or z are zero. These are obvious, that is, they’re quite boring. That one took about four hundred years to solve, and the solution was “there aren’t any solutions”. This may convince you of how interesting these problems are. What, from looking at it, tells you that ax + by = 1 is simple while x^n + y^n = z^n is (most of the time) impossible?

    I don’t know. Nobody really does. There are many kinds of Diophantine equation, all different-looking polynomials. Some of them are special one-off cases, like x^n + y^n = z^n . For example, there’s x^4 + y^4 + z^4 = w^4 for some integers x, y, z, and w. Leonhard Euler conjectured this equation had only boring solutions. You’ll remember Euler. He wrote the foundational work for every field of mathematics. It turns out he was wrong. It has infinitely many interesting solutions. But the first one found was 2,682,440^4 + 15,365,639^4 + 18,796,760^4 = 20,615,673^4 , and even the smallest, 95,800^4 + 217,519^4 + 414,560^4 = 422,481^4 , took a computer search to find. We can forgive Euler not noticing it.
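
    Finding a solution took heavy machinery, but checking one takes a line of exact integer arithmetic. A small sketch; the function name is made up here. The first tuple is the solution quoted above, found by Noam Elkies. The second is an addition for this sketch: the smaller solution usually credited to Roger Frye’s computer search.

```python
def is_solution(x, y, z, w):
    """Check x^4 + y^4 + z^4 == w^4 using Python's exact integers."""
    return x ** 4 + y ** 4 + z ** 4 == w ** 4


elkies = (2682440, 15365639, 18796760, 20615673)
frye = (95800, 217519, 414560, 422481)

print(is_solution(*elkies))       # True
print(is_solution(*frye))         # True
print(is_solution(1, 2, 3, 4))    # False: picking numbers at random does not work
```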

    Some are groups of equations that have similar shapes. There’s the Fermat’s Last Theorem formula, for example, which is a different equation for every different integer n. Then there’s what we call Pell’s Equation. This one is x^2 - D y^2 = 1 (or equals -1), for some counting number D. It’s named for the English mathematician John Pell, who did not discover the equation (even in the Western European tradition; Indian mathematicians were familiar with it for a millennium), did not solve the equation, and did not do anything particularly noteworthy in advancing human understanding of the solution. Pell owes his fame in this regard to Leonhard Euler, who misunderstood Pell’s revising a translation of a book discussing a solution for Pell’s authoring a solution. I confess Euler isn’t looking very good on Diophantine equations.
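
    For small D you can find the first solution of x^2 - D y^2 = 1 by plain trial. This is a minimal brute-force sketch, not the clever continued-fraction method; the function name and the cutoff `max_y` are assumptions made for this illustration. It tries y = 1, 2, 3, and so on, and asks each time whether D y^2 + 1 is a perfect square.

```python
from math import isqrt


def smallest_pell_solution(D, max_y=10 ** 6):
    """Smallest positive (x, y) with x^2 - D*y^2 == 1, or None if not found in range."""
    root = isqrt(D)
    if root * root == D:
        return None   # D a perfect square: only the boring x = 1, y = 0 works
    for y in range(1, max_y + 1):
        target = D * y * y + 1
        x = isqrt(target)
        if x * x == target:
            return (x, y)
    return None


print(smallest_pell_solution(2))    # (3, 2)
print(smallest_pell_solution(7))    # (8, 3)
print(smallest_pell_solution(13))   # (649, 180)
```

    The jump from D = 7 to D = 13 hints at why trial alone gets hopeless: for some modest values of D the first solution has ten digits.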

    But nobody looks very good on Diophantine equations. Make up a Diophantine equation of your own. Use whatever whole numbers, positive or negative, that you like for your equation. Use whatever powers of however many variables you like for your equation. So you get something that looks maybe like this:

    7x^2 - 20y + 18y^2 - 38z = 9

    Does it have any solutions? I don’t know, not without sitting down and working at it. There isn’t a general all-around method. You know how with a quadratic equation we have this formula where you recite some incantation about “b squared minus four a c” and get any roots that exist? Nothing like that exists for Diophantine equations in general. Specific ones, yes. But they’re all specialties, crafted to fit the equation that has just that shape.

    So for each equation we have to ask: is there a solution? Is there any solution that isn’t obvious? Are there finitely many solutions? Are there infinitely many? Either way, can we find all the solutions? And we have to answer them anew. What answers these have? Whether answers are known to exist? Whether answers can exist? We have to discover anew for each kind of equation. Knowing answers for one kind doesn’t help us for any others, except as inspiration. If some trick worked before, maybe it will work this time.

    There are a couple usually reliable tricks. Can the equation be rewritten in some way that it becomes the equation for a line? If it can we probably have a good handle on any solutions. Can we apply modulo arithmetic to the equation? If we can, we might be able to reduce the number of possible solutions that the equation has. In particular we might be able to reduce the number of possible solutions until we can just check every case. Can we use induction? That is, can we show there’s some parameter for the equations, and that knowing the solutions for one value of that parameter implies knowing solutions for larger values? And then find some small enough value we can test it out by hand? Or can we show that if there is a solution, then there must be a smaller solution, and smaller yet, until we can either find an answer or show there aren’t any? Sometimes. Not always. The field blends seamlessly into number theory. And number theory is all sorts of problems easy to pose and hard or impossible to solve.
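
    Here is the modulo trick on an equation made up for this illustration: x^2 + y^2 = 4z + 3 . Any whole-number solution would still be a solution after everything is replaced by its remainder modulo 4. There are only finitely many remainders, so we can just check every case. The function name is also an assumption of this sketch.

```python
def solvable_mod(m):
    """Can x^2 + y^2 == 4*z + 3 hold when everything is reduced modulo m?"""
    return any(
        (x * x + y * y - 4 * z - 3) % m == 0
        for x in range(m)
        for y in range(m)
        for z in range(m)
    )


print(sorted({(x * x) % 4 for x in range(4)}))   # [0, 1]: squares only leave 0 or 1
print(solvable_mod(4))   # False, so the equation has no whole-number solutions at all
print(solvable_mod(3))   # True, which by itself tells us nothing
```

    A failure modulo some number rules solutions out. A success modulo one number promises nothing, which is why the trick is only usually reliable.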

    We name these equations after Diophantus of Alexandria, a 3rd century Greek mathematician. His writings, what we have of them, discuss how to solve equations. Not general solutions, the way we might want to solve ax^2 + bx + c = 0 , but specific ones, like 1x^2 - 5x + 6 = 0 . His books are among those whose rediscovery shaped the rebirth of mathematics. Pierre de Fermat scribbled his famous note in the too-small margins of Diophantus’s Arithmetica. (Well, a popular translation.)

    But the field predates Diophantus, at least if we look at specific problems. Of course it does. In mathematics, as in life, any search for a source ends in a vast, marshy ambiguity. The field stays vital. If we loosen ourselves to looking at inequalities — x - Dy^2 < A , let's say — then we start seeing optimization problems. What values of x and y will make this equation most nearly true? What values will come closest to satisfying this bunch of equations? The questions are about how to find the best possible fit to whatever our complicated sets of needs are. We can't always answer. We keep searching.

     