I’m sorry not to be able to offer more about his mathematical work. If someone knows of a mathematics-history podcast with a similar goal, please leave a comment. I’d love to know and to share with other people.
I continue to share things I’ve heard, rather than created. Peter Adamson’s podcast The History Of Philosophy Without Any Gaps this week had an episode about Nicholas of Cusa. There’s another episode on him scheduled for two weeks from now.
Nicholas is one of those many polymaths of the not-quite-modern era. Someone who worked in philosophy, theology, astronomy, mathematics, with a side in calendar reform. He’s noteworthy in mathematics and theology and philosophy for trying to understand the infinite and the infinitesimal. Adamson’s podcast — about a half-hour — focuses on the philosophical and theological sides of things. But the mathematics can’t help creeping in, with questions like, how can you tell the difference between a straight line and the edge of a circle with infinitely large diameter? Or between a circle and a regular polygon with infinitely many sides?
I have only a couple strips this time, and from this week. I’m not sure when I’ll return to full-time comics reading, but I do want to share strips that inspire something.
Carol Lay’s Lay Lines for the 24th of May riffs on Hilbert’s Hotel. This is a metaphor often used in pop mathematics treatments of infinity. So often, in fact, a friend snarked that he wished for any YouTube mathematics channel that didn’t do the same three math theorems. Hilbert’s Hotel was among them. I think I’ve never written a piece specifically about Hilbert’s Hotel. In part because every pop mathematics blog has one, so there are better expositions available. I have a similar restraint against a detailed exploration of the different sizes of infinity, or of the Monty Hall Problem.
Hilbert’s Hotel is named for David Hilbert, of Hilbert problems fame. It’s a thought experiment to explore weird consequences of our modern understanding of infinite sets. It presents various cases about matching elements of a set to the whole numbers, by making it about guests in hotel rooms. It then translates things we accept in set theory, like combining two infinitely large sets, into material terms. In material terms, the operations seem ridiculous. So the set of thought experiments gets labelled “paradoxes”. This is not in the logician’s sense of being things both true and false, but in the ordinary sense that we are asked to reconcile our logic with our intuition.
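For the curious, here is a minimal Python sketch of two of the usual Hotel room shuffles. The function names are my own invention, and it only peeks at the first few rooms, since no computer has space for the whole hotel.

```python
from itertools import count, islice

def make_room_for_one(room):
    """Every guest moves up one room, which frees room 1 for a newcomer."""
    return room + 1

def make_room_for_infinitely_many(room):
    """Every guest moves to double their room number, freeing every odd room."""
    return 2 * room

# Rooms 1 through 8 of a hotel whose corridor never ends.
first_rooms = list(islice(count(1), 8))

print([make_room_for_one(r) for r in first_rooms])
print([make_room_for_infinitely_many(r) for r in first_rooms])
```

Each rule sends different guests to different rooms, and that one-to-one matching is all the set theory asks for.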
So the Hotel serves a curious role. It doesn’t make a complex idea understandable, the way many demonstrations do. It instead draws attention to the weirdness in something a mathematics student might otherwise nod through. It does serve some role, or it wouldn’t be so popular now.
Anyway, Carol Lay does a great job making a story of it.
Leigh Rubin’s Rubes for the 25th of May I’ll toss in here too. It’s a riff on the art convention of a blackboard equation being meaningless. Normally, of course, the content of the equation doesn’t matter. So it gets simplified and abstracted, for the same reason one draws a brick wall as four separate patches of two or three bricks together. It sometimes happens that a cartoonist makes the equation meaningful. That’s because they’re a recovering physics major like Bill Amend of FoxTrot. Or it’s because the content of the blackboard supports the joke. Which, in this case, it does.
The BBC’s In Our Time program, and podcast, did a 50-minute chat about the longitude problem. That’s the question of how to find one’s position, east or west of some reference point. It’s an iconic story of pop science and, I’ll admit, I’d think anyone likely to read my blog already knows the rough outline of the story. But you never know what people don’t know. And even if you do know, it’s often enjoyable to hear the story told a different way.
The mathematics content of the longitude problem is real, although it’s not discussed more than in passing during the chat. The core insight Western mapmakers used is that the difference between local (sun) time and a reference point’s time tells you how far east or west you are of that reference point. So then the question becomes how you know what your reference point’s time is.
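The arithmetic behind that insight is short enough to sketch. The Earth turns 360 degrees in 24 hours, so each hour of difference is 15 degrees of longitude. This is a rough illustration only: the function name is hypothetical, times are hours on a 24-hour clock, and it ignores refinements such as the difference between apparent and mean solar time.

```python
def longitude_from_times(local_hours, reference_hours):
    """Degrees east (+) or west (-) of the reference meridian.

    Assumes both times are solar times given in hours on a 24-hour clock.
    """
    difference = local_hours - reference_hours
    # Wrap the difference into [-12, 12) so a clock that has passed
    # midnight does not produce a longitude beyond half the globe.
    difference = (difference + 12) % 24 - 12
    return 15.0 * difference

# Local noon while the reference clock reads 16:00: we are west of it.
print(longitude_from_times(12.0, 16.0))
```

So a navigator who sees local noon when the clock keeping reference time reads four in the afternoon is 60 degrees west of the reference point.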
This story, as it’s often told in pop science treatments, tends to focus on the brilliant clockmaker John Harrison, and the podcast does a fair bit of this. Harrison spent his life building a series of ever-more-precise clocks. These could keep London time on ships sailing around the world. (Or at least to the Caribbean, where the most profitable, slavery-driven, British interests were.) But he also spent decades fighting with the authorities he expected to reward him for his work. It makes for an almost classic narrative of lone genius versus the establishment.
But, and I’m glad the podcast discussion comes around to this, the reality is more ambiguous than this. (Actual history is always more ambiguous than whatever you think.) Part of the goal of the British (and other powers) was finding a practical way for any ship to find longitude. Granted Harrison could build an advanced, ingenious clock more accurate than anyone else could. Could he build the hundreds, or thousands, of those clocks that British shipping needed? Could anyone?
And the competing methods for finding longitude were based on astronomy and calculation. The moment when, say, the Moon passes in front of Jupiter is the same for everyone on Earth. (At least for the accuracy needed here.) It can, in principle, be forecast years, even decades ahead of time. So why not print up books listing astronomical events for the next five years and the formulas to turn observations into longitudes? Books are easy to print. You already train your navigators in astronomy so that they can find latitude. (This by how far above the horizon the pole star, or the sun, or another identifiable feature is.) And, incidentally, you gain a way of computing longitude that you don’t lose if your clock breaks. I appreciated having some of that perspective shown.
(The problem of longitude on land gets briefly addressed. The same principles that work at sea work on land. And land offers some secondary checks. For an unmentioned example there’s triangulation. It’s a great process, and a compelling use of trigonometry. I may do a piece about that myself sometime.)
Also a thing I somehow did not realize: British English pronounces “longitude” with a hard G sound. Huh.
I have another mathematics-themed podcast to share. It’s again from the BBC’s In Our Time, a 50-minute program in which three experts discuss a topic. Here they came back around to mathematics and physics. And along the way chemistry and mensuration. The topic here was Pierre-Simon Laplace, who’s one of those people whose name you learn well as a mathematics or physics major. He doesn’t quite reach the levels of Euler — who does? — but he’s up there.
Laplace might be best known for his work in celestial mechanics. He (independently of Immanuel Kant) developed the nebular hypothesis, that the solar system formed from the contraction of a great cloud of dust. We today accept a modified version of this. And for studying the question of whether the solar system is stable. That is, whether the perturbations every planet has on one another average out to nothing, or to something catastrophic. And studying probability, which has more to do with these questions than one might imagine. And then there’s general mechanics, and differential equations, and if that weren’t enough, his role in establishing the Metric system. This and more gets discussion.
This is not the whole of her work, though my understanding is she’d be worth noticing even if it were. Part of the greatness of the translation was putting Newton’s mathematics — which he had done as geometric demonstrations — into the calculus of the day. The experts on In Our Time’s podcast argue that she did a good bit of work advancing the state of calculus in doing this. She’d also done a good bit of work on the problem of colliding bodies.
A major controversy was, in modern terms, whether momentum and kinetic energy are different things and, if they are different, which one collisions preserve. Châtelet worked on experiments, inspired by ideas of Gottfried Wilhelm Leibniz, to show kinetic energy was its own thing and was the important part of collisions. We today understand both momentum and energy are conserved, but we have the advantage of her work and the people influenced by her work to draw on.
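To see the two quantities side by side, here is a small sketch for the idealized case of two bodies colliding head-on along a line, with no energy lost to heat or deformation (a perfectly elastic collision, which is my simplifying assumption). The function names are made up for the illustration; the formulas are the standard textbook ones.

```python
def elastic_collision(m1, v1, m2, v2):
    """Velocities after a one-dimensional, perfectly elastic collision."""
    total_mass = m1 + m2
    u1 = ((m1 - m2) * v1 + 2 * m2 * v2) / total_mass
    u2 = ((m2 - m1) * v2 + 2 * m1 * v1) / total_mass
    return u1, u2

def momentum(m, v):
    return m * v

def kinetic_energy(m, v):
    return 0.5 * m * v * v

# A 2 kg body moving at 3 m/s strikes a 1 kg body at rest.
m1, v1, m2, v2 = 2.0, 3.0, 1.0, 0.0
u1, u2 = elastic_collision(m1, v1, m2, v2)

print("momentum before, after:",
      momentum(m1, v1) + momentum(m2, v2),
      momentum(m1, u1) + momentum(m2, u2))
print("kinetic energy before, after:",
      kinetic_energy(m1, v1) + kinetic_energy(m2, v2),
      kinetic_energy(m1, u1) + kinetic_energy(m2, u2))
```

Both totals come out unchanged, yet they are different numbers with different units, which is the sense in which momentum and kinetic energy are separate things.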
She’s also renowned for a paper about the nature and propagation of fire, submitted anonymously for the Académie des Sciences’s 1737 Grand Prix. It didn’t win — Leonhard Euler’s did — but her paper and her lover Voltaire’s papers were published.
Châtelet was also surprisingly connected to the nascent mathematics and physics scene of the time. She had ongoing mathematical discussions with Pierre-Louis Maupertuis, of the principle of least action; Alexis Clairaut, who calculated the return of Halley’s Comet; Samuel König, author of a theorem relating systems of particles to their center of mass; and Bernard de Fontenelle, perpetual secretary of the Académie des Sciences.
So for those interested in the history of mathematics and physics, and of women who are able to break through social restrictions to do good work, the podcast is worth a listen.
I spent much of the time waiting for a mention of Chatelier’s principle which never came. This because Chatelier’s principle, about the tendency of a system in equilibrium to resist changes, is named for Henry Louis Le Chatelier, a late 19th/early 20th century chemist with, so far as I know, no relation to Émilie du Châtelet. I hope this spares you the confusion I felt.
My love read a thread about the < and > signs, and mnemonics people had learned to tell which was which. And my love wondered, is a mnemonic needed? The symbol is wider on the side with the larger quantity; that’s what it means, right? Why imagine an alligator that’s already swallowed the smaller and is ready to eat the larger? In my elementary school it was goldfish, not alligators. Much easier to draw them in.
All right, but just because an interpretation seems obvious doesn’t mean it is. The questions are, who introduced the < and > symbols to mathematics, and what were they thinking?
And here we get complications. The symbols first appear, meaning what they do today, in Artis Analyticae Praxis ad Aequationes Algebraicas Resolvendas (“The Analytical Art by which Algebraic Equations can be Resolved”). This is a book, by Thomas Harriot, published in 1631. Thomas Harriot was one of the great English mathematicians of the late 16th and early 17th centuries. He worked on the longitude problem, on optics, on astronomy. Harriot’s observations are our first record of sunspots. He also observed what we now call Halley’s Comet, with records used to work out its orbit. And he worked on how to solve equations, in ways that look at least recognizably close to what we do today.
There is a tradition that holds Harriot drew these symbols from the arm markings on a Native American. Harriot did sail to the New World at least once. He was on Walter Raleigh’s 1585-86 expedition to Virginia and observed the solar eclipse of April 1585. This was a rare chance to calculate the longitude of a ship at sea. So that’s possible. But there is also an argument that Harriot (or editor) drew from the example of the equals sign.
The = sign we first see in the mid-16th century, written by Robert Recorde, another of the great English mathematicians. Recorde did write, in The Whetstone of Witte (1557) that he used parallel lines of a common length because no two things could be more equal. Good mnemonic there. It seems Harriot (or editor) interpreted the common distance between the lines in the equals sign as the thing kept equal. So, on the side of the symbol with the greater number, make the distance between lines greater. On the lower-number’s side, make the distance between lines smaller. Which is another useful mnemonic for the symbol, if you need one.
It’s not an inevitable scheme. William Oughtred also had symbols for less-than and greater-than. Oughtred’s another vaguely familiar name in mathematics symbols. He gave us the × symbol for multiplication, and the abbreviations sin and cos for the trig functions. He also pioneered slide rules. Oughtred’s symbols look like a block-letter U set on its side, with the upper leg longer than the lower. The vertical stroke and the shorter horizontal stroke would be on the left, to represent the left being greater than the right. The vertical stroke and shorter horizontal stroke would be on the right, for the left being less than the right. That is, the “open” side would face the smaller of the numbers, opposite to what we do with < and >.
And that seems to be as much as can be definitely said. If I’m reading right, we don’t have Harriot’s (or editor’s) statement of what inspired these symbols. We have guesses that seem reasonable, but that might only seem reasonable because we’ve brought our own interpretations to it. I’d love to know if there’s better information available.
Nobody had particular suggestions for the letter ‘Y’ this time around. It’s a tough letter to find mathematical terms for. It doesn’t even lend itself to typography or wordplay the way ‘X’ does. So I chose to do one more biographical piece before the series concludes. There were twists along the way in writing.
Several problems beset me in writing about this significant 13th-century Chinese mathematician. One is my ignorance of the Chinese mathematical tradition. I have little to guide me in choosing what tertiary sources to trust. Another is that the tertiary sources know little about him. The Complete Dictionary of Scientific Biography gives a dire verdict. “Nothing is known about the life of Yang Hui, except that he produced mathematical writings”. MacTutor’s biography gives his lifespan as from circa 1238 to circa 1298, on what basis I do not know. He seems to have been born in what’s now Hangzhou, near Shanghai. He seems to have worked as a civil servant. This is what I would have imagined; most scholars then were. It’s the sort of job that gives one time to write mathematics. Also he seems not to have been a prominent civil servant; he’s apparently not listed in any dynastic records. After that, we need to speculate.
E F Robertson, writing the MacTutor biography, speculates that Yang Hui was a teacher. That he was writing to explain mathematics in interesting and helpful ways. I’m not qualified to judge Robertson’s conclusions. And Robertson notes that’s not inconsistent with Yang being a civil servant. Robertson’s argument is based on Yang’s surviving writings, and what they say about the demonstrated problems. There is, for example, 1274’s Cheng Chu Tong Bian Ben Mo. Robertson translates that title as Alpha and omega of variations on multiplication and division. I try to work out my unease at having something translated from Chinese as “Alpha and Omega”. That is my issue. Relevant here is that a syllabus prefaces the first chapter. It provides a schedule and series of topics, as well as a rationale for why this plan.
Was Yang Hui a discoverer of significant new mathematics? Or did he “merely” present what was already known in a useful way? This is not to dismiss him; we have the same questions about Euclid. He is held up as among the great Chinese mathematicians of the 13th century, a particularly fruitful time and place for mathematics. How much greatness to assign to original work and how much to good exposition is unanswerable with what we know now.
Consider for example the thing I’ve featured before, Yang Hui’s Triangle. It’s the arrangement of numbers known in the west as Pascal’s Triangle. Yang provides the earliest extant description of the triangle and how to form it and use it. This in the 1261 Xiangjie jiuzhang suanfa (Detailed analysis of the mathematical rules in the Nine Chapters and their reclassifications). But in it, Yang Hui says he learned the triangle from a treatise by Jia Xian, Huangdi Jiuzhang Suanjing Xicao (The Yellow Emperor’s detailed solutions to the Nine Chapters on the Mathematical Art). Jia Xian lived in the 11th century; he’s known to have written two books, both lost. Yang Hui’s commentary gives us a fair idea what Jia Xian wrote about. But we’re limited in judging what was Jia Xian’s idea and what was Yang Hui’s inference or what.
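The usual rule for forming the triangle is that every row starts and ends with 1, and each number in between is the sum of the two numbers above it. Here is a minimal sketch of that rule; the function name is my own, and it makes no claim about how Yang Hui or Jia Xian laid out their calculations.

```python
def triangle_rows(n):
    """Return the first n rows (n >= 1) of the triangle as lists of integers."""
    rows = [[1]]
    for _ in range(n - 1):
        previous = rows[-1]
        # Each interior entry is the sum of the two entries above it.
        middle = [a + b for a, b in zip(previous, previous[1:])]
        rows.append([1] + middle + [1])
    return rows

for row in triangle_rows(6):
    print(row)
```

The row reading 1, 4, 6, 4, 1 holds the coefficients you get from expanding (a + b) to the fourth power, which is the binomial use of the triangle.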
The Nine Chapters referred to is Jiuzhang suanshu. An English title is Nine Chapters on the Mathematical Art. The book is a 246-problem handbook of mathematics that dates back to antiquity. It’s impossible to say when the Nine Chapters was first written. Liu Hui, who wrote a commentary on the Nine Chapters in 263 CE, thought it predated the Qin ruler Shih Huang Ti’s 213 BCE destruction of all books. But the book, and the many commentaries on the book, served as a centerpiece for Chinese mathematics for a long while. Jia Xian’s and Yang Hui’s work was part of this tradition.
Yang Hui’s Detailed Analysis covers the Nine Chapters. It goes on for three chapters, more about geometry and fundamentals of mathematics. Even how to classify the problems. He had further works. In 1275 Yang published Practical mathematical rules for surveying and Continuation of ancient mathematical methods for elucidating strange properties of numbers. (I’m not confident in my ability to give the Chinese titles for these.) The first title particularly echoes how in the Western tradition geometry was born of practical concerns.
The breadth of topics covers, it seems to me, a decent modern (American) high school mathematics education. The triangle, and the binomial expansions it gives us, fit that. Yang writes about more efficient ways to multiply on the abacus. He writes about finding simultaneous solutions to sets of equations. And through a technique that amounts to finding the matrix of coefficients for the equations, and its determinant. He writes about finding the roots for cubic and quartic equations. The technique is commonly known in the west as Horner’s Method, a technique of calculating divided differences. We see the calculating of areas and volumes for regular shapes.
And sequences. He found the sum of the squares of natural numbers followed a rule, which in modern notation reads:

$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{1}{6} n (n + 1)(2n + 1)$
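The sum-of-squares rule, in its modern closed form n(n + 1)(2n + 1)/6, is easy to check by brute force. This is only a sanity check in Python with function names of my choosing, not a reconstruction of how Yang Hui argued for it.

```python
def sum_of_squares_direct(n):
    """Add up 1^2 + 2^2 + ... + n^2 the slow way."""
    return sum(k * k for k in range(1, n + 1))

def sum_of_squares_formula(n):
    """The closed form n(n + 1)(2n + 1)/6; the product is always divisible by 6."""
    return n * (n + 1) * (2 * n + 1) // 6

for n in (1, 2, 3, 10, 100):
    print(n, sum_of_squares_direct(n), sum_of_squares_formula(n))
```

The two columns agree for every n tried, as they must.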
And then there’s magic squares, and magic circles. He seems to have found them, as professional mathematicians today would, good ways to interest people in calculation. Not magic; he called them something like number diagrams. But he gives magic squares from three-by-three all the way to ten-by-ten. We don’t know of earlier examples of Chinese mathematicians writing about the larger magic squares. But Yang Hui doesn’t claim to be presenting new work. He also gives magic circles. The simplest is a web of seven intersecting circles, each with four numbers along the circle and one at its center. The sum of the center and the circumference numbers is 65 for all seven circles. Is this significant? No; merely fun.
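Checking that a square is magic is a pleasant little calculation of its own. Here is a sketch using the classic three-by-three arrangement of 1 through 9; I am not claiming this is the layout as printed in Yang Hui’s text, and the names in the code are hypothetical.

```python
# The traditional three-by-three square: every line sums to 15.
LO_SHU = [
    [4, 9, 2],
    [3, 5, 7],
    [8, 1, 6],
]

def is_magic(square):
    """True if every row, column, and both diagonals share one sum."""
    n = len(square)
    target = sum(square[0])
    rows_ok = all(sum(row) == target for row in square)
    columns_ok = all(sum(square[r][c] for r in range(n)) == target
                     for c in range(n))
    diagonal_ok = sum(square[i][i] for i in range(n)) == target
    antidiagonal_ok = sum(square[i][n - 1 - i] for i in range(n)) == target
    return rows_ok and columns_ok and diagonal_ok and antidiagonal_ok

print(is_magic(LO_SHU))
```

The same function works for the larger squares, up to ten-by-ten or beyond, if you have one to type in.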
Grant this breadth of work. Is he significant? I learned this year that familiar names might have been obscure until quite recently. The record is once again ambiguous. Other mathematicians wrote about Yang Hui’s work in the early 1300s. Yang Hui’s works were printed in China in 1378, says the Complete Dictionary of Scientific Biography, and reprinted in Korea in 1433. They’re listed in a 1441 catalogue of the Ming Imperial Library. Seki Takakazu, a towering figure in 17th century Japanese mathematics, copied the Korean text by hand. Yet Yang Hui’s work seems to have been lost by the 18th century. Reconstructions, from commentaries and encyclopedias, started in the 19th century. But we don’t have everything we know he wrote. We don’t even have a complete text of Detailed Analysis. This is not to say he wasn’t influential. All I could say is there seems to have been a time his influence was indirect.
I owe Mr Wu, author of the Singapore Maths Tuition blog, thanks for another topic for this A-to-Z. Statistics is a big field of mathematics, and so I won’t try to give you a course’s worth in 1500 words. But I have to start with a question. I seem to have ended at two thousand words.
Is statistics mathematics?
The answer seems obvious at first. Look at a statistics textbook. It’s full of algebra. And graphs of great sloped mounds. There’s tables full of four-digit numbers in back. The first couple chapters are about probability. They’re full of questions about rolling dice and dealing cards and guessing whether the sibling who just entered is the younger.
Thinking of the field’s history, though, and its use, tells us more. Some of the earliest work we now recognize as statistics was Arab mathematicians deciphering messages. This cryptanalysis is the observation that (in English) a three-letter word is very likely to be ‘the’, mildly likely to be ‘one’, and not likely to be ‘pyx’. A more modern forerunner is the Republic of Venice supposedly calculating that war with Milan would not be worth the winning. Or the gatherings of mortality tables, recording how many people of what age can be expected to die any year, and what from. (Mortality tables are another of Edmond Halley’s claims to fame, though it won’t displace his comet work.) Florence Nightingale’s charts explaining how more soldiers die of disease than in fighting the Crimean War. William Sealy Gosset sharing sample-testing methods developed at the Guinness brewery.
You see a difference in kind to a mathematical question like finding a square with the same area as this trapezoid. It’s not that mathematics is not practical; it’s always been. And it’s not that statistics lacks abstraction and pure mathematics content. But statistics wears practicality in a way that number theory won’t.
Practical about what? History and etymology tip us off. The early uses of things we now see as statistics are about things of interest to the State. Decoding messages. Counting the population. Following — in the study of annuities — the flow of money between peoples. With the industrial revolution, statistics sneaks into the factory. To have an economy of scale you need a reliable product. How do you know whether the product is reliable, without testing every piece? How can you test every beer brewed without drinking it all?
One great leg of statistics — it’s tempting to call it the first leg, but the history is not so neat as to make that work — is descriptive. This gives us things like mean and median and mode and standard deviation and quartiles and quintiles. These try to let us represent more data than we can really understand in a few words. We lose information in doing so. But if we are careful to remember the difference between the descriptive statistics we have and the original population? (nb, a word of the State) We might not do ourselves much harm.
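Python’s standard library happens to carry most of these descriptive tools, so a short sketch can show how much gets squeezed into a few numbers. The data here are made up purely for illustration, and `describe` is a name I invented.

```python
import statistics

def describe(data):
    """Boil a list of numbers down to a handful of descriptive statistics."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        # Sample standard deviation (divides by n - 1).
        "stdev": statistics.stdev(data),
        # Three cut points splitting the data into four groups.
        "quartiles": statistics.quantiles(data, n=4),
    }

sample = [2, 3, 3, 5, 8, 13, 21]  # invented numbers, not real measurements
summary = describe(sample)
for name, value in summary.items():
    print(name, value)
```

Seven numbers went in and about as many came out, so nothing is saved here. With seven million numbers the summary stays the same size, which is both the appeal and the loss of information.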
Another great leg is inferential statistics. This uses tools with names like z-score and the Student t distribution. And talk about things like p-values and confidence intervals. Terms like correlation and regression and such. This is about looking for causes in complex scenarios. We want to believe there is a cause to, say, a person’s lung cancer. But there is no tracking down what that is; there are too many things that could start a cancer, and too many of them will go unobserved. But we can notice that people who smoke have lung cancer more often than those who don’t. We can’t say why a person recovered from the influenza in five days. But we can say people who were vaccinated got fewer influenzas, and ones that passed quicker, than those who did not. We can get the dire warning that “correlation is not causation”, uttered by people who don’t like what the correlation suggests may be a cause.
Also by people being honest, though. In the 1980s geologists wondered if the sun might have a not-yet-noticed companion star. Its orbit would explain an apparent periodicity in meteor bombardments of the Earth. But completely random bombardments would produce apparent periodicity sometimes. It’s much the same way trees in a forest will sometimes seem to line up. Or imagine finding there is a neighborhood in your city with a high number of arrests. Is this because it has the highest rate of street crime? Or is the rate of street crime the same as any other spot and there are simply more cops here? But then why are there more cops to be found here? Perhaps they’re attracted by the neighborhood’s reputation for high crime. It is difficult to see through randomness, to untangle complex causes, and to root out biases.
The tools of statistics, as we recognize them, largely came together in the 19th and early 20th century. Adolphe Quetelet, a Flemish scientist, set out much early work, including introducing the concept of the “average man”, representative of societal matters. He studied the crime statistics of Paris for five years and noticed how regular the numbers were. The implication, to Quetelet, was that crime is a societal problem. It’s something we can control by mindfully organizing society, without infringing anyone’s autonomy. Put like that, the study of statistics seems an obvious and indisputable good, a way for governments to better serve their public.
So here is the dispute. It’s something mathematicians understate when sharing the stories of important pioneers like Francis Galton or Karl Pearson. They were eugenicists. Part of what drove their interest in studying human populations was to find out which populations were the best. And how to help them overcome their more-populous lessers.
I don’t have the space, or depth of knowledge, to fully recount the 19th century’s racial politics, popular scientific understanding, and international relations. Please accept this as a loose cartoon of the situation. Do not forget the full story is more complex and more ambiguous than I write.
One of the 19th century’s greatest scientific discoveries was evolution. That populations change in time, in size and in characteristics, even budding off new species, is breathtaking. Another of the great discoveries was entropy. This incorporated into science the nostalgic romantic notion that things used to be better. I write that figuratively, but to express the way the notion is felt.
There are implications. If the Sun itself will someday wear out, how long can the Tories last? It was easy for the aristocracy to feel that everything was quite excellent as it was now and dread the inevitable change. This is true for the aristocracy of any country, although the United Kingdom had a special position here. The United Kingdom enjoyed a privileged position among the Great Powers and the Imperial Powers through the 19th century. Note we still call it the Victorian era, when Louis Napoleon or Giuseppe Garibaldi or Otto von Bismarck are more significant European figures. (Granting Victoria had the longer presence on the world stage; “the 19th century” had a longer presence still.) But it could rarely feel secure, always aware that France or Germany or Russia was ready to displace it.
And even internally: if Darwin was right and reproductive success is all that matters in the long run, what does it say that so many poor people breed so much? How long could the world hold good things? Would the eternal famines and poverty of the “overpopulated” Irish or Indian colonial populations become all that was left? During the Crimean War, the British military found a shocking number of recruits from the cities were physically unfit for service. In the 1850s this was only an inconvenience; there were plenty of strong young farm workers to recruit. But the British population was already majority-urban, and becoming more so. What would happen by 1880? 1910?
One can follow the reasoning, even if we freeze at the racist conclusions. And we have the advantage of a century-plus hindsight. We can see how the eugenic attitude leads quickly to horrors. And also that it turns out “overpopulated” Ireland and India stopped having famines once they evicted their colonizers.
Does this origin of statistics matter? The utility of a hammer does not depend on the moral standing of its maker. The Central Limit Theorem has an even stronger pretense to objectivity. Why not build as best we can with the crooked timbers of mathematics?
It is in my lifetime that a popular racist book claimed science proved that Black people were intellectual inferiors to White people. This on the basis of supposedly significant differences in the populations’ IQ scores. It proposed that racism wasn’t a thing, or at least nothing to do anything about. It would be mere “realism”. Intelligence Quotients, incidentally, are another idea we can trace to Francis Galton. But an IQ test is not objective. The best we can say is it might be standardized. This says nothing about the biases built into the test, though, or of the people evaluating the results.
So what if some publisher 25 years ago got suckered into publishing a bad book? And racist chumps bought it because they liked its conclusion?
The past is never fully past. In the modern environment of surveillance capitalism we have abundant data on any person. We have abundant computing power. We can find many correlations. This gives people wild ideas for “artificial intelligence”. Something to make predictions. Who will lose a job soon? Who will get sick, and from what? Who will commit a crime? Who will fail their A-levels? At least, who is most likely to?
Consider, for example, the body mass index. It was developed by our friend Adolphe Quetelet, as he tried to understand the kinds of bodies in the population. It is now used to judge whether someone is overweight. Weight is treated as though it were a greater threat to health than actual illnesses are. Your diagnosis for the same condition with the same symptoms will be different — and on average worse — if your number says 25.2 rather than 24.8.
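The index itself is a one-line calculation, weight in kilograms divided by the square of height in metres, and it takes very little to move across the line. In this sketch the cutoff of 25 is the commonly used convention, which I am assuming here; the heights, weights, and function names are invented for illustration.

```python
def body_mass_index(weight_kg, height_m):
    """Weight in kilograms divided by the square of height in metres."""
    return weight_kg / height_m ** 2

def label(bmi, cutoff=25.0):
    """A hard threshold: the assumed cutoff of 25 is a convention, not a law."""
    return "overweight" if bmi >= cutoff else "not overweight"

# The same 1.75 m person, a little over a kilogram apart.
for weight in (76.0, 77.2):
    bmi = body_mass_index(weight, 1.75)
    print(round(bmi, 1), label(bmi))
```

A difference of about a kilogram, less than a body swings in a day, is enough to change the label.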
We must do better. We can hope that learning how tools were used to injure people will teach us to use them better, to reduce or to avoid harm. We must fight our tendency to latch on to simple ideas as the things we can understand in the world. We must not mistake the greater understanding we have from the statistics for complete understanding. To do this we must have empathy, and we must have humility, and we must understand what we have done badly in the past. We must catch ourselves when we repeat the patterns that brought us to past evils. We must do more than only calculate.
Jacob Siehler suggested this topic. I had to check several times that I hadn’t written an essay about the Möbius strip already. While I have talked about it some, mostly in comic strip essays, this is a chance to specialize on the shape in a way I haven’t before.
I have ridden at least 252 different roller coasters. These represent nearly every type of roller coaster made today, and most of the types that were ever made. One type, common in the 1920s and again since the 70s, is the racing coaster. This is two roller coasters, dispatched at the same time, following tracks that are as symmetric as the terrain allows. Want to win the race? Be in the train with the heavier passenger load. The difference in the time each train takes amounts to losses from friction, and the lighter train will lose a bit more of its speed.
There are three special wooden racing coasters. These are Racer at Kennywood Amusement Park (Pittsburgh), Grand National at Blackpool Pleasure Beach (Blackpool, England), and Montaña Rusa at La Feria Chapultepec Magico (Mexico City). I’ve been able to ride them all. When you get into the train going up, say, the left lift hill, you return to the station in the train that will go up the right lift hill. These racing roller coasters have only one track. The track twists around itself and becomes a Möbius strip.
This is a fun use of the Möbius strip. The shape is one of the few bits of advanced mathematics to escape into pop culture. Maybe dominates it, in a way nothing but the blackboard full of calculus equations does. In 1958 the public intellectual and game show host Clifton Fadiman published the anthology Fantasia Mathematica. It’s all essays and stories and poems with some mathematical element. I no longer remember how many of the pieces were about the Möbius strip one way or another. The collection does include A J Deutsch’s classic A Subway Named Möbius. In this story the Boston subway system achieves hyperdimensional complexity. It does not become a Möbius strip, though, in that story. It might be one in reality anyway.
The Möbius strip we name for August Ferdinand Möbius, who in 1858 was the second person known to have noticed the shape’s curious properties. The first (to notice, in 1858, and to publish, in 1862) was Johann Benedict Listing. Listing seems to have coined the term “topology” for the field that the Möbius strip would be the emblem for. He wrote one of the first texts on the field. He also seems to have coined terms like “entoptic phenomena” and “nodal points” and “geoid” and “micron”, for a millionth of a meter. It’s hard to say why we don’t talk about Listing strips instead. Mathematical fame is a strange, unpredictable creature. There is a topological invariant, the Listing Number, named for him. And he’s known to ophthalmologists for Listing’s Law, which describes how human eyes orient themselves.
The Möbius strip is an easy thing to construct. Loop a ribbon back to itself, with an odd number of half-twists before you fasten the ends together. Anyone could do it. So it seems curious that for all recorded history nobody thought to try. Not until 1858 when Listing and then Möbius hit on the same idea.
An irresistible thing, while riding these roller coasters, is to try to find the spot where you “switch”, where you go from being on the left track to the right. You can’t. The track is, well, the track is a series of metal straps bolted to a base of wood. (The base the straps are bolted to is what makes it a wooden roller coaster. The great lattice holding the tracks above ground has nothing to do with it.) But the path of the tracks is a continuous whole. To split it requires the same arbitrariness with which mapmakers pick a prime meridian. It’s obvious that the “longitude” of a cylinder or a rubber ball is arbitrary. It’s not obvious that roller coaster tracks should have the same property. Until you draw the shape in that ∞-loop figure we always see. Then you can get lost imagining a walk along the surface.
And it’s not true that nobody thought to try this shape before 1858. Julyan H E Cartwright and Diego L González wrote a paper searching for pre-Möbius strips. They find some examples. To my eye not enough examples to support their abstract’s claim of “lots of them”, but I trust they did not list every example. One example is a Roman mosaic showing Aion, the God of Time, Eternity, and the Zodiac. He holds a zodiac ring that is either a Möbius strip or cylinder with artistic errors. Cartwright and González are convinced. I’m reminded of a Looks Good On Paper comic strip that forgot to include the needed half-twist.
Islamic science gives us a more compelling example. We have a book by Ismail al-Jazari dated 1206, The Book of Knowledge of Ingenious Mechanical Devices. Some manuscripts of it illustrate a chain pump, with the chain arranged as a Möbius strip. Cartwright and González also note discussions in Scientific American, and other engineering publications in the United States, about drive and conveyor belts with the Möbius strip topology. None of those predate Listing or Möbius, or apparently credit either. And they do come quite soon after. It’s surprising something might leap from abstract mathematics to Yankee ingenuity that fast.
If it did. It’s not hard to explain why mechanical belts didn’t consider Möbius strip shapes before the late 19th century. Their advantage is that the wear of the belt distributes over twice the surface area, the “inside” and “outside”. A leather belt has a smooth and a rough side. Many other things you might make a belt from have a similar asymmetry. By the late 19th century you could make a belt of rubber. Its grip and flexibility and smoothness are uniform on all sides. “Balancing” the use suddenly could have a point.
I still find it curious almost no one drew or speculated about or played with these shapes until, practically, yesterday. The shape doesn’t seem far away from a trefoil knot. The recycling symbol, three folded-over arrows, suggests a Möbius strip. The strip evokes the ∞ symbol, although that symbol was not attached to the concept of “infinity” until John Wallis put it forth in 1655.
Even with the shape now familiar, and loved, there are curious gaps. Consider game design. If you play on a board that represents space you need to do something with the boundaries. The easiest is to make the boundaries the edges of playable space. The game designer has choices, though. If a piece moves off the board to the right, why not have it reappear on the left? (And, going off to the left, reappear on the right.) This is fine. It gives the game board, a finite rectangle, the topology of a cylinder. If this isn’t enough? Have pieces that go off the top edge reappear at the bottom, and vice-versa. Doing this, along with matching the left to the right boundaries, makes the game board a torus, a doughnut shape.
A Möbius strip is easy enough to code. Make the top and bottom impenetrable borders. And match the left to the right edges this way: a piece going off the board at the upper half of the right edge reappears at the lower half of the left edge. Going off the lower half of the right edge brings the piece to the upper half of the left edge. And so on. It isn’t hard, but I’m not aware of any game — board or computer — that uses this space. Maybe there’s a backgammon variant which does.
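The edge-matching rule above can be sketched in a few lines. This is only an illustration under assumed conventions: the function name `step`, the (row, column) coordinates with row 0 at the top, and the 4-by-8 board size are all hypothetical choices, not taken from any existing game.

```python
def step(row, col, d_row, d_col, height, width):
    """Move a piece on a rectangular board glued into a Möbius strip.

    Returns the new (row, col), or None when the move would cross the
    impenetrable top or bottom border.
    """
    new_row = row + d_row
    new_col = col + d_col
    if not 0 <= new_row < height:
        return None  # top and bottom are walls
    # Each crossing of the left/right seam wraps the column and flips
    # the row, so the upper half of one edge meets the lower half of
    # the other.
    while new_col >= width:
        new_col -= width
        new_row = height - 1 - new_row
    while new_col < 0:
        new_col += width
        new_row = height - 1 - new_row
    return new_row, new_col


# Walk right along the top row of a 4-by-8 board.
pos = (0, 0)
for _ in range(8):
    pos = step(pos[0], pos[1], 0, 1, 4, 8)
print(pos)  # one lap lands on the bottom row; a second lap returns home
```

Dropping the row flip gives the cylinder, and adding a plain top-to-bottom wrap on top of that gives the torus.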
Still, the strip defies our intuition. It has one face and one edge. To reflect a shape across the width of the strip is the same as sliding a shape along its length. Cutting the strip down the center unfurls it into a cylinder. Cutting the strip down, one-third of the way from the edge, divides it into two pieces, a skinnier Möbius strip plus a cylinder. If we could extract the edge we could tug and stretch it until it was a circle.
And it primes our intuition. Once we understand there can be shapes lacking sides we can look for more. Anyone likely to read a pop mathematics blog about the Möbius strip has heard of the Klein bottle. This is a three-dimensional surface that folds back on itself in the fourth dimension of space. The shape is a jug with no inside, or with nothing but inside. Three-dimensional renditions of this get suggested as gifts to mathematicians. This for your mathematician friend who’s already got a Möbius scarf.
Though a Möbius strip looks — at any one spot — like a plane, the four-color map theorem doesn’t hold for it. Even the five-color theorem won’t do. You need six colors to cover maps on such a strip. A checkerboard drawn on a Möbius strip can be completely covered by T-shape pentominoes or Tetris pieces. You can’t do this for a checkerboard on the plane. In the mathematics of music theory the organization of dyads — two-tone “chords” — has the structure of a Möbius strip. I do not know music theory or the history of music theory. I’m curious whether Möbius strips might have been recognized by musicians before the mathematicians caught on.
And they inspire some practical inventions. Mechanical belts are obvious, although I don’t know how often they’re used. More clever are designs for resistors that have no self-inductance. They can resist electric flow without causing magnetic interference. I can look up the patents; I can’t swear to how often these are actually used. There exist — there are made — Möbius aromatic compounds. These are organic compounds with rings of carbon and hydrogen. I do not know a use for these. That they’ve only been synthesized this century, rather than found in nature, suggests they are more neat than practical.
Perhaps this shape is most useful as a path into a particular type of topology, and for its considerable artistry. And, with its “late” discovery, a reminder that we do not yet know all that is obvious. That is enough for anything.
There are three steel roller coasters with a Möbius strip track. That is, the metal rail on which the coaster runs is itself braced directly by metal. One of these is in France, one in Italy, and one in Iran. One in Liaoning, China has been under construction for five years. I can’t say when it might open. I have yet to ride any of them.
The exact suggestion I got for L was “Leibniz, the inventor of Calculus”. I can’t in good conscience offer that. This isn’t to deny Leibniz’s critical role in calculus. We rely on many of the ideas he’d had for it. We especially use his notation. But there are few great big ideas that can be truly credited to an inventor, or even a team of inventors. Put aside the sorry and embarrassing priority dispute with Isaac Newton. Many mathematicians in the 16th and 17th centuries were working on how to improve the Archimedean “method of exhaustion”. This would find the areas inside select curves, integral calculus. Johannes Kepler worked out the areas of ellipse slices, albeit with considerable luck. Gilles Roberval tried working out the area inside a curve as the area of infinitely many narrow rectangular strips. We still learn integration from this. Pierre de Fermat recognized how tangents to a curve could find maximums and minimums of functions. This is a critical piece of differential calculus. Isaac Barrow, Evangelista Torricelli (of barometer fame), Pietro Mengoli, and Stefano degli Angeli all pushed mathematics towards calculus. James Gregory proved, in geometric form, the relationship between differentiation and integration. That relationship is the Fundamental Theorem of Calculus.
This is not to denigrate Leibniz. We don’t dismiss the Wright Brothers though we know that without them, Alberto Santos-Dumont or Glenn Curtiss or Samuel Langley would have built a workable airplane anyway. We have Leibniz’s note, dated the 29th of October, 1675 (says Florian Cajori), writing out $\int l$ to mean the sum of all l’s. By mid-November he was integrating functions, and writing out his work as $\int f(x) dx$. Any mathematics or physics or chemistry or engineering major today would recognize that. A year later he was writing things like $d(x^n) = n x^{n - 1} dx$, which we’d also understand if not quite care to put that way.
Though we use his notation and his basic tools we don’t exactly use Leibniz’s particular ideas of what calculus means. It’s been over three centuries since he published. It would be remarkable if he had gotten the concepts exactly and in the best of all possible forms. Much of Leibniz’s calculus builds on the idea of a differential. This is a quantity that’s smaller than any positive number but also larger than zero. How does that make sense? George Berkeley argued it made not a lick of sense. Mathematicians frowned, but conceded Berkeley was right. By the mid-19th century they had a rationale for differentials that avoided this weird sort of number.
It’s hard to avoid the differential’s lure. The intuitive appeal of “imagine moving this thing a tiny bit” is always there. In science or engineering applications it’s almost mandatory. Few things we encounter in the real world have the kinds of discontinuity that create logic problems for differentials. Even in pure mathematics, we will look at a differential equation like $\frac{dy}{dx} = \frac{x}{y}$ and rewrite it as $y dy = x dx$. Leibniz’s notation gives us the idea that taking derivatives is some kind of fraction. It isn’t, but in many problems we act as though it were. It works out often enough we forget that it might not.
Better, though. From the 1960s Abraham Robinson and others worked out a different idea of what real numbers are. In that, differentials have a rigorous logical definition. We call the mathematics which uses this “non-standard analysis”. The name tells something of its use. This is not to call it wrong. It’s merely not what we learn first, or necessarily at all. And it is Leibniz’s differentials. 304 years after his death there is still a lot of mathematics he could plausibly recognize.
There is still a lot of still-vital mathematics that he touched directly. Leibniz appears to be the first person to use the term “function”, for example, to describe that thing we’re plotting with a curve. He worked on systems of linear equations, and methods to find solutions if they exist. This technique is now called Gaussian elimination. We see the bundling of the equations’ coefficients he did as building a matrix and finding its determinant. We know that technique, today, as Cramer’s Rule, after Gabriel Cramer. The Japanese mathematician Seki Takakazu had discovered determinants before Leibniz, though.
Leibniz tried to study a thing he called “analysis situs”, which two centuries on would be a name for topology. My reading tells me you can get a good fight going among mathematics historians by asking whether he was a pioneer in topology. So I’ll decline to take a side in that.
In the 1680s he tried to create an algebra of thought, to turn reasoning into something like arithmetic. His goal was good: we see these ideas today as Boolean algebra, and concepts like conjunction and disjunction and negation and the empty set. Anyone studying logic knows these today. He’d also worked in something we can see as symbolic logic. Unfortunately for his reputation, the papers he wrote about that went unpublished until late in the 19th century. By then other mathematicians, like Gottlob Frege and Charles Sanders Peirce, had independently published the same ideas.
We give Leibniz’s name to a particular series that tells us the value of π:
$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \cdots$$
(The Indian mathematician Madhava of Sangamagrama knew the formula this comes from by the 14th century. I don’t know whether Western Europe had gotten the news by the 17th century. I suspect it hadn’t.)
The drawback to using this to figure out digits of π is that it takes forever to use. Taking ten decimal digits of π demands evaluating about five billion terms. That’s not hyperbole; it just takes like forever to get its work done.
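To get a feel for how slowly this goes, here is a small sketch that sums the first n terms of the alternating series of odd reciprocals, 1 - 1/3 + 1/5 - ..., and multiplies by four. The function name `leibniz_pi` is just an illustrative label. Under ordinary floating-point arithmetic the error shrinks only about as fast as 1/n, which fits with needing billions of terms for ten digits.

```python
import math


def leibniz_pi(n_terms):
    """Approximate pi with the first n_terms of the Leibniz series."""
    total = 0.0
    sign = 1.0
    for k in range(n_terms):
        total += sign / (2 * k + 1)  # 1, -1/3, 1/5, -1/7, ...
        sign = -sign
    return 4.0 * total


for n in (10, 1_000, 100_000):
    approx = leibniz_pi(n)
    # The printed error is roughly 1/n each time.
    print(n, approx, abs(math.pi - approx))
```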
Which is something of a theme in Leibniz’s biography. He had a great many projects. Some of them even reached a conclusion. Many did not, and instead sprawled out with great ambition and sometimes insight before getting lost. Consider a practical one: he believed that the use of wind-driven propellers and water pumps could drain flooded mines. (Mines are always flooding.) In principle, he was right. But they all failed. Leibniz blamed deliberate obstruction by administrators and technicians. He even blamed workers afraid that new technologies would replace their jobs. Yet even in this failure he observed and had bracing new thoughts. The geology he learned in the mines project made him hypothesize that the Earth had been molten. I do not know the history of geology well enough to say whether this was significant to that field. It may have been another frustrating moment of insight (lucky or otherwise) ahead of its time but not connected to the mainstream of thought.
Another project, tantalizing yet incomplete: the “stepped reckoner”, a mechanical arithmetic machine. The design was to do addition and subtraction, multiplication and division. It’s a breathtaking idea. It earned him election into the (British) Royal Society in 1673. But it never was quite complete, never getting carries to work fully automatically. He never did finish it, and lost friends with the Royal Society when he moved on to other projects. He had a note describing a machine that could do some algebraic operations. In the 1690s he had some designs for a machine that might, in theory, integrate differential equations. It’s a fantastic idea. At some point he also devised a cipher machine. I do not know if this is one that was ever used in its time.
His greatest and longest-lasting unfinished project was for his employer, the House of Brunswick. Three successive Brunswick rulers were content to let Leibniz work on his many side projects. The one that Ernest Augustus wanted was a history of the Guelf family, in the House of Brunswick. One that went back to the time of Charlemagne or earlier if possible. The goal was to burnish the reputation of the house, which had just become a hereditary Elector of the Holy Roman Empire. (That is, they had just gotten to a new level of fun political intriguing. But they were at the bottom of that level.) Starting from 1687 Leibniz did good diligent work. He travelled throughout central Europe to find archival materials. He studied their context and meaning and relevance. He organized it. What he did not do, by his death in 1716, was write the thing.
It is always difficult to understand another person. More so someone you know only through biography. And especially someone who lived in very different times. But I do see a particular and modern personality type here. We all know someone who will work so very hard getting prepared to do a project Right that it never gets done. You might be reading the words of one right now.
Leibniz was a compulsive Society-organizer. He promoted ones in Brandenburg and Berlin and Dresden and Vienna and Saint Petersburg. None succeeded. It’s not obvious why. Leibniz was well-connected enough; he’s known to have over six hundred correspondents. Even for a time of great letter-writing, that’s a lot.
But it does seem like something about him offended others. Failing to complete big projects, like the stepped reckoner or the History of the Guelf family, seems like some of that. Anyone who knows of calculus knows of the Newton-versus-Leibniz priority dispute. Grant that Leibniz seems not to have much fueled the quarrel. (And that modern historians agree Leibniz did not steal calculus from Newton.) Just being at the center of Drama causes people to rate you poorly.
There seems to be more, though. He was liked, for example, by the Electress Sophia of Hanover and her daughter Sophia Charlotte. These were the mother and the sister of Britain’s King George I. When George I ascended to the British throne he forbade Leibniz coming to London until at least one volume of the history was written. (The restriction seems fair, considering Leibniz was 27 years into the project by then.)
There are pieces in his biography that suggest a person a bit too clever for his own good. His first salaried position, for example, was as secretary to a Nuremberg alchemical society. He did not know alchemy. He passed himself off as deeply learned, though. I don’t blame him. Nobody would ever pass a job interview if they didn’t pretend to have expertise. Here it seems to have worked.
But consider, for example, his peace mission to Paris. Leibniz was born in the last years of the Thirty Years War. In that, the Great Powers of Europe battled each other in the German states. They destroyed Germany with a thoroughness not matched until World War II. Leibniz reasonably feared France’s King Louis XIV had designs on what was left of Germany. So his plan was to sell the French government on a plan of attacking Egypt and, from there, the Dutch East Indies. This falls short of an early-Enlightenment idea of rational world peace and a congress of nations. But anyone who plays grand strategy games recognizes the “let’s you and him fight” scheming. (The plan became irrelevant when France went to war with the Netherlands. The war did rope Brandenburg-Prussia, Cologne, Münster, and the Holy Roman Empire into the mess.)
And I have not discussed Leibniz’s work in philosophy, outside his logic. He’s respected for the theory of monads, part of the long history of trying to explain how things can have qualities. Like many he tried to find a deductive-logic argument about whether God must exist. And he proposed the notion that the world that exists is the most nearly perfect that can possibly be. Everyone has been dragging him for that ever since he said it, and they don’t look ready to stop. It’s an unfair rap, even if it makes for funny spoofs of his writing.
The optimal world may need to be badly defective in some ways. And this recognition inspires a question in me. Obviously Leibniz could come to this realization from thinking carefully about the world. But anyone working on optimization problems knows the more constraints you must satisfy, the less optimal your best-fit can be. Some things you might like may end up being lousy, because the overall maximum is more important. I have not seen anything to suggest Leibniz studied the mathematics of optimization theory. Is it possible he was working in things we now recognize as such, though? That he has notes in the things we would call Lagrange multipliers or such? I don’t know, and would like to know if anyone does.
Leibniz’s funeral was unattended by any dignitary or courtier besides his personal secretary. The Royal Society and the Berlin Academy of Sciences did not honor their member’s death. His grave was unmarked for a half-century. And yet historians of mathematics, philosophy, physics, engineering, psychology, social science, philology, and more keep finding his work, and finding it more advanced than one would expect. Leibniz’s legacy seems to be one always rising and emerging from shade, but never being quite where it should.
I have another topic today suggested by Beth, of the I Didn’t Have My Glasses On …. inspiration blog. It overlaps a bit with other essays I’ve posted this A-to-Z sequence, but that’s all right. We get a better understanding of things by considering them from several perspectives. This one will be a bit more historical.
Pop science writer Isaac Asimov told a story he was proud of about his undergraduate days. A friend’s philosophy professor held court after class. One day he declared mathematicians were mystics, believing in things they even admit are “imaginary numbers”. Young Asimov, taking offense, offered to prove the reality of the square root of minus one, if the professor gave him one-half piece of chalk. The professor snapped a piece of chalk in half and gave one piece to him. Asimov said this is one piece of chalk. The professor answered it was half the length of a piece of chalk and Asimov said that’s not what he asked for. Even if we accept “half the length” is okay, how do we know this isn’t 48 percent the length of a standard piece of chalk? If the professor was that bad on “one-half” how could he have opinions on “imaginary numbers”?
This story is another “STEM undergraduates outwitting the philosophy expert” legend. (Even if it did happen. What we know is the story Asimov spun it into, in which a plucky young science fiction fan out-argued someone whose job is forming arguments.) Richard Feynman tells a similar story, befuddling a philosophy class with the question of how we can prove a brick has an interior. It helps young mathematicians and science majors feel better about their knowledge. But Asimov’s story does get at a couple points. First, that “imaginary” is a terrible name for a class of numbers. The square root of minus one is as “real” as one-half is. Second, we’ve decided that one-half is “real” in some way. What the philosophy professor would have baffled Asimov to explain is: in what way is one-half real? Or minus one?
We’re introduced to imaginary numbers through polynomials. I mean in education. It’s usually right after getting into quadratics, looking for solutions to equations like $x^2 - 1 = 0$. That quadratic has two solutions, but it’s possible to have a quadratic with only one, such as $x^2 = 0$. Or to have a quadratic with no solutions, such as, iconically, $x^2 + 1 = 0$. We might underscore that by plotting the curve whose x- and y-coordinates make true the equation $y = x^2 + 1$. There’s no point on the curve with a y-coordinate of zero, so, there we go.
Having established that $x^2 + 1 = 0$ has no solutions, the course then asks “what if we go ahead and say there was one”? Two solutions, in fact, $i$ and $-i$. This is all right for introducing the idea that mathematics is a tool. If it doesn’t do something we need, we can alter it.
But I see trouble in teaching someone how you can’t take square roots of negative numbers and then teaching them how to take square roots of negative numbers. It’s confusing at least. It needs some explanation about what changed. We might do better introducing them in a more historical method.
Historically, imaginary numbers (in the West) come from polynomials, yes. Different polynomials. Cubics, and quartics. Mathematicians still liked finding roots of them. Mathematicians would challenge one another to solve sets of polynomials. This seems hard to believe, but many sources agree on this. I hope we’re not all copying Eric Temple Bell here. (Bell’s Men of Mathematics is an inspiring collection of biographical sketches. But it’s not careful differentiating legends from documented facts.) And there are enough nerd challenges today that I can accept people daring one another to find solutions of $x^3 - 15x - 4 = 0$.
Quadratics, equations we can write as $ax^2 + bx + c = 0$ for some real numbers a, b, and c, we’ve known about forever. Euclid solved these kinds of equations using geometric reasoning. Chinese mathematicians 2200 years ago described rules for how to find roots. The Indian mathematician Brahmagupta, by the early 7th century, described the quadratic formula to find at least one root. Both possible roots were known to Indian mathematicians a thousand years ago. We’ve reduced the formula today to
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
With that filtering into Western Europe, the search was on for similar formulas for other polynomials. This turns into several interesting threads. One is a tale of intrigue and treachery involving Gerolamo Cardano, Niccolò Tartaglia, and Ludovico Ferrari. I’ll save that for another essay because I have to cut something out, so of course I skip the dramatic thing. Another thread is the search for quadratic-like formulas for other polynomials. They exist for third-power and fourth-power polynomials. Not (generally) for the fifth- or higher-powers. That is, there are individual polynomials you can solve by formulas, like, $x^6 + 5x^3 + 4 = 0$. But stare at it and you can see where that’s “really” a quadratic pretending to be sixth-power. Finding there was no formula to find, though, led people to develop group theory. And group theory underlies much of mathematics and modern physics.
The first great breakthrough solving the general cubic, $ax^3 + bx^2 + cx + d = 0$, came near the end of the 14th century in some manuscripts out of Florence. It’s built on a transformation. Transformations are key to mathematics. The point of a transformation is to turn a problem you don’t know how to do into one you do. As I write this, MathWorld lists 543 pages as matching “transformation”. That’s about half what “polynomial” matches (1,199) and about three times “trigonometric” (184). So that can help you judge importance.
Here, the transformation to make is to write a related polynomial in terms of a new variable. You can call that new variable x’ if you like, or z. I’ll use z so as to not have too many superscript marks flying around. This will be a “depressed polynomial”. “Depressed” here means that at least one of the coefficients in the new polynomial is zero. (Here, for this problem, it means we won’t have a squared term in the new polynomial.) I suspect the term is old-fashioned.
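Let z be the new variable, related to x by the equation $x = z - \frac{b}{3a}$. And then figure out what $x^2$ and $x^3$ are. Using all that, and the knowledge that $ax^3 + bx^2 + cx + d = 0$, and a lot of arithmetic, you get to one of these three equations:
$$z^3 + pz = q$$
$$z^3 = pz + q$$
$$z^3 + q = pz$$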
where p and q are some new coefficients. They’re positive numbers, or possibly zeros. They’re both derived from a, b, c, and d. And so in the 15th Century the search was on to solve one or more of these equations.
From our perspective in the 21st century, our first question is: what three equations? How are these not all the same equation? And today, yes, we would write this as one depressed equation, most likely $z^3 + pz + q = 0$. We would allow that p or q or both might be negative numbers.
And there is part of the great mysterious historical development. These days we generally learn about negative numbers. Once we are comfortable, our teachers hope, with those we get imaginary numbers. But in the Western tradition mathematicians noticed both, and approached both, at roughly the same time. With roughly similar doubts, too. It’s easy to point to three apples; who can point to “minus three” apples? We can arrange nine apples into a neat square. How big a square can we set “minus nine” apples in?
Hesitation and uncertainty about negative numbers would continue quite a long while. At least among Western mathematicians. Indian mathematicians seem to have been more comfortable with them sooner. And merchants, who could model a negative number as a debt, seem to have gotten the idea better.
But even seemingly simple questions could be challenging. John Wallis, in the 17th century, postulated that negative numbers were larger than infinity. Leonhard Euler seems to have agreed. (The notion may seem odd. It has echoes today, though. Computers store numbers as bit patterns. The normal scheme represents negative numbers by making the first bit in a pattern 1. These bit patterns make the negative numbers look bigger than the biggest positive numbers. And thermodynamics gives us a temperature defined by the relationship of energy to entropy. That definition implies there can be negative temperatures. Those are “hotter”, or higher-energy at least, than infinitely-high positive temperatures.) In the 18th century we see temperature scales designed so that the weather won’t give negative numbers too often. Augustus De Morgan wrote in 1831 that a negative number “occurring as the solution of a problem indicates some inconsistency or absurdity”. De Morgan was not an amateur. He codified the rules for deductive logic so well we still call them De Morgan’s laws. He put induction on a logical footing. And he found negative numbers (and imaginary numbers) a sign of defective work. In 1831. 1831!
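The aside about bit patterns can be made concrete with a small sketch. It assumes the usual two's-complement scheme and an 8-bit width, both picked only for illustration, and the helper name `as_unsigned` is hypothetical. Read as plain unsigned numbers, the patterns for negative values sort above every positive one.

```python
def as_unsigned(value, bits=8):
    """Reinterpret a signed integer's two's-complement pattern as unsigned."""
    return value & ((1 << bits) - 1)


for n in (127, 1, 0, -1, -128):
    pattern = format(as_unsigned(n), "08b")
    # Negative values start with a 1 bit, so their patterns read as
    # larger numbers than the pattern for 127, the biggest positive value.
    print(n, pattern, as_unsigned(n))
```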
But back to cubic equations. Allow that we’ve gotten comfortable enough with negative numbers we only want to solve the one depressed equation of $z^3 + pz + q = 0$. How to do it? … Another transformation, then. There are a couple you can do. Modern mathematicians would likely define a new variable w, set so that $z = w - \frac{p}{3w}$. This turns the depressed equation into
$$w^3 - \frac{p^3}{27 w^3} + q = 0$$
And this, believe it or not, is a disguised quadratic. Multiply everything in it by $w^3$ and move things around a little. You get
$$\left(w^3\right)^2 + q \left(w^3\right) - \frac{p^3}{27} = 0$$
From there, quadratic formula to solve for $w^3$. Then from that, take cube roots and you get three values of z. From that, you get your three values of x.
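Here is a rough sketch of that recipe for the depressed equation z^3 + pz + q = 0, using the substitution z = w - p/(3w). The function name `solve_depressed_cubic` is hypothetical, it leans on Python's complex arithmetic, and it assumes p is not zero so that w can never be zero. It is meant to show the steps, not to be a numerically careful solver.

```python
import cmath
import math


def solve_depressed_cubic(p, q):
    """Return the three roots of z**3 + p*z + q = 0, assuming p != 0."""
    # With z = w - p/(3w), the quantity u = w**3 satisfies the
    # quadratic u**2 + q*u - p**3/27 = 0.
    disc = cmath.sqrt(q * q + 4 * p**3 / 27)
    u = (-q + disc) / 2  # either root of the quadratic will do
    w0 = u ** (1 / 3)  # principal complex cube root of u
    omega = complex(-0.5, math.sqrt(3) / 2)  # a primitive cube root of 1
    roots = []
    for k in range(3):
        w = w0 * omega**k  # the three cube roots of u
        roots.append(w - p / (3 * w))
    return roots


# Example: z**3 - 15*z - 4 = 0, that is, p = -15 and q = -4.
for z in solve_depressed_cubic(-15, -4):
    print(z)
```

Even when all three roots are real, as in this example, the intermediate values of w are complex numbers.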
You see why nobody has taught this in high school algebra since 1959. Also why I am not touching the quartic formula, the equivalent of this for polynomials of degree four.
There are other approaches. And they can work out easier for particular problems. Take, for example, $x^3 - 15x - 4 = 0$ which I introduced in the first act. It’s past the time we set it off.
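Rafael Bombelli, in the 1570s, pondered this particular equation. Notice it’s already depressed. A formula developed by Cardano addressed this, in the form $x^3 = 15x + 4$. Notice that’s the second of the three sorts of depressed polynomial, $x^3 = px + q$. Cardano’s formula says that one of the roots will be at
$$x = \sqrt[3]{\frac{q}{2} + \sqrt{\left(\frac{q}{2}\right)^2 - \left(\frac{p}{3}\right)^3}} + \sqrt[3]{\frac{q}{2} - \sqrt{\left(\frac{q}{2}\right)^2 - \left(\frac{p}{3}\right)^3}}$$
Put to this problem, we get something that looks like a compelling reason to stop:
$$x = \sqrt[3]{2 + \sqrt{-121}} + \sqrt[3]{2 - \sqrt{-121}}$$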
Bombelli did not stop with that, though. He carried on as though these expressions of the square root of -121 made sense. And, if he did that he found these terms added up. You get an x of 4.
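One way to see how the terms add up, sketched in modern notation with $i$ standing for the square root of minus one (this is not Bombelli's own notation), is to cube the guesses $2 + i$ and $2 - i$:

```latex
\begin{align*}
(2 + i)^3 &= 8 + 12i + 6i^2 + i^3 = 8 + 12i - 6 - i = 2 + 11i = 2 + \sqrt{-121} \\
(2 - i)^3 &= 8 - 12i + 6i^2 - i^3 = 8 - 12i - 6 + i = 2 - 11i = 2 - \sqrt{-121} \\
\sqrt[3]{2 + \sqrt{-121}} + \sqrt[3]{2 - \sqrt{-121}} &= (2 + i) + (2 - i) = 4
\end{align*}
```

The imaginary parts cancel, leaving only the real number 4.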
Which is true. It’s easy to check that it’s right. And here is the great surprising thing. Start from the respectable enough equation. It has nothing suspicious in it, not even negative numbers. Follow it through and you need to use negative numbers. Worse, you need to use the square roots of negative numbers. But keep going, as though you were confident in this, and you get a correct answer. And a real number.
We can get the other roots. Divide $(x - 4)$ out of $x^3 - 15x - 4$. What’s left is $x^2 + 4x + 1$. You can use the quadratic formula for this. The other two roots are $-2 + \sqrt{3}$, about -0.268, and $-2 - \sqrt{3}$, about -3.732.
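Bombelli's leap of faith is quick to check with modern tools. This is a small sketch, assuming we read the square root of -121 as 11i and guess that 2 + i and 2 - i are the cube roots we need. The helper `cube` and the variable names are mine. It then finds the other two roots from the leftover quadratic x^2 + 4x + 1.

```python
import math


def cube(z: complex) -> complex:
    """Cube by repeated multiplication, which is exact for these small integers."""
    return z * z * z


# Bombelli's observation: 2 + i and 2 - i are cube roots of 2 + 11i and 2 - 11i.
plus = complex(2, 1)
minus = complex(2, -1)
assert cube(plus) == complex(2, 11)
assert cube(minus) == complex(2, -11)

x = plus + minus                  # the imaginary parts cancel
print(x)                          # (4+0j)
print(cube(x) == 15 * x + 4)      # True

# Dividing (x - 4) out of x**3 - 15x - 4 leaves x**2 + 4x + 1.
a, b, c = 1, 4, 1
root_of_disc = math.sqrt(b * b - 4 * a * c)
other_roots = ((-b + root_of_disc) / (2 * a), (-b - root_of_disc) / (2 * a))
print(other_roots)                # roughly -0.268 and -3.732
```

The suspicious square roots show up only in the middle of the calculation. The answer at the end is an ordinary 4.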
So here we have good reasons to work with negative numbers, and with imaginary numbers. We may not trust them. But they get us to correct answers. And this brings up another little secret of mathematics. If all you care about is an answer, then it’s all right to use a dubious method to get an answer.
There is a logical rigor missing in “we got away with it, I guess”. The name “imaginary numbers” tells of the disapproval of its users. We get the name from René Descartes, who was more generally discussing complex numbers. He wrote something like “in many cases no quantity exists which corresponds to what one imagines”.
John Wallis, taking a break from negative numbers and his other projects and quarrels, thought of how to represent imaginary numbers as branches off a number line. It’s a good scheme that nobody noticed at the time. Leonhard Euler envisioned matching complex numbers with points on the plane, but didn’t work out a logical basis for this. In 1797 Caspar Wessel presented a paper that described using vectors to represent complex numbers. It’s a good approach. Unfortunately that paper too sank without a trace, undiscovered for a century.
In 1806 Jean-Robert Argand wrote an “Essay on the Geometrical Interpretation of Imaginary Quantities”. Jacques Français got a copy, and published a paper describing the basics of complex numbers. He credited the essay, but noted that there was no author on the title page and asked the author to identify himself. Argand did. We started to get some good rigor behind the concept.
In 1831 William Rowan Hamilton, of Hamiltonian fame, described complex numbers using ordered pairs. Once we can define their arithmetic using the arithmetic of real numbers we have a second solid basis. More reason to trust them. Augustin-Louis Cauchy, who proved about four billion theorems of complex analysis, published a new construction of them. This used a group theory approach, a polynomial ring we denote as $\mathbb{R}[x]/(x^2 + 1)$. I don’t have the strength to explain all that today. Matrices give us another approach. This matches complex numbers with particular two-row, two-column matrices. This turns the addition and multiplication of numbers into what Hamilton described.
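The ordered-pair idea is compact enough to write out. Here is a minimal sketch, assuming the usual rules (a, b) + (c, d) = (a + c, b + d) and (a, b)(c, d) = (ac - bd, ad + bc). The names `Pair`, `add`, and `multiply` are mine, and this is the standard textbook presentation rather than a transcription of Hamilton's paper. Nothing here uses anything but real-number arithmetic.

```python
from typing import Tuple

Pair = Tuple[float, float]   # (real part, imaginary part)


def add(u: Pair, v: Pair) -> Pair:
    """(a, b) + (c, d) = (a + c, b + d)."""
    return (u[0] + v[0], u[1] + v[1])


def multiply(u: Pair, v: Pair) -> Pair:
    """(a, b) * (c, d) = (a*c - b*d, a*d + b*c)."""
    a, b = u
    c, d = v
    return (a * c - b * d, a * d + b * c)


i = (0.0, 1.0)
print(multiply(i, i))                    # (-1.0, 0.0): the pair standing in for -1
print(multiply((3.0, 2.0), (1.0, -4.0)))  # (11.0, -10.0)
```

The pair (0, 1) squares to (-1, 0), and no square root of a negative number ever had to be written down.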
And here we have some idea why mathematicians use negative numbers, and trust imaginary numbers. We are pushed toward them by convenience. Negative numbers let us work with one equation, $x^3 + px + q = 0$, rather than three. (Or more than three equations, if we have to work with an x we know to be negative.) Imaginary numbers we can start with, and find answers we know to be true. And this encourages us to find reasons to trust the results. Having one line of reasoning is good. Having several lines (Argand’s geometric, Hamilton’s coordinates, Cauchy’s rings) is reassuring. We may not be able to point to an imaginary number of anything. But if we can trust our arithmetic on real numbers we can trust our arithmetic on imaginary numbers.
As I mentioned, Descartes gave the name “imaginary number” to all of what we would now call “complex numbers”. Gauss published a geometric interpretation of complex numbers in 1831. And gave us the term “complex number”. Along the way he complained about the terminology, though. He noted “had +1, -1, and $\sqrt{-1}$, instead of being called positive, negative, and imaginary (or worse still, impossible) unity, been given the names say, of direct, inverse, and lateral unity, there would hardly have been any scope for such obscurity”. I’ve never heard the term “impossible numbers”, except as an adjective.
The name of a thing doesn’t affect what it is. It can affect how we think about it, though. We can ask whether Asimov’s professor would dismiss “lateral numbers” as mysticism. Or at least as more mystical than “three” is. We can, in context, understand why Descartes thought of these as “imaginary numbers”. He saw them as something to use for the length of a calculation, and that would disappear once its use was done. We still have such concepts, things like “dummy variables” in a calculus problem. We can’t think of a use for dummy variables except to let a calculation proceed. But perhaps we’ll see things differently in four hundred years. Shall have to come back and check.
Beth, author of the popular inspiration blog I Didn’t Have My Glasses On …. proposed this topic. Hilbert’s problems are a famous set of questions. I couldn’t hope to summarize them all in an essay of reasonable length. I’d have trouble doing them justice in a short book. But there are still things to say about them.
It’s easy to describe what Hilbert’s Problems are. David Hilbert, at the 1900 International Congress of Mathematicians, listed ten important problems of the field. In print he expanded this to 23 problems. They covered topics like number theory, group theory, physics, geometry, differential equations, and more. One of the problems was solved that year. Eight of them have been resolved fully. Another nine have been partially answered. Four remain unanswered. Two have generally been regarded as too vague to resolve.
Everyone in mathematics agrees they were big, important questions. Things that represented the things mathematicians of 1900 would most want to know. Things that guided mathematical research for, so far, 120 years.
There is reason to say that Hilbert’s judgement was good. He listed, for example, the Riemann hypothesis. The hypothesis is still unanswered. Many interesting results would follow from it being proved true, or proved false, or proved unanswerable. Hilbert did not list Fermat’s Last Theorem, unresolved then. Any mathematician would have liked an answer. But nothing of consequence depends on it. But then he also listed making advances in the calculus of variations. A good goal, but not one that requires particular insight to want.
So here is a related problem. Why hasn’t anyone else made such a list? A concise summary of the problems that guides mathematical research?
It’s not because no one tried. At the 1912 International Congress of Mathematicians, Edmund Landau identified four problems in number theory worth solving. None of them have been solved yet. Yutaka Taniyama listed three dozen problems in 1955. William Thurston put forth 24 questions in 1982. Stephen Smale, famous for work in chaos theory, gathered a list of 18 questions in 1998. Barry Simon offered fifteen of them in 2000. Also in 2000 the Clay Mathematics Institute put up seven problems, with a million-dollar bounty on each. Jair Minoro Abe and Shotaro Tanaka gathered 22 questions for a list for 2001. The United States Defense Advanced Research Projects Agency put out a list of 23 of them in 2007.
Apart from Smale’s and the Clay Mathematics lists I never heard of any of them either. Why not? What was special about Hilbert’s list?
For one, he was David Hilbert. Hilbert was a great mathematician, held in high esteem then and now. Besides his list of problems he’s known for the axiomatization of geometry. This built not just logical rigor but a new, formalist, perspective. Also, he’s known for the formalist approach to mathematics. In this, for example, we give up the annoyingly hard task of saying exactly what we mean by a point and a line and a plane. We instead talk about how points and lines and planes relate to each other, definitions we can give. He’s also known for general relativity: Hilbert and Albert Einstein developed its field equations at the same time. We have Hilbert spaces and Hilbert curves and Hilbert metrics and Hilbert polynomials. Fans of pop mathematics speak of the Hilbert Hotel, a structure with infinitely many rooms and used to explore infinitely large sets.
So he was a great mind, well-versed in many fields. And he was in an enviable position, professor of mathematics at the University of Göttingen. At the time, German mathematics was held in particularly high renown. When you see, for example, mathematicians using ‘Z’ as shorthand for ‘integers’? You are seeing a thing that makes sense in German. (It’s for “Zahlen”, meaning “numbers”.) Göttingen was at the top of German mathematics, and would be until the Nazi purges of academia. It would be hard to find a more renowned position.
And he was speaking at a great moment. The transition from one century to another is a good one for ambitious projects and declarations to be remembered. But the International Congress of Mathematicians was of particular importance. This was only the second meeting of the International Congress of Mathematicians. International Congresses of anything were new in the late 19th century. Many fields — not only mathematics — were asserting their professionalism at the time. It’s when we start to see professional organizations for specific subjects, not just “Science”. It’s when (American) colleges begin offering elective majors for their undergraduates. When they begin offering PhD degrees.
So it was a time when mathematics, like many fields (and nations), hoped to define its institutional prestige. Having an ambitious goal is one way to define that.
It was also an era when mathematicians were thinking seriously about what the field was about. The results were mixed. In the last decades of the 19th century, mathematicians had put differential calculus on a sound logical footing. But then found strange things in, for example, mathematical physics. Boltzmann’s H-theorem (1872) tells us that entropy in a system of particles always increases. Poincaré’s recurrence theorem (1890) tells us a system of particles has to, eventually, return to its original condition. (Or to something close enough.) And therefore it returns to its original entropy, undoing any increase. Both are sound theorems; how can they not conflict?
Even ancient mathematics had new uncertainty. In 1882 Moritz Pasch discovered that Euclid, and everyone doing plane geometry since then, had been using an axiom no one had acknowledged. (If a line that doesn’t pass through any vertex of a triangle intersects one leg of the triangle, then it also meets one other leg of the triangle.) It’s a small and obvious thing. But if everyone had missed it for thousands of years, what else might be overlooked?
I wish now to share my interpretation of this background. And with it my speculations about why we care about Hilbert’s Problems and not about Thurston’s. And I wish to emphasize that, whatever my pretensions, I am not a professional historian of mathematics. I am an amateur and my training consists of “have read some books about a subject of interest”.
By 1900 mathematicians wanted the prestige and credibility and status of professional organizations. Who would not? But they were also aware the foundation of mathematics was not as rigorous as they had thought. It was not yet the “crisis of foundations” that would drive the philosophy of mathematics in the early 20th century. But the prelude to the crisis was there. And here was a universally respected figure, from the most prestigious mathematical institution. He spoke to all the best mathematicians in a way they could never have been addressed before. And presented a compelling list of tasks to do. These were good tasks, challenging tasks. Many of these tasks seemed doable. One was even done almost right away.
And they covered a broad spectrum of mathematics of the time. Everyone saw at least one problem relevant to their field, or to something close to their field. Landau’s problems, posed twelve years later, were all about number theory. Not even all number theory; about prime numbers. That’s nice, but it will only briefly stir the ambitions of the geometer or the mathematical physicist or the logician.
By the time of Taniyama, though? 1955? Times are changed. Taniyama is no inconsiderable figure. The Taniyama-Shimura theorem is a major piece of elliptic functions. It’s how we have a proof of Fermat’s last theorem. But by then, too, mathematics is not so insecure. We have several good ideas of what mathematics is and why it should work. It has prestige and institutional authority. It has enough Congresses and Associations and Meetings that no one can attend them all. It’s moreso by 1982, when William Thurston set up questions. I know that I’m aware of Stephen Smale’s list because I was a teenager during the great fractals boom of the 80s and knew Smale’s name. Also that he published his list near the time I finished my quals. Quals are an important step in pursuing a doctorate. After them you look for a specific thesis problem. I was primed to hear about great ambitious projects I could not possibly complete.
Only the Clay Mathematics Institute’s list has stood out, aided by its catchy name of Millennium Prizes and its offer of quite a lot of money. That’s a good memory aid. Any lay reader can understand that motivation. Two of the Millennium Prize problems were also Hilbert’s problems. One in whole (the Riemann hypothesis again). One in part (one about solutions to elliptic curves). And as the name states, it came out in 2000. It was a year when many organizations were trying to declare bold and fresh new starts for a century they hoped would be happier than the one before. This, too, helps the memory. Who has any strong associations with 1982 who wasn’t born or got their driver’s license that year?
These are my suppositions, though. I could be giving a too-complicated answer. It’s easy to remember that United States President John F Kennedy challenged the nation to land a man on the moon by the end of the decade. Space enthusiasts, wanting something they respect to happen in space, sometimes long for a president to make a similar strong declaration of an ambitious goal and specific deadline. President Ronald Reagan in 1984 declared there would be a United States space station by 1992. In 1986 he declared there would be by 2000 a National Aerospace Plane, capable of flying from Washington to Tokyo in two hours. President George H W Bush in 1989 declared there would be humans on the Moon “to stay” by 2010 and to Mars thereafter. President George W Bush in 2004 declared the Vision for Space Exploration, bringing humans to the moon again by 2020 and to Mars thereafter.
No one has cared about any of these plans. Possibly because the first time a thing is done, it has a power no repetition can claim. But also perhaps because the first attempt succeeded. Which was not due only to its being first, of course, but to the factors that made its goal important to a great number of people for long enough that it succeeded.
Which brings us back to the Euthyphro-like dilemma of Hilbert’s Problems. Are they influential because Hilbert chose well, or did Hilbert’s choosing them make them influential? I suspect this is a problem that cannot be resolved.
Dina Yagodich suggested today’s A-to-Z topic. I thought a quick little biography piece would be a nice change of pace. I discovered things were more interesting than that.
I realized preparing for this that I have never read a biography of Fibonacci. This is hardly unique to Fibonacci. Mathematicians buy into the legend that mathematics is independent of human creation. So the people who describe it are of lower importance. They learn a handful of romantic tales or good stories. In this way they are much like humans. I know at least a loose sketch of many mathematicians. But Fibonacci is a hard one for biography. Here, I draw heavily on the book Fibonacci, his numbers and his rabbits, by Andriy Drozdyuk and Denys Drozdyuk.
We know, for example, that Fibonacci lived until at least 1240. This because in 1240 Pisa awarded him an annual salary in recognition of his public service. We think he was born around 1170, and died … sometime after 1240. This seems like a dismal historical record. But, for the time, for a person of slight political or military importance? That’s about as good as we could hope for. It is hard to appreciate how much documentation we have of lives now, and how recent a phenomenon that is.
Even a fact like “he was alive in the year 1240” evaporates under study. Italian cities, then as now, based the year on the time since the notional birth of Christ. Pisa, as was common, used the notional conception of Christ, on the 25th of March, as the new year. But we have a problem of standards. Should we count the year as the number of full years since the notional conception of Christ? Or as the number of full and partial years since that important 25th of March?
If the question seems confusing and perhaps angering let me try to clarify. Would you say that the notional birth of Christ that first 25th of December of the Christian Era happened in the year zero or in the year one? (Pretend there was a year zero. You already pretend there was a year one AD.) Pisa of Leonardo’s time would have said the year one. Florence would have said the year zero, if they knew of “zero”. Florence matters because when Florence took over Pisa, they changed Pisa’s dating system. Sometime later Pisa changed back. And back again. Historians writing, aware of the Pisan 1240 on the document, may have corrected it to the Florence-style 1241. Or, aware of the change of the calendar and not aware that their source already accounted for it, redated it 1242. Or tried to re-correct it back and made things worse.
This is not a problem unique to Leonardo. Different parts of Europe, at the time, had different notions for the year count. Some also had different notions for what New Year’s Day would be. There were many challenges to long-distance travel and commerce in the time. Not the least is that the same sun might shine on at least three different years at once.
We call him Fibonacci. Did he? The question defies a quick answer. His given name was Leonardo, and he came from Pisa, so a reliable way to address him would have been “Leonardo of Pisa”, albeit in Italian. He was born into the Bonacci family. He did in some manuscripts describe himself as “Leonardo filio Bonacci Pisano”, give or take a few letters. My understanding is you can get a good fun quarrel going among scholars of this era by asking whether “Filio Bonacci” would mean “the son of Bonacci” or “of the family Bonacci”. Either is as good for us. It’s tempting to imagine the “Filio” being shrunk to “Fi” and the two words smashed together. But that doesn’t quite say that Leonardo did that smashing together.
Nor, exactly, when it did happen. We see “Fibonacci” used in mathematical works in the 19th century, followed shortly by attempts to explain what it means. We know of a 1506 manuscript identifying Leonardo as Fibonacci. But there remains a lot of unexplored territory.
If one knows one thing about Fibonacci though, one knows about the rabbits. They give birth to more rabbits and to the Fibonacci Sequence. More on that to come. If one knows two things about Fibonacci, the other is about his introducing Arabic numerals to western mathematics. I’ve written of this before. And the subject is … more ambiguous, again.
Most of what we “know” of Fibonacci’s life is some words he wrote to explain why he was writing his bigger works. If we trust he was not creating a pleasant story for the sake of engaging readers, then we can finally say something. (If one knows three things about Fibonacci, and then five things, and then eight, one is making a joke.)
Fibonacci’s father was, in the 1190s, posted to Bejaia, a port city on the Algerian coast. The father did something for Pisa’s duana there. And what is a duana? … Again, certainty evaporates. We have settled on saying it’s a customs house, and suppose our readers know what goes on in a customs house. The duana had something to do with clearing trade through the port. His father’s post was as a scribe. He was likely responsible for collecting duties and registering accounts and keeping books and all that. We don’t know how long Fibonacci spent there. “Some days”, during which he alleges he learned the digits 1 through 9. And after that, travelling around the Mediterranean, he saw why this system was good, and useful. He wrote books to explain it all and convince Europe that while Roman numerals were great, Arabic numerals were more practical.
It is always dangerous to write about “the first” person to do anything. Except for Yuri Gagarin, Alexei Leonov, and Neil Armstrong, “the first” to do anything dissolves into ambiguity. Gerbert, who would become Pope Sylvester II, described Arabic numerals (other than zero) by the end of the 10th century. He added in how this system along with the abacus made computation easier. Arabic numerals appear in the Codex Conciliorum Albeldensis seu Vigilanus, written in 976 AD in Spain. And it is not as though Fibonacci was the first European to travel to a land with Arabic numerals, or the first perceptive enough to see their value.
Allow that, though. Every invention has precursors, some so close that it takes great thinking to come up with a reason to ignore them. There must be some credit given to the person who gathers an idea into a coherent, well-explained whole. And with Fibonacci, and his famous manuscript of 1202, the Liber Abaci, we have … more frustration.
It’s not that Liber Abaci does not exist, or that it does not have what we credit it for having. We don’t have any copies of the 1202 edition, but we do have a 1228 manuscript, at least, and that starts out with the Arabic numeral system. And why this system is so good, and how to use it. It should convince anyone who reads it.
If anyone read it. We know of about fifteen manuscripts of Liber Abaci, only two of them reasonably complete. This seems sparse even for manuscripts in the days they had to be hand-copied. This until you learn that Baldassarre Boncompagni published the first known printed version in 1857. In print, in Italian, it took up 459 pages of text. Its first English translation, published by Laurence E Sigler in 2002(!) takes up 636 pages (!!). Suddenly it’s amazing that as many as two complete manuscripts survive. (Wikipedia claims three complete versions from the 13th and 14th centuries exist. And says there are about nineteen partial manuscripts with another nine incomplete copies. I do not explain this discrepancy.)
So perhaps only a handful of people read Fibonacci. Ah, but if they were the right people? He could have been a mathematical Velvet Underground, read by a hundred people, each of whom founded a new mathematics.
This is not to say Fibonacci copied any of these (and more) Indian mathematicians. The world is large and manuscripts are hard to read. The sequence can be re-invented by anyone bored in the right way. Ah, but think of those who learned of the sequence and used it later on, following Fibonacci’s lead. For example, in 1611 Johannes Kepler wrote a piece that described Fibonacci’s sequence. But that does not name Fibonacci. He mentions other mathematicians, ancient and contemporary. The easiest supposition is he did not know he was writing something already seen. In 1844, Gabriel Lamé used Fibonacci numbers in studying algorithm complexity. He did not name Fibonacci either, though. (Lamé is famous today for making some progress on Fermat’s last theorem. He’s renowned for work in differential equations and on ellipse-like curves. If you have thought what a neat weird shape an equation like $x^4 + y^4 = 1$ can describe you have tread in Lamé’s path.)
Things picked up for Fibonacci’s reputation in 1876, thanks to Édouard Lucas. (Lucas is notable for other things. Normal people might find interesting that he proved by hand the number $2^{127} - 1$ was prime. This seems to be the largest prime number ever proven by hand. He also created the Tower of Hanoi problem.) In January of 1876, Lucas wrote about the Fibonacci sequence, describing it as “the series of Lamé”. By May, though, in writing about prime numbers, he has read Boncompagni’s publications. He says how this thing “commonly known as the sequence of Lamé was first presented by Fibonacci”.
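The number Lucas proved prime by hand was 2^127 - 1, and a computer now confirms it in a blink. A short sketch, assuming the modern Lucas-Lehmer test for numbers of the form 2^p - 1 with p an odd prime. This is the later refinement that carries Lucas's name. I make no claim that it reproduces the steps of his hand calculation, and the function name `lucas_lehmer` is mine.

```python
def lucas_lehmer(p: int) -> bool:
    """True if 2**p - 1 is prime, for an odd prime exponent p."""
    mersenne = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        # Square, subtract two, and keep only the remainder each time.
        s = (s * s - 2) % mersenne
    return s == 0


print(lucas_lehmer(127))   # True: 2**127 - 1 is prime
print(lucas_lehmer(11))    # False: 2**11 - 1 = 2047 = 23 * 89
```

The loop runs only 125 times for p = 127. Each pass, though, squares a number of up to 39 digits, which gives some idea of what the hand calculation involved.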
And Fibonacci caught Lucas’s imagination. Lucas shared, particularly, the phrasing of this sequence as something in the reproduction of rabbits. This captured mathematicians’, and then people’s imaginations. It’s akin to Émile Borel’s room of a million typing monkeys. By the end of the 19th century Leonardo of Pisa had both a name and fame.
We can still ask why. The proximate cause is Édouard Lucas, impressed (I trust) by Boncompagni’s editions of Fibonacci’s work. Why did Baldassarre Boncompagni think it important to publish editions of Fibonacci? Well, he was interested in the history of science. He edited the first Italian journal dedicated to the history of mathematics. He may have understood that Fibonacci was, if not an important mathematician, at least one who had interesting things to write. Boncompagni’s edition of Liber Abaci came out in 1857. By 1859 the state of Tuscany voted to erect a statue.
So I speculate, without confirming, that at least some of Fibonacci’s good name in the 19th century was a reflection of Italian unification. The search for great scholars whose intellectual achievements could reflect well on a nation trying to build itself.
And so we have bundles of ironies. Fibonacci did write impressive works of great mathematical insight. And he was recognized at the time for that work. The things he wrote about Arabic numerals were correct. His recommendation to use them was taken, but by people who did not read his advice. After centuries of obscurity he got some notice. And a problem he did not create nor particularly advance brought him a fame that’s lasted a century and a half now, and looks likely to continue.
I am always amazed to learn there are people not interested in history.
A throwaway joke somewhere in The Hitchhiker’s Guide To The Galaxy has Marvin The Paranoid Android grumble that he’s invented a square root for minus one. Marvin’s gone and rejiggered all of mathematics while waiting for something better to do. Nobody cares. It reminds us that while Douglas Adams established much of a particular generation of nerd humor, he was not himself a nerd. The nerds who read The Hitchhiker’s Guide To The Galaxy obsessively know we already did that, centuries ago. Marvin’s creation was as novel as inventing “one-half”. (It may be that Adams knew, and intended Marvin working so hard on the already known as the joke.)
Anyone who’d read a pop mathematics blog like this likely knows the rough story of complex numbers in Western mathematics. The desire to find roots of polynomials. The discovery of formulas to find roots. Polynomials with numbers whose formulas demanded the square roots of negative numbers. And the discovery that sometimes, if you carried on as if the square root of a negative number made sense, the ugly terms vanished. And you got correct answers in the end. And, eventually, mathematicians relented. These things were unsettling enough to get unflattering names. To call a number “imaginary” may be more pejorative than even “negative”. It hints at the treatment of these numbers as falsework, never to be shown in the end. To call the sum of a “real” number and an “imaginary” “complex” is to warn. An expert might use these numbers only with care and deliberation. But we can count them as numbers.
I mentioned when writing about quaternions how when I learned of complex numbers I wanted to do the same trick again. My suspicion is many mathematicians do. The example of complex numbers teases us with the possibilities of other numbers. If we’ve defined $\imath$ to be “a number that, squared, equals minus one”, what next? Could we define a $\sqrt{\imath}$? How about a $\log(\imath)$? Maybe something else? An arc-cosine of $\imath$?
You can try any of these. They turn out to be redundant. The real numbers and $\imath$ already let you describe any of those new numbers. You might have a flash of imagination: what if there were another number that, squared, equalled minus one, and that wasn’t equal to $\imath$? Numbers that look like $a + b\imath + c\jmath$? Here, and later on, a and b and c are some real numbers. $b\imath$ means “multiply the real number b by whatever $\imath$ is”, and we trust that this makes sense. There’s a similar setup for c and $\jmath$. And if you just try that, with $a + b\imath + c\jmath$, you get some interesting new mathematics. Then you get stuck on what the product of these two different square roots should be.
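The claim that these candidates are redundant is easy to poke at. Below is a small sketch using Python's standard `cmath` module, trying a square root, a logarithm, and an arc-cosine of the imaginary unit as examples. Each function returns one principal value out of several possible ones. Each answer is an ordinary complex number, built from reals and the imaginary unit alone.

```python
import cmath

i = 1j
candidates = {
    "square root of i": cmath.sqrt(i),
    "logarithm of i": cmath.log(i),
    "arc-cosine of i": cmath.acos(i),
}

for name, value in candidates.items():
    # Every result has just a real part and an imaginary part: nothing new needed.
    print(f"{name}: {value.real:.4f} + ({value.imag:.4f})i")
```

No new kind of number turns up. A second square root of minus one, independent of the first, is a different matter, as the paragraph above says.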
If you think of that. If all you think of is addition and subtraction and maybe multiplication by a real number? $a + b\imath + c\jmath$ works fine. You only spot trouble if you happen to do multiplication. Granted, multiplication is to us not an exotic operation. Take that as a warning, though, of how trouble could develop. How do we know, say, that complex numbers are fine as long as you don’t try to take the log of the haversine of one of them, or some other obscurity? And that then they produce gibberish? Or worse, produce that most dread construct, a contradiction?
Here I am indebted to an essay that ten minutes ago I would have sworn was in one of the two books I still have out from the university library. I’m embarrassed to learn my error. It was about the philosophy of complex numbers and it gave me fresh perspectives. When the university library reopens for lending I will try to track back through my borrowing and find the original. I suspect, without confirming, that it may have been in Reuben Hersh’s What Is Mathematics, Really?.
The insight is that we can think of complex numbers in several ways. One fruitful way is to match complex numbers with points in a two-dimensional space. It’s common enough to pair, for example, the number $a + b\imath$ with the point at Cartesian coordinates $(a, b)$. Mathematicians do this so often it can take a moment to remember that is just a convention. And there is a common matching between points in a Cartesian coordinate system and vectors. Chaining together matches like this can worry. Trust that we believe our matches are sound. Then we notice that adding two complex numbers does the same work as adding ordered coordinate pairs. If we trust that adding coordinate pairs makes sense, then we need to accept that adding complex numbers makes sense. Adding coordinate pairs is the same work as adding real numbers. It’s just a lot of them. So we’re led to trust that if addition for real numbers works then addition for complex numbers does.
Multiplication looks like a mess. A different perspective helps us. A different way to look at where points are on the plane is to use polar coordinates. That is, the distance a point is from the origin, and the angle between the positive x-axis and the line segment connecting the origin to the point. In this format, multiplying two complex numbers is easy. Let the first complex number have polar coordinates $(r_1, \theta_1)$. Let the second have polar coordinates $(r_2, \theta_2)$. Their product, by the rules of complex numbers, is a point with polar coordinates $(r_1 r_2, \theta_1 + \theta_2)$. These polar coordinates are real numbers again. If we trust addition and multiplication of real numbers, we can trust this for complex numbers.
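The polar rule is worth seeing against ordinary complex multiplication. Here is a minimal sketch using `cmath.polar` and `cmath.rect` from Python's standard library. The function name `multiply_polar` and the sample numbers are my own. It multiplies the distances and adds the angles, then converts back.

```python
import cmath


def multiply_polar(z1: complex, z2: complex) -> complex:
    """Multiply by multiplying distances from the origin and adding angles."""
    r1, theta1 = cmath.polar(z1)
    r2, theta2 = cmath.polar(z2)
    return cmath.rect(r1 * r2, theta1 + theta2)


z1, z2 = complex(3, 2), complex(1, -4)
print(z1 * z2)                  # (11-10j)
print(multiply_polar(z1, z2))   # the same point, up to rounding error
```

The only operations inside `multiply_polar` are a real multiplication and a real addition, plus the conversions between coordinate systems.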
If we’re confident in adding complex numbers, and confident in multiplying them, then … we’re in quite good shape. If we can add and multiply, we can do polynomials. And everything is polynomials.
We might feel suspicious yet. Going from complex numbers to points in space is calling on our geometric intuitions. That might be fooling ourselves. Can we find a different rationalization? The same result by several different lines of reasoning makes the result more believable. Is there a rationalization for complex numbers that never touches geometry?
One such rationalization matches each complex number $a + b\imath$ with a two-row, two-column matrix of real numbers, $\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$. Adding matrices is compelling. It’s the same work as adding ordered pairs of numbers. Multiplying matrices is tedious, though it’s not so bad for matrices this small. And it’s all done with real-number multiplication and addition. If we trust that the real numbers work, we can trust complex numbers do. If we can show that our new structure can be understood as a configuration of the old, we convince ourselves the new structure is meaningful.
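The matrix picture can be tested the same way. This is a sketch under one assumption: that the complex number a + bi is matched with the matrix whose rows are (a, -b) and (b, a). That is a common convention, though the transposed one works as well. The helper names `to_matrix`, `to_complex`, `mat_add`, and `mat_mul` are mine, and the matrices are plain nested tuples so nothing outside the standard language is needed.

```python
def to_matrix(z: complex):
    """Match a + bi with the 2x2 real matrix ((a, -b), (b, a))."""
    a, b = z.real, z.imag
    return ((a, -b), (b, a))


def to_complex(m) -> complex:
    """Read the complex number back off the first column of the matrix."""
    return complex(m[0][0], m[1][0])


def mat_add(m, n):
    """Add two 2x2 matrices entry by entry."""
    return tuple(tuple(m[r][c] + n[r][c] for c in range(2)) for r in range(2))


def mat_mul(m, n):
    """Multiply two 2x2 matrices with real multiplication and addition only."""
    return tuple(
        tuple(sum(m[r][k] * n[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


z1, z2 = complex(3, 2), complex(1, -4)
print(to_complex(mat_add(to_matrix(z1), to_matrix(z2))), z1 + z2)
print(to_complex(mat_mul(to_matrix(z1), to_matrix(z2))), z1 * z2)
```

Each line prints the matrix answer next to Python's built-in complex answer, and they agree. The matrix standing in for the imaginary unit, multiplied by itself, gives minus the identity matrix.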
The process by which we learn to trust them as numbers guides us to learning how to trust any new mathematical structure. So here is a new thing that complex numbers can teach us, years after we have learned how to divide them. Do not attempt to divide complex numbers. That’s too much work.
The program is three people, plus host Melvyn Bragg, talking about the life and work of Gauss. Gauss is one of those figures hard to exaggerate. He was extremely prolific and insightful. It is an exaggeration to say that he did foundational work in every field of mathematics, but only a slight exaggeration. (He compares to Leonhard Euler that way.) I’d imagine that anyone reading a pop mathematics blog knows something of Gauss. But you may learn something new, or a new perspective on something familiar.
I don’t just read comic strips around here. It seems like it, I grant. But there’s other things that catch my interest and that you might also like.
The first: many people have talked about what great thinkers did during their quarantine-induced disruptions to their lives. Isaac Newton is held up as a great example. While avoiding the Plague, after all, he had that great year of discovering calculus, gravity, optics, and an automatic transmission that doesn’t fail after eight years of normal driving. It’s a great story. The trouble is that the real thing is always more ambiguous, more hesitant, and less well-defined than the story. The Renaissance Mathematicus discusses, in detail, something closer to the reality of Newton’s accomplishments during that plague year. This is not to say that his work was not astounding. But it was not as much, or as intense, or as superhuman as inspirational tweets would like.
If you do decide the quarantine is a great chance to revolutionize academia, good luck. You need some reference material, though. Springer publishing has put out several hundred of its textbooks as free PDFs or eBooks. A list of 408 of them (the poster claims) is here on Reddit. This is not only a list of mathematics and mathematics-related topics, and I do not understand the poster’s organization scheme. But there are a lot of books here, including at least two Introduction to Partial Differential Equations texts. There’s something of note there. This could finally be the thing that gets me to learn the mathematical-statistics programming language R. (It will not get me to learn the mathematical-statistics programming language R.)
And, finally, the disruption to everything has messed up academic departments’ routines. Some of those routines are seminars, in which people share the work they’re doing. Fortunately, many of these seminars are moving to online presentations. And then you can join in, and at least listen, without needing even to worry about being the stranger hanging around the mathematics department. Mathseminars.org has a list of upcoming seminars, with links to what the sessions are about and how to join them. The majority are in English, but there are listed seminars in Spanish, Russian, and French.
Yesterday was the birthday of Herman Hollerith. His 40th since his birth in 1860. He’s renowned in computing circles. His work in automating the counting of data made the United States’s 1890 Census possible. This is not the ordinary hyperbole: the 1880 Census’s data took eight years to fully collate. Hollerith’s tabulating machines took … well, six years for the full job, but they were keeping track of quite a bit of information. Hollerith’s system would go on to be used for other censuses, and also for general inventory and data-tracking purposes. His tabulating company would go on to be one of the original components of IBM. Cards, card readers, and card sorters with a clear lineage to this system would be used until fully electronic computers took over.
(It’s commonly assumed that the traditional 80-character width of a text terminal traces to the 80-hole punch cards which became the standard. Programmers particularly love to tell that tale, ignoring early computing screens that had different lengths, particularly 72 characters. More plausibly 80 characters owes to two things: it’s a nice round number, and it’s close to the number of characters you can type on a standard sheet of paper with a normal typewriter font. So it’s about the “right” length, one that we’ve been trained to accept as enough text to read at a glance.)
Well. In about 1970 IBM hired Bob Newhart to record a bit, for … fun, if that word applies to IBM. Part of the publicity for launching the famous System 370 machine. The structure echoes the bit where Bob Newhart imagines being the first guy to hear of Sir Walter Raleigh’s importing of tobacco, and just how weird every bit of that is. In this bit, Newhart imagines talking on the phone with Herman Hollerith and hearing about just how this punched-card system is supposed to work. For decades, though, the film was reported lost.
What I did not know until mentioning to a friend two days ago is: the film was found! And a decade ago! In a Swedish bank vault because that’s the way this sort of thing always happens. Which is a neat bit of historical rhyming: the original fine data from the first Hollerith census of 1890 is lost, most likely destroyed in 1933 or 1934. So, please let me share with you Bob Newhart hearing about Herman Hollerith’s system. The end appears to be cut off, and there are Swedish subtitles that might just give away a couple jokes, if you can’t help paying attention to them.
Like a lot of comic work-for-hire it’s not Newhart’s best. It’s not going to displace the Voyage of the USS Codfish in my heart. There are a few spots to me where it seems like Newhart’s overlooked a good additional punch line, and I don’t know whether that reflects Newhart wanting to keep the piece from growing too long or too technical or what. It’s possible Newhart didn’t feel familiar enough with punch card technology to get too technical too. Newhart did work, briefly, as an accountant and might have had some reason to use the things. But I’m not aware of his telling any stories of doing so, and that seems a telling omission.
Still, it’s great to see this bit has been preserved, and is available. And is a Bob Newhart routine about early computer technologies, somehow.
I’m more fluent in graph theory, and my writing will reflect that. But its critical insight involves looking at spaces and ignoring things like distance and area and angle. It is amazing that one can discard so much of geometry and still have anything to consider. What we do learn then applies to very many problems.
Königsberg Bridge Problem.
Once upon a time there was a city named Königsberg. It no longer is. It is Kaliningrad now. It’s no longer in that odd non-contiguous chunk of Prussia facing the Baltic Sea. It’s now in that odd non-contiguous chunk of Russia facing the Baltic Sea.
I put it this way because what the city evokes, to mathematicians, is a story. I do not have specific reason to think the story untrue. But it is a good story, and as I think more about history I grow more skeptical of good stories. A good story teaches, though not always the thing it means to convey.
The story is this. The city is on two sides of the Pregel river, now the Pregolya River. Two large islands are in the river. For several centuries these four land masses were connected by a total of seven bridges. And we are told that people in the city would enjoy free time with an idle puzzle. Was there a way to walk all seven bridges one and only one time? If no one did something foul like taking a boat to cross the river, or not going the whole way across a bridge, anyway? There were enough bridges, though, and enough possible ways to cross them, that trying out every option was hopeless.
Then came Leonhard Euler. Who is himself a preposterous number of stories. Pick any major field of mathematics; there is an Euler’s Theorem at its center. Or an Euler’s Formula. Euler’s Method. Euler’s Function. Likely he brought great new light to it.
And in 1736 he solved the Königsberg Bridge Problem. The answer was to look at what would have to be true for a solution to exist. He noticed something so obvious it required genius not to dismiss it. It seems too simple to be useful. In a successful walk you enter each land mass (river bank or island) the same number of times you leave it. So if you cross each bridge exactly once, you use an even number of bridges per land mass. The exceptions are that you must start at one land mass, and end at a land mass. Maybe a different one. How you get there doesn’t count for the problem. How you leave doesn’t either. So the land mass you start from may have an odd number of bridges. So may the one you end on. So there are up to two land masses that may have an odd number of bridges.
Once this is observed, it’s easy to tell that Königsberg’s Bridges did not match that. All four land masses in Königsberg have an odd number of bridges. And so we could stop looking. It’s impossible to walk the seven bridges exactly once each in a tour, not without cheating.
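Euler's counting argument is short enough to run. The sketch below assumes the layout usually given for the seven bridges: two from each river bank to the central island, one from each bank to the eastern land mass, and one between the two. The labels are made up for this example, and the check is only the necessary condition described above, not a search for an actual route.

```python
from collections import Counter

# Hypothetical labels: "north" and "south" are the river banks,
# "island" is the central island, "east" is the eastern land mass.
# Each pair is one bridge; this is the commonly described layout.
BRIDGES = [
    ("north", "island"), ("north", "island"),
    ("south", "island"), ("south", "island"),
    ("north", "east"), ("south", "east"),
    ("island", "east"),
]

def bridge_counts(bridges):
    """Count how many bridge ends touch each land mass."""
    counts = Counter()
    for a, b in bridges:
        counts[a] += 1
        counts[b] += 1
    return counts

def odd_land_masses(bridges):
    """List the land masses touched by an odd number of bridges."""
    return sorted(land for land, n in bridge_counts(bridges).items() if n % 2 == 1)

def walk_might_exist(bridges):
    """Euler's necessary condition: at most two land masses may be odd."""
    return len(odd_land_masses(bridges)) <= 2

print(dict(bridge_counts(BRIDGES)))
print("odd land masses:", odd_land_masses(BRIDGES))
print("walk might exist:", walk_might_exist(BRIDGES))
```

Dropping any one bridge from the list leaves exactly two odd land masses, so the condition stops ruling a walk out.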
Graph theoreticians, like the topologists of my prologue, now consider this foundational to their field. To look at a geographic problem and not concern oneself with areas and surfaces and shapes? To worry only about how sets connect? This guides graph theory in how to think about networks.
The city exists, as do the islands, and the bridges existed as described. So does Euler’s solution. And his reasoning is sound. The reasoning is ingenious, too. Everything hard about the problem evaporates. So what do I doubt about this fine story?
Teo Paoletti, author of that web page, says Danzig mayor Carl Leonhard Gottlieb Ehler wrote Euler, asking for a solution. This falls short of proving that the bridges were a common subject of speculation. It does show at least that Ehler thought it worth pondering. Euler apparently did not think it was even mathematics. Not that he thought it was hard; he simply thought it didn’t depend on mathematical principles. It took only reason. But he did find something interesting: why was it not mathematics? Paoletti quotes Euler as writing:
This question is so banal, but seemed to me worthy of attention in that [neither] geometry, nor algebra, nor even the art of counting was sufficient to solve it.
I am reminded of a mathematical joke. It’s about the professor who always went on at great length about any topic, however slight. I have no idea why this should stick with me. Finally one day the professor admitted of something, “This problem is not interesting.” The students barely had time to feel relief. The professor went on: “But the reasons why it is not interesting are very interesting. So let us explore that.”
The Königsberg Bridge Problem is in the first chapter of every graph theory book ever. And it is a good graph theory problem. It may not be fair to say it created graph theory, though. Euler seems to have treated this as a little side bit of business, unrelated to his real mathematics. Graph theory as we know it — as a genre — formed in the 19th century. So did topology. In hindsight we can see how studying these bridges brought us good questions to ask, and ways to solve them. But for something like a century after Euler published this, it was just the clever solution to a recreational mathematics puzzle. It was as important as finding knight’s tours of chessboards.
That we take it as the introduction to graph theory, and maybe topology, tells us something. It is an easy problem to pose. Its solution is clever, but not obscure. It takes no long chains of complex reasoning. Many people approach mathematics problems with fear. By telling this story, we promise mathematics that feels as secure as a stroll along the riverfront. This promise is good through about chapter three, section four, where there are four definitions on one page and the notation summons obscure demons of LaTeX.
Still. Look at what the story of the bridges tells us. We notice something curious about our environment. The problem seems mathematical, or at least geographic. The problem is of no consequence. But it lingers in the mind. The obvious approaches to solving it won’t work. But think of the problem differently. The problem becomes simple. And better than simple. It guides one to new insights. In a century it gives birth to two fields of mathematics. In two centuries these are significant fields. They’re things even non-mathematicians have heard of. It’s almost a mathematician’s fantasy of insight and accomplishment.
But this does happen. The world suggests no end of little mathematics problems. Sometimes they are wonderful. Richard Feynman’s memoirs tell of his imagination being captured by a plate spinning in the air. Solving that helped him resolve a problem in developing Quantum Electrodynamics. There are more mundane problems. One of my professors in grad school remembered tossing and catching a tennis racket and realizing he didn’t know why sometimes it flipped over and sometimes didn’t. His specialty was in dynamical systems, and he could work out the mechanics of what a tennis racket should do, and when. And I know that within me is the ability to work out when a pile of books becomes too tall to stand on its own. I just need to work up to it.
The story of the Königsberg Bridge Problem is about this. Even if nobody but the mayor of Danzig pondered how to cross the bridges, and he only got an answer because he infected Euler with the need to know? It is a story of an important piece of mathematics. Good stories will tell us things that are true, which are not necessarily the things that happen in them.
This is almost all a post about some comics that don’t need more than a mention. You know, strips that just have someone in class not buying the word problem. These are the rest of last week’s.
Before I get there, though, I want to share something. I ran across an essay by Chris K Caldwell and Yeng Xiong: What Is The Smallest Prime? The topic is about 1, and whether that should be a prime number. Everyone who knows a little about mathematics knows that 1 is generally not considered a prime number. But we’re also a bit stumped to figure out why, since the idea of “a prime number is divisible by 1 and itself” seems to fit this, even if the fit is weird. And we have an explanation for this: 1 used to be thought of as prime, but it made various theorems more clumsy to present. So it was either cut 1 out of the definition or add the equivalent work to everything, and mathematicians went for the solution that was less work. I know that I’ve shared this story around here. (I’m surprised to find I didn’t share it in my Summer 2017 A-to-Z essay about prime numbers.)
The truth is more complicated than that. The truth of anything is always more complicated than its history. Even an excellent history’s. It’s not that the short story has things wrong, precisely. But matters are more complicated than that. The history includes things we forget were ever problems, like, the question of whether 1 should be a number. And that the question of whether mathematicians “used to” consider 1 a number is built on the supposition that mathematicians were a lot more uniform in their thinking than they were. Even to the individual: people were inconsistent in what they themselves wrote, because most mathematicians turn out to be people.
Tim Rickard’s Brewster Rockit for the 17th mentions entropy, which is so central to understanding statistical mechanics and information theory. It’s in the popular understanding of entropy, that of it being a thing which makes stuff get worse. But that’s of mathematical importance too.
Nobody had a suggested topic starting with ‘W’ for me! So I’ll take that as a free choice, and get lightly autobiographical.
Witch of Agnesi.
I know I encountered the Witch of Agnesi while in middle school. Eighth grade, if I’m not mistaken. It was a footnote in a textbook. I don’t remember much of the textbook. What I mostly remember of the course was how much I did not fit with the teacher. The only relief from boredom that year was the month we had a substitute and the occasional interesting footnote.
It was in a chapter about graphing equations. That is, finding curves whose points have coordinates that satisfy some equation. In a bit of relief from lines and parabolas the footnote offered this:
$y = \frac{8a^3}{x^2 + 4a^2}$
In a weird tantalizing moment the footnote didn’t offer a picture. Or say what an ‘a’ was doing in there. In retrospect I recognize ‘a’ as a parameter, and that different values of it give different but related shapes. No hint what the ‘8’ or the ‘4’ were doing there. Nor why ‘a’ gets raised to the third power in the numerator or the second in the denominator. I did my best with the tools I had at the time. Picked a nice easy boring ‘a’. Picked out values of ‘x’ and found the corresponding ‘y’ which made the equation true, and tried connecting the dots. The result didn’t look anything like a witch. Nor a witch’s hat.
It was one of a handful of biographical notes in the book. These were a little attempt to add some historical context to mathematics. It wasn’t much. But it was an attempt to show that mathematics came from people. Including, here, from Maria Gaëtana Agnesi. She was, I’m certain, the only woman mentioned in the textbook I’ve otherwise completely forgotten.
We have few names of ancient mathematicians. Those we have are often compilers like Euclid whose fame obliterated the people whose work they explained. Or they’re like Pythagoras, credited with discoveries by people who obliterated their own identities. In later times we have the mathematics done by, mostly, people whose social positions gave them time to write mathematics results. So we see centuries where every mathematician is doing it as their side hustle to being a priest or lawyer or physician or combination of these. Women don’t get the chance to stand out here.
Today of course we can name many women who did, and do, mathematics. We can name Emmy Noether, Ada Lovelace, and Marie-Sophie Germain. Challenged to do a bit more, we can offer Florence Nightingale and Sofia Kovalevskaya. Well, and also Grace Hopper and Margaret Hamilton if we decide computer scientists count. Katherine Johnson looks likely to make that cut. But in any case none of these people are known for work understandable in a pre-algebra textbook. This must be why Agnesi earned a place in this book. She’s among the earliest women we can specifically credit with doing noteworthy mathematics. (Also physics, but that’s off point for me.) Her curve might be a little advanced for that textbook’s intended audience. But it’s not far off, and pondering questions like “why ? Why not ?” is more pleasant, to a certain personality, than pondering what a directrix might be and why we might use one.
The equation might be a lousy way to visualize the curve described. The curve is one of that group of interesting shapes you get by constructions. That is, following some novel process. Constructions are fun. They’re almost a craft project.
For this we start with a circle. And two parallel tangent lines. Without loss of generality, suppose they’re horizontal, so, there’s lines at the top and the bottom of the circle.
Take one of the two tangent points. Again without loss of generality, let’s say the bottom one. Draw a line from that point over to the other line. Anywhere on the other line. There’s a point where the line you drew intersects the circle. There’s another point where it intersects the other parallel line. We’ll find a new point by combining pieces of these two points. The point is on the same horizontal as wherever your line intersects the circle. It’s on the same vertical as wherever your line intersects the other parallel line. This point is on the Witch of Agnesi curve.
Now draw another line. Again, starting from the lower tangent point and going up to the other parallel line. Again it intersects the circle somewhere. This gives another point on the Witch of Agnesi curve. Draw another line. Another intersection with the circle, another intersection with the opposite parallel line. Another point on the Witch of Agnesi curve. And so on. Keep doing this. When you’ve drawn all the lines that reach from the tangent point to the other line, you’ll have generated the full Witch of Agnesi curve. This takes more work than writing out $y = \frac{8a^3}{x^2 + 4a^2}$, yes. But it’s more fun. It makes for neat animations. And I think it prepares us to expect the shape of the curve.
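The construction can be checked against the equation with a little trigonometry. This sketch makes several assumptions of its own: the circle has radius `a` and sits on the origin, so the tangent lines are `y = 0` and `y = 2a`; each drawn line is described by its angle `theta` from the lower tangent line; and the footnote's equation is taken to be the usual form y = 8a^3 / (x^2 + 4a^2), which fits the 8, the 4, the cube and the square mentioned earlier.

```python
import math

def witch_point(theta, a=1.0):
    """One point of the curve from the circle-and-parallel-lines construction.

    The circle has radius a and center (0, a). The drawn line leaves the
    lower tangent point (0, 0) at angle theta, with 0 < theta < pi.
    """
    # The line meets the circle again at distance 2a*sin(theta) from the origin.
    distance_to_circle = 2 * a * math.sin(theta)
    y_from_circle = distance_to_circle * math.sin(theta)
    # The line meets the upper tangent line y = 2a at x = 2a / tan(theta).
    x_from_top_line = 2 * a * math.cos(theta) / math.sin(theta)
    # Combine: the vertical from the top-line point, the horizontal from the circle point.
    return x_from_top_line, y_from_circle

def witch_equation(x, a=1.0):
    """The assumed closed form of the curve."""
    return 8 * a**3 / (x**2 + 4 * a**2)

for degrees in (15, 45, 90, 120, 170):
    x, y = witch_point(math.radians(degrees))
    print(f"{degrees:3d} deg: x = {x:8.4f}, constructed y = {y:.6f}, "
          f"equation y = {witch_equation(x):.6f}")
```

Sweeping `theta` across the whole range from 0 to pi traces the full curve, which is what the animations do.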
It’s a neat curve. Between it and the lower parallel line is an area four times that of the circle that generated it. The shape is one we would get from looking at the derivative of the arctangent. So there’s some reasons someone working in calculus might find it interesting. And people did. Pierre de Fermat studied it, and found this area. Isaac Newton and Luigi Guido Grandi studied the shape, using this circle-and-parallel-lines construction. Maria Agnesi’s name attached to it after she published a calculus textbook which examined this curve. She showed, according to people who present themselves as having read her book, the curve and how to find it. And she showed its equation and found the vertex and asymptote line and the inflection points. The inflection points, here, are where the curve changes from being cupped upward to cupping downward, or vice-versa.
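Both the area and the inflection points are easy to check numerically. This is a sketch under assumptions: the curve is taken as y = 8a^3 / (x^2 + 4a^2) for a circle of radius `a`, its antiderivative is 4a^2 arctan(x / 2a) (the arctangent link mentioned above), and the inflection points are taken to sit at x = ±2a/√3. That last location is worked out for this example and is not quoted from Agnesi.

```python
import math

def witch(x, a=1.0):
    """Assumed form of the curve for a generating circle of radius a."""
    return 8 * a**3 / (x**2 + 4 * a**2)

def area_under(lo, hi, a=1.0):
    """Area between the curve and the lower line from lo to hi.

    Uses the antiderivative 4 a^2 atan(x / (2a)).
    """
    return 4 * a * a * (math.atan(hi / (2 * a)) - math.atan(lo / (2 * a)))

def second_derivative(x, a=1.0, h=1e-4):
    """Central-difference estimate of the curve's second derivative."""
    return (witch(x + h, a) - 2 * witch(x, a) + witch(x - h, a)) / (h * h)

a = 1.0
circle_area = math.pi * a * a
print("area ratio:", area_under(-1e9, 1e9, a) / circle_area)

inflection = 2 * a / math.sqrt(3)
print("cupping just inside: ", second_derivative(inflection - 0.1, a))
print("cupping just outside:", second_derivative(inflection + 0.1, a))
```

The second derivative is negative inside the inflection point, where the hill cups downward, and positive outside it.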
It’s a neat function. It’s got some uses. It’s a natural smooth-hill shape, for example. So this makes a good generic landscape feature if you’re modeling the flow over a surface. I read that solitary waves can have this curve’s shape, too.
And the curve turns up as a probability distribution. Take a fixed point. Pick lines at random that pass through this point. See where those lines reach a separate, straight line. Some regions are more likely to be intersected than are others. Chart how often any particular point on that line is the intersection point. That chart will (given some assumptions I ask you to pretend you agree with) be a Witch of Agnesi curve. This might not surprise you. It seems inevitable from the circle-and-intersecting-line construction process. And that’s nice enough. As a distribution it looks like the usual Gaussian bell curve.
It’s different, though. And it’s different in strange ways. Like, for a probability distribution we can find an expected value. That’s … well, what it sounds like. But this is the strange probability distribution for which the law of large numbers does not work. Imagine an experiment that produces real numbers, with the frequency of each number given by this distribution. Run the experiment zillions of times. What’s the mean value of all the zillions of generated numbers? And it … doesn’t … have one. I mean, we know it ought to, it should be the center of that hill. But the calculations for that don’t work right. Taking a bigger sample doesn’t make the sample mean settle down, the way it does for better-behaved distributions. The mean of a huge sample jumps around just as much as a single draw does. It’s a weird idea.
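You can watch this happen with a simulation. The sketch below fills in the assumptions itself: the fixed point is (0, 1), the separate line is the x-axis, and "at random" means the line's angle from the vertical is uniform between -90 and 90 degrees. The seed and sample sizes are arbitrary. It prints the running mean next to the running median, which does behave.

```python
import math
import random
import statistics

def sample_intersections(n, rng):
    """Where n random lines through (0, 1) cross the x-axis.

    A line tilted by angle phi from the vertical crosses at x = tan(phi).
    """
    return [math.tan(rng.uniform(-math.pi / 2, math.pi / 2)) for _ in range(n)]

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = sample_intersections(200_000, rng)

for n in (100, 10_000, 200_000):
    head = draws[:n]
    mean = sum(head) / len(head)
    median = statistics.median(head)
    print(f"n = {n:7d}   mean = {mean:12.4f}   median = {median:8.4f}")
```

Try a few different seeds. The median hugs the center of the hill more tightly as the sample grows, while the mean can get yanked anywhere by a single line that came out nearly horizontal.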
Imagine carving a block of wood in the shape of this curve, with a horizontal lower bound and the Witch of Agnesi curve as the upper bound. Where would it balance? … The normal mathematical tools don’t say, even though the shape has an obvious line of symmetry. And a finite area. You don’t get this kind of weirdness with parabolas.
(Yes, you’ll get a balancing point if you actually carve a real one. This is because you work with finitely-long blocks of wood. Imagine you had a block of wood infinite in length. Then you would see some strange behavior.)
It teaches us more strange things, though. Consider interpolations, that is, taking a couple data points and fitting a curve to them. We usually start out looking for polynomials when we interpolate data points. This is because everything is polynomials. Toss in more data points. We need a higher-order polynomial, but we can usually fit all the given points. But sometimes polynomials won’t work. A problem called Runge’s Phenomenon can happen, where the more data points you have the worse your polynomial interpolation is. The Witch of Agnesi curve is one of those. Carl Runge used points on this curve, and trying to fit polynomials to those points, to discover the problem. More data and higher-order polynomials make for worse interpolations. You get curves that look less and less like the original Witch. Runge is himself famous to mathematicians, known for “Runge-Kutta”. That’s a family of techniques to solve differential equations numerically. I don’t know whether Runge came to the weirdness of the Witch of Agnesi curve from considering how errors build in numerical integration. I can imagine it, though. The topics feel related to me.
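Here is a small experiment showing the effect. It is a sketch with choices of its own: the curve is y = 8a^3 / (x^2 + 4a^2) with a = 0.1, which makes it a multiple of the commonly quoted test function 1 / (1 + 25x^2); the data points are evenly spaced on the interval from -1 to 1; and the polynomial is evaluated with the plain Lagrange formula. None of that is claimed to be Runge's own setup.

```python
def witch(x, a=0.1):
    """Assumed form of the curve; a = 0.1 is a choice made for this demo."""
    return 8 * a**3 / (x * x + 4 * a * a)

def lagrange_interpolate(xs, ys, x):
    """Value at x of the unique polynomial passing through all (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def max_error(n_points, samples=1001):
    """Worst gap between curve and interpolant for n evenly spaced points."""
    xs = [-1 + 2 * k / (n_points - 1) for k in range(n_points)]
    ys = [witch(x) for x in xs]
    grid = [-1 + 2 * k / (samples - 1) for k in range(samples)]
    return max(abs(lagrange_interpolate(xs, ys, x) - witch(x)) for x in grid)

errors = {n: max_error(n) for n in (6, 11, 21)}
for n, err in errors.items():
    print(f"{n:2d} data points: worst error {err:.4f}")
```

The worst error lands near the ends of the interval, where the polynomial swings wildly between the data points. Choosing points bunched toward the ends, rather than evenly spaced, is the usual cure, but that is a different experiment.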
I understand how none of this could fit that textbook’s slender footnote. I’m not sure any of the really good parts of the Witch of Agnesi could even fit thematically in that textbook. At least beyond the fact of its interesting name, which any good blog about the curve will explain. That there was no picture, and that the equation was beyond what the textbook had been describing, made it a challenge. Maybe not seeing what the shape was teased the mathematician out of this bored student.
And next is ‘X’. Will I take Mr Wu’s suggestion and use that to describe something “extreme”? Or will I take another topic or suggestion? We’ll see on Friday, barring unpleasant surprises. Thanks for reading.
I have been reading Mapping In Michigan and the Great Lakes Region, edited by David I Macleod, because — look, I understand that I have a problem. I just live with it. The book is about exactly what you might imagine from the title. And it features lots of those charming old maps where, you know, there wasn’t so very much hard data available and everyone did the best with what they had. So you get these maps with spot-on perfect Lake Eries and the eastern shore of Lake Huron looking like you pulled it off of Open Street Maps. And then Michigan looks like a kid’s drawing of a Thanksgiving turkey. Also sometimes they drop a mountain range in the middle of the state because I guess it seemed a little empty without.
The first chapter, by Mary Sponberg Pedley, is a biography and work-history of Louis Charles Karpinski, 1878-1956. Karpinski did a lot to bring scholastic attention to maps of the Great Lakes area. He was a professor of mathematics for the University of Michigan. And he commented a good bit about the problems of teaching mathematics. Pedley quoted this bit that I thought was too good not to share. It’s from Arithmetic For The Farm. It’s about the failure of textbooks to provide examples that actually reflected anything anyone might want to know. I quote here Pedley’s endnote:
Karpinski disparaged the typical “story problems” found in contemporary textbooks, such as the following: “How many sacks, holding 2 bushels, 3 pecks and 2 quarts each can be filled from a bin containing 366 bushels, 3 pecks, 4 quarts of wheat?” Karpinski comments: “How carefully would you have to fill a sack to make it hold 3 pecks 2 quarts of anything? And who filled the bin so marvelously that the capacity is known with an accuracy of one-25th of 1% of the total?” He recommended an easier, more practical means of doing such problems, noting that a bushel is about 1 & 1/4 or 5/4 cubic feet. Therefore the number of bushels in the bin is the length times width times 4/5; the easiest way to get 4/5 of anything is to take away one-fifth of it.
This does read to me like Pedley jumped a track somewhere. It seems to go from the demolition of the plausibility of one problem’s setup to demolishing the plausibility of how to answer a problem. Still, the core complaint is with us yet. It’s hard to frame problems that might actually come up in ways that clearly test specific mathematical skills.
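For the curious, here is the disparaged problem actually worked, along with Karpinski's shortcut. This sketch assumes US dry measure, with 8 quarts to the peck and 4 pecks to the bushel. The bin dimensions in the second half are invented for illustration, and include a height, which the quoted rule leaves out.

```python
QUARTS_PER_PECK = 8
PECKS_PER_BUSHEL = 4

def to_quarts(bushels, pecks, quarts):
    """Convert a bushels-pecks-quarts quantity into quarts alone."""
    return (bushels * PECKS_PER_BUSHEL + pecks) * QUARTS_PER_PECK + quarts

sack = to_quarts(2, 3, 2)
grain_bin = to_quarts(366, 3, 4)
full_sacks, leftover_quarts = divmod(grain_bin, sack)
print(f"sack = {sack} quarts, bin = {grain_bin} quarts")
print(f"{full_sacks} full sacks, {leftover_quarts} quarts left over")

def bushels_from_cubic_feet(volume):
    """Karpinski's shortcut: take away one-fifth of the volume in cubic feet."""
    return volume - volume / 5

# Hypothetical bin, 10 ft by 8 ft by 5 ft.
print(bushels_from_cubic_feet(10 * 8 * 5), "bushels, roughly")
```

The sacks do not divide the bin evenly, which rather supports the complaint that nobody would pose this problem on an actual farm.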
And on another note. This is the 1,000th mathematical piece that I’ve published since I started in September of 2011. If I’m not misunderstanding this authorship statistic on WordPress, which is never a safe bet. I’m surprised that it has taken as long as this to get to a thousand posts. Also I’m surprised that I should be surprised. I know roughly how many days there are in a year. And I know I need special circumstances to post something more often than every other day. Still, I’m glad to reach this milestone, and gratified that there’s anyone interested in what I have to say. In my next thousand posts I hope to say something.
So, I must confess failure. Not about deciphering Józef Maria Hoëne-Wronski’s attempted definition of π. He’d tried this crazy method throwing a lot of infinities and roots of infinities and imaginary numbers together. I believe I translated it into the language of modern mathematics fairly. And my failure is not that I found the formula actually described the number -½π.
Oh, I had an error in there, yes. And I’d found where it was. It was all the way back in the essay which first converted Wronski’s formula into something respectable. It was a small error, first appearing in the last formula of that essay and never corrected from there. This reinforces my suspicion that when normal people see formulas they mostly look at them to confirm there is a formula there. With luck they carry on and read the sentences around them.
My failure is I wanted to write a bit about boring mistakes. The kinds which you make all the time while doing mathematics work, but which you don’t worry about. Dropped signs. Constants which aren’t divided out, or which get multiplied in incorrectly. Stuff like this which you only detect because you know, deep down, that you should have gotten to an attractive simple formula and you haven’t. Mistakes which are tiresome to make, but never make you wonder if you’re in the wrong job.
The trouble is I can’t think of how to make an essay of that. We don’t tend to rate little mistakes like the wrong sign or the wrong multiple or a boring unnecessary added constant as important. This is because they’re not. The interesting stuff in a mathematical formula is usually the stuff representing variations. Change is interesting. The direction of the change? Eh, nice to know. A swapped plus or minus sign alters your understanding of the direction of the change, but that’s all. Multiplying or dividing by a constant wrongly changes your understanding of the size of the change. But that doesn’t alter what the change looks like. Just the scale of the change. Adding or subtracting the wrong constant alters what you think the change is varying from, but not what the shape of the change is. Once more, not a big deal.
But you also know that instinctively, or at least you get it from seeing how it’s worth one or two points on an exam to write -sin where you mean +sin. Or how if you ask the instructor in class about that 2 where a ½ should be, she’ll say, “Oh, yeah, you’re right” and do a hurried bit of erasing before going on.
Thus my failure: I don’t know what to say about boring mistakes that has any insight.
For the record, here’s where I got things wrong. I was creating a function, named ‘f’ and using as a variable ‘x’, to represent Wronski’s formula. I’d gotten to this point:
$$f(x) = -4 \imath x 2^{\frac{1}{2x}} \left\{ e^{\imath \frac{\pi}{4x}} - e^{-\imath \frac{\pi}{4x}} \right\}$$
And then I observed how the stuff in curly braces there is “one of those magic tricks that mathematicians know because they see it all the time”. And I wanted to call in this formula, correctly:
$$\sin(\phi) = \frac{e^{\imath \phi} - e^{-\imath \phi}}{2\imath}$$
So here’s where I went wrong. I took the $-4\imath$ way off in the front of that first formula and combined it with the stuff in braces to make 2 times a sine of some stuff. I apologize for this. I must have been writing stuff out faster than I was thinking about it. If I had thought, I would have gone through this intermediate step:
$$f(x) = -4 \imath x 2^{\frac{1}{2x}} \frac{\left\{ e^{\imath \frac{\pi}{4x}} - e^{-\imath \frac{\pi}{4x}} \right\}}{2\imath} \cdot 2\imath$$
Because with that form in mind, it’s easy to take the stuff in curled braces and the $2\imath$ in the denominator. From that we get, correctly, $\sin\left(\frac{\pi}{4x}\right)$. And then the $-4\imath$ on the far left of that expression and the $2\imath$ on the right multiply together to produce the number 8.
So the function ought to have been, all along:
$$f(x) = 8 x 2^{\frac{1}{2x}} \sin\left(\frac{\pi}{4x}\right)$$
Not very different, is it? Ah, but it makes a huge difference. Carry through with all the L’Hôpital’s Rule stuff described in previous essays. All the complicated formula work is the same. There’s a different number hanging off the front, waiting to multiply in. That’s all. And what you find, redoing all the work but using this corrected function, is that Wronski’s original mess —
$$\frac{4\infty}{\sqrt{-1}} \left\{ \left(1 + \sqrt{-1}\right)^{\frac{1}{\infty}} - \left(1 - \sqrt{-1}\right)^{\frac{1}{\infty}} \right\}$$
works out to $2\pi$.
Possibly the book I drew this from misquoted Wronski. It’s at least as good to have a formula for 2π as it is to have one for π. Or Wronski had a mistake in his original formula, and had a constant multiplied out front which he didn’t want. It happens to us all.
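If you’d rather not trust my algebra, and by now you shouldn’t, a few lines of Python make a quick sanity check. This is only a sketch. It assumes my reading of Wronski’s formula as -4ı x {(1 + ı)^(1/x) - (1 - ı)^(1/x)}, and the function names are ones I made up for the illustration.

```python
import math

def wronski_complex(x):
    # Assumed reading of Wronski's formula, with a finite x standing in for infinity.
    # Python's ** on complex numbers uses the principal branch.
    return -4j * x * ((1 + 1j) ** (1 / x) - (1 - 1j) ** (1 / x))

def corrected(x):
    # The function with the 8 out front.
    return 8 * x * 2 ** (1 / (2 * x)) * math.sin(math.pi / (4 * x))

def mistaken(x):
    # The function with my erroneous -2 out front.
    return -2 * x * 2 ** (1 / (2 * x)) * math.sin(math.pi / (4 * x))

for x in (10.0, 1_000.0, 100_000.0):
    print(f"x = {x:>9g}  complex form: {wronski_complex(x).real:.8f}  "
          f"corrected: {corrected(x):.8f}  mistaken: {mistaken(x):.8f}")

print("2*pi  =", 2 * math.pi)
print("-pi/2 =", -math.pi / 2)
```

The first two columns should agree with each other and creep toward 2π as ‘x’ grows. The last column, the one carrying my mistake, creeps toward -½π instead.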
I had thought I’d culled some more pieces from my Twitter and other mathematics-writing-reading the last couple weeks and I’m not sure where it all went. I think I might be baffled by the repostings of things on Quanta Magazine (which has a lot of good mathematics articles, but not, like, a 3,000-word piece every day, and they showcase their archive just as anyone ought).
It reviews Kim Plofker’s 2008 text Mathematics In India, a subject that I both know is important — I love to teach with historic context included — and something that I very much bluff my way through. I mean, I do research things I expect I’ll mention, but I don’t learn enough of the big picture and a determined questioner could prove how fragile my knowledge was. So Plofker’s book should go on my reading list at least.
We all know Newton and Leibniz introduced Calculus in the 17th century, but Cauchy made seminal contributions to its precision in the 1800s, thus we might say Cauchy introduced what we now call Analysis. Here's some notes on 19th century (real) Analysis. https://t.co/crP9tyiELC
These are lecture notes about analysis. In the 19th century mathematicians tried to tighten up exactly what we meant by things like “functions” and “limits” and “integrals” and “numbers” and all that. It was a lot of good solid argument, and a lot of surprising, intuition-defying results. This isn’t something that a lay reader’s likely to appreciate, and I’m sorry for that, but if you do know the difference between Riemann and Lebesgue integrals the notes are likely to help.
And this, Daniel Grieser and Svenja Maronna’s Hearing The Shape Of A Triangle, follows up on a classic mathematics paper, Mark Kac’s Can One Hear The Shape Of A Drum? This is part of a class of problems in which you try to reconstruct what kinds of things can produce a signal. It turns out to be impossible to perfectly say what shape and material of a drum produced a certain sound of a drum. But. A triangle — the instrument, that is, but also the shape — has a simpler structure. Could we go from the way a triangle sounds to knowing what it looks like?
Józef Maria Hoëne-Wronski had an idea for a new, universal, culturally-independent definition of π. It was this formula that nobody went along with because they had looked at it:
$$\pi = \frac{4\infty}{\sqrt{-1}} \left\{ \left(1 + \sqrt{-1}\right)^{\frac{1}{\infty}} - \left(1 - \sqrt{-1}\right)^{\frac{1}{\infty}} \right\}$$
I made some guesses about what he would want this to mean. And how we might put that in terms of modern, conventional mathematics. I describe those in the above links. In terms of limits of functions, I got this:
$$f(x) = -2 x 2^{\frac{1}{2x}} \sin\left(\frac{\pi}{4x}\right)$$
$$\lim_{x \to \infty} f(x)$$
The trouble is that limit took more work than I wanted to do to evaluate. If you try evaluating that ‘f(x)’ at ∞, you get an expression that looks like zero times ∞. This begs for the use of L’Hôpital’s Rule, which tells you how to find the limit for something that looks like zero divided by zero, or like ∞ divided by ∞. Do a little rewriting — replacing that first ‘x’ with $\frac{1}{\frac{1}{x}}$ — and this ‘f(x)’ behaves like L’Hôpital’s Rule needs.
The trouble is, that’s a pain to evaluate. L’Hôpital’s Rule works on functions that look like one function divided by another function. It does this by calculating the derivative of the numerator function divided by the derivative of the denominator function. And I decided that was more work than I wanted to do.
Where trouble comes up is all those parts where $\frac{1}{x}$ turns up. The derivatives of functions with a lot of $\frac{1}{x}$ terms in them get more complicated than the original functions were. Is there a way to get rid of some or all of those?
And there is. Do a change of variables. Let me summon the variable ‘y’, whose value is exactly $\frac{1}{x}$. And then I’ll define a new function, ‘g(y)’, whose value is whatever ‘f’ would be at $\frac{1}{y}$. That is, and this is just a little bit of algebra:
$$g(y) = \frac{-2 \cdot 2^{\frac{y}{2}} \sin\left(\frac{\pi}{4} y\right)}{y}$$
The limit of ‘f(x)’ for ‘x’ at ∞ should be the same number as the limit of ‘g(y)’ for ‘y’ at … you’d really like it to be zero. If ‘x’ is incredibly huge, then $\frac{1}{x}$ has to be incredibly small. But we can’t just swap the limit of ‘x’ at ∞ for the limit of ‘y’ at 0. The limit of a function at a point reflects the value of the function at a neighborhood around that point. If the point’s 0, this includes positive and negative numbers. But looking for the limit at ∞ gets at only positive numbers. You see the difference?
… For this particular problem it doesn’t matter. But it might. Mathematicians handle this by taking a “one-sided limit”, or a “directional limit”. The normal limit at 0 of ‘g(y)’ is based on what ‘g(y)’ looks like in a neighborhood of 0, positive and negative numbers. In the one-sided limit, we just look at a neighborhood of 0 that’s all values greater than 0, or less than 0. In this case, I want the neighborhood that’s all values greater than 0. And we write that by adding a little + in superscript to the limit. For the other side, the neighborhood less than 0, we add a little – in superscript. So I want to evaluate:
$$\lim_{y \to 0^+} g(y)$$
Limits and L’Hôpital’s Rule and stuff work for one-sided limits the way they do for regular limits. So there’s that mercy. The first attempt at this limit, seeing what ‘g(y)’ is if ‘y’ happens to be 0, gives $\frac{0}{0}$. A zero divided by a zero is promising. That’s not defined, no, but it’s exactly the format that L’Hôpital’s Rule likes. The numerator is:
$$-2 \cdot 2^{\frac{y}{2}} \sin\left(\frac{\pi}{4} y\right)$$
And the denominator is:
$$y$$
The first derivative of the denominator is blessedly easy: the derivative of y, with respect to y, is 1. The derivative of the numerator is a little harder. It demands the use of the Product Rule and the Chain Rule, just as last time. But these chains are easier.
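The first derivative of the numerator is going to be:
$$-2 \left( \frac{\ln(2)}{2} 2^{\frac{y}{2}} \sin\left(\frac{\pi}{4} y\right) + \frac{\pi}{4} 2^{\frac{y}{2}} \cos\left(\frac{\pi}{4} y\right) \right)$$
Yeah, this is the simpler version of the thing I was trying to figure out last time. Because this is what’s left if I write the derivative of the numerator over the derivative of the denominator:
$$\lim_{y \to 0^+} \frac{-2 \left( \frac{\ln(2)}{2} 2^{\frac{y}{2}} \sin\left(\frac{\pi}{4} y\right) + \frac{\pi}{4} 2^{\frac{y}{2}} \cos\left(\frac{\pi}{4} y\right) \right)}{1}$$
And now this is easy. Promise. There’s no expressions of ‘y’ divided by other expressions of ‘y’ or anything else tricky like that. There’s just a bunch of ordinary functions, all of them defined for when ‘y’ is zero. If this limit exists, it’s got to be equal to:
$$-2 \left( \frac{\ln(2)}{2} 2^{\frac{0}{2}} \sin\left(\frac{\pi}{4} \cdot 0\right) + \frac{\pi}{4} 2^{\frac{0}{2}} \cos\left(\frac{\pi}{4} \cdot 0\right) \right)$$
$\frac{\pi}{4} \cdot 0$ is 0. And the sine of 0 is 0. The cosine of 0 is 1. So all this gets to be a lot simpler, really fast.
$$-2 \left( \frac{\ln(2)}{2} 2^0 \cdot 0 + \frac{\pi}{4} 2^0 \cdot 1 \right)$$
And $2^0$ is equal to 1. So the part to the left of the + sign there is all zero. What remains is:
$$-2 \cdot \frac{\pi}{4}$$
And so, finally, we have it. Wronski’s formula, as best I make it out, is a function whose value is …
$$-\frac{\pi}{2}$$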
… So, what Wronski had been looking for, originally, was π. This is … oh, so very close to right. I mean, there’s π right there, it’s just multiplied by an unwanted $-\frac{1}{2}$. The question is, where’s the mistake? Was Wronski wrong to start with? Did I parse him wrongly? Is it possible that the book I copied Wronski’s formula from made a mistake?
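A one-sided limit like this is easy to poke at numerically, which is a fair way to catch a slip in the calculus. Here is a minimal sketch. It assumes the change-of-variables function is g(y) = -2 · 2^(y/2) · sin(πy/4) / y, with my -2 out front, and the function names are my own invention.

```python
import math

def g(y):
    # Assumed form of g(y) = f(1/y); not defined at y = 0 itself.
    return -2 * 2 ** (y / 2) * math.sin(math.pi * y / 4) / y

def numerator_derivative(y):
    # Product Rule and Chain Rule applied to -2 * 2**(y/2) * sin(pi*y/4).
    two_pow = 2 ** (y / 2)
    angle = math.pi * y / 4
    return -2 * (math.log(2) / 2 * two_pow * math.sin(angle)
                 + math.pi / 4 * two_pow * math.cos(angle))

# Walk toward 0 through positive values only: the one-sided limit.
for k in range(1, 7):
    y = 10.0 ** -k
    print(f"y = {y:<8g} g(y) = {g(y):.8f}")

# The derivative of the denominator, y, is 1.
print("L'Hopital value:", numerator_derivative(0.0) / 1.0)
print("-pi/2:          ", -math.pi / 2)
```

The printed values of g(y) should settle down toward the same number that the ratio of derivatives gives at 0.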
Could be any of them. I’d particularly suspect I parsed him wrongly. I returned the library book I had got the original claim from, and I can’t find it again before this is set to publish. But I should check whether Wronski was thinking to find π, the ratio of the circumference to the diameter of a circle. Or might he have looked to find the ratio of the circumference to the radius of a circle? Either is an interesting number worth finding. We’ve settled on the circumference-over-diameter as valuable, likely for practical reasons. It’s much easier to measure the diameter than the radius of a thing. (Yes, I have read the Tau Manifesto. No, I am not impressed by it.) But if you know 2π, then you know π, or vice-versa.
The next question: yeah, but I turned up -½π. What am I talking about 2π for? And the answer there is, I’m not the first person to try working out Wronski’s stuff. You can try putting the expression, as best you parse it, into a tool like Mathematica and see what makes sense. Or you can read, for example, Quora commenters giving answers with way less exposition than I do. And I’m convinced: somewhere along the line I messed up. Not in an important way, but, essentially, doing something equivalent to dividing by -2 when I should have multiplied by it.
I’ve spotted my mistake. I figure to come back around to explaining where it is and how I made it.
I’m slow about sharing them is all. It’s a simple dynamic: I want to write enough about each tweet that it’s interesting to share, and then once a little time has passed, I need to do something more impressive to be worth the wait. Eventually, nothing is ever shared. Let me try to fix that.
Just as it says: a link to Leonhard Euler’s Elements of Algebra, as rendered by Google Books. Euler you’ll remember from every field of mathematics ever. This 1770 textbook is one of the earliest that presents algebra that looks like, you know, algebra, the way we study it today. Much of that is because this book presented algebra so well that everyone wanted to imitate it.
This Theorem of the Day, from back in November already, is one about elliptic functions. Those came up several times in the Summer 2017 Mathematics A To Z. This day’s theorem, the Goins-Maddox-Rusin Theorem on Heron Triangles, is dense reading even by the standards of the Theorem of the Day tweet (which fits each day’s theorem into a single slide). Still, it’s worth lounging about in the mathematics.
Elke Stangl, writing about one of those endlessly-to-me interesting subjects: phase space. This is a particular way of representing complicated physical systems. Set it up right and all sorts of physics problems become, if not easy, at least things there’s a standard set of tools for. Thermodynamics really encourages learning about such phase spaces, and about entropy, and here she writes about some of this.
Non-limit calculating e by hand. https://t.co/Kv80RotboJ Fun activity & easily reproducible. Anyone know the author?
So ‘e’ is an interesting number. At least, it’s a number that’s got a lot of interesting things built around it. Here, John Golden points out a neat, fun, and inefficient way to find the value of ‘e’. It’s kin to that scheme for calculating π inefficiently that I was being all curmudgeonly about a couple of Pi Days ago.
Jo Morgan comes to the rescue of everyone who tries to read old-time mathematics. There were a lot of great, and surprisingly readable, minds publishing in the 19th century, but then you get partway through a paragraph and it might as well be Old High Martian with talk about diminishings and consequents and so on. So here’s some help.
For college students that will be taking partial differential equations next semester, here is a very good online book. https://t.co/txtfbMaRKc
As it says on the tin: a textbook on partial differential equations. If you find yourself adrift in the subject, maybe seeing how another author addresses the same subject will help, if nothing else for finding something familiar written in a different fashion.
Here's a cool way to paper-fold an ellipse:
1) Cut a circle and fold it so that the circumference falls on a fixed point inside 2) Repeat this procedure using random folds pic.twitter.com/TAU50pvgll
And this is just fun: creating an ellipse as the locus of points that are never on the fold line when a circle’s folded by a particular rule.
Finally, something whose tweet origin I lost. It was from one of the surprisingly many economists I follow considering I don’t do financial mathematics. But it links to a bit of economic history: Origins of the Sicilian Mafia: The Market for Lemons. It’s 31 pages plus references. And more charts about wheat production in 19th century Sicily than I would have previously expected to see.
By the way, if you’re interested in me on Twitter, that would be @Nebusj. Thanks for stopping in, should you choose to.
So now a bit more on Józef Maria Hoëne-Wronski’s attempted definition of π. I had got it rewritten to this form:
$$\lim_{x \to \infty} f(x) = \lim_{x \to \infty} -2 x 2^{\frac{1}{2x}} \sin\left(\frac{\pi}{4x}\right)$$
And I’d tried the first thing mathematicians do when trying to evaluate the limit of a function at a point. That is, take the value of that point and put it in whatever the formula is. If that formula evaluates to something meaningful, then that value is the limit. That attempt gave this:
$$\lim_{x \to \infty} f(x) = -2 \cdot \infty \cdot 1 \cdot 0$$
Because the limit of ‘x’, for ‘x’ at ∞, is infinitely large. The limit of ‘$2^{\frac{1}{2x}}$’ for ‘x’ at ∞ is 1. The limit of ‘$\sin\left(\frac{\pi}{4x}\right)$’ for ‘x’ at ∞ is 0. We can take limits that are 0, or limits that are some finite number, or limits that are infinitely large. But multiplying a zero times an infinity is dangerous. Could be anything.
Mathematicians have a tool. We know it as L’Hôpital’s Rule. It’s named for the French mathematician Guillaume de l’Hôpital, who discovered it in the works of his tutor, Johann Bernoulli. (They had a contract giving l’Hôpital publication rights. If Wikipedia’s right the preface of the book credited Bernoulli, although it doesn’t appear to be specifically for this. The full story is more complicated and ambiguous. The previous sentence may be said about most things.)
So here’s the first trick. Suppose you’re finding the limit of something that you can write as the quotient of one function divided by another. So, something that looks like this:
$$\lim_{x \to a} \frac{h(x)}{g(x)}$$
(Normally, this gets presented as ‘f(x)’ divided by ‘g(x)’. But I’m already using ‘f(x)’ for another function and I don’t want to muddle what that means.)
Suppose it turns out that at ‘a’, both ‘h(x)’ and ‘g(x)’ are zero, or both ‘h(x)’ and ‘g(x)’ are ∞. Zero divided by zero, or ∞ divided by ∞, looks like danger. It’s not necessarily so, though. If this limit exists, then we can find it by taking the first derivatives of ‘h’ and ‘g’, and evaluating:
$$\lim_{x \to a} \frac{h'(x)}{g'(x)}$$
That ' mark is a common shorthand for “the first derivative of this function, with respect to the only variable we have around here”.
This doesn’t look like it should help matters. Often it does, though. There’s an excellent chance that either ‘h'(x)’ or ‘g'(x)’ — or both — aren’t simultaneously zero, or ∞, at ‘a’. And once that’s so, we’ve got a meaningful limit. This doesn’t always work. Sometimes we have to use this l’Hôpital’s Rule trick a second time, or a third or so on. But it works so very often for the kinds of problems we like to do. Reaches the point that if it doesn’t work, we have to suspect we’re calculating the wrong thing.
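For anyone who would like to watch the trick work on something tamer than Wronski’s formula, here is a small sketch in Python. The example, sin(x) divided by x at 0, is my own choice for illustration and not part of the Wronski problem. The derivatives are written out by hand rather than computed by the program.

```python
import math

# An illustrative 0/0 case: h(x) = sin(x), g(x) = x, at a = 0.
def h(x):
    return math.sin(x)

def g(x):
    return x

def h_prime(x):
    return math.cos(x)   # derivative of sin(x)

def g_prime(x):
    return 1.0           # derivative of x

a = 0.0
# h(a) / g(a) would be 0 / 0, so look at the ratio of derivatives instead.
lhopital_value = h_prime(a) / g_prime(a)

# Compare against the original quotient at points near, but not at, a.
for x in (0.1, 0.01, 0.001):
    print(f"x = {x:<6g} h(x)/g(x) = {h(x) / g(x):.10f}")
print("h'(a)/g'(a) =", lhopital_value)
```

The quotient near 0 should crowd up against the value the derivatives give, which is all the rule promises when the limit exists.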
But wait, you protest, reasonably. This is fine for problems where the limit looks like 0 divided by 0, or ∞ divided by ∞. What Wronski’s formula got me was 0 times 1 times ∞. And I won’t lie: I’m a little unsettled by having that 1 there. I feel like multiplying by 1 shouldn’t be a problem, but I have doubts.
That zero times ∞ thing, though? That’s easy. Here’s the second trick. Let me put it this way: isn’t ‘x’ really the same thing as $\frac{1}{\frac{1}{x}}$?
I expect your answer is to slam your hand down on the table and glare at my writing with contempt. So be it. I told you it was a trick.
And it’s a perfectly good one. And it’s perfectly legitimate, too. $\frac{1}{x}$ is a meaningful number if ‘x’ is any finite number other than zero. So is $\frac{1}{\frac{1}{x}}$. Mathematicians accept a definition of limit that doesn’t really depend on the value of your expression at a point. So that $\frac{1}{x}$ wouldn’t be meaningful for ‘x’ at zero doesn’t mean we can’t evaluate its limit for ‘x’ at zero. And just because we might not be sure what $\frac{1}{\frac{1}{x}}$ would mean for infinitely large ‘x’ doesn’t mean we can’t evaluate its limit for ‘x’ at ∞.
I see you, person who figures you’ve caught me. The first thing I tried was putting in the value of ‘x’ at the ∞, all ready to declare that this was the limit of ‘f(x)’. I know my caveats, though. Plugging in the value you want the limit at into the function whose limit you’re evaluating is a shortcut. If you get something meaningful, then that’s the same answer you would get finding the limit properly. Which is done by looking at the neighborhood around but not at that point. So that’s why this reciprocal-of-the-reciprocal trick works.
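So back to my function, which looks like this:
$$f(x) = -2 x 2^{\frac{1}{2x}} \sin\left(\frac{\pi}{4x}\right)$$
Do I want to replace ‘x’ with $\frac{1}{\frac{1}{x}}$, or do I want to replace $\sin\left(\frac{\pi}{4x}\right)$ with $\frac{1}{\frac{1}{\sin\left(\frac{\pi}{4x}\right)}}$? I was going to say something about how many times in my life I’ve been glad to take the reciprocal of the sine of an expression of x. But just writing the symbols out like that makes the case better than being witty would.
So here is a new, L’Hôpital’s Rule-friendly, version of my version of Wronski’s formula:
$$f(x) = -2 \frac{2^{\frac{1}{2x}} \sin\left(\frac{\pi}{4x}\right)}{\frac{1}{x}}$$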
I put that -2 out in front because it’s not really important. The limit of a constant number times some function is the same as that constant number times the limit of that function. We can put that off to the side, work on other stuff, and hope that we remember to bring it back in later. I manage to remember it about four-fifths of the time.
So these are the numerator and denominator functions I was calling ‘h(x)’ and ‘g(x)’ before:
$$h(x) = 2^{\frac{1}{2x}} \sin\left(\frac{\pi}{4x}\right)$$
$$g(x) = \frac{1}{x}$$
The limit of both of these at ∞ is 0, just as we might hope. So we take the first derivatives. That for ‘g(x)’ is easy. Anyone who’s reached week three in Intro Calculus can do it. This may only be because she’s gotten bored and leafed through the formulas on the inside front cover of the textbook. But she can do it. It’s:
$$g'(x) = -\frac{1}{x^2}$$
When I last looked at Józef Maria Hoëne-Wronski’s attempted definition of π I had gotten it to this. Take the function:
$$f(x) = -2 x 2^{\frac{1}{2x}} \sin\left(\frac{\pi}{4x}\right)$$
And find its limit when ‘x’ is ∞. Formally, you want to do this by proving there’s some number, let’s say ‘L’. And ‘L’ has the property that you can pick any margin-of-error number ε that’s bigger than zero. And whatever that ε is, there’s some number ‘N’ so that whenever ‘x’ is bigger than ‘N’, ‘f(x)’ is larger than ‘L – ε’ and also smaller than ‘L + ε’. This can be a lot of mucking about with expressions to prove.
Fortunately we have shortcuts. There’s work we can do that gets us ‘L’, and we can rely on other proofs that show that this must be the limit of ‘f(x)’ at some value ‘a’. I use ‘a’ because that doesn’t commit me to talking about ∞ or any other particular value. The first approach is to just evaluate ‘f(a)’. If you get something meaningful, great! We’re done. That’s the limit of ‘f(x)’ at ‘a’. This approach is called “substitution” — you’re substituting ‘a’ for ‘x’ in the expression of ‘f(x)’ — and it’s great. Except that if your problem’s interesting then substitution won’t work. Still, maybe Wronski’s formula turns out to be lucky. Fit in ∞ where ‘x’ appears and we get:
$$f(\infty) = -2 \cdot \infty \cdot 2^{\frac{1}{2 \infty}} \sin\left(\frac{\pi}{4 \infty}\right)$$
So … all right. Not quite there yet. But we can get there. For example, $\frac{1}{\infty}$ has to be — well. It’s what you would expect if you were a kid and not worried about rigor: 0. We can make it rigorous if you like. (It goes like this: Pick any ε larger than 0. Then whenever ‘x’ is larger than $\frac{1}{\epsilon}$ then $\frac{1}{x}$ is less than ε. So the limit of $\frac{1}{x}$ at ∞ has to be 0.) So let’s run with this: replace all those $\frac{1}{\infty}$ expressions with 0. Then we’ve got:
$$f(\infty) = -2 \cdot \infty \cdot 2^0 \cdot \sin(0)$$
The sine of 0 is 0. $2^0$ is 1. So substitution tells us the limit is -2 times ∞ times 1 times 0. That there’s an ∞ in there isn’t a problem. A limit can be infinitely large. Think of the limit of ‘$x^2$’ at ∞. An infinitely large thing times an infinitely large thing is fine. The limit of ‘$x e^x$’ at ∞ is infinitely large. A zero times a zero is fine; that’s zero again. But having an ∞ times a 0? That’s trouble. ∞ times something should be huge; anything times zero should be 0; which term wins?
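To see why “∞ times 0” can’t be given one answer, it helps to watch a few products where one factor grows without bound and the other shrinks to zero. The three products in this sketch are examples I picked for illustration. They aren’t pieces of Wronski’s formula.

```python
def products(x):
    # In each product, as x grows, one factor heads to infinity
    # and the other heads to zero.
    return {
        "x * (1/x)": x * (1 / x),
        "x * (1/x**2)": x * (1 / x ** 2),
        "x**2 * (1/x)": x ** 2 * (1 / x),
    }

for x in (1e2, 1e4, 1e6):
    row = ", ".join(f"{name} = {value:g}" for name, value in products(x).items())
    print(f"x = {x:g}: {row}")
```

One product sits near 1, one shrinks toward 0, and one grows without bound. Which term wins depends on how fast each one moves, and that is exactly what substitution can’t see.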
So we have to fall back on alternate plans. Fortunately there’s a tool we have for limits when we’d otherwise have to face an infinitely large thing times a zero.
I hope to write about this next time. I apologize for not getting through it today but time wouldn’t let me.
I remain fascinated with Józef Maria Hoëne-Wronski’s attempted definition of π. It had started out like this:
$$\pi = \frac{4\infty}{\sqrt{-1}} \left\{ \left(1 + \sqrt{-1}\right)^{\frac{1}{\infty}} - \left(1 - \sqrt{-1}\right)^{\frac{1}{\infty}} \right\}$$
And I’d translated that into something that modern mathematicians would accept without flinching. That is to evaluate the limit of a function that looks like this:
$$f(x) = -4 \imath x \left\{ \left(1 + \imath\right)^{\frac{1}{x}} - \left(1 - \imath\right)^{\frac{1}{x}} \right\}$$
So. I don’t want to deal with that f(x) as it’s written. I can make it better. One thing that bothers me is seeing the complex number $1 + \imath$ raised to a power. I’d like to work with something simpler than that. And I can’t see that number without also noticing that I’m subtracting from it $1 - \imath$ raised to the same power. $1 + \imath$ and $1 - \imath$ are a “conjugate pair”. It’s usually nice to see those. It often hints at ways to make your expression simpler. That’s one of those patterns you pick up from doing a lot of problems as a mathematics major, and that then look like magic to the lay audience.
Here’s the first way I figure to make my life simpler. It’s in rewriting that $1 + \imath$ and $1 - \imath$ stuff so it’s simpler. It’ll be simpler by using exponentials. Shut up, it will too. I get there through Gauss, Descartes, and Euler.
At least I think it was Gauss who pointed out how you can match complex-valued numbers with points on the two-dimensional plane. On a sheet of graph paper, if you like. The number $1 + \imath$ matches to the point with x-coordinate 1, y-coordinate 1. The number $1 - \imath$ matches to the point with x-coordinate 1, y-coordinate -1. Yes, yes, this doesn’t sound like much of an insight Gauss had, but his work goes on. I’m leaving it off here because that’s all that I need for right now.
So these two numbers that offended me I can think of as points. They have Cartesian coordinates (1, 1) and (1, -1). But there’s never only one coordinate system for something. There may be only one that’s good for the problem you’re doing. I mean that makes the problem easier to study. But there are always infinitely many choices. For points on a flat surface like a piece of paper, and where the points don’t represent any particular physics problem, there’s two good choices. One is the Cartesian coordinates. In it you refer to points by an origin, an x-axis, and a y-axis. How far is the point from the origin in a direction parallel to the x-axis? (And in which direction? This gives us a positive or a negative number) How far is the point from the origin in a direction parallel to the y-axis? (And in which direction? Same positive or negative thing.)
The other good choice is polar coordinates. For that we need an origin and a positive x-axis. We refer to points by how far they are from the origin, heedless of direction. And then to get direction, what angle the line segment connecting the point with the origin makes with the positive x-axis. The first of these numbers, the distance, we normally label ‘r’ unless there’s compelling reason otherwise. The other we label ‘θ’. ‘r’ is always going to be a positive number or, possibly, zero. ‘θ’ might be any number, positive or negative. By convention, we measure angles so that positive numbers are counterclockwise from the x-axis. I don’t know why. I guess it seemed less weird for, say, the point with Cartesian coordinates (0, 1) to have a positive angle rather than a negative angle. That angle would be $\frac{\pi}{2}$, because mathematicians like radians more than degrees. They make other work easier.
So. The point $1 + \imath$ corresponds to the polar coordinates $r = \sqrt{2}$ and $\theta = \frac{\pi}{4}$. The point $1 - \imath$ corresponds to the polar coordinates $r = \sqrt{2}$ and $\theta = -\frac{\pi}{4}$. Yes, the θ coordinates being negative one times each other is common in conjugate pairs. Also, if you have doubts about my use of the word “the” before “polar coordinates”, well-spotted. If you’re not sure about that thing where ‘r’ is not negative, again, well-spotted. I intend to come back to that.
With the polar coordinates ‘r’ and ‘θ’ to describe a point I can go back to complex numbers. I can match the point to the complex number with the value given by $r e^{\imath \theta}$, where ‘e’ is that old 2.71828something number. Superficially, this looks like a big dumb waste of time. I had some problem with imaginary numbers raised to powers, so now, I’m rewriting things with a number raised to imaginary powers. Here’s why it isn’t dumb.
It’s easy to raise a number written like this to a power. $r e^{\imath \theta}$ raised to the n-th power is going to be equal to $r^n e^{\imath \theta n}$. (Because $\left(a^b\right)^c = a^{b \cdot c}$ and we’re going to go ahead and assume this stays true if ‘b’ is a complex-valued number. It does, but you’re right to ask how we know that.) And this turns into raising a real-valued number to a power, which we know how to do. And it involves dividing a number by that power, which is also easy.
And we can get back to something that looks like $a + b \imath$ too. That is, something that’s a real number plus $\imath$ times some real number. This is through one of the many Euler’s Formulas. The one that’s relevant here is that $e^{\imath \phi} = \cos(\phi) + \imath \sin(\phi)$ for any real number ‘φ’. So, that’s true also for ‘θ’ times ‘n’. Or, looking to where everybody knows we’re going, also true for ‘θ’ divided by ‘x’.
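The polar-coordinates route to raising a complex number to a power is short enough to try out directly. This sketch uses Python’s standard `cmath` module. The helper name `power_via_polar` is my own, and the check against Python’s built-in `**` is only a spot check, not a proof of anything.

```python
import cmath

def power_via_polar(z, p):
    # Write z as r * e^(i * theta), then raise r to the power p
    # and multiply theta by p.
    r, theta = cmath.polar(z)
    return cmath.rect(r ** p, theta * p)

z = 1 + 1j
r, theta = cmath.polar(z)
print("r =", r, " theta =", theta)          # compare with sqrt(2) and pi/4
print("conjugate theta =", cmath.polar(1 - 1j)[1])

# Raising to the one-over-x power is the same work as raising to the n power.
for x in (1.0, 2.0, 10.0):
    print(f"x = {x:<4g} polar route: {power_via_polar(z, 1 / x)}  "
          f"built-in: {z ** (1 / x)}")
```

The two columns should match to within rounding, since Python’s built-in complex power also uses the angle between -π and π.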
OK, on to the people so anxious about all this. I talked about the angle made between the line segment that connects a point and the origin and the positive x-axis. “The” angle. “The”. If that wasn’t enough explanation of the problem, mention how your thinking’s done a 360 degree turn and you see it different now. In an empty room, if you happen to be in one. Your pedantic know-it-all friend is explaining it now. There’s an infinite number of angles that correspond to any given direction. They’re all separated by 360 degrees or, to a mathematician, 2π.
And more. What’s the difference between going out five units of distance in the direction of angle 0 and going out minus-five units of distance in the direction of angle -π? That is, between walking forward five paces while facing east and walking backward five paces while facing west? Yeah. So if we let ‘r’ be negative we’ve got twice as many infinitely many sets of coordinates for each point.
This complicates raising numbers to powers. θ times n might match with some point that’s very different from θ-plus-2-π times n. There might be a whole ring of powers. This seems … hard to work with, at least. But it’s, at heart, the same problem you get thinking about the square root of 4 and concluding it’s both plus 2 and minus 2. If you want “the” square root, you’d like it to be a single number. At least if you want to calculate anything from it. You have to pick out a preferred θ from the family of possible candidates.
For me, that’s whatever set of coordinates has ‘r’ that’s positive (or zero), and that has ‘θ’ between -π and π. Or between 0 and 2π. It could be any strip of numbers that’s 2π wide. Pick what makes sense for the problem you’re doing. It’s going to be the strip from -π to π. Perhaps the strip from 0 to 2π.
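The square-root-of-4 problem shows up plainly if you let the angle wander. In this sketch the helper names are mine, and the number 4 is only the example from the paragraph above. The same point gets described with θ of 0 and θ of 2π, and the two descriptions give different “square roots”.

```python
import cmath
import math

def point(r, theta):
    # The complex number with polar coordinates r and theta.
    return cmath.rect(r, theta)

def root_from_angle(r, theta, n):
    # The n-th root you get from one particular choice of angle.
    return cmath.rect(r ** (1 / n), theta / n)

# Two descriptions of the same point, the number 4.
same_point_gap = abs(point(4.0, 0.0) - point(4.0, 2 * math.pi))
print("distance between the two descriptions:", same_point_gap)

# The square roots that come out of each description.
root_a = root_from_angle(4.0, 0.0, 2)
root_b = root_from_angle(4.0, 2 * math.pi, 2)
print("root from theta = 0:   ", root_a)
print("root from theta = 2*pi:", root_b)

# cmath.phase always reports the angle in the strip from -pi to pi.
print("preferred theta for 1 - i:", cmath.phase(1 - 1j))
```

Picking one strip of angles, as `cmath.phase` does, is what lets “the” root be a single number.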
What this all amounts to is that I can turn this:
without changing its meaning any. Raising a number to the one-over-x power looks different from raising it to the n power. But the work isn’t different. The function I wrote out up there is the same as this function:
$$f(x) = -4 \imath x \left\{ \sqrt{2}^{\frac{1}{x}} e^{\imath \frac{\pi}{4x}} - \sqrt{2}^{\frac{1}{x}} e^{-\imath \frac{\pi}{4x}} \right\}$$
I can’t look at that number, $\sqrt{2}^{\frac{1}{x}}$, sitting there, multiplied by two things added together, and leave that. (OK, subtracted, but same thing.) I want to something something distributive law something and that gets us here:
$$f(x) = -4 \imath x \sqrt{2}^{\frac{1}{x}} \left\{ e^{\imath \frac{\pi}{4x}} - e^{-\imath \frac{\pi}{4x}} \right\}$$
Also, yeah, that square root of two raised to a power looks weird. I can turn that square root of two into “two to the one-half power”. That gets to this rewrite:
$$f(x) = -4 \imath x 2^{\frac{1}{2x}} \left\{ e^{\imath \frac{\pi}{4x}} - e^{-\imath \frac{\pi}{4x}} \right\}$$
And then. Those parentheses. e raised to an imaginary number minus e raised to minus-one-times that same imaginary number. This is another one of those magic tricks that mathematicians know because they see it all the time. Part of what we know from Euler’s Formula, the one I waved at back when I was talking about coordinates, is this:
$$\sin(\phi) = \frac{e^{\imath \phi} - e^{-\imath \phi}}{2\imath}$$
That’s good for any real-valued φ. For example, it’s good for the number $\frac{\pi}{4x}$. And that means we can rewrite that function into something that, finally, actually looks a little bit simpler. It looks like this:
And that’s the function whose limit I want to take at ∞. No, really.
I ran out of time to do my next bit on Wronski’s attempted definition of π. Next week, all goes well. But I have something to share anyway. The author of the Boxing Pythagoras blog was intrigued by the starting point. And as a fan of studying how people understand infinity and infinitesimals (and how they don’t), this two-century-old example of mixing the numerous and the tiny set his course.
For example, can we speak of a number that’s larger than zero, but smaller than the reciprocal of any positive integer? It’s hard to imagine such a thing. But what if we can show that if we suppose such a number exists, then we can do this logically sound work with it? If you want to say that isn’t enough to show a number exists, then I have to ask how you know imaginary numbers or negative numbers exist.
Standard analysis, you probably guessed, doesn’t do that. It developed over the 19th century when the logical problems of these kinds of numbers seemed unsolvable. Mostly that’s done by limits, showing that a thing must be true whenever some quantity is small enough, or large enough. It seems safe to trust that the infinitesimally small is small enough, and the infinitely large is large enough. And it’s not like mathematicians back then were bad at their job. Mathematicians learned a lot of things about how infinitesimals and infinities work over the late 19th and early 20th century. It makes modern work possible.
Anyway, Boxing Pythagoras goes over what a non-standard analysis treatment of the formula suggests. I think it’s accessible even if you haven’t had much non-standard analysis in your background. At least it worked for me and I haven’t had much of the stuff. I think it’s also accessible if you’re good at following logical argument and won’t be thrown by Greek letters as variables. Most of the hard work is really arithmetic with funny letters. I recommend going and seeing if he did get to π.
A couple weeks ago I shared a fascinating formula for π. I got it from Carl B Boyer’s The History of Calculus and its Conceptual Development. He got it from Józef Maria Hoëne-Wronski, early 19th-century Polish mathematician. His idea was that an absolute, culturally-independent definition of π would come not from thinking about circles and diameters but rather this formula:
$$\pi = \frac{4\infty}{\sqrt{-1}} \left\{ \left(1 + \sqrt{-1}\right)^{\frac{1}{\infty}} - \left(1 - \sqrt{-1}\right)^{\frac{1}{\infty}} \right\}$$
Now, this formula is beautiful, at least to my eyes. It’s also gibberish. At least it’s ungrammatical. Mathematicians don’t like to write stuff like “four times infinity”, at least not as more than a rough draft on the way to a real thought. What does it mean to multiply four by infinity? Is arithmetic even a thing that can be done on infinitely large quantities? Among Wronski’s problems is that they didn’t have a clear answer to this. We’re a little more advanced in our mathematics now. We’ve had a century and a half of rather sound treatment of infinitely large and infinitely small things. Can we save Wronski’s work?
Start with the easiest thing. I’m offended by those $\sqrt{-1}$ bits. Well, no, I’m more unsettled by them. I would rather have $\imath$ in there. The difference? … More taste than anything sound. I prefer, if I can get away with it, using the square root symbol to mean the positive square root of the thing inside. There is no positive square root of -1, so, pfaugh, away with it. Mere style? All right, well, how do you know whether those $\sqrt{-1}$ terms are meant to be $\imath$ or its additive inverse, $-\imath$? How do you know they’re all meant to be the same one? See? … As with all style preferences, it’s impossible to be perfectly consistent. I’m sure there are times I accept a big square root symbol over a negative or a complex-valued quantity. But I’m not forced to have it here so I’d rather not. First step:
$$\pi = \frac{4\infty}{\imath} \left\{ \left(1 + \imath\right)^{\frac{1}{\infty}} - \left(1 - \imath\right)^{\frac{1}{\infty}} \right\}$$
Also dividing by $\imath$ is the same as multiplying by $-\imath$ so the second easy step gives me:
$$\pi = -4 \imath \infty \left\{ \left(1 + \imath\right)^{\frac{1}{\infty}} - \left(1 - \imath\right)^{\frac{1}{\infty}} \right\}$$
Now the hard part. All those infinities. I don’t like multiplying by infinity. I don’t like dividing by infinity. I really, really don’t like raising a quantity to the one-over-infinity power. Most mathematicians don’t. We have a tool for dealing with this sort of thing. It’s called a “limit”.
Mathematicians developed the idea of limits over … well, since they started doing mathematics. In the 19th century limits got sound enough that we still trust the idea. Here’s the rough way it works. Suppose we have a function which I’m going to name ‘f’ because I have better things to do than give functions good names. Its domain is the real numbers. Its range is the real numbers. (We can define functions for other domains and ranges, too. Those definitions look like what they do here.)
I’m going to use ‘x’ for the independent variable. It’s any number in the domain. I’m going to use ‘a’ for some point. We want to know the limit of the function “at a”. ‘a’ might be in the domain. But — and this is genius — it doesn’t have to be. We can talk sensibly about the limit of a function at some point where the function doesn’t exist. We can say “the limit of f at a is the number L”. I hadn’t introduced ‘L’ into evidence before, but … it’s a number. It has some specific set value. Can’t say which one without knowing what ‘f’ is and what its domain is and what ‘a’ is. But I know this about it.
Pick any error margin that you like. Call it ε because mathematicians do. However small this (positive) number is, there’s at least one neighborhood in the domain of ‘f’ that surrounds ‘a’. Check every point in that neighborhood other than ‘a’. The value of ‘f’ at all those points in that neighborhood other than ‘a’ will be larger than L – ε and smaller than L + ε.
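Written out in symbols, that paragraph is the usual ε-δ statement. This is a sketch in standard textbook notation. The name δ, for the half-width of the neighborhood around ‘a’, is an assumption added here; the essay itself only ever calls it a neighborhood or a strip.

```latex
% "The limit of f at a is L" means:
\[
  \text{for every } \varepsilon > 0 \text{ there is some } \delta > 0 \text{ such that, for } x \text{ in the domain of } f,
\]
\[
  0 < |x - a| < \delta \quad\Longrightarrow\quad L - \varepsilon < f(x) < L + \varepsilon .
\]
% The condition 0 < |x - a| is what excludes the point a itself.
```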
Yeah, pause a bit there. It’s a tricky definition. It’s a nice common place to crash hard in freshman calculus. Also again in Intro to Real Analysis. It’s not just you. Perhaps it’ll help to think of it as a kind of mutual challenge game. Try this.
You draw whatever error bar, as big or as little as you like, around ‘L’.
But I always respond by drawing some strip around ‘a’.
You then pick absolutely any ‘x’ inside my strip, other than ‘a’.
Is f(x) always within the error bar you drew?
Suppose f(x) is. Suppose that you can pick any error bar however tiny, and I can answer with a strip however tiny, and every single ‘x’ inside my strip has an f(x) within your error bar … then, L is the limit of f at a.
Again, yes, tricky. But mathematicians haven’t found a better definition that doesn’t break something mathematicians need.
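The challenge game can be played numerically, at least as a spot check. Here is a minimal sketch, assuming an example the essay does not use: f(x) = sin(x)/x, which does not exist at a = 0 but has limit L = 1 there. The function names and the choice of strip are hypothetical. The strip width leans on the known bound |sin(x)/x − 1| ≤ x²/6. Sampling finitely many points proves nothing, of course; it only fails to find a counterexample.

```python
import math

A = 0.0  # the point where we take the limit; f is not defined here
L = 1.0  # the claimed limit


def f(x):
    """Example function (an assumption for illustration): sin(x)/x."""
    return math.sin(x) / x


def my_strip(eps):
    """My response to your error bar: the half-width of a strip around A.

    Uses the bound |sin(x)/x - 1| <= x**2 / 6, so any x with
    x**2 / 6 < eps is safe.
    """
    return math.sqrt(6.0 * eps)


def strip_survives(eps, samples=1000):
    """You pick points inside my strip, other than A; check each one."""
    delta = my_strip(eps)
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)  # 0 < offset < delta
        for x in (A - offset, A + offset):
            if not (L - eps < f(x) < L + eps):
                return False  # you found a point that escapes the error bar
    return True


for eps in (1.0, 0.1, 1e-3, 1e-6):
    print(eps, my_strip(eps), strip_survives(eps))
```

Shrinking the error bar shrinks the strip I answer with, but there is always a strip to answer with. That is the whole content of the definition.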
To write “the limit of f at a is L” we use the notation:

$$\lim_{x \to a} f(x) = L$$
The ‘lim’ part probably makes perfect sense. And you can see where ‘f’ and ‘a’ have to enter into it. ‘x’ here is a “dummy variable”. It’s the falsework of the mathematical expression. We need some name for the independent variable. It’s clumsy to do without. But it doesn’t matter what the name is. It’ll never appear in the answer. If it does then the work went wrong somewhere.
What I want to do, then, is turn all those appearances of ‘∞’ in Wronski’s expression into limits of something at infinity. And having just said what a limit is I have to do a patch job. In that talk about the limit at ‘a’ I talked about a neighborhood containing ‘a’. What’s it mean to have a neighborhood “containing ∞”?
The answer is exactly what you’d think if you got this question and were eight years old. The “neighborhood of infinity” is “all the big enough numbers”. To make it rigorous, it’s “all the numbers bigger than some finite number that let’s just call N”. So you give me an error bar around ‘L’. I’ll give you back some number ‘N’. Every ‘x’ that’s bigger than ‘N’ has f(x) inside your error bars. And note that I don’t have to say what ‘f(∞)’ is or even commit to the idea that such a thing can be meaningful. I only ever have to think directly about values of ‘f(x)’ where ‘x’ is some real number.
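In symbols, the patch job looks like this. It is a sketch in standard notation, and the worked example, f(x) = 1/x, is an assumption chosen for illustration rather than anything from Wronski.

```latex
% "The limit of f at infinity is L" means:
\[
  \lim_{x \to \infty} f(x) = L
  \quad\text{when, for every } \varepsilon > 0, \text{ there is a finite } N \text{ such that }
  x > N \implies |f(x) - L| < \varepsilon .
\]
% Illustrative example: f(x) = 1/x with L = 0.
% Given an error bar epsilon, answer with N = 1/epsilon:
\[
  x > \frac{1}{\varepsilon} \implies 0 < \frac{1}{x} < \varepsilon
  \implies \left| \frac{1}{x} - 0 \right| < \varepsilon .
\]
% Nowhere did we need to say what f(\infty) is.
```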
So! First, let me rewrite Wronski’s formula as a function, defined on the real numbers. Then I can replace each ∞ with the limit of something at infinity and … oh, wait a minute. There’s three ∞ symbols there. Do I need three limits?
Ugh. Yeah. Probably. This can be all right. We can do multiple limits. This can be well-defined. It can also be a right pain. The challenge-and-response game needs a little modifying to work. You still draw error bars. But I have to draw multiple strips. One for each of the variables. And every combination of values inside all those strips has to give an ‘f’ that’s inside your error bars. There’s room for great mischief. You can arrange combinations of variables that look likely to break ‘f’ outside the error bars.
So. Three independent variables, all taking a limit at ∞? That’s not guaranteed to be trouble, but I’d expect trouble. At least I’d expect something to keep the limit from existing. That is, we could find there’s no number ‘L’ so that this drawing-neighborhoods thing works for all three variables at once.
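Let’s try. One of the ∞ will be a limit of a variable named ‘x’. One of them a variable named ‘y’. One of them a variable named ‘z’. Then:

$$f(x, y, z) = -4 i x \left\{ \left(1 + i\right)^{\frac{1}{y}} - \left(1 - i\right)^{\frac{1}{z}} \right\}$$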
Without doing the work, my hunch is: this is utter madness. I expect it’s probably possible to make this function take on many wildly different values by the judicious choice of ‘x’, ‘y’, and ‘z’. Particularly ‘y’ and ‘z’. You maybe see it already. If you don’t, you maybe see it now that I’ve said you maybe see it. If you don’t, I’ll get there, but not in this essay. But let’s suppose that it’s possible to make f(x, y, z) take on wildly different values like I’m getting at. This implies that there’s not any limit ‘L’, and therefore Wronski’s work is just wrong.
Thing is, Wronski wouldn’t have thought that. Deep down, I am certain, he thought the three appearances of ∞ were the same “value”. And that to translate him fairly we’d use the same name for all three appearances. So I am going to do that. I shall use ‘x’ as my variable name, and replace all three appearances of ∞ with the same variable and a common limit. So this gives me the single function:

$$f(x) = -4 i x \left\{ \left(1 + i\right)^{\frac{1}{x}} - \left(1 - i\right)^{\frac{1}{x}} \right\}$$
And then I need to take the limit of this at ∞. If Wronski is right, and if I’ve translated him fairly, it’s going to be π. Or something easy to get π from.
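Before doing any real work, it costs little to peek at the numbers. This is a rough sketch, not a proof of anything. It assumes the single-variable reading described above, with every ∞ replaced by the same real ‘x’, giving −4i·x·((1 + i)^(1/x) − (1 − i)^(1/x)). It also assumes Python’s built-in complex power, which takes the principal branch; Wronski never said which branch he meant. The function name is hypothetical.

```python
import math


def wronski_f(x):
    """Single-variable reading of Wronski's expression, for real x > 0.

    Every appearance of infinity is replaced by the same x.
    Complex powers use Python's principal branch (an assumption).
    """
    i = 1j
    return -4 * i * x * ((1 + i) ** (1 / x) - (1 - i) ** (1 / x))


# Evaluate at larger and larger x and see whether the values settle down.
for exponent in range(1, 7):
    x = 10.0 ** exponent
    value = wronski_f(x)
    print(
        f"x = {x:>9.0f}  "
        f"f(x) = {value.real:.8f} {value.imag:+.1e}i  "
        f"f(x)/pi = {value.real / math.pi:.8f}"
    )
```

The last column is the one to watch. If the values settle toward a fixed multiple of π, that is encouraging, though a table of numbers is only a hint about a limit and never the limit itself.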
I’ve been reading Carl B Boyer’s The History of the Calculus and its Conceptual Development. It’s been slow going, because reading about how calculus’s ideas developed is hard. The ideas underlying it are subtle to start with. And the ideas have to be discussed using vague, unclear definitions. That’s not because dumb people were making arguments. It’s because these were smart people studying ideas at the limits of what we understood. When we got clear definitions we had the fundamentals of calculus understood. (By our modern standards. The future will likely see us as accepting strange ambiguities.) And I still think Boyer whiffs the discussion of Zeno’s Paradoxes in a way that mathematics and science-types usually do. (The trouble isn’t imagining that infinite series can converge. The trouble is that things are either infinitely divisible or they’re not. Either way implies things that seem false.)
Anyway. Boyer got to a part about the early 19th century. This was when mathematicians were discovering infinities and infinitesimals are amazing tools. Also that mathematicians should maybe learn if they follow any rules. Because you can just plug symbols into formulas and grind out what looks like they might mean and get answers. Sometimes this works great. Grind through the formulas for solving cubic polynomials as though square roots of negative numbers make sense. You get good results. Later, we worked out a coherent scheme of “complex-valued numbers” that justified it all. We can get lucky with infinities and infinitesimals, sometimes.
And this brought Boyer to an argument made by Józef Maria Hoëne-Wronski. He was a Polish mathematician whose fantastic ambition in … everything … didn’t turn out many useful results. Algebra, the Longitude Problem, building a rival to the railroad, even the Kosciuszko Uprising, none quite panned out. (And that’s not quite his name. The ‘n’ in ‘Wronski’ should have an acute mark over it. But WordPress’s HTML engine doesn’t want to imagine such a thing exists. Nor do many typesetters writing calculus or differential equations books, Boyer’s included.)
But anyone who studies differential equations knows his name, for a concept called the Wronskian. It’s a matrix determinant that anyone who studies differential equations hopes they won’t ever have to do after learning it. And, says Boyer, Wronski had this notion for an “absolute meaning of the number π”. (By “absolute” Wronski means one that is not drawn from cultural factors like the weird human interest in circle perimeters and diameters. Compare it to the way we speak of “absolute temperature”, where the zero means something not particular to western European weather.)
I will admit I’m not fond of “real” alternate definitions of π. They seem to me mostly to signal how clever the definition-originator is. The only one I like at all defines π as the smallest positive root of the solution to the simple-harmonic-motion differential equation. (With the right starting conditions and all that.) And I’m not sure that isn’t “circumference over diameter” in a hidden form.
Wronski’s, as Boyer quotes it:

$$\pi = \frac{4\infty}{\sqrt{-1}}\left\{ \left(1 + \sqrt{-1}\right)^{\frac{1}{\infty}} - \left(1 - \sqrt{-1}\right)^{\frac{1}{\infty}} \right\}$$
And yes, that definition is a mess of early-19th-century wild, untamed casualness in the use of symbols. But I admire the crazypants beauty of it. If I ever get a couple free hours I should rework it into something grammatical. And then see if, turned into something tolerable, Wronski’s idea is something even true.
Boyer allows that “perhaps” because of the strange notation and “bizarre use of the symbol ∞” Wronski didn’t make much headway on this point. I can’t fault people for looking at that and refusing to go further. But isn’t it enchanting as it is?
We come now almost to the end of the Summer 2017 A To Z. Possibly also the end of all these A To Z sequences. Gaurish, of For the love of Mathematics, proposed that I talk about the obvious logical choice. The last promising thing I hadn’t talked about. I have no idea what to do for future A To Z’s, if they’re even possible anymore. But that’s a problem for some later time.
Some good advice that I don’t always take. When starting a new problem, make a list of all the things that seem likely to be relevant. Problems that are worth doing are usually about things. They’ll be quantities like the radius or volume of some interesting surface. The amount of a quantity under consideration. The speed at which something is moving. The rate at which that speed is changing. The length something has to travel. The number of nodes something must go across. Whatever. This all sounds like stuff from story problems. But most interesting mathematics is from a story problem; we want to know what this property is like. Even if we stick to a purely mathematical problem, there’s usually a couple of things that we’re interested in and that we describe. If we’re attacking the four-color map theorem, we have the number of territories to color. We have, for each territory, the number of territories that touch it.
Next, select a name for each of these quantities. Write it down, in the table, next to the term. The volume of the tank is ‘V’. The radius of the tank is ‘r’. The height of the tank is ‘h’. The fluid is flowing in at a rate ‘r’. The fluid is flowing out at a rate, oh, let’s say ‘s’. And so on. You might take a moment to go through and think out which of these variables are connected to which other ones, and how. Volume, for example, is surely something to do with the radius times something to do with the height. It’s nice to have that stuff written down. You may not know the thing you set out to solve, but you at least know you’ve got this under control.
I recommend this. It’s a good way to organize your thoughts. It establishes what things you expect you could know, or could want to know, about the problem. It gives you some hint how these things relate to each other. It sets you up to think about what kinds of relationships you figure to study when you solve the problem. It gives you a lifeline, when you’re lost in a sea of calculation. It’s reassurance that these symbols do mean something. Better, it shows what those things are.
I don’t always do it. I have my excuses. If I’m doing a problem that’s very like one I’ve already recently done, the things affecting it are probably the same. The names to give these variables are probably going to be about the same. Maybe I’ll make a quick sketch to show how the parts of the problem relate. If it seems like less work to recreate my thoughts than to write them down, I skip writing them down. Not always good practice. I tell myself I can always go back and do things the fully right way if I do get lost. So far that’s been true.
So, the names. Suppose I am interested in, say, the length of the longest rod that will fit around this hallway corridor. Then I am in a freshman calculus book, yes. Fine. Suppose I am interested in whether this pinball machine can be angled up the flight of stairs that has a turn in it. Then I will measure things like the width of the pinball machine. And the width of the stairs, and of the landing. I will measure this carefully. Pinball machines are heavy and there are many hilarious sad stories of people wedging them into hallways and stairwells four and a half stories up from the street. But: once I have identified, say, ‘width of pinball machine’ as a quantity of interest, why would I ever refer to it as anything but?
This is no dumb question. It is always dangerous to lose the link between the thing we calculate and the thing we are interested in. Without that link we are less able to notice mistakes in either our calculations or the thing we mean to calculate. Without that link we can’t do a sanity check, that reassurance that it’s not plausible we just might fit something 96 feet long around the corner. Or that we estimated that we could fit something of six square feet around the corner. It is common advice in programming computers to always give variables meaningful names. Don’t write ‘T’ when ‘Total’ or, better, ‘Total_Value_Of_Purchase’ is available. Why do we disregard this in mathematics, and switch to ‘T’ instead?
First reason is, well, try writing this stuff out. Your hand (h) will fall off (foff) in about fifteen minutes, twenty seconds. (15′ 20″). If you’re writing a program, the programming environment you have will auto-complete the variable after one or two letters in. Or you can copy and paste the whole name. It’s still good practice to leave a comment about what the variable should represent, if the name leaves any reasonable ambiguity.
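To make the programming side of that advice concrete, here is a small sketch of the same calculation written both ways. Everything in it is hypothetical: the function names, the parameters, and the idea that the purchase involves a sales tax are all made up for illustration. Only the contrast between ‘T’ and ‘Total_Value_Of_Purchase’ comes from the text above.

```python
def t(p, q, r):
    # Terse version: correct, but the reader has to guess what p, q, r mean.
    return p * q * (1 + r)


def total_value_of_purchase(unit_price, quantity, sales_tax_rate):
    """Same arithmetic, with names that say what each quantity is.

    sales_tax_rate is a fraction, not a percentage: 0.06 means six percent.
    That is the sort of ambiguity a name alone does not settle, so it
    gets a comment.
    """
    subtotal = unit_price * quantity
    return subtotal * (1 + sales_tax_rate)


print(t(2.0, 3, 0.5))
print(total_value_of_purchase(unit_price=2.0, quantity=3, sales_tax_rate=0.5))
```

Both functions compute the same number. Only the second lets you do the sanity check of asking whether the inputs you passed were the right kind of thing.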
Another reason is that sure, we do specific problems for specific cases. But a mathematician is naturally drawn to thinking of general problems, in abstract cases. We see something in common between the problem “a length and a quarter of the length is fifteen feet; what is the length?” and the problem “a volume plus a quarter of the volume is fifteen gallons; what is the volume?”. That one is about lengths and the other about volumes doesn’t concern us. We see a saving in effort by separating the quantity of a thing from the kind of the thing. This restores danger. We must think, after we are done calculating, about whether the answer could make sense. But we can minimize that, we hope. At the least we can check once we’re done to see if our answer makes sense. Maybe even whether it’s right.
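Here is the shared skeleton of those two problems, worked once. The choice of ‘x’ for the unknown is an assumption made for the sketch; the point is that the arithmetic never asks whether ‘x’ is a length or a volume.

```latex
% "A quantity plus a quarter of the quantity is fifteen."
\[
  x + \frac{x}{4} = 15
  \quad\Longrightarrow\quad
  \frac{5x}{4} = 15
  \quad\Longrightarrow\quad
  x = 15 \cdot \frac{4}{5} = 12 .
\]
% Check: 12 + 12/4 = 12 + 3 = 15.
```

Only after the calculation do the units come back: twelve feet for the first problem, twelve gallons for the second. That is the moment to ask whether the answer could make sense.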
For centuries, as the things we now recognize as algebra developed, we would use words. We would talk about the “thing” or the “quantity” or “it”. Some impersonal name, or convenient pronoun. This would often get shortened because anything you write often you write shorter. “Re”, perhaps. In the late 16th century we start to see the “New Algebra”. Here mathematics starts looking like … you know … mathematics. We start to see stuff like “addition” represented with the + symbol instead of an abbreviation for “addition” or a p with a squiggle over it or some other shorthand. We get equals signs. You start to see decimals and exponents. And we start to see letters used in place of numbers whose value we don’t know.
There are a couple kinds of “numbers whose value we don’t know”. One is the number whose value we don’t know, but hope to learn. This is the classic variable we want to solve for. Another kind is the number whose value we don’t know because we don’t care. I mean, it has some value, and presumably it doesn’t change over the course of our problem. But it’s not like our work will be so different if, say, the tank is two feet high rather than four.
Is there a problem? If we pick our letters to fit a specific problem, no. Presumably all the things we want to describe have some clear name, and some letter that best represents the name. It’s annoying when we have to consider, say, the pinball machine width and the corridor width. But we can work something out.
But what about general problems?
Is an easy problem to solve?
If we want to figure what ‘m’ is, yes. Similarly ‘y’. If we want to know what ‘b’ is, it’s tedious, but we can do that. If we want to know what ‘e’ is? Run and hide, that stuff is crazy. If you have to, do it numerically and accept an estimate. Don’t try figuring what that is.
And so we’ve developed conventions. There are some letters that, except in weird circumstances, are coefficients. They’re numbers whose value we don’t know, but either don’t care about or could look up. And there are some that, by default, are variables. They’re the ones whose value we want to know.
These conventions started forming, as mentioned, in the late 16th century. François Viète here made a name that lasts to mathematics historians at least. His texts described how to do algebra problems in the sort of procedural methods that we would recognize as algebra today. And he had a great idea for these letters. Use the whole alphabet, if needed. Use the consonants to represent the coefficients, the numbers we know but don’t care what they are. Use the vowels to represent the variables, whose values we want to learn. So he would look at that equation and see right away: it’s a terrible mess. (I exaggerate. He doesn’t seem to have known the = sign, and I don’t know offhand when ‘log’ and ‘cos’ became common. But suppose the rest of the equation were translated into his terminology.)
It’s not a bad approach. Besides the mnemonic value of consonant-coefficient, vowel-variable, it’s true that we usually have fewer variables than anything else. The more variables in a problem the harder it is. If someone expects you to solve an equation with ten variables in it, you’re excused for refusing. So five or maybe six or possibly seven choices for variables is plenty.
But it’s not what we settled on. René Descartes had a better idea. He had a lot of them, but here’s one. Use the letters at the end of the alphabet for the unknowns. Use the letters at the start of the alphabet for coefficients. And that is, roughly, what we’ve settled on. In my example nightmare equation, we’d suppose ‘y’ to probably be the variable we want to solve for.
And so, and finally, x. It is almost the variable. It says “mathematics” in only two strokes. Even π takes more writing. Descartes used it. We follow him. It’s way off at the end of the alphabet. It starts few words, very few things, almost nothing we would want to measure. (Xylem … mass? Flow? What thing is the xylem anyway?) Even mathematical dictionaries don’t have much to say about it. The letter transports almost no connotations, no messy specific problems to it. If it suggests anything, it suggests the horizontal coordinate in a Cartesian system. It almost is mathematics. It signifies nothing in itself, but long use has given it an identity as the thing we hope to learn by study.
And pirate treasure maps. I don’t know when ‘X’ became the symbol of where to look for buried treasure. My casual reading suggests “never”. Treasure maps don’t really exist. Maps in general don’t work that way. Or at least didn’t before cartoons. X marking the spot seems to be the work of Robert Louis Stevenson, renowned for creating a fanciful map and then putting together a book to justify publishing it. (I jest. But according to Simon Garfield’s On The Map: A Mind-Expanding Exploration of the Way The World Looks, his map did get lost on the way to the publisher, and he had to re-create it from studying the text of Treasure Island. This delights me to no end.) It makes me wonder if Stevenson was thinking of x’s service in mathematics. But the advantages of x as a symbol are hard to ignore. It highlights a point clearly. It’s fast to write. Its use might be coincidence.
But it is a letter that does a needed job really well.
The slightest thing I learned in the most recent set of essays is that I somehow slid from the descriptive “End Of 2016” title to the prescriptive “End 2016” identifier for the series. My unscientific survey suggests that most people would agree that we had too much 2016 and would have been better off doing without it altogether. So it goes.
The most important thing I learned about this is I have to pace things better. The A To Z essays have been creeping up in length. I didn’t keep close track of their lengths but I don’t think any of them came in under a thousand words. 1500 words was more common. And that’s fine enough, but at three per week, plus the Reading the Comics posts, that’s 5500 or 6000 words of mathematics alone. And that before getting to my humor blog, which even on a brief week will be a couple thousand words. I understand in retrospect why November and December felt like I didn’t have any time outside the word mines.
I’m not bothered by writing longer essays, mind. I can apparently go on at any length on any subject. And I like the words I’ve been using. My suspicion is between these A To Zs and the Theorem Thursdays over the summer I’ve found a mode for writing pop mathematics that works for me. It’s just a matter of how to balance workloads. The humor blog has gotten consistently better readership, for the obvious reasons (lately I’ve been trying to explain what the story comics are doing), but the mathematics is more satisfying. If I should have to cut back on either it’d be the humor blog that gets the cut first.
Another little discovery is that I can swap out equations and formulas and the like for historical discussion. That’s probably a useful tradeoff for most of my readers. And it plays to my natural tendencies. It is very easy to imagine me having gone into history rather than into mathematics or science. It makes me aware how mediocre my knowledge of mathematics history is, though. For example, several times in the End 2016 A To Z the Crisis of Foundations came up, directly or in passing. But I’ve never read a proper history, not even a basic essay, about the Crisis. I don’t even know of a good description of this important-to-the-field event. Most mathematics history focuses around biographies of a few figures, often cribbed from Eric Temple Bell’s great but unreliable book, or a couple of famous specific incidents. (Newton versus Leibniz, the bridges of Königsberg, Cantor’s insanity, Gödel’s citizenship exam.) Plus Bourbaki.
That’s not enough for someone taking the subject seriously, and I do mean to. So if someone has a suggestion for good histories of, for example, how Fourier series affected mathematicians’ understanding of what functions are, I’d love to know it. Maybe I should set that as a standing open request.
In looking over the subjects I wrote about I find a pretty strong mix of group theory and real analysis. Maybe that shouldn’t surprise. Those are two of the maybe three legs that form a mathematics major’s education. So anyone wanting to understand mathematicians would see this stuff and have questions about it. (There are more things mathematics majors learn, but there are a handful of things almost any mathematics major is sure to spend a year being baffled by.)
The third leg, I’d say, is differential equations. That’s a fantastic field, but it’s hard to describe without equations. Also pictures of what the equations imply. I’ve tended towards essays with few equations and pictures. That’s my laziness. Equations are best written in LaTeX, a typesetting tool that might as well be the standard for mathematicians writing papers and books. While WordPress supports a bit of LaTeX it isn’t quite effortless. That comes back around to balancing my workload. I do that a little better and I can explain solving first-order differential equations by integrating factors. (This is a prank. Nobody has ever needed to solve a first-order differential equation by integrating factors except for mathematics majors being taught the method.) But maybe I could make a go of that.
I’m not setting any particular date for the next A-To-Z, or similar, project. I need some time to recuperate. And maybe some time to think of other running projects that would be fun or educational for me. There’ll be something, though.