Giving a good mathematics talk

Last week, Tadashi Tokieda came to Chicago to give a colloquium. If you have seen him speak, you will not be surprised to learn that it was an absolutely delightful talk. I carried the talk around with me in my mind for many days afterwards, not only for its content, but also with the nagging question: how can I make my own talks better?

I think it’s very easy to feel that our subject (mathematics) is so technical that no talk can both convey depth and yet be accessible. But then why did this talk feel otherwise? There is certainly room for carving out exceptions, and arguing that some problems in mathematics are easier to explain than others, but that does seem to me to be making excuses.

I have always felt that conveying the idea behind a proof is more important than the details. Almost all results in mathematics are special cases (of special cases) of more general problems, and many talks start with the implicit assumption that the audience is not only aware of these special cases but also understands why one should care about them, and sees how they fit into the broader picture of mathematics. But that is rarely the case. Sometimes it’s even the case that the speaker themselves doesn’t really seem to have a bigger picture of what they are doing.

I want listeners in my talks to come away with some satisfying idea in their mind of what is going on. I feel that, as time goes on, this has led me to give softer and softer talks. I think this has sometimes worked, but it’s certainly not perfect. I think it may have been Kai Wen who spoke to me after I gave a colloquium style talk at Barry Mazur’s 80th birthday conference and said he was a little sad because he was hoping to learn something about the details of my paper in the talk. This was a perfectly valid complaint, although one has to accept that in any talk you have to disappoint some people. But maybe that’s just another excuse.

I’ve seen Manjul Bhargava give some wonderful talks which made me feel like I understood everything. But that feeling dissipated on closer inspection. More concretely, there are a number of technical issues concerning lattice points near cusps of locally symmetric spaces where some key technical steps take place, but those parts of the story never really get top billing during the talk. But this is not a criticism! Not everything can be explained in an hour, and in Manjul’s case there were plenty of other new insights which were easier to convey in a talk setting.

I used to think Henri Darmon was an amazing speaker. But then I once saw Jan Vonk give a great talk on some joint work with Darmon and I started to think: “well maybe Henri just does great math and that’s why his talks are so good” and then I wasn’t sure. But at the very least it’s easier to give a better talk when the mathematics is more interesting.

Perhaps I did at least come to one conclusion from thinking about Tadashi’s talk that I feel I can take away with me. His talk was delivered in a very easy going intuitive manner which I also strive (in my way) to reach. But another thing that is very clear is that his talk was also exceptionally well prepared. I often spend a lot of time thinking about a talk before I give it; but I think I have been most successful when my preparation actually involves deciding exactly (more or less) what I am going to say in advance. I feel that using Beamer is helpful in this way by constraining in advance the direction of the talk. Another lesson I have learnt is that when one wants to give some broad brushstrokes about some technical subject it is tempting to be vague; but actually there is often more clarity in being precise. In practice, this means that instead of being vague about some complicated argument, one can often be very precise about some baby version of the same argument, and still get across some sense of the methods involved.

In the end, talks can be good for different people in many different ways, but for most of us there are not many people who are going to read our papers in any detail, and a talk is one of the few opportunities we get to really communicate our ideas to others. So I think reflecting on how to give better talks is time well spent.


The Arthurian Legend

Some time back, Kevin Buzzard (friend of the blog!) gave a series of talks in which he criticized certain aspects of the mathematical culture when it came to accepting proof. In addition to obvious targets like the classification of finite simple groups, he took aim at my paper with Boxer, Gee, and Pilloni, and in particular this passage:

It should be noted that we use Arthur’s multiplicity formula for the discrete spectrum of \(\mathrm{GSp}_4\), as announced in [Art04]. A proof of this (relying on Arthur’s work for symplectic and orthogonal groups in [Art13]) was given in [GT19], but this proof is only as unconditional as the results of [Art13] and [MW16a,MW16b]. In particular, it depends on cases of the twisted weighted fundamental lemma that were announced in [CL10], but whose proofs have not yet appeared, as well as on the references [A24], [A25], [A26] and [A27] in [Art13], which at the time of writing have not appeared publicly

Kevin asks (of this passage): “Can we honestly say that this is science?” I am certainly broadly sympathetic to Kevin’s concerns; I would say the inclusion of these remarks in the paper is some evidence of that. One interesting remark is that Kevin chose to highlight our paper rather than discuss the existential state of Arthur’s preprints.

It is now six years since our preprint was first posted, and no preprints from Arthur have been forthcoming. I have felt a growing responsibility to address the issue directly on this blog. But in what form, exactly? Something of the form of this post on ABC? In the last year I have seen talks explaining how there exist a number of genuine difficulties in carrying out Arthur’s proposed strategy. That strategy involves an inductive argument where the argument for one group might reduce to a claim for a group of much higher rank, and obviously this requires some finesse to avoid any circular argument.

What I ultimately decided was suitable was that the focus of such a post should not be on criticizing Arthur, but on emphasizing that anyone who does step up to the plate and resolve these outstanding issues really needs to be recognized for their original contributions. This is not a situation in which “the experts know how to do this”; it is a situation where the position shifted from “Arthur has done this” to “Arthur knows how to do this”, which gradually evolved to “Arthur has explained a strategy to do this, but this strategy appears to require overcoming serious obstacles”.

But fortunately, the situation has now changed, very much for the better. In a recent preprint by Hiraku Atobe, Wee Teck Gan, Atsushi Ichino, Tasho Kaletha, Alberto Mínguez, and Sug Woo Shin, all the promised results of Arthur’s missing papers have now been supplied. So instead, I can focus on emphasizing that this new result is a monumental achievement, and that it should be appreciated by the community as the genuine original contribution that it is. Let me add that the authors are incredibly gracious to Arthur and nobody is trying to take away from Arthur’s absolutely key fundamental contributions. But at the same time, that should not detract from our appreciation of this new work.

To return to Kevin’s question of “is this science”, there still is, unfortunately, one remaining caveat. Namely, there is another (non-existent) paper, the proof of the twisted weighted fundamental lemma as announced by Chaudouard and Laumon in 2010(!). So what is the situation here? At one point Chau told me that he was considering writing a book which would (hopefully) include this result, but that this is no longer his intention, in part because at least one graduate student (not at Chicago!) is working on this problem (I won’t say too much more to avoid adding unnecessary pressure). But the message in this case is surely the same: if the Fundamental Lemma is worth a Fields Medal, a generalization of that result on which a large amount of mathematics is currently contingent should also be appreciated when it finally appears!


Walter Neumann

I recently learnt the sad news that Walter Neumann just passed away. Although I don’t think I have seen him in person for over 25 years, Walter was a pretty significant influence in my mathematical life. Here are some of my recollections. (See also this celebration of Walter on his retirement from people who knew him much better than me!)

Walter was lured to Australia in 1993 by Melbourne University. Although Australia was unable to keep him (he moved back to the US in 1999), his tenure included the entire time I was an undergraduate.

The first thing Walter did for me happened before I even met him. During the summer of 1993-94, there was an “REU” program (in US parlance) at Melbourne University in hyperbolic geometry, run in part by Walter and Craig Hodgson. Somehow they offered me a position even though I had only just finished high school and had neither formally met them nor applied. I guess my brother must have said something? I was paid the princely sum of $250 a week for six weeks to learn about volumes of hyperbolic 3-manifolds, invariant trace fields, SnapPea, and so much more. To top it off, there were generous breaks on Lygon Street for lunch and gelato. It was an idyllic summer on campus and I can think of no better introduction to what life as a mathematician could be like at its best.

Then, a few years later, Walter happily agreed to be the advisor for my senior undergraduate thesis. I wanted to generalize a problem of Fermat to quadratic fields. Fermat famously proved that there do not exist four distinct rational squares in arithmetic progression, and the problem I had in mind was understanding (to the extent possible) for what quadratic fields this was still true. If \(a^2, b^2, c^2, d^2\) are in arithmetic progression, then (with \(x = (b/a)^2 -1\)) one obtains a point on the curve

\[y^2 = (1+x)(1+2x)(1+3x)\]

which is isogenous to \(E = X_0(24)\), and so the problem reduces to computing the ranks of the quadratic twists \(E_d\). The first part of my thesis consisted of a \(2\)-descent on the quadratic twists \(E_p\) for \(p\) prime. If I remember correctly, the rank can be computed (conditional on the finiteness of the Tate-Shafarevich group) for any prime \(p\) unless \(p \equiv 1 \bmod 24\). The rank is unequivocally zero if \(p \equiv 5,7,19 \bmod 24\), is conditionally one (on the parity conjecture) if \(p \equiv 11,13,23 \bmod 24\), and is conditionally \(0\) or \(2\) if \(p \equiv 1 \bmod 24\). The first appearance of \(2\)-torsion in the Tate-Shafarevich group occurs for \(p=97\). (Exercise for the reader in their head: when \(p=73\) the rank is positive!) This is not exactly Walter’s main area of interest(!), but he was very happy to spend time doing everything from answering my questions in algebraic geometry to watching me doing explicit \(2\)-descents on his whiteboard. I had also hoped to do something similar to what Tunnell did when he analyzed the congruent number problem, but I wasn’t quite able to compute all the relevant spaces of weight \(3/2\) modular forms. Looking at my thesis again for the first time in very many years, it seems I proved the following:

Theorem: Let
\[g = q \prod_{n=1}^{\infty} (1 - q^{12 n})^2
\cdot \sum_{n \in \mathbf{Z}} q^{6 n^2} = q + 2 q^7 + \ldots = \sum a_d q^d \]
Then for \(d > 0\) odd and \(2d\) squarefree, assuming BSD, one has
\[\mathrm{Rank}(E/\mathbf{Q}(\sqrt{2d})) = 0
\leftrightarrow a_d \ne 0.\]

I’m happy that this seems to still be true, at least for \(d < 1000\), which includes \(17\) curves of rank \(2\), the first corresponding to \(\mathbf{Q}(\sqrt{134})\).
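For anyone who wants to play with the coefficients \(a_d\), here is a minimal Python sketch that expands the product above as a power series. It assumes the theta series is summed over all integers \(n\) (which is what the leading terms \(q + 2q^7\) force), and the function name `g_coefficients` is just something I made up for illustration.

```python
def g_coefficients(limit):
    """Return a list a with a[d] = coefficient of q^d in
    g = q * prod_{n>=1} (1 - q^(12n))^2 * sum_{n in Z} q^(6 n^2),
    for 0 <= d <= limit.  (Assumes the sum runs over all integers n.)
    """
    # Coefficients of prod_{n>=1} (1 - q^(12n))^2, truncated at q^limit.
    prod = [0] * (limit + 1)
    prod[0] = 1
    for n in range(1, limit // 12 + 1):
        step = 12 * n
        for _ in range(2):  # multiply by (1 - q^step) twice
            # Go downwards so prod[e - step] is still the old value.
            for e in range(limit, step - 1, -1):
                prod[e] -= prod[e - step]

    # Coefficients of the theta series sum_{n in Z} q^(6 n^2).
    theta = [0] * (limit + 1)
    theta[0] = 1
    n = 1
    while 6 * n * n <= limit:
        theta[6 * n * n] = 2  # contributions from n and -n
        n += 1

    # Multiply the two series and shift by the leading factor q.
    a = [0] * (limit + 1)
    for i, p in enumerate(prod):
        if p == 0:
            continue
        for j, t in enumerate(theta):
            if t == 0:
                continue
            d = i + j + 1
            if d > limit:
                break
            a[d] += p * t
    return a


a = g_coefficients(100)
print([(d, a[d]) for d in range(1, 101) if a[d] != 0][:5])
print(a[67])
```

The expansion begins \(q + 2q^7 - 2q^{13} + \ldots\), all exponents are \(1 \bmod 6\), and the coefficient \(a_{67}\) vanishes; \(d = 67\) is the value corresponding to the field \(\mathbf{Q}(\sqrt{134})\) singled out above.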

Perhaps one of the most amazing things Walter did for me was to help arrange for me to visit the Max Planck institute (Gottfried-Claren-Straße! Cash payments of 500DM bills!) for a month when I was still an undergraduate. The idea was that it would be useful for me to spend some time with (his collaborator) Don Zagier. In retrospect this obviously went above and beyond what most faculty might ever do for an undergraduate! It was certainly a transformative act for me.

One piece of mathematics that Walter is very much associated with is his work on the Bloch group and hyperbolic volumes, something which is very much dear to my heart as well. I was a bit too wet behind the ears as an undergraduate to learn about it directly from him. I had considered writing something more extensive about Walter’s work on the Bloch group (for example this paper) but then decided I probably couldn’t do better than the article of Stavros and Don in the link I cited above. Another paper of Walter close to my heart is his work with Alan Reid on Arithmetic of hyperbolic manifolds. There is clearly a direct link between my exposure to this subject in Melbourne and my later work on the Taylor-Wiles method.

Due to our respective differences in position at the time, our social interaction was naturally limited, but not entirely so. Walter was also my “masters thesis” advisor in the interregnum between my finishing my undergraduate degree and starting my PhD at Berkeley (recall the differences in timing of the academic year). My housemate (and still best friend in Australia) Toby and I invited Walter and his wife Anne over for dinner, which Toby generously cooked (was it some variation of Beef Wellington?). We may have drunk a Chateau Tahbilk 1978 as well, but quite possibly that was another night. (My parents were not great wine collectors, but the family lore is that what they did was lay down one bottle of Penfolds Grange Hermitage for each of our birth years. The 1972 we drank one Boxing Day in the ’90s when Danny wasn’t there; it was not a great vintage. The 1975 had disappeared long before I knew about it and had been replaced by the ’78 Tahbilk at probably around 10% of the price.) Apart from the general bonhomie, what I most remember was Anne deciding that she needed to fix Toby up with one of her friends. (After quizzing Toby on his opinions, interests, and tastes, Anne’s response was either of the form “wonderful, the two of you are in complete agreement” or, if that were not the case, “excellent, the two of you will have something to argue about!”.) Relatively soon after that, Toby and I found ourselves invited to a dinner party at the Neumanns’ house in Ivanhoe. (I say Ivanhoe with apparent confidence, but I confess that many of the details in this post stretch so far back into the past that even the things I know are true still seem unlikely.) I believe they served sherry as an aperitif, which seemed wonderfully sophisticated. And it turned out that Anne was cunningly playing the role of matchmaker and had also invited her friend in question to dinner.

I have learned a lot of mathematics from a lot of people. But looking back, I can see how my mentors have done so much more — some as simple as introducing us to other mathematicians, but other things behind the scenes which take significantly more effort and time but that don’t always get noticed at the time, but which play a key role in enriching our careers. When Anne was working to set up my friend Toby, Walter was working to set me up with his friend Don Zagier in Bonn! Thanks, Walter, for everything you did for me.

(Oh, and for those curious about how successful Anne’s matchmaking was: reader, she married him.)


Am I taking students?

I receive many (many) unsolicited emails about the possibility of working with me at grad school. Some are clearly bulk emails sent no doubt to a large number of professors. Some are customized to include phrases like “I was really fascinated by you paper [most recent paper] and I want to learn more”. (I get a surprising number of emails also from predatory publishers eager for me to write a book about my paper “Correction to: Modularity lifting beyond the Taylor-Wiles method.”) Some are much more personalized, relevant, and interesting. But it seemed worthwhile to write a short blog post which I can quickly refer someone to which may answer some of their questions. So here we go.

  1. It is worth reminding the reader that, in the US, you are accepted to a graduate program and not to work with a specific advisor. At Chicago, students choose advisors at the end of their first year.
  2. Yes, I definitely plan to take on more students in the future, and even in the short term future. However, I do plan to be somewhat selective; I would say that the more a student wants to work directly on the central topics inside the Langlands program the more background and independence the student must have.
  3. I have no involvement with graduate admissions, and I don’t plan to serve on that committee in the near future. Moreover, I believe that we probably have something like 500 people who apply for graduate school, so there’s not really any reason to send me your CV and I’m most likely not going to read it.
  4. I am totally happy to receive such emails, and some are more relevant than others. I sent a few emails of that sort myself when I applied to graduate school as well. To those emails, my response is: yes, please apply to Chicago, we have a great program in number theory here!
  5. At the same time, I simply don’t have the time to write personalized responses to all the emails I receive along these lines. So please do not be upset if you don’t receive a personalized response and just get a link to this blogpost.
  6. Because I get so many such emails, the previous default action was to save the email and not respond. That was usually the end of the matter (although I did feel a little guilty, thus this post). More recently, however, some people do not appear willing to take “no response” as an answer, and I now frequently get prospective students writing more and more belligerent emails demanding a response. This is probably not necessary for most of you, but please do not send multiple emails to me demanding a response. Emailing me four or six times is probably the optimal way to ensure that I *won’t* take you on as a student.

Not quite what I meant

Weibo Fu wrote an interesting paper on upper bounds for spaces of Bianchi modular forms, pushing previous results of Simon Marshall and Yongquan Hu to get more or less optimal results in the weight aspect. More generally, for any number field \(F\) which is not totally real, and for the space of regular algebraic cuspidal automorphic representations of fixed level and parallel weight \(k\), he obtains the bound (see Theorem 1.2):

\[ \mathrm{dim} S_k = O(k^{d-1})\]

where \(d = [F:\mathbf{Q}]\) (the “easy” bound is \(O(k^d)\)). This is a great result! I do however have one tiny quibble. Fu makes the remark that

If \(F\) only admits one complex place … it seems like [the bound above] gives a sharp upper bound by heuristics from the Calegari-Emerton conjecture.

I never, unfortunately, had the time to examine this paper in any detail, but I do disagree with this comment. The basic point is that the codimension of completed cohomology over the non-commutative Iwasawa algebra is closely related to the growth of mod-\(p\) cohomology, but the growth of mod-\(p\) cohomology only gives an upper bound on cohomology in characteristic zero, and they don’t have much to do with each other unless the cohomology is torsion free. If you take a number field \(F\) whose only totally real subfield is \(\mathbf{Q}\), then I think the most natural guess for the dimension of the space of forms is \(O(k)\), which only coincides with this upper bound for imaginary quadratic fields.


Persiflage, 2012-2024

No, not a eulogy!

I’ve been a bit concerned for a while about how stable wordpress is as a website: various plugins are always updating on their own, and I have sometimes noticed that old blog posts do not always render latex correctly (at some point there was a change in how latex was handled). For a while I thought I should do something to make sure that all my past math posts did not suddenly disappear. This feeling was hastened when the subversion platform I was using, xp-dev, suddenly went down when the owner (and apparently only employee) died and all the older versions of my collaborative projects became permanently unavailable.

Fortunately it is not too hard to download all old posts, and it remained to clean up the html and convert it into a latex document. That has now been done! Now you can read everything all at once at this link, which I plan to keep on my webpage (and perhaps even on the arXiv). Since I wasn’t so keen on editing a 300+ page pdf, it is only lightly modified, so many of the mistakes in the original posts (not corrected by commenters) will remain. I’ve left the posts more or less unchanged except for making some latex improvements, and I’ve included a selection of the comments when they were particularly relevant. I plan to update this file every now and then as I continue writing more blog posts. I’ve added (incomplete) notes on posts when I was aware of some particularly notable update since the post was written, and I tried (in a very incomplete way) to add references to papers I simply mentioned, but it could certainly do (at the very least) with some copy editing. I’m not sure anyone will look at these old posts besides me, but I am happy to have them in a more stable form and also easier to search.

Please feel free to point out any errors (big or small) in the pdf file on this post!


SL_n versus GL_n

I recently wrote a paper (with Toby Gee and George Boxer, see also here) on constructing regular algebraic automorphic representations \(\pi\) of (cohomological) weight zero and level one, and therefore also cuspidal cohomology classes in the cohomology of \(\mathrm{GL}_n(\mathbf{Z})\) for some values of \(n\).

There was one slightly subtle point which we had to address concerning the relation between the cohomology of \(\mathrm{SL}_n(\mathbf{Z})\) and \(\mathrm{GL}_n(\mathbf{Z})\), or at least the relationship between the parts of cohomology which come from cuspidal modular forms. I have observed this issue turn up in some different contexts, and that is what I wanted to talk about today. The main message is that from the perspective of the Langlands program, the cohomology of \(\mathrm{GL}_n(\mathcal{O}_F)\) is more fundamental than the cohomology of \(\mathrm{SL}_n(\mathcal{O}_F)\). When \(F = \mathbf{Q}\) these groups are “more or less” the same (more on that below), but the differences are more pronounced and significant when \(F \ne \mathbf{Q}\). But let’s start by talking about the case of classical modular forms, where there is already something a little bit interesting to say. A regular algebraic automorphic representation \(\pi\) for \(\mathrm{GL}(2)/\mathbf{Q}\) of level one corresponds to a cuspidal modular eigenform of weight \(k \ge 2\) and level one. We know that cuspidal modular forms of weight \(k \ge 2\) and level one contribute via Eichler-Shimura to the Betti cohomology groups of the modular curve. As an orbifold, the modular curve can be realized as \(\mathbf{H}/\Gamma\) where now \(\Gamma = \mathrm{SL}_2(\mathbf{Z})\) rather than \(\mathrm{GL}_2(\mathbf{Z})\). In this situation at least, we understand quite well what is happening. These eigenforms give rise to a two-dimensional space inside \(H^1\) of the modular curve, and thus inside \(H^1(\Gamma)\), and we understand what the “extra” action of the element
\[ \displaystyle{ \left( \begin{matrix} 1 & 0 \\ 0 & -1 \end{matrix} \right) } \]
is; namely under the Eichler-Shimura isomorphism, it corresponds to the action of complex conjugation (so from the perspective of the Hodge filtration, it takes the holomorphic forms to the antiholomorphic forms and vice-versa). It acts on the relevant piece of cohomology with trace zero. Note that this no longer holds on non-cuspidal cohomology, for example \(H^0\) is one dimensional in both cases. Of course in cohomological weight zero (which corresponds to weight \(k = 2\)), there turn out to be no such forms, but the point is that the vanishing of the cuspidal cohomology for \(\mathrm{GL}_2(\mathbf{Z})\) is equivalent to the same statement for \(\mathrm{SL}_2(\mathbf{Z})\). (Something similar is also true in higher weight as well when there really do exist such forms.)
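To make the count in the classical case concrete, here is a small Python sketch using the standard dimension formula for cusp forms of level one, together with the fact (Eichler-Shimura) that each eigenform contributes a two-dimensional piece to cuspidal \(H^1\). The function names are hypothetical and only for illustration.

```python
def dim_cusp_forms_level_one(k):
    """Dimension of the space of cusp forms of weight k for SL_2(Z),
    via the standard dimension formula for level one."""
    if k < 0 or k % 2 == 1:
        return 0  # no nonzero forms of negative or odd weight
    # dim M_k = floor(k/12) if k = 2 mod 12, and floor(k/12) + 1 otherwise.
    dim_modular = k // 12 + (0 if k % 12 == 2 else 1)
    # The Eisenstein series accounts for one dimension whenever M_k != 0.
    return max(dim_modular - 1, 0)


def dim_cuspidal_h1(k):
    """Dimension of the cuspidal part of H^1 of SL_2(Z) in weight k:
    each eigenform contributes a two-dimensional space."""
    return 2 * dim_cusp_forms_level_one(k)


for k in (2, 12, 14, 24, 26):
    print(k, dim_cusp_forms_level_one(k), dim_cuspidal_h1(k))
```

In particular the weight \(k = 2\) row is zero, which is the vanishing in cohomological weight zero mentioned above, and the first non-zero case is \(k = 12\).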

For larger \(n\) there is a similar equivalence; but now the behavior depends on the parity of \(n\). For \(n\) odd, the cohomology of \(\mathrm{GL}_n(\mathbf{Z})\) and \(\mathrm{SL}_n(\mathbf{Z})\) is (rationally) the same because \(\mathrm{GL}_n(\mathbf{Z}) \simeq \mathrm{SL}_n(\mathbf{Z}) \times \mathbf{Z}/2 \mathbf{Z}\) (and then one applies the Künneth formula). But for \(n\) even, a level one weight zero \(\pi\) gives rise to two copies of the exterior algebra

\[ \bigwedge^* \mathbf{C}^{\ell_0} \]

in degrees \([q_0,\ldots,q_0 + \ell_0]\), with \(\ell_0 = (n-2)/2\), and the “extra” element acts freely on these two copies. All this comes down to the differences in the real representation theory of \(\mathrm{GL}_n(\mathbf{R})\) and \(\mathrm{GL}_n(\mathbf{R})^{+}\) which is discussed briefly in the paper but which I won’t talk about here.
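The splitting of \(\mathrm{GL}_n(\mathbf{Z})\) for \(n\) odd used above is completely elementary: the central matrix \(-I\) has determinant \((-1)^n = -1\), so every element is \(\pm I\) times an element of \(\mathrm{SL}_n(\mathbf{Z})\). Here is a minimal Python sketch of that decomposition for small matrices; the helper names `det` and `split_gl_odd` are made up for this post, and the determinant is computed by naive cofactor expansion, which is only sensible for small \(n\).

```python
def det(m):
    """Determinant of a square integer matrix (list of rows),
    by cofactor expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def split_gl_odd(g):
    """For n odd and g in GL_n(Z), return (sign, s) with g = sign * s,
    sign in {1, -1} and det(s) = 1."""
    n = len(g)
    if n % 2 == 0:
        raise ValueError("the splitting only works for n odd")
    d = det(g)
    if d not in (1, -1):
        raise ValueError("matrix is not in GL_n(Z)")
    sign = d  # det(sign * I) = sign**n = sign because n is odd
    s = [[sign * x for x in row] for row in g]
    return sign, s


g = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]  # a permutation matrix of determinant -1
sign, s = split_gl_odd(g)
print(sign, s, det(s))
```

For \(n\) even the same trick fails because \(-I\) already has determinant \(1\), which is one way to see why the parity of \(n\) matters.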

But what happens for general number fields \(F\)? There’s a confusion which I have seen in various places even for \(n=2\) about whether one should be considering the cohomology of \(\mathrm{SL}_n(\mathcal{O}_F)\) or \(\mathrm{GL}_n(\mathcal{O}_F)\). Of course it depends on what exactly one wants to do. But at least if one is interested in computing automorphic representations conjecturally associated to motives which have level one, one should really be considering the cohomology of \(\mathrm{GL}_2(\mathcal{O}_F)\). This confusion comes with good pedigree: it turns up in the Serre-Tate correspondence! Tate mentions (October 1969, page 382) a colloquium by Swan who “disappointed everybody” by computing that \(H_1(\mathrm{SL}_2(\mathbf{Z}[\sqrt{-14}]),\mathbf{Z})\) has rank three, compared to the lower bound (coming from the boundary tori) of two. (Side remark: Tate notes in a later letter [Nov 15] it should be \(\sqrt{-10}\), not \(\sqrt{-14}\).) Serre responds (October 15, page 384) that he doesn’t find this at all surprising, and in fact:

(via la théorie de Weil cela signifiait qu’il existe des courbes elliptiques sur le corps en question qui n’ont pas de multiplication complexe — on n’en doute pas). En fait, vu Weil, il s’impose d’essayer de construire une courbe elliptique sur \(\mathbf{Q}(\sqrt{-56})\) ayant bonne réduction partout;

Now I confess that when I first read this quote I interpreted it as a misapprehension on Serre’s part, because (since this is \(\mathrm{SL}_2\) not \(\mathrm{GL}_2\)) there need not exist any such elliptic curve. But looking it up again now, I started to have my doubts, and Serre was perhaps more circumspect than I had assumed. Indeed ChatGPT tells me:

The phrase “il s’impose d’essayer” in French does not have the same strict sense of necessity as “it is necessary” in English. A more nuanced translation could be “it is imperative to try” or “it is important to try.” It suggests a strong recommendation or importance, rather than an absolute necessity.

(Possibly Colmez can confirm this; AI has rendered his go playing superfluous but not yet his skills interpreting for anglophones the nuances of Serre’s words.) That’s also consistent with how Serre continues:

je connais trop mal la théorie de Weil pour être sûr que ça doit exister; mais il vaut la peine d’essayer

Later (note the remark on \(d=-56\) versus \(d=-40\) above), Serre says:

C’est bien \(\mathbf{Q}(\sqrt{-40})\) le corps où Mennicke a trouve que le rang de \(\mathrm{SL}_2\) rendu abélien est nombre de classes. Mais il a un corps encore plus beau: \(\mathbf{Q}(\sqrt{-109})\) où le \(\mathrm{GL}_2\) rendu abélien est infini (c’est une propriété plus forte S1 que la précédente). Ici aussi, on a envie de chercher des courbes elliptiques à bonne réduction.

Perhaps worth adding the modern footnote as well:

«via la théorie de Weil cela signifiait que…» je m’avançais beaucoup en disant ça (I was talking through my hat).

Of course, 45 years later things have been clarified, at least conjecturally. (We still have no general way to produce motives from cohomology, even for Hilbert modular forms of parallel weight \(2\).) One perspective which I think is helpful (at least to those who care more about Galois representations) is thinking about the differences between the Galois representations associated to automorphic forms on \(\mathrm{SL}_n\) versus \(\mathrm{GL}_n\). Given a \(\pi\) for the former (say cuspidal algebraic of weight zero and level one), you should think about this as giving a compatible family of projective representations:
\[\rho(\pi): G_F \rightarrow \mathrm{PGL}_n(\overline{\mathbf{Q}}_p)\]
which are absolutely irreducible and crystalline of the expected weights and unramified outside \(v|p\). Now in this situation, one knows (following for example Patrikis) that there exists for any such \(\rho\) a lift to a genuine representation of \(G_F\) which is crystalline of the right weight for all \(v|p\) (this generally requires some parity condition on the weight, but we are assuming that here). What is not automatic, however, is that this lift has level \(N=1\) any more; that is, the image of inertia at other primes \(v\) may be non-trivial (though of course the image lies in the center). Here there is something special which happens only for \(F = \mathbf{Q}\); as observed by Tate, you can globalize these local characters and then twist to eliminate all the auxiliary ramification. (This argument is explained by Serre in his 1975 Durham paper which is always impossible to find online; it is used to show that a complex projective representation can be lifted to an Artin representation ramified at the same set of primes.) For other fields, even if the class number is trivial, you get global obstructions coming (via class field theory) from the unit group. (Even for imaginary quadratic fields, where the unit group is not very big, this is still an issue, and the general problem can only be avoided for fields for which the unit group has order \(2\) and which have a real place, which is quite a restrictive condition when you think about it.) The direct automorphic argument is ultimately quite similar, but there are some traps waiting for the unwary (related to Grunwald-Wang); see the discussion in this paper.

So for example, it is true that as \(F\) ranges over all imaginary quadratic fields, one has
\[H^1_{\mathrm{cusp}}(\mathrm{SL}_2(\mathcal{O}_F),\mathbf{C}) \ne 0\]
for all but finitely many \(F\). But the analogue for \(\mathrm{GL}_2(\mathcal{O}_F)\) is not only unknown; one certainly expects it to be false:

Conjecture: There are infinitely many imaginary quadratic fields \(F\) with
\[H^1_{\mathrm{cusp}}(\mathrm{GL}_2(\mathcal{O}_F),\mathbf{C}) = 0.\]

By the way, from the perspective of Galois representations, one can see why the group above should be non-zero in the case of \(\mathrm{SL}_2(\mathcal{O}_F)\). Let \(F = \mathbf{Q}(\sqrt{-D})\). All we need to find are modular forms \(\pi\) of weight two with the property that, locally at primes \(p|D\), the corresponding Weil-(Deligne) representation on restriction to inertia becomes trivial after restriction to \(\mathbf{Q}_p(\sqrt{-D})\) up to twist. One easy way to achieve this is to take ramified principal series \(\mathrm{PS}(1,\chi)\) for some (local) ramified quadratic character \(\chi\). The problem is this leads (globally) to a sign difficulty; if \(F\) has prime discriminant, then globally you would want the weight of \(\pi\) to be two and the Nebentypus character to be the quadratic character of conductor \(\Delta_F\) which is odd, which is a problem. (Sometimes it is not; if \(F = \mathbf{Q}(\sqrt{-p})\) and \(p \equiv 1 \bmod 4\) then you can take the real character of conductor \(p\), but if \(p \equiv -1 \bmod 4\) this doesn’t work.) But instead of principal series, one can take certain supercuspidal representations: Assume that \(F_p/\mathbf{Q}_p\) is a ramified quadratic extension. Then if \(\chi\) is a totally ramified character of \(F^{\times}_p\) of order \(2^m\) where \(2^{m} \| p-1\), then the base change of this supercuspidal representation will be unramified up to twist, but the original representation will not be unramified up to twist. It’s now easy to construct such forms (and even compute how many of them there are), and see there are plenty of them when the discriminant \(\Delta_F\) gets large (one has to avoid CM forms over \(F\), which can become non-cuspidal, but these are easy to bound). It’s also easy to see that while these base changes are unramified at every place up to a local twist, they are not in general unramified everywhere up to a global twist.
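The sign difficulty can be checked by hand, but here is a minimal Python sketch anyway. It uses two standard facts: a form of weight \(k\) needs a Nebentypus character with \(\chi(-1) = (-1)^k\) (so an even character in weight two), and the quadratic character modulo an odd prime \(p\) takes the value \((-1)^{(p-1)/2}\) at \(-1\). It also computes the exponent \(m\) with \(2^m \| p-1\) that appears in the supercuspidal construction. The function names are hypothetical.

```python
def quadratic_character_at_minus_one(p):
    """Value at -1 of the quadratic character modulo an odd prime p,
    computed by Euler's criterion: (-1)^((p-1)/2) mod p."""
    r = pow(p - 1, (p - 1) // 2, p)  # p - 1 represents -1 modulo p
    return 1 if r == 1 else -1


def two_adic_valuation(n):
    """Largest m such that 2^m divides the positive integer n."""
    m = 0
    while n % 2 == 0:
        n //= 2
        m += 1
    return m


for p in (3, 5, 7, 13, 17):
    parity = "even" if quadratic_character_at_minus_one(p) == 1 else "odd"
    # Weight two needs an even character, so only p = 1 mod 4 is allowed.
    print(p, parity, two_adic_valuation(p - 1))
```

So the quadratic character of conductor \(p\) is even exactly when \(p \equiv 1 \bmod 4\), which is the dichotomy described above.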

The forms one finds in this way by base change are invariant under complex conjugation (now acting on the group), and there is another “geometric” way to show they exist which was originally done by Rohlfs (see this paper), who I believe was the first person to prove the non-vanishing claim above. (In fact, this is one way to start proving base change in this situation.)

When it comes to general number fields, one certainly expects (by functoriality!) that \(H^*_{\mathrm{cusp}}(\mathrm{GL}_n(\mathcal{O}_F),\mathbf{C})\) should be non-zero for \(n=79\) say and every number field \(F\), but this is hopeless for almost all fields. By the exact same argument (together with Newton-Thorne for totally real fields!) one can certainly prove it for many totally real and CM fields (some ramification conditions are required for the argument to work). Of course, when there exists for such a field a cuspidal Hilbert modular form of weight two and level one, you can just use Newton-Thorne directly! For general fields, as usual, the problem of understanding automorphic forms eludes us.

Curiously enough, while writing this post, there appeared a very recent preprint by Darshan and Raghuram here which constructs cuspidal cohomology classes for \(\mathrm{GL}_n/F\) (for example, of cohomological weight zero) of some deep enough level, for any number field \(F\) which is Galois over a totally real field \(F^{+}\), but with no further assumptions on \(F\). Clozel (in this paper) did something similar when \(n\) is even by automorphic induction, but already for \(n=3\) this no longer works. Assuming all conjectures, the simplest way to construct such forms for \(F = \mathbf{Q}\) or any totally real field is to take symmetric squares of Hilbert modular forms (these more or less constitute all the self-dual forms). It seems to me that the forms found by Darshan and Raghuram must be some shadow of these forms over the largest totally real subfield \(F^{+}\) of \(F\), and so one is seeing a hint of non-cyclic base change here, which is intriguing! I hope to return to this later when I understand it better.


A talk on my new work with Vesselin Dimitrov and Yunqing Tang on irrationality

Here is a video of my talk from the recent 70th birthday conference of Peter Sarnak. During a talk one always forgets to say certain things, so I realized that my blog could be a good place to give some extra context on points I missed. There are three things off the top of my head that I can add before rewatching the talk. The first is that I made a typo in the name of one of my collaborators (oops!). The second is that I didn’t mention the work of Bost-Charles, whose influence on our work is clear. Indeed the \(m = 0\) version of the holonomy theorem (version III) in this talk is a theorem in their monograph. The third is that my presentation of known irrationality results for *explicit* zeta values makes sense in the context of the framing of my talk, but it’s good to note that the irrationality results of Rivoal, Ball-Rivoal, and Zudilin (for example, at least one of \(\zeta(5), \zeta(7), \zeta(9), \zeta(11)\) is irrational) in a closely related direction are amazing theorems. There’s probably more to say, and I might add some extra comments if I watch the video again.

Some incidental remarks concerning history I thought about when preparing my talk: I know from popular accounts that Apéry’s result came as a complete surprise. Similarly, the result of Gelfond-Schneider was a complete shock as well. (Hilbert is reputed to have said that he didn’t think this problem would be solved within his lifetime.) Now these two theorems are “recent enough” that the memory of their resolution is still within the collective consciousness of mathematicians. In the first case, I still know a bunch of people (Henri Cohen and Frits Beukers) who were actually at Apéry’s infamous lecture. But what about Lindemann’s proof that \(\pi\) is transcendental? I have no sense as to what the reaction was at the time, in part due to my lack of historical knowledge but also due to the lack (as far as I can see) of easily available informal discussions about contemporary mathematics from the 19th century (I assume that personal letters would be the best source). The best (?) I could find was the following (quoted from here):

In fact his [Lindemann’s] proof is based on the proof that \(e\) is transcendental together with the fact that \(e^{i \pi} = -1\). Many historians of science regret that Hermite, despite doing most of the hard work, failed to make the final step to prove the result concerning \(\pi\) which would have brought him fame outside the world of mathematics. This fame was instead heaped on Lindemann but many feel that he was a mathematician clearly inferior to Hermite who, by good luck, stumbled on a famous result.

First, this seems pretty brutal towards Lindemann (to be fair, the continuation of the text does give some more grudging praise of Lindemann). Second, which historians are being referred to here? This seems far too judgemental for any of the historians I have ever spoken to in real life. If this text is at all accurate, it seems to suggest that Lindemann’s result was lauded but perhaps not considered surprising by his contemporaries? I feel that this is recent enough that one should be able to get a fuller idea of what was going on at the time.

Going back in time further, I also wonder what Lambert’s contemporaries thought of his proof (in the 1760s) that \(\pi\) was irrational. When I was giving a public talk on \(\pi\) in Sydney I looked up Lambert’s paper. The introduction is quite amusing, with the following remark that suggests a modern way of thinking not much different to how I think about things today:

Démontrer que le diametre du cercle n’est point à sa circonférence comme un nombre entier à un nombre entier, c’est là une chose, dont les géometres ne seront gueres surpris. On connoit les nombres de Ludolph, les rapports trouvés par Archimede, par Metius, etc. de même qu’un grand nombre de suites infinies, qui toutes se rapportent à la quadrature du cercle. Et si la somme de ces suites est une quantité rationelle, on doit assez naturellement conclure, qu’elle sera ou un nombre entier, ou une fraction très simple. Car, s’il y falloit une fraction fort composée, quelle raison y auroit-il, pourquoi plutôt telle que telle autre quelconque?

(Or in translation, errors some combination of mine and google translate):

We prove that the ratio of the diameter of the circle to its circumference is not rational; something that geometers will hardly be surprised by. We know the number \(\pi\) of Ludolph, and expressions for this number found by Archimedes, by Metius, etc. in terms of a large number of infinite series of rational numbers, which all relate to the squaring of the circle. If the sum of these sequences was a rational quantity, we must quite naturally conclude that it will be either a whole number, or a very simple fraction. For, if a very complicated fraction were necessary, what reason would there be to be equal to such a number rather than any other real (irrational) number?

I guess Occam was from the 14th century!


Zeilberger + ChatGPT

Since I don’t have Maple, I can’t play with the following code:

https://sites.math.rutgers.edu/~zeilberg/tokhniot/MultiAlmkvistZeilberger.txt

But is ChatGPT now good enough to re-write this in either PARI/GP or Magma (or Mathematica)? I’m not sure how realistic this might be (without some serious extra hands-on editing…)


Unramified Fontaine-Mazur for representations coming from abelian varieties

Mark Kisin gave a talk at the number theory seminar last week where the following problem arose:

Let \(W\) be the Galois representation associated to the Tate module of an abelian variety \(A\) over a number field, and suppose that \(W = U \otimes V\). Now suppose that the Galois action on \(U\) is unramified at all primes above \(p\). Can you prove that the Galois action on \(U\) has finite image?

Of course this is a special case of the unramified Fontaine-Mazur conjecture. But here the representation \(U\) literally “comes from an abelian variety” although as a tensor factor rather than a direct factor. At first sight it seems like it should be much easier than the actual Fontaine-Mazur conjecture if you just find the right trick, but I don’t see how to do it! Here at least is a very special case.

Lemma: Suppose that \(A/K\) has ordinary reduction at a set of primes of density one, and that \(U\) is a representation of odd dimension which is unramified at all primes dividing \(p\) and which occurs as a tensor factor of \(W = H^1(A) = U \otimes V\). Then, after some finite extension of \(K\), \(U\) contains a copy of the trivial representation.

Proof: One may as well assume by induction that the action of the Galois group on \(U\) is absolutely irreducible of odd dimension \(d\) and remains so for every finite extension (otherwise decompose it into such pieces and take one of odd dimension).

Now choose a prime \(v\) of \(K\). Let \(\alpha_i\) be the eigenvalues of Frobenius at \(v\) on \(U\), and let \(\beta_j\) be the corresponding eigenvalues on \(V\). We know that the \(\alpha_i \beta_j\) are algebraic numbers which are Weil numbers of norm \(N(v)\). The ratio of any two of them is thus an algebraic number with absolute value \(1\) at all archimedean places, and so \(\alpha_i/\alpha_1 = (\alpha_i \beta_1)/(\alpha_1 \beta_1)\) has this property.

Let’s suppose that the ratios \(\alpha_i/\alpha_1\) are actually roots of unity for a set of primes \(v\) of density one. Since \(W\) is defined over a fixed finite extension \(E/\mathbf{Q}_p\), the degrees of these ratios over \(E\) are uniformly bounded, and so the orders of these roots of unity are also uniformly bounded. But then (projectively) only finitely many characteristic polynomials will arise from Frobenii for a set of primes of density one, which would imply that \(U\) has finite projective image, from which it easily follows that \(U\) becomes trivial over a finite extension (remember the determinant is unramified so of finite image). Hence it suffices to show that the \(\alpha_i/\alpha_1\) are all algebraic integers and then use Kronecker’s theorem.
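As an aside, the step from bounded degree to bounded order can be checked by hand in the simplest case. The following Python sketch assumes \(E = \mathbf{Q}_p\) (this is my simplification; a general \(E\) only changes the bound), and the function names are mine. It uses the standard fact that for \(n = p^k m\) with \(p \nmid m\), a primitive \(n\)-th root of unity has degree \(\varphi(p^k) \cdot \mathrm{ord}_m(p)\) over \(\mathbf{Q}_p\), where \(\mathrm{ord}_m(p)\) is the multiplicative order of \(p\) modulo \(m\).

```python
def mult_order(p: int, m: int) -> int:
    """Multiplicative order of p modulo m, assuming gcd(p, m) = 1."""
    if m == 1:
        return 1
    k, x = 1, p % m
    while x != 1:
        x = (x * p) % m
        k += 1
    return k


def degree_over_Qp(n: int, p: int) -> int:
    """Degree over Q_p of the field generated by a primitive n-th root of unity."""
    k = 0
    while n % p == 0:  # strip off the p-part p^k of n
        n //= p
        k += 1
    ramified = 1 if k == 0 else (p - 1) * p ** (k - 1)  # phi(p^k)
    return ramified * mult_order(p, n)


def orders_of_bounded_degree(p: int, d: int) -> list[int]:
    """All n for which a primitive n-th root of unity has degree <= d over Q_p.

    The prime-to-p part of n divides p^f - 1 for some f <= d, and the p-part
    p^k satisfies phi(p^k) <= d, so n < p^d * p * d bounds the search.
    """
    bound = p ** d * p * d
    return [n for n in range(1, bound + 1) if degree_over_Qp(n, p) <= d]


print(orders_of_bounded_degree(3, 2))
```

The list is finite for every \(p\) and \(d\), which is all the argument needs.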

For finite places not dividing \(N(v)\) this is clear because the valuations of the \(\alpha_i \beta_j\) are all trivial and so are their ratios. For finite places dividing \(N(v)\), now suppose in addition that \(A\) is ordinary at \(v\). Fix a place above \(v\). If the \(\alpha_i/\alpha_1\) have valuation given by \(a_i\), and the \(\beta_j/\beta_1\) have valuation \(b_j\), it follows that the quantities \(a_i + b_j\) take on precisely two values, zero and either \(1\) or \(-1\), and they take on each of these values exactly half the time. But then either \(a_i\) is constant and thus (considering \(i = 1\)) equal to \(0\), or the \(b_j\) are all zero, and then half the \(a_i\) are zero and half are \(1\) or \(-1\). But the second case is clearly only possible if \(U\) has even dimension, contrary to our assumption. So the \(a_i\) all vanish, and we are done!
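The combinatorics of the valuations in the last step can be verified by brute force in small dimensions. The Python sketch below is an illustration only, with names of my own choosing. It enumerates all vectors \(a\) and \(b\) with \(a_1 = b_1 = 0\) and entries in \(\{-1,0,1\}\) (no other entries can occur, since \(a_i = a_i + b_1\) and \(b_j = a_1 + b_j\) are themselves among the sums), and keeps those for which the sums \(a_i + b_j\) are zero exactly half the time and equal to a single value \(\pm 1\) otherwise.

```python
from itertools import product


def admissible(a, b):
    """True if the sums a_i + b_j are 0 exactly half the time and
    equal to one fixed value e in {1, -1} the other half."""
    sums = [x + y for x in a for y in b]
    for e in (1, -1):
        if all(s in (0, e) for s in sums) and 2 * sums.count(0) == len(sums):
            return True
    return False


def patterns(d, n):
    """All admissible pairs (a, b) with len(a) = d, len(b) = n and a_1 = b_1 = 0."""
    found = []
    for a_tail in product((-1, 0, 1), repeat=d - 1):
        for b_tail in product((-1, 0, 1), repeat=n - 1):
            a, b = (0,) + a_tail, (0,) + b_tail
            if admissible(a, b):
                found.append((a, b))
    return found


# Odd d: every admissible pattern has a identically zero.
print(all(set(a) == {0} for a, _ in patterns(3, 4)))
# Even d: patterns with non-constant a do occur, e.g. a = (0, 1), b = (0, 0).
print(any(set(a) != {0} for a, _ in patterns(2, 2)))
```

In the cases checked, odd \(d\) forces every \(a_i\) to vanish, while even \(d\) allows the half-and-half pattern.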

I suspect the case that \(\dim(U)=2\), even with an ordinary hypothesis, is probably quite hard. But I would be happy to be mistaken.
