The conjecture in this paper has indeed been proven. But that isn’t the entire story. Serre was fully aware of Katz modular forms of weight one. However, Serre was too timid (or rather, prudently conservative) and made his conjecture only for weights \(k(\rho) \ge 2\).
Well, perhaps I am overstating the case; we may as well quote Serre himself here:
Au lieu de définir les formes paraboliques à coefficients dans \(\mathbf{F}_p\) par réduction à partir de la caractéristique \(0,\) comme nous l’avons fait, nous aurions pu utiliser la définition de Katz [23], qui conduit à un espace a priori plus grand … Il serait également intéressant d’étudier de ce point de vue le cas \(k =1,\) que nous avons exclu jusqu’ici ; peut-être la définition de Katz donne-t-elle alors beaucoup plus de représentations \(\rho_f\)?
Instead of defining the cusp forms with coefficients in \(\mathbf{F}_p\) by reduction from characteristic \(0,\) as we did, we could have used the definition of Katz [23], which leads to an a priori larger space … It would also be interesting to study from this viewpoint the case \(k = 1,\) which we have excluded so far; perhaps Katz’s definition then gives many more representations \(\rho_f\)?
In his Inventiones paper on the weight in Serre’s conjecture, Edixhoven does give the correct formulation where one allows \(k(\rho) = 1\) and correspondingly also Katz modular forms. The bridge between the two conjectures essentially consists of two further conjectures: first, that Galois representations associated to residual weight one forms are unramified, and second, that unramified modular representations come from weight one.
The first progress on this problem was actually pre-Edixhoven, namely, Gross’ companion form paper in Duke. (I have four copies of that paper on my laptop: two called GrossDuke.pdf, one called GrossCompanion.pdf, and one simply called Gross.pdf. Does anyone else have scatterbrained naming systems for downloaded pdf files?) Gross deals with both directions in the case when \(\rho(\mathrm{Frob}_p)\) has distinct eigenvalues (I guess the assumption in the direction weight one \(\Rightarrow\) unramified is that the roots of \(X^2 - a_p X + \chi(p)\) are distinct). Of course, there was the famous matter of the “unchecked compatibilities” (I’m not one for checking compatibilities myself, to be honest), which have certainly been resolved at this point (does Bryden Cais do this in his thesis? I think he does). The next step was the work of Coleman-Voloch, who deal with the remaining case under the additional assumption that \(p\) is odd. So this leaves the case \(p = 2\). Somewhat more recently, Gabor Wiese showed that weight one Katz modular forms do give rise to unramified representations without any assumptions. So this leaves:
Serre’s Conjecture [Edixhoven formulation]: Let \(\rho: G_{\mathbf{Q}} \rightarrow \mathrm{GL}_2(\mathbf{F}_q)\) be an absolutely irreducible modular representation of characteristic 2. Assume that \(\rho\) is unramified at 2 and that the semi-simplification of \(\rho(\mathrm{Frob}_2)\) is scalar. Then \(\rho\) is modular of weight one.
Now Wiese also explicitly dealt with the case when \(\rho\) is (projectively) dihedral, so we can assume that \(\rho\) is absolutely irreducible with non-dihedral image. Suppose that the Serre level is \(N\). Let \(\mathbf{T}\) denote the weight two Hecke algebra which does not include the Hecke operator \(T_2\), and let \(\mathfrak{m}\) denote its maximal ideal corresponding to \(\rho\). Let’s imagine we are working with Hecke algebras over some sufficiently large extension \(\mathcal{O}_E\) of \(\mathbf{Z}_2\) with residue field \(k\) so as to include enough Frobenius eigenvalues. It suffices to prove that
\(\dim_{\mathbf{T}/\mathfrak{m}} H^0(X_1(N)/k,\omega^{\otimes 2})[\mathfrak{m}] \ge 2,\)
because then we will have found two modular forms \(f\) and \(g\) which are Hecke eigenforms for all Hecke operators away from \(2,\) with the same eigenvalues, and by the \(q\)-expansion principle, some linear combination of \(f\) and \(g\) will have to be the square of the desired weight one form.
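To spell out this last step, here is a sketch of the standard argument. It assumes (this is not stated explicitly above) that \(\mathbf{T}\) contains \(T_n\) for every odd \(n,\) including the operators at primes dividing \(N,\) and it does not check the hypotheses on the level needed for Katz’s theorem. Write \(\lambda_n \in k\) for the eigenvalue of \(T_n\) on the \(\mathfrak{m}\)-torsion. For any \(h = \sum a_n(h) q^n\) in \(H^0(X_1(N)/k,\omega^{\otimes 2})[\mathfrak{m}]\) we have
\(a_n(h) = a_1(T_n h) = \lambda_n \, a_1(h) \quad \text{for all odd } n.\)
So if the space has dimension at least two, the linear functional \(h \mapsto a_1(h)\) has a non-zero kernel, and any non-zero \(h\) in that kernel satisfies
\(h = \sum_{n} a_{2n}(h) \, q^{2n}, \qquad \theta h = q \frac{d}{dq} h = 0.\)
By Katz’s description of the kernel of \(\theta\) in characteristic \(p,\) such a form of weight \(k\) divisible by \(p\) is the \(p\)-th power of a form of weight \(k/p.\) Here \(p = k = 2,\) so \(h = g^2\) for a non-zero Katz modular form \(g = \sum b_n q^n\) of weight one, with \(b_n^2 = a_{2n}(h).\)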
Let \(R_{\mathrm{loc}}\) denote the Kisin deformation ring at two for \(\rho | D_2\) for the decomposition group \(D_2\) at \(2\) (this is just the ordinary deformation ring, in the sense of Geraghty). Let \(R^{\dagger}_{\mathrm{loc}}\) denote the augmented deformation ring which also includes the crystalline Frobenius eigenvalue \(T_2\) (or, to put it differently, the eigenvalue of Frobenius on the “unramified quotient” \(U_2\), where the former is meant in a sense that can and does make sense integrally. By Hensel’s lemma, both pieces of added data are equivalent.) Now one uses the modularity machine, which is OK by Khare-Wintenberger for \(p = 2\) because we are in the non-dihedral setting. Let’s patch the Betti cohomology of modular curves following KW, except now working with the modified global Kisin deformation ring \(R^{\dagger}\) which remembers crystalline Frobenius, and the full Hecke algebra \(\mathbf{T}^{\dagger}\) which includes \(T_2.\) Now \(R^{\dagger}_{\mathrm{loc}}\) is a domain with formally smooth generic fibre (this is proved in Snowden’s paper; the ring in question is denoted \(\widetilde{R}_3\) in ibid.). Hence, by the Kisin-Khare-Wintenberger method, we obtain an isomorphism \(R^{\dagger}[1/\varpi] = \mathbf{T}^{\dagger}[1/\varpi]\). However, because \(R^{\dagger}_{\mathrm{loc}}\) is in addition Cohen-Macaulay, this can be upgraded to an \(R^{\dagger} = \mathbf{T}^{\dagger}\) theorem. (It might be cleaner to instead patch coherent cohomology: multiplicity one [which always holds with \(T_2\) included] implies that the patched module is free of rank one, which makes it easy to deduce the integral \(R^{\dagger} = \mathbf{T}^{\dagger}\) theorem.)
Now suppose, for contradiction, that multiplicity one also holds without \(T_2,\) that is, that the space of forms above has dimension one. By considering the action of \(\mathbf{T}^{\dagger}\) on coherent cohomology, this multiplicity one assumption allows us to deduce by Nakayama that \(\mathbf{T} = \mathbf{T}^{\dagger}\) (more trivially: the space of modular forms with coefficients in \(E/\mathcal{O}_E\) with \(\mathcal{O}_E/\varpi = k\) is co-free of rank one over both of these rings) and so \(R \rightarrow R^{\dagger} = \mathbf{T}^{\dagger} = \mathbf{T}\) is surjective. However, there cannot be a surjection \(R \rightarrow R^{\dagger}\), because there is a map \(R^{\dagger} \rightarrow k[\epsilon]/\epsilon^2\) which is trivial as a Galois deformation but is non-trivial for (the Galois avatar of) \(T_2\). For example, in the trivial case, this just amounts to saying that the trivial representation of \(G_{\mathbf{Q}_2}\) to \(\mathrm{GL}_2(k[\epsilon]/\epsilon^2)\) can be thought of as “ordinary with eigenvalue \(1 + \epsilon.\)” It follows that multiplicity one without \(T_2\) cannot hold.
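Here is one way to make the phrase “ordinary with eigenvalue \(1 + \epsilon\)” concrete. This is only a sketch, and it assumes (this is not stated above) that the extra datum recorded by \(R^{\dagger}\) can be taken to be a root \(\alpha\) of the characteristic polynomial of Frobenius. For the trivial representation over \(A = k[\epsilon]/\epsilon^2,\) Frobenius acts as the identity, with characteristic polynomial \((X - 1)^2,\) and
\((\alpha - 1)^2 = \epsilon^2 = 0 \quad \text{for } \alpha = 1 + \epsilon.\)
So \(\alpha = 1 + \epsilon\) is a root in \(A\) different from \(\alpha = 1,\) although the underlying Galois deformation is the same. Under this reading, the pair consisting of the trivial deformation and \(\alpha = 1 + \epsilon\) is a non-zero tangent vector of \(R^{\dagger}\) that restricts to the zero tangent vector of \(R,\) which is exactly what a surjection \(R \rightarrow R^{\dagger}\) would forbid.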
Thus Serre’s conjecture is true!
Is all the old stuff on companion forms made obsolete by the modern techniques?
I think that there are two ways to answer that question. The method of Gee to produce companion forms (of which this argument is a variant) can now be said to reprove the main theorems of Gross and Coleman-Voloch without having to reference those papers. But in a different and perhaps more important sense, there is plenty of geometric content in these papers which is not reproduced by modularity lifting methods. For example, I think it is an important open problem to construct companion forms in low (=non-cohomological) weight beyond \(\mathrm{GL}(2).\) The complete failure of multiplicity one in those contexts (i.e. if you know the Hecke eigenvalues for all the Hecke operators you still can’t recover the modular form) suggests that one might have to start thinking geometrically again. (The argument I give above crucially uses \(q\)-expansions.) Then again, maybe it will turn out that one can prove such companion form results just using modularity techniques; but it would be foolish to ignore the other approaches!