Cover Page

CONTENTS

PREFACE TO THE FIRST EDITION

PREFACE TO THE SECOND EDITION

NOTATION

INTRODUCTION

CHAPTER ONE: FROM FERMAT TO GAUSS

§1. FERMAT, EULER AND OUADRATIC RECIPROCITY

§2. LAGRANGE, LEGENDRE AND QUADRATIC FORMS

§3. GAUSS, COMPOSITION AND GENERA

§4. CUBIC AND BIQUADRATIC RECIPROCITY

CHAPTER TWO: CLASS FIELD THEORY

§5. THE HILBERT CLASS FIELD AND p = x2 + ny2

§6. THE HILBERT CLASS FIELD AND GENUS THEORY

§7. ORDERS IN IMAGINARY QUADRATIC FIELDS

§8. CLASS FIELD THEORY AND THE ČEBOTAREV DENSITY THEOREM

§9. RING CLASS FIELDS AND p = x2 + ny2

CHAPTER THREE: COMPLEX MULTIPLICATION

§10. ELLIPTIC FUNCTIONS AND COMPLEX MULTIPLICATION

§11. MODULAR FUNCTIONS AND RING CLASS FIELDS

§12. MODULAR FUNCTIONS AND SINGULAR j-INVARIANTS

§13. THE CLASS EQUATION

CHAPTER FOUR: ADDITIONAL TOPICS

§14. ELLIPTIC CURVES

§15. SHIMURA RECIPROCITY

REFERENCES

ADDITIONAL REFERENCES

A. References Added to the Text

B. Further Reading for Chapter One

C. Further Reading for Chapter Two

D. Further Reading for Chapter Three

E. Further Reading for Chapter Four

INDEX

title.jpg

PURE AND APPLIED MATHEMATICS

A Wiley Series of Texts, Monographs, and Tracts

Founded by RICHARD COURANT

Editors Emeriti: MYRON B. ALLEN III, DAVID A. COX, PETER HILTON, HARRY HOCHSTADT, PETER LAX, JOHN TOLAND

ADÁMEK, HERRLICH, and STRECKER—Abstract and Concrete Catetories

ADAMOWICZ and ZBIERSKI—Logic of Mathematics

AINSWORTH and ODEN—A Posteriori Error Estimation in Finite Element Analysis

AKIVIS and GOLDBERG—Conformal Differential Geometry and Its Generalizations

ALLEN and ISAACSON—Numerical Analysis for Applied Science

* ARTIN—Geometric Algebra

ATKINSON, HAN, and STEWART—Numerical Solution of Ordinary Differential Equations

AUBIN—Applied Functional Analysis, Second Edition

AZIZOV and IOKHVIDOV—Linear Operators in Spaces with an Indefinite Metric

BASENER—Topology and Its Applications

BERG—The Fourier-Analytic Proof of Quadratic Reciprocity

BERKOVITZ—Convexity and Optimization in imagesn

BERMAN, NEUMANN, and STERN—Nonnegative Matrices in Dynamic Systems

BOYARINTSEV—Methods of Solving Singular Systems of Ordinary Differential Equations

BRIDGER—Real Analysis: A Constructive Approach

BURK—Lebesgue Measure and Integration: An Introduction

* CARTER—Finite Groups of Lie Type

CASTILLO, COBO, JUBETE, and PRUNEDA—Orthogonal Sets and Polar Methods in Linear Algebra: Applications to Matrix Calculations, Systems of Equations, Inequalities, and Linear Programming

CASTILLO, CONEJO, PEDREGAL, GARCIÁ, and ALGUACIL—Building and Solving Mathematical Programming Models in Engineering and Science

CHATELIN—Eigenvalues of Matrices

CLARK—Mathematical Bioeconomics: The Mathematics of Conservation, Third Edition

COX—Galois Theory, Second Edition

COX—Primes of the Form x2 + ny2: Fermat, Class Field Theory, and Complex Multiplication, Second Edition

*CURTIS and REINER—Representation Theory of Finite Groups and Associative Algebras

*CURTIS and REINER—Methods of Representation Theory: With Applications to Finite Groups and Orders, Volume I

CURTIS and REINER—Methods of Representation Theory: With Applications to Finite Groups and Orders, Volume II

DINCULEANU—Vector Integration and Stochastic Integration in Banach Spaces

*DUNFORD and SCHWARTZ—Linear Operators
Part 1—General Theory
Part 2—Spectral Theory, Self Adjoint Operators in Hilbert Space
Part 3—Spectral Operators

FARINA and RINALDI—Positive Linear Systems: Theory and Applications

FATICONI—The Mathematics of Infinity: A Guide to Great Ideas, Second Edition

FOLLAND—Real Analysis: Modem Techniques and Their Applications

FRÖLICHER and KRIEGL—Linear Spaces and Differentiation Theory

GARDINER—Teichmüller Theory and Quadratic Differentials

GILBERT and NICHOLSON—Modern Algebra with Applications, Second Edition

*GRIFFITHS and HARRIS—Principles of Algebraic Geometry

GRILLET—Algebra

GROVE—Groups and Characters

GUSTAFSSON, KREISS and OLIGER—Time Dependent Problems and Difference Methods

HANNA and ROWLAND—Fourier Series, Transforms, and Boundary Value Problems, Second Edition

*HENRICI—Applied and Computational Complex Analysis
Volume 1, Power Series—Integration—Conformal Mapping—Location of Zeros
Volume 2, Special Functions—Integral Transforms—Asymptotics—Continued Fractions
Volume 3, Discrete Fourier Analysis, Cauchy Integrals, Construction of Conformal Maps, Univalent Functions

*HILTON and WU—A Course in Modem Algebra

*HOCHSTADT—Integral Equations

JOST—Two-Dimensional Geometric Variational Procedures

KHAMSI and KIRK—An Introduction to Metric Spaces and Fixed Point Theory

*KOBAYASHI and NOMIZU—Foundations of Differential Geometry, Volume I

*KOBAYASHI and NOMIZU—Foundations of Differential Geometry, Volume II

KOSHY—Fibonacci and Lucas Numbers with Applications

LAX—Functional Analysis

LAX—Linear Algebra and Its Applications, Second Edition

LOGAN—An Introduction to Nonlinear Partial Differential Equations, Second Edition

LOGAN and WOLESENSKY—Mathematical Methods in Biology

LUI—Numerical Analysis of Partial Differential Equations

MARKLEY—Principles of Differential Equations

MORRISON—Functional Analysis: An Introduction to Banach Space Theory

NAYFEH—Perturbation Methods

NAYFEH and MOOK—Nonlinear Oscillations

O'LEARY—Revolutions of Geometry

O'NEIL—Beginning Partial Differential Equations, Second Edition

PANDEY—The Hilbert Transform of Schwartz Distributions and Applications

PETKOV—Geometry of Reflecting Rays and Inverse Spectral Problems

*PRENTER—Splines and Variational Methods

PROMISLOW—A First Course in Functional Analysis

RAO—Measure Theory and Integration

RASSIAS and SIMSA—Finite Sums Decompositions in Mathematical Analysis

RENELT—Elliptic Systems and Quasiconformal Mappings

RIVLIN—Chebyshev Polynomials: From Approximation Theory to Algebra and Number Theory, Second Edition

ROCKAFELLAR—Network Flows and Monotropic Optimization

ROITMAN—Introduction to Modem Set Theory

ROSSI—Theorems, Corollaries, Lemmas, and Methods of Proof

*RUDIN—Fourier Analysis on Groups

SENDOV—The Averaged Moduli of Smoothness: Applications in Numerical Methods and Approximations

SENDOV and POPOV—The Averaged Moduli of Smoothness

SEWELL—The Numerical Solution of Ordinary and Partial Differential Equations, Second Edition

SEWELL—Computational Methods of Linear Algebra, Second Edition

SHICK—Topology: Point-Set and Geometric

SHISKOWSKI and FRINKLE—Principles of Linear Algebra With Maple

SHISKOWSKI and FRINKLE—Principles of Linear Algebra With Mathematical®

*SIEGEL—Topics in Complex Function Theory
Volume 1—Elliptic Functions and Uniformization Theory
Volume 2—Automorphic Functions and Abelian Integrals
Volume 3—Abelian Functions and Modular Functions of Several Variables

SMITH and ROMANOWSKA—Post-Modern Algebra

ŠOLÍN-Partial Differential Equations and the Finite Element Method

STADE—Fourier Analysis

STAHL and STENSON—Introduction to Topology and Geometry, Second Edition

STAHL—Real Analysis, Second Edition

STAKGOLD and HOLST—Green's Functions and Boundary Value Problems, Third Edition

STANOYEVITCH—Introduction to Numerical Ordinary and Partial Differential Equations Using MATLAB®

*STOKER—Differential Geometry

*STOKER—Nonlinear Vibrations in Mechanical and Electrical Systems

*STOKER—Water Waves: The Mathematical Theory with Applications

WATKINS—Fundamentals of Matrix Computations, Third Edition

WESSELING—An Introduction to Multigrid Methods

†WHITHAM—Linear and Nonlinear Waves

ZAUDERER—Partial Differential Equations of Applied Mathematics, Third Edition

*Now available in a lower priced paperback edition in the Wiley Classics Library.

†Now available in paperback.

PREFACE TO THE FIRST EDITION

Several years ago, while reading Weil's Number Theory: An Approach Through History, I noticed a conjecture of Euler concerning primes of the form x2 + 14y2. That same week I picked up Cohn's A Classical Invitation to Algebraic Numbers and Class Fields and saw the same example treated from the point of view of the Hilbert class field. The coincidence made it clear that something interesting was going on, and this book is my attempt to tell the story of this wonderful part of mathematics.

I am an algebraic geometer by training, and number theory has always been more of an avocation than a profession for me. This will help explain some of the curious omissions in the book. There may also be errors of history or attribution (for which I take full responsibility), and doubtless some of the proofs can be improved. Corrections and comments are welcome!

I would like to thank my colleagues in the number theory seminars of Oklahoma State University and the Five Colleges (Amherst College, Hampshire College, Mount Holyoke College, Smith College and the University of Massachusetts) for the opportunity to present material from this book in preliminary form. Special thanks go to Dan Flath and Peter Norman for their comments on earlier versions of the manuscript. I also thank the reference librarians at Amherst College and Oklahoma Slate University for their help in obtaining books through interlibrary loan.

David A. Cox

Amherst, Massachsusetts
August 1989

PREFACE TO THE SECOND EDITION

The philosophy of the second edition is to preserve as much of the original text as possible. The major changes are:

• A new §15 on Shimura reciprocity has been added, based on work of Peter Stevenhagen and Alice Gee [A 10, A11, A23] and Bumkyo Cho [A6].
• The fifteen sections are now organized into four chapters:
– The original §§1–13, which present a complete solution of p = x2 + ny2, now constitute Chapters One, Two and Three.
– The new Chapter Four consists of the original §14 (on elliptic curves) and the new §15 (on Shimura reciprocity).
• An “Additional References” section has been added to supplement the original references [1]–[112]. This section is divided into five parts:
– The first part consists of references [A1]–[A24] that are cited in the text. These references (by no means complete) provide updates to the book.
– The remaining four parts give some references (also not complete) for further reading that are relevant to the topics covered in Chapters One, Two, Three and Four.
• The expanded Notation section now includes all notation used in the book. Specialized notation is listed according to the page where it first appears.

The other changes to the text are very minor, mostly to enhance clarity, improve formatting, and simplify some of the proofs. One exception is the addition of new exercises: at the end of §12, Exercise 12.31 shows how Ramanujan could have derived Weber's formula for images (thanks to Heng Huat Chan), and at the end of §14, Exercise 14.24 gives an elliptic curve primality test for Mersenne numbers due to Dick Gross [A 12] (thanks to Alice Silverberg).

The web site for the book includes typographical errors and a link to supplementary exercises for §§1–3 written by Jeffrey Stopple. The URL of the web site is

http://www.cs.amherst.edu/~dac/primes.html

I would like to thank the following people for the errors they found in the first edition and for the suggestions they made: Michael Baake, Dominique Bemardi, Jeff Beyerl, Reinier Bröker, Tony Feng, Nicholas Gavrielides, Lee Goswik, Christian Guenther, Shiv Gupta, Kazuo Hata, Yves Hellegouarach, Norm Hurt, Tim Hutchinson, Trevor Hyde, Maurice Kostas, Susumu Kuninaga, Franz Lemmermeyer, Joseph Lipman, Mario Magioladitis, David May, Stephen Mildenhall, Takashi Ono, Frans Oort, Alf van der Poorten, Jerry Shurman, Alice Silverberg, Neil Sloane, Steve Swanson, Cihangir Tezcan, Satoshi Tomabechi, Fan Xingyuan and Noriko Yui.

Please let me know if you find any errors in the new edition!

My hope is that the second edition of Primes of the Form x2 + ny2 will help bring this wonderful part of number theory to a new audience of students and researchers.

David A. Cox

Amherst, Massachsusetts
November 2012

NOTATION

The following standard notation will be used throughout the book.

images

Notation for Chapter One

images

Notation for Chapter Two

images

Notation for Chapter Three

images

Notation for Chapter Four

images
images

INTRODUCTION

Most first courses in number theory or abstract algebra prove a theorem of Fermat which states that for an odd prime p,

 

images

 

This is only the first of many related results that appear in Fermat's works. For example, Fermat also states that if p is an odd prime, then

 

images

 

These facts are lovely in their own right, but they also make one curious to know what happens for primes of the form x2 + 5y2, x2 + 6y2, etc. This leads to the basic question of the whole book, which we formulate as follows:

Basic Question 0.1. Given a positive integer n, which primes p can be expressed in the form

 

images

 

where x and y are integers?

We will answer this question completely, and along the way we will encounter some remarkably rich areas of number theory. The first steps will be easy, involving only quadratic reciprocity and the elementary theory of quadratic forms in two variables over images. These methods work nicely in the special cases considered above by Fermat. Using genus theory and cubic and biquadratic reciprocity, we can treat some more cases, but elementary methods fail to solve the problem in general. To proceed further, we need class field theory. This provides an abstract solution to the problem, but doesn't give explicit criteria for a particular choice of n in x2 + ny2. The final step uses modular functions and complex multiplication to show that for a given n, there is an algorithm for answering our question of when p = x2 + ny2.

This book has several goals. The first, to answer the basic question, has already been stated. A second goal is to bridge the gap between elementary number theory and class field theory. Although our basic question is simple enough to be stated in any beginning course in number theory, we will see that its solution is intimately bound up with higher reciprocity laws and class field theory. A related goal is to provide a well-motivated introduction to the classical formulation of class field theory. This will be done by carefully stating the basic theorems and illustrating their power in various concrete situations.

Let us summarize the contents of the book in more detail. We begin in Chapter One with the more elementary approaches to the problem, using the works of Fermat, Euler, Lagrange, Legendre and Gauss as a guide. In §1, we will give Euler's proofs of the above theorems of Fermat for primes of the form x2 + y2, x2 + 2y2 and x2 + 3y2, and we will see what led Euler to discover quadratic reciprocity. We will also discuss the conjectures Euler made concerning p = x2 + ny2 for n > 3. Some of these conjectures, such as

 

(0.2) images

 

are similar to Fermat's theorems, while others, like

 

images

 

are quite unexpected. For later purposes, note that this conjecture can be written in the following form:

 

(0.3) images

 

In §2, we will study Lagrange’s theory of positive definite quadratic forms. After introducing the basic concepts of reduced form and class number, we will develop an elementary form of genus theory which will enable us to prove (0 .2 ) and similar theorems. Unfortunately, for cases like (0.3), genus theory can only prove the partial result that

 

(0.4) images

 

The problem is that x2 + 27y2 and 4x2 + 2xy + 7y2 lie in the same genus and hence can't be separated by simple congruences. We will also discuss Legendre's tentative attempts at a theory of composition.

While the ideas of genus theory and composition were already present in the works of Lagrange and Legendre, the real depth of these theories wasn't revealed until Gauss came along. In §3 we will present some basic results in Gauss' Disquisitiones Arithmeticae, and in particular we will study the remarkable relationship between genus theory and composition. But for our purposes, the real breakthrough came when Gauss used cubic reciprocity to prove Euler's conjecture (0.3) concerning p = x2 + 27y2. In §4 we will give a careful statement of cubic reciprocity, and we will explain how it can be used to prove (0.3). Similarly, biquadratic reciprocity can be used to answer our question for x2 + 64y2. We will see that Gauss clearly recognized the role of higher reciprocity laws in separating forms of the same genus. This section will also begin our study of algebraic integers, for in order to state cubic and biquadratic reciprocity, we must first understand the arithmetic of the rings images[e2πi/3] and images[i].

To go further requires class field theory, which is the topic of Chapter Two. We will begin in §5 with the Hilbert class field, which is the maximal unramified Abelian extension of a given number field. This will enable us to prove the following general result:

Theorem 0.5. Let n = 1,2 mod 4 be a positive squarefree integer. Then there is an irreducible polynomial fn(x) ∈ images[x] such that for a prime p dividing neither n nor the discriminant of fn (x),

 

images

 

While the statement of Theorem 0.5 is elementary, the polynomial fn(x) is quite sophisticated: it is the minimal polynomial of a primitive element of the Hilbert class field L of images.

As an example of this theorem, we will study the case n = 14. We will show that the Hilbert class field of images is L = K(α), where images. By Theorem 0.5, this will show that for an odd prime p,

 

(0.6) images

 

which answers our basic question for x2 + 14y2. The Hilbert class field will also enable us in §6 to give new proofs of the main theorems of genus theory.

The theory sketched so far is very nice, but there are some gaps in it. The most obvious is that the above results for x2 + 27y2 and x2 + 14y2 ((0.3) and (0.6) respectively) both follow the same format, but (0.3) does not follow from Theorem 0.5, for n = 27 is not squarefree. There should be a unified theorem that works for all positive n, yet the proof of Theorem 0.5 breaks down for general n because images is not in general the full ring of integers in images.

The goal of §§7–9 is to show that Theorem 0.5 holds for all positive integers n. This, in fact, is the main theorem of the whole book. In §7 we will study the rings images for general n, which leads to the concept of an order in an imaginary quadratic field. In §8 we will summarize the main theorems of class field theory and the Čebotarev Density Theorem, and in §9 we will introduce a generalization of the Hilbert class field called the ring class field, which is a certain (possibly ramified) Abelian extension of images determined by the order images. Then, in Theorem 9.2, we will use the Artin Reciprocity Theorem to show that Theorem 0.5 holds for all n > 0, where the polynomial fn(x) is now the minimal polynomial of a primitive element of the above ring class field. To give a concrete example of what this means, we will apply Theorem 9.2 to the case x2 + 27y2, which will give us a class field theory proof of (0.3). In §§8 and 9 we will also discuss how class field theory is related to higher reciprocity theorems.

The major drawback to the theory presented in §9 is that it is not constructive: for a given n > 0, we have no idea how to find the polynomial fn(x). From (0.3) and (0.6), we know f27(x) and f14(x), but the methods used in these examples hardly generalize. Chapter Three will use the theory of complex multiplication to remedy this situation. In §10 we will study elliptic functions and introduce the idea of complex multiplication, and then in §11 we will discuss modular functions for the group Γ0(m) and show that the j-function can be used to generate ring class fields. As an example of the wonderful formulas that can be proved, in §12 we will give Weber's computation that

 

images

 

These methods will enable us to prove the Baker–Heegner–Stark Theorem on imaginary quadratic fields of class number 1. In §13 of the book we will discuss the class equation, which is the minimal polynomial of j(images). We will learn how to compute the class equation, which will lead to a constructive solution of p = x2 + ny2. We will then describe some work by Deuring and by Gross and Zagier. In 1946 Deuring proved a result about the difference of singular j-invariants, which implies an especially elegant version of our main theorem, and drawing on Deuring's work, Gross and Zagier discovered yet more remarkable properties of the class equation.

The first three chapters of the book present a complete solution to the problem of when p = x2 + ny2. In Chapter Four, we pursue two additional topics, elliptic curves in § 14 and Shimura reciprocity in § 15, that give a more modem approach to the study of complex multiplication. We also include applications to primality testing in §14. The new § 15 discusses ideles and the field of modular functions, and replaces certain pretty but ad-hoc arguments used in §12 with a more systematic treatment based on Shimura reciprocity. We also give an unexpected application to p = x2 + ny2.

Number theory is usually taught at three levels, as an undergraduate course, a beginning graduate course, or a more advanced graduate course. These levels correspond roughly to the first three chapters of the book. Chapter One requires only beginning number theory (up to quadratic reciprocity) and a semester of abstract algebra. Since the proofs of quadratic, cubic and biquadratic reciprocity are omitted, this book would be best suited as a supplementary text in a beginning course. For Chapter Two, the reader should know Galois theory and some basic facts about algebraic number theory (these are reviewed in §5), but no previous exposure to class field theory is assumed. The theorems of class field theory are stated without proof, so that this book would be most useful as a supplement to the topics covered in a first graduate course. Chapter Three requires a knowledge of complex analysis, but otherwise it is self-contained. (Brief but complete accounts of the Weierstrass images-function and modular functions are included in §§10 and 11.) This portion of the book should be suitable for use in a graduate seminar. The same is true for Chapter Four.

There are exercises at the end of each section, many of which consist of working out the details of arguments sketched in the text. Readers learning this material for the first time should find the exercises to be useful, while more sophisticated readers may skip them without loss of continuity.

Many important (and relevant) topics are not covered in the book. An obvious omission in Chapter One concerns forms such as x2 – 2y2, which were certainly considered by Fermat and Euler. Questions of this sort lead to Pell's equation and the class field theory of real quadratic fields. We have also ignored the problem of representing arbitrary integers, not just primes, by quadratic forms, and there are interesting questions to ask about the number of such representations (this material is covered in Grosswald's book [47]). In Chapter Two we give a classical formulation of class field theory, with only a brief mention of adeles and ideles. A more modem treatment can be found in Neukirch [80] or Weil [104] (see also the new §15). We also do not do justice to the use of analytic methods in number theory. For a nice introduction in the case of quadratic fields, see Zagier [111]. Our treatment of elliptic curves in Chapter Four is rather incomplete. See Husemoller [58], Knapp [A 14] or Silverman [93] for the basic theory, while more advanced topics are covered by Lang [73], Shimura [90] and Silverman [A21]. At a more elemenary level, there is the wonderful book [A22] by Silverman and Tate.

There are many books which touch on the number theory encountered in studying the problem of representing primes by x2 + ny2. Four books that we particularly recommend are Cohn's A Classical Invitation to Algebraic Numbers and Class Fields [19], Lang's Elliptic Functions [73], Scharlau and Opolka's From Fermat to Minkowski [86], and Weil's Number Theory: An Approach Through History [106]. These books, as well as others to be found in the References, open up an extraordinarily rich area of mathematics. The purpose of this book is to reveal some of this richness and to encourage the reader to learn more about it.

Notes on the Second Edition

The original text of the book consisted of §§1–14. For the second edition, we added the new §15 on Shimura reciprocity described above.

As a supplement to the references for the first edition, a new section Additional References has been added. The new references cited in the text are indicated with a leading “A” (e.g., the references Knapp [A14], Silverman [A21], and Silverman and Tate [A22] given above). This section also contains suggestions for further reading for the four chapters.

CHAPTER ONE

FROM FERMAT TO GAUSS

§1. FERMAT, EULER AND QUADRATIC RECIPROCITY

In this section we will discuss primes of the form x2 + ny2, where n is a fixed positive integer. Our starting point will be the three theorems of Fermat for odd primes p

 

(1.1) images

 

mentioned in the introduction. The goals of §1 are to prove (1.1) and, more importantly, to get a sense of what’s involved in studying the equation p = x2 + ny2 when n > 0 is arbitrary. This last question was best answered by Euler, who spent 40 years proving Fermat’s theorems and thinking about how they can be generalized. Our exposition will follow some of Euler’s papers closely, both in the theorems proved and in the examples studied. We will see that Euler’s strategy for proving (1.1) was one of the primary things that led him to discover quadratic reciprocity, and we will also discuss some of his remarkable conjectures concerning p = x2 + ny2 for n > 3. These conjectures touch on quadratic forms, composition, genus theory, cubic and biquadratic reciprocity, and will keep us busy for the rest of the chapter.

A. Fermat

Fermat’s first mention of p = x2 + y2 occurs in a 1640 letter to Mersenne [35, Vol. II, p. 212], while p = x2 + 2y2 and p = x2 + 3y2 come later, first appearing in a 1654 letter to Pascal [35, Vol. II, pp. 310–314]. Although no proofs are given in these letters, Fermat states the results as theorems. Writing to Digby in 1658, he repeats these assertions in the following form:

Every prime number which surpasses by one a multiple of four is composed of two squares. Examples are 5, 13, 17, 29, 37, 41, etc.

Every prime number which surpasses by one a multiple of three is composed of a square and the triple of another square. Examples are 7, 13, 19, 31, 37, 43, etc.

Every prime number which surpasses by one or three a multiple of eight is composed of a square and the double of another square. Examples are 3, 11, 17, 19, 41, 43, etc.

Fermat adds that he has solid proofs—“firmissimis demonstralibus” [35, Vol. II, pp. 402–408 (Latin), Vol. Ill, pp. 314–319 (French)].

The theorems (1.1) are only part of the work that Fermat did with x2 + ny2. For example, concerning x2 + y2, Fermat knew that a positive integer N is the sum of two squares if and only if the quotient of N by its largest square factor is a product of primes congruent to 1 modulo 4 [35, Vol. Ill, Obs. 26, pp. 256–257], and he knew the number of different ways N can be so represented [35, Vol. Ill, Obs. 7, pp. 243–246]. Fermat also studied forms beyond x2 + y2, x2 + 2y2 and x2 + 3y2. For example, in the 1658 letter to Digby quoted above, Fermat makes the following conjecture about x2 + 5y2, which he admits he can’t prove:

If two primes, which end in 3 or 7 and surpass by three a multiple of four, are multiplied, then their product will be composed of a square and the quintuple of another square.

Examples are the numbers 3, 7, 23, 43, 47, 67, etc. Take two of them, for example 7 and 23; their product 161 is composed of a square and the quintuple of another square. Namely 81, a square, and the quintuple of 16 equal 161.

Fermat’s condition on the primes is simply that they be congruent to 3 or 7 modulo 20. In §2 we will present Lagrange’s proof of this conjecture, which uses ideas from genus theory and the composition of forms.

Fermat’s proofs used the method of infinite descent, but that’s often all he said. As an example, here is Fermat’s description of his proof for p = x2 + y2 [35, Vol. II, p. 432]:

If an arbitrarily chosen prime number, which surpasses by one a multiple of four, is not a sum of two squares, then there is a prime number of the same form, less than the given one, and then yet a third still less, etc., descending infinitely until you arrive at the number 5, which is the least of all of this nature, from which it would follow was not the sum of two squares. From this one must infer, by deduction of the impossible, that all numbers of this form are consequently composed of two squares.

This explains the philosophy of infinite descent, but doesn’t tell us how to produce the required lesser prime. We have only one complete proof by Fermat. It occurs in one of his marginal notes (the area of a right triangle with integral sides cannot be an integral square [35, Vol. Ill, Obs. 45, pp. 271–272]—for once the margin was big enough!). The methods of this proof (see Weil [106, p. 77] or Edwards [31, pp. 10–14] for modem expositions) do not apply to our case, so that we are still in the dark. An analysis of Fermat’s approach to infinite descent appears in Bussotti [A5]. Weil’s book [106] makes a careful study of Fermat’s letters and marginal notes, and with some hints from Euler, he reconstructs some of Fermat’s proofs. Weil’s arguments are quite convincing, but we won’t go into them here. For the present, we prefer to leave things as Euler found them, i.e., wonderful theorems but no proofs.

B. Euler

Euler first heard of Fermat’s results through his correspondence with Goldbach. In fact, Goldbach’s first letter to Euler, written in December 1729, mentions Fermat’s conjecture that 22n + 1 is always prime [40, p. 10]. Shortly thereafter, Euler read some of Fermat’s letters that had been printed in Wallis’ Opera [100] (which included the one to Digby quoted above). Euler was intrigued by what he found. For example, writing to Goldbach in June 1730, Euler comments that Fermat’s four-square theorem (every positive integer is a sum of four or fewer squares) is a “non inelegans theorema” [40, p. 24]. For Euler, Fermat’s assertions were serious theorems deserving of proof, and finding the proofs became a life-long project. Euler’s first paper on number theory, written in 1732 at age 25, disproves Fermat’s claim about 22n + 1 by showing that 641 is a factor of 232 + 1 [33, Vol. II, pp. 1–5]. Euler’s interest in number theory continued unabated for the next 51 years—there was a steady stream of papers introducing many of the fundamental concepts of number theory, and even after his death in 1783, his papers continued to appear until 1830 (see [33, Vol. IV–V]). Weil’s book [106] gives a survey of Euler’s work on number theory (other references are Burkhardt [14], Edwards [31, Chapter 2], Scharlau and Opolka [86, Chapter 3], and the introductions to Volumes II–V of Euler’s collected works [33]).

We can now present Euler’s proof of the first of Fermat’s theorems from (1.1):

Theorem 1.2. An odd prime p can be written as x2 + y2 if and only if p ≡ 1 mod 4.

Proof. If p = x2 + y2, then congruences modulo 4 easily imply that p ≡ 1 mod 4. The hard work is proving the converse. We will give a modem version of Euler’s proof. Given an odd prime p, there are two basic steps to be proved:

 

images

 

It will soon become clear why we use the names “Descent” and “Reciprocity.”

We’ll do the Descent Step first since that’s what happened historically. The argument below is taken from a 1747 letter to Goldbach [40, pp. 416–419] (see also [33, Vol. II, pp. 295–327]). We begin with the classical identity

 

(1.3) images

 

(see Exercise 1.1) which enables one to express composite numbers as sums of squares. The key observation is the following lemma:

Lemma 1.4. Suppose that N is a sum of two relatively prime squares, and that q = x2 + y2 is a prime divisor of N. Then N/q is also a sum of two relatively prime squares.

Proof. Write N = a2 + b2, where a and b are relatively prime. We also have q = x2 + y2 and thus q divides

 

images

 

Since q is prime, it divides one of these two factors, and changing the sign of a if necessary, we can assume that q|xbay. Thus xbay = dq for some integer d.

We claim that x | a + dy. Since x and y are relatively prime, this is equivalent to x | (a + dy)y. However,

 

images

 

which is obviously divisible by x. Furthermore, if we set a + dy = cx, then the above equation implies that b = dx + cy. Thus we have

 

(1.5) images

 

Then, using (1.3), we obtain

 

images

 

Thus N/q = c2 + d2 is a sum of squares, and (1.5) shows that c and d must be relatively prime since a and b are. This proves the lemma. Q.E.D.

To complete the proof of the Descent Step, let p be an odd prime dividing N = a2 + b2, where a and b are relatively prime. If a and b are changed by multiples of p, we still have p | a2 + b2. We may thus assume that |a| < p/2 and |b| < p/2, which in turn implies that N < p2/2. The new a and b may have a greatest common divisor d > 1, but p doesn’t divide d, so that dividing a and b by d, we may assume that p | N, N < p2/2, and N = a2 + b2 where gcd (a, b) = 1. Then all prime divisors qp of N are less than p. If q were a sum of two squares, then Lemma 1.4 would show that N/q would be a multiple of p that is again a sum of two squares. If all such q’s were sums of two squares, then repeatedly applying Lemma 1.4 would imply that p itself was of the same form. So if p is not a sum of two squares, there must be a smaller prime q with the same property. Since there is nothing to prevent us from repeating this process indefinitely, we get an infinite decreasing sequence of prime numbers. This contradiction finishes the Descent Step.

This is a classical descent argument, and as Weil argues [106, pp. 68–69], it is probably similar to what Fermat did. In §2 we will take another approach to the Descent Step, using the reduction theory of positive definite quadratic forms.

The Reciprocity Step caused Euler a lot more trouble, taking him until 1749. Euler was clearly relieved when he could write to Goldbach “Now have I finally found a valid proof” [40, pp. 493–495]. The basic idea is quite simple: since p = 1 mod 4, we can write p = 4k + 1. Then Fermat’s Little Theorem implies that

 

images

 

for all x images 0 mod p. If x2k – 1 images 0 mod p for one such x, then p | x2k + 1, so that p divides a sum of relatively prime squares, as desired. For us, the required x is easy to find, since x2k – 1 is a polynomial over the field images/pimages and hence has at most 2k < p – 1 roots. Euler’s first proof is quite different, for it uses the calculus of finite differences—see Exercise 1.2 for details. This proves Fermat’s claim (1.1) for primes of the form x2 + y2. Q.E.D.

Euler used the same two-step strategy in his proofs for x2 + 2y2 and x2 + 3y2. The Descent Steps are

 

images

 

and the Reciprocity Steps are

 

images

 

where p is always an odd prime. In each case, the Reciprocity Step was harder to prove than the Descent Step, and Euler didn’t succeed in giving complete proofs of Fermat’s theorems (1.1) until 1772, 40 years after he first read about them. Weil discusses the proofs for x2 + 2y2 and x2 + 3y2 in [106, pp. 178–179, 191, and 210–212], and in Exercises 1.4 and 1.5 we will present a version of Euler’s argument for x2 + 3y2.

C. p = x2 + ny2 and Quadratic Reciprocity

Let’s turn to the general case of p = x2 + ny2, where n is now any positive integer. To study this problem, it makes sense to start with Euler’s two-step strategy. This won’t lead to a proof, but the Descent and Reciprocity Steps will both suggest some very interesting questions for us to pursue.

The Descent Step for arbitrary n > 0 begins with the identity

 

(1.6) images

 

(see Exercise 1.1), and Lemma 1.4 generalizes easily for n > 0 (see Exercise 1.3). Then suppose that p | x2 + ny2. As in the proof of the Descent Step in Theorem 1.2, we can assume that |x|, |y| ≤ p/2. For n ≤ 3, it follows that x2 + ny2 < p2 when p is odd, and then the argument from Theorem 1.2 shows that p is of the form x2 + ny2 (see Exercise 1.4). One might conjecture that this holds in general, i.e., that p | x2 + ny2 always implies p = x2 + ny2. Unfortunately this fails even for n = 5: for example, 3 | 21 = 12 + 5 · 22 but 3 ≠ x2 + 5y2. Euler knew this, and most likely so did Fermat (remember his speculations about x2 + 5y2). So the question becomes: how are prime divisors of x2 + ny2 to be represented? As we will see in §2, the proper language for this is Lagrange’s theory of quadratic forms, and a complete solution to the Descent Step will follow from the properties of reduced forms.

Turning to the Reciprocity Step for n > 0, the general case asks for congruence conditions on a prime p which will guarantee p | x2 + ny2. To see what kind of congruences we need, note that the conditions of (1.1) can be unified by working modulo 4n. Thus, given n > 0, we’re looking for a congruence of the form p ≡ α, β,… mod 4n which implies p | x2 + ny2, gcd(x, y) = 1. To give a modem formulation of this last condition, we first define the Legendre symbol (a/p). If a is an integer and p an odd prime, then

 

images

 

We can now restate the condition for p | x2 + ny2 as follows:

Lemma 1.7. Let n be a nonzero integer, and let p be an odd prime not dividing n. Then

 

images

 

Proof. The basic idea is that if x2 + ny2 = 0 mod p and gcd(x, y) = 1, then y must be relatively prime to p and consequently has a multiplicative inverse modulo p. The details are left to the reader (see Exercise 1.6). Q.E.D.

The arguments of the above lemma are quite elementary, but for Euler they were not so easy—he first had to realize that quadratic residues were at the heart of the matter. This took several years, and it’s fun to watch his terminology evolve: in 1744, he writes “prime divisors of numbers of the form aa – Nbb” [33, Vol. II, p. 216]; by 1747 this changes to “residues arising from the division of squares by the prime p” [33, Vol. II, p. 313]; and by 1751 the transition is complete—Euler now uses the terms “residua” and “non-residua” freely, with the “quadratic” being understood [33, Vol. II, p. 343].

Using Lemma 1.7, the Reciprocity Step can be restated as the following question: is there a congruence pα, β,… mod 4n which implies (–n/p) = 1 when p is prime? This question also makes sense when n < 0, and in the following discussion n will thus be allowed to be positive or negative. We will see in Corollary 1.19 that the full answer is intimately related to the law of quadratic reciprocity, and in fact the Reciprocity Step was one of the primary things that led Euler to discover quadratic reciprocity.

Euler became intensely interested in this question in the early 1740s, and he mentions numerous examples in his letters to Goldbach. In 1744 Euler collected together his examples and conjectures in the paper Theoremata circa divisores numerorum in hac forma paa±qbb contentorum [33, Vol. II, pp. 194–222]. He labels his examples as “theorems,” but they are really “theorems found by induction,” which is eighteenth-century parlance for conjectures based on working out some particular cases. Here are of some of Euler’s conjectures, stated in modem notation:

 

(1.8) images

 

where p is an odd prime not dividing n. In looking for a unifying pattern, the bottom three look more promising because of the ±’s. If we rewrite the bottom half of (1.8) using 11 = –9 mod 20 and 3 = –25 mod 28, we obtain

 

images

 

All of the numbers that appear are odd squares!

Before getting carried away, we should note another of Euler’s conjectures:

 

images

 

Unfortunately, ±5 is not a square modulo 24, and the same thing happens for (10/p) and (14/p) . But 3, 5 and 7 are prime, while 6, 10 and 14 are composite. Thus it makes sense to make the following conjecture for the prime case:

Conjecture 1.9. If p and q are distinct odd primes, then

 

images

 

The remarkable fact is that this conjecture is equivalent to the usual statement of quadratic reciprocity:

Proposition 1.10. If p and q are distinct odd primes, then Conjecture 1.9 is equivalent to

 

images

 

Proof. Let p* = (–1)(p–1)/2p. Then the standard properties

 

(1.11) images

 

of the Legendre symbol easily imply that quadratic reciprocity is equivalent to

 

(1.12) images

 

(see Exercise 1.7). Since both sides are ±1, it follows that quadratic reciprocity can be stated as

 

images

 

Comparing this to Conjecture 1.9, we see that it suffices to show

 

(1.13) images

 

The proof of (1.13) is straightforward and is left to the reader (see Exercise 1.8). Q.E.D.

With hindsight, we can see why Euler had trouble with the Reciprocity Steps for x2 + 2y2 and x2 + 3y2: he was working out special cases of quadratic reciprocity! Exercise 1.9 will discuss which special cases were involved. We will not prove quadratic reciprocity in this section, but later in §8 we will give a proof using class field theory. Proofs of a more elementary nature can be found in most number theory texts.?

The discussion leading up to Conjecture 1.9 is pretty exciting, but was it what Euler did? The answer is yes and no. To explain this, we must look more closely at Euler’s 1744 paper. In addition to conjectures like (1.8), the paper also contained a series of Annotations where Euler speculated on what was happening in general. For simplicity, we will concentrate on the case of (N/p), where N > 0. Euler notes in Annotation 13 [33, Vol. II, p. 216] that for such N’s, all of the conjectures have the form

 

images

 

for certain odd values of α. Then in Annotation 16 [33, Vol. II, pp. 216–217], Euler states that “while 1 is among the values [of the α’s], yet likewise any square number, which is prime to 4N, furnishes a suitable value for α.” This is close to what we want, but it doesn’t say that the odd squares fill up all possible α’s when N is prime. For this, we turn to Annotation 14 [33, Vol. II, p. 216], where Euler notes that the number of α’s that occur is (1/2)ϕ(N). When N is prime, this equals (N – 1)/2, the number of incongruent squares modulo 4N relatively prime to 4N. Thus what Euler states is fully equivalent to Conjecture 1.9. In 1875, Kronecker identified these Annotations as the first complete statement of quadratic reciprocity [68, Vol. II, pp. 3–4].

The problem is that we have to read between the lines to get quadratic reciprocity. Why didn’t Euler state it more explicitly? He knew that the prime case was special, for why else would he list the prime cases before the composite ones? The answer to this puzzle, as Weil points out [106, pp. 207–209], is that Euler’s real goal was to characterize the α’s for all N, not just primes. To explain this, we need to give a modem description of the ±α’s. The following lemma is at the heart of the matter:

Lemma 1.14. If D ≡ 0,1 mod 4 is a nonzero integer, then there is a unique homomorphism χ : (images/Dimages) → {±1} such that χ([p]) = (D/p) for odd primes p not dividing D. Furthermore,

 

images

 

Proof. The proof will make extensive use of the Jacobi symbol. Given m > 0 odd and relatively prime to M, recall that the Jacobi symbol (M/m) is defined to be the product

 

images

 

where m = p1 ··· pr is the prime factorization of m. Note that (M/m) = (N/m) when MN mod m, and there are the multiplicative identities

 

(1.15) images

 

(see Exercise 1.10). The Jacobi symbol satisfies the following version of quadratic reciprocity:

 

(1.16) images

 

(see Exercise 1.10).

For this lemma, the crucial property of the Jacobi symbol is one usually not mentioned in elementary texts: if mn mod D, where m and n are odd and positive and D ≡ 0, 1 mod 4, then

 

(1.17) images

 

The proof is quite easy when D ≡ 1 mod 4 and D > 0: using quadratic reciprocity (1.16), the two sides of (1.17) become

 

(1.18) images

 

To compare these expressions, first note that the two Jacobi symbols are equal since mn mod D. From D ≡ 1 mod 4 we see that

 

images

 

since m and n are odd. Thus the signs in front of (1.18) are both +1, and (1.17) follows. When D is even or negative, a similar argument using the supplementary laws from (1.16) shows that (1.17) still holds (see Exercise 1.11).

It follows from (1.17) that χ([m|) = (D/m) gives a well-defined homomorphism from (images/Dimages)* to {±1} (see Exercise 1.12), and the statement concerning χ([–1]) follows from the above properties of the Jacobi symbol (see Exercise 1.12). Finally, the condition that χ([p]) = (D/p) for odd primes p determines χ uniquely follows because χ is a homomorphism and every class in (images/Dimages)* contains a positive odd number (hence a product of odd primes) by part (a) of Exercise 1.12. Q.E.D.

The above proof made heavy use of quadratic reciprocity, which is no accident: Lemma 1.14 is in fact equivalent to quadratic reciprocity and the supplementary laws (see Exercise 1.13). For us, however, the main feature of Lemma 1.14 is that it gives a complete solution of the Reciprocity Step of Euler’s strategy:

Corollary 1.19. Let n be a nonzero integer, and let χ : (images/4nimages)* → { ± 1} be the homomorphism from Lemma 1.14 when D = 4–n. If p is an odd prime not dividing n, then the following are equivalent:

(i) p | x2 + ny2, gcd(x, y) = 1.
(ii) (–n/p) = 1.
(iii) [p] ∈ ker(χ) ⊂ (images/4nimages)*.

Proof. (i) and (ii) are equivalent by Lemma 1.7, and since (4–n/p) = (–n/p), (ii) and (iii) are equivalent by Lemma 1.14. Q.E.D.

To see how this solves the Reciprocity Step, write ker(χ) = {[α],[β], [γ],…} Then [p] ∈ ker(χ) is equivalent to the congruence pα, β, γ,… mod 4n, which is exactly the kind of condition we were looking for. Actually, Lemma 1.14 allows us to refine this a bit: when n ≡ 3 mod 4, then congruence can be taken to be of the form pα, β, γ,… mod n (see Exercise 1.14). We should also note that in all cases, the usual statement of quadratic reciprocity makes it easy to compute the classes in question (see Exercise 1.15 for an example).

To see how this relates to what Euler did in 1744, let N be as in our discussion of Euler’s Annotations, and let D = 4N in Lemma 1.14. Then ker(χ) consists exactly of Euler’s ±α’s (when N > 0, the lemma also implies that – 1 ∈ ker(χ), which explains the ± signs). The second thing to note is that when N is odd and squarefree, K = ker(χ) is uniquely characterized by the following four properties:

(i) K is a subgroup of index 2 in (images/4Nimages)*.
(ii) – 1 ∈ K when N > 0 and – 1 ∉ K when N < 0.
(iii) K has period N if N ≡ 1 mod 4 and period 4N otherwise. (Having period P > 0 means that if [a], [b] ∈ (images/4Nimages)*, [a] ∈ K and ab mod P, then [b] ∈ K.)
(iv) K does not have any smaller period.

NNimagesχ