Johannes Bl omer - NYU Computer Science · computer scientists ([BFHT],[HH],[La2],[La3],[Z]). Many...

Johannes Blömer: Simplifying Expressions Involving Radicals (Dissertation)


Johannes Blömer

Simplifying Expressions Involving Radicals

Dissertation


Simplifying Expressions Involving Radicals

Dissertation for the attainment of the doctoral degree

submitted to the Department of Mathematics of the Freie Universität Berlin

1993

by Johannes Blömer

defended on 29 January 1993

Advisor: Prof. Dr. Helmut Alt, Freie Universität Berlin

Second reviewer: Prof. Dr. Chee Yap, Courant Institute, New York University


Contents

1 Introduction

2 Basics from Algebra and Algebraic Number Theory

3 The Structure of Radical Extensions

4 Radicals over the Rational Numbers
4.1 Linear Dependence of Radicals over the Rational Numbers
4.2 Comparing Sums of Square Roots

5 Linear Dependence of Radicals over Algebraic Number Fields
5.1 Definitions and Bounds
5.2 Lattice Basis Reduction and Reconstructing Algebraic Numbers
5.3 Ratios of Radicals in Algebraic Number Fields
5.4 Approximating Radicals and Ratios of Radicals
5.5 A Probabilistic Test for Equality
5.6 Sums of Radicals over Algebraic Number Fields

6 Denesting Radicals - The Basic Results
6.1 The Basic Theorems
6.2 Denesting Sets and Reduction to Simple Radical Extensions
6.3 Characterizing Denesting Elements
6.4 Characterizing Admissible Sequences
6.5 Denesting Radicals - The Algorithms

7 Denesting Radicals - The Analysis
7.1 Preliminaries
7.2 Description and Analysis of Step 1
7.3 Description and Analysis of Step 2
7.4 Description and Analysis of Step 3
7.5 The Final Results

Appendix: Roots of Unity in Radical Extensions of the Rational Numbers

Summary

Summary in German (Zusammenfassung)

Curriculum Vitae (Lebenslauf)

1 Introduction

Problems and Results. An important issue in symbolic computation is the simplification of expressions. Since many algorithms in Computer Algebra systems like Mathematica, Maple, and Reduce work in quite general settings, they do not necessarily find a solution to a given problem described in the easiest possible way. Simplification algorithms can be applied to express these solutions in a form that is more convenient for later use, for example, to determine whether the solution itself or the difference of two such solutions is zero.

In this thesis we consider simplification algorithms for radical expressions. A radical over a field $F$ is a root $\sqrt[d]{\rho}$ of some element $\rho$ in the field $F$. We prefer the name radical since there are $d$ different roots of $\rho$ and in general we may not be referring to an arbitrary $d$-th root of $\rho$ but to a certain fixed value of $\sqrt[d]{\rho}$. A radical expression over $F$ is any expression built from the usual arithmetic operations and from, possibly nested, roots.

Dealing with radicals has a long history in mathematics. For example, Galois Theory emerged from the problem of solving polynomials by radicals. It seems that in Computer Science people first got interested in radicals through their connection to the bit complexity of certain optimization problems such as the Traveling Salesman Problem (TSP) or the Shortest Path Problem (SPP). In fact, in any solution to these problems the lengths of tours or paths have to be compared. If the cities in the TSP have rational coordinates, then the length of a path is given by a sum of square roots of rational numbers. Therefore, in order to compare the lengths of two tours, the sign of a sum of square roots has to be determined.

This quite innocent looking problem turned out to be extremely difficult, and until now no efficient solution, not even a promising approach, is known. By efficient we mean an algorithm that determines the sign by a number of bit operations that is polynomial in the length of the sum and in the bit size of the rational numbers.

It was exactly this problem that stimulated our interest in the question we answer in the first part of this thesis. We show that although the sign of a sum of square roots may not be computable in polynomial time, it is nevertheless possible to decide in polynomial time whether a sum of square roots is zero. Perhaps somewhat surprisingly, the solution turns out to be quite simple. Moreover, with some effort we extend the result to sums of real radicals of arbitrary degree and generalize the solution by replacing the field $\mathbb{Q}$ by real algebraic number fields.


These fields are easily defined as follows. Let $\alpha$ be the root of a polynomial with rational coefficients. The smallest field containing $\mathbb{Q}$ and $\alpha$ is called the algebraic number field generated by $\alpha$ and is denoted by $\mathbb{Q}(\alpha)$. Since exact arithmetic is possible in these fields they play an important role in symbolic computation (see for example [Lo1]).

To determine whether a sum of radicals over the field $\mathbb{Q}$, say, is zero, it is first transformed into a sum in which the radicals that appear are linearly independent. Then it is easy to check whether the original sum is zero: this happens if and only if all coefficients in the sum resulting from the transformation step are zero.

The algorithm that transforms an arbitrary sum $S$ of real radicals into a sum of linearly independent radicals is based on the following result due to C. L. Siegel [Si].

Let $F$ be any real field, i.e., $F \subseteq \mathbb{R}$. If for positive integers $d_i$ and elements $\rho_i \in F$, $i = 1, \ldots, k$, we have $\sqrt[d_i]{\rho_i} \in \mathbb{R}$ and

$$\frac{\sqrt[d_i]{\rho_i}}{\sqrt[d_j]{\rho_j}} \notin F \quad \text{for all } i \neq j,$$

then $\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k}$ are linearly independent over $F$.

Hence we easily do the transformation by determining for any pair of radicals in $S$ whether their ratio is in $F$, by computing the representation of the ratio in $F$, and by finally collecting terms in order to determine the coefficients.
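For the special case of square roots of positive integers over $\mathbb{Q}$, the transformation just described can be sketched in a few lines of Python. This is only an illustration under that restriction, not the algorithm analyzed in this thesis. It relies on the fact that $\sqrt{a}/\sqrt{b}$ is rational exactly when $ab$ is a perfect square, in which case $\sqrt{a} = (\sqrt{ab}/b)\sqrt{b}$. All function names are hypothetical.

```python
from fractions import Fraction
from math import isqrt
from typing import Optional

def ratio_if_rational(a: int, b: int) -> Optional[Fraction]:
    """Return sqrt(a)/sqrt(b) as a Fraction if it is rational, else None."""
    r = isqrt(a * b)
    if r * r == a * b:
        return Fraction(r, b)  # sqrt(a)/sqrt(b) = sqrt(a*b)/b
    return None

def reduce_sum(terms):
    """terms: list of (coefficient, radicand) with positive integer radicands.

    Returns a list of [coefficient, radicand] whose square roots have
    pairwise irrational ratios, so they are linearly independent over Q
    by Siegel's theorem.
    """
    reps = []
    for c, a in terms:
        for rep in reps:
            q = ratio_if_rational(a, rep[1])
            if q is not None:
                rep[0] += Fraction(c) * q  # collect terms
                break
        else:
            reps.append([Fraction(c), a])
    return reps

def is_zero(terms) -> bool:
    """The sum is zero iff every collected coefficient is zero."""
    return all(c == 0 for c, _ in reduce_sum(terms))
```

For example, `is_zero([(1, 8), (1, 18), (-1, 50)])` tests $\sqrt{8} + \sqrt{18} - \sqrt{50}$, which collapses to a single multiple of $\sqrt{8}$ with coefficient $1 + 3/2 - 5/2 = 0$.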

If we were satisfied with an algorithm whose run time is polynomial in the degrees $d_i$ rather than in $\log d_i$, the ratio test would be simple. But to reduce the run time to a polynomial in $\log d_i$, even in the case $F = \mathbb{Q}$, we have to work much harder. For algebraic number fields we achieve this reduction only by allowing the algorithm to give an incorrect answer with small probability, that is, by an algorithm of Monte Carlo type.

We also show how to determine for sums of complex radicals over certain complex algebraic number fields whether they are zero. In that case, however, the run times are polynomial only in the degrees of the radicals themselves.

The result of Siegel mentioned above has some history. A special case of it was first proven in 1940 by A. S. Besicovitch [Be]. This result was slightly generalized in 1953 by L. J. Mordell until in 1971 Siegel proved the theorem in the form stated above. But this is still not the end of the story because in 1974 M. Kneser generalized Siegel's result to certain complex fields. In this thesis we also contribute a bit to this history by giving a fairly short and simple proof of Siegel's Theorem. Moreover, we show how it can be used to prove and generalize certain results from Kummer Theory.

By generalizing the results above to sums of radicals that are not necessarily defined over the same field we naturally hit upon the problem of simplifying or denesting nested radicals.

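In recent years this problem has been studied intensively by various mathematicians and computer scientists ([BFHT], [HH], [La2], [La3], [Z]). Many of them have been attracted by equations of Ramanujan [R] such as the following:

$$\sqrt[3]{\sqrt[3]{2} - 1} = \sqrt[3]{\tfrac{1}{9}} - \sqrt[3]{\tfrac{2}{9}} + \sqrt[3]{\tfrac{4}{9}}$$

$$\sqrt{\sqrt[3]{5} - \sqrt[3]{4}} = \tfrac{1}{3}\left(\sqrt[3]{2} + \sqrt[3]{20} - \sqrt[3]{25}\right)$$

$$\sqrt[6]{7\sqrt[3]{20} - 19} = \sqrt[3]{\tfrac{5}{3}} - \sqrt[3]{\tfrac{2}{3}}.$$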

These examples sufficiently explain the problem itself. Denesting a radical expression means decreasing its nesting depth over a field $F$. For example, the depth over $\mathbb{Q}$ of the left-hand sides of Ramanujan's equations is 2 while the depth of the right-hand sides is 1.
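A quick floating-point check, which is of course no proof, makes the three identities plausible. The helper `root` and the tolerance are our own choices for this illustration; all roots are taken as positive real roots.

```python
import math

def root(x: float, d: int) -> float:
    """Positive real d-th root of a positive float x."""
    return x ** (1.0 / d)

# Left-hand sides have nesting depth 2, right-hand sides depth 1.
lhs1 = root(root(2, 3) - 1, 3)
rhs1 = root(1 / 9, 3) - root(2 / 9, 3) + root(4 / 9, 3)

lhs2 = root(root(5, 3) - root(4, 3), 2)
rhs2 = (root(2, 3) + root(20, 3) - root(25, 3)) / 3

lhs3 = root(7 * root(20, 3) - 19, 6)
rhs3 = root(5 / 3, 3) - root(2 / 3, 3)

checks = [(lhs1, rhs1), (lhs2, rhs2), (lhs3, rhs3)]
# A loose tolerance: the third left-hand side suffers from cancellation.
all_close = all(math.isclose(l, r, rel_tol=1e-9) for l, r in checks)
print(all_close)
```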

If we are given a nested radical and are asked to denest it, then this is at first not a meaningful question because for different values of the roots involved we may get different denestings. For example, if $\sqrt[3]{\sqrt{5} + 2} - \sqrt[3]{\sqrt{5} - 2}$ has to be denested, in general, we will assume that the two square roots have the same value and that perhaps the real third roots are meant. In this case the formula denests to 1. But if one of the square roots is positive and the other one negative then the formula denests to a square root of 5 provided we still take real third roots. And for the complex third roots the denestings are again different. So the values of the roots involved in the nested radical have to be specified in advance. For example, in case of real radicals of even degree we will assume in this thesis that their value is given by the positive real root.
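The dependence on the chosen root values can be observed numerically. The sketch below is a floating-point illustration only. It evaluates the expression once with both square roots positive and once with the second square root replaced by its negative, taking real third roots in both cases. The helper `real_cbrt` is a hypothetical name. In the second case the computed value agrees with $1 + \sqrt{5}$, a depth 1 expression involving a square root of 5.

```python
import math

def real_cbrt(x: float) -> float:
    """Real third root, also for negative arguments."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

s = math.sqrt(5)

# Both square roots take the positive value sqrt(5).
same = real_cbrt(s + 2) - real_cbrt(s - 2)

# The second square root takes the negative value -sqrt(5).
mixed = real_cbrt(s + 2) - real_cbrt(-s - 2)

print(same, mixed)
```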

In recent years a lot of progress has been made on the problem of denesting radicals. Landau [La2] showed how to compute a denesting for a nested radical whose nesting depth is just one off the optimal one. Horng and Huang [HH] achieved a minimal denesting and also showed how to solve a polynomial by a radical of minimum nesting depth, if it is solvable by radicals at all. However, in both cases the denestings are with respect to a field containing all roots of unity. As it turns out, for any radical expression a single root of unity can be determined such that a denesting over the field generated by this root of unity is (almost, in the case of [La2]) a minimum depth denesting of the expression over the field containing all roots of unity. The degree of this root, however, is either single- (Landau) or even double-exponential (Horng, Huang) in the description size of the minimal polynomials of the expressions to be denested.

In particular, these results cannot explain or compute the special form of Ramanujan's examples above since in these examples no roots of unity are required. But even if they did compute these denestings they would do so very inefficiently. Moreover, it looks like the approach of Landau and of Horng and Huang cannot yield efficient algorithms. They use Galois Theory and have to compute the splitting field of the radical expression they want to denest, that is, the field generated by all roots of the minimal polynomial of the radical. The degree of this extension may be exponential in the degree of the roots appearing in the radical expression.

In this thesis we follow a different approach that leads to several efficient denesting algorithms, although these algorithms produce minimum depth denestings only of a restricted kind. Therefore they are not really comparable to the results of Landau and of Horng and Huang. On the other hand, our theoretical results do explain Ramanujan's denestings and our algorithms efficiently determine them. Moreover, several of our results indicate that there is good reason to believe that the methods of this thesis may even lead to more efficient algorithms for minimum depth denestings in general.

To be more specific, we prove for example that if $\gamma$ is an element of some field $F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k})$, $F \subset \mathbb{R}$, $\rho_i \in F$, $\sqrt[d_i]{\rho_i} \in \mathbb{R}$, and if $\sqrt[d]{\gamma}$ is a real nested radical of depth 2, then $\sqrt[d]{\gamma}$ can be written as a depth 1 expression over $F$ if and only if there exists an element $\gamma_0 \in F$ such that

$$\frac{\sqrt[dN]{\gamma_0}}{\sqrt[d]{\gamma}} \in F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k}).$$

Here $N$ is the degree of the extension $F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k}) : F$. This theorem is a generalization of a similar result due to Borodin et al. [BFHT] that is restricted to certain expressions of square roots.

We then show how to compute the element $\gamma_0$ efficiently. This problem was considered before in [Z] and [La3] although the authors did not know the theorem above and hence were not aware of the fact that elements $\gamma_0$ as in the theorem lead to the only possible form of a denesting. Basically we improve the algorithms in these two papers from exponential to polynomial time and achieve the first denesting algorithms whose run times are polynomial in the maximal output size.

Before we finish this brief description of the results obtained in this thesis let us mention the basic technique of our algorithms. Using a variant of the Kannan, Lenstra, and Lovász algorithm to (re-)construct minimal polynomials we show how to compute in polynomial time the exact representation of an element $\beta$ in an algebraic number field $\mathbb{Q}(\alpha)$, provided an upper bound $B$ on the representation size of $\beta$ is given and an approximation $\bar{\beta}$ to $\beta$ is known such that the number of correct bits in $\bar{\beta}$ is roughly quadratic in $B$. We believe that this technique will have a lot of applications in algorithmic algebra and number theory.

The Model of Computation. Throughout this thesis we are only interested in the bit complexity of problems. But this does not uniquely define a model of computation. We can assume for example a (multi-tape) Turing Machine as it is defined in any standard textbook on algorithm theory like [AHU], [CLR]. But we can also use the following variant of a Random Access Machine (RAM) as it is defined in [Sc2].

As usual the RAM has a storage consisting of an infinite number of locations indexed by the positive integers. Each location can store an arbitrary integer. Furthermore the RAM has a so-called processing unit.

The operations allowed consist of the usual storage operations like "load" and "store". But the only arithmetic operation allowed is the successor function. Finally comparisons are allowed. The time needed for each operation is defined as the logarithm log¹ of the maximum bit size of the operands. For further details we refer to [Sc2].

Yet another model of computation is the Pointer Machine, which is also defined precisely in [Sc2].

In these models the best currently known upper bounds for multiplying two $n$-bit integers are pairwise distinct. They are $O(n \log n \log\log n)$, $O(n \log n)$, and $O(n)$, respectively. To make our results easily applicable to these and other models of computation we chose to represent the run times in terms of the number of what we call elementary operations and in terms of the maximum bit size of the numbers on which these operations are performed.

¹ In this thesis log always denotes the logarithm to the base 2; the logarithm to the base e is denoted by ln.


We define precisely what numbers we allow and which operations are considered as elementary.

First, we consider integers $z$ represented in binary. We call $z$ an $n$-bit integer if its binary length is $n$. Elementary operations on integers are additions, subtractions, multiplications, and divisions with remainder.

But we also consider floating-point numbers. Real floating-point numbers are represented by a pair $(b, m) \in \mathbb{Z} \times \mathbb{Z}$ which represents the number $b \cdot 2^m$. The pair $(b, m)$ is called an $n$-bit floating-point number if the sum of the bit size of $b$ and of the bit size of $m$ is $n$. Complex floating-point numbers are represented by a pair of real floating-point numbers, the first being the real part, the second one being the imaginary part. An $n$-bit complex floating-point number is a complex floating-point number such that the sum of the representation sizes of the real and imaginary parts is $n$.
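To make the definition concrete, the following sketch models a real floating-point number as a pair $(b, m)$ of Python integers. The function names are hypothetical, and the convention that the bit size of an integer is the binary length of its absolute value (at least 1) is an assumption made for this illustration; the text above does not fix how signs are counted.

```python
from fractions import Fraction

def fp_value(b: int, m: int) -> Fraction:
    """Exact rational value b * 2**m of the pair (b, m)."""
    return Fraction(b) * Fraction(2) ** m

def bit_size(z: int) -> int:
    """Binary length of |z|, counted as at least 1 (assumed convention)."""
    return max(1, abs(z).bit_length())

def fp_bits(b: int, m: int) -> int:
    """Size n of the n-bit floating-point number (b, m)."""
    return bit_size(b) + bit_size(m)

def exactly_representable(q: Fraction) -> bool:
    """q equals b * 2**m for some integers b, m exactly when the
    denominator of q in lowest terms is a power of two."""
    d = q.denominator
    return d & (d - 1) == 0
```

For instance, `fp_value(3, -2)` is $3/4$ and `(3, -2)` is a 4-bit floating-point number under the assumed convention, whereas `exactly_representable(Fraction(1, 3))` is false.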

As for the integers, elementary operations on floating-point numbers are additions, subtractions, and multiplications. But computing the inverse or the square root of an $n$-bit floating-point number with relative error $O(2^{-n})$ is also considered elementary.

As is well-known, in the three models mentioned above, if $M(n)$ denotes the time needed to multiply two $n$-bit integers then any elementary operation on integers or floating-point numbers can be done in $O(M(n))$ time. This may justify that we treat these operations in the same way.

Our main results show that certain simplification problems on radicals can be solved by a polynomial number of elementary operations on integers and floating-point numbers of polynomial size. This implies that these problems can be solved in polynomial time in any reasonable model of computation that allows bit operations. The exact run times for a specific model of computation can be deduced from our general results by plugging in the maximum number of bit operations needed for an elementary operation in this model of computation.

The elementary operations on floating-point numbers clearly include the elementary operations on integers. Therefore the distinction between elementary operations on integers and on floating-point numbers deserves some explanation. On the one hand, the algorithms we are going to describe in this thesis return as results elements in algebraic number fields represented as linear combinations of some basis elements with rational coefficients. These coefficients have to be represented exactly. But by any definition of floating-point numbers some rational numbers cannot be represented exactly by a single floating-point number. For example, in the system defined above $\frac{1}{3}$ is not representable exactly. Describing the coefficients as tuples of floating-point numbers does not seem to be an appropriate form. At least it does not coincide with the usual understanding of a floating-point number.

Finally we want to be able to perform exact arithmetic in algebraic number fields. This is best described by exact arithmetic on integers. For example, these operations often require greatest common divisor computations. Describing these computations by elementary operations on floating-point numbers would be a crude abuse of language and notation.

On the other hand, based on approximation algorithms due to Schönhage and Brent we describe approximation algorithms for algebraic numbers. To describe these approximation algorithms via elementary operations on integers is at least inaccurate and confusing, although, as far as asymptotic run times are concerned and provided one is very careful, it would not cause too much trouble. In particular, we are never forced to represent numbers by floating-point approximations because they are either too large or too small.

We also believe that by distinguishing between elementary operations on integers and elementary operations on floating-point numbers the analysis of the algorithms described in this thesis is easier to understand. Recall for example from the previous paragraph that our basic technique transforms an approximation to an algebraic number given by a floating-point number into an exact representation as a linear combination of certain basis elements of an algebraic number field.

Finally let us mention that division and taking square roots have been included in the elementary operations on floating-point numbers since in the three models mentioned in the beginning the time needed for these procedures is asymptotically the same as the one needed for multiplication. However, it is not correct that division and taking square roots can be done by a constant number of arithmetic operations on integers of size $O(n)$. Since the approximation algorithms of Brent [Br] and Schönhage [Sc3] apply divisions and square roots we cannot accurately state the run times of our approximation algorithms in terms of arithmetic operations only².

For the algorithms we have to determine certain constants, for example, if approximations with certain precision are required. In general, the constants we derive will not be optimal. Deriving better constants would often complicate the notation and add more technical details that are not crucial to our algorithms.

² In particular, Brent's algorithms use the Arithmetic-Geometric-Mean-Iteration.


For run times we use of course the standard O-notation.

Our main interest is to show that certain simplification problems for radicals are solvable in polynomial time. Although the run times we deduce are asymptotically the best we can prove so far, for many subalgorithms they are not optimal. Instead the analysis of these subalgorithms only shows that their run times will not determine the overall run time. The reader should always be aware of this fact.

A brief overview. In Section 2 we present all the material from algebra and number theory that will be used in the thesis. It contains a brief description of the basics from field theory and summarizes the main results from Galois Theory. We also define algebraic numbers and algebraic integers and mention their main properties. In Section 3 we give a short and fairly simple proof of Siegel's Theorem. In Section 4, by restricting ourselves to the rational numbers, we demonstrate the basic ideas of the algorithm that transforms a sum of radicals into a sum of linearly independent radicals. In Section 5 we generalize the algorithm to algebraic number fields. In particular, the reconstruction algorithm for the exact representation of algebraic numbers is described.

With Section 6 the part on denesting radicals begins. In Section 6 itself we prove all the relevant theoretical facts and outline the basic algorithms that compute the denestings. For example, if $\gamma$ is a sum of radicals over $F$ we describe an algorithm that determines whether an element $\gamma_0 \in F$ exists such that $\sqrt[d]{\gamma_0} / \sqrt[d]{\gamma}$ can be written as a sum of radicals over $F$. If such an element exists the algorithm computes one. In Section 7 we fill in the details of the algorithms and give a precise analysis.

Acknowledgement. Without the help and encouragement of Helmut Alt, Susan Landau, Emo Welzl, and Chee Yap this thesis would not have been written.

My advisor, Helmut Alt, gave me the opportunity to work in a field that does not belong to his own primary interests. He was an extremely careful reader of the thesis. He not only found a lot of confusing mistakes in the first versions but also tried to improve my style. He and Emo Welzl also gave very useful hints. Helmut Alt drew my attention to the approximation algorithms used in the thesis and Emo Welzl suggested various methods for the probabilistic algorithm of Section 5. Both of them constantly encouraged me to try again when first attempts to solve a problem failed and I was convinced that an efficient solution did not exist or that I would not find it.

Susan Landau pointed out to me the problem of denesting radicals and suggested that my previous techniques might be useful for this problem. Without her encouragement the second part of the thesis would not have been written.

Had Chee Yap not spent the academic year 1989/90 at the Free University Berlin this thesis would not have been written. In his course on Computer Algebra I first learned about many bounds used in the thesis and, most importantly, I learned about lattice reduction. His class notes were an invaluable help when working out the technical details of the algorithms. Since his forthcoming book on Computer Algebra [Y] is not yet published I decided to cite the original research papers for bounds and basic techniques, but most of this material will be contained in Chee Yap's book. Chee Yap not only introduced me to Computer Algebra; he also introduced me to the problem that started this thesis. After trying in vain for months to come up with an efficient algorithm to determine the sign of a sum of square roots of integers, he finally suggested seeing whether we could at least come up with an algorithm that checks whether a sum of square roots is zero. The first solution to this problem (using only the Primitive Element Theorem) was joint work with Chee Yap. He also suggested considering not only square roots but arbitrary radicals. This eventually led to the results of this thesis.


2 Basics from Algebra and Algebraic Number Theory

In this section we review the basic facts from algebra and algebraic number theory which will be used in this thesis. This material can be found in any textbook on algebra and algebraic number theory like [J], [Ja], [L], or [M], so most of the proofs are omitted. Furthermore we try to keep the exposition as simple and self-contained as possible. As a consequence the reader familiar with algebra and number theory will find that a lot of the facts mentioned can be generalized considerably.

The most fundamental concept used is that of an algebraic element. Consider an arbitrary subfield $F$ of the complex numbers $\mathbb{C}$. An element $\alpha \in \mathbb{C}$ is called algebraic over $F$ if a nonzero polynomial $p(X) = \sum_{i=0}^{n} p_i X^i$, $p_i \in F$, exists such that $p(\alpha) = 0$. An important example is the case $F = \mathbb{Q}$. If $\alpha \in \mathbb{C}$ is algebraic over $\mathbb{Q}$ we simply call $\alpha$ algebraic. It is not very hard to show that the elements of $\mathbb{C}$ that are algebraic over $F$ form a field with respect to addition and multiplication in $\mathbb{C}$.

If $\alpha \in \mathbb{C}$ is algebraic over $F$, then the smallest degree polynomial $p(X) \in F[X]$ with $p(\alpha) = 0$ and leading coefficient $p_n = 1$ is called the minimal polynomial of $\alpha$ over $F$³. Hence the minimal polynomial is an irreducible element of $F[X]$. The degree of the minimal polynomial is also called the degree of $\alpha$ over $F$.

For example, all rational numbers $q$ are algebraic over $\mathbb{Q}$ with minimal polynomial $X - q$. Furthermore $\sqrt[d]{q}$, $q \in \mathbb{Q}$ and $d \in \mathbb{N}$, is algebraic and the minimal polynomial of $\sqrt[d]{q}$ is a divisor of $X^d - q$.
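As a small illustration of the last statement, take $q = 4$ and $d = 4$. The positive real root $\sqrt[4]{4}$ equals $\sqrt{2}$, whose minimal polynomial over $\mathbb{Q}$ is $X^2 - 2$, and this polynomial indeed divides $X^4 - 4$. The sketch below checks the divisibility by exact polynomial long division; `poly_divmod` is a hypothetical helper written for this example, with polynomials stored as coefficient lists, lowest degree first.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Divide polynomials over Q given as coefficient lists, lowest
    degree first. Returns (quotient, remainder)."""
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quot = [Fraction(0)] * max(len(num) - len(den) + 1, 1)
    while len(num) >= len(den) and any(num):
        shift = len(num) - len(den)
        factor = num[-1] / den[-1]
        quot[shift] = factor
        for i, c in enumerate(den):
            num[shift + i] -= factor * c
        num.pop()  # the leading coefficient is now zero
    return quot, num

# X^4 - 4 divided by X^2 - 2
quotient, remainder = poly_divmod([-4, 0, 0, 0, 1], [-2, 0, 1])
divides = not any(remainder)
print(quotient, divides)
```

The quotient is $X^2 + 2$ and the remainder vanishes, in agreement with $X^4 - 4 = (X^2 - 2)(X^2 + 2)$.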

A field $E \supseteq F$ is called an algebraic extension of $F$ if all elements of $E$ are algebraic over $F$. Such an extension will be denoted by $E : F$. $E$ may be considered as a vector space over $F$. If this vector space has finite dimension $n$ then $n$ is called the degree of the extension and is denoted by $[E : F]$. Any vector space basis is called a basis of the field extension or simply a field basis.

³ Polynomials with leading coefficient 1 are called monic.

Important examples of algebraic extensions are extensions generated by adjoining a single algebraic number to a field $F$. To define this more precisely let $\alpha \in \mathbb{C}$ be algebraic over $F$ and assume that the minimal polynomial $p(X)$ of $\alpha$ over $F$ is of degree $n$ (denoted by $\deg p = n$). Then the smallest field containing $F$ and $\alpha$ is denoted by $F(\alpha)$. Since this field is isomorphic to the field of all polynomials in $F[X]$ taken modulo the minimal polynomial $p$ of $\alpha$ (denoted $F[X]/(p)$) it is easily seen that

$$F(\alpha) = \left\{ \sum_{i=0}^{n-1} \kappa_i \alpha^i \;\middle|\; \kappa_i \in F \right\}.$$

Moreover, the degree of $F(\alpha)$ over $F$ is $n$, the degree of the minimal polynomial of $\alpha$ over $F$.
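The isomorphism with $F[X]/(p)$ is what makes exact arithmetic in $F(\alpha)$ possible: elements are coefficient vectors $(\kappa_0, \ldots, \kappa_{n-1})$ and products are reduced modulo $p$. A minimal sketch for $F = \mathbb{Q}$, assuming $p$ is monic and given by its coefficient list (lowest degree first); the function name `mul_mod` and the list representation are illustrative choices, not notation from this thesis.

```python
from fractions import Fraction

def mul_mod(a, b, p):
    """Multiply a and b in Q[X]/(p).

    a, b: coefficient lists of length n (lowest degree first).
    p: coefficient list of a monic polynomial of degree n.
    """
    n = len(p) - 1
    prod = [Fraction(0)] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += Fraction(ai) * bj
    # Reduce with alpha**n = -(p[0] + p[1]*alpha + ... + p[n-1]*alpha**(n-1)).
    for k in range(2 * n - 2, n - 1, -1):
        c = prod[k]
        prod[k] = Fraction(0)
        for i in range(n):
            prod[k - n + i] -= c * p[i]
    return prod[:n]

P = [-2, 0, 0, 1]  # X^3 - 2, so alpha is a third root of 2
print(mul_mod([0, 0, 1], [0, 0, 1], P))   # alpha^2 * alpha^2
print(mul_mod([1, 1, 0], [1, -1, 1], P))  # (1 + alpha)(1 - alpha + alpha^2)
```

With $p = X^3 - 2$ the first product is $\alpha^4 = 2\alpha$ and the second is $1 + \alpha^3 = 3$.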

This can be generalized in the following way. Consider a field $F$ and $k$ numbers $\alpha_1, \ldots, \alpha_k$ that are algebraic over $F$. By $F(\alpha_1, \ldots, \alpha_k)$ denote the smallest field containing $F$ and the numbers $\alpha_1, \ldots, \alpha_k$. Also let $n_i$ be the degree of the minimal polynomial $p_i$ of $\alpha_i$ over $F(\alpha_1, \ldots, \alpha_{i-1})$⁴. Then the degree $n$ of the extension is $n = \prod_{i=1}^{k} n_i$ and a basis is given by the elements

$$\alpha_1^{e_1} \alpha_2^{e_2} \cdots \alpha_k^{e_k}, \quad 0 \le e_1 < n_1,\ 0 \le e_2 < n_2,\ \ldots,\ 0 \le e_k < n_k.$$

So any element in $F(\alpha_1, \ldots, \alpha_k)$ can uniquely be written as

$$\sum_{e_1=0}^{n_1-1} \sum_{e_2=0}^{n_2-1} \cdots \sum_{e_k=0}^{n_k-1} \kappa_{e_1,e_2,\ldots,e_k}\, \alpha_1^{e_1} \alpha_2^{e_2} \cdots \alpha_k^{e_k}$$

with $\kappa_{e_1,e_2,\ldots,e_k} \in F$.

We are interested in the isomorphisms of $F(\alpha_1, \alpha_2, \ldots, \alpha_k)$ into subfields of $\mathbb{C}$ that fix $F$ pointwise. These mappings are called the embeddings of $F(\alpha_1, \alpha_2, \ldots, \alpha_k)$ into $\mathbb{C}$. Furthermore the images of $F(\alpha_1, \alpha_2, \ldots, \alpha_k)$ under these isomorphisms are called the conjugate fields of $F(\alpha_1, \alpha_2, \ldots, \alpha_k)$.

First let us assume that the extension is generated by a single element $\alpha$. Let $p$ be the minimal polynomial of $\alpha$ over $F$, $\deg p = n$. Since we assume $\mathbb{Q} \subseteq F$ and $p$ is irreducible, $p$ has exactly $n$ distinct roots $\alpha = \alpha^{(0)}, \alpha^{(1)}, \ldots, \alpha^{(n-1)}$, called the conjugates of $\alpha$.

Since $\sigma(p(\alpha)) = p(\sigma(\alpha))$ any embedding $\sigma$ of $F(\alpha)$ must map $\alpha$ onto one of its conjugates. Hence there are at most $n$ distinct embeddings. On the other hand, each field $F(\alpha^{(i)})$, $i = 0, 1, \ldots, n-1$, is isomorphic to the field $F[X]/(p)$. Therefore there are exactly $n$ distinct embeddings of $F(\alpha)$ which are given by

$$\sigma_i : F(\alpha) \longrightarrow F(\alpha^{(i)}), \qquad \sum_{j=0}^{n-1} \kappa_j \alpha^j \longmapsto \sum_{j=0}^{n-1} \kappa_j \left(\alpha^{(i)}\right)^j.$$
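The embeddings $\sigma_i$ can be explored numerically for $F = \mathbb{Q}$ and $\alpha = \sqrt[3]{2}$, whose conjugates are $\alpha$, $\alpha\omega$, and $\alpha\omega^2$ with $\omega = e^{2\pi i/3}$. The following floating-point sketch is an illustration only. It evaluates $\sigma_i$ on coefficient vectors and checks that $\sigma_i(xy) = \sigma_i(x)\sigma_i(y)$, where the product $xy$ is computed exactly using $\alpha^3 = 2$. The names `embed` and `mul` are hypothetical.

```python
import cmath

ALPHA = 2 ** (1 / 3)
OMEGA = cmath.exp(2j * cmath.pi / 3)
# The three conjugates of alpha, i.e. the roots of X^3 - 2.
CONJUGATES = [ALPHA * OMEGA ** i for i in range(3)]

def embed(x, i):
    """sigma_i applied to x[0] + x[1]*alpha + x[2]*alpha**2."""
    return sum(k * CONJUGATES[i] ** j for j, k in enumerate(x))

def mul(a, b):
    """Exact product in Q(alpha), using alpha**3 = 2 and alpha**4 = 2*alpha."""
    a0, a1, a2 = a
    b0, b1, b2 = b
    return (a0 * b0 + 2 * (a1 * b2 + a2 * b1),
            a0 * b1 + a1 * b0 + 2 * a2 * b2,
            a0 * b2 + a1 * b1 + a2 * b0)

x, y = (1, 1, 0), (1, -1, 1)  # 1 + alpha and 1 - alpha + alpha^2
errors = [abs(embed(mul(x, y), i) - embed(x, i) * embed(y, i))
          for i in range(3)]
print(mul(x, y), max(errors))
```

Two of the three embeddings map $\mathbb{Q}(\alpha)$ onto non-real conjugate fields, which is why complex arithmetic is used here.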

This can be generalized in the following way to field extensions $F(\alpha_1, \ldots, \alpha_k)$ generated by more than one algebraic element. Assume the embeddings of $F(\alpha_1, \ldots, \alpha_{k-1})$ have already been determined. Let $\tau$ be such an embedding. If $p_k(X) = \sum_{j=0}^{n_k} \mu_j X^j$, $\mu_j \in F(\alpha_1, \ldots, \alpha_{k-1})$, is the minimal polynomial of $\alpha_k$ over $F(\alpha_1, \ldots, \alpha_{k-1})$, denote by $\tau(p_k)$ the polynomial $\sum_{j=0}^{n_k} \tau(\mu_j) X^j$. Since $p_k$ is irreducible over the field $F(\alpha_1, \ldots, \alpha_{k-1})$, so is $\tau(p_k)$ over $F(\tau(\alpha_1), \ldots, \tau(\alpha_{k-1}))$.

We can extend $\tau$ by mapping $\alpha_k$ onto any of the $n_k$ distinct roots of $\tau(p_k)$. Since $F(\alpha_1, \ldots, \alpha_k)$ is isomorphic to the field of polynomials in $F(\alpha_1, \ldots, \alpha_{k-1})[X]$ taken modulo $p_k$ it is easily verified that these extensions are indeed embeddings of $F(\alpha_1, \ldots, \alpha_k)$.

⁴ For $i = 1$ this is $F$. We apply a similar convention in many situations below.

On the other hand, if $\sigma$ is an embedding of $F(\alpha_1, \ldots, \alpha_k)$ over $F$ then its restriction $\tau$ to $F(\alpha_1, \ldots, \alpha_{k-1})$ must be an embedding of $F(\alpha_1, \ldots, \alpha_{k-1})$. Since

\[ 0 = \sigma(p_k(\alpha_k)) = \tau(p_k)(\sigma(\alpha_k)) \]

it follows that any embedding of $F(\alpha_1, \ldots, \alpha_k)$ must have the form described above.

Summarizing, the following theorem has been shown.

Theorem 2.1 Let $F \subset \mathbb{C}$ be a field and $\alpha_1, \alpha_2, \ldots, \alpha_k$ algebraic over $F$. If $[F(\alpha_1, \ldots, \alpha_i) : F(\alpha_1, \ldots, \alpha_{i-1})] = n_i$ and $n = [F(\alpha_1, \ldots, \alpha_k) : F] = \prod_{i=1}^{k} n_i$ then $F(\alpha_1, \ldots, \alpha_k)$ has exactly $n$ distinct embeddings over $F$. Moreover, any embedding of $F(\alpha_1, \ldots, \alpha_{i-1})$ over $F$ can be extended in exactly $n_i$ different ways to an embedding of $F(\alpha_1, \ldots, \alpha_i)$ over $F$.
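As a small illustration of Theorem 2.1, the following sketch enumerates the embeddings of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ over $\mathbb{Q}$ numerically. The choice of field, the basis order $(1, \sqrt{2}, \sqrt{3}, \sqrt{6})$ and all function names are assumptions made for this example only.

```python
import itertools
import math

# Illustrative assumption: E = Q(sqrt 2, sqrt 3) with n1 = n2 = 2.
# An element is stored by its coordinates (c0, c1, c2, c3)
# with respect to the basis 1, sqrt2, sqrt3, sqrt6.

def embeddings():
    """One pair of signs (s2, s3) per embedding: sqrt2 -> s2*sqrt2, sqrt3 -> s3*sqrt3."""
    return list(itertools.product((1, -1), repeat=2))

def apply_embedding(signs, coords):
    """Numerical image of the element with the given coordinates under the embedding."""
    s2, s3 = signs
    r2, r3 = s2 * math.sqrt(2), s3 * math.sqrt(3)
    c0, c1, c2, c3 = coords
    return c0 + c1 * r2 + c2 * r3 + c3 * r2 * r3

gamma = (0, 1, 1, 0)                      # gamma = sqrt2 + sqrt3
images = [apply_embedding(s, gamma) for s in embeddings()]
print(len(embeddings()))                  # number of embeddings, n1 * n2
print(sorted(round(x, 6) for x in images))
```

Each embedding of $\mathbb{Q}(\sqrt{2})$ (a choice of sign for $\sqrt{2}$) is extended in two ways (a choice of sign for $\sqrt{3}$), which is the counting argument of the theorem in its simplest instance.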

Furthermore note that if $P_i$ is the minimal polynomial of $\alpha_i$ over $F$ then any conjugate field of $F(\alpha_1, \ldots, \alpha_k)$ has the form $F(\alpha_{1,j_1}, \ldots, \alpha_{k,j_k})$, where $\alpha_{i,j_i}$ is a root of $P_i$⁵. In fact, for any embedding $\sigma$ of $F(\alpha_1, \ldots, \alpha_k)$ over $F$

\[ 0 = \sigma(P_i(\alpha_i)) = P_i(\sigma(\alpha_i)). \]

Finally let us mention that if $\gamma$ is an element of $F(\alpha_1, \ldots, \alpha_k)$ then the degree $m$ of the minimal polynomial of $\gamma$ must divide $n$, since $F(\gamma)$ is a subfield of $F(\alpha_1, \ldots, \alpha_k)$ and field degrees are multiplicative in towers. Any embedding $\sigma$ of $F(\alpha_1, \ldots, \alpha_k)$ over $F$ must map $\gamma$ onto one of its conjugates over $F$, and for each conjugate $\gamma'$ of $\gamma$ there are exactly $n/m$ embeddings $\sigma$ that map $\gamma$ onto $\gamma'$.

Important extensions are those generated by the roots $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ of a single polynomial $p$ of degree $n$. These fields are called splitting fields or Galois extensions⁶. A Galois extension coincides with all its conjugate fields, hence the embeddings of such an extension are not only isomorphisms but automorphisms. Moreover, these automorphisms form a group called the Galois group of the extension. The Fundamental Theorem of Galois Theory describes a one-to-one correspondence between the subgroups of the Galois group of a Galois extension and the subfields of the extension.

⁵ Observe that unless $P_i = p_i$, $p_i$ the minimal polynomial of $\alpha_i$ over $F(\alpha_1, \ldots, \alpha_{i-1})$, not all combinations of roots of $P_1, \ldots, P_k$ are possible.

⁶ This definition is only correct since we are restricting ourselves to subfields of the complex numbers.

If $F \subset L$, $L$ a subfield of the Galois extension $E = F(\alpha_0, \alpha_1, \ldots, \alpha_{n-1})$ over $F$, and $\sigma$ is an element of the Galois group $G$ of $E$, then we say that $\sigma$ fixes $L$ if and only if

\[ \sigma(\gamma) = \gamma \quad \text{for all } \gamma \in L. \]

The set of all $\sigma \in G$ that fix $L$ forms a subgroup of $G$; we denote this group by $\mathrm{Gal}(L)$. Hence $\mathrm{Gal}$ is a function that maps a subfield $L$ of $E$ onto a subgroup of $G$. $E$ can obviously be considered as a Galois extension of any subfield $L$. As it turns out, the Galois group of $E$ over $L$ is $\mathrm{Gal}(L)$.

On the other hand, for any subgroup $H$ of $G$ we consider the set of elements $\gamma \in E$ such that

\[ \sigma(\gamma) = \gamma \quad \text{for all } \sigma \in H. \]

It is easily seen that for any subgroup $H$ these elements form a subfield of $E$. This field is called the fixed field of $H$ and is denoted by $\mathrm{Inv}(H)$.

Theorem 2.2 (Fundamental Theorem of Galois Theory) Let $E$ be a Galois extension of $F \subset \mathbb{C}$ with Galois group $G$. By $\Gamma$ denote the set of all subgroups $H$ of $G$ and by $\Sigma$ denote the set of all subfields $L$ of $E$ with $F \subset L$. The mappings

\[ H \mapsto \mathrm{Inv}(H) \quad \text{and} \quad L \mapsto \mathrm{Gal}(L) \]

are inverses and hence bijective mappings of $\Gamma$ onto $\Sigma$ and from $\Sigma$ onto $\Gamma$. Moreover, the extension $L : F$ is a Galois extension if and only if $\mathrm{Gal}(L)$ is a normal subgroup of $G$. In this case the Galois group of $L$ over $F$ is isomorphic to $G/\mathrm{Gal}(L)$.
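The correspondence can be made concrete for a very small Galois extension. The sketch below assumes $E = \mathbb{Q}(\sqrt{2}, \sqrt{3})$ over $\mathbb{Q}$, whose Galois group consists of the four sign changes of $\sqrt{2}$ and $\sqrt{3}$. Because every automorphism acts on the basis $1, \sqrt{2}, \sqrt{3}, \sqrt{6}$ by signs, the fixed field $\mathrm{Inv}(H)$ is spanned by the basis elements fixed by $H$. The names `G`, `BASIS` and the helper functions are hypothetical.

```python
import itertools

# Illustrative assumption: E = Q(sqrt 2, sqrt 3) over Q, Galois group G = {+1,-1}^2.
# sigma = (s2, s3) acts on the basis monomial sqrt2^a * sqrt3^b by the sign s2^a * s3^b.
G = list(itertools.product((1, -1), repeat=2))
BASIS = {"1": (0, 0), "sqrt2": (1, 0), "sqrt3": (0, 1), "sqrt6": (1, 1)}

def compose(s, t):
    return (s[0] * t[0], s[1] * t[1])

def is_subgroup(H):
    # A finite subset containing the identity and closed under products is a subgroup.
    return (1, 1) in H and all(compose(s, t) in H for s in H for t in H)

def subgroups():
    """All subgroups of G, found by brute force over subsets."""
    subs = []
    for r in range(1, len(G) + 1):
        for H in itertools.combinations(G, r):
            if is_subgroup(set(H)):
                subs.append(frozenset(H))
    return subs

def fixed_monomials(H):
    """Basis monomials fixed by every element of H; they span Inv(H)."""
    return sorted(name for name, (a, b) in BASIS.items()
                  if all(s2 ** a * s3 ** b == 1 for s2, s3 in H))

for H in subgroups():
    print(sorted(H), "->", fixed_monomials(H))
```

In this example the dimension of $\mathrm{Inv}(H)$ over $\mathbb{Q}$ times the order of $H$ is always 4, the degree of the extension.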

Any textbook on algebra contains a proof of this theorem.

Important examples of Galois extensions are the so-called cyclotomic fields. Consider the equation

\[ X^n - 1 = 0. \]


Any root of this equation is called an $n$-th root of unity. These roots form a multiplicative group and, since this group is a finite subgroup of $\mathbb{C} \setminus \{0\}$, it is cyclic (any finite subgroup of the multiplicative group of a field $F$ is cyclic [J]). Therefore a root $\zeta_n$ of this equation exists such that $n$ is the smallest positive integer with $\zeta_n^n = 1$. Any root of $X^n - 1 = 0$ with this property is called a primitive $n$-th root of unity. In particular, if $\zeta_n$ is a primitive $n$-th root of unity then any root of $X^n - 1 = 0$ is a power of $\zeta_n$. Hence the $n$-th cyclotomic field $\mathbb{Q}(\zeta_n)$ is independent of the choice of the primitive root $\zeta_n$ and is a Galois extension of $\mathbb{Q}$.

As is well-known, the degree of the extension $\mathbb{Q}(\zeta_n) : \mathbb{Q}$ is $\varphi(n)$, where $\varphi$ denotes Euler's $\varphi$-function counting the number of integers between 1 and $n$ relatively prime to $n$. Moreover, the Galois group of $\mathbb{Q}(\zeta_n) : \mathbb{Q}$ is isomorphic to the multiplicative group $\mathbb{Z}_n^*$ of integers between 1 and $n$ taken modulo $n$ which are relatively prime to $n$. Since this group is abelian, all of its subgroups are normal, and the Fundamental Theorem of Galois Theory implies that any subfield $F$, $\mathbb{Q} \subset F \subset \mathbb{Q}(\zeta_n)$, is a Galois extension of $\mathbb{Q}$. We summarize these facts in

Lemma 2.3 Let $\zeta_n$ be a primitive $n$-th root of unity. The $n$-th cyclotomic field $\mathbb{Q}(\zeta_n)$ is a Galois extension of $\mathbb{Q}$. Furthermore the Galois group of this extension is isomorphic to the group $\mathbb{Z}_n^*$. Hence the Galois group is abelian and all subfields of $\mathbb{Q}(\zeta_n)$ are Galois extensions of $\mathbb{Q}$.
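The quantities in Lemma 2.3 are easy to compute for a concrete $n$. The following sketch lists $\mathbb{Z}_n^*$, evaluates $\varphi(n)$ and counts the primitive $n$-th roots of unity; the function names and the choice $n = 12$ are assumptions for illustration.

```python
import cmath
from math import gcd

def units_mod(n):
    """Residues 1 <= a <= n with gcd(a, n) = 1, i.e. the elements of Z_n^*."""
    return [a for a in range(1, n + 1) if gcd(a, n) == 1]

def euler_phi(n):
    return len(units_mod(n))

def order_of_root(k, n):
    """Multiplicative order of exp(2*pi*i*k/n), computed exactly from the exponent."""
    return n // gcd(k, n)

def primitive_roots_of_unity(n):
    """Numerical values of the primitive n-th roots of unity."""
    return [cmath.exp(2j * cmath.pi * k / n)
            for k in range(n) if order_of_root(k, n) == n]

n = 12
print(euler_phi(n), units_mod(n))
print(len(primitive_roots_of_unity(n)))
```

The number of primitive $n$-th roots of unity agrees with $\varphi(n)$, the degree of $\mathbb{Q}(\zeta_n)$ over $\mathbb{Q}$.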

If $E$ is an arbitrary extension of $\mathbb{Q}$ then $E(\zeta_n)$ is a Galois extension of $E$ for any root of unity $\zeta_n$. The next theorem relates the Galois group of $E(\zeta_n)$ over $E$ to the Galois group of $\mathbb{Q}(\zeta_n)$ over a certain subfield of $\mathbb{Q}(\zeta_n)$.

Theorem 2.4 Let $E$ be a Galois extension of the field $K$. Denote the Galois group of this extension by $G$. Assume furthermore that $F$ is an arbitrary extension of $K$ and denote by $EF$ the smallest field containing $E$ and $F$. Then the field $EF$ is a Galois extension of $F$ and the Galois group of $EF : F$ is isomorphic to the subgroup of $G$ corresponding to the extension $E : F \cap E$.

A proof of this theorem can be found in [L].

Hence for any extension $E$ of $\mathbb{Q}$ the Galois group of $E(\zeta_n)$ over $E$ is isomorphic to the Galois group of $\mathbb{Q}(\zeta_n)$ over $\mathbb{Q}(\zeta_n) \cap E$. In particular, it is abelian.

So far extensions of the form $F(\alpha_1, \ldots, \alpha_k)$ have been considered, but by the well-known Primitive Element Theorem any extension of $F \subset \mathbb{C}$ generated by a finite number of algebraic numbers over $F$ can already be generated by adjoining a single algebraic number to $F$. Using the results mentioned so far we derive a quantitative version of this theorem.


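Lemma 2.5 Let $F \subset \mathbb{C}$ be a field and $E = F(\alpha_1, \ldots, \alpha_k)$ an algebraic extension of $F$ with $[E : F] = n$. Denote the different embeddings of $E$ over $F$ by $\sigma_0, \sigma_1, \ldots, \sigma_{n-1}$. Assume $\gamma \in E$ such that

\[ \sigma_i(\gamma) \neq \sigma_j(\gamma) \quad \text{for all } i \neq j, \]

then $E = F(\gamma)$.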

Proof: Consider for each embedding $\sigma_i$ its restriction $\tau_i$ to $F(\gamma)$. $\tau_i$ is an embedding of $F(\gamma)$ over $F$. Since $\sigma_i(\gamma) \neq \sigma_j(\gamma)$ for $i \neq j$, the restrictions $\tau_i$ are distinct embeddings of $F(\gamma)$. Hence $F(\gamma)$ has at least $n$ distinct embeddings over $F$ and therefore $[F(\gamma) : F] \geq n$. On the other hand, $F(\gamma)$ is a subfield of $E$, so its degree is at most $n$. This shows $n = [F(\gamma) : F]$. But no field extension of degree $n$ has a proper subfield of degree $n$. This finally proves $F(\gamma) = E$.

As an immediate consequence of this lemma we get the Primitive Element Theorem, which we state in its classical form.

Theorem 2.6 (Primitive Element Theorem) Let $F$ be a field that contains the rational numbers. Assume that $\alpha, \beta$ are algebraic over $F$. Furthermore denote by $\alpha_0 = \alpha, \alpha_1, \ldots, \alpha_{n-1}$ and $\beta_0 = \beta, \beta_1, \ldots, \beta_{m-1}$ the conjugates of $\alpha$ over $F$ and of $\beta$ over $F$, respectively. If $\kappa \in F$ satisfies

\[ \alpha_i + \kappa\beta_k \neq \alpha_j + \kappa\beta_l \]

for all conjugates $\alpha_i, \alpha_j$ of $\alpha$ and all conjugates $\beta_k, \beta_l$ of $\beta$ such that $i \neq j$ or $k \neq l$, then

\[ F(\alpha, \beta) = F(\alpha + \kappa\beta). \]

Proof: Consider two different field embeddings $\sigma, \tau$ of $F(\alpha, \beta)$. Then $\sigma(\alpha) = \alpha_i$ and $\sigma(\beta) = \beta_k$ and, likewise, $\tau(\alpha) = \alpha_j$, $\tau(\beta) = \beta_l$ for conjugates $\alpha_i, \alpha_j$ of $\alpha$ and conjugates $\beta_k, \beta_l$ of $\beta$. Since any embedding is completely determined by its action on $\alpha$ and $\beta$, the two embeddings can be different if and only if $\alpha_i \neq \alpha_j$ or $\beta_k \neq \beta_l$. But then

\[ \sigma(\alpha + \kappa\beta) = \alpha_i + \kappa\beta_k \neq \alpha_j + \kappa\beta_l = \tau(\alpha + \kappa\beta). \]

The theorem follows from Lemma 2.5.


Since $F$ contains infinitely many elements and the condition of the theorem excludes at most $\frac{n(n-1)}{2} \cdot \frac{m(m-1)}{2}$ elements from $F$, the theorem actually states that any extension of $F$ generated by two elements, and by induction on the number of generators any extension generated by a finite number of elements, can already be generated by a single algebraic number over $F$.
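The condition of Theorem 2.6 can be checked numerically for a given $\kappa$. The sketch below does this for the assumed example $\alpha = \sqrt{2}$, $\beta = \sqrt{3}$ over $\mathbb{Q}$; the function name `separates` and the tolerance are hypothetical, and a floating-point test of this kind is only a heuristic, not a proof of distinctness.

```python
import itertools
import math

def separates(alphas, betas, kappa, tol=1e-9):
    """Check numerically that the values a + kappa*b are pairwise distinct
    over all conjugates a of alpha and b of beta (the condition of Theorem 2.6)."""
    values = [a + kappa * b for a in alphas for b in betas]
    return all(abs(x - y) > tol for x, y in itertools.combinations(values, 2))

# Illustrative assumption: alpha = sqrt 2, beta = sqrt 3 over F = Q.
alphas = [math.sqrt(2), -math.sqrt(2)]
betas = [math.sqrt(3), -math.sqrt(3)]

print(separates(alphas, betas, 1))   # kappa = 1
print(separates(alphas, betas, 0))   # kappa = 0 ignores beta entirely
```

For $\kappa = 1$ the four values are distinct, so $\mathbb{Q}(\sqrt{2}, \sqrt{3}) = \mathbb{Q}(\sqrt{2} + \sqrt{3})$; for $\kappa = 0$ the condition fails, as it must.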

We nevertheless consider algebraic extensions of the form $F(\alpha_1, \ldots, \alpha_k)$ because in going from $F(\alpha_1, \ldots, \alpha_k)$ to the equivalent description $F(\alpha)$ for some $\alpha \in \mathbb{C}$ a lot of information about the structure of the extension may be lost. This will be quite clear in the next section. On the other hand, from a computational point of view the representation of a field as $F(\alpha)$ for an appropriate $\alpha$ with minimal polynomial $p$ is more convenient, since in this case arithmetic in $F(\alpha)$ reduces to arithmetic with polynomials in $F[X]/(p)$.

Let us mention one more useful result that can easily be proven using the distinct embeddings of a field.

Lemma 2.7 Let $\alpha, \beta$ be algebraic over a field $F$ and let $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ and $\beta_0, \beta_1, \ldots, \beta_{m-1}$ denote the conjugates of $\alpha$ and $\beta$ over $F$, respectively. Then the conjugates of $\alpha + \beta$ and $\alpha\beta$ are among the complex numbers $\alpha_i + \beta_j$ and $\alpha_i\beta_j$, $i = 0, \ldots, n-1$, $j = 0, \ldots, m-1$, respectively.

Proof: Consider the extension $F(\alpha, \beta)$. Any embedding $\sigma$ of $F(\alpha, \beta)$ over $F$ is uniquely defined by the values of $\sigma(\alpha)$ and $\sigma(\beta)$. Now $\sigma(\alpha) = \alpha_i$ for some conjugate $\alpha_i$ of $\alpha$ and $\sigma(\beta) = \beta_j$ for some conjugate $\beta_j$ of $\beta$.

As has been observed above, the conjugates of $\alpha + \beta$ and $\alpha\beta$ are the numbers $\sigma(\alpha + \beta) = \sigma(\alpha) + \sigma(\beta)$ and $\sigma(\alpha\beta) = \sigma(\alpha)\sigma(\beta)$, respectively, where $\sigma$ is any embedding of $F(\alpha, \beta)$ over $F$. The lemma follows.
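A numerical sanity check of Lemma 2.7, again for the assumed example $\alpha = \sqrt{2}$, $\beta = \sqrt{3}$ over $\mathbb{Q}$. The minimal polynomials used here, $X^4 - 10X^2 + 1$ for $\sqrt{2} + \sqrt{3}$ and $X^2 - 6$ for $\sqrt{6}$, are standard facts and not taken from the text.

```python
import math

# Illustrative assumption: alpha = sqrt 2, beta = sqrt 3 over Q.
alphas = [math.sqrt(2), -math.sqrt(2)]
betas = [math.sqrt(3), -math.sqrt(3)]

def p_sum(x):
    """X^4 - 10 X^2 + 1, the minimal polynomial of sqrt2 + sqrt3 over Q."""
    return x ** 4 - 10 * x ** 2 + 1

def p_prod(x):
    """X^2 - 6, the minimal polynomial of sqrt2 * sqrt3 over Q."""
    return x ** 2 - 6

sums = [a + b for a in alphas for b in betas]
prods = [a * b for a in alphas for b in betas]

print([abs(p_sum(s)) < 1e-9 for s in sums])
print([abs(p_prod(p)) < 1e-9 for p in prods])
```

In this example every candidate $\alpha_i + \beta_j$ happens to be a conjugate of $\alpha + \beta$, while the four candidates $\alpha_i\beta_j$ take only the two distinct values $\pm\sqrt{6}$. In general the lemma only guarantees that the conjugates lie among the candidates.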

Finally we need from abstract algebra the notions of the trace and norm function of an algebraic extension. Let $E \subset \mathbb{C}$ be an algebraic extension of $F \subset \mathbb{C}$. We consider $E$ as a finite-dimensional vector space over $F$ and fix some basis $\beta_0, \beta_1, \ldots, \beta_{n-1}$ for this vector space. Then for any $\beta \in E$ the mapping

\[ \mu_\beta : E \longrightarrow E, \qquad \gamma \longmapsto \beta\gamma \]

is an $F$-linear mapping. It therefore has a matrix representation $(c_{ij})$, $i, j = 0, 1, \ldots, n-1$, with respect to the basis $\beta_0, \beta_1, \ldots, \beta_{n-1}$, where the $c_{ij}$'s are elements of $F$ and satisfy

\[ \beta\beta_i = \sum_{j=0}^{n-1} c_{ij}\beta_j, \qquad i = 0, 1, \ldots, n-1. \]

The trace $\mathrm{tr}(\beta)$ is the trace of the matrix $(c_{ij})$, that is, the sum of the main diagonal elements, and the norm $\mathrm{no}(\beta)$ is the determinant of this matrix. In particular, the trace and norm map an element of $E$ onto an element of $F$. Of course we may have chosen any basis $B$ for the $F$-vector space $E$ in order to define the trace and norm. It is standard linear algebra that these definitions are invariant under a change of basis.

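Next we give a useful alternative definition for the trace and norm function. Consider the distinct field embeddings of $E$ over $F$, which are denoted by $\sigma_i$, $i = 0, 1, \ldots, n-1$. Then the trace $\mathrm{tr}_{E:F}$ of $E$ over $F$ can be defined as follows:

\[ \mathrm{tr}_{E:F} : E \longrightarrow F, \qquad \beta \longmapsto \sum_{i=0}^{n-1} \sigma_i(\beta). \]

Likewise, the norm is defined as

\[ \mathrm{no}_{E:F} : E \longrightarrow F, \qquad \beta \longmapsto \prod_{i=0}^{n-1} \sigma_i(\beta). \]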
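The two definitions can be compared on a small example. The sketch below assumes $E = \mathbb{Q}(\sqrt{2})$, $F = \mathbb{Q}$ with basis $(1, \sqrt{2})$; it computes trace and norm of $\beta = a + b\sqrt{2}$ once exactly from the multiplication matrix and once numerically from the two embeddings. The constant `D` and the function names are hypothetical.

```python
from fractions import Fraction
import math

D = 2  # illustrative assumption: E = Q(sqrt 2), F = Q, basis (1, sqrt 2)

def mult_matrix(a, b):
    """Matrix (c_ij) of gamma -> beta*gamma for beta = a + b*sqrt(D):
    beta * 1 = a + b*sqrt(D), beta * sqrt(D) = D*b + a*sqrt(D)."""
    return [[a, b], [D * b, a]]

def trace_norm_matrix(a, b):
    """Trace and determinant of the multiplication matrix (exact)."""
    (c00, c01), (c10, c11) = mult_matrix(a, b)
    return c00 + c11, c00 * c11 - c01 * c10

def trace_norm_embeddings(a, b):
    """Sum and product of the images under sqrt(D) -> +sqrt(D) and sqrt(D) -> -sqrt(D)
    (floating point)."""
    r = math.sqrt(D)
    images = [a + b * r, a - b * r]
    return images[0] + images[1], images[0] * images[1]

a, b = Fraction(3, 2), Fraction(-5, 7)
print(trace_norm_matrix(a, b))
print(trace_norm_embeddings(a, b))
```

Both computations give the trace $2a$ and the norm $a^2 - 2b^2$, the second one up to rounding error.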

A proof that the definitions are equivalent can be found in [Ja].

From the second definitions it is clear that the trace function is additive and that the norm is multiplicative.

The trace and norm of an element $\beta$ of $E$ are closely related to the minimal polynomial $p_\beta(X)$ of $\beta$ over $F$. If

\[ p_\beta(X) = \sum_{i=0}^{m} g_i X^i, \qquad g_i \in F,\ g_m = 1, \]

is the minimal polynomial of $\beta$ then

\[ \mathrm{tr}(\beta) = -\frac{n}{m}\, g_{m-1} \tag{1} \]

and

\[ \mathrm{no}(\beta) = \bigl((-1)^m g_0\bigr)^{n/m} = (-1)^n g_0^{\,n/m}. \tag{2} \]


In particular, if the trace of an element is non-zero then the coefficient $g_{m-1}$ is non-zero, too.

Now let us turn to algebraic number theory. The main objects studied are the algebraic number fields, that is, algebraic extensions $\mathbb{Q}(\alpha)$ of the rationals generated by a single algebraic number over $\mathbb{Q}$. By the Primitive Element Theorem the algebraic number fields are exactly the fields generated by a finite number of algebraic numbers. As mentioned above, if $p(X) = \sum_{i=0}^{n} p_i X^i$, $p_i \in \mathbb{Q}$, $p_n = 1$, is the minimal polynomial of $\alpha$ over $\mathbb{Q}$ then $\mathbb{Q}(\alpha) = \bigl\{ \sum_{i=0}^{n-1} q_i \alpha^i \mid q_i \in \mathbb{Q} \bigr\}$ and, moreover, $\mathbb{Q}(\alpha)$ is isomorphic to $\mathbb{Q}[X]/(p)$.

Next to algebraic numbers the most important concepts in algebraic number theory are the concepts of algebraic integers and of the ring of integers of a number field. An algebraic number $\alpha$ is called an algebraic integer or simply an integer if a polynomial $p(X) = \sum_{i=0}^{n} p_i X^i$, $p_i \in \mathbb{Z}$, exists such that $p(\alpha) = 0$ and $p_n = 1$. It can be shown using Gauss' Lemma (see [vdW]) that the minimal polynomial of an algebraic integer is also a monic integer polynomial.

To give some examples, the rational integers $\mathbb{Z}$ are algebraic integers and they are the only algebraic integers contained in the rational numbers. Furthermore $\sqrt[d]{z}$ for arbitrary rational integers $d > 0$ and $z$ is an algebraic integer since it is a root of $p(X) = X^d - z$. Here it does not matter which $d$-th root is meant by $\sqrt[d]{z}$. More generally, and most important for our purposes:

Lemma 2.8 If $\alpha$ is an algebraic integer so is $\sqrt[d]{\alpha}$ for any positive rational integer $d$ and for any of the $d$ possible interpretations of the root.

In fact, if $p(X)$ is the monic minimal polynomial of $\alpha$ then $\sqrt[d]{\alpha}$ is a root of $p(X^d)$, which is also monic.

The set of algebraic integers forms a ring with respect to the usual addition and multiplication in $\mathbb{C}$. This shows that if $\alpha_1, \ldots, \alpha_k$ are algebraic integers then any integer combination $\sum_{i=1}^{k} z_i \alpha_i$, $z_i \in \mathbb{Z}$, of these numbers is also an algebraic integer. Furthermore, to any algebraic number field $\mathbb{Q}(\alpha)$ the set of all algebraic integers contained in $\mathbb{Q}(\alpha)$ can be associated. This set is denoted by $R_\alpha$. Since $R_\alpha$ is the intersection of the field $\mathbb{Q}(\alpha)$ with the ring of all algebraic integers in the complex numbers, $R_\alpha$ is also a ring. We give some examples:

1. If $\alpha$ is rational then $\mathbb{Q}(\alpha) = \mathbb{Q}$. As mentioned above the ring of algebraic integers in $\mathbb{Q}$ is $\mathbb{Z}$.


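2. Let $d \in \mathbb{Z}$ be a square-free integer. Then $\mathbb{Q}(\sqrt{d}) = \{ a + b\sqrt{d} \mid a, b \in \mathbb{Q} \}$. The ring of integers in $\mathbb{Q}(\sqrt{d})$ is given by (see [M])

\[ R_{\sqrt{d}} = \bigl\{ a + b\sqrt{d} \mid a, b \in \mathbb{Z} \bigr\} \quad \text{if } d \equiv 2, 3 \bmod 4, \tag{3} \]

\[ R_{\sqrt{d}} = \Bigl\{ a + b\Bigl(\frac{1 + \sqrt{d}}{2}\Bigr) \Bigm| a, b \in \mathbb{Z} \Bigr\} \quad \text{if } d \equiv 1 \bmod 4. \tag{4} \]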
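Equations (3) and (4) can be probed with a direct test for integrality. The sketch below uses the fact quoted above that the minimal polynomial of an algebraic integer is a monic integer polynomial; for $b \neq 0$ the minimal polynomial of $a + b\sqrt{d}$ over $\mathbb{Q}$ is $X^2 - 2aX + (a^2 - db^2)$. The function name is hypothetical, and $d$ is assumed square-free and different from 1.

```python
from fractions import Fraction

def is_algebraic_integer(a, b, d):
    """Test whether a + b*sqrt(d) (a, b rational, d square-free, d != 1) is an
    algebraic integer. For b != 0 its minimal polynomial over Q is
    X^2 - 2a X + (a^2 - d b^2); for b == 0 it is X - a."""
    a, b = Fraction(a), Fraction(b)
    if b == 0:
        return a.denominator == 1
    return (2 * a).denominator == 1 and (a * a - d * b * b).denominator == 1

half = Fraction(1, 2)
print(is_algebraic_integer(half, half, 5))    # (1 + sqrt 5)/2,  d = 1 mod 4
print(is_algebraic_integer(half, half, 3))    # (1 + sqrt 3)/2,  d = 3 mod 4
print(is_algebraic_integer(half, half, -3))   # (1 + sqrt -3)/2, d = 1 mod 4
```

The element $(1 + \sqrt{d})/2$ is an algebraic integer exactly in the cases covered by Equation (4).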

The algebraic structure of the ring of integers of an algebraic number field is well-known:

$R_\alpha$ is a free $\mathbb{Z}$-module of rank $n$, the degree of $\alpha$, i.e., there exist $n$ algebraic integers $\beta_i \in R_\alpha$, $i = 0, \ldots, n-1$, such that any $\gamma \in R_\alpha$ can uniquely be written as $\gamma = \sum_{i=0}^{n-1} z_i \beta_i$, $z_i \in \mathbb{Z}$.

Unfortunately, there are no efficient algorithms known (polynomial in the degree of the extension) to compute a basis $\beta_0, \ldots, \beta_{n-1}$. On the other hand, for the purpose of this thesis we only need a sub- and a superset of $R_\alpha$, which we are now going to describe. First a useful observation.

Let $\alpha$ be algebraic and consider the number field $\mathbb{Q}(\alpha)$ generated by $\alpha$. We show that there exists an algebraic integer in $\mathbb{Q}(\alpha)$ that generates the same field and that can easily be computed. Given $p(X)$, $\deg p = n$, the minimal polynomial of $\alpha$, let $q$ be the least common multiple of the denominators of the coefficients of $p(X)$. Then $q\alpha$ is an algebraic integer, since its minimal polynomial is $q^n p\bigl(\frac{X}{q}\bigr) \in \mathbb{Z}[X]$. Furthermore $\mathbb{Q}(\alpha) = \mathbb{Q}(q\alpha)$. Therefore it can be assumed that algebraic number fields are generated by algebraic integers.
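The passage from $\alpha$ to the algebraic integer $q\alpha$ is a short computation with rational coefficients. The following sketch assumes the monic minimal polynomial is given as a coefficient list $[p_0, \ldots, p_{n-1}, 1]$; the function name and the sample polynomial are hypothetical.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def integral_generator(coeffs):
    """Given the monic minimal polynomial p of alpha as a list of rational
    coefficients [p_0, ..., p_{n-1}, 1], return (q, coefficients of q^n p(X/q))."""
    coeffs = [Fraction(c) for c in coeffs]
    n = len(coeffs) - 1
    # q = least common multiple of the denominators of the coefficients.
    q = reduce(lambda x, y: x * y // gcd(x, y), (c.denominator for c in coeffs), 1)
    # The coefficient of X^i in q^n p(X/q) is p_i * q^(n - i).
    scaled = [c * q ** (n - i) for i, c in enumerate(coeffs)]
    return q, scaled

# Illustrative assumption: p(X) = X^2 - (1/2) X + 1/3.
q, scaled = integral_generator([Fraction(1, 3), Fraction(-1, 2), 1])
print(q, scaled)
```

For this sample polynomial the scaled polynomial is monic with integer coefficients, so $q\alpha$ is an algebraic integer.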

If $\mathbb{Q}(\alpha)$ is generated by an algebraic integer $\alpha$ then $R_\alpha$ contains $\alpha$ and $\mathbb{Z}$. Since $R_\alpha$ is a ring it follows that $\mathbb{Z}[\alpha] = \bigl\{ \sum_{i=0}^{n-1} z_i \alpha^i \mid z_i \in \mathbb{Z} \bigr\} \subseteq R_\alpha$. In general, the inclusion is strict. For example, $\mathbb{Z}[\sqrt{d}]$ is strictly contained in $R_{\sqrt{d}}$ if $d \equiv 1 \bmod 4$. To describe the superset for $R_\alpha$ one more definition is needed.

Let $\alpha$ be algebraic with minimal polynomial $p(X)$, $\deg p = n$. Since $p(X)$ is irreducible it has $n$ distinct roots. Denote the conjugates of $\alpha$ by $\alpha_0 = \alpha, \alpha_1, \ldots, \alpha_{n-1}$.

Definition 2.9 The number $\Delta = \Delta(\alpha) = \prod_{0 \le i < j \le n-1} (\alpha_i - \alpha_j)^2$ is called the discriminant of $\alpha$ or the discriminant of $p$.

It is well-known that for an algebraic integer $\alpha$ its discriminant $\Delta(\alpha)$ is a rational integer.
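Definition 2.9 can be evaluated numerically once approximations of the conjugates are available. The sketch below does this for the assumed example $\alpha = \sqrt[3]{2}$, whose conjugates are $\zeta^i \sqrt[3]{2}$ for a primitive third root of unity $\zeta$; the function name is hypothetical and the result is only a floating-point approximation.

```python
import cmath
import itertools

def discriminant(conjugates):
    """Product of (alpha_i - alpha_j)^2 over all pairs i < j (Definition 2.9)."""
    result = 1
    for x, y in itertools.combinations(conjugates, 2):
        result *= (x - y) ** 2
    return result

# Illustrative assumption: alpha = cube root of 2, with conjugates zeta^i * 2^(1/3).
root = 2 ** (1 / 3)
zeta = cmath.exp(2j * cmath.pi / 3)
conj = [root * zeta ** i for i in range(3)]
print(discriminant(conj))   # numerically close to a rational integer
```

The value agrees up to rounding with $-108$, the discriminant of $X^3 - 2$.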


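A proof of the following lemma, which describes the superset of $R_\alpha$, can be found in the book of Marcus [M].

Lemma 2.10 Let $\alpha$ be an algebraic integer, $\mathbb{Q}(\alpha)$ the algebraic number field generated by $\alpha$ and $\Delta$ the discriminant of $\alpha$. Then the ring of integers $R_\alpha$ of $\mathbb{Q}(\alpha)$ is contained in the free $\mathbb{Z}$-module $\frac{1}{\Delta}\mathbb{Z} \oplus \frac{1}{\Delta}\mathbb{Z}\alpha \oplus \ldots \oplus \frac{1}{\Delta}\mathbb{Z}\alpha^{n-1}$, i.e., any integer $\gamma \in R_\alpha$ can uniquely be written as $\gamma = \frac{1}{\Delta} \sum_{i=0}^{n-1} c_i \alpha^i$, $c_i \in \mathbb{Z}$.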

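In many respects the ring of integers of an algebraic number field has the same properties as the rational integers $\mathbb{Z}$. But there is at least one fundamental difference between $\mathbb{Z}$ and the ring of integers of an algebraic number field. In general, $R_\alpha$ is not a unique factorization domain (UFD). To give an example, consider the field $\mathbb{Q}(\sqrt{-5})$. Since $-5 \equiv 3 \bmod 4$, Equation (3) shows that its ring of integers is $\mathbb{Z}[\sqrt{-5}]$, and in this ring 6 can be factored either as $6 = 2 \cdot 3$ or as $6 = (1 + \sqrt{-5})(1 - \sqrt{-5})$. One checks that $2$, $3$, $1 + \sqrt{-5}$, and $1 - \sqrt{-5}$ are all irreducible in the ring of integers of $\mathbb{Q}(\sqrt{-5})$ and that the two factorizations do not differ merely by units.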
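The claim about the two factorizations of 6 can be checked with the norm $a^2 + 5b^2$ of $a + b\sqrt{-5}$. The four factors have norms 4, 9, 6, 6, so a proper factor of any of them would need norm 2 or 3. The sketch below, with hypothetical function names, searches for such elements.

```python
def norm(a, b):
    """Norm of a + b*sqrt(-5) in Z[sqrt(-5)]: (a + b*sqrt(-5))(a - b*sqrt(-5))."""
    return a * a + 5 * b * b

def mul(x, y):
    """Product of x = (a, b) and y = (c, e), each standing for a + b*sqrt(-5)."""
    a, b = x
    c, e = y
    return (a * c - 5 * b * e, a * e + b * c)

def has_element_of_norm(n):
    """Search for integers a, b with a^2 + 5 b^2 = n (a finite search suffices)."""
    bound = int(n ** 0.5) + 1
    return any(norm(a, b) == n
               for a in range(-bound, bound + 1)
               for b in range(-bound, bound + 1))

print(mul((2, 0), (3, 0)), mul((1, 1), (1, -1)))        # both factorizations of 6
print([norm(*x) for x in [(2, 0), (3, 0), (1, 1), (1, -1)]])
print(has_element_of_norm(2), has_element_of_norm(3))
```

Since the norm is multiplicative and no element has norm 2 or 3, none of the four factors splits further into non-units.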

It is basically due to this fact that the algorithms in the first part of this thesis get rather complicated (and probabilistic) when generalized from $\mathbb{Q}$ to algebraic number fields.

This finishes our description of some fundamental facts in algebra and number theory. In the next section special algebraic extensions called radical extensions will be examined in more detail.


3 The Structure of Radical Extensions

In this section we study algebraic extensions of a special type, the so-called radical extensions. Based on the ideas of C. L. Siegel [Si] we give a simplified proof of a theorem that determines the structure of certain radical extensions, for example, those radical extensions contained in the field of real numbers. In particular, it is shown that in this case no non-trivial sum of linearly independent real radicals is itself a radical. Siegel deduced this result from his structure theorem; we prove it directly and base the proof of the structure theorem on this lemma. We also show how to prove and generalize certain results from Kummer Theory using this lemma.

Definition 3.1 Let $F$ be a subfield of the complex numbers $\mathbb{C}$. An element $\gamma \in \mathbb{C}$ is called a radical over $F$ iff

\[ \gamma^d \in F \]

for some positive integer $d$.

Hence radicals are solutions of equations of the form $X^d - \rho = 0$, $\rho \in F$, and are therefore algebraic over $F$.

Throughout this thesis we will denote radicals by the familiar symbols $\sqrt[d]{\rho}$. However, the reader should always bear in mind that the symbol $\sqrt[d]{\rho}$ does not uniquely specify a number. This does not cause too many problems in this thesis since the only restriction we frequently impose on a radical is that it should be a real number. In that case we simply say "the real radical $\sqrt[d]{\rho}$". Moreover, in this situation we implicitly assume that $\rho$ itself is a real number and that $X^d - \rho = 0$ has a real solution. If $d$ is even we need not worry which of the two possible real solutions to $X^d - \rho = 0$ is meant by the real radical $\sqrt[d]{\rho}$. As will be clear, the results will be correct for both of them.

Definition 3.2 An algebraic extension $E$ of $F$ is called a radical extension iff it has the form $E = F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k})$ for a finite number of radicals $\sqrt[d_i]{\rho_i}$ over $F$. For $k = 1$ we call the extension a simple radical extension. If $F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k}) \subset \mathbb{R}$ then the extension will be called a real radical extension.

Recall that if $X^d - \rho \in F[X]$ then the roots of the equation $X^d - \rho = 0$ are given by $\zeta_d^i \sqrt[d]{\rho}$, $i = 0, 1, \ldots, d-1$, where $\zeta_d$ is a primitive $d$-th root of unity and $\sqrt[d]{\rho}$ is any solution of the equation.


The next theorem describes the minimal polynomial of a radical if the field $F$ is a subfield of the reals or if the field $F$ contains "appropriate" roots of unity.

Theorem 3.3 Let $E \subset \mathbb{C}$ be a field, and let $\sqrt[d]{\rho}$ be a radical over $E$. Assume that either $E \subseteq \mathbb{R}$ and $\sqrt[d]{\rho} \in \mathbb{R}$ or that $E$ contains a primitive $d$-th root of unity. Furthermore assume that $d'$ is the smallest positive integer such that $\sqrt[d]{\rho}^{\,d'} \in E$. Then $d'$ is a divisor of $d$ and the minimal polynomial $g(X)$ of $\sqrt[d]{\rho}$ over $E$ is given by $g(X) = X^{d'} - \sqrt[d]{\rho}^{\,d'}$. Hence $d'$ is the degree of $\sqrt[d]{\rho}$ over $E$.

Proof: A proof for this theorem can be found in [Mo] or [Si]. For the reader's convenience we include a proof. We first show that $d'$ divides $d$.

Suppose $d'$ does not divide $d$. In this case we write $d$ as $d = ld' + k$ with $0 < k < d'$. Since $\sqrt[d]{\rho}^{\,d} = \rho \in E$ and $\sqrt[d]{\rho}^{\,ld'} \in E$ we conclude $\sqrt[d]{\rho}^{\,k} \in E$, which contradicts the minimality of $d'$.

Next suppose that $X^{d'} - \sqrt[d]{\rho}^{\,d'}$ is not the minimal polynomial. Instead assume that the minimal polynomial is given by $f(X) = \sum_{i=0}^{n} f_i X^i$, $f_i \in E$, and $n < d'$. Since the minimal polynomial of an algebraic number $\alpha$ over a field $E$ divides any other polynomial in $E[X]$ with root $\alpha$, $f(X)$ divides $X^d - \rho$. Therefore the constant term $f_0$ of $f$ is, up to sign, a product of $n$ of the roots of $X^d - \rho$. As mentioned, these roots have the form $\zeta^i \sqrt[d]{\rho}$, $i = 0, \ldots, d-1$, for some primitive $d$-th root of unity $\zeta$. Therefore $f_0$ is of the form $f_0 = \pm\zeta' \sqrt[d]{\rho}^{\,n}$ for a $d$-th root of unity $\zeta'$.

If $E \subseteq \mathbb{R}$, $\sqrt[d]{\rho} \in \mathbb{R}$ then $\sqrt[d]{\rho}^{\,n} \in \mathbb{R}$. Therefore $f_0 / \sqrt[d]{\rho}^{\,n} = \pm\zeta' \in \mathbb{R}$. Since $+1$ and $-1$ are the only real roots of unity this implies $\sqrt[d]{\rho}^{\,n} \in E$, which contradicts the minimality of $d'$.

Likewise, if $E$ contains a primitive $d$-th root of unity then it contains all $d$-th roots of unity, i.e. $\zeta' \in E$, so $\sqrt[d]{\rho}^{\,n} \in E$, again contradicting the minimality of $d'$.
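For $E = \mathbb{Q}$ and the positive real $d$-th root of a positive integer $\rho$, the number $d'$ of Theorem 3.3 can be found by exact integer arithmetic. This restriction to integer radicands, and both function names, are assumptions of the sketch. It uses that $\sqrt[d]{\rho}^{\,d'}$ is rational if and only if $\rho^{d'}$ is a perfect $d$-th power, because a rational root of the monic integer polynomial $X^d - \rho^{d'}$ must be an integer.

```python
def integer_root(n, d):
    """Return the exact d-th root of the positive integer n,
    or None if n is not a perfect d-th power (binary search)."""
    lo, hi = 1, n
    while lo <= hi:
        mid = (lo + hi) // 2
        p = mid ** d
        if p == n:
            return mid
        if p < n:
            lo = mid + 1
        else:
            hi = mid - 1
    return None

def degree_of_real_radical(rho, d):
    """Smallest d' >= 1 such that (rho^(1/d))^d' is rational, for a positive
    integer rho and the positive real d-th root. By Theorem 3.3 this is the
    degree of rho^(1/d) over Q."""
    for d_prime in range(1, d + 1):
        # (rho^(1/d))^d' is rational iff rho^d' is a perfect d-th power.
        if integer_root(rho ** d_prime, d) is not None:
            return d_prime
    raise AssertionError("unreachable: d' = d always works")

print(degree_of_real_radical(4, 4))    # fourth root of 4 equals sqrt 2
print(degree_of_real_radical(2, 4))
print(degree_of_real_radical(8, 6))    # sixth root of 8 equals sqrt 2
```

In each case the computed degree divides $d$, as the theorem asserts.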

In the analysis of the algorithms that check the linear dependence of radicals the following corollary will be one of the key steps.

Corollary 3.4 Let $E \subset \mathbb{C}$ be a field, and let $\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}$ be radicals over $E$. Assume that either $E \subseteq \mathbb{R}$ and $\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2} \in \mathbb{R}$ or that $E$ contains primitive $d_1$-th and $d_2$-th roots of unity. Denote the greatest common divisor (gcd) of $d_1, d_2$ by $d$. If $\sqrt[d_1]{\rho_1} / \sqrt[d_2]{\rho_2} \in E$ then $\sqrt[d_1]{\rho_1}^{\,d}, \sqrt[d_2]{\rho_2}^{\,d} \in E$, too.


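Proof: In both cases $\sqrt[d_1]{\rho_1} / \sqrt[d_2]{\rho_2} \in E$ implies $(\sqrt[d_1]{\rho_1} / \sqrt[d_2]{\rho_2})^d \in E$. Hence $\sqrt[d_1]{\rho_1}^{\,d} = \gamma \sqrt[d_2]{\rho_2}^{\,d}$ for some $\gamma \in E$.

If $d_1' = d_1/d$ and $d_2' = d_2/d$ then Theorem 3.3 shows that the degree of $\sqrt[d_1]{\rho_1}^{\,d}$ over $E$ is a divisor of $d_1'$, and that the degree of $\sqrt[d_2]{\rho_2}^{\,d}$ is a divisor of $d_2'$. The equality $\sqrt[d_1]{\rho_1}^{\,d} = \gamma \sqrt[d_2]{\rho_2}^{\,d}$ shows that both elements generate the same field over $E$ and therefore have the same degree. Since $\gcd(d_1', d_2') = 1$ this degree must be 1, so $\sqrt[d_1]{\rho_1}^{\,d}, \sqrt[d_2]{\rho_2}^{\,d} \in E$.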

Our next goal is to describe the form of radicals contained in a radical extension and to strengthen Theorem 3.3 in case the field $E$ is a radical extension of some field $F$ such that $\sqrt[d]{\rho}$ is not only a radical over $E$ but already over $F$.

If $E$ is a radical extension $F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k})$ of a field $F$ and we denote the degree of the extension $F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_i]{\rho_i}) : F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_{i-1}]{\rho_{i-1}})$ by $n_i$, then using the results mentioned in the previous section $n = [E : F] = \prod_{i=1}^{k} n_i$ and the field extension has a basis $B$ of the form

\[ B = \Bigl\{ \prod_{i=1}^{k} \sqrt[d_i]{\rho_i}^{\,e_i} \;:\; 0 \le e_1 < n_1,\ 0 \le e_2 < n_2,\ \ldots,\ 0 \le e_k < n_k \Bigr\}. \]

Throughout this thesis we refer to this basis as the standard basis of the extension $F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k}) : F$.
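The index set of the standard basis is a Cartesian product of exponent ranges and can be enumerated directly. The sketch below assumes the degrees $n_1, \ldots, n_k$ are already known; the function name and the sample extension $\mathbb{Q}(\sqrt{2}, \sqrt[3]{5})$ with $n_1 = 2$, $n_2 = 3$ are illustrative choices.

```python
import itertools

def standard_basis_exponents(degrees):
    """Exponent tuples (e_1, ..., e_k) with 0 <= e_i < n_i, one per element of
    the standard basis; `degrees` is the list (n_1, ..., n_k)."""
    return list(itertools.product(*(range(n) for n in degrees)))

# Illustrative assumption: F = Q with radicals sqrt 2 and the real cube root of 5,
# so that n_1 = 2 and n_2 = 3.
exps = standard_basis_exponents([2, 3])
print(len(exps))
print(exps)
```

The number of exponent tuples equals $n = \prod n_i$, the degree of the extension.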

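Instead of generating $E$ by the radicals $\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k}$ we can also generate this extension by adjoining the radicals $\sqrt[d_1]{\rho_1}^{\,-1}, \ldots, \sqrt[d_k]{\rho_k}^{\,-1}$ to $F$. Since $F(\sqrt[d_1]{\rho_1}^{\,-1}, \ldots, \sqrt[d_{i-1}]{\rho_{i-1}}^{\,-1}) = F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_{i-1}]{\rho_{i-1}})$ the degree of $\sqrt[d_i]{\rho_i}^{\,-1}$ over $F(\sqrt[d_1]{\rho_1}^{\,-1}, \ldots, \sqrt[d_{i-1}]{\rho_{i-1}}^{\,-1})$ is also $n_i$ and hence

\[ B' = \Bigl\{ \prod_{i=1}^{k} \sqrt[d_i]{\rho_i}^{\,-e_i} \;:\; 0 \le e_1 < n_1,\ 0 \le e_2 < n_2,\ \ldots,\ 0 \le e_k < n_k \Bigr\} \]

is another field basis for $E : F$.

Furthermore we need the following lemma.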

Lemma 3.5 Let $E$ be an algebraic extension of a field $F \subset \mathbb{C}$, $[E : F] = n$. By $\mathrm{tr}_{E:F}$ denote the trace function of this extension. If $\{\beta_0, \beta_1, \ldots, \beta_{n-1}\}$ is an $F$-basis for $E$ and $\beta$ a non-zero element of $E$ then

\[ \mathrm{tr}_{E:F}(\beta_i\beta) \neq 0 \]

for some $i \in \{0, 1, \ldots, n-1\}$.

Proof: Assume $\mathrm{tr}_{E:F}(\beta_i\beta) = 0$ for all $i$. Since $\beta \neq 0$, the set $\{\beta_0\beta, \beta_1\beta, \ldots, \beta_{n-1}\beta\}$ is a basis of the extension, and since the trace is $F$-linear this implies that the trace function is identically zero. But $\mathrm{tr}_{E:F}(1) = n \neq 0$, which proves the lemma.

Despite its simple nature this lemma will prove to be very useful not only in the proof of Siegel's Theorem but also for the problem of denesting nested radicals.

We are in a position to prove the basic lemma for real radicals.

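Lemma 3.6 Let $F$ be a real field and let $\sqrt[d_i]{\rho_i}$, $i = 1, \ldots, k$, be real radicals over $F$. Assume $n_i$ is defined as above. If $\sqrt[d_k]{\rho_k} \in F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_{k-1}]{\rho_{k-1}})$ then it has the form

\[ \sqrt[d_k]{\rho_k} = \gamma \prod_{i=1}^{k-1} \sqrt[d_i]{\rho_i}^{\,e_i} \]

for some $\gamma \in F$ and nonnegative integers $e_i$ satisfying $0 \le e_i < n_i$.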

Proof: Consider the basis

\[ B' = \Bigl\{ \prod_{i=1}^{k-1} \sqrt[d_i]{\rho_i}^{\,-e_i} \;:\; 0 \le e_1 < n_1,\ 0 \le e_2 < n_2,\ \ldots,\ 0 \le e_{k-1} < n_{k-1} \Bigr\} \]

of the extension $E = F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_{k-1}]{\rho_{k-1}})$ over $F$. All elements of this basis are clearly radicals over $F$. By Lemma 3.5 for some element $\beta$ of this basis the trace $\mathrm{tr}_{E:F}(\sqrt[d_k]{\rho_k}\,\beta)$ is non-zero.

If $g(X) = \sum_{i=0}^{m} g_i X^i$ is the minimal polynomial of $\sqrt[d_k]{\rho_k}\,\beta$ over $F$ then $g_{m-1} \neq 0$ (see Equation (1) of Section 2). On the other hand, $\sqrt[d_k]{\rho_k}\,\beta$ is a real radical over $F$. Hence by Theorem 3.3 its minimal polynomial has the form $g(X) = X^m - (\sqrt[d_k]{\rho_k}\,\beta)^m$, where $(\sqrt[d_k]{\rho_k}\,\beta)^m \in F$. Therefore $g_{m-1}$ can be non-zero if and only if $m = 1$ and $\sqrt[d_k]{\rho_k}\,\beta \in F$. This proves the lemma.

To generalize this lemma to complex radicals we need the following well-known result.

Lemma 3.7 Let $F \subset \mathbb{C}$ be a field containing primitive $d_i$-th roots of unity, $i = 1, 2, \ldots, k$. Let $d$ be the least common multiple (lcm) of $d_i$, $i = 1, 2, \ldots, k$. Then $F$ contains a primitive $d$-th root of unity.


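Proof: We prove the lemma only for $k = 2$; the general case follows by induction on $k$.

Let $\zeta$ be a primitive $d_1 d_2$-th root of unity. By Euclid's algorithm the gcd of $d_1, d_2$ can be written as

\[ \gcd(d_1, d_2) = m d_1 + n d_2 \]

for integers $m, n$. Since $d_1 d_2 = \gcd(d_1, d_2) \cdot d$, $\zeta^{\gcd(d_1,d_2)}$ is a primitive $d$-th root of unity. But

\[ \zeta^{\gcd(d_1,d_2)} = \zeta^{m d_1 + n d_2} = \zeta^{m d_1} \zeta^{n d_2}. \]

Since $\zeta^{d_1}$ is a primitive $d_2$-th root of unity and, accordingly, $\zeta^{d_2}$ is a primitive $d_1$-th root of unity, both lie in $F$, and hence $\zeta^{\gcd(d_1,d_2)}$ is in $F$.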
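The exponent arithmetic in this proof can be traced with integers alone: a power $\zeta^e$ of a primitive $N$-th root of unity has order $N / \gcd(e, N)$. The sketch below, with hypothetical function names and the sample values $d_1 = 4$, $d_2 = 6$, computes the Bézout coefficients and the relevant orders.

```python
from math import gcd

def extended_gcd(a, b):
    """Return (g, m, n) with g = gcd(a, b) = m*a + n*b."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def order_of_power(exponent, modulus):
    """Order of zeta^exponent when zeta is a primitive modulus-th root of unity."""
    return modulus // gcd(exponent, modulus)

d1, d2 = 4, 6
g, m, n = extended_gcd(d1, d2)
N = d1 * d2
print(g, m, n, m * d1 + n * d2)
print(order_of_power(g, N), d1 * d2 // g)            # order of zeta^gcd versus lcm(d1, d2)
print(order_of_power(d1, N), order_of_power(d2, N))  # orders of zeta^d1 and zeta^d2
```

The order of $\zeta^{\gcd(d_1,d_2)}$ equals the least common multiple of $d_1$ and $d_2$, while $\zeta^{d_1}$ and $\zeta^{d_2}$ have orders $d_2$ and $d_1$.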

Lemma 3.8 Let $F \subset \mathbb{C}$ be a field and $\sqrt[d_i]{\rho_i}$, $i = 1, \ldots, k$, be radicals over $F$. Furthermore assume that the field $F$ contains primitive $d_i$-th roots of unity. If $\sqrt[d_k]{\rho_k} \in F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_{k-1}]{\rho_{k-1}})$ then $\sqrt[d_k]{\rho_k}$ has the form

\[ \sqrt[d_k]{\rho_k} = \gamma \prod_{i=1}^{k-1} \sqrt[d_i]{\rho_i}^{\,e_i} \]

for some $\gamma \in F$ and nonnegative integers $e_i$ satisfying $0 \le e_i < n_i$, where $n_i$ is defined as above.

Proof: The proof is exactly as for Lemma 3.6. Observe that in order to apply Theorem 3.3 and argue that the minimal polynomial of $\sqrt[d_k]{\rho_k}\,\beta$, $\beta \in B'$, has the form $X^m - (\sqrt[d_k]{\rho_k}\,\beta)^m$ we need a positive integer $d$ such that $(\sqrt[d_k]{\rho_k}\,\beta)^d \in F$ and $F$ contains a primitive $d$-th root of unity. The least common multiple of all $d_i$, $i = 1, 2, \ldots, k$, clearly has the first property. By the previous lemma it also has the second property.

Both lemmata can be interpreted as saying that the only radicals in a radical extension of a field $F$ that is either real or contains all relevant roots of unity are the obvious ones, i.e., no non-trivial linear combination of radicals is itself a radical. Throughout the rest of the thesis we will restrict ourselves to radical extensions of fields $F$ that satisfy the conditions of Lemma 3.6 or of Lemma 3.8 and we refer to them as the real and complex case, respectively.


Next we prove two important corollaries to these lemmata. One is a strengthened form of Theorem 3.3. The other corollary describes under which conditions radicals are linearly dependent over a field $F$.

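Theorem 3.9 (Siegel) Let $F$ be a field and let $\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k}$ be radicals over $F$. In both the real and complex case, as defined above, the minimal polynomial $p_i(X)$ of $\sqrt[d_i]{\rho_i}$ over $F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_{i-1}]{\rho_{i-1}})$ has the form

\[ p_i(X) = X^{n_i} - \gamma \prod_{j=1}^{i-1} \sqrt[d_j]{\rho_j}^{\,e_j} \]

for some $\gamma \in F$ and integers $e_j \in \mathbb{N}$. In other words, the degree of the field extension $F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_i]{\rho_i}) : F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_{i-1}]{\rho_{i-1}})$ is the smallest number $n_i$ such that $\sqrt[d_i]{\rho_i}^{\,n_i}$ can be written as a product of an element in $F$ and powers of the radicals $\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_{i-1}]{\rho_{i-1}}$.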

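Proof: By Theorem 3.3 it suffices to show that any power $\sqrt[d_i]{\rho_i}^{\,e}$ of $\sqrt[d_i]{\rho_i}$ that is an element of $F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_{i-1}]{\rho_{i-1}})$ has the form

\[ \sqrt[d_i]{\rho_i}^{\,e} = \gamma \prod_{j=1}^{i-1} \sqrt[d_j]{\rho_j}^{\,e_j}, \qquad e_j \in \mathbb{N},\ \gamma \in F. \]

But this is clear from Lemma 3.6 or Lemma 3.8 since any power of $\sqrt[d_i]{\rho_i}$ is a radical over $F$.

We easily deduce the following corollary.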

Corollary 3.10 Let $F \subset \mathbb{C}$ be a field and let $\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k}$ be non-zero radicals over $F$. In both the real and complex case a relation $\sum_{i=1}^{k} \kappa_i \sqrt[d_i]{\rho_i} = 0$ for $\kappa_i \in F$ not all zero can exist if and only if different radicals $\sqrt[d_i]{\rho_i}, \sqrt[d_j]{\rho_j}$ exist such that

\[ \sqrt[d_i]{\rho_i} = \kappa \sqrt[d_j]{\rho_j} \]

for some $\kappa \in F$. In other words, the radicals $\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k}$ are linearly independent over $F$ if any two of them are linearly independent.


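Proof: We claim that if for every pair of different radicals $\sqrt[d_i]{\rho_i}, \sqrt[d_j]{\rho_j}$

\[ \sqrt[d_i]{\rho_i} / \sqrt[d_j]{\rho_j} \notin F \]

then the set of radicals $\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k}$ can be extended to a field basis of $F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k}) : F$.

To prove this claim observe that although not all radicals $\sqrt[d_i]{\rho_i}$ need to be an element of the basis $B = \bigl\{ \prod_{j=1}^{k} \sqrt[d_j]{\rho_j}^{\,e_j} : 0 \le e_1 < n_1,\ 0 \le e_2 < n_2,\ \ldots,\ 0 \le e_k < n_k \bigr\}$ (some of the field degrees $n_j = [F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_j]{\rho_j}) : F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_{j-1}]{\rho_{j-1}})]$ may be 1), by Lemma 3.6 or Lemma 3.8 any radical $\sqrt[d_i]{\rho_i}$ must be a non-zero multiple of a basis element with a factor in $F$. The condition $\sqrt[d_i]{\rho_i} / \sqrt[d_j]{\rho_j} \notin F$ implies that any radical is a multiple of a different basis element. This proves the claim and hence the corollary.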
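In the simplest real case, square roots of positive integers over $F = \mathbb{Q}$, the pairwise criterion of Corollary 3.10 becomes an exact integer test: $\sqrt{a}/\sqrt{b} = \sqrt{ab}/b$ is rational if and only if $ab$ is a perfect square. The sketch below is restricted to this special case and uses hypothetical function names; it is not the general algorithm developed later in the thesis.

```python
import itertools
from math import isqrt

def ratio_is_rational(a, b):
    """For positive integers a, b: sqrt(a)/sqrt(b) is rational iff a*b is a perfect square."""
    r = isqrt(a * b)
    return r * r == a * b

def linearly_independent_over_Q(radicands):
    """Apply the pairwise criterion of Corollary 3.10 to the real radicals
    sqrt(a) for positive integers a (real case, F = Q)."""
    return not any(ratio_is_rational(a, b)
                   for a, b in itertools.combinations(radicands, 2))

print(linearly_independent_over_Q([2, 3, 5]))
print(linearly_independent_over_Q([2, 3, 8]))   # sqrt 8 = 2 * sqrt 2
```

Only $k(k-1)/2$ pairwise tests are needed, regardless of the degree of the field generated by the radicals.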

Kneser [K] has given weaker conditions under which the theorem above is correct, i.e., for complex radicals it is not always necessary to assume that the base field $F$ contains all $d_i$-th roots of unity. Unfortunately, even in Kneser's theorem the order of the root of unity that has to be contained in $F$ is in general exponential in $k$. Hence from a computational point of view both versions are only of limited use. As will be seen below, they are of interest basically if all $d_i$'s are equal⁷. But then Kneser's improved version leads to no significant speed-up in the run times of the algorithms. Moreover, Kneser's condition is often very hard to check, making it infeasible from an algorithmic point of view. Therefore for our purposes it is not worth the effort to state or even prove Kneser's theorem.

We want to use the previous results to describe all subfields of a radical extension. First consider a field $F$ containing primitive $d_i$-th roots of unity, $i = 1, 2, \ldots, k$. Let $E = F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k})$ be a radical extension of $F$.

Denote by $d$ the least common multiple of the integers $d_i$. Then $F$ contains a primitive $d$-th root of unity (see Lemma 3.7). Let $B = \{\beta_0, \beta_1, \ldots, \beta_{n-1}\}$ be the standard basis of $E$. Any element $\beta_j \in B$ satisfies $\beta_j^d \in F$.

Consider an arbitrary element $\gamma$ of $E$. It generates a subfield of $E$ and we will show that this subfield is a radical extension of $F$.

Assume $\gamma = \kappa_1\beta_1 + \kappa_2\beta_2 + \cdots + \kappa_l\beta_l$ such that the $\beta_i$'s are different elements of $B$ and all coefficients $\kappa_i$ are non-zero elements of $F$. We claim that $F(\gamma) = F(\beta_1, \ldots, \beta_l)$.

⁷ This case will be applied in the second part of the thesis, where denesting algorithms are considered.

By Lemma 2.5, p. 17, it suffices to show that $\sigma(\gamma) \ne \tau(\gamma)$ for any pair $(\sigma, \tau)$ of different embeddings of $F(\beta_1, \ldots, \beta_l)$ over $F$.

As mentioned above each $\beta_j$ is a root of a polynomial of the form $X^d - \eta_j$, $\eta_j \in F$. The minimal polynomial of $\beta_j$ must divide this polynomial. Therefore any embedding of this field must send $\beta_j$ to $\zeta_j\beta_j$ for a $d$-th root of unity $\zeta_j$. Hence if

$$\sum_{j=1}^{l} \zeta_j \kappa_j \beta_j \ne \sum_{j=1}^{l} \zeta'_j \kappa_j \beta_j$$

for $d$-th roots of unity $\zeta_j, \zeta'_j$ such that at least one index $j$ with $\zeta_j \ne \zeta'_j$ exists, then the claim follows.

If this is not the case then for certain roots of unity a relation

$$\sum_{j=1}^{l} (\zeta_j - \zeta'_j) \kappa_j \beta_j = 0$$

exists where not all coefficients $(\zeta_j - \zeta'_j)\kappa_j$ are zero. But all these coefficients are elements in $F$. Hence such a relation contradicts the linear independence of the basis elements $\beta_j$. This proves the claim.

By the Primitive Element Theorem (Theorem 2.6) any subfield of $E$ has the form $F(\gamma)$ for an appropriate $\gamma$ and therefore we have shown that all subfields of $E$ are radical extensions.

Now denote by $\rho'_i$ the element $\sqrt[d_i]{\rho_i}^{\,d} \in F$ and by $A$ the group

$$A = \Big\{ \prod_{i=1}^{k} {\rho'_i}^{\,e_i} : e_i \in \mathbb{Z} \Big\}.$$

Furthermore let $F^d$ be the set of all $d$-th powers of elements in $F$. Then $A/F^d$, the set of all elements in $A$ modulo elements in $F^d$, is a finite group. We show that the set of subfields of $E$ is in one-to-one correspondence to the set of subgroups of $A/F^d$. Observe that for any element $\beta_j$ of the standard basis $\beta_j^d$ is in $F$ and is in fact an element of $A$. We want to show that elements in $A/F^d$ are in one-to-one correspondence with the elements of the standard basis.

For any element $\rho$ in $A$ denote by $\sqrt[d]{\rho}$ one of its $d$-th roots. Since $F$ contains a primitive $d$-th root of unity and $F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k})$ contains $\sqrt[d_i]{\rho_i}$, which is a $d$-th root of $\rho'_i$, the field $F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k})$ will also contain $\sqrt[d]{\rho}$ no matter which $d$-th root of $\rho$ this symbol denotes. Therefore $\sqrt[d]{\rho}$ is a radical contained in $F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k})$ and hence must be a product of an element in $F$ and an element of the standard basis. Finally, observe that for different basis elements $\beta_i, \beta_j$ the powers $\beta_i^d, \beta_j^d$ generate different equivalence classes in $A/F^d$ and that two elements $\rho$ and $\rho'$ in $A$ generate the same equivalence class in $A/F^d$ if and only if the corresponding roots $\sqrt[d]{\rho}, \sqrt[d]{\rho'}$ are multiples of the same basis element. This proves the one-to-one correspondence between elements in $A/F^d$ and elements in the standard basis.

By the proof above any subfield of $F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k})$ is generated by a subset of the standard basis. But if two subsets generate the same subgroup of $A/F^d$ then any element in the first group can be written as a product of an element in $F$ and an element in the second subset, and vice versa. Hence if two subsets generate the same group they generate the same subfield. This proves the one-to-one correspondence between subfields of $F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k})$ and subgroups of $A/F^d$.

Assume next that the radicals $\sqrt[d_i]{\rho_i}$ are linearly independent. It can be shown in exactly the same way as above that any sum $S = \sum_{i=1}^{k} \kappa_i \sqrt[d_i]{\rho_i}$, $\kappa_i \in F \setminus \{0\}$, generates the extension $E = F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k})$. In other words, $S$ is a primitive element of $E$. We summarize these observations in

Theorem 3.11 Let $F$ be a field containing $d_i$-th roots of unity, $i = 1, 2, \ldots, k$. If $\sqrt[d_i]{\rho_i}$, $i = 1, 2, \ldots, k$, are radicals then all subfields of the radical extension $E = F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k})$ are radical extensions of $F$.

The subfields are in one-to-one correspondence to subgroups of $A/F^d$, where $d = \mathrm{lcm}(d_1, d_2, \ldots, d_k)$, $A$ is the multiplicative group generated by the elements $\rho_i^{d/d_i}$, and $F^d$ is the multiplicative group of $d$-th powers of elements in $F$.

If the radicals $\sqrt[d_i]{\rho_i}$ are linearly independent over $F$ then any sum $\sum_{i=1}^{k} \kappa_i \sqrt[d_i]{\rho_i}$ with non-zero coefficients $\kappa_i \in F$ is a primitive element for $E$.

The first part of this theorem is of course well-known from Kummer Theory (see [Ar]), where it is proven, however, using Galois theory.

Next we want to prove the same result for real radicals over a real field $F$. To our knowledge this was not known before. The proof given above cannot be generalized in a straightforward manner to this case since radicals that are linearly independent over a real field $F$ need not remain linearly independent if we adjoin certain roots of unity to $F$. The following lemma is the key step to circumvent this problem.


Lemma 3.12 Let $F$ be a real field and $\zeta$ a root of unity. If $\sqrt[d]{\rho}$ is a real radical over $F$ contained in $F(\zeta)$ then $\sqrt[d]{\rho}$ is a square root of an element in $F$.

Proof: Assume $\zeta$ is an $m$-th primitive root of unity. The extension $F(\zeta) : F$ is a Galois extension. By Theorem 2.4, p. 16, and Lemma 2.3, p. 16, the Galois group of this extension is isomorphic to a subgroup of $\mathbb{Z}_m^*$. In fact, apply Theorem 2.4 with $K = \mathbb{Q}$, $E = \mathbb{Q}(\zeta)$, and $F = F$. Hence $EF = F(\zeta)$.

Since $\mathbb{Z}_m^*$ is abelian all subfields of $F(\zeta)$ over $F$ are Galois extensions of $F$ (Theorem 2.2). In particular, the extension generated by $\sqrt[d]{\rho}$ must be a Galois extension.

By Theorem 3.3 the minimal polynomial of $\sqrt[d]{\rho}$ over $F$ has the form $X^{d'} - \sqrt[d]{\rho}^{\,d'}$ for some $d' \in \mathbb{N}$. If $d' > 2$ then $\sqrt[d]{\rho}$ cannot generate a Galois extension. If it did then $F(\sqrt[d]{\rho})$ would have to contain all conjugates of $\sqrt[d]{\rho}$. But for $d' > 2$ some of the conjugates are not even real.

We are now in a position to prove the generalization of Theorem 3.11 to real radical extensions.

Theorem 3.13 Let $F$ be a real field. If $\sqrt[d_i]{\rho_i}$, $i = 1, 2, \ldots, k$, are real radicals then all subfields of the radical extension $E = F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k})$ are radical extensions of $F$. The subfields of $E$ are in one-to-one correspondence to the subgroups of the finite group $A/F^d$, where $d = \mathrm{lcm}(d_1, d_2, \ldots, d_k)$, $A$ is the multiplicative group generated by the elements $\rho_i^{d/d_i}$, and $F^d$ is the multiplicative group of $d$-th powers of elements in $F$.

If the radicals $\sqrt[d_i]{\rho_i}$ are linearly independent over $F$ then any sum $\sum_{i=1}^{k} \kappa_i \sqrt[d_i]{\rho_i}$ with non-zero coefficients $\kappa_i \in F$ is a primitive element for $E$.

Proof: We claim that if $F$ is a real field and $\sqrt[d_i]{\rho_i}$, $i = 1, \ldots, k$, are linearly independent real radicals over $F$ then any sum $\sum_{i=1}^{k} \kappa_i \sqrt[d_i]{\rho_i}$, $\kappa_i \in F$, $\kappa_i \ne 0$, generates the extension $F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k})$.

Applying this claim to the standard basis of a real radical extension, it implies as in the complex case that any subfield of a real radical extension is itself a radical extension. The one-to-one correspondence stated in the theorem can then be proven in exactly the same way as in the complex case by identifying an element of $A$ with one of its real $d$-th roots.

To prove the claim denote by $d$ the least common multiple of the integers $d_i$. Let $\zeta_d$ be a $d$-th primitive root of unity.


By the previous lemma if

$$\frac{\kappa_j \sqrt[d_j]{\rho_j}}{\kappa_i \sqrt[d_i]{\rho_i}} \in F(\zeta_d)$$

for two different indices $i, j \le k$ then the ratio must be a square root $\sqrt{\mu}$ of some element $\mu$ in $F$. Hence after an appropriate renumbering of the radicals $\sqrt[d_i]{\rho_i}$ the sum $\sum_{i=1}^{k} \kappa_i \sqrt[d_i]{\rho_i}$ can be written as

$$\sum_{i=1}^{l} \kappa_i \big(1 + \sqrt{\mu_{i,1}} + \cdots + \sqrt{\mu_{i,j_i}}\big) \sqrt[d_i]{\rho_i},$$

where each $\sqrt{\mu_{i,h}} \in F(\zeta_d) \setminus \{0\}$ is a square root of an element in $F$ and the radicals $\sqrt[d_i]{\rho_i}$, $i = 1, 2, \ldots, l$, are linearly independent over $F(\zeta_d)$ (Corollary 3.10).

The elements in a set $\{1, \sqrt{\mu_{i,1}}, \ldots, \sqrt{\mu_{i,j_i}}\}$, $i = 1, 2, \ldots, l$, are linearly independent over $F$. If this were not the case then by Corollary 3.10 the ratio of two elements in this set would be an element of $F$. But this would imply

$$\frac{\kappa_j \sqrt[d_j]{\rho_j}}{\kappa_i \sqrt[d_i]{\rho_i}} \in F, \quad i \ne j,\ i, j \le k,$$

for at least one ratio of radicals. This contradicts the linear independence of these radicals.

In particular, the sums $1 + \sqrt{\mu_{i,1}} + \cdots + \sqrt{\mu_{i,j_i}}$ are non-zero for all $i = 1, 2, \ldots, l$.

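Next observe that $E = F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k})$ is the same field as the field generated by the elements in

$$G = \bigcup_{i=1}^{l} \big\{ \sqrt[d_i]{\rho_i}, \sqrt{\mu_{i,1}}, \ldots, \sqrt{\mu_{i,j_i}} \big\}.$$

Any embedding of $E$ maps $\sqrt[d_i]{\rho_i}$ onto $\zeta_i \sqrt[d_i]{\rho_i}$ for some $d_i$-th root of unity $\zeta_i$, and $\sqrt{\mu_{i,h}}$ is mapped either onto $\sqrt{\mu_{i,h}}$ or onto $-\sqrt{\mu_{i,h}}$. Furthermore different embeddings map at least one element in $G$ onto different complex numbers.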

Hence by Lemma 2.5, p. 17, it suffices to show that

$$\sum_{i=1}^{l} \zeta_i \kappa_i \big(1 + \epsilon_{i,1}\sqrt{\mu_{i,1}} + \cdots + \epsilon_{i,j_i}\sqrt{\mu_{i,j_i}}\big) \sqrt[d_i]{\rho_i} \ne \sum_{i=1}^{l} \zeta'_i \kappa_i \big(1 + \epsilon'_{i,1}\sqrt{\mu_{i,1}} + \cdots + \epsilon'_{i,j_i}\sqrt{\mu_{i,j_i}}\big) \sqrt[d_i]{\rho_i},$$

where $\zeta_i, \zeta'_i$ are $d$-th roots of unity, $\epsilon_{i,h}, \epsilon'_{i,h} \in \{+1, -1\}$, and for at least one index $i$ either $\zeta_i \ne \zeta'_i$ or $\epsilon_{i,h} \ne \epsilon'_{i,h}$ for some $h$.

Observe that in both sums the coefficients are elements of $F(\zeta_d)$. Therefore if the two sums are equal then a linear relation over $F(\zeta_d)$ between the radicals $\sqrt[d_i]{\rho_i}$, $i = 1, \ldots, l$, exists. By construction these radicals are linearly independent over $F(\zeta_d)$ and hence the two sums above are equal if and only if all coefficients of the difference of the sums are zero. We will show that this is impossible.

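Let $i$ be such that $\zeta_i \ne \zeta'_i$ or $\epsilon_{i,h} \ne \epsilon'_{i,h}$ for some $h$. By assumption $\kappa_i \ne 0$. Hence

$$\zeta_i \big(1 + \epsilon_{i,1}\sqrt{\mu_{i,1}} + \cdots + \epsilon_{i,j_i}\sqrt{\mu_{i,j_i}}\big) - \zeta'_i \big(1 + \epsilon'_{i,1}\sqrt{\mu_{i,1}} + \cdots + \epsilon'_{i,j_i}\sqrt{\mu_{i,j_i}}\big)$$

must be zero.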

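If $\zeta_i = \zeta'_i$ then

$$1 + \epsilon_{i,1}\sqrt{\mu_{i,1}} + \cdots + \epsilon_{i,j_i}\sqrt{\mu_{i,j_i}} = 1 + \epsilon'_{i,1}\sqrt{\mu_{i,1}} + \cdots + \epsilon'_{i,j_i}\sqrt{\mu_{i,j_i}}.$$

But for a fixed $i$ the $\sqrt{\mu_{i,h}}$'s are linearly independent over $F$ and for at least one index the coefficients $\epsilon_{i,h}, \epsilon'_{i,h} \in \{+1, -1\}$ are different. Therefore the two sums cannot be equal.

Again, because for a fixed $i$ the $\sqrt{\mu_{i,h}}$ and 1 are linearly independent, $1 + \epsilon_{i,1}\sqrt{\mu_{i,1}} + \cdots + \epsilon_{i,j_i}\sqrt{\mu_{i,j_i}}$ is non-zero for any combination of signs $\epsilon_{i,h}$.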

Hence if $\zeta_i \ne \zeta'_i$ then

$$\frac{\zeta_i}{\zeta'_i} = \frac{1 + \epsilon'_{i,1}\sqrt{\mu_{i,1}} + \cdots + \epsilon'_{i,j_i}\sqrt{\mu_{i,j_i}}}{1 + \epsilon_{i,1}\sqrt{\mu_{i,1}} + \cdots + \epsilon_{i,j_i}\sqrt{\mu_{i,j_i}}}.$$

But the expression on the right-hand side is real. Since $\zeta_i/\zeta'_i$ is a $d$-th root of unity this implies $\zeta'_i = -\zeta_i$.

So in this case the coefficient can be zero if and only if

$$2 + (\epsilon_{i,1} + \epsilon'_{i,1})\sqrt{\mu_{i,1}} + \cdots + (\epsilon_{i,j_i} + \epsilon'_{i,j_i})\sqrt{\mu_{i,j_i}} = 0.$$

This is again impossible since the elements in $\{1, \sqrt{\mu_{i,1}}, \ldots, \sqrt{\mu_{i,j_i}}\}$ are linearly independent over $F$ and the coefficient in front of 1 is non-zero. This finally proves the claim and hence the theorem.


Theorem 3.11 and Theorem 3.13 show how to construct a primitive element for a radical extension if the generators of the extension are linearly independent. This result will be used in Section 7 where we analyze our denesting algorithms.

Corollary 3.10, on the other hand, provides us with a simple way to check the linear dependence of a set of radicals $\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k}$ over a field satisfying one of the conditions in Corollary 3.10. In fact, it has only to be tested whether any ratio $\sqrt[d_i]{\rho_i} / \sqrt[d_j]{\rho_j}$ of radicals is contained in $F$. In the next sections we show how to solve this problem efficiently. To demonstrate the basic ideas we first restrict ourselves to the rather simple case $F = \mathbb{Q}$.


4 Radicals over the Rational Numbers

In the first part of this section we show how to decide efficiently whether a set of real radicals $R = \{\sqrt[d_1]{q_1}, \ldots, \sqrt[d_k]{q_k}\}$ over $\mathbb{Q}$ is linearly independent. By efficient we mean an algorithm that is polynomial in the input size of the set $R$.

If $d_i$ is even and $q_i$ is positive then the symbol $\sqrt[d_i]{q_i}$ does not specify uniquely a real number. To avoid ambiguity and simplify the notation, in this case we assume that $\sqrt[d_i]{q_i}$ is positive. Note that this is no restriction as far as we are interested in linear combinations of radicals because we may change the sign of the coefficients accordingly. With this convention the input size of $\sqrt[d_i]{q_i}$ is the input size of $d_i, q_i$. If the sum of the bit size of the numerator and denominator in $q_i$ and of the bit size of $d_i$ is at most $l$ we call $\sqrt[d_i]{q_i}$ an $l$-bit radical. So the input size of a set $R$ containing $k$ $l$-bit radicals is bounded by $O(kl)$, and an algorithm that decides in time polynomial in the input size whether a set of real radicals is linearly independent has to be polynomial in the number of radicals $k$ and in the logarithm of the degrees $d_i$. Due to Corollary 3.4 and Corollary 3.10 we will achieve such a run time by a rather simple algorithm.

If we want to check whether a linear combination of the radicals above with coefficients in $\mathbb{Q}$ is zero we transform the sum into a linear combination of radicals that are linearly independent. The sum will be zero if and only if the coefficients in the transformed sum are zero. The transformation is done by combining those radicals whose ratio is rational. Due to the solution to the first problem this transformation is easily computed.

In the second part of this section we apply the results of the first part to the problem of determining the sign of sums of square roots. Although these results cannot be used to describe an algorithm that computes the sign in polynomial time, in many cases they can be used to speed up the algorithms known so far.

4.1 Linear Dependence of Radicals over the Rational Numbers

Given a set of real radicals $\sqrt[d_1]{q_1}, \sqrt[d_2]{q_2}, \ldots, \sqrt[d_k]{q_k}$ over the rational numbers we want to check whether these radicals are linearly independent. Likewise, we want to determine whether a given combination of these radicals with coefficients in $\mathbb{Q}$ is zero.

Due to Corollary 3.10 we only have to check whether for any pair of radicals $\sqrt[d_i]{q_i}, \sqrt[d_j]{q_j}$ their ratio is a rational number. As will be seen later, by Corollary 3.4 this property in turn can be tested by determining for three radicals over $\mathbb{Q}$ whether they are rational. Moreover, the degree of these radicals will be smaller than $\max\{d_i, d_j\}$. We can restrict the problem even further to roots of integers.

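Lemma 4.1.1 Let $q \in \mathbb{Q}$, $q = a/b$, $\gcd(a, b) = 1$ and $d \in \mathbb{N}$. Then $\sqrt[d]{q} \in \mathbb{Q}$ if and only if $\sqrt[d]{a} \in \mathbb{Z}$ and $\sqrt[d]{b} \in \mathbb{Z}$.

Proof: Assume $\sqrt[d]{q} = \sqrt[d]{a/b} \in \mathbb{Q}$. Hence $\sqrt[d]{ab^{d-1}} = b\,\sqrt[d]{a/b} \in \mathbb{Q}$. From the unique factorization property of $\mathbb{Z}$ it easily follows that if $\sqrt[d]{z} \in \mathbb{Q}$ for $z \in \mathbb{Z}$ then $\sqrt[d]{z} \in \mathbb{Z}$. Therefore $\sqrt[d]{ab^{d-1}} \in \mathbb{Z}$. Since $\gcd(a, b) = 1$ this implies $\sqrt[d]{a} \in \mathbb{Z}$ and $\sqrt[d]{b^{d-1}} \in \mathbb{Z}$.

Due to the unique factorization, $\sqrt[d]{b^{d-1}} \in \mathbb{Z}$ if and only if every prime dividing $b^{d-1}$ does so with exponent divisible by $d$. Any such exponent has the form $e(d-1)$, where $e$ is the exponent with which the prime divides $b$. But $\gcd(d-1, d) = 1$, so if $d$ divides $e(d-1)$ it must already divide $e$. Therefore $\sqrt[d]{b^{d-1}} \in \mathbb{Z}$ implies $\sqrt[d]{b} \in \mathbb{Z}$, too.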

We now show how to check in polynomial time whether an integer is a $d$-th power.

Lemma 4.1.2 Let $z, d$ be $l$-bit integers. It can be decided with $O(l)$ elementary operations on integers of bit size at most $O(l)$ whether $\sqrt[d]{z} \in \mathbb{Z}$ and, if so, within the same time bounds it can be computed.

Proof: We may assume $d \le \log z$. Otherwise $X^d - z = 0$ has a solution in $\mathbb{Z}$ if and only if $z = 1$ or $z = -1$. Furthermore we can restrict ourselves to positive integers.

The integer $z' \in \mathbb{Z}$ such that if $\sqrt[d]{z} \in \mathbb{Z}$ then $\sqrt[d]{z} = z'$ can easily be determined by a binary search on the interval $I = [0, 2^{\lceil l/d \rceil}]$. In each step of the binary search we have to determine whether a candidate $y$ from $I$, if raised to the $d$-th power, is smaller or larger than $z$ or equal to $z$. The $d$-th power is computed by successive squaring. Also observe that we may stop when a power of $y$ has been computed that is larger than $z$. Hence the binary search can be done using at most $O(\frac{l}{d} \log d) \subseteq O(l)$ elementary operations on integers of size $O(l)$.
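The binary search from the proof of Lemma 4.1.2 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `integer_root` is ours, the input is assumed to be a non-negative integer, and Python's built-in exponentiation stands in for the successive squaring with early termination described in the proof.

```python
def integer_root(z: int, d: int):
    """Return the integer r >= 0 with r**d == z, or None if no such r exists.

    Assumes z >= 0 and d >= 1 (the proof reduces to positive integers).
    """
    if z < 2 or d == 1:
        return z                      # 0, 1 and first roots are trivial
    if d >= z.bit_length():
        return None                   # 2**d > z, so no root >= 2 can exist
    # The root, if any, lies in [1, 2**ceil(bits/d)].
    lo, hi = 1, 1 << -(-z.bit_length() // d)
    while lo <= hi:
        mid = (lo + hi) // 2
        power = mid ** d              # stand-in for successive squaring
        if power == z:
            return mid
        if power < z:
            lo = mid + 1
        else:
            hi = mid - 1
    return None
```

For example, `integer_root(27, 3)` finds the cube root 3, while `integer_root(10, 2)` reports that 10 is not a perfect square.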

37

Page 42: Johannes Bl omer - NYU Computer Science · computer scientists ([BFHT],[HH],[La2],[La3],[Z]). Many of them have been attracted by equations of Ramanujan [R] such as the following:

The binary search described above can be considered as an approximation algorithm for $\sqrt[d]{z}$. When we generalize the results of this section to number fields we will see that by applying a more sophisticated approximation algorithm the run time in Lemma 4.1.2 can be reduced to $O(\log l)$ elementary operations on floating-point numbers of size $O(l)$.

We combine the previous lemmata to deduce the following theorem.

Theorem 4.1.3 Let $\sqrt[d_1]{q_1}, \sqrt[d_2]{q_2}$ be $l$-bit radicals over $\mathbb{Q}$. It can be decided using $O(l)$ elementary operations on integers of length $O(l)$ whether the ratio of these radicals is in $\mathbb{Q}$. Furthermore, if the ratio is rational it can be computed within the same time bounds.

Proof: First the gcd of $d_1$ and $d_2$ is computed. By Schönhage's result [Sc1] this can be done by $O(\log l)$ operations on $O(l)$-bit integers. Denote the gcd by $d$. We also compute within these time bounds $d'_i := d_i/d$.

By Corollary 3.4, p. 24, if $\sqrt[d_1]{q_1} / \sqrt[d_2]{q_2} \in \mathbb{Q}$ then $\sqrt[d_i]{q_i}^{\,d} = \sqrt[d'_i]{q_i} = q'_i \in \mathbb{Q}$, $i = 1, 2$. We check whether this is the case by applying the algorithm leading to Lemma 4.1.2 to $d'_i$ and the numerator and denominator of $q_i$, $i = 1, 2$. The correctness of this procedure follows from Lemma 4.1.1, and by Lemma 4.1.2 it uses $O(l)$ elementary operations on integers of size $O(l)$.

Then we compute $q'_1/q'_2$ and determine (using again Lemma 4.1.1 and Lemma 4.1.2) whether

$$\sqrt[d]{\frac{q'_1}{q'_2}} \in \mathbb{Q}.$$

Since

$$\frac{\sqrt[d_1]{q_1}}{\sqrt[d_2]{q_2}} = \sqrt[d]{\frac{q'_1}{q'_2}}$$

this will give the desired result. As the numerators and denominators of $q'_1, q'_2$ have at most $l$ bits the theorem follows.
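The procedure in the proof of Theorem 4.1.3 can be sketched as follows. The names `int_root`, `rational_root` and `radical_ratio` are hypothetical helpers introduced for this illustration, radicals are assumed to be positive real roots of positive rationals, and no attempt is made to match the operation counts of the theorem.

```python
from fractions import Fraction
from math import gcd

def int_root(z: int, d: int):
    """Integer d-th root of z >= 0 by binary search, or None (Lemma 4.1.2)."""
    if z < 2 or d == 1:
        return z
    if d >= z.bit_length():
        return None
    lo, hi = 1, 1 << -(-z.bit_length() // d)
    while lo <= hi:
        mid = (lo + hi) // 2
        power = mid ** d
        if power == z:
            return mid
        lo, hi = (mid + 1, hi) if power < z else (lo, mid - 1)
    return None

def rational_root(q: Fraction, d: int):
    """d-th root of a positive rational if it is rational, else None (Lemma 4.1.1)."""
    a, b = int_root(q.numerator, d), int_root(q.denominator, d)
    return None if a is None or b is None else Fraction(a, b)

def radical_ratio(d1: int, q1, d2: int, q2):
    """Return q1**(1/d1) / q2**(1/d2) as a Fraction if it is rational, else None."""
    d = gcd(d1, d2)
    r1 = rational_root(Fraction(q1), d1 // d)   # q'_1
    r2 = rational_root(Fraction(q2), d2 // d)   # q'_2
    if r1 is None or r2 is None:
        return None
    return rational_root(r1 / r2, d)            # d-th root of q'_1 / q'_2
```

For instance, `radical_ratio(4, 4, 2, 2)` recognizes that the fourth root of 4 divided by the square root of 2 equals 1, whereas `radical_ratio(2, 2, 2, 3)` returns `None`.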

Combining this result with Corollary 3.10 leads to

Corollary 4.1.4 Let $\sqrt[d_1]{q_1}, \sqrt[d_2]{q_2}, \ldots, \sqrt[d_k]{q_k}$ be a set of $l$-bit radicals over $\mathbb{Q}$. It can be decided using $O(k^2 l)$ elementary operations on integers of length $O(l)$ whether this set is linearly independent over $\mathbb{Q}$. Within the same time bound a maximal subset of linearly independent radicals can be computed.

Proof: Apply the algorithm of the previous theorem to the $\frac{k(k-1)}{2}$ different ratios of radicals.


We can also use Theorem 4.1.3 above to check whether a linear combination of radicals $S = \sum_{i=1}^{k} v_i \sqrt[d_i]{q_i}$ is zero.

Corollary 4.1.5 Let $\sqrt[d_i]{q_i}$, $i = 1, \ldots, k$, be $l$-bit radicals. If $v_i \in \mathbb{Q}$, $i = 1, \ldots, k$, are such that the numerator and denominator of $v_i$ are $l$-bit integers then it can be decided using $O(k^2 l)$ elementary operations on integers of length $O(l)$ and $O(k)$ elementary operations on integers of length $O(kl)$ whether the sum $S = \sum_{i=1}^{k} v_i \sqrt[d_i]{q_i}$ is zero.

Proof: Using the algorithm of the previous corollary partition $R = \{\sqrt[d_1]{q_1}, \sqrt[d_2]{q_2}, \ldots, \sqrt[d_k]{q_k}\}$ into subsets $R_1, \ldots, R_h$ such that two radicals are in the same subset if and only if their ratio is rational. To simplify the notation assume $\sqrt[d_i]{q_i} \in R_i$, $i = 1, \ldots, h$. This partitioning can be done by $O(k^2 l)$ elementary operations on integers of length $O(l)$. Within the same time bounds rational numbers $r_{ij}$ are computed such that if $\sqrt[d_j]{q_j} / \sqrt[d_i]{q_i} \in \mathbb{Q}$ then $\sqrt[d_j]{q_j} / \sqrt[d_i]{q_i} = r_{ij}$. Hence

$$S = \sum_{i=1}^{k} v_i \sqrt[d_i]{q_i} = \sum_{i=1}^{h} \Big( \sum_{\sqrt[d_j]{q_j} \in R_i} v_j r_{ij} \Big) \sqrt[d_i]{q_i}.$$

Since for any pair of radicals in $R' = \{\sqrt[d_1]{q_1}, \sqrt[d_2]{q_2}, \ldots, \sqrt[d_h]{q_h}\}$ their ratio is not a rational number, by Corollary 3.10, p. 28, $S = 0$ if and only if

$$\sum_{\sqrt[d_j]{q_j} \in R_i} v_j r_{ij} = 0 \quad \text{for } i = 1, \ldots, h.$$

To compute these sums for each $i$ between 1 and $h$ we compute the product of the denominators of the rational numbers in $\{v_j r_{ij} \mid j \text{ such that } \sqrt[d_j]{q_j} \in R_i\}$. Observe that each denominator has at most $O(l)$ bits. Hence this can be done by $O(k)$ elementary operations on integers of size at most $O(kl)$. Once the product has been computed the sum is easily determined by $O(k)$ elementary operations on integers of size $O(kl)$.
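The partition argument of Corollaries 4.1.4 and 4.1.5 can be illustrated by the following sketch. All function names are ours; terms are given as hypothetical triples `(v, d, q)` standing for $v\,\sqrt[d]{q}$ with positive $q$, and exact `Fraction` arithmetic replaces the common-denominator bookkeeping of the proof.

```python
from fractions import Fraction
from math import gcd

def int_root(z, d):
    """Integer d-th root of z >= 0 by binary search, or None."""
    if z < 2 or d == 1:
        return z
    if d >= z.bit_length():
        return None
    lo, hi = 1, 1 << -(-z.bit_length() // d)
    while lo <= hi:
        mid = (lo + hi) // 2
        power = mid ** d
        if power == z:
            return mid
        lo, hi = (mid + 1, hi) if power < z else (lo, mid - 1)
    return None

def rat_root(q, d):
    """d-th root of a positive Fraction if rational, else None."""
    a, b = int_root(q.numerator, d), int_root(q.denominator, d)
    return None if a is None or b is None else Fraction(a, b)

def ratio(d1, q1, d2, q2):
    """q1**(1/d1) / q2**(1/d2) if rational, else None (Theorem 4.1.3)."""
    d = gcd(d1, d2)
    r1, r2 = rat_root(Fraction(q1), d1 // d), rat_root(Fraction(q2), d2 // d)
    return None if r1 is None or r2 is None else rat_root(r1 / r2, d)

def combine(terms):
    """Group radicals with rational ratio; return [(d, q, coefficient), ...]."""
    classes = []                              # [d, q, accumulated coefficient]
    for v, d, q in terms:
        for cls in classes:
            r = ratio(d, q, cls[0], cls[1])   # r_ij relative to the representative
            if r is not None:
                cls[2] += Fraction(v) * r
                break
        else:
            classes.append([d, q, Fraction(v)])
    return [tuple(c) for c in classes]

def is_zero(terms):
    """S = 0 iff every combined coefficient vanishes (Corollary 3.10)."""
    return all(c == 0 for _, _, c in combine(terms))
```

The representatives returned by `combine` form a maximal subset of pairwise independent radicals, so the same routine also illustrates Corollary 4.1.4. For example, `is_zero([(1, 2, 8), (-2, 2, 2)])` confirms that $\sqrt{8} - 2\sqrt{2} = 0$.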

Corollary 4.1.5 is restricted to real radicals. We could apply the same idea to, say, a sum of complex fifth roots if we work with the fifth cyclotomic field instead of $\mathbb{Q}$. In fact, then we could apply the results of Section 3 to complex radicals. But this implies that we have to check efficiently whether a ratio of complex radicals is contained in a cyclotomic field. So this problem naturally leads to the question whether the results of this section can be generalized to algebraic number fields.

Before we answer this question we describe an application of Corollary 4.1.5 to the problem of determining the sign of a sum of square roots.

4.2 Comparing Sums of Square Roots

Probably one of the most important open problems in algebraic computing is the determination of the sign of a sum $S = \sum_{i=1}^{k} c_i \sqrt{n_i}$, $c_i \in \mathbb{Z}$, $n_i \in \mathbb{N}$, of square roots in polynomial time. This problem occurs for example in shortest path computations if the bit complexity is considered. Another application is the Euclidean Traveling Salesman Problem, again if bit arithmetic instead of infinite-precision arithmetic is used.

It seems that the only algorithm known so far for this problem is the brute force solution, that is, compute $S$ with precision good enough to determine the sign. Then the question is how many bits of $S$ have to be computed. Unfortunately, even the best bounds are exponential in the number $k$ of square roots.

To prove such a bound is actually quite simple. In fact, consider the algebraic integer $S = \sum_{i=1}^{k} c_i \sqrt{n_i}$, $c_i \in \mathbb{Z}$, $c_i \ne 0$. By the previous subsection we may assume that the square roots $\sqrt{n_i}$ are linearly independent over $\mathbb{Q}$ and that $S$ is non-zero.

The polynomial

$$g(X) = \prod \Big( X - \sum_{i=1}^{k} \epsilon_i c_i \sqrt{n_i} \Big),$$

where the product is over all $k$-tuples $(\epsilon_1, \epsilon_2, \ldots, \epsilon_k) \in \{+1, -1\}^k$, is an integer polynomial with non-zero constant coefficient.

First observe that it must be a polynomial with rational coefficients. In fact, by Lemma 2.7, p. 18, the conjugates of a sum $\sum_{i=1}^{k} \epsilon_i c_i \sqrt{n_i}$ are also of the form $\sum_{i=1}^{k} \epsilon'_i c_i \sqrt{n_i}$ with $\epsilon'_i \in \{-1, +1\}$. Hence if $\alpha$ is a root of $g$, so are all its conjugates over $\mathbb{Q}$. This implies that $g$ is a polynomial with rational coefficients.

Furthermore each $\sum_{i=1}^{k} \epsilon_i c_i \sqrt{n_i}$ is an algebraic integer. This implies that the coefficients of $g$ are algebraic integers. Since the coefficients are rational they must be rational integers. Finally observe that each sum $\sum_{i=1}^{k} \epsilon_i c_i \sqrt{n_i}$ is non-zero since the $\sqrt{n_i}$'s are linearly independent over $\mathbb{Q}$ and the $c_i$'s and $\epsilon_i$'s are non-zero. Hence the constant coefficient of $g$ must be non-zero, too.


Assume $|c_i| < 2^l$, $n_i < 2^l$, for all $i$. Then the coefficients in $g$ are bounded in absolute value by $(2^{2l} k)^{2^k}$ (see also Lemma 7.1.1, p. 142, to be proven later). Now we may use the following result that goes back to Cauchy (see also [Mi1]).

Lemma 4.2.1 Let $p(X) = \sum_{i=0}^{n} p_i X^i$, $p_0 \ne 0$, $p_n \ne 0$, be a polynomial with complex coefficients. Then any root $z$ of $p$ satisfies

(i) $|z| < 1 + \dfrac{\max\{|p_0|, |p_1|, \ldots, |p_{n-1}|\}}{|p_n|}$

(ii) $|z| > \dfrac{|p_0|}{|p_0| + \max\{|p_1|, |p_2|, \ldots, |p_n|\}}$

Applying (ii) to the polynomial $g$ above shows that if $S = \sum_{i=1}^{k} c_i \sqrt{n_i} \ne 0$ then $|S| \ge (2^{2l} k)^{-2^k - 1}$, hence $O(2^k (l + \log k))$ bits suffice in order to determine the sign of $S$.
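As a small sanity check of bound (ii), consider the hypothetical example $S = \sqrt{3} - \sqrt{2}$, which is not taken from the text. For $k = 2$ the product defining $g$ expands to $X^4 - 2(c_1^2 n_1 + c_2^2 n_2) X^2 + (c_1^2 n_1 - c_2^2 n_2)^2$, which here is $X^4 - 10X^2 + 1$. The sketch below, with a function name of our choosing, evaluates the lower bound for this polynomial.

```python
from fractions import Fraction
from math import sqrt

def cauchy_lower_bound(coeffs):
    """Bound (ii) of Lemma 4.2.1 for coeffs = [p0, p1, ..., pn] with p0 != 0."""
    p0 = abs(coeffs[0])
    return Fraction(p0, p0 + max(abs(c) for c in coeffs[1:]))

# g(X) = X^4 - 10 X^2 + 1 has the four roots +-sqrt(3) +- sqrt(2).
g = [1, 0, -10, 0, 1]
bound = cauchy_lower_bound(g)          # 1/11
smallest_root = sqrt(3) - sqrt(2)      # floating-point value, for illustration only
```

Here the bound is $1/11$, comfortably below the smallest root in absolute value, so in this toy case a handful of correct bits already fixes the sign of $S$.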

But the arguments above show a bit more. In fact, assume that the degree of the minimal polynomial $g_0$ of $\sum_{i=1}^{k} c_i \sqrt{n_i}$ is $N \le 2^k$. Then (Lemma 2.7, p. 18) $g_0$ has the form

$$g_0 = \prod_{(\epsilon_1, \ldots, \epsilon_k) \in H} \Big( X - \sum_{i=1}^{k} \epsilon_i c_i \sqrt{n_i} \Big),$$

where $H$ is a subset of $\{+1, -1\}^k$ of size $N$. Hence the coefficients of $g_0$ are bounded in absolute value by $(k 2^{2l})^N$. This proves that $2N(l + \log k)$ bits suffice to determine the sign of $S = \sum_{i=1}^{k} c_i \sqrt{n_i}$.

Of course $N$ may be $2^k$ and in this case knowing $N$ will lead to no speed-up in the algorithm that determines the sign of $S = \sum_{i=1}^{k} c_i \sqrt{n_i}$. On the other hand, if $N$ is much smaller than $2^k$, computing $N$ in advance may save a lot of time. In the rest of this section we show how to compute the degree $N$ in time that is almost proportional to the degree itself.

Recall that we assume that the square roots $\sqrt{n_i}$ are linearly independent over $\mathbb{Q}$. Hence by Theorem 3.13, p. 32, the degree of $S$ is the same as the degree of the extension $\mathbb{Q}(\sqrt{n_1}, \ldots, \sqrt{n_k})$ over $\mathbb{Q}$.

Applying the theorems of Section 3 and the algorithms of the previous subsection, a basis for the extension and hence the degree is easily determined as follows.

Assume w.l.o.g. that no square root is rational. Then a basis for $\mathbb{Q}(\sqrt{n_1})$ over $\mathbb{Q}$ is given by $B_1 = \{1, \sqrt{n_1}\}$. Furthermore a basis for $\mathbb{Q}(\sqrt{n_1}, \sqrt{n_2})$ is $B_2 = \{1, \sqrt{n_1}, \sqrt{n_2}, \sqrt{n_1 n_2}\}$. Recall that $\sqrt{n_2} \notin \mathbb{Q}(\sqrt{n_1})$ since $\sqrt{n_1}/\sqrt{n_2} \notin \mathbb{Q}$ and by Lemma 3.6, p. 26, $\sqrt{n_2} \in \mathbb{Q}(\sqrt{n_1})$ would imply either $\sqrt{n_2} \in \mathbb{Q}$ or $\sqrt{n_2} = q\sqrt{n_1}$, $q \in \mathbb{Q}$.

Similarly Lemma 3.6 implies $\sqrt{n_3} \in \mathbb{Q}(\sqrt{n_1}, \sqrt{n_2})$ if and only if $\sqrt{n_3}$ is a rational multiple of an element in $B_2 = \{1, \sqrt{n_1}, \sqrt{n_2}, \sqrt{n_1 n_2}\}$, in which case a basis for $\mathbb{Q}(\sqrt{n_1}, \sqrt{n_2}, \sqrt{n_3})$ is given by $B_2$. Otherwise a basis is given by $B_3 = B_2 \cup B_2\sqrt{n_3}$, where $B_2\sqrt{n_3}$ denotes the set we get by multiplying every element in $B_2$ by $\sqrt{n_3}$.

Continuing in this way we see that once we have a basis $B_{i-1}$ for the extension $\mathbb{Q}(\sqrt{n_1}, \ldots, \sqrt{n_{i-1}}) : \mathbb{Q}$ consisting of square roots only, a basis $B_i$ for $\mathbb{Q}(\sqrt{n_1}, \ldots, \sqrt{n_i}) : \mathbb{Q}$ is given either by $B_{i-1}$ or by $B_{i-1} \cup B_{i-1}\sqrt{n_i}$, depending on whether $\sqrt{n_i}$ is a rational multiple of an element in $B_{i-1}$ or not.

Therefore a basis $B_k$ for $\mathbb{Q}(\sqrt{n_1}, \ldots, \sqrt{n_k}) : \mathbb{Q}$ can be computed by checking for at most $k|B_k| = kN$ pairs of square roots of integers whether their ratio is rational. Furthermore the absolute value of the integers involved is bounded by $\prod_{i=1}^{k} n_i$. Hence by Theorem 4.1.3 the degree $N$ of $\mathbb{Q}(\sqrt{n_1}, \ldots, \sqrt{n_k}) : \mathbb{Q}$ together with a basis $B_k$ can be computed using $O(k^2 N l)$ elementary operations on integers of size $O(kl)$. Here, as above, $n_i < 2^l$.
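The incremental basis construction just described can be sketched as follows. The representation is an assumption made for this illustration: each basis element $\sqrt{m}$ is stored as the integer $m$, and $\sqrt{n}/\sqrt{m}$ is rational exactly when $nm$ is a perfect square, which `math.isqrt` tests directly instead of the binary search of Lemma 4.1.2.

```python
from math import isqrt

def is_square(m: int) -> bool:
    """True if the non-negative integer m is a perfect square."""
    r = isqrt(m)
    return r * r == m

def sqrt_extension_basis(ns):
    """Basis of Q(sqrt(n_1), ..., sqrt(n_k)) over Q, each sqrt(m) stored as m."""
    basis = [1]                                   # B_0 = {1}
    for n in ns:
        # sqrt(n) is a rational multiple of sqrt(b) iff n*b is a perfect square.
        if any(is_square(n * b) for b in basis):
            continue                              # B_i = B_{i-1}
        basis += [b * n for b in basis]           # B_i = B_{i-1} U B_{i-1}*sqrt(n)
    return basis

def extension_degree(ns) -> int:
    """Degree N of the extension, i.e. the size of the basis."""
    return len(sqrt_extension_basis(ns))
```

For example, `extension_degree([2, 3, 6])` yields 4 because $\sqrt{6}$ already lies in $\mathbb{Q}(\sqrt{2}, \sqrt{3})$, while `extension_degree([2, 3, 5])` yields 8.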

Theorem 4.2.2 Let $\sqrt{n_1}, \sqrt{n_2}, \ldots, \sqrt{n_k}$, $n_i \in \mathbb{N}$, $n_i < 2^l$, be a set of real square roots. The degree $N$ of the extension $\mathbb{Q}(\sqrt{n_1}, \sqrt{n_2}, \ldots, \sqrt{n_k}) : \mathbb{Q}$ can be determined using $O(k^2 N l)$ elementary operations on integers of size $O(kl)$.

For any sum $S = \sum_{i=1}^{k} c_i \sqrt{n_i}$, $c_i \in \mathbb{Z}$, $|c_i| < 2^l$, its degree over $\mathbb{Q}$ can be determined within the same time bound.

Proof: By Corollary 4.1.5 within the time bounds stated we compute a maximal subset of linearly independent radicals in $\{\sqrt{n_1}, \sqrt{n_2}, \ldots, \sqrt{n_k}\}$ and transform the sum $S$ into a sum of linearly independent radicals.

In order to compute the degree of the extension we apply the process described above to the maximal subset of linearly independent radicals. In order to determine the degree of $S$ we apply the process to those linearly independent radicals that have a non-zero coefficient in the transformed sum (see Theorem 3.13, p. 32).

Combining this result with the arguments above leads to

Corollary 4.2.3 For any sum $S = \sum_{i=1}^{k} c_i \sqrt{n_i}$, $c_i \in \mathbb{Z}$, $n_i \in \mathbb{N}$, $n_i, |c_i| < 2^l$, its sign can be determined by $O(k^2 N l)$ elementary operations on integers of size $O(kl)$ and $O(k)$ elementary operations on floating-point numbers of size $O(N(l + \log k))$. Here $N$ is the degree of $S$ over $\mathbb{Q}$.

Proof: By the previous theorem the degree of $S$ over $\mathbb{Q}$ can be determined within the time bounds stated. In particular, it will be determined whether $S$ is zero. If this is not the case, by the remarks above $2N(l + \log k)$ correct bits of $S$ suffice to determine the sign of $S$. As is easily seen, computing each square root $\sqrt{n_i}$ with absolute error less than $2^{-3N(l + \log k)}$, multiplying the approximations with the corresponding coefficient $c_i$, and adding the results leads to an approximation to $S$ as required.

Computing each square root $\sqrt{n_i}$ with relative error less than $2^{-4N(l + \log k)}$ on the other hand yields an approximation to $\sqrt{n_i}$ as required. Since taking square roots is one of the elementary operations the corollary follows.
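The three steps of this proof (exact zero test, degree computation, approximation with about $3N(l + \log k)$ bits) can be combined into the following sketch. It is an illustration under our own assumptions rather than the algorithm as analyzed in the corollary: integer square roots via `math.isqrt` replace floating-point arithmetic, the function name is hypothetical, and a precision-doubling loop is added as a safety net in case the starting precision, taken from the bound above without its hidden constants, leaves the sign undecided.

```python
from fractions import Fraction
from math import isqrt

def is_square(m: int) -> bool:
    r = isqrt(m)
    return r * r == m

def sign_of_sqrt_sum(terms):
    """Sign (-1, 0, +1) of sum(c * sqrt(n)) for (c, n) in terms; c int, n positive int."""
    # Step 1: merge square roots with rational ratio (Corollary 4.1.5).
    classes = []                                  # [representative m, coefficient]
    for c, n in terms:
        for cls in classes:
            if is_square(n * cls[0]):             # sqrt(n) = (sqrt(n*m) / m) * sqrt(m)
                cls[1] += Fraction(c * isqrt(n * cls[0]), cls[0])
                break
        else:
            classes.append([n, Fraction(c)])
    live = [(v, m) for m, v in classes if v != 0]
    if not live:
        return 0                                  # S is exactly zero
    # Step 2: degree N of the extension generated by the remaining square roots.
    basis = [1]
    for _, m in live:
        if not any(is_square(m * b) for b in basis):
            basis += [b * m for b in basis]
    N = len(basis)
    # Step 3: enclose S * 2**p between two rationals; start near 3N(l + log k) bits.
    l = max(max(abs(c), n).bit_length() for c, n in terms)
    p = 3 * N * (l + len(terms).bit_length())
    while True:
        low = high = Fraction(0)
        for v, m in live:
            a = isqrt(m << (2 * p))               # a <= sqrt(m) * 2**p < a + 1
            low += v * (a if v > 0 else a + 1)
            high += v * (a + 1 if v > 0 else a)
        if low > 0:
            return 1
        if high < 0:
            return -1
        p *= 2                                    # safety net, not part of the proof
```

For example, `sign_of_sqrt_sum([(1, 2), (1, 3), (-1, 10)])` returns `-1`, since $\sqrt{2} + \sqrt{3} < \sqrt{10}$.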

Let us briefly sketch how to generalize the results above to arbitrary sums of radicals $\sum_{i=1}^{k} c_i \sqrt[d_i]{n_i}$ with $d_i, n_i \in \mathbb{N}$ and $c_i \in \mathbb{Z}$. To prove the next corollary we need certain results from the following sections. The proof of these lemmata needs some terminology that can be defined only in a later section; therefore we will only mention in the proof which lemmata are used.

Corollary 4.2.4 Let $S = \sum_{i=1}^{k} c_i \sqrt[d_i]{n_i}$, $d_i, n_i \in \mathbb{N}$, $c_i \in \mathbb{Z}$, be a sum of real radicals. Assume $n_i, d_i, |c_i| < 2^l$. Using $O((k^2 + kN^2 \log N) l)$ elementary operations on integers of size $O((k + N \log N) l)$ and $O(k \log(N l \log k))$ elementary operations on floating-point numbers of size $O(N(l + \log k))$ the sign of $S$ can be determined. Here $N$ is the degree of $S$ over $\mathbb{Q}$.

Proof: By Corollary 4.1.5 within the time bounds stated $S$ can be transformed into a sum of linearly independent radicals. In particular, it can be decided whether $S$ is zero. If this is not the case, we determine the degree of the radical extension generated by the radicals appearing in the transformed sum with non-zero coefficient. By Theorem 3.13, p. 32, the degree of this extension is the degree of $S$. As will follow from the proof of Lemma 7.2.1, p. 135, this degree can be determined within the time bounds stated (see also Remark 7.2.2, p. 140).

As was the case for the square roots, the minimal polynomial of $S$ is of the form

$$g_0 = \prod_{(\zeta_1, \ldots, \zeta_k) \in H} \Big( X - \sum_{i=1}^{k} \zeta_i c_i \sqrt[d_i]{n_i} \Big),$$

where $H$ is a set of $k$-tuples $(\zeta_1, \zeta_2, \ldots, \zeta_k)$ such that $\zeta_i$ is a $d_i$-th root of unity. The cardinality of $H$ is $N$.

As before, the absolute value of the coefficients in $g_0$ is bounded by $(k 2^{2l})^N$, and by Cauchy's bound $2N(l + \log k)$ bits suffice to determine the sign of $S = \sum_{i=1}^{k} c_i \sqrt[d_i]{n_i}$. As in the previous lemma, approximating each radical $\sqrt[d_i]{n_i}$ with absolute error less than $2^{-3N(l+\log k)}$, multiplying the approximations with the corresponding coefficient $c_i$, and adding the results leads to an approximation to $S$ as required. Lemma 5.4.6, p. 71, adjusted to the rational case shows that the approximations to $\sqrt[d_i]{n_i}$ can be computed by $O(k \log(Nl \log k))$ elementary operations on floating-point numbers of size $O(N(l + \log k))$. This proves the corollary.
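The numerical part of this argument (approximate every radical, multiply by its coefficient, add, and read off the sign) can be sketched as follows. This is only an illustration under stated assumptions: the function names `iroot` and `radical_sum_sign` are hypothetical, the precision `bits` is supplied by the caller rather than derived from $N$, $l$ and $k$, and exact integer arithmetic replaces the floating-point operations of the corollary. The sketch returns `None` when the chosen precision does not separate the sum from zero, which is where the symbolic zero test is needed.

```python
def iroot(x, d):
    """Floor of the d-th root of a non-negative integer x (binary search)."""
    lo, hi = 0, 1 << (x.bit_length() // d + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** d <= x:
            lo = mid
        else:
            hi = mid - 1
    return lo


def radical_sum_sign(terms, bits):
    """Sign of sum(c * n**(1/d)) for terms (c, d, n), using `bits` fractional bits.

    Returns +1 or -1 when the sign is certified at this precision,
    and None when the approximation cannot be told apart from zero.
    """
    total = 0   # approximates 2**bits * S
    slack = 0   # bound on the accumulated truncation error
    for c, d, n in terms:
        r = iroot(n << (d * bits), d)   # floor(n**(1/d) * 2**bits)
        total += c * r
        slack += abs(c)                 # each root is off by less than one unit
    if total > slack:
        return 1
    if total < -slack:
        return -1
    return None


print(radical_sum_sign([(1, 2, 2), (1, 2, 3), (-1, 2, 10)], 20))  # sqrt2 + sqrt3 - sqrt10
```

For a sum that is exactly zero, such as $\sqrt{8} - 2\sqrt{2}$, no finite precision certifies a sign, so the sketch keeps returning `None`.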


5 Linear Dependence of Radicals over Algebraic Number Fields

In this section we generalize the results of Section 4.1 to real radicals over real algebraic number fields and complex radicals over number fields containing certain roots of unity. That is, we want to describe efficient algorithms that check whether a set of radicals $\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k}$ over an algebraic number field $\mathbb{Q}(\alpha)$ is linearly dependent or whether a given linear combination of these radicals is zero. By Corollary 3.10, p. 28, if $\alpha$ and the radicals $\sqrt[d_i]{\rho_i}$ are real or $\mathbb{Q}(\alpha)$ contains primitive $d_i$-th roots of unity, this problem can be reduced to the question whether a ratio of radicals $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$ is contained in $\mathbb{Q}(\alpha)$. So it basically remains to describe an algorithm that answers questions of this kind efficiently. Since the ring of integers in $\mathbb{Q}(\alpha)$ is in general not a unique factorization domain, we really have to work with ratios and cannot, as in Section 4, restrict ourselves to roots of algebraic integers.

First a few words on the input of the problems above. We assume that the field $\mathbb{Q}(\alpha)$ is given by the $n$-tuple $(p_{n-1}, p_{n-2}, \ldots, p_1, p_0) \in \mathbb{Z}^n$ such that the algebraic integer$^{8}$ $\alpha$ has minimal polynomial $p(X) = X^n + \sum_{i=0}^{n-1} p_i X^i$. We also assume that $\alpha$ is distinguished from its conjugates by an isolating interval $I$ in the real case and an isolating rectangle $R$ in the complex case, that is, an interval or rectangle that contains exactly one root of $p$. We will see below (see Lemma 5.1.3, p. 49) that the endpoints can be chosen as rational numbers such that both numerator and denominator are $O(nl)$-bit integers. This representation of an algebraic number field is exactly the one that has been used by Loos [Lo1].

In algebraic number theory an element $\rho \in \mathbb{Q}(\alpha)$ is usually described by an $n$-tuple $(q_0, q_1, \ldots, q_{n-2}, q_{n-1}) \in \mathbb{Q}^n$ such that $\rho = \sum_{i=0}^{n-1} q_i \alpha^i$. In this thesis we assume instead that $\rho$ is encoded by an $(n+1)$-tuple $(b, b_0, b_1, \ldots, b_{n-2}, b_{n-1}) \in \mathbb{Z}^{n+1}$ such that $\rho = \frac{1}{b}\sum_{i=0}^{n-1} b_i \alpha^i$. Moreover, we assume $\gcd(b, b_0, b_1, \ldots, b_{n-1}) = 1$ and $b > 0$. The integer $b$ will be called the denominator of $\rho$. By computing the lcm of the denominators of the $q_i$ we can easily go from one representation to the other. In any case, the total input size of $\rho$ is linear in $n$ and the maximum bit size of the integers $p_i, b, b_i$.
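The passage between the two encodings can be made concrete with a small sketch. The function names `to_integer_form` and `to_rational_form` are hypothetical, and the rational coefficients are assumed to be given as `Fraction` objects in the order $q_0, \ldots, q_{n-1}$.

```python
from fractions import Fraction
from functools import reduce
from math import gcd, lcm  # math.lcm requires Python 3.9 or later


def to_integer_form(q):
    """Map (q_0, ..., q_{n-1}) to (b, [b_0, ..., b_{n-1}]) with rho = (1/b) sum b_i alpha^i."""
    b = lcm(*(x.denominator for x in q))       # common denominator, b > 0
    coeffs = [int(x * b) for x in q]
    g = reduce(gcd, coeffs, b)                 # normalise so that the overall gcd is 1
    return b // g, [c // g for c in coeffs]


def to_rational_form(b, coeffs):
    """Inverse direction: recover the rational coefficients q_i = b_i / b."""
    return [Fraction(c, b) for c in coeffs]


print(to_integer_form([Fraction(1, 2), Fraction(3, 4), Fraction(0)]))
```

Since the input fractions are already in lowest terms, the final gcd normalisation is a safeguard rather than a necessity.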

To avoid ambiguity, in the algorithms of this section we assume that a radical $\sqrt[d]{\rho}$ is given by $d$, $\rho$, and an integer $k$ between $0$ and $d - 1$

$^{8}$ Recall that we can restrict ourselves to fields generated by integers.


such that

$$\sqrt[d]{\rho} = \zeta_d^{k}\, |\rho|^{1/d} \left(\cos\tfrac{1}{d}\phi + i \sin\tfrac{1}{d}\phi\right),$$

where $\phi \in (-\pi, \pi]$ denotes the angle of $\rho$ when written in polar coordinates, $|\rho|^{1/d}$ is the positive $d$-th root, and $\zeta_d = \cos\frac{2\pi}{d} + i \sin\frac{2\pi}{d}$. For the sake of brevity, the integer $k$ will not be mentioned explicitly. If we require that the radical is real we assume that $k$ is an appropriate integer, i.e., $k = 0$ if $d$ is odd and $k = 0$ or $k = \frac{d}{2}$ if $d$ is even.

Therefore the input size of a set of radicals $\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k}$ over an algebraic number field is polynomial in the degree of the field, in the bit size of the coefficients of the minimal polynomial of $\alpha$, in the bit size of the coefficients of the $\rho_i$'s, and in $\log d_i$, $i = 1, 2, \ldots, k$.
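The convention for selecting one of the $d$ possible values of a radical can be illustrated numerically. The following is a floating-point sketch only; the function name `radical` is hypothetical, and $\rho$ is passed as an ordinary complex number rather than as an element of $\mathbb{Q}(\alpha)$.

```python
import cmath
import math


def radical(rho, d, k=0):
    """Value zeta_d**k * |rho|**(1/d) * (cos(phi/d) + i sin(phi/d)) of the d-th root of rho."""
    rho = complex(rho)
    phi = cmath.phase(rho)                 # angle of rho, in [-pi, pi]
    modulus = abs(rho) ** (1.0 / d)        # the positive d-th root of |rho|
    zeta = cmath.exp(2j * math.pi / d)     # primitive d-th root of unity
    return zeta ** k * modulus * cmath.exp(1j * phi / d)


print(radical(4, 2, 0), radical(4, 2, 1))  # the two square roots of 4
```

Raising any of the $d$ values to the $d$-th power returns $\rho$ up to rounding, so $k$ only selects the branch.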

Note that if $\mathbb{Q}(\alpha) \subset \mathbb{R}$ then the question whether two elements $\sqrt[d_i]{\rho_i}, \sqrt[d_j]{\rho_j} \in \{\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k}\} \subset \mathbb{R}$ have a ratio contained in $\mathbb{Q}(\alpha)$, and hence the question whether the elements in $\{\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k}\}$ are linearly independent, does not depend on the values of the radicals in the set as long as they are real. If $\mathbb{Q}(\alpha)$ contains primitive $d_i$-th roots of unity then the question whether the elements in $\{\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k}\}$ are linearly independent is independent of the values of the radicals. We need not even distinguish $\alpha$ from its conjugates by an isolating rectangle since all conjugate fields contain $d_i$-th roots of unity. For linear combinations, on the other hand, we really need to specify the roots $\sqrt[d_i]{\rho_i}$ and the element $\alpha$, since for some interpretations of the roots a sum of radicals may be zero and for others not.

In a sense the question whether the set of radicals is linearly independent over a field containing $d_i$-th roots of unity is the only purely algebraic problem. It can be solved purely symbolically, that is, we need to specify $\alpha$ only by its minimal polynomial and the radicals $\sqrt[d_i]{\rho_i}$ by $\rho_i$ and $d_i$.

We will apply a test whether $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2} \in \mathbb{Q}(\alpha)$ for complex radicals $\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}$ only for fields $\mathbb{Q}(\alpha)$ containing primitive $d_i$-th roots of unity. Hence $\mathbb{Q}(\alpha)$ has degree at least $\max\{\varphi(d_1), \varphi(d_2)\}$. Recall that $\varphi$ denotes Euler's $\varphi$-function. Since $\varphi(d) = \Omega\!\left(\frac{d}{\log\log d}\right)$ (see [Ap]) and the algorithm that decides whether a ratio of radicals is in $\mathbb{Q}(\alpha)$ must be polynomial in the degree of the extension, we cannot hope for an algorithm that is polynomial in $\log d_i$; instead we have to be content with an algorithm that is polynomial in $d_i$. But then the question whether $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2} \in \mathbb{Q}(\alpha)$ can be solved by factoring $\rho_2^{d_1} X^{d_1 d_2} - \rho_1^{d_2}$ over $\mathbb{Q}(\alpha)$ using a polynomial time factorization algorithm for polynomials over algebraic number fields (see [La1],[Le]).


For real radicals over real algebraic number fields the situation is different. We will describe an algorithm with run time polynomial in $\log d_1, \log d_2$ that tests for a ratio of radicals $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$, $\rho_1, \rho_2 \in \mathbb{Q}(\alpha)$, whether it is contained in $\mathbb{Q}(\alpha)$ and, if so, computes the representation in $\mathbb{Q}(\alpha)$.

So we have to derive an upper bound on the representation size of $\frac{\sqrt[d_1]{\rho_1}}{\sqrt[d_2]{\rho_2}}$ as an element of $\mathbb{Q}(\alpha)$ that is polynomial in $\log d_i$. In the second subsection of this section we derive a bound that is even independent of $d_1$ and $d_2$. This may seem quite surprising, but to some extent it only reflects the well-known fact that the size of the coefficients of a factor of a polynomial depends only on the degree of the factor and on the coefficients of the original polynomial but not on its degree (see [Mi1]).

The algorithm for ratios of real radicals which we describe has a similar structure to the algorithm for real radicals over $\mathbb{Q}$. In the first step an approximation to the ratio $\frac{\sqrt[d_1]{\rho_1}}{\sqrt[d_2]{\rho_2}}$ will be computed that is good enough to determine in a second step an element $\gamma \in \mathbb{Q}(\alpha)$ such that if $\frac{\sqrt[d_1]{\rho_1}}{\sqrt[d_2]{\rho_2}}$ is in $\mathbb{Q}(\alpha)$ then it must be $\gamma$. Finally, in the third step we check whether $\gamma$ equals the ratio of radicals by computing $\gamma^{d_1 d_2}$ and comparing it to $\rho_1^{d_2}/\rho_2^{d_1}$.
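The three steps (approximate, reconstruct a candidate, verify exactly) can be sketched for the simplest case $\mathbb{Q}(\alpha) = \mathbb{Q}$ with positive rational radicands. This is a toy analogue under stated assumptions, not the algorithm of this section: the function name `ratio_in_Q` and the parameter `max_den` are hypothetical, double-precision floats stand in for the controlled approximation, and `Fraction.limit_denominator` stands in for the lattice-based reconstruction.

```python
from fractions import Fraction


def ratio_in_Q(rho1, d1, rho2, d2, max_den=10**6):
    """Return rho1**(1/d1) / rho2**(1/d2) as a Fraction if it is rational, else None.

    rho1 and rho2 must be positive rationals. A returned value is always correct,
    because step 3 is exact; a rational ratio can be missed if its denominator
    exceeds max_den or the float approximation is too coarse.
    """
    rho1, rho2 = Fraction(rho1), Fraction(rho2)
    # Step 1: numerical approximation of the ratio of the positive real roots.
    approx = float(rho1) ** (1.0 / d1) / float(rho2) ** (1.0 / d2)
    # Step 2: the only candidate with a small denominator near the approximation.
    gamma = Fraction(approx).limit_denominator(max_den)
    # Step 3: exact check of rho2**d1 * gamma**(d1*d2) == rho1**d2.
    if rho2 ** d1 * gamma ** (d1 * d2) == rho1 ** d2:
        return gamma
    return None


print(ratio_in_Q(8, 2, 2, 2))   # sqrt(8) / sqrt(2)
print(ratio_in_Q(2, 2, 3, 2))   # sqrt(2) / sqrt(3)
```

The exact check in the last step is what makes the candidate trustworthy; the first two steps only have to be precise enough that at most one candidate survives.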

We will describe the second step first in order to determine the quality of the approximation to $\frac{\sqrt[d_1]{\rho_1}}{\sqrt[d_2]{\rho_2}}$ that allows us to determine a unique element $\gamma \in \mathbb{Q}(\alpha)$ such that if $\frac{\sqrt[d_1]{\rho_1}}{\sqrt[d_2]{\rho_2}} \in \mathbb{Q}(\alpha)$ then it must be $\gamma$. The main ingredient of this step will be a variant of the algorithm of Kannan et al. [KLL] to determine the minimal polynomial of an algebraic number once an approximation to this number is given. That these problems are closely related is not hard to see because both can be described as searching for a linear dependence between algebraic numbers. Reconstructing the minimal polynomial of an algebraic number is the same as constructing a minimal integer linear dependence between its powers. Likewise, for (re-)discovering the exact representation of an element $\gamma \in \mathbb{Q}(\alpha)$ we have to determine an integer linear dependence between $\gamma$ and powers of $\alpha$.

Next we describe how to approximate $\frac{\sqrt[d_1]{\rho_1}}{\sqrt[d_2]{\rho_2}}$ with precision as required by the second step. This approximation algorithm will be based on algorithms due to R. Brent, who showed how to evaluate exp, ln, and the trigonometric functions efficiently with small relative error.

In the third step probabilistic arguments are used to check whether the number $\gamma$ computed by the first two steps satisfies $\rho_2^{d_1} \gamma^{d_1 d_2} = \rho_1^{d_2}$. If $\gamma^{d_1 d_2}$ and $\rho_1^{d_2}$, $\rho_2^{d_1}$ are computed by successive squaring then it may happen that the coefficients in the results have $\Omega(d_1 d_2)$ bits. Moreover, unlike the rational case, the size of $d_1, d_2$ cannot be polynomially bounded from above by the representation size of $\rho_1, \rho_2$. In our approach $\gamma^{d_1 d_2}$, $\rho_1^{d_2}$, and $\rho_2^{d_1}$ are actually computed by successive squaring; after each multiplication step, however, we reduce the coefficients of the resulting elements in $\mathbb{Q}(\alpha)$ modulo several randomly chosen small integers. Finally we compare whether the elements in $\mathbb{Q}(\alpha)$ thus obtained are equal. It will be shown that with high probability this gives the correct result.
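The idea of keeping coefficients small during successive squaring can be sketched for elements of $\mathbb{Z}[\alpha]$. The names `mulmod`, `powmod` and `probably_equal_power` are hypothetical, elements are coefficient lists $[a_0, \ldots, a_{n-1}]$, the monic minimal polynomial is passed as its lower coefficients $[p_0, \ldots, p_{n-1}]$, and the moduli are handed in by the caller; the number and size of random moduli needed for a given error probability is not modelled here.

```python
def mulmod(a, b, p, m):
    """Product of a and b in Z[alpha], with alpha**n = -(p[0] + ... + p[n-1] alpha**(n-1)),
    all coefficients reduced modulo m."""
    n = len(p)
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % m
    for k in range(2 * n - 2, n - 1, -1):      # eliminate alpha**k for k >= n
        c = prod[k]
        if c:
            for i in range(n):
                prod[k - n + i] = (prod[k - n + i] - c * p[i]) % m
    return prod[:n]


def powmod(a, e, p, m):
    """a**e in Z[alpha] by successive squaring, coefficients reduced modulo m."""
    n = len(p)
    result = [1 % m] + [0] * (n - 1)
    base = [x % m for x in a]
    while e:
        if e & 1:
            result = mulmod(result, base, p, m)
        base = mulmod(base, base, p, m)
        e >>= 1
    return result


def probably_equal_power(gamma, e, target, p, moduli):
    """True if gamma**e and target agree modulo every modulus in `moduli`."""
    return all(powmod(gamma, e, p, m) == [t % m for t in target] for m in moduli)


# Q(sqrt 2): p(X) = X^2 - 2, and (1 + sqrt 2)^4 = 17 + 12 sqrt 2.
print(probably_equal_power([1, 1], 4, [17, 12], [-2, 0], [10007, 10009]))
```

A true identity passes for every modulus; a false one is rejected unless all chosen moduli happen to divide the coefficient differences, which is what the probabilistic analysis has to bound.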

By not using the reduction step we get a deterministic test with run time polynomial in $d_1, d_2$. We can apply this algorithm in the complex case, where it yields a more efficient solution to the question whether a ratio of radicals is in $\mathbb{Q}(\alpha)$ than using a factorization algorithm.

As a special case, when $d_2 = 1$, the algorithms can be used to determine efficiently whether a radical $\sqrt[d]{\rho}$, $\rho \in \mathbb{Q}(\alpha)$, is in $\mathbb{Q}(\alpha)$.

In the last subsection we apply the previous results to sums of radicals.

But before we describe the algorithms and derive a bound on the representation size of a ratio of radicals we recall some of the basic definitions and bounds for polynomials and their roots. Furthermore we deduce general bounds on representations of algebraic numbers.

5.1 Definitions and Bounds

We review some of the basic facts about polynomials. For the reader's convenience we include some proofs.

Let $p = \sum_{i=0}^{n} p_i X^i \in \mathbb{C}[X]$ be a polynomial with complex coefficients. The length $|p|_2$ of $p$ denotes the euclidean length $\left(\sum_{i=0}^{n} |p_i|^2\right)^{1/2}$ of the vector $(p_0, \ldots, p_n)$. The height $|p|_\infty$ of $p$ is the $L_\infty$-norm $\max\{|p_0|, |p_1|, \ldots, |p_n|\}$ of $(p_0, \ldots, p_n)$. Observe that for any irreducible integer polynomial $p$ its length is at least $\sqrt{2}$.

The first result we mention is a generalization of Cauchy's bounds (Lemma 4.2.1, p. 41) which is due to Landau (see Mignotte [Mi1] for a proof).

Definition 5.1.1 If $p(X) = \sum_{i=0}^{n} p_i X^i \in \mathbb{C}[X]$ has roots $\alpha_0, \ldots, \alpha_{n-1}$, then the measure $M(p)$ of $p$ is defined as

$$M(p) = |p_n| \prod_{j=0}^{n-1} \max\{1, |\alpha_j|\}.$$

Lemma 5.1.2 (Landau) The measure $M(p)$ of a polynomial $p$ satisfies

$$M(p) \le |p|_2.$$
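A small numerical instance may help to fix the two quantities. The sketch below assumes that the roots are already known; the function names `measure` and `length` are hypothetical. For $p = 2X^2 - 5X + 2$ with roots $2$ and $\frac{1}{2}$ we get $M(p) = 2 \cdot 2 \cdot 1 = 4$ and $|p|_2 = \sqrt{33}$.

```python
import math


def measure(leading_coeff, roots):
    """M(p) = |p_n| * product over the roots of max(1, |alpha_j|)."""
    m = abs(leading_coeff)
    for r in roots:
        m *= max(1.0, abs(r))
    return m


def length(coeffs):
    """Euclidean length |p|_2 of the coefficient vector."""
    return math.sqrt(sum(abs(c) ** 2 for c in coeffs))


# p = 2X^2 - 5X + 2 = 2 (X - 2)(X - 1/2)
print(measure(2, [2, 0.5]), length([2, -5, 2]))
```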


Next we give the well-known root separation bound for polynomials. It determines, for example, how to choose the isolating intervals for the description of the fields $\mathbb{Q}(\alpha)$.

Lemma 5.1.3 If $p$ is an integer polynomial of degree $n$ then any pair $\alpha_1, \alpha_2$ of distinct roots of $p$ satisfies

$$|\alpha_1 - \alpha_2| > n^{-(n+2)/2}\, |p|_2^{\,1-n}.$$
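As a quick sanity check of the separation bound, consider $p = X^2 - 2$: here $n = 2$ and $|p|_2 = \sqrt{5}$, so the bound is $2^{-2} \cdot 5^{-1/2} \approx 0.112$, while the two roots $\pm\sqrt{2}$ are $2\sqrt{2} \approx 2.83$ apart. The helper name `separation_bound` is hypothetical.

```python
import math


def separation_bound(coeffs):
    """n**(-(n+2)/2) * |p|_2**(1-n) for an integer polynomial given by its coefficients."""
    n = len(coeffs) - 1
    norm = math.sqrt(sum(c * c for c in coeffs))
    return n ** (-(n + 2) / 2) * norm ** (1 - n)


print(separation_bound([-2, 0, 1]))   # p = X^2 - 2
```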

A proof of this fact can be found in Mignotte's overview [Mi1].

For later purposes we also derive an upper bound for the absolute value of the discriminant. We get this via Hadamard's bound for the determinant of a matrix. For a polynomial with distinct roots $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$, we define the discriminant $\Delta$ as

$$\Delta = \prod_{0 \le i < j \le n-1} (\alpha_i - \alpha_j)^2.$$

Hence $|\Delta|^{1/2} = \left|\prod_{0 \le i < j \le n-1} (\alpha_i - \alpha_j)\right|$ is the absolute value of the determinant of

$$\begin{pmatrix} 1 & \alpha_0 & \cdots & \alpha_0^{n-1} \\ 1 & \alpha_1 & \cdots & \alpha_1^{n-1} \\ \vdots & & & \vdots \\ 1 & \alpha_{n-1} & \cdots & \alpha_{n-1}^{n-1} \end{pmatrix},$$

which is a Vandermonde matrix.

Lemma 5.1.4 (Hadamard's bound) Let $M = (m_{ij})$ be an $(n \times n)$-square matrix whose entries are complex numbers. Define $H(M)$ as the product of the euclidean norms of the rows of $M$,

$$H(M) = \prod_{i=1}^{n} \left(\sum_{j=1}^{n} |m_{ij}|^2\right)^{1/2}.$$

Then

$$|\det M| \le H(M).$$
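Hadamard's bound is easy to check on small matrices. The sketch below uses a naive Laplace expansion for the determinant, which is only sensible for tiny examples; the function names `det` and `hadamard` are hypothetical.

```python
import math


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def hadamard(m):
    """H(M): product of the euclidean norms of the rows of M."""
    return math.prod(math.sqrt(sum(abs(x) ** 2 for x in row)) for row in m)


M = [[2, 0, 1], [1, 3, -1], [0, 1, 4]]
print(det(M), hadamard(M))
```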

Combining Landau's bound for the measure of a polynomial with Hadamard's bound yields an upper bound for the discriminant of a polynomial $p$.


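Lemma 5.1.5 Let $p(X) = \sum_{i=0}^{n} p_i X^i$ be an irreducible polynomial with complex coefficients. Then the discriminant $\Delta$ of $p$ satisfies

$$|\Delta|^{1/2} < |p_n|^{-(n-1)}\, n^n\, |p|_2^{\,n}.$$

Proof: Let $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ be the roots of $p$. Then $|\Delta|^{1/2} = |\det D|$, where $D$ is the matrix

$$D = \begin{pmatrix} 1 & \alpha_0 & \cdots & \alpha_0^{n-1} \\ 1 & \alpha_1 & \cdots & \alpha_1^{n-1} \\ \vdots & & & \vdots \\ 1 & \alpha_{n-1} & \cdots & \alpha_{n-1}^{n-1} \end{pmatrix}$$

from above.

By Hadamard's bound

$$|\det D| \le \prod_{i=0}^{n-1} \left(\sum_{j=0}^{n-1} |\alpha_i|^{2j}\right)^{1/2}.$$

Expanding the product under the square root yields $n^n$ terms each of which is smaller than $\frac{1}{|p_n|^{2(n-1)}} (M(p))^{2(n-1)}$. Hence by Landau's bound for $M(p)$

$$|\Delta|^{1/2} \le n^{n/2}\, \frac{1}{|p_n|^{n-1}}\, |p|_2^{\,n-1}.$$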

Hadamard's bound can be generalized to matrices with complex polynomials as entries. Below this bound will be applied to the resultant of two polynomials.

Lemma 5.1.6 (Goldstein-Graham) Let $M(X) = (M_{ij}(X))$ be an $n \times n$-matrix whose entries are polynomials with complex coefficients. Denote by $m_{ij}$ the $L_1$-norm of $M_{ij}(X)$, that is, the sum of the absolute values of the coefficients of $M_{ij}$. Furthermore define $M'$ as $M' = (m_{ij})$. Then the polynomial $\det M(X) \in \mathbb{C}[X]$ satisfies

$$|\det M(X)|_2 \le H(M').$$

Next we deduce some bounds on the representation size of algebraic numbers.


For an algebraic number $\beta = \beta_0$ of degree $m$ its infinity norm $[\beta]_\infty$ is defined as $\max\{|\beta_0|, |\beta_1|, \ldots, |\beta_{m-1}|\}$, the maximum of the absolute values of its conjugates.

If we assume that $\beta$ is an element of the algebraic number field $\mathbb{Q}(\alpha)$, $\beta = \frac{1}{b}\sum_{i=0}^{n-1} b_i \alpha^i$, $\gcd(b, b_0, b_1, \ldots, b_{n-1}) = 1$, and denote the distinct field embeddings of $\mathbb{Q}(\alpha)$ over $\mathbb{Q}$ by $\sigma_j$, $j = 0, \ldots, n-1$, then $[\beta]_\infty = \max\{|\sigma_0(\beta)|, |\sigma_1(\beta)|, \ldots, |\sigma_{n-1}(\beta)|\}$.

Let us note the following important property of the infinity norm. It will be used throughout the thesis.

Lemma 5.1.7 Let α, β be algebraic numbers. Then

(i) [α+ β]∞ ≤ [α]∞ + [β]∞,

(ii) [αβ]∞ ≤ [α]∞[β]∞.

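Proof: Denote the conjugates of $\alpha$ and $\beta$ by $\alpha_i$, $i = 0, 1, \ldots, n-1$, and $\beta_j$, $j = 0, 1, \ldots, m-1$, respectively. By Lemma 2.7, p. 18, the conjugates of $\alpha + \beta$ and $\alpha\beta$ are among the numbers $\alpha_i + \beta_j$ and $\alpha_i \beta_j$, respectively. Hence

$$[\alpha + \beta]_\infty \le \max_{i,j} |\alpha_i + \beta_j| \le \max_i |\alpha_i| + \max_j |\beta_j| = [\alpha]_\infty + [\beta]_\infty,$$

and similarly for the product.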

For $\beta \in \mathbb{Q}(\alpha)$ as above $[\beta]$ is defined as $\max\{|b|, |b_0|, |b_1|, \ldots, |b_{n-1}|\}$. We will call $[\beta]$ the coefficient size of $\beta$ with respect to $\mathbb{Q}(\alpha)$. Observe that $\beta$ does not in general have a unique representation as $\beta = \frac{1}{b}\sum_{i=0}^{n-1} b_i \alpha^i$, $b, b_i \in \mathbb{Z}$; therefore we cannot define $[\beta]$ without any restrictions on the integers $b, b_i$. By our definition of $[\beta]$ we can speak of the coefficient size of an algebraic number rather than of the coefficient size of a representation of an algebraic number.

$[\beta]$ depends on the field $\mathbb{Q}(\alpha)$ and the generator $\alpha$. But in general it will be clear which field $\mathbb{Q}(\alpha)$ and which generator we are referring to; therefore we do not mention them explicitly in the symbol $[\beta]$.

With these definitions we get

Lemma 5.1.8 Let $\mathbb{Q}(\alpha)$ be an algebraic number field, where $\alpha$ is an algebraic integer with minimal polynomial $p(X) = \sum_{i=0}^{n} p_i X^i$, $p_n = 1$, $p_i \in \mathbb{Z}$. Then any element $\beta$ of the ring of integers $R_\alpha$ of $\mathbb{Q}(\alpha)$ satisfies

$$[\beta] < n^{2n}\, |p|_2^{\,2n}\, [\beta]_\infty.$$


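Proof: Since $\beta \in R_\alpha$ it can be represented as $\beta = \frac{1}{\Delta}\sum_{i=0}^{n-1} b_i \alpha^i$, $b_i \in \mathbb{Z}$ (see Lemma 2.10, p. 22). Hence $\Delta\beta = \sum_{i=0}^{n-1} b_i \alpha^i$, and the integers $b_i$, $i = 0, \ldots, n-1$, are the unique solution to the following equation

$$\begin{pmatrix} 1 & \alpha_0 & \cdots & \alpha_0^{n-1} \\ 1 & \alpha_1 & \cdots & \alpha_1^{n-1} \\ \vdots & & & \vdots \\ 1 & \alpha_{n-1} & \cdots & \alpha_{n-1}^{n-1} \end{pmatrix} \begin{pmatrix} b_0 \\ b_1 \\ \vdots \\ b_{n-1} \end{pmatrix} = \begin{pmatrix} \Delta\sigma_0(\beta) \\ \Delta\sigma_1(\beta) \\ \vdots \\ \Delta\sigma_{n-1}(\beta) \end{pmatrix}.$$

A bound on the absolute values of $\Delta, b_i$ is clearly an upper bound for $[\beta]$.

Denote the matrix on the left-hand side of the equation above by $D$. Since $(\det D)^2$ is the discriminant $\Delta$ of $\alpha$, $\det D \ne 0$. Hence $D^{-1}$ is defined and

$$\begin{pmatrix} b_0 \\ b_1 \\ \vdots \\ b_{n-1} \end{pmatrix} = D^{-1} \begin{pmatrix} \Delta\sigma_0(\beta) \\ \Delta\sigma_1(\beta) \\ \vdots \\ \Delta\sigma_{n-1}(\beta) \end{pmatrix}.$$

Let $D^{-1} = (d_{ij})$. Then $b_i = \Delta \sum_{j=0}^{n-1} d_{ij} \sigma_j(\beta)$ or $|b_i| \le |\Delta| \sum_{j=0}^{n-1} |d_{ij}||\sigma_j(\beta)|$. So we need an upper bound on $|d_{ij}|$.

It is standard linear algebra that

$$d_{ij} = (-1)^{i+j} \det D_{ji} / \det D,$$

where $D_{ji}$ is the $(n-1) \times (n-1)$-matrix we get by deleting the $j$-th row and $i$-th column in $D$. As in the proof of Lemma 5.1.5, using Hadamard's bound and Landau's estimate for the measure yields

$$|\det D_{ji}| \le (n-1)^{\frac{n-1}{2}}\, |p|_2^{\,n-1},$$

because $p_n = 1$. Hence

$$|d_{ij}| < n^{n-1}\, |p|_2^{\,n} / |\Delta^{1/2}|.$$

The lemma follows from the previous bound on $|\Delta|^{1/2}$.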

Formulated differently, the lemma has already been shown by Weinberger, Rothschild [WR] and others (see [La1], [Le], [LMc]).

Next we recall from Loos' paper on resultants [Lo1] how to construct for an element $\beta = \frac{1}{b}\sum_{i=0}^{n-1} b_i \alpha^i$, $b_i \in \mathbb{Z}$, an integer polynomial whose roots are the conjugates of $\beta$, and deduce bounds on $[\beta]_\infty$ and $[\beta^{-1}]_\infty$.


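Definition 5.1.9 Let $f(X) = \sum_{i=0}^{n} f_i X^i$ and $g(X) = \sum_{i=0}^{m} g_i X^i$ be polynomials over an arbitrary commutative ring $R$. The determinant of the following $(n+m) \times (n+m)$-matrix $M$ that consists of $m$ rows of shifted coefficients of $f$ and $n$ rows of shifted coefficients of $g$

$$M = \begin{pmatrix}
f_n & f_{n-1} & \cdots & f_1 & f_0 & & & \\
 & f_n & f_{n-1} & \cdots & f_1 & f_0 & & \\
 & & \ddots & & & & \ddots & \\
 & & & f_n & f_{n-1} & \cdots & f_1 & f_0 \\
g_m & g_{m-1} & \cdots & g_1 & g_0 & & & \\
 & g_m & g_{m-1} & \cdots & g_1 & g_0 & & \\
 & & \ddots & & & & \ddots & \\
 & & & g_m & g_{m-1} & \cdots & g_1 & g_0
\end{pmatrix}$$

is called the resultant $\mathrm{res}(f, g)$ of $f$ and $g$.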
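The definition translates directly into code for polynomials with integer coefficients. The sketch below builds the matrix of shifted coefficient rows and takes its determinant by Laplace expansion, so it is meant for small degrees only; the function names are hypothetical, coefficients are listed from the highest degree down, and both polynomials are assumed to have positive degree.

```python
def sylvester(f, g):
    """Matrix from Definition 5.1.9; f and g are coefficient lists, highest degree first."""
    n, m = len(f) - 1, len(g) - 1
    size = n + m
    rows = []
    for i in range(m):                      # m shifted copies of the coefficients of f
        rows.append([0] * i + list(f) + [0] * (size - n - 1 - i))
    for i in range(n):                      # n shifted copies of the coefficients of g
        rows.append([0] * i + list(g) + [0] * (size - m - 1 - i))
    return rows


def det(mat):
    """Determinant by Laplace expansion along the first row."""
    if len(mat) == 1:
        return mat[0][0]
    return sum((-1) ** j * entry * det([row[:j] + row[j + 1:] for row in mat[1:]])
               for j, entry in enumerate(mat[0]))


def resultant(f, g):
    return det(sylvester(f, g))


print(resultant([1, 0, 1], [1, -2]))    # X^2 + 1 and X - 2
print(resultant([1, 0, -1], [1, -1]))   # X^2 - 1 and X - 1 share the root 1
```

In the second call the two polynomials have a common root, and the resultant vanishes accordingly.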

Lemma 5.1.10 (Loos) Let $\beta = \frac{1}{b}\sum_{i=0}^{n-1} b_i \alpha^i$, $b, b_i \in \mathbb{Z}$, be an element of $\mathbb{Q}(\alpha)$. Denote by $B(Y)$ the polynomial $B(Y) = \sum_{i=0}^{n-1} b_i Y^i$. Then the resultant $r(X) \in \mathbb{Z}[X]$ of $bX - B(Y)$ and $p(Y)$ taken with respect to the ring $\mathbb{Z}[X]$ has roots $\sigma_j(\beta)$, $j = 0, \ldots, n-1$. Here $p$ is the minimal polynomial of $\alpha$ and the $\sigma_j$'s are the distinct field embeddings of $\mathbb{Q}(\alpha)$.

Using the polynomial $r$ we can prove an upper bound on $[\beta^{-1}]_\infty$.

Lemma 5.1.11 Let $\mathbb{Q}(\alpha)$ and $\beta$ be as above. Then

(i) $\beta$ and $\beta^{-1}$ are roots of polynomials $r$ and $r'$, respectively, whose 2-norm is bounded by

$$|r|_2 = |r'|_2 < n^{2n}\, |p|_2^{\,n}\, [\beta]^n.$$

(ii)

$$[\beta]_\infty < 3[\beta]\, |p|_2^{\,n}$$

and

$$[\beta^{-1}]_\infty < n^{2n}\, |p|_2^{\,n}\, [\beta]^n.$$


Proof: We may choose $r$ as the polynomial from the previous lemma. Moreover, if

$$r(X) = \sum_{i=0}^{n} r_i X^i, \quad r_i \in \mathbb{Z},$$

then $\beta^{-1}$ is a root of

$$r'(X) = \sum_{i=0}^{n} r_{n-i} X^i.$$

Hence $|r|_2 = |r'|_2$. To get the bound on $|r|_2$ apply the Goldstein-Graham bound to the matrix defining $r$, which shows

$$|r|_2 < |p|_2^{\,n} \left(\sum_{i=1}^{n-1} |b_i|^2 + (|b_0| + |b|)^2\right)^{n/2} \le |p|_2^{\,n}\, (n+3)^{n/2}\, [\beta]^n < n^{2n}\, |p|_2^{\,n}\, [\beta]^n.$$

By Landau's bound for the measure of a polynomial (Lemma 5.1.2), for any integer polynomial $p$ its roots are bounded in absolute value by $|p|_2$. Applying this to the polynomial $r'$ and the bound on the length of $r'$ from above proves the bound for $[\beta^{-1}]_\infty$.

To prove the better bound on $[\beta]_\infty$ note that by Landau's bound on the measure of $p$ we have $[\alpha]_\infty \le |p|_2$. Moreover, observe that $p$ is irreducible, hence its length is at least $\sqrt{2}$. Now $[\beta]_\infty \le [\beta] \max_j \sum_{i=0}^{n-1} |\sigma_j(\alpha)|^i$, where the maximum is over all field embeddings $\sigma_j$ of $\mathbb{Q}(\alpha)$. Since

$$\sum_{i=0}^{n-1} |\sigma_j(\alpha)|^i \le \sum_{i=0}^{n-1} |p|_2^{\,i} \le \frac{|p|_2^{\,n} - 1}{|p|_2 - 1},$$

the bound follows from $|p|_2 - 1 \ge \sqrt{2} - 1 > \frac{1}{3}$.

Note that Lemma 5.1.8 and Lemma 5.1.11 are dual to one another in the sense that the first one bounds $[\beta]$ in terms of $[\beta]_\infty$ and the second one bounds $[\beta]_\infty$ and $[\beta^{-1}]_\infty$ in terms of $[\beta]$. We will use both directions in the sequel.


5.2 Lattice Basis Reduction and Reconstructing Algebraic Numbers

In this section we answer the following questions: Given an approximation $\bar\gamma$ to an element $\gamma \in \mathbb{Q}(\alpha)$ and a guarantee that the integer coefficients $c, c_i$ in $\gamma = \frac{1}{c}\sum_{i=0}^{n-1} c_i \alpha^i$ are bounded in absolute value by some positive number $K$, can the coefficients $c, c_i$ be computed exactly? How good do we have to choose the approximation $\bar\gamma$?

We will show that a variant of the Kannan, Lenstra, Lovász algorithm to reconstruct minimal polynomials (see [KLL]) can be used to solve this problem.

The main tool that will be used is lattices. Given a set $V$ of linearly independent vectors $V = \{v_1, \ldots, v_m\} \subset \mathbb{R}^n$, $m \le n$, the lattice $\Lambda(V)$ generated by these vectors is the set

$$\Lambda(V) = \left\{ \sum_{i=1}^{m} z_i v_i \;\middle|\; z_i \in \mathbb{Z} \right\}$$

of vectors that can be written as integer linear combinations of the vectors in $V$. The vectors $v_i$ will be described as the columns of an $(n \times m)$-matrix which is also called $V$.

Since we assume that the columns in $V$ are linearly independent, any vector $v \in \Lambda(V)$ can be identified with a unique vector $(z_1, \ldots, z_m) \in \mathbb{Z}^m$ such that $v = \sum_{i=1}^{m} z_i v_i$. Using the matrix $V$ this reads as $v = V(z_1, \ldots, z_m)^T$.

We give two examples of lattices.

Example 1 The identity matrix $I_n$ in $\mathbb{R}^n$,

$$I_n = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ & & \ddots & \\ 0 & 0 & \cdots & 1 \end{pmatrix},$$

generates the set of all vectors in $\mathbb{R}^n$ with integer coordinates. The same lattice is generated if one column in $I_n$ is replaced by a column consisting only of 1's.

Example 2 The $2 \times 2$-matrix

$$\begin{pmatrix} -1 & +1 \\ +1 & +1 \end{pmatrix}$$

generates all integer vectors $(a, b)^T \in \mathbb{R}^2$ satisfying $a \equiv b \pmod 2$.
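Both examples can be checked by hand or with a few lines of code. For Example 1 the sketch relies on the standard fact that an integer square matrix generates all of $\mathbb{Z}^n$ exactly when its determinant is $\pm 1$; for Example 2 it solves $a = -z_1 + z_2$, $b = z_1 + z_2$ for the integer coefficients. The function names are hypothetical.

```python
def det(mat):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(mat) == 1:
        return mat[0][0]
    return sum((-1) ** j * entry * det([row[:j] + row[j + 1:] for row in mat[1:]])
               for j, entry in enumerate(mat[0]))


def example2_coefficients(a, b):
    """Integers (z1, z2) with (a, b) = z1 * (-1, 1) + z2 * (1, 1), or None if there are none."""
    if (a + b) % 2:
        return None                      # a and b differ in parity
    return ((b - a) // 2, (a + b) // 2)


# Example 1 for n = 3: the middle column of I_3 is replaced by a column of ones.
modified_identity = [[1, 1, 0],
                     [0, 1, 0],
                     [0, 1, 1]]
print(det(modified_identity))            # determinant +-1: still generates Z^3
print(example2_coefficients(3, 5), example2_coefficients(1, 2))
```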


An important concept is that of a basis of a lattice. A subset of a lattice $\Lambda(V)$ is called a basis if every element in $\Lambda(V)$ can uniquely be written as an integer linear combination of the basis vectors. As can be seen from the first example above and is clear from linear algebra, a lattice may have many different bases. On the other hand, each lattice has a basis.

Another important concept that has been introduced by Lenstra et al. [LLL] in their breakthrough work on polynomial factorization is a so-called reduced basis of a lattice. For a set $V$ of linearly independent vectors we call the set of vectors we get by applying the Gram-Schmidt orthogonalization process to $V$ the Gram-Schmidt version of $V$.

Definition 5.2.1 Let $\Lambda(V) \subset \mathbb{R}^n$ be an $m$-dimensional lattice. A basis $b_1, \ldots, b_m$ for $\Lambda(V)$ is called reduced if its Gram-Schmidt version $b_1^*, \ldots, b_m^*$ has the following two properties

(i) $\left| (b_i, b_j^*) / (b_j^*, b_j^*) \right| \le \frac{1}{2}$, $1 \le j < i \le m$,

(ii) $\|b_j^*\|_2^2 \le 2 \|b_{j+1}^*\|_2^2$, $1 \le j < m$,

where $(\cdot, \cdot)$ is the usual inner product and $\|\cdot\|_2$ denotes the euclidean length.

The basis $b_1, \ldots, b_m$ is called semi-reduced if its Gram-Schmidt version satisfies

$$\|b_i^*\|_2^2 \le 2^{m+j-i}\, \|b_j^*\|_2^2, \quad 1 \le i < j \le m.$$
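The two notions can be tested mechanically for a given basis. The following sketch computes the Gram-Schmidt version with exact rational arithmetic and checks the inequalities of the definition literally; the function names are hypothetical, and basis vectors are given as tuples of integers.

```python
from fractions import Fraction


def dot(u, v):
    return sum(Fraction(x) * y for x, y in zip(u, v))


def gram_schmidt(basis):
    """Return the Gram-Schmidt version b_i* and the coefficients mu[i, j] = (b_i, b_j*)/(b_j*, b_j*)."""
    ortho, mu = [], {}
    for i, b in enumerate(basis):
        v = [Fraction(x) for x in b]
        for j in range(i):
            mu[i, j] = dot(b, ortho[j]) / dot(ortho[j], ortho[j])
            v = [x - mu[i, j] * y for x, y in zip(v, ortho[j])]
        ortho.append(v)
    return ortho, mu


def is_reduced(basis):
    ortho, mu = gram_schmidt(basis)
    norms = [dot(v, v) for v in ortho]
    size_ok = all(abs(c) <= Fraction(1, 2) for c in mu.values())               # property (i)
    order_ok = all(norms[j] <= 2 * norms[j + 1] for j in range(len(norms) - 1))  # property (ii)
    return size_ok and order_ok


def is_semi_reduced(basis):
    ortho, _ = gram_schmidt(basis)
    norms = [dot(v, v) for v in ortho]
    m = len(basis)
    return all(norms[i] <= 2 ** (m + j - i) * norms[j]
               for i in range(m) for j in range(i + 1, m))


print(is_reduced([(1, 0), (5, 1)]), is_semi_reduced([(1, 0), (5, 1)]))
```

The basis $(1,0), (5,1)$ violates property (i) but still satisfies the weaker semi-reduced condition, since that condition only constrains the lengths of the Gram-Schmidt vectors.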

The concept of a semi-reduced basis was introduced by Schönhage [Sc4].

Observe that a lattice is a discrete object. So the length of a shortest vector taken with respect to the euclidean length $\|\cdot\|_2$ is uniquely defined although there may be many different vectors of this length. For a reduced basis the next lemma was originally proven in [LLL]. For a semi-reduced basis the same proof can be applied.

Lemma 5.2.2 The length of the shortest vector in a reduced basis differs from the length of a shortest non-zero vector in the lattice by at most a factor of $2^{\frac{m-1}{2}}$. The length of a shortest vector in a semi-reduced basis differs from the length of a shortest non-zero vector in the lattice by at most a factor of $2^m$.


A semi-reduced basis can be computed faster than a reduced basis, so we will use this slightly weaker notion in the sequel.

The next theorem establishes the relationship between shortest vectors in a semi-reduced basis and representations of algebraic numbers.

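Theorem 5.2.3 Let $\mathbb{Q}(\alpha)$ be an algebraic number field, where $\alpha$ is an algebraic integer with minimal polynomial $p(X) = \sum_{i=0}^{n} p_i X^i$, $p_n = 1$, $p_i \in \mathbb{Z}$. Let $\gamma$ be an element in $\mathbb{Q}(\alpha)$ such that $[\gamma] < K$, $K \ge 2$. Assume $s$ and $\varepsilon$ are real numbers satisfying

$$s > 2^{2n^2 + 7n}\, n^n\, |p|_2^{\,n}\, K^{4n}, \qquad \varepsilon = 4 s^{-1}.$$

Moreover, suppose $\bar\gamma, \bar\alpha$ are approximations to $\gamma$ and $\alpha$, respectively, such that the following estimates hold for the real and imaginary parts $\Re, \Im$ of $\gamma, \alpha$:

$$|\Re(\bar\gamma) - \Re(\gamma)| < \tfrac{1}{2}\varepsilon, \qquad |\Im(\bar\gamma) - \Im(\gamma)| < \tfrac{1}{2}\varepsilon,$$

$$|\Re(\bar\alpha^i) - \Re(\alpha^i)| < \tfrac{1}{2}\varepsilon, \qquad |\Im(\bar\alpha^i) - \Im(\alpha^i)| < \tfrac{1}{2}\varepsilon, \qquad \forall i \in \{1, 2, \ldots, n-1\}.$$

If $\Lambda(V)$ is generated by the columns of the following $(n+3) \times (n+1)$ matrix

$$V = \begin{pmatrix}
s\Re(\bar\gamma) & s & s\Re(\bar\alpha) & s\Re(\bar\alpha^2) & \cdots & s\Re(\bar\alpha^{n-1}) \\
s\Im(\bar\gamma) & 0 & s\Im(\bar\alpha) & s\Im(\bar\alpha^2) & \cdots & s\Im(\bar\alpha^{n-1}) \\
1 & 0 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & 0 & \cdots & 0 \\
\vdots & & \ddots & & & \vdots \\
0 & 0 & 0 & \cdots & 0 & 1
\end{pmatrix},$$

then the shortest vector $g = V(c, c_0, c_1, \ldots, c_{n-1})^T$ of a semi-reduced basis of $\Lambda(V)$ satisfies

$$\gamma = \frac{-1}{c} \sum_{i=0}^{n-1} c_i \alpha^i.$$

Moreover, $\gcd(c, c_0, c_1, \ldots, c_{n-1}) = 1$.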
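To see the theorem at work, the following sketch reconstructs a hypothetical element $\gamma = \frac{1}{2}(1 + 3\sqrt{2})$ of $\mathbb{Q}(\sqrt{2})$ from integer approximations. Several things here are assumptions and not part of the theorem: a textbook LLL reduction with exact rational arithmetic is swapped in for a semi-reduced basis computation, the imaginary row of $V$ is dropped because everything is real, the scaled entries are truncated to integers, and the scaling factor $s = 10^{13}$ is simply chosen by hand.

```python
from fractions import Fraction
from math import isqrt


def dot(u, v):
    return sum(Fraction(x) * y for x, y in zip(u, v))


def gram_schmidt(basis):
    ortho = []
    mu = [[Fraction(0)] * len(basis) for _ in basis]
    for i, b in enumerate(basis):
        v = [Fraction(x) for x in b]
        for j in range(i):
            mu[i][j] = dot(b, ortho[j]) / dot(ortho[j], ortho[j])
            v = [x - mu[i][j] * y for x, y in zip(v, ortho[j])]
        ortho.append(v)
    return ortho, mu


def lll(basis, delta=Fraction(3, 4)):
    """Textbook LLL reduction; Gram-Schmidt is recomputed each time for simplicity."""
    basis = [list(b) for b in basis]
    k = 1
    while k < len(basis):
        for j in range(k - 1, -1, -1):          # size-reduce b_k against b_j
            _, mu = gram_schmidt(basis)
            q = round(mu[k][j])
            if q:
                basis[k] = [x - q * y for x, y in zip(basis[k], basis[j])]
        ortho, mu = gram_schmidt(basis)
        lhs = dot(ortho[k], ortho[k])
        rhs = (delta - mu[k][k - 1] ** 2) * dot(ortho[k - 1], ortho[k - 1])
        if lhs >= rhs:
            k += 1
        else:
            basis[k], basis[k - 1] = basis[k - 1], basis[k]
            k = max(k - 1, 1)
    return basis


def reconstruct_in_Q_sqrt2(s=10**13):
    """Recover (c, c0, c1) with gamma = -(c0 + c1*sqrt(2)) / c for gamma = (1 + 3*sqrt(2)) / 2."""
    alpha_s = isqrt(2 * s * s)                   # floor(s * sqrt(2))
    gamma_s = (s + isqrt(18 * s * s)) // 2       # s * gamma, truncated
    columns = [
        [gamma_s, 1, 0, 0],                      # column belonging to gamma
        [s,       0, 1, 0],                      # column belonging to alpha**0 = 1
        [alpha_s, 0, 0, 1],                      # column belonging to alpha
    ]
    reduced = lll(columns)
    shortest = min(reduced, key=lambda v: sum(x * x for x in v))
    return tuple(shortest[1:])


print(reconstruct_in_Q_sqrt2())
```

The returned triple is determined only up to sign, and its entries have gcd 1, in line with the second statement of the theorem.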

Proof: The columns in $V$ are linearly independent. Each vector $v \in \Lambda(V)$ can be identified with a unique vector $(z, z_0, z_1, \ldots, z_{n-1}) \in \mathbb{Z}^{n+1}$ such that $v = V(z, z_0, z_1, \ldots, z_{n-1})^T$. Every vector $(z, z_0, z_1, \ldots, z_{n-1}) \in \mathbb{Z}^{n+1}$ in turn can be identified with a unique polynomial $v(X, Y) = zX + \sum_{i=0}^{n-1} z_i Y^i$ in the two variables $X$ and $Y$. Hence there is a one-to-one correspondence between vectors $v \in \Lambda(V)$ and certain polynomials $v(X, Y) \in \mathbb{Z}[X, Y]$.

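It is easily verified that the euclidean norm $\|v\|_2$ of a vector $v \in \Lambda(V)$ satisfies

$$\|v\|_2^2 = s^2 |v(\bar\gamma, \bar\alpha)|^2 + |z|^2 + \sum_{i=0}^{n-1} |z_i|^2.$$

Consider the vector $g = V(c, -c_0, \ldots, -c_{n-1})^T$ and the polynomial $g(X, Y) = cX - \sum_{i=0}^{n-1} c_i Y^i$, corresponding to the representation $\gamma = \frac{1}{c}\sum_{i=0}^{n-1} c_i \alpha^i$, $|c|, |c_i| < K$, $c, c_i \in \mathbb{Z}$. Since $g(\gamma, \alpha) = 0$,

$$|g(\bar\gamma, \bar\alpha)| = |g(\bar\gamma, \bar\alpha) - g(\gamma, \alpha)| \le |c||\bar\gamma - \gamma| + \sum_{i=1}^{n-1} |c_i||\bar\alpha^i - \alpha^i| \le nK\varepsilon,$$

since by assumption $|c| < K$, $|c_i| < K$ for all $i = 0, \ldots, n-1$. So the length $\ell$ of a shortest vector in a semi-reduced basis for $\Lambda(V)$ satisfies (see Lemma 5.2.2)

$$\ell \le 2^{n+1} \|g\|_2 = 2^{n+1} \left( s^2 |g(\bar\gamma, \bar\alpha)|^2 + |c|^2 + \sum_{i=0}^{n-1} |c_i|^2 \right)^{1/2} \le 2^{n+1} \left( (\varepsilon s n K)^2 + (nK)^2 \right)^{1/2} \le 2^{n+1} (\varepsilon s + 1) n K.$$

By choice of $\varepsilon$ and $s$,

$$\ell < 2^{2n+3} K.$$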

Next we claim that for any vector $v \in \Lambda(V)$ whose euclidean norm $\|v\|_2$ is smaller than $2^{2n+3} K$ the corresponding polynomial $v(X, Y)$ satisfies $v(\gamma, \alpha) = 0$.

To prove the claim first note that

$$\|v\|_2 = \left( s^2 |v(\bar\gamma, \bar\alpha)|^2 + |z|^2 + \sum_{i=0}^{n-1} |z_i|^2 \right)^{1/2} \le 2^{2n+3} K$$

implies

$$|v(\bar\gamma, \bar\alpha)| \le 2^{2n+3} K s^{-1}$$

and

$$|z| \le 2^{2n+3} K, \qquad |z_i| \le 2^{2n+3} K, \quad i = 0, \ldots, n-1.$$

From $|v(\bar\gamma, \bar\alpha)| \le 2^{2n+3} K s^{-1}$ we deduce

$$|v(\gamma, \alpha)| \le |v(\gamma, \alpha) - v(\bar\gamma, \bar\alpha)| + |v(\bar\gamma, \bar\alpha)| \le 2^{2n+3} n K \varepsilon + 2^{2n+3} K s^{-1} = 2^{2n+3} K s^{-1} (4n + 1) < 2^{6n} K s^{-1},$$

where the bound on $|v(\gamma, \alpha) - v(\bar\gamma, \bar\alpha)|$ follows in exactly the same way as the corresponding bound for $g$ shown above.

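By assumption on the representation size of $\gamma$ a non-zero integer $c$, $|c| < K$, exists such that $c\gamma \in \mathbb{Z}[\alpha] \subset R_\alpha$. Consider the norm

$$\mathrm{no}(c\,v(\gamma, \alpha)) = \prod_{j=0}^{n-1} c\,v(\sigma_j(\gamma), \sigma_j(\alpha))$$

of $c\,v(\gamma, \alpha)$. Since $c\,v(\gamma, \alpha) \in R_\alpha$ this is a rational integer$^{9}$. Hence

$$\left| c^n \prod_{j=0}^{n-1} v(\sigma_j(\gamma), \sigma_j(\alpha)) \right| \in \mathbb{N} \cup \{0\}.$$

We show that the product is smaller than 1 and hence must be zero.

$$\left| c^n \prod_{j=0}^{n-1} v(\sigma_j(\gamma), \sigma_j(\alpha)) \right| = |c^n|\,|v(\gamma, \alpha)| \prod_{j=1}^{n-1} |v(\sigma_j(\gamma), \sigma_j(\alpha))| < s^{-1} 2^{6n} K^{n+1} \prod_{j=1}^{n-1} |v(\sigma_j(\gamma), \sigma_j(\alpha))|$$

$$\le s^{-1} 2^{6n} K^{n+1} \prod_{j=1}^{n-1} \left( |z \sigma_j(\gamma)| + \sum_{i=0}^{n-1} |z_i \sigma_j(\alpha^i)| \right).$$

Using the above estimates for $|z|$ and $|z_i|$ this shows

$$\left| c^n \prod_{j=0}^{n-1} v(\sigma_j(\gamma), \sigma_j(\alpha)) \right| < s^{-1} 2^{2n^2 + 7n} K^{2n} \prod_{j=1}^{n-1} \left( |\sigma_j(\gamma)| + \sum_{i=0}^{n-1} |\sigma_j(\alpha^i)| \right).$$

Writing $\sigma_j(\gamma)$ as $\frac{1}{c}\sum_{i=0}^{n-1} c_i \sigma_j(\alpha)^i$, combining corresponding powers of $\sigma_j(\alpha)$, and expanding the product yields $n^{n-1}$ terms each of which is smaller than $(K + 1)^{n-1} M(p)^{n-1}$, where $M(p)$ is the measure of $p$ (see Definition 5.1.1, p. 48).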

$^{9}$ Recall from Equation (2) in Section 2 that the norm of an element in $\mathbb{Q}(\alpha)$ corresponds to an integer power of the constant term of the minimal polynomial.


Hence by applying Landau's bound for the measure of a polynomial (see Lemma 5.1.2, p. 48)

$$\left| c^n \prod_{j=0}^{n-1} v(\sigma_j(\gamma), \sigma_j(\alpha)) \right| < s^{-1} 2^{2n^2 + 7n}\, n^{n-1}\, K^{4n}\, |p|_2^{\,n-1}.$$

By choice of $s$ the claim follows.

Since $c \ne 0$, $\left| c^n \prod_{j=0}^{n-1} v(\sigma_j(\gamma), \sigma_j(\alpha)) \right| = 0$ implies that one of the factors in $\prod_{j=0}^{n-1} v(\sigma_j(\gamma), \sigma_j(\alpha))$ must be zero. But these factors are conjugates of each other. Hence if one of them is zero all are, which proves our claim that $\|v\|_2 < 2^{2n+3} K$ implies $v(\gamma, \alpha) = 0$.

Applying this result to the shortest vector $g$ in a semi-reduced basis of the lattice $\Lambda(V)$ shows that $g$ corresponds to a polynomial $g(X, Y) = cX + \sum_{i=0}^{n-1} c_i Y^i$ such that $g(\gamma, \alpha) = 0$. This proves the first part of the theorem.

To prove the second claim observe that if $\gcd(c, c_0, \ldots, c_{n-1}) \ne 1$ we would find a vector $g' = V(c', c'_0, \ldots, c'_{n-1})^T \in \Lambda(V)$ such that $\gcd(c', c'_0, \ldots, c'_{n-1}) = 1$ and $g'(\gamma, \alpha) = 0$. The unique representation of elements of $\mathbb{Q}(\alpha)$ as rational linear combinations of powers of $\alpha$ shows $g' = \frac{c'}{c} g$. Since $\gcd(c', c'_0, \ldots, c'_{n-1}) = 1$ the integer $c$ cannot properly divide $c'$. On the other hand, $g'$ can be written as an integer linear combination of the vectors in the semi-reduced basis. Since the elements of the basis are linearly independent even over $\mathbb{Q}$, this representation must be $g' = \frac{c'}{c} g$. But then $\frac{c'}{c} = +1$ or $\frac{c'}{c} = -1$, which proves the second claim of the theorem.

The proof above is based on ideas of Lovász [Lo] but replaces a brute force estimate by Landau's bound on the measure of a polynomial. Applying this modification to Lovász' analysis of the corresponding bounds for minimal polynomials leads to an improvement of his bounds by a factor of $n$. Thus this simple analysis can be used to deduce the same bounds as in [KLL].

Remark 5.2.4 The proof of the previous theorem can be generalized to give a quite simple analysis of an algorithm to reconstruct minimal polynomials over algebraic number fields and to factor polynomials over algebraic number fields (see also [La1],[Le]). ♦

To turn Theorem 5.2.3 into an algorithm we need to compute a semi-reduced basis for the lattice $\Lambda(V)$$^{10}$. For lattices corresponding to matrices as in the theorem above the best run times are due to Schönhage [Sc4], although in general Schnorr's algorithm [Scr] is better.

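Theorem 5.2.5 (Schönhage) Let $\alpha, \gamma, p, K, s, \varepsilon$ be as in Theorem 5.2.3. A semi-reduced basis for the lattice generated by the columns of

$$V = \begin{pmatrix}
s\Re(\tilde\gamma) & s & s\Re(\tilde\alpha) & s\Re(\tilde\alpha^2) & \cdots & s\Re(\tilde\alpha^{n-1}) \\
s\Im(\tilde\gamma) & 0 & s\Im(\tilde\alpha) & s\Im(\tilde\alpha^2) & \cdots & s\Im(\tilde\alpha^{n-1}) \\
1 & 0 & & \cdots & & 0 \\
0 & 1 & 0 & \cdots & & 0 \\
\vdots & & \ddots & \ddots & & \vdots \\
0 & 0 & & \cdots & 0 & 1
\end{pmatrix}$$

can be computed using $O(n^2 \log s)$ elementary operations on integers whose binary length is bounded by $O(\log s)$.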

Observe that by choice of $s$ and $\varepsilon$ we can assume that the entries of the matrix are integers.

Combining this result with Theorem 5.2.3 and interpreting everything in terms of bits yields

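Theorem 5.2.6 Let $\mathbb{Q}(\alpha)$ be an algebraic number field, where $\alpha$ is an algebraic integer with minimal polynomial $p(X) = \sum_{i=0}^{n} p_i X^i$, $p_n = 1$, $p_i \in \mathbb{Z}$, $|p|_2 < 2^l$. Let $\gamma$ be an element in $\mathbb{Q}(\alpha)$ such that $[\gamma] < 2^B$. Assume $\varepsilon > 0$ satisfies

$$\log\frac{1}{\varepsilon} > 2n^2 + 7n + n\log n + nl + 4nB.$$

Moreover, suppose that approximations $\tilde\gamma, \tilde\alpha$ to $\gamma$ and $\alpha$, respectively, are given such that the following estimates hold for the real and imaginary parts $\Re, \Im$ of $\gamma, \alpha$:

$$|\Re(\gamma) - \Re(\tilde\gamma)| < \varepsilon, \quad |\Im(\gamma) - \Im(\tilde\gamma)| < \varepsilon,$$

$$|\Re(\alpha^i) - \Re(\tilde\alpha^i)| < \varepsilon, \quad |\Im(\alpha^i) - \Im(\tilde\alpha^i)| < \varepsilon, \quad \forall i \in \{1, 2, \ldots, n-1\}.$$

Then the representation of $\gamma$ as $\gamma = \frac{1}{c}\sum_{i=0}^{n-1} c_i \alpha^i$, $c, c_i \in \mathbb{Z}$, and $\gcd(c, c_0, c_1, \ldots, c_{n-1}) = 1$, can be computed with $O(n^3(n + l + B))$ elementary operations on integers whose bit size is bounded by $O(n(n + l + B))$.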

10 Observe that by choice of $s$ and $\varepsilon$ we can assume that the entries of $V$ are integers.


Observe that the approximation to $\alpha$ we use in this theorem is much better than the one we require for the isolating rectangle (see the separation bound of Lemma 5.1.3, p. 49). Also observe that by the second claim in Theorem 5.2.3 the representation for $\gamma$ computed by the lattice basis reduction is normalized in the sense that the gcd of the integers appearing in the representation is 1. Therefore the integers of the representation will be smaller than $2^B$ in absolute value.
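The reconstruction step can be illustrated on a toy example. The sketch below is not the algorithm analyzed above: it assumes a real number field (so only the real row of $V$ is kept), it uses a textbook LLL reduction with parameter 3/4 in exact rational arithmetic instead of Schönhage's semi-reduction, and the scaling factor `s` is chosen ad hoc rather than from the bound on $\varepsilon$. The names `lll` and `reconstruct` are hypothetical helpers introduced for illustration only.

```python
from fractions import Fraction
from math import isqrt


def lll(rows, delta=Fraction(3, 4)):
    """Textbook LLL reduction of linearly independent integer row vectors."""
    b = [list(r) for r in rows]
    n = len(b)

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    def gram_schmidt():
        # Exact Gram-Schmidt orthogonalisation over the rationals.
        bstar, mu = [], [[Fraction(0)] * n for _ in range(n)]
        for i in range(n):
            v = [Fraction(x) for x in b[i]]
            for j in range(i):
                mu[i][j] = dot(b[i], bstar[j]) / dot(bstar[j], bstar[j])
                v = [x - mu[i][j] * y for x, y in zip(v, bstar[j])]
            bstar.append(v)
        return bstar, mu

    k = 1
    while k < n:
        for j in range(k - 1, -1, -1):           # size-reduce b[k]
            _, mu = gram_schmidt()
            q = round(mu[k][j])
            if q:
                b[k] = [x - q * y for x, y in zip(b[k], b[j])]
        bstar, mu = gram_schmidt()
        lhs = dot(bstar[k], bstar[k])
        rhs = (delta - mu[k][k - 1] ** 2) * dot(bstar[k - 1], bstar[k - 1])
        if lhs >= rhs:                           # Lovász condition holds
            k += 1
        else:
            b[k], b[k - 1] = b[k - 1], b[k]
            k = max(k - 1, 1)
    return b


def reconstruct(gamma_s, alpha_powers_s):
    """Return (c, [c_0, ..., c_{n-1}]) with gamma = (1/c) * sum c_i alpha^i.

    gamma_s and alpha_powers_s are integer approximations to s*gamma and
    s*alpha^i (alpha_powers_s[0] == s); only the real row of V is used.
    """
    cols = [gamma_s] + list(alpha_powers_s)
    m = len(cols)
    # Each lattice generator is one column of V: scaled value plus unit vector.
    rows = [[cols[i]] + [int(i == j) for j in range(m)] for i in range(m)]
    shortest = lll(rows)[0]
    c, rest = shortest[1], shortest[2:]
    if c < 0:
        c, rest = -c, [-x for x in rest]
    # The short vector encodes c*gamma + sum rest_i*alpha^i = 0.
    return c, [-x for x in rest]


s = 10 ** 12
alpha_s = isqrt(2 * s * s)        # floor(s * sqrt(2)), i.e. alpha = sqrt(2)
gamma_s = (s + alpha_s) // 3      # approximates s * (1 + sqrt(2)) / 3
print(reconstruct(gamma_s, [s, alpha_s]))
```

For $\alpha = \sqrt{2}$ and $\gamma = (1 + \sqrt{2})/3$ the shortest vector of the reduced basis encodes the relation $3\gamma - 1 - \alpha = 0$, so the normalized representation has $c = 3$ and $c_0 = c_1 = 1$.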

As we mentioned, the technique we just presented to reconstruct algebraic numbers was originally invented by Kannan, Lenstra, Lovász [KLL] and (in a slightly different form) Schönhage [Sc4] to reconstruct minimal polynomials from approximations. For later use let us state the precise result of Schönhage, who has slightly better run times.

Theorem 5.2.7 (Schönhage) Let $\alpha$ be an algebraic integer such that the minimal polynomial $p$ of $\alpha$ is of degree at most $n$ and satisfies $|p|_2 \le 2^l$. Given an approximation $\tilde\alpha$ to $\alpha$ with $|\alpha - \tilde\alpha| \le 2^{-3(n^2 + n\log n + n + nl)}$, the minimal polynomial $p$ can be reconstructed exactly using at most $O(n^3(n + l))$ elementary operations on integers of length at most $O(n^2 + nl)$.


5.3 Ratios of Radicals in Algebraic Number Fields

In this subsection we apply the result of the previous section to ratios of radicals in real and certain complex algebraic number fields. That is, we describe an algorithm which, given a "good" approximation to a ratio $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$ of radicals $\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}$ over an algebraic number field $\mathbb{Q}(\alpha)$, computes an element $\gamma \in \mathbb{Q}(\alpha)$ such that $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2} = \gamma$ if the ratio is in $\mathbb{Q}(\alpha)$. We derive a similar result for simple radicals $\sqrt[d]{\rho}$.

Lemma 5.3.1 Assume $\sqrt[d]{\rho}$ is a radical over the number field $\mathbb{Q}(\alpha)$, where $\alpha$ is an algebraic integer with minimal polynomial $p(X)$. If $\sqrt[d]{\rho} \in \mathbb{Q}(\alpha)$ then

$$[\sqrt[d]{\rho}] < 3[\rho]^2 n^{2n} |p|_2^{3n}.$$

Proof: $\sqrt[d]{\rho}$ is a solution to the equation $X^d - \rho = 0$. Let $b$ be the denominator of $\rho$. Then $b\sqrt[d]{\rho}$ is a solution to $X^d - b^d\rho = 0$. This shows that $b\sqrt[d]{\rho}$ is an algebraic integer ($b^d\rho$ is an algebraic integer). Since we want to apply Lemma 5.1.8 we need a bound for $[\sqrt[d]{\rho}]_\infty$.

The conjugates of $\sqrt[d]{\rho}$ over $\mathbb{Q}$ are among the $d$-th roots of the conjugates of $\rho$. In fact, if $f$ is the minimal polynomial of $\rho$ then the polynomial $f(X^d)$ has root $\sqrt[d]{\rho}$. Therefore the conjugates of $\sqrt[d]{\rho}$ must be among the roots of $f(X^d)$, which in turn are all $d$-th roots of the conjugates of $\rho$. Hence $[\sqrt[d]{\rho}]_\infty \le [\rho]_\infty^{1/d} \le \max\{1, [\rho]_\infty\}$ and $[b\sqrt[d]{\rho}]_\infty \le b\max\{1, [\rho]_\infty\}$. Here $[\rho]_\infty^{1/d}$ denotes of course the positive real root.

By Lemma 5.1.11, p. 53, $[\rho]_\infty < 3[\rho]|p|_2^n$. Therefore by Lemma 5.1.8, p. 51,

$$[b\sqrt[d]{\rho}] < 3 n^{2n} |p|_2^{3n} [\rho]^2.$$

By Lemma 2.10, p. 22, the denominator of $b\sqrt[d]{\rho} \in R_\alpha$ is even bounded by $|\Delta|$, where $\Delta$ is the discriminant of $\alpha$. Therefore the denominator of $\sqrt[d]{\rho}$ is bounded by $|b||\Delta|$. Hence the result follows from the bound on $|\Delta|^{1/2}$ (see Lemma 5.1.5, p. 50).

If we interpret the lemma in terms of bit complexity it states that if $|p|_2 < 2^l$ and $[\rho] < 2^L$ then $\sqrt[d]{\rho} \in \mathbb{Q}(\alpha)$ implies

$$\log [\sqrt[d]{\rho}] < 2n\log n + 3nl + 2L + \log 3.$$


For the rest of Section 5 define $\bar L := \lceil n\log n + nl + L \rceil$. Then the coefficient size above is strictly less than $2^{3\bar L}$.
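The comparison with the rounded-up quantity can be spelled out as follows. This is only a sketch: it writes $\bar L = \lceil n\log n + nl + L \rceil$ for the quantity just defined, takes logarithms to base 2, and assumes $n \ge 2$ so that $n\log n + L > \log 3$.

```latex
\begin{align*}
\log [\sqrt[d]{\rho}]
  &< \log\bigl(3[\rho]^2 n^{2n} |p|_2^{3n}\bigr) \\
  &< 2L + 2n\log n + 3nl + \log 3 \\
  &= 3(n\log n + nl + L) - (n\log n + L - \log 3) \\
  &< 3\bar L .
\end{align*}
```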

The important thing to notice here is that the bounds are independent of $d$. This may seem quite surprising but it only reflects the fact that the size of the coefficients of factors of a polynomial depends only on the degree of the factors and on the size of the coefficients of the polynomial but not on its degree (see [WR]).

Next we deduce a similar bound for ratios of radicals over real algebraic number fields or certain complex algebraic number fields.

Lemma 5.3.2 Suppose that $\mathbb{Q}(\alpha)$ is either a real algebraic number field or a number field containing $d_1$-th and $d_2$-th primitive roots of unity. In both cases let $\alpha, p$ be as above. If $\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}$ are either real radicals in the real case or arbitrary radicals in the complex case then $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2} \in \mathbb{Q}(\alpha)$ implies

$$\left[\frac{\sqrt[d_1]{\rho_1}}{\sqrt[d_2]{\rho_2}}\right] < 3 n^{6n} |p|_2^{5n} [\rho_1]^2 [\rho_2]^{2n}.$$

Proof: First we prove a bound on the denominator of $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$. By Lemma 3.4, p. 24, $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2} \in \mathbb{Q}(\alpha)$ implies $\sqrt[d_1]{\rho_1}^{\,d} \in \mathbb{Q}(\alpha)$ and $\sqrt[d_2]{\rho_2}^{\,d} \in \mathbb{Q}(\alpha)$, where $d = \gcd(d_1, d_2)$. Furthermore let $d'_1 = d_1/d$, $d'_2 = d_2/d$.

Hence $\sqrt[d_1]{\rho_1}^{\,d}$ is a root of

$$X^{d'_1} - \rho_1$$

and $\sqrt[d_2]{\rho_2}^{\,-d}$ is a root of

$$X^{d'_2} - \frac{1}{\rho_2}.$$

By the same argument as in the proof of Lemma 5.3.1, $b\sqrt[d_1]{\rho_1}^{\,d}$ is an algebraic integer for some $b \in \mathbb{Z}$, $b \le [\rho_1]$. We need a similar result for $1/\sqrt[d_2]{\rho_2}^{\,d}$. Again we only need to bound the size of a rational integer $b'$ such that $b'\frac{1}{\rho_2}$ is an algebraic integer. As has been observed in Section 2, if $\frac{1}{\rho_2}$ is a root of the polynomial $r \in \mathbb{Z}[X]$ whose leading coefficient is $r_m$ then $r_m\frac{1}{\rho_2}$ is an algebraic integer. By Lemma 5.1.11, $\frac{1}{\rho_2}$ is a root of a polynomial $r$ whose 2-norm $|r|_2$ is bounded by $n^{2n}|p|_2^n[\rho_2]^n$. This gives us the bound on $b'$. Moreover, combined with the bound on $b$ it shows that an integer $c$ exists such that

$$c\left(\frac{\sqrt[d_1]{\rho_1}}{\sqrt[d_2]{\rho_2}}\right)^d \in R_\alpha, \quad |c| < |p|_2^n n^{2n} [\rho_1][\rho_2]^n.$$

Next observe that $\frac{\sqrt[d_1]{\rho_1}}{\sqrt[d_2]{\rho_2}}$ is a root of

$$X^d - \left(\frac{\sqrt[d_1]{\rho_1}}{\sqrt[d_2]{\rho_2}}\right)^d.$$

Using again the argument of the proof of the previous lemma it has been shown that if $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2} \in \mathbb{Q}(\alpha)$ then an integer $c < |p|_2^n n^{2n}[\rho_1][\rho_2]^n$ exists such that

$$c\,\frac{\sqrt[d_1]{\rho_1}}{\sqrt[d_2]{\rho_2}} \in R_\alpha.$$

By Lemma 5.1.8, p. 51, and Lemma 5.1.7, p. 51,

$$\left[c\,\frac{\sqrt[d_1]{\rho_1}}{\sqrt[d_2]{\rho_2}}\right] < 3 n^{2n} |p|_2^{2n} |c| [\rho_1]_\infty [\rho_2^{-1}]_\infty.$$

By Lemma 2.10, p. 22, the denominator of $c\,\frac{\sqrt[d_1]{\rho_1}}{\sqrt[d_2]{\rho_2}}$ is bounded by $|\Delta|$, where $\Delta$ is the discriminant of $\alpha$. Hence the denominator of $\frac{\sqrt[d_1]{\rho_1}}{\sqrt[d_2]{\rho_2}}$ is bounded by $|c||\Delta|$. Using the bound on $c$ from above, the bound for $\Delta$ (Lemma 5.1.5, p. 50), and the bounds for $[\rho_1]_\infty, [\rho_2^{-1}]_\infty$ from Lemma 5.1.11, p. 53, the lemma follows.

If we assume $|p|_2 < 2^l$ and $[\rho_1], [\rho_2] < 2^L$ then

$$\log\left[\frac{\sqrt[d_1]{\rho_1}}{\sqrt[d_2]{\rho_2}}\right] < 6n\log n + 5nl + 3nL + \log 3.$$

Defining for the rest of Section 5 $\bar L := \lceil n\log n + nl + nL \rceil$, the coefficient size of a ratio of radicals is strictly less than $2^{6\bar L}$.

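Using the bounds of the previous two lemmata and approximations to $\alpha$ and $\sqrt[d]{\rho}$ or $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$ we can apply Theorem 5.2.6 in order to determine the integer vector $(c, c_0, c_1, \ldots, c_{n-1}) \in \mathbb{Z}^{n+1}$ such that if $\sqrt[d]{\rho}$ or $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$ is in $\mathbb{Q}(\alpha)$ then its representation is $\frac{1}{c}\sum_{i=0}^{n-1} c_i\alpha^i$.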


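Corollary 5.3.3 Suppose $\sqrt[d]{\rho}$ is a radical over an algebraic number field $\mathbb{Q}(\alpha)$. Here $\alpha$ is an algebraic integer with minimal polynomial $p(X) = \sum_{i=0}^{n} p_i X^i$, $p_n = 1$, $p_i \in \mathbb{Z}$, $|p|_2 < 2^l$. Assume $[\rho] < 2^L$ and let $\varepsilon > 0$ be such that

$$\log\frac{1}{\varepsilon} > 2n^2 + n\log n + 7n + nl + 12n\bar L.$$

Moreover, suppose that approximations $\tilde\gamma, \tilde\alpha$ to $\gamma = \sqrt[d]{\rho}$ and $\alpha$, respectively, are given such that the following estimates for the real and imaginary parts $\Re, \Im$ of $\gamma, \alpha$ hold:

$$|\Re(\gamma) - \Re(\tilde\gamma)| < \varepsilon, \quad |\Im(\gamma) - \Im(\tilde\gamma)| < \varepsilon,$$

$$|\Re(\alpha^i) - \Re(\tilde\alpha^i)| < \varepsilon, \quad |\Im(\alpha^i) - \Im(\tilde\alpha^i)| < \varepsilon, \quad \forall i \in \{1, 2, \ldots, n-1\}.$$

Then integers $c, c_i$, $\gcd(c, c_0, c_1, \ldots, c_{n-1}) = 1$, such that if $\sqrt[d]{\rho} \in \mathbb{Q}(\alpha)$ then $\sqrt[d]{\rho} = \frac{1}{c}\sum_{i=0}^{n-1} c_i\alpha^i$, can be computed using $O(n^3\bar L)$ elementary operations on integers of bit size at most $O(n\bar L)$.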

Observe that in this corollary we may even assume that $\sqrt[d]{\rho}$ is specified by the approximation rather than by the form given in the introduction. It should be noted however that the approximation does not necessarily specify a unique root of $\rho$.

This corollary can be used to check whether $\mathbb{Q}(\alpha)$ contains certain roots of unity. This test in turn can be used to check the condition for $\mathbb{Q}(\alpha)$ we need for the linear dependence result on complex radicals over $\mathbb{Q}(\alpha)$.

Replacing the bound from Lemma 5.3.1 by the bound in Lemma 5.3.2 we similarly get

Corollary 5.3.4 Suppose $\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}$ are either real radicals over a real algebraic number field $\mathbb{Q}(\alpha)$ or arbitrary radicals over a number field $\mathbb{Q}(\alpha)$ containing primitive $d_1$-th and $d_2$-th roots of unity. Here $\alpha$ is an algebraic integer with minimal polynomial $p(X) = \sum_{i=0}^{n} p_i X^i$, $p_n = 1$, $p_i \in \mathbb{Z}$, $|p|_2 < 2^l$. Let $[\rho_1], [\rho_2] < 2^L$ and assume that $\varepsilon > 0$ satisfies

$$\log\frac{1}{\varepsilon} > 2n^2 + n\log n + 7n + nl + 24n\bar L.$$

Moreover, suppose that approximations $\tilde\gamma, \tilde\alpha$ to $\gamma = \sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$ and $\alpha$, respectively, are given such that the following estimates hold:

$$|\Re(\gamma) - \Re(\tilde\gamma)| < \varepsilon, \quad |\Im(\gamma) - \Im(\tilde\gamma)| < \varepsilon,$$

$$|\Re(\alpha^i) - \Re(\tilde\alpha^i)| < \varepsilon, \quad |\Im(\alpha^i) - \Im(\tilde\alpha^i)| < \varepsilon, \quad \forall i \in \{1, 2, \ldots, n-1\}.$$

Then integers $c, c_0, c_1, \ldots, c_{n-1}$, $\gcd(c, c_0, c_1, \ldots, c_{n-1}) = 1$, such that if $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2} \in \mathbb{Q}(\alpha)$ then $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2} = \frac{1}{c}\sum_{i=0}^{n-1} c_i\alpha^i$, can be computed exactly using $O(n^3\bar L)$ elementary operations on integers of bit size at most $O(n\bar L)$.

Of course in the real case all imaginary parts are zero. As above, we may assume that the radicals are specified simply by the approximations.

The results in this section are restricted in two directions. First, the two corollaries above assume that approximations to radicals or ratios of radicals are already given. Second, the corollaries do not tell us whether a radical or a ratio of radicals is contained in some number field; rather they tell us how to compute candidate representations for these numbers in the algebraic number field. The next two subsections show how to resolve these restrictions.


5.4 Approximating Radicals and Ratios of Radicals

In Corollary 5.3.3 and Corollary 5.3.4 we assumed that approximations to $\alpha$, $\sqrt[d]{\rho}$, and $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$, respectively, are given. In this subsection we show how to compute such approximations efficiently. In particular, the run time of the approximation algorithms will depend polynomially only on $\log d_i$ rather than on $d_i$. Our algorithms use several well-known approximation algorithms. The first one is due to Schönhage [Sc3].

Theorem 5.4.1 (Schönhage) Let $p(X) \in \mathbb{Z}[X]$ be an integer polynomial such that $\deg p = n$, $|p|_2 < 2^l$. Furthermore assume $\varepsilon < 2^{-n\log n - nl}$. Then approximations $\tilde\alpha_0, \tilde\alpha_1, \ldots, \tilde\alpha_{n-1}$ to the roots $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ of $p$ with $|\alpha_i - \tilde\alpha_i| < \varepsilon$ can be computed using $O(n)$ elementary operations on floating-point numbers of size at most $O(n\log\frac{1}{\varepsilon})$.

We assume $\varepsilon < 2^{-n\log n - nl}$ because this bound suffices to separate the roots $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ of $p$ (see Lemma 4.2.1, p. 41). Hence the approximations $\tilde\alpha_0, \tilde\alpha_1, \ldots, \tilde\alpha_{n-1}$ will be $n$ distinct complex numbers. Moreover, the output of this algorithm will be $n$ floating-point numbers of the kind we defined in the introduction to this thesis (see the introduction of [Sc3]). The second type of approximation algorithms is due to Brent [Br].

Theorem 5.4.2 (Brent)

1. $\pi$ can be computed with relative error less than $O(2^{-n})$ by $O(\log n)$ elementary operations on floating-point numbers of size at most $O(n)$.

2. Let $-\infty < a < b < \infty$. If $x$ is an $n$-bit floating-point number in $[a, b]$ then $\exp(x)$ can be computed with relative error less than $O(2^{-n})$ by $O(\log n)$ elementary operations on floating-point numbers of size $O(n)$. If $a > 0$ the same is true for $\ln x$.

3. For all $n$-bit floating-point numbers $x$, $\arctan(x)$ can be computed with relative error less than $O(2^{-n})$ by $O(\log n)$ elementary operations on floating-point numbers of size $O(n)$.

4. Let $-\pi < a < b < \pi$. If $x$ is an $n$-bit floating-point number in $[a, b]$ then $\sin(x)$ and $\cos(x)$ can be computed with relative error less than $O(2^{-n})$ using $O(\log n)$ elementary operations on floating-point numbers of size $O(n)$.

Since $\sin(x), \cos(x)$ are bounded in absolute value by 1 and $\arctan(x)$ is bounded in absolute value by $\pi$, Brent's algorithms can actually be used to compute the trigonometric functions with absolute error less than $O(2^{-n})$ within the time bounds stated above. Of course the same is true for $\pi$ and, more importantly, since $[a, b]$ is a fixed interval it is also correct for $\ln(x)$ and $\exp(x)$ for all $x$ in the interval $[a, b]$.

We also need several standard estimates on absolute errors. We summarize them in the following lemma.

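Lemma 5.4.3 Let $x, \tilde x$ be complex numbers satisfying $2^{-m} < |x| < 2^m$, $0 < \varepsilon < 2^{-(m+1)}$, $|x - \tilde x| < \varepsilon$. Then

1. $|x^i - \tilde x^i| < \varepsilon 2^{2(m+1)i}$.

2. If $\max\{|b_0|, |b_1|, \ldots, |b_{n-1}|\} < 2^L$ then $\left|\sum_{i=0}^{n-1} b_i x^i - \sum_{i=0}^{n-1} b_i \tilde x^i\right| < \varepsilon 2^{2(m+1)n + L}$.

3. $\left|\frac{1}{x} - \frac{1}{\tilde x}\right| < \varepsilon 2^{2m+3}$.

4. If $d \ge 2$ then $\left|x^{\frac{1}{d}} - \tilde x^{\frac{1}{d}}\right| < 2^{m+1}\varepsilon$.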

The first two bounds follow from standard estimates. The third and fourth bound are simple consequences of the Mean Value Theorem in its complex version (see for example [C]).
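As a quick sanity check, the four estimates can be evaluated numerically for one sample input. The sketch below is only an illustration in ordinary double precision, restricted to a positive real $x$ (so that the $d$-th root is unambiguous); the function name and the sample values are assumptions chosen for this example, not part of the lemma.

```python
def lemma_5_4_3_bounds_hold(x, x_approx, m, eps, coeffs, L, d):
    """Check the four estimates of Lemma 5.4.3 for a positive real x."""
    # Hypotheses of the lemma.
    assert 2 ** -m < abs(x) < 2 ** m and 0 < eps < 2 ** -(m + 1)
    assert abs(x - x_approx) < eps
    assert max(abs(b) for b in coeffs) < 2 ** L
    n = len(coeffs)
    # 1. powers of x
    power_ok = all(
        abs(x ** i - x_approx ** i) < eps * 2 ** (2 * (m + 1) * i)
        for i in range(1, n)
    )
    # 2. integer linear combinations of powers
    poly = sum(b * x ** i for i, b in enumerate(coeffs))
    poly_approx = sum(b * x_approx ** i for i, b in enumerate(coeffs))
    poly_ok = abs(poly - poly_approx) < eps * 2 ** (2 * (m + 1) * n + L)
    # 3. inverse
    inverse_ok = abs(1 / x - 1 / x_approx) < eps * 2 ** (2 * m + 3)
    # 4. d-th root
    root_ok = abs(x ** (1 / d) - x_approx ** (1 / d)) < 2 ** (m + 1) * eps
    return power_ok and poly_ok and inverse_ok and root_ok


print(lemma_5_4_3_bounds_hold(5.3, 5.3 + 2.0 ** -12, m=3, eps=2.0 ** -10,
                              coeffs=[3, -7, 2, 5], L=3, d=5))
```

For this input all four inequalities hold with a wide margin, which reflects that the bounds of the lemma are convenient rather than tight.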

The following lemma is the crucial step in the analysis of the approximation algorithms.

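Lemma 5.4.4 Assume $x \in \mathbb{R}$, $\varepsilon$ satisfy the bounds of the previous lemma, $d, k \in \mathbb{N}$, and $d > 2$, $k < d$. If

$$y \in I = \left[(1 - \varepsilon)\exp\left((1 - \varepsilon)\frac{k}{d}\ln x\right),\ (1 + \varepsilon)\exp\left((1 + \varepsilon)\frac{k}{d}\ln x\right)\right]$$

then $y$ is an approximation to $\exp(\frac{k}{d}\ln x)$, i.e. $x^{\frac{k}{d}}$, with relative error less than $4m\varepsilon$.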


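Proof: First we prove a lower bound on the left endpoint of the interval $I$.

$$(1 - \varepsilon)\exp\left((1 - \varepsilon)\frac{k}{d}\ln x\right) = (1 - \varepsilon)\exp\left(\frac{k}{d}\ln x\right)\exp\left(-\varepsilon\frac{k}{d}\ln x\right) > x^{\frac{k}{d}}(1 - \varepsilon)\left(1 - \varepsilon\frac{k}{d}\ln x\right),$$

since $\exp(\delta) > 1 + \delta$ for all $\delta \neq 0$. Therefore

$$(1 - \varepsilon)\exp\left((1 - \varepsilon)\frac{k}{d}\ln x\right) > (1 - 2m\varepsilon)x^{\frac{k}{d}}.$$

Similarly we get an upper bound for the right endpoint of $I$.

$$(1 + \varepsilon)\exp\left((1 + \varepsilon)\frac{k}{d}\ln x\right) = (1 + \varepsilon)\exp\left(\frac{k}{d}\ln x\right)\exp\left(\varepsilon\frac{k}{d}\ln x\right) < x^{\frac{k}{d}}(1 + \varepsilon)\left(1 + 2\varepsilon\frac{k}{d}\ln x\right),$$

since $\exp\delta < 1 + 2\delta$ for $|\delta| \le \frac{1}{2}$. Hence

$$(1 + \varepsilon)\exp\left((1 + \varepsilon)\frac{k}{d}\ln x\right) < (1 + 4m\varepsilon)x^{\frac{k}{d}}.$$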

Now we can analyze the time needed to compute the approximations required by Corollary 5.3.3 and Corollary 5.3.4. First let us state the result for approximating powers of $\alpha$.

Lemma 5.4.5 Let $\alpha$ be a root of the polynomial $p(X) = \sum_{i=0}^{n} p_i X^i$, $p_n = 1$, $p_i \in \mathbb{Z}$, $|p|_2 < 2^l$. An approximation $\tilde\alpha$ to $\alpha$ satisfying $|\alpha^i - \tilde\alpha^i| < \varepsilon \le 2^{-n\log n - nl}$ for all $i < n$ can be computed using $O(n)$ elementary operations on floating-point numbers of bit size $O(n\log\frac{1}{\varepsilon})$.


Proof: Set $\varepsilon' := \varepsilon 2^{-2(l+1)n}$. By Lemma 5.4.3 an approximation $\tilde\alpha$ with $|\alpha - \tilde\alpha| < \varepsilon'$ will lead to approximations as required. By Schönhage's algorithm (Theorem 5.4.1) the lemma follows.

Lemma 5.4.6 Suppose $\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}$ are real radicals over a real algebraic number field $\mathbb{Q}(\alpha)$ where $\alpha$ is an algebraic integer with minimal polynomial $p(X) = \sum_{i=0}^{n} p_i X^i$, $p_n = 1$, $p_i \in \mathbb{Z}$, $|p|_2 < 2^l$. Assume $[\rho_1], [\rho_2] < 2^L$ and let $\bar L = \lceil n\log n + nl + nL \rceil$.

For any $\varepsilon < 2^{-2\bar L}$ approximations to $\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}$ with absolute error less than $\varepsilon$ can be computed using $O(n)$ elementary operations on floating-point numbers of size $O(n\log\frac{1}{\varepsilon})$, $O(\log\log\frac{1}{\varepsilon})$ elementary operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon})$, and a constant number of operations on floating-point numbers of length $O(\log\frac{1}{\varepsilon} + \max\{\log d_i\})$.

Within the same time bounds an approximation to the ratio $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$ with absolute error less than $\varepsilon$ can be computed.

Proof: Of course we show the first statement only for $\sqrt[d_1]{\rho_1}$. Furthermore, since we are dealing with real radicals in this lemma we may assume that $\sqrt[d_1]{\rho_1}$ is given by $\exp\left(\frac{1}{d_1}\ln\rho_1\right)$.

First we compute an approximation $\tilde\rho_1$ to $\rho_1$ with absolute error less than $\varepsilon_1 < \varepsilon$, where $\varepsilon_1$ will be specified later. This can be done as follows.

Assume $\rho_1 = \frac{1}{b}\sum_{i=0}^{n-1} b_i\alpha^i$, $b, b_i \in \mathbb{Z}$. Compute an approximation to $\frac{1}{b}$ with absolute error less than $\varepsilon_1 2^{-(L + ln + 4)}$. Since $\frac{1}{b} \le 1$ an approximation with relative error $\varepsilon_1 2^{-(L + ln + 4)}$ suffices. This approximation can be determined by a constant number of elementary operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon_1})$.

Also compute an approximation to $\sum_{i=0}^{n-1} b_i\alpha^i$ with absolute error less than $\frac{1}{4}\varepsilon_1$. If $\tilde\alpha$ is an approximation to $\alpha$ satisfying $|\alpha - \tilde\alpha| < \varepsilon_1 2^{-(L + 2(l+1)n + 2)}$ then $\sum_{i=0}^{n-1} b_i\tilde\alpha^i$ is such an approximation (see Lemma 5.4.3). The approximation $\tilde\alpha$ can be computed with $O(n)$ elementary operations on floating-point numbers of size $O(n\log\frac{1}{\varepsilon_1})$. It is easy to see that this dominates the time needed to compute $\sum_{i=0}^{n-1} b_i\tilde\alpha^i$. Finally multiplying the approximations gives the result since $\frac{1}{b} \le 1$ and $\sum_{i=0}^{n-1} b_i\alpha^i < 2^{L + ln + 2}$ (see Lemma 5.1.11, p. 53).

Also by Lemma 5.1.11 and by the choice of $\varepsilon$ this approximation suffices to determine the sign of $\rho_1$, hence for the rest of the proof we can assume that $\rho_1 > 0$. Moreover, the approximation $\tilde\rho_1$ will be non-zero, which is important for the proof of the second part of the lemma.


Next an approximation $\tilde\gamma_1$ to $\tilde\rho_1^{\frac{1}{d_1}}$ with absolute error less than $\varepsilon_1$ is computed in the following way. $\tilde\rho_1$ is given in the form $a2^m$ for some $a, m \in \mathbb{Z}$ such that $\log|a|, |m| \le \log\frac{1}{\varepsilon_1}$ (Lemma 5.1.11). Write $\tilde\rho_1$ as $a2^{m_1}2^{m_2}$ with $a2^{m_1} \in [2, 4)$. This can be done with $|m_1| \le \log\frac{1}{\varepsilon_1}$, $m_2 \le \bar L$. Then

$$\tilde\rho_1^{\frac{1}{d_1}} = (a2^{m_1})^{\frac{1}{d_1}}\, 2^{\frac{m_2}{d_1}},$$

taking positive roots if $d_1$ is even. We determine the representation of $m_2$ as $ud_1 + k$ with $u, k \in \mathbb{Z}$, $0 \le k < d_1$. This can be done by a constant number of operations on numbers of size $O(\log\frac{1}{\varepsilon_1} + \log d_1)$.

Now $\tilde\rho_1^{\frac{1}{d_1}} = (a2^{m_1})^{\frac{1}{d_1}}\, 2^{\frac{k}{d_1}}\, 2^u$. If $(a2^{m_1})^{\frac{1}{d_1}}$ and $2^{\frac{k}{d_1}}$ are both computed with absolute error less than $2^{-(\bar L + 3)}\varepsilon_1$, then the product of these approximations and $2^u$ leads to an approximation $\tilde\gamma_1$ as required. Obviously the product can be computed within the time bounds stated in the lemma. So it remains to analyze how much time is needed to determine the approximations to $(a2^{m_1})^{\frac{1}{d_1}}$ and $2^{\frac{k}{d_1}}$.

$$(a2^{m_1})^{\frac{1}{d_1}} = \exp\left(\frac{1}{d_1}\ln a2^{m_1}\right) \quad\text{and}\quad 2^{\frac{k}{d_1}} = \exp\left(\frac{k}{d_1}\ln 2\right).$$

By Lemma 5.4.4, if $\frac{1}{d_1}\ln a2^{m_1}$ and $\frac{k}{d_1}\ln 2$ are computed with relative error less than $\varepsilon_2$ and the exp of these values is approximated with relative error less than $\varepsilon_2$ then this leads to approximations to $(a2^{m_1})^{\frac{1}{d_1}}$ and $2^{\frac{k}{d_1}}$ with relative error less than $8\varepsilon_2$. But both numbers are bounded in absolute value by 2, so these approximations have absolute error less than $16\varepsilon_2$. Choosing $\varepsilon_2 = 2^{-(\bar L + 7)}\varepsilon_1$ therefore suffices to determine approximations to $(a2^{m_1})^{\frac{1}{d_1}}, 2^{\frac{k}{d_1}}$ with absolute error less than $\varepsilon_1 2^{-(\bar L + 3)}$, and, accordingly, an approximation to $\sqrt[d_1]{\tilde\rho_1}$ with absolute error less than $\varepsilon_1$.

By Brent's results (Theorem 5.4.2) we can compute the required approximations using $O(\log\log\frac{1}{\varepsilon_1})$ operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon_1})$ and a constant number of operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon_1} + \log d_1)$. The last term is caused by computing the inverse of $d_1$ with precision $2^{-O(\log\frac{1}{\varepsilon_1})}$.
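The exponent-splitting idea of this step can be sketched in ordinary double precision. The code below is an illustration only: it uses the standard-library `math` functions in place of Brent's multiple-precision algorithms, it performs no error analysis, and the function name `real_dth_root` is a hypothetical helper. It shows why only the small quantities $k < d$ and $u$ depend on $d$, while exp and ln are applied to arguments of bounded size.

```python
import math


def real_dth_root(x, d):
    """Approximate x**(1/d) for x > 0 by splitting off the binary exponent."""
    mant, e = math.frexp(x)          # x = mant * 2**e with 0.5 <= mant < 1
    head, m2 = 4.0 * mant, e - 2     # x = head * 2**m2 with 2 <= head < 4
    u, k = divmod(m2, d)             # m2 = u*d + k with 0 <= k < d
    head_root = math.exp(math.log(head) / d)      # (a 2^{m1})^{1/d}
    frac_root = math.exp(k * math.log(2.0) / d)   # 2^{k/d}
    return math.ldexp(head_root * frac_root, u)   # multiply by 2^u exactly


print(real_dth_root(1e30, 7), 1e30 ** (1 / 7))
```

Both printed values agree up to floating-point rounding; the same splitting works for inputs with negative binary exponent because `divmod` always returns a remainder in the range $0 \le k < d$.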


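Finally observe that by Lemma 5.1.11, p. 53, $|\sqrt[d_1]{\rho_1}| < \max\{1, |\rho_1|\} < 2^{nl + L + 2}$ and hence Lemma 5.4.3 shows

$$\left|\rho_1^{\frac{1}{d_1}} - \tilde\gamma_1\right| \le \left|\rho_1^{\frac{1}{d_1}} - \tilde\rho_1^{\frac{1}{d_1}}\right| + \left|\tilde\rho_1^{\frac{1}{d_1}} - \tilde\gamma_1\right| < 2^{L + nl + 3}\varepsilon_1 + \varepsilon_1 \le 2^{L + nl + 4}\varepsilon_1.$$

Choosing $\varepsilon_1 = \varepsilon 2^{-(L + nl + 4)}$ therefore proves the first part of the lemma.

To prove the second claim of the lemma note that by Lemma 5.1.11, p. 53, not only $|\sqrt[d_1]{\rho_1}| < \max\{1, |\rho_1|\} < 2^{nl + L + 2}$ but also $|\sqrt[d_2]{\rho_2}^{\,-1}| < \max\{1, |\rho_2^{-1}|\} < 2^{2\bar L}$. Computing $\sqrt[d_1]{\rho_1}$ with absolute error $\varepsilon 2^{-(2\bar L + 2)}$ and $\sqrt[d_2]{\rho_2}^{\,-1}$ with absolute error $\varepsilon_2 = \varepsilon 2^{-(nl + L + 4)}$ therefore suffices to prove the claim. The time required to determine $\tilde\gamma_1$ is analyzed by the first part of the lemma.

To analyze the time needed for approximating $\sqrt[d_2]{\rho_2}^{\,-1}$ assume that $\tilde\gamma_2$ is an approximation to $\sqrt[d_2]{\rho_2}$ with $|\tilde\gamma_2 - \sqrt[d_2]{\rho_2}| < \varepsilon_2 2^{-4(\bar L + 1)}$ and $\frac{1}{\tilde\gamma'_2}$ is an approximation to $\frac{1}{\tilde\gamma_2}$ with absolute error less than $\frac{1}{2}\varepsilon_2$. Then (see footnote 11)

$$\left|\frac{1}{\tilde\gamma'_2} - \frac{1}{\sqrt[d_2]{\rho_2}}\right| \le \left|\frac{1}{\tilde\gamma'_2} - \frac{1}{\tilde\gamma_2}\right| + \left|\frac{1}{\tilde\gamma_2} - \frac{1}{\sqrt[d_2]{\rho_2}}\right| \le \frac{1}{2}\varepsilon_2 + \frac{1}{2}\varepsilon_2 \le \varepsilon_2,$$

since the bound on $|\sqrt[d_2]{\rho_2}^{\,-1}|$ and Lemma 5.4.3 show $\left|\frac{1}{\tilde\gamma_2} - \frac{1}{\sqrt[d_2]{\rho_2}}\right| \le \frac{1}{2}\varepsilon_2$.

By the first part of the lemma $\tilde\gamma_2$ can be determined within the time bounds stated.

Since $|\tilde\gamma_2 - \sqrt[d_2]{\rho_2}| < \varepsilon_2 2^{-4(\bar L + 1)}$ and $|\sqrt[d_2]{\rho_2}| > 2^{-2\bar L}$ the approximation $\tilde\gamma_2$ satisfies $|\tilde\gamma_2| > 2^{-2\bar L - 1}$. Therefore an approximation to $\frac{1}{\tilde\gamma_2}$ with relative error less than $\varepsilon_2 2^{-2\bar L - 2}$ leads to the approximation $\frac{1}{\tilde\gamma'_2}$. The lemma follows.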

Observe that the approximations to $\alpha$ may also be computed by successively bisecting the isolating interval for $\alpha$ and using the theory of Sturm sequences (see [vdW]) to determine after a bisection step which of the two created intervals contains $\alpha$. Using Schönhage's result is at least theoretically more efficient.
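The bisection alternative can be sketched as follows. This is a minimal illustration in exact rational arithmetic, assuming a squarefree polynomial given by its coefficients from highest to lowest degree and an interval $(lo, hi]$ that contains exactly one real root; the helper names are hypothetical and no attempt is made to match the run times discussed above.

```python
from fractions import Fraction


def poly_eval(p, x):
    """Evaluate p (coefficients from highest to lowest degree) at x."""
    acc = Fraction(0)
    for c in p:
        acc = acc * x + c
    return acc


def poly_rem(a, b):
    """Remainder of a divided by b over the rationals."""
    a = list(a)
    while len(a) >= len(b):
        q = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= q * b[i]
        a.pop(0)                       # leading coefficient is now zero
    while a and a[0] == 0:
        a.pop(0)
    return a


def sturm_chain(p):
    """Sturm sequence p, p', -rem(p, p'), ... of a squarefree polynomial."""
    p = [Fraction(c) for c in p]
    n = len(p) - 1
    deriv = [c * (n - i) for i, c in enumerate(p[:-1])]
    chain = [p, deriv]
    while True:
        r = poly_rem(chain[-2], chain[-1])
        if not r:
            return chain
        chain.append([-c for c in r])


def sign_changes(chain, x):
    values = [v for v in (poly_eval(q, x) for q in chain) if v != 0]
    return sum(1 for s, t in zip(values, values[1:]) if (s > 0) != (t > 0))


def refine(p, lo, hi, eps):
    """Shrink an isolating interval (lo, hi] of a root of p below width eps."""
    chain = sturm_chain(p)
    lo, hi = Fraction(lo), Fraction(hi)
    while hi - lo >= eps:
        mid = (lo + hi) / 2
        # Sturm: number of roots in (lo, mid] is the drop in sign changes.
        if sign_changes(chain, lo) - sign_changes(chain, mid) == 1:
            hi = mid
        else:
            lo = mid
    return lo, hi


lo, hi = refine([1, 0, -2], 1, 2, Fraction(1, 10 ** 6))
print(float(lo), float(hi))
```

For $p(X) = X^2 - 2$ and the starting interval $(1, 2]$ the returned endpoints enclose $\sqrt{2}$. Each bisection step gains one bit, which is why this method needs a number of steps linear in $\log\frac{1}{\varepsilon}$.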

11 Note that all denominators are non-zero.

Corollary 5.4.7 In the real case approximations as required by Corollary 5.3.4 can be computed using $O(n)$ elementary operations on floating-point numbers of size $O(n^2\bar L)$, $O(\log n\bar L)$ elementary operations on floating-point numbers of length $O(n\bar L)$, plus a constant number of operations on floating-point numbers of length $O(n\bar L + \max\{\log d_1, \log d_2\})$.

Proof: The $\varepsilon$ in Corollary 5.3.4 satisfies $\log\frac{1}{\varepsilon} = O(n\bar L)$.

Hence in time polynomial in $n, l, L, \log d_1, \log d_2$ we can compute approximations to ratios of real radicals as required by the lattice basis reduction algorithm in Corollary 5.3.4. Observe that by the algorithm leading to Lemma 5.4.6 we can efficiently compute the sign of $\rho_i$, so from now on the assumption that certain radicals $\sqrt[d_i]{\rho_i}$ are real is justified since it can be checked efficiently.

Next we show the corresponding results for approximations to complex radicals. Recall from the introduction to this section that we assume that the radical $\sqrt[d]{\rho}$ is given by $\zeta_d^k |\rho|^{\frac{1}{d}}\left(\cos\frac{1}{d}\varphi_\rho + i\sin\frac{1}{d}\varphi_\rho\right)$, where $\varphi_\rho$ denotes the angle of $\rho$, $\zeta_d = \cos\frac{2\pi}{d} + i\sin\frac{2\pi}{d}$, and $0 \le k < d$.

To give a rigorous analysis of the run times of the approximation algorithm for complex radicals of this form we need the following lemma.

Lemma 5.4.8 Let $\rho$ be a root of $f = \sum_{i=0}^{m} f_i X^i \in \mathbb{Z}[X]$, $|f|_2 < 2^h$. If $\Re(\rho), \Im(\rho) \neq 0$ then $|\Re(\rho)| > 2^{-(m\log m + mh + 1)}$ and $|\Im(\rho)| > 2^{-(m\log m + mh + 1)}$.

Proof: $|\Im(\rho)| = \frac{1}{2}|\rho - \bar\rho|$, where $\bar\rho$ is the complex conjugate of $\rho$. $\bar\rho$ is also a root of $f$. Hence the root separation bound (see Lemma 5.1.3, p. 49) applies. To get the bound on $\Re$ apply the same argument to $i\rho$ and the polynomial $f(\frac{1}{i}X) \in \mathbb{Z}[i][X]$. Although $f(\frac{1}{i}X)$ is not an integer polynomial the root separation bound still applies (see for example [Ja] and [Mi1]).

Lemma 5.4.9 Suppose $\mathbb{Q}(\alpha)$ is a number field generated by the algebraic integer $\alpha$, whose minimal polynomial is $p(X) \in \mathbb{Z}[X]$, $|p|_2 < 2^l$. If $d_1, d_2 \in \mathbb{N}$ and $\rho_1, \rho_2 \in \mathbb{Q}(\alpha)$ satisfy $[\rho_i] < 2^L$ then for any $\varepsilon < 2^{-2n\bar L - n\log n - 1}$ the radicals $\sqrt[d_i]{\rho_i}$, $i = 1, 2$, can be computed with absolute error less than $\varepsilon$ using $O(n)$ elementary operations on floating-point numbers of size $O(n\log\frac{1}{\varepsilon})$, $O(\log\log\frac{1}{\varepsilon})$ elementary operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon})$, and a constant number of operations on floating-point numbers of bit size $O(\max\{\log d_i\} + \log\frac{1}{\varepsilon})$.

Within the same time bounds the ratio of these radicals can be computed with absolute error less than $\varepsilon$.


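Proof: As in the proof of Lemma 5.4.6 we show the first statement only for $\sqrt[d_1]{\rho_1}$.

First within the time bounds stated $|\rho_1|^{\frac{1}{d_1}}$ is approximated with absolute error less than $\frac{1}{4}\varepsilon$. This is done in the following fashion. Suppose $\tilde\rho_1$ is an approximation to $\rho_1$ with absolute error less than $\varepsilon 2^{-(nl + L + 6)}$. $|\rho_1 - \tilde\rho_1| < \varepsilon 2^{-(nl + L + 6)}$ implies

$$\bigl||\rho_1| - |\tilde\rho_1|\bigr| < \varepsilon 2^{-(nl + L + 6)}.$$

Therefore (see Lemma 5.4.3 and Lemma 5.1.11, p. 53 for a bound on $|\rho_1|$)

$$\left||\rho_1|^{\frac{1}{d_1}} - |\tilde\rho_1|^{\frac{1}{d_1}}\right| < \frac{1}{8}\varepsilon.$$

Hence determining $|\tilde\rho_1|^{\frac{1}{d_1}} = \left(\sqrt{\Re(\tilde\rho_1)^2 + \Im(\tilde\rho_1)^2}\right)^{\frac{1}{d_1}} = \sqrt[2d_1]{\Re(\tilde\rho_1)^2 + \Im(\tilde\rho_1)^2}$ with absolute error less than $\frac{1}{8}\varepsilon$ leads to an approximation to $|\rho_1|^{\frac{1}{d_1}}$ as desired. Computing the $2d_1$-th root can be analyzed exactly as in Lemma 5.4.6 so we skip the details.

Next we show how to compute $\cos\frac{1}{d_1}\varphi_{\rho_1} + i\sin\frac{1}{d_1}\varphi_{\rho_1}$ and $\zeta_{d_1}^k = \cos\frac{2k\pi}{d_1} + i\sin\frac{2k\pi}{d_1}$ with relative and, since the absolute value of $\cos\frac{1}{d_1}\varphi_{\rho_1} + i\sin\frac{1}{d_1}\varphi_{\rho_1}$ is equal to 1, with absolute error less than

$$\varepsilon_1 = \varepsilon 2^{-(nl + L + 6)}.$$

Since $|\rho_1| < 2^{nl + L + 2}$ (see Lemma 5.1.11, p. 53) this suffices to show that the product of the approximated values of $\cos\frac{1}{d_1}\varphi_{\rho_1} + i\sin\frac{1}{d_1}\varphi_{\rho_1}$, $\zeta_{d_1}^k$, and of $|\rho_1|^{\frac{1}{d_1}}$ will lead to an approximation to $\sqrt[d_1]{\rho_1}$ with absolute error less than $\varepsilon$.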

First observe that combining the previous lemma with the bounds of Lemma 5.1.11, p. 53, shows $|\Re(\rho_1)|, |\Im(\rho_1)| > 2^{-2n\bar L - n\log n - 1}$ if these values are non-zero. Therefore an approximation to $\rho_1$ with absolute error less than $\varepsilon_1 < \varepsilon$ is sufficient to determine whether the real or imaginary part is non-zero. If either of them is non-zero the approximation also allows us to determine the sign of the real and imaginary part of $\rho_1$. In particular, if the real part is non-zero the approximation to the real part will be non-zero, too. Hence we can determine the quadrant containing the angle $\varphi_{\rho_1}$. So for the rest of the proof assume $0 \le \varphi_{\rho_1} \le \frac{\pi}{2}$.

We show how to approximate $\varphi_{\rho_1}$. If the real or the imaginary part is zero this is trivial, so we assume that both are non-zero. Next observe $\varphi_{\rho_1} = \arctan\left(\frac{\Im(\rho_1)}{\Re(\rho_1)}\right)$. Because of the lower bound on $|\Re(\rho_1)|$ of the previous lemma, within the time bounds stated in the lemma an approximation $\tilde\rho$ to $\frac{\Im(\rho_1)}{\Re(\rho_1)}$ with $\left|\tilde\rho - \frac{\Im(\rho_1)}{\Re(\rho_1)}\right| < \varepsilon_1 2^{-4}$ can be computed (the details are the same as in the proof of Lemma 5.4.6 where the ratio $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$ has been approximated). Furthermore with Brent's algorithm an approximation $\tilde\varphi_{\rho_1}$ to $\arctan\tilde\rho$ with relative error less than $\varepsilon_1 2^{-6}$ is computed. Note that the images of arctan are bounded in absolute value by $\pi$. Therefore the absolute error between $\tilde\varphi_{\rho_1}$ and $\arctan\tilde\rho$ is less than $\varepsilon_1 2^{-4}$.

Hence

$$\left|\varphi_{\rho_1} - \tilde\varphi_{\rho_1}\right| \le \left|\varphi_{\rho_1} - \arctan\tilde\rho\right| + \left|\arctan\tilde\rho - \tilde\varphi_{\rho_1}\right| < \varepsilon_1 2^{-4} + \varepsilon_1 2^{-4} \le \frac{1}{8}\varepsilon_1.$$

For the bound on $|\varphi_{\rho_1} - \arctan\tilde\rho|$ we used the Mean-Value Theorem and the fact that the derivative of arctan is bounded in absolute value by 1.

Then $\frac{1}{d_1}\tilde\varphi_{\rho_1}$ is computed with absolute error less than $\frac{1}{8}\varepsilon_1$. Since $\tilde\varphi_{\rho_1} < 4$ an approximation to $\frac{1}{d_1}$ with relative and hence absolute error less than $\frac{1}{32}\varepsilon_1$ will suffice. Denote this approximation by $\widetilde{\tfrac{1}{d_1}\tilde\varphi_{\rho_1}}$. Since $\left|\frac{1}{d_1}\varphi_{\rho_1} - \frac{1}{d_1}\tilde\varphi_{\rho_1}\right| \le \frac{1}{8}\varepsilon_1$ the real number $\widetilde{\tfrac{1}{d_1}\tilde\varphi_{\rho_1}}$ is an approximation to $\frac{1}{d_1}\varphi_{\rho_1}$ with absolute error less than $\frac{1}{4}\varepsilon_1$.

Finally, we use Brent's algorithm for sin and cos to compute the cosine and sine of $\widetilde{\tfrac{1}{d_1}\tilde\varphi_{\rho_1}}$ with relative and, since sin and cos are bounded in absolute value by 1, absolute error less than $\frac{1}{4}\varepsilon_1$. The derivatives of sin and cos are also bounded in absolute value by 1, so the computed cosine is an approximation to $\cos\frac{1}{d_1}\varphi_{\rho_1}$ with absolute error less than $\frac{1}{2}\varepsilon_1$. The same is true for the sine. Hence $\cos\frac{1}{d_1}\varphi_{\rho_1} + i\sin\frac{1}{d_1}\varphi_{\rho_1}$ has been approximated with absolute error less than $\varepsilon_1$. The details of the analysis of this approximation algorithm for sin and cos are even simpler than the ones in the proof of Lemma 5.4.6, so we again omit the details.

Similarly we compute the approximation to $\zeta_{d_1}^k$ from approximations to $\pi$ and $\frac{k}{d_1}$. The analysis is as above.

As we mentioned above, if these approximations are multiplied with the approximation for $|\rho_1|^{\frac{1}{d_1}}$ then the result is the desired approximation to $\sqrt[d_1]{\rho_1}$. This proves the first part of the lemma. The second one can be shown in exactly the same way as the corresponding claim in Lemma 5.4.6.


If we compare the previous lemma with Lemma 5.4.6 we remark that complex radicals force us to use an initial approximation to $\alpha$ that is roughly $n$ times better than the one required for real radicals.

Corollary 5.4.10 Approximations as required by Corollary 5.3.3 or by the complex case of Corollary 5.3.4 can be computed by $O(n)$ elementary operations on floating-point numbers of size $O(n^2\bar L)$, $O(\log n\bar L)$ elementary operations on floating-point numbers of size $O(n\bar L)$, and a constant number of operations on floating-point numbers of size $O(\log d + n\bar L)$.

Proof: In Corollary 5.3.4 $\varepsilon$ is of order $2^{-O(n\bar L)}$. For Corollary 5.3.3 $\varepsilon$ is only of order $2^{-O(n\bar L)}$. But as follows from the proof of the previous lemma and the lower bound on the real part of an algebraic number as given in Lemma 5.4.8 we nevertheless have to work with approximations of order $2^{-O(n\bar L)}$ since otherwise we may be forced to divide by zero.

Let us finish this subsection by explicitly stating one result that has been shown while proving Lemma 5.4.6 and Lemma 5.4.9. It will be used on several occasions in the second part of this thesis.

Lemma 5.4.11 Let $\varepsilon > 0$. Suppose $\rho$ is a complex number such that $|\rho| < 2^{C_1}$ and $|\Re(\rho)| > 2^{-C_2}$ and assume an approximation to $\rho$ with absolute error less than $\varepsilon 2^{-(2C_1 + 2C_2 + 13)}$ is given. Then using $O(\log(\log\frac{1}{\varepsilon} + C_1 + C_2))$ elementary operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon} + C_1 + C_2)$ and $O(1)$ elementary operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon} + \log d)$ an approximation to $\zeta_d^k |\rho|^{\frac{1}{d}}\left(\cos\frac{1}{d}\varphi_\rho + i\sin\frac{1}{d}\varphi_\rho\right)$, $k < d$, with absolute error less than $\varepsilon$ can be computed.

Proof: First of all, exactly as in the beginning of the proof of Lemma 5.4.9, $|\rho|^{1/d}$ is determined with absolute error less than $\varepsilon 2^{-(C_1+3)}$.

Next, by the bounds on $(\Re(\rho))^{-1}$ and $\Im(\rho)$ (the latter bound following from the bound on |ρ|) an approximation $\tilde\rho$ to $\Im(\rho)/\Re(\rho)$ with absolute error less than $\varepsilon 2^{-(C_1+7)}$ can be determined as follows.

If the approximation to ρ is denoted by $\bar\rho$ then $|(\Re(\bar\rho))^{-1} - (\Re(\rho))^{-1}| < \varepsilon 2^{-(2C_1+10)}$, which follows from the fact that $\Re(\rho)$ is approximated by $\Re(\bar\rho)$ with absolute error less than $\varepsilon 2^{-(2C_1+2C_2+13)}$ and Lemma 5.4.3. Hence approximating $(\Re(\bar\rho))^{-1}$ with absolute error less than $\varepsilon 2^{-(2C_1+10)}$ leads to an approximation of $(\Re(\rho))^{-1}$ with absolute error less than $\varepsilon 2^{-(2C_1+9)}$.

Multiplying this approximation with $\Im(\bar\rho)$ yields the desired approximation $\tilde\rho$ to $\Im(\rho)/\Re(\rho)$. As is clear this approximation can be computed within the time bounds stated.

Then we compute the arctan of $\tilde\rho$ with absolute error less than $\varepsilon 2^{-(C_1+7)}$. Denote the approximation by $\tilde\varphi_\rho$.

As in the proof of Lemma 5.4.9 it can be shown that computing $\frac{1}{d}\tilde\varphi_\rho$ with absolute error less than $\varepsilon 2^{-(C_1+6)}$ gives an approximation to $\frac{1}{d}\varphi_\rho$ with absolute error less than $\varepsilon 2^{-(C_1+5)}$. As follows from Brent's result these steps can also be done within the time bounds stated.

Again as in the proof of Lemma 5.4.9 Brent's algorithm is used to compute $\cos\frac{1}{d}\varphi_\rho + i\sin\frac{1}{d}\varphi_\rho$ with absolute error less than $\varepsilon 2^{-(C_1+3)}$.

Analogously, within the time bounds stated $\zeta_d^k$ is computed with absolute error less than $\varepsilon 2^{-(C_1+3)}$.

Finally multiplying the approximations to $|\rho|^{1/d}$, $\cos\frac{1}{d}\varphi_\rho + i\sin\frac{1}{d}\varphi_\rho$, and $\zeta_d^k$ yields the approximation to $\zeta_d^k|\rho|^{1/d}(\cos\frac{1}{d}\varphi_\rho + i\sin\frac{1}{d}\varphi_\rho)$ with absolute error less than ε.
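The order of the steps in this proof can be illustrated by a small numerical sketch. It uses ordinary double-precision arithmetic instead of Brent's multiple-precision algorithms, so it carries none of the error guarantees of the lemma. The function name `dth_root` is hypothetical, and the sketch assumes that the real part of ρ is positive, so that the arctan of the quotient of imaginary and real part is the argument of ρ.

```python
import cmath
import math


def dth_root(rho: complex, d: int, k: int = 0) -> complex:
    """Return zeta_d^k * |rho|^(1/d) * (cos(phi/d) + i*sin(phi/d)).

    Assumption of this sketch: Re(rho) > 0, so phi = arctan(Im(rho)/Re(rho))
    is the argument of rho.
    """
    if rho.real <= 0:
        raise ValueError("this sketch assumes Re(rho) > 0")
    modulus_root = abs(rho) ** (1.0 / d)        # |rho|^(1/d)
    phi = math.atan(rho.imag / rho.real)        # arctan of Im(rho)/Re(rho)
    unit = complex(math.cos(phi / d), math.sin(phi / d))
    zeta_k = cmath.exp(2j * math.pi * k / d)    # k-th power of a primitive d-th root of unity
    return zeta_k * modulus_root * unit


# Each of the d values is a d-th root of rho, up to rounding.
roots = [dth_root(3 + 4j, 5, k) for k in range(5)]
```

Raising any entry of `roots` to the fifth power should reproduce 3 + 4i up to floating-point rounding.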


5.5 A Probabilistic Test for Equality.

In the last two paragraphs we showed how to compute the coefficients of a number γ ∈ Q(α) such that if a radical $\sqrt[d]{\rho}$ or a ratio of radicals $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$ is contained in Q(α) then $\gamma = \sqrt[d]{\rho}$ or $\gamma = \sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$. But even if the algorithm outputs an element γ of Q(α) we cannot be sure whether these equalities are correct. In this paragraph it is shown how to use probabilistic methods to check in time polynomial in $\log d_1, \log d_2$ whether $\gamma = \sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$ or $\gamma = \sqrt[d]{\rho}$. In the complex case this result is almost of no use since we assume in this case that Q(α) contains primitive $d_1$-th and $d_2$-th roots of unity and therefore n = [Q(α) : Q] may be of order $\max\{\frac{d_1}{\log\log d_1}, \frac{d_2}{\log\log d_2}\}$ (see [Ap]). Therefore we also give a deterministic algorithm to decide whether $\gamma = \sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$ or $\gamma = \sqrt[d]{\rho}$. This algorithm has run time which is polynomial only in $d_1, d_2$, or d, respectively.

Observe that in order to check whether $\gamma = \sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$, say, we only need to determine whether $\gamma^{d_1 d_2}\rho_2^{d_1} = \rho_1^{d_2}$, because this implies that for some $d_1$-th root of $\rho_1$ and some $d_2$-th root of $\rho_2$ their ratio is in Q(α). Hence the ratio of all real roots of $\rho_1, \rho_2$ in the real case and the ratios of arbitrary roots of $\rho_1, \rho_2$ in the complex case are in Q(α). In particular, for the roots denoted by $\sqrt[d_1]{\rho_1}$ and $\sqrt[d_2]{\rho_2}$ the ratio $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$ is in Q(α) and the algorithm of Corollary 5.3.4, p. 66, will return the correct representation for the ratio.

However, for Corollary 5.3.3, p. 66, the situation is different. If $\gamma^d = \rho$ then γ need not be the d-th root $\sqrt[d]{\rho}$. Moreover, Q(α) need not contain all d-th roots of unity and it cannot be argued that if Q(α) contains one d-th root of ρ then it must contain all of them. Therefore even if $\gamma^d = \rho$ we use a bit comparison test to decide whether $\gamma = \sqrt[d]{\rho}$ and hence if $\sqrt[d]{\rho} \in Q(\alpha)$.

Let us begin by arguing why the usual methods to distinguish two algebraic numbers will not lead to an algorithm that is polynomial in the logarithm of the degrees $d_1, d_2$ of the radicals. First of all, in general we cannot bound the size of $d_1, d_2$ in terms of a polynomial expression in the input size of Q(α), $\rho_1$, and $\rho_2$. That is, we cannot give a polynomial upper bound such that if $d_1, d_2$ exceed this bound then the ratio $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$ cannot be an element of Q(α). There are various methods to show that if $d_1$ or $d_2$ are larger than an expression polynomial in n, $2^l$, and $2^L$ then the ratio cannot be in Q(α). But this is clearly not good enough for our purposes.

Because there is no such upper bound on $d_1, d_2$ the usual root separation bounds to distinguish two algebraic integers will result in an algorithm that is polynomial only in $d_1, d_2$ or exponential in L, l rather than polynomial in the input size. In fact, we have to distinguish the algebraic number γ from the ratio $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$ which may have degree $d_1 d_2$ over Q(α) so we may need at least $n d_1 d_2$ bits to separate these numbers.

We also cannot compute $\rho_2^{d_1}\gamma^{d_1 d_2}$ by successive squaring and check whether it equals $\rho_1^{d_2}$ because this would clearly imply that we had to work with numbers whose representations need $\Omega(d_1 + d_2)$ bits.

The same argument applies already to simple radicals $\sqrt[d]{\rho}$, in which case we had to check whether $\gamma^d = \rho$. Although we know that the final result cannot have a large representation if the equation is correct, the intermediate results may require Ω(d) bits. In particular, the denominator of the coefficients may get too large.

Instead of these approaches we describe an algorithm that actually uses successive squaring to check whether $\gamma = \sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$. But in order to avoid an exponential coefficient growth in the intermediate steps we reduce the coefficients modulo randomly chosen integers, that is, we use modular arithmetic. It will be shown that the error probability of the algorithm can be made exponentially small.

We claim that the following algorithm answers the question whether $\gamma = \sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$ correctly with probability at least $1 - 2^{-t}$ for t ∈ N, t > 3.

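Assume $b_1, b_2, c \in Z$ such that $b_i\rho_i \in Z[\alpha]$, $c\gamma \in Z[\alpha]$. Define $\rho_i' := b_i\rho_i$ and $\gamma' := c\gamma$. Then $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2} = \gamma$ is equivalent to

$$b_1^{d_2}\,\rho_2'^{\,d_1}\,\gamma'^{\,d_1 d_2} - b_2^{d_1}\,c^{d_1 d_2}\,\rho_1'^{\,d_2} = 0,$$

which is an equation in Z[α]. Denote the algebraic number on the left-hand side of the equation by Γ ∈ Z[α].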

Fix an interval $I = [1, 2^T]$, where T will be specified later. Randomly choose 6(t+1)T rational integers $z_j$ from I and compute the coefficients of Γ modulo all integers $z_j$. If for all j = 1, ..., 6(t+1)T the reduced coefficients of Γ are zero, output $\gamma = \sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$; otherwise output that the ratio of radicals $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$ is not contained in Q(α).

To compute the coefficients of Γ modulo $z_j$, j = 1, ..., 6(t+1)T, first, using exact arithmetic in Z[α], the reduced coefficients of the numbers $b_1^{d_2}$, $b_2^{d_1}$, $c^{d_1 d_2}$, $\rho_1'^{\,d_2}$, $\rho_2'^{\,d_1}$, and $\gamma'^{\,d_1 d_2}$ are computed by successive squaring and reducing the coefficients of the intermediate results modulo $z_j$. With these numbers the expression corresponding to Γ is formed and reduced, if necessary. The following lemma shows that this will give us the reduced coefficients of Γ.


Lemma 5.5.1 Let $\rho_1, \rho_2 \in Z[\alpha]$, z ∈ Z, and denote by $(\rho_i)_z$ the number that is obtained by reducing the coefficients of $\rho_i$ modulo z. Then

$$((\rho_1)_z + (\rho_2)_z)_z = (\rho_1 + \rho_2)_z$$

and

$$((\rho_1)_z(\rho_2)_z)_z = (\rho_1\rho_2)_z$$

using arithmetic in Z[α].

Proof: For the addition this is a straightforward computation.

For the multiplication we need to show that for two integer polynomials $g_1, g_2$

$$\Big(\big(((g_1)_p)_z\,((g_2)_p)_z\big)_p\Big)_z = \big((g_1 g_2)_p\big)_z,$$

where $(f)_p$ denotes the polynomial f reduced modulo the minimal polynomial p of α. Write $g_i$, i = 1, 2, as

$$g_i = p\,h_i + r_i, \quad \deg r_i < \deg p,$$

that is, $r_i = (g_i)_p$. Furthermore let

$$r_i = r_i' + z\,\bar r_i,$$

with $r_i' = (r_i)_z$. Then

$$\Big(\big(((g_1)_p)_z\,((g_2)_p)_z\big)_p\Big)_z = \big((r_1' r_2')_p\big)_z.$$

Obviously

$$\big((g_1 g_2)_p\big)_z = \big((r_1 r_2)_p\big)_z.$$

Now

$$(r_1 r_2)_p = (r_1' r_2')_p + z\,(r_1'\bar r_2 + r_2'\bar r_1 + z\,\bar r_1\bar r_2)_p,$$

since for a non-constant polynomial p the homomorphism $Z[X] \to (Z[X])_p$ induces an isomorphism of Z onto itself.

From the previous equality we finally get

$$\big((r_1 r_2)_p\big)_z = \big((r_1' r_2')_p\big)_z,$$

which proves the lemma.
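The multiplicative part of the lemma can be checked on a concrete instance. The following sketch represents an element of Z[α] by its list of integer coefficients (lowest degree first) and reduces products modulo a monic polynomial p. The function names, the choice p = X³ − 2 and the sample numbers are assumptions made only for this illustration.

```python
def mul_in_Z_alpha(a, b, p):
    """Multiply two elements of Z[alpha] and reduce modulo the monic polynomial p.

    a, b: coefficient lists, lowest degree first.
    p: coefficients of the minimal polynomial, lowest degree first, p[-1] == 1.
    """
    n = len(p) - 1
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += ai * bj
    # Eliminate the coefficients of degree >= n, highest degree first.
    for k in range(len(prod) - 1, n - 1, -1):
        c = prod[k]
        for j in range(n + 1):
            prod[k - n + j] -= c * p[j]
    return (prod + [0] * n)[:n]


def reduce_mod_z(a, z):
    """Reduce every coefficient modulo the integer z."""
    return [c % z for c in a]


p = [-2, 0, 0, 1]            # p(X) = X^3 - 2 (illustrative choice)
rho1 = [5, -7, 11]           # 5 - 7*alpha + 11*alpha^2
rho2 = [13, 4, -9]           # 13 + 4*alpha - 9*alpha^2
z = 7

# ((rho1)_z (rho2)_z)_z on the left, (rho1 rho2)_z on the right.
lhs = reduce_mod_z(mul_in_Z_alpha(reduce_mod_z(rho1, z), reduce_mod_z(rho2, z), p), z)
rhs = reduce_mod_z(mul_in_Z_alpha(rho1, rho2, p), z)
```

By the lemma, `lhs` and `rhs` agree for every integer modulus z, not only for the value chosen here.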


Remark that

$$\Big(\big(((g_1)_p)_z\,((g_2)_p)_z\big)_p\Big)_z = \big((g_1 g_2)_p\big)_z$$

is not correct if z is a non-constant polynomial.

It remains to analyze the run time and error probability of the algorithm. We begin with the run time.

We apply the following well-known result on the complexity of arithmetic in algebraic number fields (see [Lo1]).

Lemma 5.5.2 Let $\rho_1, \rho_2 \in Z[\alpha]$ with $[\rho_1], [\rho_2] < 2^B$. Furthermore assume that the minimal polynomial of the algebraic integer α has degree n and satisfies $|p|_2 < 2^l$. Then

(i) the coefficients of $\rho_1 + \rho_2$ are bounded in absolute value by $2^{B+1}$ and $\rho_1 + \rho_2$ can be computed by n additions of integers of size B,

(ii) the coefficients of $\rho_1\rho_2$ are bounded in absolute value by $2^{2(nl+B)}$ and the product can be computed using $O(n^2)$ elementary operations on integers of size O(nl + B).

The claim on the size of $\rho_1\rho_2$ is a special case of a lemma that will be proven below.

Furthermore we need to reduce $b_i$, c, the coefficients of $\rho_i'$, $\gamma'$, and the coefficients of the intermediate results by integers of size $\le 2^T$. Since by the previous lemma the coefficients of the intermediate results have bit size less than O(nl + T) each reduction step for a single coefficient can be done by a constant number of elementary operations on integers of size O(nl + T). Recall that reducing a coefficient modulo $z_i$ is equivalent to a division with remainder. The overall number of these reduction steps is $O(Ttn(\log d_1 + \log d_2))$.

To analyze the initial reduction steps recall that if $[\rho_i] < 2^L$ then $[\gamma] < 2^{6\bar L}$, $\bar L = \lceil n\log n + nl + nL\rceil$ ¹² (see Lemma 5.3.2, p. 64). Hence for a single $z_i$ we can compute the initial reductions by O(n) elementary operations on integers of size $O(\bar L + T)$.

Finally, we need $O(n^2(\log d_1 + \log d_2))$ operations on integers of size O(nl + T) for the multiplications in Z[α].

¹² If the numbers returned by the lattice reduction do not satisfy this bound then we may stop at once since the ratio of radicals will not be an element of Q(α).

We summarize this in

Lemma 5.5.3 The algorithm described above uses $O(n(Tt + n)(\log d_1 + \log d_2))$ elementary operations on integers of size $O(\bar L + T)$.
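The test whose cost is summarized in this lemma can be sketched as follows, under simplifying assumptions: the inputs already lie in Z[α] (so $b_1 = b_2 = c = 1$), elements are coefficient lists with lowest degree first, and the moduli are drawn from $[2, 2^T]$ rather than $[1, 2^T]$ because the modulus 1 carries no information. All function names, the default parameters and the fixed seed are choices of this sketch, not of the text.

```python
import random


def mul_mod(a, b, p, z):
    """Product in Z[alpha], reduced modulo the monic polynomial p and modulo z."""
    n = len(p) - 1
    prod = [0] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % z
    for k in range(2 * n - 2, n - 1, -1):          # reduce modulo p
        c = prod[k]
        for j in range(n + 1):
            prod[k - n + j] = (prod[k - n + j] - c * p[j]) % z
    return prod[:n]


def pow_mod(a, e, p, z):
    """a**e by successive squaring; all intermediate coefficients stay below z."""
    n = len(p) - 1
    result = [1 % z] + [0] * (n - 1)
    base = [c % z for c in a]
    while e > 0:
        if e & 1:
            result = mul_mod(result, base, p, z)
        base = mul_mod(base, base, p, z)
        e >>= 1
    return result


def probably_equal(gamma, rho1, rho2, d1, d2, p, t=4, T=16, seed=0):
    """Check rho2^d1 * gamma^(d1*d2) == rho1^d2 modulo 6(t+1)T random integers."""
    rng = random.Random(seed)
    for _ in range(6 * (t + 1) * T):
        z = rng.randint(2, 2 ** T)
        lhs = mul_mod(pow_mod(rho2, d1, p, z), pow_mod(gamma, d1 * d2, p, z), p, z)
        if lhs != pow_mod(rho1, d2, p, z):
            return False                            # z is a witness for inequality
    return True


p = [-2, 0, 1]                                      # alpha = sqrt(2), p(X) = X^2 - 2
# (1 + alpha)^3 = 7 + 5*alpha, so gamma = 1 + alpha is a cube root of 7 + 5*alpha.
accepted = probably_equal([1, 1], [7, 5], [1, 0], 3, 1, p)
rejected = probably_equal([1, 1], [7, 4], [1, 0], 3, 1, p)
```

A `False` answer is always correct, since a witness modulus was found. A `True` answer can be wrong only if every sampled modulus divides all coefficients of Γ.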

Next we analyze the error probability of the algorithm. First observe that if $\gamma = \sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$ then the algorithm above will always give the correct answer because all coefficients even if reduced modulo an integer will be zero. On the other hand, if the algorithm answers that $\gamma \ne \sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$ this answer is also correct since the algorithm found an integer such that the representations of γ and the ratio of radicals differ already modulo this integer. So this number is a witness that $\gamma \ne \sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$.

Definition 5.5.4 If Γ ≠ 0 then we call a number z ∈ Z unlucky if it divides all coefficients of Γ or, equivalently, the gcd of these coefficients. Otherwise we call z lucky.

Exactly the unlucky numbers will lead to an incorrect answer of the algorithm. Hence to prove the claim that the algorithm gives the correct answer with probability at least $1 - 2^{-t}$ we have to show that among 6(t+1)T randomly chosen integers from I with probability at least $1 - 2^{-t}$ one number is lucky.

Obviously, we need a bound on the gcd of the coefficients in Γ.

Lemma 5.5.5 Let $\rho_1, \ldots, \rho_d \in Z[\alpha]$ such that $[\rho_i] < 2^B$, i = 1, 2, ..., d, and α is as usual. Then the coefficients of $\prod_{i=1}^{d}\rho_i$ are bounded in absolute value by $2^{d(\log n + n(l+1) + B)}$.

Proof: Let $R_i(X) = \sum_{j=0}^{n-1} r_j^{(i)} X^j$ be defined by $R_i(\alpha) = \rho_i$. Hence computing the coefficients of $\prod\rho_i$ is the same as computing the coefficients of $\prod R_i(X) \bmod p(X)$, where p is the minimal polynomial of α.

$\prod R_i(X)$ is a polynomial of degree (n−1)d and its coefficients are bounded in absolute value by $2^{d(\log n + B)}$. Write $\prod R_i(X)$ as $\sum_{i=0}^{d(n-1)} m_i X^i$ and consider the following matrix, whose first d(n−1)−n+1 rows contain shifted copies of the coefficients of p:

$$\begin{pmatrix}
1 & p_{n-1} & \cdots & p_1 & p_0 & & & \\
 & 1 & p_{n-1} & \cdots & p_1 & p_0 & & \\
 & & \ddots & & & & \ddots & \\
 & & & 1 & p_{n-1} & \cdots & p_1 & p_0 \\
m_{d(n-1)} & m_{d(n-1)-1} & \cdots & \cdots & \cdots & \cdots & m_1 & m_0
\end{pmatrix}$$

Using Gauss-elimination this matrix can be transformed into an upper triangular matrix, again with d(n−1)−n+1 rows above the last one:

$$\begin{pmatrix}
1 & p_{n-1} & \cdots & p_1 & p_0 & & & \\
 & 1 & p_{n-1} & \cdots & p_1 & p_0 & & \\
 & & \ddots & & & & \ddots & \\
 & & & 1 & p_{n-1} & \cdots & p_1 & p_0 \\
0 & \cdots & \cdots & 0 & m'_{n-1} & \cdots & m'_1 & m'_0
\end{pmatrix}$$

It follows that the $m'_i$ are the coefficients of $\prod R_i(X)$ modulo p(X).

We have to analyze this process. Denote by $B_i$ an upper bound on the absolute values of the entries in the last row after the i-th step of the Gauss-elimination. In particular, $B_0 \le 2^{d(\log n + B)}$. As is easily seen $B_{i+1} \le |p|_\infty B_i + B_i = (|p|_\infty + 1)B_i$. Hence

$$B_i \le (|p|_\infty + 1)^i\, 2^{d(\log n + B)}.$$

As we have to apply d(n−1)−n steps in the Gauss-elimination the lemma follows from $|p|_\infty < 2^l$.

We apply this bound to $b_1^{d_2}\rho_2'^{\,d_1}\gamma'^{\,d_1 d_2}$ and $b_2^{d_1}c^{d_1 d_2}\rho_1'^{\,d_2}$. Since $|b_1|, |b_2|, |c|, [\rho_1'], [\rho_2'], [\gamma'] < 2^{6\bar L}$ (see Lemma 5.3.2, p. 64, and the remarks following this lemma) and $d_1 + d_2 \le d_1 d_2$ for $d_1, d_2 \ge 2$, this shows that both expressions have coefficients whose absolute value is bounded by

$$2^{2 d_1 d_2(\log n + n(l+1) + 6\bar L)}.$$

This finally yields a bound of

$$C \le 14\, d_1 d_2\, \bar L$$

for the bit size of the coefficients of Γ. Hence the gcd of the coefficients in Γ is bounded by $2^C$.

Unfortunately, an integer z ∈ Z may have $z^{\Omega(1/\log\log z)}$ different divisors (cf. [Ch]). This is the main reason why we cannot show directly that most numbers in I are lucky. Instead we will show that most primes in I are lucky and that by choosing randomly 6(t+1)T numbers from I with high probability at least one of them is a lucky prime.


First note that any integer z has at most log z different prime divisors. Hence the gcd of the coefficients of Γ has at most C distinct prime divisors. In general, this is a very crude estimate since we know that on the average the integer z has only log log z prime divisors (see [Ap]).

On the other hand there is the following well-known bound for the prime counting function π(x). For a proof of this lemma see [Ap].

Lemma 5.5.6 The number $\pi(2^T)$ of primes in the interval $[1, 2^T]$ is at least $\frac{1}{6}\,2^T/T$.

This lemma has two consequences. It shows that if T = 4(log C + t) then the number of primes that do not divide the gcd of the coefficients of Γ is at least $2^{\log C + t + 1}$. In fact, T = 4(log C + t) implies T > log C + log T + t + 4 and $\frac{1}{6}\,2^T/T > 2^{\log C + t + 1}$ for t > 3.

Hence at most a fraction of $2^{-t-1}$ of the primes in I is unlucky. Equivalently,

Lemma 5.5.7 Let T = 4(log C + t), t > 3. A randomly chosen prime in $I = [1, 2^T]$ is unlucky with probability less than $2^{-t-1}$.

As a second consequence to Lemma 5.5.6 we get the following lemma.

Lemma 5.5.8 Let $I = [1, 2^T]$. If 6(t+1)T numbers are chosen randomly from I then with probability at least $1 - 2^{-(t+1)}$ one of the numbers is prime.

Proof: By Lemma 5.5.6 a random number in I is composite with probability at most $1 - \frac{1}{6T}$. Therefore the probability that none of the chosen numbers is prime is bounded by $(1 - \frac{1}{6T})^{6(t+1)T}$. Now

$$\left(1 - \frac{1}{6T}\right)^{6T} \le \frac{1}{e},$$

hence

$$\left(1 - \frac{1}{6T}\right)^{6(t+1)T} \le e^{-(t+1)} < 2^{-(t+1)}.$$

The lemma follows.
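Both ingredients of this proof can be checked numerically for small parameters. The sketch below evaluates the bound on the probability of drawing no prime and, for small T, compares a sieve count of the primes up to $2^T$ with the lower bound of Lemma 5.5.6. The function names and the parameter ranges are choices of this illustration, and the check is of course no substitute for the proof in [Ap].

```python
import math


def no_prime_bound(t: int, T: int) -> float:
    """Upper bound (1 - 1/(6T))^(6(t+1)T) on the probability that none of
    the 6(t+1)T sampled integers is prime."""
    return (1.0 - 1.0 / (6 * T)) ** (6 * (t + 1) * T)


def prime_count(limit: int) -> int:
    """Number of primes in [1, limit], by the sieve of Eratosthenes (limit >= 2)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return sum(sieve)


# The two inequalities from the proof, for a few sample parameters.
bounds_hold = all(
    no_prime_bound(t, T) <= math.exp(-(t + 1)) < 2.0 ** (-(t + 1))
    for t in (4, 8, 16)
    for T in (8, 32, 128)
)

# pi(2^T) >= 2^T / (6T), checked for T = 1, ..., 16.
density_holds = all(6 * T * prime_count(2 ** T) >= 2 ** T for T in range(1, 17))
```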

Combining the two observations we have shown

Lemma 5.5.9 Let $T = 4(\log(14\, d_1 d_2\, \bar L) + t)$. If 6(t+1)T integers are chosen randomly from $I = [1, 2^T]$ then with probability at least $1 - 2^{-t}$ one of the integers is lucky.


Proof: There are two ways we may fail to hit upon a lucky integer. First, no prime may have been chosen. Second, even if a prime has been chosen it may not be lucky. Both cases happen independently and with probability at most $2^{-(t+1)}$.

Finally we combine Lemma 5.5.3 with the previous lemma and state the run time for a deterministic test. The analysis of the deterministic test is straightforward. We check whether $\gamma\,\sqrt[d_2]{\rho_2} = \sqrt[d_1]{\rho_1}$ by raising both sides of the equation to the $\mathrm{lcm}(d_1, d_2)$-th power and applying Lemma 5.5.2. For the probabilistic algorithm we could also have taken $\mathrm{lcm}(d_1, d_2)$-th powers but in that case asymptotically it saves us nothing.

Corollary 5.5.10 Given the element $\gamma = \frac{1}{c}\sum_{i=0}^{n-1} c_i\alpha^i$ from Corollary 5.3.4 it can be checked with error probability less than $2^{-t}$ and using at most $O(n\max\{\log d_i\}(n + t(t + \log\bar L + \max\{\log d_i\})))$ elementary operations on integers of size $O(\bar L + \max\{\log d_i\} + t)$ whether $\gamma = \sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$. Furthermore, with $O(\max\{\log d_i\}\,n^2)$ elementary operations on integers of size $O(\mathrm{lcm}(d_1, d_2)\,\bar L)$ it can be checked deterministically whether the element γ of Corollary 5.3.4 satisfies $\gamma = \sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$.

Note that the algorithm above resembles in some respects Brown's modular gcd-algorithm [Bro]. But in the brute force non-modular gcd algorithm only the intermediate results may have exponential size while the gcd itself has polynomial size. In our case, both intermediate and final results, if not reduced, may have exponential size. This is the main reason why the algorithm above is probabilistic and Brown's modular gcd-algorithm is not.

As mentioned, in the complex case the probabilistic algorithm is almost of no use.

Observe that of course for testing whether simple radicals $\sqrt[d]{\rho} \in Q(\alpha)$ similar results apply. Actually by choosing $d_2 = 1$ they are covered by the previous theorems. But in Corollary 5.3.3, p. 66, we did not require that Q(α) contains a primitive d-th root of unity. Hence even if $\gamma^d = \rho$ it need not be the case that $\sqrt[d]{\rho} = \gamma$ for the specific d-th root $\sqrt[d]{\rho}$ of the corollary.

But γ will be a d-th root of ρ. Therefore $|\gamma - \sqrt[d]{\rho}| = |\zeta_d - 1|\,|\sqrt[d]{\rho}|$ for some d-th root of unity $\zeta_d$. Observe that two d-th roots of unity have distance at least $2^{-1-\log d+\log\pi}$, which is easily seen by recalling that the d-th roots of unity correspond to the vertices of a regular d-gon inscribed in the unit circle in the plane. Hence computing the difference $\gamma - \sqrt[d]{\rho}$ with error less than $2^{-L-\log d}$ will show whether $\gamma = \sqrt[d]{\rho}$. This test is necessary only if log d is larger than n, l, and L since otherwise the approximation required by the reconstruction step guarantees already $\gamma = \sqrt[d]{\rho}$ if $\gamma^d = \rho$.

Corollary 5.5.11 Given an approximation with absolute error less than $2^{-L-\log d}$ to the element γ from Corollary 5.3.3 then it can be decided with error probability less than $2^{-t}$ and using at most $O(n\log d\,(n + t(t + \log L + \log d)))$ elementary operations on integers of size O(L + log d + t) whether $\gamma = \sqrt[d]{\rho}$. Furthermore, with $O(n^2\log d)$ elementary operations on integers of size O(dL) it can be checked deterministically whether the element γ of Corollary 5.3.3 satisfies $\gamma = \sqrt[d]{\rho}$.

Finally, let us mention that for algebraic number fields whose ring of integers is a unique factorization domain in which the (then well-defined) gcd of two integers can be computed efficiently, the test can be made deterministic in any case. In fact, then the same algorithm as for the rational numbers can be used. It can be shown that only the exponential growth in the denominator of an algebraic number when raised to a d-th power forced us to use probabilistic methods. But for algebraic integers the denominator is always bounded by the discriminant of the number field.


5.6 Sums of Radicals over Algebraic Number Fields

We summarize the results of the previous subsections in the following table.

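Table 1: Run Times for Ratios of Radicals

| procedure | number of operations | bit size of numbers |
|---|---|---|
| Approximations to $\sqrt[d_1]{\rho_1}/\sqrt[d_2]{\rho_2}$ | O(n) | $O(n^2 L)$ |
| | O(1) | $O(nL + \max\{\log d_i\})$ |
| | O(log nL) | O(nL) |
| Reconstruction | $O(n^3\bar L)$ | $O(n\bar L)$ |
| Probabilistic test (error $< 2^{-t}$) | $O(n\max\{\log d_i\}(n + t(t + \log\bar L + \max\{\log d_i\})))$ | $O(t + \bar L + \max\{\log d_i\})$ |
| Deterministic test | $O(\max\{\log d_i\}\,n^2)$ | $O(\mathrm{lcm}(d_1, d_2)\,\bar L)$ |

Floating-point numbers are required only for the approximation algorithms.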

Theorem 5.6.1 Let Q(α) be a real algebraic number field, where α is an algebraic integer whose minimal polynomial is of degree n and has length $|p|_2 < 2^l$. Furthermore let $\{\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k}\}$ be a set of positive real radicals over Q(α), where $[\rho_i] < 2^L$, i = 1, 2, ..., k.

It can be decided with error probability less than $2^{-t}$ and by using at most $O(k^2 n^3\bar L + k^2 n\max\{\log d_i\}(n + t(\max\{\log d_i\} + \log\bar L + t)))$ elementary operations on floating-point numbers or integers of size $O(n\bar L + t + \max\{\log d_i\})$ whether the set of radicals is linearly independent. Within the same error and time bounds a maximal linearly independent subset of radicals can be computed. Here $\bar L = \lceil n\log n + nl + nL\rceil$.

Furthermore, given a linear combination $S = \sum_{i=1}^{k}\upsilon_i\sqrt[d_i]{\rho_i}$ of the radicals over Q(α), $\upsilon_i \in Q(\alpha)$, $[\upsilon_i] < 2^L$, then it can be decided with the same error probability and using additional O(kn) elementary operations on integers of size O(kL) whether S = 0.

Proof: To prove the first claim we apply the algorithms of the previous subsections to the $\frac{k(k-1)}{2}$ different pairs of radicals in $\{\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k}\}$. Furthermore, in this case for each pair we choose the error bound to be $2^{-t-2\log k}$. Hence the overall error probability is bounded by $2^{-t}$. Also note that the approximation algorithms to α and the radicals $\sqrt[d_i]{\rho_i}$ need to be applied only once.

To prove the second claim use this algorithm to partition $\{\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k}\}$ into subsets $R_1, \ldots, R_h$ such that two radicals are in the same subset if and only if their ratio is an element of Q(α). To simplify the notation assume $\sqrt[d_i]{\rho_i} \in R_i$, i = 1, ..., h. By the above result this partitioning can be done within the time bounds stated. Also elements $\nu_{ij}$ can be computed such that if $\sqrt[d_j]{\rho_j}/\sqrt[d_i]{\rho_i} \in Q(\alpha)$ then $\sqrt[d_j]{\rho_j}/\sqrt[d_i]{\rho_i} = \nu_{ij}$. So

$$S = \sum_{i=1}^{k}\upsilon_i\sqrt[d_i]{\rho_i} = \sum_{i=1}^{h}\Bigg(\sum_{\sqrt[d_j]{\rho_j}\in R_i}\upsilon_j\nu_{ij}\Bigg)\sqrt[d_i]{\rho_i}.$$

Since for any pair of radicals in $R' = \{\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_h]{\rho_h}\}$ their ratio is not in Q(α) we conclude by Corollary 3.10, p. 28, that S = 0 if and only if

$$\sum_{\sqrt[d_j]{\rho_j}\in R_i}\upsilon_j\nu_{ij} = 0 \quad\text{for } i = 1, \ldots, h.$$

It remains to show that the exact representations of these sums as elements in Q(α) can be computed within the time bounds stated in the theorem. With these representations it is trivial to check whether the sums are zero.

By assumption $[\upsilon_j] < 2^L$ and by Lemma 5.3.2, p. 64, $[\nu_{ij}] < 2^{O(L)}$. Hence by Lemma 5.5.2, p. 82, each product $\upsilon_j\nu_{ij}$ can be computed by $O(n^2)$ elementary operations on integers of length at most O(L); furthermore, the representation size of the product is bounded by O(L). Observe that at most k − 1 products have to be computed. Hence for this step O(kL) elementary operations on integers of size O(L) are needed, which is already covered by the previous run time bound.

Next we compute for all i = 1, ..., h, the products of the denominators of the products $\upsilon_j\nu_{ij}$. These denominators are bounded in absolute value by O(L). Hence for all sums the corresponding products can be computed by O(k) elementary operations on integers of size O(kL). With these products it is easy to compute the representations of the sums as linear combinations of the basis elements $\alpha^i$ by O(kn) elementary operations on integers of size O(kL) (see Lemma 5.5.2, p. 82). This yields the additional operations mentioned in the theorem.

The statement on the error probability follows directly from the error probability for determining a maximal linearly independent subset.
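The partitioning argument of this proof becomes very concrete in the simplest special case, square roots of positive integers over Q. There the ratio of $\sqrt{a_j}$ and $\sqrt{a_i}$ is rational exactly when $a_i$ and $a_j$ have the same squarefree part, so the classes $R_i$ can be read off from a factorization. The sketch below uses trial division for this, which replaces the approximation and reconstruction steps of the text and is only meant to illustrate the grouping and the final zero test; the function names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def squarefree_decomposition(a: int):
    """Return (m, s) with a == m*m*s and s squarefree, by trial division."""
    m, s, f = 1, 1, 2
    while f * f <= a:
        e = 0
        while a % f == 0:
            a //= f
            e += 1
        m *= f ** (e // 2)
        if e % 2:
            s *= f
        f += 1
    return m, s * a          # the remaining cofactor is 1 or a prime


def sum_is_zero(terms):
    """Decide whether sum(c * sqrt(a)) == 0 for (c, a) in terms,
    with rational c and positive integers a."""
    groups = defaultdict(Fraction)
    for c, a in terms:
        m, s = squarefree_decomposition(a)
        groups[s] += Fraction(c) * m      # sqrt(a) = m * sqrt(s)
    # One class per squarefree part; S = 0 iff every class coefficient vanishes.
    return all(v == 0 for v in groups.values())


# sqrt(8) + sqrt(18) - 5*sqrt(2) = (2 + 3 - 5)*sqrt(2) = 0
zero_case = sum_is_zero([(1, 8), (1, 18), (-5, 2)])
# sqrt(2) + sqrt(3) - sqrt(5) lies in three different classes and is not zero.
nonzero_case = sum_is_zero([(1, 2), (1, 3), (-1, 5)])
```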

The important thing about this theorem is that the run time is polynomial in the input size of the problems.

If we want the algorithm only to be polynomial in the $d_i$'s the algorithm can easily be made deterministic. Instead of using the probabilistic algorithm of the previous subsection we use the deterministic one.

Theorem 5.6.2 If the tests from the theorem above are made deterministic then the algorithm uses at most $O(k^2 n^3\bar L + k^2 n^2\max\{\log d_i\})$ elementary operations on floating-point numbers and integers of size at most $O((n + \max\{d_i^2\} + k)\bar L)$.

Observe that in the last two theorems the dependence on the size of the coefficients $\upsilon_i$ in the sum $\sum_{i=1}^{k}\upsilon_i\sqrt[d_i]{\rho_i}$ is better than the dependence on the size of the elements $\rho_i$. Even if these coefficients have representation size O(nL) the upper bounds given on the number of bit operations still apply.

Finally let us state similar but more restricted results for complex radicals over number fields containing appropriate roots of unity. The proof is exactly as above.

Theorem 5.6.3 Let Q(α) be an algebraic number field containing primitive $d_i$-th roots of unity, i = 1, 2, ..., k, where α is an algebraic integer whose minimal polynomial is of degree n and has length $|p|_2 < 2^l$. Furthermore let $\{\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k}\}$ be a set of radicals over Q(α), where $[\rho_i] < 2^L$, i = 1, 2, ..., k.

Using at most $O(k^2 n^3\bar L + k^2 n^2\max\{d_i\})$ elementary operations on floating-point numbers and integers of size $O((n + \max\{d_i^2\})\bar L)$ it can be decided whether the set of radicals is linearly independent over Q(α). Within the same error and time bounds a maximal linearly independent subset of radicals can be computed. Here $\bar L = \lceil n\log n + nl + nL\rceil$.

Furthermore, given a linear combination $S = \sum_{i=1}^{k}\upsilon_i\sqrt[d_i]{\rho_i}$ of the radicals over Q(α), $\upsilon_i \in Q(\alpha)$, $[\upsilon_i] < 2^L$, then it can be decided using additional O(kn) elementary operations on integers of size O(kL) whether S = 0.

Let us note that by assuming that Q(α) contains $d_i$-th roots of unity we may have to work in a field whose degree is exponential in k. But for a small number of different degrees this may still be efficient. In particular, this is true if all $d_i$ are equal. The following example is one important application of this case.

Assume we are given a sum of complex d-th roots of rational numbers and want to decide whether it is zero. We can solve this problem by applying Theorem 5.6.3 with Q(α) being the d-th cyclotomic field. This allows us to decide the question in time polynomial in d, k. The brute force approach (computing enough bits) has run time $\Omega(d^k)$. Hence we have the following partial generalization of Corollary 4.1.5, p. 39.

Corollary 5.6.4 Assume $S = \sum_{i=1}^{k} v_i\sqrt[d]{q_i}$ is a sum of radicals over Q such that $v_i, q_i$ are rational numbers whose numerator and denominator are bounded in absolute value by $2^L$. Then it can be decided using $O(k^2 d^4(d + L))$ elementary operations on floating-point numbers and integers of size $O(d^3 + d^2 L)$ and O(kd) elementary operations on integers of size $O(kd^2 + kdL)$ whether S = 0.

Proof: Applying Mignotte's bounds [Mi1] for the size of the coefficients of a factor of a polynomial to $X^d - 1$ and the irreducible polynomial of a primitive d-th root of unity shows that the latter has length bounded by $2^{d+1}$. Since this polynomial has degree φ(d) < d the corollary follows from the previous theorem by observing that in this case the deterministic test is applied only to radicals of equal degree. Hence $\max\{d_i^2\}$ can be replaced by d (see Table 1).

If Q(α) contains a primitive d-th root of unity then applying Theorem 5.6.3 to the set $\{1, \sqrt[d]{\rho}\}$ simply checks whether $\sqrt[d]{\rho} \in Q(\alpha)$. In the denesting algorithms to be described in the next sections exactly this situation will occur.

However, in Corollary 5.3.3, p. 66, we did not assume that Q(α) contains primitive d-th roots of unity. By Corollary 5.5.11 in this case we need to approximate the candidate γ and $\sqrt[d]{\rho}$ with absolute error less than $2^{-L-\log d}$. Since $[\gamma] < 2^{3L}$ (Lemma 5.3.1, p. 63) changing in Table 1 the last entry for the approximation algorithm to "O(log nL) operations on floating-point numbers of size O(nL + log d)" takes care of this.¹³ This proves the next theorem. It states for example how much time it takes to check whether a field contains certain primitive roots of unity. Therefore whenever we assume that a number field contains some root of unity this can be checked efficiently.

¹³ For the reconstruction and verifying step $\bar L$ can also be replaced by L (see Lemma 5.3.1, p. 63, and Corollary 5.3.3, p. 66).

Theorem 5.6.5 Let Q(α) be an algebraic number field, where α is an algebraic integer with minimal polynomial p of degree n and length bounded by $2^l$. Let ρ ∈ Q(α) satisfy $[\rho] < 2^L$. For d ∈ N it can be decided with error probability less than $2^{-t}$ and using at most $O(n^3 L + n\log d\,(n + t(\log d + \log L + t)))$ elementary operations on floating-point numbers and integers of size O(nL + t + log d) whether $\sqrt[d]{\rho} \in Q(\alpha)$.

The deterministic test whether $\sqrt[d]{\rho} \in Q(\alpha)$ uses at most $O(n^3 L + n^2 d)$ elementary operations on floating-point numbers and integers of size O((n + d)L).


6 Denesting Radicals - The Basic Results

In the second part of this thesis we consider a problem which is known as denesting of nested radicals and has attracted a lot of mathematicians and computer scientists throughout the last years (see [BFHT],[HH],[La2],[La3],[Z]). Before stating the problem formally let us give some examples found in the notebook of the Indian mathematician Ramanujan [R].

$$\sqrt{\sqrt[3]{28} - \sqrt[3]{27}} = \frac{1}{3}\left(\sqrt[3]{98} - \sqrt[3]{28} - 1\right)$$

$$\sqrt[3]{\sqrt[5]{32/5} - \sqrt[5]{27/5}} = \sqrt[5]{1/25} + \sqrt[5]{3/25} - \sqrt[5]{9/25}.$$

In both equations the formula on the left-hand side has nesting depth 2 and the formula on the right-hand side has depth 1.
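Both identities are easy to check numerically. The following lines evaluate the two sides in double precision, with all roots taken as positive real roots; the variable names are of course not part of the original text, and agreement up to rounding is a sanity check, not a proof.

```python
# First identity: sqrt(cbrt(28) - cbrt(27)) = (cbrt(98) - cbrt(28) - 1) / 3
lhs1 = (28 ** (1 / 3) - 27 ** (1 / 3)) ** 0.5
rhs1 = (98 ** (1 / 3) - 28 ** (1 / 3) - 1) / 3

# Second identity: cbrt(5th-root(32/5) - 5th-root(27/5))
#                  = 5th-root(1/25) + 5th-root(3/25) - 5th-root(9/25)
lhs2 = ((32 / 5) ** (1 / 5) - (27 / 5) ** (1 / 5)) ** (1 / 3)
rhs2 = (1 / 25) ** (1 / 5) + (3 / 25) ** (1 / 5) - (9 / 25) ** (1 / 5)

difference1 = abs(lhs1 - rhs1)
difference2 = abs(lhs2 - rhs2)
```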

Although these examples may already explain the notion of nesting depth sufficiently we give a more formal definition (see also [BFHT]).

Definition 6.1 The nesting depth of an expression over a field F is defined as follows:

(1) an element of F has nesting depth 0 over F,

(2) an arithmetic combination (A + B, A − B, A × B, A/B) of expressions A, B over F has nesting depth max{depth(A), depth(B)}, and finally,

(3) a root $\sqrt[d]{A}$ of an expression A has nesting depth depth(A) + 1.
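The three clauses of the definition translate directly into a recursive function on expression trees. The sketch below uses a small ad hoc tree representation; the class names `Elem`, `Op` and `Root` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Union


@dataclass(frozen=True)
class Elem:
    """An element of the ground field F (here: a rational number)."""
    value: Fraction


@dataclass(frozen=True)
class Op:
    """An arithmetic combination A op B with op in '+', '-', '*', '/'."""
    op: str
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Root:
    """The d-th root of an expression."""
    degree: int
    arg: "Expr"


Expr = Union[Elem, Op, Root]


def depth(e: Expr) -> int:
    """Nesting depth according to clauses (1)-(3) of the definition."""
    if isinstance(e, Elem):
        return 0                                  # clause (1)
    if isinstance(e, Op):
        return max(depth(e.left), depth(e.right)) # clause (2)
    if isinstance(e, Root):
        return depth(e.arg) + 1                   # clause (3)
    raise TypeError(f"not an expression: {e!r}")


def cbrt(n: int) -> Root:
    return Root(3, Elem(Fraction(n)))


# Left-hand and right-hand side of the first Ramanujan equation.
left = Root(2, Op("-", cbrt(28), cbrt(27)))
right = Op("/", Op("-", Op("-", cbrt(98), cbrt(28)), Elem(Fraction(1))), Elem(Fraction(3)))
```

Here `depth(left)` evaluates to 2 and `depth(right)` to 1, in agreement with the remark on the two equations.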

Of course, given a nested radical there is no unique number in C, say, corresponding to this expression since the roots can be interpreted in different ways. Instead it always has to be said which roots are meant so that the value v(A) of a nested radical is uniquely defined. Then the problem of denesting nested radicals over a field F can be defined in the following way. Given a radical expression A over a field F with uniquely defined value v(A). Is there an expression B over F with the same value as A and with lower nesting depth over F than A?

Borodin et al. [BFHT] considered certain depth 2 expressions involving real square roots and showed under which conditions such an expression can be denested. To be more specific they proved the following theorem.

Theorem 6.2 (Borodin et al.) Let $F \supset \mathbb{Q}$ be a real field. Let $\alpha, \beta, \rho$ in $F$ with $\sqrt{\rho}$ not in $F$. If $\sqrt{\alpha + \beta\sqrt{\rho}}$ is contained in some real radical extension $F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_m]{\rho_m})$, $\rho_1, \ldots, \rho_m \in F$, of $F$ then for some positive $\gamma_0 \in F$

either $\sqrt{\gamma_0}\,\sqrt{\alpha + \beta\sqrt{\rho}} \in F(\sqrt{\rho})$

or $\sqrt{\gamma_0}\,\sqrt[4]{\rho}\,\sqrt{\alpha + \beta\sqrt{\rho}} \in F(\sqrt{\rho})$.

In the first case $\sqrt{\alpha^2 - \beta^2\rho} \in F$ and $\gamma_0$ can be chosen as $2(\alpha + \sqrt{\alpha^2 - \beta^2\rho})$. In the second case $\sqrt{\rho(\beta^2\rho - \alpha^2)} \in F$ and $\gamma_0$ can be chosen as $2(\beta\rho + \sqrt{\rho(\beta^2\rho - \alpha^2)})$; this is the first case applied to $\sqrt{\rho}\,(\alpha + \beta\sqrt{\rho}) = \beta\rho + \alpha\sqrt{\rho}$.

Clearly, this theorem almost immediately leads to an algorithm computing a denesting. Since $\gamma_0$ is known and the field has the basis $1, \sqrt{\rho}$ it remains to determine the corresponding element in $F(\sqrt{\rho})$, which is not too difficult if $F$ is for example $\mathbb{Q}$.

This procedure always finds a denesting using only sums of depth 1 radicals. But observe that if a radical can be denested by a rational expression in radicals of smaller nesting depth then it can be denested by a sum of radicals of smaller nesting depth. In fact, simply consider the field $E$ containing all radicals appearing in the expression denesting the original radical expression $\gamma$. $\gamma$ is an element of $E$ which has a basis of products of the radicals of smaller depth than the depth of $\gamma$.
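For $F = \mathbb{Q}$ the procedure can be sketched in a few lines. The code below is an illustration under stated assumptions, not the algorithm of [BFHT] verbatim: it relies on the identity $((\alpha + s) + \beta\sqrt{\rho})^2 = 2(\alpha + s)(\alpha + \beta\sqrt{\rho})$ for $s = \sqrt{\alpha^2 - \beta^2\rho}$, and it handles the fourth-root case by applying the same identity to $\sqrt{\rho}\,(\alpha + \beta\sqrt{\rho}) = \beta\rho + \alpha\sqrt{\rho}$. The function names are hypothetical.

```python
from fractions import Fraction
from math import isqrt, sqrt


def rational_sqrt(q):
    """Non-negative rational square root of q, or None if q is no square in Q."""
    q = Fraction(q)
    if q < 0:
        return None
    rn, rd = isqrt(q.numerator), isqrt(q.denominator)
    if rn * rn == q.numerator and rd * rd == q.denominator:
        return Fraction(rn, rd)
    return None


def denest_first_case(alpha, beta, rho):
    """Return (gamma0, u, v) with
    sqrt(gamma0) * sqrt(alpha + beta*sqrt(rho)) = u + v*sqrt(rho),
    or None if alpha^2 - beta^2*rho is not a rational square."""
    alpha, beta, rho = Fraction(alpha), Fraction(beta), Fraction(rho)
    s = rational_sqrt(alpha * alpha - beta * beta * rho)
    if s is None:
        return None
    gamma0 = 2 * (alpha + s)
    if gamma0 <= 0:          # the theorem asks for a positive multiplier
        return None
    return gamma0, alpha + s, beta


def denest_second_case(alpha, beta, rho):
    """Same, for rho^(1/4) * sqrt(alpha + beta*sqrt(rho)), via the first
    case applied to beta*rho + alpha*sqrt(rho)."""
    return denest_first_case(Fraction(beta) * Fraction(rho), alpha, rho)


# sqrt(3 + 2*sqrt(2)): first case, sqrt(8) * sqrt(3 + 2*sqrt(2)) = 4 + 2*sqrt(2)
print(denest_first_case(3, 2, 2))
# sqrt(4 + 3*sqrt(2)): only the second case applies
print(denest_first_case(4, 3, 2), denest_second_case(4, 3, 2))
```

Dividing $u + v\sqrt{\rho}$ by $\sqrt{\gamma_0}$ (and by $\sqrt[4]{\rho}$ in the second case) then gives the depth 1 expression.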

A few years after Borodin et al. published their results Landau [La2] showed how to compute a minimum depth denesting that is just one off the optimum and Horng and Huang [HH] showed how to compute the minimum depth denesting of an arbitrary nested radical over a field containing all roots of unity. Unfortunately, they were only able to prove single-exponential or double-exponential bounds on the output size. As the input size they consider the size of the minimal polynomial of the expression that has to be denested. This size may already be exponential in the number of bits necessary to describe the radical expression itself.

In particular, Horng and Huang showed that the minimum denesting of a nested radical defined over a field containing all roots of unity can already be found using only elements of a field $F$ that is generated by a root of unity whose degree over $\mathbb{Q}$ is in the worst case double-exponential in the input size of the minimal polynomial of the nested radical. Accordingly, the algorithms, which are completely due to Landau, have a double-exponential run time. Basically these algorithms work by first computing the Galois group of the minimal polynomial and then applying the classical results on solvable groups.


Although quite general these results cannot explain or determine the denestings of Ramanujan mentioned in the beginning simply because these examples use no roots of unity at all. Moreover, a double-exponential algorithm is of almost no practical use.

By going back to the work of Borodin et al. [BFHT], simplifying, and generalizing their results, in this thesis we completely determine the structure of denestings such as the ones found by Ramanujan. In particular, we generalize the theorem mentioned above to arbitrary depth 2 expressions containing real radicals. These radicals need not be square roots.

Based on these results we also achieve efficient denesting algorithms for a large class of depth 2 expressions. In particular, the run times will be at most polynomial and in many cases even less than polynomial in the input size of the minimal polynomial of the radical expression to be denested. The class of depth 2 radicals to which these algorithms can be applied contains all examples given by Ramanujan and is not restricted to expressions involving only real radicals. Since we describe the algorithms for algebraic number fields they can be applied repeatedly to nested radicals of depth larger than 2 in order to find non-trivial denestings for arbitrary radical expressions efficiently. Although in general the algorithms cannot find a minimum depth denesting, in many cases they will.

The denesting algorithms can also be used to generalize the results of Section 5 in the following sense.

Suppose we want to apply the results of the previous section to the sum

$$S = \sum_{i=1}^{k} \sqrt[d_{i1}]{a_i + b_i \sqrt[d_{i2}]{q_i}}, \qquad a_i, b_i, q_i \in \mathbb{Q}.$$

We always assumed that the expressions under the root signs are elements of a single extension $\mathbb{Q}(\alpha)$ of $\mathbb{Q}$. Hence in case of the sum $S$ we had to consider the field generated by the radicals $\sqrt[d_{12}]{q_1}, \sqrt[d_{22}]{q_2}, \ldots, \sqrt[d_{k2}]{q_k}$, whose degree will in general be exponential in $k$ even if the $d_{i2}$'s are constant. Therefore the run times of the algorithms will be exponential in $k$.

Using the results of the following sections we can solve this problem much more efficiently. First we check whether the sum $S$ can be written as a sum of radicals $\sqrt[t_i]{p_i}$ over $\mathbb{Q}$. If this is not the case, $S$ cannot be zero. On the other hand, if $S$ is transformed into a sum $\sum c_i \sqrt[t_i]{p_i}$ of radicals over $\mathbb{Q}$ determining whether it is zero can easily be done by the algorithms of Section 5. Moreover, as the size of $\sum c_i \sqrt[t_i]{p_i}$ will be small the algorithm will do so efficiently.


Hence all we have to do is to determine upper bounds on the representation size of $\sum c_i \sqrt[t_i]{p_i}$ and to describe an algorithm that computes it.

The following two sections of this thesis are organized as follows. In the first part of the next section we prove certain generalizations of the basic theorem of [BFHT]. In the second part we provide the basic means for our denesting algorithms and give a first description of these algorithms. They will be described in more detail in Section 7 where we also give an analysis of the algorithms. This part is quite technical and hopefully the reader will be already convinced at the end of Section 6 that our algorithms are much more efficient than the general algorithms known so far.

6.1 The Basic Theorems

In this section we prove our basic theorems on denesting nested radical expressions. For the time being we denote by a symbol $\sqrt[d]{\rho}$, $d \in \mathbb{N}$, $\rho \in \mathbb{C}$, a solution to the equation $X^d - \rho = 0$. If we do not restrict the radical then $\sqrt[d]{\rho}$ denotes any of the $d$ different solutions of $X^d - \rho = 0$. If we require that the radical $\sqrt[d]{\rho}$ is real then $\sqrt[d]{\rho}$ denotes one of the at most two real solutions of $X^d - \rho = 0$. Hence it is implicitly assumed that $\rho \in \mathbb{R}$ and that $X^d - \rho = 0$ has a real solution.

First we restrict ourselves to a single nested radical. Suppose we are given a nested radical, for example $\sqrt[d]{\sum_{i=1}^{k} \kappa_i \sqrt[d_i]{\rho_i}}$, of depth 2 over some field $F$. To describe and compute the possible denestings of this nested radical we consider $\gamma = \sum_{i=1}^{k} \kappa_i \sqrt[d_i]{\rho_i}$ as an element of the radical extension $E = F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k})$ generated by the depth one radicals $\sqrt[d_i]{\rho_i}$ appearing in $\sum_{i=1}^{k} \kappa_i \sqrt[d_i]{\rho_i}$. The theorems that we are going to prove in this section show that this field is almost the right place to look for denestings.

Observe that if $F \subset \mathbb{R}$, $\sqrt[d_i]{\rho_i} \in \mathbb{R}$, $i = 1, 2, \ldots, k$, and the radicals $\sqrt[d_i]{\rho_i}$ are linearly independent over $F$ then $E = F(\gamma)$ (Theorem 3.13, p. 32). The same is true for not necessarily real radicals $\sqrt[d_i]{\rho_i}$ if $F$ contains primitive $d_i$-th roots of unity (Theorem 3.11, p. 31).

Let us fix some notation. Throughout this subsection $F$ is a subfield of $\mathbb{C}$ and $E = F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k})$ is a radical extension of $F$. By $N = [E : F]$ denote the degree of $E$ over $F$. By $n_j$ denote the degree of the extension $F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_j]{\rho_j}) : F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_{j-1}]{\rho_{j-1}})$. As has been mentioned in Section 2 a basis for the extension $E : F$ is given by

$$B = \{\beta_0, \beta_1, \ldots, \beta_{N-1}\} = \left\{ \prod_{j=1}^{k} \left(\sqrt[d_j]{\rho_j}\right)^{e_j},\ 0 \le e_1 < n_1, \ldots, 0 \le e_k < n_k \right\}.$$

As before $B$ is called the standard basis of $E : F$.

The next theorem generalizes and simplifies the proof of the result due to Borodin et al. [BFHT] (Theorem 6.2).

Theorem 6.1.1 Let $F$ be a subfield of $\mathbb{R}$, $d_i \in \mathbb{N}$, $\rho_i \in F$, $\sqrt[d_i]{\rho_i} \in \mathbb{R}$, $\sqrt[d_i]{\rho_i} > 0$ for all $i = 1, 2, \ldots, k$, $\gamma \in E = F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k})$, $d \in \mathbb{N}$. Assume $\sqrt[d]{\gamma}$ is real and denests over $F$ using only real radicals, that is, $\sqrt[d]{\gamma} \in F(\sqrt[t_1]{\gamma_1}, \ldots, \sqrt[t_m]{\gamma_m})$, for some $t_i \in \mathbb{N}$, $\gamma_i \in F$, $p_i > 0$. Then there exist $\gamma_0 \in F \setminus \{0\}$, $\beta_j \in B$ such that

$$\sqrt[d]{\gamma_0}\,\sqrt[d]{\beta_j}\,\sqrt[d]{\gamma} = \eta \in E,$$

where the real $d$-th roots of $\gamma_0, \beta_j$ are implied in the above expressions. Hence writing $\eta$ as a linear combination of the elements of the standard basis $B$ of $E$ and dividing by $\sqrt[d]{\gamma_0}\,\sqrt[d]{\beta_j}$ leads to a denesting of $\sqrt[d]{\gamma}$.

Observe that the condition $\sqrt[d_i]{\rho_i} > 0$ is no restriction since any real radical extension can clearly be generated by positive radicals. However, if it is generated by positive radicals then any element of the standard basis is a positive real number. Hence the real $d$-th root of $\beta_j$ as required by the theorem will always exist.
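To make the standard basis concrete, the exponent vectors $(e_1, \ldots, e_k)$ can be enumerated directly. The sketch below assumes that the relative degrees $n_j$ are already known (computing them is a separate problem) and evaluates each basis element numerically for positive real generators; `standard_basis` is a name chosen for this illustration.

```python
from itertools import product
from math import prod


def standard_basis(radicals, degrees):
    """Enumerate the standard basis of F(r_1, ..., r_k).

    radicals: list of pairs (d_j, rho_j) with rho_j > 0, meaning r_j = rho_j**(1/d_j)
    degrees:  the relative degrees n_j (assumed to be given)
    Returns a list of (exponent_vector, numerical_value).
    """
    values = [rho ** (1.0 / d) for d, rho in radicals]
    basis = []
    for exps in product(*(range(n) for n in degrees)):
        value = prod(v ** e for v, e in zip(values, exps))
        basis.append((exps, value))
    return basis


# Q(sqrt(2), sqrt(3)) has degree N = 2 * 2 = 4 over Q.
B = standard_basis([(2, 2), (2, 3)], [2, 2])
for exps, value in B:
    print(exps, value)
```

With positive generators every value printed is a positive real number, which is the property used in the remark above.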

Proof: $\sqrt[d]{\gamma} \in F(\sqrt[t_1]{\gamma_1}, \ldots, \sqrt[t_m]{\gamma_m})$ clearly implies $\sqrt[d]{\gamma} \in E(\sqrt[t_1]{\gamma_1}, \ldots, \sqrt[t_m]{\gamma_m})$. Hence $\sqrt[d]{\gamma}$ is a radical over $E$ contained in $E(\sqrt[t_1]{\gamma_1}, \ldots, \sqrt[t_m]{\gamma_m})$. Since both fields, $E$ and $E(\sqrt[t_1]{\gamma_1}, \ldots, \sqrt[t_m]{\gamma_m})$, are real we can apply Lemma 3.6, p. 26, which shows

$$\sqrt[d]{\gamma} = \eta \prod_{i=1}^{m} \left(\sqrt[t_i]{\gamma_i}\right)^{f_i}, \quad\text{or}\quad \sqrt[d]{\gamma} \left( \prod_{i=1}^{m} \left(\sqrt[t_i]{\gamma_i}\right)^{f_i} \right)^{-1} = \eta$$

for some $\eta \in E$, $f_i \in \mathbb{N}$.

$\prod_{i=1}^{m} (\sqrt[t_i]{\gamma_i})^{f_i}$ can be written as $\sqrt[t]{\rho}$ for some $\rho \in F$ and $t = \prod_{i=1}^{m} t_i$. Taking $d$-th powers this yields

$$\gamma / \eta^d = \left(\sqrt[t]{\rho}\right)^d.$$

But $\gamma/\eta^d \in E \subseteq \mathbb{R}$ and $\rho \in F$. Therefore $(\sqrt[t]{\rho})^d$ is a real radical over $F$ that is contained in $E \subseteq \mathbb{R}$. By Lemma 3.6, p. 26, $(\sqrt[t]{\rho})^d = \gamma'_0 \beta_h$ for $\gamma'_0 \in F$ and some basis element $\beta_h$.

Since $\sqrt[t]{\rho} = \frac{1}{\eta}\sqrt[d]{\gamma} \in \mathbb{R}$ this shows that $\sqrt[t]{\rho}$ can be written as a real $d$-th root of $\gamma'_0 \beta_h$. By assumption $\beta_h$ is positive and therefore $\gamma'_0$ must be positive if $d$ is even. Hence real $d$-th roots of $\gamma'_0$ and $\beta_h$ exist such that

$$\sqrt[d]{\frac{1}{\gamma'_0}}\,\sqrt[d]{\frac{1}{\beta_h}}\,\sqrt[d]{\gamma} = \eta \in E.$$

Choosing different real roots of $\gamma'_0$ and $\beta_h$ just affects the sign. Therefore multiplying $\sqrt[d]{\gamma}$ with arbitrary real $d$-th roots of $\gamma'_0, \beta_h$ leads to an element of $E$.

Finally observe that $\beta_h^{-1}$ is a radical of $E$ over $F$. Hence it can be written as $\gamma''_0 \beta_j$ for a basis element $\beta_j$ and $\gamma''_0 \in F$, $\gamma''_0 > 0$. So $\gamma_0$ may be chosen as $\gamma''_0 / \gamma'_0 \ne 0$ to prove the first claim.

To prove that the equation $\sqrt[d]{\gamma_0}\,\sqrt[d]{\beta_j}\,\sqrt[d]{\gamma} = \eta$ leads to a denesting observe that each basis element $\beta_j \in B$ can be written as a real $N$-th root of some element in $F$. In fact, $\beta_j$ is a radical over $F$ and generates a subfield of $E$ over $F$. Hence its degree $N_j$ over $F$ must be a divisor of $N$. By Theorem 3.3, p. 24, in this case $\beta_j$ can be written as a real root $\sqrt[N_j]{\omega_j}$, $\omega_j \in F$. Since $N_j \mid N$, $\sqrt[N_j]{\omega_j}$ can be replaced by $\sqrt[N]{\omega'_j}$ for an appropriate $\omega'_j \in F$. This finally proves the theorem.

In view of the last fact mentioned in the proof, Theorem 6.1.1 can be interpreted as saying that only $dN$-th roots help in denesting $\sqrt[d]{\gamma}$ and that $\sqrt[d]{\gamma}$ can be denested by real radicals if and only if an element $\gamma_0 \in F$ exists such that for the real roots $\sqrt[dN]{\gamma_0}$ of $\gamma_0$ we have $\sqrt[dN]{\gamma_0}\,\sqrt[d]{\gamma} \in E$. As it turns out, the slightly more precise description in Theorem 6.1.1 yields a more efficient algorithm.

In case $E$ is generated by a single square root $\sqrt{\rho_1}$ the basis $B$ consists of $1$ and $\sqrt{\rho_1}$ only. So in this case we immediately get the first part of the theorem of Borodin et al. mentioned in the introduction to this section.

Next observe that, given the equation $\gamma_0 \beta_j \gamma = \eta^d$, $\eta \in E$, we can apply the distinct field embeddings $\sigma_i$, $i = 0, 1, \ldots, N-1$, of $E$ over $F$ to this equation. This yields $\gamma_0 \sigma_i(\beta_j) \sigma_i(\gamma) = \sigma_i(\eta)^d$, for all $i$.

Hence any $d$-th root of $\sigma_i(\gamma)$ can be written as a product of an appropriate $d$-th root of $\sigma_i(\beta_j)^{-1}$, an appropriate $d$-th root of $\gamma_0^{-1}$, and $\sigma_i(\eta)$. In general, the $d$-th roots will be complex.
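The invariance of the relation $\gamma_0 \beta_j \gamma = \eta^d$ under embeddings can be checked exactly in the smallest case $E = \mathbb{Q}(\sqrt{2})$, $d = 2$. The sketch below uses a hypothetical helper class `QuadElem` for exact arithmetic in $\mathbb{Q}(\sqrt{r})$; the two sample relations were worked out by hand for this illustration.

```python
from fractions import Fraction


class QuadElem:
    """Element a + b*sqrt(r) of Q(sqrt(r)), r a fixed rational non-square."""

    def __init__(self, a, b, r):
        self.a, self.b, self.r = Fraction(a), Fraction(b), Fraction(r)

    def __mul__(self, other):
        if not isinstance(other, QuadElem):       # allow rational scalars
            other = QuadElem(other, 0, self.r)
        return QuadElem(self.a * other.a + self.r * self.b * other.b,
                        self.a * other.b + self.b * other.a,
                        self.r)

    __rmul__ = __mul__

    def __eq__(self, other):
        return (self.a, self.b, self.r) == (other.a, other.b, other.r)

    def conj(self):
        """The non-trivial embedding sigma: sqrt(r) -> -sqrt(r)."""
        return QuadElem(self.a, -self.b, self.r)


def relation_holds(gamma0, beta, gamma, eta):
    """Exact check of gamma0 * beta * gamma == eta**2 (the case d = 2)."""
    return gamma0 * beta * gamma == eta * eta


one, root2 = QuadElem(1, 0, 2), QuadElem(0, 1, 2)
# 2 * 1 * (3 + 2*sqrt(2)) = (2 + sqrt(2))^2
ex1 = (2, one, QuadElem(3, 2, 2), QuadElem(2, 1, 2))
# 16 * sqrt(2) * (4 + 3*sqrt(2)) = (8 + 4*sqrt(2))^2
ex2 = (16, root2, QuadElem(4, 3, 2), QuadElem(8, 4, 2))

for g0, beta, gamma, eta in (ex1, ex2):
    print(relation_holds(g0, beta, gamma, eta),
          relation_holds(g0, beta.conj(), gamma.conj(), eta.conj()))
```

In the second example $\sigma(\gamma) = 4 - 3\sqrt{2}$ is negative, so its square roots are complex, which matches the last remark above.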

Recall from Section 2 that $\sigma_i(\gamma)$ is a root of the minimal polynomial of $\gamma$. Hence the different $d$-th roots of $\sigma_i(\gamma)$, $i = 0, 1, \ldots, N-1$, are the roots of the minimal polynomial of $\sqrt[d]{\gamma}$, provided $\sqrt[d]{\gamma}$ is of degree $d$. By Siegel's Theorem 3.9, p. 28, the radicals appearing in the expression for $\gamma$ and $\eta$ will be mapped by each $\sigma_i$ onto certain complex radicals. Hence the equation $\gamma_0 \beta_j \gamma = \eta^d$ leads to a denesting for all conjugates of $\sqrt[d]{\gamma}$ over $F$. In [HH] this has been called an exact denesting.

Before we proceed with linear combinations of nested radicals and complex nested radicals we show that no sum of real depth one radicals over $F$ denesting $\sqrt[d]{\gamma}$ can consist of fewer terms than the one described by Theorem 6.1.1. We will even show that the denesting is basically unique.

Observe that the radicals appearing in the denesting of $\sqrt[d]{\gamma}$ following from Theorem 6.1.1 are linearly independent. In fact, they all differ from elements in $B$ only by the factor $(\sqrt[d]{\gamma_0}\,\sqrt[d]{\beta_j})^{-1}$.

Suppose that $\sqrt[d]{\gamma}$ can be denested in two different ways by sums of real radicals over $F$,

$$\sqrt[d]{\gamma} = \sum_{i=1}^{n} \kappa_i \sqrt[s_i]{\eta_i}, \qquad \eta_i, \kappa_i \in F,\ \kappa_i \ne 0,$$

and

$$\sqrt[d]{\gamma} = \sum_{j=1}^{n'} \lambda_j \sqrt[t_j]{\mu_j}, \qquad \mu_j, \lambda_j \in F,\ \lambda_j \ne 0.$$

We may assume that the $\sqrt[s_i]{\eta_i}$'s and $\sqrt[t_j]{\mu_j}$'s separately are linearly independent. Hence (see Corollary 3.10, p. 28)

$$\frac{\sqrt[s_l]{\eta_l}}{\sqrt[s_m]{\eta_m}} \notin F, \qquad \frac{\sqrt[t_l]{\mu_l}}{\sqrt[t_m]{\mu_m}} \notin F$$

for all pairs of different indices.

On the other hand,

$$\sum_{i=1}^{n} \kappa_i \sqrt[s_i]{\eta_i} - \sum_{j=1}^{n'} \lambda_j \sqrt[t_j]{\mu_j} = 0.$$

Therefore the set of radicals $\sqrt[s_1]{\eta_1}, \ldots, \sqrt[s_n]{\eta_n}, \sqrt[t_1]{\mu_1}, \ldots, \sqrt[t_{n'}]{\mu_{n'}}$ must be linearly dependent. It follows from Corollary 3.10, p. 28, and the fact that the $\sqrt[s_i]{\eta_i}$'s and $\sqrt[t_j]{\mu_j}$'s separately are linearly independent that any $\sqrt[s_i]{\eta_i}$ differs from some $\sqrt[t_j]{\mu_j}$ only by a factor in $F$. Moreover, different $\sqrt[s_i]{\eta_i}$'s are multiples of different $\sqrt[t_j]{\mu_j}$'s. The same is true if we interchange the roles of the $\sqrt[s_i]{\eta_i}$'s and $\sqrt[t_j]{\mu_j}$'s. This shows that up to linearly dependent radicals denestings of $\sqrt[d]{\gamma}$ by sums of real depth one radicals are basically unique and will have at most $N$ terms.


Observe that the number of terms in the description of $\gamma$ may be less than $\log N$. In general, elements of $E$ have a description of size $\Omega(N)$ but although we treat $\gamma$ as an element of $E$ we do not assume that $E$ is fixed and that $\gamma$ is given as an element of $E$, that is, as a linear combination of elements in the standard basis $B$. Rather in the algorithms we describe below we first determine an appropriate representation of $E$. So if we really want to compute the denesting of Theorem 6.1.1 we must expect the output size to be exponential in the input size.

If we could represent elements in $E$ not only as linear combinations of basis elements but by arbitrary rational expressions in certain elements of $E$ the output size might be much smaller than $N$. But as all other methods in algorithmic algebra our approach relies on the basis representation.

Also in special cases the output size with respect to a basis representation may be much smaller than $N$ but it looks almost impossible to predict the output size in advance or to get an output sensitive algorithm. Hence we should be satisfied with a denesting algorithm whose run time is polynomial in $N$. At the end of Section 6 we describe an algorithm that achieves such a run time.

As the next theorem shows denesting radicals of the form $\sqrt[d]{\gamma}$ already suffices to denest linear combinations of nested radicals.

Theorem 6.1.2 Suppose $S = \sum_{i=1}^{k} \kappa_i \sqrt[d_i]{\gamma_i}$ is a sum of real nested radicals such that $\kappa_i, \gamma_i \in E_i$, $i = 1, \ldots, k$, and each $E_i$ is a real radical extension of $F \subseteq \mathbb{R}$. Furthermore assume that no nested radical $\sqrt[d_i]{\gamma_i}$ denests using real radicals over $F$. Then

• If $S$ denests using real radicals then $S = 0$.

• If no quotient $\sqrt[d_i]{\gamma_i} / \sqrt[d_j]{\gamma_j}$, $i \ne j$, denests using real radicals then $S = 0$ if and only if $\kappa_i = 0$ for all $i$.

Proof: If a quotient $\sqrt[d_i]{\gamma_i} / \sqrt[d_j]{\gamma_j}$ denests then $\sqrt[d_i]{\gamma_i} / \sqrt[d_j]{\gamma_j} = \sum_{h=1}^{l} \lambda_h \sqrt[r_h]{\rho_h}$ for some $\lambda_h, \rho_h \in F$. Hence $S$ can be written as $S = \sum_{i \in I} \kappa'_i \sqrt[d_i]{\gamma_i}$, where each $\kappa'_i$ is a real depth 1 expression over $F$, $I \subseteq \{1, \ldots, k\}$, and no quotient $\sqrt[d_i]{\gamma_i} / \sqrt[d_j]{\gamma_j}$, $i, j \in I$, denests.

Next suppose $S$ denests to $S = \sum_{j=1}^{k'} \lambda_j \sqrt[s_j]{\rho_j}$, $\lambda_j, \rho_j \in F$. Consider the real radical extension $E$ of $F$ generated by the fields $E_i$, the radicals $\sqrt[s_j]{\rho_j}$, and the radicals appearing in at least one of the $\kappa'_i$'s.

$$S' = S - \sum_{j=1}^{k'} \lambda_j \sqrt[s_j]{\rho_j} = \sum_{i \in I} \kappa'_i \sqrt[d_i]{\gamma_i} - \sum_{j=1}^{k'} \lambda_j \sqrt[s_j]{\rho_j} = 0.$$

Consider the sum $S'$ as a linear combination of the radicals $1, \sqrt[d_i]{\gamma_i}$, $i \in I$, where the coefficient $\kappa'_0$ for $1$ is defined as $\kappa'_0 = \sum_{j=1}^{k'} \lambda_j \sqrt[s_j]{\rho_j}$. Each radical $\sqrt[d_i]{\gamma_i}$ is a depth one radical over $E$ and the same is trivially true for the element $1$. Furthermore, the coefficients $\kappa'_i$ in $S'$ are elements of $E$.

Apply Corollary 3.10, p. 28, to the field $E$ and the sum $S'$. Together with the assumption that no nested radical $\sqrt[d_i]{\gamma_i}$ denests and the fact that no quotient of these nested radicals denests using real radicals it implies that the set $\{\sqrt[d_i]{\gamma_i} \mid i \in I\} \cup \{1\}$ is linearly independent over $E$ and hence $S' = 0$ if and only if $\kappa'_i = 0$ for all $i$. In particular, $S = \kappa'_0 = 0$, which proves the first part of the theorem.

If the nested radicals in $S$ already have the property that no quotient of two of them denests then the argument above shows that $S = 0$ implies $\kappa_i = 0$, for all $i$, which proves the second part of the theorem.

With respect to this theorem, given a sum $S = \sum_{i=1}^{k} \kappa_i \sqrt[d_i]{\gamma_i}$ of nested radicals, by determining which nested radicals and which ratios of nested radicals denest, $S$ can be transformed into

$$S = S' + \sum_{j=1}^{k'} \lambda_j \sqrt[s_j]{\rho_j},$$

where $S'$ is a sum of nested radicals satisfying the conditions of the previous theorem and $\sum_{j=1}^{k'} \lambda_j \sqrt[s_j]{\rho_j}$ is a depth 1 expression. Hence $S$ can denest if and only if $S' = 0$, in which case the denesting is given by $\sum_{j=1}^{k'} \lambda_j \sqrt[s_j]{\rho_j}$.

As above it can easily be shown that a denesting for $S$ leads to a denesting for all roots of the minimal polynomial of $S$.

Notice that the degree of the minimal polynomial of the sum $S$ is in general exponential in $k$. On the other hand, from the theorem above it is easily deduced that the number of terms in a denesting is polynomial in this parameter. Moreover, we will prove below that the whole description size of the denesting is polynomial in $k$. Hence the run times of the algorithms in [La2] and [HH] that construct the minimal polynomial and even the splitting field of $S$ will be exponential in the output size. Our denesting algorithm for these denestings will have a run time that is polynomial in $k$.

For complex radicals we get the following generalization of Theorem 6.1.1. So far it is computationally not very useful but it partially proves a conjecture of R. Zippel [Z].

Theorem 6.1.3 Assume $F$ contains all roots of unity. Let $E$ be an arbitrary radical extension of $F$ with standard basis $B$. Assume $\gamma \in E$ and $d \in \mathbb{N}$ such that $\sqrt[d]{\gamma}$ denests over $F$, that is, $\sqrt[d]{\gamma} \in F(\sqrt[t_1]{\gamma_1}, \ldots, \sqrt[t_m]{\gamma_m})$, for some $\gamma_i \in F$. Then there exist $\gamma_0 \in F \setminus \{0\}$, $\beta_j \in B$ such that

$$\sqrt[d]{\gamma_0}\,\sqrt[d]{\beta_j}\,\sqrt[d]{\gamma} = \eta \in E$$

for all $d$-th roots of $\gamma_0, \beta_j$.

The proof of this theorem is exactly the same as for Theorem 6.1.1 except that Lemma 3.6, p. 26, is replaced by Lemma 3.8, p. 27 and that we need not worry that all radicals involved are real. The fact that we may choose arbitrary roots of $\gamma_0$ and $\beta_j$ is an immediate consequence of the fact that $F$ contains all roots of unity.

In the form presented above Theorem 6.1.3 seems to be of theoretical interest only since it is computationally infeasible to work with a field containing all roots of unity. But at least it shows that even for arbitrary nested radicals those denestings that will be considered in the remaining sections of this thesis may be the right thing to look at. Also, the results of Horng, Huang and Landau can be interpreted as saying that instead of considering a field containing all roots of unity it suffices to work in a field that contains a certain root of unity whose degree is finite although very large. Moreover, there is the following restricted version of Theorem 6.1.3. It also generalizes a result of Borodin et al. for certain complex nested square roots (see [BFHT]).

Theorem 6.1.4 Assume $F$ contains a primitive $d$-th root of unity. Let $E$ be a radical extension of $F$ generated by radicals $\sqrt[d_i]{\rho_i}$ such that $d_i$ divides $d$. Assume $\gamma \in E$, $d \in \mathbb{N}$, such that $\sqrt[d]{\gamma}$ denests over $F$ using radicals of the form $\sqrt[t]{\rho}$ with $t$ dividing $d$, that is, $\sqrt[d]{\gamma} \in F(\sqrt[t_1]{\gamma_1}, \ldots, \sqrt[t_m]{\gamma_m})$, for some $\gamma_i \in F$, $t_i \in \mathbb{N}$, $t_i$ dividing $d$ for all $i$. Then there exists an element $\gamma_0 \in F \setminus \{0\}$ such that

$$\sqrt[d]{\gamma_0}\,\sqrt[d]{\gamma} = \eta \in E$$

for all $d$-th roots of $\gamma_0$.


Representing $\eta$ as a linear combination of the elements of the standard basis of $E$ and dividing it by $\sqrt[d]{\gamma_0}$ leads to a denesting of $\sqrt[d]{\gamma}$ using only radicals of the form $\sqrt[t]{\rho}$ with $t$ dividing $d$ and $\rho \in F$.

Proof: $\sqrt[d]{\gamma}$ is a radical over $E$ contained in $E(\sqrt[t_1]{\gamma_1}, \ldots, \sqrt[t_m]{\gamma_m})$. Since $t_i$ divides $d$, $i = 1, 2, \ldots, m$, the field $F$ and hence $E$ contains primitive $d$-th and primitive $t_i$-th roots of unity for all $i$. Applying Lemma 3.8, p. 27, shows

$$\sqrt[d]{\gamma} = \eta \prod_{i=1}^{m} \left(\sqrt[t_i]{\gamma_i}\right)^{f_i}, \quad\text{or}\quad \sqrt[d]{\gamma} \left( \prod_{i=1}^{m} \left(\sqrt[t_i]{\gamma_i}\right)^{f_i} \right)^{-1} = \eta$$

for some $\eta \in E$, $f_i \in \mathbb{N}$.

Since $t_i$ divides $d$, the product $\left(\prod_{i=1}^{m} (\sqrt[t_i]{\gamma_i})^{f_i}\right)^{-1}$ can be written as a $d$-th root of $\gamma_0$ for an appropriate $\gamma_0 \in F$ and an appropriate $d$-th root of $\gamma_0$.

But $E$ contains all $d$-th roots of unity. Therefore multiplying $\sqrt[d]{\gamma}$ with any $d$-th root of $\gamma_0$ leads to an element of $E$.

Since the extension $E$ has a standard basis consisting of radicals of the form $\sqrt[t]{\rho}$, $t$ dividing $d$, the equation $\sqrt[d]{\gamma_0}\,\sqrt[d]{\gamma} = \eta \in E$ really leads to a denesting of $\sqrt[d]{\gamma}$ using only radicals of the form $\sqrt[t]{\rho}$, $\rho \in F$, $t \in \mathbb{N}$, such that $t$ divides $d$.

We could have formulated the theorem using only radicals of the form $\sqrt[d]{\rho}$, $\rho \in F$. However, the present form contains more information and is more convenient for the analysis of the denesting algorithms in Section 7.

The theorem may also be interesting from a practical point of view in the following sense.

Given a nested radical like $\sqrt[d]{\gamma}$ containing complex radicals $\sqrt[d_i]{\rho_i}$, assume you want to denest it and you want to allow a denesting by radicals over some larger field, which is still easy to describe and allows efficient arithmetic, and you want to allow the degrees of the radicals to be larger than the original ones.

This can be achieved in the following way. Choose some integer $D$ such that $d$ and $d_i$ divide $D$. $\sqrt[d]{\gamma}$ can be written as $\sqrt[D]{\gamma^e}$ for some $e$. Replace the field $F$ by the smallest field $F'$ containing $F$ and a primitive $D$-th root of unity. Then Theorem 6.1.4 describes, and the algorithm we are going to describe in the remainder of Section 6 and in Section 7 can determine, a denesting over $F'$ using radicals of the form $\sqrt[D]{\rho}$, $\rho \in F'$.


This generalization is easily incorporated in the results and techniques described below. It does not require any new arguments or ideas. So we will restrict ourselves in the sequel to denestings as in Theorem 6.1.4.

Recall that in the real case we showed that the denesting of Theorem 6.1.1 was basically unique and since the elements in the standard basis of a radical extension are of course linearly independent any denesting of $\sqrt[d]{\gamma}$ by a sum of real radicals has at least as many terms as the one obtained by Theorem 6.1.1. The same is true in the present situation if the expression "real radicals" is always replaced by "radicals of the form $\sqrt[t]{\rho}$ with $t$ dividing $d$". In particular, any denesting of this form has at least as many terms as the one obtained by the conclusion in Theorem 6.1.4.

Let us finally mention the generalization of Theorem 6.1.2 to complex radicals.

Theorem 6.1.5 Assume $F$ contains a primitive $d$-th root of unity. Suppose $S = \sum_{i=1}^{k} \kappa_i \sqrt[d]{\gamma_i}$ is a sum of nested radicals such that $\kappa_i, \gamma_i \in E_i$, $i = 1, \ldots, k$, and each $E_i$ is a radical extension of $F$ generated by radicals of the form $\sqrt[t]{\rho}$, $\rho \in F$, $t \in \mathbb{N}$, such that $t$ divides $d$. Furthermore assume that no nested radical $\sqrt[d]{\gamma_i}$ denests using only radicals of the same form as the ones defining the fields $E_i$. Then

• If $S$ denests using radicals of the form $\sqrt[t]{\rho}$, $\rho \in F$, $t \mid d$, then $S = 0$.

• If no quotient $\sqrt[d]{\gamma_i} / \sqrt[d]{\gamma_j}$, $i \ne j$, denests using radicals of the form $\sqrt[t]{\rho}$, $\rho \in F$, $t \mid d$, then $S = 0$ if and only if $\kappa_i = 0$ for all $i$.

The proof is the same as the one for Theorem 6.1.2, replacing the real variant of Corollary 3.10, p. 28, by the complex one.

This finishes the description of our basic denesting theorems. In the following section we give a description of all elements $\gamma_0 \in F$ with the property $\sqrt[d]{\gamma_0}\,\sqrt[d]{\gamma'} \in E$ for a nested radical $\sqrt[d]{\gamma'}$, $\gamma' \in E$. This is already an exhaustive description of denestings as in Theorem 6.1.4. To describe the denestings of Theorem 6.1.1 in full detail we apply the description to all elements $\beta_j \gamma$, $\gamma \in E$, $\beta_j \in B$.

This description, however, leads to an efficient algorithm only for nested radicals $\sqrt[d]{\gamma}$, where $\gamma$ is an element of a simple radical extension. But as it turns out, any denesting problem can be reduced to certain denesting problems for simple radical extensions.


6.2 Denesting Sets and Reduction to Simple Radical Extensions

With respect to the results of Theorem 6.1.1 and Theorem 6.1.4 for the rest of Section 6 we use the following convention.

Convention For the rest of Section 6 any symbol of the form $\sqrt[m]{\rho}$, $m \in \mathbb{N}$, $\rho \in \mathbb{C}$, denotes the complex number $|\rho|^{1/m}\left(\cos\frac{1}{m}\varphi_\rho + i \sin\frac{1}{m}\varphi_\rho\right)$, where $|\rho|^{1/m}$ is the positive real $m$-th root of $|\rho|$ and $\varphi_\rho \in (-\pi, \pi]$ is the angle of $\rho$ when written in polar coordinates.

In fact, in Theorem 6.1.4 since $F$ contains a primitive $d$-th root of unity and $d_i \mid d$ for all $i$ the extension $F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k})$ is the same no matter which $d_i$-th roots are meant. Furthermore, whether the nested radical can be denested depends only on $d$ and $\gamma$ but not on the particular choice of the $d$-th root. Finally, we may choose the roots of the element $\gamma_0$ in Theorem 6.1.4 arbitrarily.

Likewise, in Theorem 6.1.1 the radical extension $F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k})$ is independent of the specific real roots and the question whether the nested radical can be denested depends only on $d$, $\gamma$, and the fact that real $d$-th roots are considered. Moreover, if elements $\gamma_0, \beta_j$ as stated in the theorem exist we may take any real $d$-th root of $\gamma_0, \beta_j$.

Observe that if a real number $\rho$ has a real $m$-th root then $|\rho|^{1/m}\left(\cos\frac{1}{m}\varphi_\rho + i \sin\frac{1}{m}\varphi_\rho\right)$ uniquely describes such a real root. Finally, if $\rho > 0$ then $\sqrt[m]{\rho} > 0$, too.
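The convention can be written down as a small numerical function. This is a floating-point sketch only; `principal_root` is a name chosen here, and `cmath.polar` is used to obtain the modulus and the angle.

```python
import cmath
import math


def principal_root(rho: complex, m: int) -> complex:
    """m-th root of rho according to the convention:
    |rho|**(1/m) * (cos(phi/m) + i*sin(phi/m)), with phi the angle of rho."""
    modulus, phi = cmath.polar(rho)   # phi lies in [-pi, pi]; a negative-zero
                                      # imaginary part would give phi = -pi
    r = modulus ** (1.0 / m)
    return complex(r * math.cos(phi / m), r * math.sin(phi / m))


print(principal_root(16, 4))      # a positive real number has a positive real root
print(principal_root(-4, 2))      # angle pi, so the root has angle pi/2
print(principal_root(1j, 2))      # angle pi/2, so the root has angle pi/4
```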

Also to simplify the statement of our results we refer to denesting problems as in Theorem 6.1.1 and Theorem 6.1.2 as the real case. To denesting problems as in Theorem 6.1.4 and Theorem 6.1.5 we refer as the complex case.

Definition 6.2.1 Assume $F \subset \mathbb{C}$ and let $E$ be an algebraic extension of $F$, $\gamma \in E$, $d \in \mathbb{N}$. We say that $\gamma_0 \in F \setminus \{0\}$ denests $\sqrt[d]{\gamma}$ over $F$ or that it leads to a denesting of $\sqrt[d]{\gamma}$ over $F$ if $\gamma_0 \gamma = \eta^d$ for some $\eta \in E$. The element $\gamma_0$ is called a denesting element for $\sqrt[d]{\gamma}$ over $F$.

A set $S \subset F$ of elements denesting $\sqrt[d]{\gamma}$ such that any element denesting $\sqrt[d]{\gamma}$ differs from some element in this set by a $d$-th power of an element in $F$ is called a denesting set for $\sqrt[d]{\gamma}$ over $F$. If no two elements in a denesting set differ from one another by a $d$-th power of an element in $F$ then the denesting set is called irreducible.
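For $F = \mathbb{Q}$, $E = \mathbb{Q}(\sqrt{r})$ and $d = 2$ the definition can be tested exactly: $\gamma_0$ denests $\sqrt{\gamma}$ precisely when $\gamma_0\gamma$ is a square in $E$. The sketch below decides this by solving $(u + v\sqrt{r})^2 = a + b\sqrt{r}$ over the rationals; it is an illustration for this special case only, and the function names are hypothetical.

```python
from fractions import Fraction
from math import isqrt


def rational_sqrt(q):
    """Non-negative rational square root of q, or None."""
    q = Fraction(q)
    if q < 0:
        return None
    rn, rd = isqrt(q.numerator), isqrt(q.denominator)
    if rn * rn == q.numerator and rd * rd == q.denominator:
        return Fraction(rn, rd)
    return None


def sqrt_in_quadratic_field(a, b, r):
    """Return (u, v) with (u + v*sqrt(r))**2 == a + b*sqrt(r), or None.
    r is assumed to be a rational non-square."""
    a, b, r = Fraction(a), Fraction(b), Fraction(r)
    if b == 0:
        u = rational_sqrt(a)
        if u is not None:
            return (u, Fraction(0))
        v = rational_sqrt(a / r)
        return None if v is None else (Fraction(0), v)
    s = rational_sqrt(a * a - r * b * b)      # the norm must be a square
    if s is None:
        return None
    for u_squared in ((a + s) / 2, (a - s) / 2):
        u = rational_sqrt(u_squared)
        if u is not None and u != 0:
            return (u, b / (2 * u))           # from 2*u*v = b
    return None


def denesting_witness(gamma0, a, b, r):
    """eta = (u, v) with gamma0 * (a + b*sqrt(r)) = eta**2, or None."""
    return sqrt_in_quadratic_field(gamma0 * a, gamma0 * b, r)


# Which gamma0 in 1..8 denest sqrt(3 + 2*sqrt(2)) over Q?
candidates = [g for g in range(1, 9) if denesting_witness(g, 3, 2, 2) is not None]
print(candidates)
```

Up to rational squares the candidates found fall into the two classes of $1$ and $2$, so $\{1, 2\}$ is an irreducible denesting set in this example.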


We gave this definition for arbitrary fields $F$ and $E$ because the characterization of elements denesting a nested radical that we give below applies in such a general setting and therefore generalizes a characterization of Zippel [Z] and Landau [La3].

In the general setting the definition says that $\gamma_0$ denests $\sqrt[d]{\gamma}$ if $\zeta\,\sqrt[d]{\gamma_0}\,\sqrt[d]{\gamma} \in E$ for some $d$-th root of unity $\zeta$. By the remarks above for field extensions as in Theorem 6.1.1 or Theorem 6.1.4 it simply states that $\gamma_0$ denests $\sqrt[d]{\gamma}$ if and only if $\sqrt[d]{\gamma_0}\,\sqrt[d]{\gamma} \in E$ (recall the convention from the beginning of this section). Notice that in the real case if $d$ is even then $\gamma_0$ can denest $\sqrt[d]{\gamma}$ only if $\gamma_0$ is positive.

Before we characterize denesting elements we show that any denesting problem as in Theorem 6.1.1 or in Theorem 6.1.4 can be reduced via denesting sets to certain denesting problems over simple radical extensions.

We begin with showing that in these cases finite denesting sets exist.

Lemma 6.2.2 Let the fields $F, E$, $d$, and $\gamma$ be either as in Theorem 6.1.1 or as in Theorem 6.1.4. By $B$ denote the standard basis of the extension $E$. If $\gamma_0 \in F$ denests $\sqrt[d]{\gamma}$ then

$$\{\beta_j^{d} \gamma_0 \mid \beta_j \in B \text{ such that } \beta_j^{d} \in F\}$$

and

$$\{\beta_j^{-d} \gamma_0 \mid \beta_j \in B \text{ such that } \beta_j^{d} \in F\}$$

are irreducible denesting sets for $\sqrt[d]{\gamma}$ over $F$.

The second set has been included since it turns out to be more convenient from a computational point of view.

Proof: It is clear that both sets contain only denesting elements for $\sqrt[d]{\gamma}$. It remains to show that they are irreducible denesting sets. We will do so in detail only for the first set.

First let us show that any two elements $\gamma_0, \bar{\gamma}_0$ that denest $\sqrt[d]{\gamma}$ differ from one another by a factor of the form $\rho^d \beta_j^d$, with $\rho \in F$, $\beta_j \in B$ such that $\beta_j^d \in F$. This will show that the first set is a denesting set.

By the remarks above

$$\sqrt[d]{\gamma_0}\,\sqrt[d]{\gamma} \in E,$$

and, likewise,

$$\sqrt[d]{\bar{\gamma}_0}\,\sqrt[d]{\gamma} \in E.$$

Hence

$$\frac{\sqrt[d]{\gamma_0}\,\sqrt[d]{\gamma}}{\sqrt[d]{\bar{\gamma}_0}\,\sqrt[d]{\gamma}} = \frac{\sqrt[d]{\gamma_0}}{\sqrt[d]{\bar{\gamma}_0}} \in E.$$

Therefore the ratio is a radical of $E$ over $F$. In particular, if $E$ is real the ratio must be a real number. By Lemma 3.6, p. 26 in the real case, or Lemma 3.8, p. 27 in the complex case,

$$\frac{\sqrt[d]{\gamma_0}}{\sqrt[d]{\bar{\gamma}_0}} = \rho \beta_j,$$

for some $\rho \in F$, $\beta_j \in B$. The claim follows.

It remains to show that the set is irreducible. If the elements in the set did not form an irreducible denesting set then

$$\frac{\beta_i^{d} \gamma_0}{\beta_j^{d} \gamma_0} = \rho^d$$

for some element $\rho \in F$ and different basis elements $\beta_i, \beta_j$. Hence $\beta_j$ and $\beta_i$ would differ from one another only by a factor in $F$. This is impossible since $\beta_i$ and $\beta_j$ are linearly independent.

The claim for the second set of elements in $F$ can be shown in exactly the same way by observing that if $E$ is described as the extension generated by $\sqrt[d_i]{\rho_i}^{\,-1}$ then with respect to these generators the elements $\beta_j^{-1}$ form the standard basis.

In the complex case $\beta_j^d \in F$ for all elements $\beta_j$ of the standard basis $B$, hence the previous lemma implies

Corollary 6.2.3 In the real case a denesting set has size at most $N$. In the complex case a denesting set has size exactly $N$.

Assume next that $L \supset F$ is a proper subfield of $E$, where $F$ and $E$ are as in the previous lemma. We may consider $\gamma$ also as an element of the algebraic extension $E$ of the field $L$. If an element in $F$ denesting $\sqrt[d]{\gamma}$ over $F$ exists then it is also a denesting element in $L$ for $\sqrt[d]{\gamma}$ over $L$. Moreover, let $D$ be a set containing the inverses of the elements of a denesting set for $\sqrt[d]{\gamma}$ over $L$. For each element $\gamma' \in D$ we consider a denesting set of $\sqrt[d]{\gamma'}$ over $F$. We claim that the union of these sets contains a denesting set for $\sqrt[d]{\gamma}$ over $F$.


To prove the claim recall that any element $\gamma_0 \in F \subset L$ that denests $\sqrt[d]{\gamma}$ over $F$ also denests $\sqrt[d]{\gamma}$ over $L$. Hence it differs from the inverse of some element $\gamma' \in D$ by a $d$-th power of an element in $L$, i.e.,
$$\gamma_0 = \eta^d\,\frac{1}{\gamma'}, \quad \eta \in L.$$
Hence
$$\gamma_0\gamma' = \eta^d,$$
i.e., $\gamma_0$ is a denesting element for $\sqrt[d]{\gamma'}$. By definition of a denesting set the claim follows.

Denote by $F^{(i)}$ the subfield of $E$ generated by the first $i$ radicals $\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_i]{\rho_i}$. Define sets $D^{(k-i)}$, $i = 0, 1, \ldots, k$, as follows:
$$D^{(k)} = \{\gamma\}.$$
Assume $D^{(k-i)}$ has already been defined. For each element $\gamma^{(k-i)} \in D^{(k-i)}$ let $D_{\gamma^{(k-i)}}$ be a set containing the inverses of a denesting set for $\sqrt[d]{\gamma^{(k-i)}}$ over $F^{(k-i-1)}$. Then
$$D^{(k-i-1)} = \bigcup_{\gamma^{(k-i)} \in D^{(k-i)}} D_{\gamma^{(k-i)}}.$$
By induction on $k-i$ and by repeatedly using the argument from above we immediately get

Lemma 6.2.4 If the sets $D^{(k-i)}$ are defined as above then the inverses of the elements in $D^{(0)}$ form a superset of a denesting set for $\sqrt[d]{\gamma}$.
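The level-by-level construction of the sets $D^{(k)}, D^{(k-1)}, \ldots, D^{(0)}$ can be sketched in a few lines of Python. This is only an illustration of the recursion, not the algorithm analyzed later: the names `build_candidate_sets` and `inverse_denesting_set` are hypothetical, and the callback stands in for the genuinely hard step of computing the inverses of a denesting set over $F^{(j)}$.

```python
from typing import Callable, Hashable, Iterable, List, Set


def build_candidate_sets(
    gamma: Hashable,
    k: int,
    inverse_denesting_set: Callable[[Hashable, int], Iterable[Hashable]],
) -> List[Set[Hashable]]:
    """Return the list [D^(k), D^(k-1), ..., D^(0)].

    inverse_denesting_set(g, j) is a hypothetical oracle returning the inverses
    of a denesting set for the d-th root of g over the field F^(j).
    """
    levels: List[Set[Hashable]] = [{gamma}]          # D^(k) = {gamma}
    for i in range(k):
        target_field = k - i - 1                     # index of F^(k-i-1)
        next_level: Set[Hashable] = set()
        for g in levels[-1]:                         # g ranges over D^(k-i)
            next_level |= set(inverse_denesting_set(g, target_field))
        levels.append(next_level)                    # D^(k-i-1) is the union
    return levels
```

With a toy oracle that maps an integer `g` to `{2*g, 3*g}`, two levels produce sets of sizes 1, 2 and 3; the union removes the duplicate that the two branches share.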

Our basic denesting algorithm will compute a set $D^{(0)}$ as described above and then will check for all inverses of elements in $D^{(0)}$ whether they denest $\sqrt[d]{\gamma}$.

But as long as we are not able to characterize and compute denesting elements the results obtained above are of no use. Therefore in the next subsection we give a constructive description of the elements denesting $\sqrt[d]{\gamma}$. It applies not only to radical extensions but to arbitrary extensions.

6.3 Characterizing Denesting Elements

In this subsection let $F \subset \mathbb{C}$ be an arbitrary field and $E$ an algebraic extension of $F$.


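We need one more definition. Let $\gamma \in E$ and $d \in \mathbb{N}$. If $\gamma_0$ denests $\sqrt[d]{\gamma}$ then $\gamma_0\gamma = \eta^d$, $\eta \in E$. Recall that this implies $\gamma_0\sigma_i(\gamma) = \sigma_i(\eta)^d$ for all embeddings $\sigma_i$ of $E$ over $F$. These equations are equivalent to the existence of $d$-th roots of unity $\zeta^{(i)}$ such that
$$\zeta^{(i)}\,\sqrt[d]{\gamma_0}\,\sqrt[d]{\sigma_i(\gamma)} = \sigma_i(\eta), \quad \text{for all } i.$$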

Definition 6.3.1 Let $F$ be an arbitrary field, $E$ an extension of $F$, $[E : F] = N$, $\gamma \in E$, $d \in \mathbb{N}$. Denote the field embeddings of $E$ over $F$ by $\sigma_i$, $i = 0, 1, \ldots, N-1$, with $\sigma_0$ being the identity. A sequence $(\zeta^{(0)}, \zeta^{(1)}, \zeta^{(2)}, \ldots, \zeta^{(N-1)})$ of $d$-th roots of unity is called an admissible sequence for $\sqrt[d]{\gamma}$ over $F$ if elements $\gamma_0 \in F$ and $\eta \in E$ exist such that $\zeta^{(i)}\sqrt[d]{\gamma_0}\sqrt[d]{\sigma_i(\gamma)} = \sigma_i(\eta)$ for all $i = 0, 1, \ldots, N-1$. For fixed $\gamma_0 \in F$ such a sequence is called an admissible sequence corresponding to $\gamma_0$.

An admissible sequence is called normalized if $\zeta^{(0)} = 1$.
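A small worked example may help to fix the definition. It is not taken from the original text; the choices $F = \mathbb{Q}$, $E = \mathbb{Q}(\sqrt{2})$, $d = 2$, $\gamma = 3 + 2\sqrt{2}$ are assumptions made purely for illustration, and all square roots of positive reals are taken positive. Here $N = 2$, and $\sigma_1$ is the embedding with $\sigma_1(\sqrt{2}) = -\sqrt{2}$. Taking $\gamma_0 = 1$ and $\eta = 1 + \sqrt{2}$ gives

$$\gamma_0\gamma = 3 + 2\sqrt{2} = (1+\sqrt{2})^2 = \eta^2, \qquad \sqrt{\gamma_0}\,\sqrt{\sigma_0(\gamma)} = 1 + \sqrt{2} = \sigma_0(\eta),$$

$$\sqrt{\gamma_0}\,\sqrt{\sigma_1(\gamma)} = \sqrt{3 - 2\sqrt{2}} = \sqrt{2} - 1 = -\,\sigma_1(\eta).$$

Hence $\zeta^{(0)} = 1$ and $\zeta^{(1)} = -1$, so $(1, -1)$ is a normalized admissible sequence for $\sqrt{\gamma}$ over $\mathbb{Q}$ corresponding to $\gamma_0 = 1$.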

For $\gamma_0 \in F$ there may be many different admissible sequences corresponding to it. For example, assume that $F$ contains a primitive $d$-th root of unity. If at least one admissible sequence corresponding to $\gamma_0$ exists then there are exactly $d$ different admissible sequences corresponding to $\gamma_0$. On the other hand, since we fixed the meaning of $\sqrt[d]{\gamma_0}$, $\sqrt[d]{\gamma}$, for each $\gamma_0$ there is at most one normalized sequence corresponding to it.

Based on this definition we give our characterization of elements denesting $\gamma$. It generalizes results of R. Zippel [Z] and S. Landau [La3] who characterized elements denesting a $d$-th root $\sqrt[d]{\gamma}$ if $E$ is a Galois extension of $F$. Lemma 6.3.2 avoids this assumption and is correct for arbitrary extensions of an arbitrary field $F$.

Lemma 6.3.2 Let $F \subset \mathbb{C}$ be an arbitrary field and $E$ an algebraic extension of $F$ with basis $\beta_0, \beta_1, \ldots, \beta_{N-1}$. Denote the distinct field embeddings by $\sigma_i$, $i = 0, \ldots, N-1$, with $\sigma_0$ being the identity. Assume $\gamma \in E$, $d \in \mathbb{N}$. If a denesting element for $\sqrt[d]{\gamma}$ exists then an admissible sequence $(\zeta^{(0)}, \zeta^{(1)}, \ldots, \zeta^{(N-1)})$ for $\sqrt[d]{\gamma}$ and a basis element $\beta_j$ exist such that
$$\left(\sum_{i=0}^{N-1} \zeta^{(i)}\sigma_i(\beta_j)\sqrt[d]{\sigma_i(\gamma)}\right)^{d} \in F\setminus\{0\}$$
and the inverse of this element also denests $\sqrt[d]{\gamma}$ over $F$.


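Moreover, any element $\gamma_0 \in F$ that denests $\sqrt[d]{\gamma}$ can be written as
$$\rho^d\left(\sum_{i=0}^{N-1} \zeta^{(i)}\sigma_i(\beta_j)\sqrt[d]{\sigma_i(\gamma)}\right)^{-d}$$
for an admissible sequence $(\zeta^{(0)}, \zeta^{(1)}, \ldots, \zeta^{(N-1)})$ for $\sqrt[d]{\gamma}$, a basis element $\beta_j$, and some $\rho \in F$.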

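Proof: Assume $\gamma_0 \in F$ denests $\sqrt[d]{\gamma}$. Hence an $\eta \in E$ exists such that $\gamma_0\sigma_i(\gamma) = \sigma_i(\eta)^d$ for all embeddings $\sigma_i$. Equivalently, $d$-th roots of unity $\zeta^{(i)}$ exist such that
$$\zeta^{(i)}\,\sqrt[d]{\gamma_0}\,\sqrt[d]{\sigma_i(\gamma)} = \sigma_i(\eta).$$
These roots of unity form an admissible sequence.

Furthermore applying Lemma 3.5, p. 25, to $\eta \in E$ yields
$$\mathrm{tr}(\beta_j\eta) = \sum_{i=0}^{N-1}\sigma_i(\beta_j)\sigma_i(\eta) = \sqrt[d]{\gamma_0}\sum_{i=0}^{N-1}\zeta^{(i)}\sigma_i(\beta_j)\sqrt[d]{\sigma_i(\gamma)} = \rho \in F\setminus\{0\}$$
for some element $\beta_j$ of the basis. Therefore
$$\gamma_0 = \left(\frac{\rho}{\sum_{i=0}^{N-1}\zeta^{(i)}\sigma_i(\beta_j)\sqrt[d]{\sigma_i(\gamma)}}\right)^{d}.$$
Since $\rho \in F$, the element $\left(\sum_{i=0}^{N-1}\zeta^{(i)}\sigma_i(\beta_j)\sqrt[d]{\sigma_i(\gamma)}\right)^{d}$ is also an element of $F$.

Moreover, as a $d$-th power of an element in $F \subset E$ the element $\rho^d$ will not help in denesting $\sqrt[d]{\gamma}$, so $\left(\sum_{i=0}^{N-1}\zeta^{(i)}\sigma_i(\beta_j)\sqrt[d]{\sigma_i(\gamma)}\right)^{-d}$ will also denest $\sqrt[d]{\gamma}$. This proves the existence of an admissible sequence and basis element as stated.

The claim that any number $\gamma_0 \in F$ denesting $\sqrt[d]{\gamma}$ can be written in the stated form follows from the fact that the process we just described can be applied to all these elements of the field $F$.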

The existence proof given above generalizes Lemma 6.2.2 in the sense that it shows the existence of finite denesting sets for expressions $\sqrt[d]{\gamma}$ even if $F$ and $E$ are arbitrary fields.

Remark 6.3.3 If $E$ and $\sqrt[d]{\gamma}$ are as in Theorem 6.1.1 or as in Theorem 6.1.4 then Lemma 6.3.2 remains correct if “admissible sequence” is replaced by “normalized admissible sequence”. In fact, in the complex case, for the identity $\sigma_0$, that is for $E$ and $\gamma$ itself, it does not matter which $d$-th root of $\gamma_0$ we are taking since $F$ contains all $d$-th roots of unity, and hence we can take $\sqrt[d]{\gamma_0}$ itself. In the real case, for $i = 0$ in $\zeta^{(0)}\sqrt[d]{\gamma_0}\sqrt[d]{\gamma} = \eta \in E$ the root of unity $\zeta^{(0)}$ must be real since $\eta$, $\sqrt[d]{\gamma_0}$, and $\sqrt[d]{\gamma}$ are real. But then it may very well be 1. ♦

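Let us briefly show that Lemma 6.3.2 generalizes the second part of the theorem of Borodin et al. (Theorem 6.2).

Suppose $E = F(\sqrt{\rho})$ and $\gamma = \alpha + \beta\sqrt{\rho}$, $\alpha, \beta \neq 0$. Moreover assume an element $\gamma_0 \in F$ exists such that
$$\sqrt{\gamma_0}\,\sqrt{\alpha + \beta\sqrt{\rho}} \in E.$$

$E$ has only two field embeddings, the identity and the mapping $\sigma$ satisfying $\sigma(\sqrt{\rho}) = -\sqrt{\rho}$. It is easily seen that for any $\gamma_0$ denesting $\sqrt{\gamma}$
$$\mathrm{tr}\left(\sqrt{\gamma_0(\alpha + \beta\sqrt{\rho})}\right) \neq 0.$$
By the previous lemma
$$\left(\sqrt{\alpha + \beta\sqrt{\rho}} + \varepsilon\sqrt{\alpha - \beta\sqrt{\rho}}\right)^2 \in F,$$
for $\varepsilon = 1$ or $\varepsilon = -1$. Expanding the square gives
$$\left(\sqrt{\alpha + \beta\sqrt{\rho}} + \varepsilon\sqrt{\alpha - \beta\sqrt{\rho}}\right)^2 = 2\alpha + 2\varepsilon\sqrt{\alpha^2 - \rho\beta^2}.$$
Hence $\sqrt{\alpha^2 - \rho\beta^2} \in F$. This shows that for both choices of $\varepsilon$ the square above is in $F$.

Moreover,
$$\frac{1}{2\alpha + 2\sqrt{\alpha^2 - \rho\beta^2}}$$
denests $\sqrt{\gamma}$, so the same must be true for $2\alpha + 2\sqrt{\alpha^2 - \rho\beta^2}$ itself, which is the first part of the condition in the theorem of Borodin et al. The second part can be deduced similarly.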
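The condition $\sqrt{\alpha^2 - \rho\beta^2} \in F$ is easy to test when $F = \mathbb{Q}$. The following Python sketch does so under that assumption; the function names `rational_sqrt` and `denest_sqrt` are hypothetical helpers introduced here, and the sketch covers only this special case of a single square root over the rationals, not the general algorithms of this section.

```python
from fractions import Fraction
from math import isqrt


def rational_sqrt(q: Fraction):
    """Return the exact non-negative square root of q if q is a square in Q, else None."""
    if q < 0:
        return None
    rn, rd = isqrt(q.numerator), isqrt(q.denominator)
    if rn * rn == q.numerator and rd * rd == q.denominator:
        return Fraction(rn, rd)
    return None


def denest_sqrt(alpha, beta, rho):
    """Try to write sqrt(alpha + beta*sqrt(rho)) as sqrt(x) + sign*sqrt(y) with x >= y >= 0 rational.

    Assumes F = Q, rho > 0 and alpha + beta*sqrt(rho) >= 0.
    Returns (x, y, sign), or None if alpha^2 - rho*beta^2 is not a rational square.
    """
    alpha, beta, rho = Fraction(alpha), Fraction(beta), Fraction(rho)
    s = rational_sqrt(alpha * alpha - rho * beta * beta)
    if s is None:
        return None
    x, y = (alpha + s) / 2, (alpha - s) / 2   # 4x = 2*alpha + 2*s is the element from the text
    if y < 0:
        return None
    return x, y, (1 if beta >= 0 else -1)
```

For instance, `denest_sqrt(5, 2, 6)` yields `x = 3`, `y = 2`, `sign = 1`, which corresponds to $\sqrt{5 + 2\sqrt{6}} = \sqrt{3} + \sqrt{2}$; here $2\alpha + 2\sqrt{\alpha^2 - \rho\beta^2} = 12$ and $12\,(5 + 2\sqrt{6}) = (6 + 2\sqrt{6})^2$.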

Unfortunately, for $d = 2$ and extensions generated by more than one square root or for $d > 2$ Lemma 6.3.2 does not lead to such a simple condition. We have to work much harder to get efficient algorithms. And even then it is not possible to apply these algorithms recursively to nested radicals of depth larger than two as is the case for nested radicals containing exactly one square root on each level (for the details of this recursive algorithm see [BFHT]).

On the other hand observe that even if we cannot determine the admissible sequences for $\sqrt[d]{\gamma}$ Lemma 6.3.2 almost immediately leads to an algorithm that computes a denesting element for $\sqrt[d]{\gamma}$ if any such element exists. For all basis elements $\beta_j$ and all $d^N$ $N$-tuples of $d$-th roots of unity we can use the algorithm leading to Theorem 5.2.6, p. 61, to check whether $\left(\sum_{i=0}^{N-1}\zeta^{(i)}\sigma_i(\beta_j)\sqrt[d]{\sigma_i(\gamma)}\right)^{d}$ equals an element $\gamma_0$ in $F$. Then we check for all elements $\gamma_0$ of $F$ found in this way whether $\zeta^{(0)}\sqrt[d]{\gamma_0^{-1}\gamma} \in E$, again using the method of Theorem 5.2.6, p. 61.
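The enumeration over all $d^N$ tuples can be illustrated numerically. The Python sketch below is a floating-point stand-in, assuming $F = \mathbb{Q}$: the helper `as_small_rational` replaces the exact reconstruction of Theorem 5.2.6 by a crude "close to a rational with small denominator" test, the names are hypothetical, and the second membership test in $E$ is omitted.

```python
import cmath
from fractions import Fraction
from itertools import product


def roots_of_unity(d):
    """All d-th roots of unity as complex floats."""
    return [cmath.exp(2j * cmath.pi * t / d) for t in range(d)]


def as_small_rational(z, max_den=1000, tol=1e-9):
    """Numerical stand-in for an exact test: a nearby rational, or None."""
    if abs(z.imag) > tol:
        return None
    q = Fraction(z.real).limit_denominator(max_den)
    return q if abs(z.real - q) <= tol else None


def brute_force_candidates(gamma_conj, basis_conj, d):
    """gamma_conj[i] approximates sigma_i(gamma); basis_conj[j][i] approximates sigma_i(beta_j).

    Returns the inverses of all nonzero rational values of
    (sum_i zeta_i * sigma_i(beta_j) * root_i)**d over all tuples of d-th roots of unity.
    """
    roots = [complex(g) ** (1.0 / d) for g in gamma_conj]   # fixed principal d-th roots
    found = set()
    for zetas in product(roots_of_unity(d), repeat=len(gamma_conj)):
        for conj in basis_conj:
            s = sum(z * b * r for z, b, r in zip(zetas, conj, roots))
            q = as_small_rational(s ** d)
            if q is not None and q != 0:
                found.add(1 / q)                             # candidate denesting element
    return found
```

For $\gamma = 3 + 2\sqrt{2}$ in $\mathbb{Q}(\sqrt{2})$ with basis $1, \sqrt{2}$ and $d = 2$, the candidates differ from one another by rational squares and by the factor $2 = (\sqrt{2})^2$, in line with Lemma 6.2.2.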

For the general case we do not know any better way to compute a denesting element than this brute force method. In case $E$ is a simple radical extension, however, we can do better. In the next subsection we give a characterization of admissible sequences that allows us to determine a superset of the set of normalized admissible sequences of size at most $d^3$. Then the process sketched above needs to be applied only to these sequences.

6.4 Characterizing Admissible Sequences

Throughout this subsection let $F$ be a field and $E = F(\sqrt[m]{\rho})$ a simple radical extension of $F$ such that $\sqrt[m]{\rho}$ is of degree $m$ over $F$. As usual $\gamma$ is an element of $E$. Before we can prove the main result on admissible sequences we need two auxiliary lemmata.

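Lemma 6.4.1 If $(\zeta^{(0)}, \zeta^{(1)}, \zeta^{(2)}, \ldots, \zeta^{(m-1)})$ is an admissible sequence for $\sqrt[d]{\gamma}$ then for any two indices $i, j$
$$\left(\zeta^{(i)}\right)^{d-1}\left(\sqrt[d]{\sigma_i(\gamma)}\right)^{d-1}\zeta^{(j)}\sqrt[d]{\sigma_j(\gamma)} \in F(\sqrt[m]{\rho}, \zeta_m).$$
Here $\zeta_m$ denotes a primitive $m$-th root of unity.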

Proof: Since $(\zeta^{(0)}, \zeta^{(1)}, \zeta^{(2)}, \ldots, \zeta^{(m-1)})$ is an admissible sequence there exist elements $\gamma_0 \in F$, $\eta \in F(\sqrt[m]{\rho})$ such that $\zeta^{(i)}\sqrt[d]{\gamma_0}\sqrt[d]{\sigma_i(\gamma)} = \sigma_i(\eta)$ for all $i = 0, 1, \ldots, m-1$. Hence
$$\left(\zeta^{(i)}\right)^{d-1}\left(\sqrt[d]{\gamma_0}\right)^{d-1}\left(\sqrt[d]{\sigma_i(\gamma)}\right)^{d-1}\zeta^{(j)}\sqrt[d]{\gamma_0}\sqrt[d]{\sigma_j(\gamma)} = \gamma_0\left(\zeta^{(i)}\right)^{d-1}\left(\sqrt[d]{\sigma_i(\gamma)}\right)^{d-1}\zeta^{(j)}\sqrt[d]{\sigma_j(\gamma)} \in F(\sigma_i(\eta), \sigma_j(\eta)).$$
Since $\gamma_0 \in F$ we get $\left(\zeta^{(i)}\right)^{d-1}\left(\sqrt[d]{\sigma_i(\gamma)}\right)^{d-1}\zeta^{(j)}\sqrt[d]{\sigma_j(\gamma)} \in F(\sigma_i(\eta), \sigma_j(\eta))$, too. Because the conjugate fields of $F(\sqrt[m]{\rho})$ are of the form $F(\zeta_m^i\sqrt[m]{\rho})$,
$$F(\sigma_i(\eta), \sigma_j(\eta)) \subseteq F(\sqrt[m]{\rho}, \zeta_m)$$
and the lemma follows.

As before we are basically interested in two different cases. First, $\zeta_m \in F$, equivalently, $F(\zeta_m) = F$, where $\zeta_m$ is a primitive $m$-th root of unity, and, second, $F \subseteq \mathbb{R}$. In the latter case we need to determine the degree of $\sqrt[m]{\rho}$ over $F(\zeta_m)$. We show that this degree is either $m$ or $\frac{m}{2}$.

Lemma 6.4.2 Let $F$ be a real field. Suppose $f(X) = X^m - \rho$, $\rho \in F$, $\rho > 0$, is irreducible in $F[X]$. Furthermore let $\zeta_m$ be a primitive $m$-th root of unity. Then $f(X)$ is either irreducible in $F(\zeta_m)[X]$, or $\sqrt{\rho} \in F(\zeta_m)$ and $f(X)$ factors as $(X^{m/2} - \sqrt{\rho})(X^{m/2} + \sqrt{\rho})$ in $F(\zeta_m)[X]$. In the latter case $m$ must be even.

Proof: Consider the degree of $\sqrt[m]{\rho}$ over $F(\zeta_m)$. By Theorem 3.3, p. 24, this is the smallest positive integer $e$ such that $\sqrt[m]{\rho}^{\,e} \in F(\zeta_m)$. Moreover, $e$ divides $m$.

By Lemma 3.12, p. 32, if $\sqrt[m]{\rho}^{\,e} \in F(\zeta_m)$ then it must be a square root of an element in $F$. Therefore $\sqrt[m]{\rho}^{\,2e} \in F$. Since we assume that the degree of $\sqrt[m]{\rho}$ over $F$ is $m$ it follows $m \mid 2e$. But $e \mid m$, hence $\frac{m}{e} = 1$ or $\frac{m}{e} = 2$, which proves the lemma.

As follows from the lemma whenever $m$ is odd then the degree of $\sqrt[m]{\rho}$ over $F$ and the degree of $\sqrt[m]{\rho}$ over $F(\zeta_m)$ are the same. The following example shows that the second situation described by the lemma can also occur. Consider the positive 10-th root of 5. Its degree over $\mathbb{Q}$ is 10. On the other hand, $\mathbb{Q}(\zeta_{10})$ contains the square roots of 5. Hence the degree of $\sqrt[10]{5}$ over the 10-th cyclotomic field is 5. The fact that $\mathbb{Q}(\zeta_{10})$ contains the square roots of 5 follows from a general result in algebraic number theory but it can also be seen directly as follows.

The fifth cyclotomic field $\mathbb{Q}(\zeta_5)$ is a subfield of $\mathbb{Q}(\zeta_{10})$. Since $\zeta_5^4 + \zeta_5^3 + \zeta_5^2 + \zeta_5 + 1 = 0$, the element $\zeta_5 + \zeta_5^{-1} = \zeta_5 + \zeta_5^4$ satisfies $(\zeta_5 + \zeta_5^{-1})^2 + (\zeta_5 + \zeta_5^{-1}) - 1 = 0$. Hence it is a root of $X^2 + X - 1$ and can be written as $\frac{1}{2}(-1 + \sqrt{5})$ for the positive or negative square root of 5. For this square root, $\sqrt{5} = 2(\zeta_5 + \zeta_5^{-1}) + 1 \in \mathbb{Q}(\zeta_5) \subset \mathbb{Q}(\zeta_{10})$.
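The identity is easy to confirm numerically. The short Python check below is only a floating-point sanity test, not a proof; the function names are hypothetical. It evaluates $2(\zeta + \zeta^{-1}) + 1$ for $\zeta = e^{2\pi i k/5}$ and the residual of $X^2 + X - 1$ at $\zeta + \zeta^{-1}$.

```python
import cmath


def sqrt5_from_zeta5(k: int = 1) -> complex:
    """Return 2*(z + 1/z) + 1 for z = exp(2*pi*i*k/5)."""
    z = cmath.exp(2j * cmath.pi * k / 5)
    return 2 * (z + 1 / z) + 1


def quadratic_residual(k: int = 1) -> float:
    """Absolute value of x^2 + x - 1 at x = z + 1/z, z = exp(2*pi*i*k/5)."""
    z = cmath.exp(2j * cmath.pi * k / 5)
    x = z + 1 / z
    return abs(x * x + x - 1)
```

Choosing `k = 1` gives a value close to the positive square root of 5, while `k = 2` (another primitive fifth root of unity) gives a value close to the negative one, which matches the remark about "the positive or negative square root".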

Lemma 6.4.3 Let $F \subseteq \mathbb{R}$, $\rho \in F$, $\rho > 0$, $\sqrt[m]{\rho} \in \mathbb{R}$, $\gamma \in F(\sqrt[m]{\rho})$, $d \in \mathbb{N}$, such that the degree of $\sqrt[m]{\rho}$ over $F$ is $m$. Any two normalized admissible sequences for $\sqrt[d]{\gamma}$ over $F$ that have a common prefix $(1, \zeta^{(1)}, \zeta^{(2)}, \zeta^{(3)})$ are equal.

If $F \subset \mathbb{C}$ contains a primitive $m$-th root of unity and $\rho \in F$, $\gamma \in F(\sqrt[m]{\rho})$, $d \in \mathbb{N}$, then $\zeta^{(1)}$ alone determines a normalized admissible sequence for $\sqrt[d]{\gamma}$ over $F$.

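Proof: We will prove the lemma first for a real field $F$.

Let $(1, \zeta^{(1)}, \zeta^{(2)}, \zeta^{(3)})$ be a common prefix of two admissible sequences
$$\left(1, \zeta^{(1)}, \zeta^{(2)}, \zeta^{(3)}, \zeta^{(4)}, \ldots, \zeta^{(m-1)}\right), \quad \left(1, \zeta'^{(1)}, \zeta'^{(2)}, \zeta'^{(3)}, \zeta'^{(4)}, \ldots, \zeta'^{(m-1)}\right),$$
hence $\zeta'^{(i)} = \zeta^{(i)}$, $i = 1, 2, 3$.

First we show by induction on $l = 0, 1, \ldots, \lfloor\frac{m}{2}\rfloor$, that $\zeta^{(2l)} = \zeta'^{(2l)}$. By assumption, $\zeta^{(0)} = \zeta'^{(0)} = 1$, and $\zeta^{(2)} = \zeta'^{(2)}$. Now assume that $\zeta^{(2l)} = \zeta'^{(2l)}$ has already been shown for all $l = 0, 1, \ldots, h-1$. By definition of admissible sequences elements $\rho_1, \rho_2 \in F$ and $\eta_1, \eta_2 \in F(\sqrt[m]{\rho})$ exist such that
$$\zeta^{(2(h-j))}\,\sqrt[d]{\rho_1}\,\sqrt[d]{\sigma_{2(h-j)}(\gamma)} = \sigma_{2(h-j)}(\eta_1),$$
$$\zeta^{(2(h-j))}\,\sqrt[d]{\rho_2}\,\sqrt[d]{\sigma_{2(h-j)}(\gamma)} = \sigma_{2(h-j)}(\eta_2), \quad \text{for } j = 1, 2,$$
and
$$\zeta^{(2h)}\,\sqrt[d]{\rho_1}\,\sqrt[d]{\sigma_{2h}(\gamma)} = \sigma_{2h}(\eta_1),$$
$$\zeta'^{(2h)}\,\sqrt[d]{\rho_2}\,\sqrt[d]{\sigma_{2h}(\gamma)} = \sigma_{2h}(\eta_2).$$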

W.l.o.g. we may assume that the isomorphism $\sigma_i$ of $F(\sqrt[m]{\rho})$ over $F$ is given by $\sigma_i(\sqrt[m]{\rho}) = \zeta_m^i\sqrt[m]{\rho}$, $i = 0, 1, \ldots, m-1$, for some fixed primitive $m$-th root of unity $\zeta_m$ (recall that the degree of $\sqrt[m]{\rho}$ over $F$ is $m$).

By Lemma 6.4.1 the first equations imply
$$\frac{1}{\rho_1}\,\sigma_{2(h-1)}(\eta_1)\left(\sigma_{2(h-2)}(\eta_1)\right)^{d-1} = \zeta^{(2(h-1))}\left(\zeta^{(2(h-2))}\right)^{d-1}\sqrt[d]{\sigma_{2(h-1)}(\gamma)}\left(\sqrt[d]{\sigma_{2(h-2)}(\gamma)}\right)^{d-1} = \frac{1}{\rho_2}\,\sigma_{2(h-1)}(\eta_2)\left(\sigma_{2(h-2)}(\eta_2)\right)^{d-1} = \eta \in F(\zeta_m, \sqrt[m]{\rho}).$$

By the previous lemma a field isomorphism $\tau$ of $F(\zeta_m, \sqrt[m]{\rho})$ over $F(\zeta_m)$ exists that maps $\sqrt[m]{\rho}$ onto $\zeta_m^2\sqrt[m]{\rho}$.

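We want to apply $\tau$ to the equations above. $\sigma_i(\eta_1)$ and $\sigma_i(\eta_2)$ are elements of $F(\zeta_m, \sqrt[m]{\rho})$ for all embeddings $\sigma_i$ and we only have to determine the images of $\sigma_i(\eta_1)$, $\sigma_i(\eta_2)$ under $\tau$. If $\eta_1 = \sum_{j=0}^{m-1} r_j\sqrt[m]{\rho}^{\,j}$, $r_j \in F$, then $\sigma_i(\eta_1) = \sum_{j=0}^{m-1} r_j\zeta_m^{ij}\sqrt[m]{\rho}^{\,j}$. Since $\tau$ is a homomorphism
$$\tau(\sigma_i(\eta_1)) = \sum_{j=0}^{m-1} r_j\,\tau(\zeta_m^{ij})\,\tau(\sqrt[m]{\rho}^{\,j}).$$
Hence by definition of $\tau$
$$\tau(\sigma_i(\eta_1)) = \sum_{j=0}^{m-1} r_j\zeta_m^{ij}\left(\zeta_m^2\sqrt[m]{\rho}\right)^{j} = \sum_{j=0}^{m-1} r_j\zeta_m^{(i+2)j}\sqrt[m]{\rho}^{\,j} = \sigma_{i+2}(\eta_1).$$
Likewise $\tau(\sigma_i(\eta_2)) = \sigma_{i+2}(\eta_2)$. If necessary the indices $i$ and $i+2$ have to be taken mod $m$.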

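Therefore for $j = 1, 2$,
$$\tau\left(\frac{1}{\rho_j}\,\sigma_{2(h-1)}(\eta_j)\left(\sigma_{2(h-2)}(\eta_j)\right)^{d-1}\right) = \frac{1}{\rho_j}\,\sigma_{2h}(\eta_j)\left(\sigma_{2(h-1)}(\eta_j)\right)^{d-1} = \tau(\eta).$$

On the other hand, by definition of $\rho_1, \rho_2$ and $\eta_1, \eta_2$, and by the formulas for $\tau(\sigma_i(\eta_1))$, $\tau(\sigma_i(\eta_2))$,
$$\tau(\eta) = \frac{1}{\rho_1}\,\sigma_{2h}(\eta_1)\left(\sigma_{2(h-1)}(\eta_1)\right)^{d-1} = \zeta^{(2h)}\sqrt[d]{\sigma_{2h}(\gamma)}\left(\zeta^{(2(h-1))}\sqrt[d]{\sigma_{2(h-1)}(\gamma)}\right)^{d-1}$$
and
$$\tau(\eta) = \frac{1}{\rho_2}\,\sigma_{2h}(\eta_2)\left(\sigma_{2(h-1)}(\eta_2)\right)^{d-1} = \zeta'^{(2h)}\sqrt[d]{\sigma_{2h}(\gamma)}\left(\zeta^{(2(h-1))}\sqrt[d]{\sigma_{2(h-1)}(\gamma)}\right)^{d-1}.$$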

Therefore
$$\zeta^{(2h)}\sqrt[d]{\sigma_{2h}(\gamma)}\left(\zeta^{(2(h-1))}\sqrt[d]{\sigma_{2(h-1)}(\gamma)}\right)^{d-1} = \zeta'^{(2h)}\sqrt[d]{\sigma_{2h}(\gamma)}\left(\zeta^{(2(h-1))}\sqrt[d]{\sigma_{2(h-1)}(\gamma)}\right)^{d-1}.$$
This implies
$$\zeta^{(2h)} = \zeta'^{(2h)},$$
which was to be shown.

To prove the claim also for the odd indices we can use the same proof except that we start with the fact that $\zeta^{(1)} = \zeta'^{(1)}$ and $\zeta^{(3)} = \zeta'^{(3)}$.

To prove the lemma for complex fields containing a primitive $m$-th root of unity $\zeta_m$ note $F = F(\zeta_m)$. Hence the mapping $\tau$ from above may simply be chosen as the isomorphism mapping $\sqrt[m]{\rho}$ onto $\zeta_m\sqrt[m]{\rho}$. Accordingly, $\tau(\sigma_i(\eta)) = \sigma_{i+1}(\eta)$ for all elements $\eta$ in $F(\sqrt[m]{\rho})$ and we need not distinguish between odd and even indices.

Clearly, the proof for Lemma 6.4.3 shows that in case $F, E \subset \mathbb{R}$ the prefix $(\zeta^{(0)}, \zeta^{(1)}, \zeta^{(2)}, \zeta^{(3)})$ determines uniquely an admissible sequence. Accordingly, in the complex case any two admissible sequences with common prefix $(\zeta^{(0)}, \zeta^{(1)})$ are equal. But we will use the result only for normalized admissible sequences.

The proof also shows that if the degree of $\sqrt[m]{\rho}$ over $F(\zeta_m)$ is $m$ then in the real case $\zeta^{(1)}$ alone determines a normalized admissible sequence, too. In a practical implementation of the algorithms described below this may be interesting for reasons of efficiency. But for our purposes, which is basically a polynomial time algorithm, the lemma above is sufficient. Hence we will not pursue this observation in the sequel.

Observe that the proof of Lemma 6.4.3 more or less is an algorithm to compute the superset for the set of admissible sequences. As in case of Lemma 6.3.2 we explain the details in the next section when analyzing the denesting algorithms.

Since in the real case any normalized admissible sequence is determined by the elements $\zeta^{(1)}, \zeta^{(2)}, \zeta^{(3)}$ the number of normalized admissible sequences for simple radical extensions is bounded by $d^3$. Combining this with Lemma 6.3.2 would lead to a denesting set of size $md^3$ if any triple $(\zeta^{(1)}, \zeta^{(2)}, \zeta^{(3)})$ determined an admissible sequence. But we know already from Lemma 6.2.2 that the size of a denesting set can be bounded by $m$. Hence most triples will not lead to a normalized admissible sequence and the question is whether we can do better than Lemma 6.4.3. For an arbitrary field $F$ so far we cannot, but if $F$ is itself a real radical extension $\mathbb{Q}(\sqrt[d_1]{q_1}, \ldots, \sqrt[d_k]{q_k})$ of $\mathbb{Q}$ and $\sqrt[m]{\rho}$ is also a radical over $\mathbb{Q}$ we will show in the next section that we can compute efficiently a superset of the set of normalized admissible sequences of size at most $(24m)^3$. This is still far from $m$ but unlike $d^3$ it is independent of $d$, which is what we would expect.

The result is based on the following observation.


Assume that there are more than $R$ normalized admissible sequences such that any two of them differ in the $(i+1)$-st component, where $i$ is fixed. By Lemma 6.4.1 this implies that more than $R$ different $d$-th roots of unity, say $\zeta^{(1)}, \ldots, \zeta^{(R+1)}$, exist such that
$$\sqrt[d]{\gamma}^{\,d-1}\,\zeta^{(j)}\,\sqrt[d]{\sigma_i(\gamma)} \in F(\sqrt[m]{\rho}, \zeta_m), \quad j = 1, \ldots, R+1.$$
Hence any ratio of two of these elements is in $F(\sqrt[m]{\rho}, \zeta_m)$. Especially, the ratio of any element with the first one is contained in $F(\sqrt[m]{\rho}, \zeta_m)$. These ratios are
$$\frac{\zeta^{(1)}}{\zeta^{(j)}}, \quad j = 1, \ldots, R+1,$$
which, by the assumption that the admissible sequences differ in the $(i+1)$-st component, must be different $d$-th roots of unity. Hence $F(\sqrt[m]{\rho}, \zeta_m)$ contains $R+1$ different $d$-th roots of unity. Applying the following theorem, which will be proven in an appendix, shows that $R \leq 24m$.

Theorem 6.4.4 Let $F = \mathbb{Q}(\sqrt[d_1]{q_1}, \ldots, \sqrt[d_k]{q_k})$ be a real radical extension of $\mathbb{Q}$. If $\zeta_m$ is a primitive $m$-th root of unity then $F(\zeta_m)$ contains at most $24m$ different roots of unity. Moreover, the constant 24 is best possible, i.e., there are real radical extensions $F$ of $\mathbb{Q}$ and $m \in \mathbb{N}$ such that $F(\zeta_m)$ contains exactly $24m$ different roots of unity.

In particular, for each of the components 2, 3, and 4, which determine the whole sequence, there exist only $24m$ possible $d$-th roots of unity and we can compute these roots by determining for which $d$-th roots of unity $\zeta$
$$\zeta\,\sqrt[d]{\gamma}^{\,d-1}\,\sqrt[d]{\sigma_i(\gamma)}$$
is an element of $F(\sqrt[m]{\rho}, \zeta_m)$. If we denote the roots that pass the test for $\sigma_1, \sigma_2, \sigma_3$ by $Z_1, Z_2, Z_3$, respectively, then in order to compute a superset of normalized admissible sequences we have to consider only the $(24m)^3$ triples in $Z_1 \times Z_2 \times Z_3$.

We are now in a position to outline the basic denesting algorithms in case $F$ is an algebraic number field $\mathbb{Q}(\alpha)$. A more detailed description of the single steps and a careful analysis will be given in the next section.


6.5 Denesting Radicals - The Algorithms

We are now in a position to outline the basic denesting algorithms in case $F$ is an algebraic number field $\mathbb{Q}(\alpha)$. Explanations of the algorithms will be given partly in the descriptions themselves (as parenthetical remarks) and partly at the end of the descriptions. A detailed description of the single steps and a careful analysis will be given in the next section.

Let us begin with an algorithm that computes for a nested radical $\sqrt[d]{\gamma}$ over an algebraic number field $F = \mathbb{Q}(\alpha)$ an element denesting it.

Algorithm Denesting Element

Input A nested radical $\sqrt[d]{\gamma}$, where $\gamma$ is a rational expression in the radicals $\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_K]{\rho_K}$, $\rho_i \in F$. Either $F$ is real, the radicals $\sqrt[d_i]{\rho_i}$ are positive real radicals, and $\sqrt[d]{\gamma} \in \mathbb{R}$, or $F$ contains a primitive $d$-th root of unity and $d_i$ divides $d$ for all $i$.

Output “Yes”, and elements $\gamma_0 \in F$, $\eta \in F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_K]{\rho_K})$ such that $\sqrt[d]{\gamma_0}\sqrt[d]{\gamma} = \eta$ if some element in $F$ leads to a denesting, “No”, otherwise.

Step 1 Denote $F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_i]{\rho_i})$ by $F^{(i)}$. Determine for each $i$ the relative degree $n_i = [F^{(i)} : F^{(i-1)}]$ and a primitive element $\eta_i$ of $F^{(i)}$ over $\mathbb{Q}$ represented by its minimal polynomial $p_i$.

(This is basically a preprocessing step. The elements $\eta_i$ are needed for applying the algorithm leading to Theorem 5.2.6, p. 61, to the fields $F^{(i)}$.)

Renumber the radicals such that $n_i > 1$ for the first $k$ indices and $n_i = 1$ for the remaining indices. Hence $F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_K]{\rho_K}) = F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k})$. Also compute the set $E^{(i)}$ of those $s \in \mathbb{N}$, $s < n_i$, such that $\sqrt[d_i]{\rho_i}^{\,sd} \in F^{(i-1)}$.

(See Lemma 6.2.2 and observe that $1, \sqrt[d_i]{\rho_i}, \ldots, \sqrt[d_i]{\rho_i}^{\,n_i-1}$ is a basis for $F^{(i)}$ over $F^{(i-1)}$.)

Using Theorem 5.2.6, p. 61, compute the representation of $\gamma$ with respect to the primitive element $\eta_k$ of $F^{(k)}$ and determine a rational integer $C$ such that $\gamma^{(k)} = C\gamma$ is an algebraic integer.

(The integer $C$ will prevent an exponential coefficient growth throughout the next steps of the algorithm.)


Step 2 Set $D^{(k)} := \{\gamma^{(k)}\}$. For $i = 0, 1, \ldots, k-1$ do the following:

    Using the characterization in Lemma 6.4.3 and the algorithm leading to Theorem 5.2.6, p. 61, compute for all elements $\gamma^{(k-i)}$ in $D^{(k-i)}$ a superset $A$ of the set of normalized admissible sequences for $\sqrt[d]{\gamma^{(k-i)}}$ over $F^{(k-i-1)}$.

    For each element $\gamma^{(k-i)}$ in $D^{(k-i)}$ do the following:

        Until a denesting element for $\sqrt[d]{\gamma^{(k-i)}}$ over $F^{(k-i-1)}$ has been found, using the method of Theorem 5.2.6, p. 61, determine for all sequences $(\zeta^{(0)}, \zeta^{(1)}, \ldots, \zeta^{(n_{k-i}-1)})$ in $A$, $\zeta^{(0)} = 1$, and for all $r \in \{0, 1, \ldots, n_{k-i}-1\}$ the exact representation of the element in $F^{(k-i-1)}$ that
        $$\left(\sum_{j=0}^{n_{k-i}-1}\zeta^{(j)}\,\sigma_j\!\left(\sqrt[d_{k-i}]{\rho_{k-i}}^{\,r}\right)\sqrt[d]{\sigma_j(\gamma^{(k-i)})}\right)^{d}$$
        has to be if it is an element of $F^{(k-i-1)}$. The representation is with respect to the primitive element $\eta_{k-i-1}$ for $F^{(k-i-1)}$. Determine whether the inverse of this element in $F^{(k-i-1)}$ denests $\sqrt[d]{\gamma^{(k-i)}}$ over $F^{(k-i-1)}$.

        (By Lemma 6.3.2 and Remark 6.3.3 we know that if a denesting element for $\sqrt[d]{\gamma^{(k-i)}}$ exists then the inverse of at least one denesting element can be described by the formula above.)

        If the inverse $\gamma^{(k-i-1)}$ of a denesting element has been found, set
        $$D_{\gamma^{(k-i)}} = \left\{\gamma^{(k-i-1)}\,\sqrt[d_{k-i}]{\rho_{k-i}}^{\,sd} \;\middle|\; s \in E^{(k-i)}\right\}.$$

        (By Lemma 6.2.2 the set $D_{\gamma^{(k-i)}}$ contains the inverses of an irreducible denesting set for $\sqrt[d]{\gamma^{(k-i)}}$ over $F^{(k-i-1)}$.)

    Set
    $$D^{(k-i-1)} = \bigcup_{\gamma^{(k-i)} \in D^{(k-i)}} D_{\gamma^{(k-i)}}.$$

    If $D^{(k-i-1)}$ is empty, stop and output “No”, otherwise proceed with the next $i$.

Step 3 Using the algorithm leading to Theorem 5.2.6, p. 61, decide whether some element $\gamma^{(0)}$ in $D^{(0)}$ has the property
$$\sqrt[d]{\frac{C}{\gamma^{(0)}}}\,\sqrt[d]{\gamma} = \eta \in F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k}).$$
If this is the case determine the representation of $\sqrt[d]{C/\gamma^{(0)}}\,\sqrt[d]{\gamma}$ as a linear combination of the elements of the standard basis, and output this representation and $\frac{C}{\gamma^{(0)}}$; otherwise output “No”.

Observe that if $D$ is a denesting set for $\sqrt[d]{C\gamma}$ then multiplying the elements in $D$ by $C$ yields a denesting set for $\sqrt[d]{\gamma}$. Therefore the proof of correctness for this algorithm follows immediately from Lemma 6.2.4 and Lemma 6.3.2 and Remark 6.3.3. In fact, since $1, \sqrt[d_{k-i}]{\rho_{k-i}}, \ldots, \sqrt[d_{k-i}]{\rho_{k-i}}^{\,n_{k-i}-1}$ is a basis for $F^{(k-i)}$ over $F^{(k-i-1)}$ it follows from Lemma 6.3.2 and Remark 6.3.3 that if a nested radical $\sqrt[d]{\gamma^{(k-i)}}$ can be denested over $F^{(k-i-1)}$ by a denesting element then such a denesting will be computed in Step 2. From Lemma 6.2.4 it follows that the inverses in $D^{(0)}$ are a superset of a denesting set for $\gamma^{(k)} = C\gamma$.

As mentioned in the algorithm, computing a denesting set for $\sqrt[d]{C}\,\sqrt[d]{\gamma}$ rather than for $\sqrt[d]{\gamma}$ avoids an exponential growth in the coefficient size of the elements in $D^{(i)}$.

The last step has been included since it is crucial for the General Denesting Algorithm to be described below.

The set $D^{(k)}$ has one element and the set $D^{(k-1)}$ has at most $n_k$ elements, which is the degree of $F^{(k)}$ over $F^{(k-1)}$ (see Corollary 6.2.3). By induction on $i$ one shows that the set $D^{(k-i)}$ has at most $\prod_{j=0}^{i-1} n_{k-j}$ elements. Hence admissible sequences have to be computed for at most
$$\sum_{i=0}^{k-1}|D^{(k-i)}| \leq N\sum_{i=1}^{k}\frac{1}{n_1 n_2\cdots n_i} \leq N\sum_{i=1}^{k}\frac{1}{2^i} \leq N$$
elements.


And if for each of the elements in $D^{(k-i)}$ the admissible sequence determined by the prefix $(1, \zeta^{(1)}, \zeta^{(2)}, \zeta^{(3)})$ or $(1, \zeta^{(1)})$ can be computed efficiently, then at each level of the algorithm we have to check for at most $d^3 n_{k-i}|D^{(k-i)}|$, $i = 0, \ldots, k-1$, or $d\,n_{k-i}|D^{(k-i)}|$, $i = 0, \ldots, k-1$, different complex numbers fitting into the description of Lemma 6.3.2 whether they correspond to a denesting element in $F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_{k-i-1}]{\rho_{k-i-1}})$. The factor $n_{k-i}$ occurs because we have to apply the formula in Lemma 6.3.2 for all combinations of normalized admissible sequences and elements of a basis of $F^{(k-i)}$ over $F^{(k-i-1)}$.

Hence overall this step has to be done at most
$$d^3 N\left(1 + \frac{1}{n_1} + \frac{1}{n_1 n_2} + \ldots + \frac{1}{n_1 n_2\cdots n_{k-1}}\right) \leq 2d^3 N$$
times in the real case, and accordingly, at most $2dN$ times in the complex case. Since $|D^{(0)}| \leq N$ the last step has to be applied at most $N$ times.
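The two counting bounds can be checked for concrete degree sequences. The Python sketch below is only an arithmetic illustration: the function names are hypothetical, `degrees` stands for the relative degrees $n_1, \ldots, n_k$ (each at least 2), and the set sizes are replaced by their upper bounds $\prod_{j=0}^{i-1} n_{k-j}$.

```python
from math import prod


def level_size_bounds(degrees):
    """Upper bounds on |D^(k)|, |D^(k-1)|, ..., |D^(1)| for degrees = [n_1, ..., n_k]."""
    k = len(degrees)
    # The bound for D^(k-i) is n_k * n_(k-1) * ... * n_(k-i+1); the empty product is 1.
    return [prod(degrees[k - i:]) for i in range(k)]


def work_bounds(degrees, d):
    """Return (N, number of elements needing admissible sequences, number of real-case checks)."""
    k = len(degrees)
    n_total = prod(degrees)                      # N = n_1 * ... * n_k
    sizes = level_size_bounds(degrees)
    elements = sum(sizes)
    # At level i at most d^3 * n_(k-i) * |D^(k-i)| numbers are checked in the real case.
    real_checks = d ** 3 * sum(degrees[k - 1 - i] * sizes[i] for i in range(k))
    return n_total, elements, real_checks
```

For `degrees = [2, 3, 2]` and `d = 3` this gives $N = 12$, at most 9 elements requiring admissible sequences, and at most 540 checks in the real case, below the bound $2d^3N = 648$.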

Remark 6.5.1 In the complex case, the algorithm can be improved. $D^{(0)}$ is a superset of a denesting set of $\sqrt[d]{\gamma}$. On the other hand, it contains at most $N$ elements. By Corollary 6.2.3 in this case any denesting set has size at least $N$. Therefore, $D^{(0)}$ must be an irreducible denesting set for $\sqrt[d]{\gamma}$. Hence if $\sqrt[d]{\gamma_0}\sqrt[d]{\gamma} \in F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k})$ for some $\gamma_0 \in F$ then any inverse of an element in $D^{(0)}$ must denest $\sqrt[d]{\gamma}$. Therefore we only need to compute a single element of $D^{(0)}$. Accordingly, in Step 2 on each level we compute the inverse of a single element leading to a denesting, i.e., for all $i$ the set $D^{(i)}$ consists of a single element. This observation saves us basically a factor of $N$ in the final run time. We will analyze this slightly modified algorithm in the complex case. ♦

By Theorem 6.1.4 in the complex case the Algorithm Denesting Element finds already a denesting of $\sqrt[d]{\gamma}$ using radicals of the form $\sqrt[t]{\rho}$, $\rho \in F$, $t \in \mathbb{N}$, $t$ dividing $d$, if any such denesting exists. In the real case however, the Algorithm Denesting Element does not necessarily find a denesting using only real radicals if any such denesting exists.

The correctness of the following Algorithm Real Denesting is immediate from Theorem 6.1.1.

Algorithm Real Denesting

Input A nested radical $\sqrt[d]{\gamma}$ as in the real case of the Algorithm Denesting Element.


Output A denesting for $\sqrt[d]{\gamma}$ using real radicals if such a denesting exists.

Step 1 Apply Step 1 from the Algorithm Denesting Element. Denote by $B = \{\beta_0, \beta_1, \ldots, \beta_{N-1}\}$ the standard basis for the radical extension $F^{(k)}$ over $F$.

Step 2 Applying Step 2 and Step 3 of the Algorithm Denesting Element check whether any nested radical $\sqrt[d]{\beta_j\gamma}$ for $j = 0, 1, 2, \ldots, N-1$, denests.

If this is not the case, output “No”.

If some $\sqrt[d]{\beta_j\gamma}$ can be denested by an element in $F$ output this denesting and $\beta_j$ as a denesting for $\gamma$.

By the description of Step 2 of the Algorithm Real Denesting we mean that Step 2 and Step 3 of the Algorithm Denesting Element are successively applied with $D^{(k)} = \{\beta_j\gamma^{(k)}\}$, $j = 0, 1, \ldots, N-1$. The reader may wonder why we do not simply apply the Algorithm Denesting Element to all radicals $\sqrt[d]{\beta_j\gamma}$. The answer is that Step 1 of the Algorithm Denesting Element is the same for all these radicals.


The correctness of the following General Denesting Algorithm follows directly from Theorem 6.1.2 and its complex equivalent Theorem 6.1.5.

The General Denesting Algorithm

Input $k$ depth 2 nested radicals $\sqrt[d_i]{\gamma_i}$ over a field $F$ and $k$ depth 1 radical expressions $\kappa_i$ over $F$. If $F$ is real then $\sqrt[d_i]{\gamma_i} \in \mathbb{R}$ and the radicals appearing in $\gamma_i$ or $\kappa_i$ are real. If $F$ is not real then it contains a primitive $d$-th root of unity, all $d_i = d$ and the depth 1 radicals appearing in $\gamma_i$ or $\kappa_i$ are of the form $\sqrt[d']{\rho}$, $\rho \in F$, $d' \mid d$.

Output In the real case, a denesting of $S = \sum_{i=1}^{k}\kappa_i\sqrt[d_i]{\gamma_i}$ using only real radicals if any exists; in the complex case, a denesting of $S = \sum_{i=1}^{k}\kappa_i\sqrt[d]{\gamma_i}$ using only radicals of the form $\sqrt[d']{\rho}$, $\rho \in F$, $d' \mid d$, if any exists.

Step 1 Use the following procedure to partition the set of nested radicals $\{\sqrt[d_1]{\gamma_1}, \sqrt[d_2]{\gamma_2}, \ldots, \sqrt[d_k]{\gamma_k}\}$ into subsets $R_t$ such that two nested radicals are in the same subset if and only if their ratio can be written either as a sum of real depth 1 radicals over $F$ in the real case, or as a sum of depth 1 radicals of the form $\sqrt[d']{\rho}$, $\rho \in F$, for some $d' \mid d$, in the complex case.

In the complex case, apply the Algorithm Denesting Element to
$$\sqrt[d]{\gamma_i\gamma_t^{d-1}} = \gamma_t\,\frac{\sqrt[d]{\gamma_i}}{\sqrt[d]{\gamma_t}}$$
for all pairs $\gamma_i, \gamma_t$ of radicals. If a radical denests denote the denesting by $\gamma_{it}$.

In the real case, for any pair of radicals determine whether
$$\sqrt[d_i d_t]{\gamma_i^{d_t}\gamma_t^{d_i(d_t-1)}} = \gamma_t\,\frac{\sqrt[d_i]{\gamma_i}}{\sqrt[d_t]{\gamma_t}}$$
denests using real radicals by applying the Algorithm Real Denesting. Again denote a denesting by $\gamma_{it}$.

Step 2 Assume $\sqrt[d_t]{\gamma_t} \in R_t$, for $t = 1, 2, \ldots, h$, where $h$ is the number of subsets in the partition of Step 1.

Write
$$S = \sum_{t=1}^{h}\frac{\sqrt[d_t]{\gamma_t}}{\gamma_t}\sum_{i \in \mathbb{N} \,\mid\, \sqrt[d_i]{\gamma_i} \in R_t}\kappa_i\gamma_{it}.$$

Determine whether any $\sqrt[d_t]{\gamma_t}$, $t = 1, 2, \ldots, h$, denests. If so, let $\sqrt[d_1]{\gamma_1}$ be the one that denests. (Observe that only one nested radical $\sqrt[d_t]{\gamma_t}$, $t = 1, \ldots, h$, can denest. If two of them did, so would their ratio, which is impossible because of the partition determined in Step 1.)


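Step 3 Transform each expression $\kappa_i$ into a sum of radicals.

If $\sqrt[d_1]{\gamma_1}$ denests, check whether
$$\sum_{i \in \mathbb{N} \,\mid\, \sqrt[d_i]{\gamma_i} \in R_t}\kappa_i\gamma_{it} = 0$$
for all $t \geq 2$. If so output $\frac{\sqrt[d_1]{\gamma_1}}{\gamma_1}\sum_{i \in \mathbb{N} \,\mid\, \sqrt[d_i]{\gamma_i} \in R_1}\kappa_i\gamma_{i1}$ as the denesting for $S$. If at least one sum is not zero output that $S$ cannot be denested.

If $\sqrt[d_1]{\gamma_1}$ does not denest, check whether all sums are zero and, if so, output zero as a denesting for $S$, otherwise output that it cannot be denested.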
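The partition of Step 1 is an ordinary grouping by an equivalence test. The Python sketch below shows only that grouping logic; `partition_by_ratio` and the predicate `ratio_denests` are hypothetical names, and the predicate stands in for the calls to the Algorithm Denesting Element or the Algorithm Real Denesting. The toy predicate used in the example treats an integer $n$ as $\sqrt{n}$ and is an assumption made purely for illustration.

```python
from math import isqrt
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def partition_by_ratio(radicals: Sequence[T], ratio_denests: Callable[[T, T], bool]) -> List[List[T]]:
    """Group radicals so that two of them share a class iff ratio_denests says their ratio denests.

    Each radical is compared only with the first member (the representative) of every
    existing class, which is sufficient when the relation is an equivalence relation.
    """
    classes: List[List[T]] = []
    for r in radicals:
        for cls in classes:
            if ratio_denests(r, cls[0]):
                cls.append(r)
                break
        else:
            classes.append([r])          # r starts a new class R_t
    return classes


def toy_ratio_is_rational(a: int, b: int) -> bool:
    """Toy stand-in: sqrt(a)/sqrt(b) is rational iff a*b is a perfect square."""
    return isqrt(a * b) ** 2 == a * b
```

With the toy predicate, `partition_by_ratio([2, 8, 3, 12, 18], toy_ratio_is_rational)` groups $\sqrt{2}, \sqrt{8}, \sqrt{18}$ into one class and $\sqrt{3}, \sqrt{12}$ into another; the first element of each class then plays the role of the representative $\sqrt[d_t]{\gamma_t}$ of Step 2.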

The correctness of this algorithm follows directly from Theorem 6.1.2 and its complex equivalent Theorem 6.1.5.

Observe that, by the Algorithm Denesting Element, whenever we have to check whether $\sum_{i \in \mathbb{N} \,\mid\, \sqrt[d_i]{\gamma_i} \in R_j} \kappa_i \gamma_{ij}$ is zero, this sum is a sum of radicals: $\gamma_{it}$ is a sum of radicals by the last step in the Algorithm Denesting Element, and for $\kappa_i$ a representation as a sum of radicals has been determined in the General Denesting Algorithm itself. Hence we can apply the algorithms of Section 5 to determine whether the sums $\sum_{i \in \mathbb{N} \,\mid\, \sqrt[d_i]{\gamma_i} \in R_t} \kappa_i \gamma_{it}$ are zero.
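The partition of Step 1 only needs a pairwise test that behaves like an equivalence relation, so each radical can be compared against one representative per class. The following is a minimal sketch of that bookkeeping, not the dissertation's procedure: the names `partition_by_ratio` and `sqrt_ratio_is_rational` are hypothetical, and the denesting test is replaced by a toy stand-in over $\mathbb{Q}$ (for positive integers $a, b$ the ratio $\sqrt{a}/\sqrt{b}$ is rational exactly when $ab$ is a perfect square).

```python
from math import isqrt


def partition_by_ratio(items, ratio_denests):
    """Group items so that two items share a class iff ratio_denests(a, b) holds.

    Assumes ratio_denests is an equivalence relation, so comparing against one
    representative per class is enough.
    """
    classes = []
    for item in items:
        for cls in classes:
            if ratio_denests(item, cls[0]):
                cls.append(item)
                break
        else:
            # No existing class matched: the item starts a new class.
            classes.append([item])
    return classes


def sqrt_ratio_is_rational(a, b):
    """Toy stand-in for the denesting test (an assumption of this sketch):
    sqrt(a)/sqrt(b) is rational for positive integers a, b iff a*b is a square."""
    r = isqrt(a * b)
    return r * r == a * b


# Radicands 2, 8, 18 are rational multiples of sqrt(2); 3 and 12 of sqrt(3).
classes = partition_by_ratio([2, 3, 8, 12, 18, 5], sqrt_ratio_is_rational)
print(classes)  # [[2, 8, 18], [3, 12], [5]]
```

In the algorithm itself the predicate would be an application of the Algorithm Denesting Element or the Algorithm Real Denesting to the pair, and the denestings $\gamma_{it}$ would be stored alongside each class.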


7 Denesting Radicals - The Analysis

In this section we analyze the Algorithm Denesting Element, the Algorithm Real Denesting, and the General Denesting Algorithm. We will show how to fill in the details of the descriptions of the previous section such that the Algorithm Denesting Element and the Algorithm Real Denesting run in time polynomial in $d$, in the degree $N$ of the radical extension generated by the radicals appearing in the description of $\gamma$, and in the input size of the problem. For the General Denesting Algorithm we will show that it runs in time polynomial in the $d_i$'s, in $N$, which is the maximum degree of an extension generated by the radicals appearing in a single pair $\gamma_i, \kappa_i$ of radical expressions, and in the input size of the problem.

Recall from Section 6 that for the Algorithm Denesting Element and the Algorithm Real Denesting the output, which will be a sum of depth 1 radicals, may have $N$ terms. Moreover, if $\gamma$ is a sum of radicals, $N$ will be the degree of the minimal polynomial of $\gamma$ (see Theorem 3.11, p. 31, or Theorem 3.13, p. 32). Hence in this case we achieve a denesting algorithm whose run time is polynomial in the description size of the minimal polynomial of the nested radical. This is the first algorithm to achieve such a run time.

As will be seen later, for the General Denesting Algorithm the output may have $O(kN^2)$ terms, where $k$ is the number of nested radicals $\sqrt[d_i]{\gamma_i}$ and $N$ is as above. Moreover, the degree of the minimal polynomial of $S$ may be $\Omega(N^k \prod d_i)$. On the other hand, the General Denesting Algorithm runs in time polynomial in $k$, $\max\{d_i\}$, and $N$. Again this is the first algorithm to achieve such a run time.

As is clear from the previous section we basically have to analyze the Algorithm Denesting Element.

7.1 Preliminaries

In this subsection we deduce several results that will simplify the analysis of the three main steps in the Algorithm Denesting Element. The main part of the analysis of the Algorithm Denesting Element will be done in the following three subsections. In the final subsection the results will be combined to analyze the overall run times of the three denesting algorithms.

As in Section 6 we refer to the two different types of denesting problems we consider as the real and complex case.

As in Section 5 we assume that an algebraic number field $F = \mathbb{Q}(\alpha)$ is generated by an algebraic integer $\alpha$. The field is specified by the minimal polynomial $p(X) = \sum_{i=0}^{n} p_i X^i$, $p_i \in \mathbb{Z}$, $p_n = 1$, of $\alpha$. Throughout this section $n$ always denotes the degree of $p$ and we assume that the length of $p$ satisfies $|p|_2 < 2^l$. Furthermore it is assumed that $\alpha$ is distinguished from its conjugates by an isolating interval or rectangle, that is, an interval or rectangle containing no root of $p$ except $\alpha$. In general, we will not mention this interval.

Any element $\beta$ of the algebraic number field $\mathbb{Q}(\alpha)$ is specified by an $(n+1)$-tuple $(b, b_0, b_1, \ldots, b_{n-1}) \in \mathbb{Z}^{n+1}$, $\gcd(b, b_0, b_1, \ldots, b_{n-1}) = 1$, such that $\beta = \frac{1}{b} \sum_{i=0}^{n-1} b_i \alpha^i$. For the definition of the infinity norm $[\beta]_\infty$ and the coefficient size $[\beta]$ we refer to Section 5.1.

A radical $\sqrt[d]{\rho}$, $d \in \mathbb{N}$, $\rho \in \mathbb{Q}(\alpha)$, is represented only by $d$ and $\rho$. To avoid ambiguity, in this section $\sqrt[d]{\rho}$ always has the value

$$\sqrt[d]{\rho} = |\rho|^{\frac{1}{d}} \left( \cos \tfrac{1}{d}\varphi + i \sin \tfrac{1}{d}\varphi \right),$$

where $\varphi \in (-\pi, \pi]$ is the angle of $\rho$ if written in polar coordinates, and $|\rho|^{\frac{1}{d}}$ is the positive real root of $|\rho|$. Recall from Section 6 that this is no restriction (footnote 15).

Recall that we can check whether an algebraic number field is real or contains a certain primitive root of unity by algorithms whose run times are polynomial in the description size of the minimal polynomial $p$ and, in case of the root of unity, in the degree of the root of unity (see for example Theorem 5.6.5, p. 92). Likewise, if $\mathbb{Q}(\alpha) \subset \mathbb{R}$, we can check whether an element $\beta$ in $\mathbb{Q}(\alpha)$ is positive in time polynomial in the description size of $p$ and $\beta$. Hence for the Algorithm Denesting Element, as described in the previous section, we can check the input conditions on $\mathbb{Q}(\alpha)$ or on the radicals $\sqrt[d_i]{\rho_i}$ efficiently.
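The convention for the value of a radical fixed above can be made concrete with a small floating-point sketch. It is only an illustration of the definition (the function name `principal_root` is hypothetical), not part of the exact algorithms analyzed here.

```python
import cmath
import math


def principal_root(rho: complex, d: int) -> complex:
    """Return |rho|^(1/d) * (cos(phi/d) + i*sin(phi/d)), phi the angle of rho in (-pi, pi]."""
    if rho == 0:
        return 0j
    phi = cmath.phase(rho)       # cmath.phase returns a value in [-pi, pi]
    if phi == -math.pi:          # normalise the boundary so that phi lies in (-pi, pi]
        phi = math.pi
    modulus = abs(rho) ** (1.0 / d)   # positive real d-th root of |rho|
    return modulus * complex(math.cos(phi / d), math.sin(phi / d))


# The angle of -8 is pi, so the value is 2*(cos(pi/3) + i*sin(pi/3)) = 1 + i*sqrt(3),
# not the real cube root -2.
w = principal_root(-8, 3)
```

Note that for negative real $\rho$ and odd $d$ this convention does not select the real root, which is why the real case needs its separate treatment.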

We want to apply the Algorithm Denesting Element or the Algorithm Real Denesting to a nested radical $\sqrt[d]{\gamma}$, where $\gamma$ is a rational expression in radicals over $\mathbb{Q}(\alpha)$. That is, $\gamma$ is an expression built up from elements in $\mathbb{Q}(\alpha)$ and from elements in a finite set $\{\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_K]{\rho_K}\}$ of radicals over $\mathbb{Q}(\alpha)$ using the arithmetic operations addition, subtraction, multiplication, and division.

Footnote 15: We may also use the same convention as in Section 5, but in view of the definitions in Section 6 (in particular, admissible sequences) choosing this more restricted form is appropriate.

We may for example assume that $\gamma$ is given as a straight-line program using elements in $\mathbb{Q}(\alpha)$ and the radicals $\sqrt[d_i]{\rho_i}$. As is not hard to see, analyzing the Algorithm Denesting Element for $\gamma$'s defined in this way would lead to an algorithm whose run time is exponential in the number of steps of the straight-line program. Instead of assuming a particular input form, below we state five conditions on the representation of $\gamma$, and the analysis we give in the sequel applies to all classes of expressions satisfying these conditions. By an appropriate choice of the parameters, nested radicals defined via straight-line programs also fulfill these conditions. The conditions are as follows.

1) $\sqrt[d_i]{\rho_i}$ is an algebraic integer for all $i = 1, 2, \ldots, K$.

2) $\sqrt[d_i]{\rho_i}$ is of degree $d_i$ over $F = \mathbb{Q}(\alpha)$.

3) The radicals $\sqrt[d_i]{\rho_i}$ are linearly independent over $\mathbb{Q}(\alpha)$.

4) An integer $K \in \mathbb{N}$ is known such that

$$[\gamma]_\infty < 2^K$$

and such that an integer $C \in \mathbb{N}$, $C < 2^K$, exists with

$$C\gamma \text{ is an algebraic integer.}$$

5) For any positive $\varepsilon$ an approximation to $\gamma$ with absolute error less than $\varepsilon$ can be determined by elementary operations on floating-point numbers of size polynomial in $\log \frac{1}{\varepsilon}$ and $K$. The number of operations necessary is also bounded by a polynomial in $\log \frac{1}{\varepsilon}$ and $K$.

In 4) we do not assume that we know the integer $C$. Rather it will be determined in the algorithm.

The first three conditions are of a more technical nature: they simplify the notation and reduce the number of input parameters but are not crucial to the analysis. We will show that these conditions can always be satisfied.

Conditions 4) and 5) are the basic assumptions of the analysis. In fact, $K$ will be one of the important run time parameters. Basically, using the methods of Section 5, in particular Theorem 5.2.6, p. 61, Conditions 4) and 5) allow us to determine some canonical basis representation of $\gamma$. The denesting algorithm will work with this canonical representation rather than with the original expression for $\gamma$. Moreover, these conditions will allow us to determine whether $\gamma$ is positive in case $F \subset \mathbb{R}$ and $d$ is even.

To justify and demonstrate the generality of the conditions stated above we will show below that a large class of expressions satisfy, in particular, conditions 4) and 5). Hence the analysis applies to these expressions. But due to the generality of 4) and 5) the analysis is not restricted to expressions of the form described below.

Let us begin by considering the first three conditions.

Let $r_i$ be the denominator of $\rho_i$. As shown in the proof of Lemma 5.3.1, p. 63, $r_i \sqrt[d_i]{\rho_i}$ is an algebraic integer. This shows how to satisfy condition 1) by replacing $\sqrt[d_i]{\rho_i}$ by $r_i \sqrt[d_i]{\rho_i}$, and from now on we assume that all elements $\sqrt[d_i]{\rho_i}$ are integers.

Condition 2) can be satisfied by determining for any element in $\{\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_K]{\rho_K}\}$ the smallest $m_i$ such that $\sqrt[d_i]{\rho_i}^{\,m_i} = \rho'_i \in \mathbb{Q}(\alpha)$. By Theorem 3.3, p. 24, this is the degree of $\sqrt[d_i]{\rho_i}$ over $\mathbb{Q}(\alpha)$. Computing the integers $m_i$ can easily be done by checking successively for $m_i = 1, 2, \ldots$ whether $\sqrt[d_i]{\rho_i}^{\,m_i} \in \mathbb{Q}(\alpha)$ using the algorithms of Section 5, in particular Theorem 5.2.6, p. 61. The degree $m_i$ must be less than $N$, the degree of the extension generated by the radicals $\sqrt[d_i]{\rho_i}$. Hence this algorithm has to be applied at most $KN$ times. The run time of each algorithm will be polynomial in $d_i$ if the deterministic version is applied. But if some error is allowed it can be reduced to $\log d_i$.

Moreover Lemma 5.1.11, p. 53, and Lemma 5.1.8, p. 51, can be used to show that if $[\rho_i] < 2^L$ then the element $\rho'_i$ with $\sqrt[d_i]{\rho_i}^{\,m_i} = \rho'_i$ has coefficient size less than $2^{3\bar{L}}$, where as in Section 5 we denote by $\bar{L}$ the expression $\lceil n \log n + nl + nL \rceil$. In fact, since $[\rho_i] < 2^L$, Lemma 5.1.11, p. 53, implies $[\rho_i]_\infty < 2^{\bar{L}}$ and therefore $[\sqrt[d_i]{\rho_i}^{\,m_i}]_\infty < 2^{\bar{L}}$ for $m_i \le d_i$. Lemma 5.1.8, p. 51, implies $[\sqrt[d_i]{\rho_i}^{\,m_i}] < 2^{3\bar{L}}$. So from now on we assume that $d_i$ is the degree of $\sqrt[d_i]{\rho_i}$ over $\mathbb{Q}(\alpha)$.
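The successive test for $m_i = 1, 2, \ldots$ can be illustrated in the simplest special case $F = \mathbb{Q}$ with a positive integer radicand, where the membership test reduces to exact integer arithmetic: $\sqrt[d]{\rho}^{\,m}$ is rational exactly when $\rho^m$ is a perfect $d$-th power. This is a toy sketch under that assumption; the helper names are hypothetical, and over a general field $\mathbb{Q}(\alpha)$ the test would instead use the algorithms of Section 5.

```python
def integer_root(n: int, d: int) -> int:
    """Largest integer r with r**d <= n, for n >= 0 and d >= 1 (binary search)."""
    lo, hi = 0, 1
    while hi ** d <= n:          # find an upper bound with hi**d > n
        hi *= 2
    while lo + 1 < hi:           # invariant: lo**d <= n < hi**d
        mid = (lo + hi) // 2
        if mid ** d <= n:
            lo = mid
        else:
            hi = mid
    return lo


def radical_degree_over_Q(rho: int, d: int) -> int:
    """Smallest m >= 1 such that the real d-th root of rho, raised to m, is rational.

    Toy case: rho is a positive integer and the base field is Q.
    """
    for m in range(1, d + 1):
        power = rho ** m
        if integer_root(power, d) ** d == power:
            return m
    raise AssertionError("unreachable: m = d always works")


# 4^(1/4) = sqrt(2), so the degree drops from 4 to 2.
print(radical_degree_over_Q(4, 4))  # 2
```

Once the smallest $m$ is found, the radical $\sqrt[d]{\rho}$ can be replaced by $\sqrt[m]{\rho'}$ with $\rho' = \sqrt[d]{\rho}^{\,m}$, which is how condition 2) is enforced.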

Finally, Condition 3) can be satisfied for any expression $\gamma$ by applying the algorithm leading to Theorem 5.6.1, p. 88, or Theorem 5.6.2, p. 90, to the set $\{\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_K]{\rho_K}\}$ in order to determine a set of linearly independent radicals of maximum size. Recall that any other radical can be written as a product of an element in this set and an element in $\mathbb{Q}(\alpha)$ (Corollary 3.10, p. 28). Moreover the multiples in $\mathbb{Q}(\alpha)$ will have coefficient size less than $2^{6\bar{L}}$ (Lemma 5.3.2, p. 64).

Condition 1) implies $d_i \le N$ for all $i$ and condition 2) implies that the number $K$ of radicals $\sqrt[d_i]{\rho_i}$ is less than $N$.

Now let us show that for a large class of expressions conditions 4) and 5) are satisfied with $K$ being polynomially related to $N$ and to the input size.

Let $S_j$, $j = 1, 2, \ldots, m$, have the form

$$S_j = \sum_{i=1}^{m} \frac{P_{i,j}}{Q_{i,j}},$$

where $P_{i,j}, Q_{i,j}$, $i = 1, \ldots, m$, are linear combinations of the radicals $\sqrt[d_h]{\rho_h}$ with the $\rho_h$'s and the coefficients $\kappa$ in $\mathbb{Q}(\alpha)$ satisfying $[\rho_h], [\kappa] < 2^L$. Furthermore assume $\gamma$ has the form

$$\gamma = P(S_1, S_2, \ldots, S_m),$$

where $P(X_1, X_2, \ldots, X_m)$ is a polynomial in $m$ variables with coefficients in $\mathbb{Q}(\alpha)$. Assume that $P$ has at most $m$ terms and that the degree is also bounded by $m$ (footnote 16). Let the coefficients of $P$ have coefficient size less than $2^L$. Finally we assume that conditions 1), 2), and 3) are already satisfied.

To derive the bound $K$ in 4) consider a single ratio $P_{i,j}/Q_{i,j}$. Multiplying $P_{i,j}$ or $Q_{i,j}$ with the product $b_{i,j}$, $c_{i,j}$, respectively, of the denominators of its coefficients yields algebraic integers since the radicals $\sqrt[d_i]{\rho_i}$ are already integers. We obtain

$$|b_{i,j}|, |c_{i,j}| < 2^{NL}, \tag{1}$$

since the number of radicals and therefore the number of terms in each $P_{i,j}, Q_{i,j}$ is bounded by $N$ (Condition 2).

We need a bound on a rational integer $c'_{i,j}$ such that $c'_{i,j} \frac{1}{Q_{i,j}}$ is an algebraic integer. By the previous argument a positive integer $c_{i,j} < 2^{NL}$ exists such that $Q'_{i,j} = c_{i,j} Q_{i,j}$ is an integer. It suffices to give a bound on the size of a rational integer $c'_{i,j}$ such that $c'_{i,j} \frac{1}{Q'_{i,j}}$ is an algebraic integer.

We need to bound the infinity norm of $Q'_{i,j}$. The coefficients in $Q_{i,j}, P_{i,j}$ have coefficient size bounded by $2^L$ and the same is true for the $\rho_i$'s in $\sqrt[d_i]{\rho_i}$. Hence by Lemma 5.1.11, p. 53, the coefficients have infinity norm less than $2^{nl+L+2} \le 2^{\bar{L}}$ and $[\sqrt[d_i]{\rho_i}]_\infty < 2^{\bar{L}}$, too (footnote 17). Therefore

$$[P_{i,j}]_\infty, [Q_{i,j}]_\infty < 2^{2\bar{L} + \log N} < 2^{2N\bar{L}}. \tag{2}$$

Together with the bound on $|c_{i,j}|$ in (1) this implies

$$[Q'_{i,j}]_\infty < 2^{3N\bar{L}}. \tag{3}$$

We need the following well-known lemma.

Footnote 16: The argument we give is easily extended if we want to distinguish between the number of variables, the degree of $P$, and the number of terms of $P$. In this case $m$ is the maximum of the parameters above.

Footnote 17: Going from the coefficient size to the infinity norm usually means replacing $2^L$ by $2^{\bar{L}}$.


Lemma 7.1.1 Let $\rho \in \mathbb{C}$ be an algebraic integer of degree at most $m$ such that $[\rho]_\infty < 2^B$. Then the minimal polynomial $r$ of $\rho$ satisfies

$$|r|_\infty < 2^{mB}, \qquad |r|_2 < 2^{m(B+1)},$$

provided $B > \log m$.

Proof: Let $e_i(x_1, \ldots, x_h)$ denote the $i$-th symmetric function on $h$ symbols $x_1, \ldots, x_h$, i.e.,

$$e_i(x_1, \ldots, x_h) = \sum_{\{k_1, \ldots, k_{h-i}\} \subseteq \{1, \ldots, h\}} x_{k_1} \cdots x_{k_{h-i}}, \qquad i = 0, 1, \ldots, h-1.$$

As is well-known, for $h$ elements $x_1, \ldots, x_h \in \mathbb{C}$ the number $e_i(x_1, \ldots, x_h)$ is, up to sign, the coefficient $r_i$ in

$$\prod_{i=1}^{h} (X - x_i) = \sum_{i=0}^{h} r_i X^i.$$

Hence for an algebraic integer $\rho$ of degree $h$ the symmetric functions on its conjugates describe the coefficients of its minimal polynomial.

Therefore, if $\rho$ is of degree $h \le m$ and $[\rho]_\infty < 2^B$, $B > \log m$, then the $i$-th coefficient $r_i$ in $r$ satisfies

$$|r_i| \le \binom{h}{i} [\rho]_\infty^{h-i} \le h^i [\rho]_\infty^{h-i} < 2^{hB} \le 2^{mB}.$$

Equivalently, the height $|r|_\infty$ is strictly less than $2^{mB}$. Moreover,

$$|r|_2 \le |r|_1 \le \sum_{i=0}^{h} \binom{h}{i} [\rho]_\infty^{h-i} = (1 + [\rho]_\infty)^h < 2^{h(B+1)} \le 2^{m(B+1)}.$$
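As a sanity check of the two bounds in Lemma 7.1.1, one can expand $\prod (X - x_i)$ for the conjugates of a concrete algebraic integer and compare height and length with $2^{mB}$ and $2^{m(B+1)}$. The sketch below uses $\rho = 1 + \sqrt{2}$ as an assumed example and floating-point arithmetic, so it illustrates the statement rather than proving anything; `poly_from_roots` is a hypothetical helper.

```python
import math


def poly_from_roots(roots):
    """Coefficients (lowest degree first) of the product of (X - x) over the given roots."""
    coeffs = [1.0]
    for x in roots:
        shifted = [0.0] + coeffs                    # X * current polynomial
        scaled = [-x * c for c in coeffs] + [0.0]   # -x * current polynomial
        coeffs = [a + b for a, b in zip(shifted, scaled)]
    return coeffs


conjugates = [1 + math.sqrt(2), 1 - math.sqrt(2)]   # conjugates of rho = 1 + sqrt(2)
m = len(conjugates)
B = 2   # chosen so that max |conjugate| < 2**B and B > log2(m)

# Minimal polynomial X^2 - 2X - 1, recovered by rounding the float expansion.
r = [round(c) for c in poly_from_roots(conjugates)]
height = max(abs(c) for c in r)                     # |r|_inf
length = math.sqrt(sum(c * c for c in r))           # |r|_2

print(r, height, length)
```

For this example the height is 2 and the length is $\sqrt{6}$, far below the bounds 16 and 64; the lemma is deliberately generous.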

Observe that the degree of $Q'_{i,j}$ over $\mathbb{Q}$ can be at most $nN$. Together with the bound given above for the infinity norm of $Q'_{i,j}$ this yields a bound on the height of the minimal polynomial of $Q'_{i,j}$ of

$$2^{4nN^2\bar{L}}.$$

Next recall that if $f(X) = \sum_{i=0}^{h} f_i X^i$ is the minimal polynomial of an algebraic number $\rho$ then $\sum_{i=0}^{h} f_{h-i} X^i$ is the minimal polynomial of $\rho^{-1}$. Also recall that $f_h \rho$ is an algebraic integer (footnote 18).
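Both facts recalled here are easy to check numerically on a small example. The sketch below takes the assumed polynomial $f(X) = 2X^2 - 2X - 1$ with root $\rho = (1 + \sqrt{3})/2$, reverses its coefficients, and builds the monic integer polynomial $Y^h + \sum_{i<h} f_i f_h^{\,h-1-i} Y^i$ satisfied by $f_h \rho$ (obtained from $f_h^{\,h-1} f(Y/f_h)$); the names are hypothetical.

```python
import math


def evaluate(coeffs, x):
    """Evaluate a polynomial given lowest-degree-first coefficients (Horner's rule)."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


f = [-1, -2, 2]                  # f(X) = 2X^2 - 2X - 1, irreducible over Q
rho = (1 + math.sqrt(3)) / 2     # one root of f
f_reversed = f[::-1]             # coefficients f_{h-i}: -X^2 - 2X + 2

h = len(f) - 1
lead = f[h]
# Monic integer polynomial with root lead * rho: Y^h + sum_i f_i * lead^(h-1-i) * Y^i
monic = [f[i] * lead ** (h - 1 - i) for i in range(h)] + [1]

print(evaluate(f_reversed, 1 / rho))    # approximately 0
print(monic, evaluate(monic, lead * rho))
```

Here `monic` is $Y^2 - 2Y - 2$, whose root $1 + \sqrt{3}$ is indeed $f_h \rho$.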

Therefore an integer $c'_{i,j}$ exists whose size is bounded by $2^{4nN^2\bar{L}}$ such that $c'_{i,j} \frac{1}{Q'_{i,j}}$ is an algebraic integer. As mentioned before this implies that for $Q_{i,j}$ a positive integer $c'_{i,j}$ satisfying

$$c'_{i,j} < 2^{4nN^2\bar{L}} \tag{4}$$

exists such that $c'_{i,j} \frac{1}{Q_{i,j}}$ is an algebraic integer.

The product over $i$ of all $c'_{i,j}$'s and $b_{i,j}$'s is an integer $C_j$ such that $C_j S_j = C_j \sum_{i=1}^{m} P_{i,j}/Q_{i,j}$ is an algebraic integer. By the bounds in (1), (4)

$$|C_j| < 2^{5mnN^2\bar{L}}. \tag{5}$$

Since the polynomial $P$ has degree at most $m$, each $S_j$ can occur at most with exponent $m$. Raising all $C_j$'s to the $m$-th power and multiplying these powers yields an integer $C$ such that any product of $S_j$'s occurring in $\gamma$, if multiplied with $C$, yields an algebraic integer. Hence a positive integer $C$ with

$$C < 2^{5m^3nN^2\bar{L}} \tag{6}$$

exists such that $C\gamma$ is an algebraic integer.

Next let us bound the infinity norm of $\gamma$. Recall that $Q'_{i,j}$ is a root of a polynomial $f$ whose height $|f|_\infty$ is bounded by $2^{4nN^2\bar{L}}$. By Cauchy's bound (Lemma 4.2.1, p. 41)

$$\left[\frac{1}{Q'_{i,j}}\right]_\infty < 1 + 2^{4nN^2\bar{L}}. \tag{7}$$

Since $\frac{1}{Q_{i,j}} = c_{i,j} \frac{1}{Q'_{i,j}}$ and $1 \le |c_{i,j}|$ we get

$$\left[\frac{1}{Q_{i,j}}\right]_\infty < 2^{5nN^2\bar{L}}. \tag{8}$$
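The step leading to (7) combines the reversed polynomial with a root bound. A small numeric illustration, assuming the standard form of Cauchy's bound $1 + \max_{i<h} |a_i| / |a_h|$ (the exact statement of Lemma 4.2.1 is not reproduced here) and the example $g(X) = X^2 - 6X + 2$, is sketched below. Since the leading coefficient of the reversed polynomial is a nonzero integer, the bound is at most $1 + |g|_\infty$.

```python
import math


def cauchy_bound(coeffs):
    """Standard Cauchy root bound 1 + max_{i<h} |a_i| / |a_h| (lowest-degree-first coefficients)."""
    lead = abs(coeffs[-1])
    return 1 + max(abs(c) for c in coeffs[:-1]) / lead


g = [2, -6, 1]                                    # g(X) = X^2 - 6X + 2, roots 3 +/- sqrt(7)
roots_g = [3 + math.sqrt(7), 3 - math.sqrt(7)]
g_reversed = g[::-1]                              # 2X^2 - 6X + 1; its roots are the inverses
inverse_roots = [1 / x for x in roots_g]

height_g = max(abs(c) for c in g)                 # |g|_inf = 6
largest_inverse = max(abs(x) for x in inverse_roots)

# The largest inverse root is below Cauchy's bound for the reversed polynomial,
# which in turn is at most 1 + |g|_inf.
print(largest_inverse, cauchy_bound(g_reversed), 1 + height_g)
```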

Footnote 18: This argument was already used at the end of Section 2 to show that any number field can be generated by an integer.

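By Lemma 5.1.7, p. 51, for each sum $\sum_{i=1}^{m} P_{i,j}/Q_{i,j}$ its infinity norm is bounded by

$$\sum_{i=1}^{m} [P_{i,j}]_\infty \left[\frac{1}{Q_{i,j}}\right]_\infty. \tag{9}$$

Since $[P_{i,j}]_\infty < 2^{2(nl+L+2)+\log N} < 2^{2N\bar{L}}$ (see (2)) this shows

$$[S_j]_\infty < 2^{7nN^2\bar{L} + \log m}. \tag{10}$$

By definition of $m$ and the bounds on the coefficients of $P$

$$[\gamma]_\infty < 2^{8m^2nN^2\bar{L}}. \tag{11}$$

Hence $K$ satisfying both conditions in 4) may be chosen as

$$K = 8m^3nN^2\bar{L}. \tag{12}$$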

Finally let us consider 5). Using very rough estimates we get the following lemma that is already sufficient.

Lemma 7.1.2 Let $\varepsilon > 0$. An approximation with absolute error less than $\varepsilon$ to a radical expression $\gamma$ as described above can be computed using $O(n)$ elementary operations on floating-point numbers of size $O(n \log \frac{1}{\varepsilon} + mn^2N^2\bar{L})$ and $O(N(\log\log \frac{1}{\varepsilon} + \log N + \log \bar{L} + m^2 n))$ elementary operations on floating-point numbers of size $O(\log \frac{1}{\varepsilon} + mnN^2\bar{L})$.

Proof: We claim that approximations to $\alpha$ and the radicals $\sqrt[d_i]{\rho_i}$, $i = 1, 2, \ldots, K$, with absolute error less than

$$\varepsilon 2^{-31mnN^2\bar{L}}$$

yield an approximation to $\gamma$ as required.

In fact, if $\alpha$ is approximated with absolute error less than $\varepsilon 2^{-31mnN^2\bar{L}}$ then the coefficients in $P$ and in the $P_{i,j}$'s, $Q_{i,j}$'s are approximated with absolute error less than

$$\varepsilon 2^{-31mnN^2\bar{L}} \, 2^{2(l+1)n+L} < \varepsilon 2^{-30mnN^2\bar{L}},$$

using Lemma 5.4.3, p. 69, and $N \ge 2$, $|\alpha| < 2^l$ (see Landau's bound on the measure of a polynomial, Lemma 5.1.2, p. 48).


The latter estimate also covers the error we may make by computing the inverses of the denominators of the coefficients in $P, P_{i,j}, Q_{i,j}$ only with error less than $\varepsilon 2^{-31mnN^2\bar{L}}$.

These approximations to the coefficients can be determined from the initial approximations using $O(nm^2N)$ elementary operations on floating-point numbers of size $O(\log \frac{1}{\varepsilon} + nmN^2\bar{L})$, since the overall number of coefficients is bounded by $O(m^2N)$ and each coefficient is a linear combination of $1, \alpha, \ldots, \alpha^{n-1}$.

Since the coefficients in the sums $P_{i,j}, Q_{i,j}$ and the radicals $\sqrt[d_i]{\rho_i}$ are bounded in absolute value by $2^{\bar{L}}$ (Lemma 5.1.11, p. 53), the approximations to the coefficients together with the approximations to the radicals $\sqrt[d_i]{\rho_i}$ lead to approximations $\tilde{P}_{i,j}, \tilde{Q}_{i,j}$ to $P_{i,j}, Q_{i,j}$ with absolute error less than $\varepsilon 2^{-29mnN^2\bar{L}}$. Only $O(m^2N)$ operations are needed to compute these approximations.

Next we compute approximations $\left(\frac{1}{\tilde{Q}_{i,j}}\right)'$ to $\frac{1}{\tilde{Q}_{i,j}}$ with absolute error less than $\varepsilon 2^{-18mnN^2\bar{L}}$. Hence

$$\left| \frac{1}{Q_{i,j}} - \left(\frac{1}{\tilde{Q}_{i,j}}\right)' \right| \le \left| \frac{1}{Q_{i,j}} - \frac{1}{\tilde{Q}_{i,j}} \right| + \left| \frac{1}{\tilde{Q}_{i,j}} - \left(\frac{1}{\tilde{Q}_{i,j}}\right)' \right| \le \varepsilon 2^{-18mnN^2\bar{L}} + \varepsilon 2^{-18mnN^2\bar{L}} < \varepsilon 2^{-17mnN^2\bar{L}}.$$

The first bound follows from the estimate on $[1/Q_{i,j}]_\infty$ given in (8) and Lemma 5.4.3, p. 69, again.

For all $m^2$ pairs $i, j$ this step is done by $O(m^2)$ operations on floating-point numbers of size $O(\log \frac{1}{\varepsilon} + nmN^2\bar{L})$ (see Theorem 5.4.2, p. 68).

Then the approximations for the $P_{i,j}$'s and $\frac{1}{Q_{i,j}}$'s are multiplied. By our bounds on $[P_{i,j}]_\infty$ in (2) and on $\left[\frac{1}{Q_{i,j}}\right]_\infty$ in (8) this yields approximations to the quotients $S_j$ with absolute error less than $\varepsilon 2^{-16mnN^2\bar{L}}$. The run time for this step is covered by the previous ones.

Finally we need to approximate the power products of the $S_j$ appearing in $P$ by determining the corresponding power products of the approximations to the $S_j$. Then we multiply these power products with the approximations to the coefficients of $P$, and sum up the results. For the power products we need $O(m^2)$ operations and for the remaining step $O(m)$ elementary operations, both on floating-point numbers of size $O(\log \frac{1}{\varepsilon} + nmN^2\bar{L})$.

By the bound $[S_j]_\infty < 2^{7nN^2\bar{L}}$ (see (10)) our approximation to $S_j$ is clearly bounded in absolute value by $2^{7nN^2\bar{L}+1}$. Lemma 5.4.3, p. 69, applied once more shows that the approximations to $S_j$ lead to approximations to the power products of the $S_j$ with absolute error less than

$$\varepsilon 2^{-2mnN^2\bar{L}+4m} < \varepsilon 2^{-mnN^2\bar{L}}.$$

Again using the bound on $S_j$ and the bound on the coefficients of $P$ shows that the final result of our computations approximates $\gamma$ with absolute error less than $\varepsilon$ as desired.

To compute the approximation to $\alpha$ and $\sqrt[d_i]{\rho_i}$, $i = 1, 2, \ldots, K$, requires $O(n)$ elementary operations on floating-point numbers of size $O(n \log \frac{1}{\varepsilon} + mn^2N^2\bar{L})$ and $O(N(\log\log \frac{1}{\varepsilon} + \log(nmN^2\bar{L})))$ elementary operations on floating-point numbers of size $O(\log \frac{1}{\varepsilon} + nmN^2\bar{L})$, which follows from Theorem 5.4.1, p. 68, Lemma 5.4.6, p. 71, or Lemma 5.4.9, p. 74. This proves the lemma.

This finally shows that conditions 4) and 5) are satisfied for expressions such as the ones described above.

One of the results to be proven below is that any expression satisfying the conditions 1) to 5) can be transformed efficiently into a sum of linearly independent radicals. But assuming this input form from the beginning seems to be too restrictive and is not appropriate if we apply the Algorithm Denesting Element in the General Denesting Algorithm. Moreover, the transformation step requires more or less all techniques used in the denesting algorithms.

Let us summarize and repeat all our input assumptions before beginning with the analysis.

Input assumptions for the Algorithm Denesting Element

The input to the Algorithm Denesting Element consists of a field $F = \mathbb{Q}(\alpha)$, where $\alpha$ is an algebraic integer. The minimal polynomial $p$ of $\alpha$ has degree $n$ and length $|p|_2$ bounded by $2^l$. We assume that $F$ is either real or contains a primitive $d$-th root of unity. It furthermore consists of a nested radical $\sqrt[d]{\gamma}$, where $d$ is a positive rational integer and $\gamma$ is a radical expression in the linearly independent radicals $\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_K]{\rho_K}$ over $\mathbb{Q}(\alpha)$. It is also assumed that $\rho_i$ is an algebraic integer and $[\rho_i] < 2^L$. Moreover, $\sqrt[d_i]{\rho_i}$ is of degree $d_i$ over $\mathbb{Q}(\alpha)$. If $F$ is real then the radicals $\sqrt[d_i]{\rho_i}$ and $\gamma$ are real. In the complex case $d_i \mid d$ for all $i$. A positive integer $K$ is given and it is guaranteed that $[\gamma]_\infty < 2^K$ and that a positive integer $C$ less than $2^K$ exists such that $C\gamma$ is an algebraic integer. $N$ is the degree of the radical extension generated by the radicals in $\gamma$. It is assumed that $\bar{L} = \lceil n \log n + nl + nL \rceil > \log N$.

Finally, we assume that for any $\varepsilon > 0$, $\gamma$ can be approximated with absolute error less than $\varepsilon$ in time polynomial in $\log \frac{1}{\varepsilon}$ and $K$.

The assumption $\bar{L} > \log N$ simplifies the analysis considerably and we adopt it only for the sake of simplicity.

The next three subsections contain the detailed description and the analysis of the three steps of the Algorithm Denesting Element. The reader should recall the steps of the Algorithm Denesting Element before reading the corresponding analysis.

7.2 Description and Analysis of Step 1

We have to show how to compute the degrees $n_i$ of the extensions $F^{(i)} = F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_i]{\rho_i})$ over $F^{(i-1)} = F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_{i-1}]{\rho_{i-1}})$, $i = 1, 2, \ldots, K$, and how to compute primitive elements $\eta_i$ for the extensions $F^{(i)}$ over $\mathbb{Q}$. We need these elements for efficient computations in $F^{(i)}$. We use two different representations for the elements $\eta_i$. One is via a rational integer $c$ such that $\eta_i = c\alpha + \sqrt[d_1]{\rho_1} + \sqrt[d_2]{\rho_2} + \cdots + \sqrt[d_i]{\rho_i}$ for all $i$. The other representation for $\eta_i$ is via its minimal polynomial $p_i$, which is computed using Schönhage's algorithm. Finally, in the real case sets $E^{(i)}$ have to be computed such that $\sqrt[d_i]{\rho_i}^{\,sd} \in F^{(i-1)}$ if and only if $s \in E^{(i)}$.

Lemma 7.2.1 In both the real and the complex case, the degrees $n_i$, primitive elements $\eta_i$, their minimal polynomials $p_i$, and the sets $E^{(i)}$ can be computed using $O(n^4N^4\bar{L})$ elementary operations on integers of size $O(\bar{L}n^2N^2 \log N)$ and $O(nN^2 \log(N\bar{L}))$ elementary operations on floating-point numbers of size $O(n^3N^2\bar{L})$. The minimal polynomial $p_i$ of $\eta_i$ satisfies $|p_i|_2 < 2^{4nN\bar{L}}$.

Proof: We give the proof only for the real case; except for some minor changes the proof in the complex case is the same.

First we show how to determine the degrees $n_i$. We do this in the same way as in the considerations leading to Theorem 4.2.2, p. 42.

Assume that the degrees $n_j$, $j < i$, have already been computed. Then we also know the standard basis of $F^{(i-1)}$. By Lemma 3.9, p. 28, $n_i$ is the smallest integer such that $\sqrt[d_i]{\rho_i}^{\,n_i} \in F^{(i-1)}$. Moreover, in this case $\sqrt[d_i]{\rho_i}^{\,n_i}$ can be written as a product of an element in $F$ and an element of the standard basis of $F^{(i-1)}$.


The degree of $F^{(i-1)}$ is given by $N_{i-1} = \prod_{j=1}^{i-1} n_j$. By assumption, $d_j$ is the degree of $\sqrt[d_j]{\rho_j}$ over $F$. Hence $\sqrt[d_j]{\rho_j}$, $j \le i-1$, generates a subfield of $F^{(i-1)}$ of degree $d_j$. Therefore $d_j$, $j \le i-1$, must divide $N_{i-1}$. This implies that if $\ell_j := N_{i-1}/d_j$ then any element $\beta_h$ in the standard basis of $F^{(i-1)}$ can be written as

$$\beta_h = \sqrt[N_{i-1}]{\prod_{j=1}^{i-1} \rho_j^{\,e_j \ell_j}}, \qquad e_j < n_j.$$

Since $e_j < n_j$ we have $e_j \ell_j \le N_{i-1}$. By Lemma 5.5.5, p. 83,

$$\left[ \prod_{j=1}^{i-1} \rho_j^{\,e_j \ell_j} \right] < 2^{\bar{L} N \log N},$$

since overall at most $\log N$ degrees $n_j$ can be larger than 1, and $N$, although so far unknown, is clearly an upper bound for $N_{i-1}$. Together this implies $\sum_{j=1}^{i-1} e_j \ell_j \le N \log N$.

Using $O(n^2 N \log N)$ elementary operations on integers of size $O(\bar{L} N \log N)$ we can easily determine the representations as linear combinations of $1, \alpha, \ldots, \alpha^{n-1}$ of $\prod_{j=1}^{i-1} \rho_j^{\,e_j \ell_j}$ for all elements $\beta_h = \sqrt[N_{i-1}]{\prod_{j=1}^{i-1} \rho_j^{\,e_j \ell_j}}$ in the standard basis of $F^{(i-1)}$ (Lemma 5.5.2, p. 82).

Also by Lemma 5.5.5, p. 83,

$$[\sqrt[d_i]{\rho_i}^{\,m}] < 2^{N\bar{L}}, \quad \text{if } m \le d_i.$$

By the main result in Section 5 (see Table 1, p. 88) we can determine for each basis element $\beta_h$ and each power $\sqrt[d_i]{\rho_i}^{\,m}$ whether their ratio is an element of $\mathbb{Q}(\alpha)$ using $O(n)$ elementary operations on floating-point numbers of size $O(\bar{L} n^3 N \log N)$, $O(\log(nN\bar{L}))$ elementary operations on floating-point numbers of size $O(\bar{L} n^2 N \log N)$, $O(\bar{L} n^4 N \log N)$ elementary operations on integers of size $O(\bar{L} n^2 N \log N)$, and $O(n^2 \log N)$ elementary operations on integers of size $O(\bar{L} n N^2 \log N)$ (footnote 19).

Since the standard basis for $F^{(i-1)}$ contains $N_{i-1}$ elements and the ratio tests have to be applied only to $\sqrt[d_i]{\rho_i}, \ldots, \sqrt[d_i]{\rho_i}^{\,n_i}$ for each $i$, this test has to be applied at most $n_i N_{i-1}$ times. Since $n_i N_{i-1} \le N$ and $K \le N$, throughout the first step of the Algorithm Denesting Element the test has to be applied at most $N^2$ times. Hence within the time bounds stated in the lemma the degrees $n_i$ can be determined, and from now on we assume that $N_i$ and $N$ are known.

Footnote 19: When applying the results from Section 5 to this case, be careful about the meaning of $L$ and $\bar{L}$. In the present case the coefficient size of the input elements is already $2^{\bar{L} N \log N}$.

The renumbering of the radicals $\sqrt[d_i]{\rho_i}$ such that the first $k$ radicals satisfy

$$F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_{i-1}]{\rho_{i-1}}) \ne F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_i]{\rho_i}), \qquad i \le k,$$

is easily done by taking as the first $k$ radicals those corresponding to the indices $i$ with $n_i \ne 1$ in the order as they appear in the sum defining $\gamma$. Except for their occurrence in the definition of $\gamma$ the remaining ones will not be used any further. From now on we therefore assume $n_i > 1$. Furthermore the indices $i$ will refer to this rearranged set of radicals. The first $k$ degrees $n_i$ are therefore exactly the degrees $n_i > 1$ computed above. Moreover, the order of these degrees is as in the original sequence, that is, $n_i$ is the $i$-th degree from the original sequence that is strictly larger than 1.

It follows from Theorem 3.3, p. 24, that $\sqrt[d_i]{\rho_i}^{\,sd} \in F^{(i-1)}$ if and only if $n_i$ divides $sd$. Equivalently, $s$ must be divisible by $\frac{n_i}{\gcd(n_i, d)}$. This implies

$$E^{(i)} = \left\{ 0, \frac{n_i}{\gcd(n_i, d)}, 2\frac{n_i}{\gcd(n_i, d)}, \ldots, (\gcd(n_i, d) - 1)\frac{n_i}{\gcd(n_i, d)} \right\}.$$

Using one of the standard gcd algorithms (for example [Sc1], but even the ordinary algorithm of Euclid suffices), all sets $E^{(i)}$ can be determined within the time bounds stated.
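The description of $E^{(i)}$ translates directly into a few lines of code. The sketch below (function names are hypothetical) builds the set from $\gcd(n_i, d)$ and compares it with a brute-force enumeration of all $s$ with $0 \le s < n_i$ such that $n_i$ divides $sd$.

```python
from math import gcd


def exponent_set(n_i: int, d: int) -> list[int]:
    """Multiples of n_i / gcd(n_i, d) below n_i, i.e. the set E^(i) from the text."""
    g = gcd(n_i, d)
    step = n_i // g
    return [t * step for t in range(g)]


def exponent_set_brute_force(n_i: int, d: int) -> list[int]:
    """All s in [0, n_i) with n_i dividing s*d (direct check of the divisibility condition)."""
    return [s for s in range(n_i) if (s * d) % n_i == 0]


print(exponent_set(6, 4))  # [0, 3]
```

Only one gcd computation per index $i$ is needed, which is why this part of Step 1 is negligible in the overall run time.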

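By Remark 6.5.1, p. 121, this step is not necessary in the complex case.

It remains to compute a primitive element $\eta_i$ over $\mathbb{Q}$ for the extension $F^{(i)}$ and the minimal polynomial $p_i$ of $\eta_i$. Let

$$c = 2^{3\bar{L}};$$

we claim that

$$\eta_i = c\alpha + \sum_{j=1}^{i} \sqrt[d_j]{\rho_j}$$

is a primitive element for $F^{(i)}$ over $\mathbb{Q}$.

By Lemma 2.5, p. 17, it suffices to show $\sigma(\eta_i) \ne \tau(\eta_i)$ for any pair $(\sigma, \tau)$ of distinct embeddings of $\mathbb{Q}(\alpha, \sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_i]{\rho_i})$ over $\mathbb{Q}$.

First assume $\sigma(\alpha) = \tau(\alpha) = \alpha_m$. In this case

$$\sigma\Bigl(\sum_{j=1}^{i} \sqrt[d_j]{\rho_j}\Bigr) \ne \tau\Bigl(\sum_{j=1}^{i} \sqrt[d_j]{\rho_j}\Bigr)$$

has to be shown. Consider the fields

$$\sigma\bigl(\mathbb{Q}(\alpha, \sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_i]{\rho_i})\bigr) = \mathbb{Q}(\sigma(\alpha), \sigma(\sqrt[d_1]{\rho_1}), \ldots, \sigma(\sqrt[d_i]{\rho_i}))$$

and

$$\tau\bigl(\mathbb{Q}(\alpha, \sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_i]{\rho_i})\bigr) = \mathbb{Q}(\tau(\alpha), \tau(\sqrt[d_1]{\rho_1}), \ldots, \tau(\sqrt[d_i]{\rho_i})).$$

$\tau\sigma^{-1}$ is an isomorphism between these fields. Moreover, since $\sigma(\alpha) = \tau(\alpha)$ it is an isomorphism that leaves the element $\sigma(\alpha)$ fixed. Hence $\tau\sigma^{-1}$ is an embedding of $\mathbb{Q}(\sigma(\alpha), \sigma(\sqrt[d_1]{\rho_1}), \ldots, \sigma(\sqrt[d_i]{\rho_i}))$ over $\mathbb{Q}(\sigma(\alpha))$. Since $\sigma \ne \tau$ it is not the identity.

By Theorem 3.11, p. 31 (or Theorem 3.13, p. 32, in the complex case) and since the radicals $\sqrt[d_j]{\rho_j}$ are linearly independent over $\mathbb{Q}(\alpha)$, the sum $\sum_{j=1}^{i} \sqrt[d_j]{\rho_j}$ is a primitive element for $\mathbb{Q}(\alpha, \sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_i]{\rho_i})$. By isomorphism $\sigma\bigl(\sum_{j=1}^{i} \sqrt[d_j]{\rho_j}\bigr)$ is a primitive element for $\mathbb{Q}(\sigma(\alpha), \sigma(\sqrt[d_1]{\rho_1}), \ldots, \sigma(\sqrt[d_i]{\rho_i}))$ over $\mathbb{Q}(\sigma(\alpha))$. Hence the images of $\sigma\bigl(\sum_{j=1}^{i} \sqrt[d_j]{\rho_j}\bigr)$ under the different embeddings of $\mathbb{Q}(\sigma(\alpha), \sigma(\sqrt[d_1]{\rho_1}), \ldots, \sigma(\sqrt[d_i]{\rho_i}))$ over $\mathbb{Q}(\sigma(\alpha))$ are pairwise distinct.

However, $\sigma\bigl(\sum_{j=1}^{i} \sqrt[d_j]{\rho_j}\bigr) = \tau\bigl(\sum_{j=1}^{i} \sqrt[d_j]{\rho_j}\bigr)$ implies that the images of $\sigma\bigl(\sum_{j=1}^{i} \sqrt[d_j]{\rho_j}\bigr)$ under $\tau\sigma^{-1}$ and under the identity are equal. Hence

$$\sigma\Bigl(\sum_{j=1}^{i} \sqrt[d_j]{\rho_j}\Bigr) \ne \tau\Bigl(\sum_{j=1}^{i} \sqrt[d_j]{\rho_j}\Bigr),$$

which was to be shown.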

Therefore we may assume $\sigma(\alpha) = \alpha_k \ne \tau(\alpha) = \alpha_m$. In this case $\sigma(\eta_i) = \tau(\eta_i)$ implies

$$c = \frac{\sigma\bigl(\sum_{j=1}^{i} \sqrt[d_j]{\rho_j}\bigr) - \tau\bigl(\sum_{j=1}^{i} \sqrt[d_j]{\rho_j}\bigr)}{\alpha_m - \alpha_k}.$$

Since $\sigma(\sqrt[d_j]{\rho_j}), \tau(\sqrt[d_j]{\rho_j})$ are $d_j$-th roots of $\sigma(\rho_j)$ and $\tau(\rho_j)$, respectively, since $[\rho_j]_\infty < 2^{nl+L+2}$, and since $k \le \log N$, the numerator is bounded in absolute value by $2^{nl+L+\log\log N+3}$ (see Lemma 5.1.11, p. 53, for a bound on $[\rho_i]_\infty$ and hence a bound on $[\sqrt[d_i]{\rho_i}]_\infty$). By the root separation bound, Lemma 5.1.3, p. 49, the denominator is bounded in absolute value from below by $2^{-nl-n\log n}$. By choice of $c$ and the assumption $\bar{L} > \log N$ the equality above cannot hold. This proves that $\eta_i$ generates $F^{(i)}$ over $\mathbb{Q}$.
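The criterion of Lemma 2.5 used in this proof, namely that an element is primitive when its images under all embeddings are pairwise distinct, can be illustrated numerically in a toy case. The sketch assumes the base field $\mathbb{Q}$ and the radicals $\sqrt{2}, \sqrt{3}, \sqrt{5}$, for which the eight embeddings of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ are exactly the independent sign changes; it is a floating-point illustration, not a proof.

```python
from itertools import product
from math import sqrt

radicals = [sqrt(2), sqrt(3), sqrt(5)]

# Each embedding sends every square root to plus or minus itself, so the images of
# the sum sqrt(2) + sqrt(3) + sqrt(5) are the eight signed sums.
images = [
    sum(sign * r for sign, r in zip(signs, radicals))
    for signs in product((1, -1), repeat=len(radicals))
]

# Smallest distance between two distinct images; a positive gap means that the
# images are pairwise distinct, so the sum generates the whole field.
min_gap = min(abs(a - b) for i, a in enumerate(images) for b in images[i + 1:])
print(len(images), min_gap)
```

In the proof above the extra term $c\alpha$ plays the same role for embeddings that move $\alpha$: the constant $c$ is chosen large enough that differences of the radical sums can never cancel $c(\alpha_m - \alpha_k)$.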


The minimal polynomial $p_i$ is now easily computed as follows. By Lemma 5.1.7, p. 51, $[\eta_i]_\infty < 2^{3\bar{L}+l+1}$. The degree of the minimal polynomial $p_i$ of $\eta_i$ over $\mathbb{Q}$ is $nN_i$, a number we already know. Hence by Lemma 7.1.1 the length $|p_i|_2$ of $p_i$ is bounded by $2^{4nN_i\bar{L}}$ (which also proves the last claim of the lemma).

Using Schönhage's algorithm (see Theorem 5.2.7, p. 62) the polynomial $p_i$ can be computed using $O(n^4N_i^4\bar{L})$ elementary operations on integers of size $O(n^2N^2\bar{L})$ provided an approximation to $\eta_i$ with absolute error less than

$$\varepsilon = 2^{-6n^2N^2 - 12n^2N^2\bar{L}}$$

is given (footnote 20).

The number of operations in all applications of Schönhage's algorithm is bounded by $O(n^4N^4\bar{L})$, since

$$\sum_{i=1}^{k} N_i = N\left(1 + \frac{1}{n_k} + \frac{1}{n_k n_{k-1}} + \ldots + \frac{1}{n_k n_{k-1} \cdots n_2}\right) \le N \sum_{i=0}^{\log N} \frac{1}{2^i} \le 2N.$$
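The geometric-sum estimate above relies only on $n_j \ge 2$ for every remaining index. A quick check with assumed sample degrees (the list `n` below is hypothetical) computes the partial products $N_i = \prod_{j \le i} n_j$ and compares their sum with $2N$.

```python
from itertools import accumulate
from operator import mul


def partial_degrees(n):
    """Partial products N_i = n_1 * ... * n_i of the relative degrees."""
    return list(accumulate(n, mul))


n = [2, 3, 2, 5]              # sample relative degrees, each at least 2
N_i = partial_degrees(n)      # [2, 6, 12, 60]
N = N_i[-1]
total = sum(N_i)

print(N_i, total, 2 * N)      # [2, 6, 12, 60] 80 120
```

The same comparison holds for any list of degrees that are all at least 2, because each $N_{k-j}$ is at most $N / 2^j$.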

Since ηi = cα +∑ij=1

dj√ρj with c = 23L, the required approximations

to ηi for all i are easily computed from approximations to α and dj√ρj , j =

1, 2, . . . , i, with absolute error less than

ε2−(3L+1) > 2−(6n2N2+13n2N2L).

Theorem 5.4.1, p. 68 and Lemma 5.4.6, p. 71 (in the complex case Lemma5.4.9, p. 74, has to be used) imply that the appropriate approximations toα, ρj , j = 1, 2, . . . , k, can be determined by O(n) elementary operations onfloating-point numbers of size O(n3N2L) and O(logN log(n2N2L)) elemen-tary operations on floating-point numbers of size O(n2N2L). Collecting allrun times now proves the lemma.

Since the time required for Schönhage's algorithm is the dominating term we do not know how to improve asymptotically the run times of the previous lemma. However, from a practical point of view the first part can be made more efficient.

20 Recall $N_i \le N$ for all $i$ and that by the time we come to this step $N$ is already known.

The reader may wonder why we did not use one global approximation to $\alpha$, $\sqrt[d_j]{\rho_j}$ for the first part of the proof, rather than updating the approximations several times. The answer is that before we know $N$, which happens only at the very end of this part of Step 1, the best global bound on the approximations required is polynomial in $\prod_{i=1}^{k} d_i$. But the product may be exponential in $N$. This is obviously not good enough for our purposes. In the analysis of Step 2 and Step 3, however, we will use exactly the same strategy as in the second part of the proof above, that is, we will derive a global bound on the approximations required for the algorithms. The time needed for the approximation algorithms will be determined only at the very end of the analysis.

Remark 7.2.2 The first part of the previous lemma shows how to determine for a set of radicals over $\mathbb{Q}$ the degree of the extension generated by these radicals. But observe that in this case for the ratio tests the lattice basis reduction algorithm need not be applied. Theorem 4.1.3, p. 38, suffices. In particular, if the input set consists of $L$-bit radicals then the algorithm leading to this theorem has to be applied only to $O(LN \log N)$-bit radicals. This proves part of the run times stated in Corollary 4.2.4, p. 43. ♦

In Step 1 also the representation of $\gamma$ with respect to $\eta_k$ has to be determined.

Lemma 7.2.3 The representation of $\gamma$ as a linear combination over $\mathbb{Q}$ of powers of $\eta_k$ can be computed using $O(n^5N^5L + n^3N^3K)$ elementary operations on integers of size $O(n^3N^3L + nNK)$ plus the number of operations needed to compute an approximation to $\gamma$ with error $\varepsilon$ less than

\[ \varepsilon < 2^{-(46n^3N^3L+8nNK)}. \]

Within the same run times an integer $C < 2^{10n^2N^2L+2K}$ is determined such that $C\gamma$ is an algebraic integer.

Proof: By assumption, an integer $C < 2^K$ exists such that $C\gamma$ is an algebraic integer. Clearly, $[C\gamma]_\infty < 2^{2K}$.

By the previous lemma the minimal polynomial $p_k$ of $\eta_k$ over $\mathbb{Q}$ has degree $nN$ and length $|p_k|_2 < 2^{4nNL}$. By Lemma 5.1.8, p. 51, the representation size of $C\gamma$ with respect to $\eta_k$ is bounded by $2^{2nN\log nN+8n^2N^2L+2K} < 2^{10n^2N^2L+2K}$.


Since $C\gamma$ is an algebraic integer the denominator of the representation of $C\gamma$ is bounded by $|\Delta_k|$, the discriminant of $\eta_k$ (see Lemma 2.10, p. 22), and by Lemma 5.1.5, p. 50, $|\Delta_k| < 2^{nN\log nN+4n^2N^2L}$. Hence the representation size of $\gamma = \frac{1}{C}C\gamma$ is bounded by $2^{2nN\log nN+8n^2N^2L+2K} < 2^{10n^2N^2L+2K}$, too.

By Theorem 5.2.6, p. 61, and by the bound on $|p_k|_2$ derived in the previous lemma, given an approximation to $\gamma$ with absolute error less than

\[ 2^{-(2n^2N^2+7nN+nN\log nN+4n^2N^2L+40n^3N^3L+8nNK)} > 2^{-(46n^3N^3L+8nNK)} > \varepsilon \]

its representation can be determined using $O(n^5N^5L+n^3N^3K)$ elementary operations on integers of size $O(n^3N^3L+nNK)$.

As the integer $C$ we choose the denominator of the representation determined.

By the bound on the representation size of $\gamma$ the approximation given in the proof suffices to determine in the real case whether $\gamma$ is positive. This justifies the input assumption $\sqrt[d]{\gamma} \in \mathbb{R}$ in the real case.

By assumption an integer $C$ less than $2^K$ such that $C\gamma$ is an algebraic integer exists, but we may not be able to determine it since we have no exact description of the ring of integers of $F^{(k)}$; instead we only know the superset $\frac{1}{\Delta_k}\mathbb{Z}[\eta_k]$.

Remark 7.2.4 By the bound given on $K$ and by Lemma 7.1.2, if $\gamma$ is a polynomial expression in ratios of sums of radicals as described in Section 7.1, then the approximation can be determined using $O(n)$ elementary operations on floating-point numbers of size $O(n^4N^3L + m^3n^3N^3L)$ and $O(N(\log(NL) + m^2n))$ operations on floating-point numbers of size $O(n^3N^3L+m^3n^2N^3L)$. ♦


7.3 Description and Analysis of Step 2

In Step 2 we have to compute for each element $\gamma^{(i)}$ in the sets $D^{(i)}$ (footnote 21) the inverse of one element denesting it. By Remark 6.5.1, p. 121, in the complex case $D^{(i)}$ contains only one element $\gamma^{(i)}$.

For an element $\gamma^{(i)}$ the procedure that determines one inverse of an element in $F^{(i-1)}$ denesting it has two phases. In the first phase a superset of its set of normalized admissible sequences is computed. In the second phase, until we are successful or all sequences have been tested, we check for each sequence whether the formula in Lemma 6.3.2, p. 109, corresponding to $\gamma^{(i)}$, the basis $1, \sqrt[d_i]{\rho_i}, \dots, \sqrt[d_i]{\rho_i}^{\,n_i-1}$, and the admissible sequence of $d$-th roots of unity leads to the inverse of an element denesting $\gamma^{(i)}$ over $F^{(i-1)}$.

We begin with the second phase and since we want to apply the algorithm of Theorem 5.2.6, p. 61, we need to bound the coefficient size of the elements in the sets $D^{(i)}$. By Lemma 5.1.8, p. 51, the following lemma suffices to deduce such a bound.

Lemma 7.3.1 Let $F, \sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \dots, \sqrt[d_k]{\rho_k}, \sqrt[d]{\gamma}, F^{(i)}, i = 0, 1, \dots, k$, be as before. Any element $\gamma^{(i)}$ in a set $D^{(i)}$ satisfies

\[ [\gamma^{(i)}]_\infty \le (n_kn_{k-1}\cdots n_{i+1})^d\, 2^{2d(k-i)L+3K+10n^2N^2L}. \]

Moreover, any such $\gamma^{(i)}$ is an algebraic integer.

Proof: We will prove the bounds by induction on $k - i$. For $k - i = 0$ (or $i = k$) the bound is true since in Step 1 an integer $C < 2^{10n^2N^2L+2K}$ is determined such that $C\gamma = \gamma^{(k)}$ is an algebraic integer (see Lemma 7.2.3) and $D^{(k)} = \{\gamma^{(k)}\}$.

Assume that each element $\gamma^{(i)} \in D^{(i)}$ satisfies

\[ [\gamma^{(i)}]_\infty \le (n_kn_{k-1}\cdots n_{i+1})^d\, 2^{2d(k-i)L+3K+10n^2N^2L} \]

and is an algebraic integer. By the construction in the Algorithm Denesting Element any $\gamma^{(i-1)} \in D^{(i-1)}$ can be written as

\[ \left( \sum_{j=0}^{n_i-1} \zeta^{(j)} \sigma_j\!\left(\sqrt[d_i]{\rho_i}^{\,r}\right) \sqrt[d]{\sigma_j(\gamma^{(i)})} \right)^{d} \sqrt[d_i]{\rho_i}^{\,sd} \]

for $r, s \in \{0, 1, \dots, n_i - 1\}$, $\gamma^{(i)} \in D^{(i)}$, $\zeta^{(j)}$ a $d$-th root of unity. The mappings $\sigma_j$ are the different embeddings of $F^{(i)}$ over $F^{(i-1)}$. The factor $\sqrt[d_i]{\rho_i}^{\,sd}$ will not occur in the complex case since in this case $D^{(i)}$ consists of a single element (see again Remark 6.5.1).

21 For a precise description of the sets $D^{(i)}$ we refer to the description of the denesting algorithms at the end of Section 6.

As mentioned in the proof of Lemma 5.3.1, p. 63,

\[ [\sqrt[d]{\sigma_j(\gamma^{(i)})}]_\infty \le [\gamma^{(i)}]_\infty^{1/d}, \quad j = 0, 1, \dots, n_i - 1, \]

and similarly $[\sqrt[d_i]{\sigma_j(\rho_i)}]_\infty < [\rho_i]_\infty^{1/d_i}$. Since $[\rho_i] < 2^L$ and therefore $[\rho_i]_\infty < 2^L$ (Lemma 5.1.11, p. 53) the latter implies

\[ [\sqrt[d_i]{\sigma_j(\rho_i)}]_\infty < [\rho_i]_\infty^{1/d_i} < 2^{\frac{1}{d_i}L}. \]

Combining these estimates with Lemma 5.1.7, p. 51, shows

\[ [\gamma^{(i-1)}]_\infty \le \left( \sum_{j=0}^{n_i-1} [\sqrt[d_i]{\rho_i}]_\infty^{r}\, [\gamma^{(i)}]_\infty^{1/d} \right)^{d} [\sqrt[d_i]{\rho_i}^{\,sd}]_\infty \le (n_kn_{k-1}\cdots n_i)^d\, 2^{2d(k-i+1)L+3K+10n^2N^2L}, \]

which is also correct if the factor $[\sqrt[d_i]{\rho_i}^{\,sd}]_\infty$ is missing.

Moreover, $\gamma^{(i)}$ is an algebraic integer, hence all its conjugates and their roots are. By the input assumptions the radicals $\sqrt[d_i]{\rho_i}$ are algebraic integers. Since the algebraic integers form a ring this shows that $\gamma^{(i-1)}$ is also an integer, which proves the lemma.

Corollary 7.3.2 The infinity norm $[\gamma^{(i)}]_\infty$ of an element $\gamma^{(i)}$ in a set $D^{(i)}$ satisfies

\[ [\gamma^{(i)}]_\infty < 2^{10n^2N^2L+3dL\log N+3K}. \]

The coefficient size $[\gamma^{(i)}]$ with respect to the primitive element $\eta_i$ (footnote 22) satisfies

\[ [\gamma^{(i)}] < 2^{20n^2N^2L+3dL\log N+3K}. \]

22 $\eta_0$ is defined to be $\alpha$.


As usual the bounds on the coefficient size follow from Lemma 5.1.8, p. 51. Recall that the degree of $F^{(i)}$ over $\mathbb{Q}$ is $nN_i \le nN$ and that the length of the minimal polynomial $p_i$ of $\eta_i$ is bounded by $2^{4nNL}$.

For the sake of simplicity let us introduce some additional notation.

Definition 7.3.3 Denote $20n^2N^2L + 3dL\log N + 3K$ by $B$. Hence $2^B$ is a global bound for the infinity norm and the coefficient size with respect to $\eta_i$ of elements in the sets $D^{(i)}$.

Using the notation from Definition 7.3.3 we get the following lemma, which partially analyzes Step 2.

Lemma 7.3.4 Assume that $\gamma^{(i)}$ is an element of $D^{(i)}$ which is represented as a linear combination of powers of $\eta_i$. Furthermore assume that approximations to $\alpha$, $\sqrt[d_j]{\rho_j}$, a primitive $d$-th root of unity $\zeta_d$, and a primitive $n_i$-th root of unity $\zeta_{n_i}$ with absolute error less than

\[ \varepsilon < 2^{-12n^2N^2B} \]

are given. By $\sigma_j$ denote the different field embeddings of $F^{(i)}$ over $F^{(i-1)}$, $j = 0, 1, \dots, n_i - 1$. For any sequence $(1, \zeta^{(1)}, \dots, \zeta^{(n_i-1)})$ of $d$-th roots of unity and any $r \in \{0, 1, \dots, n_i - 1\}$, using $O((nN)^3B)$ elementary operations on integers of size $O(nNB)$ and $O(nN^2 + N\log\log\frac{1}{\varepsilon})$ elementary operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon})$ an element $\gamma^{(i-1)} \in F^{(i-1)}$ can be determined such that if

\[ \left( \sum_{j=0}^{n_i-1} \zeta^{(j)} \sigma_j\!\left(\sqrt[d_i]{\rho_i}^{\,r}\right) \sqrt[d]{\sigma_j(\gamma^{(i)})} \right)^{d} \in F^{(i-1)} \]

then it must be $\gamma^{(i-1)}$. $\gamma^{(i-1)}$ is represented as a linear combination of powers of $\eta_{i-1}$.

Using additional $O(n^2N^2\log d)$ elementary operations on integers of size $O(dB)$ it can be decided whether the inverse of this element denests $\sqrt[d]{\gamma^{(i)}}$.

Finally, within the previous run times the representations with respect to $\eta_{i-1}$ of the multiples of $\gamma^{(i-1)}$ with all elements in $\{\sqrt[d_i]{\rho_i}^{\,sd} \mid s \in E^{(i)}\}$ can be determined.

Proof: By the previous bounds, if the complex number corresponds to an element $\gamma^{(i-1)} \in F^{(i-1)}$ then the coefficient size of $\gamma^{(i-1)}$ is bounded by $2^B$. By the bounds on $p_i$ and by Theorem 5.2.6, p. 61, given an approximation to this number with absolute error less than (observe that $N \ge 2$ and that $B$ contains a term quadratic in $N$)

\[ 2^{-(2n^2N^2+7nN+nN\log nN+4n^2N^2L+4nNB)} > 2^{-5nNB}, \]

the exact representation of $\gamma^{(i-1)}$ can be reconstructed using $O((nN)^3B)$ elementary operations on integers of size $O(nNB)$.

Next we show how to obtain the required approximation from the initial approximations.

By Theorem 3.9, p. 28, we can assume that $\sigma_j$ is defined via $\sigma_j(\sqrt[d_i]{\rho_i}) = \zeta_{n_i}^j \sqrt[d_i]{\rho_i}$. Then $\sigma_j(\eta_i) = c\alpha + \sqrt[d_1]{\rho_1} + \dots + \zeta_{n_i}^j \sqrt[d_i]{\rho_i}$, with $c = 2^{3L}$ (for the value of $c$ see the proof of Lemma 7.2.1).

Now the approximation to $\alpha$ leads to an approximation to $c\alpha$ with absolute error less than $2^{-12n^2N^2B+3L}$ since $c$ is an integer known exactly. Next the approximations to $\sqrt[d_i]{\rho_i}$ and $\zeta_{n_i}$ lead to an approximation to $\zeta_{n_i}^j \sqrt[d_i]{\rho_i}$ with absolute error less than $2^{-12n^2N^2B+L+2}$ since $|\sqrt[d_i]{\rho_i}| \le 2^L$, $|\zeta_{n_i}| = 1$, $\log N < L$. Hence adding the approximations for $c\alpha$, $\zeta_{n_i}^j \sqrt[d_i]{\rho_i}$, and $\sqrt[d_j]{\rho_j}$, $j \ne i$, yields an approximation to $\sigma_j(\eta_i)$ with absolute error less than $2^{-12n^2N^2B+3L+\log N}$. The number of steps used to compute the intermediate approximations for all $j = 0, 1, \dots, n_i - 1$, is bounded by $O(n_i + \log N) \in O(N)$.

Since $nN$ is an upper bound on the degree $nN_i$ of $F^{(i)}$ over $\mathbb{Q}$, $\sigma_j(\gamma^{(i)}) = \frac{1}{f}\sum_{h=0}^{nN_i-1} f_h \sigma_j(\eta_i)^h$, $f, f_h \in \mathbb{Z}$, $j = 0, \dots, n_i - 1$.

As observed in the proof of Lemma 7.2.1, $[\eta_i]_\infty < 2^{3L+l+1}$. Hence (apply Lemma 5.4.3, p. 69, part 2, with $\alpha = \sigma_j(\eta_i)$, $L = B$, and $n = nN$) given the approximation to $\sigma_j(\eta_i)$ with error less than $2^{-12n^2N^2B+3L+\log N}$, the sum $\sum_{h=0}^{nN_i-1} f_h \sigma_j(\eta_i)^h$ is approximated with absolute error less than

\[ 2^{-12n^2N^2B+3L+\log N+6nNL+2nNl+4nN+B}. \]

Moreover if $\frac{1}{f}$ is approximated with absolute error less than $\varepsilon$ then $\sigma_j(\gamma^{(i)}) = \frac{1}{f}\sum_{h=0}^{nN_i-1} f_h \sigma_j(\eta_i)^h$ itself is approximated with absolute error less than

\[ 2^{-12n^2N^2B+3L+\log N+6nNL+2nNl+4nN+B+2} < 2^{-11n^2N^2B}. \]

Since $nN$ is an upper bound on the degree $nN_i$ of $F^{(i)}$ over $\mathbb{Q}$, $\gamma^{(i)} = \frac{1}{f}\sum_{h=0}^{nN_i-1} f_h \eta_i^h$, $f, f_h \in \mathbb{Z}$, and $n_i \le N$ numbers $\sigma_j(\gamma^{(i)})$, $j = 0, 1, \dots, n_i - 1$, have to be approximated, $O(nN^2)$ elementary operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon})$ are needed to approximate $\sigma_j(\gamma^{(i)})$, $j = 0, 1, \dots, n_i - 1$.


$\gamma^{(i)}$ and $\sigma_j(\gamma^{(i)})$, $j = 0, 1, \dots, n_i - 1$, have the same minimal polynomial over $\mathbb{Q}$, which follows from the fact that $\sigma_j$ leaves elements in $F^{(i-1)}$ and hence in $\mathbb{Q}$ fixed. Since $[\gamma^{(i)}]_\infty < 2^B$, Lemma 7.1.1 implies that the minimal polynomial of $\sigma_j(\gamma^{(i)})$ has length less than $2^{nN(B+1)}$ for all $j = 0, 1, \dots, n_i - 1$. Hence the imaginary part $\Im(\sigma_j(\gamma^{(i)}))$ is upper bounded by $2^{nN(B+1)}$ (see Landau's bound, Lemma 5.1.2, p. 48) and the real part $\Re(\sigma_j(\gamma^{(i)}))$ of $\sigma_j(\gamma^{(i)})$, if non-zero, is bounded from below by $2^{-2n^2N^2(B+1)}$ (Lemma 5.4.8, p. 74).

By Lemma 5.4.11, p. 77, $O(N\log\log\frac{1}{\varepsilon})$ operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon})$ suffice to compute approximations to $\sqrt[d]{\sigma_j(\gamma^{(i)})}$, $j = 0, 1, \dots, n_i - 1$, with absolute error less than $2^{-5n^2N^2B}$ from the approximations to $\sigma_j(\gamma^{(i)})$. For the estimate on the error again $N \ge 2$ and the fact that $B$ contains a term quadratic in $N$ are used (footnote 23).

Using the estimates of Lemma 5.4.3, p. 69, once more, it can easily be shown that the approximations to $\sqrt[d]{\sigma_j(\gamma^{(i)})}$, to $\zeta$, to $\zeta_{n_i}$, and to $\sqrt[d_i]{\rho_i}$ lead to an approximation to

\[ \left( \sum_{j=0}^{n_i-1} \zeta^{(j)} \sigma_j\!\left(\sqrt[d_i]{\rho_i}^{\,r}\right) \sqrt[d]{\sigma_j(\gamma^{(i)})} \right)^{d} \]

with absolute error less than $2^{-3n^2N^2B} < 2^{-5nNB}$ as required by the reconstruction algorithm. Obviously, to compute this approximation from the previous ones takes only $O(n_i + \log d)$ elementary operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon})$, which is dominated by the $O(nN^2 + N\log\log\frac{1}{\varepsilon})$ operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon})$ we used already previously (observe that $B$ and hence $\log\frac{1}{\varepsilon}$ contains a term linear in $d$).

If the reconstruction algorithm returns the element $\gamma^{(i-1)}$ in $F^{(i-1)}$, then we still have to determine whether its inverse denests $\sqrt[d]{\gamma^{(i)}}$. Equivalently, we have to check whether

\[ \sqrt[d]{\gamma^{(i-1)}}^{\,d-1} \sqrt[d]{\gamma^{(i)}} \in F^{(i)}. \]

But this is exactly the kind of problem we solved in Section 5 (Theorem 5.6.5, p. 92) since $(\gamma^{(i-1)})^{d-1}\gamma^{(i)} \in F^{(i)}$. It follows from Corollary 7.3.2 that the approximation and reconstruction step can be done within the same time bounds as the previous step. In particular, due to our choice of $B$ an element in $F^{(i)}$ corresponding to $\sqrt[d]{\gamma^{(i-1)}}^{\,d-1} \sqrt[d]{\gamma^{(i)}}$ has coefficient size at most $2^B$ and our approximations to $\alpha$, $\sqrt[d_i]{\rho_i}$, $\zeta_d$, $\zeta_{n_i}$ suffice to compute an approximation to $\sqrt[d]{\gamma^{(i-1)}}^{\,d-1} \sqrt[d]{\gamma^{(i)}}$ as required by the reconstruction step.

23 If the real part of $\sigma_j(\gamma^{(i)})$ is zero, Lemma 5.4.6, p. 71, instead of Lemma 5.4.11, p. 77, can be used.


On the other hand, for the deterministic test whether the element in $F^{(i)}$ determined by the reconstruction step is identical to $\sqrt[d]{\gamma^{(i-1)}}^{\,d-1} \sqrt[d]{\gamma^{(i)}}$ we compute the $d$-th power of the element in $F^{(i)}$ and compare it to $(\gamma^{(i-1)})^{d-1}\gamma^{(i)}$. By Lemma 5.5.5, p. 83, for this step we have to spend $O(n^2N^2\log d)$ elementary operations (footnote 24) on integers of size $O(dB)$, which is not covered by the previous run times.

But for this test we need to represent $(\gamma^{(i-1)})^{d-1}\gamma^{(i)}$ as an element in $F^{(i)}$. We therefore compute the representation of $\gamma^{(i-1)}$ as an element of $F^{(i)}$, raise it to the $(d-1)$-st power by exact arithmetic in $F^{(i)}$, and multiply it with $\gamma^{(i)}$. By definition of $B$, Corollary 7.3.2, and by Lemma 5.5.2, p. 82, and Lemma 5.5.5, p. 83, the time needed for this step is dominated by the previous run times.

The last statement of the lemma follows from the fact that by the results of Section 5, in particular, Theorem 5.2.6, p. 61, the representation of $\sqrt[d_i]{\rho_i}^{\,jd}$ for the smallest non-zero $j \in E^{(i)}$ can be computed within the time bounds of the lemma. Recall from the proof of Lemma 7.3.1 that $[\sqrt[d_i]{\rho_i}^{\,jd}]_\infty < 2^B$ and hence for this reconstruction step the same analysis as for the reconstruction step for $\gamma^{(i-1)}$ applies. Then the powers of this element and their multiples with the element determined above are computed by exact arithmetic in $\mathbb{Q}(\alpha)$. Since in each step the coefficient size is bounded by $2^B$ (see again the proof of Lemma 7.3.1 and Corollary 7.3.2, Definition 7.3.3) the claim follows from Lemma 5.5.2, p. 82.

For $i < k$ in the overall Algorithm Denesting Element the assumption that $\gamma^{(i)}$ is represented as a linear combination of powers of $\eta_i$ is justified by the lemma itself. In fact, $\gamma^{(i)}$ itself will be computed by the procedure leading to Lemma 7.3.4 and hence it will be represented as a linear combination of powers of $\eta_i$ as is required on the next level of the Algorithm Denesting Element for the computation of $\gamma^{(i-1)}$. For the case $i = k$ the assumption that $\gamma^{(k)}$ is represented as a linear combination of powers of $\eta_k$ is justified by Lemma 7.2.3. Also observe that by this lemma the coefficient size of $\gamma$ and of $C\gamma = \gamma^{(k)}$ is bounded by $2^B$.

Remark 7.3.5 The procedure of the proof of Lemma 7.3.4 shows how to determine denesting elements for $\sqrt[d]{\gamma}$ in case $\gamma$ is an element of an arbitrary algebraic number field (recall Definition 6.2.1, p. 105). However, in this general setting we do not know how to determine efficiently a small superset of the set of admissible sequences for $\sqrt[d]{\gamma}$. Trying all possible sequences of $d$-th roots of unity leads to an algorithm whose run time is polynomial in $2^{N\log d}$ ($N$ is the degree of the extension containing $\gamma$ in this case) and the remaining input parameters. But even this improves an algorithm of Landau (see [La3]) that achieves a run time that is polynomial only in $2^{Nd}$. ♦

24 Recall again that $nN$ is an upper bound on the degree of $F^{(i)}$ over $\mathbb{Q}$.

By Lemma 6.3.2, p. 109, applying the previous lemma to all sequences contained in a superset for the normalized admissible sequences of an element $\gamma^{(i)} \in D^{(i)}$ over the field $F^{(i-1)}$ shows how much time is needed to determine a single denesting element and the corresponding denesting set $D_{\gamma^{(i)}}$ for $\gamma^{(i)}$ over $F^{(i-1)}$. Next we show how to compute efficiently for each element in the sets $D^{(i)}$ a superset of its set of admissible sequences. In the real case the following result is used.

Lemma 7.3.6 Let $\zeta_{n_i}$ be a primitive $n_i$-th root of unity. Assume that $\alpha$, $\sqrt[d_i]{\rho_i}$, $\zeta_{n_i}$ are approximated with absolute error less than

\[ \varepsilon < 2^{-(6(n^2N^4+3n^2N^4L)+5L+1)}. \]

In the real case primitive elements and their minimal polynomials over $\mathbb{Q}$ for the fields $F^{(i)}(\zeta_{n_i})$, $i = 1, 2, \dots, k$, can be computed using $O(n^4N^8L)$ elementary operations on integers of size $O(n^2N^4L)$ and $O(\log N)$ operations on floating-point numbers of size $O(n^2N^4L)$.

The minimal polynomials have degree less than $nN^2$ and length bounded by $2^{6nN^2L}$.

Proof: We analyze the time needed to compute the primitive element for one extension $F^{(i)}(\zeta_{n_i})$.

Recall that if $\varphi$ denotes Euler's $\varphi$-function then the degree of $\mathbb{Q}(\zeta_{n_i})$ over $\mathbb{Q}$ is $\varphi(n_i) < n_i$. Hence the degree of the extension $F^{(i)}(\zeta_{n_i})$ over $\mathbb{Q}$ is at most $nN_i\varphi(n_i)$, which is bounded by $nN_i^2$ and $nN^2$. The last bound may be rather crude in general, but for $k = 1$ it is almost tight.

By the Primitive Element Theorem 2.6, p. 17, for any integer $c' \ne 0$ satisfying

\[ c' \ne \frac{\eta_i^{(h)} - \eta_i^{(j)}}{\zeta_{n_i}^u - \zeta_{n_i}^v}, \]

for different powers $\zeta_{n_i}^u, \zeta_{n_i}^v$ of $\zeta_{n_i}$, and conjugates $\eta_i^{(j)}, \eta_i^{(h)}$ of $\eta_i$, the element $\eta_i + c'\zeta_{n_i}$ will generate $F^{(i)}(\zeta_{n_i})$.


The denominator of the ratio is bounded from below by $2^{-\log N+\log\pi}$ since this is a lower bound for the distance between two vertices in a regular $N$-gon whose circumcircle has radius 1. Using this bound, the assumption $L > \log N$, and $[\eta_i]_\infty < 2^{4L}$ it can be shown that for $c' = 2^{5L}$ the element $\eta_i + c'\zeta_{n_i}$ is a primitive element. The bound on the length of the minimal polynomial is immediate from Lemma 7.1.1, p. 130, using $[\eta_i + c'\zeta_{n_i}]_\infty < 2^{4L} + 2^{5L} < 2^{5L+1}$.
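The geometric fact behind the lower bound $2^{-\log N+\log\pi} = \pi/N$ can be checked numerically. The sketch below is an illustration only; the function name `min_vertex_distance` is an assumption made here. It uses that the smallest distance between two distinct vertices of a regular $N$-gon inscribed in the unit circle is the side length $2\sin(\pi/N)$.

```python
import math

def min_vertex_distance(N):
    """Side length of a regular N-gon inscribed in the unit circle (N >= 2).

    This is the smallest distance between two distinct N-th roots of unity.
    """
    if N < 2:
        raise ValueError("need at least two vertices")
    return 2 * math.sin(math.pi / N)

# The bound pi/N used in the text is never violated for these N.
for N in range(2, 2000):
    assert min_vertex_distance(N) >= math.pi / N
```

Since $n_i \le N$, the vertices of the $n_i$-gon formed by the powers of $\zeta_{n_i}$ are at least as far apart as those of the $N$-gon, so the same bound applies to $|\zeta_{n_i}^u - \zeta_{n_i}^v|$.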

Moreover, applying Schönhage's reconstruction algorithm from Theorem 5.2.7, p. 62, shows that the minimal polynomial for $\eta_i + c'\zeta_{n_i}$ can be computed using $O(n^4N_i^8L)$ elementary operations on integers of size $O(n^2N^4L)$ provided an approximation to $\eta_i + c'\zeta_{n_i}$ with absolute error less than $2^{-6(n^2N^4+3n^2N^4L)}$ exists. Such an approximation can be computed easily from the given approximations by $O(\log N)$ elementary operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon})$.

To obtain the run time for computing all primitive elements we just have to sum up the number of elementary operations. As used already previously $\sum_{i=1}^{k} N_i \le 2N$ and the number of operations is bounded by $O(n^4N^8L)$.

In the complex case $F^{(i)}(\zeta_{n_i}) = F^{(i)}$ since $n_i$ divides $d_i$, hence it divides $d$, which implies that $F$ contains with a primitive $d$-th root of unity also a primitive $n_i$-th root of unity.

The next lemma is the algorithmic version of Lemma 6.4.3, p. 113.

Lemma 7.3.7 Assume that $\alpha$, $\sqrt[d_i]{\rho_i}$, $\zeta_{n_i}$, and a primitive $d$-th root of unity $\zeta_d$ are approximated with absolute error less than

\[ \varepsilon < 2^{-(60n^3N^6L+13n^2N^2B)}. \]

In the real case, if a primitive element for $F^{(i)}(\zeta_{n_i})$ has already been computed then for any $\sqrt[d]{\gamma^{(i)}}$, $\gamma^{(i)} \in D^{(i)}$, a superset of size $d^3$ of its set of normalized admissible sequences over $F^{(i-1)}$ can be determined using $O(d^2(n^5N^{10}L + n^3N^6B))$ elementary operations on integers of size $O(n^3N^6L + nN^2B)$ and $O(d^2nN^3 + N\log\log\frac{1}{\varepsilon})$ elementary operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon})$.

It may look strange that we compute $d^3$ sequences by an algorithm whose dependence on $d$ is only quadratic. But we can represent the sequences in a more efficient form than just listing them all. For details see the following proof.


Proof: By Lemma 6.4.3, p. 113, any normalized admissible sequence $(1, \zeta^{(1)}, \zeta^{(2)}, \dots, \zeta^{(n_i-1)})$ is already determined by its prefix $(1, \zeta^{(1)}, \zeta^{(2)}, \zeta^{(3)})$. Moreover, in this situation

\[ \zeta^{(2)} \sqrt[d]{\gamma^{(i)}}^{\,d-1} \sqrt[d]{\sigma_2(\gamma^{(i)})} = \rho \in F^{(i)}(\zeta_{n_i}), \]

and

\[ (\zeta^{(2(m-1))})^{d-1} \zeta^{(2m)} \sqrt[d]{\sigma_{2(m-1)}(\gamma^{(i)})}^{\,d-1} \sqrt[d]{\sigma_{2m}(\gamma^{(i)})} = \tau^{(m-1)}(\rho) \in F^{(i)}(\zeta_{n_i}), \]

where $\tau$ is the field embedding of $F^{(i)}(\zeta_{n_i})$ over $F^{(i-1)}(\zeta_{n_i})$ with $\tau(\sqrt[d_i]{\rho_i}) = \zeta_{n_i}^2 \sqrt[d_i]{\rho_i}$. $\tau^{(m-1)}(\rho)$ denotes $\underbrace{\tau(\tau(\dots(\tau}_{(m-1)\text{ times}}(\rho))\dots))$, which makes sense since $\tau$ is an automorphism of $F^{(i)}(\zeta_{n_i})$ over $F^{(i-1)}(\zeta_{n_i})$.

Analogously, for the odd indices we get

\[ (\zeta^{(1)})^{d-1} \zeta^{(3)} \sqrt[d]{\sigma_1(\gamma^{(i)})}^{\,d-1} \sqrt[d]{\sigma_3(\gamma^{(i)})} = \rho' \in F^{(i)}(\zeta_{n_i}), \]

and

\[ (\zeta^{(2m-1)})^{d-1} \zeta^{(2m+1)} \sqrt[d]{\sigma_{2m-1}(\gamma^{(i)})}^{\,d-1} \sqrt[d]{\sigma_{2m+1}(\gamma^{(i)})} = \tau^{(m-1)}(\rho') \in F^{(i)}(\zeta_{n_i}). \]

Here as throughout the proof $\sigma_j$ denotes the embedding of $F^{(i)}$ over $F^{(i-1)}$ determined by $\sigma_j(\sqrt[d_i]{\rho_i}) = \zeta_{n_i}^j \sqrt[d_i]{\rho_i}$.

Observe that the subsequences for the even and the subsequences for the odd indices are independent of one another. Hence we will determine the set of sequences for the even and the odd indices separately. The set of admissible sequences is obtained by mixing these two sets of sequences in all possible ways.

For the even indices the following algorithm is used.

For all $d$-th roots of unity $\zeta^{(2)}$ do the following:

Use the algorithm of Theorem 5.2.6, p. 61, to compute an element $\rho$ in $F^{(i)}(\zeta_{n_i})$ such that if $\zeta^{(2)} \sqrt[d]{\gamma^{(i)}}^{\,d-1} \sqrt[d]{\sigma_2(\gamma^{(i)})} \in F^{(i)}(\zeta_{n_i})$ then it must be $\rho$. If no such $\rho$ exists proceed with the next $d$-th root of unity, since no admissible sequence with $\zeta^{(2)}$ in the third position can exist.


Using the formula

\[ \zeta^{(2m)} = \frac{\tau^{(m-1)}(\rho)}{\sqrt[d]{\sigma_{2(m-1)}(\gamma^{(i)})}^{\,d-1} \sqrt[d]{\sigma_{2m}(\gamma^{(i)})}}\, \zeta^{(2(m-1))}, \]

determine for $m = 1, 2, \dots, \lfloor \frac{n_i}{2} \rfloor$, by a bit comparison test with all $d$-th roots of unity the elements $\zeta^{(2m)}$ in the admissible sequences containing $\zeta^{(2)}$. If for some $m$ no $d$-th root of unity satisfying the equation can be found proceed with the next possible value for $\zeta^{(2)}$.

For each tuple of $d$-th roots of unity $(\zeta^{(1)}, \zeta^{(3)})$ the roots of unity $\zeta^{(2m-1)}$ corresponding to those admissible sequences with prefix $(1, \zeta^{(1)}, \zeta^{(2)}, \zeta^{(3)})$ for some $d$-th roots of unity are computed in exactly the same way except that we start with the equation

\[ (\zeta^{(1)})^{d-1} \zeta^{(3)} \sqrt[d]{\sigma_1(\gamma^{(i)})}^{\,d-1} \sqrt[d]{\sigma_3(\gamma^{(i)})} = \rho', \]

for some $\rho' \in F^{(i)}(\zeta_{n_i})$.

It remains to analyze the run time of this algorithm. First observe that although there are $d^3$ triples of roots of unity determining normalized admissible sequences, in our approach we have to compute only $d + d^2$ representations of elements in $F^{(i)}(\zeta_{n_i})$.
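The compact representation described here (at most $d$ even-index subsequences and at most $d^2$ odd-index subsequences standing for up to $d^3$ full sequences) can be sketched in Python. Everything in the sketch is illustrative: the names `interleave` and `candidate_sequences`, the encoding of a $d$-th root of unity $\zeta_d^e$ by its exponent $e$, and the sample data are assumptions made here. In the actual algorithm the subsequences come from the reconstruction steps.

```python
from itertools import product

def interleave(even_tail, odd_seq):
    """Build (e_0, e_1, ..., e_{n-1}) with e_0 = 0 from the entries at the
    even indices 2, 4, ... and the entries at the odd indices 1, 3, ..."""
    seq = [0]  # exponent 0 encodes the leading root of unity 1
    for index in range(1, len(even_tail) + len(odd_seq) + 1):
        if index % 2 == 1:
            seq.append(odd_seq[index // 2])
        else:
            seq.append(even_tail[index // 2 - 1])
    return tuple(seq)

def candidate_sequences(even_tails, odd_seqs):
    """Lazily yield every combination of an even and an odd subsequence."""
    for even_tail, odd_seq in product(even_tails, odd_seqs):
        yield interleave(even_tail, odd_seq)

d = 3
# Made-up data for sequences of length 5: one even subsequence per choice of
# zeta^(2), one odd subsequence per choice of the pair (zeta^(1), zeta^(3)).
even_tails = [(a, (2 * a) % d) for a in range(d)]
odd_seqs = [(b, c) for b in range(d) for c in range(d)]
stored = len(even_tails) + len(odd_seqs)
represented = sum(1 for _ in candidate_sequences(even_tails, odd_seqs))
print(stored, represented)
```

With these sample values only $d + d^2 = 12$ subsequences are stored, while the generator enumerates all $d^3 = 27$ combinations on demand.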

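For each pair $\sigma_m, \sigma_{m+1}$

\[ [\zeta^{(m)}\zeta^{(m+1)} \sqrt[d]{\sigma_m(\gamma^{(i)})}^{\,d-1} \sqrt[d]{\sigma_{m+1}(\gamma^{(i)})}]_\infty < 2^B, \]

by definition of $B$. Hence if $\zeta^{(2)} \sqrt[d]{\gamma^{(i)}}^{\,d-1} \sqrt[d]{\sigma_2(\gamma^{(i)})} = \rho \in F^{(i)}(\zeta_{n_i})$ then (see Lemma 5.1.8, p. 51, recall that $nN^2$ is a bound on the degree of $F^{(i)}(\zeta_{n_i})$ over $\mathbb{Q}$ and recall also the bound on the minimal polynomial of a primitive element of $F^{(i)}(\zeta_{n_i})$ given in the previous lemma)

\[ [\rho] < 2^{2nN^2\log nN^2+12n^2N^4L+B} < 2^{14n^2N^4L+B}. \]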

By Theorem 5.2.6, p. 61, given an approximation to $\zeta^{(2)} \sqrt[d]{\gamma^{(i)}}^{\,d-1} \sqrt[d]{\sigma_2(\gamma^{(i)})}$ with absolute error less than (recall $N \ge 2$)

\[ 2^{-(2n^2N^4+nN^2\log nN^2+7nN^2+6n^2N^4L+56n^3N^6L+4nN^2B)} \ge 2^{-(60n^3N^6L+4nN^2B)} \]

the exact representation of an element $\rho \in F^{(i)}(\zeta_{n_i})$ such that if the complex number above is in $F^{(i)}(\zeta_{n_i})$ then it must be $\rho$ can be determined by $O(n^3N^6(n^2N^4L + B))$ elementary operations on integers of size $O(nN^2(n^2N^4L + B))$.

Likewise, for roots of unity $\zeta^{(1)}, \zeta^{(3)}$ we determine $\rho' \in F^{(i)}(\zeta_{n_i})$.

It can be shown exactly as in the proof of Lemma 7.3.4 that the initial approximations to $\alpha$, $\sqrt[d_i]{\rho_i}$, $\zeta_{n_i}$, $\zeta$ can be used to compute the required approximations to $\sqrt[d]{\gamma^{(i)}}$, $\sqrt[d]{\sigma_1(\gamma^{(i)})}$, $\sqrt[d]{\sigma_2(\gamma^{(i)})}$, and $\sqrt[d]{\sigma_3(\gamma^{(i)})}$ by $O(nN + \log\log\frac{1}{\varepsilon})$ elementary operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon})$. Furthermore throughout all reconstruction steps at most $d + d^2$ products $\zeta^{(2)} \sqrt[d]{\gamma^{(i)}}^{\,d-1} \sqrt[d]{\sigma_2(\gamma^{(i)})}$ or $(\zeta^{(1)})^{d-1}\zeta^{(3)} \sqrt[d]{\sigma_1(\gamma^{(i)})}^{\,d-1} \sqrt[d]{\sigma_3(\gamma^{(i)})}$ need to be computed. Hence all approximations used in these steps can be determined from the initial one by $O(d^2 + nN + \log\log\frac{1}{\varepsilon})$ elementary operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon})$.

If $\zeta^{(2(m-1))}$ is already computed, in order to determine the $d$-th root corresponding to

\[ \frac{\tau^{(m-1)}(\rho)\, \zeta^{(2(m-1))}}{\sqrt[d]{\sigma_{2(m-1)}(\gamma^{(i)})}^{\,d-1} \sqrt[d]{\sigma_{2m}(\gamma^{(i)})}} \]

we just have to approximate this number with absolute error less than $2^{-\log d}$ and compare it with all $d$-th roots of unity. Since $2^{-\log d}$ is a separation bound for two $d$-th roots of unity this will pick out the correct $d$-th root of unity if the ratio is such a root.
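The comparison step can be illustrated with floating-point numbers. The sketch below is not the bit comparison test of the text: the function name `identify_root_of_unity` and the acceptance radius $1/(2d)$ are assumptions chosen here for illustration. Distinct $d$-th roots of unity are at distance at least $2\sin(\pi/d) > 1/d$ from each other, so at most one root can lie within $1/(2d)$ of the approximation.

```python
import cmath
import math

def identify_root_of_unity(z, d):
    """Return e with |z - exp(2*pi*i*e/d)| < 1/(2d), or None if there is none.

    `z` is a floating-point approximation of the ratio; the exponent e
    identifies the d-th root of unity zeta_d^e.
    """
    tol = 1.0 / (2 * d)
    for e in range(d):
        if abs(z - cmath.exp(2j * math.pi * e / d)) < tol:
            return e
    return None

# A slightly perturbed 7-th root of unity is still identified correctly.
approx = cmath.exp(2j * math.pi * 3 / 7) + 0.01
print(identify_root_of_unity(approx, 7))
```

A value that is not close to any $d$-th root of unity yields `None`, which corresponds to proceeding with the next candidate in the algorithm.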

We determine such an approximation by approximating the numerator and denominator separately. As mentioned above the numerator must be bounded in absolute value by $2^B$. By Lemma 5.1.2, p. 48, Lemma 7.1.1, p. 130, and Corollary 7.3.2 the denominator is bounded from below by $2^{-nNB-1}$. In fact, by Corollary 7.3.2 we know that $[\gamma^{(i)}]_\infty < 2^B$. The degree of $\gamma^{(i)}$ and its conjugates is at most $nN$ and therefore by Lemma 7.1.1 its minimal polynomial has infinity norm bounded by $2^{nNB}$. Hence the same is true for the minimal polynomial of $\frac{1}{\gamma^{(i)}}$. By Cauchy's bound (Lemma 4.2.1, p. 41) $[\gamma^{(i)}]_\infty$ is lower bounded by $2^{-nNB-1}$. But this immediately implies that the denominator is also bounded from below by $2^{-nNB-1}$.

Hence approximating the numerator with absolute error less than $2^{-(nNB+\log d+3)}$ and the denominator with absolute error less than $2^{-(B+\log d+2)}$ suffices.

We know the representation of $\rho$ as a linear combination of powers of the primitive element $\eta_i + c'\zeta_{n_i}$ and

\[ \tau^{(m-1)}(\eta_i + c'\zeta_{n_i}) = \tau^{(m-1)}(\eta_i) + c'\tau^{(m-1)}(\zeta_{n_i}) = \sigma_{2(m-1)}(\eta_i) + c'\zeta_{n_i}. \]

Since $\tau^{(m-1)}(\rho)$ is represented as a linear combination of $O(nN^2)$ powers of $\eta_i + c'\zeta_{n_i}$, as in the proof of Lemma 7.3.4 it can be shown that the required approximation to $\tau^{(m-1)}(\rho)$ can be determined from the given approximations using $O(nN^2)$ elementary operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon})$ (in the proof of Lemma 7.3.4 we had to approximate $\sigma_j(\gamma^{(i)})$). There are at most $O(d^2N)$ different numerators so overall this step takes $O(d^2nN^3)$ elementary operations, since $nN^2$ is an upper bound on the degree of $F^{(i)}(\zeta_{n_i})$ and therefore $\tau^{(m-1)}(\rho)$ is represented as a linear combination of $O(nN^2)$ powers of $\eta_i + c'\zeta_{n_i}$.

Lemma 5.4.11, p. 77, implies that the required approximations to each denominator can be computed from the initial ones by $O(nN_i + \log\log\frac{1}{\varepsilon})$ elementary operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon})$. But observe that overall there are at most $n_i$ different denominators that have to be approximated. Hence for all ratios appearing during the algorithm this step can be done with $O(nN^2 + N\log\log\frac{1}{\varepsilon})$ elementary operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon})$.

Multiplying the approximations in order to approximate the ratio can clearly be done by one operation on floating-point numbers of size $O(\log\frac{1}{\varepsilon})$ and throughout the algorithm at most $O(d^2N)$ ratios need to be determined.

Finally, we needed to compute all $d-1$ powers of the primitive $d$-th root of unity $\zeta_d$ (which is done of course only once) and compare the first $\log d$ bits of these numbers with the first $\log d$ bits of each of the ratios we determine in the algorithm. This takes $O(d^3\log d)$ time, which is dominated by the time needed for $O(d^2(n^5N^{10}L+n^3N^6B))$ elementary operations on integers of size $O(n^3N^6L+nN^2B)$ since $B$ contains a term that is linear in $d$. The lemma follows.

As mentioned at the end of Section 6.4 for $F = \mathbb{Q}$ we can compute a superset of the set of admissible sequences of size $O(n_i^3)$.

Lemma 7.3.8 Suppose $F = \mathbb{Q}$ and that the radicals $\sqrt[d_i]{\rho_i}$ are real. Assume that $\sqrt[d_i]{\rho_i}$, $\zeta_{n_i}$, and a primitive $d$-th root of unity $\zeta_d$ are approximated with absolute error less than

\[ \varepsilon < 2^{-(125N^6L+39dL\log N+39N^2K)}. \]

If a primitive element for the field $F^{(i)}(\zeta_{n_i})$ has been computed then for $\sqrt[d]{\gamma^{(i)}}$, $\gamma^{(i)} \in D^{(i)}$, a superset of size $24n_i^3$ of its set of admissible sequences can be determined using $O((d+N^2)(N^{10}L + dLN^6\log N + N^6K))$ elementary operations on integers of size $O(N^6L + dLN^2\log N + N^2K)$ and $O(d\log d\,N^4)$ elementary operations on floating-point numbers of size $O(dN^4L + d^2L\log N + dK)$.

Proof: For all $d$-th roots of unity $\zeta$ determine the element $\rho \in F^{(i)}(\zeta_{n_i})$ such that if $\zeta \sqrt[d]{\gamma^{(i)}}^{\,d-1} \sqrt[d]{\sigma_1(\gamma^{(i)})} \in F^{(i)}(\zeta_{n_i})$ then it must be $\rho$. We do this in exactly the same way as in the previous lemma.

Determine the representation $\mu_i$ in $F^{(i)}(\zeta_{n_i})$ of $(\gamma^{(i)})^{d-1}\sigma_1(\gamma^{(i)})$ by determining separately the representation of $\gamma^{(i)}$ and $\sigma_1(\gamma^{(i)})$ in $F^{(i)}(\zeta_{n_i})$ and by computing $(\gamma^{(i)})^{d-1}\sigma_1(\gamma^{(i)})$ with exact arithmetic in $F^{(i)}(\zeta_{n_i})$.

Then compute $\rho^d$ by successive squaring and check whether $\rho^d = \mu_i$. If so, approximate

\[ \frac{\rho}{\sqrt[d]{(\gamma^{(i)})^{d-1}\sigma_1(\gamma^{(i)})}} \]

with absolute error less than $2^{-\log d}$ and check whether it is $\zeta$.

By Theorem 6.4.4, p. 117, only $24n_i$ roots of unity will pass this test. Denote the set of these roots of unity by $Z_1$.

Replacing $\sigma_1(\gamma^{(i)})$ by $\sigma_2(\gamma^{(i)})$ and $\sigma_3(\gamma^{(i)})$, respectively, the sets $Z_2$ and $Z_3$ are defined and computed.

By Lemma 6.4.1, p. 112, only triples in $Z_1 \times Z_2 \times Z_3$ can be a prefix of an admissible sequence. So if the algorithm leading to the previous lemma is run only on these triples it will determine a superset of the set of admissible sequences of size at most $(24n_i)^3$.

The analysis of the algorithm described above is exactly as in the proofof the previous lemma except for the part of the algorithm in which we com-pute γ(i−1)σj(γ(i)), j = 1, 2, 3, and ρd using exact arithmetic in F (i)(ζni).This part is analyzed using Lemma 5.5.2, p. 82, and Lemma 5.5.5, p. 83.Moreover, we adjusted the run time for the case F = Q, i.e., n, l do notappear.
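The exact computation of $\rho^d$ by successive squaring can be illustrated by a minimal sketch. It assumes, purely for illustration, that field elements are stored as coefficient lists with respect to powers of a primitive element whose monic minimal polynomial is `p`; the names `mul_mod` and `power` are hypothetical and do not come from the text.

```python
from fractions import Fraction

def mul_mod(a, b, p):
    """Multiply a and b (coefficient lists of length n, low degree first)
    modulo the monic polynomial p of degree n (list of length n + 1)."""
    n = len(p) - 1
    prod = [Fraction(0)] * (2 * n - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] += ai * bj
    # Reduce using x^n = -(p[0] + p[1] x + ... + p[n-1] x^(n-1)).
    for k in range(2 * n - 2, n - 1, -1):
        c = prod[k]
        if c:
            prod[k] = Fraction(0)
            for t in range(n):
                prod[k - n + t] -= c * p[t]
    return prod[:n]

def power(rho, d, p):
    """Compute rho^d with O(log d) multiplications (square and multiply)."""
    n = len(p) - 1
    result = [Fraction(1)] + [Fraction(0)] * (n - 1)
    base = list(rho)
    while d > 0:
        if d & 1:
            result = mul_mod(result, base, p)
        base = mul_mod(base, base, p)
        d >>= 1
    return result

# Illustrative field Q(sqrt(2)) with minimal polynomial x^2 - 2.
p = [-2, 0, 1]
rho_to_5 = power([1, 1], 5, p)   # (1 + sqrt(2))^5
```

The check $\rho^d = \mu_i$ then amounts to comparing two coefficient lists, which is why exact arithmetic is used at this point rather than approximations.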

Due to $F^{(i)} = F^{(i)}(\zeta_{n_i})$ in the complex case we get a better result than Lemma 7.3.7. In particular, the degree of $F^{(i)}(\zeta_{n_i}) = F^{(i)}$ is still bounded by $nN$ and the length of the minimal polynomial of a primitive element for this field is still $2^{4nNL}$ (Lemma 7.2.1) rather than $2^{6nN^2L}$ as in the real case (see Lemma 7.3.6). The proof is as above except that we use the complex version of Lemma 6.4.3, p. 113.

Lemma 7.3.9 Assume that $\alpha$, $\sqrt[d_i]{\rho_i}$, $\zeta_{n_i}$, and a primitive $d$-th root of unity $\zeta_d$ are approximated with absolute error less than

\[ \varepsilon \le 2^{-(51n^3N^3L+13n^2N^2B)}. \]

In the complex case, a superset of size $d$ for the set of admissible sequences of $\sqrt[d]{\gamma^{(i)}}$ over $F^{(i-1)}$ can be determined using $O(d(n^5N^5L+n^3N^3B))$ elementary operations on integers of size $O(n^3N^3L+nNB)$ and $O(dnN^2+N\log\log\frac{1}{\varepsilon})$ elementary operations on floating-point numbers of size $O(\log\frac{1}{\varepsilon})$.

7.4 Description and Analysis of Step 3

It remains to analyze Step 3 of the Algorithm Denesting Element. We will show that for each element $\gamma^{(0)}$ in $D^{(0)}$ the time needed for this step is fully covered by the time needed for a single application of the algorithm leading to Lemma 7.3.4, that is, we neither use larger numbers than in this algorithm nor do we use more operations. The analysis we present is not optimal but it suffices for our purposes.

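For each element $\gamma^{(0)}$ in $D^{(0)}$ (in the complex case there will be only one) we have to check whether $\frac{C}{\gamma^{(0)}}$ denests $\sqrt[d]{\gamma}$ and compute the corresponding element

\[ \rho = \sqrt[d]{\frac{C}{\gamma^{(0)}}}\,\sqrt[d]{\gamma} \]

in $F^{(k)} = \mathbb{Q}(\alpha, \sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_k]{\rho_k})$. Equivalently, we determine whether $C(\gamma^{(0)})^{d-1}$ denests $\sqrt[d]{\gamma}$ and, if so, determine the representation of $\sqrt[d]{C(\gamma^{(0)})^{d-1}}\,\sqrt[d]{\gamma}$ as an element in $F^{(k)}$.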

This step is of course done by the algorithms leading to Theorem 5.2.6, p. 61, applied with the field $F^{(k)} = \mathbb{Q}(\eta_k)$. We need to bound the representation size of $\sqrt[d]{C(\gamma^{(0)})^{d-1}}\,\sqrt[d]{\gamma}$ as an element of $F^{(k)}$ with respect to the primitive element $\eta_k$. By the choice of $C$, $C\gamma$ is an algebraic integer, and by Lemma 7.3.1 the same is true for $\gamma^{(0)}$ and hence for $\sqrt[d]{C(\gamma^{(0)})^{d-1}}\,\sqrt[d]{\gamma}$. By Lemma 2.10, p. 22, this implies

\[ \sqrt[d]{C(\gamma^{(0)})^{d-1}}\,\sqrt[d]{\gamma} = \frac{1}{\Delta_k}\sum_{i=0}^{nN-1} e_i\eta_k^i, \qquad e_i \in \mathbb{Z}, \]

where $\Delta_k$ is the discriminant of $\eta_k$. By Corollary 7.3.2

\[ \Bigl[\sqrt[d]{C(\gamma^{(0)})^{d-1}}\,\sqrt[d]{\gamma}\Bigr]_\infty < 2^{10n^2N^2L+3dL\log N+3K} \]

and by Lemma 7.3.6 the minimal polynomial of $\eta_k$ has degree at most $nN^2$ and length at most $2^{6nN^2L}$. Lemma 5.1.8, p. 51, shows $|e_i| < 2^B$.

Now Theorem 5.2.6, p. 61, shows that the time needed for determining the representation of $\sqrt[d]{C(\gamma^{(0)})^{d-1}}\,\sqrt[d]{\gamma}$ as an element of $F^{(k)}$ is dominated by the run times stated in Lemma 7.3.4. In fact, at the end of the algorithm leading to this lemma we used a similar procedure to determine whether the element $\gamma^{(i-1)}$ denests $\sqrt[d]{\gamma^{(i)}}$. The approximations used in this lemma are also sufficient for the reconstruction algorithm in Step 3.

If $C(\gamma^{(0)})^{d-1}$ denests $\sqrt[d]{\gamma}$ this procedure already finds an expression for $\sqrt[d]{C(\gamma^{(0)})^{d-1}}\,\sqrt[d]{\gamma}$ containing no depth 2 radicals and hence it finds a denesting for $\sqrt[d]{\gamma}$. But the representation found for $\sqrt[d]{C(\gamma^{(0)})^{d-1}}\,\sqrt[d]{\gamma}$ is not a linear combination of the elements $\beta_0, \beta_1, \ldots, \beta_{N-1}$ of the standard basis for $F^{(k)}$. Rather it is a linear combination of powers of $\eta_k = c\alpha + \sqrt[d_1]{\rho_1} + \sqrt[d_2]{\rho_2} + \ldots + \sqrt[d_k]{\rho_k}$, a primitive element of $F^{(k)}$ over $\mathbb{Q}$. As it turns out, for the General Denesting Algorithm this is not an appropriate form. Therefore we included in the Algorithm Denesting Element the last step in which we determine the representation of $\sqrt[d]{C(\gamma^{(0)})^{d-1}}\,\sqrt[d]{\gamma}$ as a linear combination of the $\beta_j$'s. We describe and analyze the transformation.

To do so we need a bound on the size of the coefficients $\mu_j^{(i)} \in \mathbb{Q}(\alpha)$ in the representation $\eta_k^i = \sum_{j=0}^{N-1}\mu_j^{(i)}\beta_j$, $i < nN$. Since $k \le \log N$ and $i < nN$ (the degree of $F^{(k)}$ over $\mathbb{Q}$), if $\eta_k^i$ is expanded it consists of at most $(\log N+1)^{nN}$ terms. All these terms are products of powers of $c\alpha$ and of powers of $\sqrt[d_j]{\rho_j}$, $j = 1, \ldots, k$, with each exponent being smaller than $nN$.

Using Lemma 5.1.8, p. 51, together with $c < 2^{3L}$ and $|\alpha| < 2^l$, one easily shows

\[ [(c\alpha)^e] < 2^{6nNL} \quad \text{for all } e < nN. \]

We also know that each power product $\prod\sqrt[d_j]{\rho_j}^{\,f_j}$, $f_j < nN$, can be written as a product of an element $\kappa \in \mathbb{Q}(\alpha)$ and some basis element $\beta_h$ (Lemma 3.6, p. 26, in the real and Lemma 3.8, p. 27, in the complex case).

Define $\ell_j := N/d_j$ for all $j$. Then

\[ \prod_{j=1}^{k}\sqrt[d_j]{\rho_j}^{\,f_j} = \sqrt[N]{\prod_{j=1}^{k}\rho_j^{f_j\ell_j}} \]

and

\[ \beta_h = \sqrt[N]{\prod_{j=1}^{k}\rho_j^{e_j\ell_j}} \]

for some integer $e_j$ between $0$ and $n_j-1$ (see also the proof of Lemma 7.2.1, p. 135, where a similar argument was used). Hence

\[ \kappa = \sqrt[N]{\frac{\prod_{j=1}^{k}\rho_j^{\ell_jf_j}}{\prod_{j=1}^{k}\rho_j^{e_j\ell_j}}}. \]

Observe that $k \le \log N$, $\ell_jf_j < nN^2$, and $e_j\ell_j \le N$. Therefore Lemma 5.5.5, p. 83, implies that $\prod\rho_j^{e_j\ell_j}$ is an element in $\mathbb{Z}[\alpha]$ whose coefficients are bounded in absolute value by $2^{LN\log N}$ and that $\prod\rho_j^{\ell_jf_j}$ is an element in $\mathbb{Z}[\alpha]$ whose coefficients are bounded in absolute value by $2^{LnN^2\log N}$ (see footnote 25).

By Lemma 5.3.2, p. 64, if the ratio $\kappa$ is in $\mathbb{Q}(\alpha)$ then

\[ [\kappa] < 2^{6LnN^2\log N}. \]

However, due to the fact that the numerator and denominator are algebraic integers and that the denominator has coefficient size $2^{LN\log N}$, the denominator of $\kappa$ is even bounded by $2^{4LnN\log N}$. In fact, by Lemma 5.1.11, p. 53, the inverse of the denominator of the ratio $\kappa$ is a root of a polynomial with length less than $2^{2LnN\log N}$. As used already several times, this implies that an integer less than $2^{2LnN\log N}$ exists such that the product of the inverse of the denominator of $\kappa$ with this number is an algebraic integer. Hence the product of $\kappa$ with this integer is an algebraic integer. Lemma 5.1.8, p. 51, implies that the denominator of $\kappa$ is bounded in absolute value by $|\Delta|\,2^{2LnN\log N}$, where $\Delta$ is the discriminant of $\alpha$. Now the claim follows from the bound on $|\Delta|$ in Lemma 5.1.5, p. 50.
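The rewriting of a power product of radicals as a single $N$-th root, which underlies the expressions for $\beta_h$ and $\kappa$, can be checked numerically on a small instance. The sketch below assumes positive integer radicands over $\mathbb{Q}$ and only requires that $N$ be a common multiple of the $d_j$; the function name `as_nth_root_radicand` and the sample values are illustrative and not taken from the text.

```python
import math

def as_nth_root_radicand(degrees, radicands, exponents, N):
    """Return the integer R with
    prod_j radicands[j]**(exponents[j]/degrees[j]) == R**(1/N),
    using l_j = N / d_j as in the text."""
    R = 1
    for d, rho, f in zip(degrees, radicands, exponents):
        if N % d != 0:
            raise ValueError("N must be a multiple of every d_j")
        ell = N // d
        R *= rho ** (f * ell)
    return R

# Illustrative data: sqrt(2)^3 * cbrt(5)^2 written as a 6-th root.
degrees, radicands, exponents, N = (2, 3), (2, 5), (3, 2), 6
R = as_nth_root_radicand(degrees, radicands, exponents, N)
lhs = math.prod(rho ** (f / d) for d, rho, f in zip(degrees, radicands, exponents))
rhs = R ** (1 / N)
```

The radicand `R` is computed exactly with integers, while `lhs` and `rhs` are floating-point values intended only as a sanity check of the identity.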

Combining these estimates and observations implies that each power $\eta_k^i$ can be represented as a sum $\sum_{j=0}^{N-1}\mu_j^{(i)}\beta_j$ with

\[ [\mu_j^{(i)}] \le 2^{6LnN^2\log N+6nNL+2nN\log\log N} < 2^{10LnN^2\log N}. \]

We transform the powers $\eta_k^i$, $i = 0, 1, 2, \ldots, nN-1$, successively. So assume $\eta_k^{i-1}$ has already been transformed into

\[ \eta_k^{i-1} = \sum_{j=0}^{N-1}\mu_j^{(i-1)}\beta_j, \qquad \mu_j^{(i-1)} \in \mathbb{Q}(\alpha), \quad [\mu_j^{(i-1)}] < 2^{10LnN^2\log N}. \]

Footnote 25: Observe that a similar argument was used in the analysis of Step 1. But in this step the standard basis for $F^{(k)}$ may not have been computed in the form we need it now.


Hence

\[ \eta_k^i = c\alpha\sum_{j=0}^{N-1}\mu_j^{(i-1)}\beta_j + \sqrt[d_1]{\rho_1}\sum_{j=0}^{N-1}\mu_j^{(i-1)}\beta_j + \ldots + \sqrt[d_k]{\rho_k}\sum_{j=0}^{N-1}\mu_j^{(i-1)}\beta_j. \]

Each product $\sqrt[d_m]{\rho_m}\,\beta_j$ must be a multiple of some element $\kappa_{m,j}$ in $\mathbb{Q}(\alpha)$ and a basis element $\beta_m^{(j)}$. For each product that is not already a basis element we compute $\kappa_{m,j}$ and $\beta_m^{(j)}$. Then we multiply $\kappa_{m,j}$ and $\mu_j^{(i-1)}$, or $c\alpha$ and $\mu_j^{(i-1)}$, and collect the coefficients of each basis element $\beta_i$.

The coefficients (which are elements in $\mathbb{Q}(\alpha)$) obtained in this way need not be in a reduced form, that is, the gcd of the denominator and the rational integers appearing in the linear combination of powers of $\alpha$ that forms the numerator need not be 1. Accordingly, these numbers need not satisfy the bound we derived above for the coefficient size of the $\mu_j^{(i)}$'s. To obtain a representation satisfying this bound, in the final step we compute for each coefficient the gcd of its denominator and all rational integers appearing in its numerator and divide these numbers by the gcd. Since the $\beta_j$'s form a basis and, accordingly, the representation

\[ \eta_k^i = \sum_{j=0}^{N-1}\mu_j^{(i)}\beta_j \]

is unique, this finally must lead to a representation with coefficient size as required.
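The successive transformation of the powers of $\eta_k$ can be sketched on a toy instance. The sketch assumes $F = \mathbb{Q}$ (so the term $c\alpha$ is dropped) and that all products $\prod_j\sqrt[d_j]{\rho_j}^{\,e_j}$ with $0 \le e_j < d_j$ form the standard basis, as is the case for $\sqrt{2}$ and $\sqrt{3}$; under this assumption $\kappa_{m,j}$ is either $1$ or $\rho_m$ and no ratio test is needed. The names `times_radical` and `powers_of_eta` are hypothetical.

```python
from fractions import Fraction

def times_radical(elem, m, degrees, radicands):
    """Multiply elem (dict: exponent tuple -> coefficient) by the m-th radical
    and rewrite every product as kappa * (basis element)."""
    out = {}
    for exps, coeff in elem.items():
        e = list(exps)
        e[m] += 1
        kappa = Fraction(1)
        if e[m] == degrees[m]:
            # The d_m-th power of the radical is rho_m, an element of the base field.
            e[m] = 0
            kappa = Fraction(radicands[m])
        key = tuple(e)
        out[key] = out.get(key, Fraction(0)) + coeff * kappa
    return out

def powers_of_eta(degrees, radicands, count):
    """Return eta^0, ..., eta^(count-1) in the standard basis, where eta is the
    sum of the radicals; each power is obtained from the previous one."""
    k = len(degrees)
    powers = [{(0,) * k: Fraction(1)}]
    for _ in range(1, count):
        prev = powers[-1]
        nxt = {}
        for m in range(k):
            for key, c in times_radical(prev, m, degrees, radicands).items():
                nxt[key] = nxt.get(key, Fraction(0)) + c
        powers.append({key: c for key, c in nxt.items() if c != 0})
    return powers

# eta = sqrt(2) + sqrt(3); basis elements are indexed by exponent pairs (e1, e2).
powers = powers_of_eta((2, 2), (2, 3), 4)
```

Here `powers[2]` encodes $5 + 2\sqrt{6}$ and `powers[3]` encodes $11\sqrt{2} + 9\sqrt{3}$. `Fraction` keeps every coefficient in lowest terms automatically, which plays the role of the gcd reduction in this simplified setting.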

To determine for a product $\sqrt[d_m]{\rho_m}\,\beta_j$ the elements $\kappa_{m,j}$ and $\beta_m^{(j)}$ mentioned above we simply check for each basis element $\beta_h$ whether the ratio $\frac{\sqrt[d_m]{\rho_m}\,\beta_j}{\beta_h}$ is an element of $\mathbb{Q}(\alpha)$. We do so by transforming both numerator and denominator into an $N$-th root as described above and applying the deterministic algorithm leading to Theorem 5.6.2, p. 90, (or Theorem 5.6.3, p. 90, in the complex case) to this ratio of radicals.

The transformation step has been analyzed already for Step 1 (see the proof of Lemma 7.2.1, p. 135). It takes $O(n^2N\log N)$ elementary operations on integers of size $O(LN\log N)$ to transform all basis elements $\beta_j$ and all products $\sqrt[d_m]{\rho_m}\,\beta_j$ into this form.

For each ratio the algorithm leading to Theorem 5.6.2, p. 90, requires $O(Ln^5N^2\log N)$ operations on integers of size $O(Ln^3N^2\log N)$, $O(n^2\log N)$ operations on integers of size $O(Ln^2N^3\log N)$ (see also Table 1, page 88), and (see Theorem 5.2.6, p. 61) an approximation to the ratio with absolute error less than

\[ 2^{-25Ln^3N^2\log N}. \]

As is easily seen (recall Lemma 5.4.3, p. 69) the approximations to $\alpha$, $\sqrt[d_i]{\rho_i}$, $i = 1, 2, \ldots, k$, used in Lemma 7.3.4, p. 144, are sufficient to determine this approximation using $O(N)$ elementary operations on floating-point numbers of size $O(Ln^3N^2\log N)$. In fact, for each ratio we only need to compute $O(N)$ products of the radicals $\sqrt[d_i]{\rho_i}$ and perform one inversion to approximate $\frac{1}{\beta_h}$.

Moreover, observe that if we keep a list of the products $\sqrt[d_m]{\rho_m}\,\beta_j$ that have already been computed, and every time we have to compute the basis representation of a product $\sqrt[d_m]{\rho_m}\,\beta_j$ we check in this list whether it has already been transformed previously, we have to apply this test to at most $N\log N$ products.

So in order to compute all powers of $\eta_k$ we have to check at most $N^2\log N$ ratios. Since the list has length at most $N\log N$, has to be updated at most $N\log N$ times, and overall $nN^2\log N$ queries are performed, the operations above clearly dominate these list operations and the time required for all ratio tests is dominated by the run time stated in Lemma 7.3.4, p. 144.

Throughout the computation of all powers of $\eta_k$, multiplying $\alpha$ or the elements $\kappa_{m,j}$ with the elements $\mu_j^{(i-1)}$ requires $O(n^3N^2\log N)$ operations on integers of size $O(LnN^2\log N)$. In fact, for each power of $\eta_k$ we have to perform $O(N\log N)$ arithmetic operations in $\mathbb{Q}(\alpha)$. Since there are $nN$ powers, the upper bound on the number of operations follows from Lemma 5.5.2, p. 82. The bound on the size of the integers involved in these operations follows from the bound on $[\mu_j^{(i)}]$.

Throughout the algorithm, collecting the coefficients and computing the gcd's uses $O(n^2N^2\log N+n^2N^2\log(LnN\log^2N))$ elementary operations on integers of size $O(LnN^2\log N)$. In fact, for each power $\eta_k^i$ we first add for each basis element the coefficients. This uses $O(nN\log N)$ operations on integers of size $O(LnN^2\log N)$. To prove these bounds observe that for a fixed radical $\sqrt[d_m]{\rho_m}$ any two products $\sqrt[d_m]{\rho_m}\,\beta_i$ and $\sqrt[d_m]{\rho_m}\,\beta_j$, $i \ne j$, must be multiples of different basis elements. Otherwise $\beta_i$ and $\beta_j$ would be linearly dependent. Hence for each basis element $\beta_i$ we have to add at most $\log N$ elements in $\mathbb{Q}(\alpha)$. This shows the upper bound on the number of operations. For the bound on the integers involved recall the $O(LnN\log N)$ bound on the bit size of the denominator of any ratio. Hence taking the product of these denominators yields an integer of bit size $O(LnN\log^2N)$. Now the upper bound on the integers is immediate.

Then we have to compute $N$ times the gcd of $n+1$ integers with $O(LnN^2\log N)$ bits each. The run time above now follows from the fast gcd algorithm (see [Sc1]), although even with the usual gcd algorithm the run time is dominated by the run times of Lemma 7.3.4, p. 144.
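The reduction of a single coefficient to lowest terms can be sketched as follows. The sketch assumes an element of $\mathbb{Q}(\alpha)$ is stored as a list of $n$ integer numerators (the coefficients of the powers of $\alpha$) over one common integer denominator; this storage format and the name `reduce_coefficient` are assumptions for illustration, and Python's built-in `math.gcd` stands in for the fast gcd algorithm cited in the text.

```python
from functools import reduce
from math import gcd

def reduce_coefficient(numerators, denominator):
    """Divide the common denominator and all numerator integers by their gcd,
    i.e. one gcd of n + 1 integers per coefficient."""
    if denominator == 0:
        raise ZeroDivisionError("denominator must be nonzero")
    g = reduce(gcd, numerators, abs(denominator))  # g >= 1 since denominator != 0
    return [a // g for a in numerators], denominator // g

# (6 - 9*alpha + 12*alpha^2) / 15  ->  (2 - 3*alpha + 4*alpha^2) / 5
reduced = reduce_coefficient([6, -9, 12], 15)
```

Because the representation in the standard basis is unique, reducing every coefficient this way yields the representation whose size obeys the bound derived for the $\mu_j^{(i)}$.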

Finally, given the representations of all powers of $\eta_k$ we compute the representation of $\sqrt[d]{C(\gamma^{(0)})^{d-1}}\,\sqrt[d]{\gamma}$. This step requires only arithmetic in $\mathbb{Z}$ and can be done by $O(n^2N^2)$ arithmetic operations on integers of size $O(Ln^2N^2\log N+dL\log N+K)$, which follows from the bounds above and the bound on the representation size of $\sqrt[d]{C\gamma} = \sqrt[d]{\gamma^{(k)}}$ as a linear combination of powers of $\eta_k$ (see Corollary 7.3.2, p. 143). So this step is also covered by the previous run times.

Lemma 7.4.1 It can be decided for each $C(\gamma^{(0)})^{d-1}$, $\gamma^{(0)} \in D^{(0)}$, whether it denests $\sqrt[d]{\gamma}$ using the same approximations to $\alpha$, $\sqrt[d_i]{\rho_i}$, $i = 1, 2, \ldots, k$, as in Lemma 7.3.4. Also the run time for the test is dominated by the run times stated in Lemma 7.3.4. Moreover, within these time bounds the representation of $\sqrt[d]{C(\gamma^{(0)})^{d-1}}\,\sqrt[d]{\gamma}$ as $\sum_{j=0}^{N-1}\kappa_j\beta_j$ can be determined. Here $\beta_0, \beta_1, \ldots, \beta_{N-1}$ is the standard basis of $F(\sqrt[d_1]{\rho_1}, \ldots, \sqrt[d_k]{\rho_k})$ and $\kappa_j \in \mathbb{Q}(\alpha)$ satisfies

\[ [\kappa_j] \in O(B) = O(Ln^2N^2\log N+dL\log N+K). \]

Let us remark that combining the results of the previous subsections yields an efficient algorithm to transform any radical expression into a sum of radicals.

In fact, in the first step we determine the standard basis and a primitive element for the radical extension generated by the radicals appearing in the expression. This is done as in Step 1. Then we determine the representation of the expression with respect to powers of this primitive element. This is done as in Step 2. Then we transform this representation into a linear combination of the elements of the standard basis. This is done as in Step 3.


7.5 The Final Results

If we collect in the real case the run times for all applications of the single steps of the Algorithm Denesting Element we arrive at the following table. An explanation of the various entries will be given below.

Table 2: Run Times for the Algorithm Denesting Element (The real case)

procedure | number of operations | bit size of numbers
Fields $F^{(i)}$, prim. elements in Step 1 | $O(nN^2\log(NL))$ | $O(n^3N^2L)$ (fp)
 | $O(n^4N^4L)$ | $O(Ln^2N^2\log N)$
Representation of $\gamma$ in Step 1 | polynomial | polynomial (fp)
 | $O(n^5N^5L+n^3N^3K)$ | $O(n^3N^3L+nNK)$
Initial approximations in Step 2 | $O(n)$ | $O(n^4N^6L+n^3N^2B)$ (fp)
 | $O(\log N\log(nNB))$ | $O(n^3N^6L+n^2N^2B)$ (fp)
Prim. element for $F^{(i)}(\zeta_{n_i})$ | $O(\log N)$ | $O(n^2N^4L)$ (fp)
 | $O(n^4N^8L)$ | $O(n^2N^4L)$
Computing admissible sequences | $O(d^2nN^4+N^2\log(nNB))$ | $O(n^3N^6L+n^2N^2B)$ (fp)
 | $O(d^2n^5N^{11}L+d^2n^3N^7B)$ | $O(n^3N^6L+nN^2B)$
Computing denesting elements | $O(d^3nN^3+d^3N^2\log(nNB))$ | $O(n^2N^2B)$ (fp)
 | $O(d^3n^3N^4B)$ | $O(nNB)$
 | $O(d^3n^2N^3\log d)$ | $O(dB)$
Step 3 | covered by Step 2 |

By numbers both integers and floating-point numbers are meant. The symbol (fp) indicates where floating-point numbers are used. The first entry for the "Representation of $\gamma$" refers to the assumption that approximations to $\gamma$ can be determined efficiently (see assumption 5) of Section 7.1).

To deduce these run times use Lemma 7.2.1, p. 135, Lemma 7.2.3, p. 140, Lemma 7.3.4, p. 144, Lemma 7.3.6, p. 148, and Lemma 7.3.7, p. 149. Finally recall that overall the set of admissible sequences has to be determined for at most $\bigl|\bigcup_{i=1}^{k}D^{(i)}\bigr| \le N$ elements (see page 120). Accordingly, the algorithm leading to Lemma 7.3.4 has to be applied at most $2d^3N$ times.

Step 3 has to be applied at most $N$ times since $\bigl|D^{(0)}\bigr| \le N$.

As follows from Section 5.4 the run times stated above for the initial approximation step are valid to compute approximations sufficient for all applications of the algorithms leading to Lemma 7.3.4, p. 144, to Lemma 7.3.6, p. 148, and to Lemma 7.3.7, p. 149. This proves the run times in the table.

The following theorem states explicit upper bounds on the number of operations used and the bit size of the numbers on which these operations have to be performed. The bounds stated are obtained from Table 2. We simply took the worst dependence on each parameter for the number of operations as well as for the size of the integers or floating-point numbers. Recall that

\[ B = 20n^2N^2L+3dL\log N+3K. \]

Theorem 7.5.1 Let $F = \mathbb{Q}(\alpha)$ be a real algebraic number field and let $\gamma$ be a rational expression over $F$ in linearly independent radicals $\sqrt[d_i]{\rho_i}$, $\rho_i \in F$, $i = 1, 2, \ldots, K$. Here $\rho_i$ is an algebraic integer and $\sqrt[d_i]{\rho_i}$ is of degree $d_i$ over $F$. Assume $[\gamma]_\infty < 2^K$ and assume that a guarantee is given that a positive integer $C$ less than $2^K$ exists such that $C\gamma$ is an algebraic integer. Finally assume that for any $\varepsilon > 0$ the number $\gamma$ can be approximated with error less than $\varepsilon$ in time polynomial in $\log\frac{1}{\varepsilon}$ and $K$. Then the Algorithm Denesting Element determines whether there exists an element $\gamma^{(0)} \in \mathbb{Q}(\alpha)$ with

\[ \sqrt[d]{\gamma^{(0)}}\,\sqrt[d]{\gamma} \in F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_K]{\rho_K}) \]

in time polynomial in $K$, the degree $n$ of the minimal polynomial $p$ of $\alpha$, in $l = \lceil\log|p|_2\rceil$, in $L = \max\log[\rho_i]$, in the degree $N$ of the extension generated by $\sqrt[d_i]{\rho_i}$, $i = 1, \ldots, K$, and in $d$.

In particular, the algorithm uses at most $O(d^4n^5N^{11}(\mathcal{L}+K))$ elementary operations on integers or floating-point numbers of size $O(d^2n^5N^6(\mathcal{L}+K))$ plus the number of operations used to compute an approximation to $\gamma$ with absolute error less than $2^{-(46n^3N^3\mathcal{L}+8nNK)}$. Here $\mathcal{L} = \lceil n\log n+nl+nL\rceil$.

Using at most $N$ times this number of operations on integers or floating-point numbers of asymptotically the same size the Algorithm Real Denesting determines a denesting of $\sqrt[d]{\gamma}$ using real radicals over $\mathbb{Q}(\alpha)$, if any such denesting exists.

Proof: The first claim follows from Table 2.

To prove the statement on the run time of the Algorithm Real Denesting observe that Step 1 of the Algorithm Denesting Element is applied only once. Likewise the primitive elements for the fields $F^{(i)}(\zeta_{n_i})$ are only computed once. Step 2 and Step 3 of the Algorithm Denesting Element are applied to all products $\beta_j\gamma$, where $\beta_j$ is a basis element of the standard basis of $F(\sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_K]{\rho_K})$ over $F$. Hence Step 2 and Step 3 of the Algorithm Denesting Element have to be applied at most $N$ times.

Observe that $\beta_j$ is an integer, hence for any rational integer $C$ such that $C\gamma$ is an integer, $C\beta_j\gamma$ is an integer. Next observe that $[\beta_j\gamma]_\infty < 2^{K+L\log N}$ since any product $\sqrt[d_i]{\rho_i}^{\,e_i}$, $e_i < d_i$, has infinity norm less than $2^{nl+L+2}$ (Lemma 5.1.11, p. 53) and each basis element is the product of at most $\log N$ of these powers. In fact, changing $B$ from $20n^2N^2L+3dL\log N+3K$ to $21n^2N^2L+3dL\log N+3K$ will already do. This will increase the corresponding bounds in Lemma 7.3.1 and in Corollary 7.3.2 only by a constant factor. Hence the value $B$ defined in Definition 7.3.3 has to be increased only by a constant factor. It follows that for each product $\beta_j\gamma$ the run times stated in Table 2 asymptotically remain unchanged.

Finally observe that in order to compute the representation of $\beta_j\gamma$ as a linear combination of powers of $\eta_k$ we may compute the representations of $\beta_j$ and $\gamma$ separately and then multiply. Computing the representation of $\gamma$ is contained in Table 2. Computing the representation of $\beta_j$ is easily dominated by determining the representation of the elements $\gamma^{(i)}$ in the intermediate denesting sets $D^{(i)}$. The second claim follows.

Recall from Section 6.2 that the denesting computed by the Algorithm Denesting Element is not restricted to the specific $d$-th root we denote by $\sqrt[d]{\gamma}$. If $d$ is even, it applies to both $d$-th roots if it applies to any. Similar remarks apply below. Also recall that the assumptions that each $\rho_i$ is an integer and that $\sqrt[d_i]{\rho_i}$ is of degree $d_i$ over $F$ are no restriction. They are easily guaranteed by the algorithms of Section 5.

The claim that the algorithm runs in polynomial time without any reference to a specific model of computation is justified by the analysis in terms of elementary operations. In any reasonable model of computation that allows bit operations the algorithm will use only polynomially many bit operations since it uses only polynomially many elementary operations and the bit size on which these operations have to be performed is also polynomially bounded. The statements of the other theorems and corollaries in this section are justified similarly.

Let us explicitly state the upper bound for nested radicals of the form described in the preliminaries.

Corollary 7.5.2 Let $\mathbb{Q}(\alpha)$ be as above. Assume $S_j$, $j = 1, 2, \ldots, m$, have the form

\[ S_j = \sum_{i=1}^{m}\frac{P_{i,j}}{Q_{i,j}}, \]

where $P_{i,j}, Q_{i,j}$, $i = 1, \ldots, m$, are sums of radicals of the linearly independent radicals $\sqrt[d_i]{\rho_i}$, $i = 1, 2, \ldots, K$, with coefficients $\kappa$ in $\mathbb{Q}(\alpha)$ satisfying $[\kappa] < 2^L$. Assume that $\rho_i$ is an integer for all $i$. Let $P(X_1, X_2, \ldots, X_m)$ be a polynomial in $m$ variables with coefficients in $\mathbb{Q}(\alpha)$. Assume that $P$ has at most $m$ terms, that the total degree is also bounded by $m$, and that its coefficients $\mu \in \mathbb{Q}(\alpha)$ satisfy $[\mu] < 2^L$. If the Algorithm Denesting Element is applied to $\gamma = P(S_1, S_2, \ldots, S_m)$ then its run time is bounded by a polynomial in $m, n, l, N, L$, and $d$. Here $N$ is the degree of the extension $\mathbb{Q}(\alpha, \sqrt[d_1]{\rho_1}, \sqrt[d_2]{\rho_2}, \ldots, \sqrt[d_K]{\rho_K})$ over $\mathbb{Q}(\alpha)$.

In particular, the algorithm uses at most $O(d^4m^3n^6N^{13}\mathcal{L})$ elementary operations on integers or floating-point numbers of size $O(d^2m^3n^6N^8\mathcal{L})$, where $\mathcal{L} = \lceil n\log n+nl+nL\rceil$.

If the Algorithm Real Denesting is applied to $\sqrt[d]{\gamma}$ it uses at most $N$ times this number of operations on numbers of the same bit size.

Proof: By the results in Section 7.1, in this case $K \in O(m^3nN^2L)$. Furthermore recall from Remark 7.2.4 that the required approximation to $\gamma$ can be determined by $O(n)$ elementary operations on floating-point numbers of size $O(n^4N^3L+m^3n^3N^3L)$ and $O(N\log(NL)+m^2n)$ operations on floating-point numbers of size $O(n^3N^3L+m^3n^2N^3L)$.

Setting $n = 1$ and $l = 0$ one easily extracts from these results the result for $F = \mathbb{Q}$ and real radicals $\sqrt[d_1]{q_1}, \ldots, \sqrt[d_K]{q_K}$ such that the sum of the bit sizes of the numerator and denominator in each $q_i$ is less than $L$. However, if Lemma 7.3.8, p. 153, instead of Lemma 7.3.7, p. 149, is used the algorithm may be more efficient. We state the result only for radical expressions as in the previous corollary.


Corollary 7.5.3 If $\mathbb{Q}(\alpha) = \mathbb{Q}$ and $\sqrt[d]{\gamma}$ is as in Corollary 7.5.2 then the Algorithm Denesting Element uses $O(d^2m^3n^3N^{13}L)$ elementary operations on integers and floating-point numbers of size $O(d^2m^3N^8L)$.

Using at most $N$ times this number of operations on numbers of the same size the Algorithm Real Denesting determines a denesting using real radicals if any such denesting exists.

Here $n$ is the maximum of all field degrees $[\mathbb{Q}(\sqrt[d_1]{q_1}, \ldots, \sqrt[d_i]{q_i}) : \mathbb{Q}(\sqrt[d_1]{q_1}, \ldots, \sqrt[d_{i-1}]{q_{i-1}})]$.

Observe that depending on $k$ the maximum degree $n$ may be anything between 2 and $N$. If $k = 1$, $n = N$, and $N$ is large compared to $d$ it may be preferable not to use the algorithm of Lemma 7.3.8. But if $k = \log N$ and $d$ is large, applying Lemma 7.3.8 improves the run time considerably. In particular, it reduces the number of applications of the lattice basis reduction algorithm in the procedure where denesting elements are determined.

In the complex case we get a table similar to the one above. The run times are much better due to Lemma 7.3.9, p. 155, the fact that the number of admissible sequences that are computed is bounded by $d$, and the fact that on each level from $k-1$ to $0$, instead of computing the inverses of a denesting set, we only need to determine the inverse of a single denesting element. Recall that $k$ and hence the number of levels is bounded by $\log N$, then apply Lemma 7.2.1, p. 135, Lemma 7.2.3, p. 140, Lemma 7.3.4, p. 144, and Lemma 7.3.7, p. 149.

Table 3: Run Times for the Algorithm Denesting Element (The complex case)

procedure | number of operations | bit size of numbers
Field $F^{(i)}$, prim. elements in Step 1 | $O(nN^2\log(nL))$ | $O(n^3N^2L)$ (fp)
 | $O(n^4N^4L)$ | $O(Ln^2N^2\log N)$
Representation of $\gamma$ in Step 1 | polynomial | polynomial (fp)
 | $O(n^5N^5L+n^3N^3K)$ | $O(n^3N^3L+nNK)$
Initial approximations in Step 2 | $O(n)$ | $O(n^4N^3L+n^3N^2B)$ (fp)
 | $O(\log N\log(nNB))$ | $O(n^3N^3L+n^2N^2B)$ (fp)
Computing admissible sequences | $O(dnN^2\log N+N\log N\log(nNB))$ | $O(n^3N^3L+n^2N^2B)$ (fp)
 | $O(dn^5N^5L\log N+dn^3N^3B\log N)$ | $O(n^3N^3L+nNB)$
Computing denesting elements | $O(dnN^2\log N+dN\log N\log(nNB))$ | $O(n^2N^2B)$ (fp)
 | $O(dn^3N^3B\log N)$ | $O(nNB)$
 | $O(dn^2N^2\log d\log N)$ | $O(dB)$
Step 3 | covered by Step 2 |

Recall that in the complex case the Algorithm Denesting Element already determines whether a denesting using radicals of the form $\sqrt[t]{\rho}$, $t \mid d$, exists.

Theorem 7.5.4 Assume the algebraic number field $F = \mathbb{Q}(\alpha)$ contains a primitive $d$-th root of unity. Assume $\gamma$ is a rational expression over $F$ in linearly independent radicals $\sqrt[d_i]{\rho_i}$, $[\rho_i] < 2^L$, $i = 1, 2, \ldots, K$. Here $\rho_i$ is an algebraic integer and $\sqrt[d_i]{\rho_i}$ is of degree $d_i$, $d_i \mid d$, over $F$. Assume $[\gamma]_\infty < 2^K$ and assume that a guarantee is given that a positive integer $C$ less than $2^K$ exists such that $C\gamma$ is an algebraic integer. Finally assume that for any $\varepsilon > 0$ the complex number $\gamma$ can be approximated with error less than $\varepsilon$ in time polynomial in $\log\frac{1}{\varepsilon}$ and $K$.

Then the Algorithm Denesting Element determines in time polynomial in $n, l, N, L, d$, and $K$ a denesting of $\sqrt[d]{\gamma}$ using radicals of the form $\sqrt[t]{\rho}$, $t \mid d$, $\rho \in \mathbb{Q}(\alpha)$, if any such denesting exists. The parameters $n, l, L$ are defined as in Theorem 7.5.1.

In particular, the algorithm uses at most $O(d^2n^5N^5(K+L)\log N)$ elementary operations on integers or floating-point numbers of size $O(d^2n^5N^4(L+K))$ plus the number of operations used to compute an approximation to $\gamma$ with absolute error less than $2^{-(46n^3N^3L+8nNK)}$.

Again the bounds given above are very crude and more precise run times can be read off from Table 3.

Thanks to the fact that $\mathbb{Q}(\alpha)$ contains a primitive $d$-th root of unity, if the denesting applies to that $d$-th root of $\gamma$ we usually denote by $\sqrt[d]{\gamma}$ then it applies to all $d$-th roots of $\gamma$. As before the assumptions on the radicals are no restriction.

As we mentioned before, the last theorem can also be applied to nested radicals if $d_i$ does not divide $d$. In that case, the field $\mathbb{Q}(\alpha)$ should contain primitive $d$-th and $d_i$-th roots of unity. In this situation the Algorithm Denesting Element finds a denesting using only radicals of degree dividing $\mathrm{lcm}(d, d_1, d_2, \ldots, d_K)$.

Corollary 7.5.2 generalizes accordingly to complex radicals.

Corollary 7.5.5 If $\gamma$ is of the form described in Corollary 7.5.2 then the Algorithm Denesting Element in the complex case uses $O(d^2m^3n^6N^7L\log N)$ elementary operations on integers or floating-point numbers of size $O(d^2m^3n^6N^6L)$.

The analysis of the General Denesting Algorithm still requires some work.

Theorem 7.5.6 Suppose $S = \sum_{i=1}^{k}\kappa_i\sqrt[d_i]{\gamma_i}$ is a real depth two radical expression over a real algebraic number field $\mathbb{Q}(\alpha)$, where $\kappa_i, \gamma_i$ are radical expressions satisfying the assumptions of Theorem 7.5.1. In particular their infinity norm is bounded by $2^K$ and for each expression $\gamma_i$, $\kappa_i$ an integer $C_{\gamma_i}$, $C_{\kappa_i} < 2^K$ exists such that $C_{\kappa_i}\kappa_i$ or $C_{\gamma_i}\gamma_i$, respectively, is an algebraic integer.

Denote by $n$ the degree of the minimal polynomial $p$ of $\alpha$ and let $l = \lceil\log|p|_2\rceil$. $2^L$ is the maximum coefficient size of an element in $\mathbb{Q}(\alpha)$ appearing in the definition of $\kappa_i, \gamma_i$, and $d$ is the maximum over all products $d_id_j$ for indices $i, j$ between 1 and $k$. Finally assume that for any fixed pair of indices $i, j$ the radicals appearing in $\kappa_i, \kappa_j, \gamma_i$, and $\gamma_j$ generate an extension of $\mathbb{Q}(\alpha)$ of degree at most $N$.

When applied to $S$ the General Denesting Algorithm determines in time polynomial in $k, n, l, L, N, d$, and the upper bound $K$ whether $S$ can be written as a sum of real depth one radicals over $\mathbb{Q}(\alpha)$. In particular, the General Denesting Algorithm uses at most $O(k^2d^4n^6N^{12}(L+dK))$ elementary operations on integers or floating-point numbers of size $O(kd^2n^5N^6(L+dK))$ plus the number of operations needed to approximate each $\gamma_i, \kappa_i$ with absolute error less than $2^{-(46n^3N^3L+12dnNK)}$.

Within the same time bounds the algorithm decides whether $S = 0$.

Proof: For a pair of indices $i, j$ denote the radical extension generated by the radicals in $\gamma_i, \kappa_i, \gamma_j, \kappa_j$ by $E_{ij}$.

Recall that we first partition $R = \{\sqrt[d_1]{\gamma_1}, \ldots, \sqrt[d_k]{\gamma_k}\}$ into subsets $R_1, \ldots, R_h$ such that two nested radicals are in the same subset if and only if their ratio denests using real radicals. We assume w.l.o.g. that $\sqrt[d_t]{\gamma_t} \in R_t$, $t = 1, \ldots, h$. Then we transform $S$ into

\[ S = \sum_{t=1}^{h}\frac{\sqrt[d_t]{\gamma_t}}{\gamma_t}\sum_{i \in \mathbb{N}\,:\,\sqrt[d_i]{\gamma_i} \in R_t}\kappa_i\gamma_{it}, \]

where $\gamma_{it}$ denotes the denesting of $\sqrt[d_id_t]{\gamma_i^{d_t}\gamma_t^{d_i(d_t-1)}}$.

Since $\gamma_i^{d_t}\gamma_t^{d_i(d_t-1)}$ is an element of a radical extension of degree at most $N$, for each of these $\frac{k(k-1)}{2}$ expressions we can easily determine its denesting using the Algorithm Real Denesting. In particular, observe that any product has infinity norm less than $2^{dK}$. Hence replacing $K$ by $dK$ in Lemma 7.3.1, p. 142, and in the definition of $B$ (see Definition 7.3.3, p. 144), Table 2 or Theorem 7.5.1 easily yields the run times. Moreover observe that in Lemma 7.2.3, p. 140, $K$ also has to be replaced by $dK$. Finally, observe that by Lemma 5.4.3, p. 69, approximations to $\gamma_i, \gamma_t$ with absolute error less than $2^{-(46n^3N^3L+12dnNK)}$ suffice to determine the approximation to $\gamma_i^{d_t}\gamma_t^{d_i(d_t-1)}$ as required by Lemma 7.2.3.

Remark that in this step a primitive element $\eta_{it}$ for the extension generated by the radicals in $\gamma_i, \gamma_t$ has been computed. We may assume that this extension is already $E_{it}$.

The next step that checks which of the radicals $\sqrt[d_i]{\gamma_i}$, $i = 1, 2, \ldots, h$, denests is also easily done within the time bounds stated. Recall that there is at most one radical that denests and we assume that $\sqrt[d_1]{\gamma_1}$ is this one, if any.

In the final step we have to check whether the sums

\[ S_t = \sum_{i \in \mathbb{N}\,:\,\sqrt[d_i]{\gamma_i} \in R_t}\kappa_i\gamma_{it}, \]

for $t \ge 2$ if $\sqrt[d_1]{\gamma_1}$ denests, and for $t \ge 1$ otherwise, are zero.
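The regrouping of $S$ rests on the identity $\sqrt[d_i]{\gamma_i} = \frac{\sqrt[d_t]{\gamma_t}}{\gamma_t}\,\sqrt[d_id_t]{\gamma_i^{d_t}\gamma_t^{d_i(d_t-1)}}$ for positive real $\gamma_i, \gamma_t$. A small floating-point check, with arbitrarily chosen sample values and a hypothetical helper `ratio_radicand`, may make this concrete; it is a numerical sanity check only, not part of the algorithm.

```python
import math

def ratio_radicand(gamma_i, d_i, gamma_t, d_t):
    """Radicand gamma_i^(d_t) * gamma_t^(d_i (d_t - 1)) whose (d_i d_t)-th root
    relates the d_i-th root of gamma_i to the d_t-th root of gamma_t."""
    return gamma_i ** d_t * gamma_t ** (d_i * (d_t - 1))

# Arbitrary positive sample values (assumptions for illustration).
gamma_i, d_i, gamma_t, d_t = 7.0, 2, 3.0, 3

lhs = gamma_i ** (1 / d_i)
rhs = (gamma_t ** (1 / d_t) / gamma_t) * \
    ratio_radicand(gamma_i, d_i, gamma_t, d_t) ** (1 / (d_i * d_t))
```

In the proof the $(d_id_t)$-th root on the right-hand side is the quantity that gets denested to $\gamma_{it}$, which is why every radical in $R_t$ ends up as a depth one multiple of the single representative $\sqrt[d_t]{\gamma_t}$.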

We transform each $\kappa_i$ into a sum of radicals. As mentioned at the end of Section 7.4 this can be done as follows. First compute the representation of $\kappa_i$ as a linear combination of powers of the primitive element $\eta_{it}$ for the extension $E_{it}$. Then, using the same method as in Step 3 of the Algorithm Denesting Element, transform this representation into a linear combination of the elements of the standard basis for this field.

The run time for this step is also covered by the previous run times since similar steps have already been applied in the denesting steps above, in particular in Step 3 of the Algorithm Denesting Element. Also the coefficients in this representation are bounded in absolute value by $2^{O(Ln^2N^2 \log N + dL \log N + K)}$. To prove this bound see Lemma 7.4.1, where a bound on the coefficients in the denesting of $\sqrt[d]{\gamma}$ was given. Of course, the bound for the denesting also applies to the easier case we are presently considering. The estimate is even very crude since the term $dL \log N$ in the bound of Lemma 7.4.1, which comes from the bound on the denesting elements $\gamma^{(i)}$, will not show up in the bound for the coefficients $\kappa_i$. But it suffices for our purposes.

Observe that each $\gamma_{it}$ is a linear combination over $\mathbb{Q}(\alpha)$ of elements of the form

$$\frac{1}{\beta'_{it}} \sqrt[d_i d_t]{\theta_{it} \beta_{it}}.$$

Here $\beta_{it}, \beta'_{it}$ are elements of the standard basis of the radical extension $E_{it}$ and $\theta_{it} \in \mathbb{Q}(\alpha)$ is a denesting element for $\sqrt[d_i d_t]{\gamma_i^{d_t} \gamma_t^{d_i(d_t-1)}}\, \beta_{it}$. By Corollary 7.3.2, p. 143, the coefficient size of $\theta_{it}$ is bounded by $2^{O(n^2N^2L + dL \log N + dK)}$.

In particular, $\gamma_{it}$ is a sum of real depth one radicals over $\mathbb{Q}(\alpha)$ containing at most $N$ terms. Hence, using the representation for $\kappa_i$ as a linear combination of the elements of the standard basis of $E_{it}$, each product $\kappa_i \gamma_{it}$ is a sum of depth one radicals over $\mathbb{Q}(\alpha)$ and so is $S_t$.

The overall number of terms in the sums $S_t$, $t = 1, 2, \dots, h$, is clearly bounded by $kN^2$ and, as mentioned above, the coefficients in these sums have coefficient size less than $2^{O(Ln^2N^2 \log N + dL \log N + dK)}$.

To check whether $S_t$ is zero we apply the method of Section 5. Hence we need to determine for ratios of the form

$$\frac{\beta_{jt} \sqrt[d_i d_t]{\theta_{it} \beta_{it}}\, \beta'_{jt}}{\beta_{it} \sqrt[d_j d_t]{\theta_{jt} \beta_{jt}}\, \beta'_{it}}$$

whether they are elements of $\mathbb{Q}(\alpha)$. Here the leading factors $\beta_{jt}, \beta_{it}$ are elements of the standard basis of $E_{jt}, E_{it}$, respectively, appearing in the representation of $\kappa_j, \kappa_i$ computed above; $\theta_{it}, \theta_{jt}, \beta_{it}, \beta'_{it}, \beta_{jt}, \beta'_{jt}$ are as above.

As in the analysis of Step 1 and Step 3 of the previous section, for these ratio tests we transform denominator and numerator separately into radicals $\sqrt[D_{it}]{\mu}$, $\sqrt[D_{jt}]{\nu}$, where $D_{it}, D_{jt} \le dN^2$. This is possible since $N^2$ is an upper bound on the degree of the radical extension containing $\kappa_i, \kappa_j, \gamma_i, \gamma_j$, and $\gamma_t$, and the degree of any radical contained in this extension must divide the field degree. Analogously for the denominator.

One shows as in the analysis of Step 3 that

$$\log[\mu],\ \log[\nu] \in O(n^2N^4L + dN^2L \log N + dN^2K).$$

For each of the $O(k^2N^4)$ ratios these representations can be computed using $O(n^2(\log^2 N + \log d))$ operations on integers of size $O(n^2N^4L + dN^2L \log N + dN^2K)$. For example, if we denote for the sake of simplicity the degree of the extension generated by $\kappa_i, \kappa_j, \gamma_i, \gamma_j$, and $\gamma_t$ by $N'$, $N' \le N^2$, then $D_{it} = d_i d_t N'$ and $\beta_{jt}$ can be transformed into a radical of the form $\sqrt[D_{it}]{\rho}$, $\rho \in \mathbb{Q}(\alpha)$, as follows.

Assume

$$\beta_{jt} = \prod_{l=1}^{k} \sqrt[d_l]{\rho_l}^{\,e_l}.$$

As in the analysis of Step 1 set $\ell_l = N'/d_l$. Then

$$\beta_{jt} = \sqrt[N']{\prod_{l=1}^{k} \rho_l^{\ell_l e_l}}.$$

To compute this representation we simply have to compute each $\rho_l^{\ell_l e_l}$, which can be done for a single $\rho_l$ using $O(\log N')$ multiplications in $\mathbb{Q}(\alpha)$ since $\ell_l e_l < d_l \ell_l = N'$. Since the number $k$ of radicals $\sqrt[d_l]{\rho_l}$ is bounded by $\log N'$, overall $O(\log^2 N')$ multiplications in $\mathbb{Q}(\alpha)$ are needed. Then we multiply these powers. Taking the $d_i d_t$-th power of this element yields the representation $\sqrt[D_{it}]{\rho}$, $\rho \in \mathbb{Q}(\alpha)$, for $\beta_{jt}$. For the last step $O(\log d_i d_t)$ multiplications are needed. By Lemma 5.5.5, p. 83, $[\rho] < 2^{LdN^2 \log N}$. Now the time bounds follow from Lemma 5.5.2, p. 82.

Exactly the same analysis applies for the transformation of $\beta'_{jt}$ into the appropriate form. To transform $\sqrt[d_i d_t]{\theta_{it} \beta_{it}}$ into this form observe that we only need to transform $\theta_{it}$ and $\beta_{it}$ separately into $N'$-th roots. To do this the same process as above is applied. But observe that since $\theta_{it}$ has coefficient size $2^{O(n^2N^2L + dL \log N + dK)}$, the element $\theta'_{it}$ in $\theta_{it} = \sqrt[N']{\theta'_{it}}$ will have coefficient size $2^{O(n^2N^4L + dN^2L \log N + dN^2K)}$.

Multiplying the representations for $\beta_{it}, \theta_{it}, \beta_{jt}$, and $\beta'_{jt}$ yields the desired result since this can be done by a constant number of multiplications.

Now Theorem 5.6.2, p. 90, or Table 1 on page 88 can be applied, which shows that the last step in the General Denesting Algorithm requires

$$O(k^2 n^4 N^4 (n^2N^4L + dN^2L \log N + dN^2K))$$

elementary operations on floating-point numbers and integers of size

$$O((n + d^2N^4 + kN^2)(n^3N^4L + dnN^2L \log N + dnN^2K)),$$

since in this case the parameters $k$, $\max d_i$, and $L$ of Theorem 5.6.2 take the values $kN^2$, at most $dN^2$, and $n^2N^4L + dN^2L \log N + dN^2K$, respectively (see footnote 26).

Except for the $n^6$ factor these bounds are dominated by the time bounds for the denesting part.

The last claim of the theorem is an immediate consequence.

The exact run time of the General Denesting Algorithm can be obtained from Table 2 by replacing $K$ by $dK$ in $B$ and multiplying the number of operations by $k^2$, and from Table 1 by replacing $L$ by $O(Ln^2N^2 \log N + dL \log N + dK)$. The latter algorithm has to be applied at most $k^2N^4$ times.
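The transformation of a product of radicals into a single $N'$-th root, used in the proof above, can be sketched in a few lines. This is only an illustration under simplifying assumptions: the ground field is taken to be the rationals instead of $\mathbb{Q}(\alpha)$, all radicands are positive, and the function name `to_single_root` and the triple format `(rho, d, e)` are hypothetical, not taken from the thesis.

```python
from fractions import Fraction

def to_single_root(factors, n_prime):
    """Combine prod_l (rho_l ** (1/d_l)) ** e_l into one n_prime-th root.

    factors: list of (rho, d, e) with rho a positive rational, d the root
             degree and e the exponent; every d must divide n_prime.
    Returns the radicand R with  prod_l (...) == R ** (1/n_prime).
    """
    radicand = Fraction(1)
    for rho, d, e in factors:
        if n_prime % d != 0:
            raise ValueError("each root degree must divide n_prime")
        ell = n_prime // d              # ell_l = N' / d_l as in the text
        radicand *= Fraction(rho) ** (ell * e)
    return radicand

# sqrt(2) * (cube root of 3)^2 written as a single 6th root
radicand = to_single_root([(2, 2, 1), (3, 3, 2)], 6)
print(radicand)  # 648, i.e. sqrt(2) * 3**(2/3) == 648**(1/6)
```

Each factor costs one exponentiation with exponent below $N'$, which mirrors the $O(\log N')$ multiplications per radical counted in the proof.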

Corollary 7.5.2 generalizes to the General Denesting Algorithm.

Corollary 7.5.7 If the nested radicals $\gamma_i, \kappa_i$ are as in Corollary 7.5.2 then the General Denesting Algorithm uses at most $O(k^2 d^5 m^3 n^7 N^{14} L)$ elementary operations on integers or floating-point numbers of size $O(k d^3 m^3 n^6 N^8 L)$.

In the very same way the corresponding result for complex radicals can be shown. The proof is slightly simpler since in the ratio tests we need not consider the ratio of nested radicals of different degree. Hence we need not denest radicals whose degree is quadratic in the degree of the nested radicals appearing in $S$.

Footnote 26: Observe that in the $L$ in Theorem 5.6.2 another factor of $n$ is hidden.

Theorem 7.5.8 Assume $\mathbb{Q}(\alpha)$ contains a primitive $d$-th root of unity. Suppose $S = \sum_{i=1}^{k} \kappa_i \sqrt[d]{\gamma_i}$ is a depth two radical expression over $\mathbb{Q}(\alpha)$, such that any depth one radical appearing in $S$ has the form $\sqrt[t]{\rho}$, $t \mid d$. Assume $\kappa_i, \gamma_i$ are radical expressions satisfying the assumptions of Theorem 7.5.4. In particular their infinity norm is bounded by $2^K$ and for each expression $\gamma_i, \kappa_i$ an integer $C_{\gamma_i}, C_{\kappa_i} < 2^K$ exists such that $C_{\kappa_i}\kappa_i$ or $C_{\gamma_i}\gamma_i$ is an algebraic integer.

Denote by $n$ the degree of the minimal polynomial $p$ of $\alpha$. Assume $l = \lceil \log |p|_2 \rceil$. $2^L$ is the maximum coefficient size of an element in $\mathbb{Q}(\alpha)$ appearing in the definition of $\kappa_i, \gamma_i$. Finally assume that for any fixed pair of indices $i, j$ the radicals appearing in $\kappa_i, \kappa_j, \rho_i$, and $\rho_j$ generate an extension of $\mathbb{Q}(\alpha)$ of degree at most $N$.

When applied to $S$ the General Denesting Algorithm determines in time polynomial in $k, n, l, L, N, d$, and the upper bound $K$ whether $S$ can be written as a sum of real depth one radicals over $\mathbb{Q}(\alpha)$.

In particular, the General Denesting Algorithm uses at most $O(k^2 d^2 n^6 N^5 (L + dK) \log N)$ elementary operations on integers or floating-point numbers of size $O(k d^2 n^5 N^4 (L + dK))$, plus the number of operations needed to approximate each $\gamma_i$ with absolute error less than $2^{-(46n^3N^3L + 12dnNK)}$.

Within the same time bounds the algorithm decides whether S = 0.

The complex version of Corollary 7.5.2 generalizes to the General Denesting Algorithm.

Corollary 7.5.9 If the nested radicals $\gamma_i, \kappa_i$ are as in Corollary 7.5.5 then the General Denesting Algorithm uses at most $O(k^2 d^3 m^3 n^7 N^7 L \log N)$ elementary operations on integers or floating-point numbers of size $O(k d^2 m^3 n^6 N^6 L)$.

Before we finish this thesis let us mention some easy extensions of the previous results to quotients of nested radicals. For the quotient of two nested radicals we gave a denesting algorithm already in the General Denesting Algorithm.

If we are interested in one quotient of nested radicals of different degree, at least in the real case we can do slightly better than described in the analysis of the General Denesting Algorithm. The improvement is based on the following lemma, which is an immediate consequence of Lemma 3.4, p. 24.

Lemma 7.5.10 A ratio of real nested radicals $\sqrt[d_1]{\gamma_1} / \sqrt[d_2]{\gamma_2}$ over a real field $F$ denests using real radicals iff

• $\sqrt[d'_1]{\gamma_1}$ and $\sqrt[d'_2]{\gamma_2}$ denest using real radicals

and

• $\sqrt[d]{\gamma'_1 / \gamma'_2}$ denests using real radicals.

Here $d = \gcd(d_1, d_2)$, $d'_i = d_i/d$, and $\gamma'_i$ denotes the denesting of $\sqrt[d'_i]{\gamma_i}$ if it exists.

Using this lemma, rather than applying the Algorithm Denesting Element to a nested radical of degree $d_1 d_2$, we need to apply the algorithm only to nested radicals of degree $\max\{d_1, d_2\}$.

For quotients of sums of nested radicals our results also yield an efficient solution. In fact, let $\Sigma_1, \Sigma_2$ be sums of real nested radicals of depth two. We may assume that the General Denesting Algorithm has already been applied to both sums. Hence no ratio of radicals appearing in the same sum denests. Suppose

$$\frac{\Sigma_1}{\Sigma_2} = S,$$

where $S$ is a depth one expression. This implies

$$\Sigma_1 - S\Sigma_2 = 0.$$

Applying to this sum of nested radicals Theorem 6.1.2, p. 100, or its complex equivalent, Theorem 6.1.5, p. 104, shows that for any nested radical $\sqrt[d]{\gamma}$ in $\Sigma_1$ there exists a unique nested radical $\sqrt[t]{\eta}$ in $\Sigma_2$ such that their ratio denests.

Using the previous results we may compute one such denesting, say,

$$\frac{\sqrt[d]{\gamma}}{\sqrt[t]{\eta}} = \theta.$$

If the coefficients of $\sqrt[d]{\gamma}$ and $\sqrt[t]{\eta}$ in $\Sigma_1, \Sigma_2$ are $\kappa, \mu$, respectively, then $S$ is already uniquely determined as

$$S = \frac{\theta\kappa}{\mu}.$$
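The formula for $S$ can be traced in one line. The following is a sketch of the reasoning, under the assumption that Theorem 6.1.2 (or Theorem 6.1.5) forces every group of terms with pairwise denesting ratios in $\Sigma_1 - S\Sigma_2$ to vanish on its own; since no two radicals within $\Sigma_1$ or within $\Sigma_2$ have a denesting ratio, the group of $\sqrt[d]{\gamma}$ consists of exactly two terms:

```latex
\kappa\,\sqrt[d]{\gamma} \;-\; S\,\mu\,\sqrt[t]{\eta} \;=\; 0
\quad\Longrightarrow\quad
S \;=\; \frac{\kappa}{\mu}\cdot\frac{\sqrt[d]{\gamma}}{\sqrt[t]{\eta}}
  \;=\; \frac{\theta\,\kappa}{\mu}.
```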

Finally we check whether $S$ satisfies $\Sigma_1 - S\Sigma_2 = 0$ using the General Denesting Algorithm.

Obviously the run time is determined by the last step, so the analysis of Theorem 7.5.6 applies.

The next step is to go to sums of ratios of sums of nested radicals. Here we know nothing better than just transforming the problem into a single ratio of sums of nested radicals. But the sums in the numerator and denominator may have length exponential in the length of the original sum. On the other hand, if the bounds from Section 6 are tight, as is to be expected, then the output size of a denesting may also be exponential in the length of the sum of ratios.

But recall that in algorithmic algebra we usually assume that numbers are given in some basis representation, that is, in our case, as a linear combination of nested radicals.

Appendix: Roots of Unity in Radical Extensions of the Rational Numbers

The goal of this appendix is to give the proof of

Theorem 6.4.4 Let $F = \mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k})$ be a real radical extension of $\mathbb{Q}$. If $\zeta_m$ is a primitive $m$-th root of unity then $F(\zeta_m)$ contains at most $24m$ different roots of unity. Moreover, the constant 24 is best possible, i.e., there are real radical extensions $F$ of $\mathbb{Q}$ and $m \in \mathbb{N}$ such that $F(\zeta_m)$ contains exactly $24m$ different roots of unity.

Proof: First we reformulate the problem a bit.

Lemma A 1 The number $M$ of roots of unity in $\mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_m)$ is the maximum of all numbers $N$ such that $\mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_m)$ contains a primitive $N$-th root of unity. Moreover, $m$ divides $M$.

Proof: Assume the field contains no primitive $M$-th root of unity; instead assume $N < M$ is the largest number such that $\mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_m)$ contains a primitive $N$-th root of unity. Then this field contains a primitive $N$-th root of unity and a primitive $K$-th root of unity for some $K > 1$ with $(N, K) = 1$. By Lemma 3.7, p. 26, in this case the field also contains a primitive $KN$-th root of unity $\zeta_{KN}$. This contradicts the maximality of $N$, so $M = N$.

But then all roots of unity in $\mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_m)$ must be powers of $\zeta_M$. In particular, the primitive $m$-th root of unity $\zeta_m$ must be a power of $\zeta_M$. This is possible if and only if $m$ divides $M$.

In view of these facts we can reformulate the original problem. We have to determine the largest multiple $M$ of $m$ such that $\mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_m) = \mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_M)$ for primitive $m$-th and $M$-th roots of unity $\zeta_m, \zeta_M$. Moreover, instead of answering this question for the field $\mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_m)$ we will answer it for $\mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_{m'})$ where $m' = \mathrm{lcm}(4, m)$. The number $M$ deduced in this way will clearly be an upper bound on the number of roots of unity in $\mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_m)$.

Now assume that the prime factorization of $m'$ is given by $m' = 2^e \prod_{i=1}^{l} p_i^{e_i}$, $p_i$ prime, $e, e_i \in \mathbb{N}$, $e \ge 2$. Then $M$ can be written as $M = 2^{e'} \prod_{i=1}^{l'} q_i^{f_i} \prod_{i=1}^{l} p_i^{e_i}$, $q_i$ prime, $e', f_i \in \mathbb{N}$, $e' \ge e$, where the $q_i$'s need not be distinct from the $p_i$'s.

If $\mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_{m'}) = \mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_M)$ then the first field must be the same as $\mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_{M_i})$ for all $i = 0, 1, \dots, l'$, where $M_0 = 2^{e'} \prod_{j=1}^{l} p_j^{e_j}$ and $M_i = q_i^{f_i}\, 2^{e} \prod_{j=1}^{l} p_j^{e_j}$ for $i > 0$. We will show that this is possible only for $e' - e \le 1$ and $q_i^{f_i} = 3$. This implies $M \le 6m' \le 24m$ and will therefore prove the upper bound of the theorem.

To prove the claim we consider for each $M_i$, $i = 0, 1, \dots, l'$, a field $E_i$ such that if $\mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_{m'}) = \mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_{M_i})$ then $E_i$ must be a subfield of this field. Clearly the degrees of $\mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_{m'})$ and of $\mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_{M_i})$ over $E_i$ have to be the same. From this property the claim will easily follow.

We will choose the field $E_i$ to be the field generated by the real radicals $\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}$ and all real square roots in $\mathbb{Q}(\zeta_{M_i})$.

Lemma A 2 Let $m \in \mathbb{N}$ such that $4 \mid m$. If $m = 2^e \prod_{i=1}^{l} p_i^{e_i}$, $e_i \ge 1$, $e \ge 2$, is the prime factorization of $m$ then the subfield of $\mathbb{Q}(\zeta_m)$ generated by all real square roots in $\mathbb{Q}(\zeta_m)$ is $\mathbb{Q}(\sqrt{p_1}, \dots, \sqrt{p_l})$ if $e = 2$ and $\mathbb{Q}(\sqrt{p_1}, \dots, \sqrt{p_l}, \sqrt{2})$ if $e > 2$.

Proof: First all quadratic subfields of $\mathbb{Q}(\zeta_m)$ will be determined. By Galois theory these subfields correspond to subgroups of the Galois group of $\mathbb{Q}(\zeta_m)$ over $\mathbb{Q}$ of order $\frac{\varphi(m)}{2}$, $\varphi(m) = [\mathbb{Q}(\zeta_m) : \mathbb{Q}]$. As we mentioned above (see Lemma 2.3, p. 16) the Galois group of this extension is isomorphic to $\mathbb{Z}^*_m$, the multiplicative group of integers taken modulo $m$ between 1 and $m$ which are relatively prime to $m$. Hence it is abelian. By the following result due to G. Birkhoff [Bi] the number of quadratic subfields of $\mathbb{Q}(\zeta_m)$ equals the number of subgroups of $\mathbb{Z}^*_m$ of order 2.

Lemma A 3 (Birkhoff) If $G$ is an abelian group of order $n$ then the number of subgroups of order $\frac{n}{d}$, $d \mid n$, equals the number of subgroups of order $d$.

As is well-known from number theory (see [IR]) $\mathbb{Z}^*_m$ can be written as a direct product

$$\mathbb{Z}^*_m = \mathbb{Z}^*_{2^e} \times \mathbb{Z}^*_{p_1^{e_1}} \times \mathbb{Z}^*_{p_2^{e_2}} \times \cdots \times \mathbb{Z}^*_{p_l^{e_l}},$$

where $\mathbb{Z}^*_{p_i^{e_i}}$ is a cyclic group of order $p_i^{e_i-1}(p_i - 1)$ and $\mathbb{Z}^*_{2^e}$ is either a cyclic group of order 2 or a direct product of two cyclic groups $C_1, C_2$, one of order 2 and the other of order $2^{e-2}$.

Each subgroup of order 2 of $\mathbb{Z}^*_m$ must be cyclic. Hence we have to determine all elements in $\mathbb{Z}^*_m$ of order 2. By the above representation for $\mathbb{Z}^*_m$ these elements correspond to products $h_1 h_2 g_1 \cdots g_l$, where $h_1 \in C_1$, $h_2 \in C_2$, $g_i \in \mathbb{Z}^*_{p_i^{e_i}}$ and each element is either the unit element of that group or an element of order 2. If $e = 2$ then we have to dismiss the second factor.

By group theory any cyclic group of order $d$ contains for each divisor $d'$ of $d$ exactly one subgroup of order $d'$; in particular, a cyclic group of even order contains exactly one element of order 2. Hence there are $2^{l+1} - 1$ or $2^{l+2} - 1$ elements of order 2 in $\mathbb{Z}^*_m$ depending on whether $e = 2$ or $e > 2$. The $-1$ term occurs because we are not allowed to take the unit element from each subgroup. Accordingly, $\mathbb{Q}(\zeta_m)$ has either $2^{l+1} - 1$ or $2^{l+2} - 1$ quadratic subfields.
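The count of elements of order 2 can be checked by brute force for small moduli. The sketch below is only an illustration; the helper names are hypothetical and the check covers just the moduli listed, all of which are divisible by 4 as the lemma requires.

```python
def count_order_two(m):
    """Number of x in (Z/mZ)* with x != 1 and x*x == 1 (mod m)."""
    return sum(1 for x in range(2, m) if (x * x) % m == 1)

def predicted_count(m):
    """2^(l+1) - 1 if e == 2, 2^(l+2) - 1 if e > 2, for m = 2^e * prod p_i^e_i."""
    e = 0
    while m % 2 == 0:
        m //= 2
        e += 1
    if e < 2:
        raise ValueError("the lemma assumes that 4 divides m")
    l, p = 0, 3                      # l counts the distinct odd primes
    while p * p <= m:
        if m % p == 0:
            l += 1
            while m % p == 0:
                m //= p
        p += 2
    if m > 1:
        l += 1
    return 2 ** (l + 1) - 1 if e == 2 else 2 ** (l + 2) - 1

for m in (4, 8, 12, 16, 24, 60, 120):
    print(m, count_order_two(m), predicted_count(m))
```

For every modulus in the loop both columns agree, for example 3 for $m = 12$ and 15 for $m = 120$.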

Next observe that $\mathbb{Q}(\zeta_{p_i})$, $\mathbb{Q}(\zeta_4)$ are subfields of $\mathbb{Q}(\zeta_m)$. And if $8 \mid m$ then $\mathbb{Q}(\zeta_8)$ is also a subfield. A well-known result in algebraic number theory (see for example [Ja]) states that the unique quadratic subfield of $\mathbb{Q}(\zeta_{p_i})$ is generated by $\sqrt{-p_i}$ if $p_i \equiv 3 \bmod 4$ and is generated by $\sqrt{p_i}$ if $p_i \equiv 1 \bmod 4$. Moreover, $\mathbb{Q}(\zeta_4)$ is of course generated by $\sqrt{-1}$, and $\mathbb{Q}(\zeta_8)$ has three quadratic subfields generated by $\sqrt{-1}$, $\sqrt{2}$, and $\sqrt{-2}$.

Therefore $\mathbb{Q}(\zeta_m)$ contains

$$\sqrt{(-1)^{f_1} 2^{f_2} p_1^{f_3} \cdots p_l^{f_{l+2}}},$$

where each $f_i$ is either 0 or 1 and, in case $e = 2$, $f_2$ is always 0.

As is easily seen (use Theorem 3.9, p. 28, for example, but it can also be proven directly) these square roots generate pairwise distinct quadratic subfields of $\mathbb{Q}(\zeta_m)$. Since this yields $2^{l+1} - 1$ or $2^{l+2} - 1$ distinct quadratic fields depending on whether $e = 2$ or $e > 2$, these must be all quadratic subfields. Hence a real square root that is contained in $\mathbb{Q}(\zeta_m)$ must generate one of the fields

$$\mathbb{Q}\Big(\sqrt{2^{f_2} p_1^{f_3} \cdots p_l^{f_{l+2}}}\Big), \quad f_i \in \{0, 1\}, \quad f_2 = 0 \text{ if } e = 2.$$

Since all these fields are contained in $\mathbb{Q}(\sqrt{p_1}, \dots, \sqrt{p_l})$ if $e = 2$ and in $\mathbb{Q}(\sqrt{p_1}, \dots, \sqrt{p_l}, \sqrt{2})$ if $e > 2$ the lemma follows.

Denote the field generated by the real square roots contained in $\mathbb{Q}(\zeta_{M_i})$ and by the real radicals $\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}$ by $E_i$. Hence $E_i \subset \mathbb{Q}(\zeta_{M_i})$ and $\mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_{M_i}) = E_i(\zeta_{M_i})$. Moreover, if $\zeta_{M_i} \in \mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_{m'})$ then $\mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_{m'}) = E_i(\zeta_{m'})$ and therefore $E_i(\zeta_{m'}) = E_i(\zeta_{M_i})$. In particular, the degree of $E_i(\zeta_{m'})$ over $E_i$ must be equal to the degree of $E_i(\zeta_{M_i})$ over $E_i$.

Applying Theorem 2.4, p. 16, to $K = \mathbb{Q}$, $E = \mathbb{Q}(\zeta_{M_i})$, $F = E_i$ and to $K = \mathbb{Q}$, $E = \mathbb{Q}(\zeta_{m'})$, $F = E_i$ shows that this implies

$$[\mathbb{Q}(\zeta_{m'}) : \mathbb{Q}(\zeta_{m'}) \cap E_i] = [\mathbb{Q}(\zeta_{M_i}) : \mathbb{Q}(\zeta_{M_i}) \cap E_i], \quad i = 0, 1, \dots, l'.$$

Next we determine what the intersections look like.

Lemma A 4 Let $\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}$ be real radicals and $\zeta_m$ be a primitive $m$-th root of unity. By $E$ denote the subfield of $\mathbb{Q}(\sqrt[d_1]{q_1}, \dots, \sqrt[d_k]{q_k}, \zeta_m)$ that is generated by the radicals $\sqrt[d_i]{q_i}$ and by the real square roots contained in $\mathbb{Q}(\zeta_m)$. Then $E \cap \mathbb{Q}(\zeta_m)$ is the field generated by the real square roots in $\mathbb{Q}(\zeta_m)$.

Proof: Since $E \cap \mathbb{Q}(\zeta_m)$ is a subfield of the real radical extension $E$ it must be generated by real radicals (see Theorem 3.13, p. 32).

Since $E \cap \mathbb{Q}(\zeta_m)$ is a real radical extension contained in $\mathbb{Q}(\zeta_m)$ it must be generated by square roots (see Lemma 3.12, p. 32). Also by Lemma 3.12, p. 32, the field generated by all real square roots in $\mathbb{Q}(\zeta_m)$ is the largest possible subfield of $\mathbb{Q}(\zeta_m)$ generated by real radicals.

By definition of $E$ this field is also a subfield of $E$. The lemma follows.

Combining Lemma A 2 and Lemma A 4 shows

• If $e > 2$ then

$$F_i = \mathbb{Q}(\zeta_{M_i}) \cap E_i = \mathbb{Q}(\sqrt{2}, \sqrt{p_1}, \dots, \sqrt{p_l}, \sqrt{q_i}), \quad i = 1, 2, \dots, l',$$

and

$$F = \mathbb{Q}(\zeta_{m'}) \cap E_i = F_0 = \mathbb{Q}(\zeta_{M_0}) \cap E_0 = \mathbb{Q}(\sqrt{2}, \sqrt{p_1}, \dots, \sqrt{p_l}).$$

• If $e = e' = 2$ then

$$F_i = \mathbb{Q}(\zeta_{M_i}) \cap E_i = \mathbb{Q}(\sqrt{p_1}, \dots, \sqrt{p_l}, \sqrt{q_i}), \quad i = 1, 2, \dots, l',$$

and

$$F = \mathbb{Q}(\zeta_{m'}) \cap E_i = F_0 = \mathbb{Q}(\zeta_{M_0}) \cap E_0 = \mathbb{Q}(\sqrt{p_1}, \dots, \sqrt{p_l}) \quad \text{for all } i.$$

• If $e = 2$, $e' > 2$ then

$$F_i = \mathbb{Q}(\zeta_{M_i}) \cap E_i = \mathbb{Q}(\sqrt{p_1}, \dots, \sqrt{p_l}, \sqrt{q_i}), \quad i = 1, 2, \dots, l',$$

$$F_0 = \mathbb{Q}(\zeta_{M_0}) \cap E_0 = \mathbb{Q}(\sqrt{2}, \sqrt{p_1}, \dots, \sqrt{p_l}),$$

and

$$F = \mathbb{Q}(\zeta_{m'}) \cap E_i = \mathbb{Q}(\sqrt{p_1}, \dots, \sqrt{p_l}) \quad \text{for all } i.$$

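Since field degrees are multiplicative,

$$\frac{\varphi(M_i)}{\varphi(m')} = \frac{[\mathbb{Q}(\zeta_{M_i}) : \mathbb{Q}]}{[\mathbb{Q}(\zeta_{m'}) : \mathbb{Q}]} = \frac{[F_i : \mathbb{Q}]}{[F : \mathbb{Q}]}.$$

First consider $i = 0$ and assume $e' > e$. In this case

$$\frac{\varphi(M_0)}{\varphi(m')} = 2^{e'-e}$$

but

$$\frac{[F_0 : \mathbb{Q}]}{[F : \mathbb{Q}]} = 2$$

if $e = 2$. Otherwise this ratio is 1. Hence if $e = 2$ then $e'$ can be at most 3, and if $e > 2$ then $e = e'$.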

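For $i > 0$ we can use a similar argument:

$$\frac{\varphi(M_i)}{\varphi(m')} = q_i^{f_i-1}(q_i - 1)$$

if $q_i$ is distinct from all $p_j$'s. Otherwise

$$\frac{\varphi(M_i)}{\varphi(m')} = q_i^{f_i}.$$

On the other hand

$$\frac{[F_i : \mathbb{Q}]}{[F : \mathbb{Q}]} = 2 \quad \text{or} \quad \frac{[F_i : \mathbb{Q}]}{[F : \mathbb{Q}]} = 1$$

depending on whether $q_i$ is distinct from the $p_j$'s or not.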
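The totient ratios used in this argument are easy to reproduce. The following sketch assumes nothing beyond the standard definition of Euler's function; the function names and the sample value $m' = 20$ are chosen purely for illustration.

```python
def phi(n):
    """Euler's totient function via trial division."""
    result, rest, p = n, n, 2
    while p * p <= rest:
        if rest % p == 0:
            while rest % p == 0:
                rest //= p
            result -= result // p
        p += 1
    if rest > 1:
        result -= result // rest
    return result

def totient_ratio(m_prime, q, f):
    """phi(q^f * m') / phi(m') for a prime q."""
    return phi(m_prime * q ** f) // phi(m_prime)

m_prime = 20                                # 2^2 * 5, so e = 2 and p_1 = 5
print(totient_ratio(m_prime, 3, 1))         # q = 3 new:     3^0 * (3 - 1) = 2
print(totient_ratio(m_prime, 7, 2))         # q = 7 new:     7^1 * (7 - 1) = 42
print(totient_ratio(m_prime, 5, 1))         # q = 5 = p_1:   5^1 = 5
print(phi(2 * m_prime) // phi(m_prime))     # i = 0, e' = 3: 2^(e' - e) = 2

# q^(f-1) * (q - 1) == 2 has the single solution q = 3, f = 1 among odd primes
solutions = [(q, f) for q in (3, 5, 7, 11, 13) for f in (1, 2, 3)
             if q ** (f - 1) * (q - 1) == 2]
print(solutions)                            # [(3, 1)]
```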

Hence $q_i^{f_i-1}(q_i - 1) = 2$ or $q_i^{f_i} = 2$. The second case is impossible for an odd prime and the first one is possible if and only if $q_i = 3$ and $f_i = 1$. As mentioned this proves the upper bound.

It remains to show that this bound is optimal. To do so let $m$ be a positive integer such that $\gcd(24, m) = 1$. Moreover let $m$ be divisible by a prime $p$ satisfying $p \equiv 3 \bmod 4$. Consider $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{p}, \zeta_m)$, where $\zeta_m$ is a primitive $m$-th root of unity.

As noted above $\mathbb{Q}(\zeta_m)$ contains $\sqrt{-p}$. Hence $\sqrt{-1} \in \mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{p}, \zeta_m)$. Therefore this field contains

$$\frac{1}{\sqrt{2}}(1 + \sqrt{-1}) \quad \text{and} \quad \frac{1}{2}(-1 + \sqrt{-3}).$$

The first number is a primitive 8-th root of unity and the second one a primitive 3-rd root of unity. By Lemma 3.7, p. 26, this implies that $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{p}, \zeta_m)$ contains a $24m$-th primitive root of unity.
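The orders of the two roots of unity in this construction can be confirmed numerically. This is a floating-point sanity check only, not a proof; the helper `order` and its tolerance are assumptions made for the illustration, and the second number is taken as $(-1 + \sqrt{-3})/2$.

```python
import cmath
import math

def order(z, limit=200, tol=1e-9):
    """Smallest n >= 1 with z**n == 1 up to tol, or None if none is found."""
    w = 1 + 0j
    for n in range(1, limit + 1):
        w *= z
        if abs(w - 1) < tol:
            return n
    return None

i = cmath.sqrt(-1)
z8 = (1 + i) / math.sqrt(2)          # (1 + sqrt(-1)) / sqrt(2)
z3 = (-1 + cmath.sqrt(-3)) / 2       # (-1 + sqrt(-3)) / 2

print(order(z8), order(z3), order(z8 * z3))
```

Because 8 and 3 are coprime, the product `z8 * z3` is a primitive 24-th root of unity, which is the step where Lemma 3.7 is used in the text.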

The following corollary is an immediate consequence of the previous theorem.

Corollary A 5 Let $m \in \mathbb{N}$. Both numbers $\sin \frac{2\pi}{m}$ and $\cos \frac{2\pi}{m}$ can be written as linear combinations of real radicals over $\mathbb{Q}$ if and only if $m \mid 24$.
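For the divisors of 24 the real radical expressions are the familiar closed forms. The table below is an illustration assembled from standard trigonometric values, not taken from the thesis, and it only checks the "if" direction numerically for a few divisors.

```python
import math

# m -> (cos(2*pi/m), sin(2*pi/m)) written with real square roots only
closed_forms = {
    3:  (-1 / 2, math.sqrt(3) / 2),
    4:  (0.0, 1.0),
    6:  (1 / 2, math.sqrt(3) / 2),
    8:  (math.sqrt(2) / 2, math.sqrt(2) / 2),
    12: (math.sqrt(3) / 2, 1 / 2),
    24: ((math.sqrt(6) + math.sqrt(2)) / 4, (math.sqrt(6) - math.sqrt(2)) / 4),
}

def max_error(table):
    """Largest deviation between the closed forms and the library values."""
    worst = 0.0
    for m, (c, s) in table.items():
        angle = 2 * math.pi / m
        worst = max(worst, abs(math.cos(angle) - c), abs(math.sin(angle) - s))
    return worst

print(max_error(closed_forms) < 1e-12)
```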


References

[AHU] A. V. Aho, J. E. Hopcroft, J. D. Ullman, The Design and Analysis of Computer Algorithms, Addison-Wesley, 1975.

[Ap] T. M. Apostol, Introduction to Analytic Number Theory, Springer-Verlag, 1976.

[Ar] E. Artin, Galois Theory, University of Notre Dame Press, 1942.

[Be] A. S. Besicovitch, “On the Linear Independence of Fractional Powers of Integers”, Journal of the London Mathematical Society, Vol. 15, pp. 3-6, 1940.

[BFHT] A. Borodin, R. Fagin, J. E. Hopcroft, M. Tompa, “Decreasing the Nesting Depth of Expressions Involving Square Roots”, Journal of Symbolic Computation, Vol. 1, pp. 169-188, 1985.

[Bi] G. Birkhoff, “Subgroups of Abelian Groups”, Proceedings of the London Mathematical Society, 2. Series, Vol. 38, pp. 385-401, 1935.

[Br] R. P. Brent, “Fast Multiple-Precision Evaluation of Elementary Functions”, Journal of the ACM, Vol. 23, pp. 242-251, 1976.

[Bro] W. S. Brown, “On Euclid’s Algorithm and the Computation of Polynomial Greatest Common Divisors”, Journal of the ACM, Vol. 18, pp. 478-504, 1971.

[C] H. Cartan, Elementare Theorien der analytischen Funktionen einer oder mehrerer komplexen Veränderlichen, B. I.-Wissenschaftsverlag, 1966.

[Ch] K. Chandrasekharan, Arithmetical Functions, Springer-Verlag, 1970.

[CLR] T. C. Cormen, C. E. Leiserson, R. L. Rivest, Introduction to Algorithms, MIT Press, 1990.

[HH] G. Horng, M.-D. Huang, “Simplifying Nested Radicals and Solving Polynomials by Radicals in Minimum Depth”, Proc. 31st Symposium on Foundations of Computer Science, 1990, pp. 847-854.


[HJLS] J. Håstad, B. Just, J. C. Lagarias, C. P. Schnorr, “Polynomial Time Algorithms for Finding Integer Relations among Real Numbers”, SIAM Journal on Computing, Vol. 18, pp. 859-881, 1989.

[IR] K. Ireland, M. Rosen, A Classical Introduction to Modern Number Theory, Springer-Verlag, 1982.

[J] N. Jacobson, Basic Algebra I, W. H. Freeman and Company, 1974.

[Ja] G. J. Janusz, Algebraic Number Fields, Academic Press, 1973.

[K] M. Kneser, “Lineare Abhängigkeit von Wurzeln”, Acta Arithmetica, Vol. 26, pp. 307-308, 1974/75.

[KLL] R. Kannan, A. K. Lenstra, L. Lovász, “Polynomial Factorization and Nonrandomness of Bits of Algebraic and Some Transcendental Numbers”, Mathematics of Computation, Vol. 50, No. 181, pp. 235-250, 1988.

[L] S. Lang, Algebra I, Addison-Wesley, 1965.

[La1] S. Landau, “Factoring Polynomials over Algebraic Number Fields”, SIAM Journal on Computing, Vol. 14, No. 1, pp. 184-195, 1985.

[La2] S. Landau, “Simplification of Nested Radicals”, SIAM Journal on Computing, Vol. 21, pp. 85-110, 1992.

[La3] S. Landau, “A Note on Zippel-Denesting”, Journal of Symbolic Computation, Vol. 13, pp. 41-46, 1992.

[Le] A. K. Lenstra, “Factoring Polynomials over Algebraic Number Fields”, Proc. European Computer Algebra Conference, LNCS 162, pp. 245-254, 1983.

[LLL] A. K. Lenstra, H. W. Lenstra, L. Lovász, “Factoring Polynomials with Rational Coefficients”, Mathematische Annalen, Vol. 261, pp. 515-534, 1982.

[LMc] L. Langemyr, S. McCallum, “The Computation of Greatest Common Divisors over an Algebraic Number Field”, Journal of Symbolic Computation, Vol. 8, pp. 429-448, 1989.

[Lo] L. Lovász, An Algorithmic Theory of Numbers, Graphs and Convexity, SIAM, 1986.


[Lo1] R. Loos, “Computing in Algebraic Extensions”, Computing, Suppl. 4, pp. 173-187, 1982.

[Lo2] R. Loos, “Generalized Polynomial Remainder Sequences”, Computing, Suppl. 4, pp. 115-137, 1982.

[M] D. A. Marcus, Number Fields, Springer-Verlag, 1977.

[Mi1] M. Mignotte, “Some Useful Bounds”, Computing, Suppl. 4, pp. 259-263, 1982.

[Mi2] M. Mignotte, Mathematics for Computer Algebra, Springer-Verlag, 1992.

[Mo] L. J. Mordell, “On the Linear Independence of Algebraic Numbers”, Pacific Journal of Mathematics, Vol. 3, pp. 625-630, 1953.

[R] S. Ramanujan, Problems and Solutions, Collected Works of S. Ramanujan, Cambridge University Press, 1927.

[Sc1] A. Schönhage, “Schnelle Berechnung von Kettenbruchentwicklungen”, Acta Informatica, Vol. 1, pp. 139-144, 1971.

[Sc2] A. Schönhage, “Storage Modification Machines”, SIAM Journal on Computing, Vol. 9, pp. 490-508, 1980.

[Sc3] A. Schönhage, “The Fundamental Theorem of Algebra in Terms of Computational Complexity”, Preliminary Report, Universität Tübingen, 1982.

[Sc4] A. Schönhage, “Factorization of Univariate Integer Polynomials by Diophantine Approximation and an Improved Basis Reduction Algorithm”, Proc. 11th ICALP, LNCS 172, pp. 437-447, 1984.

[Scr] C. P. Schnorr, “A More Efficient Algorithm for Lattice Basis Reduction”, Journal of Algorithms, Vol. 9, pp. 47-62, 1988.

[Si] C. L. Siegel, “Algebraische Abhängigkeit von Wurzeln”, Acta Arithmetica, Vol. 21, pp. 59-64, 1971.

[vdW] B. L. van der Waerden, Algebra I, Springer-Verlag, 1975.

[WR] P. J. Weinberger, L. P. Rothschild, “Factoring Polynomials Over Algebraic Number Fields”, ACM Transactions on Mathematical Software, Vol. 2, pp. 335-350, 1976.


[Y] C.-K. Yap, Fundamental Problems in Algorithmic Algebra, Princeton Press, to appear 1993.

[Z] R. Zippel, “Simplification of Expressions Involving Radicals”, Journal of Symbolic Computation, Vol. 1, pp. 189-210, 1985.


Summary

In this thesis we describe several simplification algorithms for expressions involving radicals, for example sums of square roots. We show how to transform any sum of square roots into a sum of linearly independent square roots. After this transformation it is very easy to determine whether the sum is zero.
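For the simplest case the idea can be sketched as follows. The code assumes rational coefficients and positive integer radicands, and it relies on the classical fact that square roots of distinct squarefree integers are linearly independent over the rationals; the function names are hypothetical and this is not the algorithm of the thesis, which works over algebraic number fields.

```python
from fractions import Fraction

def squarefree_split(n):
    """Return (s, r) with n == s*s*r and r squarefree, for an integer n >= 1."""
    s, r, p = 1, n, 2
    while p * p <= r:
        while r % (p * p) == 0:
            r //= p * p
            s *= p
        p += 1
    return s, r

def normalize(terms):
    """terms: list of (coefficient, radicand) meaning coefficient * sqrt(radicand).

    Returns {squarefree radicand: coefficient} with zero coefficients dropped.
    The sum is zero exactly when the returned dictionary is empty.
    """
    combined = {}
    for coeff, n in terms:
        s, r = squarefree_split(n)
        combined[r] = combined.get(r, Fraction(0)) + Fraction(coeff) * s
    return {r: c for r, c in combined.items() if c != 0}

print(normalize([(1, 8), (1, 18), (-1, 50)]))   # {}  since 2√2 + 3√2 - 5√2 = 0
print(normalize([(1, 12), (1, 27), (1, 2)]))    # 5√3 + √2 remains
```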

More generally, it is shown how to determine for any sum of real radicals over the rational numbers, in time polynomial in the input size of the sum, whether it is zero. This contrasts with the fact that until now no efficient algorithms to determine the sign of a sum of radicals are known.

Other examples of radical expressions that are simplified are the so-called nested radicals. The problem of denesting nested radicals is best explained by the following examples, which can be found in the notebook of the Indian mathematician Ramanujan.

$$\sqrt{\sqrt[3]{5} - \sqrt[3]{4}} = \frac{1}{3}\left(\sqrt[3]{2} + \sqrt[3]{20} - \sqrt[3]{25}\right)$$

$$\sqrt[6]{7\sqrt[3]{20} - 19} = \sqrt[3]{\frac{5}{3}} - \sqrt[3]{\frac{2}{3}}.$$

The expressions on the left-hand side of the equations have nesting depth two while the expressions on the right-hand side have nesting depth one. In the thesis it is shown that for a large class of nested radicals of depth two a denesting can be found in polynomial time. This class contains all of Ramanujan’s examples. Although in many respects more general, the algorithms known so far for denesting radicals cannot handle the denestings found by Ramanujan.
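Both of Ramanujan's identities quoted above can be checked numerically. The snippet is a floating-point sanity check only (the helper `cbrt` is defined locally for positive arguments); it says nothing about how a denesting is found.

```python
def cbrt(x):
    """Real cube root of a positive number."""
    return x ** (1 / 3)

# sqrt(cbrt(5) - cbrt(4)) = (cbrt(2) + cbrt(20) - cbrt(25)) / 3
lhs1 = (cbrt(5) - cbrt(4)) ** 0.5
rhs1 = (cbrt(2) + cbrt(20) - cbrt(25)) / 3

# sixth root of (7 * cbrt(20) - 19) = cbrt(5/3) - cbrt(2/3)
lhs2 = (7 * cbrt(20) - 19) ** (1 / 6)
rhs2 = cbrt(5 / 3) - cbrt(2 / 3)

print(abs(lhs1 - rhs1) < 1e-9, abs(lhs2 - rhs2) < 1e-9)
```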

From a theoretical point of view the results mentioned above are based on the fact that real radicals over a real field $F$ are already linearly independent if any two of them are linearly independent. The proof of this fact in turn is based on a theorem due to C. L. Siegel that determines to some extent the structure of real radical extensions. We give a simplified proof of this theorem.

From an algorithmic point of view the results are based on an algorithm that, given an algebraic number field, determines in polynomial time for an element in this field its exact representation, provided an upper bound on the representation size of the element and an approximation to this element are given. The main ingredient of this algorithm is the lattice basis reduction algorithm of Lenstra, Lenstra and Lovász.


Zusammenfassung

In der Dissertation werden Algorithmen beschrieben, dieRadikalausdrucke vereinfachen, zum Beispiel Summen von Quadratwurzeln.In diesen Fallen wird die Summe in eine Summe von linear unabhangigenRadikalen umgewandelt. Der Vorteil einer solchen Darstellung ist, daß mitihr schnell entschieden werden kann, ob eine Summe Null ist.

Allgemeiner wird gezeigt, daß fur jede Summe von reellen Radikalen uberden rationalen Zahlen in polynomieller Zeit in der Eingabegroße entschiedenwerden kann, ob die Summe Null ist. Andererseits ist bis heute unbekannt,ob das Vorzeichen einer Summe von Radikalen effizient berechnet werdenkann.

Other examples of expressions that are simplified are nested radical expressions. The problem is best illustrated by the following examples, taken from the notebooks of the Indian mathematician Ramanujan.

\[
\sqrt{\sqrt[3]{5} - \sqrt[3]{4}} = \frac{1}{3}\left(\sqrt[3]{2} + \sqrt[3]{20} - \sqrt[3]{25}\right)
\]

\[
\sqrt[6]{7\sqrt[3]{20} - 19} = \sqrt[3]{\frac{5}{3}} - \sqrt[3]{\frac{2}{3}}.
\]
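Both of Ramanujan's identities can be checked numerically. The sketch below evaluates each side in double-precision floating point and compares them. This is only a plausibility check, not a proof and not the denesting algorithm of the dissertation, and the helper `cbrt` is introduced here for illustration.

```python
import math


def cbrt(x: float) -> float:
    """Real cube root of a positive number."""
    return x ** (1.0 / 3.0)


# sqrt(cbrt(5) - cbrt(4)) = (cbrt(2) + cbrt(20) - cbrt(25)) / 3
lhs1 = math.sqrt(cbrt(5) - cbrt(4))
rhs1 = (cbrt(2) + cbrt(20) - cbrt(25)) / 3

# (7*cbrt(20) - 19)^(1/6) = cbrt(5/3) - cbrt(2/3)
lhs2 = (7 * cbrt(20) - 19) ** (1.0 / 6.0)
rhs2 = cbrt(5 / 3) - cbrt(2 / 3)

print(math.isclose(lhs1, rhs1, abs_tol=1e-9))
print(math.isclose(lhs2, rhs2, abs_tol=1e-9))
```

The second left-hand side subtracts two nearly equal numbers, so some precision is lost to cancellation. The tolerance of 1e-9 is chosen loosely for that reason.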

The expressions on the left-hand sides of the equations have depth 2, whereas the expressions on the right-hand sides have only depth 1. The dissertation shows that for a large class of nested radicals of depth 2 a simplification can be found in polynomial time. All denestings given by Ramanujan fall into this class. Although they are in many respects more general than the algorithms of this dissertation, the denesting algorithms previously described in the literature could not compute Ramanujan's examples.

From a theoretical point of view, the results mentioned above rest on the fact that real radicals over a real field are already linearly independent over this field if any two of them are. The proof of this fact rests on a theorem of C. L. Siegel that determines the structure of real radical extensions almost completely. The dissertation gives a simplified proof of this theorem.

From an algorithmic point of view, the results of the dissertation rest on an algorithm that, given an algebraic number field, computes in polynomial time the exact representation of an element of this field, provided that an upper bound on the representation size of the element and an approximation to the element are known. This algorithm uses the lattice basis reduction algorithm of Lenstra, Lenstra and Lovász.


Johannes Blömer

Curriculum Vitae

May 23, 1964: born in Dinklage

1970 - 1974: attended primary school in Dinklage

1974 - 1983: attended Gymnasium Lohne

May 1983: Abitur at Gymnasium Lohne

Oct. 1983 - Feb. 1989: studied mathematics, history and philosophy at FU Berlin

July 1985: Vordiplom in mathematics

Feb. 1989: Diplom in mathematics

April 1989 - May 1989: contract work (Werkvertrag) at FU Berlin

since May 1989: research associate in the DFG project "Georechner" of Prof. Dr. H. Alt

since October 1991: research associate at FU Berlin
