Introduction to Numerical Methods

Marcin Studniarski

Faculty of Mathematics and Computer Science

University of Łódź

1 Rounding of numbers

We consider a positive decimal number of the following form:

$$x = 0.d_1 d_2 \ldots d_r, \tag{1}$$

which uses $r$ digits to the right of the decimal point.

Definition 1 We say that $x$ is rounded to $s$ decimal places (where $s < r$) if the following operations are performed, depending on the values of $d_s$ and $d_{s+1}$:

(a) For $d_{s+1} \in \{0, 1, 2, 3, 4\}$, the digit $d_s$ remains unchanged and the digits $d_{s+1}, \ldots, d_r$ are discarded.

(b) For $d_{s+1} \in \{5, 6, 7, 8, 9\}$ and $d_s \neq 9$, the digit $d_s$ is increased by $1$ and the digits $d_{s+1}, \ldots, d_r$ are discarded.

(c) For $d_{s+1} \in \{5, 6, 7, 8, 9\}$ and $d_s = 9$, the digit $d_s$ is replaced by $0$, the digit $d_{s-1}$ is increased by $1$ (except for the case where $d_{s-1} = 9$) and the digits $d_{s+1}, \ldots, d_r$ are discarded.

Remark 2 The case where $d_{s+1} = 5$ and $d_{s+2} = \ldots = d_r = 0$ can be handled in different ways. For example, we may choose to increase $d_s$ only when $d_s$ is even.

Definition 3 We say that $x$ is chopped (or truncated) to $s$ decimal places (where $s < r$) if the digits $d_{s+1}, \ldots, d_r$ are discarded (no matter what the value of $d_{s+1}$ is).
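Rounding and chopping to $s$ decimal places can be tried out with Python's standard `decimal` module. The following is only a minimal sketch and not part of the original notes: the helper names `rd` and `ch` are hypothetical, the inputs are passed as strings so that the decimal digits are represented exactly, and the mode `ROUND_HALF_UP` is assumed to play the role of Definition 1 for positive numbers. The mode `ROUND_HALF_EVEN` is shown as one alternative tie rule: it increases the kept digit only when that digit is odd, which differs from the example given in Remark 2.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN, ROUND_HALF_UP


def _unit(s: int) -> Decimal:
    """Return 10**(-s) as a Decimal, e.g. Decimal('0.001') for s = 3."""
    return Decimal(1).scaleb(-s)


def rd(x: str, s: int, rounding: str = ROUND_HALF_UP) -> Decimal:
    """Round the positive decimal x to s places (ties go up by default)."""
    return Decimal(x).quantize(_unit(s), rounding=rounding)


def ch(x: str, s: int) -> Decimal:
    """Chop (truncate) the positive decimal x to s places."""
    return Decimal(x).quantize(_unit(s), rounding=ROUND_DOWN)


print(rd("0.12345", 3))                   # case (a): digit after the cut is 4
print(rd("0.1235", 3))                    # case (b): digit after the cut is 5
print(rd("0.1295", 3))                    # case (c): the kept digit 9 carries over
print(ch("0.1299", 3))                    # chopping ignores the discarded digits
print(rd("0.1225", 3, ROUND_HALF_EVEN))   # exact tie, kept digit 2 is even
```

Passing strings rather than binary floats is a deliberate choice here: a literal such as `0.1235` is not exactly representable in binary, so its decimal digits would not be the ones the definitions refer to.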

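Proposition 4 (a) If $\mathrm{rd}(x)$ is obtained by rounding $x$ to $s$ decimal places, then

$$|x - \mathrm{rd}(x)| \le \tfrac{1}{2} \cdot 10^{-s}. \tag{2}$$

(b) If $\mathrm{ch}(x)$ is obtained by chopping $x$ to $s$ decimal places, then

$$|x - \mathrm{ch}(x)| < 10^{-s}. \tag{3}$$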

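Proof. (a) If $d_{s+1} \in \{0, 1, 2, 3, 4\}$, then $x = \mathrm{rd}(x) + \varepsilon$, where $0 \le \varepsilon < (1/2) \cdot 10^{-s}$; hence (2) follows. If $d_{s+1} \in \{5, 6, 7, 8, 9\}$, then

$$\mathrm{rd}(x) = \mathrm{ch}(x) + 10^{-s} \quad \text{and} \quad x = \mathrm{ch}(x) + \delta \cdot 10^{-s}, \tag{4}$$

where $1/2 \le \delta < 1$. Equalities (4) imply $\mathrm{rd}(x) - x = (1 - \delta) \cdot 10^{-s}$. Since $1 - \delta \le 1/2$, we obtain (2).

(b) We have $x = \mathrm{ch}(x) + \varepsilon$ with $0 \le \varepsilon < 10^{-s}$. Then $|x - \mathrm{ch}(x)| = \varepsilon$, and (3) holds.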
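The two error bounds for rounding and chopping can be checked by brute force on a small example. The sketch below is an illustration under stated assumptions, not a proof: the function name `max_errors` is hypothetical, the values $r = 4$ and $s = 2$ are chosen arbitrarily, and `ROUND_HALF_UP` and `ROUND_DOWN` from the `decimal` module are assumed to model rounding and chopping of positive numbers.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP


def max_errors(r: int, s: int):
    """Largest |x - rd(x)| and |x - ch(x)| over all x = 0.d_1...d_r > 0."""
    unit = Decimal(1).scaleb(-s)          # 10**(-s)
    worst_rd = worst_ch = Decimal(0)
    for n in range(1, 10**r):
        x = Decimal(n).scaleb(-r)         # x = n * 10**(-r), exact
        rd_x = x.quantize(unit, rounding=ROUND_HALF_UP)
        ch_x = x.quantize(unit, rounding=ROUND_DOWN)
        worst_rd = max(worst_rd, abs(x - rd_x))
        worst_ch = max(worst_ch, abs(x - ch_x))
    return worst_rd, worst_ch


worst_rd, worst_ch = max_errors(r=4, s=2)
print(worst_rd, worst_ch)
```

For these parameters the largest rounding error equals $0.005 = \tfrac{1}{2} \cdot 10^{-2}$, so the bound for rounding is attained, while the largest chopping error $0.0099$ stays strictly below $10^{-2}$.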

2 Floating-point number system

Definition 5 (a) A floating-point number system is a subset $F$ of the set of real numbers $\mathbb{R}$ such that each element $x \in F$ has the form

$$x = \pm m \cdot \beta^{e-t}, \tag{5}$$

where $m, \beta, e, t$ are integer parameters, and $\beta$ is called the base (most computers use $\beta = 2$), $t$ is called the precision, $m \in [0, \beta^t - 1]$ is called the mantissa or the significand, and the interval $[e_{\min}, e_{\max}]$ containing $e$ is called the exponent range.

(b) The system $F$ is called normalized if, for every $x \in F$ with $x \neq 0$,

$$m \ge \beta^{t-1} \tag{6}$$

(the number $0$ is an exceptional number which does not have a normalized representation).

(c) The elements of $F$ are called floating-point numbers or machine numbers.

Proposition 6 The range of nonzero floating-point numbers in a normalized system $F$ is given by

$$\beta^{e_{\min}-1} \le |x| \le \beta^{e_{\max}} (1 - \beta^{-t}). \tag{7}$$

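Proof. Let $x \in F$, $x \neq 0$. It follows from (5) and (6) that

$$|x| = m \beta^{e-t} \ge \beta^{t-1} \beta^{e_{\min}-t} = \beta^{e_{\min}-1}.$$

On the other hand, using the condition $m \le \beta^t - 1$, we get

$$|x| \le (\beta^t - 1) \beta^{e_{\max}-t} = \beta^{e_{\max}} (1 - \beta^{-t}).$$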
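The range in (7) can be compared with the limits that Python reports for its own floats. This sketch rests on two assumptions that are not stated in the notes: that Python floats are IEEE 754 double precision numbers, and that the fields `min_exp` and `max_exp` of `sys.float_info` use the same exponent convention as the form (5). Exact rational arithmetic is used because $\beta^{e_{\max}}$ itself is too large to be stored as a float.

```python
import sys
from fractions import Fraction

info = sys.float_info
beta, t = info.radix, info.mant_dig          # base and precision
e_min, e_max = info.min_exp, info.max_exp    # exponent range (assumed convention)

# Left and right ends of the range, computed exactly as rationals.
smallest = Fraction(beta) ** (e_min - 1)
largest = Fraction(beta) ** e_max * (1 - Fraction(beta) ** (-t))

print(smallest == Fraction(info.min))   # smallest positive normalized float
print(largest == Fraction(info.max))    # largest finite float
```

On an IEEE 754 platform both comparisons are expected to hold, with $\beta = 2$, $t = 53$, $e_{\min} = -1021$ and $e_{\max} = 1024$.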

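Definition 7 The positional representation of a number $x \in F$ is given by

$$x = \pm \beta^e \left( \frac{d_1}{\beta} + \frac{d_2}{\beta^2} + \ldots + \frac{d_t}{\beta^t} \right) = \pm \beta^e \times 0.d_1 d_2 \ldots d_t, \tag{8}$$

where $d_i \in [0, \beta - 1]$ are digits, and $d_1 \neq 0$ for $x \neq 0$. Here $d_1$ is called the most significant digit and $d_t$ is called the least significant digit.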

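Definition 8 Let $G \subset \mathbb{R}$ be the set of all real numbers of the form (5) with no restriction on the exponent $e$. We denote by $\mathrm{fl} : \mathbb{R} \to G$ any mapping such that $\mathrm{fl}(x)$ is an element of $G$ nearest to $x$ (since there may be two such elements for some numbers $x$, the mapping may be defined in different ways). We say that

(a) $\mathrm{fl}(x)$ overflows if $|\mathrm{fl}(x)| > \max\{|y| : y \in F\}$,

(b) $\mathrm{fl}(x)$ underflows if $0 < |\mathrm{fl}(x)| < \min\{|y| : 0 \neq y \in F\}$.

For any $\alpha \in \mathbb{R}$ we will use the following notation:

$\lfloor \alpha \rfloor$ is the largest integer less than or equal to $\alpha$,

$\lceil \alpha \rceil$ is the smallest integer greater than or equal to $\alpha$.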

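Theorem 9 [1] Suppose that $x \in \mathbb{R}$ lies in the range of $F$ (that is, it satisfies (7)). Then there exists $\delta \in \mathbb{R}$ such that

$$\mathrm{fl}(x) = x(1 + \delta) \quad \text{and} \quad |\delta| < u := \tfrac{1}{2} \beta^{1-t}. \tag{9}$$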

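Proof. We give the proof for the case of $x > 0$ only. We may represent $x$ in the form

$$x = \mu \cdot \beta^{e-t}$$

for some $\mu \in [\beta^{t-1}, \beta^t)$ and $e \in [e_{\min}, e_{\max}]$. The number $x$ lies between the adjacent floating point numbers

$$y_1 := \lfloor \mu \rfloor \beta^{e-t} \quad \text{and} \quad y_2 := \lceil \mu \rceil \beta^{e-t}.$$

Then either $\mathrm{fl}(x) = y_1$ or $\mathrm{fl}(x) = y_2$. Moreover, $\mathrm{fl}(x)$ is chosen so that

$$|\mathrm{fl}(x) - x| = \min\{|y_i - x| : i = 1, 2\}. \tag{10}$$

Hence, we can estimate

$$|\mathrm{fl}(x) - x| \le \frac{|y_2 - y_1|}{2} \le \frac{\beta^{e-t}}{2}.$$

The conditions imposed on $\mu$ give that

$$x \ge \beta^{t-1} \cdot \beta^{e-t} = \beta^{e-1}. \tag{11}$$

Taking into account (10) and (11), we obtain

$$\frac{|\mathrm{fl}(x) - x|}{x} \le \frac{\beta^{e-t}}{2x} \le \frac{\beta^{e-t}}{2\beta^{e-1}} = \tfrac{1}{2} \beta^{1-t} = u. \tag{12}$$

The second inequality in (12) is strict, except for the case of $\mu = \beta^{t-1}$. But in this case we have $\mathrm{fl}(x) = x$ and the first inequality is strict. Therefore, by defining $\delta := (\mathrm{fl}(x) - x)/x$, we get (9).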
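The statement of Theorem 9 can be observed numerically by computing $\delta = (\mathrm{fl}(x) - x)/x$ exactly. The sketch below is an illustration only and makes several assumptions beyond the notes: Python floats are taken to be IEEE 754 doubles ($\beta = 2$, $t = 53$), the conversion `float(x)` of an exact rational is taken as the mapping $\mathrm{fl}$ (round to nearest), and the helper name `relative_rounding_error` and the sample values are hypothetical.

```python
import sys
from fractions import Fraction

# Unit roundoff u = (1/2) * beta**(1 - t), kept as an exact rational.
u = Fraction(1, 2) * Fraction(sys.float_info.radix) ** (1 - sys.float_info.mant_dig)


def relative_rounding_error(x: Fraction) -> Fraction:
    """Return delta with fl(x) = x * (1 + delta), computed without rounding."""
    fl_x = Fraction(float(x))   # float(x) rounds x; Fraction(...) is then exact
    return (fl_x - x) / x


samples = [Fraction(1, 3), Fraction(1, 10), Fraction(2, 3), Fraction(22, 7)]
deltas = [relative_rounding_error(x) for x in samples]

for x, d in zip(samples, deltas):
    print(x, float(d), abs(d) < u)
```

A number that is already a machine number, such as $1/2$, gives $\delta = 0$.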

Definition 10 (a) The number $u$ appearing in (9) is called the unit roundoff error.

(b) The distance $\varepsilon_M$ from $1$ to the next larger machine number is called the machine epsilon.

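Proposition 11 $\varepsilon_M = 2u$.

Proof. Using the positional representation (8), we see that

$$1 = \beta \times 0.1_1 0_2 \ldots 0_t,$$

where the lower indices denote the positions after the decimal point. The next machine number larger than $1$ is given by

$$\beta \times 0.1_1 0_2 \ldots 0_{t-1} 1_t.$$

Hence, we have

$$\varepsilon_M = \beta \times 0.0_1 0_2 \ldots 0_{t-1} 1_t = \beta \cdot \beta^{-t} = \beta^{1-t} = 2u,$$

since $u = \tfrac{1}{2} \beta^{1-t}$.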
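The relation between the machine epsilon and the unit roundoff can be confirmed for Python floats. This is a small sketch which assumes IEEE 754 double precision with round to nearest, ties to even; the variable names `u` and `eps_m` are chosen here for illustration. The value `sys.float_info.epsilon` is documented as the difference between 1.0 and the least float greater than 1.0, which matches Definition 10 (b).

```python
import sys

beta = sys.float_info.radix        # base, 2 for IEEE 754 doubles
t = sys.float_info.mant_dig        # precision, 53 for IEEE 754 doubles

u = 0.5 * float(beta) ** (1 - t)   # unit roundoff, a power of 2 and hence exact
eps_m = sys.float_info.epsilon     # distance from 1.0 to the next larger float

print(eps_m == 2 * u)              # machine epsilon equals twice the unit roundoff
print(1.0 + eps_m > 1.0)           # 1 + eps_m is the next machine number
print(1.0 + u == 1.0)              # 1 + u is an exact tie and rounds back to 1
```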

3 Error analysis

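Definition 12 Let a number $x$ be approximated by another number $\tilde{x}$ (where $x, \tilde{x} \in \mathbb{R}$). Then

(a) $\tilde{x} - x$ is called the error;

(b) $|\tilde{x} - x|$ is called the absolute error;

(c) $\dfrac{|\tilde{x} - x|}{|x|}$ is called the relative error (we assume that $x \neq 0$).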

In particular, it follows from (12) that the relative error of approximating a number $x$ lying in the range of $F$ by the nearest machine number is less than $u$.

We will now analyse roundoff errors appearing in the process of performing basic arithmetic operations on a digital computer. Let $x, y \in \mathbb{R}$ and let $\circ \in \{+, -, \times, /\}$. Since each of these operations is carried out on machine numbers, the result $\mathrm{fl}(x \circ y)$ is defined for $x, y \in F$ only. We assume that $x \circ y$ is first computed correctly, then normalized and rounded off to $\mathrm{fl}(x \circ y)$. In real computers, this assumption is not satisfied exactly, but it is very close to the truth. In practice, many computers carry out arithmetic operations in special registers that have more bits than the usual machine numbers. Therefore, although the value $x \circ y$ is not computed with full accuracy, it is initially determined with greater precision than the result $\mathrm{fl}(x \circ y)$ later stored as a machine number.

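We consider two possible cases:

(a) $x$ and $y$ are machine numbers (that is, $x, y \in F$) and $x \circ y$ lies in the range of $F$. Then Theorem 9 gives

$$\mathrm{fl}(x \circ y) = (x \circ y)(1 + \delta), \quad |\delta| < u. \tag{13}$$

(b) $x$ and $y$ are not necessarily machine numbers. Then the operation is carried out on $\mathrm{fl}(x)$ and $\mathrm{fl}(y)$, and we get

$$\mathrm{fl}(\mathrm{fl}(x) \circ \mathrm{fl}(y)) = (x(1 + \delta_1) \circ y(1 + \delta_2))(1 + \delta_3), \quad |\delta_i| < u, \quad i = 1, 2, 3. \tag{14}$$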
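Case (a) can be illustrated on Python floats by comparing each computed result with the exact result of the same operation on the same two machine numbers. The sketch assumes IEEE 754 double precision arithmetic, in which the four basic operations are correctly rounded; the helper name `delta` and the operands are hypothetical. Note that the floats `x` and `y` below are machine numbers by construction, namely the doubles nearest to 0.1 and 0.3.

```python
import operator
import sys
from fractions import Fraction

# Unit roundoff u = (1/2) * beta**(1 - t), kept as an exact rational.
u = Fraction(1, 2) * Fraction(sys.float_info.radix) ** (1 - sys.float_info.mant_dig)


def delta(op, x: float, y: float) -> Fraction:
    """Relative error of the float result of op(x, y) against the exact result."""
    exact = op(Fraction(x), Fraction(y))    # exact rational arithmetic
    computed = Fraction(op(x, y))           # floating-point arithmetic, then exact
    return (computed - exact) / exact


x, y = 0.1, 0.3
operations = [("+", operator.add), ("-", operator.sub),
              ("*", operator.mul), ("/", operator.truediv)]
deltas = {name: delta(op, x, y) for name, op in operations}

for name, d in deltas.items():
    print(name, float(d), abs(d) < u)
```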

4 The fundamental operator

In many numerical problems (e.g. interpolation, numerical differentiation and integration, numerical solving of differential equations) one uses approximations of a given function by polynomials. In order to present a unified approach to such problems, we now introduce a linear operator which is a basis of all these polynomial approximation methods.

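Definition 13 [3] The fundamental operator $L$ is defined, for functions $f$ on $\mathbb{R}$ having the derivatives that appear below, by

$$L(f; x) := \alpha f^{(k)}(x) + \sum_{i=0}^{m} \sum_{j=1}^{n} q_{ij}(x) f^{(i)}(a_{ij}), \tag{15}$$

where $f^{(i)}$ is the $i$-th order derivative of $f$ for $i \ge 1$, $f^{(0)} := f$, $a_{ij}$ are constants, $\alpha \in \{0, 1\}$, and $q_{ij}$ are polynomials of the variable $x$.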

By making different choices of parameters $k, m, n, a_{ij}, \alpha, q_{ij}$, we can construct various numerical methods for solving different approximation problems. A general procedure is described below.

1. The part of the operator $L$ which we want to approximate is formally replaced by its approximation, and the resulting operator is denoted by $\tilde{L}$.

2. Then we solve the given approximation problem by solving the equation

$$\tilde{L}(f; x) = 0. \tag{16}$$

This gives a formula for the approximated quantity (e.g. function value, derivative or integral).

3. It follows from (16) that the error of any approximation defined by steps 1–2 is equal to

$$E(x) = L(f; x) - \tilde{L}(f; x). \tag{17}$$

Since we are seeking polynomial approximations only, we assume that the coefficient (number or polynomial) of the approximated quantity in (15) is constant. We now describe some special forms of the fundamental operator which are used to solve specific problems.

4.1 Interpolation and extrapolation

Let $f$ be a function whose values (possibly with certain derivatives) are known on a finite set of points called nodes. The aim of interpolation is to determine approximate values of $f$ at points which are not nodes but lie between nodes. The aim of extrapolation is to approximate $f$ at points lying outside the range of nodes (but sufficiently close to the smallest or to the largest node).

Suppose that in (15) we have $k = 0$, $\alpha = 1$ and at least one coefficient $q_{ij}(x)$ is different from $0$. Then the operator $L$ defines an interpolation of $f$ for $x \in [\min a_{ij}, \max a_{ij}]$ and an extrapolation of $f$ for $x \notin [\min a_{ij}, \max a_{ij}]$. The general form of the interpolation/extrapolation operator is thus the following:

$$L(f; x) := f(x) + \sum_{i=0}^{m} \sum_{j=1}^{n} q_{ij}(x) f^{(i)}(a_{ij}). \tag{18}$$

Although there exist some interpolation methods that use the derivatives of $f$ (e.g. Hermite interpolation), in this lecture we deal with the case of $m = 0$ only. In this case, using the traditional symbols, we denote $l_j(x) := -q_{0j}(x)$. We also write $a_j$ instead of $a_{0j}$. Then (18) can be rewritten as

$$L(f; x) = f(x) - \sum_{j=1}^{n} l_j(x) f(a_j). \tag{19}$$

According to the general procedure, we now replace the function $f$ in (19) by its polynomial approximation $p$. The resulting operator $\tilde{L}$ should be identically zero; therefore

$$p(x) = \sum_{j=1}^{n} l_j(x) f(a_j) \quad \text{for all } x \in \mathbb{R}. \tag{20}$$
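Formula (20) expresses the approximating polynomial as a combination of the node values $f(a_j)$ with polynomial coefficients. The text up to this point does not yet say how these coefficient polynomials are chosen, so the sketch below makes an explicit assumption: it uses the classical Lagrange basis $l_j(x) = \prod_{k \neq j} (x - a_k)/(a_j - a_k)$. The function names `lagrange_basis` and `interpolate` are hypothetical, and exact rational arithmetic is used so that no rounding error enters the example.

```python
from fractions import Fraction


def lagrange_basis(nodes, j, x):
    """Value at x of the j-th Lagrange basis polynomial for distinct nodes."""
    result = Fraction(1)
    for k, a_k in enumerate(nodes):
        if k != j:
            result *= Fraction(x - a_k) / Fraction(nodes[j] - a_k)
    return result


def interpolate(nodes, values, x):
    """p(x) as the sum over j of l_j(x) * f(a_j), as in (20)."""
    return sum(lagrange_basis(nodes, j, x) * values[j]
               for j in range(len(nodes)))


# Node values of f(x) = x**2 at the nodes 0, 1, 2.
nodes = [0, 1, 2]
values = [a * a for a in nodes]

print(interpolate(nodes, values, Fraction(1, 2)))   # a point between nodes
print(interpolate(nodes, values, 3))                # a point outside the nodes
```

Since $f(x) = x^2$ is itself a polynomial of degree 2 and three nodes are used, the polynomial $p$ coincides with $f$ here, both between the nodes (interpolation) and outside them (extrapolation).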
