DIGITAL Software Product Description
___________________________________________________________________
PRODUCT NAME: DIGITAL Extended Math Library SPD 41.86.07
Version 3.4 for Compaq's DIGITAL UNIX
DESCRIPTION
DIGITAL Extended Math Library (DXML) is a set of mathematical subpro-
grams that are optimized for Compaq architectures. Included subpro-
grams cover the areas of Basic Linear Algebra, Linear System and Eigen-
problem Solvers, Sparse Linear System Solvers, Sorting, Random Num-
ber Generation, and Signal Processing.
The Basic Linear Algebra library includes the industry-standard Ba-
sic Linear Algebra Subprograms (BLAS) Level 1, Level 2, and Level 3.
Also included are subprograms for BLAS Level 1 Extensions, Sparse BLAS
Level 1, and Array Math Functions (VLIB).
The Linear System and Eigenproblem Solver library provides the com-
plete LAPACK package developed by a consortium of university and gov-
ernment laboratories. LAPACK is a new, industry-standard subprogram
package offering an extensive set of linear system and eigenproblem
solvers. LAPACK uses blocked algorithms that are better suited to most
modern architectures, particularly ones with memory hierarchies. LA-
PACK will supersede LINPACK and EISPACK for most users.
The Sparse Linear System library provides both direct and iterative
sparse linear system solvers. The direct solver package supports both
symmetric and nonsymmetric sparse matrices stored using the skyline
storage scheme. The iterative solver package contains a basic set of
storage schemes, preconditioners, and iterative solvers. The design
of this package is modular and matrix-free, allowing future expansion
and easy modification by users.
September 1998
The Signal Processing library provides a basic set of signal process-
ing functions. Included are one-, two-, and three-dimensional Fast Fourier
Transforms (FFT), group FFTs, Cosine/Sine Transforms (FCT/FST), Con-
volution, Correlation, and Digital Filters.
Many DXML subprograms are optimized for the supported hardware plat-
forms. Optimization techniques include traditional optimizations such
as loop unrolling and loop reordering. DXML subprograms also provide
efficient management of the hierarchical memory system, using tech-
niques such as the following:
o Reuse of data within registers to minimize memory accesses
o Efficient cache management
o Use of blocked algorithms that minimize translation buffer misses
and unnecessary paging
Since DXML routines can be called from all languages that support Com-
paq's DIGITAL UNIX[R] calling standard, the library provides optimized
computation for applications written in these languages. Where appro-
priate, most subprograms are available in both real and complex ver-
sions, as well as in both single and double precision. The supported
floating point format is IEEE.
Parallel Library Support for Symmetric Multiprocessing
DXML also supports symmetric multiprocessing (SMP) for improved per-
formance. Key BLAS Level 2 and 3 routines, the LAPACK GETRF and POTRF
routines, the sparse iterative solvers, the skyline solvers, and the
FFT routines have been modified to execute in parallel if run on SMP
hardware. These parallel routines along with the other serial routines
are supplied in an alternative library. The user may choose to link
with either the parallel or the serial library, depending on whether
SMP support is required, since each library contains the complete set
of routines.
DXML Run-Time Only Option
Compaq provides a DXML Run-Time Only Option to allow applications built
against the DXML shared library to be run on other systems. Each ad-
ditional target system must have a DXML Run-Time Only Option (library
and license) installed in order to run applications built with the De-
velopment Option. The DXML Run-Time Only Option does not permit new
applications to be developed.
Distributing Applications Built with the DXML Run-Time Library
The DIGITAL Extended Math Library is an application development tool
that provides convenience and improved performance to the developer.
To encourage application developers to incorporate DXML routines from
the DXML archive libraries into their applications for distribution
to other users, Compaq permits the distribution of the DXML Run-Time
Library (RTL), under the following conditions.
You may copy and distribute royalty-free the DXML RTL provided that
you:
1. distribute the RTL only in conjunction with and as a part of your
application,
2. include Compaq's copyright notice on each copy of your application,
3. do not use Compaq's logo or trademarks to market your application,
and
4. agree to defend and indemnify Compaq from and against any claims
or lawsuits that arise or result from the use or distribution of
your application.
The Run-Time Library is that portion of the DXML software that is
required during the execution of your application. For V3.4, the RTL
components are defined to be:
o libdxml_ev4.a
o libdxml_ev5.a
Basic Linear Algebra Subprograms
Linear algebra operations are fundamental to many mathematical
applications, and several libraries of linear algebra subprograms exist
throughout the computer industry. The DXML BLAS library contains the
most commonly used linear algebra subprograms.
The DXML linear algebra library contains five groups of subprograms
at three levels:
o Basic Linear Algebra Subprograms (BLAS) Level 1
o BLAS Level 1 Extensions
o BLAS Level 1 Sparse Extensions
o BLAS Level 2
o BLAS Level 3
BLAS Level 1 (Scalar/Vector and Vector/Vector Operations)
BLAS Level 1 provides a set of elementary vector functions, operat-
ing on one or two vectors. These are typically very small routines,
and they make less efficient use of the computing resources of mod-
ern computer architectures than the Level 2 and 3 operations.
DXML provides the 15 standard BLAS Level 1 operations:
o The index of the element of a vector having maximum absolute value
o The sum of the absolute values of the elements of a vector
o Inner product of two real vectors
o Scalar plus the extended precision inner product of two real vec-
tors
o Conjugated inner product of two complex vectors
o Unconjugated inner product of two complex vectors
o Square root of the sum of squares (norm) of the elements of a vec-
tor
o Scalar times a vector plus a vector
o Copy one vector to another
o Apply a Givens rotation
o Apply a modified Givens plane rotation
o Generate elements for a Givens plane rotation
o Generate elements for a modified Givens plane rotation
o Product of a vector times a scalar
o Swap the elements of two vectors
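As an illustration of two of the operations listed above, the following
Python sketch generates a Givens plane rotation and applies it to a pair
of vectors. The function names are hypothetical and are not DXML routine
names, and the sketch omits the sign and reconstruction conventions that
the standard BLAS rotation routines define.

```python
import math

def givens(a, b):
    """Return (c, s, r) such that [[c, s], [-s, c]] applied to (a, b) gives (r, 0)."""
    r = math.hypot(a, b)
    if r == 0.0:
        return 1.0, 0.0, 0.0          # nothing to rotate
    return a / r, b / r, r

def apply_rotation(x, y, c, s):
    """Apply the plane rotation to each pair (x[i], y[i]) in place."""
    for i in range(len(x)):
        xi, yi = x[i], y[i]
        x[i] = c * xi + s * yi
        y[i] = c * yi - s * xi

c, s, r = givens(3.0, 4.0)            # rotation that zeroes the 4.0
x, y = [3.0, 1.0], [4.0, 2.0]
apply_rotation(x, y, c, s)            # first pair becomes (5, 0) up to rounding
```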
BLAS Level 1 Extensions (Vector/Vector Operations)
When developing mathematical algorithms using the BLAS Level 1, sci-
entists and engineers found that several additional constructs were
used on a regular basis. These constructs are well known throughout
the computer industry as BLAS Level 1 Extensions.
DXML contains 13 BLAS Level 1 Extension operations:
o Index of element having the minimum absolute value
o Index of element having the maximum value
o Index of element having the minimum value
o Largest value of the elements of a vector
o Smallest value of the elements of a vector
o Largest absolute value of the elements of a vector
o Smallest absolute value of the elements of a vector
o Sum of the values of the elements of a vector
o Set all elements of a vector equal to a scalar
o Constant times a vector set to another vector
(y = a x)
o Euclidean norm with no intermediate scaling
o Sum of the squares of the elements of a vector
o Constant times a vector plus a vector set to another vector (z =
a x + y)
BLAS Level 1 Sparse Extensions (Vector/Vector Operations)
This group of operations is similar to the BLAS Level 1 routines, but
is designed to work on sparse vectors (vectors in which most of the
elements are zero). Six of the routines are from industry standard Sparse
BLAS 1, and the remaining three are enhancements.
The nine sparse BLAS Level 1 operations are:
o Scalar times a sparse vector plus a vector
o Sum of a sparse vector and a full vector
o Inner product of a sparse vector and a full vector
o Gather a sparse vector from a full vector
o Gather a sparse vector from the scaled elements of a full vector
o Gather a sparse vector from a full vector and zero corresponding
elements of full vector
o Apply Givens rotation to a sparse vector and a full vector
o Scatter a sparse vector into a full vector
o Scale and scatter a sparse vector into a full vector
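The gather, scatter, and sparse update operations above can be sketched
in Python as follows. This assumes a sparse vector is held in compressed
form as a list of values plus a list of positions; the names are
illustrative only, and the positions are 0-based here, whereas a Fortran
interface would normally use 1-based indices.

```python
def gather(y, indx):
    """Collect the elements of full vector y selected by indx into compressed form."""
    return [y[i] for i in indx]

def scatter(x, indx, y):
    """Write compressed values x into full vector y at positions indx."""
    for xi, i in zip(x, indx):
        y[i] = xi

def sparse_axpy(alpha, x, indx, y):
    """y[indx[k]] += alpha * x[k]; only the stored elements are touched."""
    for xi, i in zip(x, indx):
        y[i] += alpha * xi

def sparse_dot(x, indx, y):
    """Inner product of a compressed sparse vector and a full vector."""
    return sum(xi * y[i] for xi, i in zip(x, indx))

y = [1.0, 2.0, 3.0, 4.0, 5.0]
indx = [0, 3]                      # positions of the stored elements
x = gather(y, indx)                # [1.0, 4.0]
sparse_axpy(2.0, x, indx, y)       # y is now [3.0, 2.0, 3.0, 12.0, 5.0]
d = sparse_dot(x, indx, y)         # 1*3 + 4*12 = 51.0
```

The work in each routine is proportional to the number of stored
elements rather than to the length of the full vector.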
BLAS Level 2 (Matrix/Vector Operations)
The BLAS Level 2 codes make more effective use of the data in the reg-
isters, reducing the number of register loads and stores required. In
addition, loop unrolling techniques are used to minimize cache misses
and page faults. The BLAS Level 2 subprograms use the following types
of operations:
o Matrix/vector products
o Rank-1 and rank-2 matrix updates
o Solutions of triangular systems of equations
Six types of matrices are supported by these BLAS Level 2 routines:
o General
o General band
o Symmetric/Hermitian
o Symmetric/Hermitian band
o Triangular
o Triangular band
BLAS Level 3 (Matrix/Matrix Operations)
The BLAS Level 3 routines operate at a level that makes the most ef-
ficient use of machine resources. DXML optimizes these routines by par-
titioning matrices into blocks and computing matrix/matrix operations
on each block. This approach avoids excessive memory accesses by pro-
viding full reuse of data while each block is in the cache or the reg-
isters. BLAS Level 3 routines provide this kind of blocking for three
basic types of operations:
o Matrix/matrix products
o Rank-k and rank-2k updates of a symmetric matrix
o Solving triangular systems of equations with multiple right-hand
sides
Three types of matrices are supported by these BLAS Level 3 routines:
o General
o Symmetric/Hermitian
o Triangular
A set of additional matrix-matrix routines is provided:
o Add two matrices
o Subtract one matrix from another
o Transpose a matrix, in-place or out-of-place
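The blocking idea described for BLAS Level 3 can be illustrated with a
small Python sketch of a tiled matrix product. This is not the DXML
implementation; the function name and the block size are assumptions
chosen for readability, and a tuned library would pick block sizes to
match the cache and register file.

```python
def blocked_matmul(A, B, block=2):
    """C = A * B computed tile by tile; A is m-by-k, B is k-by-n (lists of rows)."""
    m, k, n = len(A), len(B), len(B[0])
    C = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, block):
        for j0 in range(0, n, block):
            for p0 in range(0, k, block):
                # Update one tile of C from one tile of A and one tile of B,
                # so the same small set of elements is reused many times.
                for i in range(i0, min(i0 + block, m)):
                    for p in range(p0, min(p0 + block, k)):
                        a = A[i][p]
                        for j in range(j0, min(j0 + block, n)):
                            C[i][j] += a * B[p][j]
    return C

A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
B = [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]]
C = blocked_matmul(A, B)           # [[58.0, 64.0], [139.0, 154.0]]
```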
Array Math Functions
The Array Math Functions provide a set of basic math functions that
operate on arrays of numbers rather than on scalars. On vector and su-
perscalar architectures, such functions have a performance advantage
over a loop of scalar operations. The library includes the following
array functions for double precision numbers:
o Sine of array
o Cosine of array
o Cosine and sine of array
o Exponent of array
o Logarithm of array
o Square root of array
o Reciprocal of array
LAPACK Library Contents
LAPACK is a library of linear algebra subprograms intended to solve
a wide range of problems in linear algebra. LAPACK can be used to solve
dense systems of linear equations, linear least squares problems, eigen-
value problems, and singular value problems. It is also useful in do-
ing other computations such as matrix factorizations and estimations
of condition numbers.
The DXML LAPACK library provides the complete LAPACK package. DXML's
version of LAPACK is provided as a packaged library, compiled, tested,
and ready-to-use. Combined with the optimized BLAS Level 3 routines,
the DXML LAPACK will provide optimal performance on all supported plat-
forms. LAPACK should be used in place of LINPACK and EISPACK, because
it is more efficient, accurate, and robust.
LAPACK supports both real and complex, single and double precision data.
It operates on the following types of matrices:
o Bidiagonal
o General band
o General unsymmetric
o General tridiagonal
o Hermitian
o Hermitian, packed storage
o Upper Hessenberg, generalized problem
o Upper Hessenberg
o Orthogonal
o Orthogonal, packed storage
o Symmetric/Hermitian positive definite band
o Symmetric/Hermitian positive definite
o Symmetric/Hermitian positive definite, packed storage
o Symmetric/Hermitian positive definite tridiagonal
o Symmetric band
o Symmetric, packed storage
o Symmetric tridiagonal
o Symmetric
o Triangular band
o Triangular, generalized problem
o Triangular, packed storage
o Triangular
o Trapezoidal
o Unitary
o Unitary, packed storage
LAPACK provides the following operations:
o Triangular factorization
o Unblocked triangular factorization
o Solve a system of linear equations (based on triangular factoriza-
tion)
o Compute the inverse (based on triangular factorization)
o Compute a split Cholesky factorization of a symmetric/Hermitian pos-
itive definite band matrix
o Unblocked computation of inverse
o Estimate condition number
o Refine initial solution returned by solver
o Perform QR factorization without pivoting
o Unblocked QR factorization
o Solve linear least squares problem (based on QR factorization)
o Solve the linear equality constrained least squares (LSE) problem
o Solve the Gauss-Markov linear model problem
o Perform LQ factorization without pivoting
o Unblocked LQ factorization
o Solve underdetermined linear system (based on LQ factorization)
o Generate a real orthogonal or complex unitary matrix as a product
of Householder matrices
o Unblocked generation of real orthogonal or unitary matrix
o Multiply a matrix by a real orthogonal or complex unitary matrix
by applying a product of Householder matrices
o Unblocked version of multiplication of a matrix by a real orthog-
onal or complex unitary matrix by applying a product of Householder
matrices
o Reduce a square matrix to upper Hessenberg form
o Unblocked version of square matrix reduction
o Reduce a symmetric matrix to real symmetric tridiagonal form
o Reduce a band matrix to bidiagonal form
o Unblocked version of symmetric matrix reduction
o Reduce a rectangular matrix to bidiagonal form
o Reduce a band symmetric/Hermitian matrix to tridiagonal form
o Reduce a symmetric/Hermitian-definite banded generalized eigenprob-
lem to standard form
o Compute various norms of a complex Hermitian tridiagonal matrix
o Compute eigenvalues and optional Schur factorization or eigenvec-
tors using QR algorithm
o Compute selected eigenvectors by inverse iteration
o Compute eigenvectors from Schur factorization
o Compute eigenvalues using the Pal-Walker-Kahan variant of the QL
or QR algorithm
o For a pair of N-by-N real nonsymmetric matrices, compute the gen-
eralized eigenvalues, the real Schur form, and the left and/or right
Schur vectors
o For a pair of N-by-N real nonsymmetric matrices, compute the gen-
eralized eigenvalues, and the left and/or right generalized eigen-
vectors
o Solve the generalized nonsymmetric eigenproblem Ax = lambda Bx
o Solve the generalized definite banded eigenproblem Ax = lambda Bx
o Solve the generalized symmetric/Hermitian-definite banded eigen-
problem
o Solve the symmetric eigenproblem using divide-and-conquer algorithm
o Compute singular values and, optionally, singular vectors using the
QR algorithm
o Compute the generalized (quotient) singular value decomposition
o Compute the generalized singular value decomposition (GSVD) on the
M-by-N matrix A and P-by-N matrix B
o Solve a generalized linear regression model problem
Sparse System Solver Subprograms
The DXML Sparse System Solver library contains a set of subprograms
that may be used to solve sparse linear systems of equations. Two pack-
ages providing direct and iterative methods are supported.
Direct Method Sparse Solver Package
The direct solver package includes skyline (profile) solvers for sym-
metric and nonsymmetric matrices. Separate factorization and solver
routines are provided to allow repeated use of the solver for multi-
ple right hand sides, without repeating the factorization. To make the
subprograms easier to use, both simple and expert driver routines are
provided. Functions provided include:
o LDU factorization
o Solve
o Norm evaluation
o Condition number estimation
o Iterative refinement
o Simple and expert drivers
These storage schemes are supported for symmetric and nonsymmetric ma-
trices:
o Profile-in storage
o Structurally symmetric, profile-in storage (for nonsymmetric only)
o Diagonal-out storage
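To show what skyline (profile) storage means in practice, the Python
sketch below packs the upper triangle of a symmetric matrix column by
column, keeping only the entries from the first nonzero in each column
down to the diagonal. The array names and the exact ordering are
assumptions made for illustration; the DXML profile-in and diagonal-out
layouts are not specified here and may differ.

```python
def to_skyline(A):
    """Pack the upper triangle of symmetric matrix A, one column at a time.

    start[j] is the offset of column j in vals, and first[j] is the row
    index of the first stored entry of column j.
    """
    n = len(A)
    vals, start, first = [], [], []
    for j in range(n):
        # First row with a nonzero in column j (the diagonal is always stored).
        top = next(i for i in range(j + 1) if A[i][j] != 0.0 or i == j)
        start.append(len(vals))
        first.append(top)
        vals.extend(A[i][j] for i in range(top, j + 1))
    return vals, start, first

def skyline_get(vals, start, first, i, j):
    """Return A[i][j] from the packed form; entries above the skyline are zero."""
    if i > j:
        i, j = j, i                # symmetry: look in the upper triangle
    if i < first[j]:
        return 0.0
    return vals[start[j] + (i - first[j])]

A = [[4.0, 1.0, 0.0, 0.0],
     [1.0, 5.0, 2.0, 0.0],
     [0.0, 2.0, 6.0, 3.0],
     [0.0, 0.0, 3.0, 7.0]]
vals, start, first = to_skyline(A)   # 7 stored values instead of 16
```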
Iterative Method Sparse Solver Package
For the iterative method, the library provides a modular set of stor-
age schemes, preconditioners, and solvers. These solvers and precon-
ditioners are easily accessed through an integrated driver routine.
Six iterative sparse solvers for real, double precision data are sup-
plied:
o Preconditioned conjugate gradient method
o Preconditioned least squares conjugate gradient method
o Preconditioned biconjugate gradient method
o Preconditioned conjugate gradient squared method
o Preconditioned generalized minimum residual method
o Preconditioned transpose free QMR method
Routines for three storage schemes are provided, or the user may de-
velop routines to employ a custom storage scheme. The supplied stor-
age schemes include:
o Symmetric diagonal
o Unsymmetric diagonal
o General storage by rows
Three preconditioners are supplied, which can be selectively applied
to the data. Users may also supply custom preconditioners. The pre-
conditioners supplied include:
o Diagonal
o Polynomial (Neumann)
o Incomplete LU with zero diagonals added
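The matrix-free, modular design can be sketched in Python with a
preconditioned conjugate gradient loop that receives the matrix-vector
product and the preconditioner as functions. All names below are
hypothetical and do not correspond to the DXML driver interface; the
example pairs the solver with a diagonal preconditioner and a matrix
stored only by its diagonals.

```python
def pcg(matvec, b, precond, tol=1e-10, max_iter=100):
    """Preconditioned conjugate gradient for a symmetric positive definite system.

    matvec(v) returns A*v and precond(r) returns an approximation of
    inv(A)*r, so the solver never sees how the matrix is stored.
    """
    x = [0.0] * len(b)
    r = list(b)                            # residual b - A*x for x = 0
    z = precond(r)
    p = list(z)
    rz = sum(ri * zi for ri, zi in zip(r, z))
    for _ in range(max_iter):
        if sum(ri * ri for ri in r) ** 0.5 <= tol:
            break
        Ap = matvec(p)
        alpha = rz / sum(pi * ai for pi, ai in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        z = precond(r)
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x

# Symmetric tridiagonal matrix kept as a main diagonal and one off-diagonal.
diag, off = [4.0, 4.0, 4.0], [1.0, 1.0]

def matvec(v):
    out = [diag[i] * v[i] for i in range(len(v))]
    for i in range(len(v) - 1):
        out[i] += off[i] * v[i + 1]
        out[i + 1] += off[i] * v[i]
    return out

def diagonal_precond(r):
    return [ri / di for ri, di in zip(r, diag)]

x = pcg(matvec, [5.0, 6.0, 5.0], diagonal_precond)   # close to [1, 1, 1]
```

Swapping in a different storage scheme or preconditioner only requires
passing different functions; the solver loop itself is unchanged.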
Sorting Subprograms
Two sort subprograms using the Quicksort algorithm and two general pur-
pose radix sort subprograms are provided, as follows:
o Sort elements of a vector using the Quicksort algorithm
o Sort an indexed vector of data using the Quicksort algorithm
o Sort data using a radix sort algorithm
o Sort an indexed vector of data using a radix sort algorithm
All of the above sorts operate on data stored in memory.
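The difference between sorting data and sorting an indexed vector can be
seen in the following Python sketch of a least-significant-digit radix
sort that returns a permutation instead of moving the keys. It is a
minimal illustration under the assumption of non-negative integer keys;
the function name and digit width are not taken from DXML.

```python
def radix_sort_index(keys, bits=8):
    """Return a permutation that sorts non-negative integer keys (LSD radix sort)."""
    perm = list(range(len(keys)))
    if not keys:
        return perm
    mask = (1 << bits) - 1
    largest = max(keys)
    shift = 0
    while (largest >> shift) > 0:
        buckets = [[] for _ in range(1 << bits)]
        for i in perm:                     # stable pass over one digit
            buckets[(keys[i] >> shift) & mask].append(i)
        perm = [i for bucket in buckets for i in bucket]
        shift += bits
    return perm

keys = [170, 45, 75, 90, 802, 24, 2, 66]
perm = radix_sort_index(keys)              # keys themselves are left in place
sorted_keys = [keys[i] for i in perm]
```

Because only the index vector is permuted, the same permutation can then
be applied to other arrays that are ordered like the keys.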
Random Number Subprograms
DXML provides four random number generator subprograms:
o Produce a vector of uniform [0,1], long-period random numbers us-
ing the L'Ecuyer multiplicative method
o Produce a vector of N(0,1), normally-distributed random numbers
Note: Two auxiliary input routines are provided to allow the above
generator subprograms to be called from within a parallel section of
a program.
o Produce single precision random numbers using a linear multiplica-
tive algorithm
o Produce single precision random numbers using a Lehmer multiplica-
tive generator
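For readers unfamiliar with multiplicative generators, the Python sketch
below implements a Lehmer generator of the form x <- a*x mod m. The
constants are the widely published "minimal standard" values (a = 16807,
m = 2^31 - 1) and are an assumption made for illustration; this
description does not state which constants DXML uses.

```python
def lehmer(seed, count, a=16807, m=2**31 - 1):
    """Return (values, state): `count` uniform values in (0, 1) and the final state."""
    x = seed
    values = []
    for _ in range(count):
        x = (a * x) % m                # multiplicative congruential step
        values.append(x / m)
    return values, x

values, state = lehmer(seed=1, count=5)
# Passing `state` back in as the seed continues the same sequence.
more, state = lehmer(seed=state, count=5)
```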
Signal Processing Subprograms
The DXML Signal Processing library contains a set of subprograms in
four basic areas of signal processing:
o Fast Fourier Transforms (FFT)
o Fast Cosine and Fast Sine Transforms (FCT and FST)
o Convolution and correlation
o Digital filters
Fast Fourier Transforms and Cosine and Sine Transforms
DXML provides one-dimensional, two-dimensional, three-dimensional, and
group FFT routines and one-dimensional FCT/FST routines. Each routine
is supplied in two forms:
o The first form computes the transform in one unit operation. This
is convenient for programs requiring speed on only one or a few op-
erations.
o The second form is provided for programs requiring speed on repeated
operations. With this form, each routine is subdivided into three
routines. One routine builds the rotation factors, a second rou-
tine applies them to perform the transform, and a third routine deal-
locates any virtual memory allocated in the first routine. Thus,
for repeated operations, the rotation factors need to be built only
once.
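The benefit of the second form can be illustrated with a Python sketch
in which the rotation factors are built once and reused across
transforms. The function names are hypothetical and are not the DXML
routine names; the transform is a plain radix-2 algorithm that assumes
the length is a power of two, and no explicit deallocation step is
needed in Python.

```python
import cmath

def fft_init(n):
    """Build the rotation factors for a length-n transform (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]

def fft_apply(x, w):
    """Forward transform of x using the factors w returned by fft_init(len(x))."""
    n = len(x)
    if n == 1:
        return list(x)
    # The factors for length n/2 are every other factor for length n.
    even = fft_apply(x[0::2], w[0::2])
    odd = fft_apply(x[1::2], w[0::2])
    out = [0j] * n
    for k in range(n // 2):
        t = w[k] * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

w = fft_init(4)                        # built once
X1 = fft_apply([1, 2, 3, 4], w)        # reused for every length-4 transform
X2 = fft_apply([1, 0, 0, 0], w)
```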
Convolution and Correlation
DXML provides routines for computing one-dimensional discrete convo-
lutions and correlations. These routines can process both periodic and
nonperiodic data.
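The distinction between periodic and nonperiodic data is shown in the
Python sketch below, which computes both forms of a one-dimensional
discrete convolution directly from the definition. The function names
are illustrative only and the direct summation is chosen for clarity,
not speed.

```python
def convolve_linear(x, h):
    """Nonperiodic convolution: the output has len(x) + len(h) - 1 elements."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def convolve_periodic(x, h):
    """Periodic (circular) convolution of two sequences of the same length."""
    n = len(x)
    y = [0.0] * n
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[(i + j) % n] += xi * hj   # indices wrap around the period
    return y

x = [1.0, 2.0, 3.0]
h = [1.0, 1.0, 0.0]
lin = convolve_linear(x, h)      # [1.0, 3.0, 5.0, 3.0, 0.0]
per = convolve_periodic(x, h)    # [4.0, 3.0, 5.0]
```

In the periodic case the tail of the nonperiodic result wraps around and
is added to the first elements.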
Digital Filters
DXML provides support for one-dimensional, nonrecursive digital
filtering. Based on Kaiser's Sinh-Bessel algorithm, these routines
allow programming of bandpass, bandstop, low-pass, and high-pass
filters.
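As a rough illustration of a nonrecursive filter design of this kind,
the Python sketch below builds low-pass coefficients by shaping the
ideal sinc response with a Kaiser window expressed through the modified
Bessel function I0. This is an assumption about the general technique,
not a statement of how DXML formulates it; the function names and
parameters are hypothetical.

```python
import math

def bessel_i0(x, terms=25):
    """Modified Bessel function of the first kind, order zero, by power series."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total

def kaiser_lowpass(num_taps, cutoff, beta):
    """Low-pass filter coefficients: ideal sinc response times a Kaiser window.

    num_taps must be at least 2; cutoff is a fraction of the sampling
    rate (0 < cutoff < 0.5); beta controls the window shape.
    """
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        t = n - mid
        if t == 0:
            ideal = 2.0 * cutoff
        else:
            ideal = math.sin(2.0 * math.pi * cutoff * t) / (math.pi * t)
        ratio = t / mid
        window = bessel_i0(beta * math.sqrt(1.0 - ratio * ratio)) / bessel_i0(beta)
        taps.append(ideal * window)
    gain = sum(taps)
    return [c / gain for c in taps]    # unit gain at zero frequency

def fir_filter(signal, taps):
    """Apply the filter; each output depends only on inputs (nonrecursive)."""
    return [sum(taps[k] * signal[n - k] for k in range(len(taps)) if n - k >= 0)
            for n in range(len(signal))]

taps = kaiser_lowpass(num_taps=21, cutoff=0.1, beta=5.0)
```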
Cray SciLib Portability Support
SCIPORT is Compaq Computer Corporation's implementation of the Cray
Research scientific numerical library, SciLib. SCIPORT provides
64-bit single-precision and 64-bit integer interfaces to underlying
DXML routines for Cray users porting programs to Alpha systems running
Compaq's DIGITAL UNIX. SCIPORT also provides an equivalent version of
almost all Cray Math Library and CF77 (Cray Fortran 77) Math intrinsic
routines.
To be completely source code compatible with SciLib, the SCIPORT
library calling sequence supports 64-bit integers passed by reference.
Internally, however, SCIPORT uses 32-bit integers. Consequently,
some run-time uses of SciLib are not supported by SCIPORT.
SCIPORT provides the following:
o 64-bit versions of all Cray SciLib single-precision BLAS Level 1,
Level 2, and Level 3 routines
o All Cray SciLib LAPACK routines
o All Cray SciLib Special Linear System Solver routines
o All Cray SciLib Signal Processing routines
o All Cray SciLib Sorting and Searching routines
These routines are completely interchangeable with their Cray SciLib
counterparts up to the runtime limit on integer size, and with the ex-
ception of the ORDERS routine, require no program changes to function
correctly. Owing to endian differences of machine architecture, spe-
cial considerations must be given when the ORDERS routine is used to
sort multi-byte character strings.
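The endian issue can be seen in a short Python sketch. It assumes, for
illustration only, that character data is compared as whole integer
words; under that assumption the same bytes order differently depending
on the byte order of the machine. The helper name is hypothetical.

```python
def pack(chars, byteorder):
    """Interpret a short ASCII string as one integer word with the given byte order."""
    return int.from_bytes(chars.encode("ascii"), byteorder)

words = ["BA", "AB"]
big = sorted(words, key=lambda w: pack(w, "big"))         # ['AB', 'BA']
little = sorted(words, key=lambda w: pack(w, "little"))   # ['BA', 'AB']
```

With big-endian words the integer order matches the character order; with
little-endian words the last character becomes the most significant
byte, so the result no longer matches a character-by-character sort.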
HARDWARE REQUIREMENTS
DXML will operate on any AlphaStation or AlphaServer capable of
running Compaq's DIGITAL UNIX. In addition, DXML will operate correctly
when the archive library is linked to an application built with the
Compaq's DIGITAL UNIX version of the VxWorks[R] development
environment and executed on an Alpha embedded processor. Such use may
require an additional license.
DXML versions 3.1-3.4 provide two versions of the libraries built for
the Alpha EV4 and EV5 implementations. Both versions of the libraries
will function correctly on either EV4 or EV5 processors, but may ex-
hibit some performance loss when not run on the designated implemen-
tation.
Disk Space Requirements
Development Option
Disk space required for installation:
   Root file system:     /       0 MB
   Other file systems:   /usr   90 MB
                         /tmp    0 MB
                         /var    0 MB
Disk space required for use (permanent), including man pages:
   Root file system:     /       0 MB
   Other file systems:   /usr   57 MB
                         /var    0 MB
Run-Time Option
Disk space required for installation:
   Root file system:     /       0 MB
   Other file systems:   /usr   55 MB
                         /tmp    0 MB
                         /var    0 MB
Disk space required for use (permanent):
   Root file system:     /       0 MB
   Other file systems:   /usr   20 MB
                         /var    0 MB
These counts refer to the disk space required on the system disk. The
sizes are approximate; actual sizes may vary depending on the user's
system environment, configuration, and software options.
SOFTWARE REQUIREMENTS
Compaq's DIGITAL UNIX Operating System Versions V4.0 through V4.0D
SOFTWARE LICENSING
This software is furnished only under a license. For more information
about Compaq's licensing terms and policies, contact your local Com-
paq office.
License Management Facility Support
This layered product supports Compaq's DIGITAL UNIX License Manage-
ment Facility. License units for this product are allocated on an Un-
limited Use Basis.
For more information on the License Management Facility, refer to Com-
paq's DIGITAL UNIX Operating System Software Product Description (SPD
41.61.xx) or to Compaq's DIGITAL UNIX Operating System documentation
set.
GROWTH CONSIDERATIONS
The minimum hardware/software requirements for any future version of
this product may be different from the requirements for the current
version.
DISTRIBUTION MEDIA
CD-ROM
This product is also available as part of Compaq's DIGITAL UNIX
Consolidated Software Distribution on CD-ROM (QA-054AA-H8).
The software documentation for this product is also available as part
of Compaq's DIGITAL UNIX Online Documentation Library on CD-ROM.
Year 2000 Ready
This product is Year 2000 Ready.
SOFTWARE WARRANTY
This software is provided by Compaq with a 90 day conformance warranty
in accordance with the Compaq warranty terms applicable to the license
purchase.
The above information is valid at time of release. Please contact your
local Compaq office for the most up-to-date information.
ORDERING INFORMATION
Development Option
Software Licenses: QL-MUXA*-**
Software Media: QA-MUXAA-H8
Software Documentation: QA-MUXAA-GZ
Software Product Services: QT-MUXA*-**
Run-Time Option
Software Licenses: QL-MUYA*-**
Software Media: QA-MUYAA-H8
Software Documentation: QA-MUYAA-GZ
Software Product Services: QT-MUYA*-**
* Denotes variant fields. For additional information on available li-
censes, services, and media, refer to the appropriate price book.
SOFTWARE PRODUCT SERVICES
A variety of service options are available. For more information, please
contact your local Compaq office.
[R] UNIX is a registered trademark in the United States and other
countries licensed exclusively through X/Open Company Lim-
ited.
[TM] The Compaq logo, AlphaGeneration, AXP, DEC, and DIGITAL are
trademarks of Compaq Computer Corporation.
[TM] CRAY is a trademark of Cray Research, Inc.
[R] VxWorks is a registered trademark and VxGDB is a trademark of
Wind River Systems, Inc.
© 1998 Compaq Computer Corporation. All rights reserved.