A set of benchmark programs is provided to allow the user to compare performance of certain routines with similar functionality. For example, the user may wish to compare the performance of lin_sol_gen with LFTRG and LFSRG. Since performance is dependent on problem size and platform, the user can run the time_sol_gen benchmark to determine which of these routines is likely to perform better with the user's specific configuration.
The benchmark programs are supplied with the product in the examples benchmark subdirectory and are summarized in Table B. These programs call Fortran 90 array functions, in single and double precision, to compare the routines shown in columns A and B of Table B. The main program reads single lines of input:
NSIZE NTRIES PREC “Description”
NSIZE NTRIES PREC “Description”
...
QUIT
The values of NSIZE and NTRIES are echoed in the summary tables. The parameter PREC takes the value 1, 2 or 3, depending on whether the single precision version, the double precision version, or both are to be timed. The array functions return a summary table with these 6 values:
1. Average time
2. Standard deviation
3. Total time
4. nsize
5. ntries
6. Time Units/Sec.
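As a rough illustration of how the six values relate to one another, the following Python sketch builds one column of the summary table from a list of per-call tick counts. The function name `summarize` and the sample values are hypothetical, and the use of the population form of the standard deviation is an assumption. The text does not say which form the benchmark functions use, but with these made-up integer samples the population form reproduces the first column of Table A.

```python
import statistics

LABELS = ["Average", "St. Dev.", "Total Ticks", "Size", "Repeats", "Ticks per sec."]

def summarize(tick_samples, nsize, ticks_per_sec):
    """Return the six summary values, in the order listed above."""
    return [
        statistics.fmean(tick_samples),   # 1. average time, in ticks
        statistics.pstdev(tick_samples),  # 2. standard deviation (population form: an assumption)
        float(sum(tick_samples)),         # 3. total time, in ticks
        float(nsize),                     # 4. nsize
        float(len(tick_samples)),         # 5. ntries (number of samples)
        float(ticks_per_sec),             # 6. time units per second
    ]

# Hypothetical tick counts for 5 calls; not taken from a real run.
samples = [3, 4, 4, 4, 3]
column = summarize(samples, nsize=10_000, ticks_per_sec=50)
for label, value in zip(LABELS, column):
    print(f"{value:.4E}  {label}")
```

With a single sample, `statistics.pstdev` returns 0.0, which carries no information about variability.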
As an example, the program time_rand_gen is compiled and linked with the single and double precision timing functions s_rand_gen_bench and d_rand_gen_bench.
The two lines of input are:
100000 5 3 “Random Number Benchmarks”
QUIT
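The input format above can be mimicked with a small reader. This is only a Python sketch of the line layout described here (NSIZE, NTRIES, PREC, a quoted description, and a terminating QUIT). It is not the Fortran main program, and the name `read_requests` is hypothetical. Skipping blank lines and accepting a negative NTRIES are assumptions.

```python
import re

# NSIZE, NTRIES (may be negative), PREC in {1, 2, 3}, then a quoted description.
# Both straight and typographic double quotes are accepted.
LINE = re.compile(r'^(\d+)\s+(-?\d+)\s+([123])\s+["“](.*)["”]$')

def read_requests(lines):
    """Collect (nsize, ntries, prec, description) tuples until QUIT is read."""
    requests = []
    for raw in lines:
        text = raw.strip()
        if not text:
            continue                      # assumption: blank lines are ignored
        if text.upper() == "QUIT":
            break                         # nothing after QUIT is read
        match = LINE.match(text)
        if match is None:
            raise ValueError(f"unrecognized input line: {raw!r}")
        nsize, ntries, prec, description = match.groups()
        requests.append((int(nsize), int(ntries), int(prec), description))
    return requests

print(read_requests(['100000 5 3 "Random Number Benchmarks"', "QUIT"]))
```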
This routine measures the elapsed time to compute 100,000 random numbers with rand_gen and rnun (drnun). The “Average” is the mean of the individual elapsed times for 5 calls to the routines, each call generating 100,000 random numbers. The “St. Dev.” is the standard deviation for that “Average” and indicates its variability. For this value to be informative, |NTRIES| > 1 is required. The value |NTRIES| = 1 is acceptable, but then only one time sample and no standard deviation are obtained. Values of NTRIES > 0 result in the printing of results as shown in Table A. The numbers in the table will vary depending on the machine and other factors that affect the performance of Fortran codes.
Benchmark of rand_gend and rnun: Date of benchmark, (Y, Mo, D, H, M, S): 2006 5 11 8 58 58
| 1 | 3.6000E+00 | 3.2000E+00 | Average |
| 2 | 4.8990E-01 | 4.0000E-01 | St. Dev. |
| 3 | 1.8000E+01 | 1.6000E+01 | Total Ticks |
| 4 | 1.0000E+04 | 1.0000E+04 | Size |
| 5 | 5.0000E+00 | 5.0000E+00 | Repeats |
| 6 | 5.0000E+01 | 5.0000E+01 | Ticks per sec. |
Benchmark of rand_gend and rnun: Date of benchmark, (Y, Mo, D, H, M, S): 2006 5 11 8 58 58
| 1 | 2.8000E+00 | 3.2000E+00 | Average |
| 2 | 4.0000E-01 | 4.0000E-01 | St. Dev. |
| 3 | 1.4000E+01 | 1.6000E+01 | Total Ticks |
| 4 | 1.0000E+04 | 1.0000E+01 | Size |
| 5 | 5.0000E+00 | 5.0000E+00 | Repeats |
| 6 | 5.0000E+01 | 5.0000E+01 | Ticks per sec. |
Table A: Benchmark Summary: rand_gen, rnun, (drnun)
If NTRIES < 0, the functions return the 6 × 2 array of tabular values shown, computed from |NTRIES| samples, and no printing is performed.
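The rules stated above for the sign and magnitude of NTRIES can be summarized in a few lines. The following Python sketch is illustrative only. The function name `interpret_ntries` and the returned keys are hypothetical, and rejecting NTRIES = 0 is an assumption, since the text does not describe that case.

```python
def interpret_ntries(ntries):
    """Describe what a given NTRIES value implies, following the text above."""
    if ntries == 0:
        # Assumption: the text does not cover NTRIES = 0, so it is rejected here.
        raise ValueError("NTRIES = 0 is not described in the text")
    samples = abs(ntries)
    return {
        "samples": samples,                # |NTRIES| timing samples are taken
        "prints_table": ntries > 0,        # results are printed only for NTRIES > 0
        "stdev_informative": samples > 1,  # a single sample yields no standard deviation
    }

print(interpret_ntries(5))
print(interpret_ntries(-1))
```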
To compute a related benchmark such as the rate “random numbers per second” for single precision rand_gen, separately calculate
rate = size × ticks per sec. / average
     = 10^4 × 50 / 3.6
     ≈ 138,889 numbers/sec.
     ≈ 0.139 million numbers/sec.
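The same arithmetic can be checked with a short Python sketch. The function name `rate_per_second` is hypothetical. The inputs are the Size, Ticks per sec. and Average entries from the first column of Table A.

```python
def rate_per_second(size, ticks_per_sec, average_ticks):
    """Items produced per second: size * (ticks per second) / (average ticks per call)."""
    return size * ticks_per_sec / average_ticks

rate = rate_per_second(size=1.0e4, ticks_per_sec=50.0, average_ticks=3.6)
print(f"{rate:,.0f} numbers/sec")             # 138,889 numbers/sec
print(f"{rate / 1e6:.3f} million numbers/sec")  # 0.139 million numbers/sec
```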
| No. | Benchmark files | Routines A | Routines B |
| 1 | time_dft.f90, s_dft_bench.f90, d_dft_bench.f90 | fast_dft | fftcf, fftcb, dfftcf, dfftcb |
| 2 | time_eig_gen.f90, s_eig_gen_bench.f90, d_eig_gen_bench.f90 | lin_eig_gen | e8crg, de8crg |
| 3 | time_eig_self.f90, s_eig_self_bench.f90, d_eig_self_bench.f90 | lin_eig_self | e5csf, de5csf |
| 4 | time_geig_gen.f90, s_geig_gen_bench.f90, d_geig_gen_bench.f90 | lin_geig_gen | g8crg, dg8crg |
| 5 | time_inv_chol.f90, s_inv_chol_bench.f90, d_inv_chol_bench.f90 | lin_sol_self | l2nds, dl2nds |
| 6 | time_inv_gen.f90, s_inv_gen_bench.f90, d_inv_gen_bench.f90 | lin_sol_gen | l2nrg, dl2nrg |
| 7 | time_inv_lsq.f90, s_inv_lsq_bench.f90, d_inv_lsq_bench.f90 | lin_sol_lsq | lsgrr, dlsgrr |
| 8 | time_inv_self.f90, s_inv_self_bench.f90, d_inv_self_bench.f90 | lin_sol_self | lftsf, lfssf, dlftsf, dlfssf |
| 9 | time_rand_gen.f90, s_inv_rand_bench.f90, d_inv_rand_bench.f90 | rand_gen | rnun, drnun |
Table B: Scalar Benchmark Comparisons
| No. | Benchmark files | Routines A | Routines B |
| 10 | time_sol_chol.f90, s_inv_sol_chol.f90, d_inv_sol_chol.f90 | lin_sol_self | lftds, lfsds, dlftds, dlfsds |
| 11 | time_sol_gen.f90, s_sol_gen_bench.f90, d_sol_gen_bench.f90 | lin_sol_gen | lftrg, lfsrg, dlftrg, dlfsrg |
| 12 | time_sol_lsq.f90, s_sol_lsq_bench.f90, d_sol_lsq_bench.f90 | lin_sol_lsq | l2rrv, dl2rrv |
| 13 | time_sol_self.f90, s_sol_self_bench.f90, d_sol_self_bench.f90 | lin_sol_self | lftsf, lfssf, dlftsf, dlfssf |
| 14 | time_svd.f90, s_svd_bench.f90, d_svd_bench.f90 | lin_svd | lsvrr, dlsvrr |
| 15 | time_tri.f90, s_tri_bench.f90, d_tri_bench.f90 | lin_sol_tri | lslcr, dlslcr |
| 16 | time_mult.f90, s_mult_bench.f90, d_mult_bench.f90 | A .x. B | matmul(D,E) |
Table B (continued): Scalar Benchmark Comparisons
Notes on the comparable problems:
1. Perform forward and backward DFT of a random complex sequence of size NSIZE.
2. Compute eigenexpansion of a random real matrix of dimension NSIZE × NSIZE.
3. Compute eigenexpansion of a random symmetric real matrix of dimension NSIZE × NSIZE.
4. Compute generalized eigenexpansion of a random matrix pencil of dimension NSIZE × NSIZE.
5. Compute the inverse of a positive definite real matrix of dimension NSIZE × NSIZE. Uses Cholesky method.
6. Compute the inverse of a general real random matrix of dimension NSIZE × NSIZE. Uses LU factorization.
7. Compute the generalized inverse of a general real random matrix of dimension (2 × NSIZE) × NSIZE. Uses QR factorization for lin_sol_lsq and SVD for LSGRR.
8. Compute the inverse of a real, symmetric random matrix of dimension NSIZE × NSIZE. Uses Aasen's decomposition for lin_sol_self and Bunch-Kaufman decomposition for LFTSF.
9. Generate NSIZE random numbers.
10. Solve a single system of linear equations with a positive definite real random matrix of dimension NSIZE × NSIZE.
11. Solve a single system of linear equations with a general real random matrix of dimension NSIZE × NSIZE.
12. Solve a single least-squares system of linear equations with a real random matrix of dimension (2 × NSIZE) × NSIZE.
13. Solve a single system of linear equations with a symmetric real random matrix of dimension NSIZE × NSIZE.
14. Compute the full singular value decomposition of a general real random matrix of dimension NSIZE × NSIZE.
15. Solve NSIZE systems of linear equations of a nonsymmetric NSIZE × NSIZE tridiagonal matrix. Uses cyclic reduction.
16. Compute products of square matrices of size NSIZE × NSIZE. Compare the IMSL defined operation C = A .x. B with F = matmul(D,E). The arrays are assumed shape. Identical problems A = D and B = E are timed.
17. Compare times to use SHOW() for writing a random array of size NSIZE to a CHARACTER buffer vs. writing the same array to a scratch file.