November 2006 B
High Productivity Computing Systems and the Path Towards Usable Petascale Computing
Sadaf R. Alam, Oak Ridge National Laboratory
Nikhil Bhatia, Oak Ridge National Laboratory
Jeffrey S. Vetter, Oak Ridge National Laboratory

3. Construction and Validation of Symbolic Models

As a motivating example, we demonstrate MA on the NAS CG benchmark. NAS CG computes an approximation to the smallest eigenvalue of a large, sparse, symmetric positive definite matrix, a computation characteristic of unstructured grid codes. The main subroutine is conj_grad, which is called niter times. The benchmark reports the time in seconds for the time-step loop (do it = 1, niter) shown in Figure 2. Hence, we began constructing the model in the main program at this loop, which is annotated as conj_loop in Figure 2. The first step was to identify the key input parameters: na, nonzer, niter, and nprocs (the number of MPI tasks). Then, using the ma_def_variable_assign_int function, we declared the essential derived values, which are used later in the MA annotations to simplify the model representation.

Figure 2 shows only the MA annotations applied to the NAS CG main loop. It does not show the annotations in the conj_grad subroutine, which are similar except for additional ma_subroutine_start and ma_subroutine_end calls. The annotations in Figure 2 capture the overall flow of the application at different levels of the hierarchy: loops (e.g., conj_loop, mpi), subroutines (e.g., conj_grad), basic loop blocks, etc. These annotations also define variables (e.g., na, nonzer, niter, nprocs) that matter to the user because they determine the problem resolution and its workload requirements.

At the lowest level is the norm_loop shown in Figure 2. The ma_loop_start and ma_loop_end annotations specify that the enclosed loop executes a number of iterations (e.g., l2npcols) with a typical number of MPI send and receive operations. More specifically, the ma_loop_start routine captures an annotation name and a symbolic expression (e.g., l2npcols = log2(num_proc_cols)) that defines the number of iterations in terms of an earlier defined MA variable (i.e., num_proc_cols).
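The pairing of loop-start and loop-end annotations can be illustrated with a small Python sketch. It is not the MA library: the LoopRecorder class and its methods are hypothetical names introduced here, and the sketch assumes that the start call carries the predicted iteration count while the end call carries the count actually observed, which is one plausible reading of the calls in Figure 2.

```python
class LoopRecorder:
    """Hypothetical stand-in for an annotation runtime; not the MA API."""

    def __init__(self):
        self._open = []    # stack of (name, expression, predicted)
        self.records = []  # finished loops: (path, expression, predicted, observed)

    def loop_start(self, name, expression, predicted):
        # Remember the symbolic expression and the value it predicts.
        self._open.append((name, expression, predicted))

    def loop_end(self, name, observed):
        if not self._open or self._open[-1][0] != name:
            raise ValueError(f"loop_end({name!r}) does not match the open loop")
        # The path records where the loop sits in the annotation hierarchy.
        path = "/".join(entry[0] for entry in self._open)
        _, expression, predicted = self._open.pop()
        self.records.append((path, expression, predicted, observed))

    def mismatches(self):
        # Loops whose observed count differs from the symbolic prediction.
        return [r for r in self.records if r[2] != r[3]]


niter, l2npcols = 3, 2  # small illustrative values, not benchmark settings
rec = LoopRecorder()
rec.loop_start("conj_loop", "niter", niter)
outer_count = 0
for _ in range(niter):
    rec.loop_start("norm_loop", "l2npcols", l2npcols)
    inner_count = 0
    for _ in range(l2npcols):
        inner_count += 1  # placeholder for the MPI receive in the real loop
    rec.loop_end("norm_loop", inner_count)
    outer_count += 1
rec.loop_end("conj_loop", outer_count)

for path, expression, predicted, observed in rec.records:
    print(path, expression, predicted, observed)
```

Keeping the open loops on a stack is what lets each record carry its position in the hierarchy, so a mismatch can be traced to a specific nesting level rather than to a loop name alone.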

call maf_def_variable_int('na',na)
call maf_def_variable_assign_int ('num_proc_cols',
'2^ceil(log(nprocs)/(2*log(2)))', num_proc_cols)
call maf_def_variable_assign_int( 'l2npcols'
call maf_loop_start('conj_loop', 'niter', niter)
   do it = 1, niter
     call conj_grad ( colidx, reduce_recv_lengths )........
     call maf_loop_start('norm_loop', 'l2npcols', l2npcols)
     do i = 1, l2npcols
       call maf_mpi_irecv('nrecv','dp*2', dp*2,1)
       call mpi_irecv( norm_temp2, 2, dp_type ........
     call maf_loop_end('norm_loop',i-1)

Figure 2. Annotation of the CG benchmark with MA API calls.
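To make the derived values in Figure 2 concrete, the following Python sketch evaluates the symbolic expression for num_proc_cols and the relation l2npcols = log2(num_proc_cols) for a given nprocs, then counts the receives posted by norm_loop over the whole run. The function names are introduced here for illustration, the 8-byte size for dp is an assumption, and niter = 15 is only an example value. Integer arithmetic replaces the floating-point logarithms to avoid rounding problems in the ceiling.

```python
def derived_values(nprocs):
    """Evaluate num_proc_cols = 2^ceil(log(nprocs)/(2*log(2))) and its log2."""
    if nprocs < 1:
        raise ValueError("nprocs must be at least 1")
    # (nprocs - 1).bit_length() equals ceil(log2(nprocs)) for nprocs >= 1.
    ceil_log2 = (nprocs - 1).bit_length()
    # ceil(log2(nprocs) / 2) equals ceil(ceil(log2(nprocs)) / 2).
    l2npcols = (ceil_log2 + 1) // 2
    num_proc_cols = 1 << l2npcols
    return num_proc_cols, l2npcols


def norm_loop_receives(niter, nprocs, dp_bytes=8):
    """Count norm_loop receives per task and their total size in bytes.

    dp_bytes is an assumed size for one double precision value; each
    receive is annotated with the symbolic size 'dp*2' in Figure 2.
    """
    _, l2npcols = derived_values(nprocs)
    messages = niter * l2npcols  # norm_loop runs once per conj_loop iteration
    return messages, messages * 2 * dp_bytes


for nprocs in (1, 2, 4, 8, 16, 32, 64):
    cols, l2 = derived_values(nprocs)
    print(f"nprocs={nprocs:3d}  num_proc_cols={cols:2d}  l2npcols={l2}")

print(norm_loop_receives(niter=15, nprocs=16))
```

Because the model is expressed in terms of nprocs and niter, the same two functions give the predicted message count for any task count without rerunning the benchmark.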


Reference this article
"Symbolic Performance Modeling of HPCS Applications," CTWatch Quarterly, Volume 2, Number 4B, November 2006 B. http://www.ctwatch.org/quarterly/articles/2006/11/symbolic-performance-modeling-of-hpcs-applications/
