########################################################################
This is the DARPA/DOE HPC Challenge Benchmark version 1.4.1 October 2003
Produced by Jack Dongarra and Piotr Luszczek
Innovative Computing Laboratory
University of Tennessee Knoxville and Oak Ridge National Laboratory

See the source files for authors of specific codes.
Compiled on Mar 30 2011 at 07:49:03
Current time (1335447555) is Thu Apr 26 09:39:15 2012

Hostname: 'snacky'
########################################################################
================================================================================
HPLinpack 2.0 -- High-Performance Linpack benchmark -- September 10, 2008
Written by A. Petitet and R. Clint Whaley, Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      : 2560
NB     : 80
PMAP   : Column-major process mapping
P      : 1
Q      : 1
PFACT  : Right
NBMIN  : 4
NDIV   : 2
RFACT  : Crout
BCAST  : 1ringM
DEPTH  : 1
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0

Begin of MPIRandomAccess section.
Running on 1 processors (PowerofTwo)
Total Main table size = 2^22 = 4194304 words
PE Main table size = 2^22 = 4194304 words/PE
Default number of updates (RECOMMENDED) = 16777216
CPU time used = 3.828239 seconds
Real time used = 10.628453 seconds
0.001578519 Billion(10^9) Updates per second [GUP/s]
0.001578519 Billion(10^9) Updates/PE per second [GUP/s]
Verification: CPU time used = 0.412026 seconds
Verification: Real time used = 0.412872 seconds
Found 0 errors in 4194304 locations (passed).
Current time (1335447566) is Thu Apr 26 09:39:26 2012

End of MPIRandomAccess section.
Begin of StarRandomAccess section.
Main table size = 2^22 = 4194304 words
Number of updates = 16777216
CPU time used = 0.348022 seconds
Real time used = 0.351086 seconds
0.047786641 Billion(10^9) Updates per second [GUP/s]
Found 0 errors in 4194304 locations (passed).
Node(s) with error 0
Minimum GUP/s 0.047787
Average GUP/s 0.047787
Maximum GUP/s 0.047787
Current time (1335447567) is Thu Apr 26 09:39:27 2012

End of StarRandomAccess section.
Begin of SingleRandomAccess section.
Main table size = 2^22 = 4194304 words
Number of updates = 16777216
CPU time used = 0.348022 seconds
Real time used = 0.351018 seconds
0.047795859 Billion(10^9) Updates per second [GUP/s]
Found 0 errors in 4194304 locations (passed).
Node(s) with error 0
Node selected 0
Single GUP/s 0.047796
Current time (1335447568) is Thu Apr 26 09:39:28 2012

End of SingleRandomAccess section.
Begin of MPIRandomAccess_LCG section.
Running on 1 processors (PowerofTwo)
Total Main table size = 2^22 = 4194304 words
PE Main table size = 2^22 = 4194304 words/PE
Default number of updates (RECOMMENDED) = 16777216
CPU time used = 3.812238 seconds
Real time used = 10.705810 seconds
0.001567113 Billion(10^9) Updates per second [GUP/s]
0.001567113 Billion(10^9) Updates/PE per second [GUP/s]
Verification: CPU time used = 0.420026 seconds
Verification: Real time used = 0.420304 seconds
Found 0 errors in 4194304 locations (passed).
Current time (1335447579) is Thu Apr 26 09:39:39 2012

End of MPIRandomAccess_LCG section.
Begin of StarRandomAccess_LCG section.
Main table size = 2^22 = 4194304 words
Number of updates = 16777216
CPU time used = 0.352022 seconds
Real time used = 0.350507 seconds
0.047865596 Billion(10^9) Updates per second [GUP/s]
Found 0 errors in 4194304 locations (passed).
Node(s) with error 0
Minimum GUP/s 0.047866
Average GUP/s 0.047866
Maximum GUP/s 0.047866
Current time (1335447579) is Thu Apr 26 09:39:39 2012

End of StarRandomAccess_LCG section.
Begin of SingleRandomAccess_LCG section.
Main table size = 2^22 = 4194304 words
Number of updates = 16777216
CPU time used = 0.348022 seconds
Real time used = 0.350579 seconds
0.047855733 Billion(10^9) Updates per second [GUP/s]
Found 0 errors in 4194304 locations (passed).
Node(s) with error 0
Node selected 0
Single GUP/s 0.047856
Current time (1335447580) is Thu Apr 26 09:39:40 2012

End of SingleRandomAccess_LCG section.
Begin of PTRANS section.
M: 1280
N: 1280
MB: 80
NB: 80
P: 1
Q: 1
TIME     M     N  MB  NB   P   Q     TIME  CHECK     GB/s RESID
---- ----- ----- --- --- --- --- -------- ------ -------- -----
WALL  1280  1280  80  80   1   1     0.04 PASSED    0.334  0.00
CPU   1280  1280  80  80   1   1     0.04 PASSED    0.328  0.00
WALL  1280  1280  80  80   1   1     0.04 PASSED    0.334  0.00
CPU   1280  1280  80  80   1   1     0.04 PASSED    0.328  0.00
WALL  1280  1280  80  80   1   1     0.04 PASSED    0.334  0.00
CPU   1280  1280  80  80   1   1     0.04 PASSED    0.328  0.00
WALL  1280  1280  80  80   1   1     0.04 PASSED    0.334  0.00
CPU   1280  1280  80  80   1   1     0.04 PASSED    0.328  0.00
WALL  1280  1280  80  80   1   1     0.04 PASSED    0.334  0.00
CPU   1280  1280  80  80   1   1     0.04 PASSED    0.328  0.00

Finished 5 tests, with the following results:
    5 tests completed and passed residual checks.
    0 tests completed and failed residual checks.
    0 tests skipped because of illegal input values.

END OF TESTS.
Current time (1335447582) is Thu Apr 26 09:39:42 2012

End of PTRANS section.
Begin of StarDGEMM section.
Scaled residual: 0.0142747
Node(s) with error 0
Minimum Gflop/s 16.231961
Average Gflop/s 16.231961
Maximum Gflop/s 16.231961
Current time (1335447582) is Thu Apr 26 09:39:42 2012

End of StarDGEMM section.
Begin of SingleDGEMM section.
Scaled residual: 0.0142747
Node(s) with error 0
Node selected 0
Single DGEMM Gflop/s 16.402220
Current time (1335447583) is Thu Apr 26 09:39:43 2012

End of SingleDGEMM section.
Begin of StarSTREAM section.
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2184533, Offset = 0
Total memory required = 0.0488 GiB.
Each test is run 10 times, but only the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 3420 microseconds.
   (= 3420 clock ticks)
Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the precision of your system timer.
-------------------------------------------------------------
Function    Rate (GB/s)   Avg time   Min time   Max time
Copy:            6.5924     0.0053     0.0053     0.0053
Scale:           6.5101     0.0054     0.0054     0.0054
Add:             6.8993     0.0076     0.0076     0.0076
Triad:           7.2038     0.0073     0.0073     0.0073
-------------------------------------------------------------
Results Comparison:
    Expected : 2519423615566406144.000000 503884723113281280.000000 671846297484375040.000000
    Observed : 2519423615622480384.000000 503884723094127232.000000 671846297499976832.000000
Solution Validates
-------------------------------------------------------------
Node(s) with error 0
Minimum Copy GB/s 6.592388
Average Copy GB/s 6.592388
Maximum Copy GB/s 6.592388
Minimum Scale GB/s 6.510126
Average Scale GB/s 6.510126
Maximum Scale GB/s 6.510126
Minimum Add GB/s 6.899328
Average Add GB/s 6.899328
Maximum Add GB/s 6.899328
Minimum Triad GB/s 7.203770
Average Triad GB/s 7.203770
Maximum Triad GB/s 7.203770
Current time (1335447583) is Thu Apr 26 09:39:43 2012

End of StarSTREAM section.
Begin of SingleSTREAM section.
-------------------------------------------------------------
This system uses 8 bytes per DOUBLE PRECISION word.
-------------------------------------------------------------
Array size = 2184533, Offset = 0
Total memory required = 0.0488 GiB.
Each test is run 10 times, but only the *best* time for each is used.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 3397 microseconds.
   (= 3397 clock ticks)
Increase the size of the arrays if this shows that you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the precision of your system timer.
-------------------------------------------------------------
Function    Rate (GB/s)   Avg time   Min time   Max time
Copy:            6.5924     0.0060     0.0053     0.0054
Scale:           6.5691     0.0059     0.0053     0.0054
Add:             7.2798     0.0081     0.0072     0.0076
Triad:           7.3276     0.0080     0.0072     0.0073
-------------------------------------------------------------
Results Comparison:
    Expected : 2519423615566406144.000000 503884723113281280.000000 671846297484375040.000000
    Observed : 2519423615622480384.000000 503884723094127232.000000 671846297499976832.000000
Solution Validates
-------------------------------------------------------------
Node(s) with error 0
Node selected 0
Single STREAM Copy GB/s 6.592388
Single STREAM Scale GB/s 6.569052
Single STREAM Add GB/s 7.279845
Single STREAM Triad GB/s 7.327634
Current time (1335447583) is Thu Apr 26 09:39:43 2012

End of SingleSTREAM section.
Begin of MPIFFT section.
Number of nodes: 1
Vector size:         524288
Generation time:      0.029
Tuning:               0.031
Computing:            0.072
Inverse FFT:          0.073
max(|x-x0|):      1.391e-15
Gflop/s:              0.694
Current time (1335447584) is Thu Apr 26 09:39:44 2012

End of MPIFFT section.
Begin of StarFFT section.
Vector size:        1048576
Generation time:      0.069
Tuning:               0.000
Computing:            0.075
Inverse FFT:          0.079
max(|x-x0|):      1.687e-15
Node(s) with error 0
Minimum Gflop/s 1.395997
Average Gflop/s 1.395997
Maximum Gflop/s 1.395997
Current time (1335447584) is Thu Apr 26 09:39:44 2012

End of StarFFT section.
Begin of SingleFFT section.
Vector size:        1048576
Generation time:      0.076
Tuning:               0.000
Computing:            0.080
Inverse FFT:          0.078
max(|x-x0|):      1.687e-15
Node(s) with error 0
Node selected 0
Single FFT Gflop/s 1.314023
Current time (1335447584) is Thu Apr 26 09:39:44 2012

End of SingleFFT section.
Begin of LatencyBandwidth section.
Current time (1335447584) is Thu Apr 26 09:39:44 2012

End of LatencyBandwidth section.
Begin of HPL section.
================================================================================
HPLinpack 2.0 -- High-Performance Linpack benchmark -- September 10, 2008
Written by A. Petitet and R. Clint Whaley, Innovative Computing Laboratory, UTK
Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK
Modified by Julien Langou, University of Colorado Denver
================================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      : 2560
NB     : 80
PMAP   : Column-major process mapping
P      : 1
Q      : 1
PFACT  : Right
NBMIN  : 4
NDIV   : 2
RFACT  : Crout
BCAST  : 1ringM
DEPTH  : 1
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 8 double precision words

--------------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual check will be computed:
      ||Ax-b||_oo / ( eps * ( || x ||_oo * || A ||_oo + || b ||_oo ) * N )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0

================================================================================
T/V                N    NB     P     Q               Time                 Gflops
--------------------------------------------------------------------------------
WC11C2R4        2560    80     1     1               0.84              1.326e+01
--------------------------------------------------------------------------------
||Ax-b||_oo/(eps*(||A||_oo*||x||_oo+||b||_oo)*N)= 0.0044692 ...... PASSED
================================================================================

Finished 1 tests with the following results:
    1 tests completed and passed residual checks,
    0 tests completed and failed residual checks,
    0 tests skipped because of illegal input values.
--------------------------------------------------------------------------------

End of Tests.
================================================================================
Current time (1335447586) is Thu Apr 26 09:39:46 2012

End of HPL section.
Begin of Summary section.
VersionMajor=1
VersionMinor=4
VersionMicro=1
VersionRelease=f
LANG=C
Success=1
sizeof_char=1
sizeof_short=2
sizeof_int=4
sizeof_long=8
sizeof_void_ptr=8
sizeof_size_t=8
sizeof_float=4
sizeof_double=8
sizeof_s64Int=8
sizeof_u64Int=8
sizeof_struct_double_double=16
CommWorldProcs=1
MPI_Wtick=1.000000e-06
HPL_Tflops=0.0132625
HPL_time=0.84408
HPL_eps=1.11022e-16
HPL_RnormI=2.33196e-12
HPL_Anorm1=666.101
HPL_AnormI=663.835
HPL_Xnorm1=1494.04
HPL_XnormI=2.76479
HPL_BnormI=0.499975
HPL_N=2560
HPL_NB=80
HPL_nprow=1
HPL_npcol=1
HPL_depth=1
HPL_nbdiv=2
HPL_nbmin=4
HPL_cpfact=R
HPL_crfact=C
HPL_ctop=1
HPL_order=C
HPL_dMACH_EPS=1.110223e-16
HPL_dMACH_SFMIN=2.225074e-308
HPL_dMACH_BASE=2.000000e+00
HPL_dMACH_PREC=2.220446e-16
HPL_dMACH_MLEN=5.300000e+01
HPL_dMACH_RND=1.000000e+00
HPL_dMACH_EMIN=-1.021000e+03
HPL_dMACH_RMIN=2.225074e-308
HPL_dMACH_EMAX=1.024000e+03
HPL_dMACH_RMAX=1.797693e+308
HPL_sMACH_EPS=5.960464e-08
HPL_sMACH_SFMIN=1.175494e-38
HPL_sMACH_BASE=2.000000e+00
HPL_sMACH_PREC=1.192093e-07
HPL_sMACH_MLEN=2.400000e+01
HPL_sMACH_RND=1.000000e+00
HPL_sMACH_EMIN=-1.250000e+02
HPL_sMACH_RMIN=1.175494e-38
HPL_sMACH_EMAX=1.280000e+02
HPL_sMACH_RMAX=3.402823e+38
dweps=1.110223e-16
sweps=5.960464e-08
HPLMaxProcs=1
HPLMinProcs=1
DGEMM_N=1477
StarDGEMM_Gflops=16.232
SingleDGEMM_Gflops=16.4022
PTRANS_GBs=0.333592
PTRANS_time=0.0392911
PTRANS_residual=0
PTRANS_n=1280
PTRANS_nb=80
PTRANS_nprow=1
PTRANS_npcol=1
MPIRandomAccess_LCG_N=4194304
MPIRandomAccess_LCG_time=10.7058
MPIRandomAccess_LCG_CheckTime=0.420304
MPIRandomAccess_LCG_Errors=0
MPIRandomAccess_LCG_ErrorsFraction=0
MPIRandomAccess_LCG_ExeUpdates=16777216
MPIRandomAccess_LCG_GUPs=0.00156711
MPIRandomAccess_LCG_TimeBound=-1
MPIRandomAccess_LCG_Algorithm=0
MPIRandomAccess_N=4194304
MPIRandomAccess_time=10.6285
MPIRandomAccess_CheckTime=0.412872
MPIRandomAccess_Errors=0
MPIRandomAccess_ErrorsFraction=0
MPIRandomAccess_ExeUpdates=16777216
MPIRandomAccess_GUPs=0.00157852
MPIRandomAccess_TimeBound=-1
MPIRandomAccess_Algorithm=0
RandomAccess_LCG_N=4194304
StarRandomAccess_LCG_GUPs=0.0478656
SingleRandomAccess_LCG_GUPs=0.0478557
RandomAccess_N=4194304
StarRandomAccess_GUPs=0.0477866
SingleRandomAccess_GUPs=0.0477959
STREAM_VectorSize=2184533
STREAM_Threads=1
StarSTREAM_Copy=6.59239
StarSTREAM_Scale=6.51013
StarSTREAM_Add=6.89933
StarSTREAM_Triad=7.20377
SingleSTREAM_Copy=6.59239
SingleSTREAM_Scale=6.56905
SingleSTREAM_Add=7.27985
SingleSTREAM_Triad=7.32763
FFT_N=1048576
StarFFT_Gflops=1.396
SingleFFT_Gflops=1.31402
MPIFFT_N=524288
MPIFFT_Gflops=0.693938
MPIFFT_maxErr=1.39111e-15
MPIFFT_Procs=1
MaxPingPongLatency_usec=-1
RandomlyOrderedRingLatency_usec=-1
MinPingPongBandwidth_GBytes=-1
NaturallyOrderedRingBandwidth_GBytes=-1
RandomlyOrderedRingBandwidth_GBytes=-1
MinPingPongLatency_usec=-1
AvgPingPongLatency_usec=-1
MaxPingPongBandwidth_GBytes=-1
AvgPingPongBandwidth_GBytes=-1
NaturallyOrderedRingLatency_usec=-1
FFTEnblk=16
FFTEnp=8
FFTEl2size=1048576
M_OPENMP=-1
omp_get_num_threads=0
omp_get_max_threads=0
omp_get_num_procs=0
MemProc=64
MemSpec=-1
MemVal=-1
MPIFFT_time0=9.53674e-07
MPIFFT_time1=0.011394
MPIFFT_time2=0.013288
MPIFFT_time3=0.00471282
MPIFFT_time4=0.0285861
MPIFFT_time5=0.00942302
MPIFFT_time6=9.53674e-07
CPS_HPCC_FFT_235=0
CPS_HPCC_FFTW_ESTIMATE=0
CPS_HPCC_MEMALLCTR=0
CPS_HPL_USE_GETPROCESSTIMES=0
CPS_RA_SANDIA_NOPT=0
CPS_RA_SANDIA_OPT2=0
CPS_USING_FFTW=0
End of Summary section.
########################################################################
End of HPC Challenge tests.
Current time (1335447586) is Thu Apr 26 09:39:46 2012

########################################################################
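
The scaled residual formula printed in the HPL header can be checked by hand against the norms reported in the Summary section. The following is a minimal Python sketch, not part of the benchmark itself. The dictionary values are copied from the Summary keys above; the function names (`hpl_scaled_residual`, `hpl_gflops`, `gups`) are hypothetical, and the operation count 2/3*N^3 + 3/2*N^2 used for the Gflop/s figure is an assumption about how HPL counts flops, since this log does not state it.

```python
# Values copied from the Summary section of this log.
summary = {
    "HPL_N": 2560,
    "HPL_time": 0.84408,
    "HPL_eps": 1.11022e-16,
    "HPL_RnormI": 2.33196e-12,   # ||Ax-b||_oo
    "HPL_AnormI": 663.835,       # ||A||_oo
    "HPL_XnormI": 2.76479,       # ||x||_oo
    "HPL_BnormI": 0.499975,      # ||b||_oo
}


def hpl_scaled_residual(s):
    """||Ax-b||_oo / (eps * (||x||_oo * ||A||_oo + ||b||_oo) * N)."""
    denom = s["HPL_eps"] * (s["HPL_XnormI"] * s["HPL_AnormI"] + s["HPL_BnormI"]) * s["HPL_N"]
    return s["HPL_RnormI"] / denom


def hpl_gflops(s):
    """Gflop/s for the solve, ASSUMING a flop count of 2/3*N^3 + 3/2*N^2."""
    n = s["HPL_N"]
    flops = (2.0 / 3.0) * n**3 + 1.5 * n**2
    return flops / s["HPL_time"] / 1e9


def gups(updates, seconds):
    """Billions (10^9) of table updates per second."""
    return updates / seconds / 1e9


if __name__ == "__main__":
    print("scaled residual:", hpl_scaled_residual(summary))
    print("HPL Gflop/s:    ", hpl_gflops(summary))
    # Update count and real time taken from the MPIRandomAccess section.
    print("MPIRandomAccess GUP/s:", gups(16777216, 10.628453))
```

With the values from this run, the three results should land close to the logged figures (0.0044692, 1.326e+01 Gflop/s, and 0.001578519 GUP/s); small differences are expected because the Summary values are rounded to about six significant digits.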