From IGEP - ISEE Wiki
= Dhrystone Benchmark: Rationale for Version 2 and Measurement Rules =
Reinhold P. Weicker<br>Siemens AG, E STE 35<br>Postfach 3240<br>D-8520 Erlangen<br>Germany (West)
<br>
The Dhrystone benchmark program [1] has become a popular benchmark for CPU/compiler performance measurement, in particular in the area of minicomputers, workstations, PCs and microprocessors. It apparently satisfies a need for an easy-to-use integer benchmark; it gives a first performance indication which is more meaningful than MIPS numbers which, in their literal meaning (million instructions per second), cannot be used across different instruction sets (e.g. RISC vs. CISC). With the increasing use of the benchmark, it seems necessary to reconsider the benchmark and to check whether it can still fulfill this function. Version 2 of Dhrystone is the result of such a re-evaluation; it has been made for two reasons:
a) Dhrystone has been published in Ada [1], and versions in Ada, Pascal and C have been distributed by Reinhold Weicker via floppy disk. However, the version that was used most often for benchmarking has been the version made by Rick Richardson by another translation from the Ada version into the C programming language; this has been the version distributed via the UNIX network Usenet [2].
There is an obvious need for a common C version of Dhrystone, since C is at present the most popular system programming language for the class of systems (microcomputers, minicomputers, workstations) where Dhrystone is used most. There should be, as far as possible, only one C version of Dhrystone such that results can be compared without restrictions. In the past, the C versions distributed by Rick Richardson (Version 1.1) and by Reinhold Weicker had small (though not significant) differences.
Together with the new C version, the Ada and Pascal versions have been updated as well.
b) As far as it is possible without changes to the Dhrystone statistics, optimizing compilers should be prevented from removing significant statements. It has turned out in the past that optimizing compilers suppressed code generation for too many statements (by "dead code removal" or "dead variable elimination"). This has led to the danger that benchmarking results obtained by a naive application of Dhrystone - without inspection of the code that was generated - could become meaningless.
The overall policy for version 2 has been that the distribution of statements, operand types and operand locality described in [1] should remain unchanged as much as possible. (Very few changes were necessary; their impact should be negligible.) Also, the order of statements should remain unchanged. Although I am aware of some critical remarks on the benchmark - I agree with several of them - and know some suggestions for improvement, I didn't want to change the benchmark into something different from what has become known as "Dhrystone"; the confusion generated by such a change would probably outweigh the benefits. If I were to write a new benchmark program, I wouldn't give it the name "Dhrystone" since this denotes the program published in [1]. However, I do recognize the need for a larger number of representative programs that can be used as benchmarks; users should always be encouraged to use more than just one benchmark. The new versions (version 2.1 for C, Pascal and Ada) will be distributed as widely as possible. (Version 2.1 differs from version 2.0 distributed via the UNIX Network Usenet in March 1988 only in a few corrections for minor deficiencies found by users of version 2.0.) Readers who want to use the benchmark for their own measurements can obtain a copy in machine-readable form on floppy disk (MS-DOS or XENIX format) from the author.
== 2. Overall Characteristics of Version 2 ==
In general, version 2 follows - in the parts that are significant for performance measurement, i.e. within the measurement loop - the published (Ada) version and the C versions previously distributed. Where the versions distributed by Rick Richardson [2] and Reinhold Weicker have been different, it follows the version distributed by Reinhold Weicker. (However, the differences have been so small that their impact on execution time in all likelihood has been negligible.) The initialization and UNIX instrumentation part - which had been omitted in [1] - follows mostly the ideas of Rick Richardson [2]. However, any changes in the initialization part and in the printing of the result have no impact on performance measurement since they are outside the measurement loop. As a concession to older compilers, names have been made unique within the first 8 characters for the C version.
Contrary to the suggestion in the published paper and its realization in the versions previously distributed, no attempt has been made to subtract the time for the measurement loop overhead. (This calculation has proven difficult to implement in a correct way, and its omission makes the program simpler.) However, since the loop check is now part of the benchmark, this does have an impact - though a very minor one - on the distribution statistics which have been updated for this version.
== 3. Discussion of Individual Changes ==
In this section, all changes are described that affect the measurement loop and that are not just renamings of variables. All remarks refer to the C version; the other language versions have been updated similarly.
In addition to adding the measurement loop and the printout statements, changes have been made at the following places:
The distribution statistics have been changed only by the addition of the measurement loop iteration (1 additional statement, 4 additional local integer operands) and by the change in Proc_4 (one operand changed from local to global). The distribution statistics in the comment headers have been updated accordingly.
== 4. String Operations ==
The string operations (string assignment and string comparison) have not been changed, to keep the program consistent with the original version.
I admit that the string comparison in Dhrystone terminates later (after scanning 20 characters) than most string comparisons in real programs. For consistency with the original benchmark, I didn't change the program despite this weakness.
== 5. Intended Use of Dhrystone ==
When Dhrystone is used, the following "ground rules" apply:
Of course, for experimental purposes, post-linkage optimization, procedure merging and/or compilation in one unit can be done to determine their effects. However, Dhrystone numbers obtained under these conditions should be explicitly marked as such; "normal" Dhrystone results should be understood as results obtained following the ground rules listed above.
In any case, for serious performance evaluation, users are advised to ask for code listings and to check them carefully. In this way, when results for different systems are compared, the reader can get a feeling for how much of the performance difference is due to compiler optimization and how much is due to hardware speed.
== 6. Acknowledgements ==
The C version 2.1 of Dhrystone has been developed in cooperation with Rick Richardson (Tinton Falls, NJ); it incorporates many ideas from the "Version 1.1" distributed previously by him over the UNIX network Usenet. Through his activity with Usenet, Rick Richardson has made a very valuable contribution to the dissemination of the benchmark. I also thank Chaim Benedelac (National Semiconductor), David Ditzel (SUN), Earl Killian and John Mashey (MIPS), Alan Smith and Rafael Saavedra-Barrera (UC at Berkeley) for their help with comments on earlier versions of the benchmark.
== Bibliography ==
[1] Reinhold P. Weicker: Dhrystone: A Synthetic Systems Programming Benchmark. Communications of the ACM 27, 10 (Oct. 1984), 1013-1030<br>
[2] Rick Richardson: Dhrystone 1.1 Benchmark Summary (and Program Text). Informal Distribution via "Usenet", Last Version Known to me: Sept. 21, 1987<br>
[3] Brian W. Kernighan and Dennis M. Ritchie: The C Programming Language. Prentice-Hall, Englewood Cliffs (NJ) 1978
<br>
= IGEP Dhrystone 2.1 MIPS Test =
== Test Software ==
You can download the Dhrystone 2.1 MIPS test from [http://downloads.isee.biz/pub/files/dhrystone-2.1.tar.gz here].

The software is compiled for OMAP / DM processors. The archive contains 2 executables:

*gcc_dry2reg<br>
<u>Tune Parameters:</u> GCCOPTIM= -O<br>
Compiler: Linaro & Ubuntu
<pre>$ arm-linux-gnueabi-gcc -v
Using built-in specs.
COLLECT_GCC=arm-linux-gnueabi-gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabi/4.5.2/lto-wrapper
Target: arm-linux-gnueabi
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.5.2-8ubuntu3' --with-bugurl=file:///usr/share/doc/gcc-4.5/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.5 --enable-shared --enable-multiarch --with-multiarch-defaults=i386-linux-gnu --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/arm-linux-gnueabi/include/c++/4.5.2 --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-gold --enable-ld=default --with-plugin-ld=ld.gold --enable-objc-gc --disable-sjlj-exceptions --with-arch=armv7-a --with-float=softfp --with-fpu=vfpv3-d16 --with-mode=thumb --disable-werror --enable-checking=release --program-prefix=arm-linux-gnueabi- --includedir=/usr/arm-linux-gnueabi/include --build=i686-linux-gnu --host=i686-linux-gnu --target=arm-linux-gnueabi --with-headers=/usr/arm-linux-gnueabi/include --with-libs=/usr/arm-linux-gnueabi/lib
Thread model: posix
gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu3)</pre>

*cc_dry2reg<br>
<u>Tune Parameters:</u><br>
OPTIMIZE= -O4 -march=armv7-a -mtune=cortex-a8 -mfpu=neon -mfloat-abi=softfp -fno-tree-vectorize<br>
Compiler: Linaro & Ubuntu<br>
<pre>$ arm-linux-gnueabi-gcc -v
Using built-in specs.
COLLECT_GCC=arm-linux-gnueabi-gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabi/4.5.2/lto-wrapper
Target: arm-linux-gnueabi
Configured with: ../src/configure -v --with-pkgversion='Ubuntu/Linaro 4.5.2-8ubuntu3' --with-bugurl=file:///usr/share/doc/gcc-4.5/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-4.5 --enable-shared --enable-multiarch --with-multiarch-defaults=i386-linux-gnu --enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --with-gxx-include-dir=/usr/arm-linux-gnueabi/include/c++/4.5.2 --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-plugin --enable-gold --enable-ld=default --with-plugin-ld=ld.gold --enable-objc-gc --disable-sjlj-exceptions --with-arch=armv7-a --with-float=softfp --with-fpu=vfpv3-d16 --with-mode=thumb --disable-werror --enable-checking=release --program-prefix=arm-linux-gnueabi- --includedir=/usr/arm-linux-gnueabi/include --build=i686-linux-gnu --host=i686-linux-gnu --target=arm-linux-gnueabi --with-headers=/usr/arm-linux-gnueabi/include --with-libs=/usr/arm-linux-gnueabi/lib
Thread model: posix
gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu3)</pre>

Calculation References:<br>
*[http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/4160.html ARM Dhrystone reference].
== Test Case 1: IGEPv2 Revision C - DM3730 @ 1 GHz ==
<u>'''Board'''</u>: IGEPv2 Revision C - RC5 - DM3730 @ 1 GHz - 512 MBytes LPDDR RAM + 512 MBytes OneNand Flash<br>
<u>'''Operating System'''</u>: Linux version 2.6.35.13 (mcaro@manel-p) (gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu3) ) #3 Fri Jun 10 19:58:16 CEST 2011<br>
<u>'''Boot Software:'''</u> IGEP X-Loader 2.0.1-2<br>

'''<u>a) Test case: Execution 10000000 loops (gcc_dry2)</u>'''<br>
<u>Result</u><br>
<pre>root@localhost:/tmp# ./gcc_dry2
Dhrystone Benchmark, Version 2.1 (Language: C)
Program compiled without 'register' attribute
Please give the number of runs through the benchmark: 10000000
Execution starts, 10000000 runs through Dhrystone
Execution ends
Final values of the variables used in the benchmark:
Int_Glob: 5
        should be: 5
Bool_Glob: 1
        should be: 1
Ch_1_Glob: A
        should be: A
Ch_2_Glob: B
        should be: B
Arr_1_Glob[8]: 7
        should be: 7
Arr_2_Glob[8][7]: 10000010
        should be: Number_Of_Runs + 10
Ptr_Glob->
  Ptr_Comp: 13295624
        should be: (implementation-dependent)
  Discr: 0
        should be: 0
  Enum_Comp: 2
        should be: 2
  Int_Comp: 17
        should be: 17
  Str_Comp: DHRYSTONE PROGRAM, SOME STRING
        should be: DHRYSTONE PROGRAM, SOME STRING
Next_Ptr_Glob->
  Ptr_Comp: 13295624
        should be: (implementation-dependent), same as above
  Discr: 0
        should be: 0
  Enum_Comp: 1
        should be: 1
  Int_Comp: 18
        should be: 18
  Str_Comp: DHRYSTONE PROGRAM, SOME STRING
        should be: DHRYSTONE PROGRAM, SOME STRING
Int_1_Loc: 5
        should be: 5
Int_2_Loc: 13
        should be: 13
Int_3_Loc: 7
        should be: 7
Enum_Loc: 1
        should be: 1
Str_1_Loc: DHRYSTONE PROGRAM, 1'ST STRING
        should be: DHRYSTONE PROGRAM, 1'ST STRING
Str_2_Loc: DHRYSTONE PROGRAM, 2'ND STRING
        should be: DHRYSTONE PROGRAM, 2'ND STRING

Microseconds for one run through Dhrystone: 0.4
Dhrystones per Second: 2788671.0</pre>
'''''DMIPS: 2788671.0 / 1757 = 1587.17'''''

'''<u>b) Test case: Execution 10000000 loops (cc_dry2)</u>'''<br>
<u>Result</u><br>
<pre>root@localhost:/tmp# ./cc_dry2reg
Dhrystone Benchmark, Version 2.1 (Language: C)
Program compiled with 'register' attribute
Please give the number of runs through the benchmark: 10000000
Execution starts, 10000000 runs through Dhrystone
Execution ends
Final values of the variables used in the benchmark:
Int_Glob: 5
        should be: 5
Bool_Glob: 1
        should be: 1
Ch_1_Glob: A
        should be: A
Ch_2_Glob: B
        should be: B
Arr_1_Glob[8]: 7
        should be: 7
Arr_2_Glob[8][7]: 10000010
        should be: Number_Of_Runs + 10
Ptr_Glob->
  Ptr_Comp: 4169736
        should be: (implementation-dependent)
  Discr: 0
        should be: 0
  Enum_Comp: 2
        should be: 2
  Int_Comp: 17
        should be: 17
  Str_Comp: DHRYSTONE PROGRAM, SOME STRING
        should be: DHRYSTONE PROGRAM, SOME STRING
Next_Ptr_Glob->
  Ptr_Comp: 4169736
        should be: (implementation-dependent), same as above
  Discr: 0
        should be: 0
  Enum_Comp: 1
        should be: 1
  Int_Comp: 18
        should be: 18
  Str_Comp: DHRYSTONE PROGRAM, SOME STRING
        should be: DHRYSTONE PROGRAM, SOME STRING
Int_1_Loc: 5
        should be: 5
Int_2_Loc: 13
        should be: 13
Int_3_Loc: 7
        should be: 7
Enum_Loc: 1
        should be: 1
Str_1_Loc: DHRYSTONE PROGRAM, 1'ST STRING
        should be: DHRYSTONE PROGRAM, 1'ST STRING
Str_2_Loc: DHRYSTONE PROGRAM, 2'ND STRING
        should be: DHRYSTONE PROGRAM, 2'ND STRING