summaryrefslogtreecommitdiff
path: root/software/dhrystone
diff options
context:
space:
mode:
authorPalmer Dabbelt <palmer@dabbelt.com>2017-11-17 12:14:52 -0800
committerPalmer Dabbelt <palmer@dabbelt.com>2017-11-17 15:24:01 -0800
commit401b704e73599a36bfdc8c778dab85f94d74ed1d (patch)
tree5d4d4851f1a9ff03864ab87de11bacc05804ddbd /software/dhrystone
parentb53187a0434fe5e8dce288f55cfca36b292552e4 (diff)
Speed up Dhrystone on the HiFive1
There's a handful of things that went wrong here: * The read-only data sections were mapped to flash, which is very slow. I just put them in the data segment, so they end up in the scratchpad. This is about a 10x hit, so it's really important. * The toolchain was an old version, which didn't have a fast memcpy implementation on 32-bit systems. This is about a 2x hit. * Some compiler flags were incorrect, including * -Os instead of -O3 * Missing -mexplicit-relocs * Missing -DNOENUM * Missing -falign-functions=4 I haven't checked how much those hurt With this, I get $ make software BOARD=freedom-e300-hifive1 PROGRAM=dhrystone LINK_TARGET=dhrystone $ make upload BOARD=freedom-e300-hifive1 PROGRAM=dhrystone LINK_TARGET=dhrystone Execution starts, 10000000 runs through Dhrystone Execution ends Final values of the variables used in the benchmark: Int_Glob: 5 should be: 5 Bool_Glob: 1 should be: 1 Ch_1_Glob: A should be: A Ch_2_Glob: B should be: B Arr_1_Glob[8]: 7 should be: 7 Arr_2_Glob[8][7]: 10000010 should be: Number_Of_Runs + 10 Ptr_Glob-> Ptr_Comp: -2147470264 should be: (implementation-dependent) Discr: 0 should be: 0 Enum_Comp: 2 should be: 2 Int_Comp: 17 should be: 17 Str_Comp: DHRYSTONE PROGRAM, SOME STRING should be: DHRYSTONE PROGRAM, SOME STRING Next_Ptr_Glob-> Ptr_Comp: -2147470264 should be: (implementation-dependent), same as above Discr: 0 should be: 0 Enum_Comp: 1 should be: 1 Int_Comp: 18 should be: 18 Str_Comp: DHRYSTONE PROGRAM, SOME STRING should be: DHRYSTONE PROGRAM, SOME STRING Int_1_Loc: 5 should be: 5 Int_2_Loc: 13 should be: 13 Int_3_Loc: 7 should be: 7 Enum_Loc: 1 should be: 1 Str_1_Loc: DHRYSTONE PROGRAM, 1'ST STRING should be: DHRYSTONE PROGRAM, 1'ST STRING Str_2_Loc: DHRYSTONE PROGRAM, 2'ND STRING should be: DHRYSTONE PROGRAM, 2'ND STRING Microseconds for one run through Dhrystone: 1.3 Dhrystones per Second: 714285.6 which is 1.55 DMIPS/MHz at 262 MHz. It's still a bit slower than our current stuff, but I don't remember what was actually in the HiFive1 so I'm not sure what we should be getting. I verified the clock is accurate with a stopwatch. I haven't bothered to go look through the binary, but I think we're about 10 cycles off so it should be managable.
Diffstat (limited to 'software/dhrystone')
-rw-r--r--software/dhrystone/Makefile4
1 files changed, 2 insertions, 2 deletions
diff --git a/software/dhrystone/Makefile b/software/dhrystone/Makefile
index d401720..4602653 100644
--- a/software/dhrystone/Makefile
+++ b/software/dhrystone/Makefile
@@ -5,10 +5,10 @@ C_SRCS := dhry_stubs.c dhry_printf.c
HEADERS := dhry.h
DHRY_SRCS := dhry_1.c dhry_2.c
-DHRY_CFLAGS := -O2 -DTIME -fno-inline -fno-builtin-printf -Wno-implicit -mcmodel=medany -march=$(RISCV_ARCH) -mabi=$(RISCV_ABI)
+DHRY_CFLAGS := -O3 -DTIME -fno-inline -fno-builtin-printf -Wno-implicit -mcmodel=medany -march=$(RISCV_ARCH) -mabi=$(RISCV_ABI)
XLEN ?= 32
-CFLAGS := -Os -fno-common -mcmodel=medany -march=$(RISCV_ARCH) -mabi=$(RISCV_ABI)
+CFLAGS := -O3 -fno-common -mcmodel=medany -march=$(RISCV_ARCH) -mabi=$(RISCV_ABI) -mexplicit-relocs -DNOENUM -falign-functions=4
LDFLAGS := -Wl,--wrap=scanf -Wl,--wrap=printf -march=$(RISCV_ARCH) -mabi=$(RISCV_ABI) -mcmodel=medany
DHRY_OBJS := $(patsubst %.c,%.o,$(DHRY_SRCS))