From 4eb814f9253639a43d0eaa1897535ba1e1bf67df Mon Sep 17 00:00:00 2001
From: Silvan Jegen
Date: Fri, 16 Sep 2016 21:21:34 +0200
Subject: Add link to README

---
 README.md | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 57a3884..d475efc 100644
--- a/README.md
+++ b/README.md
@@ -36,11 +36,14 @@ Run the benchmark
 -----------------
 
 To run the benchmark you need the test input which is a subset of all
-the Open Access Pubmed Central full text XML files. The subset used can
-be found in the 'xmldata/subset.txt' file. The input consists of 10'000
-small XML files that have to be copied into their subdirectories in the
-'xmldata' directory.
+the Open Access Pubmed Central full text XML files[0]. The exact subset
+used can be found in the 'xmldata/subset.txt' file. The input consists of
+10'000 small XML files that have to be copied into their subdirectories
+in the 'xmldata' directory.
 
 If you have located and copied all the input files into 'xmldata/' you
 can execute the "runbenchmarks.sh" script to run the benchmark.
 
+
+[0] I used a subset of ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_bulk/non_comm_use.A-B.xml.tar.gz
+    (warning: the file is about 1.2GB in size)
-- 
cgit v1.2.3