author | Silvan Jegen <s.jegen@gmail.com> | 2016-09-16 21:21:34 +0200 |
---|---|---|
committer | Silvan Jegen <s.jegen@gmail.com> | 2016-09-16 21:21:34 +0200 |
commit | 4eb814f9253639a43d0eaa1897535ba1e1bf67df | |
tree | 68d1cb990930e0abbbc52680f6f7d0936fa0fff8 | |
parent | 20849860c5572fa4fda86d26a5ad0a6fb760a3b8 | |
Add link to README
-rw-r--r-- | README.md | 11 |
1 files changed, 7 insertions, 4 deletions
@@ -36,11 +36,14 @@ Run the benchmark
 -----------------
 
 To run the benchmark you need the test input which is a subset of all
-the Open Access Pubmed Central full text XML files. The subset used can
-be found in the 'xmldata/subset.txt' file. The input consists of 10'000
-small XML files that have to be copied into their subdirectories in the
-'xmldata' directory.
+the Open Access Pubmed Central full text XML files[0]. The exact subset
+used can be found in the 'xmldata/subset.txt' file. The input consists of
+10'000 small XML files that have to be copied into their subdirectories
+in the 'xmldata' directory.
 
 If you have located and copied all the input files into 'xmldata/' you
 can execute the "runbenchmarks.sh" script to run the benchmark.
+
+[0] I used a subet of ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_bulk/non_comm_use.A-B.xml.tar.gz
+    (warning: the file is about 1.2GB in size)