Thursday 20 July 2023

staticSearch and Me: getting started

In another post, I explain the background to my work in making digital scholarly editions in relation to the Endings project, and how this led me to staticSearch. In this post, I describe my use of staticSearch, in the hope that it might help others who, like me, want to include a search engine in their online resource. I am using a MacBook Pro running macOS Ventura 13.4 as I write this in July 2023.

The documentation for staticSearch is at https://endings.uvic.ca/staticSearch/docs/index.html. The first step is to download the source code at https://github.com/projectEndings/staticSearch/releases/. This should arrive on your computer as a zip file named staticSearch-1.4.4.zip, or similar. Just double-click to unpack the zip file into a folder named staticSearch-1.4.4. Move that folder out of the Downloads folder to somewhere convenient: it is simplest to put it in your Applications folder.

Before you can do anything more, you need Apache Ant. This is a tool designed to build complex software projects from source. Ant will read a set of instructions: get this file! rebuild it this way! save it to this file! now get another software process to use that file to convert other files into something else! and then create new objects (new files, new tools, new libraries) from those files! etc. etc. Section 7.7 of the staticSearch documentation says laconically:

    Note: you will need Java and Apache Ant installed, as well as ant-contrib.

You should have Java already, in an up-to-date distribution, as part of your computer. But you may need to get Apache Ant. You want the ready-built binary distribution from https://ant.apache.org/bindownload.cgi (the page at https://ant.apache.org/srcdownload.cgi offers the source code, which you would have to build yourself). Look for the latest version: in July 2023, this was 1.10.13. This requires Java 8 or later, which you should already have. (The older 1.9.16 release, also listed, needs only Java 5.) Download the zip file, double-click to unpack it to a folder named apache-ant-1.10.13 (or similar). As before, move that folder into your Applications folder.
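
Before going further, it may be worth confirming from the terminal that Java actually runs and, once Ant is unpacked, that Ant runs too. The following is only a sketch: check_runs is a made-up helper name for illustration, and the Ant path assumes the folder name and location used in this post.

```shell
#!/bin/sh
# Run a command quietly and report whether it succeeded.
# check_runs is a hypothetical helper, not part of Ant or staticSearch.
# Usage: check_runs LABEL COMMAND [ARGS...]
check_runs() {
    label="$1"
    shift
    if "$@" >/dev/null 2>&1; then
        echo "ok: $label"
    else
        echo "not working: $label"
        return 1
    fi
}

# "|| true" keeps the script going so that both checks are always reported.
check_runs "Java" java -version || true
# Assumed location: Ant unpacked as apache-ant-1.10.13 inside Applications.
check_runs "Ant" /Applications/apache-ant-1.10.13/bin/ant -version || true
```

If either line reports "not working", that is the thing to sort out before trying to run staticSearch.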

You also have to get ant-contrib. This is a little more complex. What you actually need are two Java .jar files, named "cpptasks-1.0b5.jar" and "ant-contrib-1.0.jar". It took me a while to figure this out. The Apache ant-contrib page gives you the source for cpptasks, and someone with more expertise and time than me could (I suppose) compile the source into a Java .jar. But I took the short-cut and found a copy of cpptasks-1.0b5.jar out there on the net (in my case, at https://jar-download.com/artifacts/ant-contrib/cpptasks/1.0b5#google_vignette). I found ant-contrib-1.0.jar at http://www.java2s.com/Code/Jar/a/Downloadantcontrib10jar.htm.

Once you have these .jar files, place them both in the lib directory of your Apache Ant folder.
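
If you prefer the terminal to dragging files around, the copy could be sketched like this. The install_jars function is a hypothetical helper, and the paths in the example call are assumptions (they suppose the .jar files landed in your Downloads folder and that Ant sits in Applications as described above).

```shell
#!/bin/sh
# Copy the given .jar files into a directory, stopping at the first problem.
# install_jars is a hypothetical helper written for this illustration.
# Usage: install_jars ANT_LIB_DIR JAR [JAR...]
install_jars() {
    lib_dir="$1"
    shift
    if [ ! -d "$lib_dir" ]; then
        echo "no such directory: $lib_dir" >&2
        return 1
    fi
    for jar in "$@"; do
        if [ ! -f "$jar" ]; then
            echo "missing file: $jar" >&2
            return 1
        fi
        cp "$jar" "$lib_dir/" || return 1
        echo "copied $(basename "$jar")"
    done
}

# Example call (assumed paths; adjust to where your downloads actually are):
# install_jars /Applications/apache-ant-1.10.13/lib \
#     "$HOME/Downloads/cpptasks-1.0b5.jar" "$HOME/Downloads/ant-contrib-1.0.jar"
```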

You are now ready to test out staticSearch. Here's what you do:

  1. Open the Macintosh Terminal application. You will find this in your Applications/Utilities folder. This is a good old-fashioned command-prompt system, like we all used back in the 80s (remember the 80s? Wham? Freddie Mercury? yes, those).
  2. In the terminal, move into your staticSearch folder. If you have unpacked it into Applications as "staticSearch-1.4.4", you should type "cd /Applications/staticSearch-1.4.4" into the terminal.
  3. Now you are ready to test that all is working. For this you have to run Ant. You do this with the following command at the terminal: "/Applications/apache-ant-1.10.13/bin/./ant" (assuming you have got Ant in a directory named apache-ant-1.10.13 inside Applications). If all is installed correctly you should see a lot of things on the screen and, finally, a triumphant "BUILD SUCCESSFUL" message. (If you are smarter than I am, you might be able to edit the $PATH statement in your terminal profile so that you just need to type "ant" into the terminal, and not "/Applications/apache-ant-1.10.13/bin/./ant". It seems Apple do not want you to edit your terminal profile, and are making this rather difficult: see https://stackoverflow.com/questions/9832770/where-is-the-default-terminal-path-located-on-mac.)
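
For what it is worth, one way the $PATH change might be made is sketched below. It assumes your terminal runs zsh (the default shell on recent versions of macOS), which reads a file named .zshrc in your home folder each time a terminal window opens. The add_path_entry function is a hypothetical helper, not something supplied by Apple or Ant.

```shell
#!/bin/sh
# Append an "export PATH" line for DIR to PROFILE, unless PROFILE already mentions DIR.
# add_path_entry is a hypothetical helper written for this illustration.
# Usage: add_path_entry PROFILE DIR
add_path_entry() {
    profile="$1"
    dir="$2"
    if [ -f "$profile" ] && grep -qF "$dir" "$profile"; then
        echo "already present in $profile"
        return 0
    fi
    # Single quotes keep $PATH literal, so it is expanded when the profile is read.
    printf 'export PATH="$PATH:%s"\n' "$dir" >> "$profile"
    echo "added to $profile"
}

# Example call (assumes zsh and the Ant location used in this post):
# add_path_entry "$HOME/.zshrc" /Applications/apache-ant-1.10.13/bin
```

After running the example call, a newly opened terminal window should accept plain "ant -version". Running it a second time changes nothing, because the helper checks the file first.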

Now, try it with your own HTML. The staticSearch documentation is excellent. I created a folder called "mystuff" inside the staticSearch folder. In this folder I put all my HTML, itself in another folder called "html". I had an index.html file in the root of the mystuff folder, and I had an XML file called "ssconfig.xml" containing the key instructions directing staticSearch to work on my HTML:

<config xmlns="http://hcmc.uvic.ca/ns/staticSearch">
    <params>
        <searchFile>index.html</searchFile>
        <recurse>true</recurse>
    </params>
</config>

I now ran staticSearch on my material with this command: 

/Applications/apache-ant-1.10.13/bin/./ant -DssConfigFile=/Applications/staticSearch-1.4.4/mystuff/ssconfig.xml
(I could also have used just "/Applications/apache-ant-1.10.13/bin/./ant -DssConfigFile=mystuff/ssconfig.xml", as I was already in the staticSearch folder.)
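
Since that command is long and easy to mistype, it could be wrapped in a small function that first checks the config file is really there. This is a sketch under the assumptions of this post: Ant at /Applications/apache-ant-1.10.13, and the terminal already inside the staticSearch folder. The run_static_search name is hypothetical.

```shell
#!/bin/sh
# Location of the Ant launcher; this is the path used in this post.
ANT="/Applications/apache-ant-1.10.13/bin/ant"

# Run the staticSearch build against a config file, after checking the file exists.
# run_static_search is a hypothetical wrapper; call it from inside the staticSearch folder.
# Usage: run_static_search CONFIG_FILE
run_static_search() {
    config="$1"
    if [ ! -f "$config" ]; then
        echo "config file not found: $config" >&2
        return 1
    fi
    "$ANT" -DssConfigFile="$config"
}
```

With this loaded in the terminal, "run_static_search mystuff/ssconfig.xml" does the same as the full command above, and a mistyped path is reported straight away instead of part-way through a build.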

The first time I tried this, it did not work. It turns out that the <params> element needs a whole lot more in it or you get a failed build. <params> needs to contain declarations as follows:

        <phrasalSearch>true</phrasalSearch>
        <wildcardSearch>true</wildcardSearch>
        <createContexts>true</createContexts>
        <resultsPerPage>5</resultsPerPage>
        <minWordLength>2</minWordLength>
        <maxKwicsToHarvest>5</maxKwicsToHarvest>
        <maxKwicsToShow>5</maxKwicsToShow>
        <totalKwicLength>15</totalKwicLength>
        <kwicTruncateString>...</kwicTruncateString>
        <verbose>false</verbose>
        <stopwordsFile>test_stopwords.txt</stopwordsFile>
        <dictionaryFile>english_words.txt</dictionaryFile>
        <indentJSON>true</indentJSON>

It turns out that this issue is part of a wider discussion in the staticSearch community on what needs to be declared in the set-up, and what can be set as defaults. See the discussion in the comments on https://github.com/projectEndings/staticSearch/issues/270, where I first reported my experience, and on https://github.com/projectEndings/staticSearch/issues/195, where the wider discussion takes place.
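
Putting the two pieces together, the complete file might look something like the one this sketch writes out. The element names and values are exactly those quoted in this post; placing the extra declarations after <recurse>, in the order listed, is an assumption, as is the hypothetical write_ssconfig helper name. The post does not say where the stopwords and dictionary files have to live, so that is left as it stands.

```shell
#!/bin/sh
# Write a fuller ssconfig.xml to the given path.
# write_ssconfig is a hypothetical helper; the element order is an assumption.
# Usage: write_ssconfig OUTPUT_FILE
write_ssconfig() {
    cat > "$1" <<'EOF'
<config xmlns="http://hcmc.uvic.ca/ns/staticSearch">
    <params>
        <searchFile>index.html</searchFile>
        <recurse>true</recurse>
        <phrasalSearch>true</phrasalSearch>
        <wildcardSearch>true</wildcardSearch>
        <createContexts>true</createContexts>
        <resultsPerPage>5</resultsPerPage>
        <minWordLength>2</minWordLength>
        <maxKwicsToHarvest>5</maxKwicsToHarvest>
        <maxKwicsToShow>5</maxKwicsToShow>
        <totalKwicLength>15</totalKwicLength>
        <kwicTruncateString>...</kwicTruncateString>
        <verbose>false</verbose>
        <stopwordsFile>test_stopwords.txt</stopwordsFile>
        <dictionaryFile>english_words.txt</dictionaryFile>
        <indentJSON>true</indentJSON>
    </params>
</config>
EOF
}

# Example call (assumed location, matching the folder used in this post):
# write_ssconfig /Applications/staticSearch-1.4.4/mystuff/ssconfig.xml
```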

Now that I had staticSearch running, the next step was to start integrating it into our own HTML. That's the subject of the next post.



