Tutorial: Dynamic search with htmx, hyperscript and ProcessWire
Using ProcessWire, you can easily create a dynamic search with very little code. This search can’t compete with engines such as Elasticsearch or Solr, of course. However, it is suitable for most “showcase” sites. Here’s how we did it on Spiria’s site using the small htmx library and its companion, hyperscript.
The goal
You can try out the search for yourself just above this article.
The recipe
- The htmx and hyperscript libraries (the latter is optional).
- A textarea-type field added to the page templates that we want to index.
- Code for indexing the existing content, in the file ready.php.
- A search controller, which we named api.php. This controller will also become a page with the api template.
- A form placed in the pages that require the search.
Content indexing
Before we can program, we need to index the content to which we want to apply our search. In my proof of concept, I have developed two strategies. This is probably overkill since I am not sure of the increase in speed.
- Indexing for a single term search.
- Indexing for a multiple-term search.
To do this, we need to introduce two fields in each template we want to be indexed.
- The search_text field, which will contain only one occurrence of each word on a page.
- The search_text_long field, which will preserve all sentences without HTML tags.
Here is how we place a hook in the ready.php file:
<?php namespace ProcessWire;

// Rebuild the two search fields each time a page is about to be saved.
pages()->addHookAfter("saveReady", function (HookEvent $event) {
    $p = $event->arguments[0];
    switch ($p->template->name) {
        case "blog_article":
            $french = languages()->get('fr');
            $english = languages()->get('default');
            // Body and summary are indexed together, per language.
            $txt_en = $p->page_content->getLanguageValue($english) . ' ' . $p->blog_summary->getLanguageValue($english);
            $txt_fr = $p->page_content->getLanguageValue($french) . ' ' . $p->blog_summary->getLanguageValue($french);
            $title_en = $p->title->getLanguageValue($english);
            $title_fr = $p->title->getLanguageValue($french);
            // stripText() returns [unique words, full cleaned text].
            $resultEn = stripText($txt_en, $title_en);
            $resultFr = stripText($txt_fr, $title_fr);
            $p->setLanguageValue($english, "search_text", $resultEn[0]);
            $p->setLanguageValue($english, "search_text_long", $resultEn[1]);
            $p->setLanguageValue($french, "search_text", $resultFr[0]);
            $p->setLanguageValue($french, "search_text_long", $resultFr[1]);
            break;
    }
});
And:
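function stripText($t, $s)
{
    $resultText = [];
    $t = strip_tags($t);
    $t .= " " . $s;
    // Line breaks and non-breaking spaces become plain spaces so that words stay separated.
    // (Assumption: the last entry of the original list was a non-breaking space;
    // removing plain spaces would leave explode() with a single token.)
    $t = str_replace(["\n", "&nbsp;", "\u{00A0}"], " ", $t);
    // Punctuation and French elisions are removed outright.
    $t = str_replace([",", "“", "”", "'", "?", "!", ":", "«", "»", ".", "l’", "d’"], "", $t);
    //$t = preg_replace('/\?|\[\[.*\]\]|“|”|«|»|\.|!|\ |l’|d’|s’/','',$t);
    $arrayText = explode(" ", $t);
    foreach ($arrayText as $item) {
        // Keep each word once, and only if it is longer than three bytes.
        if (strlen(trim($item)) > 3 && !in_array($item, $resultText)) {
            $resultText[] = $item;
        }
    }
    // [0] unique words for single-term search, [1] full text for multiple-term search
    return [implode(" ", $resultText), $t];
}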
If you have the ListerPro module, it’s easy to batch-save all the pages to be indexed; then any new page you create will be indexed.
The stripText() function scrubs the text to our specifications. Note that, in my example, I make the distinction between French and English. This little algorithm can certainly be improved! I have noted a shorter way to clean up the text, though at the expense of ease of comprehension.
$t = preg_replace('/\?|\[\[.*\]\]|“|”|«|»|\.|!|\ |l’|d’|s’/','',$t);
As I mentioned before, it’s probably unnecessary to create two search fields. The most important thing would be to optimize the text as much as possible, since so many short words serve no purpose. The current code restricts us to words longer than three characters, which is tricky in a computing context such as our site, where words like C#, C++ and PHP compete with the, for, not, etc. That said, this optimization is perhaps superfluous when the search covers a single type of content and a limited number of pages.
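One possible way around the three-character limit, offered here as a sketch and not as the code used on the site, is to filter on a stop-word list instead of word length. The function name indexableWords() and the stop-word list are hypothetical, and the sketch assumes the mbstring extension is available.

```php
<?php
// Hypothetical helper: keeps short technical terms such as "C#" or "PHP"
// by rejecting known stop words instead of rejecting short words.
function indexableWords(string $text, array $stopWords): array
{
    // Split on whitespace only, so that "C#" and "C++" survive intact.
    $words = preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    $seen = [];
    $kept = [];
    foreach ($words as $word) {
        // Strip surrounding punctuation; "#" and "+" are deliberately not in the list.
        $word = trim($word, ".,;:!?\"'()");
        // Compare case-insensitively so "The" and "the" count as the same word.
        $key = mb_strtolower($word);
        if ($key === '' || in_array($key, $stopWords, true) || isset($seen[$key])) {
            continue;
        }
        $seen[$key] = true;
        $kept[] = $word;
    }
    return $kept;
}

// Usage: the stop-word list would need one set per language.
$words = indexableWords(
    "The C# and C++ code for the PHP site, not the rest.",
    ['the', 'for', 'not', 'and']
);
```

The trade-off is that a stop-word list has to be maintained for each language, whereas the length test needs no upkeep.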
So now let’s look at the process and the search code.
The structure
This graphic is a classic and needs no explanation. The htmx library makes a simple Ajax call.
The form
1. The form has a get method that sends us back to a conventional search page when the user presses the enter key.
2. A hidden field with a secret key generated on the fly enhances security.
3. The third field is the input field involved in the dynamic search. It carries the htmx attributes. The first one, hx-post, indicates how data is sent to the API: a post in this case. htmx handles events on any DOM element, so we could, for example, have several calls on different elements of a form.
4. The second line indicates where the API response will be sent, in this case div#searchResult below the form.
5. The hx-trigger attribute describes when the request is dispatched to the API: when the user releases a key, after a delay of 200 ms. A new keystroke within that delay restarts the countdown, so fast typing does not fire one request per key.
6. The hx-indicator attribute is optional. It signals to the user that something is underway. In our example, the #indexsearch image (point 9) is displayed. htmx handles this automatically.
7. The _="on …" attribute comes from the hyperscript syntax. It adds a class to the #screenWindow division.
8. We can add other parameters to the search using the hx-vals attribute. In our simplified example, we send the search language.
9. This is an optional image. htmx controls its appearance.
10. The last attribute is hyperscript again. It removes the contents of the search when we click outside this area.
11. Finally, this is coupled with the #screenWindow division’s behavior. Note how simple the syntax is.
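As a rough reconstruction of the points above, the markup could look something like the following, wrapped here in a small PHP function so that the values are escaped. This is a sketch under assumptions, not the site's actual form: the function renderSearchForm(), the active class, the image path and the placement of the two hyperscript attributes are all hypothetical. The field names CSRFTokenBlog, search and lang, and the ids #searchResult, #indexsearch, #screenWindow and #found, are the ones used elsewhere in this article.

```php
<?php
// Hypothetical helper that returns the search form as an HTML string.
function renderSearchForm(string $searchUrl, string $apiUrl, string $token, string $lang): string
{
    // Escape everything that ends up inside an HTML attribute.
    $searchUrl = htmlspecialchars($searchUrl, ENT_QUOTES, 'UTF-8');
    $apiUrl = htmlspecialchars($apiUrl, ENT_QUOTES, 'UTF-8');
    $token = htmlspecialchars($token, ENT_QUOTES, 'UTF-8');
    // hx-vals expects a JSON object; the browser decodes the entities before htmx reads it.
    $vals = htmlspecialchars(json_encode(['lang' => $lang]), ENT_QUOTES, 'UTF-8');

    return <<<HTML
<form method="get" action="$searchUrl">
  <input type="hidden" name="CSRFTokenBlog" value="$token">
  <input type="search" name="search" autocomplete="off"
         hx-post="$apiUrl"
         hx-target="#searchResult"
         hx-trigger="keyup delay:200ms"
         hx-indicator="#indexsearch"
         _="on keyup add .active to #screenWindow"
         hx-vals="$vals">
  <img id="indexsearch" class="htmx-indicator" src="/site/templates/img/loader.svg" alt="">
</form>
<div id="searchResult" _="on click from elsewhere remove #found"></div>
<div id="screenWindow" _="on click remove .active from me"></div>
HTML;
}

// Usage with placeholder URLs and token.
$html = renderSearchForm('/search/', '/api/', 'abc', 'en');
```

Because the input sits inside the form and the request is a post, htmx sends the other form fields along with it, which is how the hidden token reaches the API.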
This example clearly shows that no JavaScript is called, except for the htmx and hyperscript libraries. It is worth visiting these two libraries’ websites to understand their methodology and potential.
The Search API
The API resides in a normal ProcessWire page. Although it is published, it remains "hidden" from CMS searches. Several requests to the CMS are gathered in this type of page where requests can be answered and the correct functions called.
<?php namespace ProcessWire;

// Token stored in the session by the page that renders the search form.
$secretsearch = session()->get('secretToken');
$request = input()->post();
$lang = sanitizer()->text($request["lang"]);

if (isset($request['CSRFTokenBlog'])) {
    // hash_equals() needs two strings: an expired session (no token) counts as a failure.
    if (is_string($secretsearch) && hash_equals($secretsearch, $request['CSRFTokenBlog'])) {
        if (!empty($request["search"])) {
            echo page()->querySite(sanitizer()->text($request["search"]), $lang);
        }
    } else {
        echo __("A problem occurred. We are sorry for the inconvenience.");
    }
}
exit;
In this case:
- We extract the secret token for the session, which is created in the search-form page.
- We then process everything that is in the post query. Remember that this is a simplified example.
- We compare the token with the one received in the query. If all goes well, we run the search query. Our example uses a class residing in site/classes/ApiPage.php; it can therefore be called directly with page(). Any other strategy is valid.
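The token handling described above can be sketched in plain PHP as follows. The two function names are hypothetical; in ProcessWire, the generated value would be stored in the session under the secretToken key on the form page and read back by the controller.

```php
<?php
// Hypothetical helper: creates the value placed in the hidden form field.
function makeSearchToken(): string
{
    // 32 random bytes, hex-encoded: unpredictable and safe to print in HTML.
    return bin2hex(random_bytes(32));
}

// Hypothetical helper: compares the session token with the posted one.
function searchTokenIsValid($expected, $received): bool
{
    // hash_equals() only accepts strings; a missing session token or a
    // malformed POST value is rejected instead of raising a TypeError.
    if (!is_string($expected) || $expected === '' || !is_string($received)) {
        return false;
    }
    // Constant-time comparison, so timing does not leak how many characters matched.
    return hash_equals($expected, $received);
}

// Usage
$token = makeSearchToken();
$ok = searchTokenIsValid($token, $token);
$expired = searchTokenIsValid(null, $token);
```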
The following code represents the core of the process:
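<?php namespace ProcessWire;

// site/classes/ApiPage.php: custom page class for pages using the api template
class ApiPage extends Page
{
    public function querySite($q, $l)
    {
        $this->search = "";
        $this->lang = $l == 'en' ? 'default' : 'fr';
        user()->setLanguage($this->lang);
        $whatQuery = explode(" ", $q);
        $this->count = count($whatQuery);
        // selectorValue() escapes or quotes the term so that it cannot break out of the selector
        $value = sanitizer()->selectorValue($q);
        if ($this->count > 1) {
            // Several words: match any of them, partially, in the full text
            $this->search = 'template=blog_article,has_parent!=1099,search_text_long~|*=' . $value . ',sort=-created';
        } elseif (strlen($q) > 1) {
            // Single word: look it up in the list of unique words
            $this->search = 'template=blog_article,has_parent!=1099,search_text*=' . $value . ',sort=-created';
        }
        if ($this->search !== "") {
            $this->result = pages()->find($this->search);
            return $this->formatResult();
        }
        return "";
    }

    protected function formatResult()
    {
        $html = '<ul id="found">';
        if (count($this->result) > 0) {
            foreach ($this->result as $result) {
                $html .= '<li><a href="' . $result->url . '">' . $result->title . '</a></li>';
            }
        } else {
            $html .= '<li>' . __('Nothing found') . '</li>';
        }
        // Only the list is closed: the response is inserted into the existing #searchResult div
        $html .= '</ul>';
        return $html;
    }
}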
The formatResult() function is simple to understand, and this is where the ul#found list appears, which gets deleted by the hyperscript line of the form.
_="on click from elsewhere remove #found"
No extra CSS is needed to show or hide the result in the current code. Nothing is visible at first because the #searchResult div is empty. When the search result fills it, everything becomes visible, as the CSS targets the ul#found list and not its parent.
Conclusion
The purpose of this article was to experiment with htmx and hyperscript. I have only scratched the surface of these libraries, which are still under development. The search as described can be improved and sometimes shows its limitations. There are so many possible combination strategies that advanced search options should eventually be offered. This could be the subject of another article.
The haiku placed at the end of the introduction page of htmx is very fitting:
javascript fatigue:
longing for a hypertext
already in hand
Finally, SearchEngine is an excellent search module for ProcessWire, which coexists very well with the code described here.