-

@ mleku
2025-05-11 22:20:41
i've now drafted the full text indexer. it uses unicode rules to split the content up into words, eliminates the duplicates, and then feeds the result to a background thread that updates the index for each word, adding the event's database sequence number to the list of events that contain that word
it's too late in the night for me to add the search function tho, but i will continue this work tomorrow. it's much easier for me to do this than the bunker signer at this point, because i'm new to working with the GUI i'm using for that, and i had to refresh my memory about how to encrypt keys and all the bits associated with that.
not gonna even test the indexer tonight, i'm poopered, but i will test it tomorrow afternoon after my fiat mine work, to make sure it isn't bombing out with a panic or anything, and then to work on the actual search function
the search function is a bit complicated... first you have to take the search term, break it down into words, look up the index record for each word and get its list of database sequence numbers, and then, out of that, assemble a progressive list: first the events that come up in all of the lists, then the ones that come up in all but one, and one less and one less until there are no search terms left
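one way to read the "all of them, then one less" step is to count how many search terms each event matches and group the events by that count. this is only a sketch under that assumption. `TierByTermCount` is a hypothetical name, and the per-term lists are passed in directly instead of being fetched from the database:

```go
package main

import "sort"

// TierByTermCount groups event sequence numbers by how many search terms
// they match. postings holds one list of sequence numbers per search term.
// tiers[0] holds events matching every term, tiers[1] those matching all
// but one, and so on down to events matching a single term.
func TierByTermCount(postings [][]uint64) [][]uint64 {
	counts := make(map[uint64]int)
	for _, list := range postings {
		// Guard against a sequence number repeated inside one list so a
		// single term is never counted twice for the same event.
		seen := make(map[uint64]struct{}, len(list))
		for _, s := range list {
			if _, dup := seen[s]; dup {
				continue
			}
			seen[s] = struct{}{}
			counts[s]++
		}
	}
	tiers := make([][]uint64, len(postings))
	for s, c := range counts {
		i := len(postings) - c
		tiers[i] = append(tiers[i], s)
	}
	// Sort each tier so the output does not depend on map iteration order.
	for _, t := range tiers {
		sort.Slice(t, func(a, b int) bool { return t[a] < t[b] })
	}
	return tiers
}
```

for the lists `{1,2,3}`, `{2,3}` and `{3,4}` this puts event 3 in the first tier, event 2 in the second, and events 1 and 4 in the last.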
then you go through those lists, scan the content field of each event for the locations of the text matches, and find the ones that have the longest sequences of matches in the same order as the search request text. that is what you sort them by: you return the events in descending order of exact matching, and all the events that have several of the terms but not in order after that
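the ranking step could be sketched like this, assuming "longest sequence of matches in the same order" means the longest run of consecutive words in the content that also appears as consecutive words in the search text. that reading, and the names `splitWords`, `LongestRun` and `RankByRun`, are assumptions for illustration, not the planned implementation:

```go
package main

import (
	"sort"
	"strings"
	"unicode"
)

// splitWords breaks s into lowercase words in their original order,
// keeping duplicates because positions matter for ranking.
func splitWords(s string) []string {
	fs := strings.FieldsFunc(s, func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsNumber(r)
	})
	for i := range fs {
		fs[i] = strings.ToLower(fs[i])
	}
	return fs
}

// LongestRun returns the length of the longest run of consecutive words in
// content that equals a run of consecutive words in query. It is the
// classic longest-common-substring table, applied to words instead of bytes.
func LongestRun(query, content string) int {
	q, c := splitWords(query), splitWords(content)
	best := 0
	prev := make([]int, len(c)+1)
	for i := 1; i <= len(q); i++ {
		cur := make([]int, len(c)+1)
		for j := 1; j <= len(c); j++ {
			if q[i-1] == c[j-1] {
				cur[j] = prev[j-1] + 1
				if cur[j] > best {
					best = cur[j]
				}
			}
		}
		prev = cur
	}
	return best
}

// RankByRun returns a copy of contents sorted by descending LongestRun.
// The sort is stable, so equally ranked entries keep their input order.
func RankByRun(query string, contents []string) []string {
	out := append([]string(nil), contents...)
	sort.SliceStable(out, func(a, b int) bool {
		return LongestRun(query, out[a]) > LongestRun(query, out[b])
	})
	return out
}
```

in a real search the score would be computed once per event and combined with the term-count tiers, rather than recomputed inside the comparison as this short version does.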
and then the task will be complete, and #realy will have a full text search capability with sort by relevance (relevance meaning how closely the result event matches the search terms).
making the index is the easy part. finding the matches and ranking them will be quite a long function i expect. the indexer is only 117 lines of code.