Efficient AutoComplete algorithm?

This is something I've been wondering about for a while. In the past, I have wanted to load a table of string values into the autocomplete custom list for a textbox. This often works great, but is in issue if I'm iterating through 10000+ datarows to build the list. I've tried building the list asynchronously when the app first runs, then adding the collection to the autocomplete list when a control loads, but with so many entries even this lags. So I've been thinking about how to implement an efficient autocomplete algorithm on my own.

Assume that I have an object that provides a pre-loaded generic collection of strings (list(of string), etc.). How could I design a function, taking an input of type char[ ] or string, and search through the list to find the first entry with a matching substring at the beginning

I suppose I could do something like:

for each s as string in listAutoComplete

if s.StartsWith(query) then return s

next

But my thoughts are that, given a list/sorted list of the size I mentioned, this method might cause a bit of a lag.

The idea I have in mind is to implement this "list" as a sort of a tree. The tree would have top-level child nodes for each letter or number at index 0 of the strings loaded from the datatable. Each of these nodes would have children for the character at index 1, and so on.

For example, given the set of strings {Alligator,Along,Astronaut} the tree would be built as follows: The root node would have a child node "A"; the node "A" would have two children, "l" and "s"; the node "l" would have children "l" and "o", while the node "s" would have the child "t" (or it might have no children, if no other strings start with "As"). Then the AutoComplete method would search the top-level children and see if a node was present with the same starting character as the input string. If the node is not found, return nothing...if it is found, go to the children, etc.

The main class would keep track of state. For instance, when someone started typing in a textbox utilizing the Autocomplete, a "session" would begin. When the search function would run, it would keep track of the last node, so that it could go straight to checking the children to find a match, rather than going through the whole thing again.

I know that building the tree might be a bit time-consuming, so I could build that asynchronously when the app loads. But, if my thinking is correct, the query function should run very quickly. Then I would not be slowing myself down by trying to circumvent the WinForms AutoComplete, and I wouldn't have to suffer a lag when a form/control loads and the autocomplete list is being filled.

So, does anyone have any thoughts on this Am I reinventing the wheel here If there's a way to do this that already exists, someone please let me know! If not, does my logic seem valid for building this thing

Thanks,

Ben



Answer this question

Efficient AutoComplete algorithm?

  • David Len.

    Why not simply use a windows autocomplete. It sounds as though you are trying to reivent the wheel.

    I'd probably be using a combobox with autocomplete to allow be to populate the listitems which it would be using.

    I'm one for the easy life, if I can make something work acceptably with minimal effort then thats the way I'll go, using in built system functionality rather than reiventing your own means I have less maintenance overhead as a result, because there is much less of my code.


  • crescens2k

    I found out how to do what I was looking for. I built an AutoCompleteStringCollection on a background thread. When the collection was built, I fired an event. When the event was handled, I invoked a delegate that set the AutoCompleteCustomSource property to the collection I had built. So I didn't have to add item by item, and didn't have to develop an algorithm of my own.
  • Efficient AutoComplete algorithm?