Some problems and solutions…
Problem 0: Wildcard searches don’t work
- This was fixed in 1.12! You can search for “releas*” and match both “release” and “releasing” etc.
Problem 1: Minimum length and stopwords
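- People don’t like it when their searches can’t turn up their favorite acronyms and such: MySQL’s fulltext index skips words shorter than ft_min_word_len (4 characters by default for MyISAM) and anything on its stopword list.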
- You can tweak the MySQL configuration… server-wide… if you have enough permissions on the server…
- We can hack a transformation like we do for Unicode: append x00 or such to small words to force them to be indexed.
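A rough sketch of the padding hack, in Python for illustration only. The function name, the literal "x00" suffix and the optional stopword set are assumptions, not existing code; the same transformation would have to be applied both to the text being indexed and to the user's query terms.

```python
import re

MIN_WORD_LEN = 4   # assumed: MySQL's default ft_min_word_len for MyISAM
PAD = "x00"        # suffix taken from the note above; any unlikely run of word characters would do

_WORD = re.compile(r"\w+")

def pad_short_words(text, min_len=MIN_WORD_LEN, stopwords=frozenset(), pad=PAD):
    """Append a suffix to short words (and listed stopwords) so the indexer keeps them."""
    def fix(match):
        word = match.group(0)
        if len(word) < min_len or word.lower() in stopwords:
            return word + pad
        return word
    return _WORD.sub(fix, text)
```

For example, pad_short_words("the UN and NATO") gives "thex00 UNx00 andx00 NATO": every padded word is now at least four characters long and no longer matches a stopword, while "NATO" is left alone.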
Problem 2: The table crashes sometimes
- People often get mystified when the searchindex table is marked crashed.
- Catch the error: try a REPAIR TABLE transparently, and display a friendlier error if that fails.
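The catch-and-repair idea could look roughly like this. It is a sketch in Python with a hypothetical run_query callable and hypothetical exception classes standing in for the real database layer; the error codes 1194 and 1195 are believed to be MySQL's "table is marked as crashed" errors, but check them against your server.

```python
class DatabaseError(Exception):
    """Hypothetical stand-in for the database layer's error type."""
    def __init__(self, code, message):
        super().__init__(message)
        self.code = code

class SearchUnavailable(Exception):
    """Friendly error to show the user instead of a raw SQL failure."""

# Assumed MySQL error codes for a crashed table (verify for your version).
CRASHED_CODES = {1194, 1195}

def search_with_repair(run_query, sql, table="searchindex"):
    """Run a search query; on a crashed-table error, repair once and retry."""
    try:
        return run_query(sql)
    except DatabaseError as err:
        if err.code not in CRASHED_CODES:
            raise  # unrelated failure: do not hide it
    try:
        # REPAIR TABLE reports its status as a result set, so the real check
        # here is whether the retried query succeeds.
        run_query(f"REPAIR TABLE {table}")
        return run_query(sql)
    except DatabaseError as err:
        raise SearchUnavailable(
            "Search is temporarily unavailable. Please try again later."
        ) from err
```

Only one repair attempt is made per request, so a table that cannot be fixed produces the friendly error rather than a retry loop.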
Problem 3: Separate title and text search results are ugly and hard to manage
- People are used to Google-style searches where you just get one set of results which takes both title and body text into account.
- Merge the title into the text index and return one set of results only.
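A toy illustration of the merged index, in Python. The function names are made up, and a plain substring test stands in for the real fulltext match; the point is only that title and body go into one indexed string and the search returns one list.

```python
def build_index_text(title, body):
    """Combine title and body into the single string that gets indexed."""
    return f"{title}\n{body}"

def search(pages, term):
    """pages: iterable of (title, body) pairs. Returns one list of matching titles."""
    term = term.lower()
    return [
        title
        for title, body in pages
        if term in build_index_text(title, body).lower()
    ]
```

A page now turns up whether the term appears in its title or in its body, and it appears only once.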
Problem 4: Needs to join to ‘page’ table
- The search does joins to the ‘page’ table to do namespace & redirect filtering and to return the original page title for result display. These joins can cause ugly slow locks, mixing up the InnoDB world with the MyISAM world.
- Denormalize: add fields for namespace, original title, and redirect status to ‘searchindex’ table.
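A sketch of what the denormalized table might look like, using Python's sqlite3 module purely as a stand-in for MySQL. The three extra column names (si_namespace, si_orig_title, si_is_redirect) are hypothetical, and LIKE stands in for the real fulltext MATCH; what matters is that filtering and title display need no join to ‘page’.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE searchindex (
        si_page        INTEGER PRIMARY KEY,
        si_title       TEXT,
        si_text        TEXT,
        si_namespace   INTEGER,  -- hypothetical: copied from the page table
        si_orig_title  TEXT,     -- hypothetical: original title for display
        si_is_redirect INTEGER   -- hypothetical: 1 if the page is a redirect
    )
""")
conn.executemany(
    "INSERT INTO searchindex VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "release notes", "changes in this release", 0, "Release_notes", 0),
        (2, "releases", "redirect to release notes", 0, "Releases", 1),
        (3, "release talk", "discussion about the release", 1, "Release_notes", 0),
    ],
)

def search(term, namespaces=(0,), include_redirects=False):
    """Filter by namespace and redirect status using searchindex alone."""
    marks = ",".join("?" for _ in namespaces)
    sql = (
        "SELECT si_page, si_orig_title FROM searchindex "
        f"WHERE si_text LIKE ? AND si_namespace IN ({marks})"
    )
    params = [f"%{term}%", *namespaces]
    if not include_redirects:
        sql += " AND si_is_redirect = 0"
    sql += " ORDER BY si_page"
    return conn.execute(sql, params).fetchall()
```

The cost is that the extra fields must be kept in sync whenever a page is moved, turned into a redirect, or changed back into a regular page.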