Diffstat (limited to 'doc')
-rw-r--r-- | doc/bugs/Search_results_should_point_to_dir__44___not_index.html__44___when_use__95__dirs_is_enabled.mdwn | 2 |
-rw-r--r-- | doc/features.mdwn | 4 |
-rw-r--r-- | doc/ikiwiki.setup | 4 |
-rw-r--r-- | doc/plugins/search.mdwn | 20 |
-rw-r--r-- | doc/plugins/search/discussion.mdwn | 2 |
-rw-r--r-- | doc/todo/different_search_engine.mdwn | 26 |
-rw-r--r-- | doc/todo/search_terms.mdwn | 5 |
-rw-r--r-- | doc/wikitemplates.mdwn | 5 |
8 files changed, 37 insertions, 31 deletions
diff --git a/doc/bugs/Search_results_should_point_to_dir__44___not_index.html__44___when_use__95__dirs_is_enabled.mdwn b/doc/bugs/Search_results_should_point_to_dir__44___not_index.html__44___when_use__95__dirs_is_enabled.mdwn
index 45a8f0abd..5f36e21df 100644
--- a/doc/bugs/Search_results_should_point_to_dir__44___not_index.html__44___when_use__95__dirs_is_enabled.mdwn
+++ b/doc/bugs/Search_results_should_point_to_dir__44___not_index.html__44___when_use__95__dirs_is_enabled.mdwn
@@ -11,3 +11,5 @@ point to `foo/bar/` instead.
 > This bug affects the [[plugins/amazon_s3]] plugin -- when using that
 > plugin plus the search plugin, you need to enable `amazon_s3_dupindex`.
 > So this definitly should be fixed. --[[Joey]]
+
+> [[done]], the new xapian search uses nice urls
diff --git a/doc/features.mdwn b/doc/features.mdwn
index 1d762bed4..df963ab4f 100644
--- a/doc/features.mdwn
+++ b/doc/features.mdwn
@@ -158,8 +158,8 @@ Well, sorta. Rather than implementing YA history browser, it can link to
 
 ### Full text search
 
-ikiwiki can use the [[HyperEstraier]] search engine to add powerful
-full text search capabilities to your wiki.
+ikiwiki can use the xapian search engine to add powerful
+full text [[plugins/search]] capabilities to your wiki.
 
 ### [[w3mmode]]
 
diff --git a/doc/ikiwiki.setup b/doc/ikiwiki.setup
index db806a8c4..03d04176d 100644
--- a/doc/ikiwiki.setup
+++ b/doc/ikiwiki.setup
@@ -156,9 +156,9 @@ use IkiWiki::Setup::Standard {
 	# base page.
 	#tagbase => "tag",
 
-	# For use with the search plugin if your estseek.cgi is located
+	# For use with the search plugin if the omega cgi is located
 	# somewhere else.
-	#estseek => "/usr/lib/estraier/estseek.cgi",
+	#omega_cgi => "/usr/lib/cgi-bin/omega/omega",
 
 	# For use with the openid plugin, to give an url to a page users
 	# can use to signup for an OpenID.
diff --git a/doc/plugins/search.mdwn b/doc/plugins/search.mdwn
index 7b32714f4..67e3b85ef 100644
--- a/doc/plugins/search.mdwn
+++ b/doc/plugins/search.mdwn
@@ -1,12 +1,18 @@
 [[template id=plugin name=search author="[[Joey]]"]]
 [[tag type/useful]]
 
-This plugin is included in ikiwiki, but is not enabled by default. It adds
-full text search to ikiwiki, using the [[HyperEstraier]] engine.
+This plugin adds full text search to ikiwiki, using the
+[xapian](http://xapian.org/) engine, its
+[omega](http://xapian.org/docs/omega/overview.html) frontend,
+and the [[cpan Search::Xapian]] perl module. (The [[cpan HTML::Scrubber]]
+perl module will also be used, if available.)
 
-It's possible to configure HyperEstraier via one of ikiwiki's
-[[templates|wikitemplates]], but for most users, no configuration should be
-needed aside from enabling the plugin.
+Ikiwiki will handle indexing new and changed page contents. Note that it
+indexes page contents before they are preprocessed and converted to html,
+as this tends to produce less noisy search results. Also, since it only
+indexes page contents, files copied by the [[rawhtml]] plugin will not be
+indexed, nor will other types of data files.
 
-This plugin has a configuration option. To change the path to estseek.cgi,
-set `--estseek=/path/to/estseek.cgi`
+There is one setting you may need to use in the config file. `omega_cgi`
+should point to the location of the omega cgi program. The default location
+is `/usr/lib/cgi-bin/omega/omega`.
diff --git a/doc/plugins/search/discussion.mdwn b/doc/plugins/search/discussion.mdwn
index 494d0a38a..6b5714c42 100644
--- a/doc/plugins/search/discussion.mdwn
+++ b/doc/plugins/search/discussion.mdwn
@@ -42,3 +42,5 @@ Now I did a `rm -rf ~wiki/wiki/.ikiwiki/hyperestraier` and re-ran
 `--rebuild`ing once more, I'm back to the previous error message.
 
 --[[tschwinge]]
+
+I guess this is fixed now that it uses xapian. :-) --[[Joey]]
diff --git a/doc/todo/different_search_engine.mdwn b/doc/todo/different_search_engine.mdwn
index 81ca47547..788473ec5 100644
--- a/doc/todo/different_search_engine.mdwn
+++ b/doc/todo/different_search_engine.mdwn
@@ -1,3 +1,5 @@
+[[done]], using xapian-omega! --[[Joey]]
+
 After using it for a while, my feeling is that [[hyperestraier]], as used in
 the [[plugins/search]] plugin, is not robust enough for ikiwiki. It doesn't
 upgrade well, and it has a habit of sig-11 on certain input from time to
@@ -31,35 +33,25 @@ Possibilities:
   written on the page would be indexed. Not text generated by directives,
   pulled in by inlining, etc. There's something to be said for that. And
   something to be said against it. It would also get markdown formatted
-  content, mostly, though it would still need to strip html.
+  content, mostly, though it would still need to strip html, and also
+  probably strip preprocessor directives too.
 * `sanitize` - Would get the htmlized content, so would need to strip html.
-  Preprocessor directive output would be indexed.
+  Preprocessor directive output would be indexed. Doesn't get a destpage
+  parameter, making optimisation hard.
 * `format` - Would get the entire html page, including the page template.
   Probably not a good choice as indexing the same template for each page
   is unnecessary.
 
-Currently, a filter hook seems the best option.
-
 The hook would remove any html from the content, and index it.
-It would need to add the same document data that omindex would, as well as
-adding the same special terms (see
-http://xapian.org/docs/omega/overview.html "Boolean terms").
-
-(Note that the U term is a bit tricky because I'll have to replicate
-ominxes's hash_string() to hash terms > 240 chars.)
+It would need to add the same document data that omindex would.
 
 The indexer (and deleter) will need a way to figure out the ids in xapian
 of the documents to delete. One way is storing the id of each page in the
 ikiwiki index.
 
 The other way would be adding a special term to the xapian db that can be
-used with replace_document_by_term/delete_document_by_term. omindex uses
-U<url> as a term, and I guess I could just use that, and then map page
-names to urls when deleting a page ... only real problem being the
-hashing; a collision would be bad.
-
-At the moment, storing xapian ids in the ikiwiki index file seems like the
-best approach.
+used with replace_document_by_term/delete_document_by_term.
+Hmm, let's use a term named "P<pagename>".
 
 The hook should try to avoid re-indexing pages that have not changed since
 they were last indexed. One problem is that, if a page with an inline is
diff --git a/doc/todo/search_terms.mdwn b/doc/todo/search_terms.mdwn
new file mode 100644
index 000000000..4e3f3aa6b
--- /dev/null
+++ b/doc/todo/search_terms.mdwn
@@ -0,0 +1,5 @@
+The [[plugin/search]] plugin could use xapian terms to allow some special
+searches. For example, "title:foo", or "link:somepage", or "author:foo", or
+"copyright:GPL".
+
+Reference: <http://xapian.org/docs/omega/termprefixes.html>
diff --git a/doc/wikitemplates.mdwn b/doc/wikitemplates.mdwn
index f095cb035..b03fc10a1 100644
--- a/doc/wikitemplates.mdwn
+++ b/doc/wikitemplates.mdwn
@@ -21,15 +21,14 @@ located in /usr/share/ikiwiki/templates by default.
 
 * `inlinepage.tmpl` - Used for adding a page inline in a blog page.
 * `archivepage.tmpl` - Used for listing a page in a blog archive page.
-* `estseek.conf` - Not a html template, this is actually a template for
-  a config file for the [[HyperEstraier]] search engine. If you like you
-  can read the [[HyperEstraier]] docs and configure it using this.
 * `blogpost.tmpl` - Used for a form to add a post to a blog (and a rss/atom links)
 * `feedlink.tmpl` - Used to add rss/atom links if blogpost.tmpl is not used.
 * `aggregatepost.tmpl` - Used by the [[plugins/aggregate]] plugin to create
   a page for a post.
 * `searchform.tmpl` - Used by the [[plugins/search]] plugin to add a search
   form to wiki pages.
+* `searchquery.tmpl` - This is an omega template, used by the
+  [[plugins/search]] plugin.
 
 The [[plugins/pagetemplate]] plugin can allow individual pages to use a
 different template than `page.tmpl`.
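
For reference, a setup file that turns on the new search might look roughly like the fragment below. This is only a sketch based on the `doc/ikiwiki.setup` hunk in this commit: it assumes the plugin is enabled through the usual `add_plugins` list, and it leaves out the other settings a real setup file needs (wikiname, srcdir, destdir, and so on). The `omega_cgi` line should only matter when omega is not installed at the default location.

```perl
use IkiWiki::Setup::Standard {
	# ... other settings (wikiname, srcdir, destdir, ...) omitted ...

	# Enable the xapian-based search plugin (assumed to be off by default).
	add_plugins => [qw{search}],

	# Path to the omega cgi; shown here with the documented default value.
	omega_cgi => "/usr/lib/cgi-bin/omega/omega",
};
```

Since the indexing backend changed from HyperEstraier to xapian, a full `--rebuild` after editing the setup file is probably the safest way to populate the new index.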