| author | joey <joey@0fa5a96a-9a0e-0410-b3b2-a0fd24251071> | 2006-10-28 03:27:10 +0000 |
|---|---|---|
| committer | joey <joey@0fa5a96a-9a0e-0410-b3b2-a0fd24251071> | 2006-10-28 03:27:10 +0000 |
| commit | 49bf877701d89d613dcf5c2d85bd08876a636dba (patch) | |
| tree | d28ec4df6277b2dcf8dbcd7aac5dc63215d9618a /doc/todo | |
| parent | 05fe79b4872547997b4e54cf6965743b7fbf6e57 (diff) | |
* Add a separate pass to find page links, and only render each page once,
instead of over and over. This is up to 8 times faster than before!
(This could have introduced some subtle bugs, so it needs to be tested
extensively.)
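To make the commit message concrete, here is a minimal sketch of the two-pass refresh it describes: a first pass that only finds page links, then a single rendering pass per page. This is not ikiwiki's actual code; the sample pages, the `[[link]]` matching, and the `scan()`/`render()` stubs are illustrative stand-ins.

```perl
#!/usr/bin/perl
# Sketch only: a separate link-finding pass, then one render per page.
use strict;
use warnings;

my %content = (
    index => "See [[todo]] and [[news]].",
    todo  => "Back to [[index]].",
    news  => "Nothing here yet.",
);
my %links;    # page => list of pages it links to

# Pass 1: scan each changed page for links only; nothing is written to disk.
sub scan {
    my $page = shift;
    $links{$page} = [ $content{$page} =~ /\[\[(\w+)\]\]/g ];
}

# Pass 2: with all link info already up to date, each page is rendered once.
sub render {
    my $page = shift;
    print "rendering $page (links to: @{$links{$page}})\n";
}

my @changed = sort keys %content;
scan($_)   for @changed;    # link-finding pass
render($_) for @changed;    # single rendering pass
```

The point of the split is that the cheap scan pass gathers everything that can affect other pages, so the expensive rendering work no longer has to be repeated once per pass.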
Diffstat (limited to 'doc/todo')
| -rw-r--r-- | doc/todo/optimisations.mdwn | 36 |
1 file changed, 16 insertions, 20 deletions
```diff
diff --git a/doc/todo/optimisations.mdwn b/doc/todo/optimisations.mdwn
index 4e8118756..13a270b8f 100644
--- a/doc/todo/optimisations.mdwn
+++ b/doc/todo/optimisations.mdwn
@@ -1,25 +1,21 @@
-* Render each changed page only once. Currently pages are rendered up to 4
-  times in worst case (8 times if there's an rss feed).
-
-  The issue is that rendering a page is used to gather info like the links
-  on the page (and other stuff) that can effect rendering other pages. So it
-  needs a multi-pass system. But rendering the whole page in each pass is
-  rather obscene.
-
-  It would be better to have the first pass be a data gathering pass. Such
-  a pass would still need to load and parse the page contents etc, but
-  wouldn't need to generate html or write anything to disk.
-
-  One problem with this idea is that it could turn into 2x the work in
-  cases where ikiwiki currently efficiently renders a page just once. And
-  caching between the passes to avoid that wouldn't do good things to the
-  memory footprint.
-
-  Might be best to just do a partial first pass, getting eg, the page links
-  up-to-date, and then multiple, but generally fewer, rendering passes.
-
 * Don't render blog archive pages unless a page is added/removed. Just
   changing a page doesn't affect the archives as they show only the title.
 
 * Look at splitting up CGI.pm. But note that too much splitting can slow
   perl down.
+
+* The backlinks code turns out to scale badly to wikis with thousands of
+  pages. The code is O(N^2)! It's called for each page, and it loops
+  through all the pages to find backlinks.
+
+  Need to find a way to calculate and cache all the backlinks in one pass,
+  which could be done in at worst O(N), and possibly less (if they're
+  stored in the index, it could be constant time). But to do this, there
+  would need to be a way to invalidate or update the cache in these
+  situations:
+
+  - A page is added. Note that this can change a backlink to point to
+    the new page instead of the page it pointed to before.
+  - A page is deleted. This can also change backlinks that pointed to that
+    page.
+  - A page is modified. Links added/removed.
```
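The new todo item added by this diff asks for computing all backlinks in one pass instead of scanning every page once per page. A hedged sketch of that idea follows; it is not ikiwiki's implementation, and the hand-made `%links` hash stands in for the per-page link lists a real scan pass would produce.

```perl
#!/usr/bin/perl
# Sketch only: invert the link map once, rather than looping over all
# pages separately for each page's backlinks (the O(N^2) behaviour).
use strict;
use warnings;

my %links = (
    index => [qw(todo news)],
    todo  => [qw(index)],
    news  => [],
);

# One pass over every (page, target) pair builds the complete backlink
# map, so looking up any page's backlinks afterwards is a hash lookup.
my %backlinks;
while (my ($page, $targets) = each %links) {
    $backlinks{$_}{$page} = 1 for @$targets;
}

for my $page (sort keys %links) {
    my @back = sort keys %{ $backlinks{$page} || {} };
    print "$page <- @back\n";
}
```

As the todo item notes, the harder part is keeping such a cache valid when pages are added, deleted, or modified; the simplest safe option is to rebuild the inverted map from the up-to-date link lists on each refresh, which stays linear in the total number of links.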