Merge commit 'intrigeri/po' into po

author: Joey Hess <joey@gnu.kitenet.net> 2009-05-19 13:06:35 -0400
committer: Joey Hess <joey@gnu.kitenet.net> 2009-05-19 13:06:35 -0400
commit: 53b1c6f559c1d09fbdbc28c8e4d5090dd455cd26 (patch)
tree: d379bb0acd2dd3e9370c37b27f03989398694977 /doc/tips
parent: 18695056917a2f34a36e5e89df7f01deff9ab640 (diff)
parent: 4558457402a4ab6bc795589a2e400fa66144f76e (diff)
6 files changed, 209 insertions, 2 deletions
diff --git a/doc/tips/Importing_posts_from_Wordpress.mdwn b/doc/tips/Importing_posts_from_Wordpress.mdwn
index 87ef12079..59330caa4 100644
--- a/doc/tips/Importing_posts_from_Wordpress.mdwn
+++ b/doc/tips/Importing_posts_from_Wordpress.mdwn
@@ -2,4 +2,12 @@ Use case: You want to move away from Wordpress to Ikiwiki as your blogging/websi
 
 [This](http://git.chris-lamb.co.uk/?p=ikiwiki-wordpress-import.git) is a simple tool that generates [git-fast-import](http://www.kernel.org/pub/software/scm/git/docs/git-fast-import.html)-compatible data from a WordPress export XML file. It retains creation time of each post, so you can use Ikiwiki's <tt>--getctime</tt> to preserve creation times on checkout. 
 
-WordPress categories are mapped onto Ikiwiki tags. The ability to import comments is planned.
-\ No newline at end of file
+WordPress categories are mapped onto Ikiwiki tags. The ability to import comments is planned.
+
+-----
+
+I include a modified version of this script. This version includes the ability to write \[[!tag foo]] directives, which the original intended, but didn't actually do.
+
+-- [[users/simonraven]]
+
+[[ikiwiki-wordpress-import]]
diff --git a/doc/tips/Importing_posts_from_Wordpress/discussion.mdwn b/doc/tips/Importing_posts_from_Wordpress/discussion.mdwn
index 3b328649e..55e04d9cb 100644
--- a/doc/tips/Importing_posts_from_Wordpress/discussion.mdwn
+++ b/doc/tips/Importing_posts_from_Wordpress/discussion.mdwn
@@ -2,3 +2,43 @@ When I attempt to use this script, I get the following error:
 warning: Not updating refs/heads/master (new tip 26b1787fca04f2f9772b6854843fe99fe06e6088 does not contain fc0ad65d14d88fd27a6cee74c7cef3176f6900ec).  I have git 1.5.6.5, any ideas?
 
 Thanks!!
+
+-----
+
+### KeyError: 146
+
+I also get this error, here's the output (it seems to stem from an error in the python script):
+
+<pre>
+Traceback (most recent call last):
+  File "../ikiwiki-wordpress-import.py", line 74, in <module>
+    main(*sys.argv[1:])
+  File "../ikiwiki-wordpress-import.py", line 54, in main
+    data = content.encode('ascii', 'html_replace')
+  File "../ikiwiki-wordpress-import.py", line 30, in <lambda>
+    % htmlentitydefs.codepoint2name[ord(c)] for c in x.object[x.start:x.end]]), x.end))
+KeyError: 146
+warning: Not updating refs/heads/master (new tip 6dca6ac939e12966bd64ce8a822ef14fe60622b2 does not contain 60b798dbf92ec5ae92f18acac3075c4304aca120)
+git-fast-import statistics:
+</pre>
+
+etc.
+
+
+> Well, if this really is a script error, it's not really the script, but the wordpress XML dump, referring to a
+> possible malformed or invalid unicode character in the dump file. This is what I can gather from other scripts.
+> I'll be checking my dump file shortly.
+
+>> This is only part of the problem... I'm not exactly sure what's going on, and it's getting late/early for me....
+
+>>> I used --force for fast-import, but then everything seems deleted, so you end up doing a reset, checkout, add, *then* commit.
+>>> Seems really odd. I edited the script however, maybe this is why... these are my changes:
+
+    -print "data %d" % len(data)
+    +print "data %d merge refs/heads/%s" % (len(data), branch)
+
+>>> That control character is a ^q^0 in emacs, see git fast-import --help for more info.
+>>> I'll be trying an import *without* that change, to see what happens.
+
+>>>> I still have to do the above to preserve the changes done by this script... (removed previous note).
+
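The `KeyError: 146` in the traceback above comes from the `html_replace` error handler, which looks up every unencodable character in `htmlentitydefs.codepoint2name`; that table has no entry for code point 146. A minimal sketch of a more forgiving handler, assuming a Python 3 port (where the table lives in `html.entities`), could fall back to a numeric character reference when no named entity exists. The handler body and the sample string are illustrative and not part of the original script.

```python
import codecs
from html.entities import codepoint2name


def html_replace(err):
    """Encoding error handler: named entity if one exists, else numeric reference."""
    if not isinstance(err, UnicodeEncodeError):
        raise err
    parts = []
    for ch in err.object[err.start:err.end]:
        cp = ord(ch)
        name = codepoint2name.get(cp)  # None for code points such as 146
        parts.append("&%s;" % name if name else "&#%d;" % cp)
    return ("".join(parts), err.end)


codecs.register_error("html_replace", html_replace)

# Hypothetical sample: one character with a named entity, one without.
sample = "caf\u00e9 it\u0092s"
encoded = sample.encode("ascii", "html_replace")
print(encoded)
```

With this fallback the import no longer aborts on such characters. Whether code point 146 should instead be mapped to a typographic apostrophe (byte 0x92 is the right single quotation mark in Windows-1252) depends on how the export was encoded.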
diff --git a/doc/tips/add_chatterbox_to_blog.mdwn b/doc/tips/add_chatterbox_to_blog.mdwn
new file mode 100644
index 000000000..aa35b9331
--- /dev/null
+++ b/doc/tips/add_chatterbox_to_blog.mdwn
@@ -0,0 +1,21 @@
+If you use twitter or identi.ca, here's how to make a box
+on the side of your blog that holds your recent status updates
+from there, like I have on [my blog](http://kitenet.net/~joey/blog/)
+--[[Joey]] 
+
+* Enable the [[plugins/aggregate]] plugin, and set up a cron
+  job for it.
+* At the top of your blog's page, add something like the following.
+  You'll want to change the urls of course. Be sure to also change
+  the inline directive's [[PageSpec]] to link to the location the
+  feed is aggregated to, which will be a subpage of the page
+  you put this on (blog in this example):
+
+	\[[!template id=note text="""  
+	\[[!aggregate expirecount=5 name="dents" url="http://identi.ca/joeyh"  
+	feedurl="http://identi.ca/api/statuses/user_timeline/joeyh.atom"]]  
+	\[[!inline pages="internal(./blog/dents/*)" template=microblog
+	show=5 feeds=no]]
+	"""]]
+
+Note: Works best with ikiwiki 3.10 or better.
diff --git a/doc/tips/add_chatterbox_to_blog/discussion.mdwn b/doc/tips/add_chatterbox_to_blog/discussion.mdwn
new file mode 100644
index 000000000..bf7c9b1c3
--- /dev/null
+++ b/doc/tips/add_chatterbox_to_blog/discussion.mdwn
@@ -0,0 +1,25 @@
+The example you gave looks a bit odd.
+
+This is what I did from your example (still trying to learn the more complex things ;).
+
+<pre>
+\[[!template id=note text="""
+\[[!aggregate expirecount=5 name=kijkaqawej url=http://identi.ca/kjikaqawej
+feedurl=http://identi.ca/api/statuses/user_timeline/kjikaqawej.atom]]
+\[[!inline pages="internal(kijkaqawej/*)" template=microblog show=5 feeds=no]] """]]
+</pre>
+
+mine, live, here: <http://simonraven.kisikew.org/blog/meta/microblog-feed/>
+
+I expected something like: sidebar, with a number, and displaying them in the sidebar, but they don't display (similar to what you have on your blog).
+
+On the [[/ikiwiki/pagespec]] page, it says "internal" pages aren't "first-class" wiki pages, so it's best not to directly display them, so how do you manage to display them? I'd like to display their name, and what they link to in the sidebar, or otherwise in the main body.
+
+> That's what the inline does, displays the internal pages.
+> 
+> You need to fix your pagespec to refer to where the pages are aggregated
+> to, under the page that contains the aggregate directive. In your example,
+> it should be `internal(./blog/meta/microblog-feed/kijkaqawej/*)` --[[Joey]]
+
+>> Oooh, I see, it's referring to an absolute path (relative to the site), right?
+>> Thanks :).
diff --git a/doc/tips/embedding_content.mdwn b/doc/tips/embedding_content.mdwn
index 142acd16e..bfe458a84 100644
--- a/doc/tips/embedding_content.mdwn
+++ b/doc/tips/embedding_content.mdwn
@@ -15,7 +15,7 @@ you'd better trust that site. And if ikiwiki lets you enter such html, it
 needs to trust you.)
 
 The [[plugins/htmlscrubber]] offers a different way around this problem.
-You can configure it to skip scrubbing certian pages, so that content from
+You can configure it to skip scrubbing certain pages, so that content from
 elsewhere can be embedded on those pages. Then use [[plugins/lockedit]]
 to limit who can edit those unscrubbed pages.
 
diff --git a/doc/tips/importing_posts_from_wordpress/ikiwiki-wordpress-import.mdwn b/doc/tips/importing_posts_from_wordpress/ikiwiki-wordpress-import.mdwn
new file mode 100644
index 000000000..5d7a266ec
--- /dev/null
+++ b/doc/tips/importing_posts_from_wordpress/ikiwiki-wordpress-import.mdwn
@@ -0,0 +1,113 @@
+[[!meta title="ikiwiki-wordpress-import"]]
+
+I modified the script a bit so categories and tags would actually show up in the output file.
+
+
+<pre>
+#!/usr/bin/env python
+
+"""
+    Purpose:
+    Wordpress-to-Ikiwiki import tool
+
+    Copyright:
+    Copyright (C) 2007  Chris Lamb <chris@chris-lamb.co.uk>
+
+    This program is free software: you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation, either version 3 of the License, or
+    (at your option) any later version.
+
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+
+    You should have received a copy of the GNU General Public License
+    along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+    Usage: run --help as an argument with this script.
+
+    Notes:
+    I added some extra bits to include the [[!tag foo]] stuff in the post,
+    as it wasn't before, at all. I'll diff the versions out so you can see
+    the mess I made :).
+
+"""
+
+import os, sys
+import time
+import re
+
+from BeautifulSoup import BeautifulSoup
+
+import codecs, htmlentitydefs
+
+codecs.register_error('html_replace', lambda x: (''.join([u'&%s;' \
+    % htmlentitydefs.codepoint2name[ord(c)] for c in x.object[x.start:x.end]]), x.end))
+
+def main(name, email, subdir, branch='master'):
+    soup = BeautifulSoup(sys.stdin.read())
+
+    # Regular expression to match stub in URL.
+    stub_pattern = re.compile(r'.*\/(.+)\/$')
+
+    for x in soup.findAll('item'):
+        # Ignore draft posts
+        if x.find('wp:status').string != 'publish': continue
+
+        match = stub_pattern.match(x.guid.string)
+        if match:
+            stub = match.groups()[0]
+        else:
+            # Fall back to our own stubs
+            stub = re.sub(r'[^a-zA-Z0-9_]', '-', x.title.string).lower()
+
+        commit_msg = """Importing WordPress post "%s" [%s]""" % (x.title.string, x.guid.string)
+        timestamp = time.mktime(time.strptime(x.find('wp:post_date_gmt').string, "%Y-%m-%d %H:%M:%S"))
+
+        content = '[[!meta title="%s"]]\n\n' % (x.title.string.replace('"', r'\"'))
+        content += x.find('content:encoded').string.replace('\r\n', '\n')
+
+        # categories = x.findAll('category')
+        # categories = x.findAll({'category':True}, attrs={'domain':re.compile(('category|tag'))})
+        # categories = x.findAll({'category':True}, domain=["category", "tag"])
+        # categories = x.findAll({'category':True}, nicename=True)
+        """
+        We do it differently here because we have duplicates otherwise.
+        Take a look:
+        <category><![CDATA[Health]]></category>
+	<category domain="category" nicename="health"><![CDATA[Health]]></category>
+
+        If we do what the original did, we end up with all tags and cats doubled.
+        Therefore we only pick out nicename="foo". Our 'True' below is our 'foo'.
+        I'd much rather have the value of 'nicename', and tried, but my
+        python skillz are extremely limited....
+        """
+        categories = x.findAll('category', nicename=True)
+        if categories:
+            content += "\n"
+            for cat in categories:
+                # remove 'tags/' because we have a 'tagbase' set.
+                # your choice: 'tag', or 'taglink'
+                # content += "\n[[!tag %s]]" % (cat.string.replace(' ', '-'))
+                content += "\n[[!taglink %s]]" % (cat.string.replace(' ', '-'))
+                # print >>sys.stderr, cat.string.replace(' ', '-')
+
+        # moved this thing down
+        data = content.encode('ascii', 'html_replace')
+        print "commit refs/heads/%s" % branch
+        print "committer %s <%s> %d +0000" % (name, email, timestamp)
+        print "data %d" % len(commit_msg)
+        print commit_msg
+        print "M 644 inline %s" % os.path.join(subdir, "%s.mdwn" % stub)
+        print "data %d" % len(data)
+        print data
+
+if __name__ == "__main__":
+    if len(sys.argv) not in (4, 5):
+        print >>sys.stderr, "%s: usage: %s name email subdir [branch] < wordpress-export.xml | git-fast-import " % (sys.argv[0], sys.argv[0])
+    else:
+        main(*sys.argv[1:])
+
+</pre>
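The script above emits one `git fast-import` commit per post. Two details are easy to get wrong: the `data` command takes an exact byte count, while `len()` on a text string counts characters, and `time.mktime` interprets its argument as local time even though `wp:post_date_gmt` is in GMT. The following is a rough sketch of both points, assuming Python 3 and UTF-8 output; the helper names `gmt_to_epoch` and `fast_import_commit` are hypothetical and do not appear in the original script.

```python
import calendar
import time


def gmt_to_epoch(stamp):
    """Convert a 'YYYY-MM-DD HH:MM:SS' GMT string to seconds since the epoch."""
    # calendar.timegm treats the struct_time as UTC; time.mktime would use local time.
    return calendar.timegm(time.strptime(stamp, "%Y-%m-%d %H:%M:%S"))


def fast_import_commit(branch, name, email, timestamp, message, path, content):
    """Build one fast-import commit that adds a single file, as bytes."""
    msg = message.encode("utf-8")
    body = content.encode("utf-8")
    lines = [
        b"commit refs/heads/" + branch.encode("utf-8"),
        ("committer %s <%s> %d +0000" % (name, email, timestamp)).encode("utf-8"),
        b"data %d" % len(msg),    # byte length, not character count
        msg,
        b"M 644 inline " + path.encode("utf-8"),
        b"data %d" % len(body),   # byte length of the file content
        body,
    ]
    return b"\n".join(lines) + b"\n"


# Hypothetical post used only to show the shape of the stream.
stream = fast_import_commit(
    "master", "A. Author", "author@example.com",
    gmt_to_epoch("1970-01-02 00:00:00"),
    "Importing WordPress post \"caf\u00e9\"",
    "blog/cafe.mdwn",
    "hi",
)
```

Writing the result with `sys.stdout.buffer.write(stream)` keeps the byte counts accurate regardless of the terminal's encoding.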