summaryrefslogtreecommitdiff
path: root/doc/todo/darcs.mdwn
blob: e5bf5ee271fe32d98ba2ab50140d9f99f187a979 (plain)

Here's Thomas Schwinge unfinished darcs support for ikiwiki.

(Finishing this has been suggested as a [[soc]] project.)

I haven't been working on this for months and also won't in the near future. Feel free to use what I have done so far and bring it into an usable state! Also, feel free to contact me if there are questions.

-- Thomas Schwinge

[[!toggle text="show"]] [[!toggleable text=""" # Support for the darcs rcs, URL:http://darcs.net/. # Copyright (C) 2006 Thomas Schwinge tschwinge@gnu.org # # This program is free software; you can redistribute it and/or modify it # under the terms of the GNU General Public License as published by the # Free Software Foundation; either version 2 of the License, or (at your # option) any later version. # # This program is distributed in the hope that it will be useful, but # WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU # General Public License for more details. # # You should have received a copy of the GNU General Public License along # with this program; if not, write to the Free Software Foundation, Inc., # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.

# We're guaranteed to be the only instance of ikiwiki running at a given
# time.  It is essential that only ikiwiki is working on a particular
# repository.  That means one instance of ikiwiki and it also means that
# you must not `darcs push' into this repository, as this might create
# race conditions, as I understand it.


use warnings;
use strict;
use IkiWiki;

package IkiWiki;


# Which darcs executable to use.
my $darcs = ($ENV{DARCS} or 'darcs');


# Internal functions.

sub darcs_info ($$$) {
    my $field = shift;
    my $repodir = shift;
    my $file = shift; # Relative to the repodir.

    my $child = open(DARCS_CHANGES, "-|");
    if (! $child) {
	exec($darcs, 'changes', '--repo=' . $repodir, '--xml-output', $file) or
	    error('failed to run `darcs changes\'');
    }

    # Brute force for now.  :-/
    while (<DARCS_CHANGES>) {
	last if /^<\/created_as>$/;
    }
    ($_) = <DARCS_CHANGES> =~ /$field=\'([^\']+)/;
    $field eq 'hash' and s/\.gz//; # Strip away the `.gz' from `hash'es.

    close(DARCS_CHANGES) or error('`darcs changes\' exited ' . $?);

    return $_;
}


# Exported functions.

sub rcs_update () {
    # Not needed.
}

sub rcs_prepedit ($) {
    # Prepares to edit a file under revision control.  Returns a token that
    # must be passed to rcs_commit() when the file is to be commited.  For us,
    # this token the hash value of the latest patch that modifies the file,
    # i.e. something like its current revision.  If the file is not yet added
    # to the repository, we return TODO: the empty string.

    my $file = shift; # Relative to the repodir.

    my $hash = darcs_info('hash', $config{srcdir}, $file);
    return defined $hash ? $hash : "";
}

sub rcs_commit ($$$) {
    # Commit the page.  Returns `undef' on success and a version of the page
    # with conflict markers on failure.

    my $file = shift; # Relative to the repodir.
    my $message = shift;
    my $rcstoken = shift;

    # Compute if the ``revision'' of $file changed.
    my $changed = darcs_info('hash', $config{srcdir}, $file) ne $rcstoken;

    # Yes, the following is a bit convoluted.
    if ($changed) {
	# TODO.  Invent a better, non-conflicting name.
	rename("$config{srcdir}/$file", "$config{srcdir}/$file.save") or
	    error("failed to rename $file to $file.save: $!");

	# Roll the repository back to $rcstoken.

	# TODO.  Can we be sure that no changes are lost?  I think that
	# we can, if we make sure that the `darcs push' below will always
	# succeed.

	# We need to revert everything as `darcs obliterate' might choke
	# otherwise.
        # TODO: `yes | ...' needed?  Doesn't seem so.
	system($darcs, "revert", "--repodir=" . $config{srcdir}, "--all") and
	    error("`darcs revert' failed");
	# Remove all patches starting at $rcstoken.
	# TODO.  Something like `yes | darcs obliterate ...' seems to be needed.
	system($darcs, "obliterate", "--quiet", "--repodir" . $config{srcdir},
	       "--match", "hash " . $rcstoken) and
		   error("`darcs obliterate' failed");
	# Restore the $rcstoken one.
	system($darcs, "pull", "--quiet", "--repodir=" . $config{srcdir},
	       "--match", "hash " . $rcstoken, "--all") and
		   error("`darcs pull' failed");

	# We're back at $rcstoken.  Re-install the modified file.
	rename("$config{srcdir}/$file.save", "$config{srcdir}/$file") or
	    error("failed to rename $file.save to $file: $!");
    }

    # Record the changes.
    # TODO: What if $message is empty?
    writefile("$file.log", $config{srcdir}, $message);
    system($darcs, 'record', '--repodir=' . $config{srcdir}, '--all',
	   '--logfile=' . "$config{srcdir}/$file.log",
	   '--author=' . 'web commit <web-hurd@gnu.org>', $file) and
	       error('`darcs record\' failed');

    # Update the repository by pulling from the default repository, which is
    # master repository.
    system($darcs, "pull", "--quiet", "--repodir=" . $config{srcdir},
	   "--all") and error("`darcs pull' failed\n");

    # If this updating yields any conflicts, we'll record them now to resolve
    # them.  If nothing is recorded, there are no conflicts.
    $rcstoken = darcs_info('hash', $config{srcdir}, $file);
    # TODO: Use only the first line here, i.e. only the patch name?
    writefile("$file.log", $config{srcdir}, 'resolve conflicts: ' . $message);
    system($darcs, 'record', '--repodir=' . $config{srcdir}, '--all',
	   '--logfile=' . "$config{srcdir}/$file.log",
	   '--author=' . 'web commit <web-hurd@gnu.org>', $file) and
	       error('`darcs record\' failed');
    my $conflicts = darcs_info('hash', $config{srcdir}, $file) ne $rcstoken;
    unlink("$config{srcdir}/$file.log") or
	error("failed to remove `$file.log'");

    # Push the changes to the main repository.
    system($darcs, 'push', '--quiet', '--repodir=' . $config{srcdir}, '--all')
	and error('`darcs push\' failed');
    # TODO: darcs send?

    if ($conflicts) {
	my $document = readfile("$config{srcdir}/$file");
	# Try to leave everything in a consistent state.
        # TODO: `yes | ...' needed?  Doesn't seem so.
	system($darcs, "revert", "--repodir=" . $config{srcdir}, "--all") and
	    warn("`darcs revert' failed.\n");
	return $document;
    } else {
	return undef;
    }
}

sub rcs_add ($) {
    my $file = shift; # Relative to the repodir.

    # Intermediate directories will be added automagically.
    system($darcs, 'add', '--quiet', '--repodir=' . $config{srcdir},
	   '--boring', $file) and error('`darcs add\' failed');
}

sub rcs_recentchanges ($) {
    warn('rcs_recentchanges() is not implemented');
    return 'rcs_recentchanges() is not implemented';
}

sub rcs_notify () {
    warn('rcs_notify() is not implemented');
}

sub rcs_getctime () {
    warn('rcs_getctime() is not implemented');
}

1

"""]]

This is my (bma) darcs.pm - it's messy (my Perl isn't up to much) but seems to work. It uses just one repo, like the mercurial plugin (unlike the above version, which AIUI uses two).

rcs_commit() uses backticks instead of system(), to prevent darcs' output being sent to the browser and mucking with the HTTP headers (darcs record has no --quiet option). And rcs_recentchanges() uses regexes rather than parsing darcs' XML output.

[[!toggle text="show" id="bma"]] [[!toggleable id="bma" text="""

#!/usr/bin/perl

use warnings;
use strict;
use IkiWiki;
use Date::Parse;
use open qw{:utf8 :std};

package IkiWiki;

sub rcs_update () { #{{{
	# Do nothing - there's nowhere to update *from*.
} #}}}

sub rcs_prepedit ($) { #{{{
} #}}}

sub rcs_commit ($$$;$$) { #{{{
	my ($file, $message, $rcstoken, $user, $ipaddr) = @_;

	# $user should probably be a name and an email address, by darcs
	# convention.
	if (defined $user) {
		$user = possibly_foolish_untaint($user);
	}
	elsif (defined $ipaddr) {
		$user = "Anonymous from $ipaddr";
	}
	else {
		$user = "Anonymous";
	}

	$message = possibly_foolish_untaint($message);
	
	# BUG: this outputs one line of text, and there's not a -q or --quiet
	# option. Redirecting output to /dev/null works, but I still get the
	# HTTP status and location headers displayed in the browser - is that
	# darcs' fault or ikiwiki's?
	# Doing it in backticks *works*, but I'm sure it could be done better.
	my @cmdline = ("darcs", "record", "--repodir", "$config{srcdir}",
	               "-a", "-m", "$message", "--author", "$user", $file);
	`darcs record --repodir "$config{srcdir}" -a -m "$message" --author "$user" $file`; # Return value? Output? Who needs 'em?
	#if (system(@cmdline) != 0) {
	#	warn "'@cmdline' failed: $!";
	#}

	return undef; # success
	
sub rcs_add ($) { # {{{
	my ($file) = @_;

	my @cmdline = ("darcs", "add", "--repodir", "$config{srcdir}", "-a", "-q", "$file");
	if (system(@cmdline) != 0) {
		warn "'@cmdline' failed: $!";
	}
} #}}}

sub rcs_recentchanges ($) { #{{{
	# TODO: This is horrible code. It doesn't work perfectly, and uses regexes
	# rather than parsing Darcs' XML output.
	my $num=shift;
	my @ret;
	
	return unless -d "$config{srcdir}/_darcs";

	my $changelog = `darcs changes --xml --summary --repodir "$config{srcdir}"`;
	$changelog = join("", split(/\s*\n\s*/, $changelog));
	my @changes = split(/<\/patch>.*?<patch/m, $changelog);


	foreach my $change (@changes) {
		$change =~ m/hash='(.*?)'/;
		my $rev = $1;
		$change =~ m/author='(.*?)'/;
		my $user = $1."\n";
		my $committype = "web";
		if($user =~ m/&lt;/) {
			# Author fields generated by darcs include an email address: look for the "<".
			$committype = "darcs";
			use HTML::Entities;
			$user = decode_entities $user;
		}
		$change =~ m/local_date='(.*?)'/;
		my $when = $1;
		$when=time - str2time($when, 'UTC');
		$change =~ m/<name>(.*?)<\/name>/g;
		my @message = {line => $1};
		foreach my $match ($change =~ m/<comment>(.*?)<\/comment>/gm) {
			push @message, {line => $1};
		}

		my @pages;
		foreach my $match ($change =~ m/<.*?_(file|directory)>(.*?)(<(added|removed)_lines.*\/>)*<\/.*?_(file|directory)>/g) {
			# My perl-fu is weak. I'm probably going about this all wrong, anyway.
			push @pages, {page => pagename($match)} if ( -f $config{srcdir}."/".$match || -d $config{srcdir}."/".$match) and not $match =~ m/^$/;
		}
		push @ret, { rev => $rev,
				user => $user,
				committype => $committype,
				when => $when,
				message => [@message],
				pages => [@pages],
			}
	}
	return @ret;
} #}}}

sub rcs_notify () { #{{{
	# TODO
} #}}}

sub rcs_getctime ($) { #{{{
	error gettext("getctime not implemented");
} #}}}

1

"""]]


Well, here's my version too. It only does getctime -- using a real XML parser, instead of regexp ugliness -- and maybe recentchanges, but that may be bitrotted, or maybe I never finished it, as I only need the getctime. As for actual commits, I have previously voiced my opinion, that this should be done by the plugin generating a patch bundle, and forwarding it to darcs in some way (darcs apply or even email to another host, possibly moderated), instead of the hacky direct modification of a working copy. It could also be faster to getctime in a batch. Just reading in all the changes the first time they're needed, might not be a big improvement in many cases, but if we got a batch request from ikiwiki, we could keep reaing the changes until all the files in this batch request have been met. --[[tuomov]]

[[!toggle text="show" id="tuomov"]] [[!toggleable id="tuomov" text="""

"""]]


I merged the two versions above and made some fixes; it is recording my web edits in darcs and showing a recent changes page. It is in a darcs repository, please send patches. --[[Simon_Michael]]

I'd like to see at least the following fixed before I commit this: --[[Joey]]

  • Running darcs record $filename in backticks is not good (security) The thing to do is to open stdout to /dev/null before execing darcs.
  • Get rcs_recentchanges_xml working, parsing xml with regexps does not seem like a maintenance win.
  • rcs_notify should be removed, it's no longer used.
  • Some form of conflict handling. Using darcs to attempt to merge the changes is I gusss optional (although every other rcs backend, including svn manages to do this), but it needs to at least detect conflicts and return a page with conflict markers for the user to fix the conflict.

I have addressed the recentchanges bit, you can find my hacked up darcs.pm at http://web.mornfall.net/stuff/web-root/IkiWiki/Rcs/darcs.pm.

It's got couple of FIXMEs, and a very site-specific filter for recentchanges. Not sure how to do that better though. I will eventually add web commits, probably of my own (and mention it here).


And here's yet another one, including an updated ikiwiki-makerepo. :)

http://khjk.org/~pesco/ikiwiki-darcs/ (now a darcs repo)

Note that there's a 'darcs' branch in git that I'm keeping a copy of your code in. Just in case. :-)

I've taken all the good stuff from the above and added the missing hooks. The code hasn't seen a lot of testing, so some bugs are likely yet to surface. Also, I'm not experienced with perl and don't know where I should have used the function possibly_foolish_untaint.

Regarding the repository layout: There are two darcs repositories. One is the srcdir, the other we'll call master.

  • HTML is generated from srcdir.

  • CGI edits happen in srcdir.

  • The backend pulls updates from master into srcdir, i.e. darcs commits should happen to master.

  • master calls ikiwiki (through a wrapper) in its apply posthook, i.e. master/_darcs/prefs/defaults should look like this:

    apply posthook ikiwrap
    apply run-posthook
    

    (I'm not sure, should/could it be ikiwrap --refresh above?)

  • The backend pushes CGI edits from srcdir back into master (triggering the apply hook).

  • The working copies in srcdir and master should not be touched by the user, only by the CGI or darcs, respectively.

Review of this one:

  • Should use tab indentation.
  • rcs_getctime should not need to use a ctime cache (such a cache should also not be named .ikiwiki.ctimes). rcs_getctime is run exactly once per page, ever, and the data is cached in ikiwiki's index.
  • I doubt that ENV{DARCS} will be available, since the wrapper clobbers> the entire environment. I'd say remove that.
  • I don't understand what darcs_info is doing, but it seems to be parsing xml with a regexp?
  • Looks like rcs_commit needs a few improvements, as marked TODO
  • rcs_remove just calls "rm"? Does darcs record notice the file was removed and automatically commit the removal? (And why system("rm") and not unlink?)
  • Is the the darcs info in [[details]] still up-to-date re this version? --[[Joey]]

Update:

I think I've addressed all of the above except for the XML parsing in darcs_info. The function determines the md5 hash of the last patch the given file appears in. That's indeed being done with regexps but my Perl isn't good enough for a quick recode right now.

As for the darcs info in [[rcs/details]], it does not accurately describe the way this version works. It's similar, but the details differ slightly. You could copy my description above to replace it.

There is still some ironing to do, for instance the current version doesn't allow for modifying attachments by re-uploading them via CGI ("darcs add failed"). Am I assuming correctly that "adding" a file that's already in the repo should just be a no-op? --pesco

It should result in the new file contents being committed by rcs_commit_staged. For some revision control systems, which automatically commit modifications, it would be a no-op. --[[Joey]]

Done. --pesco

[[!tag patch]]