I've been profiling my IkiWiki to try to improve its speed (with many pages, speed matters even more), and I've written a patch to speed up `match_glob`. This matcher is a good one to optimise because it gets called so many times.
Here's my patch - please consider it! -- [[KathrynAndersen]]
It seems to me as though changing `glob2re` to return `qr/$re/`, and calling `memoize(glob2re)` next to the other memoize calls, would be a less verbose way to do this? --[[smcv]]
I think so, yeah. Anyway, do you have any benchmark results handy,
Kathryn? --[[Joey]]
See below.
Also, would it make more sense for `glob2re` to return `qr/^$re$/i` rather than `qr/$re/`? Everything that uses `glob2re` seems to use `$foo =~ /^$re$/i` rather than `/$re/`, so I think that would make sense.
-- [[KathrynAndersen]]
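To make the two suggestions above concrete, here is a minimal sketch of a memoized `glob2re` that returns an anchored, case-insensitive compiled regex. The body of `glob2re` is a simplified stand-in written for illustration and is an assumption, not a copy of IkiWiki's code.

```perl
use strict;
use warnings;
use Memoize;

# Simplified stand-in for IkiWiki's glob2re (an assumption, not the real code):
# escape everything, then turn the glob wildcards back into regex wildcards.
sub glob2re {
    my $re = quotemeta(shift);
    $re =~ s/\\\*/.*/g;    # glob "*" matches any run of characters
    $re =~ s/\\\?/./g;     # glob "?" matches exactly one character
    return qr/^$re$/i;     # anchored and case-insensitive, compiled here
}

# Later calls with the same glob return the cached compiled regex.
memoize('glob2re');

my $re = glob2re('blog/*');
print "matched\n" if 'Blog/first_post' =~ $re;
```

With the anchors and the `i` flag inside the returned `qr//` object, a caller would write `$foo =~ $re` instead of repeating `/^$re$/i` at every call site.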
Git branch `smcv/ka-glob-cache` has Kathryn's patch. Git branch `smcv/memoize-glob2re` does as I suggested, which is less verbose than Kathryn's patch but also not as fast; I'm not sure why, tbh. --[[smcv]]
I think it's because my patch focuses on `match_glob` while the memoize patch focuses on `glob2re`, and `glob2re` is called in `filecheck`, `meta` and `po` as well as in `match_glob` and `match_user`; thus the memoized `glob2re` is dealing with a bigger set of globs to look up, and thus could be just that little bit slower. -- [[KathrynAndersen]]
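The difference in scope described above can be sketched as follows. This is not Kathryn's patch; `%glob_cache` and `cached_glob_re` are hypothetical names, and `glob2re` is again a simplified stand-in. The point is only that a cache owned by the matcher sees the matcher's globs and nothing else, whereas memoizing `glob2re` itself caches every caller's globs.

```perl
use strict;
use warnings;

# Simplified stand-in for IkiWiki's glob2re (an assumption, not the real code).
sub glob2re {
    my $re = quotemeta(shift);
    $re =~ s/\\\*/.*/g;
    $re =~ s/\\\?/./g;
    return qr/^$re$/i;
}

my %glob_cache;    # hypothetical cache that only the glob matcher touches

# Hypothetical helper: compile each glob at most once for the matcher.
sub cached_glob_re {
    my $glob = shift;
    return $glob_cache{$glob} //= glob2re($glob);
}

cached_glob_re('blog/*') for 1 .. 3;    # compiled once, reused twice
glob2re('*.png');                       # some other caller: bypasses the cache

printf "globs cached for the matcher: %d\n", scalar keys %glob_cache;
```

Whether the size of the lookup table actually explains the timing difference is not something this sketch can show; it only illustrates which calls end up in which cache.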
Benchmarks done with Devel::Profile on the same testbed IkiWiki setup. I'm just showing the start of the profile output, since that's what's relevant.
Before:
After:
Note that the seconds per call for match_glob in the "after" case has gone down by about a third.
K.A.
A second set of benchmarks, done by rebuilding the docwiki at commit f942c2db05e4 like so:
perl -Iblib/lib -d:Profile ikiwiki.in -setup docwiki.setup --no-verbose
The docwiki appears to use fewer glob matches than Kathryn's wiki.
With master:
time elapsed (wall): 29.6970
time running program: 24.6930 (83.15%)
time profiling (est.): 5.0041 (16.85%)
number of calls: 1359180
number of exceptions: 13
%Time Sec. #calls sec/call F name
13.62 3.3629 3406 0.000987 Text::Balanced::_match_tagged
10.84 2.6773 79442 0.000034 IkiWiki::PageSpec::match_glob
3.08 0.7598 59454 0.000013 <anon>:IkiWiki/Plugin/inline.pm:223
3.07 0.7593 29830 0.000025 IkiWiki::bestlink
2.99 0.7378 10231 0.000072 IkiWiki::PageSpec::match_link
With my `smcv/memoize-glob2re` branch:
time elapsed (wall): 30.4931
time running program: 25.1248 (82.39%)
time profiling (est.): 5.3683 (17.61%)
number of calls: 1439943
number of exceptions: 13
%Time Sec. #calls sec/call F name
13.19 3.3146 3406 0.000973 Text::Balanced::_match_tagged
8.41 2.1123 79442 0.000027 IkiWiki::PageSpec::match_glob
3.97 0.9979 86905 0.000011 Memoize::_memoizer
3.05 0.7654 59454 0.000013 <anon>:IkiWiki/Plugin/inline.pm:223
3.02 0.7576 29830 0.000025 IkiWiki::bestlink
and in a repeated run:
8.40 2.0905 79442 0.000026 IkiWiki::PageSpec::match_glob
With Kathryn's patch as seen in my `smcv/ka-glob-cache` branch:
time elapsed (wall): 27.7567
time running program: 22.9941 (82.84%)
time profiling (est.): 4.7627 (17.16%)
number of calls: 1279946
number of exceptions: 13
%Time Sec. #calls sec/call F name
14.29 3.2867 3406 0.000965 Text::Balanced::_match_tagged
7.89 1.8136 79442 0.000023 IkiWiki::PageSpec::match_glob
3.30 0.7577 59454 0.000013 <anon>:IkiWiki/Plugin/inline.pm:223
3.24 0.7461 29830 0.000025 IkiWiki::bestlink
3.19 0.7332 143 0.005127 ? IkiWiki::pagespec_match_list
and in a repeated run:
7.84 1.8253 79442 0.000023 IkiWiki::PageSpec::match_glob
--[[smcv]]
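For a quick comparison, the `match_glob` seconds from the first run of each docwiki profile above can be turned into relative savings against master. The helper name `pct_saved` is made up for this sketch, and the figures are copied from the tables as-is.

```perl
use strict;
use warnings;

# Hypothetical helper: percentage of time saved relative to a baseline.
sub pct_saved {
    my ($before, $after) = @_;
    return 100 * ($before - $after) / $before;
}

# Seconds spent in IkiWiki::PageSpec::match_glob, first run of each profile.
my $master = 2.6773;
my %branch = (
    'smcv/memoize-glob2re' => 2.1123,
    'smcv/ka-glob-cache'   => 1.8136,
);

for my $name (sort keys %branch) {
    printf "%-22s %.1f%% less time in match_glob\n",
        $name, pct_saved($master, $branch{$name});
}
```

This only compares the `match_glob` rows; on the memoize branch the profile also lists `Memoize::_memoizer` as a separate entry, which these numbers do not include.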