Perl 6 RSS Feeds

Steve Mynott (Freenode: stmuk) steve.mynott (at) / 2017-07-20T12:11:12

Weekly changes in and around Perl 6: 2017.29 Zoffix Released

Published by liztormato on 2017-07-17T23:27:55

This week saw the end of an era: Zoffix Znet and his trusty bots released the Rakudo Perl 6 2017.07 compiler, which will be the base of the 2017.07 Rakudo Star release within the next few days, by the looks of it.

So why is that the end of an era? Well, because Zoffix decided that he would like to pass on the baton of Release Manager to someone else, which AlexDaniel graciously accepted!

I would hereby like to specifically thank Zoffix Znet for everything he has done to make the release process as easy as it is now, and to thank him for the 14 consecutive monthly Rakudo Perl 6 releases he has done, which is a record! Tip of the hat!

The Perl Conference in Amsterdam

The final list of presentations is now available: no further talks will be added. Unless you want to submit a Lightning Talk! If you’ve already registered, please mark the presentations that you want to attend, so that the organisers can have a better idea of the size of the room needed for that presentation.

Furthermore, it is now certain that there will be 2 days of Hackathon following the Perl Conference in Amsterdam (Saturday 12 and Sunday 13 August) at the same venue. Yours truly will be there for sure, and hopefully a lot of other people working on either Rakudo Perl 6 or Pumpkin Perl 5!

Core Developments

Blog Posts

Meanwhile on Twitter

Meanwhile on StackOverflow

Meanwhile on perl6-users

Ecosystem Additions

  • WebService::AWS::S3 by Brian Duggan.

Winding Down

A hot week ahead. Please check again in a week to see how hot it got in Perl 6 land!

    Perl 6 Maven: LWP::Simple - a simple web client in Perl 6

    Published by szabgab

    Weekly changes in and around Perl 6: 2017.28 Rakudo is Hot

    Published by liztormato on 2017-07-10T21:28:09

    “A rose by any other name…”. It is the subtitle of Zoffix Znet’s blog post The Hot New Language Named Rakudo, in which he describes his reasoning for wanting to tweak the name of the programming language “Perl 6” to “Rakudo Perl 6”. Which, in the view of yours truly, is not too different from earlier suggestions of tweaking the name of “Perl 5” to “Pumpkin Perl 5” to differentiate Perl 5 and Perl 6 in the public eye. Some quotes from the blog post:

    (Rakudo) is a young and hip teenager who doesn’t mind breaking long held status quo. Rakudo is the King of Unicode and Queen of Concurrency. It’s a “4th generation” language, and if you take the time to learn its many features, it’s quite damn concise.

    Trying to market Rakudo language as a “Perl 6” language is like holding a great Poker hand while playing Blackjack—even with a Royal Flush, you still lost the game that’s actually being played. The truly distinguishing features of Rakudo don’t get any attention, while at the same time people get disappointed when a “Perl” language no longer does things Perl used to do.

    Rakudo has many strengths but they get muted when we call it “Perl 6”. Perl is a brand name for a product with different strengths and attempting to pretend Rakudo has the same strengths for the past 2 years proved to be a failed strategy. I believe a name tweak can help these issues and start us on a path with a more solid footing.

    The blog post has sparked quite a few comments so far: Reddit r/perl and Reddit r/perl6.

    Yours truly appreciates the effort and thought that Zoffix Znet has put into this blog post (as he has done with many other excellent blog posts in the past). It is definitely food for thought for the marketing efforts of Rakudo Perl 6. And one can only hope it will get picked up!

    Other blog posts

    Other Core Developments

    Meanwhile on StackOverflow

    Meanwhile on Twitter

    Meanwhile on perl6-users

    Ecosystem Additions

    Winding Down

    Feels like summer is heating up. Check here again next week for more news about Rakudo Perl 6!

    The Loss of Name and Orientation

    Published by Moritz Lenz on 2017-07-10T22:00:01

    The Perl 6 naming debate has started again. And I guess with good reason. Teaching people that Perl 6 is a Perl, but not the Perl requires too much effort. Two years ago, I didn't believe. Now you're reading a tired man's words.

    I'm glad that this time, we're not discussing giving up the "Perl" brand, which still has very positive connotations in my mind, and in many other minds as well.

    And yet, I can't bring myself to like "Rakudo Perl 6" as a name. There are two very shallow reasons for that: one, going from two syllables, "Perl six", to five of them seems a step in the wrong direction; and two, I remember the days when the name was pretty young, and people would misspell it all the time. That seems to have abated, though I don't know why.

    But there's also a deeper reason, probably a sentimental old man's reason. I remember the days when Pugs was actively developed, and formed the center of a vibrant community. When kp6 and SMOP and all those weird projects were around. And then, just when it looked like there was only a single compiler around, Stefan O'Rear conjured up niecza, almost single-handedly, and out of thin air. Within months, it was a viable Perl 6 compiler that people on #perl6 readily recommended.

    All of this was born out of the vision that Perl 6 was a language with no single, preferred compiler. Changing the language name to include the compiler name means abandoning this vision. How can we claim to welcome alternative implementations when the commitment to one compiler is right in the language name?

    However, I can't weigh this loss of vision against a potential gain in popularity. I can't decide if it's my long-term commitment to the name "Perl 6" that makes me resent the new name, or valid objections. The lack of vision mirrors my own state of mind pretty well.

    I don't know where this leaves us. I guess I must apologize for wasting your time by publishing this incoherent mess.

    Perl 6 Maven: MongoDB with Perl 6 on Linux

    Published by szabgab

    Zoffix Znet: The Hot New Language Named Rakudo

    Published on 2017-07-07T00:00:00

    A rose by any other name...

    Perl 6 Maven: Parsing command line arguments in Perl 6 - ARGS - ARGV - MAIN

    Published by szabgab

    samcv: Grant Status Update 2

    Published on 2017-07-04T07:00:00

    This is my second grant progress report for my Perl Foundation grant entitled "Improving the Robustness of Unicode Support in Rakudo on MoarVM".

    I got to working on collation this month. I'm going to explain a bit of how the Unicode Collation Algorithm works.

    Unicode Collation Algorithm

    The collation data for UCA is made up of arrays like so:

    [primary.secondary.tertiary]

    Each one is an integer. Primary is different for different letters, 'a' vs 'z'. Secondary covers differences such as diacritics, 'a' vs 'á'. And tertiary is case, 'a' vs 'A'. While it works differently for non-Latin characters, that is the gist of what they represent. In most cases you have one codepoint mapped to one collation array, though in many cases this is not true.

    Single codepoints can map to multiple collation array elements. Sequences of codepoints can also map to one or more than one collation array elements.

    Some sequences also can exist inside others.

    So the string xyz may have one set of collation elements but xy has another, where x, y and z are codepoints in a three codepoint sequence with its own set of multiple collation keys.

    So, how do these collation elements translate into sorting the codepoints?

    [.0706.0020.0002], [.06D9.0020.0002]

    You take the two primary values, then append a 0 as a separator. Then push the secondary weights, append another 0 as a separator, and then push on the tertiary weights:

    0706, 06D9, 0, 0020, 0020, 0, 0002, 0002
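
As an illustration of the flattening just described, here is a Python sketch (not the MoarVM implementation; the weights are the ones from the example above, while real UCA keys are built from the DUCET data):

```python
def collation_key(elements):
    """Flatten [primary, secondary, tertiary] collation elements into a
    single sort key, using 0 as a separator between the levels."""
    key = []
    for level in range(3):                      # primary, secondary, tertiary
        key.extend(e[level] for e in elements)  # all weights of this level...
        if level < 2:
            key.append(0)                       # ...then a 0 level separator
    return key

# The two collation arrays from the example above:
a = [0x0706, 0x0020, 0x0002]
b = [0x06D9, 0x0020, 0x0002]

print(collation_key([a]))
print(collation_key([b]))
# The keys already differ at the first (primary) position, so the
# comparison is decided before any secondary or tertiary weight is seen.
print(collation_key([a]) > collation_key([b]))  # True
```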

    Now this would pose a problem, since we would need to traverse the entire string before making any decisions. Instead, what I have decided to do is keep the arrays as [primary.secondary.tertiary] and push them onto a stack rather than changing them into a linear progression, iterate through the primaries, and then grab more collation elements only as they are needed to resolve ties.

    Also, when collation data for the next codepoint is added to the stack, if it is the starter of a sequence we will also pull in the next codepoint, going through a linked list stored in C arrays as needed. If the next codepoint ends up not being part of a sequence, we just push the codepoint we "peeked" at onto the stack as well, so we don't have to go back over codepoints.

    Now this improved Unicode Collation Algorithm is not complete, but I am continuing to work on integrating the new C data structure I've created into MoarVM, and it currently works partially, but not as well as the current implementation.

    Improvements to the Current UCA Implementation

    In the meantime I have made improvements to the current implementation of the Unicode Collation Algorithm. Previously it was possible to enable or disable the primary, secondary or tertiary levels. This allowed you to do things such as ignore diacritics or ignore casing when sorting. What you are now able to do is reverse the sorting of the different levels. This allows you, for example, to sort uppercase letters before lowercase (the default UCA sorts lowercase before uppercase, since lowercase < uppercase). It can also let you put characters with diacritic marks before the ordinary letters. Any of the three levels can be either enabled, disabled, or reversed. For anybody already using it, supplying True or False to set $*COLLATION still works the same as before, but you are now able to supply 1, 0, or -1 to enable, disable or reverse the collation for specific levels.
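
A sketch of what level reversal means for the ordering, in Python with made-up toy weights (the real feature uses $*COLLATION and the Unicode collation data; the `key` helper and `weights` table here are purely illustrative):

```python
def key(elements, levels=(1, 1, 1)):
    """Build a sort key from [primary, secondary, tertiary] collation
    elements; each level is enabled (1), disabled (0) or reversed (-1)."""
    out = []
    for i, mode in enumerate(levels):
        if mode == 0:
            continue                               # level ignored entirely
        out.extend(mode * e[i] for e in elements)  # -1 negates, flipping order
        out.append(0)                              # level separator
    return out

# Toy weights: 'a' and 'A' share primary and secondary, differ in
# tertiary (case) only.  These numbers are invented for the example.
weights = {'a': [[0x15EF, 0x20, 0x02]], 'A': [[0x15EF, 0x20, 0x08]]}

print(sorted('aA', key=lambda c: key(weights[c])))              # ['a', 'A']
print(sorted('aA', key=lambda c: key(weights[c], (1, 1, -1))))  # ['A', 'a']
```

Reversing the tertiary level flips only the case ordering while leaving letter and diacritic ordering alone, which is exactly what the 1/0/-1 interface allows per level.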


    Grapheme Cluster Break

    As I said last week, I made improvements to the script that tests our breakup of graphemes. Now we have full support for the Prepend property that was added in Unicode 9.0, as well as passing all the tests for regional indicators. The only tests we now don't pass in GraphemeClusterBreakTest.t are a few emoji tests, and I believe we only fail 3 or so of these! The Prepend fixes required us to save more state while parsing, as Prepend is different from all other Unicode grapheme break logic in that it comes before, not after, a base character.

    Ignorecase+Ignoremark Regex

    The longstanding bug I mentioned in my previous status report has now been fixed. The bug was in regex when both ignorecase and ignoremark adverbs were used.

    say "All hell is breaking loose" ~~ m:i:m/"All is fine, I am sure of it"/
    # OUTPUT«「All hell is breaking loose」␤»
    # Output before the fix; this should not have matched.

    This bug occurred when the entire length of the haystack had been searched and all of its graphemes matched the start of the needle.

    If the needle extended past the end of the haystack, the code would erroneously think there was a match there, as it only checked that the needle matched along the whole remaining length of the haystack.

    This would cause 'fgh' to be found in 'abcdefg'. The bug only occurred at the very end of the haystack.
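
A minimal Python sketch of the broken versus fixed boundary logic (ignoring casefolding and grapheme handling for clarity; the function names are illustrative, not MoarVM's):

```python
def index_ignorecase_buggy(haystack, needle, start):
    """Stops comparing at the end of the haystack, so a needle that
    runs off the end still 'matches' everything that was compared."""
    for i, ch in enumerate(needle):
        if start + i >= len(haystack):
            return start          # BUG: ran off the end, yet reports a match
        if haystack[start + i].lower() != ch.lower():
            return -1
    return start

def index_ignorecase_fixed(haystack, needle, start):
    """First check that the needle can fit at all at this offset."""
    if start + len(needle) > len(haystack):
        return -1                 # needle cannot fit within the haystack here
    for i, ch in enumerate(needle):
        if haystack[start + i].lower() != ch.lower():
            return -1
    return start

print(index_ignorecase_buggy('abcdefg', 'fgh', 5))   # 5  (wrong match)
print(index_ignorecase_fixed('abcdefg', 'fgh', 5))   # -1
```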

    The internal string_equal_at_ignore_case_INTERNAL_loop now returns -1 if there was no match, and 0 or more if there was a match at that index. This return value also carries new information: it is 0 for a plain match, and some positive integer telling by how much the haystack expanded when casefolding it.

    As explained by my previous post, information about when characters expand when foldcased must be retained.

    This information had been planned to be exposed in some way at a future date, as if we are searching for 'st' inside a string 'stabc', nqp::indexic (index ignorecase) will indicate that it is located at index 0, and in Perl 6 Rakudo it will return 'sta' when it should instead have returned 'st'.

    For now this additional information is only internal and the return values of the nqp::indexic_s and nqp::equatic_s ops have not changed.

    NQP Codepath Problems…

    Previously there were way too many different codepaths handling the different variations of regex adverbs: none, ignorecase, ignoremark, and ignorecase+ignoremark. Problematically, each combination had its own codepath. To really solve this bug and improve the code quality, I decided to clean it up and correct this.

    In my past work I had already added a nqp::indexic op, so it was time to add another! I added a nqp::indexicim op and a nqp::eqaticim op and was able to reuse most of the code and not increase our code burden much on the MoarVM side, and greatly reduce the possibility for bugs to get in on varying combinations of ignorecase/ignoremark ops.

    This was a very longstanding Unicode bug (I don't think both adverbs together ever worked), so it's great that it is now fixed :).

    Coming Up

    I will be continuing to fix the issues in the new Unicode Collation Algorithm implementation, as I described earlier in this post. I also plan on taking stock of all of the current Grapheme Cluster Break issues, which now exist only for certain emoji (though the vast majority of emoji work properly).

    I will also be preparing my talks for the Amsterdam Perl conference as well!


    I released a new module, Font::QueryInfo which allows you to query font information using FreeType. It can even return the codepoints a font supports as a list of ranges!

    Weekly changes in and around Perl 6: 2017.27 Inching On Speed

    Published by liztormato on 2017-07-03T22:08:52

    Or, how to go from 2.5x slower to only 1.1x slower than Perl 5. Jonathan Worthington explains it all in his blog post titled “Optimizing reading lines from a file“. But that’s not all he’s done in the past week! Last Friday he also gave an online presentation titled “Primitives, Composition, Patterns” (slides), which was part of the sponsoring by Nick Logan. A must see if you want to get up to date on the latest in concurrency in Perl 6! And also a prime example of the quality of deliverables of sponsoring Jonathan.

    Seqs, Drugs, And Rock’n’Roll

    Zoffix Znet published part 2 of his blog post series about Iterators and Seqs. Please look at part 1, in which he discusses how a Seq can .cache its values, if you haven’t done so already. Part 2 delves into how to write your own Iterator and what you can do to optimize it for a number of use cases.

    Other blog posts

    So you want to be a Presenter

    You can still be one at The Perl Conference in Amsterdam, as the Call for Papers has been extended to 7 July. There is no schedule yet, but the list of accepted talks is available. On it, you can already find the following presentations that have some relation to Perl 6:

    Core Developments

    Meanwhile on StackOverflow

    Meanwhile on perl6-users

    Ecosystem Additions

    Winding Down

    A nice week with some surprises. A good beginning of the second half of 2017. Be sure to check again next week for more Perl 6 news!

    6guts: Optimizing reading lines from a file

    Published by jnthnwrthngtn on 2017-07-02T15:57:43

    Reading lines from a file and processing them one at a time is a hugely common scripting task. However, to date our performance at this task has been somewhat underwhelming. Happily, a grateful Perl 6 fan stepped up in response to my recent call for funding, offering 25 hours of funding to work on whatever I felt was most pressing, but with a suggestion that perhaps I could look at some aspect of I/O performance. Having recently been working on refactoring I/O anyway, this was very timely. So, I crafted a benchmark and dug in.

    The benchmark and a baseline

    Perl 5 is considered to have very good I/O performance, so I figured I’d use that as a rough measure of how close Perl 6 was to performing well at this task. A completely equivalent benchmark isn’t quite possible, but I tried to pick something representative of what the average programmer would write. The task for the benchmark was to take a file with one million lines, each having 60 characters, loop over them, and add up the number of characters on each line. That number would then be printed out at the end (it’s important that benchmarks calculating results return or consume the result in some way, as a sufficiently smart optimizer may otherwise manage to eliminate the work we think we’re measuring). The rules were that:

    The Perl 5 benchmark for this came out as follows:

    perl -e 'open my $fh, "<:encoding(UTF-8)", "longfile";
             my $chars = 0;
             while ($_ = <$fh>) { chomp; $chars = $chars + length($_) };
             close $fh;
             print "$chars\n"'

    With the Perl 6 one looking like this:

    perl6 -e 'my $fh = open "longfile";
              my $chars = 0;
              for $fh.lines { $chars = $chars + .chars };
              say $chars'

    I’ll note right off that in Perl 6 there are likely ways, today, to do a bit better. For example, the $chars variable could be given a native int type, and it’s entirely possible that a while loop might come out faster than the for loop. Neither of those are representative of what a typical programmer looking at the documentation and diving in to implementing stuff would do, however. I suspect that Perl 5 experts could similarly point out some trick I’ve missed, but I’m trying to benchmark typical use.

    One slight unfairness is that the Perl 6 solution will actually count the number of grapheme clusters, since strings are at grapheme level. This entails some extra processing work, even in the case that there are no multi-codepoint clusters in the input file (as there were not in this case). But again, the average user making comparisons won’t much care for such technicalities.

    All measurements were made on modern hardware with an Intel Xeon 6-core CPU and a fast SSD, and on Linux.

    At the point I started work, the Perl 6 solution clocked in at 2.87s, to just 1.13s for Perl 5. This made Perl 6 a factor of 2.5 times slower.

    First hints from the profile

    The whole I/O stack recently got a good overhaul, and this was the first time I’d looked at a profile since that work was completed. Looking at the output from --profile immediately showed up some rather disappointing numbers. Of all callframes, 57.13% were JIT-compiled. Worse, basically nothing was being inlined.

    At this point, it’s worth recalling that Perl 6 is implemented in Perl 6, and that there’s quite a bit going on between the code in the benchmark and ending up in either things implemented in C or a system call. The call to lines returns an Iterator object. Reading a line means calling the pull-one method on that Iterator. That in turn calls the consume-line-chars method on a $!decoder object, and that method is what actually calls down to the VM-backed decoder to read a line (so there’s a level of indirection here to support user provided decoders). The return value of that method then has defined called on it to check we actually got a line back. If yes, then it can be returned. If not, then read-internal should be called in order to fetch data from the file handle (given buffering, this happens relatively rarely). Running the loop body is a further invocation, passing the read line as a parameter. Getting the chars count is also a method call (which, again, actually calls down to the VM guts to access the string’s grapheme count).

    That’s quite a lot of method calling. While the VM provides I/O, decoding, and finding input up to a separator, the coordination of that whole process is implemented in Perl 6, and involves a bunch of method calls. Seen that way, it’s perhaps not surprising that Perl 6 would come in slower.

    There are, however, things that we can do to make it fast anyway. One of them is JIT compilation, where instead of having to interpret the bytecode that Perl 6 is compiled in to, we further translate it into machine code that runs on the CPU. That cuts out the interpreter overhead. Only doing that for 57% of the methods or blocks we’re in is a missed opportunity.

    The other really important optimization is inlining. This is where small methods or subroutines are taken and copied into their callers by the optimizer. This isn’t something we can do by static analysis; the point of methods calls is polymorphism. It is something a VM doing dynamic analysis and type specialization can do, however. And the savings can be rather significant, since it cuts out the work of creating and tearing down call frames, as well as opening the door to further optimization.

    The horrors in the logs

    There are a couple of useful logs that can be written by MoarVM in order to get an idea of how it is optimizing, or failing to optimize, code. The JIT log’s main point of interest for the purpose of optimization is that it can indicate why code is not being JIT-compiled – most commonly because it contains something the JIT doesn’t know about. The first thing in this case was the call into the VM-backed decoder to extract a line, which was happily easily handled. Oddly, however, we still didn’t seem to be running the JIT-compiled version of the code. Further investigation uncovered an unfortunate mistake. When a specialized version of a method calls a specialized version of another method, we don’t need to repeat the type checks guarding the second method. This was done correctly. However, the code path that was taken in this case failed to check if there was a JIT-compiled version of the target rather than just a specialized bytecode version, and always ran the latter. I fixed that, and went from 57.13% of frames JIT-compiled to 99.86%. Far better.

    My next point of investigation is why the tiny method to grab a line from the decoder was not being inlined. When I took a look at the post-optimization code for it, it turned out to be no surprise at all: while the logic of the method was very few instructions, it was bulked out by type checking of the incoming arguments and return values. The consume-line-chars method looks like this:

    method consume-line-chars(Bool:D :$chomp = False, Bool:D :$eof = False --> Str) {
        my str $line = nqp::decodertakeline(self, $chomp, $eof);
        nqp::isnull_s($line) ?? Str !! $line
    }

    Specializations are always tied to a callsite object, from which we can know whether we’re being passed a parameter or not. Therefore, we should be able to optimize out those checks and, in the case the parameter is being passed, throw out the code setting the default value. Further, the *%_ that all methods get automatically should have been optimized out, but was not being.

    The latter problem was fixed largely by moving code, although tests showed a regression that needed a little more care to handle – namely, that a sufficiently complex default value might do something that causes a deoptimization, and we need to make sure we can fall back into the interpreter and have things work correctly in that case.

    While these changes weren’t enough to get consume-line-chars inlined, they did allow an inlining elsewhere, taking the inline ratio up to 28.49% of call frames.

    This initial round of changes took the Perl 6 benchmark from 2.87s to 2.77s, so about 3.5% off. Not much, but something.

    Continuing to improve code quality

    The code we were producing even pre-optimization was disappointing in a few ways. Firstly, even though a simple method like consume-line-chars, or chars, would never possibly do a return, we were still spitting out a return exception handler. A little investigation revealed that we were only doing analysis and elimination of this for subs but not methods. Adding that analysis for methods too took the time down to 2.58s. Knocking 7% off with such a small change was nice.

    Another code generation problem lay in consume-line-chars. Access to a native lexical can be compiled in two ways: either just by reading the value (fine if it’s only used as an r-value) or by taking a reference to it (which is correct if it will be used as an l-value). Taking a reference is decidedly costly compared to just reading the value. However, it’s always going to have the correct behavior, so it’s the default. We optimize it away whenever we can (in fact, all the most common l-value usages of it never need a reference either).

    Looking at consume-line-chars again:

    method consume-line-chars(Bool:D :$chomp = False, Bool:D :$eof = False --> Str) {
        my str $line = nqp::decodertakeline(self, $chomp, $eof);
        nqp::isnull_s($line) ?? Str !! $line
    }

    We can see that the read of $line here is an r-value, since consume-line-chars is not marked is rw. Unfortunately, it was compiled as an l-value because the conditional compilation lost that context information. So, I addressed that and taught Rakudo to pass along the return value’s r-value context.

    A native reference means an allocation, and this change cut the number of GC runs enormously, from 182 of them to 41. That sounds like it should make a sensational difference. In fact, it got things down to 2.45s, a drop of just 5%. Takeaway lesson: allocating less stuff is good, but MoarVM’s GC is also pretty good at throwing away short-lived things.

    Meanwhile, back in the specializer…

    With the worst issues of the code being fed into MoarVM addressed, it was back to seeing why the specializer wasn’t doing a better job of stripping out type checks. First of all, it turned out that optional named arguments were not properly marking the default code dead when the argument was actually passed.

    Unfortunately, that wasn’t enough to get the type tests stripped out for the named parameters to consume-line-chars. In fact, this turned out to be an issue for all optional parameters. When doing type analysis, and there are two branches, the type information has to be merged at join points in the control flow graph. So it might see something like this in the case that the argument was not passed:

        Bool (default path) \   / Unknown (from passed path)
                             \ /
                       Result: Unknown

    Or maybe this in the case that it was passed:

        Bool (default path) \   / Scalar holding Bool (from passed path)
                             \ /
                       Result: Unknown

    In both cases, the types disagree, so they merge to unknown. This is silly, as we’ve already thrown out one of the two branches, so in fact there’s really no merge to do at all! To fix this up, I marked variables (in single static assignment form) that died as a result of a basic block being removed. To make the dead basic blocks from argument analysis actually be removed, we needed to do the dead code removal earlier as well as doing it at the very end of the optimization process. With that marking in place, it was then possible to ignore now-dead code’s contribution to a merge, which meant a whole load of type checks could now be eliminated. Well, in fact, only in the case where the optional was passed; a further patch to mark the writers of individual instructions dead for the purpose of merges was needed to handle the case where it was not.
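
Schematically, the improved merge might look like this (a Python sketch of the idea only; MoarVM's actual type facts and control-flow-graph structures are quite different):

```python
def merge_types(facts):
    """Merge type facts at a control-flow join.  `facts` is a list of
    (type, is_dead) pairs, one per incoming branch; branches whose
    basic block was removed are marked dead and ignored."""
    live = [t for t, dead in facts if not dead]
    if not live:
        return None                        # unreachable join point
    first = live[0]
    return first if all(t == first for t in live) else 'Unknown'

# Naive merge: the (actually dead) default branch contributes 'Bool',
# disagreeing with the passed-argument branch, so the result degrades.
print(merge_types([('Bool', False), ('Scalar[Bool]', False)]))  # Unknown

# With the default branch marked dead, only the live fact survives,
# and the downstream type check can be eliminated.
print(merge_types([('Bool', True), ('Scalar[Bool]', False)]))   # Scalar[Bool]
```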

    That left the return type being checked on the way out, which also seemed a bit of a waste as we could clearly see it was a Str. After a tweak to Rakudo to better convey type information in one of its VM extension ops, that check was optimized out too.

    And for all of this effort, the time went from…2.45s to 2.41s, just 2% off. While it’s cheaper to not type check things, it’s only so costly in the first place.

    A further win was that, with the code for consume-line-chars now being so tiny, it should have been an inlining candidate. Alas, it was not, because the optional arguments were still having tracking information recorded just in case we needed to deoptimize. This seemed odd. It turned out that my earlier fix for this was too simplistic: it would leave the tracking in if the method would ever deoptimize, not just if it would do so while handling arguments. I tightened that up and the time dropped to 2.37s, another 2% off. Again, very much worth it, but it shows that invocation – while not super cheap – is also only so costly.

    With consume-line-chars inlining now conquered, another area of the code we were producing caught my eye: boolification was, in some cases, managing to box an int into an Int only to then immediately unbox it and turn it into a Bool. Clearly this was quite a waste! It turned out that an earlier optimization to avoid taking native references had unexpected consequences. But even nicer was that my earlier work to pass down r-value context meant I could delete some analysis and just use that mechanism instead. That was worth 4%, bringing us to 2.28s.

    Taking stock

    None of these optimizations so far were specific to I/O or touched the I/O code itself. Instead, they are general optimization and code quality improvements that will benefit most Perl 6 programs. Together, they had taken the lines benchmark from 2.87s to 2.28s. Each may have been just some percent, but together they had knocked 20% off.

    By this point, the code quality – especially after optimization – was far more pleasing. It was time to look for some other sources of improvement.

    Beware associativity

    Perhaps one of the easiest wins came from spotting that the pull-one method of the lines iterator seemed to be doing two calls to the defined method. See if you can spot them:

    method pull-one() {
        $!decoder.consume-line-chars(:$!chomp) // $!handle.get // IterationEnd
    }

    The // operator calls .defined to test for definedness. But why two calls in the common case? Because of associativity! Two added parentheses:

    method pull-one() {
        $!decoder.consume-line-chars(:$!chomp) // ($!handle.get // IterationEnd)
    }

    Were worth a whopping 8%. At 2.09s, the 2 second mark was in sight.

    Good idea, but…

    My next idea for an improvement was a long-planned change to the way that simple for loops are compiled. With for being defined in terms of map, this is also how it had been implemented. However, for simple cases, we can just compile:

    for some-iterable { blah }

    Not into:

    some-iterable.map({ blah }).sink-all;

    But instead into something more like:

    my \i = some-iterable.iterator;
    while (my \v = i.pull-one) !== IterationEnd {
        blah(v);
    }

    Why is this an advantage? Because – at least in theory – now the pull-one and loop body should become possible to inline. This is not the case if we call map, since that is used with dozens of different closures and iterator types. Unfortunately, however, due to limitations in MoarVM’s specializer, it was not actually possible to achieve this inlining even after the change. In short, because we don’t handle inlining of closure-y things, and the way the on-stack replacement works means the optimizer is devoid of type information to have a chance of doing better with pull-one. Both of these are now being investigated, but were too big to take on as part of this work.

    Even without those larger wins being possible (when they are, we’ll achieve a tasty near-100% inlining rate in this benchmark), it brought the time down to the 2.00s mark. Here’s the patch.

    Optimizing line separation and decoding

    Profiling at the C level (using callgrind) showed up some notable hot spots in the string handling code inside of MoarVM, which seemed to offer the chance to get further wins. At this point, I also started taking measurements of CPU instructions using callgrind too, which makes it easier to see the effects of changes that may come out as noise on a simple time measurement (even with taking a number of them and averaging).

    Finding the separator happens in a couple of steps. First, individual encodings are set up to decode to the point that they see the final character of any of the line separators (noting these are configurable, and multi-char separators are allowed). Then, a second check is done to check if the multi-char separator was found. This is complicated by needing to handle the case where a separator was not found, and another read needs to be done from a file handle.

    It turns out that this second pass was re-scanning the entire buffer of chars, rather than just looking close to the end of it. After checking there should not be a case where just jumping to look at the end would ever be a problem, I did the optimization and got a reduction from 18,245,144,315 instructions to 16,226,602,756, or 11%.
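The idea behind that fix can be sketched in Python (the function name and data shapes are illustrative, not MoarVM's actual code): only the last few characters of the decoded buffer can possibly complete a multi-char separator, so only the tail needs checking.

```python
def ends_with_separator(chars, separators):
    # Only the last max-separator-length characters can complete a
    # multi-char separator, so check just the tail instead of
    # re-scanning the entire decoded buffer (the pre-fix behaviour).
    max_len = max(len(s) for s in separators)
    tail = chars[-max_len:]
    return any(tail.endswith(sep) for sep in separators)

buf = "x" * 1_000_000 + "\r\n"
assert ends_with_separator(buf, ["\n", "\r\n"])
assert not ends_with_separator("no separator here", ["\n", "\r\n"])
```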

    A further minor hot-spot was re-resolving the CRLF grapheme each time it was needed. It turned out caching that value saved around 200 million instructions. Caching the maximum separator length saved another 78 million instructions. The wallclock time now stood at 1.79s.

    The identification of separators when decoding chars seemed the next place to find some savings. CPUs don’t much like having to do loops and dereferences on hot paths. To do better, I made a compact array of the final separator graphemes that could be quickly scanned through, and also introduced a maximum separator codepoint filter, which given the common case is control characters works out really quite well. These were worth 420 million and 845 million instructions respectively.
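As a sketch of the two filters (in Python, with an invented separator set; the real code works on graphemes inside MoarVM):

```python
SEPARATORS = ["\n", "\r\n"]                       # illustrative; these are configurable
FINAL_SEPS = sorted({s[-1] for s in SEPARATORS})  # compact array of final separator chars
MAX_SEP_CP = max(ord(c) for c in FINAL_SEPS)      # maximum separator codepoint

def is_separator_final_char(ch):
    # Common case: separators are control characters, so almost every
    # character fails this one comparison and we never touch the array.
    if ord(ch) > MAX_SEP_CP:
        return False
    return ch in FINAL_SEPS                       # short scan of a tiny array

assert is_separator_final_char("\n")
assert not is_separator_final_char("a")
```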

    Next, I turned to the UTF-8 decoding and NFG process. A modest 56 million instruction win came from tweaking this logic given we can never be looking for a separator and have a target number of characters to decode. But a vast win came from adding a normalization fast path for the common case where we don’t have any normalization work to do. In the case we do encounter such work, we simply fall into the slow path. One nice property of the way I implemented this is that, when reading line by line, one line may cause a drop into the slow path, but the next line will start back in the fast path. This change was worth a whopping 3,200 million decrease in the instruction count. Wallclock time now stood at 1.37s.
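The shape of that fast path can be sketched in Python (the threshold and the slow-path behaviour here are stand-ins, not MoarVM's real normalization logic): stay on the cheap path while there is nothing to normalize, drop to the slow path for a single character when needed, then resume.

```python
def slow_path(cps, i, out):
    # Stand-in for real normalization work: attach a combining mark
    # (U+0300..U+036F) to the previous character as a synthetic grapheme.
    if out and 0x300 <= cps[i] <= 0x36F and not isinstance(out[-1], tuple):
        out[-1] = (out[-1], cps[i])
    else:
        out.append(cps[i])
    return i + 1

def decode_nfg(codepoints):
    out, i = [], 0
    while i < len(codepoints):
        if codepoints[i] < 0x300:                 # assumed: no normalization needed
            out.append(codepoints[i])             # fast path
            i += 1
        else:                                     # drop to the slow path here,
            i = slow_path(codepoints, i, out)     # then resume the fast path
    return out

assert decode_nfg([104, 105]) == [104, 105]              # all fast path
assert decode_nfg([97, 0x308, 98]) == [(97, 0x308), 98]  # one slow-path drop
```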

    Better memory re-use

    Another look at the profile now showed malloc/free as significant costs. Could anything be done to reduce the number of those we did?

    Yes, it turned out. Firstly, keeping around a decoding result data structure instead of freeing and allocating it every single line saved a handy 450 million instructions. It turned out that we were also copying the decoded chars into a new buffer when taking a line, but in the common case that buffer would contain precisely the chars that make up the line. Therefore, this buffer could simply be stolen to use as the memory for the string. Another 400 million instructions worth dropped away by a call less to malloc/free per line.
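The buffer-stealing idea, sketched in Python (class and function names are made up for illustration):

```python
class DecodeResult:
    # reused across lines rather than freed and reallocated each time
    def __init__(self):
        self.chars = None

def take_line_copying(res):
    # old behaviour: allocate a fresh buffer and copy the chars over
    return list(res.chars)

def take_line_stealing(res):
    # Common case: the decode buffer holds exactly the line's chars,
    # so hand over the buffer itself instead of copying it.
    line, res.chars = res.chars, None
    return line

res = DecodeResult()
res.chars = ["h", "i"]
buf = res.chars
assert take_line_stealing(res) is buf   # same storage: no allocation, no copy
```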


    A few futher micro-optimizations in the new UTF-8 decoding fast-path were possible. By lifting some tests out of the main loop, reading a value into a local because the compiler couldn’t figure out it was invariant, and moving some position updates so they only happen on loop exit, a further 470 million instructions were removed. If you’re thinking that sounds like a lot, this is a loop that runs every single codepoint we decode. A million line file with 60 chars per line plus a separator is 61 million iterations. These changes between them only save 7 cycles per codepoint; that just turns out to be a lot when multiplied by the number of codepoints!
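In Python, for shape only (the real loop is C inside MoarVM's UTF-8 decoder, and the names here are invented), the three micro-optimizations look like this:

```python
class Cfg:
    def __init__(self, stopper):
        self.stopper = stopper

def decode(buf, cfg):
    # Invariant tests lifted out of the per-codepoint loop, the stopper
    # read into a local once, and the position written back only on exit.
    want_stop = cfg.stopper is not None   # hoisted invariant test
    stopper = cfg.stopper                 # local copy: no re-read per iteration
    out = []
    i, n = 0, len(buf)
    while i < n:
        b = buf[i]
        i += 1
        if want_stop and b == stopper:
            break
        out.append(b)
    cfg.pos = i                           # position update on loop exit only
    return out

cfg = Cfg(stopper=10)
assert decode([104, 105, 10, 120], cfg) == [104, 105]
assert cfg.pos == 3
```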

    The final result

    With these improvements, the Perl 6 version of the benchmark now ran in 1.25s, which is just 44% of the time it used to run in. The Perl 5 version still wins, but by a factor of 1.1 times, not 2.5 times. While an amount of the changes performed during this work were specific to the benchmark in question, many were much more general. For example, the separator finding improvements will help with this benchmark in all encodings, and the code generation and specializer improvements will have far more cross-cutting effects.

    Actually, not so final…

    There’s still a decent amount of room for improvement yet. Once MoarVM’s specializer can perform the two inlinings it is not currently able to, we can expect a further improvement. That work is coming up soon. And beyond that, there will be more ways to shave off some instructions here and there. Another less pleasing result is that asking Perl 5 not to do UTF-8 decoding represents a huge saving for it; ask Perl 6 for ASCII or Latin-1, however, and it’s just a small saving. This would be a good target for some future optimization work. In the meantime, these are a nice bunch of speedups to have.

    Zoffix Znet: Perl 6: Seqs, Drugs, And Rock'n'Roll (Part 2)

    Published on 2017-06-27T00:00:00

    How to make your very own Iterator object

    Weekly changes in and around Perl 6: 2017.26 Half Way There

    Published by liztormato on 2017-06-26T22:36:18

    Feels like everybody is either preparing for a conference, at a conference, or recovering from one. A quiet week, with record temperatures at various Perl 6 core developers’ locations, which did not help productivity.

    Core Developments

    Blog Posts

    The Perl Conference – US

    Last week saw The Perl Conference US (formerly known as YAPC::NA). The videos of the presentations are available on YouTube. The following videos are Perl 6 related, or ones I find in need of more exposure:

    Meanwhile on Twitter

    Meanwhile on StackOverflow

    Meanwhile on perl6-users

    Ecosystem Additions

    Winding Down

    Well, compared to the weeks before, not as much happened. But under the hood, things are brewing. And it’s not necessarily beer. So check in again next week for more freshly brewed Perl 6 news!

    Living on the (b)leading edge

    Published by Moritz Lenz on 2017-06-24T22:00:01

    Perl 6 is innovative in many ways, and sometimes we don't fully appreciate all the implications, for good or for bad.

    There's one I stumbled upon recently: The use of fancy Unicode symbols for built-in stuff. In this case: the `.gist` output of Match objects. For example

    my token word { \w+ }
    say 'abc=def' ~~ /<word> '=' <word>/;
    produces this output:

    「abc=def」
     word => 「abc」
     word => 「def」

    And that's where the problems start. In my current quest to write a book on Perl 6 regexes, I noticed that the PDF that LeanPub generates from my Markdown sources doesn't correctly display those pesky 「」 characters, which are

    $ uni -c 「」
    「 - U+0FF62 - HALFWIDTH LEFT CORNER BRACKET
    」 - U+0FF63 - HALFWIDTH RIGHT CORNER BRACKET

    When I copied the text from the PDF and pasted it into my editor, they showed up correctly, which indicates that the characters are likely missing from the monospace font.

    The toolchain allows control over the font used for displaying code, so I tried all the monospace fonts that were available. I tried them in alphabetical order. Among the earlier fonts I tried was DejaVu Sans Mono, which I use in my terminal, and which hasn't let me down yet. No dice. I arrived at Noto, a font designed to cover all Unicode codepoints. And it didn't work either. So it turns out these two characters are part of some Noto Sans variants, but not of the monospace font.

    My terminal, and even some font viewers, use some kind of fallback where they use glyphs from other fonts to render missing characters. The book generation toolchain does not.

    The Google Group for Leanpub was somewhat helpful: if I could recommend an Open Source monospace font that fit my needs, they'd likely include it in their toolchain.

    So I searched and searched, learning more about fonts than I wanted to know. My circle of geek friends came up with several suggestions, one of them being Iosevka, which actually contains those characters. So now I wait for others to step up, either for LeanPub to include that font, or for the Noto maintainers to create a monospace variant of those characters (and then LeanPub updating their version of the font).

    And all of that because Perl 6 was being innovative, and used two otherwise little-used characters as delimiters, in an attempt to avoid collisions between delimiters and content.

    (In the meantime I've replaced the two offending characters with ones that look similar. It means the example output is technically incorrect, but at least it's readable.)

    Zoffix Znet: Perl 6: Seqs, Drugs, And Rock'n'Roll

    Published on 2017-06-20T00:00:00

    Seq type and its caching mechanism

    Weekly changes in and around Perl 6: 2017.25 [*] @perl-6-books

    Published by liztormato on 2017-06-19T21:25:51

    Yes, it looks like the Perl 6 books are multiplying! Almost a month ago, Gábor Szabó announced his crowdfunding campaign for “Web Application Development in Perl 6”. In the past week we also saw J.J. Merelo‘s book “Learning to program with Perl 6” appear on Amazon in a Kindle edition. And we saw Moritz Lenz publish the first chapters of his new “Searching and Parsing with Perl 6 Regexes” book. It’s great to see this many books arriving!

    2017.06 Compiler Release

    Zoffix Znet released Rakudo Compiler 2017.06 with his trusty bots and a full ecosystem toast. Claudio Ramirez was hot on his tail with the release of packages for several Unix systems. There is no Rakudo Star release planned for this month: next month should see one!

    for ^1000 optimization is back

    The optimization of for loops that run a set number of times has been reinstated by Timo Paulssen and then further refined by Jonathan Worthington (graph). So there will now be more situations where the overhead of running such a loop is greatly reduced.

    Proc overhauled

    The internals of Proc have been completely overhauled by Jonathan Worthington, and Proc is now also fully supported on the JVM backend.

    Optimizer and JIT improvements

    Jonathan Worthington also spent a lot of time on several static optimizer and spesh improvements, as well as adding more possibilities for code to get JITted. It has caused the canary in the goldmine benchmark to almost dip below 4 seconds, which means it got about 1.5x faster in the past 6 months!

    Grant Extension Proposal

    If you like the work that Jonathan did the past week, you should probably leave a comment at his proposal for extension of his Perl 6 Core Development Grant!

    Other Core Developments

    All of these developments made it to the 2017.06 compiler release, except where noted.

    Blog Posts

    Meanwhile on Twitter

    Not a lot going on that wasn’t already covered in this issue:

    Meanwhile on StackOverflow

    Meanwhile on perl6-users

    Ecosystem Additions

    Winding Down

    From a sweltering place in the south of the Netherlands, it’s yours truly wishing you all a good week. Please check in again next week for more Perl 6 news!

    <plug>Oh, and if you want to attend The Perl Conference in Amsterdam, you can now order tickets at the price-level you want / need!</plug>

    Perl 6 Maven: Bailador Plans

    Published by szabgab

    Perl 6 Maven: Arrays with unique values in Perl 6

    Published by szabgab

    Zoffix Znet: Perl 6 Release Quality Assurance: Full Ecosystem Toaster

    Published on 2017-06-14T00:00:00

    How devs are ensuring quality of Rakudo compiler releases

    6guts: Sorting out synchronous I/O

    Published by jnthnwrthngtn on 2017-06-07T23:00:26

    A few weeks back, I put out a call for funding. So far, two generous individuals have stepped up to help, enabling me to spend much more time on Perl 6 than would otherwise have been possible.

    First in line was Nick Logan (ugexe), who is funding 60 hours to get a longstanding I/O bug resolved. The problem, in short, has been that a synchronous socket (that is, an IO::Socket::INET instance) accepted or connected on one thread could not be read from or written to by another thread. The same has been true of synchronous file handles and processes.

    The typical answer for the socket case has been to use IO::Socket::Async instead. While that’s the best answer from a scalability perspective, many people new to Perl 6 will first reach for the more familiar synchronous socket API and, having heard about Perl 6’s concurrency support, will pass those off to a thread, probably using a start block. Having that fail to work is a bad early impression.

    For processes, the situation has been similar; a Proc was bound to the thread it was created on, and the solution was to use Proc::Async. For situations where dealing with more than one of the input or output streams is desired, I’d argue that it’s actually easier to deal with it correctly using Proc::Async. However, Proc has also ended up with a few features that Proc::Async has not so far offered.

    Finally, there are “file handles” – that is, instances of IO::Handle. Here, we would typically pass handles of ordinary files around between threads and, due to an implementation detail, get away with it. The situation of an IO::Handle backed by a TTY or pipe was much less pleasing, however, which was especially unfortunate because it afflicted all of $*IN, $*OUT, and $*ERR. The upshot was that you could only read from $*IN from the main thread. While writing to $*OUT and $*ERR worked, it was skating on thin ice (and occasionally falling through it).

    How did we get into this situation?

    To understand the issue, we need to take a look at the history of MoarVM I/O. In the first year of its life, MoarVM was designed and built on a pretty extreme budget – that is to say, there wasn’t one. Building platform abstractions was certainly not a good use of limited time, so a library was brought in to handle this. Initially, MoarVM used the Apache Portable Runtime (APR), which served us well.

    As the concurrent and parallel language features of Perl 6 came into focus, together with asynchronous I/O, it became clear that libuv – the library that provides I/O and platform abstractions for Node.js – was a good option for supporting this. Since the APR and libuv had substantially overlapping feature sets, we decided to move over to using libuv. In the months that followed, a bunch of the asynchronous features found in Perl 6 today quickly fell into place: Proc::Async, IO::Socket::Async, IO::Notification, asynchronous timers, and signal handlers. All seemed to be going well.

    Alas, we’d made a problem for ourselves. libuv is centered around event loops. Working with an event loop is conceptually simple: we provide callbacks to be run when certain things happen (like data arriving over a socket, or a timer elapsing), and enter the event loop. The loop waits for some kind of event to happen, and then invokes the correct callback. Those callbacks may lead to other callbacks being set up (for example, the callback reacting to an incoming socket connection can set up a callback to be called whenever data arrives over that socket). This is a fine low-level programming model – and a good bit nicer than dealing with things like poll sets. However, a libuv event loop runs on a single thread. And handles are tied to the libuv event loop that they were created on.

    For IO::Socket::Async and Proc::Async, this is not a problem. MoarVM internally runs a single event loop for all asynchronous I/O, timers, signals, and so forth. Whenever something happens, it pushes a callback into the queue of a scheduler (most typically that provided by ThreadPoolScheduler), where the worker threads lie in wait to handle the work. Since this event loop serves as a pure dispatcher, not running any user code or even such things as Unicode decoding, it’s not really limiting to have it only on a single thread.
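The dispatcher pattern described here can be sketched with a queue and a worker thread (Python stand-ins for MoarVM's event loop and a ThreadPoolScheduler worker; names and payloads are invented):

```python
import queue
import threading

jobs = queue.Queue()      # the scheduler's work queue
results = []

def worker():
    # Worker thread: this is where user-level callbacks actually run.
    while (cb := jobs.get()) is not None:
        cb()

def event_loop(events):
    # Dispatcher thread: reacts to I/O events by queueing callbacks;
    # it never runs user code (or even Unicode decoding) itself.
    for payload in events:
        jobs.put(lambda p=payload: results.append(p.upper()))
    jobs.put(None)        # shut the worker down, for this sketch only

t = threading.Thread(target=worker)
t.start()
event_loop(["data1", "data2"])
t.join()
assert results == ["DATA1", "DATA2"]
```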

    When the synchronous I/O was ported from the APR, none of this was in place, however. Therefore, each thread got its own libuv event loop for handling the synchronous I/O. At the time, of course, there wasn’t really much in the way of threading support available in Rakudo on MoarVM, so the fact that an event loop was tied to a thread was not problematic. It only came to be an issue as the concurrency support in Perl 6 became mature enough that people started using it…and then running into the limitation.

    Seizing an opportunity

    Having to deal with this was a chance to spend some quality time improving the lower level bits of I/O in the MoarVM/NQP/Rakudo stack, which have had very little love over the last couple of years. Recently, Zoffix did a bunch of great work on the higher level parts of I/O. Most helpfully for my endeavor, he also greatly improved the test coverage of I/O, meaning I could refactor the lower level bits with increased confidence.

    A while back, I took the step of fully decoupling MoarVM’s asynchronous I/O from character encoding/decoding. For some months now, MoarVM has only provided support for asynchronous byte-level I/O. It also introduced a streaming decoder, which can turn bytes into characters for a bunch of common encodings (ASCII, UTF-8, and friends). This means that while the decoding hot paths are provided by the VM, the coordination is moved up to the Perl 6 level.

    With synchronous I/O, the two were still coupled, with the runtime directly offering both character level and byte level I/O. While this is in some ways convenient, it is also limiting. It blocks us from supporting user-provided encodings – at least, not in such a way that they can just be plugged in and used with a normal IO::Handle. That aside, there are also situations where one might like to access the services provided by a streaming decoder when not dealing with an I/O handle. (A streaming decoder is one you can feed bytes to incrementally and pull characters out, trusting that it will do the right thing with regard to multi-byte and multi-codepoint sequences.)
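Python's incremental codecs happen to provide exactly this kind of streaming decoder, which makes for a compact illustration (the class and method names here are invented, not Rakudo's API):

```python
import codecs

class StreamingDecoder:
    # Feed bytes in incrementally, pull characters out; an incomplete
    # multi-byte sequence is held back until its remaining bytes arrive.
    def __init__(self, encoding="utf-8"):
        self._dec = codecs.getincrementaldecoder(encoding)()

    def add_bytes(self, data):
        return self._dec.decode(data)

d = StreamingDecoder()
snowman = "\u2603".encode("utf-8")           # U+2603 SNOWMAN: 3 bytes in UTF-8
assert d.add_bytes(snowman[:2]) == ""        # sequence incomplete: no chars yet
assert d.add_bytes(snowman[2:]) == "\u2603"  # sequence completed: char comes out
```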

    Whatever changes were going to have to happen to fix the thread limitations of synchronous I/O, it was quite clear that only having to deal with binary I/O there would make it easier. Therefore, a plan was hatched:

    1. Re-work the synchronous I/O handles to use only binary I/O, and coordinate with the VM-backed decoder to handle char I/O.
    2. Rip out the synchronous char I/O.
    3. Re-implement the remaining synchronous binary I/O, so as not to be vulnerable to the threading limitations.

    Making it possible to support user-defined encodings would be a future step, but the work done in these refactors would make it a small step – indeed, one that will only require work at a Perl 6 level, not any further down the stack. While it may well end up being me that does this anyway, it’s at least now in reach for a bunch more members of the Perl 6 development team.

    Sockets first

    I decided to start out with sockets. From a technical perspective, this made sense because they were the most isolated; break IO::Handle and suddenly such things as say are busted, and both it and Proc are used in the pre-compilation management code too. Conveniently, sockets were also Nick’s primary interest, so it was nice to get the most important goal of the work delivered first.

    The streaming decode API, while fully implemented on MoarVM, had only been partially implemented on the JVM. Therefore, to complete the work on sockets without busting them on the JVM, I had to implement the missing pieces of the VM-backed streaming decode API. This meant dealing with NIO (“New IO”), the less about which is said the better. I’m pretty sure the buffer API wasn’t designed to trip me up at every turn, but it seems to reliably manage to do so. Since file handles on the JVM would soon also come to depend on this code, it was nice to be able to get it at least straight enough to be passing all of the sockets tests, plus another set of lower-level tests that live in the NQP repository.

    Refactoring the socket code itself gave a good opportunity for cleanups. The IO::Socket::INET class does the IO::Socket role, the idea being that at some point other implementations that provide things like domain sockets will also do that role, and just add the domain socket specific parts. In reviewing what methods were where, it became clear that some things that really belonged in the IO::Socket role were not, so I moved them there as part of the work. I also managed to eliminate all of the JVM-specific workarounds in the socket code along the way, which was also a happy outcome.

    With that refactored, I could rip out the character I/O support from sockets in MoarVM. This left me with a relatively small body of code doing binary socket I/O using libuv, implementing synchronous socket I/O atop of its asynchronous event loop. When it comes to sockets, there are two APIs to worry about: the Berkeley/BSD/POSIX one, and Winsock. Happily, in my case, there was a lot of overlap and just a handful of places that I had to deal with divergence. The extra lines spent coping with the difference were more than paid back by not faking synchronous I/O in terms of an asynchronous event loop.

    File handles next

    Buoyed by this success, it was time to dig into file handles. The internals of IO::Handle were poked into from outside of the class: a little in IO::Path and more so in Proc, which was actually setting up IO::Pipe, a subclass of IO::Handle. Thankfully, with a little refactoring, the encapsulation breakage could be resolved. Then it was a case of refactoring to use the streaming decode API rather than the character I/O. This went relatively smoothly, though it also uncovered a previously hidden bug in the JVM implementation of the streaming decoder, which I got fixed up.

    So, now to rip the character file I/O out of MoarVM and refactor the libuv away in synchronous file handles too? Alas, not so fast. NQP was also using these operations. Worse, it didn’t even have an IO handle object! Everything was done in terms of nqp::ops instead. So, I introduced an NQP IO handle class, gave it many of the methods that its Perl 6 big sister has, and refactored stuff to use it.

    With that blocker out of the way, I could move on to sorting things out down in MoarVM. Once again, synchronous file I/O looks similar enough everywhere to not need all that much in the way of an abstraction layer. On Windows, it turned out to be important to put all of the handles into binary mode, however, since we do our own \n <-> \r\n mapping. (Also, yes, it is very much true that it’s only very similar on Windows if you use their POSIX API emulation. It’s possible there may be a performance win from not using that, but I can’t imagine it’ll be all that much.)
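Why binary mode matters: MoarVM does its own \n to \r\n mapping, and if the C runtime's text mode translated the output again it would be mangled. A tiny Python illustration of the double-mapping hazard:

```python
def crt_text_mode(data):
    # what a text-mode stream on Windows does to output on write
    return data.replace(b"\n", b"\r\n")

our_output = "hi\n".replace("\n", "\r\n").encode()  # MoarVM's own mapping
assert our_output == b"hi\r\n"
assert crt_text_mode(our_output) == b"hi\r\r\n"     # double-mapped: broken
```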

    Not entirely standard

    So, all done? Well, not quite. For the standard handles, we only used the synchronous file code path when the handle was actually a regular file. This is not the common case; it’s often a pipe or a TTY. These used a completely different code path in libuv, using its streams API instead of the files API.

    Thankfully, there isn’t much reason to retain this vast implementation difference. With a little work on EOF handling, and re-instating the faking of tell, it was possible to use the same code I had written to handle regular files. This worked out very easily on Linux. On Windows, however, a read operation would blow up if reading from the console.

    Amazingly, the error string was “out of space”, which was a real head-scratcher given it was coming from a read operation! It turned out that the error string was a tad misleading; the error code is #defined as ENOMEM, so a better error string would have been “out of memory”. That still made little sense. So I went digging into the native console APIs on Windows, and discovered that reads from the console are allocated out of a 64KB buffer, which is also used for various other things. 64KB should be enough for anybody, I guess. Capping read requests to a console on Windows to 16KB was enough to alleviate this.
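The workaround is simply a clamp on the request size (16KB is the figure from the text; the constant and function names are made up):

```python
CONSOLE_READ_CAP = 16 * 1024   # stay well clear of the shared 64KB buffer

def clamp_console_read(requested):
    # Reads from a Windows console are served out of a ~64KB buffer that
    # is shared with other things, so large requests fail with ENOMEM;
    # capping the request avoids that while still reading plenty at once.
    return min(requested, CONSOLE_READ_CAP)

assert clamp_console_read(64 * 1024) == 16 * 1024
assert clamp_console_read(512) == 512
```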

    Job done!

    At least, for IO::Socket::INET and IO::Handle, which now are not tied to a thread on HEAD of MoarVM/NQP/Rakudo. The work on processes is ongoing, although due to be completed in the coming week. So, I’ll talk about that in my next post here.

    Perl 6 Books Landscape in June 2017

    Published by Moritz Lenz on 2017-06-07T22:00:01

    There are lots of news around Perl 6 books to share these days. If you follow the community very closely, you might be aware of most of it. If not, read on :-).

    Think Perl 6 is now available for purchase, and also for download as a free ebook. Heck, it's even Open Source, with the LaTeX sources on GitHub!

    Perl 6 at a Glance, previously only available in print form, is now available as an ebook. Save paper and shipping costs!

    My own book, Perl 6 Fundamentals, is now in the "production" phase: copyediting, indexing, layout. And just before the manuscript submission deadline, Larry Wall has contributed a foreword. How awesome is that?

    I've revamped to provide a short overview of the current and future Perl 6 books. As a small gimmick, it contains a flow chart explaining which book to choose. And I even got input from two other Perl 6 book authors (Laurent Rosenfeld of "Think Perl 6", and Andrew Shitov of "Perl 6 at a Glance" and "Migrating to Perl 6").

    From a recent pull request, it looks like Andrew Shitov is working on two more Perl 6 books. Keep 'em coming!

    Last but not least, Gabor Szabo has started a crowdfunding campaign for a Perl 6 book on web app development. There are still a few days left, so you can help it succeed!

    And as always, if you want to keep informed about Perl 6 books, you can sign up for my Perl 6 books mailing list (low volume, typically less than one email per month).

    samcv: Grant Status Update 1

    Published on 2017-06-02T07:00:00

    This is my first grant progress report for my Perl Foundation grant entitled "Improving the Robustness of Unicode Support in Rakudo on MoarVM".

    I was not able to work quite as many hours as I would have liked this month, but I still made quite a lot of progress.

    Improvement for Tests

    Merged In

    In Roast there is a new version of GraphemeBreakTest.t.

    The script tests the contents of each grapheme individually, using the GraphemeBreakTest.txt file from the Unicode 9.0 test suite.

    Previously we only checked the total number of .chars for the string as a whole. Obviously we want something more precise than that, since the test data specifies the location of each of the breaks between codepoints. The new code checks that codepoints are put in the correct graphemes in the proper order. In addition, we still check the string length.

    This new test uses a grammar to parse the file and generally is much more robust than the previous script.

    Running the parse class generates an array of arrays. The index of the outer array indicates the grapheme, while the inner arrays indicate which codepoints should be in that grapheme.

    [[10084, 776], [9757]]

    The array above would indicate that the 1st grapheme is made up of codepoints 10084 and 776, while the 2nd grapheme is made up of codepoint 9757. This allows us to easily test the contents of each grapheme.

    The array shown above corresponds to the following line from the Unicode data file:

    ÷ 2764 × 0308 ÷ 261D ÷ where × means no-break and ÷ means break
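The mapping from a test-file line to the array of arrays is mechanical; here is a Python sketch of it (the real test uses a Perl 6 grammar, as described above):

```python
def parse_break_line(line):
    # Turn "÷ 2764 × 0308 ÷ 261D ÷" into [[0x2764, 0x0308], [0x261D]]:
    # ÷ closes the current grapheme, × keeps accumulating into it.
    graphemes, current = [], []
    for tok in line.split():
        if tok == "×":              # no-break: stay inside this grapheme
            continue
        if tok == "÷":              # break: close off the current grapheme
            if current:
                graphemes.append(current)
                current = []
        else:
            current.append(int(tok, 16))
    return graphemes

assert parse_break_line("÷ 2764 × 0308 ÷ 261D ÷") == [[10084, 776], [9757]]
```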

    Work in Progress

    I have some tests which are currently unmerged and must wait to be merged, although sections of the code are complete and are being reused and incorporated into the larger Unicode Database Retrofit.

    I have written grammars and modules to process and provide data on the PropertyValueAliases and PropertyAliases. They will be used for testing that all of the canonical property names and all the property values themselves properly resolve to separate property codes, as well as that they are usable in regex.

    Work on the Unicode Database Retrofit

    As part of my grant work I am working on making Unicode property values distinct per property, and also on allowing all canonical Unicode property values to work. For a background on this see my previous post about Unicode Property Names. The WIP generated code can be seen in this gist here and was generated from UCD-gen.p6. The code resolves property name and property value command line arguments and matches them with property codes and property value codes. It is also case insensitive and ignores underscores, as the Unicode spec says is permissible. In addition, the data is deduplicated, meaning we only store one hash per unique set of property values.

    For example: Script and Script_Extensions both have the same values, so we don't store these more than once; likewise for the Boolean property values. The C program resolves the property string to a unique property code, and from there is able to look up the property value code. Note: aside from the property values which specify the lack of a property, these codes are internal and have no relation to the Unicode spec, for example Grapheme_Cluster_Break=Other is designated as property value 0.


    I've also started adding some documentation to my Unicode-Grant wiki with information about what is enclosed in each Unicode data file; there are a few other pages as well. This wiki is planned to be expanded to have many more sections than it does currently.

    Future Work

    Next I must integrate the property name/value alias resolving code with UCD-gen.p6. UCD-gen.p6 already has a mostly functional Unicode database with a fair number of properties. When these two are integrated, the next step will be to start integrating it with the MoarVM codebase, making any changes to MoarVM or the database retrofit codebase as needed along the way.

    I will also be exploring ways of compressing the mapping of codepoints to unique combinations of Unicode property data in the bitfield. Due to the vast number of codepoints within Unicode, currently the mapping of codepoints to rows in the bitfield takes up many times more space than the actual property value data itself.

    For compressing the Unicode names, it is planned to use base 40 encoding with some additional tricks to save additional space for repeated words. I plan on making a blog post where I go into the details of the compression scheme.
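The core trick of such a scheme: Unicode names draw on a small alphabet (letters, digits, space, hyphen), so three symbols fit in one 16-bit value because 40³ = 64000 < 65536. A hedged Python sketch, assuming a simple alphabet; the real encoding's ordering and word-repetition tricks will differ:

```python
ALPHABET = " ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-"   # 38 symbols, under base 40

def encode(name):
    idx = [ALPHABET.index(c) for c in name]
    while len(idx) % 3:
        idx.append(0)                                  # pad with spaces
    return [a * 1600 + b * 40 + c                      # 3 symbols per 16-bit word
            for a, b, c in zip(idx[0::3], idx[1::3], idx[2::3])]

def decode(words):
    out = []
    for w in words:
        out += [ALPHABET[w // 1600], ALPHABET[w // 40 % 40], ALPHABET[w % 40]]
    return "".join(out).rstrip(" ")

name = "LATIN SMALL LETTER A WITH DIAERESIS"
packed = encode(name)
assert decode(packed) == name
assert all(w < 65536 for w in packed)                  # each word fits in 16 bits
```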

    I am considering also rolling the ignorecase/ignoremark bug into my grant. Even though it was not originally planned to be part of the Grant, I think it is important enough to warrant inclusion. Currently, regexes using both ignorecase and ignoremark together are completely broken.


    The work described above has been committed to the two repositories listed below (in addition to the test work described, which was merged into Roast).

    brrt to the future: Function call return values

    Published by Bart Wiegmans on 2017-05-25T19:02:00

    Hi there, it's been about a month since last I wrote on the progress of the even-moar-jit branch, so it is probably time for another update.

    Already two months ago I wrote about adding support for function calls in the expression JIT compiler. This was a major milestone as calling C functions is essential for almost everything that is not pure numerical computing. Now we can also use the return values of function calls (picture that!) The main issue with this was something I've come to call the 'garbage restore' problem, by which I mean that the register allocator would attempt to 'restore' an earlier, possibly undefined, version of a value over a value that would result from a function call.

    This has everything to do with the spill strategy used by the compiler. When a value has to be stored to memory (spilled) in order to avoid being overwritten and lost, there are a number of things that can be done. The default, safest strategy is to store a value to memory after every instruction that computes it and to load it from memory before every instruction that uses it. I'll call this a full spill. It is safe because it effectively makes the memory location the only 'true' storage location, with the registers being merely temporary caches. It can also be somewhat inefficient, especially if the code path that forces the spill is conditional and rarely taken. In MoarVM, this happens (for instance) around memory barriers, which are only necessary when creating cross-generation object references.

    That's why around function calls the JIT uses another strategy, which I will call a point spill. What I mean by that is that the (live) values which could be overwritten by the function call are spilled to memory just before the function call, and loaded back into their original registers directly after. This is mostly safe, since under normal control flow, the code beyond the function call point will be able to continue as if nothing had changed. (A variant which is not at all safe is to store the values to memory at the point, and load them from memory in all subsequent code, because it isn't guaranteed that the original spill-point-code is reached in practice, meaning that you overwrite a good value with garbage. The original register allocator for the new JIT suffered from this problem).

    It is only safe, though, if the value that is to be spilled-and-restored is both valid (defined in a code path that always precedes the spill) and required (the value is actually used in code paths that follow the restore). This is not the case, for instance, when a value is the result of a conditional function call, as in the following piece of code:

    1:  my $x = $y + $z;
    2:  if ($y < 0) {
    3:      $x = compute-it($x, $y, $z);
    4:  }
    5:  say "\$x = $x";

    In this code, the value in $x is defined first by the addition operation and then, optionally, by the function call to compute-it. The last use of $x is in the string interpolation on line 5. Thus, according to the compiler, $x holds a 'live' value at the site of the function call on line 3, and so to avoid it from being overwritten, it must be spilled to memory and restored. But in fact, loading $x from memory after compute-it would directly overwrite the new value with the old one.

    The problem here appears to be that when the JIT decides to 'save' the value of $x around the function call, it does not take into account that - in this code path - the last use of the old value of $x is in fact when it is placed on the parameter list to the compute-it call. From the perspective of the conditional branch, it is only the new value of $x which is used on line 5. Between the use on the parameter list and the assignment from the return value, the value of $x is not 'live' at all. This is called a 'live range hole'. It is then the goal to find these holes and to make sure a value is not treated as live when it is in fact not.
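    The mismatch can be sketched in a few lines of Python (a toy, straight-line version of the example above, with invented helper names): an interval-based liveness view says $x is live across the call, while reaching definitions show the old value never survives it.

```python
# Toy three-address form of the example, conditional flattened for brevity:
#   0: x = y + z
#   1: x = compute_it(x, y, z)   # the function call
#   2: say(x)
# Each instruction is a (defs, uses) pair.
prog = [({"x"}, {"y", "z"}),
        ({"x"}, {"x", "y", "z"}),
        (set(), {"x"})]

def naive_interval(prog, v):
    """Interval view: the value lives from its first definition to its last use."""
    first_def = min(i for i, (d, u) in enumerate(prog) if v in d)
    last_use = max(i for i, (d, u) in enumerate(prog) if v in u)
    return (first_def, last_use)

def reaching(prog, v, use_at):
    """Which definition of v actually reaches the use at index use_at?"""
    last = None
    for i, (defs, uses) in enumerate(prog[:use_at]):
        if v in defs:
            last = i
    return last

print(naive_interval(prog, "x"))  # (0, 2): the interval spans the call at 1,
                                  # so a point spill would save/restore x there
print(reaching(prog, "x", 2))     # 1: only the call's result reaches say(x),
                                  # so restoring the old x clobbers a good value
```

    The interval claims $x must survive the call, but the old definition never reaches the later use: that gap is exactly the live range hole.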

    I used an algorithm from a paper by Wimmer and Franz (2010) to find the holes. However, this algorithm relies on having the control flow structure of the program available, which usually requires a separate analysis step. In my case that was fortunately not necessary, since this control flow structure is in fact generated by an earlier step in the JIT compilation process, and all that was necessary was to record it. The algorithm itself is really simple and relies on the following ideas:

    I think it goes beyond the scope of this blog post to explain how it works in full, but it is really not very complicated and works very well. At any rate, it was sufficient to prevent the JIT from overwriting good values with bad ones, and allowed me to finally enable functions that return values, which is otherwise really simple.

    When that was done, I obviously tried to use it and immediately ran into some bugs. To fix that, I've improved the script, which wasn't very robust before. The script uses two environment variables, MVM_JIT_EXPR_LAST_FRAME and MVM_JIT_EXPR_LAST_BB, to automatically find the code sequence where the expression compiler fails and compiles wrong code. (These variables tell the JIT compiler to stop running the expression compiler after a certain number of frames and basic blocks. If we know that the program fails with N blocks compiled, we can use binary search between 0 and N to find out which frame is broken). The script then provides disassembled bytecode dumps that can be compared and with that, it is usually relatively easy to find out where the JIT compiler bug is.
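    The bisection described here is an ordinary binary search; in the sketch below, `runs_ok` is a hypothetical oracle standing in for running the program with MVM_JIT_EXPR_LAST_BB set to a given count.

```python
def bisect_failing_block(runs_ok, hi):
    """Find the first miscompiled basic block: the largest N for which
    JIT-compiling N blocks still succeeds, plus one."""
    lo = 0
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if runs_ok(mid):        # e.g. run with MVM_JIT_EXPR_LAST_BB=mid
            lo = mid
        else:
            hi = mid - 1
    return lo + 1

# If everything up to block 41 compiles correctly and block 42 is broken:
print(bisect_failing_block(lambda n: n <= 41, hi=100))  # 42
```

    With the failing block isolated, comparing the disassembled bytecode dumps for N and N+1 narrows the bug down quickly.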

    With that in hand I've spent my time mostly fixing existing bugs in the JIT compiler. I am now at a stage in which I feel like most of the core functionality is in place, and what is left is about creating extension points and fixing bugs. More on that, however, in my next post. See you then!

    Death by Perl6: Perl Toolchain Summit 2017 - CPAN and Perl6

    Published by Nick Logan on 2017-05-25T05:41:53

    At the 2017 Perl Toolchain Summit (PTS) a lot of stuff got done. This is a brief demonstration style summary of the resulting CPAN-related feature enhancements to zef.

    First I should mention that now Perl6 distributions can be uploaded to CPAN (without needing to add a special Perl6/ folder), and will have their source-url automatically set or replaced with the appropriate CPAN url. Additionally App::Mi6 now has mi6 dist and mi6 upload to make the process even simpler.

    Now let's get started by making sure we are using a version with the features developed at PTS:

    $ zef install "zef:ver(v0.1.15+)"
    All candidates are currently installed  
    No reason to proceed. Use --force to continue anyway  

    Perl6 distributions uploaded to CPAN are now indexed. Currently the index is generated by and stored at alongside a mirror of the existing p6c ecosystem. It is also enabled by default now:

    $ zef list --max=10
    ===> Found via Zef::Repository::Ecosystems<cpan>
    $ zef info CompUnit::Repository::Mask
    - Info for: CompUnit::Repository::Mask
    - Identity: CompUnit::Repository::Mask:ver('0.0.1')
    - Recommended By: Zef::Repository::Ecosystems<cpan>
    Description:     hide installed modules for testing.  
    License:     Artistic-2.0  
    Provides: 1 modules  
    Depends: 0 items

    A distribution can exist in multiple "ecosystems":

    $ zef search Inline::Perl5
    ===> Found 3 results
    ID|From                             |Package                                       |Description  
    1 |Zef::Repository::LocalCache      |Inline::Perl5:ver('0.26'):auth('github:niner')|Use Perl 5 code in a Perl 6 program  
    2 |Zef::Repository::Ecosystems<cpan>|Inline::Perl5:ver('0.26'):auth('github:niner')|Use Perl 5 code in a Perl 6 program  
    3 |Zef::Repository::Ecosystems<p6c> |Inline::Perl5:ver('0.26'):auth('github:niner')|Use Perl 5 code in a Perl 6 program  

    Dependencies can be resolved by any and all of the available ecosystems, so distributions can be put on CPAN and still have their dependencies that aren't on CPAN resolved:

    $ zef -v install Inline::Perl5
    ===> Searching for: Inline::Perl5
    ===> Found: Inline::Perl5:ver('0.26'):auth('github:niner') [via Zef::Repository::Ecosystems<cpan>]
    ===> Searching for missing dependencies: LibraryMake, File::Temp
    ===> Found dependencies: File::Temp [via Zef::Repository::Ecosystems<p6c>]
    ===> Found dependencies: LibraryMake:ver('1.0.0'):auth('github:retupmoca') [via Zef::Repository::LocalCache]
    ===> Searching for missing dependencies: Shell::Command, File::Directory::Tree
    ===> Found dependencies: Shell::Command, File::Directory::Tree:auth('labster') [via Zef::Repository::Ecosystems<p6c>]
    ===> Searching for missing dependencies: File::Which, File::Find
    ===> Found dependencies: File::Find:ver('0.1'), File::Which [via Zef::Repository::Ecosystems<p6c>]
    ...<more output>...
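    The cross-ecosystem resolution in the output above can be sketched like this (the repository contents below are invented; the real zef consults its configured Zef::Repository plugins in order):

```python
# Invented repository contents; real zef consults its configured
# Zef::Repository plugins (local cache, cpan, p6c) in order.
repos = {
    "cpan":  {"Inline::Perl5": ["LibraryMake", "File::Temp"]},
    "p6c":   {"File::Temp": ["File::Directory::Tree"],
              "File::Directory::Tree": []},
    "local": {"LibraryMake": []},
}

def resolve(name, seen=None):
    """Map each needed distribution to the first ecosystem providing it."""
    seen = {} if seen is None else seen
    if name in seen:
        return seen
    for repo, dists in repos.items():
        if name in dists:
            seen[name] = repo
            for dep in dists[name]:
                resolve(dep, seen)   # dependencies may come from anywhere
            return seen
    raise LookupError(f"no ecosystem provides {name}")

plan = resolve("Inline::Perl5")
print(plan)
```

    The point is simply that each lookup is independent: a CPAN distribution's dependency list can be satisfied from p6c or the local cache.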

    In addition to CPAN we have access to CPAN testers. Garu worked with me to create a perl6 cpan testers report module: Zef::CPANReporter

    $ zef install Zef::CPANReporter
    ===> Searching for: Zef::CPANReporter
    ===> Searching for missing dependencies: Net::HTTP
    ===> Testing: Net::HTTP:ver('0.0.1'):auth('github:ugexe')
    ===> Testing [OK] for Net::HTTP:ver('0.0.1'):auth('github:ugexe')
    ===> Testing: Zef::CPANReporter:ver('0.0.1'):auth('github:garu')
    ===> Testing [OK] for Zef::CPANReporter:ver('0.0.1'):auth('github:garu')
    ===> Installing: Net::HTTP:ver('0.0.1'):auth('github:ugexe')
    ===> Installing: Zef::CPANReporter:ver('0.0.1'):auth('github:garu')
    # ...and in use:
    $ zef -v install Grammar::Debugger
    ===> Searching for: Grammar::Debugger
    ===> Found: Grammar::Debugger:ver('1.0.1'):auth('github:jnthn') [via Zef::Repository::Ecosystems<p6c>]
    ===> Searching for missing dependencies: Terminal::ANSIColor
    ===> Found dependencies: Terminal::ANSIColor:ver('0.3') [via Zef::Repository::Ecosystems<p6c>]
    ===> Fetching [OK]: Grammar::Debugger:ver('1.0.1'):auth('github:jnthn') to /Users/ugexe/.zef/tmp/grammar-debugger.git
    ===> Fetching [OK]: Terminal::ANSIColor:ver('0.3') to /Users/ugexe/.zef/tmp/Terminal-ANSIColor.git
    ===> Testing: Terminal::ANSIColor:ver('0.3')
    t/00-load.t .. ok  
    All tests successful.  
    Files=1, Tests=1,  0 wallclock secs  
    Result: PASS  
    ===> Testing [OK] for Terminal::ANSIColor:ver('0.3')
    Report for Terminal::ANSIColor:ver('0.3') will be available at  
    ===> Testing: Grammar::Debugger:ver('1.0.1'):auth('github:jnthn')
    t/debugger.t .. ok  
    t/ltm.t ....... ok  
    t/tracer.t .... ok  
    All tests successful.  
    Files=3, Tests=3,  1 wallclock secs  
    Result: PASS  
    ===> Testing [OK] for Grammar::Debugger:ver('1.0.1'):auth('github:jnthn')
    Report for Grammar::Debugger:ver('1.0.1'):auth('github:jnthn') will be available at  
    ===> Installing: Terminal::ANSIColor:ver('0.3')
    ===> Install [OK] for Terminal::ANSIColor:ver('0.3')
    ===> Installing: Grammar::Debugger:ver('1.0.1'):auth('github:jnthn')
    ===> Install [OK] for Grammar::Debugger:ver('1.0.1'):auth('github:jnthn')

    This work (and my attendance) was made possible by a bunch of great perl companies and people: ActiveState, cPanel, FastMail, MaxMind, Perl Careers, MongoDB, SureVoIP, Campus Explorer, Bytemark, CAPSiDE, Charlie Gonzalez, Elastic, OpusVL, Perl Services, Procura, XS4ALL, Oetiker+Partner.

    samcv: Unicode Property Names

    Published on 2017-05-21T07:00:00

    Currently when you do: 'word' ~~ /<:Latin>/, MoarVM looks in a hash which contains all of the property values and looks up what property name it is associated with. So in this case it looks up Latin, and then finds it is related to the Script property.

    There is a longstanding issue in MoarVM. The Unicode database of MoarVM was created with the incorrect assumption that Unicode property values were distinct. As part of my work on the Unicode Grant this is one of the issues I am tackling. So to be better informed I generated a list of all of the overlaps. I won't paste it here because it is very long, but if you want to see a full list see my post here.

    There are 68 property values which belong to multiple property names and an additional 126 that are shared between Script and Block properties.
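    A toy version of that lookup makes the problem concrete (the table below is a tiny illustrative excerpt, not the real database):

```python
# A tiny illustrative excerpt, not the real Unicode database: a hash from
# property *value* to the property *name(s)* that use it. Values are not
# distinct, which is exactly the longstanding bug.
value_to_props = {
    "Latin":  ["Script"],
    "Greek":  ["Script"],
    "Narrow": ["East_Asian_Width", "Decomposition_Type"],
    "NU":     ["Word_Break", "Line_Break", "Sentence_Break"],
}

def lookup(value):
    """Resolve a bare property value to its property name, as <:Latin> needs."""
    props = value_to_props.get(value, [])
    if len(props) == 1:
        return props[0]
    raise ValueError(f"{value!r} is ambiguous between {props}")

print(lookup("Latin"))   # Script
```

    For unambiguous values like Latin the hash works; for shared values some tie-breaking order is needed, which is what the hierarchy below is about.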

    In addition we must also make sure that we check overlap between property names and the values themselves.

    Here are all of the property names that conflict with values

    «« IDC Conflict with property name [blk]  is a boolean property
    «« VS Conflict with property name [blk]  is a boolean property
    «« White_Space Conflict with property name [bc]  is a boolean property
    «« Alphabetic Conflict with property name [lb]  is a boolean property
    «« Hyphen Conflict with property name [lb]  is a boolean property
    «« Ideographic Conflict with property name [lb]  is a boolean property
    «« Lower Conflict with property name [SB]  is a boolean property
    «« STerm Conflict with property name [SB]  is a boolean property
    «« Upper Conflict with property name [SB]  is a boolean property

    Luckily these are all Bool properties and so we don't need to worry about anything complicated there.

    A fun fact: currently the only reason ' ' ~~ /<:space>/ matches is because space resolves as Line_Break=Space, when in fact it should resolve as White_Space=True. Luckily the space character and a few others have Line_Break=Space, though this means "\n" ~~ /<:space>/ does not work properly. I will note, though, that using <:White_Space> does work properly, as it resolves to the property name.

    I would make Bool properties 0th in priority.

    Then, similar to other regex engines, we will allow you to designate General_Category and Script unqualified:

    <:Latin> <:L>                              # unqualified
    <:Script<Latin>> <:General_Category<L>>    # qualified

    I propose a hierarchy as follows

    I have not yet decided whether we want to guarantee the following, but they should be part of the internal hierarchy

    We should also resolve Numeric_Type, so that people can use <:Numeric> in their regexes (I'm sure code already exists where this is used, so we need to make sure this resolves as well).

    In actuality this resolves as Numeric_Type != None. So this is covered under rule 0.

    I am open to adding whichever properties people think most important to the ordered priority list as well. Due to how things are setup in MoarVM/NQP I will need to come up with some hierarchy to resolve all the properties. In addition to this we will have a Guaranteed list, where specs will specify that using them unqualified are guaranteed to work.
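    A minimal sketch of how such an ordered resolution could look (all tables here are tiny illustrative stand-ins, not the real Unicode data):

```python
# All tables here are tiny illustrative stand-ins, not the real Unicode data.
bool_props = {"White_Space", "Alphabetic", "Lower", "Upper"}
general_categories = {"L", "Lu", "Ll", "N", "Nd"}
scripts = {"Latin", "Greek", "Cyrillic"}

def resolve_unqualified(name):
    """Resolve an unqualified regex property in priority order:
    boolean properties first (rule 0), then General_Category, then Script."""
    if name in bool_props:
        return (name, "True")            # <:White_Space> means White_Space=True
    if name in general_categories:
        return ("General_Category", name)
    if name in scripts:
        return ("Script", name)
    raise LookupError(f"cannot resolve {name!r} unqualified")

print(resolve_unqualified("Latin"))   # ('Script', 'Latin')
print(resolve_unqualified("L"))       # ('General_Category', 'L')
```

    Under such an ordering, a value like Lower always resolves as the boolean property first, sidestepping its conflict with the Sentence_Break value of the same name.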

    The property value names with overlap remaining after the proposed list above:

    NU => ["Word_Break", "Line_Break", "Sentence_Break"],
    NA => ["Age", "Hangul_Syllable_Type", "Indic_Positional_Category"],
    E => ["Joining_Group", "Jamo_Short_Name"],
    SP => ["Line_Break", "Sentence_Break"],
    CL => ["Line_Break", "Sentence_Break"],
    D => ["Jamo_Short_Name", "Joining_Type"],
    Narrow => ["East_Asian_Width", "Decomposition_Type"],
    NL => ["Word_Break", "Line_Break"],
    Wide => ["East_Asian_Width", "Decomposition_Type"],
    Hebrew_Letter => ["Word_Break", "Line_Break"],
    U => ["Jamo_Short_Name", "Joining_Type"],
    LE => ["Word_Break", "Sentence_Break"],
    Close => ["Bidi_Paired_Bracket_Type", "Sentence_Break"],
    BB => ["Jamo_Short_Name", "Line_Break"],
    HL => ["Word_Break", "Line_Break"],
    Maybe => ["NFKC_Quick_Check", "NFC_Quick_Check"],
    FO => ["Word_Break", "Sentence_Break"],
    H => ["East_Asian_Width", "Jamo_Short_Name"],
    Ambiguous => ["East_Asian_Width", "Line_Break"],
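    An overlap map like the one above can be generated by inverting the property-name-to-values tables and keeping the values claimed by more than one name. A sketch (with tiny illustrative tables, not the real Unicode data):

```python
from collections import defaultdict

# Tiny illustrative tables, not the real Unicode data.
tables = {
    "Word_Break":     {"NU", "NL", "HL", "LE", "FO"},
    "Line_Break":     {"NU", "NL", "HL", "SP", "CL", "EX"},
    "Sentence_Break": {"NU", "SP", "CL", "LE", "FO"},
}

# Invert property-name -> values, collecting every name that claims a value.
owners = defaultdict(list)
for prop, values in tables.items():
    for value in values:
        owners[value].append(prop)

# Keep only the values claimed by more than one property name.
overlaps = {v: props for v, props in owners.items() if len(props) > 1}
print(sorted(overlaps))   # ['CL', 'FO', 'HL', 'LE', 'NL', 'NU', 'SP']
```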

    Any ideas about adding further properties to the hierarchy (even if they don't have any overlap presently [Unicode 9.0], it could be introduced later) will be appreciated. Either comment on this Github issue or send me an email (address at the bottom of the page).

    6guts: Looking for Perl 6, Rakudo, and MoarVM development funding

    Published by jnthnwrthngtn on 2017-05-12T16:17:52

    Note for regular 6guts readers: this post isn’t about Perl 6 guts themselves, but rather about seeking funding for my Perl 6 work. It is aimed at finding medium-size donations from businesses who are interested in supporting Perl 6, Rakudo, and MoarVM by funding my work on these projects.

    I started contributing to the Rakudo Perl 6 compiler back in 2008, and since then have somehow managed to end up as architect for both Rakudo and MoarVM, together with playing a key design role in Perl 6’s concurrency features. Over the years, I’ve made time for Perl 6 work by:

    I’m still greatly enjoying doing Perl 6 stuff and, while I’ve less free time these days for a variety of reasons, I still spend a decent chunk of that on Perl 6 things too. That’s enough for piecing together various modules I find we’re missing in the ecosystem, and for some core development work. However, the majority of Perl 6, Rakudo, and MoarVM issues that end up on my plate are both complex and time-consuming. For various areas of MoarVM, I’m the debugger of last resort. Making MoarVM run Perl 6 faster means working on the dynamic optimizer, which needs a good deal of care to avoid doing the wrong thing really fast. And driving forward the design and implementation of Perl 6’s concurrent and parallel features also requires careful consideration. Being funded through The Perl Foundation over the last couple of years has enabled me to spend quality time working on these kinds of issues (and plenty more besides).

    So what’s up?

    I’ve been without funding since early-mid February. Unfortunately, my need to renew my funding has come at a time when The Perl Foundation has been undergoing quite a lot of changes. I’d like to make very clear that I’m hugely supportive and thankful for all that TPF have done and are doing, both for me personally and for Perl 6 in general. Already this year, two Perl 6 grants have been made to others for important work. These were made through the normal TPF grants process. By contrast, my work has been funded through a separate Perl 6 Core Development Fund. As a separate fund, it thus needs funds raising specifically for it, and has its own operation separate from the mainstream grant process.

    Between the fund being almost depleted, and various new volunteers stepping up to new roles in TPF and needing to get up to speed on quite enough besides the Perl 6 Core Development Fund, unfortunately it’s not been possible to make progress on my funding situation in the last couple of months. I’m quite sure we can get there with time – but at the same time I’m keen to get back to having more time to spend on Perl 6 again.

    So, I’ve decided to try out an alternative model. If it works, I potentially get funded faster, and TPF’s energies are freed up to help others spend more time on Perl. If not, well, it’ll hopefully only cost me the time it took to write this blog post.

    The Offer

    I’m looking for businesses willing to help fund my Perl 6 development work. I can offer in return:

    I’m setting a rate for this work of 55 EUR / hour with a minimum order of 25 hours. This need not be billed all in one go; for example, if you happened to be a company wishing to donate 1000 EUR a month to open source and wished to be invoiced that amount each month, this is entirely possible. After all, if 3-4 companies did that, we’d have me doing Perl 6 stuff for 2 full days every week.

    If you’re interested in helping, please get in contact with me, either by email or on freenode (I’m jnthn there). Thank you!

    Announce: Rakudo Star Release 2017.04

    Published by Steve Mynott on 2017-05-01T15:35:01

    A useful and usable production distribution of Perl 6

    On behalf of the Rakudo and Perl 6 development teams, I’m pleased to announce the April 2017 release of “Rakudo Star”, a useful and usable production distribution of Perl 6. The tarball for the April 2017 release is available from

    Binaries for macOS and Windows (64 bit) are also available.

    This is the seventh post-Christmas (production) release of Rakudo Star and implements Perl v6.c. It comes with support for the MoarVM backend (all module tests pass on supported platforms).

    This release includes “zef” as module installer. “panda” is to be shortly replaced by “zef” and will be removed in the near future.

    It’s hoped to produce quarterly Rakudo Star releases during 2017 with 2017.07 (July) and 2017.10 (October) to follow.

    Please note that this release of Rakudo Star is not fully functional with the JVM backend from the Rakudo compiler. Please use the MoarVM backend only.

    In the Perl 6 world, we make a distinction between the language (“Perl 6”) and specific implementations of the language such as “Rakudo Perl”.

    This Star release includes [release 2017.04.3] of the Rakudo Perl 6 compiler, version 2017.04-53-g66c6dda of MoarVM, plus various modules, documentation, and other resources collected from the Perl 6 community.

    The Rakudo compiler changes since the last Rakudo Star release of 2017.01 are now listed in “” and “” under the “rakudo/docs/announce” directory of the source distribution.

    In particular this release featured many important improvements to the IO subsystem thanks to Zoffix and the support of the Perl Foundation.

    Please see
    Part 1:
    Part 2:
    Part 3:

    Note there were point releases of 2017.04 so also see “”, “” and “”.

    Notable changes in modules shipped with Rakudo Star:

    + DBIish: New version with pg-consume-input
    + doc: Too many to list. Large number of “IO Grant” doc changes.
    + json_fast: Too many to list. Big performance improvements.
    + perl6-lwp-simple: Fix for lexical require and incorrect regex for absolute URL matcher
    + test-mock: Enable concurrent use of mock objects
    + uri: Encoding fixes
    + zef: Too many to list. IO fixage.

    There are some key features of Perl 6 that Rakudo Star does not yet handle appropriately, although they will appear in upcoming releases. Some of the not-quite-there features include:

    + advanced macros
    + non-blocking I/O (in progress)
    + some bits of Synopsis 9 and 11
    There is an online resource at that lists the known implemented and missing features of Rakudo’s backends and other Perl 6 implementations.

    In many places we’ve tried to make Rakudo smart enough to inform the programmer that a given feature isn’t implemented, but there are many that we’ve missed. Bug reports about missing and broken features are welcomed at

    See for links to much more information about Perl 6, including documentation, example code, tutorials, presentations, reference materials, design documents, and other supporting resources. Some Perl 6 tutorials are available under the “docs” directory in the release tarball.

    The development team thanks all of the contributors and sponsors for making Rakudo Star possible. If you would like to contribute, see, ask on the mailing list, or join us on IRC #perl6 on freenode.

    brrt to the future: Letting templates do what you mean

    Published by Bart Wiegmans on 2017-04-30T22:12:00

    Hi everybody, today I'd like to promote a minor, but important improvement in the 'expression template compiler' for the new JIT backend. This is a tool designed to make it easy to develop expression templates, which are themselves a way to make it easy to generate the 'expression tree' intermediate representation used by the new JIT backend. This is important because MoarVM instructions operate on a perl-like level of abstraction - single instructions can perform operations such as 'convert object to string', 'find first matching character in string' or 'access the last element of an array'. Such operations require rather more instructions to represent as machine code.

    This level of abstraction is rather convenient for the rakudo compiler, which doesn't have to consider low-level details when it processes your perl6 code. But it is not very convenient for the JIT compiler which does. The 'expression' intermediate representation is designed to be much closer to what hardware can support directly. Basic operations include loading from and storing to memory, memory address computation, integer arithmetic, (conditional) branching, and function calls. At some point in the future, floating point operations will also be added. But because of this difference in abstraction level, a single MoarVM instruction will often map to many expression tree nodes. So what is needed is an efficient way to convert between the two representations, and that is what expression templates are supposed to do.

    Expression templates are very much like the expression tree structure itself, in that both are represented as arrays of integers. Some of the elements represent instructions, some are constants, and some are references (indexes into the same array), forming a directed acyclic graph (not a tree). The only difference is that the template is associated with a set of instructions that indicate how it should be linked into the tree. (Instruction operands, i.e. the data that each instruction operates on, are prepared and linked by the template application process as well).

    Surprisingly, arrays of integers aren't a very user-friendly way to write instruction templates, and so the template compiler was born. It takes as input a text file with expression templates defined as symbolic expressions, best known from the LISP world, and outputs a header file that contains the templates, ready for use by the JIT compiler. Note that the word 'template' has become a bit overloaded, referring to the textual input of the template compiler as well as to the binary input to the JIT compiler. That's okay, I guess, since they're really two representations of the same thing. The following table shows how template text, binary, and expression tree relate to each other:

    Text / 'Binary' / Tree

    (template: unless_i
      (when (zr $0)
        (branch (label $1))))

    template: {
        info: ".f.f.l.ll",
        len: 9,
        root: 6
    }

    I hope it isn't too hard to see how one maps to the other. The unless_i instruction executes a branch if its integer argument is zero, specified by a constant as its second argument. All symbols (like when, label and zr) have been replaced by uppercase prefixed constants (MVM_JIT_WHEN), and all nesting has been replaced by references (indexes) into the template array. The 'info' string specifies how the template is to be linked into the tree. Instruction operands are indicated by an 'f', and internal links by an 'l'. In the tree representation the operands have been linked into the tree by the JIT; they form the LOAD and CONST nodes and everything below them.
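    A hedged sketch of how such a template might be linked into the tree (the node codes and the root bookkeeping below are invented stand-ins for the real MVM_JIT_* machinery):

```python
# Hypothetical node codes standing in for the real MVM_JIT_* constants.
ZR, LABEL, BRANCH, WHEN = 100, 101, 102, 103

# The unless_i template from the table above, as an array of integers:
# index 0: zr, 1: operand slot, 2: label, 3: operand slot,
# index 4: branch, 5: link -> 2, 6: when, 7: link -> 0, 8: link -> 4
template = [ZR, 0, LABEL, 0, BRANCH, 2, WHEN, 0, 4]
info     = ".f.f.l.ll"   # '.' copy as-is, 'f' operand slot, 'l' internal link

def apply_template(tree, template, info, operands):
    """Append the template to the tree, rebasing links and filling operands."""
    base = len(tree)
    ops = iter(operands)
    for code, kind in zip(template, info):
        if kind == "f":
            tree.append(next(ops))    # link an instruction operand node
        elif kind == "l":
            tree.append(code + base)  # rebase internal reference
        else:
            tree.append(code)         # copy node code / constant unchanged
    return base + template.index(WHEN)  # root is the 'when' node (root: 6 above)

tree = [7, 7, 7]   # pretend three nodes already exist in the tree
root = apply_template(tree, template, info, operands=[30, 31])
print(tree[3:], root)   # [100, 30, 101, 31, 102, 5, 103, 3, 7] 9
```

    Note how the internal links (2, 0, 4 in the template) become absolute indexes once the template lands at an offset in the tree.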

    Anyway, my improvement concerns a more complex form of template, such as the following example, an instruction to load an object value from the instance field of an object:

    (template: sp_p6oget_o
    (let: (($val (load (add (^p6obody $1) $2) ptr_sz)))
    (if (nz $val) $val (^vmnull))))

    This template contains a let: expression, which declares the $val variable. This value can be used in the subsequent expression by its name. Without such declarations the result of a computation could only have one reference, its immediate syntactic parent. (Or in other words, without let:, every template can only construct a tree). That is very inconvenient in case a result should be checked for null-ness, as in this case. (vmnull is a macro for the global 'null object' in MoarVM. The null object represents NULL wherever an object is needed, but isn't actually NULL, as that would mean it couldn't be dereferenced; it saves the interpreter from checking if a pointer to an object is NULL everywhere it is accessed).

    The let: construct has another purpose: it ensures the ordering of operations. Although most operations can be ordered in whatever way suits the compiler, some do not, most notably function calls. (Function calls may have numerous unpredictable side effects, after all). All statements declared in the 'let declaration body' are compiled to run before any statements in the 'expression body'. This enables the programmer to ensure that a value is not needlessly computed twice, and more importantly, it ensures that a value that is used in multiple branches of a conditional statement is defined in both of them. For instance:

    (let (($foo (...)))
      (if (...)
          (load $foo)
          $foo))

    This pseudo-snippet of template code would dereference $foo if some condition is met (e.g. $foo is not NULL) and return $foo directly otherwise. Without let to order the computation of $foo prior to the blocks of if, the first (conditional) child of if would be the first reference to $foo. That would mean that the code to compute $foo is only compiled in the first conditional block, which would not be executed whenever the if condition was not true, meaning that $foo would be undefined in the alternative conditional block. This would mean chaos. So in fact let does order expressions. All is good.

    Except... I haven't told you how this ordering works, which is where my change comes in. Prior to commit 7fb1b10 the let expression would insert a hint to the JIT compiler to add the declared expressions as tree roots. The 'tree roots' are where the compiler starts converting the expression tree (graph) to a linear sequence of byte code. Hence the declaring expressions are compiled prior to the dependent expressions. But this has, of course, one big disadvantage, which is that the set of roots is global for the tree. Every declaration, no matter how deep into the tree, was to be compiled prior to the head of the tree. As a result, the following template code would not at all do what you want:

    (let (($foo (...)))
      (if (nz $foo)
          (let (($bar (load $foo))) # dereference $foo !
            (... $bar))))

    The declaration of $bar would cause $foo to be dereferenced prior to checking whether it is non-null, causing a runtime failure. Chaos is back. Well, that's what I've changed. Fortunately, we have another ordering mechanism at our disposal, namely DO lists. These are nodes with a variable number of children that are also promised to be compiled in order. After the patch linked above, the compiler now transforms let expressions into the equivalent DO expressions. Because DO expressions can be nested safely, $bar is not computed prior to the null-check of $foo, as the programmer intended. I had originally intended to implement analysis to automatically order the expressions with regard to the conditionals, but I found that this was more complicated to implement and more surprising to the programmer. I think that in this case, relying on the programmer is the right thing.

    One thing that I found interesting is that this reduces the number of mechanisms in the compiler. The 'root-hint' was no longer useful, and subsequently removed. At the same time, all but the last child of a DO list must be void expressions, i.e. yield no value, because DO can only return the value of its last child. Since all expressions in a let declaration must yield some value - otherwise they would be useless - they required a new operation type: discard. Thus with a new node type (extension of data range) we can remove a class of behavior.
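    The rewrite can be sketched on toy s-expressions (Python tuples; in the real array representation each declared value is one shared node, not the duplicated subtree the tuples show):

```python
def lower_let(expr):
    """Rewrite ('let', bindings, body) into a ('do', ...) node, roughly as the
    template compiler now does. Expressions are toy tuples here; in the real
    array representation each declared value is one shared node, not a copy."""
    if not isinstance(expr, tuple):
        return expr
    if expr[0] == "let":
        bindings, body = expr[1], lower_let(expr[2])
        env = {name: lower_let(val) for name, val in bindings}

        def subst(e):
            if isinstance(e, str) and e in env:
                return env[e]
            if isinstance(e, tuple):
                return tuple(subst(c) for c in e)
            return e

        # Declarations run first, each wrapped in a value-discarding node;
        # the body is the last child, whose value the DO list yields.
        decls = tuple(("discard", v) for v in env.values())
        return ("do",) + decls + (subst(body),)
    return tuple(lower_let(c) for c in expr)

out = lower_let(("let", (("$foo", ("load", "addr")),),
                ("if", ("nz", "$foo"), "$foo", ("const", 0))))
print(out)
```

    Because the resulting DO nodes nest, a let inside a conditional branch is only ordered before the rest of that branch, not hoisted to the top of the tree.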

    After I had implemented this, I've started working on adding basic block analysis. That is a subject for a later post, though. Until next time!

    Perl 6 By Example: Now "Perl 6 Fundamentals"

    Published by Moritz Lenz on 2017-04-30T22:00:01

    This blog post is part of my ongoing project to write a book about Perl 6.

    If you're interested, either in this book project or any other Perl 6 book news, please sign up for the mailing list at the bottom of the article, or here. It will be low volume (less than an email per month, on average).

    After some silence during the past few weeks, I can finally share some exciting news about the book project. Apress has agreed to publish the book, both as print and electronic book.

    The title is Perl 6 Fundamentals, with A Primer with Examples, Projects, and Case Studies as subtitle. The editorial process is happening right now. I've received some great feedback on my initial manuscript, so there's a lot to do for me.

    Stay tuned for more updates!

    Subscribe to the Perl 6 book mailing list


    gfldex: Issue All The Things

    Published by gfldex on 2017-04-30T18:07:12

    While on her epic quest to clean up the meta part of the ecosystem, samcv sent me a few pull requests. That raised the question of which of my modules have open issues. Github is quite eager to list many things for you, but lacks the ability to show issues for a group of repos. Once again things fell into place.

    Some time ago I made a meta module to save a few clicks when testing modules once a week. Which means I already have a list of modules I care about.

    perl6 -e 'use META6; use META6::bin :TERM :HELPER;\
    for META6.new(file => "$*HOME/projects/perl6/gfldex-meta-zef-test/META6.json").<depends> -> $name {\
        say BOLD $name;\
    }'

    META6::bin didn’t know about Github issues, which was easily solved, including retries on timeouts of the Github API. Now I can feed the module names into &MAIN and get a list of issues.

    perl6 -e 'use META6; use META6::bin :TERM :HELPER;\
    for META6.new(file => "$*HOME/projects/perl6/gfldex-meta-zef-test/META6.json").<depends> -> $name {\
        say BOLD $name;\
        try META6::bin::MAIN(:issues, :module($name), :one-line, :url);\
    }'

    I swiftly went on to merge the pull requests.

    [open] Add License checks and use new META license spec [10d] ⟨⟩
    [open] warn on source [35d] ⟨⟩
    [open] warn on empty description [37d] ⟨⟩
    [open] check if source-url is accessible [37d] ⟨⟩
    [open] Check `perl` version [135d] ⟨⟩
    [open] Report missing modules? [1y] ⟨⟩
    [open] Add :strict-versions switch [1y] ⟨⟩
    [open] Test harder that "provides" is sane [1y] ⟨⟩
    Github timed out, trying again 1/3.
    Github timed out, trying again 2/3.
    Github timed out, giving up.
    [open] Use SPDX identifier in license field of META6.json [3d] ⟨⟩
    [open] Use SPDX identifier in license field of META6.json [3d] ⟨⟩
    Github timed out, trying again 1/3.
    Github timed out, trying again 1/3.
    [open] Use SPDX identifier in license field of META6.json [9d] ⟨⟩

    To check the issues of any project that has a META6.json, run meta6 --issues. To check if there are issues for a given module in the ecosystem, use meta6 --issues --module=Your::Module::Name


    As requested by timotimo, meta6 --issues --one-line --url --deps will list all issues of the repo and all issues of the dependencies listed in META6.json.

    samcv: Camelia Wants YOU to Add License Tags to Your Module!

    Published on 2017-04-23T07:00:00

    Open to scene, year 2017: With no good guidance on the license field, the ecosystem had at least as many variations for "Artistic 2.0" license as humans had fingers. But there was a hope that robot kind and human kind could work to solve this problem, together.

    Most of our ecosystem modules that have licenses are Artistic 2.0. Here are just some of the variations of the license metadata tag we had in the ecosystem for the same license. Some were ambiguous as well:

    The list goes on. Note: the ambiguous license names above (perl and Artistic) were found on modules that were provably Artistic 2.0 as they had a LICENSE file. I make no assertion that all modules using these ambiguous names are Artistic 2.0, and the list above only refers to what was found on actual Artistic 2.0 projects in the ecosystem.

    This was by no fault of the module creators, as the example didn't even show a license field at all (this has now been updated with guidance on the license field). The original META spec in S22 used to say that the field should contain a URL to the license, and even if this had been consistent between modules in the ecosystem, it still would not have been very useful for computers or people to quickly figure out with certainty what license a project was under, as the URLs were at many different addresses for the same licenses.

    It was clear the original spec was not sufficiently useful to computers or package managers, so the spec was changed to conform more closely to other parts of the META spec. It was decided we would use SPDX standard identifiers in the license field, which are both human and computer readable, while allowing an optional URL to the license to go under the support key of the META (where other project URLs already go).

    This new effort hopes to make sure the license fields in META are both computer and human useful to look at, so we standardized based on the SPDX identifiers which are the most widely used identifiers in the open source world.

    Humans and Robots working together

    We had 103 modules that had license fields but with non-standard values. My robot agreed to make about 50 pull requests, as a show of good will towards us humans :). These automated pull requests were made when it was a full certainty which license the project had (either by the LICENSE file or because the license field was "The Artistic 2" or something unambiguous but nonstandard).

    There is a full list of the modules with non-standard licenses here where we are keeping track of the progress that has been made! The list is planned to expand to also cover modules with no license field at all, but the ones with license fields were much easier for my robot friend to deal with.

    If you have a module in that list, or have a module with no license field (don't feel bad, until several days ago none of my modules had license fields either), it is your job to add one!

    If you have any modules that my robot friend or I didn't PR, feel free to add those license fields yourself. If you see a module and notice it has no license field in the meta file, feel free to submit a PR of your own if it has a LICENSE file showing which license the module is under, or, if there is no LICENSE file, open an issue so the author can make the change themselves. If possible add it to the list above so we can keep track of it. As mentioned before, make sure to use SPDX identifiers.

    For the details of the updated META spec regarding the license field, please see S22 here.

    Thank you for doing your part to help make the ecosystem a better place!

    P.S. On April 19th only 13% of modules had a license field at all. Now, 4 days later we are up to 20.5%! Keep up the good work everyone!

    Zoffix Znet: The Failure Point of a Release

    Published on 2017-04-23T00:00:00

    The April Glitches in Rakudo Releases

    6guts: Massively reducing MoarVM Fixed Size Allocator contention

    Published by jnthnwrthngtn on 2017-04-22T14:37:35

    The latest MoarVM release, 2017.04, contains a significant improvement for multi-threaded applications, especially those that are CPU-bound. I mentioned the improvement briefly on Twitter, because it’s hard to be anything other than brief on Twitter. This post contains a decidedly less brief, though hopefully somewhat interesting, description of what I did. Oh, and also a bonus footnote in which I attempt to prove the safety (or lack of safety) of a couple of lock free algorithms, because that’s fun, right?

    The Fixed Size Allocator

    The most obvious way to allocate memory in a C program is through functions like malloc and calloc. Indeed, we do this plenty in MoarVM. The malloc and calloc implementations in C libraries have certainly been tuned a bunch, but at the same time they have to have good behavior for a very wide range of programs. They also need to keep track of the sizes of allocations, since a call to free does not pass the size of the memory being released. And they need to try to avoid fragmentation, which can lead to out-of-memory errors occurring because the heap ends up with lots of small gaps, but none big enough to allocate a larger object.

    When we know a few more properties of the memory usage of a program, and we have information around to know the size of the memory block we are freeing, it’s possible to do a little better. MoarVM does this in multiple ways.

    One of them is by using a bump-the-pointer allocator for memory that is managed by the garbage collector. These have a header that points to a type table that knows the size of the object that was allocated, meaning the size information is readily available. And the GC can move objects around in memory, since it can find all of the references to an object and update them, meaning there is a way out of the fragmentation trap too.

    The call stack is another example. In the absence of closures, it is possible to allocate a block of memory and use it like a stack. When a program makes a call, the current location in the memory block is taken as the address for the call frame memory, and the location is bumped by the frame size. This could be seen as a “push”, in stack terms. Because call frames are left in the opposite order to which they are entered, freeing them is just subtraction. This could be seen as a “pop”. Since holes are impossible, fragmentation cannot occur.
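    The scheme above can be sketched in a few lines. This is a toy Python model for illustration only; MoarVM's real implementation is C, and the class and method names here are invented:

```python
class FrameStack:
    """Toy model of the call-stack scheme: one block of memory,
    allocation bumps an offset ("push"), freeing subtracts ("pop")."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.top = 0                    # current location in the block

    def push_frame(self, size):
        if self.top + size > self.capacity:
            raise MemoryError('stack space exhausted')
        addr = self.top                 # the frame's memory starts here
        self.top += size                # bump by the frame size
        return addr

    def pop_frame(self, size):
        self.top -= size                # freeing is just subtraction

stack = FrameStack(1024)
outer = stack.push_frame(64)            # enter a call
inner = stack.push_frame(128)           # enter a nested call
assert (outer, inner) == (0, 64)
stack.pop_frame(128)                    # frames leave in reverse order,
stack.pop_frame(64)                     # so no holes can ever form
assert stack.top == 0
```

    Because frames are released strictly in reverse order of allocation, the offset always returns to exactly where it was, which is why fragmentation is impossible here.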

    A third case is covered by the fixed size allocator. This is the most difficult of the three. It tries to do a better job than malloc and friends in the case that, at the point when memory is freed, we readily know the size of the memory. This allows it to create regions of memory that consist of N blocks of a fixed size, and allocate the memory out of those regions (which it calls “pages”). When a memory request comes in, the allocator first checks if it’s within the size range that the fixed size allocator is willing to handle. If it isn’t, it’s passed along to malloc. Otherwise, the size is rounded up to the nearest “bin size” (8 bytes, 16 bytes, 24 bytes, and so forth). A given bin consists of a free list of blocks that have been freed, together with the set of pages that blocks of that size are allocated from.

    If the free list contains any entries, then one of them will be taken. If not, then the pages will be considered. If the current page is not full, then the allocation will be made from it. Otherwise, another page will be allocated. When memory is freed, it is always added to the free list of the appropriate bin. Therefore, a longer-running program, in steady state, will typically end up getting all of its allocations from the free list.
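    A miniature model of that bin scheme, again as a Python sketch rather than MoarVM's actual C code (all names invented): allocation prefers the free list, then the current page, then a fresh page, and frees always return blocks to the free list.

```python
class SizeBin:
    """One bin of a fixed-size allocator: pages of N fixed-size slots
    plus a free list of previously released slots."""
    def __init__(self, block_size, blocks_per_page=4):
        self.block_size = block_size
        self.blocks_per_page = blocks_per_page
        self.free_list = []   # released blocks, reused first
        self.pages = []       # each page modeled as just an id here
        self.next_slot = 0    # bump index into the current page

    def alloc(self):
        if self.free_list:                       # 1. reuse a freed block
            return self.free_list.pop()
        if not self.pages or self.next_slot == self.blocks_per_page:
            self.pages.append(len(self.pages))   # 3. allocate another page
            self.next_slot = 0
        block = (self.pages[-1], self.next_slot) # 2. bump within the page
        self.next_slot += 1
        return block

    def free(self, block):
        self.free_list.append(block)   # frees always go to the free list

def bin_for(size, bin_sizes=(8, 16, 24, 32)):
    """Round a request up to the nearest bin size; None means 'use malloc'."""
    for s in bin_sizes:
        if size <= s:
            return s
    return None

b = SizeBin(16)
x = b.alloc()
y = b.alloc()
b.free(x)
assert b.alloc() == x        # steady state: allocations come from the free list
assert bin_for(13) == 16     # 13-byte request rounds up to the 16-byte bin
assert bin_for(100) is None  # too large: falls through to malloc
```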

    Enter threads

    Building a fixed size allocator for a single-threaded environment isn’t all that difficult. But what happens when it needs to cope with being used in a multi-threaded program? Well…it’s complicated. Clearly, it is not possible to have a single global fixed size allocator and have all of the threads just use it without any kind of concurrency control. Taking an item off the freelist is a multi-step process, and allocating from a page – or allocating a new page – is even more steps. Without concurrency control, there will be data races all over, and we’ll be handed a SIGSEGV in record time.

    It’s worth stopping to consider what would happen if we were to give every thread its own fixed size allocator. This turns out to get messy fast, as memory allocated on one thread may be freed by another. A seemingly simple scheme is to say that the freed memory is simply appended to the freelist of the freeing thread’s fixed size allocator. Unfortunately, this has two bad properties.

    1. When the thread ends, we can’t just throw away the pages – because bits of them may still be in use by other threads, or referenced in the free lists of other threads. So they’d need to be somehow “re-homed”, which is going to need some kind of coordination. Further measures may be needed to mitigate memory fragmentation in programs that spawn and join many threads during their lifetimes.
    2. Imagine a producer/consumer setup, where one thread does allocations and passes the allocated memory to another thread, which processes the data in the memory and frees it. The producing thread will build up a lot of pages to allocate out of. The consuming thread will build up an ever longer free list. Memory runs out. D’oh.

    So, MoarVM went with a single global fixed size allocator. Of course, this has the drawback of needing concurrency control.

    Concurrency control

    The easiest possible form of concurrency control is to have threads acquire a mutex on every allocate and free operation. This has the benefit of being very straightforward to understand and reason about. It has the disadvantage of being extremely costly. Mutex acquisition can be relatively cheap, but it gets expensive when there is high contention – that is, lots of threads trying to obtain the lock. And since all CPU-bound threads will typically allocate some working memory, particularly in a VM for a dynamic language that doesn’t yet do escape analysis, that adds up to a lot of contention.

    So, MoarVM did something more sophisticated.

    First, the easy part. It’s possible to append to a free list with a CPU-provided atomic operation, provided taking from the freelist is also using one. So, no mutex acquisition is required for freeing memory. However, an atomic operation still requires a kind of locking down at the CPU level. It’s cheaper than a mutex acquire/release for sure, but there will still be contention between CPU cores for the cache line holding the head of the free list.

    What about allocation? It turns out that we cannot just take from a free list using an atomic operation without hitting the ABA problem (gory details in footnote). Therefore, some kind of locking is needed to ensure an ordering on the operations. In most cases, the atomic operation will work on the first attempt (it’s competing with frees, which happen without any kind of locking, meaning a retry will sometimes be needed). In cases where something will complete very rapidly, a spinlock may be used in place of a full-on mutex. So, the MoarVM fixed size allocator allocation scheme boiled down to:

    1. Acquire the spin lock.
    2. Try to take from the free list in a loop, until either we succeed or the free list is seen to be empty.
    3. Release the spin lock.
    4. If we failed to obtain memory from the free list, take the slow path to get memory from a page, allocating another page if needed. This slow path does acquire a real mutex.


    First up, I’ll note that the strategy outlined above does beat the “just take a mutex for every allocate/free” approach – at least, in all of the benchmarks I’ve considered. Frees end up being lock free, and most of the allocations just do a spin lock and an atomic operation.

    At the same time, contention means contention, and no lock free data structure or spinlock changes that. If multiple threads are constantly scrambling to work on the same memory location – such as the head of a free list – it’s going to get expensive. How expensive? On an Intel Core i7, obtaining a cache line that is held by another core exclusively – which it typically will be under contention – costs somewhere around 70 CPU cycles. It gets worse in a multi-CPU setup, where it could easily be hundreds of CPU cycles. Note this is just for one operation; the spinlock is a further atomic operation and, of course, it uses some cycles as it spins.

    But how much could this end up costing in a real world Perl 6 application? I recently had chance to find out, and the numbers were ugly. Measurements obtained by perf showed that a stunning 40% of the application’s runtime was spent inside of the fixed size allocator. (Side note: perf is a sampling profiler, which – in my handwavey understanding – snapshots the callstack at regular intervals to figure out where time is being spent. My experience has been that sampling profilers tend to be better at showing up surprising costs like this than instrumenting profilers are, even if they are in some senses less precise.)

    Making things better

    Clearly, there was significant room for improvement. And, happily, things are now much improved, and my real-world program did get something close to a 40% performance boost.

    To make things better, I introduced per-thread freelists, while leaving pages global and retaining global free lists also.

    Memory is allocated in the first place from global pages, as before. However, when it is freed, it is instead placed on a per-thread free list (with one free list per thread per size bin). When a thread needs memory, it first checks its thread-local free list to see if there is anything there. It will only then look at the global free list, or the global pages, if the thread-local free list cannot satisfy the memory request. The upshot of this is that the vast majority of allocations and frees performed by the fixed size allocator no longer have any contention.
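    That lookup order can be sketched as follows. This is a hypothetical Python model; the real code is C, and the global structures additionally sit behind the spinlock and mutex described earlier:

```python
import threading

class FixedSizeAllocator:
    """Sketch of the improved scheme: a per-thread free list sits in
    front of a shared global free list and the global pages."""
    def __init__(self):
        self.local = threading.local()  # one free list per thread
        self.global_free = []           # contended path (locked in reality)
        self.pages_served = 0

    def _my_free_list(self):
        if not hasattr(self.local, 'free'):
            self.local.free = []
        return self.local.free

    def alloc(self):
        mine = self._my_free_list()
        if mine:                        # 1. thread-local list: no contention
            return mine.pop()
        if self.global_free:            # 2. fall back to the global free list
            return self.global_free.pop()
        self.pages_served += 1          # 3. last resort: carve from a page
        return ('page-block', self.pages_served)

    def free(self, block):
        self._my_free_list().append(block)  # frees stay thread-local

fsa = FixedSizeAllocator()
block = fsa.alloc()
fsa.free(block)
assert fsa.alloc() is block     # reused without touching global state
assert fsa.pages_served == 1
```

    In the common case a thread keeps recycling its own blocks, so neither the spinlock nor the contended cache line is touched at all.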

    However, as I mentioned earlier, one needs to be very careful when introducing things like thread-local freelists to not create bad behavior when a thread terminates or in producer/consumer scenarios. Therefore:

    So, I believe this improvement is good for performance, without being badly behaved in any of the cases that previously would have worked out fine.

    Can we do better?

    Always! While the major contention bottleneck is gone, there are further opportunities for improvement that are worth exploring in the future.

    In summary…

    If you have CPU-bound multi-threaded Perl 6 programs, MoarVM 2017.04 could offer a big performance improvement. For my case, it was close to 40%. And the design lesson from this: on modern hardware, contention is really costly, and using a lock free data structure or picking the “best kind of lock” will not overcome that.

    Footnote on the ABA vulnerability: It’s decidedly interesting – at least to me – that prepending to a free list can be safely done with a single atomic operation, but taking from it cannot be. Here I’ll attempt to provide a proof for these claims.

    We’ll consider a single free list whose head lives at memory location F, and two threads, T1 and T2. We assume the existence of an atomic operation, TRY-CAS(location, old, new), which will – in a single CPU instruction that may not be interrupted – compare the value in memory pointed to by location with old and, if they match, replace it with new. (CAS is short for Compare And Swap.) The TRY-CAS function evaluates to true if the replacement took place, and false if not. The threads may be preempted (that is, taken off the CPU) at any point in time.

    To show that allocation is vulnerable to the ABA problem, we just need to find an execution where it happens. First of all, we’ll define the operation ALLOCATE as:

    1: do
    2:     allocated = *F
    3:     if allocated != NULL
    4:         next = allocated.next
    5: while allocated != NULL && !TRY-CAS(F, allocated, next)
    6: return allocated

    And FREE(C) as:

    1: do
    2:     current = *F
    3:     C.next = current;
    4: while !TRY-CAS(F, current, C)

    Let’s consider a case where we have 3 memory cells, C1, C2, and C3. The free list head F points to C1, which in turn points to C2, which in turn points to C3.

    Thread T1 enters ALLOCATE, but is preempted immediately after the execution of line 4. At this point, allocated contains C1 and next contains C2.

    Next, T2 calls ALLOCATE, and succeeds in making an allocation. F now points to C2. It again calls ALLOCATE, meaning that F now points to C3. It then calls FREE(C1). At this point, F points to C1 again, and C1 points to C3. Notice that at this point, cell C2 is considered to be allocated and in use.

    Consider what happens if T1 is resumed. It performs TRY-CAS(F, C1, C2). This operation will succeed, because F does indeed currently point to C1. This means that F now comes to point to C2. However, we earlier stated that C2 is allocated and in use, and therefore should not be in the free list. Therefore we have demonstrated the code to be buggy, and shown how the bug arises as a result of the ABA problem.
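    The interleaving above can be replayed deterministically. This is a Python model of the pseudocode, with cells as objects and an invented try_cas helper standing in for the CPU's compare-and-swap:

```python
def try_cas(box, old, new):
    """Model of TRY-CAS: replace box[0] only if it still equals old."""
    if box[0] is old:
        box[0] = new
        return True
    return False

# Free list F -> C1 -> C2 -> C3, as in the scenario above.
C3 = {'name': 'C3', 'next': None}
C2 = {'name': 'C2', 'next': C3}
C1 = {'name': 'C1', 'next': C2}
F = [C1]

# T1 runs ALLOCATE up to line 4, then is preempted:
t1_allocated = F[0]             # C1
t1_next = t1_allocated['next']  # C2

# T2: two successful allocations, then FREE(C1):
a1 = F[0]
assert try_cas(F, a1, a1['next'])   # T2 allocates C1; F -> C2
a2 = F[0]
assert try_cas(F, a2, a2['next'])   # T2 allocates C2; F -> C3
C1['next'] = F[0]
assert try_cas(F, F[0], C1)         # FREE(C1); F -> C1 -> C3
assert a2 is C2                     # C2 is now allocated and in use

# T1 resumes: its stale CAS still succeeds, because F points at C1 again.
assert try_cas(F, t1_allocated, t1_next)
assert F[0] is C2    # bug: the in-use cell C2 is back on the free list
```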

    What of the claim that the FREE(C) is not vulnerable to the ABA problem? To be vulnerable to the ABA problem, another thread must be able to change the state of something that the correctness of the operation depends upon, but that is not tested by the TRY-CAS operation. Looking at FREE(C) again:

    1: do
    2:     current = *F
    3:     C.next = current;
    4: while !TRY-CAS(F, current, C)

    We need to consider C and current. We can very reasonably make the assumption that the calling program is well-behaved, and will never use the cell C again after passing it to FREE(C) (unless it obtains it again in the future through another call to ALLOCATE, which cannot happen until FREE has inserted it into the free list). Therefore, C cannot be changed in any way other than the code in FREE changes it. The FREE operation holds the sole reference to C at this point.

    Life is much more complicated for current. It is possible for a preemption at line 3 of FREE, followed by another thread allocating the cell pointed to by current and then freeing it again, which is certainly a case of an ABA state change. However, unlike the situation we saw in ALLOCATE, the FREE operation does not depend on the content of current. We can see this by noticing how it never looks inside of it, and instead just holds a reference to it. An operation cannot depend upon a value it never accesses. Therefore, FREE is not vulnerable to the ABA problem.

    Strangely Consistent: The root of all eval

    Published by Carl Mäsak

    Ah, the eval function. Loved, hated. Mostly the latter.

    $ perl -E'my $program = q[say "OH HAI"]; eval $program'
    OH HAI

    I was a bit stunned when the eval function was renamed to EVAL in Perl 6 (back in 2013, after spec discussion here). I've never felt really comfortable with the rationale for doing so. I seem to be more or less alone in this opinion, though, which is fine.

    The rationale was "the function does something really weird, so we should flag it with upper case". Like we do with BEGIN and the other phasers, for example. With BEGIN and others, the upper-casing is motivated, I agree. A phaser takes you "outside of the normal control flow". The eval function doesn't.

    Other things that we upper-case are things like .WHAT, which look like attributes but are really specially code-generated at compile-time into something completely different. So even there the upper-casing is motivated because something outside of the normal is happening.

    eval in the end is just another function. Yes, it's a function with potentially quite wide-ranging side effects, that's true. But a lot of fairly standard functions have wide-ranging side effects. (To name a few: shell, die, exit.) You don't see anyone clamoring to upper-case those.

    I guess it could be argued that eval is very special because it hooks into the compiler and runtime in ways that normal functions don't, and maybe can't. (This is also how TimToady explained it in the commit message of the renaming commit.) But that's an argument from implementation details, which doesn't feel satisfactory. It applies with equal force to the lower-cased functions just mentioned.

    To add insult to injury, the renamed EVAL is also made deliberately harder to use:

    $ perl6 -e'my $program = q[say "OH HAI"]; EVAL $program'
    ===SORRY!=== Error while compiling -e
    EVAL is a very dangerous function!!! (use the MONKEY-SEE-NO-EVAL pragma to override this error,
    but only if you're VERY sure your data contains no injection attacks)
    at -e:1
    ------> program = q[say "OH HAI"]; EVAL $program⏏<EOL>
    $ perl6 -e'use MONKEY-SEE-NO-EVAL; my $program = q[say "OH HAI"]; EVAL $program'
    OH HAI

    Firstly, injection attacks are a real issue, and no laughing matter. We should educate each other and newcomers about them.

    Secondly, that error message ("EVAL is a very dangerous function!!!") is completely over-the-top in a way that damages rather than helps. I believe when we explain the dangers of code injection to people, we need to do it calmly and matter-of-factly. Not with three exclamation marks. The error message makes sense to someone who already knows about injection attacks; it provides no hints or clues for people who are unaware of the risks.

    (The Perl 6 community is not unique in eval-hysteria. Yesterday I stumbled across a StackOverflow thread about how to turn a string with a type name into the corresponding constructor in JavaScript. Some unlucky soul suggested eval, and everybody else immediately piled on to point out how irresponsible that was. Solely as a knee-jerk reaction "because eval is bad".)

    Thirdly, MONKEY-SEE-NO-EVAL. Please, can we just... not. 😓 Random reference to monkeys and the weird attempt at levity while switching on a nuclear-chainsaw function aside, I find it odd that a pragma that enables EVAL is called something with NO-EVAL. That's not Least Surprise.

    Anyway, the other day I realized how I can get around both the problem of the all-caps name and the problem of the necessary pragma:

    $ perl6 -e'my &eval = &EVAL; my $program = q[say "OH HAI"]; eval $program'
    OH HAI

    I was so happy to realize this that I thought I'd blog about it. Apparently the very dangerous function (!!!) is fine again if we just give it back its old name. 😜

    gfldex: You can call me Whatever you like

    Published by gfldex on 2017-04-19T11:00:43

    The docs spend many words to explain in great detail what a Whatever is and how to use it from the caller perspective. There are quite a few ways to support Whatever as a callee as I shall explain.

    Whatever can be used to express “all of the things”. In that case we ask for the type object that is Whatever.

    sub gimmi(Whatever) {};

    Any expression that contains a Whatever * will be turned into a thunk. The latter happens to be a block without a local scope (kind of, it can be turned into a block when captured). We can ask specifically for a WhateverCode to accept Whatever-expressions.

    sub compute-all-the-things(WhateverCode $c) { $c(42) }
    say compute-all-the-things(*-1);
    say (try say compute-all-the-things({$_ - 1})) // 'failed';
    # OUTPUT: «41␤failed␤»

    We could also ask for a Block or a Method as both come preloaded with one parameter. If we need a WhateverCode with more than one argument we have to be precise, because the compiler can’t match a Callable sub-signature with a WhateverCode.

    sub picky(WhateverCode $c where .arity == 2 || fail("two stars in that expression please") ) {
        $c.(1, 2)
    }
    say picky(*-*);
    # OUTPUT: «-1␤»
    say (try picky(*-1)) // $!;
    # OUTPUT: «two stars in that expression please␤  in sub picky at …»

    The same works with a Callable constraint, leaving the programmer more freedom what to supply.

    sub picky(&c where .arity == 2) { c(1, 2) }

    There are quite a few things a WhateverCode can’t do.

    sub faily(WhateverCode $c) { $c.(1) }
    say (try faily( return * )) // $!.^name;
    # OUTPUT: «X::ControlFlow::Return␤»

    The compiler can take advantage of that and provide compile time errors or get things done a little bit quicker. So trading the flexibility of Callable for a stricter WhateverCode constraint may make sense.

    gfldex: Dealing with Fallout

    Published by gfldex on 2017-04-19T09:51:53

    The much welcome and overdue sanification of the IO subsystem led to some fallout in some of my code, which was enjoyably easy to fix.

    Some IO-operations used to return False or undefined values on errors returned from the OS. Those have been fixed to return Failure. As a result some idioms don’t work as they used to.

    my $v = 'some-filename.txt'.IO.?slurp // 'sane default';

    The conditional method call operator .? does not defuse Failure, so the whole expression blows up when an error occurs. Luckily try can be used as a statement, which will return Nil on failure, so we can still use the defined-or operator // to assign default values.

    my $v = (try 'some-filename.txt'.IO.slurp) // 'sane default';

    The rationale for having IO operations throw explosives is simple. Filesystem dealings cannot be atomic (at least as seen from the runtime) and can fail unexpectedly due to cable tripping. By packaging exceptions in Failure objects Perl 6 allows us to turn them back into undefined values as we please.

    PART 3: Information on Changes Due to IO Grant Work

    Published by Zoffix Znet on 2017-04-17T20:22:46

    The IO grant work is at its wrap up. This note lists some of the last-minute changes to the plans delineated in earlier communications ([1], [2], [3]). Most of the listed items do not require any changes to users’ code.

    Help and More Info

    If you need help or more information, please join our IRC channel and ask there. You can also contact the person performing this work via Twitter @zoffix or by talking to user Zoffix in our dev IRC channel

    gfldex: Slipping in a Config File

    Published by gfldex on 2017-04-17T15:31:24

    I wanted to add a config file to META6::bin without adding another dependency and without adding a grammar or other forms of fancy (and therefore time consuming) parsers. As it turns out, .split and friends are more than enough to get the job done.

    # META6::bin config file
    general.timeout = 60
    git.timeout = 120
    git.protocol = https

    That’s what the file should look like; I wanted a multidim Hash in the end, to query values like %config<git><timeout>.

    our sub read-cfg($path) is export(:HELPER) {
        use Slippy::Semilist;

        return unless $path.IO.e;

        my %h;
        slurp($path).lines\
            .grep(!*.starts-with('#'))\
            .grep(*.chars)\
            ».split(/\s* '=' \s*/)\
            .flat.map(-> $k, $v { %h{||$k.split('.').cache} = $v });

        %h
    }

    We slurp in the whole file and process it line by line. All newlines are removed and any line that starts with a # or is empty is skipped. We separate values and keys by = and use a Semilist Slip to build the multidim Hash. Abusing a .map that doesn’t return values is a bit smelly but keeps all operations in order.

    A Semilist is the thing you can find in %hash{1;2;3} (same for arrays) to express multi-dimensionality. Just using a normal list won’t cut it, because a list is a valid key for a Hash.

    I had Rakudo::Slippy::Semilist lying around for quite some time but never really used it much, because it’s cheating by using nqp ops to get some decent speed. As it turned out it’s not really the operations on a Hash but the circumfix:<{ }> operator itself that causes a 20x speed drop. By calling .EXISTS-KEY and .BIND-KEY directly the speed hit shrinks down to 7% over an nqp implementation.

    It’s one of those cases where things fall into place with Perl 6. Being able to define my own operator in conjunction with ». allows me to keep the code flowing in the order of thought instead of breaking it up into nested loops.

    samcv: Indexing Unicode Things, Improving Case Insensitive Regex

    Published on 2017-04-15T07:00:00

    In the 2017.04 release of Rakudo under the MoarVM backend, there will be some substantial improvements to regex speed.

    I have been meaning to make a new blog post for some time about my work on Unicode in Perl 6. This is going to be the first post of several that I have been meaning to write. As a side note, let me mention that I have a Perl Foundation grant proposal which is related to working on Unicode for Perl 6 and MoarVM.

    The first of these improvements I'm going to write about is case insensitive regex m:i/ /. MoarVM had formerly lowercased the haystack and the needle whenever nqp::indexic was called. The new code uses foldcase instead of lowercasing.

    It ended up 1.8-3.3x faster than before, but the work began when MasterDuke submitted a pull request which changed the underlying MoarVM function behind nqp::indexic (index ignorecase) to use foldcase instead of lowercase.

    At first this seemed like a great and easy improvement, but shortly after there were some serious problems. You see, when you foldcase a string, sometimes the number of graphemes can change. One grapheme can become up to 3 new codepoints! For example the ligature ‘ﬆ’ will foldcase to ‘st’, even though lowercasing leaves it unchanged. The issue was, if the string to be searched contained any of these expanding characters, the results of the nqp::indexic command would be off by however many codepoints the expansion added!

    ‘ﬆ’.fc.say;       # st
    ‘ﬆ’.chars.say;    # 1
    ‘ﬆ’.fc.chars.say; # 2
    ‘ﬃ’.fc.say;       # ffi
    ‘ﬃ’.chars.say;    # 1
    ‘ﬃ’.fc.chars.say; # 3
    ‘ß’.fc.say;       # ss

    So this was a real problem.
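    The same expansion can be observed with any implementation of Unicode full case folding; for instance Python's str.casefold behaves the same way (shown here purely as an illustration; this is not Rakudo code):

```python
# Unicode full case folding can grow a string, as described above.
lig_st = '\ufb06'                 # LATIN SMALL LIGATURE ST: one character
assert len(lig_st) == 1
assert lig_st.casefold() == 'st'  # folding yields two codepoints

lig_ffi = '\ufb03'                # LATIN SMALL LIGATURE FFI
assert lig_ffi.casefold() == 'ffi' and len(lig_ffi.casefold()) == 3

assert 'ß'.casefold() == 'ss'     # sharp s folds to two codepoints...
assert 'ß'.lower() == 'ß'         # ...but lowercasing leaves it alone
```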

    On the bright side of things, this allowed me to make many great changes in how case insensitive strings are searched for under Perl 6/MoarVM.

    The nqp::index command does a lot of the effort when searching for a string. I discovered we had an nqp::indexic operation that searched for a string while ignoring case, but it was not used everywhere: nqp::index was still used extensively, which meant changing the case of the strings on the Perl 6 side before calling it, in addition to doing so when using nqp::indexic (index ignore case). On top of that, MoarVM changed the case of the entire haystack and needle whenever the nqp::indexic operation was used.

    On the MoarVM side I first worked on getting it working with foldcase, and quickly discovered that the only sane way to do this, was to begin foldcasing operations on the haystack only from the starting point sent to the indexic function. If you foldcased them before the requested index, it would screw up the offset. My solution was to foldcase the needle, and then foldcase each grapheme down the haystack, only as far as we needed to find our match, preventing useless work foldcasing parts of the string we did not need.

    The offset of the needle that is found from the indexic op is relative to the original string, and it will expand characters as needed, but the returned offset will always be related to the original string, not its changed version, making the offsets useful and relevant information on where the needle is in the original string, not in the altered version. As with regex, we are looking for the match in the haystack, and so must be able to return the section of the string we have matched.
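    The same strategy can be sketched in Python, whose str.casefold performs Unicode case folding (so 'ß'.casefold() returns 'ss'). This is only an illustration of the approach described above, not MoarVM's actual implementation:

```python
def index_ignore_case(haystack: str, needle: str, start: int = 0) -> int:
    """Find `needle` in `haystack` ignoring case, returning the offset in
    the ORIGINAL haystack (or -1). The needle is foldcased once; haystack
    characters are foldcased only from `start` onward, and only as far as
    needed for each candidate position."""
    fneedle = needle.casefold()
    if not fneedle:
        return start
    for i in range(start, len(haystack)):
        folded, length = [], 0
        j = i
        while j < len(haystack) and length < len(fneedle):
            fc = haystack[j].casefold()   # may expand, e.g. 'ß' -> 'ss'
            folded.append(fc)
            length += len(fc)
            j += 1
        if ''.join(folded).startswith(fneedle):
            return i                      # offset relative to the original string
    return -1

print(index_ignore_case('teﬆing', 'STI'))      # 2 -- 'ﬆ' foldcases to 'st'
print(index_ignore_case('straße', 'STRASSE'))  # 0
```

    Note that the returned offset points into the original string, not the foldcased version, which is exactly the behaviour the paragraph above calls for.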

    The end result is we now have a 1.8x to 3.3x (depending on not finding a match/finding a match at the beginning) faster case insensitive regex!

    gfldex: Speeding up Travis

    Published by gfldex on 2017-04-14T20:55:00

    After some wiggling I managed to convince Travis to use Ubuntu packages, trimming about 4 minutes off a test run. Sadly the .debs don’t come with zef built in, which would save another 40 seconds.

    As follows a working .travis.yml.

    sudo: required
    install:
        - wget
        - sudo dpkg --install perl6-rakudo-moarvm-ubuntu16.04_20170300-02_amd64.deb
        - sudo /opt/rakudo/bin/
        - export PATH=/opt/rakudo/bin:$PATH
        - sudo chown -R travis.travis /home/travis/.zef/
        - zef --debug install .
    script:
        - zef list --installed

    Using a meta package in conjunction with .debs makes it quite easy to test whether a module will work not just with bleeding-edge Rakudo but with versions users might actually have.

    brrt to the future: Function Call Milestone

    Published by Bart Wiegmans on 2017-03-28T16:14:00

    Hi everybody. It's high time for another update, and this time I have good news. The 'expression' JIT compiler can now compile native ('C') function calls (although it's not able to use the results). This is a major milestone because function calls are hard! (At least from the perspective of a compiler, and especially from the perspective of the register allocator). Also because native function calls are really very important in MoarVM. Most of its 'primitive' operations (like hash table access, string equality, big integer arithmetic) are implemented by invoking native functions, and so to compile almost any program the JIT has to compile many function calls.

    What makes function calls 'hard' is that they must implement the 'calling convention' of the relevant 'application binary interface' (ABI). In short, the ABI specifies the locations of function call parameters.  A small number of parameters (on Windows, the first 4; for POSIX platforms, the first 6) are placed in registers, and if there are more parameters they are usually placed on the stack. Aside from the calling convention, the ABI also specifies the expected alignment of the stack pointer (to 16 bytes) and the registers a function may overwrite (clobber in ABI-speak) and which registers must have their original values after the function returns. The last type of registers are called 'callee-saved'. Note that at least a few registers must be callee-saved, especially those related to call stack management, because if the callee function would overwrite those it would be impossible to return control back to the caller. By the way, manipulating exactly those registers is how the setjmp and longjmp 'functions' work.

    So the compiler is tasked with generating code that ensures the correct values are placed in the correct registers. That sounds easy enough, but what if the these registers are taken by other values, and what if those other values might be required for another parameter? Indeed, what if the value in the %rdx register needs to be in the %rsi register, and the value of the %rsi register is required in the %rdx register? How to determine the correct ordering for shuffling the operands?

    One simple way to deal with this would be to eject all values from registers onto the stack, and then to load the values from registers if they are necessary. However, that would be very inefficient, especially if most function calls have no more than 6 (or 4) parameters and most of these parameters are computed for the function call only. So I thought that solution wouldn't do.

    Another way to solve this would be if the register allocator could ensure that values are placed in their correct registers directly,- especially for register parameters -  i.e. by 'precoloring'. (The name comes from register allocation algorithms that work by 'graph coloring', something I will try to explain in a later post). However, that isn't an option due to my choice of 'linear scan' as the register allocation algorithm. This is a 'greedy' algorithm, meaning that it decides the allocation for a live range as soon as it encounters it, and that it cannot revert that decision once it's been made. (If it could, it would be more like a dynamic programming algorithm). So to ensure that the allocation is valid I'd have to make sure that the information about register requirements is propagated backwards from the instructions to all values that might conflict with it... and at that point we're no longer talking about linear scan, and I would be better off re-engineering a new algorithm. Not a very attractive option either!

    Instead, I thought about it and it occurred to me that this problem seems a lot like unravelling a dependency graph, with a number of restrictions. That is to say, it can be solved by a topological sort. I map the registers to a graph structure as follows:

    I linked to the topological sort page for an explanation of the problem, but I think my implementation is really quite different from that presented there. They use a node visitation map and a stack, I use an edge queue and an outbound count. A register transfer (edge) can be enqueued if it is clear that the destination register is not currently used. Transfers from registers to stack locations (as function call parameters) or local memory (to save the value from being overwritten by the called function) are also enqueued directly. As soon as the outbound count of a node reaches zero, it is considered to be 'free' and the inbound edge (if any) is enqueued.

    Unlike a 'proper' dependency graph, cycles can and do occur, as in the example where '%rdx' and '%rsi' would need to swap places. Fortunately, because of the single-inbound edge rule, such cycles are 'simple' - all outbound edges not belonging to the cycle can be resolved prior to the cycle-breaking, and all remaining edges are part of the cycle. Thus, the cycle can always be broken by freeing just a single node (i.e. by copy to a temporary register).
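    The edge-queue and outbound-count idea can be sketched in Python. This is an illustrative toy (the register names, the temporary register, and the function are made up, not MoarVM code):

```python
def resolve_moves(moves, temp='tmp'):
    """Order parallel register moves (dst <- src) so that no source is
    clobbered before it is read. `moves` maps destination -> source.
    Acyclic transfers resolve via outbound counts (a topological order);
    the remaining simple cycles are broken with one temporary register."""
    outbound = {}                              # pending reads per register
    for dst, src in moves.items():
        outbound[src] = outbound.get(src, 0) + 1
    ordered, pending = [], dict(moves)
    ready = [d for d in pending if outbound.get(d, 0) == 0]
    while ready:                               # acyclic part
        dst = ready.pop()
        src = pending.pop(dst)
        ordered.append((dst, src))
        outbound[src] -= 1
        if outbound[src] == 0 and src in pending:
            ready.append(src)                  # its inbound edge is now safe
    while pending:                             # leftovers are simple cycles
        start, _ = next(iter(pending.items()))
        ordered.append((temp, start))          # free one node via the temp
        cur = start
        while cur in pending:
            src = pending.pop(cur)
            ordered.append((cur, temp if src == start else src))
            if src == start:
                break
            cur = src
    return ordered
```

    For the swap example above, resolve_moves({'rdx': 'rsi', 'rsi': 'rdx'}) emits tmp←rdx, rdx←rsi, rsi←tmp: the cycle is broken by copying a single node to a temporary, just as described.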

    The only thing left to consider are the values that are used after the function call returns (survive the function call) and that are stored in registers that the called function can overwrite (which is all of them, since the register allocator never selects callee-saved registers). So to make sure they are available afterwards, we must spill them. But there are a few spill strategies to choose from (terminology made up by me):

    The current register allocator does a full spill when it's run out of registers, and it would make some sense to apply the same logic for function-call related spills. I've decided to use spill-and-restore, however, because a full spill complicates the sorting order (a value that used to be in a register is suddenly only in memory) and it can be wasteful, especially if the call only happens in an alternative branch. This is common for instance when assigning values to object fields, as that may sometimes require a write barrier (to ensure the GC tracks all references from 'old' to 'new' objects). So I'm guessing that it's going to be better to pay the cost of spilling and restoring only in those alternative branches, and that's why I chose to use spill-and-restore.

    That was it for today. Although I think being able to call functions is a major milestone, this is not the very last thing to do. We currently cannot allocate any of the registers used for floating-point calculations, which is a relatively minor limitation since those aren't used very frequently. But I also need to do some more work to actually use function return values and apply generic register requirements of tiles. But I do think the day is coming near where we can start thinking about merging the new JIT with the MoarVM master branch, making it available to everybody. Until next time!

    PART 2: Upgrade Information for Changes Due to IO Grant Work

    Published by Zoffix Znet on 2017-04-03T00:15:07

    We’re making more changes!

    Do the core developers ever sleep? Nope! We keep making Perl 6 better 24/7!


    Not more than 24 hours ago, you may have read Upgrade Information for Changes Due to IO Grant Work. All of that is still happening.

    However, it turned out that I, (Zoffix), had an incomplete understanding of how changes in 6.d language will play along with 6.c stuff. My original assumption was we could remove or change existing methods, but that assumption was incorrect. Pretty much the only sane way to incompatibly change a method in an object in 6.d is to add a new method with a different name.

    Since I’d rather we not have, e.g., .child and .child-but-secure for the next decade, we have a bit of an in-flight course correction:

    ORIGINAL PLAN was to minimize incompatibilities with existing 6.c language code; leave everything potentially-breaking for 6.d

    NEW PLAN is to right away add everything that does NOT break 6.c-errata specification, into 6.c language; leave everything else for 6.d. Note that current 6.c-errata specification for IO is sparse (the reason IO grant is running in the first place), so there’s lots of wiggle room to make most of the changes in 6.c.


    I (Zoffix) still hope to cram all the changes into 2017.04 release. Whether that’s overly optimistic, given the time constraints… we’ll find out on April 17th. If anything doesn’t make it into 2017.04, all of it definitely will be in 2017.05.


    Along with the original list in first Upgrade Information Notice, the following changes may affect your code. I’m excluding any non-conflicting changes.

    Potential changes:

    Changes for 6.d language:

    Help and More Info

    If you need help or more information, please join our IRC channel and ask there. You can also contact the person performing this work via Twitter @zoffix or by talking to user Zoffix in our dev IRC channel.

    Upgrade Information for Changes Due to IO Grant Work

    Published by Zoffix Znet on 2017-04-02T08:31:49

    As previously notified, there are changes being made to IO routines. This notice is to provide details on changes that may affect currently-existing code.


    Barring unforeseen delays, the work affecting version 6.c language is planned to be included in 2017.04 Rakudo Compiler release (planned for release on April 17, 2017) on which next Rakudo Star release will be based.

    Some or all of the work affecting 6.d language may also be included in that release and will be available if the user uses use v6.d.PREVIEW pragma. Any 6.d work that doesn’t make it into 2017.04 release, will be included in 2017.05 release.

    If you use development commits of the compiler (e.g. rakudobrew), you will
    receive this work as-it-happens.


    If you only used documented features, the likelihood of you needing to change any of your code is low. The 6.c language changes due to IO Grant work affect either routines that are rarely used or undocumented routines that might have been used by users assuming they were part of the language.


    This notice describes only changes affecting existing code and only for 6.c language. It does NOT include any non-conflicting changes or changes slated for 6.d language. If you’re interested in the full list of changes, you can find it in the IO Grant Action Plan

    The changes that may affect existing code are:

    Help and More Info

    If you need help or more information, please join our IRC channel and ask there. You can also contact the person performing this work via Twitter @zoffix or by talking to user Zoffix in our dev IRC channel.

    Perl 6 By Example: Idiomatic Use of Inline::Python

    Published by Moritz Lenz on 2017-04-01T22:00:01

    This blog post is part of my ongoing project to write a book about Perl 6.

    If you're interested, either in this book project or any other Perl 6 book news, please sign up for the mailing list at the bottom of the article, or here. It will be low volume (less than an email per month, on average).

    In the two previous installments, we've seen Python libraries being used in Perl 6 code through the Inline::Python module. Here we will explore some options to make the Perl 6 code more idiomatic and closer to the documentation of the Python modules.

    Types of Python APIs

    Python is an object-oriented language, so many APIs involve method calls, which Inline::Python helpfully automatically translates for us.

    But the objects must come from somewhere and typically this is by calling a function that returns an object, or by instantiating a class. In Python, those two are really the same under the hood, since instantiating a class is the same as calling the class as if it were a function.

    An example of this (in Python) would be

    from matplotlib.pyplot import subplots
    result = subplots()

    But the matplotlib documentation tends to use another, equivalent syntax:

    import matplotlib.pyplot as plt
    result = plt.subplots()

    This uses the subplots symbol (class or function) as a method on the module matplotlib.pyplot, which the import statement aliases to plt. This is a more object-oriented syntax for the same API.

    Mapping the Function API

    The previous code examples used this Perl 6 code to call the subplots symbol:

    my $py =;
    $'import matplotlib.pyplot');
    sub plot(Str $name, |c) {
        $'matplotlib.pyplot', $name, |c);
    }
    my ($figure, $subplots) = plot('subplots');

    If we want to call subplots() instead of plot('subplots'), and bar(args) instead of plot('bar', args), we can use a function to generate wrapper functions:

    my $py =;
    sub gen(Str $namespace, *@names) {
        $"import $namespace");
        return map -> $name {
            sub (|args) {
                $$namespace, $name, |args);
            }
        }, @names;
    }
    my (&subplots, &bar, &legend, &title, &show)
        = gen('matplotlib.pyplot', <subplots bar legend title show>);
    my ($figure, $subplots) = subplots();
    # more code here
    legend($@plots, $@top-authors);
    title('Contributions per day');

    This makes the functions' usage quite nice, but comes at the cost of duplicating their names. One can view this as a feature, because it allows the creation of different aliases, or as a source for bugs when the order is messed up, or a name misspelled.

    How could we avoid the duplication should we choose to create wrapper functions?

    This is where Perl 6's flexibility and introspection abilities pay off. There are two key components that allow a nicer solution: the fact that declarations are expressions and that you can introspect variables for their names.

    The first part means you can write mysub my ($a, $b), which declares the variables $a and $b, and calls a function with those variables as arguments. The second part means that $ returns the string '$a', the name of the variable.

    Let's combine this to create a wrapper that initializes subroutines for us:

    sub pysub(Str $namespace, |args) {
        $"import $namespace");
        for args[0] <-> $sub {
            my $name = $;
            $sub = sub (|args) {
                $$namespace, $name, |args);
            }
        }
    }
    pysub 'matplotlib.pyplot',
        my (&subplots, &bar, &legend, &title, &show);

    This avoids duplicating the name, but forces us to use some lower-level Perl 6 features in sub pysub. Using ordinary variables means that asking for their names results in the name of the variable inside the sub, not the name of the variable that's used on the caller side. So we can't use slurpy arguments as in

    sub pysub(Str $namespace, *@subs)

    Instead we must use |args to obtain the rest of the arguments in a Capture. This doesn't flatten the list of variables passed to the function, so when we iterate over them, we must do so by accessing args[0]. By default, loop variables are read-only, which we can avoid by using <-> instead of -> to introduce the signature. Fortunately, that also preserves the name of the caller side variable.

    An Object-Oriented Interface

    Instead of exposing the functions, we can also create types that emulate the method calls on Python modules. For that we can implement a class with a method FALLBACK, which Perl 6 calls for us when calling a method that is not implemented in the class:

    class PyPlot is Mu {
        has $.py;
        submethod TWEAK {
            $!'import matplotlib.pyplot');
        }
        method FALLBACK($name, |args) {
            $!'matplotlib.pyplot', $name, |args);
        }
    }
    my $pyplot =$py);
    my ($figure, $subplots) = $pyplot.subplots;
    # plotting code goes here
    $pyplot.legend($@plots, $@top-authors);
    $pyplot.title('Contributions per day');

    Class PyPlot inherits directly from Mu, the root of the Perl 6 type hierarchy, instead of Any, the default parent class (which in turn inherits from Mu). Any introduces a large number of methods that Perl 6 objects get by default and since FALLBACK is only invoked when a method is not present, this is something to avoid.
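    For readers more at home in Python: FALLBACK plays the same role as Python's __getattr__, which is likewise consulted only when normal attribute lookup fails. A minimal sketch (the proxy class here is made up for illustration):

```python
import importlib

class ModuleProxy:
    """Forward unknown attribute lookups to a wrapped module.
    __getattr__ is called only for names NOT found through normal
    lookup -- the same "only when missing" rule as FALLBACK."""
    def __init__(self, name):
        self._module = importlib.import_module(name)

    def __getattr__(self, name):
        return getattr(self._module, name)

m = ModuleProxy('math')
print(m.sqrt(16.0))  # 4.0
```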

    The method TWEAK is another method that Perl 6 calls automatically for us, after the object has been fully instantiated. All-caps method names are reserved for such special purposes. It is marked as a submethod, which means it is not inherited into subclasses. Since TWEAK is called at the level of each class, if it were a regular method, a subclass would call it twice implicitly. Note that TWEAK is only supported in Rakudo version 2016.11 and later.

    There's nothing specific to the Python package matplotlib.pyplot in class PyPlot, except the namespace name. We could easily generalize it to any namespace:

    class PythonModule is Mu {
        has $.py;
        has $.namespace;
        submethod TWEAK {
            $!"import $!namespace");
        }
        method FALLBACK($name, |args) {
            $!$!namespace, $name, |args);
        }
    }
    my $pyplot =$py, :namespace<matplotlib.pyplot>);

    This is one Perl 6 type that can represent any Python module. If instead we want a separate Perl 6 type for each Python module, we could use roles, which are optionally parameterized:

    role PythonModule[Str $namespace] is Mu {
        has $.py;
        submethod TWEAK {
            $!"import $namespace");
        }
        method FALLBACK($name, |args) {
            $!$namespace, $name, |args);
        }
    }
    my $pyplot = PythonModule['matplotlib.pyplot'].new(:$py);

    Using this approach, we can create type constraints for Python modules in Perl 6 space:

    sub plot-histogram(PythonModule['matplotlib.pyplot'], @data) {
        # implementation here
    }

    Passing in any other wrapped Python module than matplotlib.pyplot results in a type error.


    Perl 6 offers enough flexibility to create function and method call APIs around Python modules. With a bit of meta programming, we can emulate the typical Python APIs close enough that translating from the Python documentation to Perl 6 code becomes easy.

    Subscribe to the Perl 6 book mailing list

    Upgrade Information for Lexical require

    Published by Zoffix Znet on 2017-03-18T01:29:32

    Upgrade Information for Lexical require

    What’s Happening?

    Rakudo Compiler release 2017.03 includes the final piece of lexical module loading work: lexical require. This work was first announced in December, in

    There are two changes that may impact your code:

    Upgrade Information

    Lexical Symbols


    # WRONG:
    try { require Foo; 1 } and ::('Foo').new;

    The require above is inside a block and so its symbols won’t be available
    outside of it and the look up will fail.


    (try require Foo) !=== Nil and ::('Foo').new;

    Now the require installs the symbols into scope that’s lexically accessible
    to the ::('Foo') look up.

    Optional Loading


    # WRONG:
    try require Foo;
    if ::('Foo') ~~ Failure {
        say "Failed to load Foo!";
    }

    This construct installs a package named Foo, which would be replaced by the
    loaded Foo if it were found, but if it weren’t, the package will remain a
    package, not a Failure, and so the above ~~ test will always be False.


    # Use return value to test whether loading succeeded:
    (try require Foo) === Nil and say "Failed to load Foo!";
    # Or use a run-time symbol lookup with require, to avoid compile-time
    # package installation:
    try require ::('Foo');
    if ::('Foo') ~~ Failure {
        say "Failed to load Foo!";
    }

    In the first example above, we test that the return value of try isn’t Nil, since
    on successful loading it will be a Foo module, class, or package.

    The second example uses a run-time symbol lookup in require and so it never needs
    to install the package placeholder during the compile time. Therefore, the
    ::('Foo') ~~ test does work as intended.

    Help and More Info

    If you require help or more information, please join our chat channel
    #perl6 on

    6guts: Considering hyper/race semantics

    Published by jnthnwrthngtn on 2017-03-16T16:42:05

    We got a lot of nice stuff into Perl 6.c, the version of the language released on Christmas of 2015. Since then, a lot of effort has gone on polishing the things we already had in place, and also on optimization. By this point, we’re starting to think about Perl 6.d, the next language release. Perl 6 is defined by its test suite. Even before considering additional features, the 6.d test suite will tie down a whole bunch of things that we didn’t have covered in the 6.c one. In that sense, we’ve already got a lot done towards it.

    In this post, I want to talk about one of the things I’d really like to get nailed down as part of 6.d, and that is the semantics of hyper and race. Along with that I will, of course, be focusing on getting the implementation in much better shape. These two methods enable parallel processing of list operations. hyper means we can perform operations in parallel, but we must retain and respect ordering of results. For example:

    say (1, 9, 6)* + 5); # (6 14 11)

    Should always give the same results as if the hyper was not there, even if a thread computing 6 + 5 gave its result before the one computing 1 + 5. (Obviously, this is not a particularly good real-world example, since the overhead of setting up parallel execution would dwarf doing 3 integer operations!) Note, however, that the order of side-effects is not guaranteed, so:


    Could output the numbers in any order. By contrast, race is so keen to give you results that it doesn’t even try to retain the order of results:

    say (1, 9, 6)* + 5); # (14 6 11) or (6 11 14) or ...

    Back in 2015, when I was working on the various list handling changes we did in the run up to the Christmas release, my prototyping work included an initial implementation of the map operation in hyper and race mode, done primarily to figure out the API. This then escaped into Rakudo, and even ended up with a handful of tests written for it. In hindsight, that code perhaps should have been pulled out again, but it lives on in Rakudo today. Occasionally somebody shows a working example on IRC using the eval bot – usually followed by somebody just as swiftly showing a busted one!

    At long last, getting these fixed up and implemented more fully has made it to the top of my todo list. Before digging into the implementation side of things, I wanted to take a step back and work out the semantics of all the various operations that might be part of or terminate a hyper or race pipeline. So, today I made a list of those operations, and then went through every single one of them and proposed the basic semantics.

    The results of that effort are in this spreadsheet. Along with describing the semantics, I’ve used a color code to indicate where the result leaves you in the hyper or race paradigm afterwards (that is, a chained operation will also be performed in parallel).

    I’m sure some of these will warrant further discussion and tweaks, so feel free to drop me feedback, either on the #perl6-dev IRC channel or in the comments here.

    Pawel bbkr Pabian: Your own template engine in 4 flavors. With Benchmarks!

    Published by Pawel bbkr Pabian on 2017-02-25T23:43:36

    This time on blog I'll show you how to write your own template engine - with syntax and behavior tailored for your needs. And we'll do it in four different ways to analyze pros and cons of each approach as well as code speed and complexity. Our sample task for today is to compose password reminder text for user, which can then be sent by email.

    use v6;

    my $template = q{
    Hi [VARIABLE person]!

    You can change your password by visiting [VARIABLE link] .

    Best regards.
    };

    my %fields = (
        'person' => 'John',
        'link' => ''
    );

    So we've decided what our template syntax should look like, and for starters we'll do trivial variables (although that's not a very precise name, because variables in templates are almost always immutable).
    We also have data to populate template fields. Let's get started!

    1. Substitutions

    sub substitutions ( $template is copy, %fields ) {
        for %fields.kv -> $key, $value {
            $template ~~ s:g/'[VARIABLE ' $key ']'/$value/;
        }
        return $template;
    }

    say substitutions($template, %fields);

    Yay, works:

        Hi John!

    You can change your password by visiting .

    Best regards.

    Now it is time to benchmark it to get some baseline for different approaches:

    use Bench;

    my $template_short = $template;
    my %fields_short = %fields;

    my $template_long = join(
    ' lorem ipsum ', map( { '[VARIABLE ' ~ $_ ~ ']' }, 'a' .. 'z')
    ) x 100;
    my %fields_long = ( 'a' .. 'z' ) Z=> ( 'lorem ipsum' xx * );

    my $b =;
    $b.timethese( 1000, {
        'substitutions_short' => sub {
            substitutions( $template_short, %fields_short )
        },
        'substitutions_long' => sub {
            substitutions( $template_long, %fields_long )
        },
    } );

    The benchmarks in this post test two cases for each approach. Our template from the example is the "short" case. And there is a "long" case with a 62KB template containing 2599 text fragments and 2600 variables filled from 26 fields. So here are the results:

    Timing 1000 iterations of substitutions_long, substitutions_short...
    substitutions_long: 221.1147 wallclock secs @ 4.5225/s (n=1000)
    substitutions_short: 0.1962 wallclock secs @ 5097.3042/s (n=1000)

    Whoa! That is a serious penalty for long templates. And the reason is that this code has three serious flaws - the original template is destroyed during variable evaluation and therefore must be copied each time we want to reuse it, the template text is parsed multiple times, and the output is rewritten after populating each variable. But we can do better...

    2. Substitution

    sub substitution ( $template is copy, %fields ) {
        $template ~~ s:g/'[VARIABLE ' (\w+) ']'/{ %fields{$0} }/;
        return $template;
    }

    This time we have a single substitution. The variable name is captured and we can use it to get the field value on the fly. Benchmarks:

    Timing 1000 iterations of substitution_long, substitution_short...
    substitution_long: 71.6882 wallclock secs @ 13.9493/s (n=1000)
    substitution_short: 0.1359 wallclock secs @ 7356.3411/s (n=1000)

    Mediocre boost. We have less penalty on long templates because the text is not parsed multiple times. However, the remaining flaws from the previous approach still apply, and the regexp engine still must do plenty of memory reallocations for each piece of template text replaced.

    Also, it won't allow our template engine to gain new features - like conditions or loops - in the future, because it is very hard to parse nested tags in a single regexp. Time for a completely different approach...

    3. Grammars and direct Actions

    If you are not familiar with Perl 6 grammars and Abstract Syntax Tree concept you should study official documentation first.

    grammar Grammar {
        regex TOP { ^ [ <chunk=text> | <chunk=variable> ]* $ }
        regex text { <-[ [ ] ]>+ }
        regex variable { '[VARIABLE ' $<name>=(\w+) ']' }
    }

    class Actions {

        has %.fields is required;

        method TOP ( $/ ) {
            make [~]( map { .made }, $/{'chunk'} );
        }
        method text ( $/ ) {
            make ~$/;
        }
        method variable ( $/ ) {
            make %.fields{ $/{'name'} };
        }
    }

    sub grammar_actions_direct ( $template, %fields ) {
        my $actions = fields => %fields );
        return Grammar.parse($template, :$actions).made;
    }

    The most important thing is defining our template syntax as a grammar. A grammar is just a set of named regular expressions that can call each other. On "TOP" (where parsing starts) we see that our template is composed of chunks. Each chunk can be text or a variable. The regexp for text matches everything until it hits the variable start ('[' character; let's assume it is forbidden in text to keep things simple). The regexp for variable should look familiar from previous approaches, however now we capture the variable name as a named capture instead of a positional one.

    The Actions class has methods that are called whenever a regexp with the corresponding name is matched. When called, a method gets the match object ($/) from this regexp and can "make" something from it. This "made" something will be seen by the upper-level method when it is called. For example our "TOP" regexp calls the "text" regexp, which matches the "Hi " part of the template and calls the "text" method. This "text" method just "make"s this matched string for later use. Then the "TOP" regexp calls the "variable" regexp, which matches the "[VARIABLE name]" part of the template. Then the "variable" method is called; it checks the match object for the variable name and "makes" the value of this variable from the %fields hash for later use. This continues until the end of the template string.

    Then the "TOP" regexp is matched and the "TOP" method is called. This "TOP" method can access the array of text or variable "chunks" in the match object and see what was "made" for those chunks earlier. So all it has to do is "make" those values concatenated together. And finally we get this "made" template from the "parse" method. So let's look at benchmarks:

    Timing 1000 iterations of grammar_actions_direct_long, grammar_actions_direct_short...
    grammar_actions_direct_long: 149.5412 wallclock secs @ 6.6871/s (n=1000)
    grammar_actions_direct_short: 0.2405 wallclock secs @ 4158.1981/s (n=1000)

    We got rid of two more flaws from previous approaches. Original template is not destroyed when fields are filled and that means less memory copying. There is also no reallocation of memory during substitution of each field because now every action method just "make"s strings to be joined later. And we can easily extend our template syntax by adding loops, conditions and more features just by throwing some regexps into grammar and defining corresponding behavior in actions. Unfortunately we see some performance regression and this happens because every time template is processed it is parsed, match objects are created, parse tree is built and it has to track all those "make"/"made" values when it is collapsed to final output. But that was not our final word...

    4. Grammars and closure Actions

    Finally we've reached the "boss level", where we have to exterminate the last and greatest flaw - re-parsing.
    The idea is to use grammars and actions like in the previous approach, but this time, instead of producing the output directly, we want to generate executable and reusable code that works like this under the hood:

    sub ( %fields ) {
        return join '',
            sub ( %fields ) { return "Hi " }.( %fields ),
            sub ( %fields ) { return %fields{'person'} }.( %fields ),
            # ...one closure per template chunk
    }

    That's right, we will be converting our template body to a cascade of subroutines.
    Each time this cascade is called it will get and propagate %fields to the deeper subroutines.
    And each subroutine is responsible for handling the piece of template matched by a single regexp in the grammar. We can reuse the grammar from the previous approach and modify only the actions:

    class Actions {

        method TOP ( $/ ) {
            my @chunks = $/{'chunk'};
            make sub ( %fields ) {
                return [~]( map { .made.( %fields ) }, @chunks );
            };
        }

        method text ( $/ ) {
            my $text = ~$/;
            make sub ( %fields ) {
                return $text;
            };
        }

        method variable ( $/ ) {
            my $name = ~$/{'name'};
            make sub ( %fields ) {
                return %fields{$name};
            };
        }

    }

    sub grammar_actions_closures ( $template, %fields ) {
        state %cache{Str};
        my $closure = %cache{$template} //= Grammar.parse(
            $template, actions => Actions.new
        ).made;
        return $closure( %fields );
    }

    Now every action method, instead of making the final output, makes a subroutine that will get %fields and produce the final output later. To generate this cascade of subroutines the template must be parsed only once. Once we have it, we can call it with different sets of %fields to populate the variables in our template. Note how the Object Hash %cache is used to determine if we already have a subroutine tree for a given $template. Enough talking, let's crunch some numbers:

    Timing 1000 iterations of grammar_actions_closures_long, grammar_actions_closures_short...
    grammar_actions_closures_long: 22.0476 wallclock secs @ 45.3563/s (n=1000)
    grammar_actions_closures_short: 0.0439 wallclock secs @ 22778.8885/s (n=1000)

    Nice result! We have an extensible template engine that is 4 times faster for short templates and 10 times faster for long templates than our initial approach. And yes, there is a bonus level...

    4.1. Grammars and closure Actions in parallel

    The last approach opened up a new optimization possibility. If we have subroutines that generate our template, why not run them in parallel? So let's modify our action "TOP" method to process text and variable chunks simultaneously:

    method TOP ( $/ ) {
        my @chunks = $/{'chunk'};
        make sub ( %fields ) {
            return [~]( @chunks.hyper.map( { .made.( %fields ) } ).list );
        };
    }

    Such optimization will shine if your template engine must do some lengthy operation to generate a chunk of the final output, for example execute a heavy database query or call some API. It is perfectly fine to ask for data on the fly to populate the template, because in a feature-rich template engine you may not be able to predict and generate the complete set of data needed beforehand, like we did with our %fields. Use this optimization wisely: for fast subroutines you will see a performance drop, because the cost of sending chunks to threads and retrieving them is higher than just executing them serially on a single core.
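
    A quick, hypothetical illustration of that tradeoff (the chunk subroutines here are invented for the example): parallel iterators such as .hyper preserve result order, which matters when the chunks are concatenated, but they only pay off when each chunk does real work:

```perl6
my @chunks = 1..4;

# cheap chunks: the serial version avoids all thread bookkeeping
my $serial   = [~] @chunks.map( { "c$_" } );

# expensive chunks (imagine a database query or API call per chunk):
# .hyper runs them on multiple cores while preserving order for [~]
sub slow-chunk ( $n ) { sleep 0.1; "c$n" }
my $parallel = [~] @chunks.hyper( batch => 1 ).map( &slow-chunk );

say $serial eq $parallel;   # True: same output, different wallclock cost
```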

    Which approach should I use to implement my own template engine?

    That depends on how much you can reuse your templates. For example, if you send one password reminder per day, go for simple substitution, and reach for a grammar with direct actions if you need more complex features. But if you are using templates, for example, in PSGI processes to display hundreds of pages per second for different users, then the grammar and closure actions approach wins hands down.

    You can download all approaches with benchmarks in a single file here.

    To be continued?

    If you liked this brief introduction to template engines and want to see more complex features like conditions or loops implemented, leave a comment under this article or send me a private message on the #perl6 channel (nick: bbkr).

    brrt to the future: Register Allocator Update

    Published by Bart Wiegmans on 2017-02-09T16:19:00

    Hi everybody, I thought some of you might be interested in an update regarding the JIT register allocator, which is, after all, the last missing piece for the new 'expression' JIT backend. Well, the last complicated piece, at least. Because register allocation is such a broad topic, I don't expect to cover all topics relevant to design decisions here, and reserve a future post for that purpose.

    I think I may have mentioned earlier that I've chosen to implement linear scan register allocation, an algorithm first described in 1999. Linear scan is relatively popular for JIT compilers because it achieves reasonably good allocation results while being considerably simpler and faster than the alternatives, most notably graph coloring (unfortunately no open access link available). Because optimal register allocation is NP-complete, all realistic algorithms are heuristic, and linear scan applies a simple heuristic to good effect. I'm afraid fully explaining the nature of that heuristic and the tradeoffs involved is beyond the scope of this post, so you'll have to remind me to do it at a later point.
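
    For readers unfamiliar with the algorithm, here is a toy sketch (in Perl 6, with invented live intervals; a real allocator also handles spill placement and fixed registers). It walks intervals in order of first use, expires the ones that have ended, and hands freed registers back out:

```perl6
my @registers = <r1 r2>;
my @intervals = [0, 4, 'a'], [1, 2, 'b'], [3, 6, 'c'], [5, 7, 'd'];

my %alloc;    # value name => register
my @active;   # currently live intervals, stored as [end, name]
my @free = @registers;

for @intervals -> [ $start, $end, $name ] {
    # expire intervals that ended before this one starts, freeing registers
    @free.push( %alloc{ .[1] } ) for @active.grep( { .[0] < $start } );
    @active .= grep( { .[0] >= $start } );

    if @free {
        %alloc{$name} = @free.shift;
        @active.push( [ $end, $name ] );
    }
    else {
        say "spill $name";   # out of registers: a real allocator spills here
    }
}

say %alloc;   # here every value gets a register, since b and a expire in time
```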

    Commit ab077741 made the new allocator the default after I had ironed out sufficient bugs to be feature-equivalent to the old allocator (which still exists, although I plan to remove it soon).
    Commit 0e66a23d introduced support for 'PHI' node merging, which is really important and exciting to me, so I'll have to explain what it means. The expression JIT represents code in a form in which all values are immutable, called static single assignment form, or SSA form for short. This helps simplify compilation because there is a clear correspondence between operations and the values they compute. In general in compilers, the easier it is to assert something about code, the more interesting things you can do with it, and the better code you can compile. However, in real code, variables are often assigned more than one value. A PHI node is basically an 'escape hatch' to let you express things like:

    int x, y;
    if (some_condition()) {
        x = 5;
    } else {
        x = 10;
    }
    y = x - 3;

    In this case, despite our best intentions, x can have two different values. In SSA form, this is resolved as follows:

    int x1, x2, x3, y;
    if (some_condition()) {
        x1 = 5;
    } else {
        x2 = 10;
    }
    x3 = PHI(x1, x2);
    y = x3 - 3;

    The meaning of the PHI node is that it 'joins together' the values of x1 and x2 (somewhat like a junction in perl6), and represents the value of whichever 'version' of x was ultimately defined. Resolving PHI nodes means ensuring that, as far as the register allocator is concerned, x1, x2, and x3 should preferably be allocated to the same register (or memory location), and if that's not possible, it should copy x1 and x2 to x3 for correctness. To find the set of values that are 'connected' via PHI nodes, I apply a union-find data structure, which is a very useful data structure in general. Much to my amazement, that code worked the first time I tried it.
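
    Union-find is simple enough to sketch in a few lines of Perl 6 (illustrative code, not MoarVM's actual C implementation):

```perl6
my %parent;   # each value points towards a representative of its set

sub find ( $x ) {
    %parent{$x} //= $x;
    %parent{$x} = find( %parent{$x} ) if %parent{$x} ne $x;  # path compression
    %parent{$x};
}

sub union ( $a, $b ) { %parent{ find($a) } = find($b) }

# x3 = PHI(x1, x2): all three should end up in one allocation class
union('x1', 'x3');
union('x2', 'x3');
say find('x1') eq find('x2');   # True: allocate them to the same register
```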

    Then I had to fix a very interesting bug in commit 36f1fe94 which involves ordering between 'synthetic' and 'natural' tiles. (Tiles are the output of the tiling process about which I've written at some length, they represent individual instructions). Within the register allocator, I've chosen to identify tiles / instructions by their index in the code list, and to store tiles in a contiguous array. There are many advantages to this strategy but they are also beyond the scope of this post. One particular advantage though is that the indexes into this array make their relative order immediately apparent. This is relevant to linear scan because it relies on relative order to determine when to allocate a register and when a value is no longer necessary.

    However, because of using this index, it's not so easy to squeeze new tiles into that array, which is exactly what a register allocator does when it decides to 'spill' a value to memory and load it when needed. (Because inserts are delayed and merged into the array in a single step, the cost of insertion is constant). Without proper ordering, a value loaded from memory could overwrite another value that is still in use. The fix for that is, I think, surprisingly simple and elegant. In order to 'make space' for the synthetic tiles, all indexes are multiplied by a factor of 2 before comparison, and synthetic tiles are further offset by -1 or +1, depending on whether they should be handled before or after the 'natural' tile they are inserted for. E.g. synthetic tiles that load a value should be processed before the tile that uses the value they load.
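
    The scheme is easy to see with a few numbers (a direct transcription of the rule above, not the actual MoarVM code):

```perl6
sub sort-key ( Int $index, Bool :$synthetic, Bool :$before ) {
    my $key = $index * 2;                      # make space between naturals
    $key += $before ?? -1 !! +1 if $synthetic; # slot synthetics around them
    $key;
}

say sort-key(3);                        # 6: the natural tile at index 3
say sort-key(3, :synthetic, :before);   # 5: e.g. a load inserted before it
say sort-key(3, :synthetic);            # 7: e.g. a store inserted after it
```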

    Another issue soon appeared, this time having to do with x86 being, altogether, quaint and antiquated and annoying, and specifically with the use of one operand register as source and result value. To put it simply, where you and I and the expression JIT structure might say:

    a = b + c

    x86 says:

    a = a + b

    Resolving the difference is tricky, especially for linear scan, since linear scan processes the values in the program rather than the instructions that generate them. It is therefore not suited to deal with instruction-level constraints such as these. If a, b, and c in my example above are not the same (not aliases), then this can be achieved by a copy:

    a = b
    a = a + c

    If a and b are aliases, the first copy isn't necessary. However, if a and c are aliases, then a copy may or may not be necessary, depending on whether the operation (in this case '+') is commutative, i.e. it holds for '+' but not for '-'. Commit 349b360 attempts to fix that for 'direct' binary operations, but a fix for indirect operations is still work in progress. Unfortunately, it meant I had to reserve a register for temporary use to resolve this, meaning there are fewer available for the register allocator to use. Fortunately, that did simplify handling of a few irregular instructions, e.g. signed cast of 32 bit integers to 64 bit integers.
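
    The decision table can be sketched as follows (an illustrative pseudo-lowering written in Perl 6, not the JIT's actual tiling code; register names are invented):

```perl6
sub lower ( $a, $b, $c, Str $op, Bool :$commutative ) {
    return "$op $a, $c"                   if $a eq $b;              # a = a op c
    return "$op $a, $b"                   if $a eq $c && $commutative;
    return ( "mov $a, $b", "$op $a, $c" ) if $a ne $c;              # copy first
    # a aliases c and the op is non-commutative: a temporary is needed
    ( "mov tmp, $b", "$op tmp, $c", "mov $a, tmp" );
}

say lower('r1', 'r1', 'r2', 'add');                 # add r1, r2
say lower('r1', 'r2', 'r1', 'add', :commutative);   # add r1, r2
say lower('r1', 'r2', 'r3', 'sub').join('; ');      # mov r1, r2; sub r1, r3
```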

    So that brings us to today and my future plans. The next thing to implement will be support for function calls by the register allocator, which involves shuffling values to the right registers and correct positions on the stack, and also in spilling all values that are still required after the function call since the function may overwrite them. This requires a bit of refactoring of the logic that spills variables, since currently it is only used when there are not enough registers available. I also need to change the linear scan main loop, because it processes values in order of first definition, and as such, instructions that don't create any values are skipped, even if they need special handling like function calls. I'm thinking of solving that with a special 'interesting tiles' queue that is processed alongside the main values working queue.

    That was it for today. I hope to write soon with more progress.

    Strangely Consistent: Deep Git

    Published by Carl Mäsak

    I am not good at chess.

    I mean... "I know how the pieces move". (That's the usual phrase, isn't it?) I've even tried to understand chess better at various points in my youth, trying to improve my swing. I could probably beat some of you other self-identified "I know how the pieces move" folks out there. With a bit of luck. As long as you don't, like, cheat by having a strategy or something.

    I guess what I'm getting at here is that I am not, secretly, an international chess master. OK, now that's off my chest. Phew!

    Imagining what it's like to be really good at chess is very interesting, though. I can say with some confidence that a chess master never stops and asks herself "wait — how does the knight piece move, again?" Not even I do that! Obviously, the knight piece is the one that moves √5 distances on the board. 哈哈

    I can even get a sense of what terms a master-level player uses internally, by reading what master players wrote. They focus on tactics and strategy. Attacks and defenses. Material and piece values. Sacrifices and piece exchange. Space and control. Exploiting weaknesses. Initiative. Openings and endgames.

    Such high-level concerns leave the basic mechanics of piece movements far behind. Sure, those movements are in there somewhere. They are not irrelevant, of course. They're just taken for granted and no longer interesting in themselves. Meanwhile, the list of master-player concerns above could almost equally well apply to a professional Go player. (s:g/piece/stone/ for Go.)

    Master-level players have stopped looking at individual trees, and are now focusing on the forest.

    The company that employs me (Edument) has a new slogan. We've put it on the backs of sweaters which we then wear to events and conferences:

    We teach what you can't google.

    I really like this new slogan. Particularly, it feels like something we as a teaching company have already trended towards for a while. Some things are easy to find just by googling them, or finding a good cheat sheet. But that's not why you attend a course. We should position ourselves so as to teach answers to the deep, tricky things that only emerge after using something for a while.

    You're starting to see how this post comes together now, aren't you? 😄

    2017 will be my ninth year with Git. I know it quite well by now, having learned it in depth and breadth along the way. I can safely say that I'm better at Git than I am at chess at this point.

    Um. I'm most certainly not an international Git grandmaster — but largely that's because such a title does not exist. (If someone reads this post and goes on to start an international Git tournament, I will be very happy. I might sign up.)

    No, my point is that the basic commands have taken on the role for me that I think basic piece movements have taken on for chess grandmasters. They don't really matter much; they're a means to an end, and it's the end that I'm focusing on when I type them.

    (Yes, I still type them. There are some pretty decent GUIs out there, but none of them give me the control of the command line. Sorry-not-sorry.)

    Under this analogy, what are the things I value with Git, if not the commands? What are the higher-level abstractions that I tend to think in terms of nowadays?

    (Yes, these are the ACID guarantees for database transactions, but made to work for Git instead.)

    A colleague of mine talks a lot about "definition of done". It seems to be a Scrum thing. It's his favorite term more than mine, but I still like it for its attempt at "mechanizing" quality, which I believe can succeed in a large number of situations.

    Another colleague of mine likes the Boy Scout Rule of "Always leave the campground cleaner than you found it". If you think of this in terms of code, it means something like refactoring a code base as you go, cleaning it up bit by bit and asymptotically approaching code perfection. But if you think of it in terms of process, it dovetails quite nicely with the "definition of done" above.

    Instead of explaining how in the abstract, let's go through a concrete-enough example:

    1. Some regression is discovered. (Usually by some developer dogfooding the system.)
    2. If it's not immediately clear, we bisect and find the offending commit.
    3. ASAP, we revert that commit.
    4. We analyze the problematic part of the reverted commit until we understand it thoroughly. Typically, the root cause will be something that was not in our definition of done, but should've been.
    5. We write up a new commit/branch with the original (good) functionality restored, but without the discovered problem.
    6. (Possibly much later.) We attempt to add discovery of the problem to our growing set of static checks. The way we remember to do that is through a TODO list in a wiki. This list keeps growing and shrinking in fits and starts.

    Note in particular the interplay between process, quality and, yes, Git. Someone could've told me at the end of step 6 that I had totalled 29 or so Git basic commands along the way, and I would've believed them. But that's not what matters to us as a team. If we could do with magic pixie dust what we do with Git — keep historic snapshots of the code while ensuring quality and isolation — we might be satisfied magic pixie dust users instead.

    Somewhere along the way, I also got a much more laid-back approach to conflicts. (And I stopped saying "merge conflicts", because there are also conflicts during rebase, revert, cherry-pick, and stash — and they are basically the same deal.) A conflict happens when a patch P needs to be applied in an environment which differs too much from the one in which P was created.

    Aside: in response to this post, jast++ wrote this on #perl6: "one minor nitpick: git knows two different meanings for 'merge'. one is commit-level merge, one is file-level three-way merge. the latter is used in rebase, cherry-pick etc., too, so technically those conflicts can still be called merge conflicts. :)" — TIL.

    But we actually don't care so much about conflicts. Git cares about conflicts, because it can't just apply the patch automatically. What we care about is whether the intent of the patch has survived. No software can check that for us. Since the (conflict ↔ no conflict) axis is independent of the (intent broken ↔ intent preserved) axis, we get four cases in total. Two of those are straightforward, because the (lack of) conflict corresponds to the (lack of) broken intent.

    The remaining two cases happen rarely but are still worth thinking about:

    If we care about quality, one lesson emerges from mst's example: always run the tests after you merge and after you've resolved conflicts. And another lesson from my example: try to introduce automatic checks for structures and relations in the code base that you care about. In this case, branch A could've put in a test or a linting step that failed as soon as it saw something according to the old naming convention.

    A lot of the focus on quality also has to do with doggedly going to the bottom of things. It's in the nature of failures and exceptional circumstances to clump together and happen at the same time. So you need to handle them one at a time, carefully unraveling one effect at a time, slowly disassembling the hex like a child's rod puzzle. Git sure helps with structuring and linearizing the mess that happens in everyday development, exploration, and debugging.

    As I write this, I realize even more how even when I try to describe how Git has faded into the background as something important-but-uninteresting for me, I can barely keep the other concepts out of focus. Quality being chief among them. In my opinion, the focus on improving not just the code but the process, of leaving the campground cleaner than we found it, those are the things that make it meaningful for me to work as a developer even decades later. The feeling that code is a kind of poetry that punches you back — but as it does so, we learn something valuable for next time.

    I still hear people say "We don't have time to write tests!" Well, in our teams, we don't have time not to write tests! Ditto with code review, linting, and writing descriptive commit messages.

    No-one but Piet Hein deserves the last word of this post:

    The road to wisdom? — Well, it's plain
    and simple to express:

    Err
    and err
    and err again
    but less
    and less
    and less.

    Death by Perl6: Hello Web! with Purée Perl 6

    Published by Tony O'Dell on 2017-01-09T18:19:56

    Let's build a website.

    Websites are easy to build. There are dozens of frameworks out there to use. Perl has Mojo and Catalyst as its major frameworks, and other languages also have quite a few decent options. Some of them come with boilerplate templates and you just go from there. Others don't, and you spend your first few hours learning how to actually set up the framework and reading about how to share your DB connection between all of your controllers and blah, blah, blah. Let's look at one of P6's web frameworks.

    Enter Hiker

    Hiker doesn't introduce a lot of (if any) new ideas. It does use paradigms you're probably used to, and it aims to make the initial setup of your website very straightforward and easy, so that you can get straight to work sharing your content with the world.

    The Framework

    Hiker is intended to make things fast and easy from the development side. Here's how it works. If you're not into the bleep blop and just want to get started, skip to the Boilerplate heading.

    Application Initialization

    1. Hiker reads from the subdirectories we'll look at later. The controllers and models are classes.
    2. Looking at all controllers, Hiker initializes a new object for each class, and then checks for its .path attribute
      1. If Hiker can't find the path attribute then it doesn't bind anything and produces a warning
    3. After setting up the controller routes, it instantiates a new object for the model specified by the controller (.model)
      1. If none is given by the controller then nothing is instantiated or bound and nothing happens
      2. If a model is required by the controller but cannot be found then Hiker refuses to bind
    4. Finally, HTTP::Server::Router is alerted to all of the paths that Hiker was able to find and verify

    The Request

    1. If the path is found, then the associated class' .model.bind is called.
      1. The response (the second parameter of .model.bind($req, $res)) has a data hash for storing information
    2. The controller's .handler($req, $res) is then executed
      1. The response's data hash is available in this context
    3. If the handler returns a Promise then Hiker waits for that Promise to be kept (and expects the result to be True or False)
      1. If the response is already rendered and the Promise's result is True, then the router is alerted that no more routes should be explored
      2. If the response isn't rendered and the Promise's result is True, then .render is called automagically for you
      3. If the response isn't rendered and the Promise's result is False, then the next matching route is called
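
    To make the Promise flow concrete, a handler can hand back a Promise from start; the names below (MyApp::SlowRoute, Slow.mustache, /slow) are invented for this sketch:

```perl6
use Hiker::Route;

class MyApp::SlowRoute does Hiker::Route {
    has $.path     = '/slow';
    has $.template = 'Slow.mustache';

    method handler($req, $res) {
        start {
            sleep 1;   # stand-in for a slow database query or API call
            True;      # kept with True and not yet rendered: auto-render
        }
    }
}
```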


    Boilerplate

    Ensure you have Hiker installed:

    $ zef install Hiker
    $ rakudobrew rehash #this may be necessary to get the bin to work

    Create a new directory where you'd like to create your project's boilerplate and cd. From here we'll initialize some boilerplate and look at the content of the files.

    somedir$ hiker init  
    ==> Creating directory controllers
    ==> Creating directory models
    ==> Creating directory templates
    ==> Creating route MyApp::Route1: controllers/Route1.pm6
    ==> Creating route MyApp::Model1: models/Model1.pm6
    ==> Creating template templates/Route1.mustache
    ==> Creating app.pl6

    Neato burrito. From the output you can see that Hiker created some directories - controllers, models, templates - for us so we can start out organized. In those directories you will find a few files, let's take a look.

    The Model

    use Hiker::Model;
    class MyApp::Model1 does Hiker::Model {
      method bind($req, $res) {
        $res.data<who> = 'web!';
      }
    }

    Pretty straightforward. MyApp::Model1 is instantiated during Hiker initialization, and .bind is called whenever the controller's corresponding path is requested. As you can see here, this Model just adds the key/value pair who => 'web!' to the response's data hash. This data will be available in the Controller as well as in the template files (if the controller decides to use them).

    The Controller

    use Hiker::Route;
    class MyApp::Route1 does Hiker::Route {
      has $.path     = '/';
      has $.template = 'Route1.mustache';
      has $.model    = 'MyApp::Model1';

      method handler($req, $res) {
        $res.headers<Content-Type> = 'text/plain';
      }
    }

    As you can see above, the Hiker::Route packs a lot of information into a small space. It's a class that does the Hiker role called Hiker::Route. This lets our framework know that it should inspect that class for the path, template, and model so it can handle those operations for us; path and template are the only required attributes.

    As discussed above, our Route can return a Promise if there is some asynchronous operation to be performed. In this case all we're going to do is set the headers to indicate the Content-Type, and then, automagically, render the template file. Note: if you return a falsey value from the handler method, then the router will not auto-render and will attempt to find the next route. This is so that you can cascade paths in the event that you want to chain them together, do some type of real-time decision making to determine whether that's the right class for the request, or perform some other unsaid dark magic. In the controller above we return a truthy value and it auto-renders.

    By specifying the Model in the Route, you're able to re-use the same Model class across multiple routes.

    The Path

    Quick notes about .path. You can pass a static path ('/staticpath'), a path with a placeholder ('/api/:placeholder'), or, if your path is a little more complicated, a regex (/ .* /). Check out the documentation for HTTP::Server::Router (repo).

    The Template

    The template is specified by the controller's .template attribute and Hiker checks for that file in the ./templates folder. The default template engine is Template::Mustache (repo). See that module's documentation for more info.
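
    For reference, a Route1.mustache that produces the greeting described below could be as small as this (a guess at the boilerplate's content, not a verbatim copy):

```mustache
Hello {{who}}
```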

    Running the App

    Running the app from the boilerplate is really pretty straightforward:

    somedir$ perl6 app.pl6  

    Now you can visit the server in your favorite Internet Explorer and find a nice 'Hello web!' message waiting to greet you. If you visit any other URI you'll receive the default 'no route found' message from HTTP::Server::Router.

    The Rest

    The module is relatively young. With feedback from the community, practical applications, and some extra feature expansion, Hiker could be pretty great and it's a good start to taking the tediousness out of building a website in P6. I'm open to feedback and I'd love to hear/see where you think Hiker can be improved, what it's missing to be productive, and possibly anything else [constructive or otherwise] you'd like to see in a practical, rapid development P6 web server.

    Steve Mynott: Rakudo Star: Past Present and Future

    Published by Steve Mynott on 2017-01-02T14:07:31

    At YAPC::EU 2010 in Pisa I received a business card with "Rakudo Star" and the
    date July 29, 2010 which was the date of the first release -- a week earlier
    with a countdown to 1200 UTC. I still have mine, although it has a tea stain
    on it and I refreshed my memory over the holidays by listening again to Patrick
    Michaud speaking about the launch of Rakudo Star (R*):

    R* was originally intended as the first of a number of distribution releases (as
    opposed to compiler releases) -- usable by early adopters but not initially
    production quality. Other names were considered at the time, like Rakudo Beta
    (rejected as sounding like "don't use this"!) and, amusingly, Rakudo Adventure
    Edition. Finally it became Rakudo Whatever and Rakudo Star (since * means
    "whatever"!).

    Well over 6 years later and we never did come up with a better name although there
    was at least one IRC conversation about it and perhaps "Rakudo Star" is too
    well established as a brand at this point anyway. R* is the Rakudo compiler, the main docs, a module installer, some modules and some further docs.

    However, one radical change is happening soon and that is a move from panda to
    zef as the module installer. Panda has served us well for many years but zef is
    both more featureful and more actively maintained. Zef can also install Perl
    6 modules off CPAN although the CPAN-side support is in its early days. There
    is a zef branch (pull requests welcome!) and a tarball at:

    Panda has been patched to warn that it will be removed and to advise the use of
    zef. Of course anyone who really wants to use panda can reinstall it using zef.

    The modules inside R* haven't changed much in a while. I am considering adding
    DateTime::Format (shown by ecosystem stats to be widely used) and
    HTTP::UserAgent (probably the best pure perl6 web client library right now).
    Maybe some modules should also be removed (although this tends to be more
    controversial!). I am also wondering about OpenSSL support (if the library is

    p6doc needs some more love as a command line utility since most of the focus
    has been on the website docs and in fact some of these changes have impacted
    adversely on command line use, eg. under Windows cmd.exe "perl 6" is no longer
    correctly displayed by p6doc. I wonder if the website generation code should be
    decoupled from the pure docs and p6doc command line (since R* has to ship any
    new modules used by the website). p6doc also needs a better and faster search
    (using sqlite?). R* also ships some tutorial docs including a PDF generated from
    We only ship the English one and localisation to other languages could be

    Currently R* is released roughly every three months (unless significant
    breakage leads to a bug fix release). Problems tend to happen with the
    less widely used systems (Windows and the various BSDs) and also with the
    module installers and some modules. R* is useful in spotting these issues
    missed by roast. Rakudo itself is still in rapid development. At some point a
    less frequently updated distribution (Star LTS or MTS?) will be needed for
    Linux distribution packagers and those using R* in production. There are also
    some question marks over support for different language versions (6.c and 6.d).

    Above all what R* (and Rakudo Perl 6 in general) needs is more people spending
    more time working on it! JDFI! Hopefully this blog post might
    encourage more people to get involved with github pull requests.

    Feedback, too, in the comments below is actively encouraged.