pl6anet

Perl 6 RSS Feeds

Steve Mynott (Freenode: stmuk) steve.mynott (at)gmail.com / 2017-12-11T03:11:22


Perl 6 Advent Calendar: Day 11 — All the Stars of Perl 6, or * ** *

Published by andrewshitov on 2017-12-11T00:00:28

Today’s blog post in this year’s Perl 6 Advent Calendar is full of snowflakes. We’ll go through the constructs that employ the * character. In Perl 6, you may call it a star (or asterisk if you prefer) or a whatever, depending on the context.

Perl 6 is not a cryptic programming language; its syntax is in many aspects much more consistent than that of Perl 5. On the other hand, some areas require spending time before you start feeling confident in the syntax.

Let’s go through the different usages of *, starting with the simplest and aiming to understand the most brain-breaking ones, such as * ** *.

The first couple of usages are simple and do not require much comment:

1. Multiplication

A single star is used for multiplication. Strictly speaking, this is an infix operator infix:<*>, whose return value is Numeric.

say 20 * 18; # 360

2. Power

The double star ** is the exponentiation operator. Again, this is an infix:<**> that returns the Numeric result, calculating the power for the given two values.

say pi ** e; # 22.4591577183611

* * *

The same two tokens, * or **, are also used in regexes, where they mean different things. One of the features of Perl 6 is that it can easily switch between different languages inside itself. Both regexes and grammars are examples of such inner languages, where the same symbols can mean different things from what they mean in Perl 6 itself (if they have any meaning at all there).

3. Zero or more repetitions

The * quantifier works just as it does in Perl 5: it allows zero or more repetitions of the atom.

my $weather = '*****';
my $snow = $weather ~~ / ('*'*) /;
say 'Snow level is ' ~ $snow.chars; # Snow level is 5

Of course, we also see here another usage of the same character, the '*' literal.

4. Min to max repetitions

The double ** is a part of another quantifier that specifies the minimum and the maximum number of repetitions:

my $operator = '..';
say "'$operator' is a valid Perl 6 operator"
    if $operator ~~ /^ '.' ** 1..3 $/;

In this example, it is expected that the dot is repeated one, two, or three times; not less and not more.

Let’s look a bit ahead, and use a star in the role (role as in theatre, not in Perl 6’s object-oriented programming) of the Whatever symbol:

my $phrase = 'I love you......';
say 'You are so uncertain...'
    if $phrase ~~ / '.' ** 4..* /;

The second end of the range is open, and the regex accepts all the phrases with four or more dots in them.

5. Slurpy arguments

A star before an array argument in a sub’s signature means a slurpy argument—the one that consumes separate scalar parameters into a single array.

list-gifts('chocolade', 'ipad', 'camelia', 'perl6');

sub list-gifts(*@items) {
    say 'Look at my gifts this year:';
    .say for @items;
}

Hashes also allow celebrating slurpy parameters:

dump(alpha => 'a', beta => 'b'); # Prints:
                                 # alpha = a
                                 # beta = b

sub dump(*%data) {
    for %data.kv {say "$^a = $^b"}
}

Notice that unlike Perl 5, the code does not compile if you omit the star in the function signature, as Perl 6 expects exactly what is announced:

Too few positionals passed; expected 1 argument but got 0

6. Slurpy-slurpy

The **@ form also works, but notice the difference when you pass arrays or lists.

With a single star:

my @a = < chocolade ipad >;
my @b = < camelia perl6 >;

all-together(@a, @b);
all-together(['chocolade', 'ipad'], ['camelia', 'perl6']);
all-together(< chocolade ipad >, < camelia perl6 >);

sub all-together(*@items) {
    .say for @items;
}

Here, each gift is printed on a separate line regardless of the way the argument list was passed.

With a double star:

keep-groupped(@a, @b);
keep-groupped(['chocolade', 'ipad'], ['camelia', 'perl6']);
keep-groupped(< chocolade ipad >, < camelia perl6 >);

sub keep-groupped(**@items) {
    .say for @items;
}

This time, the @items array gets two elements only, reflecting the structural types of the arguments:

[chocolade ipad]
[camelia perl6]

or

(chocolade ipad)
(camelia perl6)

7. Dynamic scope

The * twigil introduces dynamic scope. It is easy to confuse dynamic variables with global variables, but examine the following code.

sub happy-new-year() {
    "Happy new $*year year!"
}

my $*year = 2018;
say happy-new-year();

If you omit the star, the code cannot be run:

Variable '$year' is not declared

The only way to make it correct is to move the definition of $year above the function definition. With the dynamic variable $*year, the place where the function is called defines the result. The $*year variable is not visible in the outer scope of the sub, but it is quite visible in the dynamic scope.
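For contrast, here is a minimal sketch of the purely lexical version described above, with the declaration moved before the sub and no twigil involved:

my $year = 2018;          # must be declared before the sub that uses it

sub happy-new-year() {
    "Happy new $year year!"
}

say happy-new-year();     # Happy new 2018 year!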

For the dynamic variable, it is not important whether you assign a new value to an existing variable or create a new one:

sub happy-new-year() {
    "Happy new $*year year!"
}

my $*year = 2018;
say happy-new-year();

{
    $*year = 2019;        # New value
    say happy-new-year(); # 2019
}

{
    my $*year = 2020;     # New variable
    say happy-new-year(); # 2020
}

8. Compiler variables

A number of dynamic pseudo-constants come with Perl 6, for example:

say $*PERL;      # Perl 6 (6.c)
say @*ARGS;      # Prints command-line arguments
say %*ENV<HOME>; # Prints home directory

9. All methods

The .* postfix pseudo-operator calls all the methods with the given name that can be found for the given object, and returns a list of results. In the trivial case you get scholastically absurd code:

6.*perl.*say; # (6 Int.new)

The code with stars behaves a bit differently from doing it simply and clearly:

pi.perl.say; # 3.14159265358979e0 (notice the scientific
             # format, unlike pi.say)

The real power of the .* postfix comes with inheritance. It helps to reveal the truth sometimes:

class Present {
    method giver() {
        'parents'
    }
}

class ChristmasPresent is Present {
    method giver() {
        'Santa Claus'
    }
}

my ChristmasPresent $present;

$present.giver.say;             # Santa Claus
$present.*giver.join(', ').say; # Santa Claus, parents

Just a star but what a difference!

* * *

Now, to the most mysterious part of the star corner of Perl 6. The next two concepts, the Whatever and the WhateverCode classes, are easy to mix up with each other. Let’s try to do it right.

10. Whatever

A single * can represent Whatever. Whatever in Perl 6 is a predefined class, which introduces some prescribed behaviour in a few useful cases.

For example, in ranges and sequences, the final * means infinity. We’ve seen an example today already. Here is another one:

.say for 1 .. *;

This one-liner has a really high energy conversion efficiency as it generates an infinite list of increasing integers. Press Ctrl+C when you are ready to move on.
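If you would rather not reach for Ctrl+C, you can take a finite, lazy slice of the same construct; a small sketch:

.say for (1 .. *)[^5];    # prints 1 through 5
say (1 .. *).max;         # Inf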

The range 1 .. * is the same as 1 .. Inf. You can clearly see that if you go to the Rakudo Perl 6 sources and find the following definitions in the implementation of the Range class in the src/core/Range.pm file:

multi method new(Whatever \min,Whatever \max,:$excludes-min,:$excludes-max){
    nqp::create(self)!SET-SELF(-Inf,Inf,$excludes-min,$excludes-max,1);
}
multi method new(Whatever \min, \max, :$excludes-min, :$excludes-max) {
    nqp::create(self)!SET-SELF(-Inf,max,$excludes-min,$excludes-max,1);
}
multi method new(\min, Whatever \max, :$excludes-min, :$excludes-max) {
    nqp::create(self)!SET-SELF(min,Inf,$excludes-min,$excludes-max,1);
}

The three multi constructors describe the three cases: * .. *, * .. $n and $n .. *, which are immediately translated to -Inf .. Inf, -Inf .. $n and $n .. Inf.

As a side Christmas tale, here’s a tiny excursus showing that * is not just an Inf. There were two commits to src/core/Whatever.pm:

First, on 16 September 2015, “Make Whatever.new == Inf True:”

  my class Whatever {
      multi method ACCEPTS(Whatever:D: $topic) { True }
      multi method perl(Whatever:D:) { '*' }
+     multi method Numeric(Whatever:D:) { Inf }
  }

A few weeks later, on 23 October 2015, came “* no longer defaults to Inf.” This is to protect extensibility of * to other dwimmy situations:

  my class Whatever {
      multi method ACCEPTS(Whatever:D: $topic) { True }
      multi method perl(Whatever:D:) { '*' }
-     multi method Numeric(Whatever:D:) { Inf }
  }

Returning to our more practical problems, let’s create our own class that makes use of the whatever symbol *. Here is a simple example of a class with a multi-method taking either an Int value or a Whatever.

class N {
    multi method display(Int $n) {
        say $n;
    }

    multi method display(Whatever) {
        say 2000 + 100.rand.Int;
    }
}

In the first case, the method simply prints the value. The second method prints a random number between 2000 and 2100 instead. As the only argument of the second method is Whatever, no variable is needed in the signature.

Here is how you use the class:

my $n = N.new;
$n.display(2018);
$n.display(*);

The first call echoes its argument, while the second one prints something random.

The Whatever symbol can be held as a bare Whatever. Say, you create an echo function and pass the * to it:

sub echo($x) {
    say $x;
}

echo(2018); # 2018
echo(*);    # *

This time, no magic happens, the program prints a star.

And now we are at a point where a tiny thing changes a lot.

11. WhateverCode

Finally, it’s time to talk about WhateverCode.

Take an array and print the last element of it. If you do it in the Perl 5 style, you’d type something like @a[-1]. In Perl 6, that generates an error:

Unsupported use of a negative -1 subscript
to index from the end; in Perl 6 please
use a function such as *-1

The compiler suggests using a function such as *-1. Is it a function? Yes; more precisely, it is a WhateverCode block:

say (*-1).WHAT; # (WhateverCode)

Now, print the second half of an array:

my @a = < one two three four five six >;
say @a[3..*]; # (four five six)

An array is indexed with the range 3..*. The Whatever star as the right end of the range means to take all the rest from the array. The type of 3..* is a Range:

say (3..*).WHAT; # (Range)

Finally, take one element less. We’ve already seen that to specify the last element a function such as *-1 must be used. The same can be done at the right end of the range:

say @a[3 .. *-2]; # (four five)

At this point, the so-called Whatever-currying happens and a Range becomes a WhateverCode:

say (3 .. *-2).WHAT; # (WhateverCode)

WhateverCode is a built-in Perl 6 class name; it can easily be used for method dispatching. Let’s update the code from the previous section and add a method variant that expects a WhateverCode argument:

class N {
    multi method display(Int $n) {
        say $n;
    }

    multi method display(Whatever) {
        say 2000 + 100.rand.Int;
    }

    multi method display(WhateverCode $code) {
        say $code(2000 + 100.rand.Int);
    }
}

Now, the star in the argument list falls into either display(Whatever) or display(WhateverCode):

N.display(2018);     # display(Int $n)

N.display(*);        # display(Whatever)

N.display(* / 2);    # display(WhateverCode $code)
N.display(* - 1000); # display(WhateverCode $code)

Once again, look at the signature of the display method:

multi method display(WhateverCode $code)

The $code argument is used as a function reference inside the method:

say $code(2000 + 100.rand.Int);

The function takes an argument but where is it going to? Or, in other words, what and where is the function body? We called the method as N.display(* / 2) or N.display(* - 1000). The answer is that both * / 2 and * - 1000 are functions! Remember the compiler’s hint about using a function such as *-1?

The star here becomes the first function argument, and thus * / 2 is equivalent to {$^a / 2}, while * - 1000 is equivalent to {$^a - 1000}.
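Since a WhateverCode is a Callable like any other, you can store it in a &-sigilled variable and call it; a tiny sketch (the names here are made up):

my &halve = * / 2;
my &decrement = * - 1;
say halve(10);      # 5
say decrement(10);  # 9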

Does it mean that $^b can be used next to $^a? Sure! Make the WhateverCode block accept two arguments. How do you indicate the second of them? Not a surprise, with another star! Let us add the fourth variant of the display method to our class:

multi method display(WhateverCode $code 
                     where {$code.arity == 2}) {
    say $code(2000, 100.rand.Int);
}

Here, the where block is used to narrow the dispatching down to select only those WhateverCode blocks that have two arguments. Having this done, two snowflakes are allowed in the method call:

N.display( * + * );
N.display( * - * );

The calls define the function $code that is used to calculate the result. So, the actual operation behind the N.display( * + * ) is the following: 2000 + 100.rand.Int.

Need more snow? Add more stars:

N.display( * * * );
N.display( * ** * );

Similarly, the actual calculations inside are:

2000 * 100.rand.Int

and

2000 ** 100.rand.Int

Congratulations! You can now parse the * ** * construct as effortlessly as the compiler does it.

Homework

Perl 6 gave us so many Christmas gifts so far. Let’s make an exercise in return and answer the question: What does each star mean in the following code?

my @n = 
    ((0, 1, * + * ... *).grep: *.is-prime).map: * * * * *;
.say for @n[^5];

D’oh. I suggest we start transforming the code to get rid of all the stars and to use different syntax.

The * after the sequence operator ... means to generate the sequence infinitely, so replace it with Inf:

((0, 1, * + * ... Inf).grep: *.is-prime).map: * * * * *

The two stars in the * + * generator function can be replaced with a lambda with two explicit arguments:

((0, 1, -> $x, $y {$x + $y} ... Inf).grep: 
    *.is-prime).map: * * * * *

Now, a simple syntax alteration. Replace the .grep: with a method call with parentheses. Its argument *.is-prime becomes a code block, and the star is replaced with the default variable $_. Notice that no curly braces were needed while the code was using a *:

(0, 1, -> $x, $y {$x + $y} ... Inf).grep({
    $_.is-prime
}).map: * * * * *

Finally, the same trick for .map: but this time there are three arguments for this method, thus, you can write {$^a * $^b * $^c} instead of * * * * *, and here’s the new variant of the complete program:

my @n = (0, 1, -> $x, $y {$x + $y} ... Inf).grep({
        $_.is-prime
    }).map({
        $^a * $^b * $^c
    });
.say for @n[^5];

Now it is obvious that the code prints five products of the groups of three prime Fibonacci numbers.

Additional assignments

In textbooks, the most challenging tasks are marked with a *. Here are a couple of them for you to solve yourself.

  1. What is the difference between chdir('/') and &*chdir('/') in Perl 6?
  2. Explain the following Perl 6 code and modify it to demonstrate its advantages: .say for 1...**.

❄ ❄ ❄

That’s all for today. I hope that you enjoyed the power and expressiveness of Perl 6. Today, we talked about a single ASCII character only. Imagine how vast Perl 6’s Universe is if you take into account that the language offers the best Unicode support among today’s programming languages.

Enjoy Perl 6 today and spread the word! Stay tuned to the Perl 6 Advent Calendar; more articles are waiting for your attention, the next coming already tomorrow.

Andrew Shitov


Perl 6 Advent Calendar: Day 10 – Wrapping Rats

Published by bduggan on 2017-12-10T13:37:40

Going down chimneys is a dangerous business.

Chimneys can be narrow, high, and sometimes not well constructed to begin with.

This year, Santa wants to be prepared. Therefore, he is combining a chimney inspection with the delivery of presents.

A chimney inspection involves ensuring that every layer of bricks is at the correct height; i.e. that the layers of mortar are consistent, and that the bricks are also a consistent height.

For instance, for bricks that are 2¼” high, and mortar that is ⅜” thick, the sequence of measurements should look like this:

                       🎅 
                      ─██─
                       ||
 layer                                      total
       ░░░░░░░░░░ ░░░░░░░░░░░░░░░ ░░░░░░░░░░
  2¼   ░░░░░░░░░░ ░░░░░░░░░░░░░░░ ░░░░░░░░░░
       ░░░░░░░░░░ ░░░░░░░░░░░░░░░ ░░░░░░░░░░
   ⅜                                        ‾‾???
       ░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░░
  2¼   ░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░░
       ░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░░
   ⅜                                        ‾‾5⅝
       ░░░░░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░
  2¼   ░░░░░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ 
       ░░░░░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░
   ⅜                                        ‾‾3
       ░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░
  2¼   ░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░ 
       ░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░
   ⅜   _____________________________________‾‾⅜ 

The plan is for the Elves to do the dangerous descent to the bottom, tape measure in hand, and then come back up, ensuring that the top of each brick layer is at precisely the correct place on the tape measure.

One particular Elf, named Elvis, has taken it upon himself to write a program to help out with the task of computing this sequence of heights.

Being lazy, Elvis did not even want to add any of the fractions above, and wanted the program to do all the work. He also did not want to exert the mental effort required to figure out the formula for the height of each layer. Luckily, he was using Perl 6, which properly turns unicode fractions into rational numbers (type Rat), and also has a sequence operator (...) which figures out arithmetic sequences based on the first few terms.

So, Elvis’ first cut at the program looked like this:

my @heights = 0, ⅜, 3, 5+⅝ ... *;

say @heights[^10].join(', ')

This gave him the first 10 heights that he needed:

0, 0.375, 3, 5.625, 8.25, 10.875, 13.5, 16.125, 18.75, 21.375

While this was correct, it was hard to use. The tape measure had fractions of an inch, not decimals. The output Elvis really wanted was fractions.

Fortunately, he knew that using join turned the Rats into strings. Turning a Rat into a Str is done by calling the Str method of the Rat class. So, by modifying the behavior of Rat.Str, he figured he could make the output prettier.

The way he decided to do this was to wrap the Str method (aka using the decorator pattern), like so:

Rat.^find_method('Str').wrap:
  sub ($r) {
    my $whole = $r.Int || "";
    my $frac = $r - $whole;
    return "$whole" unless $frac > 0;
    return "$whole" ~ <⅛ ¼ ⅜ ½ ⅝ ¾ ⅞>[$frac * 8 - 1];
  }

In other words, when stringifying a Rat, return the whole portion unless there is a fractional portion. Then treat the fractional portion as the number of eighths, and use that as an index into an array to look up the right unicode fraction.
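As a quick sanity check (the values are arbitrary, and this assumes the wrap above has already been applied), individual Rats now stringify with the fraction glyphs:

say (5 + ⅝).Str;   # 5⅝
say (8 + ¼).Str;   # 8¼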

He combined that with his first program to get this sequence of heights:

0, ⅜, 3, 5⅝, 8¼, 10⅞, 13½, 16⅛, 18¾, 21⅜

“Hooray!” he thought. “Exactly what I need.”

Santa took a look at the program and said “Elvis, this is clever, but not quite enough. While most brick dimensions are multiples of ⅛, that might not be true of mortar levels. Can you make your program handle those cases, too?”

“Sure” said Elvis with a wry smile. And he added this line into his wrapper function:

return "$whole {$frac.numerator}⁄{$frac.denominator}"
   unless $frac %% ⅛;

using the “is divisible by” operator (%%) to ensure that the fraction was evenly divisible into eighths, and if not, to just print the numerator and denominator explicitly. Then for mortar that was ⅕” thick, the sequence became:

my @heights = 0, ⅕,
                 ⅕ + 2+¼ + ⅕,
                 ⅕ + 2+¼ + ⅕
                   + 2+¼ + ⅕ ... *;
say @heights[^10].join(', ');
0,  1⁄5, 2 13⁄20, 5 1⁄10, 7 11⁄20, 10, 12 9⁄20, 14 9⁄10, 17 7⁄20, 19 4⁄5

“Actually”, Santa said, “now that I look at it, maybe this isn’t useful — the tape measure only has sixteenths of an inch, so it would be better to round to the nearest sixteenth of an inch.”

[Image: tape measure]

Elvis added a call to round to end up with:


Rat.^find_method('Str').wrap:
  sub ($r) {
        my $whole = $r.Int || '';
        my $frac = $r - $whole;
        return "$whole" unless $frac > 0;
        my $rounded = ($frac * 16).round/16;
        return "$whole" ~ <⅛ ¼ ⅜ ½ ⅝ ¾ ⅞>[$frac * 8 - 1] if $rounded %% ⅛;
        return "$whole {$rounded.numerator}⁄{$rounded.denominator}";
  }

which gave him

0,  3⁄16, 2⅝, 5⅛, 7 9⁄16, 10, 12 7⁄16, 14⅞, 17¼, 19 13⁄16

He showed his program to Elvira the Elf who said, “What a coincidence, I wrote a program that is almost exactly the same! Except, I also wanted to know where the bottoms of the layers of bricks are. I couldn’t use a sequence operator for this, since it isn’t an arithmetic progression, but I could use a lazy gather and an anonymous stateful variable! Like this:


my \brick = 2 + ¼;
my \mortar = ⅜;
my @heights = lazy gather {
    take 0;
    loop { take $ += $_ for mortar, brick }
}

Elvira’s program produced:

0, ⅜, 2⅝, 3, 5¼, 5⅝, 7⅞, 8¼, 10½, 10⅞

i.e. both the tops and the bottoms of the layers of bricks:

                     \ 🎅 /
                       ██
                       ||
 layer                                      total
       ░░░░░░░░░░ ░░░░░░░░░░░░░░░ ░░░░░░░░░░
  2¼   ░░░░░░░░░░ ░░░░░░░░░░░░░░░ ░░░░░░░░░░
       ░░░░░░░░░░ ░░░░░░░░░░░░░░░ ░░░░░░░░░░
   ⅜                                        ‾‾8¼
       ░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░░‾‾7⅞
  2¼   ░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░░
       ░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░░
   ⅜                                        ‾‾5⅝
       ░░░░░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░‾‾5¼
  2¼   ░░░░░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ 
       ░░░░░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░
   ⅜                                        ‾‾3
       ░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░‾‾2⅝
  2¼   ░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░ 
       ░░░░░░ ░░░░░░░░░░░░░ ░░░░░░░░░░░░ ░░░
   ⅜   _____________________________________‾‾⅜
                                            ‾‾0

With their programs in hand, the Elves checked out the chimneys and Santa made it through another holiday season without any injuries.


Perl 6 Advent Calendar: Day 9 – HTTP and Web Sockets with Cro

Published by jnthnwrthngtn on 2017-12-09T00:01:28

It’s not only Christmas time when gifts are given. This summer at the Swiss Perl Workshop – beautifully situated up in the Alps – I had the pleasure of revealing Cro. Cro is a set of libraries for building services in Perl 6, together with a little development tooling to stub, run, and trace services. Cro is initially focused on building services with HTTP (including HTTP/2.0) and web sockets, but early support for ZeroMQ is available, and a range of other options are planned for the future.

Reactive pipelines

Cro follows the Perl design principle of making the easy things easy, and the hard things possible. Much like Git, Cro can be thought of as having porcelain (making the easy things easy) and plumbing (making the hard things possible). The plumbing level consists of components that are composed to form pipelines. The components come in different shapes, such as sources, transforms, and sinks. Here’s a transform that turns a HTTP request into a HTTP response:

use Cro;
use Cro::HTTP::Request;
use Cro::HTTP::Response;
class MuskoxApp does Cro::Transform {
    method consumes() { Cro::HTTP::Request }
    method produces() { Cro::HTTP::Response }
    method transformer(Supply $pipeline --> Supply) {
        supply whenever $pipeline -> $request {
            given Cro::HTTP::Response.new(:$request, :200status) {
                .append-header: "Content-type", "text/html";
                .set-body: "Muskox Rocks!\n".encode('ascii');
                .emit;
            }
        }
    }
}

Now, let’s compose it with a TCP listener, a HTTP request parser, and a HTTP response serializer:

use Cro::TCP;
use Cro::HTTP::RequestParser;
use Cro::HTTP::ResponseSerializer;
my $server = Cro.compose:
    Cro::TCP::Listener.new(:port(4242)),
    Cro::HTTP::RequestParser.new,
    MuskoxApp,
    Cro::HTTP::ResponseSerializer;

That gives back a Cro::Service, which we can now start, and stop upon Ctrl+C:

$server.start;
react whenever signal(SIGINT) {
    $server.stop;
    exit;
}

Run it. Then curl it.

$ curl http://localhost:4242/
Muskox Rocks!

Not bad. But what if we wanted a HTTPS server? Provided we’ve got key and certificate files handy, that’s just a case of replacing the TCP listener with a TLS listener:

use Cro::TLS;
my $server = Cro.compose:
    Cro::TLS::Listener.new(
        :port(4242),
        :certificate-file('certs-and-keys/server-crt.pem'),
        :private-key-file('certs-and-keys/server-key.pem')
    ),
    Cro::HTTP::RequestParser.new,
    MuskoxApp,
    Cro::HTTP::ResponseSerializer;

Run it. Then curl -k it.

$ curl -k https://localhost:4242/
Muskox Rocks!

And middleware? That’s just another component to compose into the pipeline. Or, seen another way, with Cro everything is middleware. Even the request parser or response serializer can be easily replaced, should the need arise (which sounds like an odd thing to need, but that’s effectively what implementing FastCGI would involve).
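For instance, response middleware that stamps an extra header onto everything flowing towards the serializer can be written as just another transform, following the same consumes/produces/transformer pattern shown above (the class name and header value here are made up for illustration):

use Cro;
use Cro::HTTP::Response;
class AddServerHeader does Cro::Transform {
    method consumes() { Cro::HTTP::Response }
    method produces() { Cro::HTTP::Response }
    method transformer(Supply $responses --> Supply) {
        supply whenever $responses -> $response {
            # Add a header to every outgoing response, then pass it along.
            $response.append-header: 'Server', 'muskox';
            emit $response;
        }
    }
}

It would then be composed into the pipeline between MuskoxApp and Cro::HTTP::ResponseSerializer.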

So, that’s how Cro is plumbed. It also requires an amount of boilerplate to work at this level. Bring in the porcelain!

HTTP server, the easy way

The Cro::HTTP::Server class gets rid of the boilerplate of building the HTTP processing pipeline. The example from earlier becomes just:

use Cro;
use Cro::HTTP::Server;
class MuskoxApp does Cro::Transform {
    method consumes() { Cro::HTTP::Request }
    method produces() { Cro::HTTP::Response }
    method transformer(Supply $pipeline --> Supply) {
        supply whenever $pipeline -> $request {
            given Cro::HTTP::Response.new(:$request, :200status) {
                .append-header: "Content-type", "text/html";
                .set-body: "Muskox Rocks!\n".encode('ascii');
                .emit;
            }
        }
    }
}

my $server = Cro::HTTP::Server.new: :port(4242), :application(MuskoxApp);
$server.start;
react whenever signal(SIGINT) {
    $server.stop;
    exit;
}

There’s no magic here; it really is just a more convenient way to compose a pipeline. And while that’s only so much of a saving for HTTP/1.*, a HTTP/2.0 pipeline involves some more components, and a pipeline that supports both is a bit more involved still. By comparison, it’s easy to configure Cro::HTTP::Server to do HTTPS with support for both HTTP/1.1 and HTTP/2.0:

my %tls =
    :certificate-file('certs-and-keys/server-crt.pem'),
    :private-key-file('certs-and-keys/server-key.pem');
my $server = Cro::HTTP::Server.new: :port(4242), :application(MuskoxApp),
    :%tls, :http<1.1 2>;

The route to happiness

A web application in Cro is ultimately always a transform that turns a HTTP request into a HTTP response. It’s very rare to want to process all requests in exactly the same way, however. Typically, different URLs should be routed to different handlers. Enter Cro::HTTP::Router:

use Cro::HTTP::Router;
use Cro::HTTP::Server;
my $application = route {
    get -> {
        content 'text/html', 'Do you like dugongs?';
    }
}
my $server = Cro::HTTP::Server.new: :port(4242), :$application;
$server.start;
react whenever signal(SIGINT) {
    $server.stop;
    exit;
}

The object returned by a route block does the Cro::Transform role, meaning it would work just fine to use it with Cro.compose(...) plumbing too. It’s a good bit more convenient to write an application using the router, however! Let’s look at the get call a little more closely:

get -> {
    content 'text/html', 'Do you like dugongs?';
}

Here, get is saying that this handler will only deal with HTTP GET requests. The empty signature of the pointy block means no URL segments are expected, so this route only applies to /. Then, instead of having to make a response object instance, add a header, and encode a string, the content function does it all.

The router is built to take advantage of Perl 6 signatures, and also to behave in a way that will feel natural to Perl 6 programmers. Route segments are set up by declaring parameters, and literal string segments match literally:

get -> 'product', $id {
    content 'application/json', {
        id => $id,
        name => 'Arctic fox photo on canvas'
    }
}

A quick check with curl shows that it takes care of serializing the JSON for us also:

$ curl http://localhost:4242/product/42
{"name": "Arctic fox photo on canvas","id": "42"}

The JSON body serializer is activated by the content type. It’s possible, and pretty straightforward, to implement and plug in your own body serializers.

Want to capture multiple URL segments? Slurpy parameters work too, which is handy in combination with static for serving static assets, perhaps multiple levels of directories deep:

get -> 'css', *@path {
    static 'assets/css', @path;
}

Optional parameters work for segments that may or may not be provided. Using subset types to constrain the allowed values works too. And Int will only accept requests where the value in the URL segment parses as an integer:

get -> 'product', Int $id {
    content 'application/json', {
        id => $id,
        name => 'Arctic fox photo on canvas'
    }
}

Named parameters are used to receive query string arguments:

get -> 'search', :$query {
    content 'text/plain', "You searched for $query";
}

Which would be populated in a request like this:

$ curl http://localhost:4242/search?query=llama
You searched for llama

These too can be type constrained and/or made required (named parameters are optional by default in Perl 6). The Cro router tries to help you do HTTP well by giving a 404 error for failure to match the URL segments, a 405 (method not allowed) when the segments would match but the wrong method is used, and a 400 when the method and segments are fine, but there’s a problem with the query string. Named parameters, through use of the is header and is cookie traits, can also be used to accept and/or constrain headers and cookies.
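For example, a route that peeks at the client’s User-Agent header might look something like this (a sketch based on the is header trait mentioned above; the route itself is made up):

get -> 'greet', :$user-agent is header {
    content 'text/plain', "Hello, $user-agent user!";
}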

Rather than chugging through the routes one at a time, the router compiles all of them into a Perl 6 grammar, which means routes are matched using an NFA. Further, it means that the Perl 6 longest literal prefix rules apply, so:

get -> 'product', 'index' { ... }
get -> 'product', $what { ... }

Will always prefer the first of those two for a request to /product/index, even if you wrote them in the opposite order:

get -> 'product', $what { ... }
get -> 'product', 'index' { ... }

Middleware made easier

It’s fun to say that HTTP middleware is just a Cro::Transform, but it’d be less fun to write if that was all Cro had to offer. Happily, there are some easier options. A route block can contain before and after blocks, which will run before and after any of the routes in the block have been processed. So, one could add HSTS headers to all responses:

my $application = route {
    after {
        header 'Strict-transport-security', 'max-age=31536000; includeSubDomains';
    }
    # Routes here...
}

Or respond with a HTTP 403 Forbidden for all requests without an Authorization header:

my $application = route {
    before {
        unless .has-header('Authorization') {
            forbidden 'text/plain', 'Missing authorization';
        }
    }
    # Routes here...
}

Which behaves like this:

$ curl http://localhost:4242/
Missing authorization
$ curl -H"Authorization: Token 123" http://localhost:4242/
Do you like dugongs?

It’s all just a Supply chain

All of Cro is really just a way of building up a chain of Perl 6 Supply objects. While the before and after middleware blocks are convenient, writing middleware as a transform provides access to the full power of the Perl 6 supply/whenever syntax. Thus, should you ever need to take a request with a session token and make an asynchronous call to a session database, and only then either emit the request for further processing (or do a redirection to a login page), it’s possible to do it – in a way that doesn’t block other requests (including those on the same connection).

In fact, Cro is built entirely in terms of the higher-level Perl 6 concurrency features. There are no explicit threads and no explicit locks. Instead, all concurrency is expressed in terms of Perl 6 Supply and Promise, and it is left to the Perl 6 runtime library to scale the application over multiple threads.

Oh, and WebSockets?

It turns out Perl 6 supplies map really nicely to web sockets. So nicely, in fact, that Cro was left with relatively little to add in terms of API. Here’s how an (overly) simple chat server backend might look:

my $chat = Supplier.new;
get -> 'chat' {
    # For each request for a web socket...
    web-socket -> $incoming {
        # We start this bit of reactive logic...
        supply {
            # Whenever we get a message on the socket, we emit it into the
            # $chat Supplier...
            whenever $incoming -> $message {
                $chat.emit(await $message.body-text);
            }
            # ...and whatever is emitted on the $chat Supplier (shared between
            # all web sockets), we send on this web socket.
            whenever $chat -> $text {
                emit $text;
            }
        }
    }
}

Note that doing this needs a use Cro::HTTP::Router::WebSocket; to import the module providing the web-socket function.

In summary

This is just a glimpse at what Cro has to offer. There wasn’t space to talk about the HTTP and web socket clients, the cro command line tool for stubbing and running projects, the cro web tool that provides a web UI for doing the same, or that if you stick CRO_TRACE=1 into your environment you get lots of juicy debugging details about request and response processing.

To learn more, check out the Cro documentation, including a tutorial about building a Single Page Application. And if you’ve more questions, there’s also a recently-created #cro IRC channel on Freenode.


Strangely Consistent: Has it been three years?

Published by Carl Mäsak

007, the toy language, is turning three today. Whoa.

On its one-year anniversary, I wrote a blog post to chronicle it. It seriously doesn't feel like two years since I wrote that post.

On and off, in between long stretches of just being a parent, I come back to 007 and work intensely on it. I can't remember ever keeping a side project alive for three years before. (Later note: Referring to the language here, not my son.) So there is that.

So in a weird way, even though the language is not as far along as I would expect it to be after three years, I'm also positively surprised that it still exists and is active after three years!

In the previous blog post, I proudly announce that "We're gearing up to an (internal) v1.0.0 release". Well, we're still gearing up for v1.0.0, and we are closer to it. The details are in the roadmap, which has become much more detailed since then.

Noteworthy things that happened in these past two years:

Things that I'm looking forward to right now:

I tried to write those in increasing order of difficulty.

All in all, I'm quite eager to one day burst into #perl6 or #perl6-dev and actually showcase examples where macros quite clearly do useful, non-trivial things. 007 is and has always been about producing such examples, and making them run in a real (if toy) environment.

And while we're not quite over that hump yet, we're perceptibly closer than we were two years ago.

Belated addendum: Thanks and hugs to sergot++, to vendethiel++, to raiph++ and eritain++, for sharing the journey with me so far.

Perl 6 Advent Calendar: Day 8 – Adventures in NQP Land: Hacking the Rakudo Compiler

Published by tbrowder on 2017-12-08T00:00:54

With apologies to the old Christmas classic, “The Twelve Days of Christmas,” I give you the first line of a Perl 6 version:

On the first day of Christmas, my true love gave to me, a Perl table in a pod tree…

But the table I got wasn’t very pretty!

Background

My first real contact with Perl 6 came in the spring of 2015 when I decided to check on its status and found it was ready for prime time. After getting some experience with the language I started contributing to the docs in places where I could help. One of my first contributions to the docs was to clean up one of the tables which was not rendering nicely. During my experiments with pod tables on my local host I tried the following table:

=begin table
-r0c0 r0c1
=end table

which caused Perl 6 to throw an ugly, LTA (less than awesome) exception message:

"===SORRY!=== Cannot iterate object with P6opaque representation"

I worked around the problem but it nagged at me so I started investigating the guts of pod and tables. That led me to the source of the problem in github.com/rakudo/src/Perl6/Pod.nqp.

In fact, the real problem for many pod table issues turned out to be in that file.

Not Quite Perl (NQP)

nqp is an intermediate language used to build the Rakudo Perl 6 compiler. Its repository is found here. The rest of this article is about modifying nqp code in the rakudo compiler found in its repository here. Rakudo also has a website here.

Before getting too far I first read the available information about Rakudo and NQP here:

Then I started practicing nqp coding by writing and running small nqp files like this (file “hello.nqp”):

say("Hello, world!");

which, when executed, gave the expected results:

$ nqp hello.nqp
Hello, world!

Note that say() is one of the few nqp opcodes that doesn’t require the nqp:: prefix.
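Most other operations do need the prefix. Here is a slightly larger practice file in the same spirit, purely for illustration, using a few common nqp:: ops:

# Build a low-level list and walk it with explicit nqp:: ops.
my @gifts := nqp::list("gold", "frankincense", "myrrh");
my $i := 0;
while $i < nqp::elems(@gifts) {
    say(nqp::atpos(@gifts, $i));
    $i := $i + 1;
}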

Into the trenches

The purpose of the Perl6::Pod class, contained in the rakudo/src/Perl6/Pod.nqp file, is to take pod grammar matches and transform them into Perl 6 pod class definitions, in rakudo/src/core/Pod.pm, for further handling by renderers in Perl 6 land. For tables that means anything represented in any legal pod form as described by the Perl 6 documentation design Synopsis S26, the Perl 6 test suite specs, and the Perl 6 docs has to be transformed into a Perl 6 Pod::Block::Table class object with this form as described in file rakudo/src/core/Pod.pm:

configuration information
a header line with N cells
M content lines, each with N cells

I wanted the nqp table pod handling to be robust and able to automatically fix some format issues (with a warning to the author) or throw an exception (gracefully) with detailed information of problems to enable the author to fix the pod input.

Workspace and tools

I needed two cloned repositories: rakudo and roast. I also needed forks of those same repositories on github so I could create pull requests (PRs) for my changes. I found a very handy Perl 5 tool in CPAN module App::GitGot. Using got allowed me to easily set up all four repos. (Note that got requires that its target repo not exist either in the desired local directory or the user’s github account.) After configuring got I went to a suitable directory to contain both repos and executed the following:

got fork https://github.com/rakudo/rakudo.git
got fork https://github.com/perl6/roast.git

which resulted in subdirectories rakudo and roast containing the cloned repos, and new forks of rakudo and roast on my github account. In the rakudo directory one can see the default setup for easy creation of PRs:

$ git remote -v
origin git@github.com:tbrowder/rakudo.git (fetch)
origin git@github.com:tbrowder/rakudo.git (push)
upstream https://github.com/rakudo/rakudo.git (fetch)
upstream https://github.com/rakudo/rakudo.git (push)

There are similar results in the roast repo.

Next, I moved the roast repo to be a subdirectory of rakudo (“rakudo/t/spec”) so it functions as a subgit of the local rakudo.

Finally, I created several bash scripts to ease configuring rakudo for installation in the local repo directory, setting the environment, and running tests:

(See all scripts mentioned here at https://github.com/tbrowder/nqp-tools.)

To complete the local working environment you will need to install some local modules, so you must change your path and install a local copy of the zef installer. Follow these steps in your rakudo directory (following advice from @Zoffix):

git clone https://github.com/ugexe/zef
export PATH=`pwd`/install/bin:$PATH
cd zef; perl6 -Ilib bin/zef install .
cd ..
export PATH=`pwd`/install/share/perl6/site/bin:$PATH
zef install Inline::Perl5

Then install any other module you need, e.g.,

zef install Debugger::UI::CommandLine
zef install Grammar::Debugger

Hacking

Now start hacking away. When ready for a build, execute:

make
make install

The make install step is critical because otherwise, with the local environment we set up, the new Perl 6 executables won’t be found.

Debugging for me was laborious, with rebuilds taking about three minutes each. The debugger (perl6-debug-m) would have been very useful but I could not install the required Debugger::UI::CommandLine module so it would be recognized by the locally-installed perl6-debug-m. The primary method I used was inserted print statements plus the --ll-exception option of perl6. Of major note, though, is that this author is a Perl 6 newbie who made lots of mistakes and did not always remember the fixes, hence this article. (Note I likely would have used the debugging tools but, at the time I started, I did not ask for help and did not have the advice provided above.)

Testing

It goes without saying that a good PR will include tests for the changes. I always create a roast branch with the same name as my rakudo branch. Then I submit both PRs and I refer to the roast PR in the rakudo PR and vice versa. I note for the roast PR that it requires the companion rakudo PR for it to pass all tests.

See Ref. 5 for much more detail on specialized test scripts for fudging and other esoteric testing matters.

Documentation

I try to keep the Perl 6 pod table documentation current with my fixes.

NQP lessons learned

Success!

I have now had two major Perl 6 POD (and roast) PRs accepted and merged, and am working on one more “easy” one which I should finish this week. The PRs are:

  1. Rakudo PR #1240

The Rakudo PR provided fixes for RTs #124403, #128221, #132341, #13248, and #129862. It was accompanied by roast PR #353.

The PR allowed the problem table above to be rendered properly. It also added warnings for questionable tables, added the Rakudo environment variable RAKUDO_POD6_TABLE_DEBUG to aid users in debugging tables (see docs, User Debugging), and allows short rows with empty columns to be rendered properly.

  2. Rakudo PR #1287

The Rakudo PR provided a fix for Rakudo repo issue #1282. It was accompanied by roast PR #361. (Note that roast PR #361 is not yet merged.)

The PR allows table visual column separators (‘|’) and (‘+’) to be used as cell data by escaping them in the pod source.

Summary

Merry Christmas and Happy Hacking!

Credits

Any successful inputs I have made are due to all the Perl 6 core developers and the friendly folks on IRC (#perl6, #perl6-dev).

References

  1. JWs Perl 6 debugger Advent article
  2. JWs Rakudo debugger module Debugger::UI::CommandLine
  3. JWs grammar debugger module Grammar::Debugger
  4. Testing Rakudo
  5. Contributing to roast
  6. Github guide to pull requests (PRs)
  7. Perl 6 documentation (docs)

Appendix

POD tools

Major Perl 6 POD renderers


Perl 6 Advent Calendar: Day 7 – Test All The Things

Published by drforr on 2017-12-07T00:00:21

Perl 6, like its big sister Perl 5, has a long tradition of testing. When you install any Perl module, the installer normally runs that module’s test suite. And of course, as a budding Perl 6 module author, you’ll want to create your own test suite. Or maybe you’ll be daring and create your test suite before creating your module. This actually has several benefits, chief among them being that the test suite becomes your module’s very first user, even before the module is written.

Before getting to actual code, though, I’d like to mention two shell aliases that I use very frequently –

alias 6='perl6 -Ilib'
alias 6p="prove -e'perl6 -Ilib'"

These aliases let me run a test file quickly without having to go to the trouble of installing my code. If I’m in a project directory, I can just run

$ 6 t/01-core.t
ok 1 - call with number
ok 2 - call with text
ok 3 - call with formatted string
1..3

and it’ll tell me what tests I’ve run and whether they all passed or not. Perl 6, just like its big sister Perl 5, uses the ‘t/’ directory for test files, and by convention the suffix ‘.t’ to distinguish test files from packages or scripts. It’s also got a built-in unit testing module, which we used above. If we were testing the sprintf() internal, it might look something like

use Test;

ok sprintf(1), 'call with number';
ok sprintf("text"), 'call with text';
ok sprintf("%d",1), 'call with formatted string';

done-testing;

The ok and done-testing functions are exported for us automatically. I’m using canonical Perl 6 style here, not relying too much on parentheses. In this case I do need to use parentheses to make sure sprintf() doesn’t “think” ’empty call’ is its argument.

ok takes just two arguments, the truthiness of what you want to test, and an optional message. If the first argument is anything that evaluates to True, the test passes. Otherwise… you know. The message is just text that describes the test. It’s purely optional, but it can be handy when the test fails as you can search for that string in your test file and quickly track down the problem. If you’re like the author, though, the line number is more valuable, so when you see

not ok 1 - call with number
# Failed test 'call with number'
# at test.t line 4
ok 2 - call with text
ok 3 - call with formatted string
1..3

in your test, you can immediately jump to line 4 of the test file and start editing to find out where the problem is. This gets more useful as your test files grow larger and larger, such as my tests for the Common Lisp version of (format) that I’m writing, with 200+ tests per test file and growing.

Finally, done-testing simply tells the Test module that we’re done testing, there are no more tests to come. This is handy when you’re just starting out and you’re constantly experimenting with your API, adding and updating tests. There’s no test counter to update each time or any other mechanics to keep track of.

It’s optional, of course, but other tools may use the ‘1..3’ at the end to prove that your test actually ran to completion. The prove tool is one; Jenkins unit testing and other systems may need that as well.
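If you prefer to declare the count up front instead, Test also exports plan; the trade-off is keeping that number in sync as you add tests. A small sketch reusing the tests from above:

use Test;

plan 3;

ok sprintf(1), 'call with number';
ok sprintf("text"), 'call with text';
ok sprintf("%d",1), 'call with formatted string';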

It depends…

on what your definition of ‘is’ is. While the ok test is fine if you’re only concerned with the truthiness of something, sometimes you need to dig a little deeper. Perl 6, just like its big sister, can help you there.

is 1 + 1, 2, 'prop. 54.43, Principia Mathematica';

doesn’t just check the truthiness of your test, it checks its value. While you could easily write this as

ok 1 + 1 == 2, 'prop. 54.43, Principia Mathematica';

using is makes your intent clear that you’re focusing on whether the expression 1+1 is equal to 2; with the ok version of the same statement, it’s unclear whether you’re testing the ‘1 + 1’ portion or the ‘==’ operator.

These two tests alone probably cover a good 80% of your testing needs. is handles basic lists and hashes with relative aplomb, and if you really need complex testing, its big sister is-deeply is waiting in the wings, ready to handle complex hash-array combinations.
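A minimal is-deeply sketch, with made-up data, in the same style as the examples above:

is-deeply %( reindeer => <Dasher Rudolph>, sleighs => 1 ),
          %( reindeer => <Dasher Rudolph>, sleighs => 1 ),
          'complex structures compare as equal';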

Laziness and Impatience

Sometimes you’ve got a huge string, and you only need to check just a little bit of it.

ok 'Lake Chargoggagoggmanchauggagoggchaubunagungamaugg' ~~ /manchau/, 'my side';

You can certainly use the ~~ operator here. Just like ‘1 + 1 == 2’, though, your intent might not be clear. You can use the like method to make your intentions clear.

like 'Lake Chargoggagoggmanchauggagoggchaubunagungamaugg',
     /manchau/, 'my side';

and not have the ~~ dangling over the side of your boat.

DRYing out

After spending some time on and in beautiful Lake Chargoggagoggmanchauggagoggchaubunagungamaugg, you probably want to wring your clothes out. Test files tend to grow, especially regression tests. You might find yourself writing

is sprintf( "%s", '1' ), '1', "%s formats numbers";
is sprintf( "%s", '⅑' ), '⅑', "%s formats fractions";
is sprintf( "%s", 'Ⅷ' ), 'Ⅷ', "%s formats graphemes";
is sprintf( "%s", '三' ), '三', "%s formats CJKV";

That’s fine, copying and pasting (especially from StackOverflow) is a time-honored tradition, nothing wrong with that. Consider though, what happens when you add more tests with “%d” instead of “%s”, and since all of those strings are numbers, you just copy and paste the block, change “%s” to “%d” and go on.

is sprintf( "%s", '1' ), '1', "%s formats numbers';
# ...

is sprintf( "%d, '1' ), '1', "%d formats numbers';
# ...

So now you’ve got two sets of tests with the same names. Rather than editing all of the new “%d” tests, wouldn’t it be nice if we didn’t have to repeat ourselves in the first place?

subtest '%s', {
    is sprintf( "%s", '1' ), '1', "formats numbers";
    is sprintf( "%s", '⅑' ), '⅑', "formats fractions";
    is sprintf( "%s", 'Ⅷ' ), 'Ⅷ', "formats graphemes";
    is sprintf( "%s", '三' ), '三', "formats CJKV";
};

Now you just need to edit things in two places rather than three. If this has whetted your appetite for testing, I’d encourage you as well to check out Test All The Things at my personal site for more advanced testing paradigms and more advanced Perl 6 code. Also don’t forget to stay tuned for tomorrow’s Perl 6 Advent posting!

Thank you, and happy hacking!

DrForr aka Jeff Goff, The Perl Fisher


Weekly changes in and around Perl 6: 2017.49 Mischieventing

Published by liztormato on 2017-12-04T22:56:49

Zoffix Znet wrote the first of the Rakudo Perl 6 Advent posts and hit the Jackpot on Hacker News. Mischief achieved indeed! The Grinch would be so proud! And then there were also some Reddit comments.

Other Rakudo Perl 6 Advent posts so far:

Squashathon

Last weekend saw yet another Squashathon (yes, there is one on every first Saturday of the month). The goal of this Squashathon was to go through tickets that were not updated in two or more years (these tickets are automatically labeled with the MOLD tag). As a result, 121 out of 224 tickets (more than half) were updated. Most of the updates were simply about reproducing the issue with the current version of Rakudo Perl 6, but some tickets got more attention and received tests and fixes. A fuller overview of all Squashathons is also available. Kudos to Aleks-Daniel Jakimenko-Aleksejev for organizing and everybody else for participating!

Other Blog Posts

Core Developments

Meanwhile on FaceBook

Meanwhile on Twitter

Meanwhile on StackOverflow

Meanwhile on perl6-users

Winding Down

Yours truly spent most of the past week on the road, and recovering from a bad cold, and a bumpy overnight ferry. On the plus side, it was good to get out and see new stuff and meet old friends in faraway places. A bit like Rakudo Perl 6 and Pumpking Perl 5 really. See you next week for more Rakudo Perl 6 news!


Weekly changes in and around Perl 6: 2017.48 Community First

Published by liztormato on 2017-11-27T18:45:34

People visiting the London Perl Workshop this year know what this means: it was an excellent event with a lot of high quality Rakudo Perl 6 and Pumpking Perl 5 talks and workshops. Please check out the resulting blog posts about 2 days after the event:

Yours truly was touched by the many high quality conversations about the future of Perl, and how the importance of community was expressed by so many.

The Perl 6 Recipes Book

Andrew Shitov may be at it again. He intends to write a Perl 6 Recipes Book with the help of a Kickstarter Campaign. On FaceBook he explained:

Hi, let this year be a year of the Perl 6 books. I want to make a big addition to it and publish the Perl 6 Recipes book next year. Its structure will be based on the content of Perl (5) Cookbook and Using Perl 6 and of course will include all the new cool things available in Perl 6, from Unicode to ⚛ operations.

An effort well worth supporting!

A Guide To Parsing

raiph started a discussion on Reddit about Tomassetti‘s A Guide to Parsing: Algorithms and Terminology and how Rakudo Perl 6 parsing was distinctive relative to the landscape as outlined by that guide. He would like to get feedback from the Perl 6 community about his suggestions for inclusion in Tomassetti‘s guide. Please help him get the unique grammar features of Rakudo Perl 6 better known to the world!

A unified Supply concurrency model

Jonathan Worthington describes the work that has been sponsored by Vienna Perl Mongers: how it has replaced the two concurrency models that previously backed Supply with a single unified model, and so enabled new use cases of supply and react. As usual, highly recommended reading if you want to get a deeper grasp of some of Rakudo Perl 6 inner workings.

Let’s Go!

Ahmad M. Zawawi took a hint from Jonathan Worthington: it seems that an Inline::Go project is born! I’m assuming Ahmad can use all the help you all can give him!

Other Blog Posts

Core Developments

Meanwhile in an alternate universe

Off-topic Alert! Yours truly has been using email for 40 years now, as well as being able to chat with other people online. Or read discussion forums. Or visit “websites”. How was this possible? It’s all discussed in The Internet that wasn’t by Sharon Weinberger. Which is a review of The Friendly Orange Glow by Brian Dear: the Untold Story of the PLATO System and the Dawn of Cyberculture. Wow, what a trip down memory lane. We will now continue with our regular Rakudo Perl 6 related news.

Meanwhile on FaceBook

Meanwhile on Twitter

Meanwhile on StackOverflow

Meanwhile on perl6-users

Meanwhile on PerlMonks

Winding Down

Wow. What a week! Perhaps not too many core developments. But plenty of development around it. Hope to be able to top that again next week. So check in again then!


6guts: A unified and improved Supply concurrency model

Published by jnthnwrthngtn on 2017-11-24T00:54:34

Perl 6 encourages the use of high-level constructs when writing concurrent programs, rather than dealing with threads and locks directly. These not only aid the programmer in producing more correct and understandable programs, but they also afford those of us working on Perl 6 implementation the opportunity to improve the mechanisms beneath the constructs over time. Recently, I wrote about the new thread pool scheduler, and how improving it could bring existing programs lower memory usage, better performance, and the chance to scale better on machines with higher core counts.

In this post, I’ll be discussing the concurrency model beneath Supply, which is the Perl 6 mechanism for representing an asynchronous stream of values. The values may be packets arriving over a socket, output from a spawned process or SSH command, GUI events, file change notifications, ticks of a timer, signals, and plenty more besides. Giving all of these a common interface makes it easier to write programs that compose multiple asynchronous data sources, for example, a GUI application that needs to update itself when files change on disk, or a web server that should shut down cleanly upon a signal.

Until recently, there were actually two different concurrency models, one for supply blocks and one for all of the methods available on a Supply. Few people knew that, and fewer still had a grasp of what that meant. Unfortunately, neither model worked well with the Perl 6.d non-blocking await. Additionally, some developers using supply/react/whenever in their programs ran into a few things that they had expected would Just Work, which in reality did not.

Before digging in to the details, I’d like to take a moment to thank Vienna.pm for providing the funding that allowed me to dedicate a good bit of time to this task. It’s one of the trickiest things I’ve worked on in a while, and having a good chunk of uninterrupted time to focus on it was really helpful. So, thanks!

Supplies and concurrency

The first thing to understand about Supply, and supply blocks, is that they are a tool for concurrency control. The power of supply blocks (react also) is that, no matter how many sources of asynchronous data you tap using whenever blocks, you can be sure that only one incoming message will be processed at a time. The same principle operates with the various methods: if I Supply.merge($s1, $s2).tap(&some-code), then I know that even if $s1 or $s2 were to emit values concurrently, they will be pushed onward one at a time, and thus I can be confident that &some-code will be called with a value at a time.

These one-message-at-a-time semantics exist to enable safe manipulation of state. Any lexical variables declared within a supply block will exist separately for each time the supply block is tapped, and can be used safely inside of it. Furthermore, it’s far easier to write code that processes asynchronous messages when one can be sure the processing code for a given message will run to completion before the next message is processed.
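As a small illustration of both points, a supply block can safely update a lexical counter even though its messages arrive from two asynchronous sources (the intervals and messages here are arbitrary):

my $ticks = supply {
    my $count = 0;        # one $count per tap, touched by one message at a time
    whenever Supply.interval(1)   { emit "tick {++$count}" }
    whenever Supply.interval(1.5) { emit "tock {++$count}" }
}
$ticks.tap(&say);
sleep 5;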

Backpressure

Another interesting problem for any system processing asynchronous messages is that of backpressure. In short, how do we make a source of messages emit them at a rate no greater than that of the processing logic? The general principle with Supply is that the sender of a message pays the cost of its processing. So, if I have $source.map(&foo).tap(&bar), then whatever emits at the source pays the cost of the map of that message along with the processing done by the tap callback.

The principle is easy to state, but harder to deliver on. One of the trickiest questions revolves around recursion: what happens when a Supply ends up taking an action that results in it sending a message to itself? That may sound contrived, but it can happen very easily. When the body of a supply block runs, the whenever blocks trigger tapping of a supply. If the tapped Supply were to synchronously emit a message, we immediately have a problem: we can’t process it now, because that would violate the one-at-a-time rule. A real world example where this happens? An HTTP/2 client, where the frame serializer immediately emits the connection preface when tapped, to make sure it gets sent before anything else can be sent. (Notice how this in itself also relies on the non-interruption principle.) This example comes straight out of Cro’s HTTP/2 implementation.

A further issue is how we apply the backpressure. Do we block a real thread? Or can we do better? If we go doing real blocking of thread pool threads, we’ll risk exhausting the pool at worst, or in the better case force the program to create more threads (and so use more memory) than it really needs.

Where things stood

So, how did we do on these areas before my recent work? Not especially great, it turned out.

First, let’s consider the mechanism that was used for everything except the case of supply blocks. Supply processing methods generally check that their input Supply is serial – that is, delivered one message at a time – by calling its serial method. If not, they serialize it. (In fact, they all call the serialize method, which simply returns the Supply unchanged if serial is True, thus factoring out the check.) The upshot of this is that we only have to enforce the concurrency control once in a chain of operations that can’t introduce concurrency themselves. That’s good, and has been retained during my changes.

So, the interesting part is how serialize enforces one-at-a-time semantics. Prior to my recent work, it did this using a Lock. This has a few decent properties: locks are pretty well optimized, and they block a thread in an OS-aware way, meaning that the OS scheduler knows not to bother scheduling the waiting thread until the lock is released. They also have some less good properties. One is that using Lock blocks the use of Perl 6.d non-blocking await in any downstream code (a held Lock can’t migrate between threads), which was a major motivator to look for an alternative solution. Even if that were not an issue, the use of Lock really blocks up a thread, meaning that it will not be available for the thread pool to use for anything else. Last, but certainly not least, Lock is a reentrant mutex – meaning that we could end up violating the principle that a message is completely processed before the next message is considered in some cases!

For supply blocks, a different approach had been taken. The supply block (or react block) instance had a queue of messages to process. Adding to, or taking from, this queue was protected by a Lock, but that was only held in order to update the queue. Messages were not removed from the queue until they had been processed, which in turn provided a way of knowing if the block instance is “busy”. If a message was pushed to the block instance when it was busy, then it was put onto the queue…and that is all. So who paid for the message processing?

It turns out, it was handled by the thread that was busy in the supply block at the time that message arrived! This is pretty sensible if the message was a result of recursion. However, it could lead to violation of the principle that the sender of a message should pay for its processing costs. An unlucky sender could end up paying the cost of an unbounded number of messages of other senders! Interestingly enough, there haven’t been any bug reports about this, perhaps because most workloads simply don’t hit this unfairness, and those that do aren’t impacted by it anyway. Many asynchronous messages arrive on the thread pool, and it’s probably not going to cause much trouble if one thread ends up working away at a particular supply block instance that is being very actively used. It’s a thread pool, and some thread there will have to do the work anyway. The unfairness could even be argued to be good for memory caches!

Those arguments don’t justify the problems of the previous design, however. Queues are pretty OK at smoothing out peaks in workloads, but the stable states of a queue are being full and being empty, and this was an unbounded queue, so “full” would mean “out of memory”. Furthermore, there was no way to signal back towards a producer that it was producing too fast.

Towards a unified model: Lock::Async

So, how to do better? I knew I wanted a unified concurrency control mechanism to use for both serialize and supply/react blocks. It had to work well with non-blocking await in Perl 6.d. In fact, it needed to – in the case a message could not be processed now, and when the sender was on the thread pool – do exactly what non-blocking await does: suspend the sender by taking a continuation, and schedule that to be run when the message it was trying to send could be sent. Only in the case that the sender is not a pool thread should it really block. Furthermore, it should be fair: message senders should queue up in order to have their message processed. On top of that, it needed to be efficient in the common case, which is when there is no contention.

In response to these needs, I built Lock::Async: an asynchronous locking mechanism. But what is an asynchronous lock? It has a lock method which returns a Promise. If nothing else is holding the lock, then the lock is marked as taken (this check-and-acquire operation is implemented efficiently using an atomic operation) and the Promise that is returned will already be Kept. Otherwise, the Promise that is returned will be Kept when the lock becomes available. This means it can be used in conjunction with await. And – here’s the bit that makes this particularly useful – it means that it can use the same infrastructure built for non-blocking await in Perl 6.d. Thus, an attempt to acquire an asynchronous lock that is busy on the thread pool will result in that piece of work being suspended, and the thread can be used for something else. As usual, in a non-pool thread, real blocking (involving a condition variable) will take place, meaning those who need to be totally in control of what thread they’re running on, and so use Thread directly, will maintain that ability. (Examples of when that might be needed include writing bindings using NativeCall.)

When the unlock method is called, then there are two cases. The first is if nobody contended for the lock in the meantime: in this case, then another atomic operation can be used to mark it as free again. Otherwise, the Promise of the first waiter in line is kept. This mechanism provides fairness: the lock is handed out to senders in the order that they requested it.
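
Here is a minimal sketch of Lock::Async used directly (do-work is just a placeholder for whatever needs to run in the critical section):

my $lock = Lock::Async.new;

sub do-work() { say "working with the lock held" }

# lock returns a Promise: already Kept if the lock was free, otherwise Kept
# once the current holder calls unlock.
await $lock.lock;
do-work();
$lock.unlock;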

Thus, using Lock::Async for concurrency control of supplies gives us the properties set out above: fair, in-order handling of message senders, efficiency in the uncontended case, and compatibility with the Perl 6.d non-blocking await.

As an aside: as part of this I spent some time thinking about the semantics of await inside of a supply or react block. Should it block processing of other messages delivered to the block? I concluded that yes, it should: it provides a way to apply backpressure (for example, await of a write to a socket), and also means that await isn’t an exception to the “one message processed at a time, and processed fully” design principle. It’s not like getting the other behavior is hard: just use a nested whenever.
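
As a sketch of the difference – the host, port, and data below are just assumptions for illustration:

my $conn     = await IO::Socket::Async.connect('localhost', 4242);   # assumed endpoint
my $incoming = Supply.from-list("hello\n", "world\n");

react {
    whenever $incoming -> $data {
        # Further messages to this react block are not processed until the
        # write completes, applying backpressure to the producer of $incoming.
        await $conn.print($data);
    }
}

Replacing the await with a nested whenever $conn.print($data) { ... } would instead allow other messages to be handled while the write is still in flight.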

Taps that send messages

So, I put use of Lock::Async in place and all was…sort of well, but only sort of. Something like this:

my $s = supply {
    for ^10 -> $i {
        emit $i
    }
}
react {
    whenever $s {
        .say 
    }
}

Would hang. Why? Because the lock protecting the react block was obtained to run its mainline, setting up the subscription to $s. The setup is treated just like processing a message: it should run to completion before any other message is processed. Being able to rely on that is important for both correctness and fairness. The supply $s, however, wants to synchronously emit values as soon as it is tapped, so it tries to acquire the async lock. But the lock is held, so it waits on the Promise, but in doing so blocks progress of the calling react block, so the lock is never released. It’s a deadlock.

An example like this did work under the previous model, though for not entirely great reasons. The 10 messages would be queued up, along with the done message of $s. Then, its work complete, the calling react block would get back control, and then the messages would be processed. This was OK if there was just a handful of messages. But something like this:

my $s = supply {
    loop {
        emit ++$;
    }
}
react {
    whenever $s {
        .say;
    }
    whenever Promise.in(0.1) {
        done;
    }
}

Would hang, rapidly eating memory until it ran out, since it would just queue messages forever and never give back control.

The new semantics are as follows: if the tap method call resulting from a whenever block being encountered causes an await that can not immediately be satisfied, then a continuation is taken, rooted at the whenever. It is put into a queue. Once the message (or initial setup) that triggered the whenever completes, and the lock is released, then those continuations are run. This process repeats until the queue is empty.

What does this mean for the last two examples? The first one suspends at the first emit in the for ^10 { ... } loop, and is resumed once the setup work of the react block is completed. The loop then delivers the messages into the react block, producing them and having them handled one at a time, rather than queuing them all up in memory. The second example, which just hung and ate memory previously, now works as one would hope: it displays values for a tenth of a second, and then tears down the supply when the react block exits due to the done.

This opens up supply blocks to some interesting new use cases. For example, this works now:

my $s = supply {
    loop {
        await Promise.in(1);
        emit ++$;
    }
}
react {
    whenever $s {
        .say
    }
}

Which isn’t itself useful (just use Supply.interval), but the general pattern here – of doing an asynchronous operation in a loop and emitting the result it gives each time – is. A supply emitting the results of periodic polling of a service, for example, is pretty handy, and now there’s a nice way to write it using the supply block syntax.
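
A hedged sketch of that pattern, where poll-service is a placeholder for any operation that returns a Promise (an HTTP request, a database query, and so forth):

sub poll-service() {
    start { (^100).pick }           # stand-in for a real asynchronous call
}

my $updates = supply {
    loop {
        emit await poll-service();  # publish each poll result
        await Promise.in(10);       # wait ten seconds between polls
    }
}

react {
    whenever $updates -> $status {
        say "latest poll result: $status";
    }
    whenever Promise.in(25) { done }   # stop the example after a couple of polls
}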

Other recursion

Not all recursive message delivery results from synchronous emits from a Supply tapped by a whenever. While the solution above gives nice semantics for those cases – carefully not introducing extra concurrency – it’s possible to get into situations where processing a message results in another message that loops back to the very same supply block. This typically involves a Supplier being emitted into. This isn’t common, but it happens.
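
A contrived sketch of how that can come about:

my $supplier = Supplier.new;
my $s = supply {
    whenever $supplier.Supply -> $n {
        emit $n;
        # Processing this message emits into the very Supplier this block is
        # tapping, looping a new message straight back to the same block.
        $supplier.emit($n - 1) if $n > 0;
    }
}
$s.tap(&say);
$supplier.emit(3);
sleep 1;    # give the queued, recursively delivered messages time to be processed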

Recursive mutexes – which are used to implement Lock – keep track of which thread is currently holding the lock, using thread ID. This is the reason one cannot migrate code that is holding such a lock between threads, and thus why one being held prevents non-blocking await from being, well, non-blocking. Thus, a recursion detection mechanism based around thread IDs was not likely to end well.

Instead, Lock::Async uses dynamic variables to keep track of which async locks are currently held. These are part of an invocation record, and so can safely be transported across thread pool threads, meaning they interact just fine with the Perl 6.d non-blocking await, and the new model of non-blocking handling of supply contention.

But what does it do when it detects recursion? Clearly, it can’t just decide to forge ahead and process the message anyway, since that violates the “messages are processed one at a time and completely” principle.

I mentioned earlier that Lock::Async has lock and unlock methods, but those are not particularly ideal for direct use: one must be terribly careful to make sure the unlock is never missed. Therefore, it has a protect method taking a closure. This is then run under the lock, thus factoring out the lock and unlock, meaning it only has to be got right in one place.
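
A minimal sketch of protect in use, guarding a shared counter (the counter itself is only illustrative):

my $lock  = Lock::Async.new;
my $count = 0;

sub record-event() {
    $lock.protect: {
        # The lock is held for the duration of this block and released
        # afterwards, even if the code inside throws.
        $count++;
    }
}

record-event() for ^10;
say $count;    # 10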

There is also a protect-or-queue-on-recursion method. This is where the recursion detection comes in. If recursion is detected, then instead of the code being run now, a then is chained on to the Promise returned by lock, and the passed closure is run in the then. Effectively, messages that can’t be delivered right now because of recursion are queued for later, and will be sent from the thread pool.

This mechanism’s drawback is that it becomes a place where concurrency is introduced. On the other hand, given we know that we’re introducing it for something that’s going to run under a lock anyway, that’s a pretty safe thing to be doing. A good property of the design is that recursive messages queue up fairly with external messages.

Future work

The current state is much better than what came before it. However, as usual, there’s more that can be done.

One thing that bothers me slightly is that there are now two different mechanisms both dealing with different cases of message recursion: one for the case where it happens during the tapping of a supply caused by a whenever block, and one for other (and decidedly less common) cases. Could these somehow be unified? It’s not immediately clear to me either way. My gut feeling is that they probably can be, but doing so will involve some different trade-offs.

This work has also improved the backpressure situation in various ways, but we’ve still some more to do in this area. One nice property of async locks is that you can check if you were able to acquire the lock or not before actually awaiting it. That can be used as feedback about how busy the processing path ahead is, and thus it can be used to detect and make decisions about overload. We also need to do some work to communicate down to the I/O library (libuv on MoarVM) when we need it to stop reading from things like sockets or process handles, because the data is arriving faster than the application is able to process it. Again, it’s nice that we’ll be able to do this improvement and improve the memory behavior of existing programs without those programs having to change.

In summary…

This work has replaced the two concurrency models that previously backed Supply with a single unified model. The new model makes better use of the thread pool, deals with back-pressure shortcomings with supply blocks, and enables some new use cases of supply and react. Furthermore, the new approach interacts well with Perl 6.d non-blocking await, removing a blocker for that.

These are welcome improvements, although further unification may be possible, and further work on back-pressure is certainly needed. Thanks once again to Vienna.pm for helping me dedicate the time to think about and implement the changes. If your organization would like to help me continue the journey, and/or my Perl 6 work in general, I’m still looking for funding.


Weekly changes in and around Perl 6: 2017.47 More TPCiA Videos

Published by liztormato on 2017-11-20T22:26:42

In the past weeks, more videos from The Perl Conference in Amsterdam have become available. Rakudo Perl 6 related videos that have been added:

Kudos to all behind the scenes who made the recording, processing and uploading of these videos happen!

Blog Posts

Core Developments

Meanwhile on Twitter

Meanwhile on StackOverflow

Meanwhile on FaceBook

Format::Lisp is on the ecosystem. Not *quite* fully featured, but it handles some of the recursive directives, and the test suite will tell what it can currently handle.

Meanwhile on perl6-users

Winding Down

We’re still waiting for the 2017.11 release of the Rakudo Compiler. There have been a few blockers that turned up at the last moment, which appear to have been vanquished, at least temporarily. But not in time for the deadline for this Perl 6 Weekly. So please check in again for next week’s Perl 6 Weekly for the continuing story!


Weekly changes in and around Perl 6: 2017.46 Spesh Explained

Published by liztormato on 2017-11-13T22:10:59

Jonathan Worthington completed his four-part blog series about the MoarVM Specializer Improvements he did in the past 3 months, supported by a grant from the Perl Foundation. The four parts are:

Required reading for anybody interested in the current and future inner workings of the Moar Virtual Machine!

Rakudo Star 2017.10 Released

Steve Mynott did all the hard work again and released Rakudo Star 2017.10, now available for download. Precompiled binaries for MacOS and Windows (64 bit) are also available.

Perl 6 Advent Calendar

Zoffix Znet reminds us that it is still possible to claim a slot in the Perl 6 Advent Calendar for this year. We’d like to hear from you! Your Advent Calendar Needs You! Just let us know!

No More Grants This Year

The Perl Foundation has run out of allocable funds for grants for this year. Your donation or a donation by your employer will allow for more grant work to be supported next year. So please give kindly, especially if you need to finish off budgets before the end of the year!

Core developments

Other Blog Posts

Meanwhile on Twitter

Meanwhile on StackOverflow

Winding Down

It was a bit of a quiet week: nothing on perl6-users or on perlmonks. Which is also nice every now and then, as it makes the work of yours truly for the Perl 6 Weekly easier. The coming weekend will see the Rakudo 2017.11 compiler release, with already more than 300 commits under its belt. It will also see yours truly giving a one-hour “Introduction to Rakudo Perl 6” presentation at T-Dose (in Dutch), as well as a Perl stand with a lot of Rakudo Perl 6 books and goodies. Whether or not you will be able to attend, it seems wise to check in again next week for more Perl 6 related news 🙂


6guts: MoarVM Specializer Improvements Part 4: Argument Guards

Published by jnthnwrthngtn on 2017-11-09T21:27:31

So far in this series, I have discussed how the MoarVM dynamic optimizer gathers statistics, uses them to plan what to optimize, and then produces specialized versions of hot parts of a program to speed up execution. In this final part, I’ll look at how we switch from unoptimized code into optimized code, which centers around argument guards.

But wait, what about code-gen?

Ah, yes, I knew somebody would ask that. At the end of part 3, we had a data structure representing optimized code, perhaps for a routine, method, or block. While going from bytecode to a CFG in SSA form was a fairly involved process, going back to bytecode is far simpler: we iterate the basic blocks in order, iterate each of the instructions within a basic block, and write out the bytecode for each instruction. Done!

There are, of course, a few complications to take care of. When we have a forward branch, we don’t yet know the offset within the bytecode of the destination, so a table is needed to fix those up later. Furthermore, a new table of exception handler offsets will be needed, since the locations of the covered instructions and handlers will have moved. Beyond those bits of bookkeeping, however, there’s really not much more to it than a loop spitting out bytecode from instruction nodes.

Unlike bytecode that is fed into the VM from the outside, we don’t spend time doing validation of the specialized bytecode, since we can trust that it is valid – we’re generating it in-process! Additionally, the specialized bytecode may make use of “spesh ops” – a set of extra opcodes that exist purely for spesh to generate. Some of them are non-logging forms of ops that would normally log statistics (no point logging after we’ve done the optimizations), but most are for doing operations that – without the proofs and guards done by spesh – would be at risk of violating memory safety. For example, there’s an op that simply takes an object offset and reads a pointer or integer from a certain number of bytes into it, which spesh can prove is safe to do, but in general would not be.

What I’ve described so far is the portable behavior that we can do on any platform. So it doesn’t matter whether you’re running MoarVM on x86, x64, ARM, or something else, you can take advantage of all the optimizations that spesh can do. On x64, however, we can go a step further, and compile the spesh graph not back into specialized MoarVM bytecode, but instead into machine code. This eliminates the interpreter overhead. In MoarVM, we tend to refer to this stage as “the JIT compiler”, because most people understand JIT compilation as resulting in machine code. In reality, what most other VMs call their JIT compiler spans the same space that both spesh and the MoarVM JIT cover between them. MoarVM’s design means that we can deliver performance wins on all platforms we can run on, and then an extra win on x64. For more on the machine code generation process, I can recommend watching this talk by brrt, who leads work on it.

Argument guards

By this point, we have some optimized code. It was generated for either a particular callsite (a certain specialization) or a combination of callsite and incoming argument types (an observed type specialization). Next, we need a mechanism that will, upon a call, look at the available specializations and see if any of them match up with the incoming arguments. Provided one is found that matches, we can then call it.

My original approach to this was to simply have a list of specializations, each tagged with a callsite and, for each object argument index, an expected type, whether we wanted a type object or a concrete object, and – for container types like Scalar – what type we expected to find on the inside of the container. This was simple to implement, but rather inefficient. Even if all of the type specializations were for the same callsite, it would be compared for each of them. Alternatively, if there were 4 specializations and 3 were on the same callsite, and one was on a second callsite, we’d have to do 3 failed comparisons on it to reach the final one that we were hunting.

That might not sound overly bad, because comparing callsites is just comparing pointers, and so somewhat cheap (although it’s branching, and branches aren’t so friendly for CPUs). Where it gets worse is that parameter type checks worked the same way. Therefore, if there were 4 specializations of the same callsite, all of them against a Scalar argument with 4 different types of value inside of it, then the Scalar would have to be dereferenced up to 4 times. This isn’t ideal.

My work during the summer saw the introduction of a new, tree-structured, approach. Each node in the tree represents either an operation (load an argument to test, read a value from a Scalar container) with a single child node, or a test with two child nodes representing “yes” and “no”. The leaves of the tree either indicate which specialization to use, or “no result”.

The tree structure allows for loads, tests, and dereferences to be lifted out. Therefore, each argument needs to be loaded once, checked against a particular type once, and dereferenced once if it’s a container. So, if there were to be specializations for (Scalar:D of Int:D, Int:D) and (Scalar:D of Int:D, Str:D), then the first argument would be loaded one time and tested to see if it is a Scalar. If it is, then it will be dereferenced once, and the resulting value tested to see if it’s an Int. Both alternatives for the second argument are placed in the tree underneath this chain of tests, meaning that they do not need to be repeated.

Arg guard trees are dumped in the specializer log for debugging purposes. Here is how the output looks for the situation described above:

0: CALLSITE 0x7f5aa3f1acc0 | Y: 1, N: 0
1: LOAD ARG 0 | Y: 2
2: STABLE CONC Scalar | Y: 3, N: 0
3: DEREF_RW 0 | Y: 4, N: 0
4: DEREF_VALUE 0 | Y: 5, N: 0
5: STABLE CONC Int | Y: 6, N: 0
6: LOAD ARG 1 | Y: 7
7: STABLE CONC Int | Y: 8, N: 9
8: RESULT 0
9: STABLE CONC Str | Y: 10, N: 0
10: RESULT 1

As the output suggests, the argument guard tree is laid out in a single block of memory – an array of nodes. This gives good cache locality on the lookups, and – since argument guard trees are pretty small – means we can use a small integer type for the child node indices rather than requiring a pointer worth of space.

Immutability wins performance

Additional specializations are generated over time, but the argument guard tree is immutable. When a new specialization is generated, the existing argument guard tree is cloned, and the clone is modified to add the new result. That new tree is then installed in place of the previous one, and the previous one can be freed at the next safe point.

Why do this? Because it means that no locks need to be acquired to use a guard tree. In fact, since spesh runs on a single thread of its own, no locks are needed to update the guard trees either, since the single specializer thread means those updates are naturally serialized.

Calls between specialized code

In the last part of the series, I mentioned that part of specializing a call is to see if we can map it directly to a specialization. This avoids having to evaluate the argument guard tree of the target of the call, which is a decidedly nice saving. As a result, most uses of the argument guard are on the boundary between unspecialized and specialized code.

But how does the optimizer see if there’s a specialization of the target code that matches the argument types being passed? It does it by evaluating the argument guard tree – but on facts, not real values.

On Stack Replacement

Switching into specialized code at the point of a call handles many cases, but misses an important one: that where the hot code is entered once, then sits in a loop for a long time. This does happen in various real world programs, but it’s especially common in benchmarks. It’s highly desirable to specialize the hot loop’s code, if possible inlining things into the loop body and compiling the result into machine code.
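
A small example of the shape of program being described – the routine is entered once, then all the time is spent inside the loop, so only OSR can get us into the specialized version:

sub sum-squares($n) {
    my $total = 0;
    $total += $_ ** 2 for 1..$n;   # the osrpoint in this loop is hit every iteration
    $total
}
say sum-squares(1_000_000);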

I discussed detection of hot loops in an earlier part of this series. This time around, let’s take a look at the code for the osrpoint op:

OP(osrpoint):
    if (MVM_spesh_log_is_logging(tc))
        MVM_spesh_log_osr(tc);
    MVM_spesh_osr_poll_for_result(tc);
    goto NEXT;

The first part is about writing a log entry each time around the loop, which is what bumps the loop up in the statistics and causes a specialization to be generated. The call to MVM_spesh_osr_poll_for_result is the part that checks if there is a specialization ready, and jumps into it if so.

One way we could do this is to evaluate the argument guard in every call to MVM_spesh_osr_poll_for_result to see if there’s an appropriate optimization. That would get very pricey, however. We’d like the interpreter to make decent progress through the work until the optimized version of the code is ready. So what to do?

Every frame gets an ID on entry. By tracking this together with the number of specializations available last time we checked, we can quickly short-circuit running the argument guard when we know it will give the very same result as the last time we evaluated it, because nothing changed.

MVMStaticFrameSpesh *spesh = tc->cur_frame->static_info->body.spesh;
MVMint32 num_cands = spesh->body.num_spesh_candidates;
MVMint32 seq_nr = tc->cur_frame->sequence_nr;
if (seq_nr != tc->osr_hunt_frame_nr || num_cands != tc->osr_hunt_num_spesh_candidates) {
    /* Check if there's a candidate available and install it if so. */
    ...

    /* Update state for avoiding checks in the common case. */
    tc->osr_hunt_frame_nr = seq_nr;
    tc->osr_hunt_num_spesh_candidates = num_cands;
}

If there is a candidate that matches, then we jump into it. But how? The specializer makes a table mapping the locations of osrpoint instructions in the unspecialized code to locations in the specialized code. If we produce machine code, a label is also generated to allow entry into the code at that point. So, mostly all OSR does is jump execution into the specialized code. Sometimes, things are approximately as easy as they sound.

There’s a bonus feature hidden in all of this. Remember deoptimization, where we fall back to the interpreter to handle rarely occurring cases? This means we’ll encounter the osrpoint instructions in the unoptimized code again, and so – once the interpreter has done with the unusual case – we can enter back into the specialized, and possibly JIT-compiled code again. Effectively, spesh factors your slow paths out for you. And if you’re writing a module, it can do it differently based on different applications’ use cases of the module.

Future idea: argument guard compilation to machine code

At the moment, the argument guard tree is walked by a little interpreter. However, on platforms where we have the capability, we could compile it into machine code. This would perhaps allow branch predictors to do a bit of a better job, as well as eliminate the overhead the interpreter brings (which, given the ops are very cheap, is much more significant here than in the main MoarVM interpreter).

That’s all, folks

I hope this series has been interesting, and provided some insight into how the MoarVM specializer works. My primary reason for writing it was to put my recent work on the specializer, funded by The Perl Foundation, into context, and I hope this has been a good bit more interesting than just describing the changes in isolation.

Of course, there’s no shortage of further opportunities for optimization work, and I will be reporting on more of that here in the future. I continue to be looking for funding to help make that happen, beyond what I can do in the time I have aside from my consulting work.


rakudo.org: Announce: Rakudo Star Release 2017.10

Published by Steve Mynott on 2017-11-09T15:53:10

A useful and usable production distribution of Perl 6

On behalf of the Rakudo and Perl 6 development teams, I’m pleased to announce the October 2017 release of “Rakudo Star”, a useful and usable production distribution of Perl 6. The tarball for this release is available from https://rakudo.perl6.org/downloads/star/.

Binaries for macOS and Windows (64 bit) are also available at the same location.

This is a post-Christmas (production) release of Rakudo Star and implements Perl v6.c. It comes with support for the MoarVM backend (all module tests pass on supported platforms). Currently, Star is on a quarterly release cycle.

IMPORTANT: “panda” has been removed from this release since it is deprecated. Please use “zef” instead.

Please note that this release of Rakudo Star is not fully functional with the JVM backend from the Rakudo compiler. Please use the MoarVM backend only.

In the Perl 6 world, we make a distinction between the language (“Perl 6”) and specific implementations of the language such as “Rakudo Perl”.

This Star release includes release 2017.10 of the Rakudo Perl 6 compiler, version 2017.10 MoarVM, plus various modules, documentation, and other resources collected from the Perl 6 community.

The Rakudo compiler changes since the last Rakudo Star release of 2017.07 are now listed in “2017.08.md”, “2017.09.md”, and “2017.10.md” under the “rakudo/docs/announce” directory of the source distribution.

Notable changes in modules shipped with Rakudo Star:

+ DBIish: Newer version (doesn't work with 2017.01 anymore)
+ Test-META: New. also dependencies (JSON-Class, JSON-Marshal, JSON-Name, JSON-Unmarshal and META6)
+ doc: Too many to list. p6doc-index merged into p6doc and index built on first run.
+ p6-Template-Mustache: Fixed on Windows.
+ panda: Removed. Stub added to warn about this.
+ perl6-datetime-format: New (heavily used in ecosystem)
+ perl6-file-which: Fixed tests on Windows
+ tap-harness6: Many fixes.
+ zef: New version with many fixes

There are some key features of Perl 6 that Rakudo Star does not yet handle appropriately, although they will appear in upcoming releases. Some of the not-quite-there features include:

There is an online resource at http://perl6.org/compilers/features that lists the known implemented and missing features of Rakudo’s backends and other Perl 6 implementations.

In many places we’ve tried to make Rakudo smart enough to inform the programmer that a given feature isn’t implemented, but there are many that we’ve missed. Bug reports about missing and broken features are welcomed at rakudobug@perl.org.

See https://perl6.org/ for links to much more information about Perl 6, including documentation, example code, tutorials, presentations, reference materials, design documents, and other supporting resources. Some Perl 6 tutorials are available under the “docs” directory in the release tarball.

The development team thanks all of the contributors and sponsors for making Rakudo Star possible. If you would like to contribute, see http://rakudo.org/how-to-help, ask on the perl6-compiler@perl.org mailing list, or join us on IRC #perl6 on freenode.

Weekly changes in and around Perl 6: 2017.45 Uplink Established

Published by liztormato on 2017-11-06T22:38:58

Elizabeth Mattijsen got carried away trying to devise an API for getrusage. The result is a new set of modules available with use Telemetry. And a helper module for easy activation: snapper. The simplest use case: perl6 -Msnapper yourscript.pl6. This will start up a separate thread that will take snapshots every 0.1 seconds. Once it is done, it will display a report much like this on STDERR:

Telemetry Report of Process #72908 (2017-11-06T21:26:26Z)
Number of Snapshots: 7
Supervisor thread ran the whole time
Initial Size:        67676 Kbytes
Total Time:           0.58 seconds
Total CPU Usage:      0.59 seconds

wallclock  util%  max-rss
   105476  13.28    17680
   105220  13.27       12
   105178  12.73     1024
   105149  12.85      152
   105141  12.42        4
    49525  12.98       12
--------- ------ --------
   575689  12.92    18884

Legend:
wallclock  Number of microseconds elapsed
    util%  Percentage of CPU utilization (0..100%)
  max-rss  Maximum resident set size (in Kbytes)

Even though this is hot off the press, Wenzel P. P. Peppmeyer has already used this new development to create a distribution that runs an httpd daemon inside your program, so that you can look at the usage data of your (long running) program in your browser!

But that’s only the tip of the iceberg: there are also a number of ways to get at system information ad-hoc, such as:

$ perl6 -MTelemetry -e 'say T<wallclock cpu max-rss>'
(140878 264269 84756)

Or plan your snapshots and reporting more precisely:

while $working {
    snap;
    LAST snap;
    # do a lot of work
}
say report(:csv)  # output in CSV format

Other features of Telemetry include a Telemetry::Instrument role for creating your own “instruments” for plugging into this framework (with two Instruments already supplied: Telemetry::Instrument::Usage (basically giving all the data of getrusage + wallclock in microseconds) and Telemetry::Instrument::ThreadPool (giving all the data pertaining to actions of the $*SCHEDULER, such as being able to see when new worker threads are started)). Exciting times!

LPW Rakudo Perl 6 Presentations

Well, a program as such hasn’t been decided yet, but it looks like the following Rakudo Perl 6 related presentations will be given at the London Perl Workshop on 25 November 2017 at the University of Westminster:

Hope to see you there!

Optimizing Code

Jonathan Worthington published another part of his MoarVM Specializer Improvements series: Optimizing Code. A long read, but if you fancy understanding MoarVM capabilities better, it is highly recommended reading indeed!

Building a Single Page Application with Cro

This seems to have slipped through the cracks in the past weeks. Jonathan Worthington describes how to set up a simple Single Page Application using Cro as the backend, and webpack, ES6, React and Redux as the frontend (no prior knowledge needed). It shows a SPA for a (food/beer) festival where people could leave their tips about what’s hot and what’s not, and see them in real time. It supports:

A must read if you are looking into building fully interactive web-applications, specifically using all of the asynchronous functionality that Rakudo Perl 6 offers today!

Other Core Developments

Other Blog Posts

Meanwhile on Twitter

Meanwhile on StackOverflow

Meanwhile on perl6-users

Winding down

Not a small Perl 6 Weekly again this week. It’s almost become a dayjob! But a fun one! So keep all the new stuff coming so yours truly can report on it again in the next issue of the Perl 6 Weekly. See you then!


6guts: MoarVM Specializer Improvements Part 3: Optimizing Code

Published by jnthnwrthngtn on 2017-11-05T22:51:20

Finally, after considering gathering data and planning what to optimize, it’s time to look at how MoarVM’s dynamic optimizer, “spesh”, transforms code into something that should run faster.

Specialization

The name “spesh” is short for “specializer”, and this nicely captures the overall approach taken. The input bytecode the optimizer considers is full of genericity, and an interpreter of that bytecode needs to handle all of the cases. For example, a method call depends on the type of the invocant, and in the general case that may vary from call to call. Worse, while most of the time the method resolution might just involve looking in a cache, in other cases we might have to call into find_method to do a more dynamic dispatch. That in turn means method lookup is potentially an invocation point, which makes it harder to reason about the state of things after the method lookup.

Of course, in the common case, it’s a boring method lookup that will be found in the cache, and in many cases there’s only one invocant type that shows up at a given callsite at runtime. Therefore, we can specialize the code for this case. This means we can cache the result of the method lookup, and since that is now a constant, we may be able to optimize the call itself using that knowledge.
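
For example, the .uc callsite below only ever sees a Str invocant, so the method resolution can be cached and the call specialized for that single case:

for <alpha beta gamma delta> -> $word {
    say $word.uc;    # invocant is always a Str at this callsite
}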

This specialization of method resolution is just one of many kinds of specialization that we can perform. I’ll go through a bunch more of them in this post – but before that, let’s take a look at how the code to optimize is represented, so that these optimizations can be performed conveniently.

Instruction Nodes, Basic Blocks, and CFGs

The input to the optimization process is bytecode, which is just a bunch of bytes with instruction codes, register indexes, and so forth. It goes without saying that doing optimizations by manipulating that directly would be pretty horrible. So, the first thing we do is make a pass through the bytecode and turn every instruction in the bytecode into an instruction node: a C structure that points to information about the instruction, to the previous and next instructions, and contains information about the operands.

This is an improvement from the point of view of being able to transform the code, but it doesn’t help the optimizer to “understand” the code. The step that transforms the bytecode instructions into instruction nodes also performs a second task: it marks the start and end of each basic block. A basic block is a sequence of instructions that will run one after the other, without any flow control. Actually, this definition is something we’ve gradually tightened up over time, and was particularly refined during my work on spesh over the summer. By now, anything that might invoke terminates a basic block. This means that things like findmethod (which may sometimes invoke find_method on the meta-object) and decont (which normally takes a value out of a Scalar container, but may have to invoke a FETCH routine on a Proxy) terminate basic blocks too. In short, we end up with a lot of basic blocks, which makes life a bit harder for those of us who need to read the spesh debug output, but experience tells that the more honest we are about just how many things might cause an invocation, the more useful it is in terms of reasoning about the program and optimizing it.

The basic blocks themselves are only so useful. The next step is to arrange them into a Control Flow Graph (CFG). Each basic block is represented by a C structure, and has a list of its successors and predecessors. The successors of a basic block are those basic blocks that we might end up in after this one. So, if it’s a condition that may be taken or not, then there will be two successors. If it’s a jump table, there may be a lot more than two.

Once we have a control flow graph, it’s easy to pick out basic blocks that are not reachable from the root of the graph. These could never be reached in any way, and are therefore dead code.

One thing that we need to take special care of is exception handlers. Since they work using dynamic scope, they may be reached only through a callee, but we are only assembling the CFG for a single routine or block. We need to make sure they are considered reachable, so they aren’t deleted as dead code. One safe and conservative way to do this is to add a dummy entry node and link them all from it. This is what spesh did for all exception handlers, until this summer. Now, it only does this for catch handlers. Control exception handlers are instead made reachable by linking them into the graph as a successor to all of the basic blocks that are in the region covered by the handler. Since these are used for redo, next, and last handling in loops, it means we get much more precise flow control information for code with loops in it. It also means that control exception handlers can now be eliminated as dead code too, if all the code they cover goes away. Making this work out meant doing some work to remove assumptions in various areas of spesh that exception handlers were never eliminated as dead code.

Dominance and SSA form

The control flow graph is a step forward, but it still falls short of letting us do the kind of analysis we’d really like. We’d also like to be able to propagate type information through the graph. So, if we know a register holds an Int, and then it is assigned into a second register, we’d like to mark that register as also containing an Int. The challenge here is that register allocation during code generation wants to minimize the number of registers that are needed – a good thing for memory use – and so a given register will have many different – and unrelated – usages over time.

To resolve this, the instructions are transformed into Static Single Assignment form. Each time a register is assigned to in the bytecode (this is where the “static” comes from), it is given a new version number. Thus, the first assignment to r2 would be named r2(1), the second r2(2), and so forth. This helps enormously, because now information can be stored about these renamed variables, and it’s easy to access the correct information upon each usage. Register re-use is something we no longer have to worry about in this model.

So far so straightforward, but of course there’s a catch: what happens when we have conditionals and loops?

  if_i r1(1) goto LABEL1
  set r2(1), r3(1)
  goto LABEL2
LABEL1:
  set r2(2), r4(1)
LABEL2:      
  # Which version of r2 reaches here???

This problem is resolved by the use of phi nodes, which “merge” the various versions at “join points”:

  if_i r1(1) goto LABEL1
  set r2(1), r3(1)
  goto LABEL2
LABEL1:
  set r2(2), r4(1)
LABEL2:      
  PHI r2(3), r2(1), r2(2)

The placement of these phi nodes is an interesting problem, and that’s where dominance comes in. Basic block B1 dominates basic block B2 if every possible path to B2 in the flow control graph must pass through B1. There’s also a notion of immediate dominance, which you can think of as the “closest” dominator of a node. The immediate dominators form a tree, not a graph, and this tree turns out to be a pretty good order to explore the basic blocks in during analysis and optimization. Last but not least, there’s also the concept of dominance frontiers – that is, where a node’s dominance ends – and this in turn tells us where the phi nodes should be placed.

Thankfully, there’s a good amount of literature on how to implement all of this efficiently and, somewhat interestingly, the dominance algorithm MoarVM picked is a great example of how a cubic algorithm with a really low constant factor can beat an n*log(n) algorithm on all relevant problem sizes (20,000 basic blocks or so – which is a lot).

Facts

Every SSA register has a set of facts associated with it. These include known type, known value, known to be concrete, known to be a type object, and so forth. These provide information that is used to know what optimizations we can do. For example, knowing the type of something can allow an istype operation to be turned into a constant. This means that the facts of the result of that istype now include a known value, and that might mean a conditional can be eliminated entirely.
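
As a sketch of the kind of folding being described – if the facts say $x is an Int, the istype behind the ~~ test below becomes a constant, and the else branch turns into dead code:

sub halve(Int $x) {
    if $x ~~ Int {              # folds to True given the known type fact
        $x div 2
    }
    else {
        die "not an integer";   # unreachable once the conditional is folded
    }
}
say halve(42);    # 21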

Facts are, in fact, something of a relative concept in spesh. In the previous parts of this series, I discussed how statistics about the types that show up in the program are collected and aggregated. Of course, these are just statistics, not proofs. So can these statistics be used to introduce facts? It turns out they can, provided we are willing to add a guard. A guard is an instruction that cheaply checks that reality really does match up with expectations. If it does not, then deoptimization is triggered, meaning that we fall back to the slow-path code that can handle all of the cases. Beyond the guard, the facts can be assumed to be true. Guards are another area that I simplified and made more efficient during my work in the summer.

Phi nodes and facts have an interesting relationship. At a phi node, we need to merge the facts from all of the joined values. We can only merge those things that all of the joined values agree on. For example, if we have a phi with two inputs, and they both have a fact indicating a known type Int, then the result of the phi can have that fact set on it. By contrast, if one input thinks it’s an Int and the other a Rat, then we don’t know what it is.

During my work over the summer, I also introduced the notion of a “dead writer”. This is a fact that is set when the writer of a SSA register is eliminated as dead code. This means that the merge can completely ignore what that input thinks, as it will never be run. A practical example where this shows up is with optional parameters:

sub process($data, :$normalize-first) {
    if $normalize-first {
        normalize($data);
    }
    do-stuff-with($data);
}

Specializations are produced for a particular callsite, meaning that we know if the normalize-first argument is passed or not. Inside of the generated code, there is a branch that sticks an Any into $normalize-first if no argument is received. Before my change, the phi node would not have any facts set on it, as the two sides disagreed. Now, it can see that one side of the branch is gone, and the merge can keep whichever facts survive. In the case that the named argument was not passed, the if can also be eliminated entirely, as a type object is falsey.

Specialization by arguments

So, time to look at some of the specializations! The first transformations that are applied relate to the instructions that bind the incoming arguments into the parameter variables that are to receive them. Specializations are keyed on callsite, and a callsite object indicates the number of arguments, whether any of them are natively typed, and – for named arguments – what names the arguments have. The argument processing code can be specialized for the callsite, by:

Attribute access specialization

Attribute access instructions pass the name of the attribute together with the type the attribute was declared in. This is used to resolve the location of the attribute in the object. There is a static optimization that adds a hint, which has to be verified to be in range at runtime and that does not apply when multiple inheritance is used.

When the spesh facts contain the type of the object that the attribute is being looked up in, then – after some sanity checks – this can be turned into something much cheaper: a read of the memory location at an offset from the pointer to the object. Both reads and writes of object attributes can receive this treatment.
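
For example, with the invocant type of the method below known from the facts, the $!x and $!y reads can each become a read at a fixed offset from the object pointer:

class Point {
    has $.x;
    has $.y;
    method magnitude() { sqrt($!x ** 2 + $!y ** 2) }
}
say Point.new(x => 3, y => 4).magnitude;    # 5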

Decontainerization specialization

Many values in Perl 6 live inside of a Scalar container. The decont instruction is used to take a value out of a container (which may be a Scalar or a Proxy). Often, decont is emitted when there actually is no container (because it’s either impossible or too expensive to reason about that at compile time). When we have the facts to show it’s not being used on a container type, the decont can be rewritten into a set instruction. And when the facts show that it’s a Scalar, then it can optimized into a cheap memory offset read.

Method resolution specialization

When both the name of a method and the type it’s being resolved on are known from the facts, then spesh looks into the method cache, which is a hash. If it finds the method, it turns the method lookup into a constant, setting the known value fact along the way. This, in the common case, saves a hash lookup per method call, which is already something, but knowing the exact method that will be run opens the door to some very powerful optimizations on the invocation of the resolved method.

Invocation specialization

Specialization of invocations – that is, calling of subs, methods, blocks, and so forth – is one of the most valuable things spesh does. There are quite a few combinations of things that can happen here, and my work over the summer made this area of spesh a good deal more powerful (and, as a result, took some time off various benchmarks).

The most important thing for spesh to know is what, exactly, is the target of the invocation. In the best case, it will be a known value already. This is the case with subs resolved from the setting or from UNIT, as well as in the case that a findmethod was optimized into a constant. These used to be the only cases, until I also added recording statistics on the static frame that was the target of an invocation. These are analyzed and, if pretty stable, then a guard can be inserted. This greatly increases the range of invocations that can be optimized. For example, a library that can be configured with a callback may, if the program uses a consistent callback, and the callback is small, have the callback’s code inlined into the place that calls it in the library code.
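
A sketch of that callback case, where with-each stands in for library code configured with a callback:

sub with-each(@values, &callback) {
    callback($_) for @values;
}

# If the program always passes this same small block, statistics will show a
# stable target at the callsite inside with-each; a guard can be inserted and
# the block potentially inlined there.
with-each(1..1000, -> $n { $n * $n });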

The next step involves seeing if the target of the invocation is a multi sub or method. If so, then the argument types are used to do a lookup into the multi dispatch cache. This means that spesh can potentially resolve a multi-dispatch once at optimization time, and just save the result.
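
For example, with the argument type known (or guarded), a dispatch like the one below can be resolved to the Int candidate once, at optimization time, rather than on every call:

multi sub describe(Int $n) { "the integer $n" }
multi sub describe(Str $s) { "the string $s" }

say describe($_) for ^5;    # always an Int argument at this callsite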

Argument types are worth dwelling on a little longer. Up until the summer, the facts were consulted to see if the types were known. In the case there was no known type from the facts, then the invocation would not be optimized. Now, using the far richer set of statistics, when a fact is missing but the statistics suggest there is a very consistent argument type, then a guard can be inserted. This again increases the range of invocations that can be optimized.

So, by this point a multi dispatch may have been resolved, and we have something to invoke. The argument types are now used to see if there is a specialization of the target of the invocation for those arguments. Typically, if it’s on a hot path, there will be. In that case, then the invocation can be rewritten to use an instruction that pre-selects the specialization. This saves a lot of time re-checking argument types, and is a powerful optimization that can reduce the cost of parameter type checks to zero.

Last, but very much not least, it may turn out that the target specialization being invoked is very small. In this case, a few more checks are performed to see if the call really does need its own invocation record (for example, it may do so if it has lexicals that will be closed over, or if it might be the root of a continuation, or it has an exit handler). Provided the conditions are met, the target may be inlined – that is, have its code copied – into the caller. This saves the cost of creating and tearing down an invocation record (also known as call frame), and also allows some further optimizations that consider the caller and callee together.

Prior to my work in the summer, it wasn’t possible to inline anything that looked up lexical variables in an outer scope (that is to say, anything that was a closure). Now that restriction has been lifted, and closures can be inlined too.

Finally, it’s worth noting that MoarVM also supports inlining specializations that themselves contain code that was inlined. This isn’t too bad, except that in order to support deoptimization it has to also be able to do multiple levels of uninlining too. That was a slight headache to implement.

Conditional specialization

This was mentioned in passing earlier, but worth calling out again: when the value that is being evaluated in a conditional jump is known, then the branch can be eliminated. This in turn makes dead code of one side of the branch or the other. This had been in place for a while, but in my work during the summer I made it handle some further cases, and also made it more immediately remove the dead code path. That led to many SSA registers being marked with the dead writer fact, and thus led to more optimization possibilities in code that followed the branch.

Control exception specialization

Control exceptions are used for things like next and last. In the general case, their target may be some stack frames away, and so they are handled by the exception system. This walks down the stack frames to find a handler. In some cases, however – and perhaps thanks to an inlining – the handler and the throw may be in the same frame. In this case, the throw can be replaced with a simple – and far cheaper – goto instruction. We had this optimization for a while, but it was only during the summer that I made it work when the throwing part was in code that had been inlined, making it far more useful for Perl 6 code.
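
For example, once the loop body and the loop it belongs to end up in the same frame (perhaps thanks to inlining), a next like the one below can be rewritten into a plain goto to the loop’s handler:

for ^10 -> $i {
    next if $i %% 2;    # control exception with a same-frame handler
    say $i;
}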

In summary

Here we’ve seen how spesh takes pieces of a program, which may be full of generalities, and specializes them for the actual situations that are encountered when the program is run. Since it operates at runtime, it is able to do this across compilation units, meaning that a module’s code may be optimized in quite different ways according to the way a particular consumer uses it. We’ve also discussed the importance of guards and deoptimization in making good use of the gathered statistics.

You’ll have noticed I mentioned many changes that took place “in the summer”, and this work was done under a grant from The Perl Foundation. So, thanks go to them for that! Also, I’m still looking for sponsors.

In the next, and final, part of this series, we’ll take a look at argument guards, which are used to efficiently pick which specialization to use when crossing from unspecialized to specialized code.


gfldex: Racing Rakudo

Published by gfldex on 2017-11-05T17:39:33

In many racing sports telemetry plays a big role in getting faster.  Thanks to a torrent of commits by lizmat you can use telemetry now too!

perl6 -e 'use Telemetry; snapper(½); my @a = (‚aaaa‘..‚zzzz‘).pick(1000); say @a.sort.[*-1 / 2];'
zyzl
Telemetry Report of Process #30304 (2017-11-05T17:24:38Z)
No supervisor thread has been running
Number of Snapshots: 31
Initial Size:        93684 Kbytes
Total Time:          14.74 seconds
Total CPU Usage:     15.08 seconds

wallclock  util%  max-rss  gw      gtc  tw      ttc  aw      atc
   500951  53.81     8424
   500557  51.92     9240
   548677  52.15    12376
   506068  52.51      196
   500380  51.94     8976
   506552  51.74     9240
   500517  52.45     9240
   500482  52.33     9504
   506813  51.67     6864
   502634  51.63
   500520  51.78     6072
   500539  52.13     7128
   503437  52.29     7920
   500419  52.45     8976
   500544  51.89     8712
   500550  49.92     6864
   602948  49.71     8712
   500548  50.33
   500545  49.92      320
   500518  49.92
   500530  49.92
   500529  49.91
   500507  49.92
   506886  50.07
   500510  49.93     1848
   500488  49.93
   500511  49.93
   508389  49.94
   508510  51.27      264
    27636  58.33
--------- ------ -------- --- -------- --- -------- --- --------
 14738710  51.16   130876

Legend:
wallclock  Number of microseconds elapsed
    util%  Percentage of CPU utilization (0..100%)
  max-rss  Maximum resident set size (in Kbytes)
       gw  The number of general worker threads
      gtc  The number of tasks completed in general worker threads
       tw  The number of timer threads
      ttc  The number of tasks completed in timer threads
       aw  The number of affinity threads
      atc  The number of tasks completed in affinity threads

The snapper function takes an interval at which data is collected. On termination of the program the table above is shown.

The module comes with plenty of subs to collect the same data by hand and file your own report, which may be sensible in long running processes. Or you can call the reporter sub by hand every now and then.

use Telemetry;

react {
    snapper;
    whenever Supply.interval(60) {
        say report;
    }
}

If the terminal won't cut it you can use http to fetch telemetry data.

Documentation isn't finished, nor is the module. So stay tuned for more data.


rakudo.org: Main Development Branch Renamed from “nom” to “master”

Published by Zoffix Znet on 2017-10-27T00:35:03

If you track the latest rakudo changes by means of a local checkout of the development repository, please take notice that we renamed our unusually-named main development branch from nom to the more traditional name master.

This branch will now track all the latest changes. Attempting to build HEAD of the nom branch will display a message at some point during the build stage, informing that the branch was renamed. The build will then wait for the user to press ENTER if they want to continue.

It's expected this change will cause some fallout and this page will be updated with any useful instructions, as needed. For more information join our #perl6-dev IRC chat channel on irc.freenode.net.


UPDATE 1: the blocking message has been disabled for now, to avoid too much impact on any of the bleeding edge users’ automations.


Rakudobrew

Rakudobrew has been updated to default to master branch now. Note that you need to upgrade rakudobrew to receive that change; run rakudobrew self-upgrade

Until updated, rakudobrew‘s standard instructions will build the no-longer actively updated nom branch.

To build the latest and greatest rakudo use rakudobrew build moar master instead of rakudobrew build nom. In general, we advise against using rakudobrew if the goal is simply to track the development version of the compiler. Instead, we recommend building directly from the repository. Regular end-users should continue using pre-built Rakudo Star distributions.

Zoffix Znet: Rakudo Perl 6 Advent Calendar 2017 Call for Authors

Published on 2017-10-24T00:00:00

Write a blog post about Rakudo Perl 6

gfldex: There Is More Than One Way At The Same Time

Published by gfldex on 2017-10-22T21:21:16

The Perl 6 Rosettacode section for parallel calculations is terribly outdated and missing all the goodies that were added or fixed in the last few weeks. With this post I would like to propose an updated version for Rosettacode. If you believe that I missed something feel free to comment below. Please keep in mind that Rosettacode is for showing off, not for being comprehensive.

use v6.d.PREVIEW;

 

Perl 6 provides parallel execution of code via threads. There are low level constructs that start a thread or safely pause execution.

my $t1 = Thread.start({ say [+] 1..10_000_000 });
my $t2 = Thread.start({ say [*] 1..10_000 });
$t1.finish;
$t2.finish;

my $l = Lock.new;
$l.lock;
$t1 = Thread.start: { $l.lock; say 'got the lock'; $l.unlock };
sleep 2; $l.unlock;

$t1.finish;

When processing lists, one can use a high-level Iterator created by the methods hyper and race. The latter may return values out of order. Those Iterators will distribute the elements of the list to worker threads that are in turn assigned to OS level threads by Rakudo's ThreadPoolScheduler. The whole construct will block until the last element is processed.

my @values = 1..100;

sub postfix:<!> (Int $n) { [*] 1..$n }

say [+] @values.hyper.map( -> $i { print '.' if $i %% 100; $i!.chars });

For for-lovers there are the race for and hyper for keywords, which distribute work over threads in the same way as their respective method forms.

race for 1..100 {
    say .Str; # There be out of order dragons!
}

my @a = do hyper for 1..100 {
   .Int! # Here be thread dragons!
}

say [+] @a;

Perl 6 sports constructs that follow the reactive programming model. One can spin up many worker threads and use threadsafe Channels or Supplys to move values from one thread to another. A react-block can combine those streams of values, process them and react to conditions like cleaning up after a worker thread is done producing values or dealing with errors. The latter is done by bottling up Exception-objects into Failure-objects that keep track of where the error first occurred and where it was used instead of a proper value.

my \pipe = Supplier::Preserving.new;

start {
    for $*HOME {
        pipe.emit: .IO if .f & .ends-with('.txt');

        say „Looking in ⟨{.Str}⟩ for files that end in ".txt"“ if .IO.d;
        .IO.dir()».&?BLOCK when .IO.d;

        CATCH {
            default {
                note .^name, ': ', .Str;
                pipe.emit: Failure.new(.item);
            }
        }
    }
    pipe.done;
}

react {
    whenever pipe.Supply {
        say „Checking ⟨{.Str}⟩ for "Rosetta".“;
        say „I found Rosetta in ⟨{.Str}⟩“ if try .open.slurp.contains('Rosetta');
        LAST {
            say ‚Done looking for files.‘;
            done;
        }
        CATCH {
            default {
                note .^name, ': ', .Str;
            }
        }
    }
    whenever Promise.in(60*10) {
        say „I gave up to find Rosetta after 10 minutes.“;
        pipe.done;
        done;
    }
}

Many built-in objects will return a Supply or a Promise. The latter will return a single value or just convey an event like a timeout. In the example above we used a Promise in that fashion. Below we shell out to find and process its output line by line. This could be used in a react block if there are many different types of events to process. Here we just tap into a stream of values and process them one by one. Since we don't have a react block to provide a blocking event loop, we wait for find to finish with await and process its exitcode. Anything inside the block given to .tap will run in its own thread.

my $find = Proc::Async.new('find', $*HOME, '-iname', '*.txt');
$find.stdout.lines.tap: {
    say „Looking for "Rosetta" in ⟨$_⟩“;
    say „Found "Rosetta" in ⟨$_⟩“ if try .open.slurp.contains('Rosetta');
};

await $find.start.then: {
    say „find finished with exitcode: “, .result.exitcode;
};

Having operators process values in parallel via threads or vector units is yet to be done. Both hyper operators and Junctions are candidates for autothreading. If you use them today please keep in mind side effects may provide foot guns in the future.


gfldex: It’s Classes All The Way Down

Published by gfldex on 2017-10-08T17:23:05

While building a cache for a web api that spits out JSON I found myself walking over the same data twice to fix a lack of proper typing. The JSON knows only about strings even though most of the fields are integers and timestamps. I'm fixing the types after parsing the JSON with JSON::Fast by coercively .map-ing.

@stations.=hyper.map: { # Here be heisendragons!
    .<lastchangetime> = .<lastchangetime>
        ?? DateTime.new(.<lastchangetime>.subst(' ', 'T') ~ 'Z', :formatter(&ISO8601))
        !! DateTime;
    .<clickcount> = .<clickcount>.Int;
    .<lastcheckok> = .<lastcheckok>.Int.Bool;

    (note "$_/$stations-count processed" if $_ %% 1000) with $++;

    .Hash
};

The hyper helps a lot to speed things up but will put a lot of stress on the CPU cache. There must be a better way to do that.

Then lizmat showed where Rakudo shows its guts.

m: grammar A { token a { }; rule a { } }
OUTPUT: «===SORRY!=== Error while compiling <tmp>␤Package 'A' already has a regex 'a'
(did you mean to declare a multi-method?)␤»

Tokens are regexes or maybe methods. But if tokens are methods then grammars must be classes. And that allows us to subclass a grammar.

grammar WWW::Radiobrowser::JSON is JSON {
    token TOP {\s* <top-array> \s* }
    rule top-array      { '[' ~ ']' <station-list> }
    rule station-list   { <station> * % ',' }
    rule station        { '{' ~ '}' <attribute-list> }
    rule attribute-list { <attribute> * % ',' }

    token year { \d+ } token month { \d ** 2 } token day { \d ** 2 } token hour { \d ** 2 } token minute { \d ** 2 } token second { \d ** 2}
    token date { <year> '-' <month> '-' <day> ' ' <hour> ':' <minute> ':' <second> }

    token bool { <value:true> || <value:false> }

    token empty-string { '""' }

    token number { <value:number> }

    proto token attribute { * }
    token attribute:sym<clickcount> { '"clickcount"' \s* ':' \s* '"' <number> '"' }
    token attribute:sym<lastchangetime> { '"lastchangetime"' \s* ':' \s* '"' <date> '"' }
    token attribute:sym<lastcheckok> { '"lastcheckok"' \s* ':' \s* '"' <bool> '"' }
}

Here we overload some tokens and forward calls to tokens that got a different name in the parent grammar. The action class follows suit.

class WWW::Radiobrowser::JSON::Actions is JSON::Actions {
    method TOP($/) {
        make $<top-array>.made;
    }
    method top-array($/) {
        make $<station-list>.made.item;
    }
    method station-list($/) {
        make $<station>.hyper.map(*.made).flat; # Here be heisendragons!
    }
    method station($/) {
        make $<attribute-list>.made.hash.item;
    }
    method attribute-list($/) {
        make $<attribute>».made.flat;
    }
    method date($_) { .make: DateTime.new(.<year>.Int, .<month>.Int, .<day>.Int, .<hour>.Int, .<minute>.Int, .<second>.Num) }
    method bool($_) { .make: .<value>.made ?? Bool::True !! Bool::False }
    method empty-string($_) { .make: Str }

    method attribute:sym<clickcount>($/) { make 'clickcount' => $/<number>.Int; }
    method attribute:sym<lastchangetime>($/) { make 'lastchangetime' => $/<date>.made; }
    method attribute:sym<lastcheckok>($/) { make 'lastcheckok' => $/<bool>.made; }
}

In case you wonder how to call a method with such a funky name, use the quoting version of postfix:<.>.

class C { method m:sym<s>{} }
C.new.'m:sym<s>'()

I truncated the examples above. The full source can be found here. The .hyper version is still quite a bit faster but also heisenbuggy. In fact .hyper may not work at all when executed too fast after a program starts or when used in a recursive Routine. This is mostly due to the grammar being one of the oldest parts of Rakudo with the least amount of work to make it fast. That is a solvable problem. I'm looking forward to Grammar All The Things.

If you got grammars please don’t hide them. Somebody might need them to be classy.

 


Zoffix Znet: CPAN6 Is Here

Published on 2017-10-06T00:00:00

CPAN support for Rakudo Perl 6 dists

Zoffix Znet: 6lang: The Naming Discussion Update

Published on 2017-09-28T00:00:00

A rose by any other name 2.0...

samcv: Grant Final Report

Published on 2017-09-25T07:00:00

This contains the work since the last report as well as my final report.

Table of Contents

Work Since the Last Report

Merged in Unicode Collation Algorithm

I merged the Unicode Collation Algorithm branch into MoarVM. Now that the sort is stable, the coll, unicmp and .collate operators in Rakudo are no longer experimental, so use experimental :collation is no longer needed to use them.

The $*COLLATION dynamic variable is still hidden under experimental, since it is possible design changes could be made to them.
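
For readers who have not tried them yet, a small sketch of the now-non-experimental collation features (assuming a Rakudo recent enough to include this merge):

say 'a' cmp 'B';     # More – plain codepoint order puts 'B' first
say 'a' unicmp 'B';  # Less – the UCA sorts 'a' before 'B'

say <a B c D>.sort;     # (B D a c)
say <a B c D>.collate;  # (a B c D)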

Prepend

In some of my other posts I talked about the difficulties of getting Prepend codepoints working properly. To do this I had changed how we store synthetics so as not to assume that the first codepoint of a synthetic is always the base character. This month I merged in the change in synthetic representation and implemented the features which were now possible with the new representation.

The feature was to detect which character is the base character and store its index in the synthetic. Most combiners, such as diacritics, come after the base character and are Extend codepoints: a + ◌́. Prepend has the reverse functionality and comes before: ؀◌ + 9 (Arabic number sign + number).

This required many assumptions our code rightfully made before Unicode 9.0 to be abandoned. When a synthetic is created, we now check to see if the first codepoint is a Prepend codepoint. If so, we keep checking until we find a codepoint that is not a Prepend. In most cases, the base character is the codepoint following the last Prepend mark.

In degenerate cases there is no base character, which could be a grapheme composed of all Prepend’s or only Prepend’s and Extend’s. In these degenerate cases we set the first codepoint as the base character.

Once I had that done, I was able to fix some of our ops which did not work correctly if there were Prepend characters. This included fixing our case changing op so it would now work on graphemes with Prepend marks. Since the case change ops apply the case change to the base codepoint, it is necessary for us to have the correct base character. Similarly, ordbaseat which gets the base codepoint also needed to be fixed. This allowed ignoremark to now work for graphemes with Prepend marks.
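
A small sketch of what this means at the Perl 6 level, assuming the Unicode 9.0 segmentation rules described here: U+0600 ARABIC NUMBER SIGN is a Prepend codepoint, so it forms a single grapheme together with the digit that follows it, and the digit is the base character.

my $g = "\x[0600]9";   # ARABIC NUMBER SIGN + DIGIT NINE
say $g.codes;          # 2 – two codepoints
say $g.chars;          # 1 – but a single grapheme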

Documentation

I wrote documentation on our Unicode Collation Algorithm, which explains to the reader the problems the UCA solves, with examples of different single to multiple or multiple to single mappings of codepoints. It goes into a fair amount of detail on how it was implemented.

UTF8-C8

Bugs with Encoding into UTF8-C8

Since MoarVM normalizes all input text, the way we deal with text that must not be normalized is important to people who want their strings to pass through unchanged. Previously there was a bug where if something was a value the UTF-8 encoding form could mechanically store, such as a Surrogate or a value higher than 0x10FFFF, it would create a Str type with that value, even though it was not valid Unicode. It would then throw when this value was attempted to be used (since the Str type shouldn't hold values higher than 0x10FFFF or Surrogates). As this is the only way we have of dealing with text unaltered, this seemed like a serious issue that would prevent UTF8-C8 from being usable in a production environment attempting to encode arbitrary data into UTF8-C8. [f112fbcf]

Bugs While Working with UTF8-C8 Strings

Another issue I fixed was that under concatenation, text replacement or renormalization, the UTF8-C8 codepoints would be "flattened". They would lose their special properties and instead start acting like any other set of Unicode codepoints (although unusual since it contains a private use character and a hex code of the stored value). I changed our codepoint iterator so that optionally you can choose to pass back UTF8-C8 Synthetics unchanged. We use Synthetics to store both UTF8-C8 values as well as storing graphemes which contain multiple codepoints. When iterating by codepoint on an already existing arbitrary string, we want to retain the UTF8-C8 codepoints and make sure they are not changed during the renormalization process. This has been fixed, and UTF8-C8 strings are now drastically more reliable, and hopefully, much more production ready. [2f71945d]
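
A hedged sketch of the round-trip property this work protects; the byte values are arbitrary, the point is that invalid UTF-8 survives a decode/encode cycle:

my $bytes = Blob.new(0xFF, 0xFE, 0x41);           # not valid UTF-8
my $str   = $bytes.decode('utf8-c8');             # decodes without throwing
say $str.encode('utf8-c8').list eqv $bytes.list;  # True – round-trips exactly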

Grapheme Caching and Move To

The function which moves a grapheme iterator forward a specified number of graphemes now works even if we aren't starting from the very start of the string. In this function we have a first loop which locates the correct strand, and a second loop which finds the correct grapheme inside that strand. I refactored the grapheme locating code and was able to remove that second loop.

In the grapheme caching implementation we can save a lot of work by not creating a new iterator for every grapheme access. Not only that, I also sped up the move_to function by about 30%. While the cached iterator reduces the number of calls to this function for the functions I added it to, there are still many places which may seek for each grapheme requested, and this will speed those up.

Other MoarVM Changes

I setup automated Appveyor builds for MoarVM so we get automated builds on Windows (Travis CI only builds MacOS and Linux builds).

I fixed a segfault that occurred when compiling nqp on Alpine Linux which uses musl as its libc. I ended up reducing the depth of recursion in the optimize_bb() function when compiling nqp from 99 to 29 (3x reduction). In the Rakudo build, we have a 5x reduction in the depth of recursion.

Final Report

As I've already said many things in my previous grant reports (1, 2, 3, 4), I will reiterate some of the big things I did, though the list is not exhaustive. For full details of all the changes please see my other grant reports as well as a partial list of bonus changes I made in MoarVM during the grant at the bottom of the page.

The only thing not completed was implementing a whole new UCD backend. While I planned on doing this, I ended up choosing not to do so. I realized that it would not have been the best use of my time on the grant, as there were many more user noticeable changes I could do. Despite this, I did achieve the goals that the rewrite was intended to solve; namely making property values distinct for each property and making the UCD database generation more reproducible. While it is not exactly the same on different runs, the only thing that changes is the internal property codes, which does not affect anything adversely. It is fully functional every time instead of database regeneration breaking our tests most of the time. Once the database was stable, I was then able to update our database to Unicode 10. Without my improvements regarding reproducibility and property values becoming distinct for each property, updating to Unicode 10 would not have been possible. In addition all Hangul (Korean) characters now have names in the Unicode database.

A big thing I wanted to implement was the Unicode Collation Algorithm, which ended up being a total success. I was able to still retain the ability to choose which collation levels the user wanted to sort with as well as reverse the sort order of individual collation levels.

Yet I did not implement only one algorithm; I also implemented the Knuth-Morris-Pratt string search algorithm, which can take advantage of repeated characters in the needle (it can be multiple times faster if you have sections repeating). The Knuth-Morris-Pratt algorithm was adjusted to use either the new cached grapheme iterator or the simple lookup depending on whether it was a flat or strand haystack. Indexing a strand based string with a one grapheme long needle was sped up by 9x by making a special case for this.

Practically all string ops were sped up, often by multiple times, due to getting MVM_string_get_grapheme_at_nocheck inlined. In addition to this, I changed the way we access strings in many of our most used string ops, intelligently using either grapheme iterators, cached grapheme iterators or direct access depending on circumstances. With MVM_string_get_grapheme_at_nocheck inlined, the time to access graphemes with this function was sped up by 1.5x for strands and up to 2x for flat strings. Ops we use a lot, like the op that backs eq and nqp::eqat, were given special casing for Strand ←→ Strand, Flat ←→ Flat and Strand ←→ Flat (which also covers Flat ←→ Strand). This special casing sped up eq by 1.7x when one string is a strand and one is flat, and 2.5x when both strings are flat. Applying similar optimizations to index made a 2x speedup when the haystack is flat and a 1.5x speedup when the haystack is a strand (on top of the previous improvements due to the Knuth-Morris-Pratt algorithm).

I fixed a longstanding bug in NQP which caused 'ignoremark+ignorecase' operations to be totally broken. I fixed this by adding more MoarVM ops and refactoring the code to have far fewer branches. In MoarVM we now use a centralized function to do each variation of with/without ignorecase and ignoremark, which is also fully compatible with foldcase operations as well as ignoremark.

Doing the ignoremark/ignorecase indexing work sped them up by multiple times, but then in addition to that, it became 2x faster when the haystack was made up of 10 strands by implementing a cached grapheme iterator.

I implemented full Unicode 9.0 support not just in our grapheme segmentation, but also in our other ops, refactoring how we store synthetic codepoints to allow us to have the 'base' codepoint be a codepoint other than the 1st in the synthetic to support Prepend codepoints.

Our concatenation was improved so as to make full renormalization of both input strings no longer needed in almost all cases. The x repeat operator was fixed so it always creates normalized strings. Previously it could create unnormalized strings instead, causing issues when it was used.

I believe I have more than accomplished what I set out to do in this grant. I made tons of user facing changes: to speed, Unicode normalization support, full Unicode 9.0 support. I added awesome collation features and fixed all the major issues with decoding and working with UTF8-C8 representations. I have listed an incomplete list of bonus deliverables below the deliverables which were part of this project.

Deliverables

  • I documented MoarVM’s string representation, with lots of good information for future developers as well as interested users.

  • Hangul syllables now have Unicode names in our database, with a test added in roast.

  • I implemented the Unicode Collation Algorithm [866623d9]

    • Tests have been added in roast for the UCA and the unicmp op

    • I wrote documentation on our Unicode Collation Algorithm implementation

    • Regarding language specific sort. This would involve us using data from the Unicode CLDR. Once we have this data available from MoarVM, it simply requires a conditional to override DUCET and check a different set of data before checking the DUCET data. This information is in our documentation for collation.

  • Text normalization

    • Speed of normalization was improved

    • Full Unicode 9.0 support for text segmentation and normalization was added

  • While I did not fully rewrite the database, I did solve the needed issues:

    • Property values are now unique for each property

    • Running the generation script creates a functional database every time it is run, rather than only some of the time.

    • I added Unicode Collation data to our database, generated from a Perl 6 script, which happened to be the only property required to complete my deliverables

Bonus Deliverables

Here is a somewhat complete list of bonus deliverables:

  • Updated our database to Unicode 10. This was only possible once I had fixed the problems with the database generation, and made property values unique.

  • Implemented Knuth-Morris-Pratt string search

  • Set up Appveyor builds. Appveyor builds and tests MoarVM on Windows, similar to Travis CI.

  • Fixed ignoremark+ignorecase regex when used together as well as huge speed increases.

UTF8-C8/UTF-8

  • Fix UTF8-C8 encoding so it can encode values > 0x10FFFF as well as surrogates

  • Fix UTF8-C8 strings so they do not get corrupted and flattened when string operations are performed on them.

  • MVM_string_utf8_decodestream: free the buffer on malformed UTF-8 [a22f98db]

String Ops

  • Have MVM_string_codes iterate the string with codepoint iterator [ed84a632]

  • Make eqat 1.7x-2.5x faster [3e823659]

  • Speed up index 9x when Haystack is strand, needle is 1 long [0b364cb8]

  • Implement the Knuth-Morris-Pratt string search algorithm [6915d80e]

  • Add indexim_s op and improve/fix bugs in eqatim_s [127fa2ce]

  • Fix a bug in index/eqat(im) and in ord_getbasechar causing us to not decompose the base character when the grapheme was a synthetic [712cff33]

  • Fix MVM_string_compare to support deterministic comparing of synthetics [abc38137]

  • Added a codes op which gets the number of codepoints in a string rather than the number of graphemes. Rakudo is now multiple times faster doing the .codes op. Before, it would request an array of all the codepoints and then count its elements, which was much slower. (A small example of the chars/codes distinction follows this list.)
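
A minimal illustration of the chars/codes distinction (x followed by COMBINING GRAVE ACCENT is handy here because no precomposed form exists, so normalization keeps both codepoints):

say "x\x[0300]".chars;  # 1 – one grapheme
say "x\x[0300]".codes;  # 2 – two codepoints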

Fix string ops with Prepend characters

  • Rework MVMNFGSynthetic to not store base separately [3bd371f1]

  • Fix case change when base cp isn’t the first cp in synthetic [49b90b99]

  • For degenerate Synth’s with Prepend and Extend set base cp to 1st cp [db3102c4]

  • Fix ignoremark with Prepend characters and ordbaseat op [f8a639e2]

Memory/GC/Build Fixes

  • Fix segfault when compiling nqp with musl as libc [5528083d]

  • Avoid recursion in optimize_bb() when only 1 child node [6d844e00]

  • Fix various memory access/garbage collection issues in some string ops that were showing up when running in Valgrind or using Address Sanitizer

Grapheme Iteration

  • Ensure we can move forward in a grapheme iterator even if we aren’t starting at the very beginning of a string.

  • Use grapheme iterator cached for ignorecase/ignoremark index ops [b2770e27]

  • Optimize MVM_string_gi_move_to. Optimize the loop which finds the correct location within a strand so that it isn’t a loop and is just conditionals.[c2fc14dd]

  • Use MVMGraphemeIter_cached for strands in KMP index [ce76c994]

  • Allow MVM_string_get_grapheme_at_nocheck to be inlined

  • Refactor code into iterate_gi_into_string() to reduce code duplication [1e92fc96]

Tests Added to Roast

  • Add tests for testing collation. Tests for the unicmp operator [5568a0d63]

  • Test that Hangul syllables return the correct Unicode name [6a4bc5450]

  • Add tests for case changes when we have Prepend codepoints [00554ccbd]

  • Add tests for x operator to ensure normalization retained [1e4fd2148] [59909ca9a]

  • Add a large number of string comparison tests [51c4ac08b]

    • Add tests to make sure synthetics compare properly with cmp [649d9dc50]

  • Improve ignoremark tests to cover many different cases [810e218c8]

    • Add ignoremark tests to cover JSON::Tiny regression + other issue [c185acc57]

  • Add generated tests (from UCD data) and manually created ones to ensure strings concatenation is stable, when the concatenated string would change the normalization. [2e1f7d92a][9f304b070] [64e385faf][0976d07b9][59909ca9a][2e1f7d92a] [88635564e] [a6bbc73cf] [a363e3ff1]

  • Add test for MoarVM Issue #566 .uniprop overflow [9227bc3d8]

  • Add tests to cover RT #128875 [0a80d0d2e]

  • Make new version of GraphemeBreakTest.t to better test grapheme segmentation [54c96043c]

NQP Work

Below is a listing of some of the commits I made to NQP. This included adding the ops I created over the course of the grant: eqatim, indexim, indexicim, eqaticim, and codes (gets the number of codepoints in a string rather than graphemes).

The testing in NQP was inadequate for our string ops, so I added hundreds of tests for practically all of the string ops, so we could properly test the different variations of index* and eqat*.

NQP Documentation

  • Add docs for a few variations of index/eqat [589a3dd5c] Bring the unicmp_s op docs up to date [091625799]

  • Document hasuniprop moar op [650840d74]

NQP Tests

  • Add more index* tests to test empty string paths [8742805cb]

  • run indexicim through all of the indexic tests [26adef6be]

  • Add tests for RT #128875, ignorecase+ignoremark false positives [b96a0afe7]

  • Add tests for nqp::eqatim op on MoarVM [6fac0036e]

Other Work

  • Added script to find undocumented NQP ops [0ead16794]

  • Added nqp::codes to QASTOperationsMAST.nqp [59421ffe1]

  • Update QASTRegexCompilerMAST to use new indexicim and eqaticim ops [18e40936a]

  • Added eqatim and indexim ops. Fix a bug when using ignoremark [9b94aae27]

  • Added nqp::hasuniprop op to QASTOperationsMAST.nqp [d03c4023a]

6guts: Rakudo gets a new thread pool

Published by jnthnwrthngtn on 2017-09-23T14:00:12

Vienna.pm have funded me to work 50 hours on Perl 6. After some discussion, we decided I would first work on improving the thread pool scheduler, and then move on to continuing the work around non-blocking await, a feature of the upcoming Perl 6.d. In this post I’ll discuss the work on the thread pool scheduler, which was merged shortly after the latest Rakudo release to maximize testing time, and thus will appear in the next release (2017.10).

What was wrong with the ThreadPoolScheduler before?

My (by now slightly hazy) memory is that I wrote the initial Perl 6 thread pool implementation in the space of an hour or two on a train, probably heading to some Perl event or other. It was one of those “simplest thing that could possibly work” implementations that turned out to work well enough that it survived with only isolated tweaks and fixes until January this year. At that point, I added the initial bits of support for non-blocking await. Even that change was entirely additive, leaving the existing scheduling mechanism completely intact.

When I first implemented the thread pool, there were approximately no people writing concurrent Perl 6 programs. Happily, that’s changed, but with it came a few bug reports that could be traced back to the thread pool. Along with those, there were things that, while not being outright bugs, were far from ideal.

Before I dug into a re-design, I first summarized all of the problems I was aware of with the thread pool, so as I considered new designs I could “crash test” them against the problems that afflicted the previous design. Here’s my list of problems with the previous design.

  1. It was wasteful, spawning far too many threads in quite a lot of cases. A single use of Proc::Async would cause some threads to be created. A second use after that would add some more. A third use would add more. Even if these uses were not concurrent, threads would still be added, up until the maximum pool size. It was a very conservative way to make sure a thread would be available for each piece of work, provided the pool didn’t reach exhaustion point. But really, the pool doesn’t need more than a thread or two for many programs. RT #131915 was a ticket resulting from this behavior: each thread added to the memory consumption of the program, making memory use a good bit higher than it should have been for programs that really didn’t need more than a thread or two.
  2. Very active async I/O could swamp the thread pool with events. Since there was a single queue, then timer events – which might be being used to kill off a process after a timeout – may not have fired in a very timely manner at all, since the timer event would be stuck behind all of the I/O events. RT #130370
  3. Sometimes, despite the conservative mechanism described in #1, not enough threads got started and this led to occasional deadlocks. RT #122709
  4. It didn’t try to scale the number of threads to match the available CPU cores in any way. Just because there is a lot of work in the queue does not mean that we need more threads; if we are making progress and the CPU load is pretty high then adding more threads is unlikely to lead to more progress.
  5. In high-core-count systems, the default limit of 16 was too low. 32-core CPUs at tolerable prices very much exist by now!
  6. For programs that manage to really block a lot of threads with non-CPU bound work, we could deadlock. Most of that will go away with non-blocking await, but there will still be ways to write code that really blocks a bunch of OS threads and can’t make progress without yet more threads being made available to run something. At some point, people just have to write their programs differently, but the default limit of 16 threads was not very generous.
  7. Despite wishing to raise the maximum default pool size a good bit, we couldn’t do it because issues #1 and #4 meant we’d likely end up hitting the maximum in most programs, and memory use would become totally unreasonable.
  8. We suffered from poor thread affinity for events that would inevitably queue up due to serial supplies. For example, if two packets arrived from a socket then they might be picked up by different threads in the pool. If the first packet was still being processed, then the second would contend for the lock used to enforce serial processing of messages by supplies, and block until it was available.

3 kinds of queue

Problems 2 (active I/O swamping the queue and delaying timers) and 8 (poor thread affinity) are resolved in the new thread pool by recognizing that not all work given to the pool is equal. Timers really want to be dealt with in a timely manner, so in programs that have timers it makes sense to have a queue, and one or more worker threads, exclusively for time-based events. Work items that will need processing sequentially anyway should be assigned to a single thread, implying that each such “affinity worker” gets a queue of its own. Finally, there is a general queue for everything else, and one or more threads eat from it.

Adding a supervisor

The other problems were addressed by the addition of a Sufficiently Smart Supervisor. The supervisor is a thread created by the thread pool upon its first use, and living from then until the end of the program. It spends most of its time sleeping, waking up around 100 times a second to check how things are going. It is the supervisor that makes most of the decisions about how many threads to add to the pool. It factors in:

Having the pool soft-limited to the number of threads, and reluctantly able to go beyond that, means that we can raise the maximum default pool size quite a lot; it now stands at 64 threads. In reality, many programs don’t even reach the CPU core count, as there’s never enough work to trigger the addition of more threads.

Affinity workers

Affinity worker threads aren’t added by the supervisor. Instead, when an affinity queue is requested, any existing affinity worker threads will have their queue lengths inspected. The one with the shortest queue will be picked, provided that the queue length is below a threshold. If it’s over the threshold (which increases per affinity worker thread added), then another affinity worker thread will be spawned and the work allocated to it.
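
Here is a simplified sketch of that selection logic in Perl 6. The Worker class, the names and the threshold formula are illustrative stand-ins, not Rakudo's actual ThreadPoolScheduler internals:

class Worker { has @.queue }   # hypothetical stand-in for an affinity worker

sub pick-affinity-worker(@workers, Int $per-worker-threshold = 8) {
    # the tolerated queue length grows as more affinity workers exist
    my $threshold = $per-worker-threshold * (1 + @workers.elems);
    if @workers {
        my $best = @workers.min(*.queue.elems);        # shortest queue wins...
        return $best if $best.queue.elems < $threshold;
    }
    my $fresh = Worker.new;                            # ...otherwise spawn another
    @workers.push: $fresh;
    $fresh;
}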

Outcomes

For all the cases I’ve tried so far, the new scheduler seems to do either at least as well or better than the one it replaced – and sometimes much better. The deadlock due to insufficient threads bug is gone, thanks to the supervisor. Programs that do the occasional bit of work needing pool threads end up with just a thread or two, greatly reducing their memory consumption. A Cro web application that is processing just a handful of requests a second will now spawn just a few threads (and so use much less memory and get better locality), while one faced with some hundreds of requests per second will spawn more. And a program doing a lot of CPU-bound work now spawns as many threads as there are cores, which gives a small speedup compared to oversubscribing the CPU under the previous scheduler. Finally, timer events are delivered and handled in a timely way, even when there is a lot of output from a process.

And next?

Well, most immediately I should write some extra regression tests, so I can get the RT tickets mentioned earlier in the article closed up. Feel free to beat me to doing that, though! :-)

That aside, I’d like to mention that while the new scheduler is a decided improvement, it’s also improvable. The heuristics it uses to decide when to add further threads can surely be tuned some more. The code doing that decision making is written in Perl 6, which I hope makes it accessible to those who would like to experiment and tweak.

Once again, thanks to Vienna.pm for making this work possible. And, finally, a mention that I’m still looking for funding to help me keep on doing Perl 6 things for a sizable chunk of my work time; a handful of companies each sponsoring 10 hours a month would soon add up!


p6steve: Clone Wars

Published by p6steve on 2017-09-20T18:52:04

Apologies to those that have OO steeped in their blood. I am a wary traveller in OO space; maybe I am a technician, not an architect at heart. So for me, no sweeping frameworks unless and until they are needed. And, frankly, one can go a long way on procedural code with subroutines to gather repetitive code sequences.

(And don’t get me started on functional programming…)

Some time ago, I was tasked to write a LIMS in old perl. Physical ‘aliquots’ with injections of various liquids would be combined and recombined by bio-machines. This led to a dawning realization that these aliquot objects could be modelled in an object style with parent / child relationships. After a few weeks, I proudly delivered my lowly attempt at ‘Mu’ for this (and only this) problem. Kudos to the P6 team – after a couple of weeks in here, it’s just sensational the level of OO power that the real Mu delivers:

[screenshot]

Now, hard at work, on the perl6 version of Physics::Unit, I am wondering how to put the OO theory into productive practice. One of my aims was to find a medium sized (not core tools) problem that (i) I know something about and (ii) would be a good-sized problem to wrangle.

So I am enjoying the chance to design some classes and some interfaces that will make this all hang together. But – as an explorer, it has become clear that I only have three options. The problem runs like this:

Initially I had some success with object types ::T – but these only let you read the type and  duplicate if needed for a new left hand side container. Then I tried the built in (shallow) clone method. But…

Ultimately, thanks to rosettacode.org, I worked out that $x.perl.EVAL with some ~~ s/// substitutions on the way would do the trick!
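
For the curious, a minimal sketch of the trick. EVAL of a string needs the MONKEY-SEE-NO-EVAL pragma, the data here is made up, and this is not a general-purpose deep clone – it just happens to work for simple, value-ish structures:

use MONKEY-SEE-NO-EVAL;

my %unit = name => 'metre', dims => [1, 0, 0];
my %copy = %unit.perl.EVAL;      # serialize to Perl 6 source, then evaluate it

%copy<dims>[0] = 42;
say %unit<dims>[0];              # 1 – the nested array really was copied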

Phew. Please comment below if you have a better way to share – or would like to point out the risks of this technique…


6guts: MoarVM Specializer Improvements Part 2: Planning

Published by jnthnwrthngtn on 2017-09-17T22:39:51

It’s been a good while since I posted part 1 of this little series. In the meantime, I paid a visit to the Swiss Perl Workshop, situated in a lovely hotel in the beautiful village of Villars. Of course, I was working on my slides until the last minute – but at least I could do it with this great view!

[photo]

I gave two talks at the workshop, one of them about the MoarVM specializer (slides, video). Spoiler alert: the talk covers much of what I will write about in this blog series. :-) Now that I’ve recovered from the talk writing and the travel, it’s time to get back to writing here. Before I dig in to it, I’d like to say thanks to The Perl Foundation and their sponsors, who have made this work possible.

Where we left off

In the last part I discussed how we collected statistics about running code, in order to understand what to optimize and how to optimize it. The running example, which I’ll continue with, was this:

sub shorten($text, $limit) is export {
    $text.chars > $limit
        ?? $text.substr(0, $limit) ~ '...'
        !! $text
}

And an example of the kind of information collected is:

Latest statistics for 'infix:<~>' (cuid: 4245, file: SETTING::src/core/Str.pm:2797)

Total hits: 367

Callsite 0x7f1cd6231ac0 (2 args, 2 pos)
Positional flags: obj, obj

    Callsite hits: 367

    Maximum stack depth: 32

    Type tuple 0
        Type 0: Str (Conc)
        Type 1: Str (Conc)
        Hits: 367
        Maximum stack depth: 32

Which, to recap, tells us that the ~ infix operator was called 367 times so far, and was always passed two Str instances, neither of which was held in a Scalar container.

The job of the planner

The planner is the component that takes this statistical data and decides what code to optimize and what cases to optimize it for. It turns out to be, at least at the time of writing, one of the smaller parts of the whole specialization subsystem; subtract memory management code and debug bits and it’s not much over 100 lines.

Its inputs are a list of static frames whose statistics were updated by the latest data. There’s no point considering anything whose statistics did not change, since there will be nothing new since the last time the planner saw them and so the conclusion would be the same. In Perl 6 terms, a static frame may represent a routine (Sub or Method), Block, or Code thunk.

Initial filtering

The first thing the planner looks at is whether the static frame got enough hits to be worth even considering investing time on. So if it sees something like this:

Latest statistics for 'command_eval' (cuid: 8, file: src/Perl6/Compiler.nqp:31)

Total hits: 1

No interned callsite
    Callsite hits: 1

    Maximum stack depth: 5

It can simply ignore it based on the total hits alone. Something that gets a single hit isn’t worth spending optimization effort on.

There are cases where the total hits are very low – perhaps just a single hit – but optimization is still interesting, however. Here is the loop that I used to make shorten hot so we could look at its statistics and plan:

for 'x' x 50 xx 100000 {
    shorten($_, 42)
}

The statistics for the mainline of the program, containing the loop, start out like this:

Latest statistics for '<unit>' (cuid: 4, file: xxx.p6:1)

    Total hits: 1
    OSR hits: 273

    Callsite 0x7f1cd6241020 (0 args, 0 pos)

        Callsite hits: 1

        OSR hits: 273

        Maximum stack depth: 10

The total hits may well be 1, but obviously the loop inside of this code is hot and would be worth optimizing. MoarVM knows how to perform On Stack Replacement, where code running in a frame already on the callstack can be replaced with an optimized version. To figure out whether this is worth doing, it records the hits at “OSR points” – that is, each time we cross the location in the loop where we might replace the code with an optimized version. Here, the loop has racked up 273 OSR hits, which – despite only one hit in terms of entering the code at the top – makes it an interesting candidate for optimization.

Hot callsites

In MoarVM, a callsite (of type MVMCallsite) represents the “shape” of a set of arguments being passed. This includes:

When bytecode is produced, callsites are interned – that is to say, if we have two calls that are both passing two positional object arguments, then they will share the callsite data. This makes them unique within a compilation unit. When bytecode is loaded, then any callsites without flattening are further interned by the VM, meaning they are globally shared across all VM instances. This means that they can then be used as a kind of “key” for things like multi-dispatch caching. The specializer’s statistics are also aggregated by callsite.

The planner, therefore, takes a look at what callsites showed up. Often, code will be called with a consistent callsite, but things with optional parameters may not be. For example, if we were to run:

sub foo(:$bar) { }
for ^100000 { rand < 0.9 ?? foo(bar => 42) !! foo() }

Then the statistics of foo might end up looking like this:

Latest statistics for 'foo' (cuid: 1, file: -e:1)

Total hits: 367

Callsite 0x506d920 (2 args, 0 pos)
  - bar

    Callsite hits: 331

    Maximum stack depth: 13

    Type tuple 0
        Type 0: Int (Conc)
        Hits: 331
        Maximum stack depth: 13

Callsite 0x7f3fd7461620 (0 args, 0 pos)

    Callsite hits: 36

    Maximum stack depth: 13

    Type tuple 0
        Hits: 36
        Maximum stack depth: 13

Unsurprisingly, the case where we pass bar is significantly more common than the case where we don’t, so the planner would in this case favor spending time making a specialization for the case where bar was passed.

As an aside, specialization by optional argument tends to be a good investment. A common pattern is:

sub foo($bar, Bool :$some-option) {
    if $some-option {
        extra-stuff();
    }
}

In the case that $some-option is not passed, the specializer:

That stripping out of code might in turn bring the optimized version below the inlining limit, whereas before it might have been too big, which could bring about an additional win.

Hot type tuples

Once one or more hot callsites have been identified, the types that were passed will be considered. In the best case, that turns out to be very boring. For example, here:

Latest statistics for 'chars' (cuid: 4219, file: SETTING::src/core/Str.pm:2728)

Total hits: 273

Callsite 0x7f1cd6231ae0 (1 args, 1 pos)
Positional flags: obj

    Callsite hits: 273

    Maximum stack depth: 14

    Type tuple 0
        Type 0: Scalar (Conc) of Str (Conc)
        Hits: 273
        Maximum stack depth: 14

The code is consistently called with a Scalar container holding a Str instance. Therefore, we can see this code is monomorphic in usage. It is quite clearly going to be worth spending time producing a specialization for this case.

In other cases, the code might be somewhat polymorphic:

Latest statistics for 'sink' (cuid: 4795, file: SETTING::src/core/List.pm:694)

Total hits: 856

Callsite 0x7f1cd6231ae0 (1 args, 1 pos)
Positional flags: obj

    Callsite hits: 856

    Maximum stack depth: 31

    Type tuple 0
        Type 0: Slip (Conc)
        Hits: 848
        Maximum stack depth: 31

    Type tuple 1
        Type 0: List (Conc)
        Hits: 7
        Maximum stack depth: 26

Here we can see the sink method is being called on both Slip and List. But it turns out that the Slip case is far more common. Thus, the planner would conclude that the Slip case is worth a specialization, and the List case is – at least for now – not worth the bother.

Sometimes, a situation like this comes up where the type distribution is more balanced. Any situation where a given type tuple accounts for 25% or more of the hits is considered to be polymorphic. In this case, we produce the various specializations that are needed.

Yet another situation is where there are loads of different type tuples, and none of them are particularly common. One place this shows up is in multiple dispatch candidate sorting, which is seeing all kinds of types. This case is megamorphic. It’s not worth investing the time on specializing the code for any particular type, because the types are all over the place. It is still worth specialization by callsite shape, however, and there may be other useful optimizations that can be performed. It can also be compiled into machine code to get rid of the interpreter overhead.

Sorting

By this point, the planner will have got a list of specializations to produce. Specializations come in two forms: observed type specializations, which are based on matching a type tuple, and certain specializations, which are keyed only on an interned callsite (or the absence of one). Certain specializations are produced for megamorphic code. The list of specializations are then sorted. But why?

One of the most valuable optimizations the specializer performs is inlining. When one specialized static frame calls another, and the callee is small, then it can often be incorporated into the caller directly. This avoids the need to create and tear down an invocation record. Being able to do this relies on a matching specialization being available – that is, given a set of types that will be used in the call, there should be a specialization for those types.

The goal of the sorting is to try and ensure we produce specializations of callees ahead of specializations of their callers. You might have wondered why every type tuple and callsite was marked up with the maximum call depth. Its role is for use in sorting: specializing things with the deepest maximum call stack depth first means we will typically end up producing specialized code for potential inlinees ahead of the potential inliners. (It’s not perfect, but it’s cheap to compute and works out pretty well. Perhaps more precise would be to form a graph and do a topological sort.)

Dumping

The plan is then dumped into the specialization log, if that is enabled. Here are some examples from our shorten program:

==========

Observed type specialization of 'infix:<~>' (cuid: 4245, file: SETTING::src/core/Str.pm:2797)

The specialization is for the callsite:
Callsite 0x7f1cd6231ac0 (2 args, 2 pos)
Positional flags: obj, obj

It was planned for the type tuple:
    Type 0: Str (Conc)
    Type 1: Str (Conc)
Which received 367 hits (100% of the 367 callsite hits).

The maximum stack depth is 32.

==========

Observed type specialization of 'substr' (cuid: 4316, file: SETTING::src/core/Str.pm:3061)

The specialization is for the callsite:
Callsite 0x7f1cd6231a60 (3 args, 3 pos)
Positional flags: obj, obj, obj

It was planned for the type tuple:
    Type 0: Str (Conc)
    Type 1: Scalar (Conc) of Int (Conc)
    Type 2: Scalar (Conc) of Int (Conc)
Which received 274 hits (100% of the 274 callsite hits).

The maximum stack depth is 15.

==========

Observed type specialization of 'chars' (cuid: 4219, file: SETTING::src/core/Str.pm:2728)

The specialization is for the callsite:
Callsite 0x7f1cd6231ae0 (1 args, 1 pos)
Positional flags: obj

It was planned for the type tuple:
    Type 0: Scalar (Conc) of Str (Conc)
Which received 273 hits (100% of the 273 callsite hits).

The maximum stack depth is 14.

==========

Observed type specialization of 'substr' (cuid: 2562, file: SETTING::src/core/Cool.pm:74)

The specialization is for the callsite:
Callsite 0x7f1cd6231a60 (3 args, 3 pos)
Positional flags: obj, obj, obj

It was planned for the type tuple:
    Type 0: Scalar (Conc) of Str (Conc)
    Type 1: Int (Conc)
    Type 2: Scalar (Conc) of Int (Conc)
Which received 273 hits (100% of the 273 callsite hits).

The maximum stack depth is 14.

==========

Observed type specialization of 'infix:«>»' (cuid: 3165, file: SETTING::src/core/Int.pm:365)

The specialization is for the callsite:
Callsite 0x7f1cd6231ac0 (2 args, 2 pos)
Positional flags: obj, obj

It was planned for the type tuple:
    Type 0: Int (Conc)
    Type 1: Scalar (Conc) of Int (Conc)
Which received 273 hits (100% of the 273 callsite hits).

The maximum stack depth is 14.

==========

Observed type specialization of 'shorten' (cuid: 1, file: xxx.p6:1)

The specialization is for the callsite:
Callsite 0x7f1cd6231ac0 (2 args, 2 pos)
Positional flags: obj, obj

It was planned for the type tuple:
    Type 0: Str (Conc)
    Type 1: Int (Conc)
Which received 273 hits (100% of the 273 callsite hits).

The maximum stack depth is 13.

==========

Compared to before

So how does this compare to the situation before my recent work in improving specialization? Previously, there was no planner at all, nor the concept of certain specializations. Once a particular static frame hit a (bytecode size based) threshold, the type tuple of the current call was taken and a specialization produced for it. There was no attempt to discern monomorphic, polymorphic, and megamorphic code. The first four distinct type tuples that were seen got specializations. The rest got nothing: no optimization, and no compilation into machine code.

The sorting issue didn’t come up, however, because specializations were always produced at the point of a return from a frame. This meant that we naturally would produce them in deepest-first order. Or at least, that’s the theory. In practice, if we have a callee in a conditional that’s true 90% of the time, before there was a 10% chance we’d miss the opportunity. That’s a lot less likely to happen with the new design.

Another problem the central planning avoids is a bunch of threads set off executing the same code all trying to produce and install specializations. This was handled by letting threads race to install a specialization. However, it could lead to a lot of throwaway work. Now the planner can produce them once; when it analyzes the logs of other threads, it can see that the specialization was already produced, and not plan anything at all.

Future plans

One new opportunity that I would like to exploit in the future is for derived type specializations. Consider assigning into an array or hash. Of course, the types of the assigned values will, in a large application, be hugely variable. Today we’d consider ASSIGN-POS megamorphic and so not try and do any of the type-based optimizations. But if we looked more closely, we’d likely see the type tuples are things like (Array, Int, Str), (Array, Int, Piranha), (Array, Int, Catfish), and so forth. Effectively, it’s monomorphic in its first two arguments, and megamorphic in its third argument. Therefore, a specialization assuming the first two arguments are of type Array and Int, but without specialization of the third argument, would be a win.

Next time…

We now have a specialization plan. Let’s go forth and optimize stuff!


Zoffix Znet: The Rakudo Book Project

Published on 2017-09-17T00:00:00

A plan to write some Rakudo books

gfldex: The Siege Can Continue

Published by gfldex on 2017-09-16T20:14:31

A wise internet spaceship pirate once wrote: “Whining gets you stuff. That’s why humans are at the top of the food chain.”. My Whining got me a fix that put an end to segfaults on long running scripts that react to http requests.

So I continued to add missing bits to my golfed httpd and found another good use for term:<>.

constant term:<HTTP-HEADER-404> = "HTTP/1.1 404 Not Found", "Content-Type: text/plain; charset=UTF-8", "Content-Encoding: UTF-8", "";

Without term:<> the compiler would think I wanted to subtract 400 from an http header.

If you got some time to spare please write a whiny blog post about a crash that is bugging you – works like a charm.


gfldex: Goto the Last Fifo

Published by gfldex on 2017-09-13T08:45:21

I always admired the elegance of the sysfs. You take a text and write it to a file to talk to a function running in kernel space. As soon as you know how to work with files, you can change the system's behaviour and reuse access control mechanisms without any special tools. It's very easy to script and dump (parts of) the system's state.

Linux and friends come with fifos that serve the same purpose. Create a fifo, set access rights and start reading from that pseudo-file. Very easy to do in Perl 5.

# open the fifo read-write so the open call does not block waiting for a writer
my $fifo;
open($fifo, '+<', 'radio-fifo-in') or die "Cannot open fifo: $!";
while (<$fifo>) {
    do_things_with($_);
}

Rakudo doesn't really know about fifos yet; as a result it doesn't block on a read of a fifo that has no data anymore. After a bit of fiddling I found a way around that problem.

use v6.c;

# `mkfifo radio-fifo-in`
# `echo "foo" > radio-fifo-in`
# `echo "foo^D" > radio-fifo-in`

my $fifo-in = open(„radio-fifo-in“, :r);

LABEL: loop {
    react {
        whenever supply { .emit for $fifo-in.lines } {
            say .Str;
            last LABEL if /‚^D‘/;
        }
    }
}

I learned that whenever reacts to last and will teach the docs about it later today. Luckily Perl 6 got labels so we can tell last where to goto.

UPDATE: scovit found a short expression that gets very close to the behaviour of Perl 5.

my $fifo = open("radio-fifo-in", :r);
while defined $_ = $fifo.get { .say }

p6steve: perl6 Atomic Fission

Published by p6steve on 2017-09-03T18:46:48

I have been listening to the reaction on the web to the incorporation of an emoji as a unicode symbol in perl6 rakudo. Here’s a flavour…

(https://p6weekly.wordpress.com/2017/08/21/2017-34-going-atomic/ )

The rationale for the use of unicode symbols is as follows:

BTW- ASCII versions are known as Texas versions since they are always bigger

Certainly this has caused some consternation – ranging from “how can I type ⚛️ on my keyboard?” (hit CTRL-CMD-SPACE if you are on macOS) to “this will never be accepted for the coding standards of my company”.
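
For readers who have not met the operator yet, here is a hedged sketch of both spellings; at the time of this post the atomic ops sat behind use v6.d.PREVIEW:

use v6.d.PREVIEW;

my atomicint $hits = 0;
$hits⚛++;                  # the unicode spelling
atomic-fetch-inc($hits);   # the ASCII ("Texas") spelling
say $hits;                 # 2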

On reflection, while it is understandable that programmers have a well established comfort zone of ASCII text and using English for keywords, I think that perl6 is leading the way on an irresistible path. Of the 6.5bn people on the planet, only a small fraction prefer to work in English – or even in Latin alphabets. Now, the pioneering work to embed unicode in a programming language will open the doors to all kinds of invention. What about:

And this, in combination with perl6 Grammars, opens some interesting conceptual doors.

~p6steve


samcv: Grant Status Update 4

Published by Samantha McVey on 2017-08-31T07:00:00

This is my fourth grant update. A productive and noteworthy month. I gave two presentations at YAPC-EU in the Netherlands earlier this month, on High End Unicode and MoarVM Internals (links to my slides). It was my first Perl conference and I had a great time meeting people in the Perl 6 and Perl community!

Despite the conference, I made some big changes this month: big speedups for indexing operations, as well as an update to the Unicode 10 database. Before I could regenerate the database I had to fix ucd2c.pl, the script that generates the database, to be more deterministic and have unique property values per property (without this regenerating it with the Unicode 10 data would result in partially broken Unicode properties). I also implemented the Knuth-Morris-Pratt string search algorithm for our index function.

I have added documentation to MoarVM which is an overview of how our strings are represented as well as details about normalization, synthetics and other topics. On Hacker News someone noted that this was not documented anywhere, so I made sure to add documentation for this. If you are interested in some of the internals of MoarVM, I’m hoping the document should be pretty accessible even if you are not a MoarVM developer.

Index

Knuth-Morris-Pratt String Search Algorithm

Previously we did not have any natively implemented efficient string search algorithms; we only had memmem, which was optimized but would only work if both strings were flat and had the same bitwidth per grapheme.

Now all index operations with a needle of 4096 graphemes or fewer are optimized and no longer use brute-force searching (this does not include case-insensitive or ignoremark indexing operations).

My KMP implementation in MVM_string_index is used when:
  • The strings have non-matching types

    • and the needle has more than one codepoint

    • and the needle is no more than 4096 graphemes long

  • The speedup can be small or large, depending on the pattern of the needle

    • Repeated letters in the needle can give a several-fold speedup, depending on the haystack

We still use memmem when both strings are flat and of the same data type (that code path uses Knuth-Morris-Pratt or Boyer-Moore depending on the platform). Most of the strings we will work with — especially once the strings have been modified — are strands.
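
For readers unfamiliar with the algorithm, here is a rough, illustrative grapheme-level KMP index written in plain Perl 6. It is only a sketch of the idea, not the MoarVM C implementation.

# Illustrative only: a grapheme-level Knuth-Morris-Pratt index in Perl 6.
sub kmp-index(Str $haystack, Str $needle --> Int) {
    my @n = $needle.comb;
    my @h = $haystack.comb;
    return 0 unless @n;

    # Failure table: length of the longest proper prefix that is also a suffix.
    my @fail = 0 xx @n.elems;
    my $k = 0;
    for 1 ..^ @n.elems -> $i {
        $k = @fail[$k - 1] while $k > 0 && @n[$i] ne @n[$k];
        $k++ if @n[$i] eq @n[$k];
        @fail[$i] = $k;
    }

    # Scan the haystack, never moving backwards through it.
    my $j = 0;
    for ^@h.elems -> $i {
        $j = @fail[$j - 1] while $j > 0 && @h[$i] ne @n[$j];
        $j++ if @h[$i] eq @n[$j];
        return $i - $j + 1 if $j == @n.elems;
    }
    return -1;
}

say kmp-index('abcdefg', 'cde');   # 2
say kmp-index('abcdefg', 'fgh');   # -1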

Grapheme Caching

Since the Knuth-Morris-Pratt algorithm will often request the same grapheme again, but will never request a position earlier in the haystack, I was able to optimize the KMP search function I added to cache graphemes, so we can use a grapheme iterator instead of MVM_string_get_grapheme_at_nocheck, which for strands has to find its place in the haystack from the beginning on every call. The caching works like this: the last returned grapheme is cached, so if the same grapheme is requested again it is returned without asking the grapheme iterator for a new one. If the next grapheme, or a grapheme some number of positions after the current one, is requested, the iterator either returns the next grapheme or skips forward the required number of graphemes first and then returns one.
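
As a rough Perl 6 sketch of the idea (the real code is C inside MoarVM, and the class and method names here are made up), the cache behaves like a forward-only reader:

class ForwardReader {
    has @!graphemes;
    has Int $!pos = -1;
    has $!cached;

    submethod BUILD(Str :$str) { @!graphemes = $str.comb }

    method at(Int $i) {
        die 'forward-only access' if $i < $!pos;
        return $!cached if $i == $!pos;     # cache hit: same grapheme requested again
        $!pos    = $i;                       # otherwise skip forward and re-cache
        $!cached = @!graphemes[$i];
    }
}

my $reader = ForwardReader.new(str => 'weird strands');
say $reader.at(0);   # w
say $reader.at(0);   # w (served from the cache, no iterator work)
say $reader.at(6);   # s (the iterator skipped forward, never rewound)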

Here are some timings I got before and after the grapheme caching for an English-language book text file, searching for English words misspelled by one letter (so it would search the entire file but still be representative of searching for actual words).

Description                             | caching   | master
index with needle (strand haystack)     | 0.832730  | 2.67385181
index with word needle (flat haystack)  | 0.8357464 | 1.01405131

As you can see from the table, we even got savings when the haystack was flat. This surprised me, since getting a grapheme from a flat haystack amounts to a function with one switch that returns the integer of the blob array at the specified position. I am guessing this is likely caused by the caching function generating more efficient machine code, since the savings can’t be explained by the grapheme caching alone: the speed-up was seen even when I manipulated the needle so that there were no cache “hits” and it always requested different graphemes.

ℹ️
The grapheme caching has not yet been merged, but is ready to go after fixing some merge conflicts

Inlining MVM_string_get_grapheme_at_nocheck

After I finished writing the previous section, I was able to discover the reason for the speed difference with flat haystacks. By getting MVM_string_get_grapheme_at_nocheck to inline I was able to speed up index operations for a flat haystack by 2x. This is on top of the speedups we got from the KMP algorithm! This should affect any code which uses this function, making it 2x as fast for flat strings, and likely give a slight speedup for strands as well. This has huge implications, as this function is used extensively throughout our string ops. It may also change what I do with the grapheme caching code: I will likely change it to use grapheme caching for strands and MVM_string_get_grapheme_at_nocheck for flat haystacks.

Single Grapheme Needle Index

I sped up string index by 9x when the haystack is a strand and the needle is 1 grapheme long. For this we use a grapheme iterator when looking for a single grapheme inside of a strand, and use a simpler faster loop since we are only looking for a single grapheme.

Unicode

Property Values

Property values are now unique for each property in the Unicode database. Since property value names are not unique across properties, we must store them per property. Previously a clash meant that only the property whose value came last in the C data array would work. Now that property values are unique for each property code, that should no longer cause breakage.

Unicode Database Updated to Unicode 10

The Unicode database has been updated for Unicode 10. Now that the previous breaking point — the database not always generating properly each time — was solved, I was able to make changes to the Unicode database, including updating it to version 10. The main significant changes in this release were the addition of new scripts, symbols and Emoji. You can see the full list of changes here.

Unicode Names

Hangul (Korean) characters now have names in our name database. These are generated algorithmically on database creation by decomposing each character and concatenating the Jamo names of the 2 or 3 resulting codepoints. This is needed since the Unicode data file does not include the names for these codepoints and leaves it to the implementer to create them. A roast test has been added to check for support of these Korean characters.
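
A quick way to see the effect from Perl 6 (on a Rakudo built with the updated database):

say "한".uniname;   # HANGUL SYLLABLE HAN
say "글".uniname;   # HANGUL SYLLABLE GEUL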

Fixed a bug in ignoremark/ordbaseat

Ignoremark relies on the MoarVM ordbaseat op under the hood. For graphemes made up of a single character, we decompose the character and get the resulting base character. For synthetics, however, the op assumed that once we had the synthetic’s base character we already had the final base character, and skipped that decomposition. This wasn’t true, for example, with "\c[LATIN SMALL LETTER J WITH CARON, COMBINING DOT BELOW]": the "j with caron" would not be decomposed further because it was the base character of a synthetic.

This also fixes a bug in the indexim and eqatim ops which caused them to fail when encountering a synthetic.

We now have ord_getbasechar handle both synthetics and non-synthetics, and have MVM_string_ord_basechar_at handle only the needed string-based checks, grabbing the grapheme from the string and passing its value on to ord_getbasechar.
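
A small user-level illustration of the fixed behaviour (the expected result assumes the fix described above):

my $str = "\c[LATIN SMALL LETTER J WITH CARON, COMBINING DOT BELOW]";
say so $str ~~ m:ignoremark/ 'j' /;   # True: the base character is now fully decomposed to 'j'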

Synthetic Grapheme Rework

I reworked MVMNFGSynthetic, the C struct which represents our synthetic graphemes. This is done to (eventually) get Prepend support working for things like ignoremark regex and the ordbaseat op which gets the base codepoint of a grapheme.

Unlike all other marks which come after a base character, Prepend characters come before the base character. All of our current code assumed the first codepoint of the synthetic is the base character.

For now, we also assume the first codepoint of the synthetic is the base character, but we now have a base_index which will be able to hold the index of the base character in the codepoint array.

While this doesn’t add Prepend support to everything, this is one step toward getting that working, and decoupling base codepoints from being the first codepoint of a synthetic.

ℹ️
This is not yet merged, but ready to go

Collation

There’s not as much to say about this, since it is almost ready. As I said last month it is fully functional, and I have since done some work cleaning it up. Left to be done is integrating the data generation into the other Unicode database generation script. Although the data file and the code used to generate it are different, ideally we will have only one script to run to generate and update all of our files on new Unicode releases.

Previously I created a script which downloads all of the Unicode files needed in the generation, so an update should hopefully only require a run of the download script to fetch the UCD, the UCA (Unicode Collation Algorithm), and Emoji data.

One thing I had not talked about previously is some of the ways I have sped up the UCA implementation to be quite fast and efficient, so that it evaluates the collation arrays efficiently even for very large strings. If you are interested, the details are in my slides.

NQP Documentation

In addition to the MoarVM string documentation I mentioned in the introduction, I also ensured all of the ops I have added over my project are documented in NQP.

I ensured all the NQP index and eqat variations (indexic, indexim, indexicim, eqatic, eqaticim) are documented, and added the couple that had not yet been added to the ops list.

I also added a Perl 6 program which lists the ops that are not mentioned in the oplist, which will hopefully be useful to other developers.

The Goal is in Sight

The grant is finally winding down. I have all significant things implemented, although not everything has been merged yet. I have also implemented additional things that were not part of the grant (such as the Knuth-Morris-Pratt string search algorithm).

Left to be Done

To finally complete my work and fulfill my objectives I will write any additional documentation or tests that are needed. If other Perl 6 devs or users in the community want to make requests for Unicode or string related documentation, you can send me an email or send me a message on freenode IRC (nick samcv).

Other than this, I only need to clean up and merge the collation arrays branch, merge the synthetic grapheme rework, and update the grapheme caching branch for KMP to use caching for strand haystacks and the now-inlined MVM_string_get_grapheme_at_nocheck for flat haystacks.

Zoffix Znet: On Troll Hugging, Hole Digging, and Improving Open Source Communities

Published on 2017-08-31T00:00:00

How to be better

stmuk: Swiss Perl Workshop 2017

Published by stmuk on 2017-08-30T17:48:17


After a perilous drive up a steep, narrow, winding road from Lake Geneva we arrived at an attractive Alpine village (Villars-sur-Ollon) to meet with fellow Perl Mongers in a small restaurant.  There followed much talk and a little clandestine drinking of exotic spirits including Swiss whisky. The following morning walking to the conference venue there was an amazing view of mountain ranges. On arrival I failed to operate the Nespresso machine which I later found was due to it simply being off.  Clearly software engineers should never try to use hardware. At least after an evening of drinking.

Wendy’s stall was piled high with swag including new Bailador (a Dancer-like Perl 6 web framework) stickers, a Shadowcat booklet about Perl 6 and the new O’Reilly “Think Perl 6”. Unfortunately she had sold out of Moritz’s book “Perl 6 Fundamentals” (although there was a sample display copy present). Thankfully later that morning I discovered I had a £3 credit on Google Play Books so I bought the ebook on my phone.

The conference started early with Damian Conway’s Three Little Words.  These were “has”, “class” and “method” from Perl 6 which he liked so much that he had added them to Perl 5 with his “Dios” – “Declarative Inside-Out Syntax” module.  PPI wasn’t fast enough so he had to replace it with a 50,000 character regex PPR. Practical everyday modules mentioned included Regexp::Optimizer and Test::Expr. If the video  doesn’t appear shortly on youtube a version of his talk dating from a few weeks earlier is available at https://www.youtube.com/watch?v=ob6YHpcXmTg

Jonathan Worthington returned with his Perl 6 talk on “How does deoptimization help us go faster?” giving us insight into why Perl 6 was slow at the Virtual Machine level (specifically MoarVM). Even apparently simple and fast operations like indexing an array were slow due to powerful abstractions, late binding and many levels of Multiple Dispatch. In short the flexibility and power of such an extensible language also led to slowness due to the complexity of code paths. The AST optimizer helped with this at compile time but itself took time and it could be better to do this at a later compile time (like Just In Time).  Even with a simple program reading lines from a file it was very hard to determine statically what types were used (even with type annotations) and whether it was worth optimizing (since the file could be very short).

The solution to these dynamic problems was also dynamic but to see what was happening needed cheap logging of execution which was passed to another thread.  This logging is made visible by setting the environment variable MVM_SPESH_LOG to a filename. Better tooling for this log would be a good project for someone.

For execution planning we look for hot (frequently called) code, long blocks of bytecode (slow to run) and consider how many types are used (avoiding “megamorphic” cases with many types which needs many versions of code).  Also analysis of the code flow between different code blocks and SSA.  Mixins made the optimization particularly problematic.

MoarVM’s Spesh did statistical analysis of the code in order to rewrite it in faster, simpler ways. Guards (cheap checks for things like types) were placed to catch cases where it got it wrong, and if these were triggered (infrequently) it would deoptimize as well, hence the counterintuitive title, since “deoptimization enables speculation”. The slides are at http://jnthn.net/papers/2017-spw-deopt.pdf with the video at https://www.youtube.com/watch?v=3umNn1KnlCY The older and more dull-witted of us (including myself) might find the latter part of the video more comprehensible at 0.75 YouTube speed.

After a superb multi-course lunch (the food was probably the best I’d had at any Perl event) we returned promptly to hear Damian talk of “Everyday Perl 6”. He pointed out that it wasn’t necessary to code golf obfuscated extremes of Perl 6 and that the average Perl 5 programmer would see many things simpler in Perl 6.  Also a rewrite from 5 to 6 might see something like 25% fewer lines of code since 6 was more expressive in syntax (as well as more consistent) although performance problems remained (and solutions in progress as the previous talk had reminded us).

Next Liz talked of a “gross” (in the numerical sense of 12 x 12 rather than the American teen sense) of Perl 6 Weeklies as she took us down memory lane to 2014 (just about when MoarVM was launched and when unicode support was poor!)  with some selected highlights and memories of Perl 6 developers of the past (and hopefully future again!). Her talk was recorded at https://www.youtube.com/watch?v=418QCTXmvDU


Cal then spoke of Perl 6 maths which he thought was good with its Rats and FatRats but not quite good enough and his ideas of fixing it.  On the following day he showed us he had started some TDD work on TrimRats. He also told us that Newton’s Method wasn’t very good but generated a pretty fractal. See https://www.youtube.com/watch?v=3na_Cx-anvw

Lee spoke about how to detect Perl 5 memory leaks with various CPAN modules and his examples are at https://github.com/leejo/Perl_memory_talk

The day finished with Lightning Talks and a barbecue at givengain — a main sponsor.

On the second day I noticed the robotic St Bernards dog in a tourist shop window had come to life.


Damian kicked off the talks with my favourite of his talks,  “Standing on the Shoulders of Giants”, starting with the Countess of Lovelace and her Bernoulli number program. This generated a strange sequence with many zeros. The Perl 6 version since it used rational numbers not floating point got the zeros right whereas the Perl 5 version initially suffered from floating point rounding errors (which are fixable).

Among other things he showed us how to define a new infix operator in Perl 6. He also showed us a Perl 6 sort program that looked exactly like LISP even down to the Lots of Irritating Superfluous Parentheses. I think this was quicksort (he certainly showed us a picture of Sir Tony Hoare at some point). Also a very functional (Haskell-like) equivalent  with heavy use of P6 Multiple Dispatch.  Also included was demonstration of P6 “before” as a sort of typeless/multi-type comparison infix. Damian then returned to his old favourite of Quantum Computing.
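
Defining a new infix operator in Perl 6 really does take only a line or two; here is a toy example of my own (not one of Damian's):

sub infix:<±>(Numeric $a, Numeric $b) { ($a - $b) .. ($a + $b) }
say 10 ± 2;   # 8..12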

My mind and notes got a bit jumbled at this point but I particularly liked the slide that explained how factorisation could work by observing the product of possible inputs since this led to a collapse that revealed the factors.  To do this on RSA etc., of course, needs real hardware support which probably only the NSA and friends have (?). Damian’s code examples are at http://www.bit.do/Perl6SOG with  an earlier version of his talk at https://www.youtube.com/watch?v=Nq2HkAYbG5o Around this point there was a road race of classic cars going on outside up the main road into the village and there were car noises in the background that strangely were more relaxing than annoying.


After Quantum Chaos Paul Johnson brought us all back down to ground with an excellent practical talk on modernising legacy Perl 5 applications based on his war stories. Hell, of course, is “Other People’s Code”, often dating from Perl’s early days and lacking documentation and sound engineering.

Often the original developers had long since departed or, in the worse cases, were still there.  Adding tests and logging (with stack traces) were particularly useful. As was moving to git (although its steep learning curve meant mentoring was needed) and handling CPAN module versioning with pinto.  Many talks had spoken of the Perl 6 future whereas this spoke of the Perl 5 past and present and the work many of us suffer to pay the bills. It’s at https://www.youtube.com/watch?v=4G5EaUNOhR0


Jonathan then spoke of reactive distributed software.  A distributed system is an async one where “Is it working?” means “some of it is working but we don’t know which bits”.  Good OO design is “tell don’t ask” — you tell remote service to do something for you and not parse the response and do it yourself thus breaking encapsulation.  This is particularly important in building well designed distributed systems since otherwise the systems are less responsive and reliable.  Reactive (async) works better for distributed software than interactive (blocking or sync).

We saw a table that used a Perl 6 Promise for one value and a Supply for many values in reactive (async) code, with a plain value and a Perl 6 Seq as the interactive equivalents. A Supply can be used for pub/sub and the Observer Pattern. A Supply can either be live (like broadcast TV) or, as most Perl 6 supplies are, on-demand (like Netflix). Then samples of networking (socket) based code were discussed, including a web client, a web server and SSH::LibSSH (async client bindings, often very useful in practical applications like port forwarding)

https://github.com/jnthn/p6-ssh-libssh

Much of the socket code had a pattern of “react { whenever {” blocks, with “whenever” as a sort of async loop. He then moved on from sockets to services (using a Supply pipeline) and amazed us by announcing the release of “cro”, a microservices library that even supports HTTP/2 and WebSockets, at http://mi.cro.services/.  This is installable using Perl 6 by “zef install --/test cro”.
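
The pattern looks roughly like this (a hedged sketch; the host, port and echo behaviour are invented for illustration):

my $listener = IO::Socket::Async.listen('127.0.0.1', 3333);

react {
    whenever $listener -> $conn {                 # one block per incoming connection
        whenever $conn.Supply.lines -> $line {    # the async "loop" over incoming lines
            $conn.print("echo: $line\n");
            $conn.close if $line eq 'quit';
        }
    }
    whenever signal(SIGINT) { done }              # leave the react block on Ctrl-C
}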

Slides at http://jnthn.net/papers/2017-spw-sockets-services.pdf and video at https://www.youtube.com/watch?v=6CsBDnTUJ3A

Next Lee showed Burp Scanner, which is payware but probably the best web vulnerability scanner. I wondered if anyone had dared to run it on ACT or the hotel’s captive portal.

Wendy did some cheerleading in her “Changing Image of Perl”.  An earlier version is at https://www.youtube.com/watch?v=Jl6iJIH7HdA

Sue’s talk was “Spiders, Gophers, Butterflies” although the latter were mostly noticeably absent. She promises me that a successor version of the talk will use them more extensively. Certainly any Perl 6 web spidering code is likely to fit better on one slide than the Go equivalent.

During the lightning talks Timo showed us a very pretty Perl 6 program using his SDL2::Raw to draw an animated square spiral with hypnotic colour-cycling patterns. There was also a talk by the author of https://bifax.org/bif/ – a distributed bug tracking system (which works offline, like git).

Later in the final evening many of us ate and chatted in another restaurant where we witnessed a dog fight being narrowly averted and learnt that Wendy didn’t like Perl 5’s bless for both technical and philosophical reasons.


p6steve: perl6 Module How To

Published by p6steve on 2017-08-18T16:29:28

Some investigation has discovered great resources on how to write and then list a perl6 module….


p6steve: Physics::Unit in perl6

Published by p6steve on 2017-08-18T16:23:07

First and foremost, homage to the original authors of Physics::Unit and related perl5 CPAN modules. I would be honoured to hear from you and to collaborate in any way.

What’s the big picture? TOP down, I have in mind:

So, that said, given my poor state of knowledge of most of these things, my thinking is to start building BOTTOM up and see the shape of things that emerges, while learning some perl6 on the way.

So, first I am going to need some MEASUREMENTS which are VALUES expressed in UNITS with associated ERROR.
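
As a starting point, that suggests something like the following toy type. Every name here is made up for illustration, not the eventual Physics::Unit API:

class Measurement {
    has Numeric $.value;
    has Str     $.units;
    has Numeric $.error = 0;

    method Str { "$!value ± $!error $!units" }
}

put Measurement.new(value => 9.81, units => 'm/s^2', error => 0.02);
# 9.81 ± 0.02 m/s^2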

I took a gander at the CPAN module Physics::Udunits2, which is a perl5 interface to udunits2, and felt that the richness of its units and adherence to NIST guidelines were not of sufficient benefit to overcome my sense of incoherence.

So, to cut a long story short, I decided to take inspiration from Physics::Unit.

Next, I needed some guidance on How to Build a perl6 module…


p6steve: perl6, really?

Published by p6steve on 2017-08-18T15:19:54

I have been waiting for perl6 for over 15 years, since it was first conceived. Recently, I have had an urge to get back to hands-on coding and, having seen the latest Rakudo* release of perl6, I felt that it is now sufficiently mature for my nefarious purposes.

No doubt I am not the only person to have been frustrated by the slow progress of perl6, and certainly many have dropped by the wayside. Perhaps going to the siren call of Python(2 or 3), Ruby, Swift or Go. And now it is finally here, the community is obviously worried that no one will adopt the fruits of their work.

Here ‘zoffix’ makes a desperate plea to change the name from ‘perl6’ to ‘rakudo’ to reboot the brand…. https://www.reddit.com/r/perl6/comments/6lstq3/the_hot_new_language_named_rakudo/

My rebuttal to this concept is reproduced here.

I’m a slow thinking guy who has had two careers:- perl5 dev and marketing manager. I have been wrestling with Zoffix’ proposed change and it is certainly a well construed argument made with feeling. Here’s my line:

So, I detect some natural frustration within and without the community. Keep the faith. We have a new audience, they value truth and beauty, and a story of battling the odds. They need this TECHNOLOGY. It is [perl6 | rakudo] – who cares. It may have to start in academia, it may have to start in the p5 stalwarts. It will ignite. Finish the journey. Do not deny your heritage.

So, yes, I think that perl6 is awesome (whatever it’s called). And I believe that it will be an interesting personal journey to come to grips with the ultimate programming power tool and to deliver something interesting on the way. As a user and sometime low level contributor perhaps.

~p6steve


Perl 6 Maven: Printing to Standard Error in Perl 6

Published by szabgab

Perlgeek.de: My Ten Years of Perl 6

Published by Moritz Lenz on 2017-08-08T22:00:01

Time for some old man's reminiscence. Or so it feels when I realize that I've spent more than 10 years involved with the Perl 6 community.

How I Joined the Perl 6 Community

It was February 2007.

I was bored. I had lots of free time (crazy to imagine that now...), and I spent some of that answering (Perl 5) questions on perlmonks. There was a category of questions where I routinely had no good answers, and those were related to threads. So I decided to play with threads, and got frustrated pretty quickly.

And then I remember that a friend in school had told me (about four years earlier) that there was this Perl 6 project that wanted to do concurrency really well, and even automatically parallelize some stuff. And this was some time ago, maybe they had gotten anywhere?

So I searched the Internet, and found out about Pugs, a Perl 6 compiler written in Haskell. And I wanted to learn more, but some of the links to the presentations were dead. I joined the #perl6 IRC channel to report the broken link.

And within three minutes I got a "thank you" for the report, the broken links were gone, and I had an invitation for a commit bit to the underlying SVN repo.

I stayed.

The Early Days

Those were the wild young days of Perl 6 and Pugs. Audrey Tang was pushing Pugs (and Haskell) very hard, and often implemented a feature within 20 minutes after somebody mentioned it. Things were unstable, broken often, and usually fixed quickly. No idea was too crazy to be considered or even implemented.

We had bots that evaluated Perl 6 and Haskell code, and gave the result directly on IRC. There were lots of cool (and sometimes somewhat frightening) automations, for example for inviting others to the SVN repo, to the shared hosting system (called feather), for searching SVN logs and so on. Since git was still obscure and very unusable, people tried to use SVK, an attempt to implement a decentralized version control system on top of the SVN protocol.

Despite some half-hearted attempts, I didn't really make inroads into compiler developments. Having worked with neither Haskell nor compilers before proved to be a pretty steep step. Instead I focused on some early modules, documentation, tests, and asking and answering questions. When the IRC logger went offline for a while, I wrote my own, which is still in use today.

I felt at home in that IRC channel and the community. When the community asked for mentors for the Google Summer of Code project, I stepped up. The project was a revamp of the Perl 6 test suite, and to prepare for the mentoring task, I decided to dive deeper. That made me the maintainer of the test suite.

Pet Projects

I can't recount a full history of Perl 6 projects during that time range, but I want to reflect on some projects that I considered my pet projects, at least for some time.

It is not quite clear from this (very selected) timeline, but my Perl 6 related activity dropped around 2009 or 2010. This is when I started to work full time, moved in with my girlfriend (now wife), and started to plan a family.

Relationships

The technologies and ideas in Perl 6 are fascinating, but that's not what kept me. I came for the technology, but stayed for the community.

There were and are many great people in the Perl 6 community, some of whom I am happy to call my friends. Whenever I get the chance to attend a Perl conference, workshop or hackathon, I find a group of Perl 6 hackers to hang out and discuss with, and generally have a good time.

Four events stand out in my memory. In 2010 I was invited to the Open Source Days in Copenhagen. I missed most of the conference, but spent a day or two with (if memory serves right) Carl Mäsak, Patrick Michaud, Jonathan Worthington and Arne Skjærholt. We spent some fun time trying to wrap our minds around macros, the intricacies of human and computer language, and Japanese food. (Ok, the last one was easy). Later the same year, I attended my first YAPC::EU in Pisa, and met most of the same crowd again -- this time joined by Larry Wall, and over three or four days. I still fondly remember the Perl 6 hallway track from that conference. And in 2012 I flew to Oslo for a Perl 6 hackathon, with a close-knit, fabulous group of Perl 6 hackers. Finally, the Perl Reunification Summit in the beautiful town of Perl in Germany, which brought together Perl 5 and Perl 6 hackers in a very relaxed atmosphere.

For three of these four events, different private sponsors from the Perl and Perl 6 community covered travel and/or hotel costs, with their only motivation being meeting folks they liked, and seeing the community and technology flourish.

The Now

The Perl 6 community has evolved a lot over the last ten years, but it is still a very friendly and welcoming place. There are lots of "new" folks (where "new" is everybody who joined after me, of course :D), and a surprising number of the old guard still hang around, some more involved, some less, all of them still very friendly and supportive.

The Future

I anticipate that my family and other projects will continue to occupy much of my time, and it is unlikely that I'll be writing another Perl 6 book (after the one about regexes) any time soon. But the Perl 6 community has become a second home for me, and I don't want to miss it.

In the future, I see myself supporting the Perl 6 community through infrastructure (community servers, IRC logs, running IRC bots etc.), answering questions, writing a blog article here and there, but mostly empowering the "new" guard to do whatever they deem best.

samcv: Grant Status Update 3

Published on 2017-08-06T07:00:00

This month I accomplished a lot. We now support full Unicode 9.0/Emoji 4.0 text segmentation (putting codepoints into graphemes) and have some really awesome concatenation improvements (both are in master), as well as a fully functional Unicode Collation Algorithm (not yet merged into master).

Unicode Collation Algorithm

In my collation-arrays branch I now have implemented a fully working Unicode Collation Algorithm. Only 82/190,377 of the Unicode collation tests are failing (99.956% passing).

What I did:

  • Collation keys that need to be generated on the fly are working

  • Characters that need to be decomposed for collation (mainly Hangul characters) are working

  • The script that generates the linked list (used for codepoints with more than one collation array in them or for sequences of more than one codepoint with collation keys) was rewritten and I have not discovered any issues with the data

I was also able to properly keep the ability to adjust each of the collation levels (reversing or disabling primary, secondary or tertiary levels).

Since primary > secondary > tertiary for all collation elements we are able to have a default array:

{-1, 0, 1} /* aka Less, Same, More */

This is the default array that is used in case we are comparing between different levels (primary vs secondary for instance).

If we are comparing between the same level, we use a 3x3 array which holds the return values based on whether the collation values are either Less, More or Same. This is the array that is customized to allow you to reverse or disable different collation levels.

 {-1, 0, 1}, {-1, 0, 1}, {-1, 0, 1}
/*  primary,  secondary,   tertiary */

Below, pos_a and pos_b track moving between keys on the collation stack while level_a and level_b track comparing the levels with each other. For more information about how this works, see my previous grant status update (Grant Status Update 2).

if (level_a == level_b)
    effective_level_eval = level_eval_settings.a[level_a];
else
    effective_level_eval = level_eval_default

rtrn =
stack_a.keys[pos_a].a[level_a] < stack_b.keys[pos_b].a[level_b] ?  effective_level_eval.s2.Less :
stack_a.keys[pos_a].a[level_a] > stack_b.keys[pos_b].a[level_b] ?  effective_level_eval.s2.More :
                                                                   effective_level_eval.s2.Same ;

Need to do:

  • Either use the MVM_COLLATION_QC property to determine whether to look up a character in the linked list, or instead store the index of the value of the main node in the Unicode database

    • Currently it checks all of the main nodes before then reverting to the values stored in the Unicode Database or based on generated values

  • Investigate country/language selective sorting

Other than needing to improve the speed of finding the main nodes, the implementation is feature complete.

The currently failing tests mostly relate to codepoints whose collation values are different in NFD form compared to NFC form. This happens if there are combining accent/extending characters after them, which would be reordered in NFD form. As the number of failing tests is quite low and this causes only a very slight variation in sorting order, I find it acceptable in its current form. I may find a workaround to fix it, but since 99.956% of the tests pass and the incorrectness is only a slight variation in sorting, it is not a priority.

Text Segmentation/Normalization

Emoji Fixes

Improvements to segmentation of Emoji w/ GCB=Other

Not all Emoji Modifiers have Grapheme_Cluster_Break = E_Base or E_Base_GAZ. In these cases we need to check the Emoji_Modifier_Base property. This fixed 25 more emoji than before.

Commit: 4ff2f1f9

Don’t break after ZWJ for Emoji=True + GCB=Other

With this change we are now counting 100% of the Emoji v4.0 emoji as a single grapheme. Since ASCII numbers and the pound symbol are also counted as Emoji, we disregard any codepoints in the ASCII range. I updated the MoarVM Readme to show we have full Unicode 9.0/Emoji 4.0 text segmentation support!
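
The user-visible effect in Perl 6 (the results assume a Rakudo built on a MoarVM with these changes):

say "👨‍👩‍👧‍👦".chars;   # 1: a ZWJ family sequence is now a single grapheme
say "🇳🇱".chars;        # 1: a regional-indicator pair (a flag) is also one grapheme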

Other Improvements

Avoid property lookups for Hangul and just use the Grapheme_Cluster_Break property.

Lookup Canonical_Combining_Class as the raw integer property value instead of having to get the string and then converting to an integer. Even though the numbers we get back are not identical to the actual CCC, when doing the Unicode algorithms we only care about relative CCC. This results in lower CPU usage during normalization.

Fix cmp/leg to support deterministic comparing of synthetics

Fixes the cmp/leg operator to have deterministic results when the string contains a synthetic. Previously it compared based on the value the synthetic was stored as, which is based on which synthetic was allocated first. This makes it so cmp/leg always have the same result for the same string, no matter the order the synthetics are allocated. It also makes it so the codepoints the synthetic is made up of are compared instead of the value of the synthetic (whose value has no relation to the codepoints in the synthetic).

Previously we compared naïvely by grapheme, and ended up comparing synthetic codepoints with non-synthetics. This would cause synthetics to be sorted incorrectly, in addition to it making comparing things non-deterministic; if the synthetics were added in a different order, you would get a different result with MVM_string_compare.

Now, we compare down the string the same way we did before, but if the first non-matching grapheme is a synthetic, we iterate over the codepoints in the synthetic, so the result is based on codepoints. If the two non-matching graphemes contain the same initial codepoints but differ in length, the grapheme with fewer codepoints is considered less than the other.
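
A small illustration of the now-deterministic behaviour (the examples and results are mine, chosen so the graphemes stay synthetic because 'x' has no precomposed accented forms):

say "x\x[0301]" cmp "x\x[0302]";   # Less: compared codepoint by codepoint, not by synthetic id
say "x\x[0301]" cmp "x";           # More: same initial codepoint, but the synthetic has more codepoints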

Concatenation Improvements

When we concatenate strings normally, we create a new string which contains references to the string or strands of string a and the string or strands of string b.

Previously if there were any of the following conditions for the last grapheme of string a (last_a) or the first grapheme of string b (first_b), we would renormalize string a and string b in their entirety:

  • last_a or first_b were synthetics (contained multiple codepoints in one grapheme) other than the CRLF synthetic

  • Didn’t pass NFG Quickcheck

  • Had a non-zero Canonical_Combining_Class

The first improvement I made was to use the same code that we use during normalization (should_break) to find out whether two adjacent codepoints should be broken up or whether they are part of the same grapheme (user-visible character unit). This first improvement had a very big effect, since previously we were renormalizing even if renormalization would not have caused any change in the final output. This change reduced the amount of full renormalization by about 80% in my tests of scripts from various languages.
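
To make the seam problem concrete, here is a small user-level illustration (the counts are what normal NFG behaviour should give):

my $a = 'abc';
my $b = "\c[COMBINING ACUTE ACCENT]def";
say $a.chars;         # 3
say $b.chars;         # 4 (the lone combining accent is its own degenerate grapheme)
say ($a ~ $b).chars;  # 6: at the seam, 'c' and the accent combine into one grapheme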

The second change I made was more technically challenging. While the first change reduces how often we would do full renormalization, the second change’s goal was to only renormalize a small section of string a and string b instead of string a and b in their entirety.

With the new code, we either copy the strands or set string a as the first strand of the new string (as we did previously). Then the should_break function we use for checking if we break between two codepoints or not is used. If we break between last_a and first_b we continue as I explained at the very start of this section. If should_break returns that we need to combine the two codepoints, then we create a renormalized_section strand for those two graphemes. Since last_a is now inside the renormalized_section, we adjust the reference to string a (similar to how .substr creates a substring) or if string a was made up of multiple strands, we adjust the last strand of string a.

We then insert the renormalized section. String b or its strands are then set after the renormalized section. The same way we adjusted a, we adjust b or its first strand to be a substring.

Since it’s possible for the adjusted strings or strands to be fully consumed (i.e. the substring is 0 in length), we "drop" the copied strand or string (for a or b) if the adjusted substring has no more graphemes in it. This currently can only happen if string a or the last strand of string a is a single grapheme or the first strand of string b or string b is a single grapheme.

Other minor fixes

Fix overflow in uniprop lookups

Makes excessively large uniprop queries not overflow into negative numbers, which would cause it to throw. Closes MoarVM issue #566.

Commit: a3e98692.

Made string_index 16% faster for incompatible string types

For incompatible string types use string_equal_at_ignore_case_INTERNAL_loop which results in a 16% speed boost. Only a minor change to the INTERNAL_loop function and it works for non-ignorecase/ignoremark as well as with ignorecase/ignoremark functionality.

Commit: 161ec639.

Add nqp::codes op to MoarVM for 3.5x faster .codes on short strings

Added an nqp::codes op which gets the number of codepoints in a string. We already had nqp::chars to get the number of graphemes, but to get the number of codepoints in a string we previously called .NFC.codes, which is 3.5x slower for short strings and 2x slower for longer strings.
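
At the Perl 6 level this is the op behind .codes. A small example (the counts are mine; 'x' plus an acute has no precomposed form, so the codepoints stay separate):

my $str = "max\c[COMBINING ACUTE ACCENT]im";
say $str.chars;   # 5 graphemes: m a x́ i m
say $str.codes;   # 6 codepoints, now counted by nqp::codes instead of .NFC.codes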

Tests

Added in tests to make sure that concat is stable when concatenating things which change under normalization. This was done to make sure there were no issues with the concatenation improvement.

Next Month

In the next month I need to get the Collation stuff merged and work on fixing some very longstanding bugs in our Unicode Database implementation. I may not be fully changing out our Unicode Database implementation as I had planned, but I will still at least fix the majority of the issues.

This includes ucd2c.pl not being deterministic and changing every time it is run. I plan to make it generate the same code on every run. It was originally written assuming that properties and property values are unique. I need to make it so that when the script regenerates the database, all of the properties and values are functional. Currently, when it is regenerated, it is random which property values are functional and which are not, depending on which ends up last in the hash initializer array (since property values which occur last overwrite identically named ones which appear earlier).

Perl 6 Maven: Continuous Integration for Perl 6 modules

Published by szabgab

rakudo.org: Announce: Rakudo Star Release 2017.07

Published by Steve Mynott on 2017-07-24T18:36:58

A useful and usable production distribution of Perl 6

On behalf of the Rakudo and Perl 6 development teams, I’m pleased to announce the July 2017 release of “Rakudo Star”, a useful and usable production distribution of Perl 6. The tarball for the July 2017 release is available from https://rakudo.perl6.org/downloads/star/.

Binaries for macOS and Windows (64 bit) are also available.

This is the eighth post-Christmas (production) release of Rakudo Star and implements Perl v6.c. It comes with support for the MoarVM backend (all module tests pass on supported platforms).

IMPORTANT: “panda” is to be removed very shortly since it is deprecated. Please use “zef” instead.

Currently, Star is on a quarterly release cycle and 2017.10 (October) will follow later this year.

Please note that this release of Rakudo Star is not fully functional with the JVM backend from the Rakudo compiler. Please use the MoarVM backend only.

In the Perl 6 world, we make a distinction between the language (“Perl 6”) and specific implementations of the language such as “Rakudo Perl”.

This Star release includes release 2017.07 of the Rakudo Perl 6 compiler, version 2017.07 MoarVM, plus various modules, documentation, and other resources collected from the Perl 6 community.

Note this Star release contains NQP version 2017.07-9-gc0abee7 rather than the release NQP 2017.07 in order to fix the --ll-exception command line flag.

The Rakudo compiler changes since the last Rakudo Star release of 2017.01 are now listed in “2017.05.md”, “2017.06.md” and “2017.07.md” under the “rakudo/docs/announce” directory of the source distribution.

Notable changes in modules shipped with Rakudo Star:

+ DBIish: Doc and CI updates
+ doc: Too many to list. p6doc fixed.
+ grammar-debugger: Works again now.
+ p6-io-string: New dep for doc.
+ p6-native-resources: Removed since deprecated and not used by linenoise.
+ panda: Officially deprecate panda in favour of zef.
+ perl6-Test-When: New dep for perl6-pod-to-bigpage.
+ perl6-lwp-simple: Fix breakage due to rakudo encoding refactor.
+ tap-harness6: Replaces deprecated tap-harness6-prove6.
+ zef: Too many to list.

There are some key features of Perl 6 that Rakudo Star does not yet handle appropriately, although they will appear in upcoming releases. Some of the not-quite-there features include:

+ advanced macros
+ non-blocking I/O (in progress)
+ some bits of Synopsis 9 and 11

There is an online resource at http://perl6.org/compilers/features that lists the known implemented and missing features of Rakudo’s backends and other Perl 6 implementations.

In many places we’ve tried to make Rakudo smart enough to inform the programmer that a given feature isn’t implemented, but there are many that we’ve missed. Bug reports about missing and broken features are welcomed at rakudobug@perl.org.

See https://perl6.org/ for links to much more information about Perl 6, including documentation, example code, tutorials, presentations, reference materials, design documents, and other supporting resources. Some Perl 6 tutorials are available under the “docs” directory in the release tarball.

The development team thanks all of the contributors and sponsors for making Rakudo Star possible. If you would like to contribute, see http://rakudo.org/how-to-help, ask on the perl6-compiler@perl.org mailing list, or join us on IRC #perl6 on freenode.

Perlgeek.de: Perl 6 Fundamentals Now Available for Purchase

Published by Moritz Lenz on 2017-07-21T22:00:01

After about nine months of work, my book Perl 6 Fundamentals is now available for purchase on apress.com and springer.com.

The ebook can be purchased right now, and comes in the epub and PDF formats (with watermarks, but DRM free). The print form can be pre-ordered from Amazon, and will become ready for shipping in about a week or two.

I will make a copy of the ebook available for free for everybody who purchased an earlier version, "Perl 6 by Example", from LeanPub.

The book is aimed at people familiar with the basics of programming; prior Perl 5 or Perl 6 knowledge is not required. It features a practical example in most chapters (no mammal hierarchies or class Rectangle inheriting from class Shape), ranging from simple input/output and text formatting to plotting with python's matplotlib libraries. Other examples include date and time conversion, a Unicode search tool and a directory size visualization.

I use these examples to explain a subset of Perl 6, with many pointers to more documentation where relevant. Perl 6 topics include the basic lexicographic structure, testing, input and output, multi dispatch, object orientation, regexes and grammars, usage of modules, functional programming and interaction with python libraries through Inline::Python.

Let me finish with Larry Wall's description of this book, quoted from his foreword:

It's not just a reference, since you can always find such materials online. Nor is it just a cookbook. I like to think of it as an extended invitation, from a well-liked and well-informed member of our circle, to people like you who might want to join in on the fun. Because joy is what's fundamental to Perl. The essence of Perl is an invitation to love, and to be loved by, the Perl community. It's an invitation to be a participant of the gift economy, on both the receiving and the giving end.

Perl 6 Maven: LWP::Simple - a simple web client in Perl 6

Published by szabgab

Perlgeek.de: The Loss of Name and Orientation

Published by Moritz Lenz on 2017-07-10T22:00:01

The Perl 6 naming debate has started again. And I guess with good reason. Teaching people that Perl 6 is a Perl, but not the Perl requires too much effort. Two years ago, I didn't believe. Now you're reading a tired man's words.

I'm glad that this time, we're not discussing giving up the "Perl" brand, which still has very positive connotations in my mind, and in many other minds as well.

And yet, I can't bring myself to like "Rakudo Perl 6" as a name. There are two very shallow reasons for that: one, going from two syllables, "Perl six", to five of them seems a step in the wrong direction. And two, I remember the days when the name was pretty young, and people would misspell it all the time. That seems to have abated, though I don't know why.

But there's also a deeper, probably sentimental old man's reason. I remember the days when Pugs was actively developed, and formed the center of a vibrant community. When kp6 and SMOP and all those weird projects were around. And then, just when it looked like there was only a single compiler around, Stefan O'Rear conjured up niecza, almost single-handedly, and out of thin air. Within months, it was a viable Perl 6 compiler, one that people on #perl6 readily recommended.

All of this was born out of the vision that Perl 6 was a language with no single, preferred compiler. Changing the language name to include the compiler name means abandoning this vision. How can we claim to welcome alternative implementations when the commitment to one compiler is right in the language name?

However I can't weigh this loss of vision against a potential gain in popularity. I can't decide if it's my long-term commitment to the name "Perl 6" that makes me resent the new name, or valid objections. The lack of vision mirrors my own state of mind pretty well.

I don't know where this leaves us. I guess I must apologize for wasting your time by publishing this incoherent mess.

Perl 6 Maven: MongoDB with Perl 6 on Linux

Published by szabgab

Perl 6 Maven: Parsing command line arguments in Perl 6 - ARGS - ARGV - MAIN

Published by szabgab

samcv: Grant Status Update 2

Published on 2017-07-04T07:00:00

This is my second grant progress report for my Perl Foundation grant entitled "Improving the Robustness of Unicode Support in Rakudo on MoarVM".

I got to working on collation this month. I'm going to explain a bit of how the Unicode Collation Algorithm works.

Unicode Collation Algorithm

The collation data for UCA is made up of arrays like so:

[primary.secondary.tertiary]

Each one is an integer. Primary differs for different letters, 'a' vs 'z'. Secondary covers differences such as diacritics, 'a' vs 'á'. And tertiary is case, 'a' vs 'A'. While it works differently for non-Latin characters, that is the gist of what they represent. In most cases you have one codepoint mapped to one collation array, though in many cases this is not true.

Single codepoints can map to multiple collation array elements. Sequences of codepoints can also map to one or more than one collation array elements.

Some sequences also can exist inside others.

So the string xyz may have one set of collation elements while xy has another, where x, y and z are codepoints in a three-codepoint sequence with its own set of multiple collation keys.

So, how do these collation elements translate into sorting the codepoints?

[.0706.0020.0002], [.06D9.0020.0002]

You take the two primary values, then append a 0 as a separator. Then push the secondary values, append another 0 as a separator, and then push on the tertiary values:

0706, 06D9, 0, 0020, 0020, 0, 0002, 0002

Now this would pose a problem, since we would need to traverse the entire string before making any decisions. Instead, what I have decided to do is to keep the [primary.secondary.tertiary] arrays and push them onto a stack rather than changing them into a linear progression, iterate through the primaries, and then grab more collation elements as they are needed to resolve ties.

Also, when collation data for the next codepoint is added to the stack, if it is the starter of a sequence we will also pull in the next codepoint, going through a linked list stored in C arrays as needed. If that next codepoint ends up not being part of a sequence, we just push the codepoint we "peeked" at onto the stack as well, so we don't have to go back over codepoints.

Now this improved Unicode Collation Algorithm is not complete, but I am continuing to work on integrating the new C data structure I've created into MoarVM, and it currently works partially, but not as well as the current implementation.

Improvements to the Current UCA Implementation

In the meantime I have made improvements to the current implementation of the Unicode Collation Algorithm. Previously it was possible to enable or disable the primary, secondary or tertiary levels. This allowed you to do things such as ignore diacritics when sorting or ignore casing. What you are now able to do is to reverse the sorting of different levels. This allows you to for example sort uppercase letters before lowercase (default UCA sorts lowercase before uppercase, since lowercase < uppercase). It can also let you put diacritic mark containing characters before the ordinary letters. Any of the three levels can be either enabled, disabled, or reversed. For anybody already using it, supplying True or False to set $*COLLATION still works the same as before, but you are now able to supply 1, 0, or -1 to enable, disable or reverse the collation for specific levels.
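
A hedged sketch of what that looks like from Perl 6. It assumes the experimental collation support and the $*COLLATION.set interface the post refers to; the exact routine names and the output may differ:

use experimental :collation;

# Reverse the tertiary (case) level: uppercase now sorts before lowercase.
$*COLLATION.set(:primary(1), :secondary(1), :tertiary(-1));
say <b A a B>.collate;   # expected to be along the lines of (A a B b)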

Fixes

Grapheme Cluster Break

As I said last week I made improvements to the script that tests our breakup of graphemes. Now we have full support for the Prepend property that was added in Unicode 9.0, as well as passing all the tests for regional indicators. The only tests we now don't pass in GraphemeClusterBreakTest.t are a few emoji tests, and I believe we only fail 3 or so of these! The Prepend mark fixes needed us to save more state across parsing the code, as Prepend is different from all other Unicode grapheme break logic in that it comes before not after a base character.

Ignorecase+Ignoremark Regex

The longstanding bug I mentioned in my previous status report has now been fixed. The bug was in regex when both ignorecase and ignoremark adverbs were used.

say "All hell is breaking loose" ~~ m:i:m/"All is fine, I am sure of it"/
# OUTPUT«「All hell is breaking loose」␤» Output before the fix. This should not have matched.

This bug occurred when the entire length of the haystack was searched and all of the graphemes matched the needle.

If the needle exceeded the length of the haystack past that point, it would erroneously think there was a match there, as it only checked that it matched the whole length of the haystack.

This would cause 'fgh' to be found in 'abcdefg'. The bug only occurred at the very end of the haystack.

The internal string_equal_at_ignore_case_INTERNAL_loop now returns -1 if there was no match and 0 or more if there was a match at that index.

This return value provides new information which is 0 if there was a match and some positive integer when the haystack was expanded when casefolding it.

As explained by my previous post, information about when characters expand when foldcased must be retained.

This information had been planned to be exposed in some way at a future date, as if we are searching for 'st' inside a string 'stabc', nqp::indexic (index ignorecase) will indicate that it is located at index 0, and in Perl 6 Rakudo it will return 'sta' when it should instead have returned 'st'.

For now this additional information is only internal and the return values of the nqp::indexic_s and nqp::equatic_s ops have not changed.

NQP Codepath Problems…

Previously there were far too many different codepaths handling the different variations: no regex adverbs, ignorecase, ignoremark, and ignorecase+ignoremark. Problematically, each combination had its own codepath. To really solve this bug and improve the code quality I decided to clean it up and correct this.

In my past work I had already added a nqp::indexic op, so it was time to add another! I added a nqp::indexicim op and a nqp::eqaticim op and was able to reuse most of the code and not increase our code burden much on the MoarVM side, and greatly reduce the possibility for bugs to get in on varying combinations of ignorecase/ignoremark ops.

This was a very longstanding Unicode bug (I don't think both adverbs together ever worked), so it's great that it is now fixed :).

Coming Up

I will continue to fix the issues in the new Unicode Collation Algorithm implementation as I described earlier in this post. I also plan on taking stock of all of the current Grapheme Cluster Break issues, which now exist only for certain Emoji (though the vast majority of Emoji work properly).

I will also be preparing my talks for the Amsterdam Perl conference as well!

Sidenote

I released a new module, Font::QueryInfo which allows you to query font information using FreeType. It can even return the codepoints a font supports as a list of ranges!

Perlgeek.de: Living on the (b)leading edge

Published by Moritz Lenz on 2017-06-24T22:00:01

Perl 6 is innovative in many ways, and sometimes we don't fully appreciate all the implications, for good or for bad.

There's one I stumbled upon recently: The use of fancy Unicode symbols for built-in stuff. In this case: the `.gist` output of Match objects. For example

my token word { \w+ }
say 'abc=def' ~~ /<word> '=' <word>/;
produces this output:
「abc=def」
 word => 「abc」
 word => 「def」

And that's where the problems start. In my current quest to write a book on Perl 6 regexes, I noticed that the PDF that LeanPub generates from my Markdown sources doesn't correctly display those pesky 「」 characters, which are

$ uni -c 「」
「 - U+0FF62 - HALFWIDTH LEFT CORNER BRACKET
」 - U+0FF63 - HALFWIDTH RIGHT CORNER BRACKET

When I copied the text from the PDF and pasted into my editor, they showed up correctly, which indicates that the characters are likely missing from the monospace font.

The toolchain allows control over the font used for displaying code, so I tried all the monospace fonts that were available. I tried them in alphabetical order. Among the earlier fonts I tried was Deja Vu Sans Mono, which I use in my terminal, and which hasn't let me down yet. No dice. I arrived at Noto, a font designed to cover all Unicode codepoints. And it didn't work either. So it turns out these two characters are part of some Noto Sans variants, but not of the monospace font.

My terminal, and even some font viewers, use some kind of fallback where they use glyphs from other fonts to render missing characters. The book generation toolchain does not.

The Google Group for Leanpub was somewhat helpful: if I could recommend an Open Source monospace font that fit my needs, they'd likely include it in their toolchain.

So I searched and searched, learning more about fonts than I wanted to know. My circle of geek friends came up with several suggestions, one of them being Iosevka, which actually contains those characters. So now I wait for others to step up, either for LeanPub to include that font, or for the Noto maintainers to create a monospace variant of those characters (and then LeanPub updating their version of the font).

And all of that because Perl 6 was being innovative, and used two otherwise little-used characters as delimiters, in an attempt to avoid collisions between delimiters and content.

(In the meantime I've replaced the two offending characters with ones that look similar. It means the example output is technically incorrect, but at least it's readable.)

Perlgeek.de: Perl 6 Books Landscape in June 2017

Published by Moritz Lenz on 2017-06-07T22:00:01

There are lots of news around Perl 6 books to share these days. If you follow the community very closely, you might be aware of most of it. If not, read on :-).

Think Perl 6 is now available for purchase, and also for download as a free ebook. Heck, it's even Open Source, with the LaTeX sources on GitHub!

Perl 6 at a Glance, previously only available in print form, is now available as an ebook. Save paper and shipping costs!

My own book, Perl 6 Fundamentals, is now in the "production" phase: copyediting, indexing, layout. And just before the manuscript submission deadline, Larry Wall has contributed a foreword. How awesome is that?

I've revamped perl6book.com to provide a short overview of the current and future Perl 6 books. As a small gimmick, it contains a flow chart explaining which book to choose. And I even got input from two other Perl 6 book authors (Laurent Rosenfeld of "Think Perl 6", and Andrew Shitov of "Perl 6 at a Glance" and "Migrating to Perl 6").

From a pull request to perl6book.com, it looks like Andrew Shitov is working on two more Perl 6 books. Keep 'em coming!

Last but not least, Gabor Szabo has started a crowd funding campaign for a Perl 6 book on web app development. There are still a few days left, so you can help it succeed!

And as always, if you want to keep informed about Perl 6 books, you can sign up at perl6book.com for my Perl 6 books mailing list (low volume, typically less than one email per month).

samcv: Grant Status Update 1

Published on 2017-06-02T07:00:00

This is my first grant progress report for my Perl Foundation grant entitled "Improving the Robustness of Unicode Support in Rakudo on MoarVM".

I was not able to work quite as many hours as I would have liked this month, but I still made quite a lot of progress.

Improvement for Tests

Merged In

In Roast there is a new version of GraphemeBreakTest.t.

The script tests the contents of each grapheme individually, using the GraphemeBreakTest.txt file from the Unicode 9.0 test suite.

Previously we only checked the total number of .chars for the string as a whole. Obviously we want something more precise than that, since the test data specifies the location of each of the breaks between codepoints. The new code checks that the codepoints end up in the correct graphemes, in the proper order, and it also checks the string length.

This new test uses a grammar to parse the file and generally is much more robust than the previous script.

Running the parse generates an array of arrays: the index in the outer array identifies the grapheme, while each inner array lists the codepoints that should be in that grapheme.

[[10084, 776], [9757]]

The array above indicates that the 1st grapheme is made up of codepoints 10084 and 776, while the 2nd grapheme is made up of codepoint 9757. This allows us to easily test the contents of each grapheme.

The array shown above corresponds to the following line from the Unicode data file:

÷ 2764 × 0308 ÷ 261D ÷ where ÷ means break and × means no-break
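
As a rough sketch (not the actual Roast code), this is the kind of check that can be run against such an array of arrays:

use Test;

my @expected = [10084, 776], [9757];          # ❤ + COMBINING DIAERESIS, then ☝
my $string   = @expected.map({ .map(&chr).join }).join;

is $string.chars, @expected.elems, 'right number of graphemes';
for $string.comb Z @expected -> ($grapheme, @codepoints) {
    is $grapheme.ords.join(' '), @codepoints.join(' '),
        'codepoints land in the correct grapheme, in order';
}
done-testing;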

Work in Progress

I have some tests which are not yet merged and will have to wait, although sections of them are complete and are being incorporated into the larger Unicode Database Retrofit, which reuses this code.

I have written grammars and modules to process and provide data on the PropertyValueAliases and PropertyAliases files. They will be used to test that all of the canonical property names and all of the property values properly resolve to separate property codes, and that they are usable in regexes.
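
As a hedged sketch of the general idea (the grammar name and structure here are my own, not the actual grant code), a PropertyValueAliases.txt-style line such as "sc ; Grek ; Greek" can be parsed along these lines:

grammar PropAliases {
    token TOP     { <line>* }
    token line    { ^^ [ <record> | <comment> | \h* ] \n }
    token comment { '#' \N* }
    token record  { <field>+ % [ \h* ';' \h* ] \h* <comment>? }
    token field   { <[ A..Z a..z 0..9 _ . \- ]>+ }
}

my $sample = q:to/END/;
# Property value aliases (excerpt)
gc ; Lu ; Uppercase_Letter
sc ; Grek ; Greek
END

for PropAliases.parse($sample)<line>.grep({ .<record> }) -> $line {
    say $line<record><field>.map(*.Str).join(' | ');   # gc | Lu | Uppercase_Letter, etc.
}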

Work on the Unicode Database Retrofit

As part of my grant work I am working on making Unicode property values distinct per property, and also on allowing all canonical Unicode property values to work. For a background on this see my previous post about Unicode Property Names. The WIP generated code can be seen in this gist here and was generated from UCD-gen.p6. The code resolves property name and property value command line arguments and matches them with property codes and property value codes. It is case insensitive and ignores underscores, as the Unicode spec says is permissible. The data is also deduplicated, meaning we only store one hash per unique set of property values.

For example: Script and Script_Extensions both have the same values, so we don't store these more than once; likewise for the Boolean property values. The C program resolves the property string to a unique property code, and from there is able to look up the property value code. Note: aside from the property values which specify the lack of a property, these codes are internal and have no relation to the Unicode spec, for example Grapheme_Cluster_Break=Other is designated as property value 0.
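
As a toy Perl 6 illustration of that loose matching (this is not the generated C code, loose is a hypothetical helper, and only the Other => 0 mapping is taken from above):

sub loose(Str $name) { $name.lc.subst('_', '', :g) }

say loose('Grapheme_Cluster_Break') eq loose('graphemeclusterbreak');   # True
my %gcb-value-code = loose('Other') => 0;
say %gcb-value-code{ loose('OTHER') };                                  # 0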

Docs

I've also started adding some documentation to my Unicode-Grant wiki with information about what is enclosed in each Unicode data file; there are a few other pages as well. This wiki is planned to be expanded to have many more sections than it does currently. https://github.com/samcv/Unicode-Grant/wiki/All-Unicode-Files

Future Work

Next I must integrate the property name/value alias resolving code with UCD-gen.p6. UCD-gen.p6 already has a mostly functional Unicode database with a fair number of properties. When these two are integrated, the next step will be to start integrating it with the MoarVM codebase, making any changes to MoarVM or the database retrofit codebase as needed along the way.

I will also be exploring ways of compressing the mapping of codepoints to unique combinations of Unicode property data in the bitfield. Due to the vast number of codepoints within Unicode, currently the mapping of codepoints to rows in the bitfield takes up many times more space than the actual property value data itself.

For compressing the Unicode names, it is planned to use base 40 encoding with some additional tricks to save additional space for repeated words. I plan on making a blog post where I go into the details of the compression scheme.
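
The core trick, sketched here in Perl 6 rather than the eventual implementation (my own illustration; the real alphabet and packing details may well differ), is that with a 40-symbol alphabet three symbols fit into one 16-bit unit, since 40 ** 3 == 64000 < 2 ** 16:

my @alphabet = flat ' ', 'A'..'Z', '0'..'9', '-', '*', '#';   # hypothetical 40-symbol set
my %index = @alphabet.antipairs;

sub encode-base40(Str $name) {
    my @codes = $name.uc.comb.map({ %index{$_} // 0 });
    @codes.push(0) until @codes %% 3;                          # pad to a multiple of three
    @codes.rotor(3).map(-> ($a, $b, $c) { $a * 1600 + $b * 40 + $c });
}

say encode-base40('LATIN SMALL LETTER A').elems;               # 7 16-bit units for 20 characters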

I am also considering rolling the ignorecase/ignoremark bug into my grant. Even though it was not originally planned to be part of the grant, I think it is important enough to warrant inclusion. Currently, using a regex with both ignorecase and ignoremark together is completely broken.

Note

The work described above has been committed to the two repositories listed below (in addition to the test work, which was merged into Roast).

https://github.com/samcv/UCD

https://github.com/samcv/Unicode-Grant