pl6anet

Perl 6 RSS Feeds

Steve Mynott (Freenode: stmuk) steve.mynott (at)gmail.com / 2017-05-29T18:19:15


Perl 6 Maven: Perl 6 Interactive Shell - values in $ variables

Published by szabgab

brrt to the future: Function call return values

Published by Bart Wiegmans on 2017-05-25T19:02:00

Hi there, it's been about a month since last I wrote on the progress of the even-moar-jit branch, so it is probably time for another update.

Already two months ago I wrote about adding support for function calls in the expression JIT compiler. This was a major milestone, as calling C functions is essential for almost everything that is not pure numerical computing. Now we can also use the return values of function calls (picture that!). The main issue with this was something I've come to call the 'garbage restore' problem, by which I mean that the register allocator would attempt to 'restore' an earlier, possibly undefined, version of a value over a value that would result from a function call.

This has everything to do with the spill strategy used by the compiler. When a value has to be stored to memory (spilled) in order to avoid being overwritten and lost, there are a number of things that can be done. The default, safest strategy is to store a value to memory after every instruction that computes it and to load it from memory before every instruction that uses it. I'll call this a full spill. It is safe because it effectively makes the memory location the only 'true' storage location, with the registers being merely temporary caches. It can also be somewhat inefficient, especially if the code path that forces the spill is conditional and rarely taken. In MoarVM, this happens (for instance) around memory barriers, which are only necessary when creating cross-generation object references.

That's why around function calls the JIT uses another strategy, which I will call a point spill. What I mean by that is that the (live) values which could be overwritten by the function call are spilled to memory just before the function call, and loaded back into their original registers directly after. This is mostly safe, since under normal control flow, the code beyond the function call point will be able to continue as if nothing had changed. (A variant which is not at all safe is to store the values to memory at the point, and load them from memory in all subsequent code, because it isn't guaranteed that the original spill-point-code is reached in practice, meaning that you overwrite a good value with garbage. The original register allocator for the new JIT suffered from this problem).
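To make the point-spill idea concrete, here is a toy model in Python (an illustration only, not MoarVM code; the instruction strings and the `point_spill` helper are made up for this sketch). Values live across a call get a store just before it and a matching load directly after it, so under normal control flow the code past the call sees its registers unchanged:

```python
# Illustrative model (not MoarVM code): inserting a "point spill" around a call.
# Given the values live across the call, emit stores just before the call and
# matching loads directly after, so registers survive the clobbering call.

def point_spill(instructions, call_index, live_values):
    """Return a new instruction list with save/restore pairs around the call.

    instructions: list of strings (toy IR)
    call_index:   index of the call instruction
    live_values:  values whose registers the call may clobber
    """
    saves    = [f"store {v} -> [spill:{v}]" for v in live_values]
    restores = [f"load  {v} <- [spill:{v}]" for v in live_values]
    return (instructions[:call_index]
            + saves
            + [instructions[call_index]]
            + restores
            + instructions[call_index + 1:])

code = ["r1 = add r2, r3", "call compute_it", "r4 = mul r1, r1"]
print("\n".join(point_spill(code, 1, ["r1"])))
```

The unsafe variant described above would instead rewrite all later uses of the value to load from memory, which goes wrong when the spill point itself is never reached.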

It is only safe, though, if the value that is to be spilled-and-restored is both valid (defined in a code path that always precedes the spill) and required (the value is actually used in code paths that follow the restore). This is not the case, for instance, when a value is the result of a conditional function call, as in the following piece of code:

1:  my $x = $y + $z;
2:  if ($y < 0) {
3:      $x = compute-it($x, $y, $z);
4:  }
5:  say "\$x = $x";

In this code, the value in $x is defined first by the addition operation and then, optionally, by the function call to compute-it. The last use of $x is in the string interpolation on line 5. Thus, according to the compiler, $x holds a 'live' value at the site of the function call on line 3, and so to prevent it from being overwritten, it must be spilled to memory and restored. But in fact, loading $x from memory after compute-it would directly overwrite the new value with the old one.

The problem here appears to be that when the JIT decides to 'save' the value of $x around the function call, it does not take into account that - in this code path - the last use of the old value of $x is in fact when it is placed on the parameter list of the compute-it call. From the perspective of the conditional branch, it is only the new value of $x which is used on line 5. Between the use on the parameter list and the assignment from the return value, the value of $x is not 'live' at all. This is called a 'live range hole'. The goal, then, is to find these holes and to make sure a value is not treated as live when it is in fact not.
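To make the hole visible, here is a toy liveness sketch in Python (an illustration of the concept only, not the MoarVM implementation; the position numbers are hypothetical instruction indices). A value is live between each definition and the uses that definition reaches; a redefinition before the next use leaves a gap:

```python
# Toy liveness sketch: a value is live from a definition up to each use that
# definition reaches. If the value is redefined before its next use, the gap
# between the last use of the old value and the redefinition is a
# "live range hole".

def live_positions(defs, uses):
    """defs/uses: sorted position lists where the value is (re)defined/used.
    Returns the set of positions at which the value is live."""
    live = set()
    for u in uses:
        # The definition that reaches this use is the latest one at or before it.
        d = max(p for p in defs if p <= u)
        live.update(range(d, u + 1))
    return live

# Model of $x: defined at 10 (the addition), used at 30 (compute-it's parameter
# list), redefined at 35 (the call's return value), used at 50 (say "...").
live = live_positions(defs=[10, 35], uses=[30, 50])
print(32 in live)   # → False: the hole between the argument use and the return
```

A spill-and-restore placed inside that hole (here, around position 32) saves and restores a value nobody will ever read again, and worse, the restore clobbers the new definition.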

I used an algorithm from a paper by Wimmer and Franz (2010) to find the holes. However, this algorithm relies on having the control flow structure of the program available, which usually requires a separate analysis step. In my case that was fortunately not necessary, since this control flow structure is in fact generated by an earlier step in the JIT compilation process, and all that was necessary was to record it. The algorithm itself is really simple and relies on a few straightforward ideas.

I think it goes beyond the scope of this blog post to explain how it works in full, but it is really not very complicated and works very well. At any rate, it was sufficient to prevent the JIT from overwriting good values with bad ones, and allowed me to finally enable functions that return values, which is otherwise really simple.

When that was done, I obviously tried to use it and immediately ran into some bugs. To fix that, I've improved the jit-bisect.pl script, which wasn't very robust before. The jit-bisect.pl script uses two environment variables, MVM_JIT_EXPR_LAST_FRAME and MVM_JIT_EXPR_LAST_BB, to automatically find the code sequence where the expression compiler fails and compiles wrong code. (These variables tell the JIT compiler to stop running the expression compiler after a certain number of frames and basic blocks. If we know that the program fails with N blocks compiled, we can use binary search between 0 and N to find out which frame is broken). The jit-dump.pl script then provides disassembled bytecode dumps that can be compared and with that, it is usually relatively easy to find out where the JIT compiler bug is.
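The binary-search idea behind jit-bisect.pl can be sketched as follows (in Python, for illustration only; the real tool is a Perl script, and `run_with_limit` is a hypothetical callback standing in for running the program with MVM_JIT_EXPR_LAST_BB set to a given limit):

```python
# Sketch of the jit-bisect idea (illustration only). run_with_limit(n) is a
# hypothetical callback that runs the program with the expression compiler
# stopped after n basic blocks, and returns True if the program still works.

def bisect_failing_block(run_with_limit, n_blocks):
    """Binary search for the first block count at which the program breaks.
    Assumes run_with_limit(0) passes and run_with_limit(n_blocks) fails."""
    lo, hi = 0, n_blocks          # invariant: lo passes, hi fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if run_with_limit(mid):
            lo = mid              # still works with `mid` blocks compiled
        else:
            hi = mid              # already broken: culprit is at or before mid
    return hi                     # the block whose compilation breaks things

# Toy example: everything from block 43 onward is miscompiled.
print(bisect_failing_block(lambda n: n < 43, 100))   # → 43
```

With log2(N) runs instead of N, even a large program narrows down to a single miscompiled frame or basic block quickly.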

With that in hand I've spent my time mostly fixing existing bugs in the JIT compiler. I am now at a stage in which I feel like most of the core functionality is in place, and what is left is about creating extension points and fixing bugs. More on that, however, in my next post. See you then!

Death by Perl6: Perl Toolchain Summit 2017 - CPAN and Perl6

Published by Nick Logan on 2017-05-25T05:41:53

At the 2017 Perl Toolchain Summit (PTS) a lot of stuff got done. This is a brief demonstration-style summary of the resulting CPAN-related feature enhancements to zef.

First I should mention that Perl 6 distributions can now be uploaded to CPAN (without needing to add a special Perl6/ folder), and will have their source-url automatically set or replaced with the appropriate CPAN URL. Additionally, App::Mi6 now has mi6 dist and mi6 upload to make the process even simpler.

Now let's get started by making sure we are using a version with the features developed at PTS:

$ zef install "zef:ver(v0.1.15+)"
All candidates are currently installed  
No reason to proceed. Use --force to continue anyway  

Perl6 distributions uploaded to CPAN are now indexed. Currently the index is generated by https://github.com/ugexe/Perl6-App--ecogen and stored at https://github.com/ugexe/Perl6-ecosystems alongside a mirror of the existing p6c ecosystem. It is also enabled by default now:

$ zef list --max=10
===> Found via Zef::Repository::Ecosystems<cpan>
Inline:ver('1.2.1')  
Inline:ver('1.2')  
Inline:ver('1')  
IO::Glob:ver('0.1'):auth('github:zostay')  
Text::CSV:ver('0.007'):auth('github:Tux')  
Text::CSV:ver('0.008'):auth('github:Tux')  
Data::Selector:ver('1.01')  
Data::Selector:ver('1.02')  
NativeCall:ver('1')  
CompUnit::Repository::Mask:ver('0.0.1')  
Inline::Perl5:ver('0.26'):auth('github:niner')

$ zef info CompUnit::Repository::Mask
- Info for: CompUnit::Repository::Mask
- Identity: CompUnit::Repository::Mask:ver('0.0.1')
- Recommended By: Zef::Repository::Ecosystems<cpan>
Description:     hide installed modules for testing.  
License:     Artistic-2.0  
Source-url:     http://www.cpan.org/authors/id/N/NI/NINE/Perl6/CompUnit-Repository-Mask-0.0.1.tar.gz  
Provides: 1 modules  
Depends: 0 items

A distribution can exist in multiple "ecosystems":

$ zef search Inline::Perl5
===> Found 3 results
-----------------------------------------------------------------------------------------------------------------------
ID|From                             |Package                                       |Description  
-----------------------------------------------------------------------------------------------------------------------
1 |Zef::Repository::LocalCache      |Inline::Perl5:ver('0.26'):auth('github:niner')|Use Perl 5 code in a Perl 6 program  
2 |Zef::Repository::Ecosystems<cpan>|Inline::Perl5:ver('0.26'):auth('github:niner')|Use Perl 5 code in a Perl 6 program  
3 |Zef::Repository::Ecosystems<p6c> |Inline::Perl5:ver('0.26'):auth('github:niner')|Use Perl 5 code in a Perl 6 program  
-----------------------------------------------------------------------------------------------------------------------

Dependencies can be resolved by any and all of the available ecosystems, so distributions can be put on CPAN and still have their dependencies that aren't on CPAN resolved:

$ zef -v install Inline::Perl5
===> Searching for: Inline::Perl5
===> Found: Inline::Perl5:ver('0.26'):auth('github:niner') [via Zef::Repository::Ecosystems<cpan>]
===> Searching for missing dependencies: LibraryMake, File::Temp
===> Found dependencies: File::Temp [via Zef::Repository::Ecosystems<p6c>]
===> Found dependencies: LibraryMake:ver('1.0.0'):auth('github:retupmoca') [via Zef::Repository::LocalCache]
===> Searching for missing dependencies: Shell::Command, File::Directory::Tree
===> Found dependencies: Shell::Command, File::Directory::Tree:auth('labster') [via Zef::Repository::Ecosystems<p6c>]
===> Searching for missing dependencies: File::Which, File::Find
===> Found dependencies: File::Find:ver('0.1'), File::Which [via Zef::Repository::Ecosystems<p6c>]

...<more output>...

In addition to CPAN we have access to CPAN Testers. Garu worked with me to create a Perl 6 CPAN Testers report module: Zef::CPANReporter

$ zef install Zef::CPANReporter
===> Searching for: Zef::CPANReporter
===> Searching for missing dependencies: Net::HTTP
===> Testing: Net::HTTP:ver('0.0.1'):auth('github:ugexe')
===> Testing [OK] for Net::HTTP:ver('0.0.1'):auth('github:ugexe')
===> Testing: Zef::CPANReporter:ver('0.0.1'):auth('github:garu')
===> Testing [OK] for Zef::CPANReporter:ver('0.0.1'):auth('github:garu')
===> Installing: Net::HTTP:ver('0.0.1'):auth('github:ugexe')
===> Installing: Zef::CPANReporter:ver('0.0.1'):auth('github:garu')

# ...and in use:

$ zef -v install Grammar::Debugger
===> Searching for: Grammar::Debugger
===> Found: Grammar::Debugger:ver('1.0.1'):auth('github:jnthn') [via Zef::Repository::Ecosystems<p6c>]
===> Searching for missing dependencies: Terminal::ANSIColor
===> Found dependencies: Terminal::ANSIColor:ver('0.3') [via Zef::Repository::Ecosystems<p6c>]
===> Fetching [OK]: Grammar::Debugger:ver('1.0.1'):auth('github:jnthn') to /Users/ugexe/.zef/tmp/grammar-debugger.git
===> Fetching [OK]: Terminal::ANSIColor:ver('0.3') to /Users/ugexe/.zef/tmp/Terminal-ANSIColor.git
===> Testing: Terminal::ANSIColor:ver('0.3')
t/00-load.t .. ok  
All tests successful.  
Files=1, Tests=1,  0 wallclock secs  
Result: PASS  
===> Testing [OK] for Terminal::ANSIColor:ver('0.3')
Report for Terminal::ANSIColor:ver('0.3') will be available at http://www.cpantesters.org/cpan/report/a9ed11ac-4108-11e7-9b92-c8514edb94d5  
===> Testing: Grammar::Debugger:ver('1.0.1'):auth('github:jnthn')
t/debugger.t .. ok  
t/ltm.t ....... ok  
t/tracer.t .... ok  
All tests successful.  
Files=3, Tests=3,  1 wallclock secs  
Result: PASS  
===> Testing [OK] for Grammar::Debugger:ver('1.0.1'):auth('github:jnthn')
Report for Grammar::Debugger:ver('1.0.1'):auth('github:jnthn') will be available at http://www.cpantesters.org/cpan/report/ab9c911c-4108-11e7-a777-ca182bdd3934  
===> Installing: Terminal::ANSIColor:ver('0.3')
===> Install [OK] for Terminal::ANSIColor:ver('0.3')
===> Installing: Grammar::Debugger:ver('1.0.1'):auth('github:jnthn')
===> Install [OK] for Grammar::Debugger:ver('1.0.1'):auth('github:jnthn')

This work (and my attendance) was made possible by a bunch of great Perl companies and people:

Booking.com, ActiveState, cPanel, FastMail, MaxMind, Perl Careers, MongoDB, SureVoIP, Campus Explorer, Bytemark, CAPSiDE, Charlie Gonzalez, Elastic, OpusVL, Perl Services, Procura, XS4ALL, Oetiker+Partner.

Weekly changes in and around Perl 6: 2017.21 YAP6B

Published by liztormato on 2017-05-22T22:32:09

Or in other words: there’s Yet Another Perl 6 Book in the works! Gábor Szabó has started a crowdfunding campaign to write Web Application Development in Perl 6. Subtitled: “Introduction to web application development using Perl 6. Including several sample applications.” Which would be the 6th Perl 6 book to come out this year! You can already support this effort for as little as 10 US$! I can only agree with Damian Conway in his endorsement: “Take my money, dammit! :-)”. So please support this effort in any way you can!

Unicode Property Names

Samantha McVey is soliciting comments on solving issues with the <:foo> Unicode property syntax that may have overlapping result sets. She’s working on this as part of the Improving the Robustness of Unicode Support grant. In her blog post she describes the issues at hand. Please join in if you think you can help!

Rakudo Compiler Release 2017.05

It’s getting almost as common as landing a rocket on a 10m x 10m spot after delivering cargo to the edge of space. But it should keep being mentioned nonetheless! Zoffix Znet and his trusted flock of bots released the 2017.05 Rakudo compiler last Saturday, with quite an impressive set of changes: more than 300 commits in the rakudo repository alone. If you’re a Docker user: J.J. Merelo has already provided a 2017.05 Docker image.

Camelia in the Wild

Well, not entirely in the wild, but a great sight nonetheless (courtesy Lee Johnson)! And here’s a more symbolic sighting (courtesy Nick Logan). But this one takes the cake (courtesy Zoffix Znet)!

Issues with modules of a specific author

If you are a Perl 6 module author, or if you want to find out which modules of an author currently have issues, you can now easily get a list: e.g. the list for Zoffix Znet can be seen at modules.perl6.org/todo/zoffix. Please substitute the name you want at the appropriate place. Or go to modules.perl6.org/todo if you want to see them all!

Perl 6 on Exercism

A Perl 6 track has been opened on Exercism, the place where you can level up your programming skills, especially if you are a (Perl 6) newbie! Personally, I really like the Exercism tagline:

write code like it’s prose

It shows an appreciation for the art of programming that is often missing in business.

400 Years of Perl 6 in Oslo

Damian Conway will be giving a freely-accessible presentation of his “400 Years of Perl 6” talk this Wednesday in Oslo, Norway. Highly recommended!

Core Developments

Other Blog Posts

Meanwhile on Twitter

Meanwhile on StackOverflow

Ecosystem Additions

Winding Down

Didn’t think after last week’s Perl 6 Weekly that this week’s would be quite this large again. Perhaps I should start to schedule a whole day to write the Perl 6 Weekly in the future. Let’s see. Check in again next week for more Perl 6 news!


samcv: Unicode Property Names

Published on 2017-05-21T07:00:00

Currently when you do: 'word' ~~ /<:Latin>/, MoarVM looks in a hash which contains all of the property values and looks up what property name it is associated with. So in this case it looks up Latin, and then finds it is related to the Script property.

There is a longstanding issue in MoarVM. The Unicode database of MoarVM was created with the incorrect assumption that Unicode property values were distinct. As part of my work on the Unicode Grant this is one of the issues I am tackling. So to be better informed I generated a list of all of the overlaps. I won't paste it here because it is very long, but if you want to see a full list see my post here.

There are 68 property values which belong to multiple property names and an additional 126 that are shared between Script and Block properties.

In addition, we must also make sure that we check for overlap between property names and the values themselves.

Here are all of the property values that conflict with property names:

«« IDC Conflict with property name [blk]  is a boolean property
«« VS Conflict with property name [blk]  is a boolean property
«« White_Space Conflict with property name [bc]  is a boolean property
«« Alphabetic Conflict with property name [lb]  is a boolean property
«« Hyphen Conflict with property name [lb]  is a boolean property
«« Ideographic Conflict with property name [lb]  is a boolean property
«« Lower Conflict with property name [SB]  is a boolean property
«« STerm Conflict with property name [SB]  is a boolean property
«« Upper Conflict with property name [SB]  is a boolean property

Luckily these are all Bool properties and so we don't need to worry about anything complicated there.

A fun fact: currently the only reason ' ' ~~ /<:space>/ matches is that space resolves as Line_Break=space. In fact, it should resolve as White_Space=True. Luckily the space character and a few others have Line_Break=space, though this means "\n" ~~ /<:space>/ does not work properly. I will note, though, that using <:White_Space> does work properly, as it resolves to the property name.

I would make Bool properties 0th in priority.

Then, similar to other regex engines, we will allow you to designate General_Category and Script unqualified:

<:Latin> <:L>                               # unqualified
<:Script<Latin>> <:General_Category<L>>    # qualified
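A prioritized resolution could be modelled as a simple ordered lookup; here is a hypothetical sketch in Python (the priority list and the toy value table below are examples following the post's rules, not the actual MoarVM data):

```python
# Hypothetical sketch of prioritized resolution of an unqualified property
# value. Per the post: boolean properties first (rule 0), then
# General_Category and Script; the rest of the ordering is still open.

PRIORITY = ["Bool", "General_Category", "Script", "Block"]

# Toy table mapping a value to the property names that define it.
VALUE_TO_PROPS = {
    "White_Space": ["Bool", "Line_Break"],   # boolean property wins (rule 0)
    "Latin":       ["Script", "Block"],      # Script=Latin wins over Block
    "L":           ["General_Category"],
}

def resolve(value):
    """Pick the highest-priority property name that defines `value`."""
    candidates = VALUE_TO_PROPS.get(value, [])
    for prop in PRIORITY:
        if prop in candidates:
            return prop
    return None  # unknown or unranked: would need a fallback search

print(resolve("Latin"))        # → Script
print(resolve("White_Space"))  # → Bool
```

Under such a scheme <:Latin> resolves as Script=Latin even though a Latin-1 Supplement Block value also exists, matching the behavior of other regex engines.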

I propose a hierarchy as follows

I have not yet decided whether we want to guarantee the following, but they should be a part of the internal hierarchy

We should also resolve Numeric_Type so that people can use <:Numeric> in their regexes (I'm sure there must already exist code where this is used, so we need to make sure this is resolved as well).

In actuality this resolves as Numeric_Type != None. So this is covered under rule 0.

I am open to adding whichever properties people think most important to the ordered priority list as well. Due to how things are set up in MoarVM/NQP I will need to come up with some hierarchy to resolve all the properties. In addition to this we will have a Guaranteed list, where the specs will specify that using them unqualified is guaranteed to work.

The property value names with overlap remaining after the proposed list above:

NU => ["Word_Break", "Line_Break", "Sentence_Break"],
NA => ["Age", "Hangul_Syllable_Type", "Indic_Positional_Category"],
E => ["Joining_Group", "Jamo_Short_Name"],
SP => ["Line_Break", "Sentence_Break"],
CL => ["Line_Break", "Sentence_Break"],
D => ["Jamo_Short_Name", "Joining_Type"],
Narrow => ["East_Asian_Width", "Decomposition_Type"],
NL => ["Word_Break", "Line_Break"],
Wide => ["East_Asian_Width", "Decomposition_Type"],
Hebrew_Letter => ["Word_Break", "Line_Break"],
U => ["Jamo_Short_Name", "Joining_Type"],
LE => ["Word_Break", "Sentence_Break"],
Close => ["Bidi_Paired_Bracket_Type", "Sentence_Break"],
BB => ["Jamo_Short_Name", "Line_Break"],
HL => ["Word_Break", "Line_Break"],
Maybe => ["NFKC_Quick_Check", "NFC_Quick_Check"],
FO => ["Word_Break", "Sentence_Break"],
H => ["East_Asian_Width", "Jamo_Short_Name"],
Ambiguous => ["East_Asian_Width", "Line_Break"],

Any ideas about adding further properties to the hierarchy (even ones that don't have any overlap presently [Unicode 9.0], since overlap could be introduced later) will be appreciated. Either comment on this GitHub issue or send me an email (address at the bottom of the page).

Weekly changes in and around Perl 6: 2017.20 Crossing The Alps

Published by liztormato on 2017-05-16T08:07:33

Four days of continuous hacking on the future of Perl. Whether that be Perl 5 or Perl 6. That was what the Perl Toolchain Summit was about. Thanks to the sponsors, 36 developers worked on the Perl Toolchain: from fixing bugs and security issues on PAUSE, adding features to MetaCPAN and CPAN Testers to integrating the Perl 6 Ecosystem into the Perl toolchain. Yes, you can now upload your Perl 6 module to CPAN using Shoichi Kaji‘s mi6, and install it using Nick Logan‘s zef. And that all without interfering with the indexing and testing of Perl 5 modules. We’re also very close to being able to submit results of Perl 6 module testing as CPAN Testers smoke reports!

Stefan Seifert describes it very well in his first blog post ever. For yours truly it felt as though the ideas that had circulated almost 5 years before at the Perl Reunification Summit had finally come to fruition.

By the way, there is also a complete list of results of the 2017 Perl Toolchain Summit. Currently visible changes to Rakudo Perl 6 itself are:

Speeding up Perl 6 Development

Jonathan Worthington is looking for further funding of his excellent work on Rakudo Perl 6 and the MoarVM backend. As he states:

Making MoarVM run Perl 6 faster means working on the dynamic optimizer, which needs a good deal of care to avoid doing the wrong thing really fast. And driving forward the design and implementation of Perl 6’s concurrent and parallel features also requires careful consideration.

Please check out the details of the proposal in his blogpost: if the company you’re working for wants to be serious about supporting Perl 6 development, this is an easy and low-threshold way to do it! Please have a look at his latest grant report to get an idea about the quality and scope of the work that Jonathan does!

Perl Events Photostream

At the Perl Toolchain Summit Lee Johnson started a Perl Events Photostream. It’s good to see the Camel and Camelia living together side by side!

Other Blog Posts

The End Of An Era

For over 6 years, panda has been the de-facto module installer for Rakudo Perl 6. But it seems that all good things must come to an end. In the past 2 years, panda started to fall behind the new shiny zef. At the Perl Toolchain Summit, Tadeusz Sośnierz (tadzik) marked the project, which he started in 2011 in the very early days of Rakudo Perl 6 and worked on for so long, as deprecated. We all owe a lot to tadzik and all the other contributors, such as Tobias Leich, Moritz Lenz and Stefan Seifert, for their work on panda. So I think a big Thank You! is in order.

It’s at moments like these that it is hardest to realize that many times it is about the journey, not about reaching the goal!

Other Core Developments

Meanwhile on Twitter

Meanwhile on FaceBook

Ecosystem Additions

I think this is a new record for number of new modules in a week!

Winding Down

It’s hard to wind down after such an intense week. Pretty sure there will be more exciting things to report next week. So, until then!


6guts: Looking for Perl 6, Rakudo, and MoarVM development funding

Published by jnthnwrthngtn on 2017-05-12T16:17:52

Note for regular 6guts readers: this post isn’t about Perl 6 guts themselves, but rather about seeking funding for my Perl 6 work. It is aimed at finding medium-size donations from businesses who are interested in supporting Perl 6, Rakudo, and MoarVM by funding my work on these projects.

I started contributing to the Rakudo Perl 6 compiler back in 2008, and since then have somehow managed to end up as architect for both Rakudo and MoarVM, together with playing a key design role in Perl 6’s concurrency features. Over the years, I’ve made time for Perl 6 work by:

I’m still greatly enjoying doing Perl 6 stuff and, while I’ve less free time these days for a variety of reasons, I still spend a decent chunk of that on Perl 6 things too. That’s enough for piecing together various modules I find we’re missing in the ecosystem, and for some core development work. However, the majority of Perl 6, Rakudo, and MoarVM issues that end up on my plate are both complex and time-consuming. For various areas of MoarVM, I’m the debugger of last resort. Making MoarVM run Perl 6 faster means working on the dynamic optimizer, which needs a good deal of care to avoid doing the wrong thing really fast. And driving forward the design and implementation of Perl 6’s concurrent and parallel features also requires careful consideration. Being funded through The Perl Foundation over the last couple of years has enabled me to spend quality time working on these kinds of issues (and plenty more besides).

So what’s up?

I’ve been without funding since early-mid February. Unfortunately, my need to renew my funding has come at a time when The Perl Foundation has been undergoing quite a lot of changes. I’d like to make very clear that I’m hugely supportive and thankful for all that TPF have done and are doing, both for me personally and for Perl 6 in general. Already this year, two Perl 6 grants have been made to others for important work. These were made through the normal TPF grants process. By contrast, my work has been funded through a separate Perl 6 Core Development Fund. As a separate fund, it thus needs funds raising specifically for it, and has its own operation separate from the mainstream grant process.

Between the fund being almost depleted, and various new volunteers stepping up to new roles in TPF and needing to get up to speed on quite enough besides the Perl 6 Core Development Fund, unfortunately it’s not been possible to make progress on my funding situation in the last couple of months. I’m quite sure we can get there with time – but at the same time I’m keen to get back to having more time to spend on Perl 6 again.

So, I’ve decided to try out an alternative model. If it works, I potentially get funded faster, and TPF’s energies are freed up to help others spend more time on Perl. If not, well, it’ll hopefully only cost me the time it took to write this blog post.

The Offer

I’m looking for businesses willing to help fund my Perl 6 development work. I can offer in return:

I’m setting a rate for this work of 55 EUR / hour with a minimum order of 25 hours. This need not be billed all in one go; for example, if you happened to be a company wishing to donate 1000 EUR a month to open source and wished to be invoiced that amount each month, this is entirely possible. After all, if 3-4 companies did that, we’d have me doing Perl 6 stuff for 2 full days every week.

If you’re interested in helping, please get in contact with me, either by email or on freenode (I’m jnthn there). Thank you!


Weekly changes in and around Perl 6: 2017.19 Albatross_I

Published by liztormato on 2017-05-08T21:52:00

The past week felt a bit dreary, the weather was meh, and MoarVM had some telemeh issues on ARM processors. And there were some discussions on how to treat non-Int values in situations where Int values are expected (do we floor, fail or throw?). I guess these things came to the front more because there was not a lot else going on this week.

Rakudo on Ubuntu on Windows

Claudio Ramirez tells us that the Ubuntu 16.04 Rakudo packages were made compatible with the Windows 10 Linux Subsystem (AKA bash in Windows 10). Just run /opt/rakudo/fix_windows10 after install.

telemeh

Timo Paulssen has been working on his telemeh for MoarVM project. It basically spits out little status messages at very high speed and accuracy, because instead of asking a clock of some kind, it just reads the “number of cycles elapsed since some point near the beginning of the process” register of the CPU. It’s also thread-safe, which the other profiler currently isn’t. The idea is to eventually have a nice GUI-ish frontend that lets you analyze the telemeh log at a higher level. It’s inspired by a product called “telemetry” by RAD Game Tools.

Core Developments

Blog Posts

Meanwhile on FaceBook

Meanwhile on Twitter

Meanwhile on StackOverflow

Ecosystem Additions

Winding Down

Pretty sure we will see some good Perl 6 things coming from the Perl Toolchain Summit! So check in again next week!


Weekly changes in and around Perl 6: 2017.18 Starlight, Starbright

Published by liztormato on 2017-05-01T22:38:27

Thanks to Steve Mynott, we have another Rakudo Star release: R* 2017.04 is now available for Unix, MacOS and Windows. The announcement has many “too many to list” bullet points. Which is correct, because a lot was improved in Rakudo Perl 6 and the ecosystem in the past 3 months since the last Rakudo Star release.

Perl Developer Survey

The good people of BuiltInPerl published the results of the yearly Perl Developer Survey. Alas, not a lot of useful information about the adoption of Perl 6 yet, but I have a feeling we will see that in next year’s edition!

Unicode Grant

The Unicode improvement Grant Proposal by Samantha McVey has been accepted! Looking forward to seeing more of Samantha McVey‘s excellent work!

Core Developments

Blog posts

A Flash from the Past

Ingo Blechschmidt pointed yours truly to a nice set of slides about the history of Pugs. If you’re new to Perl 6, it might give you some perspective on the shoulders of giants on which the currently most active version of Perl 6 has been built.

Meanwhile on the book front

It is now official: the Perl 6 book that Moritz Lenz is working on (previously called Perl 6 By Example), will be published by Apress as Perl 6 Fundamentals – A Primer with Examples, Projects, and Case Studies. This means we now have two mainline publishers publishing Perl 6 books (the other being O’Reilly with Think Perl 6 by Laurent Rosenfeld). Who, incidentally, had an interview about his book with brian d foy. Highly recommended!

Meanwhile on Twitter

Meanwhile on FaceBook

Meanwhile in Academia

Damian Conway is busy showing students the best Perl 6 has to offer:

Fortunately, an adapted version of the Parsing Techniques class will also be given as a tutorial after The Perl Conference US (23 June, Washington DC).

Meanwhile on StackOverflow

Ecosystem Additions

Winding Down

Phew! A lot more to write in the Weekly than I thought. Fortunately, I was only mildly distracted by some crazy rocket science in action. Check in again next week for more crazy action in the Perl 6 world!


rakudo.org: Announce: Rakudo Star Release 2017.04

Published by Steve Mynott on 2017-05-01T15:35:01

A useful and usable production distribution of Perl 6

On behalf of the Rakudo and Perl 6 development teams, I’m pleased to announce the April 2017 release of “Rakudo Star”, a useful and usable production distribution of Perl 6. The tarball for the April 2017 release is available from https://rakudo.perl6.org/downloads/star/.

Binaries for macOS and Windows (64 bit) are also available.

This is the seventh post-Christmas (production) release of Rakudo Star and implements Perl v6.c. It comes with support for the MoarVM backend (all module tests pass on supported platforms).

This release includes “zef” as the module installer. “panda” is shortly to be replaced by “zef” and will be removed in the near future.

It’s hoped to produce quarterly Rakudo Star releases during 2017 with 2017.07 (July) and 2017.10 (October) to follow.

Please note that this release of Rakudo Star is not fully functional with the JVM backend from the Rakudo compiler. Please use the MoarVM backend only.

In the Perl 6 world, we make a distinction between the language (“Perl 6”) and specific implementations of the language such as “Rakudo Perl”.

This Star release includes [release 2017.04.3] of the Rakudo Perl 6 compiler, version 2017.04-53-g66c6dda of MoarVM, plus various modules, documentation, and other resources collected from the Perl 6 community.

The Rakudo compiler changes since the last Rakudo Star release of 2017.01 are now listed in “2017.02.md” and “2017.04.md” under the “rakudo/docs/announce” directory of the source distribution.

In particular this release featured many important improvements to the IO subsystem thanks to Zoffix and the support of the Perl Foundation.

Please see
Part 1: http://rakudo.org/2017/04/02/upgrade
Part 2: http://rakudo.org/2017/04/03/part-2
Part 3: http://rakudo.org/2017/04/17/final-notes

Note there were point releases of 2017.04 so also see “2017.04.1.md”, “2017.04.2.md” and “2017.04.3.md”.

Notable changes in modules shipped with Rakudo Star:

+ DBIish: New version with pg-consume-input
+ doc: Too many to list. Large number of “IO Grant” doc changes.
+ json_fast: Too many to list. Big performance improvements.
+ perl6-lwp-simple: Fix for lexical require and incorrect regex for absolute URL matcher
+ test-mock: Enable concurrent use of mock objects
+ uri: Encoding fixes
+ zef: Too many to list. IO fixage.

There are some key features of Perl 6 that Rakudo Star does not yet handle appropriately, although they will appear in upcoming releases. Some of the not-quite-there features include:

+ advanced macros
+ non-blocking I/O (in progress)
+ some bits of Synopsis 9 and 11
There is an online resource at http://perl6.org/compilers/features that lists the known implemented and missing features of Rakudo’s backends and other Perl 6 implementations.

In many places we’ve tried to make Rakudo smart enough to inform the programmer that a given feature isn’t implemented, but there are many that we’ve missed. Bug reports about missing and broken features are welcomed at rakudobug@perl.org.

See https://perl6.org/ for links to much more information about Perl 6, including documentation, example code, tutorials, presentations, reference materials, design documents, and other supporting resources. Some Perl 6 tutorials are available under the “docs” directory in the release tarball.

The development team thanks all of the contributors and sponsors for making Rakudo Star possible. If you would like to contribute, see http://rakudo.org/how-to-help, ask on the perl6-compiler@perl.org mailing list, or join us on IRC #perl6 on freenode.

brrt to the future: Letting templates do what you mean

Published by Bart Wiegmans on 2017-04-30T22:12:00

Hi everybody, today I'd like to promote a minor, but important improvement in the 'expression template compiler' for the new JIT backend. This is a tool designed to make it easy to develop expression templates, which are themselves a way to make it easy to generate the 'expression tree' intermediate representation used by the new JIT backend. This is important because MoarVM instructions operate on a perl-like level of abstraction - single instructions can perform operations such as 'convert object to string', 'find first matching character in string' or 'access the last element of an array'. Such operations require rather more instructions to represent as machine code.

This level of abstraction is rather convenient for the rakudo compiler, which doesn't have to consider low-level details when it processes your perl6 code. But it is not very convenient for the JIT compiler which does. The 'expression' intermediate representation is designed to be much closer to what hardware can support directly. Basic operations include loading from and storing to memory, memory address computation, integer arithmetic, (conditional) branching, and function calls. At some point in the future, floating point operations will also be added. But because of this difference in abstraction level, a single MoarVM instruction will often map to many expression tree nodes. So what is needed is an efficient way to convert between the two representations, and that is what expression templates are supposed to do.

Expression templates are very much like the expression tree structure itself, in that both are represented as arrays of integers. Some of the elements represent instructions, some are constants, and some are references (indexes into the same array), forming a directed acyclic graph (not a tree). The only difference is that the template is associated with a set of instructions that indicate how it should be linked into the tree. (Instruction operands, i.e. the data that each instruction operates on, are prepared and linked by the template application process as well).

Surprisingly, arrays of integers aren't a very user-friendly way to write instruction templates, and so the template compiler was born. It takes as input a text file with expression templates defined as symbolic expressions, best known from the LISP world, and outputs a header file that contains the templates, ready for use by the JIT compiler. Note that the word 'template' has become a bit overloaded, referring to the textual input of the template compiler as well as to the binary input to the JIT compiler. That's okay, I guess, since they're really two representations of the same thing. The following table shows how template text, binary, and expression tree relate to each other:

Text:

(template: unless_i
    (when
        (zr $0)
        (branch (label $1))
    ))

'Binary':

template: {
    MVM_JIT_ZR,
    0,
    MVM_JIT_LABEL,
    1,
    MVM_JIT_BRANCH,
    2,
    MVM_JIT_WHEN,
    0,
    4,
},
info: ".f.f.l.ll",
len: 9,
root: 6

(The tree diagram from the third column of the original table is not reproduced here.)

I hope it isn't too hard to see how one maps to the other. The unless_i instruction executes a branch if its integer argument is zero, specified by a constant as its second argument. All symbols (like when, label and zr) have been replaced by uppercase prefixed constants (MVM_JIT_WHEN), and all nesting has been replaced by references (indexes) into the template array. The 'info' string specifies how the template is to be linked into the tree. Instruction operands are indicated by an 'f', and internal links by an 'l'. In the tree representation the operands have been linked into the tree by the JIT; they form the LOAD and CONST nodes and everything below them.

Anyway, my improvement concerns a more complex form of template, such as the following example, an instruction to load an object value from the instance field of an object:

(template: sp_p6oget_o
(let: (($val (load (add (^p6obody $1) $2) ptr_sz)))
(if (nz $val) $val (^vmnull))))

This template contains a let: expression, which declares the $val variable. This value can be used in the subsequent expression by its name. Without such declarations the result of a computation could only have one reference, its immediate syntactic parent. (Or in other words, without let:, every template can only construct a tree). That is very inconvenient in case a result should be checked for null-ness, as in this case. (vmnull is a macro for the global 'null object' in MoarVM. The null object represents NULL wherever an object is needed, but isn't actually NULL, as that would mean it couldn't be dereferenced; it saves the interpreter from checking if a pointer to an object is NULL everywhere it is accessed).
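In C terms, the sp_p6oget_o template computes something like the following sketch. The struct layout, field names, and the VMNULL stand-in are illustrative assumptions, not MoarVM's actual definitions:

```c
#include <stddef.h>

/* Hypothetical stand-ins for MoarVM internals; real layouts differ. */
typedef struct Obj Obj;
struct Obj { char *body; };

static Obj vmnull_obj;        /* the global "null object" */
#define VMNULL (&vmnull_obj)

/* What the template computes: load a pointer from (body + offset); if it
   is NULL, substitute the null object so callers never see a real NULL. */
Obj *p6oget_o(Obj *obj, size_t offset) {
    Obj *val = *(Obj **)(obj->body + offset);
    return val != NULL ? val : VMNULL;
}
```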

The let: construct has another purpose: it ensures the ordering of operations. Although most operations can be ordered in whatever way suits the compiler, some do not, most notably function calls. (Function calls may have numerous unpredictable side effects, after all). All statements declared in the 'let declaration body' are compiled to run before any statements in the 'expression body'. This enables the programmer to ensure that a value is not needlessly computed twice, and more importantly, it ensures that a value that is used in multiple branches of a conditional statement is defined in both of them. For instance:


(let (($foo (...)))
(if (...)
(load $foo)
$foo))

This pseudo-snippet of template code would dereference $foo if some condition is met (e.g. $foo is not NULL) and returns $foo directly otherwise. Without let to order the computation of $foo prior to the blocks of if, the first (conditional) child of if would be the first reference to $foo. That would mean that the code to compute $foo is only compiled in the first conditional block, which would not be executed whenever the if condition was not true, meaning that $foo would be undefined in the alternative conditional block. This would mean chaos. So in fact let does order expressions. All is good.

Except... I haven't told you how this ordering works, which is where my change comes in. Prior to commit 7fb1b10 the let expression would insert a hint to the JIT compiler to add the declared expressions as tree roots. The 'tree roots' are where the compiler starts converting the expression tree (graph) to a linear sequence of byte code. Hence the declaring expressions are compiled prior to the dependent expressions. But this has, of course, one big disadvantage, which is that the set of roots is global for the tree. Every declaration, no matter how deep into the tree, was to be compiled prior to the head of the tree. As a result, the following template code would not at all do what you want:


(let (($foo (...)))
    (if (nz $foo)
        (let (($bar (load $foo))) # dereference $foo !
            (... $bar))
        ...))


The declaration of $bar would cause $foo to be dereferenced prior to checking whether it is non-null, causing a runtime failure. Chaos is back. Well, that's what I've changed. Fortunately, we have another ordering mechanism at our disposal, namely DO lists. These are nodes with a variable number of children that are also promised to be compiled in order. After the patch linked above, the compiler now transforms let expressions into the equivalent DO expressions. Because DO expressions can be nested safely, $bar is not computed prior to the null-check of $foo, as the programmer intended. I had originally intended to implement analysis to automatically order the expressions with regard to the conditionals, but I found that this was more complicated to implement and more surprising to the programmer. I think that in this case, relying on the programmer is the right thing.
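Conceptually, the rewrite turns a let: into a DO list: the declarations become the earlier children (wrapped in the new discard node, since all but the last child of a DO must be void), and the body becomes the last child. A rough sketch in the template syntax, not the compiler's literal output:

```lisp
(let: (($val (computation)))
    (use $val))

;; is rewritten into something like:

(do (discard (computation))   ;; declaration compiled first, in order
    (use $val))               ;; body compiled after, referencing the value
```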

One thing that I found interesting is that this reduces the number of mechanisms in the compiler. The 'root-hint' was no longer useful, and subsequently removed. At the same time, all but the last child of a DO list must be void expressions, i.e. yield no value, because DO can only return the value of its last child. Since all expressions in a let declaration must yield some value - otherwise they would be useless - they required a new operation type: discard. Thus with a new node type (extension of data range) we can remove a class of behavior.

After I had implemented this, I've started working on adding basic block analysis. That is a subject for a later post, though. Until next time!

Perlgeek.de: Perl 6 By Example: Now "Perl 6 Fundamentals"

Published by Moritz Lenz on 2017-04-30T22:00:01

This blog post is part of my ongoing project to write a book about Perl 6.

If you're interested, either in this book project or any other Perl 6 book news, please sign up for the mailing list at the bottom of the article, or here. It will be low volume (less than an email per month, on average).


After some silence during the past few weeks, I can finally share some exciting news about the book project. Apress has agreed to publish the book, both as print and electronic book.

The title is Perl 6 Fundamentals, with A Primer with Examples, Projects, and Case Studies as subtitle. The editorial process is happening right now. I've received some great feedback on my initial manuscript, so there's a lot to do for me.

Stay tuned for more updates!

Subscribe to the Perl 6 book mailing list

* indicates required

gfldex: Issue All The Things

Published by gfldex on 2017-04-30T18:07:12

While on her epic quest to clean up the meta part of the ecosystem, samcv sent me a few pull requests. That raised the question of which of my modules have open issues. Github is quite eager to list you many things but lacks the ability to show issues for a group of repos. Once again things fell into place.

Some time ago I made a meta module to save a few clicks when testing modules once a week. Which means I already have a list of modules I care about.

perl6 -e 'use META6; use META6::bin :TERM :HELPER;\
for META6.new(file => "$*HOME/projects/perl6/gfldex-meta-zef-test/META6.json").<depends> -> $name {\
    say BOLD $name;\
}'

META6::bin didn’t know about Github issues, which was easily solved, including retries on timeouts of the Github API. Now I can feed the module names into &MAIN and get a list of issues.

perl6 -e 'use META6; use META6::bin :TERM :HELPER;\
for META6.new(file => "$*HOME/projects/perl6/gfldex-meta-zef-test/META6.json").<depends> -> $name {\
    say BOLD $name;\
    try META6::bin::MAIN(:issues, :module($name), :one-line, :url);\
}'

I swiftly went to merge the pull requests.

Test::META
[open] Add License checks and use new META license spec [10d] ⟨https://github.com/jonathanstowe/Test-META/pull/21⟩
[open] warn on source [35d] ⟨https://github.com/jonathanstowe/Test-META/issues/20⟩
[open] warn on empty description [37d] ⟨https://github.com/jonathanstowe/Test-META/issues/19⟩
[open] check if source-url is accessible [37d] ⟨https://github.com/jonathanstowe/Test-META/issues/18⟩
[open] Check `perl` version [135d] ⟨https://github.com/jonathanstowe/Test-META/issues/14⟩
[open] Report missing modules? [1y] ⟨https://github.com/jonathanstowe/Test-META/issues/8⟩
[open] Add :strict-versions switch [1y] ⟨https://github.com/jonathanstowe/Test-META/issues/7⟩
[open] Test harder that "provides" is sane [1y] ⟨https://github.com/jonathanstowe/Test-META/issues/6⟩
Typesafe::XHTML::Writer
Rakudo::Slippy::Semilist
Slippy::Semilist
Github timed out, trying again 1/3.
Github timed out, trying again 2/3.
Github timed out, giving up.
Operator::defined-alternation
Concurrent::Channelify
[open] Use SPDX identifier in license field of META6.json [3d] ⟨https://github.com/gfldex/perl6-concurrent-channelify/pull/1⟩
Concurrent::File::Find
[open] Use SPDX identifier in license field of META6.json [3d] ⟨https://github.com/gfldex/perl6-concurrent-file-find/pull/1⟩
XHTML::Writer
Github timed out, trying again 1/3.
Typesafe::HTML
Git::Config
Proc::Async::Timeout
Github timed out, trying again 1/3.
[open] Use SPDX identifier in license field of META6.json [9d] ⟨https://github.com/gfldex/perl6-proc-async-timeout/pull/1⟩

To check the issues of any project that got a META6.json run meta6 --issues. To check if there are issues for a given module in the ecosystem use meta6 --issues --module=Your::Module::Name

UPDATE:

As requested by timotimo, meta6 --issues --one-line --url --deps will list all issues of the repo and all issues of the dependencies listed in META6.json.


Weekly changes in and around Perl 6: 2017.17 Interesting Times

Published by liztormato on 2017-04-24T21:45:59

Indeed. The past week saw the Rakudo Compiler Release 2017.04 have several point updates. Zoffix Znet explains it all in The Failure Point of a Release. The good news: if you’re waiting for a Rakudo Star Release of 2017.04, a release candidate is now available for testing. So please do!

Distribution License

Samantha McVey found that a lot of distributions in the ecosystem have a poor definition of the license they are released under. So she wrote a call to action in: Camelia Wants YOU to Add License Tags to Your Module! So please do!

The Perl Conference – US

The preliminary schedule for the Perl Conference - US on 19-21 June (formerly known as YAPC::NA) is now available. Please note that Damian Conway will be giving some interesting Perl 6 related tutorials!

Core Developments

Blog Posts

Wow, what a nice bunch of blog posts!

Meanwhile on Twitter

Meanwhile on StackOverflow

Ecosystem Additions

Winding down

Yours truly missed most of the excitement the past week on account of being on the road a lot. In a way, I’m glad I did. On the other hand, feels like I should have been around. Ah well, you can’t have it all. But if you want more, please check in again next week for more Perl 6 news!


samcv: Camelia Wants YOU to Add License Tags to Your Module!

Published on 2017-04-23T07:00:00

Open to scene, year 2017: With no good guidance on the license field, the ecosystem had at least as many variations for "Artistic 2.0" license as humans had fingers. But there was a hope that robot kind and human kind could work to solve this problem, together.

Most of our ecosystem modules that have licenses are Artistic 2.0. Here are just some of the variations of the license metadata tag we had in the ecosystem for the same license, some of them ambiguous:

The list goes on. Note: the ambiguous license names above (perl and Artistic) were found on modules that were provably Artistic 2.0 as they had a LICENSE file. I make no assertion that all modules using these ambiguous names are Artistic 2.0, and the list above only refers to what was found on actual Artistic 2.0 projects in the ecosystem.

This was by no fault of the module creators, as the docs.perl6.org example didn't even show a license field at all (this has now been updated with guidance on the license field). The original META spec in S22 used to say that the field should contain a URL to the license, and even if this had been consistent between modules in the ecosystem, it still would not have been very useful for computers or people trying to quickly figure out with certainty what license a project was under, as the URLs were at many different addresses for the same licenses.

It was clear the original spec was not sufficiently useful to computers or package managers, so the spec was changed to conform more closely to other parts of META. It was decided we would use standard SPDX identifiers in the license field, which are both human and computer readable. An optional URL to the license can then go under the support key of the META (where other project URLs already go).
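Concretely, a META6.json using an SPDX identifier might look like this minimal sketch (the module name, URL, and the exact support sub-key are illustrative assumptions; check S22 for the authoritative layout):

```json
{
    "name"        : "Example::Module",
    "version"     : "0.0.1",
    "description" : "An example module",
    "license"     : "Artistic-2.0",
    "support"     : {
        "license" : "https://opensource.org/licenses/Artistic-2.0"
    }
}
```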

This new effort hopes to make sure the license fields in META are both computer and human useful to look at, so we standardized based on the SPDX identifiers which are the most widely used identifiers in the open source world.

Humans and Robots working together

We had 103 modules that had license fields, but with non-standard values. My robot agreed to make about 50 pull requests, as a show of good will towards us humans :). These automated pull requests were made only when it was fully certain which license the project used (either from the LICENSE file, or because the license field was "The Artistic 2" or something similarly unambiguous but nonstandard).

There is a full list of the modules with non-standard licenses here where we are keeping track of the progress that has been made! The list is planned to expand to also cover modules with no license field at all, but the ones with license fields were much easier for my robot friend to deal with.

If you have a module in that list, or have a module with no license field (don't feel bad, until several days ago none of my modules had license fields either), it is your job to add one!

If you have any modules which I or my robot friend didn't PR, feel free to add license fields to those as well. If you notice a module with no license field in its META file, feel free to submit a PR of your own if it has a LICENSE file showing which license it is under, or open an issue so the author can make the change themselves. If possible, add it to the list above so we can keep track of it. As mentioned before, make sure to use SPDX identifiers.

For the details of the updated META spec regarding the license field, please see S22 here.

Thank you for doing your part to help make the ecosystem a better place!

P.S. On April 19th only 13% of modules had a license field at all. Now, 4 days later we are up to 20.5%! Keep up the good work everyone!

Zoffix Znet: The Failure Point of a Release

Published on 2017-04-23T00:00:00

The April Glitches in Rakudo Releases

6guts: Massively reducing MoarVM Fixed Size Allocator contention

Published by jnthnwrthngtn on 2017-04-22T14:37:35

The latest MoarVM release, 2017.04, contains a significant improvement for multi-threaded applications, especially those that are CPU-bound. I mentioned the improvement briefly on Twitter, because it’s hard to be anything other than brief on Twitter. This post contains a decidedly less brief, though hopefully somewhat interesting, description of what I did. Oh, and also a bonus footnote in which I attempt to prove the safety (or lack of safety) of a couple of lock free algorithms, because that’s fun, right?

The Fixed Size Allocator

The most obvious way to allocate memory in a C program is through functions like malloc and calloc. Indeed, we do this plenty in MoarVM. The malloc and calloc implementations in C libraries have certainly been tuned a bunch, but at the same time they have to have good behavior for a very wide range of programs. They also need to keep track of the sizes of allocations, since a call to free does not pass the size of the memory being released. And they need to try to avoid fragmentation, which can lead to out-of-memory errors occurring because the heap ends up with lots of small gaps, but none big enough to allocate a larger object.

When we know a few more properties of the memory usage of a program, and we have information around to know the size of the memory block we are freeing, it’s possible to do a little better. MoarVM does this in multiple ways.

One of them is by using a bump-the-pointer allocator for memory that is managed by the garbage collector. These have a header that points to a type table that knows the size of the object that was allocated, meaning the size information is readily available. And the GC can move objects around in memory, since it can find all of the references to an object and update them, meaning there is a way out of the fragmentation trap too.

The call stack is another example. In the absence of closures, it is possible to allocate a block of memory and use it like a stack. When a program makes a call, the current location in the memory block is taken as the address for the call frame memory, and the location is bumped by the frame size. This could be seen as a “push”, in stack terms. Because call frames are exited in the opposite order to which they are entered, freeing them is just subtraction. This could be seen as a “pop”. Since holes are impossible, fragmentation cannot occur.
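The push/pop arithmetic described above can be sketched in C as follows (an illustration of the idea, not MoarVM's actual call stack code):

```c
#include <stddef.h>

/* A bump-allocated call stack: one block of memory used like a stack. */
typedef struct {
    char  *memory;    /* the block */
    size_t position;  /* current top of the stack */
} FrameStack;

/* "push": the current location becomes the frame's memory, then bump */
void *frame_alloc(FrameStack *s, size_t frame_size) {
    void *frame = s->memory + s->position;
    s->position += frame_size;
    return frame;
}

/* "pop": frames are exited in reverse order, so freeing is just subtraction */
void frame_free(FrameStack *s, size_t frame_size) {
    s->position -= frame_size;
}
```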

A third case is covered by the fixed size allocator. This is the most difficult of the three. It tries to do a better job than malloc and friends in the case that, at the point when memory is freed, we readily know the size of the memory. This allows it to create regions of memory that consist of N blocks of a fixed size, and allocate the memory out of those regions (which it calls “pages”). When a memory request comes in, the allocator first checks if it’s within the size range that the fixed size allocator is willing to handle. If it isn’t, it’s passed along to malloc. Otherwise, the size is rounded up to the nearest “bin size” (which are 8 bytes, 16 bytes, 24 bytes, and so forth). A given bin consists of a free list of available memory blocks, together with the pages that fresh blocks are carved from.

If the free list contains any entries, then one of them will be taken. If not, then the pages will be considered. If the current page is not full, then the allocation will be made from it. Otherwise, another page will be allocated. When memory is freed, it is always added to the free list of the appropriate bin. Therefore, a longer-running program, in steady state, will typically end up getting all of its allocations from the free list.
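The size check and bin rounding might look like this in C (the threshold here is an illustrative assumption, but the bin sizes follow the 8, 16, 24, ... progression described above):

```c
#include <stddef.h>

#define BIN_WIDTH    8
#define MAX_BIN_SIZE 256   /* larger requests are passed along to malloc */

/* Returns the bin size a request is rounded up to, or 0 for "use malloc". */
size_t bin_size_for(size_t request) {
    if (request > MAX_BIN_SIZE)
        return 0;
    return (request + BIN_WIDTH - 1) / BIN_WIDTH * BIN_WIDTH;
}
```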

Enter threads

Building a fixed size allocator for a single-threaded environment isn’t all that difficult. But what happens when it needs to cope with being used in a multi-threaded program? Well…it’s complicated. Clearly, it is not possible to have a single global fixed size allocator and have all of the threads just use it without any kind of concurrency control. Taking an item off the freelist is a multi-step process, and allocating from a page – or allocating a new page – is even more steps. Without concurrency control, there will be data races all over, and we’ll be handed a SIGSEGV in record time.

It’s worth stopping to consider what would happen if we were to give every thread its own fixed size allocator. This turns out to get messy fast, as memory allocated on one thread may be freed by another. A seemingly simple scheme is to say that the freed memory is simply appended to the freelist of the freeing thread’s fixed size allocator. Unfortunately, this has two bad properties.

  1. When the thread ends, we can’t just throw away the pages – because bits of them may still be in use by other threads, or referenced in the free lists of other threads. So they’d need to be somehow “re-homed”, which is going to need some kind of coordination. Further measures may be needed to mitigate memory fragmentation in programs that spawn and join many threads during their lifetimes.
  2. Imagine a producer/consumer setup, where one thread does allocations and passes the allocated memory to another thread, which processes the data in the memory and frees it. The producing thread will build up a lot of pages to allocate out of. The consuming thread will build up an ever longer free list. Memory runs out. D’oh.

So, MoarVM went with a single global fixed size allocator. Of course, this has the drawback of needing concurrency control.

Concurrency control

The easiest possible form of concurrency control is to have threads acquire a mutex on every allocate and free operation. This has the benefit of being very straightforward to understand and reason about. It has the disadvantage of being extremely costly. Mutex acquisition can be relatively cheap, but it gets expensive when there is high contention – that is, lots of threads trying to obtain the lock. And since all CPU-bound threads will typically allocate some working memory, particularly in a VM for a dynamic language that doesn’t yet do escape analysis, that adds up to a lot of contention.

So, MoarVM did something more sophisticated.

First, the easy part. It’s possible to append to a free list with a CPU-provided atomic operation, provided taking from the freelist is also using one. So, no mutex acquisition is required for freeing memory. However, an atomic operation still requires a kind of locking down at the CPU level. It’s cheaper than a mutex acquire/release for sure, but there will still be contention between CPU cores for the cache line holding the head of the free list.
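A lock-free prepend of this kind can be sketched with C11 atomics as follows (an illustration of the technique, not MoarVM's source):

```c
#include <stdatomic.h>
#include <stddef.h>

typedef struct FreeNode { struct FreeNode *next; } FreeNode;

/* Point the new node at the observed head, then try to swing the head to
   the new node with a single compare-and-swap, retrying if another thread
   got there first. */
void freelist_push(_Atomic(FreeNode *) *head, FreeNode *node) {
    FreeNode *current = atomic_load(head);
    do {
        node->next = current;
    } while (!atomic_compare_exchange_weak(head, &current, node));
    /* on failure, current is refreshed with the head we actually saw */
}
```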

What about allocation? It turns out that we cannot just take from a free list using an atomic operation without hitting the ABA problem (gory details in the footnote). Therefore, some kind of locking is needed to ensure an ordering on the operations. In most cases, the atomic operation will work on the first attempt (it’s competing with frees, which happen without any kind of locking, so a retry will sometimes be needed). In cases where something will complete very rapidly, a spinlock may be used in place of a full-on mutex. So, the MoarVM fixed size allocator allocation scheme boiled down to:

  1. Acquire the spin lock.
  2. Try to take from the free list in a loop, until either we succeed or the free list is seen to be empty.
  3. Release the spin lock.
  4. If we failed to obtain memory from the free list, take the slow path to get memory from a page, allocating another page if needed. This slow path does acquire a real mutex.
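Steps 1–4 above can be sketched in C11 like this (the slow path is stubbed out, and the names are illustrative, not MoarVM's):

```c
#include <stdatomic.h>
#include <stddef.h>

typedef struct FSANode { struct FSANode *next; } FSANode;

typedef struct {
    atomic_flag        spin;       /* the spinlock */
    _Atomic(FSANode *) free_list;  /* head of this bin's free list */
} SizeBin;

/* Hypothetical slow path: would allocate from a page under a real mutex. */
static void *slow_path_alloc_from_page(SizeBin *bin) {
    (void)bin;
    return NULL;  /* stubbed out in this sketch */
}

void *bin_alloc(SizeBin *bin) {
    FSANode *taken;
    while (atomic_flag_test_and_set(&bin->spin))   /* 1. acquire the spin lock */
        ;
    taken = atomic_load(&bin->free_list);
    while (taken != NULL &&                        /* 2. CAS-take in a loop;   */
           !atomic_compare_exchange_weak(&bin->free_list,  /* frees may race   */
                                         &taken, taken->next))
        ;
    atomic_flag_clear(&bin->spin);                 /* 3. release the spin lock */
    if (taken == NULL)                             /* 4. empty free list: slow */
        return slow_path_alloc_from_page(bin);     /*    path to the pages     */
    return taken;
}
```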

Contention

First up, I’ll note that the strategy outlined above does beat the “just take a mutex for every allocate/free” approach – at least, in all of the benchmarks I’ve considered. Frees end up being lock free, and most of the allocations just do a spin lock and an atomic operation.

At the same time, contention means contention, and no lock free data structure or spinlock changes that. If multiple threads are constantly scrambling to work on the same memory location – such as the head of a free list – it’s going to get expensive. How expensive? On an Intel Core i7, obtaining a cache line that is held by another core exclusively – which it typically will be under contention – costs somewhere around 70 CPU cycles. It gets worse in a multi-CPU setup, where it could easily be hundreds of CPU cycles. Note this is just for one operation; the spinlock is a further atomic operation and, of course, it uses some cycles as it spins.

But how much could this end up costing in a real world Perl 6 application? I recently had chance to find out, and the numbers were ugly. Measurements obtained by perf showed that a stunning 40% of the application’s runtime was spent inside of the fixed size allocator. (Side note: perf is a sampling profiler, which – in my handwavey understanding – snapshots the callstack at regular intervals to figure out where time is being spent. My experience has been that sampling profilers tend to be better at showing up surprising costs like this than instrumenting profilers are, even if they are in some senses less precise.)

Making things better

Clearly, there was significant room for improvement. And, happily, things are now muchly improved and my real-world program did get something close to a 40% performance boost.

To make things better, I introduced per-thread freelists, while leaving pages global and retaining global free lists also.

Memory is allocated in the first place from global pages, as before. However, when it is freed, it is instead placed on a per-thread free list (with one free list per thread per size bin). When a thread needs memory, it first checks its thread-local free list to see if there is anything there. It will only then look at the global free list, or the global pages, if the thread-local free list cannot satisfy the memory request. The upshot of this is that the vast majority of allocations and frees performed by the fixed size allocator no longer have any contention.
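The resulting lookup order, thread-local free list first and only then the global structures, can be sketched as follows (a simplification that leaves out the locking on the global path described earlier):

```c
#include <stddef.h>

typedef struct Node { struct Node *next; } Node;

/* Sketch of the allocation order only; the real global path involves the
   spinlock/CAS and page handling described earlier. */
void *alloc_for_thread(Node **thread_free, Node **global_free) {
    if (*thread_free != NULL) {     /* common case: thread-local, no contention */
        Node *taken = *thread_free;
        *thread_free = taken->next;
        return taken;
    }
    if (*global_free != NULL) {     /* fall back to the global free list */
        Node *taken = *global_free;
        *global_free = taken->next;
        return taken;
    }
    return NULL;                    /* would allocate from a global page here */
}
```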

However, as I mentioned earlier, one needs to be very careful when introducing things like thread-local freelists not to create bad behavior when a thread terminates or in producer/consumer scenarios, so care was taken to handle both of those cases.

So, I believe this improvement is good for performance without being badly behaved in any case that previously would have worked out fine.

Can we do better?

Always! While the major contention bottleneck is gone, there are further opportunities for improvement that are worth exploring in the future.

In summary…

If you have CPU-bound multi-threaded Perl 6 programs, MoarVM 2017.04 could offer a big performance improvement. For my case, it was close to 40%. And the design lesson from this: on modern hardware, contention is really costly, and using a lock free data structure or picking the “best kind of lock” will not overcome that.


Footnote on the ABA vulnerability: It’s decidedly interesting – at least to me – that prepending to a free list can be safely done with a single atomic operation, but taking from it cannot be. Here I’ll attempt to provide a proof for these claims.

We’ll consider a single free list whose head lives at memory location F, and two threads, T1 and T2. We assume the existence of an atomic operation, TRY-CAS(location, old, new), which will – in a single CPU instruction that may not be interrupted – compare the value in memory pointed to by location with old and, if they match, replace it with new. (CAS is short for Compare And Swap.) The TRY-CAS function evaluates to true if the replacement took place, and false if not. The threads may be preempted (that is, taken off the CPU) at any point in time.
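The TRY-CAS operation defined above corresponds directly to C11's compare-and-exchange (a sketch):

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* TRY-CAS(location, old, new): in one uninterruptible step, compare the
   value at location with old and, if they match, replace it with new_value.
   Evaluates to true iff the replacement took place. */
bool try_cas(_Atomic(void *) *location, void *old, void *new_value) {
    return atomic_compare_exchange_strong(location, &old, new_value);
}
```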

To show that allocation is vulnerable to the ABA problem, we just need to find an execution where it happens. First of all, we’ll define the operation ALLOCATE as:

1: do
2:     allocated = *F
3:     if allocated != NULL
4:         next = allocated.next    
5: while allocated != NULL && !TRY-CAS(F, allocated, next)
6: return allocated

And FREE(C) as:

1: do
2:     current = *F
3:     C.next = current;
4: while !TRY-CAS(F, current, C)

Let’s consider a case where we have 3 memory cells, C1, C2, and C3. The free list head F points to C1, which in turn points to C2, which in turn points to C3.

Thread T1 enters ALLOCATE, but is preempted immediately after the execution of line 4. At this point, allocated contains C1 and next contains C2.

Next, T2 calls ALLOCATE, and succeeds in making an allocation. F now points to C2. It again calls ALLOCATE, meaning that F now points to C3. It then calls FREE(C1). At this point, F points to C1 again, and C1 points to C3. Notice that at this point, cell C2 is considered to be allocated and in use.

Consider what happens if T1 is resumed. It performs TRY-CAS(F, C1, C2). This operation will succeed, because F does indeed currently point to C1. This means that F now comes to point to C2. However, we earlier stated that C2 is allocated and in use, and therefore should not be in the free list. Therefore we have demonstrated the code to be buggy, and shown how the bug arises as a result of the ABA problem.
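The interleaving above can be replayed sequentially as a sketch (class and variable names made up, following the cells from the proof): we run T1's lines 2-4, then T2's two allocations and one free, then T1's compare-and-swap, which succeeds even though it shouldn't.

```perl6
class Cell { has $.name; has Cell $.next is rw }
my $c3 = Cell.new(:name<C3>);
my $c2 = Cell.new(:name<C2>, :next($c3));
my $c1 = Cell.new(:name<C1>, :next($c2));
my $F  = $c1;                      # free list head: C1 -> C2 -> C3

# T1 runs ALLOCATE lines 2-4, then is preempted
my $allocated = $F;                # C1
my $next      = $allocated.next;   # C2

# T2: two allocations, then FREE(C1)
$F = $F.next;                      # allocate C1 -> F = C2
$F = $F.next;                      # allocate C2 -> F = C3 (C2 now in use!)
$c1.next = $F; $F = $c1;           # free C1     -> F = C1 -> C3

# T1 resumes: TRY-CAS(F, C1, C2) succeeds, since F still points at C1
$F = $next if $F === $allocated;
say $F.name;                       # C2 -- an in-use cell back on the free list!
```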

What of the claim that the FREE(C) is not vulnerable to the ABA problem? To be vulnerable to the ABA problem, another thread must be able to change the state of something that the correctness of the operation depends upon, but that is not tested by the TRY-CAS operation. Looking at FREE(C) again:

1: do
2:     current = *F
3:     C.next = current;
4: while !TRY-CAS(F, current, C)

We need to consider C and current. We can very reasonably make the assumption that the calling program is well-behaved, and will never use the cell C again after passing it to FREE(C) (unless it obtains it again in the future through another call to ALLOCATE, which cannot happen until FREE has inserted it into the free list). Therefore, C cannot be changed in any way other than the code in FREE changes it. The FREE operation holds the sole reference to C at this point.

Life is much more complicated for current. It is possible for FREE to be preempted at line 3, after which another thread allocates the cell pointed to by current and then frees it again, which is certainly a case of an ABA state change. However, unlike the situation we saw in ALLOCATE, the FREE operation does not depend on the content of current. We can see this by noticing how it never looks inside of it, and instead just holds a reference to it. An operation cannot depend upon a value it never accesses. Therefore, FREE is not vulnerable to the ABA problem.


Strangely Consistent: The root of all eval

Published by Carl Mäsak

Ah, the eval function. Loved, hated. Mostly the latter.

$ perl -E'my $program = q[say "OH HAI"]; eval $program'
OH HAI

I was a bit stunned when the eval function was renamed to EVAL in Perl 6 (back in 2013, after spec discussion here). I've never felt really comfortable with the rationale for doing so. I seem to be more or less alone in this opinion, though, which is fine.

The rationale was "the function does something really weird, so we should flag it with upper case". Like we do with BEGIN and the other phasers, for example. With BEGIN and others, the upper-casing is motivated, I agree. A phaser takes you "outside of the normal control flow". The eval function doesn't.

Other things that we upper-case are things like .WHAT, which look like attributes but are really specially code-generated at compile-time into something completely different. So even there the upper-casing is motivated because something outside of the normal is happening.

eval in the end is just another function. Yes, it's a function with potentially quite wide-ranging side effects, that's true. But a lot of fairly standard functions have wide-ranging side effects. (To name a few: shell, die, exit.) You don't see anyone clamoring to upper-case those.

I guess it could be argued that eval is very special because it hooks into the compiler and runtime in ways that normal functions don't, and maybe can't. (This is also how TimToady explained it in the commit message of the renaming commit.) But that's an argument from implementation details, which doesn't feel satisfactory. It applies with equal force to the lower-cased functions just mentioned.

To add insult to injury, the renamed EVAL is also made deliberately harder to use:

$ perl6 -e'my $program = q[say "OH HAI"]; EVAL $program'
===SORRY!=== Error while compiling -e
EVAL is a very dangerous function!!! (use the MONKEY-SEE-NO-EVAL pragma to override this error,
but only if you're VERY sure your data contains no injection attacks)
at -e:1
------> program = q[say "OH HAI"]; EVAL $program⏏<EOL>

$ perl6 -e'use MONKEY-SEE-NO-EVAL; my $program = q[say "OH HAI"]; EVAL $program'
OH HAI

Firstly, injection attacks are a real issue, and no laughing matter. We should educate each other and newcomers about them.

Secondly, that error message ("EVAL is a very dangerous function!!!") is completely over-the-top in a way that damages rather than helps. I believe when we explain the dangers of code injection to people, we need to do it calmly and matter-of-factly. Not with three exclamation marks. The error message makes sense to someone who already knows about injection attacks; it provides no hints or clues for people who are unaware of the risks.

(The Perl 6 community is not unique in eval-hysteria. Yesterday I stumbled across a StackOverflow thread about how to turn a string with a type name into the corresponding constructor in JavaScript. Some unlucky soul suggested eval, and everybody else immediately piled on to point out how irresponsible that was. Solely as a knee-jerk reaction "because eval is bad".)

Thirdly, MONKEY-SEE-NO-EVAL. Please, can we just... not. 😓 Random reference to monkeys and the weird attempt at levity while switching on a nuclear-chainsaw function aside, I find it odd that a function that enables EVAL is called something with NO-EVAL. That's not Least Surprise.

Anyway, the other day I realized how I can get around both the problem of the all-caps name and the problem of the necessary pragma:

$ perl6 -e'my &eval = &EVAL; my $program = q[say "OH HAI"]; eval $program'
OH HAI

I was so happy to realize this that I thought I'd blog about it. Apparently the very dangerous function (!!!) is fine again if we just give it back its old name. 😜

gfldex: You can call me Whatever you like

Published by gfldex on 2017-04-19T11:00:43

The docs spend many words explaining in great detail what a Whatever is and how to use it from the caller's perspective. There are quite a few ways to support Whatever as a callee, as I shall explain.

Whatever can be used to express “all of the things”. In that case we ask for the type object that is Whatever.

sub gimmi(Whatever) {};
gimmi(*);

Any expression that contains a Whatever * will be turned into a thunk. The latter happens to be a block without a local scope (kind of, it can be turned into a block when captured). We can ask specifically for a WhateverCode to accept Whatever-expressions.

sub compute-all-the-things(WhateverCode $c) { $c(42) }
say compute-all-the-things(*-1);
say (try say compute-all-the-things({$_ - 1})) // 'failed';
# OUTPUT: «41␤failed␤»
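To see the thunking itself, here is a small sketch (the variable name is made up) showing that a Whatever-expression really does become a WhateverCode:

```perl6
my &f = * + 1;     # the Whatever-expression is thunked into a code object
say &f.^name;      # WhateverCode
say f(41);         # 42
```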

We could also ask for a Block or a Method, as both come preloaded with one parameter. If we need a WhateverCode with more than one argument we have to be precise, because the compiler can’t match a Callable sub-signature with a WhateverCode.

sub picky(WhateverCode $c where .arity == 2 || fail("two stars in that expression please") ) {
    $c.(1, 2)
}
say picky(*-*);
# OUTPUT: «-1␤»
say (try picky(*-1)) // $!;
# OUTPUT: «two stars in that expression please␤  in sub picky at …»

The same works with a Callable constraint, leaving the programmer more freedom what to supply.

sub picky(&c where .arity == 2) { c(1, 2) }
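With the looser Callable constraint, both a Whatever-expression and an ordinary block are accepted. A small sketch (the sub is renamed picky2 here to avoid clashing with the earlier picky):

```perl6
sub picky2(&c where .arity == 2) { c(1, 2) }
say picky2(* - *);                  # -1
say picky2(-> $a, $b { $a - $b });  # -1, a plain pointy block works too
```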

There are quite a few things a WhateverCode can’t do.

sub faily(WhateverCode $c) { $c.(1) }
say (try faily( return * )) // $!.^name;
# OUTPUT: «X::ControlFlow::Return␤»

The compiler can take advantage of that and provide compile time errors or get things done a little bit quicker. So trading the flexibility of Callable for a stricter WhateverCode constraint may make sense.


gfldex: Dealing with Fallout

Published by gfldex on 2017-04-19T09:51:53

The much welcome and overdue sanification of the IO subsystem led to some fallout in my code, which was enjoyably easy to fix.

Some IO-operations used to return False or undefined values on errors returned from the OS. Those have been fixed to return Failure. As a result some idioms don’t work as they used to.

my $v = "some-filename.txt".IO.open.?slurp // 'sane default';

The conditional method call operator .? does not defuse a Failure, so the whole expression blows up when an error occurs. Luckily try can be used as a statement, which returns Nil on failure, so we can still use the defined-or operator // to assign default values.

my $v = (try "some-filename.txt".IO.open.slurp) // 'sane default';

The rationale for having IO operations return explosives is simple: filesystem dealings cannot be atomic (at least as seen from the runtime) and can fail unexpectedly because somebody tripped over a cable. By packaging exceptions in Failure objects, Perl 6 allows us to turn them back into undefined values as we please.
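A minimal sketch of that packaging (the sub name is made up): a Failure returned by fail stays inert until something other than a definedness test touches it.

```perl6
sub fetch-config { fail "cable tripped" }    # returns a Failure instead of throwing
my $v = fetch-config() // 'sane default';    # the definedness test defuses it
say $v;                                      # sane default
```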


rakudo.org: PART 3: Information on Changes Due to IO Grant Work

Published by Zoffix Znet on 2017-04-17T20:22:46

The IO grant work is at its wrap up. This note lists some of the last-minute changes to the plans delineated in earlier communications ([1], [2], [3]). Most of the listed items do not require any changes to users’ code.

Help and More Info

If you need help or more information, please join our IRC channel and ask there. You can also contact the person performing this work via Twitter @zoffix or by talking to user Zoffix in our dev IRC channel

gfldex: Slipping in a Config File

Published by gfldex on 2017-04-17T15:31:24

I wanted to add a config file to META6::bin without adding another dependency and without adding a grammar or other forms of fancy (and therefore time consuming) parsers. As it turns out, .split and friends are more than enough to get the job done.

# META6::bin config file

general.timeout = 60
git.timeout = 120
git.protocol = https

That’s what the file should look like; I wanted a multidim Hash at the end, to query values like %config<git><timeout>.

our sub read-cfg($path) is export(:HELPER) {
    use Slippy::Semilist;

    return unless $path.IO.e;

    my %h;
    slurp($path).lines\
        ».chomp\
        .grep(!*.starts-with('#'))\
        .grep(*.chars)\
        ».split(/\s* '=' \s*/)\
        .flat.map(-> $k, $v { %h{||$k.split('.').cache} = $v });

    %h
}

We slurp in the whole file and process it line by line. All newlines are removed and any line that starts with a # or is empty is skipped. We separate keys from values on = and use a semilist Slip to build the multidim Hash. Abusing a .map that doesn’t return values is a bit smelly but keeps all operations in order.

A semilist is the thing you can find in %hash{1;2;3} (same for arrays) to express multi-dimensionality. Just using a normal list won't cut it, because a list is a valid key for a Hash.
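For illustration, here is what the builtin semilist subscript does on its own, without the module, using keys from the config example above:

```perl6
my %config;
%config{'git'; 'timeout'}  = 120;      # semilist key: two dimensions
%config{'git'; 'protocol'} = 'https';
say %config<git><timeout>;             # 120
```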

I had Rakudo::Slippy::Semilist lying around for quite some time but never really used it much, because it cheats by using nqp ops to get some decent speed. As it turned out, it’s not really the operations on the Hash but the circumfix:<{ }> operator itself that causes a 20x speed drop. By calling .EXISTS-KEY and .BIND-KEY directly, the speed hit shrinks to 7% over an nqp implementation.

It’s one of those cases where things fall into place with Perl 6. Being able to define my own operator, in conjunction with »., allows me to keep the code flowing in the order of thought instead of breaking it up into nested loops.


Perl 6 Maven: Benchmarking crypt with SHA-512 in Perl 6

Published by szabgab

samcv: Indexing Unicode Things, Improving Case Insensitive Regex

Published on 2017-04-15T07:00:00

In the 2017.04 release of Rakudo under the MoarVM backend, there will be some substantial improvements to regex speed.

I have been meaning to make a new blog post for some time about my work on Unicode in Perl 6. This is going to be the first post of several that I have been meaning to write. As a side note, let me mention that I have a Perl Foundation grant proposal which is related to working on Unicode for Perl 6 and MoarVM.

The first of these improvements I'm going to write about is case insensitive regex, m:i/ /. MoarVM formerly lowercased the haystack and the needle whenever nqp::indexic was called; the new code uses foldcase instead of lowercasing.

It ended up 1.8-3.3x faster than before, but it began when MasterDuke submitted a Pull Request which changed the underlying MoarVM function behind nqp::indexic (index ignorecase) to use foldcase instead of lowercase.

At first this seemed like a great and easy improvement, but shortly after there were some serious problems. You see, when you foldcase a string, sometimes the number of graphemes can change. One grapheme can become up to 3 new codepoints! For example the ligature ‘ﬆ’ foldcases to ‘st’, even though lowercasing leaves it unchanged. The issue was, if the string to be searched contained any of these expanding characters, the results of the nqp::indexic op would be off by however many codepoints the expansion added!

'ﬆ'.fc.say;        # st
'ﬆ'.chars.say;     # 1
'ﬃ'.fc.say;        # ffi
'ﬃ'.chars.say;     # 1
'ﬃ'.fc.chars.say;  # 3
'ß'.fc.say;        # ss

So this was a real problem.

On the bright side of things, this allowed me to make many great changes in how case insensitive strings are searched for under Perl 6/MoarVM.

The nqp::index op does a lot of the effort when searching for a string. There was also an nqp::indexic op (index ignore case) that searched for a string while ignoring case, but it was not used everywhere: nqp::index was still used extensively, which required changing the case of strings on the Perl 6 side before calling nqp::index, and again when using nqp::indexic. In addition, MoarVM changed the case of the entire haystack and needle whenever the nqp::indexic op was used.

On the MoarVM side I first worked on getting it working with foldcase, and quickly discovered that the only sane way to do this, was to begin foldcasing operations on the haystack only from the starting point sent to the indexic function. If you foldcased them before the requested index, it would screw up the offset. My solution was to foldcase the needle, and then foldcase each grapheme down the haystack, only as far as we needed to find our match, preventing useless work foldcasing parts of the string we did not need.

The offset returned by the indexic op is always relative to the original string, not the foldcased version: the op expands characters as needed while scanning, but reports the needle's position in the original haystack. That makes the offset useful and relevant information, since with regex we are looking for a match in the haystack and must be able to return the section of the string we matched.

The end result is we now have a 1.8x to 3.3x (depending on not finding a match/finding a match at the beginning) faster case insensitive regex!

gfldex: Speeding up Travis

Published by gfldex on 2017-04-14T20:55:00

After some wiggling I managed to convince Travis to use Ubuntu packages, trimming about 4 minutes off a test run. Sadly the .debs don’t come with zef built in, which would shave off another 40 seconds.

As follows a working .travis.yml.

sudo: required
before_install:
    - wget https://github.com/nxadm/rakudo-pkg/releases/download/2017.03_02/perl6-rakudo-moarvm-ubuntu16.04_20170300-02_amd64.deb
    - sudo dpkg --install perl6-rakudo-moarvm-ubuntu16.04_20170300-02_amd64.deb
    - sudo /opt/rakudo/bin/install_zef_as_root.sh
    - export PATH=/opt/rakudo/bin:$PATH
    - sudo chown -R travis.travis /home/travis/.zef/
install:
    - zef --debug install .
script:
- zef list --installed

Using a meta package in conjunction with .debs makes it quite easy to test whether a module will work not just with bleeding-edge Rakudo but with versions users might actually have.


brrt to the future: Function Call Milestone

Published by Bart Wiegmans on 2017-03-28T16:14:00

Hi everybody. It's high time for another update, and this time I have good news. The 'expression' JIT compiler can now compile native ('C') function calls (although it's not able to use the results). This is a major milestone because function calls are hard! (At least from the perspective of a compiler, and especially from the perspective of the register allocator). Also because native function calls are really very important in MoarVM. Most of its 'primitive' operations (like hash table access, string equality, big integer arithmetic) are implemented by invoking native functions, and so to compile almost any program the JIT has to compile many function calls.

What makes function calls 'hard' is that they must implement the 'calling convention' of the relevant 'application binary interface' (ABI). In short, the ABI specifies the locations of function call parameters. A small number of parameters (on Windows, the first 4; on POSIX platforms, the first 6) are placed in registers, and any further parameters are usually placed on the stack. Aside from the calling convention, the ABI also specifies the expected alignment of the stack pointer (to 16 bytes), the registers a function may overwrite (clobber in ABI-speak), and the registers that must have their original values after the function returns. The last kind are called 'callee-saved'. Note that at least a few registers must be callee-saved, especially those related to call stack management, because if the callee function overwrote those it would be impossible to return control back to the caller. By the way, manipulating exactly those registers is how the setjmp and longjmp 'functions' work.

So the compiler is tasked with generating code that ensures the correct values are placed in the correct registers. That sounds easy enough, but what if these registers are taken by other values, and what if those other values might be required for another parameter? Indeed, what if the value in the %rdx register needs to be in the %rsi register, and the value of the %rsi register is required in the %rdx register? How to determine the correct ordering for shuffling the operands?

One simple way to deal with this would be to eject all values from registers onto the stack, and then load them back from the stack as they are needed. However, that would be very inefficient, especially since most function calls have no more than 6 (or 4) parameters and most of these parameters are computed for the function call only. So I thought that solution wouldn't do.

Another way to solve this would be if the register allocator could ensure that values are placed in their correct registers directly - especially for register parameters - i.e. by 'precoloring'. (The name comes from register allocation algorithms that work by 'graph coloring', something I will try to explain in a later post). However, that isn't an option due to my choice of 'linear scan' as the register allocation algorithm. This is a 'greedy' algorithm, meaning that it decides the allocation for a live range as soon as it encounters it, and that it cannot revert that decision once it's been made. (If it could, it would be more like a dynamic programming algorithm). So to ensure that the allocation is valid I'd have to make sure that the information about register requirements is propagated backwards from the instructions to all values that might conflict with it... at that point we're no longer talking about linear scan, and I would be better off engineering a new algorithm. Not a very attractive option either!

Instead, I thought about it and it occurred to me that this problem seems a lot like unravelling a dependency graph, with a number of restrictions. That is to say, it can be solved by a topological sort. I map the registers to a graph structure as follows: every register (or memory location) is a node, and every required transfer is an edge from the source to the destination. Because only a single value can end up in any given register, each node has at most one inbound edge, though it may have several outbound ones.

I linked to the topological sort page for an explanation of the problem, but I think my implementation is really quite different from the one presented there. They use a node visitation map and a stack; I use an edge queue and an outbound count. A register transfer (edge) can be enqueued if it is clear that the destination register is not currently used. Transfers from registers to stack locations (as function call parameters) or to local memory (to save the value from being overwritten by the called function) are also enqueued directly. As soon as the outbound count of a node reaches zero, it is considered to be 'free' and its inbound edge (if any) is enqueued.


Unlike a 'proper' dependency graph, cycles can and do occur, as in the example where '%rdx' and '%rsi' would need to swap places. Fortunately, because of the single-inbound edge rule, such cycles are 'simple' - all outbound edges not belonging to the cycle can be resolved prior to the cycle-breaking, and all remaining edges are part of the cycle. Thus, the cycle can always be broken by freeing just a single node (i.e. by copy to a temporary register).
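As an illustration only (not the MoarVM code, and in Perl 6 rather than C), the shuffle-with-cycle-breaking idea could be sketched like this, with a "do safe moves first, then break a cycle via a temporary" loop standing in for the edge queue:

```perl6
# Sketch: order a parallel register shuffle. %dest maps each source
# register to its destination; each destination has one inbound edge.
sub resolve-moves(%dest is copy) {
    my @plan;
    while %dest {
        # a move src -> dst is safe when dst is not itself a pending source
        with %dest.keys.first({ %dest{%dest{$_}}:!exists }) -> $src {
            @plan.push: "mov $src, %dest{$src}";
            %dest{$src}:delete;
        }
        else {
            # only cycles remain: free one register by copying it to a temporary
            my $dst = %dest{%dest.keys.sort[0]};
            @plan.push: "mov $dst, tmp";
            %dest<tmp> = %dest{$dst}:delete;   # the saved value moves on from tmp
        }
    }
    @plan;
}

.say for resolve-moves(%(:rdx<rsi>, :rsi<rdx>));
# mov rsi, tmp
# mov rdx, rsi
# mov tmp, rdx
```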

The only thing left to consider are the values that are used after the function call returns (survive the function call) and that are stored in registers that the called function can overwrite (which is all of them, since the register allocator never selects callee-saved registers). To make sure they are available afterwards, we must spill them. But there are a few spill strategies to choose from (terminology made up by me): a full spill stores a value to memory after every instruction that computes it and loads it before every instruction that uses it, while a spill-and-restore stores the live values just before the function call and loads them back into their registers directly after it.

The current register allocator does a full spill when it runs out of registers, and it would make some sense to apply the same logic for function-call related spills. I've decided to use spill-and-restore, however, because a full spill complicates the sorting order (a value that used to be in a register is suddenly only in memory) and it can be wasteful, especially if the call only happens in an alternative branch. This is common for instance when assigning values to object fields, as that may sometimes require a write barrier (to ensure the GC tracks all references from 'old' to 'new' objects). So I'm guessing that it's going to be better to pay the cost of spilling and restoring only in those alternative branches, and that's why I chose to use spill-and-restore.

That was it for today. Although I think being able to call functions is a major milestone, this is not the very last thing to do. We currently cannot allocate any of the registers used for floating-point calculations, which is a relatively minor limitation since those aren't used very frequently. But I also need to do some more work to actually use function return values and apply generic register requirements of tiles. But I do think the day is coming near where we can start thinking about merging the new JIT with the MoarVM master branch, making it available to everybody. Until next time!

Perl 6 Maven: Encrypting Passwords in Perl 6 using crypt and SHA-512

Published by szabgab

rakudo.org: PART 2: Upgrade Information for Changes Due to IO Grant Work

Published by Zoffix Znet on 2017-04-03T00:15:07

We’re making more changes!

Do the core developers ever sleep? Nope! We keep making Perl 6 better 24/7!

Why?

Not more than 24 hours ago, you may have read Upgrade Information for Changes Due to IO Grant Work. All of that is still happening.

However, it turned out that I, (Zoffix), had an incomplete understanding of how changes in 6.d language will play along with 6.c stuff. My original assumption was we could remove or change existing methods, but that assumption was incorrect. Pretty much the only sane way to incompatibly change a method in an object in 6.d is to add a new method with a different name.

Since I'd rather we not have, e.g., both .child and .child-but-secure for the next decade, we have a bit of an in-flight course correction:

ORIGINAL PLAN was to minimize incompatibilities with existing 6.c language code; leave everything potentially-breaking for 6.d

NEW PLAN is to right away add everything that does NOT break 6.c-errata specification, into 6.c language; leave everything else for 6.d. Note that current 6.c-errata specification for IO is sparse (the reason IO grant is running in the first place), so there’s lots of wiggle room to make most of the changes in 6.c.

When?

I (Zoffix) still hope to cram all the changes into 2017.04 release. Whether that’s overly optimistic, given the time constraints… we’ll find out on April 17th. If anything doesn’t make it into 2017.04, all of it definitely will be in 2017.05.

What?

Along with the original list in the first Upgrade Information Notice, the following changes may affect your code. I'm excluding any non-conflicting changes.

Potential changes:

Changes for 6.d language:

Help and More Info

If you need help or more information, please join our IRC channel and ask there. You can also contact the person performing this work via Twitter @zoffix or by talking to user Zoffix in our dev IRC channel

rakudo.org: Upgrade Information for Changes Due to IO Grant Work

Published by Zoffix Znet on 2017-04-02T08:31:49

As previously notified, there are changes being made to IO routines. This notice is to provide details on changes that may affect currently-existing code.

When?

Barring unforeseen delays, the work affecting version 6.c language is planned to be included in 2017.04 Rakudo Compiler release (planned for release on April 17, 2017) on which next Rakudo Star release will be based.

Some or all of the work affecting 6.d language may also be included in that release and will be available if the user uses use v6.d.PREVIEW pragma. Any 6.d work that doesn’t make it into 2017.04 release, will be included in 2017.05 release.

If you use development commits of the compiler (e.g. rakudobrew), you will
receive this work as-it-happens.

Why?

If you only used documented features, the likelihood of you needing to change any of your code is low. The 6.c language changes due to IO Grant work affect either routines that are rarely used or undocumented routines that might have been used by users assuming they were part of the language.

What?

This notice describes only changes affecting existing code and only for 6.c language. It does NOT include any non-conflicting changes or changes slated for 6.d language. If you’re interested in the full list of changes, you can find it in the IO Grant Action Plan

The changes that may affect existing code are:

Help and More Info

If you need help or more information, please join our IRC channel and ask there. You can also contact the person performing this work via Twitter @zoffix or by talking to user Zoffix in our dev IRC channel

Perlgeek.de: Perl 6 By Example: Idiomatic Use of Inline::Python

Published by Moritz Lenz on 2017-04-01T22:00:01

This blog post is part of my ongoing project to write a book about Perl 6.

If you're interested, either in this book project or any other Perl 6 book news, please sign up for the mailing list at the bottom of the article, or here. It will be low volume (less than an email per month, on average).


In the two previous installments, we've seen Python libraries being used in Perl 6 code through the Inline::Python module. Here we will explore some options to make the Perl 6 code more idiomatic and closer to the documentation of the Python modules.

Types of Python APIs

Python is an object-oriented language, so many APIs involve method calls, which Inline::Python helpfully automatically translates for us.

But the objects must come from somewhere and typically this is by calling a function that returns an object, or by instantiating a class. In Python, those two are really the same under the hood, since instantiating a class is the same as calling the class as if it were a function.

An example of this (in Python) would be

from matplotlib.pyplot import subplots
result = subplots()

But the matplotlib documentation tends to use another, equivalent syntax:

import matplotlib.pyplot as plt
result = plt.subplots()

This uses the subplots symbol (class or function) as a method on the module matplotlib.pyplot, which the import statement aliases to plt. This is a more object-oriented syntax for the same API.

Mapping the Function API

The previous code examples used this Perl 6 code to call the subplots symbol:

my $py = Inline::Python.new;
$py.run('import matplotlib.pyplot');
sub plot(Str $name, |c) {
    $py.call('matplotlib.pyplot', $name, |c);
}

my ($figure, $subplots) = plot('subplots');

If we want to call subplots() instead of plot('subplots'), and bar(args) instead of plot('bar', args), we can use a function to generate wrapper functions:

my $py = Inline::Python.new;

sub gen(Str $namespace, *@names) {
    $py.run("import $namespace");

    return @names.map: -> $name {
        sub (|args) {
            $py.call($namespace, $name, |args);
        }
    }
}

my (&subplots, &bar, &legend, &title, &show)
    = gen('matplotlib.pyplot', <subplots bar legend title show>);

my ($figure, $subplots) = subplots();

# more code here

legend($@plots, $@top-authors);
title('Contributions per day');
show();

This makes the functions' usage quite nice, but comes at the cost of duplicating their names. One can view this as a feature, because it allows the creation of different aliases, or as a source of bugs, when the order is messed up or a name is misspelled.

How could we avoid the duplication should we choose to create wrapper functions?

This is where Perl 6's flexibility and introspection abilities pay off. There are two key components that allow a nicer solution: the fact that declarations are expressions and that you can introspect variables for their names.

The first part means you can write mysub my ($a, $b), which declares the variables $a and $b, and calls a function with those variables as arguments. The second part means that $a.VAR.name returns a string '$a', the name of the variable.

Let's combine this to create a wrapper that initializes subroutines for us:

sub pysub(Str $namespace, |args) {
    $py.run("import $namespace");

    for args[0] <-> $sub {
        my $name = $sub.VAR.name.substr(1);
        $sub = sub (|args) {
            $py.call($namespace, $name, |args);
        }
    }
}

pysub 'matplotlib.pyplot',
    my (&subplots, &bar, &legend, &title, &show);

This avoids duplicating the name, but forces us to use some lower-level Perl 6 features in sub pysub. Using ordinary variables means that accessing their .VAR.name results in the name of the variable, not the name of the variable that's used on the caller side. So we can't use slurpy arguments as in

sub pysub(Str $namespace, *@subs)

Instead we must use |args to obtain the rest of the arguments in a Capture. This doesn't flatten the list of variables passed to the function, so when we iterate over them, we must do so by accessing args[0]. By default, loop variables are read-only, which we can avoid by using <-> instead of -> to introduce the signature. Fortunately, that also preserves the name of the caller side variable.
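Here is a small self-contained sketch (sub and variable names made up) of those two features working together, without any Python involved:

```perl6
sub name-them(|args) {
    for args[0] <-> $v {               # <-> keeps the caller-side container
        $v = $v.VAR.name.substr(1);    # '$foo' => 'foo'
    }
}

name-them my ($foo, $bar);   # the declaration is an expression
say $foo;                    # foo
say $bar;                    # bar
```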

An Object-Oriented Interface

Instead of exposing the functions, we can also create types that emulate the method calls on Python modules. For that we can implement a class with a method FALLBACK, which Perl 6 calls for us when calling a method that is not implemented in the class:

class PyPlot is Mu {
    has $.py;
    submethod TWEAK {
        $!py.run('import matplotlib.pyplot');
    }
    method FALLBACK($name, |args) {
        $!py.call('matplotlib.pyplot', $name, |args);
    }
}

my $pyplot = PyPlot.new(:$py);
my ($figure, $subplots) = $pyplot.subplots;
# plotting code goes here
$pyplot.legend($@plots, $@top-authors);

$pyplot.title('Contributions per day');
$pyplot.show;

Class PyPlot inherits directly from Mu, the root of the Perl 6 type hierarchy, instead of Any, the default parent class (which in turn inherits from Mu). Any introduces a large number of methods that Perl 6 objects get by default and since FALLBACK is only invoked when a method is not present, this is something to avoid.

The method TWEAK is another method that Perl 6 calls automatically for us, after the object has been fully instantiated. All-caps method names are reserved for such special purposes. It is marked as a submethod, which means it is not inherited into subclasses. Since TWEAK is called at the level of each class, if it were a regular method, a subclass would call it twice implicitly. Note that TWEAK is only supported in Rakudo version 2016.11 and later.

There's nothing specific to the Python package matplotlib.pyplot in class PyPlot, except the namespace name. We could easily generalize it to any namespace:

class PythonModule is Mu {
    has $.py;
    has $.namespace;
    submethod TWEAK {
        $!py.run("import $!namespace");
    }
    method FALLBACK($name, |args) {
        $!py.call($!namespace, $name, |args);
    }
}

my $pyplot = PythonModule.new(:$py, :namespace<matplotlib.pyplot>);

This is one Perl 6 type that can represent any Python module. If instead we want a separate Perl 6 type for each Python module, we could use roles, which are optionally parameterized:

role PythonModule[Str $namespace] is Mu {
    has $.py;
    submethod TWEAK {
        $!py.run("import $namespace");
    }
    method FALLBACK($name, |args) {
        $!py.call($namespace, $name, |args);
    }
}

my $pyplot = PythonModule['matplotlib.pyplot'].new(:$py);

Using this approach, we can create type constraints for Python modules in Perl 6 space:

sub plot-histogram(PythonModule['matplotlib.pyplot'] $pyplot, @data) {
    # implementation here
}

Passing in any other wrapped Python module than matplotlib.pyplot results in a type error.

Summary

Perl 6 offers enough flexibility to create function and method call APIs around Python modules. With a bit of meta programming, we can emulate the typical Python APIs close enough that translating from the Python documentation to Perl 6 code becomes easy.

Subscribe to the Perl 6 book mailing list


Perl 6 Maven: Encrypting Passwords in Perl 6 using crypt

Published by szabgab

Zoffix Znet: But Here's My Dispatch, So callwith Maybe

Published on 2017-03-28T00:00:00

All about nextwith, nextsame, samewith, callwith, callsame, nextcallee, and lastcall

Perlgeek.de: Perl 6 By Example: Stacked Plots with Matplotlib

Published by Moritz Lenz on 2017-03-25T23:00:01

This blog post is part of my ongoing project to write a book about Perl 6.

If you're interested, either in this book project or any other Perl 6 book news, please sign up for the mailing list at the bottom of the article, or here. It will be low volume (less than an email per month, on average).


In a previous episode, we've explored plotting git statistics in Perl 6 using matplotlib.

Since I wasn't quite happy with the result, I want to explore using stacked plots for presenting the same information. In a regular plot, the y coordinate of each plotted value is proportional to its value. In a stacked plot, it is the distance to the previous value that is proportional to its value. This is nice for values that add up to a total that is also interesting.
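The stacking arithmetic can be sketched numerically (a Python sketch for illustration, not part of the article's code): each curve is drawn at the running total of all the series below it plus its own values.

```python
# Sketch of stacked-plot arithmetic: the curve for series i is drawn
# at the element-wise cumulative sum of series 0..i.
from itertools import accumulate

series = [
    [1, 2, 3],  # bottom series
    [4, 0, 1],  # stacked on top of the first
    [2, 2, 2],  # stacked on top of the first two
]

stacked = list(accumulate(series, lambda a, b: [x + y for x, y in zip(a, b)]))
print(stacked)  # [[1, 2, 3], [5, 2, 4], [7, 4, 6]]
```

The distance between consecutive curves at each x position is exactly the value of the upper series, which is what makes the stacked representation readable.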

Matplotlib offers a method called stackplot for that. Contrary to multiple plot calls on the subplots object, it requires a shared x axis for all data series. So we must construct one array per author of git commits, in which dates without contributions from that author come out as zero.

As a reminder, this is what the logic for extracting the stats looked like in the first place:

my $proc = run :out, <git log --date=short --pretty=format:%ad!%an>;
my (%total, %by-author, %dates);
for $proc.out.lines -> $line {
    my ( $date, $author ) = $line.split: '!', 2;
    %total{$author}++;
    %by-author{$author}{$date}++;
    %dates{$date}++;
}
my @top-authors = %total.sort(-*.value).head(5)>>.key;

And some infrastructure for plotting with matplotlib:

my $py = Inline::Python.new;
$py.run('import datetime');
$py.run('import matplotlib.pyplot');
sub plot(Str $name, |c) {
    $py.call('matplotlib.pyplot', $name, |c);
}
sub pydate(Str $d) {
    $py.call('datetime', 'date', $d.split('-').map(*.Int));
}

my ($figure, $subplots) = plot('subplots');
$figure.autofmt_xdate();

So now we have to construct an array of arrays, where each inner array has the values for one author:

my @dates = %dates.keys.sort;
my @stack = $[] xx @top-authors;

for @dates -> $d {
    for @top-authors.kv -> $idx, $author {
        @stack[$idx].push: %by-author{$author}{$d} // 0;
    }
}

Now plotting becomes a simple matter of a method call, followed by the usual commands adding a title and showing the plot:

$subplots.stackplot($[@dates.map(&pydate)], @stack);
plot('title', 'Contributions per day');
plot('show');

The result (again run on the zef source repository) is this:

Stacked plot of zef contributions over time

Comparing this to the previous visualization reveals a discrepancy: there were no commits in 2014, and yet the stacked plot makes it appear as though there were. In fact, the previous plots would have shown the same "alternative facts" if we had chosen lines instead of points. The reason is that matplotlib (like nearly all plotting libraries) interpolates linearly between data points. But in our case, a date with no data points means zero commits happened on that date.

To communicate this to matplotlib, we must explicitly insert zero values for missing dates. This can be achieved by replacing

my @dates = %dates.keys.sort;

with the line

my @dates = %dates.keys.minmax;

The minmax method finds the minimal and maximal values, and returns them in a Range. Assigning the range to an array turns it into an array of all values between the minimal and the maximal value. The logic for assembling the @stack variable already maps missing values to zero.
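In Python terms (just an analogue for illustration, not the article's Perl 6 code), the same fix amounts to expanding a sparse date-to-count mapping into a dense series over the full date range, with missing dates mapped to zero:

```python
# Expand a sparse mapping of date -> commit count into a dense series
# covering every day from the earliest to the latest date.
from datetime import date, timedelta

counts = {date(2017, 1, 1): 3, date(2017, 1, 4): 1}

lo, hi = min(counts), max(counts)
days = [lo + timedelta(days=i) for i in range((hi - lo).days + 1)]
dense = [counts.get(d, 0) for d in days]
print(dense)  # [3, 0, 0, 1]
```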

The result looks a bit better, but still far from perfect:

Stacked plot of zef contributions over time, with missing dates mapped to zero

Thinking more about the problem, contributions from separate days should not be joined together, because that produces misleading results. Matplotlib doesn't support adding a legend automatically to stacked plots, so this seems to be a dead end.

Since a dot plot didn't work very well, let's try a different kind of plot that represents each data point separately: a bar chart, or more specifically, a stacked bar chart. Matplotlib offers the bar plotting method, and a named parameter bottom can be used to generate the stacking:

my @dates = %dates.keys.sort;
my @stack = $[] xx @top-authors;
my @bottom = $[] xx @top-authors;

for @dates -> $d {
    my $bottom = 0;
    for @top-authors.kv -> $idx, $author {
        @bottom[$idx].push: $bottom;
        my $value = %by-author{$author}{$d} // 0;
        @stack[$idx].push: $value;
        $bottom += $value;
    }
}

We need to supply color names ourselves, and set the edge color of the bars to the same color, otherwise the black edge color dominates the result:

my $width = 1.0;
my @colors = <red green blue yellow black>;
my @plots;

for @top-authors.kv -> $idx, $author {
    @plots.push: plot(
        'bar',
        $[@dates.map(&pydate)],
        @stack[$idx],
        $width,
        bottom => @bottom[$idx],
        color => @colors[$idx],
        edgecolor => @colors[$idx],
    );
}
plot('legend', $@plots, $@top-authors);

plot('title', 'Contributions per day');
plot('show');

This produces the first plot that's actually informative and not misleading (provided you're not color blind):

Stacked bar plot of zef contributions over time

If you want to improve the result further, you could experiment with limiting the number of bars by lumping together contributions by week or month (or maybe $n-day period).

Next, we'll investigate ways to make the matplotlib API more idiomatic to use from Perl 6 code.

Subscribe to the Perl 6 book mailing list


Perl 6 Maven: Getting started with Rakudo Perl 6 in a Docker container

Published by szabgab

Perlgeek.de: Perl 6 By Example: Plotting using Matplotlib and Inline::Python

Published by Moritz Lenz on 2017-03-18T23:00:01

This blog post is part of my ongoing project to write a book about Perl 6.

If you're interested, either in this book project or any other Perl 6 book news, please sign up for the mailing list at the bottom of the article, or here. It will be low volume (less than an email per month, on average).


Occasionally I come across git repositories, and want to know how active they are, and who the main developers are.

Let's develop a script that plots the commit history, and explore how to use Python modules in Perl 6.

Extracting the Stats

We want to plot the number of commits by author and date. Git makes it easy for us to get to this information by giving some options to git log:

my $proc = run :out, <git log --date=short --pretty=format:%ad!%an>;
my (%total, %by-author, %dates);
for $proc.out.lines -> $line {
    my ( $date, $author ) = $line.split: '!', 2;
    %total{$author}++;
    %by-author{$author}{$date}++;
    %dates{$date}++;
}

run executes an external command, and :out tells it to capture the command's output and make it available as $proc.out. The command is a list, with the first element being the actual executable; the rest of the elements are command line arguments to this executable.

Here git log gets the options --date=short --pretty=format:%ad!%an, which instruct it to produce lines like 2017-03-01!John Doe. This line can be parsed with a simple call to $line.split: '!', 2, which splits on the !, and limits the result to two elements. Assigning it to a two-element list ( $date, $author ) unpacks it. We then use hashes to count commits by author (in %total), by author and date (%by-author) and finally by date. In the second case, %by-author{$author} isn't even a hash yet, and we can still hash-index it. This is due to a feature called autovivification, which automatically creates ("vivifies") objects where we need them. The use of ++ creates integers, {...} indexing creates hashes, [...] indexing and .push create arrays, and so on.
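Python has no autovivification built in, but a nested collections.defaultdict gives a rough analogue of the %by-author{$author}{$date}++ line (a sketch for comparison, not part of the article):

```python
# Rough Python analogue of Perl 6 autovivification: the inner dict
# springs into existence on first access instead of raising KeyError.
from collections import defaultdict

by_author = defaultdict(lambda: defaultdict(int))
by_author['John Doe']['2017-03-01'] += 1
by_author['John Doe']['2017-03-01'] += 1
print(by_author['John Doe']['2017-03-01'])  # 2
```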

To get from these hashes to the top contributors by commit count, we can sort %total by value. Since this sorts in ascending order, sorting by the negative value gives the list in descending order. The list contains Pair objects, and we only want the first five of these, and only their keys:

my @top-authors = %total.sort(-*.value).head(5).map(*.key);

For each author, we can extract the dates of their activity and their commit counts like this:

my @dates  = %by-author{$author}.keys.sort;
my @counts = %by-author{$author}{@dates};

The last line uses slicing, that is, indexing a hash with a list to return a list of elements.

Plotting with Python

Matplotlib is a very versatile library for all sorts of plotting and visualization. It's written in Python and for Python programs, but that won't stop us from using it in a Perl 6 program.

But first, let's take a look at a basic plotting example that uses dates on the x axis:

import datetime
import matplotlib.pyplot as plt

fig, subplots = plt.subplots()
subplots.plot(
    [datetime.date(2017, 1, 5), datetime.date(2017, 3, 5), datetime.date(2017, 5, 5)],
    [ 42, 23, 42 ],
    label='An example',
)
subplots.legend(loc='upper center', shadow=True)
fig.autofmt_xdate()
plt.show()

To make this run, you have to install python 2.7 and matplotlib. You can do this on Debian-based Linux systems with apt-get install -y python-matplotlib. The package name is the same on RPM-based distributions such as CentOS or SUSE Linux. MacOS users are advised to install python 2.7 through homebrew or macports, and then use pip2 install matplotlib or pip2.7 install matplotlib to get the library. Windows installation is probably easiest through the conda package manager, which offers pre-built binaries of both python and matplotlib.

When you run this script with python2.7 dates.py, it opens a GUI window showing the plot and some controls, which allow you to zoom, scroll, and write the plot graphic to a file:

Basic matplotlib plotting window

Bridging the Gap

The Rakudo Perl 6 compiler comes with a handy library for calling foreign functions, which allows you to call functions written in C, or anything with a compatible binary interface.

The Inline::Python library uses the native call functionality to talk to python's C API, and offers interoperability between Perl 6 and Python code. At the time of writing, this interoperability is still fragile in places, but can be worth using for some of the great libraries that Python has to offer.

To install Inline::Python, you must have a C compiler available, and then run

$ zef install Inline::Python

(or the same with panda instead of zef, if that's your module installer).

Now you can start to run Python 2 code in your Perl 6 programs:

use Inline::Python;

my $py = Inline::Python.new;
$py.run: 'print("Hello, Pyerl 6")';

Besides the run method, which takes a string of Python code and executes it, you can also use call to call Python routines by specifying the namespace, the routine to call, and a list of arguments:

use Inline::Python;

my $py = Inline::Python.new;
$py.run('import datetime');
my $date = $py.call('datetime', 'date', 2017, 1, 31);
$py.call('__builtin__', 'print', $date);    # 2017-01-31

The arguments that you pass to call are Perl 6 objects, like the three Int objects in this example. Inline::Python automatically translates them to the corresponding Python built-in data structures. It translates numbers, strings, arrays and hashes. Return values are also translated in the opposite direction, though since Python 2 does not distinguish properly between byte and Unicode strings, Python strings end up as buffers in Perl 6.

Objects that Inline::Python cannot translate are handled as opaque objects on the Perl 6 side. You can pass them back into python routines (as shown with the print call above), or you can call methods on them:

say $date.isoformat().decode;               # 2017-01-31

Perl 6 exposes attributes through methods, so it has no syntax for accessing attributes of foreign objects directly. If you try to access, for example, the year attribute of datetime.date through the normal method call syntax, you get an error.

say $date.year;

Dies with

'int' object is not callable

Instead, you have to use the getattr builtin:

say $py.call('__builtin__', 'getattr', $date, 'year');

Using the Bridge to Plot

We need access to two namespaces in python, datetime and matplotlib.pyplot, so let's start by importing them, and write some short helpers:

my $py = Inline::Python.new;
$py.run('import datetime');
$py.run('import matplotlib.pyplot');
sub plot(Str $name, |c) {
    $py.call('matplotlib.pyplot', $name, |c);
}

sub pydate(Str $d) {
    $py.call('datetime', 'date', $d.split('-').map(*.Int));
}

We can now call pydate('2017-03-01') to create a python datetime.date object from an ISO-formatted string, and call the plot function to access functionality from matplotlib:

my ($figure, $subplots) = plot('subplots');
$figure.autofmt_xdate();

my @dates = %dates.keys.sort;
$subplots.plot:
    $[@dates.map(&pydate)],
    $[ %dates{@dates} ],
    label     => 'Total',
    marker    => '.',
    linestyle => '';

The Perl 6 call plot('subplots') corresponds to the python code fig, subplots = plt.subplots(). Passing arrays to a python function needs a bit of extra work, because Inline::Python flattens arrays. Using an extra $ sigil in front of an array puts it into an extra scalar, and thus prevents the flattening.

Now we can actually plot the number of commits by author, add a legend, and plot the result:

for @top-authors -> $author {
    my @dates = %by-author{$author}.keys.sort;
    my @counts = %by-author{$author}{@dates};
    $subplots.plot:
        $[ @dates.map(&pydate) ],
        $@counts,
        label     => $author,
        marker    =>'.',
        linestyle => '';
}


$subplots.legend(loc=>'upper center', shadow=>True);

plot('title', 'Contributions per day');
plot('show');

When run in the zef git repository, it produces this plot:

Contributions to zef, a Perl 6 module installer

Summary

We've explored how to use the python library matplotlib to generate a plot from git contribution statistics. Inline::Python provides convenient functionality for accessing python libraries from Perl 6 code.

In the next installment, we'll explore ways to improve both the graphics and the glue code between Python and Perl 6.

Subscribe to the Perl 6 book mailing list


rakudo.org: Upgrade Information for Lexical require

Published by Zoffix Znet on 2017-03-18T01:29:32

Upgrade Information for Lexical require

What’s Happening?

Rakudo Compiler release 2017.03 includes the final piece of lexical module loading work: lexical require. This work was first announced in December, in http://rakudo.org/2016/12/17/lexical-module-loading/

There are two changes that may impact your code:

Upgrade Information

Lexical Symbols

WRONG:

# WRONG:
try { require Foo; 1 } and ::('Foo').new;

The require above is inside a block and so its symbols won’t be available
outside of it and the look up will fail.

CHANGE TO:

(try require Foo) !=== Nil and ::('Foo').new;

Now the require installs the symbols into a scope that’s lexically accessible
to the ::('Foo') look up.

Optional Loading

WRONG:

# WRONG:
try require Foo;
if ::('Foo') ~~ Failure {
    say "Failed to load Foo!";
}

This construct installs a package named Foo, which would be replaced by the
loaded Foo if it were found, but if it weren’t, the package will remain a
package, not a Failure, and so the above ~~ test will always be False.

CHANGE TO:

# Use return value to test whether loading succeeded:
(try require Foo) === Nil and say "Failed to load Foo!";

# Or use a run-time symbol lookup with require, to avoid compile-time
# package installation:
try require ::('Foo');
if ::('Foo') ~~ Failure {
    say "Failed to load Foo!";
}

In the first example above, we test that the return value of try isn’t Nil, since
on successful loading it will be a Foo module, class, or package.

The second example uses a run-time symbol lookup in require and so it never needs
to install the package placeholder during the compile time. Therefore, the
::('Foo') ~~ test does work as intended.

Help and More Info

If you require help or more information, please join our chat channel
#perl6 on irc.freenode.net

6guts: Considering hyper/race semantics

Published by jnthnwrthngtn on 2017-03-16T16:42:05

We got a lot of nice stuff into Perl 6.c, the version of the language released on Christmas of 2015. Since then, a lot of effort has gone on polishing the things we already had in place, and also on optimization. By this point, we’re starting to think about Perl 6.d, the next language release. Perl 6 is defined by its test suite. Even before considering additional features, the 6.d test suite will tie down a whole bunch of things that we didn’t have covered in the 6.c one. In that sense, we’ve already got a lot done towards it.

In this post, I want to talk about one of the things I’d really like to get nailed down as part of 6.d, and that is the semantics of hyper and race. Along with that I will, of course, be focusing on getting the implementation in much better shape. These two methods enable parallel processing of list operations. hyper means we can perform operations in parallel, but we must retain and respect ordering of results. For example:

say (1, 9, 6).hyper.map(* + 5); # (6 14 11)

Should always give the same results as if the hyper were not there, even if a thread computing 6 + 5 gave its result before the one computing 1 + 5. (Obviously, this is not a particularly good real-world example, since the overhead of setting up parallel execution would dwarf doing 3 integer operations!) Note, however, that the order of side-effects is not guaranteed, so:

(1..1000).hyper.map(&say);

Could output the numbers in any order. By contrast, race is so keen to give you results that it doesn’t even try to retain the order of results:

say (1, 9, 6).race.map(* + 5); # (14 6 11) or (6 11 14) or ...

Back in 2015, when I was working on the various list handling changes we did in the run up to the Christmas release, my prototyping work included an initial implementation of the map operation in hyper and race mode, done primarily to figure out the API. This then escaped into Rakudo, and even ended up with a handful of tests written for it. In hindsight, that code perhaps should have been pulled out again, but it lives on in Rakudo today. Occasionally somebody shows a working example on IRC using the eval bot – usually followed by somebody just as swiftly showing a busted one!

At long last, getting these fixed up and implemented more fully has made it to the top of my todo list. Before digging into the implementation side of things, I wanted to take a step back and work out the semantics of all the various operations that might be part of or terminate a hyper or race pipeline. So, today I made a list of those operations, and then went through every single one of them and proposed the basic semantics.

The results of that effort are in this spreadsheet. Along with describing the semantics, I’ve used a color code to indicate where the result leaves you in the hyper or race paradigm afterwards (that is, a chained operation will also be performed in parallel).

I’m sure some of these will warrant further discussion and tweaks, so feel free to drop me feedback, either on the #perl6-dev IRC channel or in the comments here.


Perlgeek.de: What's a Variable, Exactly?

Published by Moritz Lenz on 2017-03-11T23:00:01

When you learn programming, you typically first learn about basic expressions, like 2 * 21, and then the next topic is control structures or variables. (If you start with functional programming, maybe it takes you a bit longer to get to variables).

So, every programmer knows what a variable is, right?

Turns out, it might not be that easy.

Some people like to say that in ruby, everything is an object. Well, a variable isn't really an object. The same holds true for other languages.

But let's start from the bottom up. In a low-level programming language like C, a local variable is a name that the compiler knows, with a type attached. When the compiler generates code for the function that the variable is in, the name resolves to an address on the stack (unless the compiler optimizes the variable away entirely, or manages it through a CPU register).

So in C, the variable only exists as such while the compiler is running. When the compiler is finished, and the resulting executable runs, there might be some stack offset or memory location that corresponds to our understanding of the variable. (And there might be debugging symbols that allow some mapping back to the variable name, but that's really a special case).

In case of recursion, a local variable can exist once for each time the function is called.

Closures

In programming languages with closures, local variables can be referenced from inner functions. They can't generally live on the stack, because the reference keeps them alive. Consider this piece of Perl 6 code (though we could write the same in Javascript, Ruby, Perl 5, Python or most other dynamic languages):

sub outer() {
    my $x = 42;
    return sub inner() {
        say $x;
    }
}

my &callback = outer();
callback();

The outer function has a local (lexical) variable $x, and the inner function uses it. So once outer has finished running, there's still an indirect reference to the value stored in this variable.

They say you can solve any problem in computer science through another layer of indirection, and that's true for the implementation of closures. The &callback variable, which points to a closure, actually stores two pointers under the hood. One goes to the static byte code representation of the code, and the second goes to a run-time data structure called a lexical pad, or lexpad for short. Each time you invoke the outer function, a new instance of the lexpad is created, and the closure points to the new instance, and always to the same static code.
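The same code-plus-captured-environment pairing can be observed directly in Python, where a closure object exposes its captured variables as cells (a sketch for illustration, not part of the article):

```python
# A Python closure stores a reference to its static code (__code__)
# and a tuple of cells holding the captured variables (__closure__),
# mirroring the code-pointer-plus-lexpad structure described above.
def outer():
    x = 42
    def inner():
        return x
    return inner

callback = outer()
print(callback())                             # 42
print(callback.__closure__[0].cell_contents)  # 42
```

Calling outer() again would produce a new closure with a fresh cell, just as each invocation creates a fresh lexpad instance.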

But even in dynamic languages with closures, variables themselves don't need to be objects. If a language forbids the creation of variables at run time, the compiler knows what variables exist in each scope, and can for example map each of them to an array index, so the lexpad becomes a compact array, and an access to a variable becomes an indexing operation into that array. Lexpads generally live on the heap, and are garbage collected (or reference counted) just like other objects.

Lexpads are mostly a performance optimization. You could have a separate runtime representation of each variable, but then you'd need one allocation per variable in each function call you perform, which is generally much slower than a single allocation of the whole lexpad.

The Plot Thickens

To summarize, a variable has a name, a scope, and in languages that support it, a type. Those are properties known to the compiler, but not necessarily present at run time. At run time, a variable typically resolves to a stack offset in low-level languages, or to an index into a lexpad in dynamic languages.

Even in languages that boldly claim that "everything is an object", a variable often isn't. The value inside a variable may be, but the variable itself typically not.

Perl 6 Intricacies

The things I've written above generalize pretty neatly to many programming languages. I am a Perl 6 developer, so I have some insight into how Perl 6 implements variables. If you don't resist, I'll share it with you :-).

Variables in Perl 6 typically come with one more level of indirection, which we call a container. This allows two types of write operations: assignment, which stores a value inside a container (which in turn might be referenced by a variable), and binding, which places either a value or a container directly into the variable.

Here's an example of assignment and binding in action:

my $x;
my $y;
# assignment:
$x = 42;
$y = 'a string';

say $x;     # => 42
say $y;     # => a string

# binding:
$x := $y;

# now $x and $y point to the same container, so that assigning to one
# changes the other:
$y = 21;
say $x;     # => 21

Why, I hear you cry?

There are three major reasons.

The first is that it makes assignment nothing special. For example in python, if you assign to anything other than a plain variable, the compiler translates it to some special method call (obj.attr = x to setattr(obj, 'attr', x), obj[idx] = x to a __setitem__ call, etc.). In Perl 6, if you want to implement something you can assign to, you simply return a container from that expression, and then assignment works naturally.
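The python translation mentioned above can be demonstrated directly; these pairs of statements are equivalent:

```python
# In Python, attribute and item assignment compile down to special
# method calls; the sugared and explicit forms below are equivalent.
class Obj:
    pass

obj = Obj()
obj.attr = 1               # sugar for ...
setattr(obj, 'attr', 2)    # ... this explicit call

d = {}
d['idx'] = 3               # sugar for ...
d.__setitem__('idx', 4)    # ... this explicit call

print(obj.attr, d['idx'])  # 2 4
```

Perl 6 avoids the need for these special cases by making the assignment target itself responsible for returning a container.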

For example an array is basically just a list in which the elements are containers. This makes @array[$index] = $value work without any special cases, and allows you to assign to the return value of methods, functions, or anything else you can think of, as long as the expression returns a container.

The second reason for having both binding and assignment is that it makes it pretty easy to make things read-only. If you bind a non-container into a variable, you can't assign to it anymore:

my $a := 42;
$a = "hordor";  # => Cannot assign to an immutable value

Perl 6 uses this mechanism to make function parameters read-only by default.

Likewise, returning from a function or method by default strips the container, which avoids accidental action-at-a-distance (though an is rw annotation can prevent that, if you really want it).

This automatic stripping of containers also makes expressions like $a + 2 work, independently of whether $a holds an integer directly, or a container that holds an integer. (In the implementation of Perl 6's core types, sometimes this has to be done manually. If you ever wondered what nqp::decont does in Rakudo's source code, that's what).

The third reason relates to types.

Perl 6 supports gradual typing, which means you can optionally annotate your variables (and other things) with types, and Perl 6 enforces them for you. It detects type errors at compile time where possible, and falls back to checking types at run time.

The type of a variable applies only to binding, but the variable's default container inherits this type, and the container type is enforced at run time. You can observe this difference by binding a container with a different constraint:

my Any $x;
my Int $i;
$x := $i;
$x = "foo";     # => Type check failed in assignment to $i; expected Int but got Str ("foo")

Int is a subtype of Any, which is why the binding of $i to $x succeeds. Now $x and $i share a container that is type-constrained to Int, so assigning a string to it fails.

Did you notice how the error message mentions $i as the variable name, even though we've tried to assign to $x? The variable name in the error message is really a heuristic, which works often enough, but sometimes fails. The container that's shared between $x and $i has no idea which variable you used to access it, it just knows the name of the variable that created it, here $i.

Binding checks the variable type, not the container type, so this code doesn't complain:

my Any $x;
my Int $i;
$x := $i;
$x := "a string";

This distinction between variable type and container type might seem weird for scalar variables, but it really starts to make sense for arrays, hashes and other compound data structures that might want to enforce a type constraint on their elements:

sub f($x) {
    $x[0] = 7;
}
my Str @s;
f(@s);

This code declares an array whose elements must all be of type Str (or a subtype thereof). When you pass it to a function, that function has no compile-time knowledge of the type. But since $x[0] returns a container with the type constraint Str, assigning an integer to it produces the error you'd expect.

Summary

Variables typically only exist as objects at compile time. At run time, they are just some memory location, either on the stack or in a lexical pad.

Perl 6 makes the understanding of the exact nature of variables a bit more involved by introducing a layer of containers between variables and values. This offers great flexibility when writing libraries that behave like built-in classes, but comes with the burden of additional complexity.

Zoffix Znet: Tag Your Dists

Published on 2017-03-10T00:00:00

Tags support in Perl 6 modules ecosystem

Pawel bbkr Pabian: Your own template engine in 4 flavors. With Benchmarks!

Published by Pawel bbkr Pabian on 2017-02-25T23:43:36

This time on the blog I'll show you how to write your own template engine - with syntax and behavior tailored to your needs. And we'll do it in four different ways, to analyze the pros and cons of each approach as well as code speed and complexity. Our sample task for today is to compose a password reminder text for a user, which can then be sent by email.

use v6;

my $template = q{
Hi [VARIABLE person]!

You can change your password by visiting [VARIABLE link] .

Best regards.
};

my %fields = (
    'person' => 'John',
    'link'   => 'http://example.com',
);

So we've decided what our template syntax should look like, and for starters we'll do trivial variables (although that's not a very precise name, because variables in templates are almost always immutable). We also have data to populate the template fields. Let's get started!

1. Substitutions

sub substitutions ( $template is copy, %fields ) {
    for %fields.kv -> $key, $value {
        $template ~~ s:g/'[VARIABLE ' $key ']'/$value/;
    }
    return $template;
}

say substitutions($template, %fields);

Yay, works:

Hi John!

You can change your password by visiting http://example.com .

Best regards.

Now it is time to benchmark it to get some baseline for different approaches:

use Bench;

my $template_short = $template;
my %fields_short   = %fields;

my $template_long = join(
    ' lorem ipsum ', map( { '[VARIABLE ' ~ $_ ~ ']' }, 'a' .. 'z' )
) x 100;
my %fields_long = ( 'a' .. 'z' ) Z=> ( 'lorem ipsum' xx * );

my $b = Bench.new;
$b.timethese(
    1000,
    {
        'substitutions_short' => sub {
            substitutions( $template_short, %fields_short )
        },
        'substitutions_long' => sub {
            substitutions( $template_long, %fields_long )
        },
    }
);

The benchmarks in this post test two cases for each approach. Our template from the example is the "short" case. The "long" case is a 62KB template containing 2599 text fragments and 2600 variables filled from 26 fields. Here are the results:

Timing 1000 iterations of substitutions_long, substitutions_short...
substitutions_long: 221.1147 wallclock secs @ 4.5225/s (n=1000)
substitutions_short: 0.1962 wallclock secs @ 5097.3042/s (n=1000)

Whoa! That is a serious penalty for long templates. The reason is that this code has three serious flaws: the original template is destroyed during variable evaluation and therefore must be copied each time we want to reuse it, the template text is parsed multiple times, and the whole output is rewritten after populating each variable. But we can do better...

2. Substitution

sub substitution ( $template is copy, %fields ) {
    $template ~~ s:g/'[VARIABLE ' (\w+) ']'/{ %fields{$0} }/;
    return $template;
}

This time we have a single substitution. The variable name is captured and used to look up the field value on the fly. Benchmarks:

Timing 1000 iterations of substitution_long, substitution_short...
substitution_long: 71.6882 wallclock secs @ 13.9493/s (n=1000)
substitution_short: 0.1359 wallclock secs @ 7356.3411/s (n=1000)

A mediocre boost. The penalty on long templates is smaller because the text is no longer parsed multiple times. However, the remaining flaws of the previous approach still apply, and the regexp engine must still do plenty of memory reallocations for each replaced piece of template text.

It also won't allow our template engine to gain new features - like conditions or loops - in the future, because it is very hard to parse nested tags in a single regexp. Time for a completely different approach...

3. Grammars and direct Actions

If you are not familiar with Perl 6 grammars and the Abstract Syntax Tree concept, you should study the official documentation first.

grammar Grammar {
    regex TOP { ^ [ <chunk=text> | <chunk=variable> ]* $ }
    regex text { <-[ \[ \] ]>+ }
    regex variable { '[VARIABLE ' $<name>=(\w+) ']' }
}

class Actions {

    has %.fields is required;

    method TOP ( $/ ) {
        make [~]( map { .made }, $/{'chunk'} );
    }
    method text ( $/ ) {
        make ~$/;
    }
    method variable ( $/ ) {
        make %.fields{ $/{'name'} };
    }

}

sub grammar_actions_direct ( $template, %fields ) {
    my $actions = Actions.new( fields => %fields );
    return Grammar.parse($template, :$actions).made;
}

The most important thing is defining our template syntax as a grammar. A grammar is just a set of named regular expressions that can call each other. At "TOP" (where parsing starts) we see that our template is composed of chunks. Each chunk is either text or a variable. The regexp for text matches everything until it hits the variable start (the '[' character; let's assume it is forbidden in text to keep things simple). The regexp for variable should look familiar from the previous approaches, however now we capture the variable name in a named way instead of positionally.
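To see what the parser hands to the actions, it helps to run the grammar alone on a tiny input. A quick sketch, assuming the Grammar class above is in scope (the chunk count depends on the template, of course):

```perl6
my $m = Grammar.parse('Hi [VARIABLE person]!');
say $m{'chunk'}.elems;          # three chunks: 'Hi ', the variable, '!'
say ~$m{'chunk'}[1]{'name'};    # the captured variable name: person
```

The named alias in TOP is what makes all text and variable matches show up together, in order, under the single 'chunk' key.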

An action class has methods that are called whenever the regexp with the corresponding name is matched. When called, a method gets the match object ($/) from that regexp and can "make" something from it. This "made" something is visible to the upper-level method when it is called in turn. For example, our "TOP" regexp calls the "text" regexp, which matches the "Hi " part of the template and triggers the "text" method; that method just "make"s the matched string for later use. Then "TOP" calls the "variable" regexp, which matches the "[VARIABLE name]" part of the template; the "variable" method reads the variable name from its match object and "makes" the value of that variable from the %fields hash. This continues until the end of the template string. Finally the "TOP" regexp is matched and the "TOP" method is called; it can access the array of text and variable "chunks" in its match object and see what was "made" for those chunks earlier, so all it has to do is "make" those values concatenated together. And we get this final "made" template from the "parse" method. So let's look at the benchmarks:

Timing 1000 iterations of grammar_actions_direct_long, grammar_actions_direct_short...
grammar_actions_direct_long: 149.5412 wallclock secs @ 6.6871/s (n=1000)
grammar_actions_direct_short: 0.2405 wallclock secs @ 4158.1981/s (n=1000)

We got rid of two more flaws of the previous approaches. The original template is not destroyed when fields are filled, which means less memory copying. There is also no reallocation of memory during substitution of each field, because every action method just "make"s strings to be joined later. And we can easily extend our template syntax with loops, conditions and more just by throwing some regexps into the grammar and defining the corresponding behavior in the actions. Unfortunately we see some performance regression, because every time the template is processed it is parsed again: match objects are created, a parse tree is built, and all those "make"/"made" values must be tracked as the tree is collapsed to the final output. But that was not our final word...

4. Grammars and closure Actions

Finally we reached the "boss level", where we have to exterminate the last and greatest flaw: re-parsing.
The idea is to use grammars and actions as in the previous approach, but this time instead of producing output directly we want to generate executable, reusable code that works like this under the hood:

sub ( %fields ) {
    return join '',
        sub ( %fields ) { return "Hi "}.( %fields ),
        sub ( %fields ) { return %fields{'person'} }.( %fields ),
        ...
}

That's right, we will convert our template body into a cascade of subroutines.
Each time this cascade is called it will get and propagate %fields to the deeper subroutines.
Each subroutine is responsible for handling the piece of template matched by a single regexp in the grammar. We can reuse the grammar from the previous approach and modify only the actions:

class Actions {
    
    method TOP ( $/ ) {
        my @chunks = $/{'chunk'};
        make sub ( %fields ) {
           return [~]( map { .made.( %fields ) }, @chunks );
        };
    }
    method text ( $/ ) {
        my $text = ~$/;
        make sub ( %fields ) {
            return $text;
        };
    }
    method variable ( $/ ) {
        my $name = $/{'name'};
        make sub ( %fields  ) {
            return %fields{$name}
        };
    }
    
}

sub grammar_actions_closures ( $template, %fields ) {
    state %cache{Str};
    my $closure = %cache{$template} //= Grammar.parse(
        $template, actions => Actions.new
    ).made;
    return $closure( %fields );
}

Now every action method, instead of making final output, makes a subroutine that will receive %fields and produce that output later. To generate this cascade of subroutines the template must be parsed only once. Once we have it, we can call it with different sets of %fields to populate the template variables. Note how an object hash (%cache) is used to check whether we already have a subroutine tree for a given $template. Enough talking, let's crunch some numbers:

Timing 1000 iterations of grammar_actions_closures_long, grammar_actions_closures_short...
grammar_actions_closures_long: 22.0476 wallclock secs @ 45.3563/s (n=1000)
grammar_actions_closures_short: 0.0439 wallclock secs @ 22778.8885/s (n=1000)

A nice result! We now have an extensible template engine that is 4 times faster for short templates and 10 times faster for long ones than our initial approach. And yes, there is a bonus level...

4.1. Grammars and closure Actions in parallel

The last approach opened up a new optimization possibility. If we have subroutines that generate our template, why not run them in parallel? Let's modify our "TOP" action method to process text and variable chunks simultaneously:

method TOP ( $/ ) {
    my @chunks = $/{'chunk'};
    make sub ( %fields ) {
       return [~]( @chunks.hyper.map( {.made.( %fields ) } ).list );
    };
}

Such an optimization will shine if your template engine must do some lengthy operation to generate a chunk of the final output, for example execute a heavy database query or call some API. It is perfectly fine to ask for data on the fly while populating the template, because in a feature-rich template engine you may not be able to predict and generate the complete set of data needed beforehand, as we did with our %fields. Use this optimization wisely: for fast subroutines you will see a performance drop, because the cost of sending chunks to threads and retrieving them is higher than just executing them serially on a single core.
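For tuning that tradeoff, .hyper also accepts :batch and :degree arguments; a larger batch amortizes the per-thread overhead for cheap chunks. A standalone sketch (the numbers here are illustrative, not tuned):

```perl6
my @chunks = 1 .. 1000;

# process in batches of 64 on up to 4 workers; .hyper preserves order
my @out = @chunks.hyper(:batch(64), :degree(4)).map( * + 1 ).list;

say @out.elems;   # 1000
```

The default batch size is small, so for trivial per-chunk work explicitly raising :batch is usually the first knob to try.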

Which approach should I use to implement my own template engine?

That depends on how much you can reuse your templates. For example, if you send one password reminder per day, go for simple substitution, and reach for a grammar with direct actions if you need more complex features. But if you are using templates, for example, in PSGI processes to display hundreds of pages per second to different users, then the grammar and closure actions approach wins hands down.

You can download all approaches with benchmarks in a single file here.

To be continued?

If you liked this brief introduction to template engines and want to see more complex features - like conditions or loops - implemented, leave a comment under this article on blogs.perl.org or send me a private message on the irc.freenode.net #perl6 channel (nick: bbkr).

brrt to the future: Register Allocator Update

Published by Bart Wiegmans on 2017-02-09T16:19:00

Hi everybody, I thought some of you might be interested in an update regarding the JIT register allocator, which is after all the last missing piece for the new 'expression' JIT backend. Well, the last complicated piece, at least. Because register allocation is such a broad topic, I don't expect to cover all topics relevant to design decisions here, and reserve a future post for that purpose.

I think I may have mentioned earlier that I've chosen to implement linear scan register allocation, an algorithm first described in 1999. Linear scan is relatively popular for JIT compilers because it achieves reasonably good allocation results while being considerably simpler and faster than the alternatives, most notably graph coloring (unfortunately no open access link available). Because optimal register allocation is NP-complete, all realistic algorithms are heuristic, and linear scan applies a simple heuristic to good effect. I'm afraid fully explaining the nature of that heuristic and the tradeoffs involved is beyond the scope of this post, so you'll have to remind me to do it at a later point.
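As a teaser until then, the core loop of linear scan fits in a few lines. This is a toy sketch of the idea only (the interval data is made up, and real linear scan picks a smarter spill candidate, e.g. the active interval that ends last):

```perl6
# live intervals: first and last instruction index where a value is live
my @intervals = %(:start(0), :end(4)), %(:start(1), :end(3)), %(:start(2), :end(6));
my $registers = 2;
my (@active, @spilled);

for @intervals.sort({ $_<start> }) -> %iv {
    @active .= grep({ $_<end> > %iv<start> });            # expire finished intervals
    if @active.elems < $registers { @active.push: %iv }   # a register is free
    else                          { @spilled.push: %iv }  # none free: spill
}

say @spilled.elems;   # 1
```

The "linear" in the name is exactly this: one ordered pass over the intervals, no backtracking.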

Commit ab077741 made the new allocator the default after I had ironed out sufficient bugs to be feature-equivalent to the old allocator (which still exists, although I plan to remove it soon).
Commit 0e66a23d introduced support for 'PHI' node merging, which is really important and exciting to me, so I'll have to explain what it means. The expression JIT represents code in a form in which all values are immutable, called static single assignment form, or SSA form for short. This helps simplify compilation because there is a clear correspondence between operations and the values they compute. In general in compilers, the easier it is to assert something about code, the more interesting things you can do with it, and the better code you can compile. However, in real code, variables are often assigned more than one value. A PHI node is basically an 'escape hatch' that lets you express things like:

int x, y;
if (some_condition()) {
    x = 5;
} else {
    x = 10;
}
y = x - 3;

In this case, despite our best intentions, x can have two different values. In SSA form, this is resolved as follows:

int x1, x2, x3, y;
if (some_condition()) {
    x1 = 5;
} else {
    x2 = 10;
}
x3 = PHI(x1,x2);
y = x3 - 3;

The meaning of the PHI node is that it 'joins together' the values of x1 and x2 (somewhat like a junction in perl6), and represents the value of whichever 'version' of x was ultimately defined. Resolving PHI nodes means ensuring that, as far as the register allocator is concerned, x1, x2, and x3 should preferably be allocated to the same register (or memory location), and if that's not possible, it should copy x1 and x2 to x3 for correctness. To find the set of values that are 'connected' via PHI nodes, I apply a union-find data structure, which is a very useful data structure in general. Much to my amazement, that code worked the first time I tried it.
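For readers unfamiliar with union-find, here is a minimal sketch of the structure (with path compression but without union-by-rank; an illustration, not the MoarVM code):

```perl6
my @parent = ^10;   # ten values, each initially in its own set

sub find ( Int $x ) {
    # follow parent links to the set's root, compressing the path as we go
    @parent[$x] == $x ?? $x !! ( @parent[$x] = find(@parent[$x]) )
}
sub union ( Int $a, Int $b ) { @parent[ find($a) ] = find($b) }

union(1, 2);   # e.g. x1 and x2 joined by a PHI node
union(2, 3);
say find(1) == find(3);   # True: all three prefer the same storage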

Then I had to fix a very interesting bug in commit 36f1fe94 which involves ordering between 'synthetic' and 'natural' tiles. (Tiles are the output of the tiling process about which I've written at some length, they represent individual instructions). Within the register allocator, I've chosen to identify tiles / instructions by their index in the code list, and to store tiles in a contiguous array. There are many advantages to this strategy but they are also beyond the scope of this post. One particular advantage though is that the indexes into this array make their relative order immediately apparent. This is relevant to linear scan because it relies on relative order to determine when to allocate a register and when a value is no longer necessary.

However, because of this indexing scheme, it's not so easy to squeeze new tiles into that array - which is exactly what a register allocator does when it decides to 'spill' a value to memory and load it when needed. (Because inserts are delayed and merged into the array in a single step, the cost of insertion is constant.) Without proper ordering, a value loaded from memory could overwrite another value that is still in use. The fix for that is, I think, surprisingly simple and elegant. In order to 'make space' for the synthetic tiles, before comparison all indexes are multiplied by a factor of 2, and synthetic tiles are further offset by -1 or +1, depending on whether they should be handled before or after the 'natural' tile they are inserted for. E.g. synthetic tiles that load a value should be processed before the tile that uses the value they load.
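In code, that ordering trick is a one-liner (a sketch; the names are mine, not MoarVM's):

```perl6
# natural tile at index $i compares as 2*$i; a synthetic tile inserted
# before it as 2*$i - 1, and one inserted after it as 2*$i + 1
sub order-key ( Int $index, Int $offset = 0 ) { 2 * $index + $offset }

say order-key(3, -1) < order-key(3, 0) < order-key(3, +1);   # True
```

Doubling leaves exactly one odd slot on each side of every natural tile, which is all the room a load-before or store-after needs.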

Another issue soon appeared, this time having to do with x86 being, altogether, quaint and antiquated and annoying, and specifically with the use of one operand register as source and result value. To put it simply, where you and I and the expression JIT structure might say:

a = b + c

x86 says:

a = a + b

Resolving the difference is tricky, especially for linear scan, since linear scan processes the values in the program rather than the instructions that generate them. It is therefore not suited to deal with instruction-level constraints such as these. If a, b, and c in my example above are not the same (not aliases), then this can be achieved by a copy:

a = b
a = a + c

If a and b are aliases, the first copy isn't necessary. However, if a and c are aliases, then a copy may or may not be necessary, depending on whether the operation (in this case '+') is commutative, i.e. it holds for '+' but not for '-'. Commit 349b360 attempts to fix that for 'direct' binary operations, but a fix for indirect operations is still work in progress. Unfortunately, it meant I had to reserve a register for temporary use to resolve this, meaning there are fewer available for the register allocator to use. Fortunately, that did simplify handling of a few irregular instructions, e.g. signed cast of 32 bit integers to 64 bit integers.

So that brings us to today and my future plans. The next thing to implement will be support for function calls by the register allocator, which involves shuffling values to the right registers and correct positions on the stack, and also in spilling all values that are still required after the function call since the function may overwrite them. This requires a bit of refactoring of the logic that spills variables, since currently it is only used when there are not enough registers available. I also need to change the linear scan main loop, because it processes values in order of first definition, and as such, instructions that don't create any values are skipped, even if they need special handling like function calls. I'm thinking of solving that with a special 'interesting tiles' queue that is processed alongside the main values working queue.

That was it for today. I hope to write soon with more progress.

Strangely Consistent: Deep Git

Published by Carl Mäsak

I am not good at chess.

I mean... "I know how the pieces move". (That's the usual phrase, isn't it?) I've even tried to understand chess better at various points in my youth, trying to improve my swing. I could probably beat some of you other self-identified "I know how the pieces move" folks out there. With a bit of luck. As long as you don't, like, cheat by having a strategy or something.

I guess what I'm getting at here is that I am not, secretly, an international chess master. OK, now that's off my chest. Phew!

Imagining what it's like to be really good at chess is very interesting, though. I can say with some confidence that a chess master never stops and asks herself "wait — how does the knight piece move, again?" Not even I do that! Obviously, the knight piece is the one that moves √5 distances on the board. 哈哈

I can even get a sense of what terms a master-level player uses internally, by reading what master players wrote. They focus on tactics and strategy. Attacks and defenses. Material and piece values. Sacrifices and piece exchange. Space and control. Exploiting weaknesses. Initiative. Openings and endgames.

Such high-level concerns leave the basic mechanics of piece movements far behind. Sure, those movements are in there somewhere. They are not irrelevant, of course. They're just taken for granted and no longer interesting in themselves. Meanwhile, the list of master-player concerns above could almost equally well apply to a professional Go player. (s:g/piece/stone/ for Go.)

Master-level players have stopped looking at individual trees, and are now focusing on the forest.

The company that employs me (Edument) has a new slogan. We've put it on the backs of sweaters which we then wear to events and conferences:

We teach what you can't google.

I really like this new slogan. Particularly, it feels like something we as a teaching company have already trended towards for a while. Some things are easy to find just by googling them, or finding a good cheat sheet. But that's not why you attend a course. We should position ourselves so as to teach answers to the deep, tricky things that only emerge after using something for a while.

You're starting to see how this post comes together now, aren't you? 😄

2017 will be my ninth year with Git. I know it quite well by now, having learned it in depth and breadth along the way. I can safely say that I'm better at Git than I am at chess at this point.

Um. I'm most certainly not an international Git grandmaster — but largely that's because such a title does not exist. (If someone reads this post and goes on to start an international Git tournament, I will be very happy. I might sign up.)

No, my point is that the basic commands have taken on the role for me that I think basic piece movements have taken on for chess grandmasters. They don't really matter much; they're a means to an end, and it's the end that I'm focusing on when I type them.

(Yes, I still type them. There are some pretty decent GUIs out there, but none of them give me the control of the command line. Sorry-not-sorry.)

Under this analogy, what are the things I value with Git, if not the commands? What are the higher-level abstractions that I tend to think in terms of nowadays?

(Yes, these are the ACID guarantees for database transactions, but made to work for Git instead.)

A colleague of mine talks a lot about "definition of done". It seems to be a Scrum thing. It's his favorite term more than mine, but I still like it for its attempt at "mechanizing" quality, which I believe can succeed in a large number of situations.

Another colleague of mine likes the Boy Scout Rule of "Always leave the campground cleaner than you found it". If you think of this in terms of code, it means something like refactoring a code base as you go, cleaning it up bit by bit and asymptotically approaching code perfection. But if you think of it in terms of process, it dovetails quite nicely with the "definition of done" above.

Instead of explaining how in the abstract, let's go through a concrete-enough example:

  1. Some regression is discovered. (Usually by some developer dogfooding the system.)
  2. If it's not immediately clear, we bisect and find the offending commit.
  3. ASAP, we revert that commit.
  4. We analyze the problematic part of the reverted commit until we understand it thoroughly. Typically, the root cause will be something that was not in our definition of done, but should've been.
  5. We write up a new commit/branch with the original (good) functionality restored, but without the discovered problem.
  6. (Possibly much later.) We attempt to add discovery of the problem to our growing set of static checks. The way we remember to do that is through a TODO list in a wiki. This list keeps growing and shrinking in fits and starts.

Note in particular the interplay between process, quality and, yes, Git. Someone could've told me at the end of step 6 that I had totalled 29 or so Git basic commands along the way, and I would've believed them. But that's not what matters to us as a team. If we could do with magic pixie dust what we do with Git — keep historic snapshots of the code while ensuring quality and isolation — we might be satisfied magic pixie dust users instead.

Somewhere along the way, I also got a much more laid-back approach to conflicts. (And I stopped saying "merge conflicts", because there are also conflicts during rebase, revert, cherry-pick, and stash — and they are basically the same deal.) A conflict happens when a patch P needs to be applied in an environment which differs too much from the one in which P was created.

Aside: in response to this post, jast++ wrote this on #perl6: "one minor nitpick: git knows two different meanings for 'merge'. one is commit-level merge, one is file-level three-way merge. the latter is used in rebase, cherry-pick etc., too, so technically those conflicts can still be called merge conflicts. :)" — TIL.

But we actually don't care so much about conflicts. Git cares about conflicts, because it can't just apply the patch automatically. What we care about is that the intent of the patch has survived. No software can check that for us. Since the (conflict ↔ no conflict) axis is independent from the (intent broken ↔ intent preserved) axis, we get four cases in total. Two of those are straightforward, because the (lack of) conflict corresponds to the (lack of) broken intent.

The remaining two cases happen rarely but are still worth thinking about:

If we care about quality, one lesson emerges from mst's example: always run the tests after you merge and after you've resolved conflicts. And another lesson from my example: try to introduce automatic checks for structures and relations in the code base that you care about. In this case, branch A could've put in a test or a linting step that failed as soon as it saw something according to the old naming convention.

A lot of the focus on quality also has to do with doggedly going to the bottom of things. It's in the nature of failures and exceptional circumstances to clump together and happen at the same time. So you need to handle them one at a time, carefully unraveling one effect at a time, slowly disassembling the hex like a child's rod puzzle. Git sure helps with structuring and linearizing the mess that happens in everyday development, exploration, and debugging.

As I write this, I realize even more how even when I try to describe how Git has faded into the background as something important-but-uninteresting for me, I can barely keep the other concepts out of focus. Quality being chief among them. In my opinion, the focus on improving not just the code but the process, of leaving the campground cleaner than we found it, those are the things that make it meaningful for me to work as a developer even decades later. The feeling that code is a kind of poetry that punches you back — but as it does so, we learn something valuable for next time.

I still hear people say "We don't have time to write tests!" Well, in our teams, we don't have time not to write tests! Ditto with code review, linting, and writing descriptive commit messages.

No-one but Piet Hein deserves the last word of this post:

The road to wisdom? — Well, it's plain
and simple to express:

Err
and err
and err again
but less
and less
and less.

Death by Perl6: Hello Web! with Purée Perl 6

Published by Tony O'Dell on 2017-01-09T18:19:56

Let's build a website.

Websites are easy to build. There are dozens of frameworks out there to use; Perl has Mojo and Catalyst as its major frameworks, and other languages also have quite a few decent options. Some of them come with boilerplate templates and you just go from there. Others don't, and you spend your first few hours learning how to actually set up the framework and reading about how to share your DB connection between all of your controllers and blah, blah, blah. Let's look at one of P6's web frameworks.

Enter Hiker

Hiker doesn't introduce a lot of (if any) new ideas. It uses paradigms you're probably used to, and it aims to make the initial creation of your website very straightforward and easy, so you can get straight to work sharing your content with the English.

The Framework

Hiker is intended to make things fast and easy from the development side. Here's how it works. If you're not into the bleep blop and just want to get started, skip to the Boilerplate heading.

Application Initialization

  1. Hiker reads from the subdirectories we'll look at later. The controllers and models are classes.
  2. Hiker looks at all controllers, initializes a new object for each class, and then checks for its .path attribute
    1. If Hiker can't find the path attribute then it doesn't bind anything and produces a warning
  3. After setting up the controller routes, it instantiates a new object for the model as specified by the controller (.model)
    1. If none is given by the controller then nothing is instantiated or bound and nothing happens
    2. If a model is required by the controller but it cannot be found then Hiker refuses to bind
  4. Finally, HTTP::Server::Router is alerted to all of the paths that Hiker was able to find and verify

The Request

  1. If the path is found, then the associated class' .model.bind is called.
    1. The response (second parameter of .model.bind($req, $res)) has a hash to store information: $res.data
  2. The controller's .handler($req, $res) is then executed
    1. The $res.data hash is available in this context
  3. If the handler returns a Promise then Hiker waits for that to be kept (and expects the result to be True or False)
    1. If the response is already rendered and the Promise's status is True then the router is alerted that no more routes should be explored
    2. If the response isn't rendered and the Promise's result is True, then .render is called automagically for you
    3. If the response isn't rendered and the Promise's result is False, then the next matching route is called
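Putting those steps together, a hypothetical asynchronous route might look like this (MyApp::AsyncRoute and lengthy-lookup are made-up names; only the Hiker::Route attributes and the $res.data hash described in this post are assumed):

```perl6
use Hiker::Route;

class MyApp::AsyncRoute does Hiker::Route {
    has $.path     = '/async';
    has $.template = 'Async.mustache';

    method handler($req, $res) {
        start {                                    # returns a Promise
            $res.data<answer> = lengthy-lookup();  # made-up slow operation
            True;   # not rendered yet + True => Hiker calls .render for us
        }
    }
}
```

Hiker waits for the Promise to be kept and, per step 3 above, auto-renders because the result is True and the response has not been rendered yet.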

Boilerplate

Ensure you have Hiker installed:

$ zef install Hiker
$ rakudobrew rehash #this may be necessary to get the bin to work

Create a new directory where you'd like to create your project's boilerplate and cd. From here we'll initialize some boilerplate and look at the content of the files.

somedir$ hiker init  
==> Creating directory controllers
==> Creating directory models
==> Creating directory templates
==> Creating route MyApp::Route1: controllers/Route1.pm6
==> Creating route MyApp::Model1: models/Model1.pm6
==> Creating template templates/Route1.mustache
==> Creating app.pl6

Neato burrito. From the output you can see that Hiker created some directories - controllers, models, templates - for us so we can start out organized. In those directories you will find a few files, let's take a look.

The Model

use Hiker::Model; 

class MyApp::Model1 does Hiker::Model {  
  method bind($req, $res) {
    $res.data<who> = 'web!';
  }
}  

Pretty straightforward. MyApp::Model1 is instantiated during Hiker initialization and .bind is called whenever the controller's corresponding path is requested. As you can see here, this Model just adds to the $res.data hash the key/value pair of who => 'web!'. This data will be available in the Controller as well as in the template files (if the controller decides to use that).

The Controller

use Hiker::Route; 

class MyApp::Route1 does Hiker::Route {  
  has $.path     = '/';
  has $.template = 'Route1.mustache';
  has $.model    = 'MyApp::Model1';

  method handler($req, $res) {
    $res.headers<Content-Type> = 'text/plain';
  }
}  

As you can see above, the Hiker::Route packs a lot of information into a small space, and it's a class that does the Hiker::Route role. This lets our framework know that it should inspect that class for the path, template, and model so it can handle those operations for us - path and template are the only required attributes.

As discussed above, our Route can return a Promise if there is some asynchronous operation to be performed. In this case all we're going to do is set the headers to indicate the Content-Type and then, automagically, render the template file. Note: if you return a falsy value from the handler method, the router will not auto-render and will attempt to find the next route. This is so that you can cascade paths in the event that you want to chain them together, do some real-time decision making to determine whether this is the right class for the request, or perform some other unsaid dark magic. In the controller above we return a truthy value and it auto-renders.
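A hypothetical sketch of that cascading behavior (MyApp::MaybeRoute is a made-up name; only the attributes and the $res.data hash documented above are assumed): a route can decline a request so the router tries the next match.

```perl6
use Hiker::Route;

class MyApp::MaybeRoute does Hiker::Route {
    has $.path     = '/';
    has $.template = 'Maybe.mustache';

    method handler($req, $res) {
        # fall through to the next matching route unless the model
        # populated the data this template needs
        return False unless $res.data<who>:exists;
        True;   # truthy: auto-render as usual
    }
}
```

Returning False here hands the request to whichever route matches '/' next, which is what makes chained, decision-making routes possible.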

By specifying the Model in the Route, you're able to re-use the same Model class across multiple routes.

The Path

Quick notes about .path. You can pass a static path ('/staticpath'), a path with a placeholder ('/api/:placeholder'), or, if your path is a little more complicated, a regex (/ .* /). Check out the documentation for HTTP::Server::Router (repo).

The Template

The template is specified by the controller's .template attribute and Hiker checks for that file in the ./templates folder. The default template engine is Template::Mustache (repo). See that module's documentation for more info.

Running the App

Really pretty straightforward from the boilerplate:

somedir$ perl6 app.pl6  

Now you can visit http://127.0.0.1:8080/ in your favorite Internet Explorer and find a nice 'Hello web!' message waiting to greet you. If you visit any other URI you'll receive the default 'no route found' message from HTTP::Server::Router.

The Rest

The module is relatively young. With feedback from the community, practical applications, and some extra feature expansion, Hiker could be pretty great and it's a good start to taking the tediousness out of building a website in P6. I'm open to feedback and I'd love to hear/see where you think Hiker can be improved, what it's missing to be productive, and possibly anything else [constructive or otherwise] you'd like to see in a practical, rapid development P6 web server.

Steve Mynott: Rakudo Star: Past Present and Future

Published by Steve Mynott on 2017-01-02T14:07:31

At YAPC::EU 2010 in Pisa I received a business card with "Rakudo Star" and the
date July 29, 2010 which was the date of the first release -- a week earlier
with a countdown to 1200 UTC. I still have mine, although it has a tea stain
on it and I refreshed my memory over the holidays by listening again to Patrick
Michaud speaking about the launch of Rakudo Star (R*):

https://www.youtube.com/watch?v=MVb6m345J-Q

R* was originally intended as the first of a number of distribution releases (as
opposed to compiler releases) -- useable for early adopters but not initially production
quality. Other names were considered at the time, like Rakudo Beta (rejected as
sounding like "don't use this"!) and, amusingly, Rakudo Adventure Edition.
Finally it became Rakudo Whatever and Rakudo Star (since * means "whatever"!).

Well over 6 years later we never did come up with a better name, although there
was at least one IRC conversation about it, and perhaps "Rakudo Star" is too
well established as a brand at this point anyway. R* consists of the Rakudo compiler, the main docs, a module installer, some modules and some further docs.

However, one radical change is happening soon and that is a move from panda to
zef as the module installer. Panda has served us well for many years but zef is
both more featureful and more actively maintained. Zef can also install Perl
6 modules off CPAN although the CPAN-side support is in its early days. There
is a zef branch (pull requests welcome!) and a tarball at:

http://pl6anet.org/drop/rakudo-star-2016.12.zef-beta2.tar.gz

Panda has been patched to warn that it will be removed and to advise the use of
zef. Of course anyone who really wants to use panda can reinstall it using zef
anyway.

The modules inside R* haven't changed much in a while. I am considering adding
DateTime::Format (shown by ecosystem stats to be widely used) and
HTTP::UserAgent (probably the best pure perl6 web client library right now).
Maybe some modules should also be removed (although this tends to be more
controversial!). I am also wondering about OpenSSL support (if the library is
available).

p6doc needs some more love as a command line utility since most of the focus
has been on the website docs and in fact some of these changes have impacted
adversely on command line use, e.g. under Windows cmd.exe "perl 6" is no longer
correctly displayed by p6doc. I wonder if the website generation code should be
decoupled from the pure docs and p6doc command line (since R* has to ship any
new modules used by the website). p6doc also needs a better and faster search
(using sqlite?). R* also ships some tutorial docs including a PDF generated from perl6intro.com.
We only ship the English one and localisation to other languages could be
useful.

Currently R* is released roughly every three months (unless significant
breakage leads to a bug fix release). Problems tend to happen with the
less widely used systems (Windows and the various BSDs) and also with the
module installers and some modules. R* is useful in spotting these issues
missed by roast. Rakudo itself is still in rapid development. At some point a less frequently
updated distribution (Star LTS or MTS?) will be needed for Linux distribution
packagers and those using R* in production. There are also some question
marks over support for different language versions (6.c and 6.d).

Above all what R* (and Rakudo Perl 6 in general) needs is more people spending
more time working on it! JDFI! Hopefully this blog post might
encourage more people to get involved with github pull requests.

https://github.com/rakudo/star

Feedback, too, in the comments below is actively encouraged.


Perl 6 Advent Calendar: Day 24 – Make It Snow

Published by ab5tract on 2016-12-24T13:14:02

Hello again, fellow sixers! Today I’d like to take the opportunity to highlight a little module of mine that has grown up in some significant ways this year. It’s called Terminal::Print and I’m suspecting you might already have a hint of what it can do just from the name. I’ve learned a lot from writing this module and I hope to share a few of my takeaways.

Concurrency is hard

Earlier in the year I decided to finally try to tackle multi-threading in Terminal::Print and… succeeded more or less, but rather miserably. I wrapped the access to the underlying grid (a two-dimensional array of Cell objects) in a react block and had change-cell and print-cell emit their respective actions on a Supply. The react block then handled these actions. Rather slowly, unfortunately.

Yet, there was hope. After jnthn++ fixed a constructor bug in OO::Monitors I was able to remove all the crufty hand-crafted handling code and instead ensure that method calls to the Terminal::Print::Grid object would only run in a single thread at any given time. (This is the class which holds the two-dimensional array mentioned before and was likewise the site of my react block experiment).

Here below are the necessary changes:

- unit class Terminal::Print::Grid;
+ use OO::Monitors;
+ unit monitor Terminal::Print::Grid;

This shift not only improved the readability and robustness of the code, it was significantly faster! Win! To me this is really an amazing dynamic of Perl 6. jnthn’s brilliant, twisted mind can write a meta-programming module that makes it dead simple for me to add concurrency guarantees at a specific layer of my library. My library in turn makes it dead simple to print from multiple threads at once on the screen! It’s whipuptitude enhancers all the way down!
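For readers who haven't met OO::Monitors: a monitor is declared like a class, but method calls on one instance are serialized so that only a single thread runs inside it at a time. A minimal sketch (my own example, not code from Terminal::Print):

```perl6
use OO::Monitors;

monitor Counter {
    has Int $.count = 0;
    # mutual exclusion is provided by the monitor itself,
    # so this read-modify-write needs no explicit lock
    method bump() { $!count++ }
}

my $c = Counter.new;
await (^10).map: { start { $c.bump for ^100 } };
say $c.count;  # 1000 -- no lost updates
```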

That said, our example today will not be writing from multiple threads. For some example code that utilizes async, I point you to examples/async.p6 and examples/matrix-ish.p6.

Widget Hero

Terminal::Print is really my first open source library in the sense that it is the first time that I have started my own module from scratch with the specific goal of filling a gap in a given language’s ecosystem. It is also the first time that I am not the sole contributor! I would be remiss not to mention japhb++ in this post, who has contributed a great deal in a relatively short amount of time.

In addition to all the performance related work and the introduction of a span-oriented printing mechanism, japhb’s work on widgets especially deserves its own post! For now let’s just say that it has been a real pleasure to see the codebase grow and extend even as I have been too distracted to do much good. My takeaway here is a new personal milestone in my participation in libre/open software (my first core contributor!) that reinforces all the positive dynamics it can have on a code base.

Oh, and I’ll just leave this here as a teaser of what the widget work has in store for us:

rpg-ui-p6

You can check it out in real-time and read the code at examples/rpg-ui.p6.

Snow on the Golf Course

Now you are probably wondering: where is the darn snow?! Well, here we go! The full code with syntax highlighting is available in examples/snowfall.p6. I will be going through it step by step below.

use Terminal::Print;

class Coord {
    has Int $.x is rw where * <= T.columns = 0;
    has Int $.y is rw where * <= T.rows = 0;
}

Here we import Terminal::Print. The library takes the position that when you import it somewhere, you are planning to print to the screen. To this end we export an instantiated Terminal::Print object into the importer’s lexical scope as T. This allows me to immediately start clarifying the x and y boundaries of our coordinate system based on run-time values derived from the current terminal window.

class Snowflake {
    has $.flake = ('❆','❅','❄').roll;
    has $.pos = Coord.new;
}

sub create-flake {
    state @cols = ^T.columns .pick(*); # shuffled
    if +@cols > 0 {
        my $rand-x = @cols.pop;
        my $start-pos = Coord.new: x => $rand-x;
        return Snowflake.new: pos => $start-pos;
    } else {
        @cols = ^T.columns .pick(*);
        return create-flake;
    }
}

Here we create an extremely simple Snowflake class. What is nice here is that we can leverage the default value of the $.flake attribute to always be random at construction time.

Then in create-flake we are composing a way to make sure we have hit every x coordinate as a starting point for the snowfall. Whenever create-flake gets called, we pop a random x coordinate out of the @cols state variable. The state variable enables this cool approach because we can manually fill @cols with a new randomized set of our available x coordinates once it is depleted.

draw( -> $promise {

start {
    my @flakes = create-flake() xx T.columns;
    my @falling;
    
    Promise.at(now + 33).then: { $promise.keep };
    loop {
        # how fast is the snowfall?
        sleep 0.1; 
    
        if (+@flakes) {
            # how heavy is the snowfall?
            my $limit = @flakes > 2 ?? 2            
                                    !! +@flakes;
            # can include 0, but then *cannot* exclude $limit!
            @falling.push: |(@flakes.pop xx (0..$limit).roll);  
        } else {
            @flakes = create-flake() xx T.columns;
        }
    
        for @falling.kv -> $idx, $flake {
            with $flake.pos.y -> $y {
                if $y > 0 {
                    T.print-cell: $flake.pos.x, ($flake.pos.y - 1), ' ';
                }

                if $y < T.rows {
                    T.print-cell: $flake.pos.x, $flake.pos.y, $flake.flake;            
                }

                try {
                    $flake.pos.y += 1;
                    CATCH {
                        # the flake has fallen all the way
                        # remove it and carry on!
                        @falling.splice($idx,1,());
                        .resume;
                    }
                }
            }
        }
    }
}

});

Let’s unpack that a bit, shall we?

So the first thing to explain is draw. This is a handy helper routine that is also imported into the current lexical scope. It takes as its single argument a block which accepts a Promise. The block should include a start block so that keeping the argument promise works as expected. The implementation of draw is shockingly simple.

So draw is really just short-hand for making sure the screen is set up and torn down properly. It leverages promises as (I’m told) a “cond-var”, which according to the Promises spec might be an abuse of promises. I’m not very fussed about it, to be honest, since it suits my needs quite well.
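To give an idea of the shape (a hedged sketch, not Terminal::Print's actual source; the setup/teardown method names here are assumptions):

```perl6
# illustrative sketch; T stands for the Terminal::Print instance exported on import
sub draw(&block) {
    my $end = Promise.new;
    T.initialize-screen;   # assumed setup method
    block($end);           # block runs a start { ... } and keeps $end when finished
    await $end;
    T.shutdown-screen;     # assumed teardown method
}
```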

This approach also makes it quite easy to create a “time limit” for the snowfall by scheduling a promise to be kept at now + 33, that is, thirty-three seconds from when the loop starts. Then we keep the promise and draw shuts down the screen for us. This makes “escape” logic for your screensavers quite easy to implement (note that SIGINT also restores your screen properly; the most basic exit strategy works as expected, too :) ).

The rest is pretty straightforward, though I’d point to the try block as a slightly clever (but not dangerously so) combination of where constraints on Coord‘s attributes and Perl 6’s resumable exceptions.

Make it snow!

And so, sixers, I bid you farewell for today with a little unconditional love from ab5tract’s curious little corner of the universe. Cheers!

snowfall-p6


Perl 6 Advent Calendar: Day 24 – One Year On

Published by liztormato on 2016-12-24T10:51:57

This time of year invites one to look back on things that have been, things that are and things that will be.

Have Been

I was reminded of things that have been when I got my new notebook a few weeks ago. Looking for a good first sticker to put on it, I came across an old ActiveState sticker:

If you don’t know Perl
you don’t know Dick

A sticker from 2000! It struck me that that sticker was as old as Perl 6. Only very few people now remember that a guy called Dick Hardt was actually the CEO of ActiveState at the time. So even though the pun may be lost on most due to the mists of time, the premise still rings true to me: that Perl is more about a state of mind than about versions. There will always be another version of Perl. Those who don’t know Perl are doomed to re-implement it, poorly. Which, to me, is why so many ideas were borrowed from Perl. And still are!

Are

Where are we now? Is it the moment we know, We know, we know? I don’t think we are at twenty thousand people using Perl 6 just yet. But we’re keeping our fingers crossed. Just in case.

We are now 12 compiler releases after the initial Christmas release of Perl 6. In this year, many, many areas of Rakudo Perl 6 and MoarVM have dramatically improved in speed and stability. Our canary-in-the-coalmine test has dropped from around 14 seconds a year ago to around 5.5 seconds today. A complete spectest run is now about 3 minutes, whereas it was about 4.5 minutes at the beginning of the year, while about 4000 tests were added (from about 50K to 54K). And we now have 757 modules in the Perl 6 ecosystem (aka temporary CPAN for Perl 6 modules), with a few more added every week.

The #perl6 IRC channel has been too busy for me to follow consistently. But if you have a question related to Perl 6 and you want a quick answer, the #perl6 channel is the place to be. You don’t even have to install an IRC client: you can also use a browser to chat, or just follow “live” what is being said.

There are also quite a few useful bots on that channel: they e.g. take care of running a piece of Perl 6 code for you. Or find out at which commit the behaviour of a specific piece of code changed. These are very helpful for the developers of Perl 6, who usually also hang out on the #perl6-dev IRC channel. Which could be you! The past year, at least one contributor was added to the CREDITS every month!

Will Be

The coming year will see at least three Perl 6 books being published. First one will be Think Perl 6 – How To Think Like A Computer Scientist by Laurent Rosenfeld. It is an introduction to programming using Perl 6. But even for those of you with programming experience, it will be a good book to start learning Perl 6. And I can know. Because I’ve already read it :-)

Second one will be Learning Perl 6 by veteran Perl developer and writer brian d foy. It will have the advantage of being written by a seasoned writer going through the newbie experience that most people will have when coming from Perl 5.

The third one will be Perl 6 By Example by Moritz Lenz, which will, as the title already gives away, introduce Perl 6 topics by example.

There’ll be at least two (larger) Perl Conferences apart from many smaller Perl workshops: the The Perl Conference NA on 18-23 June, and the The Perl Conference in Amsterdam on 9-11 August. Where you will meet all sorts of nice people!

And for the rest? Expect a faster, leaner, Perl 6 and MoarVM compiler release on the 3rd Saturday every month. And an update of weekly events in the Perl 6 Weekly on every Monday evening/Tuesday morning (depending on where you live).


Perl 6 Advent Calendar: Day 23 – Everything is either wrong or less than awesome

Published by AlexDaniel on 2016-12-23T00:07:12

Have you ever spent your precious time on submitting a bug report for some project, only to get a response that you’re an idiot and you should f⊄∞÷ off?

Right! Well, perhaps consider spending your time on Perl 6 to see that not every free/open-source project is like this.

In the Perl 6 community, there is a very interesting attitude towards bug reports. Is it something that was defined explicitly early on? Or did it just grow organically? That remains a Christmas mystery. But the thing is, if it wasn’t for that, I wouldn’t have been willing to submit all the bugs that I submitted over the last year (more than 100). You made me like this.

Every time someone submits a bug report, Perl 6 hackers always try to see if there is something that can be done better. Yes, sometimes the bug report is invalid. But even if it is, is there any way to improve the situation? Perhaps a warning could be thrown? Well, if so, then we treat the behavior as LTA (Less Than Awesome), and therefore the bug report is actually valid! We just have to tweak it a little bit, meaning that the ticket will now be repurposed to improve or add the error message, not change the behavior of Perl 6.

The concept of LTA behavior is probably one of the key things that keeps us from rejecting features that may seem to do too little good for the amount of effort required to implement them, but in the end become game changers. Another closely related concept that comes to mind is “Torment the implementors on behalf of the users”.

OK, but what if this behavior is well-defined and is actually valid? In this case, it is still probably our fault. Why did the user get into this situation? Maybe the documentation is not good enough? Very often that is the issue, and we acknowledge that. So in a case of a problem with the documentation, we will usually ask you to submit a bug report for the documentation, but very often we will do it ourselves.

Alright, but what if the documentation for this particular case is in place? Well, maybe the thing is not easily searchable? That could be the reason why the user didn’t find it in the first place. Or maybe we lack some links? Maybe the places that should link to this important bit of information are not doing so? In other words, perhaps there are still ways to improve the docs!

But if not, then yes, we will have to write some tests for this particular case (if there are no tests yet) and reject the ticket. This happens sometimes.

The last bit, even if obvious to some, is still worth mentioning. We do not mark tickets resolved without tests. One reason is that we want roast (which is a Perl 6 spec) to be as full as possible. The other reason is that we don’t want regressions to happen (thanks captain obvious!). As the first version of Perl 6 was released one year ago, we are no longer making any changes that would affect the behavior of your code. However, occasional regressions do happen, but we have found an easy way to deal with those!

If you are not on #perl6 channel very often, you might not know that we have a couple of interesting bots. One of them is bisectable. In short, Bisectable performs a more user-friendly version of git bisect, but instead of building Rakudo on each commit, it has done it before you even asked it to! That is, it has over 5500 rakudo builds, one for every commit done in the last year and a half. This turns the time to run git bisect from minutes to about 10 seconds (Yes, 10 seconds is less than awesome! We are working on speeding it up!). And there are other bots that help us inspect the progress. The most recent one is Statisfiable, here is one of the graphs it can produce.

So if you pop up on #perl6 with a problem that seems to be a regression, we will be able to find the cause in seconds. Fixing the issue will usually take a bit more than that though, but when the problem is significant, it will usually happen in a day or two. Sorry for breaking your code in attempts to make it faster, we will do better next time!

But as you are reading this, perhaps you may be interested in seeing some bug reports? I thought that I’d go through the list of bugs of the last year to show how horribly broken things were, just to motivate the reader to go hunting for bugs. The bad news (oops, good news I mean) is that the number of “horrible” bugs is decreasing a bit too fast. Thanks to many Rakudo hackers, things are getting more stable at a very rapid pace.

Anyway, there are still some interesting things I was able to dig up:

That being said, my favorite bug of all times is RT #127473. Three symbols in the source code causing it to go into an infinite loop printing stuff about QAST nodes. That’s a rather unique issue, don’t you think?

I hope this post gave you a little insight on how we approach bugs, especially if you are not hanging around on #perl6 very often. Is our approach less than awesome? Do you have some ideas for other bots that could help us work with bugs? Leave it in the comments, we would like to know!


Perl 6 Advent Calendar: Day 22 – Generative Testing

Published by SmokeMachine on 2016-12-22T00:00:47

OK! So say you finished writing your code and it’s looking good. Let’s say it’s this incredible sum function:

module Sum {
   sub sum($a, $b) is export {
      $a + $b
   }
}

Great, isn’t it?! Let’s use it:

use Sum;
say sum 2, 3; # 5

That worked! We summed the number 2 with the number 3 as you saw. If you read the function carefully you’ll see the variables $a and $b don’t have a type set. If you don’t type a variable it’s, by default, of type Any. 2 and 3 are Ints… Ints are Any. So that’s OK! But do you know what else is Any? Str (just an example)!
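You can verify that subtyping relationship directly with smartmatch:

```perl6
say 2     ~~ Any;  # True: Int is a subtype of Any
say "bla" ~~ Any;  # True: Str is too
say "bla" ~~ Int;  # False
```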

Let’s try using strings?

use Sum;
say sum "bla", "ble";

We got a big error:

Cannot convert string to number: base-10 number must begin with valid digits or '.' in 'bla' (indicated by ⏏)
  in sub sum at sum.p6 line 1
  in block  at sum.p6 line 7

Actually thrown at:
  in sub sum at sum.p6 line 1
  in block  at sum.p6 line 7

Looks like it does not accept Strs… It seems like Any may not be the best type to use in this case.

Worrying about every possible input type for all our functions can demand way too much work, and is still prone to human error. Thankfully there’s a module to help us with that! Test::Fuzz is a perl6 module that implements the “technique” of generative testing/fuzz testing.

Generative testing or Fuzz Testing is a technique of generating random/extreme data and using this data to call the function being tested.
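The idea can be hand-rolled in a few lines, just to illustrate the concept (Test::Fuzz automates this by reading the signature for you):

```perl6
sub sum($a, $b) { $a + $b }

# throw a grab-bag of values at the function and report what survives
for 42, -1, 3.14, "bla", Str -> $x {
    my $r = try sum($x, 1);
    say $r.defined
        ?? "sum({$x.gist}, 1) = $r"
        !! "sum({$x.gist}, 1) died: {$!.^name}";
}
```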

Test::Fuzz gets the signature of your functions and decides what generators it should use to test it. After that it runs your functions giving it (100, by default) different arguments and testing if it will break.

To test our function, all that’s required is:

module Sum {
   use Test::Fuzz;
   sub sum($a, $b) is export is fuzzed {
      $a + $b
   }
}
multi MAIN(:$fuzz!) {
   run-tests
}

And run:

perl6 Sum.pm6 --fuzz

This case will still show a lot of errors:

Use of uninitialized value of type Thread in numeric context
  in sub sum at Sum.pm6 line 4
Use of uninitialized value of type int in numeric context
  in sub sum at Sum.pm6 line 4
    ok 1 - sum(Thread, int)
Use of uninitialized value of type X::IO::Symlink in numeric context
  in sub sum at Sum.pm6 line 4
    ok 2 - sum(X::IO::Symlink, -3222031972)
Use of uninitialized value of type X::Attribute::Package in numeric context
  in sub sum at Sum.pm6 line 4
    ok 3 - sum(X::Attribute::Package, -9999999999)
Use of uninitialized value of type Routine in numeric context
  in sub sum at Sum.pm6 line 4
    not ok 4 - sum(áéíóú, (Routine))
...

What does that mean?

That means we should use one of the big features of perl6: Gradual typing. $a and $b should have types.

So, let’s modify the function and test again:

module Sum {
   use Test::Fuzz;
   sub sum(Int $a, Int $b) is export is fuzzed {
      $a + $b
   }
}
multi MAIN(:$fuzz!) {
   run-tests
}

And run perl6 Sum.pm6 --fuzz again:

    ok 1 - sum(-2991774675, 0)
    ok 2 - sum(5471569889, 7905158424)
    ok 3 - sum(8930867907, 5132583935)
    ok 4 - sum(-6390728076, -1)
    ok 5 - sum(-3558165707, 4067089440)
    ok 6 - sum(-8930867907, -5471569889)
    ok 7 - sum(3090653502, -2099633631)
    ok 8 - sum(-2255887318, 1517560219)
    ok 9 - sum(-6085119010, -3942121686)
    ok 10 - sum(-7059342689, 8930867907)
    ok 11 - sum(-2244597851, -6390728076)
    ok 12 - sum(-5948408450, 2244597851)
    ok 13 - sum(0, -5062049498)
    ok 14 - sum(-7229942697, 3090653502)
    not ok 15 - sum((Int), 1)

    # Failed test 'sum((Int), 1)'
    # at site#sources/FB587F3186E6B6BDDB9F5C5F8E73C55195B73C86 (Test::Fuzz) line 62
    # Invocant requires an instance of type Int, but a type object was passed.  Did you forget a .new?
...

A lot of OKs!  \o/

But there’re still some errors… We can’t sum undefined values…

We didn’t say the parameters should be defined (with :D). So Test::Fuzz generated every undefined sub-type of Int that it could find. It uses every generator of a sub-type of Int to generate values. It also works if you use a subset or even a where clause in your signature: it’ll use a super-type generator and grep for the valid values.
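As a standalone illustration of what the :D smiley does at the signature level:

```perl6
sub halve(Int:D $n) { $n div 2 }

say halve(42);                        # 21
say (try halve(Int)) // $!.message;   # the bare type object Int is rejected
```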

So, let’s change it again!

module Sum {
   use Test::Fuzz;
   sub sum(Int:D $a, Int:D $b) is export is fuzzed {
      $a + $b
   }
}
multi MAIN(:$fuzz!) {
   run-tests
}

And run once more:

    ok 1 - sum(6023702597, -8270141809)
    ok 2 - sum(-8270141809, -3762529280)
    ok 3 - sum(242796759, -7408209799)
    ok 4 - sum(-5813412117, -5280261945)
    ok 5 - sum(2623325683, 2015644992)
    ok 6 - sum(-696696815, -7039670011)
    ok 7 - sum(1, -4327620877)
    ok 8 - sum(-7712774875, 349132637)
    ok 9 - sum(3956553645, -7039670011)
    ok 10 - sum(-8554836757, 7039670011)
    ok 11 - sum(1170220615, -3)
    ok 12 - sum(-242796759, 2015644992)
    ok 13 - sum(-9558159978, -8442233570)
    ok 14 - sum(-3937367230, 349132637)
    ok 15 - sum(5813412117, 1170220615)
    ok 16 - sum(-7408209799, 6565554452)
    ok 17 - sum(2474679799, -3099404826)
    ok 18 - sum(-5813412117, 9524548586)
    ok 19 - sum(-6770230387, -7408209799)
    ok 20 - sum(-7712774875, -2015644992)
    ok 21 - sum(8442233570, -1)
    ok 22 - sum(-6565554452, 9999999999)
    ok 23 - sum(242796759, 5719635608)
    ok 24 - sum(-7712774875, 7039670011)
    ok 25 - sum(7408209799, -8235752818)
    ok 26 - sum(5719635608, -8518891049)
    ok 27 - sum(8518891049, -242796759)
    ok 28 - sum(-2474679799, 2299757592)
    ok 29 - sum(5356064609, 349132637)
    ok 30 - sum(-3491438968, 3438417115)
    ok 31 - sum(-2299757592, 7580671928)
    ok 32 - sum(-8104597621, -8158438801)
    ok 33 - sum(-2015644992, -3)
    ok 34 - sum(-6023702597, 8104597621)
    ok 35 - sum(2474679799, -2623325683)
    ok 36 - sum(8270141809, 7039670011)
    ok 37 - sum(-1534092807, -8518891049)
    ok 38 - sum(3551099668, 0)
    ok 39 - sum(7039670011, 4327620877)
    ok 40 - sum(9524548586, -8235752818)
    ok 41 - sum(6151880628, 3762529280)
    ok 42 - sum(-8518891049, 349132637)
    ok 43 - sum(7580671928, 9999999999)
    ok 44 - sum(-8235752818, -7645883481)
    ok 45 - sum(6460424228, 9999999999)
    ok 46 - sum(7039670011, -7788162753)
    ok 47 - sum(-9999999999, 5356064609)
    ok 48 - sum(8510706378, -2474679799)
    ok 49 - sum(242796759, -5813412117)
    ok 50 - sum(-3438417115, 9558159978)
    ok 51 - sum(8554836757, -7788162753)
    ok 52 - sum(-9999999999, 3956553645)
    ok 53 - sum(-6460424228, -8442233570)
    ok 54 - sum(7039670011, -7712774875)
    ok 55 - sum(-3956553645, 1577669672)
    ok 56 - sum(0, 9524548586)
    ok 57 - sum(242796759, -6151880628)
    ok 58 - sum(7580671928, 3937367230)
    ok 59 - sum(-8554836757, 7712774875)
    ok 60 - sum(9524548586, 2474679799)
    ok 61 - sum(-7712774875, 2450227203)
    ok 62 - sum(3, 1257247905)
    ok 63 - sum(8270141809, -2015644992)
    ok 64 - sum(242796759, -3937367230)
    ok 65 - sum(6770230387, -6023702597)
    ok 66 - sum(2623325683, -3937367230)
    ok 67 - sum(-5719635608, -7645883481)
    ok 68 - sum(1, 6770230387)
    ok 69 - sum(3937367230, 7712774875)
    ok 70 - sum(6565554452, -5813412117)
    ok 71 - sum(7039670011, -8104597621)
    ok 72 - sum(7645883481, 9558159978)
    ok 73 - sum(-6023702597, 6770230387)
    ok 74 - sum(-3956553645, -7788162753)
    ok 75 - sum(-7712774875, 8518891049)
    ok 76 - sum(-6770230387, 6565554452)
    ok 77 - sum(-8554836757, 5356064609)
    ok 78 - sum(6460424228, 8518891049)
    ok 79 - sum(-3438417115, -9999999999)
    ok 80 - sum(-1577669672, -1257247905)
    ok 81 - sum(-5813412117, -3099404826)
    ok 82 - sum(8158438801, -3551099668)
    ok 83 - sum(-8554836757, 1534092807)
    ok 84 - sum(6565554452, -5719635608)
    ok 85 - sum(-5813412117, -2623325683)
    ok 86 - sum(-8158438801, -3937367230)
    ok 87 - sum(5813412117, -46698532)
    ok 88 - sum(9524548586, -2474679799)
    ok 89 - sum(3762529280, -2474679799)
    ok 90 - sum(7788162753, 9558159978)
    ok 91 - sum(6770230387, -46698532)
    ok 92 - sum(1577669672, 6460424228)
    ok 93 - sum(4327620877, 3762529280)
    ok 94 - sum(-6023702597, -2299757592)
    ok 95 - sum(1257247905, -8518891049)
    ok 96 - sum(-8235752818, -6151880628)
    ok 97 - sum(1577669672, 7408209799)
    ok 98 - sum(349132637, 6770230387)
    ok 99 - sum(-7788162753, 46698532)
    ok 100 - sum(-7408209799, 0)
    1..100
ok 1 - sum

No errors!!!

Currently Test::Fuzz only implements generators for Int and Str but, as I said, they will be used for their super and sub classes. If you want to have generators for your custom class, you just need to implement a “static” method called generate-samples that returns sample instances of your class, an infinite number of instances if possible.
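A hedged sketch of such a generator (the generate-samples hook name comes from the paragraph above; the class itself is made up for illustration):

```perl6
class Temperature {
    has Numeric $.kelvin;

    # class-level ("static") method Test::Fuzz would call;
    # lazily yields an endless stream of sample instances
    method generate-samples {
        lazy gather {
            loop { take Temperature.new(kelvin => 1000.rand) }
        }
    }
}

say Temperature.generate-samples[^3].map(*.kelvin);
```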

Test::Fuzz is under development and isn’t in the perl6 ecosystem yet. And we need some help!

EDITED: Now you only need to call run-tests()


Death by Perl6: Adding on to Channels and Supplies in Perl6

Published by Tony O'Dell on 2016-12-21T16:11:13

Channels and supplies are perl6's way of implementing the Observer pattern. There are some significant differences behind the scenes between the two, but both can be used to implement a jQuery.on("event")-like experience for the user. Not a jQuery fan? Don't you worry your pretty little head, because this is perl6 and it's much more fun than whatever you thought.

Why?

Uhh, why do we want this?

This adds some sugar to the basic reactive constructs and it makes the passing of messages a lot more friendly, readable, and manageable.

What in Heck Does that Look Like?

Let's have an example and then we'll dissect it.

A Basic Example

use Event::Emitter;  
my Event::Emitter $e .= new;

$e.on(/^^ .+ $$/, -> $data {
  # you can operate on $data here
  '  regex matches'.say;
});

$e.on({ True; }, -> $data {
  '  block matches'.say;
});

$e.on('event', -> $data {
  '  string matches'.say;
});

'event'.say;  
$e.emit("event", { });

'empty event'.say;  
$e.emit("", { });

'abc'.say;  
$e.emit("abc", { });

Output (this is the output for an emitter using Supply; more on this later)

event  
  regex matches
  block matches
  string matches
empty event  
  block matches
abc  
  regex matches
  block matches

Okay, that looks like a lot. It is, and it's much nicer to use than a large given/when combination. It also reduces indenting, so you have that going for you, which is nice.

Let's start with the simple .on blocks we have.

  $e.on(/^^ .+ $$/, -> $data { ...

This is telling the emitter handler that whenever an event is received, run that regular expression against it and if it matches, execute the block (passed in as the second argument). As a note, and illustrated in the example above, the handler can match against a Callable, Str, or Regex. The Callable must return True or False to let the handler know whether or not to execute the block.

If that seems pretty basic, it is. But little things like this add up over time and help keep things manageable. Prepare yourself for more convenience.

The Sugar

Do you want ants? This is how you get ants.

So, now we're looking for more value in something like this. Here it is: you can inherit from the Event::Emitter::Role::Template (or roll your own) and then your classes will automatically inherit these on events.

Example
use Event::Emitter::Role::Template;

class ZefClass does Event::Emitter::Role::Template {  
  submethod TWEAK {
    $!event-emitter.on("fresh", -> $data {
      'Aint that the freshness'.say;
    });
  }
}

Then, further along in your application, whenever an object wants ZefClass to react to the 'fresh' event, all it needs to do is:

$zef-class-instance.emit('fresh');

Pretty damn cool.

Development time is reduced significantly for a few reasons right off the bat:

  1. Implementing Supplier (or Channel) methods, setup, and event handling becomes unnecessary
  2. Event naming or matching is handled so it's easy to debug
  3. Handling or adding new event handling functions during runtime (imagine a plugin that may want to add more events to handle - like an IRC client that implements a handler for channel parting messages)
  4. Messages can be multiplexed through one Channel or Supply rather easily
  5. Creates more readable code

That last reason is a big one. Imagine going back into one of your modules 2 years from now and debugging an issue where a Supplier is given an event and some data, and having to dig through 600 lines of given/when.

Worse, imagine debugging someone else's.

A Quick Note on Channel vs Supply

The Channel and Supply thing can take some getting used to for newcomers. The quick and dirty version is that a Channel will distribute each event to only one listener (chosen by the scheduler) and order isn't guaranteed, while a Supply will distribute to all listeners, with messages delivered in the order received. Because the Channel based Event::Emitter handler executes the methods registered with it directly, when it receives a message all of your methods are called with the data.
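The core difference is easy to see without Event::Emitter at all:

```perl6
# Supply: every tap sees every value, in the order emitted
my $supplier = Supplier.new;
my $supply   = $supplier.Supply;
$supply.tap: { say "tap A got $_" };
$supply.tap: { say "tap B got $_" };
$supplier.emit($_) for 1, 2;   # both taps fire for both values

# Channel: each value is consumed by exactly one receiver
my $channel = Channel.new;
$channel.send($_) for 1, 2;
say $channel.receive;   # 1
say $channel.receive;   # 2
```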

So, you've seen the example above as a Supply-based event handler; here it is as a Channel-based one. Note the difference in the .say output and in the instantiation of the event handler.

use Event::Emitter;  
my Event::Emitter $e .= new(:threaded); # !important - this signifies a Channel based E:E

$e.on(/^^ .+ $$/, -> $data {
  # you can operate on $data here
  "  regex matches: $data".say;
});

$e.on({ True; }, -> $data {
  "  block matches: $data".say;
});

$e.on('event', -> $data {
  "  string matches: $data".say;
});

'event'.say;  
$e.emit("event", "event");

'empty event'.say;  
$e.emit("", "empty event");

'abc'.say;  
$e.emit("abc", "abc");

Output

event  
empty event  
abc  
  regex matches: event
  block matches: event
  string matches: event
  block matches: empty event
  regex matches: abc
  block matches: abc

Perl 6 Advent Calendar: Day 21 – Show me the data!

Published by nadimkhemir on 2016-12-21T00:01:18

Over the years, I have enjoyed using the different data dumpers that Perl5 offers. From the basic Data::Dumper to modules dumping in hexadecimal, JSON, with colors, handling closures, with a GUI, as graphs via dot, and many others that fellow module developers have posted on CPAN (https://metacpan.org/search?q=data+dump&search_type=modules).

I always find things easier to understand when I can see the data and its relationships. The funkiest display belongs to ddd (https://www.gnu.org/software/ddd/), which I happen to fire up now and then just for the fun of it (the example shows C data, but it works just as well with the Perl debugger).

ddd

Many dumpers are geared towards data transformation and data transmission/storage. A few modules specialize in generating output for the end user to read; I have worked on systems that generated hundreds of thousands of lines of output, and it is close to impossible to read dumps generated by, say, Data::Dumper.

When I started using Perl6, I immediately felt the need to dump data structures (mainly because my noob code wasn't doing what I expected it to do); this led me to port my Perl5 module (https://metacpan.org/pod/Data::TreeDumper, https://github.com/nkh/P6-Data-Dump-Tree) to Perl6. I am now also thinking about porting my HexDump module. I warmly recommend learning Perl6 by porting your modules (if you have any on CPAN): it's fun, educational, and useful for the Perl6 community, and your modules fill a need in a domain that you master, leaving you free to concentrate on the Perl6 itself.

My Perl5 module was ripe for a rewrite, and I wanted to see if and how it would be better written in Perl6; I was not disappointed.

Perl6 is a big language and it takes time to get the pieces right; for a beginner it may seem daunting, even with years of experience. The secret is to take it easy, not give up, and listen. Porting a module is the perfect exercise: you can take it easy because you have already done it before, you're not going to give up because you know you can do it, and you have time to listen to people with more experience (they also need your work). The Perl6 community has been exemplary: helpful, patient, supportive, and always present. If you haven't visited the #perl6 IRC channel yet, now is a good time.

.perl

Every object in Perl6 has a 'perl' method; it can be used to dump the object and the objects under it. The official documentation (https://docs.perl6.org/language/5to6-nutshell#Data%3A%3ADumper) provides a good example.
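A quick sketch of what .perl gives you (the exact output format may vary slightly between Rakudo versions):

```perl6
use MONKEY-SEE-NO-EVAL;    # needed for string EVAL below

my %config = name => 'camelia', tags => [<fast fun>];
say %config.perl;
# prints something like: {:name("camelia"), :tags($["fast", "fun"])}

# the output is valid Perl6 code, so it round-trips through EVAL
my %copy = EVAL %config.perl;
say %copy<name>;           # camelia
```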

.gist

Every object also inherits a ‘gist’ method from Mu, the official documentation (https://docs.perl6.org/routine/gist#(Mu)_routine_gist) states: “Returns a string representation of the invocant, optimized for fast recognition by humans.”
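For comparison, .gist produces the human-oriented form, and it is what say uses under the hood (hash ordering in the output is not guaranteed):

```perl6
my %h = alpha => 1, beta => [2, 3];
say %h.gist;    # something like: {alpha => 1, beta => [2 3]}
say %h;         # same thing: say calls .gist for you
```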

dd, the micro dumper

It took me a while to discover this one; I saw it mentioned in a post on IRC. You know how it feels when you discover something simple after typing .perl and .gist a few hundred times, bahhh!

https://docs.perl6.org/routine/dd
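dd is a Rakudo-specific debugging helper that, as I recall, writes to standard error; a small sketch (the exact output format varies by Rakudo version, so the comments below are only indicative):

```perl6
my @words = <perl6 rocks>;
my %score = perl6 => 10;
dd @words;    # e.g. Array @words = ["perl6", "rocks"]
dd %score;    # e.g. Hash %score = {:perl6(10)}
```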

The three dumpers above are built in. They are also the fastest way to dump data, but as welcome as their output is, I know that it is possible to present data in a more legible way.

Enter Data::Dump

You can find the module on https://modules.perl6.org/ where all the Perl6 modules are. Perl6 modules link to repositories, Data::Dump source is on https://github.com/tony-o/perl6-data-dump.

Data::Dump introduces color, depth limitation, and type-specific dumps. The code is a compact hundred lines that is quite easy to understand. This module was quite helpful in a few cases that I had. It also dumps all the methods associated with objects. Unfortunately, it did fail on a few types of objects. Give it a try.
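Basic usage is a single call; a sketch (as I recall the module exports a Dump routine, but check its README for the exact name and options):

```perl6
use Data::Dump;

my %data = user => { name => 'nadim', langs => <perl5 perl6> };
say Dump(%data);    # colored, indented dump of the structure
```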

Data::Dump::Tree

Emboldened by the Perl6 community, by how much I needed a dumper for visualization, and by the experience from my Perl5 module (mainly the things that I wanted done differently), I started working on the module. I had some difficulties at the beginning; I knew nothing about the details of Perl6, and even if there is a resemblance to Perl5, it's another beast. But I love it: it's advanced, clean, and well designed, and I am grateful for all the efforts that were invested in Perl6.

P6 vs P5 implementation

It’s less than half the size and does as much, which makes it clearer (as much as my newbie code can be considered clean). The old code was one monolithic module with a few long functions; the new code is better organised, and some functionality was split out into extra modules. It may sound like bit-rot (and it probably is, a little) but writing the new code in Perl6 made the changes possible: multi dispatch, traits, and other built-in mechanisms greatly facilitate the refactoring.

What does it do that the other modules don’t?

I’ll only talk about a few points here and refer you to the documentation for all the details (https://raw.githubusercontent.com/nkh/P6-Data-Dump-Tree/master/lib/Data/Dump/Tree.pod); also have a look at the examples in the distribution.

The main goal of Data::Dump::Tree is readability, achieved with filters, type-specific dumpers, colors, and dumper specialization via traits. In the examples directory you can find JSON_parsed.pl, which parses 20 lines of JSON with JSON::Tiny (https://github.com/moritz/json). I'll use it as an example below. The parsed data is dumped with .perl, .gist, Data::Dump, and Data::Dump::Tree.
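Getting a basic tree dump takes one call; a sketch, assuming the exported dump routine described in the module's pod (check the documentation linked above for the current interface):

```perl6
use Data::Dump::Tree;

my %planet = name => 'earth', moons => ['luna'];
dump %planet;    # tree rendering with glyphs, colors, and type information
```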

.perl output (500 lines, unusable for any average human; Gods can manage):
screenshot_20161219_185724

.gist (400 lines, quite readable; no color, and long lines limit readability a bit). Also note that it looks better here than on my terminal, which has problems handling unicode properly:
screenshot_20161219_190004

Data::Dump (4200 lines! Removing the methods would probably make it usable):
screenshot_20161219_190439

The methods dump does not help:
screenshot_20161219_190601

Data::Dump::Tree (100 lines; you be the judge of readability, as I am biased). Of course, Data::Dump::Tree is designed for this specific usage: first, it understands Match objects; second, it can display only the parts of the string that are matched, which greatly reduces the noise.
screenshot_20161219_190932

Tweaking output

The options are explained in the documentation, but here is a short list:
– defining type-specific dumpers
screenshot_20161219_185409

– filtering, to remove data or add a representation for a data set; below, the data structure is dumped as-is and then filtered (with a filter that shows what it is doing).

As filtering happens on the "header" and "footer", it should be easy to write an HTML/DHTML plugin; although bcat (https://rtomayko.github.io/bcat/) already works fine when using ASCII glyphs.

screenshot_20161219_191525
– set the display colors
– change the glyphs
– display address information or not
– use subscripts for indexes
– use ASCII, ANSI, or unicode for the glyphs

Diffs

I tried to implement a diff display in the Perl5 module but failed miserably, as it needed architectural changes. The Perl6 version was much easier; in fact, it's an add-on, a trait, that synchronizes two data dumps. This could be used in tests to show the differences between expected and received data.

screenshot_20161219_184701
Of course, we can eliminate the extra glyphs and the data that is equivalent (I also changed the glyph type to ASCII):
screenshot_20161219_185035

From here

Above anything else, I hope many authors will start writing Perl6 modules, and I also hope to see other data dumping modules. As for Data::Dump::Tree, as it gathers more users I hope to get change requests, patches, and error reports.