[cwn] Attn: Development Editor, Latest OCaml Weekly News

Alan Schmitt alan.schmitt at polytechnique.org
Tue Jul 3 02:28:19 PDT 2018


Hello

Here is the latest OCaml Weekly News, for the week of June 26 to July
03, 2018.

Table of Contents
─────────────────

Next OUPS meetup, July 5th 2018
Ocaml-multicore: report on a June 2018 development meeting in Paris
Sightings of OCaml around the Web
Lwt 4.1.0, first release with MIT license!
reed-solomon-erasure 1.0.1
Pull requests to OCaml now being linted
A proposal for a resource-management model for OCaml
What are my options for doing Native OCaml <=> Erlang <=> Browser (js_of_ocaml and/or bucklescript) RPC
What would it take to unify the parsetree with the typedtree?
Ocaml Github Pull Requests
Other OCaml News
Old CWN


――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――


[Previous Week] http://alan.petitepomme.net/cwn/2018.06.26.html

[Up] http://alan.petitepomme.net/cwn/index.html

[Next Week] http://alan.petitepomme.net/cwn/2018.07.10.html


Next OUPS meetup, July 5th 2018
═══════════════════════════════

  Archive:
  [https://sympa.inria.fr/sympa/arc/caml-list/2018-06/msg00044.html]


Bruno Bernardo announced
────────────────────────

  The next OUPS meetup will take place on Thursday, July 5, 7pm at IRILL
  on the Jussieu campus. As usual, we will have a few talks, followed by
  pizzas and drinks.

  The talks will be the following:
  • Nicolás Ojeda Bär on runtime types
  • Michel Mauny on the OCaml Foundation
  • Frédéric Bour on Wall, a vector graphics renderer using OpenGL
    ([https://github.com/let-def/wall]).

  Please do note that we are always in demand of talk *proposals* for
  future meetups.

  To register, or for more information, go here:
  [https://www.meetup.com/ocaml-paris/events/252133804/]

  *Registration is required! Access is not guaranteed after 7pm if
  you're not registered.* (It also helps us to order the right amount of
  food.)

  Access map:
  IRILL - Université Pierre et Marie Curie (Paris VI)
  Barre 15-16 1er étage
  4 Place Jussieu
  75005 Paris
  [https://www.irill.org/pages/access.html]




  ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――


Ocaml-multicore: report on a June 2018 development meeting in Paris
═══════════════════════════════════════════════════════════════════

  Archive:
  [https://discuss.ocaml.org/t/ocaml-multicore-report-on-a-june-2018-development-meeting-in-paris/2202/1]


gasche announced
────────────────

  Earlier this week (24-25 June 2018) we had a meeting in Paris between
  several OCaml maintainers and researchers (folks from INRIA,
  OCamllabs, Jane Street, and also Frédéric Bour), and one of the things
  that were discussed is the technical state of the Multicore-OCaml
  project. I thought that people here could be interested in a status
  update on that. Note that it's not an update on *user* features (no
  release date for Multicore-OCaml in this post, sorry!), but an update
  on technical *development* plans, so that people that follow the
  compiler distribution development on [https://github.com/ocaml/ocaml/]
  ( or [https://github.com/ocamllabs/ocaml-multicore/] ) have some
  context for what is coming next.

  I'm going to concentrate here on the part that concerns the multicore
  *runtime*: garbage collection and low-level runtime code, summarizing
  the main points from my own notes of the meeting. Note that I'm not
  working on Multicore-OCaml myself, I'm reporting on work done by
  others. I certainly have made some mistakes below, and may update my
  post to fix some of those.

  *TL;DR*: In 2019 we hope to integrate the multicore runtime into the
  upstream compiler distribution, even if it is not enabled by default
  (still experimental), so that it can evolve with the distribution and
  get the kind of testing and code porting experience we need to make
  decisions before a wider release.


Current state of the multicore runtime
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌

  KC Sivaramakrishnan ( @kayceesrk ) gave a presentation on the current
  state of the [Multicore OCaml] project. The [slides are
  available]. They are a good summary of the history and current state
  of the multicore runtime.


[Multicore OCaml] https://github.com/ocamllabs/ocaml-multicore

[slides are available] http://kcsrk.info/slides/mcocaml_gallium.pdf


General milestone: migrate the multicore-ocaml runtime into the main distribution
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌

  Stephen Dolan ( @stedolan ) and KC reported that it is a lot of work
  for them to keep up with changes in the OCaml compiler implementation
  (they just recently rebased their branch on top of 4.06, it was 4.02
  before that). When someone changes something in the OCaml compiler or
  runtime, they check that the rest of the distribution works properly
  or fix/adapt what is affected by the change; those changes are not
  tested against the multicore runtime, and Stephen and KC have to do
  all the fix/adapt work for all changes themselves.

  To help with this problem, the consensus is that we should upstream
  the multicore-runtime code into the compiler distribution soon-ish,
  not enabled by default, so that people that change the compiler
  codebase immediately see the effect on the multicore runtime, and can
  participate in fixing and adapting the code with respect to their
  change – with guidance from multicore experts. For a period of time,
  the multicore runtime would only be available as an experimental
  option.

  We hope that this migration could be done in about a year.


◊ Q3/Q4 2018: build-up PRs

  There are some PRs that change part of the current compiler and
  runtime to play nicer with the multicore bits, but are not part of the
  multicore runtime themselves. The plan is to get these submitted as
  PRs during the next development cycle (for 4.08, although we don't
  know whether they will be merged by the 4.08 release). This is not a
  new process, some such PRs have already been integrated in the last
  few months/years (see for example [GPR#1073], [GPR#1723]), but the
  hope is to get as much of those preliminary changes as possible.


  [GPR#1073] https://github.com/ocaml/ocaml/pull/1073

  [GPR#1723] https://github.com/ocaml/ocaml/pull/1723


◊ Q3/Q4 2018 (?): forward-compatible C API

  We know that Multicore OCaml will require some changes to the way C
  stubs are written for OCaml. Of course, we cannot expect authors of C
  bindings to switch overnight, so there will be some transition period
  where existing C bindings won't be able to use the multicore runtime
  and will have to be ported. On the other hand, the current runtime
  should be able to support the multicore-friendly C API, so that ported
  code can work on both runtimes.

  We agreed that it would be useful to have this "multicore-friendly C
  API" be available as soon as possible, or at least part of it, even
  before the multicore-runtime itself lands, so that people can already
  start making their code forward-compatible. Stephen already tried to
  do this in a giant PR [GPR#1003] last year, and that was too big and
  never quite finished.

  My understanding is that the multicore-OCaml people are still not
  completely sure on what the final API will be, which things definitely
  have to be broken and which things might be supported with more work on
  their side, and hesitated to push changes that could end up
  unnecessary. We agreed to try again, with smaller API changes
  submitted separately from each other, instead of a giant single change;
  and agreed on principle that it was OK to propose new APIs that had a
  small chance of not being necessary in the end, as long as the API is
  sensible and can be easily implemented on top of the current
  runtime. (Early adopters might face a bit of code churn as things
  stabilize.)

  (One first change that we want to look at is [GPR#1798], which
  implements a notion of C-API versioning.)


  [GPR#1003] https://github.com/ocaml/ocaml/pull/1003

  [GPR#1798] https://github.com/ocaml/ocaml/pull/1798


◊ 2019: multicore runtime features, as an experimental runtime

  Then the plan is to start merging the multicore runtime codebase
  itself, piece by piece, so that it becomes possible to perform
  larger-scale experiments with it. It still wouldn't be enabled by
  default at this point, but it would be part of the actively moving
  compiler distribution, and in particular remain at feature parity with
  the rest of the compiler and runtime codebase.

  In this phase we'll need people to review the multicore runtime
  implementation, if only to help future upstream maintenance. We have
  started to ask around who would be potentially interested – hopefully
  with a related skill set. The project also has a laundry list of
  [pending tasks] that could also be worked on by the people studying
  the codebase.

  Some of the tasks being worked on involve implementing parts of the
  OCaml runtime systems that are not yet fully supported by the
  multicore runtime, such as Ephemerons (a generalization of weak
  pointers). My understanding is that the multicore devs would like to
  reach feature parity before merging the runtime code, but this may be
  re-discussed and changed if some parts of the runtime prove too
  difficult to support.


  [pending tasks]
  https://github.com/ocamllabs/ocaml-multicore/projects/3


◊ remark: the runtime/language split

  The current multicore-ocaml fork/switch contains both the multicore
  runtime, and an implementation of (untyped) effect handlers in the
  surface language, as the way for users to access concurrency features
  (to control the fibers / green-threads). Effect handlers come in
  evolving proposals of their own, there is a type-and-effect system
  under work by Leo White, and they are being discussed as well, in a
  somewhat independent way. Bundling the two changes in the same
  patchset makes reviewing more difficult, and it also created some
  silly technical issues: because effect handlers change the language
  AST, most ppx-extension code is broken on the multicore-OCaml fork,
  which makes it difficult to use language tooling, to test user
  programs, run interesting benchmarks, etc.

  In the short term the plan for upstreaming the runtime is to separate
  it from the effect-handler part, by exposing an extremely minimal
  fiber-control API, as compiler primitives or as part of the `Obj'
  runtime. That is not how anybody wants end-users to access the
  multicore runtime, but it would be a minimal device for the first
  period of runtime code upstreaming and reviewing, to make it easy to
  compile any codebase against the multicore-aware compiler, and use the
  standard OCaml packages and tooling in a multicore switch.


◊ remark: performance tuning, not yet

  Right now the multicore-OCaml devs, if I understand correctly, have
  been mostly working with micro-benchmarks, in large part because of
  the difficulty of using regular OCaml packages and tooling previously
  mentioned. A lot of opportunities (and necessity) for performance
  tuning will appear once macro-benchmarks and realistic workloads
  become available, and once some of the larger performance-sensitive
  codebases (which often include some C bindings or compiler-sensitive
  Obj hackery) have been ported. As Anil Madhavapeddy (@avsm) pointed
  out, once more code out there can be benchmarked against the multicore
  runtime we should start continuously monitoring the performance
  results.

  The general expectation is that the multicore runtime will be slower
  for purely-sequential programs than the current runtime, but the goal
  is to keep this overhead small (a first goal that was mentioned was a
  10% overhead, although we really don't know yet how easy/hard that
  target is). The two distinct runtimes may remain available in the
  distribution for as long as there are enough users asking for the
  availability of the sequential runtime, and the overhead is high enough
  to justify the maintenance costs of keeping both. (In terms of the
  multicore-runtime performance on sequential workloads, some things can
  be made faster at the cost of being harder to write and possibly more
  painful to maintain, so there are tradeoffs still to be explored
  there.)

  One thing I found interesting that Stephen explained to me is: you
  cannot just take a sequential program (say Coq), compile and run it
  under a multicore switch, and expect to get a meaningful "overhead
  number" (as in: "the multicore runtime is X% slower than the
  sequential one on this program"). The problem is that GCs can be
  configured to have more or less memory overhead – asking for less
  memory overhead results in more GC work, so a slower overall
  program. It doesn't make sense to only compare the default settings of
  two GCs for time, as they may have very different memory-overhead
  profile: maybe the second GC looks faster, but if you adjust its
  settings to use no more memory than the first it actually is
  slower. What you have to do instead is to try to plot the 2D
  time/memory compromise, and compare the graphs for the two GCs, or at
  least compare the entire plot of the new GC with the current results
  of your current GC.
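
  As a rough illustration of the kind of measurement described above,
  the sketch below sweeps the `space_overhead' parameter of the current
  (sequential) GC through the standard `Gc' module and records, for each
  setting, the CPU time of a toy workload and the major-heap size once
  it has run. The workload, the list of settings and the function names
  are assumptions made for this example; they do not come from the
  meeting, and a real comparison would use realistic programs.

```ocaml
(* Toy workload (hypothetical): repeatedly build short lists and keep only
   the most recent 1000, so that the GC has garbage to reclaim. *)
let workload () =
  let keep = Array.make 1_000 [] in
  for i = 0 to 199_999 do
    keep.(i mod 1_000) <- List.init 20 (fun j -> i + j)
  done;
  Array.fold_left (fun acc l -> acc + List.length l) 0 keep

(* Run [f] under the given [space_overhead] setting and return
   (CPU seconds, major-heap size in words after the run). *)
let measure space_overhead f =
  let saved = Gc.get () in
  Gc.set { saved with Gc.space_overhead = space_overhead };
  Gc.compact ();
  let t0 = Sys.time () in
  ignore (Sys.opaque_identity (f ()));
  let dt = Sys.time () -. t0 in
  let words = (Gc.quick_stat ()).Gc.heap_words in
  Gc.set saved;
  (dt, words)

(* One (setting, (time, memory)) point per setting: the raw data for a
   time/memory plot. *)
let sweep () =
  List.map (fun o -> (o, measure o workload)) [ 20; 80; 200; 400 ]

let () =
  List.iter
    (fun (o, (dt, words)) ->
      Printf.printf "space_overhead=%3d  time=%.3fs  heap_words=%d\n"
        o dt words)
    (sweep ())
```

  Each printed line is one point of the time/memory curve; comparing two
  GCs then means comparing two such curves rather than two single
  numbers.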


Summary
╌╌╌╌╌╌╌

  • In the next six months, we hope to start merging most of the
    preparatory work, and a forward-compatible C API, into the upstream
    compiler distribution.
  • Then in 2019 we hope to start merging the multicore runtime itself
    (independently of effect handlers), as a non-default experimental
    option. We will need people to review the codebase and feel more
    confident in their ability to also edit it.
  • This should allow much more extensive performance testing, and the
    porting of some performance-sensitive codebases, so that we can get
    a better picture of the performance profile, of the difficulty to
    port code, potential parts that need to be reworked, etc.

  Plenty of interesting applications of a multicore runtime (and of a
  typed effect system) have also been discussed, interesting
  memory-model questions, formalization questions, etc. This is
  definitely an interesting time for the OCaml community!




  ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――


Sightings of OCaml around the Web
═════════════════════════════════

  Archive:
  [https://discuss.ocaml.org/t/sightings-of-ocaml-around-the-web/1867/22]


Yotam Barnoy announced
──────────────────────

  Great new article out by @bobbypriambodo, this time on [creating a
  project with opam, caqti and postgresql].




  ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――


[creating a project with opam, caqti and postgresql]
https://medium.com/@bobbypriambodo/interfacing-ocaml-and-postgresql-with-caqti-a92515bdaa11


Lwt 4.1.0, first release with MIT license!
══════════════════════════════════════════

  Archive:
  [https://discuss.ocaml.org/t/lwt-4-1-0-first-release-with-mit-license/2216/1]


Anton Bachin announced
──────────────────────

  Switching to the MIT license is actually the main reason to release
  Lwt now, so this release is a bit sparse compared to most :) There
  are, however, other nice additions and fixes. See the changelog below.

  [`Lwt_io.establish_server_with_client_socket'] was added to help [port
  http/af to Lwt]. http/af is the fast new HTTP server by
  @seliopou. This should be coming out soon.

  Additions

  • Change license to MIT ([#560]).
  • [`Lwt_fmt'], an Lwt-aware version of [`Format'] ([#548], @Drup).
  • [`Lwt_io.establish_server_with_client_socket'] ([#586]).

  Bugs fixed

  • [`Lwt_stream.iter_n']: rename `?max_threads' argument to
    `?max_concurrency' ([#581], @hcarty).
  • PPX: reject `match' expressions that have only `exception' cases
    ([#597], @raphael-proust).

  Miscellaneous

  • Improvements to [`Lwt_pool'] docs ([#575], @bobbypriambodo).
  • Restore all PPX tests ([#590], Jess Smith).

  Thanks to all the contributors who worked on this release, and to past
  contributors for agreeing to the license switch!

  changelog: [https://github.com/ocsigen/lwt/releases/tag/4.1.0]

  [https://github.com/ocsigen/lwt#readme]




  ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――


[`Lwt_io.establish_server_with_client_socket']
https://github.com/ocsigen/lwt/blob/33c37b550304b8d18b5dea4818fb1bfbc9a86aaf/src/unix/lwt_io.mli#L533

[port http/af to Lwt] https://github.com/inhabitedtype/httpaf/pull/53

[#560] https://github.com/ocsigen/lwt/issues/560

[`Lwt_fmt']
https://github.com/ocsigen/lwt/blob/33c37b550304b8d18b5dea4818fb1bfbc9a86aaf/src/unix/lwt_fmt.mli

[`Format'] http://caml.inria.fr/pub/docs/manual-ocaml/libref/Format.html

[#548] https://github.com/ocsigen/lwt/pull/548

[#586] https://github.com/ocsigen/lwt/pull/586

[`Lwt_stream.iter_n']
https://github.com/ocsigen/lwt/blob/33c37b550304b8d18b5dea4818fb1bfbc9a86aaf/src/core/lwt_stream.mli#L314

[#581] https://github.com/ocsigen/lwt/pull/581

[#597] https://github.com/ocsigen/lwt/pull/597

[`Lwt_pool']
https://github.com/ocsigen/lwt/blob/33c37b550304b8d18b5dea4818fb1bfbc9a86aaf/src/core/lwt_pool.mli

[#575] https://github.com/ocsigen/lwt/pull/575

[#590] https://github.com/ocsigen/lwt/pull/590


reed-solomon-erasure 1.0.1
══════════════════════════

  Archive:
  [https://discuss.ocaml.org/t/ann-reed-solomon-erasure-1-0-1/2217/1]


Darren announced
────────────────

Overview
╌╌╌╌╌╌╌╌

  reed-solomon-erasure is an OCaml implementation of Reed-Solomon
  erasure coding.

  It is a port of the Rust library [reed-solomon-erasure], which is a
  port of several other libraries.

  It provides functions for encoding, verifying, and reconstructing data
  and parity shards, and deals with `string', `bytes', and
  `Core_kernel.Bigstring.t'.

  You can install it via `opam install reed-solomon-erasure'.


[reed-solomon-erasure] https://github.com/darrenldl/reed-solomon-erasure


Example
╌╌╌╌╌╌╌

  ┌────
  │ open Reed_solomon_erasure
  │ 
  │ let () =
  │   let r = ReedSolomon.make 3 2 in (* 3 data shards, 2 parity shards *)
  │ 
  │   let master_copy = [|"\000\001\002\003";
  │                       "\004\005\006\007";
  │                       "\008\009\010\011";
  │                       "\000\000\000\000"; (* last 2 rows are parity shards *)
  │                       "\000\000\000\000"|] in
  │ 
  │   (* Construct the parity shards *)
  │   ReedSolomon.encode_str r master_copy;
  │ 
  │   (* Make a copy and transform it into option shards arrangement
  │      for feeding into reconstruct_opt_str *)
  │   let shards = RS_Shard_utils.shards_to_option_shards_str master_copy in
  │ 
  │   (* We can remove up to 2 shards, which may be data or parity shards *)
  │   shards.(0) <- None;
  │   shards.(4) <- None;
  │ 
  │   (* Try to reconstruct missing shards *)
  │   ReedSolomon.reconstruct_opt_str r shards;
  │ 
  │   (* Convert back to normal shard arrangement *)
  │   let result = RS_Shard_utils.option_shards_to_shards_str shards in
  │ 
  │   assert (ReedSolomon.verify_str r result);
  │   assert (master_copy = result)
  └────
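
  For readers new to erasure coding, here is a deliberately simplified
  sketch of the underlying idea. It does not use the library and it is
  not Reed-Solomon: it uses a single XOR parity shard, which can recover
  at most one missing shard, whereas Reed-Solomon supports several
  parity shards by working over a finite field. All function names below
  are hypothetical.

```ocaml
(* XOR two byte strings, assumed to have the same length. *)
let xor_shards a b =
  String.init (String.length a) (fun i ->
      Char.chr (Char.code a.[i] lxor Char.code b.[i]))

(* The parity shard is the XOR of all the given shards. *)
let parity shards =
  match Array.to_list shards with
  | [] -> invalid_arg "parity: no shards"
  | first :: rest -> List.fold_left xor_shards first rest

(* Rebuild the single missing shard (None) by XOR-ing the present ones.
   Raises Invalid_argument unless exactly one shard is missing. *)
let reconstruct shards =
  let present = List.filter_map (fun s -> s) (Array.to_list shards) in
  if List.length present <> Array.length shards - 1 then
    invalid_arg "reconstruct: exactly one shard must be missing";
  let missing = parity (Array.of_list present) in
  Array.map (function Some s -> s | None -> missing) shards

let () =
  let data = [| "\000\001\002\003"; "\004\005\006\007"; "\008\009\010\011" |] in
  (* 3 data shards followed by 1 parity shard *)
  let all = Array.append data [| parity data |] in
  let damaged = Array.map Option.some all in
  damaged.(1) <- None;
  assert (reconstruct damaged = all)
```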


Performance
╌╌╌╌╌╌╌╌╌╌╌

  The encoding performance is shown below

  Machine : laptop with `Intel(R) Core(TM) i5-3337U CPU @ 1.80GHz (max
  2.70GHz) 2 Cores 4 Threads'

  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
   Configuration  Klaus Post's  reed-solomon-erasure (Rust)  ocaml-reed-solomon-erasure (bigstr)  … (bytes)  … (str)   
  ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   10x2x1M        ~7800MB/s     ~4800MB/s                    ~3200MB/s                            ~1500MB/s  ~1500MB/s 
  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━


Links :
╌╌╌╌╌╌╌

  • Documentation
    • [https://darrenldl.gitlab.io/ocaml-reed-solomon-erasure/]
  • OPAM
    • [http://opam.ocaml.org/packages/reed-solomon-erasure/]
  • Repo
    • [https://gitlab.com/darrenldl/ocaml-reed-solomon-erasure]
  • Archived GitHub repo (for those who keep track of things by stars)
    • [https://github.com/darrenldl/ocaml-reed-solomon-erasure]




  ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――


Pull requests to OCaml now being linted
═══════════════════════════════════════

  Archive:
  [https://discuss.ocaml.org/t/pull-requests-to-ocaml-now-being-linted/2221/1]


David Allsopp announced
───────────────────────

  [OCaml GPR#1148] has been merged this weekend. Contrary to the title
  of this PR, the most relevant thing for contributors to the compiler
  itself is that pull requests are now run via our linting tool which
  resides in `tools/check-typo' in the compiler distribution.  It
  produces output like the following:

  ┌────
  │ ./asmcomp/selectgen.ml:798.81: [long-line] line is over 80 columns
  │ ./asmcomp/selectgen.ml:1129.81: [long-line] line is over 80 columns
  │ ./bytecomp/simplif.ml:419.81: [long-line] line is over 80 columns
  │ ./bytecomp/symtable.ml:43.3: [white-at-eol] whitespace at end of line
  │ ./bytecomp/symtable.ml:62.3: [white-at-eol] whitespace at end of line
  │ ./tools/dumpobj.ml:533.81: [long-line] line is over 80 columns
  │ ./typing/includemod.ml:167.48: [white-at-eol] whitespace at end of line
  │ ./typing/includemod.ml:168.64: [white-at-eol] whitespace at end of line
  │ ./typing/includemod.ml:319.81: [long-line] line is over 80 columns
  │ ./utils/misc.ml:180.40: [white-at-eol] whitespace at end of line
  │ ./utils/misc.mli:105.7: [white-at-eol] whitespace at end of line
  └────
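
  The two rules that appear in this output are simple enough to sketch
  in a few lines of OCaml. The following is only an illustration of what
  `long-line' and `white-at-eol' check, not the actual `check-typo'
  tool: the function names are made up, lengths are counted in bytes,
  and the column reported for trailing whitespace (that of the first
  trailing blank) is an assumption.

```ocaml
(* Index of the first character of the trailing run of blanks, if any. *)
let trailing_blank_start line =
  let n = String.length line in
  let rec back i =
    if i > 0 && (line.[i - 1] = ' ' || line.[i - 1] = '\t') then back (i - 1)
    else i
  in
  let start = back n in
  if start < n then Some start else None

(* Check one line ([lnum] is 1-based) and return messages shaped like
   "file:line.column: [rule] text". *)
let check_line ~file ~lnum line =
  let long =
    if String.length line > 80 then
      [ Printf.sprintf "%s:%d.81: [long-line] line is over 80 columns"
          file lnum ]
    else []
  in
  let white =
    match trailing_blank_start line with
    | Some i ->
        [ Printf.sprintf "%s:%d.%d: [white-at-eol] whitespace at end of line"
            file lnum (i + 1) ]
    | None -> []
  in
  long @ white

(* Check every line of [text], in order. *)
let check_text ~file text =
  String.split_on_char '\n' text
  |> List.mapi (fun i line -> check_line ~file ~lnum:(i + 1) line)
  |> List.concat

let () =
  check_text ~file:"./demo.ml" "let x = 1  \nlet y = 2\n"
  |> List.iter print_endline
```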

  It's necessary for this test to pass (in certain situations it's
  possible to override the tests via `.gitattributes', but in general
  this is not permitted).  It's hoped that this won't cause too much
  pain for our wonderful contributors, and, to ease development, that
  GPR includes a Git pre-commit hook which can be installed to your
  local Git clone simply by running `cp tools/pre-commit-githook
  .git/hooks/pre-commit' (even on Windows). Once installed, the hook
  blocks attempts to commit files which are invalid:

  ┌────
  │ [git:git-precommit] C:\DRA\ocaml>git commit -m "GPR#1148 Changes"
  │ ./Changes:108.81: [long-line] line is over 80 columns
  └────

  If you're in a rush or, perish the thought, there's a bug in the
  pre-commit hook, you can always use the `--no-verify' flag for `git
  commit' to skip the `check-typo' test, but bear in mind that Travis
  will still test it.

  Please do report any issues either here, on caml-list or on Mantis.

  Happy linted compiler hacking!




  ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――


[OCaml GPR#1148] https://github.com/ocaml/ocaml/pull/1148


A proposal for a resource-management model for OCaml
════════════════════════════════════════════════════

  Archive:
  [https://discuss.ocaml.org/t/a-proposal-for-a-resource-management-model-for-ocaml/1680/22]


Continuing this thread, Guillaume Munch-Maccagnoni announced
────────────────────────────────────────────────────────────

  Interested people can find [slides] from my talk from Monday at Inria
  Paris (updated to take feedback into account).




  ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――


[slides] http://guillaume.munch.name/files/gallium-25-06-2018.pdf


What are my options for doing Native OCaml <=> Erlang <=> Browser (js_of_ocaml and/or bucklescript) RPC
═══════════════════════════════════════════════════════════════════════════════════════════════════════

  Archive:
  [https://discuss.ocaml.org/t/what-are-my-options-for-doing-native-ocaml-erlang-browser-js-of-ocaml-and-or-bucklescript-rpc/2223/1]


Metin Akat asked
────────────────

  I've been reading about various ways of doing RPC, though still
  haven't started testing… but I am already quite lost.

  Here is what I know so far, please correct me for where I'm wrong and
  other suggestions are very welcome.

  1. gRPC - Currently not available for OCaml, as there is no HTTP2
     support
  2. Apache Thrift - seems to be officially supported by Apache, but
     really doubtful it will work in the browser
  3. piqi - this project seems to be quite silent since 2014. I asked
     here if it will work in the browser:
     [https://groups.google.com/forum/#!topic/piqi/yjZA42ZXobc]
  4. Cap'N Proto which [here] only seems to support JS in node.js, so
     probably it won't work in the browser for OCaml either.

  So what would you guys do if you were to implement something like
  this?

  In general, this question can be simplified if we remove the Erlang
  from the equation… but then still, what would you do in order to
  implement RPC between native and in-browser OCaml of some sort?


[here] https://capnproto.org/otherlang.html


Bobby Priambodo suggested
─────────────────────────

  Have you taken a look at ocaml-rpc?

  [https://github.com/mirage/ocaml-rpc]


Aaron L. Zeng also suggested
────────────────────────────

  There is also [Async_rpc_kernel] if you are using the Async
  ecosystem. You can use something like websockets to provide a
  transport layer for the async RPC machinery.




  ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――


[Async_rpc_kernel] https://github.com/janestreet/async_rpc_kernel


What would it take to unify the parsetree with the typedtree?
═════════════════════════════════════════════════════════════

  Archive:
  [https://discuss.ocaml.org/t/what-would-it-take-to-unify-the-parsetree-with-the-typedtree/2225/1]


Arthur Charguéraud said
───────────────────────

  I suspect that I'm not the first one to raise this question, but as I
  could not find the answer online, I'll ask it here:

  *What would be the major obstacles to using a single AST to represent
   both the parsetree and the typedtree?*


Why factorize?
╌╌╌╌╌╌╌╌╌╌╌╌╌╌

  Before elaborating on this question, let me explain why I care so much
  about it.  Improving the code factorization inside the compiler, as
  useful as it might be to decrease the cost of maintaining and
  extending the codebase, is not what motivates my question. What I care
  about is the ability to conveniently write source-to-source program
  transformations that need to exploit information about types, and need
  to be able to rely on fully-qualified paths for identifiers.

  Given that I need type information, it's natural to start from the
  typedtree.  Maintaining types during transformation is tremendous
  effort, plus it's generally completely useless, since I could
  re-typecheck the new code. I thus have the option of rebuilding a
  typedtree with broken type information, or rebuilding a parsetree.
  Rebuilding a parsetree from the typedtree is possible, but is a rather
  inconvenient and awkward thing to do when the input is a
  typedtree. When in a branch I want to leave the input program "as it
  was", I just expect to return it unchanged.  Moreover, due to small
  differences between the typedtree and the parsetree, some extra work
  is required for producing a valid parsetree (see typing/untypeast.ml).

  In summary, it seems that the sensible way to go is to express my
  program transformation as a function that takes as input a typedtree
  with valid types, and returns a typedtree with dummy type information,
  and then call the type-checker again. However, I'm lacking at the
  moment all the mappers and ast_helpers on the typedtree data type to
  help me produce structure. These tools are provided for the parsetree
  only. This raises the obvious question: why wouldn't the parser also
  produce a typedtree with dummy type information?  This would also help
  factorizing all the iterators/mappers/ast_helpers/printers over the
  two ASTs.


How to factorize?
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌

  This brings us back to the main question that is the concern of this
  thread.  At a high level, the parsetree has become pretty much a
  strict subset of the typedtree. Imo, it no longer makes much sense to
  live with such a large amount of code duplication between the
  parsetree and the typedtree. Concretely, what I have in mind is that
  the typedtree could be used for both, simply using dummy values for
  fields of type "typ" and "env" and "path" etc. What matters is to
  clearly mark, when manipulating an AST, whether its types are assigned
  or missing. (For efficiency of multiple program transformations, one
  could also imagine an AST with only part of its type information being
  invalid.)
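
  To make the idea concrete, here is a toy sketch of a single AST whose
  type annotations may be missing. It is not the compiler's actual
  parsetree or typedtree, and every name in it is hypothetical. A parser
  would build nodes with `None' as their type, the type-checker fills
  the types in, and a type-aware transformation returns rebuilt nodes
  untyped so that they can be type-checked again.

```ocaml
type ty = Int | Bool

(* One AST for both phases: [ty = None] means "not (or no longer) typed". *)
type expr = { desc : desc; ty : ty option }

and desc =
  | Const_int of int
  | Const_bool of bool
  | Var of string
  | If of expr * expr * expr
  | Add of expr * expr

(* Helper in the spirit of Ast_helper: build an untyped node. *)
let mk desc = { desc; ty = None }

exception Type_error of string

(* Minimal type-checker: returns the same tree with every [ty] filled in. *)
let rec check env e =
  let typed desc ty = { desc; ty = Some ty } in
  let ty_of e = match e.ty with Some t -> t | None -> assert false in
  match e.desc with
  | Const_int _ -> typed e.desc Int
  | Const_bool _ -> typed e.desc Bool
  | Var x -> (
      match List.assoc_opt x env with
      | Some t -> typed e.desc t
      | None -> raise (Type_error ("unbound variable " ^ x)))
  | Add (a, b) ->
      let a = check env a and b = check env b in
      if ty_of a = Int && ty_of b = Int then typed (Add (a, b)) Int
      else raise (Type_error "Add expects integers")
  | If (c, t, f) ->
      let c = check env c and t = check env t and f = check env f in
      if ty_of c = Bool && ty_of t = ty_of f then
        typed (If (c, t, f)) (ty_of t)
      else raise (Type_error "ill-typed conditional")

(* Type-aware rewrite: [a + 0] becomes [a] when [a] is known to be an Int.
   Rebuilt nodes are left untyped; untouched leaves are returned as is. *)
let rec simplify e =
  match e.desc with
  | Add (a, { desc = Const_int 0; _ }) when a.ty = Some Int -> simplify a
  | Add (a, b) -> mk (Add (simplify a, simplify b))
  | If (c, t, f) -> mk (If (simplify c, simplify t, simplify f))
  | Const_int _ | Const_bool _ | Var _ -> e

let () =
  let env = [ ("x", Int) ] in
  let parsed = mk (Add (mk (Var "x"), mk (Const_int 0))) in
  let rewritten = simplify (check env parsed) in
  (* Re-run the type-checker on the transformed tree. *)
  let retyped = check env rewritten in
  assert (retyped.desc = Var "x" && retyped.ty = Some Int)
```

  Note that `simplify' returns unchanged subtrees directly, which is the
  "leave the input program as it was" behaviour asked for above.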

  Over the years, differences between the two ASTs have been dramatically
  reduced.  Investigating the files and looking at typing/untypeast.ml,
  I could spot several differences, which I think could be handled
  either by adding a few constructors in the typedtree without polluting
  it too much, or in a few cases by requiring the parser to handle some
  trivial syntactic sugar. During type-checking:
  • fun gets encoded using function (not convinced this is for the
    better; it seems not very hard for the typechecker to preserve the
    distinction; preserving it would certainly help for the writing of
    program transformations.)
  • exp_construct has an inherent ambiguity that can only be solved during
    typechecking (nevertheless the Texp_construct could be used by the
    parser, and the typechecker could later refine as Pexp_construct)
  • exception patterns are cleanly separated from other patterns (parser
    could do it?)
  • order of arguments in applications involving optional arguments is
    unclear (?)


Possible migration steps
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌

  I do not underestimate the work involved in a major refactoring of the
  code base, and the issues associated with code migration of
  third-party code.  Yet, this cost could be mitigated, by carrying out
  the migration in several steps:
  • Define the new (unified) AST, with its iterator, mapper, helper, and
    printer, so that it is ready to use for type-aware source-to-source
    program transformation.  This is certainly the part that involves
    the most work. But it's the part that's unavoidable for being able
    to write type-aware transformations, which is something that I —and
    I bet others— will certainly end up programming.  That said, it
    would be great that the definition of the "ideal" AST be the result
    of an open discussion with developers of the compiler.
  • Define a conversion function from and to that new AST to the old
    parsetree, so as to allow backward compatibility with ppx
    transformations. Note that one direction would be pretty much
    exactly the code from untypeast.ml.  The other direction could be
    defined with the help of the functions from the AST_helper.ml
    associated with the new AST.
  • Update the code of the typechecker to take as input values from the
    new AST.  This involves only very minor and local changes, because
    the new AST would be quite close to the current typed AST.
  • Update the code of the parser to directly produce values in the new
    AST.  This would also involve relatively minor changes: essentially,
    we need to replace the constructors from the old parsetree datatype
    with calls to the AST helper functions for the new AST.

  At that point, the original parsetree datatype will only need to
  remain for backward compatibility with existing ppx code.

  I would really like to understand what are the obstacles to be
  expected to get there.




  ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――


Ocaml Github Pull Requests
══════════════════════════

Gabriel Scherer and the editor compiled this list
─────────────────────────────────────────────────

  Here is a sneak peek at some potential future features of the Ocaml
  compiler, discussed by their implementers in these Github Pull
  Requests.

  • Functions to read/write binary representations of numbers
    ([https://github.com/ocaml/ocaml/pull/1864])
  • Remove the C plugins mechanism
    ([https://github.com/ocaml/ocaml/pull/1867])
  • Add paths for built-in types
    ([https://github.com/ocaml/ocaml/pull/1876])




  ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――


Other OCaml News
════════════════

From the ocamlcore planet blog
──────────────────────────────

  Here are links from many OCaml blogs aggregated at [OCaml Planet].

  • [Full Time: Compiler Engineer at Jane Street in New York & London]
  • [Full Time: Software Developer (Functional Programming) at Jane
    Street in New York, NY; London, UK; Hong Kong]




  ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――


[OCaml Planet] http://ocaml.org/community/planet/

[Full Time: Compiler Engineer at Jane Street in New York & London]
http://jobs.github.com/positions/9e8ba450-e72e-11e7-926f-6ce07b7015c8

[Full Time: Software Developer (Functional Programming) at Jane Street
in New York, NY; London, UK; Hong Kong]
http://jobs.github.com/positions/0a9333c4-71da-11e0-9ac7-692793c00b45


Old CWN
═══════

  If you happen to miss a CWN, you can [send me a message] and I'll mail
  it to you, or go take a look at [the archive] or the [RSS feed of the
  archives].

  If you also wish to receive it every week by mail, you may subscribe
  [online].
  ――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
  [Alan Schmitt]


[send me a message] mailto:alan.schmitt at polytechnique.org

[the archive] http://alan.petitepomme.net/cwn/

[RSS feed of the archives] http://alan.petitepomme.net/cwn/cwn.rss

[online] http://lists.idyll.org/listinfo/caml-news-weekly/

[Alan Schmitt] http://alan.petitepomme.net/

-- 
OpenPGP Key ID : 040D0A3B4ED2E5C7
Monthly Athmospheric CO₂, Mauna Loa Obs. 2018-05: 411.25, 2017-05: 409.65