[cwn] Attn: Development Editor, Latest OCaml Weekly News

Alan Schmitt alan.schmitt at polytechnique.org
Tue May 26 01:51:36 PDT 2020


Hello

Here is the latest OCaml Weekly News, for the week of May 19 to 26,
2020.

Table of Contents
─────────────────

Hannes Mehnert interview about MirageOS and OCaml by Evrone
A dynamic checker for detecting naked pointers
ANN: Releases of ringo
Solidity parser in OCaml with Menhir
Browsing source with merlin (and tuareg) the right way?
New release of Cucumber ML 1.0.3
New OCaml books?
Integer division behaviour
New release of tablecloth
Language abstractions and scheduling techniques for efficient execution of parallel algorithms on multicore hardware
Release 1.2 of HoCL
Other OCaml News
Old CWN


Hannes Mehnert interview about MirageOS and OCaml by Evrone
═══════════════════════════════════════════════════════════

  Archive:
  [https://discuss.ocaml.org/t/hannes-mehnert-interview-about-mirageos-and-ocaml-by-evrone/5784/1]


Elizabeth Lvova announced
─────────────────────────

  [https://evrone.com/hannes-mehnert-interview]


Guillaume Munch-Maccagnoni then asked
─────────────────────────────────────

  Thank you @elizabethlvova for this link.

  Hi @hannes, and other MirageOS developers if they know the answer. I
  am curious about the soft real-time applications.

  • What kind of latency do you target, and what kind of latency does
    OCaml allows you to achieve? Are there concrete evaluations about it
    in the context of MirageOS? (Bonus internet points if they are
    public so as to be referenced in a paper, that would be very helpful
    to me!)
  • I have learnt on this discuss that low-latency can be obtained in
    OCaml by writing in a special style where you promote very
    little. Do you sometimes have to pay attention to your allocation
    patterns when you program for MirageOS? Have you ever had to profile
    an application for latency, and fix it by changing allocation
    patterns?


Hannes Mehnert replied
──────────────────────

  I am not sure whether there's anyone focussing on low-latency MirageOS
  unikernels. My goal is to first get robust and sustainable
  infrastructure. I have been playing a bit with an old version of
  statmemprof to figure out allocation profiles (and landmarks for
  profiling code), but I am not aware of any in-depth allocation
  analysis. The closest I am aware of is httpaf's motivational
  benchmarks [https://github.com/inhabitedtype/httpaf#performance]. Also
  [https://github.com/mirage/mirage/pull/968] in respect to our IP
  stack, but I still feel there's room for improvement (such as using
  String/Byte instead of Bigarray; avoid allocation of small structures
  when sending data).


Following several questions, Anil Madhavapeddy replied
──────────────────────────────────────────────────────

Bikal Lem said
╌╌╌╌╌╌╌╌╌╌╌╌╌╌

        It is interesting you mentioned this. Isn’t the usage of
        bigarray more efficient than String/bytes? I think httpaf
        uses bigstringaf and faraday which seems to pervasively
        use bigarray as its primary buffer data structure. Isn’t
        this a performant choice?

  This is a good question, and it's helpful to understand what each
  datastructure is backed by, and what operations are inefficient.

  • `Bigarray' is a pointer to externally allocated memory of arbitrary
    length. It supports creating smaller views of the same memory
    without copying it, which is implemented at the [OCaml runtime
    level]. Accessing data within bigarrays is fast thanks to some
    compiler primitives which allow for endian-neutral parsing and
    serialisation, implemented by [ocplib-endian].
  • Bigarrays are extremely convenient for network IO, since they
    support everything needed for minimal copying of data from the OS.
    You can exchange memory pages directly from the OS into the OCaml
    heap, and process them. Unfortunately, one operation is critically
    slow here – creating a substring. Bigarray's provenance was
    originally to interop better with Fortran-style HPC code, where the
    size and dimensionality of arrays is generally large.  For IO, we
    just want really speedy 1-dimension arrays, and in this usecase
    Bigarray substring creation is very slow due to the underlying
    reference counting. Thus [cstruct] was born, which keep a single
    underlying Bigarray structure and allocates [small OCaml records] on
    the minor heap for subviews. These are cheap to create and GC, and
    the underlying data is not copied unless requested.
  • Strings are immutable and sit in the OCaml heap, and require a data
    copy from the outside world into them.  Under some circumstances
    (usually small allocations) they can be more performant.
  • Buffers are a resizable String, and efficient if you need to
    concatenate lots of data of unknown size.

  So the final answer, as with many systems performance problems about
  what is "efficient" depends on your allocation patterns. For
  transmitting data, there is often a number of small pieces of data
  that are combined onto a set of pages for the write path. In this
  case, a hybrid of "in-heap" assembly using small strings followed by
  blitting into a Bigarray is reasonable. For reading, parsing directly
  from a Bigarray into a cstruct works well.


[OCaml runtime level]
https://github.com/ocaml/ocaml/blob/trunk/runtime/caml/bigarray.h#L78

[ocplib-endian] https://github.com/OCamlPro/ocplib-endian

[cstruct] https://github.com/mirage/ocaml-cstruct

[small OCaml records]
https://github.com/mirage/ocaml-cstruct/blob/master/lib/cstruct.mli#L143


Yotam Barnoy said
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌

        Compared to string/byte, Bigarray doesn’t have much of an
        advantage. Allocation is always relatively expensive, and
        accessing it requires going through the C API,

  This is basically all incorrect; please see above. Accessing Bigarrays
  can be done via builtin compiler primitives that make it fast. And the
  point of using them is to avoid multiple small allocations, especially
  on the read path.


Guillaume Munch-Maccagnoni asked
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌

        • What kind of latency do you target, and what kind of
          latency does OCaml allows you to achieve? Are there
          concrete evaluations about it in the context of
          MirageOS? (Bonus internet points if they are public so
          as to be referenced in a paper, that would be very
          helpful to me!)

  The basic approach to low latency OCaml hasn't really changed much in
  the last few decades. You just need to minimise allocation to maximise
  GC throughput, and OCaml makes it fairly easy to write that sort of
  low level code.  Two papers that might be helpful:

  • ["Melange: Towards a functional internet"], EuroSys 2007. Contains a
    latency analysis of an SSH and DNS server _vs_ C equivalents, and
    some techniques on writing low-latency protocol parsers.  These
    days, we do roughly the same thing with ppx's and cstructs, without
    the DSL in the way.
  • ["Jitsu: Just-in-Time Summoning of Unikernel;s"], NSDI 2015.  This
    shows the benefits of whole-system latency control – you can mask
    latency by doing some operations concurrently, which is easy to do
    in unikernels and hard in a conventional OS.

  We've never really built systems in the "soft realtime" sense so far –
  for example no video transmission system or isochronous Bluetooth
  implementations. Internet protocols are very resilient to variable
  latency, although of course we want to keep things as low as possible.
  I've been looking into multipath multicast video transmission in
  Mirage recently due to the current work-at-home situation, so that
  might change soon depending on how it goes :slight_smile:

  One thing that has changed in the past decade is the [steadily
  improving latency profile] of the OCaml GC, which has only been
  improving thanks to @damiendoligez's steady work. That has let us get
  away with not directly addressing latency much in Mirage itself, as
  every upgrade of the compiler is a pleasant improvement.


["Melange: Towards a functional internet"]
https://www.tjd.phlegethon.org/words/eurosys07-melange.pdf

["Jitsu: Just-in-Time Summoning of Unikernel;s"]
https://www.usenix.org/system/files/conference/nsdi15/nsdi15-paper-madhavapeddy.pdf

[steadily improving latency profile]
https://blog.janestreet.com/building-a-lower-latency-gc/


Calascibetta Romain said
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌

        I just would like to add a _pro_ about `bigarray', due to
        the fact that a `bigarray' can not move in your heap, we
        have the ability to release the runtime lock for some
        computations such as _hash algorithms_ as `digestif' does:

        [https://github.com/mirage/digestif/pull/70]

        About MirageOS, we currently mostly use [`cstruct'] which
        has an other difference with `bigarray', the underlying
        record. Such design is to be more efficient when we do a
        `sub' operation as @ivg said here:
        [https://discuss.ocaml.org/t/working-with-a-huge-data-chunks/3955/10?u=dinosaure]

        However, the question to choose `Bytes.t' or `Cstruct.t'
        (or `Bigstring.t' ) is a bit hard and it really depends on
        your context - and, as @xavierleroy said :slight_smile: :

        > Mirage people don’t seem to care, as they allocate small
          bigarrays like crazy.

  And indeed, @xavierleroy is right that we allocate like crazy, with
  the caveat that this only really happens on the transmission path of
  most protocols. Reads tend to go through a more minimal copy
  discipline.

  We certainly do care about this, but it has to be fixed upstream in
  OCaml as we have reached the limits of what we can practically do with
  Bigarray – I am hoping that multicore OCaml is the perfect time to
  [unify all these IO approaches] in that direction as part of that
  effort. Mirage will benefit from whatever happens there eventually.


[`cstruct'] https://github.com/mirage/ocaml-cstruct

[unify all these IO approaches]
https://discuss.ocaml.org/t/ann-a-dynamic-checker-for-detecting-naked-pointers/5805/15?u=avsm


A dynamic checker for detecting naked pointers
══════════════════════════════════════════════

  Archive:
  [https://discuss.ocaml.org/t/ann-a-dynamic-checker-for-detecting-naked-pointers/5805/1]


KC Sivaramakrishnan announced
─────────────────────────────

  We're happy to release an OCaml compiler switch for dynamically
  detecting naked pointers in the code.


Naked pointers in OCaml
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌

  A naked pointer is a pointer outside the OCaml heap without a valid
  header. A header outside the heap is said to be valid if it is colored
  black. OCaml does [permit naked pointers] to word-aligned addresses.
  However, the presence of naked pointers incurs overhead in the garbage
  collector (GC). Whenever the GC intends to follow a pointer, it must
  check that the pointer is indeed in the OCaml heap. The GC consults a
  page table that maintains the list of pages currently used by the heap
  and only follows the pointer if it belongs to one of the pages. As you
  can imagine, this adds some overhead in the GC. For the multicore GC,
  maintaining a page table that remains consistent when multiple domains
  are allocating and running GC in parallel would necessitate some
  synchronization around the page table for reading and writing to
  it. It is quite likely that this cost will be prohibitive.

  Luckily, OCaml already has a `no-naked-pointer' mode where the
  compiler *assumes* that the code does not have naked pointers, and
  hence, does not consult the page table for following pointers during
  GC ([except `Closure_tag' objects]). The `no-naked-pointer' mode is a
  configure-time option, enabled by configuring the compiler with
  `--disable-naked-pointers'. Multicore OCaml compiler does not use a
  page table in its implementation currently.


[permit naked pointers]
https://caml.inria.fr/pub/docs/manual-ocaml/intfc.html#ss:c-outside-head

[except `Closure_tag' objects] https://github.com/ocaml/ocaml/pull/8984


Dynamic Check for naked pointers
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌

  With the aim of migrating to `no-naked-pointer' mode as the default in
  future releases of OCaml, eventually paving the way for upstreaming
  multicore support, we're happy to release a variant of OCaml 4.10.0
  with a dynamic checker for the presence of naked pointers in the
  code. [OCaml PR#9534] has the discussion around this checker. This
  variant can be installed with:

  ┌────
  │ $ opam update
  │ $ opam switch create 4.10.0+nnpcheck
  │ $ eval $(opam env)
  └────

  Once the variant is installed, you can install your favorite libraries
  using `opam' and run your program to get a report of naked
  pointers. Let us look at an example. We know that `frama-c' has naked
  pointers.

  ┌────
  │ $ opam install frama-c
  │ $ frama-c
  │ Out-of-heap pointer at 0x55fc1e2754d8 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e275600 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e2754d8 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e275600 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e2754d8 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e275600 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e2754d8 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e275600 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e2754d8 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e275600 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e2754d8 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e275600 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  └────

  The checker prints warnings to standard error with the address that
  contains the naked pointer, the naked pointer and the reason why the
  warning was raised.


[OCaml PR#9534] https://github.com/ocaml/ocaml/pull/9534


Finding the sources
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌

  While the warnings are useful for indentifying that the program has
  naked pointer, it does not help with finding the source of the naked
  pointer in code. For this, we recommend the use of [`rr']. `rr' is
  record and replay framework that wraps around the familiar `gdb'
  interface. We can debug the error above as follows:

  ┌────
  │ $ rr frama-c
  │ rr: Saving execution to trace directory ~/home/kc/.local/share/rr/frama-c-5'.
  │ Out-of-heap pointer at 0x55fc1e2754d8 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e275600 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e2754d8 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e275600 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e2754d8 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e275600 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e2754d8 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e275600 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e2754d8 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e275600 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e2754d8 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ Out-of-heap pointer at 0x55fc1e275600 of value 0x55fc1e3a0cc0 has non-black head (tag=144)
  │ $ rr replay
  │ (rr) watch *(value*)0x55fc1e2754d8
  │ Hardware watchpoint 1: *(value*)0x55fc1e2754d8
  │ (rr) c
  │ Continuing.
  │ 
  │ Hardware watchpoint 1: *(value*)0x55fc1e2754d8
  │ 
  │ Old value = 1
  │ New value = 94541327240384
  │ 0x000055fc1dab48f8 in camlUnmarshal__entry () at src/libraries/datatype/unmarshal.ml:72
  │ 72      src/libraries/datatype/unmarshal.ml: No such file or directory.
  └────

  This corresponds to the naked pointer at
  [https://github.com/Frama-C/Frama-C-snapshot/blob/master/src/libraries/datatype/unmarshal.ml#L72].


[`rr'] https://github.com/mozilla/rr


Fixing naked pointers
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌

  The recommended way of fixing naked pointers is to [wrap them in an
  OCaml object with `Custom_tag' or `Abstract_tag' (as appropriate)].


[wrap them in an OCaml object with `Custom_tag' or `Abstract_tag' (as
appropriate)]
https://caml.inria.fr/pub/docs/manual-ocaml/intfc.html#ss:c-outside-head


Limitations
╌╌╌╌╌╌╌╌╌╌╌

  The dynamic analysis only work on AMD64 backend with GCC and Clang. It
  has been known to work on Linux and MacOS.  `rr' currently requires an
  Intel CPU with Nehalem (2010) or later microarchitecture.


Credits
╌╌╌╌╌╌╌

  The analysis was originally proposed by Mark Shinwell (@mshinwell).


KC Sivaramakrishnan added
─────────────────────────

  As @gasche had mentioned earlier, the no-naked-pointers mode was
  already there in OCaml and it is known to work on all the platforms
  that OCaml was supported. Hence, it was a reasonable path to pursue
  for Multicore.

  The concurrent minor collector in Multicore OCaml uses the virtual
  address space trick, but only for the minor heap area. It needs
  contiguous 4GB reserved for 128 domains, each with max 16MB minor heap
  arena. This can be modified at compiler configure time. For comparison
  the minor heap is 2MB by default in OCaml and so 16MB should be quite
  enough.  We hadn't considered this trick for the major heap in
  Multicore.

  However, given our experimental evaluation (see [paper]), we have
  chosen not to pursue concurrent minor collector for the initial
  version of multicore support to be upstreamed. The alternative
  stop-the-world parallel minor collector scales better and does not
  break the C FFI. The parallel minor collector does not need the
  virtual address space trick.

  Given that the space for the entire heap should be reserved, how would
  it work on 32-bit architectures, and does it have an impact on system
  tooling. Looking forward to reading @gadmm's RFC.


[paper] https://arxiv.org/abs/2004.11663


Stephen Kell asked and Anil Madhavapeddy replied
────────────────────────────────────────────────

        As someone with an interest in cross-language interop, I
        like naked pointers. So I’m interested in design choices
        that might make the “our heap or not?” check fast, and in
        reasons why impls might not go for them. Feel free to
        answer “read the paper”, but I was thinking you could do
        something like reserve a big contiguous chunk of VAS for
        OCaml heaps’ use, and then the test could be a simple
        shift and compare. Would that be viable?

  Please do note that this isn't a performance improvement for OCaml –
  this very much a correctness fix.  The failure case is as follows:

  • a naked pointer is created using `malloc' on the C heap and held in
    the OCaml heap
  • the external region is `free''d, but the naked pointer is still held
    in some OCaml heap.
  • the GC ~malloc~s to expand, and that recently freed C memory becomes
    part of the OCaml heap
  • the GC then follows the naked pointer by treating it as an OCaml
    value, since the page table indicates that it is within the OCaml
    heap.  However, the memory the naked pointer is aimed at is not
    necessarily a valid OCaml value as it was formerly a C pointer.
  • memory corruption ensues

  The only way to really avoid this is by only holding naked references
  to static or global C values, which is a pretty minority usecase. As
  @lpw25 notes, you can hold them safely by wrapping them in custom
  blocks, which is entirely safe as it gives the GC a reliable way to
  determining what's going on.

  As for the question about a contiguous VA, this should work fine on
  64-bit, where you have the luxury of such use of the address space. I
  built a version of this a decade ago for OCaml/Xen in early Mirage,
  which you can find evaluated in the [HotCloud 2010 paper] (Figure 4).
  It's pretty straightforward, but the problems come from balancing
  external memory pressure (from C allocations) with the OCaml
  allocation.  This can be adjusted with an obvious use of `sbrk' or
  `realloc' to grow or shrink the contiguous memory, while being careful
  to keep other memory allocations away from the OCaml area.

  The current strategy will need to be maintained for 32-bit
  architectures however, which are very much supported (e.g.  armv7).
  For those, there is very little wiggle room to hold a contiguous VA
  and so the current multicore approach lets us preserve a unified
  memory representation.

  One observation I had when I read @stephenrkell's excellent essay is
  how strange our current memory allocation mechanisms are in operating
  systems.  We have conflated cooperative scheduling across components
  with enforcing protection from mutually untrusted control flow in the
  same language.  For example, we have the system C malloc competing
  with the OCaml GC which competes with the kernel memory allocator.
  I've been sketching out a possible solution in multicore OCaml towards
  this:

  • We move away from `Bigarray' to a specialised `Extvalue' that
    handles external pages in a separate region of memory.  Bigarray
    currently offers too much functionality (subarrays and proxies)
    which slows it down due to dropping into the C FFI.
  • The `Extvalue' is backed by a bundled slab allocator that works in a
    contiguous region of memory, disjoint from the OCaml heap.
  • The compiler provides primitives for very fast translation of values
    in and out of the `Extvalue' (as it does currently for `Bigarray').
  • C libraries linked in with OCaml also use this memory allocator for
    their own mallocs. This will require some trickery (static
    compilation or LD_PRELOAD initially), but it means that all the
    allocations associated with a particular "task" (from OCaml to C or
    Rust code) can be batched together.
  • This approach lets us improve multicore memory locality greatly, as
    every modern machine has significant NUMA effects (see this [FOSDEM
    2013 talk]), and cooperatively allocate memory. It also leaves open
    the possibility of separate isolation mechanisms (such as ARM memory
    domains or Intel MPK) _across_ tasks in a large heap.

  Please note that the above is still only at the experimental stage as
  I'm still evaluating it, but it does have the advantage of degrading
  gracefully if the system malloc has to be used (e.g. if OCaml is
  embedded as a library, noone expects 10GBs gigabit levels of network
  performance).  From an ecosystem perspective, I don't think anyone
  really wants to maintain the current hybrid world of a multitude of
  `Bigarray'-based overlays, such as cstruct or bigstring.


[HotCloud 2010 paper]
http://mort.io/publications/pdf/hotcloud10-lamp.pdf

[FOSDEM 2013 talk] https://www.youtube.com/watch?v=Ss4pUbq09Lw


ANN: Releases of ringo
══════════════════════

  Archive: [https://discuss.ocaml.org/t/ann-releases-of-ringo/5605/2]


Raphaël Proust announced
────────────────────────

  Version 0.4 of `ringo' is now available in `opam'. This version
  includes bug-fixes, minor (sometimes breaking) interface and semantics
  improvements, and, most importantly, a `ringo-lwt' package.

  `ringo-lwt' provides wrapper for using caches in an Lwt-heavy
  application. Specifically, it provides a functor that transform a
  Ringo cache into a Ringo-lwt cache featuring:

  • `val find_or_replace : 'a t -> key -> (key -> 'a Lwt.t) -> 'a Lwt.t'
    which helps avoid race conditions,
  • _automatic cleanup_ by which promises that are rejected are removed
    from the table automatically.

  Additional functors for option (with automatic cleanup of `None') and
  result (with automatic cleanup of `Error') are also provided.


Solidity parser in OCaml with Menhir
════════════════════════════════════

  Archive:
  [https://sympa.inria.fr/sympa/arc/caml-list/2020-05/msg00026.html]


David Declerck announced
────────────────────────

  We just released a parser & printer for the Solidity language:
  [https://medium.com/dune-network/a-solidity-parser-in-ocaml-with-menhir-e1064f94e76b]

  Solidity is one of the most used languages for smart contracts, and
  popularized by the Ethereum blockchain. This work is a step towards
  native support of Solidity in the Dune Network blockchain, and
  developed in a partnership between Origin-Labs and OCamlPro. The
  library is released under LGPLv3 with Static Linking exception.


Browsing source with merlin (and tuareg) the right way?
═══════════════════════════════════════════════════════

  Archive:
  [https://discuss.ocaml.org/t/browsing-source-with-merlin-tuareg-the-right-way/5776/2]


Luc_ML asked
────────────

  Browsing the source of your own program and libraries and of other
  people's libraries is a key for being able to smoothly program and
  also to attract more people to OCaml.

  Am I the only one that find this it so archaic programming in OCaml
  with Emacs/Tuareg? (compared to other mainstream PLs IDE and
  (integrated) tooling).

  Can you share your (Emacs) OCaml IDE setup or give some advice?  This
  should also be of interest for new comers to OCaml that may find IDE
  support neither easy nor fantastic.


Anton Kochkov
─────────────

  You can check out:
  • [Visual Studio Code] + [OCaml platform plugin] - see
    [https://discuss.ocaml.org/t/ann-vscode-platform-plugin-0-5-0/5752]
  • Emacs + [lsp-mode] + [ocaml-lsp] - for now it's experimental though.

  [https://aws1.discourse-cdn.com/standard11/uploads/ocaml/original/2X/3/3dc0a6e4735273399a0c61a25806a6e8ac327ab6.png]


[Visual Studio Code] https://code.visualstudio.com/

[OCaml platform plugin]
https://marketplace.visualstudio.com/items?itemName=ocamllabs.ocaml-platform

[lsp-mode] https://github.com/emacs-lsp/lsp-mode

[ocaml-lsp] https://github.com/ocaml/ocaml-lsp


New release of Cucumber ML 1.0.3
════════════════════════════════

  Archive:
  [https://discuss.ocaml.org/t/ann-new-release-of-cucumber-ml-1-0-3/5813/1]


Christopher Yocum announced
───────────────────────────

  I am pleased to announce the release of [Cucumber ML] 1.0.3.  Cucumber
  ML is a library that brings [Behavior Driven Development] to OCaml via
  [Cucumber].  Essentially, Cucumber is a way to communicate using plain
  language between software development teams and non-developer
  stakeholders that can be turned into code to be executed.

  This release updates the underlying dependency on the gherkin language
  parser, [gherkin-c] up-to-date with the latest version of that library
  (7.0.4).  This will deal with those pesky compile errors.  Just a note
  here, that you will need to installed the gherkin parser as a shared
  object (aka a shared library) on your system for Cucumber ML to link
  against.


[Cucumber ML] https://github.com/cucumber/cucumber.ml

[Behavior Driven Development]
https://en.wikipedia.org/wiki/Behavior-driven_development

[Cucumber] https://docs.cucumber.io/

[gherkin-c] https://github.com/cucumber/gherkin-c

Roadmap
╌╌╌╌╌╌╌

  There are a bunch of things that I could be doing and here are a
  couple that I will be thinking about in the near future:

  • Releasing the library via OPAM for ease of install
  • A more flexible Reporting structure that user can extend via
    functors with some sensible defaults to choose from


New OCaml books?
════════════════

  Archive: [https://discuss.ocaml.org/t/new-ocaml-books/5789/5]


Continuing this thread, Daniil Baturin announced
────────────────────────────────────────────────

  I'm working on a free culture book. The preview is at
  [https://ocaml-book.baturin.org] and the source is at
  [https://github.com/dmbaturin/ocaml-book]

  It's under CC-BY-SA so it belongs to the community—it can be a living
  document that people can keep up to date even if original authors
  abandon it. It's also supposed to be a collaborative project, but
  almost no one is collaborating so far. ;)


Integer division behaviour
══════════════════════════

  Archive:
  [https://discuss.ocaml.org/t/integer-division-behaviour/5815/1]


Daniil Baturin asked
────────────────────

  Number theoretically correct integer division is supposed to work so
  that `(N / K) + (N mod K) = N'.  I was very surprised to see that it's
  not how `(/) : int → int' works!

  ┌────
  │ # 3 mod 2 ;;
  │ - : int = 1
  └────

  Now, two questions. What is the justification for this behaviour? And
  does anything provid real integer division?


Gaëtan Gilbert corrected
────────────────────────

  Surely you mean `((N / K) * K) + (N mod K) = N'?


Aaron L. Zeng replied
─────────────────────

  I assume you meant to include an example with negative numbers?

  ┌────
  │ # 3 mod 2;;
  │ - : int = 1
  │ # (-3) mod 2;;
  │ - : int = -1
  │ # 3 mod (-2);;
  │ - : int = 1
  │ # (-3) mod (-2);;
  │ - : int = -1
  └────

  The `mod' operator always has the same sign as the numerator.  I think
  this is for historical reasons, although I don't know whether to point
  the finger at C, or x86, or something even earlier.

  If you use Base, the `%' operator gives you the Euclidean modulo
  operator that I think you're looking for.  Its result always has the
  same sign as the denominator.  This operator is basically equivalent
  to:

  ┌────
  │ let (%) x y =
  │   let z = x mod y in
  │   if z < 0 then z + y else z
  └────


threepwood also replied
───────────────────────

  This is a "feature" in most programming languages and I think actually
  corresponds to the standard way division is implemented in the CPU
  itself (so it has little to do with OCaml). How this was allowed to
  become the standard I do not know.

  One thing is that I get the impression that people who are not
  familiar with number theory find the following result extremely
  counter-intuitive : `(-3) / 2 = -2'

  I believe it is because they think of integer division as an
  approximation of real division, rather than as being its own special
  thing, and from this perspective it makes no sense that making a
  number negative should change the result. They expect the identities
  that hold of real division (like `(a*b) / c = a * (b/c)') to also hold
  for integer division. (I say "they" not to belittle the perspective, I
  totally see where they are coming from.)

  But then if you have `(-3) / 2 = -1', you need to have `(-3) mod 2 =
  -1' to preserve the relation between `/' and `mod' that you mention
  (so you'll note that the relation does hold in this system).

  I tend to think that the behaviour where `mod' never returns anything
  negative, in addition to being what a mathematician would expect, is
  strictly more useful (for what I believe to be the typical use case of
  modulo over negative numbers in programming, which is indexing into a
  circular buffer). And I also think that you almost never divide
  negative numbers, so the useful behaviour for `mod' should have taken
  priority when deciding how all this works, and whether `/' is
  intuitive or not does not matter much in practice. But I have no idea
  who took that decision and whether such issues were even considered.


Daniel Bünzli then said
───────────────────────

  [This paper] which discusses various definitions of `div' and `mod' in
  programming languages may be of interest.


[This paper] https://dl.acm.org/doi/pdf/10.1145/128861.128862


threepwood replied
──────────────────

  Thanks for this! So the one found in OCaml and most languages is
  T-division (for truncating) and the one I called more useful is
  E-division (for Euclidean) which the paper argues for. It says that
  T-division is found in Ada, that Lisp has two modulo operators, one
  that does T-division and one that does F-division (halfway between T
  and E, and works for the circular buffer case), and that Algol and
  Pascal break the relation between div and mod by doing T-division for
  div and E-division for mod (if I got it right). Interesting stuff.


New release of tablecloth
═════════════════════════

  Archive:
  [https://discuss.ocaml.org/t/ann-new-release-of-tablecloth/5818/1]


Paul Biggar announced
─────────────────────

  I’ve just released a new version of [tablecloth] - an easy-to-use,
  comprehensive standard library that has the same API on all
  OCaml/ReasonML/Bucklescript platforms.

  0.0.7 is a pretty decent release, including many new functions in the
  List, Array, Int, Float, Option, and Result modules, as well as the
  addition of a new Fun module and support for the latest version of
  bs-platform for the bucklescript version of tablecloth.

  See [the tablecloth github repo] for installation instructions, or
  read the full [changelog], or the [original announcement] of
  tablecloth for motivation.

  In addition, Dean Merchant and I have agreed to merge his [Standard]
  library into tablecloth. Dean has done a significant amount of the
  work in tablecloth since the original release, and we plan to release
  a version 0.0.8 after merging the two code bases together. Dean is now
  a maintainer of tablecloth.


[tablecloth] https://github.com/darklang/tablecloth

[the tablecloth github repo] https://github.com/darklang/tablecloth

[changelog]
https://github.com/darklang/tablecloth/blob/master/Changelog.md

[original announcement]
https://medium.com/darklang/tablecloth-a-new-standard-library-for-ocaml-reasonml-d29a73a557b1

[Standard] https://github.com/Dean177/reason-standard


Language abstractions and scheduling techniques for efficient execution of parallel algorithms on multicore hardware
════════════════════════════════════════════════════════════════════════════════════════════════════════════════════

  Archive:
  [https://discuss.ocaml.org/t/language-abstractions-and-scheduling-techniques-for-efficient-execution-of-parallel-algorithms-on-multicore-hardware/5822/1]


Arthur Charguéraud announced
────────────────────────────

  The Multicore OCaml team has made significant progress in the recent
  years. There now seems to be interest in working on the high-level
  parallelism constructs. Such constructs are also tightly connected to
  the problem of controlling the granularity of parallel tasks.

  I've been working on parallel constructs and granularity control from
  2011 to 2019, together with Umut Acar and Mike Rainey. We published a
  number of papers, each of them coming with theoretical bounds, an
  implementation, and evaluation on state-of-the-art benchmark of
  parallel algorithms.

  While we mainly focused on C++ code, I speculate that nearly all of
  our ideas could be easily applied to Multicore OCaml. Porting these
  ideas would deliver what seems to be currently missing in Multicore
  OCaml for efficiently implementing a large class of parallel
  algorithms.

  Gabriel Scherer and François Pottier recently suggested to me that it
  appears timely to share these results with the OCaml community. I'll
  thus try to give an easily-accessible, OCaml-oriented introduction to
  the results that we have produced. Note, however, that most of the
  ideas presented would apply essentially to another other programming
  language that aims to support nested parallelism.

  I plan to cover the semantics of high-level parallelism constructs, to
  describe and argue for work-stealing scheduling, to present a number
  of tricks that are critical for efficiency, and to advertise for our
  modular, provably-efficient approach to granularity control. I'll post
  these parts one after the other, as I write them.

  • [Part 1 in PDF]
  • [Other formats]

  Other parts will be published in the coming weeks or months.


[Part 1 in PDF]
http://www.chargueraud.org/research/2020/multicore/src/part1.pdf

[Other formats]
http://www.chargueraud.org/research/2020/multicore/index.php


Release 1.2 of HoCL
═══════════════════

  Archive: [https://discuss.ocaml.org/t/ann-release-1-2-of-hocl/5837/1]


jserot announced
────────────────

  This is to announce release 1.2 of [HoCL], a functional language for
  describing dataflow process networks.

  *HoCL*
  • can describe *hierarchical* and/or *parameterized* graphs
  • support two styles of description : *structural* and *functional*
  • use *polymorphic type inference* to check graphs
  • supports the notion of *higher order wiring functions* for
    describing and encapsulating *graph patterns*
  • supports several dataflow semantics (SDF, PSDF, ..) by means of
    annotations.

  *HoCL* is entirely written in *OCaml*.

  Documentation (including a [tutorial], the underlying formal
  [semantics] and a general introduction on the [principles] of
  functional graph description) can be found [here].


[HoCL] https://github.com/jserot/hocl

[tutorial] https://github.com/jserot/hocl/tree/master/doc/tutorial.pdf

[semantics] https://github.com/jserot/hocl/tree/master/doc/semantics.pdf

[principles] https://github.com/jserot/hocl/tree/master/doc/fgd.pdf

[here] https://github.com/jserot/hocl/tree/master/doc


Other OCaml News
════════════════

From the ocamlcore planet blog
──────────────────────────────

  Here are links from many OCaml blogs aggregated at [OCaml Planet].

  • [Every proof assistant: Beluga]
  • [TLS 1.3 support for MirageOS]
  • [A Solidity parser in OCaml with Menhir]


[OCaml Planet] http://ocaml.org/community/planet/

[Every proof assistant: Beluga]
http://math.andrej.com/2020/05/25/mechanizing-meta-theory-in-beluga/

[TLS 1.3 support for MirageOS] https://mirage.io/blog/tls-1-3-mirageos

[A Solidity parser in OCaml with Menhir]
http://www.ocamlpro.com/2020/05/19/ocaml-solidity-parser-with-menhir/


Old CWN
═══════

  If you happen to miss a CWN, you can [send me a message] and I'll mail
  it to you, or go take a look at [the archive] or the [RSS feed of the
  archives].

  If you also wish to receive it every week by mail, you may subscribe
  [online].

  [Alan Schmitt]


[send me a message] mailto:alan.schmitt at polytechnique.org

[the archive] http://alan.petitepomme.net/cwn/

[RSS feed of the archives] http://alan.petitepomme.net/cwn/cwn.rss

[online] http://lists.idyll.org/listinfo/caml-news-weekly/

[Alan Schmitt] http://alan.petitepomme.net/

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.idyll.org/pipermail/caml-news-weekly/attachments/20200526/7d605db7/attachment-0001.html>


More information about the caml-news-weekly mailing list