[cwn] Attn: Development Editor, Latest OCaml Weekly News

Alan Schmitt alan.schmitt at polytechnique.org
Tue May 14 00:07:47 PDT 2019


Here is the latest OCaml Weekly News, for the week of May 07 to 14,
2019.

Table of Contents

Bimap (bi-directional map) implementation
Modules that extend modules from third-party packages
PSA: dns 2.0 – a new udns era dawns
Next OUPS meetup May 21st 2019
Other OCaml News

Bimap (bi-directional map) implementation


paul announced

  An implementation of a bimap: a bi-directional map that client code
  can use to look up keys given values, as well as the traditional
  lookup of values given keys. It is implemented using functors, as
  both a class and a module, with support for multi-maps as well as
  single-valued maps. The master branch uses Core. A no-core branch is
  a work in progress and needs rewriting. OUnit testing is also
  included.
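
  To make the idea concrete, here is a minimal sketch of a
  single-valued bimap built from two synchronized standard-library
  maps. It is not the announced library's API: the names `bimap',
  `add', `find_value' and `find_key' are assumptions made up for this
  illustration, and keys and values are fixed to `int' and `string'.

```ocaml
(* Hypothetical sketch, not the announced library: a single-valued
   bimap from int keys to string values, kept as two maps. *)
module Int_map = Map.Make (struct type t = int let compare = compare end)
module String_map = Map.Make (String)

type bimap = {
  forward : string Int_map.t;  (* key -> value *)
  reverse : int String_map.t;  (* value -> key *)
}

let empty = { forward = Int_map.empty; reverse = String_map.empty }

(* Adding a binding first drops any stale binding of [k] or [v], so
   that both directions stay in sync. *)
let add k v m =
  let m =
    match Int_map.find_opt k m.forward with
    | Some old_v -> { m with reverse = String_map.remove old_v m.reverse }
    | None -> m
  in
  let m =
    match String_map.find_opt v m.reverse with
    | Some old_k -> { m with forward = Int_map.remove old_k m.forward }
    | None -> m
  in
  { forward = Int_map.add k v m.forward;
    reverse = String_map.add v k m.reverse }

(* Traditional lookup, and the reverse lookup a bimap adds. *)
let find_value k m = Int_map.find_opt k m.forward
let find_key v m = String_map.find_opt v m.reverse
```

  A multi-map variant would store a set of keys per value instead of a
  single key.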


Modules that extend modules from third-party packages


Matt Windsor asked

  What is the recommended way to structure modules that add extensions
  onto other modules that come from external packages (over which you
  have no control)?

  How do you then structure those modules so that they can themselves
  be extended, and/or that the extensions can be taken out separately
  and, say, applied on top of _other_ extensions or modifications to
  those libraries (say, if I target `Base', being able to apply the
  extensions to `Core_kernel')?

  Currently I'm doing something like this, where I want to add 
  things to
  a module in `Base':

  │ (* base_exts/bar.mli *)
  │ include module type of Base.Bar
  │ module Extensions : sig
  │   val foo : t -> Something.t -> Something_else.t -> t Base.Or_error.t
  │ end
  │ include module type of Extensions
  │ (* base_exts/bar.ml *)
  │ include Base.Bar (* [!] *)
  │ module Extensions = struct
  │   let foo (bar: t) (baz: Something.t) (barbaz: Something_else.t)
  │     : t Base.Or_error.t = (* do something *)
  │ end

  If I then want to re-apply the same extensions to the `Core_kernel'
  version of `Bar', I'll just import `Base_exts.Bar.Extensions' over an
  include of `Core_kernel.Bar'.  The extensions still depend on the
  `Base' implementation of everything, but that shouldn't matter as long
  as the `t' agrees?
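
  The layering described above can be sketched without any external
  dependency. In this illustration `Base_bar' and `Core_bar' are
  hypothetical stand-ins for `Base.Bar' and `Core_kernel.Bar', and
  `push_twice' is an invented extension; the only point is that one
  `Extensions' module can be included over both, provided the type `t'
  agrees.

```ocaml
(* Hypothetical stand-ins for two third-party modules that share a
   type, in the way one library re-exports another. *)
module Base_bar = struct
  type t = int list
  let empty : t = []
  let push x (xs : t) : t = x :: xs
end

module Core_bar = struct
  include Base_bar
  let size (xs : t) = List.length xs
end

(* The extensions are written once, against the smaller module... *)
module Extensions = struct
  let push_twice x xs = Base_bar.push x (Base_bar.push x xs)
end

(* ...and layered over either module by a second include. *)
module Base_bar_ext = struct
  include Base_bar
  include Extensions
end

module Core_bar_ext = struct
  include Core_bar
  include Extensions
end
```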

  It occurs to me (after trying to publish an opam package with this
  setup, natch) that this might be a *Very Bad Idea*:

  • That include may _very well_ be copying the whole body of `Base.Bar'
    into my module.  (I'm not sure how includes work, but that'd be the
    most semantically obvious thing to do.)  I definitely don't want to
    be distributing half of, say, Jane Street's libraries in my
    packages, for obvious infrastructural and legal reasons!  I've seen
    parts of `Core_kernel' include parts of `Base' like this, but it may
    be that, since they're two parts of the same library family, this is
    OK to do in that situation.
  • `odoc' seems to be picking up the entire API surface of `Base.Bar'
    when I do the above.  This certainly isn't what I want: I want to be
    loosely saying 're-export everything that `Base.Bar' exports', not
    're-export this specific thing and this specific thing and then this
    specific thing', _especially_ since the latter ties the
    documentation to a specific version of the external library.

  So far I've seen two other approaches:

  1. Don't re-export anything, just provide extensions to be used as a
     separate module alongside the original one.  This is what I used to
     do, and what I think `Core_extended' and its various spinoff
     libraries do?, but I got sick of having to remember which function
     was in `Base.Bar' and which was in `Base_exts.My_bar'.  In
     hindsight this was probably an acceptable compromise (and
     having to remember which things are in the other library and which
     are the extensions), and I might revert back to it.
  2. `Core_kernel' sometimes includes `Base' modules in the form
     │ type t = Base.Foo.t [@@deriving stuff]
     │ include (Base.Foo : (module type of struct include Base.Foo
     │          end with type t := t))
     I'm not sure whether the indirection of hiding `Base.Foo''s 
     inside another module type has any purpose other than 
     enabling the
     re-declaration of `t', but, if so, is this relevant to what I'm
     looking at?
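
  For readers unfamiliar with the second form, here is a small
  self-contained sketch of it. `Third_party_foo' is a hypothetical
  stand-in for `Base.Foo', and `add_two' is an invented extension. The
  destructive substitution `with type t := t' removes the included
  module's own `t' from the signature, so that the `t' declared just
  above is the only one in the resulting module.

```ocaml
(* Hypothetical stand-in for a third-party module such as Base.Foo. *)
module Third_party_foo = struct
  type t = int
  let zero : t = 0
  let succ (x : t) : t = x + 1
  let to_string (x : t) = string_of_int x
end

module Foo = struct
  (* Re-declare the type first (this is where a deriver could go)... *)
  type t = Third_party_foo.t

  (* ...then include everything else, with the original [t] replaced
     by the one declared above. *)
  include (Third_party_foo :
             (module type of struct include Third_party_foo end
               with type t := t))

  (* An extension defined on top of the included values. *)
  let add_two x = succ (succ x)
end
```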

  Other suggestions very welcome :slight_smile:

Ivan Gotovchits replied

        What is the recommended way to structure modules that add
        extensions onto other modules that come from external
        packages (over which you have no control)?

  1. fork the project
  2. extend the module
  3. (optionally, but necessary) submit the patch upstream (aka a pull
     request)

  Not really the answer you were looking for? Then read below. Modules
  in OCaml are not extensible, they are closed structures, like final
  classes in Java, that are not extensible _by design_. OCaml modules
  are not namespaces. OCaml doesn't have namespaces¹ and modules are not
  a substitute for namespaces. Trying to use modules as namespaces
  will leave both parties unhappy, you and OCaml.

  Yes, it is harsh, and namespaces is the feature I miss the most when
  I'm developing large programs in OCaml². However, let's look
  into the program model of OCaml to understand why this is so
  and whether there is a right way to code in OCaml and be happy.

  There are two kinds of modules in OCaml, structures and functors.
  Your question is more about the former. OCaml is a language of
  mathematics, where structures denote _algebras_, i.e., tuples of
  functions attached to a set. In mathematics there is only one algebra
  of integers.  You can't have Janestreet's arithmetics, Matt's
  arithmetics, or Ivan's arithmetics. If you do, then those are
  different algebras with different laws, and therefore they have
  different structures. In other words, OCaml wasn't really designed
  that way, it is the essence of mathematics, the vision
  that we, the humanity, have developed so far. OCaml just follows
  this approach, no more no less. And this is where mathematics clashes
  with its offspring, programming. Yes, as software developers we need
  namespaces, as we need to reuse software components developed by
  others, we want to build systems from packages, like engineers
  building complex structures from existing building blocks. This is not
  something that mathematics is really offering us, instead it gives us
  the [theory of categories] and [homotopy type theory], that are
  orthogonal to the design patterns of software engineering.  The
  difference of programming as a branch of mathematics is that it has a
  much lower entrance barrier (you do not need to learn category theory
  to program) and is much more rapidly developing³. Like it or not,
  programming is still mathematics and therefore we have to play by the
  rules of mathematics.

  With all that said, you can still develop software and apply all
  modern software design patterns in OCaml. Just keep in mind that a
  module is not a namespace, not a package, not a component. It is a
  mathematical structure which is fixed. It is a tuple of values. So
  keep those values as they are and build new values from them,
  rather than trying to destructively substitute them. But before we
  start to explore the design space, I need to bring here two asides, so
  that we can develop some context for reasoning.

[theory of categories] 

[homotopy type theory] <https://homotopytypetheory.org/book/>

Aside: The OCaml program model

  It would be interesting to look inside of OCaml to understand how
  modules and functions are actually implemented, what semantics the
  include statement has and so on. In OCaml the values are not
  referenced by names, unlike Common Lisp, which is the language that
  indeed offers proper namespacing. In fact, in OCaml values are not
  referenced at all, there is no such kind of indirection. Values are
  passed directly to each other. This is a true call-by-value,
  do-by-value, apply-by-value language. When we see an expression `f x
  y', it is a _value_ `f' which is applied to the values `x' and
  `y'. Not a function named `f'. When we say `List.length' it is not
  treated as `["List"; "length"]', it is always and directly resolved to
  a concrete value of the `camlStdlib__list_length' function, which is a
  piece of code⁴. A module, e.g., `List' is a record (tuple) of
  pointers. When you do `include List' you create a new tuple and copy
  (as with memcpy) the contents of `List' into the new tuple. When you
  create an implementation of a compilation unit, in other words, when
  you compile an `ml' file, you are actually creating a tuple of values,
  or a structure. The interesting and very important part here is
  that a compilation unit is implicitly parametrized by all modules that
  occur free in your compilation unit. In other words, when you create a
  file `example.ml' with the following contents,

  │ let list_length = List.length

  and compile it to code, then the code itself will not contain the
  `List.length' value. Essentially, `example.cmo' will be like a
  functor, which is parametrized by a list implementation. It is only
  during the linking phase that an actual implementation of the `List'
  module is applied, and all references to `List.length' are
  finally resolved to values. On one side, this is just a consequence
  of the separate compilation system, on the other side it gives us an
  opportunity to treat compilation units as software components and
  build our software systems on this granularity. But we are not yet at
  this phase, despite several recent improvements in the OCaml
  infrastructure, which include bug fixing in the dynamic linker, module
  aliasing, new dependency analysis, and, last but not least, 
  compilation units are still not the building blocks. From the program
  model standpoint, we are still operating with values, not names.
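
  The point that bindings hold values rather than names can be observed
  from within the language. The following is only an illustration: the
  module names `My_list' and `Patched_list' and the function `sum' are
  invented here. Including `List' produces a new structure holding the
  same values, and a later redefinition of `length' in yet another
  structure has no effect on a value that was bound earlier.

```ocaml
(* [include List] builds a new structure whose fields are the values
   of [List]; [sum] is added to the new structure only. *)
module My_list = struct
  include List
  let sum xs = fold_left ( + ) 0 xs
end

(* This binding captures the value itself, not the name
   [My_list.length]. *)
let list_length = My_list.length

(* Defining another [length] creates yet another structure; the value
   already bound to [list_length] is unaffected, and so is [List]. *)
module Patched_list = struct
  include My_list
  let length _ = 0
end
```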

Aside: Common Lisp, modules, and namespaces

  It is also interesting to look into other languages, which have
  proper namespaces. Let's pick Common Lisp as a working example. In
  Common Lisp we have a notion of symbol, which denotes an object
  identity. When you call a function `(f x y)' you are not applying a
  value `f' to `x' and `y', like you do in OCaml, but instead you are
  passing a symbol `f' and the runtime extracts a pointer to a function
  from a specific slot of the symbol object. This is basically the same
  as if we were passing references to functions, e.g., if `let
  list_length = ref List.length', and then calling it like `!list_length
  [1;2;3]'. This is, in fact, the operational model of languages with
  namespaces, you never call a function, you call a name of a function,
  and the name is a variable, which changes dynamically (the level of
  dynamism differs from language to language). There are, of course,
  cons and pros of this design. The main disadvantage is that it is hard
  to reason about the program behavior. Because now every program is not
  a mathematical object built from other mathematical objects, but
  rather an expression in the theory of names, that have multiple
  interpretations in the space of the cartesian product of the sets that
  denote each symbol. In other words, each program term has many
  interpretations, like what is `!list_length [1;2;3]'? You may never be
  sure.
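
  The reference analogy can be run as is. In this small sketch the
  reference is annotated with a concrete type (an assumption made here
  to keep the example simple, since a reference to a polymorphic
  function is only weakly polymorphic), and the replacement function is
  invented for the illustration.

```ocaml
(* A function reached through a mutable reference plays the role of
   the function slot of a Lisp symbol. *)
let list_length : (int list -> int) ref = ref List.length

let before = !list_length [1; 2; 3]

(* Any code that can see the reference can rebind it... *)
let () = list_length := (fun _ -> 42)

(* ...so the very same expression now has another meaning. *)
let after = !list_length [1; 2; 3]
```

  Here `before' and `after' are computed from the same expression and
  differ, which is exactly the reasoning difficulty described above.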

  There is also another lesson that we can learn from languages with
  namespaces. The lesson is, you still need modules. For example, in
  Common Lisp, despite the presence of proper namespaces, people
  still use names like `list-length', but not `list:length'. Why so?
  Because `list-length' denotes an operation in the theory of lists
  with well defined meaning. It is not just a name, but an operation,
  therefore there could be `edu.cmu.ece:list-length' or
  `com.janestreet.core:list-length'. Therefore, we have an implicit
  (designed by convention) module `list' with some well-known operations
  which define a structure of the `list' algebra. So the takeaway is –
  modules and packages are orthogonal.

Design for extension: choices

  OCaml is a very rich language, that means it has a huge search space
  for the design choices. It also means that most likely it is possible
  to implement any design pattern that you can find in the wild. The
  design space is not really fully explored (especially since in the
  last years OCaml is rapidly developing) and not all decisions are
  widely accepted by the community. For example, we have classes, which
  if adopted by the community, could solve the module extension
  problem. Imagine, if instead of having the `List' module we had
  instead the `list' class. Now, the extension would be simply an
  inheritance, and names will be all properly indirected, as now when
  you do `list#length' you will actually reference a symbol which
  will have multiple interpretations. However, the community didn't
  really adopt this design. Well, mostly because it ended up in a
  nightmare :) And it is not really about classes. Classes in OCaml are
  just an attempt to tame the names problem. You can go rogue and
  actually use records of functions instead. And even make them
  references, e.g.,

  │ type 'a lists = {
  │    mutable length : 'a list -> int;
  │    mutable nth : 'a list -> int -> 'a;
  │ }

  And use it like `list.length [1;2;3]'. The extension is a little
  hard, as records do not have row-types or an include statement
  (unlike objects and structures), but enables overriding. This approach
  is also
  not extremely popular, but was adopted at least in ppx 
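
  A short, runnable sketch of this record-of-functions style follows.
  The record is pinned to `int' lists (an assumption made here: mutable
  fields make the record value only weakly polymorphic), and the
  overriding `nth' that counts from the end is invented purely to show
  the mechanism.

```ocaml
type 'a lists = {
  mutable length : 'a list -> int;
  mutable nth : 'a list -> int -> 'a;
}

(* Mutable fields prevent full polymorphism of the value, so it is
   pinned to int lists in this sketch. *)
let list : int lists = { length = List.length; nth = List.nth }

let three = list.length [1; 2; 3]

(* Overriding: replace one operation in place. This hypothetical
   variant of [nth] counts from the end of the list. *)
let () = list.nth <- (fun xs i -> List.nth xs (List.length xs - 1 - i))

let last = list.nth [1; 2; 3] 0
```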

  So, this is all to say that in OCaml it is possible to adopt any poor
  choice that was made in the software developing
  community. Fortunately, they are not very popular (that of course
  doesn't prove that they are wrong). So, what is the OCaml way of
  designing reusable components? Ideally, components that follow the
  [Open-Closed principle]. The solution is to design for extensibility.

  Not everything should be designed this way. This would be one of those
  poor choices. Some entities are inherently and fundamentally not
  extensible. They are algebras. In the ideal world full of
  mathematicians and infinite time, we should define algebras as the
  least fixed points (aka initial algebras). For example, the initial
  algebra of list (i.e., the minimum set of definitions) is its type
  definition, so the module `List' shall have only one entry (which
  denotes two constructors).

  │ module List = struct
  │    type 'a t = [] | :: of 'a * 'a t
  │ end

  Everything else should be put aside of the `List' module, because it
  is secondary, e.g., we can have a component called
  `stdlib_list_basic_operations.ml' which you could link into your
  solution and use, and which will basically have the following
  interface:

  │ val list_length : 'a List.t -> int
  │ val list_nth : 'a List.t -> int -> 'a
  │ val list_hd : 'a List.t -> 'a

  With this approach, it would be easy to compose different components,
  as there wouldn't be any more competition for the `List' module,
  instead the list interface will be composed by convention. Anyone
  could provide a `list_something' function and it is your choice as the
  system developer to select the right components and glue them together
  correctly. This is, basically, the approach that is used in Common
  Lisp, C++, Java, and other languages.
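
  As a sketch of how this split could look in practice: the module
  below contains only the type, and the operations live outside it. To
  stay independent of how a compiler version spells user-defined `[]'
  and `::' constructors, this illustration uses the hypothetical
  constructor names `Nil' and `Cons', and the module is called
  `My_list' to avoid shadowing the standard `List'.

```ocaml
(* The module holds nothing but the type definition. *)
module My_list = struct
  type 'a t = Nil | Cons of 'a * 'a t
end

(* What a separate component of basic operations could contain. Each
   function only relies on the public constructors. *)
let rec list_length = function
  | My_list.Nil -> 0
  | My_list.Cons (_, tl) -> 1 + list_length tl

let list_hd = function
  | My_list.Nil -> invalid_arg "list_hd"
  | My_list.Cons (x, _) -> x

let rec list_nth l n =
  match l with
  | My_list.Nil -> invalid_arg "list_nth"
  | My_list.Cons (x, tl) -> if n = 0 then x else list_nth tl (n - 1)
```

  Another component could add its own `list_something' functions over
  the same type without touching `My_list' at all.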

  Unfortunately, this is not the convention in OCaml. While the early
  design of OCaml exhibits some notions of this approach (cf.,
  `string_of_int', `int_of_string', and the `Pervasives' module itself),
  at some point in time this avenue was abandoned, and OCaml
  stuck to the "blessed module" approach.  In this approach, some
  operations are blessed by being included in the main module and all
  other operations are sort of second-class citizens. As a result, we
  have modules with exploded interfaces, which are hard to maintain and
  use, and it takes so much time to compile programs that use
  Janestreet's libraries.

[Open-Closed principle]


  Design for extensibility, when the extensibility is expected. Write
  small modules, which define abstractions. Protect those
  abstractions. If a function doesn't require the access to the internal
  representation, doesn't rely on the internal invariants of the
  representation, and could be efficiently implemented using the
  abstract interface only, then do not put it into the module. Good
  example, `list_length' - not a part of the module. But 
  is, since it needs to access the internal representation of the 
  tree. Nor it could be efficiently implemented using
  `Map.fold'. Implement all other functions in separate modules,
  probably structured by their purposes and domains. Use the `open'
  statements to introduce those names into your namespace.
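
  A minimal sketch of this discipline, using an invented `Counter'
  abstraction: the sealed module exposes only what needs the
  representation, derived operations go to a separate module
  (`Counter_ops', also a name made up here), and client code brings
  them into scope with `open'.

```ocaml
(* A small module that protects its abstraction: the representation
   of [t] is hidden by the signature. *)
module Counter : sig
  type t
  val zero : t
  val incr : t -> t
  val to_int : t -> int
end = struct
  type t = int
  let zero = 0
  let incr x = x + 1
  let to_int x = x
end

(* Derived operations are written against the abstract interface
   only, so they live in a separate module. *)
module Counter_ops = struct
  let rec add n c = if n <= 0 then c else add (n - 1) (Counter.incr c)
  let is_zero c = Counter.to_int c = 0
end

(* Client code introduces the extra names with [open]. *)
let five =
  let open Counter_ops in
  Counter.to_int (add 5 Counter.zero)
```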

  When you design a software component that should be extensible,
  parametrize it with abstractions. Use functors, function parameters,
  whatever – OCaml gives you a lot of options here. You can even use
  references to functions, which work especially well with dynamic
  linking.
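
  As one possible illustration of parametrizing a component with an
  abstraction, here is a small functor. The `MONOID' signature and the
  `Fold', `Sum' and `Join' modules are names invented for this sketch;
  the component never mentions a concrete implementation and is
  extended simply by applying it to a new argument.

```ocaml
(* The abstraction the component depends on. *)
module type MONOID = sig
  type t
  val empty : t
  val combine : t -> t -> t
end

(* The component is a functor over that abstraction. *)
module Fold (M : MONOID) = struct
  let concat xs = List.fold_left M.combine M.empty xs
end

(* Two instantiations; adding a third requires no change to [Fold]. *)
module Sum = Fold (struct
  type t = int
  let empty = 0
  let combine = ( + )
end)

module Join = Fold (struct
  type t = string
  let empty = ""
  let combine = ( ^ )
end)
```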

  When you find someone else's code which is broken, either because of a
  missing function in the interface or a wrong implementation, do not
  hesitate to fork, patch, and submit. In fact, Dune facilitates this
  approach, so that you can create your own workspaces with core,
  and whatever libraries, edit them to your taste and get a working
  solution. If you want it to be reusable – then push the changes
  back. And we are back where we started.  Yes, you can do the
  trick and reexport your own `List' module, but this is essentially the
  same as cloning, patching… but not submitting back. Because, at the
  end of the day we will now have `Base.List', `Core.List', 
  how does it differ from having multiple forks on github or, even
  worse, vendoring those modules? Essentially, it is the same. So,

  1. fork the project
  2. extend the module
  3. push it back

  but before doing this, ask yourself, is the operation that I'm trying
  to add is really a member of this module?

  ¹: And will never have, because OCaml program model operates on
  values, not on names, as Common Lisp for example.

  ²: However, when I develop large programs in other languages I miss
  OCaml modules much more than I miss namespaces in OCaml

  ³: Programming is like mathematics without elitist approach, 
  which is
  really good.

  ⁴: Don't worry it is still represented as a pointer, but 
  it is the code.

Matt Windsor then said

  This is a really comprehensive and thoughtful answer, and
  confirms my suspicions about what I'm doing being a fairly bad idea.

  > ask yourself, is the operation that I’m trying to add is really a
  > member of this module?

  Generally: no.  What I'm doing is trying to insert operations into
  `List's, etc., that don't directly depend on the intrinsics of how the
  lists are defined, but instead back-form implementations of 
  that are specific to the library I'm designing.  Effectively, I'm
  trying to do what you'd do with extension methods in C#, or by
  defining instances of a new typeclass I've made over 
  (Indeed, I'm coming to OCaml through C#-via-F# and Haskell, so these
  are the mental models I'm already hardwired to try implement.)

  So, if I consider what I'm doing as 'here is a `List' module and I'm
  just yanking all of `Base.List' into it while exposing the fact that
  I've done so', then… of course it makes no sense!

  It also makes no sense for me to try to send what I'm doing upstream,
  because, while I might think that what I'm doing to `List' is
  generally useful, it doesn't make any sense outside the 
  that my library is doing, nor is it a key part of `List''s 
  algebra (it might well define algebraic properties on `List', but
  they're derived ones), and putting it in `Base' or `Core_kernel' would
  be bloat.

  I got confused by the fact that `Core_kernel' really does just sit on
  top of `Base' in the way that I was trying to do, but this is a special
  case that doesn't generalise to what I want to do at all.

PSA: dns 2.0 – a new udns era dawns


Anil Madhavapeddy announced

  The [DNS protocol implementation] in MirageOS has been around for a
  very, very long time.  The codebase began way back in 2003 and was
  used in research projects such as [Melange] (the precursor to
  MirageOS) and [the Main Name System].

  Over the years, the ocaml-dns codebase has been refactored many times
  as we developed new libraries: early versions were moved from a
  declarative parsing language ([lost] in the sands of time) over [to
  bitstring] and then to the newly developed [cstruct] and so on.
  Meanwhile, our overall coding standards and library infrastructure in
  MirageOS also improved, and the DNS codebase didn't always keep up.
  The DNS interfaces tended to leak exceptions from awkward places,
  whereas other Mirage libraries have been adopting an [explicit
  approach to error handling] to ensure exceptions are indeed
  exceptional events.  The DNS protocol itself has continued to gain
  many more extensions, and now systems such as [LetsEncrypt] that can
  generate TLS certificates via DNS really motivate supporting them.
  So it is with enormous pleasure that I recently merged
  [mirage/ocaml-dns#159] into the trunk branch of the DNS repository.
  This represents a rewrite of the implementation of DNS from the ground
  up using the same rigorous coding standards first adopted in
  [ocaml-tls], and spearheaded for over two years by @hannes and @cfcs
  in their [udns library].  As udns has matured, we recently took the
  decision for it to merge with the venerable ocaml-dns repository and
  supplant the old implementation.  You can view the [odoc of the master
  branch] online.

  This means that the dns.2.0.0 package will essentially be udns (which
  has deliberately not been released to date).  The first thing I would
  like to do is to thank @hannes and @cfcs for their enormous
  persistence and attention to detail in constructing this new codebase,
  and then secondly to issue a call for help and contributions from
  anyone in the OCaml community who is interested in assisting with
  missing features that have regressed from the 1.x branch.

  The core library is in great shape, so I have created some issues for
  known missing elements that we can tackle before cutting a release:

  • [create an Async-based resolver]
  • [multicast DNS]
  • [localhost tests using mirage-vnetif virtual stacks]
  • [server-side TCP requests]

  If you are a current user of the dns.1.x branch, we would also
  like to hear from you about whether the `master' branch of [ocaml-dns]
  is suitable for your use.  Please feel free to [create new issues]
  about regressions from 1.x, or to make suggestions.  If you're new to
  DNS and curious to learn more, then do also try to do your own
  deployment of a DNS server and let us know how it goes!

  mirage.io will shortly be running this DNS server as well, of course,
  and @hannes can no doubt chime in about his own usecases in 
  with this new codebase over the past few years.


[DNS protocol implementation] 

[Melange] <http://anil.recoil.org/papers/2007-eurosys-melange.pdf>

[the Main Name System] 

[lost] <https://github.com/avsm/mpl>

[to bitstring] <https://github.com/mirage/ocaml-dns/pull/3>

[cstruct] <https://github.com/mirage/ocaml-cstruct>

[explicit approach to error handling]

[LetsEncrypt] <https://letsencrypt.org>


[ocaml-tls] <https://github.com/mirage/ocaml-tls>

[udns library] <https://github.com/roburio/udns>

[odoc of the master branch] <https://mirage.github.io/ocaml-dns/>

[create an Async-based resolver]

[multicast DNS] <https://github.com/mirage/ocaml-dns/issues/160>

[localhost tests using mirage-vnetif virtual stacks]

[server-side TCP requests]

[ocaml-dns] <https://github.com/mirage/ocaml-dns>

[create new issues] <https://github.com/mirage/ocaml-dns/issues>

Hannes Mehnert then said

  I don't quite understand what you mean with TCP server… if you take a
  look at ns0.robur.io (or ns1.robur.io) or ns1/ns2/ns3.mehnert.org or
  ns.nqsb.io / sn.nqsb.io (they're all running udns), they are all
  listening on TCP, and if your request (via udp) is too large to fit
  into 400 bytes, you get a truncated answer (an example would be `dig
  tlsa _letsencrypt._tcp.hannes.nqsb.io @ns.nqsb.io').

  for the motivation behind udns: initially i wanted to write an
  iterative resolver, but then the "how to configure it" question was
  raised, and i discovered [NSUPDATE], an in-protocol dynamic update
  mechanism (with [authentication]), and started to implement that
  together with a server implementation. Afterwards I intended to use
  let's encrypt via DNS (since I hate to have to run web servers for
  let's encrypt) – thanks to Michele, the [ocaml-letsencrypt] got
  started with the DNS challenge.

  nowadays, I store TLS certificates (and signing requests) as TLSA in
  DNS, have the zone in a git repository that is pushed and pulled by
  the primary implementation, which [NOTIFY] secondaries (even the let's
  encrypt service is a (hidden) secondary), and transfers zones
  [incrementally].
  if you're interested in server-side unikernels, take a look at
  <https://github.com/roburio/unikernels> – they contain primary,
  secondary, primary-git, let's encrypt, …

  what is more to do? there are still some TODO in the code which should
  be fixed, the test coverage (esp. in server) is not yet optimal, and
  various DNS extensions (DNSSec, DNS-over-TLS, 
  tcp-over-dns, …) are just not there yet… but in the end, I use 
  rewrite this stack since some years (first commit was from end of
  april 2017) – also using the resolver on my laptop :)

[NSUPDATE] <https://tools.ietf.org/html/rfc2136>

[authentication] <https://tools.ietf.org/html/rfc2845>

[ocaml-letsencrypt] <https://github.com/mmaker/ocaml-letsencrypt>

[NOTIFY] <https://tools.ietf.org/html/rfc1996>

[incrementally] <https://tools.ietf.org/html/rfc1995>

Next OUPS meetup May 21st 2019


Bruno Bernardo announced

  The next OUPS meetup will take place on Tuesday, May 21, 7pm at IRILL
  on the Jussieu campus. As usual, we will have a few talks, followed by
  pizzas and drinks.

  The talks will be the following:

  • Nik Graf, TBD (something related to ReasonML),

  • Armaël Guéneau, Incremental Cycles, A certified incremental cycle
    detection algorithm used in Dune,

  Please do note that we are always in demand of talk *proposals* for
  future meetups.

  To register, or for more information, go here:

  *Registration is required! Access is not guaranteed after 7pm if
   you're not registered.* (It also helps us to order the right amount
   of food.)

  Access map:
  IRILL - Université Pierre et Marie Curie (Paris VI)
  Barre 15-16 1er étage
  4 Place Jussieu
  75005 Paris

Other OCaml News

From the ocamlcore planet blog

  Here are links from many OCaml blogs aggregated at [OCaml Planet].

  • [Thoughts from AAAI 2019]
  • [On the road to Irmin v2]
  • [An introduction to OCaml PPX ecosystem]
  • [A course on homotopy (type) theory]

[OCaml Planet] <http://ocaml.org/community/planet/>

[Thoughts from AAAI 2019]

[On the road to Irmin v2]

[An introduction to OCaml PPX ecosystem]

[A course on homotopy (type) theory]


  If you happen to miss a CWN, you can [send me a message] and I'll mail
  it to you, or go take a look at [the archive] or the [RSS feed of the
  archives].

  If you also wish to receive it every week by mail, you may subscribe
  [online].

  [Alan Schmitt]

[send me a message] <mailto:alan.schmitt at polytechnique.org>

[the archive] <http://alan.petitepomme.net/cwn/>

[RSS feed of the archives] 

[online] <http://lists.idyll.org/listinfo/caml-news-weekly/>

[Alan Schmitt] <http://alan.petitepomme.net/>
