[cwn] Attn: Development Editor, Latest OCaml Weekly News
Alan Schmitt
alan.schmitt at polytechnique.org
Tue Apr 23 00:58:17 PDT 2019
Hello
Here is the latest OCaml Weekly News, for the week of April 16 to
23,
2019.
Table of Contents
─────────────────
Wrapping C++ std::shared_ptr and similar smart pointers
OCaml 4.08.0+beta3
Menhir and preserving comments from source
ppx_protocol_conv 5.0.0
Orsetto: structured data interchange languages (preview release)
Searching for functions
Other OCaml News
Old CWN
Wrapping C++ std::shared_ptr and similar smart pointers
═══════════════════════════════════════════════════════
Archive:
<https://discuss.ocaml.org/t/wrapping-c-std-shared-ptr-and-similar-smart-pointers/3582/1>
Manuel Hornung asked
────────────────────
I'm trying to create Reason/OCaml bindings for the Skia 2D
graphics
library. The library makes heavy use of smart pointers similar
to
`std::shared_ptr`s, but they called them `sk_sp`.
Now my first idea for wrapping these was using a regular pointer
that
points to a shared pointer. That would trigger the release of
the
memory behind the shared pointer as soon as the local variable
containing the shared pointer goes out of scope though.
I found a solution that looked promising to me in
<https://github.com/ygrek/scraps/blob/master/cxx_wrapped.h> but
now I
heard that reducing the refcount in the finalizer is also not a
good
idea. Unfortunately I don't know why that is not a good idea and
I
also don't have a better one.
Can anyone help me understand this better and point me towards a
better approach?
Guillaume Munch-Maccagnoni replied
──────────────────────────────────
(Sorry for the delay as I have been busy.)
It all comes down to the fact that tracing and reference
counting have
different advantages and drawbacks, and the main difference for
this
question is that RC reclaims promptly, whereas tracing does not
reclaim predictably; in addition OCaml is currently poor in
terms of
predictable resource management.
Smart pointers can be used to manage resources other than
memory. (I
mean smart pointers that implement deterministic reclamation of
resources such as unique or reference-counted pointers; in
principle
smart pointers are not restricted in what they implement:
delayed
evaluation, [roots for tracing GCs]… such exotic pointers are
out of
the scope of my answer.)
First, you need to determine whether the pointer manages
non-memory
resources (the destruction closes a file, releases a lock, rolls
back
some state…). If so, using finalizers is a no-go, because you
cannot
predict when and in which order finalizers run, and in practice
it can
be way too late. When that is the case, skip 1). For instance I
see
that your library has some functions that return RAII guards;
quite
obviously these cannot be handled with finalizers.
[roots for tracing GCs]
<http://manishearth.github.io/blog/2015/09/01/designing-a-gc-in-rust/>
1) Custom blocks with finalizer
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
If the smart pointer only manages memory, then it is possible to
represent it with a custom block with a finalizer attached to
it. The
GC needs to know the size of what it manages, otherwise it will
not
work hard enough to reclaim memory and you can end up with a
memory
leak. This has occasionally been called “[the familiar
"allocation of
custom objects mess up the speed of the major GC" problem]”.
The situation is supposed to improve in OCaml 4.08, which
introduces
[a new function `caml_alloc_custom_mem'] that lets you specify
the
size of the memory managed by the custom block, which the GC's
heuristics will take into account. (`caml_alloc_custom' also has
parameters to tweak the GC speed but presumably this was not
good
enough as witnessed by the multiple bug reports referenced in
that
PR.)
So you can use as a source of inspiration @ygrek's `wrapped'
pointer
you have linked to above, but you must adapt it to tell the
OCaml GC
the size of the data your custom block contains.
Pros:
• Expressive: the foreign data is abstracted as an OCaml value
that
can be passed around, inserted into data structures, etc.
Cons:
• No-go for non-memory resources.
• You need to know the size of what you are managing—there is no
universal smart pointer wrapper!
• Not so good for performance/scale or interoperability. Mixing
tracing and RC cumulates the drawbacks of both; in particular
you
inherit the possible unbounded latency due to the upfront
deallocation cost of RC (depending on your use-case), and you
are
even at a risk of creating cycles that are never collected if
you
mix this method with [that one] to store OCaml values on the
foreign
side.
These are some guaranteed theoretical drawbacks, but I imagine
that
there can be more practical implementation-specific issues (as
witnessed by `caml_alloc_custom' vs `caml_alloc_custom_mem'). I
do not
have hands-on experience with custom blocks, and while
researching for
this answer, I found this usage not very well documented, so I
hope
that experts can fill-in the gaps and/or correct the above if
needed.
[the familiar "allocation of custom objects mess up the speed of
the
major GC" problem] <https://github.com/ocaml/ocaml/issues/7676>
[a new function `caml_alloc_custom_mem']
<https://github.com/ocaml/ocaml/pull/1738>
[that one]
<https://discuss.ocaml.org/t/storing-an-ocaml-value-in-a-c-structure/3521>
2) Deterministic resource management
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
To avoid the impedance mismatch between smart pointers and the
GC, you
can rely on deterministic resource management. In OCaml, the
idiomatic
expression of it is to use “~with_~” wrappers based on
`unwind-protect' [see the [example of files]]. OCaml 4.08
introduces
`Fun.protect', an implementation of `unwind-protect' suitable
for
OCaml.
Pros:
• Predictable: can be used for non-memory resources.
Cons:
• Lacks expressiveness: resources live for the exact duration of
their
defining scope, and are reclaimed in LIFO order.
• Allows “use after free”: the resource can be referenced
outside of
its scope, if not careful.
• Currently incompatible with asynchronous exceptions: OCaml
does not
currently allow an implementation of unwind-protect that
protects
from asynchronous exceptions being raised inside the finally
clause.
[example of files]
<https://dev.realworldocaml.org/error-handling.html>
3) Manual resource management
╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌
If neither 1) nor 2) fit the bill, you have to resort to manual
resource management, in which the user has to call some `free'
function explicitly (and gets an exception if they use it after
`free'). It is “hard” to program correctly with manual resource
management, moreso in the presence of exceptions. For this
reason,
people mix it with 1) and/or 2); for instance they use
unwind-protect
in a non-systematic manner, or they attach finalizers to act as
a
fallback, or both. While with 1) and 2) you are still within the
realm
of structured programming, with manual resource management you
enter
the realm of debugging-oriented programming—think programming in
a
weird dialect of old C++.
Pros:
• Last resort solution
Cons:
• Non-idiomatic code
• Hard to program
• Hard to reason about the code
Discussions with Serious Industrial OCaml Users a while ago
(starting
around POPL 2017 in Paris) have let appear OCaml's current
issues with
resource management. These discussion prompted [a proposal for a
resource management model for OCaml], inspired by RAII and move
semantics from modern C++/Rust. In a nutshell, it aims to lift
the
expressiveness limitations of 2). Interoperability is probably
its
most important application.
[a proposal for a resource management model for OCaml]
<https://discuss.ocaml.org/t/a-proposal-for-a-resource-management-model-for-ocaml/1680>
OCaml 4.08.0+beta3
══════════════════
Archive:
<https://sympa.inria.fr/sympa/arc/caml-list/2019-04/msg00048.html>
Damien Doligez announced
────────────────────────
Dear OCaml users,
The release of OCaml 4.08.0 is approaching. We have created a
third
beta version to help you adapt your software to the new features
ahead
of the release.
The source code is available at these addresses:
<https://github.com/ocaml/ocaml/archive/4.08.0+beta3.tar.gz>
<https://caml.inria.fr/pub/distrib/ocaml-4.08/ocaml-4.08.0+beta3.tar.gz>
The compiler is (or will soon be) also available in OPAM with
one of
the following commands.
opam switch create ocaml-variants.4.08.0+beta3
–repositories=default,beta=git+<https://github.com/ocaml/ocaml-beta-repository.git>
or
opam switch create ocaml-variants.4.08.0+beta3+<VARIANT>
–repositories=default,beta=git+<https://github.com/ocaml/ocaml-beta-repository.git>
where you replace <VARIANT> with one of these:
afl
default_unsafe_string
flambda
fp
fp+flambda
We want to know about all bugs. Please report them here:
<https://github.com/ocaml/ocaml/issues>
Happy hacking,
– Damien Doligez for the OCaml team.
The changes from beta2 are the following:
• GPR#1942, GPR#2244: simplification of the static check for
recursive
definitions (Alban Reynaud and Gabriel Scherer, review by
Jeremy
Yallop, Armaël Guéneau and Damien Doligez)
• GPR#1354, GPR#2177: Add fma support to Float module. (Laurent
Thévenoux, review by Alain Frisch, Jacques-Henri Jourdan,
Xavier
Leroy)
• GPR#2202: Correct
Hashtbl.MakeSeeded.{add_seq,replace_seq,of_seq} to
use functor hash function instead of default hash
function. Hashtbl.Make.of_seq shouldn't create randomized hash
tables. (David Allsopp, review by Alain Frisch)
• * PR#4208, PR#4229, PR#4839, PR#6462, PR#6957, PR#6950,
GPR#1063,
GPR#2176, GPR#2297: Make (nat)dynlink sound. (Mark Shinwell,
Leo
White, Nicolás Ojeda Bär, Pierre Chambart)
• GPR#2317: type_let: be more careful generalizing parts of the
pattern (Thomas Refis and Leo White, review by Jacques
Garrigue)
• MPR#6242, GPR#2143, MPR#8558, GPR#8559: optimize some local
functions (Alain Frisch, review by Gabriel Scherer)
• #7829, #8585: Fix pointer comparisons in freelist.c (for
32-bit
platforms) (David Allsopp and Damien Doligez)
• #8567, #8569: on ARM64, use 32-bit loads to access
caml_backtrace_active (Xavier Leroy, review by Mark Shinwell
and
Greta Yorsh)
• #8568: Fix a memory leak in mmapped bigarrays (Damien Doligez,
review by Xavier Leroy and Jérémie Dimino)
• MPR#7548: printf example in the tutorial part of the manual
(Kostikova Oxana, rewiew by Gabriel Scherer, Florian
Angeletti,
Marcello Seri and Armaël Guéneau)
• MPR#7547, GPR#2273: Tutorial on Lazy expressions and patterns
in
OCaml Manual (Ulugbek Abdullaev, review by Florian Angeletti
and
Gabriel Scherer)
• GPR#8508: refresh \moduleref macro (Florian Angeletti, review
by
Gabriel Scherer)
• MPR#7919, GPR#2311: Fix assembler detection in configure
(Sébastien
Hinderer, review by David Allsopp)
• GPR#2295: Restore support for bytecode target XLC/AIX/Power
(Konstantin Romanov, review by Sébastien Hinderer and David
Allsopp)
• GPR#8528: get rid of the direct call to the C preprocessor in
the
testsuite (Sébastien Hinderer, review by David Allsopp)
• Issue #7938, GPR #8532: Fix alignment detection for ints on
32-bits
platforms (Sébastien Hinderer, review by Xavier Leroy)
• * GPR#8533: Remove some unused configure tests (Stephen Dolan,
review by David Allsopp and Sébastien Hinderer)
• GPR#2207,#8604: Add opam files to allow pinning (Leo White,
Greta
Yorsh, review by Gabriel Radanne)
• MPR#7835, GPR#1980, GPR#8548, GPR#8586: separate scope from
stamp in
idents and explicitly rescope idents when substituting
signatures. (Thomas Refis, review by Jacques Garrigue and Leo
White)
• #8550, #8552: Soundness issue with class generalization
(Jacques
Garrigue, review by Leo White and Thomas Refis, report by
Jeremy
Yallop)
Menhir and preserving comments from source
══════════════════════════════════════════
Archive:
<https://discuss.ocaml.org/t/menhir-and-preserving-comments-from-source/3686/1>
Chet Murthy asked
─────────────────
I've used ocamlyacc over the years a lot, and menhir in a couple
of
projects (including a big one I'm working on right now). I've
also
used camlp4/camlp5's stream-parsers in a *ton* of projects. And
of
course, with ocamllex and sedlexing. I find that with
stream-parsers,
it's easy to arrange for preserving lexical positions in tokens,
and
then carrying that across to the parse-tree. To wit,
┌────
│ ...
│ type basic_token = ...... ;;
│ type token = basic_token * lexical_position_info_t ;;
│ ...
└────
and then in your stream parser, you pattern-match on the first
component, e.g.
┌────
│ ...
│ parser [< .... ; '(Tstring s, _) ; ... >] -> yadda yadda
│ ...
└────
But with menhir (and ocamlyacc) it seems like, you need to embed
the
lexical position info in the token, e.g.
┌────
│ ...
│ type basic_token =
│ | Tstring of lexicai_position_info_t * string
│ | Tsemi of lexical_position_info_t
│ etc
│ ...
└────
Is there some trick I'm missing, for how to use camlyacc/menhir
in a
manner that allows preserving this positional information during
the
parse?
gasche replied
──────────────
To have location/position information in the AST: the standard
approach I'm familiar with is not to embed position information
in the
tokens, but to query it from the lexer or parser at the place
where
you build your AST values in the parser actions. When using
ocamlyacc,
I use the `Lexing' module for this
(`Lexing.lexeme_{start,end}_p'),
when using Menhir I use its special symbols `${start,end}pos',
`${start,end}pos(n)', `$loc', `$loc(n)'.
To preserve comments, an approach we use in the OCaml compiler
(where
comments that are docstrings are kept in the AST) is to have a
global
table of comments, that is filled by the Lexer, and accessed
from
parsing actions (there is a function that says basically
"collect all
the comments from the last time you were called to <this
position>").
ppx_protocol_conv 5.0.0
═══════════════════════
Archive:
<https://discuss.ocaml.org/t/ann-ppx-protocol-conv-5-0-0/3692/1>
Anders Fugmann announced
────────────────────────
It is my pleasure to announce the release of [Ppx_protocol_conv]
version 5.0.0.
Ppx_protocol_conv is a syntax extension to generate functions to
serialize and de-serialize ocaml types. The ppx itself does not
contain any protocol specific code, but relies on user defined
'drivers' to define serialization and de-serialiazation of basic
types
and structures.
The library comes with multiple pre-defined drivers:
• ppx_protocol_conv_json (Yojson.Safe.json)
• ppx_protocol_conv_jsonm (Ezjson.value)
• ppx_protocol_conv_msgpack (Msgpck.t)
• ppx_protocol_conv_xml-light (Xml.xml)
• ppx_protocol_conv_yaml (Yaml.value)
The library is based on ppxlib and is is compatible with base
v0.12.
Release 5.0.0 is available through opam.
The project homepage is:
<https://github.com/andersfugmann/ppx_protocol_conv>
The project's [wiki pages] contains some more information on how
to
use the library and existing drivers and on how to write you own
drivers.
*Noteworthy Change* This release includes a major rewrite of the
core
of the library to allow more control by user supplied drivers
over the
serialization and de-serialization of types. These changes
breaks
backward compatibility.
The json driver (`Ppx_protocol_conv_json') has been updated to
be
compatible with the serialization format of ppx_deriving_yojson,
supporting both `[@key]', `[@name]' and `[@default]' attributes,
and
can be used as a replacement for `ppx_deriving_yojson' with few
modifications.
Deserialization functions now returns a `result' type. Old
support for
exception type errors is available in functions with the `_exn'
suffix. For a complete list of changes, see the [Changelog].
As always, comments, suggestions and PRs are more than welcome.
[Ppx_protocol_conv]
<https://github.com/anders.fugmann/ppx_protocol_conv>
[wiki pages]
<https://github.com/andersfugmann/ppx_protocol_conv/wiki>
[Changelog]
<https://github.com/andersfugmann/ppx_protocol_conv/blob/master/Changelog>
Orsetto: structured data interchange languages (preview release)
════════════════════════════════════════════════════════════════
Archive:
<https://discuss.ocaml.org/t/ann-orsetto-structured-data-interchange-languages-preview-release/3304/6>
james woodyatt announced
────────────────────────
I have now released `~preview4' which resolves Issue [#8] /OCaml
4.07:
the new Stdlib.Seq.t is functionally equivalent to Cf_seq.t/.
For
OCaml 4.06, this introduces an external dependency on the *seq*
compatibility package. I've also checked that documentary
comments are
available with *odig*, so this might be the last preview release
before 1.0. (It depends on whether I decide to remove the
support for
the *ppx_let* syntax extension.)
[#8]
<https://bitbucket.org/jhw/orsetto/issues/8/ocaml-407-the-new-stdlibseqt-is>
james woodyatt then added
─────────────────────────
> It depends on whether I decide to remove the support for the
*ppx_let* syntax extension.
I've thought about this, and I will not be removing support for
the
*ppx_let* syntax extension. I plan to /deprecate/ it when OCaml
4.08
is released, but it will be retained while I continue supporting
OCaml
4.06 and 4.07.
Searching for functions
═══════════════════════
Archive:
<https://discuss.ocaml.org/t/searching-for-functions/3698/1>
Jordan Mackie announced
───────────────────────
OCaml newbie here - coming from Haskell land out of curiosity.
I'm curious how you guys find your way around stdlib/packages
etc?
Example: I'm writing a script and I want to lookup an
environment
variable. I know there's probably some function along the lines
of
`get_env' somewhere, so I'd like to know where it is and what
type it
has. In Haskell I'd do a hoogle search along the lines of
<https://www.stackage.org/lts-13.18/hoogle?q=getenv> - what
would be
my process in OCaml?
I tried googling "get env var in Ocaml" - first hit is a link to
stdlib, but I'm using base. It did at least give me the hint
that
`Sys' is a relevant namespace, so I go and look at the docs for
`Base.Sys' (many clicks later -
<https://ocaml.janestreet.com/ocaml-core/latest/doc/base/Base/Sys/index.html>)
but `getenv' isn't listed. But it is apparently there…
There must be a better way?
Yawar Amin
──────────
The Hoogle equivalent for OCaml is called odig:
<https://erratique.ch/software/odig> . You can install it
locally and
have it generate documentation for all installed packages.
However,
generated documentation is not globally searchable (see last
point). Besides that, there are a few other strategies:
• Familiarize yourself with the standard library that ships with
every
OCaml distribution:
<https://caml.inria.fr/pub/docs/manual-ocaml/libref/> . This
is the
equivalent of Haskell's `base' package. The `Prelude'
equivalent
module is called `Pervasives'. You will find the `Sys' module
here,
and `getenv' in there.
• Keep <http://opam.ocaml.org/packages/> handy for when you're
given a
package name to look up. Package documentation is mostly not
uploaded to a central location like Haddock. (But people have
been
talking about setting that up at docs.ocaml.org.) You'll
probably
need to open up and search through `.mli' files once in a
while.
• The old-style ocamldoc documentation pages (like the standard
library I linked above) have very handy pages indexing types,
values, and modules. However, the newer odoc documentation
pages
which are becoming the de facto standard do not, as of yet.
There
are a couple of issues tracking this.
Other OCaml News
════════════════
From the ocamlcore planet blog
──────────────────────────────
Here are links from many OCaml blogs aggregated at [OCaml
Planet].
• [OCaml Developer at Ahrefs (Full-time)]
• [Learning ML Depth-First]
[OCaml Planet] <http://ocaml.org/community/planet/>
[OCaml Developer at Ahrefs (Full-time)]
<https://functionaljobs.com/jobs/9165-ocaml-developer-at-ahrefs>
[Learning ML Depth-First]
<https://blog.janestreet.com/learning-ml-depth-first/>
Old CWN
═══════
If you happen to miss a CWN, you can [send me a message] and
I'll mail
it to you, or go take a look at [the archive] or the [RSS feed
of the
archives].
If you also wish to receive it every week by mail, you may
subscribe
[online].
[Alan Schmitt]
[send me a message] <mailto:alan.schmitt at polytechnique.org>
[the archive] <http://alan.petitepomme.net/cwn/>
[RSS feed of the archives]
<http://alan.petitepomme.net/cwn/cwn.rss>
[online] <http://lists.idyll.org/listinfo/caml-news-weekly/>
[Alan Schmitt] <http://alan.petitepomme.net/>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.idyll.org/pipermail/caml-news-weekly/attachments/20190423/1e56b937/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 487 bytes
Desc: not available
URL: <http://lists.idyll.org/pipermail/caml-news-weekly/attachments/20190423/1e56b937/attachment-0001.pgp>
More information about the caml-news-weekly
mailing list