[cwn] Attn: Development Editor, Latest Caml Weekly News

Alan Schmitt alan.schmitt at polytechnique.org
Tue Jul 15 00:56:43 PDT 2008


Hello,

Here is the latest Caml Weekly News, for the week of July 8 to 15, 2008.

1) New library: Easy-format 0.9.0
2) Troublesome nodes
3) Name of currently executing function

========================================================================
1) New library: Easy-format 0.9.0
Archive: <http://groups.google.com/group/fa.caml/browse_thread/thread/61252c85858fe7ea# 
 >
------------------------------------------------------------------------
** Martin Jambon announced:

I would like to announce a small library (a module in fact) that is  
meant to
make it easy to produce pretty-printed text:

  <http://martin.jambon.free.fr/easy-format.html>

The data to be printed goes through a tree that carries all the  
information
required for pretty-printing. After that, a single call to
Easy_format.Pretty.to_stdout (for instance) outputs the indented result.

There's a reasonably complete example at

  <http://martin.jambon.free.fr/easy_format_example.html>
			
** He later added:

I've just released a richer version of Easy-format.
The main URL is still <http://martin.jambon.free.fr/easy-format.html>

It now offers the following additional features:

- Support for separators that stick to the next list item (e.g. "|")
- More wrapping options
- Added Custom kind of nodes for using Format directly or existing
  pretty-printers
- Support for markup and escaping, allowing to produce colorized output
  (HTML, terminal, ...) without interfering with the computation of
  line breaks and spacing.

Easy-format now takes advantage of most features of the Format module,  
with
the notable exception of tabulation boxes. I'd be curious so see any
interesting use of tabulation boxes.


This release has slight incompatibilities with version 0.9.0 which was
released earlier this week, simply because more options were added:

- Deprecated use of Easy_format.Param. Instead, inherit from
  Easy_format.list,
  Easy_format.label or Easy_format.atom.
- Atom nodes have now one additional argument for parameters.
- All record types have been extended with more fields.
  Using the "with" mechanism for inheritance is the best way of limiting
  future incompatibilities.
			
========================================================================
2) Troublesome nodes
Archive: <http://groups.google.com/group/fa.caml/browse_thread/thread/943eb2043c8c1266# 
 >
------------------------------------------------------------------------
** Deep in this thread, Dario Teixeira said:

For the sake of future reference, I'm going to summarise the solution to
the original problem, arrived at thanks to the contribution of various
people in this list (your help was very much appreciated!).  This might
came in handy in the future if you run into a similar problem.

So, we have a document composed of a sequence of nodes.  There are four
different kinds of nodes, two of which are terminals (Text and See)
and two of which are defined recursively (Bold and Mref).  Both See and
Mref produce links, and we want to impose a constraint that no link node
shall be the immediate ancestor of another link node.  An additional
constraint is that nodes must be created via constructor functions.
Finally, the structure should be pattern-matchable, and the distinction
between link/nonlink nodes should be preserved so that Node-to-Node
functions do not require code duplication and/or ugly hacks.

Using conventional variants, the structure can be represented as  
follows.
Alas, this solution is not compatible with constructor functions and
requires unwanted scaffolding:

module Ast =
struct
    type super_node_t =
        | Nonlink_node of nonlink_node_t
        | Link_node of link_node_t
    and nonlink_node_t =
        | Text of string
        | Bold of super_node_t list
    and link_node_t =
        | Mref of string * nonlink_node_t list
        | See of string
end


Below is the solution that was finally obtained.  Nodes are represented
using polymorphic variants, and the module itself is recursive to get
around the "type not fully defined" problem:

module rec Node:
sig
    type nonlink_node_t = [ `Text of string | `Bold of  
Node.super_node_t list ]
    type link_node_t = [ `See of string | `Mref of string *  
nonlink_node_t list ]
    type super_node_t = [ nonlink_node_t | link_node_t ]

    val text: string -> nonlink_node_t
    val bold: [< super_node_t] list -> nonlink_node_t
    val see: string -> link_node_t
    val mref: string -> nonlink_node_t list -> link_node_t
end =
struct
    type nonlink_node_t = [ `Text of string | `Bold of  
Node.super_node_t list ]
    type link_node_t = [ `See of string | `Mref of string *  
nonlink_node_t list ]
    type super_node_t = [ nonlink_node_t | link_node_t ]

    let text txt = `Text txt
    let bold seq = `Bold (seq :> super_node_t list)
    let see ref = `See ref
    let mref ref seq = `Mref (ref, seq)
end


If you try it out, you will see that this solution enforces the basic
node constraints.  The values foo1-foo4 are valid, whereas the compiler
won't accept foo5, because a link node is the parent of another:

open Node
let foo1 = text "foo"
let foo2 = bold [text "foo"]
let foo3 = mref "ref" [foo1; foo2]
let foo4 = mref "ref" [bold [see "ref"]]
let foo5 = mref "ref" [see "ref"]


Now, suppose you need to create an Ast-to-Node sets of functions (a  
likely
scenario if you are building a parser for documents).  The solution is
fairly straightforward, but note the need to use the cast operator :>
to promote link/nonlink nodes to supernodes:

module Ast_to_Node =
struct
    let rec convert_nonlink_node = function
        | Ast.Text txt          -> Node.text txt
        | Ast.Bold seq          -> Node.bold (List.map  
convert_super_node seq)

    and convert_link_node = function
        | Ast.Mref (ref, seq)   -> Node.mref ref (List.map  
convert_nonlink_node seq)
        | Ast.See ref           -> Node.see ref

    and convert_super_node = function
        | Ast.Nonlink_node node -> (convert_nonlink_node node :>  
Node.super_node_t)
        | Ast.Link_node node    -> (convert_link_node node :>  
Node.super_node_t)
end


Finally, another common situation is to build functions to process  
nodes.
The example below is that of a node-to-node "identity" module.  Again,  
it
is fairly straightforward, but note the use of the operator # and the  
need
to cast link/nonlink nodes to supernodes:

module Node_to_Node =
struct
    open Node

    let rec convert_nonlink_node = function
        | `Text txt              -> text txt
        | `Bold seq              -> bold (List.map convert_super_node  
seq)

    and convert_link_node = function
        | `See ref               -> see ref
        | `Mref (ref, seq)       -> mref ref (List.map  
convert_nonlink_node seq)

    and convert_super_node = function
        | #nonlink_node_t as node -> (convert_nonlink_node node :>  
super_node_t)
        | #link_node_t as node    -> (convert_link_node node :>  
super_node_t)
end
			
** He later added:

Sorry, but in the meantime I came across two problems with the  
supposedly
ultimate solution I just posted.  I have a correction for one, but not
for the other.

The following statements trigger the first problem:

let foo1 = text "foo"
let foo2 = see "ref"
let foo3 = bold [foo1; foo2]

Error: This expression has type Node.link_node_t but is here used with  
type
         Node.nonlink_node_t
       These two variant types have no intersection

The solution that immediately comes to mind is to make the return types
for the constructor functions open:  (I can see no disadvantage with
this solution; please tell me if you find any)


module rec Node:
sig
    type nonlink_node_t = [ `Text of string | `Bold of  
Node.super_node_t list ]
    type link_node_t = [ `See of string | `Mref of string *  
nonlink_node_t list ]
    type super_node_t = [ nonlink_node_t | link_node_t ]

    val text: string -> [> nonlink_node_t]
    val bold: [< super_node_t] list -> [> nonlink_node_t]
    val see: string -> [> link_node_t]
    val mref: string -> nonlink_node_t list -> [> link_node_t]
end =
struct
    type nonlink_node_t = [ `Text of string | `Bold of  
Node.super_node_t list ]
    type link_node_t = [ `See of string | `Mref of string *  
nonlink_node_t list ]
    type super_node_t = [ nonlink_node_t | link_node_t ]

    let text txt = `Text txt
    let bold seq = `Bold (seq :> super_node_t list)
    let see ref = `See ref
    let mref ref seq = `Mref (ref, seq)
end


The second problem, while not a show-stopper, may open a hole for  
misuse of
the module, so I would rather get it fixed.  Basically, while the module
provides constructor functions to build nodes, nothing prevents the user
from bypassing them and constructing nodes manually.  The obvious  
solution
of declaring the types "private" results in an "This fixed type has no  
row
variable" error.  Any way around it?
			
========================================================================
3) Name of currently executing function
Archive: <http://groups.google.com/group/fa.caml/browse_thread/thread/25c9706b89196140# 
 >
------------------------------------------------------------------------
** Dave Benjamin asked and blue storm answered:

 > Is there any way to find out the name of the currently executing
 > function from within an OCaml program? I guess, technically, I'm
 > interested in the grandparent. Something that would allow this:
 >
 > let log msg =
 >    Printf.eprintf "%s: %s\n%!"
 >      (get_caller_function_name ())
 >      msg
 >
 > I'm guessing the above is not possible, but perhaps there's some  
way to
 > accomplish this using some combination of camlp4, back traces,
 > profiling, or debugging?

Here is a little camlp4 code for an ad-hoc solution :
<http://bluestorm.info/camlp4/Camlp4GenericProfiler.ml>

It's based upon the Camlp4Filters/Camlp4Profiler.ml from the camlp4
distribution.
The GenericMake functor will traverse your code and apply the
parametrized function to the body of each function declaration. You
can use it with a functor providing a  (with_fun_name : string ->
Ast.expr -> Ast.expr), transforming the Ast to your liking, given the
function name.

I've written a small LoggingDecorator module that operates on the
__LOG__ identifier. Example code :
  let __LOG_FUNC__ func msg =  Printf.eprintf "in function %s: %s\n%!"  
func msg
  let test_function p = if not p then __LOG__ "p is false !"

It will replace the __LOG__ identifier with a __LOG_FUNC__ "p".

You can change that behavior, in particular you could be interested
(for logging purpose) in the location of the function declaration, not
only his name : see how the initial Camlp4Profiler behavior (wich i
kept in the ProfilingDecorator) do that (Loc.dump).
			
========================================================================
Using folding to read the cwn in vim 6+
------------------------------------------------------------------------
Here is a quick trick to help you read this CWN if you are viewing it  
using
vim (version 6 or greater).

:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'<1':1
zM
If you know of a better way, please let me know.

========================================================================
Old cwn
------------------------------------------------------------------------

If you happen to miss a CWN, you can send me a message
(alan.schmitt at polytechnique.org) and I'll mail it to you, or go take a  
look at
the archive (<http://alan.petitepomme.net/cwn/>) or the RSS feed of the
archives (<http://alan.petitepomme.net/cwn/cwn.rss>). If you also wish
to receive it every week by mail, you may subscribe online at
<http://lists.idyll.org/listinfo/caml-news-weekly/> .

========================================================================

-- 
Alan Schmitt <http://alan.petitepomme.net/>

The hacker: someone who figured things out and made something cool  
happen.
  .O.
  ..O
  OOO


-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.idyll.org/pipermail/caml-news-weekly/attachments/20080715/ca7ace87/attachment.htm 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PGP.sig
Type: application/pgp-signature
Size: 186 bytes
Desc: This is a digitally signed message part
Url : http://lists.idyll.org/pipermail/caml-news-weekly/attachments/20080715/ca7ace87/attachment-0001.pgp 


More information about the caml-news-weekly mailing list