[cwn] Attn: Development Editor, Latest Caml Weekly News

Alan Schmitt alan.schmitt at polytechnique.org
Tue Mar 4 01:08:27 PST 2008


Hello,

Here is the latest Caml Weekly News, for the week of February 26 to  
March 04, 2008.

1) Please a simple Camlp5 example
2) Long-term storage of values
3) SML and friends go GPCE/OOPSLA 2008
4) OCaml version 3.10.2 released
5) Call for help: identify corrupted bytecode packages on Ubuntu Hardy
6) OCaml-Java project : beta version
7) PLplot library bindings update (5.8.0)
8) PhD positions in Functional Programming at Chalmers in Sweden

========================================================================
1) Please a simple Camlp5 example
Archive: <http://groups.google.com/group/fa.caml/browse_frm/thread/5bd8dd39dd6d3dd2/915fcc21c794c10b#915fcc21c794c10b 
 >
------------------------------------------------------------------------
** Fabrice Marchant asked and Richard Jones answered:

 >   I bet this kind of code should be rather common :
 > let string_of_piece_type = function
 >   King   -> "King"
 > | Queen  -> "Queen"
 > | Rook   -> "Rook"
 > | Bishop -> "Bishop"
 > | Knight -> "Knight"
 > | Pawn   -> "Pawn"
 >  Please have you got an example where the macro helps to implement  
such
 > kind of "string_of_type" function ?

You probably want to look at deriving (<http://code.google.com/p/deriving/ 
 >)
or tywith (<http://www.seedwiki.com/wiki/shifting_focus/tywith>) which
can generate these functions automatically.
			
========================================================================
2) Long-term storage of values
Archive: <http://groups.google.com/group/fa.caml/browse_frm/thread/2b4a8a9fbc70adf5/87bcf0f0a12968d3#87bcf0f0a12968d3 
 >
------------------------------------------------------------------------
** Dario Teixeira asked:

Suppose I have a value of type Story.t, fairly complex in its  
definition.
I wish to store this value in a DB (like Postgresql) for posterity.
At the moment, I am storing in the DB the marshalled representation
of the data; whenever I need to use it again in the Ocaml programme
I simply fetch it from the DB and unmarshal it.

This works fine; there is however one nagging problem: the marshalled
representation is brittle.  If Story.t changes even slightly, I will
no longer be able to retrieve values marshalled with the old version.

Note that changes to Story.t are likely to be small and will not
invalidate backwards compatibility (e.g., here or there another
variant might be added, but not removed or renamed).

I am therefore looking for an alternative to marshalling that a) does
not suffer from the brittleness problem, and b) is fast.  At the
moment, the best thing that occurs to me is to convert the data into
XML and to store it as such in the DB (the data is easily converted
into XML).

This strikes me as problem bound to be common.  How do you guys  
typically
solve this sort of situation?  In addition, if the XML route is indeed
the best one, what Ocaml tools would you recommend?  (Again, this is
for an application where speed is of the essence).
			
** David MENTRE suggested:

I asked a similar question once. One of the answer was to use Sexplib,
to store your values as Lisp S-exp:
<http://www.ocaml.info/home/ocaml_sources.html#toc11>

You could also use Config_file: <http://home.gna.org/cameleon/configfile.en.html 
 >
			
** Basile STARYNKEVITCH also answered:

It is even theoretically a very difficult problem. There have been some
publications by Cristal, Moscova, Gallium people at INRIA.

Assuming you have no abstract types, no objects, and no closures, and no
polymorphisms i.e. that there is a *.ml source file containing all the
type definitions. Then the types are composed by base types (int,
string, ...), sums, records, and perhaps arrays.

Then you could consider what are the deltas on the type definition.

In a sum like type sumt = A of t1 | B of t2 | C
you might consider what happens when you remove let say B, or add let's
say a choice | D of t3

In a record, consider likewise what happens when adding or removing a  
field.

Etc....

The details are very complex (at least to me, who tried unsucessfully to
work on this during my year at INRIA).

Maybe a semi-hand-crafted generator could help. Adding polymorphisms,
closures, objects, abstract types is a big mess.

Good luck to you. Try to publish something.
			
** Martin Jambon then replied, and Jake Donham added:

 > if we restrict the transition to a certain subset of operations, it  
can be
 > possible to define a mapping using some Camlp4 tool such as Deriving
 > (well, that's what I was told, assuming I interpreted correctly).

 > For instance:
 > - adding a record field: a default value is injected
 > - removing a record field: just remove it
 > - adding a variant: do nothing since it doesn't exist in the old data
 > - changing a type arbitrarily (such as changing a type foo1 into foo2
 >   everywhere): provide a map function that would override the  
default map
 >   function for such nodes.
 > - functional and abstract values: left as-is

We do this at Skydeck. We modified Deriving's Typeable module to  
expose the
structure of types, and we marshal values along with a version number  
(for
upgrades) and a type description from Typeable (so you get an error if  
you
change the type and forget to change the version number).

Our modified Typeable also has reflection functions so if you have a  
dynamic
(a value along with a Typeable.TypeRep) you can examine its parts
dynamically (implemented with Obj.magic). On top of that we have a  
generic
function for changing a value of one type into a value of another, as  
Martin
describes--we try to do the translation automatically as far as we  
can, and
provide a way to pass in mapping functions for particular types.

For the most part this works well. Compared to a SQL database it is very
nice to have the OCaml type system for data representation. The upgrade
mechanism is much safer than hand-coding SQL schema upgrades. On the  
other
hand, there are many things you get for free with a SQL db that you  
have to
do yourself: e.g. putting IDs on objects so you can refer to them
externally, easy hand-inspection of the data (it's annoying navigating  
big
structures in the top-level), to name a couple of small ones.
			
** Later in the thread, Berke Durak said:

I had exactly those kinds of issues during my work at the EDOS project.
We were basically treating the metadata for every dat of every component
of every architecture of every Debian distribution.

I started using SQL.  I first tried MySQL then PostgreSQL.
Fill performance was bad.  (And didn't get fast enough when Jaap added
prepared statements for postgresql).
Query performance was worse, because we were interested in things like
the transitive dependency closure, and SQL is
well-known to not be suitable for such transitive properties.  Maybe I
should have tried Sqlite but I didn't know it at the time.

I then switched to marshalling.  Boy, that was fast!  Of course I got
bitten very hard by segmentation faults...  sorry, Xavier, for bothering
you about such stupidity.  And yes, I had a version number in my
datastructures but, no, it was not automatically updated.

Also, adding a field was really, really painful.  Marshalling could  
work when
your data structures are stable, but during development, it is painful,
especially if you can't use a toy data set for development.

I then built an I/O combinator library.  Of course, combinators already
exist for defining parsers and printers, but I thought that it would be
great to combine them both.  (Such an idea has been presented at POPL
under the term of "picklers" if I recall correctly.)

Basically you describe your types using "literates" which can then be  
used
for reading and writing, as in

    type my_litterate = io_pair (io_list io_int) (io_hasthbl io_int  
io_string)

Anyway, that led to the "IO" module available here

    <http://caml.inria.fr/cgi-bin/hump.en.cgi?contrib=537>

and a better version still lives in the EDOS codebase, which is now
maintained by Jaap Boender.  I think Jaap made a GODI package for it.

IO can use multiple backends (binary, ASCII) and that was quite useful.
Performance is reasonably good.  One problem is with records and sum  
types -
you have to give pattern-matching functions to define readers/writers  
for
them. However, you can decide how you handle missing values or extra  
fields.

Martin Jambon's JSON wheel could also be used here, but I'm waiting for
the preprocessor situation to stabilize a little bit...

One major drawback of IO is that it does not handle sharing or recursive
structures.  And I know of no efficient, type-safe way of handling  
those,
especially if you don't want the serializer to add extra sharing (for  
instance
if you have distinct mutable records with the same contents)

I am still thinking about these problems.  For instance, I have been  
recently
working on Wikipedia revision history (only the Turkish (15 GB) and  
French
ones (200 GB), the English one is too big).  I needed maximal  
efficiency so I
used Marshal, which was pretty fast.  Of course I added a small version
header.

I noticed that when you Marshal a closure, the module puts a MD5 of
presumably the whole program in the output (which costs some bytes but  
that's
another problem.)

It would be nice to have a mechanism for marshalling a value with an  
MD5 of
its types, and unmarshalling only then the MD5 matches.  In non- 
malicious
environments, that is, in environments where no-one fakes MD5s, that  
would be
quite safe.

Of course it won't be possible to marshal polymorphic types.

So, I'm asking the experts: is it possible to have a very small  
extension that
would allow this? We could have the following restrictions and it  
still would
be very useful:

    - the type t must be defined in a module M, which must live in a  
separate
      compilation unit
      (i.e. have an available cmi)
    - the type t must be ground
    - the type t must be named in an interface
    - it must be explicitly named when marshalling/unmarshalling
    - this is an ugly hack that only works in Marshal

Something along the lines of :

    safer_output_to_channel stdout (z : Interface.t) []

Trying

    safer_output_to_channel stdout z []

Would yield

    safer_output_to_channel stdout z []
                                   ^
Error: value must be explicitly annotated with a type

safer_output_to_channel stdout (33 : int) []
Error: Type needs to be defined in an external interface to be  
Marshallable.

safer_output_to_channel stdout (bar : Foo.t) []
Error: Module Foo.t has no interface

safer_output_to_channel stdout (bar : int Foo.t) []
Error: Type Foo.t is not ground
			
** Brian Hurt said and Dario Teixeira answered:

 > You're making two mistakes.

 > Mistake #1: treating a database as a dumb object store.  This is a  
really
 > popular idea right now- Hibernate does this, as does Ruby on Rails,  
and a
 > number of other ORM packages take this effective approach.  On the  
other
 > hand, dynamically typed languages are also really popular.

Thanks for your reply.  It was quite interesting, though I get the  
feeling
you used my question solely as a trigger to share with us a long-held
dissatisfaction with the current state of affairs concerning the use of
databases, regardless of whether it actually applies to my particular  
problem.
That's fair, and I do agree with practically all your points.  However,
if I were you I would refrain from starting such missives with  
statements
as blunt and uncompromising as "You're making two mistakes".  As it  
turns
out, this is one of those cases where the data (tree-like, with  
recursive
structures) does not map well at all with a relational database.

Moreover, I am far from treating the database as a dumb object  
storage. In
fact, a significant portion of the code for this particular  
application (a web
app using Ocsigen; you can download a preliminary version of the app  
from
<http://dario.dse.nl/projects/lambdium-light/>) lies on the Postgresql  
side,
with the Ocaml code serving as little more than delivery boy. The  
exception is
of course those portions where it is far more natural and performant  
to let
the Ocaml code take the reins and treat the DB as dumb storage. The  
complex
data structure holding the markup of stories/comments is one such  
example.

 > So, mistake number one: either use the data, and structure your  
data (at
 > that layer) to take advantage of it, or don't use a database.

Unfortunately, this is an overly general statement.  I have no doubt  
that
if I were to present you the data I have, your reply would be "in this  
case,
just use the DB as dumb storage".

 > Mistake number two: file formats (and this includes marshalled data
 > structures), are wire protocols, and need to be designed to be as  
abstract
 > as possible- to reveal as little about the internal structure of the
 > program as possible (preferrably none at all).

On the other hand, one of the advantages of using a language with such a
rich type-system as Ocaml's, is that the application-independent  
description
of the data can be translated on practically a 1:1 basis to native  
language
constructs!  Trust me, I didn't need to bend the data definition to suit
the internal structure of the programme.
			
========================================================================
3) SML and friends go GPCE/OOPSLA 2008
Archive: <http://groups.google.com/group/fa.caml/browse_frm/thread/7a849b5d2b66fef8/31fad544bf2be58f#31fad544bf2be58f 
 >
------------------------------------------------------------------------
** Ralf Laemmel announced:

[One-liner version: please submit papers/workshop/tutorial proposals
for GPCE 2008.]

The conference on Generative Programming and Component Engineering
(GPCE) had some functional programming input over the last few years.
SML, Caml, OCaml, MetaOCaml expertise is relevant for GPCE. Think of
aggressive optimization, modularization, and staged computation, for
example. Also, general advanced programming language folks probably
hang out on this list (as is the case on the more lazy Haskell list --
better known to me). This year's GPCE conference again co-locates with
OOPSLA. We are looking for strong proposals for workshops and
tutorials that complement the GPCE technical programme, and thereby
also contribute to the combined GPCE/OOPSLA programme. It will be
great to get a submission or two from the SML, Caml, OCaml, MetaOCaml
community. Please have a look at the call for proposals, at the list
of topics, and at the history of the GPCE conference series.

I specifically encourage you to reach out for the crowd at GPCE/OOPSLA.

<http://www.hope.cs.rice.edu/twiki/bin/view/GPCE08/>
<http://www.hope.cs.rice.edu/twiki/bin/view/GPCE08/CallForWorkshops>
<http://www.hope.cs.rice.edu/twiki/bin/view/GPCE08/CallForTutorials>

[Deadline: 20th March]

Regards,
Ralf Laemmel
Satellite Chair GPCE 2008
Eager type-level programmer

PS: The paper deadline for GPCE is in May.
			
========================================================================
4) OCaml version 3.10.2 released
Archive: <http://groups.google.com/group/fa.caml/browse_frm/thread/f69c8c92e3695499/d873d5f21a391db5#d873d5f21a391db5 
 >
------------------------------------------------------------------------
** Damien Doligez announced:

It is our pleasure to announce the release of OCaml version 3.10.2.
This is mainly a bug-fix release, see the list of changes below.

It will be available here shortly:
  <http://caml.inria.fr/download.en.html>

If you're in a real hurry, you can get the source now from
  <ftp://ftp.inria.fr/INRIA/caml-light/ocaml-3.10/>

Happy hacking,

-- Damien Doligez for the OCaml team.


Objective Caml 3.10.2:
----------------------

Bug fixes:
- PR#1217 (partial) Typo in ocamldep man page
- PR#3952 (partial) ocamlopt: allocation problems on ARM
- PR#4339 (continued) ocamlopt: problems on HPPA
- PR#4455 str.mli not installed under Windows
- PR#4473 crash when accessing float array with polymorphic method
- PR#4480 runtime would not compile without gcc extensions
- PR#4481 wrong typing of exceptions with object arguments
- PR#4490 typo in error message
- Random crash on 32-bit when major_heap_increment >= 2^22
- Big performance bug in Weak hashtables
- Small bugs in the make-package-macosx script
- Bug in typing of polymorphic variants (reported on caml-list)
			
========================================================================
5) Call for help: identify corrupted bytecode packages on Ubuntu Hardy
Archive: <http://groups.google.com/group/fa.caml/browse_frm/thread/30328201fab81d25/3285bf354d1d1188#3285bf354d1d1188 
 >
------------------------------------------------------------------------
** David MENTRE asked:

I discovered the some OCaml bytecodes were silently corrupted in Ubuntu
Hardy (and Gutsy)[1]. The source of corruption, the package
pkg-create-dbgsym, is now fixed in current Hardy. The package ocamlnet
which was corrupted has been properly rebuilt and now works.

However, other packages might also be corrupted.

Right now, I have identified the following packages:
* cameleon

* dag2html

* ocsigen

* omake (not sure)

* polygen-data (not sure)

A rebuilt has been asked here:
<https://bugs.launchpad.net/ubuntu/+source/pkg-create-dbgsym/+bug/ 
197293>

But I might have skipt some of the corrupted packages (I don't know the
expected behavior of all packages). So, if you have access to Ubuntu
Hardy, could you test your favorite OCaml package and report any issue
to me or at the above Ubuntu bug.

Yours,
d.


Footnotes:
[1]  <https://bugs.launchpad.net/ubuntu/+source/ocamlnet/+bug/180364>
			
========================================================================
6) OCaml-Java project : beta version
Archive: <http://groups.google.com/group/fa.caml/browse_frm/thread/1173790aef3a3292/1439e5ad28f8d174#1439e5ad28f8d174 
 >
------------------------------------------------------------------------
** Xavier Clerc announced:

This post announces the beta release of the OCaml-Java project.
The goal of the OCaml-Java project is to allow seamless integration of  
OCaml
and Java.
Home page: <http://ocamljava.x9c.fr>

This version features a new binary distribution that is packed with  
all you
need to run OCaml programs on a JVM. This distribution does not even  
need the
standard Objective Caml distribution to be installed on your computer.
URL: <http://ocamljava.x9c.fr/downloads.html>


All the subprojects of OCaml-Java have of course been updated.
A detailed changelog from the initial alpha release to the new beta  
release
follows.

Barista subproject:
  - minor API changes
  - minor grammar changes
  - Java API

Cadmium subproject:
  - scripting support
  - XML support
  - file hooks / embedded mode
  - toplevel enhancements
  - fr.x9c.cadmium.support.values for easy encoding of OCaml values
  - SPI support

Cafesterol subproject:
  - scripting support
  - standalone linking mode
  - servlet support

Nickel subproject:
  - support for 'meta' tags

OCamlScripting subproject:
  - standalone version
			
** Richard Jones said:

I should add that there's a Fedora tracking bug for ocamljava where
(if you have Fedora or Red Hat Enterprise Linux) you can play with the
previous alpha version:

  <https://bugzilla.redhat.com/show_bug.cgi?id=434560>

I was working on the beta version today but didn't quite get it going.
When I do it'll appear in the above tracking bug.
			
** Adrien said and Xavier Clerc replied:

Now about ocamljava, ocaml.jar immediately exits with "unable to
locate ocaml toplevel". I thought it did not need ocaml(.out).

Also, ocamljava.jar has another problem :
java -jar bin\ocamljava.jar
Fatal error: exception
Sys_error("/usr/local/share/camomile/database/general_cat
egory.mar: java.io.IOException: file does not exist:
.\usr\local\share\camomile\
database\general_category.mar")

Well, I have to admit that I made some errors while crafting the binary
package. I just fixed them and uploaded a new version (check out
<http://ocamljava.x9c.fr/downloads.html>).. You should now be able to  
use the
various tools. Here is a short test session assuming you have a  
"source.ml"
file (typical content: <<let () = List.iter print_endline  
["hello ..."; "...
world"]>>):

Execution of toplevel:
	java -jar ocaml.jar

Compilation to OCaml bytecode:
	java -jar ocamlc.jar -o bc source.ml
Execution of produced btyecode:
	java -jar ocamlrun.jar bc

Compilation to Java jar file:
	java -jar ocamljava.jar -standalone -o prog.jar source.ml
Execution of produced jar file:
	java -jar prog.jar


However, I noticed a bug with the binary package that does not show up  
if you
have the standard OCaml distribution installed on your computer: the
ocamlc.jar/ocamljava.jar compilers will fail if the current directory  
contains
the by-product of a previous compilation (e.g. cmo or cmi files). The
temporary workaround is to delete these files before any compilation.

I suspect this bug has something to do with the embedded mode (the  
ability for
the Java-run programs to look for files inside their jar files rather  
than on
the local file system). I will hunt this bug and report to the list  
when it is
fixed. Sorry for the inconvenience in the meanwhile.
			
========================================================================
7) PLplot library bindings update (5.8.0)
Archive: <http://groups.google.com/group/fa.caml/browse_frm/thread/32ebb3c42bbee0ba/f3146e8ef8187b7f#f3146e8ef8187b7f 
 >
------------------------------------------------------------------------
** Hezekiah M. Carty announced:

I would like to announce an update to my PLplot [1] library bindings
for OCaml.

The code can be downloaded from:
<http://code.google.com/p/ocaml-plplot/>

The main changes for this release are:
- Works now with camlidl from Debian and other non-GODI installs.  The
  previous release had problems on non-GODI OCaml installations.
- More or less complete coverage of PLplot 5.8.0 functions.  If  
anything is
  missing or needed, please let me know.
- The version number has changed to match the corresponding PLplot  
version

Requirements:
- OCaml (tested with 3.10.0, should work on older versions?)
- PLplot version 5.7.x or 5.8.x - tested with 5.7.3 through 5.8.0
- camlidl
- findlib

The license is the same as PLplot (LGPLv2+).

This code has been developed and tested on Debian Sid and CentOS 5,
using both GODI and Debian OCaml packages.  I would be happy to hear
about any successes or failures on other systems.

There are currently two examples included [2], which corresponds to
examples 11 and 19 on the PLplot website [1].  The PLplot  
documentation [3]
is a good reference as this binding is very close to the C library.
There is some
documentation available on the ocaml-plplot Google Code wiki as well.   
Please
feel free to ask if you have any questions, problems or requests.

Enjoy!

[1] - <http://plplot.sourceforge.net/>

[2] - <http://code.google.com/p/ocaml-plplot/source/browse/trunk/examples/x11.ml 
 >
and <http://code.google.com/p/ocaml-plplot/source/browse/trunk/examples/x19.ml 
 >

[3] - <http://plplot.sourceforge.net/docbook-manual/>
			
========================================================================
8) PhD positions in Functional Programming at Chalmers in Sweden
Archive: <http://groups.google.com/group/fa.caml/browse_frm/thread/788ee2e7d384d2d5/d0574a96b5774671#d0574a96b5774671 
 >
------------------------------------------------------------------------
** Koen Claessen and John Hughes announced:

** please forward this to interested students **

PhD Positions in Functional Programming
at the department of Computer Science and Engineering
at Chalmers University of Technology, Sweden

<http://www.cs.chalmers.se/~koen/phdad.html>

Application deadline: March 11, 2008
Position starts: April 1, 2008 or September 1, 2008

The PhD student will join the research activities at our department in
applications of functional programming, much of which concentrates
on the design and application of Domain Specific Embedded Languages.
Examples of our work are QuickCheck for specification-guided random
testing, and Lava for hardware design and verification. Our group has
also developed the award winning automated first-order logic reasoning
tools Paradox and Equinox, which are both written in Haskell.

There are advertised positions in two focus areas:

1. Development of the next generation of Paradox and Equinox, which
involves inventing new programming techniques for building a modular,
flexible automated reasoning tool, as well as developing novel automated
reasoning algorithms. As new exciting applications for automated
reasoning tools arise, new demands are placed on the reasoning tools,
forcing changes in how they are designed and implemented.

2. Development of verification techniques for distributed systems
implemented in the functional programming language Erlang. Methods
here include QuickCheck-based testing, model checking and (automated)
theorem proving techniques, and integration of these different  
techniques.
There is a possibility of attacking real-world problems from our  
industrial
partners.

For more information, please look at:

<http://www.cs.chalmers.se/~koen/phdad.html>
			
========================================================================
Using folding to read the cwn in vim 6+
------------------------------------------------------------------------
Here is a quick trick to help you read this CWN if you are viewing it  
using
vim (version 6 or greater).

:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'<1':1
zM
If you know of a better way, please let me know.

========================================================================
Old cwn
------------------------------------------------------------------------

If you happen to miss a CWN, you can send me a message
(alan.schmitt at polytechnique.org) and I'll mail it to you, or go take a  
look at
the archive (<http://alan.petitepomme.net/cwn/>) or the RSS feed of the
archives (<http://alan.petitepomme.net/cwn/cwn.rss>). If you also wish
to receive it every week by mail, you may subscribe online at
<http://lists.idyll.org/listinfo/caml-news-weekly/> .

========================================================================

-- 
Alan Schmitt <http://alan.petitepomme.net/>

The hacker: someone who figured things out and made something cool  
happen.
  .O.
  ..O
  OOO


-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.idyll.org/pipermail/caml-news-weekly/attachments/20080304/7222589e/attachment-0001.html 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PGP.sig
Type: application/pgp-signature
Size: 186 bytes
Desc: This is a digitally signed message part
Url : http://lists.idyll.org/pipermail/caml-news-weekly/attachments/20080304/7222589e/attachment-0001.pgp 


More information about the caml-news-weekly mailing list