[cwn] Attn: Development Editor, Latest Caml Weekly News

Alan Schmitt alan.schmitt at polytechnique.org
Tue Aug 1 10:54:00 PDT 2006


Hello,

Here is the latest Caml Weekly News, for the week of July 18 to  
August 01, 2006.

1) syntax extension for parsing command-line options
2) sqlite3 bindings
3) Logging
4) Cocoa bindings (again)
5) efficient sparse matrix implementation anyone?
6) ocaml+twt 0.85
7) Internships at MSR Cambridge
8) Web page scraping packages

========================================================================
1) syntax extension for parsing command-line options
Archive: <http://groups.google.com/group/fa.caml/browse_thread/thread/ 
785983ef019f5b51/d227e623047de0b8#d227e623047de0b8>
------------------------------------------------------------------------
** Eric Breck announced:

Apropos of the recent discussion on the best way to handle command-
line arguments, I'd like to announce the release of pa_arg version
0.2.0, a syntax extension and library for parsing command-line
options. It offers:

* a clean, simple syntax for declaring and using options
* a pure functional interface
* GNU-style or Caml-style handling of options (-ane 'arg' vs -a -n -e
'arg')
* symbol-typed options via polymorphic variants
* generated code to produce a string representation of the chosen
options
* sensible defaults, but powerful, extensible behavior
* and more...

The extension and supporting module, along with a detailed example and
more documentation are available at <http://www.cs.cornell.edu/ 
~ebreck/pa_arg/>
It's also available from GODI as godi-pa_arg (or will be as soon as
the GODI svn archive is back up).

I'd like to thank Martin Jambon for his excellent camlp4 tutorial as
well as detailed feedback on previous versions of this extension.

thanks!

Eric Breck

A short example (a longer example is available on the website):

main.ml contains:

open Parseopt
type option opts = {
    alpha        : help = "smoothing parameter"; float;
    beta = false : action = set; bool;
    gamma        : key = ["-x"]; [`Ecks | `Why | `Zee];
    delta        : int;

}

if the user types: ./main -alpha 0.45 -b -x why anonymous args
then then the call

parse_argv opts

returns

{alpha=Some 0.45; beta=true; gamma=Some `Ecks; delta=None},
["anonymous";"args"]

PS: If anyone remembers a prior posting of this to ocaml_beginners
last year, this version is considerably improved, but somewhat
different (for one thing,it no longer interfaces to Arg, but provides
its own library).  Therefore, it's incompatible with that version.

========================================================================
2) sqlite3 bindings
Archive: <http://groups.google.com/group/fa.caml/browse_thread/thread/ 
a745ca5acf6ffc4e/98a50c2109b00fcd#98a50c2109b00fcd>
------------------------------------------------------------------------
** Mike Lin asked and Ted Kremenek answered:

 > Hi, does anyone know how (if) the available ocaml sqlite bindings
 > work with recent versions of sqlite (namely 3.x.x)?

i think there are more than one available SQLite bindings for OCaml,
including some that are just for versions of SQLite before the API
was changed in version 3 and others engineered directly for SQLite3.

I am using Bárður Árantsson's bindings that are available at:

<http://www.imada.sdu.dk/~bardur/personal/45-programs/>

These particular OCaml bindings in many ways directly mirror the
SQLite3 C API; they are not a DBI-like interfaces, but work well if
you are familiar with SQLite3.

I have modified the bindings slightly for my own needs by adding a
few methods and fixing a few bugs (any fixes of which I plan to
eventually contribute back to Bárður), but it works very well with

the latest version of SQLite3.  I am using it just fine with version
3.3.6 of SQLite.  I have successfully used it to create database
files of several gigabytes (populated with marshaled OCaml objects
using Sqlite3's blob type).

** Toby Allsopp also answered:

 > Hi, does anyone know how (if) the available ocaml sqlite bindings  
work with
 > recent versions of sqlite (namely 3.x.x)?

<http://metamatix.org/~ocaml/ocaml_sqlite3.html>
I've used this and it works fine, although there is a minor bug that
causes seg faults in certain situations that I can't currently
recall.  I have a patch that I will send next week from work.

(See the patch at the archive link.)

========================================================================
3) Logging
Archive: <http://groups.google.com/group/fa.caml/browse_thread/thread/ 
3ef0a74268f2a764/2b2bb5fba0d77298#2b2bb5fba0d77298>
------------------------------------------------------------------------
** Tiago Antão asked and Matthieu Dubuget answered:

 > Are there any best practices/libraries regarding logging in CAML? My
 > main use for it will be debugging (I know there are debugging
 > facilities in CAML, but I am very used to using log as a debug tool -
 > think Log4J for instance)...

I do not know Log4J. For the same purpose as yours, I did a small Log
library, using another one, called Tics.
Tics is a binding to QueryPerformanceCouter and al. on MS Windows. At
this moment, it is not multiplatform, but this should be done very  
easily.

I need Tics because each message sent to Log is timestamped and stored
in memory. The user of Log then can flush and print or display the
stored message when possible.

I did this because I have plenty of memory for those tests, and I wanted
to avoid IO perturbations of timing. I also needed precise timing,
because I'm working with hardware.

Except from those constraint, I really don't think my tools are forth  
using.

** James Woodyatt suggested:

You may want to have a look at the Cf_journal module in my recently
released OCaml NAE Core Foundation library.  See
<http://sf.net/projects/ocnae/> for downloads.  It's not anywhere  
near as full-
featured as Log4J, but it was intended as a foundation upon which to
build something that full-featured.
I find it works pretty well for debugging purposes.  Your mileage may  
vary.

** Jean-Christophe Filliatre suggested:

I'm also  using logging  as a  debug facility sometimes,  and I  use a
printf-like function for this purpose, as follows:

=========================
let log_ch = open_out "logfile"
let log = Format.formatter_of_out_channel log_ch
let () = at_exit (fun () -> Format.pp_print_flush log (); close_out  
log_ch)
let lprintf s = Format.fprintf log s
=========================

which provides a function lprintf of type

=========================
val lprintf : ('a, Format.formatter, unit) format -> 'a = <fun>
=========================

to be used like Format.printf.

This is surely not  the best way to do, and not  very powerful, but at
least  it is  convenient  to  use when  you  already have  Format-like
printers for your datatypes (using %a).

========================================================================
4) Cocoa bindings (again)
Archive: <http://groups.google.com/group/fa.caml/browse_thread/thread/ 
a137583ee5e119a2/3e2f89fa410c7b03#3e2f89fa410c7b03>
------------------------------------------------------------------------
** Joel Reymont asked and Paul Snively answered:

 > I would like to tinker with Cocoa bindings and try to move that
 > project forward.
 >
 > Where should I start from? I would like to take the route of
 > parsing Obj-C header files.

I agree that this is long overdue. A long time ago, Mike Hamburg,
Jeff Henrikson, and I made noises about working on this, based on
Mike's work on addressing the runtime side of it (he got so far as to
have a somewhat under-performant but usable runtime library
integrating O'Caml with the Objective C runtime based, IIRC, on
Obj.magic) and Jeff pointed out that Frontc, the parser that he used
in his Forklift FFI, had diverged from probably the best one for real-
world use, which is embedded in CIL and intertwined in ways that make
it a challenge to backport. Mike, Jeff, if you're reading this, is
this a fair characterization of your efforts and thoughts?
In any case, I still believe that:

1) It's worth addressing whatever issues need addressing in Mike's work.
2) It's worth resolving what parser to use and, IMHO, how to evolve
Forklift to support generating calls to and from Objective C via
Mike's runtime work.
3) It's worth combining the two to provide the Forklift annotations
to allow calling into and out of Cocoa on Mac OS X.
4) It's worth writing an FRP system for O'Caml a la Yampa for Haskell.
5) It's worth using said FRP system in conjunction with perhaps
<http://ocaml-win32.sourceforge.net>,
<http://wwwfun.kurims.kyoto-u.ac.jp/soft/lsl/lablgtk.html>, and our  
Cocoa
bindings to create a truly cross-platform GUI environment for O'Caml.

** Joel Reymont then asked and Paul Snively answered:

 > I must be missing something but ... what does FRP have to do with
 > Cocoa bindings?

In and of itself, nothing; I just like the FRP approach to GUI
programming, so I see an opportunity to kill two birds with one stone:
1) Provide O'Caml a nice FRP framework.
2) Provide O'Caml a nice GUI framework that doesn't suffer the
vagaries of the usual OO GUI frameworks.

** James Woodyatt said:

 > In and of itself, nothing; I just like the FRP approach to GUI
 > programming, so I see an opportunity to kill two birds with one  
stone:

 > 1) Provide O'Caml a nice FRP framework.

At the risk of engaging in more than my fair share of self-promotion,
I should point out that the OCaml NAE I/O Reactor library I just
released is an FRP framework.  It's pretty spare at the moment and
needs a lot of additions.  Also, I didn't write it with graphical
user interfaces in mind-- the goal was a good framework for single-
threaded multiplexing network application servers.  (The acronym
"NAE" stands for 'Network Application Environment'.)

         <http://sf.net/projects/ocnae/>

The Yampa framework doesn't strike me as appropriate for building a
GUI.  I suspect such a GUI toolkit would offer highly underwhelming
performance characteristics.  I could be wrong about that, and would
welcome such a surprise, but that's what my unscientific guess tells me.

 > 2) Provide O'Caml a nice GUI framework that doesn't suffer the
 > vagaries of the usual OO GUI frameworks.

For my own part, I plan to do all my GUI work in Cocoa with native
Objective-C.  A more useful addition to the OCaml HUMP, I argue,
would be bindings for CoreData.
         <http://developer.apple.com/macosx/coredata.html>

At some point, if no one else has done it first, I will get around to

doing it myself.  Don't anybody hold their breath waiting for it,
though... I have a lot of hobby projects these days, what with a day
job and a 6-month old baby in the house.

========================================================================
5) efficient sparse matrix implementation anyone?
Archive: <http://groups.google.com/group/fa.caml/browse_thread/thread/ 
f2d25d14bba464ef/58b4786ecd3592ab#58b4786ecd3592ab>
------------------------------------------------------------------------
** Jonathan Roewen asked and Erick Tryzelaar answered:

 > I'm wondering if anyone has a sparse matrix library for ocaml, or
 > perhaps a wrapper around some C library.

lacaml might be suitable for you. You can find it here:
<http://www.ocaml.info/home/ocaml_sources.html>

========================================================================
6) ocaml+twt 0.85
------------------------------------------------------------------------
** Mike Lin announced:

I just released an update of "The Whitespace Thing" for OCaml, my  
preprocessor
that lets you use Python or Haskell-style indentation to avoid most  
multi-line
parenthesization. The update adds support for some language features  
that I had
previously overlooked.
<http://people.csail.mit.edu/mikelin/ocaml+twt/>
While the implementation is slightly lame, I'm using this every day on
moderately large projects and I recommend it if you like this style.

========================================================================
7) Internships at MSR Cambridge
Archive: <http://groups.google.com/group/fa.caml/browse_thread/thread/ 
c64dcb54143cc844/4983dbd284e803c4#4983dbd284e803c4>
------------------------------------------------------------------------
** Simon Peyton-Jones announced:

Would you be interested in working for three months at Microsoft
Research, Cambridge, on a project related to GHC or Haskell?

MSR Cambridge now takes interns *year-round*, not just in the summer
months.  Simon Marlow and I are keen to attract motivated and
well-qualified folk to work with us on our research, and on improving or
developing GHC.

(I'm being a bit cheeky sending this to the Caml mailing list!  But
there are lots of other programming-language folk at MSR (F#, security,
probabilistic languages...), and stuff beyond that (machine learning,
systems and networks...); see
<http://research.microsoft.com/aboutmsr/labs/cambridge/>.)

Anyway, if you or one of your students is interested, you can find more
details here
         <http://hackage.haskell.org/trac/ghc/wiki/Internships>

Our next empty slot is the Oct-Dec period, but you are welcome to apply
for some later period.

========================================================================
8) Web page scraping packages
Archive: <http://groups.google.com/group/fa.caml/browse_thread/thread/ 
e60e2ab738572869/fe06fdf5a93597d1#fe06fdf5a93597d1>
------------------------------------------------------------------------
** Joel Reymont asked:

Are there any screen-scraping packages for OCaml?

I'm looking for something that would let me analyze the contents of a
web page and extract, for example, all the image tags.

I'm using Ruby for this at work and something like hpricot [1] is
very neat but also somewhat slow.

         Thanks, Joel

[1] <http://code.whytheluckystiff.net/hpricot/>

** Karl Zilles answered:

I don't think of this as screen scraping.  Spidering might be a  
better word.
I've done a good bit of this in OCaml.  I use the curl package for
downloading web pages and the netstring package for parsing them.

I'm going to attach a couple of files that I use for this sort of stuff.
   The file htmltreeutils.ml has a bunch of functions for working with
the results of a nethtml parse tree.

So your program would look something like this.. and this hasn't been
tested:

open Htmltreeutils

      let result = Buffer.create 2000 in
      let connection = Curl.init () in
      Curl.set_httpget connection true;
      Curl.set_url connection "<http://www.yahoo.com/randompage.html>";
      Curl.set_writefunction connection (fun s -> Buffer.add_string  
result s);
      Curl.set_headerfunction connection (fun s -> ());
      Curl.perform connection;
      Curl.cleanup connection;

      let dom = get_parsed_html_from_string result in
      let img_tags = list_tags "img" dom in
      .... do something with img tags here like pull out their src
        attributes

(The helper files are at the archive link.)

** Richard Jones also answered:

We did some web scraping using WWW::Mechanize + perl4caml.  As a
result, perl4caml contains pretty complete bindings for the
WWW::Mechanize library.
<http://merjis.com/developers/perl4caml>
<http://resources.merjis.com/developers/perl4caml/ 
Pl_WWW_Mechanize.www_mechanize.html>

========================================================================
Using folding to read the cwn in vim 6+
------------------------------------------------------------------------
Here is a quick trick to help you read this CWN if you are viewing it  
using
vim (version 6 or greater).

:set foldmethod=expr
:set foldexpr=getline(v:lnum)=~'^=\\{78}$'?'<1':1
zM
If you know of a better way, please let me know.

========================================================================
Old cwn
------------------------------------------------------------------------

If you happen to miss a CWN, you can send me a message
(alan.schmitt at polytechnique.org) and I'll mail it to you, or go take  
a look at
the archive (<http://alan.petitepomme.net/cwn/>) or the RSS feed of the
archives (<http://alan.petitepomme.net/cwn/cwn.rss>). If you also wish
to receive it every week by mail, you may subscribe online at
<http://lists.idyll.org/listinfo/caml-news-weekly/> .

========================================================================

-- 
Alan Schmitt <http://alan.petitepomme.net/>

The hacker: someone who figured things out and made something cool  
happen.
.O.
..O
OOO


-------------- next part --------------
A non-text attachment was scrubbed...
Name: PGP.sig
Type: application/pgp-signature
Size: 186 bytes
Desc: This is a digitally signed message part
Url : http://lists.idyll.org/pipermail/caml-news-weekly/attachments/20060801/c062a479/PGP.bin


More information about the caml-news-weekly mailing list