PC conflicts
Simon Peyton Jones Stephanie Weirich

Hoopl: A Modular, Reusable Library for Dataflow Analysis and Transformation

Submitted (PDF, 238kB). Updated Friday 2 Apr 2010 10:24:41am EDT  |  SHA-1 4e8df6a07b4c698ac20a073bf2a3decf1de2467c
Dataflow analysis and transformation of control-flow graphs is pervasive in optimizing compilers, but it is typically tightly interwoven with the details of a particular compiler. We describe Hoopl, a reusable Haskell library that makes it unusually easy to define new analyses and transformations for *any* compiler. Hoopl's interface is modular and polymorphic, and it offers unusually strong static guarantees. The implementation is also far from routine: it encapsulates state-of-the-art algorithms (interleaved analysis and rewriting, dynamic error isolation), and it cleanly separates their tricky elements so that they can be understood independently.
Norman Ramsey (Tufts University) <nr@cs.tufts.edu>
João Dias (Tufts University) <dias@cs.tufts.edu>
Simon Peyton Jones (Microsoft Research) <simonpj@microsoft.com>
compilation components and composition
Documents and Options
Submission category: Functional pearl
Supplementary material: PDF
Review   Overall merit   Expertise
#69A     B               Y
#69B     A               Y
#69C     A               X
#69D     C               Y

Review #69A

Modified Tuesday 18 May 2010 8:33:53am EDT
Overall merit
B. OK paper, but I will not champion it.
Expertise
Y. I am knowledgeable in the area, though not an expert.
Paper summary
Describes the design, interface and implementation of a generic library for dataflow analyses and transformations. The approach followed is Lerner, Grove and Chambers's "analyze and (simultaneously) transform" rather than the more traditional "analyze then transform". GADTs and type families are used in a couple of places to statically enforce some well-formedness properties on control-flow graphs (e.g. maintain a distinction between nodes that start a block, end a block, or are in the middle of a block).
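
The shape discipline mentioned in this summary can be illustrated with a minimal sketch. The type and constructor names below (Node, Block, Entry, Assign, Goto) are hypothetical and chosen for illustration; they are not claimed to be Hoopl's actual definitions. The sketch only shows how GADT indices can separate nodes that start a block, sit in the middle of a block, or end a block.

```haskell
{-# LANGUAGE GADTs #-}

-- Shape indices: O = open (control falls through), C = closed
-- (control enters or leaves only via labels).
data O
data C

type Label = Int

-- Hypothetical node type, indexed by entry shape and exit shape.
data Node e x where
  Entry  :: Label         -> Node C O   -- starts a block
  Assign :: String -> Int -> Node O O   -- middle of a block
  Goto   :: Label         -> Node O C   -- ends a block

-- A block is a sequence of nodes whose shapes line up.
data Block e x where
  BUnit :: Node e x -> Block e x
  BCat  :: Block e O -> Block O x -> Block e x

blockSize :: Block e x -> Int
blockSize (BUnit _)  = 1
blockSize (BCat a b) = blockSize a + blockSize b

-- A block closed on entry must begin with Entry, so no other
-- case is needed (or even well typed) here.
entryLabel :: Block C x -> Label
entryLabel (BUnit (Entry l)) = l
entryLabel (BCat b _)        = entryLabel b

example :: Block C C
example = BUnit (Entry 1) `BCat` BUnit (Assign "x" 7) `BCat` BUnit (Goto 2)

-- Rejected by the type checker, because Goto is closed on exit:
--   BUnit (Goto 2) `BCat` BUnit (Assign "x" 7)
```

In this sketch a malformed block is a compile-time error, and functions such as entryLabel need no "impossible" run-time case.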

Paper strengths and weaknesses
+ Very solid engineering.
+ Good writing for such a technical topic
- The examples don't demonstrate the full power of the Lerner-Grove-Chambers approach (composition of analyses)
- As a research paper: no breakthrough, conceptual advances are small.
- As a functional pearl: not really entertaining, reads too much like documentation for a library.

Comments for author
This is a nice, well-engineered implementation of the "analyze and transform" approach of Lerner et al.

As a research paper, it is very short on conceptual advances, though. The use of GADTs and type families to enforce shape properties on CFGs is cute but not very deep. Everything else is known already, although the paper does a good job of connecting the pieces of knowledge together.

As a programming pearl, I'm afraid it fails the Bird-Gibbons test (http://www.comlab.ox.ac.uk/people/Jeremy.Gibbons/pearls/):

"Bird characterizes them as "polished, elegant, instructive, entertaining". [...] Think more along the lines of short stories --- 6 to 10 pages, brisk, engaging, accessible, surprising."

Polished and instructive it is; elegant and accessible, to some extent; but entertaining, brisk, engaging and surprising, no.

I was disappointed by the very short treatment of compositionality (section 4.5). The ability to compose analyses is the major innovation of Lerner et al.'s paper. For a single analysis followed by a single transformation, their approach simplifies the transfer function a bit (as explained well in the paper), but doesn't improve the precision of the analysis. Where Lerner et al. really shine is the ability to combine multiple analyses and transformations. I really missed an example of such a combination in the paper, and was disappointed to see the "thenFwd" combinator on page 7 left as an exercise for the reader. There is also the question of combining forward and backward analyses, which is left open in Lerner et al.'s paper (as far as I can remember) and would definitely deserve a second look.
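
One ingredient of combining analyses is pairing their facts. The sketch below is a hedged illustration of a product of two join semilattices; the Lattice record and the pairLattice, maxLattice, and orLattice names are assumptions made for this example, not definitions taken from the paper. It covers only the fact level, not the interleaving of rewrites that gives the Lerner-Grove-Chambers composition its extra precision.

```haskell
-- Hypothetical lattice record: a bottom element and a join that also
-- reports whether the old fact changed.
data Lattice a = Lattice
  { bottom :: a
  , join   :: a -> a -> (Bool, a)   -- old fact, new fact
  }

-- Product lattice: join componentwise; the pair changed if either
-- component changed.
pairLattice :: Lattice a -> Lattice b -> Lattice (a, b)
pairLattice la lb = Lattice
  { bottom = (bottom la, bottom lb)
  , join   = \(a1, b1) (a2, b2) ->
      let (ca, a) = join la a1 a2
          (cb, b) = join lb b1 b2
      in  (ca || cb, (a, b))
  }

-- Two toy lattices (maxLattice assumes non-negative values).
maxLattice :: Lattice Int
maxLattice = Lattice 0 (\old new -> let j = max old new in (j /= old, j))

orLattice :: Lattice Bool
orLattice = Lattice False (\old new -> let j = old || new in (j /= old, j))
```

The change flag is what a fixed-point iteration would consult to decide whether another pass is needed.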

Page 11: CIL is not "analysis-only". It provides about as many (or as few?) facilities for code rewriting as for code analysis. The comment about the API being complicated (in both cases) is correct, though.

The paper is generally well written but is somewhat repetitive in emphasizing how hard the authors worked and how nice the resulting API is. I'm not sure all of the following sentences have their place in a scientific paper (some of them perhaps, but all of them is a bit too much):

"... an idea that is elegantly captured by the FwdRes type..."
"What a beautiful type thenFwdRw has!"
"We have spent six years implementing..."
"This formidable design problem..."
"...have been through dozens of revisions"
"We are proud of using GADTs to track..."

Review #69B

Modified Wednesday 19 May 2010 8:22:31am EDT
Overall merit
A. Good paper. I will champion it at the PC meeting.
Expertise
Y. I am knowledgeable in the area, though not an expert.
Paper summary
The paper presents Hoopl, a Haskell framework for implementing
dataflow analyses and compiler optimizations. Hoopl implements the
ideas first presented by Lerner, Grove, and Chambers in their POPL
2002 paper. The paper presents Hoopl, shows how to use it practically on
one example (constant folding), and sketches its implementation.

Paper strengths and weaknesses
+ nicely written

+ good and convincing illustration of the Haskell type machinery

+ inspiring paper on compiler architecture

- merely a packaging of already published ideas

Comments for author
Section 4.5 vaguely reminds me of Expansion-Passing Style by Dybvig,
Friedman and Haynes. Don't you think EPS is related to your rewrite
functions that return rewrite functions?

Page 7, why not give the definition of rewriteE?

Page 8, section 4.7. This section is not very convincing. It is not
technically very deep, and I mostly read it as a disruptive digression.
Toward the conclusion, you state that using the fuel monad was one of
your three goals. If you indeed consider the fuel monad that
important, then I think you should emphasize it throughout the paper.
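
For readers unfamiliar with the idea, optimization fuel can be sketched as a counter that is decremented on every rewrite; once it reaches zero, no further rewrites fire, so a faulty rewrite can be located by searching over the fuel limit. The code below is a minimal sketch under that assumption. FuelM, withFuel, foldAdd, and optimize are hypothetical names for this illustration and are not claimed to match the paper's fuel monad.

```haskell
-- Hand-rolled state monad over the remaining fuel (needs only base).
newtype FuelM a = FuelM { runFuelM :: Int -> (a, Int) }

instance Functor FuelM where
  fmap f (FuelM g) = FuelM $ \n -> let (a, n') = g n in (f a, n')

instance Applicative FuelM where
  pure a = FuelM $ \n -> (a, n)
  FuelM f <*> FuelM g = FuelM $ \n ->
    let (h, n')  = f n
        (a, n'') = g n'
    in  (h a, n'')

instance Monad FuelM where
  FuelM g >>= k = FuelM $ \n -> let (a, n') = g n in runFuelM (k a) n'

-- Apply a rewrite only while fuel remains; each rewrite that fires
-- consumes one unit.
withFuel :: (a -> Maybe a) -> a -> FuelM a
withFuel rw x = FuelM $ \n ->
  case rw x of
    Just x' | n > 0 -> (x', n - 1)
    _               -> (x, n)

-- Toy rewrite: fold an addition of two literals.
data Expr = Lit Int | Add Expr Expr deriving (Eq, Show)

foldAdd :: Expr -> Maybe Expr
foldAdd (Add (Lit a) (Lit b)) = Just (Lit (a + b))
foldAdd _                     = Nothing

-- Returns the rewritten expressions and the fuel left over.
optimize :: Int -> [Expr] -> ([Expr], Int)
optimize fuel es = runFuelM (mapM (withFuel foldAdd) es) fuel
```

With a fuel limit of 1 and two foldable expressions, only the first is rewritten; raising the limit to 2 rewrites both.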

Section 5. This section is really for Haskell aficionados. I understand why
it is there, and I thank you for not making it too long, but I still don't
like it very much. It reads too much like a self-congratulatory section.

Review #69C

Modified Friday 21 May 2010 10:52:35am EDT
Overall merit
A. Good paper. I will champion it at the PC meeting.
Expertise
X. I am an expert in the subject area of this paper.
Paper summary
This paper describes a dataflow analysis and transformation framework
written in Haskell. A client of this library provides the representation
of nodes (i.e., instructions), data-flow facts, and transformations,
and the library provides the analysis and transformations. Hoopl makes
heavy use of advanced Haskell type-system features (GADTs, etc.) to
enforce static correctness.

Paper strengths and weaknesses
Pros:

While it does not introduce a new analysis algorithm or transformation, this
work bridges the gap between the theoretical presentations and actual compilers.

Demonstrates the utility of functional programming for algorithms on
control-flow graphs, which are an aspect of compilers that was _not_ obviously
suited to functional programming languages.

One might argue that the paper does not introduce any new analyses or optimizations,
but I think that the problem of how to actually build such applications is important
and this paper does a very good job of addressing that topic in a principled way.

Cons:

Some minor quibbles described below.

Comments for author
On page 4, the claim that middle nodes cannot refer to labels is not
really true (e.g., LoadAddress); what you mean is that middle nodes
do not use labels for their control flow.

In footnote 4, you say that program points correspond to edges, but edges
only occur between blocks and not between nodes. Surely nodes are program
points too?

The mkIfThenElse function in Figure 4 looks like it has a type error:
t and e are AGraphs, but gCat is defined to work on Graphs.

How do you expect function calls to be treated? Would they be open/closed
nodes with a closed/open return node?

I would like to see a bit more discussion about use cases. For example,
would you use it on an SSA representation? Do you support other control-flow
algorithms, such as determining dominators?

You do not mention the MLRisc library, but it also supports some higher-order
programming of optimizations (mostly through the heavy use of functors).
I don't think that there was ever a write-up of this, other than the
documentation:

        http://www.cs.nyu.edu/leunga/www/MLRISC/Doc/html/index.html

One issue that is not addressed is how fast or slow the resulting optimization
passes are. Do you take a big hit because of the abstractions?

Another piece of related work that is missing is Lee and Colby's implementation
of Parameterized Partial Evaluation using SML functors (the idea is based
on Consel's Parameterized Partial Evaluation). It is similar in that the client
supplies an abstract domain and operators and the library takes care of the
rest. It also supports combining analyses and transformations automatically.

Review #69D

Modified Friday 21 May 2010 12:18:00pm EDT
Overall merit
C. Weak paper, though I will not fight strongly against it.
Expertise
Y. I am knowledgeable in the area, though not an expert.
Paper summary
The paper describes Hoopl, a library for dataflow analysis mixed with
transformations. Hoopl is implemented in Haskell in a purely functional
way, using GADTs to enforce invariants on internal data structures.

Paper strengths and weaknesses
Strengths:

 - Having this code exported as a library rather than keeping it internal to
   the GHC compiler is useful for others.

 - This paper may serve as a reference for people using the library or for
   other people implementing a similar library in other languages.

Weaknesses:

 - This remains the description of code, without much algorithmic novelty,
   and however useful the work and smart the implementation are, this remains
   a kind of boring paper to read. (Despite the constant efforts of the
   authors to transmit their enthusiasm to the reader.)

 - One originality of the library compared to other existing implementations
   of similar libraries is that it is purely functional. This is presented
   as a great advantage for the design and I can believe it. Still, it
   would have been interesting to have a discussion and some figures about
   the efficiency of the implementation, to convince us that this is not
   simultaneously a drawback.

Comments for author
- Section 3.3, col 1, last sentence: "Using GADTs to enforce..." Since you
  provide an interface in a library, you could use smart constructors and
  phantom types to enforce the invariants from the client's point of view,
  i.e. to force the client to write only meaningful graphs. The GADTs help
  you "take advantage" of these invariants, knowing that other cases cannot
  occur.

  My impression is that (here and below) you are slightly exaggerating the
  importance of GADTs from the client's point of view.
 
- Section 4.8: col 1, last two items: I do not understand what you mean
  here. It seems that you throw away the transformation at each
  iteration, so what would be the point of iterating? Please elaborate.
  You call the original graph "virgin". In what sense? Since you transform
  graphs rather than annotate them, each graph should have the same status as
  any other.

- Section 4, last item: "The algorithm is sound". Is it so obvious? Can the
  transfer function be wrong, e.g. say that nothing ever flows out of
  anything, so that a rewrite eliminates significant code because it is
  assumed not to flow in (this rewrite is sound), and the resulting
  transformation is wrong? You should assume some form of soundness for the
  transfer function as well.

- Section 5, next-to-last paragraph: "clever trick". This seems a rather
  obvious thing to do, not even a trick...
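
The soundness concern raised in the "Section 4, last item" comment above can be made concrete with a toy example. The sketch below is an assumption-laden illustration on straight-line code, not the paper's algorithm: all names (rewrite, goodTransfer, badTransfer, optimize, run) are hypothetical. The rewrite is sound whenever its incoming fact is true, yet pairing it with an unsound transfer function produces a wrong program.

```haskell
data Expr = Lit Int | Var String | Add Expr Expr deriving (Eq, Show)
data Stmt = Assign String Expr deriving (Eq, Show)

-- Association list used both for run-time environments and for
-- facts ("this variable is known to hold this constant").
type Env = [(String, Int)]

kill :: String -> Env -> Env
kill v = filter ((/= v) . fst)

set :: String -> Int -> Env -> Env
set v n env = (v, n) : kill v env

-- Rewrite: replace variables by their known constants. Sound provided
-- the incoming fact holds on every execution.
rewrite :: Env -> Expr -> Expr
rewrite f (Var v)   = maybe (Var v) Lit (lookup v f)
rewrite f (Add a b) = Add (rewrite f a) (rewrite f b)
rewrite _ e         = e

goodTransfer, badTransfer :: Stmt -> Env -> Env
goodTransfer (Assign v (Lit n)) f = set v n f
goodTransfer (Assign v _)       f = kill v f   -- value no longer known
badTransfer  (Assign v (Lit n)) f = set v n f
badTransfer  _                  f = f          -- BUG: stale fact survives

-- Interleave rewriting and analysis along the statement list.
optimize :: (Stmt -> Env -> Env) -> [Stmt] -> [Stmt]
optimize transfer = go []
  where
    go _ []                  = []
    go f (Assign v e : rest) =
      let s = Assign v (rewrite f e)
      in  s : go (transfer s f) rest

-- Reference interpreter; unbound variables read as 0.
eval :: Env -> Expr -> Int
eval _   (Lit n)   = n
eval env (Var v)   = maybe 0 id (lookup v env)
eval env (Add a b) = eval env a + eval env b

run :: Env -> [Stmt] -> Env
run = foldl (\env (Assign v e) -> set v (eval env e) env)

program :: [Stmt]
program =
  [ Assign "x" (Lit 1)
  , Assign "x" (Add (Var "y") (Lit 1))   -- y is an input
  , Assign "z" (Var "x")
  ]
```

Run with y = 5, the original program and the version optimized with goodTransfer both leave z = 6, while the version optimized with badTransfer leaves z = 1, because the stale fact x = 1 reaches the final statement.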

Overall merit choices are:
A. Good paper. I will champion it at the PC meeting.
B. OK paper, but I will not champion it.
C. Weak paper, though I will not fight strongly against it.
D. Serious problems. I will argue to reject this paper.
Expertise choices are:
X. I am an expert in the subject area of this paper.
Y. I am knowledgeable in the area, though not an expert.
Z. I am not an expert. My evaluation is that of an informed outsider.