[logo suggestions welcome]

N
E
P
L
S

These abstracts are for talks at this event.

NEPLS is a venue for ongoing research, so the abstract and supplemental material associated with each talk is necessarily temporal. The work presented here may be in a state of flux. In all cases, please consult the authors' Web pages for up-to-date information. Please don't refer to these pages as a definitive source.


The Essence of Proof-Carrying Code

Zhong Shao (Yale University)

Proof-Carrying Code (PCC) is a very general framework that can, in
principle, be used to verify arbitrary properties of machine language
programs. Existing PCC systems, including Foundational Proof-Carrying
Code (FPCC), however, have not fulfilled such promises. In this talk,
we contemplate the important principles that embody the essence of PCC
and FPCC, describe a framework that satisfies these principles, and
show that the new framework is simpler, more general and flexible, and
more likely to help us realize the full potential of PCC. 


Checking Specifications in Java Systems using Flow Analysis

Steve Reiss (Brown University)

One of today's challenges is producing reliable software in the face
of an increasing number of interacting components. Our system CHET
lets developers define specifications describing how a component
should be used and checks these specifications in real Java
systems. Unlike previous systems, CHET is able to check a wide range
of complex conditions in large software systems without programmer
intervention. It does this by doing a complete and detailed flow
analysis of the software and using this analysis to build a simpler,
model program. This talk will explore the motivations for CHET, the
specification techniques that are used, the flow analysis techniques,
and the methodology used in statically checking that the
specifications are obeyed in a system.


The Ibex extensible parser generator

Nathaniel Nystrom (Cornell University)

When implementing a new language, writing a parser is a necessary, but
often tedious, task.  Ibex is an extensible parser generator implemented
in the Polyglot compiler framework as an extension of Java.  In the Ibex
language, classes may contain parser rules and symbols as well as
methods and fields.  Grammar inheritance--the ability to extend a parser
with new symbols and productions to create a parser for a new
language--is achieved via normal class inheritance.  Unlike yacc, bison,
and other parser generators, semantic actions are type-checked in the
source rather than in the generated code, resulting in more intelligible
error messages.

LR(1) parsers, such as those generated by yacc, are inherently
non-extensible: adding a production to an LR(1) grammar may introduce
conflicts requiring existing productions to be rewritten.  In contrast,
Ibex generates generalized LR (GLR) parsers, which are powerful enough
to parse arbitrary context-free languages.  However, if the grammar is
ambiguous, the parser may create multiple parse trees for a given input;
these parse trees must be reconciled.  Computing the set of productions
that require reconciliation is undecidable.  Ibex implements a heuristic
algorithm for this problem that works well in practice, identifying all
ambiguities in a grammar for Java.

Joint work with Andrew Myers.


Eclat: Automatic Generation and Classification of Test Inputs

Carlos Pacheco (MIT)

We describe a technique that helps a test engineer select,
from a large set of randomly-generated candidate test inputs, a small
subset likely to reveal faults in the software under test.  The
technique takes a program or software component, plus a set of normal
executions--say, from an existing test suite, or from
observations of the software running properly.  The technique works by
extracting an operational model of the software's operation, and
comparing each candidate input's operational pattern of execution
against the model. Candidates whose operational pattern is suggestive
of a fault are further reduced by selecting only one input per such
pattern. The result is a small portion of the original candidates,
deemed most likely to reveal faults.  Thus, the technique can also be
seen as an error-detection technique.  We have implemented these ideas
in the Eclat tool, designed for unit testing of Java classes. Eclat
generates a large number of candidate inputs and uses the technique to
select only a few of them as likely to be fault-revealing.  In our
experiments, the inputs that it selects are an order of magnitude as
likely to reveal faults as the original candidate inputs.

From Eclat's web page, you can donwload the tool and try it out.


Dependent Types for Program Understanding

Raghavan Komondoor (IBM)

Weakly-typed languages such as Cobol often force programmers to represent 
distinct data abstractions using the same low-level physical type. For 
example, elements of two distinct logical domains, such as salaries and 
temperatures, might be represented by the same physical type Int. As 
another example, Cobol allows the same storage area to be accessed using 
different names.  Depending on how the programmer uses this facility, it 
can be used both to provide different "views" of the same runtime value, 
or to store values from one or more distinct domains, with the domain 
distinction made at runtime by "tag" information stored elsewhere.

In this talk, we describe a technique to recover implicitly-defined data 
abstractions from programs using type inference.  We present a novel 
system of dependent types which we call guarded types, a path-sensitive 
algorithm for inferring guarded types for Cobol programs, and a semantic 
characterization of correct guarded typings.  Our type language includes 
"atomic" types (scalar domains), records, and guarded disjoint unions, 
where the guards are type tags derived from control predicates in the 
program. The results of our inference technique can be used to enhance 
program understanding for legacy applications, and to enable a number of 
type-based program transformations, e.g., field expansion and porting from 
weakly- to strongly-typed languages.

This talk is based on a paper accepted for presentation at the upcoming
ETAPS/TACAS (Tools and Algorithms for the Construction and Analyis of
Systems) conference, April 2005, Edinburgh, UK. This is joint work with G.
Ramalingam, Satish Chandra, and John Field.


A Model of Substructural State

Matthew Fluet (Harvard University)

Programming languages such as Clean, CQual, Cyclone, TAL, and Vault
combine imperative features with some form of ``unique'' typing in
order to refine the set of expressions that should be allowed (e.g.,
recycling a reference to hold values of a different type) or
dis-allowed (e.g., failing to de-allocate a reference.)  However,
these languages provide different interpretations of ``uniqueness''
and have different rules regarding how unique values may be mixed with
unrestricted constructors.  Our goal is to provide a single model that
allows us to express and evaluate the different interpretations and to
study their interactions.  To this end, we present a substructural
polymorphic $\lambda$-calculus with mutable references that includes
four flavors of references (unrestricted, relevant, affine, and
linear).  This yields a rich design space, for which some questions
have no ``right'' answer (e.g., should an unrestricted reference be
allowed to hold an affine value?).  We explore a scenario that is
faithful to the operational interpretation of substructural logic, and
also allows shared references to hold unique values (i.e., answering
``yes'' above) as well as strong (type-varying) updates on unique
references.  Our type system provides for controlled interaction among
the different flavors of references and is justified by a standard
operational semantics coupled with a step-indexed interpretation of
types.

Joint work with Amal Ahmed (Harvard University) and Greg Morrisett
(Harvard University).


Eliminating Array Bounds Checks with Machine Arithmetic

Yanling Wang (Harvard University)

The goal of this work is to make it possible for programmers to statically
verify that all array accesses in a program are in bounds.  Traditional
analyses for eliminating array bounds checks are either (a) unsound, (b)
too weak, (c) take too much space or time, or (d) a combination of the
above. 

The approach we are investigating is similar to the one used by the
Extended Static Checking (ESC) project: we use a verification-condition
generator and theorem prover, together with user-supplied assertions and
annotations.  However, we depart from ESC in two crucial respects: First,
our VC-generator and prover are sound in that we accurately model machine
arithmetic including overflow.  In contrast, ESC (and its prover Simplify)
model machine arithmetic using Z, which causes their prover to claim some
array indices are in bounds when they are not. 

Second, our architecture is scalable by default (it runs as fast as the
type-checker) but also extensible.  When the default prover cannot
discharge a verification condition, programmers can insert explicit
proofs, or more generally, explicit tactics to search for a proof.  In
this fashion, the programmer can safely eliminate all dynamic checks that
are provably unnecessary. 


Programming Cryptographic Protocols

Joshua Guttman (MITRE)

Cryptographic protocol design is challenging, because a successful
protocol must work on two different levels: the cryptographic messages
sent, and the real-world consequences intended.  Electronic commerce
transactions, for example, must achieve confidentiality and
authentication by means of encrypted or signed messages, while also
causing real-world consequences at a different level.  The buyer
commits his funds; the seller commits to shipping some goods; and a
financial institution states that funds are available, and will be
transferred.

A domain-specific language can help integrate these two levels.  In
this talk, we describe a Cryptographic Protocol Programming Language
(CPPL), its compiler, and its semantics.  The semantics (based on
strand spaces) allow rigorous proofs of protocol correctness.

In CPPL, protocol actions are annotated with formulas in a trust
management logic.  The formula annotating a message transmission
describes what the sender is committed to as a consequence of sending
the message.  The formula on a message reception tells what the
receiver can rely on after receiving a message, which is normally that
some other principal has committed itself by sending that message.
This approach suggests both the control structures fundamental to CPPL
and a rely-guarantee method used for protocol analysis.

Joint work with Jonathan C Herzog, John D Ramsdell, and Brian T Sniffen. Related paper (ESOP 2004) available at http://www.ccs.neu.edu/home/guttman/trust_mgt_in_strands.pdf.


Certified Assembly Programming with Embedded Code Pointers

Zhaozhong Ni (Yale University)

Embedded code pointers (ECP) are stored handlers of functions and 
continuations commonly seen in low-level binaries as well as 
functional or higher-order programs.  ECP is known to be very hard to 
support well in Hoare-logic style verification systems. As a result, 
existing proof-carrying code (PCC) systems have to either sacrifice 
the expressiveness or the modularity of the assertion languages, or 
resort to heavyweight techniques that require the construction of 
complex semantic models. In this paper, we present a simple but 
powerful technique for solving the ECP problem for PCC. By adding a 
very small amount of syntax to the assertion language, we show how to 
use syntactic proof techniques to perform modular reasoning on 
embedded code pointers while still retaining the expressiveness of 
Hoare logic. Our new framework can also be applied to support 
references, recursive pointers, and polymorphism, thus leading to a 
general but lightweight system for reasoning about machine-level 
programs.

This talk presents work done in collaboration with Zhong Shao (Yale University). Our paper on this work can be found at http://flint.cs.yale.edu/flint/publications/xcap.html.


A Simplified Calculus of Higher-Order Modules

Paul Govereau (Harvard University)

We present a calculus of higher-order modules such as those found in
Standard ML and Objective Caml. Our calculus can express both
generative and applicative functors, and also allows mixing of
generative and applicative styles within a single module. Despite the
expressiveness of our calculus, the underlying type system is simpler
than other approaches. We achieve extra expressiveness and simplicity
through a novel treatment of generative types. Generativity is treated
as primitive in our calculus; it is incorporated into the types of
terms allowing a more direct treatment of generativity and type
abstraction.

The generative types in our calculus share some similarity with the
abstract types found in object calculi. Our generative types have a
nominal flavor although our type system is primarily structural. We
hope, in addition to providing a calculus of higher-order modules, our
system may be a practical mechanism for comparing functional and
object oriented systems, or for examining the relationship between
nominal and structural typing.


A short status report on a C-- compiler

Norman Ramsey (Harvard University)

C-- is a set of interfaces designed to help you implement the language
of your choice on the machine of your choice.  At long last, my
students and I have an implementation that is useful and interesting.
In the near future, we expect to add support for concurrency and for
more optimizations.  We welcome users and collaborators.


Elphin -- Functional Programming with Higher-Order Encodings

Adam Poswolsky (Yale University)

Higher-order encodings use functions provided by one language to
represent variable binders of another.  They lead to concise and
elegant representations which historically have been difficult to
analyze and manipulate.

In my talk I will discuss the design and implementation of Elphin, a
functional programming language that permits programming with general
recursive functions over simply-typed higher-order encodings.  Elphin
is based on the nabla calculus [SPS]. To avoid problems commonly
associated with using the same function space for representations and
computations, we separate one from the other, permitting recursion
under representation-level $\lambda$-binders while guaranteeing
adequate encodings, without cramping the natural style of functional
programming.

(joint work with Carsten Sch\"urmann and Jeffrey Sarnat)

[SPS] Schuermann, Poswolsky, Sarnat: The nabla calculus -- Functional
programming with higher-order encodings, TLCA 2005.  To appear.

For more information please see the Delphin Webpage.


MzTake: A Scriptable Debugger

Guillaume Marceau (Brown University)

Debugging is a laborious, manual activity that often involves
the repetition of common operations.  Ideally, users should
be able to describe these repetitious operations as little
programs.  Debuggers should therefore be programmable, or
scriptable.  The operating environment of these scripts,
however, imposes interesting design challenges on the
programming language that we use to write these scripts.
        
We present MzTake, our scriptable debugger for Java and Scheme. 
MzTake is built on top of FrTime, a functional reactive 
programming language also developed at Brown. We will discuss 
how the powerful data flow constructs of FrTime capture many
important debugging and comprehension metaphors.  By removing
the need for callbacks and state, debugging scripts become 
more concise and reusable.

Based on a paper at Automated Software Engineering, 2004. Joint work with Greg Cooper, Jono Spiro, John Clements, Shriram Krishnamurthi, and Steve Reiss.


Last modified Thursday, February 17th, 2005 10:36:21pmPowered by PLT