New England Programming Languages and Systems Symposium Series (NEPLS): Abstracts

N
E
P
L
S
These abstracts are for talks at this event.
NEPLS is a venue for ongoing research, so the abstract and supplemental material associated with each talk is necessarily temporal. The work presented here may be in a state of flux. In all cases, please consult the authors' Web pages for up-to-date information. Please don't refer to these pages as a definitive source.
The Essence of Proof-Carrying Code

Zhong Shao (Yale University)
Proof-Carrying Code (PCC) is a very general framework that can, in principle, be used to verify arbitrary properties of machine language programs. Existing PCC systems, including Foundational Proof-Carrying Code (FPCC), however, have not fulfilled such promises. In this talk, we contemplate the important principles that embody the essence of PCC and FPCC, describe a framework that satisfies these principles, and show that the new framework is simpler, more general and flexible, and more likely to help us realize the full potential of PCC.
Checking Specifications in Java Systems using Flow Analysis

Steve Reiss (Brown University)
One of today's challenges is producing reliable software in the face of an increasing number of interacting components. Our system CHET lets developers define specifications describing how a component should be used and checks these specifications in real Java systems. Unlike previous systems, CHET is able to check a wide range of complex conditions in large software systems without programmer intervention. It does this by doing a complete and detailed flow analysis of the software and using this analysis to build a simpler, model program. This talk will explore the motivations for CHET, the specification techniques that are used, the flow analysis techniques, and the methodology used in statically checking that the specifications are obeyed in a system.
The Ibex extensible parser generator

Nathaniel Nystrom (Cornell University)
When implementing a new language, writing a parser is a necessary, but often tedious, task. Ibex is an extensible parser generator implemented in the Polyglot compiler framework as an extension of Java. In the Ibex language, classes may contain parser rules and symbols as well as methods and fields. Grammar inheritance--the ability to extend a parser with new symbols and productions to create a parser for a new language--is achieved via normal class inheritance. Unlike yacc, bison, and other parser generators, semantic actions are type-checked in the source rather than in the generated code, resulting in more intelligible error messages. LR(1) parsers, such as those generated by yacc, are inherently non-extensible: adding a production to an LR(1) grammar may introduce conflicts requiring existing productions to be rewritten. In contrast, Ibex generates generalized LR (GLR) parsers, which are powerful enough to parse arbitrary context-free languages. However, if the grammar is ambiguous, the parser may create multiple parse trees for a given input; these parse trees must be reconciled. Computing the set of productions that require reconciliation is undecidable. Ibex implements a heuristic algorithm for this problem that works well in practice, identifying all ambiguities in a grammar for Java. Joint work with Andrew Myers.
Eclat: Automatic Generation and Classification of Test Inputs

Carlos Pacheco (MIT)
We describe a technique that helps a test engineer select, from a large set of randomly-generated candidate test inputs, a small subset likely to reveal faults in the software under test. The technique takes a program or software component, plus a set of normal executions--say, from an existing test suite, or from observations of the software running properly. The technique works by extracting an operational model of the software's operation, and comparing each candidate input's operational pattern of execution against the model. Candidates whose operational pattern is suggestive of a fault are further reduced by selecting only one input per such pattern. The result is a small portion of the original candidates, deemed most likely to reveal faults. Thus, the technique can also be seen as an error-detection technique. We have implemented these ideas in the Eclat tool, designed for unit testing of Java classes. Eclat generates a large number of candidate inputs and uses the technique to select only a few of them as likely to be fault-revealing. In our experiments, the inputs that it selects are an order of magnitude as likely to reveal faults as the original candidate inputs.
From Eclat's web page, you can donwload the tool and try it out.
Dependent Types for Program Understanding

Raghavan Komondoor (IBM)
Weakly-typed languages such as Cobol often force programmers to represent distinct data abstractions using the same low-level physical type. For example, elements of two distinct logical domains, such as salaries and temperatures, might be represented by the same physical type Int. As another example, Cobol allows the same storage area to be accessed using different names. Depending on how the programmer uses this facility, it can be used both to provide different "views" of the same runtime value, or to store values from one or more distinct domains, with the domain distinction made at runtime by "tag" information stored elsewhere. In this talk, we describe a technique to recover implicitly-defined data abstractions from programs using type inference. We present a novel system of dependent types which we call guarded types, a path-sensitive algorithm for inferring guarded types for Cobol programs, and a semantic characterization of correct guarded typings. Our type language includes "atomic" types (scalar domains), records, and guarded disjoint unions, where the guards are type tags derived from control predicates in the program. The results of our inference technique can be used to enhance program understanding for legacy applications, and to enable a number of type-based program transformations, e.g., field expansion and porting from weakly- to strongly-typed languages. This talk is based on a paper accepted for presentation at the upcoming ETAPS/TACAS (Tools and Algorithms for the Construction and Analyis of Systems) conference, April 2005, Edinburgh, UK. This is joint work with G. Ramalingam, Satish Chandra, and John Field.
A Model of Substructural State

Matthew Fluet (Harvard University)
Programming languages such as Clean, CQual, Cyclone, TAL, and Vault combine imperative features with some form of ``unique'' typing in order to refine the set of expressions that should be allowed (e.g., recycling a reference to hold values of a different type) or dis-allowed (e.g., failing to de-allocate a reference.) However, these languages provide different interpretations of ``uniqueness'' and have different rules regarding how unique values may be mixed with unrestricted constructors. Our goal is to provide a single model that allows us to express and evaluate the different interpretations and to study their interactions. To this end, we present a substructural polymorphic $\lambda$-calculus with mutable references that includes four flavors of references (unrestricted, relevant, affine, and linear). This yields a rich design space, for which some questions have no ``right'' answer (e.g., should an unrestricted reference be allowed to hold an affine value?). We explore a scenario that is faithful to the operational interpretation of substructural logic, and also allows shared references to hold unique values (i.e., answering ``yes'' above) as well as strong (type-varying) updates on unique references. Our type system provides for controlled interaction among the different flavors of references and is justified by a standard operational semantics coupled with a step-indexed interpretation of types. Joint work with Amal Ahmed (Harvard University) and Greg Morrisett (Harvard University).
Eliminating Array Bounds Checks with Machine Arithmetic

Yanling Wang (Harvard University)
The goal of this work is to make it possible for programmers to statically verify that all array accesses in a program are in bounds. Traditional analyses for eliminating array bounds checks are either (a) unsound, (b) too weak, (c) take too much space or time, or (d) a combination of the above. The approach we are investigating is similar to the one used by the Extended Static Checking (ESC) project: we use a verification-condition generator and theorem prover, together with user-supplied assertions and annotations. However, we depart from ESC in two crucial respects: First, our VC-generator and prover are sound in that we accurately model machine arithmetic including overflow. In contrast, ESC (and its prover Simplify) model machine arithmetic using Z, which causes their prover to claim some array indices are in bounds when they are not. Second, our architecture is scalable by default (it runs as fast as the type-checker) but also extensible. When the default prover cannot discharge a verification condition, programmers can insert explicit proofs, or more generally, explicit tactics to search for a proof. In this fashion, the programmer can safely eliminate all dynamic checks that are provably unnecessary.
Programming Cryptographic Protocols

Joshua Guttman (MITRE)
Cryptographic protocol design is challenging, because a successful protocol must work on two different levels: the cryptographic messages sent, and the real-world consequences intended. Electronic commerce transactions, for example, must achieve confidentiality and authentication by means of encrypted or signed messages, while also causing real-world consequences at a different level. The buyer commits his funds; the seller commits to shipping some goods; and a financial institution states that funds are available, and will be transferred. A domain-specific language can help integrate these two levels. In this talk, we describe a Cryptographic Protocol Programming Language (CPPL), its compiler, and its semantics. The semantics (based on strand spaces) allow rigorous proofs of protocol correctness. In CPPL, protocol actions are annotated with formulas in a trust management logic. The formula annotating a message transmission describes what the sender is committed to as a consequence of sending the message. The formula on a message reception tells what the receiver can rely on after receiving a message, which is normally that some other principal has committed itself by sending that message. This approach suggests both the control structures fundamental to CPPL and a rely-guarantee method used for protocol analysis.
Joint work with Jonathan C Herzog, John D Ramsdell, and Brian T Sniffen. Related paper (ESOP 2004) available at http://www.ccs.neu.edu/home/guttman/trust_mgt_in_strands.pdf.
Certified Assembly Programming with Embedded Code Pointers

Zhaozhong Ni (Yale University)
Embedded code pointers (ECP) are stored handlers of functions and continuations commonly seen in low-level binaries as well as functional or higher-order programs. ECP is known to be very hard to support well in Hoare-logic style verification systems. As a result, existing proof-carrying code (PCC) systems have to either sacrifice the expressiveness or the modularity of the assertion languages, or resort to heavyweight techniques that require the construction of complex semantic models. In this paper, we present a simple but powerful technique for solving the ECP problem for PCC. By adding a very small amount of syntax to the assertion language, we show how to use syntactic proof techniques to perform modular reasoning on embedded code pointers while still retaining the expressiveness of Hoare logic. Our new framework can also be applied to support references, recursive pointers, and polymorphism, thus leading to a general but lightweight system for reasoning about machine-level programs.
This talk presents work done in collaboration with Zhong Shao (Yale University). Our paper on this work can be found at http://flint.cs.yale.edu/flint/publications/xcap.html.
A Simplified Calculus of Higher-Order Modules

Paul Govereau (Harvard University)
We present a calculus of higher-order modules such as those found in Standard ML and Objective Caml. Our calculus can express both generative and applicative functors, and also allows mixing of generative and applicative styles within a single module. Despite the expressiveness of our calculus, the underlying type system is simpler than other approaches. We achieve extra expressiveness and simplicity through a novel treatment of generative types. Generativity is treated as primitive in our calculus; it is incorporated into the types of terms allowing a more direct treatment of generativity and type abstraction. The generative types in our calculus share some similarity with the abstract types found in object calculi. Our generative types have a nominal flavor although our type system is primarily structural. We hope, in addition to providing a calculus of higher-order modules, our system may be a practical mechanism for comparing functional and object oriented systems, or for examining the relationship between nominal and structural typing.
A short status report on a C-- compiler

Norman Ramsey (Harvard University)
C-- is a set of interfaces designed to help you implement the language of your choice on the machine of your choice. At long last, my students and I have an implementation that is useful and interesting. In the near future, we expect to add support for concurrency and for more optimizations. We welcome users and collaborators.
Elphin -- Functional Programming with Higher-Order Encodings

Adam Poswolsky (Yale University)
Higher-order encodings use functions provided by one language to represent variable binders of another. They lead to concise and elegant representations which historically have been difficult to analyze and manipulate. In my talk I will discuss the design and implementation of Elphin, a functional programming language that permits programming with general recursive functions over simply-typed higher-order encodings. Elphin is based on the nabla calculus [SPS]. To avoid problems commonly associated with using the same function space for representations and computations, we separate one from the other, permitting recursion under representation-level $\lambda$-binders while guaranteeing adequate encodings, without cramping the natural style of functional programming. (joint work with Carsten Sch\"urmann and Jeffrey Sarnat) [SPS] Schuermann, Poswolsky, Sarnat: The nabla calculus -- Functional programming with higher-order encodings, TLCA 2005. To appear.
For more information please see the Delphin Webpage.
MzTake: A Scriptable Debugger

Guillaume Marceau (Brown University)
Debugging is a laborious, manual activity that often involves the repetition of common operations. Ideally, users should be able to describe these repetitious operations as little programs. Debuggers should therefore be programmable, or scriptable. The operating environment of these scripts, however, imposes interesting design challenges on the programming language that we use to write these scripts. We present MzTake, our scriptable debugger for Java and Scheme. MzTake is built on top of FrTime, a functional reactive programming language also developed at Brown. We will discuss how the powerful data flow constructs of FrTime capture many important debugging and comprehension metaphors. By removing the need for callbacks and state, debugging scripts become more concise and reusable.
Based on a paper at Automated Software Engineering, 2004. Joint work with Greg Cooper, Jono Spiro, John Clements, Shriram Krishnamurthi, and Steve Reiss.

Last modified Thursday, February 17th, 2005 10:36:21pm Powered by