Jim Newton

Type-checking heterogeneous sequences in a simple embeddable type system

By Jim Newton

2025-01-01

In Proceedings of 27th international symposium on practical aspects of declarative language (PADL’25)

Abstract

Heterogeneously typed sequences are supported in a wide range of programming languages, both dynamically and statically typed. These sequences often exhibit type patterns such as repetition, alternation, and optionality. The programmer needs a mechanism to declare and query adherence to this regularity. The theory of finite automata over finite alphabets was conceived for characterizing patterns in so-called regular languages, but does not exactly meet this challenge, because the set of potential elements of the sequences is infinite. In this article, we present a generalization of regular expressions called rational type expressions as a means of declaring regular patterns in heterogeneous sequences. We present procedures for constructing and manipulating symbolic finite automata, a generalization of classical finite automata, using a portable, simple, embeddable, type system. For type systems with subtyping, the subtype relation and type vacuity cannot always be computed programmatically. We provide a working, sound solution for constructing finite automata for type-based regular expressions even in cases where the subtype decidability relations is not computable retrospectively, but can be ensured by construction. We demonstrate the generality and portability of the system by providing implementations in Common Lisp, Clojure, Scala, and Python.

Continue reading

An elegant and fast algorithm for partitioning types

By Jim Newton

2023-04-01

In ELS 2023, the 16th european lisp symposium

Abstract

We present an improvement on the Maximal Disjoint Type Decomposition algorithm, published previously. The new algorithm is shorter than the previously best known algorithm in terms of lines of code, and performs better in many, but not all, benchmarks. Additionally the algorithm computes metadata which makes the Brzozowski derivative easier to compute–both easier in terms of accuracy and computation time. Another advantage of this new algorithm is its resilience limited SUBTYPEP implementations.

Continue reading

Comparing use-cases of tree-fold vs fold-left, how to fold and color a map

Abstract

In this article we examine some consequences of computation order of two different conceptual implementations of the fold function. We explore a set of performance- and accuracy-based experiments on two implementations of this function. In particular, we contrast the traditional fold-left implementation with another approach we refer to as tree-fold. It is often implicitly supposed that the binary operation in question has constant complexity. We explore several application areas which diverge from that assumption: rational arithmetic, floating-point arithmetic, and Binary Decisions Diagram construction. These are binary operations which degrade in performance as the fold iteration progresses. We show that these types of binary operations are good candidates for tree-fold.

Continue reading

A portable, simple, embeddable type system

By Jim Newton, Adrien Pommellet

2021-04-26

In ELS 2021, the 14th european lisp symposium

Abstract

We present a simple type system inspired by that of Common Lisp. The type system is intended to be embedded into a host language and accepts certain fundamental types from that language as axiomatically given. The type calculus provided in the type system is capable of expressing union, intersection, and complement types, as well as membership, subtype, disjoint, and habitation (non-emptiness) checks. We present a theoretical foundation and two sample implementations, one in Clojure and one in Scala.

Continue reading

Performance comparison of several folding strategies

By Jim Newton

2020-01-14

In Trends in functional programming

Abstract

In this article we examine the computation order and consequent performance of three different conceptual implementations of the fold function. We explore a set of performance based experiments on different implementations of this function. In particular, we contrast the fold-left implementation with two other implements we refer to as pair-wise-fold and tree-like-fold. We explore two application areas: ratio arithmetic and Binary Decisions Diagram construction. We demonstrate several cases where the performance of certain algorithms is very different depending on the approach taken. In particular iterative computations where the object size accumulates are good candidates for the tree-like-fold.

Continue reading

Implementing baker’s SUBTYPEP decision procedure

By Léo Valais, Jim Newton, Didier Verna

2019-04-01

In ELS 2019, the 12th european lisp symposium

Abstract

We present here our partial implementation of Baker’s decision procedure for SUBTYPEP. In his article “A Decision Procedure for Common Lisp’s SUBTYPEP Predicate”, he claims to provide implementation guidelines to obtain a SUBTYPEP more accurate and as efficient as the average implementation. However, he did not provide any serious implementation and his description is sometimes obscure. In this paper we present our implementation of part of his procedure, only supporting primitive types, CLOS classes, member, range and logical type specifiers. We explain in our words our understanding of his procedure, with much more detail and examples than in Baker’s article. We therefore clarify many parts of his description and fill in some of its gaps or omissions. We also argue in favor and against some of his choices and present our alternative solutions. We further provide some proofs that might be missing in his article and some early efficiency results. We have not released any code yet but we plan to open source it as soon as it is presentable.

Continue reading

Finite automata theory based optimization of conditional variable binding

By Jim Newton, Didier Verna

2019-01-14

In ELS 2019, the 12th european lisp symposium

Abstract

We present an efficient and highly optimized implementation of destructuring-case in Common Lisp. This macro allows the selection of the most appropriate destructuring lambda list of several given based on structure and types of data at run-time and thereafter dispatches to the corresponding code branch. We examine an optimization technique, based on finite automata theory applied to conditional variable binding and execution, and type-based pattern matching on Common Lisp sequences. A risk of inefficiency associated with a naive implementation of destructuring-case is that the candidate expression being examined may be traversed multiple times, once for each clause whose format fails to match, and finally once for the successful match. We have implemented destructuring-case in such a way to avoid multiple traversals of the candidate expression. This article explains how this optimization has been implemented.

Continue reading

Representing and computing with types in dynamically typed languages

Abstract

In this report, we present code generation techniques related to run-time type checking of heterogeneous sequences. Traditional regular expressions can be used to recognize well defined sets of character strings called rational languages or sometimes regular languages. Newton et al. present an extension whereby a dynamic programming language may recognize a well defined set of heterogeneous sequences, such as lists and vectors. As with the analogous string matching regular expression theory, matching these regular type expressions can also be achieved by using a finite state machine (deterministic finite automata, DFA). Constructing such a DFA can be time consuming. The approach we chose, uses meta-programming to intervene at compile-time, generating efficient functions specific to each DFA, and allowing the compiler to further optimize the functions if possible. The functions are made available for use at run-time. Without this use of meta-programming, the program might otherwise be forced to construct the DFA at run-time. The excessively high cost of such a construction would likely far outweigh the time needed to match a string against the expression. Our technique involves hooking into the Common Lisp type system via the DEFTYPE macro. The first time the compiler encounters a relevant type specifier, the appropriate DFA is created, which may be a Omega(2^n operation, from which specific low-level code is generated to match that specific expression. Thereafter, when the type specifier is encountered again, the same pre-generated function can be used. The code generated is Theta(n) complexity at run-time. A complication of this approach, which we explain in this report, is that to build the DFA we must calculate a disjoint type decomposition which is time consuming, and also leads to sub-optimal use of TYPECASE in machine generated code. To handle this complication, we use our own macro OPTIMIZED-TYPECASE in our machine generated code. Uses of this macro are also implicitly expanded at compile time. Our macro expansion uses BDDs (Binary Decision Diagrams) to optimize the OPTIMIZED-TYPECASE into low level code, maintaining the TYPECASE semantics but eliminating redundant type checks. In the report we also describe an extension of BDDs to accomodate subtyping in the Common Lisp type system as well as an in-depth analysis of worst-case sizes of BDDs.

Continue reading

Recognizing heterogeneous sequences by rational type expression

By Jim Newton, Didier Verna

2018-09-14

In Proceedings of the meta’18: Workshop on meta-programming techniques and reflection

Abstract

We summarize a technique for writing functions which recognize types of heterogeneous sequences in Common Lisp. The technique employs sequence recognition functions, generated at compile time, and evaluated at run-time. The technique we demonstrate extends the Common Lisp type system, exploiting the theory of rational languages, Binary Decision Diagrams, and the Turing complete macro facility of Common Lisp. The resulting system uses meta-programming to move an exponential complexity operation from run-time to a compile-time operation, leaving a highly optimized linear complexity operation for run-time.

Continue reading

A theoretical and numerical analysis of the worst-case size of reduced ordered binary decision diagrams

By Jim Newton, Didier Verna

2018-08-28

In ACM Transactions on Computational Logic

Abstract

Binary Decision Diagrams (BDDs) and in particular ROBDDs (Reduced Ordered BDDs) are a common data structure for manipulating Boolean expressions, integrated circuit design, type inferencers, model checkers, and many other applications. Although the ROBDD is a lightweight data structure to implement, the behavior, in terms of memory allocation, may not be obvious to the program architect. We explore experimentally, numerically, and theoretically the typical and worst-case ROBDD sizes in terms of number of nodes and residual compression ratios, as compared to unreduced BDDs. While our theoretical results are not surprising, as they are in keeping with previously known results, we believe our method contributes to the current body of research by our experimental and statistical treatment of ROBDD sizes. In addition, we provide an algorithm to calculate the worst-case size. Finally, we present an algorithm for constructing a worst-case ROBDD of a given number of variables. Our approach may be useful to projects deciding whether the ROBDD is the appropriate data structure to use, and in building worst-case examples to test their code.

Continue reading