Excavating a Common Treasure

2018-06-30

History is full of legends of ancient treasures and powerful artifacts, lost to time or hidden by purpose, that would bring glory, power and riches to their discoverer.

A winged bull depicted on an on archaeological artifact from the Assyrian empire between 1400 and 1200 BC. Cylinder Seal with Winged Bull - Walters Art Museum Licensed under CC0.

Although the history of software development is much shorter than the history of mankind, there already exists artifacts that have near mythological status when it comes to software development, yet often go hidden underneath the surface. One of these is the language of Common Lisp.

Common Lisp is a language originally from the early 80s, and was a standardization of many dialects of the Lisp programming language, which explains both the Common and Lisp part of the name. Lisp itself is one of the oldest general purpose programming languages, first specified in 1958. From Lisp many ideas that we consider common in other languages were popularized, such as conditionals (the if and else structure), notion of functions as first class elements, and many others (see this article on more of these ideas). Common Lisp is also often mentioned as an amazing and fun language to learn, which makes it the clear-cut choice for this article.

As the story goes it gives the user untold levels of productivity and joy of development. One other big advantage is that Common Lisp used to be the language of A.I. research. There is lots of existing code out there for systems on the subject of knowledge representation, reasoning and planning. Given the resurgence of fields such as a Semantic Web that make use of such A.I. systems, learning from such tools and techniques, or even using them directly, can be invaluable. In this post, I aim to find out a very small portion of the riches we can dig up using Common Lisp, and to share this experience with the reader.

Cylinder Seals were common objects that were used to designate ownership in ancient Mesopotamia. Although describing to a different notion of 'common' these show remarkable of craftsmanship that we hope to introduce ourselves to in Common Lisp. Cylinder Seal with Two Heroes and a Tree - Walters Art Museum Licensed under CC0.

To add as a disclamer, I do have experience with a similar language, Clojure that I use in my day-to-day programming, as well as with Emacs (namely Spacemacs), so I probably have some head start in using and editing the language. Nonethess I aim to make this introduction accessible to readers without such knowledge.

With any treasure hunt, preparation is often key. As we are unfamiliar with the language itself it is good to read guides and tutorials to prepare. The guide to selecting the right equipment for our adventure is the cl-cookbook, which describes how to get started with Common Lisp.

Common Lisp is a language with multiple implementations. As recommended by our guide we are using Steel Bank Common Lisp (SBCL) as distribution and using Roswell to manage it. Alternatively a more straightforward install is Portacle that could be used.

Next is testing whether our implementation is working. We run the Read–Eval–Print Loop (REPL), which in the case of Roswell can be done with the command ros run. What the REPL allows us to do is interactively develop our code by reading our input, evaluating it and printing the results. This is one of the many features in programming that originated with Lisp, but has since made its way to many other languages.

Now the question is how we should write Common Lisp. One really nice resource for quickly familiarizing with new languages is Learn X in Y minutes, which also has a nice introduction to Common Lisp. One of the most striking features of a Lisp, is the syntax of it. Elements are either atoms, that evaluate to themselves, such as the number 4 or s-expressions which are a list of expressions in brackets, such as (+ 3 2). Atoms evaluate to themselves, so if we write 4 in the REPL, we get 4 returned back. For s-expressions they are in the form of (function param1 param2 ...), so in the case of (+ 3 2) this will evaluate to 5. One can of course nest these forms, for example (+ 3 (+ 3 2)) will result in 8. In addition to functions there are macros available creating completely different forms to be translated into code. We will see some examples of this later in the article.

With the REPL now working it is time to set up a project where we can write down all the code we need for our digging. According to our cookbook tutorial we can do this from the REPL, by getting cl-project, and using it.

(ql:quickload "cl-project")
(cl-project:make-project #P"./common-treasure")

This will give us the outline of project in the common-treasure directory. The files generated include a readme, systems for the project itself as well as tests and of course some skeletons for a test and a package itself respectively.

First we are going to start off figuring out the package itself. Like in many other languages, functionality can be grouped together units, in this case packages. The generated file outline looks like this:

(defpackage common-treasure
  (:use :cl))
(in-package :common-treasure)

;; blah blah blah.

From the looks of it, it says to us that it defines a package named common-treasure using :cl, which I assume is another package. There is also a call to in-package which looks like is there to ensure that everything else that follows is also in this package. Finally the ;; blah blah blah. portion looks like a comment, something to describe functionality with, but nothing to be executed.

Now in order to see if everything is working correctly let's load this package. Instead of using the REPL from the command line, we are using it directly from our editor, in this case an Emacs distribution named Spacemacs. Without going into much detail about Emacs it can be described as a highly configurable editor for pretty much any programming language out there. SLIME: The Superior Lisp Interaction Mode for Emacs ensures that our interaction with the REPL and the various language features go much more smoothly by integrating it with our editor of choice.

Emacs is incredibly configurable, which is where Spacemacs comes in. It provides a set of curated configurations named layers, that make setting up the configuration for a particular language a breeze. We just enable the common-lisp layer of Spacemacs in the configuration file, add a small piece to let Emacs know of our Common Lisp distribution (see link), and we are good to go.

With everything setup now we can start our treasure hunt a bit more in earnest by starting SLIME. We are greeted by the message "Connected. Hack and be merry!" as well as a REPL in a window. Lets start to dig around by figuring out what exactly all the elements do. We can use the command slime-describe-symbol to get some more information about defpackage and what it does. This brings up some documentation on defpackage. It shows us that this is a macro, as well as the various options one can use with it. A macro, without going into too much detail, is essentially a way to convert forms into different ones before evaluating them. We can get a bit more info on the workings by calling the command slime-hyperspec-lookup. This takes us to the Common Lisp Hyperspec with more detailed docs. The pages for defpackage and inpackage more or less confirm their use is as we expected.

Now it is time to start writing some new code. We remove the comments and create a function named hello-treasure, that should just print the text "Hello Treasure". We add the following to our file:

(defun hello-treasure ()
  (print "Hello Treasure!")
  )

in place of the comments, and we run the command slime-complile-and-load-file. Compilation succeeds, but when we aim to call the function (common-treasure:hello-treasure) get an error when trying to call the function, saying "The symbol "HELLO TREASURE" is not external to the "COMMON TREASURE" package.". This due to the fact that we did not declare the function external in the defpackage package declaration.

What is really interesting is that, although we have this issue, instead of just failing the SLIME REPL presents us with a number of options. For example we can 0: [Continue] Use symbol anyway., 1: [Retry] Retry SLIME REPL evaluation request., 2: [*Abort] Return to SLIME's top level.. In this case we can just abort, and modify the package declaration, but it is interesting to be offered all these options.

By adding (:export :hello-treasure) to the defpackage macro, we can do another call and it indeed returns "Hello Treasure!" in the REPL.

Now that we got the basics down we would like to have a slightly bigger example in which we make use of an existing library. While digging around for the software languages and libraries of the past is often the matter of some searching, reading, coding and a bit of trial and error. Digging around in the world is process governed by various regulations and laws. This is especially true in cities with long history, where a regular building site can easily unearth archaeological finds.

Digging in the certain urban areas can easily lead to excavations done before any subsequent work is done. Romano-Celtic temple excavated in London.

In particular, there are often a set of regulations that specify whether one can dig in areas of archaeological interests. As an example we use a portion of the rules by the municipality of Utrecht, as described on a map of Utrecht and surrounding areas (in Dutch).

These regulations can be summarized by a number of rules, based on two conditions: the type of area and the size of the area that is to be disturbed.

Such regulations can be encoded into a regular function that returns a true or false to the question "Is the a permit required?" based on two parameters, the size of the area and the type of the area. However this could be more complicated and harder to maintain if the set of regulations, along with conditions and outcomes, become more complex.

There are various approaches to deal with this issue. One of these is to encode the facts on which our answer relies into a knowledge base, and have a general mechanism to derive new knowledge from existing facts. This later process is often denoted as inferring or reasoning to derive new knowledge. Such an approach can make it easier to use and maintain such a system.

For this article we are going to implement such an approach using LISA. This is a production rule system, in which the knowledge base is encoded in a set of facts, and rules are uses as a way to derive new knowledge. It is easy to see how such a system would be a good fit for our problem: our domain restrictions are already formulated as a set of rules. It would also allow us to use some existing libraries, to see more of what Common Lisp has to offer.

To start off, first we need to add a dependency to LISA into our system. One of the files created by cl-project is an asdf file that describes how to build the software we are creating. Here we can add the dependency on Lisa, resulting in the following file:

#|
  This file is a part of common-treasure project.
|#

(defsystem "common-treasure"
  :version "0.1.0"
  :author ""
  :license ""
  :depends-on ("lisa") ;; Here we add the dependency on LISA.
  :components ((:module "src"
                :components
                ((:file "common-treasure"))))
  :description ""
  :long-description
  #.(read-file-string
     (subpathname *load-pathname* "README.markdown"))
  :in-order-to ((test-op (test-op "common-treasure-test"))))

Now we just need to make sure the dependencies needed are actually downloaded and available. For this we use quicklisp which is the library manager Common Lisp. By using the command (ql:quickload "common-treasure") we can get all the dependencies for our system, which in this case consists of just LISA.

Now in order to translate our example scenario, we look at some of the examples that are available for LISA. A typical system consists of a knowledge base, in which facts are defined as Common Lisp Object System (CLOS) objects and rules to manipulate such instances as the reasoning method.

CLOS is a way in which Common Lisp can do object-oriented programming, and LISA co-opts this as a way to represent facts in the knowledge base. Without going into detail on object-oriented programming theory, this allows one to, define classes, create instances of those classes and define methods that make use of those classes (see this nice intro on Common Lisp Object System (CLOS) for its general use).

In the context of LISA, these allow us to represent classes of facts, specific instances of facts, and methods that can be used with those facts. For example we can represent the type of facts in our domain with two classes of objects: one that represents "area" and one that whether "a permit is required". In Common Lisp these can be defined with the defclass macro. So for our example the following would be a call for permission requirement:

(defclass permit-required ()
  ()
)

In case of the area, we need to represent two facts about it: what the archaeological type of it is, and what size of the area is disturbed. Such elements of the classes are described in CLOS using slots. There are many possible options that are available to describe a slot:

Putting this together gives us the following class for representing the area:

(defclass area ()
  ((archeological-type :initarg :archeological-type :initform "" :accessor :archeological-type)
   (size-disturbed :initarg :size-disturbed :initform 0.0 :accessor :size-disturbed)
    ))

Now that we figured out how represent facts, we need to encode the rules of our domain into our system. We can do this with 3 rules that describe what to do in the cases of high archeological value, high archeological expectation and archeological expectation

The rules themselves described with three parts:

In the case of LISA, and in many other rule based definitions, the antecedent and the consequent are also called Left Hand Side (LHS) and Right Hand Side (RHS), as they are respectively on the left and right hand side of the arrow. What exactly can be put into the LHS and RHS is dependent on the language for the rule definitions, but one standard way they are used is in the LHS the conditions are described based on the facts that could be in the knowledge base, and on the RHS facts are added in the knowledge based on the meaning of the rule. For example for one of our rules, if the know the fact that our area is of high archaeological value (LHS), we can derive the fact that we need a permit (RHS).

Describing our three rules this way in LISA looks as follows, using the defrule macro:

;; If the area is of high archeological value, then a permit is required no matter what the size of the area is to be disturbed.
(defrule high-value () (area (archeological-type "high-value")) => (assert ((make-instance 'permit-required))))

;; If the area is of high archeological expectation, that a permit is required when the size of the area that is to be disturbed is larger than 100m<sup>2</sup>.
(defrule high-expectation () (area (archeological-type "high-expectation") (> size-disturbed 100))  => (assert ((make-instance 'permit-required))))

;; If the area if of archeological expectation, that a permit is required when the size of the area that is to be disturbed is larger than 1000m<sup>2</sup>.
(defrule expectation () (area (archeological-type "expectation") (> size-disturbed 1000))  => (assert ((make-instance 'permit-required))))

The rules are reasonably straightforward, in that they require the condition matching in the knowledge base, before they assert an instance of the permit-required fact into the knowledge base.

These together already describe the domain, but for this article we also show how things work with two examples. In one scenario we assert that we have an area with high-value archaeological type and a to be disturbed are with the size of 2000m2. In this case we would need a permit, due to our high-value rule. In the second scenario, the area is of high-expectation but the size of the area disturbed is only 10m2. In this case none our rules will fire, and no permit-required fact instance is put into the knowledge base.

Executing the system is just a matter of adding these facts in the knowledge base and letting the system run. In LISA adding a fact can be done using assert similarly to what the RHS of the rules are doing. To actually run the system, unsurprisingly, the function run can be called. To see what facts are in the knowledge base the function facts prints them out for us to see. To tie things together, we make sure that the knowledge base is reset before we execute anything using reset.

Below is the full example implementation that we put together:

(defpackage common-treasure
  )

(in-package :lisa-user)

(reset)

(defclass area ()
  ((archeological-type :initarg :archeological-type :initform "" :accessor :archeological-type)
   (size-disturbed :initarg :size-disturbed :initform 0.0 :accessor :size-disturbed)
    ))

(defclass permit-required ()
  ()
)

;; If the area is of high archeological value, then a permit is required no matter what the size of the area is to be disturbed.
(defrule high-value () (area (archeological-type "high-value")) => (assert ((make-instance 'permit-required))))

;; If the area is of high archeological expectation, that a permit is required when the size of the area that is to be disturbed is larger than 100m<sup>2</sup>.
(defrule high-expectation () (area (archeological-type "high-expectation") (> size-disturbed 100))  => (assert ((make-instance 'permit-required))))

;; If the area if of archeological expectation, that a permit is required when the size of the area that is to be disturbed is larger than 1000m<sup>2</sup>.
(defrule expectation () (area (archeological-type "expectation") (> size-disturbed 1000))  => (assert ((make-instance 'permit-required))))

;;Test fact 1
;; (assert ((make-instance 'area :archeological-type "high-value" :size-disturbed 2000)))

;;Test fact 2
(assert ((make-instance 'area :archeological-type "high-expectation" :size-disturbed 10)))

(facts)

(run)

(facts)

Note that here the fact 1 is commented out, to test the assert of fact 2, so that running the other scenario is just a matter of commenting and un-commenting the relevant lines.

For the first scenario, with fact 1, the output of our initial program is as follows:

[package common-treasure]#<INITIAL-FACT ; id 0 {1002669853}>
#<AREA ; id 1 {1002745373}>
For a total of 2 facts.
#<INITIAL-FACT ; id 0 {1002669853}>
#<AREA ; id 1 {1002745373}>
#<PERMIT-REQUIRED ; id 2 {100283A363}>
For a total of 3 facts.

Note that we print out our facts twice, once before the run and once after, but it shows that now the permit requirement was derived.

For the case of using fact 2 , we get the following output:

[package common-treasure].#<INITIAL-FACT ; id 0 {1004D40AB3}>
#<AREA ; id 1 {10020107C3}>
For a total of 2 facts.
#<INITIAL-FACT ; id 0 {1004D40AB3}>
#<AREA ; id 1 {10020107C3}>
For a total of 2 facts.

This shows that our fact base did not change due to our running the reasoning system.

So there we have it, a bit of a peek of what can be done with Common Lisp. I have to say the language and its various features are quite a bit daunting, as I feel I have only scratched the surface of what is possible, with both interacting with the language as well as with A.I. tools such as LISA. That said digging into the language and seeing how it can handle a small, but also relevant scenario, felt like a worthwhile journey, and that is an experience always worth treasuring.