Archive for the 'Code' Category

regular expression matching without a state machine

Over the past few days I’ve been brushing up on my compiler theory, and in the course of my readings happened across an interesting article in the Scala blogs. The author presented a very concise (albeit not complete; see the comments following the article) implementation of a regex matching engine. I was intrigued and thought I’d try my hand at writing it in Common Lisp. My naive implementation:

(defclass regex () ())

(defclass phi (regex) ())

(defclass empty (regex) ())

(defclass l (regex) ((letter :initarg :letter)))

(defclass choice (regex)
  ((left :initarg :left)
   (right :initarg :right)))

(defclass seq (regex)
  ((left :initarg :left)
   (right :initarg :right)))

(defclass star (regex) ((expr :initarg :exp)))

(defgeneric accepts-empty (re))

(defmethod accepts-empty ((re phi)) nil)

(defmethod accepts-empty ((re empty)) t)

(defmethod accepts-empty ((re choice))
  (or (accepts-empty (slot-value re 'left))
      (accepts-empty (slot-value re 'right))))

(defmethod accepts-empty ((re seq))
  (and (accepts-empty (slot-value re 'left))
       (accepts-empty (slot-value re 'right))))

(defmethod accepts-empty ((re star)) t)

(defmethod accepts-empty ((re l)) nil)

(defgeneric part-deriv (regex item))

(defmethod part-deriv ((re phi) atom)
  (make-instance 'phi))

(defmethod part-deriv ((re empty) atom)
  (make-instance 'phi))

(defmethod part-deriv ((re l) atom)
  (if (equalp atom (slot-value re 'letter))
      (make-instance 'empty)
      (make-instance 'phi)))

(defmethod part-deriv ((re choice) atom)
  (with-slots ((left left) (right right)) re
    (make-instance 'choice
                   :left (part-deriv left atom)
                   :right (part-deriv right atom))))

(defmethod part-deriv ((re seq) atom)
  (with-slots ((left left) (right right)) re
    (let ((rn (make-instance 'seq :left (part-deriv left atom) :right right)))
      (if (accepts-empty left)
          (make-instance 'choice :left rn :right (part-deriv right atom))
          rn))))

(defmethod part-deriv ((re star) atom)
  (make-instance 'seq
                 :left (part-deriv (slot-value re 'expr) atom)
                 :right re))

(defgeneric match-regexp (re word))

(defmethod match-regexp ((re regex) word)
  (if (eq 0 (length word))
      (accepts-empty re)
      (match-regexp (part-deriv re (first word)) (cdr word))))

; testing

(match-regexp (make-instance 'choice
                             :left (make-instance 'seq
                                                  :left (make-instance 'l :letter #a)
                                                  :right (make-instance 'l :letter #c))
                             :right (make-instance 'star :exp (make-instance 'l :letter #b)))
               (list #b #b))

It’s definately not as concise as the Scala version, partly because of my decision to use generic functions to imitate Scala’s case-class matching. If any Lispers happen across this, I’d really appreciate any comments about how to improve on this.

show me the matches

Something that I’ve never been particularly fond of is the way that Vim does filename completion. Completing with the first available match and then allowing one to cycle through the remaining matches with <TAB> feels less effecient than presenting only the longest matching portion of the filename, allowing one to further refine the partially-completed string or retrieve all candidates using <TAB><TAB>. The latter is, of course, the model used by bash and Emacs.

As it turns out Vim has options that control this behaviour:

   set wildmode=longest:full
   set wildmenu

This wildmode setting instructs Vim to complete only the longest common portion of the filename and, if wildmenu is enabled, display the candidates in a status menu. This wildmenu setting, of course, turns this feature on.

emacs & slime for people like me

Peter Christensen has published his "Ultimate n00b SLIME/Emacs cheat sheet" as a Work In Progress. It’s a very handy quick reference.

punch clock: an introduction to objective-c

My first attempt at using Objective-C for something useful has produced Punch Clock, a simple application that one might use to record one’s activities for personal time management or later entry in a time-tracking system or timesheet.

While it is, as an application, little more than a skeleton of a prototype, it has enough useful functionality to give a flavour of Cocoa development in Objective-C. It features:

  • post-launch application initialisation through the use of a NSApplication delegate
  • an editable NSTableView
  • programmatic edits of a table cell
  • NSButtons that are enabled and disabled according to the state of the application

Those not completely disinterested can browse the source. It is hosted by Bitbucket as a Mercurial repository; retrieve it using:

   hg clone http://bitbucket.org/sinistral/punchclock

NSLog an NSString

NSLog(@"hello, %@", @"world");

classpath file classloader – now with properties

The ClasspathFileClassLoader now supports system properties. Thus, the command line for any Java application need consist of only:

   java \\
   -Djava.system.class.loader=tierseven.cfcl.ClasspathFileClassLoader \\
   -Dtierseven.cfcl.ClasspathFileClassLoader.properties_file=/path/to/cfcl/properties_file \\
   -classpath /path/to/cfcl/classes \\
   <main_class>

The properties file should contain other properties, including tierseven.cfcl.ClasspathFileClassLoader.file, which will in turn contain the classpath elements.

The source is available from a Mercurial repository:

   hg clone http://bitbucket.org/sinistral/cfcl

syntax error before ‘AT_NAME’ token

I also ran into this rather less-than-obvious error message compiling Objective-C code on MacOS X 10.5:

   syntax error before ‘AT_NAME’ token

I eventually resolved it thanks to this blog entry. The compiler was moaning about a missing @end in a header file.

being objective

I’ve long been a fan of Apple. In 1985 my father returned from a US visit with a brand new Macintosh Plus and introduced me to a whole new world of personal computing (You have no idea how big a deal this was to a little South Africa boytjie – thanks, Dad!). While my friends were fighting with phix so that they could play Montezuma’s Revenge, I had the Dark Castle games. In later years, however, limited access to (relatively) expensive Apple hardware meant that, by the time I started started programming, all that was available to me was Windows and GNU/Linux on Intel hardware.

It is only recently that I’ve had the privilege of getting re-acquainted with the beauty that is Apple.

I’ve been curious about Apple’s choice of Objective-C and, now that I have the opportunity, I’ve decided to learn the language for myself. The bits and bytes that I pick up, I will post here – mostly as a reminder for myself, but also in the hope that it will help someone else who, like me, is just starting out. As most of my experience to-date has been as a Java developer, some of the things that took me a little while to figure out will no doubt be blindingly obvious to a C/C++ developer.

So … my first step was to try to write something very simple that would let me play around with the classes in the standard frameworks. For me this means being able to do a quick command-line build, without having to fire up Xcode. What I ended up with is the following main.m:

#import "Foundation/Foundation.h"

int main (int argc, char** argv)
{
	NSAutoreleasePool* pool = [[NSAutoreleasePool alloc] init];
	NSMutableArray* records = [NSMutableArray arrayWithCapacity: 3];

	[records addObject: [NSMutableDictionary dictionaryWithCapacity: 2]];
	[records addObject: [NSMutableDictionary dictionaryWithCapacity: 2]];

	[[records objectAtIndex: 0] setValue: @"marc" forKey: @"name"];
	[[records objectAtIndex: 0] setValue: @"male" forKey: @"sex"];
	[[records objectAtIndex: 1] setValue: @"michelle" forKey: @"name"];
	[[records objectAtIndex: 1] setValue: @"female" forKey: @"sex"];

	[records writeToFile: @"entries" atomically: NO];
	[pool release];
}

This may be compiled using:

   gcc -framework Foundation Test.m

There were a couple of things that weren’t immediately obvious to me:

  • to use the classes in the Foundation framework, one needs to include Foundation/Foundation.h; I was looking for specifc headers files for each of the referenced classes.
  • the linker must be instructed to link against the framework using the
       -f <framework>
    option. If you’re not linking, a
       gcc -c main.m
    is sufficient.

learning to lisp

For some time now I’ve been intrigued by the claims of its proponents that Lisp is the language that exemplifies elegance and expressiveness. Those are bold claims, and I’ve finally decided to take the plunge and to see what all the fuss is about. So, this is my step down the road; only time will tell if I’ll ever truly "get it".

I find that I learn best by doing, so with the first few chapters of Peter Seibel’s Practical Common Lisp as an introduction, I wanted to get my hands dirty and started putting together a development environment. The Superior Lisp Interaction Mode for Emacs (SLIME) is by far the most popular (free) Lisp IDE, and it is indeed excellent; I have, however, long been more comfortable with Vim and found Larry Clapp’s VIlisp a functional alternative for interfacing with a Lisp interpreter. My search for a (free) Lisp implementation eventually ended with ECL.

My decision to use ECL was based mainly on the fact that it is designed to be used as an embedded library, and can translate Lisp to C to produce a native executable. I see this as a benefit, because unlike Java (which has a rigid JVM and class format specification), Lisp does not require that fasls be portable between implementations, so there doesn’t appear to be a consistent way to distribute Lisp applications in binary form. The binary produced by ECL’s compilation is linked to the ECL library, which means that users don’t need to download an additional package (a Lisp interpreter) to run one’s application. Thanks to Java this is less of an issue now than it once was, but I for one still like to be able to download and use an application without having to hunt down its required framework.
I will, however, be revisiting CLISP and SBCL, which seem to be more user- (read: developer-) friendly Lisp implementations.

But I digress… With a development environment now set up, I embarked on my first Lisp project: an implementation of John Conway’s Game of Life. The source code is available; SDL is needed for the display.

Initially I, like many others, found the proliferation of parentheses a visual distration that hid the actual code. It required a good deal of effort to focus on the code and ignore the parentheses. In a surprisingly short time, however, I began to appreciate the parentheses as visual cues that delimit forms. They started to function as a kind of scope (in the sniper sense) for widening and narrowing my visual focus, allowing me to always concentrate on a specific context. I think the brain tends to focus on what it doesn’t recognise, and the more Lisp code one writes, the less the parentheses will stand out, and the more they will be interpreted as an appropriate visual cue. From my humble and limited experience, if the parentheses are the only thing stopping you from learning Lisp, try to stick with it … It may get better.

In addition to the REPL, I found Lisp’s lambda functions and closures to be an very useful debugging aid. By wrapping a portion of a function in a lambda function, and assigning it to a global variable, that portion of code is made accessible. Using funcall one can execute the lambda function to ensure that the fragment it wraps is behaving correctly. Because the fragment uses the bindings that have been captured by the closure, there’s no need to write test code that injects appropriate test values. Its not a replacement for proper test cases, but its great for ad-hoc verification of a particular piece of code.

I must say again that I’ve barely (if at all) scratched the surface of what Lisp has to offer. However, the little that I’ve seen has whet my appetite and I’m eager to learn more. I’ve only caught glimpses of the after-images left by the lightning bolts cast by Lisp wizards, but there appears to be a great deal of magic out there … and I’m eager to learn more.