From: Nathaniel Wesley Filardo <nwf@cs.jhu.edu>
Date: Thu, 8 Nov 2012 23:56:08 +0000 (-0500)
Subject: Some frontend work
X-Git-Url: https://hydra-www.ietfng.org/gitweb/?a=commitdiff_plain;h=eba150f62dbf3849d2864a78797f4e6cad7d6d7a;p=dyna2

Some frontend work

Add a special arm to the parser for numerics, and preserve this into the
next representation.  This is not perfect, it is just a start.

A better first pass at Dyna.Analysis.NormalizeParse.  (Though despite the
existence of a Selftest module, there are actually no tests present yet.)

While here, do some refactoring; hopefully things are better than before.
---

diff --git a/README b/README
index 46658f1..cad519b 100644
--- a/README
+++ b/README
@@ -4,9 +4,12 @@ An overview of the source tree
 external/
   damsl-k3         -- The DAMSL K3 tree, tracked as a git submodule.
 
-  ekmett-trifecta  -- The trifecta parser combinator library, tracked
+  ekmett-parsers   -- ekmett's parsers combinator library, tracked
                    -- as a git submodule.
 
+  ekmett-trifecta  -- ekmett's trifecta parser combinator library,
+                   -- tracked as a git submodule.
+
 src/Dyna/
 
   BackendK3        -- An AST and printer for K3,
@@ -27,7 +30,9 @@ src/Dyna/
 For those not familar with cabal
 --------------------------------
 
-First, ensure that you have GHC 7.6 or later.
+First, ensure that you have GHC 7.6 or later.  (Though in a pinch, if you're
+only interested in the frontend stuff, apparently as early as 7.0 continues
+to be servicable.)
 
 Then, sadly, I have to ask you to build some upstream packages out of their
 repositories.  I thought they were going to be released "soon" when I
@@ -35,6 +40,7 @@ switched to these later versions, but it hasn't happened yet:
 
 git submodule init
 git submodule update
+(cd external/ekmett-parsers; cabal install --user)
 (cd external/ekmett-trifecta; cabal install --user)
 
 Build K3, which requires OCaml:
@@ -49,7 +55,7 @@ Build Dyna:
 
 cabal configure --user --enable-tests
 cabal build
-cabal haddock
+# cabal haddock # skip this, for depressing reasons
 cabal test
 
 And then run the REPL:
diff --git a/dyna.cabal b/dyna.cabal
index 6ee9247..3129776 100644
--- a/dyna.cabal
+++ b/dyna.cabal
@@ -26,12 +26,12 @@ Library
     ghc-options:        -Wall
 
 
-    Exposed-Modules:    Dyna.BackendK3.AST,
+    Exposed-Modules:    Dyna.Analysis.NormalizeParse,
+                        Dyna.BackendK3.AST,
                         Dyna.BackendK3.Automation,
                         Dyna.BackendK3.Render,
                         Dyna.BackendK3.Stdlib,
                         Dyna.ParserHS.Parser,
-                        Dyna.NormalizeParse,
                         Dyna.XXX.HList,
                         Dyna.XXX.THTuple,
                         Dyna.XXX.Trifecta
diff --git a/src/Dyna/Analysis/NormalizeParse.hs b/src/Dyna/Analysis/NormalizeParse.hs
new file mode 100644
index 0000000..4d7b246
--- /dev/null
+++ b/src/Dyna/Analysis/NormalizeParse.hs
@@ -0,0 +1,208 @@
+---------------------------------------------------------------------------
+-- | Some simple analysis to move to ANF.
+--
+-- In Dyna's surface syntax, there exists both \"in-place evaluation\" and
+-- \"in-place construction\".  How do we deal with this?  Well, it's a
+-- little messy:
+--
+--   1. If a term is wrapped by the unary @*@ functor, it is explicitly
+--   moved to an ANF evaluation and replaced by a variable.  So
+--   @f(*g(X))@ rewrites as @Y is g(X), f(Y)@.
+--
+--   2. Each functor gets a chance to specify, for each argument, that it
+--   would prefer to evaluate a given position.  If so, and if the position
+--   is occupied by a non-variable, non-primitive term (e.g., atom or
+--   structure) whose disposition is to evaluate and which is not quoted by
+--   the unary @&@ functor, then it is moved to an ANF evaluation.  If f
+--   prefers to evaluate only its first position, and @g@ is disposed to
+--   evaluation by default, @f(g(X),h(Y))@ rewrites as @Z is g(X),
+--   f(Z,h(Y))@.
+--
+--   3. Otherwise, a non-variable, non-primitive term in an argument will
+--   be interpreted as structure-building.
+--
+-- Note that in rules, the head is by default not evaluated (regardless of
+-- the disposition of their outer functor), while the body is interpreted as
+-- a term expression (or list of term expressions) to be evaluated.
+--
+-- XXX This is really quite simplistic and is probably a far cry from where
+-- we need to end up.  Especially of note is that we do not yet parse any
+-- sort of pragmas for augmenting our disposition list.
+--
+-- XXX The handling for "is/2" is probably wrong.  Right now it's not
+-- special at all, but every Dyna program is defined to include
+-- @is(X,Y) :- X = *Y.@.  Is that something we should be normalizing out
+-- here or should be waiting for some further unfolding optimization phase?
+
+-- Header material                                                      {{{
+{-# LANGUAGE FlexibleContexts #-}
+{-# LANGUAGE FlexibleInstances #-}
+{-# LANGUAGE NoMonomorphismRestriction #-}
+{-# LANGUAGE OverloadedStrings #-}
+
+module Dyna.Analysis.NormalizeParse where
+
+import           Control.Monad.Reader
+import           Control.Monad.State
+import           Control.Unification
+import qualified Data.ByteString.UTF8       as BU
+import qualified Data.ByteString            as B
+import qualified Data.Map                   as M
+import qualified Data.Set                   as S
+import qualified Text.Trifecta              as T
+
+import qualified Dyna.ParserHS.Parser       as P
+import           Dyna.Term.TTerm
+-- import           Dyna.Test.Trifecta         -- XXX
+
+data ANFDict = AD
+  { -- | A map from (functor,arity) to a list of bits indicating whether to
+    -- (True) or not to (False) evaluate that positional argument.
+    --
+    -- XXX This isn't going to work when we get more complicated terms.
+    --
+    -- XXX Stronger type desired: we'd like static assurance that the
+    -- length of the list matches the arity in the key!
+    ad_arg_dispos  :: M.Map (B.ByteString,Int) [Bool]
+
+    -- | The set of functors that prefer not to be evaluated.
+  , ad_self_dispos :: S.Set (B.ByteString,Int)
+  }
+
+
+{- This stage of ANF does not actually link evaluations to
+ - their semantic interpretation.  That is, we have not yet
+ - resolved foreign function calls.
+ -}
+data ANFState = AS
+              { as_next  :: !Int
+              , as_evals :: M.Map B.ByteString DTerm
+              , as_unifs :: M.Map B.ByteString DTerm
+              }
+ deriving (Show)
+
+nextVar :: (MonadState ANFState m) => String -> m B.ByteString
+nextVar pfx = do
+    vn  <- gets as_next
+    modify (\s -> s { as_next = vn + 1 })
+    return $ BU.fromString $ pfx ++ show vn
+
+newEval :: (MonadState ANFState m) => String -> DTerm -> m DTerm
+newEval pfx t = do
+    n   <- nextVar pfx
+    evs <- gets as_evals
+    modify (\s -> s { as_evals = M.insert n t evs})
+    return $ UVar n
+
+newUnif :: (MonadState ANFState m) => String -> DTerm -> m DTerm
+newUnif pfx t = do
+    n   <- nextVar pfx
+    uns <- gets as_unifs
+    modify (\s -> s { as_unifs = M.insert n t uns})
+    return $ UVar n
+
+
+unspan :: T.Spanned P.Term -> DTerm
+unspan (P.TVar v T.:~ _)        = UVar v
+unspan (P.TNumeric v T.:~ _)    = UTerm $ TNumeric v
+unspan (P.TFunctor a as T.:~ _) = UTerm $ TFunctor a $ map unspan as
+unspan (P.TAnnot a t T.:~ _)    = UTerm $ TAnnot (fmap unspan a) (unspan t)
+
+-- | Convert a syntactic term into ANF; while here, move to a
+-- Control.Unification term representation.
+--
+-- XXX This sheds span information entirely, which is probably not what we
+-- actually want.  Note that we're careful to keep a stack of contexts
+-- around, so we should probably do something clever like attach them to
+-- operations we extract?
+normTerm_ :: (MonadState ANFState m, MonadReader ANFDict m)
+               => Bool          -- ^ In an evaluation context?
+               -> Bool          -- ^ Unpack unifications?
+               -> [T.Span]      -- ^ List of spans traversed
+               -> P.Term        -- ^ Term being digested
+               -> m DTerm
+
+    -- Variables don't evaluate and don't need to be moved
+normTerm_ _ _   _  (P.TVar  v)       = return $ UVar v
+
+    -- Numerics also get returned in-place.
+normTerm_ _ _   _  (P.TNumeric n)    = return $ UTerm $ TNumeric n
+    
+    -- Quote simply disappears having converted the context to
+    -- a non-evaluation context.
+normTerm_ _ _   ss (P.TFunctor "&" [t T.:~ st]) = do
+    normTerm_ False True (st:ss) t
+
+    -- Star forces evaluation even when the argument would prefer
+    -- to not be evaluated, thus the sort of odd "normalize in
+    -- nonevaluation context then eval" here.
+normTerm_ _ _   ss (P.TFunctor "*" [t T.:~ st]) = do
+    normTerm_ False True (st:ss) t >>= newEval "_normTS_"
+
+     -- Annotations are stripped of their span information
+     --
+     -- XXX this is probably the wrong thing to do
+normTerm_ c u   ss (P.TAnnot a (t T.:~ st)) = do
+    nt <- normTerm_ c u (st:ss) t
+    return $ UTerm $ TAnnot (fmap unspan a) nt
+
+    -- Functors have both top-down and bottom-up dispositions on
+    -- their handling.
+normTerm_ c u   ss (P.TFunctor f as) = do
+    argdispos <- asks $ maybe (repeat True) (id) . M.lookup (f,length as) . ad_arg_dispos
+    normas <- mapM (\(a T.:~ s,d) -> normTerm_ d True (s:ss) a) (zip as argdispos)
+    selfdispos <- getSelfDispos
+    (case () of
+       _ | c && selfdispos -> newEval "_normTF_"
+       _ | u               -> newUnif "_normTU_"
+       _                   -> return)
+     $ UTerm $ TFunctor f normas
+ where
+   getSelfDispos = do
+    set <- asks $ not . S.member (f,length as) . ad_self_dispos
+    return set
+    
+
+normTerm :: (MonadState ANFState m, MonadReader ANFDict m)
+         => Bool               -- ^ In an evaluation context?
+         -> T.Spanned P.Term   -- ^ Term to digest
+         -> m DTerm
+normTerm c (t T.:~ s) = normTerm_ c False [s] t
+
+    -- XXX
+normRule :: (MonadState ANFState m, MonadReader ANFDict m)
+         => T.Spanned P.Rule   -- ^ Term to digest
+         -> m DRule
+normRule (P.Fact t T.:~ _) = do
+    nt <- normTerm False t
+    return $ Rule nt ":-" [] (UTerm $ TFunctor "true" [])
+normRule (P.Rule h a es r T.:~ _) = do
+    nh  <- normTerm False h
+    nr  <- normTerm True  r
+    nes <- mapM (normTerm True) es
+    return $ Rule nh a nes nr
+
+    -- XXX
+dynaFunctorArgDispositions :: M.Map (B.ByteString,Int) [Bool]
+dynaFunctorArgDispositions = M.fromList [
+    (("is",2),[False,True])
+ ]
+
+    -- XXX
+    --
+    -- Functors which prefer not to be evaluated
+dynaFunctorSelfDispositions :: S.Set (B.ByteString,Int)
+dynaFunctorSelfDispositions = S.fromList
+    [ ("true",0)
+    , ("false",0)
+    , ("pair",2)
+    ]
+
+
+-- | Run the normalization routine.
+--
+-- Use as @runNormalize nRule
+runNormalize :: ReaderT ANFDict (State ANFState) a -> (a, ANFState)
+runNormalize =
+  flip runState   (AS 0 M.empty M.empty) .
+  flip runReaderT (AD dynaFunctorArgDispositions dynaFunctorSelfDispositions)
diff --git a/src/Dyna/Analysis/NormalizeParseSelftest.hs b/src/Dyna/Analysis/NormalizeParseSelftest.hs
new file mode 100644
index 0000000..ed28c3d
--- /dev/null
+++ b/src/Dyna/Analysis/NormalizeParseSelftest.hs
@@ -0,0 +1,19 @@
+{-# LANGUAGE FlexibleContexts #-}
+{-# LANGUAGE FlexibleInstances #-}
+{-# LANGUAGE NoMonomorphismRestriction #-}
+{-# LANGUAGE OverloadedStrings #-}
+{-# LANGUAGE StandaloneDeriving #-}
+
+module Dyna.Analysis.NormalizeParseSelftest where
+
+import qualified Data.ByteString              as B
+import           Dyna.Analysis.NormalizeParse
+import qualified Dyna.ParserHS.Parser         as P
+import           Dyna.Term.TTerm
+import           Dyna.XXX.TrifectaTest
+
+testNormTerm :: Monad m => B.ByteString -> (DTerm, ANFState)
+testNormTerm = runNormalize . normTerm False . unsafeParse P.dterm
+
+testNormRule :: Monad m => B.ByteString -> (DRule, ANFState)
+testNormRule = runNormalize . normRule . unsafeParse P.drule
diff --git a/src/Dyna/NormalizeParse.hs b/src/Dyna/NormalizeParse.hs
deleted file mode 100644
index acd6f16..0000000
--- a/src/Dyna/NormalizeParse.hs
+++ /dev/null
@@ -1,204 +0,0 @@
-{-# LANGUAGE FlexibleContexts #-}
-{-# LANGUAGE NoMonomorphismRestriction #-}
-{-# LANGUAGE OverloadedStrings #-}
-
-module Dyna.NormalizeParse where
-
--- import           Control.Arrow
-import           Control.Monad.Reader
-import           Control.Monad.State
-import qualified Data.ByteString.UTF8       as BU
-import qualified Data.ByteString            as B
-import qualified Data.Map                   as M
-import qualified Data.Set                   as S
-import qualified Text.Trifecta              as T
-
-import qualified Dyna.ParserHS.Parser       as P
--- import           Dyna.Test.Trifecta         -- XXX
-
-data Term = TFunctor !B.ByteString ![Term]
-          | TVar !B.ByteString
- deriving (Eq,Ord,Show)
-    -- The Ord instance is only there for the use of M.Map and S.Set
-    -- and should not be relied upon to be meaningful in any sense.
-
-data Rule = Rule !Term !B.ByteString ![Term] !Term
- deriving (Eq,Ord,Show)
-
--- | A map from (functor,arity) to a list of bits indicating whether to
--- (True) or not to (False) evaluate that positional argument.
---
--- XXX This isn't going to work when we get more complicated terms.
---
--- XXX Stronger type desired: we'd like static assurance that the
--- length of the list matches the arity in the key!
---
--- XXX What do we do about certain atoms that don't evaluate?  I don't
--- think I have handled it correctly here.  Maybe we should revisit the
--- "primitives don't evaluate" thing, but that doesn't help for things
--- like "pair/2" which we (presumably) want to be (preferentially)
--- structure-building.
---
-data ANFDict = AD
-             { ad_arg_dispos  :: M.Map (B.ByteString,Int) [Bool]
-             , ad_self_dispos :: S.Set (B.ByteString,Int)
-             }
-
-
-{- This stage of ANF does not actually link evaluations to
- - their semantic interpretation.  That is, we have not yet
- - resolved foreign function calls.
- -}
-data ANFState = AS
-              { as_next  :: Int
-              , as_evals :: S.Set (B.ByteString,Term)
-              }
- deriving (Show)
-
--- data Rule = Rule Term String Term (S.Set (String,Term))
---  deriving (Show)
-
-newEval :: (MonadState ANFState m) => String -> Term -> m Term
-newEval pfx t = do
-    evs <- gets as_evals
-    vn  <- gets as_next
-    let n = BU.fromString $ pfx ++ show vn
-    put $ AS (vn + 1) (S.insert (n,t) evs)
-    return $ TVar n
-
--- | Convert a syntactic term into ANF.
-normalizeTerm_ :: (MonadState ANFState m, MonadReader ANFDict m)
-               => Bool          -- ^ In an evaluation context?
-               -> [T.Span]      -- ^ List of spans traversed
-               -> P.Term        -- ^ Term being digested
-               -> m Term
-
-    -- Variables don't evaluate and don't need to be moved
-normalizeTerm_ _     _  (P.TVar  v)       = return $ TVar v
-
-    
-    -- Quote simply disappears having converted the context to
-    -- a non-evaluation context.
-normalizeTerm_ _     ss (P.TFunctor "&" [t T.:~ st]) = do
-    normalizeTerm_ False (st:ss) t
-
-    -- Star forces evaluation even when the argument would prefer
-    -- to not be evaluated, thus the sort of odd "normalize in
-    -- nonevaluation context then eval" here.
-normalizeTerm_ _     ss (P.TFunctor "*" [t T.:~ st]) = do
-    normalizeTerm_ False (st:ss) t >>= newEval "_normTS_"
-
-    -- Functors have both top-down and bottom-up dispositions on
-    -- their handling.
-normalizeTerm_ c     ss (P.TFunctor f as) = do
-    argdispos <- asks $ maybe (repeat True) (id) . M.lookup (f,length as) . ad_arg_dispos
-    normas <- mapM (\(a T.:~ s,d) -> normalizeTerm_ d (s:ss) a) (zip as argdispos)
-    selfdispos <- getSelfDispos
-    (if c && selfdispos then newEval "_normTF_" else return) $ TFunctor f normas
- where
-   getSelfDispos = do
-    set <- asks $ not . S.member (f,length as) . ad_self_dispos
-    -- XXX numerics
-    return set
-    
-
-normalizeTerm :: (MonadState ANFState m, MonadReader ANFDict m)
-              => Bool               -- ^ In an evaluation context?
-              -> T.Spanned P.Term   -- ^ Term to digest
-              -> m Term
-normalizeTerm c (t T.:~ s) = normalizeTerm_ c [s] t
-
-    -- XXX
-dynaFunctorArgDispositions :: M.Map (B.ByteString,Int) [Bool]
-dynaFunctorArgDispositions = M.fromList [
-    (("is",2),[False,True])
- ]
-
-    -- XXX
-    --
-    -- Functors which prefer not to be evaluated
-dynaFunctorSelfDispositions :: S.Set (B.ByteString,Int)
-dynaFunctorSelfDispositions = S.fromList
-    [ ("true",0)
-    , ("false",0)
-    , ("pair",2)
-    ]
-
-    -- XXX
-run = flip runStateT  (AS 0 S.empty) .
-      flip runReaderT (AD dynaFunctorArgDispositions dynaFunctorSelfDispositions)
-
-    -- XXX
--- testNormTerm = run . normalizeTerm False . unsafeParse P.dterm
-
-normalizeRule (P.Fact t T.:~ _) = do
-    nt <- normalizeTerm False t
-    return $ Rule nt ":-" [] (TFunctor "true" [])
-normalizeRule (P.Rule h a es r T.:~ _) = do
-    nh  <- normalizeTerm False h
-    nr  <- normalizeTerm True  r
-    nes <- mapM (normalizeTerm True) es
-    return $ Rule nh a nes nr
-
--- testNormRule = run . normalizeRule . unsafeParse P.drule
-
-{-
-neis    e = newEval "_normE_"  $ \v -> EIs v e
-neiis i e = newEval "_normEI_" $ \v -> EIndir v i e
-
-normalizeExpr :: Maybe String -> Bool -> P.Expr -> State NormState Term
-normalizeExpr Nothing  c (P.ETerm t) = normalizeTerm neis      c t
-normalizeExpr (Just i) c (P.ETerm t) = normalizeTerm (neiis i) c t
-
-normalizeExpr mi True (P.ENew  e) = do
-    e'  <- normalizeExpr mi True e
-    case e' of
-        TVar ne' -> newEval "_normEN_" $ \v -> ENew v ne'
-        _ -> error "New construct did not reduce to variable (help?)"
-
-normalizeExpr mi c (P.EExpr e1 o e2) = do
-    e1' <- normalizeExpr mi True e1
-    e2' <- normalizeExpr mi True e2
-    (if c then maybe neis neiis mi else return) $ TFunctor o [e1', e2']
-
-normalizeExpr mi True (P.EIndir td ti) = do
-    td' <- normalizeTerm eh True td
-    case td' of
-        TVar ntd' -> normalizeExpr (Just ntd') True ti
-        _ -> error "Indirection LHS did not reduce to variable (help?)"
- where eh = maybe neis neiis mi
-
-normalizeExpr _ False (P.ENew _)     = error "New in nonevaluation context (help?)"
-normalizeExpr _ False (P.EIndir _ _) = error "Indirection in nonevaluation context (help?)"
-
-testNormExpr c s = let x = P.dynaExpr $ L.alexScanTokens s in
-    (x, runState (normalizeExpr Nothing c x) (NS 0 []))
-
-
-nvis    t = newEval "_normV_"  $ \v -> EIs v t
-
-normalizeEval :: P.Eval -> State NormState Eval
-normalizeEval (P.EIs s e) = liftM (EIs s) $ normalizeExpr Nothing True e
-normalizeEval (P.EEq t1 t2) = do
-    t1' <- normalizeTerm nvis False t1
-    t2' <- normalizeTerm nvis False t2
-    return $ EEq t1' t2'
-
-
-nris    t = newEval "_normR_"  $ \v -> EIs v t
-
-normalizeRule_ :: P.Rule -> State NormState Rule
-normalizeRule_ (P.Rule h (P.Aggr a) r es) = do
-    h' <- normalizeTerm nris False h
-    r' <- normalizeExpr Nothing True r
-    es' <- mapM normalizeEval es
-    return $ Rule h' a r' es'
-
-normalizeRule :: P.Rule -> Rule
-normalizeRule r =
-    let (Rule h a v es, NS _ es') = runState (normalizeRule_ r) (NS 0 [])
-     in Rule h a v (es++es')
-
-testNormRule :: String -> Rule
-testNormRule s = let r = P.dynaRule $ L.alexScanTokens s in normalizeRule r
--}
diff --git a/src/Dyna/ParserHS/Parser.hs b/src/Dyna/ParserHS/Parser.hs
index 8763697..7843e12 100644
--- a/src/Dyna/ParserHS/Parser.hs
+++ b/src/Dyna/ParserHS/Parser.hs
@@ -5,7 +5,7 @@
 -- <https://github.com/ekmett/trifecta/blob/master/examples/RFC2616.hs>
 -- as well as the trifecta code itself
 --
--- TODO:
+-- TODO (XXX):
 --
 --   * We might want to use T.T.Literate, too, in the end.
 --
@@ -14,6 +14,9 @@
 --   * Doesn't handle parenthesized aggregators
 --
 --   * Doesn't handle shared subgoals ("whenever ... { ... }")
+--
+--   * Don't end numerics with ., even if it's the end-of-rule marker;
+--   put a space first.
 
 --   Header material                                                      {{{
 
@@ -23,7 +26,7 @@
 {-# LANGUAGE Rank2Types #-}
 
 module Dyna.ParserHS.Parser (
-    Term(..), Annotation(..), dterm, dtexpr,
+    Term(..), dterm, dtexpr,
     Rule(..), drule, Line(..), dline, dlines
 ) where
 
@@ -42,19 +45,20 @@ import           Text.Parser.Token.Highlight
 import           Text.Parser.Token.Style
 import           Text.Trifecta
 
+import           Dyna.Term.TTerm (Annotation(..))
 import           Dyna.XXX.Trifecta (identNL)
 
 ------------------------------------------------------------------------}}}
 -- Parsed output definition                                             {{{
 
-data Annotation = AnnType (Spanned Term)
+data Term = TFunctor !B.ByteString
+                     ![Spanned Term]
+          | TAnnot   !(Annotation (Spanned Term))
+                     !(Spanned Term)
+          | TNumeric !(Either Integer Double)
+          | TVar     !B.ByteString
  deriving (Eq,Ord,Show)
 
-data Term = TFunctor {-# UNPACK #-} !B.ByteString ![Spanned Term]
-          | TAnnot   Annotation !(Spanned Term)
-          | TVar     {-# UNPACK #-} !B.ByteString
-           -- TDBLit XXX
- deriving (Eq,Ord,Show)
 
 -- | Rules are not just terms because we want to make it very syntactically
 --   explicit about the head being a term (though that's not an expressivity
@@ -64,13 +68,13 @@ data Term = TFunctor {-# UNPACK #-} !B.ByteString ![Spanned Term]
 --   XXX The span on Fact is a little silly
 data Rule = Fact (Spanned Term)
           | Rule !(Spanned Term) !B.ByteString ![Spanned Term] !(Spanned Term)
- deriving (Eq,Ord,Show)
+ deriving (Eq,Show)
 
 --   XXX The span on LRule is a little silly
 --   XXX Having one kind of Pragma is probably wrong
 data Line = LRule (Spanned Rule)
           | LPragma !(Spanned Term)
- deriving (Eq,Ord,Show)
+ deriving (Eq,Show)
 
 
 ------------------------------------------------------------------------}}}
@@ -82,9 +86,11 @@ bsf = fmap BU.fromString
 ------------------------------------------------------------------------}}}
 -- Identifier Syles                                                     {{{
 
+-- | The full laundry list of punctuation symbols we "usually" mean.
 usualpunct :: CS.CharSet
 usualpunct = CS.fromList "!#$%&*+/<=>?@\\^|-~:."
 
+-- | Dot operators
 dynaDotOperStyle :: TokenParsing m => IdentifierStyle m
 dynaDotOperStyle = IdentifierStyle
   { styleName = "Dot Operator"
@@ -95,11 +101,13 @@ dynaDotOperStyle = IdentifierStyle
   , styleReservedHighlight = ReservedOperator
   }
 
-    --   Dot is handled specially elsewhere due to its
-    --   dual purpose as an operator and rule separator.
-    --
-    --   Colon is not a permitted beginning to a prefix
-    --   operator, as it is a sigil for type annotations.
+-- | Prefix operators
+--
+-- Dot is handled specially elsewhere due to its
+-- dual purpose as an operator and rule separator.
+--
+-- Colon is not a permitted beginning to a prefix
+-- operator, as it is a sigil for type annotations.
 dynaPfxOperStyle :: TokenParsing m => IdentifierStyle m
 dynaPfxOperStyle = IdentifierStyle
   { styleName = "Prefix Operator"
@@ -110,6 +118,10 @@ dynaPfxOperStyle = IdentifierStyle
   , styleReservedHighlight = ReservedOperator
   }
 
+-- | Infix operators
+--
+-- Dot is handled specially elsewhere due to its
+-- dual purpose as an operator and rule separator.
 dynaOperStyle :: TokenParsing m => IdentifierStyle m
 dynaOperStyle = IdentifierStyle
   { styleName = "Infix Operator"
@@ -123,7 +135,7 @@ dynaOperStyle = IdentifierStyle
 dynaAtomStyle :: TokenParsing m => IdentifierStyle m
 dynaAtomStyle = IdentifierStyle
   { styleName = "Atom"
-  , styleStart    = (lower <|> digit <|> char '_')
+  , styleStart    = (lower <|> oneOf "$")
   , styleLetter   = (alphaNum <|> oneOf "_'")
   , styleReserved = H.fromList [ "is", "new", "whenever" ]
   , styleHighlight = Constant
@@ -184,13 +196,23 @@ term :: DeltaParsing m => m (Spanned Term)
 term  = token $ choice
       [       parens texpr
       ,       spanned $ TVar <$> (bsf $ ident dynaVarStyle)
-      , try $ spanned $ flip TFunctor [] <$> atom <* (notFollowedBy $ char '(')
-      , try $ spanned $ mkta <$> (colon *> term) <* spaces <*> term
+
+      ,       spanned $ mkta <$> (colon *> term) <* spaces <*> term
+
+      , try $ spanned $ TNumeric <$> naturalOrDouble
+
+      , try $ spanned $ flip TFunctor [] <$> atom
+                      <* (notFollowedBy $ char '(')
+
+      , try $ spanned $ flip TFunctor [] <$> (bsf $ string "*") 
       ,       spanned $ parenfunc
       ]
  where
-  parenfunc = TFunctor <$> (highlight Identifier atom <?> "Functor")
+  functor = highlight Identifier atom <?> "Functor"
+
+  parenfunc = TFunctor <$> functor
                        <*>  parens (texpr `sepBy` symbolic ',')
+
   mkta ty te = TAnnot (AnnType ty) te
 
 -- XXX right now all binops are at equal precedence and left-associative; that's wrong.
@@ -234,13 +256,13 @@ rulepfx = Rule <$> term
 
 rule :: DeltaParsing m => m Rule
 rule = choice [
-                -- HEAD OP= RESULT whenever EXPRS .
+                -- HEAD OP= RESULT EXPR whenever EXPRS .
                (try (liftA flip rulepfx
                            <*> texpr
                            <*  hrss "whenever"))
                            <*> (texpr `sepBy1` symbolic ',')
 
-                -- HEAD OP= EXPRS, RESULT .
+                -- HEAD OP= EXPRS, RESULT EXPR .
               , (try rulepfx)
                            <*> many (try (texpr <* symbolic ','))
                            <*> texpr
@@ -259,9 +281,7 @@ drule = spanned rule
 
 progline :: DeltaParsing m => m (Spanned Line)
 progline  = spanned $ choice [ LRule <$> drule
-                             , LPragma <$> (symbol ":-"
-                                       *> spaces
-                                       *> texpr)
+                             , LPragma <$> (symbol ":-" *> spaces *> texpr)
                              ]
 
 dline :: DeltaParsing m => m (Spanned Line)
diff --git a/src/Dyna/ParserHS/Selftest.hs b/src/Dyna/ParserHS/Selftest.hs
index ca1d754..1d5a935 100644
--- a/src/Dyna/ParserHS/Selftest.hs
+++ b/src/Dyna/ParserHS/Selftest.hs
@@ -5,7 +5,7 @@
 --   Writing these is still too hard, Template Haskell and the REPL
 --     notwithstanding.
 --
---   Test.Framework.TH appears not to understand comments at the
+--   Test.Framework.TH appears not to understand block comments at the
 --   moment, and parses right through them.
 
 -- Header material                                                      {{{
@@ -30,6 +30,7 @@ import           Text.Trifecta
 import           Text.Trifecta.Delta
 
 import           Dyna.ParserHS.Parser
+import           Dyna.Term.TTerm (Annotation(..))
 import           Dyna.XXX.TrifectaTest
 
 ------------------------------------------------------------------------}}}
@@ -87,8 +88,8 @@ case_basicFunctorNLComment :: Assertion
 case_basicFunctorNLComment = e @=? (term sfb)
  where
   e =  TFunctor "foo"
-         [TFunctor "1" [] :~ Span (Lines 1 0 9 0) (Lines 1 1 10 1) "1,2\n"
-         ,TFunctor "2" [] :~ Span (Lines 1 2 11 2) (Lines 2 0 13 0) "1,2\n"
+         [TNumeric (Left 1) :~ Span (Lines 1 0 9 0) (Lines 1 1 10 1) "1,2\n"
+         ,TNumeric (Left 2) :~ Span (Lines 1 2 11 2) (Lines 2 0 13 0) "1,2\n"
          ]
         :~ Span (Columns 0 0) (Lines 2 1 14 1) "foo(%xxx\n"
 
@@ -99,11 +100,11 @@ case_basicFunctorTWS :: Assertion
 case_basicFunctorTWS = e @=? (term sfb)
  where
   e = TFunctor "foo"
-       [TFunctor "bar" [] :~ Span (Lines 1 1 5 1) (Lines 1 5 9 5) "(bar )"
-       ] :~ Span (Columns 0 0) (Columns 10 10) "foo\n"
+       [TFunctor "bar" [] :~ Span (Columns 5 5) (Columns 9 9) sfb
+       ] :~ Span (Columns 0 0) (Columns 10 10) sfb
 
   sfb :: (IsString s) => s
-  sfb = "foo\n(bar )"
+  sfb = "foo (bar )"
 
 case_basicFunctorNL :: Assertion
 case_basicFunctorNL = e @=? (term sfb)
@@ -128,6 +129,14 @@ case_colonFunctor = e @=? (term pvv)
        :~ Span (Columns 0 0) (Columns 17 17) pvv
   pvv = "possible(Var:Val)"
 
+-- case_nullaryStar :: Assertion
+-- case_nullaryStar = e @=? (term gs)
+--  where
+--   e  = TFunctor "gensym"
+--          [TFunctor "*" [] :~ Span (Columns 7 7) (Columns 8 8) gs
+--          ] :~ Span (Columns 0 0) (Columns 9 9) gs
+--   gs = "gensym(*)"
+
 case_failIncompleteExpr :: Assertion
 case_failIncompleteExpr = checkParseFail dterm "foo +"
   "(interactive):1:5: error: expected: \"(\",\n    end of input\nfoo +<EOF> "
@@ -145,8 +154,6 @@ case_tyAnnot = e @=? (term fintx)
                   :~ Span (Columns 0 0) (Columns 9 9) fintx
   fintx = "f(:int X)"
 
-------------------------------------------------------------------------}}}
-
 ------------------------------------------------------------------------}}}
 -- Rules and lines                                                      {{{
 
@@ -162,10 +169,10 @@ case_ruleSimple = e @=? (progline sr)
   e  = LRule (Rule (TFunctor "goal" [] :~ Span (Columns 0 0) (Columns 5 5) sr)
                    "+=" 
                    []
-                   (TFunctor "1" [] :~ Span (Columns 8 8) (Columns 10 10) sr)
-            :~ Span (Columns 0 0) (Columns 10 10) sr)
-           :~ Span (Columns 0 0) (Columns 10 10) sr
-  sr = "goal += 1 ."
+                   (TNumeric (Left 1) :~ Span (Columns 8 8) (Columns 9 9) sr)
+            :~ Span (Columns 0 0) (Columns 9 9) sr)
+           :~ Span (Columns 0 0) (Columns 9 9) sr
+  sr = "goal += 1."
   
 case_ruleExpr :: Assertion
 case_ruleExpr = e @=? (progline sr)
@@ -230,7 +237,7 @@ case_ruleKeywordsComma = e @=? (progline sr)
                      :~ Span (Columns 21 21) (Columns 32 32) sr
                    ,TFunctor "is"
                       [TVar "Y" :~ Span (Columns 34 34) (Columns 36 36) sr
-                      ,TFunctor "3" [] :~ Span (Columns 39 39) (Columns 41 41) sr
+                      ,TNumeric (Left 3) :~ Span (Columns 39 39) (Columns 41 41) sr
                       ]
                      :~ Span (Columns 34 34) (Columns 41 41) sr
                    ]
@@ -247,17 +254,17 @@ case_rules = e @=? (proglines sr)
   e = [ LRule (Rule (TFunctor "goal" [] :~ Span (Columns 0 0) (Columns 5 5) sr)
                      "+="
                      []
-                     (TFunctor "1" [] :~ Span (Columns 8 8) (Columns 9 9) sr)
-                    :~ Span (Columns 0 0) (Columns 9 9) sr)
-                   :~ Span (Columns 0 0) (Columns 9 9) sr
-      , LRule (Rule (TFunctor "goal" [] :~ Span (Columns 11 11) (Columns 16 16) sr)
+                     (TNumeric (Left 1) :~ Span (Columns 8 8) (Columns 10 10) sr)
+                    :~ Span (Columns 0 0) (Columns 10 10) sr)
+                   :~ Span (Columns 0 0) (Columns 10 10) sr
+      , LRule (Rule (TFunctor "goal" [] :~ Span (Columns 12 12) (Columns 17 17) sr)
                     "+="
                     []
-                    (TFunctor "2" [] :~ Span (Columns 19 19) (Columns 20 20) sr)
-                   :~ Span (Columns 11 11) (Columns 20 20) sr)
-                  :~ Span (Columns 11 11) (Columns 20 20) sr
+                    (TNumeric (Left 2) :~ Span (Columns 20 20) (Columns 22 22) sr)
+                   :~ Span (Columns 12 12) (Columns 22 22) sr)
+                  :~ Span (Columns 12 12) (Columns 22 22) sr
       ]
-  sr = "goal += 1. goal += 2."
+  sr = "goal += 1 . goal += 2 ."
 
 case_rulesDotExpr :: Assertion
 case_rulesDotExpr = e @=? (proglines sr)
@@ -276,11 +283,11 @@ case_rulesDotExpr = e @=? (proglines sr)
        , LRule (Rule (TFunctor "goal" [] :~ Span (Columns 17 17) (Columns 22 22) sr)
                       "+=" 
                       []
-                      (TFunctor "1" [] :~ Span (Columns 25 25) (Columns 26 26) sr)
-                     :~ Span (Columns 17 17) (Columns 26 26) sr)
-                    :~ Span (Columns 17 17) (Columns 26 26) sr
+                      (TNumeric (Left 1) :~ Span (Columns 25 25) (Columns 27 27) sr)
+                     :~ Span (Columns 17 17) (Columns 27 27) sr)
+                    :~ Span (Columns 17 17) (Columns 27 27) sr
        ]
-  sr = "goal += foo.bar. goal += 1."
+  sr = "goal += foo.bar. goal += 1 ."
 
 ------------------------------------------------------------------------}}}
 -- Harness toplevel                                                     {{{