
Some questions on the syntax

Jan 28, 2010 at 5:47 PM

Hi Jeff,

As I read in the previous thread, you haven't been in the source code for one year. My interest in OMeta comes from Boo (which itself implements OMeta) and from BooLangStudio, which uses OMeta# to implement the colorizer parser (syntax highlighting). It's not really functional right now.

What is my goal? My vision is to integrate OMeta# as a full VS2010 citizen with the new MEF architecture of VS. My first attempts at getting started with OMeta were very frustrating, but I'm gaining speed. I think I have a good understanding of the core concepts, the grammar and the syntax. But I still have quite a few problems understanding all of the codebase.

  1. Can you tell me the differences between your first core implementation and the syntax/grammar added later? To that end I created a "cheat sheet" out of your initial table (at the end of this post).
  2. Can you explain to me the differences between writing a string with '' and "", and do I have to treat `` differently for sequences? I'm referring to the following lines:
      TSString       = '\'' (~'\'' EscapedChar)*:xs '\''                 -> { xs.As<string>() },
      Characters     = '`' '`' (~('\'' '\'') EscapedChar)*:xs '\'' '\''  -> { Sugar.Cons("App", "Seq", xs.ToProgramString()) },
      SCharacters    = '"'     (~'"'    EscapedChar)*:xs '"'             -> { Sugar.Cons("App", "Token",  xs.ToProgramString() ) },
      String        ^= (('#' | '`') TSName | TSString):xs                -> { Sugar.Cons("App", "Exactly", xs.ToProgramString() ) },

 

And finally: did you hear of the upcoming implementation of JS on top of the DLR (http://ironjs.com/)? It looks very promising, so hopefully it would be a no-brainer to get OMeta#/JS implemented ;-)

 

| expression | meaning | OMeta# function |
|---|---|---|
| `e1 e2` | sequencing | Seq |
| `e1 \| e2` | prioritized choice | MetaRules.Or |
| `e*` | zero or more repetitions | MetaRules.Many |
| `e+` | one or more repetitions (not essential) | MetaRules.Many1 |
| `~e` | negation | MetaRules.Not |
| `~~e` | unlimited lookahead, but don't consume any input | |
| `<p>` | production application | MetaRules.Apply |
| `'x'` | matches the character x | Exactly |
| `^=` | override rule | |
| `->{...}` | semantic action | |
| `!{...}` | semantic action | |
| `?{...}` | semantic predicate | |
| `#chars` | symbol | |
| `(exp)` | grouping | |
| `e:x` | binding: bind the value of e to the identifier x in order to reference it in a semantic action or predicate | |
| `anything` | built-in rule from which others derive; anything of type T that is atomic | |
| `&` | lookahead | MetaRules.Lookahead |
| `ListOf` | | Parser.ListOf |
| `Exactly` | | OMeta. |
| `rule :x :y = ...` | parameterized rule; x and y can be of any type. Same as `rule = anything:x anything:y (...)` | |
| `rule 0 = ...`, `rule 1 = ...`, `rule :n = ...` | pattern matching; tried in order until the first one succeeds | |
| `rule = Foreign(typeof(LangB)` | foreign rule invocation (rule mixin/inheritance) | |
| `/* ... */` | multiline comment, consumed as space | |
| `//` | line comment | |

Jan 29, 2010 at 2:31 AM

Hi Rainer,

You're right that I haven't touched the code in ages and therefore haven't put any thought into it. I applaud your vision to move OMeta# to VS2010 and MEF. The addition of the DLR will greatly simplify things. If you'd like, I can give you commit access here and you're welcome to take over and rewrite things.

As to your questions:

1. I'm not sure I understand your question, but will give it a shot. Part of the initial implementation was getting it to where it could bootstrap itself. The hardest part to write by hand was getting OMetaBase.cs right. That's where the core "meta" rules live (e.g. dealing with memoization). In the code, I referred to this functionality as "MetaRules". On top of the "base" is OMeta.cs. This is where you can build up rules like "Exactly" and "LetterOrDigit". Those rules are pretty basic to implement by hand (but in terms of the meta rules) and build on the work of the streams (e.g. of characters).

Now, OMeta itself has a parser and optimizer. I initially wrote those by hand. If you'd like to take a look, go back to my very first changeset (12094) and look in the ManuallyCreatedCode folder at ManualOMetaParser.cs. That parser will show you matching Escaped Characters (EChar) and many more. As the project progressed, OMeta was able to parse its own grammar and generate code that would recognize itself. This happened in changeset 15383. By changeset 15954, the parser, optimizer, and translator were all written in OMeta#.

Throughout this entire process I would write by hand what I would later generate. Thus, if I wanted to change the OMeta language grammar, I'd update the grammar and regenerate the parser. This would allow me to use the new language syntax I just added (i.e. it's like "teaching" the compiler something new). The parser generates some very complicated abstract tree-like representations. Generating C# code directly from those (what is called "translating") would yield really ugly code, which is why the Optimizer runs over the tree first to trim things down.
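To make the layering concrete, here is a minimal sketch of how a few "meta" rules (sequencing, prioritized choice, repetition, negation, memoized application) can be built over a stream of arbitrary items. It is written in Python rather than C#, and every name in it (`Matcher`, `apply`, `seq`, `or_`, `many`, `not_`, `action`) is hypothetical; it is not the OMeta# API and makes no claim about how OMetaBase.cs is actually structured.

```python
class Matcher:
    """Holds the input stream and a memo table keyed by (rule, position)."""

    def __init__(self, stream):
        self.stream = stream  # any sequence: a string, a list of tokens, ...
        self.memo = {}

    def apply(self, rule, pos):
        """Apply a rule at pos, reusing a cached result when there is one."""
        key = (rule, pos)
        if key not in self.memo:
            self.memo[key] = rule(self, pos)
        return self.memo[key]


# A rule is a function (matcher, pos) -> (value, next_pos), or None on failure.

def anything(m, pos):
    """Consume one item of any type; fail at end of input."""
    return (m.stream[pos], pos + 1) if pos < len(m.stream) else None


def exactly(expected):
    """'x': match one item equal to `expected`."""
    def rule(m, pos):
        res = m.apply(anything, pos)
        return res if res is not None and res[0] == expected else None
    return rule


def seq(*rules):
    """e1 e2: run the rules in order; fail if any one fails."""
    def rule(m, pos):
        values = []
        for r in rules:
            res = m.apply(r, pos)
            if res is None:
                return None
            value, pos = res
            values.append(value)
        return values, pos
    return rule


def or_(*rules):
    """e1 | e2: prioritized choice, the first alternative that succeeds wins."""
    def rule(m, pos):
        for r in rules:
            res = m.apply(r, pos)
            if res is not None:
                return res
        return None
    return rule


def many(r):
    """e*: zero or more repetitions; never fails."""
    def rule(m, pos):
        values = []
        while True:
            res = m.apply(r, pos)
            if res is None or res[1] == pos:  # stop on failure or no progress
                return values, pos
            value, pos = res
            values.append(value)
    return rule


def not_(r):
    """~e: succeed without consuming input only if `r` fails."""
    def rule(m, pos):
        return (None, pos) if m.apply(r, pos) is None else None
    return rule


def action(r, fn):
    """-> { ... }: transform the value of a successful match."""
    def rule(m, pos):
        res = m.apply(r, pos)
        return None if res is None else (fn(res[0]), res[1])
    return rule


# Roughly the shape of: '\'' (~'\'' anything)*:xs '\'' -> { xs as string }
quote = exactly("'")
char = action(seq(not_(quote), anything), lambda v: v[1])
ts_string = action(seq(quote, many(char), quote), lambda v: "".join(v[1]))

print(Matcher("'ab'").apply(ts_string, 0))  # ('ab', 4)
```

Because `anything` only indexes into a sequence, the same rules work on a list of tokens or tree nodes as well as on characters, which is the property the later stages (optimizer, translator) rely on.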

The JavaScript version is probably easier to understand than mine because C# 3.0 didn't support the dynamic features that C# 4.0 will, so some parts use reflection manually, whereas the DLR will do that for you.

2. Again, I'm not sure if I understand your question as some of it might have gotten cut off. One important thing to keep in mind is that between tokens (things between double quotes like "|") there can be arbitrary amounts of whitespace. You can see this if you look at the OMetaParser.ometacs file itself. Note how a literal double quote '"' followed by 0 or more EscapedChar's followed by a literal '"' gets turned into the abstract application of "Token" of characters. Note then how Token is implemented in Parser.cs and allows the Spaces rule around things. That's what will eat up whitespace, which is just 0 or more of the Space rule (defined in ometaparser.ometacs). The thing to keep in mind is that the streams can be streams of anything. In the parser, we take in streams of chars and produce an abstract syntax tree of sorts (that's what Sugar.Cons is making). The optimizer takes the tree and produces a tree. So you have to always think in terms of the stream.
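The difference between a "Token" match and a plain character-sequence match can be sketched as follows. This is an illustrative Python sketch, not the code in Parser.cs: the function names are hypothetical, and it assumes Token skips whitespace only in front of the characters it matches.

```python
def spaces(text, pos):
    """Spaces = Space*: skip zero or more whitespace characters."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos


def exactly_seq(expected):
    """Seq-style match: the characters must appear right at pos."""
    def rule(text, pos):
        end = pos + len(expected)
        return (expected, end) if text[pos:end] == expected else None
    return rule


def token(expected):
    """Token-style match: skip leading whitespace, then match the characters."""
    def rule(text, pos):
        return exactly_seq(expected)(text, spaces(text, pos))
    return rule


source = "a   | b"
print(token("|")(source, 1))        # ('|', 5)
print(exactly_seq("|")(source, 1))  # None
```

Starting right after the `a`, the token rule eats the three spaces and matches `|`, while the plain sequence rule fails because the next character is a space.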

3. As for IronJS, I hadn't heard of it, but you could probably share some ideas with it. I wrote OMeta# as a proof of concept. With C# 4.0, especially with the "dynamic" keyword support, you could probably rewrite OMeta from scratch in a week or so. It might be good to just take the OMeta paper and ignore my implementation. I encourage you to do this as it's a good learning experience. The only thing to think of is how to make OMeta .NET-y. Note that I extended the OMeta grammar by supporting "using" statements to import namespaces so that the semantic action code would be simpler. You're free to make your own syntax choices.

Does that help any?

Jan 29, 2010 at 8:46 PM

Thanks for your answer. I already read the paper two or three times, and I also took an exhaustive look at your sources. The pointers to the changesets will help me understand the codebase a lot better. I've also read all your blog posts about OMeta. Two months ago I started looking at IronScheme (over here at CodePlex). And there is also an implementation. I'm not very good at compiler theory at the moment because I've never touched this area. I'll have to dive deeper into parsing techniques.

My initial motivation for getting in touch with VS2010 extensibility was to help make the Boo integration more complete. At BooLangStudio I discovered the use of OMeta#. In the meantime my interest switched over to D (with D-IDE built on SharpDevelop) as a native language for evaluating several native code concepts. But now, as I'm getting deeper, my highest priority will be to get OMeta# up and running. I'll have a more detailed look at the DLR and the expression trees of .NET 4.0. My first thought was to replace your intermediate tree-like implementation with .NET 4.0 expressions, so I have to experiment a little bit with it.

I don't think I'll get to an implementation point in around one week.

Jun 13, 2010 at 4:26 PM

Hi Rainer,

I have been actively working on an OMeta variant for .NET called MetaSharp. It's many things right now, but at its core is a pattern matching library and parser. One of my big goals is to get this integrated better into VS as a plugin, and to work on highlighting-type tools. I have also been contemplating a JS transform library like you mention above. I wasn't aware of a DLR version of JS though... that actually gives me a lot of ideas!

Anyway, I'm not sure Jeff is maintaining OMeta# anymore, but if you'd like to try out MetaSharp I would be more than happy to work with you on all of this.

PS Jeff, I'd love to hear what you think about what I have so far. 

--

Here's the main link:

http://metasharp.codeplex.com/

Grammar specs (a bit out of date):

http://metasharp.codeplex.com/wikipage?title=Grammar&referringTitle=Home

And the actual grammar grammar:

http://metasharp.codeplex.com/SourceControl/changeset/view/a97e22b8dc12#src%2ftransformation%2fParsing%2fGrammar%2fGrammar.grammar

 

Currently, the grammar grammar is parsed at build time and generates a class that is actually used to parse and transform itself for subsequent builds.