Newbie syntax questions

Sep 22, 2008 at 1:46 AM
Hi,

I've been trying to do a simple text transformation along the lines of the Google Code wiki markup.

The small chunk I've been implementing matches text, which is transformed into HTML paragraphs, and headings of the form:

= This is a Heading =

...which are transformed into HTML headings. The number of '=' signs denotes a heading level, H1 to H3.

The complete (broken) grammar is below. The two points I'm having trouble with are:

1. Matching newlines

No matter where I put it, "\n" causes an error when running the generated parser.

2. Named matches emit as raw text, not transformed, when appearing more than once in the same rule

I.e. see the HDelim rule below, which I believe should transform the equals signs into an appropriate HTML tag name. When :h appears later in the rule, the raw source text is emitted.

You can paste the grammar into the Little Typechecker example if you want to give it a run - grammar and expression names are unchanged :)

Any advice on where I'm going wrong greatly appreciated.

Nick
Sep 22, 2008 at 1:47 AM
.. this time hopefully not corrupted:

ometa OMetaSharp.Examples.LittleTypeChecker<char, string> : Parser<char> {
    TypeCheck = Body:t End -> { "<html><body>" + t + "</body></html>" },

    Body = Block:p Body:b -> { p + b }
         | Block,

    Block = Heading | Para,

    Heading = HDelim:h Text:t :h NewLine
            -> { "<" + h + ">" + t + "</" + h + ">" },

    HDelim = "===" -> { "h3" }
           | "=="  -> { "h2" }
           | "="   -> { "h1" },

    Para = Text:t -> { "<p>" + t + "</p>" },

    Text = Item:i Text:t -> { i + t }
         | Item,

    Item = LetterOrDigit
         | SpaceOrMore -> { " " },

    SpaceOrMore = Space Spaces,

    NewLine = "\n"
}

Coordinator
Sep 23, 2008 at 1:10 AM

It's great to see you "getting your hands dirty" with OMeta#!

One subtle point in OMeta's syntax is that a string in double quotes, such as "sample", applies the **Token** rule defined in the runtime's Parser, passing whatever is inside the quotes as the argument (see the SCharacters rule in OMetaParser.ometacs). The Token rule allows an arbitrary number of Space matches to precede the text. If you look at the Space rule defined in OMeta.cs, you'll see that it matches whatever Char.IsWhiteSpace treats as whitespace, which includes the newline '\n'. This is one of the things that was causing you grief: with the default Space rule, "\n" will never match, because Token has already consumed any run of whitespace (including '\n') before it tries to match the text. Writing '\n' in single quotes matches the literal newline character rather than the newline token.
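To make the difference concrete, here is a rough Python model of the two kinds of match. It is only a sketch of the behaviour described above, not the OMeta# runtime: the function names `skip_spaces`, `token` and `exactly` are made up for illustration, and Python's `str.isspace` stands in for the Space rule.

```python
def skip_spaces(text: str, pos: int) -> int:
    """Hypothetical stand-in for Spaces: consume zero or more whitespace chars."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos


def token(text: str, pos: int, literal: str):
    """Model of a double-quoted match: skip whitespace first, then match the text."""
    pos = skip_spaces(text, pos)
    if text.startswith(literal, pos):
        return pos + len(literal)
    return None  # no match


def exactly(text: str, pos: int, ch: str):
    """Model of a single-quoted match: compare the very next character only."""
    if pos < len(text) and text[pos] == ch:
        return pos + 1
    return None  # no match


source = "= Title =\nbody"
newline_at = source.index("\n")

# The token version eats the newline as leading whitespace, then looks for
# "\n" at the 'b' of "body" and fails.
token_result = token(source, newline_at, "\n")

# The literal version looks at the newline itself and succeeds.
literal_result = exactly(source, newline_at, "\n")
```

In this model `token_result` is `None` while `literal_result` is the position just past the newline, which mirrors why "\n" in double quotes never matches.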

Also, note that ":h" by itself is an abbreviation for "Anything:h" which consumes one item of input from the input stream, which in this case is a single character. This doesn't appear to be what you wanted.
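The effect on the Heading rule can be modelled in a few lines of Python. This is a simplified sketch, not generated parser code: `hdelim`, `anything` and `heading_prefix` are hypothetical names, Text is reduced to "everything up to the next '=' or newline", and it assumes that binding a name a second time replaces the earlier value.

```python
def hdelim(text: str, pos: int):
    """HDelim from the question: map a run of '=' (longest first) to a tag name."""
    for marks, tag in (("===", "h3"), ("==", "h2"), ("=", "h1")):
        if text.startswith(marks, pos):
            return tag, pos + len(marks)
    raise ValueError("no heading delimiter at position %d" % pos)


def anything(text: str, pos: int):
    """Anything: consume exactly one item of input, whatever it is."""
    if pos >= len(text):
        raise ValueError("end of input")
    return text[pos], pos + 1


def heading_prefix(text: str):
    """Walk through 'HDelim:h Text:t :h' and return the bindings afterwards."""
    bindings = {}
    bindings["h"], pos = hdelim(text, 0)        # HDelim:h  -> "h2"
    start = pos
    while pos < len(text) and text[pos] not in "=\n":
        pos += 1
    bindings["t"] = text[start:pos]             # Text:t (simplified)
    bindings["h"], pos = anything(text, pos)    # :h is Anything:h -> "="
    return bindings, pos


bindings, pos = heading_prefix("== Hi ==\n")
```

After the call, `bindings["h"]` holds the single raw character "=" rather than "h2", and only one of the two closing '=' characters has been consumed.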

Other than these small issues, everything else looked pretty good. I took advantage of OMeta's quantifying operators (* and + which mean "zero or more" and "one or more" respectively) to simplify things down a bit. I also changed your HDelim rule to be slightly more general.
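For readers who have not used the quantifiers before, the following Python sketch shows the idea of "zero or more" and "one or more" applied to headings. It is not the grammar from the Markup example: `many`, `many1` and `heading` are hypothetical helpers, and deriving the tag name from the number of '=' signs is just one assumed way of making HDelim more general.

```python
def many(pred, text: str, pos: int):
    """Model of `*`: consume zero or more characters that satisfy pred."""
    start = pos
    while pos < len(text) and pred(text[pos]):
        pos += 1
    return text[start:pos], pos


def many1(pred, text: str, pos: int):
    """Model of `+`: like many, but fail (None) if nothing was consumed."""
    matched, end = many(pred, text, pos)
    return (matched, end) if matched else None


def heading(line: str):
    """Match '=+ text =+' with equal runs on both sides; return HTML or None."""
    opened = many1(lambda c: c == "=", line, 0)
    if opened is None:
        return None
    marks, pos = opened
    body, pos = many(lambda c: c not in "=\n", line, pos)
    closed = many1(lambda c: c == "=", line, pos)
    if closed is None or closed[0] != marks:
        return None
    tag = f"h{len(marks)}"  # assumption: level comes from the count of '='
    return f"<{tag}>{body.strip()}</{tag}>"


html = heading("== Getting Started ==\n")
```

Here the closing delimiter is checked against the opening run explicitly, which is the comparison the second `:h` in the original rule was presumably meant to perform.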

The result is a new "Markup" example project in the latest check-in. The grammar file is located here.

Does this answer your question?

Sep 23, 2008 at 1:48 AM
Jeff - thanks for clearing that up! Answers everything perfectly.
I will let you know how I go extending the sample.
Nick
