chevron_left chevron_right

• 0 Vote(s) - 0 Average

 Tutorial CYFPL - Part 2 - Inception filter_list Linear Mode Threaded Mode View a Printable Version Author Message
CYFPL - Part 2 - Inception #1
Hello all, and welcome to part 2 of the CYFPL series. Today we're going to start working on the fun stuff. We're going to define the rules that our language will live by.

You should remember from last time our non-terminals. Let's start working with those, shall we.

Firstly, I want to start with our factors, these will be the last non-terminal to be parsed before we hit a value (digit-seq or ident). Because these are the last, they will be the easiest to understand.
Let's hit the obvious ones first, a factor can be a number, or a variable name.

Code:
```F -> digit-seq F -> ident```

Perfect, now we have two cases to look at, so let's move up one step in the chain, to a term.
A term can obviously be a factor, so let's hammer that one off right away. But, aside from that, a term can also be made up of two parts. Those two parts are a factor mult-op another factor. Allowing us to be able to parse 5 * 6, which is handy.
Code:
```T -> F mult-op F T -> F```
It may not look it, but the order those are in is actually extremely important. See, a RDT works by parsing whichever combination matches first, and since we will only be matching in non-terminals, we aren't going to check if it was the last token in the statement. So, if we had the single factor line first, then the compiler would think that 5 * 5 = 5, not 25.

Now that we have the basics of a term down, let's move up again to an expression.
The obvious case for an expression is that it can be a single term, so we're going to handle that one last. At this point, you may be wondering why a term is not something + something, and this will probably be the most confusing part of the whole series for you. To comply with order of operations, we MUST compute multiplication before addition, therefore we need to look at multiplication after addition. This is because we are writing a descent parser, so code is actually processed backwards by group. So, an expression can also be a term add-op another term, therefore allowing us to handle order correctly.
Code:
```E -> T add-op T E -> T```

Alright, now we are able to parse single terms properly, we have created the basis for a descent parser language. Pat yourself on the back, but come right back because we aren't finished yet!

Ok, so we have our basic math language in descent form, but we are trying to build a complete language, which is where the word recursive comes into play. See, right now all this language can handle is single term arithmetic (term used in sense of mathematics). So, let's look at it from the top to the bottom. We have:

Code:
```E -> T add-op T E -> T T -> F mult-op F T -> F F -> digit-seq F -> ident```

However, looking at expression, an expression can easily contain an expression of its own, right?
Also, a term can also include another term.
Code:
```E -> T add-op E E -> T T -> F mult-op T T -> F F -> digit-seq F -> ident```
perfect! If you spend a little time thinking about it, it will make sense why we did it that way. We won't need the extra case because each of those will ultimately evaluate to an integer.

Now, there is one last piece we are missing. Order of operations includes parenthesis. So, we're going to think about it. What does something in parens mean in mathematics?
It means to evaluate the expression independently.
So, that means that we need to include
Code:
`(E)`
in one of our steps.
If you remember from just a short while ago, that we execute from bottom up, it will make sense that we should parse this at the factor level, since it is our last hit. We will also scan for this first
Code:
```E -> T add-op E E -> T T -> F mult-op T T -> F F -> (E) F -> digit-seq F -> ident```

Perfect!

Now, we have basically grammar definitions for a calculator, which is really all that a programming language is. However, ours is very basic and we don't have the ability to automate anything. We will start that off with variables.
This is where we will travel one more level up, to the statement.
Code:
`S -> ident = E`
Now, we have a full fledged calculator. However, we want to make this more of a language than a bad implementation of R. So, let's add a little more functionality.
We want the following to make it seem more code like:
1. Ability to create and delete variables at runtime
3. Ability to write text
4. Ability to write variables
5. Ability to call functions
So, for these, we call them keywords, and we parse them at the statement level (so they take less overhead).
Code:
```S -> var ident S -> del ident S -> read ident S -> print ident S -> print text-seq S -> call ident S -> return```
I think those are all pretty obvious, except for printing text. So far, we haven't defined text-seq, though I'm sure you know what it means, just be aware of it.

So, let's look at our entire language spec at this point.
Code:
```S -> var ident S -> del ident S -> read ident S -> print ident S -> print text-seq S -> call ident S -> return S -> ident = E E -> T add-op E E -> T T -> F mult-op T T -> F F -> (E) F -> digit-seq F -> ident add-op -> + - mult-op-> * / % ident -> name digit-seq-> 0123456789 text-seq-> string```

Awesome! We now have the exact grammar for <your name here> programming language. Just for fun, I want you to draw some trees of the language to test it out. At the same time, try and convert this set of trees into their statements.

Keep in mind, the terminology in this one is for a different spec, but the basics are identical.

Next up, let's start writing the code! I will be using the C language to build this parser, but I am happy to work with any compiled language (that means no python or VB) if you guys have issues understanding C code.

As always, thank you for reading this, feel free to leave replies, reputation, or hit me up on discord.

4 users Like phyrrus9's post
RE: CYFPL - Part 2 - Inception #2
This is a great thread. Thank you for taking time out of your day to write it and post it for all of us!

RE: CYFPL - Part 2 - Inception #3
(01-02-2017, 08:13 PM)pvnk Wrote: This is a great thread. Thank you for taking time out of your day to write it and post it for all of us!

Anything for you my love

Might I ask, I said in the tutorial I was planning on working with C, is that a language you are comfortable with or would you prefer something like Java?

RE: CYFPL - Part 2 - Inception #4
Just wondering where subtraction and division come into play? Since expressions could be more than adding or multiplying.

Also, I know Java better than C, but I should be fine understanding the C code anyway. I don't mind having to figure things out.
"If you look for the light, you can often find it. But if you look for the dark, that is all you will ever see.”

RE: CYFPL - Part 2 - Inception #5
Nice, thanks for the [less confusing] thread!

I would work with Go or Haskell better than C, but I catch on fast, so I'll be fine.

(11-02-2018, 02:51 AM)Skullmeat Wrote: Ok, there no real practical reason for doing this, but that's never stopped me.

RE: CYFPL - Part 2 - Inception #6
(01-02-2017, 08:58 PM)God Wrote: Just wondering where subtraction and division come into play? Since expressions could be more than adding or multiplying.

Also, I know Java better than C, but I should be fine understanding the C code anyway. I don't mind having to figure things out.

Good question with sub and div, I didn't really draw this out as clearly as I should have. I defined mult-op as multiply, divide, and modulo and add-op as add and subtract, doing it this way just makes it cleaner, not having 5 symbols (or even worse, 5 cases) in the grammar.

RE: CYFPL - Part 2 - Inception #7
(01-02-2017, 09:56 PM)phyrrus9 Wrote:
(01-02-2017, 08:58 PM)God Wrote: Just wondering where subtraction and division come into play? Since expressions could be more than adding or multiplying.

Also, I know Java better than C, but I should be fine understanding the C code anyway. I don't mind having to figure things out.

Good question with sub and div, I didn't really draw this out as clearly as I should have. I defined mult-op as multiply, divide, and modulo and add-op as add and subtract, doing it this way just makes it cleaner, not having 5 symbols (or even worse, 5 cases) in the grammar.

Oh okay. That makes a lot more sense now.
"If you look for the light, you can often find it. But if you look for the dark, that is all you will ever see.”

RE: CYFPL - Part 2 - Inception #8
(01-02-2017, 08:15 PM)phyrrus9 Wrote:
(01-02-2017, 08:13 PM)pvnk Wrote: This is a great thread. Thank you for taking time out of your day to write it and post it for all of us!

Anything for you my love

Might I ask, I said in the tutorial I was planning on working with C, is that a language you are comfortable with or would you prefer something like Java?
I am not really a faggot so C would be great.

RE: CYFPL - Part 2 - Inception #9
I never thought of evaluating addition and subtraction in an entirely different step of the process from multiplication et. al, that's a neat trick.

"However, we want to make this more of a language than a bad implementation of R,"
Silly phyrrus, R is a bad implementation of R

RE: CYFPL - Part 2 - Inception #10
(01-03-2017, 04:06 AM)Inori Wrote: I never thought of evaluating addition and subtraction in an entirely different step of the process from multiplication et. al, that's a neat trick.

"However, we want to make this more of a language than a bad implementation of R,"
Silly phyrrus, R is a bad implementation of R

Believe me, grasping the concept of them being separate operations hit me hard the first time too. It wasn't until I started building CPUs that I realized multiplication and division are really just applications of addition, so it would make sense that they are handled separately. And yes, R is a pitiful language to begin with

1 user Likes phyrrus9's post

Users browsing this thread: 1 Guest(s)