Python:
from calendar import weekday, SUNDAY
[year for year in range(2008, 2122)
if weekday(year, 12, 25) == SUNDAY]
Lua:
require("date")
for year=2008,2121 do
if date(year, 12, 25):getweekday() == 1 then
print(year)
end
end
Javascript:
for (var year = 2008; year <= 2121; year++) {
var xmas = new Date(year, 11, 25)
if ( xmas.getDay() === 0 )
console.log(year)
}
example.ml
¹ let x = 5;
² let y = 6
³ let z = 7
ocamlc example.ml
File "example.ml", line 3, characters 0-3:
3 | let z = 7
^^^
Error: Syntax error
A frustrating experience for developers!
\[ Sentence\quad \to\quad Subject\quad Verb\quad Complement \\ \]
\[ \begin{align*} Subject \quad &\to& \texttt{Jerry}\quad &|\quad \texttt{Mary} \\ Verb \quad &\to& \texttt{eats}\quad &|\quad \texttt{drinks} \\ Complement \quad &\to& \texttt{pizza}\quad &|\quad \texttt{coffee} \\ \end{align*} \]
Terminals: \(\texttt{Jerry}\), \(\texttt{Mary}\), \(\texttt{eats}\), \(\texttt{drinks}\), \(\texttt{pizza}\), \(\texttt{coffee}\)
Nonterminals: \(Sentence, Subject, Verb, Complement\)
\[ \newcommand\leaf[2]{\dfrac{\texttt{#2}}{#1}} \begin{align*} \texttt{Jerry} \quad \texttt{eats} \quad \texttt{pizza} \end{align*} \]
\[ \begin{align*} \newcommand\leaf[2]{\dfrac{\texttt{#2}}{#1}} \leaf{Subject}{Jerry} \quad \texttt{eats} \quad \texttt{pizza} \end{align*} \]
\[ \begin{align*} \newcommand\leaf[2]{\dfrac{\texttt{#2}}{#1}} \leaf{Subject}{Jerry} \quad \leaf{Verb}{eats} \quad \texttt{pizza} \end{align*} \]
\[ \begin{align*} \leaf{Subject}{Jerry} \quad \leaf{Verb}{eats} \quad \leaf{Complement}{pizza} \end{align*} \]
\[ \begin{align*} \newcommand\leaf[2]{\dfrac{\texttt{#2}}{#1}} \dfrac{ \leaf{Subject}{Jerry} \quad \leaf{Verb}{eats} \quad \leaf{Complement}{pizza} }{Sentence} \end{align*} \]
How could we recognize an expression like: \(1+2\)
\[ \begin{align*} E &\to T \\ E &\to T + E \\ T &\to n \\ T &\to (E) \\ \end{align*} \] Terminals: \(+\), \((\), \()\), \(n\)
Nonterminals: \(E\) (expression), \(T\) (term)
\[ %\newcommand\node[3]{\dfrac{#1}{#2}}{\small(#2 \to #3)} \begin{align*} \newcommand\leaf[2]{\dfrac{\texttt{#2}}{#1}} \dfrac{ \dfrac{1}{T} \quad {\small+} \quad \dfrac{ \dfrac{2}{T} }{E} }{E} \end{align*} \]
\(\Rightarrow\) Constructing these trees is the job of a parser
Hand-writting parsers quickly appeared to be problematic: this process was tedious and error-prone.
This motivated a long line of research on parsing algorithms.
In this thesis, I worked on the family of LR parsers:
Yet, mainstream compilers tend to abandon generated parsers:
Why?
We can recognize any sentence by repeating two actions:
Let us write \(\mathrm{stack} \blacktriangleright \mathrm{input}\) for a step, where:
\[ \SR{}{1+2} \]
\[ \SRS \]
\[ \SR{n}{\?+2} \]
\[ \SRR{T\to n} \]
\[ \SR{T}{\?+2} \]
\[ \SRS \]
\[ \SR{T+\?}{2} \]
\[ \SRS \]
\[ \SR{T+n}{} \]
\[ \SRR{T\to n} \]
\[ \SR{T+T}{} \]
\[ \SRR{E\to T} \]
\[ \SR{T+E}{} \]
\[ \SRR{E\to T+E} \]
\[ \SR{E}{} \]
Question: How do we know which actions are applicable?
Idea: Keep track of the grammar rules have been partially recognized.
An item mark a position in a rule with a \(\bullet\).
For instance, \(E \to \bullet T + E\) is an item.
\[ \newcommand\SRSt[1]{\boxed{\begin{align*}#1\end{align*}}} \]When starting, we want to recognize an expression.
\[ \SRSt{ E &\to \bullet T \\ E &\to \bullet T + E \\ } \]
\[ \SRSt{ E &\to \bullet T \\ E &\to \bullet T + E \\ T &\to \bullet n \\ T &\to \bullet (E) \\ } \]
This is the initial state.
A state is a set of items, representing the candidates applicable to a situation.
Let's repeat the parsing process while tracking states:
\[ \newcommand\SRB[3]{ \begin{align*} &\SRSt{#3} \\[1em] &\SR{#1}{#2} \end{align*} } \] $$ \SRSt{ E &\to \bullet T \\ E &\to \bullet T + E \\ T &\to \bullet n \\ T &\to \bullet (E) \\ } $$\[ \SR{}{ 1+2} \]
\[ \SRS \]
\[ \SR{n}{\?+2} \]
\[ \SRR{T\to n} \]
\[ \SRS \]
\[ \ldots \]
Two kinds of actions are sufficient to recognize every sentence:
An item, written \(A\to\beta\bullet\gamma\), represents a position in a rule.
A state tracks active items and gives the applicable actions.
Let us look at examples of incomplete input.
We would like to report: "Expecting a number".
\[ \SR{}{1+\?} \]
\[ \SRS \]
\[ \SR{n}{+\?} \]
\[ \SRR{T\to n} \]
\[ \SR{T}{+} \]
\[ \SRS \]
\[ \SR{T+\?}{} \]
$$ \SR{T+\?}{} $$ $$ \SRSt{ E &\to T+\bullet E \\ E &\to \bullet T \\ E &\to \bullet T + E \\ T &\to \bullet n \\ T &\to \bullet (E) \\ } $$ $$ \SR{T+\?}{} $$We get stuck with a stack and a state.
We would like to report: "Unclosed parenthesis".
\[ \SR{}{(1} \]
\[ \SRS \]
\[ \SR{(}{1} \]
\[ \SRS \]
\[ \SR{(n}{} \]
\[ \SRR{T\to n} \]
\[ \SRR{E\to T} \]
The input is invalid if actions do not permit reaching the goal.
The position of failure is well-defined (after the last shift).
The concept of "failing configuration" is slippery because of reductions.
Could we also specify the handling of errors?
How to specify the association between a failure and a message?
How to realize this specification efficiently?
How to help author this specification?
We focus on the technical aspect and not the human aspect:
If error looks like "1+"
report "Expecting a number"
If error looks like "(1"
report "A parenthesis has not been closed"
Used in earlier versions of Go.
In case of error, reduce T and E (if possible)
If error looks like "1+"
report "Expecting a number"
If error looks like "(1"
report "A parenthesis has not been closed"
Used in Compcert C, Catala, ...