Chapter 7Formal Languagesnotes.en.pdf:142-143

Top-down and Bottom-up Parsing

Two strategies, one tree

Def 7.3.11Top-down parsingDef 7.3.12Bottom-up parsing
Concept

Parsing is the inverse of derivation: given a string, find a parse tree (or
prove none exists). There are two main strategies.

Top-down parsing (Definition 7.3.11, notes p. 142): start from the start
symbol SS and work downwards to terminals. Try each rule that has SS on the
left; for each, recursively try to derive the next nonterminal. Predictive:
you guess what each nonterminal should expand to based on the input.

Bottom-up parsing (Definition 7.3.12, notes p. 143): start from the input
string and work upwards to SS. Find substrings matching the right-hand side of
some rule, and replace them with the left-hand nonterminal. This is called
shift-reduce parsing.

A subtle distinction (notes p. 172):
- LL parsers are top-down (Left-to-right scan, Leftmost derivation).
- LL(1): one token of lookahead.
- LL(k): kk tokens of lookahead.
- LR parsers are bottom-up (Left-to-right scan, Rightmost derivation in
reverse).
- LR(0), SLR, LALR, LR(1): increasing power, increasing table size.

In practice:
- Top-down is conceptually simpler, fits recursive-descent parsers well.
- Bottom-up is more powerful (handles a wider class of grammars) and tends
to be more efficient. Yacc / Bison are bottom-up LR parser generators.

Theorem 7.3.14 (notes p. 173, Lee 2002): context-free parsing is equivalent
to Boolean matrix multiplication. This is why efficient CFG parsers run in
O(n3)O(n^3) — and why we are unlikely to find much faster practical algorithms.

Animation — parsing table
Transcript — click a line to jump8 cues
  1. 0.0sPredictive parsing: LL(1)
  2. 1.2sGrammar E → T + E | T appears at the top
  3. 3.0sParse table builds — rows for non-terminals, columns for terminals
  4. 5.5sEach cell holds the production to apply for that lookahead
  5. 6.5sInput string + n is queued; cursor lands on +
  6. 7.5sCursor + is looked up in row E, column + → production E → T + E
  7. 9.0sCursor moves to n; lookup in row T, column n → T → n
  8. 11.5sLookahead + table entry + production = next derivation step
Worked example
Step 0 of 2
Practice — score 100% to advance
Multiple choice
Q1
Top-down parsing starts from…
Q2
Bottom-up parsing replaces substrings matching…
Q3
What does LL(1) mean?
Q4
Why are bottom-up parsers often preferred?
Q5
Context-free parsing has the same complexity as…
Order the steps
Arrange these proof steps in the correct order using the arrows.
1
Bottom-up: replace body with the rule's head
2
Top-down: expand leftmost nonterminal using a rule
3
Top-down: begin with start symbol SS
4
Top-down: continue until string is all terminals and matches input
5
Bottom-up: find substring matching a rule body
6
Bottom-up: repeat until only SS remains
Loading…