Chomsky Hierarchy
Type 0 / 1 / 2 / 3: increasingly restrictive
Not all grammars are equally powerful. The Chomsky hierarchy (Noam Chomsky,
1965; Definition 7.3.1, notes p. 140) classifies grammars by the shape of their
production rules. Each level is more restrictive — and easier to parse.
The four levels, from least to most restrictive:
- Type 0 — Unrestricted (phrase-structure grammars). Heads contain at
least one nonterminal; bodies are arbitrary strings. Maximum power.
- Type 1 — Context-sensitive. Bodies have no fewer symbols than heads:
. (Exception: is allowed.) The rewrite depends on
the surrounding context.
- Type 2 — Context-free (CFG). Heads are a single nonterminal:
for . The rewrite does not depend on context. CFGs cover
most programming language syntax.
- Type 3 — Regular (REG). Heads are a single nonterminal; bodies are
empty or a single terminal, optionally followed by a single nonterminal.
Equivalent to finite-state machines. Minimum power; easiest to parse.
The hierarchy is strict: every regular language is context-free, every
context-free language is context-sensitive, every context-sensitive language is
unrestricted. Each containment is strict (there are languages at level that
are NOT at level ).
Classic examples:
- — context-sensitive (Type 1). Cannot be CFG.
- — context-free (Type 2). Cannot be regular.
- — regular (Type 3).
For SMAI and most AI applications, context-free grammars are the sweet spot:
expressive enough for most languages, parsing algorithms run in (Theorem
7.3.14, notes p. 173).
- 0.0sChomsky Hierarchy
- 0.8sInnermost rectangle: Type 3, Regular; example (a|b)*.
- 1.9sType 2, Context-Free: example a^n b^n.
- 2.8sType 1, Context-Sensitive: example a^n b^n c^n.
- 3.7sType 0, Unrestricted: example L = a^n b^n c^n.
- 5.1sExamples stack as nested sets.
- 6.5sInclusions: T3 subset T2 subset T1 subset T0.
- 7.5sEach outer type strictly contains the inner ones.
The language is context-sensitive (Type 1) — it cannot be generated by a context-free grammar.
A context-sensitive grammar for :
S → abc | aABc
A → aABc | abc
cB → Bc
bB → bb
The rule cB → Bc swaps c to the right of B — needed to gather all the cs at the end. This is the kind of cross-serial dependency that distinguishes context-sensitive from context-free.
Observation. Natural languages (English, German, ...) are believed to be context-sensitive in general. Crucially, humans parse them in real time — despite the theoretical worst-case complexity. This is the 'real-time parsing puzzle' and motivates incremental / psycholinguistic models of parsing.
The language is regular. A Type-3 grammar:
S → aS | ε
The recursive rule S → aS produces any number of as; the base case S → ε (empty string) terminates. This is the simplest possible non-trivial regular language.
S → aS | ε generates the language .