Formal Languages
Kleene star, language operations
A formal language is just a set of strings. That's it. Nothing more mysterious.
The whole point is to distinguish "good" strings (members of the language, like
well-formed programs) from "bad" strings (non-members, like syntax errors).
Kleene star (Definition 7.1.7, notes p. 119): for an alphabet :
This is the set of all strings over , including . So
.
Kleene plus: — all nonempty strings.
Definition 7.1.9: A formal language over is a set .
Operations on languages (Definition 7.1.13):
- Union , intersection , complement .
- Concatenation .
- Power : -fold concatenation of with itself ().
- Kleene closure .
Important sanity check: a formal language may be infinite (e.g. all strings
starting with followed by any number of 's) — but the alphabet is finite.
The infinite-ness comes from repetition.
Why this matters in AI:
- The set of valid programs in Python is a formal language over Unicode.
- The set of valid logical formulas in propositional logic is a formal language
over .
- The set of grammatical English sentences is (approximately) a formal language.
Two more operations on languages over alphabet :
Complement (Def 1.7): . The set of all strings not in .
Reflection / Reversal (Def 1.8): , where is reversed.
Kleene plus (Def 1.7): — one or more concatenations (vs. Kleene star which includes zero).
Examples over :
- . — every string except ab and ba.
- — same set for this symmetric example.
- — concatenations of length .
abc is ___.