[Compiler Principles] 2. Syntax analysis

Published: 2024/6/21 · 生活家

There are three general types of parsers for grammars:

type	examples	coverage
universal	Cocke-Younger-Kasami algorithm, Earley's algorithm	can parse any grammar, but inefficient
top-down	-	works only for subclasses of grammars
bottom-up	-	works only for subclasses of grammars

Strategies for error recovery: panic-mode recovery, phrase-level recovery, error productions, and global correction.
A compiler is expected to assist the programmer in locating and tracking down errors.

The error handler in a parser has three goals:

Report the presence of errors clearly and accurately.
Recover from each error quickly enough to detect subsequent errors.
Add minimal overhead to the processing of correct programs.

I. Context-free grammar

A context-free grammar is made up of a series of productions. Its components are:

terminal symbols
nonterminals
productions
start symbol

Derivation: sentential form, sentence, left-sentential form, leftmost derivation, rightmost derivation

Every construct that can be described by a regular expression can be described by a grammar, but not vice-versa.

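Immediate left recursion can be eliminated by the following technique: replace the productions $A \to A\alpha \mid \beta$ (where $\beta$ does not begin with $A$) with $A \to \beta A'$ and $A' \to \alpha A' \mid \epsilon$, where $A'$ is a new nonterminal.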
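
As a rough illustration of this transformation, the sketch below splits the productions of one nonterminal into the left-recursive bodies and the rest, then builds the two new production groups. The function name, the tuple representation of bodies, and the `'` suffix for the new nonterminal are assumptions made for this example, not something taken from the original post.

```python
def eliminate_immediate_left_recursion(head, bodies):
    """Split the A-productions into A -> beta A' and A' -> alpha A' | epsilon.

    Each body is a tuple of symbols; the empty tuple () stands for epsilon.
    Assumes no alpha is empty and that head + "'" is not already in use.
    """
    # Bodies of the form A alpha: keep only the alpha part.
    alphas = [body[1:] for body in bodies if body and body[0] == head]
    # Bodies that do not start with A.
    betas = [body for body in bodies if not body or body[0] != head]
    if not alphas:
        return {head: list(bodies)}  # no immediate left recursion
    new_head = head + "'"  # hypothetical naming scheme for the new nonterminal
    return {
        head: [beta + (new_head,) for beta in betas],
        new_head: [alpha + (new_head,) for alpha in alphas] + [()],
    }


# E -> E + T | T
rewritten = eliminate_immediate_left_recursion("E", [("E", "+", "T"), ("T",)])
for head, bodies in rewritten.items():
    print(head, "->", " | ".join(" ".join(body) or "ε" for body in bodies))
```

For E → E + T | T this yields E → T E' and E' → + T E' | ε, which generate the same strings without left recursion.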

Algorithm 4.19, below, systematically eliminates left recursion from a grammar.
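
The algorithm itself is not reproduced in this text. The sketch below follows the usual textbook procedure as I understand it: fix an order $A_1, \dots, A_n$ of the nonterminals, replace every production $A_i \to A_j \gamma$ with $j < i$ by expanding $A_j$, then remove the immediate left recursion among the $A_i$-productions. The dictionary representation, the function name, and the `'` suffix are assumptions for this example; the procedure is normally stated for grammars without cycles or $\epsilon$-productions.

```python
def eliminate_left_recursion(grammar):
    """grammar maps each nonterminal to a list of bodies (tuples of symbols).

    The dict's insertion order is used as the order A1, ..., An.
    The empty tuple () stands for epsilon.
    """
    order = list(grammar)
    result = {}
    for i, a_i in enumerate(order):
        bodies = list(grammar[a_i])
        # Replace Ai -> Aj gamma (j < i) by Ai -> delta gamma for each Aj -> delta.
        for a_j in order[:i]:
            expanded = []
            for body in bodies:
                if body and body[0] == a_j:
                    expanded.extend(delta + body[1:] for delta in result[a_j])
                else:
                    expanded.append(body)
            bodies = expanded
        # Eliminate immediate left recursion among the Ai-productions.
        alphas = [b[1:] for b in bodies if b and b[0] == a_i]
        betas = [b for b in bodies if not b or b[0] != a_i]
        if alphas:
            new_head = a_i + "'"  # hypothetical name for the new nonterminal
            result[a_i] = [beta + (new_head,) for beta in betas]
            result[new_head] = [alpha + (new_head,) for alpha in alphas] + [()]
        else:
            result[a_i] = bodies
    return result


# S -> A a | b        A -> A c | S d | epsilon
result = eliminate_left_recursion({
    "S": [("A", "a"), ("b",)],
    "A": [("A", "c"), ("S", "d"), ()],
})
```

In this example the indirect recursion S ⇒ A a ⇒ S d a is exposed by expanding S inside the A-productions, after which only immediate left recursion on A remains to be removed.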

II. Top-down parsing

There are three top-down parsing methods: recursive-descent parsing, predictive parsing, and nonrecursive predictive parsing.
Predictive parsing is a special case of recursive-descent parsing.
The grammars that predictive parsing is guaranteed to handle are called LL(1) grammars. A left-recursive or ambiguous grammar can never be LL(1).

A grammar $G$ is LL(1) if and only if, whenever $A \to \alpha \mid \beta$ are two distinct productions of $G$, the following conditions hold:
1) $FIRST(\alpha) \cap FIRST(\beta) = \emptyset$
2) If $\epsilon \in FIRST(\beta)$, then $FIRST(\alpha) \cap FOLLOW(A) = \emptyset$; likewise, if $\epsilon \in FIRST(\alpha)$, then $FIRST(\beta) \cap FOLLOW(A) = \emptyset$.

Algorithm for constructing the parsing table for predictive parsing:
Input: a grammar $G$.
Output: a parsing table $M$.
For each production $A \to \alpha$ of $G$, perform the following two steps:

1) For each terminal $a \in FIRST(\alpha)$, add $A \to \alpha$ to $M[A, a]$.
2) If $\epsilon \in FIRST(\alpha)$, then for each $b \in FOLLOW(A)$ (a terminal or the endmarker \$), add $A \to \alpha$ to $M[A, b]$.

Finally, for every pair $A, a$ with $M[A, a] = \emptyset$, set $M[A, a]$ to error.
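
The two steps above can be sketched in a few lines. The example below is only an illustration: it assumes the usual left-factored expression grammar and takes its FIRST and FOLLOW sets as given inputs rather than computing them; the names `first_of_string` and `build_table` are made up for this sketch. Missing dictionary keys play the role of error entries.

```python
EPSILON = "ε"

GRAMMAR = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],   # () stands for epsilon
    "T": [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F": [("(", "E", ")"), ("id",)],
}
# FIRST and FOLLOW of the nonterminals, assumed to be known for this grammar.
FIRST = {"E": {"(", "id"}, "E'": {"+", EPSILON}, "T": {"(", "id"},
         "T'": {"*", EPSILON}, "F": {"(", "id"}}
FOLLOW = {"E": {")", "$"}, "E'": {")", "$"}, "T": {"+", ")", "$"},
          "T'": {"+", ")", "$"}, "F": {"+", "*", ")", "$"}}


def first_of_string(symbols, first):
    """FIRST of a sequence of grammar symbols."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})  # FIRST of a terminal is itself
        result |= sym_first - {EPSILON}
        if EPSILON not in sym_first:
            return result
    result.add(EPSILON)  # every symbol can derive epsilon
    return result


def build_table(grammar, first, follow):
    """Return M as a dict (nonterminal, terminal) -> list of bodies."""
    table = {}
    for head, bodies in grammar.items():
        for body in bodies:
            body_first = first_of_string(body, first)
            lookaheads = body_first - {EPSILON}      # step 1
            if EPSILON in body_first:
                lookaheads |= follow[head]           # step 2
            for a in lookaheads:
                table.setdefault((head, a), []).append(body)
    return table


table = build_table(GRAMMAR, FIRST, FOLLOW)
is_ll1 = all(len(entries) == 1 for entries in table.values())
```

Storing a list per entry makes conflicts visible: if any entry ends up with more than one body, the grammar is not LL(1).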

Note that for an LL(1) grammar, every entry of table $M$ contains at most one production.
Consider $A \to \alpha \mid \beta$; we show that the two productions cannot land in the same entry.
Condition 1 of the LL(1) definition says that step 1 cannot place the two productions in the same entry.
Condition 1 also implies that $FIRST(\alpha)$ and $FIRST(\beta)$ cannot both contain $\epsilon$, so at most one of the two productions can be added to $M$ by step 2.
Condition 2 then says that the production added by step 2 cannot share an entry with the other production.

How to compute FIRST
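
The method is not spelled out in this text. A common way to compute FIRST is a fixed-point iteration: keep adding symbols to the FIRST sets until nothing changes. The sketch below assumes a grammar given as a dictionary from nonterminals to lists of bodies (any symbol that is not a key is treated as a terminal); the function name and representation are choices made for this example.

```python
EPSILON = "ε"


def compute_first(grammar):
    """Return FIRST for every nonterminal; () as a body stands for epsilon."""
    first = {nt: set() for nt in grammar}

    def first_of_string(symbols):
        result = set()
        for sym in symbols:
            sym_first = first[sym] if sym in grammar else {sym}
            result |= sym_first - {EPSILON}
            if EPSILON not in sym_first:
                return result        # this symbol cannot vanish, stop here
        result.add(EPSILON)          # all symbols (or none) can derive epsilon
        return result

    changed = True
    while changed:                   # repeat until no set grows
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of_string(body)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


grammar = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F": [("(", "E", ")"), ("id",)],
}
first = compute_first(grammar)
```

For this expression grammar, E, T, and F all get FIRST = {(, id}, while E' and T' get {+, ε} and {*, ε} respectively.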

How to compute FOLLOW
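
The method is not spelled out in this text either. FOLLOW is usually computed with another fixed-point iteration: put the endmarker into FOLLOW of the start symbol, then scan each production body and propagate what can come after each nonterminal. The sketch below scans bodies from right to left; it takes the FIRST sets as an input (hard-coded here for the expression grammar), and all names are assumptions for this example.

```python
EPSILON = "ε"
END = "$"


def compute_follow(grammar, first, start):
    """Return FOLLOW for every nonterminal of the grammar."""
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                # trailer = terminals that can follow the part already scanned
                trailer = set(follow[head])
                for sym in reversed(body):
                    if sym in grammar:
                        if not trailer <= follow[sym]:
                            follow[sym] |= trailer
                            changed = True
                        if EPSILON in first[sym]:
                            # sym may vanish, so the old trailer still applies
                            trailer = trailer | (first[sym] - {EPSILON})
                        else:
                            trailer = set(first[sym])
                    else:
                        trailer = {sym}  # a terminal follows whatever is to its left
    return follow


grammar = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F": [("(", "E", ")"), ("id",)],
}
first = {"E": {"(", "id"}, "E'": {"+", EPSILON}, "T": {"(", "id"},
         "T'": {"*", EPSILON}, "F": {"(", "id"}}
follow = compute_follow(grammar, first, "E")
```

Here FOLLOW(E) = FOLLOW(E') = {), $}, FOLLOW(T) = FOLLOW(T') = {+, ), $}, and FOLLOW(F) = {+, *, ), $}.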

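III. Bottom-up parsing

Below we introduce three bottom-up methods, SLR, canonical LR (LR for short), and LALR, all of which are based on shift-reduce parsing.

We first look at what the three LR-style parsers have in common, and then at how they differ.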

1. How an LR parser is constructed

items $\to$ DFA $\to$ parsing table
Each state of the DFA corresponds to a set of items.

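2. Structure and operation of an LR parser

An LR parser has a stack that stores states, a buffer that holds the input, and a parsing table used to make decisions.
1) At each step the parser uses the state on top of the stack and the next input terminal in the buffer to look up the action part of the parsing table, which decides whether to shift, reduce, accept, or report an error.
2) On a reduce, the parser pops the states on top of the stack that represent the handle, then uses the newly exposed top state and the nonterminal that replaces the handle to look up the goto part of the parsing table, and pushes the resulting state onto the stack (or reports an error).
Note 1: every transition into a given state j, from whatever state, is on the same grammar symbol X.
Note 2: every LR grammar is unambiguous.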
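
The driver loop described above can be sketched as follows. The tables are written by hand for a tiny grammar chosen for this illustration (1: S → ( S ), 2: S → x, with SLR lookaheads), so the grammar, the state numbers, and the names `ACTION`, `GOTO`, and `lr_parse` are all assumptions of the example rather than part of the original notes.

```python
# Production number -> (head, length of body)
PRODUCTIONS = {1: ("S", 3), 2: ("S", 1)}

ACTION = {
    (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
    (1, "$"): ("accept", None),
    (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
    (3, ")"): ("reduce", 2), (3, "$"): ("reduce", 2),
    (4, ")"): ("shift", 5),
    (5, ")"): ("reduce", 1), (5, "$"): ("reduce", 1),
}
GOTO = {(0, "S"): 1, (2, "S"): 4}


def lr_parse(tokens):
    """Return the list of production numbers used, in order of reduction."""
    stack = [0]                      # stack of states, start state at the bottom
    tokens = list(tokens) + ["$"]    # append the right endmarker
    pos = 0
    reductions = []
    while True:
        action = ACTION.get((stack[-1], tokens[pos]))
        if action is None:           # empty entry means error
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        kind, arg = action
        if kind == "shift":
            stack.append(arg)
            pos += 1
        elif kind == "reduce":
            head, length = PRODUCTIONS[arg]
            del stack[len(stack) - length:]   # pop the states of the handle
            next_state = GOTO.get((stack[-1], head))
            if next_state is None:
                raise SyntaxError(f"no goto for {head} in state {stack[-1]}")
            stack.append(next_state)
            reductions.append(arg)
        else:                        # accept
            return reductions


trace = lr_parse(["(", "(", "x", ")", ")"])
```

The reductions come out in the reverse order of a rightmost derivation: for ( ( x ) ) the parser first reduces x to S, then reduces each parenthesized S.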

3. Differences between SLR, LR, and LALR

The three parsers use different items:

item	parser	grammar	number of states
LR(0)	SLR	SLR	small
LR(1)	LR	LR	large
LR(1), sets with the same core merged	LALR	LALR	small (same as SLR)

An LR(0) item of a grammar $G$ is a production of $G$ with a dot at some position of the body. Thus, production $A \to XYZ$ yields the four items
$A \to \cdot XYZ$
$A \to X \cdot YZ$
$A \to XY \cdot Z$
$A \to XYZ \cdot$
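
As a small illustration, the items of a production can be enumerated by sliding the dot across the body. The representation of an item as (head, symbols before the dot, symbols after the dot) and the helper names are assumptions for this sketch.

```python
def lr0_items(head, body):
    """Return every LR(0) item of head -> body as (head, before_dot, after_dot)."""
    return [(head, tuple(body[:dot]), tuple(body[dot:]))
            for dot in range(len(body) + 1)]


def show(item):
    """Format an item with a visible dot."""
    head, before, after = item
    return f"{head} → {' '.join(before + ('·',) + after)}"


items = lr0_items("A", ("X", "Y", "Z"))
for item in items:
    print(show(item))
```

A body of length n gives n + 1 items, so an ε-production A → ε yields the single item A → ·.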

An LR(1) item has the form $[A \to \alpha \cdot \beta, a]$, where $A \to \alpha\beta$ is a production and $a$ is a terminal or the right endmarker \$.

Recall that in the SLR method, state $i$ calls for reduction by $A \to \alpha$ if the set of items $I_i$ contains item $[A \to \alpha \cdot]$ and $a$ is in FOLLOW($A$). In some situations, however, when state $i$ appears on top of the stack, the viable prefix $\beta\alpha$ on the stack is such that $\beta A$ cannot be followed by $a$ in any right-sentential form.

In other words, the terminals that can actually follow $\beta A$ in a right-sentential form may be only a subset of FOLLOW($A$), so FOLLOW($A$) alone can call for a reduction that is invalid in that state.

Constructing the sets of LR(1) items
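
The construction is not reproduced in this text. The sketch below shows the two usual building blocks, CLOSURE and GOTO on sets of LR(1) items, under a simplifying assumption: the grammar has no $\epsilon$-productions, so the lookaheads for a new item $[B \to \cdot\gamma, b]$ added from $[A \to \alpha \cdot B\beta, a]$ are FIRST of the first symbol of $\beta$, or just $a$ when $\beta$ is empty. The grammar, the tuple representation of items, and the function names are choices made for this example.

```python
# Augmented grammar: S' -> S, S -> C C, C -> c C | d
GRAMMAR = {"S'": [("S",)], "S": [("C", "C")], "C": [("c", "C"), ("d",)]}


def first_sets(grammar):
    """FIRST of each nonterminal, assuming no epsilon-productions."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                x = body[0]
                new = first[x] if x in grammar else {x}
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def closure(items, grammar, first):
    """Items are tuples (head, body, dot position, lookahead)."""
    result = set(items)
    work = list(items)
    while work:
        head, body, dot, lookahead = work.pop()
        if dot == len(body) or body[dot] not in grammar:
            continue                      # dot at the end or before a terminal
        rest = body[dot + 1:]
        if rest:                          # FIRST(beta a) = FIRST(beta[0]) here
            lookaheads = first[rest[0]] if rest[0] in grammar else {rest[0]}
        else:
            lookaheads = {lookahead}
        for gamma in grammar[body[dot]]:
            for b in lookaheads:
                item = (body[dot], gamma, 0, b)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)


def goto(items, symbol, grammar, first):
    """Move the dot over symbol in every item that allows it, then close."""
    moved = {(h, b, d + 1, a) for (h, b, d, a) in items
             if d < len(b) and b[d] == symbol}
    return closure(moved, grammar, first)


FIRST = first_sets(GRAMMAR)
I0 = closure({("S'", ("S",), 0, "$")}, GRAMMAR, FIRST)
I2 = goto(I0, "C", GRAMMAR, FIRST)
```

The full collection of item sets is then obtained by starting from `I0` and applying `goto` for every grammar symbol until no new set appears; with $\epsilon$-productions the lookahead computation would need FIRST of the whole string $\beta a$ instead.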
