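Exploring the World of V8 (2) - Lexical and Syntax Analysis
The previous installment gave an overview of the API. Starting with this one, we follow the API down into the implementation.
For an interpreter or a compiler, the first thing we care about is naturally the compilation process.
As we saw last time, the API entry point for compilation is Script::Compile: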
// Compile the source code.
Local<Script> script = Script::Compile(context, source).ToLocalChecked();
Script::Compile
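Most of the API implementation lives in src/api.cc, and Script::Compile is no exception.
If a ScriptOrigin is passed in, it is used together with the source String to construct the ScriptCompiler::Source object; otherwise the Source is built from the String alone.
Either way, the call ends up in ScriptCompiler::Compile, which does the compilation.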
MaybeLocal<Script> Script::Compile(Local<Context> context, Local<String> source,
ScriptOrigin* origin) {
if (origin) {
ScriptCompiler::Source script_source(source, *origin);
return ScriptCompiler::Compile(context, &script_source);
}
ScriptCompiler::Source script_source(source);
return ScriptCompiler::Compile(context, &script_source);
}
ScriptCompiler::Compile
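ScriptCompiler::Compile is also in api.cc.
As mentioned before, a compiled script that is not yet bound to a Context is called an UnboundScript. ScriptCompiler::Compile first calls CompileUnboundInternal to produce an UnboundScript, then calls BindToCurrentContext() on it to bind it to the context.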
MaybeLocal<Script> ScriptCompiler::Compile(Local<Context> context,
Source* source,
CompileOptions options) {
auto isolate = context->GetIsolate();
auto maybe = CompileUnboundInternal(isolate, source, options, false);
Local<UnboundScript> result;
if (!maybe.ToLocal(&result)) return MaybeLocal<Script>();
v8::Context::Scope scope(context);
return result->BindToCurrentContext();
}
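The two-phase flow described above (compile to a context-independent script, then bind it to a context) can be sketched in plain standard C++. This is only an illustration: `UnboundScriptSketch`, `BoundScriptSketch`, `CompileUnboundSketch` and `CompileSketch` are made-up names, `std::optional` stands in for `MaybeLocal`, and treating an empty source as a compile failure is an arbitrary rule chosen for the demo.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-ins; none of these are V8 types.
struct UnboundScriptSketch {
  std::string source;  // Compiled form, independent of any context.
};

struct BoundScriptSketch {
  std::string source;
  std::string context_name;  // The context this script was bound to.
};

// Phase 1: compile without a context. An empty optional plays the role
// of an empty MaybeLocal (compilation failed).
std::optional<UnboundScriptSketch> CompileUnboundSketch(
    const std::string& source) {
  if (source.empty()) return std::nullopt;  // Demo-only failure rule.
  return UnboundScriptSketch{source};
}

// Phase 2: bind the unbound result to a context, mirroring the
// "compile unbound, then bind" shape of ScriptCompiler::Compile.
std::optional<BoundScriptSketch> CompileSketch(const std::string& context_name,
                                               const std::string& source) {
  assert(!context_name.empty());  // Binding needs a context.
  std::optional<UnboundScriptSketch> maybe = CompileUnboundSketch(source);
  if (!maybe) return std::nullopt;  // Propagate the failure to the caller.
  return BoundScriptSketch{maybe->source, context_name};
}
```

The caller only has to check one optional, just as the API user only has to check one MaybeLocal.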
ScriptCompiler::CompileUnboundInternal
The key step here is the call to i::Compiler::CompileScript, where i is the alias for the v8::internal namespace.
MaybeLocal<UnboundScript> ScriptCompiler::CompileUnboundInternal(
Isolate* v8_isolate, Source* source, CompileOptions options,
bool is_module) {
...
result = i::Compiler::CompileScript(
str, name_obj, line_offset, column_offset, source->resource_options,
source_map_url, isolate->native_context(), NULL, &script_data, options,
i::NOT_NATIVES_CODE, is_module);
has_pending_exception = result.is_null();
if (has_pending_exception && script_data != NULL) {
// This case won't happen during normal operation; we have compiled
// successfully and produced cached data, and but the second compilation
// of the same source code fails.
delete script_data;
script_data = NULL;
}
RETURN_ON_FAILED_EXECUTION(UnboundScript);
if ((options == kProduceParserCache || options == kProduceCodeCache) &&
script_data != NULL) {
// script_data now contains the data that was generated. source will
// take the ownership.
source->cached_data = new CachedData(
script_data->data(), script_data->length(), CachedData::BufferOwned);
script_data->ReleaseDataOwnership();
} else if (options == kConsumeParserCache || options == kConsumeCodeCache) {
source->cached_data->rejected = script_data->rejected();
}
delete script_data;
}
RETURN_ESCAPED(ToApiHandle<UnboundScript>(result));
}
Compiler::CompileScript
This function is defined in src/compiler.cc. Skipping the details for now, compilation goes on to call CompileToplevel.
Handle<SharedFunctionInfo> Compiler::CompileScript(
Handle<String> source, Handle<Object> script_name, int line_offset,
int column_offset, ScriptOriginOptions resource_options,
Handle<Object> source_map_url, Handle<Context> context,
v8::Extension* extension, ScriptData** cached_data,
ScriptCompiler::CompileOptions compile_options, NativesFlag natives,
bool is_module) {
...
static_cast<LanguageMode>(info.language_mode() | language_mode));
result = CompileToplevel(&info);
if (extension == NULL && !result.is_null()) {
compilation_cache->PutScript(source, context, language_mode, result);
if (FLAG_serialize_toplevel &&
compile_options == ScriptCompiler::kProduceCodeCache) {
HistogramTimerScope histogram_timer(
isolate->counters()->compile_serialize());
*cached_data = CodeSerializer::Serialize(isolate, result, source);
if (FLAG_profile_deserialization) {
PrintF("[Compiling and serializing took %0.3f ms]\n",
timer.Elapsed().InMillisecondsF());
}
}
}
...
CompileToplevel
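This is a static function defined in compiler.cc.
It goes through two main steps:
- Parser::ParseStatic - lexical and syntax analysis, producing the abstract syntax tree
- CompileBaselineCode - code generation
Let's look at the first half: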
static Handle<SharedFunctionInfo> CompileToplevel(CompilationInfo* info) {
Isolate* isolate = info->isolate();
PostponeInterruptsScope postpone(isolate);
DCHECK(!isolate->native_context().is_null());
ParseInfo* parse_info = info->parse_info();
Handle<Script> script = parse_info->script();
...
isolate->debug()->OnBeforeCompile(script);
...
Handle<SharedFunctionInfo> result;
{ VMState<COMPILER> state(info->isolate());
if (parse_info->literal() == NULL) {
// Parse the script if needed (if it's already parsed, literal() is
// non-NULL). If compiling for debugging, we may eagerly compile inner
// functions, so do not parse lazily in that case.
ScriptCompiler::CompileOptions options = parse_info->compile_options();
bool parse_allow_lazy = (options == ScriptCompiler::kConsumeParserCache ||
String::cast(script->source())->length() >
FLAG_min_preparse_length) &&
!info->is_debug();
parse_info->set_allow_lazy_parsing(parse_allow_lazy);
if (!parse_allow_lazy &&
(options == ScriptCompiler::kProduceParserCache ||
options == ScriptCompiler::kConsumeParserCache)) {
// We are going to parse eagerly, but we either 1) have cached data
// produced by lazy parsing or 2) are asked to generate cached data.
// Eager parsing cannot benefit from cached data, and producing cached
// data while parsing eagerly is not implemented.
parse_info->set_cached_data(nullptr);
parse_info->set_compile_options(ScriptCompiler::kNoCompileOptions);
}
if (!Parser::ParseStatic(parse_info)) {
return Handle<SharedFunctionInfo>::null();
}
}
The second half is the compilation proper, which calls CompileBaselineCode:
DCHECK(!info->is_debug() || !parse_info->allow_lazy_parsing());
info->MarkAsFirstCompile();
FunctionLiteral* lit = parse_info->literal();
LiveEditFunctionTracker live_edit_tracker(isolate, lit);
...
// Compile the code.
if (!CompileBaselineCode(info)) {
return Handle<SharedFunctionInfo>::null();
}
...
return result;
}
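The lazy-parsing decision in the first half of CompileToplevel is easier to follow when pulled out into a pure function. The sketch below only restates the boolean logic of the quoted code; `CompileOptionsSketch`, `LazyDecision` and `DecideLazyParsing` are hypothetical names, and the threshold is passed in as a parameter instead of being read from `FLAG_min_preparse_length`.

```cpp
#include <cassert>

// Hypothetical mirror of the options that matter for this decision.
enum class CompileOptionsSketch {
  kNoCompileOptions,
  kProduceParserCache,
  kConsumeParserCache,
};

struct LazyDecision {
  bool allow_lazy;
  CompileOptionsSketch effective_options;
};

// Lazy parsing is allowed when (the parser cache is consumed OR the
// source is longer than the threshold) AND we are not compiling for
// debugging.
LazyDecision DecideLazyParsing(CompileOptionsSketch options, int source_length,
                               int min_preparse_length, bool is_debug) {
  assert(source_length >= 0);
  bool allow_lazy = (options == CompileOptionsSketch::kConsumeParserCache ||
                     source_length > min_preparse_length) &&
                    !is_debug;
  // Eager parsing neither benefits from nor produces parser cache data,
  // so the cache-related options are dropped in that case.
  if (!allow_lazy && (options == CompileOptionsSketch::kProduceParserCache ||
                      options == CompileOptionsSketch::kConsumeParserCache)) {
    options = CompileOptionsSketch::kNoCompileOptions;
  }
  return {allow_lazy, options};
}
```

Note that a short script is still parsed lazily when cached parser data is being consumed, unless it is compiled for debugging.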
Parser::ParseStatic
Now we enter the world of the Parser. The entry point is the static function Parser::ParseStatic, defined in src/parsing/parser.cc:
ParseStatic constructs a Parser object and then calls its Parse function to do the parsing.
bool Parser::ParseStatic(ParseInfo* info) {
Parser parser(info);
if (parser.Parse(info)) {
info->set_language_mode(info->literal()->language_mode());
return true;
}
return false;
}
Parser::Parse
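Parse starts parsing the script source. There are two paths:
- ParseLazy: taken when the ParseInfo is lazy and its SharedFunctionInfo is a function
- ParseProgram: taken in all other cases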
bool Parser::Parse(ParseInfo* info) {
DCHECK(info->literal() == NULL);
FunctionLiteral* result = NULL;
// Ok to use Isolate here; this function is only called in the main thread.
DCHECK(parsing_on_main_thread_);
Isolate* isolate = info->isolate();
pre_parse_timer_ = isolate->counters()->pre_parse();
if (FLAG_trace_parse || allow_natives() || extension_ != NULL) {
// If intrinsics are allowed, the Parser cannot operate independent of the
// V8 heap because of Runtime. Tell the string table to internalize strings
// and values right after they're created.
ast_value_factory()->Internalize(isolate);
}
if (info->is_lazy()) {
DCHECK(!info->is_eval());
if (info->shared_info()->is_function()) {
result = ParseLazy(isolate, info);
} else {
result = ParseProgram(isolate, info);
}
} else {
SetCachedData(info);
result = ParseProgram(isolate, info);
}
info->set_literal(result);
Internalize(isolate, info->script(), result == NULL);
DCHECK(ast_value_factory()->IsInternalized());
return (result != NULL);
}
Parser::ParseProgram
Let's look at Parser::ParseProgram first. The real work is delegated to Parser::DoParseProgram.
FunctionLiteral* Parser::ParseProgram(Isolate* isolate, ParseInfo* info) {
// TODO(bmeurer): We temporarily need to pass allow_nesting = true here,
// see comment for HistogramTimerScope class.
// It's OK to use the Isolate & counters here, since this function is only
// called in the main thread.
DCHECK(parsing_on_main_thread_);
HistogramTimerScope timer_scope(isolate->counters()->parse(), true);
Handle<String> source(String::cast(info->script()->source()));
isolate->counters()->total_parse_size()->Increment(source->length());
base::ElapsedTimer timer;
if (FLAG_trace_parse) {
timer.Start();
}
fni_ = new (zone()) FuncNameInferrer(ast_value_factory(), zone());
// Initialize parser state.
CompleteParserRecorder recorder;
if (produce_cached_parse_data()) {
log_ = &recorder;
} else if (consume_cached_parse_data()) {
cached_parse_data_->Initialize();
}
source = String::Flatten(source);
FunctionLiteral* result;
if (source->IsExternalTwoByteString()) {
// Notice that the stream is destroyed at the end of the branch block.
// The last line of the blocks can't be moved outside, even though they're
// identical calls.
ExternalTwoByteStringUtf16CharacterStream stream(
Handle<ExternalTwoByteString>::cast(source), 0, source->length());
scanner_.Initialize(&stream);
result = DoParseProgram(info);
} else {
GenericStringUtf16CharacterStream stream(source, 0, source->length());
scanner_.Initialize(&stream);
result = DoParseProgram(info);
}
if (result != NULL) {
DCHECK_EQ(scanner_.peek_location().beg_pos, source->length());
}
HandleSourceURLComments(isolate, info->script());
if (FLAG_trace_parse && result != NULL) {
double ms = timer.Elapsed().InMillisecondsF();
if (info->is_eval()) {
PrintF("[parsing eval");
} else if (info->script()->name()->IsString()) {
String* name = String::cast(info->script()->name());
base::SmartArrayPointer<char> name_chars = name->ToCString();
PrintF("[parsing script: %s", name_chars.get());
} else {
PrintF("[parsing script");
}
PrintF(" - took %0.3f ms]\n", ms);
}
if (produce_cached_parse_data()) {
if (result != NULL) *info->cached_data() = recorder.GetScriptData();
log_ = NULL;
}
return result;
}
Parser::DoParseProgram
There is a lot of code below, but for now we only need to focus on two functions:
if (info->is_module()) {
ParseModuleItemList(body, &ok);
} else {
ParseStatementList(body, Token::EOS, &ok);
}
- ParseModuleItemList: parses the list of module items introduced in ES6
- ParseStatementList: parses ordinary statements
FunctionLiteral* Parser::DoParseProgram(ParseInfo* info) {
...
Mode parsing_mode = FLAG_lazy && allow_lazy() ? PARSE_LAZILY : PARSE_EAGERLY;
if (allow_natives() || extension_ != NULL) parsing_mode = PARSE_EAGERLY;
FunctionLiteral* result = NULL;
{
// TODO(wingo): Add an outer SCRIPT_SCOPE corresponding to the native
// context, which will have the "this" binding for script scopes.
Scope* scope = NewScope(scope_, SCRIPT_SCOPE);
info->set_script_scope(scope);
if (!info->context().is_null() && !info->context()->IsNativeContext()) {
scope = Scope::DeserializeScopeChain(info->isolate(), zone(),
*info->context(), scope);
// The Scope is backed up by ScopeInfo (which is in the V8 heap); this
// means the Parser cannot operate independent of the V8 heap. Tell the
// string table to internalize strings and values right after they're
// created. This kind of parsing can only be done in the main thread.
DCHECK(parsing_on_main_thread_);
ast_value_factory()->Internalize(info->isolate());
}
original_scope_ = scope;
if (info->is_eval()) {
if (!scope->is_script_scope() || is_strict(info->language_mode())) {
parsing_mode = PARSE_EAGERLY;
}
scope = NewScope(scope, EVAL_SCOPE);
} else if (info->is_module()) {
scope = NewScope(scope, MODULE_SCOPE);
}
scope->set_start_position(0);
// Enter 'scope' with the given parsing mode.
ParsingModeScope parsing_mode_scope(this, parsing_mode);
AstNodeFactory function_factory(ast_value_factory());
FunctionState function_state(&function_state_, &scope_, scope,
kNormalFunction, &function_factory);
// Don't count the mode in the use counters--give the program a chance
// to enable script/module-wide strict/strong mode below.
scope_->SetLanguageMode(info->language_mode());
ZoneList<Statement*>* body = new(zone()) ZoneList<Statement*>(16, zone());
bool ok = true;
int beg_pos = scanner()->location().beg_pos;
if (info->is_module()) {
ParseModuleItemList(body, &ok);
} else {
ParseStatementList(body, Token::EOS, &ok);
}
// The parser will peek but not consume EOS. Our scope logically goes all
// the way to the EOS, though.
scope->set_end_position(scanner()->peek_location().beg_pos);
if (ok && is_strict(language_mode())) {
CheckStrictOctalLiteral(beg_pos, scanner()->location().end_pos, &ok);
}
if (ok && is_sloppy(language_mode()) && allow_harmony_sloppy_function()) {
// TODO(littledan): Function bindings on the global object that modify
// pre-existing bindings should be made writable, enumerable and
// nonconfigurable if possible, whereas this code will leave attributes
// unchanged if the property already exists.
InsertSloppyBlockFunctionVarBindings(scope, &ok);
}
if (ok && (is_strict(language_mode()) || allow_harmony_sloppy() ||
allow_harmony_destructuring_bind())) {
CheckConflictingVarDeclarations(scope_, &ok);
}
if (ok && info->parse_restriction() == ONLY_SINGLE_FUNCTION_LITERAL) {
if (body->length() != 1 ||
!body->at(0)->IsExpressionStatement() ||
!body->at(0)->AsExpressionStatement()->
expression()->IsFunctionLiteral()) {
ReportMessage(MessageTemplate::kSingleFunctionLiteral);
ok = false;
}
}
if (ok) {
ParserTraits::RewriteDestructuringAssignments();
result = factory()->NewFunctionLiteral(
ast_value_factory()->empty_string(), scope_, body,
function_state.materialized_literal_count(),
function_state.expected_property_count(), 0,
FunctionLiteral::kNoDuplicateParameters,
FunctionLiteral::kGlobalOrEval, FunctionLiteral::kShouldLazyCompile,
FunctionKind::kNormalFunction, 0);
}
}
...
return result;
}
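The scope setup in DoParseProgram boils down to "always create a script scope, then wrap it in an eval or module scope if needed". Here is a small sketch of that chain. All names (`ScopeSketch`, `ScopeArena`, `BuildToplevelScope`, `ScopeDepth`) are invented for illustration, and the sketch ignores the DeserializeScopeChain branch of the real code.

```cpp
#include <cassert>
#include <memory>
#include <vector>

enum class ScopeTypeSketch { kScript, kEval, kModule };

// Hypothetical scope node: each scope only knows its outer scope.
struct ScopeSketch {
  ScopeSketch* outer;
  ScopeTypeSketch type;
};

// Owns every scope it creates, so raw pointers stay valid while the
// arena is alive.
class ScopeArena {
 public:
  ScopeSketch* NewScope(ScopeSketch* outer, ScopeTypeSketch type) {
    scopes_.push_back(std::make_unique<ScopeSketch>(ScopeSketch{outer, type}));
    return scopes_.back().get();
  }

 private:
  std::vector<std::unique_ptr<ScopeSketch>> scopes_;
};

// Same order as the quoted code: script scope first, then an optional
// eval or module scope nested inside it.
ScopeSketch* BuildToplevelScope(ScopeArena* arena, bool is_eval,
                                bool is_module) {
  assert(arena != nullptr);
  ScopeSketch* scope = arena->NewScope(nullptr, ScopeTypeSketch::kScript);
  if (is_eval) {
    scope = arena->NewScope(scope, ScopeTypeSketch::kEval);
  } else if (is_module) {
    scope = arena->NewScope(scope, ScopeTypeSketch::kModule);
  }
  return scope;
}

// Counts how many scopes sit on the chain from `scope` outward.
int ScopeDepth(const ScopeSketch* scope) {
  int depth = 0;
  for (; scope != nullptr; scope = scope->outer) ++depth;
  return depth;
}
```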
Parser::ParseModuleItemList
Modules are a new feature introduced in ES6. For each module item, ParseModuleItem is called to parse it.
void* Parser::ParseModuleItemList(ZoneList<Statement*>* body, bool* ok) {
// (Ecma 262 6th Edition, 15.2):
// Module :
// ModuleBody?
//
// ModuleBody :
// ModuleItem*
DCHECK(scope_->is_module_scope());
RaiseLanguageMode(STRICT);
while (peek() != Token::EOS) {
Statement* stat = ParseModuleItem(CHECK_OK);
if (stat && !stat->IsEmpty()) {
body->Add(stat, zone());
}
}
// Check that all exports are bound.
ModuleDescriptor* descriptor = scope_->module();
for (ModuleDescriptor::Iterator it = descriptor->iterator(); !it.done();
it.Advance()) {
if (scope_->LookupLocal(it.local_name()) == NULL) {
// TODO(adamk): Pass both local_name and export_name once ParserTraits
// supports multiple arg error messages.
// Also try to report this at a better location.
ParserTraits::ReportMessage(MessageTemplate::kModuleExportUndefined,
it.local_name());
*ok = false;
return NULL;
}
}
scope_->module()->Freeze();
return NULL;
}
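The quoted parser functions pass CHECK_OK as the last argument of many calls. As far as I know, in the parser.cc of this period it is a macro that expands to `ok` followed by an early return when `*ok` is false. The sketch below reproduces that idea in isolation. `ParseLeaf` and `ParseTwo` are hypothetical helpers invented for the demo, and the exact macro text should be checked against the V8 version you are reading.

```cpp
#include <cassert>
#include <cstddef>

// The macro closes the call, tests *ok, and leaves an open "((void)0"
// that the caller's own ")" and ";" then complete.
#define CHECK_OK ok);    \
  if (!*ok) return NULL; \
  ((void)0

// Hypothetical leaf parser: fails on negative input.
int* ParseLeaf(int value, bool* ok) {
  static int storage;
  assert(ok != NULL);
  if (value < 0) {
    *ok = false;
    return NULL;
  }
  storage = value;
  return &storage;
}

// Each call below expands to: call, then "if (!*ok) return NULL;".
int* ParseTwo(int a, int b, bool* ok) {
  ParseLeaf(a, CHECK_OK);
  int* second = ParseLeaf(b, CHECK_OK);
  return second;
}
```

This is why the parser code reads as straight-line code even though every step may fail: the error check is hidden in the argument list.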
Parser::ParseModuleItem
Depending on whether the next token is import, export, or anything else, it calls ParseImportDeclaration, ParseExportDeclaration, or ParseStatementListItem.
Statement* Parser::ParseModuleItem(bool* ok) {
// (Ecma 262 6th Edition, 15.2):
// ModuleItem :
// ImportDeclaration
// ExportDeclaration
// StatementListItem
switch (peek()) {
case Token::IMPORT:
return ParseImportDeclaration(ok);
case Token::EXPORT:
return ParseExportDeclaration(ok);
default:
return ParseStatementListItem(ok);
}
}
Parser::ParseImportDeclaration
Statement* Parser::ParseImportDeclaration(bool* ok) {
// ImportDeclaration :
// 'import' ImportClause 'from' ModuleSpecifier ';'
// 'import' ModuleSpecifier ';'
//
// ImportClause :
// NameSpaceImport
// NamedImports
// ImportedDefaultBinding
// ImportedDefaultBinding ',' NameSpaceImport
// ImportedDefaultBinding ',' NamedImports
//
// NameSpaceImport :
// '*' 'as' ImportedBinding
int pos = peek_position();
Expect(Token::IMPORT, CHECK_OK);
Token::Value tok = peek();
// 'import' ModuleSpecifier ';'
if (tok == Token::STRING) {
const AstRawString* module_specifier = ParseModuleSpecifier(CHECK_OK);
scope_->module()->AddModuleRequest(module_specifier, zone());
ExpectSemicolon(CHECK_OK);
return factory()->NewEmptyStatement(pos);
}
// Parse ImportedDefaultBinding if present.
ImportDeclaration* import_default_declaration = NULL;
if (tok != Token::MUL && tok != Token::LBRACE) {
const AstRawString* local_name =
ParseIdentifier(kDontAllowRestrictedIdentifiers, CHECK_OK);
VariableProxy* proxy = NewUnresolved(local_name, IMPORT);
import_default_declaration = factory()->NewImportDeclaration(
proxy, ast_value_factory()->default_string(), NULL, scope_, pos);
Declare(import_default_declaration, DeclarationDescriptor::NORMAL, true,
CHECK_OK);
}
const AstRawString* module_instance_binding = NULL;
ZoneList<ImportDeclaration*>* named_declarations = NULL;
if (import_default_declaration == NULL || Check(Token::COMMA)) {
switch (peek()) {
case Token::MUL: {
Consume(Token::MUL);
ExpectContextualKeyword(CStrVector("as"), CHECK_OK);
module_instance_binding =
ParseIdentifier(kDontAllowRestrictedIdentifiers, CHECK_OK);
// TODO(ES6): Add an appropriate declaration.
break;
}
case Token::LBRACE:
named_declarations = ParseNamedImports(pos, CHECK_OK);
break;
default:
*ok = false;
ReportUnexpectedToken(scanner()->current_token());
return NULL;
}
}
ExpectContextualKeyword(CStrVector("from"), CHECK_OK);
const AstRawString* module_specifier = ParseModuleSpecifier(CHECK_OK);
scope_->module()->AddModuleRequest(module_specifier, zone());
if (module_instance_binding != NULL) {
// TODO(ES6): Set the module specifier for the module namespace binding.
}
if (import_default_declaration != NULL) {
import_default_declaration->set_module_specifier(module_specifier);
}
if (named_declarations != NULL) {
for (int i = 0; i < named_declarations->length(); ++i) {
named_declarations->at(i)->set_module_specifier(module_specifier);
}
}
ExpectSemicolon(CHECK_OK);
return factory()->NewEmptyStatement(pos);
}
Parser::ParseStatementList
At last we get to the lexical and syntax analysis of statements. It calls ParseStatementListItem for each statement; some details toward the end are skipped for now:
void* Parser::ParseStatementList(ZoneList<Statement*>* body, int end_token,
bool* ok) {
// StatementList ::
// (StatementListItem)* <end_token>
// Allocate a target stack to use for this set of source
// elements. This way, all scripts and functions get their own
// target stack thus avoiding illegal breaks and continues across
// functions.
TargetScope scope(&this->target_stack_);
DCHECK(body != NULL);
bool directive_prologue = true; // Parsing directive prologue.
while (peek() != end_token) {
if (directive_prologue && peek() != Token::STRING) {
directive_prologue = false;
}
Scanner::Location token_loc = scanner()->peek_location();
Scanner::Location old_this_loc = function_state_->this_location();
Scanner::Location old_super_loc = function_state_->super_location();
Statement* stat = ParseStatementListItem(CHECK_OK);
if (is_strong(language_mode()) && scope_->is_function_scope() &&
IsClassConstructor(function_state_->kind())) {
Scanner::Location this_loc = function_state_->this_location();
Scanner::Location super_loc = function_state_->super_location();
if (this_loc.beg_pos != old_this_loc.beg_pos &&
this_loc.beg_pos != token_loc.beg_pos) {
ReportMessageAt(this_loc, MessageTemplate::kStrongConstructorThis);
*ok = false;
return nullptr;
}
if (super_loc.beg_pos != old_super_loc.beg_pos &&
super_loc.beg_pos != token_loc.beg_pos) {
ReportMessageAt(super_loc, MessageTemplate::kStrongConstructorSuper);
*ok = false;
return nullptr;
}
}
if (stat == NULL || stat->IsEmpty()) {
directive_prologue = false; // End of directive prologue.
continue;
}
...
body->Add(stat, zone());
}
return 0;
}
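The `directive_prologue` flag in the loop above tracks whether we are still in the leading run of string-literal statements such as "use strict". The part of the loop that acts on a directive is elided in the quoted code, so the strict-mode check below is my assumption of what happens there, not a copy of it. `StatementSketch` and `StartsInStrictMode` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, heavily simplified statement: we only record whether it
// is a bare string literal and what its text is.
struct StatementSketch {
  bool is_string_literal;
  std::string text;
};

// Returns true if a "use strict" directive appears inside the directive
// prologue, i.e. before the first statement that is not a string literal.
bool StartsInStrictMode(const std::vector<StatementSketch>& body) {
  bool directive_prologue = true;
  for (const StatementSketch& stat : body) {
    // Same test as in the quoted loop: a non-string token ends the
    // prologue.
    if (directive_prologue && !stat.is_string_literal) {
      directive_prologue = false;
    }
    if (!directive_prologue) break;  // Later directives have no effect.
    if (stat.text == "use strict") return true;
  }
  return false;
}
```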
Parser::ParseStatement
It handles only the empty statement itself and leaves everything else to ParseSubStatement.
Statement* Parser::ParseStatement(ZoneList<const AstRawString*>* labels,
                                  bool* ok) {
  // Statement ::
  //   EmptyStatement
  //   ...
  if (peek() == Token::SEMICOLON) {
    Next();
    return factory()->NewEmptyStatement(RelocInfo::kNoPosition);
  }
  return ParseSubStatement(labels, ok);
}
AstNodeFactory::NewEmptyStatement
The output of syntax analysis is an AST. AstNodeFactory is the factory class whose helper functions create the AST nodes.
Let's look at its definition first:
// ----------------------------------------------------------------------------
// AstNode factory
class AstNodeFactory final BASE_EMBEDDED {
 public:
  explicit AstNodeFactory(AstValueFactory* ast_value_factory)
      : local_zone_(ast_value_factory->zone()),
        parser_zone_(ast_value_factory->zone()),
        ast_value_factory_(ast_value_factory) {}
  AstValueFactory* ast_value_factory() const { return ast_value_factory_; }
  VariableDeclaration* NewVariableDeclaration(
      VariableProxy* proxy, VariableMode mode, Scope* scope, int pos,
      bool is_class_declaration = false, int declaration_group_start = -1) {
    return new (parser_zone_)
        VariableDeclaration(parser_zone_, proxy, mode, scope, pos,
                            is_class_declaration, declaration_group_start);
  }
Let's start with the simplest example, NewEmptyStatement:
EmptyStatement* NewEmptyStatement(int pos) {
return new (local_zone_) EmptyStatement(local_zone_, pos);
}
The concrete AST classes are defined in src/ast/ast.h:
class EmptyStatement final : public Statement {
public:
DECLARE_NODE_TYPE(EmptyStatement)
protected:
explicit EmptyStatement(Zone* zone, int pos): Statement(zone, pos) {}
};
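The `new (local_zone_) EmptyStatement(...)` expression above is a placement form of new: the node is carved out of a zone instead of the general heap, and everything in the zone is released together. Below is a minimal sketch of that pattern, assuming a simple bump allocator. `ZoneSketch`, `EmptyStatementSketch` and `NodeFactorySketch` are hypothetical names, and the real Zone class is more elaborate than this.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical bump allocator standing in for a zone. Memory is released
// all at once when the zone is destroyed; objects are never freed one by
// one and their destructors are not run.
class ZoneSketch {
 public:
  void* New(std::size_t size) {
    // Round up so every allocation keeps maximal fundamental alignment.
    const std::size_t align = alignof(std::max_align_t);
    size = (size + align - 1) / align * align;
    assert(size <= kChunkSize);
    if (chunks_.empty() || used_ + size > kChunkSize) {
      chunks_.push_back(std::make_unique<unsigned char[]>(kChunkSize));
      used_ = 0;
    }
    void* result = chunks_.back().get() + used_;
    used_ += size;
    ++allocation_count_;
    return result;
  }
  int allocation_count() const { return allocation_count_; }

 private:
  static constexpr std::size_t kChunkSize = 4096;
  std::vector<std::unique_ptr<unsigned char[]>> chunks_;
  std::size_t used_ = 0;
  int allocation_count_ = 0;
};

// Placement form that makes "new (zone) T(...)" allocate from the zone.
void* operator new(std::size_t size, ZoneSketch* zone) {
  return zone->New(size);
}
// Matching placement delete; only called if a constructor throws.
void operator delete(void*, ZoneSketch*) noexcept {}

struct EmptyStatementSketch {
  explicit EmptyStatementSketch(int pos) : position(pos) {}
  int position;
};

// Factory in the style of AstNodeFactory: it hides the zone from callers.
class NodeFactorySketch {
 public:
  explicit NodeFactorySketch(ZoneSketch* zone) : zone_(zone) {}
  EmptyStatementSketch* NewEmptyStatement(int pos) {
    return new (zone_) EmptyStatementSketch(pos);
  }

 private:
  ZoneSketch* zone_;
};
```

The appeal for a parser is that an AST consists of many small nodes with the same lifetime, so freeing them in one go is both simpler and cheaper than tracking each node.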
Parser::ParseSubStatement
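Each kind of statement has its own Parse function. We pick three of them to look at more closely:
- Block: ParseBlock
- if statement: ParseIfStatement
- do-while loop: ParseDoWhileStatement
For the rest, here is the full implementation of ParseSubStatement. The code is short and clear enough to speak for itself, so we won't explain it further: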
Statement* Parser::ParseSubStatement(ZoneList<const AstRawString*>* labels,
bool* ok) {
// Statement ::
// Block
// VariableStatement
// EmptyStatement
// ExpressionStatement
// IfStatement
// IterationStatement
// ContinueStatement
// BreakStatement
// ReturnStatement
// WithStatement
// LabelledStatement
// SwitchStatement
// ThrowStatement
// TryStatement
// DebuggerStatement
// Note: Since labels can only be used by 'break' and 'continue'
// statements, which themselves are only valid within blocks,
// iterations or 'switch' statements (i.e., BreakableStatements),
// labels can be simply ignored in all other cases; except for
// trivial labeled break statements 'label: break label' which is
// parsed into an empty statement.
switch (peek()) {
case Token::LBRACE:
return ParseBlock(labels, ok);
case Token::SEMICOLON:
if (is_strong(language_mode())) {
ReportMessageAt(scanner()->peek_location(),
MessageTemplate::kStrongEmpty);
*ok = false;
return NULL;
}
Next();
return factory()->NewEmptyStatement(RelocInfo::kNoPosition);
case Token::IF:
return ParseIfStatement(labels, ok);
case Token::DO:
return ParseDoWhileStatement(labels, ok);
case Token::WHILE:
return ParseWhileStatement(labels, ok);
case Token::FOR:
return ParseForStatement(labels, ok);
case Token::CONTINUE:
case Token::BREAK:
case Token::RETURN:
case Token::THROW:
case Token::TRY: {
// These statements must have their labels preserved in an enclosing
// block
if (labels == NULL) {
return ParseStatementAsUnlabelled(labels, ok);
} else {
Block* result =
factory()->NewBlock(labels, 1, false, RelocInfo::kNoPosition);
Target target(&this->target_stack_, result);
Statement* statement = ParseStatementAsUnlabelled(labels, CHECK_OK);
if (result) result->statements()->Add(statement, zone());
return result;
}
}
case Token::WITH:
return ParseWithStatement(labels, ok);
case Token::SWITCH:
return ParseSwitchStatement(labels, ok);
case Token::FUNCTION: {
// FunctionDeclaration is only allowed in the context of SourceElements
// (Ecma 262 5th Edition, clause 14):
// SourceElement:
// Statement
// FunctionDeclaration
// Common language extension is to allow function declaration in place
// of any statement. This language extension is disabled in strict mode.
//
// In Harmony mode, this case also handles the extension:
// Statement:
// GeneratorDeclaration
if (is_strict(language_mode())) {
ReportMessageAt(scanner()->peek_location(),
MessageTemplate::kStrictFunction);
*ok = false;
return NULL;
}
return ParseFunctionDeclaration(NULL, ok);
}
case Token::DEBUGGER:
return ParseDebuggerStatement(ok);
case Token::VAR:
return ParseVariableStatement(kStatement, NULL, ok);
case Token::CONST:
// In ES6 CONST is not allowed as a Statement, only as a
// LexicalDeclaration, however we continue to allow it in sloppy mode for
// backwards compatibility.
if (is_sloppy(language_mode()) && allow_legacy_const()) {
return ParseVariableStatement(kStatement, NULL, ok);
}
// Fall through.
default:
return ParseExpressionOrLabelledStatement(labels, ok);
}
}
Building a block - Parser::ParseBlock
Block* Parser::ParseBlock(ZoneList<const AstRawString*>* labels,
bool finalize_block_scope, bool* ok) {
// The harmony mode uses block elements instead of statements.
//
// Block ::
// '{' StatementList '}'
Next, AstNodeFactory's NewBlock is called to create an AST node of class Block, and the left brace is consumed:
// Construct block expecting 16 statements.
Block* body =
factory()->NewBlock(labels, 16, false, RelocInfo::kNoPosition);
Scope* block_scope = NewScope(scope_, BLOCK_SCOPE);
// Parse the statements and collect escaping labels.
Expect(Token::LBRACE, CHECK_OK);
block_scope->set_start_position(scanner()->location().beg_pos);
{ BlockState block_state(&scope_, block_scope);
Target target(&this->target_stack_, body);
Then, until the right brace is reached, it keeps parsing statement list items; nested blocks are handled by recursion:
while (peek() != Token::RBRACE) {
Statement* stat = ParseStatementListItem(CHECK_OK);
if (stat && !stat->IsEmpty()) {
body->statements()->Add(stat, zone());
}
}
}
Expect(Token::RBRACE, CHECK_OK);
block_scope->set_end_position(scanner()->location().end_pos);
if (finalize_block_scope) {
block_scope = block_scope->FinalizeBlockScope();
}
body->set_scope(block_scope);
return body;
}
Below is the implementation of AstNodeFactory::NewBlock in src/ast/ast.h:
Block* NewBlock(ZoneList<const AstRawString*>* labels, int capacity,
bool ignore_completion_value, int pos) {
return new (local_zone_)
Block(local_zone_, labels, capacity, ignore_completion_value, pos);
}
Block is a BreakableStatement:
class Block final : public BreakableStatement {
public:
DECLARE_NODE_TYPE(Block)
ZoneList<Statement*>* statements() { return &statements_; }
bool ignore_completion_value() const { return ignore_completion_value_; }
static int num_ids() { return parent_num_ids() + 1; }
BailoutId DeclsId() const { return BailoutId(local_id(0)); }
bool IsJump() const override {
return !statements_.is_empty() && statements_.last()->IsJump()
&& labels() == NULL; // Good enough as an approximation...
}
void MarkTail() override {
if (!statements_.is_empty()) statements_.last()->MarkTail();
}
Scope* scope() const { return scope_; }
void set_scope(Scope* scope) { scope_ = scope; }
protected:
Block(Zone* zone, ZoneList<const AstRawString*>* labels, int capacity,
bool ignore_completion_value, int pos)
: BreakableStatement(zone, labels, TARGET_FOR_NAMED_ONLY, pos),
statements_(capacity, zone),
ignore_completion_value_(ignore_completion_value),
scope_(NULL) {}
static int parent_num_ids() { return BreakableStatement::num_ids(); }
private:
int local_id(int n) const { return base_id() + parent_num_ids() + n; }
ZoneList<Statement*> statements_;
bool ignore_completion_value_;
Scope* scope_;
};
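To see the recursion in ParseBlock without the rest of the parser, here is a toy version. It assumes a made-up input language in which '{' and '}' delimit blocks and every other character is one statement. `NodeSketch` and `BlockParserSketch` are hypothetical; only the control flow follows the quoted function.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical AST node: a block owns its child statements.
struct NodeSketch {
  char label;  // '{' for a block, otherwise the statement's character.
  std::vector<std::unique_ptr<NodeSketch>> statements;
};

class BlockParserSketch {
 public:
  explicit BlockParserSketch(std::string input) : input_(std::move(input)) {}

  // Block :: '{' StatementList '}'
  std::unique_ptr<NodeSketch> ParseBlock(bool* ok) {
    assert(ok != nullptr);
    auto block = std::make_unique<NodeSketch>();
    block->label = '{';
    if (!Expect('{')) {
      *ok = false;
      return nullptr;
    }
    // Keep parsing statements until the closing brace (or end of input).
    while (Peek() != '}' && Peek() != '\0') {
      std::unique_ptr<NodeSketch> stat = ParseStatement(ok);
      if (!*ok) return nullptr;
      block->statements.push_back(std::move(stat));
    }
    if (!Expect('}')) {
      *ok = false;
      return nullptr;
    }
    return block;
  }

 private:
  std::unique_ptr<NodeSketch> ParseStatement(bool* ok) {
    if (Peek() == '{') return ParseBlock(ok);  // Nested block: recurse.
    auto stat = std::make_unique<NodeSketch>();
    stat->label = input_[pos_++];
    return stat;
  }
  // Returns '\0' at end of input, playing the role of Token::EOS.
  char Peek() const { return pos_ < input_.size() ? input_[pos_] : '\0'; }
  bool Expect(char c) {
    if (Peek() != c) return false;
    ++pos_;
    return true;
  }

  std::string input_;
  std::size_t pos_ = 0;
};
```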
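if statement - Parser::ParseIfStatement
The if statement is simpler than the Block above, except that it contains an expression, which is parsed by calling ParseExpression. After that, the then branch and the else branch are parsed; when there is no else, an empty statement is created in its place. Nothing fancy here.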
IfStatement* Parser::ParseIfStatement(ZoneList<const AstRawString*>* labels,
bool* ok) {
// IfStatement ::
// 'if' '(' Expression ')' Statement ('else' Statement)?
int pos = peek_position();
Expect(Token::IF, CHECK_OK);
Expect(Token::LPAREN, CHECK_OK);
Expression* condition = ParseExpression(true, CHECK_OK);
Expect(Token::RPAREN, CHECK_OK);
Statement* then_statement = ParseSubStatement(labels, CHECK_OK);
Statement* else_statement = NULL;
if (peek() == Token::ELSE) {
Next();
else_statement = ParseSubStatement(labels, CHECK_OK);
} else {
else_statement = factory()->NewEmptyStatement(RelocInfo::kNoPosition);
}
return factory()->NewIfStatement(
condition, then_statement, else_statement, pos);
}
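The one detail worth noticing in ParseIfStatement is that a missing else branch is filled with an empty statement, so later stages always see both branches. A small sketch of that shape follows, assuming pre-split tokens and single-token conditions and statements. `IfSketch`, `ParseIfSketch` and the "<empty>" marker are inventions for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct IfSketch {
  std::string condition;
  std::string then_statement;
  std::string else_statement;  // "<empty>" when no else branch exists.
};

// IfStatement :: 'if' '(' Expression ')' Statement ('else' Statement)?
// Example tokens: {"if", "(", "x", ")", "a", "else", "b"}
bool ParseIfSketch(const std::vector<std::string>& tokens, IfSketch* out) {
  assert(out != nullptr);
  std::size_t pos = 0;
  // An empty string plays the role of the end-of-source token.
  auto peek = [&]() -> std::string {
    return pos < tokens.size() ? tokens[pos] : std::string();
  };
  auto expect = [&](const std::string& token) {
    if (peek() != token) return false;
    ++pos;
    return true;
  };

  if (!expect("if") || !expect("(")) return false;
  if (peek().empty()) return false;
  out->condition = tokens[pos++];
  if (!expect(")")) return false;
  if (peek().empty()) return false;
  out->then_statement = tokens[pos++];
  if (peek() == "else") {
    ++pos;
    if (peek().empty()) return false;
    out->else_statement = tokens[pos++];
  } else {
    // No else branch: substitute an empty statement, as the real code does.
    out->else_statement = "<empty>";
  }
  return true;
}
```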
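do-while loop - Parser::ParseDoWhileStatement
It first calls AstNodeFactory's NewDoWhileStatement to create the AST node, then parses the statement between do and while, and finally parses the expression inside while. The trailing semicolon is optional.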
DoWhileStatement* Parser::ParseDoWhileStatement(
ZoneList<const AstRawString*>* labels, bool* ok) {
// DoStatement ::
// 'do' Statement 'while' '(' Expression ')' ';'
DoWhileStatement* loop =
factory()->NewDoWhileStatement(labels, peek_position());
Target target(&this->target_stack_, loop);
Expect(Token::DO, CHECK_OK);
Statement* body = ParseSubStatement(NULL, CHECK_OK);
Expect(Token::WHILE, CHECK_OK);
Expect(Token::LPAREN, CHECK_OK);
Expression* cond = ParseExpression(true, CHECK_OK);
Expect(Token::RPAREN, CHECK_OK);
// Allow do-statements to be terminated with and without
// semi-colons. This allows code such as 'do;while(0)return' to
// parse, which would not be the case if we had used the
// ExpectSemicolon() functionality here.
if (peek() == Token::SEMICOLON) Consume(Token::SEMICOLON);
if (loop != NULL) loop->Initialize(cond, body);
return loop;
}
Let's review the process above with a UML diagram: