Continuing to Investigate GC in the JScript Engine


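    Yesterday I found a bug that makes IE's JScript engine leak memory, along with the code that triggers it. Two readers, 问题男 and Laser.NET, then contributed a lot of meaningful discussion. ccBoy also offered quite a few suggestions, but he was more interested in the relative performance of innerHTML and appendChild and only touched on the memory leak in passing, as if it were no big deal.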

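    When I searched Google, nearly every article about JScript GC on Chinese forums and sites turned out to be a copy of the same unfinished piece taken from MSDN, reposted again and again without even a translation. Its Chinese title is "JS中关于对内存的释放问题[待续]" ("On Releasing Memory in JS [to be continued]"), and the original is the third question in MSDN's "WEB Q&A" column.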

    The following article, "How Do The Script Garbage Collectors Work?", explains in detail how the JScript engine's GC works and where its problems lie.

    JScript and VBScript both are automatic storage languages.  Unlike, say, C++, the script developer does not have to worry about explicitly allocating and freeing each chunk of memory used by the program.  The internal device in the engine which takes care of this task for the developer is called the garbage collector. 

    Interestingly enough though, JScript and VBScript have completely different garbage collectors.  Occasionally people ask me how the garbage collectors work and what the differences are.

    JScript uses a nongenerational mark-and-sweep garbage collector.  It works like this:

  • Every variable which is "in scope" is called a "scavenger".  A scavenger may refer to a number, an object, a string, whatever.  We maintain a list of scavengers -- variables are moved on to the scav list when they come into scope and off the scav list when they go out of scope.
  • Every now and then the garbage collector runs.   First it puts a "mark" on every object, variable, string, etc – all the memory tracked by the GC.  (JScript uses the VARIANT data structure internally and there are plenty of extra unused bits in that structure, so we just set one of them.)
  • Second, it clears the mark on the scavengers and the transitive closure of scavenger references.  So if a scavenger object references a nonscavenger object then we clear the bits on the nonscavenger, and on everything that it refers to.  (I am using the word "closure" in a different sense than in my earlier post.)
  • At this point we know that all the memory still marked is allocated memory which cannot be reached by any path from any in-scope variable.  All of those objects are instructed to tear themselves down, which destroys any circular references.
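
    The steps above can be sketched as a toy model. This is not the real engine: `collect`, `node`, `heap`, and `scavengers` are hypothetical names, and a `marked` property stands in for the spare VARIANT bit.

```javascript
// Toy model of the scheme described above; not the real JScript engine.
// `heap` lists every tracked object, `scavengers` lists the in-scope roots.
function collect(heap, scavengers) {
  // Step 1: put a mark on everything the collector tracks.
  for (const obj of heap) obj.marked = true;

  // Step 2: clear the mark on the scavengers and on everything
  // reachable from them (the transitive closure of references).
  const pending = [...scavengers];
  const seen = new Set();
  while (pending.length > 0) {
    const obj = pending.pop();
    if (seen.has(obj)) continue;    // already visited, avoids looping on cycles
    seen.add(obj);
    obj.marked = false;
    pending.push(...obj.refs);
  }

  // Step 3: whatever is still marked is unreachable, so tear it down.
  const survivors = [];
  const freed = [];
  for (const obj of heap) {
    if (obj.marked) {
      obj.refs = [];                // tearing down breaks any cycles
      freed.push(obj.name);
    } else {
      survivors.push(obj);
    }
  }
  return { survivors, freed };
}

function node(name) {
  return { name, refs: [], marked: false };
}

const root = node("root");
const child = node("child");
const a = node("a");
const b = node("b");
root.refs.push(child);   // reachable from a scavenger
a.refs.push(b);          // a and b reference each other,
b.refs.push(a);          // but nothing in scope reaches them

const result = collect([root, child, a, b], [root]);
console.log(result.freed);   // the unreachable cycle: a and b
```

    The cycle between `a` and `b` is collected because reachability is decided from the scavengers, not from whether anything still points at the object.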

    Actually it is a little more complex than that, as we must worry about details like "what if freeing an item causes a message loop to run, which handles an event, which calls back into the script, which runs code, which triggers another garbage collection?"  But those are just implementation details. (Incidentally, every JScript engine running on the same thread shares a GC, which complicates the story even further...)

    You'll note that I hand-waved a bit there when I said "every now and then..."  Actually what we do is keep track of the number of strings, objects and array slots allocated.  We check the current tallies at the beginning of each statement, and when the numbers exceed certain thresholds we trigger a collection.
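
    A rough sketch of that bookkeeping follows. The article gives no real threshold values, so the numbers, the `AllocationTally` class, and the choice to reset the tallies after a collection are all assumptions made for illustration.

```javascript
// Hypothetical thresholds; the article does not state the real ones.
const THRESHOLDS = { strings: 3, objects: 2, arraySlots: 8 };

class AllocationTally {
  constructor(thresholds) {
    this.thresholds = thresholds;
    this.counts = { strings: 0, objects: 0, arraySlots: 0 };
    this.collections = 0;
  }

  // Called by the (imaginary) allocator whenever something is created.
  record(kind, amount = 1) {
    this.counts[kind] += amount;
  }

  // Called at the beginning of each statement.
  checkBeforeStatement() {
    const over = Object.keys(this.counts)
      .some((kind) => this.counts[kind] > this.thresholds[kind]);
    if (over) {
      this.collections += 1;   // a real engine would run the collector here
      // Assumption: the tallies start over after a collection.
      for (const kind of Object.keys(this.counts)) this.counts[kind] = 0;
    }
    return over;
  }
}

const tally = new AllocationTally(THRESHOLDS);
tally.record("objects");
const first = tally.checkBeforeStatement();    // 1 object: under the threshold
tally.record("objects", 2);
const second = tally.checkBeforeStatement();   // 3 objects: over the threshold
console.log(first, second, tally.collections);
```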

    The benefits of this approach are numerous, but the principal benefit is that circular references are not leaked unless the circular reference involves an object not owned by JScript.

    However, there are some down sides as well.  Performance is potentially not good on large-working-set applications -- if you have an app where there are lots of long-term things in memory and lots of short-term objects being created and destroyed then the GC will run often and will have to walk the same network of long-term objects over and over again.  That's not fast.

    The opposite problem is that perhaps a GC will not run when you want one to.  If you say "blah = null" then the memory owned by blah will not be released until the GC releases it. If blah is the sole remaining reference to a huge array or network of objects, you might want it to go away as soon as possible. Now, you can force the JScript garbage collector to run with the CollectGarbage() method, but I don't recommend it.  The whole point of JScript having a GC is that you don't need to worry about object lifetime.  If you do worry about it then you're probably using the wrong tool for the job! 
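
    As a small illustration of that point: the snippet below drops the last reference and then optionally forces a collection. `CollectGarbage()` exists only in Microsoft's JScript engine, so the call is guarded. `forceCollectionIfAvailable` is a hypothetical helper name, and the article's advice not to rely on this still applies.

```javascript
// `blah` stands in for the sole remaining reference to a large structure.
var blah = new Array(100000);

// Dropping the reference only makes the array unreachable;
// the memory comes back whenever the collector next runs.
blah = null;

// Guarded call: in engines without CollectGarbage this does nothing
// and returns false instead of throwing.
function forceCollectionIfAvailable() {
  if (typeof CollectGarbage === "function") {
    CollectGarbage();
    return true;
  }
  return false;
}

var forced = forceCollectionIfAvailable();
```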

    VBScript on the other hand, has a much simpler stack-based garbage collector.  Scavengers are added to a stack when they come into scope, removed when they go out of scope, and any time an object is discarded it is immediately freed. 

    You might wonder why we didn't put a mark-and-sweep GC into VBScript.  There are two reasons.  First, VBScript did not have classes until version 5, but JScript had objects from day one; VBScript did not need a complex GC because there was no way to get circular references in the first place!  Second, VBScript is supposed to be like VB6 where possible, and VB6 does not have a mark-n-sweep collector either.

    The VBScript approach pretty much has the opposite pros and cons.  It is fast, simple and predictable, but circular references of VBScript objects are not broken until the engine itself is shut down.
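
    The trade-off can be pictured with a toy model. The article does not spell out how VBScript decides that an object has been discarded. This sketch assumes COM-style reference counting, and the `Counted` class and its methods are hypothetical.

```javascript
// Toy reference counting, assumed here to illustrate "immediately freed".
class Counted {
  constructor(name, log) {
    this.name = name;
    this.log = log;
    this.count = 0;
    this.fields = [];
  }

  addRef() {
    this.count += 1;
    return this;
  }

  release() {
    this.count -= 1;
    if (this.count === 0) {
      this.log.push(this.name);            // freed the moment the count hits zero
      const held = this.fields;
      this.fields = [];
      for (const f of held) f.release();   // let go of everything it held
    }
  }

  hold(other) {
    this.fields.push(other.addRef());
  }
}

const freedLog = [];

// No cycle: dropping the only outside reference frees the whole chain at once.
const parent = new Counted("parent", freedLog).addRef();
const kid = new Counted("kid", freedLog);
parent.hold(kid);
parent.release();

// Cycle: each object keeps the other's count above zero, so nothing is freed.
const x = new Counted("x", freedLog).addRef();
const y = new Counted("y", freedLog).addRef();
x.hold(y);
y.hold(x);
x.release();
y.release();

console.log(freedLog, x.count, y.count);
```

    Only `parent` and `kid` end up in the log. `x` and `y` each keep a count of 1 after every outside reference is gone, which is the situation the article says persists until the engine shuts down.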

    The CLR GC is also mark-n-sweep but it is generational – the more collections an object survives, the less often it is checked for life.  This dramatically improves performance for large-working-set applications. Of course, the CLR GC was designed for industrial-grade applications, the JScript GC was designed for simple little web pages.

    What happens when you have a web page, ASP page or WSH script with both VBScript and JScript?  JScript and VBScript know nothing about each other's garbage collection semantics.  A VBScript program which gets a reference to a JScript object just sees another COM object.  The same for a VBScript object passed to JScript.  A circular reference between VBScript and JScript objects would not be broken and the memory would leak (until the engines were shut down).  A noncircular reference will be freed when the object in question goes out of scope in both languages (and the JS GC runs.)

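    The passage highlighted in red in the article above explains why the two-way reference in yesterday's post leaks memory. In the statements span.Object = this; and this.m_Element = span;, span comes from the DHTML object tree, while this (an instance of the TestObject class) comes from the JScript engine. The two live in different scopes, so the JScript engine's GC cannot reclaim them automatically. In yesterday's post, 问题男 said that the JS GC may get confused by circular references, and for my example that is correct. Judging from the article above, though, the statement is not quite rigorous: JS is not confused by ordinary circular references. Only references that cross scopes defeat its GC.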
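
    A minimal sketch of that pattern, and of one way to break the cycle by hand, is shown below. `TestObject`, `m_Element`, and `Object` follow the two statements quoted above. `Dispose` is a hypothetical helper, and a plain object stands in for the span element so the sketch runs outside a browser. A plain object is owned by the script engine, so this version shows only the shape of the cycle, not the leak itself.

```javascript
function TestObject(span) {
  span.Object = this;      // host object -> script object
  this.m_Element = span;   // script object -> host object
}

// Break both halves of the cycle before the page lets go of the element.
TestObject.prototype.Dispose = function () {
  if (this.m_Element) {
    this.m_Element.Object = null;
    this.m_Element = null;
  }
};

// In IE this would be document.createElement("span").
var span = {};
var test = new TestObject(span);
var cycleBefore = span.Object === test && test.m_Element === span;

test.Dispose();
var cycleAfter = span.Object !== null || test.m_Element !== null;
```

    Once `Dispose` has run, neither side refers to the other, so nothing crosses the boundary between the DHTML tree and the script engine.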

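    In yesterday's post Laser.NET said that the GCs in .NET and Java use the mark-and-sweep algorithm. As the article above explains, JScript's GC uses mark-and-sweep as well, but the implementations differ greatly in complexity. JScript's GC is lightweight, with an implementation simplified for the kind of light programming done on the Web.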

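    The replies to the article above are interesting too. They include complaints from Tim Scarfe of developer-x.com and his remarks about Erik Arvidsson, which make one respect Erik Arvidsson all over again. Who is Erik?! Take a look at this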

    BTW: the KB also describes a JScript GC bug titled "JScript Garbage Collector Is in Inconsistent State When Many Cross-Thread Calls Are Made". That bug mainly affects IE 5.0, IE 5.01, and Windows Script Engine 5.5, and it has already been fixed.


This article is reposted from 鸟食轩's blog on cnblogs (博客园). Original link: http://www.cnblogs.com/birdshome/. To repost it, please contact the original author.
