Django Source Code Walkthrough, Part 1: Startup Flow

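Django is a classic Python web framework and one of the most popular open-source Python projects. Unlike Flask, Django is highly integrated and helps developers get a web project up and running quickly. Starting this week, we will read through the Django source code together to deepen our understanding of Django and sharpen our Python web development skills. The source version used is 4.0.0. We start with a skim-reading pass to get a rough picture of how a Django project is created and started, covering the following parts: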

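A quick Django refresher
Django project overview
Django's command scaffolding
How the startproject and startapp commands work
How the runserver command works
The app setup process
Summary
Tips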

A quick Django refresher

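Create a virtual environment and install Django 4.0.0. At the time of writing, this version matched the latest official documentation, which also had a complete Chinese translation. We can follow the "Quick install guide" to create a Django project and get a feel for it. First, use the startproject command to create a project named hello; Django generates a basic project structure locally. After entering the hello directory, we can start the project directly with the runserver command.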

python3 -m django startproject hello
cd hello  && python3 manage.py runserver

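Django has good support for modularity, which makes it suitable for large web projects. A module inside a project is called an app, and one project can contain multiple apps, similar to Flask's blueprints. Next we use the startapp command to create an app named api.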

python3 -m django startapp api

We need to flesh out views.py in the api app by adding the following:

from django.shortcuts import render
# Create your views here.
from django.http import HttpResponse
def index(request):
    return HttpResponse("Hello, Game404. You're at the index.")

Then create a urls.py module in the api app that maps URLs to views:

from django.urls import path
from . import views
urlpatterns = [
    path('', views.index, name='index'),
]

With the api app implemented, we need to add it to the project. This takes two steps. First, register the api app in settings.py of the hello project:

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'api',
]

Then, in urls.py of the hello project, include the URLs and views defined in the api app:

from django.contrib import admin
from django.urls import path, include
urlpatterns = [
    path('admin/', admin.site.urls),
    path('api/', include('api.urls')),
]

Everything else can stay as the template generated it. Start the project again and request the following path:

➜  ~ curl http://127.0.0.1:8000/api/
Hello, Game404. You're at the index.

We have now created and started about the simplest possible Django project. Next, let's look at how this flow is implemented.

Django project overview

The Django source tree consists roughly of the following packages:

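Package	Description
apps	Django's app registry
conf	Configuration, mainly the project and app templates
contrib	Standard apps shipped with Django
core	Django core functionality
db	Database model implementation
dispatch	Signals, used to decouple modules
forms	Form implementation
http	HTTP protocol and serving
middleware	Standard middleware shipped with Django
template && templatetags	Template functionality
test	Unit-testing support
urls	URL handling classes
utils	Utilities
views	View implementations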

I also did a quick comparison of the code size of Django and Flask:

-------------------------------------------------------------------------------
Project                      files          blank        comment           code
-------------------------------------------------------------------------------
django                         716          19001          25416          87825
flask                           20           1611           3158           3587

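Judging by file count and lines of code alone, Django is a large framework with more than 700 module files and nearly 90,000 lines of code, whereas Flask is a small one. Before reading the Flask source we first had to get to know the libraries it builds on or is usually paired with, such as werkzeug and sqlalchemy. Django, by contrast, declares only a handful of small dependencies in its packaging metadata (asgiref and sqlparse, for example), so we can start reading right away. This is a bit like iOS versus Android: Flask, like Android, relies on extensions and is more open; Django, like iOS, bundles everything, and the framework out of the box already covers the vast majority of web development needs.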

Django's command scaffolding

Django provides a set of scaffolding commands that help developers create and manage Django projects. Use the --help option to list them:

python3 -m django --help
Type 'python -m django help <subcommand>' for help on a specific subcommand.
Available subcommands:
[django]
    check
    compilemessages
    createcachetable
    dbshell
    diffsettings
    dumpdata
    flush
    inspectdb
    loaddata
    makemessages
    makemigrations
    migrate
    runserver
    sendtestemail
    shell
    showmigrations
    sqlflush
    sqlmigrate
    sqlsequencereset
    squashmigrations
    startapp
    startproject
    test
    testserver

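The three commands used earlier, startproject, startapp and runserver, are the focus of this article. The other twenty or so commands will be introduced when we need them. This is the essence of skim reading: follow the main trunk only, build a global view first, then dig deeper step by step.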

The entry point of the django package is provided in __main__.py:

"""
Invokes django-admin when the django module is run as a script.
Example: python -m django check
"""
from django.core import management
if __name__ == "__main__":
    management.execute_from_command_line()

As we can see, the django.core.management module implements the scaffolding. The project's manage.py likewise starts the project by calling into the management module:

#!/usr/bin/env python
"""Django's command-line utility for administrative tasks."""
import os
import sys
def main():
    """Run administrative tasks."""
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'hello.settings')
    try:
        from django.core.management import execute_from_command_line
    except ImportError as exc:
        ...
    execute_from_command_line(sys.argv)
if __name__ == '__main__':
    main()

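execute_from_command_line creates a ManagementUtility and calls its execute method. The main structure of ManagementUtility is as follows: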

class ManagementUtility:
    """
    Encapsulate the logic of the django-admin and manage.py utilities.
    """
    def __init__(self, argv=None):
        # command-line arguments
        self.argv = argv or sys.argv[:]
        self.prog_name = os.path.basename(self.argv[0])
        if self.prog_name == '__main__.py':
            self.prog_name = 'python -m django'
        self.settings_exception = None
    def main_help_text(self, commands_only=False):
        """Return the script's main help text, as a string."""
        pass
    def fetch_command(self, subcommand):
        """
        Try to fetch the given subcommand, printing a message with the
        appropriate command called from the command line (usually
        "django-admin" or "manage.py") if it can't be found.
        """
        pass
    ...
    def execute(self):
        """
        Given the command-line arguments, figure out which subcommand is being
        run, create a parser appropriate to that command, and run it.
        """
        pass

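__init__ stores the command-line arguments (argv) and derives the program name
main_help_text builds the help text for the commands
fetch_command looks up a Django sub-command
execute runs the sub-command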

A good command-line tool needs clear help output. The default help text is produced by the main_help_text function:

def main_help_text(self, commands_only=False):
    """Return the script's main help text, as a string."""
    usage = [
        "",
        "Type '%s help <subcommand>' for help on a specific subcommand." % self.prog_name,
        "",
        "Available subcommands:",
    ]
    commands_dict = defaultdict(lambda: [])
    for name, app in get_commands().items():
        commands_dict[app].append(name)
    style = color_style()
    for app in sorted(commands_dict):
        usage.append("")
        usage.append(style.NOTICE("[%s]" % app))
        for name in sorted(commands_dict[app]):
            usage.append("    %s" % name)
    # Output an extra note if settings are not properly configured
    if self.settings_exception is not None:
        usage.append(style.NOTICE(
            "Note that only Django core commands are listed "
            "as settings are not properly configured (error: %s)."
            % self.settings_exception))
    return '\n'.join(usage)

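main_help_text gets the command list from get_commands, one of the following four helper functions that the management module uses to find and run commands: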

def find_commands(management_dir):
    """
    Given a path to a management directory, return a list of all the command
    names that are available.
    """
    command_dir = os.path.join(management_dir, 'commands')
    return [name for _, name, is_pkg in pkgutil.iter_modules([command_dir])
            if not is_pkg and not name.startswith('_')]
def load_command_class(app_name, name):
    """
    Given a command name and an application name, return the Command
    class instance. Allow all errors raised by the import process
    (ImportError, AttributeError) to propagate.
    """
    module = import_module('%s.management.commands.%s' % (app_name, name))
    return module.Command()
def get_commands():
    commands = {name: 'django.core' for name in find_commands(__path__[0])}
    if not settings.configured:
        return commands
    for app_config in reversed(list(apps.get_app_configs())):
        path = os.path.join(app_config.path, 'management')
        commands.update({name: app_config.name for name in find_commands(path)})
    return commands
def call_command(command_name, *args, **options):
    pass

find_commands looks for command modules under management/commands, and load_command_class imports a command with import_module, which loads a module dynamically by name.
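
The discovery-then-import pattern can be tried out without Django. The following is a minimal sketch that uses only the standard library; the demo_commands package, its greet and ping modules, and the build_demo_package helper are made up for this illustration and are not part of Django.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path


def find_commands(command_dir):
    """Return the names of non-package, non-private modules in command_dir."""
    return sorted(
        name
        for _, name, is_pkg in pkgutil.iter_modules([str(command_dir)])
        if not is_pkg and not name.startswith("_")
    )


def load_command(package, name):
    """Import '<package>.<name>' by its dotted path and instantiate its Command."""
    module = importlib.import_module(f"{package}.{name}")
    return module.Command()


def build_demo_package(root):
    """Create a throwaway package 'demo_commands' with two command modules."""
    pkg = Path(root) / "demo_commands"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "_private.py").write_text("")  # skipped: name starts with '_'
    for name in ("greet", "ping"):
        (pkg / f"{name}.py").write_text(
            "class Command:\n"
            "    def handle(self):\n"
            f"        return {name!r}\n"
        )
    return pkg


tmp_root = tempfile.mkdtemp()
command_dir = build_demo_package(tmp_root)
sys.path.insert(0, tmp_root)  # make 'demo_commands' importable
importlib.invalidate_caches()

names = find_commands(command_dir)
results = [load_command("demo_commands", name).handle() for name in names]
print(names, results)
```

Because commands are found by scanning a directory, adding a new command is only a matter of dropping a module into the right folder; nothing has to be registered by hand.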

To run a sub-command, execute uses fetch_command to find it and then calls its run_from_argv method.

def execute(self):
    ...
    self.fetch_command(subcommand).run_from_argv(self.argv)
def fetch_command(self, subcommand):
    """
    Try to fetch the given subcommand, printing a message with the
    appropriate command called from the command line (usually
    "django-admin" or "manage.py") if it can't be found.
    """
    # Get commands outside of try block to prevent swallowing exceptions
    commands = get_commands()
    try:
        app_name = commands[subcommand]
    except KeyError:
        ...
    if isinstance(app_name, BaseCommand):
        # If the command is already loaded, use it directly.
        klass = app_name
    else:
        klass = load_command_class(app_name, subcommand)
    return klass

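The commands that ship in Django's core mostly inherit directly from BaseCommand.

[Figure: the command classes in django.core and the main code structure of BaseCommand]

The two most important methods, run_from_argv and execute, are the entry points for running a command: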

def run_from_argv(self, argv):
    ...
    parser = self.create_parser(argv[0], argv[1])
    options = parser.parse_args(argv[2:])
    cmd_options = vars(options)
    # Move positional args out of options to mimic legacy optparse
    args = cmd_options.pop('args', ())
    handle_default_options(options)
    try:
        self.execute(*args, **cmd_options)
    except CommandError as e:
        ...
def execute(self, *args, **options):
    output = self.handle(*args, **options)
    return output

The handle method is left for subclasses to override:

def handle(self, *args, **options):
    """
    The actual logic of the command. Subclasses must implement
    this method.
    """
    raise NotImplementedError('subclasses of BaseCommand must provide a handle() method')
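
The chain run_from_argv, execute, handle is a template-method design: the base class owns argument parsing and the calling sequence, and a subclass only fills in handle. Here is a minimal sketch of that idea built on argparse. MiniCommand and GreetCommand are hypothetical classes written for this example; they are not Django's BaseCommand and leave out its error handling, system checks and output wrappers.

```python
import argparse


class MiniCommand:
    """A stripped-down stand-in for BaseCommand (not Django's real class)."""

    help = ""

    def create_parser(self, prog_name, subcommand):
        parser = argparse.ArgumentParser(
            prog=f"{prog_name} {subcommand}", description=self.help
        )
        self.add_arguments(parser)
        return parser

    def add_arguments(self, parser):
        """Hook for subclasses to declare their own arguments."""

    def run_from_argv(self, argv):
        # argv looks like ['manage.py', 'greet', 'World', '--shout']
        parser = self.create_parser(argv[0], argv[1])
        options = vars(parser.parse_args(argv[2:]))
        return self.execute(**options)

    def execute(self, **options):
        # Shared pre- and post-processing would live here.
        return self.handle(**options)

    def handle(self, **options):
        raise NotImplementedError("subclasses must provide a handle() method")


class GreetCommand(MiniCommand):
    help = "Print a greeting."

    def add_arguments(self, parser):
        parser.add_argument("name")
        parser.add_argument("--shout", action="store_true")

    def handle(self, **options):
        text = f"Hello, {options['name']}!"
        return text.upper() if options["shout"] else text


output = GreetCommand().run_from_argv(["manage.py", "greet", "World", "--shout"])
print(output)
```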

How the startproject and startapp commands work

The startproject and startapp commands create a project and an app respectively. Both derive from TemplateCommand, which implements most of the functionality.

# startproject
def handle(self, **options):
    project_name = options.pop('name')
    target = options.pop('directory')
    # Create a random SECRET_KEY to put it in the main settings.
    options['secret_key'] = SECRET_KEY_INSECURE_PREFIX + get_random_secret_key()
    super().handle('project', project_name, target, **options)
# startapp
def handle(self, **options):
    app_name = options.pop('name')
    target = options.pop('directory')
    super().handle('app', app_name, target, **options)

After running startproject, the project structure looks roughly like this:

├── hello
│   ├── __init__.py
│   ├── asgi.py
│   ├── settings.py
│   ├── urls.py
│   └── wsgi.py
└── manage.py

This matches the template files in the project_template directory under conf:

.
├── manage.py-tpl
└── project_name
    ├── __init__.py-tpl
    ├── asgi.py-tpl
    ├── settings.py-tpl
    ├── urls.py-tpl
    └── wsgi.py-tpl

Contents of the manage.py-tpl template file:

#!/usr/bin/env python
"""Django's command-line utility for administrative tasks."""
import os
import sys
def main():
    """Run administrative tasks."""
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', '{{ project_name }}.settings')
    try:
        from django.core.management import execute_from_command_line
    except ImportError as exc:
        ...
    execute_from_command_line(sys.argv)
if __name__ == '__main__':
    main()

So the startproject command takes the project_name entered by the developer, renders it into the template files, and generates the project files.

The core of the template-handling function in TemplateCommand is as follows:

from django.template import Context, Engine
def handle(self, app_or_project, name, target=None, **options):
    ...
    base_name = '%s_name' % app_or_project
    base_subdir = '%s_template' % app_or_project
    base_directory = '%s_directory' % app_or_project
    camel_case_name = 'camel_case_%s_name' % app_or_project
    camel_case_value = ''.join(x for x in name.title() if x != '_')
    ...
    context = Context({
        **options,
        base_name: name,
        base_directory: top_dir,
        camel_case_name: camel_case_value,
        'docs_version': get_docs_version(),
        'django_version': django.__version__,
    }, autoescape=False)   
    ...
    template_dir = self.handle_template(options['template'],
                                            base_subdir)
    ...   
    for root, dirs, files in os.walk(template_dir):
        for filename in files:
            if new_path.endswith(extensions) or filename in extra_files:
                with open(old_path, encoding='utf-8') as template_file:
                    content = template_file.read()
                    template = Engine().from_string(content)
                    content = template.render(context)
                    with open(new_path, 'w', encoding='utf-8') as new_file:
                        new_file.write(content)

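Build the Context that holds the template parameters
Walk the template directory
Render the content with Django's template engine, Engine
Write the rendered result out as the project scaffolding files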

How django.template.Engine works and how it differs from the Mako template engine will be covered later; for this chapter a rough idea is enough.
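
To make the four steps concrete without depending on Django, here is a minimal sketch that swaps Django's Engine for a small regular-expression substitution of '{{ name }}' placeholders. The render and copy_template functions and the two-file template tree are assumptions made for this example; the real TemplateCommand handles many more cases (archives, URLs, file permissions, name validation).

```python
import os
import re
import tempfile
from pathlib import Path

PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(text, context):
    """Replace each '{{ key }}' with context[key]; a stand-in for Django's Engine."""
    return PLACEHOLDER.sub(lambda m: str(context[m.group(1)]), text)


def copy_template(template_dir, target_dir, context, suffix="-tpl"):
    """Walk template_dir, render every '*-tpl' file, and write it without the suffix."""
    written = []
    for root, _dirs, files in os.walk(template_dir):
        rel_root = Path(root).relative_to(template_dir)
        # In this sketch a directory literally named 'project_name' is renamed too.
        out_root = Path(target_dir) / str(rel_root).replace(
            "project_name", context["project_name"]
        )
        out_root.mkdir(parents=True, exist_ok=True)
        for filename in files:
            if not filename.endswith(suffix):
                continue
            content = (Path(root) / filename).read_text(encoding="utf-8")
            new_path = out_root / filename[: -len(suffix)]
            new_path.write_text(render(content, context), encoding="utf-8")
            written.append(new_path.relative_to(target_dir).as_posix())
    return sorted(written)


# Build a tiny template tree that mimics the layout of conf/project_template.
template_dir = Path(tempfile.mkdtemp())
(template_dir / "project_name").mkdir()
(template_dir / "manage.py-tpl").write_text("SETTINGS = '{{ project_name }}.settings'\n")
(template_dir / "project_name" / "settings.py-tpl").write_text(
    "ROOT_URLCONF = '{{ project_name }}.urls'\n"
)

target_dir = Path(tempfile.mkdtemp())
written = copy_template(template_dir, target_dir, {"project_name": "hello"})
manage_text = (target_dir / "manage.py").read_text()
print(written)
print(manage_text)
```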

How the runserver command works

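runserver provides an HTTP server for development and testing that helps start a Django project; it is also the most frequently used Django command. Django projects follow the WSGI specification, so before starting, runserver first has to locate the WSGI application. (If you are not familiar with the WSGI specification, feel free to look back at my earlier articles.)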

def get_internal_wsgi_application():
    """
    Load and return the WSGI application as configured by the user in
    ``settings.WSGI_APPLICATION``. With the default ``startproject`` layout,
    this will be the ``application`` object in ``projectname/wsgi.py``.
    This function, and the ``WSGI_APPLICATION`` setting itself, are only useful
    for Django's internal server (runserver); external WSGI servers should just
    be configured to point to the correct application object directly.
    If settings.WSGI_APPLICATION is not set (is ``None``), return
    whatever ``django.core.wsgi.get_wsgi_application`` returns.
    """
    from django.conf import settings
    app_path = getattr(settings, 'WSGI_APPLICATION')
    if app_path is None:
        return get_wsgi_application()
    try:
        return import_string(app_path)
    except ImportError as err:
        ...

From the docstring we can tell that this loads the application the developer defined in the project's settings:

WSGI_APPLICATION = 'hello.wsgi.application'
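
Turning a dotted string such as 'hello.wsgi.application' into an object is what import_string does. The following is a minimal sketch of that idea written against the standard library only; it is a re-implementation for illustration, not a copy of Django's helper, and it uses a standard-library path because hello.wsgi only exists inside the demo project.

```python
from importlib import import_module


def import_string(dotted_path):
    """Import 'pkg.module.attr' and return attr; raise ImportError on failure."""
    try:
        module_path, attr_name = dotted_path.rsplit(".", 1)
    except ValueError as err:
        raise ImportError(f"{dotted_path!r} doesn't look like a module path") from err
    module = import_module(module_path)
    try:
        return getattr(module, attr_name)
    except AttributeError as err:
        raise ImportError(
            f"Module {module_path!r} has no attribute {attr_name!r}"
        ) from err


# Resolve a callable from its dotted path, the same way a settings string would be.
make_server = import_string("wsgiref.simple_server.make_server")
print(make_server)
```

Storing the path as a string keeps the settings file free of imports, so the application module is only loaded when the server actually needs it.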

By default, the project's WSGI application (hello/wsgi.py) looks like this:

import os
from django.core.wsgi import get_wsgi_application
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'hello.settings')
application = get_wsgi_application()

Continuing with how runserver executes, the main part is the inner_run function:

from django.core.servers.basehttp import run
def inner_run(self, *args, **options):
    try:
        handler = self.get_handler(*args, **options)
        run(self.addr, int(self.port), handler,
            ipv6=self.use_ipv6, threading=threading, server_cls=self.server_cls)
    except OSError as e:
        ...
        # Need to use an OS exit because sys.exit doesn't work in a thread
        os._exit(1)
    except KeyboardInterrupt:
        ...
        sys.exit(0)

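The HTTP server is implemented in django.core.servers.basehttp, which connects the HTTP server to WSGI.

[Figure: architecture of django.core.servers.basehttp]

The code of the run function: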

def run(addr, port, wsgi_handler, ipv6=False, threading=False, server_cls=WSGIServer):
    server_address = (addr, port)
    if threading:
        httpd_cls = type('WSGIServer', (socketserver.ThreadingMixIn, server_cls), {})
    else:
        httpd_cls = server_cls
    httpd = httpd_cls(server_address, WSGIRequestHandler, ipv6=ipv6)
    if threading:
        # ThreadingMixIn.daemon_threads indicates how threads will behave on an
        # abrupt shutdown; like quitting the server by the user or restarting
        # by the auto-reloader. True means the server will not wait for thread
        # termination before it quits. This will make auto-reloader faster
        # and will prevent the need to kill the server manually if a thread
        # isn't terminating correctly.
        httpd.daemon_threads = True
    httpd.set_app(wsgi_handler)
    httpd.serve_forever()

If threading is enabled, create a new class that combines ThreadingMixIn and WSGIServer
Create the HTTP server and start it
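
The two steps above can be reproduced with the standard library's wsgiref server in place of Django's own WSGIServer. This is a minimal sketch under that assumption: hello_app, QuietHandler and build_server are names invented for the example, and the server is bound to port 0 so the operating system picks a free port.

```python
import http.client
import socketserver
import threading
from wsgiref.simple_server import WSGIRequestHandler, WSGIServer


def hello_app(environ, start_response):
    """A minimal WSGI application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello from WSGI"]


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # keep the demo output clean


def build_server(addr, port, wsgi_handler, use_threading=True):
    """Create (but do not start) a WSGI server, mixing in threading on demand."""
    if use_threading:
        # Build a new class at runtime; ThreadingMixIn has to come first in the bases.
        httpd_cls = type(
            "ThreadedWSGIServer", (socketserver.ThreadingMixIn, WSGIServer), {}
        )
    else:
        httpd_cls = WSGIServer
    httpd = httpd_cls((addr, port), QuietHandler)
    if use_threading:
        httpd.daemon_threads = True  # do not wait for worker threads on shutdown
    httpd.set_app(wsgi_handler)
    return httpd


httpd = build_server("127.0.0.1", 0, hello_app)  # port 0: let the OS pick a free port
port = httpd.server_address[1]
worker = threading.Thread(target=httpd.serve_forever, daemon=True)
worker.start()

conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
conn.request("GET", "/")
body = conn.getresponse().read()
conn.close()

httpd.shutdown()
httpd.server_close()
print(body)
```

Creating the class with type() keeps the threading decision a runtime option: the same base server class is reused, and the mixin is only added when asked for.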

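Django's WSGI application and HTTP protocol implementation will be covered in detail in the next chapter, so we skip them for now. runserver has one more very important feature: automatically restarting the server. When we modify the project's code, the server restarts automatically, which improves development efficiency.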

For example, if we edit the api view and add a few characters, the console shows output roughly like this:

# python manage.py runserver
Watching for file changes with StatReloader
Performing system checks...
System check identified no issues (0 silenced).
March 05, 2022 - 09:06:07
Django version 4.0, using settings 'hello.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
/Users/yoo/tmp/django/hello/api/views.py changed, reloading.
Watching for file changes with StatReloader
Performing system checks...
System check identified no issues (0 silenced).
March 05, 2022 - 09:12:59
Django version 4.0, using settings 'hello.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.

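runserver detects that /hello/api/views.py has changed and automatically restarts the server; the StatReloader line in the output shows which reloader is watching the files.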

The run method can start inner_run in the following two ways; by default it goes through autoreload:

def run(self, **options):
    """Run the server, using the autoreloader if needed."""
    ...
    use_reloader = options['use_reloader']
    if use_reloader:
        autoreload.run_with_reloader(self.inner_run, **options)
    else:
        self.inner_run(None, **options)

The reloader classes in autoreload are related as follows:

class BaseReloader:
    pass
class StatReloader(BaseReloader):
    pass
class WatchmanReloader(BaseReloader):
    pass
def get_reloader():
    """Return the most suitable reloader for this environment."""
    try:
        WatchmanReloader.check_availability()
    except WatchmanUnavailable:
        return StatReloader()
    return WatchmanReloader()
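
The shape of get_reloader, prefer an optional backend and fall back to one that always works, can be shown in isolation. In this minimal sketch NativeReloader, PollingReloader and WatcherUnavailable are hypothetical stand-ins, and availability is reduced to "can the module be found"; this sketch does not attempt to mirror whatever else the real check_availability verifies.

```python
import importlib.util


class WatcherUnavailable(Exception):
    """Raised when the optional watcher backend cannot be used."""


class BaseReloader:
    name = "base"


class PollingReloader(BaseReloader):
    """Always available: needs nothing beyond the standard library."""

    name = "polling"


class NativeReloader(BaseReloader):
    """Needs an optional third-party module to be importable."""

    name = "native"

    @classmethod
    def check_availability(cls, module_name):
        if importlib.util.find_spec(module_name) is None:
            raise WatcherUnavailable(f"{module_name} is not installed")


def get_reloader(module_name="pywatchman"):
    """Prefer the native backend; fall back to polling if it is unavailable."""
    try:
        NativeReloader.check_availability(module_name)
    except WatcherUnavailable:
        return PollingReloader()
    return NativeReloader()


fallback = get_reloader("module_that_does_not_exist_for_this_demo")
preferred = get_reloader("json")  # stdlib module standing in for an installed backend
print(fallback.name, preferred.name)
```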

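The current version prefers the WatchmanReloader implementation, which depends on the pywatchman library and has to be installed separately. Otherwise it falls back to StatReloader. This approach was also covered in the earlier article on werkzeug; in essence, both keep watching for changes in file state.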

class StatReloader(BaseReloader):
    SLEEP_TIME = 1  # Check for changes once per second.
    def tick(self):
        mtimes = {}
        while True:
            for filepath, mtime in self.snapshot_files():
                old_time = mtimes.get(filepath)
                mtimes[filepath] = mtime
                if old_time is None:
                    logger.debug('File %s first seen with mtime %s', filepath, mtime)
                    continue
                elif mtime > old_time:
                    logger.debug('File %s previous mtime: %s, current mtime: %s', filepath, old_time, mtime)
                    self.notify_file_changed(filepath)
            time.sleep(self.SLEEP_TIME)
            yield

StatReloader checks file timestamps for changes at 1-second intervals.
tick is a generator, so it can be driven repeatedly with next.
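
The polling idea is easy to try on a single temporary file. This is a minimal sketch, not Django's implementation: the watch generator is written for this example, it yields the changed paths instead of calling a notification hook, and it leaves the sleeping to the caller so the demo runs instantly. The mtimes are set explicitly with os.utime to avoid depending on filesystem timestamp resolution.

```python
import os
import tempfile
from pathlib import Path


def watch(paths):
    """On each step, yield the files whose mtime grew since the previous step."""
    mtimes = {}
    while True:
        changed = []
        for path in paths:
            mtime = os.stat(path).st_mtime
            old_time = mtimes.get(path)
            mtimes[path] = mtime
            if old_time is not None and mtime > old_time:
                changed.append(path)
        yield changed  # the caller decides how long to sleep between steps


watched = Path(tempfile.mkdtemp()) / "views.py"
watched.write_text("# v1\n")
os.utime(watched, (1_000_000_000, 1_000_000_000))  # pin a known mtime

ticker = watch([watched])
first = next(ticker)   # first sighting: nothing to compare against yet
second = next(ticker)  # file unchanged since the previous step

watched.write_text("# v2\n")
os.utime(watched, (1_000_000_100, 1_000_000_100))  # simulate a later modification
third = next(ticker)
print(first, second, third)
```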

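When a file changes, the child process exits with code 3, and the parent process, which is looping in restart_with_reloader, starts a new child with subprocess: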

def trigger_reload(filename):
    logger.info('%s changed, reloading.', filename)
    sys.exit(3)
def restart_with_reloader():
    new_environ = {**os.environ, DJANGO_AUTORELOAD_ENV: 'true'}
    args = get_child_arguments()
    while True:
        p = subprocess.run(args, env=new_environ, close_fds=False)
        if p.returncode != 3:
            return p.returncode
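
The exit-code handshake between parent and child can be demonstrated with a throwaway child program. In this minimal sketch the child counts its own launches in a temporary file, asks for a reload twice by exiting with code 3, and then exits cleanly. The DEMO_COUNTER_FILE variable, CHILD_SOURCE and restart_until_done are made up for the example.

```python
import os
import subprocess
import sys
import tempfile

RELOAD_EXIT_CODE = 3  # same convention as the excerpt above

# Child program: count launches in a file, request a reload twice, then finish.
CHILD_SOURCE = """
import os, sys
counter = os.environ["DEMO_COUNTER_FILE"]
with open(counter) as fh:
    runs = int(fh.read() or 0) + 1
with open(counter, "w") as fh:
    fh.write(str(runs))
sys.exit(3 if runs < 3 else 0)
"""


def restart_until_done(args, env):
    """Re-run the child for as long as it exits with RELOAD_EXIT_CODE."""
    while True:
        completed = subprocess.run(args, env=env)
        if completed.returncode != RELOAD_EXIT_CODE:
            return completed.returncode


fd, counter_file = tempfile.mkstemp()
os.close(fd)
env = {**os.environ, "DEMO_COUNTER_FILE": counter_file}  # hypothetical demo variable
final_code = restart_until_done([sys.executable, "-c", CHILD_SOURCE], env)
with open(counter_file) as fh:
    launches = int(fh.read())
print(final_code, launches)
```

Restarting the whole process, rather than reloading modules in place, guarantees that the new child starts from a clean interpreter with the edited code.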

The app setup process

Before the runserver command runs there is one more important step, setup: loading and initializing the developer's own apps. This starts in the execute method of ManagementUtility:

def execute(self):
    ...
    try:
        settings.INSTALLED_APPS
    except ImproperlyConfigured as exc:
        self.settings_exception = exc
    except ImportError as exc:
        self.settings_exception = exc
    ...

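Accessing settings.INSTALLED_APPS triggers the lazy loading of Settings, which imports the settings module and checks a few settings that must be lists or tuples, INSTALLED_APPS chief among them: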

class Settings:
    def __init__(self, settings_module):
        ...
        # store the settings module in case someone later cares
        self.SETTINGS_MODULE = settings_module
        mod = importlib.import_module(self.SETTINGS_MODULE)
        tuple_settings = (
            'ALLOWED_HOSTS',
            "INSTALLED_APPS",
            "TEMPLATE_DIRS",
            "LOCALE_PATHS",
        )
        self._explicit_settings = set()
        ...

INSTALLED_APPS is defined in the project's settings:

# Application definition
INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'api',
]

This is how the Django framework dynamically loads what the developer has defined.
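
The two ideas in this section, settings that are imported only on first access and apps that are imported from a list of names, can be sketched in a few lines. This is a simplified illustration, not Django's LazySettings: the DEMO_SETTINGS_MODULE variable, the in-memory demo_settings module, and the use of the stdlib modules json and csv as "apps" are all assumptions made so that the example runs without a project on disk.

```python
import importlib
import os
import sys
import types

ENVIRONMENT_VARIABLE = "DEMO_SETTINGS_MODULE"  # stand-in for DJANGO_SETTINGS_MODULE


class LazySettings:
    """Import the settings module only when a setting is first accessed."""

    def __init__(self):
        self._wrapped = None

    def _setup(self):
        module = importlib.import_module(os.environ[ENVIRONMENT_VARIABLE])
        if not isinstance(module.INSTALLED_APPS, (list, tuple)):
            raise TypeError("INSTALLED_APPS must be a list or a tuple")
        self._wrapped = module

    @property
    def configured(self):
        return self._wrapped is not None

    def __getattr__(self, name):
        # Only reached for names not found normally, i.e. the actual settings.
        if self._wrapped is None:
            self._setup()
        return getattr(self._wrapped, name)


# Register an in-memory settings module so the example needs no files on disk.
demo = types.ModuleType("demo_settings")
demo.INSTALLED_APPS = ["json", "csv"]  # stdlib modules standing in for apps
sys.modules["demo_settings"] = demo
os.environ[ENVIRONMENT_VARIABLE] = "demo_settings"

settings = LazySettings()
before = settings.configured
apps = [importlib.import_module(name) for name in settings.INSTALLED_APPS]
after = settings.configured
print(before, after, [app.__name__ for app in apps])
```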

Summary

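Django is a highly integrated Python web framework that supports modular development. It also provides a set of scaffolding commands: startproject and startapp help create project and app skeletons from templates, and runserver helps with testing and developing a project. Django projects conform to the WSGI specification; starting the HTTP server creates a WSGIServer, which supports a multi-threaded mode. As a framework, Django dynamically loads the developer's business code through the conventional settings file.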

Tips

Django commands offer smart suggestions. For example, suppose we mistype the runserver command and enter an i instead of the u. The command asks whether we meant runserver:

python -m django rinserver
No Django settings specified.
Unknown command: 'rinserver'. Did you mean runserver?
Type 'python -m django help' for usage.

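Smart suggestions are very helpful in a command-line tool. The usual approach is to compare the user's input with the known commands and pick the closest one. This is a practical application of string-similarity algorithms such as edit distance. Personally I think understanding the use case is more useful than grinding through algorithm problems, and I occasionally use this example in interviews to gauge a candidate's grasp of algorithms. Django simply uses the implementation in difflib from the Python standard library, which ranks candidates by SequenceMatcher's similarity ratio: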

from difflib import get_close_matches
possible_matches = get_close_matches(subcommand, commands)
sys.stderr.write('Unknown command: %r' % subcommand)
if possible_matches:
    sys.stderr.write('. Did you mean %s?' % possible_matches[0])

An example of using get_close_matches:

>>> get_close_matches("appel", ["ape", "apple", "peach", "puppy"])
['apple', 'ape']

If you are interested in the algorithm, you can dig into its implementation details yourself.
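
As a starting point, here is a minimal sketch that wraps get_close_matches in a helper and prints the similarity ratio behind the suggestion. The suggest function and the shortened COMMANDS list are written for this example; only get_close_matches and SequenceMatcher come from difflib.

```python
from difflib import SequenceMatcher, get_close_matches

COMMANDS = [
    "check", "migrate", "runserver", "shell",
    "startapp", "startproject", "testserver",
]


def suggest(subcommand, commands=COMMANDS):
    """Return an 'Unknown command' message, with a suggestion when one is close."""
    message = f"Unknown command: {subcommand!r}"
    matches = get_close_matches(subcommand, commands)
    if matches:
        message += f". Did you mean {matches[0]}?"
    return message


typo_message = suggest("rinserver")
unrelated_message = suggest("zzz")
# ratio() is 2 * M / T: M matching characters, T total characters in both strings.
similarity = SequenceMatcher(None, "rinserver", "runserver").ratio()
print(typo_message)
print(unrelated_message)
print(round(similarity, 3))
```

get_close_matches only returns candidates whose ratio reaches its cutoff (0.6 by default), which is why an unrelated input produces no suggestion at all.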

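Another tip is a naming detail. class is a keyword in many programming languages, so when we want a variable that holds a class we need a name that avoids the keyword. One option is klass: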

if isinstance(app_name, BaseCommand):
    # If the command is already loaded, use it directly.
    klass = app_name
else:
    klass = load_command_class(app_name, subcommand)

Another option is clazz, as in this JavaScript snippet:

if ( classes.length ) {
  while ( ( elem = this[ i++ ] ) ) {
    curValue = getClass( elem );
    ...
    if ( cur ) {
      j = 0;
      while ( ( clazz = classes[ j++ ] ) ) {
        // Remove *all* instances
        while ( cur.indexOf( " " + clazz + " " ) > -1 ) {
          cur = cur.replace( " " + clazz + " ", " " );
        }
      }
        ...
    }
  }
}

Which one do you usually use?
