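Django is a classic Python web framework and one of the most popular open-source Python projects. Unlike Flask, Django is highly integrated and helps developers stand up a web project quickly. Starting this week we will read through the Django source code together, to deepen our understanding of Django and sharpen our Python web development skills. The source version used is 4.0.0. We start with a skim-reading pass to get a rough picture of how a Django project is created and started, covering the following parts: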
- A quick Django review
- Overview of the Django project
- Django's command scaffolding
- How startproject and startapp are implemented
- How runserver is implemented
- The app setup process
- Summary
- Tips
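A quick Django review
Create a virtual environment and install Django 4.0.0. At the time of writing this version matches the latest official documentation, which also has a complete Chinese translation. Following the "Quick install guide", we can create a Django project and get a feel for it. First use the startproject command to create a project named hello; Django lays out a basic project structure locally. After entering the hello directory, the runserver command starts the project directly.

```shell
python3 -m django startproject hello
cd hello && python3 manage.py runserver
```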
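Django has good support for modularity, which makes it suitable for large web projects. A module inside a project is called an app, and one project can contain several apps, similar to blueprints in Flask. Next we use the startapp command to create an app named api.

```shell
python3 -m django startapp api
```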
Flesh out the api app's views.py by adding the following:

```python
from django.shortcuts import render

# Create your views here.
from django.http import HttpResponse


def index(request):
    return HttpResponse("Hello, Game404. You're at the index.")
```
Then create a urls.py module in the api app that maps URLs to views:

```python
from django.urls import path

from . import views

urlpatterns = [
    path('', views.index, name='index'),
]
```
With the api app implemented, we need to add it to the project. This takes two steps. First, register the app in the hello project's settings.py:

```python
INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'api',
]
```
Then include the api app's URL configuration in the hello project's urls.py:

```python
from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    path('api/', include('api.urls')),
]
```
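Everything else can keep the standard implementation generated from the template. Start the project again and request the following path:

```shell
$ curl http://127.0.0.1:8000/api/
Hello, Game404. You're at the index.
```

We have now created and started a minimal Django project. Next, let's look at how this flow is implemented.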
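Overview of the Django project
The Django source tree contains roughly the following packages:
Package | Description |
apps | Django's app registry |
conf | Configuration, mainly the project template and the app template |
contrib | Standard apps that ship with Django |
core | Django's core functionality |
db | Database model implementation |
dispatch | Signals, used to decouple modules |
forms | Form implementation |
http | HTTP protocol and service implementation |
middleware | Standard middleware provided by Django |
template, templatetags | Template functionality |
test | Unit-testing support |
urls | URL handling classes |
utils | Utilities |
views | View implementation |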
I also ran a quick comparison of the Django and Flask code bases:

```text
-------------------------------------------------------------------------------
Project                      files          blank        comment           code
-------------------------------------------------------------------------------
django                         716          19001          25416          87825
flask                           20           1611           3158           3587
```
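Judging by file count and lines of code alone, Django is a large framework, with more than 700 module files and close to 90,000 lines of code, while Flask is a small one. Before reading the Flask source we had to get to know the libraries it builds on or is usually paired with, such as Werkzeug and SQLAlchemy. Django, by contrast, declares only a few small dependencies (asgiref and sqlparse, for example), so we can dive straight in. The contrast is a little like Android versus iOS: Flask relies on extensions and is more open, whereas Django bundles everything, and the stock framework already covers the vast majority of web development needs.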
Django's command scaffolding
Django provides a set of scaffolding commands that help developers create and manage projects. Use the --help option to list them:

```shell
$ python3 -m django --help

Type 'python -m django help <subcommand>' for help on a specific subcommand.

Available subcommands:

[django]
    check
    compilemessages
    createcachetable
    dbshell
    diffsettings
    dumpdata
    flush
    inspectdb
    loaddata
    makemessages
    makemigrations
    migrate
    runserver
    sendtestemail
    shell
    showmigrations
    sqlflush
    sqlmigrate
    sqlsequencereset
    squashmigrations
    startapp
    startproject
    test
    testserver
```
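The three commands used earlier, startproject, startapp and runserver, are the focus of this article; the other twenty or so will be covered when we need them. This is the essence of skim reading: follow only the main trunk, build a global view first, then go deeper step by step.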
The entry point of the django module lives in __main__.py:

```python
"""
Invokes django-admin when the django module is run as a script.

Example: python -m django check
"""
from django.core import management

if __name__ == "__main__":
    management.execute_from_command_line()
```
So the scaffolding is implemented by the django.core.management module. The project's manage.py likewise starts the project by calling into the management module:

```python
#!/usr/bin/env python
"""Django's command-line utility for administrative tasks."""
import os
import sys


def main():
    """Run administrative tasks."""
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'hello.settings')
    try:
        from django.core.management import execute_from_command_line
    except ImportError as exc:
        ...
    execute_from_command_line(sys.argv)


if __name__ == '__main__':
    main()
```
execute_from_command_line creates a ManagementUtility and calls its execute method. The main structure of ManagementUtility is:

```python
class ManagementUtility:
    """
    Encapsulate the logic of the django-admin and manage.py utilities.
    """
    def __init__(self, argv=None):
        # Command-line arguments
        self.argv = argv or sys.argv[:]
        self.prog_name = os.path.basename(self.argv[0])
        if self.prog_name == '__main__.py':
            self.prog_name = 'python -m django'
        self.settings_exception = None

    def main_help_text(self, commands_only=False):
        """Return the script's main help text, as a string."""
        pass

    def fetch_command(self, subcommand):
        """
        Try to fetch the given subcommand, printing a message with the
        appropriate command called from the command line (usually
        "django-admin" or "manage.py") if it can't be found.
        """
        pass

    ...

    def execute(self):
        """
        Given the command-line arguments, figure out which subcommand is being
        run, create a parser appropriate to that command, and run it.
        """
        pass
```
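- __init__ stores the argv command-line arguments and derives the program name
- main_help_text builds the help text for the commands
- fetch_command looks up a subcommand
- execute runs the subcommand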
A good command-line tool needs clear help output. The default help text comes from main_help_text:

```python
def main_help_text(self, commands_only=False):
    """Return the script's main help text, as a string."""
    usage = [
        "",
        "Type '%s help <subcommand>' for help on a specific subcommand." % self.prog_name,
        "",
        "Available subcommands:",
    ]
    commands_dict = defaultdict(lambda: [])
    for name, app in get_commands().items():
        commands_dict[app].append(name)
    style = color_style()
    for app in sorted(commands_dict):
        usage.append("")
        usage.append(style.NOTICE("[%s]" % app))
        for name in sorted(commands_dict[app]):
            usage.append("    %s" % name)
    # Output an extra note if settings are not properly configured
    if self.settings_exception is not None:
        usage.append(style.NOTICE(
            "Note that only Django core commands are listed "
            "as settings are not properly configured (error: %s)."
            % self.settings_exception))

    return '\n'.join(usage)
```
main_help_text gets its command list from get_commands. The management module defines the following four helper functions for finding, loading and calling commands:

```python
def find_commands(management_dir):
    """
    Given a path to a management directory, return a list of all the command
    names that are available.
    """
    command_dir = os.path.join(management_dir, 'commands')
    return [name for _, name, is_pkg in pkgutil.iter_modules([command_dir])
            if not is_pkg and not name.startswith('_')]


def load_command_class(app_name, name):
    """
    Given a command name and an application name, return the Command
    class instance. Allow all errors raised by the import process
    (ImportError, AttributeError) to propagate.
    """
    module = import_module('%s.management.commands.%s' % (app_name, name))
    return module.Command()


def get_commands():
    commands = {name: 'django.core' for name in find_commands(__path__[0])}

    if not settings.configured:
        return commands

    for app_config in reversed(list(apps.get_app_configs())):
        path = os.path.join(app_config.path, 'management')
        commands.update({name: app_config.name for name in find_commands(path)})

    return commands


def call_command(command_name, *args, **options):
    pass
```
find_commands looks for command modules under management/commands, and load_command_class imports a command with import_module, which loads a module dynamically by name.
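The discovery step can be tried outside Django. The sketch below reuses the same pkgutil.iter_modules call on a throwaway directory; the file names (greet.py, _private.py, notes.txt) are made up for illustration and are not part of Django.

```python
import pkgutil
import tempfile
from pathlib import Path


def find_commands(management_dir):
    """Return the names of command modules under <management_dir>/commands."""
    command_dir = str(Path(management_dir) / "commands")
    return sorted(
        name
        for _, name, is_pkg in pkgutil.iter_modules([command_dir])
        if not is_pkg and not name.startswith("_")
    )


def demo():
    # Build a temporary management/commands directory with hypothetical files.
    with tempfile.TemporaryDirectory() as tmp:
        commands = Path(tmp) / "management" / "commands"
        commands.mkdir(parents=True)
        for filename in ("__init__.py", "greet.py", "_private.py", "notes.txt"):
            (commands / filename).write_text("")
        return find_commands(Path(tmp) / "management")


print(demo())  # ['greet']
```

Only greet survives: __init__ is never reported as a module, _private is filtered by the underscore rule, and notes.txt is not a Python module.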
To run a subcommand, execute uses fetch_command to find it and then calls its run_from_argv method.

```python
def execute(self):
    ...
    self.fetch_command(subcommand).run_from_argv(self.argv)


def fetch_command(self, subcommand):
    """
    Try to fetch the given subcommand, printing a message with the
    appropriate command called from the command line (usually
    "django-admin" or "manage.py") if it can't be found.
    """
    # Get commands outside of try block to prevent swallowing exceptions
    commands = get_commands()
    try:
        app_name = commands[subcommand]
    except KeyError:
        ...
    if isinstance(app_name, BaseCommand):
        # If the command is already loaded, use it directly.
        klass = app_name
    else:
        klass = load_command_class(app_name, subcommand)
    return klass
```
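To make the lookup-then-run flow concrete, here is a minimal sketch that needs no Django. MiniUtility, Command and the COMMANDS registry are hypothetical stand-ins; the real ManagementUtility builds its registry with get_commands and does considerably more argument handling.

```python
import sys


class Command:
    """Hypothetical stand-in for a Django management command."""

    def __init__(self, name):
        self.name = name

    def run_from_argv(self, argv):
        return f"{self.name} called with {argv[2:]}"


# Hypothetical registry; Django builds the real one with get_commands().
COMMANDS = {"runserver": Command("runserver"), "startapp": Command("startapp")}


class MiniUtility:
    def __init__(self, argv=None):
        self.argv = argv or sys.argv[:]

    def fetch_command(self, subcommand):
        try:
            return COMMANDS[subcommand]
        except KeyError:
            raise SystemExit(f"Unknown command: {subcommand!r}")

    def execute(self):
        # argv[0] is the program name, argv[1] the subcommand.
        subcommand = self.argv[1] if len(self.argv) > 1 else "help"
        return self.fetch_command(subcommand).run_from_argv(self.argv)


result = MiniUtility(["manage.py", "runserver", "8080"]).execute()
print(result)  # runserver called with ['8080']
```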
Django's core package ships the subcommands shown in the help output above, and most of them inherit directly from BaseCommand. In BaseCommand, the two most important methods, run_from_argv and execute, form the entry point for running a command:

```python
def run_from_argv(self, argv):
    ...
    parser = self.create_parser(argv[0], argv[1])

    options = parser.parse_args(argv[2:])
    cmd_options = vars(options)
    # Move positional args out of options to mimic legacy optparse
    args = cmd_options.pop('args', ())
    handle_default_options(options)
    try:
        self.execute(*args, **cmd_options)
    except CommandError as e:
        ...


def execute(self, *args, **options):
    output = self.handle(*args, **options)
    return output
```
The handle method is left for subclasses to override:

```python
def handle(self, *args, **options):
    """
    The actual logic of the command. Subclasses must implement
    this method.
    """
    raise NotImplementedError('subclasses of BaseCommand must provide a handle() method')
```
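This is the template-method pattern: the base class fixes the parse-then-execute sequence and subclasses fill in handle. A simplified sketch using only argparse follows; MiniBaseCommand and GreetCommand are illustrative names, not Django classes, and the real BaseCommand handles many more options and errors.

```python
import argparse


class MiniBaseCommand:
    """Simplified stand-in for BaseCommand (not Django's real class)."""

    def add_arguments(self, parser):
        pass

    def create_parser(self, prog_name, subcommand):
        parser = argparse.ArgumentParser(prog=f"{prog_name} {subcommand}")
        self.add_arguments(parser)
        return parser

    def run_from_argv(self, argv):
        parser = self.create_parser(argv[0], argv[1])
        options = vars(parser.parse_args(argv[2:]))
        return self.execute(**options)

    def execute(self, **options):
        return self.handle(**options)

    def handle(self, **options):
        raise NotImplementedError("subclasses must provide a handle() method")


class GreetCommand(MiniBaseCommand):
    """Hypothetical subcommand used only for illustration."""

    def add_arguments(self, parser):
        parser.add_argument("name")
        parser.add_argument("--shout", action="store_true")

    def handle(self, **options):
        message = f"Hello, {options['name']}"
        return message.upper() if options["shout"] else message


greeting = GreetCommand().run_from_argv(["manage.py", "greet", "django", "--shout"])
print(greeting)  # HELLO, DJANGO
```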
How startproject and startapp are implemented
The startproject and startapp commands create a project and an app respectively. Both derive from TemplateCommand, which holds most of the implementation.

```python
# startproject
def handle(self, **options):
    project_name = options.pop('name')
    target = options.pop('directory')

    # Create a random SECRET_KEY to put it in the main settings.
    options['secret_key'] = SECRET_KEY_INSECURE_PREFIX + get_random_secret_key()

    super().handle('project', project_name, target, **options)


# startapp
def handle(self, **options):
    app_name = options.pop('name')
    target = options.pop('directory')
    super().handle('app', app_name, target, **options)
```
After running startproject, the project structure looks roughly like this:

```text
├── hello
│   ├── __init__.py
│   ├── asgi.py
│   ├── settings.py
│   ├── urls.py
│   └── wsgi.py
└── manage.py
```
This matches the template files in the project_template directory under conf:

```text
.
├── manage.py-tpl
└── project_name
    ├── __init__.py-tpl
    ├── asgi.py-tpl
    ├── settings.py-tpl
    ├── urls.py-tpl
    └── wsgi.py-tpl
```
The content of the manage.py-tpl template file:

```python
#!/usr/bin/env python
"""Django's command-line utility for administrative tasks."""
import os
import sys


def main():
    """Run administrative tasks."""
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', '{{ project_name }}.settings')
    try:
        from django.core.management import execute_from_command_line
    except ImportError as exc:
        ...


if __name__ == '__main__':
    main()
```
So startproject takes the project_name entered by the developer, renders it into the template files, and writes out the project files.
The template handling in TemplateCommand boils down to the following:

```python
from django.template import Context, Engine


def handle(self, app_or_project, name, target=None, **options):
    ...
    base_name = '%s_name' % app_or_project
    base_subdir = '%s_template' % app_or_project
    base_directory = '%s_directory' % app_or_project
    camel_case_name = 'camel_case_%s_name' % app_or_project
    camel_case_value = ''.join(x for x in name.title() if x != '_')
    ...
    context = Context({
        **options,
        base_name: name,
        base_directory: top_dir,
        camel_case_name: camel_case_value,
        'docs_version': get_docs_version(),
        'django_version': django.__version__,
    }, autoescape=False)
    ...
    template_dir = self.handle_template(options['template'], base_subdir)
    ...
    for root, dirs, files in os.walk(template_dir):
        for filename in files:
            ...
            if new_path.endswith(extensions) or filename in extra_files:
                with open(old_path, encoding='utf-8') as template_file:
                    content = template_file.read()
                template = Engine().from_string(content)
                content = template.render(context)
                with open(new_path, 'w', encoding='utf-8') as new_file:
                    new_file.write(content)
```
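- Build the Context that holds the template variables
- Walk the template directory
- Render each file's content with Django's template engine, Engine
- Write the rendered result out as the project scaffold files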
How django.template.Engine works, and how it differs from the Mako template engine, will be covered later; for this chapter a rough understanding is enough.
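The walk-render-write loop can be imitated with the standard library alone. Note the swap: this sketch uses string.Template with $project_name placeholders instead of Django's Engine and its {{ project_name }} syntax, and render_scaffold is a hypothetical helper, not a Django function.

```python
import os
import tempfile
from pathlib import Path
from string import Template


def render_scaffold(template_dir, target_dir, context):
    """Copy template_dir into target_dir, rendering every *-tpl file."""
    created = []
    for root, _dirs, files in os.walk(template_dir):
        relative = Path(root).relative_to(template_dir)
        # Directory names carry the placeholder too, e.g. "project_name".
        out_dir = Path(target_dir) / str(relative).replace(
            "project_name", context["project_name"]
        )
        out_dir.mkdir(parents=True, exist_ok=True)
        for filename in files:
            content = (Path(root) / filename).read_text(encoding="utf-8")
            if filename.endswith("-tpl"):
                content = Template(content).safe_substitute(context)
                filename = filename[: -len("-tpl")]
            new_path = out_dir / filename
            new_path.write_text(content, encoding="utf-8")
            created.append(new_path.relative_to(target_dir).as_posix())
    return sorted(created)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        template_dir = Path(tmp) / "project_template"
        (template_dir / "project_name").mkdir(parents=True)
        (template_dir / "manage.py-tpl").write_text(
            "SETTINGS = '$project_name.settings'\n", encoding="utf-8"
        )
        (template_dir / "project_name" / "__init__.py-tpl").write_text(
            "", encoding="utf-8"
        )
        target = Path(tmp) / "out"
        created = render_scaffold(template_dir, target, {"project_name": "hello"})
        return created, (target / "manage.py").read_text(encoding="utf-8")


files, manage_py = demo()
print(files)      # ['hello/__init__.py', 'manage.py']
print(manage_py)  # SETTINGS = 'hello.settings'
```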
How runserver is implemented
runserver provides an HTTP server for development and testing that starts the Django project; it is also the most frequently used Django command. Django projects follow the WSGI specification, so before starting the server runserver first has to locate the WSGI application. (If you are not familiar with WSGI, see the earlier articles in this series.)

```python
def get_internal_wsgi_application():
    """
    Load and return the WSGI application as configured by the user in
    ``settings.WSGI_APPLICATION``. With the default ``startproject`` layout,
    this will be the ``application`` object in ``projectname/wsgi.py``.

    This function, and the ``WSGI_APPLICATION`` setting itself, are only useful
    for Django's internal server (runserver); external WSGI servers should just
    be configured to point to the correct application object directly.

    If settings.WSGI_APPLICATION is not set (is ``None``), return
    whatever ``django.core.wsgi.get_wsgi_application`` returns.
    """
    from django.conf import settings
    app_path = getattr(settings, 'WSGI_APPLICATION')
    if app_path is None:
        return get_wsgi_application()

    try:
        return import_string(app_path)
    except ImportError as err:
        ...
```
As the docstring explains, this loads the application that the developer configured in the project's settings:

```python
WSGI_APPLICATION = 'hello.wsgi.application'
```
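The setting is a dotted path, so resolving it means importing the module part and reading the last component as an attribute. Below is a rough sketch of that idea with importlib; it mirrors what a helper like import_string needs to do but is not copied from Django, and a standard-library path stands in for 'hello.wsgi.application'.

```python
from importlib import import_module


def import_string(dotted_path):
    """Import a dotted path and return the attribute named by its last part."""
    try:
        module_path, attr_name = dotted_path.rsplit(".", 1)
    except ValueError as err:
        raise ImportError(f"{dotted_path!r} doesn't look like a module path") from err
    module = import_module(module_path)
    try:
        return getattr(module, attr_name)
    except AttributeError as err:
        raise ImportError(
            f"Module {module_path!r} does not define {attr_name!r}"
        ) from err


# A stdlib path stands in for 'hello.wsgi.application' here.
join = import_string("os.path.join")
print(join("hello", "wsgi.py"))
```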
By default, the project's wsgi.py defines that application like this:

```python
import os

from django.core.wsgi import get_wsgi_application

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'hello.settings')

application = get_wsgi_application()
```
Continuing with runserver, the main work happens in inner_run:

```python
from django.core.servers.basehttp import run


def inner_run(self, *args, **options):
    try:
        handler = self.get_handler(*args, **options)
        run(self.addr, int(self.port), handler,
            ipv6=self.use_ipv6, threading=threading, server_cls=self.server_cls)
    except OSError as e:
        ...
        # Need to use an OS exit because sys.exit doesn't work in a thread
        os._exit(1)
    except KeyboardInterrupt:
        ...
        sys.exit(0)
```
The HTTP server is implemented in django.core.servers.basehttp, which bridges the HTTP server and WSGI:
[Figure: architecture of django.core.servers.basehttp]
The code of the run function:

```python
def run(addr, port, wsgi_handler, ipv6=False, threading=False, server_cls=WSGIServer):
    server_address = (addr, port)
    if threading:
        httpd_cls = type('WSGIServer', (socketserver.ThreadingMixIn, server_cls), {})
    else:
        httpd_cls = server_cls
    httpd = httpd_cls(server_address, WSGIRequestHandler, ipv6=ipv6)
    if threading:
        # ThreadingMixIn.daemon_threads indicates how threads will behave on an
        # abrupt shutdown; like quitting the server by the user or restarting
        # by the auto-reloader. True means the server will not wait for thread
        # termination before it quits. This will make auto-reloader faster
        # and will prevent the need to kill the server manually if a thread
        # isn't terminating correctly.
        httpd.daemon_threads = True
    httpd.set_app(wsgi_handler)
    httpd.serve_forever()
```
- When threading is enabled, create a new class that combines ThreadingMixIn and WSGIServer
- Create the HTTP server and start it
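The three-argument form of type() is what builds the threaded server class at runtime. The sketch below isolates that step; it uses wsgiref's WSGIServer from the standard library as a stand-in for Django's own WSGIServer subclass, and build_server_class is a hypothetical helper name. No socket is opened.

```python
import socketserver
from wsgiref.simple_server import WSGIServer


def build_server_class(threading=False, server_cls=WSGIServer):
    """Return server_cls, optionally mixed with ThreadingMixIn at runtime."""
    if threading:
        # type(name, bases, namespace) creates a new class on the fly.
        return type("WSGIServer", (socketserver.ThreadingMixIn, server_cls), {})
    return server_cls


threaded_cls = build_server_class(threading=True)
plain_cls = build_server_class(threading=False)

# The mixin comes first in the MRO, so its process_request wins.
print([cls.__name__ for cls in threaded_cls.__mro__[:3]])
# ['WSGIServer', 'ThreadingMixIn', 'WSGIServer']
```

Listing ThreadingMixIn before the server class matters: the mixin overrides process_request to hand each request to a new thread, and it can only do so if it precedes the server class in the method resolution order.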
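Django's WSGI application and its HTTP protocol handling will be covered in detail in the next chapter, so we skip them for now. runserver has one more very important feature: automatic server restart. When we change the project's code the server restarts on its own, which speeds up development.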
For example, if we edit the api app's view and add a few characters, the console shows output roughly like this:

```text
# python manage.py runserver
Watching for file changes with StatReloader
Performing system checks...

System check identified no issues (0 silenced).
March 05, 2022 - 09:06:07
Django version 4.0, using settings 'hello.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
/Users/yoo/tmp/django/hello/api/views.py changed, reloading.
Watching for file changes with StatReloader
Performing system checks...

System check identified no issues (0 silenced).
March 05, 2022 - 09:12:59
Django version 4.0, using settings 'hello.settings'
Starting development server at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
```
runserver detects that /hello/api/views.py changed and restarts the server automatically; StatReloader is the reloader in use.
The run method can start inner_run in two ways, and by default it goes through autoreload:

```python
def run(self, **options):
    """Run the server, using the autoreloader if needed."""
    ...
    use_reloader = options['use_reloader']

    if use_reloader:
        autoreload.run_with_reloader(self.inner_run, **options)
    else:
        self.inner_run(None, **options)
```
The reloader classes in autoreload are related as follows:

```python
class BaseReloader:
    pass


class StatReloader(BaseReloader):
    pass


class WatchmanReloader(BaseReloader):
    pass


def get_reloader():
    """Return the most suitable reloader for this environment."""
    try:
        WatchmanReloader.check_availability()
    except WatchmanUnavailable:
        return StatReloader()
    return WatchmanReloader()
```
The current version prefers WatchmanReloader, which depends on the pywatchman library and needs a separate install. Otherwise it falls back to StatReloader. We met the same approach when reading Werkzeug: both keep polling files for state changes.

```python
class StatReloader(BaseReloader):
    SLEEP_TIME = 1  # Check for changes once per second.

    def tick(self):
        mtimes = {}
        while True:
            for filepath, mtime in self.snapshot_files():
                old_time = mtimes.get(filepath)
                mtimes[filepath] = mtime
                if old_time is None:
                    logger.debug('File %s first seen with mtime %s', filepath, mtime)
                    continue
                elif mtime > old_time:
                    logger.debug('File %s previous mtime: %s, current mtime: %s', filepath, old_time, mtime)
                    self.notify_file_changed(filepath)

            time.sleep(self.SLEEP_TIME)
            yield
```
- StatReloader checks file modification times at one-second intervals.
- tick is a generator, so the caller can keep driving it with next.
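A stripped-down version of the polling generator shows both points. The tick function and its on_change callback here are simplifications written for this sketch, not Django's code; the sleep is left out and the file's mtime is pushed forward with os.utime so the demo does not depend on the clock.

```python
import os
import tempfile
from pathlib import Path


def tick(paths, on_change):
    """Generator that compares mtimes on each step, like StatReloader.tick."""
    mtimes = {}
    while True:
        for path in paths:
            mtime = os.stat(path).st_mtime
            old_time = mtimes.get(path)
            mtimes[path] = mtime
            if old_time is not None and mtime > old_time:
                on_change(path)
        # The real reloader sleeps here; this sketch leaves pacing to the caller.
        yield


def demo():
    changed = []
    with tempfile.TemporaryDirectory() as tmp:
        watched = Path(tmp) / "views.py"
        watched.write_text("print('v1')\n")
        ticker = tick([watched], changed.append)
        next(ticker)  # first pass only records the baseline mtime
        # Push the mtime forward explicitly instead of waiting for the clock.
        stat = watched.stat()
        os.utime(watched, (stat.st_atime, stat.st_mtime + 10))
        next(ticker)  # second pass sees the newer mtime
        return [path.name for path in changed]


print(demo())  # ['views.py']
```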
When a file changes, the current process exits with code 3, and the parent process uses subprocess to start a new one for as long as it keeps seeing that exit code:

```python
def trigger_reload(filename):
    logger.info('%s changed, reloading.', filename)
    sys.exit(3)


def restart_with_reloader():
    new_environ = {**os.environ, DJANGO_AUTORELOAD_ENV: 'true'}
    args = get_child_arguments()
    while True:
        p = subprocess.run(args, env=new_environ, close_fds=False)
        if p.returncode != 3:
            return p.returncode
```
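The exit-code handshake between parent and child can be reproduced in a few lines. In this sketch the child script, the counter file and the SKETCH_AUTORELOAD variable are all invented for illustration; the child asks for a reload on its first two runs and exits cleanly on the third.

```python
import os
import subprocess
import sys
import tempfile

RELOAD_EXIT_CODE = 3  # same value the source passes to sys.exit()

# Hypothetical child: requests a reload twice, then exits cleanly.
CHILD_SOURCE = """
import sys
counter_path = sys.argv[1]
with open(counter_path) as f:
    runs = int(f.read() or 0) + 1
with open(counter_path, "w") as f:
    f.write(str(runs))
sys.exit(3 if runs < 3 else 0)
"""


def restart_loop(args, env):
    """Re-run the child for as long as it exits with the reload code."""
    starts = 0
    while True:
        starts += 1
        completed = subprocess.run(args, env=env)
        if completed.returncode != RELOAD_EXIT_CODE:
            return completed.returncode, starts


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        counter_path = os.path.join(tmp, "runs.txt")
        with open(counter_path, "w") as f:
            f.write("0")
        # Hypothetical marker variable telling the child it runs under a reloader.
        env = {**os.environ, "SKETCH_AUTORELOAD": "true"}
        args = [sys.executable, "-c", CHILD_SOURCE, counter_path]
        return restart_loop(args, env)


print(demo())  # (0, 3)
```

Using a dedicated exit code keeps the protocol simple: any other return code, including a crash, ends the loop and is passed through to the caller.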
The app setup process
Before the runserver command runs there is one more important step, setup: loading and initializing the developer's own apps. It begins in ManagementUtility's execute method:

```python
def execute(self):
    ...
    try:
        settings.INSTALLED_APPS
    except ImproperlyConfigured as exc:
        self.settings_exception = exc
    except ImportError as exc:
        self.settings_exception = exc
    ...
```
Settings imports the project's settings module. The tuple_settings names below are settings that must be lists or tuples, and the key one is INSTALLED_APPS:

```python
class Settings:
    def __init__(self, settings_module):
        ...
        # store the settings module in case someone later cares
        self.SETTINGS_MODULE = settings_module

        mod = importlib.import_module(self.SETTINGS_MODULE)

        tuple_settings = (
            'ALLOWED_HOSTS',
            "INSTALLED_APPS",
            "TEMPLATE_DIRS",
            "LOCALE_PATHS",
        )
        self._explicit_settings = set()
        ...
```
INSTALLED_APPS is defined in the project's settings:

```python
# Application definition

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'api',
]
```

This is how the Django framework dynamically loads what the developer has defined.
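The loading idea is small enough to sketch: import a module by name, copy its upper-case attributes, and validate the ones that must be sequences. MiniSettings and the in-memory sketch_settings module are assumptions made for this example, and TypeError stands in for Django's ImproperlyConfigured.

```python
import importlib
import sys
import types

TUPLE_SETTINGS = ("ALLOWED_HOSTS", "INSTALLED_APPS", "TEMPLATE_DIRS", "LOCALE_PATHS")


class MiniSettings:
    """Simplified stand-in for django.conf.Settings."""

    def __init__(self, settings_module):
        self.SETTINGS_MODULE = settings_module
        mod = importlib.import_module(settings_module)
        for name in dir(mod):
            if not name.isupper():
                continue  # only upper-case names count as settings
            value = getattr(mod, name)
            if name in TUPLE_SETTINGS and not isinstance(value, (list, tuple)):
                raise TypeError(f"The {name} setting must be a list or a tuple.")
            setattr(self, name, value)


# Register a hypothetical in-memory settings module instead of a file on disk.
fake = types.ModuleType("sketch_settings")
fake.INSTALLED_APPS = ["django.contrib.admin", "api"]
fake.DEBUG = True
fake.helper = "ignored"
sys.modules["sketch_settings"] = fake

settings = MiniSettings("sketch_settings")
print(settings.INSTALLED_APPS, hasattr(settings, "helper"))
# ['django.contrib.admin', 'api'] False
```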
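Summary
Django is a highly integrated Python web framework that supports modular development. It ships a set of scaffolding commands: startproject and startapp help create project and app skeletons from templates, and runserver helps with development and testing. Django projects also follow the WSGI specification; starting the HTTP server creates a WSGIServer, with optional multi-threading. As a framework, Django dynamically loads the developer's business code through the conventional settings module.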
Tips
Django's commands offer smart suggestions. Say we mistype the runserver command and hit i instead of u. The command asks whether we meant runserver:

```shell
$ python -m django rinserver
No Django settings specified.
Unknown command: 'rinserver'. Did you mean runserver?
Type 'python -m django help' for usage.
```
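Suggestions like this are very helpful in a command-line tool. The usual implementation compares the user's input against the known commands and picks the closest one, which is a practical use of string-similarity algorithms such as edit distance. I think understanding the scenario is more useful than grinding through algorithm drills, and I occasionally use this example in interviews to gauge a candidate's algorithm skills. Django simply uses the implementation in the standard library's difflib, whose get_close_matches ranks candidates by SequenceMatcher similarity ratio:

```python
from difflib import get_close_matches

possible_matches = get_close_matches(subcommand, commands)
sys.stderr.write('Unknown command: %r' % subcommand)
if possible_matches:
    sys.stderr.write('. Did you mean %s?' % possible_matches[0])
```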
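A usage example of get_close_matches:

```python
>>> from difflib import get_close_matches
>>> get_close_matches("appel", ["ape", "apple", "peach", "puppy"])
['apple', 'ape']
```

Readers interested in the algorithm can dig into the implementation details on their own.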
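To see the suggestion logic end to end, here is a small self-contained sketch. The suggest function is a hypothetical wrapper, and COMMANDS holds only a handful of the subcommand names from the help output rather than the full registry.

```python
from difflib import get_close_matches

# A few of the subcommand names listed in the help output.
COMMANDS = ["check", "migrate", "runserver", "shell", "startapp", "startproject", "test"]


def suggest(subcommand, commands=COMMANDS):
    """Return an error message, with a suggestion when one is close enough."""
    message = f"Unknown command: {subcommand!r}"
    matches = get_close_matches(subcommand, commands)
    if matches:
        message += f". Did you mean {matches[0]}?"
    return message


print(suggest("rinserver"))  # Unknown command: 'rinserver'. Did you mean runserver?
print(suggest("xyz"))        # Unknown command: 'xyz'
```

get_close_matches only returns candidates whose similarity ratio reaches its cutoff (0.6 by default), which is why an input with nothing in common produces no suggestion at all.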
Another tip is a naming detail. class is a keyword in many programming languages, so a variable that holds a class needs a different name to avoid the clash. One option is klass:

```python
if isinstance(app_name, BaseCommand):
    # If the command is already loaded, use it directly.
    klass = app_name
else:
    klass = load_command_class(app_name, subcommand)
```
Another is clazz:

```javascript
if ( classes.length ) {
    while ( ( elem = this[ i++ ] ) ) {
        curValue = getClass( elem );
        ...
        if ( cur ) {
            j = 0;
            while ( ( clazz = classes[ j++ ] ) ) {

                // Remove *all* instances
                while ( cur.indexOf( " " + clazz + " " ) > -1 ) {
                    cur = cur.replace( " " + clazz + " ", " " );
                }
            }
            ...
        }
    }
}
```
Which one do you usually use?