2. Complete list of command options
Usage: wget [OPTION]... [URL]...
Mandatory arguments to long options are mandatory for short options too.
Startup:
-V, --version display the version of Wget and exit.
-h, --help print this help.
-b, --background go to background after startup.
-e, --execute=COMMAND execute a `.wgetrc' command.
Logging and input file:
-o, --output-file=FILE log messages to FILE.
-a, --append-output=FILE append messages to FILE.
-d, --debug print debug output.
-q, --quiet quiet (no output).
-v, --verbose be verbose (this is the default).
-nv, --non-verbose turn off verboseness, without being quiet.
-i, --input-file=FILE read URL-s from file.
-F, --force-html treat input file as HTML.
Download:
-t, --tries=NUMBER set number of retries to NUMBER (0 unlimits).
-O, --output-document=FILE write documents to FILE.
-nc, --no-clobber don't clobber existing files.
-c, --continue restart getting an existing file.
--dot-style=STYLE set retrieval display style.
-N, --timestamping don't retrieve files if older than local.
-S, --server-response print server response.
--spider don't download anything.
-T, --timeout=SECONDS set the read timeout to SECONDS.
-w, --wait=SECONDS wait SECONDS between retrievals.
-Y, --proxy=on/off turn proxy on or off.
-Q, --quota=NUMBER set retrieval quota to NUMBER.
Directories:
-nd, --no-directories don't create directories.
-x, --force-directories force creation of directories.
-nH, --no-host-directories don't create host directories.
-P, --directory-prefix=PREFIX save files to PREFIX/...
--cut-dirs=NUMBER ignore NUMBER remote directory components.
HTTP options:
--http-user=USER set http user to USER.
--http-passwd=PASS set http password to PASS.
-C, --cache=on/off (dis)allow server-cached data (normally allowed).
--ignore-length ignore `Content-Length' header field.
--header=STRING insert STRING among the headers.
--proxy-user=USER set USER as proxy username.
--proxy-passwd=PASS set PASS as proxy password.
-s, --save-headers save the HTTP headers to file.
-U, --user-agent=AGENT identify as AGENT instead of Wget/VERSION.
FTP options:
--retr-symlinks retrieve FTP symbolic links.
-g, --glob=on/off turn file name globbing on or off.
--passive-ftp use the "passive" transfer mode.
Recursive retrieval:
-r, --recursive recursive web-suck -- use with care!.
-l, --level=NUMBER maximum recursion depth (0 to unlimit).
--delete-after delete downloaded files.
-k, --convert-links convert non-relative links to relative.
-m, --mirror turn on options suitable for mirroring.
-nr, --dont-remove-listing don't remove `.listing' files.
Recursive accept/reject:
-A, --accept=LIST list of accepted extensions.
-R, --reject=LIST list of rejected extensions.
-D, --domains=LIST list of accepted domains.
--exclude-domains=LIST comma-separated list of rejected domains.
-L, --relative follow relative links only.
--follow-ftp follow FTP links from HTML documents.
-H, --span-hosts go to foreign hosts when recursive.
-I, --include-directories=LIST list of allowed directories.
-X, --exclude-directories=LIST list of excluded directories.
-nh, --no-host-lookup don't DNS-lookup hosts.
-np, --no-parent don't ascend to the parent directory.
How to use wget
1. Download to a specified folder
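wget https://raw.githubusercontent.com/……/image_ocr.py -P "E:\Program Files\wget download"
Note: -P sets the target folder, whereas -O names the output file itself; the path is quoted because it contains spaces.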
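The folder download above can be wrapped in a small shell helper. This is only a sketch: the function name wget_to_dir, the DRY_RUN variable, and the example.com URL are hypothetical and not part of wget. With DRY_RUN=1 the command is printed instead of executed, so the quoting can be checked before anything is downloaded.

```shell
#!/bin/sh
# Sketch: download a URL into a target directory.
# wget_to_dir and DRY_RUN are illustrative names, not part of wget.
wget_to_dir() {
    url=$1
    dir=$2
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command that would run; the quotes show each path is one argument.
        printf 'wget -P "%s" "%s"\n' "$dir" "$url"
    else
        # -P (--directory-prefix) saves the file under $dir, keeping its remote name.
        wget -P "$dir" "$url"
    fi
}

# Dry run with a placeholder URL and a directory name that contains a space.
DRY_RUN=1
wget_to_dir "http://www.example.com/file.txt" "wget download"
```

Unsetting DRY_RUN (or setting it to 0) makes the same call run wget for real.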
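2. Download an entire site: you often need to download a whole website or one directory of it.
wget -r -p -k -np -nc -e robots=off http://www.example.com/mydir/ # Download a single directory, e.g. everything under the mydir directory of www.example.com
wget -r -p -k -nc -e robots=off http://www.example.com/mydir/ # To download the whole site, it is best to drop the -np option.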
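-r Recursive download. For HTTP hosts, wget first downloads the file named by the URL and then, if that file is an HTML document, recursively downloads every file it links to (the recursion depth is set with -l). For FTP hosts, it downloads all files in the directory named by the URL, recursing in a similar way.
-c Resume a partially downloaded file. Within a single run wget already retries and continues automatically if the connection drops, so -c is only needed to finish a partial file left behind by an earlier run or by another program (for example an FTP client).
-nc Do not download files that already exist locally.
-np Do not ascend to the parent directory: links that lead above the starting directory are not followed, so only the specified directory and its subdirectories are downloaded.
-p Download all files needed to display the page. For example, if the page contains images that live under /images rather than /yourdir, they are still downloaded with this option.
-k Convert absolute links in the downloaded files to relative links so the pages can be browsed locally.
-o down.log Write log messages to down.log.
-e robots=off Ignore robots.txt.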
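The options above can be combined as in the following sketch, which only prints the resulting command line so it can be reviewed before running. The function name mirror_cmd and its "dir"/"site" mode argument are hypothetical names chosen for illustration; the flags are the ones described above.

```shell
#!/bin/sh
# Sketch: assemble the recursive-download command described above.
# mirror_cmd and the "dir"/"site" mode names are hypothetical.
mirror_cmd() {
    url=$1
    mode=${2:-dir}   # "dir": stay below the start directory; "site": whole site
    # Recursive, with page requisites, link conversion, no re-download,
    # robots.txt ignored, and log messages written to down.log.
    opts="-r -p -k -nc -e robots=off -o down.log"
    if [ "$mode" = "dir" ]; then
        # -np keeps wget from ascending to the parent directory.
        opts="$opts -np"
    fi
    printf 'wget %s %s\n' "$opts" "$url"
}

mirror_cmd "http://www.example.com/mydir/" dir
mirror_cmd "http://www.example.com/" site
```

The first call prints the single-directory form with -np; the second omits -np for a whole-site download. Once the printed line looks right, it can be pasted into the terminal.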