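Preface
I recently supported an overseas customer who was using ossutil to upload files to OSS and could not get the --jobs and --parallel parameters tuned properly, so I wrote a short English document for them.
ossutil can be obtained here.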
Concurrency Control
--jobs or --parallel
--jobs controls the number of concurrent tasks across multiple files.
--parallel controls the number of concurrent tasks within a single file.
- By default, ossutil calculates the parallel number from the file size when the file is larger than 100MB.
- So --parallel has no effect on small files.
- The file size above which multipart upload is used can be set with --bigfile-threshold; the default value is 100MB (104857600 bytes).
Therefore, when uploading, downloading, or copying files in batch, the total number of concurrent tasks is
```concurrency = jobs * parallel```
- Both options can be set by the user; if the default settings perform poorly, adjust them.
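The product above is easy to check before starting a large batch. The following is a minimal Python sketch of that arithmetic only; `total_concurrency` is a hypothetical helper name and is not part of ossutil.

```python
def total_concurrency(jobs: int, parallel: int) -> int:
    """Return jobs * parallel, the total number of concurrent tasks.

    jobs     -- value intended for --jobs (files handled at the same time)
    parallel -- value intended for --parallel (tasks within one file)
    """
    if jobs < 1 or parallel < 1:
        raise ValueError("jobs and parallel must both be at least 1")
    return jobs * parallel
```

For example, `total_concurrency(3, 8)` gives 24, so a machine that struggles with a few dozen simultaneous transfers would probably need smaller values.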
Note:
- If --parallel and --jobs are too large, the overhead of switching between threads can degrade upload/download/copy performance, so set them according to your machine's capacity.
- When tuning performance, start with small values for both options and increase them step by step.
- If --parallel and --jobs are too large and machine resources are limited, an "EOF" error may occur because the network transfer becomes too slow. In that case, reduce --jobs and --parallel.
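The "start small and increase step by step" advice can be sketched as a simple search loop. This is only an illustration under stated assumptions: `tune_step_by_step` and the `measure` callback are hypothetical names, and `measure(jobs, parallel)` is assumed to run a representative transfer with those settings and return its throughput (for example in MB/s).

```python
from typing import Callable, Tuple


def tune_step_by_step(
    measure: Callable[[int, int], float],
    start: Tuple[int, int] = (1, 1),
    max_jobs: int = 16,
    max_parallel: int = 16,
    min_gain: float = 1.05,
) -> Tuple[int, int]:
    """Raise jobs or parallel by one while throughput keeps improving.

    A step is kept only if it improves throughput by at least min_gain
    (5% by default); otherwise the search stops at the current setting.
    The upper limits are arbitrary illustrative caps, not ossutil limits.
    """
    jobs, parallel = start
    best = measure(jobs, parallel)
    improved = True
    while improved:
        improved = False
        # Try one step up in each option and keep the first that helps.
        for cand_jobs, cand_parallel in ((jobs + 1, parallel), (jobs, parallel + 1)):
            if cand_jobs > max_jobs or cand_parallel > max_parallel:
                continue
            rate = measure(cand_jobs, cand_parallel)
            if rate >= best * min_gain:
                jobs, parallel, best = cand_jobs, cand_parallel, rate
                improved = True
                break
    return jobs, parallel
```

In practice `measure` would time a real ossutil run on a sample of the data. The loop stops as soon as a larger setting no longer pays off, which is the point where thread switching or slow transfers start to hurt.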
Best Practice
1. For a single small file
e.g., test1.log,
- file-size = 50MB
- The user wants to upload it with MPU (MultiPart Upload)
Then set --bigfile-threshold to a value below 50MB, for example 20MB (20971520):
ossutil cp test1.log oss://<bucket_name> --bigfile-threshold=20971520
ossutil will then upload test1.log with MPU.
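The byte values used with --bigfile-threshold are the MB figures multiplied by 1024 * 1024. The sketch below shows that conversion and the threshold comparison; the helper names are hypothetical, and the strict "greater than" comparison follows the "file size > 100MB" wording above, so the behavior at exactly the threshold is an assumption.

```python
MIB = 1024 * 1024  # the "MB" figures in this document are multiples of 1024 * 1024


def mb_to_bytes(mb: int) -> int:
    """Convert an MB figure to the byte value passed to --bigfile-threshold."""
    return mb * MIB


def uses_multipart(file_size: int, threshold: int = 100 * MIB) -> bool:
    """Return True if a file of file_size bytes would exceed the threshold.

    Assumption: files strictly larger than the threshold are uploaded
    with MPU; the exact boundary behavior of ossutil is not checked here.
    """
    return file_size > threshold
```

With these helpers, `mb_to_bytes(20)` yields the 20971520 used in the command above, and a 50MB file only passes the check once the threshold is lowered below its size.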
2. For a single big file
e.g., test2.log,
- file-size = 150MB
- The user wants to upload it with MPU
Then there is no need to specify --bigfile-threshold, because the file size exceeds the default value (100MB).
ossutil cp test2.log oss://<bucket_name>
ossutil will then upload test2.log with MPU.
3. For multiple files
In this case, --bigfile-threshold applies to each file individually.
3.1 If --jobs and --parallel are not specified
e.g., ossutil cp <local_dir> oss://<bucket_name> -r (-r is required because the source is a directory)
Then the concurrency is calculated automatically:
- If ossutil version <= 1.4.0
max parallel is 15. By default jobs = 5, so
max concurrency = 75
- If ossutil version = 1.4.1
max parallel is 12. By default jobs = 3, so
max concurrency = 36
3.2 If --jobs and --parallel are specified
e.g., ossutil cp <local_dir> oss://<bucket_name> -r --jobs=3 --parallel=8
Then, concurrency = jobs * parallel = 3 * 8 = 24
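When trying several settings, it can help to assemble the command from a script instead of typing it each time. The following is a minimal sketch, not an official wrapper: `build_cp_args` is a hypothetical helper, and it only emits options mentioned in this document plus -r, which ossutil needs when the source is a directory.

```python
from typing import List, Optional


def build_cp_args(
    src: str,
    dst: str,
    jobs: Optional[int] = None,
    parallel: Optional[int] = None,
    bigfile_threshold: Optional[int] = None,
    recursive: bool = False,
) -> List[str]:
    """Build the argument list for an `ossutil cp` invocation.

    Options left as None are omitted, so ossutil falls back to its
    own defaults for them.
    """
    args = ["ossutil", "cp", src, dst]
    if recursive:
        args.append("-r")  # needed when src is a directory
    if jobs is not None:
        args.append(f"--jobs={jobs}")
    if parallel is not None:
        args.append(f"--parallel={parallel}")
    if bigfile_threshold is not None:
        args.append(f"--bigfile-threshold={bigfile_threshold}")
    return args
```

On a machine where ossutil is installed and configured, the returned list can be passed to `subprocess.run(args, check=True)`. Building a list rather than a single string avoids shell quoting problems with unusual file names.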