
How do you use the Hadoop DistCp API in Hadoop?


Asked by 游客yzrzs5mf6j7yy on 2021-12-06 00:13:20
1 answer
    [root@node105 ~]# hadoop distcp
    usage: distcp OPTIONS [source_path...] <target_path>
                  OPTIONS
     -append                       Reuse existing data in target files and
                                   append new data to them if possible
     -async                        Should distcp execution be blocking
     -atomic                       Commit all changes or none
     -bandwidth <arg>              Specify bandwidth per map in MB
     -blocksperchunk <arg>         If set to a positive value, fileswith more
                                   blocks than this value will be split into
                                   chunks of <blocksperchunk> blocks to be
                                   transferred in parallel, and reassembled on
                                   the destination. By default,
                                   <blocksperchunk> is 0 and the files will be
                                   transmitted in their entirety without
                                   splitting. This switch is only applicable
                                   when the source file system implements
                                   getBlockLocations method and the target
                                   file system implements concat method
     -copybuffersize <arg>         Size of the copy buffer to use. By default
                                   <copybuffersize> is 8192B.
     -delete                       Delete from target, files missing in source
     -diff <arg>                   Use snapshot diff report to identify the
                                   difference between source and target
     -f <arg>                      List of files that need to be copied
     -filelimit <arg>              (Deprecated!) Limit number of files copied
                                   to <= n
     -filters <arg>                The path to a file containing a list of
                                   strings for paths to be excluded from the
                                   copy.
     -i                            Ignore failures during copy
     -log <arg>                    Folder on DFS where distcp execution logs
                                   are saved
     -m <arg>                      Max number of concurrent maps to use for
                                   copy
     -mapredSslConf <arg>          Configuration for ssl config file, to use
                                   with hftps://. Must be in the classpath.
     -numListstatusThreads <arg>   Number of threads to use for building file
                                   listing (max 40).
     -overwrite                    Choose to overwrite target files
                                   unconditionally, even if they exist.
     -p <arg>                      preserve status (rbugpcaxt)(replication,
                                   block-size, user, group, permission,
                                   checksum-type, ACL, XATTR, timestamps). If
                                   -p is specified with no <arg>, then
                                   preserves replication, block size, user,
                                   group, permission, checksum type and
                                   timestamps. raw.* xattrs are preserved when
                                   both the source and destination paths are
                                   in the /.reserved/raw hierarchy (HDFS
                                   only). raw.* xattrpreservation is
                                   independent of the -p flag. Refer to the
                                   DistCp documentation for more details.
     -rdiff <arg>                  Use target snapshot diff report to identify
                                   changes made on target
     -sizelimit <arg>              (Deprecated!) Limit number of files copied
                                   to <= n bytes
     -skipcrccheck                 Whether to skip CRC checks between source
                                   and target paths.
     -strategy <arg>               Copy strategy to use. Default is dividing
                                   work based on file sizes
     -tmp <arg>                    Intermediate work path to be used for
                                   atomic commit
     -update                       Update target, copying only missingfiles or
                                   directories
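
To show how the options above combine in practice, here is a minimal sketch of an incremental sync between two clusters. The NameNode URIs, the map count, and the bandwidth value are hypothetical placeholders, not values from the original answer; adjust them to your environment. The script only prints the command by default, so it can be tried without a cluster; set `DRY_RUN=0` to actually launch the job.

```shell
#!/usr/bin/env bash
# Sketch only: wraps `hadoop distcp` using flags from the usage text above.
# All paths and numeric values are hypothetical examples.
set -euo pipefail

# Build the distcp command for a source and a target path.
# Prints it when DRY_RUN=1 (the default), runs it when DRY_RUN=0.
run_distcp() {
  local src="$1" dst="$2"
  # -update    : copy only files or directories missing on the target
  # -delete    : delete target files missing in the source (used together with -update here)
  # -m         : max number of concurrent maps
  # -bandwidth : bandwidth per map in MB
  local cmd=(hadoop distcp -update -delete -m 20 -bandwidth 50 "$src" "$dst")
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

# Hypothetical source and target URIs.
run_distcp "hdfs://nn1.example.com:8020/data/logs" \
           "hdfs://nn2.example.com:8020/backup/logs"
```

If you need to call DistCp from Java rather than from the shell, the same options are generally passed as an argument array to the `org.apache.hadoop.tools.DistCp` tool; check the Javadoc of your Hadoop version for the exact classes available.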
    
    2021-12-06 00:13:38