Transformers 4.37 中文文档（一百）（1）-阿里云开发者社区

原文：huggingface.co/docs/transformers

图像处理器的实用工具

原文：huggingface.co/docs/transformers/v4.37.2/en/internal/image_processing_utils

此页面列出了图像处理器使用的所有实用函数，主要是用于处理图像的功能转换。

大多数情况下，这些只有在您研究库中图像处理器的代码时才有用。

图像转换

`transformers.image_transforms.center_crop`

( image: ndarray size: Tuple data_format: Union = None input_data_format: Union = None return_numpy: Optional = None ) → export const metadata = 'undefined';np.ndarray

参数

image（np.ndarray）—要裁剪的图像。
size（Tuple[int, int]）—裁剪图像的目标尺寸。
data_format（str或ChannelDimension，可选）—输出图像的通道维度格式。可以是以下之一：

"channels_first"或ChannelDimension.FIRST：图像以（num_channels，height，width）格式。
"channels_last"或ChannelDimension.LAST：图像以（height，width，num_channels）格式。如果未设置，将使用输入图像的推断格式。

input_data_format（str或ChannelDimension，可选）—输入图像的通道维度格式。可以是以下之一：

"channels_first"或ChannelDimension.FIRST：图像以（num_channels，height，width）格式。
"channels_last"或ChannelDimension.LAST：图像以（height，width，num_channels）格式。如果未设置，将使用输入图像的推断格式。

return_numpy（bool，可选）—是否将裁剪后的图像作为 numpy 数组返回。用于与先前的 ImageFeatureExtractionMixin 方法向后兼容。

未设置：将返回与输入图像相同类型。
True：将返回一个 numpy 数组。
False：将返回一个PIL.Image.Image对象。

np.ndarray

裁剪后的图像。

使用中心裁剪将image裁剪到指定的size。请注意，如果图像太小而无法裁剪到给定的尺寸，将进行填充（因此返回的结果始终为size的大小）。

`transformers.image_transforms.center_to_corners_format`

<来源>

( bboxes_center: TensorType )

将边界框从中心格式转换为角落格式。

中心格式：包含框的中心坐标及其宽度、高度尺寸（center_x，center_y，width，height）角落格式：包含框的左上角和右下角的坐标（top_left_x，top_left_y，bottom_right_x，bottom_right_y）

`transformers.image_transforms.corners_to_center_format`

<来源>

( bboxes_corners: TensorType )

将边界框从角落格式转换为中心格式。

角落格式：包含框的左上角和右下角的坐标（top_left_x，top_left_y，bottom_right_x，bottom_right_y）中心格式：包含框的中心坐标及其宽度、高度尺寸（center_x，center_y，width，height）

`transformers.image_transforms.id_to_rgb`

<来源>

( id_map )

将唯一 ID 转换为 RGB 颜色。

`transformers.image_transforms.normalize`

<来源>

( image: ndarray mean: Union std: Union data_format: Optional = None input_data_format: Union = None )

参数

image（np.ndarray）—要归一化的图像。
mean（float或Iterable[float]）—用于归一化的均值。
std（float或Iterable[float]）—用于归一化的标准差。
data_format（ChannelDimension，可选）—输出图像的通道维度格式。如果未设置，将使用从输入推断的格式。
input_data_format（ChannelDimension，可选）— 输入图像的通道维度格式。如果未设置，将使用从输入中推断的格式。

使用由mean和std指定的均值和标准差对image进行归一化。

图像=（图像-平均值）/标准差

`transformers.image_transforms.pad`

<来源>

( image: ndarray padding: Union mode: PaddingMode = <PaddingMode.CONSTANT: 'constant'> constant_values: Union = 0.0 data_format: Union = None input_data_format: Union = None ) → export const metadata = 'undefined';np.ndarray

参数

image（np.ndarray）— 要填充的图像。
padding（int或Tuple[int, int]或Iterable[Tuple[int, int]]）— 要应用到高度、宽度轴边缘的填充。可以是以下三种格式之一：

（（before_height，after_height），（before_width，after_width））每个轴的唯一填充宽度。
（（before，after），）产生相同的高度和宽度的前后填充。
（填充，）或 int 是所有轴的宽度的快捷方式为 before = after = pad。

mode（PaddingMode）— 要使用的填充模式。可以是以下之一：

“常数”：用常数值填充。
“反射”：用沿每个轴的向量的第一个和最后一个值的镜像填充。
“复制”：用沿每个轴的数组边缘的最后一个值的复制填充。
“对称”：用沿数组边缘镜像的向量的反射填充。

constant_values（float或Iterable[float]，可选）— 如果mode为"constant"，则用于填充的值。
data_format（str或ChannelDimension，可选）— 输出图像的通道维度格式。可以是以下之一：

"channels_first"或ChannelDimension.FIRST：图像以（通道数，高度，宽度）格式。
"channels_last"或ChannelDimension.LAST：图像以（高度，宽度，通道数）格式。如果未设置，将使用与输入图像相同的格式。

input_data_format（str或ChannelDimension，可选）— 输入图像的通道维度格式。可以是以下之一：

"channels_first"或ChannelDimension.FIRST：图像以（通道数，高度，宽度）格式。
"channels_last"或ChannelDimension.LAST：图像以（高度，宽度，通道数）格式。如果未设置，将使用从输入图像推断的格式。

np.ndarray

填充后的图像。

用指定的（高度，宽度）填充和模式填充图像。

`transformers.image_transforms.rgb_to_id`

<来源>

( color )

将 RGB 颜色转换为唯一 ID。

`transformers.image_transforms.rescale`

<来源>

( image: ndarray scale: float data_format: Optional = None dtype: dtype = <class 'numpy.float32'> input_data_format: Union = None ) → export const metadata = 'undefined';np.ndarray

参数

image（np.ndarray）— 要重新缩放的图像。
scale（float）— 用于重新缩放图像的比例。
data_format（ChannelDimension，可选）— 图像的通道维度格式。如果未提供，将与输入图像相同。
dtype（np.dtype，可选，默认为np.float32）— 输出图像的数据类型。默认为np.float32。用于与特征提取器的向后兼容性。
input_data_format（ChannelDimension，可选）— 输入图像的通道维度格式。如果未提供，将从输入图像中推断。

np.ndarray

重新缩放后的图像。

通过scale重新缩放图像。

`transformers.image_transforms.resize`

<来源>

( image: ndarray size: Tuple resample: PILImageResampling = None reducing_gap: Optional = None data_format: Optional = None return_numpy: bool = True input_data_format: Union = None ) → export const metadata = 'undefined';np.ndarray

参数

image（np.ndarray）— 要调整大小的图像。
size（Tuple[int, int]）— 用于调整图像大小的尺寸。
resample（int，可选，默认为PILImageResampling.BILINEAR）— 用于重采样的滤波器。
reducing_gap（int，可选）— 通过在两个步骤中调整图像大小来应用优化。reducing_gap越大，结果越接近公平重采样。有关更多详细信息，请参阅相应的 Pillow 文档。
data_format (ChannelDimension, 可选) — 输出图像的通道维度格式。如果未设置，将使用从输入推断出的格式。
return_numpy (bool, 可选, 默认为True) — 是否将调整大小后的图像作为 numpy 数组返回。如果为 False，则返回一个PIL.Image.Image对象。
input_data_format (ChannelDimension, 可选) — 输入图像的通道维度格式。如果未设置，将使用从输入推断出的格式。

np.ndarray

调整大小后的图像。

使用 PIL 库将image调整为由size指定的(height, width)大小。

`transformers.image_transforms.to_pil_image`

<来源>

( image: Union do_rescale: Optional = None input_data_format: Union = None ) → export const metadata = 'undefined';PIL.Image.Image

参数

image (PIL.Image.Image或numpy.ndarray或torch.Tensor或tf.Tensor) — 要转换为PIL.Image格式的图像。
do_rescale (bool, 可选) — 是否应用缩放因子（使像素值在 0 和 255 之间为整数）。如果图像类型是浮点类型，并且转换为int会导致精度损失，则默认为True，否则为False。
input_data_format (ChannelDimension, 可选) — 输入图像的通道维度格式。如果未设置，将使用从输入推断出的格式。

PIL.Image.Image

转换后的图像。

将image转换为 PIL 图像。可选地重新缩放它，并在需要时将通道维度放回作为最后一个轴。

ImageProcessingMixin

`class transformers.ImageProcessingMixin`

<来源>

( **kwargs )

这是一个用于为顺序和图像特征提取器提供保存/加载功能的图像处理器 mixin。

`fetch_images`

<来源>

( image_url_or_urls: Union )

将单个或一组 url 转换为相应的PIL.Image对象。

如果传递单个 url，则返回值将是单个对象。如果传递列表，则返回对象列表。

`from_dict`

<来源>

( image_processor_dict: Dict **kwargs ) → export const metadata = 'undefined';ImageProcessingMixin

参数

image_processor_dict (Dict[str, Any]) — 将用于实例化图像处理器对象的字典。可以通过利用 to_dict() 方法从预训练检查点中检索这样的字典。
kwargs (Dict[str, Any]) — 用于初始化图像处理器对象的其他参数。

ImageProcessingMixin

从这些参数实例化的图像处理器对象。

从 Python 参数字典中实例化一种类型的 ImageProcessingMixin。

`from_json_file`

<来源>

( json_file: Union ) → export const metadata = 'undefined';A image processor of type ImageProcessingMixin

参数

json_file (str或os.PathLike) — 包含参数的 JSON 文件的路径。

类型为 ImageProcessingMixin 的图像处理器。

从该 JSON 文件实例化的图像处理器对象。

从参数 JSON 文件的路径实例化类型为 ImageProcessingMixin 的图像处理器。

`from_pretrained`

<来源>

( pretrained_model_name_or_path: Union cache_dir: Union = None force_download: bool = False local_files_only: bool = False token: Union = None revision: str = 'main' **kwargs )

参数

pretrained_model_name_or_path (str或os.PathLike) — 这可以是：

一个字符串，预训练图像处理器的模型 ID，托管在 huggingface.co 上的模型存储库内。有效的模型 ID 可以位于根级别，如bert-base-uncased，或者在用户或组织名称下命名空间化，如dbmdz/bert-base-german-cased。
一个包含使用 save_pretrained()方法保存的图像处理器文件的目录路径，例如，./my_model_directory/。
保存的图像处理器 JSON 文件的路径或 URL，例如，./my_model_directory/preprocessor_config.json。

cache_dir（str或os.PathLike，可选）— 如果不应使用标准缓存，则应将下载的预训练模型图像处理器缓存在其中的目录路径。
force_download（bool，可选，默认为False）— 是否强制（重新）下载图像处理器文件并覆盖缓存版本（如果存在）。
resume_download（bool，可选，默认为False）— 是否删除接收不完整的文件。如果存在这样的文件，尝试恢复下载。
proxies（Dict[str, str]，可选）— 要使用的代理服务器字典，按协议或端点分类，例如，{'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}。代理将在每个请求上使用。
token（str或bool，可选）— 用作远程文件的 HTTP bearer 授权的令牌。如果为True，或未指定，将使用运行huggingface-cli login时生成的令牌（存储在~/.huggingface）。
revision（str，可选，默认为"main"）— 要使用的特定模型版本。它可以是分支名称、标签名称或提交 ID，因为我们在 huggingface.co 上使用基于 git 的系统存储模型和其他工件，所以revision可以是 git 允许的任何标识符。

从图像处理器实例化类型 ImageProcessingMixin。

示例：

# We can't instantiate directly the base class *ImageProcessingMixin* so let's show the examples on a
# derived class: *CLIPImageProcessor*
image_processor = CLIPImageProcessor.from_pretrained(
    "openai/clip-vit-base-patch32"
)  # Download image_processing_config from huggingface.co and cache.
image_processor = CLIPImageProcessor.from_pretrained(
    "./test/saved_model/"
)  # E.g. image processor (or model) was saved using *save_pretrained('./test/saved_model/')*
image_processor = CLIPImageProcessor.from_pretrained("./test/saved_model/preprocessor_config.json")
image_processor = CLIPImageProcessor.from_pretrained(
    "openai/clip-vit-base-patch32", do_normalize=False, foo=False
)
assert image_processor.do_normalize is False
image_processor, unused_kwargs = CLIPImageProcessor.from_pretrained(
    "openai/clip-vit-base-patch32", do_normalize=False, foo=False, return_unused_kwargs=True
)
assert image_processor.do_normalize is False
assert unused_kwargs == {"foo": False}

`get_image_processor_dict`

<来源>

( pretrained_model_name_or_path: Union **kwargs ) → export const metadata = 'undefined';Tuple[Dict, Dict]

参数

pretrained_model_name_or_path（str或os.PathLike）— 我们想要参数字典的预训练检查点的标识符。
subfolder（str，可选，默认为""）— 如果相关文件位于 huggingface.co 上模型存储库的子文件夹中，您可以在此处指定文件夹名称。

Tuple[Dict, Dict]

将用于实例化图像处理器对象的字典。

从pretrained_model_name_or_path解析为参数字典，用于实例化类型为~image_processor_utils.ImageProcessingMixin的图像处理器，使用from_dict。

`push_to_hub`

<来源>

( repo_id: str use_temp_dir: Optional = None commit_message: Optional = None private: Optional = None token: Union = None max_shard_size: Union = '5GB' create_pr: bool = False safe_serialization: bool = True revision: str = None commit_description: str = None tags: Optional = None **deprecated_kwargs )

参数

repo_id（str）— 您要推送图像处理器的存储库名称。在推送到给定组织时，应包含您的组织名称。
use_temp_dir（bool，可选）— 是否使用临时目录来存储在推送到 Hub 之前保存的文件。如果没有名为repo_id的目录，则默认为True，否则为False。
commit_message（str，可选）— 推送时要提交的消息。默认为"Upload image processor"。
private（bool，可选）— 创建的存储库是否应为私有。
token（bool或str，可选）— 用作远程文件的 HTTP bearer 授权的令牌。如果repo_url未指定，将默认为True。
max_shard_size（int或str，可选，默认为"5GB"）— 仅适用于模型。在分片之前的检查点的最大大小。然后，检查点将分片，每个分片的大小都小于此大小。如果表示为字符串，需要是数字后跟一个单位（如"5MB"）。我们将其默认设置为"5GB"，以便用户可以在免费的 Google Colab 实例上轻松加载模型，而不会出现 CPU OOM 问题。
create_pr（bool，可选，默认为False）— 是否创建一个带有上传文件的 PR 或直接提交。
safe_serialization（bool，可选，默认为True）— 是否将模型权重转换为 safetensors 格式以进行更安全的序列化。
revision（str，可选）— 要推送上传文件的分支。
commit_description（str，可选）— 将要创建的提交的描述
tags（List[str]，可选）— 要在 Hub 上推送的标签列表。

将图像处理器文件上传到🤗模型中心。

示例：

from transformers import AutoImageProcessor
image processor = AutoImageProcessor.from_pretrained("bert-base-cased")
# Push the image processor to your namespace with the name "my-finetuned-bert".
image processor.push_to_hub("my-finetuned-bert")
# Push the image processor to an organization with the name "my-finetuned-bert".
image processor.push_to_hub("huggingface/my-finetuned-bert")

`register_for_auto_class`

<来源>

( auto_class = 'AutoImageProcessor' )

参数

auto_class（str或type，可选，默认为"AutoImageProcessor" ）— 要将此新图像处理器注册到的自动类。

将此类注册到给定的 auto 类。这仅应用于自定义图像处理器，因为库中的处理器已经与AutoImageProcessor映射。

此 API 是实验性的，可能在下一个版本中有一些轻微的破坏性更改。

`save_pretrained`

<来源>

( save_directory: Union push_to_hub: bool = False **kwargs )

参数

save_directory（str或os.PathLike）— 将保存图像处理器 JSON 文件的目录（如果不存在，将创建）。
push_to_hub（bool，可选，默认为False）— 是否在保存后将模型推送到 Hugging Face 模型中心。您可以使用repo_id指定要推送到的存储库（将默认为您的命名空间中save_directory的名称）。
kwargs（Dict[str, Any]，可选）— 传递给 push_to_hub()方法的额外关键字参数。

将图像处理器对象保存到目录save_directory中，以便可以使用 from_pretrained()类方法重新加载。

`to_dict`

<来源>

( ) → export const metadata = 'undefined';Dict[str, Any]

Dict[str, Any]

包含此图像处理器实例的所有属性的字典。

将此实例序列化为 Python 字典。

`to_json_file`

<来源>

( json_file_path: Union )

参数

json_file_path（str或os.PathLike）— 保存此 image_processor 实例参数的 JSON 文件路径。

将此实例保存到 JSON 文件中。

`to_json_string`

<来源>

( ) → export const metadata = 'undefined';str

str

包含构成此 feature_extractor 实例的所有属性的字符串，以 JSON 格式呈现。

将此实例序列化为 JSON 字符串。

Transformers 4.37 中文文档（一百）（2）https://developer.aliyun.com/article/1565786

Transformers 4.37 中文文档（一百）（1）

图像处理器的实用工具

图像转换

`transformers.image_transforms.center_crop`

`transformers.image_transforms.center_to_corners_format`

`transformers.image_transforms.corners_to_center_format`

`transformers.image_transforms.id_to_rgb`

`transformers.image_transforms.normalize`

`transformers.image_transforms.pad`

`transformers.image_transforms.rgb_to_id`

`transformers.image_transforms.rescale`

`transformers.image_transforms.resize`

`transformers.image_transforms.to_pil_image`

ImageProcessingMixin

`class transformers.ImageProcessingMixin`

`fetch_images`

`from_dict`

`from_json_file`

`from_pretrained`

`get_image_processor_dict`

`push_to_hub`

`register_for_auto_class`

`save_pretrained`

`to_dict`

`to_json_file`

`to_json_string`

热门文章

最新文章

相关课程

相关电子书

热门

活动广场

任务中心

开发者评测

高校计划

乘风者计划

训练营

阿里云MVP

话题

直播

下载

镜像站

技术资料

插件

Transformers 4.37 中文文档（一百）（1）

图像处理器的实用工具

图像转换

transformers.image_transforms.center_crop

transformers.image_transforms.center_to_corners_format

transformers.image_transforms.corners_to_center_format

transformers.image_transforms.id_to_rgb

transformers.image_transforms.normalize

transformers.image_transforms.pad

transformers.image_transforms.rgb_to_id

transformers.image_transforms.rescale

transformers.image_transforms.resize

transformers.image_transforms.to_pil_image

ImageProcessingMixin

class transformers.ImageProcessingMixin

fetch_images

from_dict

from_json_file

from_pretrained

get_image_processor_dict

push_to_hub

register_for_auto_class

save_pretrained

to_dict

to_json_file

to_json_string

热门文章

最新文章

相关课程

相关电子书

`transformers.image_transforms.center_crop`

`transformers.image_transforms.center_to_corners_format`

`transformers.image_transforms.corners_to_center_format`

`transformers.image_transforms.id_to_rgb`

`transformers.image_transforms.normalize`

`transformers.image_transforms.pad`

`transformers.image_transforms.rgb_to_id`

`transformers.image_transforms.rescale`

`transformers.image_transforms.resize`

`transformers.image_transforms.to_pil_image`

`class transformers.ImageProcessingMixin`

`fetch_images`

`from_dict`

`from_json_file`

`from_pretrained`

`get_image_processor_dict`

`push_to_hub`

`register_for_auto_class`

`save_pretrained`

`to_dict`

`to_json_file`

`to_json_string`