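TensorFlow Tutorial: API Docs 6.3.6. Image

Introduction:

This document is part of the TensorFlow reference documentation. This repost is authorized by the TensorFlow Chinese community.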


Images

Note: Functions taking Tensor arguments can also take anything accepted by tf.convert_to_tensor.

Encoding and Decoding

TensorFlow provides Ops to decode and encode JPEG and PNG formats. Encoded images are represented by scalar string Tensors, decoded images by 3-D uint8 tensors of shape [height, width, channels].

The encode and decode Ops apply to one image at a time. Their input and output are all of variable size. If you need fixed size images, pass the output of the decode Ops to one of the cropping and resizing Ops.

Note: The PNG encode and decode Ops support RGBA, but the conversion Ops presently only support RGB, HSV, and grayscale.


tf.image.decode_jpeg(contents, channels=None, ratio=None, fancy_upscaling=None, try_recover_truncated=None, acceptable_fraction=None, name=None)

Decode a JPEG-encoded image to a uint8 tensor.

The attr channels indicates the desired number of color channels for the decoded image.

Accepted values are:

  • 0: Use the number of channels in the JPEG-encoded image.
  • 1: output a grayscale image.
  • 3: output an RGB image.

If needed, the JPEG-encoded image is transformed to match the requested number of color channels.

The attr ratio allows downscaling the image by an integer factor during decoding. Allowed values are: 1, 2, 4, and 8. This is much faster than downscaling the image later.

Args:
  • contents: A Tensor of type string. 0-D. The JPEG-encoded image.
  • channels: An optional int. Defaults to 0. Number of color channels for the decoded image.
  • ratio: An optional int. Defaults to 1. Downscaling ratio.
  • fancy_upscaling: An optional bool. Defaults to True. If true use a slower but nicer upscaling of the chroma planes (yuv420/422 only).
  • try_recover_truncated: An optional bool. Defaults to False. If true try to recover an image from truncated input.
  • acceptable_fraction: An optional float. Defaults to 1. The minimum required fraction of lines before a truncated input is accepted.
  • name: A name for the operation (optional).
Returns:

Tensor of type uint8. 3-D with shape [height, width, channels].


tf.image.encode_jpeg(image, format=None, quality=None, progressive=None, optimize_size=None, chroma_downsampling=None, density_unit=None, x_density=None, y_density=None, xmp_metadata=None, name=None)

JPEG-encode an image.

image is a 3-D uint8 Tensor of shape [height, width, channels].

The attr format can be used to override the color format of the encoded output. Values can be:

  • '': Use a default format based on the number of channels in the image.
  • grayscale: Output a grayscale JPEG image. The channels dimension of image must be 1.
  • rgb: Output an RGB JPEG image. The channels dimension of image must be 3.

If format is not specified or is the empty string, a default format is picked based on the number of channels in image:

  • 1: Output a grayscale image.
  • 3: Output an RGB image.
Args:
  • image: A Tensor of type uint8. 3-D with shape [height, width, channels].
  • format: An optional string from: "", "grayscale", "rgb". Defaults to "". Per pixel image format.
  • quality: An optional int. Defaults to 95. Quality of the compression from 0 to 100 (higher is better and slower).
  • progressive: An optional bool. Defaults to False. If True, create a JPEG that loads progressively (coarse to fine).
  • optimize_size: An optional bool. Defaults to False. If True, spend CPU/RAM to reduce size with no quality change.
  • chroma_downsampling: An optional bool. Defaults to True. See http://en.wikipedia.org/wiki/Chroma_subsampling.
  • density_unit: An optional string from: "in", "cm". Defaults to "in". Unit used to specify x_density and y_density: pixels per inch ('in') or centimeter ('cm').
  • x_density: An optional int. Defaults to 300. Horizontal pixels per density unit.
  • y_density: An optional int. Defaults to 300. Vertical pixels per density unit.
  • xmp_metadata: An optional string. Defaults to "". If not empty, embed this XMP metadata in the image header.
  • name: A name for the operation (optional).
Returns:

Tensor of type string. 0-D. JPEG-encoded image.


tf.image.decode_png(contents, channels=None, name=None)

Decode a PNG-encoded image to a uint8 tensor.

The attr channels indicates the desired number of color channels for the decoded image.

Accepted values are:

  • 0: Use the number of channels in the PNG-encoded image.
  • 1: output a grayscale image.
  • 3: output an RGB image.
  • 4: output an RGBA image.

If needed, the PNG-encoded image is transformed to match the requested number of color channels.

Args:
  • contents: A Tensor of type string. 0-D. The PNG-encoded image.
  • channels: An optional int. Defaults to 0. Number of color channels for the decoded image.
  • name: A name for the operation (optional).
Returns:

Tensor of type uint8. 3-D with shape [height, width, channels].


tf.image.encode_png(image, compression=None, name=None)

PNG-encode an image.

image is a 3-D uint8 Tensor of shape [height, width, channels] where channels is:

  • 1: for grayscale.
  • 3: for RGB.
  • 4: for RGBA.

The ZLIB compression level, compression, can be -1 for the PNG-encoder default or a value from 0 to 9. 9 is the highest compression level, generating the smallest output, but is slower.

Args:
  • image: A Tensor of type uint8. 3-D with shape [height, width, channels].
  • compression: An optional int. Defaults to -1. Compression level.
  • name: A name for the operation (optional).
Returns:

Tensor of type string. 0-D. PNG-encoded image.
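
The trade-off behind the compression argument can be previewed with Python's standard zlib module, which uses the same level scale (-1 for the library default, otherwise 0 to 9). This is a minimal sketch on made-up byte data, not the PNG encoder itself; compressed_sizes and the sample pixels buffer are hypothetical names introduced for illustration.

```python
import zlib


def compressed_sizes(data, levels=(-1, 0, 1, 9)):
    """Return {level: compressed byte count} for each zlib level."""
    return {level: len(zlib.compress(data, level)) for level in levels}


# Hypothetical stand-in for raw image bytes: a highly repetitive pattern.
pixels = bytes([0, 64, 128, 255]) * 4096

sizes = compressed_sizes(pixels)
for level, size in sizes.items():
    print("level", level, "->", size, "bytes")
```

Level 0 stores the data uncompressed, so it is the largest output; higher levels spend more CPU time searching for a smaller encoding.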

Resizing

The resizing Ops accept input images as tensors of several types. They always output resized images as float32 tensors.

The convenience function resize_images() supports both 4-D and 3-D tensors as input and output. 4-D tensors are for batches of images, 3-D tensors for individual images.

Other resizing Ops only support 4-D batches of images as input: resize_area, resize_bicubic, resize_bilinear, resize_nearest_neighbor.

Example:

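import tensorflow as tf

# Decode a JPG image and resize it to 299 by 299.
# resize_bilinear expects a 4-D batch, so add a batch dimension first.
image = tf.image.decode_jpeg(...)
resized_image = tf.image.resize_bilinear(tf.expand_dims(image, 0), [299, 299])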

See the Queue examples for how to add images to a Queue after resizing them to a fixed size, and how to dequeue batches of resized images from the Queue.


tf.image.resize_images(images, new_height, new_width, method=0)

Resize images to new_width, new_height using the specified method.

Resized images will be distorted if their original aspect ratio is not the same as new_width, new_height. To avoid distortions see resize_image_with_crop_or_pad.

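method can be one of:

  • ResizeMethod.BILINEAR: Bilinear interpolation.
  • ResizeMethod.NEAREST_NEIGHBOR: Nearest neighbor interpolation.
  • ResizeMethod.BICUBIC: Bicubic interpolation.
  • ResizeMethod.AREA: Area interpolation.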

Args:
  • images: 4-D Tensor of shape [batch, height, width, channels] or 3-D Tensor of shape [height, width, channels].
  • new_height: integer.
  • new_width: integer.
  • method: ResizeMethod. Defaults to ResizeMethod.BILINEAR.
Raises:
  • ValueError: if the shape of images is incompatible with the shape arguments to this function
  • ValueError: if an unsupported resize method is specified.
Returns:

If images was 4-D, a 4-D float Tensor of shape [batch, new_height, new_width, channels]. If images was 3-D, a 3-D float Tensor of shape [new_height, new_width, channels].


tf.image.resize_area(images, size, name=None)

Resize images to size using area interpolation.

Input images can be of different types but output images are always float.

Args:
  • images: A Tensor. Must be one of the following types: uint8, int8, int32, float32, float64. 4-D with shape [batch, height, width, channels].
  • size: A 1-D int32 Tensor of 2 elements: new_height, new_width. The new size for the images.
  • name: A name for the operation (optional).
Returns:

Tensor of type float32. 4-D with shape [batch, new_height, new_width, channels].


tf.image.resize_bicubic(images, size, name=None)

Resize images to size using bicubic interpolation.

Input images can be of different types but output images are always float.

Args:
  • images: A Tensor. Must be one of the following types: uint8, int8, int32, float32, float64. 4-D with shape [batch, height, width, channels].
  • size: A 1-D int32 Tensor of 2 elements: new_height, new_width. The new size for the images.
  • name: A name for the operation (optional).
Returns:

Tensor of type float32. 4-D with shape [batch, new_height, new_width, channels].


tf.image.resize_bilinear(images, size, name=None)

Resize images to size using bilinear interpolation.

Input images can be of different types but output images are always float.

Args:
  • images: A Tensor. Must be one of the following types: uint8, int8, int32, float32, float64. 4-D with shape [batch, height, width, channels].
  • size: A 1-D int32 Tensor of 2 elements: new_height, new_width. The new size for the images.
  • name: A name for the operation (optional).
Returns:

Tensor of type float32. 4-D with shape [batch, new_height, new_width, channels].
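
To make the interpolation concrete, here is a minimal pure-Python sketch of bilinear resizing for one single-channel image stored as a 2-D list. It assumes a sampling grid aligned to the top-left corner and clamps neighbours at the image edge; the actual TensorFlow kernel may handle edges differently. The function name merely mirrors the Op and is not the library implementation.

```python
import math


def resize_bilinear(image, new_height, new_width):
    """Bilinearly resize a 2-D list of numbers; the result is always float."""
    height, width = len(image), len(image[0])
    y_scale = height / float(new_height)
    x_scale = width / float(new_width)
    output = []
    for y in range(new_height):
        src_y = y * y_scale
        top = int(math.floor(src_y))
        bottom = min(top + 1, height - 1)  # clamp at the last row
        y_lerp = src_y - top
        row = []
        for x in range(new_width):
            src_x = x * x_scale
            left = int(math.floor(src_x))
            right = min(left + 1, width - 1)  # clamp at the last column
            x_lerp = src_x - left
            # Interpolate along x on the two neighbouring rows, then along y.
            top_value = image[top][left] + (image[top][right] - image[top][left]) * x_lerp
            bottom_value = image[bottom][left] + (image[bottom][right] - image[bottom][left]) * x_lerp
            row.append(top_value + (bottom_value - top_value) * y_lerp)
        output.append(row)
    return output
```

Because every output value is a weighted average, integer inputs come back as floats, which matches the note that output images are always float.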


tf.image.resize_nearest_neighbor(images, size, name=None)

Resize images to size using nearest neighbor interpolation.

Input images can be of different types. Unlike the other resizing Ops, the output has the same type as the input.

Args:
  • images: A Tensor. Must be one of the following types: uint8, int8, int32, float32, float64. 4-D with shape [batch, height, width, channels].
  • size: A 1-D int32 Tensor of 2 elements: new_height, new_width. The new size for the images.
  • name: A name for the operation (optional).
Returns:

Tensor. Has the same type as images. 4-D with shape [batch, new_height, new_width, channels].
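
A minimal pure-Python sketch of nearest neighbor resizing follows, assuming one single-channel image stored as a 2-D list and a floor-based mapping from output to source coordinates. That mapping is an assumption made for illustration, not a statement about the TensorFlow kernel.

```python
def resize_nearest_neighbor(image, new_height, new_width):
    """Resize a 2-D list of pixels by copying the nearest source pixel."""
    height, width = len(image), len(image[0])
    return [
        # Integer arithmetic maps each output coordinate back to a source index.
        [image[(y * height) // new_height][(x * width) // new_width]
         for x in range(new_width)]
        for y in range(new_height)
    ]
```

Pixels are copied rather than averaged, which is why this method can return a tensor of the same type as its input.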

Cropping


tf.image.resize_image_with_crop_or_pad(image, target_height, target_width)

Crops and/or pads an image to a target width and height.

Resizes an image to a target width and height by either centrally cropping the image or padding it evenly with zeros.

If width or height is greater than the specified target_width or target_height respectively, this op centrally crops along that dimension. If width or height is smaller than the specified target_width or target_height respectively, this op centrally pads with 0 along that dimension.

Args:
  • image: 3-D tensor of shape [height, width, channels]
  • target_height: Target height.
  • target_width: Target width.
Raises:
  • ValueError: if target_height or target_width are zero or negative.
Returns:

Cropped and/or padded image of shape [target_height, target_width, channels]
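
The crop-or-pad rule can be sketched in pure Python as below. The sketch assumes a non-empty single-channel image stored as a 2-D list, and it assumes that when the difference is odd the extra row or column goes to the bottom or right; the rounding used by the real Op is not stated in this document.

```python
def resize_image_with_crop_or_pad(image, target_height, target_width):
    """Centrally crop and/or zero-pad a 2-D list of pixels."""
    if target_height <= 0 or target_width <= 0:
        raise ValueError("target_height and target_width must be positive")
    height, width = len(image), len(image[0])

    # Central crop along any dimension that is too large.
    top = max((height - target_height) // 2, 0)
    left = max((width - target_width) // 2, 0)
    cropped = [row[left:left + target_width]
               for row in image[top:top + target_height]]

    # Central zero-padding along any dimension that is too small.
    crop_height, crop_width = len(cropped), len(cropped[0])
    pad_top = (target_height - crop_height) // 2
    pad_bottom = target_height - crop_height - pad_top
    pad_left = (target_width - crop_width) // 2
    pad_right = target_width - crop_width - pad_left

    padded = [[0] * pad_left + list(row) + [0] * pad_right for row in cropped]
    blank = [0] * target_width
    return ([list(blank) for _ in range(pad_top)] + padded
            + [list(blank) for _ in range(pad_bottom)])
```

For a 4 by 2 input and a 2 by 4 target, the height is cropped to the two middle rows and the width is padded with one zero column on each side.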


tf.image.pad_to_bounding_box(image, offset_height, offset_width, target_height, target_width)

Pad image with zeros to the specified height and width.

Adds offset_height rows of zeros on top, offset_width columns of zeros on the left, and then pads the image on the bottom and right with zeros until it has dimensions target_height, target_width.

This op does nothing if offset_* is zero and the image already has size target_height by target_width.

Args:
  • image: 3-D tensor with shape [height, width, channels]
  • offset_height: Number of rows of zeros to add on top.
  • offset_width: Number of columns of zeros to add on the left.
  • target_height: Height of output image.
  • target_width: Width of output image.
Returns:

3-D tensor of shape [target_height, target_width, channels]

Raises:
  • ValueError: If the shape of image is incompatible with the offset_* or target_* arguments

tf.image.crop_to_bounding_box(image, offset_height, offset_width, target_height, target_width)

Crops an image to a specified bounding box.

This op cuts a rectangular part out of image. The top-left corner of the returned image is at offset_height, offset_width in image, and its lower-right corner is at offset_height + target_height, offset_width + target_width.

Args:
  • image: 3-D tensor with shape [height, width, channels]
  • offset_height: Vertical coordinate of the top-left corner of the result in the input.
  • offset_width: Horizontal coordinate of the top-left corner of the result in the input.
  • target_height: Height of the result.
  • target_width: Width of the result.
Returns:

3-D tensor of image with shape [target_height, target_width, channels]

Raises:
  • ValueError: If the shape of image is incompatible with the offset_* or target_* arguments
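
The two bounding-box Ops are inverses of each other when given the same offsets, which the following pure-Python sketch illustrates. It assumes a single-channel image stored as a 2-D list; the validity checks are a plausible reading of the ValueError conditions above, not the library's exact checks.

```python
def pad_to_bounding_box(image, offset_height, offset_width,
                        target_height, target_width):
    """Zero-pad a 2-D list so the image sits at the given offset."""
    height, width = len(image), len(image[0])
    pad_bottom = target_height - offset_height - height
    pad_right = target_width - offset_width - width
    if min(offset_height, offset_width, pad_bottom, pad_right) < 0:
        raise ValueError("image does not fit inside the target box")
    blank = [0] * target_width
    rows = [[0] * offset_width + list(row) + [0] * pad_right for row in image]
    return ([list(blank) for _ in range(offset_height)] + rows
            + [list(blank) for _ in range(pad_bottom)])


def crop_to_bounding_box(image, offset_height, offset_width,
                         target_height, target_width):
    """Cut a target_height by target_width window out of a 2-D list."""
    height, width = len(image), len(image[0])
    if (offset_height < 0 or offset_width < 0
            or target_height <= 0 or target_width <= 0
            or offset_height + target_height > height
            or offset_width + target_width > width):
        raise ValueError("bounding box does not fit inside the image")
    return [row[offset_width:offset_width + target_width]
            for row in image[offset_height:offset_height + target_height]]
```

Padding a 2 by 2 image to 4 by 4 with offsets of 1 and then cropping the result with the same offsets returns the original image.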

tf.image.random_crop(image, size, seed=None, name=None)

Randomly crops image to size [target_height, target_width].

The offset of the output within image is uniformly random. image always fully contains the result.

Args:
  • image: 3-D tensor of shape [height, width, channels]
  • size: 1-D tensor with two elements, specifying target [height, width]
  • seed: A Python integer. Used to create a random seed. See set_random_seed for behavior.
  • name: A name for this operation (optional).
Returns:

A cropped 3-D tensor of shape [target_height, target_width, channels].
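
A minimal sketch of the random offset selection, using Python's random module in place of TensorFlow's random ops. It assumes a single-channel image stored as a 2-D list; the seeding behaviour of random.Random is not the same as set_random_seed and is used here only to make the example repeatable.

```python
import random


def random_crop(image, size, seed=None):
    """Crop a 2-D list to size = [target_height, target_width] at a random offset."""
    target_height, target_width = size
    height, width = len(image), len(image[0])
    if target_height > height or target_width > width:
        raise ValueError("crop size must not exceed the image size")
    rng = random.Random(seed)
    # Offsets are drawn uniformly so the window always lies inside the image.
    top = rng.randint(0, height - target_height)
    left = rng.randint(0, width - target_width)
    return [row[left:left + target_width]
            for row in image[top:top + target_height]]
```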


tf.image.extract_glimpse(input, size, offsets, centered=None, normalized=None, uniform_noise=None, name=None)

Extracts a glimpse from the input tensor.

Returns a set of windows called glimpses extracted at location offsets from the input tensor. If a window only partially overlaps the input, the non-overlapping areas will be filled with random noise.

The result is a 4-D tensor of shape [batch_size, glimpse_height, glimpse_width, channels]. The channels and batch dimensions are the same as that of the input tensor. The height and width of the output windows are specified in the size parameter.

The arguments normalized and centered control how the windows are built:

  • If the coordinates are normalized but not centered, 0.0 and 1.0 correspond to the minimum and maximum of each height and width dimension.
  • If the coordinates are both normalized and centered, they range from -1.0 to 1.0. The coordinates (-1.0, -1.0) correspond to the upper left corner, the lower right corner is located at (1.0, 1.0) and the center is at (0, 0).
  • If the coordinates are not normalized they are interpreted as numbers of pixels.
Args:
  • input: A Tensor of type float32. A 4-D float tensor of shape [batch_size, height, width, channels].
  • size: A Tensor of type int32. A 1-D tensor of 2 elements containing the size of the glimpses to extract. The glimpse height must be specified first, followed by the glimpse width.
  • offsets: A Tensor of type float32. A 2-D integer tensor of shape [batch_size, 2] containing the x, y locations of the center of each window.
  • centered: An optional bool. Defaults to True. Indicates if the offset coordinates are centered relative to the image, in which case the (0, 0) offset is relative to the center of the input images. If false, the (0, 0) offset corresponds to the upper left corner of the input images.
  • normalized: An optional bool. Defaults to True. Indicates if the offset coordinates are normalized.
  • uniform_noise: An optional bool. Defaults to True. Indicates if the noise should be generated using a uniform distribution or a gaussian distribution.
  • name: A name for the operation (optional).
Returns:

Tensor of type float32. A tensor representing the glimpses [batch_size, glimpse_height, glimpse_width, channels].
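
The interplay of centered and normalized can be sketched as a coordinate conversion along one dimension. The helper offset_to_pixels is hypothetical, and the mapping of the normalized extremes onto 0 and the full dimension size is one reading of the description above, not a confirmed detail of the Op.

```python
def offset_to_pixels(offset, dimension_size, centered=True, normalized=True):
    """Convert one glimpse offset coordinate to a pixel position."""
    if normalized:
        if centered:
            # -1.0 maps to 0, 0.0 to the middle, 1.0 to dimension_size.
            return (offset + 1.0) / 2.0 * dimension_size
        # 0.0 maps to 0 and 1.0 to dimension_size.
        return offset * float(dimension_size)
    if centered:
        # Pixel offsets measured from the center of the dimension.
        return offset + dimension_size / 2.0
    # Pixel offsets measured from the upper left corner.
    return float(offset)
```

With both flags left at their default of True, an offset of 0.0 on a dimension of 100 pixels lands at pixel 50.0, the center of that dimension.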

Flipping and Transposing


tf.image.flip_up_down(image)

Flip an image vertically (upside down).

Outputs the contents of image flipped along the first dimension, which is height.

See also reverse().

Args:
  • image: A 3-D tensor of shape [height, width, channels].
Returns:

A 3-D tensor of the same type and shape as image.

Raises:
  • ValueError: if the shape of image is not supported.

tf.image.random_flip_up_down(image, seed=None)

Randomly flips an image vertically (upside down).

With a 1 in 2 chance, outputs the contents of image flipped along the first dimension, which is height. Otherwise output the image as-is.

Args:
  • image: A 3-D tensor of shape [height, width, channels].
  • seed: A Python integer. Used to create a random seed. See set_random_seed for behavior.
Returns:

A 3-D tensor of the same type and shape as image.

Raises:
  • ValueError: if the shape of image is not supported.

tf.image.flip_left_right(image)

Flip an image horizontally (left to right).

Outputs the contents of image flipped along the second dimension, which is width.

See also reverse().

Args:
  • image: A 3-D tensor of shape [height, width, channels].
Returns:

A 3-D tensor of the same type and shape as image.

Raises:
  • ValueError: if the shape of image is not supported.

tf.image.random_flip_left_right(image, seed=None)

Randomly flip an image horizontally (left to right).

With a 1 in 2 chance, outputs the contents of image flipped along the second dimension, which is width. Otherwise output the image as-is.

Args:
  • image: A 3-D tensor of shape [height, width, channels].
  • seed: A Python integer. Used to create a random seed. See set_random_seed for behavior.
Returns:

A 3-D tensor of the same type and shape as image.

Raises:
  • ValueError: if the shape of image is not supported.

tf.image.transpose_image(image)

Transpose an image by swapping the first and second dimension.

See also transpose().

Args:
  • image: 3-D tensor of shape [height, width, channels]
Returns:

A 3-D tensor of shape [width, height, channels]

Raises:
  • ValueError: if the shape of image is not supported.
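
The flipping and transposing Ops in this section reduce to simple index reversals, sketched here in pure Python for a single-channel image stored as a 2-D list. The functions reuse the Op names for readability and are illustrative only.

```python
def flip_up_down(image):
    """Reverse the first dimension (height)."""
    return [list(row) for row in image[::-1]]


def flip_left_right(image):
    """Reverse the second dimension (width)."""
    return [list(row[::-1]) for row in image]


def transpose_image(image):
    """Swap the first and second dimensions."""
    return [list(column) for column in zip(*image)]


image = [[1, 2, 3],
         [4, 5, 6]]
print(flip_up_down(image))     # [[4, 5, 6], [1, 2, 3]]
print(flip_left_right(image))  # [[3, 2, 1], [6, 5, 4]]
print(transpose_image(image))  # [[1, 4], [2, 5], [3, 6]]
```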

Image Adjustments

TensorFlow provides functions to adjust images in various ways: brightness, contrast, hue, and saturation. Each adjustment can be done with predefined parameters or with random parameters picked from predefined intervals. Random adjustments are often useful to expand a training set and reduce overfitting.


tf.image.adjust_brightness(image, delta, min_value=None, max_value=None)

Adjust the brightness of RGB or Grayscale images.

The value delta is added to all components of the tensor image. image and delta are cast to float before adding, and the resulting values are clamped to [min_value, max_value]. Finally, the result is cast back to image.dtype.

If min_value or max_value are not given, they are set to the minimum and maximum allowed values for image.dtype respectively.

Args:
  • image: A tensor.
  • delta: A scalar. Amount to add to the pixel values.
  • min_value: Minimum value for output.
  • max_value: Maximum value for output.
Returns:

A tensor of the same shape and type as image.
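
The add, clamp, and cast sequence can be sketched in pure Python as follows. The sketch assumes a uint8-style single-channel image stored as a 2-D list, so the default bounds of 0 and 255 and the rounding before the integer cast are assumptions made for this example.

```python
def adjust_brightness(image, delta, min_value=0, max_value=255):
    """Add delta to every pixel, clamp, and cast back to int."""
    return [
        [int(round(min(max(float(pixel) + float(delta), min_value), max_value)))
         for pixel in row]
        for row in image
    ]


print(adjust_brightness([[250, 10]], 10))   # [[255, 20]]
print(adjust_brightness([[250, 10]], -20))  # [[230, 0]]
```

Values that would leave the valid range are clamped rather than wrapped around, so bright pixels saturate at max_value and dark pixels at min_value.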


tf.image.random_brightness(image, max_delta, seed=None)

Adjust the brightness of images by a random factor.

Equivalent to adjust_brightness() using a delta randomly picked in the interval [-max_delta, max_delta).

Note that delta is picked as a float. For integer-type images, the brightness-adjusted result is rounded before casting, so integer images may have modifications in the range [-max_delta, max_delta].

Args:
  • image: 3-D tensor of shape [height, width, channels].
  • max_delta: float, must be non-negative.
  • seed: A Python integer. Used to create a random seed. See set_random_seed for behavior.
Returns:

3-D tensor of images of shape [height, width, channels]

Raises:
  • ValueError: if max_delta is negative.

tf.image.adjust_contrast(images, contrast_factor, min_value=None, max_value=None)

Adjust contrast of RGB or grayscale images.

images is a tensor of at least 3 dimensions. The last 3 dimensions are interpreted as [height, width, channels]. The other dimensions only represent a collection of images, such as [batch, height, width, channels].

Contrast is adjusted independently for each channel of each image.

For each channel, this Op first computes the mean of the image pixels in the channel and then adjusts each component x of each pixel to (x - mean) * contrast_factor + mean.

The adjusted values are then clipped to fit in the [min_value, max_value] interval. If min_value or max_value is not given, it is replaced with the minimum and maximum values for the data type of images respectively.

The contrast-adjusted image is always computed as float, and it is cast back to its original type after clipping.

Args:
  • images: Images to adjust. At least 3-D.
  • contrast_factor: A float multiplier for adjusting contrast.
  • min_value: Minimum value for clipping the adjusted pixels.
  • max_value: Maximum value for clipping the adjusted pixels.
Returns:

The contrast-adjusted image or images.

Raises:
  • ValueError: if the arguments are invalid.
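
The per-channel formula (x - mean) * contrast_factor + mean can be sketched in pure Python for one channel of one image. The sketch assumes the channel is a 2-D list of uint8-style values, so the default clipping bounds of 0 and 255 and the rounding before the integer cast are assumptions.

```python
def adjust_contrast(channel, contrast_factor, min_value=0, max_value=255):
    """Scale each pixel's distance from the channel mean, then clip."""
    pixels = [pixel for row in channel for pixel in row]
    mean = sum(pixels) / float(len(pixels))
    return [
        [int(round(min(max((pixel - mean) * contrast_factor + mean,
                           min_value), max_value)))
         for pixel in row]
        for row in channel
    ]


channel = [[0, 100],
           [100, 200]]
print(adjust_contrast(channel, 2.0))  # [[0, 100], [100, 255]]
```

A factor of 1 leaves the channel unchanged, a factor of 0 collapses every pixel to the mean, and factors above 1 push pixels away from the mean until they hit the clipping bounds.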

tf.image.random_contrast(image, lower, upper, seed=None)

Adjust the contrast of an image by a random factor.

Equivalent to adjust_contrast() but uses a contrast_factor randomly picked in the interval [lower, upper].

Args:
  • image: 3-D tensor of shape [height, width, channels].
  • lower: float. Lower bound for the random contrast factor.
  • upper: float. Upper bound for the random contrast factor.
  • seed: A Python integer. Used to create a random seed. See set_random_seed for behavior.
Returns:

3-D tensor of shape [height, width, channels].

Raises:
  • ValueError: if upper <= lower or if lower < 0.

tf.image.per_image_whitening(image)

Linearly scales image to have zero mean and unit norm.

This op computes (x - mean) / adjusted_stddev, where mean is the average of all values in image, and adjusted_stddev = max(stddev, 1.0/sqrt(image.NumElements())).

stddev is the standard deviation of all values in image. It is capped away from zero to protect against division by 0 when handling uniform images.

Note that this implementation is limited:

  • It only whitens based on the statistics of an individual image.
  • It does not take into account the covariance structure.
Args:
  • image: 3-D tensor of shape [height, width, channels].
Returns:

The whitened image with same shape as image.

Raises:
  • ValueError: if the shape of 'image' is incompatible with this function.
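
The whitening formula can be checked with a short pure-Python sketch. It assumes the image is a 2-D list of numbers (all channels flattened into one plane for simplicity) and uses the population standard deviation; both choices are assumptions made for illustration.

```python
import math


def per_image_whitening(image):
    """Compute (x - mean) / max(stddev, 1/sqrt(N)) over all values."""
    values = [float(value) for row in image for value in row]
    count = len(values)
    mean = sum(values) / count
    stddev = math.sqrt(sum((v - mean) ** 2 for v in values) / count)
    # Floor the divisor so uniform images do not divide by zero.
    adjusted_stddev = max(stddev, 1.0 / math.sqrt(count))
    return [[(value - mean) / adjusted_stddev for value in row]
            for row in image]
```

For a uniform image, stddev is 0, the floor of 1/sqrt(N) takes over, and every output value is 0.0 instead of raising a division error.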