Google Image Search Explained

Summary: Google has put a "Search by image" feature on its homepage. With it, you can find images on the internet that are similar to an image you supply.




Here are the questions we hear most. How does it work? How does the computer know that two images are similar?

The principle is easy to understand. According to Dr. Neal Krawetz, it relies on a fast algorithm.

More specifically, the key technique is called a "perceptual hash algorithm." It generates a "fingerprint" string for each image and then compares the fingerprints of different images. The closer the fingerprints, the more similar the images.

Below is a simple implementation:

Step 1: Downsize the image.


Downsize the image to 8x8 pixels, 64 pixels in total. This step removes the detail in the image and keeps only essential information such as structure, light, and shade. It also discards the differences caused by differing sizes and proportions.
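As a rough illustration of Step 1, here is a minimal sketch that shrinks a grid of pixel values to 8x8 by averaging blocks. It makes several assumptions that are not in the original article: the image is a plain list of rows holding single-channel values from 0 to 255, box averaging is the resampling method, and downsize is a hypothetical helper name. A real program would more likely call the resize function of an image library.

```python
def downsize(pixels, size=8):
    """Shrink a 2D grid of pixel values to size x size by box averaging.

    Assumption: `pixels` is a list of rows of integers (one channel).
    """
    height, width = len(pixels), len(pixels[0])
    small = []
    for row in range(size):
        # Source rows covered by this output row (always at least one).
        y0 = row * height // size
        y1 = max((row + 1) * height // size, y0 + 1)
        out_row = []
        for col in range(size):
            # Source columns covered by this output column.
            x0 = col * width // size
            x1 = max((col + 1) * width // size, x0 + 1)
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            out_row.append(sum(block) // len(block))
        small.append(out_row)
    return small


# A 16x16 test image with a diagonal gradient: value = x + y.
image = [[x + y for x in range(16)] for y in range(16)]
small = downsize(image)
print(len(small), len(small[0]))  # 8 8
```

Each output pixel here is the integer mean of a 2x2 block of the test image, so the fine detail disappears while the overall gradient survives.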


Step 2: Simplify the color.


Convert the downsized image to 64-level grayscale. In other words, every pixel takes one of only 64 gray levels.
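Step 2 could be sketched as follows. The article does not say how color is turned into gray, so this sketch assumes the common ITU-R BT.601 luma weights (0.299, 0.587, 0.114) and then drops the two lowest bits to go from 256 levels to 64. The function name to_gray64 and the tuple-based pixel format are hypothetical.

```python
def to_gray64(rgb_pixels):
    """Map a grid of (r, g, b) tuples (0-255 each) to gray levels 0-63."""
    gray = []
    for row in rgb_pixels:
        out_row = []
        for r, g, b in row:
            # Assumed weighting: BT.601 luma, computed in integer arithmetic.
            luma = (299 * r + 587 * g + 114 * b) // 1000  # 0..255
            out_row.append(luma >> 2)  # 256 levels -> 64 levels
        gray.append(out_row)
    return gray


sample = [[(0, 0, 0), (255, 255, 255), (255, 0, 0)]]
print(to_gray64(sample))  # [[0, 63, 19]]
```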

Step 3: Calculate the average value.


Calculate the average grayscale of all the 64 pixels.

Step 4: Compare the pixel grayscale.


Compare the grayscale of each pixel with the average value. If the grayscale is greater than or equal to the average, record a 1; otherwise, record a 0.
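Steps 3 and 4 fit naturally into one small function. The sketch below assumes the 8x8 grayscale image is a list of rows and reads the pixels in row-major order. Both the name threshold_bits and the reading order are choices made for this illustration.

```python
def threshold_bits(gray):
    """Return one bit per pixel: 1 if the pixel is >= the mean, else 0."""
    flat = [value for row in gray for value in row]  # row-major order
    mean = sum(flat) / len(flat)                     # Step 3: average
    return [1 if value >= mean else 0 for value in flat]  # Step 4: compare


# An 8x8 image whose left half is dark (10) and right half is light (50).
gray = [[10] * 4 + [50] * 4 for _ in range(8)]
bits = threshold_bits(gray)
print(bits[:8])  # [0, 0, 0, 0, 1, 1, 1, 1]
```

Because the comparison is "greater than or equal to," a completely uniform image produces all 1 bits.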

Step 5: Calculate the hash value.


Combine the comparison results in the previous step to get a 64-bit integer, which is the "fingerprint" of this image. The order of the combination is not critical, as long as you ensure all the images follow the same order.
Example fingerprint in hexadecimal: 8f373714acfcf4d0
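One way to combine the 64 comparison results is to shift them into an integer one bit at a time. In the sketch below, the first bit becomes the most significant bit. That ordering is an assumption, and the article notes that any order works as long as it is applied consistently. The helper names are hypothetical.

```python
def bits_to_fingerprint(bits):
    """Pack a sequence of 0/1 values into one integer, first bit highest."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


def fingerprint_hex(value):
    """Format a 64-bit fingerprint as 16 hexadecimal digits."""
    return format(value, "016x")


# Each row of 00001111 becomes the byte 0x0f.
example_bits = [0, 0, 0, 0, 1, 1, 1, 1] * 8
fingerprint = bits_to_fingerprint(example_bits)
print(fingerprint_hex(fingerprint))  # 0f0f0f0f0f0f0f0f
```

Sixteen hexadecimal digits carry exactly 64 bits, which is why a fingerprint can be written as a 16-character string.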

Once you have the fingerprints, you can compare two images by counting how many of the 64 bits differ. This count is the "Hamming distance" between the two fingerprints. If fewer than 5 bits differ, the two images are similar; if more than 10 bits differ, they are different images.
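The comparison can be sketched with an XOR followed by a bit count. The thresholds of 5 and 10 come from the text above. The label "uncertain" for distances in between is an assumption of this sketch, since the article does not say how to treat that range, and both function names are hypothetical.

```python
def hamming_distance(a, b):
    """Count the bit positions in which two integer fingerprints differ."""
    return bin(a ^ b).count("1")


def classify(a, b):
    """Apply the article's thresholds to two 64-bit fingerprints."""
    distance = hamming_distance(a, b)
    if distance < 5:
        return "similar"
    if distance > 10:
        return "different"
    return "uncertain"  # assumption: the article leaves this range open


base = 0x8F373714ACFCF4D0
near = base ^ 0b111     # flip 3 bits
far = base ^ 0xFFFF     # flip 16 bits
print(hamming_distance(base, near), classify(base, near))  # 3 similar
print(hamming_distance(base, far), classify(base, far))    # 16 different
```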

For a concrete implementation, see imgHash.py, written in Python by Wote. The code is short (only 52 lines). When you run it, the first parameter is the reference image and the second parameter is the directory of other images to compare against. The result returned is the number of differing bits between two images (the Hamming distance).

The advantage of this algorithm is that it is simple and fast, and it is unaffected by the size of the image. The disadvantage is that the image content cannot change: if you add some text to the image, the algorithm no longer recognizes it. It is therefore best suited to locating the original picture from a thumbnail.

In practice, the more robust pHash and SIFT algorithms are often used, because they can recognize variations of an image. They can match the original image as long as the deformation is less than 25 percent. These algorithms are more complicated, but they follow the same principle as the simple algorithm explained above: convert the image to a hash string, then compare.

See! Not that complex after all.

Interesting? Click here to read about two more image search methods.
