Android OCR (Optical Character Recognition) API: A First Look


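Overview


A project needs to recognize text in images. Basic requirements:


Supports Chinese and English

Supports on-device (offline) recognition

Free

Development environment: Android Studio

Platform: Android

Language: Java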


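Choosing a solution


Judging from search results, most OCR offerings are online APIs. In practice they come with network requirements and usage fees (Baidu's, for example). Two workable options were eventually tried:


Recognize Text in Images with ML Kit on Android

tess-two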


Reference code


Firebase ML Kit


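Prerequisites

Create a Firebase project in the Firebase console.

Download google-services.json and place it in the root directory of the app module.

The test device must support GMS, i.e. it must ship with Google services built in.

The device should preferably be able to reach the external internet so that a Google account can be signed in.

After the app starts, GMS checks whether OCR is supported. If the module is missing or unsupported, GMS downloads and installs it over the network. From within mainland China this succeeded only once, after many attempts. If it keeps failing, try clearing the GMS data and retrying. A reference log follows: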

2020-01-02 09:45:51.554 1189-1277/com.google.android.gms.persistent W/ConfigurationChimeraPro: Caller is not authorized to access Uri: content://com.google.android.gms.phenotype/com.google.android.gms.vision.sdk -- metadata{ service_id: 51 }
2020-01-02 09:45:51.556 2640-2691/com.testgoogleocr.ocrservice W/DynamiteModule: Local module descriptor class for com.google.android.gms.vision.dynamite.ocr not found.
2020-01-02 09:45:51.561 1189-1277/com.google.android.gms.persistent W/ProviderHelper: Unknown dynamite feature vision.dynamite.ocr
2020-01-02 09:45:51.567 2640-2691/com.testgoogleocr.ocrservice I/DynamiteModule: Considering local module com.google.android.gms.vision.dynamite.ocr:0 and remote module com.google.android.gms.vision.dynamite.ocr:0
2020-01-02 09:45:51.567 2640-2691/com.testgoogleocr.ocrservice D/TextNativeHandle: Cannot load feature, fall back to load dynamite module.
2020-01-02 09:45:51.571 1189-1277/com.google.android.gms.persistent W/ProviderHelper: Unknown dynamite feature vision.ocr
2020-01-02 09:45:51.573 2640-2691/com.testgoogleocr.ocrservice W/DynamiteModule: Local module descriptor class for com.google.android.gms.vision.ocr not found.
2020-01-02 09:45:51.573 2640-2691/com.testgoogleocr.ocrservice I/DynamiteModule: Considering local module com.google.android.gms.vision.ocr:0 and remote module com.google.android.gms.vision.ocr:0
2020-01-02 09:45:51.574 2640-2691/com.testgoogleocr.ocrservice E/Vision: Error loading module com.google.android.gms.vision.ocr optional module true: com.google.android.gms.dynamite.DynamiteModule$LoadingException: No acceptable module found. Local version is 0 and remote version is 0.
2020-01-02 09:45:51.575 2640-2691/com.testgoogleocr.ocrservice D/TextNativeHandle: Broadcasting download intent for dependency ocr
2020-01-02 09:45:51.586 2640-2691/com.testgoogleocr.ocrservice W/TextNativeHandle: Native handle not yet available. Reverting to no-op handle.
2020-01-02 09:45:51.617 1189-1277/com.google.android.gms.persistent W/ConfigurationChimeraPro: Caller is not authorized to access Uri: content://com.google.android.gms.phenotype/com.google.android.gms.clearcut.public -- metadata{ service_id: 51 }
2020-01-02 09:45:51.620 2640-2691/com.testgoogleocr.ocrservice W/DynamiteModule: Local module descriptor class for com.google.android.gms.vision.dynamite.ocr not found.
2020-01-02 09:45:51.627 1189-2278/com.google.android.gms.persistent W/ProviderHelper: Unknown dynamite feature vision.dynamite.ocr
2020-01-02 09:45:51.635 2640-2691/com.testgoogleocr.ocrservice I/DynamiteModule: Considering local module com.google.android.gms.vision.dynamite.ocr:0 and remote module com.google.android.gms.vision.dynamite.ocr:0
2020-01-02 09:45:51.635 2640-2691/com.testgoogleocr.ocrservice D/TextNativeHandle: Cannot load feature, fall back to load dynamite module.
2020-01-02 09:45:51.640 1189-2208/com.google.android.gms.persistent W/ProviderHelper: Unknown dynamite feature vision.ocr
2020-01-02 09:45:51.645 2640-2691/com.testgoogleocr.ocrservice W/DynamiteModule: Local module descriptor class for com.google.android.gms.vision.ocr not found.
2020-01-02 09:45:51.645 2640-2691/com.testgoogleocr.ocrservice I/DynamiteModule: Considering local module com.google.android.gms.vision.ocr:0 and remote module com.google.android.gms.vision.ocr:0
2020-01-02 09:45:51.646 2640-2691/com.testgoogleocr.ocrservice E/Vision: Error loading module com.google.android.gms.vision.ocr optional module true: com.google.android.gms.dynamite.DynamiteModule$LoadingException: No acceptable module found. Local version is 0 and remote version is 0.
2020-01-02 09:45:51.649 2640-2640/com.testgoogleocr.ocrservice W/System.err: com.google.firebase.ml.common.FirebaseMLException: Waiting for the text recognition model to be downloaded. Please wait.
2020-01-02 09:45:51.651 2640-2640/com.testgoogleocr.ocrservice W/System.err:     at com.google.android.gms.internal.firebase_ml.zzsc.zzd(com.google.firebase:firebase-ml-vision@@24.0.1:21)
2020-01-02 09:45:51.651 2640-2640/com.testgoogleocr.ocrservice W/System.err:     at com.google.android.gms.internal.firebase_ml.zzsc.zza(com.google.firebase:firebase-ml-vision@@24.0.1:39)
2020-01-02 09:45:51.651 2640-2640/com.testgoogleocr.ocrservice W/System.err:     at com.google.android.gms.internal.firebase_ml.zzpj.zza(com.google.firebase:firebase-ml-common@@22.0.1:31)
2020-01-02 09:45:51.652 2640-2640/com.testgoogleocr.ocrservice W/System.err:     at com.google.android.gms.internal.firebase_ml.zzpl.call(com.google.firebase:firebase-ml-common@@22.0.1)
2020-01-02 09:45:51.652 2640-2640/com.testgoogleocr.ocrservice W/System.err:     at com.google.android.gms.internal.firebase_ml.zzpf.zza(com.google.firebase:firebase-ml-common@@22.0.1:32)
2020-01-02 09:45:51.652 2640-2640/com.testgoogleocr.ocrservice W/System.err:     at com.google.android.gms.internal.firebase_ml.zzpe.run(com.google.firebase:firebase-ml-common@@22.0.1)
2020-01-02 09:45:51.652 2640-2640/com.testgoogleocr.ocrservice W/System.err:     at android.os.Handler.handleCallback(Handler.java:755)
2020-01-02 09:45:51.652 2640-2640/com.testgoogleocr.ocrservice W/System.err:     at android.os.Handler.dispatchMessage(Handler.java:95)
2020-01-02 09:45:51.652 2640-2640/com.testgoogleocr.ocrservice W/System.err:     at com.google.android.gms.internal.firebase_ml.zze.dispatchMessage(com.google.firebase:firebase-ml-common@@22.0.1:6)
2020-01-02 09:45:51.652 2640-2640/com.testgoogleocr.ocrservice W/System.err:     at android.os.Looper.loop(Looper.java:154)
2020-01-02 09:45:51.652 2640-2640/com.testgoogleocr.ocrservice W/System.err:     at android.os.HandlerThread.run(HandlerThread.java:61)
2020-01-02 09:45:51.673 1189-2696/com.google.android.gms.persistent I/CheckinUtil: Classify the device as Tablet.
2020-01-02 09:45:52.716 1455-2694/com.google.android.gms I/Vision: Details: ocr_armeabi_v7a.zip, a0fa9e293aa78a849c0951e9a9220ca9e9935bd9
2020-01-02 09:45:52.740 1455-2694/com.google.android.gms I/Vision: Engine already satisfied by existing download for ocr
2020-01-02 09:45:52.746 1189-2698/com.google.android.gms.persistent I/CheckinUtil: Classify the device as Tablet.
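
The FirebaseMLException in the log above ("Waiting for the text recognition model to be downloaded") suggests that early calls can fail until GMS has finished downloading the OCR module, so retrying after a pause is one way to cope. The following is only a minimal plain-Java sketch of that retry logic. `Retry`, `callWithRetry`, the attempt count and the delay are hypothetical names and values, not part of Firebase or GMS. In a real app the Firebase call is asynchronous, so the retry would be scheduled from the failure listener instead of blocking a thread.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: retries a task a few times with a fixed delay between attempts. */
final class Retry {
    private Retry() {}

    /**
     * Runs the task up to maxAttempts times and returns its first successful result.
     * If every attempt throws, the exception from the last attempt is rethrown.
     */
    static <T> T callWithRetry(Callable<T> task, int maxAttempts, long delayMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e; // for example "model not downloaded yet"
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis); // wait before the next attempt
                }
            }
        }
        throw last;
    }
}
```

A call such as Retry.callWithRetry(task, 5, 2000L) would try the task at most five times, two seconds apart. Whether a fixed delay is enough depends on how long the download takes on the device.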


Add GMS service support: build.gradle in the project root

// Top-level build file where you can add configuration options common to all sub-projects/modules.
buildscript {
    repositories {
        jcenter()
        google()
    }
    dependencies {
        classpath 'com.android.tools.build:gradle:3.4.0'
        classpath 'com.google.gms:google-services:4.3.3'
        // NOTE: Do not place your application dependencies here; they belong
        // in the individual module build.gradle files
    }
}
allprojects {
    repositories {
        jcenter()
        google()
    }
}
task clean(type: Delete) {
    delete rootProject.buildDir
}


The line added for GMS support is: classpath 'com.google.gms:google-services:4.3.3'


build.gradle in the module

dependencies {
    implementation fileTree(dir: 'libs', include: ['*.jar'])
    // ML Kit dependencies
    implementation 'com.google.firebase:firebase-ml-vision:24.0.1'
    // Barcode detection model.
    //implementation 'com.google.firebase:firebase-ml-vision-barcode-model:16.0.2'
    // Image Labeling model.
    //implementation 'com.google.firebase:firebase-ml-vision-image-label-model:19.0.0'
    // Face model
    //implementation 'com.google.firebase:firebase-ml-vision-face-model:19.0.0'
    // Custom model
    //implementation 'com.google.firebase:firebase-ml-model-interpreter:22.0.1'
    // Object model
    //implementation 'com.google.firebase:firebase-ml-vision-object-detection-model:19.0.3'
    // AutoML model
    //implementation 'com.google.firebase:firebase-ml-vision-automl:18.0.3'
}
apply plugin: 'com.google.gms.google-services'


AndroidManifest.xml
<?xml version="1.0" encoding="utf-8"?>
<application>
    <activity ></activity>
    <meta-data
        android:name="com.google.firebase.ml.vision.DEPENDENCIES"
        android:value="ocr" />
</application>


The calling code is relatively simple:

// Runs on-device text recognition on the bitmap and logs the recognized text.
// processImage() is asynchronous: the text arrives later in onSuccess, so this
// method always returns an empty string.
public static String recorgnizeByFirebase(Context ctx, Bitmap bm) {
    try {
        FirebaseApp.initializeApp(ctx);
        // Alternative input: FirebaseVisionImage.fromFilePath(ctx, Uri.fromFile(new File(path)))
        FirebaseVisionImage fImg = FirebaseVisionImage.fromBitmap(bm);
        FirebaseVisionTextRecognizer txtRec = FirebaseVision.getInstance().getOnDeviceTextRecognizer();
        txtRec.processImage(fImg).addOnSuccessListener(new OnSuccessListener<FirebaseVisionText>() {
            @Override
            public void onSuccess(FirebaseVisionText firebaseVisionText) {
                Log.d("Firebase-vision", "onSuccess " + firebaseVisionText.getText());
            }
        }).addOnFailureListener(new OnFailureListener() {
            @Override
            public void onFailure(@NonNull Exception e) {
                // Also reached while the OCR model is still being downloaded.
                e.printStackTrace();
            }
        });
    } catch (Exception e) {
        e.printStackTrace();
    }
    return "";
}
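
Since the recognizer reports its result through listeners, the caller needs some way to receive the text later. One option is to wrap the callback pair in a future. The sketch below uses plain Java only: `AsyncRecognizer` is a hypothetical stand-in that mimics the success/failure listener shape and is not a Firebase class, and `OcrFutures` is likewise an illustrative name. Note that `CompletableFuture` is only available on Android from API level 24.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

/** Hypothetical stand-in for an asynchronous recognizer that reports through two callbacks. */
interface AsyncRecognizer {
    void recognize(byte[] image, Consumer<String> onSuccess, Consumer<Exception> onFailure);
}

final class OcrFutures {
    private OcrFutures() {}

    /** Wraps the callback-style call in a future that completes when either callback fires. */
    static CompletableFuture<String> recognizeAsync(AsyncRecognizer recognizer, byte[] image) {
        CompletableFuture<String> future = new CompletableFuture<>();
        try {
            recognizer.recognize(image, future::complete, future::completeExceptionally);
        } catch (RuntimeException e) {
            // The call itself failed before any callback could run.
            future.completeExceptionally(e);
        }
        return future;
    }
}
```

A caller could then write OcrFutures.recognizeAsync(recognizer, bytes).thenAccept(text -> ...) instead of relying on a return value that is always empty.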


tess-two OCR


Download the tessdata files in advance: tessdata


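Put the tessdata files in the designated directory (in the code below, the tessdata folder under OcrService on external storage)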
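
A missing or misplaced language file is an easy mistake at this step, so it can help to check for the files before initialising the engine. The sketch below is plain Java and assumes the usual Tesseract layout, where each language lives in a file named after its code plus ".traineddata" inside a tessdata subfolder, and where several languages can be joined with "+". `TessDataCheck` and `missingLanguages` are hypothetical names, not part of tess-two.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that looks for Tesseract language files before the engine is initialised. */
final class TessDataCheck {
    private TessDataCheck() {}

    /**
     * Returns the language codes whose "code.traineddata" file is missing from
     * "baseDir/tessdata". The languages argument may join codes with '+', e.g. "eng+chi_sim".
     */
    static List<String> missingLanguages(File baseDir, String languages) {
        File tessdata = new File(baseDir, "tessdata");
        List<String> missing = new ArrayList<>();
        for (String code : languages.split("\\+")) {
            if (code.isEmpty()) {
                continue; // ignore stray '+' separators
            }
            if (!new File(tessdata, code + ".traineddata").isFile()) {
                missing.add(code);
            }
        }
        return missing;
    }
}
```

If the returned list is not empty, the app can report which files are missing instead of failing later inside the OCR engine.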


build.gradle in the module


dependencies {
    implementation fileTree(dir: 'libs', include: ['*.jar'])
    implementation 'com.rmtheis:tess-two:6.0.0'
}


TessOcr.java

package com.tessocr.ocrservice;

import android.graphics.Rect;
import android.os.Environment;
import android.os.SystemClock;

import com.googlecode.tesseract.android.TessBaseAPI;

import java.io.File;

/** Thin wrapper around tess-two's TessBaseAPI for synchronous and threaded OCR. */
public class TessOcr {
    static final boolean D = true;

    static void d(String msg) {
        if (D) android.util.Log.d("TessOcr", "ALog > " + msg);
    }

    // Parent directory of the tessdata folder; init() takes this parent, not tessdata itself.
    public static final String TESSBASE_PATH = Environment.getExternalStorageDirectory() + "/OcrService";
    public static final String TESSBASE_PATH_FULL = TESSBASE_PATH + "/tessdata";
    // Index 0 = English, index 1 = Simplified Chinese.
    public static final String[] SUPPORT_LANGUAGE = {"eng", "chi_sim"};

    private OnOcrResult lis;
    private TessBaseAPI ocrApi;

    // lan is an index into SUPPORT_LANGUAGE; lis may be null if only startSync is used.
    public TessOcr(int lan, OnOcrResult lis) {
        this.lis = lis;
        ocrApi = new TessBaseAPI();
        ocrApi.init(TESSBASE_PATH, SUPPORT_LANGUAGE[lan]);
    }

    // Recognizes the image file at path on the calling thread.
    // area optionally restricts recognition to a sub-rectangle; pass null for the whole image.
    public String startSync(String path, Rect area) {
        long st = SystemClock.uptimeMillis();
        d("startSync for " + path);
        String resString = "";
        //ocrApi.setPageSegMode(TessBaseAPI.PageSegMode.PSM_SINGLE_LINE);
        ocrApi.setImage(new File(path));
        if (area != null) ocrApi.setRectangle(area);
        d("stage init api : " + (SystemClock.uptimeMillis() - st));
        resString = ocrApi.getUTF8Text();
        d("stage ocr done : " + (SystemClock.uptimeMillis() - st));
        // Free the image and results but keep the engine alive for the next call.
        ocrApi.clear();
        return resString;
    }

    // Shuts the engine down; call once when OCR is no longer needed.
    public void release() {
        if (ocrApi != null) ocrApi.end();
    }

    // Runs startSync on a new thread and delivers the text (or null on error) to the listener.
    public void startAsync(final String path, final Rect area) {
        new Thread() {
            @Override
            public void run() {
                try {
                    String res = startSync(path, area);
                    if (lis != null) lis.onResult(res);
                } catch (Exception e) {
                    if (lis != null) lis.onResult(null);
                }
            }
        }.start();
    }

    public interface OnOcrResult {
        void onResult(String res);
    }
}
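
Because startAsync starts a new thread for every call while all calls share one TessBaseAPI instance, overlapping requests could interfere with each other. One possible alternative is to queue requests on a single worker thread. The sketch below is plain Java under that assumption: `BlockingOcr` is a hypothetical stand-in for a blocking call such as startSync, and `SerialOcrRunner` is an illustrative name, not part of tess-two.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for a blocking OCR call such as startSync(path, area). */
interface BlockingOcr {
    String recognize(String path);
}

/** Queues OCR requests on one worker thread so the engine is never used concurrently. */
final class SerialOcrRunner {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final BlockingOcr ocr;

    SerialOcrRunner(BlockingOcr ocr) {
        this.ocr = ocr;
    }

    /** Submits one request; requests run in submission order, one at a time. */
    Future<String> submit(final String path) {
        Callable<String> task = () -> ocr.recognize(path);
        return worker.submit(task);
    }

    /** Stops accepting new requests; requests that are already queued still run. */
    void shutdown() {
        worker.shutdown();
    }
}
```

With the class above, the stand-in could be supplied as path -> tessOcr.startSync(path, null), and shutdown() would be called next to release().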


Related links

Firebase console

See and Understand Text using OCR with Mobile Vision Text API for Android

Android OCR with tesseract

tessdata

tess-two

Implementing simple OCR with tess-two and cv4j

