Lesson 3: Text Recognition Project Walkthrough and Usage Guide (Part 1) | Study Notes - Alibaba Cloud Developer Community


2022-11-01


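Summary: A quick study guide to Lesson 3, Text Recognition Project Walkthrough and Usage Guide.

Study notes for the Developer Academy course "DAMO Academy Vision AI Premium Course: Lesson 3, Text Recognition Project Walkthrough and Usage Guide". The notes follow the course closely so that readers can pick up the material quickly.

Course URL: https://developer.aliyun.com/learning/course/912/detail/14420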


Contents:
I. How to use the SDK

II. The Request layer

III. Implementation logic of OcrService

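I. How to use the SDK

First, go to the Vision Intelligence Open Platform website at http://vision.aliyun.com/, click Text Recognition, then ID Card Recognition, open the product documentation, and click SDK Reference. For Java there are two SDKs to choose from. The first is the general-purpose SDK; here we use the second, which supports uploading local files.

Next, copy the link given in the notes further down that page, paste it into the browser, and append ocr/ to the end. The code currently uses version 1.0.3. Click that version to see the Maven coordinates, which are the coordinates the project uses to pull in the SDK.

What follows is the implementation path of the whole project, together with the points and methods that need attention: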

<html lang="en" xmlns:th="http://www.thymeleaf.org">
<head>
  <title>VIAPI</title>
  <link rel="stylesheet" href="https://cdn.bootcss.com/bootstrap/3.3.7/css/bootstrap.min.css">
  <script src="https://apps.bdimg.com/libs/jquery/2.1.4/jquery.min.js"></script>
</head>
<body>
<div class="container">
  <div class="row">
    <div class="col-md-12 mx-auto">
      <h2>VIAPI RecognizeIdentityCard Example</h2>
      <div class="col-sm-12">
        <!-- Flash message set by the controller; rendered only when present -->
        <p th:text="${message}" th:if="${message ne null}" class="alert alert-primary"></p>
      </div>
      <form method="post" th:action="@{/upload}" enctype="multipart/form-data">
        <div class="col-sm-4">
          <div class="input-group">
            <input id="location" class="form-control" onclick="$('#i-face').click();">
            <label class="input-group-btn">
              <input type="button" id="i-check" value="Upload portrait side" class="btn btn-primary" onclick="$('#i-face').click();">
            </label>
          </div>
          <!-- Hidden file input; the visible controls above forward clicks to it -->
          <input type="file" name="face" id="i-face" accept=".jpg,.png,.jpeg" onchange="$('#location').val($('#i-face').val());" style="display: none">
        </div>
        <div class="col-sm-4">
          <div class="input-group">
            <input id="location1" class="form-control" onclick="$('#i-back').click();">
            <label class="input-group-btn">
              <input type="button" id="i-check-1" value="Upload national emblem side" class="btn btn-primary" onclick="$('#i-back').click();">
            </label>
          </div>
          <input type="file" name="back" id="i-back" accept=".jpg,.png,.jpeg" onchange="$('#location1').val($('#i-back').val());" style="…">
        </div>
        <div class="col-sm-4">
          <button type="submit" class="btn btn-primary">Start recognition</button>
        </div>
      </form>
    </div>
  </div>
  <!-- Most recently uploaded images -->
  <div class="row" style="margin-top: 30px;">
    <div class="col-md-12 mx-auto">
      <div class="col-sm-4">
        <img style="width: 100%;" th:src="${faceImage}" th:if="${faceImage ne null}" class="img-fluid" alt=""/>
      </div>
      <div class="col-sm-4">
        <img style="width: 100%;" th:src="${backImage}" th:if="${backImage ne null}" class="img-fluid" alt=""/>
      </div>
    </div>
  </div>
  <!-- Recognition results for both sides -->
  <div class="row" style="…">
    <div class="col-md-12 mx-auto">
      <div class="col-sm-4">
        <p th:if="${faceResult ne null}"><span>Name: </span><span th:text="${faceResult.name}"></span></p>
        <p th:if="${faceResult ne null}"><span>Gender: </span><span th:text="${faceResult.gender}"></span></p>
        <p th:if="${faceResult ne null}"><span>Ethnicity: </span><span th:text="${faceResult.nationality}"></span></p>
        <p th:if="${faceResult ne null}"><span>Date of birth: </span><span th:text="${faceResult.birthDate}"></span></p>
        <p th:if="${faceResult ne null}"><span>Address: </span><span th:text="${faceResult.address}"></span></p>
        <p th:if="${faceResult ne null}"><span>ID number: </span><span th:text="${faceResult.IDNumber}"></span></p>
      </div>
      <div class="col-sm-4">
        <p th:if="${backResult ne null}"><span>Issuing authority: </span><span th:text="${backResult.issue}"></span></p>
        <p th:if="${backResult ne null}"><span>Valid period: </span><span th:text="${backResult.startDate}"></span>-<span th:text="${backResult.endDate}"></span></p>
      </div>
    </div>
  </div>
</div>
</body>
</html>

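The rendered page looks like this:

[Figure: screenshot of the rendered page]

That covers the structure of the front-end page and the main points of its logic.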

II. The Request layer

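// Imports assume Spring MVC and Apache Commons Lang 3 are on the classpath.
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.UUID;

import org.apache.commons.lang3.StringUtils;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.util.CollectionUtils;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.multipart.MultipartFile;

@Controller
@RequestMapping("/")
public class MainController {

    private String uploadDirectory;
    private OcrService ocrService;
    private List<String> faceImages;
    private List<String> backImages;
    private List<Map<String, String>> faceResults;
    private List<Map<String, String>> backResults;

    public MainController(@Value("${file.upload.path}") String uploadDirectory, OcrService ocrService) {
        this.uploadDirectory = uploadDirectory;
        this.ocrService = ocrService;
        faceImages = new ArrayList<>();
        backImages = new ArrayList<>();
        faceResults = new ArrayList<>();
        backResults = new ArrayList<>();
    }

    // Saves the upload under a random UUID-based name and returns that name.
    private String saveFile(MultipartFile file) throws Exception {
        String suffix = StringUtils.substringAfterLast(file.getOriginalFilename(), ".");
        String filename = UUID.randomUUID().toString() + "." + suffix;
        Path path = Paths.get(uploadDirectory + filename);
        Files.copy(file.getInputStream(), path, StandardCopyOption.REPLACE_EXISTING);
        return filename;
    }

    @RequestMapping()
    public String index(Model model) {
        // If the two sides are out of step, discard everything cached so far.
        if (faceImages.size() != backImages.size()) {
            faceImages.clear();
            backImages.clear();
            faceResults.clear();
            backResults.clear();
        }
        // Otherwise show the most recent image and result for each side.
        if (!CollectionUtils.isEmpty(faceImages) && faceImages.size() == backImages.size()) {
            model.addAttribute("faceImage", faceImages.get(faceImages.size() - 1));
            model.addAttribute("faceResult", faceResults.get(faceResults.size() - 1));
            model.addAttribute("backImage", backImages.get(backImages.size() - 1));
            model.addAttribute("backResult", backResults.get(backResults.size() - 1));
        }
        return "index";
    }

    // The /upload handler shown in Section III also belongs to this class.
}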

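• The @RequestMapping annotation is applied to MainController.

• The first field, uploadDirectory, is the local path where uploaded image files are saved;

the second, ocrService, is a wrapper around calls to the Vision Intelligence Open Platform's OCR capability;

the third and fourth, faceImages and backImages, cache the paths of previously uploaded images;

the fifth and sixth, faceResults and backResults, cache the earlier recognition results.

• Because no database is used, the uploaded images and their recognition results are simply cached in memory. The upload path is configured as a directory on the local machine. In this project it sits under the static directory inside resources, which makes it easy to store and read files. images is a custom directory under static that holds the image data.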
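The in-memory caching described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the course's code: the class names RecognitionCache and RecognitionCacheDemo and the sample values are hypothetical. It pairs each image path with its result so one side's path list and result list cannot drift apart, and it reproduces the consistency check that index() performs between the two sides.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper: keeps image paths and OCR results in memory, in upload order. */
class RecognitionCache {
    private final List<String> images = new ArrayList<>();
    private final List<Map<String, String>> results = new ArrayList<>();

    /** Stores one image path together with its recognition result. */
    synchronized void add(String imagePath, Map<String, String> result) {
        images.add(imagePath);
        results.add(result);
    }

    /** Most recently added image path, or empty if nothing is cached. */
    synchronized Optional<String> latestImage() {
        return images.isEmpty() ? Optional.empty() : Optional.of(images.get(images.size() - 1));
    }

    /** Most recently added result, or empty if nothing is cached. */
    synchronized Optional<Map<String, String>> latestResult() {
        return results.isEmpty() ? Optional.empty() : Optional.of(results.get(results.size() - 1));
    }

    synchronized int size() {
        return images.size();
    }

    synchronized void clear() {
        images.clear();
        results.clear();
    }
}

class RecognitionCacheDemo {
    /** Mirrors the check in index(): drop everything if the two sides are out of step. */
    static void resetIfInconsistent(RecognitionCache face, RecognitionCache back) {
        if (face.size() != back.size()) {
            face.clear();
            back.clear();
        }
    }

    public static void main(String[] args) {
        RecognitionCache face = new RecognitionCache();
        RecognitionCache back = new RecognitionCache();
        // Sample values are made up for the demo.
        face.add("/images/a.jpg", Collections.singletonMap("name", "Zhang San"));
        back.add("/images/b.jpg", Collections.singletonMap("issue", "Example Authority"));
        resetIfInconsistent(face, back);
        System.out.println(face.latestImage().orElse("none"));
        System.out.println(back.latestResult().orElse(Collections.emptyMap()));
    }
}
```

The methods are synchronized because a Spring controller is a singleton by default, so any state it holds is shared across request threads. Like the lists in MainController, this cache is lost on restart and grows without bound, which is acceptable for a demo but not for production use.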

III. Implementation logic of OcrService

// This handler belongs to MainController. TeaException is the Alibaba Cloud SDK
// exception type; JSON is assumed to be fastjson's com.alibaba.fastjson.JSON.
@PostMapping("/upload")
public String uploadFile(@RequestParam("face") MultipartFile face,
                         @RequestParam("back") MultipartFile back,
                         RedirectAttributes attributes) {
    // Both sides are required; otherwise go back with a message.
    if (face.isEmpty() || back.isEmpty()) {
        attributes.addFlashAttribute("message", "Please select a file to upload.");
        return "redirect:/";
    }

    String errorMessage = null;
    try {
        // Create the upload directory on first use.
        Path dir = Paths.get(uploadDirectory);
        if (!Files.exists(dir)) {
            Files.createDirectories(dir);
        }

        if (!face.isEmpty()) {
            String filename = saveFile(face);
            Map<String, String> res = ocrService.RecognizeIdcard(uploadDirectory + filename, "face");
            faceImages.add("/images/" + filename);
            faceResults.add(res);
        }

        if (!back.isEmpty()) {
            String filename = saveFile(back);
            Map<String, String> res = ocrService.RecognizeIdcard(uploadDirectory + filename, "back");
            backImages.add("/images/" + filename);
            backResults.add(res);
        }
    } catch (TeaException e) {
        // Errors returned by the platform carry structured data.
        e.printStackTrace();
        errorMessage = JSON.toJSONString(e.getData());
    } catch (Exception e) {
        e.printStackTrace();
        errorMessage = e.getMessage();
    }

    if (StringUtils.isNotBlank(errorMessage)) {
        attributes.addFlashAttribute("message", errorMessage);
    }
    return "redirect:/";
}
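The file-handling part of the upload flow above (create the directory if it is missing, then store each upload under a random name that keeps the original extension) can be tried in isolation with only the JDK. The sketch below is a stand-in under stated assumptions rather than the project's code: UploadStore, suffixOf and save are hypothetical names, and a plain InputStream replaces Spring's MultipartFile so the example runs without any framework.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.UUID;

/** Hypothetical stand-in for the saveFile step, using only the JDK. */
class UploadStore {
    /** Returns the text after the last dot, or "" when the name has no dot. */
    static String suffixOf(String originalName) {
        int dot = originalName == null ? -1 : originalName.lastIndexOf('.');
        return dot < 0 ? "" : originalName.substring(dot + 1);
    }

    /** Copies the stream into dir under a random UUID-based name and returns that name. */
    static String save(InputStream in, String originalName, Path dir) throws IOException {
        Files.createDirectories(dir); // does nothing if the directory already exists
        String suffix = suffixOf(originalName);
        String filename = UUID.randomUUID() + (suffix.isEmpty() ? "" : "." + suffix);
        Files.copy(in, dir.resolve(filename), StandardCopyOption.REPLACE_EXISTING);
        return filename;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("upload-demo");
        byte[] fakeImage = "not a real image".getBytes(StandardCharsets.UTF_8);
        String name = save(new ByteArrayInputStream(fakeImage), "id-card.front.jpg", dir);
        System.out.println(name + " -> " + Files.size(dir.resolve(name)) + " bytes");
    }
}
```

Two small differences from the handler are deliberate. Path.resolve is used instead of string concatenation, so the result does not depend on the configured directory ending with a path separator, and a name without an extension is stored without a trailing dot.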
