Use pytesseract OCR to recognize text from an image

2022-11-10, Python Development, 跟版网


Problem description

I need to use Pytesseract to extract text from this picture:

Here is my code:

from PIL import Image
import pytesseract

path = 'pic.gif'
img = Image.open(path)
# JPEG has no alpha channel, and recent Pillow versions refuse to save
# an RGBA image as JPEG, so work in RGB instead.
img = img.convert('RGB')
pix = img.load()
# Threshold: a pixel with any channel below 102 becomes black, all others white.
for y in range(img.size[1]):
    for x in range(img.size[0]):
        r, g, b = pix[x, y]
        if r < 102 or g < 102 or b < 102:
            pix[x, y] = (0, 0, 0)
        else:
            pix[x, y] = (255, 255, 255)
img.save('temp.jpg')
text = pytesseract.image_to_string(Image.open('temp.jpg'))
# os.remove('temp.jpg')  # needs `import os` if enabled
print(text)

The resulting "temp.jpg" is:

Not bad, but the printed result is ",2 WW" rather than the correct text "2HHH". How can I remove those black dots?
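As an aside, the thresholding loop in the question can be written more compactly with Pillow's channel operations. The following is only a sketch of the same rule (black if any channel is below the cutoff, white otherwise); the function name `threshold_any_channel` and the `cutoff` parameter are hypothetical names introduced for illustration, and the result is a grayscale ("L") image rather than RGB.

```python
from PIL import Image, ImageChops


def threshold_any_channel(img, cutoff=102):
    """Return an 'L' image: 0 where any RGB channel < cutoff, else 255.

    Hypothetical helper; mirrors the rule used by the loop in the question.
    """
    r, g, b = img.convert("RGB").split()
    # Per-pixel minimum of the three channels: if the darkest channel is
    # below the cutoff, then "any channel < cutoff" holds for that pixel.
    darkest = ImageChops.darker(ImageChops.darker(r, g), b)
    # point() builds a 256-entry lookup table from the function.
    return darkest.point(lambda v: 0 if v < cutoff else 255)
```

Saving the result in a lossless format such as PNG, instead of JPEG, avoids compression artifacts around the character edges, though that alone is unlikely to remove the specks the question asks about.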

Recommended answer

Here is my solution:

import pytesseract
from PIL import Image, ImageEnhance, ImageFilter

im = Image.open("temp.jpg")  # the second image, i.e. the thresholded one
im = im.filter(ImageFilter.MedianFilter())  # 3x3 median filter removes small specks
enhancer = ImageEnhance.Contrast(im)
im = enhancer.enhance(2)  # double the contrast
im = im.convert('1')  # convert to a 1-bit black-and-white image
im.save('temp2.jpg')
text = pytesseract.image_to_string(Image.open('temp2.jpg'))
print(text)
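
The median filter is the step that deals with the black dots: each output pixel becomes the median of its 3x3 neighbourhood, so an isolated dark speck surrounded by white is outvoted and disappears, while thicker character strokes mostly survive. Below is a minimal sketch of the same steps wrapped in functions, assuming Pillow is installed (and pytesseract plus the Tesseract binary for the OCR call). The names `clean_for_ocr` and `ocr_image` are hypothetical, and passing the image object straight to `image_to_string` instead of saving `temp2.jpg` first is a variation on the answer, not part of it.

```python
from PIL import Image, ImageEnhance, ImageFilter


def clean_for_ocr(im, contrast=2.0):
    """Denoise and binarize an image with the same steps as the answer.

    `clean_for_ocr` is a hypothetical helper name used for illustration.
    """
    im = im.convert("RGB")                             # a mode every step below accepts
    im = im.filter(ImageFilter.MedianFilter(size=3))   # 3x3 median removes isolated specks
    im = ImageEnhance.Contrast(im).enhance(contrast)   # push grays toward black or white
    return im.convert("1")                             # 1-bit black-and-white image


def ocr_image(im):
    """Run Tesseract on the cleaned image (hypothetical helper)."""
    import pytesseract  # imported here so clean_for_ocr works without Tesseract
    return pytesseract.image_to_string(clean_for_ocr(im))


if __name__ == "__main__":
    # Synthetic check: a white image with one black speck at (5, 5).
    demo = Image.new("RGB", (11, 11), (255, 255, 255))
    demo.putpixel((5, 5), (0, 0, 0))
    cleaned = clean_for_ocr(demo)
    print(cleaned.getpixel((5, 5)))  # 255 (white): the speck is gone
```

With a real file, something like `print(ocr_image(Image.open("temp.jpg")))` would run the whole pipeline. If larger blobs remain, a bigger odd `size` for `MedianFilter` may help, at the cost of eroding thin strokes.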

