fuck tesseract-ocr
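I've gone through a lot of material and asked a few people online, but I still have no idea where to start. fuck, I just needed to vent. Could some expert explain tesseract-ocr?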
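[Solution]
http://code.google.com/p/tesseract-ocr/wiki/Documentation has several analysis articles by Google engineers; they are probably the best documentation for getting to know the project. This article (http://www.cnblogs.com/brooks-dotnet/archive/2010/10/05/1844203.html) also gives a fairly detailed introduction.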
[Solution]
What on earth is this...
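[Solution]
This software's Chinese support still isn't good enough: with even a little noise in the image it outputs a lot of garbage characters. I'm also looking for a better OCR package.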
[Solution]
VERSION 5.00
Begin VB.Form Form1
Caption = "OCR text recognition in VB"
ClientHeight = 3195
ClientLeft = 60
ClientTop = 345
ClientWidth = 4680
LinkTopic = "Form1"
ScaleHeight = 3195
ScaleWidth = 4680
StartUpPosition = 3 'Windows Default
Begin VB.CommandButton Command1
Caption = "Recognize"
Height = 495
Left = 1800
TabIndex = 0
Top = 1320
Width = 1215
End
End
Attribute VB_Name = "Form1"
Attribute VB_GlobalNameSpace = False
Attribute VB_Creatable = False
Attribute VB_PredeclaredId = True
Attribute VB_Exposed = False
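Option Explicit

' MiLANGUAGES value for Simplified Chinese. It has to be declared here because
' the late-bound CreateObject call does not import the MODI type library.
Private Const miLANG_CHINESE_SIMPLIFIED As Long = 2052

Private Sub Command1_Click()
    Dim strLayoutInfo As String
    Dim miDoc As Object
    Dim modiLayout As Object

    On Error GoTo CleanUp
    Screen.MousePointer = vbHourglass ' show the busy cursor

    ' Initialize and load the document
    Set miDoc = CreateObject("MODI.Document") ' create the MODI object
    miDoc.Create "c:\VBOCR\z.tif" ' load the image file

    ' Recognize as Simplified Chinese; this is the only line that does real work
    miDoc.Images(0).OCR miLANG_CHINESE_SIMPLIFIED, True, True
    Set modiLayout = miDoc.Images(0).Layout ' read back the result

    strLayoutInfo = _
        "Language: " & modiLayout.Language & vbCrLf & _
        "Number of characters: " & modiLayout.NumChars & vbCrLf & _
        "Number of fonts: " & modiLayout.NumFonts & vbCrLf & _
        "Number of words: " & modiLayout.NumWords & vbCrLf & _
        "Beginning of text: " & Left(modiLayout.Text, 50)
    ' Words(0) raises an error when nothing was recognized, so guard it
    If modiLayout.NumWords > 0 Then
        strLayoutInfo = strLayoutInfo & vbCrLf & _
            "First word of text: " & modiLayout.Words(0).Text
    End If
    Debug.Print modiLayout.Text
    MsgBox strLayoutInfo, vbInformation + vbOKOnly, "Layout Information"

CleanUp:
    If Err.Number <> 0 Then MsgBox "OCR failed: " & Err.Description, vbExclamation
    Set modiLayout = Nothing
    Set miDoc = Nothing
    Screen.MousePointer = vbDefault ' always restore the cursor
End Sub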
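[Solution]
If you implement the character segmentation yourself and then feed the individual characters to tesseract, the results will be better.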
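To make the segmentation idea concrete, here is a rough sketch of one common approach, vertical projection: count the ink pixels in each column and cut wherever a column is empty. This is only an illustration, not necessarily what the poster had in mind. The function names (`segment_columns`, `crop`, `parse`) and the tiny `#`/`.` sample are made up for the example, and it assumes the image is already binarized into rows of 0/1 values.

```python
from typing import List, Tuple

Bitmap = List[List[int]]  # rows of pixels, 1 = ink, 0 = background


def segment_columns(bitmap: Bitmap) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges that contain ink."""
    if not bitmap:
        return []
    width = len(bitmap[0])
    # Vertical projection: number of ink pixels in each column.
    profile = [sum(row[x] for row in bitmap) for x in range(width)]

    spans = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                  # entering a run of inked columns
        elif count == 0 and start is not None:
            spans.append((start, x))   # leaving it at the first blank column
            start = None
    if start is not None:              # glyph touching the right edge
        spans.append((start, width))
    return spans


def crop(bitmap: Bitmap, span: Tuple[int, int]) -> Bitmap:
    """Cut one column range out of every row."""
    start, end = span
    return [row[start:end] for row in bitmap]


def parse(art: List[str]) -> Bitmap:
    """Hypothetical helper: turn '#'/'.' text art into a 0/1 bitmap."""
    return [[1 if ch == "#" else 0 for ch in line] for line in art]


SAMPLE = parse([
    "##..#..###",
    "#...#....#",
    "##..#..###",
])

if __name__ == "__main__":
    for span in segment_columns(SAMPLE):
        print(span, crop(SAMPLE, span))
```

Each cropped glyph could then be saved as its own image and passed to tesseract in its single-character page segmentation mode (psm 10). Note that a plain blank-column cut will wrongly split Chinese characters made of separate left and right components, and it will not separate characters that touch, so real input would likely need spans merged or split according to the expected character width.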
[Solution]
http://blog.csdn.net/fengbingchun/article/details/8493877