…to club it with Telegram, so you can upload an image to Telegram and get back the text from the image as a reply!
As you might already know, Google’s Cloud Vision allows developers to perform optical character recognition (OCR), so in this article I am going to jot down how to integrate it with a Telegram bot.
At this point, you should already have a Telegram bot that listens for chats containing text or images and replies to them depending on the context.
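For readers whose bot does not yet tell photo messages from text messages, here is a minimal sketch of that dispatch. The function names `classify_update` and `largest_photo_file_id` are hypothetical and not part of the bot described here; the only assumption about Telegram is that an update carries a `message` object with either a `text` field or a `photo` list of sizes, each with `file_id`, `width` and `height`.

```python
def classify_update(data: dict) -> str:
    """Return 'photo', 'text' or 'other' for a Telegram update dict."""
    message = data.get("message") or {}
    if message.get("photo"):
        return "photo"
    if "text" in message:
        return "text"
    return "other"


def largest_photo_file_id(data: dict) -> str:
    """Pick the file_id of the photo size with the most pixels."""
    sizes = data["message"]["photo"]
    best = max(sizes, key=lambda size: size["width"] * size["height"])
    return best["file_id"]
```

Picking the largest size explicitly does not rely on the order of the `photo` list, and a higher resolution generally gives OCR more to work with.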
To enable the Cloud Vision API
- In the Cloud console, go to the Create service account page.
- Select your project.
- In the Service account name field, enter a name. The Cloud console fills in the Service account ID field based on this name.
- In the Service account description field, enter a description.
- Click Create and continue.
- To provide access to your project, grant the following role(s) to your service account: Project > Owner.
  - In the Select a role list, select a role.
  - For additional roles, click Add another role and add each additional role. Alternatively, use a custom role that meets your needs.
- Click Continue.
- Click Done to finish creating the service account.
- Do not close your browser window. You will use it in the next step.
- Create a service account key:
  - In the Cloud console, click the email address for the service account that you created.
  - Click Keys.
  - Click Add key, and then click Create new key.
  - Click Create. A JSON key file is downloaded to your computer.
  - Click Close.
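Before wiring the downloaded key into the bot, it can help to check that the file really is a service account key. The sketch below is one way to do that with the standard library only; `use_service_account_key` and `REQUIRED_KEY_FIELDS` are hypothetical names, and the field list is a small subset chosen for illustration, not an official schema.

```python
import json
import os

# A few fields a service account key file normally contains (illustrative subset).
REQUIRED_KEY_FIELDS = ("type", "project_id", "private_key", "client_email")


def use_service_account_key(key_path: str) -> str:
    """Validate the key file, export its path, and return the account email."""
    with open(key_path, "r", encoding="utf-8") as fh:
        key = json.load(fh)
    missing = [field for field in REQUIRED_KEY_FIELDS if field not in key]
    if missing:
        raise ValueError(f"key file is missing fields: {', '.join(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"unexpected key type: {key['type']!r}")
    # Google client libraries read this variable to locate the credentials.
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = os.path.abspath(key_path)
    return key["client_email"]
```

Calling this once at startup fails fast with a readable error if the wrong file was downloaded or moved.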
Install the Python library:
$ pip install --upgrade google-cloud-vision
Code
import os

import requests
from google.cloud import vision

# vision_image_path, tg_endpoint and file_endpoint are expected to be
# defined elsewhere in the bot.


def detect_text():
    """Detects text in the file."""
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/location/of/the/json/downloaded/from/previous/step/key.json"
    client = vision.ImageAnnotatorClient()
    path = f'{vision_image_path}/vision.jpg'
    with open(path, 'rb') as image_file:
        content = image_file.read()
    image = vision.Image(content=content)
    response = client.text_detection(image=image)
    # The image is no longer needed once the API has answered.
    os.remove(path)
    if response.error.message:
        return 'ERROR!\nFor more info on error messages, check:\nhttps://cloud.google.com/apis/design/errors'
    elif response.text_annotations:
        # The first annotation holds the full detected text.
        texts = response.text_annotations
        return texts[0].description
    else:
        return "no text was detected."


# data is the JSON posted by Telegram
def photo_message(data):
    chat_id = data['message']['from']['id']
    message_id = data['message']['message_id']
    # Let the user know that the image has been received.
    parameters = {'chat_id': chat_id,
                  'text': 'Detection in progress', 'reply_to_message_id': message_id}
    requests.post(url=f"{tg_endpoint}/sendmessage", data=parameters)
    # The last entry in 'photo' is the largest size; ask Telegram where to download it.
    file_link = requests.post(
        url=f"{tg_endpoint}/getFile?file_id={data['message']['photo'][-1]['file_id']}").json()
    downloaded_file = requests.get(
        url=f"{file_endpoint}/{file_link['result']['file_path']}", allow_redirects=True)
    # Since I am the only one using the bot, I set a static filename.
    # You might want to allocate the file name dynamically (hint: use a timestamp).
    with open(f'{vision_image_path}/vision.jpg', 'wb') as image_file:
        image_file.write(downloaded_file.content)
    detected_text = detect_text()
    if len(detected_text) > 3999:
        # Long results are sent as several messages of at most 4000 characters.
        x = 4000
        message_array = [detected_text[y-x:y]
                         for y in range(x, len(detected_text)+x, x)]
        for message in message_array:
            parameters = {'chat_id': chat_id,
                          'text': message, 'reply_to_message_id': message_id}
            requests.post(url=f"{tg_endpoint}/sendmessage", data=parameters)
    else:
        parameters = {'chat_id': chat_id,
                      'text': detected_text, 'reply_to_message_id': message_id}
        requests.post(url=f"{tg_endpoint}/sendmessage", data=parameters)
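Two parts of the code above can be pulled out into small helpers: splitting a long result into chunks, and giving every downloaded image its own file name so that two uploads cannot overwrite each other. This is only a sketch; `split_message` and `unique_image_path` are hypothetical names, and the 4000-character default simply mirrors the value used above, which stays below Telegram's 4096-character limit for a message.

```python
import time
import uuid
from pathlib import Path


def split_message(text: str, limit: int = 4000) -> list[str]:
    """Cut text into consecutive chunks of at most `limit` characters."""
    return [text[i:i + limit] for i in range(0, len(text), limit)]


def unique_image_path(directory: str, suffix: str = ".jpg") -> Path:
    """Build a file path from a timestamp plus a random part to avoid clashes."""
    name = f"vision-{int(time.time())}-{uuid.uuid4().hex}{suffix}"
    return Path(directory) / name
```

With these, `photo_message` could loop over `split_message(detected_text)` for short and long results alike, and `detect_text` would take the path returned by `unique_image_path` as an argument instead of relying on a fixed file name.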