With this tutorial, you'll learn how to create a simple Voice Bot in Alexa step by step.
Chatbots and voice bots have become extremely common these days. They provide a simple mechanism to interact with customers without the need for support agents. In this blog post, we will explain how to create a simple Alexa-based voice bot.
Most bots today are built on intent recognition models. You define a set of intents and, for each intent, a response. For example, suppose you run a bank and want to build a chatbot or a voice bot that lets users ask questions about their bank account.
One such question is the account balance. Users can ask for their account balance in many different ways.
All of these phrasings are variations of the “Account Balance” intent. They all have the same meaning: the user is asking for his/her account balance. The bot could query the user’s account balance from the bank’s database and the reply could be: “Your account balance is $6,491.00”.
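To make the idea of "variations of one intent" concrete, here is a minimal sketch of intent matching by word overlap. It is only an illustration, not how Alexa recognizes intents internally; the training sentences and the `recognize_intent` helper are hypothetical and were not part of the original post.

```python
import re

# Hypothetical training sentences for one intent (illustrative only).
TRAINING_SENTENCES = {
    "AccountBalance": [
        "what is my account balance",
        "how much money do i have",
        "tell me my balance",
    ],
}


def tokenize(sentence):
    """Lowercase a sentence and return its words as a set."""
    return set(re.findall(r"[a-z']+", sentence.lower()))


def recognize_intent(utterance, training=TRAINING_SENTENCES):
    """Return the intent whose training sentence overlaps most, or None."""
    words = tokenize(utterance)
    best_intent, best_score = None, 0.0
    for intent, sentences in training.items():
        for sentence in sentences:
            sample = tokenize(sentence)
            union = words | sample
            if not union:
                continue  # nothing to compare
            # Jaccard similarity: shared words / all distinct words.
            score = len(words & sample) / float(len(union))
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent


if __name__ == "__main__":
    print(recognize_intent("What's my balance?"))
```

A production service uses trained language models rather than word overlap, which is why it can also handle phrasings that share few words with the training sentences.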
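For each intent, we need three things: a name for the intent, a set of training sentences (the variations users might say), and the response the bot should give.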
Suppose we are building a bot for a startup that offers two products: mobile phones and laptops. We could define three intents, one for each basic question the bot should answer: the types of products offered, the price of a mobile, and the screen size of the laptops sold.
The list can be extended with more intents based on your business requirements. Now, let’s get to the actual voice bot implementation.
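The three intents can be written down as a small table before touching the Alexa console. The sketch below is one possible layout, not a required format. The intent names and responses are the ones used in the Lambda code later in this tutorial; utterances marked as hypothetical are illustrative, and `validate_intent` is a hypothetical helper.

```python
# Each intent: a name, sample utterances, and the bot's response.
INTENTS = [
    {
        "name": "OfferedProducts",
        "utterances": ["what products do you offer",
                       "i want to know the products"],
        "response": "We offer 2 products Mobiles and Laptops",
    },
    {
        "name": "MobileCharges",
        "utterances": ["how much does a mobile cost"],  # hypothetical
        "response": "Our mobile price varies from $800 - $1,200 per piece.",
    },
    {
        "name": "LaptopSize",
        "utterances": ["what screen size are your laptops"],  # hypothetical
        "response": "We sell only 15.6-inch laptops from Apple.",
    },
]


def validate_intent(intent):
    """Return a list of problems found in one intent definition."""
    problems = []
    if " " in intent["name"]:
        problems.append("intent name must not contain spaces")
    if not intent["utterances"]:
        problems.append("at least one sample utterance is needed")
    if not intent["response"]:
        problems.append("response must not be empty")
    return problems


# Lookup table from intent name to response, as used by a handler.
RESPONSES = {intent["name"]: intent["response"] for intent in INTENTS}

if __name__ == "__main__":
    for intent in INTENTS:
        print(intent["name"], validate_intent(intent) or "ok")
```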
First, head to the Amazon Alexa Developer Console. You will be redirected to a login page. If you do not have an account with Amazon Alexa, you can click on the “Create your Amazon account” button. Once you have created your Amazon Alexa account, you can log in to the Amazon Alexa Developer Console.
Click on the “Create Skill” button on the right.
Provide a Skill name and select the Default Language (English (US) in our case). Then click on the “Create skill” button on the right side of the page.
We will be creating a skill from scratch, so select the “Start from scratch” button and click “Choose”.
Your skill will be set up with some “default” intents like “Stop”. On the panel on the left, click on “Invocation”.
Provide a good invocation name for your Alexa Skill (your bot). Users trigger your bot on Amazon Alexa using this name. For instance, if you set your invocation name to “brain hub”, your users would be able to trigger the skill by saying “Alexa, ask brain hub …”
Click “Save Model” at the top.
Now, let’s add some intents to the skill that we created above. Click on the “+ Add” button on the left side of the page.
Enter “OfferedProducts” as the intent name. Note that we removed the space between Offered and Products in the intent name. This is because spaces are not allowed in Alexa intent names. Then click “Create custom intent”.
You will be taken to the “Sample Utterances” of the intent. These are the training sentences/variations for your intent, as we discussed above. Type in “What products do you offer” and press enter. Similarly, add the other sample utterance “I want to know the products”. Note that you cannot use punctuation marks in sample utterances.
Click on “Save Model” again.
You will get a notification confirming that the model was saved.
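Do the same for the other two intents that we talked about, mobile prices and laptop size. Name them “MobileCharges” and “LaptopSize” so that they match the keys used in the Lambda code later in this tutorial. Then click on “Build Model”. You will get a notification once your build is successful.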
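The console also exposes the model you just built as a JSON document (under “JSON Editor” in the left panel). As a rough sketch, the model for this skill could be generated as shown below. The field layout reflects our understanding of the interaction model schema and should be checked against what your console shows. The invocation name, the utterances marked hypothetical, and the `build_interaction_model` helper are assumptions for illustration.

```python
import json

# Built-in intents that a "Start from scratch" skill typically includes.
BUILT_IN_INTENTS = [
    "AMAZON.CancelIntent",
    "AMAZON.HelpIntent",
    "AMAZON.StopIntent",
    "AMAZON.FallbackIntent",
]

# (intent name, sample utterances); no punctuation in the samples.
CUSTOM_INTENTS = [
    ("OfferedProducts", ["what products do you offer",
                         "i want to know the products"]),
    ("MobileCharges", ["how much does a mobile cost"]),       # hypothetical
    ("LaptopSize", ["what screen size are your laptops"]),    # hypothetical
]


def build_interaction_model(invocation_name, custom_intents):
    """Assemble an interaction model dictionary for a simple skill."""
    intents = [{"name": name, "samples": []} for name in BUILT_IN_INTENTS]
    for name, samples in custom_intents:
        intents.append({"name": name, "slots": [], "samples": list(samples)})
    return {
        "interactionModel": {
            "languageModel": {
                "invocationName": invocation_name,
                "intents": intents,
                "types": [],
            }
        }
    }


if __name__ == "__main__":
    model = build_interaction_model("brain hub", CUSTOM_INTENTS)
    print(json.dumps(model, indent=2))
```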
Now, let’s get to the slightly tricky part – creating the AWS Lambda endpoint for the bot which will allow us to feed answers into the bot.
Head to Amazon AWS Console. If you do not have an account, register and log in. Make sure that you have added the appropriate payment method under the billing tab so as to activate your account.
Once you have logged in, on the top right corner of the page, make sure that your region is set to “US East (N. Virginia)”. The Alexa Skills Kit trigger is only offered in certain AWS regions, and this is the one used throughout this tutorial.
On the page, search for “Lambda” and click on it to open the AWS Lambda service page. The browser URL should look something like this: https://console.aws.amazon.com/lambda/home?region=us-east-1
Click on “Create function”.
Provide a name for your function and select Python 2.7 as the runtime. If Python 2.7 is no longer offered in your account, pick a Python 3 runtime instead; the code below is written to run on either. From the Role drop-down, select “Create a new role from one or more templates”. Enter a Role name and click “Create function” on the bottom right corner.
On the next page, scroll down and under the “Add triggers” section, click on “Alexa Skills Kit”. This tells AWS Lambda that our Alexa Skill can trigger it. At the bottom of the page, you will be asked to provide your “Skill ID”.
To obtain your “Skill ID”, head back to your Alexa Developer Console and click on “Endpoint” on the bottom left corner. Click “AWS Lambda ARN” and then under the “Your Skill ID” section, click “Copy to clipboard”. This will copy your Skill ID to the clipboard so that you can go back to AWS Lambda and paste it there.
Go back to AWS Lambda and paste your Skill ID and click “Add”.
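The Skill ID you just pasted also arrives inside every request, under `session.application.applicationId` (the Lambda code in this tutorial prints it). If you want an extra check inside the function itself, a minimal sketch could look like this. The `EXPECTED_SKILL_ID` value is a placeholder and `is_from_expected_skill` is a hypothetical helper, not part of the original code.

```python
# Placeholder: replace with the Skill ID copied from the Developer Console.
EXPECTED_SKILL_ID = "amzn1.ask.skill.00000000-0000-0000-0000-000000000000"


def is_from_expected_skill(event, expected_skill_id=EXPECTED_SKILL_ID):
    """Return True if the request was sent by the expected Alexa skill."""
    application = event.get("session", {}).get("application", {})
    return application.get("applicationId") == expected_skill_id


if __name__ == "__main__":
    sample_event = {
        "session": {"application": {"applicationId": EXPECTED_SKILL_ID}}
    }
    print(is_from_expected_skill(sample_event))
```

A handler could call this first and refuse to answer when it returns `False`.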
On the top right corner of the page, click “Save”. From the top right corner, copy the “ARN” and paste it in the Alexa Developer Console under “Default Region (Required)” and click “Save Endpoints”.
Now, refresh the page to load the AWS Lambda text editor. Copy/paste the following code in the editor and hit “Save”.
from __future__ import print_function

WELCOME_RESPONSE = "Hello! Welcome to my Bot! How can I help you?"
CLOSING_RESPONSE = "Thanks! Hoping to see you again!"
SORRY_RESPONSE = "I am sorry, I don't know that one."

INTENT_RESPONSE_DICT = {
    "OfferedProducts": "We offer 2 products Mobiles and Laptops",
    "MobileCharges": "Our mobile price varies from $800 - $1,200 per piece.",
    "LaptopSize": "We sell only 15.6-inch laptops from Apple.",
}


def build_speechlet_response(title, output, reprompt_text, should_end_session):
    return {
        'outputSpeech': {
            'type': 'PlainText',
            'text': output
        },
        'card': {
            'type': 'Simple',
            'title': title,
            'content': output
        },
        'reprompt': {
            'outputSpeech': {
                'type': 'PlainText',
                'text': reprompt_text
            }
        },
        'shouldEndSession': should_end_session
    }


def build_response(session_attributes, speechlet_response):
    return {
        'version': '1.0',
        'sessionAttributes': session_attributes,
        'response': speechlet_response
    }


def get_welcome_response():
    session_attributes = {}
    card_title = WELCOME_RESPONSE
    speech_output = WELCOME_RESPONSE
    reprompt_text = speech_output
    should_end_session = False
    return build_response(session_attributes, build_speechlet_response(
        card_title, speech_output, reprompt_text, should_end_session))


def handle_session_end_request():
    card_title = "Session Ended"
    speech_output = CLOSING_RESPONSE
    should_end_session = True
    return build_response({}, build_speechlet_response(
        card_title, speech_output, None, should_end_session))


def send_reply(intent, session, speech_output):
    session_attributes = {}
    reprompt_text = None
    should_end_session = False
    return build_response(session_attributes, build_speechlet_response(
        "My Bot", speech_output, reprompt_text, should_end_session))


def on_session_started(session_started_request, session):
    print("on_session_started requestId=" + session_started_request['requestId']
          + ", sessionId=" + session['sessionId'])


def on_launch(launch_request, session):
    print("on_launch requestId=" + launch_request['requestId'] +
          ", sessionId=" + session['sessionId'])
    return get_welcome_response()


def on_intent(intent_request, session):
    intent = intent_request['intent']
    intent_name = intent_request['intent']['name']
    if intent_name in INTENT_RESPONSE_DICT:
        return send_reply(intent, session, INTENT_RESPONSE_DICT[intent_name])
    else:
        if intent_name == "AMAZON.HelpIntent":
            return get_welcome_response()
        elif intent_name == "AMAZON.CancelIntent" or intent_name == "AMAZON.StopIntent":
            return handle_session_end_request()
        elif intent_name == "AMAZON.FallbackIntent":
            return send_reply(intent, session, SORRY_RESPONSE)
        else:
            return send_reply(intent, session, SORRY_RESPONSE)


def on_session_ended(session_ended_request, session):
    print("on_session_ended requestId=" + session_ended_request['requestId'] +
          ", sessionId=" + session['sessionId'])


def lambda_handler(event, context):
    """Entry point: route the incoming Alexa request to the matching handler."""
    try:
        print("event.session.application.applicationId=" +
              event['session']['application']['applicationId'])
        if event['session']['new']:
            on_session_started({'requestId': event['request']['requestId']},
                               event['session'])
        if event['request']['type'] == "LaunchRequest":
            return on_launch(event['request'], event['session'])
        elif event['request']['type'] == "SessionEndedRequest":
            return on_session_ended(event['request'], event['session'])
        elif event['request']['type'] == "IntentRequest":
            return on_intent(event['request'], event['session'])
        else:
            # Other request types carry no 'intent' key, so pass None
            # (send_reply does not use its intent argument).
            return send_reply(None, event['session'], SORRY_RESPONSE)
    except Exception as e:
        # Log the error and apologize instead of failing a second time
        # on a missing 'intent' or 'session' key.
        print("lambda_handler error: " + repr(e))
        return send_reply(None, None, SORRY_RESPONSE)
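Before switching to the Alexa console, it can help to exercise the handler with a hand-made event. The sketch below builds minimal events containing only the fields the code above reads; real Alexa requests carry many more. The helper names (`make_launch_event`, `make_intent_event`, `spoken_text`) and the placeholder IDs are hypothetical.

```python
def _base_event(request):
    """Wrap a request in the minimal session envelope the handler reads."""
    return {
        "version": "1.0",
        "session": {
            "new": True,
            "sessionId": "session-placeholder",
            "application": {"applicationId": "skill-id-placeholder"},
        },
        "request": request,
    }


def make_launch_event():
    """Event sent when the user opens the skill without a question."""
    return _base_event({"type": "LaunchRequest",
                        "requestId": "request-placeholder"})


def make_intent_event(intent_name):
    """Event sent when Alexa has matched the user's words to an intent."""
    return _base_event({"type": "IntentRequest",
                        "requestId": "request-placeholder",
                        "intent": {"name": intent_name}})


def spoken_text(response):
    """Extract the plain-text speech from a handler response."""
    return response["response"]["outputSpeech"]["text"]


if __name__ == "__main__":
    print(make_intent_event("OfferedProducts")["request"])
```

If these helpers are pasted below the Lambda code, `spoken_text(lambda_handler(make_intent_event("OfferedProducts"), None))` should give the “OfferedProducts” reply from `INTENT_RESPONSE_DICT`.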
Go back to the Alexa Developer Console and at the top, click on “Test”. Enable testing for the skill.
In the text box, type “ask brain hub what are the offered products”. You should immediately get the response “We offer 2 products Mobiles and Laptops”.
Observe that we did not add the variation “what are the offered products”, but still, Alexa was able to recognize it. That’s the power of Machine Learning and Natural Language Processing.
You can try the voice bot in your Alexa App as well. Go ahead and give it a shot!