
"Hey Google, How Do I Create Actions for the Home?"

Michael Fazio | Oct 05, 2017
 

Welcome to the Google Home!

Google released the Home late in 2016 to choruses of "Isn't that the same as the Echo?" and "Wait, is that an air freshener?" Yet when people finally got their hands on one, they quickly realized how many features it had and what they could do with it. You can get directions from home to any location, translate words to and from English, play music from many services, and (not surprisingly, given it's a Google device) search for anything you need.
 
However, not everything was there yet. Only a single user account could be assigned to the device, it couldn't set up reminders, and it didn't have the Bluetooth support the Echo offers.
 
Now, a year later, all of those gaps have been addressed, and then some. You can now call people directly from the Home using your Google Voice number. Not only can the device handle multiple accounts, but it can also recognize multiple people's voices and respond accordingly. They even added Bluetooth support after the fact, which was a shock to those of us who didn't know the hardware was even onboard. All of this was delivered remotely, without any user interaction.
 
Oh, and did I mention that over the summer Google expanded the Actions on Google tools (used for building Actions on the Home) to work with the Google Assistant? Users with Android phones or the Google Assistant app on iOS can now use all of the same Actions on Google as Home users (with extra visual info, even).
 
This week, Google announced the next generation of Home devices, specifically the Google Home Mini: a smaller, $49 version of the Home that will compete directly with the Amazon Echo Dot.
 
[Image: Google Home devices]
 
They also announced the Google Home Max, a high-end speaker coming out in December. It will sell for $399 and is intended to compete with Sonos and Apple's upcoming HomePod.
 
[Image: Google Home Max]
 
Given all that, the Google Home devices aren't going anywhere. Plus, when you consider all the devices that can utilize Google Assistant, now is a great time to start building your own Actions on Google.
 

Dialogflow

To create our Action on Google, we'll be utilizing Dialogflow (formerly api.ai), a natural language processing platform owned by Google. Neat, right?
 
[Image: Dialogflow logo]

Now, you're probably thinking "Yeah, neat…what does that even mean?" I know I asked the same thing when I found out about the site last year. Simply put, Dialogflow is a platform where you can turn normal text into actionable data.
 
Sorry, more (overly?) technical terms. It's a platform for taking text input and acting upon it (plus, usually, retrieving context from what's said). For example, if someone says to me, "Do you want to meet me for pizza at 7:00?" my brain will process their statement and pull out the significant items of context: pizza, 7:00, meet, and me. Based on those pieces, I quickly know that I'm meeting (not picking up) my friend (not someone else) at 7:00 (and I can assume 7:00 PM, since 7:00 AM pizza is less likely, though totally valid). I also know we're having pizza, but I'll likely ask what restaurant they have in mind. Once I find out they're thinking Rocky's in Pewaukee (good choice), we're all set.
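If we imagine that exchange as structured data, it might look something like the sketch below. This is purely illustrative; the field names are mine, not Dialogflow's actual output format.

```javascript
// Purely illustrative: how an NLP engine might represent
// "Do you want to meet me for pizza at 7:00?" after processing.
// These field names are invented for this example.
const parsed = {
  intent: 'schedule-meetup',
  parameters: {
    activity: 'pizza',
    time: '19:00',         // "7:00" resolved to the likelier evening slot
    invitee: 'the speaker' // "me" refers back to the person asking
  },
  missing: ['location']    // we still need to ask which restaurant
};
```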
 
That exchange is very simple for humans: we can process everything and reach a resolution within a few seconds, even if we have a question or two. It's much harder for machines to understand the same scenario, though the technology behind natural language processing is improving on a daily (honestly, hourly) basis. Platforms like Dialogflow make this processing much easier to use. We can quickly specify likely or common phrases, how we should react, and even how to recognize contextually important items. Trust me, this will make more sense if we walk through an actual agent.
 

The Skyline Agent

A Dialogflow Agent is a module that contains a set of components that process text input. These components are called Intents, and we'll get to them in a second, but first we need to do a bit of configuration on our Agent.
 
[Image: Creating the Skyline Agent in Dialogflow]

At a base level, we just need to give our Agent a name and a short description; we can use the defaults otherwise. Whenever you create a new Agent, Dialogflow also creates a new Google project for you. This lets you quickly get your Agent ready for the Google Assistant, as well as tie into other services like Firebase.
 

Intents

Now that our Agent's ready, let's get some Intents set up. Intents, as I mentioned before, are the connection between input from a user and an action (with some kind of fulfillment).
 

This is going to be a simple Intent that lets someone ask where Skyline is located, with one basic response. Yes, it's not the most useful Action, but it's a good example. Plus, if we didn't have it in here, it'd feel like a huge gap to users.
 

User Says

In the section "User says", we're able to specify sample phrases that we want handled by this Intent.  Note that these are examples, not explicit statements. That means that not only will the Intent handle the listed phrases, but Dialogflow's processing engine can understand phrases which are said in a different but similar manner.
 
For example, if someone says, "Tell me where you're located," the Agent can pick the proper Intent. However, something like "Tell me where you are" hits the Default Fallback Intent (which is created automatically during Agent creation). It's recommended that you list as many examples as possible, because this helps the machine-learning model get "smarter" and handle a wider range of statements.
 
I will note that there is also a template mode for the "User says" section, but it's recommended that you use the example mode since it's easier and helps the machine-learning algorithms.
 

Response

Now that we're set with the user's side of the conversation, we probably should send something back, unless we're out to create the World's Most Boring Action (spoiler alert: we're not). We do this by specifying a Text Response in the Response section. We can list a single response and have it returned each time, or add several to mix things up.
 
[Image: Text Response section in Dialogflow]
 
Note that when you specify multiple responses, your Agent will cycle through all of them at random and make sure to use every response before repeating one.
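That rotation is easy to picture in code. Here's an illustrative sketch (not Dialogflow's actual implementation) of "pick at random, but use everything before repeating":

```javascript
// Illustrative sketch of the response rotation behavior described above:
// choose randomly, but exhaust every response before repeating any.
const responses = [
  'Response one...',
  'Response two...',
  'Response three...'
];
let remaining = [];

function nextResponse() {
  // Refill the pool once every response has been used
  if (remaining.length === 0) {
    remaining = responses.slice();
  }
  // Pick a random response and remove it so it can't repeat this cycle
  const index = Math.floor(Math.random() * remaining.length);
  return remaining.splice(index, 1)[0];
}
```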
 
We now have an actual Intent ready to go. We can even test this out on the right side of the site with any text we want to send in.
 
[Image: Testing the Intent in the Dialogflow console]
 
While this is now a working Intent, it still doesn't do much. The next logical step would be to add phrases like "Where is Skyline in Appleton located?" and have the Agent always return the Appleton address. Unfortunately, Dialogflow doesn't give us this kind of conditional logic on its own; we're required to use a webhook for anything extra. We won't go in depth on webhooks in this post, as they're worthy of a post of their own, but I'll touch on them a bit at the end.
 

Respond or End?

Keep in mind that each Intent, by default, keeps the conversation open for the Google Assistant. This means that when we finish responding from our Intent, the Agent doesn't shut down unless we tell it to. The Google Assistant section at the end of each Intent contains a checkbox to end the conversation once that Intent is fulfilled.
 
[Image: Google Assistant end-of-conversation checkbox]
 
If we want to continue the conversation after we send a response, we should include a transitional phrase so the user knows what to say next.
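Under the hood, that choice is just a flag on the response sent back to the Assistant. Here's a rough sketch of the v1-era payload; the exact field names vary by API version, so treat them as assumptions and check the docs:

```javascript
// Sketch of a webhook response that ends the conversation
// (v1-era Actions on Google format; field names are assumptions).
const response = {
  speech: 'Thanks for stopping by!',
  displayText: 'Thanks for stopping by!',
  data: {
    google: {
      // false tells the Assistant to close the conversation after this reply
      expectUserResponse: false
    }
  }
};
```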
 

Phone-Specific Responses

If we wanted, we could deploy this to our Google account and test it on real devices or the provided emulator. However, we'll wait until we have a bit more content before doing that. In the meantime, let's create another Intent, this time to tell the user about Skyline.
 
[Image: Phone-specific responses]
 
This looks like our other Intent, but we're going to add a little something extra for people using the Google Assistant on their phones.
 
[Image: Google Assistant response options]
 
We can add extra types of responses for people with a visual component to their assistant, like cards, lists, and links to websites.
 
[Image: Response types for Actions on Google]

Given that we're talking about Skyline, why don't we send people to the About Us page on SkylineTechnologies.com? This is easy to do with a "Link out suggestion" type of message content.
 
[Image: Link out suggestion configuration]
 
If we view this on a phone, it'll look something like this:
 
[Image: Mobile view of the Action]
 
If we press the "Open Skyline - About Us" button, it'll open the About Us page in the user's browser.
 
We can even add a bit more content with a card. For our About Skyline Intent, we're going to give users a card all about our blogs. This is also nice and straightforward to configure: we just have to set up our text, a link (with a title), and a picture of a handsome guy.
 
[Image: Basic card configuration]
 
If we look at the card on our phone, we're presented with this:
 
[Image: Basic card on a phone]
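Under the hood, the card and the link-out suggestion are entries in the rich response sent to the Assistant. Here's a rough sketch of that payload, assuming v1-era Actions on Google field names; the URLs and text are placeholders, not Skyline's actual values:

```javascript
// Sketch of a rich response with a basic card and a link-out suggestion
// (v1-era Actions on Google format; field names and URLs are assumptions).
const googlePayload = {
  expectUserResponse: true,
  richResponse: {
    items: [
      // The first item should be a simple (spoken/text) response
      { simpleResponse: { textToSpeech: 'Skyline is a technology consulting company...' } },
      {
        basicCard: {
          title: 'The Skyline Blogs',
          formattedText: 'Our consultants write about development, data, and more.',
          image: { url: 'https://example.com/blog-card.jpg', accessibilityText: 'Skyline blog' },
          buttons: [
            { title: 'Read the Blogs', openUrlAction: { url: 'https://www.skylinetechnologies.com' } }
          ]
        }
      }
    ],
    linkOutSuggestion: {
      destinationName: 'Skyline - About Us',
      url: 'https://www.skylinetechnologies.com' // placeholder for the About Us page
    }
  }
};
```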

Now we're getting somewhere! We can continue to add more single question-and-answer intents, but why not make our Agent a bit more conversational? We can achieve this via Contexts.
 

Contexts

Contexts allow our Agent to carry pieces of information over from one Intent to another. This means our Agent can ask the user a question, then use another Intent to respond to their answer. We can then have more of a conversation (with branching paths) instead of just "ask, answer, done."
 
We can specify Contexts manually if we're getting info from the user that we want to send along. For example, if we started our Agent with a welcome Intent asking for the user's name, we could store that in a Context called 'user-info' and send it to the rest of our Intents. That way, those Intents could include the user's name in their responses. Note that if a response references a parameter that isn't present, that response won't be used. This allows for a nice way to have both general and specific responses to user statements.
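To make that concrete, here's a rough sketch of how a webhook might set that 'user-info' Context and how a later response can read from it. The response format assumes the v1-era webhook contract, and the parameter value is hypothetical:

```javascript
// Sketch: a v1-era webhook response that stores the user's name
// in a 'user-info' Context for later Intents to read.
const response = {
  speech: 'Nice to meet you! What would you like to know about Skyline?',
  contextOut: [
    { name: 'user-info', lifespan: 5, parameters: { name: 'Michael' } }
  ]
};

// A later Intent's text response can then reference the stored value
// with Dialogflow's #context-name.parameter syntax:
//   "Glad you asked, #user-info.name! What else can I tell you?"
```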
 
The other main way to utilize Contexts is via a Follow-Up Intent. Follow-Up Intents are basically just Intents whose input Context matches their parent Intent's output Context, but Dialogflow shows them in a more user-friendly format. Let's take an example: someone hits our Skyline Agent and asks what services Skyline offers.
 
[Image: Skyline services Intent with output Context]
 
Right now, this is no different from our other Intents, so let's add a second step. Both of our responses finish by asking the user if they want to hear more, so let's handle both affirmative and negative answers. We add Follow-Up Intents from the Intents list page, as shown below:
 
[Image: Adding a Follow-Up Intent from the Intents list]
 
[Image: Follow-Up Intent options]

For our scenario, we're going to add both "yes" and "no" Follow-Up Intents. There's not really anything special about these Intents, other than that Dialogflow automatically attaches a Context and pre-fills some user statements.
 
Everything else is the same as we've previously set up. Still, this will allow us to quickly configure a response to the user depending on what they say back to our Agent.
 
[Image: Follow-Up Intent configuration]
 
In our "yes" Follow-Up Intent, we can add some statements about all the services Skyline offers:
 
[Image: "Yes" Follow-Up Intent statements]
 
As for the "no" Follow-Up Intent, we can just add a simple generic statement (or set of statements) telling the user to ask about something else:
 
[Image: "No" Follow-Up Intent]
 
In both scenarios, once the Agent has received the "yes" or "no" response, the Context clears and the user is back to the normal line of questioning. This means they can go back to asking any other question they may have.
 

Webhooks

As you've seen, Dialogflow is a great tool for getting Agents up and running quickly with some straightforward functionality. However, if you want to handle any major conditional logic (allowing the user to tell the Agent which service they want to hear about, for example), you'll need to utilize a webhook: a REST API that receives requests at a single endpoint and handles whatever is sent in. Your API gets the entire request and can send back any response you wish.
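To make that request/response contract concrete, here's a minimal sketch of a webhook endpoint. It assumes the v1 (api.ai-era) payload format and a hypothetical intent name and 'service' parameter; verify the field names against the docs for your API version:

```javascript
// Minimal fulfillment webhook sketch (v1-era payload format assumed).
// The intent name and 'service' parameter here are hypothetical.
const express = require('express');
const app = express();
app.use(express.json()); // built-in JSON body parsing (Express 4.16+)

app.post('/webhook', (req, res) => {
  const result = req.body.result || {};
  const intentName = result.metadata ? result.metadata.intentName : '';
  const service = (result.parameters || {}).service;

  let speech = "Sorry, I don't have details on that yet.";
  if (intentName === 'Skyline Services - Detail' && service) {
    speech = 'Here is more about our ' + service + ' offering...';
  }

  res.json({ speech: speech, displayText: speech });
});

app.listen(process.env.PORT || 8080);
```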
 
A full treatment of webhooks is outside the scope of this post, as I'm just hoping to show how to navigate Dialogflow here. For more info, Dialogflow has some solid reference docs: https://dialogflow.com/docs/fulfillment#webhook-example.

I will mention that Dialogflow just added the ability to create a Cloud Function for Firebase for fulfillment, right from Dialogflow.com. You can edit your JavaScript code and select any dependencies you like in a package.json file via an in-browser editor. This is really useful for basic conditional logic and even simple REST calls.
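A fulfillment function in that editor might look roughly like the sketch below; again, the intent name is hypothetical and the v1-era payload format is assumed:

```javascript
// Sketch of a Cloud Function for Firebase used as Dialogflow fulfillment
// (v1-era payload format assumed; the intent name is hypothetical).
const functions = require('firebase-functions');

exports.fulfillment = functions.https.onRequest((req, res) => {
  const result = req.body.result || {};
  const intentName = result.metadata ? result.metadata.intentName : '';

  const speech = intentName === 'About Skyline'
    ? 'Skyline Technologies is a technology consulting company in Wisconsin.'
    : "Sorry, I can't help with that one.";

  res.json({ speech: speech, displayText: speech });
});
```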
 
[Image: Dialogflow in-browser webhook editor]
 

What Do We Have Now?

We now have an Agent that can handle users asking about Skyline's offices and services, with an even nicer experience for phone users. This obviously isn't enough to be production-ready, but hopefully it gives you a good enough picture of how Dialogflow works and what you need to do to get an Agent off the ground!
 

tl;dr

  • Build Agents on Dialogflow for the Google Assistant (Google Home and phones with the Assistant)
  • Intents handle what a user says and how to respond
  • Follow-Up Intents use Contexts to intelligently respond to users
  • Webhooks allow you to do basically anything you want with your Agent
 
