Amazingly, I have done it.
This was a fairly difficult undertaking that took me through multiple development learning phases, including:
- Understanding how Alexa interprets intents and slots, and how it routes information through those intents
- Figuring out how Amazon employs account linking and uses OAuth
- Using Node for the first time and working out kinks in HTTP requests
- Being forced to figure out how promises work, to make sure that all API requests go through
- Dealing with several quirks in the new Alexa Skill Builder (beta)
I’ll go through some of these in detail and cover how I dealt with the issues as they came up.
Intents and Slots:
One of the ridiculous things I said prior to building this app was that I wouldn’t need any custom slot types, which came from not really understanding how slots worked and how I could pass information to the app with custom slots. Obviously my to-do list items had to be passed in as a custom slot type. Here are a few more things I learned regarding intents and slots:
Any intent can be triggered from any interaction
I set up my skill in separate states, so depending on what state a user was in, they might get different responses when triggering the same intent. I used this to funnel users down a path with the least likelihood of triggering an intent that they did not want. For example, when users say “Alexa, ask calendar to-do to add something to my list”, they enter a state in which there should only be two options: say an item to add to the list, or cancel (plus help, stop, and so on). However, I learned that these states don’t necessarily prevent a user from triggering an intent that is not listed in that state. From the add-an-item state, the user can say something, and if Alexa interprets it as a request to remove an item from the to-do list, you’re out of luck. The only way around that is to run the same code regardless of which of the two intents was triggered.
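As a rough sketch of that workaround (not the skill’s actual code), the handler table below maps both intents to the same function while the user is in the add-an-item state. The state name, intent names, and helper functions are all made up for illustration, and no Alexa SDK is involved.

```javascript
// Hypothetical state and intent names, for illustration only.
const STATES = { ADD_ITEM: 'ADD_ITEM' };

// One function does the real work for the "add item" state.
function addItem(session, slotValue) {
  session.items.push(slotValue);
  return `Added ${slotValue} to your list.`;
}

// While the user is in ADD_ITEM, every intent that carries an item is
// routed to addItem, even if Alexa matched the "remove" intent by mistake.
const handlersByState = {
  [STATES.ADD_ITEM]: {
    AddItemIntent: addItem,
    RemoveItemIntent: addItem,
  },
};

function handleIntent(session, intentName, slotValue) {
  const handlers = handlersByState[session.state] || {};
  const handler = handlers[intentName];
  if (!handler) {
    return "Sorry, I didn't catch that. Say an item to add, or say cancel.";
  }
  return handler(session, slotValue);
}

const session = { state: STATES.ADD_ITEM, items: [] };
console.log(handleIntent(session, 'RemoveItemIntent', 'buy milk'));
// prints "Added buy milk to your list."
```

The trade-off is that the state, not the intent name, decides what happens, so a misheard intent no longer derails the conversation.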
Interpreting any string in a slot can be a lot of work
People can put an unlimited variety of things in their to-do lists, and my paltry list of 20-something sample utterances simply did not recognize when I said something abnormal like ‘brush my chinchilla with salsa.’ Interpreting literal strings, as I discovered, was much easier if the sample utterances had a larger base to draw from. Cue me coming up with some of the most outrageous to-do list tasks ever until I had ~100 (e.g. “broker a peace agreement in the middle east”, “solve America’s opioid epidemic”, “reform the criminal justice system”, “run America like a company”, and many other items that might be on the list of some random guy who we’ll call, say, Jared). That seemed to solve the problem. Thanks Jared!
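If the samples are kept as a plain list, a few lines of Node can turn them into a slot type definition. This is only a sketch: the slot type name `TODO_ITEM` and the `buildSlotType` helper are hypothetical, and it assumes the interaction model JSON accepts slot values shaped like `{ name: { value } }`.

```javascript
// Hypothetical sample values, for illustration only.
const sampleItems = [
  'buy milk',
  'brush my chinchilla',
  'reform the criminal justice system',
  'Buy Milk', // becomes a duplicate once normalized
];

// Normalize and de-duplicate so the list can keep growing without repeats.
// ASSUMPTION: the { name, values: [{ name: { value } }] } shape matches
// what the interaction model expects for a custom slot type.
function buildSlotType(name, items) {
  const unique = [...new Set(items.map((item) => item.trim().toLowerCase()))];
  return {
    name,
    values: unique.map((value) => ({ name: { value } })),
  };
}

const slotType = buildSlotType('TODO_ITEM', sampleItems);
console.log(JSON.stringify(slotType, null, 2));
```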
Account linking is literally the easiest thing when you’re connecting to a Google API. You don’t need to know anything about OAuth, and you don’t need to do anything extraordinary. Unfortunately, I didn’t know that it was easy, and it ended up being one of the most frustrating things throughout the development process.
The backbone of the skill is making API requests to Google Calendar to read and write your to-do list, so the user has to link their account within the Alexa app. The easiest way to do it (if you’re using a Google API, or presumably another API that uses OAuth) is to set up your Google API in the console, and then copy all of the required information into the Alexa account linking section in the configuration tab (client secret, auth URL, client ID, token URI, etc.). Note: I read that you have to add subdomains to the domain list if you want to pull information from a subdomain. I have it set up this way, but I am not sure if it is true (i.e. I have both google.com and accounts.google.com in my domains list to make sure that account linking can get an access token from https://accounts.google.com/o/oauth2/token).
The biggest problem I ran into was the access token not getting refreshed, so I had to re-link the skill every hour if I still wanted to use it, which was obviously not acceptable. After scouring the web far and wide I found a forum post saying that I had to add `?access_type=offline` to my authorization URL, so it looked like this: https://accounts.google.com/o/oauth2/auth?access_type=offline. This solved my issue immediately.
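In the skill this URL is simply pasted into the account linking form, but if you would rather build it in code than by hand, a minimal sketch looks like this. The `buildAuthorizationUrl` name is made up; `URL` and `searchParams` are standard in Node. As far as I understand it, `access_type=offline` is what asks Google to hand back a refresh token alongside the access token.

```javascript
// Append access_type=offline to an authorization URL.
// The base URL and parameter come from the post; the function name is hypothetical.
function buildAuthorizationUrl(baseUrl) {
  const url = new URL(baseUrl);
  url.searchParams.set('access_type', 'offline');
  return url.toString();
}

const authUrl = buildAuthorizationUrl('https://accounts.google.com/o/oauth2/auth');
console.log(authUrl);
// prints "https://accounts.google.com/o/oauth2/auth?access_type=offline"
```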
For my first time using Node, I was fairly happy with how simple it seemed. There’s something about writing in normal JS that’s so calming and doesn’t make you want to pull your hair out and smash things. All of the modules are well documented and there seems to be a good community of developers with plenty of answers to difficult questions. I’m looking forward to using Node in the future for sure.
Actually making the HTTP requests was a little more frustrating, since the entire concept is a little bit over my head, but again the documentation in the request module and the simplicity of trying new things was good enough to get me started. The real problem was when I started sending responses to Alexa before I had received a response from the server, which led me to…
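To make that timing problem concrete, here is a small sketch. `fakeRequest` is a hypothetical stand-in for a callback-style HTTP call (it is not the real request module), and the reply strings are invented.

```javascript
// Stand-in for a callback-style HTTP call (hypothetical; not the real request module).
function fakeRequest(url, callback) {
  setTimeout(() => callback(null, { statusCode: 200 }, '["buy milk"]'), 10);
}

// Buggy: the reply is built before the callback has run, so items is still empty.
function respondTooEarly() {
  let items = [];
  fakeRequest('https://example.com/list', (err, res, body) => {
    items = JSON.parse(body);
  });
  return `You have ${items.length} items.`;
}

// Fixed: build the reply inside the callback, once the data has arrived.
function respondWhenReady(done) {
  fakeRequest('https://example.com/list', (err, res, body) => {
    if (err) return done(err);
    const items = JSON.parse(body);
    done(null, `You have ${items.length} items.`);
  });
}

console.log(respondTooEarly()); // prints "You have 0 items."
respondWhenReady((err, speech) => console.log(speech)); // prints "You have 1 items."
```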
Having not had to make a ton of continuous API requests in the past, I had only known about the concept of promises without understanding how to use them. I had to really get into it and figure out how to chain my multiple API requests together and get those responses before I could move on to the next step. Suffice it to say, I’m extremely happy that I was forced to learn that, as I’ve been in multiple situations where I just descended down the callback spiral and found myself wondering why there wasn’t a better way. Now I know!
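A stripped-down sketch of that kind of chain is below. The two fetch functions and their data are hypothetical stand-ins for real API calls; the point is only that each `.then` waits for the step before it, and the speech is built at the very end.

```javascript
// Stand-ins for two dependent API calls (hypothetical names and data).
function fetchCalendarId() {
  return new Promise((resolve) => setTimeout(() => resolve('calendar-123'), 10));
}

function fetchTodoItems(calendarId) {
  return new Promise((resolve) =>
    setTimeout(() => resolve({ calendarId, items: ['buy milk', 'walk dog'] }), 10)
  );
}

// Each .then waits for the previous step; the speech is built only at the end.
function buildSpeech() {
  return fetchCalendarId()
    .then((calendarId) => fetchTodoItems(calendarId))
    .then((list) => `Your list has ${list.items.length} items: ${list.items.join(', ')}.`)
    .catch((err) => `Sorry, something went wrong: ${err.message}`);
}

buildSpeech().then((speech) => console.log(speech));
// prints "Your list has 2 items: buy milk, walk dog."
```

Compared with nesting callbacks, the chain stays flat no matter how many requests are added, and a single `.catch` covers all of them.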
I may be completely wrong about this but my overall feeling with development was that there were a lot of sample projects that you could work on that would help you understand how to develop a skill, but there was no documentation devoted specifically to development – what the Alexa object was, how to get a slot value, how to set up different states, etc. All of this information could be found by poking around different examples, but as with everything in the development world, people tend to do things differently and if your project isn’t set up the same way as someone else’s that can tend to be frustrating. There was a bunch of documentation set up to discuss the concepts of intents, slots, account linking, responses, etc. but there was not very much discussing how to put those concepts into practice.
This is more of an issue with AWS Lambda, but do I really have to upload my entire project every time I make a change to my code? Is there no possible way to have a text editor that can tweak something very quickly, or to upload one file instead of the several that I have in a zipped file? Making minute changes when testing the service was one of the most time-consuming, frustrating processes because it would take 5 minutes to test just one thing. I’m sure there is a better way (e.g. hosting the code myself), but I don’t know how to do that, so I guess I’m stuck with the Lambda environment.
This is no one’s fault in particular, but I noticed that the developer community for this particular product is much smaller than a community for a language or framework (as is to be expected), so it takes much longer to find out if someone is having the same problems as you or how to solve said problems. Several of my questions on the Amazon developer forum went unanswered, and though I do appreciate that they have Amazon staff working to respond to people’s questions, it’s still a pain to have to wait so long for an answer before you can continue with your project. As I move into more projects like this though, I suppose I should get used to it.
If you want to take a look at my skill and use it, you can find it here:
The GitHub page for the skill can be found here: https://github.com/mattcheah/alexa-calendar-to-do, if you want to make any changes or add any functionality for yourself.
I have a few planned improvements for when I have time. Namely, Amazon now allows developers access to the to-do lists that a user already has in the Alexa app, so I’d like to add downstream and upstream syncs, so that when you use the calendar to-do skill, it will pull all of the information from the built-in Alexa to-do list and add it to the list on your calendar.
I’d love to hear feedback or any other thoughts! Thanks for reading guys.