How to implement an Assistant with Google Assist API

H.Nguyen · Feb 9, 2016 · Viewed 8.9k times

I have been checking out and reading about Google Now on Tap (from http://developer.android.com/training/articles/assistant.html).

It was very interesting to learn from that article that Now on Tap is based on Google's Assist API, which is bundled with Marshmallow, and that it seems possible to develop our own assistant (the term Google uses in the article for apps like Now on Tap) using the API.

However, the article only briefly discusses how to use the Assist API, and I couldn't find any additional information on using it to develop a custom assistant, even after spending a few days searching the Internet. No documentation and no examples.

I was wondering if any of you have experience with the Assist API that you could share? Any help appreciated.

Thanks

Answer

Fanglin · Mar 8, 2016

You can definitely implement a personal assistant just like Google Now on Tap using the Assist API, starting with Android 6.0. The official developer guide (http://developer.android.com/training/articles/assistant.html) describes how to implement it:

Some developers may wish to implement their own assistant. As shown in Figure 2, the active assistant app can be selected by the Android user. The assistant app must provide an implementation of VoiceInteractionSessionService and VoiceInteractionSession as shown in this example and it requires the BIND_VOICE_INTERACTION permission. It can then receive the text and view hierarchy represented as an instance of the AssistStructure in onHandleAssist(). The assistant receives the screenshot through onHandleScreenshot().

CommonsWare has four demos for basic Assist API usage. The TapOffNow demo (https://github.com/commonsguy/cw-omnibus/tree/master/Assist/TapOffNow) should be enough to get you started.

You don't have to use onHandleScreenshot() to get the relevant textual data; the AssistStructure passed to onHandleAssist() gives you a root ViewNode, which usually contains everything you can see on the screen.

You will probably also need to implement a function that recursively searches the children of this root ViewNode, so you can quickly locate the specific ViewNode you want to focus on.
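
As a rough illustration of that recursive search, here is a minimal sketch in plain Java. The Node class is a hypothetical stand-in for AssistStructure.ViewNode so the example runs outside Android; it mirrors only getClassName(), getText(), getChildCount() and getChildAt(), which ViewNode also provides. NodeMatcher, ViewNodeSearch, findFirst and collectText are names made up for this example and are not part of the Assist API. On a device you would presumably start from the root nodes obtained via AssistStructure.getWindowNodeCount() and getWindowNodeAt(i).getRootViewNode().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for android.app.assist.AssistStructure.ViewNode,
// so the sketch runs on a plain JVM. Only the accessors used below are mirrored.
final class Node {
    private final String className;
    private final CharSequence text;
    private final List<Node> children = new ArrayList<>();

    Node(String className, CharSequence text) {
        this.className = className;
        this.text = text;
    }

    // Test helper (not on the real ViewNode): appends a child, returns this node.
    Node add(Node child) {
        children.add(child);
        return this;
    }

    String getClassName() { return className; }
    CharSequence getText() { return text; }
    int getChildCount() { return children.size(); }
    Node getChildAt(int index) { return children.get(index); }
}

// Hypothetical callback describing which node we are looking for.
interface NodeMatcher {
    boolean matches(Node node);
}

final class ViewNodeSearch {
    private ViewNodeSearch() {}

    // Depth-first search: returns the first matching node, or null if none matches.
    static Node findFirst(Node node, NodeMatcher matcher) {
        if (node == null) return null;
        if (matcher.matches(node)) return node;
        for (int i = 0; i < node.getChildCount(); i++) {
            Node hit = findFirst(node.getChildAt(i), matcher);
            if (hit != null) return hit;
        }
        return null;
    }

    // Collects every non-empty text value in the subtree, in depth-first order.
    static List<String> collectText(Node node) {
        List<String> out = new ArrayList<>();
        collectText(node, out);
        return out;
    }

    private static void collectText(Node node, List<String> out) {
        if (node == null) return;
        CharSequence text = node.getText();
        if (text != null && text.length() > 0) out.add(text.toString());
        for (int i = 0; i < node.getChildCount(); i++) {
            collectText(node.getChildAt(i), out);
        }
    }

    public static void main(String[] args) {
        Node root = new Node("android.widget.FrameLayout", null)
            .add(new Node("android.widget.LinearLayout", null)
                .add(new Node("android.widget.TextView", "Hello"))
                .add(new Node("android.widget.EditText", "query")));

        Node edit = findFirst(root,
            n -> "android.widget.EditText".equals(n.getClassName()));
        System.out.println(edit.getText());
        System.out.println(collectText(root));
    }
}
```

Inside onHandleAssist() the same two methods could be written against ViewNode directly by swapping the Node type. Matching on class name is only one option; the matcher could just as well look at the node's text or resource id.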