Artificial intelligence (AI) is regarded as the next big thing as it aims to enable computers and machines to perceive, think, and act like human beings without even needing to be taught how.
Chinese AI enterprises such as iFlytek, Unisound, and Face++ have achieved significant progress in the “perception” aspect. They’ve done so by developing voice and image recognition technologies that work like the ears, mouth, and eyes of computers.
The next development stage of AI is to design a brain for computers, enabling them to understand information conveyed in text, voice, and image, and then act accordingly. This process is in fact similar to how the human brain functions. This sort of intelligence is exactly what Trio, a rising Beijing-based AI startup, is working on.
Trio’s co-founder and CEO, Zhuoran Wang, spoke with AllTechAsia and shed light on the AI startup behind Xiaomi smart TVs and Smartisan smartphones.
A team well versed in artificial intelligence
Trio was founded in February 2016 and raised three rounds of funding within a year, securing a total of USD 12 million. Unlike typical startups, which often burn through large sums on expansion during their early stages, Trio is already well on its way to breaking even this year, according to Wang. This would be quite an achievement.
Trio takes its name from its three co-founders: Wang, Chao Qi, and Yuchi Ma.
Wang, 35 and full of energy, learned to program as early as elementary school. Though he did not major in programming in college, he often spent time with computer science students and took part in several global algorithm competitions.
“When I was in my sophomore year, I was inspired by a professor who specialized in Natural Language Processing (NLP),” Wang recalled. “He led me to where I am.”
He went on to earn a doctorate in Computer Science from University College London and became a naturalized British citizen. He studied and worked in Britain for 10 years before returning to China in 2015 to work on Baidu’s Xiaodu Robot, now known as DuerOS. His expertise lies mainly in task-oriented spoken dialogue systems, in which chatbots respond to and fulfill user-prompted tasks.
Qi, CTO of Trio, played a critical role in creating Microsoft’s chatbot XiaoIce and Baidu’s Duer before co-founding the AI startup. Qi differs slightly from Wang in that he specializes in non-task-oriented dialogue systems, also known as free-chatting systems. Trio’s third partner and COO, Ma, is a veteran in marketing and business development. The three co-founders’ backgrounds and experience are highly complementary.
Aside from its co-founders, Trio has a team of about 90 staff members. Half of its research and development team comes from Microsoft, Baidu, IBM, and Tencent, with backgrounds in NLP.
“In terms of Natural Language Understanding and dialogue systems, our team is one of the best in the market,” Wang claimed.
The AI service provider of star products
Generally speaking, as Wang wrote in an article, human-machine dialogues fall into four categories: 1) non-task-oriented dialogue (or “open-domain social dialogue”), 2) task-oriented spoken dialogue, 3) question and answer (Q&A), and 4) recommendation. To combine these four AI functions into a single application, it is necessary to have a multi-domain decision-making system serving as the control center.
As mentioned earlier, in non-task-oriented dialogue, users interact freely and openly with chatbots without a fixed purpose. In task-oriented spoken dialogue, users and chatbots converse back and forth in order to accomplish a certain task, such as finding a good restaurant in a specific geographic area.
Q&A is even more straightforward than task-oriented spoken dialogue, as chatbots can provide answers directly to users’ questions, such as “How much does an adult panda weigh?” Lastly, recommendation refers to a system in which a chatbot automatically recommends information related to a user’s inquiry. For instance, when a user asks “What is the origin of the marathon sporting event?”, the chatbot answers the question and may then recommend an upcoming marathon in the user’s area.
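To make the idea of a control center more concrete, here is a minimal sketch of how a multi-domain decision step might route an utterance to one of the four categories. It is an illustration only, not Trio’s method: the keyword lists, the function names `route` and `respond`, and the marathon follow-up rule are all hypothetical, and a production system would rely on trained language-understanding models rather than hand-written rules.

```python
from typing import List

# Hypothetical trigger words, chosen only for this illustration.
TASK_VERBS = ("find", "book", "order", "play")
QUESTION_WORDS = ("what", "how", "who", "when", "where", "why")


def route(utterance: str) -> str:
    """Pick a dialogue category for one utterance using toy rules."""
    words = utterance.lower().split()
    if words and words[0] in TASK_VERBS:
        return "task"  # task-oriented spoken dialogue
    if words and words[0] in QUESTION_WORDS and utterance.strip().endswith("?"):
        return "qa"  # direct question and answer
    return "chat"  # non-task-oriented (free) chat as the fallback


def respond(utterance: str) -> List[str]:
    """Return the ordered list of modules the control center would invoke."""
    steps = [route(utterance)]
    # Toy recommendation rule: follow a marathon question with a suggestion.
    if steps[0] == "qa" and "marathon" in utterance.lower():
        steps.append("recommendation")
    return steps
```

Under these assumptions, “Find a good restaurant near me” is routed to the task module, the panda question goes to Q&A, and the marathon question triggers Q&A followed by a recommendation.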
“Our businesses and advantages are mainly focused on non-task-oriented dialogue, task-oriented spoken dialogue, and multi-domain decision-making systems,” Wang said.
At present, Trio’s technologies are mostly applied within three areas: business services, Internet of Things (IoT), and Anime, Comic and Games (ACG).
As for business services, Trio’s chatbot service integrates into the customer service systems of different enterprises, whether on their social media accounts, apps, or webpages.
In terms of IoT, Trio’s technologies have been applied to a wide range of smart devices, including Xiaomi’s Mi TVs, Mi AI speakers, and Smartisan smartphones. They enable Mi TVs and Mi AI speakers to interact with users through voice recognition.
Trio’s collaboration with Chinese smartphone maker Smartisan is more interesting and is what has helped Trio gain considerable fame.
A Smartisan OS function called “Big Bang” allows users to break any text into separate phrases or words, as if it had exploded in a big bang. Users can then select part of the text to copy and paste, drag pieces around to rephrase it, or translate it more easily. According to Wang, this function significantly improves the efficiency of processing text on mobile phones. Trio is now working with several other smartphone makers to promote the function.
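The core idea of breaking text into selectable pieces can be sketched in a few lines. This is not Smartisan’s or Trio’s implementation: the function names `explode` and `pick` are hypothetical, and the regular expression only handles languages that separate words with spaces. Chinese text would need a proper word segmenter.

```python
import re
from typing import List


def explode(text: str) -> List[str]:
    """Split text into word tokens and individual punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def pick(tokens: List[str], indices: List[int]) -> str:
    """Join the tokens a user selected, in the order given."""
    return " ".join(tokens[i] for i in indices)


tokens = explode("Meet me at Trio, Beijing.")
selection = pick(tokens, [3, 5])  # the user taps the 4th and 6th pieces
```

Here `tokens` holds seven pieces (five words, a comma, and a period), and selecting the pieces at positions 3 and 5 yields a short string ready to copy or translate.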
As for application in ACG, Trio’s technologies allow its partners, including Boston-based robot maker Jibo, to customise personalities for virtual characters.
Competitors at home and abroad
Though Trio’s co-founders were the masterminds behind Microsoft XiaoIce and Baidu Duer, what Trio is doing now differs considerably from their prior projects. Wang explained why.
Microsoft XiaoIce remains a non-task-oriented chatbot, while Baidu’s Duer has developed into a platform called DuerOS, similar to Amazon’s voice interaction platform, Alexa. While these competitors primarily provide services directly to individual consumers (B2C), all of Trio’s services are aimed at business partners (B2B). Besides differences in business models, their techniques vary as well. For instance, Trio’s bots are able to detect users’ emotions, while the other two lack the capacity to do so. Trio has also developed a children’s version of its chatbot based on the characteristics of young users.
Wang believes there are few direct competitors in China, and the startup’s major foreign rivals, Maluuba, API.AI, and Wit.ai, have been acquired by Microsoft, Google, and Facebook, respectively.
Wang said that Trio is now collaborating with Foxconn, the world’s largest contract electronics manufacturer, and several voice recognition companies to develop an Alexa-like platform, for which Trio will provide the multi-domain decision-making technology.
(Top photo from trio.ai)