Phoebe Liu (Appen)
Training Conversational Agents on Noisy Data
Featuring:
Phoebe Liu, Senior Data Scientist at Appen
Agenda:
- Data collection and annotation for conversational agents
- Designing dialogues for social robots
- Learning by imitation for social robots
NLU for conversation AI involves 3 steps:
- Defining intent
- Utterance collection
- Entity extraction
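As a rough illustration of how these three steps fit together, here is a minimal Python sketch. Everything in it is an assumption made for illustration, not material from the talk: the intent names, the city gazetteer, and the `annotate` helper are all hypothetical, and real entity extraction would typically use a trained model rather than string lookup.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    label: str   # entity type, e.g. "city"
    text: str    # surface form found in the utterance
    start: int   # character offset where the entity begins
    end: int     # character offset just past the entity


@dataclass
class AnnotatedUtterance:
    text: str
    intent: str
    entities: list = field(default_factory=list)


# Step 1: define the intents the agent should recognize (hypothetical names).
INTENTS = {"book_flight", "check_weather"}

# Hypothetical gazetteer standing in for a real entity extractor.
CITY_GAZETTEER = ["Tokyo", "Osaka", "Kyoto"]


def extract_entities(text):
    """Step 3: find the first occurrence of each known city in the text."""
    entities = []
    for city in CITY_GAZETTEER:
        start = text.find(city)
        if start != -1:
            entities.append(Entity("city", city, start, start + len(city)))
    return sorted(entities, key=lambda e: e.start)


def annotate(text, intent):
    """Step 2: pair a collected utterance with its intent label and entities."""
    if intent not in INTENTS:
        raise ValueError(f"unknown intent: {intent}")
    return AnnotatedUtterance(text, intent, extract_entities(text))


example = annotate("Book a flight from Tokyo to Osaka", "book_flight")
print(example.intent, [(e.label, e.text) for e in example.entities])
```

Keeping character offsets on each entity makes it possible to check annotations against the original utterance text later.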
Challenges in designing dialogue:
- No way to collect human intent
- Hard to model the flow of real-world conversation
- Data collection can be noisy and costly
Watch a live demo of an ERATO social robot and its ability to hold a very natural conversation.
ERICA, the ERATO Intelligent Conversational Android, was developed for conversational interaction. Since its creation, its developers have added content over several months, and it now has more than 2,000 behaviors and more than 50 topic sequences.
Watch the systems in action as they recognize speech and take actions. Key takeaways: data collection and annotation work best with an in-situ approach and high-quality data. It's also important to use ML-assisted validators to reject noisy utterances from the outset.
For all this, we need good-quality training data!
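To make the idea of rejecting noisy utterances at collection time concrete, here is a minimal Python sketch of a threshold-based validator. It is not the validator described in the talk: the `toy_quality_score` function, the vocabulary, and the 0.6 threshold are all hypothetical, and the scorer is only a stand-in for a trained model that would return a quality or confidence score.

```python
# Hypothetical in-domain vocabulary; a real validator would use a trained model.
VOCAB = {"book", "a", "flight", "to", "tokyo", "what", "is", "the", "weather"}


def toy_quality_score(utterance):
    """Stand-in scorer: fraction of tokens found in the known vocabulary."""
    tokens = utterance.lower().split()
    if not tokens:
        return 0.0
    known = sum(1 for t in tokens if t.strip(".,!?") in VOCAB)
    return known / len(tokens)


def validate(utterances, score=toy_quality_score, threshold=0.6):
    """Split utterances into accepted and rejected lists by score."""
    accepted, rejected = [], []
    for utterance in utterances:
        if score(utterance) >= threshold:
            accepted.append(utterance)
        else:
            rejected.append(utterance)
    return accepted, rejected


accepted, rejected = validate(["Book a flight to Tokyo", "asdf qwer zxcv", ""])
print("accepted:", accepted)
print("rejected:", rejected)
```

Because the scorer is passed in as a parameter, the toy function could be swapped for a model-based one without changing the filtering logic.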