Phoebe Liu (Appen)

Training Conversational Agents on Noisy Data

Featuring:

Phoebe Liu, Senior Data Scientist at Appen

Agenda:

  • Data collection and annotation for conversational agents
  • Designing dialogues for social robots
  • Learning by imitation for social robots

NLU for conversational AI involves three steps:

  • Defining intent
  • Utterance collection
  • Entity extraction
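As a rough illustration of how these three steps fit together, here is a minimal Python sketch. It is not the pipeline presented in the talk: the `book_flight` intent, the sample utterances, and the regex-based `ENTITY_PATTERNS` are all hypothetical, and a production system would more likely use a trained entity recognizer than hand-written patterns.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Intent:
    """Step 1: an intent is a named goal the user may express."""
    name: str
    # Step 2: collected example utterances that express this intent.
    utterances: list = field(default_factory=list)


# Step 3: hypothetical entity patterns (assumed for illustration only).
ENTITY_PATTERNS = {
    "city": re.compile(r"\b(?:to|in|from)\s+([A-Z][a-z]+)"),
    "time": re.compile(r"\b(\d{1,2}(?::\d{2})?\s?(?:am|pm))\b", re.IGNORECASE),
}


def extract_entities(utterance):
    """Return the first match for each entity type found in the utterance."""
    found = {}
    for label, pattern in ENTITY_PATTERNS.items():
        match = pattern.search(utterance)
        if match:
            found[label] = match.group(1)
    return found


# Define an intent, attach collected utterances, then annotate entities.
book_flight = Intent("book_flight")
book_flight.utterances.append("Book me a flight to Tokyo at 9am")
book_flight.utterances.append("I need to fly from Osaka tomorrow")

annotated = [(u, extract_entities(u)) for u in book_flight.utterances]
for utterance, entities in annotated:
    print(book_flight.name, "|", utterance, "|", entities)
```

Keeping the intent definition, the utterance list, and the entity annotations as separate pieces makes it easier to re-run annotation when the entity set changes.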

Challenges in designing dialogue:

  • No way to collect human intent
  • Hard to model the flow of real-world conversation
  • Data collection can be noisy and costly

Watch a live demo of an ERATO social robot and its ability to hold very natural conversations.

ERICA, the ERATO Intelligent Conversational Android, was developed for conversational interaction. Since its creation, the team has added content over several months, and it now has more than 2000 behaviors and more than 50 topic sequences.

Watch the systems in action as they recognize speech and take actions. One key takeaway is that data collection and annotation work best with an in-situ approach and high-quality data. It is also important to use ML-assisted validators to reject noisy utterances from the outset.
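To make the idea of rejecting noisy utterances at collection time more concrete, here is a small Python sketch of a validator. It is an assumption-laden illustration, not the validator described in the talk: `heuristic_quality_score` is a simple hand-written heuristic standing in for a trained model, and the `threshold` value and sample utterances are hypothetical. The `score_fn` parameter marks where a real ML model's scoring function would be plugged in.

```python
import re


def heuristic_quality_score(utterance):
    """Stand-in for a trained model: returns a quality score in [0, 1].

    Assumption: utterances with fewer than two word tokens, many
    non-letter characters, or heavy repetition are likely noise.
    """
    text = utterance.strip()
    if not text:
        return 0.0
    tokens = re.findall(r"[A-Za-z']+", text)
    if len(tokens) < 2:
        return 0.0
    # Share of characters that are letters or spaces.
    alpha_ratio = sum(ch.isalpha() or ch.isspace() for ch in text) / len(text)
    # Share of word tokens that are distinct (penalizes repetition).
    unique_ratio = len({t.lower() for t in tokens}) / len(tokens)
    return alpha_ratio * unique_ratio


def validate(utterances, score_fn=heuristic_quality_score, threshold=0.6):
    """Split utterances into accepted and rejected lists by score."""
    accepted, rejected = [], []
    for utterance in utterances:
        if score_fn(utterance) >= threshold:
            accepted.append(utterance)
        else:
            rejected.append(utterance)
    return accepted, rejected


samples = [
    "Where can I find a digital camera?",
    "asdf",
    "hello hello hello hello",
    "$$$ 12345 !!!",
]
accepted, rejected = validate(samples)
print("accepted:", accepted)
print("rejected:", rejected)
```

Running the validator while data is being collected, rather than afterwards, lets a contributor be asked to rephrase immediately, so noisy utterances never reach the annotation stage.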

All of this depends on good-quality training data!
