How to Get High-Quality Training Data for Machine Learning

FREE 30-MIN WEBINAR | Wednesday, December 12 | 11am PT / 2pm ET

To build an effective product that relies on machine learning, you need a large volume of high-quality training data. For your product to understand and respond to people correctly, you need a strategy for collecting and annotating training data that optimizes for quality. Join us to learn about the data you need to build solutions such as natural language processing, chatbots, and sentiment analysis. A live Q&A will follow.

In this webinar you'll learn:

  • The pros and cons of public data vs. building your own data sets
  • How much time and energy to invest in data collection
  • Why curated crowds yield higher-quality data for machine learning

Can't make it? Go ahead and register, and we'll send a recording after the presentation.

About the presenter

James Lyle is Director of the Custom Linguistic Solutions team at Appen. After earning his Ph.D. in linguistics at the University of Washington, James joined Microsoft in 1999 and spent more than 14 years working on various natural language technologies, including proofing tools, information extraction, and text analytics. Since joining Appen in 2013, he has focused on providing tech industry clients with linguistic consultation and high-quality annotated data for machine-learned NLU solutions.