Artificial intelligence researchers have created a machine learning model that learned words from footage captured by a toddler wearing a headcam. The findings, published this week in Science, could shed new light on the ways children learn language and potentially inform researchers’ efforts to build future machine learning models that learn more like humans.
Previous research estimates that children begin acquiring their first words around 6 to 9 months of age. By their second birthday, the average kid possesses around 300 words in their vocabulary toolkit. But the actual mechanics underpinning exactly how children come to associate meaning with words remain unclear and a point of scientific debate. Researchers from New York University’s Center for Data Science tried to explore this gray area by creating an AI model that attempted to learn the same way a child does.
To train the model, the researchers relied on over 60 hours of video and audio recordings pulled from a lightweight head camera strapped to a child named Sam. The toddler wore the camera on and off starting when he was six months old and ending after his second birthday. Over those 19 months, the camera collected over 600,000 video frames connected to more than 37,500 transcribed utterances from nearby people. The background chatter and video frames pulled from the headcam provide a glimpse into the experience of a developing child as he eats, plays, and generally experiences the world around him.
Armed with Sam’s eyes and ears, the researchers then created a neural network model to try to make sense of what Sam was seeing and hearing. The model, which had one module analyzing single frames taken from the camera and another focused on transcribed speech directed towards Sam, was…
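The article does not spell out the model's internals, but pairing one module for video frames with another for co-occurring speech is the classic setup for contrastive learning, where the model is trained to embed a frame and the utterance heard alongside it close together in a shared space. The sketch below illustrates that general idea only; the dimensions, the random-projection "encoders," and the temperature value are all illustrative stand-ins, not details from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions -- the article does not specify the architecture.
IMG_DIM, TXT_DIM, EMBED_DIM = 512, 128, 64

# Stand-ins for the two modules: a vision encoder for single video frames
# and a language encoder for transcribed utterances. Here each is just a
# fixed random linear projection into a shared embedding space.
W_img = rng.normal(size=(IMG_DIM, EMBED_DIM))
W_txt = rng.normal(size=(TXT_DIM, EMBED_DIM))

def embed(x, W):
    """Project features into the shared space and L2-normalize each row."""
    z = x @ W
    return z / np.linalg.norm(z, axis=1, keepdims=True)

# A toy batch of 4 co-occurring (frame, utterance) feature pairs.
frames = rng.normal(size=(4, IMG_DIM))
utterances = rng.normal(size=(4, TXT_DIM))

z_img = embed(frames, W_img)
z_txt = embed(utterances, W_txt)

# Similarity matrix: entry (i, j) scores frame i against utterance j.
sims = z_img @ z_txt.T

# Contrastive (InfoNCE-style) loss: each frame should be most similar to
# the utterance it actually co-occurred with (the diagonal of `sims`).
logits = sims / 0.07  # temperature (illustrative value)
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
loss = -np.mean(np.diag(log_probs))
```

Training would then nudge the encoder weights to shrink this loss, so that words end up near the visual scenes they were spoken over, which is one plausible route to word–meaning associations from raw experience.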