Understanding the dynamics of teacher-student interactions is crucial for improving learning outcomes. With this goal in mind, Natalie Huerta, Doctoral Student in Special Education at Vanderbilt University, and Jess Boyle, Doctoral Student & Research Assistant at Vanderbilt Peabody College, are leading a collaborative effort to analyze classroom audio and its impact on teaching practices. DSI Postdoctoral Fellow Abbie Petulante and DSI graduate student Thao Nguyen are actively contributing to the project. Both write code for the project, with Thao taking the lead so far. As the project moves into the modeling stage, Abbie has taken on more of the coding and provides guidance on technical and data science questions, including model selection and how to organize and generate data for the project’s objectives.
“Historically, labeling teacher language would be a time intensive process where the language would be recorded, transcribed, and labeled by humans. This would either mean the process taking a very long time to get a labeled data set to use for analysis or making the decision to use a smaller but more feasible dataset… we wanted to use technology, specifically transformer models as a way of removing the transcription step of the process, increasing the efficiency of the labeling process… We see using audio transformers as useful for this current project but also something that would be applicable to many other projects that occur in classroom or other educational settings.” – Natalie Huerta
Using transformer models, these projects seek to listen to recordings of classroom audio, classify the types of language and phrases teachers use during classroom sessions (“teacher talk”), and give teachers timely, direct feedback about their lessons. That feedback is meant to help educators refine their teaching practices, while the classified recordings support research-based conclusions about the impact of “teacher speak” on student performance. The projects attempt to transform the way educators reflect on their teaching strategies, fostering continuous improvement in the classroom environment.
The recordings capture teachers’ speech, student interactions, and background noise. Because classroom audio is often lower in quality and noisier than typical audio recordings, the initial segmentation step posed a considerable challenge, and significant effort went into determining the most effective segmentation plan. The audio is segmented and processed using Whisper, an audio transformer model trained on human speech. A fine-tuned transformer model then classifies the audio samples into predefined categories. As these models are trained on additional classroom audio, their performance improves, to the benefit of all researchers.
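To make the segment-then-classify flow concrete, here is a minimal Python sketch. It is not the project's code. It assumes the segmentation step yields Whisper-style segments (dictionaries with `start` and `end` times in seconds) and that the audio is a 16 kHz mono waveform. The function names, the `min_seconds` threshold, and the placeholder classifier with its labels are all hypothetical stand-ins for the fine-tuned model and its predefined categories.

```python
from typing import Callable, Dict, List, Sequence

SAMPLE_RATE = 16_000  # Whisper works on 16 kHz audio


def slice_segments(
    waveform: Sequence[float],
    segments: List[Dict[str, float]],
    sample_rate: int = SAMPLE_RATE,
    min_seconds: float = 0.5,  # hypothetical threshold for dropping tiny fragments
) -> List[Dict]:
    """Cut the waveform into one clip per timestamped segment."""
    clips = []
    for seg in segments:
        start, end = float(seg["start"]), float(seg["end"])
        if end - start < min_seconds:
            continue  # too short to classify reliably (assumption)
        lo = max(0, int(round(start * sample_rate)))
        hi = min(len(waveform), int(round(end * sample_rate)))
        if hi > lo:
            clips.append({"start": start, "end": end, "samples": waveform[lo:hi]})
    return clips


def label_clips(clips: List[Dict], classify: Callable[[Sequence[float]], str]) -> List[Dict]:
    """Apply a classifier to each clip and keep the timestamps with the label."""
    return [
        {"start": c["start"], "end": c["end"], "label": classify(c["samples"])}
        for c in clips
    ]


def placeholder_classifier(samples: Sequence[float]) -> str:
    """Hypothetical stand-in for the fine-tuned model; labels are invented."""
    return "category_a" if len(samples) >= 2 * SAMPLE_RATE else "category_b"


if __name__ == "__main__":
    audio = [0.0] * (SAMPLE_RATE * 5)  # five seconds of silence as dummy input
    whisper_like_segments = [
        {"start": 0.0, "end": 2.0},
        {"start": 2.0, "end": 2.2},  # dropped: shorter than min_seconds
        {"start": 3.0, "end": 6.0},  # clipped to the end of the audio
    ]
    clips = slice_segments(audio, whisper_like_segments)
    for row in label_clips(clips, placeholder_classifier):
        print(row)
```

Passing the classifier in as a callable keeps the slicing logic independent of any particular model, so a model retrained on more classroom audio could be swapped in without touching the segmentation code.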
“The rapid development and spread of classroom sensing technologies (e.g., smartphones, smartwatches, robotic cameras), along with recent advancements in deep learning have resulted in a growing field of research looking at how these technologies can be applied to improve the scalability, affordability, and effectiveness of feedback and support practices for teachers.” – Jess Boyle
These projects have the potential to give teachers valuable real-time feedback on their lesson delivery and to offer a deeper understanding of the language patterns teachers use in various classroom settings. By bridging the gap between theory and practice, educators can make informed decisions to optimize student engagement and academic performance.