Comparison of accuracy of machine-generated or human-generated captions of Zoom live lectures in a comparative theriogenology course

  • Margaret Root Kustritz, Department of Veterinary Clinical Sciences, University of Minnesota College of Veterinary Medicine, St. Paul, MN, USA
  • Ryan Rupprecht, Office of Academic and Student Affairs, University of Minnesota College of Veterinary Medicine, St. Paul, MN, USA
  • Perle Zhitnitskiy, Department of Veterinary Population Medicine, University of Minnesota College of Veterinary Medicine, St. Paul, MN, USA
Keywords: Accommodations, accessibility, public speaking

Abstract

Captions are available with captured lectures for student review. In this study, automatically generated captions from Zoom, Kaltura, and YouTube were compared for accuracy with captions generated by a human. The effect of the speaker on captioning accuracy was also investigated: does speaking speed or accent alter the accuracy of captioning? YouTube was by far the most accurate of the automatic captioning systems. Zoom and Kaltura made numerous mistakes, some of which significantly altered meaning. Mistakes were attributable to the transcribing systems, not to characteristics of the presenters. Instructors should get in the habit of reviewing transcripts of their lectures to ensure students are not misled.

Published
2023-05-30
How to Cite
Root Kustritz M., Rupprecht R., & Zhitnitskiy P. (2023). Comparison of accuracy of machine-generated or human-generated captions of Zoom live lectures in a comparative theriogenology course. Clinical Theriogenology, 15, 52-56. https://doi.org/10.58292/ct.v15.9596