"Halibut Over Steel Like a Man" - The Trials and Tribulations of Using AI for the Transcribing and Captioning of Cincinnati Museum Center's Rare Moving Image Collection
By: Arabeth Balasko, Curator, Photographs, Prints and Media
Recently, the Cincinnati Museum Center partnered with Scene Savers in Covington, Kentucky, to digitize 53 rare moving image assets. These include master copies of WLW’s Midwestern Hayride; a pilot episode of “On the Money,” a newlywed-themed game show from the 1970s hosted by Bob Braun; an early episode of Ruth Lyons 50/50 Club; a 1951 recording of Melody Showcase, a short-lived variety show Rod Serling wrote for during his time with WLW; scenes from, and the aftermath of, the Avondale Riots of 1967; 1937 flood footage; 1930s Coney Island excursion recordings and much more.
Coney Island Ferris Wheel – 1930s, still taken from Moving Image Collection – MI-93-04A – CMC.
Close to half of these digitized moving images were originally recorded on 2-inch quad tape, an analog format that was regularly used to record television programming from the 1950s into the 1980s. These are some of the most at-risk items we currently have in the Moving Image Collection at Cincinnati Museum Center due to their age, their degradation and the obsolescence of the equipment needed to play (and digitize) these recordings. Many of these assets are one-of-a-kind master copies that exist only here at Cincinnati Museum Center. Due to their rarity and fragility, they were prioritized for digitization.
2-Inch Quad Tapes – 1950s-1970s, from Moving Image Collection – CMC.
Digitization is only half of the battle for user access and file accessibility, though! Another obstacle when working with sound recordings and moving image media is transcribing the soundtrack of these digitized files and providing captions to accompany it. This time-consuming but essential process has been made a bit less tedious with the assistance of AI (artificial intelligence) captioning software. However, when it comes to AI-generated captions for older media, media with sound degradation, media with multiple individuals speaking at the same time or over one another and media with singing (especially with a Southern accent!), the accuracy of computer-assisted transcription hovers at around 55% or less! That means it is up to us humans to accurately transcribe and capture the soundtrack of these files – word by word.
This process can take anywhere from a few minutes to several hours to complete, depending on the length of the clip, the quality of the recording and the typing and listening speed of the transcriber! Over the course of this transcribing and captioning process, the AI made many errors, and many of them were truly quite comical. It is fascinating to study and explore how AI tries to make sense of the nonsensical – meaning, a sentence that would not make sense to a human reader makes perfect sense to the AI software generating it.
One of the personal highlights of being the human working on this project was that I was able to see all of the AI transcription fails in real time. Below, you will see one of my favorite nonsensical sentences created by the AI transcription process, followed by the final clip of the digitized file with its accurate transcription.
AI Transcription Fail – Country Hayride MI-057 – CMC
Human Transcription – Country Hayride MI-057 – CMC
Cincinnati’s diverse history has been heavily documented over the 76 years since television took off in the Queen City in 1948, not only through variety and comedy shows, as seen above, but also through front-of-the-line news stories such as the one below. This snippet comes from a mostly silent moving image reel showing the destruction in the Avondale neighborhood in the aftermath of the 1967 Avondale Riots.
The sheer fact that we at Cincinnati Museum Center even have these items is a true feat, and being able to make them accessible to the world was a treasured experience for me as a curator. It is important to shed light on the invisible labor that goes on behind the scenes in order to make collections available. This labor of love is an essential part of creating accessibility for ALL in collections.
Each October, we in the collections and archives world get to share special collections from our permanent holdings with the public in what we call, and celebrate as, Archives Month. This October, to highlight Archives Month, Cincinnati Museum Center will be hosting its first-ever Archival Mini Film Festival on Saturday, October 26, 2024, from 10:30 a.m. to 4 p.m. in the Scripps Howard Historic Newsreel Theater.
This FREE event is open to the public and will feature a selection of moving image clips, including: early Cincinnati variety shows such as the Ruth Lyons 50/50 Club, the Melody Showcase, Midwestern Hayride, Country Hayride and Soul Street; local news, sports and political content from the 1930s to 1970s, including silent film recordings of the 1937 Cincinnati flood; holiday-themed specials featuring local TV personalities; a featured episode of Rod Serling’s locally produced TV show The Storm; a game show pilot from the 1970s hosted by Bob Braun; local commercials and a few other special surprises!
Many of these clips may not have been viewed since their original airings, and many are being shared publicly for the first time in decades! We invite you all to come step back into the past with Cincinnati Museum Center and help us “Save it Before it is Gone!”