Opened 9 years ago
Last modified 12 months ago
#1510 new enhancement
Add full transcripts to videos on WordPress.tv
| Reported by: | mor10 | Owned by: | |
|---|---|---|---|
| Milestone: | | Priority: | normal |
| Component: | WordPress.tv | Keywords: | |
| Cc: | | | |
Description
The videos on WordPress.tv provide a wealth of information, but this information is largely invisible on the web: to search engines, to the site's internal search, and to users who don't want to watch or listen to the videos. The lack of transcripts is also a major accessibility concern.
Adding full transcripts of videos would remedy this situation and provide significant benefits:
- Videos become accessible to a far greater audience
- Presented content becomes searchable from within wordpress.tv
- Presented content becomes indexable for search engines resulting in organic traffic and (probably) boosting the exposure of the videos to a larger audience
- Visitors can choose to read the transcript rather than watch the video (reading is often faster than watching)
- Translation (even automatic) of videos is vastly simplified, with the side effect of potential multilingual captions
- Videos with poor audio quality become easier to digest for visitors
- Transcripts provide an excellent opportunity for future enhancement of the site including transcript-based navigation similar to what's found on Lynda.com
Implementation
I propose a panel directly below the video player that shows the first few paragraphs and provides a "reveal more" link or similar to reveal the whole transcript. Ideally the DOM should contain the entire transcript to make the content accessible for search.
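As a rough illustration of the panel idea, here is a minimal Python sketch that renders a transcript so the first few paragraphs are visible and the remainder sits in a collapsed block that is still present in the markup. The function name `render_transcript_panel`, the `video-transcript` class, and the use of a `<details>` element are all assumptions made for this example; the ticket does not prescribe a particular mechanism.

```python
from html import escape


def render_transcript_panel(transcript: str, preview_paragraphs: int = 3) -> str:
    """Return HTML for a transcript panel (hypothetical helper).

    Every paragraph is written into the markup, so the whole transcript is
    in the DOM even while the remainder is visually collapsed.
    """
    # Treat blank lines as paragraph breaks and drop empty paragraphs.
    paragraphs = [p.strip() for p in transcript.split("\n\n") if p.strip()]
    as_html = [f"<p>{escape(p)}</p>" for p in paragraphs]

    preview = as_html[:preview_paragraphs]
    rest = as_html[preview_paragraphs:]

    parts = ['<section class="video-transcript">', *preview]
    if rest:
        # Assumption: a native <details> element stands in for "reveal more".
        parts.append("<details>")
        parts.append("<summary>Reveal more</summary>")
        parts.extend(rest)
        parts.append("</details>")
    parts.append("</section>")
    return "\n".join(parts)
```

A short transcript that fits within the preview produces no collapsed block at all, so the "reveal more" control only appears when there is something to reveal.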
Practical Implementation
The obvious question is "where are these transcripts going to come from"? Three options:
- Edited text dumps from CART captioning during events
- Edited dumps from volunteer video captions
- Volunteer text transcripts
Change History (13)
This ticket was mentioned in Slack in #wptv by mor10. View the logs.
9 years ago
This ticket was mentioned in Slack in #accessibility by mor10. View the logs.
9 years ago
#4 follow-up: ↓ 5 @ 9 years ago
Question: how is this different than the subtitles file? For those that have such, of course.
#5 in reply to: ↑ 4; follow-up: ↓ 8 @ 9 years ago
Replying to Otto42:
> Question: how is this different than the subtitles file? For those that have such, of course.
The proposal here is to make the full transcript part of the page rather than a linked element that requires parsing by the user. When you visit the page for a video, you should see the video player directly followed by the full transcript of the video as part of the post. In feeds you should see the transcript, and it should also be exposed in the API (if available).
The current subtitles file is provided as .ttml, which browsers do not read natively, so it requires parsing. It also contains a large volume of timecode data that must be stripped out before it can serve as a transcript.
#6 follow-up: ↓ 7 @ 9 years ago
I think this is a good idea, especially until the accessibility of the video player is fixed. Is the code for WPTV open source at this point? At the moment, in order to stream a video, you have to tab to one of the linked copies (low, high, etc.), copy the link, open a media player locally, insert the URL, and play it that way, or else download the video to your local machine. This has resulted in quite a large archive of WPTV videos on my local machine, which is neat, but transcripts would definitely be useful sometimes. I do, however, realize that in the worst-case scenario of manual transcription, this would involve a ton of effort.
#7 in reply to: ↑ 6 @ 9 years ago
Replying to arush:
> I do, however, realize that in the worst-case scenario of manual transcription, this would involve a ton of effort.
As proposed, this would be optional, in the same way that the captions are optional. Hopefully over time we'll have more CART at events, and the added exposure of transcripts will encourage speakers to caption and provide transcripts for their own videos.
#8 in reply to: ↑ 5; follow-up: ↓ 9 @ 9 years ago
Replying to mor10:
> The proposal here is to make the full transcript part of the page rather than a linked element that requires parsing by the user.
No, I mean, if we have subtitles, then is that a suitable source for said transcript to be produced from?
Parsing ttml is dead easy, and a script can convert it to some other format trivially.
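For illustration, a conversion along these lines could look like the following Python sketch. It assumes the subtitle cues are `<p>` elements whose timecodes live in attributes such as `begin` and `end`, which is how TTML is commonly structured; the function names are hypothetical and this is not the script WordPress.tv uses.

```python
import xml.etree.ElementTree as ET


def _local(tag: str) -> str:
    """Strip an XML namespace prefix such as '{http://www.w3.org/ns/ttml}p'."""
    return tag.rsplit("}", 1)[-1]


def _collect(node: ET.Element, pieces: list) -> None:
    """Gather the text of a cue in document order, treating <br/> as a space."""
    if _local(node.tag) == "br":
        pieces.append(" ")
    if node.text:
        pieces.append(node.text)
    for child in node:
        _collect(child, pieces)
        if child.tail:
            pieces.append(child.tail)


def ttml_to_transcript(ttml_text: str) -> str:
    """Return one line of plain text per non-empty TTML cue (hypothetical helper).

    Timecodes are attributes on the cue elements, so reading only the
    element text drops them.
    """
    root = ET.fromstring(ttml_text)
    lines = []
    for elem in root.iter():
        if _local(elem.tag) != "p":
            continue
        pieces = []
        _collect(elem, pieces)
        text = " ".join("".join(pieces).split())  # normalise whitespace
        if text:
            lines.append(text)
    return "\n".join(lines)
```

The output is one line per cue; merging cues into readable paragraphs would still need an editing pass or some heuristic, which this sketch does not attempt.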
#9 in reply to: ↑ 8 @ 9 years ago
Replying to Otto42:
> Replying to mor10:
> No, I mean, if we have subtitles, then is that a suitable source for said transcript to be produced from?
> Parsing ttml is dead easy, and a script can convert it to some other format trivially.
Yes. The answer is yes. Subtitles and transcripts are essentially the same, just formatted differently, so auto-generation from subtitles to transcript would simplify the process significantly.
That said, there should be an option for submitting transcripts as well. When events have CART captioning, a transcript will always be available in txt format.
This ticket was mentioned in Slack in #wptv by mor10. View the logs.
9 years ago
This ticket was mentioned in Slack in #wptv by casiepa. View the logs.
5 years ago
#13 @ 5 years ago
@mor10 There is a 4th option now.
> The obvious question is "where are these transcripts going to come from"? Three options:
> 1. Edited text dumps from CART captioning during events
> 2. Edited dumps from volunteer video captions
> 3. Volunteer text transcripts
We (WPTV moderators) started testing YouTube and AWS Transcribe, so a 4th option is needed:
- Automated transcription after the event.
Where categories 2 and 3 could be seen as 'almost ready to publish', categories 1 and 4 would need moderator review, or at least a clear indication that they have not been reviewed. Whether those become visible immediately is still to be decided.
As there should be only one transcript per video (in the video's original language), it could simply be a txt file uploaded to the media library and attached to the post that holds the video. This would allow moderators to attach or detach it when there is a newer (e.g. reviewed) version.
Adding the transcript next to the video as a show/hide collapsible field, while keeping it fully available to search engines, should indeed help people reach the correct videos.
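To make the review rule concrete, here is a small Python sketch of how the four sources could map to a review indication. The class, enum, and label names are invented for this example and do not reflect any existing WordPress.tv code.

```python
from dataclasses import dataclass
from enum import Enum


class TranscriptSource(Enum):
    """The four transcript sources discussed in this ticket."""
    CART_DUMP = 1             # text dump from CART captioning
    VOLUNTEER_CAPTIONS = 2    # dump from volunteer video captions
    VOLUNTEER_TRANSCRIPT = 3  # volunteer text transcript
    AUTOMATED = 4             # automated transcription after the event


# Assumption taken from the comment above: categories 1 and 4 need review.
NEEDS_REVIEW = {TranscriptSource.CART_DUMP, TranscriptSource.AUTOMATED}


@dataclass
class Transcript:
    source: TranscriptSource
    text: str
    reviewed: bool = False

    def label(self) -> str:
        """Return the heading to show above the transcript (hypothetical wording)."""
        if self.source in NEEDS_REVIEW and not self.reviewed:
            return "Unreviewed transcript"
        return "Transcript"
```

Whether an unreviewed transcript is shown with a label or hidden until a moderator approves it is left open here, as it is in the discussion.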
/cc @dd32
Somewhat related to #1455