Converting Myra West's vlog to blog
Original post is here eklausmeier.goip.de/blog/2024/12-08-converting-myra-west-s-vlog-to-blog.
Myra West has a YouTube channel, where she mostly talks about friendship, relationships, and her personal journey as a young adult. Her video "21 Years Old: I Have NO Friends" went viral and has more than six million views.
Myra West has no blog. Below are the steps to convert her YouTube talks to a blog.
1. Getting transcript
Sometimes YouTube offers transcripts, but sometimes they are not available. To get the raw text I used YouTube Transcript.
For some YouTube videos, YouTube Transcript is not able to generate any meaningful text. For example, it failed for "If You Struggle to Make Friends". In that case I recorded the video with my smartphone and had Google voice recorder produce the text. Obviously, that's quite an effort. And even then, Google voice recorder did not understand some parts of that video.
I used copy-paste to copy the text to a text editor, where I added frontmatter:
---
date: "2019-08-12 12:00:00"
title: "21 Years Old: I Have NO Friends"
youtube: "QfbCMjNj9q8"
---
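If many transcripts are converted, the frontmatter can also be prepended by a small script. The following is only a sketch: the function name `make_post` is hypothetical, and it assumes the three fields shown above are all that is needed.

```python
import json


def make_post(date: str, title: str, youtube_id: str, body: str) -> str:
    """Return the post text: frontmatter block followed by the transcript body.

    Hypothetical helper; the field names follow the frontmatter shown above.
    """
    def quoted(value: str) -> str:
        # json.dumps yields a double-quoted, escaped string. Such strings are
        # also valid YAML double-quoted scalars, so titles containing colons
        # or quotes stay intact.
        return json.dumps(value, ensure_ascii=False)

    lines = [
        "---",
        f"date: {quoted(date)}",
        f"title: {quoted(title)}",
        f"youtube: {quoted(youtube_id)}",
        "---",
        "",
        body.strip(),
        "",
    ]
    return "\n".join(lines)
```

Quoting every value matters here because the title itself contains a colon, which unquoted would confuse a YAML parser.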
2. Getting punctuation
Unfortunately, neither the transcript from YouTube nor the one from YouTube Transcript contains punctuation. To add it, I asked Google Gemini. I used the prompt below:
Get punctuation from below text:
<Copy-paste text>
I tried ChatGPT, Claude, Microsoft Copilot, and Gemini: each of them accepts only a few paragraphs at a time for punctuation. Microsoft Copilot says it allows only 8,000 characters. These limitations are quite a nuisance.
Therefore you have to repeat the above step a couple of times. But after that you at least have a transcript with punctuation.
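Cutting the raw transcript into pieces that fit such a limit can be scripted. Below is a minimal sketch, not what I actually used: the function name `split_into_chunks` is made up, and the default of 8,000 characters is simply the Copilot limit mentioned above. Since the prompt line itself also counts, a somewhat smaller value is probably safer.

```python
def split_into_chunks(text: str, max_chars: int = 8000) -> list[str]:
    """Split text at whitespace into chunks of at most max_chars characters.

    A single word longer than max_chars is kept whole and therefore
    ends up in a chunk of its own that exceeds the limit.
    """
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for word in text.split():
        # One extra character for the joining space, unless the chunk is empty.
        extra = len(word) + (1 if current else 0)
        if current and current_len + extra > max_chars:
            chunks.append(" ".join(current))
            current = [word]
            current_len = len(word)
        else:
            current.append(word)
            current_len += extra
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk is then pasted below the prompt, and the punctuated answers are concatenated in the same order.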
3. Proofreading
The transcript needs proofreading. In particular, I meticulously checked that Google Gemini did not drop any word!
ChatGPT is very prone to hallucinating and adding content of its own.
For this check I used:
bcompare <(perl -npe 's/\s+/\n/g' ~/php/saaze-myrawest/raw/$1) <(perl -npe 's/\s+/\n/g' ~/php/saaze-myrawest/content/blog/$1) &
That is, I compared the raw transcript with the punctuated transcript, one word per line.
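The same check can be approximated without a graphical diff tool. The sketch below is an alternative, not the command above: it strips punctuation and case from both texts and lists only the places where the word sequences differ. The names `words` and `word_changes` are invented for this illustration.

```python
import difflib
import re


def words(text: str) -> list[str]:
    """Lowercase the text and keep only word-like tokens (letters, digits, apostrophes)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_changes(raw: str, punctuated: str) -> list[tuple[str, list[str], list[str]]]:
    """Return (tag, raw_words, punctuated_words) for every non-matching stretch.

    tag is 'delete' (words missing in the punctuated text), 'insert'
    (words added there) or 'replace' (words changed).
    """
    a, b = words(raw), words(punctuated)
    # autojunk=False: do not let difflib treat frequent words as junk.
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

An empty result means that, apart from punctuation and capitalization, no word was dropped or added.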
During proofreading I dropped
- duplicate words,
- unfinished parts of sentences,
- filler words, like "and", "um", "like", etc.
I also added occasional headings. In some posts I even added Markmap mindmaps. In one post I added a gallery.
I added links, i.e., I made use of hypertext. In videos there seems to be no equivalent, i.e., there is no hypervideo, except when you use JavaScript trickery.
4. Result
As can be seen from the above steps, the process is quite manual. As of now, it doesn't seem to be amenable to automation.
As of today, 08-Dec-2024, Myra West has produced 82 videos. I did not convert all of them. I converted those that contain a lot of text. The converted vlog is here: Myra West's blog.