Generate Podcast episode for Google Cloud Technology Nuggets

Romin Irani
5 min read · Sep 17, 2024

I write a biweekly newsletter titled Google Cloud Platform Technology Nuggets that highlights the news and announcements from the Google Cloud Platform blog.

A while back, I wrote a post on how I compose the nuggets from the 30–40 announcement blog posts that get published in a 15-day window. That process remains unchanged, and no AI is involved in generating the Nuggets edition that gets delivered to you.

Text 2 Image Prompt: generate a realistic image of a Robot converting a text based article to a Podcast.

I have been looking into ways to distill the Nuggets into specific areas that folks might be interested in and/or into different mediums, e.g. a Podcast version of the edition for folks who would rather listen than read.

I am most interested in a process that gives me control over the output rather than one that just generates the Podcast, but this initial experiment is simply to understand what is available out there. It would be a shame if the AI generation inserted a style of conversation, or used a joke or two, that does not go down well with the listeners.

Enough of that. In the last week or so, there have been multiple reports of folks trying out two tools:

  1. Illuminate: Turns your content into engaging AI-generated audio content.
  2. NotebookLM: This is billed as a note-taking research assistant for your content, but it has introduced a “Generate an Audio summary” feature that audiences have been raving about.

I had used NotebookLM before and thought of giving it a try. Let me describe the steps I followed to have NotebookLM generate a Podcast from the latest episode of Google Cloud Platform Technology Nuggets, September 1–15, 2024 Edition.

Step 1 : Generate a PDF of the article

This step is not necessary since NotebookLM also accepts a web URL as a data source, but I preferred to go down this route. I simply went to the web URL https://gcptechnuggets.substack.com/p/google-cloud-platform-technology-59c and then did a Print/Save As… PDF for the episode.

Step 2 : Add the PDF to NotebookLM

Log in to NotebookLM and you are met with a screen as shown below:

You can have a maximum of 50 sources. In our case, we don’t have any at the moment. I clicked on the Upload sources button, which brought up a nice dialog to upload the required data sources or point to them.

Since I had already generated the PDF, I went with the choose file option and uploaded it. This brings up some neat summaries and possible questions to ask, along with a Chat interface to start conversing with your data sources. It’s pretty powerful and you should try this with your sample documents.

Step 3 : Generate the Audio Conversation

What I am more interested in is the Audio overview that you see in the screen above, part of which is reflected below:

I looked for any settings or things to tune, but there is not much available at the moment other than the Generate button. I am sure more options will arrive in due course, but for now, you can’t do much other than hope that it will do its magic.

Click on Generate and it’s quite blunt in telling you that it’s going to take some time and you don’t need to stick around. A good chance for you to check out some Q & A with your data sources.

Once it’s generated, you have the audio file and can listen to it or download it.

I did this experiment twice to see what it generated. So if you’d like to hear what the Podcast version of the latest episode of Google Cloud Platform Technology Nuggets, September 1–15, 2024 Edition sounds like, here are two versions:

  1. 12+ mins version: Download
  2. 18+ mins version: Download

Interesting observations

  • The conversation sounds natural. There is a bit of banter going on, with some humour. I liked it, but at times I felt that it was a bit distracting and that it should have got to the point faster.
  • I ran it just twice and the difference between the two runs is 6 mins, which makes the longer version roughly 50% longer than the shorter one for the same source. I wonder what went on behind the scenes there.
  • I was very impressed by its attempt to break down the topic into something simple to understand for the user. I am hoping this translates into a few settings in the future, where we can control the pace, the expectation of the audience, what knowledge to assume on behalf of the audience and more.
  • What really surprised me was that at the end, it even posed a question to the audience based on Cloud Technology and AI, with some options, and it was pretty spot on. It invited folks to participate too and made up a hashtag.

Hats off to the folks behind the tool and the model behind the scenes.

Suggested Do It Yourself Approach

By coincidence, we have a fantastic article by Sascha Heyer titled Building a Dynamic Podcast Generator Inspired by Google’s NotebookLM and Illuminate. This article is a detailed writeup of how you can combine multiple services on Google Cloud (Vertex AI Gemini models, Text to Speech) and external services (ElevenLabs) for voice generation. Check it out and push the limits.
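
To give a feel for the general shape of such a do-it-yourself pipeline, here is a minimal Python sketch. It is not the code from Sascha’s article and it does not call any real service. It assumes a model has already produced a two-speaker script in a simple "Speaker: text" format, and it shows the step in between: splitting that script into turns and assigning a voice to each speaker. The voice names, the `Turn` class and the `synthesize` placeholder are all hypothetical; in a real setup, `synthesize` is where a Text to Speech call would go.

```python
from dataclasses import dataclass

# Hypothetical voice identifiers; a real TTS service defines its own names.
VOICES = {"Host": "voice-a", "Guest": "voice-b"}


@dataclass
class Turn:
    speaker: str
    text: str
    voice: str


def parse_script(script: str) -> list[Turn]:
    """Split a 'Speaker: text' script into turns, one per labelled line."""
    turns: list[Turn] = []
    for line in script.splitlines():
        line = line.strip()
        if not line:
            continue
        speaker, sep, text = line.partition(":")
        speaker = speaker.strip()
        if not sep or speaker not in VOICES:
            # No known speaker label: treat the line as a continuation
            # of the previous turn (or drop it if there is none yet).
            if turns:
                turns[-1].text += " " + line
            continue
        turns.append(Turn(speaker, text.strip(), VOICES[speaker]))
    return turns


def synthesize(turn: Turn) -> bytes:
    """Placeholder for a text-to-speech call; returns fake 'audio' bytes."""
    return f"[{turn.voice}] {turn.text}".encode("utf-8")


def build_episode(script: str) -> bytes:
    """Synthesize every turn in order and join the results."""
    return b"\n".join(synthesize(turn) for turn in parse_script(script))
```

Keeping the script as plain text between the generation step and the voice step is what would give you the control missing from the one-click approach: you can read, edit or reject the conversation before any audio is produced.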

Conclusion

I am definitely keeping these tools handy to see how best to use them to provide quality summaries in other mediums (voice, video). I will put a premium on validating the content and ensuring it meets a certain level of expectations, but what those criteria are is still an open question.
