The Limits of Just Listening
The day an analog person's YouTube learning hit a wall
On the subway home from work, I plug in my earphones and open YouTube. Today I watch three videos about authentication systems. By the time I get home, all I remember is "something something tokens." Three videos, half a sentence retained.
I am an analog person. What I hear flows away. What I see doesn't stick. I watch a video and feel like I've understood something, but the next day that understanding exists nowhere. I've tried taking notes. But when I focus on listening, my hands stop. When I focus on typing, I lose the content. I'm not the multitasking type.
I can tolerate inconvenience. I cannot tolerate wasting time. For me, these two are clearly distinct. A cramped subway is fine. But investing an hour watching a video, only for that hour to leave no trace—that's not inconvenience. That's waste.
"I definitely saw this somewhere."
I must have said this dozens of times this year. During conversations, or when hitting a new problem, I clearly remember watching a video about it. I vaguely recall which channel. But the content won't surface. So I search again, watch again, forget again.
I tried organizing in Notion. Typing key points while watching. Lasted about a week. Conclusion: this method doesn't suit me. Taking notes on one video took twice the video's length. An hour for a 30-minute video. In that time I could watch two more. And honestly, the quality of those hour-long notes wasn't great either. I'm not a natural summarizer.
I tried summary apps too. Tools that take a YouTube URL and organize the key points. They were useful. Definitely helped grasp the content of a single video.
But that wasn't what I wanted.
Tools that answer "what's this video about?" existed. Tools that answer "what have I learned about this topic over the past two weeks?" existed nowhere. Summarizing a single video is possible. But connecting and accumulating knowledge across twenty videos—no tool did that.
So the problem was this: I watch plenty of videos, mostly for learning. New technologies, new business models, new tools. Each video is valuable. But they don't connect in my head. I can't notice that what Channel A discussed and what Channel B discussed are actually different perspectives on the same topic. To notice that, the memories would need to persist. They don't.
Realizing this was a tool problem, not a willpower problem, took quite a while.
More precisely, I'd thought "I wish a tool like this existed" many times. But I'd never gotten to "then I'll build it." Building meant coding, and I couldn't code.
I'd heard the word HTML. CSS too. I knew the concepts of frontend and backend existed. I once bought a template and made a website. But that was changing the wallpaper of a house someone else built—not building a house.
Then at some point—I can't pinpoint exactly when—I started hearing that AI could write code. That you describe what you want in plain language and AI generates the program. I first heard the term "vibe coding" around then. You just set the "vibe" and AI writes the actual code.
At first I thought it was hype. An app from a few lines of prompts? That's marketing copy, not reality. But as I started seeing real examples, I arrived at: "It's exaggerated, but not entirely false."
So I thought: could I build a tool that solves this problem I face every day—the problem of YouTube knowledge scattering?
The day that thought occurred, I didn't yet know what a terminal was.
🔧 Technical Terms in This Episode
SaaS (Software as a Service): Software that a provider hosts and you use over the internet, typically through a web browser, without installing anything. Netflix, Notion, and Slack are all SaaS. What I'm building takes the same form: a web service anyone can access by typing in a URL.
Vibe Coding: Building software by giving AI instructions in natural language instead of writing the code yourself. You describe "build a feature like this" and AI generates the code. The term was coined by Andrej Karpathy in early 2025.