Silent Notetaker: no backend, no account, no upload

See a live demo of a browser-only meeting notetaker that transcribes, categorizes notes, and suggests questions using on-device AI, all without uploading audio.

Overview

Silent Notetaker is a meeting notetaker that runs entirely in the browser: it transcribes the conversation live and pulls out decisions, action items, and open questions as they happen, with no audio ever leaving the machine. The whole app is a single HTML file, with no backend and nothing to sign into.

I’ll demo it live with a real mic: transcription and speaker labels appearing in real time, notes self-categorizing, and an on-device LLM suggesting the next question to ask. Then I’ll review the architecture (how the speech model, the speaker model, and the question model share one machine without fighting each other) and show the single-file source so people can try it out, modify it, and make it their own.
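As a rough illustration of how the three models might share one machine, here is a minimal sketch of a per-phase device plan, based on what the talk says: the speech model needs WebGPU, and the small notes model runs in WASM during the meeting and moves to WebGPU for the final recap. The function name, the key names, and the `"live"`/`"recap"` phase names are hypothetical, and the device for the speaker model is a guess because the talk does not state it. The option objects mirror the shape of the `device` option that Transformers.js accepts when loading a model, but nothing here calls the library.

```javascript
// Illustrative only: roles and devices are paraphrased from the talk.
// planDevices, the key names, and the phase names are hypothetical.
function planDevices(phase) {
  if (phase !== "live" && phase !== "recap") {
    throw new Error(`unknown phase: ${phase}`);
  }
  const live = phase === "live";
  return {
    // The 4B real-time speech model is described as needing WebGPU.
    // It is idle once the meeting ends.
    transcriber: live ? { device: "webgpu" } : null,
    // The talk does not say where the speaker-embedding model runs; "wasm" is a guess.
    speakerEmbedder: live ? { device: "wasm" } : null,
    // The small notes/questions model stays in WASM while the speech model
    // owns the GPU, then moves to WebGPU for the end-of-meeting recap.
    notesModel: { device: live ? "wasm" : "webgpu" },
  };
}

console.log(planDevices("live").notesModel.device);  // "wasm"
console.log(planDevices("recap").notesModel.device); // "webgpu"
```

The point of the sketch is that only one heavy model claims the GPU at a time, which is one plausible reading of "without fighting each other".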

Video

Transcript

Generated about 2 months ago

Speaker 0: Half of this will be me struggling to get that mic on. Alright. So there are, like, two different types of programming that I do: one is total slop, and the other one is decent. This is not the latter. So the main problem that I had is, like, one irritation that I just fixated on.

Speaker 0: I was in a really large, it was, like, an AI webinar, and it has a chat on it. It's pretty active, but it gets spammed by all of these, like, Otter.ai notetakers. And this is, you know, "This is my notetaker," "so-and-so is using this," and nobody cares. But it really messes up the chat.

Speaker 0: So, anyhow, that bugged me enough to make a tool out of this. Basically, this is going to be all in your browser. Oh, please work. Awesome. Yeah.

Speaker 0: So this is all in the browser, and it happens in real time. So as you're talking, you know, if you ever wanna talk and capture the speech that you're doing, not this one, because it's mobile. But if you wanna capture a meeting, you can also switch it around so it's not just what you're saying. And that's kind of the fun part, which we can do here. So if you get a meeting and you, like, play it in the browser, if this works.

Speaker 1: This is Dalton plus Michael. Today, we're gonna talk about how do startups get big deals? Six-figure deals, seven-figure deals, eight-figure deals. How do they get big deals? And the kind of subtitle here is, are you selling tools or are you selling outcomes?

Speaker 1: I think a very common first office hour topic for companies that get accepted into YC is some version of them saying, "Dalton, how do I get big deals?" It's, like, they haven't even Googled it. Right? This is just, like... and I think the context is that most people understand self-service because they signed up, like,

Speaker 0: The other thing is that it does two-times speed as well.
Speaker 1: Sign up for that, put in our credit card, and

Speaker 0: Which is nice, because nobody actually wants to listen to this at one-times speed. Agreed. And I do actually have a newer version, it's not what I've loaded, but it has the speakers popping up as it goes through. So each speaker will show. I tried to load that on the fly here, and I think it just took forever.

Speaker 0: The downside of this is that it has a lot going on in the browser. So the model that I'm using is the new Voxtral Realtime. It's a 4-billion-parameter model. And to get that to run, we had to wait till the most recent Transformers.js v4 that came out. And now it can actually run a decent AI model.

Speaker 0: The downside is you also need to use WebGPU for this. It won't run in regular WASM. That is just way too slow. But what's kind of fun is, if you have WebGPU for this and you keep it as, like, a dedicated thing, you can add a couple other models. And one of the models was a very small embedding model.

Speaker 0: I think it's about 20 or 40 megs, and that does the speakers. It's not perfect, but it's pretty accurate, especially for things like this. If you have a meeting of, you know, 20 people, it's probably gonna have an issue. But one of the fun things that it does is, if there is a word that's wrong, like, actually, that one, "Voxtral," you can correct it. Just click on it, and then it'll correct it throughout.

Speaker 0: The same thing will happen with speakers. Just making sure I don't have it turned off, because that would be embarrassing. Yeah, I think I'd have to reload it. But, yeah, the speakers will show up, and then you just click on them and type in the name, and then it'll try to identify them as it goes through.

Speaker 0: If you really want speaker identification to be perfect, there are other models out there that are much larger, and you can try to get those to run.
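The speaker labelling described above (a small embedding model, labels you can click and rename) can be sketched as follows. This is a minimal sketch under stated assumptions, not the app's actual code: it assumes each audio segment has already been turned into an embedding vector, and it matches segments to speakers by cosine similarity against a running centroid. The `SpeakerRegistry` class, its method names, and the `0.75` threshold are all hypothetical.

```javascript
// Hypothetical sketch: label segments by comparing their embedding
// with a running centroid kept for each known speaker.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Fall back to a divisor of 1 so a zero vector scores 0 instead of NaN.
  return dot / (Math.sqrt(normA) * Math.sqrt(normB) || 1);
}

class SpeakerRegistry {
  constructor(threshold = 0.75) { // assumed cut-off, not a value from the talk
    this.threshold = threshold;
    this.speakers = []; // each entry: { id, name, centroid, count }
  }

  // Returns the id of the closest known speaker, or registers a new one
  // when no centroid is similar enough.
  assign(embedding) {
    let best = null;
    let bestScore = -Infinity;
    for (const speaker of this.speakers) {
      const score = cosineSimilarity(embedding, speaker.centroid);
      if (score > bestScore) {
        best = speaker;
        bestScore = score;
      }
    }
    if (best && bestScore >= this.threshold) {
      // Fold the new embedding into the speaker's running mean.
      best.count += 1;
      best.centroid = best.centroid.map((v, i) => v + (embedding[i] - v) / best.count);
      return best.id;
    }
    const id = this.speakers.length;
    this.speakers.push({ id, name: `Speaker ${id}`, centroid: [...embedding], count: 1 });
    return id;
  }

  // Segments store only the id, so a rename applies to past and future lines.
  rename(id, name) {
    this.speakers[id].name = name;
  }

  nameOf(id) {
    return this.speakers[id].name;
  }
}

const registry = new SpeakerRegistry();
const first = registry.assign([1, 0, 0]);
const second = registry.assign([0, 1, 0]);
const third = registry.assign([0.9, 0.1, 0]); // close to the first speaker
registry.rename(first, "Host");
console.log(first, second, third, registry.nameOf(third)); // 0 1 0 Host
```

Storing an id per segment rather than a name is what makes "click on them and type in the name" cheap: nothing in the transcript needs rewriting. A scheme this simple would plausibly degrade in a 20-person meeting, as the talk warns.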
Speaker 0: You'd probably have to try to get them to run in WASM instead of WebGPU, because WebGPU is struggling to keep the 4-billion-parameter real-time model up and running. A couple of issues that I ran into: this model is really accurate, so I really loved working with it. But if you just let it stream, your RAM will not stay at a set size. It's going to just balloon up and crash your browser.

Speaker 0: So you have to cap it at a certain amount of tokens or a certain amount of time. So I think my cap was maybe, like, 200 tokens or so in 45 seconds, and then it goes on a loop. And it just keeps looping through, and it keeps it at a capped, like, buffer. So it doesn't actually overload that. This is something I think I'm gonna refactor into Rust code, but right now it's all JavaScript.

Speaker 1: So you're saying it's something that's rolling off?

Speaker 0: Mhmm.

Speaker 1: And you're persisting it so that you have the transcript?

Speaker 0: Yeah. It's only persisted in here. So the actual real-time model, after these words are written out, it doesn't need to know about that at all. And then we also have a really small Qwen model that is running in WASM, and that is opening up the different questions and key notes and, like, tracking the notes of the meeting as it goes along. And I think it has... so, if this works.

Speaker 0: We also have, like, an "oh, crap" button. If you get called on and you aren't paying attention to the meeting, then it'll generate a smart question. So, and then, let's see if we stop this.

Speaker 1: Have you done a stress test where you have all these models running at the same time?

Speaker 2: They're currently running.

Speaker 0: And then at the end, I switch the Qwen 0.6-billion-parameter model. It was running in WASM.
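The cap-and-loop approach described in this part of the talk can be sketched as a small piece of bookkeeping. This is a minimal sketch, not the app's code: it assumes the cap triggers on whichever limit is hit first (about 200 tokens or 45 seconds, the figures quoted in the talk), and it only signals when the caller should reset the speech model's streaming state. How that reset is actually done is not covered here. The `CappedStream` class and its fields are hypothetical names.

```javascript
// Hypothetical sketch: keep the model's working window capped while the
// transcript itself is stored separately and never fed back to the model.
class CappedStream {
  constructor({ maxTokens = 200, maxMs = 45000 } = {}) {
    this.maxTokens = maxTokens; // figure quoted in the talk ("200 tokens or so")
    this.maxMs = maxMs;         // figure quoted in the talk (45 seconds)
    this.transcript = [];       // persisted text, grows for the whole meeting
    this.windowTokens = 0;      // tokens seen in the current window
    this.windowStart = null;    // timestamp of the first token in the window
    this.resets = 0;
  }

  // Record one decoded token. Returns true when the caller should reset
  // the model's streaming state and start a fresh window.
  push(token, nowMs) {
    if (this.windowStart === null) this.windowStart = nowMs;
    this.transcript.push(token);
    this.windowTokens += 1;

    const full = this.windowTokens >= this.maxTokens;
    const expired = nowMs - this.windowStart >= this.maxMs;
    if (full || expired) {
      this.windowTokens = 0;
      this.windowStart = null;
      this.resets += 1;
      return true;
    }
    return false;
  }

  text() {
    return this.transcript.join("");
  }
}

const stream = new CappedStream();
for (let i = 0; i < 450; i++) stream.push("tok ", i * 10);
console.log(stream.resets, stream.transcript.length); // 2 450
```

The window counters reset while `transcript` keeps growing, which matches the description that once words are written out, the real-time model "doesn't need to know about that at all".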
Speaker 0: I switch it to WebGPU for an overall, like, recap of the meeting, so it gets a better recap. It should be wrapping up in a bit.

Speaker 1: Are you able to do, like, a set context window so it doesn't keep growing memory requirements?

Speaker 0: For Qwen or for?

Speaker 3: For Voxtral.

Speaker 0: Yeah. Voxtral is a set 45 seconds, and it, like, cuts it off at that, or, I think, like, 200 tokens.

Speaker 1: Was it the Qwen model that was ballooning? Or

Speaker 0: It was the Voxtral model that was going off. In about five minutes, your tab dies and then your computer will die, and things are pretty nice. Yeah. And I actually couldn't find this. I don't do a lot of JavaScript. I don't like it.

Speaker 0: So when I rewrite this, I'll probably rewrite it in Rust to WASM, but I wasn't able to see the memory allocation on the heap inside JavaScript because it's in WebGPU, and it just doesn't show up. So that was kind of a gotcha moment for me that I had to realize.

Speaker 1: And then, have you tried to use this with the Gemma models at all, or are they too big? Is there

Speaker 0: Most of them are too big, but I think now, like, the 4-billion mark is right where you could get those really small Gemma models. I'm sure you could get them on there. You could switch that out for some of the recaps and the, you know, open questions part. Oh, yeah. Also, it lets you know when the questions are answered too in there.

Speaker 0: This is not great code, like I said, but if you hang out with the repo, it will be refactored, and feel free to, you know, manipulate it, do whatever you'd like with it. No one will really detect that you're recording these. This is just for you. All of the data stays locally in your browser, and it is hooked up to IndexedDB. If you've done any, like, WASM work, it's like a little database that's just in your browser.
Speaker 0: So you do have a history. And, you know, another thing I should do is have the models tag the title, but I just have the date as the title. So you can search through all of your past meetings and, you know, the questions. And lastly, I also built in a connector. If you have, like, Claude Code, you can hit connect, and then your, not the transcripts, but the questions get a lot smarter.

Speaker 0: And it can also run over your history. So basically it's, like, linking Claude Code to your IndexedDB inside your browser to read through all of your history and do whatever you need. So if your employer says no AI, you can still use this.

Speaker 1: Yeah. I was thinking you built a better Granola, because Granola will run in the background without being detected. But there are better features here than what Granola has. The only thing that's different is that there you can take your own notes. Do you have a place to type in your own notes?

Speaker 0: No. I don't pay attention to most of my meetings, so I don't have any notes.

Speaker 1: Could an add-on to the "oh, crap" button be identification of yourself, so it can automatically detect when you're called on?

Speaker 0: That's a good idea.

Speaker 4: Yeah. Yeah.

Speaker 0: Yeah. That is something I did wanna have: like, switch the mic. So it continues to transcribe from your meeting, but you could also, like, have voice notes in there. It depends on how active you are in the meeting. But, yeah, another idea was adding the meeting as, like, a window. But most of the time when you're on a meeting, if anyone's presenting anything, you need it as large as you possibly can, so I didn't waste my time on that.

Speaker 0: And then I am taking a version of this and kind of twisting it around for, like, a code architect.
Speaker 0: So you're able to go through and click on it, where it has, like, whatever project you're working on open up inside this notetaker. And then you're able to click on it and just talk about the aspects of, like, the front end that you like or don't like. And then based on the timestamps that it captures as you're talking about it and where your cursor is, it gives the coding agent a direct idea as to what you were talking about and what you were pointing at too. So you don't have to worry about, like, a one-to-one, like, DOM element that you're trying to change.

Speaker 0: Let me see if I have these things.

Speaker 1: You have a little [unclear] your eyes.

Speaker 0: Yeah. Well, you're not paying attention to your meeting, so your eyes wouldn't talk. Architect. Oh, yeah.

Speaker 1: Did you add the links to your presenter page?

Speaker 0: Mhmm.

Speaker 1: So if anyone wants to go find them, they should be on the AI Tinkerers presenter page.

Speaker 0: I do have a... oh, I'll jump back to that in a second. Yeah. So I also tried to throw it up on Cloudflare at, like, 4. So if you go to that, you can get it for free. There is the link to my repo if you wanna edit it, though I would wait a little bit till I refactor it, because it is a nasty bit of code.

Speaker 0: But my link is on the AI Tinkerers thing for my GitHub repo too. So, yeah. Any questions?

Speaker 1: I don't have much of a question, but I wanna call out the idea there, the copy, the notes to Markdown.

Speaker 0: Mhmm.

Speaker 1: It's really cool because a lot of times what I'll do is I'll have the transcript run, and I'll just keep copying, pasting the transcript in the meeting. And then at the end of the meeting, I'll be like, "Hey, here's my takeaways. We all agree on this?"
Speaker 1: And that's why getting everybody aligned is really critical at the end of the meeting, but otherwise it has to happen in emails back and forth. The idea of, like, copying the meeting notes before the meeting ends, posting them, reviewing them with everybody, getting alignment before the meeting ends. Oh, that's cool. That would be cool.

Speaker 1: I see.

Speaker 0: Yeah. I think there are two other things. Yeah. So if you did wanna copy it straight out, you can take the time. So you can switch, like, the time around from, like, how long ago they said it to how much time has elapsed or the actual clock time.

Speaker 0: Or if you wanna copy one thing, you're not gonna want that 7:04 in there. So you can just leave it out and copy it straight. But, yes, if you need the whole transcript, then it's a quick copy.

Speaker 1: Is that copy to Markdown, is that the meeting notes there? Is that what the

Speaker 0: Yeah. Copy to Markdown is the meeting notes, and then it tries to take screenshots of the most unique things every, like, 15 seconds. If it's the exact same image, it's not supposed to, but, you know, it's not great code, like I think I've said four times. But, yeah.

Speaker 2: Thanks.

Speaker 0: Alright. Cool.
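The timestamp options mentioned at the end of the talk (how long ago something was said, time elapsed in the meeting, the actual clock time, or no timestamp at all when copying) can be sketched as a small formatter plus a Markdown export. This is a minimal sketch under stated assumptions: the entry shape (`timeMs`, `speaker`, `text`), the mode names, and the function names are hypothetical, and the clock mode prints UTC so the output is deterministic, where a real app would more likely use local time.

```javascript
// Hypothetical sketch of the three timestamp modes plus a stamp-free copy.
function pad(n) {
  return String(n).padStart(2, "0");
}

// Formats a millisecond duration as MM:SS (minutes are not capped at 59).
function formatDuration(ms) {
  const totalSeconds = Math.max(0, Math.floor(ms / 1000));
  return `${pad(Math.floor(totalSeconds / 60))}:${pad(totalSeconds % 60)}`;
}

function formatStamp(entry, mode, { meetingStartMs = 0, nowMs = Date.now() } = {}) {
  switch (mode) {
    case "ago":     // how long ago it was said
      return `${formatDuration(nowMs - entry.timeMs)} ago`;
    case "elapsed": // time since the meeting started
      return formatDuration(entry.timeMs - meetingStartMs);
    case "clock":   // wall-clock time, HH:MM in UTC for this sketch
      return new Date(entry.timeMs).toISOString().slice(11, 16);
    case "none":    // copy the line without any timestamp
      return "";
    default:
      throw new Error(`unknown mode: ${mode}`);
  }
}

// Renders entries as a Markdown bullet list, one line per entry.
function toMarkdown(entries, mode, context) {
  return entries
    .map((entry) => {
      const stamp = formatStamp(entry, mode, context);
      const prefix = stamp ? `[${stamp}] ` : "";
      return `- ${prefix}**${entry.speaker}:** ${entry.text}`;
    })
    .join("\n");
}

const start = Date.UTC(2025, 0, 1, 19, 0, 0); // made-up meeting start
const notes = [{ timeMs: start + 245000, speaker: "Speaker 0", text: "Ship the refactor." }];
console.log(toMarkdown(notes, "elapsed", { meetingStartMs: start }));
// - [04:05] **Speaker 0:** Ship the refactor.
```

Keeping the raw `timeMs` on each entry and formatting only at display or copy time is what lets the view switch between modes without touching the stored transcript.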

Links

Tech stack