Silent Notetaker: no backend, no account, no upload | Columbus

June 01, 2026 · Columbus

See a live demo of a browser-only meeting notetaker that transcribes speech, categorizes notes, and suggests questions using on-device AI, all without uploading audio.

Tech stack
  • Transformers
    A deep learning architecture that reshaped sequence modeling (NLP, vision) by replacing recurrent units with parallelizable multi-head self-attention.
    The Transformer is a neural network architecture introduced in the 2017 paper "Attention Is All You Need." It removed the sequential processing bottleneck of earlier Recurrent Neural Networks (RNNs) by relying on attention instead of recurrence, so all positions in a sequence can be processed in parallel and training is substantially faster on modern hardware. This efficiency enabled large-scale pre-trained models such as BERT (encoder-only) and the generative GPT series (decoder-only). The architecture now underlies most modern Large Language Models (LLMs).
  • WebGPU
    WebGPU is the modern JavaScript API providing high-performance 3D graphics and general-purpose GPU compute (GPGPU) access for the web platform.
    WebGPU is the successor to WebGL: it exposes a low-level, explicit API for using the system’s Graphics Processing Unit (GPU) for both rendering and computation. The API design mirrors modern native graphics frameworks (Vulkan, Metal, Direct3D 12), which is intended to give more predictable performance across diverse hardware. It has first-class support for compute shaders, enabling parallel processing for tasks like machine learning and physics simulations directly in the browser. Shaders are written in the WebGPU Shading Language (WGSL), which the browser validates before translating it for the underlying native API.
  • onnxruntime-web
    ONNX Runtime Web is a high-performance JavaScript library that runs machine learning models directly in the browser and Node.js using WebAssembly, WebGL, and WebGPU.
    ONNX Runtime Web runs machine learning inference in client-side applications, which avoids server round trips and keeps user data on the device. It uses WebAssembly for CPU execution and WebGL or WebGPU for GPU acceleration, so models trained in frameworks such as PyTorch, TensorFlow, or Keras can run in the browser once they have been exported to the ONNX format. Integration is straightforward: install the npm package onnxruntime-web, import the module, and load an optimized .onnx or .ort model file to run computer vision, natural language processing, or generative AI tasks on the user's device.
  • WebAssembly
    WebAssembly (Wasm) is a compact binary instruction format: it provides a portable compilation target for languages like C/C++, Rust, and Go, enabling near-native speed execution in browsers and server-side environments.
    WebAssembly (Wasm) is a low-level instruction set for a stack-based virtual machine. It is a W3C standard and is designed as a compilation target rather than a language to write by hand. Developers compile performance-sensitive code, typically from C/C++ (via Emscripten) or Rust, into the compact `.wasm` binary format. The binary executes in a sandboxed environment and reaches near-native performance for computationally intensive tasks such as game engines, 3D applications (which render through WebGL), or complex data processing. Wasm runs alongside JavaScript in all major browsers (Chrome, Firefox, Safari, Edge) and is expanding into non-web use cases via WASI (WebAssembly System Interface) for serverless and containerized computing.
  • IndexedDB
    IndexedDB provides a transactional database within the browser, moving beyond the roughly 5 MB limit of localStorage to handle much larger datasets.
    IndexedDB stores JavaScript objects in object stores, each looked up by key, rather than rows in relational tables. How much it can hold depends on the browser's storage quota, which often allows gigabytes. The asynchronous API supports fast lookups through indexes and lets web applications work reliably offline. Developers use it to cache application state, manage large datasets for PWAs, and sync local changes back to a server once the connection is restored.
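
To make the self-attention idea in the Transformers entry concrete, here is a minimal sketch of single-head scaled dot-product attention in plain TypeScript. It is illustrative only: the function names are hypothetical, real models use several heads with learned projections, and nothing here is taken from the demo itself.

```typescript
type Matrix = number[][];

// Numerically stable softmax over one row of scores.
function softmax(row: number[]): number[] {
  const max = Math.max(...row);
  const exps = row.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Single-head attention: softmax(Q K^T / sqrt(d_k)) V.
// Assumes q, k, v are non-empty and k and v have the same number of rows.
function selfAttention(q: Matrix, k: Matrix, v: Matrix): Matrix {
  const scale = 1 / Math.sqrt(k[0].length);
  return q.map((qi) => {
    // Similarity of this query to every key.
    const scores = k.map(
      (kj) => scale * qi.reduce((s, x, d) => s + x * kj[d], 0)
    );
    const weights = softmax(scores);
    // Weighted average of the value rows.
    return v[0].map((_, d) =>
      weights.reduce((s, w, j) => s + w * v[j][d], 0)
    );
  });
}
```

Each output row depends only on its own query and the full key and value matrices, so the rows can be computed independently. That independence is what makes the architecture parallelizable where an RNN has to step through the sequence.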
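
For the WebGPU entry, the following sketch shows what a small compute shader and its dispatch might look like. It assumes a browser that exposes `navigator.gpu`; the shader, the `doubleOnGpu` name, and the workgroup size of 64 are hypothetical choices for illustration. The globals are accessed through `any` so the block does not depend on WebGPU type definitions.

```typescript
// WGSL compute shader: doubles every element of a storage buffer in place.
const DOUBLE_WGSL = `
@group(0) @binding(0) var<storage, read_write> data: array<f32>;

@compute @workgroup_size(64)
fn main(@builtin(global_invocation_id) id: vec3<u32>) {
  if (id.x < arrayLength(&data)) {
    data[id.x] = data[id.x] * 2.0;
  }
}`;

// Returns the doubled array, or null when WebGPU is unavailable.
async function doubleOnGpu(input: Float32Array): Promise<Float32Array | null> {
  if (input.length === 0) return new Float32Array(0);
  const env = globalThis as any;
  const gpu = env.navigator?.gpu;
  if (!gpu) return null;
  const adapter = await gpu.requestAdapter();
  if (!adapter) return null;
  const device = await adapter.requestDevice();

  const pipeline = device.createComputePipeline({
    layout: "auto",
    compute: {
      module: device.createShaderModule({ code: DOUBLE_WGSL }),
      entryPoint: "main",
    },
  });

  const usage = env.GPUBufferUsage;
  const storage = device.createBuffer({
    size: input.byteLength,
    usage: usage.STORAGE | usage.COPY_SRC | usage.COPY_DST,
  });
  const readback = device.createBuffer({
    size: input.byteLength,
    usage: usage.MAP_READ | usage.COPY_DST,
  });
  device.queue.writeBuffer(storage, 0, input);

  const bindGroup = device.createBindGroup({
    layout: pipeline.getBindGroupLayout(0),
    entries: [{ binding: 0, resource: { buffer: storage } }],
  });

  const encoder = device.createCommandEncoder();
  const pass = encoder.beginComputePass();
  pass.setPipeline(pipeline);
  pass.setBindGroup(0, bindGroup);
  pass.dispatchWorkgroups(Math.ceil(input.length / 64));
  pass.end();
  encoder.copyBufferToBuffer(storage, 0, readback, 0, input.byteLength);
  device.queue.submit([encoder.finish()]);

  // Copy the result out before unmapping the readback buffer.
  await readback.mapAsync(env.GPUMapMode.READ);
  const result = new Float32Array(readback.getMappedRange().slice(0));
  readback.unmap();
  return result;
}
```

Returning null instead of throwing lets a caller fall back to a CPU path on browsers without WebGPU.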
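
For the onnxruntime-web entry, a common pattern is to prefer the GPU backend and fall back to WebAssembly. The sketch below only builds the session options; the helper name and the `./model.onnx` path in the comments are hypothetical, and depending on the library version the WebGPU backend may need a different import path.

```typescript
// Execution provider names as accepted by onnxruntime-web session options.
type ExecutionProvider = "webgpu" | "wasm";

// Prefer WebGPU when the browser exposes it; always keep wasm as a fallback.
function pickExecutionProviders(webGpuAvailable: boolean): ExecutionProvider[] {
  return webGpuAvailable ? ["webgpu", "wasm"] : ["wasm"];
}

const webGpuAvailable = Boolean((globalThis as any).navigator?.gpu);
const sessionOptions = {
  executionProviders: pickExecutionProviders(webGpuAvailable),
};

// In an application with the npm package installed, the options would be
// passed along these lines (model path and input name are placeholders):
//
//   import * as ort from "onnxruntime-web";
//   const session = await ort.InferenceSession.create("./model.onnx", sessionOptions);
//   const input = new ort.Tensor("float32", new Float32Array([1, 2, 3]), [1, 3]);
//   const outputs = await session.run({ input });
```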
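
For the WebAssembly entry, the sketch below instantiates a tiny module from raw bytes and calls its exported function from TypeScript. The bytes are hand-assembled for illustration; in practice they would come from a compiler such as Emscripten or rustc. Synchronous compilation is used only because the module is a few dozen bytes.

```typescript
// A complete Wasm module exporting add(i32, i32) -> i32.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic: "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  // type section: one function type (i32, i32) -> i32
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,
  // function section: one function using type 0
  0x03, 0x02, 0x01, 0x00,
  // export section: export function 0 as "add"
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,
  // code section: local.get 0, local.get 1, i32.add, end
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,
]);

const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule, {});

// Exports are untyped at runtime, so the signature is asserted here.
const add = wasmInstance.exports.add as (a: number, b: number) => number;

console.log(add(2, 3));
```

For larger modules, the asynchronous `WebAssembly.instantiate` or `WebAssembly.instantiateStreaming` calls are the usual choice, since browsers restrict synchronous compilation on the main thread.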
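
For the IndexedDB entry, here is a minimal sketch of storing objects by key and querying them through an index. The database name, the `notes` store, and the `Note` shape are hypothetical and not taken from the demo; the code assumes a browser environment and rejects cleanly elsewhere.

```typescript
interface Note {
  id: string;
  category: string;
  text: string;
}

// Open (or create) the database with one object store and one index.
function openNotesDb(name = "notes-demo"): Promise<IDBDatabase> {
  return new Promise((resolve, reject) => {
    if (typeof indexedDB === "undefined") {
      reject(new Error("IndexedDB is not available in this environment"));
      return;
    }
    const request = indexedDB.open(name, 1);
    // Runs only when the database is created or its version changes.
    request.onupgradeneeded = () => {
      const store = request.result.createObjectStore("notes", { keyPath: "id" });
      store.createIndex("byCategory", "category");
    };
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// Insert or overwrite a note inside a read-write transaction.
function putNote(db: IDBDatabase, note: Note): Promise<void> {
  return new Promise((resolve, reject) => {
    const tx = db.transaction("notes", "readwrite");
    tx.objectStore("notes").put(note);
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
    tx.onabort = () => reject(tx.error);
  });
}

// Look up all notes in one category through the index.
function notesByCategory(db: IDBDatabase, category: string): Promise<Note[]> {
  return new Promise((resolve, reject) => {
    const request = db
      .transaction("notes", "readonly")
      .objectStore("notes")
      .index("byCategory")
      .getAll(category);
    request.onsuccess = () => resolve(request.result as Note[]);
    request.onerror = () => reject(request.error);
  });
}
```

In a browser, usage would look like `const db = await openNotesDb(); await putNote(db, { id: "1", category: "action", text: "Send recap" });` followed by `await notesByCategory(db, "action")`.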