Building an AI-Powered Desktop App from Scratch – A Founder’s Journey with icheetz.fun
*Estimated Reading Time: 15–20 minutes*
🎯 Introduction: Why I Built icheetz.fun
Every developer has that one side project — the idea that starts as a scribble in a notebook or a late-night thought after a frustrating experience.
For me, it was frustration during job interviews.
Despite knowing my stuff, I’d freeze up. Forget key points. Second-guess myself.
So instead of practicing more, I did what any self-respecting developer would do:
I built a tool to help me — and others — get better at interviews in real time.
That tool became **icheetz.fun** — an AI-powered desktop assistant that gives real-time suggestions during live conversations like job interviews, presentations, and meetings.
In this post, I’ll walk you through:
- How the idea came to life
- The tech stack behind icheetz.fun
- Key challenges I faced while building it
- What I learned along the way
- And how you can build something similar too
Let’s dive in.
🛠️ Part 1: Choosing the Right Tech Stack
Before writing a single line of code, I had to decide on the right tools.
Since I wanted **icheetz.fun** to be a desktop app with real-time AI capabilities, the choice was clear:
🔧 Frontend: Electron + LitElement
Electron allows developers to build cross-platform desktop apps using JavaScript, HTML, and CSS — which means faster development and easier debugging.
I chose **LitElement** for the UI components because of its lightweight nature and reactivity model. It gave me a structured way to build custom elements without the overhead of heavier frameworks.
⚙️ Backend: Node.js with Electron IPC
The backend runs on **Node.js**, handling communication between the UI and the core functionality of the app.
To allow seamless interaction between the main process and the renderer process in Electron, I used **IPC (Inter-Process Communication)** — making sure screen capture, audio processing, and AI requests could all happen smoothly in the background.
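To make that concrete, here's the shape of the wiring (a minimal sketch; the `ai:analyze` channel name, payload fields, and `analyzeContext` call are hypothetical placeholders, not the app's actual code):

```js
// main process: answer analysis requests coming from the UI
const { ipcMain } = require('electron');

// Hypothetical channel name, chosen for illustration
ipcMain.handle('ai:analyze', async (event, payload) => {
  // Forward the captured screen/audio to the AI engine (see Part 5)
  return analyzeContext(payload.screen, payload.audio);
});

// renderer process: send a request and await the suggestion
const { ipcRenderer } = require('electron');

async function requestSuggestion(frame, transcript) {
  return ipcRenderer.invoke('ai:analyze', { screen: frame, audio: transcript });
}
```

The `invoke`/`handle` pair gives you a promise-based request/response flow, which keeps the renderer responsive while the heavy lifting happens in the main process.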
🤖 AI Engine: Google Gemini 2.0 Flash Live API
This is where the magic happens.
The **Google Gemini 2.0 Flash Live API** powers the real-time contextual understanding and suggestion generation. By sending multi-modal data (both visual and audio), I was able to get highly relevant responses during live conversations.
This was by far the most exciting part — watching the AI understand context and offer meaningful suggestions in real time.
🎯 Part 2: Designing the Core Features
Before diving into implementation, I defined the must-have features:
| Feature | Purpose |
|---|---|
| Real-Time Suggestions | Help users stay articulate under pressure |
| Multi-Modal Input | Combine screen + audio analysis for deeper context |
| Non-Intrusive UI | Transparent overlay that doesn’t block the user |
| Platform Support | Work on macOS and Windows (Linux support planned) |
| Privacy First | All processing local; no cloud logs |
These goals shaped every technical decision I made.
🖥️ Part 3: Implementing Screen Capture
One of the biggest challenges was capturing screen content in real time without slowing down performance.
Thankfully, Electron provides a built-in module called **desktopCapturer**, which allows access to screen and audio sources.
Here’s how I used it:
```js
const { desktopCapturer } = require('electron');

desktopCapturer.getSources({ types: ['screen'] }).then(async (sources) => {
  for (const source of sources) {
    const stream = await navigator.mediaDevices.getUserMedia({
      audio: false,
      video: {
        mandatory: {
          chromeMediaSource: 'desktop',
          chromeMediaSourceId: source.id, // bind the stream to this source
          minWidth: 1280,
          maxWidth: 1920,
          minHeight: 720,
          maxHeight: 1080
        }
      }
    });
    // Process stream frames here
  }
});
```
This allowed me to grab screen content and send it to the AI engine for analysis.
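To turn that stream into something the AI engine can consume, I grab individual frames and encode them as base64 PNGs. Here's a rough sketch of the idea (a hypothetical helper, assuming it runs in the renderer, where `document` and canvas APIs are available):

```js
// Draw the current video frame onto a canvas and return a base64 PNG
// payload, which matches Gemini's inlineData format (see Part 5)
async function captureFrame(stream) {
  const video = document.createElement('video');
  video.srcObject = stream;
  await video.play(); // dimensions are known once playback starts

  const canvas = document.createElement('canvas');
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  canvas.getContext('2d').drawImage(video, 0, 0);

  // Strip the "data:image/png;base64," prefix, keep only the payload
  return canvas.toDataURL('image/png').split(',')[1];
}
```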
🔊 Part 4: Audio Capture Across Platforms
Audio processing was another big hurdle — especially getting system-level audio across different operating systems.
macOS
On macOS, I used a native binary called **SystemAudioDump** to capture system audio directly. This provided clean, loopback-free audio input.
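In practice, the app spawns the binary and reads audio off its stdout. A minimal sketch (the path is a placeholder, and it assumes the binary streams raw PCM to stdout, as it did in my setup):

```js
const { spawn } = require('child_process');

// Hypothetical path; in a real build, bundle the binary with the app
const dump = spawn('./bin/SystemAudioDump');

dump.stdout.on('data', (chunk) => {
  // chunk is a Buffer of raw PCM samples; queue it for transcription
  handleAudioChunk(chunk); // hypothetical handler
});

dump.on('error', (err) => console.error('SystemAudioDump failed:', err));
```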
Windows
For Windows, I used Chromium's `getUserMedia` desktop-capture path to implement loopback capture — essentially recording the system sound output.
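A hedged sketch of that pattern (renderer process; note that Chromium on Windows only grants desktop audio when desktop video is requested in the same call):

```js
async function captureSystemAudio() {
  // Request loopback (system) audio alongside a throwaway desktop video track
  const stream = await navigator.mediaDevices.getUserMedia({
    audio: { mandatory: { chromeMediaSource: 'desktop' } },
    video: { mandatory: { chromeMediaSource: 'desktop' } }
  });

  // Keep the audio track; stop the video track we didn't need
  stream.getVideoTracks().forEach(track => track.stop());
  return stream;
}
```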
Linux (Experimental)
Linux support is still experimental. For now, I’m using basic microphone input via `getUserMedia`.
While not ideal, it’s a solid starting point until full loopback support becomes available.
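For reference, that interim path is just the standard web API (a minimal sketch using `MediaRecorder`; the chunk handler is hypothetical):

```js
async function captureMicrophone() {
  // Plain microphone input; no system-audio loopback on Linux yet
  const micStream = await navigator.mediaDevices.getUserMedia({ audio: true });

  const recorder = new MediaRecorder(micStream);
  recorder.ondataavailable = (e) => {
    processAudioBlob(e.data); // hypothetical handler; e.data is an audio Blob
  };
  recorder.start(1000); // emit a chunk roughly every second
}
```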
🌐 Part 5: Integrating with Google Gemini 2.0 Flash Live API
Once I had both screen and audio data, I needed to send it to the AI model.
I chose **Google Gemini 2.0 Flash Live API** because of its ability to handle streaming inputs and provide real-time responses.
Here’s a simplified version of the integration:
```js
const { GoogleGenerativeAI } = require('@google/generative-ai');

const genAI = new GoogleGenerativeAI('YOUR_API_KEY');

async function analyzeContext(screenData, audioData) {
  const model = genAI.getGenerativeModel({ model: "gemini-2.0-flash-live" });

  // Send the screen frame (base64 PNG) and the transcript together so
  // the model sees both modalities in one request. (Simplified: the
  // Live API proper streams over a persistent session.)
  const result = await model.generateContent([
    "Analyze the following conversation context:",
    { inlineData: { mimeType: "image/png", data: screenData } },
    { text: `Transcribed audio: ${audioData}` }
  ]);

  return result.response.text();
}
```
This sends both visual and audio context to the AI, which then returns smart suggestions based on the current conversation.
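Wiring it into the capture pipeline is then just a couple of lines (using the hypothetical `captureFrame` helper sketched in Part 3):

```js
const frame = await captureFrame(screenStream); // base64 PNG payload
const suggestion = await analyzeContext(frame, transcript);
console.log('AI suggestion:', suggestion);
```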
💻 Part 6: Creating the Overlay UI
The final interface needed to be subtle yet effective.
I designed a **transparent, always-on-top window** that displays AI suggestions in real time.
Key design decisions:
- No mouse interaction — only keyboard shortcuts
- Lightweight styling to avoid distraction
- Keyboard toggle (`Cmd/Ctrl + M`) to show/hide the overlay (sketched after this list)
- Enter key to send feedback or selected suggestions
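The toggle itself is a few lines with Electron's `globalShortcut` module (a minimal sketch, assuming `overlayWindow` is the overlay created below; shortcuts must be registered after the app is ready):

```js
const { app, globalShortcut } = require('electron');

app.whenReady().then(() => {
  // Cmd+M on macOS, Ctrl+M on Windows/Linux
  globalShortcut.register('CommandOrControl+M', () => {
    overlayWindow.isVisible() ? overlayWindow.hide() : overlayWindow.show();
  });
});
```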
Using Electron’s `BrowserWindow`, I created the overlay like this:
```js
const { BrowserWindow } = require('electron');

const overlayWindow = new BrowserWindow({
  width: 400,
  height: 200,
  transparent: true,
  frame: false,
  alwaysOnTop: true,
  webPreferences: {
    // Lets the overlay page call require(); a preload script is the
    // safer choice for production builds
    nodeIntegration: true
  }
});

overlayWindow.loadFile('overlay.html');
```
This gave users a minimal but powerful interface that didn’t interfere with their calls or presentations.
🧪 Part 7: Testing and Debugging Challenges
As with any real-time application, testing was tricky.
Some of the issues I ran into:
- Performance drops when analyzing high-resolution screens
- Audio synchronization issues with screen data
- API rate limiting from the Gemini API
- Handling errors gracefully without crashing the app
To solve these, I implemented:
- Frame sampling to reduce CPU usage
- Debounced API calls to prevent rate limits (see the sketch after this list)
- Fallback logic if AI response was delayed
- Logging system (only for debugging, never stored user data)
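As one concrete example, the debounce wrapper is just a timer that collapses bursts of calls into a single trailing call (a generic sketch, not the exact production code):

```js
// Collapse rapid-fire calls into one call after things go quiet
function debounce(fn, waitMs) {
  let timer = null;
  return (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// e.g. fire at most one Gemini request per 1.5s of silence
const debouncedAnalyze = debounce(analyzeContext, 1500);
```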
📈 Part 8: Lessons Learned from Building icheetz.fun
Building this product taught me more than just technical skills — it reshaped how I think about software development and entrepreneurship.
1. Start Small, Think Big
I didn’t try to build everything at once. I focused on the MVP: screen capture + AI suggestions. Everything else came later.
2. Build for Yourself First
Because I was solving my own problem, I knew exactly what features were essential. That made prioritization easy.
3. Use the Right Tools for the Job
Electron, LitElement, and the Gemini API worked together beautifully. Don’t force a framework — let the problem guide the solution.
4. Prioritize Privacy From Day One
Even in early development, I made sure all data stayed local unless explicitly shared. Users appreciate transparency.
5. Feedback Is Invaluable
Early testers helped shape the direction of the app. Listening to them made icheetz.fun better for everyone.
🚀 Part 9: What’s Next for icheetz.fun?
Now that the beta is out, I’m working on several exciting updates:
- ✅ Linux support
- 🎯 Custom profiles for different interview types
- 🗣️ Voice command support
- 📊 Exportable conversation insights
- 🧩 Calendar integrations for automatic prep
If you're interested in contributing or trying the beta, feel free to reach out!
📢 Part 10: Want to Build Something Like This?
You absolutely can.
Here’s a quick checklist to get started:
- Pick a problem you face regularly
- Choose the right tools for the job
- Focus on a minimum viable product
- Iterate based on feedback
- Keep privacy and usability top of mind
Whether you’re building an AI assistant, productivity tool, or just learning to code — start small, stay consistent, and ship early.
And remember: great products aren’t built overnight — they’re built one line of code, one test, and one user at a time.
💬 Final Thoughts
icheetz.fun started as a personal frustration — and turned into a product that helps professionals around the world.
If you're reading this and thinking about building your own side project, take the leap.
Start small. Build for yourself. Ship early. Improve constantly.
You never know where it might lead.
🙋‍♂️ Questions? Feedback? Want to Contribute?
Feel free to reach out!