Building an AI-Powered Desktop App from Scratch – A Founder’s Journey with icheetz.fun

*Estimated Reading Time: 15–20 minutes*

🎯 Introduction: Why I Built icheetz.fun

Every developer has that one side project — the idea that starts as a scribble in a notebook or a late-night thought after a frustrating experience.

For me, it was frustration during job interviews.

Despite knowing my stuff, I’d freeze up. Forget key points. Second-guess myself.

So instead of practicing more, I did what any self-respecting developer would do:

I built a tool to help me — and others — get better at interviews in real time.

That tool became **icheetz.fun** — an AI-powered desktop assistant that gives real-time suggestions during live conversations like job interviews, presentations, and meetings.

In this post, I’ll walk you through:

  • How the idea came to life
  • The tech stack behind icheetz.fun
  • Key challenges I faced while building it
  • What I learned along the way
  • How you can build something similar

Let’s dive in.

🛠️ Part 1: Choosing the Right Tech Stack

Before writing a single line of code, I had to decide on the right tools.

Since I wanted **icheetz.fun** to be a desktop app with real-time AI capabilities, the choice was clear:

🔧 Frontend: Electron + LitElement

Electron allows developers to build cross-platform desktop apps using JavaScript, HTML, and CSS — which means faster development and easier debugging.

I chose **LitElement** for the UI components because of its lightweight nature and reactivity model. It gave me a structured way to build custom elements without the overhead of heavier frameworks.

⚙️ Backend: Node.js with Electron IPC

The backend runs on **Node.js**, handling communication between the UI and the core functionality of the app.

To allow seamless interaction between the main process and the renderer process in Electron, I used **IPC (Inter-Process Communication)** — making sure screen capture, audio processing, and AI requests could all happen smoothly in the background.
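
As a rough sketch of that wiring (the channel name `capture:list-sources` is my own invention, not necessarily what the app uses), the main process can expose screen-capture data to the overlay UI like this:

```javascript
// main.js — main process. The channel name is illustrative;
// pick any string, but keep it consistent on both sides.
const { app, ipcMain, desktopCapturer } = require('electron');

app.whenReady().then(() => {
  ipcMain.handle('capture:list-sources', async () => {
    const sources = await desktopCapturer.getSources({ types: ['screen'] });
    // Return only plain, serializable data to the renderer.
    return sources.map(s => ({ id: s.id, name: s.name }));
  });
});

// renderer — the overlay UI asks the main process for sources:
//   const { ipcRenderer } = require('electron');
//   const sources = await ipcRenderer.invoke('capture:list-sources');
```

The `invoke`/`handle` pair gives you promise-based request/response semantics, which keeps the renderer free of callback bookkeeping.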

🤖 AI Engine: Google Gemini 2.0 Flash Live API

This is where the magic happens.

The **Google Gemini 2.0 Flash Live API** powers the real-time contextual understanding and suggestion generation. By sending multi-modal data (both visual and audio), I was able to get highly relevant responses during live conversations.

This was by far the most exciting part — watching the AI understand context and offer meaningful suggestions in real time.

🎯 Part 2: Designing the Core Features

Before diving into implementation, I defined the must-have features:

| Feature | Purpose |
| --- | --- |
| Real-Time Suggestions | Help users stay articulate under pressure |
| Multi-Modal Input | Combine screen + audio analysis for deeper context |
| Non-Intrusive UI | Transparent overlay that doesn’t block the user |
| Platform Support | Work on macOS and Windows (Linux support planned) |
| Privacy First | All processing local; no cloud logs |

These goals shaped every technical decision I made.

🖥️ Part 3: Implementing Screen Capture

One of the biggest challenges was capturing screen content in real time without slowing down performance.

Thankfully, Electron provides a built-in module called **desktopCapturer**, which allows access to screen and audio sources.

Here’s how I used it:

// Note: in recent Electron versions, desktopCapturer.getSources is
// main-process only; older versions allowed it from the renderer.
const { desktopCapturer } = require('electron');

desktopCapturer.getSources({ types: ['screen'] }).then(async sources => {
  for (const source of sources) {
    const stream = await navigator.mediaDevices.getUserMedia({
      audio: false,
      video: {
        mandatory: {
          chromeMediaSource: 'desktop',
          chromeMediaSourceId: source.id, // capture this specific screen
          minWidth: 1280,
          maxWidth: 1920,
          minHeight: 720,
          maxHeight: 1080
        }
      }
    });

    // Process stream frames here
  }
});

This allowed me to grab screen content and send it to the AI engine for analysis.

🔊 Part 4: Audio Capture Across Platforms

Audio processing was another big hurdle — especially getting system-level audio across different operating systems.

macOS

On macOS, I used a native binary called **SystemAudioDump** to capture system audio directly. This provided clean, loopback-free audio input.

Windows

For Windows, I used the web `mediaDevices` API inside Electron to implement loopback capture — essentially recording the system sound output.

Linux (Experimental)

Linux support is still experimental. For now, I’m using basic microphone input via `getUserMedia`.

While not ideal, it’s a solid starting point until full loopback support becomes available.
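
The fallback pattern is simple enough to sketch; this is the general shape, not the app's exact code:

```javascript
// Renderer-side fallback for Linux: plain microphone capture via
// getUserMedia, with no system loopback.
async function captureMicrophone() {
  // Request the default microphone only; echoCancellation helps when
  // the user's speakers are audible to the mic.
  const stream = await navigator.mediaDevices.getUserMedia({
    audio: { echoCancellation: true },
    video: false
  });
  return stream; // a MediaStream to feed into the transcription pipeline
}
```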

🌐 Part 5: Integrating with Google Gemini 2.0 Flash Live API

Once I had both screen and audio data, I needed to send it to the AI model.

I chose **Google Gemini 2.0 Flash Live API** because of its ability to handle streaming inputs and provide real-time responses.

Here’s a simplified version of the integration:

const { GoogleGenerativeAI } = require('@google/generative-ai');

// Read the key from the environment instead of hard-coding it.
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);

async function analyzeContext(screenData, audioData) {
  const model = genAI.getGenerativeModel({ model: "gemini-2.0-flash-live" });

  const result = await model.generateContent([
    "Analyze the following conversation context:",
    { inlineData: { mimeType: "image/png", data: screenData } },
    { text: `Transcribed audio: ${audioData}` }
  ]);

  return result.response.text();
}

This sends both visual and audio context to the AI, which then returns smart suggestions based on the current conversation.

💻 Part 6: Creating the Overlay UI

The final interface needed to be subtle yet effective.

I designed a **transparent, always-on-top window** that displays AI suggestions in real time.

Key design decisions:

  • No mouse interaction — only keyboard shortcuts
  • Lightweight styling to avoid distraction
  • Keyboard toggle (`Cmd/Ctrl + M`) to show/hide the overlay
  • Enter key to send feedback or selected suggestions

Using Electron’s `BrowserWindow`, I created the overlay like this:

const { BrowserWindow } = require('electron');

const overlayWindow = new BrowserWindow({
  width: 400,
  height: 200,
  transparent: true,
  frame: false,
  alwaysOnTop: true,
  webPreferences: {
    nodeIntegration: true,   // fine for a quick prototype
    contextIsolation: false  // consider a preload script in production
  }
});

overlayWindow.loadFile('overlay.html');

This gave users a minimal but powerful interface that didn’t interfere with their calls or presentations.
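
The show/hide toggle can be wired up with Electron's `globalShortcut` module (a sketch, assuming the `overlayWindow` created above):

```javascript
const { app, globalShortcut } = require('electron');

app.whenReady().then(() => {
  // 'CommandOrControl' resolves to Cmd on macOS and Ctrl elsewhere.
  globalShortcut.register('CommandOrControl+M', () => {
    if (overlayWindow.isVisible()) {
      overlayWindow.hide();
    } else {
      overlayWindow.show();
    }
  });
});

// Always release shortcuts on exit.
app.on('will-quit', () => {
  globalShortcut.unregisterAll();
});
```

Because `globalShortcut` registers system-wide hotkeys, the toggle works even when the overlay window doesn't have focus — which is exactly what you want during a call.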

🧪 Part 7: Testing and Debugging Challenges

As with any real-time application, testing was tricky.

Some of the issues I ran into:

  • Performance drops when analyzing high-resolution screens
  • Audio synchronization issues with screen data
  • Rate limiting from the Gemini API
  • Handling errors gracefully without crashing the app

To solve these, I implemented:

  • Frame sampling to reduce CPU usage
  • Debounced API calls to prevent rate limits
  • Fallback logic if AI response was delayed
  • Logging system (only for debugging, never stored user data)
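
Two of those fixes are easy to show in isolation. These helpers are illustrative — the names and intervals are my own, not the exact values used in icheetz.fun:

```javascript
// Frame sampling: only allow one frame through every `intervalMs`,
// dropping the rest to keep CPU usage down. The clock is injectable
// so the logic can be tested deterministically.
function createFrameSampler(intervalMs, now = Date.now) {
  let last = -Infinity;
  return function shouldProcessFrame() {
    const t = now();
    if (t - last >= intervalMs) {
      last = t;
      return true;
    }
    return false;
  };
}

// Debouncing: collapse a burst of calls into a single trailing call,
// which keeps request volume under the API's rate limits.
function debounce(fn, waitMs) {
  let timer = null;
  return function (...args) {
    if (timer) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = null;
      fn(...args);
    }, waitMs);
  };
}
```

In the capture loop, `shouldProcessFrame()` gates whether a frame is encoded at all, and the Gemini call itself is wrapped in `debounce` so rapid screen changes don't each trigger a request.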

📈 Part 8: Lessons Learned from Building icheetz.fun

Building this product taught me more than just technical skills — it reshaped how I think about software development and entrepreneurship.

1. Start Small, Think Big

I didn’t try to build everything at once. I focused on the MVP: screen capture + AI suggestions. Everything else came later.

2. Build for Yourself First

Because I was solving my own problem, I knew exactly what features were essential. That made prioritization easy.

3. Use the Right Tools for the Job

Electron, LitElement, and the Gemini API worked together beautifully. Don’t force a framework — let the problem guide the solution.

4. Prioritize Privacy From Day One

Even in early development, I made sure all data stayed local unless explicitly shared. Users appreciate transparency.

5. Feedback Is Invaluable

Early testers helped shape the direction of the app. Listening to them made icheetz.fun better for everyone.

🚀 Part 9: What’s Next for icheetz.fun?

Now that the beta is out, I’m working on several exciting updates:

  • ✅ Linux support
  • 🎯 Custom profiles for different interview types
  • 🗣️ Voice command support
  • 📊 Exportable conversation insights
  • 🧩 Calendar integrations for automatic prep

If you're interested in contributing or trying the beta, feel free to reach out!

📢 Part 10: Want to Build Something Like This?

You absolutely can.

Here’s a quick checklist to get started:

  • Pick a problem you face regularly
  • Choose the right tools for the job
  • Focus on a minimum viable product
  • Iterate based on feedback
  • Keep privacy and usability top of mind

Whether you’re building an AI assistant, productivity tool, or just learning to code — start small, stay consistent, and ship early.

And remember: great products aren’t built overnight — they’re built one line of code, one test, and one user at a time.

💬 Final Thoughts

icheetz.fun started as a personal frustration — and turned into a product that helps professionals around the world.

If you're reading this and thinking about building your own side project, take the leap.

Start small. Build for yourself. Ship early. Improve constantly.

You never know where it might lead.

🙋‍♂️ Questions? Feedback? Want to Contribute?

Feel free to reach out!