GitHub Repository

A simple, elegant Python library and Mini SDK for AI models with powerful features. Built for developers who want to integrate AI into their projects without dealing with complex API setup. Supports Google Gemini, OpenAI, Anthropic, and any custom provider via the Adapter Pattern.

6 stars · Python

Dracula-AI – A lightweight, async SQLite-backed Gemini wrapper

by suleymanibis·Mar 4, 2026·2 points·0 comments

AI Analysis

●● Solid · Ship It · Solve My Problem

SQLite memory beats JSON bloat; async streaming works—but it's still a Gemini wrapper.

Strengths
  • Practical v0.8.0 rewrite: SQLite storage (sqlite3 for sync, aiosqlite for async) eliminates RAM bloat from chat history; exponential backoff prevents crash cascades
  • Streaming + function calling + logging reduce boilerplate vs. official Gemini SDK; no PyQt6 forced dependency
Weaknesses
  • Gemini-only; no multi-model abstraction (LiteLLM, LangChain solve this more generally)
  • 18-year-old author shows promise, but SDK wrappers are a crowded category; LLM abstraction layers (LiteLLM, Anthropic SDK) dominate
Target Audience

Python developers integrating Gemini into chatbots, agents, and server applications requiring persistent memory

Similar To

LiteLLM · LangChain · Google's official Gemini SDK

Post Description

I'm an 18-year-old CS student from Turkey. I've been building Dracula, a Python wrapper for the Google Gemini API. I initially built it because I wanted a simpler Mini SDK that handled conversational memory, function calling, and streaming out of the box without the boilerplate of the official SDK.

Recently, I got some well-deserved technical criticism from early users: using JSON files to store chat history was a memory-bloat disaster waiting to happen; forcing a PyQt6 dependency on server-side bots was a terrible design choice; and lacking a retry mechanism meant random 503s from Google crashed the whole app.

So, I went back to the drawing board and completely rewrote the core architecture for v0.8.0. Here is what I changed to make it production-ready:

Swapped JSON for SQLite: I implemented a local database system (using sqlite3 for sync and aiosqlite for async). It now handles massive chat histories without eating RAM, and tracks usage stats safely.
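As a rough illustration of why SQLite helps here, the sketch below stores each message as a row and reads back only the most recent ones, so the full history never has to sit in memory. The `ChatHistory` class, table name, and columns are hypothetical and are not Dracula's actual schema; an async variant would presumably swap `sqlite3` for `aiosqlite`.

```python
import sqlite3


class ChatHistory:
    """Hypothetical SQLite-backed message store (not Dracula's real schema)."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "session TEXT NOT NULL, "
            "role TEXT NOT NULL, "
            "content TEXT NOT NULL)"
        )

    def add(self, session, role, content):
        # The connection context manager commits on success, rolls back on error.
        with self.conn:
            self.conn.execute(
                "INSERT INTO messages (session, role, content) VALUES (?, ?, ?)",
                (session, role, content),
            )

    def recent(self, session, limit=20):
        # Fetch only the newest `limit` rows, then return them oldest first.
        rows = self.conn.execute(
            "SELECT role, content FROM messages WHERE session = ? "
            "ORDER BY id DESC LIMIT ?",
            (session, limit),
        ).fetchall()
        return rows[::-1]


history = ChatHistory()
history.add("demo", "user", "Hello")
history.add("demo", "model", "Hi there")
history.add("demo", "user", "Explain SQLite")
print(history.recent("demo", limit=2))
```

With a JSON file, every append means loading and rewriting the whole history; here an insert touches one row and a read is bounded by `limit`.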

True Async Streaming: Fixed a generator bug that was blocking the asyncio event loop. Streaming now yields chunks natively in real-time.
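The post does not say how the generator bug was fixed. One common way to keep a synchronous stream from blocking the event loop is to pull each chunk in a worker thread, as in this sketch. `blocking_chunks` is a stand-in for a blocking SDK stream and `stream_without_blocking` is a hypothetical helper, neither taken from Dracula; `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio
import time

_DONE = object()  # sentinel so StopIteration never crosses the thread boundary


def blocking_chunks():
    """Stand-in for a synchronous stream that blocks between chunks."""
    for word in ("quantum ", "computing ", "explained"):
        time.sleep(0.01)
        yield word


async def stream_without_blocking(sync_iterable):
    """Wrap a blocking iterable as an async generator."""
    iterator = iter(sync_iterable)
    while True:
        # next() runs in a worker thread; the event loop stays free meanwhile.
        chunk = await asyncio.to_thread(next, iterator, _DONE)
        if chunk is _DONE:
            break
        yield chunk


async def collect():
    ticks = 0

    async def heartbeat():
        # Counts how often the loop gets to run while chunks are streaming.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.001)

    task = asyncio.create_task(heartbeat())
    chunks = [c async for c in stream_without_blocking(blocking_chunks())]
    task.cancel()
    return chunks, ticks


chunks, ticks = asyncio.run(collect())
print("".join(chunks))
```

If the heartbeat keeps ticking while chunks arrive, other coroutines (for example, other Discord or FastAPI handlers) are not being starved by the stream.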

Exponential Backoff: Added an under-the-hood auto-retry mechanism that gracefully handles 429 rate limits and 503/502 server drops.
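A minimal sketch of retry with exponential backoff for the status codes named above. `ApiError`, `with_backoff`, and the delay values are assumptions for illustration, not Dracula's internals; the `sleep` parameter exists only so the demo can record delays instead of actually waiting.

```python
import asyncio
import random

RETRYABLE_STATUS = {429, 502, 503}


class ApiError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


async def with_backoff(call, max_attempts=5, base_delay=0.5, max_delay=8.0,
                       sleep=asyncio.sleep):
    """Await call(), retrying retryable errors with doubling delays plus jitter."""
    for attempt in range(max_attempts):
        try:
            return await call()
        except ApiError as exc:
            last_attempt = attempt == max_attempts - 1
            if exc.status not in RETRYABLE_STATUS or last_attempt:
                raise  # non-retryable error, or out of attempts
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads retries out so many clients do not retry in lockstep.
            await sleep(delay + random.uniform(0, delay / 2))


# Demo: a call that fails twice with 503, then succeeds.
calls = {"count": 0}
delays = []


async def flaky_call():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ApiError(503)
    return "ok"


async def fake_sleep(seconds):
    delays.append(seconds)  # record the delay instead of waiting


result = asyncio.run(with_backoff(flaky_call, sleep=fake_sleep))
print(result, calls["count"], len(delays))
```

Errors outside the retryable set (for example a 400) are re-raised immediately, since retrying a malformed request would only waste quota.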

Zero Bloat: Split the dependencies. "pip install dracula-ai" installs just the core for FastAPI/Discord bots. "pip install dracula-ai[ui]" brings in the desktop interface.
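On the code side, a split like this usually pairs with a guarded import so the core package never imports the UI toolkit unless it is asked to. The helper below is a hypothetical sketch, not Dracula's code; it assumes the `[ui]` extra installs PyQt6, which the post implies but does not state outright.

```python
import importlib
import importlib.util


def require_extra(module_name, extra, package="dracula-ai"):
    """Import an optional top-level module, or explain which extra provides it."""
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f'{module_name} is not installed; run: pip install "{package}[{extra}]"'
        )
    return importlib.import_module(module_name)


def launch_ui():
    # Imported lazily, so server-side users never need the desktop toolkit.
    qt = require_extra("PyQt6", "ui")  # assumption: the [ui] extra provides PyQt6
    return qt
```

Calling `launch_ui()` on a core-only install then fails with an actionable message instead of a bare `ModuleNotFoundError` at import time.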

Here is a quick example of the async streaming:

    import os, asyncio
    from dracula import AsyncDracula

    async def main():
        async with AsyncDracula(api_key=os.getenv("GEMINI_API_KEY")) as ai:
            async for chunk in ai.stream("Explain quantum computing"):
                print(chunk, end="", flush=True)

    asyncio.run(main())

Building this has been a huge learning curve for me regarding database migrations, event loops, and package management. I would love for the HN community to look at the code, review the async architecture, and tell me what I did wrong (or right!).

GitHub: https://github.com/suleymanibis0/dracula
PyPI: https://pypi.org/project/dracula-ai/

Thanks for reading!

Similar Projects

AI/ML · ●● Solid

AgentForge – Multi-LLM Orchestrator in 15KB of Python

AgentForge compresses common production patterns—token-aware rate limiting (token-bucket), retry+exponential backoff, prompt templates and cost tracking—into a tiny async core and lets you flip providers with one parameter. The multi-agent mesh and ReAct loop bits are the most interesting engineering bets here, and the repo includes benchmarks and a Streamlit demo, but it lives in a crowded space next to LangChain and similar toolkits so real differentiation will come from adoption and edge-case robustness.

Niche Gem · Ship It
chunktort
213mo ago