Agno
New
Build multi-modal agents in plain Python — text, image, audio, and video inputs handled natively. Agno's tool library and memory system handle the infrastructure.
Agents
★ 4.4(1,500 reviews)freeOverview
Agno (formerly phidata) is a Python framework for building multi-modal AI agents that can reason over text, images, audio, and video with native tool use and memory. Agents are defined in plain Python classes — no DSL, no YAML — and Agno handles the inference routing, tool execution loop, and session state. Ships with 30+ built-in tools (web search, database queries, file operations) and pluggable memory backends.
Key Features
- Multi-modal agents with native text, image, audio, and video reasoning
- Plain Python class definitions — no framework-specific DSL
- 30+ built-in tools: web search, SQL, file ops, APIs
- Pluggable memory backends including PostgreSQL, MongoDB, and SQLite
- Agent Teams for orchestrating multiple specialized sub-agents
- Structured output support via Pydantic models
Pros
- • Multi-modal by default — no special handling for image or audio inputs
- • Pythonic API makes agents readable to anyone who knows Python
- • Built-in tool library means less boilerplate for common tasks
Cons
- • Python-only — no TypeScript SDK unlike some competitors
- • Cloud observability platform is still early-stage
- • Less community content than LangChain or CrewAI at the same maturity
Advertisement