AI-Powered WhatsApp Chatbot 🤖📲 for Text, Voice, Images & PDFs with memory 🧠
Multimodal AI chatbot that processes WhatsApp text, voice, images, and PDFs with memory.
What this workflow does
This workflow receives WhatsApp messages, detects their type, and uses OpenAI models to transcribe voice, describe images, extract PDF content, or answer text while preserving conversation history through a 10-turn memory window.
It suits support teams, content creators, and accessibility-focused organizations that need reliable multimodal handling inside WhatsApp without building custom infrastructure.
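The type-detection step described above can be sketched in a few lines. This is an illustrative sketch, not the template's actual node logic: it assumes each incoming message is a WhatsApp Cloud API style object with a top-level `type` field and, for documents, a nested `mime_type`. The function name `detect_input_type` and the `"unsupported"` label are hypothetical.

```python
from typing import Any

# Assumption: only PDFs are accepted among document uploads.
PDF_MIME = "application/pdf"


def detect_input_type(message: dict[str, Any]) -> str:
    """Return 'text', 'audio', 'image', 'pdf', or 'unsupported' for one message.

    Assumes a message object with a top-level 'type' field and, for
    documents, a nested 'document' object that carries 'mime_type'.
    """
    kind = message.get("type")
    if kind in ("text", "audio", "image"):
        return kind
    if kind == "document":
        mime = message.get("document", {}).get("mime_type", "")
        # Drop parameters such as "; charset=..." before comparing.
        if mime.split(";")[0].strip().lower() == PDF_MIME:
            return "pdf"
    return "unsupported"


if __name__ == "__main__":
    sample = {"type": "document", "document": {"mime_type": "application/pdf"}}
    print(detect_input_type(sample))  # pdf
```

Returning an explicit `"unsupported"` label gives the flow a place to send a fallback reply instead of failing silently on other message types.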
Who is this for?
Customer support teams and small businesses that use WhatsApp for client communication. It also suits product teams that need quick multimodal query handling.
What problem it solves
Manually processing varied WhatsApp inputs such as text, voice, images, and PDFs is slow and inconsistent. Manual handling also lacks the built-in memory needed for natural, ongoing conversations.
What it automates
Customer support queries
Users send photos of products or PDFs of invoices; the bot analyzes and replies with summaries or next steps.
Voice note responses
Clients leave audio messages; the workflow transcribes them, answers via AI, and optionally returns a voice reply.
Visual accessibility help
The bot describes images for visually impaired users or interprets charts sent via WhatsApp.
How the workflow works
The 7 nodes in this automation, in order.
1. HTTP Request (httpRequest)
2. WhatsApp Business Cloud (whatsApp)
3. Code (code)
4. AI Agent (@n8n/n8n-nodes-langchain.agent)
5. OpenAI Chat Model (@n8n/n8n-nodes-langchain.lmChatOpenAi)
6. Simple Memory (@n8n/n8n-nodes-langchain.memoryBufferWindow)
7. OpenAI (@n8n/n8n-nodes-langchain.openAi)
How to set up AI-Powered WhatsApp Chatbot 🤖📲 for Text, Voice, Images & PDFs with memory 🧠
1. Add a WhatsApp Trigger node and connect it to your WhatsApp Business Cloud credentials.
2. Insert an Input type node to branch on text, audio, image, or PDF MIME types.
3. Use HTTP Request nodes to download media and convert images to base64.
4. Connect the AI Agent to the OpenAI Chat Model for text, PDF, and image analysis, and to Simple Memory for session context.
5. Add an OpenAI node for Whisper transcription of voice messages and optional TTS for audio replies.
6. Route the final AI output back through WhatsApp Business Cloud to send responses.
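The base64 conversion in step 3 can be illustrated with a short sketch. It assumes the downloaded image is available as raw bytes and that the downstream model accepts a base64 data URL; whether the template builds a data URL or passes the bare base64 string is not stated here, and the helper name `to_data_url` is hypothetical.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL.

    The default MIME type is an assumption; pass the type reported
    for the downloaded media when it is known.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


if __name__ == "__main__":
    # Two placeholder bytes stand in for a real image download.
    print(to_data_url(b"hi", "image/png"))  # data:image/png;base64,aGk=
```

Base64 output is roughly a third larger than the raw bytes, so large images may be worth resizing before encoding.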
How to customize this workflow
- Swap the OpenAI Chat Model for another provider supported by the AI Agent node.
- Change the trigger to another messaging platform using available n8n nodes.
- Extend the memory window size in the Simple Memory node, or add user-specific session logic.
- Add an extra HTTP Request step to log all interactions to a Google Sheet.
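To make the memory-window and session options above concrete, here is a minimal sketch of a sliding-window memory keyed per user. It is not the Simple Memory node's implementation; the class name `WindowMemory` and the use of the sender's phone number as the session key are assumptions for illustration.

```python
from collections import defaultdict, deque


class WindowMemory:
    """Keep only the most recent `window` turns for each session key."""

    def __init__(self, window: int = 10) -> None:
        self.window = window
        # Each session gets a bounded deque that drops its oldest turn when full.
        self._sessions = defaultdict(lambda: deque(maxlen=window))

    def add_turn(self, session_key: str, user_text: str, reply_text: str) -> None:
        self._sessions[session_key].append((user_text, reply_text))

    def history(self, session_key: str) -> list[tuple[str, str]]:
        # .get avoids creating an empty session for unknown keys.
        return list(self._sessions.get(session_key, ()))


if __name__ == "__main__":
    memory = WindowMemory(window=2)
    # Hypothetical session key: the sender's phone number.
    for i in range(3):
        memory.add_turn("15550001111", f"question {i}", f"answer {i}")
    print(memory.history("15550001111"))
```

With `window=2`, the third turn pushes out the first, so only the last two exchanges are passed to the model. A larger window preserves more context but sends more tokens with each request.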
AI-Powered WhatsApp Chatbot 🤖📲 for Text, Voice, Images & PDFs with memory 🧠: pros & cons
Pros
- Handles four input types in one flow
- Maintains per-user memory for context
- Uses existing OpenAI and WhatsApp nodes
- Concise mobile-friendly replies
Cons
- Requires separate API keys for OpenAI and WhatsApp Business
- PDF extraction limited to text content only
- No built-in error handling for unsupported file types
Frequently asked questions
It receives WhatsApp messages of multiple types, processes them with OpenAI models, and replies while keeping conversation memory.