AI Voice Agents: How to Deploy a Voice AI Agent (Free Tier Available)

Two types of voice agents exist. Phone-based agents answer real calls on a real number. Browser-based agents let website visitors talk through their mic. Here is how to set up both.

There are two kinds of AI voice agents, and most platforms offer only one of them.

The first is a phone-based voice agent. It gets a real phone number. Customers call it, and the agent picks up, talks, and handles the conversation — scheduling appointments, answering questions, taking messages. It works exactly like calling a receptionist, except the receptionist never calls in sick and never puts anyone on hold.

The second is a browser-based voice agent. It lives on your website as a widget. Visitors click a microphone icon, speak into their mic, and the agent responds out loud. No phone number involved. It is voice-powered chat, running directly in the browser.

Sphinx Agent does both. Browser voice is on the free tier. Phone voice starts at Pro ($49/mo). This guide walks through setting up each one.

Browser Voice Agent Setup (Free Tier)

Browser voice is the fastest way to get a voice agent running. No phone number to provision, no telephony costs, no carrier setup. Your visitors talk through their browser, and the agent talks back.

Here is the setup, start to finish:

  1. Sign up at sphinxagent.ai. Free plan. No credit card.
  2. Create a new agent. Give it a name, pick a voice (male or female, multiple accents available), and write a system prompt that tells it who it is and what it should do. Example: "You are the front desk assistant for Greenfield Dental. You answer questions about services, hours, and insurance. You can schedule appointments."
  3. Train it on your data. Upload your FAQ, paste your website URL for auto-scraping, or drop in a PDF. The agent needs context about your business to give useful answers.
  4. Enable browser voice. In your agent settings, toggle on "Voice Mode." Pick your preferred voice model and language.
  5. Grab the embed code. Go to the Deploy tab. Copy the script tag. It looks like this:
     <script src="https://sphinxagent.ai/widget/YOUR_AGENT_ID.js" async></script>
  6. Paste it into your site. Drop the script tag before the closing </body> tag on any page where you want the agent to appear.
  7. Test it. Open your site, click the mic icon on the chat widget, and talk. The agent should respond within 1–2 seconds.
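
If you manage many pages, the paste step above can be scripted. The following is a minimal Python sketch, not an official Sphinx Agent tool. It assumes only the script URL pattern shown in the embed code; the function names widget_script_tag and add_widget are made up for this illustration.

```python
import re


def widget_script_tag(agent_id: str) -> str:
    """Build the embed tag from the URL pattern shown in the embed code."""
    return f'<script src="https://sphinxagent.ai/widget/{agent_id}.js" async></script>'


def add_widget(html: str, agent_id: str) -> str:
    """Insert the widget tag just before the last closing </body> tag."""
    tag = widget_script_tag(agent_id)
    if tag in html:
        return html  # already embedded; avoid loading the widget twice
    matches = list(re.finditer(r"</body\s*>", html, flags=re.IGNORECASE))
    if not matches:
        raise ValueError("no closing </body> tag found")
    idx = matches[-1].start()
    return html[:idx] + tag + "\n" + html[idx:]


page = "<html><body><h1>Greenfield Dental</h1></body></html>"
print(add_widget(page, "YOUR_AGENT_ID"))
```

Replace YOUR_AGENT_ID with the ID from your Deploy tab. Running the function twice on the same page leaves it unchanged, so it is safe to rerun in a build step.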

That is the entire process. Five minutes if your FAQ is already written. The free tier gives you 100 messages per month, which is enough to test whether voice works for your audience before committing to a paid plan.

Phone Voice Agent Setup (Pro Tier, $49/mo)

Phone voice is where things get serious. Your agent gets a real phone number. Customers call it, the agent answers, and the conversation happens over a normal phone line. The caller does not need an app, a browser, or an internet connection. They just dial a number.

This is the setup:

  1. Upgrade to Pro ($49/mo). Phone numbers require the Pro plan or higher. This includes one phone number and 25,000 messages per month.
  2. Provision a phone number. In your dashboard, go to Voice > Phone Numbers > Add Number. Pick a local or toll-free number. You can choose your area code. The number is live within 60 seconds.
  3. Configure the voice greeting. Set what the agent says when it picks up. Keep it short: "Thanks for calling Greenfield Dental. How can I help you?" Long greetings waste the caller's time and sound robotic.
  4. Set business hours. Define when the agent should answer calls. Outside of those hours, you can configure it to take a message, play a custom voicemail, or transfer to an after-hours number.
  5. Set up call routing rules. For questions the agent cannot handle, configure it to transfer to a human. You set the transfer trigger in the system prompt: "If the caller asks to speak to a manager or describes a medical emergency, transfer to 512-555-0100."
  6. Test it. Call the number from your personal phone. Run through your top five customer scenarios. Adjust the system prompt based on where the agent stumbles.
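
To make the business-hours step concrete, here is a rough Python sketch of the decision the agent makes when a call comes in. Sphinx Agent configures this in the dashboard, so none of this is its real API. The schedule, the action names, and the call_action function are all assumptions for illustration.

```python
from datetime import datetime, time

# Hypothetical schedule: weekday index (Monday = 0) -> (open, close).
# Days that are not listed are treated as closed.
BUSINESS_HOURS = {day: (time(8, 0), time(17, 0)) for day in range(5)}

# Hypothetical after-hours setting: "take_message", "voicemail", or "transfer".
AFTER_HOURS_ACTION = "take_message"


def call_action(now: datetime) -> str:
    """Return "answer" during business hours, else the after-hours action."""
    hours = BUSINESS_HOURS.get(now.weekday())
    if hours is not None and hours[0] <= now.time() < hours[1]:
        return "answer"
    return AFTER_HOURS_ACTION


print(call_action(datetime(2024, 1, 8, 9, 30)))   # a Monday morning
print(call_action(datetime(2024, 1, 13, 10, 0)))  # a Saturday
```

Writing the rule out this way is a quick check that your hours cover the edge cases you care about: the minute you close, weekends, and early-morning calls.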

One thing most people skip: test with a bad connection. Call from your car. Call from a noisy room. The agent needs to handle real-world audio, not just quiet-office conditions. If it struggles, use the system prompt to tell the agent to keep its responses short and to confirm key details by repeating them back.
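
The prompt advice in this section (a short role statement, brief replies, read-back confirmation, and a transfer rule) can be assembled in one place. This Python sketch builds such a prompt from the examples used in this guide. The build_system_prompt helper and the exact wording of the two middle rules are assumptions, not phrasing that Sphinx Agent requires.

```python
def build_system_prompt(role: str, transfer_triggers: list[str], transfer_number: str) -> str:
    """Assemble a phone-agent system prompt as one instruction per line."""
    triggers = " or ".join(transfer_triggers)
    return "\n".join([
        role,
        # Short replies hold up better on noisy or choppy connections.
        "Keep every reply to one or two short sentences.",
        # Read-back confirmation catches misheard details.
        "Repeat names, dates, times, and phone numbers back to the caller and ask them to confirm.",
        f"If the caller {triggers}, transfer to {transfer_number}.",
    ])


prompt = build_system_prompt(
    role="You are the front desk assistant for Greenfield Dental.",
    transfer_triggers=["asks to speak to a manager", "describes a medical emergency"],
    transfer_number="512-555-0100",
)
print(prompt)
```

Keeping the prompt in a small function like this makes it easier to adjust one rule at a time after each round of test calls.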

What Voice Agents Handle Well

Voice agents are not general-purpose replacements for human staff. They are very good at a specific set of tasks, and you should scope them tightly to those tasks.

  • Appointment scheduling. "I need to book a cleaning for next Thursday." The agent checks availability, confirms the time, and sends a confirmation. This is the single highest-value use case for most businesses.
  • FAQ and basic information. Hours, directions, pricing, insurance accepted, return policy. Anything that lives on your website's FAQ page is fair game.
  • Order status. "Where is my order?" If you connect the agent to your order management system via API, it can pull tracking info and relay it in real time.
  • After-hours messaging. When the office is closed, the agent takes a message, captures the caller's name and number, and emails it to your team. Every missed call after hours is a potential lost customer. The agent catches those.
  • Call routing and triage. "I need to talk to billing." The agent asks a few qualifying questions and transfers to the right department. This alone can cut your receptionist's workload by 30–40%.
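
For the order status case, the API connection boils down to a lookup plus a sentence the agent can say out loud. The Python sketch below uses an in-memory dictionary as a stand-in for an order management system. The order data, field names, and both functions are hypothetical, and a real integration would call your own system's API instead.

```python
from typing import Callable, Optional

# Hypothetical stand-in for an order management system.
FAKE_ORDERS = {
    "A1001": {"status": "in transit", "carrier": "UPS", "eta": "Friday"},
}


def fetch_order(order_id: str) -> Optional[dict]:
    """Look up an order; a real version would make an authenticated API call."""
    return FAKE_ORDERS.get(order_id)


def order_status_reply(
    order_id: str,
    fetch: Callable[[str], Optional[dict]] = fetch_order,
) -> str:
    """Turn an order lookup into a short sentence suitable for speech."""
    order_id = order_id.strip().upper()  # callers rarely give IDs in a clean format
    order = fetch(order_id)
    if order is None:
        return "I could not find that order. I will transfer you to a team member."
    return (
        f"Order {order_id} is {order['status']}. "
        f"It is with {order['carrier']} and should arrive {order['eta']}."
    )


print(order_status_reply("a1001"))
```

Note the fallback: when the lookup fails, the reply hands the caller to a person rather than guessing.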

What Voice Agents Do Not Handle Well Yet

Knowing the limits is more important than knowing the capabilities. If you deploy a voice agent into a scenario it cannot handle, your customers will notice immediately, and they will not call back.

  • Complex multi-step troubleshooting. "My printer shows error code E-305, I already replaced the toner, but now the paper tray light is blinking too." This kind of branching diagnostic conversation requires back-and-forth context that current voice models lose track of after four or five exchanges. A human support rep is still better here.
  • Heavy accents in noisy environments. Speech-to-text has improved dramatically, but accuracy still drops when background noise is combined with an unfamiliar accent. A caller on a construction site with a thick regional accent is likely to be misunderstood. If your caller base skews this way, use voice as a first pass and make it easy to transfer to a human.
  • Emotional callers who need empathy. Someone calling to cancel a service because a family member died. A patient calling with bad test results. A customer who is genuinely angry and needs to feel heard. Voice agents can be polite, but they cannot read emotional subtext or provide genuine empathy. These calls need a human. Set up your transfer rules accordingly.

The honest answer is that voice agents are best deployed as a first line — handling the 60–70% of calls that are routine, and routing the rest to your team. That is where the ROI is. Trying to make them handle everything will hurt your customer experience.

Voice Agent Pricing Breakdown

Here is what it costs on Sphinx Agent. No hidden fees. No per-minute charges on top of the plan price.

Plan            | Price   | Voice           | Messages/mo | Phone Numbers
Free            | $0/mo   | Browser only    | 100         | 0
Starter         | $19/mo  | Browser only    | 5,000       | 0
Pro             | $49/mo  | Browser + Phone | 25,000      | 1
Business        | $99/mo  | Browser + Phone | 75,000      | 5
Enterprise      | $249/mo | Browser + Phone | 200,000     | 10
Enterprise Plus | $499/mo | Browser + Phone | Unlimited   | Unlimited

The free tier is real. No trial period. No credit card required. You get browser voice and 100 messages per month, indefinitely. That is enough to validate whether voice works for your use case before spending anything.

If you need phone numbers, Pro is the entry point at $49/mo. Most single-location businesses — dental offices, law firms, real estate agents — never need more than one number, so they stay on Pro.

Multi-location businesses jump to Business ($99/mo, 5 numbers) or Enterprise ($249/mo, 10 numbers). Each number can have its own greeting, hours, and routing rules.
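
If you want to sanity-check which plan fits, the pricing table reduces to a simple lookup. This Python sketch copies the plan figures from the table above and picks the cheapest plan that covers a given monthly message count and number of phone lines. The Plan type and the cheapest_plan function are illustrative, not part of any Sphinx Agent tooling.

```python
import math
from typing import NamedTuple


class Plan(NamedTuple):
    name: str
    price: int            # USD per month
    messages: float       # monthly message allowance
    phone_numbers: float  # included phone numbers


# Figures copied from the pricing table; "Unlimited" is modeled as infinity.
PLANS = [
    Plan("Free", 0, 100, 0),
    Plan("Starter", 19, 5_000, 0),
    Plan("Pro", 49, 25_000, 1),
    Plan("Business", 99, 75_000, 5),
    Plan("Enterprise", 249, 200_000, 10),
    Plan("Enterprise Plus", 499, math.inf, math.inf),
]


def cheapest_plan(messages_per_month: int, phone_numbers: int = 0) -> Plan:
    """Return the lowest-priced plan that covers both requirements."""
    eligible = [
        p for p in PLANS
        if p.messages >= messages_per_month and p.phone_numbers >= phone_numbers
    ]
    return min(eligible, key=lambda p: p.price)


print(cheapest_plan(80).name)                       # browser voice only
print(cheapest_plan(3_000, phone_numbers=1).name)   # one phone line
print(cheapest_plan(10_000, phone_numbers=3).name)  # three locations
```

Phone numbers often decide the plan before message volume does: a three-location business lands on Business even at a modest message count.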

Voice vs. Chat: When to Use Which

Voice and chat are not interchangeable. They serve different customer behaviors, and the right choice depends on your business type and who is calling.

Use Voice When:

  • Your customers already call you. Medical offices, law firms, home services, auto repair, insurance agencies. If your phone rings 20+ times a day, voice agents will have the biggest impact. These callers are not going to open a chat widget. They picked up the phone because that is how they do business.
  • The interaction is simple and transactional. Booking an appointment, confirming a reservation, checking if you are open on Saturday. Voice handles these faster than chat because talking is faster than typing.
  • Your audience skews older or less tech-savvy. A 65-year-old scheduling a doctor's appointment is going to call. A chat widget is invisible to them.

Use Chat When:

  • Your customers are already on your website. SaaS products, e-commerce stores, online services. If someone is browsing your pricing page at 11 PM, they want to type a question, not make a phone call.
  • The conversation involves links, code, or structured data. Chat can send URLs, embed images, display formatted tables. Voice cannot. If your support involves sharing documentation or walking someone through a dashboard, chat wins.
  • Your customers are comparison shopping. Someone evaluating five SaaS products is not going to call all five. They will chat with the ones that have a widget and skip the rest.

Use Both When:

You have a mix of customer types. A dental office might get phone calls from existing patients and website chats from new patients comparing providers. A law firm gets calls from referrals and chat inquiries from people who found them on Google. Sphinx Agent runs voice and chat from the same agent, same knowledge base, same training data. You are not maintaining two separate systems.

Get Started

Sign up at sphinxagent.ai. Create an agent. Enable voice. That is the whole process.

Browser voice is free and takes five minutes. Phone voice takes ten minutes and a Pro plan. Either way, you will have a working voice agent before lunch.

If you are not sure which setup fits your business, start with browser voice on the free tier. See how your visitors interact with it. If the conversations are good, upgrade to phone when you are ready.

Terrell K. Flautt

Founder of Sphinx Agent and SnapIT Software. Writes about AI agents, autonomous systems, and the business of artificial intelligence.

Deploy Your Voice Agent Today

Browser voice is free. Phone voice starts at $49/mo. Either way, you will have a working agent in under 10 minutes.
