5 min read · Salesforce · Einstein AI · Machine Learning

Salesforce Einstein AI in Practice: What Actually Works and What Doesn't

A practitioner's guide to deploying Salesforce Einstein features - scoring models, generative AI with Prompt Builder, Einstein Copilot, and the honest trade-offs you won't find in the marketing material.

Neural network visualization representing AI

Salesforce Einstein has been rebranded and repackaged more times than I can count. Behind the marketing, there’s a genuinely useful set of AI capabilities - and a graveyard of features that shipped, saw limited adoption, and quietly disappeared. Here’s what’s worth investing in right now.

The Einstein Landscape (2025)

The capabilities that matter in practice:

Predictive AI (ML models):

  • Einstein Lead Scoring / Opportunity Scoring - works well at scale
  • Einstein Case Classification - reliable for high-volume service orgs
  • Einstein Discovery - for custom ML models on your own data

Generative AI (LLM-powered):

  • Einstein Copilot - conversational AI assistant embedded in Salesforce
  • Prompt Builder - custom prompts integrated into Flows, Apex, page layouts
  • Einstein for Service (case summaries, knowledge drafts)
  • Einstein for Sales (email generation, call summaries)

Einstein Scoring: What Actually Works

Lead and Opportunity Scoring are the most mature Einstein features, and they genuinely move the needle when set up correctly. The key prerequisites most implementations skip:

Data volume requirements: Einstein scoring models need at least 1,000 converted leads (for Lead Scoring) or 200 closed-won/lost opportunities in the last two years (for Opportunity Scoring). Below these thresholds, the model can’t generalise - you’ll see nearly all scores cluster in the 50–70 range.
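
As a rough pre-flight check before enabling scoring, compare your org's record counts against these thresholds. A minimal sketch - the 1,000 / 200 figures come from the prerequisites above; in practice the counts would come from SOQL queries against your org, here they are plain inputs:

```python
# Hypothetical pre-flight check for Einstein scoring eligibility.
# Record counts would normally come from queries against your org;
# here they are passed in directly.

def scoring_eligibility(converted_leads: int, closed_opps_2y: int) -> dict:
    """Compare record counts against the documented Einstein minimums."""
    return {
        "lead_scoring": converted_leads >= 1000,       # converted leads needed
        "opportunity_scoring": closed_opps_2y >= 200,  # closed won/lost in 2 years
    }

print(scoring_eligibility(converted_leads=1450, closed_opps_2y=120))
# Only 120 closed opportunities: Opportunity Scoring would under-train
# and scores would cluster in that mid-range band.
```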

Signal quality over signal quantity: Einstein uses fields you select as training signals. More fields is not better. The fields that consistently predict outcomes:

  • Lead source (but normalise it - “web” and “Web” are distinct values to the model)
  • Industry and company size (if you’re B2B)
  • Response time to first contact
  • Engagement signals (email opens, web visits) if you have Marketing Cloud connected
  • NOT: name, email address, phone number - these leak identity, not intent
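
Normalising a field like Lead Source is usually a one-off mapping applied before training. A sketch of what that looks like - the canonical map here is an assumption; your org's picklist values will differ:

```python
# Sketch of a lead-source normaliser. CANONICAL_SOURCES is illustrative,
# not a real org's picklist.

CANONICAL_SOURCES = {
    "web": "Web",
    "web form": "Web",
    "website": "Web",
    "trade show": "Trade Show",
    "tradeshow": "Trade Show",
    "referral": "Referral",
}

def normalise_lead_source(raw: str) -> str:
    # Collapse case and internal whitespace before the lookup,
    # so "Web", " web " and "WEB" all map to one training value.
    key = " ".join(raw.strip().lower().split())
    return CANONICAL_SOURCES.get(key, "Other")

assert normalise_lead_source(" web ") == "Web"
assert normalise_lead_source("Tradeshow") == "Trade Show"
```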

The recency trap: Einstein models train on historical data. If your market changed significantly (new product, new ideal customer profile), your historical conversion patterns may actively mislead the model. Retrain on data from the relevant period only.
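
Scoping the training window is a simple filter on the extract you feed the model. A sketch, with an assumed cutoff date standing in for whenever your market actually shifted:

```python
from datetime import date

# Sketch: keep only training records from after a market shift.
# PRODUCT_LAUNCH is an assumed, illustrative cutoff.

PRODUCT_LAUNCH = date(2024, 3, 1)

records = [
    {"id": "L1", "converted_on": date(2023, 6, 10)},  # pre-shift: misleading
    {"id": "L2", "converted_on": date(2024, 5, 2)},
    {"id": "L3", "converted_on": date(2024, 9, 18)},
]

training_set = [r for r in records if r["converted_on"] >= PRODUCT_LAUNCH]
print([r["id"] for r in training_set])  # ['L2', 'L3']
```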

Prompt Builder: The Generative AI Entry Point

Prompt Builder lets you create prompt templates that call an LLM (GPT-4/Claude via Salesforce’s bring-your-own-LLM, or Einstein’s default models) with Salesforce data merged in. The templates are portable - you can invoke them from Flows, Apex, or page layouts.

A practical example - auto-generate a case summary when a case is closed:

You are a Salesforce Service agent summarising a resolved customer case.

Case Details:
Subject: {!$Record.Subject}
Description: {!$Record.Description}
Resolution: {!$Record.Resolution__c}
Time to Close: {!$Record.Time_to_Close__c} hours
Customer: {!$Record.Contact.Name} ({!$Record.Account.Name})

Write a 2–3 sentence summary of this case suitable for the account's success manager.
Focus on the issue and resolution, not the mechanics of how the case was worked.
Do not include personal identifiers beyond first name.

Key Prompt Builder patterns:

  • Always include explicit output format instructions
  • Constrain length (“2–3 sentences”) - unconstrained LLM output is inconsistent in length
  • Include guardrails for PII in the prompt itself, in addition to Salesforce’s data masking
  • Test prompts with edge cases: empty fields, very long descriptions, non-English input
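
The edge-case testing in the last bullet can be done offline before the template ever touches an LLM. A sketch of a tiny lint pass over merge values - Python's `string.Template` syntax stands in for Salesforce's `{!$Record.Field}` merge fields, and the 30,000-character cap is an assumed, tunable limit:

```python
import string

# Offline edge-case harness for a prompt template (illustrative;
# $field placeholders stand in for {!$Record.Field} merge fields).

TEMPLATE = string.Template(
    "Subject: $subject\nDescription: $description\nResolution: $resolution"
)

def lint_merge_values(values: dict, max_len: int = 30_000) -> dict:
    """Flag merge fields that commonly break LLM prompts."""
    return {
        "empty": [k for k, v in values.items() if not v.strip()],
        "too_long": [k for k, v in values.items() if len(v) > max_len],
    }

edge_case = {"subject": "Login failure", "description": "", "resolution": "Reset MFA"}
print(lint_merge_values(edge_case))    # empty description: the model may invent one
print(TEMPLATE.substitute(edge_case))  # render anyway to eyeball the final prompt
```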

Einstein Copilot: The Honest Assessment

Einstein Copilot (the conversational AI assistant embedded in Salesforce) is genuinely useful for specific tasks and overhyped for others.

Where it works well:

  • “Summarise this account’s recent activity” - accessing related records and synthesising them is a real time saver
  • “Draft a follow-up email for this opportunity” - reduces blank-page friction for reps
  • “What cases has this customer had in the last 90 days?” - natural language queries against Salesforce data

Where it falls short:

  • Complex multi-step actions (“create an opportunity, add these products, and send a quote”) - the action chaining is still unreliable
  • Anything requiring business-specific context that isn’t in the CRM - Copilot only knows what’s in Salesforce
  • High-volume use cases - the current rate limits aren’t designed for every rep prompting it hundreds of times a day

Copilot adoption reality: most Salesforce users don’t actively type questions into a chat interface. They want AI assistance in the flow of their work - smart suggestions, auto-filled fields, anomaly alerts. Copilot is valuable as a power-user tool; don’t position it as something all 500 users will interact with daily.

Einstein for Service: Case Classification and Routing

For high-volume service operations (200+ cases/day), Einstein Case Classification delivers measurable ROI:

  • Automatically suggests values for custom fields (Category, Product, Priority) based on case description
  • Routes cases to the correct queue without a complex routing rule tree
  • Flags escalation risk based on language patterns

Setup requirements:

  1. At least 400 historical cases with the fields you want to classify already populated
  2. Consistent field values (clean picklists) - free-text classifications kill accuracy
  3. A feedback loop: agents should be able to correct Einstein’s suggestions, and those corrections feed the model

The accuracy ceiling: Einstein Case Classification typically reaches 75–85% accuracy on well-structured service orgs. That’s genuinely useful - but it means 15–25% of classifications still need human review. Build your UX to make corrections easy, not invisible.
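
One way to build that review step into the UX is to gate suggestions on per-prediction confidence rather than applying everything. A sketch - the threshold value is an assumption to tune against your own accuracy data:

```python
# Sketch: route classification suggestions by confidence. The 75-85%
# accuracy band above implies 15-25% of suggestions need human review;
# a confidence threshold makes that review explicit. REVIEW_THRESHOLD
# is illustrative, not a recommended value.

REVIEW_THRESHOLD = 0.80

predictions = [
    {"case": "00001", "category": "Billing",  "confidence": 0.93},
    {"case": "00002", "category": "Shipping", "confidence": 0.61},
    {"case": "00003", "category": "Returns",  "confidence": 0.82},
]

auto_applied = [p for p in predictions if p["confidence"] >= REVIEW_THRESHOLD]
needs_review = [p for p in predictions if p["confidence"] < REVIEW_THRESHOLD]

print(len(auto_applied), "auto-applied;", len(needs_review), "flagged for agent review")
# -> 2 auto-applied; 1 flagged for agent review
```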

Data Cloud + Einstein: The Combination Worth Investing In

The most underutilised Einstein capability is its integration with Data Cloud. When unified customer profiles (purchase history, service history, behavioural signals) feed into Einstein models, scoring accuracy improves significantly.

Practical examples:

  • Churn prediction: train an Einstein Discovery model on unified profiles from Data Cloud, including order frequency, service ticket volume, and NPS scores from Marketing Cloud. Scores update daily, surface in the Account record
  • Next best action: combine Einstein recommendation models with Data Cloud segments - the recommendation engine knows who the customer is across all touchpoints, not just their CRM history
  • Product affinity: e-commerce browse + purchase data from Commerce Cloud → Data Cloud → Einstein recommendation → personalised search results
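
To make the churn example concrete, here is the kind of unified feature row such a model would train on. Field names are illustrative, not Data Cloud schema:

```python
# Sketch of a feature row assembled from a unified customer profile
# (order history + service history + NPS). All names are hypothetical.

def churn_features(profile: dict) -> dict:
    orders = profile["orders_last_12m"]
    tickets = profile["service_tickets_last_12m"]
    return {
        "order_frequency": orders / 12,                   # orders per month
        "ticket_to_order_ratio": tickets / max(orders, 1),
        "nps": profile["latest_nps"],
        "days_since_last_order": profile["days_since_last_order"],
    }

row = churn_features({
    "orders_last_12m": 6,
    "service_tickets_last_12m": 9,
    "latest_nps": 4,
    "days_since_last_order": 75,
})
print(row["ticket_to_order_ratio"])  # 1.5 - more tickets than orders: a churn signal
```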

The Data Cloud → Einstein pipeline requires thoughtful data preparation, but the combination is qualitatively better than Einstein running on CRM data alone.

What to Avoid

Einstein Vision/Language (custom ML): the development overhead is high and the maintenance burden is significant. Most organisations are better served by calling an external ML API (Claude, GPT-4, a custom model hosted on AWS SageMaker) via Named Credentials than building inside Salesforce’s ML platform.

Einstein Prediction Builder: it was useful before Einstein Discovery existed. Use Discovery for custom predictions now - better model quality, more flexible data inputs.

Generative AI features in early GA: Salesforce ships generative AI features fast. The first release of any given feature tends to be rough. I wait for the second or third release before recommending production adoption. Review the Known Issues list before committing.
