Matter Matching & Pricing
Part 2 of 2: How we find relevant precedents and generate data-backed RFP responses
From Enriched Data to Actionable Intelligence
Once historical matters are transformed into structured profiles (see Part 1), Narrative uses that data to power intelligent matter matching and automated pricing generation.
Step 1: RFP Ingestion & Requirement Extraction
When a new RFP arrives, Narrative doesn't just keyword-search: it understands the request.
What We Extract
Matter type (e.g., M&A, patent litigation, arbitration): determines which historical matters are relevant
Deal/case characteristics (deal size, claim value, number of parties): affect complexity and resource needs
Jurisdiction (England & Wales, Delaware, multi-jurisdictional): impacts timeline, staffing, and costs
Industry/sector (technology, healthcare, financial services): demonstrates relevant experience to the client
Timeline ("complete by Q2", "expedited"): affects staffing intensity
Special requirements (regulatory approvals, cross-border elements): drive complexity
Clarifying Questions
The system proactively identifies ambiguities and asks clarifying questions:
"The RFP mentions 'regulatory approval' β is this FDI screening, merger control, or sector-specific regulation?"
"Deal value isn't specified β what range should we assume for benchmarking purposes?"
This ensures the matter matching is precise, not just approximate.
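To make this concrete, here is a minimal sketch of how the extracted requirements might be carried through the pipeline as a structured object. The field names and the ExtractedRequirements type are illustrative assumptions, not Narrative's actual schema.
from dataclasses import dataclass, field
from typing import Optional
@dataclass
class ExtractedRequirements:
    """Hypothetical container for the fields extracted from an RFP."""
    matter_type: str                          # e.g. "M&A", "patent litigation"
    jurisdictions: list[str]                  # e.g. ["England & Wales", "Germany"]
    industry: Optional[str] = None            # e.g. "technology"
    deal_value_gbp: Optional[float] = None    # stays None until clarified
    timeline_months: Optional[int] = None
    special_requirements: list[str] = field(default_factory=list)
def clarifying_questions(req: ExtractedRequirements) -> list[str]:
    """Collect clarifying questions for missing or ambiguous fields."""
    questions = []
    if req.deal_value_gbp is None:
        questions.append("Deal value isn't specified: what range should we "
                         "assume for benchmarking purposes?")
    if "regulatory approval" in (r.lower() for r in req.special_requirements):
        questions.append("Is this FDI screening, merger control, or "
                         "sector-specific regulation?")
    return questions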
Step 2: Semantic Matter Matching
Traditional search finds matters by keywords. Narrative finds matters by meaning and relevance.
Embeddings-Based Search
Every enriched matter profile is converted into a high-dimensional vector (embedding) that captures its semantic meaning: not just keywords, but the underlying characteristics of the work.
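As a rough sketch of the retrieval step, the profile and RFP texts are embedded and compared with cosine similarity. The embed() function below is a toy stand-in; the embedding model and vector store Narrative actually uses aren't specified here.
import numpy as np
def embed(text: str) -> np.ndarray:
    """Toy stand-in: hash word tokens into a fixed-size vector.
    In practice this would call whichever embedding model is in use (assumption)."""
    vec = np.zeros(256)
    for token in text.lower().split():
        vec[hash(token) % 256] += 1.0
    return vec
def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
def rank_matters(rfp_text: str, matter_profiles: dict[str, str], top_k: int = 10):
    """Return the top_k matters whose enriched profiles are semantically
    closest to the RFP, as (matter_id, similarity) pairs."""
    rfp_vec = embed(rfp_text)
    scored = [
        (matter_id, cosine_similarity(rfp_vec, embed(profile_text)))
        for matter_id, profile_text in matter_profiles.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]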
What Makes a Match "Relevant"
The matching algorithm weighs multiple dimensions (see the sketch after this list):
Matter type (high weight): M&A matters should match M&A, not litigation
Complexity indicators (high weight): deal size, party count, jurisdictions
Practice area (high weight): corporate, disputes, regulatory, etc.
Industry/sector (medium weight): sector expertise often matters to clients
Jurisdiction (medium weight): local law significantly affects effort
Timeline/duration (medium weight): fast-tracked vs. standard pacing
Outcome (low weight): success patterns are informative but less predictive
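One simple way to combine these dimensions is a weighted sum of per-dimension agreement scores. The numeric weights below are illustrative assumptions that mirror the high/medium/low labels above, not the tuned weighting used in production.
# Illustrative weights only; the real weighting is tuned internally.
DIMENSION_WEIGHTS = {
    "matter_type": 3.0,      # high
    "complexity": 3.0,       # high
    "practice_area": 3.0,    # high
    "industry": 2.0,         # medium
    "jurisdiction": 2.0,     # medium
    "timeline": 2.0,         # medium
    "outcome": 1.0,          # low
}
def relevance_score(dimension_scores: dict[str, float]) -> float:
    """Weighted average of per-dimension agreement scores in [0, 1]."""
    total_weight = sum(DIMENSION_WEIGHTS.values())
    weighted = sum(
        DIMENSION_WEIGHTS[dim] * dimension_scores.get(dim, 0.0)
        for dim in DIMENSION_WEIGHTS
    )
    return weighted / total_weight
# Example: strong match on type, practice area and complexity,
# partial match on industry, weaker on jurisdiction and timeline.
print(relevance_score({
    "matter_type": 1.0, "complexity": 0.9, "practice_area": 1.0,
    "industry": 0.5, "jurisdiction": 0.2, "timeline": 0.4, "outcome": 0.5,
}))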
Beyond Simple Similarity
The system also identifies:
Near misses: Matters that are 80% similar but differ in one key dimension (useful for understanding variance)
Complexity outliers: Matters with unusual characteristics that drove costs up or down
Recency weighting: Recent matters may be more relevant for current market conditions
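Recency weighting, for example, can be expressed as a simple exponential decay applied to the match score; the three-year half-life below is an illustrative assumption, not a documented parameter.
import math
def recency_weight(matter_age_years: float, half_life_years: float = 3.0) -> float:
    """Exponential decay: a matter loses half its weight every half_life_years."""
    return 0.5 ** (matter_age_years / half_life_years)
def adjusted_score(similarity: float, matter_age_years: float) -> float:
    """Blend raw similarity with recency so older matters rank slightly lower."""
    return similarity * (0.8 + 0.2 * recency_weight(matter_age_years))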
Step 3: Relevance Ranking & Curation
Automatic Baseline Selection
Narrative automatically surfaces and selects the most relevant matters, whether that's 3 or 15, based on the clarified criteria.
The baseline set is presented with:
Match score (e.g., 92% relevance)
Key similarities (why this matter matched)
Key differences (how it differs from the RFP requirements)
Human-in-the-Loop Curation
Users can refine the selection:
Add matters the system didn't prioritize (e.g., "Include Project X: the client specifically mentioned it")
Remove matters that aren't relevant (e.g., "Exclude this one: it had unusual circumstances")
Sort by relevance, fees, recency, or outcome
Filter by specific attributes (e.g., "Only matters with successful outcomes")
This ensures the final comparison set reflects both data-driven relevance and human judgment.
Step 4: Pricing Generation
Once the precedent set is finalized, Narrative generates a comprehensive pricing proposal.
What Gets Generated
Total fee estimate: recommended fee based on precedent averages and adjustments
Phase-by-phase breakdown: hours and fees per phase (e.g., Due Diligence: 30%, Negotiation: 45%, Closing: 25%)
Staffing plan: recommended role mix (partner %, associate %, paralegal %)
Timeline estimate: expected duration based on comparable matters
Confidence range: low/mid/high estimates based on precedent variance
The Pricing Rationale
Critically, every recommendation comes with justification:
"Recommended fee: Β£1.2M"
Based on 5 comparable matters with average fees of Β£1.15M. Adjusted +5% for:
Multi-jurisdictional complexity (UK + Germany)
Expedited timeline (4 months vs. typical 6)
Key precedent: Project Atlas (Β£1.4M) β similar deal size and jurisdictions, but included post-merger integration work not scoped here.
This data-backed rationale:
Gives pricing teams confidence in the recommendation
Provides defensible justification for client discussions
Highlights risk factors that might drive costs up or down
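The arithmetic behind a rationale like the one above can be sketched as follows. The precedent fees and adjustment factors are hypothetical inputs chosen to echo the £1.15M average and +5% uplift in the example, not real client data.
from statistics import mean, pstdev
def price_from_precedents(precedent_fees: list[float],
                          adjustments: dict[str, float]) -> dict[str, float]:
    """Baseline = average of comparable matters; adjustments are fractional
    uplifts/discounts (e.g. +0.03 for multi-jurisdictional complexity)."""
    baseline = mean(precedent_fees)
    mid = baseline * (1 + sum(adjustments.values()))
    spread = pstdev(precedent_fees)   # variance across the precedent set
    return {"low": mid - spread, "mid": mid, "high": mid + spread}
# Hypothetical inputs echoing the example: five comparables averaging £1.15M,
# adjusted +5% overall for multi-jurisdictional complexity and timeline.
estimate = price_from_precedents(
    precedent_fees=[0.95e6, 1.05e6, 1.15e6, 1.20e6, 1.40e6],
    adjustments={"multi_jurisdictional": 0.03, "expedited_timeline": 0.02},
)
print({k: f"£{v/1e6:.2f}M" for k, v in estimate.items()})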
Step 5: Scenario Modeling
Before finalizing, users can test alternative approaches:
What-If Analysis
More junior leverage (increase associate ratio from 60% to 75%): -12% on fees
Fixed fee (cap at £1.1M): margin risk if complexity exceeds baseline
Expedited timeline (compress from 6 months to 4): +15% on fees (overtime, parallel workstreams)
AI tool usage (apply Harvey to due diligence): -20% on DD hours, +8% margin
Real-Time Impact Visualization
Adjustments immediately show:
Fee impact (total and by phase)
Margin impact (expected profitability)
Risk indicators (how the scenario compares to precedent range)
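Under the hood, a what-if adjustment amounts to applying multipliers to the baseline and recomputing fee and margin. The multipliers below simply restate the illustrative percentages from the table above; they are assumptions, not calibrated figures.
def apply_scenario(baseline_fee: float, baseline_margin: float,
                   fee_multiplier: float, margin_delta: float) -> dict[str, float]:
    """Return the scenario's fee, margin, and deltas versus baseline."""
    fee = baseline_fee * fee_multiplier
    return {
        "fee": fee,
        "fee_delta_pct": (fee_multiplier - 1) * 100,
        "margin": baseline_margin + margin_delta,
        "margin_delta_pts": margin_delta * 100,
    }
baseline_fee, baseline_margin = 1.21e6, 0.35   # hypothetical baseline
scenarios = {
    "More junior leverage": apply_scenario(baseline_fee, baseline_margin, 0.88, 0.0),
    "Expedited timeline":   apply_scenario(baseline_fee, baseline_margin, 1.15, 0.0),
    "AI tool usage":        apply_scenario(baseline_fee, baseline_margin, 0.95, 0.08),
}
for name, result in scenarios.items():
    print(name, f"£{result['fee']/1e6:.2f}M", f"{result['fee_delta_pct']:+.0f}% fees")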
The Complete Workflow
Benchmarking & Evaluation
We rigorously evaluate our system to ensure consistent, accurate results across two critical dimensions: matter selection and resource allocation.
Evaluation Framework
Every RFP scenario is tested across 100+ runs to measure both accuracy and consistency. We evaluate:
Matter Selection Consistency
For each test RFP, we run the matter selection 100 times and measure how consistently the system selects the same precedent matters.
What We Measure
Selection rate: how often does a matter appear across 100 runs? Target: ≥95% for core matches
Rank stability: does the matter appear at consistent positions? Target: low variance
Similarity score: semantic similarity between RFP and matter. Target: consistent across runs
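These metrics fall out directly from the ranked selections of repeated runs. A minimal sketch, assuming each run is recorded as an ordered list of selected matter IDs (the actual benchmark output format may differ):
from collections import defaultdict
from statistics import mean, pstdev
def selection_consistency(runs: list[list[str]]) -> dict[str, dict[str, float]]:
    """runs: one ranked list of selected matter IDs per benchmark run.
    Returns per-matter selection rate, average rank, and rank spread."""
    ranks = defaultdict(list)
    for run in runs:
        for position, matter_id in enumerate(run, start=1):
            ranks[matter_id].append(position)
    return {
        matter_id: {
            "selection_rate": len(r) / len(runs),
            "avg_rank": mean(r),
            "rank_std": pstdev(r),
        }
        for matter_id, r in ranks.items()
    }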
Example Benchmark Output
For a cross-border tech M&A RFP, here's what a benchmark run looks like:
📊 SELECTED MATTERS CONSISTENCY (100 runs):
────────────────────────────────────────────────────────────────
1. HarborPoint–NorthBeacon Strategic Alliance (IP Cross-License)
   ✅ Appeared in: 97/100 runs (97.0%)
   Average rank: 1.4
   ██████████████████████████████████████
   Rank distribution:
   Rank 1: 74.2% ██████████████
   Rank 2: 16.5% ███
   Rank 3: 7.2% █
   Rank 4: 2.1%
2. Orion Analytics IP Divestiture (Cross-Border Sale)
   ✅ Appeared in: 98/100 runs (98.0%)
   Average rank: 2.6
   ██████████████████████████████████████
   Rank distribution:
   Rank 1: 17.3% ███
   Rank 2: 39.8% ███████
   Rank 3: 11.2% ██
   Rank 4: 24.5% ████
   Rank 5: 7.1% █
3. Helios Data Systems SaaS IP Spin-Out (UK/US)
   ⚠️ Appeared in: 93/100 runs (93.0%)
   Average rank: 3.8
   ████████████████████████████████████
   Rank distribution:
   Rank 2: 5.4% █
   Rank 3: 39.8% ███████
   Rank 4: 22.6% ████
   Rank 5: 30.1% ██████
Acceptance Criteria
Core matters (top 3-4): threshold ≥95% selection rate, target 99%
Extended set (top 5-6): threshold ≥80% selection rate, target ≥90%
Rank variance: threshold low variance, target minimal variance
Jaccard similarity: threshold ≥85% pairwise similarity, target ≥95%
Our target: 99% selection consistency. We're actively working to minimize variance and ensure the system reliably surfaces the same high-quality precedent matters on every run.
Queried Matters Analysis
Beyond selected matters, we track the full set of matters queried (typically 40 from the database) to ensure the retrieval layer is stable:
📊 QUERIED MATTERS CONSISTENCY:
────────────────────────────────────────────────────────────────
Average queried matters per run: 40.0
Top queried matters by appearance:
1. Helios Data Systems (5d91...) → 100/100 runs, avg rank: 1.92, avg similarity: 0.656
2. Orion Analytics (8e42...) → 100/100 runs, avg rank: 3.95, avg similarity: 0.639
3. HarborPoint–NorthBeacon (69c6...) → 100/100 runs, avg rank: 4.64, avg similarity: 0.642
4. SentinelWave Acquisition (67e0...) → 100/100 runs, avg rank: 2.79, avg similarity: 0.648
✓ Queried matters with >95% consistency: 33/50
✓ Average pairwise Jaccard similarity: 0.881
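The pairwise Jaccard figure is the average intersection-over-union of the queried sets across all pairs of runs; a minimal sketch, assuming each run is recorded as a set of matter IDs:
from itertools import combinations
from statistics import mean
def jaccard(a: set[str], b: set[str]) -> float:
    """Intersection over union of two sets of matter IDs."""
    return len(a & b) / len(a | b) if a | b else 1.0
def avg_pairwise_jaccard(runs: list[set[str]]) -> float:
    """Average Jaccard similarity over every pair of runs (e.g. the 0.881 above)."""
    return mean(jaccard(a, b) for a, b in combinations(runs, 2))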
Similarity Distribution
We track the semantic similarity scores to ensure the matching is based on meaningful relevance:
📊 SIMILARITY DISTRIBUTION:
────────────────────────────────────────────────────────────────
Count: 4000 (40 matters × 100 runs)
Mean similarity: 0.585
Std deviation: 0.048
Percentiles:
p10: 0.520 | p25: 0.551 | p50: 0.587 | p75: 0.620 | p90: 0.646
Distribution:
[0.4, 0.5) █ 158
[0.5, 0.6) ████████████████████ 2256
[0.6, 0.7) ██████████████ 1578
[0.7, 0.8) █ 8
Phase & Keyword Consistency
We also verify that the system consistently identifies the correct practice area, phase taxonomy, and key terms:
🎯 PHASE GROUP CONSISTENCY:
────────────────────────────────────────────────────────────────
✅ "M&A" → 100/100 runs (100.0%)
🔤 KEYWORD CONSISTENCY (top terms):
────────────────────────────────────────────────────────────────
✅ "minority" → 100/100 runs (100.0%)
✅ "licensing" → 100/100 runs (100.0%)
✅ "AI" → 99/100 runs (99.0%)
✅ "acquisition" → 98/100 runs (98.0%)
⚠️ "cloud" → 94/100 runs (94.0%)
⚠️ "antitrust" → 59/100 runs (59.0%)
Resource Allocation Accuracy
Once matters are selected, we evaluate whether the system accurately predicts hours and fees by phase and role.
Budget variance (final budget prediction): currently 5%, target <3%
Total hours accuracy (predicted vs. actual total hours): currently ±10%, target ±5%
Total fee accuracy (predicted vs. actual total fees): currently ±10%, target ±5%
Phase-level accuracy (predicted vs. actual hours per phase): currently ±15%, target ±10%
Role-level accuracy (predicted vs. actual hours per role): currently ±15%, target ±10%
Current budget variance: 5%. We're actively working toward <3% variance to give pricing teams even greater confidence in generated estimates.
How we test:
Holdout validation: Train on historical matters, predict on held-out closed matters
Thousands of prediction runs: Ensure consistency and low variance
Stratified testing: Evaluate separately by matter type, complexity band, and jurisdiction
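A sketch of the holdout accuracy check: the ±10% band comes from the table above, while the prediction/actual pairs and helper names are assumptions for illustration.
def within_band(predicted: float, actual: float, band: float = 0.10) -> bool:
    """True if the prediction lands within ±band of the actual value."""
    return abs(predicted - actual) <= band * actual
def accuracy_rate(pairs: list[tuple[float, float]], band: float = 0.10) -> float:
    """Share of held-out matters whose predicted totals fall within the band."""
    hits = sum(within_band(pred, act, band) for pred, act in pairs)
    return hits / len(pairs)
# Example: predicted vs. actual total fees for held-out closed matters (made up).
holdout = [(1.20e6, 1.15e6), (0.80e6, 0.95e6), (2.10e6, 2.05e6)]
print(f"{accuracy_rate(holdout):.0%} within ±10%")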
Overall Assessment
Each benchmark run produces a pass/fail assessment:
================================================================================
🎯 OVERALL ASSESSMENT:
────────────────────────────────────────────────────────────────────────────────
✅ PASS: System consistency meets threshold
✓ Core matters (3-4) with >95% selection rate
✓ Queried matters with >95% consistency: 33
✓ Pairwise Jaccard similarity: 0.881 (target: ≥0.85)
✓ Phase group consistency: 100%
Average run duration: 77.3s
Total test duration: 11636.3s (100 runs)
================================================================================
Continuous Monitoring
Beyond initial benchmarking, we continuously monitor production accuracy:
Selection consistency (checked per release): alert if <95% for core matters
Prediction vs. actual (checked weekly): alert if accuracy drops below 90%
User overrides (monitored in real time): flag patterns where users adjust recommendations
Feedback submissions (monitored in real time): route to the model improvement pipeline
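Operationally, the alerting reduces to comparing each monitored metric against its threshold; a minimal sketch using the thresholds from the table above (the alerting and routing mechanics themselves are assumptions):
THRESHOLDS = {
    "core_selection_rate": 0.95,   # alert if below, checked per release
    "prediction_accuracy": 0.90,   # alert if below, checked weekly
}
def check_metrics(metrics: dict[str, float]) -> list[str]:
    """Return alert messages for any metric that falls below its threshold."""
    return [
        f"ALERT: {name} = {value:.2%} (threshold {THRESHOLDS[name]:.0%})"
        for name, value in metrics.items()
        if name in THRESHOLDS and value < THRESHOLDS[name]
    ]
print(check_metrics({"core_selection_rate": 0.93, "prediction_accuracy": 0.97}))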
Feedback Loop
When predictions diverge from actuals:
Root cause analysis: Why did this matter cost more/less than predicted?
Pattern identification: Is this a systematic issue (e.g., a certain phase always overruns)?
Model refinement: Adjust weights and factors based on new data
Re-benchmark: Validate improvements against full test suite before deployment
Every matter completed adds to the precedent database, making future predictions more accurate.
Summary: From RFP to Proposal in Minutes
Hours searching for similar matters → seconds to surface the best matches
Guesswork on pricing → data-backed recommendations
Generic phase breakdowns → precedent-based phase estimates
No rationale for fees → justified pricing with citations
Static quotes → interactive scenario modeling
Days to prepare a response → minutes to generate a draft