The most expensive problem in home services is not bad work — it is mismatched expectations. A professional arrives expecting a 45-minute tap replacement. The actual job is a corroded pipe joint behind a tiled wall that requires cutting access, and what should have been a simple job is now a three-hour project that neither the homeowner nor the professional budgeted for.
OSCAR's fixed-price model depends on getting complexity estimation right before the professional arrives. If we quote €65 for a tap replacement and the actual job is worth €180, one of two things happens: either the professional does a poor job to stay within the time implied by the price, or the homeowner is asked to pay more on the day. Neither outcome is acceptable. So complexity estimation is not an optional feature — it is the foundation the pricing model sits on.
This post describes how we currently estimate job complexity at booking time, what signals we use, and where the model still falls short.
The Input Signal Stack
When a homeowner submits a booking, they provide a combination of structured and unstructured inputs. The structured inputs are the service category (plumbing, electrical, tile, etc.), the specific job type (tap replacement, circuit breaker fault, cracked tile repair), and their answers to a short set of clarifying questions we present based on the job type. The unstructured inputs are any photos they upload and a free-text description of the problem.
Complexity estimation draws on both, plus a third source: the building profile we infer from the service address.
Building age inference is one of the more useful signals we have. Porto's residential building stock divides roughly into three eras: pre-1940 (the older Cedofeita, Bonfim, and Miragaia apartments, typically with galvanised or lead pipework and old-style electrical installations), 1940–1980 (the bulk of the city's housing stock, including much of Paranhos, Ramalde, and Campanhã, typically with copper plumbing but aging fuse boards), and post-1980 (more standard modern installations). Each era has a statistically different distribution of job durations. A tap replacement in a pre-1940 apartment takes, on average, 40% longer than the same job in a post-1980 apartment, based on historical job time data from our network. We derive an approximate building age from the Arquivo Municipal do Porto's public caderneta predial records, cross-referenced with postal code.
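As a rough illustration of how an era lookup could feed a duration estimate, here is a minimal Python sketch. The function names and the multiplier table are hypothetical. Only the 40% figure for pre-1940 buildings comes from the data quoted above; the 1940–1980 value is a placeholder, and treating 1980 itself as part of the middle era is an assumption.

```python
# Illustrative sketch only: names and the 1940-1980 multiplier are assumptions.

def building_era(year_built: int) -> str:
    """Map an estimated construction year to one of the three eras."""
    if year_built < 1940:
        return "pre-1940"
    if year_built <= 1980:  # assumption: 1980 itself counts as the middle era
        return "1940-1980"
    return "post-1980"


# Multipliers on the baseline (modern-building) duration.
# 1.40 reflects the "40% longer" figure; 1.15 is a made-up placeholder.
ERA_DURATION_MULTIPLIER = {
    "pre-1940": 1.40,
    "1940-1980": 1.15,
    "post-1980": 1.00,
}


def expected_minutes(baseline_minutes: float, year_built: int) -> float:
    """Scale a baseline duration by the era multiplier for the building."""
    return baseline_minutes * ERA_DURATION_MULTIPLIER[building_era(year_built)]
```

In practice a single average multiplier is a simplification: the text describes whole duration distributions per era, so a real estimate would more likely carry a spread than a point value.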
Photo analysis runs each uploaded image through a classification model that identifies the fixture type (tap, basin, pipe configuration), visible condition indicators (scale buildup, corrosion, visible damage), and access constraints (tight under-sink space, pipes running through walls). The model does not make a pricing decision — it generates a set of condition flags that adjust the baseline complexity estimate upward or downward.
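To make the "flags adjust the baseline" idea concrete, here is a small Python sketch. It assumes a complexity score on a 0 to 1 scale and additive adjustments, neither of which is specified above. The flag names mirror the indicators listed in the paragraph, except `CLEAN_MODERN_FIXTURE`, which is invented here to show a downward adjustment. All numeric weights are placeholders.

```python
from enum import Enum
from typing import Iterable


class ConditionFlag(Enum):
    SCALE_BUILDUP = "scale_buildup"
    CORROSION = "corrosion"
    VISIBLE_DAMAGE = "visible_damage"
    TIGHT_ACCESS = "tight_access"
    PIPES_IN_WALL = "pipes_in_wall"
    CLEAN_MODERN_FIXTURE = "clean_modern_fixture"  # hypothetical "easier" flag


# Hypothetical additive adjustments to a 0..1 complexity score.
FLAG_ADJUSTMENT = {
    ConditionFlag.SCALE_BUILDUP: 0.05,
    ConditionFlag.CORROSION: 0.15,
    ConditionFlag.VISIBLE_DAMAGE: 0.15,
    ConditionFlag.TIGHT_ACCESS: 0.10,
    ConditionFlag.PIPES_IN_WALL: 0.20,
    ConditionFlag.CLEAN_MODERN_FIXTURE: -0.10,
}


def adjust_complexity(baseline: float, flags: Iterable[ConditionFlag]) -> float:
    """Apply each distinct flag once and clamp the result to [0, 1]."""
    score = baseline + sum(FLAG_ADJUSTMENT[f] for f in set(flags))
    return min(1.0, max(0.0, score))
```

Keeping the classifier's output as flags rather than a price means the weights can be retuned without retraining the image model.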
Historical job similarity is the third layer. When a homeowner describes a job in Paranhos in a building estimated at 1960s construction, we look at the distribution of actual job durations for comparable jobs in similar buildings. If 30% of "tap replacement" jobs in that building cohort turned into pipe replacement jobs, the complexity score reflects that risk, and the homeowner is asked an additional clarifying question before the price is confirmed.
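The cohort lookup described above can be sketched roughly as follows. The `PastJob` record, the function names, and the 25% threshold for asking an extra question are all assumptions for illustration; the text does not say where the real cut-off sits.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class PastJob:
    job_type: str    # e.g. "tap_replacement"
    era: str         # building cohort, e.g. "1940-1980"
    escalated: bool  # True if the job turned into larger work


def cohort_escalation_rate(
    jobs: Iterable[PastJob], job_type: str, era: str
) -> Optional[float]:
    """Share of comparable past jobs that grew beyond the booked job type.

    Returns None when there is no history for the cohort.
    """
    cohort = [j for j in jobs if j.job_type == job_type and j.era == era]
    if not cohort:
        return None
    return sum(j.escalated for j in cohort) / len(cohort)


ASK_THRESHOLD = 0.25  # hypothetical cut-off, not a documented value


def needs_extra_question(rate: Optional[float]) -> bool:
    """Ask a clarifying question when the cohort risk is high enough."""
    return rate is not None and rate >= ASK_THRESHOLD
```

Returning `None` for an empty cohort keeps "no data" distinct from "zero risk", which matters for categories where job history is short.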
A Concrete Example: The Leaking Tap in Bonfim
A homeowner in Bonfim submits a booking for a leaking kitchen tap. They upload two photos — one showing water dripping from the spout, another showing the under-sink cabinet. The address caderneta indicates a building registered in 1957.
The photo classification identifies: a single-lever mixer tap of a type common in 1980s–1990s retrofits, visible scale buildup on the tap body, and a copper pipe connection visible in the under-sink photo that shows greenish corrosion at the joint. The historical data for this building age cohort shows that 28% of tap replacement jobs require additional work on the pipe joint below the tap.
The system flags this as a moderate-complexity booking and presents the homeowner with two additional questions: "Is the tap dripping only from the spout, or is there also leaking from the base or the connection below the tap?" and "Has this tap been replaced before, or is it the original fitting?" The homeowner's answers move the complexity score up or down. If they confirm leaking at the base connection, the booking is priced to include the joint repair. If they confirm dripping from the spout only with no visible leak elsewhere, the standard tap replacement price applies with a note that the professional will verify on arrival.
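The branch on the homeowner's first answer can be written down as a small decision function. This is a sketch under assumptions: the enum, the `Quote` type, and the scope labels are hypothetical names, the second question (original fitting or not) is left out, and no prices are shown because the example above does not state them.

```python
from dataclasses import dataclass
from enum import Enum


class LeakLocation(Enum):
    SPOUT_ONLY = "spout_only"
    BASE_OR_CONNECTION = "base_or_connection"


@dataclass(frozen=True)
class Quote:
    scope: str               # hypothetical scope label
    verify_on_arrival: bool  # whether the professional re-checks on site


def quote_for_leaking_tap(leak: LeakLocation) -> Quote:
    """Pick the quoted scope from the homeowner's answer about the leak."""
    if leak is LeakLocation.BASE_OR_CONNECTION:
        # Leak confirmed below the tap: price the joint repair in up front.
        return Quote(scope="tap_replacement_with_joint_repair",
                     verify_on_arrival=False)
    # Spout-only drip: standard price, with a note to verify on arrival.
    return Quote(scope="standard_tap_replacement", verify_on_arrival=True)
```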
Error Rates by Category
We track the difference between the complexity estimate at booking time and the actual job complexity as reported by the professional after completion. A "scope match" means the professional confirmed the job was within the originally quoted scope. A "scope creep" event means the actual job required work outside the original scope — triggering either an add-on charge or, if we failed to price it correctly, a cost absorption on our side.
Current scope match rates by category:
- Locksmith work: 95% scope match (simple category, very low variance)
- Plumbing: 86% scope match
- Electrical: 83% scope match
- Tile work: 77% scope match
Tile work is the worst-performing category, and we want to be direct about why. The complexity of tile work depends heavily on factors that are not visible in photos: subfloor condition, adhesive type under existing tiles, and whether the grout has been sealed. We do not yet have enough historical job data to estimate those variables accurately from surface indicators. Roughly 23% of tile jobs end up outside the originally quoted scope. We are not pretending otherwise.
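For reference, a scope match rate of this kind is just the share of completed jobs per category that the professional confirmed as within scope. A minimal Python sketch, assuming each completed job is reduced to a hypothetical `(category, matched)` pair:

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def scope_match_rates(jobs: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Compute the fraction of scope-matched jobs per category.

    Each item is (category, matched), where matched is True when the
    professional confirmed the job stayed within the quoted scope.
    """
    totals: Dict[str, int] = defaultdict(int)
    matches: Dict[str, int] = defaultdict(int)
    for category, matched in jobs:
        totals[category] += 1
        matches[category] += int(matched)
    return {category: matches[category] / totals[category] for category in totals}
```

The scope creep rate for a category is then one minus its match rate, which is where the roughly 23% figure for tile work comes from.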
What the Model Cannot See
Photo-based complexity estimation has hard limits. A homeowner who uploads photos of the visible symptom (a dripping tap) may not think to photograph the access panel, the shut-off valve, or the broader pipe layout. The model sees what the homeowner shows it, which is not always what the professional needs to see.
We address this partially through structured questions — the clarifying questions at booking time are designed to surface the information that photos typically miss. But a professional arriving at a job always has some information advantage over the booking system. We accept that. The complexity model is not meant to eliminate surprises entirely; it is meant to narrow the scope of possible surprises enough that our fixed-price model holds in the large majority of cases.
In cases where the professional arrives and finds a genuine scope mismatch — something that could not have been reasonably detected at booking — we have an in-app scope escalation process. The professional documents the discrepancy with photos and a short explanation, the homeowner receives an updated price to approve before any additional work begins, and our operations team reviews the escalation to check whether the original complexity model should have flagged it. That review loop is how we improve the model over time.
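The escalation flow can be pictured as a small record with an approval gate. The sketch below is an assumption about shape, not a description of our actual schema: the class, field, and status names are all hypothetical. The one rule it encodes from the process above is that additional work may start only after the homeowner approves the updated price.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class EscalationStatus(Enum):
    AWAITING_HOMEOWNER = "awaiting_homeowner"
    APPROVED = "approved"
    DECLINED = "declined"


@dataclass
class ScopeEscalation:
    booking_id: str
    explanation: str          # professional's short description
    photo_ids: List[str]      # photos documenting the discrepancy
    updated_price_eur: float  # price the homeowner is asked to approve
    status: EscalationStatus = EscalationStatus.AWAITING_HOMEOWNER
    # Filled in later by the operations review; None until reviewed.
    should_have_been_flagged: Optional[bool] = None

    def approve(self) -> None:
        self.status = EscalationStatus.APPROVED

    def decline(self) -> None:
        self.status = EscalationStatus.DECLINED

    def may_start_additional_work(self) -> bool:
        """Additional work is allowed only after homeowner approval."""
        return self.status is EscalationStatus.APPROVED
```

Storing the review outcome on the same record makes it straightforward to feed confirmed model misses back into later training data.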
The Feedback Loop
Every completed job contributes data to the complexity model: booking inputs, actual job duration, scope match or creep, and professional notes. We retrain the model periodically — not continuously — because continuous retraining on a small data set tends to overfit to recent edge cases.
The data set is still modest. We have historical job data from roughly 1,400 completed jobs at the time of writing. That is enough to build a functional model for plumbing and locksmith categories, but thin for tile work and early-stage for painting and electrical. We expect accuracy to improve as volume grows. For now, the system works better on jobs we have seen many times before and worse on categories where our job history is short.