Bayesian Spatially Varying Coefficient Model — Gen 1
Gen 1 builds on the first prototype by letting every coefficient change smoothly over space, using latent Gaussian processes on standardized coordinates.
Correlation follows the exponential form \(r(d;\,\phi)=\exp(-d/\phi)\) with a shared length scale \(\phi\), and per-coefficient magnitudes \(\rho_k\) set the relative variability.
Inference is performed with PyMC using NUTS, with the NumPyro/JAX backend enabled when available for extra speed and stability.
This richer formulation helps reveal neighborhood-dependent effects, such as how floor premium or metro proximity change across the city, while keeping the model interpretable.
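As a point of reference, the covariance implied for a single coefficient's GP can be written down directly. The sketch below is a minimal NumPy illustration of the kernel described above; `coords_std`, `phi`, and `rho_k` are placeholder names, not the project's actual API.

```python
import numpy as np
from scipy.spatial.distance import cdist

def svc_covariance(coords_std, phi, rho_k, jitter=1e-6):
    """Covariance of one coefficient's GP: rho_k^2 * exp(-d / phi) on standardized coordinates."""
    d = cdist(coords_std, coords_std)                      # pairwise Euclidean distances
    return rho_k**2 * np.exp(-d / phi) + jitter * np.eye(len(coords_std))
```

All coefficients share the same \(\phi\), so only \(\rho_k\) changes from one coefficient's covariance to the next.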
Model
The listing outcome \(y_i\) for observation \(i\) at location \(\mathbf{s}_i\) is modeled as a spatially varying linear combination of features:
\[
y_i \;=\; \sum_{k} \beta_k(\mathbf{s}_i)\, x_{ik} \;+\; \varepsilon_i,
\]
where \(x_{ik}\) are the features listed below and \(\varepsilon_i\) is observation noise. Each coefficient \(\beta_k\), including the intercept, has its own latent GP; all GPs share the exponential length scale \(\phi\), while \(\rho_k\) sets the per-coefficient variability. Coordinates are standardized for numerical stability.
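The following is a minimal PyMC sketch of this construction, not the project's actual script: priors, variable names, and the non-centered parameterization are illustrative assumptions, and the correlation matrix is built explicitly so it matches \(r(d;\phi)=\exp(-d/\phi)\).

```python
import pymc as pm
import pytensor.tensor as pt
from pytensor.tensor.slinalg import cholesky
from scipy.spatial.distance import cdist

def build_svc_model(X, y, coords_std):
    """X: (n, K) design matrix with an intercept column; coords_std: (n, 2) standardized lat/lon."""
    n, K = X.shape
    dist = pt.constant(cdist(coords_std, coords_std))      # fixed pairwise distances (data)
    Xt = pt.as_tensor_variable(X)

    with pm.Model() as model:
        # Shared length scale and per-coefficient GP scales (priors are illustrative).
        phi = pm.Gamma("phi", alpha=2.0, beta=1.0)
        rho = pm.HalfNormal("rho", sigma=1.0, shape=K)
        mu_beta = pm.Normal("mu_beta", mu=0.0, sigma=2.0, shape=K)

        # One spatial correlation matrix shared by all coefficients: r(d; phi) = exp(-d / phi).
        corr = pt.exp(-dist / phi) + 1e-6 * pt.eye(n)
        L = cholesky(corr)

        # Non-centered latent GPs: beta[:, k] = mu_k + rho_k * (L @ z[:, k]).
        z = pm.Normal("z", mu=0.0, sigma=1.0, shape=(n, K))
        beta = pm.Deterministic("beta", mu_beta + rho * pt.dot(L, z))

        # Spatially varying linear predictor and Gaussian likelihood on the (log-scale) outcome.
        mu = pt.sum(Xt * beta, axis=1)
        sigma = pm.HalfNormal("sigma", sigma=1.0)
        pm.Normal("y_obs", mu=mu, sigma=sigma, observed=y)
    return model
```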
Features
- Target: price, modeled on the log scale where helpful (see equation above).
- Spatial inputs: standardized \((\text{lat}_{std}, \text{lon}_{std})\)
- Design (all vary spatially): intercept, area\(_{std}\), disposition, ownership, loggia size, parking, floor\(_{std}\), elevator, distance to public transport\(_{std}\), distance to metro, time trend (a design-matrix sketch follows this list)
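A hedged sketch of assembling these inputs is shown below; the column names (`lat`, `lon`, `area`, etc.) and the assumption that categorical fields are already numerically encoded are placeholders rather than the project's actual schema.

```python
import numpy as np
import pandas as pd

def standardize(col: pd.Series) -> pd.Series:
    """Center and scale a numeric column."""
    return (col - col.mean()) / col.std()

def build_inputs(df: pd.DataFrame):
    coords_std = np.column_stack([standardize(df["lat"]), standardize(df["lon"])])
    X = np.column_stack([
        np.ones(len(df)),                    # intercept (also spatially varying)
        standardize(df["area"]),
        df["disposition"], df["ownership"],  # assumed already encoded as numbers
        df["loggia_size"], df["parking"],
        standardize(df["floor"]), df["elevator"],
        standardize(df["dist_transport"]), df["dist_metro"],
        df["time_trend"],
    ])
    y = np.log(df["price"]).to_numpy()       # log-scale target (where used)
    return X, y, coords_std
```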
Benchmark
Training & Inference
- Bayesian inference with PyMC (NUTS); NumPyro/JAX backend used when available.
- Exponential covariance \(r(d;\phi)=\exp(-d/\phi)\) with shared length scale \(\phi\); per-coefficient scales \(\rho_k\).
- 80/20 train–test split with a fixed seed; coordinates and key numeric features are standardized.
- Artifacts saved: trace.nc, posterior_summary.csv, obs_vs_pred.png, metrics.txt (a sampling-and-saving sketch follows this list).
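A minimal sketch of the sampling and artifact-saving flow, assuming the `build_svc_model` and `build_inputs` helpers sketched above; sampler settings, the `df` table name, and file paths are illustrative, not the project's exact configuration.

```python
import arviz as az
import numpy as np
import pymc as pm
from sklearn.model_selection import train_test_split

X, y, coords_std = build_inputs(df)                      # df: the listings table (placeholder name)
idx_train, idx_test = train_test_split(np.arange(len(y)), test_size=0.2, random_state=42)

with build_svc_model(X[idx_train], y[idx_train], coords_std[idx_train]):
    # NUTS via the NumPyro/JAX backend when installed; drop nuts_sampler to use PyMC's default.
    idata = pm.sample(draws=1000, tune=1000, chains=4, target_accept=0.9,
                      nuts_sampler="numpyro", random_seed=42)

idata.to_netcdf("trace.nc")                                              # posterior draws
az.summary(idata, var_names=["phi", "rho", "sigma"]).to_csv("posterior_summary.csv")
```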
Tools
Model fitting and diagnostics are driven by project scripts, using PyMC for inference and the NumPyro/JAX backend, when available, for faster execution. Artifacts such as the trace, posterior summaries, plots, and metrics.txt are saved to the experiment directory for inspection; a diagnostic-check sketch follows.
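For example, a quick check on the saved trace might look like the sketch below; the thresholds are common conventions rather than project-defined cutoffs.

```python
import arviz as az

idata = az.from_netcdf("trace.nc")                       # reload the saved trace
summ = az.summary(idata, var_names=["phi", "rho", "sigma"])

# Flag parameters with poor mixing: high R-hat or low bulk effective sample size.
poor = summ[(summ["r_hat"] > 1.01) | (summ["ess_bulk"] < 400)]
print(poor if len(poor) else "All checked parameters pass basic ESS / R-hat thresholds.")
```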
Notes & Next Steps
- Shared \(\phi\) simplifies computation and encourages comparable smoothness across coefficients; per-feature \(\phi_k\) is a possible extension.
- Model complexity increases posterior coupling; careful diagnostics (ESS, \(\hat{R}\)) and re-centering/standardization remain important.
- Future: hierarchical shrinkage on \(\rho_k\), alternative kernels (Matérn), and sparse/inducing-point approximations for scalability (see the sketch below).
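As a rough sketch of what those extensions could look like in PyMC (illustrative priors and names, not implemented in Gen 1):

```python
import pymc as pm

K = 11  # number of spatially varying coefficients (illustrative)

with pm.Model():
    # Matérn 3/2 as a drop-in alternative to the hand-built exponential correlation above.
    phi = pm.Gamma("phi", alpha=2.0, beta=1.0)
    cov_fn = pm.gp.cov.Matern32(input_dim=2, ls=phi)

    # Hierarchical shrinkage: a global scale tau pools the per-coefficient scales rho_k.
    tau = pm.HalfNormal("tau", sigma=1.0)
    rho = pm.HalfNormal("rho", sigma=tau, shape=K)
```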