Bayesian Spatially Varying Coefficient Model — Gen 1
Gen 1 extends the initial design by allowing all coefficients to vary over space via latent Gaussian processes defined on standardized latitude/longitude.
Correlation follows the exponential form \(r(d;\,\phi)=\exp(-d/\phi)\) with a shared length scale \(\phi\), and per-coefficient magnitudes \(\rho_k\) control variability.
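As a concrete illustration, the exponential correlation can be evaluated on a pairwise distance matrix. This is a minimal NumPy sketch; the function name `exp_corr` and the example coordinates are illustrative, not taken from the model code:

```python
import numpy as np

def exp_corr(coords: np.ndarray, phi: float) -> np.ndarray:
    """Exponential correlation r(d; phi) = exp(-d / phi) on pairwise Euclidean distances."""
    # Pairwise Euclidean distances between rows of `coords`, shape (n, 2).
    diff = coords[:, None, :] - coords[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    return np.exp(-d / phi)

# Three standardized (lat, lon) points; correlation decays with distance.
coords = np.array([[0.0, 0.0], [0.5, 0.0], [2.0, 0.0]])
R = exp_corr(coords, phi=1.0)
```

Correlation is 1 on the diagonal and falls off as \(\exp(-d/\phi)\), so the nearer pair is more strongly correlated than the farther one.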
Inference uses PyMC with NUTS (NumPyro/JAX when available) on the target \( \log(\text{price}) \).
This model captures heterogeneous neighborhood effects beyond size alone, providing a more flexible foundation for local price dynamics across Prague.
Model
The log-price \(y_i\) for listing \(i\) at location \(\mathbf{s}_i\) is modeled as a spatially varying linear combination of features:

\[ y_i = \sum_{k=1}^{K} \beta_k(\mathbf{s}_i)\, x_{ik} + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2). \]
Each coefficient—including the intercept—has its own latent GP sharing the same exponential length scale \(\phi\), while \(\rho_k\) scales per-coefficient variability. Coordinates are standardized for numerical stability.
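To make the GP construction concrete, here is a minimal NumPy sketch of drawing one coefficient surface \(\beta_k(\mathbf{s})\) from its prior, using the shared exponential correlation and a per-coefficient scale \(\rho_k\). The function name `draw_svc` is illustrative, and a small jitter term is added for numerical stability of the Cholesky factor:

```python
import numpy as np

def draw_svc(coords, phi, rho_k, mu_k=0.0, seed=None, jitter=1e-8):
    """Draw beta_k(s) ~ GP(mu_k, rho_k^2 * exp(-d / phi)) at the given locations."""
    rng = np.random.default_rng(seed)
    diff = coords[:, None, :] - coords[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    cov = rho_k ** 2 * np.exp(-d / phi) + jitter * np.eye(len(coords))
    L = np.linalg.cholesky(cov)          # cov = L @ L.T
    return mu_k + L @ rng.standard_normal(len(coords))

rng = np.random.default_rng(0)
coords = rng.standard_normal((50, 2))    # standardized (lat, lon)
beta_area = draw_svc(coords, phi=1.0, rho_k=0.3, mu_k=0.5, seed=1)
```

In the actual PyMC model the same covariance structure is placed on every coefficient, with \(\phi\) shared and \(\rho_k\) varying per feature.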
Features
- Target: \(\log(\text{price})\)
- Spatial inputs: standardized \((\text{lat}_{std}, \text{lon}_{std})\)
- Design (all vary spatially): intercept, area\(_{std}\), disposition, ownership, loggia size, parking lots, floor\(_{std}\), elevator present, distance to public transport\(_{std}\), distance to metro, time trend
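Standardization of coordinates and numeric features follows the usual pattern; this sketch keeps the fitted statistics so that test data can be transformed with the training-set mean and standard deviation (the helper name `standardize` is illustrative):

```python
import numpy as np

def standardize(x: np.ndarray):
    """Return (x - mean) / std plus the fitted statistics for reuse on test data."""
    mu, sd = x.mean(axis=0), x.std(axis=0)
    return (x - mu) / sd, mu, sd

lat = np.array([50.05, 50.08, 50.10, 50.12])   # raw latitudes (illustrative)
lat_std, mu, sd = standardize(lat)
```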
Benchmark
Training & Inference
- Bayesian inference with PyMC (NUTS); NumPyro/JAX backend used when available.
- Exponential covariance \(r(d;\phi)=\exp(-d/\phi)\) with shared length scale \(\phi\); per-coefficient scales \(\rho_k\).
- 80/20 train–test split with fixed seed; standardized coordinates and key numerics.
- Artifacts saved: `trace.nc`, `posterior_summary.csv`, `obs_vs_pred.png`, `metrics.txt`.
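The fixed-seed 80/20 split can be sketched as a shuffled index partition (a minimal version, assuming a simple random split rather than any spatial blocking; the function name is illustrative):

```python
import numpy as np

def train_test_split_idx(n: int, test_frac: float = 0.2, seed: int = 42):
    """Shuffled index split; the seed fixes the permutation for reproducibility."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_test = int(round(test_frac * n))
    return perm[n_test:], perm[:n_test]  # (train_idx, test_idx)

train_idx, test_idx = train_test_split_idx(100)
```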
CLI Usage
Example run (single chain, NumPyro if available):
```
python gen_1_model.py --fit --draws 800 --chains 1 --save-dir artifacts/svcp_all
```
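An argument parser matching the flags in the example above could look like the following sketch (the defaults and help strings are assumptions; only the flag names come from the invocation shown):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """CLI flags matching the example invocation of gen_1_model.py."""
    p = argparse.ArgumentParser(description="Fit the Gen 1 SVC model.")
    p.add_argument("--fit", action="store_true", help="run MCMC sampling")
    p.add_argument("--draws", type=int, default=800, help="posterior draws per chain")
    p.add_argument("--chains", type=int, default=1, help="number of MCMC chains")
    p.add_argument("--save-dir", default="artifacts/svcp_all", help="output directory")
    return p

args = build_parser().parse_args(["--fit", "--draws", "800", "--chains", "1"])
```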
Notes & Next Steps
- Shared \(\phi\) simplifies computation and encourages comparable smoothness across coefficients; per-feature \(\phi_k\) is a possible extension.
- Model complexity increases posterior coupling; careful diagnostics (ESS, \(\hat{R}\)) and re-centering/standardization remain important.
- Future: hierarchical shrinkage on \(\rho_k\), alternative kernels (Matérn), and sparse/inducing-point approximations for scalability.
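As a sketch of the kernel extension mentioned above, the Matérn 3/2 correlation can be compared with the current exponential form (which is Matérn 1/2); the function names are illustrative:

```python
import numpy as np

def matern32(d, phi):
    """Matern nu = 3/2 correlation: (1 + sqrt(3) d/phi) * exp(-sqrt(3) d/phi)."""
    a = np.sqrt(3.0) * d / phi
    return (1.0 + a) * np.exp(-a)

def exponential(d, phi):
    """Exponential (Matern nu = 1/2) correlation: exp(-d/phi)."""
    return np.exp(-d / phi)

d = np.linspace(0.0, 3.0, 7)
r32 = matern32(d, phi=1.0)
r12 = exponential(d, phi=1.0)
```

The Matérn 3/2 kernel yields once-differentiable coefficient surfaces, whereas the exponential kernel produces rougher, non-differentiable ones; at small distances the Matérn 3/2 correlation decays more slowly.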