| Title: | Geographically Optimal Similarity |
|---|---|
| Description: | Understanding spatial association is essential for spatial statistical inference, including factor exploration and spatial prediction. The geographically optimal similarity (GOS) model is an effective method for spatial prediction, as described in Yongze Song (2022) <doi:10.1007/s11004-022-10036-8>. GOS was developed based on the geographical similarity principle, as described in Axing Zhu (2018) <doi:10.1080/19475683.2018.1534890>. Its advantages are more accurate spatial prediction from fewer samples and substantially reduced prediction uncertainty. |
| Authors: | Yongze Song [aut, cph] (ORCID: <https://orcid.org/0000-0003-3420-9622>), Wenbo Lyu [aut, cre] (ORCID: <https://orcid.org/0009-0002-6003-3800>) |
| Maintainer: | Wenbo Lyu <[email protected]> |
| License: | GPL-3 |
| Version: | 3.10 |
| Built: | 2026-05-28 10:37:35 UTC |
| Source: | https://github.com/ausgis/geosimilarity |
Computationally optimized function for the geographically optimal similarity (GOS) model
gos(formula, data = NULL, newdata = NULL, kappa = 0.25, cores = 1)
formula |
A formula of GOS model. |
data |
A |
newdata |
A |
kappa |
(optional) A numeric value of the percentage of observation locations
with high similarity to a prediction location. |
cores |
(optional) Positive integer. If cores > 1, a |
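The role of `kappa` can be illustrated with a minimal base R sketch. It assumes that `kappa` keeps the observations whose similarity lies at or above the (1 - kappa) quantile, and that the prediction is a similarity-weighted mean of the retained observations. The vectors `sim` and `y` are made-up values, and the code is an illustration of the idea rather than the package's internal implementation.

```r
# Hypothetical similarities between one prediction location and 8 observations
# (made-up values for illustration only).
sim <- c(0.91, 0.15, 0.47, 0.88, 0.30, 0.76, 0.05, 0.62)
# Hypothetical observed values at the same 8 locations.
y <- c(5.1, 3.2, 4.0, 5.4, 3.8, 4.9, 2.9, 4.4)
kappa <- 0.25

# Assumption: kappa keeps observations whose similarity is at or above
# the (1 - kappa) quantile of all similarities.
threshold <- quantile(sim, probs = 1 - kappa, names = FALSE)
selected <- which(sim >= threshold)

# Assumption: the prediction is a similarity-weighted mean of the
# retained observations.
pred <- sum(sim[selected] * y[selected]) / sum(sim[selected])

selected
pred
```

With `kappa = 0.25`, two of the eight observations are retained; a larger `kappa` would bring in less similar observations.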
A tibble made up of predictions and uncertainties.
pred: GOS model prediction results
uncertainty90: uncertainty under the 0.9 quantile
uncertainty95: uncertainty under the 0.95 quantile
uncertainty99: uncertainty under the 0.99 quantile
uncertainty99.5: uncertainty under the 0.995 quantile
uncertainty99.9: uncertainty under the 0.999 quantile
uncertainty100: uncertainty under the 1 quantile
Song, Y. (2022). Geographically Optimal Similarity. Mathematical Geosciences. doi: 10.1007/s11004-022-10036-8.
data("zn")
# log-transformation
hist(zn$Zn)
zn$Zn <- log(zn$Zn)
hist(zn$Zn)
# remove outliers
k <- removeoutlier(zn$Zn, coef = 2.5)
dt <- zn[-k,]
# split data for validation: 70% training; 30% testing
split <- sample(1:nrow(dt), round(nrow(dt)*0.7))
train <- dt[split,]
test <- dt[-split,]
system.time({
  g1 <- gos(Zn ~ Slope + Water + NDVI + SOC + pH + Road + Mine,
            data = train, newdata = test, kappa = 0.25, cores = 1)
})
test$pred <- g1$pred
plot(test$Zn, test$pred)
cor(test$Zn, test$pred)
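The example checks predictions with `cor()`. A complementary error measure such as RMSE can be computed in the same way. The sketch below uses a hypothetical helper `rmse()` (not part of geosimilarity) and made-up vectors standing in for `test$Zn` and `test$pred`.

```r
# Hypothetical helper (not part of geosimilarity): root mean square error.
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

# Made-up observed and predicted values standing in for test$Zn and test$pred.
obs  <- c(4.2, 4.8, 5.1, 3.9, 4.5)
pred <- c(4.0, 4.9, 5.3, 4.1, 4.4)

rmse(obs, pred)   # average magnitude of the prediction error
cor(obs, pred)    # linear agreement between observed and predicted values
```

Correlation measures agreement in pattern, while RMSE is expressed in the units of the (here log-transformed) response, so reporting both gives a fuller picture.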
Computationally optimized function for determining the best kappa parameter for the optimal similarity
gos_bestkappa(formula, data = NULL, kappa = seq(0.05, 1, 0.05), nrepeat = 10, nsplit = 0.5, cores = 1)
formula |
A formula of GOS model. |
data |
A |
kappa |
(optional) A numeric value of the percentage of observation locations
with high similarity to a prediction location. |
nrepeat |
(optional) A numeric value giving the number of cross-validation repetitions.
The default value is 10. |
nsplit |
(optional) The sample training set segmentation ratio, which in |
cores |
(optional) Positive integer. If cores > 1, a |
A list.
bestkappa: the best kappa value
cvrmse: all RMSE values calculated during cross-validation
cvmean: the average RMSE for each kappa in the cross-validation process
plot: the plot of RMSE against kappa
Song, Y. (2022). Geographically Optimal Similarity. Mathematical Geosciences. doi: 10.1007/s11004-022-10036-8.
data("zn")
# log-transformation
hist(zn$Zn)
zn$Zn <- log(zn$Zn)
hist(zn$Zn)
# remove outliers
k <- removeoutlier(zn$Zn, coef = 2.5)
dt <- zn[-k,]
# determine the best kappa
system.time({
  b1 <- gos_bestkappa(Zn ~ Slope + Water + NDVI + SOC + pH + Road + Mine,
                      data = dt, kappa = c(0.01, 0.1, 1),
                      nrepeat = 1, cores = 1)
})
b1$bestkappa
b1$plot
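The selection procedure described by the returned values (RMSE per repetition, mean RMSE per kappa, best kappa) can be sketched in base R. This is an illustration under stated assumptions, not the package's code: `toy_predict()` is a hypothetical stand-in predictor rather than the GOS algorithm, the data are synthetic, and `nsplit` is assumed to be the share of rows used for training in each repetition.

```r
set.seed(42)

# Synthetic data (made up): one response and two explanatory variables.
n  <- 120
df <- data.frame(x1 = runif(n), x2 = runif(n))
df$y <- 2 * df$x1 - df$x2 + rnorm(n, sd = 0.1)

# Hypothetical stand-in predictor, NOT the GOS algorithm: for each new row it
# averages y over the kappa share of training rows closest in (x1, x2).
toy_predict <- function(train, new, kappa) {
  k <- max(1, ceiling(kappa * nrow(train)))
  vapply(seq_len(nrow(new)), function(i) {
    d <- (train$x1 - new$x1[i])^2 + (train$x2 - new$x2[i])^2
    mean(train$y[order(d)[seq_len(k)]])
  }, numeric(1))
}

rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

kappas  <- c(0.05, 0.25, 1)
nrepeat <- 5
nsplit  <- 0.5   # assumed: share of rows used for training

# One row per repetition, one column per kappa candidate.
cv_rmse <- sapply(kappas, function(kp) {
  replicate(nrepeat, {
    idx <- sample(nrow(df), round(nrow(df) * nsplit))
    rmse(df$y[-idx], toy_predict(df[idx, ], df[-idx, ], kp))
  })
})

cv_mean    <- colMeans(cv_rmse)             # mean RMSE per kappa
best_kappa <- kappas[which.min(cv_mean)]    # kappa with the lowest mean RMSE
best_kappa
```

Averaging over several random splits reduces the chance that a single lucky split decides the choice, which is the motivation for `nrepeat` values above 1.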
Spatial grid data of explanatory variables.
grid
grid: A tibble of gridded trace element explanatory variables
with 13132 rows and 12 variables, where the first column is ID.
Yongze Song [email protected]
Function for removing outliers.
removeoutlier(x, coef = 2.5)
x |
A vector of a variable |
coef |
The number of standard deviations used as the outlier threshold. Default is 2.5. |
Locations (indices) of the outliers in the vector
data("zn")
# log-transformation
hist(zn$Zn)
zn$Zn <- log(zn$Zn)
hist(zn$Zn)
# remove outliers
k <- removeoutlier(zn$Zn, coef = 2.5)
k
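A standard-deviation rule of this kind can be sketched as follows. The function `sd_outliers()` is a hypothetical re-implementation that assumes outliers are values more than `coef` standard deviations from the mean; the package's `removeoutlier()` may differ in detail. The sketch also guards against an empty result, because negative indexing with a zero-length index in R returns an empty vector rather than the full one.

```r
# Hypothetical re-implementation for illustration; the package's
# removeoutlier() may differ in detail.
sd_outliers <- function(x, coef = 2.5) {
  which(abs(x - mean(x)) > coef * sd(x))
}

# Made-up values with one obvious outlier in the last position.
x <- c(10, 11, 9, 10, 12, 10, 11, 9, 10, 50)
k <- sd_outliers(x, coef = 2.5)

# Guard against an empty index: x[-integer(0)] returns an empty vector,
# so only drop elements when at least one outlier was found.
x_clean <- if (length(k) > 0) x[-k] else x

k
x_clean
```

The same guard applies to data frames: `dt[-k, ]` with an empty `k` yields zero rows, so checking `length(k)` first is a cheap safeguard.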
Spatial dataset of the trace element Zn.
zn
zn: A tibble of trace element Zn with 894 rows and 12 variables
Yongze Song [email protected]