TY - GEN
T1 - Unbiased risk and cross-validation method for selecting optimal knots in multivariable nonparametric regression spline truncated (case study: Unemployment rate in Central Java, Indonesia, 2015)
AU - Devi, Alvita Rachma
AU - Budiantara, I. Nyoman
AU - Ratnasari, Vita
N1 - Publisher Copyright: © 2018 Author(s).
PY - 2018/10/17
Y1 - 2018/10/17
N2 - Nonparametric regression offers greater flexibility because the form of the estimated regression curve adjusts to the data. One nonparametric regression method is the truncated spline. The number of knots and their locations affect the form of the estimated regression curve, so optimal knots are needed to obtain the best model. Methods for selecting optimal knots include unbiased risk (UBR) and cross-validation (CV). This paper discusses UBR and CV and then compares them for selecting optimal knots, using both simulated data and the 2015 unemployment rate data of Central Java Province, Indonesia. The best model was selected on the basis of mean squared error (MSE) and R-square. The simulation was performed on a truncated spline function with errors generated from a normal distribution, for varied sample sizes and error variances. The simulation results showed that CV estimates the knots more accurately than UBR. In the application to the unemployment rate data, the optimal knots under CV were a 2-3-2-1-3 knot combination, with an MSE of 0.3946 and an R-square of 93.047%. Under UBR, the optimal model used three knots, with an MSE of 0.6865 and an R-square of 90.59%. In conclusion, in both the simulation and the application to the unemployment rate data, the CV method produced a better model than the UBR method.
KW - cross-validation
KW - nonparametric regression
KW - unbiased risk
KW - unemployment
UR - http://www.scopus.com/inward/record.url?scp=85056177248&partnerID=8YFLogxK
U2 - 10.1063/1.5062767
DO - 10.1063/1.5062767
M3 - Conference contribution
AN - SCOPUS:85056177248
T3 - AIP Conference Proceedings
BT - 8th Annual Basic Science International Conference
A2 - Karim, Corina
A2 - Azrianingsih, Rodliyati
A2 - Pamungkas, Mauludi Ariesto
A2 - Jatmiko, Yoga Dwi
A2 - Safitri, Anna
PB - American Institute of Physics Inc.
T2 - 8th Annual Basic Science International Conference: Coverage of Basic Sciences toward the World's Sustainability Challanges, BaSIC 2018
Y2 - 6 March 2018 through 7 March 2018
ER -