Visualizing Multi-Variable Prediction Functions by Segmented k-CPG's
- Subject (keywords): Visualization of prediction functions, k-Means clustering, variable importance, support vector machine, random forests, environmental data
- Publisher: The Korean Statistical Society
- Publication year: 2009
- Series type: Journal
- UCI G704-000420.2009.16.1.012
- KCI ID ART001313427
- Language of text: English
Abstract
Machine learning methods such as support vector machines and random forests yield nonparametric prediction functions of the form y = f(x_1, ..., x_p). As a sequel to the previous article (Huh and Lee, 2008) on visualizing nonparametric functions, I propose more sensible graphs for visualizing y = f(x_1, ..., x_p), which have two clear advantages over the previous simple graphs. The new graphs show a small number of prototype curves of f(x_1, ..., x_{j-1}, x_j, x_{j+1}, ..., x_p) and reveal the statistically plausible portion of the interval of x_j, which changes with (x_1, ..., x_{j-1}, x_{j+1}, ..., x_p). To complement the visual display, matching importance measures are produced for each of the p predictor variables. The proposed graphs and importance measures are validated in simulated settings and demonstrated on an environmental study.
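The idea of prototype curves can be illustrated with a minimal sketch. It is not the article's algorithm: it only assumes that one curve of f in x_j is traced per observation while the other variables are held fixed, and that these curves are then grouped by k-means so that the cluster means serve as prototypes. The function `predict`, the toy data `X`, the grid, and the farthest-point initialization are all hypothetical choices made for illustration, and the segmentation to the statistically plausible range of x_j is not attempted here.

```python
def predict(x):
    """Hypothetical stand-in for a fitted prediction function f(x_1, x_2, x_3)."""
    x1, x2, x3 = x
    sign = 1.0 if x2 > 0 else -1.0
    return x1 * sign + 0.1 * x3  # the effect of x1 flips with the sign of x2


def conditional_curves(f, X, j, grid):
    """For each row of X, vary x_j over grid while holding the other variables fixed."""
    curves = []
    for row in X:
        curve = []
        for g in grid:
            point = list(row)
            point[j] = g
            curve.append(f(point))
        curves.append(curve)
    return curves


def sq_dist(a, b):
    """Squared Euclidean distance between two curves on the same grid."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def kmeans_curves(curves, k, max_iter=100):
    """Plain k-means on curves, with a deterministic farthest-point start (an assumption)."""
    centers = [list(curves[0])]
    while len(centers) < k:
        farthest = max(curves, key=lambda c: min(sq_dist(c, m) for m in centers))
        centers.append(list(farthest))
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: sq_dist(curve, centers[c]))
            for curve in curves
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [cv for cv, lab in zip(curves, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Toy observations (x1, x2, x3); x1 is overwritten by the grid values below.
X = [(0.0, x2, x3) for x2 in (-1.0, -0.5, 0.5, 1.0) for x3 in (-1.0, 0.0, 1.0)]
grid = [-1.0 + 0.2 * i for i in range(11)]  # evaluation points for x_1

curves = conditional_curves(predict, X, j=0, grid=grid)
prototypes, labels = kmeans_curves(curves, k=2)

for c, proto in enumerate(prototypes):
    print(c, labels.count(c), [round(v, 2) for v in proto])
```

In this toy setting the two prototypes separate the observations by the sign of x_2, which is the kind of interaction a single averaged curve would hide. With a real model, `predict` would be replaced by the fitted model's prediction call, and the number of clusters k would be a tuning choice.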

