Bagging 1

[Machine Learning] Aggregating decision trees

Bagging (Bootstrap aggregation): a general-purpose method for reducing the variance of a machine learning method. It is especially useful for decision trees and is widely applied to them. -> Because it is difficult to obtain several independent datasets, the bootstrap method is used instead.

OOB (Out-of-Bag Error Estimation): bagging comes with a very intuitive way to estimate the test error. Because the bootstrap samples with replacement, each bootstrap training set contains on average about 2/3 of the original observations. The remaining 1/3, which is not used to fit that tree, is called the OOB (out-of-bag) data. For the decision trees in which the i-th observation is OOB, the i-th..
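
The 2/3 versus 1/3 split described above can be checked numerically. The following is a minimal sketch, not part of the original post: the helper names `bootstrap_split` and `mean_oob_fraction` are hypothetical and chosen only for illustration. It draws bootstrap samples with replacement and measures how many of the original indices are left out of each one. The probability that a given observation is never drawn in n draws is (1 - 1/n)^n, which approaches 1/e (about 0.368) as n grows, so the out-of-bag share should land near one third.

```python
import random


def bootstrap_split(n, rng):
    """Draw one bootstrap sample of size n from indices 0..n-1.

    Returns (in_bag, oob): the sampled indices (with duplicates) and
    the indices that were never drawn (the out-of-bag set).
    """
    in_bag = [rng.randrange(n) for _ in range(n)]  # sampling with replacement
    chosen = set(in_bag)
    oob = [i for i in range(n) if i not in chosen]
    return in_bag, oob


def mean_oob_fraction(n, n_rounds, seed=0):
    """Average share of out-of-bag observations over n_rounds bootstrap samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rounds):
        _, oob = bootstrap_split(n, rng)
        total += len(oob) / n
    return total / n_rounds


if __name__ == "__main__":
    # Expected to be close to 1/e (about 0.368), i.e. roughly one third.
    print(round(mean_oob_fraction(1000, 200), 3))
```

In a bagged ensemble, each tree would be fitted on its `in_bag` indices, and its `oob` indices would be the observations that tree never saw during fitting.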