software engineering/ํŒŒ์ด์ฌ ๋จธ์‹ ๋Ÿฌ๋‹

[Machine Learning] Aggregating decision trees

jjingle 2024. 1. 11. 14:48

Bagging(Bootstrap aggregation)

  • ๋จธ์‹ ๋Ÿฌ๋‹ ๋ฐฉ๋ฒ•์˜ ๋ณ€๋™์„ฑ(variance)์„ ์ค„์ด๊ธฐ ์œ„ํ•œ ์ผ๋ฐ˜์ ์ธ ๋ฐฉ๋ฒ•์œผ๋กœ decision tree ๋ฐฉ๋ฒ•์— ํŠนํžˆ ์œ ์šฉํ•˜์—ฌ ๋งŽ์ด ์ ์šฉ๋จ
  • -> ์—ฌ๋Ÿฌ ๊ฐœ์˜ ๋…๋ฆฝ์ ์ธ ๋ฐ์ดํ„ฐ์…‹์„ ํ™•๋ณดํ•˜๋Š” ๊ฒƒ์ด ์–ด๋ ค์›Œ bootstrap ๋ฐฉ๋ฒ• ์‚ฌ์šฉ

 

OOB(Out-of-Bag Error Estimation)

  • Bagging ๋ฐฉ๋ฒ•์—์„œ๋Š” ์•„์ฃผ ์ง๊ด€์ ์ธ test error ์ถ”์ • ๋ฐฉ๋ฒ•์ด ์กด์žฌ
  • Bootstrap์€ ์ค‘๋ณต์„ ํ—ˆ์šฉํ•˜๋ฏ€๋กœ ํ•˜๋‚˜์˜ bootstrap training data์—์„œ ํ‰๊ท ์ ์œผ๋กœ ๋ณธ๋ž˜(original) ๋ฐ์ดํ„ฐ์˜ 2/3๊ฐ€ ์ƒ˜ํ”Œ๋ง๋จ
  • ๋‚˜๋จธ์ง€ ์ ํ•ฉ์— ์‚ฌ์šฉ๋˜์ง€ ์•Š์€ 1/3์˜ ๋ฐ์ดํ„ฐ๋ฅผ OOB(out-of-bag)์œผ๋กœ ๋ช…๋ช…
  • i๋ฒˆ์งธ ๋ฐ์ดํ„ฐ๊ฐ€ OOB์ธ ๊ฒฝ์šฐ์˜ decision tree์—์„œ i๋ฒˆ์งธ ๋ฐ์ดํ„ฐ์˜ response๋ฅผ ๊ตฌํ•˜๊ณ  ์ด๋“ค์„ ํ‰๊ท ํ•˜์—ฌ  test error์˜ ์ถ”์ •์— ์‚ฌ์šฉ -> ์ด ์ถ”์ • ๊ฐ’์€ B๊ฐ€ ์ถฉ๋ถ„ํžˆ ํฌ๋ฉด LOOCV์™€ ๊ฐ™์Œ

 

Random forests

  • Bootstrap์— ์˜ํ•ด ์ƒ์„ฑ๋œ ํŠธ๋ฆฌ๋“ค์˜ ๊ด€๊ณ„์„ฑ์„ ๋‚ฎ์ถฐ ์„ฑ๋Šฅ์„ ํ–ฅ์ƒํ•˜๋Š” ๋ฐฉ๋ฒ•
  • Bagging๊ณผ ๋งˆ์ฐฌ๊ฐ€์ง€๋กœ bootstrap dataset์— ๋Œ€ํ•˜์—ฌ ํŠธ๋ฆฌ๋ฅผ ์ƒ์„ฑํ•˜์ง€๋งŒ predictor์˜ ์‚ฌ์šฉ๋ฐฉ๋ฒ•์ด ๋‹ค๋ฆ„

 

Boosting

  • Bagging๊ณผ ์œ ์‚ฌํ•œ ๋ฐฉ๋ฒ•์ด๋‚˜ ํŠธ๋ฆฌ๋ฅผ ์ด์ „์— ํ•™์Šตํ•œ ํŠธ๋ฆฌ๋กœ๋ถ€ํ„ฐ ์ˆœ์ฐจ์ ์œผ๋กœ(sequentially) ์ฒœ์ฒœํžˆ ํ•™์Šต(learn slowly)์‹œํ‚ค๋Š” ๋ฐฉ์‹
  • ํ•™์Šต(ํŠธ๋ฆฌ ์ƒ์„ฑ) ๋ฐฉ๋ฒ•
    1. ์•„๋ฌด ์ •๋ณด๋„ ์—†๋Š” ํŠธ๋ฆฌ ์ƒ์„ฑ
    2. ํŠธ๋ฆฌ๋ฅผ response์— ์ ํ•ฉํ•˜๋Š” ๊ฒƒ์ด ์•„๋‹Œ residual(์ ํ•ฉ ํ›„ ๋‚จ์€ ์–‘)์— ์ ํ•ฉ
    3. Residual์— ์ ํ•ฉํ•˜์—ฌ ๋งŒ๋“  ํŠธ๋ฆฌ๋ฅผ ๊ธฐ์กด ํŠธ๋ฆฌ์— ์—…๋ฐ์ดํŠธํ•˜๊ณ  r๋ฅผ ์—…๋ฐ์ดํŠธ
    4.   -> 2,3 ์˜ ๊ณผ์ •์„ BํšŒ ๋ฐ˜๋ณต ํ›„ ์ตœ์ข… ํŠธ๋ฆฌ๋ฅผ ์–ป์Œ
  • ์•Œ๊ณ ๋ฆฌ์ฆ˜ : ์ž‘์€ ํŠธ๋ฆฌ(์†Œ์ˆ˜์˜ terminal node)๋ฅผ ๋”ํ•ด ๊ฐ€๋Š” ๋ฐฉ๋ฒ•

 

Boosting ์˜ tuning parameter

  • ํŠธ๋ฆฌ์˜ ์ˆ˜B
    • B๊ฐ€ ํฌ๋ฉด ๊ณผ์ ํ•ฉ ๋  ์ˆ˜ ์žˆ์œผ๋ฏ€๋กœ cross-validation์œผ๋กœ ์ •ํ•จ
  • Shrinkage parameter ๋žŒ๋‹ค
    • ์ž‘์€ ์–‘์ˆ˜์˜ ๊ฐ’์œผ๋กœ boosting์ด ์–ผ๋งˆ๋‚˜ ๋น ๋ฅด๊ฒŒ ํ•™์Šต๋˜๋Š”์ง€๋ฅผ ์ •ํ•จ
    • ๋ณดํ†ต 0.01 ๋˜๋Š” 0.001 ์ •๋„๋กœ ์ •ํ•˜๋ฉฐ ๊ฐ’์ด ์ž‘์„์ˆ˜๋ก ๋” ํฐ B ํ•„์š”
  • ๋ถ„๊ธฐ์˜ ์ˆ˜(the number of splits) d 
    • d=1์˜ ๊ฒฝ์šฐ์—๋„ ์ข‹์€ ์„ฑ๋Šฅ์„ ๋ƒ„