Potato


[Data Science] Analyzing the customer segments for which TM (telemarketing) leads to product subscription

๊ฐ์ž ๐Ÿฅ” 2021. 4. 28. 16:19
๋ฐ˜์‘ํ˜•

Introduction

I want to write up a project I worked on a while ago. Until now I kept my projects to myself as files and documents, but since I want to live a life of sharing, I plan to upload all of them one by one :)

This project was done at an undergraduate level, so it is neither impressive nor outstanding.
Most of my projects were done simply to prove the skills I had, using well-known datasets.
I did not do this one alone but with teammates, so I will not upload the full detailed code,
only the parts I was responsible for ;)

Still! If you happen to read this while passing by, any feedback on mistakes, room for improvement, or problems is welcome. I want to learn from many people!

 

 

Project topic

Analyzing the customer segments for which TM (telemarketing) leads to product subscription

 

Background for choosing the topic

2019 data

TM (telemarketing) accounts for 8% of sales at insurance companies and 5% at card companies, which is too large a share to ignore. In other words, we judged that companies in many industries, including finance, insurance, and telecom, use TM for customer sales and steadily earn revenue from it.

2019 data

However, most people regard TM calls as spam. So although TM is an essential marketing strategy for companies, we concluded that most customers do not want to receive TM calls.

In conclusion, from the company's point of view, TM marketing should target a minimal customer group that excludes customers who hang up on or report TM calls.

We started this project to find a way to cut TM investment costs by running TM on specific targets with a high sales success rate instead of on the entire customer base.

 

 

Project objective

Analyze the characteristics of the customers for whom TM leads to product subscription, predict which customers are reasonable targets for TM, and use this to raise the effectiveness of TM while minimizing investment cost.

 

Data used

UCI bank telemarketing data

Columns of the UCI bank telemarketing data

 

Algorithms used

Naive Bayes / Decision Tree / CART / Random Forest / SVM / Logistic Regression / ANN

 

Step1. Data exploration

1. Relationship between poutcome and previous

  • For customers whose previous TM succeeded, a small number of contacts before this TM goes with a high probability of subscription.
  • When the outcome of the previous TM is unknown, customers mostly do not subscribe, regardless of the number of contacts before this TM.

 

2. Relationship between poutcome and campaign

  • When the previous TM succeeded, the number of TM calls has to be small for it to lead to a subscription.
  • When the outcome of the previous TM is unknown, customers do not subscribe if the number of TM calls is large.
  • When the previous TM was neither a success nor a failure, customers do not subscribe if the number of calls is too large or too small.

3. Relationship between poutcome and pdays

  • Calling a customer whose previous TM succeeded again after a short interval is likely to end in a failed subscription.
  • Calling a customer whose previous TM failed again after a long time is likely to end in a successful subscription.
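
As a rough illustration of how relationships like these can be tabulated, here is a minimal base R sketch. The `toy` data frame and its values are made up for this example and are not the UCI data; only the column names `poutcome`, `previous`, and `y` follow the dataset. The bin edges are also an arbitrary assumption.

```r
# Toy data (hypothetical values, not the UCI bank data)
toy <- data.frame(
  poutcome = c("success", "success", "success", "unknown", "unknown", "failure"),
  previous = c(1, 2, 8, 0, 0, 3),
  y        = c("yes", "yes", "no", "no", "no", "no")
)

# Bin the number of previous contacts (bin edges are an arbitrary choice)
toy$prev.bin <- cut(toy$previous, breaks = c(-Inf, 0, 3, Inf),
                    labels = c("0", "1-3", "4+"))

# Subscription rate for each (poutcome, bin) cell; empty cells come out as NA
rate <- tapply(toy$y == "yes", list(toy$poutcome, toy$prev.bin), mean)
print(rate)
```

Reading the table row by row gives the kind of statement listed above, for example "previous TM succeeded and few earlier contacts: high subscription rate".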

 

Step2. Changing variable types

Convert the dependent variable to a categorical variable.

setwd('C:\\Users\\Desktop')
bank<-read.csv('bank_term_deposit.csv')
str(bank)

# Convert the dependent variable to a categorical variable (factor)
bank$y<-factor(bank$y)
summary(bank)
str(bank)

# Drop columns 1 and 2 (described in the post as the first row and the age variable)
bank<-bank[,-1:-2]
str(bank)

๋ฐ์ดํ„ฐ๋ฅผ bank ๋ณ€์ˆ˜์— ์ €์žฅํ•œ ํ›„, y๊ฐ’์„ ๋ฒ”์ฃผํ˜• ๋ณ€์ˆ˜๋กœ ๋ณ€ํ˜•ํ•ด์ฃผ์—ˆ๋‹ค. 
๊ทธ๋ฆฌ๊ณ  ๋ถ„์„์— ํ•„์š” ์—†๋‹ค๊ณ  ์ƒ๊ฐ๋˜๋Š” age ๋ณ€์ˆ˜์™€, ์ฒซ๋ฒˆ์งธ ํ–‰์€ ์‚ญ์ œํ•ด ์ฃผ์—ˆ๋‹ค.
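
Dropping columns by position is easy to get wrong, so here is a small sketch of doing the same thing by name. `bank.toy` is a made-up stand-in for the CSV, and the index column name `X` is an assumption (it is what read.csv typically produces for an unnamed first column).

```r
# Toy stand-in for the CSV (hypothetical values); X mimics a leading index column
bank.toy <- data.frame(X = 1:3,
                       age = c(30, 45, 52),
                       job = c("admin.", "technician", "services"),
                       y = c("no", "yes", "no"))

# Convert the dependent variable to a factor
bank.toy$y <- factor(bank.toy$y)

# Drop by name, so the result does not depend on column order
bank.toy <- bank.toy[, !(names(bank.toy) %in% c("X", "age"))]
str(bank.toy)
```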

 

Step3. Data split and oversampling

#๋ฐ์ดํ„ฐ ๋ถ„ํ• 
set.seed(123)
index<-sample(3,nrow(bank),replace=TRUE,prob=c(0.5,0.25,0.25))
bank.train<-bank[index==1,]
bank.validation<-bank[index==2,]
bank.test<-bank[index==3,]
table(bank.train$y)

#์˜ค๋ฒ„์ƒ˜ํ”Œ๋ง
library(ROSE)
data.balanced.over<-ovun.sample(y~., data=bank.train,
                                p=0.5, seed=1,
                                method='over')$data
table(bank.train$y)   #ํ›ˆ๋ จ ๋ฐ์ดํ„ฐ
table(data.balanced.over$y)   #์˜ค๋ฒ„์ƒ˜ํ”Œ๋ง ๋ฐ์ดํ„ฐ

In the early stage of the project, before any oversampling, accuracy was quite low and we felt there was not enough data, so we decided to oversample.

Output of the code above
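
To make the idea concrete, here is a minimal base R sketch of random oversampling: minority-class rows are drawn again with replacement until the two classes are the same size. This is not ROSE's implementation, only the general technique, and `train.toy` is made-up data.

```r
set.seed(1)
# Toy imbalanced training set (hypothetical): 90 "no", 10 "yes"
train.toy <- data.frame(x = rnorm(100),
                        y = factor(rep(c("no", "yes"), times = c(90, 10))))

# Find the minority class and its row indices
counts   <- table(train.toy$y)
minority <- names(counts)[which.min(counts)]
min.idx  <- which(train.toy$y == minority)

# Draw extra minority rows with replacement until both classes are equal in size
extra      <- sample(min.idx, size = max(counts) - min(counts), replace = TRUE)
train.over <- train.toy[c(seq_len(nrow(train.toy)), extra), ]

table(train.toy$y)    # class counts before oversampling
table(train.over$y)   # class counts after oversampling
```

Only the training split is resampled here, so the validation and test sets keep the original class ratio.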

 

Step4. Modeling

1. Decision Tree

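# Decision tree (C5.0)
# The original comment mentions data.balanced.over, but the call fits on bank.train;
# pass data = data.balanced.over instead to train on the oversampled data.
library(C50)
DT.bank2<-C5.0(y ~ ., data = bank.train)
DT.bank.predict2<-predict(DT.bank2, bank.validation)
DT.bank.table2<-table(bank.validation$y, DT.bank.predict2)
DT.bank.table2
paste('decision tree accuracy2 :',
      (DT.bank.table2[1,1]+DT.bank.table2[2,2])/sum(DT.bank.table2))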

 

2. CART

# CART
library(rpart)
bank.cart = rpart(y~., data=bank.train, control = rpart.control(minsplit=10))
print(bank.cart)
plot(bank.cart)
text(bank.cart,use.n = T)
print(bank.cart$cptable)

# Pick the complexity parameter with the lowest cross-validated error
opt=which.min(bank.cart$cptable[,"xerror"])
cp=bank.cart$cptable[opt,"CP"]
bank_prune=prune(bank.cart,cp=cp) # pruning
print(bank_prune)
plot(bank_prune)
text(bank_prune)

# Predict classes on the validation set with the pruned tree
bank.prune_pre=predict(bank_prune, newdata = bank.validation, type = 'class')
bank.prune_pre

bank.prune_tab = table(bank.validation$y,bank.prune_pre)
bank.prune_tab
paste('CART accuracy :',
      (bank.prune_tab[1,1]+bank.prune_tab[2,2])/sum(bank.prune_tab))
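
The pruning step above picks the row of the cptable with the smallest cross-validated error (xerror) and prunes with that row's CP. A tiny sketch of just that selection, using a made-up cptable (the numbers are hypothetical, not from the bank data):

```r
# Hypothetical cptable with the same column names rpart uses
cptable.toy <- cbind(CP     = c(0.10, 0.03, 0.01),
                     nsplit = c(0, 2, 5),
                     xerror = c(1.00, 0.80, 0.85))

# Row with the lowest cross-validated error, and its CP value
opt.toy <- which.min(cptable.toy[, "xerror"])
cp.toy  <- cptable.toy[opt.toy, "CP"]
cp.toy
```

In this toy table the largest tree (5 splits) has a higher xerror than the 2-split tree, so the 2-split tree's CP is the one passed to prune().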

 

3. RandomForest

3-1) Parameter tuning with CV

library(cvTools)
library(foreach)
library(randomForest)

set.seed(123); K=10 ;R=3

cv <- cvFolds(NROW(bank), K=K, R=R)
# Parameter grid. The mtry values were lost from the original post; c(3, 4) is an
# assumption chosen so that row 6 is ntree = 200, mtry = 4, as reported below.
grid <- expand.grid(ntree = c(10, 100, 200), mtry = c(3, 4))
result <- foreach(g = 1:NROW(grid), .combine = rbind) %do% {
    foreach(r = 1:R, .combine = rbind) %do% {
        foreach(k = 1:K, .combine = rbind) %do% {
            validation_idx <- cv$subsets[which(cv$which == k), r]
            train <- bank[-validation_idx, ]
            validation <- bank[validation_idx, ]
            # Train the model
            m <- randomForest(y~., data = train, ntree = grid[g, "ntree"],
                              mtry = grid[g, "mtry"])

            # Predict
            predicted <- predict(m, newdata = validation)

            # Evaluate
            accuracy <- sum(predicted == validation$y) / NROW(predicted)
            data.frame(g = g, accuracy = accuracy)
        }
    }
}

result
library(plyr)
ddply(result, .(g), summarize, mean_accuracy = mean(accuracy))
grid[c(6), ]
  • Accuracy was highest, at 90.83%, with ntree = 200 and mtry = 4
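
For reference, the "average accuracy per grid row, then take the best row" step can also be written in base R. The numbers in `result.toy` are made up for illustration and are not results from this project.

```r
# Hypothetical per-fold accuracies for two grid rows (made-up numbers)
result.toy <- data.frame(g        = c(1, 1, 2, 2),
                         accuracy = c(0.90, 0.92, 0.88, 0.90))

# Mean accuracy per grid row (base R counterpart of the ddply call)
summary.toy <- aggregate(accuracy ~ g, data = result.toy, FUN = mean)
names(summary.toy)[2] <- "mean_accuracy"

# Grid row with the highest mean accuracy
best.g <- summary.toy$g[which.max(summary.toy$mean_accuracy)]
print(summary.toy)
best.g
```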

3-2) Modeling

#Random Forest
install.packages("randomForest")
library(randomForest)
RF.bank<-randomForest(y~., data=bank.train)
RF.bank.predict<-predict(RF.bank, bank.validation)
RF.bank.table<-table(bank.validation$y,RF.bank.predict)
paste('random forest accuracy :',
      (RF.bank.table[1,1]+RF.bank.table[2,2])/sum(RF.bank.table))

 

4. Logistic Regression

#Logistic Regression
logit.bank<-glm(y~., family=binomial, data=bank.train)
logit.bank.predict<-(predict(logit.bank,bank.validation, type="response")>=0.5)
logit.bank.table<-table(bank.validation$y, logit.bank.predict)
logit.bank.table
paste('logistic regression accuracy :',
      (logit.bank.table[1,1]+logit.bank.table[2,2])/sum(logit.bank.table))
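
One thing to watch with the table above: if no predicted probability reaches 0.5, the table has only one column and indexing [2,2] fails. A minimal sketch of a way around this, assuming made-up probabilities and labels, is to fix the factor levels on both sides:

```r
# Hypothetical predicted probabilities and true labels
prob  <- c(0.1, 0.2, 0.3, 0.4)
truth <- factor(c("no", "no", "yes", "no"), levels = c("no", "yes"))

# Fixing the levels keeps the table 2x2 even when no case crosses the 0.5 cut-off
pred <- factor(ifelse(prob >= 0.5, "yes", "no"), levels = c("no", "yes"))
tab  <- table(truth, pred)
tab

accuracy <- sum(diag(tab)) / sum(tab)
accuracy
```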

 

5. SVM

#SVM
library(e1071)

# Min-max normalization of the numeric variables
normalize<-function(x){
  return((x-min(x))/(max(x)-min(x)))
}
# Note: Step2 drops the first two columns with bank[,-1:-2], described there as including age.
# The age lines below only work if age is still present in the data.
bank.train$age<-normalize(bank.train$age)
bank.train$balance<-normalize(bank.train$balance)
bank.train$duration<-normalize(bank.train$duration)
head(bank.train)
bank.validation$age<-normalize(bank.validation$age)
bank.validation$balance<-normalize(bank.validation$balance)
bank.validation$duration<-normalize(bank.validation$duration)
# Fit the SVM on the numeric predictors only
SVM.bank<-svm(y~age+balance+duration+campaign+pdays+previous, data=bank.train)
SVM.bank.predict<-predict(SVM.bank,bank.validation)
SVM.bank.table<-table(bank.validation$y, SVM.bank.predict)
SVM.bank.table
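
The code above scales the validation set with its own min and max. A common alternative, sketched below with made-up numbers, is to reuse the training range so both sets are on the same scale. `normalize.with` is a hypothetical helper, not part of the original project.

```r
# Min-max scaling with an explicitly supplied range (hypothetical helper)
normalize.with <- function(x, lo, hi) (x - lo) / (hi - lo)

train.bal <- c(0, 50, 100)   # hypothetical training values
valid.bal <- c(25, 150)      # hypothetical validation values

# Take the range from the training data only
lo <- min(train.bal)
hi <- max(train.bal)

train.scaled <- normalize.with(train.bal, lo, hi)
valid.scaled <- normalize.with(valid.bal, lo, hi)
train.scaled
valid.scaled
```

With this approach, validation values outside the training range fall outside [0, 1], which is expected.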

 

Step5. Model selection

Results before oversampling

As mentioned earlier, accuracy was markedly low before oversampling, so we applied oversampling and redid the modeling.

Logistic Regression results after oversampling

In the end, we selected the Logistic Regression model, which reached an accuracy of 0.928 and a recall of 0.808, as the final model.
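
For readers less familiar with the metrics, here is a small sketch of how accuracy, recall, and precision come out of a 2x2 confusion matrix. The counts are made up for illustration and are not this project's results.

```r
# Hypothetical confusion matrix: rows = actual class, columns = predicted class
tab <- matrix(c(80, 5, 10, 5), nrow = 2,
              dimnames = list(actual = c("no", "yes"),
                              predicted = c("no", "yes")))

TP <- tab["yes", "yes"]   # subscribers the model selected
FN <- tab["yes", "no"]    # subscribers the model missed
FP <- tab["no", "yes"]    # non-subscribers the model selected

accuracy  <- sum(diag(tab)) / sum(tab)
recall    <- TP / (TP + FN)   # share of actual subscribers that were found
precision <- TP / (TP + FP)   # share of selected customers who actually subscribe
c(accuracy = accuracy, recall = recall, precision = precision)
```

For TM targeting, recall says how many of the would-be subscribers get a call, and precision says how many of the calls are not wasted.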

 

Step6. Checking the ROC curve

# Find the optimal cut-off value from the ROC curve
library(pROC)
# pred.bin3 is defined in code not shown in this post (presumably predicted probabilities on the validation set)
ROC = roc(bank.validation$y, pred.bin3)
plot.roc(ROC, col = 'red', print.auc = FALSE, max.auc.polygon=TRUE,
         print.thres=TRUE, print.thres.pch=19, print.thres.col="black",
         auc.polygon=TRUE, auc.polygon.col = "#D1F2EB")

ROC curve for the validation data

Performance was best at a cut-off value of 0.5.

We therefore decided to classify a customer as 1 when the predicted probability is 0.5 or higher.
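
As far as I know, the threshold that pROC marks with print.thres=TRUE is the one that maximizes sensitivity + specificity (Youden's J). Here is a base R sketch of that criterion on made-up labels and probabilities; it illustrates the idea and is not pROC's code.

```r
# Hypothetical labels (1 = subscribed) and predicted probabilities
truth <- c(0, 0, 0, 0, 1, 0, 1, 1)
prob  <- c(0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9)

# Try every observed probability as a cut-off and compute Youden's J
cutoffs <- sort(unique(prob))
youden <- sapply(cutoffs, function(t) {
  pred <- as.integer(prob >= t)
  sens <- sum(pred == 1 & truth == 1) / sum(truth == 1)   # sensitivity (recall)
  spec <- sum(pred == 0 & truth == 0) / sum(truth == 0)   # specificity
  sens + spec - 1
})

best.cutoff <- cutoffs[which.max(youden)]
best.cutoff
```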

 

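Step7. Checking performance

# Lift Chart
library(ROCR)
# pred.test is defined in code not shown in this post (presumably predicted probabilities on the test set)
pr <- prediction(pred.test, bank.test$y)
model.lift2 <- performance(pr, "lift", "rpp")
# The original call 'with.pragh()' is not an R function (likely win.graph()); dev.new() opens a new plot window
dev.new(); plot(model.lift2, col = "red")

 

We plotted the Lift Chart for the test data. It is slightly unstable for the customers with the very highest predicted probabilities, but it becomes steadily more stable, which shows the model is classifying correctly.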
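
To show what the lift chart measures, here is a minimal base R sketch on made-up data: customers are sorted by predicted probability, and at each depth the subscription rate among those called so far is divided by the overall subscription rate.

```r
# Hypothetical labels (1 = subscribed) and predicted probabilities
truth <- c(1, 1, 0, 1, 0, 0, 0, 0, 0, 0)
prob  <- c(0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1)

# Sort customers from most to least likely to subscribe
ord <- order(prob, decreasing = TRUE)

# Share of customers contacted so far (the "rpp" axis)
rpp <- seq_along(ord) / length(ord)

# Subscription rate among the contacted customers, relative to the overall rate
cum.rate <- cumsum(truth[ord]) / seq_along(ord)
lift     <- cum.rate / mean(truth)

data.frame(rpp = rpp, lift = round(lift, 2))
```

A lift above 1 at small rpp means that calling only the top-scored customers finds subscribers at a higher rate than calling everyone, and the lift always ends at 1 when everyone is contacted.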

 

Expected benefits

1. Even when only part of the customer base is extracted, customers with a high likelihood of subscribing can be selected.
2. Costs should fall substantially compared with blanket marketing to every customer.
3. By raising recall, high returns can be expected despite the lower accuracy.

 

Limitations of this project

1. The size of each customer's product subscription was not considered.
2. Precision is low, so some of the selected customers have no intention of subscribing.
3. Because the data was oversampled, the results may differ from reality.

 

๋งˆ์น˜๋ฉฐ

ํ•™๋ถ€์ƒ๋•Œ ์ง„ํ–‰ํ–ˆ๋˜ ์ฝ”๋“œ๋ฅด ๋ณด๋‹ˆ ์ •ํ™•ํ•˜์ง€ ์•Š์€ ๋ณ€์ˆ˜๋ช…๋„ ๋งŽ๊ณ , ์ฝ”๋“œ๊ฐ€ ์ข€ ์–ด์ง€๋Ÿฝ๋‹ค. ๋”๋ถˆ์–ด ๋ถ„์„์„ ์ œ๋Œ€๋กœ ์‹œํ–‰ํ–ˆ๋‹ค ํ•˜๋Š” ์ž์‹ (?)๋˜ํ•œ ์—†๋‹ค. ํ•˜์ง€๋งŒ ์ด ์‹œ์ ์— ๋‹ค์‹œ ์ •๋ฆฌํ•ด๋ด„์œผ๋กœ์จ ๋‚˜์—๊ฒŒ ๋ฌด์—‡์ด ๋ถ€์กฑํ•œ์ง€, ๋ฌด์Šจ ๊ณต๋ถ€๊ฐ€ ๋” ํ•„์š”ํ•œ์ง€์— ๋Œ€ํ•ด ์ฐพ์•„๋ณผ ์ˆ˜ ์žˆ์—ˆ๋‹ค. ์•ž์œผ๋กœ๋Š” ํ†ต๊ณ„๊ณต๋ถ€๋ฅผ ์กฐ๊ธˆ ๋” ํ•˜๊ณ , ํŒŒ์ƒ๋ณ€์ˆ˜๋ฅผ ์ƒ์„ฑํ•œ๋‹ค๋˜์ง€, ๋‹ค๋ฅธ ์•™์ƒ์ƒ๋ธ”๋ชจ๋ธ์„ ํ™œ์šฉํ•œ๋‹ค๋˜์ง€ํ•ด์„œ ์ •ํ™•๋„๋ฅผ ์˜ฌ๋ฆฌ๋Š” ๋ฐฉ๋ฒ•์„ ๋ชจ์ƒ‰ํ•ด ๋ณผ๊ฒƒ์ด๋‹ค.

๋ฐ˜์‘ํ˜•