
[NLP] Let's Try Topic Modeling on Patent Data

Potato 🥔 2021. 4. 29. 00:13

Getting Started

I'm going to write about a Topic Modeling project I did back when I was an undergraduate.
I had started to get interested in NLP, so I took a Data Analytics course, and luckily I got to learn NLP concepts and do hands-on exercises there. I did several projects based on that knowledge, and this is one of them!

At the end of the project I wrote it up as a report and submitted it as an assignment, which is why I can still put it into a post like this even after some time has passed. It reminds me once again how important it is to document your work :)

-- This project was carried out as a school assignment and project :)

 

1. Topic Selection

1.1 Project Topic

LDA on patent data for Quantum-dot Display technology

 

1.2 Project Goal

The goal is to run Topic Modeling that makes it possible to quickly grasp the topics of 794 patents related to quantum dot displays.

There is an overwhelming amount of patent data and the technical terminology is difficult, so I think this kind of Topic Modeling could be widely used in industry!

 

2. Analysis Process

2.1 Data Collection

2.1.1 Data Used

The data for analysis was selected from patents filed, published, and registered over the entire period covered by the US Patent Office patent DB. On a patent search site supported by Dongguk University (I forgot which site it was... I'll update this as soon as I find it!), I organized the terms used with a meaning similar to Quantum-dot display as shown in <Table 1> below and wrote the search query.

Search query for Quantum-dot display patents
TI = ("Quantum*" or "QD") AND ("display*") or ("QLED" or "QLCD")

<Table 1> Quantum-dot display patent search query
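
To make the logic of the search query concrete, here is a minimal R sketch that applies the same kind of title filter to a few made-up titles. It assumes that AND binds tighter than OR (the search site's actual precedence is not stated here), that matching is case-insensitive, and that `*` means "any suffix". The titles and the helper name `matches_query` are hypothetical.

```r
# Hypothetical titles, for illustration only (not from the real dataset)
titles <- c("Quantum dot display device",
            "QLED panel and manufacturing method",
            "Quantum computing apparatus",
            "Liquid crystal display")

# Assumed reading of: ("Quantum*" or "QD") AND ("display*") or ("QLED" or "QLCD")
matches_query <- function(title) {
  has_qd      <- grepl("\\bquantum|\\bqd\\b", title, ignore.case = TRUE, perl = TRUE)
  has_display <- grepl("\\bdisplay", title, ignore.case = TRUE, perl = TRUE)
  has_qled    <- grepl("\\b(qled|qlcd)\\b", title, ignore.case = TRUE, perl = TRUE)
  (has_qd & has_display) | has_qled
}

hits <- matches_query(titles)
titles[hits]
```

Under this reading, a title needs both a quantum-dot term and a display term, unless it mentions QLED or QLCD directly. If the site evaluates the operators strictly left to right instead, the result set would differ.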

The search returned a total of 794 patents, and I downloaded fields such as the patent number, title, abstract, application number, and full claims as CSV data. In this report, LDA is performed on the downloaded data. <Figure 2> shows the patent search results, and <Figure 3> shows an example of the downloaded data.

<Figure 2> Patent data search results
<Figure 3> Example of the downloaded data

2.2 LDA Topic Modeling

Using R studio, I performed Topic Modeling on the data collected in 2.1. The overall LDA Topic Modeling process went in this order: setting up the LDA environment, creating the corpus, creating the Term-Doc Matrix, running LDA, and extracting the LDA results.

2.2.1 LDA Environment Setup

In R studio, I installed the ldatuning, topicmodels, tm, slam, and lda packages needed for LDA Topic Modeling and loaded them as libraries to set up the environment. I then loaded the "quantum_dot_data.csv" file and, to analyze the full-claims column, assigned that column to target_doc.

# Install and load the packages
install.packages('ldatuning')
install.packages('topicmodels')
install.packages('tm')
install.packages('slam')
install.packages('lda')

library(ldatuning)
library(topicmodels)
library(tm)
library(slam)
library(lda)

# Load and preprocess the data
setwd("C:\\Users\\my\\Desktop")
lda_source<-read.csv("quantum_dot_data.csv",stringsAsFactors = FALSE, header = FALSE)
target_doc<-lda_source[,6]
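
Since the file is read with header = FALSE and the claims column is picked by position, it can be worth checking that column 6 really holds the claims text and whether the first row is a header row. Below is a minimal sketch using a made-up temporary CSV. The column names and contents are hypothetical, and the real export layout may differ.

```r
# Write a tiny stand-in for the exported CSV (hypothetical layout and contents)
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "no,patent_number,title,abstract,application_number,claims",
  "1,US0000001,Example A,Abstract A,00/000001,1. A display comprising a quantum dot layer.",
  "2,US0000002,Example B,Abstract B,00/000002,1. A pixel structure including a diode."
), tmp)

raw <- read.csv(tmp, stringsAsFactors = FALSE, header = FALSE)

# With header = FALSE the header line arrives as row 1, so inspect it first
print(raw[1, ])

# Drop that row if it is a header, then take column 6 as the documents
docs <- raw[-1, 6]
length(docs)
```

If the header row is left in, it becomes one extra "document" in the corpus, and every document index is shifted by one relative to the patent rows.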

 

2.2.2 Creating the Corpus

I create a Corpus using the VectorSource function from the tm package. Then, using the TermDocumentMatrix function, I removed numbers, punctuation, and empty data. I also treated claim, comprising, including, and includes, which are constructions that appear very often in claims, as stop words. (Note that the stop-word list in the code below only adds comprising, including, and includes, so claim can still show up among the topic keywords.)

# Create the corpus
doc_vec<-VectorSource(target_doc)
corpus<-Corpus(doc_vec)
# Term-document matrix
tdm = TermDocumentMatrix(corpus, control = list(removeNumbers = TRUE,
                                                removePunctuation = TRUE,
                                                stemming = FALSE,
                                                stopwords = c(stopwords('SMART'),
                                                              'comprising','including',
                                                              'includes'),
                                                omit_empty = TRUE))
word.count = as.array(rollup(tdm,2)) ## total count of each term across all documents
word.order = order(word.count, decreasing = TRUE)[1:1000] ## indices of the 1000 most frequent terms, most frequent first
freq.word = word.order
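
To see what these control options do, here is a small sketch on two made-up claim sentences. The text is hypothetical, and only the options that also appear in the listing above are used.

```r
library(tm)

# Two made-up claim sentences (hypothetical text)
toy_docs <- c("A display comprising a quantum dot layer and 2 electrodes.",
              "The display including a pixel, the pixel includes a diode.")

toy_corpus <- Corpus(VectorSource(toy_docs))
toy_tdm <- TermDocumentMatrix(toy_corpus,
                              control = list(removeNumbers = TRUE,
                                             removePunctuation = TRUE,
                                             stopwords = c(stopwords("SMART"),
                                                           "comprising", "including",
                                                           "includes")))

# Rows are terms, columns are documents
toy_terms <- Terms(toy_tdm)
toy_terms
as.matrix(toy_tdm)
```

The custom stop words drop out of the term list, while content words such as display stay in.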

 

2.2.3 Creating the Term-Doc Matrix

Because the data being analyzed is large, computing LDA takes a long time. To improve this, I built the Term-Doc Matrix so that only the top 1000 words are used.

# Create the dtm
dtm = as.DocumentTermMatrix(tdm[freq.word,])
dtm.matrix<-as.matrix(dtm)
dtm
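
The top-N selection can be sketched in base R on a toy count matrix. The counts and names below are hypothetical. One detail worth noting is that indexing with `[1:1000]` yields NA entries when the vocabulary has fewer than 1000 terms, so this sketch caps N at the number of available terms.

```r
# Toy term-by-document count matrix (hypothetical counts)
toy_counts <- matrix(c(5, 3,
                       1, 0,
                       2, 4,
                       0, 1),
                     nrow = 4, byrow = TRUE,
                     dimnames = list(c("pixel", "polygon", "display", "method"),
                                     c("doc1", "doc2")))

top_n <- 2
totals <- rowSums(toy_counts)             # total count per term
n_keep <- min(top_n, length(totals))      # guard: fewer terms than top_n
keep <- order(totals, decreasing = TRUE)[seq_len(n_keep)]

# Keep only the most frequent terms, then transpose to documents x terms
toy_dtm <- t(toy_counts[keep, , drop = FALSE])
toy_dtm
```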

 

2.2.4 Running LDA

I use the lda package to run the LDA analysis on the data prepared above, with Gibbs sampling. The number of topics K is set to 50, the number of iterations to 500, burnin to 100, the hyperparameter for the per-document topic distribution (alpha) to 0.01, and the hyperparameter for the word distribution within a topic (eta) to 0.01.

ldaform=dtm2ldaformat(dtm, omit_empty=T) ## convert the dtm into the data format used by the lda package

result.lda = lda.collapsed.gibbs.sampler(documents = ldaform$documents,
                                         K = 50,
                                         vocab = ldaform$vocab,
                                         num.iterations = 500,
                                         burnin = 100,
                                         alpha = 0.01,
                                         eta = 0.01)
#alpha = Dirichlet prior on each document's topic distribution / eta = Dirichlet prior on each topic's word distribution
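
To get a feel for what the sampler returns, here is a small sketch on a made-up four-document corpus. The text, K = 2, and the iteration count are hypothetical choices for illustration, not the project's settings.

```r
library(lda)

set.seed(1)  # Gibbs sampling is random; fix the seed so runs are repeatable

# Made-up mini corpus (hypothetical text)
toy_docs <- c("pixel diode layer pixel display",
              "display pixel structure diode",
              "quantum dot film light conversion",
              "quantum dot light film layer")

toy <- lexicalize(toy_docs, lower = TRUE)   # list with $documents and $vocab

K <- 2
fit <- lda.collapsed.gibbs.sampler(documents = toy$documents,
                                   K = K,
                                   vocab = toy$vocab,
                                   num.iterations = 50,
                                   alpha = 0.01,
                                   eta = 0.01)

dim(fit$topics)         # K x vocabulary size: word counts per topic
dim(fit$document_sums)  # K x number of documents: token counts per topic
top.topic.words(fit$topics, num.words = 3)
```

Small alpha and eta values like 0.01 push each document toward a few topics and each topic toward a few words, which tends to give sharper but more fragmented topics.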

 

2.2.5 Extracting the LDA Results

I extracted 20 keywords per topic and 10 key documents per topic, and saved each as a separate file.

# Check the analysis results
attributes(result.lda)
dim(result.lda$topics)
result.lda$topics
document_sums <- result.lda$document_sums

# Extract and save the keywords for each topic
topic_word <- top.topic.words(result.lda$topics, num.words = 20)
write.csv(topic_word, "topic_word.csv")

# Extract and save the key documents for each topic
num.documents = 10
doc_topic <- top.topic.documents(document_sums, num.documents, alpha = 0.1)
write.csv(doc_topic, "doc_topic.csv")
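
The document ranking step can be pictured with a base-R sketch: smooth the topic counts with alpha, turn them into per-document topic shares, and list the documents with the highest share for each topic. This is my reading of what top.topic.documents does, stated as an assumption rather than taken from the package source, and the counts are hypothetical.

```r
# Toy document_sums: rows = topics, columns = documents (hypothetical counts)
toy_sums <- matrix(c(8, 0, 3, 1,
                     0, 6, 3, 9),
                   nrow = 2, byrow = TRUE)

alpha <- 0.1
smoothed <- toy_sums + alpha
# Each column (document) is rescaled so its topic shares sum to 1
doc_topic_prop <- sweep(smoothed, 2, colSums(smoothed), "/")

# For each topic (row), list document indices from highest to lowest share
n_top <- 2
top_docs <- apply(doc_topic_prop, 1,
                  function(p) order(p, decreasing = TRUE)[seq_len(n_top)])
top_docs  # one column per topic
```

The numbers in such a table are positions in the document list passed to the sampler, not patent numbers, so they have to be mapped back to rows of the source data before reading them as patents.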

 

3. Results and Implications

3.1 Results

I took an excerpt of the results and interpreted some of the top keywords and documents for each topic. The two result files are attached below as <Figure 4> and <Figure 5>.

<Figure 4> topic_word.csv : top 20 keywords per topic
<Figure 5> doc_topic.csv : top 10 key documents per topic

 

3.2 Interpretation

Looking at <Table 2> below, I could interpret it as follows. We can infer that this topic covers patents about what the pixel structure of a quantum dot display is composed of (comprise). Each pixel is configured with its own respective diode, and the pixels have a stacked layer structure. Patents with this topic include patents No. 303, 656, 420, and 265, and <Table 3> lists the top 10 patents for this topic.

Topic 4
pixel      plurality    computational   layers       type
display    diode        defined         configured   polygon
claim      pixels       structure       system       portion
comprise   respective   regions         coupled      method

<Table 2> Top 20 keywords for Topic 4

Topic 4
303   656   420   265   418
271   434   597   261   612

<Table 3> Top 10 key documents for Topic 4

 

3.3 Implications

By applying LDA to patent data, the patent documents a user is interested in can be clustered out of a huge number of patents. This gives insight into the current state of a particular topic, its problems, and possible solutions. Analyzing a topic together with its most relevant words should also make it possible to understand where the technology currently stands and where research is heading.

4. Conclusion

Quantum-dot display technology, which is drawing attention as a next-generation technology in the TV and home appliance market, is showing a high growth rate. I therefore analyzed patent data on quantum-dot displays with LDA and clustered it by topic. I also extracted the key documents for each topic. Through this I was able to look at the structure of quantum-dot displays. By running LDA on patent data, I could grasp the research trends and current state of the subject and look at it grouped by key topic, and based on the results it should even be possible to plan a research direction for it.

 

 

Closing Thoughts

This was what first got me interested in NLP. Being able to infer the content without reading through all of it felt like a real revolution to me. Until the day I master NLP......... I need to build up the basics first and keep studying~
