
[Speech AI] Building a User-Customized Sound Classification System for the Hearing-Impaired

Potato πŸ₯” 2021. 4. 29. 13:32

Introduction

In 2019 I took an AI course and learned about computer vision. That sparked an ambition to work with unstructured data, and as a third-year student I decided to enter an academic competition where fourth-years normally present their graduation projects. Those fourth-year graduation projects were industry-linked, carried out together with practitioners from partner companies.

That experience gave me the courage to enter many more competitions :) And by sticking with it and producing a result we were satisfied with, I won a prize at a major competition for the first time in my life. With that good memory in mind, I want to review the system our team built.

 

1. ν”„λ‘œμ νŠΈ κ°œμš”

1.1 ν”„λ‘œμ νŠΈ 주제 μ„ μ • λ°°κ²½

청각μž₯애인이 μ†Œλ¦¬λ₯Ό λ“£μ§€λͺ»ν•΄ κ²©λŠ” 문제 μ •μ˜

β‘  μœ„ν—˜ 감지 어렀움

  • μ°¨ κ²½μ μ†Œλ¦¬, μ‚¬μ΄λ Œ μ†Œλ¦¬ λ“± μœ„ν—˜ 감지 μš”μ†Œκ°€ 될 수 μžˆλŠ” μ†Œλ¦¬λ₯Ό νŒŒμ•…ν•˜μ§€ λͺ»ν•˜μ—¬ μœ„ν—˜ λ…ΈμΆœ κ°€λŠ₯성이 λ†’λ‹€.
  • μ‹€μ œ 사둀) 청각μž₯μ• μΈμ˜ ν˜„μ‹€κ³Ό μš”κ΅¬μ‚¬ν•­
 

The man who brought fresh air to a hearing-assistive-device market that offered nothing but 'hearing aids'

"Why are you always walking right up against the wall?" a senior once asked me. "Because I can't hear a car honking behind me. Once, an impatient driver rammed into me and I nearly broke my back." …

brunch.co.kr

β‘‘ High price of hearing-assistive devices

  • Detecting specific sounds requires purchasing expensive hearing-assistive devices. A government subsidy program exists, but many disabled people do not actually receive it.
  • Real-world case: the reality of hearing aid subsidies
 

‘μ΅œλŒ€ 131λ§Œμ›’ 보청기 보쑰금, μ ˆμ°¨μ™€ 자격쑰건 TIP

β‘  μ΅œλŒ€ 131λ§Œμ›κΉŒμ§€ 지원 =μš°μ„  보청기 보쑰금 λŒ€μƒμžλŠ” 2~6λ“±κΈ‰ 청각μž₯μ• νŒμ •μ„ 받은 λ‚œμ²­μΈμ— ν•œν•˜λ©°, 등급별 가격 차이 없이 보청기 μ§€μ›κΈˆ μ΅œλŒ€ μ•‘μˆ˜λŠ” 131λ§Œμ›μ΄λ‹€. 이밖에 보청기 κ΅¬μž…λΆ€ν„°

news.joins.com

β‘’ Lack of realistic alternatives

  • Almost no products match what hearing-impaired people actually ask for.
  • Real-world case: the state of fire-alarm installations
 

"청각μž₯μ• μΈμš© 경보μž₯치, λͺ¨λ“  건물에 μ„€μΉ˜ μ˜λ¬΄ν™” ν•΄μ•Ό" - λΉ„λ§ˆμ΄λ„ˆ

청각μž₯μ• μΈμš© μž¬λ‚œ 경보μž₯치λ₯Ό λͺ¨λ“  건좕물에 의무적으둜 μ„€μΉ˜ν•΄μ•Ό ν•œλ‹€λŠ” μ£Όμž₯이 μ œκΈ°λ˜μ—ˆλ‹€. '(κ°€)μž₯μ• μ˜ 벽을 ν—ˆλ¬΄λŠ” μ‚¬λžŒλ“€(μ•„λž˜ μž₯λ²½ν—ˆμ‚¬)'은 μ§€λ‚œ 6일 경기도 ν™”μ„±μ—μ„œ λ°œμƒν•œ ν™”μž¬λ‘œ 60λŒ€

www.beminor.com

 

1.2 Goal Definition

Proposal of 'μ†Œλ¦¬λ―Έ (Sorimi)', a user-customized sound notification application for the hearing-impaired

  • By implementing a smartphone vibration-alert app, we aim to reduce the cost burden of hearing-assistive devices, keep the app simple to use, and let users choose exactly which sounds they want to be notified about.

 

1.3 Scope Definition

We defined the project's scope up front to sketch its outline in detail and avoid drifting away from the topic later. While actually building the system, we sometimes lost track of who the user was or why the system was needed, or wandered outside the topic; having the scope written down helped us revisit the theme and re-align during team meetings.

 

2. Design and Implementation

2.1 Project Design

2.1.1 Activity Diagram

Drawing an activity diagram let us define the program's process concretely and sketch a rough picture of the design.

 

 

2.1.2 AS-IS / TO-BE Analysis

Through an AS-IS/TO-BE analysis, we examined the problems with existing systems and what building our system would solve.

β‘  AS-IS analysis and problem definition

β‘‘ TO-BE analysis

 

2.1.3 User Journey Map

To further emphasize the need for the system, we visualized the user experience: how a user would react once the system was in place.

Looking at this user journey map, it is clear that user needs vary widely. Our end goal was to secure an installed base by offering a user-customized service that satisfies all of these needs.

 

2.2 ν”„λ‘œμ νŠΈ κ΅¬ν˜„

2.2.1 데이터 μˆ˜μ§‘

λ‘κ°€μ§€μ˜ 데이터 셋을 μ‚¬μš©ν–ˆλ‹€. 

FSD Kaggle 2018 Kaggle λŒ€νšŒλ₯Ό μœ„ν•΄ Google AudioSetμ—μ„œ λ°œμ·Œν•œ λ°μ΄ν„°μ…‹μœΌλ‘œ,
18873개의 μ‚¬μš΄λ“œ 데이터λ₯Ό 41개의 클래슀둜 λΌλ²¨λ§ν•œ 데이터
Urbansound8K λ„μ‹œ μ˜μƒμ—μ„œ μΆ”μΆœν•œ μ‚¬μš΄λ“œλ‘œ κ΅¬μ„±λœ λ°μ΄ν„°μ…‹μœΌλ‘œ,
8732개의 μ‚¬μš΄λ“œ 데이터λ₯Ό 10개의 클래슀둜 λΌλ²¨λ§ν•œ 데이터

여기에 μžˆλŠ” λͺ¨λ“  데이터λ₯Ό μ‚¬μš©ν•˜μ§€λŠ” μ•Šμ•˜κ³ , ν”„λ‘œμ νŠΈ μ£Όμ œμ— μ ν•©ν•œ 8개의 클래슀만 일뢀 μΆ”μΆœν•˜μ—¬ μ‚¬μš©ν–ˆλ‹€.
클래슀 λΆˆκ· ν˜• λ°©μ§€λ₯Ό μœ„ν•΄ μ•„λž˜ ν΄λž˜μŠ€μ—μ„œλ„ λ„ˆλ¬΄ λ§Žμ€ 데이터λ₯Ό λ³΄μœ ν•œ λ°μ΄ν„°μ˜ κ²½μš°λ„ μ΅œλŒ€ 500개의 λ°μ΄ν„°λ§Œ μ‚¬μš©ν–ˆλ‹€. λ”°λΌμ„œ 총 8개의 클래슀둜 라벨링 된 2721개의 μ‚¬μš΄λ“œ 데이터λ₯Ό ν™œμš©ν–ˆλ‹€.

μ‚¬μš©λœ 8개의 클래슀 Knock, Cough, Meow, Chime, Engine_idling, Dog-bark, Siren, Car-horn
print(metadata.class_name.value_counts())
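As an aside, the per-class cap described above can be sketched with pandas; the `metadata` frame here is a stand-in with made-up file names and counts, not the project's actual final_500.csv:

```python
import pandas as pd

# Stand-in metadata: 700 'Siren' rows and 500 'Knock' rows (hypothetical).
metadata = pd.DataFrame({
    "slice_file_name": [f"{i}.wav" for i in range(1200)],
    "class_name": ["Siren"] * 700 + ["Knock"] * 500,
})

# Keep at most 500 clips per class to limit class imbalance.
capped = metadata.groupby("class_name", group_keys=False).head(500)

print(capped.class_name.value_counts())
```
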


2.2.2 Visualizing the Data

 

Sound waveforms vary widely in shape, so we visualized the data to see what it looks like.

import librosa
import librosa.display
import matplotlib.pyplot as plt

filename = 'c:/code/dataset/kaggle_sample/00fbb28b.wav'
plt.figure(figsize = (12, 4))
data, sample_rate = librosa.load(filename)  # fixed: unpack the (audio, sample rate) tuple
_ = librosa.display.waveplot(data, sr = sample_rate)

The waveforms for each class looked like this.

 

2.2.3 Analyzing Data Attributes

β‘  Audio Channels

  • Audio comes in two types: stereo and mono
  • Stereo has two channels; mono has one

  • The collected data appeared to be a mix of the two types

β‘‘ Sample Rate

  • The number of samples taken per unit of time (usually per second) from a continuous signal to produce a discrete one
  • The sample rates in the collected data varied widely, from 96 kHz down to 11 kHz
print(audiodf.sample_rate.value_counts(normalize = True))

 

β‘’ Bit Depth

  • The granularity with which audio amplitude is represented; called the bit depth.
  • The bit depths of the collected data were also widely spread.
print(audiodf.bit_depth.value_counts(normalize = True))

 

2.2.4 Data Preprocessing

β‘  Merging into a single channel

  • Stereo clips were merged into one channel using the average of the two channels, converting them to mono.
import matplotlib.pyplot as plt

# shape of the original (stereo) data
plt.figure(figsize=(12, 4))
plt.plot(scipy_audio)

# the merged single channel
plt.figure(figsize=(12, 4))
plt.plot(librosa_audio)

β‘‘ Standardizing the Sample Rate values

  • We standardized the widely varying sample rate values.
  • Sample-rate conversion was applied to bring the various sample rates to a common value.
    • Sample-rate conversion? The process of changing the sampling rate of a discrete signal to obtain a new discrete representation of the underlying continuous signal
  • Using the librosa module's load function, every clip was converted to a sample rate of 22.05 kHz.
    • In the form librosa.load(filename, sr=None): unless sr is explicitly set to None, resampling to librosa's default happens automatically
    • Since this project did not set sr, we can tell the data was resampled automatically.
    • The resampled data is also normalized to values between -1 and 1
import librosa 
from scipy.io import wavfile as wav

import numpy as np

filename = 'UrbanSound_Dataset_sample/audio/102857-5-0-0.wav' 

librosa_audio, librosa_sample_rate = librosa.load(filename) 
scipy_sample_rate, scipy_audio = wav.read(filename) 

print('Original sample rate:', scipy_sample_rate) 
print('Librosa sample rate:', librosa_sample_rate) 
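The rate conversion itself can be illustrated without librosa; this sketch uses scipy.signal.resample_poly (not part of the project) to halve a 44.1 kHz signal down to librosa's 22,050 Hz default:

```python
import numpy as np
from scipy.signal import resample_poly

orig_sr, target_sr = 44100, 22050

# One second of a 440 Hz sine at the original rate.
t = np.arange(orig_sr) / orig_sr
x = np.sin(2 * np.pi * 440 * t)

# Rational resampling: up=1, down=2 maps 44100 Hz -> 22050 Hz
# (resample_poly low-pass filters before decimating).
y = resample_poly(x, up=1, down=2)

print(len(x), len(y))
```
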

β‘’ Normalizing the Bit Depth values

  • Using the librosa module's load function, each clip was normalized so that its amplitude values fall between -1 and 1.
print('Original audio file min~max range:', np.min(scipy_audio), 'to', np.max(scipy_audio))
print('Librosa audio file min~max range:', np.min(librosa_audio), 'to', np.max(librosa_audio))

 

2.2.5 Feature Extraction

Unlike other unstructured data, sound needs features that capture both its frequency and its time-series characteristics. After several failures and a lot of study with my teammates, we learned that features should be extracted with the MFCC algorithm. I will leave a detailed post on MFCC for later; here I will cover just the background and code the project needed, and finish uploading this project first.

  • MFCC (Mel-Frequency Cepstral Coefficients)
    • A representative speech-recognition technique that splits a sound into short segments and extracts features by analyzing the spectrum of each segment
    • Because MFCCs stay fairly stable even when pitch changes, they are applicable to a very wide range of tasks
    • They capture both the frequency and the temporal characteristics of a sound
    • Since sound is time-series data we needed temporal analysis too, and the system must recognize irregular environmental sounds as the same sound even at different pitches, so we decided to use the MFCC algorithm.
import librosa 
import numpy as np
import pandas as pd
import os

max_frames = 0  # longest MFCC frame count seen (renamed from `max` to avoid shadowing the builtin)

def extract_features(file_name):
   
    try:
        audio, sample_rate = librosa.load(file_name, res_type='kaiser_fast') 
        mfccs = librosa.feature.mfcc(y=audio, sr=sample_rate, n_mfcc=40)
#         print(mfccs.shape)  # uncomment to inspect the per-file MFCC shape
        mfccsscaled = np.mean(mfccs.T, axis=0)
        global max_frames
        if mfccs.shape[1] > max_frames:
            max_frames = mfccs.shape[1]
#             print(max_frames)

    except Exception as e:
        print("Error encountered while parsing file: ", file_name)
        return None 
     
    return mfccsscaled
    
    
# set the dataset path
fulldatasetpath = 'C:/code/cap_sound/dataset/final_dataset_500/'

metadata = pd.read_csv('C:/code/cap_sound/dataset/final_500.csv')

features = []

# extract features for each sound clip
for index, row in metadata.iterrows():
    
    file_name = os.path.join(os.path.abspath(fulldatasetpath)+'/'+str(row["slice_file_name"]))
    
    class_label = row["class_name"]
    data = extract_features(file_name)
    
    features.append([data, class_label])

# build a DataFrame
featuresdf = pd.DataFrame(features, columns=['feature','class_label'])

print('Finished feature extraction from ', len(featuresdf), ' files') 
print('Max :', max_frames)

 

To summarize, looking at the extracted features:

mfccs = librosa.feature.mfcc(y = librosa_audio, sr = librosa_sample_rate, n_mfcc = 40)
print(mfccs.shape)

import librosa.display
librosa.display.specshow(mfccs, sr = librosa_sample_rate, x_axis = 'time')

 

2.2.6 λͺ¨λΈ ꡬ좕

배운 지식을 ν™œμš©ν•΄λ³΄κ³ μž μ„Έ κ°€μ§€μ˜ λͺ¨λΈμ„ κ΅¬μΆ•ν–ˆκ³ , 각 λͺ¨λΈ 별 Accuracyλ₯Ό 보고 μ΅œμ’… λͺ¨λΈμ„ κ²°μ •ν•˜μ˜€λ‹€. 

β‘  데이터 λΆ„ν•  및 μ €μž₯

  • 데이터 λ³€ν™˜ 및 라벨 인코더 적용
    • Sklearn의 labelEncoderλ₯Ό μ μš©ν–ˆλ‹€.
    • λ²”μ£Όν˜• ν…μŠ€νŠΈ 데이터λ₯Ό 이해할 수 μžˆλŠ” μˆ˜μΉ˜ν˜• λ°μ΄ν„°λ‘œ λ³€ν™˜ν•˜κΈ° μœ„ν•΄μ„œ
from sklearn.preprocessing import LabelEncoder
from keras.utils import to_catagorical

x = np.array(featuresdf.feature.tolist())
y = np.array(featuresdf.class_label.tolist())

le = LabelEncoder()
yy = to_categorical(le.fit_transform(y))

 

β‘‘ Data splitting

  • We applied sklearn's train_test_split with train:test = 8:2.
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x, yy, test_size = 0.2, random_state = 42)  # fixed: X -> x

 

2.2.6.1 MLP (Multi-Layer Perceptron)

  • Why this model
    • As the most basic neural network, its simple input/hidden/output structure is easy to implement and fast to train, so we adopted it
    • It is mainly used for supervised-learning problems, notably speech and image recognition
  • Model implementation
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.optimizers import Adam
from keras.utils import np_utils
from sklearn import metrics 

num_labels = yy.shape[1]
filter_size = 2

# Construct model 
model = Sequential()

model.add(Dense(256, input_shape=(40,)))
model.add(Activation('relu'))
model.add(Dropout(0.5))

model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))

model.add(Dense(num_labels))
model.add(Activation('softmax'))
  • Three fully connected layers
  • Activation functions per layer: ReLU, ReLU, softmax
  • Batch size 256 / epochs varied: 100, 150, …, up to 1000
  • Training
from keras.callbacks import ModelCheckpoint 
from datetime import datetime 

num_epochs = 1000
num_batch_size = 32

# added: the model must be compiled before fit() can run
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')

# TODO: check whether reusing the already-built model here is right
checkpointer = ModelCheckpoint(filepath='C:/code/cap_sound/save_models/weights.best.basic_mlp.hdf5', 
                               verbose=1, save_best_only=True)
start = datetime.now() 

model.fit(x_train, y_train, batch_size=num_batch_size, epochs=num_epochs, validation_data=(x_test, y_test), callbacks=[checkpointer], verbose=1)


duration = datetime.now() - start
print("Training completed in time: ", duration)
  • In practice, training was very fast (about 38 seconds at 1000 epochs).
  • It was convenient that the preprocessed, MFCC-transformed data could be fed in directly without further reshaping.

 

2.2.6.2 CNN (Convolutional Neural Network)

  • Why this model
    • Essentially an extension of the MLP, combining input, convolution, pooling, and fully connected layers; in most cases it outperforms a plain MLP, so we adopted it

β‘  Preprocessing, feature extraction, and zero-padding

# redefine the extract_features function
import numpy as np
max_pad_len = 1287  # fixed typo: max_ped_len

def extract_features(file_name):
    try:
        audio, sample_rate = librosa.load(file_name, res_type = 'kaiser_fast')
        mfccs = librosa.feature.mfcc(y=audio, sr = sample_rate, n_mfcc = 40)
        pad_width = max_pad_len - mfccs.shape[1]  # fixed: mfcc -> mfccs
        mfccs = np.pad(mfccs, pad_width = ((0, 0), (0, pad_width)), mode = 'constant')
    
    except Exception as e:
        print("Error encountered: ", file_name)
        return None
    
    return mfccs

import pandas as pd
import os
import librosa

fulldatasetpath = 'C:/AI/final_dataset_500/'
metadata = pd.read_csv('C:/AI/final_500.csv')  # fixed mismatched quotes
features = []

for index, row in metadata.iterrows():
    file_name = os.path.join(os.path.abspath(fulldatasetpath) + '/' + str(row["slice_file_name"]))
    
    class_label = row['class_name']
    data = extract_features(file_name)
    
    features.append([data, class_label])

featuresdf = pd.DataFrame(features, columns = ['feature', 'class_label'])

print('Finished feature extraction from', len(featuresdf), 'files')

β‘‘ Reshaping the training data

num_rows = 40
num_columns = 1287
num_channels = 1

print("train data shape")  # fixed typo: prnt
print(x_train.shape)
print(x_test.shape)

x_train = x_train.reshape(x_train.shape[0], num_rows, num_columns, num_channels)
x_test = x_test.reshape(x_test.shape[0], num_rows, num_columns, num_channels)

print("\ntrain data shape after reshape")
print(x_train.shape)
print(x_test.shape)

β‘’ λͺ¨λΈ κ΅¬ν˜„

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D, GlobalAveragePooling2D
from keras.optimizers import Adam
from keras.utils import np_utils
from sklearn import metrics

num_labels = yy.shape[1]
filter_size = 2

#CNNλͺ¨λΈ κ΅¬ν˜„
model = Sequential()
model.add(Conv2D(filters = 16, kernel_size = 2, input_shape = (num_rows, num_columns, num_channels), activation = 'relu'))
model.add(MaxPooling2D(pool_size = 2))
model.add(Dropout(0.2))

model.add(Conv2D(filters = 32, kernel_size = 2, activation = 'relu'))
model.add(MaxPooling2D(pool_size = 2))
model.add(Dropout(0.2))

model.add(Conv2D(filters = 64, kernel_size = 2, activation = 'relu'))
model.add(MaxPooling2D(pool_size = 2))
model.add(Dropout(0.2))

model.add(Conv2D(filters = 128, kernel_size = 2, activation = 'relu'))
model.add(MaxPooling2D(pool_size = 2))
model.add(Dropout(0.2))
model.add(GlobalAveragePooling2D())

model.add(Dense(num_labels, activation = 'softmax'))


#컴파일
model.compile(loss = 'categorical_crossentropy'
				, metrics = ['accuracy']
                , optimizer = 'adam')
                
model.summary()
score = model.evaluate(x_test, y_test, verbose = 1)
accuracy = 100 * score[1]

print('Pre-training accuracy: %.4f%%' % accuracy)
  • λͺ¨λΈ μ„€λͺ…
    • 4개의 convolution 계측과 4개의 Max pooling κ³„μΈ΅μœΌλ‘œ ꡬ성
    • 각 layer의 ν™œμ„±ν™”ν•¨μˆ˜λŠ” ReLUν•¨μˆ˜, 좜λ ₯측은 softmaxν•¨μˆ˜λ₯Ό 이용
    • drop out 값은 0.2
  • ν•™μŠ΅ μ§„ν–‰
    • batch 256
    • epochλŠ” 100, 150,,, 1000 κΉŒμ§€ λ‹€μ–‘ν•˜κ²Œ μˆ˜ν–‰ν•΄λ΄„
  • νŠΉμ΄μ‚¬ν•­
    • CNNλͺ¨λΈμ—μ„œλŠ” 인풋 λ°μ΄ν„°μ˜ 크기가 (3차원)으둜 λͺ¨λ‘ 동일해야함
    • κ·ΈλŸ¬λ‚˜ MFCC μ „μ²˜λ¦¬κ°€ 적용된 λ°μ΄ν„°μ˜ 경우 λ‹€μŒκ³Ό κ°™μœΌ λͺ¨λ‘ λ‹€μ–‘ν•œ shape을 λ³΄μœ ν•˜κΈ°μ— λͺ¨λ‘ λ™μΌν•œ ν˜•νƒœλ‘œ reshape이 λ”°λ‘œ ν•„μš”ν–ˆμŒ. 

MFCC μ „μ²˜λ¦¬λœ λ°μ΄ν„°μ˜ λ‹€μ–‘ν•œ shape

  • λ”°λΌμ„œ max len = 1287둜 μ œλ‘œνŒ¨λ”©μ„ μˆ˜ν–‰ν•΄μ£Όμ—ˆμŒ.
  • 결둠적으둜 데이터 shapeλŠ” μ•„λž˜ ν‘œ 처럼 변화함
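A minimal sketch of that zero-padding step; the (40, 431) input shape is an arbitrary illustrative example, while the 40 coefficients and the 1287-frame target follow the post:

```python
import numpy as np

max_pad_len = 1287

# Hypothetical MFCC matrix: 40 coefficients x 431 frames.
mfccs = np.random.rand(40, 431)

# Pad only the time axis (columns) with zeros up to max_pad_len,
# so every clip ends up with the same (40, 1287) shape for the CNN.
pad_width = max_pad_len - mfccs.shape[1]
padded = np.pad(mfccs, pad_width=((0, 0), (0, pad_width)), mode="constant")

print(padded.shape)
```
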

 

2.2.6.3 LSTM (Long Short-Term Memory)

  • Why this model
    • It improves on the vanishing gradient problem, a well-known weakness of RNNs, and outperforms a plain RNN in most cases, so we adopted it

β‘  Reshaping the training data

print("train data shape")
print(x_train.shape)
print(x_test.shape)

#x_train=x_train.reshape(2176,40,1)
x_train = np.reshape(x_train, (len(x_train), len(x_train[0]), -1))
x_test = np.reshape(x_test, (len(x_test), len(x_test[0]), -1))

#print(y_train.shape)
#print(y_test.shape)

print("\ntrain data shape after reshape")
print(x_train.shape)
print(x_test.shape)

 

β‘‘ λͺ¨λΈ κ΅¬ν˜„

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, Conv2D, MaxPooling2D, GlobalAveragePooling2D
from keras.layers import GRU, LSTM, Embedding               # RNN
from keras.optimizers import Adam
from keras.utils import np_utils
from sklearn import metrics 

num_labels = y_train.shape[1]

#LSTM λͺ¨λΈ κ΅¬ν˜„
model = Sequential()    
model.add(LSTM(256,input_shape=(40,1),return_sequences=False))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
	#model.add(TimeDistributed(Dense(vocabulary)))
model.add(Dense(num_labels, activation='softmax'))

model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
# Display model architecture summary 
model.summary()
  •  λͺ¨λΈ μ„€λͺ…
    • LSTM λͺ¨λΈμ— 완전연결계측 5개 μ—°κ²°
    • 각 κ³„μΈ΅μ—μ„œ ReLUν•¨μˆ˜, 좜λ ₯κ³„μΈ΅μ—μ„œ softmaxν•¨μˆ˜λ₯Ό μ‚¬μš©
    • 5개의 Dense
    • drop out 0.2
  • λ‹€μŒκ³Ό 같이 ν•™μŠ΅ μ§„ν–‰
    •  Batch 256
    • epochλŠ” 100, 150,,,,, 1000 κΉŒμ§€ λ‹€μ–‘ν•˜κ²Œ μ§„ν–‰ν•΄λ΄„
  • νŠΉμ΄μ‚¬ν•­
  • LSTMλͺ¨λΈμ„ μžμ—°μ–΄ μ²˜λ¦¬μ— μ‚¬μš©μ‹œ, 인풋데이터λ₯Ό 숫자 ν˜•νƒœλ‘œ λ³€ν™˜ν•˜κΈ°μœ„ν•΄ 주둜 원-핫인코딩을 μ μš©ν•œλ‹€.
  • μ•žμ„œ MLPλͺ¨λΈμ—μ„œ μ •μ˜ν•œ μ „μ²˜λ¦¬ν•¨μˆ˜ (Extract_feature) μ—μ„œ MFCCμ μš©μ„ 톡해 인풋데이터λ₯Ό 숫자둜 λ³€ν™˜ν•˜μ˜€κΈ°μ— λ³„λ„μ˜ 인코딩은 μ§„ν–‰ν•˜μ§€ μ•Šμ•˜λ‹€. λ‹€λ§Œ, λ°μ΄ν„°μ˜ shape만 살짝 λ°”κΎΈμ–΄ μ£Όμ—ˆλ‹€.

 

2.2.7 λͺ¨λΈ 정확도 비ꡐ 및 λͺ¨λΈ 선택

MLP, CNN, LSTM 각각 epoch 300, eopch 300, eopch 600μ—μ„œ accuracy κ°€ κ°€μž₯ λ†’κ²Œ λ‚˜νƒ€λ‚¬λ‹€. μ •ν™•λ„λŠ” CNN > MLP > LSTM 순으둜 λ†’μ•˜κ³ , ν•™μŠ΅ μ‹œκ°„μ€ MLP, LSTM, CNN 순으둜 μ§§μ•˜λ‹€. LSTM은 정확도가 85% 미만이고, ν•™μŠ΅μ‹œκ°„μ€ MLP λŒ€λΉ„ κΈΈκΈ° λ•Œλ¬Έμ— λͺ¨λΈ μ±„νƒμ—μ„œ μ œμ™Έν•˜κ³ , μ„±λŠ₯이 μ’‹μ§€λ§Œ μ‹œκ°„μ΄ μ˜€λž˜κ±Έλ¦¬λŠ” CNNκ³Ό μ„±λŠ₯이 λ‹€μ†Œ λ–¨μ–΄μ§€μ§€λ§Œ ν•™μŠ΅μ‹œκ°„μ΄ 짧은 MLP 쀑에 λͺ¨λΈμ„ μ„ νƒν•˜κΈ°λ‘œ ν•œλ‹€.

2.2.7.1 MFCC λ³€κ²½

κ΅μˆ˜λ‹˜κ»˜μ„œ 인풋 λ°μ΄ν„°μ˜ 길이가 κΈ΄ κ²½μš°μ™€ 짧은 경우의 것쀑 μ–΄λŠ 것이 더 정확도가 높을지에 λŒ€ν•œ μΆ”κ°€ 연ꡬ가 ν•„μš”ν•  κ²ƒμ΄λΌλŠ” μ˜κ²¬μ„ μ£Όμ…§λ‹€. λ”°λΌμ„œ MFCC 의 ν”„λ ˆμž„ 길이λ₯Ό λ³€κ²½ν•΄μ„œ λͺ¨λΈ ν•™μŠ΅μ„ μ§„ν–‰ν•΄λ³΄κΈ°λ‘œ ν–ˆλ‹€.

일반적으둜 MFCCμ—μ„œλŠ” 이 ν”„λ ˆμž„μ˜ 길이가 20~40ms 정도이닀. ν”„λ ˆμž„μ˜ 길이가 λ„ˆλ¬΄ κΈΈλ©΄ 주파수 λΆ„μ„μ—μ„œ 신뒰도가 λ–¨μ–΄μ§„λ‹€. λ˜ν•œ ν”„λ ˆμž„μ˜ 길이가 λ„ˆλ¬΄ 짧으면 ν•œ ν”„λ ˆμž„ λ‚΄ μ‹ ν˜Έ λ³€ν™”κ°€ 컀지기 떄문에 μ’‹μ§€ μ•Šμ•˜λ‹€. κ°€μž₯ 쒋은 λͺ¨λΈ MLP 와 CNN λͺ¨λΈ 쀑, MFCC 길이의 값을 λ³€κ²½ν•΄μ„œ 또 ν•™μŠ΅μ„ μ§„ν•΄ν•΄ λ³΄μ•˜λ‹€. MFCC=20μΌλ•Œμ™€ MFCC=40μΌλ•Œμ˜ λͺ¨λΈμ„ λΉ„κ΅ν•˜μ—¬ μ΅œμ’… λͺ¨λΈμ„ 선택할 것이닀.
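To make the frame-length trade-off concrete, here is a quick frame-count calculation; the 22,050 Hz rate matches the resampled data, but the 25 ms frame, 10 ms hop, and 4-second clip are assumed illustrative values, not settings from the project:

```python
sr = 22050                  # sample rate after resampling
frame_ms, hop_ms = 25, 10   # assumed analysis frame and hop lengths

frame_len = int(sr * frame_ms / 1000)  # samples per frame
hop_len = int(sr * hop_ms / 1000)      # samples between frame starts

clip_seconds = 4
n_samples = sr * clip_seconds

# Number of full frames that fit in the clip: longer frames mean
# fewer, coarser analysis windows; shorter frames mean more of them.
n_frames = 1 + (n_samples - frame_len) // hop_len

print(frame_len, hop_len, n_frames)
```
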

λͺ¨λΈ ν•™μŠ΅ κ²°κ³Ό, 우리 νŒ€μ˜ μ΅œμ’… λͺ¨λΈμ€ MFCC = 40 μΌλ•Œ, CNNλͺ¨λΈμ„ μ‚¬μš©ν•˜κΈ°λ‘œ ν–ˆλ‹€.

2.2.8 Example Results

Let's classify some sounds with the final model.

def print_prediction(file_name):
    prediction_feature = extract_features(file_name)  # fixed name: extract_feature -> extract_features
    prediction_feature = np.array([prediction_feature])  # add a batch dimension for predict

    predicted_vector = model.predict_classes(prediction_feature)
    predicted_class = le.inverse_transform(predicted_vector) 
    print("The predicted class is:", predicted_class[0], '\n') 

    predicted_proba_vector = model.predict_proba(prediction_feature) 
    predicted_proba = predicted_proba_vector[0]
    for i in range(len(predicted_proba)): 
        category = le.inverse_transform(np.array([i]))
        print(category[0], "\t\t : ", format(predicted_proba[i], '.32f') )
filename = 'UrbanSound Dataset sample/audio/100648-1-0-0.wav'
print_prediction(filename) 

μžλ™μ°¨ μ†Œλ¦¬λ₯Ό λ¬΄μ‚¬νžˆ μžλ™μ°¨ μ†Œλ¦¬λΌκ³  λΆ„λ₯˜ν•  수 μžˆμ—ˆλ‹€!!

filename = '../Evaluation audio/siren_1.wav'
print_prediction(filename) 

λ‹€λ₯Έ 데이터셋을 μ‚¬μš©ν•΄λ„ λ¬΄μ‚¬νžˆ μ‚¬μ΄λ Œ μ†Œλ¦¬λ₯Ό μ‚¬μ΄λ Œ μ†Œλ¦¬λ‘œ ꡬ별할 μˆ˜μžˆμ—ˆλ‹€.
μš°λ¦¬νŒ€μ€ 일정 μ†Œλ¦¬μœΌ 

2.2.9 Building the Prototype

Our topic was originally an 'application', but within the time limit it was unrealistic to get fluent enough in Android and Java to actually build one. So our team decided to build an exe file that could pass for an application and used it as the prototype. We were able to build it by wiring the Python model up to PyQt.

We built the prototype as follows.

Prototype PyQt run screen

 

Prototype workflow

This last part was a teammate's work rather than mine, so I will not attach detailed code and will just upload the final result.

Even though it was not a real application, I think demonstrating the prototype live, showing how it would be used in everyday life, contributed most to winning the gold prize at the conference. Most teams showed only the possibility of classification by feeding a sound in through code and printing the output; our team actually played sounds out loud and showed the entire classification process on video, which is probably why we earned so much recognition.

We made it so the user selects a sound and chooses the vibration pattern they want for it. We had wanted to use actual vibration so hearing-impaired users could feel sounds through touch, but since this was a presentation, we implemented a beep--- sound instead for the prototype's audible and visual effect.

 

 

3. Expected Benefits and Future Work

  • Expected benefits
    • Installed in high-crime areas, the device could serve crime prevention
    • Could evolve into an application that alerts users to car horns and engine sounds even while wearing earphones
    • Could cut the cost of facilities for the hearing-impaired
    • Could prevent dangers that arise from not hearing sounds
    • A customized sound classification system can improve quality of life
    • There is enormous room to grow across industries: detecting hazardous incidents in manufacturing, voice recognition in gaming and in media (OTT and the like), chatbots in finance, and more
  • Limitations and future work
    • Classification accuracy can vary with the surrounding environment, so further research on noise is needed.
    • The performance of the microphone picking up the sound also matters significantly.
    • When several target sounds occur simultaneously, recognizing them all is difficult.
    • Model improvements are therefore needed to handle simultaneous, overlapping sounds.
    • The number of sound classes should be expanded so the system can be used across manufacturing, process control, digital healthcare, finance, gaming, and other industries.

 

 

Closing Remarks ... :)

This long, long post is finally done. We never managed to implement the application perfectly, but even so I got to study how AI is applied in industry and what kinds of features it can power in an app. PyQt was also new to me, and learning that you can build a simple program with it was a pleasant discovery.

Finally,,, after about three months of hard work we beat every senior team and every other team, so I'll close this post with a photo of the award!!!!!!!!!!! I was so happy!!!!!!!!!!

Hard work never betrays you :)

 

λ°˜μ‘ν˜•