21.12.18 나이브베이즈 분류
2021. 12. 19. 20:52ㆍ작업/머신러닝
실습1 유방암검사키트
def main():
sensitivity = float(input())
prior_prob = float(input())
false_alarm = float(input())
print("%.2lf%%" % (100 * mammogram_test(sensitivity, prior_prob, false_alarm)))
def mammogram_test(sensitivity, prior_prob, false_alarm):
p_a1_b1 = sensitivity # p(A = 1 | B = 1)
p_b1 = prior_prob # p(B = 1)
p_b0 = 1 - p_b1 # p(B = 0)
p_a1_b0 = false_alarm # p(A = 1|B = 0)
p_a1 = p_a1_b0 * p_b0 + p_a1_b1 * p_b1 # p(A = 1) = p(A=1|B=0) + P(A=1|B=1)P(B=1)
p_b1_a1 = p_a1_b1 * p_b1 / p_a1 # p(B = 1|A = 1)
return p_b1_a1
# 40대 여성이 mammogram(X-ray) 검사를 통해 유방암 양성 의심 판정을 받았을 때 유방암을 실제로 가지고 있을 확률은 어떻게 될까요?
if __name__ == "__main__":
main()
실습 4 Bag of Words
import re
special_chars_remover = re.compile("[^\w'|_]")
def main():
sentence = "Bag-of-Words 모델을 Python으로 직접 구현하겠습니다."
bow = create_BOW(sentence)
print(bow)
def create_BOW(sentence):
bow = {}
sentence_lowered = sentence.lower()
sentence_without_special_characters = remove_special_characters(sentence_lowered)
splitted_sentence = sentence_without_special_characters.split()
splitted_sentence.append('')
splitted_sentence_filtered = [
token
for token in splitted_sentence
if len(token) >= 1
]
print(sentence_without_special_characters)
print(splitted_sentence_filtered)
for token in splitted_sentence_filtered:
# sol2. bow.setdefault(token,0)
# bow[token] +=1
if token not in bow:
bow[token]=1
else:
bow[token]+=1
return bow
def remove_special_characters(sentence):
return special_chars_remover.sub(' ', sentence)
if __name__ == "__main__":
main()
'작업 > 머신러닝' 카테고리의 다른 글
22.01.12 딥러닝 개론 (0) | 2022.01.12 |
---|---|
21.12.19 모의테스트 (0) | 2021.12.19 |
21.12.18 회귀분석 (0) | 2021.12.19 |
21.12.18 선형대수 / Numpy (0) | 2021.12.19 |
21.12.18 머신러닝 분류(Classification) (0) | 2021.12.19 |