21.12.18 나이브베이즈 분류

2021. 12. 19. 20:52작업/머신러닝

실습1 유방암검사키트

def main():
    sensitivity = float(input())
    prior_prob = float(input())
    false_alarm = float(input())

    print("%.2lf%%" % (100 * mammogram_test(sensitivity, prior_prob, false_alarm)))

def mammogram_test(sensitivity, prior_prob, false_alarm):
    p_a1_b1 = sensitivity # p(A = 1 | B = 1)

    p_b1 = prior_prob    # p(B = 1)

    p_b0 = 1 - p_b1    # p(B = 0)

    p_a1_b0 = false_alarm # p(A = 1|B = 0)

    p_a1 = p_a1_b0 * p_b0 + p_a1_b1 * p_b1    # p(A = 1) = p(A=1|B=0) + P(A=1|B=1)P(B=1)

    p_b1_a1 = p_a1_b1 * p_b1 / p_a1 # p(B = 1|A = 1)

    return p_b1_a1
# 40대 여성이 mammogram(X-ray) 검사를 통해 유방암 양성 의심 판정을 받았을 때 유방암을 실제로 가지고 있을 확률은 어떻게 될까요?

if __name__ == "__main__":
    main()

 

실습 4 Bag of Words

import re

special_chars_remover = re.compile("[^\w'|_]")

def main():
    sentence =  "Bag-of-Words 모델을 Python으로 직접 구현하겠습니다."
    bow = create_BOW(sentence)

    print(bow)


def create_BOW(sentence):
    bow = {}
    sentence_lowered = sentence.lower()
    sentence_without_special_characters = remove_special_characters(sentence_lowered)
    splitted_sentence = sentence_without_special_characters.split()
    splitted_sentence.append('')
    splitted_sentence_filtered = [
        token
        for token in splitted_sentence
        if len(token) >= 1
    ]
    print(sentence_without_special_characters)
    print(splitted_sentence_filtered)
    
    for token in splitted_sentence_filtered:
        # sol2. bow.setdefault(token,0)
        # bow[token] +=1
        if token not in bow:
            bow[token]=1
        else:
            bow[token]+=1
    return bow


def remove_special_characters(sentence):
    return special_chars_remover.sub(' ', sentence)


if __name__ == "__main__":
    main()

 

'작업 > 머신러닝' 카테고리의 다른 글

22.01.12 딥러닝 개론  (0) 2022.01.12
21.12.19 모의테스트  (0) 2021.12.19
21.12.18 회귀분석  (0) 2021.12.19
21.12.18 선형대수 / Numpy  (0) 2021.12.19
21.12.18 머신러닝 분류(Classification)  (0) 2021.12.19