Posts in the 'Python/MachineLearning' category

[NLP][ML] String-based category classification prediction model

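1. Collecting string data
This project was built on internal company data, so the data cannot be shared.
2. Text preprocessing
1) Separating Korean and English
Using re:
import re
# Separate Korean and English
korean_text = re.findall(r"[가-힣]+", text)
english_text = re.findall(r"[a-zA-Z]+", text)
print("Korean:", korean_text)
print("English:", english_text)
Korean text is tokenized and stripped of stopwords with the konlpy library. In Korean, whitespace alone is not enough to split text into morphemes, and context is hard to capture with simple rules. konlpy relies on Java packages, so it can be used only after a JDK is installed. The Okt tagger is used for stopword handling. Large-scale ..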
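The split-then-filter idea described above can be sketched in plain Python. This is only a rough illustration, not the post's pipeline: the stopword sets, the sample string, the lowercasing step, and the function name `split_and_filter` are all assumptions, and the regex chunks stand in for the morpheme tokens that konlpy's Okt would produce.

```python
import re

# Hypothetical stopword lists, for illustration only.
KOREAN_STOPWORDS = {"및", "용"}
ENGLISH_STOPWORDS = {"the", "for"}


def split_and_filter(text):
    """Split text into Korean and English chunks, then drop stopwords."""
    # Runs of Hangul syllables and runs of ASCII letters, as in the post.
    korean = re.findall(r"[가-힣]+", text)
    # Lowercasing is an added assumption so stopword matching ignores case.
    english = [word.lower() for word in re.findall(r"[a-zA-Z]+", text)]

    korean = [word for word in korean if word not in KOREAN_STOPWORDS]
    english = [word for word in english if word not in ENGLISH_STOPWORDS]
    return korean, english


# Hypothetical sample string mixing both languages.
sample = "사무실 용 의자 및 Desk for the Office"
korean_text, english_text = split_and_filter(sample)
print("Korean:", korean_text)
print("English:", english_text)
```

In a real pipeline the Korean branch would presumably go through a morphological analyzer before stopword removal, since whitespace-separated chunks are not morphemes.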

Python/MachineLearning · 2025. 1. 22.

Image classification with the Tensorflow dataset 'cats_vs_dogs'

Installing libraries
# pip install opencv-python
# pip install tensorflow-datasets==4.6.0
# 'cats_vs_dogs' runs on tfds version 4.6.0
# Fix for errors with the latest tfds version
# pip install tfds-nightly
import cv2
import matplotlib.pyplot as plt
import tensorflow_datasets as tfds
import tensorflow as tf
If an error occurs
# Reset the URL if a URL-change error occurs
setattr(tfds.image_classification.cats_vs_dogs, '_URL',"https://download.microsoft.com/download/3/E/1/3E1C..

Python/MachineLearning · 2024. 4. 1.

[Scikit-Learn] K-Nearest Neighbors (KNN) algorithm: classification

Execution order
1. Collect data
2. Preprocess data
3. Train the model (fit)
4. Validate the model (score)
5. Predict (predict)
1. Data collection
Generate arbitrary data
# Bream length
bream_length = [25.4, 26.3, 26.5, 29.0, 29.0, 29.7, 29.7, 30.0, 30.0, 30.7, 31.0, 31.0, 31.5, 32.0, 32.0, 32.0, 33.0, 33.0, 33.5, 33.5, 34.0, 34.0, 34.5, 35.0, 35.0, 35.0, 35.0, 36.0, 36.0, 37.0, 38.5, 38.5, 39.5, 41.0, 41.0]
# Bream weight
bream_weight = [242.0, 290.0, 340.0, 363.0, 430.0, 45..
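To make the fit, score, predict sequence concrete, here is a small plain-Python sketch of what a k-nearest neighbors classifier does. It is not scikit-learn's implementation: the class `SimpleKNN` is hypothetical and only mirrors the method names that scikit-learn's `KNeighborsClassifier` exposes. The bream rows reuse the first five length and weight values listed above, while the second class is made-up data added purely so there is something to classify against.

```python
import math
from collections import Counter


class SimpleKNN:
    """Hypothetical minimal KNN classifier with fit/score/predict methods."""

    def __init__(self, n_neighbors=3):
        self.n_neighbors = n_neighbors

    def fit(self, X, y):
        # KNN "training" only stores the samples and their labels.
        self.X = list(X)
        self.y = list(y)
        return self

    def _predict_one(self, x):
        # Sort stored samples by Euclidean distance to x, keep the k closest.
        nearest = sorted(
            zip(self.X, self.y), key=lambda pair: math.dist(pair[0], x)
        )[: self.n_neighbors]
        # Majority vote among the k nearest labels.
        return Counter(label for _, label in nearest).most_common(1)[0][0]

    def predict(self, X):
        return [self._predict_one(x) for x in X]

    def score(self, X, y):
        # Accuracy: fraction of samples whose prediction matches the label.
        predictions = self.predict(X)
        return sum(p == t for p, t in zip(predictions, y)) / len(y)


# First five bream samples from the post: [length, weight], labeled 1.
bream = [[25.4, 242.0], [26.3, 290.0], [26.5, 340.0], [29.0, 363.0], [29.0, 430.0]]
# Made-up samples for a second, much smaller fish class, labeled 0.
other = [[10.0, 7.0], [11.0, 9.0], [12.0, 12.0], [13.0, 15.0], [14.0, 19.0]]

X = bream + other
y = [1] * len(bream) + [0] * len(other)

model = SimpleKNN(n_neighbors=3).fit(X, y)
print("score:", model.score(X, y))
print("prediction:", model.predict([[30.0, 600.0]]))
```

Scoring on the same data used for fitting, as done here, only checks that the steps run. A held-out test set would be needed for a meaningful accuracy figure.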

Python/MachineLearning · 2022. 8. 3.

[Tensorflow] Resolving the 'contrib()' error

[ Cause of the error ]
- The function is not supported in tensorflow version 2
[ Fix ]
- Downgrade tensorflow (fall back to version 1 behavior)
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

Python/MachineLearning · 2022. 6. 2.
