PYTHON 26

[Python] Pandas DataFrame: removing duplicates and checking distinct values

df.drop_duplicates() can remove duplicates across the whole df, but it can also remove duplicates within a single column. The data used here has many duplicated values in the pert_iname column, and df.drop_duplicates() shows how many distinct values it contains: the original 13553 rows come down to 6798 once duplicates are excluded. Alternatively, df.value_counts() finds the distinct values and also reports how many times each one occurs.
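
As a rough illustration of the two approaches above, here is a minimal sketch on a made-up table. Only the column name pert_iname comes from the post; the values, the extra dose column, and the use of the subset argument to restrict the check to one column are assumptions for the example.

```python
import pandas as pd

# Hypothetical toy data; only the column name pert_iname comes from the post.
df = pd.DataFrame({
    "pert_iname": ["aspirin", "aspirin", "caffeine", "ibuprofen", "caffeine", "aspirin"],
    "dose": [1, 2, 1, 1, 1, 3],
})

# Drop rows whose pert_iname repeats, keeping the first occurrence of each value.
distinct_rows = df.drop_duplicates(subset="pert_iname")

# Compare row counts before and after to see how many distinct values exist.
n_before = len(df)
n_after = len(distinct_rows)

# value_counts() on the column lists each distinct value with its frequency.
counts = df["pert_iname"].value_counts()

print(n_before, n_after)
print(counts)
```

If only the number of distinct values is needed, df["pert_iname"].nunique() gives it directly.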

[Python] Using sys.path for relative imports

The sys module can be used to set up imports relative to a folder of your choice.

    import sys
    sys.path.append('my_path')

Once this code runs, Python also searches my_path for modules (the working directory itself does not change), so other files can be imported with from ~ import ~ using a path relative to it.

ex) If a parent folder contains a child folder, and child contains example.py with a function myfunc:

    import sys
    sys.path.append('C:/Parent')
    from child.example import myfunc

This imports myfunc by its path relative to the parent folder. Writing imports this way is convenient because you do not have to spell out the full absolute path every time, but it can become inconvenient if the files are moved.
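
The parent/child example can be tried without touching a real C:/Parent folder. The sketch below builds the same layout in a temporary directory (an assumption made so the example is self-contained) and then imports myfunc through sys.path. The empty __init__.py is optional on modern Python and is included here only to make child a regular package.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway parent/child/example.py layout (a stand-in for C:/Parent).
parent = Path(tempfile.mkdtemp())
child = parent / "child"
child.mkdir()
(child / "__init__.py").write_text("")  # optional; marks child as a regular package
(child / "example.py").write_text("def myfunc():\n    return 'hello from child'\n")

# Add the parent folder to the module search path.
# This does not change the current working directory.
sys.path.append(str(parent))
importlib.invalidate_caches()  # make sure the freshly created files are seen

# The import is now resolved relative to the parent folder.
from child.example import myfunc

result = myfunc()
print(result)
```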

[Data Analysis] The data analysis process and the importance of preprocessing

Data Analysis Process
1. Goal Definition
   Define the analysis target objectively and concretely (= problem definition)
   Understand the domain
   Understand the project
2. Data Searching & Collecting
   After defining the problem, search for the data you need
   Collect the data and get familiar with it
3. Data Preparation
   Includes data preprocessing, which removes noise from the data and transforms it into the desired form
   The stage that prepares the data for building the final model
   Set up relationships among related data, understand the data, merge the data
4. Modeling
   Plan how the model will be designed
   Apply a range of algorithms, including machine learning algorithms, using R, Python, etc.
5. Evaluatio..
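
To make step 3 (Data Preparation) a little more concrete, here is a small sketch of the three actions it names: removing noise, transforming the data, and merging related data. The tables, column names, and values are all hypothetical and are not taken from the post.

```python
import pandas as pd

# Hypothetical raw tables; all names and values are made up for illustration.
measurements = pd.DataFrame({
    "sample_id": ["s1", "s2", "s3", "s4"],
    "value": ["1.5", None, "3.0", "2.5"],  # numbers stored as text, one missing
})
labels = pd.DataFrame({
    "sample_id": ["s1", "s3", "s4"],
    "group": ["control", "treated", "treated"],
})

# Remove noise: drop rows with a missing measurement.
clean = measurements.dropna(subset=["value"]).copy()

# Transform into the desired form: convert the text column to floats.
clean["value"] = clean["value"].astype(float)

# Merge related data on the shared key.
prepared = clean.merge(labels, on="sample_id", how="inner")

print(prepared)
```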

[Deep Learning] CNN concepts, Object Detection

Let's look at the CNN (Convolutional Neural Network), one of the deep learning models. CNNs are widely used for computer vision problems, and object detection is one of their most common applications.

What is Object Detection?
Feature extraction: extract the useful features that can be drawn out of an image
Bounding box generation: generate a bounding box that encloses the object
Class classification: classify which class the object inside the bounding box belongs to

CNN (Convolutional Neural Network)
Takes matrix-shaped data as input so that the shape of the image is preserved, which prevents information loss, and..
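
The idea that a CNN works on matrix-shaped input and keeps the spatial layout can be seen in the convolution operation itself. The sketch below is a plain-Python illustration, not a full CNN: the function name conv2d_valid, the 4x4 image, and the 2x2 kernel are all made up for this example, and it uses no padding and a stride of 1.

```python
def conv2d_valid(image, kernel):
    """Slide kernel over image (no padding, stride 1) and return the feature map.

    As in most deep learning libraries, the kernel is not flipped,
    so strictly speaking this is a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical 2x2 kernel that responds to left-to-right increases in intensity.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, kernel)
for row in feature_map:
    print(row)
```

The output is still a 2D grid, and the nonzero column lines up with the position of the edge in the input, which is the sense in which the image layout is preserved.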

[Python] Pandas DataFrame: combining, joining, merging (Join, Merge)

Join
1. Create the example DataFrames

    import pandas as pd
    df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'], 'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
    other = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'B': ['B0', 'B1', 'B2']})

2. Join on the index, with suffixes for the overlapping column

    df.join(other, lsuffix='_caller', rsuffix='_other')

3. Join with key set as the index

    df.set_index('key').join(other.set_index('key'))

4. Parameters of the join method ..
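
Steps 1 to 3 can be run together as below. The DataFrames and the two join calls are the ones from the post; the variable names by_position and by_key are added here only to hold the results for inspection.

```python
import pandas as pd

df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
                   'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
other = pd.DataFrame({'key': ['K0', 'K1', 'K2'],
                      'B': ['B0', 'B1', 'B2']})

# Join on the default integer index. Both frames have a 'key' column,
# so the suffixes are needed to tell the two apart.
by_position = df.join(other, lsuffix='_caller', rsuffix='_other')

# Join on 'key' by making it the index of both frames.
# join() is a left join by default, so keys missing from other get NaN in B.
by_key = df.set_index('key').join(other.set_index('key'))

print(by_position)
print(by_key)
```

In both results, the rows that have no match in other (K3 to K5) are kept and their B value is missing.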

[Web Crawling] Web crawling with Python: HTML parsing with the requests and BeautifulSoup libraries

Getting href attribute values by web crawling (HTML parsing)

    import requests
    from bs4 import BeautifulSoup

ftp.ensembl.org/pub/current_regulation/homo_sapiens/RegulatoryFeatureActivity/

A web crawling exercise that pulls only the data names out of a page laid out like this one (an "Index of" directory listing).

1. GET the url with the requests library

    import requests
    from bs4 import BeautifulSoup
    res = requests.get('http://ftp.en..
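
The parsing step can be sketched without a network request. The example below feeds BeautifulSoup a small, made-up HTML string that imitates an "Index of" page; the file names in it are hypothetical and are not the real contents of the Ensembl directory. With a live page, the text of the requests response would take the place of the html string.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for an "Index of ..." directory listing.
# The entry names are made up, and no network request is made here.
html = """
<html><body>
<h1>Index of /pub/example/</h1>
<a href="../">../</a>
<a href="sample_A/">sample_A/</a>
<a href="sample_B/">sample_B/</a>
<a href="README">README</a>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Collect the href attribute of every <a> tag that has one.
hrefs = [a["href"] for a in soup.find_all("a", href=True)]

# Keep only the entry names: skip the parent-directory link
# and strip the trailing slash from folder entries.
names = [h.rstrip("/") for h in hrefs if h != "../"]

print(names)
```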
