
Category: All posts 128

[Python] Pandas DataFrame: combining, joining, and merging (Join, Merge)

Join
1. Create the example DataFrames
    import pandas as pd
    df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'], 'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
    other = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'B': ['B0', 'B1', 'B2']})
2. Join on the index, adding suffixes to the overlapping 'key' column
    df.join(other, lsuffix='_caller', rsuffix='_other')
3. Join with the key set as the index
    df.set_index('key').join(other.set_index('key'))
4. Parameters of the join method ..
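The two joins in the excerpt above can be compared side by side. This is a minimal sketch that reuses the excerpt's own frames; the variable names `by_position` and `by_key` are made up here for illustration, and it relies on `DataFrame.join` defaulting to a left join.

```python
import pandas as pd

df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
                   'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
other = pd.DataFrame({'key': ['K0', 'K1', 'K2'],
                      'B': ['B0', 'B1', 'B2']})

# Join on the default integer index. Both frames have a 'key' column,
# so suffixes are needed to keep the two columns apart.
by_position = df.join(other, lsuffix='_caller', rsuffix='_other')

# Join on the 'key' values by making 'key' the index of both frames.
by_key = df.set_index('key').join(other.set_index('key'))

print(by_position)
print(by_key)
```

Because the join is a left join, rows K3 to K5 are kept in both results and their `B` values are missing.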

[Web Crawling] Web crawling and HTML parsing in Python with the requests and BeautifulSoup libraries

Getting href attribute values by web crawling (HTML parsing)
    import requests
    from bs4 import BeautifulSoup
ftp.ensembl.org/pub/current_regulation/homo_sapiens/RegulatoryFeatureActivity/ (Index of /pub/current_regulation/homo_sapiens/RegulatoryFeatureActivity/)
A web crawling exercise that extracts only the data names from a page laid out like this.
1. Fetch the URL with the requests library
    import requests
    from bs4 import BeautifulSoup
    res = requests.get('http://ftp.en..
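The href extraction step can be tried without network access. The sketch below is not the post's code: it swaps BeautifulSoup for the standard library's `html.parser`, and `SAMPLE_HTML`, `HrefCollector`, and the entry names are hypothetical stand-ins for a directory index page.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the HTML of a directory index page.
SAMPLE_HTML = """
<html><body><h1>Index of /example/</h1><pre>
<a href="../">../</a>
<a href="sample_A/">sample_A/</a>
<a href="sample_B/">sample_B/</a>
</pre></body></html>
"""


class HrefCollector(HTMLParser):
    """Collects the href attribute of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value is not None:
                    self.hrefs.append(value)


collector = HrefCollector()
collector.feed(SAMPLE_HTML)

# Drop the parent-directory link and strip trailing slashes to get bare names.
names = [h.rstrip('/') for h in collector.hrefs if h != '../']
print(names)
```

With a real page, the HTML string would come from the response body of `requests.get(...)` instead of the inline sample.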

[Python] Pandas DataFrame: checking which values a column contains, counting occurrences of each value, checking for duplicates, and finding unique values

df[column_name].unique()
The DataFrame above is my example data on cell_line. Now let's find the unique values in this cell_line data.
1. Check the values in the DataFrame without removing duplicates
    cell_line = f['epigenomes_with_experimental_evidence'].values  # values = df[column_name].values
Running the code above prints every value as an array.
2. Check how many rows there are for each value in the DataFrame
    f['epigenomes_with_experimental_evidence'].value_counts()  # df[column_name].value_counts()
3. Remove duplicates from the DataFrame and check the unique values
    f['epigenome..
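The three steps above can be run on a small stand-in column. The cell line names below are made-up sample data, not the post's dataset, and `Series.unique()` is assumed for step 3 because the excerpt is cut off before showing its code.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the excerpt.
f = pd.DataFrame({
    'epigenomes_with_experimental_evidence':
        ['HepG2', 'K562', 'HepG2', 'A549', 'K562', 'HepG2'],
})
col = f['epigenomes_with_experimental_evidence']

all_values = col.values        # NumPy array of every value, duplicates kept
counts = col.value_counts()    # Series mapping each value to its row count
unique_values = col.unique()   # distinct values in order of first appearance

print(all_values)
print(counts)
print(unique_values)
```

Note that `unique()` is a Series method, so it is called on a single column rather than on the whole DataFrame.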

[Python] Finding NaN values in a Pandas DataFrame (checking for missing values, counting missing values)

How to check NaN in a Pandas DataFrame
Check for null values: df.isnull(), pd.isnull(df)
Check for non-null values: df.notnull(), pd.notnull(df)
1. Create an example DataFrame
    import pandas as pd
    import numpy as np
    dates = pd.date_range("20130101", periods=6)
    df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))
2. Add null values
You can insert null values by hand with np.nan or None. (The string 'NaN' is an ordinary string and is not treated as missing by isnull().)
    df.loc[dates[1], 'A'] = np.nan
    df.loc[dates[2], 'B'] = None
    df.loc[dates[2], 'C'] = np.nan
    d..
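Counting the missing values, which the post title mentions but the excerpt cuts off before reaching, could look like the sketch below. It assumes the example frame from step 1 and uses `np.nan` with `.loc` assignment; the names `mask`, `per_column`, and `total` are illustrative.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20130101", periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))

# Insert three missing values at known positions.
df.loc[dates[1], 'A'] = np.nan
df.loc[dates[2], 'B'] = None      # None is also treated as missing
df.loc[dates[2], 'C'] = np.nan

mask = df.isnull()                # boolean frame, True where a value is missing
per_column = mask.sum()           # number of missing values in each column
total = int(per_column.sum())     # number of missing values in the whole frame

print(per_column)
print(total)
```

Summing a boolean mask works because `True` counts as 1, so `isnull().sum()` gives a per-column count directly.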

[MySQL] Built-in functions: string functions

ASCII(character), CHAR(number)
    SELECT ASCII('A'), CHAR(65); -- returns 65, A
BIT_LENGTH, CHAR_LENGTH, LENGTH
BIT_LENGTH(): returns the size in bits
CHAR_LENGTH(): returns the number of characters
LENGTH(): returns the size in bytes
In UTF-8, English letters take 1 byte per character and Korean characters take 3 bytes per character.
    SELECT BIT_LENGTH('ABC'), CHAR_LENGTH('ABC'), LENGTH('ABC'); -- returns 24, 3, 3
    SELECT BIT_LENGTH('가나다'), CHAR_LENGTH('가나다'), LENGTH('가나다'); -- returns 72, 3, 9
CONCAT(string1, string2, ...), CONCAT_WS(separator, string1, string2, ...)
CO..

DB(Database)/MySQL 2021.01.20
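The bit, character, and byte counts quoted in the string-function excerpt above follow from UTF-8 encoding, and the arithmetic can be checked outside the database. The helper `mysql_style_lengths` below is a hypothetical name; it only mirrors the three MySQL functions under the assumption that the strings are stored as UTF-8.

```python
def mysql_style_lengths(text, encoding='utf-8'):
    """Return (bits, characters, bytes), mirroring BIT_LENGTH, CHAR_LENGTH, LENGTH."""
    encoded = text.encode(encoding)
    return len(encoded) * 8, len(text), len(encoded)


print(mysql_style_lengths('ABC'))     # (24, 3, 3)
print(mysql_style_lengths('가나다'))   # (72, 3, 9)
```

Each Hangul syllable encodes to 3 bytes in UTF-8, which is why three characters give 9 bytes and 72 bits.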

[MySQL] Using variables in MySQL (SET, PREPARE, EXECUTE), data type conversion functions (CAST, CONVERT, CONCAT), and built-in functions (control-flow functions IF, IFNULL, NULLIF, CASE WHEN ELSE END)

Variable syntax
    SET @variable_name = value; -- declare a variable and assign a value
    SELECT @variable_name; -- print the variable
Example
    SET @VAR1 = 1;
    SET @VAR2 = 2;
    SELECT @VAR1; -- result: 1
    SELECT @VAR1 + @VAR2; -- result: 3
SET, PREPARE, EXECUTE statements
    SET @VAR1 = 1;
    PREPARE Query FROM 'SELECT * FROM [table_name] ORDER BY [sort_column] LIMIT ?'; -- prepare a query with a placeholder at '?'
    EXECUTE Query USING @VAR1; -- run the query with the variable bound to the placeholder
Data type conversion functions
CONVERT(), CAST(): functions that force a value into another data type
    -- output the data as an integer
    SELECT CAST([..

DB(Database)/MySQL 2021.01.20
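The PREPARE / EXECUTE pattern in the excerpt above, where a `?` placeholder is filled in at run time, has a close analogue in client-side parameter binding. The sketch below uses Python's built-in `sqlite3` module rather than MySQL, so it illustrates the placeholder idea only; the `items` table and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# Hypothetical table, standing in for [table_name] in the excerpt.
conn.execute('CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)')
conn.executemany('INSERT INTO items (name) VALUES (?)', [('c',), ('a',), ('b',)])

var1 = 1  # plays the role of @VAR1
# The '?' is bound to var1 when the statement runs, like EXECUTE ... USING @VAR1.
rows = conn.execute('SELECT * FROM items ORDER BY name LIMIT ?', (var1,)).fetchall()
print(rows)

conn.close()
```

Changing `var1` changes how many rows come back without rewriting the SQL text, which is the point of the placeholder.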

Finding the reverse complement sequence with Biopython

Install Biopython, then find the reverse complement of a sequence
Installing Biopython
    pip install biopython
Verify the installation and run the reverse_complement code
    import Bio  # verify the installation
    from Bio.Seq import Seq
    my_seq = Seq('TGGTGAAACCCCA').reverse_complement()
    print(my_seq)
    print(type(my_seq))
Result
Checking the type of the sequence
Result
Reference: biopython.org/wiki/Getting_Started ..

Bioinformatics 2021.01.19
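To see what a reverse complement computes, the operation can be written in plain Python. This is a sketch for illustration, not Biopython's implementation: the function below is a hypothetical helper that returns a `str` rather than a `Seq` object and handles only the four unambiguous DNA bases.

```python
# Map each base to its complement (upper and lower case).
COMPLEMENT = str.maketrans('ACGTacgt', 'TGCAtgca')


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


print(reverse_complement('TGGTGAAACCCCA'))  # TGGGGTTTCACCA
```

Applying the function twice returns the original sequence, which is a quick sanity check for any reverse complement routine.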