Using the start and end coordinates in the TFBS (Transcription Factor Binding Site) data I posted earlier, I added the corresponding nucleotide sequence to each row.
This uses the REST API provided by Ensembl:
rest.ensembl.org/documentation/info/sequence_region
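The endpoint documented at that page takes a species and a region written as chromosome:start..end. As a minimal sketch of how such a URL can be assembled, the helper below builds it from three values; `region_url` and `SERVER` are hypothetical names introduced here for illustration and are not part of any library.

```python
SERVER = "https://rest.ensembl.org"

def region_url(chrom, start, end, species="human"):
    """Return the sequence/region URL for one interval (hypothetical helper)."""
    # int() also accepts the NumPy integers that pandas returns for column values
    return f"{SERVER}/sequence/region/{species}/{chrom}:{int(start)}..{int(end)}"

print(region_url("21", 100, 200))
# prints https://rest.ensembl.org/sequence/region/human/21:100..200
```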
import requests, sys
import pandas as pd
f = pd.read_csv('21test.csv')  # load a subset of the chromosome 21 data
# assign the DataFrame columns to start and end
start = f['4']
end = f['5']
Output of print(f): the original data.
# Fetch data from the REST API, building each URL with str.format
# Create a sequence list and append each sequence returned by the API
sequence = []
server = "https://rest.ensembl.org"
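for i in f.index:
    # Region query for row i: chromosome 21, start..end
    ext = "/sequence/region/human/{0}:{1}..{2}?".format('21', start[i], end[i])
    r = requests.get(server+ext, headers={ "Content-Type" : "text/plain"})
    if not r.ok:
        # raise_for_status() raises on an HTTP error, so sys.exit() is only a fallback
        r.raise_for_status()
        sys.exit()
    print(r.text)
    sequence.append(r.text)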
# Assign the sequence list as a new DataFrame column
f['sequence'] = sequence
You can see that the sequence column has been added.
REST API call succeeded!
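The loop above stops at the first failed request. One possible variation, sketched here under assumptions, keeps going and records None for any row whose request fails. `collect_sequences` and `fake_fetch` are hypothetical names; `fake_fetch` is an offline stand-in that only mimics the shape of a response and assumes the interval includes both endpoints.

```python
def collect_sequences(starts, ends, fetch, chrom="21"):
    """Call fetch(chrom, start, end) per row; keep None where a request fails."""
    sequences = []
    for s, e in zip(starts, ends):
        try:
            sequences.append(fetch(chrom, int(s), int(e)).strip())
        except Exception as err:  # e.g. requests.HTTPError from raise_for_status()
            print(f"{chrom}:{s}..{e} failed: {err}")
            sequences.append(None)
    return sequences

# Stand-in fetcher so the sketch runs offline (assumption: inclusive coordinates).
def fake_fetch(chrom, start, end):
    return "N" * (end - start + 1) + "\n"

print(collect_sequences([1, 5], [3, 6], fake_fetch))
# prints ['NNN', 'NN']
```

With the variables from this post, the call would look like f['sequence'] = collect_sequences(start, end, my_fetch), where my_fetch is a hypothetical function wrapping the requests.get call shown earlier and returning r.text.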