Bioinformatics

Biological Networks, Biological pathway analysis (Semester 2019-1)

탱젤 2021. 1. 14. 01:04

Biological Networks: the various networks found in biological systems

  1.  Protein-protein Interaction (PPI) Networks
  2. Gene regulatory networks
  3. Co-expression networks
  4. Metabolic networks
  5. Signaling networks
  6. .....

1. Protein-protein Interaction (PPI) Networks

  • A network representing the various interactions between proteins
  • Genes and proteins are treated as corresponding roughly 1:1, so gene expression levels are often mapped onto the PPI network
  • The STRING database is widely used: search by gene or protein name, amino acid sequence, etc., and it shows the network of proteins that interact with the queried protein
  • List of Actions in the STRING database
Actions                            Description
reaction                           chemical reaction
expression                         regulation of expression
activation                         activation
post-translational modifications   modification after translation
binding                            physically bound to each other
catalysis                          catalytic action
 

STRING: functional protein association networks (string-db.org)

PPI example
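
To make the idea of mapping expression onto a PPI network concrete, here is a minimal sketch. The protein names, edges, and expression values are invented for illustration and are not taken from STRING; the action labels reuse the STRING action names listed above, and every edge is treated as undirected as a simplification.

```python
from collections import defaultdict

# Hypothetical PPI edges: (protein A, protein B, action type).
edges = [
    ("P1", "P2", "binding"),
    ("P2", "P3", "activation"),
    ("P1", "P3", "catalysis"),
    ("P3", "P4", "binding"),
]

# Hypothetical expression level of the gene that encodes each protein
# (assumes the rough 1:1 gene-protein correspondence noted in the text).
expression = {"P1": 8.2, "P2": 5.1, "P3": 9.7, "P4": 2.3}


def build_ppi(edge_list):
    """Build an adjacency map: protein -> {neighbor: action}."""
    graph = defaultdict(dict)
    for a, b, action in edge_list:
        graph[a][b] = action
        graph[b][a] = action  # simplification: edges stored as undirected
    return dict(graph)


def annotate(graph, expr):
    """Attach the degree and the expression value to every node."""
    return {
        node: {"degree": len(neighbors), "expression": expr.get(node)}
        for node, neighbors in graph.items()
    }


ppi = build_ppi(edges)
nodes = annotate(ppi, expression)
print(nodes["P3"])
```

With the node attributes in place, highly connected proteins whose genes are also strongly expressed can be picked out with a simple filter on `degree` and `expression`.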

2. Gene Regulatory Networks (GRN)

  • Gene regulation: the regulation of gene expression
  • GRN: a network between regulators, which control gene expression, and the genes they regulate
  • The regulators considered are mainly transcription factors (TFs) and microRNAs

    (transcription factor: activator or repressor / microRNA: repressor)
  • What is a transcription factor?

    - A protein that binds to a DNA sequence and regulates transcription

    - Promotes (activator) or blocks (repressor) the binding of RNA polymerase

    - About 1,600 in humans (estimates range up to about 2,600)

  • TF-TG (transcription factor - target gene) network
  • What is a microRNA?

    - A short non-coding RNA (about 22 nucleotides) that does not produce a protein

    - About 2,000 are known in humans

    - Suppresses target gene expression through RNA interference (RNAi)

    - Forms a hairpin structure

  • TRANSFAC, TRRUST, HTRI-db, etc.: databases that collect experimental and predicted results on which TF binds to the DNA of which gene
  • www.grnpedia.org/trrust/
 

TRRUST - Transcriptional Regulatory Relationships Unraveled by Sentence-based Text mining (www.grnpedia.org)

GRN example
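
A GRN differs from a PPI network in that its edges are directed (regulator to target) and signed (activation or repression). The sketch below shows one way to store that, assuming made-up regulator and gene names; it is not the schema of TRANSFAC or TRRUST.

```python
# Hypothetical regulatory edges: (regulator, kind, target gene, effect).
# effect: +1 = activation, -1 = repression.
regulations = [
    ("TF_A", "TF", "GENE1", +1),
    ("TF_A", "TF", "GENE2", -1),
    ("TF_B", "TF", "GENE2", +1),
    ("miR_X", "miRNA", "GENE1", -1),
]


def validate(edges):
    """Check the constraints stated in the text: a TF may activate or
    repress, while a microRNA is modeled only as a repressor."""
    for regulator, kind, target, effect in edges:
        if effect not in (+1, -1):
            raise ValueError(f"bad effect for {regulator} -> {target}")
        if kind == "miRNA" and effect != -1:
            raise ValueError(f"miRNA {regulator} must be a repressor")


def regulators_of(edges, gene):
    """Return {regulator: effect} for every regulator of a gene."""
    return {reg: eff for reg, _kind, tgt, eff in edges if tgt == gene}


def targets_of(edges, regulator):
    """Return {target gene: effect} for one regulator (its TF-TG edges)."""
    return {tgt: eff for reg, _kind, tgt, eff in edges if reg == regulator}


validate(regulations)
print(regulators_of(regulations, "GENE1"))
print(targets_of(regulations, "TF_A"))
```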

3. Co-expression Networks

  • A network built from the correlation between gene expression levels

co-expression network example
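
As a rough illustration of building a network from correlations, the sketch below connects two genes when the absolute Pearson correlation of their expression profiles reaches a threshold. The expression values and the 0.9 cutoff are assumptions chosen for the example; the notes do not specify a correlation measure or threshold.

```python
from itertools import combinations
from math import sqrt

# Hypothetical expression profiles: gene -> values across five samples.
expr = {
    "G1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "G2": [2.0, 4.1, 6.2, 7.9, 10.0],  # rises together with G1
    "G3": [5.0, 4.0, 3.2, 2.1, 1.0],   # falls as G1 rises
    "G4": [3.0, 1.0, 4.0, 1.0, 5.0],   # no clear relation to the others
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_edges(profiles, threshold=0.9):
    """Return {(gene1, gene2): r} for every pair with |r| >= threshold."""
    edges = {}
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if abs(r) >= threshold:
            edges[(g1, g2)] = r
    return edges


network = coexpression_edges(expr)
for pair, r in network.items():
    print(pair, round(r, 3))
```

Using the absolute value keeps strongly anti-correlated pairs (such as G1 and G3 here) as edges; dropping `abs` would keep only positively co-expressed pairs.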


Biological Pathway

1. Biological pathway

  • A series of interactions among molecules within a cell
  • An organized collection of the interactions that carry out a specific role, such as producing a new molecule or transmitting a signal
  • Databases include KEGG and REACTOME
  • www.genome.jp/kegg/
 

KEGG: Kyoto Encyclopedia of Genes and Genomes (www.genome.jp)

Reactome Pathway Database (reactome.org)

KEGG pathway

2. Metabolic pathway

  • A series of chemical reactions catalyzed by enzymes
  • Substrate -> binds to an enzyme -> product
  • The product made by one enzyme becomes the substrate of another enzyme
  • Example: TCA cycle in REACTOME and KEGG
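
The chaining of reactions described above (one enzyme's product is the next enzyme's substrate) can be sketched as a walk over a substrate lookup table. The enzymes E1-E4 and metabolites A-D are placeholders, not real TCA cycle members, and the sketch assumes each metabolite is the substrate of at most one reaction.

```python
# Hypothetical reactions: (enzyme, substrate, product).
# The last reaction closes the loop, as in a cyclic pathway.
reactions = [
    ("E1", "A", "B"),
    ("E2", "B", "C"),
    ("E3", "C", "D"),
    ("E4", "D", "A"),
]


def trace(reaction_list, start):
    """Follow substrate -> product steps from a starting metabolite.

    Stops when no enzyme consumes the current metabolite or when the
    walk returns to a metabolite it has already consumed (a cycle).
    """
    by_substrate = {s: (e, p) for e, s, p in reaction_list}
    path, current, seen = [], start, set()
    while current in by_substrate and current not in seen:
        seen.add(current)
        enzyme, product = by_substrate[current]
        path.append((current, enzyme, product))
        current = product
    return path


for substrate, enzyme, product in trace(reactions, "A"):
    print(f"{substrate} --{enzyme}--> {product}")
```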

3. Signaling pathway

  • The process by which a receptor on the cell membrane senses a stimulus and the signal is relayed into the cell
  • Plays an important role in stress and disease states
  • Example: TGF-beta signaling pathway in REACTOME and KEGG

Biological pathway and network

  • Biological pathway

    - The part of the whole network, formed by combining the gene regulatory network and the protein-protein interaction network, that carries out a specific function

    - Converting a biological pathway into a GRN or PIN (protein interaction network) requires simplification (for example, removing protein complexes and non-protein molecules)
  • Gene Ontology

    - Organizes groups of genes (GO terms) that perform the same function or make up one cellular structure

    - There are three classes, and GO terms are defined within each class
Class                Description
Molecular function   Describes what a gene product does: direct physical interactions with other molecular entities and the activities it performs
Cellular component   Cellular structures, protein complexes, etc.: the location, relative to cellular structures and compartments, where a molecule performs its function
Biological process   Describes interactions among gene products and their outcomes or end states; similar to a biological pathway

      - A hierarchy exists among GO terms
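
One practical consequence of the GO hierarchy is that a gene annotated to a specific term is usually also counted under that term's more general ancestors. The sketch below illustrates this with placeholder term IDs (`T_root`, `T_a`, ...) and gene names; it assumes a simple child-to-parents map rather than the actual GO file format, and allows a term to have more than one parent.

```python
# Hypothetical hierarchy: child term -> list of parent terms.
parents = {
    "T_a": ["T_root"],
    "T_b": ["T_root"],
    "T_c": ["T_a", "T_b"],  # a term may have several parents
}

# Hypothetical direct annotations: gene -> set of terms.
annotations = {"GENE1": {"T_c"}, "GENE2": {"T_a"}}


def ancestors(term, parent_map):
    """Collect every term reachable by repeatedly following parents."""
    found = set()
    stack = list(parent_map.get(term, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(parent_map.get(current, []))
    return found


def propagate(gene_terms, parent_map):
    """Build term -> genes, counting each gene under all ancestor terms."""
    term_to_genes = {}
    for gene, terms in gene_terms.items():
        for term in terms:
            for t in {term} | ancestors(term, parent_map):
                term_to_genes.setdefault(t, set()).add(gene)
    return term_to_genes


gene_sets = propagate(annotations, parents)
print(sorted(ancestors("T_c", parents)))
print({term: sorted(genes) for term, genes in sorted(gene_sets.items())})
```

The resulting term-to-genes map has the same shape as a pathway-to-genes map, so it can be used as the gene-set input for the enrichment tests in the next section.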

 

Gene Ontology Resource (geneontology.org): The Gene Ontology (GO) project is a major bioinformatics initiative to develop a computational representation of our evolving knowledge of how genes encode biological functions at the molecular, cellular and tissue system levels.


Pathway analysis

  • Check which biological pathways the DEGs (differentially expressed genes) belong to
  • The goal is to find pathways that contain an especially high proportion of DEGs

 

  1. Run a limma test on gene expression data from two conditions

  2. Run pathway analysis using the DEG list from the limma test and the expression levels (or fold changes)

    - Requires pathway database information: gene lists and networks for about 2,000 pathways

    - Limma: an R package provided through Bioconductor and a widely used, representative tool for analyzing gene expression data. It shows strong statistical power for finding differentially expressed genes and is effective even with a small number of samples

  3. Pathway analysis results

    For each pathway, compute a p-value by testing whether the proportion of DEGs is high (ORA) and whether the DEGs show a consistent expression pattern, all up or all down (GSEA)

    - Over-representation analysis (ORA)
    - Gene set enrichment analysis (GSEA)

  4. ORA

     - A statistical test, based on proportions, of whether the proportion of DEGs in each pathway is high

     - The goal is to find pathways that contain an especially high proportion of DEGs

     - Uses the definition of the hypergeometric distribution: given N items in total, of which K are known to be of a particular type, the probability that a draw of n items contains exactly k items of that type

     - Pathway p-value: the sum of the hypergeometric probabilities of all cases at least as extreme as the observed k
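
The ORA p-value described above can be written out directly from the hypergeometric definition. In this sketch N is the number of genes measured, K the number of genes in the pathway, n the number of DEGs, and k the number of DEGs that fall in the pathway; the example counts are invented, and "more extreme" is taken to mean an overlap of k or more.

```python
from math import comb


def hypergeom_pmf(k, N, K, n):
    """P(X = k): exactly k pathway genes among n genes drawn from N,
    where K of the N genes belong to the pathway."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def ora_p_value(k, N, K, n):
    """One-sided p-value P(X >= k): the observed overlap or larger."""
    upper = min(n, K)  # the overlap cannot exceed either group size
    return sum(hypergeom_pmf(i, N, K, n) for i in range(k, upper + 1))


# Hypothetical counts: 20 genes measured, 5 in the pathway,
# 4 DEGs in total, 3 of which are in the pathway.
p = ora_p_value(k=3, N=20, K=5, n=4)
print(round(p, 4))  # 155 / 4845, about 0.032
```

In practice this test is repeated for every pathway in the database, so the resulting p-values are normally adjusted for multiple testing before being interpreted.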

 

   5. GSEA (gene set enrichment analysis)

     - Uses the fold change or t-value of the genes in each pathway to check whether the fold changes are consistent in one direction (positive or negative) and large in absolute value

     - Gene set in GSEA = the set of genes in each pathway

     - Requires the result of comparing gene expression between two experimental conditions (fold change or t-value)

     - Fold change comparing after vs. before drug treatment = log2(gene expression after treatment / gene expression before treatment)

     - Positive fold change: expression after treatment > before treatment ↔ negative fold change: expression after treatment < before treatment

     - If a pathway is unrelated to the drug treatment, its genes are scattered across the ranked gene list with no trend; if it is related, they are skewed toward one end

     - P-value: when the ES is recomputed k times with the genes in random order, the proportion of those runs that score higher than the original ES (ES: enrichment score)
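
The GSEA steps above can be sketched end to end: compute log2 fold changes, rank the genes, walk down the ranking to get an enrichment score, then shuffle the gene order to estimate a p-value. This is a simplified, unweighted running-sum version (each hit and miss counts equally), not the weighted statistic used by the published GSEA tool; the expression values, gene names, and the helper names are all assumptions made for illustration.

```python
import math
import random


def log2_fold_change(after, before):
    """log2(after / before): positive means higher after treatment."""
    return math.log2(after / before)


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum ES: step up on a gene-set member, step
    down otherwise; return the largest deviation from zero (signed)."""
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(hits) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some but not all genes")
    running, best = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def permutation_p_value(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Fraction of random gene orders whose ES is at least as extreme
    as the observed ES, in the same direction."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, gene_set)
    shuffled = list(ranked_genes)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        es = enrichment_score(shuffled, gene_set)
        if (observed >= 0 and es >= observed) or (observed < 0 and es <= observed):
            extreme += 1
    return observed, extreme / n_perm


# Hypothetical expression before and after drug treatment.
before = {g: 10.0 for g in ["G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8"]}
after = {"G1": 80.0, "G2": 40.0, "G3": 20.0, "G4": 12.0,
         "G5": 10.0, "G6": 8.0, "G7": 5.0, "G8": 2.5}

scores = {g: log2_fold_change(after[g], before[g]) for g in before}
ranked = sorted(scores, key=scores.get, reverse=True)  # most up-regulated first
pathway = {"G1", "G2", "G3"}  # hypothetical pathway gene set

es, p_value = permutation_p_value(ranked, pathway)
print(ranked, es, p_value)
```

Because all three pathway genes sit at the top of the ranking, the running sum peaks immediately and few random orderings match it, which is the "skewed toward one end" pattern described above.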

 


2019-1 Bio-data Computing Fundamentals and Practice assignment: Pathway analysis

BINGO visualization result
