
[Python] Combining, Joining, and Merging Pandas DataFrames (Join, Merge)

탱젤 2021. 1. 28. 14:08

Join

1. Create the example DataFrames

import pandas as pd

df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'], 'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
other = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'B': ['B0', 'B1', 'B2']})

2. ์—ด์˜ Index ์ง€์ •ํ•ด์„œ Join

df.join(other, lsuffix = '_caller', rsuffix = '_other')
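As a minimal sketch of what this call does, the snippet below rebuilds the two example frames and inspects the result. The name `joined` is only for illustration. Both frames have a `key` column, so calling `join` without suffixes raises a `ValueError`.

```python
import pandas as pd

df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
                   'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
other = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'B': ['B0', 'B1', 'B2']})

# Without suffixes the overlapping 'key' column makes join() fail.
try:
    df.join(other)
except ValueError as exc:
    print('join failed:', exc)

# Rows are matched on the default integer index (0..5), not on 'key'.
joined = df.join(other, lsuffix='_caller', rsuffix='_other')

print(list(joined.columns))
# Index 3..5 does not exist in `other`, so the right-hand columns are NaN there.
print(joined['B'].isna().tolist())
```

Since `join` is a left join by default, all six rows of `df` are kept.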

3. Join with key set as the index

df.set_index('key').join(other.set_index('key'))
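The following sketch shows the shape of the result when both frames are indexed by `key` first. The name `result` is illustrative. The final `reset_index()` call is an optional extra step, not part of the original example, for getting `key` back as a regular column.

```python
import pandas as pd

df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
                   'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
other = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'B': ['B0', 'B1', 'B2']})

# Both frames use 'key' as their index, so rows are matched by key value.
result = df.set_index('key').join(other.set_index('key'))

print(result.index.name)      # the index is now 'key'
print(list(result.columns))   # only the value columns remain
print(result)

# Optional: turn the index back into an ordinary column.
flat = result.reset_index()
print(list(flat.columns))
```

No suffixes are needed here because `key` is no longer a column in either frame.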

4. Join using the on parameter of the join method

df.join(other.set_index('key'), on='key')
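A short sketch of how this differs from the previous step: with `on='key'`, the caller's `key` column is matched against the index of `other`, so `df` keeps both its original integer index and its `key` column. The name `result` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
                   'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
other = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'B': ['B0', 'B1', 'B2']})

# df['key'] is looked up in the index of other.set_index('key').
result = df.join(other.set_index('key'), on='key')

print(list(result.columns))  # 'key' stays a regular column
print(list(result.index))    # the original 0..5 index is preserved
print(result)
```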


Merge

 

1. Create the example DataFrames (same as above)

2. ์ค‘๋ณต๋œ key๋ฅผ ๊ธฐ์ค€์œผ๋กœ merge (= ๊ต์ง‘ํ•ฉ, inner join์ด default ๊ฐ’์ž„)

# pd.merge(df, other, how = 'inner')
pd.merge(df, other)

 ์œ„์˜ ๋‘ ๋ฌธ์žฅ์˜ ๊ฒฐ๊ณผ๋Š” ๋™์ผ

3. ๊ธฐ์ค€์ด ๋  ์—ด์€ ์ˆ˜๋™์œผ๋กœ ์ง€์ •ํ•ด merge

pd.merge(df, other, left_on = 'key', right_on = 'key')
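`left_on` and `right_on` are most useful when the key columns have different names in the two frames. The sketch below assumes a hypothetical frame `other_renamed` whose key column is called `id`; neither that frame nor that column name appears in the original example.

```python
import pandas as pd

df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
                   'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
other = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'B': ['B0', 'B1', 'B2']})

# Hypothetical variant: the right-hand key column is named 'id' instead of 'key'.
other_renamed = other.rename(columns={'key': 'id'})

merged = pd.merge(df, other_renamed, left_on='key', right_on='id')

# Because the names differ, both key columns appear in the result.
print(list(merged.columns))
print(merged)
```

If the duplicate key column is not wanted, it can be dropped afterwards with `merged.drop(columns='id')`.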

4. Outer join: values missing on the other side are filled with NaN

pd.merge(df, other, how='outer')
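In the example data every key in `other` also exists in `df`, so the outer join only adds NaN on the right side. To show NaN appearing on both sides, the sketch below assumes a hypothetical frame `other_plus` with one extra key `K9`, and uses the `indicator=True` option of `pd.merge` to label where each row came from.

```python
import pandas as pd

df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
                   'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})

# Hypothetical right-hand frame with a key ('K9') that df does not have.
other_plus = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K9'],
                           'B': ['B0', 'B1', 'B2', 'B9']})

# indicator=True adds a '_merge' column: 'both', 'left_only' or 'right_only'.
merged = pd.merge(df, other_plus, how='outer', indicator=True)

print(merged)
print(merged['_merge'].value_counts())
```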

5. Left join: keeps every row of the left DataFrame

pd.merge(df, other, how='left')

6. Right join: keeps every row of the right DataFrame

pd.merge(df, other, how='right')
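To compare the four `how` options side by side, the sketch below runs each one on the example frames and records the number of rows returned. The `row_counts` dictionary is just an illustrative helper.

```python
import pandas as pd

df = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3', 'K4', 'K5'],
                   'A': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5']})
other = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'B': ['B0', 'B1', 'B2']})

row_counts = {}
for how in ('inner', 'outer', 'left', 'right'):
    merged = pd.merge(df, other, how=how)
    row_counts[how] = len(merged)
    # Show which keys survive under each join type.
    print(how, sorted(merged['key']))

print(row_counts)
```

Because the keys of `other` are a subset of the keys of `df`, the left and outer joins return the same six rows here, and the right and inner joins return the same three.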
