
[Python] Finding NaN values in a Pandas DataFrame (checking for missing values, counting missing values)

탱젤 2021. 1. 25. 16:53

How to check for NaN in a Pandas DataFrame

Checking for null values

  • df.isnull()

  • pd.isnull(df)

Checking for non-null values

  • df.notnull()

  • pd.notnull(df)


1. Create an example DataFrame

import pandas as pd
import numpy as np

dates = pd.date_range("20130101", periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))

2. Add null values

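You can insert null values at arbitrary positions with np.nan or None. Note that the string 'NaN' is ordinary text, which isnull does not treat as missing, so use np.nan instead.

# Assign through .iloc; chained indexing such as df['A'][1] is not guaranteed to modify df
df.iloc[1, 0] = np.nan  # column A
df.iloc[2, 1] = None    # column B; None is stored as NaN in a float column
df.iloc[2, 2] = np.nan  # column C
df.iloc[3, 3] = None    # column D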
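
To check that inserted values really count as missing, here is a minimal sketch. It rebuilds the example frame with a fixed seed and adds a small `text` Series; both the seed and `text` are assumptions introduced only for this illustration. It writes through `.iloc` because chained indexing such as `df['A'][1] = ...` is not guaranteed to modify `df` (and does not under pandas Copy-on-Write).

```python
import numpy as np
import pandas as pd

# Rebuild the example frame; the fixed seed is an assumption for reproducibility.
rng = np.random.default_rng(0)
dates = pd.date_range("20130101", periods=6)
df = pd.DataFrame(rng.standard_normal((6, 4)), index=dates, columns=list("ABCD"))

# Positional assignment through .iloc writes to df itself.
df.iloc[1, 0] = np.nan  # column A
df.iloc[2, 1] = None    # column B; None is stored as NaN in a float column

print(df.isnull().iloc[1, 0])  # True
print(df.isnull().iloc[2, 1])  # True

# The string 'NaN' is ordinary text, so isnull does not flag it.
text = pd.Series(["NaN", None])
print(text.isnull().tolist())  # [False, True]
```

If data arrives with the literal text 'NaN' in it, it would need to be converted to a real missing value before isnull can find it.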

3. Check for missing values with isnull

pd.isnull(df)
df.isnull()

Both of the above produce the same result.
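
The equivalence can be confirmed directly. The sketch below uses a small hand-made frame (an assumption, chosen so the mask is easy to read) and compares the function form with the method form. It also compares against `isna`, of which `isnull` is an alias.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (not the article's random data).
df = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [np.nan, 5.0, 6.0]})

mask_func = pd.isnull(df)   # top-level function
mask_method = df.isnull()   # DataFrame method

# Both return a boolean DataFrame of the same shape as df.
same = mask_func.equals(mask_method)
same_as_isna = mask_method.equals(df.isna())

print(same)          # True
print(same_as_isna)  # True
print(mask_method)
```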

4. Check for non-missing values with notnull

pd.notnull(df)
df.notnull()

As with isnull, both of the above produce the same result.
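
Since notnull is the element-wise opposite of isnull, its mask can also be used to count the values that are present. A minimal sketch on a small hand-made frame (an assumption for illustration):

```python
import numpy as np
import pandas as pd

# Small illustrative frame (not the article's random data).
df = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [np.nan, 5.0, 6.0]})

# notnull is the negation of isnull, element by element.
is_complement = df.notnull().equals(~df.isnull())

# Summing the boolean mask counts the non-missing values per column.
non_missing = df.notnull().sum()

print(is_complement)  # True
print(non_missing)
```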

5. Count missing values in every column of the DataFrame

df.isnull().sum()
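
Note that `df.isnull().sum()` returns one count per column rather than a single number. The sketch below, on a small hand-made frame (an assumption for illustration), shows the per-column counts along with two variations: a grand total and per-row counts.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (not the article's random data).
df = pd.DataFrame({"A": [1.0, np.nan, 3.0], "B": [np.nan, np.nan, 6.0]})

# One count per column: A has 1 missing value, B has 2.
per_column = df.isnull().sum()

# Summing the per-column counts gives a single total for the whole frame.
total = int(df.isnull().sum().sum())

# axis=1 counts missing values across each row instead.
per_row = df.isnull().sum(axis=1)

print(per_column)
print(total)  # 3
print(per_row)
```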

6. Count missing values in a specific column

df['A'].isnull().sum()

As shown above, putting a column name inside [] returns the number of missing values in that column.
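
The same pattern extends to several columns at once, and the boolean mask can also pull out the rows where a column is missing. A minimal sketch on a small hand-made frame (the frame and variable names are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Small illustrative frame (not the article's random data).
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [np.nan, np.nan, 6.0],
    "C": [7.0, 8.0, 9.0],
})

# A single column name gives one number.
count_a = int(df["A"].isnull().sum())

# A list of column names gives one count per selected column.
counts_ab = df[["A", "B"]].isnull().sum()

# The mask for one column selects the rows where that column is missing.
rows_missing_a = df[df["A"].isnull()]

print(count_a)  # 1
print(counts_ab)
print(rows_missing_a)
```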

 
