
[National Urban Park Standard Data] Park Distribution by Province (시도)

yeon42 2021. 8. 13. 00:49

4. Park distribution by province (시도)

 

4.1 Share of parks by province

 

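  • Count the parks in each province ("시도")
# Number of parks per province, as a one-column DataFrame
city_count = df["시도"].value_counts().to_frame()
# Share of all parks per province (despite the name, this holds proportions, not means)
city_mean = df["시도"].value_counts(normalize=True).to_frame()

  - normalize=True : returns each province's share of the total instead of the raw count

  - Both results are converted to DataFrames with to_frame() so that they can be merged.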

 

 

 

 

 

  • Show the counts and the shares together: merge
# Join on the index (the province name), which both frames share
city = city_count.merge(city_mean, left_index=True, right_index=True)
city.columns = ["합계", "비율"]  # 합계 = count, 비율 = share
city.style.background_gradient()
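To make the merge step easier to follow, here is a minimal sketch on made-up data. The frame `toy_df` and the names `counts`, `shares`, and `toy_city` are hypothetical stand-ins, not part of the original notebook; only the "시도" column name is taken from the post.

```python
import pandas as pd

# Toy stand-in for the real dataset: one row per park (values are made up)
toy_df = pd.DataFrame(
    {"시도": ["경기도", "경기도", "경기도", "서울특별시", "서울특별시", "부산광역시"]}
)

# Raw counts and proportions, both indexed by province name
counts = toy_df["시도"].value_counts().to_frame()
shares = toy_df["시도"].value_counts(normalize=True).to_frame()

# Align the two frames on their shared index, then give the columns readable names
toy_city = counts.merge(shares, left_index=True, right_index=True)
toy_city.columns = ["합계", "비율"]

print(toy_city)
```

Because both frames come from the same column, their indexes hold the same province names, so an index-on-index merge lines every count up with its share. Renaming the columns afterwards sidesteps whatever default names the installed pandas version assigns.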

 

 

 

 


 

 

4.2 Distribution by park type

 

  • Color the points by "공원구분" (park type) and scale the marker size by "공원면적" (park area)
plt.figure(figsize=(8, 9))
# x = longitude (경도), y = latitude (위도), so each point sits at the park's location
sns.scatterplot(data=df_park, x="경도", y="위도", hue="공원구분", size="공원면적", sizes=(10, 100))
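The sketch below shows how `hue`, `size`, and `sizes` interact, using a few made-up parks. The frame `toy_park`, its coordinates and areas, and the `Agg` backend choice are assumptions for illustration; only the column names come from the post.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up parks: location, type, and area (not real data)
toy_park = pd.DataFrame({
    "경도": [126.98, 127.03, 129.08, 126.53, 127.38],
    "위도": [37.57, 37.50, 35.18, 33.50, 36.35],
    "공원구분": ["근린공원", "어린이공원", "근린공원", "소공원", "어린이공원"],
    "공원면적": [120000.0, 1500.0, 45000.0, 800.0, 2300.0],
})

fig, ax = plt.subplots(figsize=(8, 9))
# hue picks one color per park type; size scales each marker by park area,
# and sizes=(10, 100) sets the smallest and largest marker area in points^2
sns.scatterplot(data=toy_park, x="경도", y="위도",
                hue="공원구분", size="공원면적", sizes=(10, 100), ax=ax)

# The drawn marker sizes, one per park
marker_areas = ax.collections[0].get_sizes()
print(marker_areas.min(), marker_areas.max())
```

The park with the smallest area gets the lower bound of `sizes` and the largest gets the upper bound, with the rest scaled in between. One very large park can therefore make most of the other markers look nearly the same size.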

 

 

 

 

 


 

 

4.3 Park distribution by province

 

  • Color the points by "시도" (province) and scale the marker size by "공원면적" (park area)
plt.figure(figsize=(8, 9))
# Same map-style scatter as before, but colored by province instead of park type
sns.scatterplot(data=df_park, x="경도", y="위도", hue="시도", size="공원면적", sizes=(10, 100))

 

 

 

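  • Plot the number of parks per province with countplot
# order=city_count.index lists the provinces from most to fewest parks
# Note: recent seaborn versions warn when palette is passed without hue
sns.countplot(data=df, y="시도", order=city_count.index, palette="Greens_r")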
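As a quick check of why `order=city_count.index` sorts the bars, the sketch below reproduces the tally that countplot draws, without plotting. `toy_df`, `toy_count`, `bar_order`, and `bar_heights` are hypothetical names on made-up data.

```python
import pandas as pd

# Made-up rows, one per park, deliberately not grouped by province
toy_df = pd.DataFrame(
    {"시도": ["서울특별시", "경기도", "부산광역시", "경기도", "서울특별시", "경기도"]}
)

# value_counts() sorts by descending count by default, so its index is
# already the most-to-fewest order that gets passed to order=
toy_count = toy_df["시도"].value_counts().to_frame()
bar_order = list(toy_count.index)

# countplot draws one bar per category whose length is the row count
bar_heights = [int((toy_df["시도"] == name).sum()) for name in bar_order]

print(bar_order)
print(bar_heights)
```

Without `order`, seaborn uses the order in which the categories appear in the data, so reusing the `value_counts()` index is a convenient way to get a ranked chart.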