yeon's 👩🏻‍💻

[Data Analysis] Pandas, Seaborn, plot, unstack

yeon42 2021. 8. 4. 23:07
  • Group by 상권업종소분류명 (business sub-category) and 시군구명 (district), then count the frequency using '상호명' (store name)
g = df_academy_selectd.groupby(["상권업종소분류명", "시군구명"])["상호명"].count()
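As a rough illustration of what this groupby returns, here is a minimal sketch on made-up rows. `df_toy`, `g_toy`, and `size_toy` are hypothetical names, not part of the original notebook. It shows that the result is a Series with a two-level index, and that `count()` skips missing 상호명 values while `size()` counts rows.

```python
import pandas as pd

# Made-up stand-in for df_academy_selectd (not the real data)
df_toy = pd.DataFrame({
    "상권업종소분류명": ["학원-입시", "학원-입시", "학원-입시", "학원-음악", "학원-음악"],
    "시군구명": ["강남구", "강남구", "서초구", "강남구", "서초구"],
    "상호명": ["A", "B", "C", "D", None],  # one missing store name
})

keys = ["상권업종소분류명", "시군구명"]
g_toy = df_toy.groupby(keys)["상호명"].count()  # non-missing values per group
size_toy = df_toy.groupby(keys).size()          # rows per group

print(g_toy.index.nlevels)  # 2 levels: sub-category, district
print(g_toy)
print(size_toy)
```

If 상호명 can be missing in the real data, the two results can differ, so it may be worth checking which one is intended.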

 

 

1.2.4 Visualizing with Pandas plot

 

- Select only the '학원-입시' (entrance-exam academy) data and visualize it

g.loc["학원-입시"].sort_values().plot.barh(figsize=(10, 7))
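A small sketch of why `g.loc[...]` works here, assuming a hand-built Series `g_toy` with made-up counts (hypothetical, not the real data). Selecting a label from the outer index level returns a Series indexed by 시군구명 only, and sorting ascending puts the largest value last, which `barh` draws at the top.

```python
import pandas as pd

# Hand-built stand-in for g with made-up counts
idx = pd.MultiIndex.from_tuples(
    [("학원-음악", "강남구"), ("학원-음악", "서초구"),
     ("학원-입시", "강남구"), ("학원-입시", "서초구"), ("학원-입시", "양천구")],
    names=["상권업종소분류명", "시군구명"],
)
g_toy = pd.Series([30, 20, 120, 80, 90], index=idx, name="상호명")

# Selecting on the outer level drops it; only 시군구명 remains as the index
exam = g_toy.loc["학원-입시"].sort_values()

print(exam.index.name)   # 시군구명
print(list(exam.index))  # ['서초구', '양천구', '강남구']
```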

 

g.plot.bar()

  - The grouped data has a multi-index, so this plot is hard to read. -> How can we improve it?

 

 

 

 


 

 

1.2.5 Understanding unstack()

 

https://pandas.pydata.org/docs/user_guide/reshaping.html

 

Reshaping and pivot tables — pandas 1.3.1 documentation


 

 

  • As seen above, g has a multi-index.

 

  • Apply unstack()

  - This reshapes the data into a pivot-style table
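The reshaping can be sketched on a tiny hand-built Series (`g_toy` is hypothetical, with made-up counts). By default `unstack()` moves the innermost index level (here 시군구명) into the columns, and combinations that do not exist in the data become NaN unless `fill_value` is given.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [("학원-음악", "강남구"), ("학원-입시", "강남구"), ("학원-입시", "서초구")],
    names=["상권업종소분류명", "시군구명"],
)
g_toy = pd.Series([30, 120, 80], index=idx, name="상호명")

wide = g_toy.unstack()                     # rows: sub-category, columns: district
wide_filled = g_toy.unstack(fill_value=0)  # missing combinations become 0

print(wide)
print(wide_filled)
```

Each column of the wide table becomes its own bar series when plotted, which is why the plot after `unstack()` is easier to read than the multi-index one.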

 

 

 

  • As a barh (horizontal bar) chart
g.unstack().plot.barh(figsize=(8, 9))

 

g.unstack().loc["학원-입시"].plot.barh(figsize=(8, 9))

 

 

  • T = transpose()
g.unstack().T.plot.bar(figsize=(15, 5))

  - Overall, we can see that 강남구 (Gangnam-gu), 서초구 (Seocho-gu), 양천구 (Yangcheon-gu), ... have the most academies
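To check a reading like this numerically rather than by eye, one option is to sum the transposed table per district. This is a sketch on made-up counts (`g_toy` is hypothetical); the real ranking depends on the actual data.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [("학원-음악", "강남구"), ("학원-음악", "서초구"),
     ("학원-입시", "강남구"), ("학원-입시", "서초구"), ("학원-입시", "양천구")],
    names=["상권업종소분류명", "시군구명"],
)
g_toy = pd.Series([30, 20, 120, 80, 90], index=idx, name="상호명")

# Transposed pivot: rows are districts, columns are sub-categories
by_district = g_toy.unstack(fill_value=0).T

# Unstacking the outer level directly gives the same layout
alt = g_toy.unstack(level=0, fill_value=0)

# Total academies per district, largest first
totals = by_district.sum(axis=1).sort_values(ascending=False)
print(totals)
```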

 

 

 


 

1.2.6 Drawing the same graphs with seaborn

https://pandas.pydata.org/pandas-docs/stable/user_guide/reshaping.html

 


 

 

  • Check the index of the grouped values
g.index

 

 

  • Turn the index values into columns and rename
t = g.reset_index()
t = t.rename(columns={"상호명":"상호수"})
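A minimal sketch of this step on a hand-built Series (`g_toy` and `t_toy` are hypothetical, with made-up counts). `reset_index()` turns both index levels into ordinary columns, giving the long, one-row-per-combination table that seaborn expects for `data=`.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples(
    [("학원-음악", "강남구"), ("학원-입시", "강남구"), ("학원-입시", "서초구")],
    names=["상권업종소분류명", "시군구명"],
)
g_toy = pd.Series([30, 120, 80], index=idx, name="상호명")

# Index levels become columns; the values column keeps the Series name
t_toy = g_toy.reset_index()
# The values are counts, so rename 상호명 (store name) to 상호수 (store count)
t_toy = t_toy.rename(columns={"상호명": "상호수"})

print(t_toy)
```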

 

  • Draw a bar chart with 시군구명 on the x-axis and 상호수 (store count) on the y-axis
plt.figure(figsize=(15, 4))
sns.barplot(data=t, x="시군구명", y="상호수", ci=None)

  - Colors vary by 상호수
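One thing that may be worth keeping in mind: `t` has several rows per 시군구명 (one per sub-category), and `sns.barplot` by default draws the mean of `y` for each `x` value, so these bars are per-district averages across sub-categories rather than totals. `ci=None` only turns off the error bars (in seaborn 0.12 and later, `ci` is deprecated in favor of `errorbar=None`). The pandas-only sketch below, on a made-up `t_toy` (hypothetical), shows the difference between the two aggregations.

```python
import pandas as pd

# Made-up long-format table shaped like t
t_toy = pd.DataFrame({
    "상권업종소분류명": ["학원-음악", "학원-음악", "학원-입시", "학원-입시", "학원-입시"],
    "시군구명": ["강남구", "서초구", "강남구", "서초구", "양천구"],
    "상호수": [30, 20, 120, 80, 90],
})

# What barplot's default estimator (mean) would show per district
bar_heights = t_toy.groupby("시군구명")["상호수"].mean()
# Total store count per district, for comparison
district_totals = t_toy.groupby("시군구명")["상호수"].sum()

print(bar_heights)
print(district_totals)
```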

 

 

  • Draw a bar chart with 상권업종소분류명 on the x-axis and 상호수 on the y-axis
plt.figure(figsize=(15, 4))
sns.barplot(data=t, x="상권업종소분류명", y="상호수", ci=None)

  - Colors vary by 시군구명

 

 

 

  • Keep only the subset where 상권업종소분류명 is '학원-입시'
academy_sub = t[t["상권업종소분류명"] == "학원-입시"].copy()

plt.figure(figsize=(15, 4))
sns.barplot(data=academy_sub, x="시군구명", y="상호수")

  - Plots only the 학원-입시 counts, by district

  - So 강남구, 양천구, 서초구, ... have many entrance-exam academies
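If the bars should be ordered by count, one option (an assumption, not something the original notebook does) is to sort the subset and pass the resulting district list to the `order` parameter of `sns.barplot`. A sketch on made-up rows, where `t_toy` and `order` are hypothetical names:

```python
import pandas as pd

# Made-up long-format table shaped like t
t_toy = pd.DataFrame({
    "상권업종소분류명": ["학원-음악", "학원-입시", "학원-입시", "학원-입시"],
    "시군구명": ["강남구", "서초구", "강남구", "양천구"],
    "상호수": [30, 80, 120, 90],
})

# Boolean mask keeps only the entrance-exam rows; .copy() makes an independent frame
academy_sub = t_toy[t_toy["상권업종소분류명"] == "학원-입시"].copy()

# Districts from largest to smallest count
order = academy_sub.sort_values("상호수", ascending=False)["시군구명"].tolist()
print(order)  # ['강남구', '양천구', '서초구']

# With seaborn imported as sns, the list could then be used as:
# sns.barplot(data=academy_sub, x="시군구명", y="상호수", order=order)
```

Because the subset has exactly one row per district, each bar here is the raw count rather than an average.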

 

 

 

  • Draw subplots with catplot
sns.catplot(data=t, x="상권업종소분류명", y="상호수", kind="bar", col="시군구명", col_wrap=4, sharex=False)

  - sharex=False : show the x-axis values under each subplot

 

 

 

 

 

 
