💡 WIDA/DACON Classification-Regression

[DACON/최다예] Domain Knowledge for the Celestial Object Type Classification Competition

다예뻐 2023. 3. 16. 23:47

Sloan Digital Sky Survey (SDSS)

Goal : train on the train data and predict the type of each celestial object in the test data

 

[type]

= Source type : the classification of the celestial object

 

QSO : quasar

- A very distant, very luminous galaxy with an active galactic nucleus (AGN)

- Among the most luminous objects known

- Shows broad emission lines, with strong emission in the visible and X-ray bands

- Typically has a very large redshift

(https://terms.naver.com/entry.naver?docId=5741238&cid=60217&categoryId=60217)

STAR_RED_DWARF : red dwarf

STAR_WHITE_DWARF : white dwarf

STAR_BROWN_DWARF : brown dwarf

- A substellar object with a mass between that of the lightest stars (for example, an M9V red dwarf) and the heaviest gas giant planets (giant Jupiter-like planets of roughly 13 Jupiter masses)

(https://terms.naver.com/entry.naver?docId=5753054&cid=62801&categoryId=62801)

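STAR_SUB_DWARF : subdwarf star

- A star fainter than a main-sequence star of the same spectral type; not to be confused with a dwarf galaxy, which is a small galaxy with only about 1/100 to 1/1000 the mass of a typical galaxy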

(https://m.terms.naver.com/entry.naver?docId=3557852&cid=40942&categoryId=32290)

STAR_BHB : blue horizontal branch star

STAR_CATY_VAR : cataclysmic variable star

- A binary system in which one star has become a white dwarf, neutron star, or black hole, and which brightens sharply when gas from the atmosphere of its companion red giant flows onto it

(https://terms.naver.com/entry.naver?docId=1621093&cid=50316&categoryId=50316)

SERENDIP_RED, SERENDIP_BLUE, SERENDIP_DISTANT : objects lying outside the stellar locus

- The stellar locus is the region of color space where ordinary stars fall; "stellar" simply refers to what we commonly call stars

(https://astro.kasi.re.kr/learning/pageView/6372)

SERENDIPITY_FIRST : an object coincident with a source in the FIRST radio survey but fainter than the limit used for quasar target selection

SERENDIPITY_MANUAL : a manually selected target

SKY : blank sky

ROSAT_D : an object detected in the X-ray band (by the ROSAT satellite) that can also be observed with the SDSS telescope

GALAXY : galaxy

STAR_CARBON : carbon star

- A star whose atmosphere contains more carbon than oxygen

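SERENDIPITY_RED : presumably the same category as SERENDIP_RED above

SPECTROPHOTO_STD : spectrophotometric standard star (used for flux calibration)

REDDEN_STD : reddening standard star (used to estimate dust reddening)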

 

[Brightness of light]

u : Ultraviolet

g : Green

r : Red

i : Near Infrared (the shortest-wavelength part of the infrared)

z : Infrared

fiberID : ID number of the optical fiber used to observe the object

- An optical fiber is a fiber that carries light by total internal reflection with very little loss

- Compared with copper wire, it can carry far more data over much longer distances

- Glass is used for optical fibers because it has low signal loss, suffers far less electromagnetic interference, and withstands high temperatures well

(https://terms.naver.com/entry.naver?docId=5741207&cid=60217&categoryId=60217)

psfMag : Point spread function magnitudes

- Brightness measured by treating a distant object as a point source

fiberMag : Fiber magnitudes

- The brightness that would be measured when observing the object through an optical fiber 3 arcseconds in diameter

- That is, the brightness of the light that passes into the optical fiber

petroMag : Petrosian Magnitudes

- A magnitude measured within an aperture set by the object's own light profile, so that brightness can be compared consistently regardless of the object's position and distance

modelMag : Model magnitudes

- A magnitude obtained by fitting a model of how brightness falls off with distance from the object's center

(https://moondol-ai.tistory.com/m/59)
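
All of the brightness columns above are magnitudes, which are logarithmic: a smaller magnitude means a brighter object, and a difference of 5 magnitudes corresponds to a factor of 100 in flux. The sketch below uses the conventional magnitude relation to show this; the function names and sample values are made up for illustration. SDSS catalog magnitudes are actually asinh magnitudes, which agree closely with this relation for well-detected objects. Subtracting the magnitudes of two bands (for example u minus g) gives a color, a quantity often used as a feature, though the post itself does not prescribe it.

```python
def flux_ratio(mag_a, mag_b):
    """Flux of object A divided by flux of object B, given their magnitudes.

    Uses the conventional relation m_a - m_b = -2.5 * log10(f_a / f_b).
    """
    return 10 ** (-0.4 * (mag_a - mag_b))


def color_index(mag_bluer_band, mag_redder_band):
    """Color as a magnitude difference between two bands, e.g. u - g."""
    return mag_bluer_band - mag_redder_band


# Invented example values, not taken from the competition data.
ratio = flux_ratio(15.0, 20.0)      # a magnitude-15 object vs. a magnitude-20 object
u_minus_g = color_index(19.5, 18.5)  # hypothetical u and g magnitudes

print(ratio)
print(u_minus_g)
```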

 


Regression and Classification

 

Regression and Classification are two of the most important problem types in machine learning and among the most commonly used in data analysis

Both belong to supervised learning

Supervised learning trains a model on data that pairs each input with a correct answer (label); the trained model is then used to predict the output for new inputs

 

  1. Regression

The problem of predicting a continuous value

It looks for the relationship between the input variables and the output variable

Input variable = independent variable, output variable = dependent variable

Commonly used algorithms include Linear Regression, Polynomial Regression, and Support Vector Regression

These algorithms model the relationship between the input and output variables and use it to predict the output value for a new input

 

ex) Predicting apartment prices - take independent variables such as the apartment's area, location, and floor as input and predict its price
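
As a minimal sketch of the apartment example, the code below fits a straight line with ordinary least squares using a single input variable (area). The areas, prices, and function names are invented for illustration; a real model would use more variables and a library implementation.

```python
# Simple linear regression (one input variable) fitted by ordinary least squares.
# The areas and prices below are invented for illustration only.

def fit_line(xs, ys):
    """Return (slope, intercept) minimizing the squared error of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict the dependent variable for a new value of the independent variable."""
    slope, intercept = model
    return slope * x + intercept


areas = [50.0, 60.0, 80.0, 100.0]       # independent variable: area in square meters (made up)
prices = [250.0, 300.0, 400.0, 500.0]   # dependent variable: price in arbitrary units (made up)

model = fit_line(areas, prices)
predicted_price = predict(model, 70.0)  # continuous-valued prediction for an unseen area
print(predicted_price)
```

The output is a continuous number rather than a category, which is what makes this a regression problem.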

 

  2. Classification

The problem of assigning an input to one of several categories

The output value is called a class or label

Commonly used algorithms include Logistic Regression, Decision Tree, Random Forest, Naive Bayes, and Support Vector Machine

These algorithms learn the relationship between the given input data and the classes and use it to predict the class of a new input

 

ex) Predicting whether an email is spam - use input variables such as the email's subject and body to classify it as spam or not spam
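
The spam example can be sketched with Naive Bayes, one of the algorithms listed above. This is a toy multinomial Naive Bayes with add-one smoothing written from scratch; the messages, labels ("spam" and "ham"), and function names are all made up for illustration and are far too few for a real filter.

```python
import math
from collections import Counter

# Tiny multinomial Naive Bayes text classifier with Laplace (add-one) smoothing.
# The e-mails and labels below are invented for illustration only.

def train(messages, labels):
    """Count word frequencies per label and collect the vocabulary."""
    word_counts = {}                 # label -> Counter of word frequencies
    label_counts = Counter(labels)   # label -> number of training messages
    vocabulary = set()
    for text, label in zip(messages, labels):
        words = text.lower().split()
        word_counts.setdefault(label, Counter()).update(words)
        vocabulary.update(words)
    return word_counts, label_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log-probability for the given text."""
    word_counts, label_counts, vocabulary = model
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_messages in label_counts.items():
        # log P(label) + sum over words of log P(word | label)
        score = math.log(n_messages / total_messages)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen during training
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


train_messages = [
    "win free prize now",
    "free money click now",
    "meeting agenda for monday",
    "lunch with the project team",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = train(train_messages, train_labels)
prediction = classify(model, "free prize money")
print(prediction)
```

Here the output is one of a fixed set of labels, which is the same shape of problem as predicting a celestial object's type from its measured features.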

 

 

Together, regression and classification are applied across many fields of data analysis

Understanding these two problem types is an important foundation for understanding machine learning