
[DACON/조아영] Celestial Object Classification Competition

๋ ค์šฐ 2023. 5. 6. 01:01

๊ณผ์ •

1. EDA ๋ฐ ์ „์ฒ˜๋ฆฌ

2. ๋ชจ๋ธ๋ง ๋ฐ ๊ฒฐ๊ณผ

3. ์ธ์‚ฌ์ดํŠธ ๋„์ถœ


1. EDA ๋ฐ ์ „์ฒ˜๋ฆฌ

Training set์˜ ๊ฒฝ์šฐ ์ด 23๊ฐœ์˜ column์œผ๋กœ ์ด๋ฃจ์–ด์ ธ ์žˆ์œผ๋ฉฐ ๋ฐ์ดํ„ฐ๋Š” ์•ฝ 20๋งŒ๊ฑด์ด ์กด์žฌํ•œ๋‹ค. 

Test set์˜ ๊ฒฝ์šฐ ์ด 22๊ฐœ์˜ column์œผ๋กœ Training set๊ณผ๋Š” ๋‹ค๋ฅด๊ฒŒ 'type' column์ด ์กด์žฌํ•˜์ง€ ์•Š๋Š”๋‹ค. 

์ด๋Š” Test set์„ ์ด์šฉํ•ด ์˜ˆ์ธก ํ›„ submission ํŒŒ์ผ์„ ๋งŒ๋“ค์–ด ์ œ์ถœํ•˜๋Š” ์šฉ๋„์ด๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค. 

Submission file์˜ ๊ฒฝ์šฐ column์€ test set์˜ ๋ฐ์ดํ„ฐ id, ๋ณ„๋“ค์˜ type๋“ค์ด ์กด์žฌํ•œ๋‹ค. 

๊ฐ type์„ ์–ด๋Š์ •๋„์˜ ํ™•๋ฅ ๋กœ ์˜ˆ์ธกํ–ˆ๋Š”์ง€ ๊ธฐ๋ก ํ›„ ์ œ์ถœํ•˜๋Š” ํ˜•ํƒœ์ด๋‹ค. 

 

ํ‰๊ฐ€ ๋ฐฉ๋ฒ•์€ log_loss๋ฅผ ์ด์šฉํ•˜๋ผ๊ณ  ํ–ˆ์œผ๋‚˜, ์ผ๋‹จ์€ ์ •ํ™•๋„์™€ ์ „๋ฐ˜์ ์ธ ์˜ˆ์ธก ํ™•๋ฅ  ์œ„์ฃผ๋กœ ํ™•์ธํ•˜๊ณ ์ž ํ•œ๋‹ค. 

1) ์ˆ˜์น˜, ํ…์ŠคํŠธ ํ•ด์„

์ „๋ฐ˜์ ์ธ ๊ธฐ์ดˆํ†ต๊ณ„๋Ÿ‰์„ ํ™•์ธํ•ด๋ณธ ๊ฒฐ๊ณผ ์ตœ๋Œ“๊ฐ’๊ณผ ์ตœ์†Ÿ๊ฐ’ ์‚ฌ์ด์˜ ๊ฐญ์ด ๊ต‰์žฅํžˆ ํฌ๋‹ค. 

์ •๊ทœํ™”์˜ ๊ณผ์ •์„ ๊ฑฐ์น˜๊ฑฐ๋‚˜ outlier๋ฅผ ์ œ๊ฑฐํ•ด์•ผํ•  ๋“ฏ ํ•˜๋‹ค. 

column๋“ค์€ ์ด๋ ‡๊ฒŒ ๊ตฌ์„ฑ๋˜์–ด์žˆ์œผ๋ฉฐ ๊ฐ column๋ณ„ ์„ค๋ช…์€ ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค. 

1. id

Essentially an index.

2. type

The class of celestial object we want to predict.

There are many different types; I'll add details on them later.

3. fiberID

The ID of the optical fiber used when the object was measured.

4. psfMag_u, g, r, i, z

The trailing u, g, r, i, z each denote a wavelength band.

psfMag appears to be the object's brightness measured with a point spread function (PSF).

5. fiberMag_u, g, r, i, z

As before, the letter after '_' denotes the wavelength band.

fiberMag appears to be the brightness measured through the fiber aperture.

6. petroMag_u, g, r, i, z

This appears to be the brightness measured with the Petrosian measurement system.

7. modelMag_u, g, r, i, z

The brightness measured using a model function.

I couldn't find out exactly what that model function is.

 

๊ฐ column๋“ค์€ type์„ ์ œ์™ธํ•˜๊ณ ๋Š” ๋ชจ๋‘ ์ˆ˜์น˜ํ˜•์ด์˜€์œผ๋ฉฐ, ์ด๋ฅผ ํ†ตํ•ด type์„ ์ˆ˜์น˜ํ™”ํ•˜์—ฌ ์˜ˆ์ธกํ•ด์•ผ ํ•จ์„ ์•Œ ์ˆ˜ ์žˆ์—ˆ๋‹ค. 

๊ฒฐ์ธก์น˜๋Š” ์กด์žฌํ•˜์ง€ ์•Š์•˜์œผ๋ฉฐ, type์˜ ๊ฐœ์ˆ˜๋Š” ์ด 19๊ฐœ์˜€๋‹ค. 

๊ฐ type์˜ ๊ฐœ์ˆ˜๋ฅผ ์•Œ์•„๋ณธ ๊ฒฐ๊ณผ ๋‹ค์Œ๊ณผ ๊ฐ™์•˜๊ณ , ์ด๋ฅผ ์‹œ๊ฐํ™”ํ–ˆ์„ ๋•Œ ๋ช…ํ™•ํ•˜๊ฒŒ ํด๋ž˜์Šค์— ๋ถˆ๊ท ํ˜•์ด ์กด์žฌํ•œ๋‹ค๋Š” ๊ฒƒ์„ ์•Œ ์ˆ˜ ์žˆ์—ˆ๋‹ค. 

 

 

2) ๊ทธ๋ž˜ํ”„ ํ•ด์„

๊ฐ type๊ณผ ๋‹ค๋ฅธ ๋ณ€์ˆ˜๋“ค๊ฐ„์˜ ๊ด€๊ณ„๋ฅผ ์•Œ์•„๋ณด์ž. 

์ด 19๊ฐœ์˜ ํƒ€์ž…๋“ค ์ค‘ QSO๊ฐ€ ๊ฐ€์žฅ ํญ๋„“๊ฒŒ fiberID๋ฅผ ๊ฐ€์ง€๊ณ  ์žˆ์œผ๋ฉฐ ๋‚˜๋จธ์ง€ ํƒ€์ž…๋“ค์€ 600์ดˆ๋ฐ˜์ •๋„์˜ ๊ด‘์„ฌ์œ ๋ฅผ ์ด์šฉํ•œ ๊ฒƒ์œผ๋กœ ํŒŒ์•…๋œ๋‹ค. 

 

* ...์ด๋ถ€๋ถ„ ์ง„์ž ๋ชปํ•ด๋จน๊ฒ ์–ด์š” ๋ฏธ์นœ ์ง„์งœ ์ผ์ฃผ์ผ ๋‚ด๋‚ด ์ด๋ถ€๋ถ„๋งŒ ๊ฐ€์ง€๊ณ  ๊ณ ๋ฏผํ–ˆ๋Š”๋ฐ ๋จธ๋ฆฌ ํ•˜๋‚˜๋„ ์•ˆ๋Œ์•„๊ฐ€์š” EDA ๋„ˆ๋ฌด ์–ด๋ ต๋‹ค ์ฐจ๋ผ๋ฆฌ ๋ชจ๋ธ๋ง๋งŒ ํ• ๋ž˜์š” ์ง„์งœ ์—ฌ๋Ÿฌ๋ถ„ ์ง‘๋‹จ์ง€์„ฑ plz

 


2. ๋ชจ๋ธ๋ง ๋ฐ ๊ฒฐ๊ณผ

# type์„ ์ˆซ์ž๋กœ ๋ณ€ํ™˜
column_number = {}
for i, column in enumerate(submission_df.columns):
    column_number[column] = i
    
def to_number(x, dic):
    return dic[x]

train_df['type_num'] = train_df['type'].apply(lambda x: to_number(x, column_number))

๋ฒ ์ด์Šค๋ผ์ธ ์ฝ”๋“œ์—์„œ ์ œ๊ณตํ•ด์ค€ ์ฝ”๋“œ๋ฅผ ์ด์šฉํ•ด type์„ ์ˆซ์ž๋กœ ๋ณ€ํ™˜ํ•ด์ค€๋‹ค. 

# train set์„ ์ด์šฉํ•ด validation set์„ ๋งŒ๋“ค์–ด test set์„ ์ด์šฉํ•ด ํ‰๊ฐ€ ์ „ ๊ฒ€์ฆ์„ ํ•ด๋ณด๊ณ ์ž ํ•จ
X = train_df.drop(columns=['type_num', 'type'], axis=1)
y = train_df['type_num']
X_test = test_df

์ฃผ์–ด์ง„ testset์—๋Š” type์ด ์—†๊ธฐ์— trainset์„ ์ตœ๋Œ€ํ•œ ์ด์šฉํ•˜์—ฌ ํ•™์Šต์„ ์‹œํ‚ค๊ณ  ๊ฒ€์ฆํ•ด๋ณด๋ ค๊ณ  ํ•œ๋‹ค. 

from lightgbm import LGBMClassifier
from sklearn.metrics import log_loss, accuracy_score
from sklearn.model_selection import train_test_split

์‚ฌ์šฉํ•œ ๋ชจ๋ธ์€ ํŠธ๋ฆฌ ๊ธฐ๋ฐ˜ LightGBM์œผ๋กœ GBM ๋ชจ๋ธ๋ณด๋‹ค ํ•™์Šต ์†๋„๊ฐ€ ๋น ๋ฅด๋‹ค๋Š” ์žฅ์ ์ด ์žˆ๋‹ค. 

train_test_split์„ ์ด์šฉํ•ด ๋ฐ์ดํ„ฐ๋ฅผ ๋ถ„๋ฆฌํ•ด์ฃผ๋ฉฐ, ์ด๋•Œ ํด๋ž˜์Šค๋ณ„๋กœ ์ ์ ˆํ•œ ๋น„์œจ์— ๋”ฐ๋ผ ๋ฐ์ดํ„ฐ๋ฅผ ๋‚˜๋ˆ„๊ธฐ ์œ„ํ•ด stratify๋ผ๋Š” ๊ฐ’์„ ์ด์šฉํ–ˆ๋‹ค. ์ด๋ฅผ ํ†ตํ•ด ํด๋ž˜์Šค ๋ถˆ๊ท ํ˜•์„ ์กฐ๊ธˆ์ด๋‚˜๋งˆ ๋ง‰์„ ์ˆ˜ ์žˆ์„ ๊ฒƒ์ด๋ผ๊ณ  ์ƒ๊ฐํ–ˆ๋‹ค. 

 

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=100, stratify=y)

model = LGBMClassifier(n_estimators=500,
            learning_rate=0.01,
            boosting_type='gbdt',
            num_leaves=20,
            max_depth=15)

model.fit(X_train, y_train)
        
y_pred_proba = model.predict_proba(X_val)
y_pred_acc = model.predict(X_val)

print(accuracy_score(y_val, y_pred_acc))

์ฒซ๋ฒˆ์งธ๋กœ๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์€ ๊ฐ’์œผ๋กœ ํ•™์Šต์‹œํ‚ค๊ณ  ๊ฒ€์ฆํ•˜์˜€๋‹ค. ์ •ํ™•๋„๋Š” ์•ฝ 87%์ด๋‹ค. 

์ด๋•Œ ๋ถ€์ŠคํŒ… ํƒ€์ž…์„ dart๋กœ ๋ณ€๊ฒฝํ•˜๊ณ , learning_rate์„ 0.005๋กœ ๋‚ฎ์ถ”์—‡๋‹ค. ๋˜ํ•œ max_depth=18๋กœ ์„ค์ •ํ•˜์˜€๊ณ , ๊ณผ์ ํ•ฉ์„ ๋ง‰๊ธฐ ์œ„ํ•ด n_estimators=400์œผ๋กœ ๋‚ฎ์ถ”์—ˆ๋‹ค. 

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=100, stratify=y)

model = LGBMClassifier(n_estimators=400,
            learning_rate=0.005,
            boosting_type='dart',  # DART (Dropouts meet Multiple Additive Regression Trees)
            num_leaves=20,
            max_depth=18)

model.fit(X_train, y_train)
        
y_pred_proba = model.predict_proba(X_val)
y_pred_acc = model.predict(X_val)

print(accuracy_score(y_val, y_pred_acc))

์ •ํ™•๋„๊ฐ€ ๋‚ฎ์•„์ง„ ๊ฒƒ์„ ํ†ตํ•ด ๊ณผ์ ํ•ฉ์ด ์ƒ๊ธธ ๊ฐ€๋Šฅ ์„ฑ์ด ๋งค์šฐ ๋†’์€ ์ƒํ™ฉ์ด๋ฉฐ, ์ด์ „๊ณผ ๊ฐ™์€ ํŒŒ๋ผ๋ฏธํ„ฐ ๊ฐ’์œผ๋กœ ๋‹ค์‹œํ•œ๋ฒˆ ํ•™์Šตํ•ด๋ณธ๋‹ค. boosting_type๋งŒ dart๋กœ ๋ณ€๊ฒฝํ•˜์—ฌ ํ•™์Šต์‹œํ‚จ ๊ฒฐ๊ณผ ์ฒ˜์Œ๋ณด๋‹ค๋Š” ๋†’์ง€ ์•Š์€ ์ •ํ™•๋„๋ฅผ ๋ณด์ธ๋‹ค. 

์•„๋ฌด๋ž˜๋„ ์ด์ƒ์น˜์˜ ์˜ํ–ฅ์ด ๊ฝค ํฐ ๋“ฏ ํ•˜๋‹ค. ์ด์ƒ์น˜๋ฅผ ์ œ๊ฑฐํ•˜๊ธฐ์—๋Š” ๋ถˆ์•ˆํ•จ์ด ์žˆ์œผ๋‹ˆ ์ •๊ทœํ™” ํ›„ ํ•™์Šต์„ ์‹œ์ผœ๋ณด๋„๋ก ํ•˜๊ฒ ๋‹ค. 

์ •๊ทœํ™”์˜ ๊ฒฝ์šฐ 0~1์‚ฌ์ด์˜ ๊ฐ’์œผ๋กœ ์ •๊ทœํ™”๋ฅผ ํ•˜๋Š” ๋ฐฉ๋ฒ•, log๋ฅผ ์”Œ์›Œ ์ •๊ทœํ™”ํ•˜๋Š” ๋ฐฉ๋ฒ• ๋“ฑ์ด ์žˆ๋Š”๋ฐ, ์ผ๋‹จ 0~1์‚ฌ์ด ๊ฐ’์œผ๋กœ ๋ณ€ํ™˜ํ•˜๋Š” ๋ฐฉ๋ฒ•๋งŒ ์‚ฌ์šฉํ•ด๋ณด๋„๋ก ํ•˜๊ฒ ๋‹ค. 

 

  • Standard Scaler

๊ฐ€์žฅ ๊ธฐ๋ณธ์ ์ธ ์ •๊ทœํ™” ๋ฐฉ์‹์œผ๋กœ ์ •๊ทœ๋ถ„ํฌ๋กœ ๋ณ€ํ™˜ํ•ด์ฃผ๋Š” ์Šค์ผ€์ผ๋Ÿฌ์ด๋‹ค. 

from sklearn.preprocessing import StandardScaler, MinMaxScaler
# Let's try normalization.
X = train_df.drop(columns=['type_num', 'type'], axis=1)
y = train_df['type_num']
X_test = test_df

# StandardScaler: fit on the features and apply the transform
std = StandardScaler()
X = std.fit_transform(X)

# # MinMaxScaler
# mm = MinMaxScaler()
# X = mm.fit_transform(X)

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=100, stratify=y)

model = LGBMClassifier(n_estimators=800,
            learning_rate=0.005,
            boosting_type='gbdt',
            num_leaves=20,
            max_depth=15)

model.fit(X_train, y_train)
        
y_pred_proba = model.predict_proba(X_val)
y_pred_acc = model.predict(X_val)

print(accuracy_score(y_val, y_pred_acc))

ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ๋„ ์กฐ๊ธˆ ์กฐ์ ˆํ•ด ๋ณด๋‹ˆ ๋‹ค์Œ๊ณผ ๊ฐ™์€ ๊ฒฐ๊ณผ๊ฐ€ ๋‚˜์™”๋‹ค. ๊ทธ๋‹ฅ ์ž˜ ๋‚˜์˜ค์ง„ ์•Š๋Š”๋‹ค. 

  • MinMaxScaler

์ตœ๋Œ“๊ฐ’๊ณผ ์ตœ์†Œ๊ฐ’์„ ์ด์šฉํ•œ ์Šค์ผ€์ผ๋Ÿฌ๋กœ ์ž์„ธํ•œ ๋‚ด์šฉ์€ ๋‹ค์Œ์„ ์ฐธ๊ณ ํ•˜์ž. 

 

sklearn.preprocessing.MinMaxScaler (scikit-learn.org)

from sklearn.preprocessing import StandardScaler, MinMaxScaler
# Let's try normalization.
X = train_df.drop(columns=['type_num', 'type'], axis=1)
y = train_df['type_num']
X_test = test_df

# # StandardScaler
# std = StandardScaler()
# X = std.fit_transform(X)

# MinMaxScaler: fit on the features and apply the transform
mm = MinMaxScaler()
X = mm.fit_transform(X)

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=100, stratify=y)

model = LGBMClassifier(n_estimators=1000,
            learning_rate=0.005,
            boosting_type='gbdt',
            num_leaves=20,
            max_depth=18)

model.fit(X_train, y_train)
        
y_pred_proba = model.predict_proba(X_val)
y_pred_acc = model.predict(X_val)

print(accuracy_score(y_val, y_pred_acc))

๋‘๋ฒˆ์งธ๋Š” MinMaxScaler๋ฅผ ์ด์šฉํ–ˆ์œผ๋ฉฐ, ๋งˆ์ฐฌ๊ฐ€์ง€๋กœ ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ์ข€ ๋” ์กฐ์ ˆํ•ด๋ณด์•˜๋”๋‹ˆ ๊ฒฐ๊ณผ๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์ด ๋‚˜์™”๋‹ค. 

 

์•„๋ฌด๋ž˜๋„ ์Šค์ผ€์ผ์ด ๋ฌธ์ œ๊ฐ€ ์•„๋‹ˆ๋ผ ํด๋ž˜์Šค ๋ถˆ๊ท ํ˜•์˜ ๋ฌธ์ œ๊ฐ€ ๊ฐ€์žฅ ํฐ ๋“ฏ ํ•˜๋‹ค. 

์ด ๋ถ€๋ถ„์€ ์ข€ ๋” ๊ณต๋ถ€๋ฅผ ํ•ด๋ณผ ํ•„์š”๊ฐ€ ์žˆ์„ ๊ฒƒ ๊ฐ™๋‹ค. 


3. ์ธ์‚ฌ์ดํŠธ ๋„์ถœ

๋ฐ์ดํ„ฐ ํด๋ž˜์Šค ๋ถˆ๊ท ํ˜•์„ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•ด์„œ ์˜ค๋ฒ„์ƒ˜ํ”Œ๋ง ํ˜น์€ ์–ธ๋”์ƒ˜ํ”Œ๋ง ๋ฐฉ์‹์„ ์ด์šฉํ•ด์•ผ ํ•  ๊ฒƒ์œผ๋กœ ํŒ๋‹จ๋œ๋‹ค. 

์ด๋•Œ ํŠน์ • ํด๋ž˜์Šค๋งŒ ์–ธ๋”์ƒ˜ํ”Œ๋ง ํ›„ ๊ณผํ•˜๊ฒŒ ์ ์€ ํด๋ž˜์Šค๋Š” ์‚ญ์ œํ•ด ํ•™์Šต ๊ฒฐ๊ณผ๋ฅผ ๋ฐ”ํƒ•์œผ๋กœ ํ•™์Šตํ•œ ํด๋ž˜์Šค๊ฐ€ ์•„๋‹ˆ๋ผ๋ฉด ์‚ญ์ œ๋œ ๊ทธ ํด๋ž˜์Šค๋กœ ์˜ˆ์ธกํ•  ์ˆ˜ ์žˆ๋„๋ก ํ•˜๋ฉด ์–ด๋–จ๊นŒ ์‹ถ๋‹ค. ๋‹ค๋งŒ ์–ด๋–ป๊ฒŒ ๊ตฌํ˜„ํ•ด์•ผ ํ• ์ง€๋Š” ์•„์ง ๋” ์•Œ์•„๋ด์•ผํ•  ๊ฒƒ ๊ฐ™๋‹ค. 

 

4. ์•„์‰ฌ์šด์ 

EDA ๋„ˆ๋ฌด ์–ด๋ ต๋‹ค. ์™œ ๋‚ด๊ฐ€ ์›ํ•˜๋Š” ํ˜•ํƒœ๋กœ๋Š” ๊ทธ๋ž˜ํ”„๊ฐ€ ๋‚˜์˜ค์ง€ ์•Š๋Š”๊ฑธ๊นŒ?

 

log_loss ํ•จ์ˆ˜๋ฅผ ์ด์šฉํ•˜๋ ค๋Š”๋ฐ ํŠœํ”Œํ˜•์„ ๋ฐ›์ง€ ์•Š์•„ ์ด ๊ฐ’์„ ๊ตฌํ•˜์ง€ ๋ชปํ–ˆ๋Š”๋ฐ, ์ด ๋ถ€๋ถ„์„ ํ•ด๊ฒฐํ•  ๋ฐฉ์•ˆ์„ ์ฐพ์•„๋ด์•ผํ•  ๊ฒƒ ๊ฐ™๋‹ค. 

 

ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ ์ตœ์ ํ™” ํ•˜๋ ค๊ณ  ํ–ˆ๋Š”๋ฐ 2์‹œ๊ฐ„์„ ๋Œ๋ ธ๋Š”๋ฐ๋„ ํ•™์Šต์ด ์•ˆ๋๋‚˜ ๊ทธ๋ƒฅ ์†์œผ๋กœ ํ•˜๋‚˜ํ•˜๋‚˜ ์ˆ˜์ •ํ•˜๋Š” ๊ฒƒ์ด ๋น ๋ฅผ ๋“ฏ ํ•˜๋‹ค. 

 

ํด๋ž˜์Šค ๋ถˆ๊ท ํ˜•...

์•„์›ƒ๋ผ์ด์–ด๋ณด๋‹ค ์‚ฌ์‹ค ํด๋ž˜์Šค ๋ถˆ๊ท ํ˜•์˜ ์˜ํ–ฅ์ด ์ œ์ผ ํฐ ๋“ฏ ํ•˜๋‹ค. ์ด ๊ธฐํšŒ๋กœ ์˜ค๋ฒ„์ƒ˜ํ”Œ๋ง, ์–ธ๋” ์ƒ˜ํ”Œ๋ง์— ๋Œ€ํ•ด ๊ณต๋ถ€๋ฅผ ํ•ด๋ณด๊ณ  ์ ์šฉํ•ด๋ด์•ผ๊ฒ ๋‹ค. 

 


์–ด๋ ต๋‹ค. 

ํ˜ผ์žํ•˜๋ ค๋‹ˆ ๋จธ๋ฆฌ ํ„ฐ์งˆ ๊ฒƒ ๊ฐ™๋‹ค. 

์•ž์œผ๋กœ์˜ ๊ณ„ํš์„...... ์ข€ ์ˆ˜์ •ํ•ด๋ด์•ผ๊ฒ ๋‹ค. ํšŒ๊ท€๋ฅผ ๋‚˜๊ฐ€๋Š” ๊ฒƒ์ด ๋งž๋Š”๊ฐ€?

๊ทธ๋ž˜๋„ ์ „๋ฐ˜์ ์œผ๋กœ ํ›‘๋Š”๊ฒŒ ์ข‹์ง€ ์•Š์€๊ฐ€? vs ๋ถ„๋ฅ˜ ํ•˜๋‚˜๋ผ๋„ ์ œ๋Œ€๋กœ ํ•ด๋ณด์ž

์—ฌ๋Ÿฌ๋ถ„๋“ค์˜ ์„ ํƒ์€?

 

์ œ ๊ณผ์ •์— ๋Œ€ํ•œ ํ”ผ๋“œ๋ฐฑ๊ณผ ์•ฝ ๋‘๋‹ฌ ๋ฐ˜์ •๋„์˜ ํ•™์Šต์„ ํ†ตํ•œ ์—ฌ๋Ÿฌ๋ถ„๋“ค์˜ ์˜๊ฒฌ๋„ ์ฃผ์‹ญ์‡ผ.

๊ฐ์‚ฌํ•ฉ๋‹ˆ๋‹ค. 

 

๋‹ค๋งŒ. ๊ณผ์ œ๊ฐ€ ๋งŽ๋‹ค๋Š” ๊ฒƒ์€ ์žฌ๊ณ ํ•˜์ง€ ์•Š์„ ์˜ˆ์ •์ž…๋‹ˆ๋‹ค..