포스팅 목차
45. Display department numbers and maximum salary for each department.
* 개별 부서에 근무하는 직원의 최대 급여와 최소 급여를 출력하시오.
- [그룹별 집계함수] 그룹별 최대값과 최소값 집계
- R stackoverflow(Apply several summary functions on several variables by group in one call) : [링크]
- Python stackoverflow(Apply multiple functions to multiple groupby columns) : [링크]
|
1. Oracle(오라클)
Max() 함수와 Min() 함수
Oracle Programming |
select deptno,
max(sal) max_sal,
min(sal) min_sal
from emp
group by deptno;
2. Python Pandas(파이썬)
Agg() 함수와 np.max, np.min 인수
Python Programming |
emp['sal'].groupby(emp['deptno']).agg([np.max, np.min]).reset_index()
Results |
deptno | amax | amin | |
0 | 10 | 5000 | 1300 |
1 | 20 | 3000 | 800 |
2 | 30 | 2850 | 950 |
Agg() 함수와 max, min 인수
Python Programming |
emp.groupby(['deptno'])['sal'].agg(['max','min']).reset_index()
Results |
deptno | max | min | |
0 | 10 | 5000 | 1300 |
1 | 20 | 3000 | 800 |
2 | 30 | 2850 | 950 |
Agg() 함수와 max, min 인수
Python Programming |
emp.groupby('deptno').agg({'sal':['max', 'min']})
Results |
sal | |
max | |
--- | --- |
deptno | |
--- | --- |
10 | 5000 |
20 | 3000 |
30 | 2850 |
Agg() 함수와 max, min 인수
Python Programming |
emp.groupby('deptno').agg( sal_max =('sal', 'max'),
sal_min =('sal', 'min'),
sal_mean =('sal', 'mean'),
sal_sum =('sal', 'sum'),
sal_range=('sal', lambda x: x.max() - x.min())
).reset_index()
Results |
deptno | sal_max | sal_min | sal_mean | sal_sum | sal_range | |
0 | 10 | 5000 | 1300 | 2916.666667 | 8750 | 3700 |
1 | 20 | 3000 | 800 | 2175.000000 | 10875 | 2200 |
2 | 30 | 2850 | 950 | 1566.666667 | 9400 | 1900 |
Apply() 함수와 max, min 함수
Python Programming |
emp.groupby('deptno') \
.apply(lambda x: pd.Series({ 'max_sal' : x['sal'].max(),
'min_sal' : x['sal'].min(),
'mean_sal' : x['sal'].mean(),
'TOT_prodsum' : (x['sal'] * x['comm']).sum()
})
).reset_index()
Results |
deptno | max_sal | min_sal | mean_sal | TOT_prodsum | |
0 | 10 | 5000.0 | 1300.0 | 2916.666667 | 0.0 |
1 | 20 | 3000.0 | 800.0 | 2175.000000 | 0.0 |
2 | 30 | 2850.0 | 950.0 | 1566.666667 | 2855000.0 |
3. R Programming (R Package)
Aggregate() 함수와 max, min 인수
R Programming |
%%R
library(epiDisplay) # 에러 방지('utf-8' codec can't decode byte 0xc0 in position 46: invalid start byte)
aggregate(emp$sal, by = list(Deptno = emp$deptno), FUN = c("max","min"), "na.rm"=FALSE)
Results |
Deptno max.emp.sal min.emp$sal
1 10 5000 1300
2 20 3000 800
3 30 2850 950
Aggregate() 함수와 max(), min() 함수
R Programming |
%%R
aggregate(sal ~ deptno, data = emp, FUN = function(x) c(max = max(x), min = min(x) ) )
Results |
deptno sal.max sal.min
1 10 5000 1300
2 20 3000 800
3 30 2850 950
Aggregate() 함수와 max, min 함수
R Programming |
%%R
aggregate(sal ~ deptno, data = emp, FUN = plyr::each(max = max, min = min))
Results |
deptno sal.max sal.min
1 10 5000 1300
2 20 3000 800
3 30 2850 950
4. R Dplyr Package
Summarise() 함수와 max, min 함수
R Programming |
%%R
emp %>%
dplyr::group_by(deptno) %>%
dplyr::summarise( sal_mac = max(sal), sal_min = min(sal) )
Results |
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 3 x 3
deptno sal_mac sal_min
<dbl> <dbl> <dbl>
1 10 5000 1300
2 20 3000 800
3 30 2850 950
5. R sqldf Package
Max() 함수와 Min() 함수
R Programming |
%%R
sqldf(" select deptno, max(sal) max_sal,min(sal) min_sal
from emp
group by deptno")
Results |
deptno max_sal min_sal
1 10 5000 1300
2 20 3000 800
3 30 2850 950
6. Python pandasql Package
Max() 함수와 Min() 함수
Python Programming |
ps.sqldf("select deptno, max(sal) max_sal,min(sal) min_sal from emp group by deptno")
Results |
deptno | max_sal | min_sal | |
0 | 10 | 5000 | 1300 |
1 | 20 | 3000 | 800 |
2 | 30 | 2850 | 950 |
7. R data.table Package
Max() 함수와 Min() 함수
R Programming |
%%R
DT <- data.table(emp)
DT[ , .(sal_max = max(sal), sal_min = min(sal), sal_cnt = .N), by = .(deptno)]
Results |
deptno sal_max sal_min sal_cnt
1: 20 3000 800 5
2: 30 2850 950 6
3: 10 5000 1300 3
8. SAS Proc SQL
Max() 함수와 Min() 함수
SAS Programming |
%%SAS sas
PROC SQL;
CREATE TABLE STATSAS_1 AS
select deptno,
max(sal) AS max_sal,
min(sal) AS min_sal
from emp
group
by deptno;
QUIT;
PROC PRINT;RUN;
Results |
OBS | deptno | max_sal | min_sal |
1 | 10 | 5000 | 1300 |
2 | 20 | 3000 | 800 |
3 | 30 | 2850 | 950 |
9. SAS Data Step
Proc summary 프로시져와 max, min 인수
SAS Programming |
%%SAS sas
PROC SUMMARY DATA=EMP NWAY;
CLASS deptno;
VAR SAL;
OUTPUT OUT=STATSAS_2(DROP=_:) MAX=MAX_SAL MIN=MIN_SAL;
RUN;
PROC PRINT;RUN;
Results |
OBS | deptno | MAX_SAL | MIN_SAL |
1 | 10 | 5000 | 1300 |
2 | 20 | 3000 | 800 |
3 | 30 | 2850 | 950 |
Max() 함수와 Min() 함수
SAS Programming |
%%SAS sas
PROC SORT DATA=EMP OUT=EMP_1;
BY deptno;
RUN;
DATA STATSAS_3;
SET EMP_1;
BY deptno;
RETAIN MAX_SAL MIN_SAL 0;
IF FIRST.deptno THEN DO;
MAX_SAL = SAL;
MIN_SAL = SAL;
END;
ELSE DO;
MAX_SAL =MAX(MAX_SAL,SAL);
MIN_SAL =MIN(MIN_SAL,SAL);
END;
IF LAST.deptno THEN OUTPUT STATSAS_3; * IF LAST.job;
KEEP DEPTNO MAX_SAL MIN_SAL;
RUN;
PROC PRINT;RUN;
Results |
OBS | deptno | MAX_SAL | MIN_SAL |
1 | 10 | 5000 | 1300 |
2 | 20 | 3000 | 800 |
3 | 30 | 2850 | 950 |
10. Python Dfply Package
Summarize() 함수와 max, min 함수
Python Programming |
emp >> group_by('deptno') >> summarize( sal_max = X.sal.max(), sal_min = X.sal.min() )
Results |
deptno | sal_max | sal_min | |
0 | 10 | 5000 | 1300 |
1 | 20 | 3000 | 800 |
2 | 30 | 2850 | 950 |
[SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE] SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트
반응형
댓글