포스팅 목차
* 파이썬 & R 패키지 호출 및 예제 데이터 생성 링크
[ MAX Oracle Function ]
MAX 함수는 인수중에서 최댓값을 반환한다. 집계 함수, 분석 함수로 이용 가능하다.
- 함수 설명 : MAX 오라클 함수 링크
1. Oracle(오라클)
1) 집계 함수
- Max 함수
Oracle Programming |
SELECT MAX(SAL) "Maximum"
FROM EMP;
Results |
Maximum
------------
5000
2) 분석함수
- Max 함수
Oracle Programming |
SELECT DEPTNO,
MGR, ENAME, SAL,
MAX(SAL) OVER (PARTITION BY DEPTNO) AS mgr_max
FROM EMP
ORDER
BY DEPTNO;
Results |
DEPTNO MGR ENAME SAL MGR_MAX
------------------------------------
10 7839 CLARK 2450 5000
10 7782 MILLER 1300 5000
10 - KING 5000 5000
20 7566 FORD 3000 3000
20 7566 SCOTT 3000 3000
20 7839 JONES 2975 3000
20 7902 SMITH 800 3000
20 7788 ADAMS 1100 3000
30 7698 WARD 1250 2850
30 7698 MARTIN 1250 2850
30 7698 TURNER 1500 2850
30 7698 JAMES 950 2850
30 7698 ALLEN 1600 2850
30 7839 BLAKE 2850 2850
2. Python Pandas(파이썬)
1) 집계 함수
- Max 함수
개별 부서에서 최대 급여를 출력하시오.
Python Programming |
emp.groupby('deptno')['sal'].max()
Results |
deptno
10 5000
20 3000
30 2850
Name: sal, dtype: int64
- Max 함수
Python Programming |
emp.groupby('deptno')['sal'].apply(max)
Results |
deptno
10 5000
20 3000
30 2850
Name: sal, dtype: int64
- Rank() 함수
개별 부서에서 최대 급여를 수령하는 직원의 정보를 출력하시오
Python Programming |
emp.groupby('deptno').apply(lambda x : x[ x['sal'].rank(method="first",ascending=False) == 1 ])
Results |
empno ename job mgr hiredate sal comm deptno
deptno
10 8 7839 KING PRESIDENT NaN 1981/11/17 5000 NaN 10
20 7 7788 SCOTT ANALYST 7566.0 1982/12/09 3000 NaN 20
30 5 7698 BLAKE MANAGER 7839.0 1981/03/01 2850 NaN 30
2) 분석함수
- Max 함수
Python Programming |
withmooc = emp.copy()
group_max = withmooc.groupby('deptno')['sal'].max()
withmooc['sal_max'] = withmooc['deptno'].map(group_max)
withmooc.sort_values('deptno').head()
Results |
empno ename job mgr hiredate sal comm deptno sal_max
6 7782 CLARK MANAGER 7839.0 1981/01/09 2450 NaN 10 5000
8 7839 KING PRESIDENT NaN 1981/11/17 5000 NaN 10 5000
13 7934 MILLER CLERK 7782.0 1982/01/23 1300 NaN 10 5000
0 7369 SMITH CLERK 7902.0 1980/12/17 800 NaN 20 3000
3 7566 JONES MANAGER 7839.0 1981/04/02 2975 NaN 20 3000
- Transform()과 Max 함수
Python Programming |
withmooc = emp.copy()
withmooc['max_value'] = emp.groupby('deptno')['sal'].transform(lambda x:x.max())
withmooc.sort_values('deptno').head()
Results |
empno ename job mgr hiredate sal comm deptno max_value
6 7782 CLARK MANAGER 7839.0 1981/01/09 2450 NaN 10 5000
8 7839 KING PRESIDENT NaN 1981/11/17 5000 NaN 10 5000
13 7934 MILLER CLERK 7782.0 1982/01/23 1300 NaN 10 5000
0 7369 SMITH CLERK 7902.0 1980/12/17 800 NaN 20 3000
3 7566 JONES MANAGER 7839.0 1981/04/02 2975 NaN 20 3000
- Transform() 과 Max 함수
Python Programming |
withmooc = emp.copy()
withmooc['max_value'] = emp.groupby('deptno')['sal'].transform('max')
withmooc.sort_values('deptno').head()
Results |
empno ename job mgr hiredate sal comm deptno max_value
6 7782 CLARK MANAGER 7839.0 1981/01/09 2450 NaN 10 5000
8 7839 KING PRESIDENT NaN 1981/11/17 5000 NaN 10 5000
13 7934 MILLER CLERK 7782.0 1982/01/23 1300 NaN 10 5000
0 7369 SMITH CLERK 7902.0 1980/12/17 800 NaN 20 3000
3 7566 JONES MANAGER 7839.0 1981/04/02 2975 NaN 20 3000
3. R Programming (R Package)
1) 집계 함수
- Aggregate()와 Max
R Programming |
%%R
aggregate(sal ~ deptno, data=emp , FUN=max)
Results |
deptno sal
1 10 5000
2 20 3000
3 30 2850
- By와 Max() 함수
R Programming |
%%R
unlist(with(emp, by(sal, deptno, max)))
Results |
deptno: 10
[1] 5000
------------------------------------------------------------
deptno: 20
[1] 3000
------------------------------------------------------------
deptno: 30
[1] 2850
- Map()과 Max 함수
R Programming |
%%R
do.call(rbind, Map(max, split(emp$sal, emp$deptno)))
Results |
[,1]
10 5000
20 3000
30 2850
- purrr:map_dfr()과 max 함수
R Programming |
%%R
# purrr::map_chr(split(emp,emp$deptno), ~max(.x$sal))
purrr::map_dfr(split(emp,emp$deptno), ~max(.x$sal))
Results |
# A tibble: 1 x 3
`10` `20` `30`
<dbl> <dbl> <dbl>
1 5000 3000 2850
- purrr:map_dfr()과 max 함수
R Programming |
%%R
do.call(rbind, purrr::map_dfr(split(emp,emp$deptno), ~max(.x$sal)) )
Results |
[,1]
10 5000
20 3000
30 2850
- Lapply()와 Max 함수
R Programming |
%%R
do.call(rbind, lapply(split(emp$sal, emp$deptno), max))
Results |
[,1]
10 5000
20 3000
30 2850
2) 분석함수
- ave()와 Max
R Programming |
%%R
withmooc <- emp
withmooc['Max_value'] = with(withmooc, ave(x = sal, deptno, FUN=max)) # FUN =function(x) { max(x) }
withmooc[order(withmooc$deptno),][1:10, ]
Results |
# A tibble: 14 x 9
empno ename job mgr hiredate sal comm deptno Max_value
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl> <dbl>
1 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 5000
2 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 5000
3 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10 5000
4 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 3000
5 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 3000
6 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 3000
7 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20 3000
8 7902 FORD ANALYST 7566 1981-12-03 3000 NA 20 3000
9 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30 2850
10 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 2850
11 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30 2850
12 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30 2850
13 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30 2850
14 7900 JAMES CLERK 7698 1981-12-03 950 NA 30 2850
- ave()와 Max
R Programming |
%%R
withmooc <- emp
withmooc['Max_value'] = ave(x = withmooc$sal, withmooc$deptno, FUN = max) # FUN = function (x) { max(x) }
withmooc[order(withmooc$deptno),][1:10, ]
Results |
# A tibble: 10 x 9
empno ename job mgr hiredate sal comm deptno Max_value
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl> <dbl>
1 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 5000
2 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 5000
3 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10 5000
4 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 3000
5 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 3000
6 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 3000
7 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20 3000
8 7902 FORD ANALYST 7566 1981-12-03 3000 NA 20 3000
9 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30 2850
10 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 2850
- purrr::map_chr()와 Max 함수
R Programming |
%%R
withmooc <- emp
Min_var <- with(withmooc, purrr::map_chr(base::split(withmooc,withmooc$deptno), ~max(.x$sal)) )
data.frame(withmooc, Max_value =rep(Min_var, table(withmooc$deptno)))[1:10, ]
Results |
empno ename job mgr hiredate sal comm deptno Max_value
1 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 5000.000000
2 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30 5000.000000
3 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 5000.000000
4 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 3000.000000
5 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30 3000.000000
6 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30 3000.000000
7 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 3000.000000
8 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 3000.000000
9 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 2850.000000
10 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30 2850.000000
- Tapply()와 Max 함수
R Programming |
%%R
withmooc <- emp
Max_var <- with(withmooc, tapply(sal, deptno, max))
data.frame(emp, sal_max =rep(Max_var, table(withmooc$deptno)))
# with(withmooc, data.frame(emp, sal_max=rep(tapply(sal, deptno, max), table(deptno))) )[1:10, ]
Results |
empno ename job mgr hiredate sal comm deptno sal_max
1 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 5000
2 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30 5000
3 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 5000
4 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 3000
5 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30 3000
6 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30 3000
7 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 3000
8 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 3000
9 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 2850
10 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30 2850
11 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20 2850
12 7900 JAMES CLERK 7698 1981-12-03 950 NA 30 2850
13 7902 FORD ANALYST 7566 1981-12-03 3000 NA 20 2850
14 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10 2850
- plyr::ddply()와 max 함수
R Programming |
%%R
library(plyr) # dplyr 보다 먼저 로딩
library(dplyr)
plyr::ddply(emp,.(deptno),transform,sal_max = max(sal))[1:10, ]
Results |
empno ename job mgr hiredate sal comm deptno sal_max
1 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 5000
2 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 5000
3 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10 5000
4 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 3000
5 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 3000
6 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 3000
7 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20 3000
8 7902 FORD ANALYST 7566 1981-12-03 3000 NA 20 3000
9 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30 2850
10 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 2850
- purrr::map_dfc()와 max() 함수
R Programming |
%%R
withmooc <- emp
mapping <- t( purrr::map_dfc( split(withmooc,withmooc$deptno), ~max(.x$sal) ) )
trans_data <- melt(mapping, value.name="sal_max")
merge(emp,trans_data,by.x='deptno',by.y='Var1')[1:10, ]
Results |
deptno empno ename job mgr hiredate sal comm Var2 sal_max
1 10 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 1 5000
2 10 7839 KING PRESIDENT NA 1981-11-17 5000 NA 1 5000
3 10 7934 MILLER CLERK 7782 1982-01-23 1300 NA 1 5000
4 20 7369 SMITH CLERK 7902 1980-12-17 800 NA 1 3000
5 20 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 1 3000
6 20 7566 JONES MANAGER 7839 1981-04-02 2975 NA 1 3000
7 20 7902 FORD ANALYST 7566 1981-12-03 3000 NA 1 3000
8 20 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 1 3000
9 30 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 1 2850
10 30 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 1 2850
4. R Dplyr Package
1) 집계 함수
- dplyr::summarize()와 max
R Programming |
%%R
emp %>%
dplyr::group_by(deptno) %>%
dplyr::summarize(sal_max = max(sal))
Results |
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 3 x 2
deptno sal_max
<dbl> <dbl>
1 10 5000
2 20 3000
3 30 2850
- Map()과 max() 함수
R Programming |
%%R
emp %>%
tbl_df() %>%
nest(-deptno) %>%
mutate(sal_max = map(data, ~ max(.$sal))) %>%
unnest(sal_max)
Results |
# A tibble: 3 x 3
deptno data sal_max
<dbl> <list> <dbl>
1 20 <tibble [5 x 7]> 3000
2 30 <tibble [6 x 7]> 2850
3 10 <tibble [3 x 7]> 5000
2) 분석함수
- Max() 함수
R Programming |
%%R
emp %>%
dplyr::group_by(deptno) %>%
dplyr::mutate(sal_max = max(sal)) %>%
dplyr::arrange(deptno) %>%
head()
Results |
# A tibble: 6 x 9
# Groups: deptno [2]
empno ename job mgr hiredate sal comm deptno sal_max
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl> <dbl>
1 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 5000
2 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 5000
3 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10 5000
4 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 3000
5 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 3000
6 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 3000
- purrr::map_dfr()와 Max 함수
R Programming |
%%R
withmooc <- emp %>% mutate(deptno = as.character(deptno))
withmooc %>%
dplyr::left_join( ( purrr::map_dfr(split(withmooc,withmooc$deptno), ~max(.x$sal)) %>%
gather(key = deptno, value = sal_max )
),
id = "deptno"
) %>%
dplyr::arrange(deptno) %>%
head()
Results |
Joining, by = "deptno"
# A tibble: 6 x 9
empno ename job mgr hiredate sal comm deptno sal_max
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <chr> <dbl>
1 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 5000
2 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 5000
3 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10 5000
4 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 3000
5 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 3000
6 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 3000
5. R sqldf Package
1) 집계 함수
- Max 함수
R Programming |
%%R
sqldf(" SELECT deptno,
MAX(sal) sal_max
FROM emp
group by deptno; ")
Results |
deptno sal_max
1 10 5000
2 20 3000
3 30 2850
2) 분석함수
- Max 함수
R Programming |
%%R
sqldf(" SELECT deptno, empno, sal,
MAX(sal) OVER (PARTITION BY deptno) AS sal_max
FROM emp
ORDER BY deptno, sal; ")
Results |
deptno empno sal sal_max
1 10 7934 1300 5000
2 10 7782 2450 5000
3 10 7839 5000 5000
4 20 7369 800 3000
5 20 7876 1100 3000
6 20 7566 2975 3000
7 20 7788 3000 3000
8 20 7902 3000 3000
9 30 7900 950 2850
10 30 7521 1250 2850
11 30 7654 1250 2850
12 30 7844 1500 2850
13 30 7499 1600 2850
14 30 7698 2850 2850
6. Python pandasql Package
1) 집계 함수
- Max 함수
Python Programming |
ps.sqldf(" SELECT deptno,MAX(sal) sal_max FROM emp group by deptno ")
Results |
deptno sal_max
0 10 5000
1 20 3000
2 30 2850
2) 분석함수
- Max 함수
Python Programming |
ps.sqldf(" SELECT deptno, empno, sal, \
MAX(sal) OVER (PARTITION BY deptno) AS sal_max \
FROM emp \
ORDER BY deptno, sal ").head()
Results |
deptno empno sal sal_max
0 10 7934 1300 5000
1 10 7782 2450 5000
2 10 7839 5000 5000
3 20 7369 800 3000
4 20 7876 1100 3000
7. R data.table Package
1) 집계 함수
- Max 함수
R Programming |
%%R
DT <- data.table(emp)
dept_DT <- data.table(dept)
DT[, max(sal), by =deptno]
Results |
deptno V1
1: 20 3000
2: 30 2850
3: 10 5000
2) 분석함수
- by= 와 Max 함수
R Programming |
%%R
DT <- data.table(emp)
dept_DT <- data.table(dept)
DT[, sal_max := max(sal), by = deptno][order(deptno),][, c('job','hireate','comm') := NULL]
Results |
empno ename mgr hiredate sal deptno sal_max
1: 7782 CLARK 7839 1981-01-09 2450 10 5000
2: 7839 KING NA 1981-11-17 5000 10 5000
3: 7934 MILLER 7782 1982-01-23 1300 10 5000
4: 7369 SMITH 7902 1980-12-17 800 20 3000
5: 7566 JONES 7839 1981-04-02 2975 20 3000
6: 7788 SCOTT 7566 1982-12-09 3000 20 3000
7: 7876 ADAMS 7788 1983-01-12 1100 20 3000
8: 7902 FORD 7566 1981-12-03 3000 20 3000
9: 7499 ALLEN 7698 1981-02-20 1600 30 2850
10: 7521 WARD 7698 1981-02-22 1250 30 2850
11: 7654 MARTIN 7698 1981-09-28 1250 30 2850
12: 7698 BLAKE 7839 1981-03-01 2850 30 2850
13: 7844 TURNER 7698 1981-09-08 1500 30 2850
14: 7900 JAMES 7698 1981-12-03 950 30 2850
- . SD와 Max 함수
R Programming |
%%R
DT <- data.table(emp)
dept_DT <- data.table(dept)
DT[, `:=` ( sal_max = max(unlist(.SD)) ),
by = deptno,
.SDcols = c("sal")][order(deptno),][1:10, ][ , c('job','hireate','comm') := NULL]
Results |
empno ename mgr hiredate sal deptno sal_max
1: 7782 CLARK 7839 1981-01-09 2450 10 5000
2: 7839 KING NA 1981-11-17 5000 10 5000
3: 7934 MILLER 7782 1982-01-23 1300 10 5000
4: 7369 SMITH 7902 1980-12-17 800 20 3000
5: 7566 JONES 7839 1981-04-02 2975 20 3000
6: 7788 SCOTT 7566 1982-12-09 3000 20 3000
7: 7876 ADAMS 7788 1983-01-12 1100 20 3000
8: 7902 FORD 7566 1981-12-03 3000 20 3000
9: 7499 ALLEN 7698 1981-02-20 1600 30 2850
10: 7521 WARD 7698 1981-02-22 1250 30 2850
8. Python DuckDB의 SQL
1) 집계 함수
- Max 함수
Python Programming |
%%sql
SELECT MAX(sal) as Maximum
FROM emp
Python Programming |
duckdb.sql(" SELECT MAX(sal) as Maximum \
FROM emp ").df()
Results |
Maximum
0 5000
2) 분석함수
- Max 함수
Python Programming |
%%sql
SELECT deptno, empno, sal,
MAX(sal) OVER (PARTITION BY deptno) AS sal_max
FROM emp
ORDER BY deptno, sal
LIMIT 6
Python Programming |
duckdb.sql(" SELECT deptno, empno, sal, \
MAX(sal) OVER (PARTITION BY deptno) AS sal_max \
FROM emp \
ORDER BY deptno, sal \
LIMIT 6 ").df() )
Results |
deptno empno sal sal_max
0 10 7934 1300 5000
1 10 7782 2450 5000
2 10 7839 5000 5000
3 20 7369 800 3000
4 20 7876 1100 3000
5 20 7566 2975 3000
--------------------------------------------
[Oracle, Pandas, R Prog, Dplyr, Sqldf, Pandasql, Data.Table] 오라클 함수와 R & Python 비교 사전 목록 링크 |
오라클 SQL 함수(Oracle SQL Function) 목록 리스트 링크 |
[SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE] SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트 링크 링크 |
반응형
댓글