포스팅 목차
* 파이썬 & R 패키지 호출 및 예제 데이터 생성 링크
[ NTH_VALUE Oracle Function ]
NTH_VALUE 함수는 ‘analytic_clause’로 정의된 윈도우(그룹)에서 n번째 행의 measure_expr 값을 반환한다. 반환된 값은 ‘measure_expr’의 데이터 유형을 가진다.
- 함수 설명 :
1. Oracle(오라클)
NTH_VALUE() 함수
Oracle Programming |
SELECT empno,
ename,
deptno,
sal,
nth_value(sal, 2) over (PARTITION BY deptno ORDER BY sal RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS nth_top,
NTH_VALUE(sal, 2) over (PARTITION BY deptno ORDER BY sal desc RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS nth_bottom
FROM emp
order by deptno, sal;
Results |
EMPNO ENAME DEPTNO SAL NTH_TOP NTH_BOTTOM
------------------------------------------------------
7934 MILLER 10 1300 2450 2450
7782 CLARK 10 2450 2450 2450
7839 KING 10 5000 2450 2450
7369 SMITH 20 800 1100 3000
7876 ADAMS 20 1100 1100 3000
7566 JONES 20 2975 1100 3000
7788 SCOTT 20 3000 1100 3000
7902 FORD 20 3000 1100 3000
7900 JAMES 30 950 1250 1600
7654 MARTIN 30 1250 1250 1600
7521 WARD 30 1250 1250 1600
7844 TURNER 30 1500 1250 1600
7499 ALLEN 30 1600 1250 1600
7698 BLAKE 30 2850 1250 1600
2. Python Pandas(파이썬)
transform() 함수와 'nth' 인수
개별 부서 내에서 최소 급여 값과 최대 급여 값을 반환한다. 급여를 오름차순과 내림차순으로 각각 정렬 후 첫 번째 값을 취한다.
Python Programming |
withmooc = emp.copy()
withmooc['nth_top'] = emp.sort_values('sal', ascending=True).groupby('deptno')['sal'].transform('nth',1)
withmooc['nth_bottom'] = emp.sort_values('sal', ascending=False).groupby('deptno')['sal'].transform('nth',1)
withmooc.sort_values(['deptno','sal']).head(10)
Results |
empno ename job mgr hiredate sal comm deptno nth_top nth_bottom
13 7934 MILLER CLERK 7782.0 1982/01/23 1300 NaN 10 2450 2450
6 7782 CLARK MANAGER 7839.0 1981/01/09 2450 NaN 10 2450 2450
8 7839 KING PRESIDENT NaN 1981/11/17 5000 NaN 10 2450 2450
0 7369 SMITH CLERK 7902.0 1980/12/17 800 NaN 20 1100 3000
10 7876 ADAMS CLERK 7788.0 1983/01/12 1100 NaN 20 1100 3000
3 7566 JONES MANAGER 7839.0 1981/04/02 2975 NaN 20 1100 3000
7 7788 SCOTT ANALYST 7566.0 1982/12/09 3000 NaN 20 1100 3000
12 7902 FORD ANALYST 7566.0 1981/12/03 3000 NaN 20 1100 3000
11 7900 JAMES CLERK 7698.0 1981/12/03 950 NaN 30 1250 1600
2 7521 WARD SALESMAN 7698.0 1981/02/22 1250 500.0 30 1250 1600
transform() 함수와 iloc
Python Programming |
withmooc = emp.copy()
withmooc['nth_top'] = emp.sort_values('sal', ascending=True).groupby('deptno')['sal'].transform(lambda x:x.iloc[1])
withmooc['nth_bottom'] = emp.sort_values('sal', ascending=False).groupby('deptno')['sal'].transform(lambda x:x.iloc[1])
withmooc.sort_values(['deptno','sal']).head()
Results |
empno ename job mgr hiredate sal comm deptno nth_top nth_bottom
13 7934 MILLER CLERK 7782.0 1982/01/23 1300 NaN 10 2450 2450
6 7782 CLARK MANAGER 7839.0 1981/01/09 2450 NaN 10 2450 2450
8 7839 KING PRESIDENT NaN 1981/11/17 5000 NaN 10 2450 2450
0 7369 SMITH CLERK 7902.0 1980/12/17 800 NaN 20 1100 3000
10 7876 ADAMS CLERK 7788.0 1983/01/12 1100 NaN 20 1100 3000
transform() 함수와 'nth' 인수
개별 부서의 급여 최솟값과 최댓값을 먼저 선택 후 원본 테이블에 해당 그룹의 선택한 값을 mapping 한다.
Python Programming |
withmooc = emp.copy()
group_top = withmooc.sort_values('sal',ascending = True).groupby('deptno')['sal'].nth(1)
withmooc['nth_top'] = withmooc['deptno'].map(group_top)
group_bottom = withmooc.sort_values('sal',ascending = False).groupby('deptno')['sal'].nth(1)
withmooc['nth_bottom'] = withmooc['deptno'].map(group_bottom)
withmooc.sort_values(['deptno','sal']).head()
Results |
empno ename job mgr hiredate sal comm deptno nth_top nth_bottom
13 7934 MILLER CLERK 7782.0 1982/01/23 1300 NaN 10 2450 2450
6 7782 CLARK MANAGER 7839.0 1981/01/09 2450 NaN 10 2450 2450
8 7839 KING PRESIDENT NaN 1981/11/17 5000 NaN 10 2450 2450
0 7369 SMITH CLERK 7902.0 1980/12/17 800 NaN 20 1100 3000
10 7876 ADAMS CLERK 7788.0 1983/01/12 1100 NaN 20 1100 3000
3. R Programming (R Package)
dplyr::nth() 함수
개별 부서 내에서 상위 2번째와 하위 2번째 급여 값을 반환한다. 급여를 오름차순과 내림차순으로 각각 정렬 후 두 번째 값을 취한다.
R Programming |
%%R
library(plyr) # dplyr 보다 먼저 로딩
library(dplyr)
withmooc <- emp
withmooc <- plyr::ddply(withmooc,.(deptno),transform,nth_top = dplyr::nth(sal,2, order_by = sal ))
withmooc <- plyr::ddply(withmooc,.(deptno),transform,nth_bottom = dplyr::nth(sal,2, order_by = desc(sal) ))
withmooc[order(withmooc$deptno,withmooc$sal), ]
Results |
empno ename job mgr hiredate sal comm deptno nth_top nth_bottom
3 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10 2450 2450
1 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 2450 2450
2 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 2450 2450
4 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 1100 3000
7 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20 1100 3000
5 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 1100 3000
6 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 1100 3000
8 7902 FORD ANALYST 7566 1981-12-03 3000 NA 20 1100 3000
14 7900 JAMES CLERK 7698 1981-12-03 950 NA 30 1250 1600
10 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 1250 1600
dplyr::nth() 함수
R Programming |
%%R
withmooc <- emp[order(emp$sal),]
withmooc['nth_top'] = with(withmooc, ave(x = sal, deptno, FUN =function(x) { dplyr::nth(x,2) }))
#withmooc['nth_top'] = with(withmooc, ave(x = sal, deptno, FUN =function(x) { x[2] }))
withmooc <- withmooc[order(-withmooc$sal),]
withmooc['nth_bottom'] = with(withmooc, ave(x = sal, deptno, FUN =function(x) { dplyr::nth(x,2) }))
#withmooc['nth_bottom'] = with(withmooc, ave(x = sal, deptno, FUN =function(x) { x[2] }))
withmooc[order(withmooc$deptno,withmooc$sal),][1:10, ]
Results |
# A tibble: 10 x 10
empno ename job mgr hiredate sal comm deptno nth_top nth_bottom
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl> <dbl> <dbl>
1 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10 2450 2450
2 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 2450 2450
3 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 2450 2450
4 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 1100 3000
5 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20 1100 3000
6 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 1100 3000
7 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 1100 3000
8 7902 FORD ANALYST 7566 1981-12-03 3000 NA 20 1100 3000
9 7900 JAMES CLERK 7698 1981-12-03 950 NA 30 1250 1600
10 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 1250 1600
dplyr::nth() 함수
R Programming |
%%R
withmooc <- emp[order(emp$sal),]
withmooc['nth_top'] = ave(x = withmooc$sal, withmooc$deptno, FUN = function (x) { dplyr::nth(x,2) } )
# withmooc['nth_top'] = ave(x = withmooc$sal, withmooc$deptno, FUN = function (x) { x[2] } )
withmooc <- withmooc[order(-withmooc$sal),]
withmooc['nth_bottom'] = ave(x = withmooc$sal, withmooc$deptno, FUN = function (x) { dplyr::nth(x,2) } )
# withmooc['nth_bottom'] = ave(x = withmooc$sal, withmooc$deptno, FUN = function (x) { x[2] } )
withmooc[order(withmooc$deptno,withmooc$sal),][1:10, ]
Results |
# A tibble: 10 x 10
empno ename job mgr hiredate sal comm deptno nth_top nth_bottom
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl> <dbl> <dbl>
1 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10 2450 2450
2 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 2450 2450
3 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 2450 2450
4 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 1100 3000
5 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20 1100 3000
6 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 1100 3000
7 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 1100 3000
8 7902 FORD ANALYST 7566 1981-12-03 3000 NA 20 1100 3000
9 7900 JAMES CLERK 7698 1981-12-03 950 NA 30 1250 1600
10 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 1250 1600
nth() 함수
개별 부서 내에서 하위 2번째 급여 값을 반환한다.
R Programming |
%%R
# split 되는 그룹내의 관측치 순서 보존을 위하여 데이터 사전 정렬 필요
withmooc <- emp[order(emp$deptno, emp$sal),]
Nth_var <- with(withmooc, purrr::map_chr(base::split(withmooc,withmooc$deptno), ~nth(.x$sal,2)) )
withmooc1 <- data.frame(withmooc, nth_top =rep(Nth_var, table(withmooc$deptno)))
withmooc1[order(withmooc1$deptno,withmooc1$sal),][1:10, ]
Results |
empno ename job mgr hiredate sal comm deptno nth_top
1 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10 2450.000000
2 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 2450.000000
3 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 2450.000000
4 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 1100.000000
5 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20 1100.000000
6 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 1100.000000
7 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 1100.000000
8 7902 FORD ANALYST 7566 1981-12-03 3000 NA 20 1100.000000
9 7900 JAMES CLERK 7698 1981-12-03 950 NA 30 1250.000000
10 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 1250.000000
tapply() 함수
R Programming |
%%R
withmooc <- emp[order(emp$deptno, emp$sal),]
First_var <- with(withmooc, tapply(sal, deptno, nth, 2)) # head, 1
data.frame(withmooc, nth_top =rep(Nth_var, table(withmooc$deptno)))[1:10, ]
# with(withmooc[order(withmooc$sal),], data.frame(emp, nth_top=rep(tapply(sal, denptno, nth, 2), table(job))) )
Results |
empno ename job mgr hiredate sal comm deptno nth_top
1 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10 2450.000000
2 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 2450.000000
3 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 2450.000000
4 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 1100.000000
5 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20 1100.000000
6 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 1100.000000
7 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 1100.000000
8 7902 FORD ANALYST 7566 1981-12-03 3000 NA 20 1100.000000
9 7900 JAMES CLERK 7698 1981-12-03 950 NA 30 1250.000000
10 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 1250.000000
nth() 함수
R Programming |
%%R
withmooc <- emp[order(emp$sal),]
mapping <- t( purrr::map_dfc( split(withmooc,withmooc$deptno), ~nth(.x$sal,2) ) )
trans_data <- melt(mapping, value.name="nth_top")
merge(emp,trans_data,by.x='deptno',by.y='Var1')[1:10, ]
Results |
deptno empno ename job mgr hiredate sal comm Var2 nth_top
1 10 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 1 2450
2 10 7839 KING PRESIDENT NA 1981-11-17 5000 NA 1 2450
3 10 7934 MILLER CLERK 7782 1982-01-23 1300 NA 1 2450
4 20 7369 SMITH CLERK 7902 1980-12-17 800 NA 1 1100
5 20 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 1 1100
6 20 7566 JONES MANAGER 7839 1981-04-02 2975 NA 1 1100
7 20 7902 FORD ANALYST 7566 1981-12-03 3000 NA 1 1100
8 20 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 1 1100
9 30 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 1 1250
10 30 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 1 1250
4. R Dplyr Package
dplyr::nth() 함수 구조 확인
R Programming |
%%R
dplyr::nth
Results - dplyr::nth 함수 구조 |
function (x, n, order_by = NULL, default = default_missing(x))
{
if (length(n) != 1 || !is.numeric(n)) {
abort("`n` must be a single integer.")
}
n <- trunc(n)
if (n == 0 || n > length(x) || n < -length(x)) {
return(default)
}
if (n < 0) {
n <- length(x) + n + 1
}
if (is.null(order_by)) {
x[[n]]
}
else {
x[[order(order_by)[[n]]]]
}
}
<bytecode: 0x00000218d09751f0>
<environment: namespace:dplyr>
dplyr::nth() 함수
R Programming |
%%R
emp %>%
dplyr::group_by(deptno) %>%
dplyr::mutate( nth_top = nth(sal, 2, order_by = sal),
nth_bottom = nth(sal, 2, order_by = desc(sal))) %>%
dplyr::arrange(deptno,sal) %>%
head(10)
Results |
# A tibble: 10 x 10
# Groups: deptno [3]
empno ename job mgr hiredate sal comm deptno nth_top nth_bottom
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl> <dbl> <dbl>
1 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10 2450 2450
2 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 2450 2450
3 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 2450 2450
4 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 1100 3000
5 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20 1100 3000
6 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 1100 3000
7 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 1100 3000
8 7902 FORD ANALYST 7566 1981-12-03 3000 NA 20 1100 3000
9 7900 JAMES CLERK 7698 1981-12-03 950 NA 30 1250 1600
10 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 1250 1600
[참고]
- row_number()
R Programming |
%%R
emp %>%
dplyr::group_by(deptno) %>%
dplyr::arrange(deptno,sal) %>%
filter(row_number() == 2)
Results |
# A tibble: 3 x 8
# Groups: deptno [3]
empno ename job mgr hiredate sal comm deptno
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl>
1 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10
2 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20
3 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30
5. R sqldf Package
nth_value() 함수
R Programming |
%%R
sqldf(" SELECT empno,
ename,
deptno,
sal,
nth_value(sal, 2) over (PARTITION BY deptno ORDER BY sal RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS nth_top,
NTH_VALUE(sal, 2) over (PARTITION BY deptno ORDER BY sal desc RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS nth_bottom
FROM emp
order by deptno, sal")
Results |
empno ename deptno sal nth_top nth_bottom
1 7934 MILLER 10 1300 2450 2450
2 7782 CLARK 10 2450 2450 2450
3 7839 KING 10 5000 2450 2450
4 7369 SMITH 20 800 1100 3000
5 7876 ADAMS 20 1100 1100 3000
6 7566 JONES 20 2975 1100 3000
7 7788 SCOTT 20 3000 1100 3000
8 7902 FORD 20 3000 1100 3000
9 7900 JAMES 30 950 1250 1600
10 7521 WARD 30 1250 1250 1600
11 7654 MARTIN 30 1250 1250 1600
12 7844 TURNER 30 1500 1250 1600
13 7499 ALLEN 30 1600 1250 1600
14 7698 BLAKE 30 2850 1250 1600
6. Python pandasql Package
nth_value() 함수
Python Programming |
ps.sqldf(" SELECT empno, \
ename, \
deptno, \
sal, \
nth_value(sal, 2) over (PARTITION BY deptno ORDER BY sal RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS nth_top, \
NTH_VALUE(sal, 2) over (PARTITION BY deptno ORDER BY sal desc RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS nth_bottom \
FROM emp \
order by deptno, sal ").head(10)
Results |
empno ename deptno sal nth_top nth_bottom
0 7934 MILLER 10 1300 2450 2450
1 7782 CLARK 10 2450 2450 2450
2 7839 KING 10 5000 2450 2450
3 7369 SMITH 20 800 1100 3000
4 7876 ADAMS 20 1100 1100 3000
5 7566 JONES 20 2975 1100 3000
6 7788 SCOTT 20 3000 1100 3000
7 7902 FORD 20 3000 1100 3000
8 7900 JAMES 30 950 1250 1600
9 7521 WARD 30 1250 1250 1600
7. R data.table Package
nth() 함수
R Programming |
%%R
DT <- data.table(emp)
DT[order(-sal) , `:=`( nth_top= nth(sal,2,order_by=sal), nth_bottom= nth(sal,2,order_by=desc(sal)) ), by = deptno][order(deptno,sal),]
Results |
empno ename job mgr hiredate sal comm deptno nth_top nth_bottom
1: 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10 2450 2450
2: 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 2450 2450
3: 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 2450 2450
4: 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 1100 3000
5: 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20 1100 3000
6: 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 1100 3000
7: 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 1100 3000
8: 7902 FORD ANALYST 7566 1981-12-03 3000 NA 20 1100 3000
9: 7900 JAMES CLERK 7698 1981-12-03 950 NA 30 1250 1600
10: 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 1250 1600
11: 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30 1250 1600
12: 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30 1250 1600
13: 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30 1250 1600
14: 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30 1250 1600
dplyr::nth() 함수
R Programming |
%%R
DT <- data.table(emp)
dept_DT <- data.table(dept)
DT[order(sal), `:=` ( nth_top_0 = dplyr::nth(sal,2, order_by = sal ), # 첫 번째 정렬 옵션(order(sal)) 필요 없음
nth_top_1 = na.omit(unlist(.SD))[2]), # 첫 번째 정렬 옵션(order(sal)) 필요
by = deptno,
.SDcols = c("sal")][order(deptno,sal),]
DT[order(-sal), `:=` ( nth_bottom_0 = dplyr::nth(sal,2, order_by = desc(sal) ), # 첫 번째 정렬 옵션(order(-sal)) 필요 없음
nth_bottom_1 = na.omit(unlist(.SD))[2]), # 첫 번째 정렬 옵션(order(-sal)) 필요
by = deptno,
.SDcols = c("sal")][order(deptno,sal),]
DT[,c('hiredate','job','mgr','ename') := NULL][1:10, ]
Results |
empno sal comm deptno nth_top_0 nth_top_1 nth_bottom_0 nth_bottom_1
1: 7369 800 NA 20 1100 1100 3000 3000
2: 7499 1600 300 30 1250 1250 1600 1600
3: 7521 1250 500 30 1250 1250 1600 1600
4: 7566 2975 NA 20 1100 1100 3000 3000
5: 7654 1250 1400 30 1250 1250 1600 1600
6: 7698 2850 NA 30 1250 1250 1600 1600
7: 7782 2450 NA 10 2450 2450 2450 2450
8: 7788 3000 NA 20 1100 1100 3000 3000
9: 7839 5000 NA 10 2450 2450 2450 2450
10: 7844 1500 0 30 1250 1250 1600 1600
dplyr::nth() 함수
R Programming |
%%R
DT <- data.table(emp)
DT[ order(sal), `:=` (nth_top_0 = dplyr::nth(sal,2, order_by = sal ),
nth_top_1 = sal[2],
nth_top_2 = na.omit(unlist(.SD))[2],
fir_sal = first(sal),
max_sal = max(sal),
min_sal = min(sal),
last_sal = last(sal)),by=deptno,.SDcols = c("sal")][order(deptno,sal),][,c('hiredate','job','mgr','ename','comm','empno') := NULL][1:10, ]
Results |
sal deptno nth_top_0 nth_top_1 nth_top_2 fir_sal max_sal min_sal last_sal
1: 1300 10 2450 2450 2450 1300 5000 1300 5000
2: 2450 10 2450 2450 2450 1300 5000 1300 5000
3: 5000 10 2450 2450 2450 1300 5000 1300 5000
4: 800 20 1100 1100 1100 800 3000 800 3000
5: 1100 20 1100 1100 1100 800 3000 800 3000
6: 2975 20 1100 1100 1100 800 3000 800 3000
7: 3000 20 1100 1100 1100 800 3000 800 3000
8: 3000 20 1100 1100 1100 800 3000 800 3000
9: 950 30 1250 1250 1250 950 2850 950 2850
10: 1250 30 1250 1250 1250 950 2850 950 2850
8. Python Duckdb의 SQL
Python Programming |
%%sql
SELECT empno,
ename,
deptno,
sal,
nth_value(sal, 2) over (PARTITION BY deptno ORDER BY sal RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS nth_top,
NTH_VALUE(sal, 2) over (PARTITION BY deptno ORDER BY sal desc RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS nth_bottom
FROM emp
order by deptno, sal
LIMIT 6
Python Programming |
duckdb.sql(" SELECT empno, \
ename, \
deptno, \
sal, \
nth_value(sal, 2) over (PARTITION BY deptno ORDER BY sal RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS nth_top, \
NTH_VALUE(sal, 2) over (PARTITION BY deptno ORDER BY sal desc RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS nth_bottom \
FROM emp \
order by deptno, sal \
LIMIT 6 ").df()
Results |
empno ename deptno sal nth_top nth_bottom
0 7934 MILLER 10 1300 2450 2450
1 7782 CLARK 10 2450 2450 2450
2 7839 KING 10 5000 2450 2450
3 7369 SMITH 20 800 1100 3000
4 7876 ADAMS 20 1100 1100 3000
5 7566 JONES 20 2975 1100 3000
--------------------------------------------
[Oracle, Pandas, R Prog, Dplyr, Sqldf, Pandasql, Data.Table] 오라클 함수와 R & Python 비교 사전 목록 링크 |
오라클 SQL 함수(Oracle SQL Function) 목록 리스트 링크 |
[SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE] SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트 링크 링크 |
반응형
댓글