본문 바로가기
통계프로그램 비교 시리즈/오라클함수 비교(R & Python)

MIN 오라클 함수 [Oracle, Pandas, R Prog, Dplyr, Sqldf, Pandasql, Data.Table, DuckDB]

by 기서무나구물 2021. 12. 4.

포스팅 목차

    * 파이썬 & R 패키지 호출 및 예제 데이터 생성 링크

    MIN Oracle Function

     


    MIN함수는 (개별 그룹내의) 사용자가 지정한 인수의 최소값을 반환한다. 집계 함수, 분석 함수 모두 이용 가능하다.

     

     


    1. Oracle(오라클)

    • Min 함수

    다음 예제는 employees 테이블의 개별 부서내에서 최소 급여를 계산한다.

     

    Oracle Programming
    SELECT deptno,MIN(sal) sal_min 
    FROM   emp 
    group  by deptno

     

    Results
    DEPTNO	SAL_MIN
    ------------------
    30	950
    10	1300
    20	800

     


    2. Python Pandas(파이썬)

     

    1) 집계함수(Aggregate function)

    • Min 함수
    Python Programming
    emp.groupby('deptno')['sal'].min()

     

    Results
    deptno
    10    1300
    20     800
    30     950
    Name: sal, dtype: int64

     


     

    • Apply()와 Min 함수
    Python Programming
    emp.groupby('deptno')['sal'].apply(min)

     

    Results
    deptno
    10    1300
    20     800
    30     950
    Name: sal, dtype: int64

     


    • Rank() 함수
    Python Programming
    emp.groupby('deptno').apply(lambda x : x[ x['sal'].rank() ==1 ])

     

    Results
    		empno	ename	job	mgr	hiredate	sal	comm	deptno
    deptno									
    10	13	7934	MILLER	CLERK	7782.0	1982/01/23	1300	NaN	10
    20	0	7369	SMITH	CLERK	7902.0	1980/12/17	800	NaN	20
    30	11	7900	JAMES	CLERK	7698.0	1981/12/03	950	NaN	30
     

    2) 분석함수(Analytic Function)

     

    • groupby()와 Min 함수
    Python Programming
    withmooc = emp.copy()
    
    group_min = withmooc.groupby('deptno')['sal'].min()
    withmooc['sal_min'] = withmooc['deptno'].map(group_min)
    
    withmooc.sort_values('deptno').head()

     

    Results
    	empno	ename	job		mgr	hiredate	sal	comm	deptno	sal_min
    6	7782	CLARK	MANAGER		7839.0	1981/01/09	2450	NaN	10	1300
    8	7839	KING	PRESIDENT	NaN	1981/11/17	5000	NaN	10	1300
    13	7934	MILLER	CLERK		7782.0	1982/01/23	1300	NaN	10	1300
    0	7369	SMITH	CLERK		7902.0	1980/12/17	800	NaN	20	800
    3	7566	JONES	MANAGER		7839.0	1981/04/02	2975	NaN	20	800

     


    • Transform()과 Min 함수
    Python Programming
    withmooc = emp.copy()
    withmooc['min_value'] = emp.groupby('deptno')['sal'].transform(lambda x:x.min())
    withmooc.sort_values('deptno').head()

     

    Results
    	empno	ename	job		mgr	hiredate	sal	comm	deptno	min_value
    6	7782	CLARK	MANAGER		7839.0	1981/01/09	2450	NaN	10	1300
    8	7839	KING	PRESIDENT	NaN	1981/11/17	5000	NaN	10	1300
    13	7934	MILLER	CLERK		7782.0	1982/01/23	1300	NaN	10	1300
    0	7369	SMITH	CLERK		7902.0	1980/12/17	800	NaN	20	800
    3	7566	JONES	MANAGER		7839.0	1981/04/02	2975	NaN	20	800

     


    • Transform()과 Min 함수
    Python Programming
    withmooc = emp.copy()
    withmooc['min_value'] = emp.groupby('deptno')['sal'].transform('min')
    withmooc.sort_values('deptno').head()

     

    Results
    	empno	ename	job		mgr	hiredate	sal	comm	deptno	min_value
    6	7782	CLARK	MANAGER		7839.0	1981/01/09	2450	NaN	10	1300
    8	7839	KING	PRESIDENT	NaN	1981/11/17	5000	NaN	10	1300
    13	7934	MILLER	CLERK		7782.0	1982/01/23	1300	NaN	10	1300
    0	7369	SMITH	CLERK		7902.0	1980/12/17	800	NaN	20	800
    3	7566	JONES	MANAGER		7839.0	1981/04/02	2975	NaN	20	800

     

     


    3. R Programming (R Package)

     

    1) 집계함수 (Aggregate function)

     

    • Aggregate()와 Min
    R Programming
    %%R
    
    aggregate(sal ~ deptno, data=emp , FUN=min)

     

    Results
      deptno  sal
    1     10 1300
    2     20  800
    3     30  950

     


    • by()와 Min
    R Programming
    %%R
    
    unlist(with(emp, by(sal, deptno, min)))

     

    Results
    deptno: 10
    [1] 1300
    ------------------------------------------------------------ 
    deptno: 20
    [1] 800
    ------------------------------------------------------------ 
    deptno: 30
    [1] 950

     


    • Map과 Min
    R Programming
    %%R
    
    do.call(rbind, Map(min, split(emp$sal, emp$deptno)))

     

    Results
       [,1]
    10 1300
    20  800
    30  950

     


    • purrr::map_dfr()과 min
    R Programming
    %%R
    # purrr::map_chr(split(emp,emp$deptno), ~min(.x$sal))
    
    purrr::map_dfr(split(emp,emp$deptno), ~min(.x$sal))

     

    Results
    # A tibble: 1 x 3
       `10`  `20`  `30`
      <dbl> <dbl> <dbl>
    1  1300   800   950

     


    • purrr::map_dfr()과 min
    R Programming
    %%R
    
    do.call(rbind, purrr::map_dfr(split(emp,emp$deptno), ~min(.x$sal)) )

     

    Results
       [,1]
    10 1300
    20  800
    30  950

     


    • lapply()와 Min
    R Programming
    %%R
    
    do.call(rbind, lapply(split(emp$sal, emp$deptno), min))

     

    Results
       [,1]
    10 1300
    20  800
    30  950

     

     


    2) 분석함수(Analytic Function)

     

    • ave()와 min
    R Programming
    %%R
    withmooc <- emp
    withmooc['sal_min'] = with(withmooc, ave(x = sal, deptno, FUN=min))   # FUN =function(x) { median(x) }
    
    withmooc[order(withmooc$deptno),][1:10,]

     

    Results
    # A tibble: 10 x 9
       empno ename  job         mgr hiredate     sal  comm deptno sal_min
       <dbl> <chr>  <chr>     <dbl> <date>     <dbl> <dbl>  <dbl>   <dbl>
     1  7782 CLARK  MANAGER    7839 1981-01-09  2450    NA     10    1300
     2  7839 KING   PRESIDENT    NA 1981-11-17  5000    NA     10    1300
     3  7934 MILLER CLERK      7782 1982-01-23  1300    NA     10    1300
     4  7369 SMITH  CLERK      7902 1980-12-17   800    NA     20     800
     5  7566 JONES  MANAGER    7839 1981-04-02  2975    NA     20     800
     6  7788 SCOTT  ANALYST    7566 1982-12-09  3000    NA     20     800
     7  7876 ADAMS  CLERK      7788 1983-01-12  1100    NA     20     800
     8  7902 FORD   ANALYST    7566 1981-12-03  3000    NA     20     800
     9  7499 ALLEN  SALESMAN   7698 1981-02-20  1600   300     30     950
    10  7521 WARD   SALESMAN   7698 1981-02-22  1250   500     30     950

     


    • ave()와  Min
    R Programming
    %%R
    
    withmooc <- emp
    
    withmooc['sal_median'] = ave(x = withmooc$sal, withmooc$deptno, FUN = min)    # FUN = function (x) {  min(x) }
    
    withmooc[order(withmooc$deptno),][1:10,]

     

    Results
    # A tibble: 10 x 9
       empno ename  job         mgr hiredate     sal  comm deptno sal_median
       <dbl> <chr>  <chr>     <dbl> <date>     <dbl> <dbl>  <dbl>      <dbl>
     1  7782 CLARK  MANAGER    7839 1981-01-09  2450    NA     10       1300
     2  7839 KING   PRESIDENT    NA 1981-11-17  5000    NA     10       1300
     3  7934 MILLER CLERK      7782 1982-01-23  1300    NA     10       1300
     4  7369 SMITH  CLERK      7902 1980-12-17   800    NA     20        800
     5  7566 JONES  MANAGER    7839 1981-04-02  2975    NA     20        800
     6  7788 SCOTT  ANALYST    7566 1982-12-09  3000    NA     20        800
     7  7876 ADAMS  CLERK      7788 1983-01-12  1100    NA     20        800
     8  7902 FORD   ANALYST    7566 1981-12-03  3000    NA     20        800
     9  7499 ALLEN  SALESMAN   7698 1981-02-20  1600   300     30        950
    10  7521 WARD   SALESMAN   7698 1981-02-22  1250   500     30        950

     


    • purrr::map_chr()과 min
    R Programming
    %%R
    
    withmooc <- emp
    
    Min_var <- with(withmooc, purrr::map_chr(base::split(withmooc,withmooc$deptno), ~min(.x$sal)) ) 
    
    data.frame(withmooc, Min_value =rep(Min_var, table(withmooc$deptno)))[1:10,]

     

    Results
       empno  ename       job  mgr   hiredate  sal comm deptno   Min_value
    1   7369  SMITH     CLERK 7902 1980-12-17  800   NA     20 1300.000000
    2   7499  ALLEN  SALESMAN 7698 1981-02-20 1600  300     30 1300.000000
    3   7521   WARD  SALESMAN 7698 1981-02-22 1250  500     30 1300.000000
    4   7566  JONES   MANAGER 7839 1981-04-02 2975   NA     20  800.000000
    5   7654 MARTIN  SALESMAN 7698 1981-09-28 1250 1400     30  800.000000
    6   7698  BLAKE   MANAGER 7839 1981-03-01 2850   NA     30  800.000000
    7   7782  CLARK   MANAGER 7839 1981-01-09 2450   NA     10  800.000000
    8   7788  SCOTT   ANALYST 7566 1982-12-09 3000   NA     20  800.000000
    9   7839   KING PRESIDENT   NA 1981-11-17 5000   NA     10  950.000000
    10  7844 TURNER  SALESMAN 7698 1981-09-08 1500    0     30  950.000000

     


    • tapply와 min
    R Programming
    %%R
    
    withmooc <- emp
    
    Min_var <- with(withmooc, tapply(sal, deptno, min))
    
    data.frame(emp, min_value =rep(Min_var, table(withmooc$deptno)))
    
    # with(withmooc,  data.frame(emp, min_value=rep(tapply(sal, deptno, min), table(deptno))) )[1:10,]

     

    Results
       empno  ename       job  mgr   hiredate  sal comm deptno min_value
    1   7369  SMITH     CLERK 7902 1980-12-17  800   NA     20      1300
    2   7499  ALLEN  SALESMAN 7698 1981-02-20 1600  300     30      1300
    3   7521   WARD  SALESMAN 7698 1981-02-22 1250  500     30      1300
    4   7566  JONES   MANAGER 7839 1981-04-02 2975   NA     20       800
    5   7654 MARTIN  SALESMAN 7698 1981-09-28 1250 1400     30       800
    6   7698  BLAKE   MANAGER 7839 1981-03-01 2850   NA     30       800
    7   7782  CLARK   MANAGER 7839 1981-01-09 2450   NA     10       800
    8   7788  SCOTT   ANALYST 7566 1982-12-09 3000   NA     20       800
    9   7839   KING PRESIDENT   NA 1981-11-17 5000   NA     10       950
    10  7844 TURNER  SALESMAN 7698 1981-09-08 1500    0     30       950
    11  7876  ADAMS     CLERK 7788 1983-01-12 1100   NA     20       950
    12  7900  JAMES     CLERK 7698 1981-12-03  950   NA     30       950
    13  7902   FORD   ANALYST 7566 1981-12-03 3000   NA     20       950
    14  7934 MILLER     CLERK 7782 1982-01-23 1300   NA     10       950

     


    • plyr::ddply()와 Min
    R Programming
    %%R
    
    library(plyr)   # dplyr 보다 먼저 로딩
    library(dplyr)
    
    plyr::ddply(emp,.(deptno),transform,sal_min = min(sal))[1:10,]

     

    Results
       empno  ename       job  mgr   hiredate  sal comm deptno sal_min
    1   7782  CLARK   MANAGER 7839 1981-01-09 2450   NA     10    1300
    2   7839   KING PRESIDENT   NA 1981-11-17 5000   NA     10    1300
    3   7934 MILLER     CLERK 7782 1982-01-23 1300   NA     10    1300
    4   7369  SMITH     CLERK 7902 1980-12-17  800   NA     20     800
    5   7566  JONES   MANAGER 7839 1981-04-02 2975   NA     20     800
    6   7788  SCOTT   ANALYST 7566 1982-12-09 3000   NA     20     800
    7   7876  ADAMS     CLERK 7788 1983-01-12 1100   NA     20     800
    8   7902   FORD   ANALYST 7566 1981-12-03 3000   NA     20     800
    9   7499  ALLEN  SALESMAN 7698 1981-02-20 1600  300     30     950
    10  7521   WARD  SALESMAN 7698 1981-02-22 1250  500     30     950

     


    • purrr::map_dfc()와 min
    R Programming
    %%R
    
    withmooc <- emp
    
    mapping    <- t( purrr::map_dfc( split(withmooc,withmooc$deptno), ~min(.x$sal) ) )
    
    trans_data <- melt(mapping, value.name="sal_min")
    
    merge(emp,trans_data,by.x='deptno',by.y='Var1')[1:10,]

     

    Results
       deptno empno  ename       job  mgr   hiredate  sal comm Var2 sal_min
    1      10  7782  CLARK   MANAGER 7839 1981-01-09 2450   NA    1    1300
    2      10  7839   KING PRESIDENT   NA 1981-11-17 5000   NA    1    1300
    3      10  7934 MILLER     CLERK 7782 1982-01-23 1300   NA    1    1300
    4      20  7369  SMITH     CLERK 7902 1980-12-17  800   NA    1     800
    5      20  7876  ADAMS     CLERK 7788 1983-01-12 1100   NA    1     800
    6      20  7566  JONES   MANAGER 7839 1981-04-02 2975   NA    1     800
    7      20  7902   FORD   ANALYST 7566 1981-12-03 3000   NA    1     800
    8      20  7788  SCOTT   ANALYST 7566 1982-12-09 3000   NA    1     800
    9      30  7499  ALLEN  SALESMAN 7698 1981-02-20 1600  300    1     950
    10     30  7698  BLAKE   MANAGER 7839 1981-03-01 2850   NA    1     950

     

     


    4. R Dplyr Package

     

    1) 집계함수(Aggregate function)

     

    • dplyr::summarize()와 Min
    R Programming
    %%R
    
    emp %>%
      dplyr::group_by(deptno) %>%
      dplyr::summarize(sal_min = min(sal)) %>%
      head()

     

    Results
    `summarise()` ungrouping output (override with `.groups` argument)
    # A tibble: 3 x 2
      deptno sal_min
       <dbl>   <dbl>
    1     10    1300
    2     20     800
    3     30     950

     


    • map()과 min
    R Programming
    %%R
    
    emp %>%
      tbl_df() %>%
      nest(-deptno) %>%
      mutate(sal_min = map(data, ~ min(.$sal))) %>% 
      unnest(sal_min) %>%
      head()

     

    Results
    # A tibble: 3 x 3
      deptno data             sal_min
       <dbl> <list>             <dbl>
    1     20 <tibble [5 x 7]>     800
    2     30 <tibble [6 x 7]>     950
    3     10 <tibble [3 x 7]>    1300

     

     


    2) 분석함수(Analytic Function)

     

    • min() 함수
    R Programming
    %%R
    
    emp %>%
      dplyr::group_by(deptno) %>%
      dplyr::mutate(sal_min = min(sal)) %>%
      dplyr::arrange(deptno) %>%
      head()

     

    Results
    # A tibble: 6 x 9
    # Groups:   deptno [2]
      empno ename  job         mgr hiredate     sal  comm deptno sal_min
      <dbl> <chr>  <chr>     <dbl> <date>     <dbl> <dbl>  <dbl>   <dbl>
    1  7782 CLARK  MANAGER    7839 1981-01-09  2450    NA     10    1300
    2  7839 KING   PRESIDENT    NA 1981-11-17  5000    NA     10    1300
    3  7934 MILLER CLERK      7782 1982-01-23  1300    NA     10    1300
    4  7369 SMITH  CLERK      7902 1980-12-17   800    NA     20     800
    5  7566 JONES  MANAGER    7839 1981-04-02  2975    NA     20     800
    6  7788 SCOTT  ANALYST    7566 1982-12-09  3000    NA     20     800

     


    • purrr:::map_dfr()과 min()
    R Programming
    %%R
    
    withmooc <- emp %>% mutate(deptno = as.character(deptno))
    
    withmooc %>%
      dplyr::left_join( ( purrr::map_dfr(split(withmooc,withmooc$deptno), ~min(.x$sal)) %>%
                          gather(key = deptno, value = sal_min )
                        ),
                        id = "deptno"
                      )              %>%
      dplyr::arrange(deptno)         %>%
      head()

     

    Results
    Joining, by = "deptno"
    # A tibble: 6 x 9
      empno ename  job         mgr hiredate     sal  comm deptno sal_min
      <dbl> <chr>  <chr>     <dbl> <date>     <dbl> <dbl> <chr>    <dbl>
    1  7782 CLARK  MANAGER    7839 1981-01-09  2450    NA 10        1300
    2  7839 KING   PRESIDENT    NA 1981-11-17  5000    NA 10        1300
    3  7934 MILLER CLERK      7782 1982-01-23  1300    NA 10        1300
    4  7369 SMITH  CLERK      7902 1980-12-17   800    NA 20         800
    5  7566 JONES  MANAGER    7839 1981-04-02  2975    NA 20         800
    6  7788 SCOTT  ANALYST    7566 1982-12-09  3000    NA 20         800

     

     


    5. R sqldf Package

    1) 집계함수 (Aggregate function)

     

    • Min 함수
    R Programming
    %%R
    
    sqldf(" SELECT deptno,MIN(sal) sal_min FROM emp group by deptno; ")

     

    Results
      deptno sal_min
    1     10    1300
    2     20     800
    3     30     950

     

     

     


    6. Python pandasql Package

     

    1) 집계함수 (Aggregate function)

     

    • Min 함수
    Python Programming
    ps.sqldf(" SELECT deptno,MIN(sal) sal_min FROM emp group by deptno; ")

     

    Results
    	deptno	sal_min
    0	10	1300
    1	20	800
    2	30	950

     


    7. R data.table Package

     

    1) 집계함수 (Aggregate function)

     

    • Min 함수
    R Programming
    %%R
    
    DT          <- data.table(emp)
    dept_DT     <- data.table(dept)
    
    DT[, min(sal), by =deptno]

     

    Results
       deptno   V1
    1:     20  800
    2:     30  950
    3:     10 1300

     


    2) 분석함수 (Analytic Function)

     

    • Min 함수
    R Programming
    %%R
    
    DT          <- data.table(emp)
    dept_DT     <- data.table(dept)
    
    DT[, sal_min := min(sal), by = deptno][order(deptno),][1:10, ]

     

    Results
        empno  ename       job  mgr   hiredate  sal comm deptno sal_min
     1:  7782  CLARK   MANAGER 7839 1981-01-09 2450   NA     10    1300
     2:  7839   KING PRESIDENT   NA 1981-11-17 5000   NA     10    1300
     3:  7934 MILLER     CLERK 7782 1982-01-23 1300   NA     10    1300
     4:  7369  SMITH     CLERK 7902 1980-12-17  800   NA     20     800
     5:  7566  JONES   MANAGER 7839 1981-04-02 2975   NA     20     800
     6:  7788  SCOTT   ANALYST 7566 1982-12-09 3000   NA     20     800
     7:  7876  ADAMS     CLERK 7788 1983-01-12 1100   NA     20     800
     8:  7902   FORD   ANALYST 7566 1981-12-03 3000   NA     20     800
     9:  7499  ALLEN  SALESMAN 7698 1981-02-20 1600  300     30     950
    10:  7521   WARD  SALESMAN 7698 1981-02-22 1250  500     30     950

     


    • Min 함수
    R Programming
    %%R
    
    DT          <- data.table(emp)
    dept_DT     <- data.table(dept)
    
    DT[, `:=` ( sal_min  = min(unlist(.SD)) ), 
         by = deptno,
         .SDcols = c("sal")][order(deptno),][1:10, ]

     

    Results
        empno  ename       job  mgr   hiredate  sal comm deptno sal_min
     1:  7782  CLARK   MANAGER 7839 1981-01-09 2450   NA     10    1300
     2:  7839   KING PRESIDENT   NA 1981-11-17 5000   NA     10    1300
     3:  7934 MILLER     CLERK 7782 1982-01-23 1300   NA     10    1300
     4:  7369  SMITH     CLERK 7902 1980-12-17  800   NA     20     800
     5:  7566  JONES   MANAGER 7839 1981-04-02 2975   NA     20     800
     6:  7788  SCOTT   ANALYST 7566 1982-12-09 3000   NA     20     800
     7:  7876  ADAMS     CLERK 7788 1983-01-12 1100   NA     20     800
     8:  7902   FORD   ANALYST 7566 1981-12-03 3000   NA     20     800
     9:  7499  ALLEN  SALESMAN 7698 1981-02-20 1600  300     30     950
    10:  7521   WARD  SALESMAN 7698 1981-02-22 1250  500     30     950

     


    8. Python Duckdb의 SQL

     

    1) 집계함수 (Aggregate function)

    Python Programming
    %%sql
      SELECT deptno,
             MIN(sal) sal_min
      FROM   emp
      group  by deptno

     

    Python Programming
    duckdb.sql(" SELECT deptno,                   \
                        MIN(sal) sal_min          \
                 FROM   emp                       \
                 group  by deptno ").df()

     

    Results
       deptno  sal_min
    0      20      800
    1      30      950
    2      10     1300

     


    2) 분석함수 (Analytic Function)

     

    Python Programming
    %%sql
      SELECT MGR, HIREDATE, SAL, EMPNO, ENAME, 
             MIN(SAL) OVER (PARTITION BY MGR ORDER BY HIREDATE) MGR_MIN
      FROM   emp
      ORDER  BY MGR, HIREDATE
      LIMIT  7

     

    Python Programming
    duckdb.sql(" SELECT MGR, HIREDATE, SAL, EMPNO, ENAME,                                \
                        MIN(SAL) OVER (PARTITION BY MGR ORDER BY HIREDATE) MGR_MIN       \
                 FROM   emp                                                              \
                 ORDER  BY MGR, HIREDATE                                                 \
                 LIMIT  7 ").df()

     

    Results
          mgr    hiredate   sal  empno   ename  MGR_MIN
    0  7566.0  1981/12/03  3000   7902    FORD     3000
    1  7566.0  1982/12/09  3000   7788   SCOTT     3000
    2  7698.0  1981/02/20  1600   7499   ALLEN     1600
    3  7698.0  1981/02/22  1250   7521    WARD     1250
    4  7698.0  1981/09/08  1500   7844  TURNER     1250
    5  7698.0  1981/09/28  1250   7654  MARTIN     1250
    6  7698.0  1981/12/03   950   7900   JAMES      950

     


     

    Cheers! A couple of soju shots along with some delicious Korean BBQ. (https://unsplash.com/photos/JWqQFv0n5U4)Min Function

     

      --------------------------------------------  

     

     

    [Oracle, Pandas, R Prog, Dplyr, Sqldf, Pandasql, Data.Table] 오라클 함수와 R & Python 비교 사전 목록 링크

     

    오라클 SQL 함수(Oracle SQL Function) 목록 리스트 링크

     

    [SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE] SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트 링크 링크
    반응형

    댓글