본문 바로가기
통계프로그램 비교 시리즈/프로그래밍비교(Oracle,Python,R,SAS)

[기초 통계량 - MIN 함수] 그룹별 최소값 집계 - 47 (오라클 SQL, R, Python, SAS)

by 기서무나구물 2021. 8. 15.

포스팅 목차

     

    47. Display each job along with minimum sal being paid in each job group.

     

    * 개별 직무에 속한 직원들에게 지급되는 급여 중에서 최소 급여를 출력하시오.


    • Oracle : group by, min()
    • 파이썬 Pandas : groupby(), agg(), reset_index()
    • R 프로그래밍 : aggregate(), min, 사용자 정의 함수
    • R Dplyr Package : dplyr::group_by(), dplyr::summarise, min()
    • R sqldf Package : group by, min()
    • Python pandasql Package : group by, min()
    • R data.table Package : min(), by=
    • SAS Proc SQL : group by, min()
    • SAS Data Step : proc summary, min=, first., last.
    • Python Dfply Package : group_by(), summarize(), .min(), .max()
    • 파이썬 Base 프로그래밍 :

     


    1. Oracle(오라클)

    • Group by 구문와 min() 함수
    Oracle Programming
    select job, 
           min(sal) sal_min
    from   emp 
    group 
       by  job

     


    2. Python Pandas(파이썬)

    • Groupby 함수와 agg() 집계함수의 min 인수
    Python Programming
    emp.groupby(['job'])['sal'].agg(['min']).reset_index()

     

    Results
      job min
    0 ANALYST 3000
    1 CLERK 800
    2 MANAGER 2450
    3 PRESIDENT 5000
    4 SALESMAN 1250

     


    3. R Programming (R Package)

    • aggregate() 집계함수와 min 함수
    R Programming
    %%R
    
    aggregate(sal ~ job, data = emp, FUN = function(x) c(min_sal = min(x) ) )

     

    Results
            job  sal
    1   ANALYST 3000
    2     CLERK  800
    3   MANAGER 2450
    4 PRESIDENT 5000
    5  SALESMAN 1250

     


    4. R Dplyr Package

    • Group_by 함수와 Summarise() 집계함수의 min() 함수
    R Programming
    %%R
    
    emp %>%
      group_by(job) %>% summarise( min_sal=min(sal) )

     

    Results
    `summarise()` ungrouping output (override with `.groups` argument)
    # A tibble: 5 x 2
      job       min_sal
      <chr>       <dbl>
    1 ANALYST      3000
    2 CLERK         800
    3 MANAGER      2450
    4 PRESIDENT    5000
    5 SALESMAN     1250

     


    5. R sqldf Package

    • Group by 구문와 min() 함수
    R Programming
    %%R
    sqldf("select job, min(sal) sal_min from emp group by job")

     

    Results
            job sal_min
    1   ANALYST    3000
    2     CLERK     800
    3   MANAGER    2450
    4 PRESIDENT    5000
    5  SALESMAN    1250

     


    6. Python pandasql Package

    • Group by 구문와 min() 함수
    Python Programming
    ps.sqldf("select job, min(sal) sal_min from emp group by job")

     

    Results
      job sal_min
    0 ANALYST 3000
    1 CLERK 800
    2 MANAGER 2450
    3 PRESIDENT 5000
    4 SALESMAN 1250

     


    7. R data.table Package

    • by = 구문과 min() 함수
    R Programming
    %%R
    
    DT <- data.table(emp)
    
    DT[ , .('sal_min' = min(sal)), by = .(job)] 

     

    Results
             job sal_min
    1:     CLERK     800
    2:  SALESMAN    1250
    3:   MANAGER    2450
    4:   ANALYST    3000
    5: PRESIDENT    5000

     


    8. SAS Proc SQL

    • Group by 구문와 min() 함수
    SAS Programming
    %%SAS sas
    
    PROC SQL;
      CREATE TABLE STATSAS_1 AS
        select job, min(sal) AS SAL_MIN
        from   emp 
        group 
           by  job;
    QUIT;
    PROC PRINT;RUN;

     

    Results
    OBS job SAL_MIN
    1 ANALYST 3000
    2 CLERK 800
    3 MANAGER 2450
    4 PRESIDEN 5000
    5 SALESMAN 1250

     


    9. SAS Data Step

    • Proc summary 프로시져와 min= 인수
    SAS Programming
    %%SAS sas
    
    PROC SUMMARY DATA=EMP NWAY;
         CLASS job;
         VAR   SAL;
         OUTPUT OUT=STATSAS_2(DROP=_:) MIN=SAL_MIN;
    RUN;
    PROC PRINT;RUN;

     

    Results
    OBS job SAL_MIN
    1 ANALYST 3000
    2 CLERK 800
    3 MANAGER 2450
    4 PRESIDEN 5000
    5 SALESMAN 1250

     


    • By 구문과 min() 함수
    SAS Programming
    %%SAS sas
    
    PROC SORT DATA=EMP OUT=EMP_1;
         BY job;
    RUN;
    
    DATA STATSAS_3;
     SET EMP_1;
         BY job;
    
         RETAIN SAL_MIN 0;
    
         IF FIRST.job THEN SAL_MIN = SAL;
         ELSE DO;
            SAL_MIN=MIN(SAL_MIN,SAL);
         END;
    
         IF LAST.job THEN OUTPUT STATSAS_3;  * IF LAST.job;
         KEEP DEPTNO SAL_MIN;
    RUN;
    PROC PRINT;RUN;

     

    Results
    OBS deptno SAL_MIN
    1 20 3000
    2 10 800
    3 10 2450
    4 10 5000
    5 30 1250

     


    10. Python Dfply Package

    • Group_by 함수와 Summarize() 집계함수의 min() 함수
    Python Programming
    emp >> group_by('job') >> summarize( sal_min = X.sal.min() )

     

    Results
      job sal_min
    0 ANALYST 3000
    1 CLERK 800
    2 MANAGER 2450
    3 PRESIDENT 5000
    4 SALESMAN 1250

     


     

    [SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE]   SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트

     

     

    반응형

    댓글