본문 바로가기
통계프로그램 비교 시리즈/프로그래밍비교(Oracle,Python,R,SAS)

[데이터 필터링] 비상관 서브쿼리 결과를 만족하는 데이터 추출 - 54 (오라클 SQL, R, Python, SAS)

by 기서무나구물 2021. 8. 19.

포스팅 목차

    54. Display the names of clerks who earn salary more than that of James of that of sal lesser than that of Scott.

     

    * 직원 'JAMES' 의 급여보다 크고, 직원 'SCOTT'의 급여보다 적은 사무직 직원의 정보를 출력하시오.


    • Oracle : 서브쿼리(Subquery), where 구문
    • 파이썬 Pandas : iloc[], 비교구문
    • R 프로그래밍 : 비교구문
    • R Dplyr Package : filter(), pull() : 벡터 추출, unique()
    • R sqldf Package : 서브쿼리(Subquery), where 구문
    • Python pandasql Package : 서브쿼리(Subquery), where 구문
    • R data.table Package : 비교구분
    • SAS Proc SQL : 서브쿼리(Subquery), where 구문
    • SAS Data Step : CROSS JOIN (카테시안 곱) 구현, Macro 변수 할당
    • Python Dfply Package : filter_by(), @dfpipe (사용자 정의 함수), 비교구분
    • 파이썬 Base 프로그래밍 :

     


    1. Oracle(오라클)

    • 비상관 서브쿼리
    Oracle Programming
    select ename 
    from   emp 
    where  job='CLERK' 
      and  sal<(select sal from emp where ename='SCOTT') 
      and  sal>(select sal from emp where ename='JAMES');

     


    2. Python Pandas(파이썬)

    • iloc[0] : pandas.core.series.Series 를 numpy.int64로 변경
    • max(emp[emp['ename']=='SCOTT'].sal) 를 지정하면 바로 해결
    Python Programming
    emp[(emp['job']=='CLERK')     
        & (emp['sal'] < (emp[emp['ename']=='SCOTT'].sal).iloc[0])
        & (emp['sal'] > (emp[emp['ename']=='JAMES'].sal).iloc[0])]

     

    Results
      empno ename job mgr hiredate sal comm deptno
    10 7876 ADAMS CLERK 7788.0 1983/01/12 1100 NaN 20
    13 7934 MILLER CLERK 7782.0 1982/01/23 1300 NaN 10

     


    3. R Programming (R Package)

     

    R Programming
    %%R
    
    emp[   emp$job == 'CLERK' 
         & emp$sal < (emp[emp$ename == 'SCOTT',]$sal)
         & emp$sal > (emp[emp$ename == 'JAMES',]$sal),]

     

    Results
    # A tibble: 2 x 8
      empno ename  job     mgr hiredate     sal  comm deptno
      <dbl> <chr>  <chr> <dbl> <date>     <dbl> <dbl>  <dbl>
    1  7876 ADAMS  CLERK  7788 1983-01-12  1100    NA     20
    2  7934 MILLER CLERK  7782 1982-01-23  1300    NA     10

     


    4. R Dplyr Package

     

    R Programming
    %%R
    
    emp %>%
      filter(job == "CLERK" 
             & sal < ( emp %>% filter(ename == 'SCOTT') 
                           %>% pull(sal) 
                           %>% unique ) 
             & sal > ( emp %>% filter(ename == 'JAMES') 
                           %>% pull(sal) 
                           %>% unique ) 
            )

     

    Results
    # A tibble: 2 x 8
      empno ename  job     mgr hiredate     sal  comm deptno
      <dbl> <chr>  <chr> <dbl> <date>     <dbl> <dbl>  <dbl>
    1  7876 ADAMS  CLERK  7788 1983-01-12  1100    NA     20
    2  7934 MILLER CLERK  7782 1982-01-23  1300    NA     10

     


    5. R sqldf Package

    • 비상관 서브쿼리
    R Programming
    %%R
    sqldf(" select * 
            from   emp 
            where  job='CLERK' 
              and  sal<(select sal from emp where ename='SCOTT') 
              and  sal>(select sal from emp where ename='JAMES');")

     

    Results
      empno  ename   job  mgr   hiredate  sal comm deptno
    1  7876  ADAMS CLERK 7788 1983-01-12 1100   NA     20
    2  7934 MILLER CLERK 7782 1982-01-23 1300   NA     10

     


    6. Python pandasql Package

    • 비상관 서브쿼리
    Python Programming
    ps.sqldf("select * from emp where job='CLERK' and sal<(select sal from emp where ename='SCOTT') and sal>(select sal from emp where ename='JAMES');")

     

    Results
      empno ename job mgr hiredate sal comm deptno
    0 7876 ADAMS CLERK 7788.0 1983/01/12 1100 None 20
    1 7934 MILLER CLERK 7782.0 1982/01/23 1300 None 10

     


    7. R data.table Package

     

    R Programming
    %%R
    
    DT <- data.table(emp)
    
    DT[job == "CLERK" & sal< (DT[ename == "SCOTT", (sal)]) & sal > (DT[ename == "JAMES", (sal)]) ,]

     

    Results
       empno  ename   job  mgr   hiredate  sal comm deptno
    1:  7876  ADAMS CLERK 7788 1983-01-12 1100   NA     20
    2:  7934 MILLER CLERK 7782 1982-01-23 1300   NA     10

     


    8. SAS Proc SQL

    • 비상관 서브쿼리
    SAS Programming
    %%SAS sas
    
    PROC SQL;
      CREATE TABLE STATSAS_1 AS
        select ename 
        from   emp 
        where  job='CLERK' 
          and  sal<(select sal from emp where ename='SCOTT') 
          and  sal>(select sal from emp where ename='JAMES');
    QUIT;
    PROC PRINT;RUN;

     

    Results
    OBS ename
    1 ADAMS
    2 MILLER

     


    9. SAS Data Step

    • DATA STEP(CROSS JOIN-카테시안 곱);
    SAS Programming
    %%SAS sas
    
    DATA EMP_CLERK;
     SET EMP;
         WHERE job='CLERK';
    RUN;
    
    DATA STATSAS_3; 
     SET EMP(WHERE=(ename='SCOTT') RENAME=(SAL=SCOTT_SAL)); 
     SET EMP(WHERE=(ename='JAMES') RENAME=(SAL=JAMES_SAL));
         DO I=1 TO KOBS; 
            SET EMP_CLERK NOBS=KOBS POINT=I;
            IF SAL < SCOTT_SAL AND SAL > JAMES_SAL THEN OUTPUT;
         END;    
    RUN;
    
    PROC PRINT;RUN;

     

    Results
    OBS empno ename job mgr hiredate SCOTT_SAL comm deptno JAMES_SAL sal
    1 7876 ADAMS CLERK 7788 1983-01-12 3000 . 20 950 1100
    2 7934 MILLER CLERK 7782 1982-01-23 3000 . 10 950 1300

     


    • Macro 할당 방식
    SAS Programming
    %%SAS sas
    
    proc sql noprint;
      select SAL into :SCOTT_SAL
      from   EMP
      WHERE  ename='SCOTT';
    quit;
    
    proc sql noprint;
      select SAL into :JAMES_SAL
      from   EMP
      WHERE  ename='JAMES';
    quit;
    
    DATA STATSAS_3;
     SET EMP;
         IF SAL < &SCOTT_SAL AND SAL > &JAMES_SAL THEN OUTPUT;
         WHERE job='CLERK';
    RUN;
    PROC PRINT;RUN;
    

     


    10. Python Dfply Package

    사용자 함수

    Python Programming
    @dfpipe
    def pull_fun(df, column=-1):
        return df.loc[:, column].item()
    
    emp >> filter_by( X.job == 'CLERK',                                                    \
                      X.sal < ( emp >> filter_by(X.ename == 'SCOTT') >> pull_fun('sal')) , \
                      X.sal > ( emp >> filter_by(X.ename == 'JAMES') >> pull_fun('sal')) )

     

    Results
      empno ename job mgr hiredate sal comm deptno
    10 7876 ADAMS CLERK 7788.0 1983/01/12 1100 NaN 20
    13 7934 MILLER CLERK 7782.0 1982/01/23 1300 NaN 10

     


    [SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE] SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트

     

    반응형

    댓글