본문 바로가기
통계프로그램 비교 시리즈/프로그래밍비교(Oracle,Python,R,SAS)

[데이터 필터링 - 복수의 OR 연산자] OR 조건을 만족하는 데이터 추출 - 18 (오라클 SQL, R, Python, SAS)

by 기서무나구물 2021. 8. 5.

포스팅 목차

     

    18. Display the names of employees working in department number 10 or 20 or 40 or employees working as clerks, salesman or analyst.

     

    * 부서번호 10,20 또는 40에 근무하는 직원 또는 사무직, 판매직, 분석직에 근무하는 직원의 이름을 출력


    • Oracle : in 연산자, Or 연산자
    • 파이썬 Pandas : isin 연산자, or(|) 연산자, .loc, .query() 구문
    • R 프로그래밍 : %in% 연산자, or(|) 연산자, which() 함수, subset() 함수
    • R Dplyr Package : filter() 구문, %in% 연산자
    • R sqldf Package : in 연산자, Or 연산자
    • Python pandasql Package : in 연산자, Or 연산자
    • R data.table Package : %in% 연산자, or(|) 연산자
    • SAS Proc SQL : in 연산자, Or 연산자
    • SAS Data Step : in 연산자, Or 연산자
    • Python Dfply Package : filter_by(), or(|) 연산자
    • 파이썬 Base 프로그래밍 : mask(), or(|) 연산자

     


    1. 오라클(Oracle)

    • OR 연산자
    Oracle Programming
    select ename from emp 
    where  deptno in (10,20,40) or 
           job    in ('CLERK','SALESMAN','ANALYST');

     


    2. 파이썬(Pandas)

    • OR('|') 연산자
    Python Programming
    emp[ emp['deptno'].isin([10,20,40]) | emp['job'].isin(['CLERK','SALESMAN','ANALYST']) ]

     

    Results
      empno ename job mgr hiredate sal comm deptno hiredate_D
    0 7369 SMITH CLERK 7902.0 1980/12/17 800 NaN 20 1980-12-17
    1 7499 ALLEN SALESMAN 7698.0 1981/02/20 1600 300.0 30 1981-02-20
    2 7521 WARD SALESMAN 7698.0 1981/02/22 1250 500.0 30 1981-02-22
    3 7566 JONES MANAGER 7839.0 1981/04/02 2975 NaN 20 1981-04-02
    4 7654 MARTIN SALESMAN 7698.0 1981/09/28 1250 1400.0 30 1981-09-28
    6 7782 CLARK MANAGER 7839.0 1981/01/09 2450 NaN 10 1981-01-09
    7 7788 SCOTT ANALYST 7566.0 1982/12/09 3000 NaN 20 1982-12-09
    8 7839 KING PRESIDENT NaN 1981/11/17 5000 NaN 10 1981-11-17
    9 7844 TURNER SALESMAN 7698.0 1981/09/08 1500 0.0 30 1981-09-08
    10 7876 ADAMS CLERK 7788.0 1983/01/12 1100 NaN 20 1983-01-12
    11 7900 JAMES CLERK 7698.0 1981/12/03 950 NaN 30 1981-12-03
    12 7902 FORD ANALYST 7566.0 1981/12/03 3000 NaN 20 1981-12-03
    13 7934 MILLER CLERK 7782.0 1982/01/23 1300 NaN 10 1982-01-23

    • OR('|') 연산자
    Python Programming
    emp.loc[(emp.deptno.isin([10,20,40]) | emp.job.isin(['CLERK','SALESMAN','ANALYST'])) , ["ename"]].head()

     

    Results
      ename
    0 SMITH
    1 ALLEN
    2 WARD
    3 JONES
    4 MARTIN

    • query() 함수와 OR('|') 연산자
    Python Programming
    emp.query(" (deptno == [10, 20, 40]) | (job == ['CLERK','SALESMAN','ANALYST']) ").head()

     

    Results
      empno ename job mgr hiredate sal comm deptno
    0 7369 SMITH CLERK 7902.0 1980/12/17 800 NaN 20
    1 7499 ALLEN SALESMAN 7698.0 1981/02/20 1600 300.0 30
    2 7521 WARD SALESMAN 7698.0 1981/02/22 1250 500.0 30
    3 7566 JONES MANAGER 7839.0 1981/04/02 2975 NaN 20
    4 7654 MARTIN SALESMAN 7698.0 1981/09/28 1250 1400.0 30

     


    3. R Programming (R Package)

    • OR('|') 연산자
    R Programming
    %%R
    emp[ (emp$deptno %in% c(10, 20, 40)) | (emp$job %in% c('CLERK','SALESMAN','ANALYST'))  , c("ename") ]

     

    Results
    # A tibble: 13 x 1
       ename 
       <chr> 
     1 SMITH 
     2 ALLEN 
     3 WARD  
     4 JONES 
     5 MARTIN
     6 CLARK 
     7 SCOTT 
     8 KING  
     9 TURNER
    10 ADAMS 
    11 JAMES 
    12 FORD  
    13 MILLER

     


    • which() 함수와 OR('|') 연산자
    R Programming
    %%R
    emp[which(emp$deptno %in% c(10, 20, 40) | (emp$job %in% c('CLERK','SALESMAN','ANALYST')) ) , c("empno","ename") ]

     

    Results
    # A tibble: 13 x 2
       empno ename 
       <dbl> <chr> 
     1  7369 SMITH 
     2  7499 ALLEN 
     3  7521 WARD  
     4  7566 JONES 
     5  7654 MARTIN
     6  7782 CLARK 
     7  7788 SCOTT 
     8  7839 KING  
     9  7844 TURNER
    10  7876 ADAMS 
    11  7900 JAMES 
    12  7902 FORD  
    13  7934 MILLER

     


    • subset() 함수와 OR('|') 연산자
    R Programming
    %%R
    subset(emp,subset= (hiredate >=  '1981/01/09' & hiredate <= '1981/11/17') , select=c(empno,ename) )

     

    Results
    # A tibble: 8 x 2
      empno ename 
      <dbl> <chr> 
    1  7499 ALLEN 
    2  7521 WARD  
    3  7566 JONES 
    4  7654 MARTIN
    5  7698 BLAKE 
    6  7782 CLARK 
    7  7839 KING  
    8  7844 TURNER

     


    4. R Dplyr Package

    • filter() 함수와 OR('|') 연산자
    R Programming
    %%R
    emp %>% filter( deptno %in% c(10, 20, 40) | job %in% c('CLERK','SALESMAN','ANALYST') ) %>% dplyr::select(ename)

     

    Results
    # A tibble: 13 x 1
       ename 
       <chr> 
     1 SMITH 
     2 ALLEN 
     3 WARD  
     4 JONES 
     5 MARTIN
     6 CLARK 
     7 SCOTT 
     8 KING  
     9 TURNER
    10 ADAMS 
    11 JAMES 
    12 FORD  
    13 MILLER

     


    5. R sqldf Package

    • OR('|') 연산자
    R Programming
    %%R
    
    sqldf("select ename from emp where deptno in (10,20,40) or job in ('CLERK','SALESMAN','ANALYST')")

     

    Results
        ename
    1   SMITH
    2   ALLEN
    3    WARD
    4   JONES
    5  MARTIN
    6   CLARK
    7   SCOTT
    8    KING
    9  TURNER
    10  ADAMS
    11  JAMES
    12   FORD
    13 MILLER

     


    6. Python pandasql Package

    • OR('|') 연산자
    Python Programming
    ps.sqldf("select ename from emp where deptno in (10,20,40) or job in ('CLERK','SALESMAN','ANALYST')")

     

    Results
      ename
    0 SMITH
    1 ALLEN
    2 WARD
    3 JONES
    4 MARTIN
    5 CLARK
    6 SCOTT
    7 KING
    8 TURNER
    9 ADAMS
    10 JAMES
    11 FORD
    12 MILLER

     


    7. R data.table Package

    • OR('|') 연산자
    R Programming
    %%R
    
    DT          <- data.table(emp)
    dept_DT     <- data.table(dept)
    
    DT[ deptno %in% c(10,20,40) | job %in% c('CLERK','SALESMAN','ANALYST') , .(ename)]

     

    Results
         ename
     1:  SMITH
     2:  ALLEN
     3:   WARD
     4:  JONES
     5: MARTIN
     6:  CLARK
     7:  SCOTT
     8:   KING
     9: TURNER
    10:  ADAMS
    11:  JAMES
    12:   FORD
    13: MILLER

     


    8. SAS Proc SQL

    • OR('|') 연산자
    SAS Programming
    %%SAS sas
    
    PROC SQL;
      CREATE TABLE STATSAS_1 AS
        SELECT A.*
        FROM   EMP A
        WHERE DEPTNO IN (10,20,40) OR JOB IN ('CLERK','SALESMAN','ANALYST');
    QUIT;
    PROC PRINT;RUN;

     

    Results
    OBS empno ename job mgr hiredate sal comm deptno
    1 7369 SMITH CLERK 7902 1980-12-17 800 . 20
    2 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30
    3 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30
    4 7566 JONES MANAGER 7839 1981-04-02 2975 . 20
    5 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30
    6 7782 CLARK MANAGER 7839 1981-01-09 2450 . 10
    7 7788 SCOTT ANALYST 7566 1982-12-09 3000 . 20
    8 7839 KING PRESIDEN . 1981-11-17 5000 . 10
    9 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30
    10 7876 ADAMS CLERK 7788 1983-01-12 1100 . 20
    11 7900 JAMES CLERK 7698 1981-12-03 950 . 30
    12 7902 FORD ANALYST 7566 1981-12-03 3000 . 20
    13 7934 MILLER CLERK 7782 1982-01-23 1300 . 10

     


    9. SAS Data Step

    • Where 구문과 OR('|') 연산자
    SAS Programming
    %%SAS sas
    
    DATA STATSAS_2;
     SET EMP;
         WHERE DEPTNO IN (10,20,40) OR JOB IN ('CLERK','SALESMAN','ANALYST');
    RUN;
    PROC PRINT;RUN;

     

    Results
    OBS empno ename job mgr hiredate sal comm deptno
    1 7369 SMITH CLERK 7902 1980-12-17 800 . 20
    2 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30
    3 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30
    4 7566 JONES MANAGER 7839 1981-04-02 2975 . 20
    5 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30
    6 7782 CLARK MANAGER 7839 1981-01-09 2450 . 10
    7 7788 SCOTT ANALYST 7566 1982-12-09 3000 . 20
    8 7839 KING PRESIDEN . 1981-11-17 5000 . 10
    9 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30
    10 7876 ADAMS CLERK 7788 1983-01-12 1100 . 20
    11 7900 JAMES CLERK 7698 1981-12-03 950 . 30
    12 7902 FORD ANALYST 7566 1981-12-03 3000 . 20
    13 7934 MILLER CLERK 7782 1982-01-23 1300 . 10

     


    10. Python Dfply Package

    • filter_by() 함수와 OR('|') 연산자
    Python Programming
    emp >> filter_by( X.deptno.isin([10,20,40]) | X.job.isin(['CLERK','SALESMAN','ANALYST']) )

     

    Results
      empno ename job mgr hiredate sal comm deptno
    0 7369 SMITH CLERK 7902.0 1980/12/17 800 NaN 20
    1 7499 ALLEN SALESMAN 7698.0 1981/02/20 1600 300.0 30
    2 7521 WARD SALESMAN 7698.0 1981/02/22 1250 500.0 30
    3 7566 JONES MANAGER 7839.0 1981/04/02 2975 NaN 20
    4 7654 MARTIN SALESMAN 7698.0 1981/09/28 1250 1400.0 30
    6 7782 CLARK MANAGER 7839.0 1981/01/09 2450 NaN 10
    7 7788 SCOTT ANALYST 7566.0 1982/12/09 3000 NaN 20
    8 7839 KING PRESIDENT NaN 1981/11/17 5000 NaN 10
    9 7844 TURNER SALESMAN 7698.0 1981/09/08 1500 0.0 30
    10 7876 ADAMS CLERK 7788.0 1983/01/12 1100 NaN 20
    11 7900 JAMES CLERK 7698.0 1981/12/03 950 NaN 30
    12 7902 FORD ANALYST 7566.0 1981/12/03 3000 NaN 20
    13 7934 MILLER CLERK 7782.0 1982/01/23 1300 NaN 10

     


    Python Programming
    emp >> filter_by( X.deptno.isin([10,20,40]) | X.job.isin(['CLERK','SALESMAN','ANALYST']) ) >> head(3)

     

    Results
      empno ename job mgr hiredate sal comm deptno
    0 7369 SMITH CLERK 7902.0 1980/12/17 800 NaN 20
    1 7499 ALLEN SALESMAN 7698.0 1981/02/20 1600 300.0 30
    2 7521 WARD SALESMAN 7698.0 1981/02/22 1250 500.0 30

     

     


     

    [SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE]   SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트

     

    반응형

    댓글