본문 바로가기
통계프로그램 비교 시리즈/프로그래밍비교(Oracle,Python,R,SAS)

[데이터 변환] 조건절을 만족하는 데이터 변경 - 98

by 기서무나구물 2022. 9. 28.

포스팅 목차

     

    98. Update the salary of each employee by 10% increments that are not eligible for commission.

     

    * 커미션을 받을 수 없는 직원의 급여를 10%씩 인상하여서 업데이트 하여라.


    • Oracle : update 구문
    • 파이썬 Pandas : np.where(), isna(), loc[], apply(), lambda() 함수
    • R 프로그래밍 : ifelse(), transform()
    • R Dplyr Package : dplyr::mutate(), is.na(), ifelse(), case_when()
    • R sqldf Package : update 구문
    • Python pandasql Package : update 구문
    • R data.table Package : is.na(), ifelse(), dplyr::case_when()
    • SAS Proc SQL : update 구문, CASE WHEN 절
    • SAS Data Step : IF 조건문
    • Python Dfply Package : mutate(), make_symbolic(), np.where()
    • 파이썬 Base 프로그래밍 :

     


    1. Oracle(오라클)

    커미션이 존재하지 않는 직원의 급여를 10% 추가하여서 sal 변수를 update 한다.

    Oracle Programming
    create table withmooc as
      select *
      from   emp;
    
    update withmooc 
      set  sal=sal+(sal*10/100) 
      where comm is null;
    
    select *
    from   withmooc

     


    2. Python Pandas(파이썬)

    np.where 함수를 사용하여서 커미션이 지급되지 않는 직원을 선택하고, 이 직원들의 급여를 10%씩 인상한다.

    Python Programming
    import copy
    withmooc =copy.copy(emp)
    
    withmooc['sal'] = np.where(withmooc['comm'].isna(), withmooc['sal']+(withmooc['sal']*10/100), withmooc['sal'])
    
    withmooc.head()

     


    Results
    empno ename job mgr hiredate sal comm deptno
    7369 SMITH CLERK 7902.0 1980/12/17 880.0 NaN 20
    7499 ALLEN SALESMAN 7698.0 1981/02/20 1600.0 300.0 30
    7521 WARD SALESMAN 7698.0 1981/02/22 1250.0 500.0 30
    7566 JONES MANAGER 7839.0 1981/04/02 3272.5 NaN 20
    7654 MARTIN SALESMAN 7698.0 1981/09/28 1250.0 1400.0 30

     


    커미션이 지급되지 않는 직원을 선택하여서 이 직원들의 급여를 10%씩 인상하여서 변경한다.

    Python Programming
    import copy
    withmooc =copy.copy(emp)
    
    withmooc.loc[withmooc['comm'].isna(),'sal'] = withmooc['sal']+(withmooc['sal']*10/100)
    
    withmooc.head()

     


    Results
    empno ename job mgr hiredate sal comm deptno
    7369 SMITH CLERK 7902.0 1980/12/17 880.0 NaN 20
    7499 ALLEN SALESMAN 7698.0 1981/02/20 1600.0 300.0 30
    7521 WARD SALESMAN 7698.0 1981/02/22 1250.0 500.0 30
    7566 JONES MANAGER 7839.0 1981/04/02 3272.5 NaN 20
    7654 MARTIN SALESMAN 7698.0 1981/09/28 1250.0 1400.0 30

     


    if 함수를 사용하여서 커미션이 지급되지 않는 직원을 선택하고, 이 직원들의 급여를 10%씩 인상한다.

    Python Programming
    import copy
    withmooc =copy.copy(emp)
    
    withmooc["sal"] = withmooc.apply(lambda x: (x['sal']+(x['sal']*10/100)) if np.isnan(x["comm"]) else x["sal"] , axis=1)
    
    withmooc.head()

     

    Results
    empno ename job mgr hiredate sal comm deptno
    7369 SMITH CLERK 7902.0 1980/12/17 880.0 NaN 20
    7499 ALLEN SALESMAN 7698.0 1981/02/20 1600.0 300.0 30
    7521 WARD SALESMAN 7698.0 1981/02/22 1250.0 500.0 30
    7566 JONES MANAGER 7839.0 1981/04/02 3272.5 NaN 20
    7654 MARTIN SALESMAN 7698.0 1981/09/28 1250.0 1400.0 30

     


    3. R Programming (R Package)

    ifelse 함수를 사용하여서 커미션이 지급되지 않는 직원을 선택하고, 이 직원들의 급여를 10%씩 인상하여 변경한다.

    R Programming
    %%R
    
    withmooc <- emp
    
    withmooc$sal = ifelse(is.na(withmooc$comm), withmooc$sal+(withmooc$sal*10/100) , withmooc$sal)
    
    head(withmooc)

     

    Results
    # A tibble: 6 x 8
      empno ename  job        mgr hiredate     sal  comm deptno
      <dbl> <chr>  <chr>    <dbl> <date>     <dbl> <dbl>  <dbl>
    1  7369 SMITH  CLERK     7902 1980-12-17  880     NA     20
    2  7499 ALLEN  SALESMAN  7698 1981-02-20 1600    300     30
    3  7521 WARD   SALESMAN  7698 1981-02-22 1250    500     30
    4  7566 JONES  MANAGER   7839 1981-04-02 3272.    NA     20
    5  7654 MARTIN SALESMAN  7698 1981-09-28 1250   1400     30
    6  7698 BLAKE  MANAGER   7839 1981-03-01 3135     NA     30

     


    ifelse 함수를 사용하여서 커미션이 지급되지 않는 직원을 선택하고, transform 함수를 사용하여서 이 직원들의 급여를 10%씩 인상하여 변경한다.

    R Programming
    %%R
    
    withmooc <- emp
    
    withmooc <- transform(withmooc, sal = ifelse(is.na(withmooc$comm), withmooc$sal+(withmooc$sal*10/100) , withmooc$sal))
    
    head(withmooc)

     

    Results
      empno  ename      job  mgr   hiredate    sal comm deptno
    1  7369  SMITH    CLERK 7902 1980-12-17  880.0   NA     20
    2  7499  ALLEN SALESMAN 7698 1981-02-20 1600.0  300     30
    3  7521   WARD SALESMAN 7698 1981-02-22 1250.0  500     30
    4  7566  JONES  MANAGER 7839 1981-04-02 3272.5   NA     20
    5  7654 MARTIN SALESMAN 7698 1981-09-28 1250.0 1400     30
    6  7698  BLAKE  MANAGER 7839 1981-03-01 3135.0   NA     30

     


    4. R Dplyr Package

    ifelse 함수를 사용하여서 커미션이 지급되지 않는 직원을 선택하고, dplyr::mutate 함수를 사용하여서 이 직원들의 급여를 10%씩 인상한 값으로 변경한다.

    R Programming
    %%R
    
    withmooc <- emp
    
    withmooc %>% 
      dplyr::mutate( sal = ifelse(is.na(withmooc$comm), withmooc$sal+(withmooc$sal*10/100) , withmooc$sal) ) %>%
      head()

     

    Results
    # A tibble: 6 x 8
      empno ename  job        mgr hiredate     sal  comm deptno
      <dbl> <chr>  <chr>    <dbl> <date>     <dbl> <dbl>  <dbl>
    1  7369 SMITH  CLERK     7902 1980-12-17  880     NA     20
    2  7499 ALLEN  SALESMAN  7698 1981-02-20 1600    300     30
    3  7521 WARD   SALESMAN  7698 1981-02-22 1250    500     30
    4  7566 JONES  MANAGER   7839 1981-04-02 3272.    NA     20
    5  7654 MARTIN SALESMAN  7698 1981-09-28 1250   1400     30
    6  7698 BLAKE  MANAGER   7839 1981-03-01 3135     NA     30

     


    case_when 함수를 사용하여서 커미션이 지급되지 않는 직원을 선택하고, 이 직원들의 급여를 10%씩 인상하여 변경한다.

    R Programming
    %%R
    
    withmooc <- emp
    
    withmooc %>% 
      dplyr::mutate( sal = case_when(is.na(comm) ~ sal+(sal*10/100), TRUE ~ sal) ) %>%
      head()

     

    Results
    # A tibble: 6 x 8
      empno ename  job        mgr hiredate     sal  comm deptno
      <dbl> <chr>  <chr>    <dbl> <date>     <dbl> <dbl>  <dbl>
    1  7369 SMITH  CLERK     7902 1980-12-17  880     NA     20
    2  7499 ALLEN  SALESMAN  7698 1981-02-20 1600    300     30
    3  7521 WARD   SALESMAN  7698 1981-02-22 1250    500     30
    4  7566 JONES  MANAGER   7839 1981-04-02 3272.    NA     20
    5  7654 MARTIN SALESMAN  7698 1981-09-28 1250   1400     30
    6  7698 BLAKE  MANAGER   7839 1981-03-01 3135     NA     30

     


    5. R sqldf Package

    • UPDATE 구문 문제 : 8. Why am I having problems with update?
    • : https://cran.r-project.org/web/packages/sqldf/README.html
    • 커미션이 존재하지 않는 직원의 급여를 10% 추가하여서 sal 변수를 update 한다.
    • Sqlite 의 update / delete 구문 : select / where 구문으로 처리 가능 (98번 / 119번)
    R Programming
    %%R
    
    withmooc <- emp
    
    withmooc <- sqldf(c("UPDATE withmooc SET sal=sal+(sal*10/100) WHERE comm is null", "SELECT * FROM main.withmooc"), method = "raw")
    withmooc %>% head(7)

     

    Results
      empno  ename      job  mgr hiredate    sal comm deptno
    1  7369  SMITH    CLERK 7902     4003  880.0   NA     20
    2  7499  ALLEN SALESMAN 7698     4068 1600.0  300     30
    3  7521   WARD SALESMAN 7698     4070 1250.0  500     30
    4  7566  JONES  MANAGER 7839     4109 3272.5   NA     20
    5  7654 MARTIN SALESMAN 7698     4288 1250.0 1400     30
    6  7698  BLAKE  MANAGER 7839     4077 3135.0   NA     30
    7  7782  CLARK  MANAGER 7839     4026 2695.0   NA     10

     


    6. Python pandasql Package

    • 현재 pandasql에서 update 제대로 작동 안 함.
    Python Programming
    import copy
    withmooc =copy.copy(emp)
    
    withmooc_1 = ps.sqldf("select empno,ename,job,mgr,hiredate, \
                                  case when comm is null then (sal+(sal*10/100)) else sal end as sal, \
                                  comm, deptno from withmooc a")
    
    withmooc_1.head(7)

     


    Results
    empno ename job mgr hiredate sal comm deptno
    7369 SMITH CLERK 7902.0 1980/12/17 880 NaN 20
    7499 ALLEN SALESMAN 7698.0 1981/02/20 1600 300.0 30
    7521 WARD SALESMAN 7698.0 1981/02/22 1250 500.0 30
    7566 JONES MANAGER 7839.0 1981/04/02 3272 NaN 20
    7654 MARTIN SALESMAN 7698.0 1981/09/28 1250 1400.0 30
    7698 BLAKE MANAGER 7839.0 1981/03/01 3135 NaN 30
    7782 CLARK MANAGER 7839.0 1981/01/09 2695 NaN 10

     


    7. R data.table Package

    커미션이 지급되지 않는 직원을 선택하고, 이 직원들의 급여를 10%씩 인상하여 변경한다.

    R Programming
    %%R
    
    library(data.table)
    # library(plyr)
    
    DT      <- data.table(emp)
    dept_DT <- data.table(dept)
    
    DT[is.na(comm) , sal := (sal+(sal*10/100))][1:7, ]

     

    Results
       empno  ename      job  mgr   hiredate    sal comm deptno
    1:  7369  SMITH    CLERK 7902 1980-12-17  880.0   NA     20
    2:  7499  ALLEN SALESMAN 7698 1981-02-20 1600.0  300     30
    3:  7521   WARD SALESMAN 7698 1981-02-22 1250.0  500     30
    4:  7566  JONES  MANAGER 7839 1981-04-02 3272.5   NA     20
    5:  7654 MARTIN SALESMAN 7698 1981-09-28 1250.0 1400     30
    6:  7698  BLAKE  MANAGER 7839 1981-03-01 3135.0   NA     30
    7:  7782  CLARK  MANAGER 7839 1981-01-09 2695.0   NA     10

     


    ifelse 함수를 사용하여서 커미션이 지급되지 않는 직원을 선택하고, 이 직원들의 급여를 10%씩 인상하여 변경한다.

    R Programming
    %%R
    
    library(data.table)
    library(dplyr)
    
    DT      <- data.table(emp)
    dept_DT <- data.table(dept)
    
    DT[ , sal := ifelse(is.na(comm), (sal+(sal*10/100)) , sal ) ][1:7, ]

     

    Results
       empno  ename      job  mgr   hiredate    sal comm deptno
    1:  7369  SMITH    CLERK 7902 1980-12-17  880.0   NA     20
    2:  7499  ALLEN SALESMAN 7698 1981-02-20 1600.0  300     30
    3:  7521   WARD SALESMAN 7698 1981-02-22 1250.0  500     30
    4:  7566  JONES  MANAGER 7839 1981-04-02 3272.5   NA     20
    5:  7654 MARTIN SALESMAN 7698 1981-09-28 1250.0 1400     30
    6:  7698  BLAKE  MANAGER 7839 1981-03-01 3135.0   NA     30
    7:  7782  CLARK  MANAGER 7839 1981-01-09 2695.0   NA     10

     


    case_when 함수를 사용하여서 커미션이 지급되지 않는 직원을 선택하고, 이 직원들의 급여를 10%씩 인상하여 변경한다.

    R Programming
    %%R
    
    library(data.table)
    library(dplyr)
    
    DT      <- data.table(emp)
    dept_DT <- data.table(dept)
    
    DT[, sal := dplyr::case_when(is.na(comm) ~ sal+(sal*10/100), TRUE ~ sal) ][1:7, ]

     

    Results
       empno  ename      job  mgr   hiredate    sal comm deptno
    1:  7369  SMITH    CLERK 7902 1980-12-17  880.0   NA     20
    2:  7499  ALLEN SALESMAN 7698 1981-02-20 1600.0  300     30
    3:  7521   WARD SALESMAN 7698 1981-02-22 1250.0  500     30
    4:  7566  JONES  MANAGER 7839 1981-04-02 3272.5   NA     20
    5:  7654 MARTIN SALESMAN 7698 1981-09-28 1250.0 1400     30
    6:  7698  BLAKE  MANAGER 7839 1981-03-01 3135.0   NA     30
    7:  7782  CLARK  MANAGER 7839 1981-01-09 2695.0   NA     10

     


    8. SAS Proc SQL

     

    SAS Programming
    %%SAS sas
    
    DATA withmooc;
     SET emp;
    RUN;
    
    PROC SQL;
      UPDATE withmooc 
      SET    sal=sal+(sal*10/100) 
      WHERE  comm is null;
    QUIT;
    PROC PRINT data=withmooc(obs=3);RUN;

     


    Results
    empno ename job mgr hiredate sal comm deptno
    7369 SMITH CLERK 7902 1980-12-17 880 . 20
    7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30
    7521 WARD SALESMAN 7698 1981-02-22 1250 500 30

     


     

    SAS Programming
    %%SAS sas
    
    PROC SQL;
      CREATE TABLE STATSAS_1 AS
        select EMPNO,
               ENAME,
               SAL,
               CASE WHEN comm = . THEN sal+(sal*10/100)
               ELSE SAL END AS NEW_SAL
        from   emp;
    QUIT;
    PROC PRINT data=STATSAS_1(obs=3);RUN;

     

    Results
    empno ename sal NEW_SAL
    7369 SMITH 800 880
    7499 ALLEN 1600 1600
    7521 WARD 1250 1250

     


    9. SAS Data Step

     

    SAS Programming
    %%SAS sas
    
    DATA STATSAS_2(RENAME=NEW_SAL = sal);
     SET EMP;
         IF comm = . THEN NEW_SAL = sal+(sal*10/100);
         ELSE             NEW_SAL = SAL;
         DROP SAL;
    RUN;
    
    PROC PRINT data=STATSAS_2(obs=3);RUN;

     


    Results
    empno ename job mgr hiredate comm deptno sal
    7369 SMITH CLERK 7902 1980-12-17 . 20 880
    7499 ALLEN SALESMAN 7698 1981-02-20 300 30 1600
    7521 WARD SALESMAN 7698 1981-02-22 500 30 1250

     


    10. Python Dfply Package

    • [통계비교] 데이터 관리 (3. 결측값 할당을 위한 조건 변환) 참고
    • make_symbolic() 함수를 사용하여서 np.where() 조건문 적용
    Python Programming
    import copy
    withmooc =copy.copy(emp)
    
    withmooc >> mutate( sal = make_symbolic(np.where)(X.comm.isna(), X.sal+(X.sal*10/100), X.sal) ) >> \
      head()

     

    Results
    empno ename job mgr hiredate sal comm deptno
    7369 SMITH CLERK 7902.0 1980/12/17 880.0 NaN 20
    7499 ALLEN SALESMAN 7698.0 1981/02/20 1600.0 300.0 30
    7521 WARD SALESMAN 7698.0 1981/02/22 1250.0 500.0 30
    7566 JONES MANAGER 7839.0 1981/04/02 3272.5 NaN 20
    7654 MARTIN SALESMAN 7698.0 1981/09/28 1250.0 1400.0 30

     



    [SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE]   SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트

    반응형

    댓글