본문 바로가기
통계프로그램 비교 시리즈/오라클함수 비교(R & Python)

COALESCE Oracle Function [Oracle, Pandas, R Prog, Dplyr, Sqldf, Pandasql, Data.Table, DuckDB]

by 기서무나구물 2021. 11. 21.

포스팅 목차

    * 파이썬 & R 패키지 호출 및 예제 데이터 생성 링크


    [ COALESCE 함수 ]

     


    COALESCE 함수는 나열된 값을 순차적으로 체크하여 NULL이 아닌 첫 번째 인수를 반환한다. 모든 인수가 NULL인경우에는 NULL을 반환한다.

     

    1. Oracle(오라클)

     

    Oracle Programming
    SELECT empno,
           sal,
           comm,
           coalesce(comm,sal) comm_sal_1, 
           case when comm is null then sal else comm end comm_sal_2
    from   emp

     

    Results
    EMPNO	SAL	COMM	COMM_SAL_1	COMM_SAL_2
    ---------------------------------------------------
    7839	5000	 - 	5000		5000
    7698	2850	 - 	2850		2850
    7782	2450	 - 	2450		2450
    7566	2975	 - 	2975		2975
    7788	3000	 - 	3000		3000
    7902	3000	 - 	3000		3000
    7369	800	 - 	800		800
    7499	1600	300	300		300
    7521	1250	500	500		500
    7654	1250	1400	1400		1400
    7844	1500	0	0		0
    7876	1100	 - 	1100		1100
    7900	950	 - 	950		950
    7934	1300	 - 	1300		1300

     


    2. Python Pandas(파이썬)

     

    • stackoverflow 참고 : Coalesce values from 2 columns into a single column in a pandas dataframe [링크]

     

    Python Programming
    emp.comm.combine_first(emp.sal).head()

     

    Results
    0     800.0
    1     300.0
    2     500.0
    3    2975.0
    4    1400.0
    Name: comm, dtype: float64

     


     

    Python Programming
    emp['comm'].where(pd.notnull, emp['sal']).head()

     

    Results
    0     800.0
    1     300.0
    2     500.0
    3    2975.0
    4    1400.0
    Name: comm, dtype: float64

     


     

    Python Programming
    emp['comm'].mask(pd.isnull, emp['sal']).head()

     

    Results
    0     800.0
    1     300.0
    2     500.0
    3    2975.0
    4    1400.0
    Name: comm, dtype: float64

     


     

    Python Programming
    emp.apply(lambda x: (x['sal'] if np.isnan(x['comm']) else x['comm']), axis=1).head()

     

    Results
    0     800.0
    1     300.0
    2     500.0
    3    2975.0
    4    1400.0
    dtype: float64

     


     

    Python Programming
    np.where(emp["comm"].isnull(), emp["sal"], emp["comm"] )

     

    Results
    array([ 800.,  300.,  500., 2975., 1400., 2850., 2450., 3000., 5000.,
              0., 1100.,  950., 3000., 1300.])

     

     

     


    3. R Programming (R Package)

     

    R Programming
    %%R
    
    with(emp, coalesce(comm,sal))

     

    Results
     [1]  800  300  500 2975 1400 2850 2450 3000 5000    0 1100  950 3000 1300

     


     

    R Programming
    %%R
    
    ifelse(is.na(emp$comm), emp$sal, emp$comm)

     

    Results
     [1]  800  300  500 2975 1400 2850 2450 3000 5000    0 1100  950 3000 1300

     

     

     


    4. R Dplyr Package

     

    R Programming
    %%R
    
    emp %>%
      dplyr::mutate( comm_sal_1 = dplyr::coalesce(comm,sal),
                     comm_sal_2 = if_else(is.na(comm), sal, comm),
                     comm_sal_3 = dplyr::case_when( is.na(comm) ~ sal,
                                                    TRUE ~ comm) ) %>%
      head()

     

    Results
    # A tibble: 6 x 11
      empno ename job     mgr hiredate     sal  comm deptno comm_sal_1 comm_sal_2
      <dbl> <chr> <chr> <dbl> <date>     <dbl> <dbl>  <dbl>      <dbl>      <dbl>
    1  7369 SMITH CLERK  7902 1980-12-17   800    NA     20        800        800
    2  7499 ALLEN SALE~  7698 1981-02-20  1600   300     30        300        300
    3  7521 WARD  SALE~  7698 1981-02-22  1250   500     30        500        500
    4  7566 JONES MANA~  7839 1981-04-02  2975    NA     20       2975       2975
    5  7654 MART~ SALE~  7698 1981-09-28  1250  1400     30       1400       1400
    6  7698 BLAKE MANA~  7839 1981-03-01  2850    NA     30       2850       2850
    # ... with 1 more variable: comm_sal_3 <dbl>

     

     

     


    5. R sqldf Package

     

    R Programming
    %%R
    
    sqldf(" SELECT sal,
                   comm,
                   coalesce(comm,sal) comm_sal_1,
                   case when comm is null then sal else comm end comm_sal_2
            from emp  ")[1:5, ]

     

    Results
       sal comm comm_sal_1 comm_sal_2
    1  800   NA        800        800
    2 1600  300        300        300
    3 1250  500        500        500
    4 2975   NA       2975       2975
    5 1250 1400       1400       1400

     

     


    6. Python pandasql Package

     

    Python Programming
    ps.sqldf(" SELECT coalesce(comm,sal) comm_sal_1,  \
                      case when comm is null then sal else comm end comm_sal_2  \
               from emp  ").head()

     

    Results
    	comm_sal_1	comm_sal_2
    0	800.0		800.0
    1	300.0		300.0
    2	500.0		500.0
    3	2975.0		2975.0
    4	1400.0		1400.0

     

     


    7. R data.table Package

     

    R Programming
    %%R
    
    DT      <- data.table(emp)
    dept_DT <- data.table(dept)
    
    DT[, comm_sal_1 := fcoalesce(comm, sal) ][1:5, ]

     

    Results
       empno  ename      job  mgr   hiredate  sal comm deptno comm_sal_1
    1:  7369  SMITH    CLERK 7902 1980-12-17  800   NA     20        800
    2:  7499  ALLEN SALESMAN 7698 1981-02-20 1600  300     30        300
    3:  7521   WARD SALESMAN 7698 1981-02-22 1250  500     30        500
    4:  7566  JONES  MANAGER 7839 1981-04-02 2975   NA     20       2975
    5:  7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400     30       1400

     


     

    R Programming
    %%R
    
    DT      <- data.table(emp)
    dept_DT <- data.table(dept)
    
    DT[, `:=`( comm_sal_1 = fcoalesce(comm, sal), 
               comm_sal_2 = fifelse(is.na(comm), sal, comm), 
               comm_sal_3 = dplyr::case_when(is.na(comm) ~ sal, TRUE ~ comm))][1:5, -c("hiredate","job")  ]

     

    Results
       empno  ename  mgr  sal comm deptno comm_sal_1 comm_sal_2 comm_sal_3
    1:  7369  SMITH 7902  800   NA     20        800        800        800
    2:  7499  ALLEN 7698 1600  300     30        300        300        300
    3:  7521   WARD 7698 1250  500     30        500        500        500
    4:  7566  JONES 7839 2975   NA     20       2975       2975       2975
    5:  7654 MARTIN 7698 1250 1400     30       1400       1400       1400

     


     

    R Programming
    %%R
    
    DT[, { sal <- sal
           comm <- comm
           comm_sal_1 <- fcoalesce(comm, sal)
           comm_sal_2 <- case_when(is.na(comm) ~ sal, TRUE ~ comm)
          .(sal = sal, comm = comm, comm_sal_1 = comm_sal_1, comm_sal_2 = comm_sal_2)
         }][1:5, -c("hiredate","job")  ]

     

    Results
        sal comm comm_sal_1 comm_sal_2
    1:  800   NA        800        800
    2: 1600  300        300        300
    3: 1250  500        500        500
    4: 2975   NA       2975       2975
    5: 1250 1400       1400       1400

     


    8. Python DuckDB의 SQL

     

    Python Programming
    %%sql
      SELECT coalesce(comm,sal) comm_sal_1,
             case when comm is null then sal else comm end comm_sal_2
      from   emp

     

    Python Programming
    duckdb.sql(" SELECT coalesce(comm,sal) comm_sal_1,                                  \
                        case when comm is null then sal else comm end comm_sal_2        \
                 from   emp ").df().head()

     

    Results
       comm_sal_1  comm_sal_2
    0       800.0       800.0
    1       300.0       300.0
    2       500.0       500.0
    3      2975.0      2975.0
    4      1400.0      1400.0
     

     


    https://unsplash.com/photos/PGeslSkvPQg

     

      --------------------------------------------  

     

    [Oracle, Pandas, R Prog, Dplyr, Sqldf, Pandasql, Data.Table] 오라클 함수와 R & Python 비교 사전 목록 링크

     

    오라클 SQL 함수(Oracle SQL Function) 목록 리스트 링크

     

    [SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE] SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트 링크 링크
    반응형

    댓글