포스팅 목차
* 파이썬 & R 패키지 호출 및 예제 데이터 생성 링크
[ COALESCE 함수 ]
COALESCE 함수는 나열된 값을 순차적으로 체크하여 NULL이 아닌 첫 번째 인수를 반환한다. 모든 인수가 NULL인경우에는 NULL을 반환한다.
- 함수 설명 : COALESCE 오라클 함수 링크
1. Oracle(오라클)
Oracle Programming |
SELECT empno,
sal,
comm,
coalesce(comm,sal) comm_sal_1,
case when comm is null then sal else comm end comm_sal_2
from emp
Results |
EMPNO SAL COMM COMM_SAL_1 COMM_SAL_2
---------------------------------------------------
7839 5000 - 5000 5000
7698 2850 - 2850 2850
7782 2450 - 2450 2450
7566 2975 - 2975 2975
7788 3000 - 3000 3000
7902 3000 - 3000 3000
7369 800 - 800 800
7499 1600 300 300 300
7521 1250 500 500 500
7654 1250 1400 1400 1400
7844 1500 0 0 0
7876 1100 - 1100 1100
7900 950 - 950 950
7934 1300 - 1300 1300
2. Python Pandas(파이썬)
- stackoverflow 참고 : Coalesce values from 2 columns into a single column in a pandas dataframe [링크]
Python Programming |
emp.comm.combine_first(emp.sal).head()
Results |
0 800.0
1 300.0
2 500.0
3 2975.0
4 1400.0
Name: comm, dtype: float64
Python Programming |
emp['comm'].where(pd.notnull, emp['sal']).head()
Results |
0 800.0
1 300.0
2 500.0
3 2975.0
4 1400.0
Name: comm, dtype: float64
Python Programming |
emp['comm'].mask(pd.isnull, emp['sal']).head()
Results |
0 800.0
1 300.0
2 500.0
3 2975.0
4 1400.0
Name: comm, dtype: float64
Python Programming |
emp.apply(lambda x: (x['sal'] if np.isnan(x['comm']) else x['comm']), axis=1).head()
Results |
0 800.0
1 300.0
2 500.0
3 2975.0
4 1400.0
dtype: float64
Python Programming |
np.where(emp["comm"].isnull(), emp["sal"], emp["comm"] )
Results |
array([ 800., 300., 500., 2975., 1400., 2850., 2450., 3000., 5000.,
0., 1100., 950., 3000., 1300.])
3. R Programming (R Package)
R Programming |
%%R
with(emp, coalesce(comm,sal))
Results |
[1] 800 300 500 2975 1400 2850 2450 3000 5000 0 1100 950 3000 1300
R Programming |
%%R
ifelse(is.na(emp$comm), emp$sal, emp$comm)
Results |
[1] 800 300 500 2975 1400 2850 2450 3000 5000 0 1100 950 3000 1300
4. R Dplyr Package
R Programming |
%%R
emp %>%
dplyr::mutate( comm_sal_1 = dplyr::coalesce(comm,sal),
comm_sal_2 = if_else(is.na(comm), sal, comm),
comm_sal_3 = dplyr::case_when( is.na(comm) ~ sal,
TRUE ~ comm) ) %>%
head()
Results |
# A tibble: 6 x 11
empno ename job mgr hiredate sal comm deptno comm_sal_1 comm_sal_2
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl> <dbl> <dbl>
1 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 800 800
2 7499 ALLEN SALE~ 7698 1981-02-20 1600 300 30 300 300
3 7521 WARD SALE~ 7698 1981-02-22 1250 500 30 500 500
4 7566 JONES MANA~ 7839 1981-04-02 2975 NA 20 2975 2975
5 7654 MART~ SALE~ 7698 1981-09-28 1250 1400 30 1400 1400
6 7698 BLAKE MANA~ 7839 1981-03-01 2850 NA 30 2850 2850
# ... with 1 more variable: comm_sal_3 <dbl>
5. R sqldf Package
R Programming |
%%R
sqldf(" SELECT sal,
comm,
coalesce(comm,sal) comm_sal_1,
case when comm is null then sal else comm end comm_sal_2
from emp ")[1:5, ]
Results |
sal comm comm_sal_1 comm_sal_2
1 800 NA 800 800
2 1600 300 300 300
3 1250 500 500 500
4 2975 NA 2975 2975
5 1250 1400 1400 1400
6. Python pandasql Package
Python Programming |
ps.sqldf(" SELECT coalesce(comm,sal) comm_sal_1, \
case when comm is null then sal else comm end comm_sal_2 \
from emp ").head()
Results |
comm_sal_1 comm_sal_2
0 800.0 800.0
1 300.0 300.0
2 500.0 500.0
3 2975.0 2975.0
4 1400.0 1400.0
7. R data.table Package
R Programming |
%%R
DT <- data.table(emp)
dept_DT <- data.table(dept)
DT[, comm_sal_1 := fcoalesce(comm, sal) ][1:5, ]
Results |
empno ename job mgr hiredate sal comm deptno comm_sal_1
1: 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 800
2: 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30 300
3: 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 500
4: 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 2975
5: 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30 1400
R Programming |
%%R
DT <- data.table(emp)
dept_DT <- data.table(dept)
DT[, `:=`( comm_sal_1 = fcoalesce(comm, sal),
comm_sal_2 = fifelse(is.na(comm), sal, comm),
comm_sal_3 = dplyr::case_when(is.na(comm) ~ sal, TRUE ~ comm))][1:5, -c("hiredate","job") ]
Results |
empno ename mgr sal comm deptno comm_sal_1 comm_sal_2 comm_sal_3
1: 7369 SMITH 7902 800 NA 20 800 800 800
2: 7499 ALLEN 7698 1600 300 30 300 300 300
3: 7521 WARD 7698 1250 500 30 500 500 500
4: 7566 JONES 7839 2975 NA 20 2975 2975 2975
5: 7654 MARTIN 7698 1250 1400 30 1400 1400 1400
R Programming |
%%R
DT[, { sal <- sal
comm <- comm
comm_sal_1 <- fcoalesce(comm, sal)
comm_sal_2 <- case_when(is.na(comm) ~ sal, TRUE ~ comm)
.(sal = sal, comm = comm, comm_sal_1 = comm_sal_1, comm_sal_2 = comm_sal_2)
}][1:5, -c("hiredate","job") ]
Results |
sal comm comm_sal_1 comm_sal_2
1: 800 NA 800 800
2: 1600 300 300 300
3: 1250 500 500 500
4: 2975 NA 2975 2975
5: 1250 1400 1400 1400
8. Python DuckDB의 SQL
Python Programming |
%%sql
SELECT coalesce(comm,sal) comm_sal_1,
case when comm is null then sal else comm end comm_sal_2
from emp
Python Programming |
duckdb.sql(" SELECT coalesce(comm,sal) comm_sal_1, \
case when comm is null then sal else comm end comm_sal_2 \
from emp ").df().head()
Results |
comm_sal_1 comm_sal_2
0 800.0 800.0
1 300.0 300.0
2 500.0 500.0
3 2975.0 2975.0
4 1400.0 1400.0
--------------------------------------------
[Oracle, Pandas, R Prog, Dplyr, Sqldf, Pandasql, Data.Table] 오라클 함수와 R & Python 비교 사전 목록 링크 |
오라클 SQL 함수(Oracle SQL Function) 목록 리스트 링크 |
[SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE] SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트 링크 링크 |
반응형
댓글