포스팅 목차
* 파이썬 & R 패키지 호출 및 예제 데이터 생성 링크
[ LISTAGG Oracle Function ]
LISTAGG는 ORDER BY 절에 지정된 각 그룹 내에서 데이터를 정렬 한 다음 기준 열의 값을 연결하여 변수의 값으로 반환한다.
- 함수 설명 :
1. Oracle(오라클)
LISTAGG() 함수
deptno 그룹 내에서 사원번호 기준으로 데이터를 정렬 후 사원 이름을 합친다.
Oracle Programming |
SELECT LISTAGG(ENAME, ',') WITHin GROUP (ORDER BY EMPNO) AS "EMPLOYEE"
FROM EMP
GROUP BY DEPTNO
Results |
EMPLOYEE
-----------------------------------------
CLARK,KING,MILLER
SMITH,JONES,SCOTT,ADAMS,FORD
ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES
2. Python Pandas(파이썬)
문자 SUM() 함수
Python Programming |
emp.sort_values('hiredate').groupby('deptno')['ename'].sum()
Results |
deptno
10 CLARKKINGMILLER
20 SMITHJONESFORDSCOTTADAMS
30 ALLENWARDBLAKETURNERMARTINJAMES
Name: ename, dtype: object
agg() 함수와 list() 함수
Python Programming |
emp.sort_values('hiredate').groupby('deptno')['ename'].agg(lambda x: list(x))
Results |
deptno
10 [CLARK, KING, MILLER]
20 [SMITH, JONES, FORD, SCOTT, ADAMS]
30 [ALLEN, WARD, BLAKE, TURNER, MARTIN, JAMES]
Name: ename, dtype: object
agg() 함수와 join() 함수
Python Programming |
emp.sort_values('hiredate').groupby('deptno')['ename'].agg(lambda x: ', '.join(x))
Results |
deptno
10 CLARK, KING, MILLER
20 SMITH, JONES, FORD, SCOTT, ADAMS
30 ALLEN, WARD, BLAKE, TURNER, MARTIN, JAMES
Name: ename, dtype: object
2) 분석함수
- transform() 함수와 join() 함수
Python Programming |
withmooc = emp.copy()
withmooc['Emp_list'] = withmooc.sort_values('hiredate').groupby('deptno')['ename'].transform(lambda x: ', '.join(x))
withmooc.head()
Results |
empno ename job mgr hiredate sal comm deptno Emp_list
0 7369 SMITH CLERK 7902.0 1980/12/17 800 NaN 20 SMITH, JONES, FORD, SCOTT, ADAMS
1 7499 ALLEN SALESMAN 7698.0 1981/02/20 1600 300.0 30 ALLEN, WARD, BLAKE, TURNER, MARTIN, JAMES
2 7521 WARD SALESMAN 7698.0 1981/02/22 1250 500.0 30 ALLEN, WARD, BLAKE, TURNER, MARTIN, JAMES
3 7566 JONES MANAGER 7839.0 1981/04/02 2975 NaN 20 SMITH, JONES, FORD, SCOTT, ADAMS
4 7654 MARTIN SALESMAN 7698.0 1981/09/28 1250 1400.0 30 ALLEN, WARD, BLAKE, TURNER, MARTIN, JAMES
3. R Programming (R Package)
aggregate() 함수
R Programming |
%%R
aggregate(ename ~ deptno, data=emp[order(emp$hiredate), ], c)
Results |
deptno ename
1 10 CLARK, KING, MILLER
2 20 SMITH, JONES, FORD, SCOTT, ADAMS
3 30 ALLEN, WARD, BLAKE, TURNER, MARTIN, JAMES
aggregate() 함수와 paste() 함수
R Programming |
%%R
aggregate(ename ~ deptno, data=emp[order(emp$hiredate), ] , FUN=function(x) paste(x, collapse = "; "))
Results |
deptno ename
1 10 CLARK; KING; MILLER
2 20 SMITH; JONES; FORD; SCOTT; ADAMS
3 30 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES
ave() 함수와 paste() 함수
R Programming |
%%R
within(emp[order(emp$hiredate), ], { Emp_list <- ave(ename, deptno, FUN = function(x) paste(x, collapse = "; ")) })
Results |
# A tibble: 14 x 9
empno ename job mgr hiredate sal comm deptno Emp_list
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl> <chr>
1 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 SMITH; JONES; FORD; ~
2 7782 CLARK MANAG~ 7839 1981-01-09 2450 NA 10 CLARK; KING; MILLER
3 7499 ALLEN SALES~ 7698 1981-02-20 1600 300 30 ALLEN; WARD; BLAKE; ~
4 7521 WARD SALES~ 7698 1981-02-22 1250 500 30 ALLEN; WARD; BLAKE; ~
5 7698 BLAKE MANAG~ 7839 1981-03-01 2850 NA 30 ALLEN; WARD; BLAKE; ~
6 7566 JONES MANAG~ 7839 1981-04-02 2975 NA 20 SMITH; JONES; FORD; ~
7 7844 TURNER SALES~ 7698 1981-09-08 1500 0 30 ALLEN; WARD; BLAKE; ~
8 7654 MARTIN SALES~ 7698 1981-09-28 1250 1400 30 ALLEN; WARD; BLAKE; ~
9 7839 KING PRESI~ NA 1981-11-17 5000 NA 10 CLARK; KING; MILLER
10 7900 JAMES CLERK 7698 1981-12-03 950 NA 30 ALLEN; WARD; BLAKE; ~
11 7902 FORD ANALY~ 7566 1981-12-03 3000 NA 20 SMITH; JONES; FORD; ~
12 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10 CLARK; KING; MILLER
13 7788 SCOTT ANALY~ 7566 1982-12-09 3000 NA 20 SMITH; JONES; FORD; ~
14 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20 SMITH; JONES; FORD; ~
4. R Dplyr Package
paste() 함수
R Programming |
%%R
emp %>%
dplyr::arrange(hiredate) %>%
dplyr::group_by(deptno) %>%
dplyr::mutate( Emp_list = paste(ename, collapse = "; "))
Results |
# A tibble: 14 x 9
# Groups: deptno [3]
empno ename job mgr hiredate sal comm deptno Emp_list
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl> <chr>
1 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 SMITH; JONES; FORD; ~
2 7782 CLARK MANAG~ 7839 1981-01-09 2450 NA 10 CLARK; KING; MILLER
3 7499 ALLEN SALES~ 7698 1981-02-20 1600 300 30 ALLEN; WARD; BLAKE; ~
4 7521 WARD SALES~ 7698 1981-02-22 1250 500 30 ALLEN; WARD; BLAKE; ~
5 7698 BLAKE MANAG~ 7839 1981-03-01 2850 NA 30 ALLEN; WARD; BLAKE; ~
6 7566 JONES MANAG~ 7839 1981-04-02 2975 NA 20 SMITH; JONES; FORD; ~
7 7844 TURNER SALES~ 7698 1981-09-08 1500 0 30 ALLEN; WARD; BLAKE; ~
8 7654 MARTIN SALES~ 7698 1981-09-28 1250 1400 30 ALLEN; WARD; BLAKE; ~
9 7839 KING PRESI~ NA 1981-11-17 5000 NA 10 CLARK; KING; MILLER
10 7900 JAMES CLERK 7698 1981-12-03 950 NA 30 ALLEN; WARD; BLAKE; ~
11 7902 FORD ANALY~ 7566 1981-12-03 3000 NA 20 SMITH; JONES; FORD; ~
12 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10 CLARK; KING; MILLER
13 7788 SCOTT ANALY~ 7566 1982-12-09 3000 NA 20 SMITH; JONES; FORD; ~
14 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20 SMITH; JONES; FORD; ~
5. R sqldf Package
group_concat() 함수
R Programming |
%%R
sqldf(" SELECT deptno ,
GROUP_CONCAT(ename, '; ') grp_concat
FROM ( SELECT *
FROM emp
order by deptno,hiredate )
GROUP BY deptno
order by hiredate")
Results |
deptno grp_concat
1 20 SMITH; JONES; FORD; SCOTT; ADAMS
2 10 CLARK; KING; MILLER
3 30 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES
6. Python pandasql Package
group_concat() 함수
Python Programming |
ps.sqldf(" SELECT deptno , \
GROUP_CONCAT(ename, '; ') grp_concat \
FROM ( SELECT * \
FROM emp \
order by deptno,hiredate ) \
GROUP BY deptno \
order by hiredate ")
Results |
deptno grp_concat
0 20 SMITH; JONES; FORD; SCOTT; ADAMS
1 10 CLARK; KING; MILLER
2 30 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES
7. R data.table Package
paste() 함수
R Programming |
%%R
DT <- data.table(emp)
dept_DT <- data.table(dept)
DT[order(hiredate), Emp_list := paste(ename, collapse = "; "), by=deptno][, c('sal','comm','mgr','hiredate') := NULL]
Results |
empno ename comm deptno Emp_list
1: 7369 SMITH NA 20 SMITH; JONES; FORD; SCOTT; ADAMS
2: 7499 ALLEN 300 30 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES
3: 7521 WARD 500 30 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES
4: 7566 JONES NA 20 SMITH; JONES; FORD; SCOTT; ADAMS
5: 7654 MARTIN 1400 30 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES
6: 7698 BLAKE NA 30 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES
7: 7782 CLARK NA 10 CLARK; KING; MILLER
8: 7788 SCOTT NA 20 SMITH; JONES; FORD; SCOTT; ADAMS
9: 7839 KING NA 10 CLARK; KING; MILLER
10: 7844 TURNER 0 30 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES
11: 7876 ADAMS NA 20 SMITH; JONES; FORD; SCOTT; ADAMS
12: 7900 JAMES NA 30 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES
13: 7902 FORD NA 20 SMITH; JONES; FORD; SCOTT; ADAMS
14: 7934 MILLER NA 10 CLARK; KING; MILLER
8. Python Duckdb의 SQL
Python Programming |
%%sql
SELECT deptno ,
group_concat(ename, '; ') grp_concat_1,
string_agg(ename, '; ') grp_concat_2
FROM ( SELECT *
FROM emp
order by deptno,hiredate )
GROUP BY deptno
Python Programming |
duckdb.sql(" SELECT deptno , \
group_concat(ename, '; ') grp_concat_1, \
string_agg(ename, '; ') grp_concat_2 \
FROM ( SELECT * \
FROM emp \
order by deptno,hiredate ) \
GROUP BY deptno ").df()
Results |
deptno grp_concat_1 \
0 10 CLARK; KING; MILLER
1 20 SMITH; JONES; FORD; SCOTT; ADAMS
2 30 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES
grp_concat_2
0 CLARK; KING; MILLER
1 SMITH; JONES; FORD; SCOTT; ADAMS
2 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES
--------------------------------------------
[Oracle, Pandas, R Prog, Dplyr, Sqldf, Pandasql, Data.Table] 오라클 함수와 R & Python 비교 사전 목록 링크 |
오라클 SQL 함수(Oracle SQL Function) 목록 리스트 링크 |
[SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE] SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트 링크 링크 |
반응형
댓글