본문 바로가기
통계프로그램 비교 시리즈/오라클함수 비교(R & Python)

LISTAGG 오라클 함수 [Oracle, Pandas, R Prog, Dplyr, Sqldf, Pandasql, Data.Table, DuckDB]

by 기서무나구물 2022. 1. 4.

포스팅 목차

    * 파이썬 & R 패키지 호출 및 예제 데이터 생성 링크


    [ LISTAGG Oracle Function ]

     


    LISTAGG는 ORDER BY 절에 지정된 각 그룹 내에서 데이터를 정렬 한 다음 기준 열의 값을 연결하여 변수의 값으로 반환한다.

     

    • 함수 설명 : 

     


    1. Oracle(오라클)

     

    LISTAGG() 함수

    deptno 그룹 내에서 사원번호 기준으로 데이터를 정렬 후 사원 이름을 합친다.

     

    Oracle Programming
    SELECT LISTAGG(ENAME, ',') WITHin GROUP (ORDER BY EMPNO) AS "EMPLOYEE"
    FROM   EMP 
    GROUP BY DEPTNO

     

    Results
    EMPLOYEE
    -----------------------------------------
    CLARK,KING,MILLER
    SMITH,JONES,SCOTT,ADAMS,FORD
    ALLEN,WARD,MARTIN,BLAKE,TURNER,JAMES

     


    2. Python Pandas(파이썬)

     

    문자 SUM() 함수

     

    Python Programming
    emp.sort_values('hiredate').groupby('deptno')['ename'].sum()

     

    Results
    deptno
    10                    CLARKKINGMILLER
    20           SMITHJONESFORDSCOTTADAMS
    30    ALLENWARDBLAKETURNERMARTINJAMES
    Name: ename, dtype: object

     


    agg() 함수와 list() 함수

     

    Python Programming
    emp.sort_values('hiredate').groupby('deptno')['ename'].agg(lambda x: list(x))

     

    Results
    deptno
    10                          [CLARK, KING, MILLER]
    20             [SMITH, JONES, FORD, SCOTT, ADAMS]
    30    [ALLEN, WARD, BLAKE, TURNER, MARTIN, JAMES]
    Name: ename, dtype: object

     


    agg() 함수와 join() 함수

     

    Python Programming
    emp.sort_values('hiredate').groupby('deptno')['ename'].agg(lambda x: ', '.join(x))

     

    Results
    deptno
    10                          CLARK, KING, MILLER
    20             SMITH, JONES, FORD, SCOTT, ADAMS
    30    ALLEN, WARD, BLAKE, TURNER, MARTIN, JAMES
    Name: ename, dtype: object

     


    2) 분석함수

     

    • transform() 함수와 join() 함수

     

    Python Programming
    withmooc = emp.copy()
    
    withmooc['Emp_list'] = withmooc.sort_values('hiredate').groupby('deptno')['ename'].transform(lambda x: ', '.join(x))
    
    withmooc.head()

     

    Results
    	empno	ename	job		mgr	hiredate	sal	comm	deptno	Emp_list
    0	7369	SMITH	CLERK		7902.0	1980/12/17	800	NaN	20	SMITH, JONES, FORD, SCOTT, ADAMS
    1	7499	ALLEN	SALESMAN	7698.0	1981/02/20	1600	300.0	30	ALLEN, WARD, BLAKE, TURNER, MARTIN, JAMES
    2	7521	WARD	SALESMAN	7698.0	1981/02/22	1250	500.0	30	ALLEN, WARD, BLAKE, TURNER, MARTIN, JAMES
    3	7566	JONES	MANAGER		7839.0	1981/04/02	2975	NaN	20	SMITH, JONES, FORD, SCOTT, ADAMS
    4	7654	MARTIN	SALESMAN	7698.0	1981/09/28	1250	1400.0	30	ALLEN, WARD, BLAKE, TURNER, MARTIN, JAMES

     

     

     


    3. R Programming (R Package)

     

    aggregate() 함수

     

    R Programming
    %%R
    
    aggregate(ename ~ deptno, data=emp[order(emp$hiredate), ], c)

     

    Results
      deptno                                     ename
    1     10                       CLARK, KING, MILLER
    2     20          SMITH, JONES, FORD, SCOTT, ADAMS
    3     30 ALLEN, WARD, BLAKE, TURNER, MARTIN, JAMES

     


    aggregate() 함수와 paste() 함수

     

    R Programming
    %%R
    
    aggregate(ename ~ deptno, data=emp[order(emp$hiredate), ] , FUN=function(x) paste(x, collapse = "; "))

     

    Results
      deptno                                     ename
    1     10                       CLARK; KING; MILLER
    2     20          SMITH; JONES; FORD; SCOTT; ADAMS
    3     30 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES

     


    ave() 함수와 paste() 함수

     

    R Programming
    %%R
    
    within(emp[order(emp$hiredate), ], { Emp_list <- ave(ename, deptno, FUN = function(x) paste(x, collapse = "; ")) })

     

    Results
    # A tibble: 14 x 9
       empno ename  job      mgr hiredate     sal  comm deptno Emp_list             
       <dbl> <chr>  <chr>  <dbl> <date>     <dbl> <dbl>  <dbl> <chr>                
     1  7369 SMITH  CLERK   7902 1980-12-17   800    NA     20 SMITH; JONES; FORD; ~
     2  7782 CLARK  MANAG~  7839 1981-01-09  2450    NA     10 CLARK; KING; MILLER  
     3  7499 ALLEN  SALES~  7698 1981-02-20  1600   300     30 ALLEN; WARD; BLAKE; ~
     4  7521 WARD   SALES~  7698 1981-02-22  1250   500     30 ALLEN; WARD; BLAKE; ~
     5  7698 BLAKE  MANAG~  7839 1981-03-01  2850    NA     30 ALLEN; WARD; BLAKE; ~
     6  7566 JONES  MANAG~  7839 1981-04-02  2975    NA     20 SMITH; JONES; FORD; ~
     7  7844 TURNER SALES~  7698 1981-09-08  1500     0     30 ALLEN; WARD; BLAKE; ~
     8  7654 MARTIN SALES~  7698 1981-09-28  1250  1400     30 ALLEN; WARD; BLAKE; ~
     9  7839 KING   PRESI~    NA 1981-11-17  5000    NA     10 CLARK; KING; MILLER  
    10  7900 JAMES  CLERK   7698 1981-12-03   950    NA     30 ALLEN; WARD; BLAKE; ~
    11  7902 FORD   ANALY~  7566 1981-12-03  3000    NA     20 SMITH; JONES; FORD; ~
    12  7934 MILLER CLERK   7782 1982-01-23  1300    NA     10 CLARK; KING; MILLER  
    13  7788 SCOTT  ANALY~  7566 1982-12-09  3000    NA     20 SMITH; JONES; FORD; ~
    14  7876 ADAMS  CLERK   7788 1983-01-12  1100    NA     20 SMITH; JONES; FORD; ~

     

     

     


    4. R Dplyr Package

     

    paste() 함수

     

    R Programming
    %%R
    
    emp %>%
      dplyr::arrange(hiredate) %>%
      dplyr::group_by(deptno) %>%
      dplyr::mutate( Emp_list = paste(ename, collapse = "; "))

     

    Results
    # A tibble: 14 x 9
    # Groups:   deptno [3]
       empno ename  job      mgr hiredate     sal  comm deptno Emp_list             
       <dbl> <chr>  <chr>  <dbl> <date>     <dbl> <dbl>  <dbl> <chr>                
     1  7369 SMITH  CLERK   7902 1980-12-17   800    NA     20 SMITH; JONES; FORD; ~
     2  7782 CLARK  MANAG~  7839 1981-01-09  2450    NA     10 CLARK; KING; MILLER  
     3  7499 ALLEN  SALES~  7698 1981-02-20  1600   300     30 ALLEN; WARD; BLAKE; ~
     4  7521 WARD   SALES~  7698 1981-02-22  1250   500     30 ALLEN; WARD; BLAKE; ~
     5  7698 BLAKE  MANAG~  7839 1981-03-01  2850    NA     30 ALLEN; WARD; BLAKE; ~
     6  7566 JONES  MANAG~  7839 1981-04-02  2975    NA     20 SMITH; JONES; FORD; ~
     7  7844 TURNER SALES~  7698 1981-09-08  1500     0     30 ALLEN; WARD; BLAKE; ~
     8  7654 MARTIN SALES~  7698 1981-09-28  1250  1400     30 ALLEN; WARD; BLAKE; ~
     9  7839 KING   PRESI~    NA 1981-11-17  5000    NA     10 CLARK; KING; MILLER  
    10  7900 JAMES  CLERK   7698 1981-12-03   950    NA     30 ALLEN; WARD; BLAKE; ~
    11  7902 FORD   ANALY~  7566 1981-12-03  3000    NA     20 SMITH; JONES; FORD; ~
    12  7934 MILLER CLERK   7782 1982-01-23  1300    NA     10 CLARK; KING; MILLER  
    13  7788 SCOTT  ANALY~  7566 1982-12-09  3000    NA     20 SMITH; JONES; FORD; ~
    14  7876 ADAMS  CLERK   7788 1983-01-12  1100    NA     20 SMITH; JONES; FORD; ~

     

     

     


    5. R sqldf Package

     

    group_concat() 함수

     

    R Programming
    %%R
    
    sqldf(" SELECT deptno ,
                   GROUP_CONCAT(ename, '; ') grp_concat
            FROM   ( SELECT *
                     FROM   emp
                     order by deptno,hiredate )
            GROUP BY deptno
            order by hiredate")

     

    Results
      deptno                                grp_concat
    1     20          SMITH; JONES; FORD; SCOTT; ADAMS
    2     10                       CLARK; KING; MILLER
    3     30 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES

     

     

     


    6. Python pandasql Package

     

    group_concat() 함수

     

    Python Programming
    ps.sqldf(" SELECT deptno ,                                 \
                      GROUP_CONCAT(ename, '; ') grp_concat     \
               FROM   ( SELECT *                       \
                        FROM   emp                     \
                        order by deptno,hiredate )     \
               GROUP BY deptno                         \
               order by hiredate  ")

     

    Results
    	deptno	grp_concat
    0	20	SMITH; JONES; FORD; SCOTT; ADAMS
    1	10	CLARK; KING; MILLER
    2	30	ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES

     

     

     


    7. R data.table Package

     

    paste() 함수

     

    R Programming
    %%R
    
    DT          <- data.table(emp)
    dept_DT     <- data.table(dept)
    
    DT[order(hiredate), Emp_list := paste(ename, collapse = "; "), by=deptno][, c('sal','comm','mgr','hiredate') := NULL]

     

    Results
        empno  ename comm deptno                                  Emp_list
     1:  7369  SMITH   NA     20          SMITH; JONES; FORD; SCOTT; ADAMS
     2:  7499  ALLEN  300     30 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES
     3:  7521   WARD  500     30 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES
     4:  7566  JONES   NA     20          SMITH; JONES; FORD; SCOTT; ADAMS
     5:  7654 MARTIN 1400     30 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES
     6:  7698  BLAKE   NA     30 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES
     7:  7782  CLARK   NA     10                       CLARK; KING; MILLER
     8:  7788  SCOTT   NA     20          SMITH; JONES; FORD; SCOTT; ADAMS
     9:  7839   KING   NA     10                       CLARK; KING; MILLER
    10:  7844 TURNER    0     30 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES
    11:  7876  ADAMS   NA     20          SMITH; JONES; FORD; SCOTT; ADAMS
    12:  7900  JAMES   NA     30 ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES
    13:  7902   FORD   NA     20          SMITH; JONES; FORD; SCOTT; ADAMS
    14:  7934 MILLER   NA     10                       CLARK; KING; MILLER

     


    8. Python Duckdb의 SQL

     

    Python Programming
    %%sql
      SELECT deptno ,
             group_concat(ename, '; ') grp_concat_1,
             string_agg(ename, '; ') grp_concat_2
      FROM   ( SELECT *
               FROM   emp
               order by deptno,hiredate )
      GROUP BY deptno

     

    Python Programming
    duckdb.sql(" SELECT deptno ,                                      \
                        group_concat(ename, '; ') grp_concat_1,       \
                        string_agg(ename, '; ')   grp_concat_2        \
                 FROM   ( SELECT *                                    \
                          FROM   emp                                  \
                          order by deptno,hiredate )                  \
                 GROUP BY deptno ").df()

     

    Results
       deptno                               grp_concat_1  \
    0      10                        CLARK; KING; MILLER   
    1      20           SMITH; JONES; FORD; SCOTT; ADAMS   
    2      30  ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES   
    
                                    grp_concat_2  
    0                        CLARK; KING; MILLER  
    1           SMITH; JONES; FORD; SCOTT; ADAMS  
    2  ALLEN; WARD; BLAKE; TURNER; MARTIN; JAMES

     


    https://unsplash.com/photos/ipuiM-36tAg

      --------------------------------------------  

     

     

    [Oracle, Pandas, R Prog, Dplyr, Sqldf, Pandasql, Data.Table] 오라클 함수와 R & Python 비교 사전 목록 링크

     

    오라클 SQL 함수(Oracle SQL Function) 목록 리스트 링크

     

    [SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE] SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트 링크 링크
    반응형

    댓글