본문 바로가기
통계프로그램 비교 시리즈/오라클함수 비교(R & Python)

INSTR 오라클 함수 [Oracle, Pandas, R Prog, Dplyr, Sqldf, Pandasql, Data.Table, DuckDB]

by 기서무나구물 2021. 11. 25.

포스팅 목차

    * 파이썬 & R 패키지 호출 및 예제 데이터 생성 링크


    [ INSTR Oracle Function ]

     


    INSTR함수는 문자 열중에서 지정한 문자가 처음 나타나는 위치를 숫자로 반환.

     

     


    1. Oracle(오라클)

    다음 예제는 문자열 CORPORATE FLOOR을 검색하여, 3번째 문자에서 시작하여, “OR”이 두 번째 발생하는 위치를 반환한다.

     

    Oracle Programming
    SELECT INSTR('CORPORATE FLOOR','OR', 3, 2) "Instring" 
    FROM   DUAL;

     

    Results
    Instring
    ----------
    14

     

     


    2. Python Pandas(파이썬)

    • find 함수에서 occurrence(발생횟수) 옵션은 지원 안 함.
    • [geeksforgeeks 참고] Python | Ways to find nth occurrence of substring in a string [링크]

     

    Python Programming
    str.find('CORPORATE FLOOR','OR',5)

     

    Results
    13

     

     


    re.finditer()

    Python Programming
    import re
    
    instr_iter = re.finditer('OR', 'CORPORATE FLOOR')
    
    for instr_pos in instr_iter:
        print(instr_pos)

     

    Results
    <re.Match object; span=(1, 3), match='OR'>
    <re.Match object; span=(4, 6), match='OR'>
    <re.Match object; span=(13, 15), match='OR'>

     

     

     


    3. R Programming (R Package)

     

    grep()

    R Programming
    %%R
    
    grep('CORPORATE FLOOR',c('OR'),value=T)

     

    Results
    character(0)

     


    stringr::str_locate()

    R Programming
    %%R
    
    stringr::str_locate('CORPORATE FLOOR','OR')

     

    Results
         start end
    [1,]     2   3

     


    stringr::str_locate_all()

    R Programming
    %%R
    
    stringr::str_locate_all('CORPORATE FLOOR','OR')
    
    # stringr::str_locate_all('CORPORATE FLOOR','OR')[[1]][1,'start']

     

    Results
    [[1]]
         start end
    [1,]     2   3
    [2,]     5   6
    [3,]    14  15

     

     

     


    4. R Dplyr Package

     

    stringr::str_locate_all()

    R Programming
    %%R
    
    emp %>% 
      dplyr::rowwise() %>% 
      dplyr::mutate(instr_start = stringr::str_locate_all(job,'MA')[[1]][,1][1],
                    instr_end   = stringr::str_locate_all(job,'MA')[[1]][,2][1] ) %>%
      head(10)

     

    Results
    # A tibble: 10 x 10
    # Rowwise: 
       empno ename  job      mgr hiredate     sal  comm deptno instr_start instr_end
       <dbl> <chr>  <chr>  <dbl> <date>     <dbl> <dbl>  <dbl>       <int>     <int>
     1  7369 SMITH  CLERK   7902 1980-12-17   800    NA     20          NA        NA
     2  7499 ALLEN  SALES~  7698 1981-02-20  1600   300     30           6         7
     3  7521 WARD   SALES~  7698 1981-02-22  1250   500     30           6         7
     4  7566 JONES  MANAG~  7839 1981-04-02  2975    NA     20           1         2
     5  7654 MARTIN SALES~  7698 1981-09-28  1250  1400     30           6         7
     6  7698 BLAKE  MANAG~  7839 1981-03-01  2850    NA     30           1         2
     7  7782 CLARK  MANAG~  7839 1981-01-09  2450    NA     10           1         2
     8  7788 SCOTT  ANALY~  7566 1982-12-09  3000    NA     20          NA        NA
     9  7839 KING   PRESI~    NA 1981-11-17  5000    NA     10          NA        NA
    10  7844 TURNER SALES~  7698 1981-09-08  1500     0     30           6         7

     

     

     


    5. R sqldf Package

     

    instr()

    R Programming
    %%R
    
    sqldf(" SELECT instr('CORPORATE FLOOR','OR') instr_pos ")

     

    Results
      instr_pos
    1         2

     

     

     


    6. Python pandasql Package

     

    • instr()
    Python Programming
    ps.sqldf(" SELECT instr('CORPORATE FLOOR','OR') instr_pos ")

     

    Results
    	instr_pos
    0	2

     

     

     


    7. R data.table Package

     

    stringr::str_locate_all()

    R Programming
    %%R
    
    DT          <- data.table(emp)
    dept_DT     <- data.table(dept)
    DT[,instr_start := stringr::str_locate_all(job,'MA')[[1]][,1][1], by=1:nrow(DT)][1:10, ]

     

    Results
        empno  ename       job  mgr   hiredate  sal comm deptno instr_start
     1:  7369  SMITH     CLERK 7902 1980-12-17  800   NA     20          NA
     2:  7499  ALLEN  SALESMAN 7698 1981-02-20 1600  300     30           6
     3:  7521   WARD  SALESMAN 7698 1981-02-22 1250  500     30           6
     4:  7566  JONES   MANAGER 7839 1981-04-02 2975   NA     20           1
     5:  7654 MARTIN  SALESMAN 7698 1981-09-28 1250 1400     30           6
     6:  7698  BLAKE   MANAGER 7839 1981-03-01 2850   NA     30           1
     7:  7782  CLARK   MANAGER 7839 1981-01-09 2450   NA     10           1
     8:  7788  SCOTT   ANALYST 7566 1982-12-09 3000   NA     20          NA
     9:  7839   KING PRESIDENT   NA 1981-11-17 5000   NA     10          NA
    10:  7844 TURNER  SALESMAN 7698 1981-09-08 1500    0     30           6

     


    8. Python DuckDB의 SQL

     

    Python Programming
    %%sql
      SELECT position('e' IN 'hello')            as pos_1,
             position('OR' in 'CORPORATE FLOOR') as pos_2

     

    Python Programming
    duckdb.sql("   SELECT position('e' IN 'hello')            as pos_1,                \
                          position('OR' in 'CORPORATE FLOOR') as pos_2 ").df()

     

     

    Results
       pos_1  pos_2
    0      2      2

     

     


    Bundang, Seoul, South Korea (https://unsplash.com/photos/PGeslSkvPQg)

      --------------------------------------------  

     

    [Oracle, Pandas, R Prog, Dplyr, Sqldf, Pandasql, Data.Table] 오라클 함수와 R & Python 비교 사전 목록 링크

     

    오라클 SQL 함수(Oracle SQL Function) 목록 리스트 링크

     

    [SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE] SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트 링크 링크
    반응형

    댓글