본문 바로가기
통계프로그램 비교 시리즈/오라클함수 비교(R & Python)

MONTHS_BETWEEN 오라클 함수 [Oracle, Pandas, R Prog, Dplyr, Sqldf, Pandasql, Data.Table, DuckDB]

by 기서무나구물 2021. 12. 4.

포스팅 목차

    * 파이썬 & R 패키지 호출 및 예제 데이터 생성 링크


    [ MONTHS_BETWEEN Oracle Function ]

     


    MONTHS_BETWEEN과 ADD_MONTHS 함수는 월 단위로 날짜 연산을 하는 함수이다. MONTHS_BETWEEN 함수는 날짜와 날짜 사이의 개월 수를 출력하는 함수이고, ADD_MONTHS 함수는 특정 개월 수를 더한 날짜를 출력하는 함수이다.

     

     


    1. Oracle(오라클)

     

    months_between() 함수

    Oracle Programming
    select floor(months_between(sysdate,'15-aug-1947')) "age in months"
    from   dual

     

    Results
    age in months
    -----------------
    916

     


    2. Python Pandas(파이썬)

     

    현재 날짜를 반환한다.

     

    Python Programming
    from datetime import datetime
    today = datetime.today().strftime('%Y-%m-%d')
    print(today)
    
    import time
    t = time.time()
    today = time.strftime("%Y-%m-%d", time.gmtime(t))
    print( today )

     

    Results
    2021-02-03
    2021-02-03

     


    • Timedelta : days 차이 계산 가능(month/year는 지원 안 함)
    • 두 날짜 사이의 기간 차이를 반환한다.

     

    Python Programming
    date_1 = pd.to_datetime('2015-01-02 12:13:14')
    date_2 = pd.to_datetime('2012-03-02 12:13:14')
    
    # 두 날짜 사이의 기간 차이를 반환한다.
    (date_1 - date_2).days/31

     

    Results
    33.41935483870968

     


    사용자 함수 정의

    현재 날짜와 2020년 1월 20일 과의 개월 수 차이를 계산한다. (2개의 날짜 사이의 월을 기준으로 개월 수 차이를 계산)

     

    Python Programming
    from datetime import datetime
    
    def diff_month(d1, d2):
        return (d1.year - d2.year) * 12 + d1.month - d2.month
    
    diff_month(datetime.today(), datetime(2020, 1, 20))

     

    Results
    13

     

     


    relativedelta() 함수

    현재 날짜와 2020년 1월 20일 과의 개월 수 차이를 계산한다. (2개의 날짜 사이의 기간 차이를 계산 후 개월 수 차이를 계산)

     

    Python Programming
    from dateutil.relativedelta import relativedelta
    
    r = relativedelta(datetime.today(), datetime(2020, 1, 20))
    r.months + (12*r.years)

     

    Results
    12

     


    2) 테이블 예제

    두 날짜 사이의 년도 차이와 날짜 사이의 월을 별도로 계산

     

    Python Programming
    ((datetime(2020, 1, 20) - pd.to_datetime(emp['hiredate']))/np.timedelta64(1, 'M')).astype(int).head()

     

    Results
    0    469
    1    466
    2    466
    3    465
    4    459
    Name: hiredate, dtype: int32

     


    • pd.to_datetime() 함수

    : [stackoverflow 참고] Pandas - Number of Months Between Two Dates [링크]

     

    Pandas - Number of Months Between Two Dates

    I think this should be simple but what I've seen are techniques that involve iterating over a dataframe date fields to determine the diff between two dates. And I'm having trouble with it. I'm fami...

    stackoverflow.com

     

    Python Programming
    pd.to_datetime(["2020-01-20 00:00:00"]).to_period('M').astype(int) - pd.to_datetime(emp['hiredate']).dt.to_period('M').astype(int).head()

     

    0    469
    1    467
    2    467
    3    465
    4    460
    dtype: int64

     


    • relativedelta() 함수
    Python Programming
    emp.apply(lambda x: relativedelta(datetime(2020, 1, 20), pd.to_datetime(x['hiredate'])).months + 
                         ( 12* relativedelta(datetime(2020, 1, 20), pd.to_datetime(x['hiredate']))).years,
              axis=1).head()

     

    Results
    0    470
    1    478
    2    476
    3    474
    4    462
    dtype: int64

     


    • 근사적으로 계산
    Python Programming
    (datetime(2020, 1, 20) -  pd.to_datetime(emp['hiredate'])).map(lambda x: round(x.days/30.416)).head()

     

    Results
    0    469
    1    467
    2    467
    3    466
    4    460
    Name: hiredate, dtype: int64

     

     


    3. R Programming (R Package)

     

    year() 와 month() 함수

    R Programming
    %%R
    
    number_of_months = (year(Sys.Date()) - year(as.Date('2020-01-20'))) * 12 + month(Sys.Date()) - month(as.Date('2020-01-20'))
    number_of_months

     

    Results
    [1] 13

     


    사용자 함수 정의

    R Programming
    %%R
    
    months_between <- function(end_date, start_date) {
        end_date <- as.POSIXlt(end_date)
        start_date <- as.POSIXlt(start_date)
        12 * (end_date$year - start_date$year) + (end_date$mon - start_date$mon)
    }
    
    months_between(Sys.time(), as.Date("2020-01-20"))

     

    Results
    [1] 13

     


     

    R Programming
    %%R
    
    length(seq(from=as.Date('2020-01-20'), to=lubridate::today(), by='month'))

     

    Results
    [1] 13

     


    lubridate::interval() 함수

     

    R Programming
    %%R
    library(lubridate)
    
    # interval(mdy(01202020), today()) %/% months(1)
    lubridate::interval(as.Date('2020-01-20'), today()) %/% months(1)

     

    Results
    [1] 12

     


    • as.yearmon()
    R Programming
    %%R
    library(zoo)
    12 * (as.yearmon(today()) - as.yearmon(as.Date('2020-01-20')))

     

    Results
    [1] 13

     


    2) 테이블 예제

    • lubridate::interval() 함수
    R Programming
    %%R
    
    withmooc <- emp
    
    withmooc['month_gap'] = lubridate::interval(as.Date(withmooc$hiredate) , as.Date('2020-01-20') ) %/% months(1)
    
    withmooc[1:10, ]

     

    Results
    # A tibble: 10 x 9
       empno ename  job         mgr hiredate     sal  comm deptno month_gap
       <dbl> <chr>  <chr>     <dbl> <date>     <dbl> <dbl>  <dbl>     <dbl>
     1  7369 SMITH  CLERK      7902 1980-12-17   800    NA     20       469
     2  7499 ALLEN  SALESMAN   7698 1981-02-20  1600   300     30       467
     3  7521 WARD   SALESMAN   7698 1981-02-22  1250   500     30       466
     4  7566 JONES  MANAGER    7839 1981-04-02  2975    NA     20       465
     5  7654 MARTIN SALESMAN   7698 1981-09-28  1250  1400     30       459
     6  7698 BLAKE  MANAGER    7839 1981-03-01  2850    NA     30       466
     7  7782 CLARK  MANAGER    7839 1981-01-09  2450    NA     10       468
     8  7788 SCOTT  ANALYST    7566 1982-12-09  3000    NA     20       445
     9  7839 KING   PRESIDENT    NA 1981-11-17  5000    NA     10       458
    10  7844 TURNER SALESMAN   7698 1981-09-08  1500     0     30       460

     


    • as.yearmon()
    R Programming
    %%R
    
    library(zoo)
    withmooc <- emp
    withmooc['month_gap'] =  12 * (as.yearmon(as.Date('2020-01-20')) - as.yearmon(as.Date(withmooc$hiredate)))
    withmooc[1:10, ]

     

    Results
    # A tibble: 10 x 9
       empno ename  job         mgr hiredate     sal  comm deptno month_gap
       <dbl> <chr>  <chr>     <dbl> <date>     <dbl> <dbl>  <dbl>     <dbl>
     1  7369 SMITH  CLERK      7902 1980-12-17   800    NA     20      469.
     2  7499 ALLEN  SALESMAN   7698 1981-02-20  1600   300     30      467.
     3  7521 WARD   SALESMAN   7698 1981-02-22  1250   500     30      467.
     4  7566 JONES  MANAGER    7839 1981-04-02  2975    NA     20      465 
     5  7654 MARTIN SALESMAN   7698 1981-09-28  1250  1400     30      460.
     6  7698 BLAKE  MANAGER    7839 1981-03-01  2850    NA     30      466.
     7  7782 CLARK  MANAGER    7839 1981-01-09  2450    NA     10      468 
     8  7788 SCOTT  ANALYST    7566 1982-12-09  3000    NA     20      445.
     9  7839 KING   PRESIDENT    NA 1981-11-17  5000    NA     10      458.
    10  7844 TURNER SALESMAN   7698 1981-09-08  1500     0     30      460.

     


    • difftime() 함수
    R Programming
    %%R
    withmooc <- emp
    
    withmooc['month_gap'] = with(withmooc, difftime(as.Date('2020-01-20'), as.Date(hiredate)) / 30.416)
    withmooc[1:10, ]

     

    Results
    # A tibble: 10 x 9
       empno ename  job         mgr hiredate     sal  comm deptno month_gap    
       <dbl> <chr>  <chr>     <dbl> <date>     <dbl> <dbl>  <dbl> <drtn>       
     1  7369 SMITH  CLERK      7902 1980-12-17   800    NA     20 469.4240 days
     2  7499 ALLEN  SALESMAN   7698 1981-02-20  1600   300     30 467.2870 days
     3  7521 WARD   SALESMAN   7698 1981-02-22  1250   500     30 467.2212 days
     4  7566 JONES  MANAGER    7839 1981-04-02  2975    NA     20 465.9390 days
     5  7654 MARTIN SALESMAN   7698 1981-09-28  1250  1400     30 460.0539 days
     6  7698 BLAKE  MANAGER    7839 1981-03-01  2850    NA     30 466.9911 days
     7  7782 CLARK  MANAGER    7839 1981-01-09  2450    NA     10 468.6678 days
     8  7788 SCOTT  ANALYST    7566 1982-12-09  3000    NA     20 445.6865 days
     9  7839 KING   PRESIDENT    NA 1981-11-17  5000    NA     10 458.4100 days
    10  7844 TURNER SALESMAN   7698 1981-09-08  1500     0     30 460.7115 days

     

     


    4. R Dplyr Package

     

    lubridate::interval()

     

    R Programming
    %%R
    
    emp %>%
      dplyr::mutate(month_gap = lubridate::interval(as.Date(hiredate) , as.Date('2020-01-20') ) %/% months(1) ) %>%
      head()

     

    Results
    # A tibble: 6 x 9
      empno ename  job        mgr hiredate     sal  comm deptno month_gap
      <dbl> <chr>  <chr>    <dbl> <date>     <dbl> <dbl>  <dbl>     <dbl>
    1  7369 SMITH  CLERK     7902 1980-12-17   800    NA     20       469
    2  7499 ALLEN  SALESMAN  7698 1981-02-20  1600   300     30       467
    3  7521 WARD   SALESMAN  7698 1981-02-22  1250   500     30       466
    4  7566 JONES  MANAGER   7839 1981-04-02  2975    NA     20       465
    5  7654 MARTIN SALESMAN  7698 1981-09-28  1250  1400     30       459
    6  7698 BLAKE  MANAGER   7839 1981-03-01  2850    NA     30       466

     

     


    5. R sqldf Package

     

    julianday() 함수

     

    R Programming
    %%R
    
    sqldf(" select round( (julianday('now')-julianday('2020-01-20')) /30.416) month_gap ")

     

    Results
      month_gap
    1        13

     


    strftime() 함수

     

    R Programming
    %%R
    
    sqldf(" select ( round(strftime('%s','2020-01-20') - strftime('%s', hiredate * 3600 * 24, 'unixepoch')) /60/60/24)/30.416 month_gap from emp ")

     

    Results
       month_gap
    1   469.4240
    2   467.2870
    3   467.2212
    4   465.9390
    5   460.0539
    6   466.9911
    7   468.6678
    8   445.6865
    9   458.4100
    10  460.7115
    11  444.5686
    12  457.8840
    13  457.8840
    14  456.2073

     

     


    6. Python pandasql Package

     

    julianday() 함수

     

    Python Programming
    ps.sqldf(" select round( (julianday('now')-julianday('2020-01-20')) /30.416) month_gap ")

     

    Results
    	month_gap
    0	13.0

     


    strftime() 함수

     

    Python Programming
     
    ps.sqldf(" select (ROUND( strftime('%s','2020-01-20') - strftime('%s',(substr(hiredate,1,4)||'-'||substr(hiredate,6,2)||'-'||substr(hiredate,9,2))) ) /60/60/24)/30.416 month_gap   \
               from emp").head()

     

    Results
    	month_gap
    0	469.423987
    1	467.286954
    2	467.221199
    3	465.938979
    4	460.053919

     


    7. R data.table Package

     

    lubridate::interval() 함수

     

    R Programming
    %%R
    
    DT          <- data.table(emp)
    dept_DT     <- data.table(dept)
    
    DT[, month_gap := lubridate::interval(as.Date(hiredate) , as.Date('2020-01-20') ) %/% months(1) ][1:10, ]

     

    Results
        empno  ename       job  mgr   hiredate  sal comm deptno month_gap
     1:  7369  SMITH     CLERK 7902 1980-12-17  800   NA     20       469
     2:  7499  ALLEN  SALESMAN 7698 1981-02-20 1600  300     30       467
     3:  7521   WARD  SALESMAN 7698 1981-02-22 1250  500     30       466
     4:  7566  JONES   MANAGER 7839 1981-04-02 2975   NA     20       465
     5:  7654 MARTIN  SALESMAN 7698 1981-09-28 1250 1400     30       459
     6:  7698  BLAKE   MANAGER 7839 1981-03-01 2850   NA     30       466
     7:  7782  CLARK   MANAGER 7839 1981-01-09 2450   NA     10       468
     8:  7788  SCOTT   ANALYST 7566 1982-12-09 3000   NA     20       445
     9:  7839   KING PRESIDENT   NA 1981-11-17 5000   NA     10       458
    10:  7844 TURNER  SALESMAN 7698 1981-09-08 1500    0     30       460

     


    8. Python Duckdb의 SQL

     

    Python Programming
    %%sql
      select date_diff('month', Date '1947-08-15', CURRENT_DATE) age_in_months

     

    Python Programming
    duckdb.sql(" select date_diff('month', Date '1947-08-15', CURRENT_DATE) age_in_months ").df()

     

     

    Results
       age_in_months
    0            912

     


     

    Python Programming
    %%sql
       select empno, ename, hiredate,
              '2020-01-20' as base_date,
              date_diff('month', cast(hiredate as date), Date '2020-01-20') "age in months"
       from   emp
       LIMIT  6

     

    Python Programming
    select empno, ename, hiredate,                                                           \
           '2020-01-20' as base_date,                                                        \
           date_diff('month', cast(hiredate as date), Date '2020-01-20') AS age_in_months    \
    from   emp                                                                               \
    LIMIT  6 ").df()

     

    Results
       empno   ename    hiredate   base_date  age_in_months
    0   7369   SMITH  1980/12/17  2020-01-20            469
    1   7499   ALLEN  1981/02/20  2020-01-20            467
    2   7521    WARD  1981/02/22  2020-01-20            467
    3   7566   JONES  1981/04/02  2020-01-20            465
    4   7654  MARTIN  1981/09/28  2020-01-20            460
    5   7698   BLAKE  1981/03/01  2020-01-20            466

     


     

    Seoul, South Korea (https://unsplash.com/photos/Rp09svSHHMM)

      --------------------------------------------  

     

    [Oracle, Pandas, R Prog, Dplyr, Sqldf, Pandasql, Data.Table] 오라클 함수와 R & Python 비교 사전 목록 링크

     

    오라클 SQL 함수(Oracle SQL Function) 목록 리스트 링크

     

    [SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE] SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트 링크 링크
    반응형

    댓글