포스팅 목차
* 파이썬 & R 패키지 호출 및 예제 데이터 생성 링크
[ MONTHS_BETWEEN Oracle Function ]
MONTHS_BETWEEN과 ADD_MONTHS 함수는 월 단위로 날짜 연산을 하는 함수이다. MONTHS_BETWEEN 함수는 날짜와 날짜 사이의 개월 수를 출력하는 함수이고, ADD_MONTHS 함수는 특정 개월 수를 더한 날짜를 출력하는 함수이다.
- 함수설명 : MEDIAN 오라클 함수 링크
1. Oracle(오라클)
months_between() 함수
Oracle Programming |
select floor(months_between(sysdate,'15-aug-1947')) "age in months"
from dual
Results |
age in months
-----------------
916
2. Python Pandas(파이썬)
현재 날짜를 반환한다.
Python Programming |
from datetime import datetime
today = datetime.today().strftime('%Y-%m-%d')
print(today)
import time
t = time.time()
today = time.strftime("%Y-%m-%d", time.gmtime(t))
print( today )
Results |
2021-02-03
2021-02-03
- Timedelta : days 차이 계산 가능(month/year는 지원 안 함)
- 두 날짜 사이의 기간 차이를 반환한다.
Python Programming |
date_1 = pd.to_datetime('2015-01-02 12:13:14')
date_2 = pd.to_datetime('2012-03-02 12:13:14')
# 두 날짜 사이의 기간 차이를 반환한다.
(date_1 - date_2).days/31
Results |
33.41935483870968
사용자 함수 정의
현재 날짜와 2020년 1월 20일 과의 개월 수 차이를 계산한다. (2개의 날짜 사이의 월을 기준으로 개월 수 차이를 계산)
Python Programming |
from datetime import datetime
def diff_month(d1, d2):
return (d1.year - d2.year) * 12 + d1.month - d2.month
diff_month(datetime.today(), datetime(2020, 1, 20))
Results |
13
relativedelta() 함수
현재 날짜와 2020년 1월 20일 과의 개월 수 차이를 계산한다. (2개의 날짜 사이의 기간 차이를 계산 후 개월 수 차이를 계산)
Python Programming |
from dateutil.relativedelta import relativedelta
r = relativedelta(datetime.today(), datetime(2020, 1, 20))
r.months + (12*r.years)
Results |
12
2) 테이블 예제
두 날짜 사이의 년도 차이와 날짜 사이의 월을 별도로 계산
Python Programming |
((datetime(2020, 1, 20) - pd.to_datetime(emp['hiredate']))/np.timedelta64(1, 'M')).astype(int).head()
Results |
0 469
1 466
2 466
3 465
4 459
Name: hiredate, dtype: int32
- pd.to_datetime() 함수
: [stackoverflow 참고] Pandas - Number of Months Between Two Dates [링크]
Python Programming |
pd.to_datetime(["2020-01-20 00:00:00"]).to_period('M').astype(int) - pd.to_datetime(emp['hiredate']).dt.to_period('M').astype(int).head()
0 469
1 467
2 467
3 465
4 460
dtype: int64
- relativedelta() 함수
Python Programming |
emp.apply(lambda x: relativedelta(datetime(2020, 1, 20), pd.to_datetime(x['hiredate'])).months +
( 12* relativedelta(datetime(2020, 1, 20), pd.to_datetime(x['hiredate']))).years,
axis=1).head()
Results |
0 470
1 478
2 476
3 474
4 462
dtype: int64
- 근사적으로 계산
Python Programming |
(datetime(2020, 1, 20) - pd.to_datetime(emp['hiredate'])).map(lambda x: round(x.days/30.416)).head()
Results |
0 469
1 467
2 467
3 466
4 460
Name: hiredate, dtype: int64
3. R Programming (R Package)
year() 와 month() 함수
R Programming |
%%R
number_of_months = (year(Sys.Date()) - year(as.Date('2020-01-20'))) * 12 + month(Sys.Date()) - month(as.Date('2020-01-20'))
number_of_months
Results |
[1] 13
사용자 함수 정의
R Programming |
%%R
months_between <- function(end_date, start_date) {
end_date <- as.POSIXlt(end_date)
start_date <- as.POSIXlt(start_date)
12 * (end_date$year - start_date$year) + (end_date$mon - start_date$mon)
}
months_between(Sys.time(), as.Date("2020-01-20"))
Results |
[1] 13
R Programming |
%%R
length(seq(from=as.Date('2020-01-20'), to=lubridate::today(), by='month'))
Results |
[1] 13
lubridate::interval() 함수
R Programming |
%%R
library(lubridate)
# interval(mdy(01202020), today()) %/% months(1)
lubridate::interval(as.Date('2020-01-20'), today()) %/% months(1)
Results |
[1] 12
- as.yearmon()
R Programming |
%%R
library(zoo)
12 * (as.yearmon(today()) - as.yearmon(as.Date('2020-01-20')))
Results |
[1] 13
2) 테이블 예제
- lubridate::interval() 함수
R Programming |
%%R
withmooc <- emp
withmooc['month_gap'] = lubridate::interval(as.Date(withmooc$hiredate) , as.Date('2020-01-20') ) %/% months(1)
withmooc[1:10, ]
Results |
# A tibble: 10 x 9
empno ename job mgr hiredate sal comm deptno month_gap
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl> <dbl>
1 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 469
2 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30 467
3 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 466
4 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 465
5 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30 459
6 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30 466
7 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 468
8 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 445
9 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 458
10 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30 460
- as.yearmon()
R Programming |
%%R
library(zoo)
withmooc <- emp
withmooc['month_gap'] = 12 * (as.yearmon(as.Date('2020-01-20')) - as.yearmon(as.Date(withmooc$hiredate)))
withmooc[1:10, ]
Results |
# A tibble: 10 x 9
empno ename job mgr hiredate sal comm deptno month_gap
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl> <dbl>
1 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 469.
2 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30 467.
3 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 467.
4 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 465
5 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30 460.
6 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30 466.
7 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 468
8 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 445.
9 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 458.
10 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30 460.
- difftime() 함수
R Programming |
%%R
withmooc <- emp
withmooc['month_gap'] = with(withmooc, difftime(as.Date('2020-01-20'), as.Date(hiredate)) / 30.416)
withmooc[1:10, ]
Results |
# A tibble: 10 x 9
empno ename job mgr hiredate sal comm deptno month_gap
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl> <drtn>
1 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 469.4240 days
2 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30 467.2870 days
3 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 467.2212 days
4 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 465.9390 days
5 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30 460.0539 days
6 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30 466.9911 days
7 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 468.6678 days
8 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 445.6865 days
9 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 458.4100 days
10 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30 460.7115 days
4. R Dplyr Package
lubridate::interval()
R Programming |
%%R
emp %>%
dplyr::mutate(month_gap = lubridate::interval(as.Date(hiredate) , as.Date('2020-01-20') ) %/% months(1) ) %>%
head()
Results |
# A tibble: 6 x 9
empno ename job mgr hiredate sal comm deptno month_gap
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl> <dbl>
1 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 469
2 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30 467
3 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 466
4 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 465
5 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30 459
6 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30 466
5. R sqldf Package
julianday() 함수
R Programming |
%%R
sqldf(" select round( (julianday('now')-julianday('2020-01-20')) /30.416) month_gap ")
Results |
month_gap
1 13
strftime() 함수
R Programming |
%%R
sqldf(" select ( round(strftime('%s','2020-01-20') - strftime('%s', hiredate * 3600 * 24, 'unixepoch')) /60/60/24)/30.416 month_gap from emp ")
Results |
month_gap
1 469.4240
2 467.2870
3 467.2212
4 465.9390
5 460.0539
6 466.9911
7 468.6678
8 445.6865
9 458.4100
10 460.7115
11 444.5686
12 457.8840
13 457.8840
14 456.2073
6. Python pandasql Package
julianday() 함수
Python Programming |
ps.sqldf(" select round( (julianday('now')-julianday('2020-01-20')) /30.416) month_gap ")
Results |
month_gap
0 13.0
strftime() 함수
Python Programming |
ps.sqldf(" select (ROUND( strftime('%s','2020-01-20') - strftime('%s',(substr(hiredate,1,4)||'-'||substr(hiredate,6,2)||'-'||substr(hiredate,9,2))) ) /60/60/24)/30.416 month_gap \
from emp").head()
Results |
month_gap
0 469.423987
1 467.286954
2 467.221199
3 465.938979
4 460.053919
7. R data.table Package
lubridate::interval() 함수
R Programming |
%%R
DT <- data.table(emp)
dept_DT <- data.table(dept)
DT[, month_gap := lubridate::interval(as.Date(hiredate) , as.Date('2020-01-20') ) %/% months(1) ][1:10, ]
Results |
empno ename job mgr hiredate sal comm deptno month_gap
1: 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 469
2: 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30 467
3: 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 466
4: 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 465
5: 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30 459
6: 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30 466
7: 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 468
8: 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 445
9: 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 458
10: 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30 460
8. Python Duckdb의 SQL
Python Programming |
%%sql
select date_diff('month', Date '1947-08-15', CURRENT_DATE) age_in_months
Python Programming |
duckdb.sql(" select date_diff('month', Date '1947-08-15', CURRENT_DATE) age_in_months ").df()
Results |
age_in_months
0 912
Python Programming |
%%sql
select empno, ename, hiredate,
'2020-01-20' as base_date,
date_diff('month', cast(hiredate as date), Date '2020-01-20') "age in months"
from emp
LIMIT 6
Python Programming |
select empno, ename, hiredate, \
'2020-01-20' as base_date, \
date_diff('month', cast(hiredate as date), Date '2020-01-20') AS age_in_months \
from emp \
LIMIT 6 ").df()
Results |
empno ename hiredate base_date age_in_months
0 7369 SMITH 1980/12/17 2020-01-20 469
1 7499 ALLEN 1981/02/20 2020-01-20 467
2 7521 WARD 1981/02/22 2020-01-20 467
3 7566 JONES 1981/04/02 2020-01-20 465
4 7654 MARTIN 1981/09/28 2020-01-20 460
5 7698 BLAKE 1981/03/01 2020-01-20 466
--------------------------------------------
[Oracle, Pandas, R Prog, Dplyr, Sqldf, Pandasql, Data.Table] 오라클 함수와 R & Python 비교 사전 목록 링크 |
오라클 SQL 함수(Oracle SQL Function) 목록 리스트 링크 |
[SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE] SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트 링크 링크 |
반응형
댓글