포스팅 목차
12. Display the names of employees who are working in the company for the past 5 years.
* 최근 5년 이상 근무한 직원의 이름을 표시하시오
- 파이썬 & R 패키지 호출 및 예제 데이터 생성 링크
 - [날짜 데이터] 근속 기간 계산 후 데이터 필터 - 현재 날짜, 날짜 기간 차이 계산
 - 조회를 위하여 2020년 현재 39년으로 예제 변경
 - 날짜처리 예제 :12 / 13 / 123번
 
  | 
1. 오라클(Oracle)
| Oracle Programming | 
select ename from emp 
where  sysdate-hiredate > 39*365;
2. 파이썬(Pandas)
- dateutil 패키지를 사용하여 처리
 
| Python Programming | 
from datetime import datetime   # 날짜모듈 
# 날짜 처리 예시
date1 = datetime.strptime(str('2013-08-15 12:00:00'), '%Y-%m-%d %H:%M:%S')
date2 = datetime.strptime(str('2012-02-15'), '%Y-%m-%d')
from dateutil import relativedelta
# date1과 date2 날짜 사이의 기간을 계산한다.
r = relativedelta.relativedelta(date1, date2)
print(r)
print(r.days)
| Results | 
relativedelta(years=+1, months=+6, hours=+12)
0
to_datetime 함수를 사용하여서 날짜형 데이터로 변수 형변환
| Python Programming | 
pd.to_datetime(emp['hiredate'], format='%Y-%m-%d %H:%M:%S', errors='raise')
| Results | 
0    1980-12-17
1    1981-02-20
2    1981-02-22
3    1981-04-02
4    1981-09-28
5    1981-03-01
6    1981-01-09
7    1982-12-09
8    1981-11-17
9    1981-09-08
10   1983-01-12
11   1981-12-03
12   1981-12-03
13   1982-01-23
Name: hiredate, dtype: datetime64[ns]
| Python Programming | 
import datetime
import dateutil
# Get the current date
now = datetime.datetime.utcnow()
now = now.date()
# Get the difference between the current date and the birthday
age = dateutil.relativedelta.relativedelta(now, pd.to_datetime(emp['hiredate'][0], format='%Y %m %d'))
age = age.years
age
| Results | 
39
dateutil의 relativedelta : 현재 날짜와 입사일(hiredate) 사이의 GAP을 계산하여서 39년 이상인 데이터 추출
| Python Programming | 
from dateutil import relativedelta
emp[emp.apply(lambda row: relativedelta.relativedelta(pd.Timestamp.now(), 
                                                       pd.to_datetime(row['hiredate'], format='%Y %m %d')).years, axis=1) >= 39][["ename"]]
| Results | 
| ename | |
| 0 | SMITH | 
| 1 | ALLEN | 
| 2 | WARD | 
| 3 | JONES | 
| 5 | BLAKE | 
| 6 | CLARK | 
| 9 | TURNER | 
두 날짜 사이의 GAP을 계산 후 39년 이상 근무한 직원을 선택한다.
| Python Programming | 
emp[ (pd.Timestamp.now() - pd.to_datetime(emp['hiredate'])).dt.days > 39*365 ][["ename"]]
| Results | 
| ename | |
| 0 | SMITH | 
| 1 | ALLEN | 
| 2 | WARD | 
| 3 | JONES | 
| 5 | BLAKE | 
| 6 | CLARK | 
| 9 | TURNER | 
| Python Programming | 
emp.loc[((pd.Timestamp.now() - pd.to_datetime(emp.hiredate)).dt.days > 39 *365 ) , ["ename"]]
| Results | 
| ename | |
| 0 | SMITH | 
| 1 | ALLEN | 
| 2 | WARD | 
| 3 | JONES | 
| 5 | BLAKE | 
| 6 | CLARK | 
| 9 | TURNER | 
- [참고] Pandas의 query 구문 사용 예제
 
| Python Programming | 
start_date, end_date = '1980/01/02', '1981/06/04'
emp.query('hiredate >= @start_date and hiredate <= @end_date')
| Results | 
| empno | ename | job | mgr | hiredate | sal | comm | deptno | hiredate_D | |
| 0 | 7369 | SMITH | CLERK | 7902.0 | 1980/12/17 | 800 | NaN | 20 | 1980-12-17 | 
| 1 | 7499 | ALLEN | SALESMAN | 7698.0 | 1981/02/20 | 1600 | 300.0 | 30 | 1981-02-20 | 
| 2 | 7521 | WARD | SALESMAN | 7698.0 | 1981/02/22 | 1250 | 500.0 | 30 | 1981-02-22 | 
| 3 | 7566 | JONES | MANAGER | 7839.0 | 1981/04/02 | 2975 | NaN | 20 | 1981-04-02 | 
| 5 | 7698 | BLAKE | MANAGER | 7839.0 | 1981/03/01 | 2850 | NaN | 30 | 1981-03-01 | 
| 6 | 7782 | CLARK | MANAGER | 7839.0 | 1981/01/09 | 2450 | NaN | 10 | 1981-01-09 | 
- pd.Timestamp를 사용하여서 문자 데이터를 날짜로 변경
 
| Python Programming | 
ts = pd.Timestamp
emp['hiredate_D'] = pd.to_datetime(emp['hiredate'])
emp.query('hiredate_D >= @ts("19820201T071320")')
| Results | 
| empno | ename | job | mgr | hiredate | sal | comm | deptno | hiredate_D | |
| 7 | 7788 | SCOTT | ANALYST | 7566.0 | 1982/12/09 | 3000 | NaN | 20 | 1982-12-09 | 
| 10 | 7876 | ADAMS | CLERK | 7788.0 | 1983/01/12 | 1100 | NaN | 20 | 1983-01-12 | 
- hiredate 변수를 날짜형으로 변경하여서 신규 변수 생성. 생성된 변수와 현재 날짜 비교
 - query 상에서 현재 날짜와 hiredate 사이의 비교(-) 처리를 못하여서 현재 날짜를 기준으로 - 39년 날짜 계산 후 처리
 
| Python Programming | 
from dateutil.relativedelta import relativedelta
emp['hiredate_D'] = pd.to_datetime(emp['hiredate'])
now = datetime.datetime.now() - relativedelta(years=39)
emp.query("@now > hiredate_D", engine='python')[["ename"]]
| Results | 
| ename | |
| 0 | SMITH | 
| 1 | ALLEN | 
| 2 | WARD | 
| 3 | JONES | 
| 5 | BLAKE | 
| 6 | CLARK | 
| 9 | TURNER | 
- pd.to_datetime를 적용하여서 hiredate를 날짜형으로 변경 후 현재 날짜를 기준으로 -39년 날짜와 비교
 
| Python Programming | 
from dateutil.relativedelta import relativedelta
ts_chg = pd.to_datetime
base_dt = datetime.datetime.now() - relativedelta(years=39)
emp.query('@base_dt > @ts_chg(hiredate)')[["ename"]]
| Results | 
| ename | |
| 0 | SMITH | 
| 1 | ALLEN | 
| 2 | WARD | 
| 3 | JONES | 
| 5 | BLAKE | 
| 6 | CLARK | 
| 9 | TURNER | 
3. R Programming (R Package)
- difftime 함수를 사용하여서 현재 날짜와 입사일 사이의 기간을 계산
 
| R Programming | 
%%R
difftime(Sys.Date(), emp$hiredate) / 365
| Results | 
Time differences in days
 [1] 39.75342 39.57534 39.56986 39.46301 38.97260 39.55068 39.69041 37.77534
 [9] 38.83562 39.02740 37.68219 38.79178 38.79178 38.65205
| R Programming | 
%%R
as.numeric(Sys.Date() - as.Date(emp$hiredate)) / 365
| Results | 
 [1] 39.75342 39.57534 39.56986 39.46301 38.97260 39.55068 39.69041 37.77534
 [9] 38.83562 39.02740 37.68219 38.79178 38.79178 38.65205
| R Programming | 
%%R
emp[Sys.Date() - emp$hiredate > 39*365 , c("empno","ename") ]
| Results | 
# A tibble: 7 x 2
  empno ename 
  <dbl> <chr> 
1  7369 SMITH 
2  7499 ALLEN 
3  7521 WARD  
4  7566 JONES 
5  7698 BLAKE 
6  7782 CLARK 
7  7844 TURNER
4. R Dplyr Package
*현재 날짜와 입사일 사이의 기간을 계산 후 filter 구문을 사용하여서 39년 이상 근무한 직원을 선택한다.
| R Programming | 
%%R
emp %>% filter( (Sys.Date() - hiredate) > 39*365 ) %>% dplyr::select(ename)
| Results | 
# A tibble: 7 x 1
  ename 
  <chr> 
1 SMITH 
2 ALLEN 
3 WARD  
4 JONES 
5 BLAKE 
6 CLARK 
7 TURNER
5. R sqldf Package
- 현재날짜를 인수로 지정하여서 출력
 
| R Programming | 
%%R
sprintf("select * from DF where a >= %d", Sys.Date())
| Results | 
[1] "select * from DF where a >= 18580"
- 위에서 산출된 현재 날짜를 sqldf 쿼리에 직접 입력하여 처리(수작업)
 
| R Programming | 
%%R
require(sqldf)
sqldf("select ename,
              strftime('%Y-%m-%d', 'now') now_day,
              hiredate, 
              (strftime('%d-%m-%Y', 'now') - hiredate) Gap,
              (hiredate - strftime('%d-%m-%Y', 'now')) Gap_1
       from emp
       where (18580 - hiredate) > 39*365")
| Results | 
   ename    now_day   hiredate   Gap Gap_1
1  SMITH 2020-11-13 1980-12-17 -3990  3990
2  ALLEN 2020-11-13 1981-02-20 -4055  4055
3   WARD 2020-11-13 1981-02-22 -4057  4057
4  JONES 2020-11-13 1981-04-02 -4096  4096
5  BLAKE 2020-11-13 1981-03-01 -4064  4064
6  CLARK 2020-11-13 1981-01-09 -4013  4013
7 TURNER 2020-11-13 1981-09-08 -4255  4255
- [참고] gsubfn 패키지 fn$ 을 사용하여서 sqldf에 파이썬의 함수와 파라미터의 값을 전달 방식
- Note that we must use the fn$ prefix to invoke the interpolation functionality
 - gsubfn 패키지 fn$ 참고(Insert Variables) : https://github.com/ggrothendieck/sqldf
 
 
| R Programming | 
# %%R
# 파라미터 처리
# p1 <- 7
# fn$sqldf("select * from iris where Sepal_Length > $p1")
| R Programming | 
emp
| Results | 
| empno | ename | job | mgr | hiredate | sal | comm | deptno | |
| 0 | 7369 | SMITH | CLERK | 7902.0 | 1980/12/17 | 800 | NaN | 20 | 
| 1 | 7499 | ALLEN | SALESMAN | 7698.0 | 1981/02/20 | 1600 | 300.0 | 30 | 
| 2 | 7521 | WARD | SALESMAN | 7698.0 | 1981/02/22 | 1250 | 500.0 | 30 | 
| 3 | 7566 | JONES | MANAGER | 7839.0 | 1981/04/02 | 2975 | NaN | 20 | 
| 4 | 7654 | MARTIN | SALESMAN | 7698.0 | 1981/09/28 | 1250 | 1400.0 | 30 | 
| 5 | 7698 | BLAKE | MANAGER | 7839.0 | 1981/03/01 | 2850 | NaN | 30 | 
| 6 | 7782 | CLARK | MANAGER | 7839.0 | 1981/01/09 | 2450 | NaN | 10 | 
| 7 | 7788 | SCOTT | ANALYST | 7566.0 | 1982/12/09 | 3000 | NaN | 20 | 
| 8 | 7839 | KING | PRESIDENT | NaN | 1981/11/17 | 5000 | NaN | 10 | 
| 9 | 7844 | TURNER | SALESMAN | 7698.0 | 1981/09/08 | 1500 | 0.0 | 30 | 
| 10 | 7876 | ADAMS | CLERK | 7788.0 | 1983/01/12 | 1100 | NaN | 20 | 
| 11 | 7900 | JAMES | CLERK | 7698.0 | 1981/12/03 | 950 | NaN | 30 | 
| 12 | 7902 | FORD | ANALYST | 7566.0 | 1981/12/03 | 3000 | NaN | 20 | 
| 13 | 7934 | MILLER | CLERK | 7782.0 | 1982/01/23 | 1300 | NaN | 10 | 
Sys.Date()를 fn$sqldf 기능과 함께 사용- [참고] strftime('%d-%m-%Y', 'now') : 현재 날짜를 문자로 반환
 
| R Programming | 
%%R
gsubfn::fn$sqldf("select *, strftime('%d-%m-%Y', 'now') now_char from emp where (`Sys.Date()` - hiredate) > 39 * 365  ")
| Results | 
  empno  ename      job  mgr   hiredate  sal comm deptno   now_char
1  7369  SMITH    CLERK 7902 1980-12-17  800   NA     20 09-09-2020
2  7499  ALLEN SALESMAN 7698 1981-02-20 1600  300     30 09-09-2020
3  7521   WARD SALESMAN 7698 1981-02-22 1250  500     30 09-09-2020
4  7566  JONES  MANAGER 7839 1981-04-02 2975   NA     20 09-09-2020
5  7698  BLAKE  MANAGER 7839 1981-03-01 2850   NA     30 09-09-2020
6  7782  CLARK  MANAGER 7839 1981-01-09 2450   NA     10 09-09-2020
7  7844 TURNER SALESMAN 7698 1981-09-08 1500    0     30 09-09-2020
6. Python pandasql Package
[참고] format 구문
| Python Programming | 
import pandasql as ps
a = 'v2'
b = 'v10'
q = "SELECT {} FROM df1 ".format(a+','+b)
q
| Results | 
'SELECT v2,v10 FROM df1 '
- [참고] format 구문
 
| Python Programming | 
import pandasql as ps
var = 'hiredate'
query = "SELECT {} FROM emp ".format(var)
ps.sqldf(query, globals())
| Results | 
| hiredate | |
| 0 | 1980/12/17 | 
| 1 | 1981/02/20 | 
| 2 | 1981/02/22 | 
| 3 | 1981/04/02 | 
| 4 | 1981/09/28 | 
| 5 | 1981/03/01 | 
| 6 | 1981/01/09 | 
| 7 | 1982/12/09 | 
| 8 | 1981/11/17 | 
| 9 | 1981/09/08 | 
| 10 | 1983/01/12 | 
| 11 | 1981/12/03 | 
| 12 | 1981/12/03 | 
| 13 | 1982/01/23 | 
- 반복구문 처리
 
| Python Programming | 
import pandasql as ps
test = ['''CLERK''',"""SALESMAN"""]
for i in test:
    query = """SELECT sal FROM emp where job = '""" + i + """'"""
    print(ps.sqldf(query, globals()))
    print()
| Results | 
    sal
0   800
1  1100
2   950
3  1300
    sal
0  1600
1  1250
2  1250
3  1500
- [참고] cast 함수 : 형변환 함수
 
| Python Programming | 
# emp.dtypes
| Python Programming | 
import pandasql as ps
query = """SELECT cast(hiredate as integer) FROM emp """
ps.sqldf(query, globals()).head()
| Results | 
| cast(hiredate as integer) | |
| 0 | 1980 | 
| 1 | 1981 | 
| 2 | 1981 | 
| 3 | 1981 | 
| 4 | 1981 | 
- pandasql 연습
 
| Python Programming | 
import datetime
nowdate= datetime.datetime.now()
format= "%Y-%m-%d" 
import time
emp['hiredate_D'] = pd.to_datetime(emp['hiredate'])
emp['to_date']    = pd.Timestamp.now()
emp['day_gap']    = (pd.Timestamp.now() - pd.to_datetime(emp['hiredate'])).dt.days
query = """SELECT to_date          현재날짜1,
                  date('now')      현재날짜2,
                  datetime('now')  현재날짜3,
                  typeof(hiredate) 변수Type확인,
                  sqlite_version() sqlite버젼확인,                  
                  date('2020-09-10', 'start of month') 초일자,
                  CAST(STRFTIME('%Y %m %d', hiredate_D) AS date) cast함수, 
                  STRFTIME('%Y %m %d', hiredate_D) 문자반환,
                  CAST(hiredate AS date) x2,
                  (cast(strftime('%s',hiredate_D) as interger))/(24*60*60) x3
           FROM emp
           WHERE day_gap > 39*365"""
# print(query)
query = ps.sqldf(query, globals())
query.head()
| Results | 
| 현재날짜 | 현재날짜2 | 현재날짜3 | 변수Type확인 | sqlite버젼확인 | 초일자 | cast함수 | 문자반환 | x2 | x3 | |
| 0 | 2020-09-10 09:45:49.454895 | 2020-09-10 | 2020-09-10 00:45:49 | text | 3.32.3 | 2020-09-01 | 1980 | 1980 12 17 | 1980 | 4003 | 
| 1 | 2020-09-10 09:45:49.454895 | 2020-09-10 | 2020-09-10 00:45:49 | text | 3.32.3 | 2020-09-01 | 1981 | 1981 02 20 | 1981 | 4068 | 
| 2 | 2020-09-10 09:45:49.454895 | 2020-09-10 | 2020-09-10 00:45:49 | text | 3.32.3 | 2020-09-01 | 1981 | 1981 02 22 | 1981 | 4070 | 
| 3 | 2020-09-10 09:45:49.454895 | 2020-09-10 | 2020-09-10 00:45:49 | text | 3.32.3 | 2020-09-01 | 1981 | 1981 04 02 | 1981 | 4109 | 
| 4 | 2020-09-10 09:45:49.454895 | 2020-09-10 | 2020-09-10 00:45:49 | text | 3.32.3 | 2020-09-01 | 1981 | 1981 03 01 | 1981 | 4077 | 
- 데이터에 날짜형 변수를 직접 생성 후 처리
 
| Python Programming | 
import datetime
nowdate= datetime.datetime.now()
format= "%Y-%m-%d" 
import time
emp['hiredate_D'] = pd.to_datetime(emp['hiredate'])
emp['to_date']    = pd.Timestamp.now()
emp['day_gap']    = (pd.Timestamp.now() - pd.to_datetime(emp['hiredate'])).dt.days
query = """SELECT to_date,
                  hiredate_D,
                  day_gap,
                  ename
           FROM emp
           WHERE  day_gap > 39*365"""
ps.sqldf(query, globals())
| Results | 
| to_date | hiredate_D | day_gap | ename | |
| 0 | 2020-09-10 09:39:10.707798 | 1980-12-17 00:00:00.000000 | 14512 | SMITH | 
| 1 | 2020-09-10 09:39:10.707798 | 1981-02-20 00:00:00.000000 | 14447 | ALLEN | 
| 2 | 2020-09-10 09:39:10.707798 | 1981-02-22 00:00:00.000000 | 14445 | WARD | 
| 3 | 2020-09-10 09:39:10.707798 | 1981-04-02 00:00:00.000000 | 14406 | JONES | 
| 4 | 2020-09-10 09:39:10.707798 | 1981-03-01 00:00:00.000000 | 14438 | BLAKE | 
| 5 | 2020-09-10 09:39:10.707798 | 1981-01-09 00:00:00.000000 | 14489 | CLARK | 
| 6 | 2020-09-10 09:39:10.707798 | 1981-09-08 00:00:00.000000 | 14247 | TURNER | 
- 문자형 날짜 데이터를 날짜로 변경하여 처리
 
| Python Programming | 
import datetime
nowdate= datetime.datetime.now()
format= "%Y-%m-%d" 
import time
query = """SELECT to_date,
                  date('now'),                  
                  ename,
                  (strftime('%s','now') -
                  strftime('%s',(substr(hiredate,1,4)||'-'||substr(hiredate,6,2)||'-'||substr(hiredate,9,2))))/60/60/24 second_gap
           FROM emp
           WHERE (strftime('%s','now') -                  
                  strftime('%s',(substr(hiredate,1,4)||'-'||substr(hiredate,6,2)||'-'||substr(hiredate,9,2))))/60/60/24 > 39*365"""
# print(query)
ps.sqldf(query, globals())
| Results | 
| to_date | date('now') | ename | second_gap | |
| 0 | 2020-09-10 09:45:49.454895 | 2020-09-10 | SMITH | 14512 | 
| 1 | 2020-09-10 09:45:49.454895 | 2020-09-10 | ALLEN | 14447 | 
| 2 | 2020-09-10 09:45:49.454895 | 2020-09-10 | WARD | 14445 | 
| 3 | 2020-09-10 09:45:49.454895 | 2020-09-10 | JONES | 14406 | 
| 4 | 2020-09-10 09:45:49.454895 | 2020-09-10 | BLAKE | 14438 | 
| 5 | 2020-09-10 09:45:49.454895 | 2020-09-10 | CLARK | 14489 | 
| 6 | 2020-09-10 09:45:49.454895 | 2020-09-10 | TURNER | 14247 | 
7. R data.table Package
%%R
DT          <- data.table(emp)
dept_DT     <- data.table(dept)
DT[(Sys.Date() - hiredate) > 39 * 365, .(ename)]
| Results | 
    ename
1:  SMITH
2:  ALLEN
3:   WARD
4:  JONES
5: MARTIN
6:  BLAKE
7:  CLARK
8: TURNER
8. SAS Proc SQL
%%SAS sas
PROC SQL;
  CREATE TABLE STATSAS_1 AS
    SELECT ENAME,
           TODAY() - hiredate as day_gap,
           datdif(hiredate, TODAY(), 'act/act') as day_gap_1
    FROM   EMP
    WHERE  TODAY() - hiredate > 40*365;
QUIT;
PROC PRINT;RUN;
| Results | 
| OBS | ename | day_gap | day_gap_1 | 
| 1 | SMITH | 14727 | 14727 | 
| 2 | ALLEN | 14662 | 14662 | 
| 3 | WARD | 14660 | 14660 | 
| 4 | JONES | 14621 | 14621 | 
| 5 | BLAKE | 14653 | 14653 | 
| 6 | CLARK | 14704 | 14704 | 
9. SAS Data Step
%%SAS sas
DATA STATSAS_2;
 SET EMP;
     To_day    = TODAY();
     Day_Gap   = TODAY() - hiredate;
     Day_gap_1 = datdif(hiredate, TODAY(), 'act/act');
     KEEP ENAME To_day Day_Gap Day_gap_1 HIREDATE;
     WHERE datdif(hiredate, TODAY(), 'act/act') > 40*365;
RUN;
PROC PRINT;RUN;
| Results | 
| OBS | ename | hiredate | To_day | Day_Gap | Day_gap_1 | 
| 1 | SMITH | 1980-12-17 | 22383 | 14727 | 14727 | 
| 2 | ALLEN | 1981-02-20 | 22383 | 14662 | 14662 | 
| 3 | WARD | 1981-02-22 | 22383 | 14660 | 14660 | 
| 4 | JONES | 1981-04-02 | 22383 | 14621 | 14621 | 
| 5 | BLAKE | 1981-03-01 | 22383 | 14653 | 14653 | 
| 6 | CLARK | 1981-01-09 | 22383 | 14704 | 14704 | 
10. Python Dfply Package
- 별도 함수 정의(make_symbolic)
 
| Python Programming | 
@make_symbolic
def to_datetime(series, infer_datetime_format=True):
    return pd.to_datetime(series, infer_datetime_format=infer_datetime_format)
emp >> filter_by( (pd.Timestamp.now() - to_datetime(X.hiredate)).dt.days > 39*365 ) >> head()
| Results | 
| empno | ename | job | mgr | hiredate | sal | comm | deptno | |
| 0 | 7369 | SMITH | CLERK | 7902.0 | 1980/12/17 | 800 | NaN | 20 | 
| 1 | 7499 | ALLEN | SALESMAN | 7698.0 | 1981/02/20 | 1600 | 300.0 | 30 | 
| 2 | 7521 | WARD | SALESMAN | 7698.0 | 1981/02/22 | 1250 | 500.0 | 30 | 
| 3 | 7566 | JONES | MANAGER | 7839.0 | 1981/04/02 | 2975 | NaN | 20 | 
| 4 | 7654 | MARTIN | SALESMAN | 7698.0 | 1981/09/28 | 1250 | 1400.0 | 30 | 
| Python Programming | 
emp >> \
  filter_by( (pd.Timestamp.now() - make_symbolic(to_datetime)(X.hiredate)).dt.days > 39*365 ) >> \
  head()
| Results | 
| empno | ename | job | mgr | hiredate | sal | comm | deptno | |
| 0 | 7369 | SMITH | CLERK | 7902.0 | 1980/12/17 | 800 | NaN | 20 | 
| 1 | 7499 | ALLEN | SALESMAN | 7698.0 | 1981/02/20 | 1600 | 300.0 | 30 | 
| 2 | 7521 | WARD | SALESMAN | 7698.0 | 1981/02/22 | 1250 | 500.0 | 30 | 
| 3 | 7566 | JONES | MANAGER | 7839.0 | 1981/04/02 | 2975 | NaN | 20 | 
| 4 | 7654 | MARTIN | SALESMAN | 7698.0 | 1981/09/28 | 1250 | 1400.0 | 30 | 
| Python Programming | 
@make_symbolic
def to_datetime(series, infer_datetime_format=True):
    return pd.to_datetime(series, infer_datetime_format=infer_datetime_format)
emp >> mutate(pd_date=to_datetime(X.hiredate)) >> head()
| Results | 
| empno | ename | job | mgr | hiredate | sal | comm | deptno | pd_date | |
| 0 | 7369 | SMITH | CLERK | 7902.0 | 1980/12/17 | 800 | NaN | 20 | 1980-12-17 | 
| 1 | 7499 | ALLEN | SALESMAN | 7698.0 | 1981/02/20 | 1600 | 300.0 | 30 | 1981-02-20 | 
| 2 | 7521 | WARD | SALESMAN | 7698.0 | 1981/02/22 | 1250 | 500.0 | 30 | 1981-02-22 | 
| 3 | 7566 | JONES | MANAGER | 7839.0 | 1981/04/02 | 2975 | NaN | 20 | 1981-04-02 | 
| 4 | 7654 | MARTIN | SALESMAN | 7698.0 | 1981/09/28 | 1250 | 1400.0 | 30 | 1981-09-28 | 

[SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE] SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트
반응형
    
    
    
  
댓글