본문 바로가기
통계프로그램 비교 시리즈/프로그래밍비교(Oracle,Python,R,SAS)

[날짜 포맷] 날짜 출력 포맷 지정 - 144 (오라클 SQL, R, Python, SAS)

by 기서무나구물 2023. 1. 1.

포스팅 목차

     

    144. List all employees with hire date in the format ‘June 4 1988’?

     

    * 직원들의 입사일자를 월 일 년(‘June 4 1988’) 형식으로 출력하시오.


    • Oracle : to_char()의 날짜 포맷 지정
    • 파이썬 Pandas : pd.to_datetime(), dt.strftime(), apply(), lambda 함수, info(), type(), dtypes, is_string_dtype
    • R 프로그래밍 : class(), Sys.setlocale(), format(), as.Date(), as.character
    • R Dplyr Package : Sys.setlocale(), as.character()
    • R sqldf Package : datetime(), strftime(), substr()
    • Python pandasql Package : datetime(), strftime(), substr()
    • R data.table Package : as.character()
    • SAS Proc SQL : PROC FORMAT 사용자 출력형식 생성, PUT() 함수의 날짜 포맷
    • SAS Data Step : PROC FORMAT 사용자 출력형식 생성, PUT() 함수의 날짜 포맷, INPUT 함수의 날짜 포맷(ANYDTDTE25., DATE9., ifmt25., nldate200.), STRIP 함수, YEAR 함수, compress()
    • Python Dfply Package : make_symbolic(), pd.to_datetime, strftime(), apply(), lambda 함수
    • 파이썬 Base 프로그래밍 :

     


    1. Oracle(오라클)

    to_char() 함수에 출력형식을 지정하여서 입사일자를 출력한다.

     

    Oracle Programming
    select empno,ename,hiredate,
           to_char(hiredate,'month dd yyyy') month_name
    from   emp;

     

     


    2. Python Pandas(파이썬)

    to_datetime()함수를 사용하여 문자형(object)으로 할당되어 있는 입사일자를 일자시간타입(datetime64)으로 변환 후 strftime() 함수를 사용하여 출력형식(영문월 일 년)을 지정하여 출력한다.

     

    Python Programming
    pd.to_datetime(emp['hiredate']).dt.strftime('%B %d %Y').head()

     

    Results
    0     December 17 1980
    1     February 20 1981
    2     February 22 1981
    3        April 02 1981
    4    September 28 1981
    Name: hiredate, dtype: object

     


    to_datetime()함수를 사용하여 문자형(object)으로 할당되어 있는 입사일자를 일자시간타입(datetime64)으로 변환 후 apply() 함수 내에서 strftime() 함수에 출력형식(영문월 일 년)을 지정하여 출력한다.

     

    Python Programming
    pd.to_datetime(emp['hiredate']).apply(lambda x: x.strftime('%B %d %Y')).head()

     

    Results
    0     December 17 1980
    1     February 20 1981
    2     February 22 1981
    3        April 02 1981
    4    September 28 1981
    Name: hiredate, dtype: object

     


    [참고] 변수 타입 확인하기

     

    Python Programming
    from pandas.api.types import is_string_dtype
    
    emp.info()
    
    type(emp['hiredate'])
    
    emp.dtypes
    
    is_string_dtype(emp['hiredate'])

     

    Results
    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 14 entries, 0 to 13
    Data columns (total 8 columns):
     #   Column    Non-Null Count  Dtype  
    ---  ------    --------------  -----  
     0   empno     14 non-null     int64  
     1   ename     14 non-null     object 
     2   job       14 non-null     object 
     3   mgr       13 non-null     float64
     4   hiredate  14 non-null     object 
     5   sal       14 non-null     int64  
     6   comm      4 non-null      float64
     7   deptno    14 non-null     int64  
    dtypes: float64(2), int64(3), object(3)
    memory usage: 1.0+ KB
    
    
    
    
    
    True

     

     


    3. R Programming (R Package)

    날짜 관련 로케일(locale)을 영문으로 지정하여서 format() 함수 내에서 출력형식(영문월 일 년)을 지정하여 출력한다. 월에 대하여 영문이름으로 출력을 위해 지정하였던 옵션을 초기화 하기 위하여 작업을 완료 후 “LC_TIME” 에 설정하였던 값을 제거하여 기본 시스템에서 제공하는 값이 사용된다.

     

    • Colab에서는 month name이 기본으로 출력 됨.

     

    R Programming
    %%R
    
    class(withmooc$hiredate)

     

    Results
    [1] "Date"

     


    R Programming
    %%R
    
    Sys.setlocale("LC_TIME","English")
    
    withmooc <- emp
    withmooc['Month_name'] <- format(as.Date(withmooc$hiredate),'%b %d %Y')
    
    print(withmooc[1:7, -c(3,4) ])
    
    Sys.setlocale("LC_TIME","")

     

    Results
    # A tibble: 7 x 8
      empno ename  hiredate     sal  comm deptno salary Month_name 
      <dbl> <chr>  <date>     <dbl> <dbl>  <dbl> <chr>  <chr>      
    1  7369 SMITH  1980-12-17   800    NA     20 $920   Dec 17 1980
    2  7499 ALLEN  1981-02-20  1600   300     30 $1,840 Feb 20 1981
    3  7521 WARD   1981-02-22  1250   500     30 $1,438 Feb 22 1981
    4  7566 JONES  1981-04-02  2975    NA     20 $3,421 Apr 02 1981
    5  7654 MARTIN 1981-09-28  1250  1400     30 $1,438 Sep 28 1981
    6  7698 BLAKE  1981-03-01  2850    NA     30 $3,278 Mar 01 1981
    7  7782 CLARK  1981-01-09  2450    NA     10 $2,818 Jan 09 1981
    [1] "Korean_Korea.949"

     


    날짜 관련 로케일(locale)을 영문으로 지정한 후 as.character() 함수 내에서 출력형식(‘영문월 일 년’)을 지정하여 출력한다. 월에 대하여 영문 약칭명으로 출력하기 위해 지정하였던 옵션을 초기화 하기 위하여 작업을 완료 후 “LC_TIME” 에 설정하였던 값을 제거하여 시스템에서 제공하는 기본 값으로 다시 변경된다.

     

    R Programming
    %%R
    Sys.setlocale("LC_TIME","English")
    
    withmooc <- emp
    withmooc['Month_name'] <- as.character(withmooc$hiredate, format = '%b %d %Y')
    
    print(withmooc[1:7, -c(3,4) ])
    
    Sys.setlocale("LC_TIME","")

     

    Results
    # A tibble: 7 x 8
      empno ename  hiredate     sal  comm deptno salary Month_name 
      <dbl> <chr>  <date>     <dbl> <dbl>  <dbl> <chr>  <chr>      
    1  7369 SMITH  1980-12-17   800    NA     20 $920   Dec 17 1980
    2  7499 ALLEN  1981-02-20  1600   300     30 $1,840 Feb 20 1981
    3  7521 WARD   1981-02-22  1250   500     30 $1,438 Feb 22 1981
    4  7566 JONES  1981-04-02  2975    NA     20 $3,421 Apr 02 1981
    5  7654 MARTIN 1981-09-28  1250  1400     30 $1,438 Sep 28 1981
    6  7698 BLAKE  1981-03-01  2850    NA     30 $3,278 Mar 01 1981
    7  7782 CLARK  1981-01-09  2450    NA     10 $2,818 Jan 09 1981
    [1] "Korean_Korea.949"

     

     


    4. R Dplyr Package

    날짜 관련 로케일(locale)을 영문으로 지정한 후 as.character() 함수 내에서 출력형식(‘영문월약칭명 일 년’)을 지정하여 출력한다. 월에 대하여 영문 약칭명으로 출력하기 위해 지정하였던 옵션을 초기화 하기 위하여 작업을 완료 후 “LC_TIME” 에 설정하였던 값을 제거하여 시스템에서 제공하는 기본 값으로 다시 변경된다.

     

    R Programming
    %%R
    
    Sys.setlocale("LC_TIME","English")
    
    emp %>% 
      dplyr::mutate(Month_name = as.character(hiredate, format = '%b %d %Y') ) %>%
      head() %>%
      print
    
    Sys.setlocale("LC_TIME","")

     

    Results
    # A tibble: 6 x 10
      empno ename  job        mgr hiredate     sal  comm deptno salary Month_name 
      <dbl> <chr>  <chr>    <dbl> <date>     <dbl> <dbl>  <dbl> <chr>  <chr>      
    1  7369 SMITH  CLERK     7902 1980-12-17   800    NA     20 $920   Dec 17 1980
    2  7499 ALLEN  SALESMAN  7698 1981-02-20  1600   300     30 $1,840 Feb 20 1981
    3  7521 WARD   SALESMAN  7698 1981-02-22  1250   500     30 $1,438 Feb 22 1981
    4  7566 JONES  MANAGER   7839 1981-04-02  2975    NA     20 $3,421 Apr 02 1981
    5  7654 MARTIN SALESMAN  7698 1981-09-28  1250  1400     30 $1,438 Sep 28 1981
    6  7698 BLAKE  MANAGER   7839 1981-03-01  2850    NA     30 $3,278 Mar 01 1981
    [1] "Korean_Korea.949"

     

     


    5. R sqldf Package

    substr() 함수를 사용하여서 입사월에 해당하는 영문 약칭명을 추출하고, strftime() 함수에 출력형식을 지정하여 입사일과 입사년을 선택한 후 파이프(‘||’) 연산자로 월의 영문 약칭명+입사일자+입사년월을 연결하여 출력한다.

    R Programming
    %%R
    
    sqldf( " select hiredate,
                    datetime(hiredate * 3600 * 24,'unixepoch') hiredate_dtime,
                    strftime('%d %Y', datetime(hiredate * 3600 * 24,'unixepoch')) day_year,
                    substr('JanFebMarAprMayJunJulAugSepOctNovDec', 1 + 3*strftime('%m', datetime(hiredate * 3600 * 24,'unixepoch')), -3) month_name,
                    substr('JanFebMarAprMayJunJulAugSepOctNovDec', 1 + 3*strftime('%m', datetime(hiredate * 3600 * 24,'unixepoch')), -3) ||' '||strftime('%d %Y', datetime(hiredate * 3600 * 24,'unixepoch')) month_name1
             from emp" ) %>% head()

     

    Results
        hiredate      hiredate_dtime day_year month_name month_name1
    1 1980-12-17 1980-12-17 00:00:00  17 1980        Dec Dec 17 1980
    2 1981-02-20 1981-02-20 00:00:00  20 1981        Feb Feb 20 1981
    3 1981-02-22 1981-02-22 00:00:00  22 1981        Feb Feb 22 1981
    4 1981-04-02 1981-04-02 00:00:00  02 1981        Apr Apr 02 1981
    5 1981-09-28 1981-09-28 00:00:00  28 1981        Sep Sep 28 1981
    6 1981-03-01 1981-03-01 00:00:00  01 1981        Mar Mar 01 1981

     

     


    6. Python pandasql Package

     

    Python Programming
    import copy
    
    
    ps.sqldf(" select hiredate,   \
                      date(substr(hiredate,1,4)||'-'||substr(hiredate,6,2)||'-'||substr(hiredate,9,2)) hiredate_dtime,   \
                      strftime('%d %Y',date(substr(hiredate,1,4)||'-'||substr(hiredate,6,2)||'-'||substr(hiredate,9,2))) day_year,   \
                    substr('JanFebMarAprMayJunJulAugSepOctNovDec', 1 + 3*strftime('%m', date(substr(hiredate,1,4)||'-'||substr(hiredate,6,2)||'-'||substr(hiredate,9,2)) ), -3) month_name,   \
                    substr('JanFebMarAprMayJunJulAugSepOctNovDec', 1 + 3*strftime('%m', date(substr(hiredate,1,4)||'-'||substr(hiredate,6,2)||'-'||substr(hiredate,9,2)) ), -3) ||' '||strftime('%d %Y', datetime(hiredate * 3600 * 24,'unixepoch')) month_name1   \
             from emp ").head()

     


    Results
      hiredate hiredate_dtime day_year month_name month_name
    0 1980/12/17 1980-12-17 17 1980 Dec Dec 04 1975
    1 1981/02/20 1981-02-20 20 1981 Feb Feb 05 1975
    2 1981/02/22 1981-02-22 22 1981 Feb Feb 05 1975
    3 1981/04/02 1981-04-02 02 1981 Apr Apr 05 1975
    4 1981/09/28 1981-09-28 28 1981 Sep Sep 05 1975

     


    7. R data.table Package

    날짜 관련 로케일(locale)을 영문으로 지정한 후 as.character() 함수 내에서 출력형식(‘영문월약칭명 일 년’)을 지정하여 출력한다. 월에 대하여 영문 약칭명으로 출력하기 위해 지정하였던 옵션을 초기화 하기 위하여 작업을 완료 후 “LC_TIME” 에 설정하였던 값을 제거하여 시스템에서 제공하는 기본 값으로 다시 변경된다.

     

    R Programming
    %%R
    
    Sys.setlocale("LC_TIME","English")
    
    DT          <- data.table(emp)
    dept_DT     <- data.table(dept)
    salgrade_DT <- data.table(salgrade)
    
    DT[, Month_name := as.character(hiredate, format = '%b %d %Y') ]
    
    print(DT[1:5, ])
    
    Sys.setlocale("LC_TIME","")

     

    Results
       empno  ename      job  mgr   hiredate  sal comm deptno salary  Month_name
    1:  7369  SMITH    CLERK 7902 1980-12-17  800   NA     20   $920 Dec 17 1980
    2:  7499  ALLEN SALESMAN 7698 1981-02-20 1600  300     30 $1,840 Feb 20 1981
    3:  7521   WARD SALESMAN 7698 1981-02-22 1250  500     30 $1,438 Feb 22 1981
    4:  7566  JONES  MANAGER 7839 1981-04-02 2975   NA     20 $3,421 Apr 02 1981
    5:  7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400     30 $1,438 Sep 28 1981
    [1] "Korean_Korea.949"

     


    8. SAS Proc SQL

    SAS Programming
    %%SAS sas
    
    * 사용자 출력형식(format) 생성;
    PROC FORMAT;
         PICTURE DTTEST
           LOW-HIGH = '%B %d %Y' (DATATYPE=DATE);
    RUN;
    
    * 사용자 출력형식(format) 생성;
    proc format;
          picture langtsda (default=50) other='%B, %d, %Y' (datatype=date);
    run;
    
    options locale=English_Unitedstates;
    * options locale=Korean_Korea;
    PROC SQL;
      CREATE TABLE STATSAS_1 AS
        select HIREDATE,
               PUT(HIREDATE,monname.)      AS HIREDATE_CHR,
               PUT(HIREDATE,EURDFMY.)      AS HIREDATE_CHR1,
               PUT(HIREDATE,EURDFWDX.)     AS HIREDATE_CHR2,
               PUT(HIREDATE,DATE9.)        AS HIREDATE_CHR3,
               PUT(HIREDATE,DATE11.)       AS HIREDATE_CHR4,
               PUT(HIREDATE,NLDATEYM.)     AS HIREDATE_CHR5,
               PUT(HIREDATE,DTTEST17.)     as HIREDATE_CHR6,
               PUT(HIREDATE,langtsda19.)   as HIREDATE_CHR7,
               nldate(HIREDATE,'%B %d %Y') as HIREDATE_CHR8
    from emp;
    QUIT;
    PROC PRINT;RUN;

     


    Results
    hiredate HIREDATE_CHR HIREDATE_CHR1 HIREDATE_CHR2 HIREDATE_CHR3 HIREDATE_CHR4 HIREDATE_CHR5 HIREDATE_CHR6 HIREDATE_CHR7 HIREDATE_CHR8
    1980-12-17 December DEC1980 17 December 1980 17DEC1980 17-DEC-1980 December 1980 December 17 1980 December, 17, 1980 December 17 1980
    1981-02-20 February FEB1981 20 February 1981 20FEB1981 20-FEB-1981 February 1981 February 20 1981 February, 20, 1981 February 20 1981
    1981-02-22 February FEB1981 22 February 1981 22FEB1981 22-FEB-1981 February 1981 February 22 1981 February, 22, 1981 February 22 1981
    1981-04-02 April APR1981 2 April 1981 02APR1981 02-APR-1981 April 1981 April 2 1981 April, 2, 1981 April 02 1981
    1981-09-28 September SEP1981 28 September 1981 28SEP1981 28-SEP-1981 September 1981 September 28 1981 September, 28, 1981 September 28 1981
    1981-03-01 March MAR1981 1 March 1981 01MAR1981 01-MAR-1981 March 1981 March 1 1981 March, 1, 1981 March 01 1981
    1981-01-09 January JAN1981 9 January 1981 09JAN1981 09-JAN-1981 January 1981 January 9 1981 January, 9, 1981 January 09 1981
    1982-12-09 December DEC1982 9 December 1982 09DEC1982 09-DEC-1982 December 1982 December 9 1982 December, 9, 1982 December 09 1982
    1981-11-17 November NOV1981 17 November 1981 17NOV1981 17-NOV-1981 November 1981 November 17 1981 November, 17, 1981 November 17 1981
    1981-09-08 September SEP1981 8 September 1981 08SEP1981 08-SEP-1981 September 1981 September 8 1981 September, 8, 1981 September 08 1981
    1983-01-12 January JAN1983 12 January 1983 12JAN1983 12-JAN-1983 January 1983 January 12 1983 January, 12, 1983 January 12 1983
    1981-12-03 December DEC1981 3 December 1981 03DEC1981 03-DEC-1981 December 1981 December 3 1981 December, 3, 1981 December 03 1981
    1981-12-03 December DEC1981 3 December 1981 03DEC1981 03-DEC-1981 December 1981 December 3 1981 December, 3, 1981 December 03 1981
    1982-01-23 January JAN1982 23 January 1982 23JAN1982 23-JAN-1982 January 1982 January 23 1982 January, 23, 1982 January 23 1982

     


    9. SAS Data Step

     

    SAS Programming
    %%SAS sas
    
    options locale=English_Unitedstates;
    * options locale=Korean_Korea;
    
    * 사용자 입력형식(informat) 생성;
    proc format;
          invalue ifmt (default=25) '/(\D+) (\d+) (\d+)/' (REGEXP) = [ANYDTDTE25.]
            other=_error_;
    run;
    
    PROC SQL;
      CREATE TABLE STATSAS_2 AS
        select HIREDATE,
               PUT(HIREDATE,yymmddn8.)                                              AS HIRE_CHAR_1,
    
               INPUT('01'||HIREDATE_CHR||STRIP(put(year(today()),4.)),ANYDTDTE25.) AS HIRE_DATE    FORMAT=monname.,
               INPUT('01'||HIREDATE_CHR1,ANYDTDTE25.)                              AS HIRE_DATE1   FORMAT=EURDFMY.,
               INPUT(compress(HIREDATE_CHR2),ANYDTDTE25.)                          AS HIRE_DATE2   FORMAT=EURDFWDX.,
               INPUT(HIREDATE_CHR3,DATE9.)                                         AS HIRE_DATE3   FORMAT=DATE9.,
               INPUT(HIREDATE_CHR4,DATE11.)                                        AS HIRE_DATE4   FORMAT=DATE11.,
               INPUT('01'||HIREDATE_CHR5,ANYDTDTE25.)                              AS HIRE_DATE5   FORMAT=NLDATEYM.,
               INPUT(HIREDATE_CHR6,ANYDTDTE25.)                                    AS HIRE_DATE6   FORMAT=DTTEST17.,
               INPUT(HIREDATE_CHR6,ifmt25.)                                        AS HIRE_DATE6_1 FORMAT=DTTEST25.,
    
               INPUT(HIREDATE_CHR6,ANYDTDTE25.)                                    AS HIRE_DATE7   FORMAT=langtsda19.,
    
               INPUT(HIREDATE_CHR8,ANYDTDTE25.)                                    AS HIRE_DATE8   FORMAT=nldate.,
               INPUT(HIREDATE_CHR8,nldate200.)                                     AS HIRE_DATE8_1 FORMAT=nldate.
    from STATSAS_1;
    QUIT;
    PROC PRINT data=STATSAS_2(obs=5);RUN;

     

    Results
    hiredate HIRE_CHAR_1 HIRE_DATE HIRE_DATE1 HIRE_DATE2 HIRE_DATE3 HIRE_DATE4 HIRE_DATE5 HIRE_DATE6 HIRE_DATE6_1 HIRE_DATE7 HIRE_DATE8 HIRE_DATE8_1
    1980-12-17 19801217 December DEC1980 17 December 1980 17DEC1980 17-DEC-1980 December 1980 December 17 1980 December 17 1980 December, 17, 1980 December 17, 1980 December 17, 1980
    1981-02-20 19810220 February FEB1981 20 February 1981 20FEB1981 20-FEB-1981 February 1981 February 20 1981 February 20 1981 February, 20, 1981 February 20, 1981 February 20, 1981
    1981-02-22 19810222 February FEB1981 22 February 1981 22FEB1981 22-FEB-1981 February 1981 February 22 1981 February 22 1981 February, 22, 1981 February 22, 1981 February 22, 1981
    1981-04-02 19810402 April APR1981 2 April 1981 02APR1981 02-APR-1981 April 1981 April 2 1981 April 2 1981 April, 2, 1981 April 02, 1981 April 02, 1981
    1981-09-28 19810928 September SEP1981 28 September 1981 28SEP1981 28-SEP-1981 September 1981 September 28 1981 September 28 1981 September, 28, 1981 September 28, 1981 September 28, 1981

     


    10. Python Dfply Package

     

    Python Programming
    emp >> \
      mutate( hire_date_1 = make_symbolic(pd.to_datetime)(X.hiredate).dt.strftime('%B %d %Y')  ,
              hire_date_2 = make_symbolic(pd.to_datetime)(X.hiredate).apply(lambda x: x.strftime('%B %d %Y'))  ) >> \
      select( ~X.job, ~X.mgr, ~X.comm, ~X.deptno )                                                               >> \
      head()

     

    Results
      empno ename hiredate sal hire_date_1 hire_date_2
    0 7369 SMITH 1980/12/17 800 December 17 1980 December 17 1980
    1 7499 ALLEN 1981/02/20 1600 February 20 1981 February 20 1981
    2 7521 WARD 1981/02/22 1250 February 22 1981 February 22 1981
    3 7566 JONES 1981/04/02 2975 April 02 1981 April 02 1981
    4 7654 MARTIN 1981/09/28 1250 September 28 1981 September 28 1981

     

     

     


    [SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE]   SQL EMP 예제로 만나는 테이블 데이터 전처리 방법 리스트

    반응형

    댓글