포스팅 목차
* 파이썬 & R 패키지 호출 및 예제 데이터 생성 링크
[ TO_CHAR(number) Oracle Function ]
TO_CHAR(number) 함수는 수치형 데이터 값을 사용자가 지정한 number 포맷 형식(fmt)을 따르는 문자형(VARCHAR2) 데이터 타입 문자열로 변환하여 반환한다. 값 n은 NUMBER, BINARY_FLOAT 또는 BINARY_DOUBER 데이터형을 지정할 수 있다. 만약 fmt를 생략하면, n은 유효 자릿수를 유지하기 위해서 충분한 길이의 VARCHAR2로 변환한다.
- 함수 설명 : TO_CHAR(NUMBER) 오라클 함수 링크
1. Oracle(오라클)
TO_CHAR() 함수
수치형 데이터로 계산된 최소 급여를 문자형으로 변경하여 반환한다.
Oracle Programming |
SELECT TO_CHAR(MIN(SAL)) "TO_CHAR_VAR"
FROM EMP
Results |
TO_CHAR_VAR
---------------
800
2. Python Pandas(파이썬)
apply(str)
수치형으로 지정되어 있는 급여(‘sal’) 변수의 속성을 문자형으로 형 변환한다.
Python Programming |
withmooc = copy.copy(emp)
withmooc['sal_text'] = withmooc['sal'].apply(str)
display( withmooc.head() )
display( withmooc.info() )
# display( type(withmooc) )
Results |
empno ename job mgr hiredate sal comm deptno sal_text
0 7369 SMITH CLERK 7902.0 1980/12/17 800 NaN 20 800
1 7499 ALLEN SALESMAN 7698.0 1981/02/20 1600 300.0 30 1600
2 7521 WARD SALESMAN 7698.0 1981/02/22 1250 500.0 30 1250
3 7566 JONES MANAGER 7839.0 1981/04/02 2975 NaN 20 2975
4 7654 MARTIN SALESMAN 7698.0 1981/09/28 1250 1400.0 30 1250
Results |
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14 entries, 0 to 13
Data columns (total 9 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 empno 14 non-null int64
1 ename 14 non-null object
2 job 14 non-null object
3 mgr 13 non-null float64
4 hiredate 14 non-null object
5 sal 14 non-null int64
6 comm 4 non-null float64
7 deptno 14 non-null int64
8 sal_text 14 non-null object
dtypes: float64(2), int64(3), object(4)
memory usage: 1.1+ KB
None
pandas.core.frame.DataFrame
3. R Programming (R Package)
sapply() 함수와 tostring 인수
toString 함수를 사용하여서 수치형으로 지정되어 있는 급여(‘sal’) 변수의 속성을 문자형으로 형 변환한다.
R Programming |
%%R
withmooc <- emp
withmooc['sal_text'] <- sapply(withmooc$sal,toString)
withmooc[1:10, ]
Results |
# A tibble: 10 x 9
empno ename job mgr hiredate sal comm deptno sal_text
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl> <chr>
1 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 800
2 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30 1600
3 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 1250
4 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 2975
5 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30 1250
6 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30 2850
7 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 2450
8 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 3000
9 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 5000
10 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30 1500
sprintf() 출력 함수
sprintf 함수를 사용하여서 원주율 값을 문자형으로 변환한다.
R Programming |
%%R
print(sprintf("%s", pi))
class(sprintf("%s", pi))
Results |
[1] "3.14159265358979"
[1] "character"
sprintf() 출력 함수
sprintf 함수를 사용하여서 수치형으로 지정되어 있는 급여(‘sal’) 변수의 속성을 문자형으로 형 변환한다.
R Programming |
%%R
withmooc <- emp
withmooc['sal_text'] <- sprintf("%s", withmooc$sal)
withmooc
Results |
# A tibble: 14 x 9
empno ename job mgr hiredate sal comm deptno sal_text
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl> <chr>
1 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 800
2 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30 1600
3 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 1250
4 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 2975
5 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30 1250
6 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30 2850
7 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 2450
8 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 3000
9 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 5000
10 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30 1500
11 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20 1100
12 7900 JAMES CLERK 7698 1981-12-03 950 NA 30 950
13 7902 FORD ANALYST 7566 1981-12-03 3000 NA 20 3000
14 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10 1300
as.character() 함수
as.character 함수를 사용하여서 수치형으로 지정되어 있는 급여(‘sal’) 변수의 속성을 문자형으로 형 변환한다.
R Programming |
%%R
withmooc <- emp
withmooc <- transform(withmooc, sal_text=as.character(sal)) # z=as.numeric(z)
str(withmooc)
Results |
'data.frame': 14 obs. of 9 variables:
$ empno : num 7369 7499 7521 7566 7654 ...
$ ename : chr "SMITH" "ALLEN" "WARD" "JONES" ...
$ job : chr "CLERK" "SALESMAN" "SALESMAN" "MANAGER" ...
$ mgr : num 7902 7698 7698 7839 7698 ...
$ hiredate: Date, format: "1980-12-17" "1981-02-20" ...
$ sal : num 800 1600 1250 2975 1250 ...
$ comm : num NA 300 500 NA 1400 NA NA NA NA 0 ...
$ deptno : num 20 30 30 20 30 30 10 20 10 30 ...
$ sal_text: chr "800" "1600" "1250" "2975" ...
자동 형 변환 작업 :
- [참고] Convert type of multiple columns of a dataframe at once [링크]
- [참고] Change the class from factor to numeric of many columns in a data frame [링크]
모든 변수에 대해 as.character를 사용하여서 문자로 변경 후 type.convert 함수로 자동 형 변환 작업을 수행한다. numeric 형태의 변수는 integer 형태의 변수로 형변환 작업을 수행한다. as.is = TRUE 옵션을 사용하여서 character 형태에 대한 factor로 형 변환하는 것을 제어한다.
R Programming |
%%R
withmooc <- emp
str(withmooc)
withmooc[] <- lapply(withmooc, function(x) type.convert(as.character(x), as.is = TRUE))
str(withmooc)
Results |
tibble [14 x 8] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ empno : num [1:14] 7369 7499 7521 7566 7654 ...
$ ename : chr [1:14] "SMITH" "ALLEN" "WARD" "JONES" ...
$ job : chr [1:14] "CLERK" "SALESMAN" "SALESMAN" "MANAGER" ...
$ mgr : num [1:14] 7902 7698 7698 7839 7698 ...
$ hiredate: Date[1:14], format: "1980-12-17" "1981-02-20" ...
$ sal : num [1:14] 800 1600 1250 2975 1250 ...
$ comm : num [1:14] NA 300 500 NA 1400 NA NA NA NA 0 ...
$ deptno : num [1:14] 20 30 30 20 30 30 10 20 10 30 ...
- attr(*, "spec")=
.. cols(
.. empno = col_double(),
.. ename = col_character(),
.. job = col_character(),
.. mgr = col_double(),
.. hiredate = col_date(format = ""),
.. sal = col_double(),
.. comm = col_double(),
.. deptno = col_double()
.. )
tibble [14 x 8] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ empno : int [1:14] 7369 7499 7521 7566 7654 7698 7782 7788 7839 7844 ...
$ ename : chr [1:14] "SMITH" "ALLEN" "WARD" "JONES" ...
$ job : chr [1:14] "CLERK" "SALESMAN" "SALESMAN" "MANAGER" ...
$ mgr : int [1:14] 7902 7698 7698 7839 7698 7839 7839 7566 NA 7698 ...
$ hiredate: chr [1:14] "1980-12-17" "1981-02-20" "1981-02-22" "1981-04-02" ...
$ sal : int [1:14] 800 1600 1250 2975 1250 2850 2450 3000 5000 1500 ...
$ comm : int [1:14] NA 300 500 NA 1400 NA NA NA NA 0 ...
$ deptno : int [1:14] 20 30 30 20 30 30 10 20 10 30 ...
- attr(*, "spec")=
.. cols(
.. empno = col_double(),
.. ename = col_character(),
.. job = col_character(),
.. mgr = col_double(),
.. hiredate = col_date(format = ""),
.. sal = col_double(),
.. comm = col_double(),
.. deptno = col_double()
.. )
as.character()
num, chr, Date 형태의 변수를 chr 형태의 변수로 일괄 변환
R Programming |
%%R
withmooc <- emp
str(withmooc)
withmooc[] <- apply(withmooc, 2, function(x) (as.character(x)))
str(withmooc)
Results |
tibble [14 x 8] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ empno : num [1:14] 7369 7499 7521 7566 7654 ...
$ ename : chr [1:14] "SMITH" "ALLEN" "WARD" "JONES" ...
$ job : chr [1:14] "CLERK" "SALESMAN" "SALESMAN" "MANAGER" ...
$ mgr : num [1:14] 7902 7698 7698 7839 7698 ...
$ hiredate: Date[1:14], format: "1980-12-17" "1981-02-20" ...
$ sal : num [1:14] 800 1600 1250 2975 1250 ...
$ comm : num [1:14] NA 300 500 NA 1400 NA NA NA NA 0 ...
$ deptno : num [1:14] 20 30 30 20 30 30 10 20 10 30 ...
- attr(*, "spec")=
.. cols(
.. empno = col_double(),
.. ename = col_character(),
.. job = col_character(),
.. mgr = col_double(),
.. hiredate = col_date(format = ""),
.. sal = col_double(),
.. comm = col_double(),
.. deptno = col_double()
.. )
tibble [14 x 8] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ empno : chr [1:14] "7369" "7499" "7521" "7566" ...
$ ename : chr [1:14] "SMITH" "ALLEN" "WARD" "JONES" ...
$ job : chr [1:14] "CLERK" "SALESMAN" "SALESMAN" "MANAGER" ...
$ mgr : chr [1:14] "7902" "7698" "7698" "7839" ...
$ hiredate: chr [1:14] "1980-12-17" "1981-02-20" "1981-02-22" "1981-04-02" ...
$ sal : chr [1:14] " 800" "1600" "1250" "2975" ...
$ comm : chr [1:14] NA " 300" " 500" NA ...
$ deptno : chr [1:14] "20" "30" "30" "20" ...
- attr(*, "spec")=
.. cols(
.. empno = col_double(),
.. ename = col_character(),
.. job = col_character(),
.. mgr = col_double(),
.. hiredate = col_date(format = ""),
.. sal = col_double(),
.. comm = col_double(),
.. deptno = col_double()
.. )
4. R Dplyr Package
sapply() 함수와 tostring 인수
toString 함수를 사용하여서 수치형으로 지정되어 있는 급여(‘sal’) 변수(리스트)의 속성을 문자형으로 형 변환한다.
R Programming |
%%R
withmooc <- emp
emp %>%
dplyr::mutate( sal_text = sapply(withmooc$sal,toString) ) %>%
head()
Results |
# A tibble: 6 x 9
empno ename job mgr hiredate sal comm deptno sal_text
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl> <chr>
1 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 800
2 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30 1600
3 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 1250
4 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 2975
5 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30 1250
6 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30 2850
as.character() 함수
as.character 함수를 사용하여서 수치형으로 지정되어 있는 급여(‘sal’) 변수의 속성을 문자형으로 형 변환한다.
R Programming |
%%R
withmooc <- emp
emp %>%
mutate(sal_text = as.character(sal)) %>%
head()
Results |
# A tibble: 6 x 9
empno ename job mgr hiredate sal comm deptno sal_text
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl> <chr>
1 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 800
2 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30 1600
3 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 1250
4 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 2975
5 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30 1250
6 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30 2850
hablar::convert() 함수
R Programming |
%%R
withmooc <- emp
library(hablar)
withmooc %>%
hablar::convert(chr(sal)) %>% # int(x, z)
head()
Results |
# A tibble: 6 x 8
empno ename job mgr hiredate sal comm deptno
<dbl> <chr> <chr> <dbl> <date> <chr> <dbl> <dbl>
1 7369 SMITH CLERK 7902 1980-12-17 800 NA 20
2 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30
3 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30
4 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20
5 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30
6 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30
[참고] 자동 형 변환 작업 :
- 모든 변수에 대해 as.character를 사용하여서 문자로 변경 후 type.convert 함수로 자동 형 변환 작업을 수행한다. numeric 형태의 변수는 integer 형태의 변수로 형변환 작업을 수행한다. as.is = TRUE 옵션을 사용하여서 character 형태에 대한 factor로 형변환하는 것을 제어한다.
- type_convert 함수 : https://readr.tidyverse.org/reference/type_convert.html
R Programming |
%%R
withmooc <- emp
# str(withmooc)
withmooc <- withmooc %>% mutate_all(funs(type.convert(as.character(.), as.is = TRUE)))
str(withmooc)
Results |
tibble [14 x 8] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ empno : int [1:14] 7369 7499 7521 7566 7654 7698 7782 7788 7839 7844 ...
$ ename : chr [1:14] "SMITH" "ALLEN" "WARD" "JONES" ...
$ job : chr [1:14] "CLERK" "SALESMAN" "SALESMAN" "MANAGER" ...
$ mgr : int [1:14] 7902 7698 7698 7839 7698 7839 7839 7566 NA 7698 ...
$ hiredate: chr [1:14] "1980-12-17" "1981-02-20" "1981-02-22" "1981-04-02" ...
$ sal : int [1:14] 800 1600 1250 2975 1250 2850 2450 3000 5000 1500 ...
$ comm : int [1:14] NA 300 500 NA 1400 NA NA NA NA 0 ...
$ deptno : int [1:14] 20 30 30 20 30 30 10 20 10 30 ...
- attr(*, "spec")=
.. cols(
.. empno = col_double(),
.. ename = col_character(),
.. job = col_character(),
.. mgr = col_double(),
.. hiredate = col_date(format = ""),
.. sal = col_double(),
.. comm = col_double(),
.. deptno = col_double()
.. )
mutate_if() 조건문과 as.character(.)
if 조건문을 사용하여서 변수 형태를 조사 후 형변환 작업을 수행한다. 아래 예제는 수치형 변수를 확인 후 문자형 변수로 형변환 작업을 수행한다.
R Programming |
%%R
withmooc <- emp
# str(withmooc)
withmooc <- withmooc %>% mutate_if(is.numeric, ~(as.character(.)))
str(withmooc)
Results |
tibble [14 x 8] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ empno : chr [1:14] "7369" "7499" "7521" "7566" ...
$ ename : chr [1:14] "SMITH" "ALLEN" "WARD" "JONES" ...
$ job : chr [1:14] "CLERK" "SALESMAN" "SALESMAN" "MANAGER" ...
$ mgr : chr [1:14] "7902" "7698" "7698" "7839" ...
$ hiredate: Date[1:14], format: "1980-12-17" "1981-02-20" ...
$ sal : chr [1:14] "800" "1600" "1250" "2975" ...
$ comm : chr [1:14] NA "300" "500" NA ...
$ deptno : chr [1:14] "20" "30" "30" "20" ...
- attr(*, "spec")=
.. cols(
.. empno = col_double(),
.. ename = col_character(),
.. job = col_character(),
.. mgr = col_double(),
.. hiredate = col_date(format = ""),
.. sal = col_double(),
.. comm = col_double(),
.. deptno = col_double()
.. )
5. R sqldf Package
cast() 함수
급여 변수의 형태를 text로 변환하고, typeof 함수로 변수 형태를 확인한다.
R Programming |
%%R
sqldf(" select cast(sal as text) as num_text,
typeof(cast(sal as text)) as sal_type
from emp; ")[1:10, ]
Results |
num_text sal_type
1 800.0 text
2 1600.0 text
3 1250.0 text
4 2975.0 text
5 1250.0 text
6 2850.0 text
7 2450.0 text
8 3000.0 text
9 5000.0 text
10 1500.0 text
6. Python pandasql Package
cast() 함수
Python Programming |
ps.sqldf(" select cast(sal as text) as num_text, \
typeof(cast(sal as text)) sal_type \
from emp ").head()
Results |
num_text sal_type
0 800 text
1 1600 text
2 1250 text
3 2975 text
4 1250 text
7. R data.table Package
sapplyr() 함수와 toString 인수
toString 함수를 사용하여서 수치형으로 지정되어 있는 급여(‘sal’) 변수(리스트)의 속성을 문자형으로 형 변환한다.
R Programming |
%%R
DT <- data.table(emp)
dept_DT <- data.table(dept)
DT[, sal_text := sapply(withmooc$sal,toString)][1:10, ]
# class(DT)
# str(DT)
Results |
empno ename job mgr hiredate sal comm deptno sal_text
1: 7369 SMITH CLERK 7902 1980-12-17 800 NA 20 800
2: 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30 1600
3: 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 1250
4: 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 2975
5: 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30 1250
6: 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30 2850
7: 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 2450
8: 7788 SCOTT ANALYST 7566 1982-12-09 3000 NA 20 3000
9: 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 5000
10: 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30 1500
[참고] 자동 형 변환 작업
R Programming |
%%R
DT <- data.table(emp)
# str(DT)
DT1 = DT[, lapply(.SD, as.character)]
str(DT1)
Results |
Classes 'data.table' and 'data.frame': 14 obs. of 8 variables:
$ empno : chr "7369" "7499" "7521" "7566" ...
$ ename : chr "SMITH" "ALLEN" "WARD" "JONES" ...
$ job : chr "CLERK" "SALESMAN" "SALESMAN" "MANAGER" ...
$ mgr : chr "7902" "7698" "7698" "7839" ...
$ hiredate: chr "1980-12-17" "1981-02-20" "1981-02-22" "1981-04-02" ...
$ sal : chr "800" "1600" "1250" "2975" ...
$ comm : chr NA "300" "500" NA ...
$ deptno : chr "20" "30" "30" "20" ...
- attr(*, ".internal.selfref")=<externalptr>
사용자 정의 함수와 as.character(x)
R Programming |
%%R
DT <- data.table(emp)
DT = data.table(apply(DT, 2, function(x) as.character(x)))
str(DT)
Results |
Classes 'data.table' and 'data.frame': 14 obs. of 8 variables:
$ empno : chr "7369" "7499" "7521" "7566" ...
$ ename : chr "SMITH" "ALLEN" "WARD" "JONES" ...
$ job : chr "CLERK" "SALESMAN" "SALESMAN" "MANAGER" ...
$ mgr : chr "7902" "7698" "7698" "7839" ...
$ hiredate: chr "1980-12-17" "1981-02-20" "1981-02-22" "1981-04-02" ...
$ sal : chr " 800" "1600" "1250" "2975" ...
$ comm : chr NA " 300" " 500" NA ...
$ deptno : chr "20" "30" "30" "20" ...
- attr(*, ".internal.selfref")=<externalptr>
8. Python Duckdb의 SQL
Python Programming |
%%sql
SELECT cast(MIN(SAL) as varchar) as Char_num,
typeof(cast(MIN(SAL) as varchar)) as char_type
FROM emp
Python Programming |
duckdb.sql(" SELECT cast(MIN(SAL) as varchar) as Char_num, \
typeof(cast(MIN(SAL) as varchar)) as char_type \
FROM emp ").df()
Results |
Char_num char_type
0 800 VARCHAR
Python Programming |
%%sql
select cast(sal as text) as num_text,
typeof(cast(sal as text)) sal_type
from emp
LIMIT 6
Python Programming |
duckdb.sql(" select cast(sal as text) as num_text, \
typeof(cast(sal as text)) sal_type \
from emp \
LIMIT 6 ").df()
Results |
num_text sal_type
0 800 VARCHAR
1 1600 VARCHAR
2 1250 VARCHAR
3 2975 VARCHAR
4 1250 VARCHAR
5 2850 VARCHAR
--------------------------------------------
[Oracle, Pandas, R Prog, Dplyr, Sqldf, Pandasql, Data.Table] 오라클 함수와 R & Python 비교 사전 목록 링크 |
오라클 SQL 함수(Oracle SQL Function) 목록 리스트 링크 |
[SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE] SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트 링크 링크 |
댓글