포스팅 목차
13. Display the list of employees who have joined the company before 30th June 90 or after 31st dec 90.
* 1981년 1월 09일부터 1981년 11월 17일 사이에 입사한 직원들의 정보를 출력하시오.
- [데이터 필터] 지정한 날짜에 속하는 데이터 추출
|
1. 오라클(Oracle)
Oracle Programming |
select * from emp
where hiredate between '09-jan-1981' and '17-nov-1981';
Oracle Programming |
select * from emp
where to_char(hiredate,'yyyymmdd') between '19810109' and '19811117';
2. 파이썬(Pandas)
Python Programming |
emp[ emp['hiredate'].between('1981/01/09', '1981/11/17', inclusive = True)][["ename"]]
Results |
empno | ename | job | mgr | hiredate | sal | comm | deptno | hiredate_D | |
1 | 7499 | ALLEN | SALESMAN | 7698.0 | 1981/02/20 | 1600 | 300.0 | 30 | 1981-02-20 |
2 | 7521 | WARD | SALESMAN | 7698.0 | 1981/02/22 | 1250 | 500.0 | 30 | 1981-02-22 |
3 | 7566 | JONES | MANAGER | 7839.0 | 1981/04/02 | 2975 | NaN | 20 | 1981-04-02 |
4 | 7654 | MARTIN | SALESMAN | 7698.0 | 1981/09/28 | 1250 | 1400.0 | 30 | 1981-09-28 |
5 | 7698 | BLAKE | MANAGER | 7839.0 | 1981/03/01 | 2850 | NaN | 30 | 1981-03-01 |
6 | 7782 | CLARK | MANAGER | 7839.0 | 1981/01/09 | 2450 | NaN | 10 | 1981-01-09 |
8 | 7839 | KING | PRESIDENT | NaN | 1981/11/17 | 5000 | NaN | 10 | 1981-11-17 |
9 | 7844 | TURNER | SALESMAN | 7698.0 | 1981/09/08 | 1500 | 0.0 | 30 | 1981-09-08 |
Python Programming |
emp.loc[(emp.hiredate.between('1981/01/09', '1981/11/17', inclusive = True)) , ["ename"]]
Results |
ename | |
1 | ALLEN |
2 | WARD |
3 | JONES |
4 | MARTIN |
5 | BLAKE |
6 | CLARK |
8 | KING |
9 | TURNER |
Python Programming |
emp.query("'1981/01/09' <= hiredate <= '1981/11/17'")[["ename"]]
Results |
ename | |
1 | ALLEN |
2 | WARD |
3 | JONES |
4 | MARTIN |
5 | BLAKE |
6 | CLARK |
8 | KING |
9 | TURNER |
3. R Programming (R Package)
R Programming |
%%R
emp[emp$hiredate >= '1981/01/09' & emp$hiredate <= '1981/11/17' , c("empno","ename") ]
Results |
# A tibble: 8 x 2
empno ename
<dbl> <chr>
1 7499 ALLEN
2 7521 WARD
3 7566 JONES
4 7654 MARTIN
5 7698 BLAKE
6 7782 CLARK
7 7839 KING
8 7844 TURNER
R Programming |
%%R
emp[which(emp$hiredate >= '1981/01/09' & emp$hiredate <= '1981/11/17'), c("empno","ename") ]
Results |
# A tibble: 8 x 2
empno ename
<dbl> <chr>
1 7499 ALLEN
2 7521 WARD
3 7566 JONES
4 7654 MARTIN
5 7698 BLAKE
6 7782 CLARK
7 7839 KING
8 7844 TURNER
R Programming |
%%R
subset(emp,subset= (hiredate >= '1981/01/09' & hiredate <= '1981/11/17') , select=c(empno,ename) )
Results |
# A tibble: 8 x 2
empno ename
<dbl> <chr>
1 7499 ALLEN
2 7521 WARD
3 7566 JONES
4 7654 MARTIN
5 7698 BLAKE
6 7782 CLARK
7 7839 KING
8 7844 TURNER
4. R Dplyr Package
- between 함수 사용
R Programming |
%%R
# x : A numeric vector of values
emp %>% filter(between(hiredate, as.Date('1981/01/09'), as.Date('1981/11/17') ) ) %>% select(ename)
Results |
# A tibble: 8 x 1
ename
<chr>
1 ALLEN
2 WARD
3 JONES
4 MARTIN
5 BLAKE
6 CLARK
7 KING
8 TURNER
- 부등호 처리 방식(문자비교)
R Programming |
%%R
emp %>% filter(hiredate >= "1981-01-09" & hiredate <= "1981-11-17" ) %>% dplyr::select(ename)
Results |
# A tibble: 8 x 1
ename
<chr>
1 ALLEN
2 WARD
3 JONES
4 MARTIN
5 BLAKE
6 CLARK
7 KING
8 TURNER
5. R sqldf Package
- hiredate 가 날짜형식인 경우 처리 방식
R Programming |
%%R
date1=as.Date("1981-01-09")
date2=as.Date("1981-11-17")
s <- paste("select * from emp where hiredate between ", as.numeric(date1)," and ",as.numeric(date2) )
sqldf(s, verbose = TRUE)
Results |
sqldf: library(RSQLite)
sqldf: m <- dbDriver("SQLite")
sqldf: connection <- dbConnect(m, dbname = ":memory:")
sqldf: initExtension(connection)
sqldf: dbWriteTable(connection, 'emp', emp, row.names = FALSE)
sqldf: dbGetQuery(connection, 'select * from emp where hiredate between 4026 and 4338')
sqldf: dbDisconnect(connection)
empno ename job mgr hiredate sal comm deptno
1 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30
2 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30
3 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20
4 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30
5 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30
6 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10
7 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10
8 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30
R Programming |
%%R
vdate1 <- as.Date('1981-01-09')
vdate2 <- as.Date('1981-11-17')
gsubfn::fn$sqldf("select * from emp where hiredate between $vdate1 and $vdate2 ")
Results |
empno ename job mgr hiredate sal comm deptno
1 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30
2 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30
3 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20
4 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30
5 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30
6 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10
7 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10
8 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30
R Programming |
%%R
gsubfn::fn$sqldf("select *, strftime('%d-%m-%Y', 'now') now_char
from emp
where hiredate between `as.Date('1981-01-09')` and `as.Date('1981-11-17')` ")
Results |
empno ename job mgr hiredate sal comm deptno now_char
1 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30 09-09-2020
2 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30 09-09-2020
3 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20 09-09-2020
4 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30 09-09-2020
5 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30 09-09-2020
6 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10 09-09-2020
7 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10 09-09-2020
8 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30 09-09-2020
- 날짜형 데이터를 문자형을 변경 후 비교
R Programming |
%%R
sqldf("select * from emp where strftime('%Y-%m-%d', hiredate * 3600 * 24, 'unixepoch') between '1981-01-09' and '1981-11-17' ")
Results |
empno ename job mgr hiredate sal comm deptno
1 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30
2 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30
3 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20
4 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30
5 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30
6 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10
7 7839 KING PRESIDENT NA 1981-11-17 5000 NA 10
8 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30
6. Python pandasql Package
- 문자형 날짜 처리 방식
Python Programming |
ps.sqldf("select * from emp where hiredate between '1981/01/09' and '1981/11/17'")
Results |
empno | ename | job | mgr | hiredate | sal | comm | deptno | hiredate_D | to_date | day_gap | |
0 | 7499 | ALLEN | SALESMAN | 7698.0 | 1981/02/20 | 1600 | 300.0 | 30 | 1981-02-20 00:00:00.000000 | 2020-09-10 09:45:49.454895 | 14447 |
1 | 7521 | WARD | SALESMAN | 7698.0 | 1981/02/22 | 1250 | 500.0 | 30 | 1981-02-22 00:00:00.000000 | 2020-09-10 09:45:49.454895 | 14445 |
2 | 7566 | JONES | MANAGER | 7839.0 | 1981/04/02 | 2975 | NaN | 20 | 1981-04-02 00:00:00.000000 | 2020-09-10 09:45:49.454895 | 14406 |
3 | 7654 | MARTIN | SALESMAN | 7698.0 | 1981/09/28 | 1250 | 1400.0 | 30 | 1981-09-28 00:00:00.000000 | 2020-09-10 09:45:49.454895 | 14227 |
4 | 7698 | BLAKE | MANAGER | 7839.0 | 1981/03/01 | 2850 | NaN | 30 | 1981-03-01 00:00:00.000000 | 2020-09-10 09:45:49.454895 | 14438 |
5 | 7782 | CLARK | MANAGER | 7839.0 | 1981/01/09 | 2450 | NaN | 10 | 1981-01-09 00:00:00.000000 | 2020-09-10 09:45:49.454895 | 14489 |
6 | 7839 | KING | PRESIDENT | NaN | 1981/11/17 | 5000 | NaN | 10 | 1981-11-17 00:00:00.000000 | 2020-09-10 09:45:49.454895 | 14177 |
7 | 7844 | TURNER | SALESMAN | 7698.0 | 1981/09/08 | 1500 | 0.0 | 30 | 1981-09-08 00:00:00.000000 | 2020-09-10 09:45:49.454895 | 14247 |
- 날짜 데이터 조회
Python Programming |
emp['hiredate_D'] = pd.to_datetime(emp['hiredate'])
ps.sqldf("select *,date('1981-01-09') from emp where hiredate_D between date('1981-01-09') and date('1981-11-17')")
Results |
empno | ename | job | mgr | hiredate | sal | comm | deptno | hiredate_D | to_date | day_gap | date('1981-01-09') | |
0 | 7499 | ALLEN | SALESMAN | 7698.0 | 1981/02/20 | 1600 | 300.0 | 30 | 1981-02-20 00:00:00.000000 | 2020-09-10 09:45:49.454895 | 14447 | 1981-01-09 |
1 | 7521 | WARD | SALESMAN | 7698.0 | 1981/02/22 | 1250 | 500.0 | 30 | 1981-02-22 00:00:00.000000 | 2020-09-10 09:45:49.454895 | 14445 | 1981-01-09 |
2 | 7566 | JONES | MANAGER | 7839.0 | 1981/04/02 | 2975 | NaN | 20 | 1981-04-02 00:00:00.000000 | 2020-09-10 09:45:49.454895 | 14406 | 1981-01-09 |
3 | 7654 | MARTIN | SALESMAN | 7698.0 | 1981/09/28 | 1250 | 1400.0 | 30 | 1981-09-28 00:00:00.000000 | 2020-09-10 09:45:49.454895 | 14227 | 1981-01-09 |
4 | 7698 | BLAKE | MANAGER | 7839.0 | 1981/03/01 | 2850 | NaN | 30 | 1981-03-01 00:00:00.000000 | 2020-09-10 09:45:49.454895 | 14438 | 1981-01-09 |
5 | 7782 | CLARK | MANAGER | 7839.0 | 1981/01/09 | 2450 | NaN | 10 | 1981-01-09 00:00:00.000000 | 2020-09-10 09:45:49.454895 | 14489 | 1981-01-09 |
6 | 7844 | TURNER | SALESMAN | 7698.0 | 1981/09/08 | 1500 | 0.0 | 30 | 1981-09-08 00:00:00.000000 | 2020-09-10 09:45:49.454895 | 14247 | 1981-01-09 |
7. R data.table Package
R Programming |
%%R
DT <- data.table(emp)
dept_DT <- data.table(dept)
DT[hiredate >= "1981-01-09" & hiredate <= "1981-11-17", .(ename)]
Results |
ename
1: ALLEN
2: WARD
3: JONES
4: MARTIN
5: BLAKE
6: CLARK
7: KING
8: TURNER
8. SAS Proc SQL
SAS Programming |
%%SAS sas
PROC SQL;
CREATE TABLE STATSAS_1 AS
SELECT *
FROM EMP
WHERE hiredate between '09jan81'd and '17nov81'd;
QUIT;
PROC PRINT;RUN;
Results |
OBS | empno | ename | job | mgr | hiredate | sal | comm | deptno |
1 | 7499 | ALLEN | SALESMAN | 7698 | 1981-02-20 | 1600 | 300 | 30 |
2 | 7521 | WARD | SALESMAN | 7698 | 1981-02-22 | 1250 | 500 | 30 |
3 | 7566 | JONES | MANAGER | 7839 | 1981-04-02 | 2975 | . | 20 |
4 | 7654 | MARTIN | SALESMAN | 7698 | 1981-09-28 | 1250 | 1400 | 30 |
5 | 7698 | BLAKE | MANAGER | 7839 | 1981-03-01 | 2850 | . | 30 |
6 | 7782 | CLARK | MANAGER | 7839 | 1981-01-09 | 2450 | . | 10 |
7 | 7839 | KING | PRESIDEN | . | 1981-11-17 | 5000 | . | 10 |
8 | 7844 | TURNER | SALESMAN | 7698 | 1981-09-08 | 1500 | 0 | 30 |
9. SAS Data Step
SAS Programming |
%%SAS sas
DATA STATSAS_2;
SET EMP;
WHERE hiredate between '09jan81'd and '17nov81'd;
RUN;
PROC PRINT;RUN;
Results |
OBS | empno | ename | job | mgr | hiredate | sal | comm | deptno |
1 | 7499 | ALLEN | SALESMAN | 7698 | 1981-02-20 | 1600 | 300 | 30 |
2 | 7521 | WARD | SALESMAN | 7698 | 1981-02-22 | 1250 | 500 | 30 |
3 | 7566 | JONES | MANAGER | 7839 | 1981-04-02 | 2975 | . | 20 |
4 | 7654 | MARTIN | SALESMAN | 7698 | 1981-09-28 | 1250 | 1400 | 30 |
5 | 7698 | BLAKE | MANAGER | 7839 | 1981-03-01 | 2850 | . | 30 |
6 | 7782 | CLARK | MANAGER | 7839 | 1981-01-09 | 2450 | . | 10 |
7 | 7839 | KING | PRESIDEN | . | 1981-11-17 | 5000 | . | 10 |
8 | 7844 | TURNER | SALESMAN | 7698 | 1981-09-08 | 1500 | 0 | 30 |
SAS Programming |
%%SAS sas
DATA STATSAS_3;
SET EMP;
hiredate_char = put(hiredate, yymmddn.);
WHERE put(hiredate, yymmddn.) between '19810109' and '19811117';
RUN;
PROC PRINT;RUN;
Results |
OBS | empno | ename | job | mgr | hiredate | sal | comm | deptno | hiredate_char |
1 | 7499 | ALLEN | SALESMAN | 7698 | 1981-02-20 | 1600 | 300 | 30 | 19810220 |
2 | 7521 | WARD | SALESMAN | 7698 | 1981-02-22 | 1250 | 500 | 30 | 19810222 |
3 | 7566 | JONES | MANAGER | 7839 | 1981-04-02 | 2975 | . | 20 | 19810402 |
4 | 7654 | MARTIN | SALESMAN | 7698 | 1981-09-28 | 1250 | 1400 | 30 | 19810928 |
5 | 7698 | BLAKE | MANAGER | 7839 | 1981-03-01 | 2850 | . | 30 | 19810301 |
6 | 7782 | CLARK | MANAGER | 7839 | 1981-01-09 | 2450 | . | 10 | 19810109 |
7 | 7839 | KING | PRESIDEN | . | 1981-11-17 | 5000 | . | 10 | 19811117 |
8 | 7844 | TURNER | SALESMAN | 7698 | 1981-09-08 | 1500 | 0 | 30 | 19810908 |
10. Python Dfply Package
Python Programming |
emp >> filter_by(X.hiredate.between('1981/01/09', '1981/11/17', inclusive = True)) >> select(X.empno, X.ename)
Results |
empno | ename | |
1 | 7499 | ALLEN |
2 | 7521 | WARD |
3 | 7566 | JONES |
4 | 7654 | MARTIN |
5 | 7698 | BLAKE |
6 | 7782 | CLARK |
8 | 7839 | KING |
9 | 7844 | TURNER |
Python Programming |
emp >> mask(X.hiredate.between('1981/01/09', '1981/11/17', inclusive = True)) >> select(X.empno, X.ename) >> head(3)
Results |
empno | ename | |
1 | 7499 | ALLEN |
2 | 7521 | WARD |
3 | 7566 | JONES |
[SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE] SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트
반응형
댓글