포스팅 목차
55. Display the names of employees who earn a Sal more than that of James or that of salary lesser than that of Scott.
* 직원 'JAMES' 의 급여보다 크고, 직원 'SCOTT'의 급여보다 적은 직원의 정보를 출력하시오.
- 파이썬 & R 패키지 호출 및 예제 데이터 생성 링크
- [데이터 추출] 비상관 서브쿼리(Uncorrelated Subqueyr) 를 만족하는 데이터 추출 - 비교연산자
|
1. Oracle(오라클)
- 비상관 서브쿼리
Oracle Programming |
select *
from emp
where sal < (select sal from emp where ename='SCOTT')
and sal > (select sal from emp where ename='JAMES');
2. Python Pandas(파이썬)
- iloc[0] : pandas.core.series.Series 를 numpy.int64로 변경
- max(emp[emp['ename']=='SCOTT'].sal) 를 지정하면 바로 해결
Python Programming |
emp[(emp['sal'] < (emp[emp['ename']=='SCOTT'].sal).iloc[0])
& (emp['sal'] > (emp[emp['ename']=='JAMES'].sal).iloc[0])]
Results |
empno | ename | job | mgr | hiredate | sal | comm | deptno | |
1 | 7499 | ALLEN | SALESMAN | 7698.0 | 1981/02/20 | 1600 | 300.0 | 30 |
2 | 7521 | WARD | SALESMAN | 7698.0 | 1981/02/22 | 1250 | 500.0 | 30 |
3 | 7566 | JONES | MANAGER | 7839.0 | 1981/04/02 | 2975 | NaN | 20 |
4 | 7654 | MARTIN | SALESMAN | 7698.0 | 1981/09/28 | 1250 | 1400.0 | 30 |
5 | 7698 | BLAKE | MANAGER | 7839.0 | 1981/03/01 | 2850 | NaN | 30 |
6 | 7782 | CLARK | MANAGER | 7839.0 | 1981/01/09 | 2450 | NaN | 10 |
9 | 7844 | TURNER | SALESMAN | 7698.0 | 1981/09/08 | 1500 | 0.0 | 30 |
10 | 7876 | ADAMS | CLERK | 7788.0 | 1983/01/12 | 1100 | NaN | 20 |
13 | 7934 | MILLER | CLERK | 7782.0 | 1982/01/23 | 1300 | NaN | 10 |
3. R Programming (R Package)
- 데이터 필터링
R Programming |
%%R
emp[ emp$sal < (emp[emp$ename == 'SCOTT',]$sal)
& emp$sal > (emp[emp$ename == 'JAMES',]$sal),]
Results |
# A tibble: 9 x 8
empno ename job mgr hiredate sal comm deptno
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl>
1 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30
2 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30
3 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20
4 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30
5 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30
6 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10
7 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30
8 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20
9 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10
4. R Dplyr Package
- 데이터 필터링
R Programming |
%%R
emp %>%
filter(sal < ( emp %>% filter(ename == 'SCOTT')
%>% pull(sal)
%>% unique )
& sal > ( emp %>% filter(ename == 'JAMES')
%>% pull(sal)
%>% unique )
)
Results |
# A tibble: 9 x 8
empno ename job mgr hiredate sal comm deptno
<dbl> <chr> <chr> <dbl> <date> <dbl> <dbl> <dbl>
1 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30
2 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30
3 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20
4 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30
5 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30
6 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10
7 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30
8 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20
9 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10
5. R sqldf Package
- 비상관 서브쿼리
R Programming |
%%R
sqldf("select *
from emp
where sal < (select sal from emp where ename='SCOTT')
and sal > (select sal from emp where ename='JAMES')")
Results |
empno ename job mgr hiredate sal comm deptno
1 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30
2 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30
3 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20
4 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30
5 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30
6 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10
7 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30
8 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20
9 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10
6. Python pandasql Package
- 비상관 서브쿼리
Python Programming |
ps.sqldf("select * \
from emp \
where sal < (select sal from emp where ename='SCOTT') \
and sal > (select sal from emp where ename='JAMES')")
Results |
empno | ename | job | mgr | hiredate | sal | comm | deptno | |
0 | 7499 | ALLEN | SALESMAN | 7698.0 | 1981/02/20 | 1600 | 300.0 | 30 |
1 | 7521 | WARD | SALESMAN | 7698.0 | 1981/02/22 | 1250 | 500.0 | 30 |
2 | 7566 | JONES | MANAGER | 7839.0 | 1981/04/02 | 2975 | NaN | 20 |
3 | 7654 | MARTIN | SALESMAN | 7698.0 | 1981/09/28 | 1250 | 1400.0 | 30 |
4 | 7698 | BLAKE | MANAGER | 7839.0 | 1981/03/01 | 2850 | NaN | 30 |
5 | 7782 | CLARK | MANAGER | 7839.0 | 1981/01/09 | 2450 | NaN | 10 |
6 | 7844 | TURNER | SALESMAN | 7698.0 | 1981/09/08 | 1500 | 0.0 | 30 |
7 | 7876 | ADAMS | CLERK | 7788.0 | 1983/01/12 | 1100 | NaN | 20 |
8 | 7934 | MILLER | CLERK | 7782.0 | 1982/01/23 | 1300 | NaN | 10 |
7. R data.table Package
- 데이터 필터링
R Programming |
%%R
DT <- data.table(emp)
DT[sal< (DT[ename == "SCOTT", (sal)]) & sal > (DT[ename == "JAMES", (sal)]) ,]
Results |
empno ename job mgr hiredate sal comm deptno
1: 7499 ALLEN SALESMAN 7698 1981-02-20 1600 300 30
2: 7521 WARD SALESMAN 7698 1981-02-22 1250 500 30
3: 7566 JONES MANAGER 7839 1981-04-02 2975 NA 20
4: 7654 MARTIN SALESMAN 7698 1981-09-28 1250 1400 30
5: 7698 BLAKE MANAGER 7839 1981-03-01 2850 NA 30
6: 7782 CLARK MANAGER 7839 1981-01-09 2450 NA 10
7: 7844 TURNER SALESMAN 7698 1981-09-08 1500 0 30
8: 7876 ADAMS CLERK 7788 1983-01-12 1100 NA 20
9: 7934 MILLER CLERK 7782 1982-01-23 1300 NA 10
8. SAS Proc SQL
- 비상관 서브쿼리를 이용한 데이터 필터링
SAS Programming |
%%SAS sas
PROC SQL;
CREATE TABLE STATSAS_1 AS
select ename
from emp
where sal < (select sal from emp where ename='SCOTT')
and sal > (select sal from emp where ename='JAMES');
QUIT;
PROC PRINT data=STATSAS_1(obs=5);RUN;
Results |
OBS | ename |
1 | ALLEN |
2 | WARD |
3 | JONES |
4 | MARTIN |
5 | BLAKE |
9. SAS Data Step
- DATA STEP(CROSS JOIN-카테시안 곱) 방식;
SAS Programming |
%%SAS sas
DATA STATSAS_3;
SET EMP(WHERE=(ename='SCOTT') RENAME=(SAL=SCOTT_SAL));
SET EMP(WHERE=(ename='JAMES') RENAME=(SAL=JAMES_SAL));
DO I=1 TO KOBS;
SET EMP NOBS=KOBS POINT=I;
IF SAL < SCOTT_SAL AND SAL > JAMES_SAL THEN OUTPUT;
END;
RUN;
PROC PRINT data=STATSAS_3(OBS=3);RUN;
Results |
OBS | empno | ename | job | mgr | hiredate | SCOTT_SAL | comm | deptno | JAMES_SAL | sal |
1 | 7499 | ALLEN | SALESMAN | 7698 | 1981-02-20 | 3000 | 300 | 30 | 950 | 1600 |
2 | 7521 | WARD | SALESMAN | 7698 | 1981-02-22 | 3000 | 500 | 30 | 950 | 1250 |
3 | 7566 | JONES | MANAGER | 7839 | 1981-04-02 | 3000 | . | 20 | 950 | 2975 |
- Macro 변수 할당 방식
%%SAS sas
proc sql noprint;
select SAL into :SCOTT_SAL
from EMP
WHERE ename='SCOTT';
quit;
proc sql noprint;
select SAL into :JAMES_SAL
from EMP
WHERE ename='JAMES';
quit;
DATA STATSAS_3;
SET EMP;
IF SAL < &SCOTT_SAL AND SAL > &JAMES_SAL THEN OUTPUT;
RUN;
PROC PRINT data=STATSAS_3(obs=3);RUN;
Results |
OBS | empno | ename | job | mgr | hiredate | sal | comm | deptno |
1 | 7499 | ALLEN | SALESMAN | 7698 | 1981-02-20 | 1600 | 300 | 30 |
2 | 7521 | WARD | SALESMAN | 7698 | 1981-02-22 | 1250 | 500 | 30 |
3 | 7566 | JONES | MANAGER | 7839 | 1981-04-02 | 2975 | . | 20 |
10. Python Dfply Package
- 데이터 필터링
Python Programming |
# [참고] 함수 사용. (52번 예제 참고)
@dfpipe
def pull_fun(df, column=-1):
return df.loc[:, column].item()
emp >> filter_by( X.sal < ( emp >> filter_by(X.ename == 'SCOTT') >> pull_fun('sal')) , \
X.sal > ( emp >> filter_by(X.ename == 'JAMES') >> pull_fun('sal')) )
Results |
empno | ename | job | mgr | hiredate | sal | comm | deptno | |
1 | 7499 | ALLEN | SALESMAN | 7698.0 | 1981/02/20 | 1600 | 300.0 | 30 |
2 | 7521 | WARD | SALESMAN | 7698.0 | 1981/02/22 | 1250 | 500.0 | 30 |
3 | 7566 | JONES | MANAGER | 7839.0 | 1981/04/02 | 2975 | NaN | 20 |
4 | 7654 | MARTIN | SALESMAN | 7698.0 | 1981/09/28 | 1250 | 1400.0 | 30 |
5 | 7698 | BLAKE | MANAGER | 7839.0 | 1981/03/01 | 2850 | NaN | 30 |
6 | 7782 | CLARK | MANAGER | 7839.0 | 1981/01/09 | 2450 | NaN | 10 |
9 | 7844 | TURNER | SALESMAN | 7698.0 | 1981/09/08 | 1500 | 0.0 | 30 |
10 | 7876 | ADAMS | CLERK | 7788.0 | 1983/01/12 | 1100 | NaN | 20 |
13 | 7934 | MILLER | CLERK | 7782.0 | 1982/01/23 | 1300 | NaN | 10 |
[SQL, Pandas, R Prog, Dplyr, SQLDF, PANDASQL, DATA.TABLE] SQL EMP 예제로 만나는 테이블 데이터 처리 방법 리스트
반응형
댓글