APAN 5310: SQL & Relational Databases Fall 2018
Homework Assignment 5
SOLUTIONS
(due 10/9/2018, 6:00 pm ET)
Question 1 (10 points: 1 point for each table, 1 point for correct order, 2 points for design justification)
Assume the following simplified database schema of a site that tries to match graduating students with
jobs. The main theme is that employers create job postings with required skills and students post their
information and skills. Then, students can search for jobs and submit an application. Employers can
search for students, review their information and skills and extend an invitation for an interview.
Provide the SQL statements that create seven (7) tables as shown. Primary keys are underlined.
You will have to define data types, primary/foreign keys and NOT NULL constraints. Provide a brief
explanation, including any assumptions, in the second box below (after the CREATE TABLE statements).
employers (emp_name, emp_address, emp_phone, emp_email)
job_skills (job_id, job_skill )
1
jobs (job_id, emp_name, job_name, start_date, min_gpa, salary, description,
post_date)
student (student_id, student_name, student_email, major, university, gpa,
graduation_date)
student_skills (student_id, student_skill )
2
application (student_id, job_id, date_submitted)
interview (job_id, student_id, interview_date, offer_made )
3
Type the create statements in the order they have to be executed so that there is no error in PostgreSQL.
It is highly recommended that you create a new database in Codio and test your answer.
1
The `job_skills` table holds the skills required for the job, one at a time, as specified by the employer. Hence, for a
row in this table, do not add multiple skills. For example, a job with ID 1 and two necessary skills would be: (1,
“Python”), (1, “SQL”). Note that this is allowed due to the composite primary key.
2
Similar to the `job_skills` table, the `student_skills` table holds the skills a student possesses, one at a time, as
specified by the student. Hence, for a row in this table, do not add multiple skills. For example, a student with ID
‘xyz992’ and three skills would be: (‘xyz992’, “R”), (‘xyz992’, “SQL”), (‘xyz992’, “Machine Learning”). Note that this is
allowed due to the composite primary key.
3
Use this attribute to store whether an offer was made to the candidate and if it was accepted. Hint: restrict values to
a list such as (“no offer”, “offer accepted”, and “offer rejected”).
HW 5 Solutions 1
APAN 5310: SQL & Relational Databases Fall 2018
(type your SQL statements below for creating the tables)
CREATE TABLE employers (
emp_name varchar(100),
emp_address varchar(100),
emp_phone varchar(12),
emp_email varchar(100) UNIQUE NOT NULL,
PRIMARY KEY (emp_name));
CREATE TABLE jobs (
job_id serial,
emp_name varchar(100) NOT NULL,
job_name varchar(250) NOT NULL,
start_date date NOT NULL,
min_gpa numeric(3,2),
salary numeric(8,2),
description text,
post_date timestamp NOT NULL,
PRIMARY KEY (job_id),
FOREIGN KEY (emp_name) REFERENCES employers (emp_name));
CREATE TABLE job_skills (
job_id integer,
job_skill varchar(40),
PRIMARY KEY (job_id, job_skill),
FOREIGN KEY (job_id) REFERENCES jobs (job_id));
CREATE TABLE student (
student_id varchar(12),
student_name varchar(100) NOT NULL,
student_email varchar(40) UNIQUE NOT NULL,
major varchar(40),
university varchar(100),
gpa numeric(3,2),
graduation_date date,
PRIMARY KEY (student_id));
CREATE TABLE student_skills (
student_id varchar(12),
student_skill varchar(40),
PRIMARY KEY (student_id, student_skill),
FOREIGN KEY (student_id) REFERENCES student (student_id));
CREATE TABLE application (
student_id varchar(12),
job_id integer,
date_submitted timestamp NOT NULL,
PRIMARY KEY (student_id, job_id),
FOREIGN KEY (student_id) REFERENCES student (student_id),
FOREIGN KEY (job_id) REFERENCES jobs (job_id));
CREATE TABLE interview (
job_id integer,
student_id varchar(12),
interview_date timestamp NOT NULL,
offer_made varchar(20)
CHECK (offer_made IN ('no offer', 'offer extended',
'offer accepted', 'offer rejected', NULL)),
PRIMARY KEY (job_id, student_id),
FOREIGN KEY (job_id) REFERENCES jobs (job_id),
FOREIGN KEY (student_id) REFERENCES student (student_id));
HW 5 Solutions 2
APAN 5310: SQL & Relational Databases Fall 2018
(type your answers below for your selection of data types, foreign keys, NOT NULL, other assumptions)
Key points:
● _id variables could conceivably be varchar or serial or integer
● NOT NULL constraints may vary according to your assumptions. All reasonable assignments
will be accepted
● Notice that emails have UNIQUE constraints. This ensures that employers and students may
only have distinct email addresses
● Employer phone number may be either integer or varchar. The assumption that we made is
that phone numbers must be stored as XXX-XXX-XXXX. The data type that you select will, of
course, impact how question 4 is answered
● Notice that the “offer made” attribute in the “interview” table has a CHECK constraint to ensure
that entries are restricted to a list of values
Question 2 (2 points)
Draw the ER diagram for the schema detailed in Question 1 using Engineering/Crow’s Foot notation. The
diagram must show cardinalities as per your database design. You may draw the ER diagram in any
software you prefer, Lucidchart is recommended. Hand drawn diagrams will not be accepted. Attach
the ER diagram at the end of this document or upload as a separate file.
HW 5 Solutions 3
APAN 5310: SQL & Relational Databases Fall 2018
NOTE
The SQL queries below are suggested answers to each question, they are not the only accepted
answers. There are several ways to produce the requested result in each question. However, two
queries that are expected to return the same result should do so given the same dataset. In other
words, if you test the answers below on your sample data, they should produce the same result as your
answers. If they do not, then there is probably a problem with your answer. Looking at it the other way,
your answers should return the same result as the answers below given the sample dataset that we are
attaching at the end of this document. This dataset is comprised of mostly random data including rows
that will validate or invalidate queries. Please keep in mind that grading will follow the posted rubric.
Question 3 (1 point)
Provide the SQL statement that returns the job ID, company name, job name, and post date/time of all job
listings that “Jimmy Junior” may be qualified for (meaning, that he has a high-enough GPA and at least
one of his skills is also listed as a skill for the job).
(type your SQL statement below)
SELECT j.job_id, j.emp_name, j.job_name, j.post_date
FROM jobs j
JOIN job_skills js ON j.job_id = js.job_id
JOIN student_skills ss ON js.job_skill = ss.student_skill
JOIN student s ON s.student_id = ss.student_id
WHERE s.student_name = 'Jimmy Junior'
AND j.min_gpa <= s.gpa
AND js.job_skill = ss.student_skill
GROUP BY j.job_id, j.emp_name, j.job_name, j.post_date;
Using the attached sample data, the result will be:
HW 5 Solutions 4
APAN 5310: SQL & Relational Databases Fall 2018
Question 4 (1 point)
Provide the SQL statement that returns the job ID, company name, and job name of any job posted by a
company in New York during April 2018. (Hint: use the first 3 digits of the phone number to locate
companies in NY. Limit search to area codes ‘212’, ‘646’ and ‘718’)
(type your SQL statement below)
SELECT j.job_id, e.emp_name, j.job_name
FROM employers e
JOIN jobs j ON e.emp_name = j.emp_name
WHERE SUBSTRING (e.emp_phone, 1, 3) IN ('212', '646', '718')
AND EXTRACT (MONTH FROM j.post_date) = 4
AND EXTRACT (YEAR FROM j.post_date) = 2018;
Using the attached sample data, the result will be:
If you stored phone numbers as integers, the function to extract the first 3 digits is LEFT().
Question 5 (1 point)
Provide the SQL statement that returns the job ID and job name of the job that received the most
applications.
(type your SQL statement below)
WITH job_applications (job_id, app_count) AS (
SELECT j.job_id, COUNT(*)
FROM application a
JOIN jobs j ON j.job_id = a.job_id
GROUP BY j.job_id)
SELECT j.job_id, j.job_name
FROM jobs j
JOIN job_applications ja ON j.job_id = ja.job_id
WHERE ja.app_count = (SELECT MAX(app_count) FROM job_applications);
Using the attached sample data, the result will be:
HW 5 Solutions 5
APAN 5310: SQL & Relational Databases Fall 2018
Question 6 (1 point)
Provide the SQL statement that returns the name of any student who did more than three (3) interviews
without getting a single offer.
(type your SQL statement below)
WITH interview_offers (student_id, interview_count, offer_count) AS (
SELECT s.student_id, COUNT(*),
COUNT(CASE WHEN i.offer_made IN ('offer accepted', 'offer extended') THEN 1 END)
FROM interview i
JOIN student s ON i.student_id = s.student_id
GROUP BY s.student_id)
SELECT s.student_name
FROM student s
JOIN interview_offers io ON s.student_id = io.student_id
WHERE io.interview_count >= 3
AND io.offer_count = 0;
Using the attached sample data, the result will be “Justin Daniels”. Both “>= 3” and “> 3” are accepted for
the interview count.
Question 7 (1 point)
Provide the SQL statement that returns names of all students who have skills in both “SQL” and “Machine
Learning”.
(type your SQL statement below)
SELECT s.student_name
FROM student s, student_skills ss1, student_skills ss2
WHERE ss1.student_skill = 'Machine Learning'
AND ss2.student_skill = 'SQL'
AND s.student_id = ss1.student_id
AND ss1.student_id = ss2.student_id;
Using the attached sample data, the result will be “Tamera Joisce”.
HW 5 Solutions 6
APAN 5310: SQL & Relational Databases Fall 2018
Question 8 (1 point)
Provide the SQL statement that for each skill (either job skill or student skill) it returns the number of jobs
looking for it and the number of students having it.
(type your SQL statement below)
SELECT tt.job_skill AS skill,
(SELECT COUNT(*)
FROM job_skills js
WHERE js.job_skill = tt.job_skill) AS js_count,
(SELECT COUNT(*)
FROM student_skills ss
WHERE ss.student_skill = tt.job_skill) AS ss_count
FROM ((SELECT job_skill FROM job_skills AS js) UNION
(SELECT student_skill FROM student_skills AS ss)) AS tt
GROUP BY tt.job_skill;
Using the attached sample data, the result will be:
HW 5 Solutions 7
APAN 5310: SQL & Relational Databases Fall 2018
Question 9 (1 point)
Provide the SQL statement that returns the name of the skill that has the highest ratio of jobs looking for
that skill versus students having that skill. (Hints: return skills with at least one student having it, to avoid
division by zero, also consider using WITH or a temporary table or a VIEW in conjunction with a SELECT
on that)
(type your SQL statement below)
WITH skill_ratio (skill, ratio) AS (
SELECT ss.student_skill,
COUNT(DISTINCT js.job_id)/COUNT(DISTINCT ss.student_id)::float
FROM student_skills ss
LEFT OUTER JOIN job_skills js ON ss.student_skill = js.job_skill
GROUP BY ss.student_skill)
SELECT skill
FROM skill_ratio
WHERE ratio = (SELECT MAX(ratio) FROM skill_ratio);
Using the attached sample data, the result will be “NoSQL”.
Question 10 (1 point)
Provide the SQL statement that returns the skill that resulted in the highest average salary among all
accepted job offers in 2013. (Hints: we are looking for the student skill resulting in the highest pay, not the
job skill resulting in the highest pay, also consider using WITH or a temporary table or a VIEW in
conjunction with a SELECT on that)
(type your SQL statement below)
WITH skill_pay (student_skill, avg_pay) AS (
SELECT ss.student_skill, AVG(j.salary)
FROM jobs j
JOIN interview i ON j.job_id = i.job_id
JOIN student_skills ss ON i.student_id = ss.student_id
WHERE i.offer_made = 'offer accepted'
AND EXTRACT (YEAR FROM i.interview_date) = 2018
GROUP BY ss.student_skill)
SELECT student_skill, avg_pay
FROM skill_pay
WHERE avg_pay = (SELECT MAX(avg_pay) FROM skill_pay);
This question actually had a typo, the year was meant to be “2018” instead of 2013. For consistency with
the sample data, the solution above shows “2018”. Your answers will be graded with us changing 2013 to
2018 and checking against the sample data. Given the above, the result will be:
HW 5 Solutions 8
APAN 5310: SQL & Relational Databases Fall 2018
Sample Data
You may use this data with the provided create table queries in Q1 to test your answers.
INSERT INTO employers (emp_name, emp_address, emp_phone, emp_email)
VALUES ('ACME Inc.', '123 Nowhere Street', '212-903-5645',
'acme@acme.com'),
('Oberbrunner Inc', '942 Roth Crossing', '531-883-7038',
'info@oberbrunner.com'),
('ABC Technologies LLC', '34 Broadway', '989-342-7156',
'info@abctech.com');
INSERT INTO jobs (emp_name, job_name, start_date,
min_gpa, salary, description, post_date)
VALUES ('ACME Inc.', 'Data Scientist', '2018-06-23',
3.00, 80000, 'long job description...', '2018-04-04 10:30:44'),
('ACME Inc.', 'Analyst', '2018-06-01',
3.10, 70000, 'long job description...', '2018-03-05 11:01:23'),
('Oberbrunner Inc', 'Data Engineer', '2018-07-07',
3.00, 85000, 'long job description...', '2018-04-23 15:14:59'),
('ABC Technologies LLC', 'Financial Analyst', '2010-07-15',
3.00, 78000, 'long job description...', '2018-05-25 10:11:34');
INSERT INTO job_skills (job_id, job_skill)
VALUES (1, 'SQL'),
(1, 'Python'),
(2, 'SQL'),
(2, 'Machine Learning'),
(2, 'NoSQL'),
(3, 'RDBMS'),
(3, 'NoSQL'),
(3, 'Cloud Systems'),
(4, 'Risk Analysis'),
(4, 'Portfolio Management');
INSERT INTO student (student_id, student_name, student_email,
major, university, gpa, graduation_date)
VALUES ('abc123', 'Reynard Chateau', 'rchateau@uni.edu',
'Data Science', 'Best University', 3.55, '2018-05-19'),
('def456', 'Tamera Joisce', 'tjoisce@uni.edu',
'Computer Science', 'Best University', 3.74, '2018-05-19'),
('xyz987', 'Jimmy Junior', 'jjunior@uni.edu',
'Computer Science', 'Best University', 3.60, '2018-05-19'),
('klm678', 'Susan Right', 'sright@uni.edu',
'Statistics', 'Best University', 3.95, '2018-05-19'),
('vbn823', 'Justin Daniels', 'jdaniels@uni.edu',
'History', 'Best University', 3.51, '2018-08-31');
INSERT INTO student_skills (student_id, student_skill)
VALUES ('abc123', 'SQL'),
('abc123', 'Python'),
('def456', 'SQL'),
('def456', 'Machine Learning'),
HW 5 Solutions 9
APAN 5310: SQL & Relational Databases Fall 2018
('xyz987', 'Machine Learning'),
('xyz987', 'NoSQL'),
('xyz987', 'RDBMS'),
('xyz987', 'Cloud Systems'),
('klm678', 'Machine Learning'),
('klm678', 'Probabilities'),
('vbn823', 'Writing'),
('vbn823', 'Research'),
('vbn823', 'Communication');
INSERT INTO application (student_id, job_id, date_submitted)
VALUES ('abc123', 1, '2018-04-10 18:04:01'),
('def456', 2, '2018-04-29 12:38:34'),
('klm678', 1, '2018-05-05 11:08:05'),
('vbn823', 1, '2018-05-06 14:01:45'),
('vbn823', 2, '2018-05-06 14:23:14'),
('vbn823', 3, '2018-05-06 14:48:03');
INSERT INTO interview (job_id, student_id, interview_date, offer_made)
VALUES (1, 'abc123', '2018-05-10 11:00:00', 'offer accepted'),
(2, 'def456', '2018-05-14 12:00:00', 'offer accepted'),
(3, 'xyz987', '2018-06-22 15:00:00', 'offer extended'),
(1, 'vbn823', '2018-05-20 16:00:00', 'no offer'),
(2, 'vbn823', '2018-05-22 09:00:00', 'no offer'),
(3, 'vbn823', '2018-05-20 17:00:00', 'no offer'),
(4, 'vbn823', '2018-05-24 13:00:00', 'no offer');
版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。