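Exercise 1: Department Highest Salary (LeetCode 184, difficulty: Medium)
[LeetCode] 184. Department Highest Salary
Create the Employee table, which holds all employees. Each employee has an Id, a salary, and a department Id.
Then insert the data: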
    # Exercise 1
    USE autumn;
    CREATE TABLE Employee (
        Id INTEGER NOT NULL,
        Name VARCHAR(100) NOT NULL,
        Salary INTEGER NOT NULL,
        DepartmentId INTEGER NOT NULL,
        PRIMARY KEY (Id)
    );
    # DESC Employee;
    INSERT INTO Employee VALUES (1, 'Joe', 70000, 1);
    INSERT INTO Employee VALUES (2, 'Henry', 80000, 2);
    INSERT INTO Employee VALUES (3, 'Sam', 60000, 2);
    INSERT INTO Employee VALUES (4, 'Max', 90000, 1);
    SELECT * FROM Employee;
+----+-------+--------+--------------+
| Id | Name  | Salary | DepartmentId |
+----+-------+--------+--------------+
| 1  | Joe   | 70000  | 1            |
| 2  | Henry | 80000  | 2            |
| 3  | Sam   | 60000  | 2            |
| 4  | Max   | 90000  | 1            |
+----+-------+--------+--------------+
Create the Department table, which holds all departments of the company.
    # Create the department table
    CREATE TABLE Department (
        Id INTEGER NOT NULL,
        Name VARCHAR(100) NOT NULL,
        PRIMARY KEY (Id)
    );
    INSERT INTO Department VALUES (1, 'IT');
    INSERT INTO Department VALUES (2, 'Sales');
+----+----------+
| Id | Name     |
+----+----------+
| 1  | IT       |
| 2  | Sales    |
+----+----------+
Write a SQL query to find the employee with the highest salary in each department. For the tables above, Max has the highest salary in the IT department and Henry has the highest salary in the Sales department.
+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT         | Max      | 90000  |
| Sales      | Henry    | 80000  |
+------------+----------+--------+
(1) If you reach for a window function, note that the following is wrong, because a window function does not reduce the number of rows.
    SELECT Department,
           Name AS Employee,
           MAX(Salary) OVER (PARTITION BY Department) AS Salary
    FROM (
        SELECT b.Name AS Department, a.Name, a.Salary
        FROM Employee AS a
        INNER JOIN Department AS b ON a.DepartmentId = b.Id
    ) AS F;
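The query above still returns four rows, one per employee, and each row carries the maximum salary of its department rather than the employee's own salary.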
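To make the row-count point concrete, here is a minimal sketch that runs the same kind of window query through Python's built-in sqlite3 module. SQLite stands in for MySQL only so that the example runs without a server, and it assumes SQLite 3.25 or later for window-function support. The helper name `window_max_rows` is hypothetical.

```python
import sqlite3


def window_max_rows():
    """Run MAX(...) OVER (PARTITION BY ...) and return every row it produces."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Name TEXT,
                               Salary INTEGER, DepartmentId INTEGER);
        CREATE TABLE Department (Id INTEGER PRIMARY KEY, Name TEXT);
        INSERT INTO Employee VALUES
            (1, 'Joe', 70000, 1), (2, 'Henry', 80000, 2),
            (3, 'Sam', 60000, 2), (4, 'Max', 90000, 1);
        INSERT INTO Department VALUES (1, 'IT'), (2, 'Sales');
    """)
    rows = conn.execute("""
        SELECT b.Name AS Department,
               a.Name AS Employee,
               MAX(a.Salary) OVER (PARTITION BY b.Name) AS Salary
        FROM Employee AS a
        INNER JOIN Department AS b ON a.DepartmentId = b.Id
        ORDER BY a.Id
    """).fetchall()
    conn.close()
    return rows


for row in window_max_rows():
    print(row)
```

All four employees come back. Joe and Sam are listed with their department's maximum instead of their own salary, which is why this query does not answer the question.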
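So we go back to a subquery. The version below looks plausible, but its WHERE clause is wrong. The commented-out variant fails because of evaluation order: WHERE is evaluated before window functions, so a window function cannot appear there. The subquery variant fails because the derived-table alias F cannot be queried again inside another FROM, and because the grouped subquery returns one row per department, which cannot be compared with =:
    # Exercise 1 (incorrect attempt)
    SELECT Department, Name AS Employee, Salary
    FROM (
        SELECT b.Name AS Department, a.Name, a.Salary
        FROM Employee AS a
        INNER JOIN Department AS b ON a.DepartmentId = b.Id
    ) AS F
    # GROUP BY Department,
    # WHERE Salary = (MAX(F.Salary) OVER (PARTITION BY F.Department));
    WHERE Salary = (SELECT MAX(F.Salary) FROM F GROUP BY F.Department);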
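(2) Let's analyze the problem again:
The employee table has the department id but not the department name; the department table has the department name.
Because we want every employee, LEFT JOIN the employee table to the department table. Then compute the highest salary in each department in a subquery and use it in WHERE. Note in particular that the column used in the subquery's GROUP BY must also appear in its SELECT list, so that (DepartmentId, Salary) pairs can be matched.
PS: Employee appears here both as a table name and as an output column name; don't let that confuse you.
    # Exercise 1
    SELECT department.Name AS Department,
           employee.Name AS Employee,
           Salary
    FROM employee
    LEFT JOIN department ON employee.DepartmentId = department.Id
    WHERE (employee.DepartmentId, Salary) IN (
        SELECT DepartmentId, MAX(Salary)
        FROM employee
        GROUP BY DepartmentId
    );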
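A window function can still solve this exercise if the filtering happens in an outer query, after the window has been computed. The sketch below is one possible formulation, not the method used above. It runs through Python's sqlite3 (assuming SQLite 3.25+), and the names `highest_paid` and `rnk` are hypothetical.

```python
import sqlite3


def highest_paid():
    """Rank salaries inside each department, then keep rank 1 in an outer query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Name TEXT,
                               Salary INTEGER, DepartmentId INTEGER);
        CREATE TABLE Department (Id INTEGER PRIMARY KEY, Name TEXT);
        INSERT INTO Employee VALUES
            (1, 'Joe', 70000, 1), (2, 'Henry', 80000, 2),
            (3, 'Sam', 60000, 2), (4, 'Max', 90000, 1);
        INSERT INTO Department VALUES (1, 'IT'), (2, 'Sales');
    """)
    rows = conn.execute("""
        SELECT Department, Employee, Salary
        FROM (
            SELECT d.Name AS Department,
                   e.Name AS Employee,
                   e.Salary,
                   -- RANK gives 1 to every employee tied for the top salary
                   RANK() OVER (PARTITION BY e.DepartmentId
                                ORDER BY e.Salary DESC) AS rnk
            FROM Employee AS e
            INNER JOIN Department AS d ON e.DepartmentId = d.Id
        ) AS ranked
        WHERE rnk = 1
        ORDER BY Department
    """).fetchall()
    conn.close()
    return rows


print(highest_paid())
```

The window function only labels rows; the reduction to one row per department comes from the outer WHERE.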
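Exercise 2: Exchange Seats (LeetCode 626, difficulty: Medium)
[LeetCode] 626. Exchange Seats
Xiaomei is an IT teacher at a middle school. She has a seat table that stores students' names and their corresponding seat ids.
The id column increases consecutively.
Xiaomei wants to swap the seats of each pair of adjacent students.
Can you write a SQL query that outputs the result she wants?
Create the seat table shown below:
Example: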
+---------+---------+
| id      | student |
+---------+---------+
| 1       | Abbot   |
| 2       | Doris   |
| 3       | Emerson |
| 4       | Green   |
| 5       | Jeames  |
+---------+---------+
The statements that create the table above and insert its data:
    # Exercise 2
    # Create the table
    USE autumn;
    CREATE TABLE seat (
        id INTEGER NOT NULL,
        student VARCHAR(100) NOT NULL,
        PRIMARY KEY (id)
    );
    # Insert the data
    INSERT INTO seat VALUES (1, 'Abbot');
    INSERT INTO seat VALUES (2, 'Doris');
    INSERT INTO seat VALUES (3, 'Emerson');
    INSERT INTO seat VALUES (4, 'Green');
    INSERT INTO seat VALUES (5, 'Jeames');
If the input is the table above, the output is:
+---------+---------+
| id      | student |
+---------+---------+
| 1       | Doris   |
| 2       | Abbot   |
| 3       | Green   |
| 4       | Emerson |
| 5       | Jeames  |
+---------+---------+
Note:
If the number of students is odd, the last student's seat does not change.
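Method 1:
Because the problem states that id increases consecutively, the id of the last student equals the total number of seats. There are two ways to count that total:
    # Counting seats, option 1
    SELECT COUNT(*) AS counts FROM seat;
    # Counting seats, option 2
    SELECT COUNT(DISTINCT id) FROM seat;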
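Either count can serve as the scalar subquery in the full query; just make sure the keyword SELECT is spelled correctly inside it. The complete SQL:
    SELECT
        IF(id % 2 = 0,
           id - 1,
           # IF(id = (SELECT COUNT(*) AS counts FROM seat),  # equivalent alternative
           IF(id = (SELECT COUNT(DISTINCT id) FROM seat),    # last row of an odd-sized table
              id,
              id + 1)) AS id,
        student
    FROM seat
    ORDER BY id;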
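As a quick check of the swapping rule on both odd and even table sizes, here is a small sketch run through Python's sqlite3. It uses CASE instead of MySQL's IF() for portability, and COUNT(*) for the seat total. The helper `swap_seats` and the alias `new_id` are hypothetical names.

```python
import sqlite3


def swap_seats(students):
    """Seat the students at ids 1..n, then swap each adjacent pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE seat (id INTEGER PRIMARY KEY, student TEXT NOT NULL)")
    conn.executemany("INSERT INTO seat VALUES (?, ?)",
                     list(enumerate(students, start=1)))
    rows = conn.execute("""
        SELECT CASE
                   WHEN id % 2 = 0 THEN id - 1                    -- even id moves down
                   WHEN id = (SELECT COUNT(*) FROM seat) THEN id  -- odd last seat stays
                   ELSE id + 1                                    -- odd id moves up
               END AS new_id,
               student
        FROM seat
        ORDER BY new_id
    """).fetchall()
    conn.close()
    return rows


print(swap_seats(["Abbot", "Doris", "Emerson", "Green", "Jeames"]))
print(swap_seats(["Abbot", "Doris", "Emerson", "Green"]))
```

With five students the last one keeps seat 5; with four students every seat is swapped.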
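Exercise 3: Rank Scores (LeetCode 178, difficulty: Medium)
Suppose that in a final exam the four classes of grade two have average scores of 93, 93, 93, and 91. How many different rankings can be produced? Which function does each one use? What does each ranking look like? (Consider descending order only.)
+-------+-----------+
| class | score_avg |
+-------+-----------+
| 1     | 93        |
| 2     | 93        |
| 3     | 93        |
| 4     | 91        |
+-------+-----------+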
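(1) Ranking 1: equal scores share the same rank, and the rank after a tie is the next consecutive integer, so there are no gaps between ranks. This uses a window function; because all rows are ranked together, PARTITION BY is not needed.
    SELECT Score,
           DENSE_RANK() OVER (ORDER BY Score DESC) AS 'Rank'
    FROM Scores;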
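For the four class averages in the question, the three standard ranking window functions can be compared side by side. This sketch goes slightly beyond the text, which only shows DENSE_RANK. It runs through Python's sqlite3 (assuming SQLite 3.25+), and the table name `class_scores` and helper `rank_classes` are hypothetical.

```python
import sqlite3


def rank_classes():
    """Rank the class averages with RANK, DENSE_RANK and ROW_NUMBER."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE class_scores (class INTEGER PRIMARY KEY, score_avg INTEGER)")
    conn.executemany("INSERT INTO class_scores VALUES (?, ?)",
                     [(1, 93), (2, 93), (3, 93), (4, 91)])
    rows = conn.execute("""
        SELECT class,
               score_avg,
               RANK()       OVER (ORDER BY score_avg DESC)        AS rnk,
               DENSE_RANK() OVER (ORDER BY score_avg DESC)        AS dense_rnk,
               -- class is a tie-breaker so the numbering is deterministic
               ROW_NUMBER() OVER (ORDER BY score_avg DESC, class) AS row_num
        FROM class_scores
        ORDER BY class
    """).fetchall()
    conn.close()
    return rows


for row in rank_classes():
    print(row)
```

RANK gives 1, 1, 1, 4 (ties share a rank and leave a gap), DENSE_RANK gives 1, 1, 1, 2 (ties share a rank with no gap), and ROW_NUMBER gives 1, 2, 3, 4 (every row gets its own number).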
Exercise 4: Consecutive Numbers (LeetCode 180, difficulty: Medium)
[LeetCode] 180. Consecutive Numbers
+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| id          | int     |
| num         | varchar |
+-------------+---------+
id is the primary key of this table.
Write a SQL query to find all numbers that appear at least three times consecutively.
An example of the query result:
Logs table:
+----+-----+
| Id | Num |
+----+-----+
| 1  | 1   |
| 2  | 1   |
| 3  | 1   |
| 4  | 2   |
| 5  | 1   |
| 6  | 2   |
| 7  | 2   |
+----+-----+
Result table:
+-----------------+
| ConsecutiveNums |
+-----------------+
| 1               |
+-----------------+
1 is the only number that appears at least three times in a row.
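Method 1: self-join
Think of this as making several identical copies of one table. Because we are looking for numbers that appear three or more times in a row, we "copy" the table three times.
    # Write your MySQL query statement below
    SELECT DISTINCT a.Num AS ConsecutiveNums
    FROM Logs AS a, Logs AS b, Logs AS c
    WHERE a.Id = b.Id - 1
      AND b.Id = c.Id - 1
      AND a.Num = b.Num
      AND b.Num = c.Num;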
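Method 2: window functions
For this approach, see the write-up by 猴子 (Houzi): "Pinduoduo interview question: how do you find content that appears N times in a row?"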
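One common way to express the window-function idea is to look ahead with LEAD and compare each row with the next two. The sketch below is an assumption about how such a solution can look, not a reproduction of the referenced article. It runs through Python's sqlite3 (assuming SQLite 3.25+), and the helper `consecutive_nums` and aliases `next1` and `next2` are hypothetical.

```python
import sqlite3


def consecutive_nums():
    """Return every Num that occurs in at least three consecutive rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Logs (Id INTEGER PRIMARY KEY, Num TEXT)")
    conn.executemany("INSERT INTO Logs VALUES (?, ?)",
                     [(1, "1"), (2, "1"), (3, "1"), (4, "2"),
                      (5, "1"), (6, "2"), (7, "2")])
    rows = conn.execute("""
        SELECT DISTINCT Num AS ConsecutiveNums
        FROM (
            SELECT Num,
                   LEAD(Num, 1) OVER (ORDER BY Id) AS next1,  -- value one row later
                   LEAD(Num, 2) OVER (ORDER BY Id) AS next2   -- value two rows later
            FROM Logs
        ) AS t
        WHERE Num = next1 AND Num = next2
    """).fetchall()
    conn.close()
    return [r[0] for r in rows]


print(consecutive_nums())
```

Unlike the self-join on Id = Id - 1, this version depends only on the order of Id, so it still works if the ids have gaps.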
LeetCode 603, Consecutive Available Seats, is very similar:
    # self-join
    SELECT DISTINCT a.seat_id
    FROM cinema AS a
    JOIN cinema AS b
      ON ABS(a.seat_id - b.seat_id) = 1
     AND a.free = true
     AND b.free = true
    ORDER BY a.seat_id;
Exercise 5: Tree Node (LeetCode 608, difficulty: Medium)
In the tree table, id identifies a tree node and p_id is the id of its parent node.
+----+------+
| id | p_id |
+----+------+
| 1  | null |
| 2  | 1    |
| 3  | 1    |
| 4  | 2    |
| 5  | 2    |
+----+------+
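Each node is one of the following three types:
- Root: the node is the root of the tree.
- Leaf: the node is a leaf.
- Inner: the node is neither the root nor a leaf.
Write a query that prints each node id and its type, sorted by node id. For the example above the result should be:
+----+-------+
| id | Type  |
+----+-------+
| 1  | Root  |
| 2  | Inner |
| 3  | Leaf  |
| 4  | Leaf  |
| 5  | Leaf  |
+----+-------+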
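Explanation
- Node '1' is the root, because its parent is NULL; it has two children, '2' and '3'.
- Node '2' is an inner node, because its parent is '1' and it has children '4' and '5'.
- Nodes '3', '4', and '5' are leaves, because they have a parent but no children.
Here is a picture of the tree:
        1
       / \
      2   3
     / \
    4   5
Note
If the tree has only one node, output only the root type for it.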
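Method 1: CASE
The task is to classify each node:
(1) If p_id is NULL in the tree table, the node must be the root.
(2) If a node's id appears in the p_id column of the tree table, it is some node's parent, so (not being the root) it must be an Inner node. The condition is: WHEN id IN (SELECT p_id FROM tree WHERE p_id IS NOT NULL) THEN 'Inner'. Note that the test is on tree.id, not tree.p_id, and that there are no commas between the WHEN branches.
(3) Every remaining node is a Leaf.
    SELECT id,
           CASE
               WHEN p_id IS NULL THEN 'Root'
               WHEN id IN (SELECT p_id FROM tree WHERE p_id IS NOT NULL) THEN 'Inner'
               ELSE 'Leaf'
           END AS Type
    FROM tree
    ORDER BY id;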
Method 2: IF
    # Method 2: IF
    SELECT id,
           IF(ISNULL(p_id), 'Root',
              IF(id IN (SELECT p_id FROM tree WHERE p_id IS NOT NULL), 'Inner', 'Leaf')) AS Type
    FROM tree
    ORDER BY id;
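The classification rules can be checked on both the sample tree and the single-node case mentioned in the note. This is a minimal sketch run through Python's sqlite3, using the CASE form because SQLite has no IF() function in older versions. The helper name `classify_nodes` is hypothetical.

```python
import sqlite3


def classify_nodes(nodes):
    """nodes is a list of (id, p_id) pairs; returns (id, type) pairs sorted by id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tree (id INTEGER PRIMARY KEY, p_id INTEGER)")
    conn.executemany("INSERT INTO tree VALUES (?, ?)", nodes)
    rows = conn.execute("""
        SELECT id,
               CASE
                   WHEN p_id IS NULL THEN 'Root'      -- no parent
                   WHEN id IN (SELECT p_id FROM tree
                               WHERE p_id IS NOT NULL) THEN 'Inner'  -- has a child
                   ELSE 'Leaf'                        -- has a parent, no child
               END AS Type
        FROM tree
        ORDER BY id
    """).fetchall()
    conn.close()
    return rows


print(classify_nodes([(1, None), (2, 1), (3, 1), (4, 2), (5, 2)]))
print(classify_nodes([(1, None)]))
```

The order of the WHEN branches matters: the root also appears in the p_id column, so the NULL test has to come first.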
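Exercise 6: Managers with at Least 5 Direct Reports (LeetCode 570, difficulty: Medium)
The Employee table holds all employees together with their managers. Every employee has an Id and a column with the Id of their manager (ManagerId).
+------+----------+-----------+----------+
|Id    |Name      |Department |ManagerId |
+------+----------+-----------+----------+
|101   |John      |A          |null      |
|102   |Dan       |A          |101       |
|103   |James     |A          |101       |
|104   |Amy       |A          |101       |
|105   |Anne      |A          |101       |
|106   |Ron       |B          |101       |
+------+----------+-----------+----------+
For the Employee table, write a SQL statement that finds the managers with at least 5 direct reports. For the table above, the result should be:
+-------+
| Name  |
+-------+
| John  |
+-------+
Note:
No one reports to themselves.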
Method 1: correlated subquery (pay attention here)
Recall our earlier example of a correlated subquery:
    SELECT product_type, product_name, sale_price
    FROM product AS p1
    WHERE sale_price > (
        SELECT AVG(sale_price)
        FROM product AS p2
        WHERE p1.product_type = p2.product_type
        GROUP BY product_type
    );
Our approach here is similar: the inner and outer queries are linked by the condition b.ManagerId = a.Id, and the outer query keeps the rows for which the count of b.Id is at least 5.
    SELECT a.Name
    FROM Employee2 AS a
    WHERE 5 <= (
        SELECT COUNT(b.Id)
        FROM Employee2 AS b
        WHERE b.ManagerId = a.Id
    );
Method 2:
A faster approach: group by managerId and use HAVING to keep the ids of managers with at least 5 reports.
    # Write your MySQL query statement below
    SELECT name
    FROM employee
    WHERE id IN (
        SELECT managerId
        FROM employee
        GROUP BY managerId
        HAVING COUNT(managerId) >= 5
    );
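Both methods should agree on the sample data. The sketch below runs them side by side through Python's sqlite3, with the threshold passed as a bound parameter. The helper name `managers_with_reports` and its `min_reports` parameter are hypothetical, and a single table named Employee is used for both queries.

```python
import sqlite3


def managers_with_reports(min_reports=5):
    """Return (correlated-subquery result, GROUP BY/HAVING result)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Name TEXT,
                                           Department TEXT, ManagerId INTEGER)""")
    conn.executemany("INSERT INTO Employee VALUES (?, ?, ?, ?)", [
        (101, "John", "A", None), (102, "Dan", "A", 101), (103, "James", "A", 101),
        (104, "Amy", "A", 101), (105, "Anne", "A", 101), (106, "Ron", "B", 101),
    ])
    # Method 1: count the reports of each employee in a correlated subquery.
    correlated = conn.execute("""
        SELECT a.Name
        FROM Employee AS a
        WHERE ? <= (SELECT COUNT(b.Id) FROM Employee AS b WHERE b.ManagerId = a.Id)
        ORDER BY a.Name
    """, (min_reports,)).fetchall()
    # Method 2: group once by ManagerId and filter the groups with HAVING.
    grouped = conn.execute("""
        SELECT Name
        FROM Employee
        WHERE Id IN (SELECT ManagerId FROM Employee
                     GROUP BY ManagerId HAVING COUNT(ManagerId) >= ?)
        ORDER BY Name
    """, (min_reports,)).fetchall()
    conn.close()
    return [r[0] for r in correlated], [r[0] for r in grouped]


print(managers_with_reports(5))
```

The grouped version scans the table once for the counts, while the correlated version conceptually recounts for every outer row, which is the usual reason the second method runs faster.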
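Exercise 7: Get Highest Answer Rate Question (LeetCode 578, difficulty: Medium)
Find the question with the highest answer rate in the survey_log table. The table has these columns: uid, action, question_id, answer_id, q_num, timestamp.
uid is the user id. action takes one of the values "show", "answer", or "skip". When action is "answer", answer_id is not null; when action is "show" or "skip", answer_id is null. q_num is the sequence number of the question.
Write a SQL statement that finds the question_id with the highest answer rate.
Example:
Input
Explanation
Question 285 has an answer rate of 1/1, while question 369 has an answer rate of 0/1, so the output is 285.
Note:
The highest answer rate means the proportion of answers among the times the same question was shown.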
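Method 1:
First make sure you understand the problem. Think of a quiz: every question that comes up is recorded with a show action, and the player can then answer or skip it. The task is to compute the answer rate of each question and sort by it. The explanation in the problem says the same: question 285 has an answer rate of 1/1 and question 369 has 0/1, so the output is 285. So we first need to find the numbers behind 1/1 and 0/1:
    SELECT question_id,
           SUM(CASE WHEN action = 'answer' THEN 1 ELSE 0 END) AS num_answers,
           SUM(CASE WHEN action = 'show' THEN 1 ELSE 0 END) AS num_shows
    FROM survey_log
    GROUP BY question_id;
For the example this gives num_answers = 1 and num_shows = 1 for question 285, and num_answers = 0 and num_shows = 1 for question 369.
Based on that table, sort by (num_answers / num_shows) and take the largest value:
    SELECT question_id
    FROM (
        SELECT question_id,
               SUM(CASE WHEN action = 'answer' THEN 1 ELSE 0 END) AS num_answers,
               SUM(CASE WHEN action = 'show' THEN 1 ELSE 0 END) AS num_shows
        FROM survey_log
        GROUP BY question_id
    ) AS tbl
    ORDER BY (num_answers / num_shows) DESC
    LIMIT 1;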
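Note: the LIMIT 1 at the end keeps only the first row after sorting, which is the question with the highest rate. In a plain lookup without grouping or sorting, LIMIT 1 also lets the engine stop scanning as soon as one matching row is found, which helps when the result has one row (or none) and the statement would otherwise scan the whole table. Here the grouping and the ORDER BY still have to process every row before LIMIT applies.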
Method 2:
    SELECT question_id
    FROM survey_log
    GROUP BY question_id
    # COUNT ignores NULL, so only the 'show' rows are counted in the denominator
    ORDER BY COUNT(answer_id) / COUNT(IF(action = 'show', 1, NULL)) DESC
    LIMIT 1;
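A detail worth checking when counting conditionally: COUNT counts every non-NULL value, so COUNT over an expression that yields 0 counts those rows too, while SUM over 1/0 or COUNT over 1/NULL counts only the matches. The sketch below illustrates this through Python's sqlite3. The four rows of survey_log are assumed sample data, chosen only to be consistent with the 1/1 and 0/1 rates in the explanation; the helper names are hypothetical.

```python
import sqlite3

# Assumed sample rows: (uid, action, question_id, answer_id, q_num, timestamp)
SAMPLE_LOG = [
    (5, "show", 285, None, 1, 123),
    (5, "answer", 285, 124124, 1, 124),
    (5, "show", 369, None, 2, 125),
    (5, "skip", 369, None, 2, 126),
]


def _connect():
    conn = sqlite3.connect(":memory:")
    conn.execute("""CREATE TABLE survey_log (uid INTEGER, action TEXT, question_id INTEGER,
                                             answer_id INTEGER, q_num INTEGER, timestamp INTEGER)""")
    conn.executemany("INSERT INTO survey_log VALUES (?, ?, ?, ?, ?, ?)", SAMPLE_LOG)
    return conn


def show_counts():
    """Compare three ways of counting the 'show' rows per question."""
    conn = _connect()
    rows = conn.execute("""
        SELECT question_id,
               SUM(CASE WHEN action = 'show' THEN 1 ELSE 0 END)   AS sum_flag,
               COUNT(CASE WHEN action = 'show' THEN 1 END)        AS count_null,
               COUNT(CASE WHEN action = 'show' THEN 1 ELSE 0 END) AS count_zero
        FROM survey_log
        GROUP BY question_id
        ORDER BY question_id
    """).fetchall()
    conn.close()
    return rows


def best_question():
    """Question with the highest answers/shows ratio."""
    conn = _connect()
    row = conn.execute("""
        SELECT question_id
        FROM survey_log
        GROUP BY question_id
        -- 1.0 * forces real division; SQLite would otherwise divide integers
        ORDER BY 1.0 * COUNT(answer_id) / COUNT(CASE WHEN action = 'show' THEN 1 END) DESC
        LIMIT 1
    """).fetchone()
    conn.close()
    return row[0]


print(show_counts())
print(best_question())
```

In the first query, `count_zero` reports 2 for each question because it counts every row of the group, whereas `sum_flag` and `count_null` both report the single show.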
Exercise 8: Department Top Three Salaries (LeetCode 185, difficulty: Hard)
Empty the employee table from Exercise 1 and insert the following data (or copy the employee table from Exercise 1 and add rows 5 and 6):
+----+-------+--------+--------------+
| Id | Name  | Salary | DepartmentId |
+----+-------+--------+--------------+
| 1  | Joe   | 70000  | 1            |
| 2  | Henry | 80000  | 2            |
| 3  | Sam   | 60000  | 2            |
| 4  | Max   | 90000  | 1            |
| 5  | Janet | 69000  | 1            |
| 6  | Randy | 85000  | 1            |
+----+-------+--------+--------------+
As in Exercise 1, there is also a Department table that holds all departments of the company.
    # Create the department table
    CREATE TABLE Department (
        Id INTEGER NOT NULL,
        Name VARCHAR(100) NOT NULL,
        PRIMARY KEY (Id)
    );
    INSERT INTO Department VALUES (1, 'IT');
    INSERT INTO Department VALUES (2, 'Sales');
+----+----------+
| Id | Name     |
+----+----------+
| 1  | IT       |
| 2  | Sales    |
+----+----------+
[Task] Write a SQL query to find the employees who earn the top three salaries in each department. For the tables above, the query should return:
+------------+----------+--------+
| Department | Employee | Salary |
+------------+----------+--------+
| IT         | Max      | 90000  |
| IT         | Randy    | 85000  |
| IT         | Joe      | 70000  |
| Sales      | Henry    | 80000  |
| Sales      | Sam      | 60000  |
+------------+----------+--------+
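Method 1:
    SELECT D1.Name AS Department,
           E1.Name AS Employee,
           E1.Salary
    FROM Employee AS E1, Employee AS E2, Department AS D1
    WHERE E1.DepartmentId = E2.DepartmentId
      AND E2.Salary >= E1.Salary
      AND E1.DepartmentId = D1.Id
    # group per employee (by Id, since names may repeat)
    GROUP BY D1.Name, E1.Id, E1.Name, E1.Salary
    # at most 3 distinct salaries in the department are >= this employee's salary
    HAVING COUNT(DISTINCT E2.Salary) <= 3
    ORDER BY D1.Name, E1.Salary DESC;
In addition, consider how to generalize this to the top N salaries in each department.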
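One way to approach the top-N generalization is to rank salaries inside each department with DENSE_RANK and filter on the rank in an outer query. This is a sketch of a possible answer, not a method given in the text. It runs through Python's sqlite3 (assuming SQLite 3.25+), and the helper `top_n_per_department` and alias `salary_rank` are hypothetical.

```python
import sqlite3


def top_n_per_department(n):
    """Employees whose salary is among the n highest distinct salaries of their department."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Name TEXT,
                               Salary INTEGER, DepartmentId INTEGER);
        CREATE TABLE Department (Id INTEGER PRIMARY KEY, Name TEXT);
        INSERT INTO Employee VALUES
            (1, 'Joe', 70000, 1), (2, 'Henry', 80000, 2), (3, 'Sam', 60000, 2),
            (4, 'Max', 90000, 1), (5, 'Janet', 69000, 1), (6, 'Randy', 85000, 1);
        INSERT INTO Department VALUES (1, 'IT'), (2, 'Sales');
    """)
    rows = conn.execute("""
        SELECT Department, Employee, Salary
        FROM (
            SELECT d.Name AS Department,
                   e.Name AS Employee,
                   e.Salary,
                   -- DENSE_RANK: equal salaries share a rank and leave no gaps
                   DENSE_RANK() OVER (PARTITION BY e.DepartmentId
                                      ORDER BY e.Salary DESC) AS salary_rank
            FROM Employee AS e
            INNER JOIN Department AS d ON e.DepartmentId = d.Id
        ) AS ranked
        WHERE salary_rank <= ?
        ORDER BY Department, Salary DESC
    """, (n,)).fetchall()
    conn.close()
    return rows


print(top_n_per_department(3))
```

DENSE_RANK matches the COUNT(DISTINCT salary) logic of the self-join: employees tied on salary share a rank, so a tie does not push a lower salary out of the top N.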
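Exercise 9: Shortest Distance in a Plane (LeetCode 612, difficulty: Hard)
The point_2d table holds the coordinates (x, y) of some points (more than two) in a plane.
Write a query that finds the shortest distance between these points, rounded to 2 decimal places.
| x  | y  |
|----|----|
| -1 | -1 |
| 0  | 0  |
| -1 | -2 |
The shortest distance is 1, from point (-1,-1) to point (-1,-2). So the output is:
+--------+
|shortest|
+--------+
|1.00    |
+--------+
Note: the largest distance between any two points is less than 10000.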
Method 1: self-join. Pair every point with every other point, compute the distance of each pair, and keep the closest pair:
    SELECT p1.x, p1.y, p2.x, p2.y,
           ROUND(SQRT(POWER(p1.x - p2.x, 2) + POWER(p1.y - p2.y, 2)), 2) AS shortest
    FROM point_2d AS p1, point_2d AS p2
    WHERE p1.x != p2.x OR p1.y != p2.y
    ORDER BY shortest
    LIMIT 1;
The inequality can also be written with <> on the (x, y) pair:
    SELECT p1.x, p1.y, p2.x, p2.y,
           ROUND(SQRT(POWER(p1.x - p2.x, 2) + POWER(p1.y - p2.y, 2)), 2) AS shortest
    FROM point_2d AS p1, point_2d AS p2
    WHERE (p1.x, p1.y) <> (p2.x, p2.y)
    ORDER BY shortest
    LIMIT 1;
To output only shortest, aggregate with MIN:
    SELECT ROUND(MIN(SQRT(POWER(p1.x - p2.x, 2) + POWER(p1.y - p2.y, 2))), 2) AS shortest
    FROM point_2d AS p1, point_2d AS p2
    WHERE (p1.x, p1.y) <> (p2.x, p2.y);
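The self-join can be tightened a little: comparing each unordered pair once halves the work, and because the square root is monotonic it is enough to minimize the squared distance and take one square root at the end. The sketch below runs through Python's sqlite3; the square root is taken in Python because SQLite's math functions are not available in every build. The helper name `shortest_distance` is hypothetical.

```python
import math
import sqlite3


def shortest_distance(points):
    """points is a list of distinct (x, y) pairs; returns the minimum distance, rounded to 2 places."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE point_2d (x INTEGER NOT NULL, y INTEGER NOT NULL)")
    conn.executemany("INSERT INTO point_2d VALUES (?, ?)", points)
    (min_sq,) = conn.execute("""
        SELECT MIN((p1.x - p2.x) * (p1.x - p2.x) + (p1.y - p2.y) * (p1.y - p2.y))
        FROM point_2d AS p1, point_2d AS p2
        -- keep each unordered pair once: p1 sorts strictly before p2
        WHERE p1.x < p2.x OR (p1.x = p2.x AND p1.y < p2.y)
    """).fetchone()
    conn.close()
    return round(math.sqrt(min_sq), 2)


print(shortest_distance([(-1, -1), (0, 0), (-1, -2)]))
```

For the sample table the closest pair is (-1,-2) and (-1,-1), at squared distance 1.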
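Exercise 10: Trips and Users (LeetCode 262, difficulty: Hard)
The Trips table stores all taxi trips. Each trip has a unique key Id; Client_Id and Driver_Id are foreign keys to Users_Id in the Users table. Status is an enum with the members ('completed', 'cancelled_by_driver', 'cancelled_by_client').
The Users table stores all users. Each user has a unique key Users_Id. Banned indicates whether the user is banned, and Role is an enum with the members ('client', 'driver', 'partner').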
+----------+--------+--------+
| Users_Id | Banned | Role   |
+----------+--------+--------+
| 1        | No     | client |
| 2        | Yes    | client |
| 3        | No     | client |
| 4        | No     | client |
| 10       | No     | driver |
| 11       | No     | driver |
| 12       | No     | driver |
| 13       | No     | driver |
+----------+--------+--------+
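Write a SQL statement that finds the cancellation rate of trips by unbanned users between 2013-10-01 and 2013-10-03. Based on the tables above, your SQL statement should return the following result, with the cancellation rate (Cancellation Rate) rounded to two decimal places.
+------------+-------------------+
| Day        | Cancellation Rate |
+------------+-------------------+
| 2013-10-01 | 0.33              |
| 2013-10-02 | 0.00              |
| 2013-10-03 | 0.50              |
+------------+-------------------+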
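A possible approach is to join Trips to Users twice (once for the client, once for the driver), keep only trips where both are unbanned, and average a 0/1 cancellation flag per day. The sketch below runs through Python's sqlite3. The Trips rows and the date column name `Request_at` are assumptions made for illustration, chosen so that the result matches the expected output shown above; the helper name `cancellation_rates` is hypothetical.

```python
import sqlite3


def cancellation_rates():
    """Daily cancellation rate for trips whose client and driver are both unbanned."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Users (Users_Id INTEGER PRIMARY KEY, Banned TEXT, Role TEXT);
        -- Request_at is an assumed name for the trip date column.
        CREATE TABLE Trips (Id INTEGER PRIMARY KEY, Client_Id INTEGER, Driver_Id INTEGER,
                            Status TEXT, Request_at TEXT);
        INSERT INTO Users VALUES
            (1, 'No', 'client'), (2, 'Yes', 'client'), (3, 'No', 'client'), (4, 'No', 'client'),
            (10, 'No', 'driver'), (11, 'No', 'driver'), (12, 'No', 'driver'), (13, 'No', 'driver');
        -- Assumed sample trips, for illustration only.
        INSERT INTO Trips VALUES
            (1, 1, 10, 'completed',           '2013-10-01'),
            (2, 2, 11, 'cancelled_by_driver', '2013-10-01'),
            (3, 3, 12, 'completed',           '2013-10-01'),
            (4, 4, 13, 'cancelled_by_client', '2013-10-01'),
            (5, 1, 10, 'completed',           '2013-10-02'),
            (6, 2, 11, 'completed',           '2013-10-02'),
            (7, 3, 12, 'completed',           '2013-10-02'),
            (8, 2, 12, 'completed',           '2013-10-03'),
            (9, 3, 10, 'completed',           '2013-10-03'),
            (10, 4, 13, 'cancelled_by_driver', '2013-10-03');
    """)
    rows = conn.execute("""
        SELECT t.Request_at AS Day,
               -- the average of a 0/1 flag is the share of cancelled trips
               ROUND(AVG(CASE WHEN t.Status <> 'completed' THEN 1.0 ELSE 0.0 END), 2)
                   AS cancellation_rate
        FROM Trips AS t
        INNER JOIN Users AS c ON t.Client_Id = c.Users_Id AND c.Banned = 'No'
        INNER JOIN Users AS d ON t.Driver_Id = d.Users_Id AND d.Banned = 'No'
        WHERE t.Request_at BETWEEN '2013-10-01' AND '2013-10-03'
        GROUP BY t.Request_at
        ORDER BY t.Request_at
    """).fetchall()
    conn.close()
    return rows


for day, rate in cancellation_rates():
    print(day, format(rate, ".2f"))
```

Trips requested by the banned client 2 drop out in the join, so on 2013-10-01 one of the three remaining trips is cancelled, which gives 0.33.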