
Table: Customer

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| customer_id   | int     |
| name          | varchar |
| visited_on    | date    |
| amount        | int     |
+---------------+---------+
In SQL, (customer_id, visited_on) is the primary key for this table.
This table contains data about customer transactions in a restaurant.
visited_on is the date on which the customer with ID (customer_id) has visited the restaurant.
amount is the total paid by a customer.

 

You are the restaurant owner and you want to analyze a possible expansion (there will be at least one customer every day).

Compute the moving average of how much the customer paid in a seven days window (i.e., current day + 6 days before). average_amount should be rounded to two decimal places.

Return the result table ordered by visited_on in ascending order.

The result format is in the following example.

 

Example 1:

Input: 
Customer table:
+-------------+--------------+--------------+-------------+
| customer_id | name         | visited_on   | amount      |
+-------------+--------------+--------------+-------------+
| 1           | Jhon         | 2019-01-01   | 100         |
| 2           | Daniel       | 2019-01-02   | 110         |
| 3           | Jade         | 2019-01-03   | 120         |
| 4           | Khaled       | 2019-01-04   | 130         |
| 5           | Winston      | 2019-01-05   | 110         | 
| 6           | Elvis        | 2019-01-06   | 140         | 
| 7           | Anna         | 2019-01-07   | 150         |
| 8           | Maria        | 2019-01-08   | 80          |
| 9           | Jaze         | 2019-01-09   | 110         | 
| 1           | Jhon         | 2019-01-10   | 130         | 
| 3           | Jade         | 2019-01-10   | 150         | 
+-------------+--------------+--------------+-------------+
Output: 
+--------------+--------------+----------------+
| visited_on   | amount       | average_amount |
+--------------+--------------+----------------+
| 2019-01-07   | 860          | 122.86         |
| 2019-01-08   | 840          | 120            |
| 2019-01-09   | 840          | 120            |
| 2019-01-10   | 1000         | 142.86         |
+--------------+--------------+----------------+
Explanation: 
1st moving average from 2019-01-01 to 2019-01-07 has an average_amount of (100 + 110 + 120 + 130 + 110 + 140 + 150)/7 = 122.86
2nd moving average from 2019-01-02 to 2019-01-08 has an average_amount of (110 + 120 + 130 + 110 + 140 + 150 + 80)/7 = 120
3rd moving average from 2019-01-03 to 2019-01-09 has an average_amount of (120 + 130 + 110 + 140 + 150 + 80 + 110)/7 = 120
4th moving average from 2019-01-04 to 2019-01-10 has an average_amount of (130 + 110 + 140 + 150 + 80 + 110 + 130 + 150)/7 = 142.86
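
The arithmetic in the explanation can be double-checked with a short script. This is only a sketch: the list of daily totals is typed in by hand from Example 1 (the two visits on 2019-01-10 are added together), and the names `daily_totals`, `WINDOW`, and `rows` are illustrative.

```python
# Daily totals from Example 1; 2019-01-10 has two visits (130 + 150).
daily_totals = [100, 110, 120, 130, 110, 140, 150, 80, 110, 130 + 150]

WINDOW = 7  # current day + 6 days before
rows = []
for end in range(WINDOW - 1, len(daily_totals)):
    # Sum the seven days ending at index `end` (inclusive).
    window_sum = sum(daily_totals[end - WINDOW + 1 : end + 1])
    day = f"2019-01-{end + 1:02d}"
    rows.append((day, window_sum, round(window_sum / WINDOW, 2)))

for row in rows:
    print(row)
```

The divisor stays 7 even for the last window, which sums eight payments, because the average is taken per day and not per visit.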

Step 1: Join only the seven-day window for each reference date

SELECT
  a.visited_on,
  b.visited_on AS joined_day,
  b.amount
FROM (SELECT DISTINCT visited_on FROM Customer) a
JOIN Customer b
  ON b.visited_on BETWEEN DATE_SUB(a.visited_on, INTERVAL 6 DAY) AND a.visited_on
ORDER BY a.visited_on, b.visited_on;
  • a: the reference dates (the DISTINCT visited_on values from the Customer table)
  • b: every visit record from six days before the reference date through the reference date itself

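Here is an example with sample data (illustrative 2024 dates, not the rows from Example 1).

Reference date: 2024-01-01

  • b.visited_on range: 2023-12-26 ~ 2024-01-01
  • Matching dates: 2024-01-01

Reference date: 2024-01-07

  • Range: 2024-01-01 ~ 2024-01-07
  • Matching dates: 2024-01-01, 2024-01-02, 2024-01-04, 2024-01-07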

The part that can be confusing here is the JOIN.

SELECT *
FROM A
JOIN B
  ON A.id = B.id

Most JOINs we run into connect only the rows of A and B whose id values are exactly equal (that is, an equality match on a single value).

ON b.visited_on BETWEEN DATE_SUB(a.visited_on, INTERVAL 6 DAY) AND a.visited_on

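In this problem, however, a row of b is joined whenever its visit date falls between six days before a's visit date and a's visit date, inclusive. For example:

  • if a.visited_on = 2024-01-07,
  • any b.visited_on from 2024-01-01 through 2024-01-07 matches.

So this JOIN is not an equality match on one value but a range JOIN that can attach many rows to each reference date. The AND that follows is part of the BETWEEN ... AND ... syntax: it separates the lower and upper bounds, so the condition is equivalent to b.visited_on >= DATE_SUB(a.visited_on, INTERVAL 6 DAY) AND b.visited_on <= a.visited_on.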
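
To see how many rows the range JOIN attaches to each reference date, the sketch below loads Example 1 into an in-memory SQLite database through Python's built-in sqlite3 module. This setup is an assumption made for illustration: the post targets MySQL, and because SQLite has no DATE_SUB, `date(a.visited_on, '-6 day')` is swapped in for `DATE_SUB(a.visited_on, INTERVAL 6 DAY)`. Dates are stored as ISO text, which compares correctly as strings.

```python
import sqlite3

# Rows from Example 1 of the problem statement.
ROWS = [
    (1, "Jhon", "2019-01-01", 100), (2, "Daniel", "2019-01-02", 110),
    (3, "Jade", "2019-01-03", 120), (4, "Khaled", "2019-01-04", 130),
    (5, "Winston", "2019-01-05", 110), (6, "Elvis", "2019-01-06", 140),
    (7, "Anna", "2019-01-07", 150), (8, "Maria", "2019-01-08", 80),
    (9, "Jaze", "2019-01-09", 110), (1, "Jhon", "2019-01-10", 130),
    (3, "Jade", "2019-01-10", 150),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (customer_id INT, name TEXT, visited_on TEXT, amount INT)"
)
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?, ?)", ROWS)

# Count how many rows of b fall inside each reference date's window.
# SQLite substitute for MySQL's DATE_SUB: date(x, '-6 day').
joined_rows = conn.execute(
    """
    SELECT a.visited_on, COUNT(*) AS joined_rows
    FROM (SELECT DISTINCT visited_on FROM Customer) a
    JOIN Customer b
      ON b.visited_on BETWEEN date(a.visited_on, '-6 day') AND a.visited_on
    GROUP BY a.visited_on
    ORDER BY a.visited_on
    """
).fetchall()

for visited_on, n in joined_rows:
    print(visited_on, n)
```

The first date picks up a single row, 2019-01-07 picks up seven, and 2019-01-10 picks up eight because two customers visited on that day.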

Step 2: Compute the total and the average

SELECT
  a.visited_on,
  SUM(b.amount) AS amount,
  ROUND(SUM(b.amount) / 7, 2) AS average_amount
FROM (SELECT DISTINCT visited_on FROM Customer) a
JOIN Customer b
  ON b.visited_on BETWEEN DATE_SUB(a.visited_on, INTERVAL 6 DAY) AND a.visited_on
GROUP BY a.visited_on
ORDER BY a.visited_on;
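  1. Build the list of reference dates (DISTINCT visited_on)
  2. JOIN each reference date to the data from the previous six days through that date
  3. Compute the total and the average (SUM, ROUND)
  4. Sort by date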
SELECT DISTINCT visited_on FROM Customer

This pulls out just the dates to build the list of a.visited_on values used as reference dates.

JOIN Customer b
  ON b.visited_on BETWEEN DATE_SUB(a.visited_on, INTERVAL 6 DAY) AND a.visited_on

When a.visited_on = 2024-01-07, every amount whose b.visited_on falls in the range 2024-01-01 ~ 2024-01-07 is attached.
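
Running the Step 2 query as it stands shows why one more filter is needed: the early dates still appear, with partial windows divided by 7. The sketch below again assumes an in-memory SQLite database via Python's sqlite3 in place of MySQL, with `date(x, '-6 day')` substituted for DATE_SUB and `/ 7.0` used because SQLite would otherwise perform integer division. The name `step2` is illustrative.

```python
import sqlite3

# Rows from Example 1 of the problem statement.
ROWS = [
    (1, "Jhon", "2019-01-01", 100), (2, "Daniel", "2019-01-02", 110),
    (3, "Jade", "2019-01-03", 120), (4, "Khaled", "2019-01-04", 130),
    (5, "Winston", "2019-01-05", 110), (6, "Elvis", "2019-01-06", 140),
    (7, "Anna", "2019-01-07", 150), (8, "Maria", "2019-01-08", 80),
    (9, "Jaze", "2019-01-09", 110), (1, "Jhon", "2019-01-10", 130),
    (3, "Jade", "2019-01-10", 150),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (customer_id INT, name TEXT, visited_on TEXT, amount INT)"
)
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?, ?)", ROWS)

# Step 2 without any HAVING clause (SQLite dialect of the query above).
step2 = conn.execute(
    """
    SELECT a.visited_on,
           SUM(b.amount) AS amount,
           ROUND(SUM(b.amount) / 7.0, 2) AS average_amount
    FROM (SELECT DISTINCT visited_on FROM Customer) a
    JOIN Customer b
      ON b.visited_on BETWEEN date(a.visited_on, '-6 day') AND a.visited_on
    GROUP BY a.visited_on
    ORDER BY a.visited_on
    """
).fetchall()

for row in step2:
    print(row)
```

All ten dates come back, and the first row is 2019-01-01 with an amount of 100 and an average of 14.29, which is a single day's total divided by 7 and not a real seven-day average.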

Step 3: Keep only the dates that have exactly seven days of data

SELECT
  a.visited_on,
  SUM(b.amount) AS amount,
  ROUND(SUM(b.amount) / 7, 2) AS average_amount
FROM (SELECT DISTINCT visited_on FROM Customer) a
JOIN Customer b
  ON b.visited_on BETWEEN DATE_SUB(a.visited_on, INTERVAL 6 DAY) AND a.visited_on
GROUP BY a.visited_on
HAVING COUNT(DISTINCT b.visited_on) = 7
ORDER BY a.visited_on;
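  • Add the condition HAVING COUNT(DISTINCT b.visited_on) = 7
  • That is, a date is included in the result only when all seven days of its moving-average window are present. Because the problem guarantees at least one customer every day, this drops only the first six dates.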
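
As a final check, the Step 3 query can be run against Example 1 and compared with the expected output. This is a sketch under the same assumptions as before: Python's built-in sqlite3 stands in for MySQL, `date(x, '-6 day')` replaces DATE_SUB, and `/ 7.0` avoids SQLite's integer division. The name `result` is illustrative.

```python
import sqlite3

# Rows from Example 1 of the problem statement.
ROWS = [
    (1, "Jhon", "2019-01-01", 100), (2, "Daniel", "2019-01-02", 110),
    (3, "Jade", "2019-01-03", 120), (4, "Khaled", "2019-01-04", 130),
    (5, "Winston", "2019-01-05", 110), (6, "Elvis", "2019-01-06", 140),
    (7, "Anna", "2019-01-07", 150), (8, "Maria", "2019-01-08", 80),
    (9, "Jaze", "2019-01-09", 110), (1, "Jhon", "2019-01-10", 130),
    (3, "Jade", "2019-01-10", 150),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customer (customer_id INT, name TEXT, visited_on TEXT, amount INT)"
)
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?, ?)", ROWS)

# Step 3: keep only reference dates whose window covers seven distinct days.
result = conn.execute(
    """
    SELECT a.visited_on,
           SUM(b.amount) AS amount,
           ROUND(SUM(b.amount) / 7.0, 2) AS average_amount
    FROM (SELECT DISTINCT visited_on FROM Customer) a
    JOIN Customer b
      ON b.visited_on BETWEEN date(a.visited_on, '-6 day') AND a.visited_on
    GROUP BY a.visited_on
    HAVING COUNT(DISTINCT b.visited_on) = 7
    ORDER BY a.visited_on
    """
).fetchall()

for row in result:
    print(row)
```

Only the four dates from 2019-01-07 through 2019-01-10 remain, with the same amounts and averages as the expected output table.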
올리브한입