반응형

Table: Activity

+--------------+---------+
| Column Name  | Type    |
+--------------+---------+
| player_id    | int     |
| device_id    | int     |
| event_date   | date    |
| games_played | int     |
+--------------+---------+
(player_id, event_date) is the primary key (combination of columns with unique values) of this table.
This table shows the activity of players of some games.
Each row is a record of a player who logged in and played a number of games (possibly 0) before logging out on someday using some device.

 

Write a solution to report the fraction of players that logged in again on the day after the day they first logged in, rounded to 2 decimal places. In other words, you need to count the number of players that logged in for at least two consecutive days starting from their first login date, then divide that number by the total number of players.

The result format is in the following example.

 

Example 1:

Input: 
Activity table:
+-----------+-----------+------------+--------------+
| player_id | device_id | event_date | games_played |
+-----------+-----------+------------+--------------+
| 1         | 2         | 2016-03-01 | 5            |
| 1         | 2         | 2016-03-02 | 6            |
| 2         | 3         | 2017-06-25 | 1            |
| 3         | 1         | 2016-03-02 | 0            |
| 3         | 4         | 2018-07-03 | 5            |
+-----------+-----------+------------+--------------+
Output: 
+-----------+
| fraction  |
+-----------+
| 0.33      |
+-----------+
Explanation: 
Only the player with id 1 logged back in after the first day he had logged in so the answer is 1/3 = 0.33

서브쿼리: 각 유저의 첫 로그인 날짜

SELECT 
    player_id, 
    MIN(event_date) AS first_login --  Finds the earliest (first) login date for each player 
    # The subquery retrieves the first login date for each player
FROM 
    Activity 
GROUP BY 
    player_id
    
player_id | first_login
1         | 2016-03-01
2         | 2017-06-25
3         | 2016-03-02

메인 쿼리: 전체 Activity에서 DATE_SUB(event_date, 1일) 결과 구하기

WHERE (player_id, DATE_SUB(event_date, INTERVAL 1 DAY)) IN (...)

player_id | event_date | DATE_SUB(event_date, 1)
1         | 2016-03-01 | 2016-02-29
1         | 2016-03-02 | 2016-03-01 
2         | 2017-06-25 | 2017-06-24
3         | 2016-03-02 | 2016-03-01
3         | 2018-07-03 | 2018-07-02
DATE_SUB(event_date, INTERVAL 1 DAY))는 첫 로그인 날짜 - 1일을 구하는 겁니다.

 

(player_id, DATE_SUB(event_date, 1일)) 이 그 플레이어의 첫 로그인일과 일치하는지 확인

 

  • 1번 유저:
    • 2016-03-02 → DATE_SUB = 2016-03-01 ✅ 첫 로그인과 같음 → 포함됨
  • 2번 유저:
    • DATE_SUB 결과들 중 2017-06-25와 일치하는 건 없음 ❌
  • 3번 유저:
    • 2016-03-02의 DATE_SUB = 2016-03-01 ❌
    • 2018-07-03의 DATE_SUB = 2018-07-02 ❌
      → 둘 다 첫 로그인(2016-03-02)과 안 맞음
SELECT 
    ROUND(COUNT(DISTINCT player_id) -- counts the number of unique players who logged in again on the day after their first login date
    /(SELECT COUNT(DISTINCT player_id) FROM Activity), -- counts the total number of unique players in the Activity table
    2) AS fraction
FROM
    Activity
where (player_id, DATE_SUB(event_date, INTERVAL 1 DAY)) in
(SELECT 
    player_id, 
    MIN(event_date) AS first_login --  Finds the earliest (first) login date for each player 
    # The subquery retrieves the first login date for each player
FROM 
    Activity 
GROUP BY 
    player_id
)

| fraction |
| -------- |
| 0.33     |

 

COUNT(DISTINCT player_id)

  • 👉 조건에 맞는 (즉, 첫 로그인 다음 날에도 로그인한) 유저의 수를 셈
  • 위 예시에서는 딱 한 명 — player_id = 1

(SELECT COUNT(DISTINCT player_id) FROM Activity)

  • 👉 전체 유저 수를 셈.
  • 예시 테이블 기준으로는 1, 2, 3 — 총 3명
반응형
올리브한입