Unverified Commit 594aade8 authored by Tao Feng's avatar Tao Feng Committed by GitHub

doc: add a few sample query for user metric doc (#774)

* doc: add a few sample query for user metric doc
Signed-off-by: 's avatarfeng-tao <fengtao04@gmail.com>

* doc: fix index
Signed-off-by: 's avatarfeng-tao <fengtao04@gmail.com>
parent de66341d
......@@ -40,20 +40,145 @@ Once you have the event in your data warehouse, you could start building differe
1. WAU
Sample query if the event table named as `default.event_amundsenfrontend_user_action`
```
SELECT date_trunc('week', CAST("ds" AS TIMESTAMP)) AS "__timestamp",
COUNT(DISTINCT user_value) AS "count_distinct_active_users"
FROM
(SELECT *
FROM default.event_amundsenfrontend_user_action
WHERE ds > '2019-09-01') AS "expr_qry"
WHERE "ds" >= '2020-04-21 00:00:00.000000'
AND "ds" <= '2020-10-21 05:31:14.000000'
GROUP BY date_trunc('week', CAST("ds" AS TIMESTAMP))
ORDER BY "count_distinct_active_users" DESC
LIMIT 10000
```
2. DAU
3. Top 50 active user
Sample query if the event table named as `default.event_amundsenfrontend_user_action`
```
SELECT date_trunc('day', CAST("ds" AS TIMESTAMP)) AS "__timestamp",
COUNT(DISTINCT user_value) AS "count_distinct_active_users"
FROM
(SELECT *
FROM default.event_amundsenfrontend_user_action
WHERE ds > '2019-09-01') AS "expr_qry"
WHERE "ds" >= '2020-07-21 00:00:00.000000'
AND "ds" <= '2020-10-21 00:00:00.000000'
GROUP BY date_trunc('day', CAST("ds" AS TIMESTAMP))
ORDER BY "count_distinct_active_users" DESC
LIMIT 50000
```
You could also exclude weekends:
```
SELECT date_trunc('day', CAST("ds" AS TIMESTAMP)) AS "__timestamp",
COUNT(DISTINCT user_value) AS "count_distinct_active_users"
FROM
(SELECT *
FROM default.event_amundsenfrontend_user_action
WHERE ds > '2019-09-01') AS "expr_qry"
WHERE "ds" >= '2020-04-21 00:00:00.000000'
AND "ds" <= '2020-10-21 05:33:11.000000'
AND day_of_week(logged_at) NOT IN (6,
7)
GROUP BY date_trunc('day', CAST("ds" AS TIMESTAMP))
ORDER BY "count_distinct_active_users" DESC
LIMIT 50000
```
3. User Penetration per role
Sample query if the event table named as `default.event_amundsenfrontend_user_action` and a table for user:
```
SELECT "title" AS "title",
COUNT(DISTINCT email) * 100 / MAX(role_count) AS "penetration_percent"
FROM
(SELECT e.occurred_at,
u.email,
u.title,
tmp.role_count
FROM default.family_user u
JOIN default.event_amundsenfrontend_user_action e ON u.email = e.user_value
JOIN
(SELECT title,
count(*) role_count
FROM default.family_user
GROUP BY 1) as tmp ON u.title = tmp.title
where ds is not NULL) AS "expr_qry"
WHERE "occurred_at" >= from_iso8601_timestamp('2020-10-14T00:00:00.000000')
AND "occurred_at" <= from_iso8601_timestamp('2020-10-21T00:00:00.000000')
AND "role_count" > 20
GROUP BY "title"
ORDER BY "penetration_percent" DESC
LIMIT 100
```
4. Usage breakdown per role_count
sample query:
```
SELECT "title" AS "title",
count("email") AS "COUNT(email)"
FROM
(SELECT e.occurred_at,
u.email,
u.title,
tmp.role_count
FROM default.family_user u
JOIN default.event_amundsenfrontend_user_action e ON u.email = e.user_value
JOIN
(SELECT title,
count(*) role_count
FROM default.family_user
GROUP BY 1) as tmp ON u.title = tmp.title
where ds is not NULL) AS "expr_qry"
WHERE "occurred_at" >= from_iso8601_timestamp('2020-10-14T00:00:00.000000')
AND "occurred_at" <= from_iso8601_timestamp('2020-10-21T00:00:00.000000')
GROUP BY "title"
ORDER BY "COUNT(email)" DESC
LIMIT 15
```
5. search click through rate
sample query:
```
SELECT date_trunc('day', CAST("occurred_at" AS TIMESTAMP)) AS "__timestamp",
SUM(CASE
WHEN CAST(json_extract_scalar(keyword_args_json, '$.index') AS BIGINT) <= 3 THEN 1
ELSE 0
END) * 100 / COUNT(*) AS "click_through_rate"
FROM
(SELECT *
FROM default.event_amundsenfrontend_user_action
WHERE ds > '2019-09-01') AS "expr_qry"
WHERE "occurred_at" >= from_iso8601_timestamp('2020-09-21T00:00:00.000000')
AND "occurred_at" <= from_iso8601_timestamp('2020-10-21T00:00:00.000000')
AND "command" IN ('_get_table_metadata',
'_get_dashboard_metadata',
'_log_get_user')
AND json_extract_scalar(keyword_args_json, '$.source') IN ('search_results',
'inline_search')
GROUP BY date_trunc('day', CAST("occurred_at" AS TIMESTAMP))
ORDER BY "click_through_rate" DESC
LIMIT 10000
```
6. Top 50 active user
4. Top search term
7. Top search term
5. Top popular tables
8. Top popular tables
6. Search click index
9. Search click index
7. Metadata edits
10. Metadata edits
8. Metadata edit leaders
11. Metadata edit leaders
9. Amundsen user per role (by joining with employee data)
12. Amundsen user per role (by joining with employee data)
10. ...
13. ...
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment