Amundsen is a metadata-driven application for improving the productivity of data analysts, data scientists, and engineers when interacting with data. It does that today by indexing data resources (tables, dashboards, streams, etc.) and powering a PageRank-style search based on usage patterns (e.g. highly queried tables show up earlier than less queried tables). Think of it as a Google search for data. The project is named after Norwegian explorer [Roald Amundsen](https://en.wikipedia.org/wiki/Roald_Amundsen), the first person to reach the South Pole.
It includes three microservices and a data ingestion library.
The frontend service leverages a separate [search service](https://github.com/lyft/amundsensearchlibrary) to allow users to search for data resources, and a separate [metadata service](https://github.com/lyft/amundsenmetadatalibrary) to view and edit metadata for a given resource. It is a Flask application with a React frontend.
- [amundsenfrontendlibrary](https://github.com/lyft/amundsenfrontendlibrary): Frontend service which is a Flask application with a React frontend.
- [amundsensearchlibrary](https://github.com/lyft/amundsensearchlibrary): Search service, which leverages Elasticsearch for search capabilities, is used to power frontend metadata searching.
- [amundsenmetadatalibrary](https://github.com/lyft/amundsenmetadatalibrary): Metadata service, which leverages Neo4j or Apache Atlas as the persistent layer to provide various metadata.
- [amundsendatabuilder](https://github.com/lyft/amundsendatabuilder): Data ingestion library for building the metadata graph and search index.
Users can load the data either with [a Python script](https://github.com/lyft/amundsendatabuilder/blob/master/example/scripts/sample_data_loader.py) that uses the library
or with an [Airflow DAG](https://github.com/lyft/amundsendatabuilder/blob/master/example/dags/sample_dag.py) that imports the library.
For information about Amundsen and our other services, visit the [main repository](https://github.com/lyft/amundsen). Please also see our instructions for a [quick start](https://github.com/lyft/amundsen/blob/master/docs/installation.md#bootstrap-a-default-version-of-amundsen-using-docker) setup of Amundsen with dummy data, and an [overview of the architecture](https://github.com/lyft/amundsen/blob/master/docs/architecture.md).
## Requirements
- Python >= 3.5


## Get Involved in the Community
Want help or want to help?
Use the button in our [header](https://github.com/lyft/amundsenfrontendlibrary#amundsen) to join our Slack channel. Please join our [mailing list](https://groups.google.com/forum/#!forum/amundsen-dev) as well.
## Powered By
Here is the list of organizations that are using Amundsen today. If your organization uses Amundsen, please file a PR and update this list.
Please visit the Amundsen documentation for help with [installing Amundsen](https://github.com/lyft/amundsenfrontendlibrary/blob/master/docs/installation.md#install-standalone-application-directly-from-the-source)
and getting a [quick start](https://github.com/lyft/amundsenfrontendlibrary/blob/master/docs/installation.md#bootstrap-a-default-version-of-amundsen-using-docker) with dummy data
or an [overview of the architecture](docs/architecture.md).
## Architecture Overview
Please visit [Architecture](docs/architecture.md) for Amundsen architecture overview.
## Installation
Please visit the [Installation guideline](docs/installation.md) for how to install Amundsen.
Please visit [Developer guidelines](docs/developer_guide.md) if you want to build Amundsen in your local environment.
## Roadmap
Please visit [Roadmap](docs/roadmap.md) if you are interested in Amundsen upcoming roadmap items.
## Publications
- [Disrupting Data Discovery](https://www.slideshare.net/taofung/strata-sf-amundsen-presentation) (Strata SF 2019)
- [Amundsen - Lyft's data discovery & metadata engine](https://eng.lyft.com/amundsen-lyfts-data-discovery-metadata-engine-62d27254fbb9) (Lyft Engineering blog)
- [Amundsen: A Data Discovery Platform from Lyft](https://www.slideshare.net/taofung/data-council-sf-amundsen-presentation) (Data Council SF 2019)
- [Software Engineering Daily podcast on Amundsen](https://softwareengineeringdaily.com/2019/04/16/lyft-data-discovery-with-tao-feng-and-mark-grover/) (April 2019)
- [Disrupting Data Discovery](https://www.slideshare.net/markgrover/disrupting-data-discovery) (Strata London 2019)
- [Disrupting Data Discovery (video)](https://www.youtube.com/watch?v=m1B-ptm0Rrw) (Strata SF 2019)
- ING Data Analytics Platform (Amundsen is mentioned) { [slides](https://static.sched.com/hosted_files/kccnceu19/65/ING%20Data%20Analytics%20Platform.pdf), [video](https://www.youtube.com/watch?v=8cE9ppbnDPs&t=465) } (Kubecon Barcelona 2019)
The following diagram shows the overall architecture for Amundsen.

The frontend service serves as the web UI portal for user interaction.
It is a Flask-based web app whose presentation layer is built with React with Redux, Bootstrap, Webpack, and Babel.
The search service leverages Elasticsearch's search functionality and
provides a RESTful API to serve search requests from the frontend service.
Currently only [table resources](https://github.com/lyft/amundsendatabuilder/blob/master/databuilder/models/elasticsearch_document.py) are indexed and searchable.
The search index is built with the [elasticsearch publisher](https://github.com/lyft/amundsendatabuilder/blob/master/databuilder/publisher/elasticsearch_publisher.py).
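As a rough sketch of how a client might call the search service, the snippet below builds a search request URL and decodes the JSON response. The endpoint path (`search`) and parameter names (`query_term`, `page_index`) are illustrative assumptions, not a documented contract; check the search service code for the actual API.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

def build_search_url(base_url, query_term, page_index=0):
    # The "search" path and parameter names here are assumptions
    # for illustration only.
    query = urlencode({"query_term": query_term, "page_index": page_index})
    return urljoin(base_url, "search") + "?" + query

def search(base_url, query_term):
    # Requires a running search service at base_url.
    with urlopen(build_search_url(base_url, query_term)) as resp:
        return json.load(resp)

print(build_search_url("http://localhost:5001/", "test"))
```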
The metadata service currently uses a Neo4j proxy to interact with the Neo4j graph database and serves metadata to the frontend service.
The metadata is represented as a graph model:

The above diagram shows how metadata is modeled in Amundsen.
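As a rough illustration of the graph model, a table, its columns, and its description might be queried as below. The node labels and relationship names (`COLUMN`, `DESCRIPTION`) are assumptions based on the databuilder models, not a documented schema.

```cypher
// Illustrative only: check the databuilder models for the
// authoritative node labels and relationship names.
MATCH (t:Table {name: 'test_table1'})-[:COLUMN]->(c:Column)
OPTIONAL MATCH (t)-[:DESCRIPTION]->(d:Description)
RETURN t.name, collect(c.name), d.description
```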
Amundsen provides a [data ingestion library](https://github.com/lyft/amundsendatabuilder) for building the metadata. At Lyft, we build the metadata once a day
using an Airflow DAG ([example](https://github.com/lyft/amundsendatabuilder/blob/master/example/dags/sample_dag.py)).
See [this doc](https://github.com/lyft/amundsen/blob/master/docs/authentication/oidc.md) in our main repository for information on how to set up end-to-end authentication using OIDC.
Setting up end-to-end authentication using OIDC is fairly simple and can be done using a Flask wrapper, i.e., [flaskoidc](https://github.com/verdan/flaskoidc).
`flaskoidc` leverages Flask's `before_request` functionality to authenticate each request before passing it to
the views. It also accepts headers on each request, if available, in order to validate the bearer token from incoming requests.
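The `before_request` idea can be sketched independently of Flask roughly as follows. All names below are illustrative, not flaskoidc's actual API; in flaskoidc, a real implementation validates the token against the OIDC provider rather than merely checking for its presence.

```python
# Minimal sketch of the before_request pattern: every incoming request
# is checked for credentials before any view runs. Names are
# illustrative; see flaskoidc for the real implementation.

WHITELISTED_PATHS = {"/healthcheck"}

def authenticate(path, headers, session):
    """Return True if the request may proceed to a view."""
    if path in WHITELISTED_PATHS:
        return True  # healthchecks skip authentication
    auth = headers.get("Authorization", "")
    if auth.startswith("Bearer "):
        # A real implementation would validate the token against the
        # OIDC provider; here we only check that one was supplied.
        return True
    # Fall back to an existing authenticated (cookie-based) session.
    return session.get("user") is not None
```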
## Installation
Please refer to the [flaskoidc documentation](https://github.com/verdan/flaskoidc/blob/master/README.md)
for the installation and the configurations.
Note: You need to install and configure `flaskoidc` for each Amundsen microservice,
i.e., frontendlibrary, metadatalibrary, and searchlibrary, in order to secure each of them.
## Amundsen Configuration
Once you have `flaskoidc` installed and configured for each microservice, please set the following environment variables:
- amundsenfrontendlibrary:
```bash
APP_WRAPPER: flaskoidc
APP_WRAPPER_CLASS: FlaskOIDC
```
- amundsenmetadatalibrary:
```bash
FLASK_APP_MODULE_NAME: flaskoidc
FLASK_APP_CLASS_NAME: FlaskOIDC
```
- amundsensearchlibrary: _(Needs to be implemented)_
```bash
FLASK_APP_MODULE_NAME: flaskoidc
FLASK_APP_CLASS_NAME: FlaskOIDC
```
By default, `flaskoidc` whitelists the healthcheck URLs so that they are not authenticated. For metadatalibrary and searchlibrary
we may want to whitelist the healthcheck APIs explicitly using the following environment variable.
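As an illustration only (the variable name and values below are taken from the flaskoidc README at the time of writing and may change between versions; verify against the version you install):

```bash
FLASK_OIDC_WHITELISTED_ENDPOINTS: status,healthcheck,health
```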
# Deployment of non-production Amundsen on AWS ECS using aws-cli
The following is a set of instructions to run Amundsen on AWS Elastic Container Service (ECS). The current configuration is basic but working; it is a migration of `docker-amundsen.yml` to run on AWS ECS.
## Install ECS CLI
The first step is to install the ECS CLI; please follow the instructions in the AWS [documentation](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ECS_CLI_installation.html).
### Get your access and secret keys from IAM
```bash
# in ~/<your-path-to-cloned-repo>/amundsenfrontendlibrary/docs/instalation-aws-ecs
$ export AWS_ACCESS_KEY_ID=xxxxxxxx
$ export AWS_SECRET_ACCESS_KEY=xxxxxx
$ export AWS_PROFILE=profilename
```
These instructions follow the [tutorial](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-cli-tutorial-ec2.html#ECS_CLI_tutorial_compose_create) in the AWS documentation.
## STEP 1: Create a cluster configuration
```bash
# in ~/<your-path-to-cloned-repo>/amundsenfrontendlibrary/docs/instalation-aws-ecs
$ ecs-cli compose --cluster-config amundsen --file docker-ecs-amundsen.yml up --create-log-groups
```
You can use the ECS CLI to see what tasks are running.
```bash
$ ecs-cli ps
```
### STEP 5: Open the EC2 instance
Edit the security group to allow traffic from your IP. You should then be able to reach the frontend, Elasticsearch, and Neo4j by visiting the URLs:
- http://xxxxxxx:5000/
- http://xxxxxxx:9200/
- http://xxxxxxx:7474/browser/
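To quickly confirm those ports are reachable from your machine, a small stdlib check like the following can help (the host in the commented-out example is a placeholder for your instance's public address):

```python
import socket

def port_open(host, port, timeout=3.0):
    # Return True if a TCP connection to host:port succeeds.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Replace the placeholder with your EC2 instance's public DNS or IP:
# for port in (5000, 9200, 7474):
#     print(port, port_open("xxxxxxx", port))
```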
## TODO
- Configuration sent to services not working properly (amundsen.db vs graph.db)
- Create a persistent volume for graph/metadata storage. [See this](https://aws.amazon.com/blogs/compute/amazon-ecs-and-docker-volume-drivers-amazon-ebs/)
- Refactor the VPC and default security group permissions
```bash
$ python3 amundsen_application/wsgi.py
```
## Bootstrap a default version of Amundsen using Docker
The following instructions are for setting up a version of Amundsen using Docker. At the moment, we only support a bootstrap for connecting the Amundsen application to an example metadata service.
1. Install `docker` and `docker-compose`.
2. Clone [this repo](https://github.com/lyft/amundsenfrontendlibrary) or download the [docker-amundsen.yml](https://github.com/lyft/amundsenfrontendlibrary/blob/master/docker-amundsen.yml) file directly.
3. Enter the directory where the `docker-amundsen.yml` file is and then:
```bash
$ docker-compose -f docker-amundsen.yml up
```
4. Ingest dummy data into Neo4j by doing the following:
* Run the following commands in the `amundsendatabuilder` directory:
```bash
$ python3 -m venv venv
$ source venv/bin/activate
$ pip3 install -r requirements.txt
$ python3 setup.py install
$ python3 example/scripts/sample_data_loader.py
```
5. View the UI at [`http://localhost:5000`](http://localhost:5000) and search for `test`; it should return some results.
### Verify setup
1. You can verify dummy data has been ingested into Neo4j by visiting [`http://localhost:7474/browser/`](http://localhost:7474/browser/) and running `MATCH (n:Table) RETURN n LIMIT 25` in the query box. You should see two tables:
    1. `hive.test_schema.test_table1`
    2. `dynamo.test_schema.test_table2`
2. You can verify the data has been loaded into the metadataservice by visiting:
**Mission**: To organize all information about data and make it universally actionable<br/>
**Vision (2020)**: Centralize a comprehensive and actionable map of all our data resources that can be leveraged to solve a growing number of use cases and workflows
The following roadmap gives an overview of what we are currently working on and what we want to tackle next. We share it so that the community can plan work together. Let us know in the Slack channel if you are interested in taking a stab at leading the development of one of these features (or a feature not listed here!).
## Current focus
**Search & Resource page redesign**<br/>
*What*: Redesign the search experience and the resource page to make them scalable in the number of resource types and the amount of metadata<br/>
*Status*: Designs are ready, engineering work has started<br/>
**Email notifications**<br/>
*What*: We are creating an email notification system to reach Amundsen’s users. The primary goal is to use this system to help solve the lack of ownership for data assets at Lyft. The secondary goal is to engage with users for general purposes.<br/>
*Status*: Designs are ready, engineering work has started
## Next steps
**Index Dashboards**<br/>
*What*: We want to help with the discovery of existing analysis work, such as dashboards. This is going to help avoid reinventing the wheel, create value for less technical users, and give context on how tables are used.<br/>
*Status*: Product + technical specs are ready, designs are ready, implementation has not started<br/>
**Native lineage integration**<br/>
*What*: We want to create a native lineage integration in Amundsen, to better surface how data assets interact with each other<br/>
*Status*: implementation has not started
**Landing page**<br/>
*What*: We are creating a proper landing page to provide more value, with an emphasis on helping users find data when they don’t really know what to search for (exploration)<br/>
*Status*: being spec’d out
**Push ingest API**<br/>
*What*: We want to create a push API so that it is as easy as possible for a new data resource type to be ingested<br/>
*Status*: implementation has started (around 80% complete)
**GET Rest API**<br/>
*What*: enable users to access our data map programmatically through a Rest API<br/>
*Status*: implementation has started
**Index Druid tables and S3 buckets**<br/>
*What*: add these new resource types to our data map and create resource pages for them<br/>
*Status*: implementation has not started
**Granular Access Control**<br/>
*What*: we want to have a more granular control of the access. For example, only certain types of people would be able to see certain types of metadata/functionality<br/>
*Status*: implementation has not started
**Show distinct column values**<br/>
*What*: When a column has a limited set of possible values, we want to make them easily discoverable<br/>
*Status*: implementation has not started
**“Order by” for columns**<br/>
*What*: we want to help users make sense of which columns people use in the tables we index. Within a frequently used table, a column might not be used anymore because it is known to be deprecated<br/>
*Status*: implementation has not started
**Index online datastores**<br/>
*What*: We want to make our DynamoDB and other online datastores discoverable by indexing them. For this purpose, we will probably leverage the fact that we have a centralized IDL (interface definition language)<br/>
*Status*: implementation has not started
**Integration with BI Tools**<br/>
*What*: get the richness of Amundsen’s metadata to where the data is used: in BI tools such as Mode, Superset, and Tableau<br/>
*Status*: implementation has not started
**Index Processes**<br/>
*What*: we want to index ETLs and pipelines from our Machine Learning Engine<br/>
*Status*: implementation has not started
**Versioning system**<br/>
*What*: We want to create a versioning system for our indexed resources, to be able to index different versions of the same resource. This is especially required for machine learning purposes.<br/>
*Status*: implementation has not started
**Index Teams**<br/>
*What*: We want to add team pages to enable users to see the important tables and dashboards a team uses<br/>
*Status*: implementation has not started
**Index Services**<br/>
*What*: With our microservices architecture, we want to index services and show how these services interact with data artifacts<br/>
*Status*: implementation has not started
**Index Pub/Sub systems**<br/>
*What*: We want to make our pub/sub systems discoverable<br/>