Project Brief
Project description for the Hackathon project
Tour Boat
A Data Lake Explorer web app and management toolset
Hackathon participants will create and populate a small (10-100 MB) data lake in the cloud with flat files and/or semi-structured data (individual files should be kilobytes to a few megabytes in size). They will then create tools and user interfaces for exploring, manipulating, and visualizing the data.
Core Features:
A web app facilitating the exploration of the files/folders in the data lake
Tools for managing the data, e.g. extracting from original sources and storing the data in a useful format (CSV, JSON, etc.)
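As a rough illustration of the data management tooling described above, the following is a minimal Python sketch that converts CSV text into JSON records using only the standard library. The function name `csv_to_json_records` and the sample columns are assumptions made for this example; the brief does not prescribe a language, a schema, or a conversion approach.

```python
import csv
import io
import json


def csv_to_json_records(csv_text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects.

    Hypothetical helper for illustration only. All values stay as strings,
    because the csv module does not infer types.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return json.dumps(list(reader), indent=2)


if __name__ == "__main__":
    # Made-up sample data, not taken from any real source
    sample = "station,temp_c\nA1,12.5\nB2,9.0\n"
    print(csv_to_json_records(sample))
```

A real tool would likely read from and write to the data lake's storage rather than in-memory strings, and might add type conversion for numeric columns.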
Potential Features:
Additional features and capabilities of the finished product will depend on the interests and capabilities of the participants, but could include:
Map spatial data from the data lake using MapBox
Allow uploads into the data lake via the UI using drag and drop
Implement authentication for the web app using Auth0
Automate data processing using S3+Lambda (trigger code when a file is dropped into an S3 bucket to be processed)
View file contents from the data lake using Ag-grid
Visualize file contents from the data lake using D3
Build backend processing tools to convert custom data (e.g. log files) to CSV for inclusion in the data lake
Allow SQL processing of files from the data lake using AlaSQL
Manage cloud resources as code using Serverless.com
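Two of the ideas above, triggering code when a file lands in an S3 bucket and converting log files to CSV, could be sketched in Python roughly as follows. This is a sketch under stated assumptions, not a prescribed design: the log line format, the column names, and the function names `log_to_csv` and `uploaded_objects` are hypothetical. The event layout follows the standard S3 event notification structure (`Records[].s3.bucket.name` and `Records[].s3.object.key`). Downloading and re-uploading objects (for example with boto3) is deliberately left out so the block runs without AWS credentials.

```python
import csv
import io
import re
from urllib.parse import unquote_plus

# Assumed (hypothetical) log line format: "2024-01-15 10:30:00 INFO message text"
LOG_LINE = re.compile(r"^(\S+ \S+) (\w+) (.*)$")


def log_to_csv(log_text: str) -> str:
    """Convert log text in the assumed format to CSV with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["timestamp", "level", "message"])
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match:  # lines that do not fit the assumed format are skipped
            writer.writerow(match.groups())
    return out.getvalue()


def uploaded_objects(event: dict) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs from an S3 event notification."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications
        pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs


def handler(event, context):
    """Lambda entry point: reports which uploaded objects would be processed.

    A fuller version would fetch each object, run log_to_csv on its
    contents, and write the result back to the data lake.
    """
    return {"processed": [f"{b}/{k}" for b, k in uploaded_objects(event)]}
```

Keeping the conversion logic in a plain function, separate from the handler, lets participants test it locally before wiring it to a bucket trigger.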
Project rules:
All code and required resources will be managed and stored in a GitHub repo; participants must be able to use basic Git functionality
Participants will produce basic documentation and instructions for all code and tools they create. The documentation will live in Markdown files stored in the code repository, and should be sufficient to understand and maintain everything produced. Any diagrams or illustrations should be produced using draw.io, stored in the repo as .PNG files (which draw.io can reopen for later editing, provided the diagram data is embedded in the export), and linked from the Markdown files.